Arithmetic Operation Instruction Processing Patents (Class 712/221)

Floating point or vector (Class 712/222)

Multithreaded processor with efficient processing for convergence device applications

Patent number: 6968445

Abstract: A multithreaded processor includes an instruction decoder for decoding retrieved instructions to determine an instruction type for each of the retrieved instructions, an integer unit coupled to the instruction decoder for processing integer type instructions, and a vector unit coupled to the instruction decoder for processing vector type instructions. A reduction unit is preferably associated with the vector unit and receives parallel data elements processed in the vector unit. The reduction unit generates a serial output from the parallel data elements. The processor may be configured to execute at least control code, digital signal processor (DSP) code, Java code and network processing code, and is therefore well-suited for use in a convergence device. The processor is preferably configured to utilize token triggered threading in conjunction with instruction pipelining.

Type: Grant

Filed: October 11, 2002

Date of Patent: November 22, 2005

Assignee: Sandbridge Technologies, Inc.

Inventors: Erdem Hokenek, Mayan Moudgill, C. John Glossner
Sign generation bypass path to aligner for reducing signed data load latency

Patent number: 6965985

Abstract: A method for reducing signed load latency in a microprocessor has been developed. The method includes transferring a part of data to an aligner via a bypass, and generating a sign bit from the part of the data. The sign bit is transferred to the aligner along the bypass, and the data is separately transferred to the aligner along a data path.

Type: Grant

Filed: November 27, 2001

Date of Patent: November 15, 2005

Assignee: Sun Microsystems, Inc.

Inventors: David M. Pini, Yuefei Ge, Anup S. Tirumala
Processor for making more efficient use of idling components and program conversion apparatus for the same

Patent number: 6966056

Abstract: A processor that has a plurality of instruction slots each of which stores an instruction to be executed in parallel. One of the plurality of instruction slots is a first instruction slot and another a second instruction slot. A special instruction stored in the first instruction slot is executed by a first functional unit that executes instructions stored in the first instruction slot, and a second functional unit that executes instructions stored in the second instruction slot. An instruction stored in the second instruction slot is executed in parallel by a third functional unit that executes instructions stored in the second instruction slot.

Type: Grant

Filed: March 14, 2001

Date of Patent: November 15, 2005

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventor: Kenichi Kawaguchi
Accumulator-based load-store CPU architecture implementation in a programmable logic device

Patent number: 6963966

Abstract: Methods and structures for efficiently implementing an accumulator-based load-store CPU architecture in a programmable logic device (PLD). The PLD includes programmable logic blocks, each logic block including function generators that can be optionally programmed to function as lookup tables or as RAM blocks. Each element of the CPU is implemented using these logic blocks, including an instruction register, an accumulator pointer, a register file, and an operation block. The register file is implemented using function generators configured as RAM blocks. This implementation eliminates the need for time-consuming accesses to an off-chip register file or to a dedicated RAM block.

Type: Grant

Filed: July 30, 2002

Date of Patent: November 8, 2005

Assignee: Xilinx, Inc.

Inventor: Jorge Ernesto Carrillo
Data processing unit, microprocessor, and method for performing an instruction

Patent number: 6961846

Abstract: The present invention relates to a data processing unit for executing instructions stored in a memory comprising a plurality of registers coupled with an execution unit comprising a logic unit for execution of logic operations. The logic unit comprises a first logic operator which can be coupled with a first and second register as an input register and which generates an output bit as a result of a logic operation. It further comprises a Boolean operator which receives the output bit of the first logic operator as a first input and second input bit from a third register which generates an output bit as a result of a Boolean operation.

Type: Grant

Filed: September 12, 1997

Date of Patent: November 1, 2005

Assignee: Infineon Technologies North America Corp.

Inventors: Rod G. Fleck, Karl-Heinz Mattheis
System to perform horizontal additions

Patent number: 6961845

Abstract: A method and apparatus for including in a processor instructions for performing intra-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data. The processor performs operations on data elements in the first packed data to generate a plurality of data elements in a second packed data in response to receiving an instruction. At least two of the plurality of data elements in the second packed data store the result of an intra-add operation on data elements in the first packed data.

Type: Grant

Filed: July 9, 2002

Date of Patent: November 1, 2005

Assignee: Intel Corporation

Inventor: Patrice Roussel
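
The intra-add (horizontal add) operation in the abstract for patent 6961845 can be illustrated with a minimal Python sketch. The function name, the 16-bit lane width, the pairing of adjacent lanes, and the wrap-around on overflow are all assumptions made for illustration; the abstract does not fix any of them.

```python
def intra_add(packed, lane_bits=16):
    """Add adjacent pairs of lanes within a single packed operand.

    `packed` is a list of unsigned lane values (hypothetical layout).
    Each sum is wrapped to the lane width, which is an assumption here.
    """
    if len(packed) % 2:
        raise ValueError("expected an even number of lanes")
    mask = (1 << lane_bits) - 1
    return [(packed[i] + packed[i + 1]) & mask
            for i in range(0, len(packed), 2)]


if __name__ == "__main__":
    # Four lanes in the first packed data yield two intra-add results.
    print(intra_add([1, 2, 3, 4]))
```

The point of the operation is that both addends come from the same packed operand, unlike a conventional packed add that combines matching lanes of two operands.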
Methods and apparatus to support conditional execution in a VLIW-based array processor with subword execution

Patent number: 6954842

Abstract: General purpose flags (ACFs) are defined and encoded utilizing a hierarchical one-, two- or three-bit encoding. Each added bit provides a superset of the previous functionality. With condition combination, a sequential series of conditional branches based on complex conditions may be avoided and complex conditions can then be used for conditional execution. ACF generation and use can be specified by the programmer. By varying the number of flags affected, conditional operation parallelism can be widely varied, for example, from mono-processing to octal-processing in VLIW execution, and across an array of processing elements (PEs). Multiple PEs can generate condition information at the same time with the programmer being able to specify a conditional execution in one processor based upon a condition generated in a different processor using the communications interface between the processing elements to transfer the conditions.

Type: Grant

Filed: August 28, 2003

Date of Patent: October 11, 2005

Assignee: PTS Corporation

Inventors: Thomas L. Drabenstott, Gerald G. Pechanek, Edwin F. Barry, Charles W. Kurak, Jr.
Fixed point unit pipeline allowing partial instruction execution during the instruction dispatch cycle

Patent number: 6944753

Abstract: A method for allowing a partial instruction to be executed in a fixed point unit pipeline during the instruction dispatch cycle creates a mask used to select which bits of the operands participate in a future logical operation of the fixed point unit back a cycle to the instruction dispatch stage of the fixed point unit. As an S/390 System improvement applicable to other computers, the mask is determined and created two cycles ahead of execution, or two cycles before the mask is actually used. Also, in the method used for moving the mask generation back by one cycle, mask generation overlaps the dispatch stage in the I-unit, and this provides a handshake between the I-unit and E-unit of the fixed point unit of the central processor unit of the computer system. The control setting selection process occurs in a predetermination cycle stage or e-1 (em1) stage for the mask generation and the register file read address.

Type: Grant

Filed: April 11, 2001

Date of Patent: September 13, 2005

Assignee: International Business Machines Corporation

Inventors: Fadi Y. Busaba, Christopher A. Krygowski, Wen H. Li
Predicated execution of instructions in processors

Patent number: 6944853

Abstract: A processor includes a series of predicate registers 135. Each predicate register is switchable between at least respective first and second states and each is assignable to one or more predicated-execution instructions. A control information holding unit 131 holds items of control information which correspond respectively to the predicate registers. An operating unit 133 is provided for each one of the predicate registers and receives items of control information Li and Li+1 and items of state information Pi, Pi-1. Each operating unit is operable to perform a selected state determining operation in which the state of its own predicate register is determined in dependence upon the received items. The operating units operate in parallel with one another to perform respective such state determining operations. The state determining operations can be used to bring about state changes required in prologue, kernel and epilogue stages of a software-pipelined loop.

Type: Grant

Filed: May 22, 2001

Date of Patent: September 13, 2005

Assignee: PTS Corporation

Inventor: Nigel Peter Topham
System and method for encoding constant operands in a wide issue processor

Patent number: 6922773

Abstract: For use in a data processor comprising an instruction execution pipeline comprising N processing stages, a system and method of encoding constant operands is disclosed. The system comprises a constant generator unit that is capable of generating both short constant operands and long constant operands. The constant generator unit extracts the bits of a short constant operand from an instruction syllable and right justifies the bits in an output syllable. For long constant operands, the constant generator unit extracts K low order bits from an instruction syllable and T high order bits from an extension syllable. The right justified K low order bits and the T high order bits are combined to represent the long constant operand in one output syllable. In response to the status of op code bits located within a constant generation instruction, the constant generator unit enables and disables multiplexers to automatically generate the appropriate short or long constant operand.

Type: Grant

Filed: December 29, 2000

Date of Patent: July 26, 2005

Assignees: STMicroelectronics, Inc., Hewlett-Packard Company

Inventors: Paolo Faraboschi, Alexander J. Starr, Anthony X. Jarvis, Geoffrey M. Brown, Mark Owen Homewood, Gary L. Vondran
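
The short and long constant formats described in the abstract for patent 6922773 can be sketched as follows. The field widths K and T, the position of the immediate field, and the function names are hypothetical values chosen for illustration; the abstract does not give them, and sign handling is left out.

```python
K = 9           # assumed width of the constant field in the instruction syllable
T = 23          # assumed width of the constant field in the extension syllable
IMM_SHIFT = 12  # assumed bit position of the constant field in the syllable


def short_constant(syllable):
    """Extract the K-bit field and right-justify it in the output."""
    return (syllable >> IMM_SHIFT) & ((1 << K) - 1)


def long_constant(syllable, extension):
    """Combine K low-order bits from the instruction syllable with
    T high-order bits from the extension syllable."""
    low = short_constant(syllable)
    high = extension & ((1 << T) - 1)
    return (high << K) | low


if __name__ == "__main__":
    syllable = (0x1AB << IMM_SHIFT) | 0x123  # low 12 bits stand in for other fields
    print(hex(short_constant(syllable)))
    print(hex(long_constant(syllable, 0x2)))
```

In hardware the choice between the two paths would be made by multiplexers driven by op code bits; here it is simply a choice of which function to call.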
Method and system for executing conditional instructions using a test register address that points to a test register from which a test code is selected

Patent number: 6918029

Abstract: A method and system of executing computer instructions is described. Each instruction defines first and second operands and an operation to be carried out on said operands. Each instruction also contains an address field of a predetermined bit length which identifies a test register holding a plurality of test bits greater than the predetermined bit length. The test register holds a test code defining a test condition. The test condition is checked against at least one condition code and the operation is selectively carried out in dependence on whether the condition code satisfies the test condition. In one embodiment, the condition codes are set on a lane-by-lane basis for packed operands.

Type: Grant

Filed: January 14, 2003

Date of Patent: July 12, 2005

Assignee: Broadcom Corporation

Inventor: Sophie Wilson
Pipelined processor including a loosely coupled side pipe

Patent number: 6918028

Abstract: A digital data processor having a main pipeline to which a side pipe is loosely coupled. In particular, the side pipe is coupled to the main pipeline at a point after which an instruction entering the side pipe cannot cause an exception. When such an instruction enters the first stage of the side pipe, a copy or “ghost” of this instruction is created. While the actual instruction flows down the side pipe, this ghost instruction is allowed to flow independently down the main pipeline as if it were a non-squashable no-op. When the ghost reaches the retirement stage of the main pipeline, it is retired in normal program order, regardless of the status of the actual instruction. However, in addition, each system resource that is still waiting for a result from the actual instruction is marked appropriately. When the actual instruction finally completes in the side pipe, the only consequence, other than those local to the side pipe itself, is that any results are forwarded to the awaiting resources.

Type: Grant

Filed: March 28, 2000

Date of Patent: July 12, 2005

Assignee: Analog Devices, Inc.

Inventor: David B. Witt
Data processing circuits and interfaces

Patent number: 6901503

Abstract: An integrated circuit contains a microprocessor core, program memory and separate data storage, together with analog and digital signal processing circuitry. The ALU is 16 bits wide, but a 32-bit shift unit is provided, using a pair of 16-bit registers. The processor has a fixed length instruction format, with an instruction set including multiply and divide operations which use the shift unit over several cycles. No interrupts are provided. External pins of the integrated circuit allow for single stepping and other debug operations, and a serial interface (SIF) allows external communication of test data or working data as necessary. The serial interface has four wires (SERIN, SEROUT, SERCLK, SERLOADB), allowing handshaking with a master apparatus, and allowing direct access to the memory space of the processor core, without specific program control.

Type: Grant

Filed: October 29, 2001

Date of Patent: May 31, 2005

Assignee: Cambridge Consultants Ltd.

Inventors: Stephen John Barlow, Alistair Guy Morfey, James Digby Collier
Adaptive computing engine with dataflow graph based sequencing in reconfigurable mini-matrices of composite functional blocks

Patent number: 6874079

Abstract: Aspects of a method and system for digital signal processing within an adaptive computing engine are described. These aspects include a mini-matrix, the mini-matrix comprising a set of composite blocks, each composite block capable of executing a predetermined set of instructions. A sequencer is included for controlling the set of composite blocks and directing instructions among the set of composite blocks based on a data-flow graph. Further, a data network is included and transmits data to and from the set of composite blocks and to the sequencer, while a status network routes status word data resulting from instruction execution in the set of composite blocks. With the present invention, an effective combination of hardware resources is provided in a manner that provides multi-bit digital signal processing capabilities for an embedded system environment, particularly in an implementation of an adaptive computing engine.

Type: Grant

Filed: July 25, 2001

Date of Patent: March 29, 2005

Assignee: Quicksilver Technology

Inventor: Eugene B. Hogenauer
Apparatus and method for data processing using multiply-accumulate instructions

Patent number: 6862678

Abstract: An apparatus and a method for data processing that use multiply-accumulate instructions. The apparatus for processing data includes a special register bank of N-bit data processing registers, a general register bank of N-bit data processing registers, a selector, a multiplier and an accumulator. The selector is coupled to the special register bank and the general register bank and is used for selecting one of the special and general register banks and outputting N-bit data from the selected register bank. The outputted N-bit data and the N-bit data held in the general register bank form a 2N-bit addition operand. The multiplier is used for performing a multiply operation upon a first operand and a second operand and outputting a 2N-bit result. The accumulator is coupled to the multiplier, the selector and the general register bank and is used for performing an accumulate operation upon the 2N-bit result and the 2N-bit addition operand and outputting a 2N-bit accumulated result.

Type: Grant

Filed: November 9, 2000

Date of Patent: March 1, 2005

Assignee: Faraday Technology Corp.

Inventors: Min-Cheng Kao, Ching-Jer Liang, Calvin Guey
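
The data flow in the abstract for patent 6862678 can be summarized in a short Python sketch. It assumes N = 16, unsigned operands, wrap-around at 2N bits, and that the selector output forms the high half of the 2N-bit addition operand; none of these details is stated in the abstract, and the names are hypothetical.

```python
N = 16  # assumed register width


def mac(a, b, selected_hi, general_lo, n=N):
    """Multiply two n-bit operands and accumulate with a 2n-bit addend.

    selected_hi: n-bit value chosen by the selector (special or general bank)
    general_lo:  n-bit value held in the general register bank
    """
    mask_n = (1 << n) - 1
    mask_2n = (1 << (2 * n)) - 1
    product = (a & mask_n) * (b & mask_n)                        # 2n-bit result
    addend = ((selected_hi & mask_n) << n) | (general_lo & mask_n)
    return (product + addend) & mask_2n                          # 2n-bit accumulated result


if __name__ == "__main__":
    print(hex(mac(2, 3, 1, 0)))
```

Splitting the addend across two N-bit registers lets an N-bit register file feed a 2N-bit accumulator without a dedicated wide register.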
Optimal redundant arithmetic for microprocessors design

Patent number: 6848043

Abstract: Methods and apparatus for improving system performance using redundant arithmetic are disclosed. In one embodiment, one or more dependency chains are formed. A dependency chain may comprise two or more instructions. A first instruction may generate a result in a redundant form. A second instruction may accept the result from the first instruction as a first input operand. The instructions in the dependency chain may execute separately from instructions not in the dependency chain.

Type: Grant

Filed: April 27, 2000

Date of Patent: January 25, 2005

Assignee: Intel Corporation

Inventors: Thomas Y. Yeh, Hong Wang, Ralph Kling, Yong-Fong Lee
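
To make the idea of a redundant-form dependency chain in patent 6848043 concrete, the sketch below uses carry-save representation, which is one common redundant form. The abstract does not say which form the patent uses, so the choice of carry-save, the 32-bit width, and the function names are assumptions.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1


def csa(x, y, z):
    """3:2 carry-save adder.

    Returns (s, c) such that s + c == x + y + z modulo 2**WIDTH,
    without propagating carries across bit positions.
    """
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s & MASK, c & MASK


def resolve(s, c):
    """Convert the redundant (sum, carry) pair to an ordinary integer."""
    return (s + c) & MASK


if __name__ == "__main__":
    first = csa(5, 7, 0)                 # first instruction: result stays redundant
    second = csa(first[0], first[1], 9)  # second instruction consumes it directly
    print(resolve(*second))              # carries are propagated only once, at the end
```

The benefit being illustrated is that the slow carry-propagating addition happens once at the end of the chain rather than after every instruction.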
Method and apparatus for token triggered multithreading

Patent number: 6842848

Abstract: Techniques for token triggered multithreading in a multithreaded processor are disclosed. An instruction issuance sequence for a plurality of threads of the multithreaded processor is controlled by associating with each of the threads at least one register which stores a value identifying a next thread to be permitted to issue one or more instructions, and utilizing the stored value to control the instruction issuance sequence. For example, each of a plurality of hardware thread units of the multithreaded processor may include a corresponding local register updatable by that hardware thread unit, with the local register for a given one of the hardware thread units storing a value identifying the next thread to be permitted to issue one or more instructions after the given hardware thread unit has issued one or more instructions. A global register arrangement may also or alternatively be used.

Type: Grant

Filed: October 11, 2002

Date of Patent: January 11, 2005

Assignee: Sandbridge Technologies, Inc.

Inventors: Erdem Hokenek, Mayan Moudgill, C. John Glossner
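
The issue sequencing described in the abstract for patent 6842848 can be modeled with a few lines of Python. The sketch assumes one local "next thread" register per hardware thread unit and one instruction issued per token; the dictionary layout and the function name are hypothetical.

```python
def issue_order(next_thread, start, cycles):
    """Return the sequence of thread IDs that issue over `cycles` slots.

    next_thread maps each thread ID to the value held in its local
    register, i.e. the thread permitted to issue next.
    """
    order = []
    current = start
    for _ in range(cycles):
        order.append(current)
        current = next_thread[current]  # pass the token
    return order


if __name__ == "__main__":
    # An arbitrary, non-sequential ordering of four threads.
    registers = {0: 2, 2: 1, 1: 3, 3: 0}
    print(issue_order(registers, start=0, cycles=6))
```

Because each register is updatable by its own thread unit, the sequence can be changed at run time by rewriting a single entry.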
System and method for controlling conditional branching utilizing a control instruction having a reduced word length

Patent number: 6842852

Abstract: An execution control instruction is applied to an information processor of the type processing instructions by pipelining to suppress the occurrence of branch hazard. The execution control instruction contains: a condition field for specifying an execution condition; and an instruction-specifying field for defining, in binary code, the number of instructions to be executed only conditionally. In response to the execution control instruction, a nullification controller decides, based on control flags provided from an arithmetic logic unit, whether or not the execution condition specified by the condition field is satisfied. And based on the outcome of this decision, the controller determines whether or not that number of instructions, which has been defined by the instruction-specifying field for instructions succeeding the execution control instruction, should be nullified.

Type: Grant

Filed: February 8, 2000

Date of Patent: January 11, 2005

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Masayuki Yamasaki, Minoru Okamoto
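
The execution control instruction in the abstract for patent 6842852 can be sketched as a small interpreter. The tuple encoding, the mnemonic "exec_if", and the flag names are hypothetical; the sketch only models which instructions take effect, not the pipeline itself.

```python
def run(program, flags):
    """Return the mnemonics of the instructions that actually take effect.

    ("exec_if", cond, count) nullifies the next `count` instructions
    when flags[cond] is false; other tuples are ordinary instructions.
    """
    executed = []
    nullify = 0
    for instr in program:
        if nullify:
            nullify -= 1  # instruction still flows through, but has no effect
            continue
        if instr[0] == "exec_if":
            _, cond, count = instr
            if not flags.get(cond, False):
                nullify = count
            continue
        executed.append(instr[0])
    return executed


if __name__ == "__main__":
    program = [("add",), ("exec_if", "zero", 2), ("sub",), ("mul",), ("store",)]
    print(run(program, {"zero": False}))
    print(run(program, {"zero": True}))
```

No branch is taken in either case, which is how the scheme avoids a branch hazard in a pipelined processor.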
DSP data type matching for operation using multiple functional units

Patent number: 6842850

Abstract: An instruction set architecture (ISA) for application specific signal processor (ASSP) is tailored to digital signal processing applications. The ISA implemented with the ASSP, is adapted to DSP algorithmic structures. The ISA of the present invention includes flexible data typing, permutation, and type matching of operands. The flexible data typing, permutation and type matching of operands provides programming flexibility to support different filtering and DSP algorithms having different types of filter coefficients or data samples. A data typer and aligner within each signal processing unit within the ASSP supports flexible data typing, permutation and type matching of operands of the instruction set architecture.

Type: Grant

Filed: February 25, 2003

Date of Patent: January 11, 2005

Assignee: Intel Corporation

Inventors: Kumar Ganapathy, Ruban Kanapathipillai
Method and apparatus for floating point operations and format conversion operations

Publication number: 20040268094

Abstract: A method and apparatus are described for converting a number from a floating point format to an integer format or from an integer format to a floating point format responsive to a control signal of a control signal format.

Type: Application

Filed: February 14, 2001

Publication date: December 30, 2004

Inventors: Mohammad Abdallah, Prasad Modali, Chien-Yu Huang, Hsien-Cheng E. Hsieh, Thomas R. Huff, Vladimir Pentkovski, Patrice Roussel, Shreekant S. Thakkar
Adaptive integrated circuitry with heterogeneous and reconfigurable matrices of diverse and adaptive computational units having fixed, application specific computational elements

Patent number: 6836839

Abstract: The present invention concerns a new category of integrated circuitry and a new methodology for adaptive or reconfigurable computing. The preferred IC embodiment includes a plurality of heterogeneous computational elements coupled to an interconnection network. The plurality of heterogeneous computational elements include corresponding computational elements having fixed and differing architectures, such as fixed architectures for different functions such as memory, addition, multiplication, complex multiplication, subtraction, configuration, reconfiguration, control, input, output, and field programmability. In response to configuration information, the interconnection network is operative in real-time to configure and reconfigure the plurality of heterogeneous computational elements for a plurality of different functional modes, including linear algorithmic operations, non-linear algorithmic operations, finite state machine operations, memory operations, and bit-level manipulations.

Type: Grant

Filed: March 22, 2001

Date of Patent: December 28, 2004

Assignee: Quicksilver Technology, Inc.

Inventors: Paul L. Master, Eugene Hogenauer, Walter James Scheuermann
Data packet arithmetic logic devices and methods

Publication number: 20040260914

Abstract: New instruction definitions for a packet add (PADD) operation and for a single instruction multiple add (SMAD) operation are disclosed. In addition, a new dedicated PADD logic device that performs the PADD operation in about one to two processor clock cycles is disclosed. Also, a new dedicated SMAD logic device that performs a single instruction multiple data add (SMAD) operation in about one to two clock cycles is disclosed.

Type: Application

Filed: June 23, 2003

Publication date: December 23, 2004

Inventors: Corey Gee, Bapiraju Vinnakota, Saleem Mohammadali, Carl A. Alberola
Host-fabric adapter and method of connecting a host system to a channel-based switched fabric in a data network

Patent number: 6831916

Abstract: A host system is provided with one or more host-fabric adapters installed therein for connecting to a switched fabric of a data network. The host-fabric adapter comprises a micro-controller subsystem configured to establish connections and support data transfers via the switched fabric, and a serial interface which provides an interface with the switched fabric. The micro-controller subsystem includes a Micro-Engine (ME) which executes a ME instruction to send source and destination addresses during a control cycle, and interface logic blocks which supply addressed data from designated sources to the Micro-Engine (ME) at the same time for execution of the ME instruction during a data cycle subsequent to the control cycle.

Type: Grant

Filed: September 28, 2000

Date of Patent: December 14, 2004

Inventors: Balaji Parthasarathy, Dominic J. Gasbarro, Tom E. Burton, Brian M. Leitner
Processor core for using external extended arithmetic unit efficiently and processor incorporating the same

Patent number: 6832117

Abstract: A processor core for realizing efficient operation processing by connecting an extended arithmetic unit to its exterior and a processor incorporating such a processing core are provided. The processor includes the processor core, a data memory accessed by the processor core, and the extended arithmetic unit connected to the exterior of the processor core for processing a particular instruction. The extended arithmetic unit executes an arithmetic operation by using arithmetic operation data retained in a register file in the processor core, and directly outputs an arithmetic operation result to the processor core. Then, the processor core saves the result of the arithmetic operation executed by the extended arithmetic unit and inputted therefrom in the register file in the processor core.

Type: Grant

Filed: September 21, 2000

Date of Patent: December 14, 2004

Assignee: Kabushiki Kaisha Toshiba

Inventor: Takashi Miyamori
Method of translating a source operation to a target operation, computer program, storage medium, computer and translation

Patent number: 6817012

Abstract: A method is provided for translating a source operation to a target operation. The source operation acts on one or more source operands, each comprising a binary integer of a first bit-width. The target operation is required to be evaluated by a processor, such as a computer, which performs integer operations on binary integers of a second bit-width which is greater than first bit-width. The source operation is translated to a target operation having at least one target operand. The method identifies whether the value of unused bits of the or each target operand affects the value of the target operation and whether the target operand or any of the target operands is capable of having one or more unused bits of inappropriate value. If so, a correcting operation is added to the target operation for correcting the value of each of the bits of inappropriate value before performing the target operation.

Type: Grant

Filed: October 2, 2000

Date of Patent: November 9, 2004

Assignee: Sharp Kabushiki Kaisha

Inventors: Vincent Zammit, Andrew Kay
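
The correcting operation described in the abstract for patent 6817012 can be illustrated with 8-bit source operations evaluated on a 32-bit target. The widths, the unsigned interpretation, and the function names are assumptions; the sketch shows one operation whose result does not depend on the unused bits and one whose result does.

```python
SRC_BITS = 8
SRC_MASK = (1 << SRC_BITS) - 1
TGT_MASK = (1 << 32) - 1


def target_add(x, y):
    """Addition: the low SRC_BITS of the result do not depend on the
    unused upper bits, so no correcting operation is added."""
    return (x + y) & TGT_MASK


def target_shr(x, n):
    """Right shift: unused upper bits would be shifted into the result,
    so a correcting mask is applied to the operand first."""
    return (x & SRC_MASK) >> n


if __name__ == "__main__":
    t = target_add(200, 100)  # 300: upper bits now hold an inappropriate value
    print(t & SRC_MASK)       # the 8-bit result is still correct
    print(target_shr(t, 2))   # corrected before shifting
```

Deferring the mask until an operation actually needs it avoids paying for a correction after every narrow operation.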
Efficient complex multiplication and fast fourier transform (FFT) implementation on the ManArray architecture

Publication number: 20040221137

Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.

Type: Application

Filed: June 3, 2004

Publication date: November 4, 2004

Applicant: PTS Corporation

Inventors: Nikos P. Pitsianis, Gerald G. Pechanek, Ricardo E. Rodriguez
Processor, compiling apparatus, and compile program recorded on a recording medium

Publication number: 20040215940

Abstract: Each of registers R0 to R31 is divided into the upper 32-bit area and the lower 32-bit area. A register writing control unit 431 outputs information to the selectors 4321 and 4322 on the registers and the locations (upper and lower areas) in which data is written by the instructions that have issued in one cycle. Each of the selectors 4321 and 4322 selects one out of pieces of data that have been output from first, second, and third arithmetic operation units 44, 45, and 46 and writes the selected data in the upper or lower area in one register. A dependency analysis unit 110 in a compiling apparatus considers the upper and lower registers in one 64-bit register as separate resources, analyzes the data dependency relations between the instructions, and generates a dependency graph that indicates the data dependency relations. An instruction rearrangement unit 111 rearranges the instructions and generates execution codes using the dependency graph.

Type: Application

Filed: May 17, 2004

Publication date: October 28, 2004

Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.

Inventors: Taketo Heishi, Kensuke Odani
Multithreaded programmable processor and system with partitioned operations

Publication number: 20040210745

Abstract: A programmable processor and method for improving the performance of processors by incorporating an execution unit configurable to execute a plurality of instruction streams from the plurality of threads, wherein each instruction stream includes a group instruction that operates on a plurality of data elements in partitioned fields of at least one of the registers to produce a catenated result.

Type: Application

Filed: January 16, 2004

Publication date: October 21, 2004

Applicant: MICROUNITY SYSTEMS ENGINEERING, INC.

Inventors: Craig Hansen, John Moussouris
Method and apparatus for efficiently generating, storing, and consuming arithmetic flags between producing and consuming macroinstructions when emulating with microinstructions

Patent number: 6807625

Abstract: An apparatus and method for efficiently generating arithmetic flags in a computer system. The system includes an eflags register to store partially computed flags computed by an arithmetic logic unit. The stored partial flags are computed in one cycle. The stored flags are decoded by one of two consuming instructions, PRODF or TBIT, in a second cycle.

Type: Grant

Filed: February 18, 2000

Date of Patent: October 19, 2004

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Patrick Knebel, Mark Gibson, Rohit Bhatia, Kevin David Safford
Programmable processor and method for partitioned group element selection operation

Publication number: 20040205323

Abstract: A programmable processor and method for improving the performance of processors by incorporating an execution unit operable to decode and execute single instructions specifying a data selection operand and a first and a second register providing a plurality of data elements, the data selection operand comprising a plurality of fields each selecting one of the plurality of data elements, the execution unit operable to provide the data element selected by each field of the data selection operand to a predetermined position in a catenated result.

Type: Application

Filed: January 15, 2004

Publication date: October 14, 2004

Applicant: MICROUNITY SYSTEMS ENGINEERING, INC.

Inventors: Craig Hansen, John Moussouris
Programmable processor with group floating-point operations

Publication number: 20040199750

Abstract: A programmable processor that comprises a general purpose processor architecture, capable of operation independent of another host processor, having a virtual memory addressing unit, an instruction path and a data path; an external interface; a cache operable to retain data communicated between the external interface and the data path; at least one register file configurable to receive and store data from the data path and to communicate the stored data to the data path; and a multi-precision execution unit coupled to the data path. The multi-precision execution unit is configurable to dynamically partition data received from the data path to account for an elemental width of the data and is capable of performing group floating-point operations on multiple operands in partitioned fields of operand registers and returning catenated results. In other embodiments the multi-precision execution unit is additionally configurable to execute group integer and/or group data handling operations.

Type: Application

Filed: August 25, 2003

Publication date: October 7, 2004

Applicant: MICROUNITY SYSTEMS ENGINEERING, INC.

Inventors: Craig Hansen, John Moussouris
Apparatus and method for generating packed sum of absolute differences

Publication number: 20040199751

Abstract: An apparatus for performing an MMX PSADBW instruction is disclosed. The apparatus includes carry-generating subtraction logic that generates packed differences of the subtrahend from the minuend and associated carry bits indicating whether the difference is positive or negative. The apparatus selectively inverts the differences based on the carry bits. Addition logic adds the selectively inverted differences and carry bits substantially in parallel to generate the PSADBW instruction result. In one embodiment, the apparatus also includes two muxes. The first mux selects the selectively inverted differences in the case of a PSADBW instruction and selects a multiply instruction's partial products otherwise. The second mux selects the carry bits in the case of a PSADBW instruction and selects a second multiply instruction's partial products otherwise. The two mux outputs are provided to the addition logic.

Type: Application

Filed: January 27, 2004

Publication date: October 7, 2004

Applicant: VIA Technologies, Inc.

Inventors: Daniel W.J. Johnson, Albert J. Loper
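
The subtract, selectively invert, then add-with-carries scheme in this abstract relies on the two's-complement identity that negating a value equals inverting its bits and adding one. A rough Python sketch follows, assuming unsigned 8-bit elements; the function name `psadbw` and the list-based operands are illustrative, not the patented circuit.

```python
def psadbw(minuend_bytes, subtrahend_bytes):
    """Sum of absolute differences of unsigned bytes via invert-and-carry."""
    if len(minuend_bytes) != len(subtrahend_bytes):
        raise ValueError("operands must have the same number of bytes")
    inverted = []
    carries = []
    for a, b in zip(minuend_bytes, subtrahend_bytes):
        diff = (a - b) & 0xFF          # wrapped 8-bit difference
        negative = 1 if a < b else 0   # models the sign-indicating carry bit
        # Invert only the negative differences; the missing +1 is
        # supplied later by adding the carry bit itself.
        inverted.append(diff ^ 0xFF if negative else diff)
        carries.append(negative)
    # One summation covers both the differences and the carry bits.
    return sum(inverted) + sum(carries)
```

Deferring the +1 into the final summation is what lets the differences and carry bits be added together in one pass rather than completing each negation separately.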
Intra-register subword-add instructions

Publication number: 20040193847

Abstract: Intra-register subword add instructions yield results that are a function of a sum having as at least some of its addends unary functions of at least two subwords stored in the same register. For example, one “TreeAdd” instruction yields a sum of all subwords in a register. A “parallel accumulate” PAcc instruction yields a result with four 2-byte result subwords. Each result subword is the sum of a 2-byte value in a first operand register and two of eight 1-byte subwords in a second operand register. A “Parallel Accumulate Magnitude” PAccMagLR also yields a result with four 2-byte subwords. Each of these subwords is the sum of a 2-byte value in a first operand register and the absolute values of two 1-byte values in a second operand register. These instructions provide for substantial performance enhancements for motion estimation used in video compression.

Type: Application

Filed: March 31, 2003

Publication date: September 30, 2004

Inventors: Ruby B. Lee, Dale Morris
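
The three instructions named in this abstract can be modelled in a few lines of Python. This is a sketch under stated assumptions: registers are represented as lists of subwords, each result subword pairs with two adjacent bytes of the second operand, and sums wrap to 16 bits. The abstract does not state the pairing or the overflow behaviour, and the function names are illustrative.

```python
def tree_add(subwords):
    """TreeAdd: sum of all subwords held in one register."""
    return sum(subwords)


def pacc(acc_halfwords, byte_subwords):
    """PAcc: each 2-byte result is an accumulator value plus two bytes.

    Assumption: result i uses bytes 2*i and 2*i + 1, wrapping to 16 bits.
    """
    if len(acc_halfwords) != 4 or len(byte_subwords) != 8:
        raise ValueError("expected four 2-byte values and eight 1-byte values")
    return [
        (acc_halfwords[i] + byte_subwords[2 * i] + byte_subwords[2 * i + 1]) & 0xFFFF
        for i in range(4)
    ]


def pacc_mag(acc_halfwords, signed_bytes):
    """PAccMagLR-style variant: accumulate absolute values of signed bytes."""
    return pacc(acc_halfwords, [abs(b) for b in signed_bytes])
```

Accumulating magnitudes in this way is the inner step of a sum-of-absolute-differences loop, which is why the abstract ties these instructions to motion estimation.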
Packet processor

Patent number: 6799267

Abstract: A packet processor having a general-purpose arithmetic operator and another dedicated circuit, which extracts a particular field from the general-purpose register as object field, on which the predetermined general-purpose arithmetic operation is to be performed by the general-purpose arithmetic operator and writes a result of the arithmetic operation by the general-purpose arithmetic operator into the general-purpose register as updated information of the particular field. Based on the extraction and write process of the packet field designated by software (instructions), the packet processor realizes high flexibility and high speed processing.

Type: Grant

Filed: December 20, 2000

Date of Patent: September 28, 2004

Assignee: Fujitsu Limited

Inventors: Yuji Kojima, Tetsumei Tsuruoka, Kenichi Abiru, Yasuyuki Umezaki, Yoshitomo Shimozono
Soft error detection in high speed microprocessors

Patent number: 6785847

Abstract: Aspects for soft error detection for a superscalar microprocessor are described. The aspects include a first pipeline, the first pipeline including a first arithmetic logic unit (ALU) comparator and a first general purpose register (GPR) for storing first data, and a second pipeline, the second pipeline including a second GPR and a second ALU comparator, the second GPR for storing second data, the second data being a copy of the first data. A detection system utilizes one of the first and second ALU comparators to perform a comparison of the second data with the first data during an idle state of the first and second pipelines.

Type: Grant

Filed: August 3, 2000

Date of Patent: August 31, 2004

Assignee: International Business Machines Corporation

Inventors: Paul J. Jordan, Peter J. Klim
Apparatus and method for an enhanced integer divide in an IA64 architecture

Patent number: 6779106

Abstract: An apparatus and method for performing integer divide operations in an IA64 architecture based data processing system is provided. The apparatus and method insert integer divide checks in place of NOP instructions in the instruction bundles associated with integer divide operations. The checks serve to identify typically encountered integer divide operations. Based on such identifications, the integer divide operation may be short-circuited such that the appropriate result may be returned without having to complete the integer divide operation.

Type: Grant

Filed: September 28, 2000

Date of Patent: August 17, 2004

Assignee: International Business Machines Corporation

Inventor: Geoffrey Owen Blandy
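
The short-circuiting idea in this abstract can be sketched in Python as a series of cheap checks made before the full divide. The abstract does not say which operand patterns count as "typically encountered", so the cases below (divisor of one, dividend smaller than or equal to the divisor, power-of-two divisor) are assumptions chosen for illustration, as is the name `checked_divide`.

```python
def checked_divide(dividend, divisor):
    """Unsigned integer divide with early-out checks.

    Returns (quotient, path) where path records whether the full divide ran.
    The specific checks are illustrative assumptions, not the patented set.
    """
    if dividend < 0 or divisor < 0:
        raise ValueError("this sketch handles unsigned operands only")
    if divisor == 0:
        raise ZeroDivisionError("integer divide by zero")
    if divisor == 1:
        return dividend, "short-circuit"
    if dividend < divisor:
        return 0, "short-circuit"
    if dividend == divisor:
        return 1, "short-circuit"
    if (divisor & (divisor - 1)) == 0:
        # Power-of-two divisor: a right shift gives the quotient.
        return dividend >> (divisor.bit_length() - 1), "short-circuit"
    return dividend // divisor, "full"
```

In the patent the checks occupy slots that would otherwise hold NOP instructions, so they add no issue cost; the sequential `if` chain here only models the decision, not that scheduling.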
Method and software for partitioned group element selection operation

Publication number: 20040153632

Abstract: A system and software for improving the performance of processors by incorporating an execution unit operable to decode and execute single instructions specifying a data selection operand and a first and a second register providing a plurality of data elements, the data selection operand comprising a plurality of fields each selecting one of the plurality of data elements, the execution unit operable to provide the data element selected by each field of the data selection operand to a predetermined position in a catenated result.

Type: Application

Filed: January 16, 2004

Publication date: August 5, 2004

Applicant: MICROUNITY SYSTEMS ENGINEERING, INC.

Inventors: Craig Hansen, John Moussouris
Floating point unit pipeline synchronized with processor pipeline

Patent number: 6772327

Abstract: An FPU pipeline is synchronized with a CPU pipeline. Synchronization is achieved by having stalls and freezes in any one pipeline cause stalls and freezes in the other pipeline as well. Exceptions are kept precise even for long floating point operations. Precise exceptions are achieved by having a first execution stage of the FPU pipeline generate a busy signal, when a first floating point instruction enters a first execution stage of the FPU pipeline. When a second floating point instruction is decoded by the FPU pipeline before the first floating point instruction has finished executing in the first stage of the FPU pipeline, then both pipelines are stalled.

Type: Grant

Filed: May 9, 2002

Date of Patent: August 3, 2004

Assignee: Hitachi Micro Systems, Inc.

Inventors: Prasenjit Biswas, Gautam Dewan, Kevin Iadonato, Norio Nakagawa, Kunio Uchiyama
Dyadic instruction processing instruction set architecture with 20-bit and 40-bit DSP and control instructions

Patent number: 6772319

Abstract: An instruction set architecture (ISA) to convert voice and data samples into packets for transmission over a network and to convert packets received from the network into voice and data samples. In one embodiment, the ISA includes a digital signal processing (DSP) instruction set architecture for a plurality of signal processing units and a control instruction set architecture to control the execution of DSP instructions by the plurality of signal processing units. In another embodiment, the ISA includes a plurality of DSP instructions including a 20-bit DSP instruction and a 40-bit DSP instruction and a plurality of control instructions to control execution of the plurality of DSP instructions including a 20-bit control instruction and a 40-bit control instruction. The DSP instructions may be dyadic DSP instructions including a main DSP operation and a sub DSP operation.

Type: Grant

Filed: August 8, 2002

Date of Patent: August 3, 2004

Assignee: Intel Corporation

Inventors: Kumar Ganapathy, Ruban Kanapathipillai
System and method for multi-type instruction set architecture

Publication number: 20040128486

Abstract: A method and system including transmitting data in an architectural format between execution units in a multi-type instruction set architecture and converting data received in the architectural format to an internal format and data output in the internal format to the architectural format based on an operation code and a data type of a microinstruction.

Type: Application

Filed: December 31, 2002

Publication date: July 1, 2004

Inventors: Zeev Sperber, Ittai Anati, Oded Liron, Mohammad Abdallah
Processor

Patent number: 6757813

Abstract: In a processor executing plural instructions simultaneously, writing-destination-register numbers of the plural instructions to be executed simultaneously are compared, and kinds of operations to be executed by the plural instructions are changed in response to a comparison result. When the writing-destination-register numbers of the plural instructions are identical, a constant operation is applied to plural operation results obtained from the plural instructions to obtain an operation result and the operation result is written into the writing-destination-register instructed by the plural instructions. Results outputted from plural processing units are put together into one result and the result is stored in one register. Thus, register use efficiency and process efficiency are improved.

Type: Grant

Filed: June 23, 2000

Date of Patent: June 29, 2004

Assignee: NEC Corporation

Inventor: Hiroyuki Igura
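
The destination-register comparison in this abstract can be sketched in Python for a pair of simultaneously issued instructions. The tuple encoding `(dest, op, src1, src2)`, the dictionary register file, and the choice of addition as the fixed combining operation are all assumptions made for illustration; the abstract leaves them unspecified.

```python
import operator


def issue_pair(regs, instr_a, instr_b, combine=operator.add):
    """Execute two instructions at once, merging results on a shared destination.

    Each instruction is a hypothetical tuple (dest, op, src1, src2).
    """
    dest_a, op_a, a1, a2 = instr_a
    dest_b, op_b, b1, b2 = instr_b
    # Both results are computed from the old register state before any write.
    result_a = op_a(regs[a1], regs[a2])
    result_b = op_b(regs[b1], regs[b2])
    if dest_a == dest_b:
        # Identical destinations: combine the two results into one write.
        regs[dest_a] = combine(result_a, result_b)
    else:
        regs[dest_a] = result_a
        regs[dest_b] = result_b
    return regs
```

With two multiplies aimed at the same register and addition as the combining step, the pair behaves like one step of a dot product, which suggests how results from several processing units could end up in a single register.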
Decompression bit processing with a general purpose alignment tool

Patent number: 6757820

Abstract: A method and apparatus for performing single-instruction bit field extraction and for counting a number of leading zeros in a sequence of bits on a general purpose processor are provided. The fast bit extraction operations are accomplished by executing a first instruction for extracting an arbitrary number of bits of a sequence of bits stored in two or more source registers of the processor starting at an arbitrary offset and then storing the extracted bits in a destination register. Both the source and the destination registers are specified by the instruction. In addition, a second instruction is provided for counting the number of leading zeros in a sequence of bits stored in two or more source registers of the processor and then storing a binary value representing the number of leading zeros in a destination register. Again, the source and the destination registers are specified by the second instruction.

Type: Grant

Filed: January 31, 2003

Date of Patent: June 29, 2004

Assignee: Sun Microsystems, Inc.

Inventors: Subramania Sudharsanan, Jeffrey Meng Wah Chan, Marc Tremblay
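
Both operations in this abstract treat two source registers as one continuous bit sequence. The Python sketch below assumes two 32-bit registers, with the first holding the more significant half and offsets counted from the most significant bit; the register width, bit ordering, and function names are assumptions, since the abstract does not fix them.

```python
REG_BITS = 32
REG_MASK = (1 << REG_BITS) - 1
TOTAL_BITS = 2 * REG_BITS


def _stream(hi, lo):
    """Concatenate two registers into one 64-bit sequence (hi first)."""
    return ((hi & REG_MASK) << REG_BITS) | (lo & REG_MASK)


def extract_bits(hi, lo, offset, count):
    """Extract `count` bits starting `offset` bits from the top of hi:lo."""
    if offset < 0 or not 0 < count <= REG_BITS or offset + count > TOTAL_BITS:
        raise ValueError("bit field does not fit in the source registers")
    shift = TOTAL_BITS - offset - count
    return (_stream(hi, lo) >> shift) & ((1 << count) - 1)


def count_leading_zeros(hi, lo):
    """Count zero bits before the first set bit of hi:lo (64 if all zero)."""
    return TOTAL_BITS - _stream(hi, lo).bit_length()
```

A field that straddles the register boundary, such as `extract_bits(0x0000000F, 0xF0000000, 28, 8)`, is the case that a single-register extract cannot handle and that motivates reading from two sources at once.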
Instruction set for bi-directional conversion and transfer of integer and floating point data

Patent number: 6754810

Abstract: An apparatus and method for bi-directional format conversion and transfer of data between integer and floating point registers is provided. A floating point register is configured to store floating point data, and integer data, in a variety of numerical formats. Data is moved in and out of the floating point register as integer data, and is converted into floating point format as needed. Separate processor instructions are provided for format conversion and data transfer to allow conversion and transfer operations to be separated.

Type: Grant

Filed: April 10, 2002

Date of Patent: June 22, 2004

Assignee: I.P.-First, L.L.C.

Inventors: Timothy A. Elliott, G. Glenn Henry
General-purpose processor that can rapidly perform binary polynomial arithmetic operations

Publication number: 20040117601

Abstract: One embodiment of the invention is a general-purpose processor. The general-purpose processor is configured to receive and execute instructions. The processor includes an integer execution unit. The processor also includes a binary polynomial execution unit.

Type: Application

Filed: December 12, 2002

Publication date: June 17, 2004

Inventors: Lawrence A. Spracklen, Sheueling Chang Shantz
Methods and apparatuses to clear state for operation of a stack

Patent number: 6751725

Abstract: Methods and apparatuses to clear state for operation of a stack. According to one embodiment of the invention, a processor comprises a set of one or more storage areas and a decode unit. The set of one or more storage areas are to store a plurality of tags and a top of stack indication, where each of the plurality of tags is to indicate if a register is in an empty or non-empty state. The decode unit is to decode scalar floating point instructions and packed data instructions, where at least certain of said scalar floating point instructions specify registers in a stack referenced manner and at least certain of said packed data instructions specify registers in a non-stack referenced manner. In addition, the packed data instructions include an instruction to mark the end of blocks of the packed data instructions in programs. The processor also comprises circuitry to cause the plurality of tags to indicate the empty state responsive to execution of the instruction.

Type: Grant

Filed: February 16, 2001

Date of Patent: June 15, 2004

Assignee: Intel Corporation

Inventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
Microprocessor with instruction for saturating and packing data

Patent number: 6748521

Abstract: A data processing system is provided with a digital signal processor which has an instruction for saturating multiple fields of a selected set of source operands and storing the separate saturated results in a selected destination register. A first 32-bit operand (600) and a second 32-bit operand (602) are treated as four 16-bit fields and the sixteen bits in each field are saturated separately. Multi-field saturation circuitry is operable to treat a source operand as a number of fields, such that a multi-field saturated (610) result is produced that includes a number of saturated results each corresponding to each field. One instruction is provided which treats an operand pair as having two packed fields, and another instruction is provided that treats the operand pair as having four packed fields. Saturation circuitry is operable to selectively treat a field as either a signed value or an unsigned value.

Type: Grant

Filed: October 31, 2000

Date of Patent: June 8, 2004

Assignee: Texas Instruments Incorporated

Inventor: David Hoyle
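
The four-field case in this abstract can be sketched in Python. The sketch assumes each 16-bit field is a signed value that is clamped to the unsigned 8-bit range and that the four results are packed into one 32-bit word, first operand in the upper half; the abstract does not state the target width or packing order, so these choices and the function names are illustrative.

```python
def to_signed16(field):
    """Interpret a 16-bit field as a two's-complement signed value."""
    return field - 0x10000 if field & 0x8000 else field


def saturate(value, low, high):
    """Clamp value into the inclusive range [low, high]."""
    return max(low, min(high, value))


def saturate_pack4(src1, src2):
    """Saturate four signed 16-bit fields to 0..255 and pack into 32 bits."""
    fields = [
        (src1 >> 16) & 0xFFFF,
        src1 & 0xFFFF,
        (src2 >> 16) & 0xFFFF,
        src2 & 0xFFFF,
    ]
    result = 0
    for field in fields:
        # Each field is saturated independently of its neighbours.
        result = (result << 8) | saturate(to_signed16(field), 0, 255)
    return result
```

Clamping rather than truncating matters for pixel data: a field holding 291 becomes 255 instead of wrapping to a small value, and a negative field becomes 0.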
Method and apparatus for instruction set architecture to perform primary and shadow digital signal processing sub-instructions simultaneously

Patent number: 6748516

Abstract: Disclosed is a method, apparatus, and an instruction set architecture (ISA) for an application specific signal processor (ASSP) tailored to digital signal processing (DSP) applications. A single DSP instruction includes a pair of sub-instructions: a primary DSP sub-instruction and a shadow DSP sub-instruction. Both the primary and the shadow DSP sub-instructions are dyadic DSP instructions performing two operations in one instruction cycle. Each signal processing unit of the ASSP includes a primary stage to execute a primary DSP sub-instruction based upon current data and a shadow stage to simultaneously execute a shadow DSP sub-instruction based upon delayed data stored locally within registers of the signal processing units. The present invention efficiently executes DSP instructions by simultaneously executing primary DSP sub-instructions (based upon current data) and shadow DSP sub-instructions (based upon delayed locally stored data) with a single DSP instruction.

Type: Grant

Filed: January 29, 2002

Date of Patent: June 8, 2004

Assignee: Intel Corporation

Inventors: Kumar Ganapathy, Ruban Kanapathipillai
Microprocessor with instructions for shuffling and dealing data

Patent number: 6745319

Abstract: A data processing system is provided with a digital signal processor (DSP) which has a shuffle instruction for shuffling a source operand (600) and storing the shuffled result in a selected destination register (610). A shuffled result is formed by interleaving bits from a first source operand portion with bits from a second operand portion. A de-interleave and pack (DEAL) instruction is provided for de-interleaving a source operand. The shuffle instruction and the DEAL instruction have an exactly inverse effect. The DSP includes swizzle circuitry that performs interleaving or de-interleaving in a single execution phase.

Type: Grant

Filed: October 31, 2000

Date of Patent: June 1, 2004

Assignee: Texas Instruments Incorporated

Inventors: Keith Balmer, David Hoyle, Lewis Nardini
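
The interleave and de-interleave pair in this abstract can be modelled bit by bit in Python. The sketch assumes a 32-bit operand whose lower 16 bits go to the even result positions and whose upper 16 bits go to the odd positions; the abstract only says that bits from two operand portions are interleaved, so the widths and ordering here are assumptions.

```python
HALF_BITS = 16


def shuffle(x):
    """Interleave the lower half (even bits) with the upper half (odd bits)."""
    lo = x & 0xFFFF
    hi = (x >> HALF_BITS) & 0xFFFF
    out = 0
    for i in range(HALF_BITS):
        out |= ((lo >> i) & 1) << (2 * i)
        out |= ((hi >> i) & 1) << (2 * i + 1)
    return out


def deal(x):
    """De-interleave: gather even bits into the lower half, odd into the upper."""
    lo = 0
    hi = 0
    for i in range(HALF_BITS):
        lo |= ((x >> (2 * i)) & 1) << i
        hi |= ((x >> (2 * i + 1)) & 1) << i
    return (hi << HALF_BITS) | lo
```

Applying `deal` to the output of `shuffle` returns the original operand, matching the abstract's statement that the two instructions have an exactly inverse effect. The loops stand in for what the swizzle circuitry does in a single execution phase.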
Lookahead register value tracking

Patent number: 6742112

Abstract: Apparatus and methods to track a register value. A microprocessor can include a first register, a control circuit, and an adder. The first register can store a tracked register value. The control circuit can include an instruction input to receive at least a portion of an instruction and a first output to output an arithmetic operation indication. The adder can include a control input to receive the arithmetic operation indication, a first input to receive an immediate operand of an instruction, and a second input to receive the tracked register value.

Type: Grant

Filed: December 29, 1999

Date of Patent: May 25, 2004

Assignee: Intel Corporation

Inventors: Adi Yoaz, Ronny Ronen, Stephan J. Jourdan, Michael Bekerman
Processor having a conditional branch extension of an instruction set architecture

Patent number: 6732259

Abstract: A processor having a conditional branch extension of an instruction set architecture which incorporates a set of high performance floating point operations. The instruction set architecture incorporates a variety of data formats including single precision and double precision data formats, as well as the paired-single data format that allows two simultaneous operations on a pair of operands. The extension includes instructions directed to branching if, for example, either one of two condition codes is false or true, if any of three condition codes are false or true, or if any one of four condition codes are false or true.

Type: Grant

Filed: July 30, 1999

Date of Patent: May 4, 2004

Assignee: MIPS Technologies, Inc.

Inventors: Radhika Thekkath, G. Michael Uhler, Ying-wai Ho, Chandlee B. Harrell
