Arithmetic Operation Instruction Processing Patents (Class 712/221)
  • Patent number: 6968445
    Abstract: A multithreaded processor includes an instruction decoder for decoding retrieved instructions to determine an instruction type for each of the retrieved instructions, an integer unit coupled to the instruction decoder for processing integer type instructions, and a vector unit coupled to the instruction decoder for processing vector type instructions. A reduction unit is preferably associated with the vector unit and receives parallel data elements processed in the vector unit. The reduction unit generates a serial output from the parallel data elements. The processor may be configured to execute at least control code, digital signal processor (DSP) code, Java code and network processing code, and is therefore well-suited for use in a convergence device. The processor is preferably configured to utilize token triggered threading in conjunction with instruction pipelining.
    Type: Grant
    Filed: October 11, 2002
    Date of Patent: November 22, 2005
    Assignee: Sandbridge Technologies, Inc.
    Inventors: Erdem Hokenek, Mayan Moudgill, C. John Glossner
  • Patent number: 6965985
    Abstract: A method for reducing signed load latency in a microprocessor has been developed. The method includes transferring a part of data to an aligner via a bypass, and generating a sign bit from the part of the data. The sign bit is transferred to the aligner along the bypass, and the data is separately transferred to the aligner along a data path.
    Type: Grant
    Filed: November 27, 2001
    Date of Patent: November 15, 2005
    Assignee: Sun Microsystems, Inc.
    Inventors: David M. Pini, Yuefei Ge, Anup S. Tirumala
  • Patent number: 6966056
    Abstract: A processor that has a plurality of instruction slots each of which stores an instruction to be executed in parallel. One of the plurality of instruction slots is a first instruction slot and another a second instruction slot. A special instruction stored in the first instruction slot is executed by a first functional unit that executes instructions stored in the first instruction slot, and a second functional unit that executes instructions stored in the second instruction slot. An instruction stored in the second instruction slot is executed in parallel by a third functional unit that executes instructions stored in the second instruction slot.
    Type: Grant
    Filed: March 14, 2001
    Date of Patent: November 15, 2005
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventor: Kenichi Kawaguchi
  • Patent number: 6963966
    Abstract: Methods and structures for efficiently implementing an accumulator-based load-store CPU architecture in a programmable logic device (PLD). The PLD includes programmable logic blocks, each logic block including function generators that can be optionally programmed to function as lookup tables or as RAM blocks. Each element of the CPU is implemented using these logic blocks, including an instruction register, an accumulator pointer, a register file, and an operation block. The register file is implemented using function generators configured as RAM blocks. This implementation eliminates the need for time-consuming accesses to an off-chip register file or to a dedicated RAM block.
    Type: Grant
    Filed: July 30, 2002
    Date of Patent: November 8, 2005
    Assignee: Xilinx, Inc.
    Inventor: Jorge Ernesto Carrillo
  • Patent number: 6961846
    Abstract: The present invention relates to a data processing unit for executing instructions stored in a memory comprising a plurality of registers coupled with an execution unit comprising a logic unit for execution of logic operations. The logic unit comprises a first logic operator which can be coupled with a first and second register as an input register and which generates an output bit as a result of a logic operation. It further comprises a Boolean operator which receives the output bit of the first logic operator as a first input and second input bit from a third register which generates an output bit as a result of a Boolean operation.
    Type: Grant
    Filed: September 12, 1997
    Date of Patent: November 1, 2005
    Assignee: Infineon Technologies North America Corp.
    Inventors: Rod G. Fleck, Karl-Heinz Mattheis
  • Patent number: 6961845
    Abstract: A method and apparatus for including in a processor instructions for performing intra-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data. The processor performs operations on data elements in the first packed data to generate a plurality of data elements in a second packed data in response to receiving an instruction. At least two of the plurality of data elements in the second packed data store the result of an intra-add operation on data elements in the first packed data.
    Type: Grant
    Filed: July 9, 2002
    Date of Patent: November 1, 2005
    Assignee: Intel Corporation
    Inventor: Patrice Roussel
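    Illustration (not part of the patent record): a minimal sketch of what an intra-add on one packed operand might look like. The pairing of adjacent elements, the 16-bit lane width, and the name `intra_add` are assumptions made for this example; the abstract does not fix them.

```python
def intra_add(packed, lane_bits=16):
    """Pairwise-add adjacent elements of a single packed operand.

    Hypothetical layout: `packed` is a list of unsigned lane values, and
    each result element is the wrapped sum of two neighbouring lanes.
    """
    if len(packed) % 2:
        raise ValueError("expected an even number of elements")
    mask = (1 << lane_bits) - 1
    # Both addends come from the same source operand, which is what
    # distinguishes an intra-add from an ordinary lane-wise packed add.
    return [(packed[i] + packed[i + 1]) & mask
            for i in range(0, len(packed), 2)]
```

    With four lanes, `intra_add([1, 2, 3, 4])` gives two sums, one per adjacent pair.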
  • Patent number: 6954842
    Abstract: General purpose flags (ACFs) are defined and encoded utilizing a hierarchical one-, two- or three-bit encoding. Each added bit provides a superset of the previous functionality. With condition combination, a sequential series of conditional branches based on complex conditions may be avoided and complex conditions can then be used for conditional execution. ACF generation and use can be specified by the programmer. By varying the number of flags affected, conditional operation parallelism can be widely varied, for example, from mono-processing to octal-processing in VLIW execution, and across an array of processing elements (PEs). Multiple PEs can generate condition information at the same time with the programmer being able to specify a conditional execution in one processor based upon a condition generated in a different processor using the communications interface between the processing elements to transfer the conditions.
    Type: Grant
    Filed: August 28, 2003
    Date of Patent: October 11, 2005
    Assignee: PTS Corporation
    Inventors: Thomas L. Drabenstott, Gerald G. Pechanek, Edwin F. Barry, Charles W. Kurak, Jr.
  • Patent number: 6944753
    Abstract: A method for allowing a partial instruction to be executed in a fixed point unit pipeline during the instruction dispatch cycle creates a mask used to select which bits of the operands participate in a future logical operation of the fixed point unit back a cycle to the instruction dispatch stage of the fixed point unit. As an S/390 System improvement applicable to other computers, the mask is determined and created two cycles ahead of execution, or two cycles before the mask is actually used. Also, in the method used for moving the mask generation back by one cycle, mask generation overlaps the dispatch stage in the I-unit, and this provides a handshake between the I-unit and E-unit of the fixed point unit of the central processor unit of the computer system. The control setting selection process occurs in a predetermination cycle stage or e-1 (em1) stage for the mask generation and the register file read address.
    Type: Grant
    Filed: April 11, 2001
    Date of Patent: September 13, 2005
    Assignee: International Business Machines Corporation
    Inventors: Fadi Y. Busaba, Christopher A. Krygowski, Wen H. Li
  • Patent number: 6944853
    Abstract: A processor includes a series of predicate registers 135. Each predicate register is switchable between at least respective first and second states and each is assignable to one or more predicated-execution instructions. A control information holding unit 131 holds items of control information which correspond respectively to the predicate registers. An operating unit 133 is provided for each one of the predicate registers and receives items of control information Li and Li+1 and items of state information Pi, Pi-1. Each operating unit is operable to perform a selected state determining operation in which the state of its own predicate register is determined in dependence upon the received items. The operating units operate in parallel with one another to perform respective such state determining operations. The state determining operations can be used to bring about state changes required in prologue, kernel and epilogue stages of a software-pipelined loop.
    Type: Grant
    Filed: May 22, 2001
    Date of Patent: September 13, 2005
    Assignee: PTS Corporation
    Inventor: Nigel Peter Topham
  • Patent number: 6922773
    Abstract: For use in a data processor comprising an instruction execution pipeline comprising N processing stages, a system and method of encoding constant operands is disclosed. The system comprises a constant generator unit that is capable of generating both short constant operands and long constant operands. The constant generator unit extracts the bits of a short constant operand from an instruction syllable and right justifies the bits in an output syllable. For long constant operands, the constant generator unit extracts K low order bits from an instruction syllable and T high order bits from an extension syllable. The right justified K low order bits and the T high order bits are combined to represent the long constant operand in one output syllable. In response to the status of op code bits located within a constant generation instruction, the constant generator unit enables and disables multiplexers to automatically generate the appropriate short or long constant operand.
    Type: Grant
    Filed: December 29, 2000
    Date of Patent: July 26, 2005
    Assignees: STMicroelectronics, Inc., Hewlett-Packard Company
    Inventors: Paolo Faraboschi, Alexander J. Starr, Anthony X. Jarvis, Geoffrey M. Brown, Mark Owen Homewood, Gary L. Vondran
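    Illustration (not part of the patent record): a sketch of the short/long constant scheme the abstract describes. The field widths (K = 9, T = 23), the field position, and zero extension are all assumptions chosen for the example; the abstract gives no concrete encoding.

```python
K = 9           # assumed width of the immediate field in the instruction syllable
T = 23          # assumed width of the field taken from the extension syllable
IMM_SHIFT = 12  # assumed bit position of the immediate field

def short_constant(instr):
    """Extract the K-bit field and right-justify it (zero-extended)."""
    return (instr >> IMM_SHIFT) & ((1 << K) - 1)

def long_constant(instr, ext):
    """Combine K low-order bits from the instruction syllable with
    T high-order bits from the extension syllable into one value."""
    low = short_constant(instr)
    high = ext & ((1 << T) - 1)
    return (high << K) | low
```

    With these widths a long constant fills a full 32-bit output syllable (K + T = 32).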
  • Patent number: 6918029
    Abstract: A method and system of executing computer instructions is described. Each instruction defines first and second operands and an operation to be carried out on said operands. Each instruction also contains an address field of a predetermined bit length which identifies a test register holding a plurality of test bits greater than the predetermined bit length. The test register holds a test code defining a test condition. The test condition is checked against at least one condition code and the operation is selectively carried out in dependence on whether the condition code satisfies the test condition. In one embodiment, the condition codes are set on a lane-by-lane basis for packed operands.
    Type: Grant
    Filed: January 14, 2003
    Date of Patent: July 12, 2005
    Assignee: Broadcom Corporation
    Inventor: Sophie Wilson
  • Patent number: 6918028
    Abstract: A digital data processor having a main pipeline to which a side pipe is loosely coupled. In particular, the side pipe is coupled to the main pipeline at a point after which an instruction entering the side pipe cannot cause an exception. When such an instruction enters the first stage of the side pipe, a copy or “ghost” of this instruction is created. While the actual instruction flows down the side pipe, this ghost instruction is allowed to flow independently down the main pipeline as if it were a non-squashable no-op. When the ghost reaches the retirement stage of the main pipeline, it is retired in normal program order, regardless of the status of the actual instruction. However, in addition, each system resource that is still waiting for a result from the actual instruction is marked appropriately. When the actual instruction finally completes in the side pipe, the only consequence, other than those local to the side pipe itself, is that any results are forwarded to the awaiting resources.
    Type: Grant
    Filed: March 28, 2000
    Date of Patent: July 12, 2005
    Assignee: Analog Devices, Inc.
    Inventor: David B. Witt
  • Patent number: 6901503
    Abstract: An integrated circuit contains a microprocessor core, program memory and separate data storage, together with analog and digital signal processing circuitry. The ALU is 16 bits wide, but a 32-bit shift unit is provided, using a pair of 16-bit registers. The processor has a fixed length instruction format, with an instruction set including multiply and divide operations which use the shift unit over several cycles. No interrupts are provided. External pins of the integrated circuit allow for single stepping and other debug operations, and a serial interface (SIF) allows external communication of test data or working data as necessary. The serial interface has four wires (SERIN, SEROUT, SERCLK, SERLOADB), allowing handshaking with a master apparatus, and allowing direct access to the memory space of the processor core, without specific program control.
    Type: Grant
    Filed: October 29, 2001
    Date of Patent: May 31, 2005
    Assignee: Cambridge Consultants Ltd.
    Inventors: Stephen John Barlow, Alistair Guy Morfey, James Digby Collier
  • Patent number: 6874079
    Abstract: Aspects of a method and system for digital signal processing within an adaptive computing engine are described. These aspects include a mini-matrix, the mini-matrix comprising a set of composite blocks, each composite block capable of executing a predetermined set of instructions. A sequencer is included for controlling the set of composite blocks and directing instructions among the set of composite blocks based on a data-flow graph. Further, a data network is included and transmits data to and from the set of composite blocks and to the sequencer, while a status network routes status word data resulting from instruction execution in the set of composite blocks. With the present invention, an effective combination of hardware resources is provided in a manner that provides multi-bit digital signal processing capabilities for an embedded system environment, particularly in an implementation of an adaptive computing engine.
    Type: Grant
    Filed: July 25, 2001
    Date of Patent: March 29, 2005
    Assignee: Quicksilver Technology
    Inventor: Eugene B. Hogenauer
  • Patent number: 6862678
    Abstract: An apparatus and method for a data processing system that uses multiply-accumulate instructions. The apparatus for processing data includes a special register bank of N-bit data processing registers, a general register bank of N-bit data processing registers, a selector, a multiplier and an accumulator. The selector is coupled to the special register bank and the general register bank and is used for selecting one of the special and general register banks and outputting N-bit data from the selected register bank. The outputted N-bit data and the N-bit data held in the general register bank form a 2N-bit addition operand. The multiplier is used for performing a multiply operation upon a first operand and a second operand and outputting a 2N-bit result. The accumulator is coupled to the multiplier, the selector and the general register bank and is used for performing an accumulate operation upon the 2N-bit result and the 2N-bit addition operand and outputting a 2N-bit accumulated result.
    Type: Grant
    Filed: November 9, 2000
    Date of Patent: March 1, 2005
    Assignee: Faraday Technology Corp.
    Inventors: Min-Cheng Kao, Ching-Jer Liang, Calvin Guey
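    Illustration (not part of the patent record): a sketch of the multiply-accumulate data flow in the abstract, with N = 16. Treating the selector output as the high half of the 2N-bit addend, using unsigned arithmetic, and wrapping on overflow are assumptions made for the example.

```python
N = 16
MASK_N = (1 << N) - 1
MASK_2N = (1 << (2 * N)) - 1

def mac(a, b, selected_hi, general_lo):
    """Return (a * b + addend) wrapped to 2N bits.

    `selected_hi` stands for the N-bit value chosen by the selector from
    the special or general bank; `general_lo` is the N-bit value from the
    general bank. Together they form the 2N-bit addition operand.
    """
    addend = ((selected_hi & MASK_N) << N) | (general_lo & MASK_N)
    product = ((a & MASK_N) * (b & MASK_N)) & MASK_2N  # 2N-bit product
    return (product + addend) & MASK_2N                # 2N-bit accumulate
```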
  • Patent number: 6848043
    Abstract: Methods and apparatus for improving system performance using redundant arithmetic are disclosed. In one embodiment, one or more dependency chains are formed. A dependency chain may comprise two or more instructions. A first instruction may generate a result in a redundant form. A second instruction may accept the result from the first instruction as a first input operand. The instructions in the dependency chain may execute separately from instructions not in the dependency chain.
    Type: Grant
    Filed: April 27, 2000
    Date of Patent: January 25, 2005
    Assignee: Intel Corporation
    Inventors: Thomas Y. Yeh, Hong Wang, Ralph Kling, Yong-Fong Lee
  • Patent number: 6842848
    Abstract: Techniques for token triggered multithreading in a multithreaded processor are disclosed. An instruction issuance sequence for a plurality of threads of the multithreaded processor is controlled by associating with each of the threads at least one register which stores a value identifying a next thread to be permitted to issue one or more instructions, and utilizing the stored value to control the instruction issuance sequence. For example, each of a plurality of hardware thread units of the multithreaded processor may include a corresponding local register updatable by that hardware thread unit, with the local register for a given one of the hardware thread units storing a value identifying the next thread to be permitted to issue one or more instructions after the given hardware thread unit has issued one or more instructions. A global register arrangement may also or alternatively be used.
    Type: Grant
    Filed: October 11, 2002
    Date of Patent: January 11, 2005
    Assignee: Sandbridge Technologies, Inc.
    Inventors: Erdem Hokenek, Mayan Moudgill, C. John Glossner
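    Illustration (not part of the patent record): a toy model of token triggered issue ordering with one local "next thread" register per hardware thread unit. The list representation and the name `issue_order` are hypothetical, and the model ignores pipelining and register updates at run time.

```python
def issue_order(next_thread, start, cycles):
    """Simulate which thread issues in each cycle.

    `next_thread[t]` models the local register of hardware thread unit t:
    it names the thread allowed to issue after t has issued.
    """
    order = []
    current = start
    for _ in range(cycles):
        order.append(current)
        current = next_thread[current]  # hand the token to the named thread
    return order
```

    A round-robin schedule is `next_thread = [1, 2, 3, 0]`; rewriting the register contents yields other sequences such as 0, 2, 1, 3.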
  • Patent number: 6842852
    Abstract: An execution control instruction is applied to an information processor of the type processing instructions by pipelining to suppress the occurrence of branch hazard. The execution control instruction contains: a condition field for specifying an execution condition; and an instruction-specifying field for defining, in binary code, the number of instructions to be executed only conditionally. In response to the execution control instruction, a nullification controller decides, based on control flags provided from an arithmetic logic unit, whether or not the execution condition specified by the condition field is satisfied. And based on the outcome of this decision, the controller determines whether or not that number of instructions, which has been defined by the instruction-specifying field for instructions succeeding the execution control instruction, should be nullified.
    Type: Grant
    Filed: February 8, 2000
    Date of Patent: January 11, 2005
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Masayuki Yamasaki, Minoru Okamoto
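    Illustration (not part of the patent record): a small interpreter sketch of an execution control instruction that conditionally nullifies the next few instructions instead of branching. The tuple encoding, the `exec_if` mnemonic, and the flag names are invented for this example.

```python
def run(program, flags):
    """Return the mnemonics that actually execute.

    ("exec_if", cond, count) means: the next `count` instructions execute
    only if flag `cond` is set; otherwise they are nullified.
    """
    executed = []
    skip = 0
    for instr in program:
        if skip:
            skip -= 1          # nullified: flows through without effect
            continue
        if instr[0] == "exec_if":
            _, cond, count = instr
            if not flags.get(cond, False):
                skip = count   # condition failed: nullify the next `count`
        else:
            executed.append(instr[0])
    return executed
```

    Because the nullified instructions still occupy their pipeline slots in program order, no branch target has to be fetched, which is how such a scheme can avoid a branch hazard.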
  • Patent number: 6842850
    Abstract: An instruction set architecture (ISA) for application specific signal processor (ASSP) is tailored to digital signal processing applications. The ISA implemented with the ASSP, is adapted to DSP algorithmic structures. The ISA of the present invention includes flexible data typing, permutation, and type matching of operands. The flexible data typing, permutation and type matching of operands provides programming flexibility to support different filtering and DSP algorithms having different types of filter coefficients or data samples. A data typer and aligner within each signal processing unit within the ASSP supports flexible data typing, permutation and type matching of operands of the instruction set architecture.
    Type: Grant
    Filed: February 25, 2003
    Date of Patent: January 11, 2005
    Assignee: Intel Corporation
    Inventors: Kumar Ganapathy, Ruban Kanapathipillai
  • Publication number: 20040268094
    Abstract: A method and apparatus are described for converting a number from a floating point format to an integer format or from an integer format to a floating point format responsive to a control signal of a control signal format.
    Type: Application
    Filed: February 14, 2001
    Publication date: December 30, 2004
    Inventors: Mohammad Abdallah, Prasad Modali, Chien-Yu Huang, Hsien-Cheng E. Hsieh, Thomas R. Huff, Vladimir Pentkovski, Patrice Roussel, Shreekant S. Thakkar
  • Patent number: 6836839
    Abstract: The present invention concerns a new category of integrated circuitry and a new methodology for adaptive or reconfigurable computing. The preferred IC embodiment includes a plurality of heterogeneous computational elements coupled to an interconnection network. The plurality of heterogeneous computational elements include corresponding computational elements having fixed and differing architectures, such as fixed architectures for different functions such as memory, addition, multiplication, complex multiplication, subtraction, configuration, reconfiguration, control, input, output, and field programmability. In response to configuration information, the interconnection network is operative in real-time to configure and reconfigure the plurality of heterogeneous computational elements for a plurality of different functional modes, including linear algorithmic operations, non-linear algorithmic operations, finite state machine operations, memory operations, and bit-level manipulations.
    Type: Grant
    Filed: March 22, 2001
    Date of Patent: December 28, 2004
    Assignee: Quicksilver Technology, Inc.
    Inventors: Paul L. Master, Eugene Hogenauer, Walter James Scheuermann
  • Publication number: 20040260914
    Abstract: New instruction definitions for a packet add (PADD) operation and for a single instruction multiple add (SMAD) operation are disclosed. In addition, a new dedicated PADD logic device that performs the PADD operation in about one to two processor clock cycles is disclosed. Also, a new dedicated SMAD logic device that performs a single instruction multiple data add (SMAD) operation in about one to two clock cycles is disclosed.
    Type: Application
    Filed: June 23, 2003
    Publication date: December 23, 2004
    Inventors: Corey Gee, Bapiraju Vinnakota, Saleem Mohammadali, Carl A. Alberola
  • Patent number: 6831916
    Abstract: A host system is provided with one or more host-fabric adapters installed therein for connecting to a switched fabric of a data network. The host-fabric adapter comprises a micro-controller subsystem configured to establish connections and support data transfers via the switched fabric, and a serial interface which provides an interface with the switched fabric. The micro-controller subsystem includes a Micro-Engine (ME) which executes a ME instruction to send source and destination addresses during a control cycle, and interface logic blocks which supply addressed data from designated sources to the Micro-Engine (ME) at the same time for execution of the ME instruction during a data cycle subsequent to the control cycle.
    Type: Grant
    Filed: September 28, 2000
    Date of Patent: December 14, 2004
    Inventors: Balaji Parthasarathy, Dominic J. Gasbarro, Tom E. Burton, Brian M. Leitner
  • Patent number: 6832117
    Abstract: A processor core for realizing efficient operation processing by connecting an extended arithmetic unit to its exterior and a processor incorporating such a processing core are provided. The processor includes the processor core, a data memory accessed by the processor core, and the extended arithmetic unit connected to the exterior of the processor core for processing a particular instruction. The extended arithmetic unit executes an arithmetic operation by using arithmetic operation data retained in a register file in the processor core, and directly outputs an arithmetic operation result to the processor core. Then, the processor core saves the result of the arithmetic operation executed by the extended arithmetic unit and inputted therefrom in the register file in the processor core.
    Type: Grant
    Filed: September 21, 2000
    Date of Patent: December 14, 2004
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Takashi Miyamori
  • Patent number: 6817012
    Abstract: A method is provided for translating a source operation to a target operation. The source operation acts on one or more source operands, each comprising a binary integer of a first bit-width. The target operation is required to be evaluated by a processor, such as a computer, which performs integer operations on binary integers of a second bit-width which is greater than the first bit-width. The source operation is translated to a target operation having at least one target operand. The method identifies whether the value of unused bits of the or each target operand affects the value of the target operation and whether the target operand or any of the target operands is capable of having one or more unused bits of inappropriate value. If so, a correcting operation is added to the target operation for correcting the value of each of the bits of inappropriate value before performing the target operation.
    Type: Grant
    Filed: October 2, 2000
    Date of Patent: November 9, 2004
    Assignee: Sharp Kabushiki Kaisha
    Inventors: Vincent Zammit, Andrew Kay
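    Illustration (not part of the patent record): a sketch of why a correcting operation is sometimes needed when 8-bit source operations run on a 32-bit target. Using unsigned values and a mask as the correction are assumptions for the example; the abstract does not name a specific correcting operation.

```python
SRC_MASK = 0xFF         # first (source) bit-width: 8 bits
TGT_MASK = 0xFFFFFFFF   # second (target) bit-width: 32 bits

def add8_on32(a, b):
    """8-bit add evaluated at 32 bits. The low 8 bits of the result do not
    depend on the unused upper bits of the operands, so no correction is
    added; the result may itself carry stray upper bits."""
    return (a + b) & TGT_MASK

def shr8_on32(a, n):
    """8-bit logical right shift evaluated at 32 bits. Stray upper bits
    would be shifted into the low byte, so the operand is corrected
    (masked) before the shift is performed."""
    return ((a & SRC_MASK) >> n) & TGT_MASK
```

    For example, `add8_on32(0xF0, 0x20)` leaves a stray bit above the low byte; feeding that value to a shift without the mask would give a different low byte than the corrected shift.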
  • Publication number: 20040221137
    Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.
    Type: Application
    Filed: June 3, 2004
    Publication date: November 4, 2004
    Applicant: PTS Corporation
    Inventors: Nikos P. Pitsianis, Gerald G. Pechanek, Ricardo E. Rodriguez
  • Publication number: 20040215940
    Abstract: Each of registers R0 to R31 is divided into the upper 32-bit area and the lower 32-bit area. A register writing control unit 431 outputs information to the selectors 4321 and 4322 on the registers and the locations (upper and lower areas) in which data is written by the instructions that have issued in one cycle. Each of the selectors 4321 and 4322 selects one out of pieces of data that have been output from first, second, and third arithmetic operation units 44, 45, and 46 and writes the selected data in the upper or lower area in one register. A dependency analysis unit 110 in a compiling apparatus considers the upper and lower registers in one 64-bit register as separate resources, analyzes the data dependency relations between the instructions, and generates a dependency graph that indicates the data dependency relations. An instruction rearrangement unit 111 rearranges the instructions and generates execution codes using the dependency graph.
    Type: Application
    Filed: May 17, 2004
    Publication date: October 28, 2004
    Applicant: MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD.
    Inventors: Taketo Heishi, Kensuke Odani
  • Publication number: 20040210745
    Abstract: A programmable processor and method for improving the performance of processors by incorporating an execution unit configurable to execute a plurality of instruction streams from the plurality of threads, wherein each instruction stream includes a group instruction that operates on a plurality of data elements in partitioned fields of at least one of the registers to produce a catenated result.
    Type: Application
    Filed: January 16, 2004
    Publication date: October 21, 2004
    Applicant: MICROUNITY SYSTEMS ENGINEERING, INC.
    Inventors: Craig Hansen, John Moussouris
  • Patent number: 6807625
    Abstract: An apparatus and method for efficiently generating arithmetic flags in a computer system. The system includes an eflags register to store partially computed flags computed by an arithmetic logic unit. The stored partial flags are computed in one cycle. The stored flags are decoded by one of two consuming instructions, PRODF or TBIT, in a second cycle.
    Type: Grant
    Filed: February 18, 2000
    Date of Patent: October 19, 2004
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Patrick Knebel, Mark Gibson, Rohit Bhatia, Kevin David Safford
  • Publication number: 20040205323
    Abstract: A programmable processor and method for improving the performance of processors by incorporating an execution unit operable to decode and execute single instructions specifying a data selection operand and a first and a second register providing a plurality of data elements, the data selection operand comprising a plurality of fields each selecting one of the plurality of data elements, the execution unit operable to provide the data element selected by each field of the data selection operand to a predetermined position in a catenated result.
    Type: Application
    Filed: January 15, 2004
    Publication date: October 14, 2004
    Applicant: MICROUNITY SYSTEMS ENGINEERING, INC.
    Inventors: Craig Hansen, John Moussouris
  • Publication number: 20040199750
    Abstract: A programmable processor that comprises a general purpose processor architecture, capable of operation independent of another host processor, having a virtual memory addressing unit, an instruction path and a data path; an external interface; a cache operable to retain data communicated between the external interface and the data path; at least one register file configurable to receive and store data from the data path and to communicate the stored data to the data path; and a multi-precision execution unit coupled to the data path. The multi-precision execution unit is configurable to dynamically partition data received from the data path to account for an elemental width of the data and is capable of performing group floating-point operations on multiple operands in partitioned fields of operand registers and returning catenated results. In other embodiments the multi-precision execution unit is additionally configurable to execute group integer and/or group data handling operations.
    Type: Application
    Filed: August 25, 2003
    Publication date: October 7, 2004
    Applicant: MICROUNITY SYSTEMS ENGINEERING, INC.
    Inventors: Craig Hansen, John Moussouris
  • Publication number: 20040199751
    Abstract: An apparatus for performing an MMX PSADBW instruction is disclosed. The apparatus includes carry-generating subtraction logic that generates packed differences of the subtrahend from the minuend and associated carry bits indicating whether the difference is positive or negative. The apparatus selectively inverts the differences based on the carry bits. Addition logic adds the selectively inverted differences and carry bits substantially in parallel to generate the PSADBW instruction result. In one embodiment, the apparatus also includes two muxes. The first mux selects the selectively inverted differences in the case of a PSADBW instruction and selects a multiply instruction's partial products otherwise. The second mux selects the carry bits in the case of a PSADBW instruction and selects a second multiply instruction's partial products otherwise. The two mux outputs are provided to the addition logic.
    Type: Application
    Filed: January 27, 2004
    Publication date: October 7, 2004
    Applicant: VIA Technologies, Inc.
    Inventors: Daniel W.J. Johnson, Albert J. Loper
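    Illustration (not part of the patent record): a software model of the subtract, selectively invert, then add-with-carry approach to a sum of absolute differences over packed bytes. It mirrors the steps named in the abstract but is only a sketch; the function name and list representation are assumptions.

```python
def psadbw(a_bytes, b_bytes):
    """Sum of absolute differences of unsigned bytes, computed as
    subtract -> selectively invert -> add inverted differences and carries."""
    total = 0
    for a, b in zip(a_bytes, b_bytes):
        diff = (a - b) & 0xFF             # 8-bit wrapped difference
        borrow = 1 if a < b else 0        # carry bit: difference was negative
        inverted = diff ^ (0xFF if borrow else 0x00)
        # ~diff + 1 is the two's-complement negation, so each term is |a - b|.
        total += inverted + borrow
    return total & 0xFFFF                 # 8 * 255 fits easily in 16 bits
```

    Adding the carry bits in the same adder tree as the inverted differences is what lets the negation cost no extra pass.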
  • Publication number: 20040193847
    Abstract: Intra-register subword add instructions yield results that are a function of a sum having as at least some of its addends unary functions of at least two subwords stored in the same register. For example, one “TreeAdd” instruction yields a sum of all subwords in a register. A “parallel accumulate” PAcc instruction yields a result with four 2-byte result subwords. Each result subword is the sum of a 2-byte value in a first operand register and two of eight 1-byte subwords in a second operand register. A “Parallel Accumulate Magnitude” PAccMagLR also yields a result with four 2-byte subwords. Each of these subwords is the sum of a 2-byte value in a first operand register and the absolute values of two 1-byte values in a second operand register. These instructions provide for substantial performance enhancements for motion estimation used in video compression.
    Type: Application
    Filed: March 31, 2003
    Publication date: September 30, 2004
    Inventors: Ruby B. Lee, Dale Morris
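The three instructions named in this abstract can be modeled roughly as follows. This is a hedged sketch: registers are represented as Python lists of subwords, and the pairing of adjacent bytes, the 16-bit wrap-around, and the function names `tree_add`, `pacc`, and `pacc_mag` are assumptions that the abstract does not specify.

```python
def tree_add(subwords):
    """TreeAdd: sum of all subwords held in one register."""
    return sum(subwords)


def pacc(acc, bytes8):
    """PAcc: each 2-byte result is acc[i] plus two 1-byte subwords.

    Assumes adjacent bytes are paired and results wrap at 16 bits.
    """
    return [(acc[i] + bytes8[2 * i] + bytes8[2 * i + 1]) & 0xFFFF
            for i in range(4)]


def pacc_mag(acc, signed_bytes8):
    """PAccMagLR-style accumulate: adds absolute values of signed bytes."""
    return [(acc[i] + abs(signed_bytes8[2 * i]) + abs(signed_bytes8[2 * i + 1])) & 0xFFFF
            for i in range(4)]


print(pacc([10, 20, 30, 40], [1, 2, 3, 4, 5, 6, 7, 8]))
```

In a motion-estimation loop, the signed bytes passed to `pacc_mag` would typically be pixel differences, so repeated accumulation yields partial sums of absolute differences.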
  • Patent number: 6799267
    Abstract: A packet processor having a general-purpose arithmetic operator and another dedicated circuit, which extracts a particular field from the general-purpose register as object field, on which the predetermined general-purpose arithmetic operation is to be performed by the general-purpose arithmetic operator and writes a result of the arithmetic operation by the general-purpose arithmetic operator into the general-purpose register as updated information of the particular field. Based on the extraction and write process of the packet field designated by software (instructions), the packet processor realizes high flexibility and high speed processing.
    Type: Grant
    Filed: December 20, 2000
    Date of Patent: September 28, 2004
    Assignee: Fujitsu Limited
    Inventors: Yuji Kojima, Tetsumei Tsuruoka, Kenichi Abiru, Yasuyuki Umezaki, Yoshitomo Shimozono
  • Patent number: 6785847
    Abstract: Aspects for soft error detection for a superscalar microprocessor are described. The aspects include a first pipeline, the first pipeline including a first arithmetic logic unit, ALU, comparator and a first general purpose register, GPR, for storing first data, and a second pipeline, the second pipeline including a second GPR and a second ALU comparator, the second GPR for storing second data, the second data being a copy of the first data. A detection system utilizes one of the first and second ALU comparators to perform a comparison of the second data with the first data during an idle state of the first and second pipelines.
    Type: Grant
    Filed: August 3, 2000
    Date of Patent: August 31, 2004
    Assignee: International Business Machines Corporation
    Inventors: Paul J. Jordan, Peter J. Klim
  • Patent number: 6779106
    Abstract: An apparatus and method for performing integer divide operations in an IA64 architecture based data processing system is provided. The apparatus and method insert integer divide checks in place of NOP instructions in the instruction bundles associated with integer divide operations. The checks serve to identify typically encountered integer divide operations. Based on such identifications, the integer divide operation may be short-circuited such that the appropriate result may be returned without having to complete the integer divide operation.
    Type: Grant
    Filed: September 28, 2000
    Date of Patent: August 17, 2004
    Assignee: International Business Machines Corporation
    Inventor: Geoffrey Owen Blandy
  • Publication number: 20040153632
    Abstract: A system and software for improving the performance of processors by incorporating an execution unit operable to decode and execute single instructions specifying a data selection operand and a first and a second register providing a plurality of data elements, the data selection operand comprising a plurality of fields each selecting one of the plurality of data elements, the execution unit operable to provide the data element selected by each field of the data selection operand to a predetermined position in a catenated result.
    Type: Application
    Filed: January 16, 2004
    Publication date: August 5, 2004
    Applicant: MICROUNITY SYSTEMS ENGINEERING, INC.
    Inventors: Craig Hansen, John Moussouris
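The data selection operation in this abstract amounts to a table-driven permutation. The sketch below assumes that the two registers are concatenated into a single pool of elements and that each selector field is an index into that pool; the name `select_elements` and the indexing order are hypothetical.

```python
def select_elements(selectors, reg_a, reg_b):
    """Build a result by picking one element per selector field.

    selectors: indices into the combined element pool (assumed ordering:
               elements of reg_a first, then elements of reg_b).
    """
    pool = list(reg_a) + list(reg_b)
    return [pool[s] for s in selectors]


print(select_elements([3, 0, 5, 4], [10, 11, 12, 13], [20, 21, 22, 23]))
```

Because every result position names its own source, the same operation can express swaps, broadcasts, and interleaves without dedicated instructions for each.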
  • Patent number: 6772327
    Abstract: An FPU pipeline is synchronized with a CPU pipeline. Synchronization is achieved by having stalls and freezes in any one pipeline cause stalls and freezes in the other pipeline as well. Exceptions are kept precise even for long floating point operations. Precise exceptions are achieved by having a first execution stage of the FPU pipeline generate a busy signal, when a first floating point instruction enters a first execution stage of the FPU pipeline. When a second floating point instruction is decoded by the FPU pipeline before the first floating point instruction has finished executing in the first stage of the FPU pipeline, then both pipelines are stalled.
    Type: Grant
    Filed: May 9, 2002
    Date of Patent: August 3, 2004
    Assignee: Hitachi Micro Systems, Inc.
    Inventors: Prasenjit Biswas, Gautam Dewan, Kevin Iadonato, Norio Nakagawa, Kunio Uchiyama
  • Patent number: 6772319
    Abstract: An instruction set architecture (ISA) to convert voice and data samples into packets for transmission over a network and to convert packets received from the network into voice and data samples. In one embodiment, the ISA includes a digital signal processing (DSP) instruction set architecture for a plurality of signal processing units and a control instruction set architecture to control the execution of DSP instructions by the plurality of signal processing units. In another embodiment, the ISA includes a plurality of DSP instructions including a 20-bit DSP instruction and a 40-bit DSP instruction and a plurality of control instructions to control execution of the plurality of DSP instructions including a 20-bit control instruction and a 40-bit control instruction. The DSP instructions may be dyadic DSP instructions including a main DSP operation and a sub DSP operation.
    Type: Grant
    Filed: August 8, 2002
    Date of Patent: August 3, 2004
    Assignee: Intel Corporation
    Inventors: Kumar Ganapathy, Ruban Kanapathipillai
  • Publication number: 20040128486
    Abstract: A method and system including transmitting data in an architectural format between execution units in a multi-type instruction set architecture and converting data received in the architectural format to an internal format and data output in the internal format to the architectural format based on an operation code and a data type of a microinstruction.
    Type: Application
    Filed: December 31, 2002
    Publication date: July 1, 2004
    Inventors: Zeev Sperber, Ittai Anati, Oded Liron, Mohammad Abdallah
  • Patent number: 6757813
    Abstract: In a processor executing plural instructions simultaneously, writing-destination-register numbers of the plural instructions to be executed simultaneously are compared, and kinds of operations to be executed by the plural instructions are changed in response to a comparison result. When the writing-destination-register numbers of the plural instructions are identical, a constant operation is applied to plural operation results obtained from the plural instructions to obtain an operation result and the operation result is written into the writing-destination-register instructed by the plural instructions. Results outputted from plural processing units are put together into one result and the result is stored in one register. Thus, register use efficiency and process efficiency are improved.
    Type: Grant
    Filed: June 23, 2000
    Date of Patent: June 29, 2004
    Assignee: NEC Corporation
    Inventor: Hiroyuki Igura
  • Patent number: 6757820
    Abstract: A method and apparatus for performing single-instruction bit field extraction and for counting a number of leading zeros in a sequence of bits on a general purpose processor are provided. The fast bit extraction operations are accomplished by executing a first instruction for extracting an arbitrary number of bits of a sequence of bits stored in two or more source registers of the processor starting at an arbitrary offset and then storing the extracted bits in a destination register. Both the source and the destination registers are specified by the instruction. In addition, a second instruction is provided for counting the number of leading zeros in a sequence of bits stored in two or more source registers of the processor and then storing a binary value representing the number of leading zeros in a destination register. Again the source and the destination registers are specified by the second instruction.
    Type: Grant
    Filed: January 31, 2003
    Date of Patent: June 29, 2004
    Assignee: Sun Microsystems, Inc.
    Inventors: Subramania Sudharsanan, Jeffrey Meng Wah Chan, Marc Tremblay
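The two operations in this abstract can be sketched for the two-register case. The 32-bit register width, the convention that the offset is counted from the most significant bit of the concatenated pair, and the names `extract_bits` and `count_leading_zeros` are all assumptions for illustration; the abstract does not fix them.

```python
WIDTH = 32  # assumed register width in bits


def extract_bits(hi, lo, offset, length):
    """Extract `length` bits starting `offset` bits below the MSB of hi:lo."""
    if offset < 0 or length < 0 or offset + length > 2 * WIDTH:
        raise ValueError("bit field falls outside the register pair")
    combined = ((hi & 0xFFFFFFFF) << WIDTH) | (lo & 0xFFFFFFFF)
    shift = 2 * WIDTH - offset - length
    return (combined >> shift) & ((1 << length) - 1)


def count_leading_zeros(hi, lo):
    """Count zero bits above the highest set bit of the 64-bit pair hi:lo."""
    combined = ((hi & 0xFFFFFFFF) << WIDTH) | (lo & 0xFFFFFFFF)
    return 2 * WIDTH - combined.bit_length()


print(hex(extract_bits(0x000000FF, 0xF0000000, 24, 12)))
print(count_leading_zeros(0, 1))
```

Treating the register pair as one long bit string is what allows a field to straddle the boundary between the two source registers, as in the first call above.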
  • Patent number: 6754810
    Abstract: An apparatus and method for bi-directional format conversion and transfer of data between integer and floating point registers is provided. A floating point register is configured to store floating point data, and integer data, in a variety of numerical formats. Data is moved in and out of the floating point register as integer data, and is converted into floating point format as needed. Separate processor instructions are provided for format conversion and data transfer to allow conversion and transfer operations to be separated.
    Type: Grant
    Filed: April 10, 2002
    Date of Patent: June 22, 2004
    Assignee: I.P.-First, L.L.C.
    Inventors: Timothy A. Elliott, G. Glenn Henry
  • Publication number: 20040117601
    Abstract: One embodiment of the invention is a general-purpose processor. The general-purpose processor is configured to receive and execute instructions. The processor includes an integer execution unit. The processor also includes a binary polynomial execution unit.
    Type: Application
    Filed: December 12, 2002
    Publication date: June 17, 2004
    Inventors: Lawrence A. Spracklen, Sheueling Chang Shantz
  • Patent number: 6751725
    Abstract: Methods and apparatuses to clear state for operation of a stack. According to one embodiment of the invention, a processor comprises a set of one or more storage areas and a decode unit. The set of one or more storage areas are to store a plurality of tags and a top of stack indication, where each of the plurality of tags is to indicate if a register is in an empty or non-empty state. The decode unit is to decode scalar floating point instructions and packed data instructions, where at least certain of said scalar floating point instructions specify registers in a stack referenced manner and at least certain of said packed data instructions specify registers in a non-stack referenced manner. In addition, the packed data instructions include an instruction to mark the end of blocks of the packed data instructions in programs. The processor also comprises circuitry to cause the plurality of tags to indicate the empty state responsive to execution of the instruction.
    Type: Grant
    Filed: February 16, 2001
    Date of Patent: June 15, 2004
    Assignee: Intel Corporation
    Inventors: David Bistry, Larry Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi, Millind Mittal, Benny Eitan
  • Patent number: 6748521
    Abstract: A data processing system is provided with a digital signal processor which has an instruction for saturating multiple fields of a selected set of source operands and storing the separate saturated results in a selected destination register. A first 32-bit operand (600) and a second 32-bit operand (602) are treated as four 16-bit fields and the sixteen bits in each field are saturated separately. Multi-field saturation circuitry is operable to treat a source operand as a number of fields, such that a multi-field saturated (610) result is produced that includes a number of saturated results each corresponding to each field. One instruction is provided which treats an operand pair as having two packed fields, and another instruction is provided that treats the operand pair as having four packed fields. Saturation circuitry is operable to selectively treat a field as either a signed value or an unsigned value.
    Type: Grant
    Filed: October 31, 2000
    Date of Patent: June 8, 2004
    Assignee: Texas Instruments Incorporated
    Inventor: David Hoyle
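The four-field case in this abstract can be sketched as below. The abstract does not state the width of each saturated result, so this example assumes that each signed 16-bit field is clamped to an 8-bit range and that the four results are packed into one 32-bit word; the function names and the packing order are likewise hypothetical.

```python
def saturate(value, bits, signed):
    """Clamp an integer to the range of a `bits`-wide signed or unsigned field."""
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    return max(lo, min(hi, value))


def to_signed16(x):
    """Interpret the low 16 bits of x as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def saturate_four_fields(op1, op2, signed=False):
    """Saturate four 16-bit fields from two 32-bit operands, pack into 32 bits."""
    fields = [to_signed16(op1 >> 16), to_signed16(op1),
              to_signed16(op2 >> 16), to_signed16(op2)]
    result = 0
    for f in fields:
        result = (result << 8) | (saturate(f, 8, signed) & 0xFF)
    return result


print(hex(saturate_four_fields(0x01230010, 0xFFFF007F)))
```

The `signed` flag mirrors the abstract's statement that a field may be treated as either a signed or an unsigned value; each field is clamped independently, so an overflow in one does not disturb its neighbors.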
  • Patent number: 6748516
    Abstract: Disclosed is a method, apparatus, and an instruction set architecture (ISA) for an application specific signal processor (ASSP) tailored to digital signal processing (DSP) applications. A single DSP instruction includes a pair of sub-instructions: a primary DSP sub-instruction and a shadow DSP sub-instruction. Both the primary and the shadow DSP sub-instructions are dyadic DSP instructions performing two operations in one instruction cycle. Each signal processing unit of the ASSP includes a primary stage to execute a primary DSP sub-instruction based upon current data and a shadow stage to simultaneously execute a shadow DSP sub-instruction based upon delayed data stored locally within registers of the signal processing units. The present invention efficiently executes DSP instructions by simultaneously executing primary DSP sub-instructions (based upon current data) and shadow DSP sub-instructions (based upon delayed locally stored data) with a single DSP instruction.
    Type: Grant
    Filed: January 29, 2002
    Date of Patent: June 8, 2004
    Assignee: Intel Corporation
    Inventors: Kumar Ganapathy, Ruban Kanapathipillai
  • Patent number: 6745319
    Abstract: A data processing system is provided with a digital signal processor (DSP) which has a shuffle instruction for shuffling a source operand (600) and storing the shuffled result in a selected destination register (610). A shuffled result is formed by interleaving bits from a first source operand portion with bits from a second operand portion. A de-interleave and pack (DEAL) instruction is provided for de-interleaving a source operand. The shuffle instruction and the DEAL instruction have an exactly inverse effect. The DSP includes swizzle circuitry that performs interleaving or de-interleaving in a single execution phase.
    Type: Grant
    Filed: October 31, 2000
    Date of Patent: June 1, 2004
    Assignee: Texas Instruments Incorporated
    Inventors: Keith Balmer, David Hoyle, Lewis Nardini
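The shuffle and DEAL operations in this abstract can be modeled as bit interleaving on a 32-bit word. In this sketch, which half of the operand lands on the even bit positions is an assumption, and the names `shuffle` and `deal` are chosen only to mirror the abstract.

```python
def shuffle(x):
    """Interleave the bits of the upper and lower 16-bit halves of x.

    Assumed layout: lower-half bits go to even positions,
    upper-half bits go to odd positions.
    """
    hi, lo = (x >> 16) & 0xFFFF, x & 0xFFFF
    out = 0
    for i in range(16):
        out |= ((lo >> i) & 1) << (2 * i)
        out |= ((hi >> i) & 1) << (2 * i + 1)
    return out


def deal(x):
    """De-interleave: gather even bits into the lower half, odd bits into the upper."""
    hi = lo = 0
    for i in range(16):
        lo |= ((x >> (2 * i)) & 1) << i
        hi |= ((x >> (2 * i + 1)) & 1) << i
    return (hi << 16) | lo


print(hex(shuffle(0xFFFF0000)))
print(hex(deal(shuffle(0x12345678))))
```

The second call shows the inverse relationship the abstract describes: applying `deal` to the output of `shuffle` recovers the original operand.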
  • Patent number: 6742112
    Abstract: Apparatus and methods to track a register value. A microprocessor can include a first register, a control circuit, and an adder. The first register can store a tracked register value. The control circuit can include an instruction input to receive at least a portion of an instruction and a first output to output an arithmetic operation indication. The adder can include a control input to receive the arithmetic operation indication, a first input to receive an immediate operand of an instruction, and a second input to receive the tracked register value.
    Type: Grant
    Filed: December 29, 1999
    Date of Patent: May 25, 2004
    Assignee: Intel Corporation
    Inventors: Adi Yoaz, Ronny Ronen, Stephan J. Jourdan, Michael Bekerman
  • Patent number: 6732259
    Abstract: A processor having a conditional branch extension of an instruction set architecture which incorporates a set of high performance floating point operations. The instruction set architecture incorporates a variety of data formats including single precision and double precision data formats, as well as the paired-single data format that allows two simultaneous operations on a pair of operands. The extension includes instructions directed to branching if, for example, either one of two condition codes is false or true, if any of three condition codes are false or true, or if any one of four condition codes are false or true.
    Type: Grant
    Filed: July 30, 1999
    Date of Patent: May 4, 2004
    Assignee: MIPS Technologies, Inc.
    Inventors: Radhika Thekkath, G. Michael Uhler, Ying-wai Ho, Chandlee B. Harrell