Arithmetic Operation Instruction Processing Patents (Class 712/221)

Floating point or vector (Class 712/222)

System and method for parallel computing multiple packed-sum absolute differences (PSAD) in response to a single instruction

Publication number: 20030005267

Abstract: A system and method are presented in which multiple packed-sum absolute differences (PSAD) are computed in response to a single instruction. One embodiment of the system comprises a first register configured to store a first operand having data elements, and a second register configured to store a second operand having data elements. Additionally, the system comprises a processor configured to perform multiple PSAD calculations between the data elements of the second operand and a first subset of data elements of the first operand. The multiple PSAD calculations are performed in response to a single instruction set. One embodiment of the method comprises the steps of receiving a single instruction, and performing multiple PSAD calculations in response to the single instruction.

Type: Application

Filed: June 19, 2002

Publication date: January 2, 2003

Inventors: Igor M. Koba, Mikhail Chernomordik
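The PSAD abstract above can be illustrated with a minimal sketch. It assumes each of the multiple calculations compares the second operand against a window of the first operand that slides by one element; the abstract only says "a first subset", so the window layout and the names `sad` and `multi_psad` are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def multi_psad(first, second, count):
    """Return `count` SAD values, as one instruction might produce them.

    Assumption: calculation i compares `second` against the window of
    `first` that starts at element offset i.
    """
    width = len(second)
    if len(first) < width + count - 1:
        raise ValueError("first operand too short for the requested windows")
    return [sad(first[i:i + width], second) for i in range(count)]


first_operand = [1, 2, 3, 4, 5, 6, 7, 8]
second_operand = [1, 2, 3, 4]
results = multi_psad(first_operand, second_operand, 4)
print(results)
```

In motion estimation, sliding-window SAD values of this kind are typically compared to find the best-matching block offset.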
Digital signal processing device

Patent number: 6502182

Abstract: A digital signal processing device applicable to a signal processing system using a CPU is mainly configured by an external memory and a digital signal processor (i.e., DSP), which are connected together using a data bus and an address bus. The external memory stores multiplier data and coefficient data as well as basic instructions. In the DSP, an ALU calculates addresses for accessing the external memory via the address bus. A bus control unit identifies the multiplier data, coefficient data and basic instructions respectively, which are read from the external memory. The DSP performs calculations containing multiplication using the multiplier data and coefficient data. The DSP is controlled in operations in response to a CPU mode and a DSP mode, one of which is selected by decoding the basic instruction(s) identified by the bus control unit. At the CPU mode, the basic instructions of sixteen bits are subjected to coding to produce high-speed instructions of thirty-two bits for controlling the DSP.

Type: Grant

Filed: April 28, 1999

Date of Patent: December 31, 2002

Assignee: Yamaha Corporation

Inventor: Morito Morishima
Data manipulation instruction for enhancing value and efficiency of complex arithmetic

Patent number: 6502117

Abstract: A method and apparatus for performing complex arithmetic is disclosed. In one embodiment, a method comprises decoding a single instruction, and in response to decoding the single instruction, moving a first operand occupying lower order bits of a first storage area to higher order bits of a result, moving a second operand occupying higher order bits of a second storage area to lower order bits of the result, and negating one of the first and second operands of the result.

Type: Grant

Filed: June 4, 2001

Date of Patent: December 31, 2002

Assignee: Intel Corporation

Inventors: Roger A. Golliver, Carole Dulong
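The data movement in the abstract above can be sketched as follows. This is an illustration only: it assumes 32-bit storage areas holding two 16-bit operands each, and the function name `swap_negate` and the `negate_high` flag are hypothetical.

```python
MASK16 = 0xFFFF


def swap_negate(src1, src2, negate_high=True):
    """Build a 32-bit result from two 32-bit storage areas.

    The low 16 bits of src1 move to the high half of the result, the
    high 16 bits of src2 move to the low half, and one of the two moved
    operands is negated (two's complement, wrapped to 16 bits).
    """
    high = src1 & MASK16           # first operand: low half of src1
    low = (src2 >> 16) & MASK16    # second operand: high half of src2
    if negate_high:
        high = (-high) & MASK16
    else:
        low = (-low) & MASK16
    return (high << 16) | low


print(hex(swap_negate(0x00030004, 0x00050006)))
print(hex(swap_negate(0x00030004, 0x00050006, negate_high=False)))
```

Swap-and-negate steps like this are useful in complex multiplication, where the imaginary part needs cross terms with one sign flipped.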
Method and apparatus for staggering execution of a single packed data instruction using the same circuit

Publication number: 20020184474

Abstract: A method and apparatus are disclosed for staggering execution of an instruction. According to one embodiment of the invention, a single macro instruction is received wherein the single macro instruction specifies at least two logical registers and wherein the two logical registers respectively store a first and second packed data operands having corresponding data elements. An operation specified by the single macro instruction is then performed independently on a first and second plurality of the corresponding data elements from said first and second packed data operands at different times using the same circuit to independently generate a first and second plurality of resulting data elements. The first and second plurality of resulting data elements are stored in a single logical register as a third packed data operand.

Type: Application

Filed: June 6, 2002

Publication date: December 5, 2002

Inventors: Patrice Roussel, Glenn J. Hinton, Shreekant S. Thakkar, Brent R. Boswell, Karol F. Menezes
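A minimal sketch of the staggering idea in the abstract above: one half-width "circuit" is reused at two different times to process the lower and upper halves of the packed operands. The half-and-half split and the name `staggered_packed_op` are assumptions for illustration.

```python
def staggered_packed_op(a, b, op):
    """Apply `op` element-wise to packed operands a and b in two passes.

    Assumption: the operands have an even number of elements and the
    shared circuit is half as wide as the packed operand.
    """
    half = len(a) // 2

    def circuit(xs, ys):
        # The single shared execution circuit, half the operand width.
        return [op(x, y) for x, y in zip(xs, ys)]

    low = circuit(a[:half], b[:half])    # first time step
    high = circuit(a[half:], b[half:])   # second time step, same circuit
    return low + high                    # stored as one packed result


packed_sum = staggered_packed_op([1, 2, 3, 4], [10, 20, 30, 40],
                                 lambda x, y: x + y)
print(packed_sum)
```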
Method and apparatus for staggering execution of a single packed data instruction using the same circuit

Publication number: 20020178348

Abstract: A method and apparatus are disclosed for staggering execution of an instruction. According to one embodiment of the invention, a single macro instruction is received wherein the single macro instruction specifies at least two logical registers and wherein the two logical registers respectively store a first and second packed data operands having corresponding data elements. An operation specified by the single macro instruction is then performed independently on a first and second plurality of the corresponding data elements from said first and second packed data operands at different times using the same circuit to independently generate a first and second plurality of resulting data elements. The first and second plurality of resulting data elements are stored in a single logical register as a third packed data operand.

Type: Application

Filed: June 6, 2002

Publication date: November 28, 2002

Inventors: Patrice Roussel, Glenn J. Hinton, Shreekant S. Thakkar, Brent R. Boswell, Karol F. Menezes
Fixed point unit pipeline allowing partial instruction execution during the instruction dispatch cycle

Publication number: 20020174324

Abstract: A method for allowing a partial instruction to be executed in a fixed point unit pipeline during the instruction dispatch cycle creates a mask used to select which bits of the operands participate in a future logical operation of the fixed point unit back a cycle to the instruction dispatch stage of the fixed point unit. As an S/390 System improvement applicable to other computers, the mask is determined and created two cycles ahead of execution, or two cycles before the mask is actually used. Also, in the method used for moving the mask generation back by one cycle, mask generation overlaps the dispatch stage in the I-unit, and this provides a handshake between the I-unit and E-unit of the fixed point unit of the central processor unit of the computer system. The control setting selection process occurs in a predetermination cycle stage or e-1 (em1) stage for the mask generation and the register file read address.

Type: Application

Filed: April 11, 2001

Publication date: November 21, 2002

Applicant: International Business Machines Corporation

Inventors: Fadi Y. Busaba, Christopher A. Krygowski, Wen H. Li
Updating condition status register based on instruction specific modification information in set/clear pair upon instruction commit in out-of-order processor

Patent number: 6484251

Abstract: A processor including a register, an execution unit, a temporary result buffer, and a commit function circuit. The register includes at least one register bit and may include one or more sticky bits. The execution unit is suitable for executing a set of computer instructions. The temporary result buffer is configured to receive, from the execution unit, register bit modification information provided by the instructions. The temporary result buffer is suitable for storing the modification information in set/clear pairs of bits corresponding to respective register bits of the register. The commit function circuit is configured to receive the set/clear pairs of bits from the temporary result buffer when the instruction is committed. The commit function circuit is suitable for generating an updated bit in response to receiving the set/clear pairs of bits. The updated bit is then committed to the corresponding register bit of the register.

Type: Grant

Filed: October 14, 1999

Date of Patent: November 19, 2002

Assignee: International Business Machines Corporation

Inventors: Robert Greg McDonald, Peichun Peter Liu, Christopher Hans Olson
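The set/clear pair mechanism described in the abstract above might be modeled as below. This is a sketch under assumptions: the priority of set over clear when both are asserted is not stated in the abstract, sticky-bit handling is not modeled, and the name `commit` is hypothetical.

```python
def commit(register_bits, pairs):
    """Return updated register bits from per-bit (set, clear) pairs.

    (0, 0) leaves the bit unchanged, (1, 0) sets it, (0, 1) clears it.
    Assumption: if both are asserted, set takes priority.
    """
    updated = []
    for bit, (set_bit, clear_bit) in zip(register_bits, pairs):
        if set_bit:
            updated.append(1)
        elif clear_bit:
            updated.append(0)
        else:
            updated.append(bit)
    return updated


status = commit([0, 1, 1, 0], [(1, 0), (0, 1), (0, 0), (0, 0)])
print(status)
```

Because each buffered pair carries only a modification rather than a full register value, the final bit can be computed at commit time from whatever the register holds at that moment.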
Conversion from packed floating point data to packed 8-bit integer data in different architectural registers

Patent number: 6480868

Abstract: A method and instruction for converting a number from a floating point format to an integer format are described. Numbers are stored in the floating point format in a register of a first set of architectural registers in a packed format. At least one of the numbers in the floating point format is converted to at least one 8-bit number in the integer format. The 8-bit number in the integer format is placed in a register of a second set of architectural registers in the packed format.

Type: Grant

Filed: April 27, 2001

Date of Patent: November 12, 2002

Assignee: Intel Corporation

Inventors: Mohammad A.F. Abdallah, Hsien-Cheng E. Hsieh, Thomas R. Huff, Vladimir Pentkovski, Patrice Roussel, Shreekant S. Thakkar
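As a rough illustration of the conversion in the abstract above, the sketch below packs floating point values into signed 8-bit lanes of a single integer. The saturating behavior, the rounding mode (Python's `round`, which rounds halves to even), the restriction to finite inputs, and the name `floats_to_packed_int8` are all assumptions, not details from the patent.

```python
def floats_to_packed_int8(values):
    """Convert finite floats to signed 8-bit integers packed into one int.

    Element i occupies bits [8*i, 8*i + 7]. Out-of-range values are
    saturated to the range -128..127 (an assumption).
    """
    packed = 0
    for i, value in enumerate(values):
        n = max(-128, min(127, round(value)))
        packed |= (n & 0xFF) << (8 * i)   # two's complement byte lane
    return packed


packed = floats_to_packed_int8([1.4, -2.0, 300.0, -200.5])
print(hex(packed))
```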
Data processing circuits and interfaces

Publication number: 20020161988

Abstract: An integrated circuit contains a microprocessor core, program memory and separate data storage, together with analog and digital signal processing circuitry. The ALU is 16 bits wide, but a 32-bit shift unit is provided, using a pair of 16-bit registers. The processor has a fixed length instruction format, with an instruction set including multiply and divide operations which use the shift unit over several cycles. No interrupts are provided. External pins of the integrated circuit allow for single stepping and other debug operations, and a serial interface (SIF) allows external communication of test data or working data as necessary. The serial interface has four wires (SERIN, SEROUT, SERCLK, SERLOADB), allowing handshaking with a master apparatus, and allowing direct access to the memory space of the processor core, without specific program control.

Type: Application

Filed: October 29, 2001

Publication date: October 31, 2002

Applicant: Cambridge Consultants Limited

Inventors: Stephen John Barlow, Alistair Guy Morfey, James Digby Collier
Coprocessor architecture for control processors using synchronous logic

Publication number: 20020156994

Abstract: A method for implementing coprocessors for control processors uses a synchronous logic design method to achieve low cost and high performance in control processors. The coprocessor, comprising signed two's complement multiplication, signed divide, shift left and shift right, and normalization, provides most of the math functions required for implementing digital signal processing algorithms. The processor coprocessor architecture uses data dependency to compute the time duration required to perform the math computation. This results in efficient implementation of DSP algorithms and eventually translates to better system level performance. The technique described to implement math computations uses a register file interface, existing instruction set, and existing legacy software development infrastructure to implement DSP systems.

Type: Application

Filed: April 24, 2001

Publication date: October 24, 2002

Inventors: Sanjay Agarwal, Sapna Agrawal
Superscalar processor and method for incrementally issuing store instructions

Patent number: 6463524

Abstract: A superscalar processor and method are disclosed for efficiently executing a store instruction. The store instruction is stored in an issue queue within the processor. A first part of the store instruction is issued from the issue queue to a first one of different execution units in response to a first operand becoming available. A second part of the store instruction is issued from the issue queue to a second one of the different execution units in response to a second operand becoming available. The store instruction is completed in response to executing the first part of the store instruction by the first one of the execution units and the second part of the store instruction by the second one of the execution units.

Type: Grant

Filed: August 26, 1999

Date of Patent: October 8, 2002

Assignee: International Business Machines Corporation

Inventors: Maureen Delaney, Hung Qui Le, Dung Quoc Nguyen, Robert McDonald, David W. Victor
Register rotation prediction and precomputation

Publication number: 20020144098

Abstract: Instruction-level parallelism in software pipelined loops is exploited by predicting future register rotations. A processor includes an architected current frame marker register and at least one unarchitected frame marker register. Register rotation prediction is achieved by setting the register rotation of future iterations of a software loop to be a function of the unarchitected frame marker registers. True data dependencies remain, but the dependencies caused solely by register renaming are removed. Dynamic predication is used to predicate instructions from future iterations, allowing them to be squashed if dependencies are later found. The register renaming that results from the prediction can be included in instructions in a buffer, or a renaming stage in an execution pipeline can perform the renaming.

Type: Application

Filed: March 28, 2001

Publication date: October 3, 2002

Applicant: Intel Corporation

Inventors: Hong Wang, Christopher J. Hughes, Ralph Kling, Yong-Fong Lee, Daniel M. Lavery, John Shen, Jamison Collins
Data type conversion based on comparison of type information of registers and execution result

Patent number: 6460135

Abstract: In a microprocessor, a type information comparator compares type information of an execution result of an instruction with type information of the type information register corresponding to the data register which is requested by said instruction, and generates an exceptional interrupt in the case of disagreement. An input output execution unit simultaneously receives the data of the data register and the type information of the type information register corresponding to said data register and performs an input and output for an external via an external bus. A calculation execution unit simultaneously receives the data of the data register and the type information of the type information register corresponding to the data register, and executes calculation.

Type: Grant

Filed: October 4, 1999

Date of Patent: October 1, 2002

Assignee: NEC Corporation

Inventor: Shigeru Suganuma
Seed ROM for reciprocal computation

Patent number: 6446106

Abstract: A method and apparatus for performing a divide operation in a computer are described. The apparatus includes a first memory containing estimated reciprocal terms, and a second memory containing reciprocal error terms. An adder adds a selected estimated reciprocal term from the first memory and a selected reciprocal error term from the second memory to provide the reciprocal. The selected estimated reciprocal term and the selected reciprocal error term correspond to at least a portion of a divisor. The apparatus includes a multiplier for multiplying a dividend by the reciprocal to generate a quotient. The method includes the step of looking up an estimated reciprocal term in a first lookup table stored in a first computer memory wherein the estimated reciprocal term corresponds to at least a portion of a given divisor. A reciprocal error term is looked up in a second lookup table stored in a second computer memory; the error term corresponds to at least a portion of the divisor.

Type: Grant

Filed: May 29, 2001

Date of Patent: September 3, 2002

Assignee: Micron Technology, Inc.

Inventor: James R. Peterson
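The two-table scheme in the abstract above can be sketched as follows. The table sizes, the indexing by the top eight fraction bits of a divisor normalized to [1, 2), and the first-order correction used to fill the error table are all assumptions chosen for illustration; the names `EST`, `ERR`, `reciprocal`, and `divide` are hypothetical.

```python
# First table: estimated reciprocals, indexed by the top 4 fraction bits.
EST = [1.0 / (1.0 + hi / 16) for hi in range(16)]

# Second table: error terms, indexed by (top 4 bits, next 4 bits).
# Assumption: a first-order correction, -delta / x0**2.
ERR = [[-(lo / 256) / (1.0 + hi / 16) ** 2 for lo in range(16)]
       for hi in range(16)]


def reciprocal(divisor):
    """Approximate 1/divisor for a divisor normalized to [1, 2)."""
    if not 1.0 <= divisor < 2.0:
        raise ValueError("divisor must be normalized to [1, 2)")
    frac = int((divisor - 1.0) * 256)   # top 8 fraction bits
    hi, lo = frac >> 4, frac & 0xF
    return EST[hi] + ERR[hi][lo]        # the adder stage


def divide(dividend, divisor):
    """Quotient as dividend times the looked-up reciprocal."""
    return dividend * reciprocal(divisor)


print(divide(3.0, 1.5))
```

The result is only an approximation whose accuracy depends on the table sizes; a hardware divider would typically refine such a seed further.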
Method and apparatus for single cycle processing of data associated with separate accumulators in a dual multiply-accumulate architecture

Patent number: 6446193

Abstract: A method and apparatus for reducing instruction cycles in a digital signal processor wherein the processor includes a multiplier unit, an adder, a memory, and at least one pair of first and second accumulators. The accumulators include respective guard, high and low parts. The method and apparatus enable vectoring the respective first and second high parts from the accumulators to define a single vectored register responsive to a single instruction cycle and processing the data in the vectored register.

Type: Grant

Filed: September 8, 1997

Date of Patent: September 3, 2002

Assignee: Agere Systems Guardian Corp.

Inventors: Mazhar M. Alidina, Sivanand Simanapalli, Larry R. Tate
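A small sketch of the "vectoring" step in the abstract above: the high parts of two accumulators are combined into a single register value. The 40-bit accumulator layout (8 guard bits, 16 high bits, 16 low bits), the ordering of the two halves, and the function names are assumptions.

```python
def high_part(acc):
    """Extract the 16-bit high part of a 40-bit accumulator.

    Assumed layout: bits 39..32 guard, 31..16 high, 15..0 low.
    """
    return (acc >> 16) & 0xFFFF


def vector_high_parts(acc0, acc1):
    """Pack the high parts of two accumulators into one 32-bit value.

    Assumption: acc1's high part lands in the upper 16 bits.
    """
    return (high_part(acc1) << 16) | high_part(acc0)


vectored = vector_high_parts(0x123456789A, 0xABCDEF0123)
print(hex(vectored))
```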
Dyadic operations instruction processor with configurable functional blocks

Patent number: 6446195

Abstract: An instruction set architecture (ISA) for an application specific signal processor (ASSP) is tailored to digital signal processing applications. The instruction set architecture, implemented with the ASSP, is adapted to DSP algorithmic structures. The instruction word of the ISA is typically 20 bits but can be expanded to 40 bits to control two instructions to be executed in series or parallel. All DSP instructions of the ISA are dyadic DSP instructions performing two operations with one instruction in one cycle. The DSP instructions or operations in the preferred embodiment include a multiply instruction (MULT), an addition instruction (ADD), a minimize/maximize instruction (MIN/MAX) also referred to as an extrema instruction, and a no operation instruction (NOP) each having an associated operation code (“opcode”). The present invention efficiently executes DSP instructions by means of the instruction set architecture and the hardware architecture of the application specific signal processor.

Type: Grant

Filed: January 31, 2000

Date of Patent: September 3, 2002

Assignee: Intel Corporation

Inventors: Kumar Ganapathy, Ruban Kanapathipillai
Processor with different width functional units ignoring extra bits of bus wider than instruction width

Patent number: 6442676

Abstract: A data processing system contains a processor supporting both Narrow and Wide instructions and Narrow and Wide word size fixed-point and floating-point operands. The processor communicates over a bus utilizing a Wide word size with the remainder of the data processing system consisting of industry standard memory and peripheral devices. Narrow word sized instructions are stored on Wide word-sized storage devices. In a preferred embodiment, the processor bus has a first integer number of significant data lines. The processor is responsively coupled to the processor bus and includes a first decoder for decoding a first set of instructions received over the set of processor data lines. The first set of instructions each contains a second integer number, less than the first integer number, of significant bits.

Type: Grant

Filed: June 30, 1999

Date of Patent: August 27, 2002

Assignee: Bull HN Information Systems Inc.

Inventor: Russell W. Guenthner
Distance controlled concatenation of selected portions of elements of packed data

Patent number: 6438676

Abstract: A data processor uses storage units that are subdivisible into predetermined fields for executing instructions that cause the data processor to handle numbers from respective ones of the fields separately. The processor has an instruction that addresses a first and a second one of the storage units. In response the data processor takes a first and second group of successive bits from a first and second one of the fields of the first one of the storage units, places the first and second groups of successive bits at respective shifted positions both in the same field in a result storage unit, a bit position distance between the shifted positions being controlled by a content of the second one of the storage units.

Type: Grant

Filed: August 4, 1999

Date of Patent: August 20, 2002

Assignee: TriMedia Technologies

Inventor: Fransiscus W. Sijstermans
Digital signal processor

Patent number: 6430681

Abstract: In a digital signal processor having an improved arithmetic processing efficiency, there is provided in parallel a first ROM for storing branch commands and a second ROM for storing arithmetic commands. The ROMs are connected to a branch command decoder and an arithmetic command decoder, respectively. Operations of a first memory control circuit and a second memory control circuit are controlled in response to instructions from the branch command decoder, while operations of an arithmetic circuit are controlled in response to instructions from the arithmetic command decoder. By processing the branch commands and the arithmetic commands in parallel, the operation efficiency of the arithmetic circuit is enhanced.

Type: Grant

Filed: June 18, 1999

Date of Patent: August 6, 2002

Assignee: Sanyo Electric Co., Ltd.

Inventor: Fumiaki Nagao
Accurate high speed digital signal processor

Patent number: 6427203

Abstract: An improved digital signal processor, in which arithmetic multiply-add instructions are performed faster with substantial accuracy. The digital signal processor performs multiply-add instructions with look-ahead rounding, so that rounding after repeated arithmetic operations proceeds much more rapidly. The digital signal processor is also augmented with additional instruction formats which are particularly useful for digital signal processing. A first additional instruction format allows the digital signal processor to incorporate a small constant immediately into an instruction, such as to add a small constant value to a register value, or to multiply a register by a small constant value; this allows the digital signal processor to conduct the arithmetic operation with only one memory lookup instead of two.

Type: Grant

Filed: August 22, 2000

Date of Patent: July 30, 2002

Assignee: Sigma Designs, Inc.

Inventor: Yann Le Cornec
Near-orthogonal dual-MAC instruction set architecture with minimal encoding bits

Publication number: 20020099923

Abstract: A near-orthogonal dual-MAC instruction set is provided which implements virtually the entire functionality of the orthogonal instruction set of 272 commands using only 65 commands. The reduced instruction set is achieved by eliminating instructions based on symmetry with respect to the result of the commands and by imposing simple restrictions related to items such as the order of data presentation by the programmer. Specific selections of commands are also determined by the double word aligned memory architecture which is associated with the dual-MAC architecture. The reduced instruction set architecture preserves the functionality and inherent parallelism of the command set and requires fewer command bits to implement than the full orthogonal set.

Type: Application

Filed: August 12, 1998

Publication date: July 25, 2002

Inventors: Mazhar M. Alidina, Sirvand Simanapalli, Larry R. Tate, Mark E. Thierbach
Variable length instruction decoder

Patent number: 6425070

Abstract: The present invention is a novel and improved method and circuit for digital signal processing. One aspect of the invention calls for the use of a variable length instruction set. A portion of the variable length instructions may be stored in adjacent locations within memory space with the beginning and ending of instructions occurring across memory word boundaries. Furthermore, additional aspects of the invention are realized by having instructions contain variable numbers of instruction fragments. Each instruction fragment causes a particular operation, or operations, to be performed allowing multiple operations during each clock cycle. Thus, multiple operations are performed during each clock cycle, reducing the total number of clock cycles necessary to perform a task. The exemplary DSP includes a set of three data buses over which data may be exchanged with a register bank and three data memories.

Type: Grant

Filed: March 18, 1998

Date of Patent: July 23, 2002

Assignee: Qualcomm, Inc.

Inventors: Qiuzhen Zou, Gilbert C. Sih, Inyup Kang, Quaeed Motiwala, Deepu John, Li Zhang, Haitao Zhang, Way-Shing Lee
Method and apparatus for staggering execution of an instruction

Patent number: 6425073

Abstract: A method and apparatus are disclosed for staggering execution of an instruction. According to one embodiment of the invention, a single macro instruction is received wherein the single macro instruction specifies at least two logical registers and wherein the two logical registers respectively store a first and second packed data operands having corresponding data elements. An operation specified by the single macro instruction is then performed independently on a first and second plurality of the corresponding data elements from said first and second packed data operands at different times using the same circuit to independently generate a first and second plurality of resulting data elements. The first and second plurality of resulting data elements are stored in a single logical register as a third packed data operand.

Type: Grant

Filed: March 13, 2001

Date of Patent: July 23, 2002

Assignee: Intel Corporation

Inventors: Patrice Roussel, Glenn J. Hinton, Shreekant S. Thakkar, Brent R. Boswell, Karol F. Menezes
Apparatus and method for performing intra-add operation

Patent number: 6418529

Abstract: Method and apparatus for including in a processor, instructions for performing intra-add operations on packed data. In one embodiment, an execution unit is coupled to a storage area. The storage area has stored therein a first packed data operand and a second packed data operand. The execution unit performs operations on data elements in the first packed data operand and the second packed data operand to generate a plurality of data elements in a packed data result in response to receiving a single instruction. At least two of the plurality of data elements in a packed data result store the result of an intra-add operation using the first packed data operand and the second packed data operand.

Type: Grant

Filed: March 31, 1998

Date of Patent: July 9, 2002

Assignee: Intel Corporation

Inventor: Patrice Roussel
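An intra-add, as the abstract above uses the term, sums elements within a single packed operand rather than across two operands. The sketch below assumes four-element operands and a result layout of adjacent-pair sums from the first operand followed by those from the second; the layout and the name `intra_add` are hypothetical.

```python
def intra_add(a, b):
    """Return adjacent-pair sums taken within each packed operand.

    Assumption: both operands hold four elements, and the result holds
    the two pair sums of `a` followed by the two pair sums of `b`.
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("this sketch assumes four-element operands")
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]


pair_sums = intra_add([1, 2, 3, 4], [10, 20, 30, 40])
print(pair_sums)
```

Horizontal sums of this kind are a common final step after element-wise multiplies, for example in dot products and matrix-vector multiplication.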
System and method for encoding constant operands in a wide issue processor

Publication number: 20020087834

Abstract: For use in a data processor comprising an instruction execution pipeline comprising N processing stages, a system and method of encoding constant operands is disclosed. The system comprises a constant generator unit that is capable of generating both short constant operands and long constant operands. The constant generator unit extracts the bits of a short constant operand from an instruction syllable and right justifies the bits in an output syllable. For long constant operands, the constant generator unit extracts K low order bits from an instruction syllable and T high order bits from an extension syllable. The right justified K low order bits and the T high order bits are combined to represent the long constant operand in one output syllable. In response to the status of op code bits located within a constant generation instruction, the constant generator unit enables and disables multiplexers to automatically generate the appropriate short or long constant operand.

Type: Application

Filed: December 29, 2000

Publication date: July 4, 2002

Applicant: STMicroelectronics, Inc.

Inventors: Paolo Faraboschi, Alexander J. Starr, Anthony X. Jarvis, Geoffrey M. Brown, Mark Owen Homewood, Gary L. Vondran
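The short and long constant paths in the abstract above could look roughly like this. The field position, the values K = 9 and T = 23, and the use of an optional extension argument in place of the opcode-controlled multiplexers are assumptions for illustration; `generate_constant` is a hypothetical name.

```python
K = 9             # low-order bits carried in the instruction syllable
T = 23            # high-order bits carried in the extension syllable
FIELD_SHIFT = 12  # hypothetical position of the constant field


def generate_constant(instr_syllable, ext_syllable=None):
    """Return the constant operand encoded by one or two syllables.

    Short form: extract the K-bit field and right-justify it.
    Long form: place T bits from the extension syllable above those
    K bits, giving one (K + T)-bit output value.
    """
    low = (instr_syllable >> FIELD_SHIFT) & ((1 << K) - 1)
    if ext_syllable is None:
        return low
    high = ext_syllable & ((1 << T) - 1)
    return (high << K) | low


instr = 0x1AB << FIELD_SHIFT
print(hex(generate_constant(instr)))
print(hex(generate_constant(instr, 0x2)))
```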
Status register associated with MMX register file for tracking writes

Patent number: 6412065

Abstract: A portion of an x86 microprocessor that supports MMX instructions provides a write tracking unit that tracks writes to a separately provided MMX register file, and updates a status register accordingly. A write control unit uses the contents of the status register to control transfers between the MMX register file and the FP register file, so as to only copy those registers that have changed. According to another aspect of the invention, the write control unit ensures that architecturally required modifications to the exponent portion of FP registers corresponding to modified MMX registers are provided.

Type: Grant

Filed: June 25, 1999

Date of Patent: June 25, 2002

Assignee: IP First, L.L.C.

Inventor: Albert J. Loper, Jr.
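The write-tracking idea in the abstract above can be sketched with one status bit per register, so that a later transfer copies only the registers that changed. The class and method names are hypothetical, and the exponent-field modification mentioned in the abstract is not modeled.

```python
class TrackedRegisterFile:
    """MMX-style register file with a per-register write status bit."""

    def __init__(self, count=8):
        self.mmx = [0] * count
        self.status = 0  # bit i is set when register i has been written

    def write(self, index, value):
        self.mmx[index] = value
        self.status |= 1 << index

    def sync_to(self, fp_registers):
        """Copy only modified registers; return the indices copied."""
        copied = []
        for i in range(len(self.mmx)):
            if (self.status >> i) & 1:
                fp_registers[i] = self.mmx[i]
                copied.append(i)
        self.status = 0  # everything is in sync again
        return copied


regs = TrackedRegisterFile()
fp = [None] * 8
regs.write(2, 0xAA)
regs.write(5, 0xBB)
first_sync = regs.sync_to(fp)
second_sync = regs.sync_to(fp)
print(first_sync, second_sync)
```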
Method and apparatus for interfacing between a digital signal processor and a baseband circuit for wireless communication system

Patent number: 6412029

Abstract: A method and apparatus for communicating transmit and receive data between a digital signal processor and the baseband processing circuitry in a digital communications station such as a digital cellular telephone. The invention utilizes a transmit buffer and a receive buffer for smoothing out the flow of data. TRANSMIT BUFFER EMPTY and RECEIVE BUFFER FULL interrupts indicating the need for data to be retrieved from the transmit buffer or sent to the receive buffer, respectively, are serviced by a DMA with translation circuitry rather than the DSP. The DMA with translation circuitry intercepts the interrupts and services them by transferring data directly to or from the DSP's RAM without disturbing the DSP. The translation circuitry also arbitrates between TRANSMIT BUFFER EMPTY and RECEIVE BUFFER FULL interrupts so as to service the RECEIVE BUFFER FULL interrupts first since they have stricter timing requirements.

Type: Grant

Filed: April 29, 1999

Date of Patent: June 25, 2002

Assignee: Agere Systems Guardian Corp.

Inventors: Hussein K. Mecklai, Andrew Lawrence Webb
Dynamic allocation of resources in multiple microprocessor pipelines

Patent number: 6408377

Abstract: A microprocessor having M parallel pipelines and N arithmetic logic units, where N is less than M. A single instruction fetch stage fetches multi-stage instructions, and a single instruction decoder provides a parallel set of three instructions to the three pipelines. The two ALUs are dynamically connected to two of the pipelines having instructions requiring an ALU, while the third pipeline executes an instruction in parallel that does not require an ALU. The third pipeline may have a move unit connected to it.

Type: Grant

Filed: April 26, 2001

Date of Patent: June 18, 2002

Assignee: Rise Technology Company

Inventor: Kenneth K. Munson
Rapid execution of floating point load control word instructions

Patent number: 6405305

Abstract: A microprocessor with a floating point unit configured to rapidly execute floating point load control word (FLDCW) type instructions in an out of program order context is disclosed. The floating point unit is configured to schedule instructions older than the FLDCW-type instruction before the FLDCW-type instruction is scheduled. The FLDCW-type instruction acts as a barrier to prevent instructions occurring after the FLDCW-type instruction in program order from executing before the FLDCW-type instruction. Indicator bits may be used to simplify instruction scheduling, and copies of the floating point control word may be stored for instructions that have long execution cycles. A method and computer configured to rapidly execute FLDCW-type instructions in an out of program order context are also disclosed.

Type: Grant

Filed: September 10, 1999

Date of Patent: June 11, 2002

Assignee: Advanced Micro Devices, Inc.

Inventors: Stephan G. Meier, Jeffrey E. Trull, Derrick R. Meyer, Norbert Juffa
Instruction set for bi-directional conversion and transfer of integer and floating point data

Patent number: 6405306

Abstract: An apparatus and method for bi-directional format conversion and transfer of data between integer and floating point registers is provided. A floating point register is configured to store floating point data, and integer data, in a variety of numerical formats. Data is moved in and out of the floating point register as integer data, and is converted into floating point format as needed. Separate processor instructions are provided for format conversion and data transfer to allow conversion and transfer operations to be separated.

Type: Grant

Filed: May 25, 2001

Date of Patent: June 11, 2002

Assignee: IP First LLC

Inventors: Timothy A. Elliott, G. Glenn Henry
Method and apparatus for performing vector and scalar multiplication and calculating rounded products

Patent number: 6393554

Abstract: A multiplier capable of performing signed and unsigned scalar and vector multiplication is disclosed. The multiplier is configured to receive signed or unsigned multiplier and multiplicand operands in scalar or packed vector form. An effective sign for the multiplier and multiplicand operands may be calculated based upon each operand's most significant bit and a control signal. The effective signs may then be used to create and select a number of partial products according to Booth's algorithm. Once the partial products have been created and selected, they may be summed and the results may be output. The results may be signed or unsigned, and may represent vector or scalar quantities. When a vector multiplication is performed, the multiplier may be configured to generate and select partial products so as to effectively isolate the multiplication process for each pair of vector components.

Type: Grant

Filed: January 19, 2000

Date of Patent: May 21, 2002

Assignee: Advanced Micro Devices, Inc.

Inventors: Stuart F. Oberman, Ming Siu, Ravi Krishna Cherukuri
Method for performing multiply-add operations on packed data

Patent number: 6385634

Abstract: A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least two of the data elements in this third packed data store the result of performing multiply-add operations on data elements in the first and second packed data.

Type: Grant

Filed: August 31, 1995

Date of Patent: May 7, 2002

Assignee: Intel Corporation

Inventors: Alexander D. Peleg, Millind Mittal, Larry M. Mennemeier, Benny Eitan, Carole Dulong, Eiichi Kowashi, Wolf Witt
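The multiply-add operation described in the abstract of patent 6385634 above can be illustrated with a minimal Python sketch. The abstract does not state which products are added together or how wide the elements are, so pairing adjacent products is an assumption made here for illustration, and the function name packed_multiply_add is hypothetical.

```python
def packed_multiply_add(first, second):
    """Multiply corresponding elements, then add adjacent products.

    Assumption (not stated in the abstract): products are summed in
    adjacent pairs, so N source elements yield N/2 result elements.
    """
    if len(first) != len(second) or len(first) % 2:
        raise ValueError("operands need the same even number of elements")
    # Element-wise products of the two packed operands.
    products = [x * y for x, y in zip(first, second)]
    # Each result element holds the sum of one adjacent pair of products.
    return [products[i] + products[i + 1] for i in range(0, len(products), 2)]


third = packed_multiply_add([1, 2, 3, 4], [5, 6, 7, 8])
print(third)  # [17, 53]
```

Each element of the result combines a multiply step and an add step, which is the sense in which a single instruction performs several multiply-add operations at once.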
Method and apparatus for processing interruptible, multi-cycle instructions

Patent number: 6378022

Abstract: A method and apparatus for processing multi-cycle instructions. In one embodiment, the method includes beginning execution of a multi-cycle instruction by the processing device, and, during execution of the multi-cycle instruction, comparing a threshold value to a count value that indicates a number of remaining cycles before completion of the multi-cycle instruction. In one embodiment, the apparatus includes a processing unit. The processing unit includes an interrupt control module that has an interrupt request signal input and a second input to receive a multi-cycle instruction interrupt signal. The multi-cycle instruction interrupt signal is to indicate an interruptible interval when an interrupt of a multi-cycle instruction is permitted prior to completion of the multi-cycle instruction.

Type: Grant

Filed: June 17, 1999

Date of Patent: April 23, 2002

Assignee: Motorola, Inc.

Inventors: William C. Moyer, Jeffrey W. Scott, Michael D. Fitzsimmons
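The threshold comparison in the abstract of patent 6378022 above can be sketched as a small simulation. The abstract does not say in which direction the comparison permits an interrupt, so this sketch assumes an interrupt is permitted only while more than `threshold` cycles remain. The function names and the cycle-by-cycle model are hypothetical.

```python
def interrupt_permitted(remaining_cycles, threshold):
    # Assumption: interrupting only pays off when the instruction still
    # has more than `threshold` cycles left; otherwise let it finish.
    return remaining_cycles > threshold


def run_multicycle(total_cycles, threshold, interrupt_at=None):
    """Simulate one multi-cycle instruction.

    interrupt_at is the cycle at which an interrupt request arrives
    (None for no request). Returns (outcome, cycles_executed).
    """
    remaining = total_cycles
    cycle = 0
    while remaining > 0:
        pending = interrupt_at is not None and cycle >= interrupt_at
        if pending and interrupt_permitted(remaining, threshold):
            return ("interrupted", cycle)
        remaining -= 1
        cycle += 1
    return ("completed", cycle)


print(run_multicycle(10, 3, interrupt_at=2))  # ('interrupted', 2)
print(run_multicycle(10, 3, interrupt_at=8))  # ('completed', 10)
```

In the second call the request arrives with only two cycles left, so under the stated assumption the instruction runs to completion before the interrupt is serviced.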
Processor with conditional execution of every instruction

Patent number: 6374346

Abstract: A general purpose microprocessor architecture enabling more efficient computations of a type in which Boolean operations and arithmetic operations conditioned on the results of the Boolean operations are interleaved. The microprocessor is provided with a plurality of general purpose registers (“GPRs” 102) and an arithmetic logic unit (“ALU” 104), capable of performing arithmetic operations and Boolean operations. The ALU has a first input (108) and a second input (110), and an output (112), the first and second inputs receiving values stored in the GPRs. The output stores the results of the arithmetic logic unit operations in the GPRs. At least one of the GPRs is capable of receiving directly from the ALU a result of a Boolean operation. In one embodiment, at least one of the GPRs (PN) capable of receiving directly from the ALU a result of a Boolean operation is configured so as to cause the conditioning of an arithmetic operation of the ALU based on the value stored in the GPR.

Type: Grant

Filed: January 23, 1998

Date of Patent: April 16, 2002

Assignee: Texas Instruments Incorporated

Inventors: Natarajan Seshan, Laurence R. Simar, Jr., Reid E. Tatge, Alan L. Davis
Single instruction multiple data processing

Publication number: 20020040427

Abstract: A data processing system is provided with an instruction (PKH) that combines a packing operation of respective portions of input operand data words (Rn, Rm) into an output data word (Rd) together with the ability to select one of the portions to be combined from a variable position (k) within its respective input operand data word in a manner that allows additional processing to be carried out together with the packing operation. The instruction conveniently combines either the top or bottom half of one of the input operand data words with a half data word portion selected from a variable position within the other input operand data word.

Type: Application

Filed: September 24, 2001

Publication date: April 4, 2002

Inventor: Dominic Hugo Symes
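The packing operation in the abstract of publication 20020040427 above can be sketched in Python. This is a minimal model assuming 32-bit data words, 16-bit halves, and logical shifts. The function name pack_halfwords and the keep_top flag are hypothetical and are not taken from the application.

```python
MASK32 = 0xFFFFFFFF


def pack_halfwords(rn, rm, k, keep_top):
    """Combine one half of rn with a half word taken from rm at offset k.

    keep_top=False: keep the bottom half of rn; the top half of the
                    result comes from rm shifted left by k.
    keep_top=True:  keep the top half of rn; the bottom half of the
                    result comes from rm shifted right by k
                    (a logical shift is assumed here).
    """
    rn &= MASK32
    rm &= MASK32
    if keep_top:
        return (rn & 0xFFFF0000) | ((rm >> k) & 0x0000FFFF)
    return (rn & 0x0000FFFF) | ((rm << k) & 0xFFFF0000)


print(hex(pack_halfwords(0x11112222, 0x33334444, 0, keep_top=False)))   # 0x33332222
print(hex(pack_halfwords(0x11112222, 0x33334444, 16, keep_top=False)))  # 0x44442222
print(hex(pack_halfwords(0x11112222, 0x33334444, 16, keep_top=True)))   # 0x11113333
```

Because k is variable, the shift can fold in extra work such as scaling or half-word selection, which is the "additional processing" the abstract refers to.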
Digital signal processor having enhanced utilization of multiply accumulate (MAC) stage and method

Patent number: 6367003

Abstract: A digital signal processor (DSP) architecture which allows the DSP Multiply-Accumulator (MAC) to be used for special fixed functions during those times when the programmable portions of the DSP are not using the MAC circuitry. During the idle times, the DSP processor gives control of the MAC to the fixed function circuit. The fixed functions provided by the fixed function circuit can include digital filters, including a Finite Impulse Response (FIR) filter, an Infinite Impulse Response (IIR) filter, or an oversampling filter associated with a sigma-delta converter. The DSP may, under program control, set up specific parameters for the fixed function, provide parameters to the fixed function parameter memory, or obtain results from the fixed function. Parameters for the fixed function circuit include the type of filter, the number of taps and the filter coefficients. For a decimation filter, the fixed function parameters can also include the decimation factor.

Type: Grant

Filed: August 14, 2000

Date of Patent: April 2, 2002

Assignee: Micron Technology, Inc.

Inventor: Henry A. Davis
Processing architecture having a matrix-transpose capability

Publication number: 20020032710

Abstract: According to the invention, a matrix of elements is processed in a processor. A first subset of matrix elements is loaded from a first location and a second subset of matrix elements is loaded from a second location. A third subset of matrix elements is stored in a first destination and a fourth subset of matrix elements is stored in a second destination. The loading and storing steps result from the same instruction issue.

Type: Application

Filed: March 8, 2001

Publication date: March 14, 2002

Inventors: Ashley Saulsbury, Daniel S. Rice, Michael W. Parkin, Nyles Nettleton
Method for conserving memory storage during an interpolation operation

Patent number: 6351757

Abstract: A method for conserving memory storage during the execution of an interpolation operation uses a pool of interpolation commands. When an interpolation operation is requested, it is performed using one of the pooled interpolation commands. If none of the pooled interpolation commands are available, the interpolation command having the smallest difference between its interpolated value and final value is selected, and its interpolated value converted to its final value. The selected interpolation command is then reassigned to interpolate the requested data.

Type: Grant

Filed: November 13, 1998

Date of Patent: February 26, 2002

Assignee: Creative Technology Ltd.

Inventors: Eric W. Lange, Stephen Hoge
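The pooling policy in the abstract of patent 6351757 above can be sketched as follows. The class names, the fields, and the values dictionary that records snapped results are hypothetical. Only the selection rule (reclaim the command whose interpolated value is closest to its final value) comes from the abstract.

```python
from dataclasses import dataclass


@dataclass
class InterpolationCommand:
    target: str = ""      # name of the parameter being interpolated
    current: float = 0.0  # interpolated value so far
    final: float = 0.0    # value the interpolation is heading toward
    active: bool = False


class InterpolationPool:
    def __init__(self, size):
        self.commands = [InterpolationCommand() for _ in range(size)]
        self.values = {}  # parameters that were snapped to their final value

    def request(self, target, start, final):
        # Prefer a free command from the pool.
        cmd = next((c for c in self.commands if not c.active), None)
        if cmd is None:
            # Pool exhausted: reclaim the command closest to finishing
            # and convert its interpolated value to its final value.
            cmd = min(self.commands, key=lambda c: abs(c.final - c.current))
            self.values[cmd.target] = cmd.final
        cmd.target, cmd.current, cmd.final, cmd.active = target, start, final, True
        return cmd


pool = InterpolationPool(size=2)
pool.request("volume", 0.0, 10.0)
pool.request("pan", 4.5, 5.0)
pool.request("pitch", 0.0, 1.0)  # reclaims "pan", which was nearest its final value
print(pool.values)               # {'pan': 5.0}
```

Reclaiming the nearly finished command keeps the pool at a fixed size while introducing the smallest jump in any interpolated value.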
Parallel processing instructions routed through plural differing capacity units of operand address generators coupled to multi-ported memory and ALUs

Patent number: 6341343

Abstract: Three parallel instruction processing pipelines of a microprocessor share two data memory ports for obtaining operands and writing back results. Since a significant proportion of the instructions of a typical computer program do not require reading operands from the memory, the probability is high that at least one of any three program instructions to be executed at the same time need not fetch an operand from memory. The two memory ports are thus connected at any given time with the two of the three pipelines which are processing instructions that require memory access, the pipeline without access to the memory processing an instruction that does not need it. To do so, the added third pipeline need not have all the same resources as the other two pipelines, so its stages are made to have a reduced capability in order to save space and reduce power consumption.

Type: Grant

Filed: April 26, 2001

Date of Patent: January 22, 2002

Assignee: Rise Technology Company

Inventor: Kenneth K. Munson
Data processing system and method for performing an arithmetic operation on a plurality of signed data values

Patent number: 6338135

Abstract: Disclosed is a data processing system and a method for performing an arithmetic operation on a plurality of signed data values. In the data processing system and the method, there is a first step in which two or more signed data values are encoded into a composite value and an arithmetic operation is applied to the composite value to produce an encoded result. The encoded result can then be decoded to produce final results where each final result represents the application of the arithmetic operation to a corresponding signed data value. Thus, by using the encoded composite value, a single arithmetic operation can be applied simultaneously to multiple data values and the result then decoded. The decoded result represents the result of applying the arithmetic operation to each data value separately. The advantage of this operation is that operations can be formed on multiple data values without requiring the provision of dedicated hardware or new instructions as required by the prior art.

Type: Grant

Filed: November 20, 1998

Date of Patent: January 8, 2002

Assignee: Arm Limited

Inventor: Wilco Dijkstra
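The encode, operate, decode sequence in the abstract of patent 6338135 above can be illustrated with a minimal Python sketch. The 16-bit field width, the two-value layout, and the function names are assumptions for illustration. The sketch also assumes every per-field result still fits in a signed 16-bit range.

```python
FIELD_BITS = 16  # assumed width of each packed signed value


def encode(hi, lo):
    """Pack two signed values into one composite integer."""
    return (hi << FIELD_BITS) + lo


def decode(composite):
    """Recover the two signed values from a composite."""
    lo = composite & ((1 << FIELD_BITS) - 1)
    if lo >= 1 << (FIELD_BITS - 1):
        lo -= 1 << FIELD_BITS  # sign-extend the low field
    # Removing the signed low field first undoes any borrow it caused.
    hi = (composite - lo) >> FIELD_BITS
    return hi, lo


# One addition on the composites adds both pairs of values at once.
total = encode(-3, 5) + encode(7, -2)
print(decode(total))  # (4, 3)

# One multiplication by a scalar scales both values at once.
print(decode(encode(-3, 5) * 3))  # (-9, 15)
```

The decode step compensates for the borrow a negative low field takes from the high field, which is why signed values survive an ordinary integer operation.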
Pairing of load-ALU-store with conditional branch

Patent number: 6338136

Abstract: An apparatus and method are provided for executing a compare-and-jump operation in a pipeline microprocessor. Typically, the compare-and-jump operation is specified by two micro instructions. The first micro instruction, an ALU micro instruction, directs the microprocessor to perform an ALU operation, resulting in update of a flags register. The second micro instruction, a conditional jump micro instruction, directs the microprocessor to examine the flags register and to branch program control to a target address if a prescribed condition is met. The apparatus has a jump combiner that detects the ALU micro instruction and the conditional jump micro instruction in a micro instruction queue. The jump combiner indicates the prescribed condition for the conditional branch in a field of the ALU micro instruction, and then deletes the conditional jump micro instruction from the queue. The apparatus also has execution logic that performs the ALU operation, generates the result, and updates the flags register.

Type: Grant

Filed: May 18, 1999

Date of Patent: January 8, 2002

Assignee: IP-First, LLC

Inventors: Gerard M. Col, Rodney E. Hooker
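The jump combiner in the abstract of patent 6338136 above can be modeled as a pass over a micro instruction queue. The dictionary layout, the kind names "alu" and "jcc", and the jump_target field are hypothetical. The abstract only states that the condition is recorded in a field of the ALU micro instruction and that the conditional jump is deleted from the queue.

```python
def combine_jumps(queue):
    """Fuse each ALU micro instruction with the conditional jump that follows it."""
    combined = []
    i = 0
    while i < len(queue):
        op = dict(queue[i])  # copy so the input queue is left untouched
        nxt = queue[i + 1] if i + 1 < len(queue) else None
        if op["kind"] == "alu" and nxt is not None and nxt["kind"] == "jcc":
            # Record the branch condition (and, by assumption, the target)
            # in the ALU micro instruction, then drop the jump.
            op["jump_condition"] = nxt["condition"]
            op["jump_target"] = nxt["target"]
            i += 1
        combined.append(op)
        i += 1
    return combined


queue = [
    {"kind": "alu", "op": "sub"},
    {"kind": "jcc", "condition": "zero", "target": 0x40},
    {"kind": "load"},
]
for micro_op in combine_jumps(queue):
    print(micro_op)
```

After the pass, the queue holds one fewer micro instruction, and the execution logic can evaluate the branch condition as part of the ALU operation.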
System and method for executing store instructions

Patent number: 6336183

Abstract: In a processor, store instructions are divided or cracked into store data and store address generation portions for separate and parallel execution within two execution units. The address generation portion of the store instruction is executed within the load store unit, while the store data portion of the instruction is executed in an execution unit other than the load store unit. If the store instruction is a fixed point store instruction, then the store data portion is executed within the fixed point unit. If the store instruction is a floating point store instruction, then the store data portion of the store instruction is executed within the floating point unit.

Type: Grant

Filed: February 26, 1999

Date of Patent: January 1, 2002

Assignee: International Business Machines Corporation

Inventors: Hung Qui Le, Robert Greg McDonald, David James Shippy, Larry Edward Thatcher
Method and apparatus for handling partial register accesses

Patent number: 6334183

Abstract: The present invention includes a partial register write handler. The write handler receives either two or three operands. An execution unit operates on portions of two operands, rather than on full operands. The result of the execution unit has fewer bits than an “additional” operand, which may be any of the two or three operands received by the write handler. An output multiplexer receives all of the bits of an execution unit result and selected bits of the additional operand, and produces an output that has as many bits as the additional operand. If the output of the multiplexer is a string of bits, the string of bits contains the execution unit result as a substring of bits. The remaining bits of the output of the multiplexer are selected from the additional operand.

Type: Grant

Filed: November 18, 1998

Date of Patent: December 25, 2001

Assignee: Intrinsity, Inc.

Inventors: James S. Blomgren, Anthony M. Petro
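The output multiplexer in the abstract of patent 6334183 above can be sketched as a bit merge. The function name, the offset parameter, and the example widths are hypothetical. The sketch only shows the result appearing as a substring of bits, with the remaining bits taken from the additional operand.

```python
def merge_partial_result(additional, result, result_bits, offset=0):
    """Place a narrow execution result inside a wider operand.

    The output has as many bits as `additional`: `result_bits` bits at
    `offset` come from `result`, and every other bit is copied from
    `additional`.
    """
    mask = ((1 << result_bits) - 1) << offset
    return (additional & ~mask) | ((result << offset) & mask)


# Write an 8-bit result into the low byte of a 32-bit register value.
print(hex(merge_partial_result(0x12345678, 0xAB, 8)))            # 0x123456ab
# Write the same result into the second byte instead.
print(hex(merge_partial_result(0x12345678, 0xAB, 8, offset=8)))  # 0x1234ab78
```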
Encoder, method thereof and graphic processing apparatus

Patent number: 6329999

Abstract: An encoder capable of making a processing time shorter, wherein the position of a first “1” bit seen from the MSB of digital data is output as a first bit encoded data and the second “1” bit is output as the second bit encoded data. A predetermined calculation is performed in parallel on the upper 8 bits of the digital data in the valid detector, the priority encoder, and the first valid bit mask unit, while a predetermined calculation is performed in parallel on the lower 8 bits in another priority encoder and another first valid bit mask unit.

Type: Grant

Filed: April 1, 1999

Date of Patent: December 11, 2001

Assignee: Sony Corporation

Inventors: Tatsumi Mitsushita, Katsuya Kita
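The encoder output described in the abstract of patent 6329999 above can be modeled behaviorally. The patent performs the work in parallel on the upper and lower 8 bits, whereas this sketch is a simple sequential scan that yields the same kind of result. Numbering positions from 0 at the MSB, the 16-bit default width, and the function name are assumptions.

```python
def leading_one_positions(value, width=16):
    """Return the positions of the first and second 1 bits seen from the MSB.

    Position 0 is the MSB. A missing bit is reported as None.
    """
    positions = []
    for pos in range(width):
        if value & (1 << (width - 1 - pos)):
            positions.append(pos)
            if len(positions) == 2:
                break
    first = positions[0] if positions else None
    second = positions[1] if len(positions) > 1 else None
    return first, second


print(leading_one_positions(0x2400))  # (2, 5)
print(leading_one_positions(0x0001))  # (15, None)
```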
Program executing apparatus and program converting method

Patent number: 6324641

Abstract: To simplify the process relative to an instruction array including an instruction for a process with flag handling executed by a compiler when converting a high-level program into a format executable by a program executing apparatus, a number of operating circuits, namely, an ALU circuit and an AND operation circuit, are provided to operate in parallel to handle different flags in a flag group based on the results of respective operations. A value comparison instruction and a bit test instruction are converted into common operation process instructions, and branch instructions, dependent on the result of the execution of the operation process instructions, are prepared so as to detect different flag patterns. The common use of an operation instruction for a number of flag-handling instructions simplifies a compiler judgement.

Type: Grant

Filed: September 18, 1998

Date of Patent: November 27, 2001

Assignee: Sanyo Electric Co., Ltd.

Inventors: Kenshi Matsumoto, Yasuhito Koumura, Hiroki Miura
Microprocessor comprising bit concatenation means

Patent number: 6317825

Abstract: The invention relates to a microprocessor (MP) comprising means to decode (DEC1) a compact instruction (BMV) for the concatenation of at least one bit (bi) of a first binary word (W1) with at least one bit of a second binary word (W2), and means (REGBANK, MUX, BSHIFT) to process this instruction in one clock cycle. Advantages: fast processing of a concatenation operation. Application especially to chip cards.

Type: Grant

Filed: May 3, 2000

Date of Patent: November 13, 2001

Assignee: Inside Technologies

Inventor: Sean Commercial
Information processing apparatus with parallel accumulation capability

Publication number: 20010037427

Abstract: An initial value of read address is set in a first initial address register; an initial value of write address is set in a second initial address register; and the number of data to be accumulated by an accumulator and the frequency of repetition of accumulation are set in an accumulator count register. A controller controls the timing of output of an initial read address from a first memory controller, the timing of initialization by an initializer, and the timing of output of an initial write address from a second memory controller. Reading of data, accumulation and writing of data proceed in parallel in each cycle of accumulation.

Type: Application

Filed: March 12, 2001

Publication date: November 1, 2001

Inventors: Yoshihiro Ogawa, Toshihisa Kamemaru, Hirokazu Suzuki
System and method for performing a MOVHPS-MOVLPS instruction

Patent number: 6307553

Abstract: An apparatus and method for performing a MOVHPS-MOVLPS operation on packed data using computer-implemented steps is described. In one embodiment, a first packed data operand having a pair of data elements is accessed. A second packed data operand having two pairs of data elements is then accessed. One of the two pairs of data elements in the second packed data operand is replaced with the pair of data elements in the first packed data operand.

Type: Grant

Filed: March 31, 1998

Date of Patent: October 23, 2001

Inventor: Mohammad Abdallah
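The pair replacement in the abstract of patent 6307553 above can be sketched with Python lists standing in for packed operands. Treating index 0 as the lowest element and using a boolean to pick the high or low pair are assumptions for illustration. The function name move_pair is hypothetical.

```python
def move_pair(second_operand, first_operand, high):
    """Replace one pair of elements in a four-element packed operand.

    second_operand: four data elements (two pairs).
    first_operand:  the pair of data elements to move in.
    high:           True replaces the upper pair, False the lower pair.
    """
    if len(second_operand) != 4 or len(first_operand) != 2:
        raise ValueError("expected a four-element and a two-element operand")
    result = list(second_operand)
    start = 2 if high else 0
    result[start:start + 2] = first_operand
    return result


print(move_pair([1.0, 2.0, 3.0, 4.0], [9.0, 8.0], high=True))   # [1.0, 2.0, 9.0, 8.0]
print(move_pair([1.0, 2.0, 3.0, 4.0], [9.0, 8.0], high=False))  # [9.0, 8.0, 3.0, 4.0]
```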
Using two barrel shifters to implement shift, rotate, rotate with carry, and shift double as specified by the X86 architecture

Patent number: 6304956

Abstract: A novel method and apparatus of performing data bit moving functions on a data word using two barrel shifters: a left shifter and a right shifter. The present invention is able to handle both shift and rotate functions using one shifter unit. Specifically, for shift functions, only one of the two shifters is used to perform the shifting function. On the other hand, for rotate functions, both shifters are needed for shifting the data word. The amounts of the right shift and left shift depend on the number defined by the count operand and the specific shift/rotate instruction requested.

Type: Grant

Filed: March 25, 1999

Date of Patent: October 16, 2001

Assignee: Rise Technology Company

Inventor: Dzung X. Tran
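The division of labor in the abstract of patent 6304956 above, where a shift uses one shifter and a rotate uses both, can be sketched in Python. The 32-bit width and the function names are assumptions. Rotate with carry and shift double are not modeled here.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1


def left_shifter(x, n):
    """Model of the left barrel shifter: bits shifted out are discarded."""
    return (x << n) & MASK


def right_shifter(x, n):
    """Model of the right barrel shifter: zeros are shifted in."""
    return (x & MASK) >> n


def rotate_left(x, count):
    # A rotate needs both shifters: the left shifter moves the word up by n,
    # and the right shifter brings the bits that fell off back in at the bottom.
    n = count % WIDTH
    return left_shifter(x, n) | right_shifter(x, WIDTH - n)


def rotate_right(x, count):
    n = count % WIDTH
    return right_shifter(x, n) | left_shifter(x, WIDTH - n)


print(hex(left_shifter(0x80000001, 1)))  # 0x2 (plain shift: one shifter)
print(hex(rotate_left(0x80000001, 1)))   # 0x3 (rotate: both shifters)
print(hex(rotate_right(0x80000001, 1)))  # 0xc0000000
```

The two shift amounts always sum to the word width for a rotate, which matches the abstract's remark that the left and right amounts depend on the count operand and on the instruction requested.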
Instruction cache alignment mechanism for branch targets based on predicted execution frequencies

Patent number: 6301652

Abstract: A compiler system and method is provided that can 1) generate a second instruction stream from a first instruction stream, 2) read in and process predetermined external information regarding the basic blocks that make up the second instruction stream and 3) place certain of the basic blocks on cache line boundaries based on predicted execution frequencies. In particular, the compiler system and method utilize profile information containing predicted block execution or edge-weight execution frequencies to determine which of the basic blocks to align on cache line boundaries. One method for obtaining profile information includes precompiling the source code, creating an executable program, executing the program with test inputs, and outputting a profile containing execution frequency information. Once the profile information is obtained, the source code can then be recompiled using the profile information. The compiler can then selectively cache align those blocks identified as important.

Type: Grant

Filed: January 31, 1996

Date of Patent: October 9, 2001

Assignee: International Business Machines Corporation

Inventors: Edward Curtis Prosser, Robert Ralph Roediger, William Jon Schmidt
