Multiplication Followed By Addition (i.e., X*y+z) Patents (Class 708/523)

Multipurpose arithmetic functional unit

Patent number: 7640285

Abstract: Multipurpose arithmetic functional units can perform planar attribute interpolation and unary function approximation operations. In one embodiment, planar interpolation operations for coordinates (x, y) are executed by computing A*x+B*y+C, and unary function approximation operations for operand x are executed by computing F2(xb)*xh^2+F1(xb)*xh+F0(xb), where xh=x-xb. Shared multiplier and adder circuits are advantageously used to implement the product and sum operations for both classes of operations.

Type: Grant

Filed: October 20, 2004

Date of Patent: December 29, 2009

Assignee: NVIDIA Corporation

Inventors: Stuart F. Oberman, Ming Y. Siu
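The two expressions in the abstract above can be sketched in software. This is only an illustration: the table size `TABLE_BITS`, the input range [1, 2), and the use of Taylor coefficients for F0, F1, F2 are assumptions made here, not details taken from the patent.

```python
TABLE_BITS = 6  # hypothetical: leading fraction bits that select a table entry
STEP = 1.0 / (1 << TABLE_BITS)


def build_table(f, df, d2f):
    """Store (F0, F1, F2) at each baseline point xb in [1, 2).

    Assumption: Taylor coefficients F0=f(xb), F1=f'(xb), F2=f''(xb)/2.
    """
    table = []
    for i in range(1 << TABLE_BITS):
        xb = 1.0 + i * STEP
        table.append((f(xb), df(xb), d2f(xb) / 2.0))
    return table


def approx(table, x):
    """Evaluate F2(xb)*xh^2 + F1(xb)*xh + F0(xb) for x in [1, 2), xh = x - xb."""
    i = int((x - 1.0) / STEP)   # index of the baseline point at or below x
    xb = 1.0 + i * STEP
    xh = x - xb
    f0, f1, f2 = table[i]
    return (f2 * xh + f1) * xh + f0   # same multiply-then-add pattern twice


def interpolate(a, b, c, x, y):
    """Planar interpolation A*x + B*y + C."""
    return a * x + b * y + c


# Example table for the reciprocal function 1/x.
recip_table = build_table(lambda t: 1 / t, lambda t: -1 / t**2, lambda t: 2 / t**3)
```

Both functions reduce to products followed by sums, which is why a single shared multiplier and adder datapath can plausibly serve both.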
SEARCHING, SORTING, AND DISPLAYING VIDEO CLIPS AND SOUND FILES BY RELEVANCE

Publication number: 20090313236

Abstract: A documents database has a plurality of documents, including but not limited to text files, video clips and sound files. Each document is associated with at least one category of a plurality of categories in a categories database, and each category has at least one keyword. A search request having at least one search term is received from a user, and a categories database is searched for categories having a keyword corresponding to the user search term to identify first level categories. The other keywords from the identified first level categories are retrieved and the documents database is searched for documents having a user search term or a retrieved keyword. The identified documents are then ranked and presented to the user. Other search expansion techniques, and display techniques, are also discussed.

Type: Application

Filed: June 13, 2008

Publication date: December 17, 2009

Applicant: NEWS DISTRIBUTION NETWORK, INC.

Inventors: Paul Matthew Hernacki, Gregory Alton Peters
METHOD AND SYSTEM FOR AVOIDING UNDERFLOW IN A FLOATING-POINT OPERATION

Publication number: 20090292754

Abstract: Methods and systems for detecting underflow in a floating-point operation are disclosed. In accordance with an example disclosed method a plurality of comparator circuits and a plurality of logic devices coupled to the plurality of comparator circuits are operated to determine whether performing a floating-point operation using a floating-point hardware unit will generate an underflow condition. The operating of the plurality of comparator circuits and the logic devices involves inputting a multiply-add operation result value to at least some of the plurality of comparator circuits. In addition, a plurality of logic outputs are outputted via the plurality of logic devices. The plurality of logic outputs are indicative of comparison operations performed by at least some of the comparator circuits based on the multiply-add operation result value. An underflow indicator is outputted based on the plurality of logic outputs.

Type: Application

Filed: July 31, 2009

Publication date: November 26, 2009

Inventor: Marius A. Cornea-Hasegan
Large-factor multiplication in an array of processors

Publication number: 20090292756

Abstract: A processor to calculate a product-component having fewer digits than an entire product of a multiplication of a multiplicand and a multiplier. A memory holds at least one multiplicand-component having fewer digits than the multiplicand and at least one multiplier-component having fewer digits than the multiplier. A logic then calculates the product-component based on the multiplicand-components and the multiplier-components in the memory. Collectively, a plurality of the processors can calculate all of the product-components of the product.

Type: Application

Filed: May 23, 2008

Publication date: November 26, 2009

Inventors: Gibson D. Elliot, Jay Randall Stoner
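One way to picture the product-components described above is schoolbook multiplication split by column. In this sketch each call to `product_component` stands in for the work of one processor. The decimal base, the function names, and the column-wise partition are assumptions made for illustration; the abstract does not specify them.

```python
BASE = 10  # assumption: decimal digits as the components


def to_digits(n):
    """Split a non-negative integer into digits, least significant first."""
    digits = []
    while True:
        n, d = divmod(n, BASE)
        digits.append(d)
        if n == 0:
            break
    return digits


def product_component(a_digits, b_digits, k):
    """Column k of the product: the sum of a_i * b_j over all i + j == k."""
    return sum(
        a_digits[i] * b_digits[k - i]
        for i in range(len(a_digits))
        if 0 <= k - i < len(b_digits)
    )


def multiply(a, b):
    """Combine independently computed columns into the full product."""
    a_digits, b_digits = to_digits(a), to_digits(b)
    n_columns = len(a_digits) + len(b_digits) - 1
    columns = [product_component(a_digits, b_digits, k) for k in range(n_columns)]
    return sum(col * BASE**k for k, col in enumerate(columns))
```

Each column needs only a few digits of each factor, so no single worker has to hold the entire multiplicand, multiplier, or product.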
Method and apparatus for efficient integer transform

Patent number: 7624138

Abstract: A method and apparatus for including in a processor instructions for performing integer transforms including multiply-add operations and horizontal-add operations on packed data. In one embodiment, a processor is coupled to a memory that stores a first packed byte data and a second packed byte data. The processor performs operations on said first packed byte data and said second packed byte data to generate a third packed data in response to receiving a multiply-add instruction. A plurality of the 16-bit data elements in this third packed data store the result of performing multiply-add operations on data elements in the first and second packed byte data. The processor adds together at least a first and a second 16-bit data element of the third packed data in response to receiving a horizontal-add instruction to generate a 16-bit result as one of a plurality of data elements of a fourth packed data.

Type: Grant

Filed: December 30, 2003

Date of Patent: November 24, 2009

Assignee: Intel Corporation

Inventors: Eric Debes, William W. Macy, Jonathan J. Tyler
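The two packed operations named in the abstract above can be modelled on plain Python lists. This is a simplified sketch: the function names are hypothetical, and saturation or wrap-around to 16 bits, which real hardware would apply, is deliberately left out.

```python
def multiply_add(a_bytes, b_bytes):
    """Multiply corresponding elements, then add each adjacent pair of products.

    Two packed inputs of N byte elements yield one packed result of N/2
    wider elements. Assumption: no saturation or overflow handling.
    """
    assert len(a_bytes) == len(b_bytes) and len(a_bytes) % 2 == 0
    products = [a * b for a, b in zip(a_bytes, b_bytes)]
    return [products[i] + products[i + 1] for i in range(0, len(products), 2)]


def horizontal_add(packed):
    """Add each adjacent pair of elements within a single packed operand."""
    assert len(packed) % 2 == 0
    return [packed[i] + packed[i + 1] for i in range(0, len(packed), 2)]
```

Chaining the two, `horizontal_add(multiply_add(a, b))`, accumulates four products per output element, which is the kind of short dot product an integer transform row needs.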
Leading Zero Estimation Modification for Unfused Rounding Catastrophic Cancellation

Publication number: 20090287757

Abstract: Modifying a leading zero estimation during an unfused multiply add operation of (A*B)+C. A plurality of terms x and y may be received, and each may be based on truncated terms s and t (e.g., in performing the unfused multiply add operation) and the shifted C term. A first leading zero estimation may be calculated based on the terms x and y. It may be determined if near total catastrophic cancellation has occurred. A carry in from a right most number of bits of the terms s and t and the most significant truncated bits of s and t may be used to generate a second leading zero estimation based on the first leading zero estimation if the near total catastrophic cancellation has occurred.

Type: Application

Filed: May 15, 2008

Publication date: November 19, 2009

Inventor: Leonard D. Rarick
Multiply and accumulate digital filter operations

Publication number: 20090248769

Abstract: A multiply and accumulate engine may implement a digital filter. In some embodiments, the number of coefficients that are stored may be equal to only half of the number of filter taps that are implemented. This may be done by doing multiplications operand by operand within two data registers in a first direction and then shifting directions so that the first operand in a first register is multiplied by the last operand in another register. In some embodiments, the multiply and accumulate engine may be implemented as a two cycle engine wherein in the first stage, multiply and accumulate operations are implemented and then stored into a register. In a second stage and a second cycle, the results stored in the register are further accumulated.

Type: Application

Filed: March 26, 2008

Publication date: October 1, 2009

Inventor: Teck-Kuen Chua
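Storing half as many coefficients as taps works when the filter is symmetric. The sketch below assumes an even tap count with h[i] == h[N-1-i] and compares a full multiply-accumulate against a two-pass version that reuses the stored half in the reverse direction. The function names are illustrative and do not come from the application.

```python
def fir_full(coeffs, window):
    """Reference: one multiply-accumulate per tap with all coefficients stored."""
    return sum(c * x for c, x in zip(coeffs, window))


def fir_half(half_coeffs, window):
    """Symmetric FIR output using only the first half of the coefficients.

    Assumption: len(window) == 2 * len(half_coeffs) and the full coefficient
    set is the stored half followed by its mirror image.
    """
    n = len(window)
    assert n == 2 * len(half_coeffs)
    acc = 0
    # First direction: coefficient i pairs with sample i.
    for i, c in enumerate(half_coeffs):
        acc += c * window[i]
    # Reversed direction: coefficient i pairs with sample n-1-i.
    for i, c in enumerate(half_coeffs):
        acc += c * window[n - 1 - i]
    return acc
```

For example, with stored coefficients `[1, 2, 3]` and the window `[4, 5, 6, 7, 8, 9]`, both functions return the same sum as the six-tap filter `[1, 2, 3, 3, 2, 1]`.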
Processor which Implements Fused and Unfused Multiply-Add Instructions in a Pipelined Manner

Publication number: 20090248779

Abstract: Implementing an unfused multiply-add instruction within a fused multiply-add pipeline. The system may include an aligner having an input for receiving an addition term, a multiplier tree having two inputs for receiving a first value and a second value for multiplication, and a first carry save adder (CSA), wherein the first CSA may receive partial products from the multiplier tree and an aligned addition term from the aligner. The system may include a fused/unfused multiply add (FUMA) block which may receive the first partial product, the second partial product, and the aligned addition term, wherein the first partial product and the second partial product are not truncated. The FUMA block may perform an unfused multiply add operation or a fused multiply add operation using the first partial product, the second partial product, and the aligned addition term, e.g., depending on an opcode or mode bit.

Type: Application

Filed: March 28, 2008

Publication date: October 1, 2009

Inventors: Jeffrey S. Brooks, Christopher H. Olson
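The numerical difference between the two modes is that an unfused multiply-add rounds the product before the addition, while a fused one rounds only once at the end. The sketch below emulates the fused result with exact rational arithmetic. It is a software model for finite inputs only, not a description of the pipeline in the application.

```python
from fractions import Fraction


def unfused_multiply_add(a, b, c):
    """Round the product a*b to double precision, then round the sum."""
    return a * b + c


def fused_multiply_add(a, b, c):
    """Compute a*b + c exactly, then round once to double precision.

    Assumption: a, b, c are finite floats (Fraction rejects inf and nan).
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# A case where the two modes disagree: the exact product is 1 - 2**-54,
# which rounds to 1.0 when the product is rounded on its own.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0
```

With these inputs the unfused result is `0.0`, while the fused result keeps the small residual `-2**-54`. Cancellation of this kind is why the intermediate product width matters.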
Device for synthesis of a composite digital signal with explicit control of the first three moments thereof

Patent number: 7596472

Abstract: The device determines the weighting coefficients to be applied to N digital source signals to form a composite signal. The first- to third-order moments of the composite signal must respectively present mean value, variance and skewness characteristics predefined by a user. The device introduces an additional variable, in the form of a weighting matrix $W$. The vector $w$ being the vector of the weighting coefficients and $w^T$ the transpose of the vector $w$, the difference $W - ww^T$ is a positive semidefinite matrix. Moreover, the device performs linearization, around a vector $w_{\mathrm{ref}}$ of reference weighting coefficients, of the skewness constraint on the third-order moments using a matrix $A = \begin{bmatrix} W & w \\ w^T & 1 \end{bmatrix}$ as further intermediate variable.

Type: Grant

Filed: December 19, 2006

Date of Patent: September 29, 2009

Assignee: Prax Value

Inventor: Francois Oustry
Applications of cascading DSP slices

Patent number: 7567997

Abstract: In one embodiment an IC is disclosed which includes a plurality of cascaded digital signal processing slices, wherein each slice has a multiplier coupled to an adder via a multiplexer and each slice has a direct connection to an adjoining slice; and means for configuring the plurality of digital signal processing slices to perform one or more mathematical operations, via, for example, opmodes. This IC allows for the implementation of some basic math functions, such as add, subtract, multiply and divide. Many other applications may be implemented using the one or more DSP slices, for example, accumulate, multiply accumulate (MACC), a wide multiplexer, barrel shifter, counter, and folded, decimating, and interpolating FIRs to name a few.

Type: Grant

Filed: December 21, 2004

Date of Patent: July 28, 2009

Assignee: XILINX, Inc.

Inventors: James M. Simkins, Steven P. Young, Jennifer Wong, Bernard J. New, Alvin Y. Ching
Method for Estimating Software Development Effort

Publication number: 20090177447

Abstract: A method for estimating software development effort comprises the steps of: generating a database containing a plurality of source softwares; calculating the Grey relational coefficients between the software to be developed and a source software in the database for each feature they exhibit; calculating the weights for each Grey relational coefficient; multiplying each Grey relational coefficient with the corresponding weight; calculating the Grey relational grade by summing up the products produced in the multiplying step; calculating the Grey relational grades for all remaining source softwares in the database; and comparing the Grey relational grades to estimate the effort for developing the software to be developed.

Type: Application

Filed: January 4, 2008

Publication date: July 9, 2009

Applicant: NATIONAL TSING HUA UNIVERSITY

Inventors: Chao Jung Hsu, Chin Yu Huang
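The multiply-then-sum steps of the method above amount to a weighted sum per source project, followed by a comparison of the resulting grades. The sketch below takes the Grey relational coefficients and weights as given, since the abstract does not state how they are computed. The function names and the "highest grade wins" rule are assumptions for illustration.

```python
def grey_relational_grade(coefficients, weights):
    """Multiply each coefficient by its weight and sum the products."""
    assert len(coefficients) == len(weights)
    return sum(w * c for w, c in zip(weights, coefficients))


def most_similar(sources, weights):
    """Return the name of the source project with the highest grade.

    `sources` maps a project name to its per-feature coefficients.
    Assumption: a higher grade means a closer match.
    """
    return max(sources, key=lambda name: grey_relational_grade(sources[name], weights))
```

The known effort of the best-matching source projects could then serve as the basis for the estimate.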
RECONFIGURABLE ARITHMETIC UNIT AND HIGH-EFFICIENCY PROCESSOR HAVING THE SAME

Publication number: 20090150471

Abstract: Provided are a reconfigurable arithmetic unit and a processor having the same. The reconfigurable arithmetic unit can perform an addition operation or a multiplication operation according to an instruction by sharing an adder. The reconfigurable arithmetic unit includes a booth encoder for encoding a multiplier, a partial product generator for generating a plurality of partial products using the encoded multiplier and a multiplicand, a Wallace tree circuit for compressing the partial products into a first partial product and a second partial product, a first Multiplexer (MUX) for selecting and outputting one of the first partial product and a first addition input according to a selection signal, a second MUX for selecting and outputting one of the second partial product and a second addition input according to the selection signal, and a Carry Propagation Adder (CPA) for adding an output of the first MUX and an output of the second MUX to output an operation result.

Type: Application

Filed: June 10, 2008

Publication date: June 11, 2009

Inventors: Yil Suk YANG, Jung Hee SUK, Chun Gi LYUH, Tae Moon ROH, Jong Dae KIM
Processor for computing a packed sum of absolute differences and packed multiply-add

Patent number: 7516307

Abstract: A method and apparatus is disclosed that computes multiple absolute differences from packed data and sums each one of the multiple absolute differences together to produce a result. According to one embodiment, a processor includes a decode unit to decode a packed sum of absolute differences (PSAD) instruction having an opcode format to identify a set of packed data operands. The decode unit initiates a sequence of operations on the set of packed data operands in response to decoding the PSAD instruction. An execution unit performs a first operation of the sequence of operations initiated by the decode logic, and a bus provides the execution unit with the set of packed data operands as identified in accordance with the opcode format.

Type: Grant

Filed: November 6, 2001

Date of Patent: April 7, 2009

Assignee: Intel Corporation

Inventors: Mohammad A. Abdallah, Vladimir Pentkovski
Microprocessor

Publication number: 20090077154

Abstract: Provided is a microprocessor including a complex-MAC unit that operates in response to a complex-MAC instruction. The complex-MAC unit receives first and second complex data (each having 2m-bit length) from a first register having a register length of at least 2m+1 bits, and also receives third and fourth complex data (each having 2m-bit length) from a second register having a register length of at least 2m+1 bits, to calculate a sum of real parts or imaginary parts of a complex product of the first and third complex data and a complex product of the second and fourth complex data. The complex-MAC unit adds the obtained sum of the real parts or imaginary parts to a stored value of the third register, and overwrites the third register with the cumulative total value. The third register has a register length of at least 2m+2 bits.

Type: Application

Filed: September 10, 2008

Publication date: March 19, 2009

Applicant: NEC ELECTRONICS CORPORATION

Inventors: Hideki Matsuyama, Masayuki Daitou
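The arithmetic performed by the complex-MAC instruction can be written out directly. In this sketch complex values are (real, imaginary) integer pairs and the accumulator is a plain integer. Register widths and packing are omitted, and the function names are hypothetical.

```python
def complex_mul(p, q):
    """Product of two complex numbers given as (real, imag) integer pairs."""
    (pr, pi), (qr, qi) = p, q
    return (pr * qr - pi * qi, pr * qi + pi * qr)


def complex_mac(acc, z1, z2, z3, z4, imag=False):
    """Add the real (or imaginary) parts of z1*z3 and z2*z4 to the accumulator.

    z1 and z2 model the two values in the first register, z3 and z4 the two
    values in the second register, and the return value is the new content
    of the accumulating third register.
    """
    part = 1 if imag else 0
    return acc + complex_mul(z1, z3)[part] + complex_mul(z2, z4)[part]
```

Calling it once with `imag=False` and once with `imag=True` on separate accumulators yields both parts of a running complex dot product.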
ARITHMETIC PROCESSING SYSTEM AND METHOD THEREOF

Publication number: 20090070399

Abstract: An arithmetic processing system processes a sensing signal and a first approximate offset signal to obtain a second approximate offset signal. The system includes a first arithmetic processor and a second arithmetic processor. The first arithmetic processor receives and processes the sensing signal and the first approximate offset signal to output a first arithmetic signal. The second arithmetic processor processes the first arithmetic signal to output a second arithmetic signal, and the second arithmetic signal is added with a predetermined offset signal to obtain the second approximate offset signal, and the second approximate offset signal is closer to a real offset signal of the sensing signal than the first approximate offset signal. A method of arithmetic processing is also disclosed.

Type: Application

Filed: November 6, 2007

Publication date: March 12, 2009

Applicant: ASIA OPTICAL CO., INC.

Inventors: Kun-Chi Liao, Yu-Ting Lee
MULTIPLICATION CIRCUIT, DIGITAL FILTER, SIGNAL PROCESSING DEVICE, SYNTHESIS DEVICE, SYNTHESIS PROGRAM, AND SYNTHESIS PROGRAM RECORDING MEDIUM

Publication number: 20090030963

Abstract: The conventional two's complement multiplier which is constituted by a Booth encoder, a partial product generation circuit, and an adder has a problem that the circuit scale would be increased because a bit extension is performed when the multiplier is adapted to an unsigned multiplication. A multiplication circuit of the present invention is provided with a first Booth encoder (1) for encoding lower-order several bits of a multiplier according to first rules of encoding using a Booth algorithm, and a second Booth encoder (5) for encoding most-significant several bits of the multiplier according to second rules of encoding using a Booth algorithm, which are different from the first rules of encoding, and thereby the most-significant several bits of the multiplier are encoded using the Booth algorithm which is different from that for the lower-order several bits.

Type: Application

Filed: February 8, 2007

Publication date: January 29, 2009

Inventor: Kouichi Nagano
Arithmetic circuit with multiplexed addend inputs

Patent number: 7480690

Abstract: Described are arithmetic circuits divided logically into a product generator and an adder. Multiplexing circuitry logically disposed between the product generator and the adder supports conventional functionality by providing partial products from the product generator to addend terminals of the adder. The multiplexing circuitry can also be controlled to direct a number of external addend inputs to the adder. The additional addend inputs can include inputs and outputs cascaded from other arithmetic circuits.

Type: Grant

Filed: December 21, 2004

Date of Patent: January 20, 2009

Assignee: XILINX, Inc.

Inventors: James M. Simkins, Steven P. Young, Jennifer Wong, Bernard J. New, Alvin Y. Ching
Programmable logic device with cascading DSP slices

Patent number: 7472155

Abstract: Described is a programmable logic device (PLD) with columns of DSP slices that can be cascaded to create DSP circuits of varying size and complexity. Each DSP slice includes a plurality of operand input ports and a slice output port, all of which are programmably connected to general routing and logic resources. The operand ports receive operands for processing, and a slice output port conveys processed results. Each slice additionally includes a feedback port connected to the respective slice output port, to support accumulate functions in this embodiment, and a cascade input port connected to the output port of an upstream slice to support cascading.

Type: Grant

Filed: December 21, 2004

Date of Patent: December 30, 2008

Assignee: Xilinx, Inc.

Inventors: James M. Simkins, Steven P. Young, Jennifer Wong, Bernard J. New, Alvin Y. Ching
Programmable logic device with pipelined DSP slices

Patent number: 7467175

Abstract: Described is a programmable logic device (PLD) with columns of DSP slices that can be combined to create DSP circuits of varying size and complexity. DSP slices in accordance with some embodiments include programmable operand input registers that can be configured to introduce different amounts of delay, from zero to two clock cycles, for example, to support pipelining. In one such embodiment, each DSP slice includes a partial-product generator having a multiplier port, a multiplicand port, and a product port. The multiplier and multiplicand ports connect to the operand input port via respective first and second operand input registers, each of which is capable of introducing from zero to two clock cycles of delay. In another embodiment, the output of at least one operand input register can connect to the input of an operand input register of a downstream DSP slice so that operands can be transferred among one or more slices.

Type: Grant

Filed: December 21, 2004

Date of Patent: December 16, 2008

Assignee: XILINX, Inc.

Inventors: James M. Simkins, Steven P. Young, Jennifer Wong, Bernard J. New, Alvin Y. Ching
X87 FUSED MULTIPLY-ADD INSTRUCTION

Publication number: 20080256162

Abstract: An x87 fused multiply-add (FMA) instruction in the instruction set of an x86 architecture microprocessor is disclosed. The FMA instruction implicitly specifies the two factor operands as the top two operands of the x87 FPU register stack and explicitly specifies the third addend operand as a third x87 FPU register stack register. The microprocessor multiplies the first two operands and adds the product to the third operand to generate a result. The result is stored into the third register and the first two operands are popped off the stack. In an alternate embodiment, the third operand is also implicitly specified as being stored in the register that is two registers below the top of stack register; the result is also stored therein. The instruction opcode value is in the x87 opcode range.

Type: Application

Filed: July 23, 2007

Publication date: October 16, 2008

Applicant: VIA Technologies, Inc.

Inventors: G. Glenn Henry, Timothy A. Elliott, Terry Parks
Multiplier-accumulator block mode splitting

Patent number: 7437401

Abstract: A programmable logic device is provided that includes a MAC block having mode splitting capabilities. Different modes of operation may be implemented simultaneously whereby the multipliers and other DSP circuitry of the MAC block may be allocated among the different modes of operation. For example, one multiplier may be used to implement a multiply mode while another two multipliers may be used to implement a sum of two multipliers mode.

Type: Grant

Filed: February 20, 2004

Date of Patent: October 14, 2008

Assignee: Altera Corporation

Inventors: Leon Zheng, Martin Langhammer, Steven Perry, Paul Metzgen, Gregory Starr, William Hwang, Kumara Tharmalingam
Multipurpose functional unit with multiply-add and format conversion pipeline

Patent number: 7428566

Abstract: A multipurpose functional unit is configurable to support a number of operations including multiply-add and format conversion operations, as well as other integer and/or floating-point arithmetic operations, Boolean operations, and logical test operations.

Type: Grant

Filed: November 10, 2004

Date of Patent: September 23, 2008

Assignee: Nvidia Corporation

Inventors: Ming Y. Siu, Stuart F. Oberman
Multi-format multiplier unit

Publication number: 20080195685

Abstract: Multiplication engines and multiplication methods are provided. A multiplication engine for a digital processor includes a first multiplier to generate unequally weighted partial products from input operands in a first multiplier mode; a second multiplier to generate equally weighted partial products from input operands in a second multiplier mode; a multiplexer to select the unequally weighted partial products in the first multiplier mode and to select the equally weighted partial products in the second multiplier mode; and a carry save adder array configured to combine the selected partial products in the first multiplier mode and in the second multiplier mode.

Type: Application

Filed: January 10, 2008

Publication date: August 14, 2008

Applicant: Analog Devices, Inc.

Inventors: Andreas D. Olofsson, Baruch Yanovitch
Dual Multiply-Accumulator Operation Optimized for Even and Odd Multisample Calculations

Publication number: 20080189347

Abstract: According to some embodiments, a dual multiply-accumulate operation optimized for even and odd multisample calculations is disclosed.

Type: Application

Filed: March 4, 2008

Publication date: August 7, 2008

Inventors: Bradley C. Aldrich, Nigel C. Paver, William T. Maghielse
SYSTEM AND METHOD FOR IMPLEMENTING A REED SOLOMON MULTIPLICATION SECTION FROM EXCLUSIVE-OR LOGIC

Publication number: 20080155382

Abstract: Various methods and systems for implementing Reed Solomon multiplication sections from exclusive-OR (XOR) logic are disclosed. For example, a system includes a Reed Solomon multiplication section, which includes XOR-based logic. The XOR-based logic includes an input, an output, and one or more XOR gates. A symbol X is received at the input of the XOR-based logic. The one or more XOR gates are coupled to generate a product of a power of α and X at the output, wherein α is a root of a primitive polynomial of a Reed Solomon code. Such a Reed Solomon multiplication section, which can include one or more multipliers implemented using XOR-based logic, can be included in a Reed Solomon encoder or decoder.

Type: Application

Filed: March 11, 2008

Publication date: June 26, 2008

Inventors: Qiujie Dong, Andrew J. Thurston
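Multiplying a symbol by a fixed power of the field's primitive element needs no general multiplier, only shifts and XORs. The sketch below works in GF(2^8) with the primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D). Both the field size and the polynomial are assumptions chosen for illustration; the abstract names neither.

```python
PRIMITIVE_POLY = 0x11D  # assumption: x^8 + x^4 + x^3 + x^2 + 1 over GF(2)


def mul_alpha(x):
    """Multiply an 8-bit symbol by the primitive element of GF(2^8).

    A left shift multiplies by the primitive element; if the result leaves
    8 bits, XOR with the primitive polynomial reduces it back into the field.
    """
    x <<= 1
    if x & 0x100:
        x ^= PRIMITIVE_POLY
    return x


def mul_alpha_power(x, k):
    """Multiply x by the k-th power of the primitive element."""
    for _ in range(k):
        x = mul_alpha(x)
    return x
```

Because k is fixed for a given multiplier, the loop unrolls in hardware into a constant network in which each output bit is the XOR of a fixed subset of input bits.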
Multiplier

Publication number: 20080140753

Abstract: An electronically implemented method includes multiplying a number A, and a number B, where A is composed of segments a_i and B is composed of segments b_j where i and j are integers greater than 1. The multiplying includes determining partial product values for at least some of a_i*b_j and determining a sum of partial product values for a_i*b_j and a_j*b_i where a_i=b_j and b_j=a_i for respective values of i and j, by multiplying one of (1) a_i*b_j and (2) a_j*b_i by two. A sum is determined and stored in a memory storage element of the determined partial product values and the determined sum of partial product values for a_i*b_j and a_j*b_i.

Type: Application

Filed: December 8, 2006

Publication date: June 12, 2008

Inventors: Vinodh Gopal, Gilbert M. Wolrich, Wajdi Feghali, Robert P. Ottavi
METHOD AND APPARATUS FOR EFFICIENT MATRIX MULTIPLICATION IN A DIRECT SEQUENCE CDMA SYSTEM

Publication number: 20080140752

Abstract: System and method for processing symbols in a communication system are disclosed and may include in a processor that receives symbols to be coded for transmission over a wireless medium, grouping elements of an input matrix across a second dimension of the input matrix to form groups of matrix elements while multiplying the input matrix and an input vector. The input vector may include the symbols to be coded for transmission over the wireless medium. The method may also include pre-computing possible permutations of partial results for each of the groups of matrix elements, and assigning the partial results from each of the groups of matrix elements to each of a corresponding index of a first dimension of the input matrix to form a matrix of assigned partial results.

Type: Application

Filed: January 21, 2008

Publication date: June 12, 2008

Inventor: Yung-hsiang Lee
Single Precision Vector Dot Product with "Word" Vector Write Mask

Publication number: 20080114826

Abstract: The present invention is generally related to the field of image processing, and more specifically to an instruction set for processing images. Vector processing may involve performing a plurality of dot product operations to generate operands for a new vector. The dot product operations may require the issue of a plurality of permute instructions to arrange the vector operands in desired locations of a target register. Embodiments of the invention provide a dot product instruction wherein a mask field may be used to specify a particular location of a target register in which to transfer data, thereby avoiding the need for permute instructions for arranging data, reducing dependencies between instructions, and the usage of temporary registers.

Type: Application

Filed: October 31, 2006

Publication date: May 15, 2008

Inventors: Eric Oliver Mejdrich, Adam James Muff
Dual-multiply-accumulator operation optimized for even and odd multisample calculations

Patent number: 7353244

Abstract: According to some embodiments, a dual multiply-accumulate operation optimized for even and odd multisample calculations is disclosed.

Type: Grant

Filed: April 16, 2004

Date of Patent: April 1, 2008

Assignee: Marvell International Ltd.

Inventors: Bradley C. Aldrich, Nigel C. Paver, William T. Maghielse
Dual Mode Floating Point Multiply Accumulate Unit

Publication number: 20070185953

Abstract: Included are embodiments of a Multiply-Accumulate Unit to process multiple format floating point operands. For short format operands, embodiments of the Multiply Accumulate Unit are configured to process data with twice the throughput of long and mixed format data. At least one embodiment can include a short exponent calculation component configured to receive short format data, a long exponent calculation component configured to receive long format data, and a mixed exponent calculation component configured to receive short exponent data, the mixed exponent calculation component further configured to receive long format data. Embodiments also include a mantissa datapath configured for implementation to accommodate processing of long, mixed, and short floating point operands.

Type: Application

Filed: February 6, 2007

Publication date: August 9, 2007

Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
Pipelined multiply-accumulate unit and out-of-order completion logic for a superscalar digital signal processor and method of operation thereof

Patent number: 7231510

Abstract: A mechanism for, and method of, processing multiply-accumulate instructions with out-of-order completion in a pipeline, for use in a processor having an at least four-wide instruction issue architecture, and a digital signal processor (DSP) incorporating the mechanism or the method. In one embodiment, the mechanism includes: (1) a multiply-accumulate unit (MAC) having an initial multiply stage and a subsequent accumulate stage and (2) out-of-order completion logic, associated with the MAC, that causes interim results produced by the multiply stage to be stored when the accumulate stage is unavailable and allows younger instructions to complete before the multiply-accumulate instructions.

Type: Grant

Filed: November 13, 2001

Date of Patent: June 12, 2007

Assignee: VeriSilicon Holdings (Cayman Islands) Co. Ltd.

Inventors: Hung T. Nguyen, Shannon A. Wichman
Multi-purpose floating point and integer multiply-add functional unit with multiplication-comparison test addition and exponent pipelines

Patent number: 7225323

Abstract: A multipurpose functional unit is configurable to support a number of operations including multiply-add and comparison testing operations, as well as other integer and/or floating-point arithmetic operations, Boolean operations, and format conversion operations.

Type: Grant

Filed: November 10, 2004

Date of Patent: May 29, 2007

Assignee: NVIDIA Corporation

Inventors: Ming Y. Siu, Stuart F. Oberman
Programmable processor with group floating-point operations

Patent number: 7216217

Abstract: A programmable processor that comprises a general purpose processor architecture, capable of operation independent of another host processor, having a virtual memory addressing unit, an instruction path and a data path; an external interface; a cache operable to retain data communicated between the external interface and the data path; at least one register file configurable to receive and store data from the data path and to communicate the stored data to the data path; and a multi-precision execution unit coupled to the data path. The multi-precision execution unit is configurable to dynamically partition data received from the data path to account for an elemental width of the data and is capable of performing group floating-point operations on multiple operands in partitioned fields of operand registers and returning catenated results. In other embodiments the multi-precision execution unit is additionally configurable to execute group integer and/or group data handling operations.

Type: Grant

Filed: August 25, 2003

Date of Patent: May 8, 2007

Assignee: Microunity Systems Engineering, Inc.

Inventors: Craig Hansen, John Moussouris
Pipelined processor method and circuit with interleaving of iterative operations

Patent number: 7206927

Abstract: A method of executing an instruction stream in a pipelined execution unit of depth, p, comprises loading the instruction stream; detecting an iteration of an instruction in the loaded instruction stream; interleaving p streams of instances of the instruction in the pipeline; detecting an end of the iteration; and combining results obtained from the p streams after all programmed iterations have completed. A computational circuit comprises a register which can hold a value representing both an operand and result of an iterative operation; a multiplexer having a first input connected to receive the operand from the register, a second input connected to a source of an identity value for the iterative operation, and an output; and an operator circuit having an input connected to receive a value from the multiplexer output, and an output connected to return the result to the register.

Type: Grant

Filed: November 19, 2002

Date of Patent: April 17, 2007

Assignee: Analog Devices, Inc.

Inventor: Abhijit Giri
Extended-precision accumulation of multiplier output

Patent number: 7181484

Abstract: A multiply unit includes an extended precision accumulator. Microprocessor instructions are provided for manipulating portions of the extended precision accumulator including an instruction to move the contents of a portion of the extended accumulator to a general-purpose register (“MFLHXU”) and an instruction to move the contents of a general-purpose register to a portion of the extended accumulator (“MTLHXU”).

Type: Grant

Filed: February 21, 2001

Date of Patent: February 20, 2007

Assignee: MIPS Technologies, Inc.

Inventors: Morten Stribaek, Pascal Paillier
Programmable logic device including multipliers and configurations thereof to reduce resource utilization

Patent number: 7142010

Abstract: In a programmable logic device having dedicated multiplier circuitry, some of the scan chain registers normally used for testing the device are located adjacent input registers of the multipliers. Those scan chain registers are ANDed with the input registers, and can be loaded with templates of ones and zeroes. This allows, e.g., subset multiplication if the least significant bits are loaded with zeroes and the remaining bits are loaded with ones. The multipliers preferably are arranged in blocks with other components, such as adders, that allow them to be configured as finite impulse response (FIR) filters. In such configurations, the scan chain registers can be used to load filter coefficients, avoiding the use of scarce logic and routing resources of the device.
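The masking idea in this abstract can be sketched in software: an operand is ANDed with a template of ones and zeroes before it reaches the multiplier. The names `high_bits_template` and `masked_multiply`, the unsigned arithmetic, and the `width` parameter are assumptions for illustration only.

```python
def high_bits_template(width, keep):
    """Template with `keep` ones in the top positions and zeroes below."""
    return ((1 << keep) - 1) << (width - keep)

def masked_multiply(a, b, template, width):
    """AND operand a with the template, then multiply (unsigned operands)."""
    full = (1 << width) - 1           # limit operands to the register width
    return ((a & template) & full) * (b & full)
```

For example, with an 8-bit width and a template keeping the top four bits, the operand 0xAB is reduced to 0xA0 before the multiplication.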

Type: Grant

Filed: December 19, 2003

Date of Patent: November 28, 2006

Assignee: Altera Corporation

Inventors: Martin Langhammer, Chiao Kai Hwang, Gregory Starr
Performance optimized approach for efficient downsampling operations

Patent number: 7127482

Abstract: An algorithm and hardware structure are described for numerical operations on signals that are reconfigurable to operate in a downsampling or non-downsampling mode. According to one embodiment, a plurality of adders and multipliers are reconfigurable via a switching fabric to operate as a plurality of MAAC (multiply-add-accumulator) kernels (described in detail below) when operating in a non-downsampling mode, and as a plurality of MAAC kernels and AMAAC (add-multiply-add-accumulator) kernels (described in detail below) when operating in a downsampling mode.

Type: Grant

Filed: November 19, 2001

Date of Patent: October 24, 2006

Assignee: Intel Corporation

Inventors: Yan Hou, Hong Jiang, Kam Leung
Extending the range of computational fields of integers

Patent number: 7111166

Abstract: An extension of the serial/parallel Montgomery modular multiplication method with simultaneous reduction as previously implemented by the applicants, adapted innovatively to perform both in the prime number and in the GF(2^q) polynomial based number field, in such a way as to simplify the flow of operands, by performing a multiple anticipatory function to enhance the previous modular multiplication procedures.
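For readers unfamiliar with Montgomery multiplication, the following is a minimal sketch of the textbook integer form (REDC) for an odd modulus. It is not the serial/parallel method with simultaneous reduction that the abstract refers to, and it does not cover the GF(2^q) case. The function names are hypothetical, and `pow(n, -1, r)` assumes Python 3.8 or later.

```python
def montgomery_setup(n, k):
    """Return (r, n_prime) for an odd modulus n < 2**k, with r = 2**k."""
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r    # n * n_prime is congruent to -1 mod r
    return r, n_prime

def montgomery_multiply(a_bar, b_bar, n, k, n_prime):
    """Return a_bar * b_bar * r^-1 mod n for inputs in [0, n)."""
    mask = (1 << k) - 1
    t = a_bar * b_bar
    m = ((t & mask) * n_prime) & mask
    u = (t + m * n) >> k              # exact: the low k bits of t + m*n are zero
    return u - n if u >= n else u     # single conditional subtraction

def modmul(a, b, n, k):
    """Compute a*b mod n by converting to and from Montgomery form."""
    r, n_prime = montgomery_setup(n, k)
    a_bar = (a * r) % n
    b_bar = (b * r) % n
    c_bar = montgomery_multiply(a_bar, b_bar, n, k, n_prime)
    return montgomery_multiply(c_bar, 1, n, k, n_prime)
```

The appeal of the method is that the reduction uses only masking and shifting by k bits rather than a division by n.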

Type: Grant

Filed: May 14, 2001

Date of Patent: September 19, 2006

Assignee: Fortress U&T Div. M-Systems Flash Disk Pioneers Ltd.

Inventors: Itai Dror, Carmi David Gressel, Michael Mostovoy, Alexey Molchanov
Multiply-accumulate (MAC) unit for single-instruction/multiple-data (SIMD) instructions

Patent number: 7107305

Abstract: A tightly coupled dual 16-bit multiply-accumulate (MAC) unit for performing single-instruction/multiple-data (SIMD) operations may forward an intermediate result to another operation in a pipeline to resolve an accumulating dependency penalty. The MAC unit may also be used to perform 32-bit×32-bit operations.
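A small sketch of what a dual 16-bit MAC computes on packed data, plus one common way a 32-bit×32-bit product can be assembled from 16-bit partial products. The abstract does not say how the unit performs its 32-bit operations, so the decomposition in `mul32_from_16` is an assumption, as are all function names and the signed/unsigned choices.

```python
def to_signed16(v):
    """Interpret the low 16 bits of v as a two's-complement value."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v

def dual_mac16(acc_lo, acc_hi, a, b):
    """Two independent signed 16x16 MACs on packed 32-bit words a and b."""
    a_lo, a_hi = to_signed16(a), to_signed16(a >> 16)
    b_lo, b_hi = to_signed16(b), to_signed16(b >> 16)
    return acc_lo + a_lo * b_lo, acc_hi + a_hi * b_hi

def mul32_from_16(a, b):
    """Unsigned 32x32 product built from four 16x16 partial products."""
    a_lo, a_hi = a & 0xFFFF, (a >> 16) & 0xFFFF
    b_lo, b_hi = b & 0xFFFF, (b >> 16) & 0xFFFF
    return (a_lo * b_lo
            + ((a_lo * b_hi + a_hi * b_lo) << 16)
            + ((a_hi * b_hi) << 32))
```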

Type: Grant

Filed: October 5, 2001

Date of Patent: September 12, 2006

Assignee: Intel Corporation

Inventors: Deli Deng, Anthony Jebson, Yuyun Liao, Nigel C. Paver, Steve J. Strazdus
Virtually parallel multiplier-accumulator

Patent number: 7080113

Abstract: A virtually parallel multiplier-accumulator (VMAC) that can execute more than or less than one MAC operation in a single system clock cycle. The inventive VMAC advantageously employs a resource/time-sharing methodology with multiple sequential computational stages.

Type: Grant

Filed: July 17, 2003

Date of Patent: July 18, 2006

Assignee: Agere Systems Inc.

Inventors: Hyun Lee, Shaun P. Whalen
Multiply accumulator for two N bit multipliers and an M bit addend

Patent number: 7043517

Abstract: A multiply accumulator performs a multiplication-and-addition operation for a first multiplier with N bits, a second multiplier with N bits, and an addend with M bits, wherein M is larger than 2N. The multiply accumulator includes a modified Booth encoder and a multiplication-and-addition unit. The modified Booth encoder performs a Booth encoding to either the first multiplier or its bit inversion by supplementing a multiplier sign bit behind a least significant bit of either the first multiplier or its bit inversion. The multiplication-and-addition unit includes a carry save adder tree and a sign extension adder and achieves a high speed of the multiplication-and-addition operation by simultaneously performing the multiplication and addition.

Type: Grant

Filed: March 7, 2003

Date of Patent: May 9, 2006

Assignee: Faraday Technology Corp.

Inventor: Chi-jui Chung
Method and system for performing parallel integer multiply accumulate operations on packed data

Patent number: 7043518

Abstract: A multiply accumulate unit (“MAC”) that performs operations on packed integer data. In one embodiment, the MAC receives two 32-bit data words which, depending on the specified mode of operation, each contain either four 8-bit operands, two 16-bit operands, or one 32-bit operand. Depending on the mode of operation, the MAC performs either sixteen 8×8 operations, four 16×16 operations, or one 32×32 operation. Results may be individually retrieved from registers and the corresponding accumulator cleared after the read cycle. In addition, the accumulators may be globally initialized. Two results from the 8×8 operations may be packed into a single 32-bit register. The MAC may also shift and saturate the products as required.
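The operation counts in this abstract (sixteen, four, or one) are consistent with multiplying every lane of one word by every lane of the other. The sketch below models that reading; the pairwise interpretation, the unsigned lanes, and the names `unpack` and `packed_mac` are assumptions, and shifting, saturation and result packing are not modelled.

```python
def unpack(word, lane_bits):
    """Split a 32-bit word into unsigned lanes, least significant first."""
    mask = (1 << lane_bits) - 1
    return [(word >> shift) & mask for shift in range(0, 32, lane_bits)]

def packed_mac(accs, a, b, lane_bits):
    """Add every pairwise lane product of a and b to its own accumulator."""
    lanes_a, lanes_b = unpack(a, lane_bits), unpack(b, lane_bits)
    n = len(lanes_a)
    if len(accs) != n * n:
        raise ValueError("need one accumulator per lane pair")
    return [accs[i * n + j] + lanes_a[i] * lanes_b[j]
            for i in range(n) for j in range(n)]
```

With `lane_bits` set to 8, 16 or 32, the function performs sixteen, four or one multiply-accumulate operations respectively.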

Type: Grant

Filed: February 9, 2004

Date of Patent: May 9, 2006

Assignee: Cradle Technologies, Inc.

Inventors: Moshe B. Simon, Erik P. Machnicki, David A. Harrison, Rakesh K. Singh
SIMD sum of product arithmetic method and circuit, and semiconductor integrated circuit device equipped with the SIMD sum of product arithmetic circuit

Patent number: 7043519

Abstract: In an SIMD sum of product arithmetic method of enabling a concurrent execution of 2n (where n is a natural number) parallel sum of product arithmetic (operations), the SIMD sum of product arithmetic is executed using 2^m (m=0, . . . , log2 n) accumulators as one set, and by replacing a (2p-1)th accumulator with an adjacent 2pth (p=1, . . . , n/2) accumulator, without changing a sequence of accumulator addresses, in the set, as accumulator addresses to be allocated to sum of product arithmetic circuits for the SIMD sum of product arithmetic.

Type: Grant

Filed: September 5, 2001

Date of Patent: May 9, 2006

Assignee: Fujitsu Limited

Inventor: Masayuki Tsuji
Apparatus for multiplying and accumulating numeric quantities

Patent number: 7035890

Abstract: An apparatus for multiplying and accumulating numeric quantities, including a multiplier for receiving the numeric quantities, with the multiplier having a sum output and a carry output. A first shift register has an input coupled to the sum output of the multiplier, and a second shift register has an input coupled to the carry output of the multiplier. An adder and third shift register are used to complete processing of the apparatus' arithmetic operations.
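A multiplier with separate sum and carry outputs keeps its result in redundant (carry-save) form until a final adder resolves it. The sketch below illustrates that general technique for non-negative integers; it does not model the shift registers of the apparatus, and the function names are hypothetical.

```python
def carry_save_add(x, y, z):
    """3:2 compressor: return (s, c) such that s + c == x + y + z."""
    s = x ^ y ^ z                               # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1      # carries, moved one bit left
    return s, c

def multiply_carry_save(a, b, bits):
    """Unsigned multiply that leaves the product as a (sum, carry) pair."""
    s, c = 0, 0
    for i in range(bits):
        if (b >> i) & 1:
            s, c = carry_save_add(s, c, a << i)  # fold in partial product i
    return s, c

def mac(acc, a, b, bits):
    """acc + a*b, with one carry-propagate addition at the very end."""
    s, c = multiply_carry_save(a, b, bits)
    s, c = carry_save_add(s, c, acc)
    return s + c
```

Deferring the carry-propagate addition to the last step is what lets the accumulation share the multiplier's adder tree.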

Type: Grant

Filed: March 1, 2001

Date of Patent: April 25, 2006

Assignee: 8x8, Inc.

Inventors: Jan Fandrianto, Chi Shin Wang, Sehat Sutardja, Hedley K. J. Rainnie, Bryan R. Martin
Residue number system based pre-computation and dual-pass arithmetic modular operation approach to implement encryption protocols efficiently in electronic integrated circuits

Patent number: 7027598

Abstract: A pre-computation and dual-pass modular operation approach to implement encryption protocols efficiently in electronic integrated circuits is disclosed. An encrypted electronic message is received and another electronic message generated based on the encryption protocol. Two passes of Montgomery's method are used for a modular operation that is associated with the encryption protocol along with pre-computation of a constant based on a modulus. The modular operation may be a modular multiplication or a modular exponentiation. Modular arithmetic may be performed using the residue number system (RNS) and two RNS bases with conversions between the two RNS bases. A minimal number of register files are used for the computations along with an array of multiplier circuits and an array of modular reduction circuits. The approach described allows for high throughput for large encryption keys with a relatively small number of logical gates.
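To illustrate the residue number system (RNS) mentioned in this abstract: a value is held as its residues modulo pairwise coprime moduli, multiplication acts on each residue independently, and the Chinese remainder theorem recovers the integer. This is a generic sketch with hypothetical function names, not the dual-pass Montgomery scheme or the base conversion of the patent; `math.prod` and `pow(x, -1, m)` assume Python 3.8 or later.

```python
from math import prod

def to_rns(x, moduli):
    """Represent x by its residues modulo each (pairwise coprime) modulus."""
    return [x % m for m in moduli]

def rns_mul(xs, ys, moduli):
    """Multiply two RNS values lane by lane, with no carries between lanes."""
    return [(x * y) % m for x, y, m in zip(xs, ys, moduli)]

def from_rns(residues, moduli):
    """Recover the integer in [0, product of moduli) via the CRT."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        m_i = big_m // m
        total += r * m_i * pow(m_i, -1, m)   # m_i is invertible modulo m
    return total % big_m
```

The result is only exact while the true product stays below the product of the moduli, which is one reason RNS schemes typically work with two bases and convert between them.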

Type: Grant

Filed: September 19, 2001

Date of Patent: April 11, 2006

Assignee: Cisco Technology, Inc.

Inventors: Mihailo M. Stojancic, Mahesh S. Maddury, Kenneth J. Tomei
Pre-computation and dual-pass modular arithmetic operation approach to implement encryption protocols efficiently in electronic integrated circuits

Patent number: 7027597

Abstract: A pre-computation and dual-pass modular operation approach to implement encryption protocols efficiently in electronic integrated circuits is disclosed. An encrypted electronic message is received and another electronic message generated based on the encryption protocol. Two passes of Montgomery's method are used for a modular operation that is associated with the encryption protocol along with pre-computation of a constant based on a modulus. The modular operation may be a modular multiplication or a modular exponentiation. Modular arithmetic may be performed using the residue number system (RNS) and two RNS bases with conversions between the two RNS bases. A minimal number of register files are used for the computations along with an array of multiplier circuits and an array of modular reduction circuits. The approach described allows for high throughput for large encryption keys with a relatively small number of logical gates.

Type: Grant

Filed: September 18, 2001

Date of Patent: April 11, 2006

Assignee: Cisco Technology, Inc.

Inventors: Mihailo M. Stojancic, Mahesh S. Maddury, Kenneth J. Tomei
Methods and apparatus for performing parallel integer multiply accumulate operations

Patent number: 7013321

Abstract: According to the invention, a processing core that executes a parallel multiply accumulate operation is disclosed. Included in the processing core are first, second and third input operand registers; a number of functional blocks; and an output operand register. The first, second and third input operand registers respectively include a number of first input operands, a number of second input operands and a number of third input operands. Each of the number of functional blocks performs a multiply accumulate operation. The output operand register includes a number of output operands. Each of the number of output operands is related to one of the number of first input operands, one of the number of second input operands and one of the number of third input operands.

Type: Grant

Filed: November 21, 2001

Date of Patent: March 14, 2006

Assignee: Sun Microsystems, Inc.

Inventor: Ashley Saulsbury
Data processor with enhanced instruction execution and method

Patent number: 7010558

Abstract: An apparatus and method for performing enhanced algorithmic processing, including reduced cycle-count fast Fourier transform (FFT) calculations. In one aspect, the invention comprises a user-configurable processor having an extension instruction adapted for reduced cycle-count algorithmic operations. In one exemplary embodiment, the processor is an extensible core, and the extension instruction comprises a 32-bit instruction word linked with existing circuitry in the processor core used for multiply-accumulate (mac) instructions. 16-bit, 24-bit, and dual 16-bit multiply options are available for the multiply/accumulate unit of the processor. The extension instruction is pipelined to the same number of stages as the mac instructions, thereby avoiding unnecessary stalls and increasing performance. A modified accumulator data path used in support of the foregoing instruction is also described.

Type: Grant

Filed: April 18, 2002

Date of Patent: March 7, 2006

Assignee: ARC International

Inventor: Chris Morris
Dyadic DSP instruction predecode signal selective multiplexing data from input buses to first and second plurality of functional blocks to execute main and sub operations

Patent number: 6988184

Abstract: Methods of performing dyadic digital signal processing (DSP) instructions. In one embodiment of the invention, the method includes fetching a dyadic DSP instruction having a main operation and a sub operation; predecoding the dyadic DSP instruction to generate predecoded instruction signals; and decoding the predecoded instruction signals to generate select signals to selectively couple data from a first plurality of buses coupled to inputs of multiplexers of a first plurality of DSP functional blocks to execute the main operation of the dyadic DSP instruction in one processor cycle and to selectively couple data from a second plurality of buses coupled to inputs of multiplexers of a second plurality of DSP functional blocks to execute the sub operation of the dyadic DSP instruction in the one processor cycle.

Type: Grant

Filed: August 2, 2002

Date of Patent: January 17, 2006

Assignee: Intel Corporation

Inventors: Kumar Ganapathy, Ruban Kanapathipillai
Method and apparatus for implementing single/dual packed multi-way addition instructions having accumulation options

Patent number: 6976049

Abstract: The present invention relates to a method and system for providing a single accumulatable packed multi-way addition instruction having the functionality of multiple instructions without causing any timing problems in the execute stage. Specifically, the accumulatable packed multi-way combination instruction may be associated with at least one destination and a plurality of operands and set a polarity of each of a plurality of source operands derived from the plurality of operands, if requested by the instruction. The instruction also may add selected pairs of the plurality of source operands in predetermined orders to obtain at least one result and, if requested by the instruction, accumulate the plurality of results to obtain at least one accumulated result; output at least one predetermined pair of the at least one result and the at least one accumulated result; and accumulate condition codes for each of the at least one result and the at least one accumulated result, if requested by the instruction.

Type: Grant

Filed: March 28, 2002

Date of Patent: December 13, 2005

Assignee: Intel Corporation

Inventor: Gad Sheaffer
