Sum Of Products Generation Patents (Class 708/603)

Multimedia instruction set for wide data paths

Patent number: 6675286

Abstract: Partitioned sigma instructions are provided in which processor capacity is effectively distributed among multiple sigma operations which are executed concurrently. Special registers are included for aligning data on memory word boundaries to reduce packing overhead in providing long data words for multimedia instructions which implement shifting data sequences over multiple iterations. Extended partitioned arithmetic instructions are provided to improve precision and avoid accumulated carry over errors. Partitioned formatting instructions, including partitioned interleave, partitioned compress, and partitioned interleave and compress pack subwords in an effective order for other partitioned operations.

Type: Grant

Filed: April 27, 2000

Date of Patent: January 6, 2004

Assignee: University of Washington

Inventors: Weiyun Sun, Stefan G. Berg, Donglok Kim, Yongmin Kim
Pre-reduction technique within a multiplier/accumulator architecture

Publication number: 20030158879

Abstract: An apparatus and method for compressing a reduction array into an accumulated carry-save sum. The reduction array includes a partial product matrix, a carry-save sum, and a constant value row. A compressor array generates a previous accumulated carry-save sum. A three-input/two-output carry-save adder pre-reduces the constant value row and the previously accumulated carry-save sum into a two-row intermediate carry-save sum that is added to the partial product matrix to form a current accumulated carry-save sum.

Type: Application

Filed: December 11, 2000

Publication date: August 21, 2003

Applicant: International Business Machines Corporation

Inventors: Ohsang Kwon, Kevin J. Nowka
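
Illustrative sketch (not taken from the application itself): the pre-reduction step described above relies on a three-input/two-output carry-save adder. Assuming non-negative integers stand in for the bit rows, and using the hypothetical names `carry_save_add` and `pre_reduce`, the compression of a constant row plus a previously accumulated sum/carry pair into two rows could look like this:

```python
def carry_save_add(a: int, b: int, c: int) -> tuple:
    """3:2 compressor on non-negative ints: returns (sum_row, carry_row)."""
    sum_row = a ^ b ^ c                              # bitwise sum without carries
    carry_row = ((a & b) | (a & c) | (b & c)) << 1   # majority bits, moved up one place
    return sum_row, carry_row


def pre_reduce(constant_row: int, prev_sum: int, prev_carry: int) -> tuple:
    """Fold the constant row into the previous carry-save pair (assumed model)."""
    return carry_save_add(constant_row, prev_sum, prev_carry)


s, c = pre_reduce(5, 9, 14)
# The two output rows together represent the same value as the three inputs.
print(s + c == 5 + 9 + 14)
```

The two rows returned are never resolved by a carry-propagating add here; that deferral is the point of keeping an accumulated value in carry-save form.
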
Method and apparatus for arithmetic operation

Patent number: 6609143

Abstract: It is an object of the present invention to provide an arithmetic logic unit that can perform a sum-of-products operation in a reduced number of processing cycles without carrying out data transfer and additions even in obtaining a single result from a plurality of divided input data words. Data words X and Y are input. A product of the high-order bits of X and Y is calculated using first decoder 511, first selector 521, first partial product generator 531 and first full adder 541. A product of the low-order bits of X and Y is also calculated using second decoder 512, second selector 522, second partial product generator 532 and second full adder 542. These products are adaptively shifted at a shifter 55 and then added up with a fed back data word Z at a third full adder 56 and a carry-propagation adder 58. In this manner, the data word Z, representing the result of the sum-of-products operation, is obtained.

Type: Grant

Filed: July 20, 2000

Date of Patent: August 19, 2003

Assignee: Matsushita Electric Industrial Co., Ltd

Inventors: Tomochika Kanakogi, Masaitsu Nakajima
Multiply-accumulate accelerator with data re-use

Publication number: 20030145030

Abstract: Input data is received by an execution unit. One or more current multiply-accumulate operations are performed by the execution unit on the received input data and on input data received by the execution unit for one or more prior multiply-accumulate operations and saved by the execution unit.

Type: Application

Filed: January 31, 2002

Publication date: July 31, 2003

Inventor: Gad S. Sheaffer
Ram based processing engine for simultaneous sum of products computation

Patent number: 6598062

Abstract: A processing engine (10) that generates sum of products (SOP) values for incoming data. The processing engine (10) includes a calculation module (30) for generating intermediate and SOP values based on the incoming data and coefficient values, wherein the intermediate values are defined by product values and partial presum values. A feedback module (50) stores the intermediate values until the calculation module (30) generates SOP values. The processing engine (10) further includes a reordering module (70) for reordering the SOP values. The feedback module (50) includes a switching mechanism (52) for retrieving intermediate values from the calculation module (30) until the calculation module (30) generates SOP values. Thus, a feedback RAM (53) can store the intermediate values without the need for buffering RAM at the input stage.

Type: Grant

Filed: May 31, 2000

Date of Patent: July 22, 2003

Assignee: Northrop Grumman Corporation

Inventor: Derek Layne
System and method for efficient hardware implementation of a perfect precision blending function

Patent number: 6584483

Abstract: The present invention is directed to an apparatus and method for efficiently calculating an intermediate value between a first end value and a second end value such that the area and time required to implement this operation is minimized. The present invention is also used to efficiently multiply a value by a fraction. A fraction is involved in calculating an intermediate value and also for multiplying by a fraction. When the denominator of the fraction is odd, the binary representation of the blending function, which is used to calculate an intermediate value, exhibits special characteristics. The special characteristics allow the present invention to, among others, avoid the use of multipliers, which require a large number of gates to implement. This invention exploits this and other special characteristics in order to efficiently implement in hardware the blending function and to efficiently multiply a value by a fraction.

Type: Grant

Filed: December 30, 1999

Date of Patent: June 24, 2003

Assignee: Intel Corporation

Inventors: Tom Altus, Jacob D. Doweck
Multiplier array processing system with enhanced utilization at lower precision

Patent number: 6584482

Abstract: A multiplier array processing system which improves the utilization of the multiplier and adder array for lower-precision arithmetic is described. New instructions are defined which provide for the deployment of additional multiply and add operations as a result of a single instruction, and for the deployment of greater multiply and add operands as the symbol size is decreased.

Type: Grant

Filed: August 19, 1999

Date of Patent: June 24, 2003

Assignee: Microunity Systems Engineering, Inc.

Inventors: Craig C. Hansen, Henry Massalin
Computer system and method for parallel computations using table approximation

Patent number: 6567831

Abstract: A method optimizes function evaluations performed by a VLIW processor through enhanced parallelism by evaluating the function by table approximation using decomposition into a Taylor series.

Type: Grant

Filed: April 20, 2000

Date of Patent: May 20, 2003

Assignee: Elbrus International Limited

Inventor: Vadim E. Loginov
Digital signal processor with coupled multiply-accumulate units

Patent number: 6557022

Abstract: Two multiply-accumulate units are coupled together so that the computation (B*C)+(D*E) can be completed in one cycle. An adder (216) adds together the products of the two multipliers (206), (208). The sum is applied to the first accumulator (220). Preferably, the second product is also applied to the second accumulator (222), and a multiplexer (218) applies either a zero or the second product to the adder (216). If two unrelated computations are to be executed simultaneously, then the zero is applied, and the output of the second accumulator is fed back to the register file (PI2). If a single (B*C)+(D*E) computation is to be executed, then the second product is applied to the adder, and the output of the second accumulator is disregarded.

Type: Grant

Filed: February 26, 2000

Date of Patent: April 29, 2003

Assignee: Qualcomm, Incorporated

Inventors: Gilbert C. Sih, Xufeng Chen, De D. Hsu
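
Illustrative sketch (an assumed behavioral model, not the patented circuit): the abstract above describes a multiplexer that feeds either zero or the second product into the shared adder. The function name `coupled_mac_step` and the choice to accumulate into both accumulators on every step are assumptions made for illustration.

```python
def coupled_mac_step(acc0, acc1, b, c, d, e, coupled):
    """One cycle of two multiply-accumulate units that can be coupled.

    coupled=True : acc0 receives (b*c) + (d*e); acc1's output is to be ignored.
    coupled=False: acc0 receives b*c and acc1 receives d*e independently.
    """
    p0 = b * c                        # first multiplier
    p1 = d * e                        # second multiplier
    mux_out = p1 if coupled else 0    # multiplexer: second product or zero
    acc0 = acc0 + p0 + mux_out        # shared adder feeds the first accumulator
    acc1 = acc1 + p1                  # second product also reaches the second accumulator
    return acc0, acc1


print(coupled_mac_step(0, 0, 2, 3, 4, 5, coupled=True)[0])   # single (B*C)+(D*E) result
print(coupled_mac_step(0, 0, 2, 3, 4, 5, coupled=False))     # two unrelated results
```
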
Multiply-accumulate (MAC) unit for single-instruction/multiple-data (SIMD) instructions

Publication number: 20030069913

Abstract: A tightly coupled dual 16-bit multiply-accumulate (MAC) unit for performing single-instruction/multiple-data (SIMD) operations may forward an intermediate result to another operation in a pipeline to resolve an accumulating dependency penalty. The MAC unit may also be used to perform 32-bit×32-bit operations.

Type: Application

Filed: October 5, 2001

Publication date: April 10, 2003

Inventors: Deli Deng, Anthony Jebson, Yuyun Liao, Nigel C. Paver, Steve J. Strazdus
Non-constant reduced-complexity multiplication in signal processing transforms

Publication number: 20030061252

Abstract: A machine or method used in signal processing transforms involving computing one or more sums each of one or more products. A multiplier has one or both of its two inputs restricted to limited sets of numbers having given finite-precision numeric formats. The multiplier is not a constant multiplier capable only of computing the product of any first number and a constant. The multiplier is not a general multiplier capable of computing the product of any pair of numbers. The multiplier has lower complexity than a general multiplier, but more flexibility than a constant multiplier. The invention can be used to reduce the overall computational complexity of signal processing transforms. The invention can be used when transform weights are fixed and known. The invention can be used when transform inputs, though random, come from small, known sets, as is the case in digital communications.

Type: Application

Filed: September 27, 2001

Publication date: March 27, 2003

Inventor: Charles D. Murphy
Circuit and method for multiplying and accumulating the sum of two products in a single cycle

Patent number: 6523055

Abstract: A multiplication accumulation circuit (abbreviated as “MAC”) has five input buses that carry signals for operands A, B, C, D and E, a control bus that carries signals for controlling the operations performed on the received operands, and an output bus that carries a signal generated by the MAC. Each of operands A, B, C and D can be four different operands that are used as follows by the MAC: (1) to perform two multiplications simultaneously, and (2) to perform an addition of the products of the two multiplications and the fifth operand E, e.g. generate on the output bus a signal of value A*C+B*D+E. Alternatively, operands A and B can be, respectively, the upper and lower halves of a first double word to be used as a multiplicand. Similarly, operands C and D can be the upper and lower halves of a second double word to be used as a multiplier.

Type: Grant

Filed: January 20, 1999

Date of Patent: February 18, 2003

Assignee: LSI Logic Corporation

Inventors: Robert K. Yu, Satish Padmanabhan, Chakra R. Srivatsa, Shailesh I. Shah
Arithmetic circuit for accumulative operation

Patent number: 6519621

Abstract: An improved arithmetic circuit for accumulative operation for use in digital signal processors, microprocessors and so forth is described, in which the pipelined control becomes effective during accumulative operation by eliminating idling stages in the pipeline structure. In accordance with the improved arithmetic circuit, during accumulative operation, the next operation is initiated with intermediate results of the current operation while the current operation is being executed and not yet completed so that it is possible to improve the speed of accumulative operation and reduce the scale of integration.

Type: Grant

Filed: May 10, 1999

Date of Patent: February 11, 2003

Assignee: Kabushiki Kaisha Toshiba

Inventor: Naoka Yano
Floating-point and integer multiply-add and multiply-accumulate

Patent number: 6480872

Abstract: A method and a device including, in one embodiment, a multiply array and at least one adder to perform a floating-point multiplication followed by an addition when operands are in floating-point format. The device is also configured to perform an integer multiplication followed by an accumulation when operands are in integer format. The device is further configured to perform a floating-point multiply-add or an integer multiply-accumulation in response to control signals. In another embodiment, the device contains an adder and the adder is capable of performing a floating-point addition and an integer accumulation. The adder is configured to be extra wide to reduce operand misalignment. Moreover, the device stalls the process in response to operand misalignment.

Type: Grant

Filed: January 21, 1999

Date of Patent: November 12, 2002

Assignee: SandCraft, Inc.

Inventor: Jack H. Choquette
SIMD sum of product arithmetic method and circuit, and semiconductor integrated circuit device equipped with the SIMD sum of product arithmetic circuit

Publication number: 20020138535

Abstract: In an SIMD sum of product arithmetic method of enabling a concurrent execution of 2n (where n is a natural number) parallel sum of product arithmetic (operations), the SIMD sum of product arithmetic is executed using 2m (m=0, . . . , log2n) accumulators as one set, and by replacing a 2p-1th accumulator with an adjacent 2pth (p=1, . . . , n) accumulator, without changing a sequence of accumulator addresses, in the set, as accumulator addresses to be allocated to sum of product arithmetic circuits for the SIMD sum of product arithmetic.

Type: Application

Filed: September 5, 2001

Publication date: September 26, 2002

Applicant: Fujitsu Limited

Inventor: Masayuki Tsuji
Multiple function processing core for communication signals

Patent number: 6449630

Abstract: An apparatus for processing digital signals includes a multiplier having a first input and a second input and an output producing a product. An adder is connected to receive the product from the multiplier as a first input to produce a sum. A first register is connected to receive and store the sum and to provide a second input to the adder in response to a clock signal. A second register is connected to receive and store the output of the first register in response to an inverse of the clock signal to enable the addition of two products in a single clock cycle.

Type: Grant

Filed: April 7, 1999

Date of Patent: September 10, 2002

Assignee: Mitsubishi Electric Research Laboratories, Inc.

Inventor: Jay Bao
Multiply accumulate modules and parallel multipliers and methods of designing multiply accumulate modules and parallel multipliers

Publication number: 20020116433

Abstract: A multiply-accumulate module (100) includes a multiply-accumulate core (120), which includes a plurality of Booth encoder cells (104a). The multiply-accumulate core (120) also includes a plurality of Booth decoder cells (110a) connected to at least one of the Booth encoder cells (104a) and a plurality of Wallace tree cells (112a) connected to at least one of the Booth decoder cells (110a). Moreover, at least one first Wallace tree cell (112a1) or at least one first Booth decoder cell (110a1), or any combination thereof, includes a first plurality of transistors, and at least one second Wallace tree cell (112a2) or at least one second Booth decoder cell (110a2), or any combination thereof, includes a second plurality of transistors. In addition, at least one critical path of the multiply-accumulate module (100) includes the at least one first cell and a width of at least one of the first plurality of transistors is greater than a width of at least one of the second plurality of transistors.

Type: Application

Filed: September 27, 2001

Publication date: August 22, 2002

Inventors: Kaoru Awaka, Hiroshi Takahashi, Shigetoshi Muramatsu, Akihiro Takegama
Apparatus and method for performing multiplication operations

Publication number: 20020116434

Abstract: The present invention provides an apparatus and method for processing data using a multiplying circuit for performing a multiplication of a W/2 bit data value by a W bit data value. An instruction decoder is provided which is responsive to a multiply instruction to control the multiplying circuit to generate a multiplication result for the computation M×N, where M and N are W bit data words. The multiplying circuit is arranged to execute a first operation in which the data word N is multiplied by the most significant W/2 bits of the data word M to generate a first intermediate result having 3W/2 bits, and to then execute a second operation in which the data word N is multiplied by the least significant W/2 bits of the data word M to generate a second intermediate result having 3W/2 bits. The first intermediate result is shifted by W/2 with respect to the second intermediate result and added to the second intermediate result to generate the multiplication result.

Type: Application

Filed: December 27, 2000

Publication date: August 22, 2002

Inventor: Alexander Edward Nancekievill
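
Illustrative sketch (assuming unsigned operands and an even word width W; the name `split_multiply` is hypothetical): the two-pass multiplication described in the abstract above can be checked arithmetically as follows.

```python
def split_multiply(m: int, n: int, w: int) -> int:
    """Multiply two unsigned W-bit words using two (W/2)-bit by W-bit products."""
    half = w // 2
    mask = (1 << half) - 1
    m_hi = (m >> half) & mask      # most significant W/2 bits of M
    m_lo = m & mask                # least significant W/2 bits of M
    first = m_hi * n               # first intermediate result (fits in 3W/2 bits)
    second = m_lo * n              # second intermediate result (fits in 3W/2 bits)
    return (first << half) + second  # shift the first by W/2, then add


print(split_multiply(12345, 54321, 16) == 12345 * 54321)
```

Signed operands would need extra handling of the upper half, which this sketch leaves out.
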
Sums of production datapath

Patent number: 6438569

Abstract: A method and apparatus for a sums of products datapath. According to one embodiment of the invention, an apparatus has a number of inputs and a number of generation units. Each of the generation units is coupled to the inputs. Each of the generation units includes a separate selection circuit coupled to each one of the inputs to selectively pass the signal provided on that input. In addition, each of the generation units includes a number of reduction circuits having inputs coupled to mutually exclusive pluralities of the selection circuits and each having an output. The apparatus also includes a first and second summation circuit coupled to the output of the plurality of reduction circuits in mutually exclusive pluralities of generation units. Additionally, the apparatus includes a subtraction circuit coupled to an output of the first and second summation circuit.

Type: Grant

Filed: September 20, 1999

Date of Patent: August 20, 2002

Assignee: PMC-Sierra, Inc.

Inventor: Curtis Abbott
Programmable accelerator for a programmable processor system

Patent number: 6397240

Abstract: A programmable multi-mode accelerator is disclosed for use with a programmable processor or microprocessor. The programmable multi-mode accelerator allows a programmable processor to execute specific algorithms, such as certain types of finite impulse response (FIR), correlation and Viterbi computations, that require low-precision operations at an extremely high rate. The accelerator extends the digital signal processor's performance into the required range for low-precision computations. The accelerator can be coupled with the main data path of a programmable processor or microprocessor and can directly read and write to the main register files of the programmable processor. In an illustrative implementation, the accelerator data path accesses its input values (source operands) directly from a main register file of the programmable processor and writes results back into a second main register file.

Type: Grant

Filed: February 18, 1999

Date of Patent: May 28, 2002

Assignee: Agere Systems Guardian Corp.

Inventors: John Susantha Fernando, Stefan Thurnhofer
Method and apparatus for performing multiply-add operations on packed data

Publication number: 20020059355

Abstract: A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least two of the data elements in this third packed data store the result of performing multiply-add operations on data elements in the first and second packed data.

Type: Application

Filed: November 19, 2001

Publication date: May 16, 2002

Applicant: Intel Corporation

Inventors: Alexander D. Peleg, Millind Mittal, Larry M. Mennemeier, Benny Eitan, Carole Dulong, Eiichi Kowashi, Wolf Witt
Method for performing multiply-add operations on packed data

Patent number: 6385634

Abstract: A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least two of the data elements in this third packed data store the result of performing multiply-add operations on data elements in the first and second packed data.

Type: Grant

Filed: August 31, 1995

Date of Patent: May 7, 2002

Assignee: Intel Corporation

Inventors: Alexander D. Peleg, Millind Mittal, Larry M. Mennemeier, Benny Eitan, Carole Dulong, Eiichi Kowashi, Wolf Witt
Product sum operation device capable of carrying out fast operation

Patent number: 6385635

Abstract: Multipliers 107 through 110 carry out a multiplication operation with two data out of the four data transferred from a memory over buses 101 through 104. The multiplication results are subjected to an addition or subtraction operation with each other in adder-subtracters 111 and 112. The operation results obtained by the adder-subtracters 111 and 112 are supplied to adders 113 and 114 where they are added to values held by accumulators 115 and 116. A latch circuit 105 supplies the data transferred through the bus 102 to the multiplier 109 when a control signal 106 indicates “ON”. The latch circuit 105 temporarily holds the data transferred through the bus 102 and supplies the data held therein to the multiplier 109 when a control signal 106 indicates “OFF”.

Type: Grant

Filed: April 23, 1999

Date of Patent: May 7, 2002

Assignee: NEC Corporation

Inventor: Daiji Ishii
Method and apparatus for computing a sum of packed data elements using SIMD multiply circuitry

Patent number: 6377970

Abstract: A method and apparatus that adds each one of multiple elements of a packed data together to produce a result. According to one such method and apparatus, each of a first set of portions of partial products is produced using a first set of partial product selectors in a multiplier, each of the first set of portions of the partial products being zero. Each of the multiple elements is inserted into one of a second set of portions of the partial products using a second set of partial product selectors, each of the second set of portions of the partial products being aligned. The multiple elements are added together to produce the result including a field having the sum of the multiple elements.

Type: Grant

Filed: March 31, 1998

Date of Patent: April 23, 2002

Assignee: Intel Corporation

Inventors: Mohammad A. Abdallah, Vladimir Pentkovski
Long instruction word controlling plural independent processor operations

Patent number: 6370558

Abstract: A data processing apparatus including a multiplier unit forming a product from L bits of each of two data buses of N bits each, where N is greater than L. The multiplier forms an N bit output having a first portion which is the L most significant bits of the product and a second portion which is M other bits not including the L least significant bits of the product, where N is the sum of M and L. In the preferred embodiment the M other bits are derived from other bits of the two input data buses, such as the M other bits of the first input data bus. An arithmetic logic unit performs parallel operations (addition, subtraction, Boolean functions) controlled by the same instructions. This arithmetic logic unit is divisible into a selected number of sections for performing identical operations on independent sections of its inputs. The multiplier unit may form dual products from separate parts of the input data.

Type: Grant

Filed: October 3, 2000

Date of Patent: April 9, 2002

Assignee: Texas Instruments Incorporated

Inventors: Karl M. Guttag, Christopher J. Read, Keith Balmer
Method and arrangement in a transposed digital FIR filter for multiplying a binary input signal with tap coefficients and a method for designing a transposed digital filter

Patent number: 6370556

Abstract: The invention relates to a method and an arrangement in a transposed digital FIR filter for multiplying a binary input signal by tap coefficients, and to a method for designing such a filter. The invention comprises a shift register (51, 52) shifting in the direction of the least significant bit and copying the most significant bit or filling in zero values. The register receives the binary input signal of the filter and has outputs for outputting the content of the desired bit positions. A plurality of bit-serial subtractor and adder elements (53-57) multiply the binary input signal by N+1 different tap coefficients by combining output bits of the shift register (51, 52). The subtractor and/or adder elements form a network wherein at least one element participates in the multiplying operation of at least two different tap coefficients.

Type: Grant

Filed: September 27, 1995

Date of Patent: April 9, 2002

Assignee: Tritech Microelectronics, Ltd

Inventors: Tapio Saramäki, Tapani Ritoniemi, Ville Eerola, Timo Husu, Eero Pajarre, Seppo Ingalsuo
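
Illustrative sketch (a word-level simplification, not the bit-serial network of the patent): the abstract above describes adder and subtractor elements that are shared between the multiplications for different tap coefficients. The coefficients 3, 6 and 13, the use of left shifts on integers, and the name `shared_tap_products` are all assumptions chosen to show the sharing idea.

```python
def shared_tap_products(x: int) -> tuple:
    """Multiply x by the hypothetical tap coefficients 3, 6 and 13
    using only shifts and adds, reusing one intermediate term."""
    t3 = x + (x << 1)        # 3x, computed once by a single adder
    p3 = t3                  # coefficient 3 uses the shared term directly
    p6 = t3 << 1             # coefficient 6 = 3x * 2, no extra adder
    p13 = (t3 << 2) + x      # coefficient 13 = 12x + x, one more adder
    return p3, p6, p13


print(shared_tap_products(7) == (3 * 7, 6 * 7, 13 * 7))
```

Here the adder that forms 3x takes part in all three products, so two adders cover three coefficient multiplications.
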
Wide word multiplier using booth encoding

Publication number: 20020040379

Abstract: A multiplier for computing a final product of a first operand and a second operand comprising a multiplier array for forming a product of the first operand and second operand in carry-save form; a carry-save adder for adding said carry-save partial products and an accumulated sum to produce carry and save values; a carry-lookahead adder for adding said carry and save values to produce a product value and a carry-out value; a general purpose adder for adding said carry-out and said product value to produce said final product.

Type: Application

Filed: January 2, 2001

Publication date: April 4, 2002

Inventor: Maher Amer
Data alignment between buses

Patent number: 6330631

Abstract: A bus bridge for a computer system for bridging first and second buses includes a shift and accumulate unit. The shift and accumulate unit includes a shifter having an input connected to receive bytes from one of the first and second buses and an output providing a selectable shift to the received bytes. The shift and accumulate unit also includes an accumulator having an input connected to receive the output of the shifter and providing accumulation of selectable bits of the shifted bytes, the accumulator having an output for supplying realigned bytes to be passed to the other of the first and second buses. The combination of the shifter and the accumulator permits a desired amount of shift to be combined with the accumulation of selected bits or bytes to realign sets of bytes from one bus and to form sets of bytes for the other bus. Burst transfer is also possible by operating the shift and accumulate unit to operate in successive cycles for successive sets of input bytes from one of the buses.

Type: Grant

Filed: February 3, 1999

Date of Patent: December 11, 2001

Assignee: Sun Microsystems, Inc.

Inventor: Andrew Crosland
Processor with reconfigurable arithmetic data path

Patent number: 6247036

Abstract: A reconfigurable processor includes at least three (3) MacroSequencers (10)-(16) which are configured in an array. Each of the MacroSequencers is operable to receive on a separate one of four buses (18) an input from the other three MacroSequencers and from itself in a feedback manner. In addition, a control bus (20) is operable to provide control signals to all of the MacroSequencers for the purpose of controlling the instruction sequence associated therewith and also for inputting instructions thereto. Each of the MacroSequencers includes a plurality of executable units having inputs and outputs and each for providing an associated execution algorithm. The outputs of the execution units are input to an output selector which selects the outputs for outputs on at least one external output and on at least one feedback path. An input selector (66) is provided having an input for receiving at least one external output and at least the feedback path.

Type: Grant

Filed: January 21, 1997

Date of Patent: June 12, 2001

Assignee: Infinite Technology Corp.

Inventors: George Landers, Earle Jennings, Tim B. Smith, Glen Haas
Multiple sum-of-products circuit and its use in electronic equipment and microcomputers

Patent number: 6233596

Abstract: An objective of this invention is a design that improves the memory usage ratio and execution speed of a sum-of-products operation instruction, improves the critical path of sum-of-products operations, and prevents overflows. A sum-of-products operation circuit executes sum-of-products operations a number of times that is specified by number-of-executions information comprised within a sum-of-products operation instruction, under the control of a control circuit. The number of times the sum-of-products operation is to be executed is set into a register, that number is decremented every time one cycle of the sum-of-products operation ends, and the sum-of-products operation instruction ends when the value in the register reaches zero. If an interrupt is received during the execution of a plurality of sum-of-products operations, execution of the sum-of-products operations resumes after the interrupt processing. First and second sum-of-products input data are read at the same time by a single memory access.

Type: Grant

Filed: June 5, 1998

Date of Patent: May 15, 2001

Assignee: Seiko Epson Corporation

Inventors: Satoshi Kubota, Makoto Kudo, Yoshiyuki Miyayama
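
Illustrative sketch (an assumed software model, not the patented circuit): the abstract above describes a repeat count held in a register, decremented after each sum-of-products cycle, with execution resuming after an interrupt. The state dictionary, the `max_cycles` parameter used to imitate an interrupt, and the name `run_mac` are hypothetical.

```python
def run_mac(state: dict, memory: list, max_cycles=None) -> bool:
    """Run multiply-accumulate cycles until the count register reaches zero.

    Returns True when the instruction has finished, False if it stopped
    early (modelling an interrupt); calling again resumes from saved state.
    """
    cycles = 0
    while state["count"] > 0:
        if max_cycles is not None and cycles >= max_cycles:
            return False                   # interrupted; state is preserved
        x, y = memory[state["index"]]      # both inputs read in one access
        state["acc"] += x * y
        state["index"] += 1
        state["count"] -= 1                # decrement after each cycle
        cycles += 1
    return True


memory = [(1, 5), (2, 6), (3, 7), (4, 8)]
state = {"count": 4, "index": 0, "acc": 0}
run_mac(state, memory, max_cycles=2)   # interrupted after two cycles
run_mac(state, memory)                 # resumes and finishes
print(state["acc"])
```
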
Discrete cosine high-speed arithmetic unit and related arithmetic unit

Patent number: 6223195

Abstract: An arithmetic unit for carrying out partial sum of products for transform operations such as discrete cosine transform is provided which includes a plurality of first units for calculating in parallel sums of and/or differences between a plurality of input variables or sums of and/or differences between a plurality of values obtained by multiplying said plurality of input variables by a constant. The arithmetic unit also includes a processing unit having a plurality of shift units for shifting outputs from said plurality of first units by respectively predetermined numbers of digit-shifts and a plurality of second units for calculating concurrently sums of outputs from said plurality of shift units. The arithmetic unit can be used, for example, as a high speed discrete cosine unit, a high speed Hartley transform unit or a high speed Hough transform unit.

Type: Grant

Filed: December 14, 1999

Date of Patent: April 24, 2001

Assignee: Hitachi, Ltd.

Inventor: Motonobu Tonomura
Apparatus and method for performing multi-dimensional computations based on intra-add operation

Patent number: 6212618

Abstract: A method and apparatus for including in a processor, instructions for performing multiply-intra-add operations on packed data is described. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first and a second packed data. The processor performs operations on data elements in the first packed data and the second packed data to generate a plurality of data elements in a third packed data in response to receiving an instruction. At least two of the plurality of data elements in the third packed data store the result of multiply-intra-add operations.

Type: Grant

Filed: March 31, 1998

Date of Patent: April 3, 2001

Assignee: Intel Corporation

Inventor: Patrice L. Roussel
Content addressable memory accessed by the sum of two operands

Patent number: 6161164

Abstract: Within a content addressable memory, the latency in a memory access is reduced by combining the steps of effective address generation addition and searching within the content-addressable memory. Two inputs to the content-addressable memory are conditioned and then supplied to matching cells, which determine which address stored in the content-addressable memory will be output. This is accomplished without a full adder being implemented to add the two input operands before being supplied to the content-addressable memory.

Type: Grant

Filed: September 16, 1996

Date of Patent: December 12, 2000

Assignee: International Business Machines Corp.

Inventors: Sang Hoo Dhong, Joel Abraham Silberman
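
The idea in the abstract above (patent 6161164), matching a stored address against the sum of two operands without first running a full adder, can be illustrated with a well-known carry-free equality test. The sketch below is an assumption-laden software model, not the patented circuit: the names `sum_equals` and `cam_lookup`, the 8-bit default width, and the stored addresses are all hypothetical.

```python
def sum_equals(a: int, b: int, k: int, width: int = 8, carry_in: int = 0) -> bool:
    """Return True when (a + b + carry_in) mod 2**width == k, without adding a and b."""
    mask = (1 << width) - 1
    a, b, k = a & mask, b & mask, k & mask
    # Carry each bit position would need to receive for its sum bit to equal k's bit.
    required_carry_in = a ^ b ^ k
    # Carry each bit position would send onward if its sum bit does equal k's bit.
    implied_carry_out = (a & b) | ((a | b) & ~k)
    # Bit i's required carry-in must agree with bit i-1's implied carry-out.
    expected = ((implied_carry_out << 1) | carry_in) & mask
    return required_carry_in == expected


def cam_lookup(base: int, offset: int, stored: list, width: int = 8) -> list:
    """Indices of stored addresses that equal base + offset (hypothetical helper)."""
    return [i for i, tag in enumerate(stored) if sum_equals(base, offset, tag, width)]


print(cam_lookup(0x20, 0x0A, [0x10, 0x2A, 0x7F]))
```

Every bit position is checked independently with a few gates' worth of logic, so no carry has to ripple across the word before the comparison can start.
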
Calculation of a scalar product in a direct-type FIR filter

Patent number: 6131105

Abstract: The invention relates to a direct-type FIR filter, a method for calculating a scalar product in a FIR filter, and a method for designing a direct-type FIR filter. Successive words of a digital input signal are delayed in a delay line having delays (50A-50D) of the duration of one word, and the scalar product between the variously delayed words derived from the delay line and the corresponding constant coefficients is calculated. In accordance with the invention, calculation of the scalar product comprises a) combining the bits of words at the input (X0) and outputs (X1-X4) of the delay line bit by bit in a network of bit-serial subtractor and/or adder elements (51-56) wherein at least one of the bit-serial elements is involved in the multiplication operation of at least two different coefficients, and b) multiplying (49A-K) the multiplication results from the network by powers of two, and summing together (45-48) the results to yield the scalar product.

Type: Grant

Filed: January 23, 1996

Date of Patent: October 10, 2000

Assignee: Tritech Microelectronics LTD

Inventors: Eero Pajarre, Ville Eerola, Tapio Saramaki, Tapani Ritoniemi, Timo Husu, Seppo Ingalsuo
Product-sum calculation circuit constructed of small-size ROM

Patent number: 6101522

Abstract: There is provided a product-sum calculation circuit which can be constructed of a ROM having a small capacity. In this product-sum calculation circuit, when multiplier selection signals A0 through A2 select X as a multiplier, a second selector circuit 103 selects a product Ck×X obtained by multiplying a multiplicand Ck by the multiplier X and outputs the same to an output control circuit 104. In this case, the output control circuit 104 outputs the product Ck×X without shifting the same. When the multiplier selection signals A0 through A2 select (2^n)X as the multiplier, the second selector circuit 103 selects the product Ck×X obtained by multiplying the multiplicand Ck by the multiplier X and outputs the same to the output control circuit 104 similar to the case where the multiplier X is selected. In this case, the output control circuit 104 outputs the (2^n)-fold value of (Ck×X) by shifting leftward the product Ck×X by n bits. Therefore, merely by storing (Ck×

Type: Grant

Filed: May 29, 1998

Date of Patent: August 8, 2000

Assignee: Sharp Kabushiki Kaisha

Inventor: Yuichi Sato
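
The space saving described in the abstract above (patent 6101522) comes from storing only the product of Ck and X and obtaining any power-of-two multiple with a left shift. A minimal Python sketch of that idea follows; the coefficient values, the base multiplier, and the names `ROM`, `rom_product` and `product_sum` are hypothetical and chosen only for illustration.

```python
BASE_X = 3                  # hypothetical base multiplier X
COEFFS = [2, 5, 11]         # hypothetical multiplicands Ck
ROM = [c * BASE_X for c in COEFFS]   # only the products Ck * X are stored


def rom_product(k: int, n: int) -> int:
    """Ck * (2**n * X): read the stored Ck * X and shift it left by n bits."""
    return ROM[k] << n


def product_sum(terms) -> int:
    """Sum of products for (k, n) pairs, each meaning Ck * (2**n * X)."""
    return sum(rom_product(k, n) for k, n in terms)


print(rom_product(1, 2))                       # 5 * (4 * 3)
print(product_sum([(0, 0), (1, 2), (2, 1)]))   # 6 + 60 + 66
```

With n = 0 the stored product passes through unshifted, which corresponds to the case in which X itself is selected as the multiplier.
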
Method and apparatus for simultaneously multiplying two or more independent pairs of operands and summing the products

Patent number: 6085213

Abstract: A multiplier capable of performing signed and unsigned scalar and vector multiplication is disclosed. The multiplier is configured to receive signed or unsigned multiplier and multiplicand operands in scalar or packed vector form. An effective sign for the multiplier and multiplicand operands may be calculated based upon each operand's most significant bit and a control signal. The effective signs may then be used to create and select a number of partial products according to Booth's algorithm. Once the partial products have been created and selected, they may be summed and the results may be output. The results may be signed or unsigned, and may represent vector or scalar quantities. When a vector multiplication is performed, the multiplier may be configured to generate and select partial products so as to effectively isolate the multiplication process for each pair of vector components. The multiplier may also be configured to sum the products of the vector components to form the vector dot product.

Type: Grant

Filed: March 27, 1998

Date of Patent: July 4, 2000

Assignee: Advanced Micro Devices, Inc.

Inventors: Stuart Oberman, Ming Siu
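
The abstract above (patent 6085213) relies on Booth's algorithm to create and select partial products. As a rough illustration, the sketch below uses radix-4 Booth recoding for signed operands and then sums per-component products into a dot product. It does not model the effective-sign logic for unsigned operands or the hardware isolation between vector components, and the radix, the function names, and the bit widths are assumptions.

```python
def booth_radix4_digits(y: int, width: int) -> list[int]:
    """Recode a width-bit two's-complement multiplier into digits in {-2, -1, 0, 1, 2}."""
    assert width % 2 == 0, "radix-4 recoding consumes two bits per digit"
    y &= (1 << width) - 1
    # bits[0] is the implicit y[-1] = 0; bits[i + 1] is bit i of y.
    bits = [0] + [(y >> i) & 1 for i in range(width)]
    # Digit j = y[2j-1] + y[2j] - 2*y[2j+1].
    return [bits[2 * j] + bits[2 * j + 1] - 2 * bits[2 * j + 2] for j in range(width // 2)]


def booth_multiply(x: int, y: int, width: int = 8) -> int:
    """Signed product of x and y (y must fit in width bits, two's complement)."""
    # Each digit selects a partial product from {-2x, -x, 0, x, 2x}, weighted by 4**j.
    return sum((d * x) << (2 * j) for j, d in enumerate(booth_radix4_digits(y, width)))


def dot_product(xs, ys, width: int = 8) -> int:
    """Sum of the per-component Booth products."""
    return sum(booth_multiply(x, y, width) for x, y in zip(xs, ys))


print(booth_radix4_digits(-3, 4))
print(dot_product([1, 2, 3], [4, -5, 6]))
```

Radix-4 recoding halves the number of partial products relative to one partial product per multiplier bit, which is the usual reason it appears in multiplier designs.
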
Apparatus useful in floating point arithmetic

Patent number: 6078939

Abstract: A computer and a method of using the computer to separate a floating-point number into high and low parts and for evaluating a dominant arithmetic object and a remainder object. The dominant object is associated with the first arithmetic object by using the high parts of the floating-point number. The evaluation of a remainder arithmetic object associates the first arithmetic object with the high and low parts of the floating-point numbers. A sum of the dominant and remainder arithmetic objects returns a value corresponding to the first arithmetic object.

Type: Grant

Filed: September 30, 1997

Date of Patent: June 20, 2000

Assignee: Intel Corporation

Inventors: Shane A. Story, Ping Tak Peter Tang
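
To make the high/low separation in the abstract above (patent 6078939) concrete, the sketch below uses Veltkamp splitting, a standard technique that is swapped in here for illustration and is not claimed to be the patent's method. It treats multiplication as the arithmetic object, which is also an assumption: the dominant part is built from the high parts only, and the remainder collects every term involving a low part.

```python
SPLITTER = 134217729.0  # 2**27 + 1, the usual constant for IEEE 754 binary64


def split(a: float) -> tuple[float, float]:
    """Veltkamp split: return (hi, lo) with hi + lo == a and hi holding the leading bits."""
    c = SPLITTER * a
    hi = c - (c - a)
    lo = a - hi
    return hi, lo


def dominant_and_remainder(a: float, b: float) -> tuple[float, float]:
    """Split a product into a dominant term (high parts) and a remainder (terms with a low part)."""
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    dominant = a_hi * b_hi
    remainder = a_hi * b_lo + a_lo * b_hi + a_lo * b_lo
    return dominant, remainder


dom, rem = dominant_and_remainder(0.1, 3.141592653589793)
print(dom, rem, dom + rem)
```

The sum of the two parts recovers a value corresponding to the original product, while the remainder exposes the small correction that a single rounded multiplication would hide.
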
Method and device for computing product sums

Patent number: 6058411

Abstract: The invention pertains to the field of computing technology and microelectronics and is useful in producing high-speed integrated circuits and sets of integrated circuits for digital processing of signals, for computing product sums, and for multiplication and addition processes. The inventive method essentially comprises sending a set of values of the electrical parameter of a signal into transmission channels, the number m of which corresponds to the unloading of terms. A computation system, to which corresponds the number n, determines the number of all possible combinations of the set of l/m values of the signal's electrical parameter, which values are sent into a transmission channel according to their weighting. The set of l/m values of the electrical parameter of the signal is sent through a divide-by-n circuit, while transmission is effected from the previous transmission channel to the input of the next transmission channel.

Type: Grant

Filed: October 14, 1997

Date of Patent: May 2, 2000

Assignees: Rashid Bakievich Khalidov, Alisher Vahidovich Shaihov, Lancaster, Technologies, LLC.

Inventor: Aleksandr Jurievich Boltunov
Apparatus for performing multiply-add operations on packed data

Patent number: 6035316

Abstract: A processor having a first and second storage area having a first and second packed data, respectively. Each packed data includes a first, second, third, and fourth data element. A multiply-add circuit is coupled to the first and second storage areas. The multiply-add circuit includes a first, second, third, and fourth multiplier, wherein each of the multipliers receives a corresponding set of said data elements. The multiply-add circuit further includes a first adder coupled to the first and second multipliers, and a second adder coupled to the third and fourth multipliers. A third storage area is coupled to the adders. The third storage area includes a first and second field for saving output of the first and second adders, respectively, as first and second data elements of a third packed data.

Type: Grant

Filed: February 23, 1996

Date of Patent: March 7, 2000

Assignee: Intel Corporation

Inventors: Alexander D. Peleg, Millind Mittal, Larry M. Mennemeier, Benny Eitan, Carole Dulong, Eiichi Kowashi, Wolf Witt, Derrick Chu Lin, Ahmet Bindal
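
The data path in the abstract above (patent 6035316), with four multipliers feeding two adders whose outputs fill two fields of a third packed value, can be modeled in a few lines of Python. This is a behavioral sketch only: element widths, overflow behavior, and the function name `packed_multiply_add` are assumptions, since the abstract does not specify them.

```python
def packed_multiply_add(src1, src2):
    """Multiply four element pairs, then add adjacent products into two results."""
    assert len(src1) == len(src2) == 4, "each packed operand holds four elements"
    # Four multipliers: one product per pair of corresponding elements.
    products = [a * b for a, b in zip(src1, src2)]
    # First adder takes products 0 and 1; second adder takes products 2 and 3.
    return [products[0] + products[1], products[2] + products[3]]


print(packed_multiply_add([1, 2, 3, 4], [5, 6, 7, 8]))
```

Adding the two result fields together yields the dot product of the four-element operands, which is why this operation suits filters and transforms.
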
Discrete cosine high-speed arithmetic unit and related arithmetic unit

Patent number: 6029185

Abstract: An arithmetic unit for carrying out partial sum of products for transform operations such as discrete cosine transform is provided which includes a plurality of first units for calculating in parallel sums of and/or differences between a plurality of input variables or sums of and/or differences between a plurality of values obtained by multiplying said plurality of input variables by a constant. The arithmetic unit also includes a processing unit having a plurality of shift units for shifting outputs from said plurality of first units by respectively predetermined numbers of digit-shifts and a plurality of second units for calculating concurrently sums of outputs from said plurality of shift units. The arithmetic unit can be used, for example, as a high speed discrete cosine unit, a high speed Hartley transform unit or a high speed Hough transform unit.

Type: Grant

Filed: November 15, 1996

Date of Patent: February 22, 2000

Assignee: Hitachi, Ltd.

Inventor: Motonobu Tonomura
Computer system for performing complex digital filters

Patent number: 5983253

Abstract: A method and apparatus for performing complex digital filters. According to one aspect of the invention, a computer system generally having a transmitting unit, a processor, and a storage device is described. The storage device is coupled to the processor and has stored therein a routine. When executed by the processor, the routine causes the processor to perform a digital filter on unfiltered data items using complex coefficients to generate an output data stream. Execution of the routine causes the processor to perform outer and inner loops. The outer loop steps through corresponding relationships between the complex coefficients and the unfiltered data items. Each of these corresponding relationships is used by the digital filter to generate the output data stream. The inner loop steps the complex coefficients. Within the inner loop, the unfiltered data item corresponding to the current complex coefficient is determined according to the current corresponding relationship.

Type: Grant

Filed: December 20, 1995

Date of Patent: November 9, 1999

Assignee: Intel Corporation

Inventors: Stephen A. Fischer, Larry M. Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi
Apparatus and method for summing 1-bit signals

Patent number: 5983258

Abstract: An arithmetic stage calculates the sum AX+BY where A and B are 1-bit signals and X and Y p bit coefficients X=7 and Y=3 and the corresponding bits b_1 to b_5 are represented together with the corresponding logical states of A and B. It will be seen that for example column b_3 together with columns A and B is the truth table of a NAND gate. Column b_2 together with columns A and B is the truth table of a COINCIDENCE gate. In the example of FIG. 5 column b_4 equals B; column b_1 is logical 0 whatever the states of A and B; and column b_5 is NOT A. Thus in accordance with one illustrative embodiment of the invention the arithmetic stage 40 may be implemented by the logic circuit of FIG. 6 where bit b_5 is produced by inverting A, bit b_4 is produced by coupling output b_1 to input B, via a direct connection 60, bit b_3 is produced by a NAND gate 61, bit b_2 is produced by a COINCIDENCE gate 62, and bit b_1 is produced by coupling output b_

Type: Grant

Filed: November 26, 1997

Date of Patent: November 9, 1999

Assignees: Sony Corporation, Sony United Kingdom Limited

Inventors: Peter Charles Eastty, Christopher Sleight, Peter Damien Thorpe
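
The gate assignments in the abstract above (patent 5983258) can be checked against plain arithmetic. The sketch below assumes, and the abstract does not state this, that a 1-bit signal at logical 1 stands for +1 and at logical 0 for -1, and that the five output bits form a two's-complement word with the fifth bit as the most significant. Under those assumptions the gates reproduce AX+BY for X=7 and Y=3; the function names are hypothetical.

```python
X, Y = 7, 3      # coefficients from the abstract's example
WIDTH = 5        # five output bits


def stage_arithmetic(a_bit: int, b_bit: int) -> int:
    """AX + BY as a 5-bit two's-complement word, reading 1 as +1 and 0 as -1 (assumption)."""
    a = 1 if a_bit else -1
    b = 1 if b_bit else -1
    return (a * X + b * Y) & ((1 << WIDTH) - 1)


def stage_gates(a_bit: int, b_bit: int) -> int:
    """The same word built from the gates named in the abstract."""
    bit5 = 1 - a_bit              # NOT A
    bit4 = b_bit                  # direct connection to B
    bit3 = 1 - (a_bit & b_bit)    # NAND
    bit2 = 1 - (a_bit ^ b_bit)    # COINCIDENCE (XNOR)
    bit1 = 0                      # constant logical 0
    return (bit5 << 4) | (bit4 << 3) | (bit3 << 2) | (bit2 << 1) | bit1


for a_bit in (0, 1):
    for b_bit in (0, 1):
        print(a_bit, b_bit, format(stage_gates(a_bit, b_bit), "05b"))
```

Because the coefficients are fixed, the whole multiply-and-add collapses to four gates' worth of logic with no adder at all.
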
System for signal processing using multiply-add operations

Patent number: 5983257

Abstract: A computer system which includes a multimedia input device which generates an audio or video input signal and a processor coupled to the multimedia input device. The system further includes a storage device coupled to the processor and having stored therein a signal processing routine for multiplying and accumulating input values representative of the audio or video input signal. The signal processing routine, when executed by the processor, causes the processor to perform several steps. These steps include performing a packed multiply add on a first set of values packed into a first source and a second set of values packed into a second source each representing input signals to generate a packed intermediate result. The packed intermediate result is added to an accumulator to generate a packed accumulated result in the accumulator. These steps may be iterated with the first set of values and portions of the second set of values to the accumulator to generate the packed accumulated result.

Type: Grant

Filed: December 26, 1995

Date of Patent: November 9, 1999

Assignee: Intel Corporation

Inventors: Carole Dulong, Larry M. Mennemeier, Tuan H. Bui, Eiichi Kowashi, Alexander D. Peleg, Benny Eitan, Stephen A. Fischer, Benny Maytal, Millind Mittal
Apparatus for performing multiply-add operations on packed data

Patent number: 5983256

Abstract: A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least two of the data elements in this third packed data store the result of performing multiply-add operations on data elements in the first and second packed data.

Type: Grant

Filed: October 29, 1997

Date of Patent: November 9, 1999

Assignee: Intel Corporation

Inventors: Alexander Peleg, Millind Mittal, Larry M. Mennemeier, Benny Eitan, Carole Dulong, Eiichi Kowashi, Wolf Witt
Sum-of-products arithmetic unit

Patent number: 5944775

Abstract: A sum-of-products arithmetic unit includes a coefficient register, a data register, a multiplier, an adder, and a data bus used for the transfer of data to and from an external unit. Provision is made to allow addresses in the data register in which sum-of-products arithmetic data is to be stored to be specified without externally specifying an individual address in the data register for each piece of arithmetic data. This provision comprises an automatic data batch storage section, automatic address setting section, or address setting section.

Type: Grant

Filed: July 31, 1997

Date of Patent: August 31, 1999

Assignee: Fujitsu Limited

Inventor: Matsui Satoshi
Efficient correlation over a sliding window

Patent number: 5931893

Abstract: The details of an improved correlator and efficient method of correlation are disclosed. The last M received signal samples are compared with all shifts of a given M-bit binary codeword. The correlator adds or subtracts each of the signal samples accordingly, as the corresponding shift of the codeword contains a binary "1" or "0" in that position. The total is output for each new signal sample received, with a shift of one position between the signal samples and the codeword.

Type: Grant

Filed: November 11, 1997

Date of Patent: August 3, 1999

Assignee: Ericsson, Inc.

Inventors: Paul W. Dent, Eric Wang
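
As a reference point for the abstract above (patent 5931893), the sketch below computes the correlation of the last M samples against every shift of an M-bit codeword by brute force, adding a sample where the codeword bit is 1 and subtracting it where the bit is 0. It is a plain O(M^2) baseline, not the efficient structure the patent describes; the cyclic interpretation of "shift", the sign convention, and the function name are assumptions.

```python
def correlate_all_shifts(samples, codeword):
    """Correlation of M samples with each cyclic shift of an M-bit codeword."""
    m = len(codeword)
    assert len(samples) == m, "expects exactly the last M samples"
    results = []
    for shift in range(m):
        # Cyclic left shift of the codeword (assumed interpretation of "shift").
        shifted = codeword[shift:] + codeword[:shift]
        # Add the sample for a 1 bit, subtract it for a 0 bit.
        results.append(sum(s if bit else -s for s, bit in zip(samples, shifted)))
    return results


print(correlate_all_shifts([1, -1, 1, 1], [1, 0, 1, 1]))
```

In the example the samples match the unshifted codeword exactly, so the first correlation peaks at M and the others are lower.
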
Product-sum device suitable for IIR and FIR operations

Patent number: 5904731

Abstract: A product-sum device has a data register and a coefficient register and calculates the sum of products of the outputs of the data register and coefficient register. The data register has register elements that successively shift presently held data whenever new data is supplied to the data register. The coefficient register has register elements corresponding to the register elements. The product-sum device carries out a filtering operation at high speed.

Type: Grant

Filed: April 28, 1995

Date of Patent: May 18, 1999

Assignee: Fujitsu Limited

Inventor: Satoshi Matsui
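
The arrangement in the abstract above (patent 5904731), a data register that shifts whenever a new value arrives, paired element by element with a coefficient register, behaves like a direct-form FIR filter. The sketch below is a software model under that reading; the class name `ProductSum`, the zero initial state, and the sample values are hypothetical, and the IIR feedback path is not modeled.

```python
from collections import deque


class ProductSum:
    """Shift-register data store paired with fixed coefficients."""

    def __init__(self, coefficients):
        self.coefficients = list(coefficients)
        n = len(self.coefficients)
        # maxlen makes appendleft drop the oldest value, like a shift register.
        self.data = deque([0] * n, maxlen=n)

    def push(self, sample):
        """Shift in a new sample and return the current sum of products."""
        self.data.appendleft(sample)
        return sum(c * d for c, d in zip(self.coefficients, self.data))


unit = ProductSum([1, 2, 3])
print([unit.push(x) for x in [1, 2, 3, 4]])
```

Each call to `push` performs one shift and one full sum of products, so the output sequence is the input convolved with the coefficients.
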
