Sum Of Products Generation Patents (Class 708/603)
-
Patent number: 6675286
Abstract: Partitioned sigma instructions are provided in which processor capacity is effectively distributed among multiple sigma operations which are executed concurrently. Special registers are included for aligning data on memory word boundaries to reduce packing overhead in providing long data words for multimedia instructions which implement shifting data sequences over multiple iterations. Extended partitioned arithmetic instructions are provided to improve precision and avoid accumulated carry over errors. Partitioned formatting instructions, including partitioned interleave, partitioned compress, and partitioned interleave and compress, pack subwords in an effective order for other partitioned operations.
Type: Grant
Filed: April 27, 2000
Date of Patent: January 6, 2004
Assignee: University of Washington
Inventors: Weiyun Sun, Stefan G. Berg, Donglok Kim, Yongmin Kim
-
Publication number: 20030158879
Abstract: An apparatus and method for compressing a reduction array into an accumulated carry-save sum. The reduction array includes a partial product matrix, a carry-save sum, and a constant value row. A compressor array generates a previous accumulated carry-save sum. A three-input/two-output carry-save adder pre-reduces the constant value row and the previously accumulated carry-save sum into a two-row intermediate carry-save sum that is added to the partial product matrix to form a current accumulated carry-save sum.
Type: Application
Filed: December 11, 2000
Publication date: August 21, 2003
Applicant: International Business Machines Corporation
Inventors: Ohsang Kwon, Kevin J. Nowka
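The carry-save idea in this abstract can be illustrated with a minimal sketch. It is not the circuit claimed in the application: the function names `carry_save_add` and `accumulate` are hypothetical, the operands are assumed to be non-negative integers, and the partial-product rows are folded in one at a time rather than through a compressor array.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """3-input/2-output carry-save adder: returns (sum_row, carry_row)."""
    sum_row = a ^ b ^ c                              # per-bit sum, carries ignored
    carry_row = ((a & b) | (a & c) | (b & c)) << 1   # per-bit majority, moved up one weight
    return sum_row, carry_row


def accumulate(partial_products, constant_row, prev_sum, prev_carry):
    """Fold a constant row and a previous carry-save pair into new partial products."""
    # Pre-reduce three rows (constant, previous sum, previous carry) to two rows.
    s, c = carry_save_add(constant_row, prev_sum, prev_carry)
    # Assumption: rows are added sequentially here; hardware would use a compressor tree.
    for row in partial_products:
        s, c = carry_save_add(s, c, row)
    return s, c  # still in carry-save form; s + c is the accumulated value
```

The result stays in redundant (sum, carry) form, so no carry propagates across the word until a final ordinary addition of the two rows.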
-
Patent number: 6609143
Abstract: It is an object of the present invention to provide an arithmetic logic unit that can perform a sum-of-products operation in a reduced number of processing cycles without carrying out data transfer and additions even in obtaining a single result from a plurality of divided input data words. Data words X and Y are input. A product of the high-order bits of X and Y is calculated using first decoder 511, first selector 521, first partial product generator 531 and first full adder 541. A product of the low-order bits of X and Y is also calculated using second decoder 512, second selector 522, second partial product generator 532 and second full adder 542. These products are adaptively shifted at a shifter 55 and then added up with a fed back data word Z at a third full adder 56 and a carry-propagation adder 58. In this manner, the data word Z, representing the result of the sum-of-products operation, is obtained.
Type: Grant
Filed: July 20, 2000
Date of Patent: August 19, 2003
Assignee: Matsushita Electric Industrial Co., Ltd
Inventors: Tomochika Kanakogi, Masaitsu Nakajima
-
Publication number: 20030145030
Abstract: Input data is received by an execution unit. One or more current multiply-accumulate operations are performed by the execution unit on the received input data and on input data received by the execution unit for one or more prior multiply-accumulate operations and saved by the execution unit.
Type: Application
Filed: January 31, 2002
Publication date: July 31, 2003
Inventor: Gad S. Sheaffer
-
Patent number: 6598062
Abstract: A processing engine (10) that generates sum of products (SOP) values for incoming data. The processing engine (10) includes a calculation module (30) for generating intermediate and SOP values based on the incoming data and coefficient values, wherein the intermediate values are defined by product values and partial presum values. A feedback module (50) stores the intermediate values until the calculation module (30) generates SOP values. The processing engine (10) further includes a reordering module (70) for reordering the SOP values. The feedback module (50) includes a switching mechanism (52) for retrieving intermediate values from the calculation module (30) until the calculation module (30) generates SOP values. Thus, a feedback RAM (53) can store the intermediate values without the need for buffering RAM at the input stage.
Type: Grant
Filed: May 31, 2000
Date of Patent: July 22, 2003
Assignee: Northrop Grumman Corporation
Inventor: Derek Layne
-
Patent number: 6584483
Abstract: The present invention is directed to an apparatus and method for efficiently calculating an intermediate value between a first end value and a second end value such that the area and time required to implement this operation is minimized. The present invention is also used to efficiently multiply a value by a fraction. A fraction is involved in calculating an intermediate value and also for multiplying by a fraction. When the denominator of the fraction is odd, the binary representation of the blending function, which is used to calculate an intermediate value, exhibits special characteristics. The special characteristics allow the present invention to, among others, avoid the use of multipliers, which require a large number of gates to implement. This invention exploits this and other special characteristics in order to efficiently implement in hardware the blending function and to efficiently multiply a value by a fraction.
Type: Grant
Filed: December 30, 1999
Date of Patent: June 24, 2003
Assignee: Intel Corporation
Inventors: Tom Altus, Jacob D. Doweck
-
Patent number: 6584482
Abstract: A multiplier array processing system which improves the utilization of the multiplier and adder array for lower-precision arithmetic is described. New instructions are defined which provide for the deployment of additional multiply and add operations as a result of a single instruction, and for the deployment of greater multiply and add operands as the symbol size is decreased.
Type: Grant
Filed: August 19, 1999
Date of Patent: June 24, 2003
Assignee: Microunity Systems Engineering, Inc.
Inventors: Craig C. Hansen, Henry Massalin
-
Patent number: 6567831
Abstract: A method optimizes function evaluations performed by a VLIW processor through enhanced parallelism by evaluating the function by table approximation using decomposition into a Taylor series.
Type: Grant
Filed: April 20, 2000
Date of Patent: May 20, 2003
Assignee: Elbrus International Limited
Inventor: Vadim E. Loginov
-
Patent number: 6557022
Abstract: Two multiply-accumulate units are coupled together so that the computation (B*C)+(D*E) can be completed in one cycle. An adder (216) adds together the products of the two multipliers (206), (208). The sum is applied to the first accumulator (220). Preferably, the second product is also applied to the second accumulator (222), and a multiplexer (218) applies either a zero or the second product to the adder (216). If two unrelated computations are to be executed simultaneously, then the zero is applied, and the output of the second accumulator is fed back to the register file (PI2). If a single (B*C)+(D*E) computation is to be executed, then the second product is applied to the adder, and the output of the second accumulator is disregarded.
Type: Grant
Filed: February 26, 2000
Date of Patent: April 29, 2003
Assignee: Qualcomm, Incorporated
Inventors: Gilbert C. Sih, Xufeng Chen, De D. Hsu
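A behavioral sketch of the coupled arrangement described in this abstract may help. The function name `dual_mac_step` and the `coupled` flag are hypothetical, and the model assumes that each accumulator adds its input to its running value, which the abstract implies but does not spell out.

```python
def dual_mac_step(acc1, acc2, b, c, d, e, coupled):
    """One cycle of two multiply-accumulate units sharing an adder."""
    p1 = b * c                      # first multiplier
    p2 = d * e                      # second multiplier
    mux_out = p2 if coupled else 0  # multiplexer: second product or zero
    acc1 = acc1 + p1 + mux_out      # adder output goes to the first accumulator
    acc2 = acc2 + p2                # second accumulator always sees the second product
    return acc1, acc2


# Coupled mode: acc1 receives (B*C)+(D*E); acc2 is disregarded by the caller.
coupled_acc1, _ = dual_mac_step(0, 0, 2, 3, 4, 5, coupled=True)

# Independent mode: the two units run unrelated computations side by side.
ind_acc1, ind_acc2 = dual_mac_step(0, 0, 2, 3, 4, 5, coupled=False)
```

In coupled mode one cycle yields the two-term sum of products; in independent mode the zero on the multiplexer keeps the second product out of the first accumulator.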
-
Publication number: 20030069913
Abstract: A tightly coupled dual 16-bit multiply-accumulate (MAC) unit for performing single-instruction/multiple-data (SIMD) operations may forward an intermediate result to another operation in a pipeline to resolve an accumulating dependency penalty. The MAC unit may also be used to perform 32-bit×32-bit operations.
Type: Application
Filed: October 5, 2001
Publication date: April 10, 2003
Inventors: Deli Deng, Anthony Jebson, Yuyun Liao, Nigel C. Paver, Steve J. Strazdus
-
Publication number: 20030061252
Abstract: A machine or method used in signal processing transforms involving computing one or more sums each of one or more products. A multiplier has one or both of its two inputs restricted to limited sets of numbers having given finite-precision numeric formats. The multiplier is not a constant multiplier capable only of computing the product of any first number and a constant. The multiplier is not a general multiplier capable of computing the product of any pair of numbers. The multiplier has lower complexity than a general multiplier, but more flexibility than a constant multiplier. The invention can be used to reduce the overall computational complexity of signal processing transforms. The invention can be used when transform weights are fixed and known. The invention can be used when transform inputs, though random, come from small, known sets, as is the case in digital communications.
Type: Application
Filed: September 27, 2001
Publication date: March 27, 2003
Inventor: Charles D. Murphy
-
Patent number: 6523055
Abstract: A multiplication accumulation circuit (abbreviated as “MAC”) has five input buses that carry signals for operands A, B, C, D and E, a control bus that carries signals for controlling the operations performed on the received operands, and an output bus that carries a signal generated by the MAC. Each of operands A, B, C and D can be four different operands that are used as follows by the MAC: (1) to perform two multiplications simultaneously, and (2) to perform an addition of the products of the two multiplications and the fifth operand E, e.g. generate on the output bus a signal of value A*C+B*D+E. Alternatively, operands A and B can be, respectively, the upper and lower halves of a first double word to be used as a multiplicand. Similarly, operands C and D can be the upper and lower halves of a second double word to be used as a multiplier.
Type: Grant
Filed: January 20, 1999
Date of Patent: February 18, 2003
Assignee: LSI Logic Corporation
Inventors: Robert K. Yu, Satish Padmanabhan, Chakra R. Srivatsa, Shailesh I. Shah
-
Patent number: 6519621
Abstract: An improved arithmetic circuit for accumulative operation for use in digital signal processors, microprocessors and so forth is described, in which the pipelined control becomes effective during accumulative operation by eliminating idling stages in the pipeline structure. In accordance with the improved arithmetic circuit, during accumulative operation, the next operation is initiated with intermediate results of the current operation while the current operation is being executed and not yet completed so that it is possible to improve the speed of accumulative operation and reduce the scale of integration.
Type: Grant
Filed: May 10, 1999
Date of Patent: February 11, 2003
Assignee: Kabushiki Kaisha Toshiba
Inventor: Naoka Yano
-
Patent number: 6480872
Abstract: A method and a device including, in one embodiment, a multiply array and at least one adder to perform a floating-point multiplication followed by an addition when operands are in floating-point format. The device is also configured to perform an integer multiplication followed by an accumulation when operands are in integer format. The device is further configured to perform a floating-point multiply-add or an integer multiply-accumulation in response to control signals. In another embodiment, the device contains an adder and the adder is capable of performing a floating-point addition and an integer accumulation. The adder is configured to be extra wide to reduce operand misalignment. Moreover, the device stalls the process in response to operand misalignment.
Type: Grant
Filed: January 21, 1999
Date of Patent: November 12, 2002
Assignee: SandCraft, Inc.
Inventor: Jack H. Choquette
-
Publication number: 20020138535
Abstract: In an SIMD sum of product arithmetic method of enabling a concurrent execution of 2n (where n is a natural number) parallel sum of product arithmetic (operations), the SIMD sum of product arithmetic is executed using 2m (m=0, . . . , log2n) accumulators as one set, and by replacing a 2p-1th accumulator with an adjacent 2pth (p=1, . . . , n) accumulator, without changing a sequence of accumulator addresses, in the set, as accumulator addresses to be allocated to sum of product arithmetic circuits for the SIMD sum of product arithmetic.
Type: Application
Filed: September 5, 2001
Publication date: September 26, 2002
Applicant: Fujitsu Limited
Inventor: Masayuki Tsuji
-
Patent number: 6449630
Abstract: An apparatus for processing digital signals includes a multiplier having a first input and a second input and an output producing a product. An adder is connected to receive the product from the multiplier as a first input to produce a sum. A first register is connected to receive and store the sum and to provide a second input to the adder in response to a clock signal. A second register is connected to receive and store the output of the first register in response to an inverse of the clock signal to enable the addition of two products in a single clock cycle.
Type: Grant
Filed: April 7, 1999
Date of Patent: September 10, 2002
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventor: Jay Bao
-
Publication number: 20020116433
Abstract: A multiply-accumulate module (100) includes a multiply-accumulate core (120), which includes a plurality of Booth encoder cells (104a). The multiply-accumulate core (120) also includes a plurality of Booth decoder cells (110a) connected to at least one of the Booth encoder cells (104a) and a plurality of Wallace tree cells (112a) connected to at least one of the Booth decoder cells (110a). Moreover, at least one first Wallace tree cell (112a1) or at least one first Booth decoder cell (110a1), or any combination thereof, includes a first plurality of transistors, and at least one second Wallace tree cell (112a2) or at least one second Booth decoder cell (110a2), or any combination thereof, includes a second plurality of transistors. In addition, at least one critical path of the multiply-accumulate module (100) includes the at least one first cell and a width of at least one of the first plurality of transistors is greater than a width of at least one of the second plurality of transistors.
Type: Application
Filed: September 27, 2001
Publication date: August 22, 2002
Inventors: Kaoru Awaka, Hiroshi Takahashi, Shigetoshi Muramatsu, Akihiro Takegama
-
Publication number: 20020116434
Abstract: The present invention provides an apparatus and method for processing data using a multiplying circuit for performing a multiplication of a W/2 bit data value by a W bit data value. An instruction decoder is provided which is responsive to a multiply instruction to control the multiplying circuit to generate a multiplication result for the computation M×N, where M and N are W bit data words. The multiplying circuit is arranged to execute a first operation in which the data word N is multiplied by the most significant W/2 bits of the data word M to generate a first intermediate result having 3W/2 bits, and to then execute a second operation in which the data word N is multiplied by the least significant W/2 bits of the data word M to generate a second intermediate result having 3W/2 bits. The first intermediate result is shifted by W/2 with respect to the second intermediate result and added to the second intermediate result to generate the multiplication result.
Type: Application
Filed: December 27, 2000
Publication date: August 22, 2002
Inventor: Alexander Edward Nancekievill
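The two-pass scheme in this abstract follows from writing M as M_hi·2^(W/2) + M_lo, so that M×N = (N×M_hi)·2^(W/2) + N×M_lo. The following is a minimal arithmetic sketch of that identity only, assuming unsigned operands; the function name `multiply_by_halves` and the default W=32 are illustrative choices, not taken from the application.

```python
def multiply_by_halves(m: int, n: int, w: int = 32) -> int:
    """Compute m*n for unsigned w-bit words using two (w/2)-by-w multiplications."""
    half = w // 2
    mask = (1 << half) - 1
    m_hi = (m >> half) & mask   # most significant W/2 bits of M
    m_lo = m & mask             # least significant W/2 bits of M
    first = n * m_hi            # first intermediate result, fits in 3W/2 bits
    second = n * m_lo           # second intermediate result, fits in 3W/2 bits
    # Shift the first result by W/2 relative to the second, then add.
    return (first << half) + second
```

Each pass needs only a W/2-by-W multiplier, which is the hardware saving the abstract points to; the cost is a second pass through the same circuit.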
-
Patent number: 6438569
Abstract: A method and apparatus for a sums of products datapath. According to one embodiment of the invention, an apparatus has a number of inputs and a number of generation units. Each of the generation units is coupled to the inputs. Each of the generation units includes a separate selection circuit coupled to each one of the inputs to selectively pass the signal provided on that input. In addition, each of the generation units includes a number of reduction circuits having inputs coupled to mutually exclusive pluralities of the selection circuits and each having an output. The apparatus also includes a first and second summation circuit coupled to the output of the plurality of reduction circuits in mutually exclusive pluralities of generation units. Additionally, the apparatus includes a subtraction circuit coupled to an output of the first and second summation circuit.
Type: Grant
Filed: September 20, 1999
Date of Patent: August 20, 2002
Assignee: PMC-Sierra, Inc.
Inventor: Curtis Abbott
-
Patent number: 6397240
Abstract: A programmable multi-mode accelerator is disclosed for use with a programmable processor or microprocessor. The programmable multi-mode accelerator allows a programmable processor to execute specific algorithms, such as certain types of finite impulse response (FIR), correlation and Viterbi computations, that require low-precision operations at an extremely high rate. The accelerator extends the digital signal processor's performance into the required range for low-precision computations. The accelerator can be coupled with the main data path of a programmable processor or microprocessor and can directly read and write to the main register files of the programmable processor. In an illustrative implementation, the accelerator data path accesses its input values (source operands) directly from a main register file of the programmable processor and writes results back into a second main register file.
Type: Grant
Filed: February 18, 1999
Date of Patent: May 28, 2002
Assignee: Agere Systems Guardian Corp.
Inventors: John Susantha Fernando, Stefan Thurnhofer
-
Publication number: 20020059355
Abstract: A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least two of the data elements in this third packed data store the result of performing multiply-add operations on data elements in the first and second packed data.
Type: Application
Filed: November 19, 2001
Publication date: May 16, 2002
Applicant: Intel Corporation
Inventors: Alexander D. Peleg, Millind Mittal, Larry M. Mennemeier, Benny Eitan, Carole Dulong, Eiichi Kowashi, Wolf Witt
-
Patent number: 6385634
Abstract: A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least two of the data elements in this third packed data store the result of performing multiply-add operations on data elements in the first and second packed data.
Type: Grant
Filed: August 31, 1995
Date of Patent: May 7, 2002
Assignee: Intel Corporation
Inventors: Alexander D. Peleg, Millind Mittal, Larry M. Mennemeier, Benny Eitan, Carole Dulong, Eiichi Kowashi, Wolf Witt
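A packed multiply-add of the kind this abstract describes can be sketched as follows. The abstract does not say which elements are paired, so the sketch assumes that adjacent products are summed, giving a result with half as many elements; the function name `packed_multiply_add` is hypothetical, and plain Python lists stand in for packed registers with no fixed element width or overflow behavior.

```python
def packed_multiply_add(first, second):
    """Multiply corresponding elements, then add each adjacent pair of products."""
    if len(first) != len(second) or len(first) % 2 != 0:
        raise ValueError("operands must have the same even number of elements")
    products = [a * b for a, b in zip(first, second)]
    # Assumed pairing: elements (0, 1) feed result 0, elements (2, 3) feed result 1, ...
    return [products[i] + products[i + 1] for i in range(0, len(products), 2)]


third = packed_multiply_add([1, 2, 3, 4], [5, 6, 7, 8])
# third[0] == 1*5 + 2*6 and third[1] == 3*7 + 4*8
```

One such instruction covers two terms of a dot product per result element, which is why it appears in a sum-of-products class.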
-
Patent number: 6385635
Abstract: Multipliers 107 through 110 carry out a multiplication operation with two data out of the four data transferred from a memory over buses 101 through 104. The multiplication results are subjected to an addition or subtraction operation with each other in adder-subtracters 111 and 112. The operation results obtained by the adder-subtracters 111 and 112 are supplied to adders 113 and 114 where they are added to values held by accumulators 115 and 116. A latch circuit 105 supplies the data transferred through the bus 102 to the multiplier 109 when a control signal 106 indicates “ON”. The latch circuit 105 temporarily holds the data transferred through the bus 102 and supplies the data held therein to the multiplier 109 when a control signal 106 indicates “OFF”.
Type: Grant
Filed: April 23, 1999
Date of Patent: May 7, 2002
Assignee: NEC Corporation
Inventor: Daiji Ishii
-
Patent number: 6377970
Abstract: A method and apparatus that adds each one of multiple elements of a packed data together to produce a result. According to one such method and apparatus, each of a first set of portions of partial products is produced using a first set of partial product selectors in a multiplier, each of the first set of portions of the partial products being zero. Each of the multiple elements is inserted into one of a second set of portions of the partial products using a second set of partial product selectors, each of the second set of portions of the partial products being aligned. The multiple elements are added together to produce the result including a field having the sum of the multiple elements.
Type: Grant
Filed: March 31, 1998
Date of Patent: April 23, 2002
Assignee: Intel Corporation
Inventors: Mohammad A. Abdallah, Vladimir Pentkovski
-
Patent number: 6370558
Abstract: A data processing apparatus including a multiplier unit forming a product from L bits of each of two data buses of N bits each, where N is greater than L. The multiplier forms an N bit output having a first portion which is the L most significant bits of the product and a second portion which is M other bits not including the L least significant bits of the product, where N is the sum of M and L. In the preferred embodiment the M other bits are derived from other bits of the two input data buses, such as the M other bits of the first input data bus. An arithmetic logic unit performs parallel operations (addition, subtraction, Boolean functions) controlled by the same instructions. This arithmetic logic unit is divisible into a selected number of sections for performing identical operations on independent sections of its inputs. The multiplier unit may form dual products from separate parts of the input data.
Type: Grant
Filed: October 3, 2000
Date of Patent: April 9, 2002
Assignee: Texas Instruments Incorporated
Inventors: Karl M. Guttag, Christopher J. Read, Keith Balmer
-
Patent number: 6370556
Abstract: The invention relates to a method and an arrangement in a transposed digital FIR filter for multiplying a binary input signal by tap coefficients, and to a method for designing such a filter. The invention comprises a shift register (51, 52) shifting in the direction of the least significant bit and copying the most significant bit or filling in zero values. The register receives the binary input signal of the filter and has outputs for outputting the content of the desired bit positions. A plurality of bit-serial subtractor and adder elements (53-57) multiply the binary input signal by N+1 different tap coefficients by combining output bits of the shift register (51, 52). The subtractor and/or adder elements form a network wherein at least one element participates in the multiplying operation of at least two different tap coefficients.
Type: Grant
Filed: September 27, 1995
Date of Patent: April 9, 2002
Assignee: Tritech Microelectronics, Ltd
Inventors: Tapio Saramäki, Tapani Ritoniemi, Ville Eerola, Timo Husu, Eero Pajarre, Seppo Ingalsuo
-
Publication number: 20020040379
Abstract: A multiplier for computing a final product of a first operand and a second operand comprising a multiplier array for forming a product of the first operand and second operand in carry-save form; a carry-save adder for adding said carry-save partial products and an accumulated sum to produce carry and save values; a carry-lookahead adder for adding said carry and save values to produce a product value and a carry-out value; a general purpose adder for adding said carry-out and said product value to produce said final product.
Type: Application
Filed: January 2, 2001
Publication date: April 4, 2002
Inventor: Maher Amer
-
Patent number: 6330631
Abstract: A bus bridge for a computer system for bridging first and second buses includes a shift and accumulate unit. The shift and accumulate unit includes a shifter having an input connected to receive bytes from one of the first and second buses and an output providing a selectable shift to the received bytes. The shift and accumulate unit also includes an accumulator having an input connected to receive the output of the shifter and providing accumulation of selectable bits of the shifted bytes, the accumulator having an output for supplying realigned bytes to be passed to the other of the first and second buses. The combination of the shifter and the accumulator permits a desired amount of shift to be combined with the accumulation of selected bits or bytes to realign sets of bytes from one bus and to form sets of bytes for the other bus. Burst transfer is also possible by operating the shift and accumulate unit to operate in successive cycles for successive sets of input bytes from one of the buses.
Type: Grant
Filed: February 3, 1999
Date of Patent: December 11, 2001
Assignee: Sun Microsystems, Inc.
Inventor: Andrew Crosland
-
Patent number: 6247036
Abstract: A reconfigurable processor includes at least three (3) MacroSequencers (10)-(16) which are configured in an array. Each of the MacroSequencers is operable to receive on a separate one of four buses (18) an input from the other three MacroSequencers and from itself in a feedback manner. In addition, a control bus (20) is operable to provide control signals to all of the MacroSequencers for the purpose of controlling the instruction sequence associated therewith and also for inputting instructions thereto. Each of the MacroSequencers includes a plurality of executable units having inputs and outputs and each for providing an associated execution algorithm. The outputs of the execution units are input to an output selector which selects the outputs for outputs on at least one external output and on at least one feedback path. An input selector (66) is provided having an input for receiving at least one external output and at least the feedback path.
Type: Grant
Filed: January 21, 1997
Date of Patent: June 12, 2001
Assignee: Infinite Technology Corp.
Inventors: George Landers, Earle Jennings, Tim B. Smith, Glen Haas
-
Patent number: 6233596
Abstract: An objective of this invention is a design that improves the memory usage ratio and execution speed of a sum-of-products operation instruction, improves the critical path of sum-of-products operations, and prevents overflows. A sum-of-products operation circuit executes sum-of-products operations a number of times that is specified by number-of-executions information comprised within a sum-of-products operation instruction, under the control of a control circuit. The number of times the sum-of-products operation is to be executed is set into a register, that number is decremented every time one cycle of the sum-of-products operation ends, and the sum-of-products operation instruction ends when the value in the register reaches zero. If an interrupt is received during the execution of a plurality of sum-of-products operations, execution of the sum-of-products operations resumes after the interrupt processing. First and second sum-of-products input data are read at the same time by a single memory access.
Type: Grant
Filed: June 5, 1998
Date of Patent: May 15, 2001
Assignee: Seiko Epson Corporation
Inventors: Satoshi Kubota, Makoto Kudo, Yoshiyuki Miyayama
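The repeat-count behavior in this abstract can be modeled as a loop driven by a down-counting register. This is a software sketch only: the function name `repeat_mac` and its arguments are hypothetical, and interrupt handling and the single-access read of both operands are not modeled.

```python
def repeat_mac(xs, ys, count, acc=0):
    """Run `count` multiply-accumulate cycles, ending when the counter hits zero."""
    remaining = count        # register loaded from the instruction's execution count
    index = 0
    while remaining != 0:
        acc += xs[index] * ys[index]   # one sum-of-products cycle
        index += 1
        remaining -= 1                 # decrement after each completed cycle
    return acc
```

Because the count lives in a register rather than in the instruction stream, a loop of this shape could be paused at the end of any cycle and resumed later with the remaining count intact, which matches the resumption after interrupts that the abstract mentions.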
-
Patent number: 6223195
Abstract: An arithmetic unit for carrying out partial sum of products for transform operations such as discrete cosine transform is provided which includes a plurality of first units for calculating in parallel sums of and/or differences between a plurality of input variables or sums of and/or differences between a plurality of values obtained by multiplying said plurality of input variables by a constant. The arithmetic unit also includes a processing unit having a plurality of shift units for shifting outputs from said plurality of first units by respectively predetermined numbers of digit-shifts and a plurality of second units for calculating concurrently sums of outputs from said plurality of shift units. The arithmetic unit can be used, for example, as a high speed discrete cosine unit, a high speed Hartley transform unit or a high speed Hough transform unit.
Type: Grant
Filed: December 14, 1999
Date of Patent: April 24, 2001
Assignee: Hitachi, Ltd.
Inventor: Motonobu Tonomura
-
Patent number: 6212618
Abstract: A method and apparatus for including in a processor, instructions for performing multiply-intra-add operations on packed data is described. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first and a second packed data. The processor performs operations on data elements in the first packed data and the second packed data to generate a plurality of data elements in a third packed data in response to receiving an instruction. At least two of the plurality of data elements in the third packed data store the result of multiply-intra-add operations.
Type: Grant
Filed: March 31, 1998
Date of Patent: April 3, 2001
Assignee: Intel Corporation
Inventor: Patrice L. Roussel
-
Patent number: 6161164
Abstract: Within a content addressable memory, the latency in a memory access is reduced by combining the steps of effective address generation addition and searching within the content-addressable memory. Two inputs to the content-addressable memory are conditioned and then supplied to matching cells, which determine which address stored in the content-addressable memory will be output. This is accomplished without a full adder being implemented to add the two input operands before being supplied to the content-addressable memory.
Type: Grant
Filed: September 16, 1996
Date of Patent: December 12, 2000
Assignee: International Business Machines Corp.
Inventors: Sang Hoo Dhong, Joel Abraham Silberman
-
Patent number: 6131105
Abstract: The invention relates to a direct-type FIR filter, a method for calculating a scalar product in a FIR filter, and a method for designing a direct-type FIR filter. Successive words of a digital input signal are delayed in a delay line having delays (50A-50D) of the duration of one word, and the scalar product between the variously delayed words derived from the delay line and the corresponding constant coefficients is calculated. In accordance with the invention, calculation of the scalar product comprises a) combining the bits of words at the input (X0) and outputs (X1-X4) of the delay line bit by bit in a network of bit-serial subtractor and/or adder elements (51-56) wherein at least one of the bit-serial elements is involved in the multiplication operation of at least two different coefficients, and b) multiplying (49A-K) the multiplication results from the network by powers of two, and summing together (45-48) the results to yield the scalar product.
Type: Grant
Filed: January 23, 1996
Date of Patent: October 10, 2000
Assignee: Tritech Microelectronics LTD
Inventors: Eero Pajarre, Ville Eerola, Tapio Saramaki, Tapani Ritoniemi, Timo Husu, Seppo Ingalsuo
-
Patent number: 6101522
Abstract: There is provided a product-sum calculation circuit which can be constructed of a ROM having a small capacity. In this product-sum calculation circuit, when multiplier selection signals A0 through A2 select X as a multiplier, a second selector circuit 103 selects a product Ck×X obtained by multiplying a multiplicand Ck by the multiplier X and outputs the same to an output control circuit 104. In this case, the output control circuit 104 outputs the product Ck×X without shifting the same. When the multiplier selection signals A0 through A2 select (2^n)X as the multiplier, the second selector circuit 103 selects the product Ck×X obtained by multiplying the multiplicand Ck by the multiplier X and outputs the same to the output control circuit 104 similar to the case where the multiplier X is selected. In this case, the output control circuit 104 outputs the (2^n)-fold value of (Ck×X) by shifting the product Ck×X leftward by n bits. Therefore, merely by storing (Ck×
Type: Grant
Filed: May 29, 1998
Date of Patent: August 8, 2000
Assignee: Sharp Kabushiki Kaisha
Inventor: Yuichi Sato
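The shift-instead-of-store idea stated in this abstract can be sketched briefly. Everything beyond that idea is an assumption: the names `build_rom` and `lookup_product` are hypothetical, the table is a Python list standing in for the ROM, and the operands are taken to be non-negative integers. The abstract is cut off before it describes the full ROM contents, so this sketch covers only the case it states.

```python
def build_rom(coefficients, x):
    """Store only the base products Ck * X, one entry per multiplicand Ck."""
    return [ck * x for ck in coefficients]


def lookup_product(rom, k, n=0):
    """Return Ck * (2**n * X) by shifting the stored Ck * X left by n bits."""
    base = rom[k]      # selected product Ck * X
    return base << n   # n == 0 passes the product through unshifted


rom = build_rom([3, 5], 7)
unshifted = lookup_product(rom, 1)        # 5 * 7
scaled = lookup_product(rom, 1, n=2)      # 5 * (4 * 7)
```

Products for multipliers of the form (2^n)X never need their own table entries, which is how the stored table stays small.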
-
Patent number: 6085213
Abstract: A multiplier capable of performing signed and unsigned scalar and vector multiplication is disclosed. The multiplier is configured to receive signed or unsigned multiplier and multiplicand operands in scalar or packed vector form. An effective sign for the multiplier and multiplicand operands may be calculated based upon each operand's most significant bit and a control signal. The effective signs may then be used to create and select a number of partial products according to Booth's algorithm. Once the partial products have been created and selected, they may be summed and the results may be output. The results may be signed or unsigned, and may represent vector or scalar quantities. When a vector multiplication is performed, the multiplier may be configured to generate and select partial products so as to effectively isolate the multiplication process for each pair of vector components. The multiplier may also be configured to sum the products of the vector components to form the vector dot product.
Type: Grant
Filed: March 27, 1998
Date of Patent: July 4, 2000
Assignee: Advanced Micro Devices, Inc.
Inventors: Stuart Oberman, Ming Siu
-
Patent number: 6078939
Abstract: A computer and a method of using the computer to separate a floating-point number into high and low parts and for evaluating a dominant arithmetic object and a remainder object. The dominant object is associated with the first arithmetic object by using the high parts of the floating-point number. The evaluation of a remainder arithmetic object associates the first arithmetic object with the high and low parts of the floating-point numbers. A sum of the dominant and remainder arithmetic objects returns a value corresponding to the first arithmetic object.
Type: Grant
Filed: September 30, 1997
Date of Patent: June 20, 2000
Assignee: Intel Corporation
Inventors: Shane A. Story, Ping Tak Peter Tang
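As a rough illustration of the high/low decomposition, the sketch below uses Veltkamp splitting, a standard technique that the abstract does not name, and takes a simple product as the "arithmetic object". The dominant term uses only the high parts and the remainder collects every term involving a low part. The function names are hypothetical, and the patent's actual splitting method may differ.

```python
# Hypothetical sketch: Veltkamp splitting stands in for the unspecified
# splitting method, and a product stands in for the arithmetic object.

def split(x):
    """Split a double into (hi, lo) with hi + lo == x and hi holding the leading bits."""
    c = 134217729.0 * x  # 2**27 + 1, the usual splitter constant for IEEE doubles
    hi = c - (c - x)
    lo = x - hi
    return hi, lo


def product_by_parts(x, y):
    """Return (dominant, remainder) whose sum approximates x * y."""
    xh, xl = split(x)
    yh, yl = split(y)
    dominant = xh * yh                       # uses the high parts only
    remainder = xh * yl + xl * yh + xl * yl  # uses high and low parts
    return dominant, remainder


dominant, remainder = product_by_parts(1.1, 2.3)
print(dominant, remainder, dominant + remainder)
```

The sketch ignores overflow for very large inputs, where the multiplication by the splitter constant would not be safe.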
-
Patent number: 6058411
Abstract: The invention pertains to the field of computing technology and microelectronics and is useful in producing high-speed integrated circuits and sets of integrated circuits for digital processing of signals, for computing product sums, and for multiplication and addition processes. The inventive method essentially comprises sending a set of values of the electrical parameter of a signal into transmission channels, the number m of which corresponds to the unloading of terms. A computation system, to which corresponds the number n, determines the number of all possible combinations of the set of l/m values of the signal's electrical parameter, which values are sent into a transmission channel according to their weighting. The set of l/m values of the electrical parameter of the signal is sent through a divide-by-n circuit, while transmission is effected from the previous transmission channel to the input of the next transmission channel.
Type: Grant
Filed: October 14, 1997
Date of Patent: May 2, 2000
Assignees: Rashid Bakievich Khalidov, Alisher Vahidovich Shaihov, Lancaster, Technologies, LLC.
Inventor: Aleksandr Jurievich Boltunov
-
Patent number: 6035316
Abstract: A processor having a first and second storage having a first and second packed data, respectively. Each packed data includes a first, second, third, and fourth data element. A multiply-add circuit is coupled to the first and second storage areas. The multiply-add circuit includes a first, second, third, and fourth multiplier, wherein each of the multipliers receives a corresponding set of said data elements. The multiply-add circuit further includes a first adder coupled to the first and second multipliers, and second adder coupled to the third and fourth multipliers. A third storage area is coupled to the adders. The third storage area includes a first and second field for saving output of the first and second adders, respectively, as first and second data elements of a third packed data.
Type: Grant
Filed: February 23, 1996
Date of Patent: March 7, 2000
Assignee: Intel Corporation
Inventors: Alexander D. Peleg, Millind Mittal, Larry M. Mennemeier, Benny Eitan, Carole Dulong, Eiichi Kowashi, Wolf Witt, Derrick Chu Lin, Ahmet Bindal
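The dataflow in this abstract (four multipliers feeding two adders) can be modelled in a few lines of Python. This is only a behavioural sketch: the function name `packed_multiply_add` is hypothetical, plain Python integers stand in for fixed-width data elements, and element ordering and overflow behaviour are not modelled.

```python
# Hypothetical behavioural model: four multipliers feeding two adders.

def packed_multiply_add(a, b):
    """Multiply four element pairs, then add adjacent products into two results."""
    if len(a) != 4 or len(b) != 4:
        raise ValueError("each packed operand must hold exactly four elements")
    p0, p1, p2, p3 = (x * y for x, y in zip(a, b))  # the four multipliers
    return [p0 + p1, p2 + p3]                       # the first and second adders


print(packed_multiply_add([1, 2, 3, 4], [5, 6, 7, 8]))  # prints [17, 53]
```

Summing the two output elements gives the dot product of the four-element operands, which is why this pattern is useful for sum-of-products loops.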
-
Patent number: 6029185
Abstract: An arithmetic unit for carrying out partial sum of products for transform operations such as discrete cosine transform is provided which includes a plurality of first units for calculating in parallel sums of and/or differences between a plurality of input variables or sums of and/or differences between a plurality of values obtained by multiplying said plurality of input variables by a constant. The arithmetic unit also includes a processing unit having a plurality of shift units for shifting outputs from said plurality of first units by respectively predetermined numbers of digit-shifts and a plurality of second units for calculating concurrently sums of outputs from said plurality of shift units. The arithmetic unit can be used, for example, as a high speed discrete cosine unit, a high speed Hartley transform unit or a high speed Hough transform unit.
Type: Grant
Filed: November 15, 1996
Date of Patent: February 22, 2000
Assignee: Hitachi, Ltd.
Inventor: Motonobu Tonomura
-
Patent number: 5983253
Abstract: A method and apparatus for performing complex digital filters. According to one aspect of the invention, a computer system generally having a transmitting unit, a processor, and a storage device is described. The storage device is coupled to the processor and has stored therein a routine. When executed by the processor, the routine causes the processor to perform a digital filter on unfiltered data items using complex coefficients to generate an output data stream. Execution of the routine causes the processor to perform outer and inner loops. The outer loop steps through corresponding relationships between the complex coefficients and the unfiltered data items. Each of these corresponding relationships is used by the digital filter to generate the output data stream. The inner loop steps through the complex coefficients. Within the inner loop, the unfiltered data item corresponding to the current complex coefficient is determined according to the current corresponding relationship.
Type: Grant
Filed: December 20, 1995
Date of Patent: November 9, 1999
Assignee: Intel Corporation
Inventors: Stephen A. Fischer, Larry M. Mennemeier, Alexander D. Peleg, Carole Dulong, Eiichi Kowashi
-
Patent number: 5983258
Abstract: An arithmetic stage calculates the sum AX+BY where A and B are 1-bit signals and X and Y p bit coefficients X=7 and Y=3 and the corresponding bits b_1 to b_5 are represented together with the corresponding logical states of A and B. It will be seen that for example column b_3 together with columns A and B is the truth table of a NAND gate. Column b_2 together with columns A and B is the truth table of a COINCIDENCE gate. In the example of FIG. 5 column b_4 equals B; column b_1 is logical 0 whatever the states of A and B; and column b_5 is NOT A. Thus in accordance with one illustrative embodiment of the invention the arithmetic stage 40 may be implemented by the logic circuit of FIG. 6 where bit b_5 is produced by inverting A, bit b_4 is produced by coupling output b_1 to input B, via a direct connection 60, bit b_3 is produced by a NAND gate 61, bit b_2 is produced by a COINCIDENCE gate 62, and bit b_1 is produced by coupling output b_
Type: Grant
Filed: November 26, 1997
Date of Patent: November 9, 1999
Assignees: Sony Corporation, Sony United Kingdom Limited
Inventors: Peter Charles Eastty, Christopher Sleight, Peter Damien Thorpe
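The gate assignments listed in this abstract can be checked numerically with the sketch below. It rests on an assumption the abstract does not state: the 1-bit signals A and B are read as -1 (logical 0) or +1 (logical 1), and the sum is encoded as a 5-bit two's-complement word with b1 as the least significant bit. Under that reading, the computed bits match the NAND, COINCIDENCE (XNOR), direct-connection, constant-0, and inverter columns for X=7 and Y=3. The function names are hypothetical.

```python
# Hypothetical check; the -1/+1 reading of A and B and the 5-bit
# two's-complement output are assumptions, not statements from the abstract.

def stage_bits(a, b, x=7, y=3, width=5):
    """Return [b1, ..., b5] (LSB first) of A*X + B*Y with A, B read as -1 or +1."""
    value = (1 if a else -1) * x + (1 if b else -1) * y
    word = value & ((1 << width) - 1)  # two's-complement encoding in `width` bits
    return [(word >> i) & 1 for i in range(width)]


def gate_bits(a, b):
    """Gate-level form from the abstract, in the same b1..b5 order."""
    return [
        0,                    # b1: constant logical 0
        int(not (a ^ b)),     # b2: COINCIDENCE (XNOR) gate
        int(not (a and b)),   # b3: NAND gate
        b,                    # b4: equals B
        int(not a),           # b5: NOT A
    ]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, stage_bits(a, b), gate_bits(a, b))
```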
-
Patent number: 5983257
Abstract: A computer system which includes a multimedia input device which generates an audio or video input signal and a processor coupled to the multimedia input device. The system further includes a storage device coupled to the processor and having stored therein a signal processing routine for multiplying and accumulating input values representative of the audio or video input signal. The signal processing routine, when executed by the processor, causes the processor to perform several steps. These steps include performing a packed multiply add on a first set of values packed into a first source and a second set of values packed into a second source each representing input signals to generate a packed intermediate result. The packed intermediate result is added to an accumulator to generate a packed accumulated result in the accumulator. These steps may be iterated with the first set of values and portions of the second set of values to the accumulator to generate the packed accumulated result.
Type: Grant
Filed: December 26, 1995
Date of Patent: November 9, 1999
Assignee: Intel Corporation
Inventors: Carole Dulong, Larry M. Mennemeier, Tuan H. Bui, Eiichi Kowashi, Alexander D. Peleg, Benny Eitan, Stephen A. Fischer, Benny Maytal, Millind Mittal
-
Patent number: 5983256
Abstract: A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least two of the data elements in this third packed data storing the result of performing multiply-add operations on data elements in the first and second packed data.
Type: Grant
Filed: October 29, 1997
Date of Patent: November 9, 1999
Assignee: Intel Corporation
Inventors: Alexander Peleg, Millind Mittal, Larry M. Mennemeier, Benny Eitan, Carole Dulong, Eiichi Kowashi, Wolf Witt
-
Patent number: 5944775
Abstract: A sum-of-products arithmetic unit includes a coefficient register, a data register, a multiplier, an adder, and a data bus used for the transfer of data to and from an external unit. Provision is made to allow addresses in the data register in which sum-of-products arithmetic data is to be stored to be specified without externally specifying an individual address in the data register for each piece of arithmetic data. This provision comprises an automatic data batch storage section, automatic address setting section, or address setting section.
Type: Grant
Filed: July 31, 1997
Date of Patent: August 31, 1999
Assignee: Fujitsu Limited
Inventor: Matsui Satoshi
-
Patent number: 5931893
Abstract: The details of an improved correlator and efficient method of correlation are disclosed. The last M received signal samples are compared with all shifts of a given M-bit binary codeword. The correlator adds or subtracts each of the signal samples accordingly, as the corresponding shift of the codeword contains a binary "1" or "0" in that position. The total is output for each new signal sample received, with a shift of one position between the signal samples and the codeword.
Type: Grant
Filed: November 11, 1997
Date of Patent: August 3, 1999
Assignee: Ericsson, Inc.
Inventors: Paul W. Dent, Eric Wang
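For reference, the quantity this correlator produces can be computed directly as in the sketch below. This is the straightforward O(M^2) computation, not the efficient structure the patent claims. It assumes that the shifts are cyclic and that a codeword bit of 1 means "add" and 0 means "subtract"; the abstract fixes neither detail. The function name is hypothetical.

```python
# Hypothetical reference computation; cyclic shifts and the 1 -> add,
# 0 -> subtract mapping are assumptions.

def correlate_all_shifts(samples, codeword):
    """Correlate the last M samples with every cyclic shift of an M-bit codeword."""
    m = len(codeword)
    window = samples[-m:]
    if len(window) != m:
        raise ValueError("need at least M samples")
    results = []
    for shift in range(m):
        total = 0
        for i, sample in enumerate(window):
            bit = codeword[(i + shift) % m]
            total += sample if bit else -sample  # add for a 1, subtract for a 0
        results.append(total)
    return results


print(correlate_all_shifts([1, 2, 3, 4], [1, 0, 1, 1]))  # prints [6, 8, 2, 4]
```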
-
Patent number: 5904731
Abstract: A product-sum device has a data register and a coefficient register and calculates the sum of products of the outputs of the data register and coefficient register. The data register has register elements that successively shift presently held data whenever new data is supplied to the data register. The coefficient register has register elements corresponding to the register elements. The product-sum device carries out a filtering operation at high speed.
Type: Grant
Filed: April 28, 1995
Date of Patent: May 18, 1999
Assignee: Fujitsu Limited
Inventor: Satoshi Matsui
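The shifting data register paired with a fixed coefficient register behaves like a direct-form FIR filter, which the following Python sketch models in software. The class name `ProductSum`, the zero-initialised register, and the newest-sample-first ordering are assumptions made for illustration and are not details from the patent.

```python
from collections import deque

# Hypothetical software model of a shifting data register and a coefficient register.


class ProductSum:
    def __init__(self, coefficients):
        self.coefficients = list(coefficients)
        # Data register, initially all zeros, with one element per coefficient.
        self.data = deque([0] * len(self.coefficients), maxlen=len(self.coefficients))

    def push(self, sample):
        """Shift in a new sample and return the current sum of products."""
        self.data.appendleft(sample)  # held data shifts along; the oldest element drops out
        return sum(c * d for c, d in zip(self.coefficients, self.data))


fir = ProductSum([1, 2, 3])
print([fir.push(x) for x in [1, 2, 3, 4]])  # prints [1, 4, 10, 16]
```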