Carry-save Adders (i.e., Csas) Patents (Class 708/629)
-
Patent number: 11789701Abstract: A multiplier circuit is provided to multiply a first operand and a second operand. The multiplier circuit includes a carry-save adder network comprising a plurality of carry-save adders to perform partial product additions to reduce a plurality of partial products to a redundant result value that represents a product of the first operand and the second operand. A number of the carry-save adders that is used to generate the redundant result value is controllable and is dependent on a width of at least one of the first operand and the second operand.Type: GrantFiled: August 5, 2020Date of Patent: October 17, 2023Assignee: Arm LimitedInventors: Tai Li, Jack William Derek Andrew, Michael Alexander Kennedy
-
Patent number: 10042605Abstract: A processor includes a carry save array multiplier. The carry save array multiplier includes an array of cascaded partial product generators. The array of cascaded partial product generators is configured to generate an output value as a product of two operands presented at inputs of the multiplier. The array of cascaded partial product generators is also configured to generate an output value as a sum of two operands presented at inputs of the multiplier.Type: GrantFiled: April 19, 2016Date of Patent: August 7, 2018Assignee: TEXAS INSTRUMENTS INCORPORATEDInventors: Christian Wiencke, Armin Stingl
-
Patent number: 9632751Abstract: According to one embodiment, an arithmetic circuit includes follows. The arithmetic unit performs an arithmetic operation including addition and multiplication to generate a first value of (n+m) bits. The rounding preprocessor performs an OR operation on lower (m?k) bits of the first value to generate a second value of 1 bit. The register stores a third value of (n+k+1) bits obtained by concatenating upper (n+k) bits of the first value and the second value. The rounding postprocessor calculates a carry bit value of 1 bit from a most significant bit of the third value and lower (k+1) bits of the third value, and adds the carry bit value to upper n bits of the third value.Type: GrantFiled: December 24, 2013Date of Patent: April 25, 2017Assignee: KABUSHIKI KAISHA TOSHIBAInventor: Koichiro Ban
-
Patent number: 8996601Abstract: The disclosed embodiments relate to apparatus for accurately, efficiently and quickly executing a multiplication instruction. The disclosed embodiments can provide a multiplier module having an optimized layout that can help speed up computation of a result during a multiply operation so that cycle delay can be reduced and so that power consumption can be reduced.Type: GrantFiled: June 21, 2012Date of Patent: March 31, 2015Assignee: Advanced Micro Devices, Inc.Inventors: Scott A. Hilker, George Q. Phan
-
Patent number: 8918446Abstract: Methods and apparatus relating to reducing power consumption in multi-precision floating point multipliers are described. In an embodiment, certain portions of a multiplier are disabled in response to two or more multiplication operations with the same data size and data type occurring back-to-back. Other embodiments are also claimed and described.Type: GrantFiled: December 14, 2010Date of Patent: December 23, 2014Assignee: Intel CorporationInventors: Brent R. Boswell, Thierry Pons, Tom Aviram
-
Patent number: 8868634Abstract: A method and apparatus are described for performing multiplication in a processor to generate a product. In one embodiment, a 64-bit multiplier and a 64-bit multiplicand may be multiplied together over four cycles by merging different partial product (PP) subsets, generated by a Booth encoder and a PP generator, with feedback sum and carry results. The logic inputs of a plurality of multiplexers may be selected on a cyclical basis to efficiently compress (i.e., merge) each PP subset with feedback sum and carry results. A pair of preliminary sum results stored during one cycle may be outputted during a subsequent cycle and processed by a logic gate (e.g., an XOR gate) to generate a feedback sum result that is merged with a feedback carry result and a PP subset. Final sum and carry results may be added to generate the product of the multiplier and the multiplicand.Type: GrantFiled: December 2, 2011Date of Patent: October 21, 2014Assignee: Advanced Micro Devices, Inc.Inventors: Srikanth Arekapudi, Sudherssen Kalaiselvan
-
Patent number: 8838664Abstract: The disclosed embodiments relate to methods and apparatus for accurately, efficiently and quickly executing a fused multiply-and-accumulate instruction with respect to floating-point operands that have packed-single-precision format. The disclosed embodiments can speed up computation of a high-part of a result during a fused multiply-and-accumulate operation so that cycle delay can be reduced and so that power consumption can be reduced.Type: GrantFiled: June 29, 2011Date of Patent: September 16, 2014Assignee: Advanced Micro Devices, Inc.Inventors: David Oliver, Debjit Dassarma, Hanbing Liu, Scott Hilker
-
Publication number: 20140164457Abstract: An extensible iterative multiplier design is provided. Embodiments provide cascaded 8-bit multipliers for simplifying the performance of multi-byte multiplications. Booth encoding is performed in the lowest order multiplier, with the result of the Booth encoding then provided to higher order multipliers. Additionally, multiply-add operations can be performed by initializing a partial product sum register. Configurable connections between the multipliers facilitate a variety of possible multiplication options, including the possibility of varying the width of the operands.Type: ApplicationFiled: December 7, 2013Publication date: June 12, 2014Applicant: Wave Semiconductor, Inc.Inventors: Samit Chaudhuri, Radoslav Danilak
-
Patent number: 8667040Abstract: An apparatus having operand registers, an opcode detector, a carryless preformat unit, a compressor, a left shifter, and exclusive-OR logic. The operand registers receive operands for a carryless multiplication. The opcode detector receives a carryless multiplication instruction, and asserts a carryless signal. The carryless preformat unit partitions a first operand into a plurality of parts that are such that a Booth encoder is precluded from selection of second partial products of a second operand, where the second partial products reflect implicit carry operations. The compressor sums first partial products of the second operand via carry save adders arranged in a Wallace tree configuration, where generation of carry bits is disabled. The left shifter shifts one or more outputs of the compressor. The exclusive-OR logic executes an exclusive-OR function to yield a carryless multiplication result.Type: GrantFiled: December 3, 2010Date of Patent: March 4, 2014Assignee: VIA Technologies, Inc.Inventor: Timothy A. Elliott
-
Patent number: 8661072Abstract: A shared parallel adder tree for executing multiple different population count operations on a single datum includes a number of carry-save adders (CSAs) and/or half adders (HAs), arranged in rows, where certain CSAs and HAs are dedicated to a single population count operation, while other CSAs and HAs are shared among two or more population count operations. The datum is applied to the first row in the tree. Partial sums of the number of ones at various locations within the tree are routed to certain CSAs and/or HAs “down” the tree to propagate the particular population count operations. Carry-propagate adders generate at least a portion of the final sum of the number of ones in certain population count operations. An “AND” operation on a particular number of the bits in the datum provides the high order bit of the resulting sum of the particular population count operation.Type: GrantFiled: August 19, 2008Date of Patent: February 25, 2014Assignee: International Business Machines CorporationInventors: Bartholomew Blaner, Todd R. Iglehart, Robert K. Montoye
-
Patent number: 8645448Abstract: An apparatus having a carryless preformat unit, a Booth encoder, a compressor, a left shifter, and exclusive-OR logic. The carryless preformat unit receives a multiplier operand and partitions the multiplier operand into parts. The Booth encoder receives the parts and directs selection of first partial products of a multiplicand that do not reflect implicit carry operations. The compressor sums the first partial products via a configuration of carry save adders that generate sum bits and carry bits, where generation of the carry bits is disabled during execution of the carryless multiplication. The left shifter shifts bits of one or more outputs of the compressor. The exclusive-OR logic is coupled to the compressor and the left shifter, and is configured to execute an exclusive-OR function on the outputs to yield a carryless multiplication result.Type: GrantFiled: December 3, 2010Date of Patent: February 4, 2014Assignee: VIA Technologies, Inc.Inventor: Timothy A. Elliott
-
Patent number: 8606842Abstract: Provided are N-digit addition and subtraction units and N-digit addition and subtraction modules in which borrowing and carrying are not propagated in modules having basic digits. In the units and modules, an output pattern of results of addition and subtraction is predicted based on a relation between an augend and an addend and a relation between a minuend and a subtrahend, respectively, thereby preventing borrowing and carrying from being propagated in modules having basic digits.Type: GrantFiled: August 21, 2008Date of Patent: December 10, 2013Assignee: Tokyo Denki UniversityInventors: Hiroshi Kasahara, Tsugio Nakamura, Jin Sato
-
Patent number: 8601048Abstract: Herein described is a method and system of implementing integrated circuit logic modules that provide maximum efficiency and minimum energy dissipation. In a representative embodiment, a method of implementing one or more digital signal processing functions comprises determining one or more parameters associated with generating an optimal logic module. The one or more parameters may comprise the circuit area of the logic module and the processing time through a critical path of the logic module. In a representative embodiment, the system comprises a logic module that utilizes four full adders arranged in a tree configuration. In a representative embodiment, the logic module comprises a carry-save accumulator that provides maximum efficiency and minimal energy dissipation.Type: GrantFiled: January 5, 2005Date of Patent: December 3, 2013Assignee: Broadcom CorporationInventor: Christian Lutkemeyer
-
Publication number: 20130031154Abstract: A self-timed multiplier unit includes a multiplier and a clock generator. The multiplier has a first set of semiconductor circuits in a critical path. The clock generator has a second set of semiconductor circuits configured to control a clock period of said clock generator selected to set a clock period longer than the propagation delay through the critical path of the multiplier. The clock generator may include a delay circuit having a delay to set the clock period longer than the propagation delay through the critical path of said multiplier. The clock generator uses circuit with identical logical design including the same standard cells, the same logic design or the same floor plan. Close matching of these circuit causes the multiplier and the clock generator to experience the same PVT speed variations.Type: ApplicationFiled: July 27, 2012Publication date: January 31, 2013Applicant: TEXAS INSTRUMENTS DEUTSCHLAND GMBHInventors: Christian Wiencke, Horst Diewald
-
Patent number: 8364741Abstract: A multiplier includes an operation unit that adds or subtracts a first group selected from a current input data, and a second group selected from a next input data corresponding to the first group to generate an operation result, a Booth's encoder that encodes the operation result according to Booth's algorithm, and generates code data, a partial product generation unit that calculates a partial product from the code data as a first partial product, and calculates, in a case where the first group and the second group are specific combination, a second partial product, and an adder that cumulatively adds an output from the partial product generation unit. The specific combination is a combination in which the highest-order bit of each of the first group and the second group is the same value, and the third least significant bit obtained after the subtraction operation is 1.Type: GrantFiled: February 25, 2009Date of Patent: January 29, 2013Assignee: Renesas Electronics CorporationInventor: Yoichi Katayama
-
Patent number: 8352533Abstract: There is provided a semiconductor integrated circuit including: a plurality of first logic blocks which are reconfigurable, the plurality of first logic blocks inputting data of a first bit width and performing computation; a first network connecting the plurality of first logic blocks in a dynamically reconfigurable manner; a plurality of second logic blocks inputting data of a second bit width different from the first bit width and performing computation; a second network connected to outputs of the plurality of second logic blocks; and a third network connecting a carry bit output of a computing unit included in the first logic block to an input of a computing unit included in the second logic block in a dynamically reconfigurable manner.Type: GrantFiled: December 11, 2008Date of Patent: January 8, 2013Assignee: Fujitsu Semiconductor LimitedInventor: Hiroshi Furukawa
-
Patent number: 8275822Abstract: Multiplication engines and multiplication methods are provided. A multiplication engine for a digital processor includes a first multiplier to generate unequally weighted partial products from input operands in a first multiplier mode; a second multiplier to generate equally weighted partial products from input operands in a second multiplier mode; a multiplexer to select the unequally weighted partial products in the first multiplier mode and to select the equally weighted partial products in the second multiplier mode; and a carry save adder array configured to combine the selected partial products in the first multiplier mode and in the second multiplier mode.Type: GrantFiled: January 10, 2008Date of Patent: September 25, 2012Assignee: Analog Devices, Inc.Inventors: Andreas D. Olofsson, Baruch Yanovitch
-
Patent number: 8244790Abstract: A multiplier circuit is disclosed including a Wallace tree block and a carry propagation adder. The Wallace tree block includes a sum calculation block adding partial products for each digit and a carry calculation block adding carries obtained in the addition by the sum calculation block. In the case of multiplication over an extension field (finite field GF(2n)) of two, a result of calculation by the sum calculation block is outputted. The carry propagation adder adds the result of calculation by the sum calculation block and a result of calculation by the carry calculation block. In the case of multiplication for integers (finite field GF(p)), a result of calculation by the carry propagation adder is outputted.Type: GrantFiled: January 21, 2004Date of Patent: August 14, 2012Assignee: International Business Machines CorporationInventors: Akashi Satoh, Kohji Takano
-
Patent number: 8099450Abstract: Combining circuitry for combining a plurality of multi-bit partial product terms in a multiplier circuit includes a plurality of compression columns, each column receiving a plurality of partial product term bits. At least one compression column includes: a first circuit arranged to receive a first set of the plurality of partial product term bits for the at least one compression column, the first circuit further arranged to combine the first set of term bits to produce a first combined term bit set; and a second circuit arranged to receive a second set of the plurality of term bits for the at least one compression column and all of the first combined term bit set.Type: GrantFiled: July 20, 2006Date of Patent: January 17, 2012Assignee: STMicroelectronics (Research & Development) Ltd.Inventor: Tariq Kurd
-
Patent number: 8019805Abstract: A floating point multiplier circuit includes partial product generation logic configured to generate a plurality of partial products from multiplicand and multiplier values. The plurality of partial products corresponds to a first and second portion of the multiplier value during respective first and second partial product execution phases. The multiplier also includes a plurality of carry save adders configured to accumulate the plurality of partial products generated during the first and second partial product execution phases into a redundant product during respective first and second carry save adder execution phases. The multiplier further includes a first carry propagate adder coupled to the plurality of carry save adders and configured to reduce a first and second portion of the redundant product to a multiplicative product during respective first and second carry propagate adder phases. The first carry propagate adder phase begins after the second carry save adder execution phase completes.Type: GrantFiled: December 9, 2003Date of Patent: September 13, 2011Assignee: GLOBALFOUNDRIES Inc.Inventor: Debjit Das Sarma
-
Publication number: 20100235414Abstract: A Montgomery multiplication device calculates a Montgomery product of an operand X and an operand Y with respect to a modulus M and includes a plurality of processing elements. In a first clock cycle, two intermediate partial sums are created by obtaining an input of length w?1 from a preceding processing element as w?1 least significant bits. The most significant bit is configured as either zero or one. Then, two partial sums are calculated using a word of the operand Y, a word of the modulus M, a bit of the operand X, and the two intermediate partial sums. In a second clock cycle, a selection bit is obtained from a subsequent processing element and one of the two partial sums is selected based on the value of the selection bit. Then, the selected partial sum is used for calculation of a word of the Montgomery product.Type: ApplicationFiled: March 1, 2010Publication date: September 16, 2010Inventors: Miaoqing Huang, Krzysztof Gaj
-
Patent number: 7797366Abstract: Techniques for the design and use of a digital signal processor, including processing transmissions in a communications (e.g., code division multiple access) system. Power-efficient sign extension for Booth multiplication processes involves applying a sign bit in a Booth multiplication tree. The sign bit allows the Booth multiplication process to perform a sign extension step. This further involves one-extending a predetermined partial product row of the Booth multiplication tree using a sign bit for preserving the correct sign of the predetermined partial product row. The process and system resolve the signal value of the sign bit by generating a sign-extension bit in the Booth multiplication tree. The sign-extension bit is positioned in a carry-out column to extend the product of the Booth multiplication process.Type: GrantFiled: February 15, 2006Date of Patent: September 14, 2010Assignee: QUALCOMM IncorporatedInventors: Shankar Krithivasan, Christopher Edward Koob, William C. Anderson
-
Publication number: 20100057824Abstract: A computer system for computing a binary operation involving a first term multiplied by a second term resulting in a product, where the product is conditionally added to a third term in a central processing unit. The central processing unit includes a carry save adder configured to add a plurality of partial products obtained from the product of the first term and the second term to obtain a first partial result and a second partial result, a multiplexer configured to output one selected from the group consisting of the second term, the third term, and zero, and an alignment shifter configured to shift an output of the multiplexer to align the output of the multiplexer with the first partial result and the second partial result to obtain a shifted term. The shifted term, the first partial result and the second partial result are added together to obtain a result of the binary operation.Type: ApplicationFiled: September 3, 2008Publication date: March 4, 2010Applicant: SUN MICROSYSTEMS, INC.Inventor: Leonard D. Rarick
-
Publication number: 20080243976Abstract: The present invention relates to a multiply apparatus and a method for multiplying a first operand consisting of na bits and a second operand consisting of nx bits. In one embodiment the multiply apparatus comprising a CSA (CSA) unit with nx rows each comprising na AND gates for calculating a single bit product of two single bit input values and adder cells for adding results of a preceding row to a following row and a last output row for outputting a carry vector and a sum vector, and logic circuitry for selectively inverting the single bit products at the most significant position of the nx?1 first rows and at the na?1 least significant positions of the output row in response to a first configuration signal before inputting the selectively inverted single bit products to respective adder cells for switching the CSA unit selectively between processing of signed two's complement operands and unsigned operands in response to the first configuration signal.Type: ApplicationFiled: March 28, 2008Publication date: October 2, 2008Applicant: TEXAS INSTRUMENTS DEUTSCHLAND GMBHInventor: Christian Wiencke
-
Patent number: 7424506Abstract: A method is presented comprising analyzing two or more input terms on a per-bit basis within each level of bit-significance. Maximally segmenting each of the levels of bit-significance into one or more one-, two-, and/or three-bit groups, and designing a hyperpipelined hybrid Wallace tree adder utilizing one or more full-adders, half-adders, and associated register based, at least in part, on the maximal segmentation of the input terms.Type: GrantFiled: March 31, 2001Date of Patent: September 9, 2008Assignee: Durham Logistics LLCInventor: John T. Orchard
-
Patent number: 7392277Abstract: A cascaded differential domino four-to-two reducer. In an embodiment, the four-to-two reducer is constructed of a first three-to-two reducer and a second three-to-two reducer directly connected to the first three-to-two reducer. In a further embodiment, the first and second three-to-two reducer both include a symmetric carry generate gate.Type: GrantFiled: June 29, 2001Date of Patent: June 24, 2008Assignee: Intel CorporationInventor: Thomas D. Fletcher
-
Patent number: 7373368Abstract: A multiply execution unit that can generate the integer product of a multiplicand and a multiplier and is also operable to generate the XOR product of the multiplicand and the multiplier. The multiply execution unit includes a summing circuit for summing a plurality of partial products. The summing circuit includes a plurality of rows. The summing circuit can generate an integer sum of the plurality of partial products and can generate an XOR sum of the plurality of partial products. The summing circuit includes a plurality of compressors in the first row of the summing circuit. The plurality of compressors each has more than three inputs that receive data, a carry output, and a sum output.Type: GrantFiled: July 15, 2004Date of Patent: May 13, 2008Assignee: Sun Microsystems, Inc.Inventors: Leonard D. Rarick, Shu-Chin Tai
-
Patent number: 7334200Abstract: A low-error fixed-width multiplier receives a W-bit input and produces a W-bit product. In an embodiment, a multiplier (Y) is encoded using modified Booth coding. The encoded multiplier (Y) and a multiplicand (X) are processed together to generate partial products. The partial products are accumulated to generate a product (P). To compensate for the quantization error, Booth encoder outputs are used for the generation of error compensation bias. The truncated bits are divided into two groups, a major least significant bit group and a minor least significant bit group, depending upon their effects on the quantization error. Different error compensation methods are applied to each group.Type: GrantFiled: February 22, 2005Date of Patent: February 19, 2008Assignee: Broadcom CorporationInventors: Keshab K. Parhi, Jin-Gyun Chung, Kwang-Cheol Lee, Kyung-Ju Cho
-
Patent number: 7313585Abstract: A multiplier circuit is disclosed for multiplying a multiplicand by a multiplier. The multiplier circuit includes a partial product generator and a partial product adder. The partial product generator includes a first input to receive a multiplicand; a second input to receive a multiplier; partial product generation means for producing a plurality of partial products based on the multiplicand and the multiplier; and an output coupled to the partial product generation means to provide the plurality of partial products. The partial product adder includes an input coupled to the output of the partial product generator; a plurality of adders to add the plurality of partial products to produce a final product, the plurality of adders comprising a plurality of compressors having substantially the same width; and an output coupled to the plurality of adders to provide the final product.Type: GrantFiled: August 30, 2003Date of Patent: December 25, 2007Assignee: Hewlett-Packard Development Company, L.P.Inventor: Paul W. Winterrowd
-
Patent number: 7212959Abstract: A method and apparatus for accumulating arbitrary length strings of input values, such as floating point values, in a layered tree structure such that the order of adds at each layer is maintained. The accumulating utilizes a shared adder, and includes means for directing initial inputs and intermediate result values.Type: GrantFiled: August 8, 2001Date of Patent: May 1, 2007Inventors: Stephen Clark Purcell, Scott Kimura, Mark L. Wood Patrick
-
Patent number: 7124162Abstract: A Wallace tree structure such as that used in a digital signal processor (DSP) is arranged to sum vectors. The structure has a number of adder stages, each of which may have half adders with two input nodes, and full adders with three input nodes. The structure is designed with reference to the vectors to be summed. The number of full- and half-adders in each stage and the arrangement of vector inputs depends upon their characteristics. An algorithm calculates the possible tree structures and input arrangements, and selects an optimum design having a small final stage ripple adder after the last stage of the Wallace tree structure, the design being based upon the characteristics of the vector inputs. This leads to reduced propagation delay and a reduced amount of semiconductor material for implementation of the DSP.Type: GrantFiled: October 29, 2002Date of Patent: October 17, 2006Assignee: Freescale Semiconductor, Inc.Inventors: Alain Combes, Franz Steininger
-
Patent number: 7111033Abstract: A carry save adder circuit for reducing the number of inputs to a lower number of outputs, the carry save adder circuit including four carry save adders, the four carry save adders being arranged in two layers with the first and second carry save adders being arranged in a first of said layers and the third and fourth carry save adders being arranged in a second of the layers, said third and fourth carry save adders being arranged to provide the outputs, the third and fourth carry save adders each receiving at least one output from each of the first and second carry save adders and the first and second carry save adders being arranged to receive at least some of the inputs.Type: GrantFiled: July 30, 2001Date of Patent: September 19, 2006Assignee: STMicroelectronics S.A.Inventor: Sebastien Ferroussat
-
Patent number: 7111166Abstract: An extension of the serial/parallel Montgomery modular multiplication method with simultaneous reduction as previously implemented by the applicants, adapted innovatively to perform both in the prime number and in the GF(2q) polynomial based number field, in such a way as to simplify the flow of operands, by performing a multiple anticipatory function to enhance the previous modular multiplication procedures.Type: GrantFiled: May 14, 2001Date of Patent: September 19, 2006Assignee: Fortress U&T Div. M-Systems Flash Disk Pioneers Ltd.Inventors: Itai Dror, Carmi David Gressel, Michael Mostovoy, Alexey Molchanov
-
Patent number: 7043520Abstract: A partial carry-save format is employed for a finite impulse response filter output representation, thereby reducing a number of flip-flops and hence power. By replacing the least significant bit processing section on the output side of the finite impulse response filter with a combined carry-save adder and carry-propagate adder followed by a register rather than two flip-flops, the present invention reduces the load on the clock and achieves reduced propagation delay. To further improve the performance of the finite impulse response filter, a simpler carry-save adder is employed in the least significant bit section, which is possible due to the use of a single register at an input to each of the carry-save adders rather than two flip-flops, one for a carry output and one for a sum output from the adder.Type: GrantFiled: November 29, 2003Date of Patent: May 9, 2006Assignee: Agere Systems Inc.Inventors: Patrik Larsson, Christopher John Nicol
-
Patent number: 6989843Abstract: A sample-to-pixel calculation unit in a graphics system may comprise an adder tree. The adder tree includes a plurality of adder cells coupled in a tree configuration. Input values are presented to a first layer of adder cells. Each input value may have two associated control signals: a data valid signal and a winner-take-all signal. The final output of the adder tree equals (a) a sum of those input values whose data valid signals are asserted provided that none of the winner-take-all signals are asserted, or (b) a selected one of the input values if one of the winner-take-all bits is asserted. The selected input value is the one whose winner-take-all bit is set. The adder tree may be used to perform sums of weighted sample attributes and/or sums of coefficients values as part of pixel value computations.Type: GrantFiled: June 28, 2001Date of Patent: January 24, 2006Assignee: Sun Microsystems, Inc.Inventors: N. David Naegle, Scott R. Nelson
-
Patent number: 6978426Abstract: A low-error fixed-width multiplier receives a W-bit input and produces a W-bit product. In an embodiment, a multiplier (Y) is encoded using modified Booth coding. The encoded multiplier (Y) and a multiplicand (X) are processed together to generate partial products. The partial products are accumulated to generate a product (P). To compensate for the quantization error, Booth encoder outputs are used for the generation of error compensation bias. The truncated bits are divided into two groups, a major least significant bit group and a minor least significant bit group, depending upon their effects on the quantization error. Different error compensation methods are applied to each group.Type: GrantFiled: August 30, 2002Date of Patent: December 20, 2005Assignee: Broadcom CorporationInventors: Keshab K. Parhi, Jin-Gyun Chung, Kwang-Cheol Lee, Kyung-Ju Cho
-
Patent number: 6973471Abstract: A multiplier (42) forms a product from two signed operands without performing a sign extension of the multiplicand (A). A modified Booth's recoding of the multiplier operand (B) is begun immediately without being delayed by a sign extension operation. While recoding and partial product generation is occurring, a determination is made in parallel whether or not a sign extension adjustment term must be created. When needed, a value equal to (?B) (2N), where N is equal to a bit width of the multiplicand (A), is formed in parallel with the recoding and partial product generation. The sign extension adjustment term is coupled to a plurality of carry save adders (49, 51, 53) that compress a plurality of partial products to a sum term and a carry term. A final add stage combines the sum term and carry term to provide a product with correct sign extension.Type: GrantFiled: February 22, 2002Date of Patent: December 6, 2005Assignee: Freescale Semiconductor, Inc.Inventor: Trinh Huy Nguyen
-
Patent number: 6763367Abstract: An apparatus and method for compressing a reduction array into an accumulated carry-save sum. The reduction array includes a partial product matrix, a carry-save sum, and a constant value row. A compressor array generates a previous accumulated carry-save sum. A three-input/two-output carry-save adder pre-reduces the constant value row and the previously accumulated carry-save sum into a two-row intermediate carry-save sum that is added to the partial product matrix to form a current accumulated carry-save sum.Type: GrantFiled: December 11, 2000Date of Patent: July 13, 2004Assignee: International Business Machines CorporationInventors: Ohsang Kwon, Kevin J. Nowka
-
Patent number: 6732135Abstract: In a digital processor performing division, quotient accumulation apparatus is formed of a set of muxes and a single carry save adder. Partial quotients are accumulated in carry-save form with proper sign extension. Delay of partial quotient bit fragments from one iteration to a following iteration enables the apparatus to limit use to one carry save adder. By enlarging minimal logic, the quotient accumulation apparatus operates at a rate fast enough to support the rate of fast dividers.Type: GrantFiled: January 31, 2000Date of Patent: May 4, 2004Assignee: Hewlett-Packard Development Company, L.P.Inventors: Sridhar Samudrala, John D. Clouser, William R. Grundmann
-
Patent number: 6721774Abstract: A digital multiplier 110 for multiplying a plurality of multiplicand signals X0-X23 representing a multiplicand and a plurality of multiplier signals Y0-Y23 representing a multiplier. In it, a plurality of intermediate results signals, such as partial product signals, are generated from the multiplicand signals and the multiplier signals. A plurality of adder circuits 40 are also provided for adding the intermediate results signals to generate a plurality of final result signals representing the result of multiplying the multiplicand and the multiplier, wherein at least some of the adder circuits receive first signals representing intermediate addition results from at least two prior adder stages and also receive second signals representing intermediate results generated as the result of only a single addition.Type: GrantFiled: May 7, 1998Date of Patent: April 13, 2004Assignee: Texas Instruments IncorporatedInventors: Wai Lee, Toshiyuki Sakuta
-
Patent number: 6692534Abstract: The present invention provides an apparatus for booth decoding which stores the most significant bit of the lower half of the number used as the key for booth decoding. By using this stored bit to determine the rightmost booth group corresponding to the upper half of the key, booth decoding may be accomplished more quickly using an apparatus that is simpler and smaller than prior art assemblies.Type: GrantFiled: September 8, 1999Date of Patent: February 17, 2004Assignee: Sun Microsystems, Inc.Inventors: Yong Wang, Allan Tzeng
-
Patent number: 6615229Abstract: The present invention relates to a new low-power, high performance multiplier circuit design, and more specifically to a partitioned multiplier implemented using a modified, symmetrical Wallace tree structure that enables the power to parts of the multiplier to be selectively turned on and off. A multiplier implemented using complementary pass-transistor logic (CPL) 3:2 carry save adders (CSAs) includes a left array with a first multiple of CPL CSAs, a right array with a second multiple of CPL CSAs, and a merge block coupled to the left array and the right array, such that the left and right arrays are not coupled to each other. The left and right arrays are configured to independently receive power, such that, each array can be turned on and off without affecting the other array. The merge block includes a third multiple of CPL CSAs and the merge block can be configured to output a result value of a multiplication operation.Type: GrantFiled: June 29, 2000Date of Patent: September 2, 2003Assignee: Intel CorporationInventors: Narsing Vijayrao, Chi Keung Lee, Kumar Sudarshan
-
Patent number: 6611857Abstract: A multiplier (12) is disclosed that includes an encoder (36), a hierarchy of compressors (40, 42, 44, 50, 52, 60 and 70), a bit detector (130) and a switch (134). The encoder (36) is operable to receive a first and second encoder input. The compressors (40, 42, 44, 50, 52, 60 and 70) are coupled to the encoder (36). The compressors (40,42, 44, 50, 52, 60 and 70) are operable to receive a first number of inputs and to generate a second number of outputs, with the second number being less than the first number. The bit detector (130) is operable to monitor the first encoder input to determine whether the first encoder input is in a reduced precision range (28). The bit detector (130) is also operable to deactivate a subset of the compressors (40 and 50) when the bit detector (130) determines that the first encoder input is in the reduced precision range (28). The switch (134) is coupled to a specified one of the compressors (42).Type: GrantFiled: November 15, 2000Date of Patent: August 26, 2003Assignee: Texas Instruments IncorporatedInventors: Carl E. Lemonds, Alan Gatherer
-
Publication number: 20030093454Abstract: A Wallace tree structure such as that used in a DSP is arranged to sum vectors. The structure has a number of adder stages (365, 370, 375), each of which may have half adders (300) with two input nodes, and full adders (310) with three input nodes. The structure is designed with reference to the vectors to be summed. The number of full- and half-adders in each stage and the arrangement of vector inputs depends upon their characteristics. An algorithm calculates the possible tree structures and input arrangements, and selects an optimum design having a small final stage ripple adder (380), the design being based upon the characteristics of the vector inputs. This leads to reduced propagation delay and a reduced amount of semiconductor material for implementation of the DSP.Type: ApplicationFiled: October 29, 2002Publication date: May 15, 2003Inventors: Alain Combes, Franz Steininger
-
Patent number: 6535901Abstract: A method and apparatus for generating a fast multiply accumulation circuit includes processing that begins by determining number of current partial products for a multiplication of a first multiplicand and a second multiplicand. The processing then continues by determining size of the current partial products. The processing then continues by identifying one of a plurality of reduction patterns based on the size of the current partial products. The processing then continues by determining number of, and configuration of, full adders and half adders required for a reduction function of the current partial products based on the one of the plurality of reduction patterns and the size of the current partial products, wherein the multiply-accumulator performs the reduction function.Type: GrantFiled: April 26, 2000Date of Patent: March 18, 2003Assignee: Sigmatel, Inc.Inventor: Robert T Grisamore
-
Patent number: 6535902Abstract: A multiplier circuit has an encoder and a partial product bit generating circuit. The encoder receives a multiplier bit signal arid is used to output a plurality of encode signals. The partial product bit generating circuit receives the encode signals along with a multiplicand bit signal from each digit place and is used to generate a partial product bit for each digit place. The partial product bit generating circuit has a first selection circuit which is used to select a logically true signal from among the encode signals in accordance with a value of the multiplicand bit signal. Therefore, the circuit can be reduced in size by reducing the number of necessary elements without sacrificing its high speed capability.Type: GrantFiled: April 2, 2001Date of Patent: March 18, 2003Assignee: Fujitsu LimitedInventor: Gensuke Goto
-
Patent number: 6484193Abstract: A fully pipelined parallel multiplier with a fast clock cycle. The pipelined parallel multiplier contains three units: a bit-product matrix unit, a reduction unit, and an addition unit. The bit-product matrix is configured to receive two binary numbers, a multiplier and a multiplicand. A bit-product matrix is formed based on these two numbers. The bit-product matrix unit forms a first pipeline stage. The bit-product matrix is latched to the reduction unit using d-type latch circuits. The reduction unit includes a plurality of reduction stages, with each reduction stage acting as a pipeline stage. The reduction unit reduces the matrix down to a two-row matrix. Intermediate results are latched from one stage to the next using d-type latch circuits. The reduction unit also contains a plurality of half-adder and full-adder circuits. The final two-row matrix formed by the reduction unit is then latched to an addition unit.Type: GrantFiled: July 30, 1999Date of Patent: November 19, 2002Assignee: Advanced Micro Devices, Inc.Inventors: Gwangwoo Johnny Choe, James R. MacDonald
-
Patent number: 6446104Abstract: A double-precision multiplier for use in the floating point pipeline of a processor has an array multiplier and a carry-save partial-product accumulator. Double precision multiplication is accomplished by generating a plurality of partial products and summing these in the carry-save partial-product accumulator. The partial-product accumulator has a carry-save adder, a sum register, a carry-out counter and an extender. The carry-out counter receives a carry outputs of the carry-save adder and array multiplier, and the extender is coupled to extend the sum register dependent upon the contents of the carry-out counter. The extension occurs during addition of the most significant partial product to the sum of less significant partial products.Type: GrantFiled: September 15, 1999Date of Patent: September 3, 2002Assignee: Sun Microsystems, Inc.Inventors: Tzungren Allan Tzeng, Choon Ping Chng
-
Patent number: 6442582Abstract: A multiplier carry bit compression apparatus and method for a multiplier using Wallace tree addition structures uses a plurality of early and late carry bit compression operations for each level of the Wallace tree addition structure. For each level in a Wallace tree addition structure, each early carry bit compression operation compresses early compression bits prior to each corresponding late carry bit compression operation that compresses late carry bits.Type: GrantFiled: June 17, 1999Date of Patent: August 27, 2002Assignee: ATI International SRLInventor: Stephen C. Hale
-
Publication number: 20020116433Abstract: A multiply-accumulate module (100) includes a multiply-accumulate core (120), which includes a plurality of Booth encoder cells (104a). The multiply-accumulate core (120) also includes a plurality of Booth decoder cells (110a) connected to at least one of the Booth encoder cells (104a) and a plurality of Wallace tree cells (112a) connected to at least one of the Booth decoder cells (110a). Moreover, at least one first Wallace tree cell (112a1) or at least one first Booth decoder cell (110a1), or any combination thereof, includes a first plurality of transistors, and at least one second Wallace tree cell (112a2) or at least one second Booth decoder cell (110a2), or any combination thereof, includes a second plurality of transistors. In addition, at least one critical path of the multiply-accumulate module (100) includes the at least one first cell and a width of at least one of the first plurality of transistors is greater than a width of at least one of the second plurality of transistors.Type: ApplicationFiled: September 27, 2001Publication date: August 22, 2002Inventors: Kaoru Awaka, Hiroshi Takahashi, Shigetoshi Muramatsu, Akihiro Takegama