Carry-Save Adders (i.e., CSAs) Patents (Class 708/629)

Patent number: 10042605
Abstract: A processor includes a carry save array multiplier. The carry save array multiplier includes an array of cascaded partial product generators. The array of cascaded partial product generators is configured to generate an output value as a product of two operands presented at inputs of the multiplier. The array of cascaded partial product generators is also configured to generate an output value as a sum of two operands presented at inputs of the multiplier.
Type: Grant
Filed: April 19, 2016
Date of Patent: August 7, 2018
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Christian Wiencke, Armin Stingl

Patent number: 9632751
Abstract: According to one embodiment, an arithmetic circuit includes an arithmetic unit, a rounding preprocessor, a register, and a rounding postprocessor. The arithmetic unit performs an arithmetic operation including addition and multiplication to generate a first value of (n+m) bits. The rounding preprocessor performs an OR operation on lower (m-k) bits of the first value to generate a second value of 1 bit. The register stores a third value of (n+k+1) bits obtained by concatenating upper (n+k) bits of the first value and the second value. The rounding postprocessor calculates a carry bit value of 1 bit from a most significant bit of the third value and lower (k+1) bits of the third value, and adds the carry bit value to upper n bits of the third value.
Type: Grant
Filed: December 24, 2013
Date of Patent: April 25, 2017
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventor: Koichiro Ban

Patent number: 8996601
Abstract: The disclosed embodiments relate to apparatus for accurately, efficiently and quickly executing a multiplication instruction. The disclosed embodiments can provide a multiplier module having an optimized layout that can help speed up computation of a result during a multiply operation so that cycle delay can be reduced and so that power consumption can be reduced.
Type: Grant
Filed: June 21, 2012
Date of Patent: March 31, 2015
Assignee: Advanced Micro Devices, Inc.
Inventors: Scott A. Hilker, George Q. Phan

Patent number: 8918446
Abstract: Methods and apparatus relating to reducing power consumption in multi-precision floating point multipliers are described. In an embodiment, certain portions of a multiplier are disabled in response to two or more multiplication operations with the same data size and data type occurring back-to-back. Other embodiments are also claimed and described.
Type: Grant
Filed: December 14, 2010
Date of Patent: December 23, 2014
Assignee: Intel Corporation
Inventors: Brent R. Boswell, Thierry Pons, Tom Aviram

Patent number: 8868634
Abstract: A method and apparatus are described for performing multiplication in a processor to generate a product. In one embodiment, a 64-bit multiplier and a 64-bit multiplicand may be multiplied together over four cycles by merging different partial product (PP) subsets, generated by a Booth encoder and a PP generator, with feedback sum and carry results. The logic inputs of a plurality of multiplexers may be selected on a cyclical basis to efficiently compress (i.e., merge) each PP subset with feedback sum and carry results. A pair of preliminary sum results stored during one cycle may be outputted during a subsequent cycle and processed by a logic gate (e.g., an XOR gate) to generate a feedback sum result that is merged with a feedback carry result and a PP subset. Final sum and carry results may be added to generate the product of the multiplier and the multiplicand.
Type: Grant
Filed: December 2, 2011
Date of Patent: October 21, 2014
Assignee: Advanced Micro Devices, Inc.
Inventors: Srikanth Arekapudi, Sudherssen Kalaiselvan
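
As a rough illustration of what a Booth encoder and partial product generator produce, the sketch below uses radix-4 (modified) Booth recoding. The radix, the function names, and the simple summation loop are assumptions for illustration; the abstract does not state the radix, and the four-cycle merging with feedback sum and carry is not modeled.

```python
def booth_radix4_digits(y, bits):
    """Recode a signed two's-complement multiplier of even width `bits`
    into digits in {-2, -1, 0, 1, 2}, least significant digit first."""
    digits = []
    prev = 0                          # implicit bit to the right of bit 0
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x, y, bits):
    """Sum the Booth partial products digit * x * 4**j."""
    return sum((d * x) << (2 * j)
               for j, d in enumerate(booth_radix4_digits(y, bits)))
```

Radix-4 recoding halves the number of partial products relative to one per multiplier bit, which is why it is commonly paired with a carry-save reduction stage.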

Patent number: 8838664
Abstract: The disclosed embodiments relate to methods and apparatus for accurately, efficiently and quickly executing a fused multiply-and-accumulate instruction with respect to floating-point operands that have packed-single-precision format. The disclosed embodiments can speed up computation of a high-part of a result during a fused multiply-and-accumulate operation so that cycle delay can be reduced and so that power consumption can be reduced.
Type: Grant
Filed: June 29, 2011
Date of Patent: September 16, 2014
Assignee: Advanced Micro Devices, Inc.
Inventors: David Oliver, Debjit Dassarma, Hanbing Liu, Scott Hilker

Publication number: 20140164457
Abstract: An extensible iterative multiplier design is provided. Embodiments provide cascaded 8-bit multipliers for simplifying the performance of multi-byte multiplications. Booth encoding is performed in the lowest order multiplier, with the result of the Booth encoding then provided to higher order multipliers. Additionally, multiply-add operations can be performed by initializing a partial product sum register. Configurable connections between the multipliers facilitate a variety of possible multiplication options, including the possibility of varying the width of the operands.
Type: Application
Filed: December 7, 2013
Publication date: June 12, 2014
Applicant: Wave Semiconductor, Inc.
Inventors: Samit Chaudhuri, Radoslav Danilak

Patent number: 8667040
Abstract: An apparatus having operand registers, an opcode detector, a carryless preformat unit, a compressor, a left shifter, and exclusive-OR logic. The operand registers receive operands for a carryless multiplication. The opcode detector receives a carryless multiplication instruction, and asserts a carryless signal. The carryless preformat unit partitions a first operand into a plurality of parts that are such that a Booth encoder is precluded from selection of second partial products of a second operand, where the second partial products reflect implicit carry operations. The compressor sums first partial products of the second operand via carry save adders arranged in a Wallace tree configuration, where generation of carry bits is disabled. The left shifter shifts one or more outputs of the compressor. The exclusive-OR logic executes an exclusive-OR function to yield a carryless multiplication result.
Type: Grant
Filed: December 3, 2010
Date of Patent: March 4, 2014
Assignee: VIA Technologies, Inc.
Inventor: Timothy A. Elliott
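
The idea of reusing carry save adders for carryless multiplication by disabling carry generation can be sketched as follows. This is a simplified model, not the patented datapath: it uses plain shifted partial products rather than Booth-selected ones, chains the adders linearly rather than in a Wallace tree, and the names `csa` and `multiply` are hypothetical.

```python
def csa(a, b, c, carries=True):
    """Word-level 3:2 carry save adder: returns (sum word, carry word)."""
    s = a ^ b ^ c
    cy = (((a & b) | (a & c) | (b & c)) << 1) if carries else 0
    return s, cy


def multiply(x, y, carryless=False):
    """Reduce the partial products of x * y through a chain of CSAs."""
    s, c = 0, 0
    for i in range(y.bit_length()):
        if (y >> i) & 1:
            s, c = csa(s, c, x << i, carries=not carryless)
    # Integer mode needs a carry-propagate add; carryless mode only an XOR.
    return (s ^ c) if carryless else (s + c)
```

With carries enabled, each step preserves the invariant that `s + c` equals the sum of the partial products merged so far. With carries disabled, the carry word stays zero and the sum word is the XOR of the partial products, which is the carryless (GF(2) polynomial) product.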

Patent number: 8661072
Abstract: A shared parallel adder tree for executing multiple different population count operations on a single datum includes a number of carry-save adders (CSAs) and/or half adders (HAs), arranged in rows, where certain CSAs and HAs are dedicated to a single population count operation, while other CSAs and HAs are shared among two or more population count operations. The datum is applied to the first row in the tree. Partial sums of the number of ones at various locations within the tree are routed to certain CSAs and/or HAs “down” the tree to propagate the particular population count operations. Carry-propagate adders generate at least a portion of the final sum of the number of ones in certain population count operations. An “AND” operation on a particular number of the bits in the datum provides the high order bit of the resulting sum of the particular population count operation.
Type: Grant
Filed: August 19, 2008
Date of Patent: February 25, 2014
Assignee: International Business Machines Corporation
Inventors: Bartholomew Blaner, Todd R. Iglehart, Robert K. Montoye
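
A population count built from full adders (1-bit CSAs) and half adders can be sketched as a column-wise reduction. This is a minimal sketch, not the shared tree in the abstract: it computes a single count, reduces every column down to one bit instead of finishing with a carry-propagate adder, and all names are hypothetical.

```python
def full_adder(a, b, c):
    """1-bit carry-save adder: three bits in, (sum, carry) out."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def half_adder(a, b):
    return a ^ b, a & b


def popcount_tree(bits):
    """Count the ones in `bits` using only full and half adders."""
    columns = {0: list(bits)}     # weight -> bits waiting at that weight
    result, w = 0, 0
    while w in columns:
        col = columns[w]
        while len(col) > 1:
            if len(col) >= 3:
                s, cy = full_adder(col.pop(), col.pop(), col.pop())
            else:
                s, cy = half_adder(col.pop(), col.pop())
            col.append(s)                             # sum stays at weight w
            columns.setdefault(w + 1, []).append(cy)  # carry moves to weight w+1
        if col:
            result |= col[0] << w
        w += 1
    return result
```

For an 8-bit datum the count reaches 8 only when every bit is 1, so the top result bit equals the AND of all eight input bits, which matches the shortcut mentioned in the abstract.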

Patent number: 8645448
Abstract: An apparatus having a carryless preformat unit, a Booth encoder, a compressor, a left shifter, and exclusive-OR logic. The carryless preformat unit receives a multiplier operand and partitions the multiplier operand into parts. The Booth encoder receives the parts and directs selection of first partial products of a multiplicand that do not reflect implicit carry operations. The compressor sums the first partial products via a configuration of carry save adders that generate sum bits and carry bits, where generation of the carry bits is disabled during execution of the carryless multiplication. The left shifter shifts bits of one or more outputs of the compressor. The exclusive-OR logic is coupled to the compressor and the left shifter, and is configured to execute an exclusive-OR function on the outputs to yield a carryless multiplication result.
Type: Grant
Filed: December 3, 2010
Date of Patent: February 4, 2014
Assignee: VIA Technologies, Inc.
Inventor: Timothy A. Elliott

Patent number: 8606842
Abstract: Provided are N-digit addition and subtraction units and N-digit addition and subtraction modules in which borrowing and carrying are not propagated in modules having basic digits. In the units and modules, an output pattern of results of addition and subtraction is predicted based on a relation between an augend and an addend and a relation between a minuend and a subtrahend, respectively, thereby preventing borrowing and carrying from being propagated in modules having basic digits.
Type: Grant
Filed: August 21, 2008
Date of Patent: December 10, 2013
Assignee: Tokyo Denki University
Inventors: Hiroshi Kasahara, Tsugio Nakamura, Jin Sato

Patent number: 8601048
Abstract: Herein described is a method and system of implementing integrated circuit logic modules that provide maximum efficiency and minimum energy dissipation. In a representative embodiment, a method of implementing one or more digital signal processing functions comprises determining one or more parameters associated with generating an optimal logic module. The one or more parameters may comprise the circuit area of the logic module and the processing time through a critical path of the logic module. In a representative embodiment, the system comprises a logic module that utilizes four full adders arranged in a tree configuration. In a representative embodiment, the logic module comprises a carry-save accumulator that provides maximum efficiency and minimal energy dissipation.
Type: Grant
Filed: January 5, 2005
Date of Patent: December 3, 2013
Assignee: Broadcom Corporation
Inventor: Christian Lutkemeyer

Publication number: 20130031154
Abstract: A self-timed multiplier unit includes a multiplier and a clock generator. The multiplier has a first set of semiconductor circuits in a critical path. The clock generator has a second set of semiconductor circuits configured to control a clock period of said clock generator selected to set a clock period longer than the propagation delay through the critical path of the multiplier. The clock generator may include a delay circuit having a delay to set the clock period longer than the propagation delay through the critical path of said multiplier. The clock generator uses circuits with identical logical design including the same standard cells, the same logic design or the same floor plan. Close matching of these circuits causes the multiplier and the clock generator to experience the same PVT speed variations.
Type: Application
Filed: July 27, 2012
Publication date: January 31, 2013
Applicant: TEXAS INSTRUMENTS DEUTSCHLAND GMBH
Inventors: Christian Wiencke, Horst Diewald

Patent number: 8364741
Abstract: A multiplier includes an operation unit that adds or subtracts a first group selected from a current input data, and a second group selected from a next input data corresponding to the first group to generate an operation result, a Booth's encoder that encodes the operation result according to Booth's algorithm, and generates code data, a partial product generation unit that calculates a partial product from the code data as a first partial product, and calculates, in a case where the first group and the second group are a specific combination, a second partial product, and an adder that cumulatively adds an output from the partial product generation unit. The specific combination is a combination in which the highest-order bit of each of the first group and the second group is the same value, and the third least significant bit obtained after the subtraction operation is 1.
Type: Grant
Filed: February 25, 2009
Date of Patent: January 29, 2013
Assignee: Renesas Electronics Corporation
Inventor: Yoichi Katayama

Patent number: 8352533
Abstract: There is provided a semiconductor integrated circuit including: a plurality of first logic blocks which are reconfigurable, the plurality of first logic blocks inputting data of a first bit width and performing computation; a first network connecting the plurality of first logic blocks in a dynamically reconfigurable manner; a plurality of second logic blocks inputting data of a second bit width different from the first bit width and performing computation; a second network connected to outputs of the plurality of second logic blocks; and a third network connecting a carry bit output of a computing unit included in the first logic block to an input of a computing unit included in the second logic block in a dynamically reconfigurable manner.
Type: Grant
Filed: December 11, 2008
Date of Patent: January 8, 2013
Assignee: Fujitsu Semiconductor Limited
Inventor: Hiroshi Furukawa

Patent number: 8275822
Abstract: Multiplication engines and multiplication methods are provided. A multiplication engine for a digital processor includes a first multiplier to generate unequally weighted partial products from input operands in a first multiplier mode; a second multiplier to generate equally weighted partial products from input operands in a second multiplier mode; a multiplexer to select the unequally weighted partial products in the first multiplier mode and to select the equally weighted partial products in the second multiplier mode; and a carry save adder array configured to combine the selected partial products in the first multiplier mode and in the second multiplier mode.
Type: Grant
Filed: January 10, 2008
Date of Patent: September 25, 2012
Assignee: Analog Devices, Inc.
Inventors: Andreas D. Olofsson, Baruch Yanovitch

Patent number: 8244790
Abstract: A multiplier circuit is disclosed including a Wallace tree block and a carry propagation adder. The Wallace tree block includes a sum calculation block adding partial products for each digit and a carry calculation block adding carries obtained in the addition by the sum calculation block. In the case of multiplication over an extension field (finite field GF(2^n)) of two, a result of calculation by the sum calculation block is outputted. The carry propagation adder adds the result of calculation by the sum calculation block and a result of calculation by the carry calculation block. In the case of multiplication for integers (finite field GF(p)), a result of calculation by the carry propagation adder is outputted.
Type: Grant
Filed: January 21, 2004
Date of Patent: August 14, 2012
Assignee: International Business Machines Corporation
Inventors: Akashi Satoh, Kohji Takano

Patent number: 8099450
Abstract: Combining circuitry for combining a plurality of multi-bit partial product terms in a multiplier circuit includes a plurality of compression columns, each column receiving a plurality of partial product term bits. At least one compression column includes: a first circuit arranged to receive a first set of the plurality of partial product term bits for the at least one compression column, the first circuit further arranged to combine the first set of term bits to produce a first combined term bit set; and a second circuit arranged to receive a second set of the plurality of term bits for the at least one compression column and all of the first combined term bit set.
Type: Grant
Filed: July 20, 2006
Date of Patent: January 17, 2012
Assignee: STMicroelectronics (Research & Development) Ltd.
Inventor: Tariq Kurd

Patent number: 8019805
Abstract: A floating point multiplier circuit includes partial product generation logic configured to generate a plurality of partial products from multiplicand and multiplier values. The plurality of partial products corresponds to a first and second portion of the multiplier value during respective first and second partial product execution phases. The multiplier also includes a plurality of carry save adders configured to accumulate the plurality of partial products generated during the first and second partial product execution phases into a redundant product during respective first and second carry save adder execution phases. The multiplier further includes a first carry propagate adder coupled to the plurality of carry save adders and configured to reduce a first and second portion of the redundant product to a multiplicative product during respective first and second carry propagate adder phases. The first carry propagate adder phase begins after the second carry save adder execution phase completes.
Type: Grant
Filed: December 9, 2003
Date of Patent: September 13, 2011
Assignee: GLOBALFOUNDRIES Inc.
Inventor: Debjit Das Sarma

Publication number: 20100235414
Abstract: A Montgomery multiplication device calculates a Montgomery product of an operand X and an operand Y with respect to a modulus M and includes a plurality of processing elements. In a first clock cycle, two intermediate partial sums are created by obtaining an input of length w-1 from a preceding processing element as w-1 least significant bits. The most significant bit is configured as either zero or one. Then, two partial sums are calculated using a word of the operand Y, a word of the modulus M, a bit of the operand X, and the two intermediate partial sums. In a second clock cycle, a selection bit is obtained from a subsequent processing element and one of the two partial sums is selected based on the value of the selection bit. Then, the selected partial sum is used for calculation of a word of the Montgomery product.
Type: Application
Filed: March 1, 2010
Publication date: September 16, 2010
Inventors: Miaoqing Huang, Krzysztof Gaj
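
For orientation, the quantity this device computes can be sketched with the textbook bit-serial (radix-2) Montgomery product. This is a reference model only: it does not model the word-wise processing elements or the two speculative partial sums described in the abstract, and the function name is hypothetical.

```python
def montgomery_product(x, y, m, n):
    """Return x * y * 2**(-n) mod m, for odd m < 2**n and 0 <= x, y < m."""
    s = 0
    for i in range(n):
        s += ((x >> i) & 1) * y   # add Y when bit i of X is set
        if s & 1:                 # make the partial sum even ...
            s += m                # ... by adding the odd modulus
        s >>= 1                   # exact division by 2
    return s - m if s >= m else s  # one conditional final subtraction
```

Each iteration keeps the partial sum below 2m, so a single conditional subtraction at the end is enough. Multiplying the result by 2**n modulo m recovers x * y mod m.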

Patent number: 7797366
Abstract: Techniques for the design and use of a digital signal processor, including processing transmissions in a communications (e.g., code division multiple access) system. Power-efficient sign extension for Booth multiplication processes involves applying a sign bit in a Booth multiplication tree. The sign bit allows the Booth multiplication process to perform a sign extension step. This further involves one-extending a predetermined partial product row of the Booth multiplication tree using a sign bit for preserving the correct sign of the predetermined partial product row. The process and system resolve the signal value of the sign bit by generating a sign-extension bit in the Booth multiplication tree. The sign-extension bit is positioned in a carry-out column to extend the product of the Booth multiplication process.
Type: Grant
Filed: February 15, 2006
Date of Patent: September 14, 2010
Assignee: QUALCOMM Incorporated
Inventors: Shankar Krithivasan, Christopher Edward Koob, William C. Anderson

Publication number: 20100057824
Abstract: A computer system for computing a binary operation involving a first term multiplied by a second term resulting in a product, where the product is conditionally added to a third term in a central processing unit. The central processing unit includes a carry save adder configured to add a plurality of partial products obtained from the product of the first term and the second term to obtain a first partial result and a second partial result, a multiplexer configured to output one selected from the group consisting of the second term, the third term, and zero, and an alignment shifter configured to shift an output of the multiplexer to align the output of the multiplexer with the first partial result and the second partial result to obtain a shifted term. The shifted term, the first partial result and the second partial result are added together to obtain a result of the binary operation.
Type: Application
Filed: September 3, 2008
Publication date: March 4, 2010
Applicant: SUN MICROSYSTEMS, INC.
Inventor: Leonard D. Rarick

Publication number: 20080243976
Abstract: The present invention relates to a multiply apparatus and a method for multiplying a first operand consisting of na bits and a second operand consisting of nx bits. In one embodiment, the multiply apparatus comprises a carry-save adder (CSA) unit with nx rows each comprising na AND gates for calculating a single bit product of two single bit input values and adder cells for adding results of a preceding row to a following row and a last output row for outputting a carry vector and a sum vector, and logic circuitry for selectively inverting the single bit products at the most significant position of the nx-1 first rows and at the na-1 least significant positions of the output row in response to a first configuration signal before inputting the selectively inverted single bit products to respective adder cells for switching the CSA unit selectively between processing of signed two's complement operands and unsigned operands in response to the first configuration signal.
Type: Application
Filed: March 28, 2008
Publication date: October 2, 2008
Applicant: TEXAS INSTRUMENTS DEUTSCHLAND GMBH
Inventor: Christian Wiencke

Patent number: 7424506
Abstract: A method is presented comprising analyzing two or more input terms on a per-bit basis within each level of bit-significance. Maximally segmenting each of the levels of bit-significance into one or more one, two, and/or three-bit groups, and designing a hyper-pipelined hybrid Wallace tree adder utilizing one or more full-adders, half-adders, and associated register based, at least in part, on the maximal segmentation of the input terms.
Type: Grant
Filed: March 31, 2001
Date of Patent: September 9, 2008
Assignee: Durham Logistics LLC
Inventor: John T. Orchard

Patent number: 7392277
Abstract: A cascaded differential domino four-to-two reducer. In an embodiment, the four-to-two reducer is constructed of a first three-to-two reducer and a second three-to-two reducer directly connected to the first three-to-two reducer. In a further embodiment, the first and second three-to-two reducer both include a symmetric carry generate gate.
Type: Grant
Filed: June 29, 2001
Date of Patent: June 24, 2008
Assignee: Intel Corporation
Inventor: Thomas D. Fletcher
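
At the arithmetic level, a four-to-two reducer made of two cascaded three-to-two reducers can be sketched as below. The sketch is word-level and purely functional; it says nothing about the differential domino circuit style or the symmetric carry generate gate, and the function names are hypothetical.

```python
def reduce_3to2(a, b, c):
    """Three-to-two reducer (carry save adder) on whole words."""
    return a ^ b ^ c, ((a & b) | (a & c) | (b & c)) << 1


def reduce_4to2(a, b, c, d):
    """Four-to-two reducer: the second 3:2 stage is fed directly by the first."""
    s1, c1 = reduce_3to2(a, b, c)
    return reduce_3to2(s1, c1, d)
```

The two outputs always sum to `a + b + c + d`, so only one carry-propagate addition is needed after any number of such reduction stages.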

Patent number: 7373368
Abstract: A multiply execution unit that can generate the integer product of a multiplicand and a multiplier and is also operable to generate the XOR product of the multiplicand and the multiplier. The multiply execution unit includes a summing circuit for summing a plurality of partial products. The summing circuit includes a plurality of rows. The summing circuit can generate an integer sum of the plurality of partial products and can generate an XOR sum of the plurality of partial products. The summing circuit includes a plurality of compressors in the first row of the summing circuit. The plurality of compressors each has more than three inputs that receive data, a carry output, and a sum output.
Type: Grant
Filed: July 15, 2004
Date of Patent: May 13, 2008
Assignee: Sun Microsystems, Inc.
Inventors: Leonard D. Rarick, ShuChin Tai

Patent number: 7334200
Abstract: A low-error fixed-width multiplier receives a W-bit input and produces a W-bit product. In an embodiment, a multiplier (Y) is encoded using modified Booth coding. The encoded multiplier (Y) and a multiplicand (X) are processed together to generate partial products. The partial products are accumulated to generate a product (P). To compensate for the quantization error, Booth encoder outputs are used for the generation of error compensation bias. The truncated bits are divided into two groups, a major least significant bit group and a minor least significant bit group, depending upon their effects on the quantization error. Different error compensation methods are applied to each group.
Type: Grant
Filed: February 22, 2005
Date of Patent: February 19, 2008
Assignee: Broadcom Corporation
Inventors: Keshab K. Parhi, JinGyun Chung, KwangCheol Lee, KyungJu Cho

Patent number: 7313585
Abstract: A multiplier circuit is disclosed for multiplying a multiplicand by a multiplier. The multiplier circuit includes a partial product generator and a partial product adder. The partial product generator includes a first input to receive a multiplicand; a second input to receive a multiplier; partial product generation means for producing a plurality of partial products based on the multiplicand and the multiplier; and an output coupled to the partial product generation means to provide the plurality of partial products. The partial product adder includes an input coupled to the output of the partial product generator; a plurality of adders to add the plurality of partial products to produce a final product, the plurality of adders comprising a plurality of compressors having substantially the same width; and an output coupled to the plurality of adders to provide the final product.
Type: Grant
Filed: August 30, 2003
Date of Patent: December 25, 2007
Assignee: Hewlett-Packard Development Company, L.P.
Inventor: Paul W. Winterrowd

Patent number: 7212959
Abstract: A method and apparatus for accumulating arbitrary length strings of input values, such as floating point values, in a layered tree structure such that the order of adds at each layer is maintained. The accumulating utilizes a shared adder, and includes means for directing initial inputs and intermediate result values.
Type: Grant
Filed: August 8, 2001
Date of Patent: May 1, 2007
Inventors: Stephen Clark Purcell, Scott Kimura, Mark L. Wood Patrick

Patent number: 7124162
Abstract: A Wallace tree structure such as that used in a digital signal processor (DSP) is arranged to sum vectors. The structure has a number of adder stages, each of which may have half adders with two input nodes, and full adders with three input nodes. The structure is designed with reference to the vectors to be summed. The number of full and half-adders in each stage and the arrangement of vector inputs depends upon their characteristics. An algorithm calculates the possible tree structures and input arrangements, and selects an optimum design having a small final stage ripple adder after the last stage of the Wallace tree structure, the design being based upon the characteristics of the vector inputs. This leads to reduced propagation delay and a reduced amount of semiconductor material for implementation of the DSP.
Type: Grant
Filed: October 29, 2002
Date of Patent: October 17, 2006
Assignee: Freescale Semiconductor, Inc.
Inventors: Alain Combes, Franz Steininger

Patent number: 7111166
Abstract: An extension of the serial/parallel Montgomery modular multiplication method with simultaneous reduction as previously implemented by the applicants, adapted innovatively to perform both in the prime number and in the GF(2^q) polynomial based number field, in such a way as to simplify the flow of operands, by performing a multiple anticipatory function to enhance the previous modular multiplication procedures.
Type: Grant
Filed: May 14, 2001
Date of Patent: September 19, 2006
Assignee: Fortress U&T Div. M-Systems Flash Disk Pioneers Ltd.
Inventors: Itai Dror, Carmi David Gressel, Michael Mostovoy, Alexey Molchanov

Patent number: 7111033
Abstract: A carry save adder circuit for reducing the number of inputs to a lower number of outputs, the carry save adder circuit including four carry save adders, the four carry save adders being arranged in two layers with the first and second carry save adders being arranged in a first of said layers and the third and fourth carry save adders being arranged in a second of the layers, said third and fourth carry save adders being arranged to provide the outputs, the third and fourth carry save adders each receiving at least one output from each of the first and second carry save adders and the first and second carry save adders being arranged to receive at least some of the inputs.
Type: Grant
Filed: July 30, 2001
Date of Patent: September 19, 2006
Assignee: STMicroelectronics S.A.
Inventor: Sebastien Ferroussat

Patent number: 7043520
Abstract: A partial carry-save format is employed for a finite impulse response filter output representation, thereby reducing a number of flip-flops and hence power. By replacing the least significant bit processing section on the output side of the finite impulse response filter with a combined carry-save adder and carry-propagate adder followed by a register rather than two flip-flops, the present invention reduces the load on the clock and achieves reduced propagation delay. To further improve the performance of the finite impulse response filter, a simpler carry-save adder is employed in the least significant bit section, which is possible due to the use of a single register at an input to each of the carry-save adders rather than two flip-flops, one for a carry output and one for a sum output from the adder.
Type: Grant
Filed: November 29, 2003
Date of Patent: May 9, 2006
Assignee: Agere Systems Inc.
Inventors: Patrik Larsson, Christopher John Nicol

Patent number: 6989843
Abstract: A sample-to-pixel calculation unit in a graphics system may comprise an adder tree. The adder tree includes a plurality of adder cells coupled in a tree configuration. Input values are presented to a first layer of adder cells. Each input value may have two associated control signals: a data valid signal and a winner-take-all signal. The final output of the adder tree equals (a) a sum of those input values whose data valid signals are asserted provided that none of the winner-take-all signals are asserted, or (b) a selected one of the input values if one of the winner-take-all bits is asserted. The selected input value is the one whose winner-take-all bit is set. The adder tree may be used to perform sums of weighted sample attributes and/or sums of coefficients values as part of pixel value computations.
Type: Grant
Filed: June 28, 2001
Date of Patent: January 24, 2006
Assignee: Sun Microsystems, Inc.
Inventors: N. David Naegle, Scott R. Nelson

Patent number: 6978426
Abstract: A low-error fixed-width multiplier receives a W-bit input and produces a W-bit product. In an embodiment, a multiplier (Y) is encoded using modified Booth coding. The encoded multiplier (Y) and a multiplicand (X) are processed together to generate partial products. The partial products are accumulated to generate a product (P). To compensate for the quantization error, Booth encoder outputs are used for the generation of error compensation bias. The truncated bits are divided into two groups, a major least significant bit group and a minor least significant bit group, depending upon their effects on the quantization error. Different error compensation methods are applied to each group.
Type: Grant
Filed: August 30, 2002
Date of Patent: December 20, 2005
Assignee: Broadcom Corporation
Inventors: Keshab K. Parhi, JinGyun Chung, KwangCheol Lee, KyungJu Cho

Patent number: 6973471
Abstract: A multiplier (42) forms a product from two signed operands without performing a sign extension of the multiplicand (A). A modified Booth's recoding of the multiplier operand (B) is begun immediately without being delayed by a sign extension operation. While recoding and partial product generation is occurring, a determination is made in parallel whether or not a sign extension adjustment term must be created. When needed, a value equal to (-B)(2^N), where N is equal to a bit width of the multiplicand (A), is formed in parallel with the recoding and partial product generation. The sign extension adjustment term is coupled to a plurality of carry save adders (49, 51, 53) that compress a plurality of partial products to a sum term and a carry term. A final add stage combines the sum term and carry term to provide a product with correct sign extension.
Type: Grant
Filed: February 22, 2002
Date of Patent: December 6, 2005
Assignee: Freescale Semiconductor, Inc.
Inventor: Trinh Huy Nguyen
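
The arithmetic behind the adjustment term can be checked with a short sketch. If an N-bit signed multiplicand A is used without sign extension, a negative A is read as A + 2^N, so the raw product is too large by B * 2^N and adding (-B)(2^N) corrects it. The function name is hypothetical, and the Booth recoding and carry save compression are replaced here by an ordinary multiplication.

```python
def product_without_sign_extension(a, b, n):
    """Multiply signed a (n-bit two's complement) by signed b,
    reading a's bit pattern as unsigned and correcting afterwards."""
    a_unsigned = a & ((1 << n) - 1)   # a + 2**n when a is negative
    product = a_unsigned * b          # too large by b * 2**n in that case
    if a < 0:
        product += (-b) << n          # sign extension adjustment term
    return product
```

The adjustment is needed only when A is negative, which matches the abstract's parallel determination of whether the term must be created.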

Patent number: 6763367
Abstract: An apparatus and method for compressing a reduction array into an accumulated carry-save sum. The reduction array includes a partial product matrix, a carry-save sum, and a constant value row. A compressor array generates a previous accumulated carry-save sum. A three-input/two-output carry-save adder pre-reduces the constant value row and the previously accumulated carry-save sum into a two-row intermediate carry-save sum that is added to the partial product matrix to form a current accumulated carry-save sum.
Type: Grant
Filed: December 11, 2000
Date of Patent: July 13, 2004
Assignee: International Business Machines Corporation
Inventors: Ohsang Kwon, Kevin J. Nowka

Patent number: 6732135 Abstract: In a digital processor performing division, quotient accumulation apparatus is formed of a set of muxes and a single carry save adder. Partial quotients are accumulated in carry-save form with proper sign extension. Delay of partial quotient bit fragments from one iteration to a following iteration enables the apparatus to limit use to one carry save adder. By enlarging minimal logic, the quotient accumulation apparatus operates at a rate fast enough to support the rate of fast dividers. Type: Grant Filed: January 31, 2000 Date of Patent: May 4, 2004 Assignee: Hewlett-Packard Development Company, L.P. Inventors: Sridhar Samudrala, John D. Clouser, William R. Grundmann

Patent number: 6721774 Abstract: A digital multiplier 110 for multiplying a plurality of multiplicand signals X0-X23 representing a multiplicand and a plurality of multiplier signals Y0-Y23 representing a multiplier. In it, a plurality of intermediate results signals, such as partial product signals, are generated from the multiplicand signals and the multiplier signals. A plurality of adder circuits 40 are also provided for adding the intermediate results signals to generate a plurality of final result signals representing the result of multiplying the multiplicand and the multiplier, wherein at least some of the adder circuits receive first signals representing intermediate addition results from at least two prior adder stages and also receive second signals representing intermediate results generated as the result of only a single addition. Type: Grant Filed: May 7, 1998 Date of Patent: April 13, 2004 Assignee: Texas Instruments Incorporated Inventors: Wai Lee, Toshiyuki Sakuta

Patent number: 6692534 Abstract: The present invention provides an apparatus for Booth decoding which stores the most significant bit of the lower half of the number used as the key for Booth decoding. By using this stored bit to determine the rightmost Booth group corresponding to the upper half of the key, Booth decoding may be accomplished more quickly using an apparatus that is simpler and smaller than prior art assemblies. Type: Grant Filed: September 8, 1999 Date of Patent: February 17, 2004 Assignee: Sun Microsystems, Inc. Inventors: Yong Wang, Allan Tzeng
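
Why the upper half needs one stored bit from the lower half can be seen in a small sketch of radix-4 Booth recoding. The abstract does not state the radix or the key width, so radix 4 and a 16-bit key are assumptions here, and the function names are hypothetical. Each Booth group overlaps its lower neighbour by one bit, so the rightmost group of the upper half reads the most significant bit of the lower half.

```python
def booth_digits(bits_value, width, bit_below=0):
    """Radix-4 Booth digits (each in -2..2), least significant digit first.

    bit_below is the bit immediately under the field: 0 for the bottom of a
    number, or the saved top bit of the lower half when recoding an upper half.
    """
    digits = []
    prev = bit_below
    for i in range(0, width, 2):
        b0 = (bits_value >> i) & 1
        b1 = (bits_value >> (i + 1)) & 1
        digits.append(prev + b0 - 2 * b1)
        prev = b1
    return digits


def split_booth_digits(key, width=16):
    """Recode a key in two halves, passing only one stored bit between them."""
    half = width // 2
    lower = key & ((1 << half) - 1)
    upper = key >> half
    saved_bit = (lower >> (half - 1)) & 1   # most significant bit of the lower half
    return booth_digits(lower, half) + booth_digits(upper, half, saved_bit)


if __name__ == "__main__":
    key = 0b1011_0110_1100_0101
    digits = split_booth_digits(key)
    assert digits == booth_digits(key, 16)
    # The digits encode the key as a signed 16-bit value.
    print(digits, sum(d * 4**i for i, d in enumerate(digits)))
```

With the bit saved, the two halves can be recoded at different times without re-reading the lower half, which is the property the abstract relies on.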

Patent number: 6615229 Abstract: The present invention relates to a new low-power, high performance multiplier circuit design, and more specifically to a partitioned multiplier implemented using a modified, symmetrical Wallace tree structure that enables the power to parts of the multiplier to be selectively turned on and off. A multiplier implemented using complementary pass-transistor logic (CPL) 3:2 carry save adders (CSAs) includes a left array with a first multiple of CPL CSAs, a right array with a second multiple of CPL CSAs, and a merge block coupled to the left array and the right array, such that the left and right arrays are not coupled to each other. The left and right arrays are configured to independently receive power, such that each array can be turned on and off without affecting the other array. The merge block includes a third multiple of CPL CSAs and the merge block can be configured to output a result value of a multiplication operation. Type: Grant Filed: June 29, 2000 Date of Patent: September 2, 2003 Assignee: Intel Corporation Inventors: Narsing Vijayrao, Chi Keung Lee, Kumar Sudarshan

Patent number: 6611857 Abstract: A multiplier (12) is disclosed that includes an encoder (36), a hierarchy of compressors (40, 42, 44, 50, 52, 60 and 70), a bit detector (130) and a switch (134). The encoder (36) is operable to receive a first and second encoder input. The compressors (40, 42, 44, 50, 52, 60 and 70) are coupled to the encoder (36). The compressors (40, 42, 44, 50, 52, 60 and 70) are operable to receive a first number of inputs and to generate a second number of outputs, with the second number being less than the first number. The bit detector (130) is operable to monitor the first encoder input to determine whether the first encoder input is in a reduced precision range (28). The bit detector (130) is also operable to deactivate a subset of the compressors (40 and 50) when the bit detector (130) determines that the first encoder input is in the reduced precision range (28). The switch (134) is coupled to a specified one of the compressors (42). Type: Grant Filed: November 15, 2000 Date of Patent: August 26, 2003 Assignee: Texas Instruments Incorporated Inventors: Carl E. Lemonds, Alan Gatherer

Publication number: 20030093454 Abstract: A Wallace tree structure such as that used in a DSP is arranged to sum vectors. The structure has a number of adder stages (365, 370, 375), each of which may have half adders (300) with two input nodes, and full adders (310) with three input nodes. The structure is designed with reference to the vectors to be summed. The number of full and half-adders in each stage and the arrangement of vector inputs depends upon their characteristics. An algorithm calculates the possible tree structures and input arrangements, and selects an optimum design having a small final stage ripple adder (380), the design being based upon the characteristics of the vector inputs. This leads to reduced propagation delay and a reduced amount of semiconductor material for implementation of the DSP. Type: Application Filed: October 29, 2002 Publication date: May 15, 2003 Inventors: Alain Combes, Franz Steininger

Patent number: 6535902 Abstract: A multiplier circuit has an encoder and a partial product bit generating circuit. The encoder receives a multiplier bit signal and is used to output a plurality of encode signals. The partial product bit generating circuit receives the encode signals along with a multiplicand bit signal from each digit place and is used to generate a partial product bit for each digit place. The partial product bit generating circuit has a first selection circuit which is used to select a logically true signal from among the encode signals in accordance with a value of the multiplicand bit signal. Therefore, the circuit can be reduced in size by reducing the number of necessary elements without sacrificing its high speed capability. Type: Grant Filed: April 2, 2001 Date of Patent: March 18, 2003 Assignee: Fujitsu Limited Inventor: Gensuke Goto

Patent number: 6535901 Abstract: A method and apparatus for generating a fast multiply accumulation circuit includes processing that begins by determining number of current partial products for a multiplication of a first multiplicand and a second multiplicand. The processing then continues by determining size of the current partial products. The processing then continues by identifying one of a plurality of reduction patterns based on the size of the current partial products. The processing then continues by determining number of, and configuration of, full adders and half adders required for a reduction function of the current partial products based on the one of the plurality of reduction patterns and the size of the current partial products, wherein the multiply-accumulator performs the reduction function. Type: Grant Filed: April 26, 2000 Date of Patent: March 18, 2003 Assignee: Sigmatel, Inc. Inventor: Robert T Grisamore

Patent number: 6484193 Abstract: A fully pipelined parallel multiplier with a fast clock cycle. The pipelined parallel multiplier contains three units: a bit-product matrix unit, a reduction unit, and an addition unit. The bit-product matrix is configured to receive two binary numbers, a multiplier and a multiplicand. A bit-product matrix is formed based on these two numbers. The bit-product matrix unit forms a first pipeline stage. The bit-product matrix is latched to the reduction unit using d-type latch circuits. The reduction unit includes a plurality of reduction stages, with each reduction stage acting as a pipeline stage. The reduction unit reduces the matrix down to a two-row matrix. Intermediate results are latched from one stage to the next using d-type latch circuits. The reduction unit also contains a plurality of half-adder and full-adder circuits. The final two-row matrix formed by the reduction unit is then latched to an addition unit. Type: Grant Filed: July 30, 1999 Date of Patent: November 19, 2002 Assignee: Advanced Micro Devices, Inc. Inventors: Gwangwoo Johnny Choe, James R. MacDonald
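
The three units named in the abstract above (bit-product matrix, reduction, addition) can be sketched as a software model. This is an illustration under stated assumptions, not the patented design: operands are unsigned, the grouping rule in `reduce_stage` is a simple Wallace-style choice made for this sketch, all function names are hypothetical, and the pipeline latches between stages are not modeled.

```python
def full_adder(a, b, c):
    """Three bits in, (sum, carry) out."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def half_adder(a, b):
    """Two bits in, (sum, carry) out."""
    return a ^ b, a & b


def bit_product_matrix(x, y, width):
    """Column k holds every bit product x_i AND y_j with i + j == k."""
    columns = [[] for _ in range(2 * width)]
    for i in range(width):
        for j in range(width):
            columns[i + j].append((x >> i) & (y >> j) & 1)
    return columns


def reduce_stage(columns):
    """One reduction stage: full adders on triples, a half adder on a
    leftover pair in any column that started the stage with more than two bits."""
    nxt = [[] for _ in range(len(columns) + 1)]
    for weight, bits in enumerate(columns):
        pos = 0
        while len(bits) - pos >= 3:
            s, c = full_adder(*bits[pos:pos + 3])
            nxt[weight].append(s)
            nxt[weight + 1].append(c)
            pos += 3
        if len(bits) > 2 and len(bits) - pos == 2:
            s, c = half_adder(bits[pos], bits[pos + 1])
            nxt[weight].append(s)
            nxt[weight + 1].append(c)
        else:
            nxt[weight].extend(bits[pos:])
    while nxt and not nxt[-1]:
        nxt.pop()          # drop unused top columns
    return nxt


def multiply(x, y, width):
    """Return (product, number_of_reduction_stages)."""
    columns = bit_product_matrix(x, y, width)
    stages = 0
    while any(len(bits) > 2 for bits in columns):
        columns = reduce_stage(columns)
        stages += 1
    # Final addition unit: add the two remaining rows.
    row_a = sum(bits[0] << w for w, bits in enumerate(columns) if len(bits) > 0)
    row_b = sum(bits[1] << w for w, bits in enumerate(columns) if len(bits) > 1)
    return row_a + row_b, stages


if __name__ == "__main__":
    product, stages = multiply(13, 11, 4)
    print(product, stages)
```

In a pipelined version, each call to `reduce_stage` would correspond to one clocked stage, with the column contents held in latches between calls.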

Patent number: 6446104 Abstract: A double-precision multiplier for use in the floating point pipeline of a processor has an array multiplier and a carry-save partial-product accumulator. Double precision multiplication is accomplished by generating a plurality of partial products and summing these in the carry-save partial-product accumulator. The partial-product accumulator has a carry-save adder, a sum register, a carry-out counter and an extender. The carry-out counter receives carry outputs of the carry-save adder and array multiplier, and the extender is coupled to extend the sum register dependent upon the contents of the carry-out counter. The extension occurs during addition of the most significant partial product to the sum of less significant partial products. Type: Grant Filed: September 15, 1999 Date of Patent: September 3, 2002 Assignee: Sun Microsystems, Inc. Inventors: Tzungren Allan Tzeng, Choon Ping Chng

Patent number: 6442582 Abstract: A multiplier carry bit compression apparatus and method for a multiplier using Wallace tree addition structures uses a plurality of early and late carry bit compression operations for each level of the Wallace tree addition structure. For each level in a Wallace tree addition structure, each early carry bit compression operation compresses early compression bits prior to each corresponding late carry bit compression operation that compresses late carry bits. Type: Grant Filed: June 17, 1999 Date of Patent: August 27, 2002 Assignee: ATI International SRL Inventor: Stephen C. Hale

Publication number: 20020116433 Abstract: A multiply-accumulate module (100) includes a multiply-accumulate core (120), which includes a plurality of Booth encoder cells (104a). The multiply-accumulate core (120) also includes a plurality of Booth decoder cells (110a) connected to at least one of the Booth encoder cells (104a) and a plurality of Wallace tree cells (112a) connected to at least one of the Booth decoder cells (110a). Moreover, at least one first Wallace tree cell (112a1) or at least one first Booth decoder cell (110a1), or any combination thereof, includes a first plurality of transistors, and at least one second Wallace tree cell (112a2) or at least one second Booth decoder cell (110a2), or any combination thereof, includes a second plurality of transistors. In addition, at least one critical path of the multiply-accumulate module (100) includes the at least one first cell and a width of at least one of the first plurality of transistors is greater than a width of at least one of the second plurality of transistors. Type: Application Filed: September 27, 2001 Publication date: August 22, 2002 Inventors: Kaoru Awaka, Hiroshi Takahashi, Shigetoshi Muramatsu, Akihiro Takegama

Patent number: 6434587 Abstract: An embodiment of the present invention is a mixed length encoding unit. The mixed length may be a 12/16 bits (12/16b) encoding algorithm within a multiply-accumulate (MAC). The mixed length encoding unit includes a 16b Booth encoder adapted to produce eight partial product vectors from sixteen bits of data. The 16b Booth encoder is coupled to a four stage Wallace Tree. During a first cycle of the invention, a multiplex system directs the eight partial products and an accumulation vector to a four stage Wallace Tree. During subsequent cycles, the multiplex system directs six partial product vectors, an accumulation vector, one carry-feedback input vector, and one sum-feedback input vector to the four stage Wallace Tree. Type: Grant Filed: June 14, 1999 Date of Patent: August 13, 2002 Assignee: Intel Corporation Inventors: Yuyun Liao, David Roberts