Multiplication Followed By Addition Patents (Class 708/501)
  • Patent number: 8463834
    Abstract: A floating point multiplier includes a data path in which a plurality of partial products are calculated and then reduced to a first partial product and a second partial product. Shift amount determining circuitry 100 analyzes the exponents of the input operands A and B as well as counting the leading zeros in the fractional portions of these operands to determine an amount of left shift or right shift to be applied by shifting circuitry 200, 202 within the multiplier data path. This shift amount is applied so as to align the partial products so that when they are added they will produce the result C without requiring this to be further shifted. Furthermore, shifting the partial products to the correct alignment in this way in advance of adding these partial products permits injection rounding combined with the adding of the partial products to be employed for cases including subnormal values.
    Type: Grant
    Filed: November 3, 2009
    Date of Patent: June 11, 2013
    Assignee: ARM Limited
    Inventor: David Raymond Lutz
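The injection rounding mentioned in the abstract above can be illustrated on plain integers. The following is a minimal sketch, not the patented datapath: it assumes non-negative integer partial products that are already aligned, and the function name `add_with_injection_rounding` and its parameters are hypothetical. The rounding constant is added in the same addition that sums the two partial products, so no separate rounding pass is needed.

```python
def add_with_injection_rounding(partial_a: int, partial_b: int, shift: int) -> int:
    """Add two aligned partial products and drop `shift` low bits,
    rounding to nearest with ties to even (illustrative only)."""
    if shift < 1:
        raise ValueError("shift must be at least 1")
    half = 1 << (shift - 1)               # the injected rounding constant
    total = partial_a + partial_b + half  # one addition performs sum and rounding
    result = total >> shift
    # If the low bits of the biased sum are all zero, the unbiased sum was
    # an exact tie, so force the result to the even neighbour.
    if total & ((1 << shift) - 1) == 0:
        result &= ~1
    return result
```

For example, partial products 6 and 4 with `shift=2` represent 10/4 = 2.5, which rounds to the even value 2.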
  • Patent number: 8447800
    Abstract: In a denormal support mode, the normalization circuit of a floating-point adder is used to normalize or denormalize the output of a floating-point multiplier. Each floating-point multiply instruction is speculatively converted to a multiply-add instruction, with the addend forced to zero. This preserves the value of the product, while normalizing or denormalizing the product using the floating-point adder's normalization circuit. When the operands to the multiply operation are available, they are inspected. If the operands will not generate an unnormal intermediate product or a denormal final product, the add operation is suppressed, such as by operand-forwarding. Additionally, each non-fused floating-point multiply-add instruction is replaced with a multiply-add instruction having a zero addend, and a floating-point add instruction having the addend of the original multiply-add instruction is inserted into the instruction stream.
    Type: Grant
    Filed: February 14, 2011
    Date of Patent: May 21, 2013
    Assignee: QUALCOMM Incorporated
    Inventors: Kenneth Alan Dockser, Pathik Sunil Lall
  • Publication number: 20130124592
    Abstract: Asynchronous arithmetic units including an asynchronous IEEE 754 compliant floating-point adder and an asynchronous floating point multiplier component. Arithmetic units optimized for lower power consumption and methods for optimization are disclosed.
    Type: Application
    Filed: October 30, 2012
    Publication date: May 16, 2013
    Applicant: CORNELL UNIVERSITY
    Inventors: Rajit Manohar, Basit R. Sheikh
  • Patent number: 8443032
    Abstract: A multiplication circuit generates a product of a matrix and a first scalar when in matrix mode and a product of a second scalar and a third scalar when in scalar mode. The multiplication circuit comprises a sub-product generator, an accumulator and an adder. The adder is configured to sum outputs of the accumulator to generate the product of the second scalar and the third scalar when in scalar mode. The sub-product generator generates sub-products of the matrix and the first scalar when in matrix mode and sub-products of the second scalar and the third scalar when in scalar mode. The accumulator is configured to generate the product of the matrix and the first scalar by providing save of the multiplication operation of the outputs from the sub-product generator.
    Type: Grant
    Filed: March 27, 2008
    Date of Patent: May 14, 2013
    Assignee: National Tsing Hua University
    Inventors: Chen Hsing Wang, Chieh Lin Chuang, Cheng Wen Wu
  • Patent number: 8443030
    Abstract: Floating point multiply-accumulate (FMAC) instructions are processed by a logic circuit. A register file stores operands for a FMAC instruction. A multiplier multiplies an operand S1 and an operand S2 from the register file to produce a resultant operand St. An adder adds two operands St and Sd (which is the result of a prior accumulation) to produce the result Sd of the FMAC instruction. A reorder buffer (ROB) stores and reorders entries corresponding to FMAC instructions, and a hazard-checking block detects whether the FMAC instruction contains a potential hazard. A selector selects an output value from the ROB. The operands St and Sd can be supplied via one of a plurality of paths based on a priority of the paths, and the priority for the paths is based in part on output from the hazard-checking block and contents of the ROB.
    Type: Grant
    Filed: March 10, 2008
    Date of Patent: May 14, 2013
    Assignee: Marvell International Ltd.
    Inventor: Hua Tang
  • Patent number: 8438208
    Abstract: A processor including instruction support for implementing large-operand multiplication may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include an instruction execution unit comprising a hardware multiplier datapath circuit, where the hardware multiplier datapath circuit is configured to multiply operands having a maximum number of bits M.
    Type: Grant
    Filed: June 19, 2009
    Date of Patent: May 7, 2013
    Assignee: Oracle America, Inc.
    Inventors: Christopher H. Olson, Jeffrey S. Brooks, Robert T. Golla, Paul J. Jordan
  • Patent number: 8429217
    Abstract: A mechanism for executing fixed point divide operations using a floating point multiply-add pipeline is provided. With the mechanism, the floating point execution unit in a processor is modified to include elements that may be used to perform fixed point divide operations. These additional elements include a leading zero counter, a leading one counter, an estimate table unit, and a state machine. The fixed point divide operands are converted to a floating point format and an estimate of the reciprocal of the divisor is generated using estimate tables. These values are used in multiple passes through the floating point unit for calculating estimates of the quotient and corresponding error values. The estimates of the quotient are based on previous estimates of the quotient in a prior pass through the floating point unit and a corresponding error value. The final quotient estimate is truncated.
    Type: Grant
    Filed: May 29, 2008
    Date of Patent: April 23, 2013
    Assignee: International Business Machines Corporation
    Inventor: Martin Stanley Schmookler
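The flow in the abstract above (reciprocal estimate, multiply-add refinement passes driven by an error value, final truncation) resembles Newton-Raphson division. The sketch below is an assumption-laden illustration rather than the patented mechanism: a linear approximation stands in for the estimate table, Python floats stand in for the floating point unit, operands are limited to magnitudes below 2**32, and the exact integer fix-up at the end is an addition of this sketch. The name `fixed_point_divide` is hypothetical.

```python
import math

def fixed_point_divide(dividend: int, divisor: int, passes: int = 4) -> int:
    """Integer quotient truncated toward zero, computed with
    multiply-add style refinement steps (illustrative only)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    if max(abs(dividend), abs(divisor)) >= 2**32:
        raise ValueError("sketch assumes operand magnitudes below 2**32")
    negative = (dividend < 0) != (divisor < 0)
    n, d = float(abs(dividend)), float(abs(divisor))

    # Stand-in for the estimate table: linear estimate of 1/mantissa,
    # where d = mantissa * 2**exponent and 0.5 <= mantissa < 1.
    mantissa, exponent = math.frexp(d)
    recip = math.ldexp(48.0 / 17.0 - (32.0 / 17.0) * mantissa, -exponent)

    # Each pass is two multiply-add shaped operations.
    for _ in range(passes):
        error = 1.0 - d * recip          # error of the current estimate
        recip = recip + recip * error    # refined reciprocal

    quotient_estimate = n * recip
    remainder = n - quotient_estimate * d            # corresponding error value
    quotient_estimate = quotient_estimate + remainder * recip

    quotient = int(quotient_estimate)                # truncate
    # Exact integer fix-up (added by this sketch) against float rounding.
    n_int, d_int = abs(dividend), abs(divisor)
    while quotient * d_int > n_int:
        quotient -= 1
    while (quotient + 1) * d_int <= n_int:
        quotient += 1
    return -quotient if negative else quotient
```

Calling `fixed_point_divide(100, 7)` yields 14, and `fixed_point_divide(-100, 7)` yields -14 because the quotient is truncated toward zero.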
  • Patent number: 8423600
    Abstract: An accumulating operator is applicable to a digital data processor to realize an output floating point number in response to a first floating point number and a second floating point number. The accumulating operator comprises a splitter dividing the first floating point number into a third floating point number and a compensation number, wherein an exponent of the third floating point number is equal to or greater than the exponent of the second floating point number; an accumulator electrically connected to the splitter for operating the second and third floating point numbers to realize a fourth floating point number; and a compensator electrically connected to the splitter and the accumulator for operating the fourth floating point number and the compensation number to realize the output floating point number. Via compensation, the precision of the floating point operation can be improved.
    Type: Grant
    Filed: January 10, 2005
    Date of Patent: April 16, 2013
    Assignee: Via Technologies, Inc.
    Inventor: Ko-Fang Wang
  • Publication number: 20130091190
    Abstract: A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least two of the data elements in this third packed data store the result of performing multiply-add operations on data elements in the first and second packed data.
    Type: Application
    Filed: October 1, 2012
    Publication date: April 11, 2013
    Applicant: INTEL CORPORATION
    Inventors: Alexander Peleg, Millind Mittal, Larry M. Mennemeier, Benny Eitan, Carole Dulong, Eiichi Kowashi, Wolf C. Witt
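One common form of packed multiply-add multiplies corresponding signed 16-bit elements and sums adjacent products into 32-bit results. The sketch below assumes that layout, which the abstract above does not spell out; the function name `packed_multiply_add` and the element widths are illustrative choices.

```python
def packed_multiply_add(first, second):
    """Multiply corresponding signed 16-bit elements and add each adjacent
    pair of products into one signed 32-bit result (illustrative only)."""
    if len(first) != len(second) or len(first) % 2 != 0:
        raise ValueError("inputs must have the same even length")
    for value in (*first, *second):
        if not -32768 <= value <= 32767:
            raise ValueError("elements must fit in signed 16 bits")

    results = []
    for i in range(0, len(first), 2):
        total = first[i] * second[i] + first[i + 1] * second[i + 1]
        # Wrap to signed 32 bits, as a fixed-width register would.
        results.append((total + 2**31) % 2**32 - 2**31)
    return results
```

With inputs `[1, 2, 3, 4]` and `[5, 6, 7, 8]` the result is `[17, 53]`, that is 1*5 + 2*6 and 3*7 + 4*8.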
  • Patent number: 8402075
    Abstract: A floating point unit includes a floating point adder to perform a floating point addition operation between first and second floating point numbers each having an exponent and a mantissa. The floating point unit also includes an alignment shifter that may calculate a shift value corresponding to a number of bit positions to shift the second mantissa such that the second exponent value is the same as the first exponent value. The alignment shifter may detect an overshift condition, in which the shift value is greater than or equal to a selected overshift threshold value. The selected overshift threshold value comprises a base 2 number in a range of overshift values including a minimum overshift threshold value and a maximum overshift threshold value, and has the largest number of consecutive zero bits beginning at the least significant bit.
    Type: Grant
    Filed: March 16, 2009
    Date of Patent: March 19, 2013
    Assignee: Advanced Micro Devices, Inc.
    Inventor: David S. Oliver
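The threshold selection described above can be sketched in a few lines. This is only an illustration under assumed values: the range bounds used in the usage note are hypothetical, as are the function names. The point of choosing a threshold with many trailing zeros is that the comparison then needs only the upper bits of the shift value.

```python
def trailing_zeros(value: int) -> int:
    """Number of consecutive zero bits starting at the least significant bit."""
    if value <= 0:
        raise ValueError("value must be positive")
    return (value & -value).bit_length() - 1

def select_overshift_threshold(min_threshold: int, max_threshold: int) -> int:
    """Pick the value in [min_threshold, max_threshold] with the most trailing zeros."""
    if not 0 < min_threshold <= max_threshold:
        raise ValueError("need 0 < min_threshold <= max_threshold")
    return max(range(min_threshold, max_threshold + 1), key=trailing_zeros)

def is_overshift(shift_value: int, threshold: int) -> bool:
    """Compare against the threshold using only the bits above its trailing zeros."""
    ignored = trailing_zeros(threshold)
    return (shift_value >> ignored) >= (threshold >> ignored)
```

For a hypothetical range of 54 to 70, `select_overshift_threshold(54, 70)` returns 64, so overshift detection reduces to checking whether any bit at position 6 or above is set.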
  • Patent number: 8352533
    Abstract: There is provided a semiconductor integrated circuit including: a plurality of first logic blocks which are reconfigurable, the plurality of first logic blocks inputting data of a first bit width and performing computation; a first network connecting the plurality of first logic blocks in a dynamically reconfigurable manner; a plurality of second logic blocks inputting data of a second bit width different from the first bit width and performing computation; a second network connected to outputs of the plurality of second logic blocks; and a third network connecting a carry bit output of a computing unit included in the first logic block to an input of a computing unit included in the second logic block in a dynamically reconfigurable manner.
    Type: Grant
    Filed: December 11, 2008
    Date of Patent: January 8, 2013
    Assignee: Fujitsu Semiconductor Limited
    Inventor: Hiroshi Furukawa
  • Patent number: 8332453
    Abstract: A shifter includes a plurality of shift stages that receive and shift input data to generate a shifted result, and a detection circuit coupled at an input of a final shift stage of the plurality of shift stages. The detection circuit receives a partially shifted vector at the input of the final shift stage along with a predetermined shift amount, and performs an all-one or all-zero detection operation using a portion of the partially shifted vector and the predetermined shift amount, in parallel with the shifting operation performed by the final shift stage to generate the shifted result.
    Type: Grant
    Filed: December 10, 2008
    Date of Patent: December 11, 2012
    Assignee: International Business Machines Corporation
    Inventors: Maarten Boersma, Silvia Melitta Mueller, Jochen Preiss, Holger Wetter
  • Patent number: 8316071
    Abstract: Sum and carry signals are formed representing a product of a first and a second operand. A bias signal is formed having a value determined by a sign of a product of the first and the second operand. An output signal is provided based on an addition of the sum signal, the carry signal, a sign-extended addend, and the bias signal. A portion of the output signal, a saturated minimum value, or a saturated maximum value, is selected as a final result based on the sign of the product and a sign of the output signal.
    Type: Grant
    Filed: May 27, 2009
    Date of Patent: November 20, 2012
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Kevin A. Hurd, Scott A. Hilker
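The arithmetic outcome of the saturating multiply-add described above can be written compactly. This sketch models only the final selection between the computed value and the saturated minimum or maximum; it does not model the sum, carry, and bias signals of the hardware. The function name and the default width are assumptions.

```python
def saturating_multiply_add(a: int, b: int, addend: int, bits: int = 32) -> int:
    """Return a*b + addend clamped to the signed `bits`-wide range
    (illustrative only)."""
    minimum = -(1 << (bits - 1))
    maximum = (1 << (bits - 1)) - 1
    exact = a * b + addend
    # Select the exact value, the saturated minimum, or the saturated maximum.
    return max(minimum, min(maximum, exact))
```

For instance, `saturating_multiply_add(2**15, 2**16, 0)` would overflow a signed 32-bit result by one and therefore saturates to 2147483647.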
  • Patent number: 8260837
    Abstract: A system for handling denormal floating point operands when the result must be normalized. A leading zero counter (lzc) on the operand B (opB) is used to limit alignment shifts when opB is denormal but is much greater than the product of operands A and C, i.e. AC. By limiting the additional shift of B during normalization, by the number of leading zeros in opB, no increase is needed in the output bus of the alignment shifter. Furthermore, the additional shift may be done either in the alignment shifter, or postponed to a later stage in the pipeline, where the result is normalized.
    Type: Grant
    Filed: September 22, 2008
    Date of Patent: September 4, 2012
    Assignee: International Business Machines Corporation
    Inventors: Lawrence Joseph Powell, Jr., Martin Stanley Schmookler, Son Dao Trong
  • Patent number: 8239440
    Abstract: Implementing an unfused multiply-add instruction within a fused multiply-add pipeline. The system may include an aligner having an input for receiving an addition term, a multiplier tree having two inputs for receiving a first value and a second value for multiplication, and a first carry save adder (CSA), wherein the first CSA may receive partial products from the multiplier tree and an aligned addition term from the aligner. The system may include a fused/unfused multiply add (FUMA) block which may receive the first partial product, the second partial product, and the aligned addition term, wherein the first partial product and the second partial product are not truncated. The FUMA block may perform an unfused multiply add operation or a fused multiply add operation using the first partial product, the second partial product, and the aligned addition term, e.g., depending on an opcode or mode bit.
    Type: Grant
    Filed: March 28, 2008
    Date of Patent: August 7, 2012
    Assignee: Oracle America, Inc.
    Inventors: Jeffrey S. Brooks, Christopher H. Olson
  • Publication number: 20120124117
    Abstract: A fused multiply-add (FMA) apparatus and method are provided. The FMA apparatus includes a partial product generator configured to generate a partial sum and a partial carry, a carry save adder configured to generate a partial sum having a first bit size and a partial carry having the first bit size by adding the partial sum and the partial carry to least significant bits (LSBs) of the mantissa of a third floating-point number, a carry select adder configured to generate a mantissa having a second bit size by adding the first bit-size partial sum and the first bit-size partial carry to most significant bits (MSBs) of the third floating-point number, and a selector configured to transmit the first bit-size partial sum and the first bit-size partial carry to the carry save adder or the carry select adder according to whether the mantissa of the third floating-point number is zero.
    Type: Application
    Filed: June 6, 2011
    Publication date: May 17, 2012
    Inventors: Hyeong-Seok Yu, Dong-Kwan Suh, Suk-Jin Kim, San Kim, Yong-Surk Lee
  • Patent number: 8180822
    Abstract: A computer system for computing a binary operation involving a first term multiplied by a second term resulting in a product, where the product is conditionally added to a third term in a central processing unit. The central processing unit includes a carry save adder configured to add a plurality of partial products obtained from the product of the first term and the second term to obtain a first partial result and a second partial result, a multiplexer configured to output one selected from the group consisting of the second term, the third term, and zero, and an alignment shifter configured to shift an output of the multiplexer to align the output of the multiplexer with the first partial result and the second partial result to obtain a shifted term. The shifted term, the first partial result and the second partial result are added together to obtain a result of the binary operation.
    Type: Grant
    Filed: September 3, 2008
    Date of Patent: May 15, 2012
    Assignee: Oracle America, Inc.
    Inventor: Leonard D. Rarick
  • Patent number: 8166091
    Abstract: In an embodiment, a dot-product unit to perform single-precision floating-point product and addition operations is disclosed that includes a first multiplier tree unit adapted to multiply first and second significand operands to produce a first set of two partial products. The dot-product unit further includes a second multiplier tree unit adapted to multiply third and fourth significand operands to produce a second set of two partial products, a shared exponent compare unit adapted to compare exponents of the first, second, third and fourth operands to produce an alignment shift value, and an alignment unit adapted to shift the second set of two partial products based on the alignment shift value. The dot-product unit also includes an adder unit adapted to add or subtract the first set of two partial products and the second shifted set of two partial products to produce a dot-product value that is a single-precision floating-point value.
    Type: Grant
    Filed: November 10, 2008
    Date of Patent: April 24, 2012
    Assignee: Crossfield Technology LLC
    Inventors: Earl Swartzlander, Jr., Hani Saleh
  • Patent number: 8166085
    Abstract: The present invention provides for calculating a shift amount as a function of a plurality of numbers. At least one decoder and at least one adder are coupled in parallel. A shifter is configured to compute a value in a plurality of shift stages, wherein a bit group of the shift amount is employable to affect at least one of the plurality of shift stages, thereby decreasing processing time.
    Type: Grant
    Filed: April 18, 2008
    Date of Patent: April 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Sang Hoo Dhong, Christian Jacobi, Silvia Melitta Mueller, Hiroo Nishikawa, Hwa-Joon Oh
  • Publication number: 20120078992
    Abstract: A vector functional unit implemented on a semiconductor chip to perform vector operations of dimension N is described. The vector functional unit includes N functional units. Each of the N functional units have logic circuitry to perform: a first integer multiply add instruction that presents highest ordered bits but not lowest ordered bits of a first integer multiply add calculation, and, a second integer multiply add instruction that presents lowest ordered bits but not highest ordered bits of a second integer multiply add calculation.
    Type: Application
    Filed: September 24, 2010
    Publication date: March 29, 2012
    Inventors: Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver
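The pair of instructions described above, one presenting the high-order bits and one the low-order bits of an integer multiply-add, can be sketched as follows. The operand width, the unsigned interpretation, and the function names are assumptions made for illustration; the abstract does not fix them.

```python
def multiply_add_high(a: int, b: int, c: int, width: int = 64) -> int:
    """High `width` bits of a*b + c for unsigned `width`-bit operands."""
    _check_operands(a, b, c, width)
    return (a * b + c) >> width

def multiply_add_low(a: int, b: int, c: int, width: int = 64) -> int:
    """Low `width` bits of a*b + c for unsigned `width`-bit operands."""
    _check_operands(a, b, c, width)
    return (a * b + c) & ((1 << width) - 1)

def _check_operands(a: int, b: int, c: int, width: int) -> None:
    # With all operands below 2**width, a*b + c fits in 2*width bits.
    for value in (a, b, c):
        if not 0 <= value < (1 << width):
            raise ValueError("operands must be unsigned and fit in `width` bits")
```

With `width=8`, the operands 200, 100 and 50 give 20050, whose high byte is 78 and low byte is 82; issuing both instructions recovers the full double-width result.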
  • Patent number: 8131795
    Abstract: A method is provided for improving a high-speed adder for Floating-Point Units (FPU) in a given computer system. The improved adder utilizes a compound incrementer, a compound adder, a carry network, an adder control/selector, and a series of multiplexers (muxes). The carry network performs the end-around-carry function simultaneously with, and independently of, other required functions, optimizing the functioning of the adder. A minimum number of muxes is also used to reduce mux delays.
    Type: Grant
    Filed: November 25, 2008
    Date of Patent: March 6, 2012
    Assignee: International Business Machines Corporation
    Inventors: Sang Hoo Dhong, Silvia Melitta Mueller, Hwa-Joon Oh
  • Publication number: 20120011181
    Abstract: A decimal floating-point Fused-Multiply-Add (FMA) unit that performs the operation of ±(A×B)±C on decimal floating-point operands. The decimal floating-point FMA unit executes the multiplication and addition operations compliant with the IEEE 754-2008 standard. Specifically, the decimal floating-point FMA includes a parallel multiplier and injects the addend after required alignment as an additional partial product in the reduction tree used in the parallel multiplier. The decimal floating-point FMA unit may be configured to perform addition-subtraction operations or multiplication operations as standalone operations.
    Type: Application
    Filed: July 6, 2011
    Publication date: January 12, 2012
    Applicant: SILMINDS, LLC, EGYPT
    Inventors: Rodina Samy, Hossam Ali Hassan Fahmy, Tarek Eldeeb, Ramy Raafat, Yasmeen Farouk, Mostafa Elkhouly, Amira Mohamed
  • Patent number: 8090758
    Abstract: A multiplier-accumulator includes a pre-adder, a multiplier, an accumulator, multiplexing logic, and control logic. The pre-adder is configured to sum a first input and a second input to produce a pre-sum output. The multiplier is configured to multiply a third input and the pre-sum output to produce a product output. The accumulator is configured to sum a pair of accumulator inputs to produce a sum output. The multiplexer is configured to select the pair of accumulator inputs from a plurality of multiplexer inputs, where the plurality of multiplexer inputs includes the product output and the sum output. The control logic is configured to control operation of the pre-adder, the accumulator, and the multiplexer logic. In an example, each of the first input, the second input, the third input, and the sum output is coupled to programmable interconnect of a programmable logic device.
    Type: Grant
    Filed: December 14, 2006
    Date of Patent: January 3, 2012
    Assignee: Xilinx, Inc.
    Inventors: Schuyler E. Shimanek, William E. Allaire, Steven J. Zack
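A behavioral sketch of the pre-adder, multiplier, and accumulator chain described above follows. The class `PreAdderMac`, its `step` method, and the symmetric FIR helper are hypothetical names introduced for illustration; the `accumulate` flag stands in for the multiplexer that selects the accumulator inputs. A symmetric filter is a typical use because the pre-adder lets two samples share one coefficient multiply.

```python
class PreAdderMac:
    """Behavioral model: acc = (a + b) * c, optionally added to the previous acc."""

    def __init__(self) -> None:
        self.acc = 0

    def step(self, a: int, b: int, c: int, accumulate: bool = True) -> int:
        pre_sum = a + b                      # pre-adder
        product = pre_sum * c                # multiplier
        # Multiplexer choice: add to the running sum or start a new one.
        self.acc = self.acc + product if accumulate else product
        return self.acc

def symmetric_fir_output(window, half_coefficients):
    """One output of a symmetric FIR filter with an even number of taps."""
    if len(window) != 2 * len(half_coefficients):
        raise ValueError("window must be twice as long as half_coefficients")
    mac = PreAdderMac()
    last = len(window) - 1
    for i, coefficient in enumerate(half_coefficients):
        mac.step(window[i], window[last - i], coefficient, accumulate=(i > 0))
    return mac.acc
```

For the window `[1, 2, 3, 4]` and half coefficients `[5, 6]` (full taps 5, 6, 6, 5) the output is 55, computed with two multiplies instead of four.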
  • Patent number: 8078660
    Abstract: A bridge fused multiply-adder is disclosed. The fused multiply-adder is for the single instruction execution of (A×B)+C. The bridge fused multiply-add unit adds this functionality to existing floating-point co-processor units by including a fused multiply-add hardware “bridge” between an existing floating-point adder and a floating-point multiplier unit. This fused multiply-add functionality is added to existing two-operand architecture designs without degrading the performance or parallel pipe execution of floating-point adder and floating-point multiplier instructions.
    Type: Grant
    Filed: April 9, 2008
    Date of Patent: December 13, 2011
    Assignee: The Board of Regents, University of Texas System
    Inventors: Eric Quinnell, Earl E. Swartzlander, Jr., Carl Lemonds
  • Patent number: 8069200
    Abstract: A floating point (FP) shifter for use with FP adders providing a shifted FP operand as a power of the exponent base (usually two) multiplied by a FP operand. First arithmetic processor using at least one FP shifter with FP adder. FP adder for N FP operands creating FP result, where N is at least three. Second arithmetic processor including at least one FP adder for N operands. Descriptions of FP shifter and FP adder for implementing their operational methods. Implementations of FP shifter and FP adder.
    Type: Grant
    Filed: April 27, 2006
    Date of Patent: November 29, 2011
    Assignee: QSigma, Inc.
    Inventors: George Landers, Earle Jennings
  • Patent number: 8058899
    Abstract: A logic cell array having a number of logic cells and a segmented bus system for logic cell communication, the bus system including different segment lines having shorter and longer segments for connecting two points in order to be able to minimize the number of bus elements traversed between separate communication start and end points.
    Type: Grant
    Filed: February 13, 2009
    Date of Patent: November 15, 2011
    Inventors: Martin Vorbach, Frank May, Dirk Reichardt, Frank Lier, Gerd Ehlers, Armin Nückel, Volker Baumgarte, Prashant Rao, Jens Oertel
  • Patent number: 8051123
    Abstract: A multipurpose arithmetic functional unit selectively performs planar attribute interpolation, unary function approximation, double-precision arithmetic, and/or arbitrary filtering functions such as texture filtering, bilinear filtering, or anisotropic filtering by iterating through a multi-step multiplication operation with partial products (partial results) accumulated in an accumulation register. Shared multiplier and adder circuits are advantageously used to implement the product and sum operations for unary function approximation and planar interpolation; the same multipliers and adders are also leveraged to implement double-precision multiplication and addition.
    Type: Grant
    Filed: December 15, 2006
    Date of Patent: November 1, 2011
    Assignee: NVIDIA Corporation
    Inventors: Stuart Oberman, Ming Y. Siu
  • Patent number: 8046399
    Abstract: A computer processor including a single fused-unfused floating point multiply-add (FMA) module computes the result of the operation A*B+C for floating point numbers for fused multiply-add rounding operations and unfused multiply-add rounding operations. In one embodiment, a fused multiply-add rounding implementation is augmented with additional hardware which calculates an unfused multiply-add rounding result without adding additional pipeline stages. In one embodiment, a computation by the fused-unfused floating point multiply-add (FMA) module is initiated using a single opcode which determines whether a fused multiply-add rounding result or unfused multiply-add rounding result is generated.
    Type: Grant
    Filed: January 25, 2008
    Date of Patent: October 25, 2011
    Assignee: Oracle America, Inc.
    Inventors: Murali K. Inaganti, Leonard D. Rarick
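The difference between fused and unfused multiply-add rounding can be shown numerically. The sketch below is a software illustration, not a model of the hardware: it assumes finite double-precision inputs and uses exact rational arithmetic to emulate the single rounding of a fused operation. Both function names are hypothetical.

```python
from fractions import Fraction

def fused_multiply_add(a: float, b: float, c: float) -> float:
    """a*b + c with a single rounding, emulated with exact rationals.
    Assumes finite inputs."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # the only rounding step

def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """a*b + c with the product rounded before the addition."""
    product = a * b      # first rounding
    return product + c   # second rounding

# Operands chosen so that the product 1 - 2**-54 is not representable as a double.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0
fused_result = fused_multiply_add(a, b, c)      # keeps the low-order term
unfused_result = unfused_multiply_add(a, b, c)  # product rounds to 1.0 first
```

Here the fused result is -2**-54 while the unfused result is 0.0, which is why a unit that must reproduce legacy unfused results needs the extra rounding behavior the abstract describes.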
  • Patent number: 8041759
    Abstract: A specialized processing block for a programmable logic device incorporates a fundamental processing unit that performs a sum of two multiplications, adding the partial products of both multiplications without computing the individual multiplications. Such fundamental processing units consume less area than conventional separate multipliers and adders. The specialized processing block further has input and output stages, as well as a loopback function, to allow the block to be configured for various digital signal processing operations, including finite impulse response (FIR) filters and infinite impulse response (IIR) filters. By using the programmable connections, and in some cases the programmable resources of the programmable logic device, and by running portions of the specialized processing block at higher clock speeds than the remainder of the programmable logic device, more complex FIR and IIR filters can be implemented.
    Type: Grant
    Filed: June 5, 2006
    Date of Patent: October 18, 2011
    Assignee: Altera Corporation
    Inventors: Martin Langhammer, Kwan Yee Martin Lee, Orang Azgomi, Keone Streicher, Robert L. Pelt
  • Patent number: 8037118
    Abstract: A three-path floating-point fused multiply-adder is disclosed. The fused multiply-adder is for the single instruction execution of (A×B)+C. The three-path fused multiply-adder is based on a dual-path adder and reduces latency significantly by operating on case data in parallel and by reducing component bit size. The fused multiply-adder is a common serial fused multiply-adder that reuses floating-point adder (FPA) and floating-point multiplier (FPM) hardware, allowing single adds, single multiplies, and fused multiply-adds to execute at maximum speed.
    Type: Grant
    Filed: April 9, 2008
    Date of Patent: October 11, 2011
    Inventors: Eric Quinnell, Earl E. Swartzlander, Jr., Carl Lemonds
  • Publication number: 20110231460
    Abstract: A fused multiply add (FMA) unit includes an alignment counter configured to calculate an alignment shift count, an aligner configured to align an addend input based on the alignment shift count and output an aligned addend, a multiplier configured to multiply a first multiplicand input and a second multiplicand input and output a product, an adder configured to add the aligned addend and the product and output a sum without determining the sign of the sum or complementing the sum, a normalizer configured to receive the sum directly from the adder and normalize the sum irrespective of the sign of the sum and output a normalized sum, and a rounder configured to round and complement-adjust the normalized sum and output a final mantissa.
    Type: Application
    Filed: March 17, 2010
    Publication date: September 22, 2011
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventor: Sadar Ahmed
  • Patent number: 8024393
    Abstract: Floating-point processors capable of performing multiply-add (Madd) operations and incorporating improved intermediate result handling capability. The floating-point processor includes a multiplier unit coupled to an adder unit. In a specific operating mode, the intermediate result from the multiplier unit is processed (i.e., rounded but not normalized or denormalized) into representations that are more accurate and easily managed in the adder unit. By processing the intermediate result in such manner, accuracy is improved, circuit complexity is reduced, and operating speed may be increased.
    Type: Grant
    Filed: December 3, 2007
    Date of Patent: September 20, 2011
    Assignee: MIPS Technologies, Inc.
    Inventors: Ying-wai Ho, John L. Kelley, XingYu Jiang
  • Patent number: 7991817
    Abstract: An apparatus and method that use an associative calculator for calculating a sequence of non-associative operations on a set of input data, comprising: using the associative calculator to calculate from the set of input data an evaluated value of each operation of said sequence as if the non-associative operations were associative operations; detecting if some of the evaluated values are erroneous; if there are erroneous evaluated values, correcting the erroneous evaluated values; and if there are no erroneous evaluated values, outputting as the result of the sequence of non-associative operations the evaluated value of the last operation of the sequence.
    Type: Grant
    Filed: January 19, 2007
    Date of Patent: August 2, 2011
    Assignee: California Institute of Technology
    Inventors: Andre M. DeHon, Nachiket Kapre
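Floating-point addition is a familiar non-associative operation, so the evaluate, detect, and correct steps above can be sketched with running sums. This is a simplified illustration under stated assumptions: a recursive-doubling scan plays the role of the associative calculator, and the correction is a plain sequential sweep rather than whatever parallel scheme the patent claims. The function names are hypothetical.

```python
def sequential_sums(values):
    """Reference result: running sums evaluated strictly left to right."""
    sums, total = [], 0.0
    for value in values:
        total = total + value
        sums.append(total)
    return sums

def speculative_sums(values):
    """Running sums evaluated as if addition were associative, then checked.
    Returns (sums, number_of_corrections)."""
    estimates = list(values)
    step = 1
    while step < len(estimates):          # recursive-doubling prefix scan
        previous = list(estimates)
        for i in range(step, len(estimates)):
            estimates[i] = previous[i - step] + previous[i]
        step *= 2

    corrections = 0
    for i in range(1, len(estimates)):
        expected = estimates[i - 1] + values[i]   # what sequential order gives
        if estimates[i] != expected:              # detect an erroneous value
            estimates[i] = expected               # correct it
            corrections += 1
    return estimates, corrections
```

For the inputs `[1e16, 1.0, -1e16, 1.0]` the scan alone produces 0.0 as the last running sum, while left-to-right evaluation gives 1.0; the check catches the mismatch and repairs it.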
  • Patent number: 7966609
    Abstract: Embodiments of the present invention include code generation methods. In one embodiment, a table of patterns is generated. Each pattern in the table includes an FMA (fused multiply-add) DAG (Directed Acyclic Graph), a canonical form equivalent of the FMA DAG, and a shape corresponding to the canonical form equivalent. Incoming floating-point expressions are matched against the patterns in the table during compilation of a program to obtain optimal sequences of FMA, FMS (fused multiply-subtract), and FNMA (fused negate multiply-add) instructions as compiled instructions for computing the floating point expressions.
    Type: Grant
    Filed: March 30, 2006
    Date of Patent: June 21, 2011
    Assignee: Intel Corporation
    Inventor: Konstantin S. Serebryany
  • Publication number: 20110137970
    Abstract: In a denormal support mode, the normalization circuit of a floating-point adder is used to normalize or denormalize the output of a floating-point multiplier. Each floating-point multiply instruction is speculatively converted to a multiply-add instruction, with the addend forced to zero. This preserves the value of the product, while normalizing or denormalizing the product using the floating-point adder's normalization circuit. When the operands to the multiply operation are available, they are inspected. If the operands will not generate an unnormal intermediate product or a denormal final product, the add operation is suppressed, such as by operand-forwarding. Additionally, each non-fused floating-point multiply-add instruction is replaced with a multiply-add instruction having a zero addend, and a floating-point add instruction having the addend of the original multiply-add instruction is inserted into the instruction stream.
    Type: Application
    Filed: February 14, 2011
    Publication date: June 9, 2011
    Applicant: QUALCOMM INCORPORATED
    Inventors: Kenneth Alan Dockser, Pathik Sunil Lall
  • Publication number: 20110106868
    Abstract: A floating point multiplier includes a data path in which a plurality of partial products are calculated and then reduced to a first partial product and a second partial product. Shift amount determining circuitry 100 analyses the exponents of the input operands A and B as well as counting the leading zeros in the fractional portions of these operands to determine an amount of left shift or right shift to be applied by shifting circuitry 200, 202 within the multiplier data path. This shift amount is applied so as to align the partial products so that when they are added they will produce the result C without requiring this to be further shifted. Furthermore, shifting the partial products to the correct alignment in this way in advance of adding these partial products permits injection rounding combined with the adding of the partial products to be employed for cases including subnormal values.
    Type: Application
    Filed: November 3, 2009
    Publication date: May 5, 2011
    Applicant: ARM Limited
    Inventor: David Raymond Lutz
  • Publication number: 20110072066
    Abstract: A fused multiply add floating point unit 1 includes multiplying circuitry 4 and adding circuitry 8. The multiply circuitry 4 multiplies operands B and C having N-bit significands to generate an unrounded product B*C. The unrounded product B*C has an M-bit significand, where M>N. The adding circuitry 8 receives an operand A that is input at a later processing cycle than a processing cycle at which the multiplying circuitry 4 receives operands B and C. The adding circuitry 8 commences processing of the operand A after the unrounded product B*C is generated by the multiplying circuitry 4. The adding circuitry 8 adds the operand A to the unrounded product B*C and outputs a rounded result A+B*C.
    Type: Application
    Filed: September 21, 2009
    Publication date: March 24, 2011
    Inventor: David Raymond Lutz
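The abstract above distinguishes an unrounded product B*C from the rounded result A+B*C. As a hedged illustration of why that matters, the Python sketch below compares a single-rounding evaluation (emulated with exact rational arithmetic) against the usual two-rounding evaluation. The function names and operand values are assumptions chosen for the example; the sketch says nothing about the circuitry or timing the application describes.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Return a + b*c with one rounding, emulated via exact rationals."""
    exact = Fraction(a) + Fraction(b) * Fraction(c)  # no rounding here
    return float(exact)  # single rounding to the nearest double


def unfused_multiply_add(a, b, c):
    """Return a + b*c with two roundings: the product, then the sum."""
    return a + b * c


# Hypothetical operands: b*c is exactly 1 - 2**-54, which rounds to 1.0.
a = -1.0
b = 1.0 + 2.0 ** -27
c = 1.0 - 2.0 ** -27

print(unfused_multiply_add(a, b, c))  # rounded product cancels against a
print(fused_multiply_add(a, b, c))    # keeps the low-order bits of b*c
```

The unfused path loses the low-order part of the product when it rounds B*C, while the single-rounding path retains it until the final result is formed.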
  • Patent number: 7912887
    Abstract: In a denormal support mode, the normalization circuit of a floating-point adder is used to normalize or denormalize the output of a floating-point multiplier. Each floating-point multiply instruction is speculatively converted to a multiply-add instruction, with the addend forced to zero. This preserves the value of the product, while normalizing or denormalizing the product using the floating-point adder's normalization circuit. When the operands to the multiply operation are available, they are inspected. If the operands will not generate an unnormal intermediate product or a denormal final product, the add operation is suppressed, such as by operand-forwarding. Additionally, each non-fused floating-point multiply-add instruction is replaced with a multiply-add instruction having a zero addend, and a floating-point add instruction having the addend of the original multiply-add instruction is inserted into the instruction stream.
    Type: Grant
    Filed: May 10, 2006
    Date of Patent: March 22, 2011
    Assignee: QUALCOMM Incorporated
    Inventors: Kenneth Alan Dockser, Pathik Sunil Lall
  • Publication number: 20110040815
    Abstract: A data processing apparatus is arranged to perform a fused multiply add operation. The apparatus 100 has multiplying circuitry 110 configured to multiply operands B and C to generate a product B*C having a high order portion 160 and a low order portion 170. The apparatus has adding circuitry 130 configured to: (i) add an operand A to one of the high order portion 160 and the low order portion 170 to generate an intermediate sum value; and (ii) add the intermediate sum value to a remaining one of the high order portion 160 and the low order portion 170 to generate a result A+B*C.
    Type: Application
    Filed: August 12, 2009
    Publication date: February 17, 2011
    Applicant: ARM Limited
    Inventors: Antony John Penton, Simon John Craske, Ian Michael Caulfield
  • Patent number: 7840622
    Abstract: Method to convert a hexadecimal floating point number (H) into a binary floating point number by using a Floating Point Unit (FPU) with fused multiply add with an A-register and a B-register for two multiplicand operands and a C-register for an addend operand, wherein a leading zero counting unit (LZC) is associated to the addend C-register, wherein the difference of the leading zero result provided by the LZC and the input exponent (E) is calculated by a control unit and determines based on the Raw-Result-Exponent a force signal (F) with special conditions like ‘Exponent Overflow’, ‘Exponent Underflow’, and ‘Zero Result’.
    Type: Grant
    Filed: July 20, 2006
    Date of Patent: November 23, 2010
    Assignee: International Business Machines Corporation
    Inventors: Guenter Gerwig, Klaus Michael Kroener
  • Patent number: 7728624
    Abstract: An integrated circuit comprising at least one group having multiple arithmetic/logic units arranged in sub-groups. In the sub-groups at inputs of multiple arithmetic/logic units, in each case a single one of the first selection units is connected on the input side, wherein no other selection unit is connected directly on the input side of this selection unit. The first selection units are coupled to each other such that a horizontal and/or vertical logical interconnection of the arithmetic/logic units within a group, and/or a logical interconnection of arithmetic/logic units to an upstream group can be implemented. Second selection units are in each case connected on the output side of a column of arithmetic/logic units. The second selection units of a group are connected on the output side to one bus each, and a microprocessor is coupled to this bus.
    Type: Grant
    Filed: October 10, 2006
    Date of Patent: June 1, 2010
    Assignee: Micronas GmbH
    Inventor: Gert Umbach
  • Patent number: 7720900
    Abstract: An apparatus and method for performing floating-point operations, particularly a fused multiply add operation. The apparatus includes an arithmetic logic unit adapted to produce both a high-order part (H) and a low-order part (L) of an intermediate extended result according to H, L=A*B+C, where A, B are input operands and C an addend. Each H, L part is formatted the same as the format of the input operands, and alignment of the resulting fractions is not affected by alignment of the inputs. The apparatus includes an architecture for suppressing left-alignment of the intermediate extended result, such that input operands for a subsequent A*B+C operation remain right-aligned.
    Type: Grant
    Filed: September 9, 2005
    Date of Patent: May 18, 2010
    Assignee: International Business Machines Corporation
    Inventors: Guenter Gerwig, Eric M. Schwarz, Ronald M. Smith, Sr.
  • Publication number: 20100121898
    Abstract: In an embodiment, a dot-product unit to perform single-precision floating-point product and addition operations is disclosed that includes a first multiplier tree unit adapted to multiply first and second significand operands to produce a first set of two partial products. The dot-product unit further includes a second multiplier tree unit adapted to multiply third and fourth significand operands to produce a second set of two partial products, a shared exponent compare unit adapted to compare exponents of the first, second, third and fourth operands to produce an alignment shift value, and an alignment unit adapted to shift the second set of two partial products based on the alignment shift value. The dot-product unit also includes an adder unit adapted to add or subtract the first set of two partial products and the second shifted set of two partial products to produce a dot-product value that is a single-precision floating-point value.
    Type: Application
    Filed: November 10, 2008
    Publication date: May 13, 2010
    Applicant: Crossfield Technology LLC
    Inventors: Earl E. Swartzlander, JR., Hani H. Saleh
  • Patent number: 7716266
    Abstract: A method and system for performing a binary mode and hexadecimal mode Multiply-Add floating point operation in a floating point arithmetic unit according to a formula A*C+B, wherein A, B and C operands each have a fraction and an exponent part expA, expB and expC and the exponent of the product A*C is calculated and compared to the exponent of the addend under inclusion of an exponent bias value dedicated to use unsigned biased exponents, wherein the comparison yields a shift amount used for aligning the addend with the product operand, wherein a shift amount calculation provides a common value CV for both binary and hexadecimal according to the formula (expA+expC-expB+CV).
    Type: Grant
    Filed: January 26, 2006
    Date of Patent: May 11, 2010
    Assignee: International Business Machines Corporation
    Inventors: Son Dao Trong, Juergen Haess, Klaus Michael Kroener, Eric M. Schwarz
  • Publication number: 20100057824
    Abstract: A computer system for computing a binary operation involving a first term multiplied by a second term resulting in a product, where the product is conditionally added to a third term in a central processing unit. The central processing unit includes a carry save adder configured to add a plurality of partial products obtained from the product of the first term and the second term to obtain a first partial result and a second partial result, a multiplexer configured to output one selected from the group consisting of the second term, the third term, and zero, and an alignment shifter configured to shift an output of the multiplexer to align the output of the multiplexer with the first partial result and the second partial result to obtain a shifted term. The shifted term, the first partial result and the second partial result are added together to obtain a result of the binary operation.
    Type: Application
    Filed: September 3, 2008
    Publication date: March 4, 2010
    Applicant: SUN MICROSYSTEMS, INC.
    Inventor: Leonard D. Rarick
  • Patent number: 7659911
    Abstract: A method and apparatus for perfectly lossless and minimal-loss interconversion of digital color data between spectral color spaces (RGB) and perceptually based luma-chroma color spaces (Y′CBCR) is disclosed. In particular, the present invention provides a process for converting digital pixels from R′G′B′ space to Y′CBCR space and back, or from Y′CBCR space to R′G′B′ space and back, with zero error, or, in constant-precision implementations, with guaranteed minimal error. This invention permits digital video editing and image editing systems to repeatedly interconvert between color spaces without accumulating errors. In image codecs, this invention can improve the quality of lossy image compressors independently of their core algorithms, and enables lossless image compressors to operate in a different color space than the source data without thereby becoming lossy.
    Type: Grant
    Filed: April 21, 2005
    Date of Patent: February 9, 2010
    Inventor: Andreas Wittenstein
  • Publication number: 20090265409
    Abstract: A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least two of the data elements in this third packed data store the result of performing multiply-add operations on data elements in the first and second packed data.
    Type: Application
    Filed: March 23, 2009
    Publication date: October 22, 2009
    Inventors: Alexander D. Peleg, Millind Mittal, Larry M. Mennemeier, Benny Eitan, Carole Dulong, Eiichi Kowashi, Wolf Witt
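A packed multiply-add of the kind the abstract above describes can be sketched in a few lines of Python. The abstract does not say how elements are paired, so the layout below (multiply corresponding elements, then add adjacent products) is an assumption, and `packed_multiply_add` is a hypothetical name. Fixed element widths and overflow behavior are deliberately not modeled.

```python
def packed_multiply_add(first, second):
    """Multiply corresponding elements, then add each adjacent pair of products.

    Assumed layout: inputs hold an even number of elements, and the result
    holds half as many, each one a sum of two products.
    """
    if len(first) != len(second) or len(first) % 2:
        raise ValueError("inputs must have equal, even lengths")
    return [
        first[i] * second[i] + first[i + 1] * second[i + 1]
        for i in range(0, len(first), 2)
    ]


# Hypothetical packed operands with four elements each.
third = packed_multiply_add([1, 2, 3, 4], [5, 6, 7, 8])
print(third)
```

Each element of the result holds one multiply-add outcome, so a single instruction of this shape performs several multiply-adds at once.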
  • Patent number: 7595659
    Abstract: A logic cell array having a number of logic cells and a segmented bus system for logic cell communication, the bus system including different segment lines having shorter and longer segments for connecting two points in order to be able to minimize the number of bus elements traversed between separate communication start and end points.
    Type: Grant
    Filed: October 8, 2001
    Date of Patent: September 29, 2009
    Assignee: Pact XPP Technologies AG
    Inventors: Martin Vorbach, Frank May, Dirk Reichardt, Frank Lier, Gerd Ehlers, Armin Nückel, Volker Baumgarte, Prashant Rao, Jens Oertel
  • Publication number: 20090146691
    Abstract: A logic cell array having a number of logic cells and a segmented bus system for logic cell communication, the bus system including different segment lines having shorter and longer segments for connecting two points in order to be able to minimize the number of bus elements traversed between separate communication start and end points.
    Type: Application
    Filed: February 13, 2009
    Publication date: June 11, 2009
    Inventors: Martin VORBACH, Frank MAY, Dirk REICHARDT, Frank LIER, Gerd EHLERS, Armin NUCKEL, Volker BAUMGARTE, Prashant RAO, Jens OERTEL
  • Patent number: 7543013
    Abstract: A multi-stage floating-point accumulator includes at least two stages and is capable of operating at higher speed. In one design, the floating-point accumulator includes first and second stages. The first stage includes three operand alignment units, two multiplexers, and three latches. The three operand alignment units operate on a current floating-point value, a prior floating-point value, and a prior accumulated value. A first multiplexer provides zero or the prior floating-point value to the second operand alignment unit. A second multiplexer provides zero or the prior accumulated value to the third operand alignment unit. The three latches couple to the three operand alignment units. The second stage includes a 3-operand adder to sum the operands generated by the three operand alignment units, a latch, and a post alignment unit.
    Type: Grant
    Filed: August 18, 2006
    Date of Patent: June 2, 2009
    Assignee: QUALCOMM Incorporated
    Inventors: Yun Du, Chun Yu, Guofang Jiao