Multiplication Followed By Addition Patents (Class 708/501)
-
Patent number: 9405535Abstract: A circuit arrangement provides support for packed sum of absolute difference operations in a floating point execution unit, e.g., a scalar or vector floating point execution unit. Existing adders in a floating point execution unit may be utilized along with minimal additional logic in the floating point execution unit to support efficient execution of a fixed point packed sum of absolute differences instruction within the floating point execution unit, often eliminating the need for a separate vector fixed point execution unit in a processor architecture, and thereby leading to less logic and circuit area, lower power consumption and lower cost.Type: GrantFiled: November 29, 2012Date of Patent: August 2, 2016Assignee: International Business Machines CorporationInventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
-
Patent number: 9400635Abstract: An integrated circuit is provided that performs floating-point operations involving at least two successive computational steps. Two floating-point numbers entering any additional computational step after the first computational step are aligned dynamically by shifting the mantissa of the floating-point number with the greater exponent to the left and the mantissa of the floating-point number with the smaller exponent to the right. The number of left shift bits is dependent on the magnitude of the difference between the two floating-point exponents and the number of leading zeroes in the mantissa with the greater exponent. The number of right shift bits is dependent on the magnitude of the difference between the two floating-point exponents and the number of left shift bits.Type: GrantFiled: January 14, 2013Date of Patent: July 26, 2016Assignee: Altera CorporationInventor: Tomasz Sebastian Czajkowski
-
Patent number: 9317250Abstract: The present application provides a method and apparatus for supporting denormal numbers in a floating point multiply-add unit (FMAC). One embodiment of the FMAC is configurable to add a product of first and second operands to a third operand. This embodiment of the FMAC is configurable to determine a minimum exponent shift for a sum of the product and the third operand by subtracting a minimum normal exponent from a product exponent of the product. This embodiment of the FMAC is also configurable to cause bits representing the sum to be left shifted by the minimum exponent shift if a third exponent of the third operand is less than or equal to the product exponent and the minimum exponent shift is less than or equal to a predicted left shift for the sum.Type: GrantFiled: November 12, 2012Date of Patent: April 19, 2016Assignee: Advanced Micro Devices, Inc.Inventors: Kelvin D. Goveas, Debjit Das Sarma, Scott A. Hilker, Hanbing Liu
-
Patent number: 9274752Abstract: In one embodiment, a processor includes at least one floating point unit. The at least one floating point unit may include an adder, leading change anticipator (LCA) logic, and a shifter. The adder may be to add a first operand X and a second operand Y to obtain an output operand having a bit length n. The LCA logic may be to: for each bit position i from n?1 to 1, obtain a set of propagation values and a set of bit values based on the first operand X and the second operand Y; and generate a LCA mask based on the set of propagation values and the set of bit values. The shifter may be to normalize the output operand based on the LCA mask. Other embodiments are described and claimed.Type: GrantFiled: December 28, 2012Date of Patent: March 1, 2016Assignee: Intel CorporationInventors: Simon Rubanovich, Thierry Pons, Amit Gradstein, Zeev Sperber
-
Patent number: 9264066Abstract: Techniques are disclosed relating to type conversion using a floating-point unit. In one embodiment, to convert a floating-point value to a normalized integer format, a floating-point unit is configured to perform an operation to generate a result having a significant portion and an exponent portion, where the operation includes multiplying the floating-point value by a constant. In one embodiment, the apparatus is further configured to add a value to the exponent portion of the result, and set a rounding mode to round to nearest. The constant may be a greatest value less than one that can be represented using the particular number of unsigned bits. The value added to the initial exponent may be equal to the number of unsigned bits of the normalized integer format. The apparatus may perform this conversion in response to a pack instruction.Type: GrantFiled: July 30, 2013Date of Patent: February 16, 2016Assignee: Apple Inc.Inventors: James S. Blomgren, Terence M. Potter
-
Patent number: 9256397Abstract: A fused multiply-adder is disclosed. The fused multiply-adder includes a Booth encoder, a fraction multiplier, a carry corrector, and an adder. The Booth encoder initially encodes a first operand. The fraction multiplier multiplies the Booth-encoded first operand by a second operand to produce partial products, and then reduces the partial products into a set of redundant sum and carry vectors. The carry corrector then generates a carry correction factor for correcting the carry vectors. The adder adds the redundant sum and carry vectors and the carry correction factor to a third operand to yield a final result.Type: GrantFiled: December 3, 2013Date of Patent: February 9, 2016Assignee: International Business Machines CorporationInventors: Maarten J. Boersma, Klaus M. Kroener, Christophe J. Layer, Silvia M. Mueller
-
Patent number: 9141337Abstract: A processing device is provided that includes a first, second and third precision operation circuit. The processing device further includes a shared, bit-shifting circuit that is communicatively coupled to the first, second and third precision operation circuits. A method is also provided for multiplying a first and second binary number including adding a first exponent value associated with the first binary number to a second exponent value associated with the second binary number and multiplying a first mantissa value associated with the first binary number to a second mantissa value associated with the second binary number. The method includes performing the exponent adding and mantissa multiplying substantially in parallel. The method further includes performing at least one of adding or subtracting a third binary number to the product. Also provided is a computer readable storage device encoded with data for adapting a manufacturing facility to create an apparatus.Type: GrantFiled: September 6, 2011Date of Patent: September 22, 2015Assignee: Advanced Micro Devices, Inc.Inventor: Scott Hilker
-
Patent number: 9104474Abstract: Embodiments of the present invention may provide methods and circuits for energy efficient floating point multiply and/or add operations. A variable precision floating point circuit may determine the certainty of the result of a multiply-add floating point calculation in parallel with the floating-point calculation. The variable precision floating point circuit may use the certainty of the inputs in combination with information from the computation, such as, binary digits that cancel, normalization shifts, and rounding, to perform a calculation of the certainty of the result. A floating point multiplication circuit may determine whether a lowest portion of a multiplication result could affect the final result and may induce a replay of the multiplication operation when it is determined that the result could affect the final result.Type: GrantFiled: December 28, 2012Date of Patent: August 11, 2015Assignee: Intel CorporationInventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Ram K. Krishnamurthy, William C. Hasenplaugh, Randy L. Allmon, Jonathan Enoch
-
Patent number: 8990282Abstract: A fused multiply add floating point unit 1 includes multiplying circuitry 4 and adding circuitry 8. The multiply circuitry 4 multiplies operands B and C having N-bit significands to generate an unrounded product B*C. The unrounded product B*C has an M-bit significand, where M>N. The adding circuitry 8 receives an operand A that is input at a later processing cycle than a processing cycle at which the multiplying circuitry 4 receives operands B and C. The adding circuitry 8 commences processing of the operand A after the unrounded product B*C is generated by the multiplying circuitry 4. The adding circuitry 8 adds the operand A to the unrounded product B*C and outputs a rounded result A+B*C.Type: GrantFiled: September 21, 2009Date of Patent: March 24, 2015Assignee: ARM LimitedInventor: David Raymond Lutz
-
Patent number: 8990283Abstract: A computer processor including a single fused-unfused floating point multiply-add (FMA) module computes the result of the operation A*B+C for floating point numbers for fused multiply-add rounding operations and unfused multiply-add rounding operations. In one embodiment, a fused multiply-add rounding implementation is augmented with additional hardware which calculates an unfused multiply-add rounding result without adding additional pipeline stages. In one embodiment, a computation by the fused-unfused floating point multiply-add (FMA) module is initiated using a single opcode which determines whether a fused multiply-add rounding result or unfused multiply-add rounding result is generated.Type: GrantFiled: October 24, 2011Date of Patent: March 24, 2015Assignee: Oracle America, Inc.Inventors: Murali K. Inaganti, Leonard D. Rarick
-
Patent number: 8977670Abstract: Implementing an unfused multiply-add instruction within a fused multiply-add pipeline. The system may include an aligner having an input for receiving an addition term, a multiplier tree having two inputs for receiving a first value and a second value for multiplication, and a first carry save adder (CSA), wherein the first CSA may receive partial products from the multiplier tree and an aligned addition term from the aligner. The system may include a fused/unfused multiply add (FUMA) block which may receive the first partial product, the second partial product, and the aligned addition term, wherein the first partial product and the second partial product are not truncated. The FUMA block may perform an unfused multiply add operation or a fused multiply add operation using the first partial product, the second partial product, and the aligned addition term, e.g., depending on an opcode or mode bit.Type: GrantFiled: May 11, 2012Date of Patent: March 10, 2015Assignee: Oracle International CorporationInventors: Jeffrey S. Brooks, Christopher H. Olson
-
Patent number: 8972471Abstract: An arithmetic module is provided, including a first adder, a first shifter coupled to the first adder, a multiplier coupled to the first shifter for receiving an external coefficient signal, a digit alignment unit coupled to the multiplier, a second adder coupled to the digit alignment unit, and a second shifter coupled to the second adder. The arithmetic module reduces the overall computation time effectively, as compared with a scalar processor, by employing a serial data connection design, and also significantly reduces power consumption of the digital signal processor by requiring fewer input and output ends than those of a multi-issue processor.Type: GrantFiled: September 12, 2012Date of Patent: March 3, 2015Assignee: National Chiao Tung UniversityInventors: Chih-Wei Liu, Kuo-Chiang Chang, Shih-Hao Ou, Yu-Wen Chen
-
Patent number: 8965945Abstract: An apparatus and method are provided for performing an addition operation on operands A and B in order to produce a result R, the operands A and B and the result R being floating point values each having a significand and an exponent. The apparatus comprises prediction circuitry for generating a shift indication based on a prediction of the number of leading zeros that would be present in an output produced by subjecting the operands A and B to an unlike signed addition. Further, result pre-normalization circuitry performs a shift operation on the significands of both operand A and operand B prior to addition of the significands, this serving to discard a number of most significant bits of the significands of both operands as determined by the shift indication in order to produce modified significands for operands A and B.Type: GrantFiled: February 17, 2011Date of Patent: February 24, 2015Assignee: ARM LimitedInventor: David Raymond Lutz
-
Patent number: 8930433Abstract: An embodiment of an apparatus performs a floating-point multiply-add process on a first multiplicand, a second multiplicand, and an addend. A leading 0 bit is added to a mantissa of the first multiplicand to form an expanded first mantissa, and a partial-product multiplication is performed on the expanded first mantissa and a mantissa of the second multiplicand to produce partial-product sum and a partial-product carry mantissas. Leading bits of the partial-product sum and carry mantissas are changed to 0 bits if they are both 1 bits, and the partial-product sum and the partial-product carry are shifted right according to an exponent difference of a product of the first multiplicand and the second multiplicand. Otherwise both the partial-product sum and carry mantissas are arithmetically shifted right according to the exponent difference. The first and second multiplicands and the addend can be complex numbers.Type: GrantFiled: April 24, 2012Date of Patent: January 6, 2015Assignee: Futurewei Technologies, Inc.Inventors: Zhihong Li, Tong Sun, Zhikun Cheng
-
Patent number: 8930432Abstract: A floating point execution unit is capable of selectively repurposing one or more adders in an exponent path of the floating point execution unit to perform fixed point addition operations, thereby providing fixed point functionality in the floating point execution unit.Type: GrantFiled: August 4, 2011Date of Patent: January 6, 2015Assignee: International Business Machines CorporationInventors: Mark J. Hickey, Adam J. Muff, Matthew R. Tubbs, Charles D. Wait
-
Patent number: 8924454Abstract: A first floating-point operation unit receives first and second variables and performs a first operation generating a first output. A first rounding unit receives and rounds the first output to generate a second output if a control bit is in a first state. A second floating-point operation unit receives a third variable and either the first output or the second output and performs a second operation on the third variable and either the first output or the second output, to generate a third output. The second floating-point operation unit receives and operates on the first output if the control bit is in the first state, or the second output if the control bit is in the second state. A second rounding unit receives and rounds the third output.Type: GrantFiled: January 25, 2012Date of Patent: December 30, 2014Assignee: Arm Finance Overseas LimitedInventor: David Yiu-Man Lau
-
Publication number: 20140379773Abstract: Systems and methods of performing a fused multiply add (FMA) operations are provided. In one embodiment, the length of the adder used by the FMA operation is less than 3*N, where N is the number of bits in the mantissa term of a floating point number. A mask may be used to perform the addition portion of the FMA operation using the adder. A second mask may be used to denormalize the result of the addition portion of the FMA operation if an underflow occurs.Type: ApplicationFiled: June 25, 2013Publication date: December 25, 2014Inventors: Simon Rubanovich, Thierry Pons, Amit Gradstein, Zeev Sperber
-
Patent number: 8914430Abstract: A method is described that involves executing a first instruction with a functional unit. The first instruction is a multiply-add instruction. The method further includes executing a second instruction with the functional unit. The second instruction is a round instruction.Type: GrantFiled: September 24, 2010Date of Patent: December 16, 2014Assignee: Intel CorporationInventors: Amit Gradstein, Cristina S. Anderson, Zeev Sperber, Simon Rubanovich, Benny Eitan
-
Publication number: 20140351309Abstract: An FMA unit, for carrying out an arithmetic operation in a model computation unit of a control unit, is configured to process input of two factors and one summand in the form of floating point values, and provide a computation result of such processing as an output variable in the form of a floating point value. The FMA unit is designed to carry out a multiplication and a subsequent addition, the bit resolutions of the inputs for the factors being lower than the bit resolution of the input for the summand and the bit resolution of the output variable.Type: ApplicationFiled: May 21, 2014Publication date: November 27, 2014Applicant: ROBERT BOSCH GMBHInventors: Wolfgang FISCHER, Andre GUNTORO
-
Publication number: 20140351308Abstract: A system and method are provided for dynamically reducing power consumption of floating-point logic. A disable control signal that is based on a characteristic of a floating-point format input operand is received and a portion of a logic circuit is disabled based on the disable control signal. The logic circuit processes the floating-point format input operand to generate an output.Type: ApplicationFiled: May 23, 2013Publication date: November 27, 2014Applicant: NVIDIA CorporationInventors: David C. Tannenbaum, Srinivasan Iyer
-
Patent number: 8892619Abstract: A floating-point fused multiply-add (FMA) unit embodied in an integrated circuit includes a multiplier circuit cascaded with an adder circuit to produce a result A*C+B. To decrease latency, the FMA includes accumulation bypass circuits forwarding an unrounded result of the adder to inputs of the close path and the far path circuits of the adder, and forwarding an exponent result in carry save format to an input of the exponent difference circuit. Also included in the FMA is a multiply-add bypass circuit forwarding the unrounded result to the inputs of the multiplier circuit. The adder circuit includes an exponent difference circuit implemented in parallel with the multiplier circuit; a close path circuit implemented after the exponent difference circuit; and a far path circuit implemented after the exponent difference circuit.Type: GrantFiled: July 24, 2012Date of Patent: November 18, 2014Assignee: The Board of Trustees of the Leland Stanford Junior UniversityInventors: Sameh Galal, Mark Horowitz
-
Patent number: 8868632Abstract: Methods and apparatus for predicting an underflow condition associated with a floating-point multiply-add operation are disclosed. An example apparatus obtains a first operand value and a second operand value. The example apparatus then determines if the second operand value subtracted from the first operand value is greater than a minimum value and determines if the first operand value is greater than a sum value associated with a minimum operand value. The example apparatus then asserts an output signal indicative of an absence of an underflow condition associated with a floating-point value based on conditions associated with determining whether the second operand value subtracted from the first operand value is greater than the minimum value and determining if the first operand value is greater than the sum value.Type: GrantFiled: September 15, 2005Date of Patent: October 21, 2014Assignee: Intel CorporationInventor: Marius A. Cornea-Hasegan
-
Publication number: 20140244704Abstract: A method for operating a fused-multiply-add pipeline in a floating-point unit of a processor is disclosed. A multiplication is initially performed between a first operand and a second operand in a multiplier block to obtain a set of partial product results. The partial product results are sent to a carry-save adder block. A partial product reduction is performed on the partial product results to generate a carry-save result having a sum term and a carry term. The carry-save result is then formatted to generate a carry-out bit. The carry-save result is added to a third operand to generate a final result.Type: ApplicationFiled: January 31, 2014Publication date: August 28, 2014Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: SON DAO TRONG, MICHAEL KLEIN, CHRISTOPHE LAYER, SILVIA M. MUELLER
-
Patent number: 8805915Abstract: A fixed multiply-add (FMA) apparatus and method are provided. The FMA apparatus includes a partial product generator configured to generate a partial sum and a partial carry, a carry save adder configured to generate a partial sum having a first bit size and a partial carry having the first bit size by adding the partial sum and the partial carry to least significant bits (LSBs) of the mantissa of a third floating-point number, a carry select adder configured to generate a mantissa having a second bit size by adding the first bit-size partial sum and the first bit-size partial carry to most significant bits (MSBs) of the third floating-point number, and a selector configured to transmit the first bit-size partial sum and the first bit-size partial carry to the carry save adder or the carry select adder according to whether the mantissa of the third floating-point number is zero.Type: GrantFiled: June 6, 2011Date of Patent: August 12, 2014Assignees: Samsung Electronics Co., Ltd., Industry-Academic Cooperation Foundation, Yonsei UniversityInventors: Hyeong-Seok Yu, Dong-Kwan Suh, Suk-Jin Kim, San Kim, Yong-Surk Lee
-
Publication number: 20140195580Abstract: A method of an aspect includes receiving a floating point round-off amount determination instruction. The instruction indicates a source of one or more floating point data elements, indicates a number of fraction bits after a radix point, and indicates a destination storage location. A result including one or more result floating point data elements is stored in the destination storage location in response to the floating point round-off amount determination instruction. Each of the one or more result floating point data elements includes a difference between a corresponding floating point data element of the source in a corresponding position, and a rounded version of the corresponding floating point data element of the source that has been rounded to the indicated number of the fraction bits. Other methods, apparatus, systems, and instructions are disclosed.Type: ApplicationFiled: December 30, 2011Publication date: July 10, 2014Inventors: Cristina S. Anderson, Bret L. Toll, Robert Valentine, Simon Rubanovich, Amit Gradsieien
-
Publication number: 20140188967Abstract: In one embodiment, a processor includes at least one floating point unit. The at least one floating point unit may include an adder, leading change anticipator (LCA) logic, and a shifter. The adder may be to add a first operand X and a second operand Y to obtain an output operand having a bit length n. The LCA logic may be to: for each bit position i from n?1 to 1, obtain a set of propagation values and a set of bit values based on the first operand X and the second operand Y; and generate a LCA mask based on the set of propagation values and the set of bit values. The shifter may be to normalize the output operand based on the LCA mask. Other embodiments are described and claimed.Type: ApplicationFiled: December 28, 2012Publication date: July 3, 2014Inventors: Simon Rubanovich, Thierry Pons, Amit Gradstein, Zeev Sperber
-
Publication number: 20140188968Abstract: Embodiments of the present invention may provide methods and circuits for energy efficient floating point multiply and/or add operations. A variable precision floating point circuit may determine the certainty of the result of a multiply-add floating point calculation in parallel with the floating-point calculation. The variable precision floating point circuit may use the certainty of the inputs in combination with information from the computation, such as, binary digits that cancel, normalization shifts, and rounding, to perform a calculation of the certainty of the result. A floating point multiplication circuit may determine whether a lowest portion of a multiplication result could affect the final result and may induce a replay of the multiplication operation when it is determined that the result could affect the final result.Type: ApplicationFiled: December 28, 2012Publication date: July 3, 2014Inventors: Himanshu KAUL, Mark A. ANDERS, Sanu K. MATHEW, Ram K. KRISHNAMURTHY, William C. HASENPLAUGH, Randy L. ALLMON, Jonathan ENOCH
-
Publication number: 20140188966Abstract: A floating-point fused multiply-add (FMA) unit embodied in an integrated circuit includes a multiplier circuit cascaded with an adder circuit to produce a result A*C+B. To decrease latency, the FMA includes accumulation bypass circuits forwarding an unrounded result of the adder to inputs of the close path and the far path circuits of the adder, and forwarding an exponent result in carry save format to an input of the exponent difference circuit. Also included in the FMA is a multiply-add bypass circuit forwarding the unrounded result to the inputs of the multiplier circuit. The adder circuit includes an exponent difference circuit implemented in parallel with the multiplier circuit; a close path circuit implemented after the exponent difference circuit; and a far path circuit implemented after the exponent difference circuit.Type: ApplicationFiled: July 24, 2012Publication date: July 3, 2014Inventors: Sameh Galal, Mark Horowitz
-
Publication number: 20140164465Abstract: A method, circuit arrangement, and program product for executing instructions including denormal values for one or more operands in a vector execution unit. A denormal value operand may be prenormalized by a first processing lane of the vector execution unit upon detecting the denormal value. The prenormalized value and any other operands of the instruction may be communicated to a dot product adder of the vector execution unit. The dot product adder performs at least a portion of the floating point operation with the prenormalized value and any other operands of the instruction.Type: ApplicationFiled: March 11, 2013Publication date: June 12, 2014Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
-
Publication number: 20140164464Abstract: A method, circuit arrangement, and program product for executing instructions including denormal values for one or more operands in a vector execution unit. A denormal value operand may be prenormalized by a first processing lane of the vector execution unit upon detecting the denormal value. The prenormalized value and any other operands of the instruction may be communicated to a dot product adder of the vector execution unit. The dot product adder performs at least a portion of the floating point operation with the prenormalized value and any other operands of the instruction.Type: ApplicationFiled: December 6, 2012Publication date: June 12, 2014Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
-
Patent number: 8732225Abstract: Digital signal processing (“DSP”) block circuitry on an integrated circuit (“IC”) is adapted for use (e.g., in multiple instances of the DSP block circuitry on the IC) for implementing finite-impulse-response (“FIR”) digital filters in systolic form. Each DSP block may include (1) first and second multiplier circuitry and (2) adder circuitry for adding (a) outputs of the multipliers and (b) signals chained in from a first other instance of the DSP block circuitry. Systolic delay circuitry is provided for either the outputs of the first multiplier (upstream from the adder) or at least one of the sets of inputs to the first multiplier. Additional systolic delay circuitry is provided for outputs of the adder, which are chained out to a second other instance of the DSP block circuitry.Type: GrantFiled: October 11, 2013Date of Patent: May 20, 2014Assignee: Altera CorporationInventors: Suleyman Demirsoy, Hyun Yi
-
Publication number: 20140136587Abstract: The present application provides a method and apparatus for supporting denormal numbers in a floating point multiply-add unit (FMAC). One embodiment of the FMAC is configurable to add a product of first and second operands to a third operand. This embodiment of the FMAC is configurable to determine a minimum exponent shift for a sum of the product and the third operand by subtracting a minimum normal exponent from a product exponent of the product. This embodiment of the FMAC is also configurable to cause bits representing the sum to be left shifted by the minimum exponent shift if a third exponent of the third operand is less than or equal to the product exponent and the minimum exponent shift is less than or equal to a predicted left shift for the sum.Type: ApplicationFiled: November 12, 2012Publication date: May 15, 2014Inventors: Kelvin D. Goveas, Debjit Das Sarma, Scott A. Hilker, Hanbing Liu
-
Patent number: 8711146Abstract: Methods and apparatuses for constructing a multi-level solver, comprising decomposing a graph into a plurality of pieces, wherein each of the pieces has a plurality of edges and a plurality of interface nodes, and wherein the interface nodes in the graph are fewer in number than the edges in the graph; producing a local preconditioner for each of the pieces; and aggregating the local preconditioners to form a global preconditioner.Type: GrantFiled: November 29, 2007Date of Patent: April 29, 2014Assignee: Carnegie Mellon UniversityInventors: Gary Lee Miller, Ioannis Koutis
-
Patent number: 8694572Abstract: A decimal floating-point Fused-Multiply-Add (FMA) unit that performs the operation of ±(A×B)±C on decimal floating-point operands. The decimal floating-point FMA unit executes the multiplication and addition operations compliant with the IEEE 754-2008 standard. Specifically, the decimal floating-point FMA includes a parallel multiplier and injects the addend after required alignment as an additional partial product in the reduction tree used in the parallel multiplier. The decimal floating-point FMA unit may be configured to perform addition-subtraction operations or multiplication operations as standalone operations.Type: GrantFiled: July 6, 2011Date of Patent: April 8, 2014Assignee: SilMinds, LLC, EgyptInventors: Rodina Samy, Hossam Ali Hassan Fahmy, Tarek Eldeeb, Ramy Raafat, Yasmeen Farouk, Mostafa Elkhouly, Amira Mohamed
-
Publication number: 20140095568Abstract: A fused multiply-adder is disclosed. The fused multiply-adder includes a Booth encoder, a fraction multiplier, a carry corrector, and an adder. The Booth encoder initially encodes a first operand. The fraction multiplier multiplies the Booth-encoded first operand by a second operand to produce partial products, and then reduces the partial products into a set of redundant sum and carry vectors. The carry corrector then generates a carry correction factor for correcting the carry vectors. The adder adds the redundant sum and carry vectors and the carry correction factor to a third operand to yield a final result.Type: ApplicationFiled: December 3, 2013Publication date: April 3, 2014Inventors: MAARTEN J. BOERSMA, KLAUS M. KROENER, CHRISTOPHE J. LAYER, SILVIA M. MUELLER
-
Publication number: 20140067894Abstract: Systems and methods for efficiently handling problematic corner cases in floating point operations without raising flags or exceptions. One or more floating point numbers that will generate a problematic corner case in floating point computations, such as division or square root computation, are detected. Fix-up operations are applied to modify the computation such that the problematic corner case is avoided. The modified computation then is performed, while suppressing error flags are suppressed during intermediate stages.Type: ApplicationFiled: August 30, 2012Publication date: March 6, 2014Applicant: QUALCOMM INCORPORATEDInventors: Erich James Plondke, David J. Hoyle, Swaminathan Balasubramanian
-
Publication number: 20140067895Abstract: Systems and methods for implementing a floating point fused multiply and accumulate with scaling (FMASc) operation. A floating point unit receives input multiplier, multiplicand, addend, and scaling factor operands. A multiplier block is configured to multiply mantissas of the multiplier and multiplicand to generate an intermediate product. Alignment logic is configured to pre-align the addend with the intermediate product based on the scaling factor and exponents of the addend, multiplier, and multiplicand, and accumulation logic is configured to add or subtract a mantissa of the pre-aligned addend with the intermediate product to obtain a result of the floating point unit. Normalization and rounding are performed on the result, avoiding rounding during intermediate stages.Type: ApplicationFiled: August 30, 2012Publication date: March 6, 2014Applicant: QUALCOMM INCORPORATEDInventor: Liang-Kai Wang
-
Patent number: 8667042Abstract: A vector functional unit implemented on a semiconductor chip to perform vector operations of dimension N is described. The vector functional unit includes N functional units. Each of the N functional units have logic circuitry to perform: a first integer multiply add instruction that presents highest ordered bits but not lowest ordered bits of a first integer multiply add calculation, and, a second integer multiply add instruction that presents lowest ordered bits but not highest ordered bits of a second integer multiply add calculation.Type: GrantFiled: September 24, 2010Date of Patent: March 4, 2014Assignee: Intel CorporationInventors: Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver
-
Patent number: 8626813Abstract: A fused floating-point dot product unit. The fused dot product unit includes an improved alignment scheme that generates smaller significand pairs compared to the traditional alignment due to the reduced shift amount and sticky logic. Furthermore, the fused dot product unit implements early normalization and a fast rounding scheme. By normalizing the significands prior to the significand addition, the length of the adder can be reduced and the round logic can be performed in parallel. Additionally, the fused dot product unit implements a four-input leading zero anticipation unit thereby reducing the overhead of the reduction tree by encoding the four inputs at once. The fused floating-point dot product unit may also employ a dual-path (a far path and a close path) algorithm to improve performance. Pipelining may also be applied to the dual-path fused dot product unit to increase the throughput.Type: GrantFiled: August 12, 2013Date of Patent: January 7, 2014Assignee: Board of Regents, The University of Texas SystemInventors: Earl E. Swartzlander, Jongwook Sohn
-
Publication number: 20140006467Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.Type: ApplicationFiled: June 29, 2012Publication date: January 2, 2014Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
-
Publication number: 20130332501Abstract: A fused multiply-adder is disclosed. The fused multiply-adder includes a Booth encoder, a fraction multiplier, a carry corrector, and an adder. The Booth encoder initially encodes a first operand. The fraction multiplier multiplies the Booth-encoded first operand by a second operand to produce partial products, and then reduces the partial products into a set of redundant sum and carry vectors. The carry corrector then generates a carry correction factor for correcting the carry vectors. The adder adds the redundant sum and carry vectors and the carry correction factor to a third operand to yield a final result.Type: ApplicationFiled: June 11, 2012Publication date: December 12, 2013Applicant: IBM CorporationInventors: Maarten J. Boersma, Klaus Michael Kroener, Christophe J. Layer, Silvia M. Mueller
-
Patent number: 8601047Abstract: A decimal floating-point (DFP) adder includes a decimal leading-zero anticipator (LZA). The DFP adder receives DFP operands. Each operand includes a significand, an exponent, a sign bit and a leading zero count for the significand. The DFP adder adds or subtracts the DFP operands to obtain a DFP result. The LZA determines the leading zero count associated with the significand of the DFP result. The LZA operates at least partially in parallel with circuitry (in the DFP adder) that computes the DFP result. The LZA does not wait for that circuitry to finish computation of the DFP result. Instead it “anticipates” the number of leading zeros that the result's significand will contain.Type: GrantFiled: June 13, 2013Date of Patent: December 3, 2013Assignee: Advanced Micro DevicesInventor: Liang-Kai Wang
-
Patent number: 8577948Abstract: In one embodiment, a processor includes a multiply-accumulate (MAC) unit having a first path to handle execution of an instruction if a difference between at least a portion of first and second operands and a third operand is less than a threshold value, and a second path to handle the instruction execution if the difference is greater than the threshold value. Based on the difference, at least part of the third operand is to be provided to a multiplier of the MAC unit or to a compressor of the second path. Other embodiments are described and claimed.Type: GrantFiled: September 20, 2010Date of Patent: November 5, 2013Assignee: Intel CorporationInventors: Suresh Srinivasan, Rajaraman Ramanarayanan, Sanu K. Mathew, Ram K. Krishnamurthy, Vasantha K. Erraguntla
-
Publication number: 20130282784Abstract: A device and methods are disclosed for communicating an unrounded result from one arithmetic calculation for use in a second, subsequent calculation. For example, an unrounded result of a first calculation can be forwarded to provide a multiplier, a multiplicand or an addend operand for the subsequent operation. The operand can be forwarded to the input of the same fused multiply addition module (FMAM) that supplied the result, or to another FMAM, and do so without regard to the precision of the forwarded operand, the precision of the subsequent operation, or the native precision of the FMAM.Type: ApplicationFiled: June 19, 2013Publication date: October 24, 2013Inventors: David S. Oliver, Debjit Das Sarma, Scott Hilker
-
Publication number: 20130282783Abstract: An embodiment of an apparatus performs a floating-point multiply-add process on a first multiplicand, a second multiplicand, and an addend. A leading 0 bit is added to a mantissa of the first multiplicand to form an expanded first mantissa, and a partial-product multiplication is performed on the expanded first mantissa and a mantissa of the second multiplicand to produce partial-product sum and a partial-product carry mantissas. Leading bits of the partial-product sum and carry mantissas are changed to 0 bits if they are both 1 bits, and the partial-product sum and the partial-product carry are shifted right according to an exponent difference of a product of the first multiplicand and the second multiplicand. Otherwise both the partial-product sum and carry mantissas are arithmetically shifted right according to the exponent difference. The first and second multiplicands and the addend can be complex numbers.Type: ApplicationFiled: April 24, 2012Publication date: October 24, 2013Applicant: FutureWei Technologies, Inc.Inventors: Zhihong Li, Tong Sun, Zhikun Cheng
-
Publication number: 20130262547Abstract: A method and apparatus for including in a processor instructions for performing multiply-add operations on packed data. In one embodiment, a processor is coupled to a memory. The memory has stored therein a first packed data and a second packed data. The processor performs operations on data elements in said first packed data and said second packed data to generate a third packed data in response to receiving an instruction. At least two of the data elements in this third packed data storing the result of performing multiply-add operations on data elements in the first and second packed data.Type: ApplicationFiled: May 30, 2013Publication date: October 3, 2013Inventors: Alexander Peleg, Milland Mittal, Larry M. Mennemeier, Benny Eitan, Carole Dulong, Eiichi Kowashi, Wolf C. Witt
-
Patent number: 8549054Abstract: In an arithmetic processing apparatus, a dividing unit divides a second bit string into a low-order bit part having a bit width equal to a first bit width and a high-order bit part which is higher than the low-order bit part, a first arithmetic unit performs arithmetic operations for a carry to and a borrow from the high-order bit part; and a second arithmetic unit performs addition of absolute values of the low-order bit part and the first bit string. Finally, a selecting unit selects an output of the first arithmetic unit from among an arithmetic operation result with a carry, an arithmetic operation result with a borrow, and the high-order bit part itself, according to information about the high-order bit part, sign information of the first bit string and the second bit string, and an intermediate result of the addition of the absolute values by the second arithmetic unit.Type: GrantFiled: August 21, 2008Date of Patent: October 1, 2013Assignee: Fujitsu LimitedInventor: Ryuji Kan
-
Patent number: 8495121Abstract: A device and methods are disclosed for communicating an unrounded result from one arithmetic calculation for use in a second, subsequent calculation. For example, an unrounded result of a first calculation can be forwarded to provide a multiplier, a multiplicand or an addend operand for the subsequent operation. The operand can be forwarded to the input of the same fused multiply addition module (FMAM) that supplied the result, or to another FMAM, and do so without regard to the precision of the forwarded operand, the precision of the subsequent operation, or the native precision of the FMAM.Type: GrantFiled: November 20, 2008Date of Patent: July 23, 2013Assignee: Advanced Micro Devices, Inc.Inventors: David S. Oliver, Debjit Das Sarma, Scott Hilker
-
Patent number: 8489663Abstract: A decimal floating-point (DFP) adder includes a decimal leading-zero anticipator (LZA). The DFP adder receives DFP operands. Each operand includes a significand, an exponent, a sign bit and a leading zero count for the significand. The DFP adder adds or subtracts the DFP operands to obtain a DFP result. The LZA determines the leading zero count associated with the significand of the DFP result. The LZA operates at least partially in parallel with circuitry (in the DFP adder) that computes the DFP result. The LZA does not wait for that circuitry to finish computation of the DFP result. Instead it “anticipates” the number of leading zeros that the result's significand will contain.Type: GrantFiled: June 5, 2009Date of Patent: July 16, 2013Assignee: Advanced Micro DevicesInventor: Liang-Kai Wang
-
Patent number: 8471593Abstract: A logic cell array having a number of logic cells and a segmented bus system for logic cell communication, the bus system including different segment lines having shorter and longer segments for connecting two points in order to be able to minimize the number of bus elements traversed between separate communication start and end points.Type: GrantFiled: November 4, 2011Date of Patent: June 25, 2013Inventors: Martin Vorbach, Frank May, Dirk Reichardt, Frank Lier, Gerd Ehlers, Armin Nückel, Volker Baumgarte, Prashant Rao, Jens Oertel