Multiplication Patents (Class 708/503)
  • Patent number: 12056461
    Abstract: An integrated circuit with specialized processing blocks is provided. A specialized processing block may be optimized for machine learning algorithms and may include a multiplier data path that feeds an adder data path. The multiplier data path may be decomposed into multiple partial product generators, multiple compressors, and multiple carry-propagate adders of a first precision. Results from the carry-propagate adders may be added using a floating-point adder of the first precision. Results from the floating-point adder may be optionally cast to a second precision that is higher or more accurate than the first precision. The adder data path may include an adder of the second precision that combines the results from the floating-point adder with zero, with a general-purpose input, or with other dot product terms. Operated in this way, the specialized processing block provides a technical improvement of greatly increasing the functional density for implementing machine learning algorithms.
    Type: Grant
    Filed: September 24, 2021
    Date of Patent: August 6, 2024
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Dongdong Chen
  • Patent number: 12008367
    Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
    Type: Grant
    Filed: June 21, 2022
    Date of Patent: June 11, 2024
    Assignee: Intel Corporation
    Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Raanan Sade, Menachem Adelman, Zeev Sperber, Amit Gradstein, Simon Rubanovich
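
The accumulate step described in the abstract above can be sketched in software. This is only an illustration under stated assumptions: it takes the 16-bit format to be IEEE 754 half precision (Python's `struct` code `e`), rounds after every addition, and uses the hypothetical names `round_to` and `dot_accumulate_fp16`. The patented instruction may use a different 16-bit format and different rounding behavior.

```python
import struct
from typing import Sequence


def round_to(fmt: str, x: float) -> float:
    """Round a Python float to the IEEE 754 format named by a struct code."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def dot_accumulate_fp16(dest: float, src1: Sequence[float], src2: Sequence[float]) -> float:
    """Multiply N pairs of half-precision elements and accumulate the
    products into a single-precision destination value (illustrative only)."""
    acc = round_to("<f", dest)           # destination element is single precision
    for a, b in zip(src1, src2):
        a16 = round_to("<e", a)          # assumption: sources are IEEE half precision
        b16 = round_to("<e", b)
        # The product of two half-precision values is exact in a Python float;
        # the running sum is rounded back to single precision each step.
        acc = round_to("<f", acc + a16 * b16)
    return acc


result = dot_accumulate_fp16(1.0, [1.5, 2.0], [2.0, 0.25])
```

Here `result` combines the previous destination contents (1.0) with the two products 1.5 × 2.0 and 2.0 × 0.25.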
  • Patent number: 11816446
    Abstract: Systems and methods are provided to perform multiply-accumulate operations of multiple data types in a systolic array. One or more processing elements in the systolic array can include a shared multiplier and one or more adders. The shared multiplier can include a separate and/or a shared circuitry where the shared circuitry can perform at least a part of integer multiplication and at least a part of non-integer multiplication. The one or more adders can include one or more shared adders or one or more separate adders. The shared adder can include a separate and/or a shared circuitry where the shared circuitry can perform at least a part of integer addition and at least a part of non-integer addition.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: November 14, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Thomas Elmer, Thomas A. Volpe
  • Patent number: 11740869
    Abstract: Embodiments are directed to selecting a multiplication operation to be scheduled in a first stage of an execution schedule, the multiplication operation meeting a first condition of having no dependency. An addition/subtraction operation is selected to be scheduled in the first stage of the execution schedule responsive to meeting the first condition. A process is performed which includes selecting another multiplication operation to be scheduled in a next stage of the execution schedule responsive to meeting the first condition or a second condition, the second condition including having a dependency that is fulfilled by a previous stage. The process includes selecting another addition/subtraction operation to be scheduled in the next stage of the execution schedule responsive to meeting the first or second condition, and repeating the process until each operation has been scheduled in the execution schedule, where the execution schedule is configured for execution by an arithmetic logic unit.
    Type: Grant
    Filed: April 28, 2021
    Date of Patent: August 29, 2023
    Assignee: International Business Machines Corporation
    Inventor: Rajat Rao
  • Patent number: 11650819
    Abstract: Systems, methods, and apparatuses relating to instructions to multiply floating-point values of about one are described.
    Type: Grant
    Filed: December 13, 2019
    Date of Patent: May 16, 2023
    Assignee: Intel Corporation
    Inventors: Mohamed Elmalaki, Elmoustapha Ould-Ahmed-Vall
  • Patent number: 11507350
    Abstract: The present disclosure relates to a fused vector multiplier for computing an inner product between vectors, where vectors to be computed are a multiplier number vector $\vec{A} = \{A_N \ldots A_2 A_1 A_0\}$ and a multiplicand number vector $\vec{B} = \{B_N \ldots B_2 B_1 B_0\}$; $\vec{A}$ and $\vec{B}$ have the same dimension, which is N+1.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: November 22, 2022
    Assignee: CAMBRICON (XI'AN) SEMICONDUCTOR CO., LTD.
    Inventors: Tianshi Chen, Shengyuan Zhou, Zidong Du, Qi Guo
  • Patent number: 11500631
    Abstract: A method is provided that includes performing, by a processor in response to a floating point multiply instruction, multiplication of floating point numbers, wherein determination of values of implied bits of leading bit encoded mantissas of the floating point numbers is performed in parallel with multiplication of the encoded mantissas, and storing, by the processor, a result of the floating point multiply instruction in a storage location indicated by the floating point multiply instruction.
    Type: Grant
    Filed: May 20, 2020
    Date of Patent: November 15, 2022
    Assignee: Texas Instruments Incorporated
    Inventors: Mujibur Rahman, Timothy David Anderson
  • Patent number: 11366638
    Abstract: Floating point Multiply-Add, Accumulate Unit, supporting BF16 format for Multiply-Accumulate operations, and FP32 Single-Precision Addition complying with the IEEE 754 Standard. The Multiply-Accumulate unit uses higher radix and longer internal 2's complement significand representation to facilitate precision as well as comparison and operation with negative numbers. The addition can be performed using Carry-Save format to avoid long carry propagation and speed up the operation. The circuit uses early exponent comparison to shorten the accumulate pipeline stage. Operations including overflow detection, zero detection and sign extension are adopted for 2's complement and Carry-Save format.
    Type: Grant
    Filed: September 2, 2021
    Date of Patent: June 21, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Vojin G. Oklobdzija, Matthew M. Kim
  • Patent number: 11366636
    Abstract: An apparatus and method for efficiently performing a multiply add or multiply accumulate operation. For example, one embodiment of a processor comprises: a decoder to decode an instruction specifying an operation, the instruction comprising a first operand identifying a multiplier and a second operand identifying a multiplicand; and fused multiply-add (FMA) execution circuitry comprising first multiplication circuitry to perform a multiplication using the multiplicand and multiplier to generate a result for multipliers and multiplicands falling within a first precision range, and second multiplication circuitry to be used instead of the first multiplication circuitry for multipliers and multiplicands falling within a second precision range.
    Type: Grant
    Filed: July 1, 2020
    Date of Patent: June 21, 2022
    Assignee: INTEL CORPORATION
    Inventors: Aditya Varma, Michael Espig
  • Patent number: 11296068
    Abstract: A discrete three-dimensional (3-D) processor comprises first and second dice. The first die comprises 3-D memory (3D-M) arrays, whereas the second die comprises logic circuits and at least an off-die peripheral-circuit component of the 3D-M array(s). The first die does not comprise the off-die peripheral-circuit component. The first and second dice are communicatively coupled by a plurality of inter-die connections. The preferred discrete 3-D processor can be applied to mathematical computing, computer simulation, configurable gate array, pattern processing and neural network.
    Type: Grant
    Filed: November 15, 2020
    Date of Patent: April 5, 2022
    Assignees: HangZhou HaiCun Information Technology Co., Ltd.
    Inventor: Guobiao Zhang
  • Patent number: 11294627
    Abstract: The disclosure relates to a hardware operator for dot-product computation, comprising a plurality of multipliers each receiving two multiplicands in the form of floating-point numbers encoded in a first precision format; an alignment circuit associated with each multiplier, configured to, based on the exponents of the corresponding multiplicands, convert the result of the multiplication into a respective fixed-point number having a sufficient number of bits to cover the full dynamic range of the multiplication; and a multi-adder configured to add without loss the fixed-point numbers provided by the multipliers, providing a sum in the form of a fixed-point number.
    Type: Grant
    Filed: June 25, 2020
    Date of Patent: April 5, 2022
    Assignee: Kalray
    Inventor: Nicolas Brunie
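
The lossless dot product in the abstract above can be sketched in software. The assumptions here are loud ones: the "first precision format" is taken to be IEEE 754 half precision, whose products always fit in a fixed-point number with 48 fraction bits, and the names `FRAC_BITS`, `to_half`, and `dot_fixed` are hypothetical. Python integers stand in for the wide fixed-point datapath; this is not the patented circuit.

```python
import struct
from fractions import Fraction
from typing import Sequence

# Smallest nonzero half-precision value is 2**-24, so any product of two
# half-precision values is a multiple of 2**-48.
FRAC_BITS = 48


def to_half(x: float) -> float:
    """Round a Python float to IEEE 754 half precision (assumed input format)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def dot_fixed(xs: Sequence[float], ys: Sequence[float]) -> int:
    """Return the exact dot product as a fixed-point integer with
    FRAC_BITS fraction bits. No rounding happens in the accumulation."""
    total = 0
    for x, y in zip(xs, ys):
        product = Fraction(to_half(x)) * Fraction(to_half(y))  # exact product
        scaled = product * (1 << FRAC_BITS)                    # align to fixed point
        assert scaled.denominator == 1                         # fits without loss
        total += scaled.numerator                              # exact integer add
    return total


fixed_sum = dot_fixed([1.5, 2.0 ** -24], [2.0, 2.0 ** -24])
value = Fraction(fixed_sum, 1 << FRAC_BITS)  # exact rational value of the sum
```

The second product, 2⁻⁴⁸, would be lost if the sum were kept in half or single precision; in the fixed-point sum it survives as the lowest bit.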
  • Patent number: 11250105
    Abstract: A computation unit that comprises (i) a multiplicand vector decomposer that generates a decomposed multiplicand vector which uses a sequence of first and second concatenated multiplicand sub-elements (1st2ndCMCSE) in a lower-precision format (LPF) to represent corresponding ones of multiplicand elements in a multiplicand vector in a higher-precision format (HPF), (ii) a multiplier vector decomposer that generates a decomposed multiplier vector which uses a sequence of first and second concatenated multiplier sub-elements (1st2ndCMLSE) in the LPF to represent corresponding ones of multiplier elements in a multiplier vector in the HPF, (iii) a multiplicand tensor encoder that encodes double reads of the sequence of the 1st2ndCMCSE in a decomposed multiplicand tensor, and (iv) a product vector generator that generates a product vector containing a sequence of first and second concatenated product sub-elements by executing general matrix-matrix multiplication (GeMM) operations between the double reads of the 1st2
    Type: Grant
    Filed: May 12, 2020
    Date of Patent: February 15, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Mingran Wang, Xiaoyan Li, Yongning Sheng
  • Patent number: 10963245
    Abstract: An apparatus is provided that includes an instruction decoder responsive to an anchored-data processing instruction to generate one or more control signals. Conversion circuitry is responsive to the one or more control signals to perform a conversion from a data value to an anchored-data select value. The conversion is based on anchor metadata indicative of a given range of significance for the anchored-data select value. Output circuitry is responsive to the one or more control signals to write the anchored-data select value to a register.
    Type: Grant
    Filed: May 29, 2019
    Date of Patent: March 30, 2021
    Assignee: Arm Limited
    Inventors: David Raymond Lutz, Neil Burgess, Christopher Neal Hinds, Nigel John Stephens
  • Patent number: 10949766
    Abstract: A method for an associative memory device includes dividing a multi-bit mantissa A of a number X into a plurality of smaller partial mantissas Aj, offline calculating a plurality of partial exponents F(Aj) for each possible value of each partial mantissa Aj and storing the plurality of partial exponents F(Aj) in a look up table (LUT) of the associative memory device. A system includes an associative memory array to store a plurality of partial mantissas Ai of a mantissa A of a number X and an exponent calculator to utilize the partial mantissas to compute e to the power of X.
    Type: Grant
    Filed: October 15, 2017
    Date of Patent: March 16, 2021
    Assignee: GSI Technology Inc.
    Inventor: Avidan Akerib
  • Patent number: 10754616
    Abstract: Low precision computers can be efficient at finding possible answers to search problems. However, sometimes the task demands finding better answers than a single low precision search. A computer system augments low precision computing with a small amount of high precision computing, to improve search quality with little additional computing.
    Type: Grant
    Filed: May 25, 2020
    Date of Patent: August 25, 2020
    Assignee: Singular Computing LLC
    Inventor: Joseph Bates
  • Patent number: 10503472
    Abstract: An apparatus and method are provided for processing floating point values using an intermediate representation which has significand, exponent and shadow sections. A less significant portion of the exponent of the floating point value defines a range of positions within the significand section where the representation of the significand is to be held. The exponent section holds a representation of a more significant portion of the exponent indicating a selected window of multiple contiguous windows spanning a value range of a format of the floating point value. A first portion of the significand section corresponds to the selected window and a second portion corresponds to an overlap into a further window which is adjacent to and lower in the value range.
    Type: Grant
    Filed: May 17, 2016
    Date of Patent: December 10, 2019
    Assignee: ARM Limited
    Inventors: Daryl John Stewart, Thomas Christopher Grocutt
  • Patent number: 10489114
    Abstract: Methods and apparatuses for performing a floating point multiply-add operation with alignment correction. A processor receives a first operand, a second operand and a third operand, wherein the first, second and third operands each represent a floating point number comprising a significand value and a biased exponent value. A processor determines a shift amount based, at least in part, on the one or more biased exponent values of the first, second or third operand. A processor determines a shift amount correction based, at least in part, on the one or more biased exponent values of the first, second or third operand being equal to zero.
    Type: Grant
    Filed: June 27, 2014
    Date of Patent: November 26, 2019
    Assignee: International Business Machines Corporation
    Inventors: Son T. Dao, Silvia Melitta Mueller
  • Patent number: 10489115
    Abstract: Methods and apparatuses for performing a floating point multiply-add operation with alignment correction. A processor receives a first operand, a second operand and a third operand, wherein the first, second and third operands each represent a floating point number comprising a significand value and a biased exponent value. A processor determines a shift amount based, at least in part, on the one or more biased exponent values of the first, second or third operand. A processor determines a shift amount correction based, at least in part, on the one or more biased exponent values of the first, second or third operand being equal to zero.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: November 26, 2019
    Assignee: International Business Machines Corporation
    Inventors: Son T. Dao, Silvia Melitta Mueller
  • Patent number: 10452403
    Abstract: In an embodiment, a method is provided. The method includes managing user-level threads on a first instruction sequencer in response to executing user-level instructions on a second instruction sequencer that is under control of an application level program. A first user-level thread is run on the second instruction sequencer and contains one or more user level instructions. A first user level instruction has at least 1) a field that makes reference to one or more instruction sequencers or 2) implicitly references with a pointer to code that specifically addresses one or more instruction sequencers when the code is executed.
    Type: Grant
    Filed: September 26, 2015
    Date of Patent: October 22, 2019
    Assignee: Intel Corporation
    Inventors: Hong Wang, John P. Shen, Edward T. Grochowski, Richard A. Hankins, Gautham N. Chinya, Bryant E. Bigbee, Shivnandan D. Kaushik, Xiang Chris Zou, Per Hammarlund, Scott Dion Rodgers, Xinmin Tian, Anil Aggawal, Prashant Sethi, Baiju V. Patel, James P Held
  • Patent number: 10381098
    Abstract: A memory interface latch including a data NAND gate and a feedback gate can be created within an integrated circuit (IC). When a feedback node is driven low, the data NAND gate can drive an inverted value of a memory array bitline input to a data output of the memory interface latch within a time of one gate delay. A feedback gate can, in a functional mode, during one phase of a clock signal, drive the feedback node high and during the other phase of the clock signal, drive the feedback node to a complement of the data output. The feedback gate can also, in an LBIST write-through mode, drive the feedback node to the value of a WRITE_DATA input. The feedback gate can also, in a fence mode, drive the feedback node to a fixed logic value.
    Type: Grant
    Filed: November 28, 2017
    Date of Patent: August 13, 2019
    Assignee: International Business Machines Corporation
    Inventors: Elizabeth L. Gerhard, Todd A. Christensen, Chad A. Adams, Peter T. Freiburger
  • Patent number: 10346133
    Abstract: A processor includes an integer multiplier configured to execute an integer multiply instruction to multiply significand bits of at least one floating point operand of a floating point multiply operation. The processor also includes a floating point multiplier configured to execute a special purpose floating point multiply accumulate instruction with respect to an intermediate result of the floating point multiply operation and the at least one floating point operand to generate a final floating point multiplication result.
    Type: Grant
    Filed: December 21, 2017
    Date of Patent: July 9, 2019
    Assignee: QUALCOMM Incorporated
    Inventors: Albert Danysh, Erich Plondke, Eric Mahurin
  • Patent number: 10331406
    Abstract: A data processing apparatus and method of operating a data processing apparatus are disclosed. Comparisons are made between first and second floating-point operands received. A more significant portion of the first floating-point operand and of the second floating-point operand are subject to comparison. The more significant portion of the first floating-point operand minus a least significant bit in the more significant portion is subject to comparison with the more significant portion of the second floating-point operand. A less significant portion of the first floating-point operand and of the second floating-point operand are also subject to comparison. In dependence on the outcome of these comparisons, right-shift circuitry is used selectively to perform a 1-bit right shift on a difference calculated between the first floating-point operand and the second floating-point operand.
    Type: Grant
    Filed: November 17, 2017
    Date of Patent: June 25, 2019
    Assignee: ARM Limited
    Inventors: David Raymond Lutz, Thomas Gilles Tarridec
  • Patent number: 10037191
    Abstract: A method and computer system are provided for performing a comparison computation, e.g. for use in a check procedure for a reciprocal square root operation. The comparison computation compares a multiplication of three values with a predetermined value. The computer system performs the multiplication using multiplier logic which is configured to perform multiply operations in which two values are multiplied together. A first and second of the three values are multiplied to determine a first intermediate result, w1. The digits of w1 are separated into two portions, w1,1 and w1,2. The third of the three values is multiplied with w1,2 and the result is added into a multiplication of the third of the three values with w1,1 to thereby determine the result of multiplying the three values together. In this way the comparison is performed with high accuracy, while keeping the area and power consumption of the multiplier logic low.
    Type: Grant
    Filed: January 18, 2018
    Date of Patent: July 31, 2018
    Assignee: Imagination Technologies Limited
    Inventor: Leonard Rarick
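
The split-and-recombine step in the abstract above can be sketched with integers. This is a minimal illustration, assuming non-negative integer operands and a hypothetical 16-bit split point; `multiply2` stands in for the two-operand multiplier logic, and none of these names come from the patent.

```python
def multiply2(x: int, y: int) -> int:
    """Stand-in for multiplier logic that multiplies exactly two values."""
    return x * y


def multiply3_split(a: int, b: int, c: int, low_bits: int = 16) -> int:
    """Multiply three values using only two-operand multiplies.

    w1 = a * b is separated into a high portion w1_1 and a low portion w1_2,
    each of which is multiplied by c; the partial results are then summed.
    """
    w1 = multiply2(a, b)                 # first intermediate result
    w1_2 = w1 & ((1 << low_bits) - 1)    # low portion of w1
    w1_1 = w1 >> low_bits                # high portion of w1
    high = multiply2(c, w1_1)            # c times the high portion
    low = multiply2(c, w1_2)             # c times the low portion
    return (high << low_bits) + low      # add low result into the high one


def product_exceeds(a: int, b: int, c: int, threshold: int) -> bool:
    """Comparison computation: is a * b * c greater than a given value?"""
    return multiply3_split(a, b, c) > threshold
```

Because w1 equals w1_1 × 2^low_bits + w1_2, the recombined value matches the full three-way product, so the comparison loses no accuracy even though each multiply sees a narrower operand.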
  • Patent number: 9996320
    Abstract: An example processor includes a register and a fused multiply-add (FMA) low functional unit. The register stores first, second, and third floating point (FP) values. The FMA low functional unit receives a request to perform an FMA low operation and, responsive to the request: multiplies the first FP value with the second FP value to obtain a first product value; adds the first product value with the third FP value to generate a first result value; rounds the first result value to generate a first FMA value; multiplies the first FP value with the second FP value to obtain a second product value; adds the second product value with the third FP value to generate a second result value; subtracts the first FMA value from the second result value to obtain a third result value, which can then be normalized and rounded (the FMA low result); and sends the FMA low result to an application.
    Type: Grant
    Filed: December 23, 2015
    Date of Patent: June 12, 2018
    Assignee: Intel Corporation
    Inventors: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber
  • Patent number: 9996319
    Abstract: An example processor includes a register and an ADD low functional unit. The register stores first, second, and third floating point (FP) values. The ADD low functional unit receives a request to perform an ADD low operation and, responsive to the request: adds the first FP value with the second FP value to obtain a first sum value; rounds the first sum value to generate an ADD value; adds the first FP value with the second FP value to obtain a second sum value; subtracts the ADD value from the second sum value to generate a difference value; normalizes the difference value to obtain a normalized difference value; rounds the normalized difference value to generate an ADD low value; and sends the ADD low value to an application.
    Type: Grant
    Filed: December 23, 2015
    Date of Patent: June 12, 2018
    Assignee: Intel Corporation
    Inventors: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber
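
The "low" results described in the two abstracts above are the parts of an exact sum or fused multiply-add that rounding discards. A minimal software sketch follows, assuming finite double-precision inputs and using exact rational arithmetic in place of the hardware's unrounded intermediate; the function names `add_low` and `fma_low` are hypothetical.

```python
from fractions import Fraction


def add_low(a: float, b: float) -> float:
    """Return the rounded difference between the exact sum a + b and the
    rounded sum (the 'ADD value'). Inputs must be finite."""
    exact_sum = Fraction(a) + Fraction(b)    # second, unrounded sum
    add_value = a + b                        # first sum, rounded to double
    return float(exact_sum - Fraction(add_value))


def fma_low(a: float, b: float, c: float) -> float:
    """Return the rounded difference between the exact a * b + c and the
    correctly rounded fused result (the 'FMA value'). Inputs must be finite."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # unrounded product-sum
    fma_value = float(exact)                         # single rounding, as in an FMA
    return float(exact - Fraction(fma_value))


# 1.0 + 2**-60 rounds to 1.0 in double precision; add_low recovers the lost part.
lost = add_low(1.0, 2.0 ** -60)
```

Adding the low value back to the rounded result, in higher precision, reconstructs more of the exact answer than the rounded result alone carries.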
  • Patent number: 9836279
    Abstract: An apparatus and method for floating-point multiplication are provided. Two partial products are generated from two operand significands, which are then added to generate a product significand. The value of an unbiased result exponent is determined from the operand exponent values and leading zero counts, and a shift amount and direction for the product significand are determined in dependence on a predetermined minimum exponent value of a predetermined canonical format. The product significand is shifted by the shift amount in the shift direction. An overflow mask identifying an overflow bit position of the product significand is generated by right shifting a predetermined mask pattern by the shift amount, and the overflow mask is applied to the product significand to extract an overflow value at the overflow bit position. This extraction of the overflow value happens before the shift circuitry shifts the product significand, allowing an overall faster floating-point multiplication to be performed.
    Type: Grant
    Filed: September 25, 2015
    Date of Patent: December 5, 2017
    Assignee: ARM Limited
    Inventor: David Raymond Lutz
  • Patent number: 9825814
    Abstract: Systems, methods, and computer-readable storage media are provided for dynamically setting an end point group for an end point. An endpoint can be assigned a default end point group when added to a network. For example, the default end point group can be a baseline port/security group which is considered an untrusted group. The end point can then be dynamically assigned an end point group based on a set of group selection rules. For example, the group selection rules can identify an end point group based on the MAC address or other attributes. When the end point is added to the network, the MAC address and/or other attributes of the end point can be determined and used to assign an end point group. As another example, an end point group can be assigned based on the amount of traffic or guest operating system.
    Type: Grant
    Filed: July 27, 2015
    Date of Patent: November 21, 2017
    Assignee: CISCO TECHNOLOGY, INC.
    Inventors: Joji Thomas Mekkattuparamban, Vijay Chander, Saurabh Jain, Van Lieu, Badhri Madabusi Vijayaraghavan, Praveen Jain, Munish Mehta, Michael R. Smith, Narender Enduri
  • Patent number: 9690545
    Abstract: A floating-point calculation apparatus comprising: a selection part; an addition and subtraction calculation part; an output determination part; and a buffer management part configured to add, when it is determined that a buffer used to store an input value is not prepared, a buffer that corresponds to the input value, wherein when a number of significant digits of the result of performing an addition and subtraction calculation exceeds a number of significant digits of the buffer selected by the selection part, the addition and subtraction calculation part shifts right or shifts left part of the result of performing the addition and subtraction calculation and divides the result of performing the addition and subtraction calculation into values each being storable in one of a plurality of buffers.
    Type: Grant
    Filed: May 29, 2015
    Date of Patent: June 27, 2017
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Shiro Kitamura, Tomoyuki Okumura, Hiroshi Nanjo
  • Patent number: 9606796
    Abstract: Computers and methods for performing mathematical functions are disclosed. An embodiment of a computer includes an operations level and a driver level. The operations level performs mathematical operations. The driver level includes a first lookup table and a second lookup table, wherein the first lookup table includes first data for calculating at least one mathematical function using a first level of accuracy. The second lookup table includes second data for calculating the at least one mathematical function using a second level of accuracy, wherein the first level of accuracy is greater than the second level of accuracy. A driver executes either the first data or the second data depending on a selected level of accuracy.
    Type: Grant
    Filed: October 30, 2013
    Date of Patent: March 28, 2017
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Kyong Ho Lee, Seok-Jun Lee, Manish Goel
  • Patent number: 9348558
    Abstract: A processor includes a carry save array multiplier. The carry save array multiplier includes an array of cascaded partial product generators. The array of cascaded partial product generators is configured to generate an output value as a product of two operands presented at inputs of the multiplier. The array of cascaded partial product generators is also configured to generate an output value as a sum of two operands presented at inputs of the multiplier.
    Type: Grant
    Filed: August 23, 2013
    Date of Patent: May 24, 2016
    Assignee: TEXAS INSTRUMENTS DEUTSCHLAND GMBH
    Inventors: Christian Wiencke, Armin Stingl
  • Patent number: 9197902
    Abstract: A method for wavelet based data compression comprising: receiving data associated with a set of pixels, computing wavelet coefficients by applying a series of Discrete Wavelet Transform (DWT) low-pass and high-pass filtering operations, wherein a number of filtering operations is reduced by: identifying common partial products for at least one of the low-pass filtering operations and the high-pass filtering operations, classifying a first portion of the wavelet coefficients as low magnitude coefficients and a second portion of the wavelet coefficients as high magnitude coefficients, eliminating the common partial products for the high magnitude wavelet coefficients, replacing multiplication operations for the low magnitude wavelet coefficients with shift-and-add operations, and eliminating the common partial products, and applying the DWT based on remaining filtering operations.
    Type: Grant
    Filed: January 14, 2011
    Date of Patent: November 24, 2015
    Assignee: M.S. Ramaiah School of Advanced Studies
    Inventors: Dipayan Mazumdar, Cyril Prasanna Raj P, Brahmananda Reddy Ganda
  • Patent number: 9170772
    Abstract: Embodiments of systems, apparatuses, and methods for performing BIDSplit instructions in a computer processor are described. In some embodiments, the execution of a BIDSplit instruction tests the encoding of a binary-integer decimal source value and extracts a sign, exponent, and/or significand into a destination.
    Type: Grant
    Filed: December 23, 2011
    Date of Patent: October 27, 2015
    Assignee: Intel Corporation
    Inventor: Shihjong J. Kuo
  • Patent number: 9092213
    Abstract: A method of performing vector operations on a semiconductor chip is described. The method includes performing a first vector instruction with a vector functional unit implemented on the semiconductor chip and performing a second vector instruction with the vector functional unit. The first vector instruction is a vector multiply add instruction. The second vector instruction is a vector leading zeros count instruction.
    Type: Grant
    Filed: September 24, 2010
    Date of Patent: July 28, 2015
    Assignee: Intel Corporation
    Inventors: Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver, Eric W. Mahurin
  • Patent number: 8996600
    Abstract: The functions available in a specialized processing block of a programmable device include floating-point operations, including support within the specialized processing block for subnormal operations. This is accomplished, in part, by borrowing an adder in the specialized processing block and using the adder to operate on output of a multiplier or other operator to complete a subnormal operation. Although the adder becomes unavailable to serve as an adder, the need to complete the operation in slower, more valuable general purpose logic is avoided. The adder and the other operator need not necessarily be located together in a specialized processing block.
    Type: Grant
    Filed: August 3, 2012
    Date of Patent: March 31, 2015
    Assignee: Altera Corporation
    Inventor: Martin Langhammer
  • Patent number: 8943114
    Abstract: An apparatus including a first circuit and a second circuit. The first circuit may be configured to receive a first 2N-bit complex number and a second 2N-bit complex number, each having a first format, and to reformat the first and the second 2N-bit complex numbers to a second format such that a lower portion of each real and imaginary part of each 2N-bit complex number is positive. The second circuit may be configured to multiply the first and the second 2N-bit complex numbers using at least one N-bit signed complex multiplier, where N is an integer.
    Type: Grant
    Filed: August 17, 2011
    Date of Patent: January 27, 2015
    Assignee: Avago Technologies General IP (Singapore) Pte. Ltd.
    Inventors: Eran Ovadia Mershain, Eran Goldstein
  • Patent number: 8930432
    Abstract: A floating point execution unit is capable of selectively repurposing one or more adders in an exponent path of the floating point execution unit to perform fixed point addition operations, thereby providing fixed point functionality in the floating point execution unit.
    Type: Grant
    Filed: August 4, 2011
    Date of Patent: January 6, 2015
    Assignee: International Business Machines Corporation
    Inventors: Mark J. Hickey, Adam J. Muff, Matthew R. Tubbs, Charles D. Wait
  • Patent number: 8918446
    Abstract: Methods and apparatus relating to reducing power consumption in multi-precision floating point multipliers are described. In an embodiment, certain portions of a multiplier are disabled in response to two or more multiplication operations with the same data size and data type occurring back-to-back. Other embodiments are also claimed and described.
    Type: Grant
    Filed: December 14, 2010
    Date of Patent: December 23, 2014
    Assignee: Intel Corporation
    Inventors: Brent R. Boswell, Thierry Pons, Tom Aviram
  • Patent number: 8918445
    Abstract: An integrated multiplier circuit operates on a variety of data formats including integer fixed point, signed or unsigned, real or complex, 8 bit, 16 bit or 32 bit as well as floating point data that may be single precision real, single precision complex or double precision. The circuit uses a single set of multiplier arrays to perform 16×16, 32×32 and 64×64 multiplies, 32×32 and 64×64 complex multiplies, and 32×32 and 64×64 complex multiplies with one operand conjugated.
    Type: Grant
    Filed: September 21, 2011
    Date of Patent: December 23, 2014
    Assignee: Texas Instruments Incorporated
    Inventors: Timothy David Anderson, Mujibur Rahman
  • Publication number: 20140372493
    Abstract: A system and method for accelerating evaluation of functions. In one embodiment, a method includes receiving, by a processor, a value to be processed, and notification of a function to be applied to the value. The value is represented in a floating point format. The value is converted, by the processor, to a fixed point format. Which of Newton-Raphson and polynomial approximation is to be used to apply the function to the value in the fixed point format is determined by the processor. The function is applied to the value in the fixed point format to generate a result in the fixed point format. The result is converted to the floating point format by the processor.
    Type: Application
    Filed: June 14, 2013
    Publication date: December 18, 2014
    Inventors: Brent Everett Peterson, Nitya Ramdas, Sotirios Christodulos Tsongas, Jonathan Zack Albus, Johann Zipperer
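    Illustration (not from the publication): a minimal sketch of the convert, evaluate, convert-back flow for one function, the reciprocal, using Newton-Raphson on fixed-point values. The 30-bit fractional format, the input range 0.5 to 1, the linear initial guess and all function names are assumptions. The rule for choosing between Newton-Raphson and polynomial approximation is not shown.

```python
FRAC = 30            # assumed number of fractional bits
ONE = 1 << FRAC      # fixed-point representation of 1.0

def to_fixed(v):
    """Convert a float to the assumed fixed-point format."""
    return int(round(v * ONE))

def to_float(f):
    """Convert a fixed-point value back to a float."""
    return f / ONE

def fx_mul(a, b):
    """Fixed-point multiply with truncation."""
    return (a * b) >> FRAC

def reciprocal(x, iterations=4):
    """Approximate 1/x for 0.5 <= x <= 1 with Newton-Raphson in fixed point."""
    xf = to_fixed(x)
    y = to_fixed(48 / 17) - fx_mul(to_fixed(32 / 17), xf)   # linear initial guess
    for _ in range(iterations):
        y = fx_mul(y, 2 * ONE - fx_mul(xf, y))              # y <- y * (2 - x*y)
    return to_float(y)
```

    Each iteration roughly squares the relative error, so a few iterations reach the resolution of the fixed-point format.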
  • Publication number: 20140351308
    Abstract: A system and method are provided for dynamically reducing power consumption of floating-point logic. A disable control signal that is based on a characteristic of a floating-point format input operand is received and a portion of a logic circuit is disabled based on the disable control signal. The logic circuit processes the floating-point format input operand to generate an output.
    Type: Application
    Filed: May 23, 2013
    Publication date: November 27, 2014
    Applicant: NVIDIA Corporation
    Inventors: David C. Tannenbaum, Srinivasan Iyer
  • Patent number: 8886695
    Abstract: A programmable integrated circuit device is programmed to normalize multiplication operations by examining the input or output values to determine the likelihood of overflow or underflow and then to adjust the input or output values accordingly. The examination of the inputs can include an examination of the number of adder stages feeding into the inputs, as well as a count of leading bits ahead of the first significant bit. Adjustment of an input can include shifting the mantissa by the leading bit count and adjusting the exponent accordingly, while adjustment of the output can include shifting the mantissa by the sum of the leading bit counts of the inputs and adjusting the exponent accordingly. Or the output can be examined to find its leading bit count and the output then can be adjusted by shifting the mantissa by the leading bit count and adjusting the exponent accordingly.
    Type: Grant
    Filed: July 10, 2012
    Date of Patent: November 11, 2014
    Assignee: Altera Corporation
    Inventor: Martin Langhammer
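    Illustration (not from the patent): a minimal sketch of normalization by leading bit count, covering both the input adjustment and the output adjustment by the sum of the input counts. The 24-bit mantissa width, the unsigned mantissas and the function names are assumptions.

```python
W = 24  # assumed mantissa width in bits

def leading_zeros(mant):
    """Number of zero bits ahead of the first significant bit in a W-bit field."""
    return W - mant.bit_length()

def normalize(mant, exp):
    """Shift the mantissa left by its leading bit count and adjust the exponent."""
    if mant == 0:
        return 0, 0
    lz = leading_zeros(mant)
    return mant << lz, exp - lz

def multiply_then_normalize(ma, ea, mb, eb):
    """Multiply raw operands, then shift the product by the sum of the input counts."""
    shift = leading_zeros(ma) + leading_zeros(mb)
    return (ma * mb) << shift, ea + eb - shift
```

    In both cases the represented value mantissa * 2**exponent is unchanged. After the output adjustment the leading one of the 2W-bit product sits in one of the top two bit positions.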
  • Patent number: 8751551
    Abstract: Digital signal processing (“DSP”) circuit blocks are provided that can more easily work together to perform larger (e.g., more complex and/or more arithmetically precise) DSP operations if desired. These DSP blocks may also include redundancy circuitry that facilitates stitching together multiple such blocks despite an inability to use some block (e.g., because of a circuit defect). Systolic registers may be included at various points in the DSP blocks to facilitate use of the blocks to implement systolic form, finite-impulse-response (“FIR”), digital filters.
    Type: Grant
    Filed: November 21, 2013
    Date of Patent: June 10, 2014
    Assignee: Altera Corporation
    Inventors: Keone Streicher, Martin Langhammer, Yi-Wen Lin, Wai-Bor Leung, David Lewis, Volker Mauer, Henry Y. Lui, Suleyman Sirri Demirsoy, Hyun Yi
  • Patent number: 8732225
    Abstract: Digital signal processing (“DSP”) block circuitry on an integrated circuit (“IC”) is adapted for use (e.g., in multiple instances of the DSP block circuitry on the IC) for implementing finite-impulse-response (“FIR”) digital filters in systolic form. Each DSP block may include (1) first and second multiplier circuitry and (2) adder circuitry for adding (a) outputs of the multipliers and (b) signals chained in from a first other instance of the DSP block circuitry. Systolic delay circuitry is provided for either the outputs of the first multiplier (upstream from the adder) or at least one of the sets of inputs to the first multiplier. Additional systolic delay circuitry is provided for outputs of the adder, which are chained out to a second other instance of the DSP block circuitry.
    Type: Grant
    Filed: October 11, 2013
    Date of Patent: May 20, 2014
    Assignee: Altera Corporation
    Inventors: Suleyman Demirsoy, Hyun Yi
  • Patent number: 8706790
    Abstract: The resources needed, particularly in a programmable device, when carrying out a mixed-precision multiplication-based floating-point operation (i.e., multiplication or division) are reduced by maintaining the mantissas of the operands in their native precisions instead of promoting the lower-precision number to the higher precision. Exponents and other elements can be handled by the higher-precision logic as they do not consume significant resources.
    Type: Grant
    Filed: March 3, 2009
    Date of Patent: April 22, 2014
    Assignee: Altera Corporation
    Inventor: Martin Langhammer
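    Illustration (not from the patent): a minimal sketch of a mixed-precision multiply that keeps each mantissa at its native width, assuming IEEE 754 single precision (24-bit mantissa) and double precision (53-bit mantissa) operands. The function names are hypothetical.

```python
import math
import struct

def to_f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def decompose(x, bits):
    """Return (mantissa, exponent) with x == mantissa * 2**exponent, |mantissa| < 2**bits."""
    m, e = math.frexp(x)                 # x == m * 2**e with 0.5 <= |m| < 1
    return int(m * (1 << bits)), e - bits

def mixed_multiply(single, double):
    """Multiply a single-precision value by a double-precision value exactly."""
    ms, es = decompose(single, 24)       # single operand keeps its 24-bit mantissa
    md, ed = decompose(double, 53)       # double operand keeps its 53-bit mantissa
    return ms * md, es + ed              # 24 x 53 product: at most 77 bits
```

    Promoting the single-precision operand first would call for a 53 x 53 multiplier and a 106-bit product. The exponent arithmetic is the same either way.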
  • Publication number: 20140089371
    Abstract: A circuit for calculating the fused sum of an addend and product of two multiplicands, the addend and multiplicands being binary floating-point numbers represented in a standardized format as a mantissa and an exponent is provided. The multiplicands are in a lower precision format than the addend, with q>2p, where p and q are respectively the mantissa size of the multiplicand precision format and the addend precision format. The circuit includes a p-bit multiplier receiving the mantissas of the multiplicands; a shift circuit that aligns the mantissa of the addend with the product output by the multiplier based on the exponent values of the addend and multiplicands; and an adder that processes q-bit mantissas, receiving the aligned mantissa of the addend and the product, the input lines of the adder corresponding to the product being completed to the right by lines at 0 to form a q-bit mantissa.
    Type: Application
    Filed: April 19, 2012
    Publication date: March 27, 2014
    Applicant: KALRAY
    Inventors: Florent Dupont De Dinechin, Nicolas Brunie, Benoit Dupont De Dinechin
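    Illustration (not from the publication): a minimal sketch of why q > 2p lets the product enter the adder unrounded, assuming p = 24 (single precision multiplicands) and q = 53 (double precision addend). The names to_f32 and mixed_fma are hypothetical, and Python floats stand in for the q-bit datapath.

```python
import struct

def to_f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def mixed_fma(a, b, c):
    """a*b + c with a, b in single precision (p = 24) and c in double precision (q = 53)."""
    a, b = to_f32(a), to_f32(b)
    product = a * b        # exact: a 48-bit product fits in a 53-bit mantissa
    return product + c     # the only rounding happens in this addition
```

    Since the product is exact, the result equals the exact value of a*b + c rounded once, which is the defining property of a fused operation.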
  • Patent number: 8671129
    Abstract: A processing unit, system, and method for performing a multiply operation in a multiply-add pipeline. To reduce the pipeline latency, the unrounded result of a multiply-add operation is bypassed to the inputs of the multiply-add pipeline for use in a subsequent operation. If it is determined that rounding is required for the prior operation, then the rounding will occur during the subsequent operation. During the subsequent operation, a Booth encoder not utilized by the multiply operation will output a rounding correction factor as a selection input to a Booth multiplexer not utilized by the multiply operation. When the Booth multiplexer receives the rounding correction factor, the Booth multiplexer will output a rounding correction value to a carry save adder (CSA) tree, and the CSA tree will generate the correct sum from the rounding correction value and the other partial products.
    Type: Grant
    Filed: March 8, 2011
    Date of Patent: March 11, 2014
    Assignee: Oracle International Corporation
    Inventors: Jeffrey S. Brooks, Christopher H. Olson
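    Illustration (not from the patent): a minimal sketch of why a pending round-up can be applied during the next multiply as one extra partial product. It uses plain integer mantissas and simple, non-Booth partial products summed with sum() in place of the Booth encoder, Booth multiplexer and CSA tree. The function names are hypothetical.

```python
def partial_products(a, b):
    """Simple (non-Booth) partial products of a * b, one row per set bit of b."""
    return [a << i for i in range(b.bit_length()) if (b >> i) & 1]

def multiply_with_deferred_rounding(unrounded, round_up, b):
    """Multiply the bypassed, unrounded mantissa by b, applying the round-up late."""
    rows = partial_products(unrounded, b)
    if round_up:
        rows.append(b)   # correction row: (unrounded + 1) * b == unrounded * b + b
    return sum(rows)
```

    The correction row costs one additional addend instead of a separate rounding step before the multiply.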
  • Publication number: 20140006467
    Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.
    Type: Application
    Filed: June 29, 2012
    Publication date: January 2, 2014
    Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
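    Illustration (not from the publication): a minimal sketch of the difference between a double rounded multiply-add and a fused one, which is why fused hardware needs extra handling to reproduce double rounded results. Python floats (IEEE 754 double precision) and exact rational arithmetic are assumed, and the function names are hypothetical.

```python
from fractions import Fraction

def multiply_then_add(a, b, c):
    """Two roundings: the product is rounded, then the sum is rounded."""
    return (a * b) + c

def fused_multiply_add(a, b, c):
    """One rounding: the exact value of a*b + c is rounded once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

# Operands chosen so that the exact product, 1 - 2**-60, rounds to 1.0.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0
```

    With these operands the double rounded result is 0.0, while the fused result is -2**-60.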
  • Patent number: 8620977
    Abstract: Digital signal processing (“DSP”) circuit blocks are provided that can more easily work together to perform larger (e.g., more complex and/or more arithmetically precise) DSP operations if desired. These DSP blocks may also include redundancy circuitry that facilitates stitching together multiple such blocks despite an inability to use some block (e.g., because of a circuit defect). Systolic registers may be included at various points in the DSP blocks to facilitate use of the blocks to implement systolic form, finite-impulse-response (“FIR”), digital filters.
    Type: Grant
    Filed: August 7, 2013
    Date of Patent: December 31, 2013
    Assignee: Altera Corporation
    Inventors: Keone Streicher, Martin Langhammer, Yi-Wen Lin, Wai-Bor Leung, David Lewis, Volker Mauer, Henry Y. Lui, Suleyman Sirri Demirsoy, Hyun Yi
  • Patent number: 8615540
    Abstract: An arithmetic logic unit (ALU) for use within a flight control system is provided. The ALU comprises a first register configured to receive a first operand, a second register configured to receive a second operand, and an adder coupled to the first register and the second register. The adder is configured to generate a sum of the first operand and the second operand and to generate intermediate sums that are used to determine a product of the first operand and the second operand.
    Type: Grant
    Filed: July 24, 2009
    Date of Patent: December 24, 2013
    Assignee: Honeywell International Inc.
    Inventors: Jason Bickler, Karen Brack
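    Illustration (not from the patent): a minimal sketch of forming a product from intermediate sums produced by a single adder, here as shift-and-add multiplication of non-negative integers. The function names are hypothetical, and the patent's register and control details are not modeled.

```python
def add(x, y):
    """Stand-in for the ALU's single adder."""
    return x + y

def multiply(a, b):
    """Shift-and-add product of non-negative integers using only add()."""
    acc = 0
    shift = 0
    while b:
        if b & 1:
            acc = add(acc, a << shift)   # intermediate sum for this bit of b
        b >>= 1
        shift += 1
    return acc
```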
  • Patent number: 8601047
    Abstract: A decimal floating-point (DFP) adder includes a decimal leading-zero anticipator (LZA). The DFP adder receives DFP operands. Each operand includes a significand, an exponent, a sign bit and a leading zero count for the significand. The DFP adder adds or subtracts the DFP operands to obtain a DFP result. The LZA determines the leading zero count associated with the significand of the DFP result. The LZA operates at least partially in parallel with circuitry (in the DFP adder) that computes the DFP result. The LZA does not wait for that circuitry to finish computation of the DFP result. Instead it “anticipates” the number of leading zeros that the result's significand will contain.
    Type: Grant
    Filed: June 13, 2013
    Date of Patent: December 3, 2013
    Assignee: Advanced Micro Devices
    Inventor: Liang-Kai Wang
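    Illustration (not from the patent): a minimal sketch of the idea behind anticipation, limited to adding two aligned, non-negative significands. The operand leading zero counts alone narrow the result's count to two candidates before the sum is known. The function names are hypothetical, and the patent's circuit, including subtraction, is not reproduced.

```python
def leading_zeros(sig, digits):
    """Leading zero count of a non-negative significand in a fixed decimal width."""
    return digits - len(str(sig)) if sig else digits

def anticipate_add(lz_a, lz_b):
    """Candidate leading zero counts for a + b, from the operand counts alone."""
    m = min(lz_a, lz_b)
    return (m - 1, m)   # m - 1 if the top digits produce a carry, otherwise m
```

    A candidate of -1 means the sum needs one digit more than the field width.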