Multiplication Patents (Class 708/503)
-
Patent number: 12056461Abstract: An integrated circuit with specialized processing blocks are provided. A specialized processing block may be optimized for machine learning algorithms and may include a multiplier data path that feeds an adder data path. The multiplier data path may be decomposed into multiple partial product generators, multiple compressors, and multiple carry-propagate adders of a first precision. Results from the carry-propagate adders may be added using a floating-point adder of the first precision. Results from the floating-point adder may be optionally cast to a second precision that is higher or more accurate than the first precision. The adder data path may include an adder of the second precision that combines the results from the floating-point adder with zero, with a general-purpose input, or with other dot product terms. Operated in this way, the specialized processing block provides a technical improvement of greatly increasing the functional density for implementing machine learning algorithms.Type: GrantFiled: September 24, 2021Date of Patent: August 6, 2024Assignee: Intel CorporationInventors: Martin Langhammer, Dongdong Chen
-
Patent number: 12008367Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.Type: GrantFiled: June 21, 2022Date of Patent: June 11, 2024Assignee: Intel CorporationInventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Raanan Sade, Menachem Adelman, Zeev Sperber, Amit Gradstein, Simon Rubanovich
-
Patent number: 11816446Abstract: Systems and methods are provided to perform multiply-accumulate operations of multiple data types in a systolic array. One or more processing elements in the systolic array can include a shared multiplier and one or more adders. The shared multiplier can include a separate and/or a shared circuitry where the shared circuitry can perform at least a part of integer multiplication and at least a part of non-integer multiplication. The one or more adders can include one or more shared adders or one or more separate adders. The shared adder can include a separate and/or a shared circuitry where the shared circuitry can perform at least a part of integer addition and at least a part of non-integer addition.Type: GrantFiled: November 27, 2019Date of Patent: November 14, 2023Assignee: Amazon Technologies, Inc.Inventors: Thomas Elmer, Thomas A. Volpe
-
Patent number: 11740869Abstract: Embodiments are directed to selecting a multiplication operation to be scheduled in a first stage of an execution schedule, the multiplication operation meeting a first condition of having no dependency. An addition/subtraction operation is selected to be scheduled in the first stage of the execution schedule responsive to meeting the first condition. A process is performed which includes selecting another multiplication operation to be scheduled in a next stage of the execution schedule responsive to meeting the first condition or a second condition, the second condition including having a dependency that is fulfilled by a previous stage. The process includes selecting another addition/subtraction operation to be scheduled in the next stage of the execution schedule responsive to meeting the first or second condition, and repeating the process until each operation has been scheduled in the execution schedule, where the execution schedule is configured for execution by an arithmetic logic unit.Type: GrantFiled: April 28, 2021Date of Patent: August 29, 2023Assignee: International Business Machines CorporationInventor: Rajat Rao
-
Patent number: 11650819Abstract: Systems, methods, and apparatuses relating to instructions to multiply floating-point values of about one are described.Type: GrantFiled: December 13, 2019Date of Patent: May 16, 2023Assignee: Intel CorporationInventors: Mohamed Elmalaki, Elmoustapha Ould-Ahmed-Vall
-
Patent number: 11507350Abstract: The present disclosure relates to a fused vector multiplier for computing an inner product between vectors, where vectors to be computed are a multiplier number vector {right arrow over (A)}{AN . . . A2A1A0} and a multiplicand number {right arrow over (B)} {BN . . . B2B1B0}, {right arrow over (A)} and {right arrow over (B)} have the same dimension which is N+1.Type: GrantFiled: November 27, 2019Date of Patent: November 22, 2022Assignee: CAMBRICON (XI'AN) SEMICONDUCTOR CO., LTD.Inventors: Tianshi Chen, Shengyuan Zhou, Zidong Du, Qi Guo
-
Patent number: 11500631Abstract: A method is provided that includes performing, by a processor in response to a floating point multiply instruction, multiplication of floating point numbers, wherein determination of values of implied bits of leading bit encoded mantissas of the floating point numbers is performed in parallel with multiplication of the encoded mantissas, and storing, by the processor, a result of the floating point multiply instruction in a storage location indicated by the floating point multiply instruction.Type: GrantFiled: May 20, 2020Date of Patent: November 15, 2022Assignee: Texas Instruments IncorporatedInventors: Mujibur Rahman, Timothy David Anderson
-
Patent number: 11366638Abstract: Floating point Multiply-Add, Accumulate Unit, supporting BF16 format for Multiply-Accumulate operations, and FP32 Single-Precision Addition complying with the IEEE 754 Standard. The Multiply-Accumulate unit uses higher radix and longer internal 2's complement significand representation to facilitate precision as well as comparison and operation with negative numbers. The addition can be performed using Carry-Save format to avoid long carry propagation and speed up the operation. The circuit uses early exponent comparison to shorten the accumulate pipeline stage. Operations including overflow detection, zero detection and sign extension are adopted for 2s complement and Carry-Save format.Type: GrantFiled: September 2, 2021Date of Patent: June 21, 2022Assignee: SambaNova Systems, Inc.Inventors: Vojin G. Oklobdzija, Matthew M. Kim
-
Patent number: 11366636Abstract: An apparatus and method for efficiently performing a multiply add or multiply accumulate operation. For example, one embodiment of a processor comprises: a decoder to decode an instruction specifying an operation, the instruction comprising a first operand identifying a multiplier and a second operand identifying a multiplicand; and fused multiply-add (FMA) execution circuitry comprising first multiplication circuitry to perform a multiplication using the multiplicand and multiplier to generate a result for multipliers and multiplicands falling within a first precision range, and second multiplication circuitry to be used instead of the first multiplication circuitry for multipliers and multiplicands falling within a second precision range.Type: GrantFiled: July 1, 2020Date of Patent: June 21, 2022Assignee: INTEL CORPORATIONInventors: Aditya Varma, Michael Espig
-
Patent number: 11296068Abstract: A discrete three-dimensional (3-D) processor comprises first and second dice. The first die comprises 3-D memory (3D-M) arrays, whereas the second die comprises logic circuits and at least an off-die peripheral-circuit component of the 3D-M array(s). The first die does not comprise the off-die peripheral-circuit component. The first and second dice are communicatively coupled by a plurality of inter-die connections. The preferred discrete 3-D processor can be applied to mathematical computing, computer simulation, configurable gate array, pattern processing and neural network.Type: GrantFiled: November 15, 2020Date of Patent: April 5, 2022Assignees: HangZhou HaiCun Information Technology Co., Ltd.Inventor: Guobiao Zhang
-
Patent number: 11294627Abstract: The disclosure relates to a hardware operator for dot-product computation, comprising a plurality of multipliers each receiving two multiplicands in the form of floating-point numbers encoded in a first precision format; an alignment circuit associated with each multiplier, configured to, based on the exponents of the corresponding multiplicands, convert the result of the multiplication into a respective fixed-point number having a sufficient number of bits to cover the full dynamic range of the multiplication; and a multi-adder configured to add without loss the fixed-point numbers provided by the multipliers, providing a sum in the form of a fixed-point number.Type: GrantFiled: June 25, 2020Date of Patent: April 5, 2022Assignee: KalrayInventor: Nicolas Brunie
-
Patent number: 11250105Abstract: A computation unit that comprises (i) a multiplicand vector decomposer that generates a decomposed multiplicand vector which uses a sequence of first and second concatenated multiplicand sub-elements (1st2ndCMCSE) in a lower-precision format (LPF) to represent corresponding ones of multiplicand elements in a multiplicand vector in a higher-precision format (HPF), (ii) a multiplier vector decomposer that generates a decomposed multiplier vector which uses a sequence of first and second concatenated multiplier sub-elements (1st2ndCMLSE) in the LPF to represent corresponding ones of multiplier elements in a multiplier vector in the HPF, (iii) a multiplicand tensor encoder that encodes double reads of the sequence of the 1st2ndCMCSE in a decomposed multiplicand tensor, and (iv) a product vector generator that generates a product vector containing a sequence of first and second concatenated product sub-elements by executing general matrix-matrix multiplication (GeMM) operations between the double reads of the 1st2Type: GrantFiled: May 12, 2020Date of Patent: February 15, 2022Assignee: SambaNova Systems, Inc.Inventors: Mingran Wang, Xiaoyan Li, Yongning Sheng
-
Patent number: 10963245Abstract: An apparatus is provided, that includes an instruction decoder responsive to an anchored-data processing instruction, to generate one or more control signals. Conversion circuitry is responsive to the one or more control signals to perform a conversion from a data value to an anchored-data select value. The conversion is based on anchor metadata indicative of a given range of significance for the anchored-data select value. Output circuitry is responsive to the one or more control signals, to write the anchored-data select value to a register.Type: GrantFiled: May 29, 2019Date of Patent: March 30, 2021Assignee: Arm LimitedInventors: David Raymond Lutz, Neil Burgess, Christopher Neal Hinds, Nigel John Stephens
-
Patent number: 10949766Abstract: A method for an associative memory device includes dividing a multi-bit mantissa A of a number X to a plurality of smaller partial mantissas Aj, offline calculating a plurality of partial exponents F(Aj) for each possible value of each partial mantissa Aj and storing the plurality of partial exponents F(Aj) in a look up table (LUT) of the associative memory device. A system includes an associative memory array to store a plurality of partial mantissas Ai of a mantissa A of a number X and an exponent calculator to utilize the partial mantissas to compute e in the power of X.Type: GrantFiled: October 15, 2017Date of Patent: March 16, 2021Assignee: GSI Technology Inc.Inventor: Avidan Akerib
-
Patent number: 10754616Abstract: Low precision computers can be efficient at finding possible answers to search problems. However, sometimes the task demands finding better answers than a single low precision search. A computer system augments low precision computing with a small amount of high precision computing, to improve search quality with little additional computing.Type: GrantFiled: May 25, 2020Date of Patent: August 25, 2020Assignee: Singular Computing LLCInventor: Joseph Bates
-
Patent number: 10503472Abstract: An apparatus and method are provided for processing floating point values using an intermediate representation which has significand, exponent and shadow sections. A less significant portion of the exponent of the floating point value defines a range of positions within the significand section where the representation of the significand is to be held. The exponent section holds a representation of a more significant portion of the exponent indicating a selected window of multiple contiguous windows spanning a value range of a format of the floating point value. A first portion of the significand section corresponds to the selected window and a second portion corresponds to an overlap into a further window which is adjacent to and lower in the value range.Type: GrantFiled: May 17, 2016Date of Patent: December 10, 2019Assignee: ARM LimitedInventors: Daryl John Stewart, Thomas Christopher Grocutt
-
Patent number: 10489114Abstract: Methods and apparatuses for performing a floating point multiply-add operation with alignment correction. A processor receives a first operand, a second operand and a third operand, wherein the first, second and third operands each represent a floating point number comprising a significand value and a biased exponent value. A processor determines a shift amount based, at least in part, on the one or more biased exponent values of the first, second or third operand. A processor determines a shift amount correction based, at least in part, on the one or more biased exponent values of the first, second or third operand being equal to zero.Type: GrantFiled: June 27, 2014Date of Patent: November 26, 2019Assignee: International Business Machines CorporationInventors: Son T. Dao, Silvia Melitta Mueller
-
Patent number: 10489115Abstract: Methods and apparatuses for performing a floating point multiply-add operation with alignment correction. A processor receives a first operand, a second operand and a third operand, wherein the first, second and third operands each represent a floating point number comprising a significand value and a biased exponent value. A processor determines a shift amount based, at least in part, on the one or more biased exponent values of the first, second or third operand. A processor determines a shift amount correction based, at least in part, on the one or more biased exponent values of the first, second or third operand being equal to zero.Type: GrantFiled: December 17, 2014Date of Patent: November 26, 2019Assignee: International Business Machines CorporationInventors: Son T. Dao, Silvia Melitta Mueller
-
Patent number: 10452403Abstract: In an embodiment, a method is provided. The method includes managing user-level threads on a first instruction sequencer in response to executing user-level instructions on a second instruction sequencer that is under control of an application level program. A first user-level thread is run on the second instruction sequencer and contains one or more user level instructions. A first user level instruction has at least 1) a field that makes reference to one or more instruction sequencers or 2) implicitly references with a pointer to code that specifically addresses one or more instruction sequencers when the code is executed.Type: GrantFiled: September 26, 2015Date of Patent: October 22, 2019Assignee: Intel CorporationInventors: Hong Wang, John P. Shen, Edward T. Grochowski, Richard A. Hankins, Gautham N. Chinya, Bryant E. Bigbee, Shivnandan D. Kaushik, Xiang Chris Zou, Per Hammarlund, Scott Dion Rodgers, Xinmin Tian, Anil Aggawal, Prashant Sethi, Baiju V. Patel, James P Held
-
Patent number: 10381098Abstract: A memory interface latch including a data NAND gate and a feedback gate can be created within an integrated circuit (IC). When a feedback node is driven low, the data NAND gate can drive an inverted value of a memory array bitline input to a data output of the memory interface latch within a time of one gate delay. A feedback gate can, in a functional mode, during one phase of a clock signal, drive the feedback node high and during the other phase of the clock signal, drive the feedback node to a complement the data output. The feedback gate can be also, in an LBIST write-through mode, drive the feedback node to the value of a WRITE_DATA input. The feedback gate can be also, in a fence mode, drive the feedback node to fixed logic value.Type: GrantFiled: November 28, 2017Date of Patent: August 13, 2019Assignee: International Business Machines CorporationInventors: Elizabeth L. Gerhard, Todd A. Christensen, Chad A. Adams, Peter T. Freiburger
-
Patent number: 10346133Abstract: A processor includes an integer multiplier configured to execute an integer multiply instruction to multiply significand bits of at least one floating point operand of a floating point multiply operation. The processor also includes a floating point multiplier configured to execute a special purpose floating point multiply accumulate instruction with respect to an intermediate result of the floating point multiply operation and the at least one floating point operand to generate a final floating point multiplication result.Type: GrantFiled: December 21, 2017Date of Patent: July 9, 2019Assignee: QUALCOMM IncorporatedInventors: Albert Danysh, Erich Plondke, Eric Mahurin
-
Patent number: 10331406Abstract: A data processing apparatus and method of operating a data processing apparatus are disclosed. Comparisons are made between first and second floating-point operands received. A more significant portion of the first floating-point operand and of the second floating-point operand are subject to comparison. The more significant portion of the first floating-point operand minus a least significant bit in the more significant portion is subject to comparison with the more significant portion of the second floating-point operand. A less significant portion of the first floating-point operand and of the second floating-point operand are also subject to comparison. In dependence on the outcome of these comparisons, right-shift circuitry is used selectively to perform a 1-bit right shift on a difference calculated between the first floating-point operand and the second floating-point operand.Type: GrantFiled: November 17, 2017Date of Patent: June 25, 2019Assignee: ARM LimitedInventors: David Raymond Lutz, Thomas Gilles Tarridec
-
Patent number: 10037191Abstract: A method and computer system are provided for performing a comparison computation, e.g. for use in a check procedure for a reciprocal square root operation. The comparison computation compares a multiplication of three values with a predetermined value. The computer system performs the multiplication using multiplier logic which is configured to perform multiply operations in which two values are multiplied together. A first and second of the three values are multiplied to determine a first intermediate result, w1. The digits of w1 are separated into two portions, w1,1 and w1,2. The third of the three values is multiplied with w1,2 and the result is added into a multiplication of the third of the three values with w1,1 to thereby determine the result of multiplying the three values together. In this way the comparison is performed with high accuracy, while keeping the area and power consumption of the multiplier logic low.Type: GrantFiled: January 18, 2018Date of Patent: July 31, 2018Assignee: Imagination Technologies LimitedInventor: Leonard Rarick
-
Patent number: 9996320Abstract: An example processor includes a register and a fused multiply-add (FMA) low functional unit. The register stores first, second, and third floating point (FP) values. The FMA low functional unit receives a request to perform an FMA low operation: multiplies the first FP value with the second FP value to obtain a first product value; adds the first product with the third FP value to generate a first result value; rounds the first result to generate a first FMA value; multiplies the first FP value with the second FP value to obtain a second product value; adds the second product value with the third FP value to generate a second result value; and subtracts the FMA value from the second result value to obtain a third result value, which can then be normalized and rounded (FMA low result) and sent the FMA low result to an application.Type: GrantFiled: December 23, 2015Date of Patent: June 12, 2018Assignee: Intel CorporationInventors: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber
-
Patent number: 9996319Abstract: An example processor includes a register and an ADD low functional unit. The register stores first, second, and third floating point (FP) values. The ADD low functional unit receives a request to perform an ADD low operation and, responsive to the request: adds the first FP value with the second FP value to obtain a first sum value; rounds the first sum value to generate an ADD value; adds the first FP value with the second FP value to obtain a second sum value; subtracts the ADD value from the second sum value to generate a difference value; normalizes the difference value to obtain a normalized difference value; rounds the normalized difference value to generate an ADD low value; and sends the ADD low value to an application.Type: GrantFiled: December 23, 2015Date of Patent: June 12, 2018Assignee: Intel CorporationInventors: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber
-
Patent number: 9836279Abstract: An apparatus and method for floating-point multiplication are provided. Two partial products are generated from two operand significands, which are then added to generate a product significand. The value of an unbiased result exponent is determined from the operand exponent values and leading zero counts, and a shift amount and direction for the product significand are determined in dependence on a predetermined minimum exponent value of a predetermined canonical format. The product significand is shifted by the shift amount in the shift direction. An overflow mask identifying an overflow bit position of the product significand is generated by right shifting a predetermined mask pattern by the shift amount, and the overflow mask is applied to the product significand to extract an overflow value at the overflow bit position. This extraction of the overflow value happens before the shift circuitry shifts the product significand, allowing an overall faster floating-point multiplication to be performed.Type: GrantFiled: September 25, 2015Date of Patent: December 5, 2017Assignee: ARM LimitedInventor: David Raymond Lutz
-
Patent number: 9825814Abstract: Systems, methods, and computer-readable storage media are provided for dynamically setting an end point group for an end point. An endpoint can be assigned a default end point group when added to a network. For example, the default end point group can be a baseline port/security group which is considered an untrusted group. The end point can then be dynamically assigned an end point group based on a set of group selection rules. For example, the group selection rules can identify an end point group based on the MAC address or other attributes. When the end point is added to the network, the MAC address and/or other attributes of the end point can be determined and used to assign an end point group. As another example, an end point group can be assigned based on the amount of traffic or guest operation system.Type: GrantFiled: July 27, 2015Date of Patent: November 21, 2017Assignee: CISCO TECHNOLOGY, INC.Inventors: Joji Thomas Mekkattuparamban, Vijay Chander, Saurabh Jain, Van Lieu, Badhri Madabusi Vijayaraghavan, Praveen Jain, Munish Mehta, Michael R. Smith, Narender Enduri
-
Patent number: 9690545Abstract: A floating-point calculation apparatus comprising: a selection part; an addition and subtraction calculation part; an output determination part; and a buffer management part configured to add, when it is determined that a buffer used to store an input value is not prepared, a buffer that corresponds to the input value, wherein when a number of significant digits of the result of performing an addition and subtraction calculation exceeds a number of significant digits of the buffer selected by the selection part, the addition and subtraction calculation part shifts right or shifts left part of the result of performing the addition and subtraction calculation and divides the result of performing the addition and subtraction calculation into values each being storable in one of a plurality of buffers.Type: GrantFiled: May 29, 2015Date of Patent: June 27, 2017Assignee: HONDA MOTOR CO., LTD.Inventors: Shiro Kitamura, Tomoyuki Okumura, Hiroshi Nanjo
-
Patent number: 9606796Abstract: Computers and methods for performing mathematical functions are disclosed. An embodiment of a computer includes an operations level and a driver level. The operations level performs mathematical operations. The driver level includes a first lookup table and a second lookup table, wherein the first lookup table includes first data for calculating at least one mathematical function using a first level of accuracy. The second lookup table includes second data for calculating the at least one mathematical function using a second level of accuracy, wherein the first level of accuracy is greater than the second level of accuracy. A driver executes either the first data or the second data depending on a selected level of accuracy.Type: GrantFiled: October 30, 2013Date of Patent: March 28, 2017Assignee: TEXAS INSTRUMENTS INCORPORATEDInventors: Kyong Ho Lee, Seok-Jun Lee, Manish Goel
-
Patent number: 9348558Abstract: A processor includes a carry save array multiplier. The carry save array multiplier includes an array of cascaded partial product generators. The array of cascaded partial product generators is configured to generate an output value as a product of two operands presented at inputs of the multiplier. The array of cascaded partial product generators is also configured to generate an output value as a sum of two operands presented at inputs of the multiplier.Type: GrantFiled: August 23, 2013Date of Patent: May 24, 2016Assignee: TEXAS INSTRUMENTS DEUTSCHLAND GMBHInventors: Christian Wiencke, Armin Stingl
-
Patent number: 9197902Abstract: A method for wavelet based data compression comprising: receiving data associated, with a set of pixels, computing wavelet coefficients by applying a series of Discrete Wavelet Transform (DWT) low-pass and high-pass filtering operations, wherein a number of filtering operations is reduced by: identifying common partial products for at least one of the lowpass filtering operations and the high-pass filtering operations, classifying a first portion of the wavelet coefficients as low magnitude coefficients and a second portion of the wavelet coefficients as high magnitude coefficients, eliminating the common partial products for the high magnitude wavelet coefficients, replacing multiplication operations for the low magnitude wavelet coefficients with shift-and-add operations, and eliminating the common partial products, and applying the DWT based on remaining filtering operations.Type: GrantFiled: January 14, 2011Date of Patent: November 24, 2015Assignee: M.S. Ramaiah School of Advanced StudiesInventors: Dipayan Mazumdar, Cyril Prasanna Raj P, Brahmananda Reddy Ganda
-
Patent number: 9170772Abstract: Embodiments of systems, apparatuses, and methods for performing BIDSplit instructions in a computer processor are described. In some embodiments, the execution of a BIDSplit instruction tests the encoding of a binary-integer decimal source value and extracts a sign, exponent, and/or significand into a destination.Type: GrantFiled: December 23, 2011Date of Patent: October 27, 2015Assignee: Intel CorporationInventor: Shihjong J. Kuo
-
Patent number: 9092213Abstract: A method of performing vector operations on a semiconductor chip is described. The method includes performing a first vector instruction with a vector functional unit implemented on the semiconductor chip and performing a second vector instruction with the vector functional unit. The first vector instruction is a vector multiply add instruction. The second vector instruction is a vector leading zeros count instruction.Type: GrantFiled: September 24, 2010Date of Patent: July 28, 2015Assignee: Intel CorporationInventors: Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver, Eric W. Mahurin
-
Patent number: 8996600Abstract: The functions available in a specialized processing block of a programmable device include floating-point operations, including support within the specialized processing block for subnormal operations. This is accomplished, in part, by borrowing an adder in the specialized processing block and using the adder to operate on output of a multiplier or other operator to compete a subnormal operation. Although the adder becomes unavailable to serve as an adder, the need to complete the operation in slower, more valuable general purpose logic is avoided. The adder and the other operator need not necessarily be located together in a specialized processing block.Type: GrantFiled: August 3, 2012Date of Patent: March 31, 2015Assignee: Altera CorporationInventor: Martin Langhammer
-
Patent number: 8943114Abstract: An apparatus including a first circuit and a second circuit. The first circuit may be configured to receive a first 2N-bit complex number and a second 2N-bit complex number, each having a first format, and to reformat the first and the second 2N-bit complex numbers to a second format such that a lower portion of each real and imaginary part of each 2N-bit complex number is positive. The second circuit may be configured to multiply the first and the second 2N-bit complex numbers using at least one N-bit signed complex multiplier, where N is an integer.Type: GrantFiled: August 17, 2011Date of Patent: January 27, 2015Assignee: Avago Technologies General IP (Singapore) Pte. Ltd.Inventors: Eran Ovadia Mershain, Eran Goldstein
-
Patent number: 8930432Abstract: A floating point execution unit is capable of selectively repurposing one or more adders in an exponent path of the floating point execution unit to perform fixed point addition operations, thereby providing fixed point functionality in the floating point execution unit.Type: GrantFiled: August 4, 2011Date of Patent: January 6, 2015Assignee: International Business Machines CorporationInventors: Mark J. Hickey, Adam J. Muff, Matthew R. Tubbs, Charles D. Wait
-
Patent number: 8918446Abstract: Methods and apparatus relating to reducing power consumption in multi-precision floating point multipliers are described. In an embodiment, certain portions of a multiplier are disabled in response to two or more multiplication operations with the same data size and data type occurring back-to-back. Other embodiments are also claimed and described.Type: GrantFiled: December 14, 2010Date of Patent: December 23, 2014Assignee: Intel CorporationInventors: Brent R. Boswell, Thierry Pons, Tom Aviram
-
Patent number: 8918445Abstract: An integrated multiplier circuit that operates on a variety of data formats including integer fixed point, signed or unsigned, real or complex, 8 bit, 16 bit or 32 bit as well as floating point data that may be single precision real, single precision complex or double precision. The circuit uses a single set of multiplier arrays to perform 16×16, 32×32 and 64×64 multiplies, 32×32 and 64×64 complex multiplies, 32×32 and 64×64 complex multiplies with one operand conjugated.Type: GrantFiled: September 21, 2011Date of Patent: December 23, 2014Assignee: Texas Instruments IncorporatedInventors: Timothy David Anderson, Mujibur Rahman
-
Publication number: 20140372493Abstract: A system and method for accelerating evaluation of functions. In one embodiment, a method includes receiving, by a processor, a value to be processed, and notification of a function to be applied to the value. The value is represented in a floating point format. The value is converted, by the processor, to a fixed point format. Which of Newton-Raphson and polynomial approximation is to be used to apply the function to the value in the fixed point format is determined by the processor. The function is applied to the value in the fixed point format to generate a result in the fixed point format. The result is converted to the floating point format by the processor.Type: ApplicationFiled: June 14, 2013Publication date: December 18, 2014Inventors: Brent Everett Peterson, Nitya Ramdas, Sotirios Christodulos Tsongas, Jonathan Zack Albus, Johann Zipperer
-
Publication number: 20140351308Abstract: A system and method are provided for dynamically reducing power consumption of floating-point logic. A disable control signal that is based on a characteristic of a floating-point format input operand is received and a portion of a logic circuit is disabled based on the disable control signal. The logic circuit processes the floating-point format input operand to generate an output.Type: ApplicationFiled: May 23, 2013Publication date: November 27, 2014Applicant: NVIDIA CorporationInventors: David C. Tannenbaum, Srinivasan Iyer
-
Patent number: 8886695Abstract: A programmable integrated circuit device is programmed to normalize multiplication operations by examining the input or output values to determined the likelihood of overflow or underflow and then to adjust the input or output values accordingly. The examination of the inputs can include an examination of the number of adder stages feeding into the inputs, as well as a count of leading bits ahead of the first significant bit. Adjustment of an input can include shifting the mantissa by the leading bit count and adjusting the exponent accordingly, while adjustment of the output can include shifting the mantissa by the sum of the leading bit counts of the inputs and adjusting the exponent accordingly. Or the output can be examined to find its leading bit count and the output then can be adjusted by shifting the mantissa by the leading bit count and adjusting the exponent accordingly.Type: GrantFiled: July 10, 2012Date of Patent: November 11, 2014Assignee: Altera CorporationInventor: Martin Langhammer
-
Patent number: 8751551Abstract: Digital signal processing (“DSP”) circuit blocks are provided that can more easily work together to perform larger (e.g., more complex and/or more arithmetically precise) DSP operations if desired. These DSP blocks may also include redundancy circuitry that facilitates stitching together multiple such blocks despite an inability to use some block (e.g., because of a circuit defect). Systolic registers may be included at various points in the DSP blocks to facilitate use of the blocks to implement systolic form, finite-impulse-response (“FIR”), digital filters.Type: GrantFiled: November 21, 2013Date of Patent: June 10, 2014Assignee: Altera CorporationInventors: Keone Streicher, Martin Langhammer, Yi-Wen Lin, Wai-Bor Leung, David Lewis, Volker Mauer, Henry Y. Lui, Suleyman Sirri Demirsoy, Hyun Yi
-
Patent number: 8732225Abstract: Digital signal processing (“DSP”) block circuitry on an integrated circuit (“IC”) is adapted for use (e.g., in multiple instances of the DSP block circuitry on the IC) for implementing finite-impulse-response (“FIR”) digital filters in systolic form. Each DSP block may include (1) first and second multiplier circuitry and (2) adder circuitry for adding (a) outputs of the multipliers and (b) signals chained in from a first other instance of the DSP block circuitry. Systolic delay circuitry is provided for either the outputs of the first multiplier (upstream from the adder) or at least one of the sets of inputs to the first multiplier. Additional systolic delay circuitry is provided for outputs of the adder, which are chained out to a second other instance of the DSP block circuitry.Type: GrantFiled: October 11, 2013Date of Patent: May 20, 2014Assignee: Altera CorporationInventors: Suleyman Demirsoy, Hyun Yi
-
Patent number: 8706790Abstract: The resources needed—particularly in a programmable device—when carrying out a mixed-precision multiplication-based floating-point operation (i.e., multiplication or division) is reduced by maintaining the mantissas of the operands in their native precisions instead of promoting the lower-precision number to the higher precision. Exponents and other elements can be handled by the higher-precision logic as they do not consume significant resources.Type: GrantFiled: March 3, 2009Date of Patent: April 22, 2014Assignee: Altera CorporationInventor: Martin Langhammer
-
Publication number: 20140089371Abstract: A circuit for calculating the fused sum of an addend and product of two multiplicands, the addend and multiplicands being binary floating-point numbers represented in a standardized format as a mantissa and an exponent is provided. The multiplicands are in a lower precision format than the addend, with q>2p, where p and q are respectively the mantissa size of the multiplicand precision format and the addend precision format. The circuit includes a p-bit multiplier receiving the mantissas of the multiplicands; a shift circuit that aligns the mantissa of the addend with the product output by the multiplier based on the exponent values of the addend and multiplicands; and an adder that processes q-bit mantissas, receiving the aligned mantissa of the addend and the product, the input lines of the adder corresponding to the product being completed to the right by lines at 0 to form a q-bit mantissa.Type: ApplicationFiled: April 19, 2012Publication date: March 27, 2014Applicant: KALRAYInventors: Florent Dupont De Dinechin, Nicolas Brunie, Benoit Dupont De Dinechin
-
Patent number: 8671129Abstract: A processing unit, system, and method for performing a multiply operation in a multiply-add pipeline. To reduce the pipeline latency, the unrounded result of a multiply-add operation is bypassed to the inputs of the multiply-add pipeline for use in a subsequent operation. If it is determined that rounding is required for the prior operation, then the rounding will occur during the subsequent operation. During the subsequent operation, a Booth encoder not utilized by the multiply operation will output a rounding correction factor as a selection input to a Booth multiplexer not utilized by the multiply operation. When the Booth multiplexer receives the rounding correction factor, the Booth multiplexer will output a rounding correction value to a carry save adder (CSA) tree, and the CSA tree will generate the correct sum from the rounding correction value and the other partial products.Type: GrantFiled: March 8, 2011Date of Patent: March 11, 2014Assignee: Oracle International CorporationInventors: Jeffrey S. Brooks, Christopher H. Olson
-
Publication number: 20140006467Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.Type: ApplicationFiled: June 29, 2012Publication date: January 2, 2014Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
-
Patent number: 8620977Abstract: Digital signal processing (“DSP”) circuit blocks are provided that can more easily work together to perform larger (e.g., more complex and/or more arithmetically precise) DSP operations if desired. These DSP blocks may also include redundancy circuitry that facilitates stitching together multiple such blocks despite an inability to use some block (e.g., because of a circuit defect). Systolic registers may be included at various points in the DSP blocks to facilitate use of the blocks to implement systolic form, finite-impulse-response (“FIR”), digital filters.Type: GrantFiled: August 7, 2013Date of Patent: December 31, 2013Assignee: Altera CorporationInventors: Keone Streicher, Martin Langhammer, Yi-Wen Lin, Wai-Bor Leung, David Lewis, Volker Mauer, Henry Y. Lui, Suleyman Sirri Demirsoy, Hyun Yi
-
Patent number: 8615540Abstract: An arithmetic logic unit (ALU) for use within a flight control system is provided. The ALU comprises a first register configured to receive a first operand, a second register configured to receive a second operand, and an adder coupled to the first register and the second register. The adder is configured to generate a sum of the first operand and the second operand and to generate intermediate sums that are used to determine a product of the first operand and the second operand.Type: GrantFiled: July 24, 2009Date of Patent: December 24, 2013Assignee: Honeywell International Inc.Inventors: Jason Bickler, Karen Brack
-
Patent number: 8601047Abstract: A decimal floating-point (DFP) adder includes a decimal leading-zero anticipator (LZA). The DFP adder receives DFP operands. Each operand includes a significand, an exponent, a sign bit and a leading zero count for the significand. The DFP adder adds or subtracts the DFP operands to obtain a DFP result. The LZA determines the leading zero count associated with the significand of the DFP result. The LZA operates at least partially in parallel with circuitry (in the DFP adder) that computes the DFP result. The LZA does not wait for that circuitry to finish computation of the DFP result. Instead it “anticipates” the number of leading zeros that the result's significand will contain.Type: GrantFiled: June 13, 2013Date of Patent: December 3, 2013Assignee: Advanced Micro DevicesInventor: Liang-Kai Wang