Multiplication Patents (Class 708/503)

Tensor operations in AI models

Patent number: 12361262

Abstract: A method of performing computations for artificial intelligence models may include obtaining an input tensor based on an input to an artificial intelligence model and loading the input tensor into multiple processing devices. The input tensor may be split into multiple input tensor tiles that are distributed among the processing devices such that each of the processing devices does not include an entirety of the input tensor. The method may also include performing multiple tensor operations according to the artificial intelligence model to generate multiple intermediate tensors and an output tensor, one or more of the tensor operations performed using the input tensor.

Type: Grant

Filed: October 22, 2024

Date of Patent: July 15, 2025

Assignee: ETCHED.AI INC.

Inventor: Gavin Uberti
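
The tiling idea in the abstract of patent 12361262 can be illustrated with a small sketch. This is not the patented method: the row-wise split, the function names (`split_rows`, `tiled_matmul`) and the use of plain Python lists in place of processing devices are all assumptions made for illustration. It only shows that each "device" can hold a tile rather than the whole input tensor and still contribute to the full output.

```python
def split_rows(tensor, num_devices):
    """Split a 2-D list into contiguous row tiles, one per (hypothetical) device."""
    base, extra = divmod(len(tensor), num_devices)
    tiles, start = [], 0
    for d in range(num_devices):
        size = base + (1 if d < extra else 0)
        tiles.append(tensor[start:start + size])
        start += size
    return tiles


def matmul(a, b):
    """Plain matrix product of two 2-D lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def tiled_matmul(tensor, weights, num_devices):
    """Each tile is multiplied independently; outputs are concatenated row-wise."""
    tiles = split_rows(tensor, num_devices)
    partial_outputs = [matmul(tile, weights) for tile in tiles]
    return [row for out in partial_outputs for row in out]


x = [[1, 2], [3, 4], [5, 6], [7, 8]]
w = [[2, 1], [0, 1]]
result = tiled_matmul(x, w, num_devices=3)
```

A row-wise split works here because each output row of a matrix product depends only on the matching input row; other tensor operations would need a different tiling or communication between devices.
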
Processing device using dynamic bit shift

Patent number: 12288041

Abstract: A processing device may include multiplier circuitry configured to output a product of an integer represented by the first signal and a mantissa represented by the second signal. The processing device may further include a dynamic shifting circuit configured to shift a first shifted signal generated by shifting a mantissa part of a third signal based on the integer part of the first signal to generate and output a second shifted signal, and shift an output signal of the multiplier circuitry based on the integer part of the first signal to generate and output a third shifted signal. The processing device may further include an arithmetic logic circuit configured to output a signal representing a mantissa of a sum of a product of the first and second signals and the third signal based on output signals of the dynamic shifting circuit.

Type: Grant

Filed: September 20, 2024

Date of Patent: April 29, 2025

Assignee: REBELLIONS INC.

Inventor: Jinseok Kim
Integrated circuits with machine learning extensions

Patent number: 12056461

Abstract: An integrated circuit with specialized processing blocks is provided. A specialized processing block may be optimized for machine learning algorithms and may include a multiplier data path that feeds an adder data path. The multiplier data path may be decomposed into multiple partial product generators, multiple compressors, and multiple carry-propagate adders of a first precision. Results from the carry-propagate adders may be added using a floating-point adder of the first precision. Results from the floating-point adder may be optionally cast to a second precision that is higher or more accurate than the first precision. The adder data path may include an adder of the second precision that combines the results from the floating-point adder with zero, with a general-purpose input, or with other dot product terms. Operated in this way, the specialized processing block provides a technical improvement of greatly increasing the functional density for implementing machine learning algorithms.

Type: Grant

Filed: September 24, 2021

Date of Patent: August 6, 2024

Assignee: Intel Corporation

Inventors: Martin Langhammer, Dongdong Chen
Systems and methods for performing 16-bit floating-point vector dot product instructions

Patent number: 12008367

Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

Type: Grant

Filed: June 21, 2022

Date of Patent: June 11, 2024

Assignee: Intel Corporation

Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Raanan Sade, Menachem Adelman, Zeev Sperber, Amit Gradstein, Simon Rubanovich
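
The arithmetic described in the abstract of patent 12008367 (multiply N pairs of 16-bit floating-point elements and accumulate into a single-precision destination element) can be sketched as follows. This is only an approximation of the data flow: the helper names, the choice of N = 2, and the rounding to single precision after every addition are assumptions, and the actual instruction's rounding and exception behavior are not modeled.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_fp16_accumulate(dest, src1, src2, n=2):
    """For each fp32 element of dest, add the products of n fp16 pairs.

    Assumption: destination element i pairs with source elements
    i*n .. i*n + n - 1, and the sum is rounded to fp32 after each add.
    """
    out = []
    for i, acc in enumerate(dest):
        total = to_fp32(acc)
        for j in range(n):
            a = to_fp16(src1[i * n + j])
            b = to_fp16(src2[i * n + j])
            total = to_fp32(total + a * b)  # fp16 x fp16 is exact in binary64
        out.append(total)
    return out


acc = dot_fp16_accumulate([1.0, 0.0], [1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 2.0, 1.0])
# acc == [2.5, 10.0]
```
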
Systolic array component combining multiple integer and floating-point data types

Patent number: 11816446

Abstract: Systems and methods are provided to perform multiply-accumulate operations of multiple data types in a systolic array. One or more processing elements in the systolic array can include a shared multiplier and one or more adders. The shared multiplier can include a separate and/or a shared circuitry where the shared circuitry can perform at least a part of integer multiplication and at least a part of non-integer multiplication. The one or more adders can include one or more shared adders or one or more separate adders. The shared adder can include a separate and/or a shared circuitry where the shared circuitry can perform at least a part of integer addition and at least a part of non-integer addition.

Type: Grant

Filed: November 27, 2019

Date of Patent: November 14, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Thomas Elmer, Thomas A. Volpe
Scheduling atomic field operations in jacobian coordinates used in elliptic curve cryptography scalar multiplications

Patent number: 11740869

Abstract: Embodiments are directed to selecting a multiplication operation to be scheduled in a first stage of an execution schedule, the multiplication operation meeting a first condition of having no dependency. An addition/subtraction operation is selected to be scheduled in the first stage of the execution schedule responsive to meeting the first condition. A process is performed which includes selecting another multiplication operation to be scheduled in a next stage of the execution schedule responsive to meeting the first condition or a second condition, the second condition including having a dependency that is fulfilled by a previous stage. The process includes selecting another addition/subtraction operation to be scheduled in the next stage of the execution schedule responsive to meeting the first or second condition, and repeating the process until each operation has been scheduled in the execution schedule, where the execution schedule is configured for execution by an arithmetic logic unit.

Type: Grant

Filed: April 28, 2021

Date of Patent: August 29, 2023

Assignee: International Business Machines Corporation

Inventor: Rajat Rao
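
The staged selection described in the abstract of patent 11740869 resembles a simple list scheduler. The sketch below assumes one multiplication slot and one addition/subtraction slot per stage; the operation names and the dependency graph are hypothetical and do not correspond to a real Jacobian-coordinate formula.

```python
def schedule(ops):
    """Build a staged schedule.

    ops maps a name to (kind, deps), where kind is 'mul' or 'addsub'.
    An operation is eligible when it has no dependencies, or when all of
    its dependencies were scheduled in an earlier stage.
    """
    done, stages = set(), []
    pending = dict(ops)
    while pending:
        stage = {}
        for kind in ("mul", "addsub"):
            for name, (op_kind, deps) in pending.items():
                if op_kind == kind and all(d in done for d in deps):
                    stage[kind] = name
                    break
        if not stage:
            raise ValueError("unsatisfiable dependencies")
        for name in stage.values():
            del pending[name]
        # Results become visible only to later stages.
        done.update(stage.values())
        stages.append(stage)
    return stages


# Hypothetical dependency graph, for illustration only.
ops = {
    "m1": ("mul", []),
    "a1": ("addsub", []),
    "m2": ("mul", ["m1"]),
    "a2": ("addsub", ["m1", "a1"]),
    "m3": ("mul", ["a2"]),
}
stages = schedule(ops)
```

Because `done` is updated only after both slots of a stage are filled, an addition can never consume a product computed in the same stage, which matches the "fulfilled by a previous stage" condition in the abstract.
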
Apparatuses, methods, and systems for instructions to multiply floating-point values of about one

Patent number: 11650819

Abstract: Systems, methods, and apparatuses relating to instructions to multiply floating-point values of about one are described.

Type: Grant

Filed: December 13, 2019

Date of Patent: May 16, 2023

Assignee: Intel Corporation

Inventors: Mohamed Elmalaki, Elmoustapha Ould-Ahmed-Vall
Processing apparatus and processing method

Patent number: 11507350

Abstract: The present disclosure relates to a fused vector multiplier for computing an inner product between vectors, where vectors to be computed are a multiplier number vector $\vec{A} = \{A_N \ldots A_2 A_1 A_0\}$ and a multiplicand number $\vec{B} = \{B_N \ldots B_2 B_1 B_0\}$, $\vec{A}$ and $\vec{B}$ have the same dimension which is $N+1$.

Type: Grant

Filed: November 27, 2019

Date of Patent: November 22, 2022

Assignee: CAMBRICON (XI'AN) SEMICONDUCTOR CO., LTD.

Inventors: Tianshi Chen, Shengyuan Zhou, Zidong Du, Qi Guo
Method and apparatus for implied bit handling in floating point multiplication

Patent number: 11500631

Abstract: A method is provided that includes performing, by a processor in response to a floating point multiply instruction, multiplication of floating point numbers, wherein determination of values of implied bits of leading bit encoded mantissas of the floating point numbers is performed in parallel with multiplication of the encoded mantissas, and storing, by the processor, a result of the floating point multiply instruction in a storage location indicated by the floating point multiply instruction.

Type: Grant

Filed: May 20, 2020

Date of Patent: November 15, 2022

Assignee: Texas Instruments Incorporated

Inventors: Mujibur Rahman, Timothy David Anderson
Method and apparatus for efficient binary and ternary support in fused multiply-add (FMA) circuits

Patent number: 11366636

Abstract: An apparatus and method for efficiently performing a multiply add or multiply accumulate operation. For example, one embodiment of a processor comprises: a decoder to decode an instruction specifying an operation, the instruction comprising a first operand identifying a multiplier and a second operand identifying a multiplicand; and fused multiply-add (FMA) execution circuitry comprising first multiplication circuitry to perform a multiplication using the multiplicand and multiplier to generate a result for multipliers and multiplicands falling within a first precision range, and second multiplication circuitry to be used instead of the first multiplication circuitry for multipliers and multiplicands falling within a second precision range.

Type: Grant

Filed: July 1, 2020

Date of Patent: June 21, 2022

Assignee: INTEL CORPORATION

Inventors: Aditya Varma, Michael Espig
Floating point multiply-add, accumulate unit with combined alignment circuits

Patent number: 11366638

Abstract: Floating point Multiply-Add, Accumulate Unit, supporting BF16 format for Multiply-Accumulate operations, and FP32 Single-Precision Addition complying with the IEEE 754 Standard. The Multiply-Accumulate unit uses higher radix and longer internal 2's complement significand representation to facilitate precision as well as comparison and operation with negative numbers. The addition can be performed using Carry-Save format to avoid long carry propagation and speed up the operation. The circuit uses early exponent comparison to shorten the accumulate pipeline stage. Operations including overflow detection, zero detection and sign extension are adopted for 2's complement and Carry-Save format.

Type: Grant

Filed: September 2, 2021

Date of Patent: June 21, 2022

Assignee: SambaNova Systems, Inc.

Inventors: Vojin G. Oklobdzija, Matthew M. Kim
Floating point dot-product operator with correct rounding

Patent number: 11294627

Abstract: The disclosure relates to a hardware operator for dot-product computation, comprising a plurality of multipliers each receiving two multiplicands in the form of floating-point numbers encoded in a first precision format; an alignment circuit associated with each multiplier, configured to, based on the exponents of the corresponding multiplicands, convert the result of the multiplication into a respective fixed-point number having a sufficient number of bits to cover the full dynamic range of the multiplication; and a multi-adder configured to add without loss the fixed-point numbers provided by the multipliers, providing a sum in the form of a fixed-point number.

Type: Grant

Filed: June 25, 2020

Date of Patent: April 5, 2022

Assignee: Kalray

Inventor: Nicolas Brunie
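
The approach in the abstract of patent 11294627 (convert each product to a fixed-point number wide enough for the full dynamic range, then add without loss) can be sketched in software. The choice of IEEE 754 binary16 as the input format, the 48 fractional bits, and the function names are assumptions for illustration; Python's arbitrary-precision integers stand in for the wide multi-adder.

```python
import struct

# binary16 values are multiples of 2**-24, so products are multiples of 2**-48.
FRAC_BITS = 48


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def exact_dot(xs, ys):
    """Dot product of binary16 inputs with a single final rounding."""
    acc = 0
    for x, y in zip(xs, ys):
        p = to_fp16(x) * to_fp16(y)    # exact in binary64 (11-bit significands)
        acc += int(p * 2**FRAC_BITS)   # aligned fixed-point integer, no bits lost
    return acc / 2**FRAC_BITS          # one correctly rounded division


xs = [65504.0, 2.0**-24, -65504.0]
ys = [65504.0, 2.0**-24, 65504.0]
exact = exact_dot(xs, ys)
naive = sum(x * y for x, y in zip(xs, ys))
```

In this example the large terms cancel and only the tiny product 2**-48 remains. Accumulating in binary64 loses it, while the fixed-point accumulator keeps it.
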
Discrete three-dimensional processor

Patent number: 11296068

Abstract: A discrete three-dimensional (3-D) processor comprises first and second dice. The first die comprises 3-D memory (3D-M) arrays, whereas the second die comprises logic circuits and at least an off-die peripheral-circuit component of the 3D-M array(s). The first die does not comprise the off-die peripheral-circuit component. The first and second dice are communicatively coupled by a plurality of inter-die connections. The preferred discrete 3-D processor can be applied to mathematical computing, computer simulation, configurable gate array, pattern processing and neural network.

Type: Grant

Filed: November 15, 2020

Date of Patent: April 5, 2022

Assignees: HangZhou HaiCun Information Technology Co., Ltd.

Inventor: Guobiao Zhang
Computationally efficient general matrix-matrix multiplication (GeMM)

Patent number: 11250105

Abstract: A computation unit that comprises (i) a multiplicand vector decomposer that generates a decomposed multiplicand vector which uses a sequence of first and second concatenated multiplicand sub-elements (1st2ndCMCSE) in a lower-precision format (LPF) to represent corresponding ones of multiplicand elements in a multiplicand vector in a higher-precision format (HPF), (ii) a multiplier vector decomposer that generates a decomposed multiplier vector which uses a sequence of first and second concatenated multiplier sub-elements (1st2ndCMLSE) in the LPF to represent corresponding ones of multiplier elements in a multiplier vector in the HPF, (iii) a multiplicand tensor encoder that encodes double reads of the sequence of the 1st2ndCMCSE in a decomposed multiplicand tensor, and (iv) a product vector generator that generates a product vector containing a sequence of first and second concatenated product sub-elements by executing general matrix-matrix multiplication (GeMM) operations between the double reads of the 1st2

Type: Grant

Filed: May 12, 2020

Date of Patent: February 15, 2022

Assignee: SambaNova Systems, Inc.

Inventors: Mingran Wang, Xiaoyan Li, Yongning Sheng
Anchored data element conversion

Patent number: 10963245

Abstract: An apparatus is provided, that includes an instruction decoder responsive to an anchored-data processing instruction, to generate one or more control signals. Conversion circuitry is responsive to the one or more control signals to perform a conversion from a data value to an anchored-data select value. The conversion is based on anchor metadata indicative of a given range of significance for the anchored-data select value. Output circuitry is responsive to the one or more control signals, to write the anchored-data select value to a register.

Type: Grant

Filed: May 29, 2019

Date of Patent: March 30, 2021

Assignee: Arm Limited

Inventors: David Raymond Lutz, Neil Burgess, Christopher Neal Hinds, Nigel John Stephens
Precise exponent and exact softmax computation

Patent number: 10949766

Abstract: A method for an associative memory device includes dividing a multi-bit mantissa A of a number X into a plurality of smaller partial mantissas Aj, offline calculating a plurality of partial exponents F(Aj) for each possible value of each partial mantissa Aj and storing the plurality of partial exponents F(Aj) in a look up table (LUT) of the associative memory device. A system includes an associative memory array to store a plurality of partial mantissas Ai of a mantissa A of a number X and an exponent calculator to utilize the partial mantissas to compute e to the power of X.

Type: Grant

Filed: October 15, 2017

Date of Patent: March 16, 2021

Assignee: GSI Technology Inc.

Inventor: Avidan Akerib
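
The lookup-table idea in the abstract of patent 10949766 rests on the identity e^(a+b) = e^a * e^b: if a mantissa is split into partial mantissas, the exponential of the whole is the product of precomputed partial exponents. The sketch below assumes an unsigned 16-bit fixed-point mantissa with 8 fractional bits, split into four 4-bit chunks; these widths and all names are illustrative assumptions, and ordinary floating-point multiplication is used, so the result is close to but not bit-exact with `math.exp`.

```python
import math

CHUNK_BITS = 4   # assumed width of each partial mantissa
NUM_CHUNKS = 4   # 16-bit mantissa in total
FRAC_BITS = 8    # assumed scaling: X = A * 2**-FRAC_BITS

# Offline step: one table per chunk position, one entry per possible chunk value.
LUTS = [
    [math.exp(v * 2.0 ** (CHUNK_BITS * j - FRAC_BITS)) for v in range(1 << CHUNK_BITS)]
    for j in range(NUM_CHUNKS)
]


def exp_from_partials(a):
    """Compute e**X for X = a * 2**-FRAC_BITS using only table lookups and multiplies."""
    result = 1.0
    for j in range(NUM_CHUNKS):
        partial = (a >> (CHUNK_BITS * j)) & ((1 << CHUNK_BITS) - 1)
        result *= LUTS[j][partial]
    return result


value = exp_from_partials(int(2.5 * 2**FRAC_BITS))  # approximates e**2.5
```
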
Processing with compact arithmetic processing element

Patent number: 10754616

Abstract: Low precision computers can be efficient at finding possible answers to search problems. However, sometimes the task demands finding better answers than a single low precision search. A computer system augments low precision computing with a small amount of high precision computing, to improve search quality with little additional computing.

Type: Grant

Filed: May 25, 2020

Date of Patent: August 25, 2020

Assignee: Singular Computing LLC

Inventor: Joseph Bates
Apparatus and method for processing floating point values

Patent number: 10503472

Abstract: An apparatus and method are provided for processing floating point values using an intermediate representation which has significand, exponent and shadow sections. A less significant portion of the exponent of the floating point value defines a range of positions within the significand section where the representation of the significand is to be held. The exponent section holds a representation of a more significant portion of the exponent indicating a selected window of multiple contiguous windows spanning a value range of a format of the floating point value. A first portion of the significand section corresponds to the selected window and a second portion corresponds to an overlap into a further window which is adjacent to and lower in the value range.

Type: Grant

Filed: May 17, 2016

Date of Patent: December 10, 2019

Assignee: ARM Limited

Inventors: Daryl John Stewart, Thomas Christopher Grocutt
Shift amount correction for multiply-add

Patent number: 10489115

Abstract: Methods and apparatuses for performing a floating point multiply-add operation with alignment correction. A processor receives a first operand, a second operand and a third operand, wherein the first, second and third operands each represent a floating point number comprising a significand value and a biased exponent value. A processor determines a shift amount based, at least in part, on the one or more biased exponent values of the first, second or third operand. A processor determines a shift amount correction based, at least in part, on the one or more biased exponent values of the first, second or third operand being equal to zero.

Type: Grant

Filed: December 17, 2014

Date of Patent: November 26, 2019

Assignee: International Business Machines Corporation

Inventors: Son T. Dao, Silvia Melitta Mueller
Shift amount correction for multiply-add

Patent number: 10489114

Abstract: Methods and apparatuses for performing a floating point multiply-add operation with alignment correction. A processor receives a first operand, a second operand and a third operand, wherein the first, second and third operands each represent a floating point number comprising a significand value and a biased exponent value. A processor determines a shift amount based, at least in part, on the one or more biased exponent values of the first, second or third operand. A processor determines a shift amount correction based, at least in part, on the one or more biased exponent values of the first, second or third operand being equal to zero.

Type: Grant

Filed: June 27, 2014

Date of Patent: November 26, 2019

Assignee: International Business Machines Corporation

Inventors: Son T. Dao, Silvia Melitta Mueller
Mechanism for instruction set based thread execution on a plurality of instruction sequencers

Patent number: 10452403

Abstract: In an embodiment, a method is provided. The method includes managing user-level threads on a first instruction sequencer in response to executing user-level instructions on a second instruction sequencer that is under control of an application level program. A first user-level thread is run on the second instruction sequencer and contains one or more user level instructions. A first user level instruction has at least 1) a field that makes reference to one or more instruction sequencers or 2) implicitly references with a pointer to code that specifically addresses one or more instruction sequencers when the code is executed.

Type: Grant

Filed: September 26, 2015

Date of Patent: October 22, 2019

Assignee: Intel Corporation

Inventors: Hong Wang, John P. Shen, Edward T. Grochowski, Richard A. Hankins, Gautham N. Chinya, Bryant E. Bigbee, Shivnandan D. Kaushik, Xiang Chris Zou, Per Hammarlund, Scott Dion Rodgers, Xinmin Tian, Anil Aggawal, Prashant Sethi, Baiju V. Patel, James P Held
Memory interface latch with integrated write-through and fence functions

Patent number: 10381098

Abstract: A memory interface latch including a data NAND gate and a feedback gate can be created within an integrated circuit (IC). When a feedback node is driven low, the data NAND gate can drive an inverted value of a memory array bitline input to a data output of the memory interface latch within a time of one gate delay. A feedback gate can, in a functional mode, during one phase of a clock signal, drive the feedback node high and during the other phase of the clock signal, drive the feedback node to a complement of the data output. The feedback gate can also, in an LBIST write-through mode, drive the feedback node to the value of a WRITE_DATA input. The feedback gate can also, in a fence mode, drive the feedback node to a fixed logic value.

Type: Grant

Filed: November 28, 2017

Date of Patent: August 13, 2019

Assignee: International Business Machines Corporation

Inventors: Elizabeth L. Gerhard, Todd A. Christensen, Chad A. Adams, Peter T. Freiburger
System and method of floating point multiply operation processing

Patent number: 10346133

Abstract: A processor includes an integer multiplier configured to execute an integer multiply instruction to multiply significand bits of at least one floating point operand of a floating point multiply operation. The processor also includes a floating point multiplier configured to execute a special purpose floating point multiply accumulate instruction with respect to an intermediate result of the floating point multiply operation and the at least one floating point operand to generate a final floating point multiplication result.

Type: Grant

Filed: December 21, 2017

Date of Patent: July 9, 2019

Assignee: QUALCOMM Incorporated

Inventors: Albert Danysh, Erich Plondke, Eric Mahurin
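
The abstract of patent 10346133 describes multiplying the significand bits of floating-point operands with an integer multiplier. The sketch below illustrates that general idea only; it does not reproduce the patent's instruction sequence (in particular, the special-purpose multiply-accumulate step is not modeled), and the function name is hypothetical.

```python
import math


def fp_multiply_via_int(a, b):
    """Multiply two binary64 values using an integer multiply on the significands."""
    ma, ea = math.frexp(a)    # a == ma * 2**ea with 0.5 <= |ma| < 1 (or ma == 0)
    mb, eb = math.frexp(b)
    ia = int(ma * 2**53)      # exact signed 53-bit integer significand
    ib = int(mb * 2**53)
    product = ia * ib         # the integer multiply (up to 106 bits)
    # float() rounds the wide product to 53 bits; ldexp then applies the exponent.
    return math.ldexp(float(product), ea + eb - 106)


p = fp_multiply_via_int(0.1, 0.3)
```

For results in the normal range this matches the hardware product, because the only rounding happens when the 106-bit integer is converted back to a float. Results that underflow to subnormal values may round twice and are outside the scope of this sketch.
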
Handling floating-point operations

Patent number: 10331406

Abstract: A data processing apparatus and method of operating a data processing apparatus are disclosed. Comparisons are made between first and second floating-point operands received. A more significant portion of the first floating-point operand and of the second floating-point operand are subject to comparison. The more significant portion of the first floating-point operand minus a least significant bit in the more significant portion is subject to comparison with the more significant portion of the second floating-point operand. A less significant portion of the first floating-point operand and of the second floating-point operand are also subject to comparison. In dependence on the outcome of these comparisons, right-shift circuitry is used selectively to perform a 1-bit right shift on a difference calculated between the first floating-point operand and the second floating-point operand.

Type: Grant

Filed: November 17, 2017

Date of Patent: June 25, 2019

Assignee: ARM Limited

Inventors: David Raymond Lutz, Thomas Gilles Tarridec
Performing a comparison computation in a computer system

Patent number: 10037191

Abstract: A method and computer system are provided for performing a comparison computation, e.g. for use in a check procedure for a reciprocal square root operation. The comparison computation compares a multiplication of three values with a predetermined value. The computer system performs the multiplication using multiplier logic which is configured to perform multiply operations in which two values are multiplied together. A first and second of the three values are multiplied to determine a first intermediate result, w1. The digits of w1 are separated into two portions, w1,1 and w1,2. The third of the three values is multiplied with w1,2 and the result is added into a multiplication of the third of the three values with w1,1 to thereby determine the result of multiplying the three values together. In this way the comparison is performed with high accuracy, while keeping the area and power consumption of the multiplier logic low.

Type: Grant

Filed: January 18, 2018

Date of Patent: July 31, 2018

Assignee: Imagination Technologies Limited

Inventor: Leonard Rarick
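
The decomposition in the abstract of patent 10037191 can be checked with integers. The sketch assumes that w1,1 is the upper portion and w1,2 the lower portion of the intermediate result, and that the split point is 16 bits; the abstract does not specify either, so both are assumptions.

```python
def multiply_three(a, b, c, low_bits=16):
    """Compute a*b*c using only two-operand multiplies on narrower pieces."""
    w1 = a * b                              # first intermediate result
    w1_hi = w1 >> low_bits                  # assumed to be w1,1
    w1_lo = w1 & ((1 << low_bits) - 1)      # assumed to be w1,2
    # c * w1 == (c * w1_hi) * 2**low_bits + c * w1_lo
    return ((c * w1_hi) << low_bits) + c * w1_lo


triple = multiply_three(12345, 67890, 4242)
```

Each of the two final multiplications involves a narrower operand than a direct `c * w1` would, which is the property the abstract relies on to keep the multiplier logic small.
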
Fused multiply-add (FMA) low functional unit

Patent number: 9996320

Abstract: An example processor includes a register and a fused multiply-add (FMA) low functional unit. The register stores first, second, and third floating point (FP) values. The FMA low functional unit receives a request to perform an FMA low operation: multiplies the first FP value with the second FP value to obtain a first product value; adds the first product with the third FP value to generate a first result value; rounds the first result to generate a first FMA value; multiplies the first FP value with the second FP value to obtain a second product value; adds the second product value with the third FP value to generate a second result value; and subtracts the FMA value from the second result value to obtain a third result value, which can then be normalized and rounded (FMA low result) and sends the FMA low result to an application.

Type: Grant

Filed: December 23, 2015

Date of Patent: June 12, 2018

Assignee: Intel Corporation

Inventors: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber
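
The FMA low value described in the abstract of patent 9996320 is the part of a*b + c that is lost when the fused result is rounded. A software model using exact rational arithmetic is sketched below; `fractions.Fraction` is a stand-in for the unrounded intermediate result, and the function name is hypothetical.

```python
from fractions import Fraction


def fma_and_low(a, b, c):
    """Return (rounded a*b + c, rounded residual) for binary64 inputs."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # unrounded a*b + c
    high = float(exact)                              # the ordinary FMA result
    low = float(exact - Fraction(high))              # what rounding discarded
    return high, low


x = 1.0 + 2.0**-30
high, low = fma_and_low(x, x, 0.0)
```

Here x*x is exactly 1 + 2**-29 + 2**-60; the last term does not fit in a 53-bit significand, so it is dropped from `high` and reappears as `low`.
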
Floating point (FP) add low instructions functional unit

Patent number: 9996319

Abstract: An example processor includes a register and an ADD low functional unit. The register stores first, second, and third floating point (FP) values. The ADD low functional unit receives a request to perform an ADD low operation and, responsive to the request: adds the first FP value with the second FP value to obtain a first sum value; rounds the first sum value to generate an ADD value; adds the first FP value with the second FP value to obtain a second sum value; subtracts the ADD value from the second sum value to generate a difference value; normalizes the difference value to obtain a normalized difference value; rounds the normalized difference value to generate an ADD low value; and sends the ADD low value to an application.

Type: Grant

Filed: December 23, 2015

Date of Patent: June 12, 2018

Assignee: Intel Corporation

Inventors: Cristina S. Anderson, Marius A. Cornea-Hasegan, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Nikita Astafev, Mark J. Charney, Milind B. Girkar, Amit Gradstein, Simon Rubanovich, Zeev Sperber
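
In software, the quantity that the ADD low operation of patent 9996319 returns (the rounding error of a floating-point sum) is commonly obtained with Knuth's TwoSum algorithm. TwoSum is a swapped-in technique shown for comparison, not the patented functional unit; it assumes round-to-nearest arithmetic and no overflow.

```python
from fractions import Fraction


def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and s + err == a + b exactly."""
    s = a + b
    b_virtual = s - a
    err = (a - (s - b_virtual)) + (b - b_virtual)
    return s, err


s, err = two_sum(0.1, 0.2)
# The pair (s, err) represents the exact sum of the two inputs.
is_exact = Fraction(s) + Fraction(err) == Fraction(0.1) + Fraction(0.2)
```

TwoSum needs six floating-point operations to recover the error term, which gives some sense of why a single hardware instruction for the same value could be attractive.
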
Apparatus and method for floating-point multiplication

Patent number: 9836279

Abstract: An apparatus and method for floating-point multiplication are provided. Two partial products are generated from two operand significands, which are then added to generate a product significand. The value of an unbiased result exponent is determined from the operand exponent values and leading zero counts, and a shift amount and direction for the product significand are determined in dependence on a predetermined minimum exponent value of a predetermined canonical format. The product significand is shifted by the shift amount in the shift direction. An overflow mask identifying an overflow bit position of the product significand is generated by right shifting a predetermined mask pattern by the shift amount, and the overflow mask is applied to the product significand to extract an overflow value at the overflow bit position. This extraction of the overflow value happens before the shift circuitry shifts the product significand, allowing an overall faster floating-point multiplication to be performed.

Type: Grant

Filed: September 25, 2015

Date of Patent: December 5, 2017

Assignee: ARM Limited

Inventor: David Raymond Lutz
Dynamic attribute based application policy

Patent number: 9825814

Abstract: Systems, methods, and computer-readable storage media are provided for dynamically setting an end point group for an end point. An end point can be assigned a default end point group when added to a network. For example, the default end point group can be a baseline port/security group which is considered an untrusted group. The end point can then be dynamically assigned an end point group based on a set of group selection rules. For example, the group selection rules can identify an end point group based on the MAC address or other attributes. When the end point is added to the network, the MAC address and/or other attributes of the end point can be determined and used to assign an end point group. As another example, an end point group can be assigned based on the amount of traffic or guest operating system.

Type: Grant

Filed: July 27, 2015

Date of Patent: November 21, 2017

Assignee: CISCO TECHNOLOGY, INC.

Inventors: Joji Thomas Mekkattuparamban, Vijay Chander, Saurabh Jain, Van Lieu, Badhri Madabusi Vijayaraghavan, Praveen Jain, Munish Mehta, Michael R. Smith, Narender Enduri
Floating-point calculation apparatus, program, and calculation apparatus

Patent number: 9690545

Abstract: A floating-point calculation apparatus comprising: a selection part; an addition and subtraction calculation part; an output determination part; and a buffer management part configured to add, when it is determined that a buffer used to store an input value is not prepared, a buffer that corresponds to the input value, wherein when a number of significant digits of the result of performing an addition and subtraction calculation exceeds a number of significant digits of the buffer selected by the selection part, the addition and subtraction calculation part shifts right or shifts left part of the result of performing the addition and subtraction calculation and divides the result of performing the addition and subtraction calculation into values each being storable in one of a plurality of buffers.

Type: Grant

Filed: May 29, 2015

Date of Patent: June 27, 2017

Assignee: HONDA MOTOR CO., LTD.

Inventors: Shiro Kitamura, Tomoyuki Okumura, Hiroshi Nanjo
Computer and methods for solving math functions

Patent number: 9606796

Abstract: Computers and methods for performing mathematical functions are disclosed. An embodiment of a computer includes an operations level and a driver level. The operations level performs mathematical operations. The driver level includes a first lookup table and a second lookup table, wherein the first lookup table includes first data for calculating at least one mathematical function using a first level of accuracy. The second lookup table includes second data for calculating the at least one mathematical function using a second level of accuracy, wherein the first level of accuracy is greater than the second level of accuracy. A driver executes either the first data or the second data depending on a selected level of accuracy.

Type: Grant

Filed: October 30, 2013

Date of Patent: March 28, 2017

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Kyong Ho Lee, Seok-Jun Lee, Manish Goel
Processor with efficient arithmetic units

Patent number: 9348558

Abstract: A processor includes a carry save array multiplier. The carry save array multiplier includes an array of cascaded partial product generators. The array of cascaded partial product generators is configured to generate an output value as a product of two operands presented at inputs of the multiplier. The array of cascaded partial product generators is also configured to generate an output value as a sum of two operands presented at inputs of the multiplier.

Type: Grant

Filed: August 23, 2013

Date of Patent: May 24, 2016

Assignee: TEXAS INSTRUMENTS DEUTSCHLAND GMBH

Inventors: Christian Wiencke, Armin Stingl
Wavelet transformation using multicore processors

Patent number: 9197902

Abstract: A method for wavelet based data compression comprising: receiving data associated with a set of pixels, computing wavelet coefficients by applying a series of Discrete Wavelet Transform (DWT) low-pass and high-pass filtering operations, wherein a number of filtering operations is reduced by: identifying common partial products for at least one of the low-pass filtering operations and the high-pass filtering operations, classifying a first portion of the wavelet coefficients as low magnitude coefficients and a second portion of the wavelet coefficients as high magnitude coefficients, eliminating the common partial products for the high magnitude wavelet coefficients, replacing multiplication operations for the low magnitude wavelet coefficients with shift-and-add operations, and eliminating the common partial products, and applying the DWT based on remaining filtering operations.

Type: Grant

Filed: January 14, 2011

Date of Patent: November 24, 2015

Assignee: M.S. Ramaiah School of Advanced Studies

Inventors: Dipayan Mazumdar, Cyril Prasanna Raj P, Brahmananda Reddy Ganda
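
The abstract of patent 9197902 above mentions replacing multiplications with shift-and-add operations. A minimal Python sketch of that general technique follows; the function name `shift_add_multiply` and the restriction to non-negative integer coefficients are assumptions made for illustration, and the sketch does not model the patent's coefficient classification or partial-product elimination.

```python
def shift_add_multiply(value, coefficient):
    """Multiply `value` by a non-negative integer `coefficient` using only shifts and adds."""
    if coefficient < 0:
        raise ValueError("coefficient must be non-negative")
    result = 0
    shift = 0
    while coefficient:
        # Each set bit of the coefficient contributes one shifted copy of the value.
        if coefficient & 1:
            result += value << shift
        coefficient >>= 1
        shift += 1
    return result
```

For a coefficient of 10 (binary 1010) this reduces to `(value << 1) + (value << 3)`, that is, two shifts and one addition.
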
Method and apparatus for decimal floating-point data logical extraction

Patent number: 9170772

Abstract: Embodiments of systems, apparatuses, and methods for performing BIDSplit instructions in a computer processor are described. In some embodiments, the execution of a BIDSplit instruction tests the encoding of a binary-integer decimal source value and extracts a sign, exponent, and/or significand into a destination.

Type: Grant

Filed: December 23, 2011

Date of Patent: October 27, 2015

Assignee: Intel Corporation

Inventor: Shihjong J. Kuo
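
To make the extraction described in the abstract of patent 9170772 above more concrete, here is a hedged Python sketch that splits a 64-bit binary-integer-decimal (BID) value into its fields, following the IEEE 754-2008 decimal64 layout. The function name `bid64_split` and the `None` return convention for infinities and NaNs are hypothetical; the sketch does not reproduce the BIDSplit instruction's actual operands or destination format, and it does not treat non-canonical significands specially.

```python
DECIMAL64_BIAS = 398  # exponent bias of the IEEE 754-2008 decimal64 format

def bid64_split(bits):
    """Split a 64-bit BID decimal64 encoding into (sign, unbiased exponent, significand).

    Returns None for the exponent and significand when the value is an infinity or NaN.
    """
    sign = (bits >> 63) & 1
    if (bits >> 59) & 0b1111 == 0b1111:
        return sign, None, None  # infinity or NaN
    if (bits >> 61) & 0b11 == 0b11:
        # Large-coefficient form: the exponent field sits two bits lower and
        # the significand carries an implied "100" prefix.
        biased_exponent = (bits >> 51) & 0x3FF
        significand = (0b100 << 51) | (bits & ((1 << 51) - 1))
    else:
        biased_exponent = (bits >> 53) & 0x3FF
        significand = bits & ((1 << 53) - 1)
    return sign, biased_exponent - DECIMAL64_BIAS, significand
```

For example, the value -12.5 is stored as significand 125 with exponent -1, so the encoding `(1 << 63) | (397 << 53) | 125` splits back into sign 1, exponent -1, and significand 125.
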
Functional unit for vector leading zeroes, vector trailing zeroes, vector operand 1s count and vector parity calculation

Patent number: 9092213

Abstract: A method of performing vector operations on a semiconductor chip is described. The method includes performing a first vector instruction with a vector functional unit implemented on the semiconductor chip and performing a second vector instruction with the vector functional unit. The first vector instruction is a vector multiply add instruction. The second vector instruction is a vector leading zeros count instruction.

Type: Grant

Filed: September 24, 2010

Date of Patent: July 28, 2015

Assignee: Intel Corporation

Inventors: Jeff Wiedemeier, Sridhar Samudrala, Roger Golliver, Eric W. Mahurin
Specialized processing block for implementing floating-point multiplier with subnormal operation support

Patent number: 8996600

Abstract: The functions available in a specialized processing block of a programmable device include floating-point operations, including support within the specialized processing block for subnormal operations. This is accomplished, in part, by borrowing an adder in the specialized processing block and using the adder to operate on output of a multiplier or other operator to complete a subnormal operation. Although the adder becomes unavailable to serve as an adder, the need to complete the operation in slower, more valuable general-purpose logic is avoided. The adder and the other operator need not necessarily be located together in a specialized processing block.

Type: Grant

Filed: August 3, 2012

Date of Patent: March 31, 2015

Assignee: Altera Corporation

Inventor: Martin Langhammer
Method for implementing 32 bit complex multiplication by using 16-bit complex multipliers

Patent number: 8943114

Abstract: An apparatus including a first circuit and a second circuit. The first circuit may be configured to receive a first 2N-bit complex number and a second 2N-bit complex number, each having a first format, and to reformat the first and the second 2N-bit complex numbers to a second format such that a lower portion of each real and imaginary part of each 2N-bit complex number is positive. The second circuit may be configured to multiply the first and the second 2N-bit complex numbers using at least one N-bit signed complex multiplier, where N is an integer.

Type: Grant

Filed: August 17, 2011

Date of Patent: January 27, 2015

Assignee: Avago Technologies General IP (Singapore) Pte. Ltd.

Inventors: Eran Ovadia Mershain, Eran Goldstein
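
The reformatting step in the abstract of patent 8943114 above, in which the lower portion of each part is made positive before narrower multipliers are used, can be approximated in software. The Python sketch below is a simplified illustration with N = 16: it splits each signed 32-bit component into a signed high half and a non-negative low half and rebuilds the product from four narrower partial products. The helper names are hypothetical, and the real circuit's number formats and multiplier widths may differ.

```python
N = 16  # assumed half-width; operands are signed 2N-bit integers

def split_signed(x):
    """Split a signed 2N-bit integer into a signed high part and a non-negative low part."""
    low = x & ((1 << N) - 1)   # always in [0, 2**N)
    high = (x - low) >> N      # signed; x == high * 2**N + low
    return high, low

def wide_multiply(x, y):
    """Multiply two signed 2N-bit integers from four narrower partial products."""
    xh, xl = split_signed(x)
    yh, yl = split_signed(y)
    return ((xh * yh) << (2 * N)) + ((xh * yl + xl * yh) << N) + xl * yl

def complex_multiply(a, b):
    """Multiply complex integers given as (real, imag) tuples."""
    ar, ai = a
    br, bi = b
    return (wide_multiply(ar, br) - wide_multiply(ai, bi),
            wide_multiply(ar, bi) + wide_multiply(ai, br))
```

Keeping the low halves non-negative means only the high halves carry sign information, which is what lets the four partial products be combined with plain shifts and additions.
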
Floating point execution unit with fixed point functionality

Patent number: 8930432

Abstract: A floating point execution unit is capable of selectively repurposing one or more adders in an exponent path of the floating point execution unit to perform fixed point addition operations, thereby providing fixed point functionality in the floating point execution unit.

Type: Grant

Filed: August 4, 2011

Date of Patent: January 6, 2015

Assignee: International Business Machines Corporation

Inventors: Mark J. Hickey, Adam J. Muff, Matthew R. Tubbs, Charles D. Wait
Circuit which performs split precision, signed/unsigned, fixed and floating point, real and complex multiplication

Patent number: 8918445

Abstract: An integrated multiplier circuit that operates on a variety of data formats including integer fixed point, signed or unsigned, real or complex, 8 bit, 16 bit or 32 bit as well as floating point data that may be single precision real, single precision complex or double precision. The circuit uses a single set of multiplier arrays to perform 16×16, 32×32 and 64×64 multiplies, 32×32 and 64×64 complex multiplies, 32×32 and 64×64 complex multiplies with one operand conjugated.

Type: Grant

Filed: September 21, 2011

Date of Patent: December 23, 2014

Assignee: Texas Instruments Incorporated

Inventors: Timothy David Anderson, Mujibur Rahman
Reducing power consumption in multi-precision floating point multipliers

Patent number: 8918446

Abstract: Methods and apparatus relating to reducing power consumption in multi-precision floating point multipliers are described. In an embodiment, certain portions of a multiplier are disabled in response to two or more multiplication operations with the same data size and data type occurring back-to-back. Other embodiments are also claimed and described.

Type: Grant

Filed: December 14, 2010

Date of Patent: December 23, 2014

Assignee: Intel Corporation

Inventors: Brent R. Boswell, Thierry Pons, Tom Aviram
SYSTEM AND METHOD FOR ACCELERATING EVALUATION OF FUNCTIONS

Publication number: 20140372493

Abstract: A system and method for accelerating evaluation of functions. In one embodiment, a method includes receiving, by a processor, a value to be processed, and notification of a function to be applied to the value. The value is represented in a floating point format. The value is converted, by the processor, to a fixed point format. Which of Newton-Raphson and polynomial approximation is to be used to apply the function to the value in the fixed point format is determined by the processor. The function is applied to the value in the fixed point format to generate a result in the fixed point format. The result is converted to the floating point format by the processor.

Type: Application

Filed: June 14, 2013

Publication date: December 18, 2014

Inventors: Brent Everett Peterson, Nitya Ramdas, Sotirios Christodulos Tsongas, Jonathan Zack Albus, Johann Zipperer
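
The flow in the abstract of publication 20140372493 above (convert to fixed point, apply the function, convert back) can be sketched for one function. The Python example below assumes a Q16.16 fixed-point format and evaluates a square root with Newton-Raphson iteration; the format, the choice of square root, and the names `isqrt_newton` and `fixed_sqrt` are illustrative assumptions, and the selection between Newton-Raphson and polynomial approximation is not modeled.

```python
FRACTION_BITS = 16  # assumed Q16.16 fixed-point format

def isqrt_newton(n):
    """Integer square root (floor) of a non-negative integer by Newton-Raphson iteration."""
    if n == 0:
        return 0
    # Initial guess is a power of two that is at least sqrt(n).
    y = 1 << ((n.bit_length() + 1) // 2)
    while True:
        z = (y + n // y) // 2
        if z >= y:
            return y
        y = z

def fixed_sqrt(x):
    """Square root of a non-negative float, evaluated in fixed point and converted back."""
    x_fixed = round(x * (1 << FRACTION_BITS))                 # float -> fixed
    root_fixed = isqrt_newton(x_fixed << FRACTION_BITS)       # apply function in fixed point
    return root_fixed / (1 << FRACTION_BITS)                  # fixed -> float
```

Shifting the fixed-point input left by another 16 bits before taking the integer root keeps the result in the same Q16.16 scale.
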
SYSTEM AND METHOD FOR DYNAMICALLY REDUCING POWER CONSUMPTION OF FLOATING-POINT LOGIC

Publication number: 20140351308

Abstract: A system and method are provided for dynamically reducing power consumption of floating-point logic. A disable control signal that is based on a characteristic of a floating-point format input operand is received and a portion of a logic circuit is disabled based on the disable control signal. The logic circuit processes the floating-point format input operand to generate an output.

Type: Application

Filed: May 23, 2013

Publication date: November 27, 2014

Applicant: NVIDIA Corporation

Inventors: David C. Tannenbaum, Srinivasan Iyer
Normalization of floating point operations in a programmable integrated circuit device

Patent number: 8886695

Abstract: A programmable integrated circuit device is programmed to normalize multiplication operations by examining the input or output values to determine the likelihood of overflow or underflow and then to adjust the input or output values accordingly. The examination of the inputs can include an examination of the number of adder stages feeding into the inputs, as well as a count of leading bits ahead of the first significant bit. Adjustment of an input can include shifting the mantissa by the leading bit count and adjusting the exponent accordingly, while adjustment of the output can include shifting the mantissa by the sum of the leading bit counts of the inputs and adjusting the exponent accordingly. Or the output can be examined to find its leading bit count and the output then can be adjusted by shifting the mantissa by the leading bit count and adjusting the exponent accordingly.

Type: Grant

Filed: July 10, 2012

Date of Patent: November 11, 2014

Assignee: Altera Corporation

Inventor: Martin Langhammer
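
The adjustment described in the abstract of patent 8886695 above, shifting the mantissa by the leading bit count and adjusting the exponent accordingly, amounts to the following arithmetic. This Python sketch assumes an unsigned 24-bit mantissa and a value of `mantissa * 2**exponent`; the width and the name `normalize` are hypothetical, and the overflow and underflow likelihood checks from the abstract are not modeled.

```python
MANTISSA_BITS = 24  # hypothetical mantissa width

def normalize(mantissa, exponent):
    """Shift an unsigned mantissa left until its top bit is set, adjusting the exponent to match."""
    if mantissa == 0:
        return 0, exponent
    # Count of zero bits ahead of the first significant bit.
    leading_zeros = MANTISSA_BITS - mantissa.bit_length()
    return mantissa << leading_zeros, exponent - leading_zeros
```

The represented value is unchanged, because each left shift of the mantissa is offset by decrementing the exponent by one.
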
Modular digital signal processing circuitry with optionally usable, dedicated connections between modules of the circuitry

Patent number: 8751551

Abstract: Digital signal processing (“DSP”) circuit blocks are provided that can more easily work together to perform larger (e.g., more complex and/or more arithmetically precise) DSP operations if desired. These DSP blocks may also include redundancy circuitry that facilitates stitching together multiple such blocks despite an inability to use some block (e.g., because of a circuit defect). Systolic registers may be included at various points in the DSP blocks to facilitate use of the blocks to implement systolic form, finite-impulse-response (“FIR”), digital filters.

Type: Grant

Filed: November 21, 2013

Date of Patent: June 10, 2014

Assignee: Altera Corporation

Inventors: Keone Streicher, Martin Langhammer, Yi-Wen Lin, Wai-Bor Leung, David Lewis, Volker Mauer, Henry Y. Lui, Suleyman Sirri Demirsoy, Hyun Yi
Digital signal processing circuit blocks with support for systolic finite-impulse-response digital filtering

Patent number: 8732225

Abstract: Digital signal processing (“DSP”) block circuitry on an integrated circuit (“IC”) is adapted for use (e.g., in multiple instances of the DSP block circuitry on the IC) for implementing finite-impulse-response (“FIR”) digital filters in systolic form. Each DSP block may include (1) first and second multiplier circuitry and (2) adder circuitry for adding (a) outputs of the multipliers and (b) signals chained in from a first other instance of the DSP block circuitry. Systolic delay circuitry is provided for either the outputs of the first multiplier (upstream from the adder) or at least one of the sets of inputs to the first multiplier. Additional systolic delay circuitry is provided for outputs of the adder, which are chained out to a second other instance of the DSP block circuitry.

Type: Grant

Filed: October 11, 2013

Date of Patent: May 20, 2014

Assignee: Altera Corporation

Inventors: Suleyman Demirsoy, Hyun Yi
Implementing mixed-precision floating-point operations in a programmable integrated circuit device

Patent number: 8706790

Abstract: The resources needed, particularly in a programmable device, when carrying out a mixed-precision multiplication-based floating-point operation (i.e., multiplication or division) are reduced by maintaining the mantissas of the operands in their native precisions instead of promoting the lower-precision number to the higher precision. Exponents and other elements can be handled by the higher-precision logic as they do not consume significant resources.

Type: Grant

Filed: March 3, 2009

Date of Patent: April 22, 2014

Assignee: Altera Corporation

Inventor: Martin Langhammer
MIXED PRECISION FUSED MULTIPLY-ADD OPERATOR

Publication number: 20140089371

Abstract: A circuit for calculating the fused sum of an addend and product of two multiplicands, the addend and multiplicands being binary floating-point numbers represented in a standardized format as a mantissa and an exponent is provided. The multiplicands are in a lower precision format than the addend, with q>2p, where p and q are respectively the mantissa size of the multiplicand precision format and the addend precision format. The circuit includes a p-bit multiplier receiving the mantissas of the multiplicands; a shift circuit that aligns the mantissa of the addend with the product output by the multiplier based on the exponent values of the addend and multiplicands; and an adder that processes q-bit mantissas, receiving the aligned mantissa of the addend and the product, the input lines of the adder corresponding to the product being completed to the right by lines at 0 to form a q-bit mantissa.

Type: Application

Filed: April 19, 2012

Publication date: March 27, 2014

Applicant: KALRAY

Inventors: Florent Dupont De Dinechin, Nicolas Brunie, Benoit Dupont De Dinechin
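
The condition q > 2p in the abstract of publication 20140089371 above has a simple numerical consequence that can be shown in Python, assuming binary32 multiplicands (p = 24) and a binary64 addend (q = 53): the product of the two multiplicands fits exactly in the wider format, so only the final addition rounds. The function names `to_float32` and `mixed_fma` are hypothetical, and the sketch models the arithmetic result only, not the circuit's shifter and adder.

```python
import struct

def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def mixed_fma(a, b, c):
    """Return a * b + c with binary32 multiplicands and a binary64 addend."""
    a32, b32 = to_float32(a), to_float32(b)
    product = a32 * b32   # exact in binary64: 24 + 24 significand bits fit in 53
    return product + c    # the only rounding happens here
```

Because the intermediate product is never rounded, the result matches what a fused multiply-add would return for these operand formats.
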
System and method of bypassing unrounded results in a multiply-add pipeline unit

Patent number: 8671129

Abstract: A processing unit, system, and method for performing a multiply operation in a multiply-add pipeline. To reduce the pipeline latency, the unrounded result of a multiply-add operation is bypassed to the inputs of the multiply-add pipeline for use in a subsequent operation. If it is determined that rounding is required for the prior operation, then the rounding will occur during the subsequent operation. During the subsequent operation, a Booth encoder not utilized by the multiply operation will output a rounding correction factor as a selection input to a Booth multiplexer not utilized by the multiply operation. When the Booth multiplexer receives the rounding correction factor, the Booth multiplexer will output a rounding correction value to a carry save adder (CSA) tree, and the CSA tree will generate the correct sum from the rounding correction value and the other partial products.

Type: Grant

Filed: March 8, 2011

Date of Patent: March 11, 2014

Assignee: Oracle International Corporation

Inventors: Jeffrey S. Brooks, Christopher H. Olson
DOUBLE ROUNDED COMBINED FLOATING-POINT MULTIPLY AND ADD

Publication number: 20140006467

Abstract: Methods, apparatus, instructions and logic are disclosed providing double rounded combined floating-point multiply and add functionality as scalar or vector SIMD instructions or as fused micro-operations. Embodiments include detecting floating-point (FP) multiplication operations and subsequent FP operations specifying as source operands results of the FP multiplications. The FP multiplications and the subsequent FP operations are encoded as combined FP operations including rounding of the results of FP multiplication followed by the subsequent FP operations. The encoding of said combined FP operations may be stored and executed as part of an executable thread portion using fused-multiply-add hardware that includes overflow detection for the product of FP multipliers, first and second FP adders to add third operand addend mantissas and the products of the FP multipliers with different rounding inputs based on overflow, or no overflow, in the products of the FP multiplier.

Type: Application

Filed: June 29, 2012

Publication date: January 2, 2014

Inventors: Sridhar Samudrala, Grigorios Magklis, Marc Lupon, David R. Ditzel
Modular digital signal processing circuitry with optionally usable, dedicated connections between modules of the circuitry

Patent number: 8620977

Abstract: Digital signal processing (“DSP”) circuit blocks are provided that can more easily work together to perform larger (e.g., more complex and/or more arithmetically precise) DSP operations if desired. These DSP blocks may also include redundancy circuitry that facilitates stitching together multiple such blocks despite an inability to use some block (e.g., because of a circuit defect). Systolic registers may be included at various points in the DSP blocks to facilitate use of the blocks to implement systolic form, finite-impulse-response (“FIR”), digital filters.

Type: Grant

Filed: August 7, 2013

Date of Patent: December 31, 2013

Assignee: Altera Corporation

Inventors: Keone Streicher, Martin Langhammer, Yi-Wen Lin, Wai-Bor Leung, David Lewis, Volker Mauer, Henry Y. Lui, Suleyman Sirri Demirsoy, Hyun Yi
