Patents Examined by Michael D Yaary
  • Patent number: 11861328
    Abstract: A processor for fine-grain sparse integer and floating-point operations and method of operation thereof are provided. In some embodiments, the method includes forming a first set of products, and forming a second set of products. The forming of the first set of products may include: multiplying, in a first multiplier, a second multiplier, and a third multiplier, the first activation value by a first least significant sub-word, a second least significant sub-word, and a most significant sub-word; and adding a first resulting partial product and a second resulting partial product. The forming of the second set of products may include forming a first floating point product, the forming of the first floating point product including multiplying, in the first multiplier, a first sub-word of a mantissa of an activation value by a first sub-word of a mantissa of a weight, to form a third partial product.
    Type: Grant
    Filed: December 23, 2020
    Date of Patent: January 2, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ali Shafiee Ardestani, Joseph H. Hassoun
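    Illustration: the integer path described in this abstract, where one activation value is multiplied by several sub-words of a weight and the resulting partial products are added, can be mimicked in software. This is a minimal sketch and not the patented circuit. The 4-bit sub-word width, the unsigned operands, and the name `subword_multiply` are assumptions made for the example.

```python
def subword_multiply(activation: int, weight: int,
                     subword_bits: int = 4, num_subwords: int = 3) -> int:
    """Multiply by splitting an unsigned weight into sub-words (hypothetical helper).

    Assumes 0 <= weight < 2 ** (subword_bits * num_subwords).
    """
    mask = (1 << subword_bits) - 1
    total = 0
    for i in range(num_subwords):
        # Extract sub-word i, starting from the least significant one.
        subword = (weight >> (i * subword_bits)) & mask
        # Each sub-word product stands in for one narrow hardware multiplier.
        partial = activation * subword
        # Shift the partial product to its weight position before adding.
        total += partial << (i * subword_bits)
    return total
```

    With the defaults, a 12-bit weight is handled as two least significant sub-words and one most significant sub-word, and the shifted sum equals the ordinary product.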
  • Patent number: 11853715
    Abstract: A system comprises a floating-point computation unit configured to perform a dot-product operation in accordance with a first floating-point value and a second floating-point value, and detection logic operatively coupled to the floating-point computation unit. The detection logic is configured to compute a difference between fixed-point summations of exponent parts of the first floating-point value and the second floating-point value and, based on the computed difference, detect the presence of a condition prior to completion of the dot-product operation by the floating-point computation unit. In response to detection of the presence of the condition, the detection logic is further configured to cause the floating-point computation unit to avoid performing a subset of computations otherwise performed as part of the dot-product operation.
    Type: Grant
    Filed: November 23, 2020
    Date of Patent: December 26, 2023
    Assignee: International Business Machines Corporation
    Inventors: Mingu Kang, Seonghoon Woo, Eun Kyung Lee
  • Patent number: 11842166
    Abstract: A processor or other device, such as a programmable and/or massively parallel processor or other device, includes processing elements designed to perform arithmetic operations (possibly but not necessarily including, for example, one or more of addition, multiplication, subtraction, and division) on numerical values of low precision but high dynamic range (“LPHDR arithmetic”). Such a processor or other device may, for example, be implemented on a single chip. Whether or not implemented on a single chip, the number of LPHDR arithmetic elements in the processor or other device in certain embodiments of the present invention significantly exceeds (e.g., by at least 20 more than three times) the number of arithmetic elements, if any, in the processor or other device which are designed to perform high dynamic range arithmetic of traditional precision (such as 32 bit or 64 bit floating point arithmetic).
    Type: Grant
    Filed: December 2, 2022
    Date of Patent: December 12, 2023
    Assignee: Singular Computing LLC
    Inventor: Joseph Bates
  • Patent number: 11836214
    Abstract: A matrix calculation device including a storing unit, a multiply accumulate (MAC) circuit, a pre-fetch circuit, and a control circuit, and an operation method thereof are provided. The storing unit stores a first matrix and a second matrix. The MAC circuit is configured to execute MAC calculation. The pre-fetch circuit pre-fetches at least one column of the first matrix from the storing unit to act as pre-fetch data, pre-fetches at least one row of the second matrix from the storing unit to act as the pre-fetch data, or pre-fetches at least one column of the first matrix and at least one row of the second matrix from the storing unit to act as the pre-fetch data. The control circuit decides whether to perform the MAC calculation on a current column of the first matrix and a current row of the second matrix through the MAC circuit according to the pre-fetch data.
    Type: Grant
    Filed: September 28, 2020
    Date of Patent: December 5, 2023
    Assignee: NEUCHIPS CORPORATION
    Inventors: Chiung-Liang Lin, Chao-Yang Kao
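    Illustration: the abstract does not say what the control circuit looks for in the pre-fetch data. The sketch below assumes one plausible criterion, namely skipping the MAC step when the pre-fetched column of the first matrix or the pre-fetched row of the second matrix is entirely zero. The function name `matmul_skip_zero` and the list-of-lists matrix layout are likewise assumptions.

```python
def matmul_skip_zero(a, b):
    """Outer-product matrix multiply that skips all-zero column/row pairs.

    `a` is rows x inner, `b` is inner x cols (lists of lists).
    Returns the product and the number of skipped MAC steps.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0] * cols for _ in range(rows)]
    skipped = 0
    for k in range(inner):
        col_a = [a[i][k] for i in range(rows)]  # "pre-fetched" column of a
        row_b = b[k]                            # "pre-fetched" row of b
        # Assumed decision rule: an all-zero operand contributes nothing.
        if not any(col_a) or not any(row_b):
            skipped += 1
            continue
        for i in range(rows):
            for j in range(cols):
                c[i][j] += col_a[i] * row_b[j]
    return c, skipped
```

    Skipping is safe under this rule because the outer product of a zero vector with any vector is a zero matrix, so the accumulated result is unchanged.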
  • Patent number: 11829322
    Abstract: A vector memory subsystem for use with a programmable mix-radix vector processor (“PVP”) capable of calculating discrete Fourier transform (“DFT/IDFT”) values. In an exemplary embodiment, an apparatus includes a vector memory bank and a vector memory system (VMS) that generates input memory addresses that are used to store input data into the vector memory bank. The VMS also generates output memory addresses that are used to unload vector data from the memory banks. The input memory addresses are used to shuffle the input data in the memory bank based on a radix factorization associated with an N-point DFT, and the output memory addresses are used to unload the vector data from the memory bank to compute radix factors of the radix factorization.
    Type: Grant
    Filed: December 16, 2020
    Date of Patent: November 28, 2023
    Assignee: Marvell Asia Pte, Ltd.
    Inventors: Yuanbin Guo, Hong Jik Kim
  • Patent number: 11816448
    Abstract: An ALU is capable of generating a multiply accumulation by compressing like-magnitude partial products. Given N pairs of multiplier and multiplicand, Booth encoding is used to encode the multipliers into M digits, and M partial products are produced for each pair, with each partial product in a smaller precision than a final product. The partial products resulting from the same encoded multiplier digit position are summed across all the multiplies to produce a summed partial product. In this manner, the partial product summation operations can be advantageously performed in the smaller precision. The M summed partial products are then summed together with an aggregated fixup vector for sign extension. If the N multipliers are equal to a constant, a preliminary fixup vector can be generated based on a predetermined value with adjustment on particular bits, where the predetermined value is determined by the signs of the encoded multiplier digits.
    Type: Grant
    Filed: January 27, 2021
    Date of Patent: November 14, 2023
    Assignee: Marvell Asia Pte, Ltd.
    Inventor: David Carlson
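    Illustration: the idea of summing partial products that share a Booth digit position before applying the positional shift can be sketched as follows. The sketch assumes radix-4 Booth encoding and signed 8-bit multipliers, neither of which is stated in the abstract, and the function names are hypothetical. Python integers have arbitrary precision, so the sign-extension fixup vector mentioned in the abstract is not modeled.

```python
def booth_radix4_digits(y: int, bits: int = 8) -> list[int]:
    """Radix-4 Booth digits (each in -2..2) of a signed integer.

    Assumes `bits` is even and -2**(bits-1) <= y < 2**(bits-1).
    """
    def bit(k: int) -> int:
        # Bit k of y in two's complement; the bit below position 0 is 0.
        return (y >> k) & 1 if k >= 0 else 0

    return [bit(2 * i) + bit(2 * i - 1) - 2 * bit(2 * i + 1)
            for i in range(bits // 2)]


def booth_dot(multipliers, multiplicands, bits: int = 8) -> int:
    """Dot product that sums like-magnitude partial products first."""
    digit_rows = [booth_radix4_digits(y, bits) for y in multipliers]
    total = 0
    for pos in range(bits // 2):
        # Sum the partial products of every pair at this digit position.
        summed = sum(d[pos] * x for d, x in zip(digit_rows, multiplicands))
        # Apply the weight 4**pos once, to the summed partial product.
        total += summed << (2 * pos)
    return total
```

    Because the shift is applied once per digit position rather than once per pair, the inner sums operate on narrow values, which is the property the abstract highlights.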
  • Patent number: 11809795
    Abstract: A method implements fixed-point polynomials in hardware logic. In an embodiment the method comprises distributing a defined error bound for the whole polynomial between operators in a data-flow graph for the polynomial and optimizing each operator to satisfy the part of the error bound allocated to that operator. The distribution of errors between operators is updated in an iterative process until a stop condition (such as a maximum number of iterations) is reached.
    Type: Grant
    Filed: May 18, 2021
    Date of Patent: November 7, 2023
    Assignee: Imagination Technologies Limited
    Inventor: Theo Alan Drane
  • Patent number: 11809515
    Abstract: Some embodiments provide an IC for implementing a machine-trained network with multiple layers. The IC includes a set of circuits to compute a dot product of (i) a first number of input values computed by other circuits of the IC and (ii) a set of predefined weight values, several of which are zero, with a weight value for each of the input values. The set of circuits includes (i) a dot product computation circuit to compute the dot product based on a second number of inputs and (ii) for each input value, at least two sets of wires for providing the input value to at least two of the dot product computation circuit inputs. The second number is less than the first number. Each input value with a corresponding weight value that is not equal to zero is provided to a different one of the dot product computation circuit inputs.
    Type: Grant
    Filed: May 10, 2021
    Date of Patent: November 7, 2023
    Assignee: PERCEIVE CORPORATION
    Inventors: Kenneth Duong, Jung Ko, Steven L. Teig
  • Patent number: 11809836
    Abstract: A system includes a fixed-point accumulator for storing numbers in an anchored fixed-point number format, a data interface arranged to receive a plurality of weight values and a plurality of data values represented in a floating-point number format, and logic circuitry. The logic circuitry is configured to: determine an anchor value indicative of a value of a lowest significant bit of the anchored fixed-point number format; convert at least a portion of the plurality of data values to the anchored fixed-point number format; perform MAC operations between the converted at least portion and respective weight values, using fixed-point arithmetic, to generate an accumulation value in the anchored fixed-point number format; and determine an output element of a layer of a neural network in dependence on the accumulation value.
    Type: Grant
    Filed: August 27, 2020
    Date of Patent: November 7, 2023
    Assignee: Arm Limited
    Inventors: Daren Croxford, Guy Larri
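    Illustration: an anchored fixed-point value can be modeled as an integer count of units of 2**anchor, where the anchor gives the value of the lowest significant bit. The sketch below converts floating-point data values to that form and accumulates with integer arithmetic. It assumes integer weights and round-to-nearest conversion, and the helper names are hypothetical. The abstract does not specify any of these details.

```python
import math


def to_anchored(value: float, anchor: int) -> int:
    """Convert a float to an integer count of 2**anchor units (rounded)."""
    return round(math.ldexp(value, -anchor))


def anchored_mac(data, weights, anchor: int) -> int:
    """Multiply-accumulate in integer arithmetic; result is in 2**anchor units.

    Assumes `weights` are already integers.
    """
    acc = 0
    for x, w in zip(data, weights):
        acc += to_anchored(x, anchor) * w
    return acc


def from_anchored(acc: int, anchor: int) -> float:
    """Convert an accumulator value back to floating point."""
    return math.ldexp(acc, anchor)
```

    For example, with `anchor = -2` the lowest significant bit is worth 0.25, so data values that are multiples of 0.25 are represented exactly and the integer accumulation introduces no rounding error.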
  • Patent number: 11797829
    Abstract: The product-sum operation device includes a product operator and a sum operator. The product operator includes a plurality of product operation elements, and an alternative element that, when any of the plurality of product operation elements has malfunctioned, is used instead of the malfunctioning product operation element. Each of the plurality of product operation elements and the alternative element is a resistance change element. The sum operator includes an output detector which detects a sum of outputs from the plurality of product operation elements when the alternative element is not used.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: October 24, 2023
    Assignee: TDK CORPORATION
    Inventors: Tatsuo Shibata, Tomoyuki Sasaki
  • Patent number: 11790217
    Abstract: An apparatus is described. The apparatus includes a long short term memory (LSTM) circuit having a multiply accumulate circuit (MAC). The MAC circuit has circuitry to rely on a stored product term rather than explicitly perform a multiplication operation to determine the product term if an accumulation of differences between consecutive, preceding input values has not reached a threshold.
    Type: Grant
    Filed: September 25, 2019
    Date of Patent: October 17, 2023
    Assignee: Intel Corporation
    Inventors: Ram Krishnamurthy, Gregory K. Chen, Raghavan Kumar, Phil Knag, Huseyin Ekin Sumbul
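    Illustration: the reuse rule in this abstract can be sketched as a small stateful multiplier. The sketch assumes that signed differences between consecutive inputs are accumulated, that the accumulation is compared by absolute value against the threshold, and that it is reset whenever the product is recomputed. The abstract fixes none of these choices, and the class name `ReusingMultiplier` is hypothetical.

```python
class ReusingMultiplier:
    """Returns a stored product until the input has drifted far enough."""

    def __init__(self, weight, threshold):
        self.weight = weight
        self.threshold = threshold
        self.stored_product = None
        self.prev_input = None
        self.accumulated_diff = 0
        self.multiplies = 0  # counts explicit multiplications performed

    def product(self, x):
        # Accumulate the difference between consecutive input values.
        if self.prev_input is not None:
            self.accumulated_diff += x - self.prev_input
        self.prev_input = x
        # Multiply only on the first call or once the drift reaches the threshold.
        if (self.stored_product is None
                or abs(self.accumulated_diff) >= self.threshold):
            self.stored_product = self.weight * x
            self.accumulated_diff = 0
            self.multiplies += 1
        return self.stored_product
```

    The returned product is approximate between recomputations; the threshold trades multiplier activity against that approximation error.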
  • Patent number: 11782498
    Abstract: An electronic device includes a coding module that determines whether a parameter of an artificial neural network is an outlier, depending on a value of the parameter and compresses the parameter by truncating a first bit of the parameter when the parameter is a non-outlier and truncating a second bit of the parameter when the parameter is the outlier, and a decoding module that decodes a compressed parameter.
    Type: Grant
    Filed: February 17, 2020
    Date of Patent: October 10, 2023
    Assignee: University-Industry Cooperation Group of Kyung Hee University
    Inventors: Ik Joon Chang, Ho Nguyen Dong, Minhson Le
  • Patent number: 11775257
    Abstract: Techniques for operating on and calculating binary floating-point numbers using an enhanced floating-point number format are presented. The enhanced format can comprise a single sign bit, six bits for the exponent, and nine bits for the fraction. Using six bits for the exponent can provide an enhanced exponent range that facilitates desirably fast convergence of computing-intensive algorithms and low error rates for computing-intensive applications. The enhanced format can employ a specified definition for the lowest binade that enables the lowest binade to be used for zero and normal numbers; and a specified definition for the highest binade that enables it to be structured to have one data point used for a merged Not-a-Number (NaN)/infinity symbol and remaining data points used for finite numbers. The signs of zero and merged NaN/infinity can be “don't care” terms. The enhanced format employs only one rounding mode, which is for rounding toward nearest up.
    Type: Grant
    Filed: April 6, 2020
    Date of Patent: October 3, 2023
    Assignee: International Business Machines Corporation
    Inventors: Silvia Melitta Mueller, Ankur Agrawal, Bruce Fleischer, Kailash Gopalakrishnan, Dongsoo Lee
  • Patent number: 11775258
    Abstract: The present invention extends to methods, systems, and computing system program products for elimination of rounding error accumulation in iterative calculations for Big Data or streamed data. Embodiments of the invention include iteratively calculating a function for a primary computation window of a pre-defined size while incrementally calculating the function for one or more backup computation windows started at different time points and whenever one of the backup computation windows reaches a size of the pre-defined size, swapping the primary computation window and the backup computation window. The result(s) of the function is/are generated by either the iterative calculation performed for the primary computation window or the incremental calculation performed for a backup computation window which reaches the pre-defined size.
    Type: Grant
    Filed: September 13, 2021
    Date of Patent: October 3, 2023
    Assignee: CLOUD & STREAM GEARS LLC
    Inventors: Jizhu Lu, Lihang Lu
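    Illustration: the primary and backup window scheme can be sketched for the simplest function, a moving sum. The primary window is updated iteratively (add the new element, subtract the expired one), which can accumulate rounding error over time. The backup window is built incrementally from additions only and replaces the primary once it reaches the window size. Using a single backup window that restarts immediately after each swap, and the function name `windowed_sums`, are assumptions for the example.

```python
def windowed_sums(stream, window):
    """Moving sums over `stream` (a list) with periodic refresh from a backup."""
    primary = 0.0      # iteratively updated sum of the current window
    backup = 0.0       # incrementally built sum, additions only
    backup_size = 0
    results = []
    for i, x in enumerate(stream):
        if i < window:
            primary += x                       # still filling the first window
        else:
            primary += x - stream[i - window]  # iterative update
        backup += x
        backup_size += 1
        if backup_size == window:
            # The backup now covers exactly the current window: swap it in
            # and start a fresh backup.
            primary = backup
            backup, backup_size = 0.0, 0
        if i >= window - 1:
            results.append(primary)
    return results
```

    Any rounding error carried by the iterative updates is discarded at each swap, so it cannot grow beyond what accumulates within one window length.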
  • Patent number: 11768664
    Abstract: A graphics processing unit (GPU) implements operations, with associated op codes, to perform mixed precision mathematical operations. The GPU includes an arithmetic logic unit (ALU) with different execution paths, wherein each execution path executes a different mixed precision operation. By implementing mixed precision operations at the ALU in response to designated op codes that delineate the operations, the GPU efficiently increases the precision of specified mathematical operations while reducing execution overhead.
    Type: Grant
    Filed: October 2, 2019
    Date of Patent: September 26, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Bin He, Michael Mantor, Jiasheng Chen
  • Patent number: 11768659
    Abstract: A processor or other device, such as a programmable and/or massively parallel processor or other device, includes processing elements designed to perform arithmetic operations (possibly but not necessarily including, for example, one or more of addition, multiplication, subtraction, and division) on numerical values of low precision but high dynamic range (“LPHDR arithmetic”). Such a processor or other device may, for example, be implemented on a single chip. Whether or not implemented on a single chip, the number of LPHDR arithmetic elements in the processor or other device in certain embodiments of the present invention significantly exceeds (e.g., by at least 20 more than three times) the number of arithmetic elements, if any, in the processor or other device which are designed to perform high dynamic range arithmetic of traditional precision (such as 32 bit or 64 bit floating point arithmetic).
    Type: Grant
    Filed: September 23, 2020
    Date of Patent: September 26, 2023
    Assignee: SINGULAR COMPUTING LLC
    Inventor: Joseph Bates
  • Patent number: 11768660
    Abstract: A processor or other device, such as a programmable and/or massively parallel processor or other device, includes processing elements designed to perform arithmetic operations (possibly but not necessarily including, for example, one or more of addition, multiplication, subtraction, and division) on numerical values of low precision but high dynamic range (“LPHDR arithmetic”). Such a processor or other device may, for example, be implemented on a single chip. Whether or not implemented on a single chip, the number of LPHDR arithmetic elements in the processor or other device in certain embodiments of the present invention significantly exceeds (e.g., by at least 20 more than three times) the number of arithmetic elements, if any, in the processor or other device which are designed to perform high dynamic range arithmetic of traditional precision (such as 32 bit or 64 bit floating point arithmetic).
    Type: Grant
    Filed: January 26, 2023
    Date of Patent: September 26, 2023
    Assignee: SINGULAR COMPUTING LLC
    Inventor: Joseph Bates
  • Patent number: 11769041
    Abstract: Systems, apparatuses, and methods for implementing a low latency long short-term memory (LSTM) machine learning engine using sequence interleaving techniques are disclosed. A computing system includes at least a host processing unit, a machine learning engine, and a memory. The host processing unit detects a plurality of sequences which will be processed by the machine learning engine. The host processing unit interleaves the sequences into data blocks and stores the data blocks in the memory. When the machine learning engine receives a given data block, the machine learning engine performs, in parallel, a plurality of matrix multiplication operations on the plurality of sequences in the given data block and a plurality of coefficients. Then, the outputs of the matrix multiplication operations are coupled to one or more LSTM layers.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: September 26, 2023
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Sateesh Lagudu, Lei Zhang, Allen H. Rush
  • Patent number: 11764757
    Abstract: This application relates to a signal filtering device. The device includes a memory and a processor. The processor may generate one or more matrices based on a size of a digital filter bank that generates an output signal by dividing an input signal into a plurality of channels and store in the memory each of the generated one or more matrices to which a plurality of digital filter bank coefficients or a plurality of input signals are assigned. The processor may also partially calculate the stored plurality of digital filter bank coefficients and the plurality of signals based on a number of at least some of the plurality of channels, and calculate the calculated digital filter bank coefficients and signals. The processor may further perform a discrete Fourier transform (DFT) on the calculated signal and compensate for a phase of the discrete Fourier transformed signal according to a preset reference.
    Type: Grant
    Filed: February 9, 2021
    Date of Patent: September 19, 2023
    Assignee: AGENCY FOR DEFENSE DEVELOPMENT
    Inventors: Yeonsoo Jang, Jintae Park, Beomjun Park, Insun Kim, Ghiback Kim
  • Patent number: 11763162
    Abstract: A dynamic gradient calibration method for a computing-in-memory neural network is performed to update a plurality of weights in a computing-in-memory circuit according to a plurality of inputs corresponding to a correct answer. A forward operating step includes performing a bit wise multiply-accumulate operation on a plurality of divided inputs and a plurality of divided weights to generate a plurality of multiply-accumulate values, and performing a clamping function on the multiply-accumulate values to generate a plurality of clamped multiply-accumulate values according to a predetermined upper bound value, and comparing the clamped multiply-accumulate values with the correct answer to generate a plurality of loss values. A backward operating step includes performing a partial differential operation on the loss values relative to the weights to generate a weight-based gradient. The weights are updated according to the weight-based gradient.
    Type: Grant
    Filed: June 16, 2020
    Date of Patent: September 19, 2023
    Assignee: NATIONAL TSING HUA UNIVERSITY
    Inventors: Meng-Fan Chang, Wei-Hsing Huang, Ta-Wei Liu