Patents Examined by Michael D Yaary
  • Patent number: 11907326
    Abstract: A system for determining the frequency coefficients of a one or multi-dimensional signal that is sparse in the frequency domain includes determining the locations of the non-zero frequency coefficients, and then determining values of the coefficients using the determined locations. If N is total number of frequency coefficients across the one or more dimension of the signal, and if R is an upper bound of the number of non-zero ones of these frequency coefficients, the systems requires up to (O(Rlog(R) (N))) samples and has a computation complexity of up to O(Rlog2(R) log (N). The system and the processing technique are stable to low-level noise and can exhibit only a small probability of failure. The frequency coefficients can be real and positive or they can be complex numbers.
    Type: Grant
    Filed: December 22, 2021
    Date of Patent: February 20, 2024
    Assignee: QUALCOMM Incorporated
    Inventor: Pierre-David Letourneau
  • Patent number: 11899746
    Abstract: The present disclosure relates generally to techniques for efficiently performing operations associated with artificial intelligence (AI), machine learning (ML), and/or deep learning (DL) applications, such as training and/or interference calculations, using an integrated circuit device. More specifically, the present disclosure relates to an integrated circuit design implemented to perform these operations with low latency and/or a high bandwidth of data. For example, embodiments of a computationally dense digital signal processing (DSP) circuitry, implemented to efficiently perform one or more arithmetic operations (e.g., a dot-product) on an input are disclosed. Moreover, embodiments described herein may relate to layout, design, and data scheduling of a processing element array implemented to compute matrix multiplications (e.g., systolic array multiplication).
    Type: Grant
    Filed: December 23, 2021
    Date of Patent: February 13, 2024
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Andrei-Mihai Hagiescu-Miriste
  • Patent number: 11900242
    Abstract: Provided are an integrated circuit chip apparatus and a related product, the integrated circuit chip apparatus being used for executing a multiplication operation, a convolution operation or a training operation of a neural network. The present technical solution has the advantages of a small amount of calculation and low power consumption.
    Type: Grant
    Filed: March 7, 2022
    Date of Patent: February 13, 2024
    Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
    Inventors: Shaoli Liu, Xinkai Song, Bingrui Wang, Yao Zhang, Shuai Hu
  • Patent number: 11900241
    Abstract: Provided are an integrated circuit chip apparatus and a related product, the integrated circuit chip apparatus being used for executing a multiplication operation, a convolution operation or a training operation of a neural network. The present technical solution has the advantages of a small amount of calculation and low power consumption.
    Type: Grant
    Filed: March 7, 2022
    Date of Patent: February 13, 2024
    Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
    Inventors: Shaoli Liu, Xinkai Song, Bingrui Wang, Yao Zhang, Shuai Hu
  • Patent number: 11893470
    Abstract: Techniques for neural network processing using specialized data representation are disclosed. Input data for manipulation in a layer of a neural network is obtained. The input data includes image data, where the image data is represented in bfloat16 format without loss of precision. The manipulation of the input data is performed on a processor that supports single-precision operations. The input data is converted to a 16-bit reduced floating-point representation, where the reduced floating-point representation comprises an alternative single-precision data representation mode. The input data is manipulated with one or more 16-bit reduced floating-point data elements. The manipulation includes a multiply and add-accumulate operation. The manipulation further includes a unary operation, a binary operation, or a conversion operation. A result of the manipulating is forwarded to a next layer of the neural network.
    Type: Grant
    Filed: December 5, 2019
    Date of Patent: February 6, 2024
    Assignee: MIPS Tech, LLC
    Inventor: Sanjay Patel
  • Patent number: 11893388
    Abstract: An integrated circuit including a plurality of processing components to process image data of a plurality of image frames, wherein each image frame includes a plurality of stages. Each processing component includes a plurality of execution pipelines, wherein each pipeline includes a plurality of multiplier-accumulator circuits configurable to perform multiply and accumulate operations using image data and filter weights, wherein: (i) a first processing component is configured to process all of the data associated with a first plurality of stages of each image frame, and (ii) a second processing component of the plurality of processing components is configured to process all of the data associated with a second plurality of stages of each image frame. The first and second processing component processes data associated with the first and second plurality of stages, respectively, of a first image frame concurrently.
    Type: Grant
    Filed: April 13, 2022
    Date of Patent: February 6, 2024
    Assignee: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang, Valentin Ossman
  • Patent number: 11886835
    Abstract: A multiplier-accumulator accepts A and B digital inputs and generates a dot product P by applying the bits of the A input and the bits of the B inputs to unit elements comprised of groups of AND gates coupled to charge transfer lines through a capacitor Cu. The number of bits in the B input is a number of AND-groups and the number of bits in A is the number of AND gates in an AND-group. Each unit element receives one bit of the B input applied to all of the AND gates of the unit element, and each unit element having the bits of A applied to each associated AND gate input of each unit element. The AND gates are coupled to charge transfer lines through a capacitor Cu, and the charge transfer lines couple to binary weighted charge summing capacitors which sum and scale the charges from the charge transfer lines, the charge coupled to an analog to digital converter which forms the dot product output. The charge transfer lines may span multiple unit elements.
    Type: Grant
    Filed: December 31, 2020
    Date of Patent: January 30, 2024
    Assignee: Ceremorphic, Inc.
    Inventors: Ryan Boesch, Martin Kraemer, Wei Xiong
  • Patent number: 11861323
    Abstract: Hardware logic arranged to normalise (or renormalise) an n-bit input number is described in which at least a proportion of a left shifting operation is performed in parallel with a leading zero count operation. In various embodiments the left shifting and the leading zero count are performed independently. In various other embodiments, a subset of the bits output by a leading zero counter are input to a left shifter and the output from the left shifter is input to a renormalisation block which completes the remainder of the left shifting operation independently of any further input from the leading zero counter.
    Type: Grant
    Filed: March 15, 2021
    Date of Patent: January 2, 2024
    Assignee: Imagination Technologies Limited
    Inventor: Theo Alan Drane
  • Patent number: 11861327
    Abstract: A processor for fine-grain sparse integer and floating-point operations and method of operation thereof are provided. In some embodiments, the method includes forming a first set of products and forming a second set of products. The forming of the first set of products may include: multiplying, in a first multiplier, a first activation value by a least significant sub-word and a most significant sub-word of a first weight to form a first partial product and a second partial product; and adding the first partial product and the second partial product. The forming of the second set of products may include: multiplying, in the first multiplier, a second activation value by a first sub-word and a second sub-word of a mantissa to form a third partial product and a fourth partial product; and adding the third partial product and the fourth partial product.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: January 2, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ali Shafiee Ardestani, Joseph Hassoun
  • Patent number: 11861328
    Abstract: A processor for fine-grain sparse integer and floating-point operations and method of operation thereof are provided. In some embodiments, the method includes forming a first set of products, and forming a second set of products. The forming of the first set of products may include: multiplying, in a first multiplier, a second multiplier, and a third multiplier, the first activation value by a first least significant sub-word, a second least significant sub-word, and a most significant sub-word; and adding a first resulting partial product and a second resulting partial product. The forming of the second set of products may include forming a first floating point product, the forming of the first floating point product including multiplying, in the first multiplier, a first sub-word of a mantissa of an activation value by a first sub-word of a mantissa of a weight, to form a third partial product.
    Type: Grant
    Filed: December 23, 2020
    Date of Patent: January 2, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ali Shafiee Ardestani, Joseph H. Hassoun
  • Patent number: 11853715
    Abstract: A system comprises a floating-point computation unit configured to perform a dot-product operation in accordance with a first floating-point value and a second floating-point value, and detection logic operatively coupled to the floating-point computation unit. The detection logic is configured to compute a difference between fixed-point summations of exponent parts of the first floating-point value and the second floating-point value and, based on the computed difference, detect the presence of a condition prior to completion of the dot-product operation by the floating-point computation unit. In response to detection of the presence of the condition, the detection logic is further configured to cause the floating-point computation unit to avoid performing a subset of computations otherwise performed as part of the dot-product operation.
    Type: Grant
    Filed: November 23, 2020
    Date of Patent: December 26, 2023
    Assignee: International Business Machines Corporation
    Inventors: Mingu Kang, Seonghoon Woo, Eun Kyung Lee
  • Patent number: 11842166
    Abstract: A processor or other device, such as a programmable and/or massively parallel processor or other device, includes processing elements designed to perform arithmetic operations (possibly but not necessarily including, for example, one or more of addition, multiplication, subtraction, and division) on numerical values of low precision but high dynamic range (“LPHDR arithmetic”). Such a processor or other device may, for example, be implemented on a single chip. Whether or not implemented on a single chip, the number of LPHDR arithmetic elements in the processor or other device in certain embodiments of the present invention significantly exceeds (e.g., by at least 20 more than three times) the number of arithmetic elements, if any, in the processor or other device which are designed to perform high dynamic range arithmetic of traditional precision (such as 32 bit or 64 bit floating point arithmetic).
    Type: Grant
    Filed: December 2, 2022
    Date of Patent: December 12, 2023
    Assignee: Singular Computing LLC
    Inventor: Joseph Bates
  • Patent number: 11836214
    Abstract: A matrix calculation device including a storing unit, a multiply accumulate (MAC) circuit, a pre-fetch circuit, and a control circuit, and an operation method thereof are provided. The storing unit stores a first and second matrixes. The MAC circuit is configured to execute MAC calculation. The pre-fetch circuit pre-fetches at least one column of the first matrix from the storing unit to act as pre-fetch data, pre-fetches at least one row of the second matrix from the storing unit to act as the pre-fetch data, or pre-fetches at least one column of the first matrix and at least one row of the second matrix from the storing unit to act as the pre-fetch data. The control circuit decides whether to perform the MAC calculation on a current column of the first matrix and a current row of the second matrix through the MAC circuit according to the pre-fetch data.
    Type: Grant
    Filed: September 28, 2020
    Date of Patent: December 5, 2023
    Assignee: NEUCHIPS CORPORATION
    Inventors: Chiung-Liang Lin, Chao-Yang Kao
  • Patent number: 11829322
    Abstract: A vector memory subsystem for use with a programmable mix-radix vector processor (“PVP”) capable of calculating discrete Fourier transform (“DFT/IDFT”) values. In an exemplary embodiment, an apparatus includes a vector memory bank and a vector memory system (VMS) that generates input memory addresses that are used to store input data into the vector memory bank. The VMS also generates output memory addresses that are used to unload vector data from the memory banks. The input memory addresses are used to shuffle the input data in the memory bank based on a radix factorization associated with an N-point DFT, and the output memory addresses are used to unload the vector data from the memory bank to compute radix factors of the radix factorization.
    Type: Grant
    Filed: December 16, 2020
    Date of Patent: November 28, 2023
    Assignee: Marvell Asia Pte, Ltd.
    Inventors: Yuanbin Guo, Hong Jik Kim
  • Patent number: 11816448
    Abstract: An ALU is capable of generating a multiply accumulation by compressing like-magnitude partial products. Given N pairs of multiplier and multiplicand, Booth encoding is used to encode the multipliers into M digits, and M partial products are produced for each pair of with each partial product in a smaller precision than a final product. The partial products resulting from the same encoded multiplier digit position, are summed across all the multiplies to produce a summed partial product. In this manner, the partial product summation operations can be advantageously performed in the smaller precision. The M summed partial products are then summed together with an aggregated fixup vector for sign extension. If the N multipliers equal to a constant, a preliminary fixup vector can be generated based on a predetermined value with adjustment on particular bits, where the predetermined value is determined by the signs of the encoded multiplier digits.
    Type: Grant
    Filed: January 27, 2021
    Date of Patent: November 14, 2023
    Assignee: Marvell Asia Pte, Ltd.
    Inventor: David Carlson
  • Patent number: 11809515
    Abstract: Some embodiments provide an IC for implementing a machine-trained network with multiple layers. The IC includes a set of circuits to compute a dot product of (i) a first number of input values computed by other circuits of the IC and (ii) a set of predefined weight values, several of which are zero, with a weight value for each of the input values. The set of circuits includes (i) a dot product computation circuit to compute the dot product based on a second number of inputs and (ii) for each input value, at least two sets of wires for providing the input value to at least two of the dot product computation circuit inputs. The second number is less than the first number. Each input value with a corresponding weight value that is not equal to zero is provided to a different one of the dot product computation circuit inputs.
    Type: Grant
    Filed: May 10, 2021
    Date of Patent: November 7, 2023
    Assignee: PERCEIVE CORPORATION
    Inventors: Kenneth Duong, Jung Ko, Steven L. Teig
  • Patent number: 11809836
    Abstract: A system includes a fixed-point accumulator for storing numbers in an anchored fixed-point number format, a data interface arranged to receive a plurality of weight values and a plurality of data values represented in a floating-point number format, and logic circuitry. The logic circuitry is configured to: determine an anchor value indicative of a value of a lowest significant bit of the anchored fixed-point number format; convert at least a portion of the plurality of data values to the anchored fixed-point number format; perform MAC operations between the converted at least portion and respective weight values, using fixed-point arithmetic, to generate an accumulation value in the anchored fixed-point number format; and determine an output element of a later of a neural network in dependence on the accumulation value.
    Type: Grant
    Filed: August 27, 2020
    Date of Patent: November 7, 2023
    Assignee: Arm Limited
    Inventors: Daren Croxford, Guy Larri
  • Patent number: 11809795
    Abstract: A method implements fixed-point polynomials in hardware logic. In an embodiment the method comprises distributing a defined error bound for the whole polynomial between operators in a data-flow graph for the polynomial and optimizing each operator to satisfy the part of the error bound allocated to that operator. The distribution of errors between operators is updated in an iterative process until a stop condition (such as a maximum number of iterations) is reached.
    Type: Grant
    Filed: May 18, 2021
    Date of Patent: November 7, 2023
    Assignee: Imagination Technologies Limited
    Inventor: Theo Alan Drane
  • Patent number: 11797829
    Abstract: The product-sum operation device includes a product operator and a sum operator. The product operator includes a plurality of product operation elements, and an alternative element that, when any of the plurality of product operation elements has malfunctioned, is used instead of the malfunctioning product operation element. Each of the plurality of product operation elements and the alternative element is a resistance change element. The sum operator includes an output detector which detects a sum of outputs from the plurality of product operation elements when the alternative element is not used.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: October 24, 2023
    Assignee: TDK CORPORATION
    Inventors: Tatsuo Shibata, Tomoyuki Sasaki
  • Patent number: 11790217
    Abstract: An apparatus is described. The apparatus includes a long short term memory (LSTM) circuit having a multiply accumulate circuit (MAC). The MAC circuit has circuitry to rely on a stored product term rather than explicitly perform a multiplication operation to determine the product term if an accumulation of differences between consecutive, preceding input values has not reached a threshold.
    Type: Grant
    Filed: September 25, 2019
    Date of Patent: October 17, 2023
    Assignee: Intel Corporation
    Inventors: Ram Krishnamurthy, Gregory K. Chen, Raghavan Kumar, Phil Knag, Huseyin Ekin Sumbul