Patents Examined by Michael D Yaary
  • Patent number: 11966715
    Abstract: A three-dimensional processor (3D-processor) for parallel computing includes a plurality of computing elements. Each computing element comprises at least a three-dimensional memory (3D-M) array for storing at least a portion of a look-up table (LUT) for a mathematical function and an arithmetic logic circuit (ALC) for performing arithmetic operations on the LUT data. The deficiency in latency is offset by a large degree of parallelism.
    Type: Grant
    Filed: July 26, 2020
    Date of Patent: April 23, 2024
    Assignee: HangZhou HaiCun Information Technology Co., Ltd.
    Inventors: Guobiao Zhang, Chen Shen
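    Illustrative sketch (not from the patent): a coarse look-up table holds samples of a mathematical function, and a small arithmetic correction refines the looked-up value, mirroring how an ALC post-processes 3D-M LUT data. The table granularity and the first-order correction are assumptions.
      import numpy as np

      # Coarse look-up table for exp(x) on [0, 1): one entry per 1/256 step.
      STEP = 1.0 / 256
      LUT = np.exp(np.arange(0.0, 1.0, STEP))          # stored "in memory"

      def exp_lut(x):
          """Approximate exp(x) for x in [0, 1): table read + arithmetic refinement."""
          idx = int(x / STEP)                          # LUT address
          base = LUT[idx]                              # memory read
          dx = x - idx * STEP                          # residual handled by the ALC
          return base * (1.0 + dx)                     # first-order correction

      xs = np.random.rand(8)
      print(np.max(np.abs([exp_lut(x) for x in xs] - np.exp(xs))))  # small error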
  • Patent number: 11960566
    Abstract: Systems and methods are provided to eliminate multiplication operations with zero padding data for convolution computations. A multiplication matrix is generated from an input feature map matrix with padding by adjusting coordinates and dimensions of the input feature map matrix to exclude padding data. The multiplication matrix is used to perform matrix multiplications with respective weight values which results in fewer computations as compared to matrix multiplications which include the zero padding data.
    Type: Grant
    Filed: April 13, 2021
    Date of Patent: April 16, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Dana Michelle Vantrease, Ron Diamant
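    Illustrative sketch (names and layout are assumptions, not Amazon's implementation): a 'same' convolution whose kernel window is clipped to the valid coordinates of the input feature map, so no multiplication ever touches zero padding data.
      import numpy as np

      def conv2d_same_no_pad_mults(x, w):
          """'Same' convolution that never multiplies against zero padding:
          the kernel window is clipped to the valid part of x instead."""
          H, W = x.shape
          kh, kw = w.shape
          ph, pw = kh // 2, kw // 2
          y = np.zeros_like(x, dtype=float)
          for i in range(H):
              for j in range(W):
                  # Clip the window to in-bounds coordinates (padding region excluded).
                  r0, r1 = max(i - ph, 0), min(i + ph + 1, H)
                  c0, c1 = max(j - pw, 0), min(j + pw + 1, W)
                  patch = x[r0:r1, c0:c1]
                  ker = w[r0 - (i - ph): kh - ((i + ph + 1) - r1),
                          c0 - (j - pw): kw - ((j + pw + 1) - c1)]
                  y[i, j] = np.sum(patch * ker)        # fewer multiplies at the border
          return y

      x = np.arange(16, dtype=float).reshape(4, 4)
      w = np.ones((3, 3))
      print(conv2d_same_no_pad_mults(x, w))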
  • Patent number: 11960886
    Abstract: An integrated circuit including a plurality of processing components to process image data of a plurality of image frames, wherein each image frame includes a plurality of stages. Each processing component includes a plurality of execution pipelines, wherein each pipeline includes a plurality of multiplier-accumulator circuits configurable to perform multiply and accumulate operations using image data and filter weights, wherein: (i) a first processing component is configurable to process all of the data associated with a first plurality of stages of each image frame, and (ii) a second processing component of the plurality of processing components is configurable to process all of the data associated with a second plurality of stages of each image frame. The first and second processing components process data associated with the first and second plurality of stages, respectively, of a first image frame concurrently.
    Type: Grant
    Filed: April 25, 2022
    Date of Patent: April 16, 2024
    Assignee: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang, Valentin Ossman
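    Illustrative sketch (a loose software analogy, not the circuit): two 'processing components' each own a fixed group of a frame's stages and work on the same frame concurrently; the stage split, data shapes, and thread pool are assumptions.
      import numpy as np
      from concurrent.futures import ThreadPoolExecutor

      STAGES = 8                       # stages (e.g. layers/tiles) per image frame

      def process_stages(frame, stage_ids, weights):
          """One 'processing component': multiply-accumulate over its own stages."""
          return {s: float(np.dot(frame[s], weights[s])) for s in stage_ids}

      frame = np.random.rand(STAGES, 16)        # one image frame, split into stages
      weights = np.random.rand(STAGES, 16)      # filter weights per stage

      first_half, second_half = range(0, 4), range(4, 8)
      with ThreadPoolExecutor(max_workers=2) as pool:
          # Two components work on their own stage groups of the same frame concurrently.
          f1 = pool.submit(process_stages, frame, first_half, weights)
          f2 = pool.submit(process_stages, frame, second_half, weights)
          results = {**f1.result(), **f2.result()}
      print(results)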
  • Patent number: 11960567
    Abstract: A method for performing a fundamental computational primitive in a device is provided, where the device includes a processor and a matrix multiplication accelerator (MMA). The method includes configuring a streaming engine in the device to stream data for the fundamental computational primitive from memory, configuring the MMA to format the data, and executing the fundamental computational primitive by the device.
    Type: Grant
    Filed: July 4, 2021
    Date of Patent: April 16, 2024
    Assignee: Texas Instruments Incorporated
    Inventors: Arthur John Redfern, Timothy David Anderson, Kai Chirca, Chenchi Luo, Zhenhua Yu
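    Illustrative sketch (function names are hypothetical, not TI's API): a fundamental primitive, here a 1-D convolution, is executed by streaming formatted windows of the input to a matrix multiply that stands in for the MMA.
      import numpy as np

      def stream_and_format(signal, k):
          """Stand-in for the streaming engine: yield k-wide windows of the input,
          formatted as rows of a matrix the 'MMA' can multiply."""
          for i in range(len(signal) - k + 1):
              yield signal[i:i + k]

      def conv1d_via_matmul(signal, kernel):
          rows = np.stack(list(stream_and_format(signal, len(kernel))))
          return rows @ kernel                       # matrix-vector product on the MMA

      sig = np.arange(10, dtype=float)
      ker = np.array([1.0, 0.0, -1.0])
      print(conv1d_via_matmul(sig, ker))
      print(np.convolve(sig, ker[::-1], mode='valid'))   # reference (correlation identity)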
  • Patent number: 11954457
    Abstract: An arithmetic device includes a function storage circuit and an activation function (AF) circuit. The function storage circuit stores and outputs a function selection signal, a first function information signal, and a second function information signal. The AF circuit generates an activation function result data by applying a slope value and a maximum value to a multiplication/accumulation (MAC) result data in a function setting mode that is activated by the function selection signal. The slope value is set based on the first function information signal, and the maximum value is set based on the second function information signal.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: April 9, 2024
    Assignee: SK hynix Inc.
    Inventor: Choung Ki Song
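    Illustrative sketch, assuming the slope and maximum values parameterize a clipped linear activation applied to the MAC result; this is one plausible reading, not the patent's exact function definition.
      import numpy as np

      def activation(mac_result, slope, max_value):
          """Clipped linear activation: scale the MAC result by a programmable slope,
          then saturate at a programmable maximum (assumed interpretation)."""
          return np.clip(slope * mac_result, 0.0, max_value)

      mac = np.array([-3.0, 0.5, 2.0, 10.0])             # multiply/accumulate results
      print(activation(mac, slope=0.5, max_value=4.0))   # -> [0. 0.25 1. 4.]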
  • Patent number: 11941407
    Abstract: A unit for accumulating a plurality N of multiplied M bit values includes a receiving unit, a bit-wise multiplier and a bit-wise accumulator. The receiving unit receives a pipeline of multiplicands A and B such that, at each cycle, a new set of multiplicands is received. The bit-wise multiplier bit-wise multiplies bits of a current multiplicand A with bits of a current multiplicand B and sums and carries between bit-wise multipliers. The bit-wise accumulator accumulates the output of the bit-wise multiplier, thereby accumulating the products of the multiplicands during the pipelining process.
    Type: Grant
    Filed: April 5, 2020
    Date of Patent: March 26, 2024
    Assignee: GSI Technology Inc.
    Inventor: Avidan Akerib
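    Illustrative sketch (bit widths and loop structure are assumptions): each product is formed from bit-wise partial products of A and B, and the products of a pipeline of operand pairs are accumulated.
      def bitwise_multiply(a, b, m=8):
          """Multiply two m-bit values from their bit-wise partial products."""
          total = 0
          for i in range(m):
              for j in range(m):
                  bit = ((a >> i) & 1) & ((b >> j) & 1)   # one bit-wise multiplier cell
                  total += bit << (i + j)                 # weighted sum with carries
          return total

      def accumulate(pairs, m=8):
          """Accumulate a pipeline of (A, B) multiplicand pairs, one per 'cycle'."""
          acc = 0
          for a, b in pairs:
              acc += bitwise_multiply(a, b, m)
          return acc

      pairs = [(3, 5), (12, 7), (255, 2)]
      print(accumulate(pairs), sum(a * b for a, b in pairs))   # both 609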
  • Patent number: 11934824
    Abstract: Methods, apparatuses, and systems for in- or near-memory processing are described. Strings of bits (e.g., vectors) may be fetched and processed in logic of a memory device without involving a separate processing unit. Operations (e.g., arithmetic operations) may be performed on numbers stored in a bit-parallel way during a single sequence of clock cycles. Arithmetic may thus be performed in a single pass as bits of two or more strings of bits are fetched, without intermediate storage of the numbers. Vectors may be fetched (e.g., identified, transmitted, received) from one or more bit lines. Registers of a memory array may be used to write (e.g., store or temporarily store) results or ancillary bits (e.g., carry bits or carry flags) that facilitate arithmetic operations. Circuitry near, adjacent, or under the memory array may employ XOR or AND (or other) logic to fetch, organize, or operate on the data.
    Type: Grant
    Filed: April 6, 2020
    Date of Patent: March 19, 2024
    Assignee: Micron Technology, Inc.
    Inventors: Dmitri Yudanov, Sean S. Eilert, Sivagnanam Parthasarathy, Shivasankar Gunasekaran, Ameen D. Akel
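    Illustrative sketch (a software model, not Micron's circuitry): two numbers held as little-endian bit strings are added one bit position per cycle using only XOR/AND logic and a carry flag, with no intermediate integer storage.
      def add_bit_serial(a_bits, b_bits):
          """Ripple add two little-endian bit lists using XOR/AND logic and a carry flag,
          one bit position per cycle."""
          carry, out = 0, []
          for a, b in zip(a_bits, b_bits):
              s = a ^ b ^ carry                       # XOR logic for the sum bit
              carry = (a & b) | (carry & (a ^ b))     # AND/OR logic for the carry flag
              out.append(s)
          out.append(carry)
          return out

      def to_bits(x, n):
          return [(x >> i) & 1 for i in range(n)]

      def from_bits(bits):
          return sum(b << i for i, b in enumerate(bits))

      a, b = 173, 91
      print(from_bits(add_bit_serial(to_bits(a, 8), to_bits(b, 8))))   # 264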
  • Patent number: 11934798
    Abstract: The present disclosure is directed to systems and methods for a memory device such as, for example, a Processing-In-Memory Device that is configured to perform multiplication operations in memory using a popcount operation. A multiplication operation may include a summation of multipliers being multiplied with corresponding multiplicands. The inputs may be arranged in particular configurations within a memory array. Sense amplifiers may be used to perform the popcount by counting active bits along bit lines. One or more registers may accumulate results for performing the multiplication operations.
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: March 19, 2024
    Assignee: Micron Technology, Inc.
    Inventor: Dmitri Yudanov
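    Illustrative sketch (operand widths and data layout are assumptions): a sum of element-wise products is computed from popcounts; for each pair of bit positions, the number of operand pairs with both bits set is counted, as along a bit line, and the count is weighted by the corresponding power of two.
      import numpy as np

      def mac_by_popcount(multipliers, multiplicands, m=8):
          """Sum of element-wise products via popcounts: for bit positions (i, j),
          count how many operand pairs have both bits set, then weight by 2**(i+j)."""
          a = np.array(multipliers,  dtype=np.uint32)
          b = np.array(multiplicands, dtype=np.uint32)
          total = 0
          for i in range(m):
              for j in range(m):
                  active = ((a >> i) & 1) & ((b >> j) & 1)   # AND along the 'bit line'
                  total += int(active.sum()) << (i + j)      # popcount, then scale
          return total

      A, B = [3, 12, 255], [5, 7, 2]
      print(mac_by_popcount(A, B), sum(x * y for x, y in zip(A, B)))   # both 609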
  • Patent number: 11934479
    Abstract: A method for performing sparse quantum Fourier transform computation includes defining a set of quantum circuits, each quantum circuit comprising a Hadamard gate and a single frequency rotation operator, said set of quantum circuits being equivalent to a quantum Fourier transform circuit. The method includes constructing a subset of said quantum circuits in a quantum processor, said quantum processor having a quantum representation of a classical distribution loaded into a quantum state of said quantum processor. The method includes executing said subset of said quantum circuits on said quantum state, and performing a measurement in a frequency basis to obtain a frequency distribution corresponding to said quantum state.
    Type: Grant
    Filed: October 7, 2020
    Date of Patent: March 19, 2024
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Tal Kachman, Mark S. Squillante, Lior Horesh, Kenneth Lee Clarkson, John A. Gunnels, Ismail Yunus Akhalwaya, Jayram Thathachar
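    Illustrative sketch: the code below does not reproduce the patented per-frequency circuits (each a Hadamard gate plus one rotation); it only models the end-to-end behavior they target, applying a QFT (here a dense transform via numpy) to a state that is sparse in frequency and reading the frequency distribution from the measurement probabilities.
      import numpy as np

      n = 3                                   # qubits; state vector of length 2**n
      N = 2 ** n

      # A state that is sparse in the frequency domain: two frequency components.
      freq = np.zeros(N, dtype=complex)
      freq[1], freq[5] = 1.0, 1.0
      state = np.fft.ifft(freq, norm="ortho")        # load into the 'quantum state'

      # Dense stand-in for the QFT circuit (the patent builds it from Hadamard +
      # single-rotation sub-circuits; here we just apply the full transform).
      qft_state = np.fft.fft(state, norm="ortho")

      probs = np.abs(qft_state) ** 2                 # frequency-basis measurement
      print(np.round(probs, 3))                      # mass concentrated on 1 and 5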
  • Patent number: 11924338
    Abstract: A computing system may implement a split random number generator that may use a random number generator to generate and store seed values in a memory for retrieval and use by one or more core processors to generate random numbers for secure processes within each core processor.
    Type: Grant
    Filed: November 4, 2020
    Date of Patent: March 5, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: David A. Kaplan, Paul Moyer
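    Illustrative sketch (names and the seed-pool layout are assumptions, not AMD's design): a central generator fills a pool of seed values in memory, and each core retrieves its own seed to drive an independent per-core generator; a toy PRNG stands in for a secure one.
      import secrets
      import random

      # Central generator fills a pool of seed values "in memory".
      NUM_CORES = 4
      seed_pool = [secrets.randbits(64) for _ in range(NUM_CORES)]

      def per_core_rng(core_id):
          """Each core retrieves its stored seed and generates its own random stream.
          (random.Random is a toy stand-in for a secure per-core generator.)"""
          return random.Random(seed_pool[core_id])

      streams = [per_core_rng(c) for c in range(NUM_CORES)]
      print([rng.getrandbits(32) for rng in streams])   # independent per-core values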
  • Patent number: 11922131
    Abstract: A method for performing vector-matrix multiplication may include converting a digital input vector comprising a plurality of binary-encoded values into a plurality of analog signals using a plurality of one-bit digital to analog converters (DACs); sequentially performing, using an analog vector matrix multiplier and based on bit-order, vector-matrix multiplication operations using a weighting matrix for the plurality of analog signals to generate analog outputs of the analog vector matrix multiplier; sequentially performing an analog-to-digital (ADC) operation on the analog outputs of the analog vector matrix multiplier to generate binary partial output vectors; and combining the binary partial output vectors to generate a result of the vector-matrix multiplication.
    Type: Grant
    Filed: November 7, 2020
    Date of Patent: March 5, 2024
    Assignee: Applied Materials, Inc.
    Inventors: Xiaofeng Zhang, She-Hwa Yen
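    Illustrative sketch (a digital model with ADC quantization omitted): the input vector is decomposed into one-bit planes, each plane undergoes a vector-matrix multiply standing in for the analog crossbar, and the binary partial outputs are combined by bit order with a shift-and-add.
      import numpy as np

      def vmm_bit_serial(x, W, bits=8):
          """Vector-matrix multiply performed one input bit-plane at a time:
          each plane is 0/1 (one-bit DAC), multiplied by W (analog crossbar),
          'read out' (ADC), then the partial outputs are combined by bit order."""
          x = np.asarray(x, dtype=np.int64)
          result = np.zeros(W.shape[1], dtype=np.int64)
          for b in range(bits):
              plane = (x >> b) & 1                  # one-bit analog drive values
              partial = plane @ W                   # analog vector-matrix multiply
              result += partial << b                # shift-and-add of binary partials
          return result

      x = np.array([3, 200, 17, 90])
      W = np.arange(12).reshape(4, 3)
      print(vmm_bit_serial(x, W), x @ W)            # identical results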
  • Patent number: 11907326
    Abstract: A system for determining the frequency coefficients of a one- or multi-dimensional signal that is sparse in the frequency domain includes determining the locations of the non-zero frequency coefficients, and then determining values of the coefficients using the determined locations. If N is the total number of frequency coefficients across the one or more dimensions of the signal, and if R is an upper bound on the number of non-zero frequency coefficients, the system requires up to O(R log(R) log(N)) samples and has a computation complexity of up to O(R log^2(R) log(N)). The system and the processing technique are stable to low-level noise and can exhibit only a small probability of failure. The frequency coefficients can be real and positive or they can be complex numbers.
    Type: Grant
    Filed: December 22, 2021
    Date of Patent: February 20, 2024
    Assignee: QUALCOMM Incorporated
    Inventor: Pierre-David Letourneau
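    Illustrative sketch of the two-step structure, with the location step replaced by an assumed-known support: once the non-zero frequency locations are fixed, the coefficient values follow from a small linear solve against a handful of time-domain samples. Sizes and the random sampling are assumptions.
      import numpy as np

      N, support = 64, [3, 17, 40]                 # R = 3 non-zero frequencies
      rng = np.random.default_rng(0)
      true_vals = rng.normal(size=len(support)) + 1j * rng.normal(size=len(support))

      # A few time-domain samples of the sparse signal.
      t = rng.choice(N, size=12, replace=False)
      A = np.exp(2j * np.pi * np.outer(t, support) / N)    # sampled Fourier sub-matrix
      samples = A @ true_vals

      # Step two of the scheme: with the support located, the coefficient values
      # follow from a small (well-conditioned) linear solve.
      est_vals, *_ = np.linalg.lstsq(A, samples, rcond=None)
      print(np.allclose(est_vals, true_vals))      # True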
  • Patent number: 11900242
    Abstract: Provided are an integrated circuit chip apparatus and a related product, the integrated circuit chip apparatus being used for executing a multiplication operation, a convolution operation or a training operation of a neural network. The present technical solution has the advantages of a reduced amount of computation and low power consumption.
    Type: Grant
    Filed: March 7, 2022
    Date of Patent: February 13, 2024
    Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
    Inventors: Shaoli Liu, Xinkai Song, Bingrui Wang, Yao Zhang, Shuai Hu
  • Patent number: 11899746
    Abstract: The present disclosure relates generally to techniques for efficiently performing operations associated with artificial intelligence (AI), machine learning (ML), and/or deep learning (DL) applications, such as training and/or inference calculations, using an integrated circuit device. More specifically, the present disclosure relates to an integrated circuit design implemented to perform these operations with low latency and/or a high bandwidth of data. For example, embodiments of a computationally dense digital signal processing (DSP) circuitry, implemented to efficiently perform one or more arithmetic operations (e.g., a dot-product) on an input are disclosed. Moreover, embodiments described herein may relate to layout, design, and data scheduling of a processing element array implemented to compute matrix multiplications (e.g., systolic array multiplication).
    Type: Grant
    Filed: December 23, 2021
    Date of Patent: February 13, 2024
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Andrei-Mihai Hagiescu-Miriste
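    Illustrative sketch (array geometry and scheduling are assumptions): a cycle-by-cycle simulation of an output-stationary systolic array in which skewed operand streams flow right and down while each processing element performs one multiply-accumulate per cycle.
      import numpy as np

      def systolic_matmul(A, B):
          """Cycle-by-cycle simulation of an output-stationary systolic array:
          A streams in from the left (skewed by row), B from the top (skewed by
          column); each PE multiply-accumulates the pair passing through it."""
          n, k = A.shape
          k2, m = B.shape
          assert k == k2
          C = np.zeros((n, m))
          a_reg = np.zeros((n, m))          # value of A currently held by each PE
          b_reg = np.zeros((n, m))          # value of B currently held by each PE
          for t in range(k + n + m - 2):
              # Shift right/down from the previous cycle.
              a_reg[:, 1:], b_reg[1:, :] = a_reg[:, :-1].copy(), b_reg[:-1, :].copy()
              # Skewed injection at the array edges.
              for i in range(n):
                  a_reg[i, 0] = A[i, t - i] if 0 <= t - i < k else 0.0
              for j in range(m):
                  b_reg[0, j] = B[t - j, j] if 0 <= t - j < k else 0.0
              C += a_reg * b_reg            # every PE does one MAC this cycle
          return C

      A = np.random.rand(3, 4)
      B = np.random.rand(4, 5)
      print(np.allclose(systolic_matmul(A, B), A @ B))   # True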
  • Patent number: 11900241
    Abstract: Provided are an integrated circuit chip apparatus and a related product, the integrated circuit chip apparatus being used for executing a multiplication operation, a convolution operation or a training operation of a neural network. The present technical solution has the advantages of a reduced amount of computation and low power consumption.
    Type: Grant
    Filed: March 7, 2022
    Date of Patent: February 13, 2024
    Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
    Inventors: Shaoli Liu, Xinkai Song, Bingrui Wang, Yao Zhang, Shuai Hu
  • Patent number: 11893470
    Abstract: Techniques for neural network processing using specialized data representation are disclosed. Input data for manipulation in a layer of a neural network is obtained. The input data includes image data, where the image data is represented in bfloat16 format without loss of precision. The manipulation of the input data is performed on a processor that supports single-precision operations. The input data is converted to a 16-bit reduced floating-point representation, where the reduced floating-point representation comprises an alternative single-precision data representation mode. The input data is manipulated with one or more 16-bit reduced floating-point data elements. The manipulation includes a multiply and add-accumulate operation. The manipulation further includes a unary operation, a binary operation, or a conversion operation. A result of the manipulating is forwarded to a next layer of the neural network.
    Type: Grant
    Filed: December 5, 2019
    Date of Patent: February 6, 2024
    Assignee: MIPS Tech, LLC
    Inventor: Sanjay Patel
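    Illustrative sketch: float32 inputs are reduced to bfloat16 by keeping the sign, the 8 exponent bits and 7 mantissa bits, then multiplied and add-accumulated with the running sum kept in float32. Truncation stands in for the rounding a real unit would use; this is an assumption, not the patented data path.
      import numpy as np

      def to_bfloat16(x):
          """Truncate float32 to bfloat16: keep sign, 8 exponent bits, 7 mantissa bits
          (simple truncation; real hardware typically rounds to nearest even)."""
          bits = np.asarray(x, dtype=np.float32).view(np.uint32)
          return (bits & np.uint32(0xFFFF0000)).view(np.float32)

      def mac_bf16(a, b, acc=0.0):
          """Multiply bfloat16-truncated inputs and accumulate in float32."""
          prod = to_bfloat16(a) * to_bfloat16(b)       # reduced-precision operands
          return np.float32(acc) + prod.astype(np.float32).sum()

      a = np.random.rand(8).astype(np.float32)
      b = np.random.rand(8).astype(np.float32)
      print(mac_bf16(a, b), float(np.dot(a, b)))       # close, reduced precision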
  • Patent number: 11893388
    Abstract: An integrated circuit including a plurality of processing components to process image data of a plurality of image frames, wherein each image frame includes a plurality of stages. Each processing component includes a plurality of execution pipelines, wherein each pipeline includes a plurality of multiplier-accumulator circuits configurable to perform multiply and accumulate operations using image data and filter weights, wherein: (i) a first processing component is configured to process all of the data associated with a first plurality of stages of each image frame, and (ii) a second processing component of the plurality of processing components is configured to process all of the data associated with a second plurality of stages of each image frame. The first and second processing components process data associated with the first and second plurality of stages, respectively, of a first image frame concurrently.
    Type: Grant
    Filed: April 13, 2022
    Date of Patent: February 6, 2024
    Assignee: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang, Valentin Ossman
  • Patent number: 11886835
    Abstract: A multiplier-accumulator accepts A and B digital inputs and generates a dot product P by applying the bits of the A input and the bits of the B input to unit elements comprised of groups of AND gates coupled to charge transfer lines through a capacitor Cu. The number of bits in the B input is the number of AND-groups and the number of bits in A is the number of AND gates in an AND-group. Each unit element receives one bit of the B input, applied to all of the AND gates of the unit element, and has the bits of A applied to its associated AND gate inputs. The AND gates are coupled to charge transfer lines through a capacitor Cu, and the charge transfer lines couple to binary weighted charge summing capacitors which sum and scale the charges from the charge transfer lines; the summed charge is coupled to an analog-to-digital converter which forms the dot product output. The charge transfer lines may span multiple unit elements.
    Type: Grant
    Filed: December 31, 2020
    Date of Patent: January 30, 2024
    Assignee: Ceremorphic, Inc.
    Inventors: Ryan Boesch, Martin Kraemer, Wei Xiong
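    Illustrative sketch (a digital model of the charge path): every AND of a bit of A with a bit of B deposits one unit of 'charge' on the transfer line for that bit weight, the binary-weighted summing scales each line by its power of two, and the final sum plays the role of the ADC output. Operand widths are assumptions.
      import numpy as np

      def charge_domain_multiply(a, b, m=8):
          """Digital model of the charge-domain multiplier: each AND of bit a_i and
          bit b_j dumps one unit of 'charge' onto the transfer line of weight i+j;
          binary-weighted summing then scales each line by 2**(i+j) before the ADC."""
          line_charge = np.zeros(2 * m - 1)             # charge per transfer line
          for j in range(m):                            # one unit element per B bit
              b_bit = (b >> j) & 1                      # drives all AND gates in the group
              for i in range(m):                        # one AND gate per A bit
                  line_charge[i + j] += b_bit & ((a >> i) & 1)
          weights = 2.0 ** np.arange(2 * m - 1)         # binary-weighted summing caps
          return int(round(float(line_charge @ weights)))   # ADC output

      print(charge_domain_multiply(57, 215), 57 * 215)      # both 12255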
  • Patent number: 11861327
    Abstract: A processor for fine-grain sparse integer and floating-point operations and method of operation thereof are provided. In some embodiments, the method includes forming a first set of products and forming a second set of products. The forming of the first set of products may include: multiplying, in a first multiplier, a first activation value by a least significant sub-word and a most significant sub-word of a first weight to form a first partial product and a second partial product; and adding the first partial product and the second partial product. The forming of the second set of products may include: multiplying, in the first multiplier, a second activation value by a first sub-word and a second sub-word of a mantissa to form a third partial product and a fourth partial product; and adding the third partial product and the fourth partial product.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: January 2, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ali Shafiee Ardestani, Joseph Hassoun
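    Illustrative sketch (sub-word width is an assumption): the weight is split into a least-significant and a most-significant sub-word, the activation is multiplied by each sub-word to form two partial products, and the partial products are combined with a shift and an add; a floating-point mantissa would be handled the same way.
      def multiply_by_subwords(activation, weight, sub_bits=4):
          """Multiply via two sub-word partial products: activation * LSW and
          activation * MSW of the weight, then combine with the proper shift."""
          mask = (1 << sub_bits) - 1
          lsw = weight & mask                    # least significant sub-word
          msw = (weight >> sub_bits) & mask      # most significant sub-word
          p_low  = activation * lsw              # first partial product
          p_high = activation * msw              # second partial product
          return p_low + (p_high << sub_bits)    # shift-and-add of the partials

      act, w = 93, 0xB7
      print(multiply_by_subwords(act, w), act * w)     # both 17019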
  • Patent number: 11861323
    Abstract: Hardware logic arranged to normalise (or renormalise) an n-bit input number is described in which at least a proportion of a left shifting operation is performed in parallel with a leading zero count operation. In various embodiments the left shifting and the leading zero count are performed independently. In various other embodiments, a subset of the bits output by a leading zero counter are input to a left shifter and the output from the left shifter is input to a renormalisation block which completes the remainder of the left shifting operation independently of any further input from the leading zero counter.
    Type: Grant
    Filed: March 15, 2021
    Date of Patent: January 2, 2024
    Assignee: Imagination Technologies Limited
    Inventor: Theo Alan Drane
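    Illustrative sketch (the split of the count into coarse and fine bits is an assumption): the leading zero count is taken, its high bits drive a coarse left shift that hardware could begin early, and the remaining low bits complete the normalisation with a fine shift.
      def count_leading_zeros(x, n=16):
          """Leading-zero count of an n-bit value (0 is counted as n)."""
          for i in range(n):
              if x & (1 << (n - 1 - i)):
                  return i
          return n

      def normalise(x, n=16):
          """Normalise x so its MSB is set, shifting in two stages: a coarse shift
          driven by the high bits of the LZC (which settle first in hardware),
          then a fine shift from the low bits."""
          lzc = count_leading_zeros(x, n)
          coarse = lzc & ~0x3                    # high bits of the count -> coarse shift
          fine = lzc & 0x3                       # low bits -> residual shift
          y = (x << coarse) & ((1 << n) - 1)     # could start before fine bits are ready
          y = (y << fine) & ((1 << n) - 1)       # completes the normalisation
          return y, lzc

      x = 0b0000000101101001                     # 16-bit input with 7 leading zeros
      print(normalise(x))                        # MSB-aligned value and the LZC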