Patents Examined by Carlo Waje
  • Patent number: 11762946
    Abstract: Convolution with a 5×5 kernel involves computing the dot product of a 5×5 data block with a 5×5 kernel. Instead of computing this dot product as a single sum of 25 products, the dot product is computed as a sum of four partial sums, where each partial sum is computed as a dot product of a 3×3 data block with a 3×3 kernel. The four partial sums may be computed by a single 3×3 convolver unit over four time periods. During each time period, at least some of the weights received by the 3×3 convolver unit may correspond to a quadrant of weights from the 5×5 kernel. A shifter circuit provides shifted columns (left or right shifted) of the input data to the 3×3 convolver unit, allowing the 3×3 convolver unit access to the 3×3 data block that spatially corresponds to a particular quadrant of weights from the 5×5 kernel.
    Type: Grant
    Filed: September 23, 2022
    Date of Patent: September 19, 2023
    Assignee: Recogni Inc.
    Inventors: Gary S. Goldman, Shabarivas Abhiram
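The quadrant decomposition described in the abstract above can be sketched in plain Python. Zero-padding both the 5×5 kernel and the 5×5 data block to 6×6 makes them split cleanly into four 3×3 quadrants, one per time period; summing the four 3×3 dot products reproduces the full 25-term sum. This is an illustrative reconstruction (function names are mine), not the patented hardware datapath — the shifter circuit is modeled simply by list slicing.

```python
def dot3x3(a, b):
    """Dot product of two 3x3 blocks."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))

def conv5x5_via_3x3(data, kernel):
    """Compute a 5x5 dot product as four 3x3 partial sums.

    Both the 5x5 kernel and the 5x5 data block are zero-padded to 6x6
    so they split into four 3x3 quadrants; the padded entries multiply
    to zero and do not change the result.
    """
    pad = lambda m: [row + [0] for row in m] + [[0] * 6]
    d6, k6 = pad(data), pad(kernel)
    total = 0
    for r in (0, 3):           # one "time period" per quadrant
        for c in (0, 3):
            dq = [row[c:c + 3] for row in d6[r:r + 3]]
            kq = [row[c:c + 3] for row in k6[r:r + 3]]
            total += dot3x3(dq, kq)
    return total
```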
  • Patent number: 11714604
    Abstract: An embodiment method for determining a carry digit indicator bit of a first binary datum includes a step of processing the first binary datum masked by a masking operation, and does not include any step of processing the unmasked first binary datum.
    Type: Grant
    Filed: September 30, 2020
    Date of Patent: August 1, 2023
    Assignees: STMICROELECTRONICS (ROUSSET) SAS, STMICROELECTRONICS (GRENOBLE 2) SAS
    Inventors: Rene Peyrard, Fabrice Romain
  • Patent number: 11687615
    Abstract: Systems, apparatuses, and methods for implementing a tiling algorithm for a matrix math instruction set are disclosed. A system includes at least a memory, a cache, a processor, and a plurality of compute units. The memory stores a plurality of matrix elements in a linear format, and the processor converts the plurality of matrix elements from the linear format to a tiling format. Each compute unit retrieves a plurality of matrix elements from the memory into the cache. Each compute unit includes a matrix operations unit which loads the plurality of matrix elements of corresponding tile(s) from the cache and performs a matrix operation on the plurality of matrix elements to generate a result in the tiling format. The system generates a classification of a first dataset based on results of the matrix operations.
    Type: Grant
    Filed: December 26, 2018
    Date of Patent: June 27, 2023
    Inventor: Hua Zhang
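The linear-to-tiling conversion in the abstract above amounts to regrouping a row-major matrix into contiguous tiles. The sketch below assumes a particular tile shape and row-major tile traversal order purely for illustration; the patent does not fix these details here.

```python
def linear_to_tiles(mat, tile_rows, tile_cols):
    """Convert a row-major (linear-format) matrix into a list of tiles.

    Tiles are emitted in row-major order over the tile grid; each tile
    is itself a small row-major matrix, ready for a matrix operations
    unit that consumes one tile at a time.
    """
    rows, cols = len(mat), len(mat[0])
    assert rows % tile_rows == 0 and cols % tile_cols == 0
    tiles = []
    for tr in range(0, rows, tile_rows):
        for tc in range(0, cols, tile_cols):
            tiles.append([row[tc:tc + tile_cols]
                          for row in mat[tr:tr + tile_rows]])
    return tiles
```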
  • Patent number: 11669302
    Abstract: An in-memory vector addition method for a dynamic random access memory (DRAM) is disclosed which includes consecutively transposing two numbers across a plurality of rows of the DRAM, each number transposed across a fixed number of rows associated with a corresponding number of bits, assigning a scratch-pad including two consecutive bits for each bit of each number being added, two consecutive bits for carry-in (Cin), and two consecutive bits for carry-out-bar (Cout), assigning a plurality of bits in a transposed orientation to hold results as a sum of the two numbers, for each bit position of the two numbers: computing the associated sum of the bit position; and placing the computed sum in the associated bit of the sum.
    Type: Grant
    Filed: October 15, 2020
    Date of Patent: June 6, 2023
    Assignee: Purdue Research Foundation
    Inventors: Mustafa Ali, Akhilesh Jaiswal, Kaushik Roy
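The bit-serial addition underlying the in-memory scheme above can be modeled in software: each operand is a list of bits (LSB first), mimicking a number transposed across DRAM rows one bit per row, and the sum is computed one bit position at a time with a ripple carry. The scratch-pad and complemented-carry details of the patent are omitted; this is only the arithmetic skeleton.

```python
def transposed_add(a_bits, b_bits):
    """Bit-serial addition over transposed operands (LSB first)."""
    n = max(len(a_bits), len(b_bits))
    a = a_bits + [0] * (n - len(a_bits))
    b = b_bits + [0] * (n - len(b_bits))
    carry, out = 0, []
    for i in range(n):
        s = a[i] ^ b[i] ^ carry                          # sum bit
        carry = (a[i] & b[i]) | (carry & (a[i] ^ b[i]))  # carry-out
        out.append(s)
    out.append(carry)          # final carry becomes the top sum bit
    return out

def to_bits(x, n):
    """Integer -> n-bit list, LSB first (the transposed layout)."""
    return [(x >> i) & 1 for i in range(n)]

def from_bits(bits):
    """LSB-first bit list -> integer."""
    return sum(b << i for i, b in enumerate(bits))
```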
  • Patent number: 11640280
    Abstract: Embodiments relate to a system for solving differential equations. The system is configured to receive problem packages corresponding to problems to be solved, each comprising at least a differential equation and a domain. A solver stores a plurality of nodes of the domain corresponding to a first time-step, and processes the nodes over a plurality of time-steps using a systolic array comprising hardware for solving the particular type of the differential equation. The systolic array processes each node to generate a node for a subsequent time-step using a sub-array comprising a plurality of branches, each branch comprising a respective set of arithmetic units arranged in accordance with a corresponding term of the discretized form of the differential equation, and an aggregator configured to aggregate the corresponding terms from each branch to generate node data for the subsequent time-step.
    Type: Grant
    Filed: June 24, 2022
    Date of Patent: May 2, 2023
    Assignee: Vorticity Inc.
    Inventor: Chirath Neranjena Thouppuarachchi
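The branch-and-aggregator structure above maps naturally onto an explicit time step of a discretized PDE. As an illustration (my own example, not taken from the patent), one step of the 1D heat equation splits into three additive terms — one per "branch" — whose sum plays the aggregator's role; boundary nodes are held fixed for simplicity.

```python
def heat_step(u, r):
    """One explicit time step of the 1D heat equation u_t = a * u_xx.

    Discretized form: u'[i] = r*u[i-1] + (1 - 2r)*u[i] + r*u[i+1],
    with r = a*dt/dx^2. Each term mirrors one branch of a systolic
    sub-array; the final sum is the aggregator.
    """
    nxt = list(u)
    for i in range(1, len(u) - 1):
        term_left = r * u[i - 1]            # branch 1
        term_center = (1 - 2 * r) * u[i]    # branch 2
        term_right = r * u[i + 1]           # branch 3
        nxt[i] = term_left + term_center + term_right  # aggregator
    return nxt
```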
  • Patent number: 11640278
    Abstract: A random number generation device includes: a plurality of first uniform random number generators configured to respectively generate a plurality of first uniform random numbers; a plurality of first normal random number generators configured to respectively generate a plurality of first normal random numbers based on the plurality of first uniform random numbers; a plurality of second uniform random number generators configured to perform a logical operation on bit values of two or more of the first uniform random numbers to respectively generate a plurality of second uniform random numbers; and at least one second normal random number generator configured to generate at least one second normal random number based on the plurality of second uniform random numbers.
    Type: Grant
    Filed: August 27, 2020
    Date of Patent: May 2, 2023
    Assignee: FUJITSU LIMITED
    Inventor: Kentaro Katayama
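The idea above — deriving extra uniform generators by a logical operation on existing ones, then converting uniforms to normals — can be sketched as follows. The patent does not specify the uniform-to-normal transform or the bit width; Box–Muller and a 32-bit XOR are used here purely as stand-ins.

```python
import math

def normal_from_uniform(u1, u2):
    """Box-Muller: two uniforms in (0, 1] -> one standard normal.

    Stand-in for the patent's first normal random number generators.
    """
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

def derived_uniform(x, y, bits=32):
    """Second uniform generator: bitwise XOR of two first uniforms.

    Both inputs are quantized to `bits` bits, XORed, and mapped back
    into (0, 1); the half-LSB offset keeps the result strictly inside
    the open interval so Box-Muller never sees 0.
    """
    scale = 1 << bits
    z = (int(x * scale) ^ int(y * scale)) & (scale - 1)
    return (z + 0.5) / scale
```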
  • Patent number: 11620107
    Abstract: The invention is directed to a Quantum Random Number Generator comprising an emitting device (110) triggered by a signal representing an input bit x and adapted to generate and send a physical system (130) characterized by one of two possible quantum states determined by said input bit x, a measurement device (120) adapted to detect said physical system, to identify the quantum state of said physical system through an unambiguous state discrimination measurement and to generate an output b first representing whether the quantum state has been identified or not and, if it has been identified, which quantum state among the two possible quantum states was detected by the unambiguous state discrimination measurement to a processing device (140), the processing device (140) being adapted to estimate the entropy of the output b given the probabilities p(b|x) representing the probability of observing output b for a state preparation x, and a randomness extraction device (150) adapted to extract a final random bit stream.
    Type: Grant
    Filed: October 6, 2017
    Date of Patent: April 4, 2023
    Assignee: UNIVERSITÉ DE GENÈVE
    Inventors: Anthony Christophe Mickaël Martin, Nicolas Brunner, Hugo Zbinden, Jonatan Brask, Joseph Bowles
  • Patent number: 11620108
    Abstract: A random number generation system may generate one or more random numbers based on the repeated programming of a memory, such as a flash memory. As an example, a control system may repeatedly store a sequence to a block of flash memory to force a plurality of cells into a random state such that, at any given instant, the values in the cells may be random. The control system may identify which of the cells contain random values and then generate based on the identified values a number that is truly random.
    Type: Grant
    Filed: May 17, 2019
    Date of Patent: April 4, 2023
    Assignee: Board of Trustees of the University of Alabama for and on behalf of the University of Alabama in Huntsville
    Inventors: Biswajit Ray, Aleksander Milenkovic
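The cell-identification step above can be illustrated in software: given repeated read-backs of the same repeatedly programmed block (simulated here as bit lists — real hardware would read the flash array), cells whose values disagree across trials are treated as entropy sources. The harvesting policy below (taking each random cell's last-read value) is my own simplification.

```python
def harvest_random_cells(read_trials):
    """Identify cells that vary across repeated program/read trials.

    `read_trials` is a list of equal-length bit lists, one per repeated
    programming of the same block. Returns the indices of non-constant
    cells and a candidate random bit string drawn from them.
    """
    n = len(read_trials[0])
    random_cells = [i for i in range(n)
                    if len({trial[i] for trial in read_trials}) > 1]
    return random_cells, [read_trials[-1][i] for i in random_cells]
```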
  • Patent number: 11580192
    Abstract: A processor system comprises a plurality of processing elements. Each processing element includes a corresponding convolution processor unit configured to perform a portion of a groupwise convolution. The corresponding convolution processor unit determines multiplication results by multiplying each data element of a portion of data elements in a convolution data matrix with a corresponding data element in a corresponding groupwise convolution weight matrix. The portion of data elements in the convolution data matrix that are multiplied belong to different channels and different groups. For each specific channel of the different channels, the corresponding convolution processor unit sums together at least some of the multiplication results belonging to the same specific channel to determine a corresponding channel convolution result data element.
    Type: Grant
    Filed: April 8, 2020
    Date of Patent: February 14, 2023
    Assignee: Meta Platforms, Inc.
    Inventors: Rakesh Komuravelli, Krishnakumar Narayanan Nair, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
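The per-channel summation above can be sketched with a dictionary keyed by (group, channel) pairs — an assumed data layout, chosen only to make the "multiply across groups and channels, then sum per channel" step explicit.

```python
from collections import defaultdict

def groupwise_channel_dot(data, weights):
    """Per-channel partial sums for a groupwise convolution.

    `data` and `weights` map (group, channel) keys to equal-length
    element lists. Products are formed for every key, then accumulated
    per channel, as each convolution processor unit does per element.
    """
    sums = defaultdict(float)
    for key, d_elems in data.items():
        _, channel = key
        w_elems = weights[key]
        sums[channel] += sum(x * w for x, w in zip(d_elems, w_elems))
    return dict(sums)
```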
  • Patent number: 11561769
    Abstract: A random number generator including: a first ring oscillator including a first inverter chain, the first inverter chain including a plurality of serially connected first inverters, the first ring oscillator configured to output a first random signal generated at a first sub-node between two neighboring first inverters among the plurality of first inverters; a second ring oscillator including a second inverter chain, the second inverter chain including a plurality of serially connected second inverters, the second ring oscillator configured to output a second random signal generated at a second sub-node between two neighboring second inverters among the plurality of second inverters; and a signal processing circuit for generating a random number by combining the first random signal with the second random signal.
    Type: Grant
    Filed: August 15, 2019
    Date of Patent: January 24, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ji-eun Park, Yong-ki Lee, Yun-hyeok Choi, Bohdan Karpinskyy
  • Patent number: 11531727
    Abstract: Some embodiments provide a method for a circuit that executes a neural network including multiple nodes. The method loads a set of weight values for a node into a set of weight value buffers, a first set of bits of each input value of a set of input values for the node into a first set of input value buffers, and a second set of bits of each of the input values into a second set of input value buffers. The method computes a first dot product of the weight values and the first set of bits of each input value and a second dot product of the weight values and the second set of bits of each input value. The method shifts the second dot product by a particular number of bits and adds the first dot product with the bit-shifted second dot product to compute a dot product for the node.
    Type: Grant
    Filed: December 6, 2018
    Date of Patent: December 20, 2022
    Assignee: PERCEIVE CORPORATION
    Inventors: Jung Ko, Kenneth Duong, Steven L. Teig
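The shift-and-add trick above is easy to verify numerically: split each input into a low and a high bit slice, take a dot product per slice, shift the high-slice result left by the slice width, and add. The 4-bit split below is an assumption for illustration.

```python
def split_bit_dot_product(weights, inputs, low_bits=4):
    """Dot product computed from two partial dot products over bit slices.

    Each input is split into its low `low_bits` bits and the remaining
    high bits; since x = (x >> low_bits) * 2**low_bits + (x & mask),
    the shifted high-slice dot product plus the low-slice dot product
    reproduces the full result.
    """
    mask = (1 << low_bits) - 1
    low = [x & mask for x in inputs]
    high = [x >> low_bits for x in inputs]
    dot_low = sum(w * x for w, x in zip(weights, low))
    dot_high = sum(w * x for w, x in zip(weights, high))
    return dot_low + (dot_high << low_bits)
```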
  • Patent number: 11528033
    Abstract: A deep neural network (“DNN”) module compresses and decompresses neuron-generated activation data to reduce the utilization of memory bus bandwidth. The compression unit receives an uncompressed chunk of data generated by a neuron in the DNN module. The compression unit generates a mask portion and a data portion of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit receives a compressed chunk of data from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion and the data portion.
    Type: Grant
    Filed: April 13, 2018
    Date of Patent: December 13, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Joseph Leon Corkery, Benjamin Eliot Lundell, Larry Marvin Wall, Chad Balling McBride, Amol Ashok Ambardekar, George Petre, Kent D. Cedola, Boris Bobrov
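The mask/data scheme above can be modeled directly: the mask carries one bit per byte (1 = non-zero) and the data portion keeps only the non-zero bytes. The patent's additional byte truncation is omitted here for simplicity; function names are illustrative.

```python
def compress_chunk(chunk):
    """Compress a chunk of activation bytes into (mask, data).

    The mask encodes the presence and position of non-zero bytes;
    the data portion stores the non-zero bytes in order.
    """
    mask, data = 0, []
    for i, b in enumerate(chunk):
        if b != 0:
            mask |= 1 << i
            data.append(b)
    return mask, bytes(data)

def decompress_chunk(mask, data, length):
    """Rebuild the original chunk from the mask and data portions."""
    out, it = [], iter(data)
    for i in range(length):
        out.append(next(it) if (mask >> i) & 1 else 0)
    return bytes(out)
```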
  • Patent number: 11520561
    Abstract: Described herein is a neural network accelerator with a set of neural processing units and an instruction set for execution on the neural processing units. The instruction set is a compact instruction set including various compute and data move instructions for implementing a neural network. Among the compute instructions are an instruction for performing a fused operation comprising sequential computations, one of which involves matrix multiplication, and an instruction for performing an elementwise vector operation. The instructions in the instruction set are highly configurable and can handle data elements of variable size. The instructions also implement a synchronization mechanism that allows asynchronous execution of data move and compute operations across different components of the neural network accelerator as well as between multiple instances of the neural network accelerator.
    Type: Grant
    Filed: June 27, 2019
    Date of Patent: December 6, 2022
    Assignee: Amazon Technologies, Inc.
    Inventor: Tariq Afzal
  • Patent number: 11514136
    Abstract: A circuit for performing parallel convolutional computation for features and kernels of variable sizes may receive inputs of an m×n matrix of feature data, an m×n matrix of convolution data, and a (2m−1)×(2n−1) matrix of kernel data. A feature manager of the circuit may hold m rows of n data buffers storing the input feature data and rotating values between rows during one restricted convolution calculation. A kernel manager of the circuit may hold a (2m−1)×(2n−1) matrix of data buffers storing the input kernel data in the buffers and cyclically rotating values in upwards, downwards, leftwards and rightwards directions for different restricted convolution calculations. A row convolution engine of the circuit may hold m row convolution processors, each storing and updating input convolution data by multiplication-and-accumulation (MAC) operations on its input feature and kernel data rows. The circuit produces accumulated convolutional data.
    Type: Grant
    Filed: May 14, 2020
    Date of Patent: November 29, 2022
    Assignee: Aspiring Sky Co. Limited
    Inventors: Yujie Wen, Zhijiong Luo
  • Patent number: 11500613
    Abstract: A memory unit with a multiply-accumulate assist scheme for a plurality of multi-bit convolutional neural network based computing-in-memory applications is controlled by a reference voltage, a word line and a multi-bit input voltage. The memory unit includes a non-volatile memory cell, a voltage divider and a voltage keeper. The non-volatile memory cell is controlled by the word line and stores a weight. The voltage divider includes a data line and generates a charge current on the data line according to the reference voltage, and a voltage level of the data line is generated by the non-volatile memory cell and the charge current. The voltage keeper generates an output current on an output node according to the multi-bit input voltage and the voltage level of the data line, and the output current corresponds to the multi-bit input voltage multiplied by the weight.
    Type: Grant
    Filed: February 6, 2020
    Date of Patent: November 15, 2022
    Assignee: NATIONAL TSING HUA UNIVERSITY
    Inventors: Meng-Fan Chang, Han-Wen Hu, Kuang-Tang Chang
  • Patent number: 11500612
    Abstract: The present disclosure relates generally to arithmetic units of processors, and may relate more particularly to multi-cycle division operations. Multiple-cycles of a radix-m division operation may be performed to generate one or more signal states representative of a result value based at least in part on a dividend value and a divisor value.
    Type: Grant
    Filed: February 14, 2020
    Date of Patent: November 15, 2022
    Assignee: Arm Limited
    Inventor: Javier Diaz Bruguera
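The multi-cycle radix-m recurrence above can be illustrated with restoring radix-4 division: each cycle consumes two dividend bits, shifts the partial remainder left by 2, and selects a quotient digit in {0, 1, 2, 3}. A real radix-4 SRT unit uses a redundant digit set and a selection table; this sketch only shows the per-cycle digit recurrence.

```python
def radix4_divide(dividend, divisor, n_bits=16):
    """Restoring radix-4 division: two quotient bits per cycle."""
    assert divisor > 0
    bits = [(dividend >> (n_bits - 1 - i)) & 1 for i in range(n_bits)]
    if n_bits % 2:                 # pad to an even number of bits
        bits = [0] + bits
    remainder, quotient = 0, 0
    for c in range(0, len(bits), 2):
        # Shift in the next two dividend bits (one radix-4 digit).
        remainder = (remainder << 2) | (bits[c] << 1) | bits[c + 1]
        digit = 3
        while digit * divisor > remainder:   # select largest valid digit
            digit -= 1
        remainder -= digit * divisor
        quotient = (quotient << 2) | digit
    return quotient, remainder
```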
  • Patent number: 11487846
    Abstract: Embodiments relate to a neural processor circuit including a plurality of neural engine circuits, a data buffer, and a kernel fetcher circuit. At least one of the neural engine circuits is configured to receive matrix elements of a matrix as at least a portion of the input data from the data buffer over multiple processing cycles. The at least one neural engine circuit further receives vector elements of a vector from the kernel fetcher circuit, wherein each of the vector elements is extracted as a corresponding kernel to the at least one neural engine circuit in each of the processing cycles. The at least one neural engine circuit performs multiplication between the matrix and the vector as a convolution operation to produce at least one output channel of the output data.
    Type: Grant
    Filed: May 4, 2018
    Date of Patent: November 1, 2022
    Assignee: Apple Inc.
    Inventors: Christopher L. Mills, Erik K. Norden, Sung Hee Park
  • Patent number: 11487023
    Abstract: The disclosure relates to a method for measuring the variance in a measurement signal, comprising the following steps: filtering the measurement signal by means of a high-pass filter in order to obtain a filtered measurement signal; determining the variance by using the filtered measurement signal.
    Type: Grant
    Filed: December 12, 2016
    Date of Patent: November 1, 2022
    Assignee: Robert Bosch GmbH
    Inventor: Aaron Troost
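The two steps above — high-pass filter, then variance of the filtered signal — can be sketched as follows. The particular one-pole filter and the absence of a variance correction factor are illustrative assumptions, not the patent's specification.

```python
def filtered_variance(signal, alpha=0.9):
    """Estimate noise variance after high-pass filtering the signal.

    A one-pole high-pass, y[k] = alpha * (y[k-1] + x[k] - x[k-1]),
    removes the slowly varying component so the variance of y reflects
    measurement noise rather than the signal trend.
    """
    y, out = 0.0, []
    for prev, cur in zip(signal, signal[1:]):
        y = alpha * (y + cur - prev)
        out.append(y)
    mean = sum(out) / len(out)
    return sum((v - mean) ** 2 for v in out) / len(out)
```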
  • Patent number: 11468145
    Abstract: Some embodiments provide a neural network inference circuit (NNIC) for executing a NN that includes multiple computation nodes at multiple layers. Each of a set of the computation nodes includes a dot product of input values and weight values. The NNIC includes a set of dot product cores, each of which includes (i) partial dot product computation circuits to compute dot products between input values and weight values and (ii) memories to store the sets of weight values and sets of input values for a layer of the neural network. The input values for a particular layer are arranged in a plurality of two-dimensional grids. A particular core stores all of the input values of a subset of the two-dimensional grids. Input values having a same set of coordinates in each respective grid of the subset of the two-dimensional grids are stored sequentially within the memories of the particular core.
    Type: Grant
    Filed: March 15, 2019
    Date of Patent: October 11, 2022
    Assignee: PERCEIVE CORPORATION
    Inventors: Kenneth Duong, Jung Ko, Steven L. Teig
  • Patent number: 11467804
    Abstract: A computer-implemented method for programming an integrated circuit includes receiving a program design and determining one or more addition operations based on the program design. The method also includes performing geometric synthesis based on the one or more addition operations by determining a plurality of bits associated with the one or more addition operations and defining a plurality of counters that includes the plurality of bits. Furthermore, the method includes generating instructions configured to cause circuitry configured to perform the one or more addition operations to be implemented on the integrated circuit based on the plurality of counters. The circuitry includes first adder circuitry configured to add a portion of the plurality of bits and produce a carry-out value. The circuitry also includes second adder circuitry configured to determine a sum of a second portion of the plurality of bits and the carry-out value.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: October 11, 2022
    Assignee: Intel Corporation
    Inventors: Sergey Vladimirovich Gribok, Gregg William Baeckler, Martin Langhammer
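The first-adder/second-adder chaining above reduces, in software terms, to a split adder: a low stage adds one portion of the bits and produces a carry-out that feeds the high stage. The 4-bit split of an 8-bit word below is an assumed parameterization for illustration.

```python
def split_adder(a, b, split=4, width=8):
    """Add two `width`-bit values using two chained adder stages.

    The low stage adds the bottom `split` bits; its carry-out feeds
    the high stage, which adds the remaining bits plus the carry.
    """
    mask_lo = (1 << split) - 1
    lo = (a & mask_lo) + (b & mask_lo)      # first adder circuitry
    carry = lo >> split                     # carry-out value
    hi = (a >> split) + (b >> split) + carry  # second adder circuitry
    return ((hi << split) | (lo & mask_lo)) & ((1 << width) - 1)
```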