Patents Examined by Michael D Yaary
  • Patent number: 11755901
    Abstract: An apparatus for applying dynamic quantization of a neural network is described herein. The apparatus includes a scaling unit and a quantizing unit. The scaling unit is to calculate initial desired scale factors for a plurality of inputs, weights, and a bias, and to apply the input scale factor to a summation node. Also, the scaling unit is to determine a scale factor for a multiplication node based on the desired scale factors of the inputs, and to select a scale factor for an activation function and an output node. The quantizing unit is to dynamically requantize the neural network by traversing a graph of the neural network.
    Type: Grant
    Filed: December 28, 2017
    Date of Patent: September 12, 2023
    Assignee: Intel Corporation
    Inventor: Michael E. Deisher
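The scale-factor arithmetic at the heart of this kind of quantization can be illustrated with a minimal sketch. The function names and the int8 target below are illustrative choices, not the patented design:

```python
import numpy as np

def scale_factor(values, num_bits=8):
    """Pick a scale factor mapping real values onto a signed integer range."""
    max_abs = np.max(np.abs(values))
    qmax = 2 ** (num_bits - 1) - 1  # e.g. 127 for int8
    return qmax / max_abs if max_abs > 0 else 1.0

def quantize(values, scale, num_bits=8):
    """Scale, round, and clip values into the signed num_bits range."""
    qmax = 2 ** (num_bits - 1) - 1
    return np.clip(np.round(values * scale), -qmax - 1, qmax).astype(np.int32)

# Inputs and weights each get their own scale factor; the scale at a
# multiplication node is then the product of the two, as in fixed-point math.
x = np.array([0.5, -1.2, 0.3])
w = np.array([0.25, 0.8, -0.6])
sx, sw = scale_factor(x), scale_factor(w)
qx, qw = quantize(x, sx), quantize(w, sw)
# Dequantizing the integer dot product approximates the float result.
approx = (qx @ qw) / (sx * sw)
```

Requantizing "dynamically", as the abstract describes, would amount to recomputing such scale factors while traversing the network graph, rather than fixing them once offline.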
  • Patent number: 11748061
    Abstract: A mass multiplier implemented as an integrated circuit has a port receiving a stream of discrete values and circuitry multiplying each value as received by a plurality of weight values simultaneously. An output channel provides products of the mass multiplier as produced. The mass multiplier is applied to neural network nodes.
    Type: Grant
    Filed: October 15, 2020
    Date of Patent: September 5, 2023
    Assignee: GIGANTOR TECHNOLOGIES INC.
    Inventor: Mark Ashley Mathews
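The core idea, one streamed value fanned out to many multipliers at once, can be modelled in software. This is a hedged sketch of the concept only; the generator below stands in for the hardware output channel:

```python
import numpy as np

def mass_multiply(stream, weights):
    """For each value arriving on the stream, emit its product with every
    weight simultaneously (modelling a bank of parallel multipliers)."""
    for value in stream:
        yield value * weights  # one row of products per streamed input

weights = np.array([2, 3, 5])
products = list(mass_multiply([1, 4], weights))
```

In a neural-network node, `weights` would be the node's weight vector, so every incoming activation produces all of its weighted contributions in one step.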
  • Patent number: 11748657
    Abstract: Machine-learning methods and apparatus are provided to solve blind source separation problems with an unknown number of sources and having a signal propagation model with features such as wave-like propagation, medium-dependent velocity, attenuation, diffusion, and/or advection, between sources and sensors. In exemplary embodiments, multiple trials of non-negative matrix factorization are performed for a fixed number of sources, with selection criteria applied to determine successful trials. A semi-supervised clustering procedure is applied to trial results, and the clustering results are evaluated for robustness using measures for reconstruction quality and cluster separation. The number of sources is determined by comparing these measures for different trial numbers of sources. Source locations and parameters of the signal propagation model can also be determined.
    Type: Grant
    Filed: September 14, 2020
    Date of Patent: September 5, 2023
    Assignee: Triad National Security, LLC
    Inventors: Boian S. Alexandrov, Ludmil B. Alexandrov, Filip L. Iliev, Valentin G. Stanev, Velimir V. Vesselinov
  • Patent number: 11740870
    Abstract: A Multiply-Accumulate (MAC) hardware accelerator includes a plurality of multipliers. The plurality of multipliers multiply a digit-serial input having a plurality of digits by a parallel input having a plurality of bits by sequentially multiplying individual digits of the digit-serial input by the plurality of bits of the parallel input. A result is generated based on the multiplication of the digit-serial input by the parallel input. An accelerator framework may include multiple MAC hardware accelerators, and may be used to implement a convolutional neural network. The MAC hardware accelerators may multiply an input weight by an input feature by sequentially multiplying individual digits of the input weight by the input feature.
    Type: Grant
    Filed: March 27, 2020
    Date of Patent: August 29, 2023
    Assignees: STMICROELECTRONICS S.r.l., STMicroelectronics International N.V.
    Inventors: Giuseppe Desoli, Thomas Boesch, Carmine Cappetta, Ugo Maria Iannuzzi
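Digit-serial multiplication itself is easy to sketch: take the weight one digit at a time, multiply each digit by the full-width feature, and accumulate the place-shifted partial products. A minimal illustration (base 16, i.e. 4-bit digits, and a nonnegative weight are assumed):

```python
def digit_serial_multiply(weight, feature, base=16):
    """Multiply `weight` by `feature` one digit of the weight at a time,
    accumulating place-shifted partial products (a digit-serial MAC).
    Assumes weight >= 0; base 16 means 4-bit digits."""
    acc, place = 0, 1
    while weight:
        digit = weight % base            # next least-significant digit
        acc += digit * feature * place   # partial product, shifted into place
        weight //= base
        place *= base
    return acc
```

The hardware trade-off this models: each cycle handles only one digit of one operand, so the multiplier array is narrower, at the cost of one cycle per digit.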
  • Patent number: 11734002
    Abstract: The present disclosure provides a counting device and counting method. The device includes a storage unit, a counting unit, and a register unit. The storage unit may be connected to the counting unit, and stores the input data to be counted as well as, after counting, the number of elements in that data satisfying a given condition. The register unit may be configured to store the address at which the input data to be counted is stored in the storage unit. The counting unit may be connected to the register unit, and may be configured to acquire a counting instruction, read the storage address of the input data from the register unit according to the counting instruction, acquire the corresponding input data from the storage unit, perform statistical counting of the elements in that data satisfying the given condition, and obtain a counting result.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: August 22, 2023
    Assignee: SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTD
    Inventors: Tianshi Chen, Jie Wei, Tian Zhi, Zai Wang
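Functionally, the counting instruction reduces to scanning a stored buffer and tallying the elements that satisfy a condition. A trivial software analogue (the buffer and predicate here are made up for illustration):

```python
def count_matching(data, condition):
    """Count elements of `data` satisfying `condition`, as the counting
    instruction does over the data at the registered storage address."""
    return sum(1 for x in data if condition(x))

buffer = [3, -1, 4, 0, -5, 9]
nonneg = count_matching(buffer, lambda x: x >= 0)
```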
  • Patent number: 11716842
    Abstract: A random bit circuit includes four storage cells controlled by four different word lines. The first storage cell and the second storage cell are disposed along a first direction sequentially, and the first storage cell and the third storage cell are disposed along a second direction sequentially. The third storage cell and the fourth storage cell are disposed along the first direction sequentially. The first storage cell and the fourth storage cell are coupled in series, and the second storage cell and the third storage cell are coupled in series.
    Type: Grant
    Filed: September 29, 2020
    Date of Patent: August 1, 2023
    Assignee: eMemory Technology Inc.
    Inventors: Shiau-Pin Lin, Chih-Min Wang
  • Patent number: 11704388
    Abstract: A computing device determines a disaggregated solution vector of a plurality of variables. A first value is computed for a known variable using a predefined density distribution function, and a second value is computed for an unknown variable using the computed first value, a predefined correlation value, and a predefined aggregate value. The predefined correlation value indicates a correlation between the known variable and the unknown variable. A predefined number of solution vectors is computed by repeating the first value and the second value computations. Each solution vector comprises the computed first value and the computed second value. A centroid vector is computed from the solution vectors computed by repeating the computations. A predefined number of closest solution vectors to the computed centroid vector are determined from the solution vectors. The determined closest solution vectors are output.
    Type: Grant
    Filed: July 12, 2022
    Date of Patent: July 18, 2023
    Assignee: SAS Institute Inc.
    Inventors: Christian Macaro, Fedor Reva, Rocco Claudio Cannizzaro
  • Patent number: 11698772
    Abstract: An instruction is executed in round-for-reround mode wherein the permissible resultant value that is closest to and no greater in magnitude than the infinitely precise result is selected. If the selected value is not exact and the units digit of the selected value is either 0 or 5, then the digit is incremented by one and the selected value is delivered. In all other cases, the selected value is delivered.
    Type: Grant
    Filed: September 21, 2020
    Date of Patent: July 11, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric Mark Schwarz, Martin Stanley Schmookler
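The rounding rule itself can be stated in a few lines: truncate toward zero (the closest value no greater in magnitude), then, if the result is inexact and its units digit is 0 or 5, increment that digit by one. A sketch for an integer quotient (the function and its signature are illustrative, not IBM's implementation):

```python
def round_for_reround(numerator, denominator):
    """Round-for-reround an integer quotient: truncate toward zero, then
    if the result is inexact and its units digit is 0 or 5, bump the
    magnitude by one so a later re-rounding cannot lose information."""
    q = abs(numerator) // abs(denominator)            # truncation toward zero
    exact = (q * abs(denominator) == abs(numerator))
    if not exact and q % 10 in (0, 5):
        q += 1
    sign = -1 if (numerator < 0) != (denominator < 0) else 1
    return sign * q
```

The point of nudging digits 0 and 5 is that those are exactly the values where a subsequent decimal rounding could tie; after the bump, re-rounding the shortened result agrees with rounding the infinitely precise one.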
  • Patent number: 11681497
    Abstract: A method for an associative memory device includes storing a plurality of pairs of N-bit numbers A and B to be added together in columns of a memory array of the associative memory device, each pair in a column, each bit in a row of the column, and dividing each N-bit number A and B into groups containing M bits each, computing group carry-out predictions for every group except a first group, the group carry-out predictions calculated for any possible group carry-in value, and, once the carry-out value for the first group is calculated, selecting the next group carry-out value from the group carry-out predictions. The method also includes repeating the ripple selection of group carry-out values until all group carry-out values have been selected.
    Type: Grant
    Filed: November 2, 2020
    Date of Patent: June 20, 2023
    Assignee: GSI Technology Inc.
    Inventor: Moshe Lazer
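The group-prediction scheme is essentially carry-select addition: each group's sum is precomputed for both possible carry-ins, and the real carry then only ripples through a one-of-two selection per group. A behavioural sketch (16-bit words and 4-bit groups are assumed for illustration):

```python
def group_add(a, b, n_bits=16, group=4):
    """Add two n-bit numbers group by group: predict each group's result
    for carry-in 0 and carry-in 1, then ripple-select the right
    prediction once the previous group's carry-out is known."""
    mask = (1 << group) - 1
    carry, result = 0, 0
    for g in range(n_bits // group):
        ga = (a >> (g * group)) & mask
        gb = (b >> (g * group)) & mask
        pred = [ga + gb + cin for cin in (0, 1)]  # both predictions
        s = pred[carry]                           # select on actual carry-in
        result |= (s & mask) << (g * group)
        carry = s >> group                        # selected group carry-out
    return result + (carry << n_bits)
```

In the associative-memory setting, the predictions for all columns are produced in parallel by bit-row operations; only the per-group selection step is serial.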
  • Patent number: 11669303
    Abstract: A multiply-accumulate circuit and methods for using the same are disclosed. In one embodiment, a multiply-accumulate circuit includes a memory configured to store a first set of operands and a second set of operands, where the first set of operands and the second set of operands are cross-multiplied to form a plurality of product pairs, a plurality of computation circuits configured to generate a plurality of charges according to the plurality of product pairs, and an aggregator circuit configured to aggregate the plurality of charges from the plurality of computation circuits to record variations of charges, where the variations of charges represent an aggregated value of the plurality of product pairs.
    Type: Grant
    Filed: July 7, 2020
    Date of Patent: June 6, 2023
    Assignee: Ambient Scientific, Inc.
    Inventor: Gajendra Prasad Singh
  • Patent number: 11669585
    Abstract: In one embodiment, a method includes receiving an input tensor corresponding to a media object at a binary convolutional neural network, wherein the binary convolutional neural network comprises at least one binary convolution layer comprising one or more weights, and wherein the media object is associated with a particular task, binarizing the input tensor by the at least one binary convolution layer, binarizing the one or more weights by the at least one binary convolution layer, and generating an output corresponding to the particular task by the binary convolutional neural network based on the binarized input tensor and the binarized one or more weights.
    Type: Grant
    Filed: June 25, 2019
    Date of Patent: June 6, 2023
    Assignee: Apple Inc.
    Inventors: Carlo Eduardo Cabanero del Mundo, Ali Farhadi
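The inner loop of a binary convolution layer reduces to binarizing both operands to {-1, +1} and taking their dot product. A minimal sketch of that step (sign binarization with zero mapped to +1 is one common convention, assumed here, not necessarily Apple's):

```python
import numpy as np

def binarize(t):
    """Binarize a tensor to {-1, +1} by sign (zero maps to +1)."""
    return np.where(t >= 0, 1.0, -1.0)

def binary_dot(x, w):
    """Dot product of binarized activations and binarized weights,
    the core operation inside a binary convolution layer."""
    return binarize(x) @ binarize(w)

x = np.array([0.3, -2.0, 0.0, 1.5])
w = np.array([-1.2, -0.4, 0.9, 0.2])
out = binary_dot(x, w)
```

Because both operands are single-bit, hardware can replace the multiply-accumulate with XNOR and popcount, which is where the efficiency of binary networks comes from.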
  • Patent number: 11651231
    Abstract: A quasi-systolic array includes: a primary quasi-systolic processor; an edge row bank and edge column bank of edge quasi-systolic processors; and an interior bank of interior quasi-systolic processors. The primary quasi-systolic processor, edge quasi-systolic processor, and interior quasi-systolic processor independently include a quasi-systolic processor and are disposed and electrically connected in rows and columns in the quasi-systolic array.
    Type: Grant
    Filed: March 2, 2020
    Date of Patent: May 16, 2023
    Assignee: GOVERNMENT OF THE UNITED STATES OF AMERICA, AS REPRESENTED BY THE SECRETARY OF COMMERCE
    Inventors: Brian Douglas Hoskins, Mark David Stiles, Matthew William Daniels, Advait Madhavan, Gina Cristina Adam
  • Patent number: 11645535
    Abstract: A system and a method to normalize a deep neural network (DNN) in which a mean of activations of the DNN is set to be equal to about 0 for a training batch size of 8 or less, and a variance of the activations of the DNN is set to be equal to about a predetermined value for the training batch size. A minimization module minimizes a sum of a network loss of the DNN plus a sum of a product of a first Lagrange multiplier times the mean of the activations squared plus a sum of a product of a second Lagrange multiplier times a quantity of the variance of the activations minus one squared.
    Type: Grant
    Filed: November 9, 2018
    Date of Patent: May 9, 2023
    Inventor: Weiran Deng
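The objective the abstract describes can be written out directly: network loss plus a Lagrange-multiplier penalty on the squared activation mean and on the squared deviation of the activation variance from one. A sketch with illustrative multiplier values (the names and defaults are assumptions, not the patented choices):

```python
import numpy as np

def normalization_loss(network_loss, activations, lam1=0.1, lam2=0.1):
    """Total objective: network loss + lam1 * mean(a)^2
    + lam2 * (var(a) - 1)^2, summed over the layers' activations."""
    total = network_loss
    for a in activations:                      # one array per layer
        total += lam1 * np.mean(a) ** 2        # drives the mean toward 0
        total += lam2 * (np.var(a) - 1.0) ** 2 # drives the variance toward 1
    return total
```

Minimizing this jointly over weights normalizes activations without batch statistics, which is why it remains stable at batch sizes of 8 or less.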
  • Patent number: 11620357
    Abstract: The present disclosure provides a GPU-based third-order low-rank tensor calculation method. Operation steps of the method include: transmitting, by a CPU, third-order real value tensor input data DATA1 to a GPU; performing, by the GPU, Fourier transforms on the DATA1, to obtain third-order complex value tensor data DATA2; performing, by the GPU, matrix operations on the DATA2, to obtain third-order complex value tensor data DATA3; performing, by the GPU, inverse Fourier transforms on the DATA3, to obtain third-order real value tensor output data DATA4; and transmitting, by the GPU, the DATA4 to the CPU. In the present disclosure, in the third-order low-rank tensor calculation, a computational task with high concurrent processes is accelerated by using the GPU to improve computational efficiency. Compared with conventional CPU-based third-order low-rank tensor calculation, computational efficiency is significantly improved, and the same calculation can be completed in less time.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: April 4, 2023
    Assignee: Tensor Deep Learning Lab L.L.C.
    Inventors: Tao Zhang, Hai Li, Xiaoyang Liu
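The FFT / slice-wise-matmul / inverse-FFT pipeline described above is the standard way to compute a third-order tensor product (t-product). A CPU-side NumPy sketch of the same three steps (illustrative only; the disclosure runs them on the GPU):

```python
import numpy as np

def t_product(A, B):
    """Third-order tensor product via the Fourier domain: FFT along the
    third mode, matrix-multiply the frontal slices, inverse FFT."""
    Af = np.fft.fft(A, axis=2)                  # DATA1 -> DATA2
    Bf = np.fft.fft(B, axis=2)
    Cf = np.einsum('ijk,jlk->ilk', Af, Bf)      # DATA2 -> DATA3 (per slice)
    return np.fft.ifft(Cf, axis=2).real         # DATA3 -> DATA4

A = np.arange(6.0).reshape(2, 3, 1)
B = np.arange(6.0).reshape(3, 2, 1)
C = t_product(A, B)  # with depth 1 this reduces to an ordinary matmul
```

Each frontal-slice multiplication in the Fourier domain is independent, which is exactly the high-concurrency structure that makes the method GPU-friendly.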
  • Patent number: 11610099
    Abstract: Hardware for implementing a Deep Neural Network (DNN) having a convolution layer, the hardware comprising an input buffer configured to provide data windows to a plurality of convolution engines, each data window comprising a single input plane; and each of the plurality of convolution engines being operable to perform a convolution operation by applying a filter to a data window, each filter comprising a set of weights for combination with respective data values of a data window, and each of the plurality of convolution engines comprising: multiplication logic operable to combine a weight of the filter with a respective data value of the data window provided by the input buffer; and accumulation logic configured to accumulate the results of a plurality of combinations performed by the multiplication logic so as to form an output for a respective convolution operation.
    Type: Grant
    Filed: November 6, 2018
    Date of Patent: March 21, 2023
    Assignee: Imagination Technologies Limited
    Inventor: Christopher Martin
  • Patent number: 11604972
    Abstract: Neural processing elements are configured with a hardware AND gate configured to perform a logical AND operation between a sign extend signal and a most significant bit (“MSB”) of an operand. The state of the sign extend signal can be based upon a type of a layer of a deep neural network (“DNN”) that generated the operand. If the sign extend signal is logical FALSE, no sign extension is performed. If the sign extend signal is logical TRUE, a concatenator concatenates the output of the hardware AND gate and the operand, thereby extending the operand from an N-bit unsigned binary value to an N+1 bit signed binary value. The neural processing element can also include another hardware AND gate and another concatenator for processing another operand similarly. The outputs of the concatenators for both operands are provided to a hardware binary multiplier.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: March 14, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Amol A Ambardekar, Boris Bobrov, Kent D. Cedola, Chad Balling McBride, George Petre, Larry Marvin Wall
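The AND-then-concatenate datapath is easy to model bitwise: AND the sign-extend signal with the operand's MSB, place that bit on top of the operand, and reinterpret the N+1 bits as two's complement. A sketch with 8-bit operands assumed (operand taken as 0 <= operand < 2**n_bits):

```python
def extend_operand(operand, sign_extend, n_bits=8):
    """Extend an n-bit operand to n+1 bits as the abstract describes:
    top bit = sign_extend AND MSB, concatenated above the operand, and
    the result read as an (n+1)-bit two's-complement value."""
    msb = (operand >> (n_bits - 1)) & 1
    top = (1 if sign_extend else 0) & msb   # the hardware AND gate
    extended = (top << n_bits) | operand    # the concatenator
    if extended >= 1 << n_bits:             # reinterpret as signed n+1 bits
        extended -= 1 << (n_bits + 1)
    return extended
```

With `sign_extend` FALSE the operand stays an unsigned value; with it TRUE, an operand whose MSB is set becomes the negative value a signed layer intended, so one multiplier serves both signed and unsigned layers.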
  • Patent number: 11604850
    Abstract: A non-destructive memory array implements a full adder. The array includes a column connected by a bit line and a full adder unit. The column stores a first bit in a first row of the bit line, a second bit in a second row of the bit line, and an inverse of a carry-in bit in a third row of the bit line. The full adder unit stores, in the second and third rows of the bit line, a sum bit and a carry-out bit, respectively, that result from adding the first bit, the second bit and the carry-in bit. The full adder unit does not overwrite any of the bits when a full adder table indicates that the sum bit and the carry-out bit are equivalent to the second bit and the carry-in bit.
    Type: Grant
    Filed: January 13, 2020
    Date of Patent: March 14, 2023
    Assignee: GSI Technology Inc.
    Inventors: LeeLean Shu, Avidan Akerib
  • Patent number: 11599334
    Abstract: A device for performing multiply/accumulate operations processes values having a first width, held in first and second buffers, using a computational pipeline with a second width, such as half the first width. A sequencer processes combinations of portions (high-high, low-low, high-low, low-high) of the values in the first and second buffers using a multiply/accumulate circuit and adds the accumulated result of each combination of portions to a group accumulator. Adding to the group accumulator may be preceded by left shifting the accumulated result (by the first width for the high-high combination and by the second width for the low-high and high-low combinations).
    Type: Grant
    Filed: June 9, 2020
    Date of Patent: March 7, 2023
    Assignees: VeriSilicon Microelectronics, VeriSilicon Holdings Co., Ltd.
    Inventors: Mankit Lo, Meng Yue, Jin Zhang
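The shift-and-accumulate arithmetic over the four half-width combinations is just long multiplication in base 2^half. A sketch with 16-bit values split into 8-bit halves assumed:

```python
def wide_multiply(a, b, half=8):
    """Multiply two (2*half)-bit values using only half-width products:
    high-high shifted by the full width, the two cross products shifted
    by the half width, low-low unshifted, all summed in an accumulator."""
    mask = (1 << half) - 1
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask
    acc = (a_hi * b_hi) << (2 * half)   # high-high: shift by first width
    acc += (a_hi * b_lo) << half        # high-low:  shift by second width
    acc += (a_lo * b_hi) << half        # low-high:  shift by second width
    acc += a_lo * b_lo                  # low-low:   no shift
    return acc
```

This is why a pipeline half the operand width suffices: four narrow multiplies and three shifted additions reproduce the full-width product exactly.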
  • Patent number: 11593626
    Abstract: A histogram-based method of selecting a fixed point number format for representing a set of values input to, or output from, a layer of a Deep Neural Network (DNN). The method comprises obtaining a histogram that represents an expected distribution of the set of values of the layer, each bin of the histogram is associated with a frequency value and a representative value in a floating point number format; quantising the representative values according to each of a plurality of potential fixed point number formats; estimating, for each of the plurality of potential fixed point number formats, the total quantisation error based on the frequency values of the histogram and a distance value for each bin that is based on the quantisation of the representative value for that bin; and selecting the fixed point number format associated with the smallest estimated total quantisation error as the optimum fixed point number format for representing the set of values of the layer.
    Type: Grant
    Filed: November 5, 2018
    Date of Patent: February 28, 2023
    Assignee: Imagination Technologies Limited
    Inventors: James Imber, Cagatay Dikici
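The selection step can be sketched as a search over candidate exponents, scoring each by the histogram-weighted distance between bin representative values and their quantised versions. The word length, exponent range, and error metric below are illustrative assumptions, not the patented specifics:

```python
import numpy as np

def best_fixed_point_format(centers, freqs, word_len=8, exponents=range(-8, 8)):
    """Pick the exponent whose fixed-point grid (word_len bits, step
    2**exponent) minimises the histogram-weighted quantisation error."""
    best_e, best_err = None, float('inf')
    qmax = 2 ** (word_len - 1) - 1
    for e in exponents:
        step = 2.0 ** e
        # Quantise each bin's representative value onto this format's grid.
        q = np.clip(np.round(centers / step), -qmax - 1, qmax) * step
        err = np.sum(freqs * np.abs(centers - q))  # distance x frequency
        if err < best_err:
            best_e, best_err = e, err
    return best_e
```

Scoring the histogram instead of the raw values is the efficiency win: the cost scales with the number of bins, not with the number of activations seen for the layer.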
  • Patent number: 11586417
    Abstract: A method of exploiting activation sparsity in deep neural networks is described. The method includes retrieving an activation tensor and a weight tensor where the activation tensor is a sparse activation tensor. The method also includes generating a compressed activation tensor comprising non-zero activations of the activation tensor, where the compressed activation tensor has fewer columns than the activation tensor. The method further includes processing the compressed activation tensor and the weight tensor to generate an output tensor.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: February 21, 2023
    Assignee: Qualcomm Incorporated
    Inventors: Rexford Hill, Aaron Lamb, Michael Goldfarb, Amin Ansari, Christopher Lott
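The compression step can be sketched with NumPy: drop the all-zero columns of the activation tensor, remember which columns survived, and multiply only those against the matching weight rows. A hedged illustration of the idea, not Qualcomm's implementation:

```python
import numpy as np

def compress_activations(act):
    """Drop all-zero columns from a sparse activation matrix, keeping the
    surviving column indices so matching weight rows can be selected."""
    keep = np.flatnonzero(np.any(act != 0, axis=0))
    return act[:, keep], keep

def sparse_matmul(act, weights):
    """Multiply using only the non-zero activation columns; the result
    equals the dense product because zero columns contribute nothing."""
    compressed, keep = compress_activations(act)
    return compressed @ weights[keep, :]

act = np.array([[1.0, 0.0, 2.0, 0.0],
                [0.0, 0.0, 3.0, 0.0]])   # columns 1 and 3 are all zero
w = np.arange(8.0).reshape(4, 2)
out = sparse_matmul(act, w)
```

The compressed tensor has fewer columns, so the hardware performs fewer multiply-accumulates while producing the identical output tensor.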