Patents Examined by Tan V. Mai
  • Patent number: 11989259
    Abstract: Methods, systems, and apparatus for a matrix multiply unit implemented as a systolic array of cells are disclosed. The matrix multiply unit may include cells arranged in columns of the systolic array. Two chains of weight shift registers per column of the systolic array are in the matrix multiply unit. Each weight shift register is connected to only one chain and each cell is connected to only one weight shift register. A weight matrix register per cell is configured to store a weight input received from a weight shift register. A multiply unit is coupled to the weight matrix register and configured to multiply the weight input of the weight matrix register with a vector data input in order to obtain a multiplication result.
    Type: Grant
    Filed: November 10, 2022
    Date of Patent: May 21, 2024
    Assignee: Google LLC
    Inventors: Andrew Everett Phelps, Norman Paul Jouppi
  • Patent number: 11989258
    Abstract: Methods, systems, and apparatus for performing a matrix multiplication using a hardware circuit are described. An example method begins by obtaining an input activation value and a weight input value in a first floating point format. The input activation value and the weight input value are multiplied to generate a product value in a second floating point format that has higher precision than the first floating point format. A partial sum value is obtained in a third floating point format that has a higher precision than the first floating point format. The partial sum value and the product value are combined to generate an updated partial sum value that has the third floating point format.
    Type: Grant
    Filed: November 9, 2020
    Date of Patent: May 21, 2024
    Assignee: Google LLC
    Inventors: Andrew Everett Phelps, Norman Paul Jouppi
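
Illustrative sketch (not taken from the patent): the abstract above describes multiplying two low-precision inputs into a higher-precision product and accumulating into a higher-precision partial sum. The abstract does not name the formats. The code below assumes IEEE 754 half precision for the first format and single precision for the second and third; the function names `to_fp16`, `to_fp32`, `mac_step`, and `dot` are hypothetical.

```python
import struct


def to_fp16(x: float) -> float:
    """Round to the nearest IEEE 754 half-precision value (assumed first format).

    struct raises OverflowError for magnitudes too large for half precision.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round to the nearest IEEE 754 single-precision value (assumed second/third format)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mac_step(activation: float, weight: float, partial_sum: float) -> float:
    """One multiply-accumulate step with low-precision inputs and a wider accumulator."""
    a = to_fp16(activation)          # input activation in the first format
    w = to_fp16(weight)              # weight input in the first format
    product = to_fp32(a * w)         # product held in the wider second format
    return to_fp32(partial_sum + product)  # updated partial sum in the third format


def dot(activations, weights) -> float:
    """Accumulate a dot product one MAC step at a time."""
    total = 0.0
    for a, w in zip(activations, weights):
        total = mac_step(a, w, total)
    return total
```

Keeping the accumulator wider than the inputs limits the rounding error that would otherwise build up over a long chain of additions.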
  • Patent number: 11989257
    Abstract: An apparatus includes a processor and a memory to store instructions. The instructions, when executed by the processor, cause the processor to perform threading of a first matrix along a first dimension of the first matrix and a second dimension of the matrix. The threading represents block sizes of the first matrix to assign to process threads of a multiplication algorithm to determine a third matrix that represents a product of the first matrix and a second matrix. The block sizes include a first block size along the first dimension and a second block size along the second dimension. The second matrix shares the second dimension with the first matrix. The instructions, when executed by the processor, cause the processor to provide data to the multiplication algorithm, which represents the first block size and the second block size.
    Type: Grant
    Filed: October 29, 2020
    Date of Patent: May 21, 2024
    Assignee: Hewlett Packard Enterprise Development LP
    Inventor: Aaron M. Collier
  • Patent number: 11983631
    Abstract: A computer determines a solution to a nonlinear optimization problem. A conjugate gradient (CG) iteration is performed with a first order derivative vector and a second order derivative matrix to update a CG residual vector, an H-conjugate vector, and a residual weight vector. A CG solution vector is updated using a previous CG solution vector, the H-conjugate vector, and the residual weight vector. An eigenvector of the second order derivative matrix having a smallest eigenvalue is computed. A basis matrix is defined that includes a cubic regularization (CR) solution vector, a CR residual vector, the CG solution vector, the CG residual vector, and the eigenvector. A CR iteration is performed to update the CR solution vector. The CR residual vector is updated using the first order derivative vector, the second order derivative matrix, and the updated CR solution vector. The process is repeated until a stop criterion is satisfied.
    Type: Grant
    Filed: November 16, 2023
    Date of Patent: May 14, 2024
    Assignee: SAS INSTITUTE INC.
    Inventors: Wenwen Zhou, Joshua David Griffin, Riadh Omheni, Seyedalireza Yektamaram, Yan Xu
  • Patent number: 11966857
    Abstract: A processing unit to support inference acceleration for machine learning (ML) comprises an inline post processing unit configured to accept and maintain one or more lookup tables for performing a tanh and/or sigmoid operation/function. The inline post processing unit is further configured to accept data from a set of registers configured to maintain output from a processing block instead of streaming the data from an on-chip memory (OCM), perform the tanh and/or sigmoid operation on each element of the data from the processing block on a per-element basis via the one or more lookup tables, and stream post processing result of the per-element tanh and/or sigmoid operation back to the OCM after the tanh and/or sigmoid operation is complete.
    Type: Grant
    Filed: April 6, 2021
    Date of Patent: April 23, 2024
    Assignee: Marvell Asia Pte Ltd
    Inventors: Avinash Sodani, Ulf Hanebutte, Chia-Hsin Chen
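
Illustrative sketch (not taken from the patent): the abstract above describes applying tanh per element through lookup tables. The table size, the clamped input range, and nearest-entry indexing below are assumptions chosen for illustration, and the names `build_tanh_table`, `lut_tanh`, and `postprocess` are hypothetical.

```python
import math

TABLE_SIZE = 256           # assumed number of table entries
X_MIN, X_MAX = -4.0, 4.0   # assumed input range; tanh is nearly flat outside it


def build_tanh_table(size=TABLE_SIZE, lo=X_MIN, hi=X_MAX):
    """Precompute tanh at evenly spaced points between lo and hi."""
    step = (hi - lo) / (size - 1)
    return [math.tanh(lo + i * step) for i in range(size)]


def lut_tanh(x, table, lo=X_MIN, hi=X_MAX):
    """Approximate tanh(x) by the nearest table entry, clamping out-of-range inputs."""
    if x <= lo:
        return table[0]
    if x >= hi:
        return table[-1]
    step = (hi - lo) / (len(table) - 1)
    index = round((x - lo) / step)
    index = min(len(table) - 1, max(0, index))
    return table[index]


def postprocess(values, table):
    """Apply the table lookup to each element of a block of outputs."""
    return [lut_tanh(v, table) for v in values]
```

With 256 entries over [-4, 4], the nearest-entry error stays below about 0.016, since the slope of tanh never exceeds 1.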
  • Patent number: 11954582
    Abstract: Disclosed is a neural network accelerator including a first bit operator generating a first multiplication result by performing multiplication on first feature bits of input feature data and first weight bits of weight data, a second bit operator generating a second multiplication result by performing multiplication on second feature bits of the input feature data and second weight bits of the weight data, an adder generating an addition result by performing addition based on the first multiplication result and the second multiplication result, a shifter shifting a number of digits of the addition result depending on a shift value to generate a shifted addition result, and an accumulator generating output feature data based on the shifted addition result.
    Type: Grant
    Filed: December 21, 2022
    Date of Patent: April 9, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sungju Ryu, Hyungjun Kim, Jae-Joon Kim
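
Illustrative sketch (not taken from the patent): the multiply, add, shift, and accumulate sequence in the abstract above can be mimicked in software by splitting unsigned operands into small bit chunks. The 2-bit chunk width, the unsigned operands, and the names `split_bits` and `bit_split_multiply` are assumptions for illustration only.

```python
def split_bits(value, chunk_bits, num_chunks):
    """Split an unsigned integer into chunks, least significant chunk first."""
    mask = (1 << chunk_bits) - 1
    return [(value >> (i * chunk_bits)) & mask for i in range(num_chunks)]


def bit_split_multiply(feature, weight, chunk_bits=2, num_chunks=2):
    """Multiply two unsigned integers from narrow chunk-by-chunk products."""
    f = split_bits(feature, chunk_bits, num_chunks)
    w = split_bits(weight, chunk_bits, num_chunks)
    accumulator = 0
    for shift_units in range(2 * num_chunks - 1):
        # Partial products whose chunk indices sum to shift_units carry the
        # same significance, so they are added before a single shift.
        addition = sum(
            f[i] * w[shift_units - i]
            for i in range(num_chunks)
            if 0 <= shift_units - i < num_chunks
        )
        shifted = addition << (shift_units * chunk_bits)
        accumulator += shifted
    return accumulator
```

Adding equal-significance partial products before shifting means one shifter serves several narrow multipliers.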
  • Patent number: 11947929
    Abstract: An arithmetic device includes a comparison unit comparing voltage generated with charge stored in a storage unit with a threshold, and outputting an output signal at a timing when the voltage exceeds the threshold, and a timing extension unit extending an interval between timings at each of which the output signal is output.
    Type: Grant
    Filed: July 4, 2019
    Date of Patent: April 2, 2024
    Assignee: SONY CORPORATION
    Inventor: Hiroyuki Yamagishi
  • Patent number: 11941078
    Abstract: Performing set operations using sparse matrix operations offered by a multi-core processing unit (such as a graphics processing unit). The set operation is converted into operand matrices, and sparse matrix operations, foregoing the use of hash tables. The input set is converted into a matrix, a matrix operation corresponding to the set operation is identified, and one or more operands of the set operation are also represented within a matrix. The matrix operation is then performed on these matrices to obtain an output matrix, which is then converted to an output set.
    Type: Grant
    Filed: September 30, 2022
    Date of Patent: March 26, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Ritwik Das
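
Illustrative sketch (not taken from the patent): one way to picture a set operation as a sparse matrix operation is to encode each set as a single sparse row of ones over a sorted universe, so that intersection becomes an element-wise product. The encoding, the sorted-universe assumption, and the names `to_sparse_row`, `multiply_rows`, and `to_items` are hypothetical. The sketch runs on the CPU and does not use any GPU library.

```python
from bisect import bisect_left


def to_sparse_row(items, universe):
    """Encode unique items as sorted column indices of the nonzero entries.

    Assumes `universe` is a sorted list and every item appears in it.
    """
    return sorted(bisect_left(universe, item) for item in items)


def multiply_rows(a_cols, b_cols):
    """Element-wise product of two sparse 0/1 rows via a two-pointer merge."""
    out = []
    i = j = 0
    while i < len(a_cols) and j < len(b_cols):
        if a_cols[i] == b_cols[j]:
            out.append(a_cols[i])  # both entries are 1, so the product is 1
            i += 1
            j += 1
        elif a_cols[i] < b_cols[j]:
            i += 1
        else:
            j += 1
    return out


def to_items(cols, universe):
    """Convert the nonzero column indices of an output row back to elements."""
    return [universe[c] for c in cols]
```

For example, with `universe = list(range(20))`, multiplying the rows for `[9, 3, 5]` and `[5, 11, 9]` leaves nonzeros only at the columns for 5 and 9.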
  • Patent number: 11934481
    Abstract: Embodiments of the present invention disclose a matrix multiplier, and relate to the field of data computing technologies, so as to divide two matrices into blocks for computation. The matrix multiplier includes: a first memory, a second memory, an operation circuit, and a controller, where the operation circuit, the first memory, and the second memory may perform data communication by using a bus; and the controller is configured to control, according to a preset program or instruction, a first matrix and a second matrix to be divided into blocks, and control the operation circuit to perform a multiplication operation on corresponding blocks in the first memory and the second memory based on block division results of the controller. The matrix multiplier may be configured to perform a multiplication operation on two matrices.
    Type: Grant
    Filed: April 20, 2022
    Date of Patent: March 19, 2024
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Hu Liu, Heng Liao, Jiajin Tu, Honghui Yuan, Hou Fun Lam, Fan Zhu
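
Illustrative sketch (not taken from the patent): dividing two matrices into blocks and multiplying corresponding blocks can be shown with a plain software loop nest. The block size and the name `blocked_matmul` are assumptions; the hardware memories, bus, and controller described in the abstract above are not modeled.

```python
def blocked_matmul(a, b, block=2):
    """Multiply matrices (lists of rows) by iterating over block-sized tiles."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, block):            # block row of the first matrix
        for j0 in range(0, m, block):        # block column of the second matrix
            for k0 in range(0, k, block):    # blocks along the shared dimension
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + block, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c
```

The `min` bounds handle edge blocks when a dimension is not a multiple of the block size.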
  • Patent number: 11934308
    Abstract: Techniques for data manipulation using processor cluster address generation are disclosed. One or more processor clusters capable of executing software-initiated work requests are accessed. A plurality of dimensions from a tensor is flattened into a single dimension. A work request address field is parsed, where the address field contains unique address space descriptors for each of the plurality of dimensions, along with a common address space descriptor. A direct memory access (DMA) engine coupled to the one or more processor clusters is configured. Addresses are generated based on the unique address space descriptors and the common address space descriptor. The plurality of dimensions can be summed to generate a single address. Memory is accessed using two or more of the addresses that were generated. The addresses are used to enable DMA access.
    Type: Grant
    Filed: September 29, 2020
    Date of Patent: March 19, 2024
    Inventors: David John Simpson, Stephen Curtis Johnson, Richard Douglas Trauben
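
Illustrative sketch (not taken from the patent): the abstract above mentions per-dimension address space descriptors, a common descriptor, and summing the dimensions into a single address. The code below assumes each per-dimension descriptor is a (count, stride) pair and the common descriptor is a base address; those field layouts and the names `DimDescriptor` and `generate_addresses` are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DimDescriptor:
    """Assumed per-dimension descriptor: element count and byte stride."""
    count: int
    stride: int


def generate_addresses(base, dims):
    """Yield one flat address per tensor element, last dimension varying fastest.

    Each address is the common base plus the sum of the per-dimension offsets.
    """
    for index in product(*(range(d.count) for d in dims)):
        yield base + sum(i * d.stride for i, d in zip(index, dims))
```

For a 2 x 3 tensor of 4-byte elements stored with a 16-byte row pitch, `generate_addresses(0x1000, [DimDescriptor(2, 16), DimDescriptor(3, 4)])` walks both rows in order.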
  • Patent number: 11934965
    Abstract: A processing unit to support inference acceleration for machine learning (ML) comprises an inline post processing unit configured to accept and maintain one or more lookup tables for performing a tanh and/or sigmoid operation/function. The inline post processing unit is further configured to accept data from a set of registers configured to maintain output from a processing block instead of streaming the data from an on-chip memory (OCM), perform the tanh and/or sigmoid operation on each element of the data from the processing block on a per-element basis via the one or more lookup tables, and stream post processing result of the per-element tanh and/or sigmoid operation back to the OCM after the tanh and/or sigmoid operation is complete.
    Type: Grant
    Filed: April 6, 2021
    Date of Patent: March 19, 2024
    Assignee: Marvell Asia Pte Ltd
    Inventors: Avinash Sodani, Ulf Hanebutte, Chia-Hsin Chen
  • Patent number: 11928442
    Abstract: A method related to posit tensor processing can include receiving, by a plurality of multiply-accumulator (MAC) units coupled to one another, a plurality of universal number (unum) or posit bit strings organized in a matrix and to be used as operands in a plurality of respective recursive operations performed using the plurality of MAC units and performing, using the MAC units, the plurality of respective recursive operations. Iterations of the respective recursive operations are performed using at least one bit string that is a same bit string as was used in a preceding iteration of the respective recursive operations. The method can further include prior to receiving the plurality of unum or posit bit strings, performing an operation to organize the plurality of unum or posit bit strings to achieve a threshold bandwidth ratio, a threshold latency, or both during performance of the plurality of respective recursive operations.
    Type: Grant
    Filed: January 3, 2022
    Date of Patent: March 12, 2024
    Assignee: Micron Technology, Inc.
    Inventor: Vijay S. Ramesh
  • Patent number: 11928177
    Abstract: Methods and apparatus for performing video processing matrix operations within a memory fabric. Various embodiments of the present disclosure are directed to converting a memory array into a matrix fabric for discrete cosine transform (DCT) matrix transformations and performing DCT matrix operations therein. Exemplary embodiments described herein perform DCT matrix-matrix multiplication operations within a memory device that includes a matrix fabric and matrix multiplication unit (MMU). In one embodiment, matrix-matrix multiplication operations are obtained using separate matrix-vector products. In one exemplary embodiment, the matrix fabric uses a “crossbar” construction of resistive elements. Each resistive element stores a level of impedance that represents the corresponding matrix coefficient value. The crossbar connectivity can be driven with an electrical signal representing the input vector as an analog voltage.
    Type: Grant
    Filed: September 19, 2022
    Date of Patent: March 12, 2024
    Assignee: Micron Technology, Inc.
    Inventor: Fa-Long Luo
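
Illustrative sketch (not taken from the patent): the abstract above notes that a matrix-matrix product can be obtained from separate matrix-vector products. The digital code below shows that decomposition for an orthonormal DCT-II matrix, which is an assumed choice of DCT variant. It does not model the analog crossbar, and the names `dct_matrix`, `matvec`, and `matmul_by_columns` are hypothetical.

```python
import math


def dct_matrix(n):
    """Build the n x n orthonormal DCT-II coefficient matrix."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        rows.append(
            [scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n)) for j in range(n)]
        )
    return rows


def matvec(matrix, vector):
    """One matrix-vector product, the unit of work a crossbar would perform."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def matmul_by_columns(matrix, x):
    """Compute matrix @ x by feeding the columns of x through matvec one at a time."""
    columns = [matvec(matrix, [row[j] for row in x]) for j in range(len(x[0]))]
    # Each result is one output column; transpose back to a list of rows.
    return [[columns[j][i] for j in range(len(columns))] for i in range(len(matrix))]
```

Applying `matmul_by_columns(dct_matrix(4), block)` to a constant block leaves energy only in the first output row, as expected for a DCT.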
  • Patent number: 11921813
    Abstract: Embodiments relate to a computing system for solving differential equations. The system is configured to receive problem packages corresponding to problems to be solved, each comprising at least a differential equation and a domain, and to select a solver of a plurality of solvers, based upon availability of each of the plurality of solvers. A dispatch computer selects a solver by monitoring the plurality of solvers, and responsive to a solver becoming available, determines if a received problem package having at least a threshold priority level can be solved by the solver. Otherwise, the dispatch computer generates a plurality of solver scenarios each reflecting a permutation of received problem packages assigned to solvers estimated to become available within a threshold period of time, and assigns the problem packages in accordance with a solver scenario having a highest utilization score.
    Type: Grant
    Filed: August 10, 2020
    Date of Patent: March 5, 2024
    Assignee: VORTICITY INC.
    Inventor: Chirath Neranjena Thouppuarachchi
  • Patent number: 11921814
    Abstract: Methods and devices, the method including receiving a matrix of a neural network model; classifying at least a portion of the matrix as a first section based on a first distribution pattern of non-zero elements of the portion of the matrix; and identifying memory addresses of the non-zero elements in the first section of the matrix for loading, according to a first order determined based on the first distribution pattern, the non-zero elements in the first section into one or more vector registers.
    Type: Grant
    Filed: June 14, 2022
    Date of Patent: March 5, 2024
    Assignee: Alibaba Group Holding Limited
    Inventors: Guoyang Chen, Yu Pu, Yongzhi Zhang, Weifeng Zhang, Yuan Xie
  • Patent number: 11921848
    Abstract: The disclosed embodiments relate to a system that characterizes susceptibility of an inferential model to follow signal degradation. During operation, the system receives a set of time-series signals associated with sensors in a monitored system during normal fault-free operation. Next, the system trains the inferential model using the set of time-series signals. The system then characterizes susceptibility of the inferential model to follow signal degradation. During this process, the system adds degradation to a signal in the set of time-series signals to produce a degraded signal. Next, the system uses the inferential model to perform prognostic-surveillance operations on the set of time-series signals with the degraded signal. Finally, the system characterizes susceptibility of the inferential model to follow degradation in the signal based on results of the prognostic-surveillance operations.
    Type: Grant
    Filed: November 2, 2020
    Date of Patent: March 5, 2024
    Assignee: Oracle International Corporation
    Inventors: Zexi Chen, Kenny C. Gross, Ashin George, Guang C. Wang
  • Patent number: 11914670
    Abstract: Methods and systems for compressing a matrix are described. The matrix, having a plurality of rows formed by a respective plurality of vectors, is partitioned into a plurality of submatrices, each submatrix containing sub-vectors from a respective group of one or more contiguous columns of the matrix. For each given submatrix, the sub-vectors are clustered into a plurality of clusters. For each given cluster, a centroid and a variance are computed and stored, based on the sub-vectors belonging to the given cluster. A mapping relating each vector to a respective cluster in each submatrix is stored. The stored centroids, stored variances and stored mapping form a set of compressed data for reconstruction of the matrix.
    Type: Grant
    Filed: September 8, 2020
    Date of Patent: February 27, 2024
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Krtin Kumar, Mehdi Rezagholizadeh, Peyman Passban
  • Patent number: 11915101
    Abstract: In one aspect, a method includes identifying (i) a computational problem that is a candidate for a quantum computation, and (ii) one or more numerical algorithms for solving the candidate computational problem; providing input task data identifying (i) the candidate computational problem, and (ii) the one or more numerical algorithms, to a numerical quantum experimentation system, wherein the numerical quantum experimentation system comprises multiple universal numerics workers, a universal numerics worker, of the multiple universal numerics workers being configured to solve the candidate computational problem using the one or more numerical algorithms; receiving, from the numerical quantum experimentation system, data representing results of the one or more numerical algorithms to solve the candidate computational problem; and determining whether the received data indicates that a quantum computation applied to the candidate computational problem has a greater efficacy at a solution than a classical computat
    Type: Grant
    Filed: November 12, 2021
    Date of Patent: February 27, 2024
    Assignee: Google LLC
    Inventor: Vasil S. Denchev
  • Patent number: 11907686
    Abstract: The present disclosure provides computing apparatuses, methods and software for generating random numbers. Data is received from an instrument characterising macromolecules in a sample, the data including measurement event information relating to measurements of individual macromolecules recorded over time. For each measurement event in a sequence of measurement events in the data, an event timing representative of the duration of event or the time passing between consecutive events is determined. This is compared with a comparator value to generate a binary output, and a bit value is determined based on the binary output. Data representative of a random number is generated by assembling a vector of bit values determined from the event timings in sequence. The determined sequence of event timings for the sequence of measurement events represents a source of entropy extracted by the comparison step to generate the random number.
    Type: Grant
    Filed: August 11, 2023
    Date of Patent: February 20, 2024
    Assignee: Veiovia Limited
    Inventors: Darren Hurley-Smith, Alastair Droop, Remy Lyon, Roxana Iuliana Teodor
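
Illustrative sketch (not taken from the patent): the comparison step in the abstract above can be pictured as thresholding each event timing to obtain one bit, then assembling the bits in sequence. Using the median of the timings as the comparator value and the names `bits_from_event_timings` and `bits_to_int` are assumptions. This toy code is not a vetted random number generator.

```python
from statistics import median


def bits_from_event_timings(timings, comparator=None):
    """Map each event timing to a bit: 1 if it exceeds the comparator, else 0."""
    if comparator is None:
        comparator = median(timings)  # assumed choice of comparator value
    return [1 if t > comparator else 0 for t in timings]


def bits_to_int(bits):
    """Assemble the bit vector, in event order, into an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value
```

A median comparator gives roughly balanced zeros and ones for any timing distribution, which is why it is used here as the default.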
  • Patent number: 11907832
    Abstract: A method includes: providing input information in an electronic format; converting the electronic input information into an optical input vector; optically transforming the optical input vector into an optical output vector based on an optical matrix multiplication; converting the optical output vector into an electronic format; and electronically applying a non-linear transformation to the electronically converted optical output vector to provide output information in an electronic format. For example, a set of input values are encoded on respective optical signals. For each of at least two subsets of optical signals, a copying module splits the subset into multiple copies of the optical signals. For each copy of a first subset of optical signals, a corresponding multiplication module multiplies the optical signals of the first subset by matrix element values using optical amplitude modulation. A summation module produces an electrical signal representing a sum of the results of the multiplication modules.
    Type: Grant
    Filed: April 20, 2020
    Date of Patent: February 20, 2024
    Assignee: Lightelligence PTE. Ltd.
    Inventors: Yichen Shen, Huaiyu Meng, Li Jing, Rumen Dangovski, Peng Xie, Matthew Khoury, Cheng-Kuan Lu, Ronald Gagnon, Maurice Steinman, Jianhua Wu, Arash Hosseinzadeh