Patents Examined by Tan V. Mai
  • Patent number: 12045616
    Abstract: In some examples, a circuit includes an interface configured to couple to a memory that includes a set of outputs to provide a set of data from the memory. The circuit further includes a rotator coupled to the interface that includes a first set of multiplexors that each include a set of inputs coupled to the set of outputs of the interface and an output. The circuit further includes a storage circuit coupled to the rotator that includes a register file coupled to the outputs of the first set of multiplexors and an alignment network. The alignment network includes a second set of multiplexors that each include a set of inputs coupled to the register file and an output.
    Type: Grant
    Filed: March 8, 2021
    Date of Patent: July 23, 2024
    Assignee: Texas Instruments Incorporated
    Inventors: Jonathan (Son) Hung Tran, Joseph Raymond Michael Zbiciak
  • Patent number: 12038999
    Abstract: A method for performing a fast Fourier transform. The bin spreading effect of conventional FFT methodology may be removed by a mathematical technique that relies on an incomplete replacement of the input data sequence. In the present approach this replacement is accomplished by a “round robin” method. In this approach no window function is required and the FFT calculation proceeds after each new sample is added in round-robin fashion to the input sequence. The resulting output bins from the FFT show the signal evolution with time, overlapping as in the known art but by a single sample. The output of an FFT so constructed is not time invariant, but rather there is a rotation present in each output bin when viewed as an analytical signal. This rotation is predictable and hence removable, but is also exploitable as a means to remove the bin spillover.
    Type: Grant
    Filed: December 13, 2022
    Date of Patent: July 16, 2024
    Assignee: SiliconIntervention Inc.
    Inventor: A. Martin Mallinson
  • Patent number: 12026221
    Abstract: Using an attributes model of a time series forecasting model, determine a set of features based on time series data, the set of features including periodic components. The time series data may be divided into a set of segments. For each segment of the set of segments, a weight may be assigned using an age of the segment, resulting in a set of weighted segments of time series data. Using a trend detection model of the time series forecasting model, trend data from the set of weighted segments of time series data may be determined. A time series forecast may be generated by combining the set of features and the trend data.
    Type: Grant
    Filed: February 22, 2023
    Date of Patent: July 2, 2024
    Assignee: Snowflake Inc.
    Inventors: Michel Adar, Boxin Jiang, Qiming Jiang, John Reumann, Boyu Wang, Jiaxun Wu
  • Patent number: 12026478
    Abstract: In a multiply accumulate (MAC) unit, an accumulator may be implemented in two or more stages. For example, a first accumulator may accumulate products from the multiplier of the MAC unit, and a second accumulator may periodically accumulate the running total of the first accumulator. Each time the first accumulator's running total is accumulated by the second accumulator, the first accumulator may be initialized to begin a new accumulation period. In one embodiment, the number of values accumulated by the first accumulator within an accumulation period may be a user-adjustable parameter. In one embodiment, the bit width of the input of the second accumulator may be greater than the bit width of the output of the first accumulator. In another embodiment, an adder may be shared between the first and second accumulators, and a multiplexor may switch the accumulation operations between the first and second accumulators.
    Type: Grant
    Filed: January 9, 2024
    Date of Patent: July 2, 2024
    Assignee: Recogni Inc.
    Inventors: Jian hui Huang, Gary S. Goldman
  • Patent number: 12014266
    Abstract: A cognitive modeling system uses a cognitive model to efficiently execute a variety of tasks over large datasets. The cognitive modeling system receives an input dataset and a query specifying a task to execute in relation to the input dataset. The cognitive modeling system determines an amount of similarity between each child node of the cognitive model and one or more of the input dataset and the query, selects a particular child node with the most determined amount of similarity, and executes the task using the particular child node. The task execution includes searching the particular child node for a connected set of neurons that match a particular part of the input dataset by a threshold amount, and applying an output that is associated with the connected set of neurons to the particular part of the input dataset.
    Type: Grant
    Filed: March 7, 2024
    Date of Patent: June 18, 2024
    Assignee: Illuscio, Inc.
    Inventors: Kevin Edward Dean, Joseph Nordling
  • Patent number: 12014272
    Abstract: A circuit for performing neural network computations for a neural network comprising a plurality of layers, the circuit comprising: activation circuitry configured to receive a vector of accumulated values and configured to apply a function to each accumulated value to generate a vector of activation values; and normalization circuitry coupled to the activation circuitry and configured to generate a respective normalized value from each activation value.
    Type: Grant
    Filed: March 1, 2023
    Date of Patent: June 18, 2024
    Assignee: Google LLC
    Inventors: Gregory Michael Thorson, Christopher Aaron Clark, Dan Luu
  • Patent number: 12008069
    Abstract: In a system with control logic and a processing element array, two modes of operation may be provided. In the first mode of operation, the control logic may configure the system to perform matrix multiplication or 1×1 convolution. In the second mode of operation, the control logic may configure the system to perform 3×3 convolution. The processing element array may include an array of processing elements. Each of the processing elements may be configured to compute the dot product of two vectors in a single clock cycle, and further may accumulate the dot products that are sequentially computed over time.
    Type: Grant
    Filed: November 29, 2023
    Date of Patent: June 11, 2024
    Assignee: Recogni Inc.
    Inventors: Jian hui Huang, Gary S. Goldman
  • Patent number: 12005802
    Abstract: A bilevel coordinated optimization method for fixed and mobile charging facilities on highways includes: constructing an optimization model framework which includes an upper-layer coordinated location optimization model and a lower-layer coordinated capacity optimization model, where the upper-layer coordinated location optimization model is used to optimize locations of charging stations and determine locations and time points of charging demands that require truck mobile charger (TMC) deployment, while the lower-layer coordinated capacity optimization model is used to optimize TMC and fixed charger (FC) capacities at candidate sites, improving a utilization rate of FCs; and performing equivalent linearization on a nonlinear problem using a big-M method and converting the problem into a mixed-integer linear programming model, and implementing a data exchange process between upper and lower layers using analytical target cascading.
    Type: Grant
    Filed: December 21, 2023
    Date of Patent: June 11, 2024
    Assignee: Tianjin University
    Inventors: Hongjie Jia, Kecheng He, Yunfei Mu, Xiaodan Yu, Xiaohong Dong, Xiandong Xu
  • Patent number: 12009843
    Abstract: A matrix compression/decompression accelerator (MCA) system/method that coordinates lossless data compression (LDC) and lossless data decompression (LDD) transfers between an external data memory (EDM) and a local data memory (LDM) is disclosed. The system implements LDC using a 2D-to-1D transformation of 2D uncompressed data blocks (2DU) within LDM to generate 1D uncompressed data blocks (1DU). The 1DU is then compressed to generate a 1D compressed superblock (CSB) in LDM. This LDM CSB may then be written to EDM with a reduced number of EDM bus cycles. The system implements LDD using decompression of CSB data retrieved from EDM to generate a 1D decompressed data block (1DD) in LDM. A 1D-to-2D transformation is then applied to the LDM 1DD to generate a 2D decompressed data block (2DD) in LDM. This 2DD may then be operated on by a matrix compute engine (MCE) using a variety of function operators.
    Type: Grant
    Filed: March 9, 2021
    Date of Patent: June 11, 2024
    Assignee: Texas Instruments Incorporated
    Inventors: Arthur John Redfern, Dan Wang
  • Patent number: 12001946
    Abstract: Systems and methods that include: providing input information in an electronic format; converting at least a part of the electronic input information into an optical input vector; optically transforming the optical input vector into an optical output vector based on an optical matrix multiplication; converting the optical output vector into an electronic format; and electronically applying a non-linear transformation to the electronically converted optical output vector to provide output information in an electronic format. In some examples, a set of multiple input values are encoded on respective optical signals carried by optical waveguides. For each of at least two subsets of one or more optical signals, a corresponding set of one or more copying modules splits the subset of one or more optical signals into two or more copies of the optical signals.
    Type: Grant
    Filed: April 20, 2020
    Date of Patent: June 4, 2024
    Assignee: Lightelligence PTE. Ltd.
    Inventors: Yichen Shen, Huaiyu Meng, Li Jing, Rumen Dangovski, Peng Xie, Matthew Khoury, Cheng-Kuan Lu, Ronald Gagnon, Maurice Steinman, Jianhua Wu, Arash Hosseinzadeh
  • Patent number: 11995569
    Abstract: A processing unit to support inference acceleration for machine learning (ML) comprises an inline post processing unit configured to accept and maintain one or more lookup tables for performing a tanh and/or sigmoid operation/function. The inline post processing unit is further configured to accept data from a set of registers configured to maintain output from a processing block instead of streaming the data from an on-chip memory (OCM), perform the tanh and/or sigmoid operation on each element of the data from the processing block on a per-element basis via the one or more lookup tables, and stream the post processing result of the per-element tanh and/or sigmoid operation back to the OCM after the tanh and/or sigmoid operation is complete.
    Type: Grant
    Filed: April 6, 2021
    Date of Patent: May 28, 2024
    Assignee: Marvell Asia Pte Ltd
    Inventors: Avinash Sodani, Ulf Hanebutte, Chia-Hsin Chen
  • Patent number: 11995149
    Abstract: A processing system includes a first set and a second set of general-purpose registers (GPRs) and memory access circuitry that fetches nonzero values of a sparse matrix into consecutive slots in the first set. The memory access circuitry also fetches values of an expanded matrix into consecutive slots in the second set of GPRs. The expanded matrix is formed based on values of a vector and locations of the nonzero values in the sparse matrix. The processing system also includes a set of multipliers that concurrently perform multiplication of the nonzero values in slots of the first set of GPRs with the values of the vector in corresponding slots of the second set. Reduced sum circuitry accumulates results from the set of multipliers for rows of the sparse matrix.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: May 28, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sateesh Lagudu, Allen H. Rush, Michael Mantor
  • Patent number: 11989259
    Abstract: Methods, systems, and apparatus for a matrix multiply unit implemented as a systolic array of cells are disclosed. The matrix multiply unit may include cells arranged in columns of the systolic array. Two chains of weight shift registers per column of the systolic array are in the matrix multiply unit. Each weight shift register is connected to only one chain and each cell is connected to only one weight shift register. A weight matrix register per cell is configured to store a weight input received from a weight shift register. A multiply unit is coupled to the weight matrix register and configured to multiply the weight input of the weight matrix register with a vector data input in order to obtain a multiplication result.
    Type: Grant
    Filed: November 10, 2022
    Date of Patent: May 21, 2024
    Assignee: Google LLC
    Inventors: Andrew Everett Phelps, Norman Paul Jouppi
  • Patent number: 11989257
    Abstract: An apparatus includes a processor and a memory to store instructions. The instructions, when executed by the processor, cause the processor to perform threading of a first matrix along a first dimension of the first matrix and a second dimension of the first matrix. The threading represents block sizes of the first matrix to assign to process threads of a multiplication algorithm to determine a third matrix that represents a product of the first matrix and a second matrix. The block sizes include a first block size along the first dimension and a second block size along the second dimension. The second matrix shares the second dimension with the first matrix. The instructions, when executed by the processor, cause the processor to provide data to the multiplication algorithm, which represents the first block size and the second block size.
    Type: Grant
    Filed: October 29, 2020
    Date of Patent: May 21, 2024
    Assignee: Hewlett Packard Enterprise Development LP
    Inventor: Aaron M. Collier
  • Patent number: 11989258
    Abstract: Methods, systems, and apparatus for performing a matrix multiplication using a hardware circuit are described. An example method begins by obtaining an input activation value and a weight input value in a first floating point format. The input activation value and the weight input value are multiplied to generate a product value in a second floating point format that has higher precision than the first floating point format. A partial sum value is obtained in a third floating point format that has a higher precision than the first floating point format. The partial sum value and the product value are combined to generate an updated partial sum value that has the third floating point format.
    Type: Grant
    Filed: November 9, 2020
    Date of Patent: May 21, 2024
    Assignee: Google LLC
    Inventors: Andrew Everett Phelps, Norman Paul Jouppi
  • Patent number: 11983631
    Abstract: A computer determines a solution to a nonlinear optimization problem. A conjugate gradient (CG) iteration is performed with a first order derivative vector and a second order derivative matrix to update a CG residual vector, an H-conjugate vector, and a residual weight vector. A CG solution vector is updated using a previous CG solution vector, the H-conjugate vector, and the residual weight vector. An eigenvector of the second order derivative matrix having a smallest eigenvalue is computed. A basis matrix is defined that includes a cubic regularization (CR) solution vector, a CR residual vector, the CG solution vector, the CG residual vector, and the eigenvector. A CR iteration is performed to update the CR solution vector. The CR residual vector is updated using the first order derivative vector, the second order derivative matrix, and the updated CR solution vector. The process is repeated until a stop criterion is satisfied.
    Type: Grant
    Filed: November 16, 2023
    Date of Patent: May 14, 2024
    Assignee: SAS INSTITUTE INC.
    Inventors: Wenwen Zhou, Joshua David Griffin, Riadh Omheni, Seyedalireza Yektamaram, Yan Xu
  • Patent number: 11966857
    Abstract: A processing unit to support inference acceleration for machine learning (ML) comprises an inline post processing unit configured to accept and maintain one or more lookup tables for performing a tanh and/or sigmoid operation/function. The inline post processing unit is further configured to accept data from a set of registers configured to maintain output from a processing block instead of streaming the data from an on-chip memory (OCM), perform the tanh and/or sigmoid operation on each element of the data from the processing block on a per-element basis via the one or more lookup tables, and stream the post processing result of the per-element tanh and/or sigmoid operation back to the OCM after the tanh and/or sigmoid operation is complete.
    Type: Grant
    Filed: April 6, 2021
    Date of Patent: April 23, 2024
    Assignee: Marvell Asia Pte Ltd
    Inventors: Avinash Sodani, Ulf Hanebutte, Chia-Hsin Chen
  • Patent number: 11954582
    Abstract: Disclosed is a neural network accelerator including a first bit operator generating a first multiplication result by performing multiplication on first feature bits of input feature data and first weight bits of weight data, a second bit operator generating a second multiplication result by performing multiplication on second feature bits of the input feature data and second weight bits of the weight data, an adder generating an addition result by performing addition based on the first multiplication result and the second multiplication result, a shifter shifting a number of digits of the addition result depending on a shift value to generate a shifted addition result, and an accumulator generating output feature data based on the shifted addition result.
    Type: Grant
    Filed: December 21, 2022
    Date of Patent: April 9, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sungju Ryu, Hyungjun Kim, Jae-Joon Kim
  • Patent number: 11947929
    Abstract: An arithmetic device includes a comparison unit comparing voltage generated with charge stored in a storage unit with a threshold, and outputting an output signal at a timing when the voltage exceeds the threshold, and a timing extension unit extending an interval between timings at each of which the output signal is output.
    Type: Grant
    Filed: July 4, 2019
    Date of Patent: April 2, 2024
    Assignee: SONY CORPORATION
    Inventor: Hiroyuki Yamagishi
  • Patent number: 11941078
    Abstract: Performing set operations using sparse matrix operations offered by a multi-core processing unit (such as a graphics processing unit). The set operation is converted into operand matrices and sparse matrix operations, forgoing the use of hash tables. The input set is converted into a matrix, a matrix operation corresponding to the set operation is identified, and one or more operands of the set operation are also represented within a matrix. The matrix operation is then performed on these matrices to obtain an output matrix, which is then converted to an output set.
    Type: Grant
    Filed: September 30, 2022
    Date of Patent: March 26, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Ritwik Das
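
The abstract of the last entry above (patent 11941078) describes converting sets into matrices, running a matrix operation that corresponds to the set operation, and converting the output matrix back into a set. The following is only a rough illustration of that general idea, not the patented method: it assumes each set is encoded as a sparse 1 x N indicator row stored as a plain dictionary, and every function name here is hypothetical. It runs on the CPU in pure Python, whereas the abstract refers to sparse matrix operations offered by a multi-core processing unit.

```python
# Illustrative sketch only. All names below are hypothetical and do not
# come from the patent. A set becomes a sparse indicator row: {column: 1}.

def build_vocabulary(*sets):
    """Assign a column index to every distinct element across the sets."""
    ordered = sorted(set().union(*sets))
    return {element: column for column, element in enumerate(ordered)}

def to_sparse_row(items, vocabulary):
    """Encode a set as a sparse indicator row (only nonzero columns stored)."""
    return {vocabulary[item]: 1 for item in items}

def sparse_multiply(a, b):
    """Element-wise product: nonzero only where both rows are nonzero,
    which corresponds to set intersection."""
    return {col: a[col] * b[col] for col in a if col in b}

def sparse_add_clipped(a, b):
    """Element-wise sum clipped to 1, which corresponds to set union."""
    out = dict(a)
    for col in b:
        out[col] = 1
    return out

def sparse_subtract_product(a, b):
    """a minus the element-wise product a*b, which corresponds to
    set difference (entries that would become zero are not stored)."""
    return {col: 1 for col in a if col not in b}

def to_set(row, vocabulary):
    """Convert a sparse indicator row back into a set of elements."""
    inverse = {column: element for element, column in vocabulary.items()}
    return {inverse[col] for col in row}

left = {1, 2, 3, 5}
right = {2, 3, 7}
vocab = build_vocabulary(left, right)
row_left = to_sparse_row(left, vocab)
row_right = to_sparse_row(right, vocab)

intersection = to_set(sparse_multiply(row_left, row_right), vocab)
union = to_set(sparse_add_clipped(row_left, row_right), vocab)
difference = to_set(sparse_subtract_product(row_left, row_right), vocab)
```

In a real setting the dictionary rows would likely be replaced by a sparse matrix format from a numerical library so the element-wise operations can run in parallel; the dictionary form is used here only to keep the sketch self-contained.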