Patents Examined by Tan V. Mai
-
Patent number: 12353505
Abstract: Methods and apparatus for performing diversity matrix operations within a memory fabric. Various embodiments of the present disclosure are directed to converting a memory array into a matrix fabric for spatial diversity-related matrix transformations and performing matrix operations therein. Exemplary embodiments described herein perform MIMO-related matrix transformations (e.g., precoding, beamforming, or data recovery matrix operations) within a memory device that includes a matrix fabric and matrix multiplication unit (MMU). In one variant, the matrix fabric uses a "crossbar" construction of resistive elements. Each resistive element stores a level of impedance that represents the corresponding matrix coefficient value. The crossbar connectivity can be driven with an electrical signal representing the input vector as an analog voltage. The resulting signals can be converted from analog voltages to digital values by an MMU to yield a matrix-vector product.
Type: Grant
Filed: November 6, 2023
Date of Patent: July 8, 2025
Assignee: Micron Technology, Inc.
Inventor: Fa-Long Luo
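The analog matrix-vector idea in the abstract can be sketched in a few lines: each crossbar cell holds a conductance (the matrix coefficient), driving the columns with input voltages makes each row accumulate a current equal to a dot product, and an ADC digitizes the result. All names, values, and the ADC model below are illustrative, not from the patent.

```python
# Illustrative sketch (not the patented circuit): a resistive crossbar
# computes y = G @ v in the analog domain. Driving column j with voltage
# v[j] makes row i accumulate current sum_j g[i][j] * v[j]; the MMU's
# ADC stage is modeled here as rounding to a fixed quantization step.

def crossbar_mvm(conductance, voltages, adc_step=0.01):
    """Analog matrix-vector product with a crude ADC quantization model."""
    currents = [
        sum(g * v for g, v in zip(row, voltages))
        for row in conductance
    ]
    # Model the analog-to-digital conversion as rounding to the ADC step.
    return [round(i / adc_step) * adc_step for i in currents]

G = [[0.1, 0.2],
     [0.3, 0.4]]
v = [1.0, 2.0]
print(crossbar_mvm(G, v))  # approximately [0.5, 1.1]
```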
-
Patent number: 12339923
Abstract: A circuit comprises an input register configured to receive an input vector of elements, a control register configured to receive a control vector of elements, wherein each element of the control vector corresponds to a respective element of the input vector, and wherein each element specifies a permutation of a corresponding element of the input vector, and a permute execution circuit configured to generate an output vector of elements corresponding to a permutation of the input vector. Generating each element of the output vector comprises accessing, at the input register, a particular element of the input vector, accessing, at the control register, a particular element of the control vector corresponding to the particular element of the input vector, and outputting the particular element of the input vector as an element at a particular position of the output vector that is selected based on the particular element of the control vector.
Type: Grant
Filed: September 1, 2023
Date of Patent: June 24, 2025
Assignee: Google LLC
Inventors: Dong Hyuk Woo, Gregory Michael Thorson, Andrew Everett Phelps, Olivier Temam, Jonathan Ross, Christopher Aaron Clark
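One plausible reading of the abstract is a scatter-style permute: control element i names the output position that receives input element i. A minimal sketch under that assumption (the circuit's actual datapath may differ):

```python
# Illustrative sketch (not the patented circuit): each element control[i]
# of the control vector selects the output position that receives the
# corresponding input element x[i].

def permute(x, control):
    out = [None] * len(x)
    for value, position in zip(x, control):
        out[position] = value
    return out

print(permute(['a', 'b', 'c', 'd'], [2, 0, 3, 1]))  # ['b', 'd', 'a', 'c']
```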
-
Patent number: 12339924
Abstract: A data operation device is disclosed. The data operation device comprises at least one memory configured to store a first data set represented as a first sparse matrix and a second data set represented as a second matrix, a vector unit configured to perform a row-wise product-based matrix multiplication operation based on the first sparse matrix and the second matrix and output a third data set represented as a third matrix, and a memory load unit configured to load into the vector unit first vector data associated with a row of the first sparse matrix from the first data set, and second vector data associated with a row of the second matrix that corresponds to an order of non-zero vector elements included in the first vector data from the second data set.
Type: Grant
Filed: November 12, 2024
Date of Patent: June 24, 2025
Assignee: REBELLIONS INC.
Inventor: Minhoo Kang
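The row-wise product formulation the abstract describes resembles Gustavson-style sparse-dense multiplication: for each non-zero A[i][k], row k of B is loaded, scaled, and accumulated into row i of C, so only the B rows matching the non-zero columns are ever fetched. A minimal sketch with illustrative data:

```python
# Sketch of row-wise-product sparse-dense multiplication: for each
# non-zero A[i][k], scale row k of B by that value and accumulate into
# row i of C. Only rows of B matching non-zero columns are touched.

def rowwise_spmm(sparse_rows, B, n_cols):
    # sparse_rows: list of rows, each a list of (col_index, value) pairs.
    C = [[0.0] * n_cols for _ in sparse_rows]
    for i, row in enumerate(sparse_rows):
        for k, a in row:          # non-zero elements of row i of A
            for j in range(n_cols):
                C[i][j] += a * B[k][j]
    return C

A = [[(1, 2.0)],                  # row 0: single non-zero at column 1
     [(0, 1.0), (2, 3.0)]]        # row 1: non-zeros at columns 0 and 2
B = [[1.0, 0.0],
     [0.0, 1.0],
     [2.0, 2.0]]
print(rowwise_spmm(A, B, 2))      # [[0.0, 2.0], [7.0, 6.0]]
```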
-
Patent number: 12321743
Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.
Type: Grant
Filed: October 6, 2023
Date of Patent: June 3, 2025
Assignee: NVIDIA Corporation
Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman
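The align-then-accumulate dot product in the abstract can be sketched numerically: each partial product of significands is shifted into a common fixed-point frame using the operands' exponents, then summed with an integer adder. Bit widths, the output scaling, and the operand encoding below are illustrative assumptions, not the patented datapath.

```python
# Rough sketch of align-then-accumulate: compute partial products of
# significands exactly, shift each into a common fixed-point frame based
# on the operand exponents, and sum with an integer adder.

def aligned_dot(pairs, frac_bits=30):
    """pairs: list of ((mant, exp), (mant, exp)) operand pairs, each
    value representing mant * 2**exp."""
    acc = 0
    for (ma, ea), (mb, eb) in pairs:
        product_mant = ma * mb               # exact partial product
        shift = ea + eb + frac_bits          # align to a 2**-frac_bits grid
        acc += product_mant << shift if shift >= 0 else product_mant >> -shift
    return acc / (1 << frac_bits)            # back to a real value

# (1.5 * 2.0) + (0.5 * 4.0) = 5.0
print(aligned_dot([((3, -1), (2, 0)), ((1, -1), (4, 0))]))  # 5.0
```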
-
Patent number: 12321712
Abstract: Provided is a method for normalizing embeddings for cross-embedding alignment. The method may include applying mean centering to the at least one embedding set, applying spectral normalization to the at least one embedding set, and/or applying length normalization to the at least one embedding set. Spectral normalization may include decomposing the at least one embedding set, determining an average singular value of the at least one embedding set, determining a respective substitute singular value for each respective singular value of a diagonal matrix, and/or replacing the at least one embedding set with a product of the at least one embedding set, a right singular vector, and an inverse of the substitute diagonal matrix. The mean centering, spectral normalization, and/or length normalization may be iteratively repeated for a configurable number of iterations. A system and computer program product are also disclosed.
Type: Grant
Filed: December 6, 2023
Date of Patent: June 3, 2025
Assignee: Visa International Service Association
Inventors: Yan Zheng, Michael Yeh, Junpeng Wang, Wei Zhang, Liang Wang, Hao Yang, Prince Osei Aboagye
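A partial sketch of the iterated pipeline follows, covering only the mean-centering and length-normalization steps; the spectral step (SVD-based singular-value substitution) is omitted here because it requires a full SVD. The data and iteration count are illustrative.

```python
# Partial sketch: the mean-centering and length-normalization steps from
# the abstract, iterated for a configurable number of iterations. The
# spectral-normalization step is intentionally omitted (it needs an SVD).
import math

def center_and_normalize(embeddings, iterations=2):
    E = [row[:] for row in embeddings]
    n, d = len(E), len(E[0])
    for _ in range(iterations):
        # Mean centering: subtract the column-wise mean from every row.
        means = [sum(E[i][j] for i in range(n)) / n for j in range(d)]
        E = [[E[i][j] - means[j] for j in range(d)] for i in range(n)]
        # Length normalization: scale each row to unit Euclidean norm.
        E = [[v / math.sqrt(sum(x * x for x in row)) for v in row]
             for row in E]
    return E

E = center_and_normalize([[1.0, 2.0], [3.0, 5.0], [6.0, 4.0]])
print([round(sum(x * x for x in row), 6) for row in E])  # [1.0, 1.0, 1.0]
```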
-
Patent number: 12321713
Abstract: Methods and apparatuses enable a general-purpose low power analog vector-matrix multiplier. A switched capacitor matrix multiplier may comprise a plurality of successive approximation registers (SARs) operating in parallel, each SAR having a SAR digital output; and a plurality of Analog Multiply-and-Accumulate (MAC) units for multiplying, accumulating, and scaling bit-wise products of a digital weight matrix with a digital input vector, wherein each MAC unit is connected in series to a SAR of the plurality of SARs.
Type: Grant
Filed: October 31, 2023
Date of Patent: June 3, 2025
Assignee: Reconceive AI, Inc.
Inventor: Behdad Youssefi
-
Patent number: 12321718
Abstract: The present disclosure provides computing apparatuses, methods and software for generating random numbers. Data is received from an instrument characterising macromolecules in a sample, the data including measurement event information relating to measurements of individual macromolecules recorded over time. For each measurement event in a sequence of measurement events in the data, an event timing representative of the duration of the event or the time passing between consecutive events is determined. This is compared with a comparator value to generate a binary output, and a bit value is determined based on the binary output. Data representative of a random number is generated by assembling a vector of bit values determined from the event timings in sequence. The determined sequence of event timings for the sequence of measurement events represents a source of entropy extracted by the comparison step to generate the random number.
Type: Grant
Filed: January 12, 2024
Date of Patent: June 3, 2025
Assignee: Veiovia Limited
Inventors: Darren Hurley-Smith, Alastair Droop, Remy Lyon, Roxana Iuliana Teodor
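The entropy-extraction step can be sketched directly: compare each inter-event interval against a comparator value and emit one bit per comparison, then assemble the bits into a number. The choice of the median interval as comparator and the sample timings are illustrative assumptions.

```python
# Sketch of the comparison step: each inter-event interval is compared
# to a comparator value (here, the median interval, an assumption) to
# produce one bit; the bit vector is then assembled into an integer.
import statistics

def timings_to_random_number(event_times):
    intervals = [b - a for a, b in zip(event_times, event_times[1:])]
    comparator = statistics.median(intervals)
    bits = [1 if dt > comparator else 0 for dt in intervals]
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value, bits

value, bits = timings_to_random_number([0.0, 1.2, 1.5, 3.9, 4.0, 6.5])
print(bits)   # [0, 0, 1, 0, 1]
print(value)  # 5
```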
-
Patent number: 12314837
Abstract: Provided are systems, methods, and integrated circuits for neural network processing. In various implementations, an integrated circuit for neural network processing can include a plurality of memory banks storing weight values for a neural network. The memory banks can be on the same chip as an array of processing engines. Upon receiving input data, the circuit can be configured to use the set of weight values to perform a task defined for the neural network. Performing the task can include reading weight values from the memory banks, inputting the weight values into the array of processing engines, and computing a result using the array of processing engines, where the result corresponds to an outcome of performing the task.
Type: Grant
Filed: June 22, 2023
Date of Patent: May 27, 2025
Assignee: Amazon Technologies, Inc.
Inventors: Randy Huang, Ron Diamant
-
Patent number: 12293163
Abstract: In a multiply accumulate (MAC) unit, an accumulator may be implemented in two or more stages. For example, a first accumulator may accumulate products from the multiplier of the MAC unit, and a second accumulator may periodically accumulate the running total of the first accumulator. Each time the first accumulator's running total is accumulated by the second accumulator, the first accumulator may be initialized to begin a new accumulation period. In one embodiment, the number of values accumulated by the first accumulator within an accumulation period may be a user-adjustable parameter. In one embodiment, the bit width of the input of the second accumulator may be greater than the bit width of the output of the first accumulator. In another embodiment, an adder may be shared between the first and second accumulators, and a multiplexor may switch the accumulation operations between the first and second accumulators.
Type: Grant
Filed: May 31, 2024
Date of Patent: May 6, 2025
Assignee: Recogni Inc.
Inventors: Jian hui Huang, Gary S. Goldman
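The two-stage scheme can be sketched behaviorally: a narrow first accumulator totals a fixed number of products, then flushes its running total into a wider second accumulator and resets. The `period` parameter below models the user-adjustable accumulation period; bit widths are not modeled.

```python
# Behavioral sketch of two-stage accumulation: the first accumulator
# totals `period` products, then its running total is absorbed by the
# second accumulator and the first is reinitialized for a new period.

def two_stage_mac(products, period=4):
    first, second = 0, 0
    count = 0
    for p in products:
        first += p                 # stage 1: narrow, fast accumulator
        count += 1
        if count == period:
            second += first        # stage 2: wide accumulator takes over
            first, count = 0, 0
    return second + first          # fold in any partial final period

products = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(two_stage_mac(products))     # 45, same as sum(products)
```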
-
Patent number: 12292946
Abstract: A method for implementing formal verification of an optimized multiplier via symbolic computer algebra (SCA)-satisfiability (SAT) synergy includes: 1) systematically recovering, by a reverse engineering algorithm, an adder tree from an optimized multiplier; 2) generating, by a constraint satisfaction algorithm, a reference multiplier only by using an adder based on a constraint condition; and 3) combining, by an SCA-based and SAT-based verification method, complementary advantages of SCA and SAT. In the verification framework, the method introduces a reference multiplier generator for generating a correct reference multiplier. The correct reference multiplier has both a structure similar to a structure of the optimized multiplier and a clear adder boundary. The clear adder boundary allows proving correctness of the correct reference multiplier through SCA-based verification.
Type: Grant
Filed: December 4, 2024
Date of Patent: May 6, 2025
Assignee: SHANGHAITECH UNIVERSITY
Inventors: Rui Li, Lin Li, Yajun Ha
-
Patent number: 12282749
Abstract: An integrated circuit comprising a plurality of MAC pipelines, wherein each MAC pipeline includes: (i) a plurality of MACs connected in series and (ii) a plurality of data paths including an accumulation data path, wherein each MAC includes a multiplier to generate product data and an accumulator to generate sum data. The integrated circuit further comprises a plurality of control/configure circuits, wherein each control/configure circuit connects directly to and is associated with a MAC pipeline, wherein each control/configure circuit includes an accumulation data path which is configurable to directly connect to the accumulation data path of the MAC pipeline to form an accumulation ring when the control/configure circuit is configured in an accumulation mode, and an output data path configurable to directly connect to the output of the accumulation data path of the MAC pipeline when the control/configure circuit is configured in an output data mode.
Type: Grant
Filed: August 2, 2021
Date of Patent: April 22, 2025
Assignee: Analog Devices, Inc.
Inventors: Frederick A. Ware, Cheng C. Wang
-
Patent number: 12282750
Abstract: A processing-in-memory (PIM) device may include a plurality of memory banks configured to provide plural groups of weight data, a global buffer configured to provide plural sets of vector data, and a plurality of multiplication/accumulation (MAC) operators configured to perform MAC operations of the plural groups of weight data and the plural sets of vector data. Each of the plurality of MAC operators includes a plurality of multiple operation circuits. Each of the plurality of multiple operation circuits is configured to perform an arithmetic operation in a first operation mode, a second operation mode, or a third operation mode according to first to third selection signals.
Type: Grant
Filed: October 11, 2021
Date of Patent: April 22, 2025
Assignee: SK hynix Inc.
Inventor: Choung Ki Song
-
Patent number: 12277499
Abstract: A circuit for performing neural network computations for a neural network comprising a plurality of layers, the circuit comprising: activation circuitry configured to receive a vector of accumulated values and configured to apply a function to each accumulated value to generate a vector of activation values; and normalization circuitry coupled to the activation circuitry and configured to generate a respective normalized value from each activation value.
Type: Grant
Filed: April 16, 2024
Date of Patent: April 15, 2025
Assignee: Google LLC
Inventors: Gregory Michael Thorson, Christopher Aaron Clark, Dan Luu
-
Patent number: 12265797
Abstract: Adder circuits and associated methods for processing a set of at least three floating-point numbers to be added together include identifying, from among the at least three numbers, at least two numbers that have the same sign, that is, at least two numbers that are both positive or both negative. The identified at least two numbers are added together using one or more same-sign floating-point adders. A same-sign floating-point adder comprises circuitry configured to add together floating-point numbers having the same sign and does not include circuitry configured to add together numbers having different signs.
Type: Grant
Filed: December 18, 2023
Date of Patent: April 1, 2025
Assignee: Imagination Technologies Limited
Inventors: Sam Elliott, Jonas Olof Gunnar Kallen, Casper Van Benthem
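The grouping idea can be sketched at the value level: split the addends by sign, sum each group (same-sign additions never cancel, which is what lets the hardware drop the subtraction logic), and combine the two group totals with one general add at the end. The values are illustrative.

```python
# Sketch of the same-sign grouping: sum positives and negatives
# separately (conceptually with cheaper same-sign adders, since no
# cancellation can occur), then do a single mixed-sign add at the end.

def sum_via_same_sign_groups(values):
    positives = [v for v in values if v >= 0]
    negatives = [v for v in values if v < 0]
    pos_total = sum(positives)     # same-sign adds only
    neg_total = sum(negatives)     # same-sign adds only
    return pos_total + neg_total   # one mixed-sign add at the end

print(sum_via_same_sign_groups([1.5, -0.25, 2.0, -1.0]))  # 2.25
```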
-
Patent number: 12260213
Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication are detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.
Type: Grant
Filed: December 10, 2021
Date of Patent: March 25, 2025
Assignee: Intel Corporation
Inventors: Robert Valentine, Dan Baum, Zeev Sperber, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Bret L. Toll, Mark J. Charney, Barukh Ziv, Alexander Heinecke, Milind Girkar, Simon Rubanovich
-
Patent number: 12259941
Abstract: Methods and apparatus for job scheduling in a programmable mixed-radix DFT/IDFT processor. In one embodiment, a system for processing network data from a wireless communications network includes a vector pipeline, a programmable mixed radix engine, and a job scheduler. The vector pipeline is configured to scale, stage, and apply twiddle-factor multiplication to vector data from a mega-job. The programmable mixed radix engine is configurable for computing jobs bundled in the mega-job in accordance with a DFT of a particular point size. The job scheduler is operable to bundle multiple discrete Fourier transform (DFT) jobs having a substantially same point size into the mega-job after obtaining the DFT jobs.
Type: Grant
Filed: December 5, 2023
Date of Patent: March 25, 2025
Assignee: Marvell Asia Pte, Ltd
Inventors: Yuanbin Guo, Hong Jik Kim
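The scheduling idea can be sketched as a simple grouping step: jobs sharing a point size are bundled into one mega-job so the mixed-radix engine is configured once per bundle rather than once per job. Job identifiers and point sizes below are illustrative.

```python
# Sketch of mega-job bundling: group DFT jobs by point size so the
# engine is reconfigured once per bundle instead of once per job.
from collections import defaultdict

def bundle_jobs(jobs):
    # jobs: list of (job_id, point_size) tuples.
    mega_jobs = defaultdict(list)
    for job_id, point_size in jobs:
        mega_jobs[point_size].append(job_id)
    return dict(mega_jobs)

jobs = [("j0", 1296), ("j1", 972), ("j2", 1296), ("j3", 972), ("j4", 1296)]
print(bundle_jobs(jobs))
# {1296: ['j0', 'j2', 'j4'], 972: ['j1', 'j3']}
```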
-
Patent number: 12242951
Abstract: A CNN inference engine that convolves an input data set with a weight data set is disclosed together with components that facilitate such computation. The engine includes a plurality of multiply and accumulate processors (MACs), each MAC causing a value in the accumulator to be augmented by the product of a data value received on an input data port and a weight value received on a weight port. The engine also includes a slice buffer having a plurality of output ports, each output port being connected to one of the MAC input data value ports. The engine causes the slice buffer to connect one of the slices to the plurality of slice buffer output ports, and causes a weight received on an inference engine weight port to be input to each MAC weight port. The MACs process the input data values on the output ports in the slice in parallel.
Type: Grant
Filed: June 15, 2021
Date of Patent: March 4, 2025
Assignee: Ocean Logic Pty Ltd
Inventor: Vincenzo Liguori
-
Patent number: 12235927
Abstract: A process-in-memory architecture based on a resistive random access memory and a matrix decomposition acceleration algorithm, which is configured for transformer neural network acceleration. The present disclosure first optimizes a self-attention computing process, decomposes a weight matrix, and reduces computing and writing operands; and further reduces whole power consumption using a softmax computing array of a selection and comparison logic structure based on the resistive random access memory. The present disclosure proposes an optimized matrix multiplication computing based on Re-Transformer, and further eliminates data dependency and reduces computing delay in scaled dot-product attention by using matrix decomposition. Meanwhile, the present disclosure reduces power consumption by using hybrid softmax based on the resistive random access memory.
Type: Grant
Filed: October 21, 2024
Date of Patent: February 25, 2025
Assignee: ZHEJIANG UNIVERSITY
Inventors: Liang Zhao, Xiapeng Xu
-
Patent number: 12230306
Abstract: A method, comprising: providing an electrical energy source having a specified amount of electrical energy; connecting an array comprising n magnetic tunnel junctions (MTJ) in parallel to said electrical energy source, wherein each of said MTJs is at a high resistance initial state; discharging said specified energy amount through said MTJs, thereby causing a random subset of said MTJs to switch to a lower resistance state; determining a post-discharging resistance state of each of the MTJs; and assigning a logical state to each of said MTJs corresponding to said resistance state of said MTJ.
Type: Grant
Filed: December 2, 2019
Date of Patent: February 18, 2025
Assignee: TECHNION RESEARCH & DEVELOPMENT FOUNDATION LIMITED
Inventors: Shahar Kvatinsky, Ben Perach
-
Patent number: 12223011
Abstract: Techniques for data manipulation using integer matrix multiplication using pipelining are disclosed. A first integer matrix with dimensions m×k and a second integer matrix with dimensions k×n are obtained for matrix multiplication within a processor. The first and second integer matrices employ a two's complement variable radix point data representation. The first and second integer matrices are distilled into (j×j) submatrices. A first variable radix point format and an initial value for an accumulator register are configured dynamically. A first variable radix point format is configured dynamically for the first integer matrix and a second variable radix point format is configured dynamically for the second integer matrix. Multiply-accumulate operations are executed in a pipelined fashion on the (j×j) submatrices of the first integer matrix and the second integer matrix, where a third variable radix point format is configured for the result.
Type: Grant
Filed: November 27, 2023
Date of Patent: February 11, 2025
Assignee: MIPS Holding, Inc.
Inventor: David John Simpson
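The variable-radix-point arithmetic can be sketched with plain integers: each operand carries an implicit binary point, a product of two fixed-point values has a radix point equal to the sum of the operand radix points, and a final shift converts the accumulator to the configured output format. Formats and bit widths below are illustrative.

```python
# Sketch of variable-radix-point multiply-accumulate: integers carry an
# implicit binary point; the accumulated products are shifted at the end
# to the configured output format.

def fixed_point_dot(a, b, a_frac, b_frac, out_frac):
    """a, b: integer vectors in two's-complement fixed point with a_frac
    and b_frac fractional bits; the result uses out_frac fractional bits."""
    acc = 0
    for x, y in zip(a, b):
        acc += x * y               # product has a_frac + b_frac frac bits
    shift = a_frac + b_frac - out_frac
    return acc >> shift if shift >= 0 else acc << -shift

# 1.5 * 2.0 + 0.5 * 1.0 = 3.5, i.e. 56 with 4 fractional bits (3.5 * 16)
a = [3, 1]        # 1.5 and 0.5 with a_frac = 1
b = [8, 4]        # 2.0 and 1.0 with b_frac = 2
print(fixed_point_dot(a, b, a_frac=1, b_frac=2, out_frac=4))  # 56
```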