Patents Examined by Tan V. Mai
-
Patent number: 12242951
Abstract: A CNN inference engine that convolves an input data set with a weight data set is disclosed together with components that facilitate such computation. The engine includes a plurality of multiply and accumulate processors (MACs), each MAC causing a value in its accumulator to be augmented by the product of a data value received on an input data port and a weight value received on a weight port. The engine also includes a slice buffer having a plurality of output ports, each output port being connected to one of the MAC input data ports. The engine causes the slice buffer to connect one of the slices to the plurality of slice buffer output ports, and causes a weight received on an inference engine weight port to be input to each MAC weight port. The MACs process the input data values on the slice buffer output ports in parallel.
Type: Grant
Filed: June 15, 2021
Date of Patent: March 4, 2025
Assignee: Ocean Logic Pty Ltd
Inventor: Vincenzo Liguori
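A minimal Python sketch of the broadcast-weight MAC arrangement described above, using illustrative names (MacBank, convolve_slices) that are not from the patent: every MAC receives the same weight while each reads a different data value from the selected slice, and the accumulators are updated in parallel (modeled here with a loop).

```python
# Minimal sketch (assumed names/structure, not the patented RTL): a bank of
# multiply-and-accumulate (MAC) units that all receive the same broadcast
# weight while each reads a different data value from the selected slice.

class MacBank:
    def __init__(self, num_macs):
        self.acc = [0] * num_macs          # one accumulator per MAC

    def step(self, slice_values, weight):
        # Every MAC sees the same weight; data values come in parallel
        # from the slice buffer's output ports.
        for i, value in enumerate(slice_values):
            self.acc[i] += value * weight

def convolve_slices(slice_buffer, weights, num_macs):
    """Accumulate weight by weight over the slices of the input data set."""
    bank = MacBank(num_macs)
    for slice_values, weight in zip(slice_buffer, weights):
        bank.step(slice_values, weight)
    return bank.acc

if __name__ == "__main__":
    slices = [[1, 2, 3, 4], [5, 6, 7, 8]]   # two slices, four output ports
    weights = [10, 1]                        # one broadcast weight per slice
    print(convolve_slices(slices, weights, num_macs=4))  # [15, 26, 37, 48]
```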
-
Patent number: 12235927
Abstract: A process-in-memory architecture based on a resistive random access memory and a matrix decomposition acceleration algorithm, configured for transformer neural network acceleration. The present disclosure first optimizes the self-attention computing process, decomposes the weight matrix, and reduces compute and write operations; it further reduces overall power consumption using a softmax computing array with a selection-and-comparison logic structure based on the resistive random access memory. The present disclosure proposes an optimized matrix multiplication computation based on Re-Transformer, and further eliminates data dependency and reduces computing delay in scaled dot-product attention by using matrix decomposition. Meanwhile, the present disclosure reduces power consumption by using a hybrid softmax based on the resistive random access memory.
Type: Grant
Filed: October 21, 2024
Date of Patent: February 25, 2025
Assignee: ZHEJIANG UNIVERSITY
Inventors: Liang Zhao, Xiapeng Xu
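The abstract does not spell out the exact decomposition, so the following sketch assumes a low-rank (truncated SVD) factorization purely as an illustration of how decomposing a weight matrix can cut the multiply-accumulate (and hence ReRAM write) count in an attention projection.

```python
# Illustrative sketch only: approximating an attention weight matrix by a
# low-rank factorization (one possible "matrix decomposition"; the patent does
# not commit to this exact scheme here) so that x @ W is computed as
# (x @ U) @ V with fewer multiply-accumulate and write operations.
import numpy as np

d_model, rank, tokens = 256, 32, 64
rng = np.random.default_rng(0)
W = rng.standard_normal((d_model, d_model))

# Truncated SVD gives the best rank-r factors W ~= U @ V.
U_full, s, Vt = np.linalg.svd(W, full_matrices=False)
U = U_full[:, :rank] * s[:rank]          # d_model x rank
V = Vt[:rank, :]                         # rank x d_model

x = rng.standard_normal((tokens, d_model))
full_macs = tokens * d_model * d_model                        # x @ W
decomposed_macs = tokens * d_model * rank + tokens * rank * d_model
print("MACs:", full_macs, "->", decomposed_macs)
err = np.linalg.norm(x @ W - (x @ U) @ V) / np.linalg.norm(x @ W)
print("relative error of the rank-%d approximation: %.3f" % (rank, err))
```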
-
Patent number: 12230306
Abstract: A method, comprising: providing an electrical energy source having a specified amount of electrical energy; connecting an array comprising n magnetic tunnel junctions (MTJs) in parallel to said electrical energy source, wherein each of said MTJs is at a high resistance initial state; discharging said specified energy amount through said MTJs, thereby causing a random subset of said MTJs to switch to a lower resistance state; determining a post-discharging resistance state of each of the MTJs; and assigning a logical state to each of said MTJs corresponding to said resistance state of said MTJ.
Type: Grant
Filed: December 2, 2019
Date of Patent: February 18, 2025
Assignee: TECHNION RESEARCH & DEVELOPMENT FOUNDATION LIMITED
Inventors: Shahar Kvatinsky, Ben Perach
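A behavioral Monte Carlo sketch of the bit-generation step; the switching probability, resistance values, and independence assumption are illustrative choices, not device physics taken from the patent.

```python
# Monte Carlo sketch (illustrative parameters, not from the patent): a fixed
# energy budget is discharged through n parallel MTJs that all start in the
# high-resistance state; stochastic switching flips a random subset to low
# resistance, and the read-back state gives one bit per junction.
import random

R_HIGH, R_LOW = 2000.0, 1000.0   # ohms, assumed values

def discharge(n, switch_probability=0.5, seed=None):
    rng = random.Random(seed)
    # Each MTJ independently switches with some probability set by the
    # shared energy budget (modelled here as a single probability).
    resistances = [R_LOW if rng.random() < switch_probability else R_HIGH
                   for _ in range(n)]
    # Post-discharge read-out: low resistance -> logical 1, high -> 0.
    return [1 if r == R_LOW else 0 for r in resistances]

print(discharge(16, seed=42))
```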
-
Patent number: 12223011
Abstract: Techniques for data manipulation using integer matrix multiplication using pipelining are disclosed. A first integer matrix with dimensions m×k and a second integer matrix with dimensions k×n are obtained for matrix multiplication within a processor. The first and second integer matrices employ a two's complement variable radix point data representation. The first and second integer matrices are distilled into (j×j) submatrices. An initial value for an accumulator register is configured dynamically, a first variable radix point format is configured dynamically for the first integer matrix, and a second variable radix point format is configured dynamically for the second integer matrix. Multiply-accumulate operations are executed in a pipelined fashion on the (j×j) submatrices of the first integer matrix and the second integer matrix, where a third variable radix point format is configured for the result.
Type: Grant
Filed: November 27, 2023
Date of Patent: February 11, 2025
Assignee: MIPS Holding, Inc.
Inventor: David John Simpson
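A small sketch of the two's complement variable radix point idea, with assumed helper names: it performs the integer multiply-accumulate and tracks the result's radix point as the sum of the operand formats (the (j×j) submatrix tiling and pipelining are omitted for brevity).

```python
# Sketch of the idea (formats and helper names are assumptions): integers are
# interpreted as two's-complement fixed-point values with a configurable
# radix point, and products are accumulated into a result whose radix point
# is the sum of the two input radix points.

FRAC_A, FRAC_B = 4, 6            # radix-point positions (fractional bits)

def to_fixed(x, frac_bits):
    return round(x * (1 << frac_bits))

def from_fixed(v, frac_bits):
    return v / (1 << frac_bits)

def fixed_matmul(A, B, frac_a, frac_b):
    """Integer multiply-accumulate; the accumulator format is frac_a + frac_b."""
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for t in range(k):
                acc[i][j] += A[i][t] * B[t][j]      # pure integer MAC
    return acc, frac_a + frac_b                      # result radix point

A = [[to_fixed(v, FRAC_A) for v in row] for row in [[1.25, -0.5], [0.75, 2.0]]]
B = [[to_fixed(v, FRAC_B) for v in row] for row in [[0.5, 1.0], [1.5, -0.25]]]
C, frac_c = fixed_matmul(A, B, FRAC_A, FRAC_B)
print([[from_fixed(v, frac_c) for v in row] for row in C])
```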
-
Patent number: 12217021
Abstract: A parallel processing unit employs an arithmetic logic unit (ALU) having a relatively small footprint, thereby reducing the overall power consumption and circuit area of the processing unit. To support the smaller footprint, the ALU includes multiple stages to execute operations corresponding to a received instruction. The ALU executes at least one operation at a precision indicated by the received instruction, and then reduces the resulting data of the at least one operation to a smaller size before providing the results to another stage of the ALU to continue execution of the instruction.
Type: Grant
Filed: July 7, 2023
Date of Patent: February 4, 2025
Assignee: Advanced Micro Devices, Inc.
Inventors: Bin He, Shubh Shah, Michael Mantor
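A toy illustration of the stage-to-stage narrowing, assuming half precision as the smaller intermediate size (the patent does not fix the formats): the first stage runs at the instruction's precision and the intermediate result is reduced before the next stage consumes it.

```python
# Illustrative sketch (data sizes are assumptions): a first ALU stage executes
# at the precision requested by the instruction, and the intermediate result
# is reduced to a narrower format before it is handed to the next stage.
import struct

def reduce_to_fp16_like(x):
    # Model the narrowing step by round-tripping through IEEE half precision.
    return struct.unpack('e', struct.pack('e', x))[0]

def fused_multiply_add(a, b, c):
    product = a * b                          # stage 1: full (float64) precision
    narrowed = reduce_to_fp16_like(product)  # reduce before the next stage
    return narrowed + c                      # stage 2 continues the instruction

print(fused_multiply_add(1.0001, 2.0003, 0.5))
```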
-
Patent number: 12217154
Abstract: A neural network operation method includes: receiving an input vector sequence including a plurality of channels; performing a first convolution operation on a first input vector of the input vector sequence; and performing a second convolution operation on a second input vector of the input vector sequence that is adjacent to the first input vector in a channel direction.
Type: Grant
Filed: April 13, 2021
Date of Patent: February 4, 2025
Assignees: Samsung Electronics Co., Ltd., Seoul National University R&DB Foundation
Inventors: Xue Qian, Jin Hwan Park, Wonyong Sung
-
Patent number: 12204606
Abstract: In some examples, a system can store a first array, which is a one-dimensional array of values (e.g., matrix values), in memory. The system can also store a second array in the memory, where the second array is a one-dimensional array of pointers that point to positions of a subset of the values in the first array. The subset of values can be the first entry of each row or column of a matrix. The system can then provide the second array as input to a program routine, which can perform a matrix operation. To do so, the program routine can access the first array and the second array in memory, select a set of values for the matrix from the first array by using the pointers, execute the matrix operation using the selected set of values, and output the result.
Type: Grant
Filed: August 2, 2024
Date of Patent: January 21, 2025
Assignee: SAS INSTITUTE INC.
Inventor: Alexander Vladimirovich Andrianov
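A minimal sketch in which the array of pointers is modeled as integer offsets into the flat values array (a Python stand-in for memory addresses); the routine uses those offsets to pick out each row and run a matrix-vector product.

```python
# Minimal sketch: the "pointers" of the second array become integer offsets
# into the flat values array. Each pointer marks the first entry of a row,
# and the routine walks the rows to perform a matrix-vector product.

values = [1.0, 2.0, 3.0,      # row 0
          4.0, 5.0, 6.0]      # row 1
row_ptr = [0, 3]              # offset of the first value of each row
n_cols = 3

def matvec(values, row_ptr, n_cols, x):
    result = []
    for start in row_ptr:
        row = values[start:start + n_cols]
        result.append(sum(a * b for a, b in zip(row, x)))
    return result

print(matvec(values, row_ptr, n_cols, [1.0, 0.0, -1.0]))   # [-2.0, -2.0]
```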
-
Patent number: 12197533
Abstract: A processing device is provided which comprises memory configured to store data and a processor configured to receive a portion of data of a first matrix comprising a first plurality of elements and receive a portion of data of a second matrix comprising a second plurality of elements. The processor is also configured to determine values for a third matrix by dropping a number of products from products of pairs of elements of the first and second matrices based on approximating the products of the pairs of elements as a sum of the exponents of the pairs of elements and performing matrix multiplication on remaining products of the pairs of elements of the first and second matrices.
Type: Grant
Filed: March 26, 2021
Date of Patent: January 14, 2025
Assignee: Advanced Micro Devices, Inc.
Inventors: Pramod Vasant Argade, Swapnil P. Sakharshete, Maxim V. Kazakov, Alexander M. Potapov
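A sketch of the product-dropping idea for a single dot product: each candidate product is ranked by the sum of its operands' exponents (a cheap estimate of the product's magnitude), the smallest-ranked products are dropped, and only the remainder is multiplied and accumulated. The drop count and ranking policy are assumptions for illustration.

```python
# Sketch of the pruning idea (thresholding policy is an assumption): each
# candidate product a*b is ranked by the sum of the operands' exponents,
# which approximates log2|a*b|, and the smallest-ranked products are dropped
# before the remaining ones are multiplied and accumulated.
import math

def pruned_dot(a_row, b_col, drop_count):
    def exponent(x):
        return math.frexp(x)[1] if x != 0.0 else -1024   # frexp exponent
    pairs = list(zip(a_row, b_col))
    # Rank by exponent sum (cheap estimate of product magnitude).
    ranked = sorted(pairs, key=lambda p: exponent(p[0]) + exponent(p[1]))
    kept = ranked[drop_count:]
    return sum(a * b for a, b in kept)

a = [0.001, 3.0, -2.0, 0.0005]
b = [0.002, 1.5, 4.0, 0.001]
print("exact   :", sum(x * y for x, y in zip(a, b)))
print("pruned 2:", pruned_dot(a, b, drop_count=2))
```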
-
Patent number: 12197530
Abstract: An adaptive processor is configured to provide for reduced-complexity estimation of signal and data modes in high-dimensional data sets by implementing subspace-constrained partial updates to optimize an eigenvalue-based objective function. The adaptive processor selects, from a set of combiner weights, a set of update weights and a set of held weights; performs updates to the set of held weights within a reduced-dimensionality subspace and unconstrained updates to the set of update weights to produce updated combiner weights; and employs the updated combiner weights to determine at least one solution to an eigenequation or pseudo-eigenequation.
Type: Grant
Filed: September 18, 2023
Date of Patent: January 14, 2025
Inventor: Brian G. Agee
-
Patent number: 12197889
Abstract: A process for a floating point multiplier-accumulator (MAC) is operative on N pairs of floating point values using N MAC processes operating concurrently, each MAC process operating on a pair of values comprising an input value and a coefficient value. Each MAC process simultaneously generates: an integer form fraction at a first bitwidth and at a second bitwidth greater than the first bitwidth, a sign bit, and an exponent difference computed by subtracting an exponent sum from the maximum exponent sum over all exponent sums. The integer form fractions of the first bitwidth are provided to an adder tree using the first bitwidth, and if the sum has an excess percentage of leading 0s, then the second bitwidth is used by an adder tree using the second bitwidth to form a greater-precision integer form fraction. The sign, integer form fraction, and maximum exponent are provided to a normalizer which generates a floating point result.
Type: Grant
Filed: June 21, 2021
Date of Patent: January 14, 2025
Assignee: Ceremorphic, Inc.
Inventor: Dylan Finch
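A simplified sketch of the exponent-alignment step, with the dual-bitwidth adder-tree selection omitted: each pair yields an integer-form fraction and an exponent sum, every fraction is shifted by its difference from the maximum exponent sum, and the aligned integers are summed and rescaled.

```python
# Simplified sketch of the alignment idea (the dual-bitwidth adder-tree
# selection is omitted): each input/coefficient pair yields an integer-form
# fraction and an exponent sum; every fraction is shifted right by its
# difference from the maximum exponent sum, the aligned integers are summed,
# and the total is scaled back by the maximum exponent.
import math

def fp_mac(inputs, coeffs, frac_bits=24):
    fractions, exp_sums = [], []
    for x, w in zip(inputs, coeffs):
        mx, ex = math.frexp(x)          # x = mx * 2**ex, 0.5 <= |mx| < 1
        mw, ew = math.frexp(w)
        fractions.append(int(mx * mw * (1 << frac_bits)))  # integer-form fraction
        exp_sums.append(ex + ew)
    max_exp = max(exp_sums)
    aligned = [f >> (max_exp - e) for f, e in zip(fractions, exp_sums)]
    return sum(aligned) / (1 << frac_bits) * 2.0 ** max_exp

xs = [1.5, -2.25, 0.875, 4.0]
ws = [0.5, 1.25, -3.0, 0.125]
print(fp_mac(xs, ws), sum(x * w for x, w in zip(xs, ws)))   # both -4.1875
```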
-
Patent number: 12190028
Abstract: A software architecture where the software architecture processes a method, wherein the method includes defining initial conditions for a set of Büttiker probes. The set of Büttiker probes includes various interaction equations between one or several many-body systems. The method includes computing properties of particles with quantum transport methods. A quantum transport method of the quantum transport methods includes a set of Büttiker probes. The particles include the particles of one or several many-body systems. Further, the method includes calculating a current for each Büttiker probe of the set of Büttiker probes. The current includes at least one of momentum current, particle current, energy current, spin current, color charge current, or chirality current. The method includes setting up a set of continuity equations such that for each continuity equation the calculated current of a Büttiker probe is in a particular relation with another calculated current of another Büttiker probe.
Type: Grant
Filed: August 13, 2019
Date of Patent: January 7, 2025
Assignee: Purdue Research Foundation
Inventors: Tillmann C. Kubis, Yuanchen Chu, Kuang-Chung Wang
-
Patent number: 12174911
Abstract: An apparatus and method for complex matrix multiplication. For example, one embodiment of a processor comprises: a decoder to decode a first complex matrix multiplication instruction; execution circuitry to execute the first complex matrix multiplication instruction, the execution circuitry comprising parallel multiplication circuitry to multiply real values from the first plurality of real and imaginary values with corresponding real values from the second plurality of real and imaginary values to generate a first plurality of real products, and to multiply imaginary values from the first plurality of real and imaginary values with corresponding imaginary values from the second plurality of real and imaginary values to generate a second plurality of real products; and addition/subtraction circuitry to subtract each real product in the second plurality of real products from a corresponding real product in the first plurality of real products to produce a corresponding real value in the result matrix.
Type: Grant
Filed: December 23, 2020
Date of Patent: December 24, 2024
Assignee: Intel Corporation
Inventors: Menachem Adelman, Robert Valentine, Daniel Towner, Amit Gradstein, Mark Jay Charney
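A short check of the arithmetic the abstract describes, using NumPy matrices as stand-ins for the source operands: real-by-real products minus imaginary-by-imaginary products give the real part of the complex product.

```python
# Sketch of the arithmetic in the abstract: for each output element, real
# parts are multiplied pairwise and imaginary parts are multiplied pairwise,
# and the second set of products is subtracted from the first to produce the
# real value of the result (the imaginary path is handled analogously).
import numpy as np

def complex_matmul_real_part(a_re, a_im, b_re, b_im):
    # (a_re + i*a_im) @ (b_re + i*b_im): real part = a_re @ b_re - a_im @ b_im
    return a_re @ b_re - a_im @ b_im

rng = np.random.default_rng(1)
a = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
b = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
ref = (a @ b).real
out = complex_matmul_real_part(a.real, a.imag, b.real, b.imag)
print(np.allclose(ref, out))   # True
```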
-
Patent number: 12174656
Abstract: Systems and methods for performing matrix operations using a photonic processor are provided. The photonic processor includes encoders configured to encode a numerical value into an optical signal and optical multiplication devices configured to output an electrical signal proportional to a product of one or more encoded values. The optical multiplication devices include a first input waveguide, a second input waveguide, a coupler circuit coupled to the first input waveguide and the second input waveguide, a first detector and a second detector coupled to the coupler circuit, and a circuit coupled to the first detector and second detector and configured to output a current that is proportional to a product of a first input value and a second input value.
Type: Grant
Filed: November 9, 2023
Date of Patent: December 24, 2024
Assignee: Lightmatter, Inc.
Inventors: Darius Bunandar, Nicholas C. Harris, Tyler J. Kenney
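An illustrative numerical model only, not the patented coupler/detector design: if two detectors see intensities proportional to (a+b)^2 and (a-b)^2, their difference current is proportional to the product a*b, which is one conventional way a coupler plus two detectors can yield a product.

```python
# Illustrative model only (not the patented coupler design): two optical
# amplitudes are combined, two detectors measure the intensities of the sum
# and difference ports, and the difference of the two photocurrents is
# proportional to the product of the encoded values, since
# ((a + b)**2 - (a - b)**2) / 4 == a * b.

def optical_multiply(a, b, responsivity=1.0):
    detector_1 = responsivity * (a + b) ** 2    # intensity at the sum port
    detector_2 = responsivity * (a - b) ** 2    # intensity at the difference port
    return (detector_1 - detector_2) / (4.0 * responsivity)

print(optical_multiply(0.3, -1.7))   # -0.51
```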
-
Patent number: 12175253
Abstract: According to one embodiment, a calculating device includes a first memory, a second memory, a third memory, a first arithmetic module, a second arithmetic module, a first conductive line electrically connecting a first output terminal of the first memory and a first input terminal of the first arithmetic module, a second conductive line electrically connecting a second output terminal of the first memory and a first input terminal of the second arithmetic module, a third conductive line electrically connecting a first output terminal of the second memory and a second input terminal of the second arithmetic module, a fourth conductive line electrically connecting a first output terminal of the third memory and a third input terminal of the second arithmetic module, and a fifth conductive line electrically connecting a first output terminal of the second arithmetic module and a second input terminal of the first arithmetic module.
Type: Grant
Filed: March 21, 2023
Date of Patent: December 24, 2024
Assignee: Kabushiki Kaisha Toshiba
Inventors: Kosuke Tatsumura, Hayato Goto
-
Patent number: 12174909
Abstract: In an embodiment, a method includes programming floating gate transistors belonging to non-volatile memory cells to multilevel threshold voltages respectively corresponding to the weight factors, performing a sensing operation of the programmed floating gate transistors with a control signal adapted to make the corresponding memory cells become conductive at an instant determined by the respective programmed threshold voltage, performing the convolutional computation by using the input values during the elapsed time for each memory cell to become conductive, and outputting output values resulting from the convolutional computation.
Type: Grant
Filed: July 13, 2021
Date of Patent: December 24, 2024
Assignees: STMicroelectronics S.r.l., STMicroelectronics (Rousset) SAS
Inventors: Francesco La Rosa, Antonino Conte
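A behavioral sketch of the time-domain reading, with an assumed linear ramp for the control signal: each cell becomes conductive at a time set by its programmed threshold, so the elapsed time encodes the weight and the input is accumulated over that duration.

```python
# Behavioral sketch (timing model is an assumption): the control ramp makes a
# cell conductive at a time set by its programmed multilevel threshold, so the
# elapsed time encodes the stored weight and the product input * weight is
# obtained by integrating the input over that elapsed time.

RAMP_RATE = 1.0      # volts per time unit of the control signal (assumed)

def elapsed_time(threshold_voltage):
    # Time at which the ramped control signal reaches the cell's threshold.
    return threshold_voltage / RAMP_RATE

def convolution(inputs, threshold_voltages):
    # Each input contributes for a duration proportional to its cell's weight.
    return sum(x * elapsed_time(vt) for x, vt in zip(inputs, threshold_voltages))

inputs = [0.5, 1.0, -0.25]
thresholds = [1.2, 0.4, 2.0]             # programmed multilevel thresholds
print(convolution(inputs, thresholds))   # 0.6 + 0.4 - 0.5 = 0.5
```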
-
Patent number: 12164593
Abstract: A general matrix-matrix multiplication (GEMM) dataflow accelerator circuit is disclosed that includes a smart 3D stacking DRAM architecture. The accelerator circuit includes a memory bank, a peripheral lookup table stored in the memory bank, and a first vector buffer to store a first vector that is used as a row address into the lookup table. The circuit includes a second vector buffer to store a second vector that is used as a column address into the lookup table, and lookup table buffers to receive and store lookup table entries from the lookup table. The circuit further includes adders to sum the first product and a second product, and an output buffer to store the sum. The lookup table buffers determine a product of the first vector and the second vector without performing a multiply operation. The embodiments include a hierarchical lookup architecture to reduce latency. Accumulation results are propagated in a systolic manner.
Type: Grant
Filed: July 13, 2021
Date of Patent: December 10, 2024
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Peng Gu, Krishna Malladi, Hongzhong Zheng, Dimin Niu
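A sketch of the table-driven multiplication, with a 4-bit operand width chosen for brevity: every possible product is precomputed once, the first vector's value addresses the row and the second vector's value addresses the column, and dot products then need only lookups and additions.

```python
# Sketch of table-driven multiplication (operand width chosen for brevity):
# with low-precision operands, every possible product is precomputed once in
# a lookup table; the first vector's value selects the row and the second
# vector's value selects the column, so dot products need only lookups and adds.

BITS = 4
LUT = [[a * b for b in range(1 << BITS)] for a in range(1 << BITS)]

def lut_dot(vec_a, vec_b):
    acc = 0
    for a, b in zip(vec_a, vec_b):
        acc += LUT[a][b]          # lookup replaces the multiply
    return acc

a = [3, 7, 15, 1]
b = [2, 5, 4, 9]
print(lut_dot(a, b), sum(x * y for x, y in zip(a, b)))   # 110 110
```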
-
Patent number: 12164883
Abstract: A method includes retrieving a plurality of datasets from respective memory registers of a memory and storing the retrieved plurality of datasets in respective register portions of a first register. A dataset of data-processing coefficients is stored in a second register. First processing is applied using, as the first operand, a first sub-set of dataset elements stored in the first register, and using, as the second operand, the data-processing coefficients, obtaining a first result. Second processing is applied using, as the first operand, a second sub-set of dataset elements stored in the first register comprised in a second window having a size equal to the dataset size, and using, as the second operand, the replica of the dataset of data-processing coefficients, obtaining a second result. An output is generated based on the first and second results. The first and second processing may perform multiply-accumulate (MAC) operations.
Type: Grant
Filed: March 16, 2021
Date of Patent: December 10, 2024
Assignee: STMICROELECTRONICS S.r.l.
Inventors: Xiao Kang Jiao, Fabio Giuseppe De Ambroggi, Loris Luise
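A sketch of the windowed multiply-accumulate, with register widths and window stride as assumptions: several datasets are packed side by side in a wide register, coefficients sit in a second register, and MAC is applied to successive windows that are each one dataset wide.

```python
# Sketch (register widths and window stride are assumptions): several fetched
# datasets are packed side by side in a wide register, coefficients sit in a
# second register, and multiply-accumulate is applied to successive windows of
# the packed data, each window being one dataset wide.

def windowed_mac(packed_register, coefficients, stride=1):
    window = len(coefficients)
    results = []
    for start in range(0, len(packed_register) - window + 1, stride):
        chunk = packed_register[start:start + window]
        results.append(sum(d * c for d, c in zip(chunk, coefficients)))
    return results

packed = [1, 2, 3, 4, 5, 6, 7, 8]       # two 4-element datasets, packed
coeffs = [1, 0, -1, 2]                  # dataset of data-processing coefficients
print(windowed_mac(packed, coeffs))     # [6, 8, 10, 12, 14]
```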
-
Patent number: 12164882
Abstract: A memory circuit includes a selection circuit, a column of memory cells, and an adder tree. The selection circuit is configured to receive input data elements, each input data element including a number of bits equal to H, and output a selected set of kth bits of the H bits of the input data elements. Each memory cell of the column of memory cells includes a first storage unit configured to store a first weight data element and a first multiplier configured to generate a first product data element based on the first weight data element and a first kth bit of the selected set of kth bits. The adder tree is configured to generate a summation data element based on each of the first product data elements.
Type: Grant
Filed: March 16, 2021
Date of Patent: December 10, 2024
Assignee: TAIWAN SEMICONDUCTOR MANUFACTURING COMPANY, LTD.
Inventors: Yu-Der Chih, Hidehiro Fujiwara, Yi-Chun Shih, Po-Hao Lee, Yen-Huei Chen, Chia-Fu Lee, Jonathan Tsung-Yung Chang
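A bit-serial sketch of the column computation, assuming unsigned H-bit inputs for simplicity: for each bit position k the selected kth bits gate the stored weights, an adder tree sums the column, and the per-bit sums are combined with weight 2^k.

```python
# Bit-serial sketch (unsigned inputs assumed for simplicity): for each bit
# position k the selection circuit outputs the k-th bits of all inputs, every
# memory cell multiplies its stored weight by its input's k-th bit, an adder
# tree sums the column, and the per-bit sums are combined with weight 2**k.

H = 4   # bits per input data element

def bit_serial_dot(inputs, weights):
    total = 0
    for k in range(H):
        kth_bits = [(x >> k) & 1 for x in inputs]                 # selection circuit
        partial = sum(w * b for w, b in zip(weights, kth_bits))   # adder tree
        total += partial << k                                     # weight by 2**k
    return total

inputs = [5, 12, 3, 9]          # 4-bit activations
weights = [2, -1, 4, 7]         # weights stored in the memory cells
print(bit_serial_dot(inputs, weights), sum(x * w for x, w in zip(inputs, weights)))
```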
-
Patent number: 12153645
Abstract: Methods, systems and apparatus for simulating quantum systems. In one aspect, a method includes the actions of obtaining a first Hamiltonian describing the quantum system, wherein the Hamiltonian is written in a plane wave basis comprising N plane wave basis vectors; applying a discrete Fourier transform to the first Hamiltonian to generate a second Hamiltonian written in a plane wave dual basis, wherein the second Hamiltonian comprises a number of terms that scales at most quadratically with N; and simulating the quantum system using the second Hamiltonian.
Type: Grant
Filed: May 5, 2023
Date of Patent: November 26, 2024
Assignee: Google LLC
Inventor: Ryan Babbush
-
Patent number: 12153975
Abstract: Techniques for computing matrix operations for arbitrarily large matrices on a finite-sized hybrid analog-digital matrix processor are described. Techniques for gain adjustment in a finite-sized hybrid analog-digital matrix processor are also described, which enable the system to obtain higher energy efficiency, greater physical density, and improved numerical accuracy. In some embodiments, these techniques enable maximization of the predictive accuracy of a GEMM-based convolutional neural network using low-precision data representations.
Type: Grant
Filed: December 13, 2023
Date of Patent: November 26, 2024
Assignee: Lightmatter, Inc.
Inventors: Tyler J. Kenney, Martin B. Z. Forsythe, Tomo Lazovich, Darius Bunandar
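A sketch of tiling arbitrarily large matrices onto a fixed-size core (the tile size and accumulation order are assumptions; a NumPy matmul stands in for the analog unit): each tile-by-tile product fits the core's native dimensions and partial products are accumulated digitally.

```python
# Tiling sketch (tile size and accumulation order are assumptions): matrices
# larger than the processor's native dimensions are split into tiles, each
# tile-by-tile product is executed on the fixed-size core (numpy matmul stands
# in for the analog unit), and partial products are accumulated digitally.
import numpy as np

TILE = 4   # native size of the (finite) matrix processor

def tiled_matmul(A, B):
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m))
    for i in range(0, n, TILE):
        for j in range(0, m, TILE):
            for t in range(0, k, TILE):
                # One fixed-size tile product per pass through the core.
                C[i:i+TILE, j:j+TILE] += A[i:i+TILE, t:t+TILE] @ B[t:t+TILE, j:j+TILE]
    return C

rng = np.random.default_rng(7)
A = rng.standard_normal((10, 6))
B = rng.standard_normal((6, 9))
print(np.allclose(tiled_matmul(A, B), A @ B))   # True
```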