Patents Examined by Michael D Yaary
  • Patent number: 11966715
    Abstract: A three-dimensional processor (3D-processor) for parallel computing includes a plurality of computing elements. Each computing element comprises at least a three-dimensional memory (3D-M) array for storing at least a portion of a look-up table (LUT) for a mathematical function and an arithmetic logic circuit (ALC) for performing arithmetic operations on the LUT data. The deficiency in latency is offset by a large degree of parallelism.
    Type: Grant
    Filed: July 26, 2020
    Date of Patent: April 23, 2024
    Assignee: HangZhou HaiCun Information Technology Co., Ltd.
    Inventors: Guobiao Zhang, Chen Shen
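    Illustrative sketch (not from the patent): a coarse look-up table holds samples of a mathematical function, and a small arithmetic correction refines the looked-up value, mirroring how an ALC post-processes 3D-M LUT data. The table granularity and the first-order correction are assumptions.
      import numpy as np

      # Coarse look-up table for exp(x) on [0, 1): one entry per 1/256 step.
      STEP = 1.0 / 256
      LUT = np.exp(np.arange(0.0, 1.0, STEP))          # stored "in memory"

      def exp_lut(x):
          """Approximate exp(x) for x in [0, 1): table read + arithmetic refinement."""
          idx = int(x / STEP)                          # LUT address
          base = LUT[idx]                              # memory read
          dx = x - idx * STEP                          # residual handled by the ALC
          return base * (1.0 + dx)                     # first-order correction

      xs = np.random.rand(8)
      print(np.max(np.abs([exp_lut(x) for x in xs] - np.exp(xs))))  # small error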
  • Patent number: 11960566
    Abstract: Systems and methods are provided to eliminate multiplication operations with zero padding data for convolution computations. A multiplication matrix is generated from an input feature map matrix with padding by adjusting coordinates and dimensions of the input feature map matrix to exclude padding data. The multiplication matrix is used to perform matrix multiplications with respective weight values which results in fewer computations as compared to matrix multiplications which include the zero padding data.
    Type: Grant
    Filed: April 13, 2021
    Date of Patent: April 16, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Dana Michelle Vantrease, Ron Diamant
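    Illustrative sketch (names and layout are assumptions, not Amazon's implementation): a 'same' convolution whose kernel window is clipped to the valid coordinates of the input feature map, so no multiplication ever touches zero padding data.
      import numpy as np

      def conv2d_same_no_pad_mults(x, w):
          """'Same' convolution that never multiplies against zero padding:
          the kernel window is clipped to the valid part of x instead."""
          H, W = x.shape
          kh, kw = w.shape
          ph, pw = kh // 2, kw // 2
          y = np.zeros_like(x, dtype=float)
          for i in range(H):
              for j in range(W):
                  # Clip the window to in-bounds coordinates (padding region excluded).
                  r0, r1 = max(i - ph, 0), min(i + ph + 1, H)
                  c0, c1 = max(j - pw, 0), min(j + pw + 1, W)
                  patch = x[r0:r1, c0:c1]
                  ker = w[r0 - (i - ph): kh - ((i + ph + 1) - r1),
                          c0 - (j - pw): kw - ((j + pw + 1) - c1)]
                  y[i, j] = np.sum(patch * ker)        # fewer multiplies at the border
          return y

      x = np.arange(16, dtype=float).reshape(4, 4)
      w = np.ones((3, 3))
      print(conv2d_same_no_pad_mults(x, w))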
  • Patent number: 11960886
    Abstract: An integrated circuit including a plurality of processing components to process image data of a plurality of image frames, wherein each image frame includes a plurality of stages. Each processing component includes a plurality of execution pipelines, wherein each pipeline includes a plurality of multiplier-accumulator circuits configurable to perform multiply and accumulate operations using image data and filter weights, wherein: (i) a first processing component is configurable to process all of the data associated with a first plurality of stages of each image frame, and (ii) a second processing component of the plurality of processing components is configurable to process all of the data associated with a second plurality of stages of each image frame. The first and second processing components process data associated with the first and second plurality of stages, respectively, of a first image frame concurrently.
    Type: Grant
    Filed: April 25, 2022
    Date of Patent: April 16, 2024
    Assignee: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang, Valentin Ossman
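    Illustrative sketch (a loose software analogy, not the circuit): two 'processing components' each own a fixed group of a frame's stages and work on the same frame concurrently; the stage split, data shapes, and thread pool are assumptions.
      import numpy as np
      from concurrent.futures import ThreadPoolExecutor

      STAGES = 8                       # stages (e.g. layers/tiles) per image frame

      def process_stages(frame, stage_ids, weights):
          """One 'processing component': multiply-accumulate over its own stages."""
          return {s: float(np.dot(frame[s], weights[s])) for s in stage_ids}

      frame = np.random.rand(STAGES, 16)        # one image frame, split into stages
      weights = np.random.rand(STAGES, 16)      # filter weights per stage

      first_half, second_half = range(0, 4), range(4, 8)
      with ThreadPoolExecutor(max_workers=2) as pool:
          # Two components work on their own stage groups of the same frame concurrently.
          f1 = pool.submit(process_stages, frame, first_half, weights)
          f2 = pool.submit(process_stages, frame, second_half, weights)
          results = {**f1.result(), **f2.result()}
      print(results)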
  • Patent number: 11960567
    Abstract: A method for performing a fundamental computational primitive in a device is provided, where the device includes a processor and a matrix multiplication accelerator (MMA). The method includes configuring a streaming engine in the device to stream data for the fundamental computational primitive from memory, configuring the MMA to format the data, and executing the fundamental computational primitive by the device.
    Type: Grant
    Filed: July 4, 2021
    Date of Patent: April 16, 2024
    Assignee: Texas Instruments Incorporated
    Inventors: Arthur John Redfern, Timothy David Anderson, Kai Chirca, Chenchi Luo, Zhenhua Yu
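    Illustrative sketch (function names are hypothetical, not TI's API): a fundamental primitive, here a 1-D convolution, is executed by streaming formatted windows of the input to a matrix multiply that stands in for the MMA.
      import numpy as np

      def stream_and_format(signal, k):
          """Stand-in for the streaming engine: yield k-wide windows of the input,
          formatted as rows of a matrix the 'MMA' can multiply."""
          for i in range(len(signal) - k + 1):
              yield signal[i:i + k]

      def conv1d_via_matmul(signal, kernel):
          rows = np.stack(list(stream_and_format(signal, len(kernel))))
          return rows @ kernel                       # matrix-vector product on the MMA

      sig = np.arange(10, dtype=float)
      ker = np.array([1.0, 0.0, -1.0])
      print(conv1d_via_matmul(sig, ker))
      print(np.convolve(sig, ker[::-1], mode='valid'))   # reference (correlation identity)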
  • Patent number: 11954457
    Abstract: An arithmetic device includes a function storage circuit and an activation function (AF) circuit. The function storage circuit stores and outputs a function selection signal, a first function information signal, and a second function information signal. The AF circuit generates an activation function result data by applying a slope value and a maximum value to a multiplication/accumulation (MAC) result data in a function setting mode that is activated by the function selection signal. The slope value is set based on the first function information signal, and the maximum value is set based on the second function information signal.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: April 9, 2024
    Assignee: SK hynix Inc.
    Inventor: Choung Ki Song
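    Illustrative sketch, assuming the slope and maximum values parameterize a clipped linear activation applied to the MAC result; this is one plausible reading, not the patent's exact function definition.
      import numpy as np

      def activation(mac_result, slope, max_value):
          """Clipped linear activation: scale the MAC result by a programmable slope,
          then saturate at a programmable maximum (assumed interpretation)."""
          return np.clip(slope * mac_result, 0.0, max_value)

      mac = np.array([-3.0, 0.5, 2.0, 10.0])             # multiply/accumulate results
      print(activation(mac, slope=0.5, max_value=4.0))   # -> [0. 0.25 1. 4.]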
  • Patent number: 11941407
    Abstract: A unit for accumulating a plurality N of multiplied M bit values includes a receiving unit, a bit-wise multiplier and a bit-wise accumulator. The receiving unit receives a pipeline of multiplicands A and B such that, at each cycle, a new set of multiplicands is received. The bit-wise multiplier bit-wise multiplies bits of a current multiplicand A with bits of a current multiplicand B and sums and carries between bit-wise multipliers. The bit-wise accumulator accumulates the output of the bit-wise multiplier, thereby accumulating the products of the multiplicands during the pipelining process.
    Type: Grant
    Filed: April 5, 2020
    Date of Patent: March 26, 2024
    Assignee: GSI Technology Inc.
    Inventor: Avidan Akerib
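    Illustrative sketch (bit widths and loop structure are assumptions): each product is formed from bit-wise partial products of A and B, and the products of a pipeline of operand pairs are accumulated.
      def bitwise_multiply(a, b, m=8):
          """Multiply two m-bit values from their bit-wise partial products."""
          total = 0
          for i in range(m):
              for j in range(m):
                  bit = ((a >> i) & 1) & ((b >> j) & 1)   # one bit-wise multiplier cell
                  total += bit << (i + j)                 # weighted sum with carries
          return total

      def accumulate(pairs, m=8):
          """Accumulate a pipeline of (A, B) multiplicand pairs, one per 'cycle'."""
          acc = 0
          for a, b in pairs:
              acc += bitwise_multiply(a, b, m)
          return acc

      pairs = [(3, 5), (12, 7), (255, 2)]
      print(accumulate(pairs), sum(a * b for a, b in pairs))   # both 609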
  • Patent number: 11934824
    Abstract: Methods, apparatuses, and systems for in- or near-memory processing are described. Strings of bits (e.g., vectors) may be fetched and processed in logic of a memory device without involving a separate processing unit. Operations (e.g., arithmetic operations) may be performed on numbers stored in a bit-parallel way during a single sequence of clock cycles. Arithmetic may thus be performed in a single pass as bits of two or more strings of bits are fetched, without intermediate storage of the numbers. Vectors may be fetched (e.g., identified, transmitted, received) from one or more bit lines. Registers of a memory array may be used to write (e.g., store or temporarily store) results or ancillary bits (e.g., carry bits or carry flags) that facilitate arithmetic operations. Circuitry near, adjacent, or under the memory array may employ XOR or AND (or other) logic to fetch, organize, or operate on the data.
    Type: Grant
    Filed: April 6, 2020
    Date of Patent: March 19, 2024
    Assignee: Micron Technology, Inc.
    Inventors: Dmitri Yudanov, Sean S. Eilert, Sivagnanam Parthasarathy, Shivasankar Gunasekaran, Ameen D. Akel
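    Illustrative sketch (a software model, not Micron's circuitry): two numbers held as little-endian bit strings are added one bit position per cycle using only XOR/AND logic and a carry flag, with no intermediate integer storage.
      def add_bit_serial(a_bits, b_bits):
          """Ripple add two little-endian bit lists using XOR/AND logic and a carry flag,
          one bit position per cycle."""
          carry, out = 0, []
          for a, b in zip(a_bits, b_bits):
              s = a ^ b ^ carry                       # XOR logic for the sum bit
              carry = (a & b) | (carry & (a ^ b))     # AND/OR logic for the carry flag
              out.append(s)
          out.append(carry)
          return out

      def to_bits(x, n):
          return [(x >> i) & 1 for i in range(n)]

      def from_bits(bits):
          return sum(b << i for i, b in enumerate(bits))

      a, b = 173, 91
      print(from_bits(add_bit_serial(to_bits(a, 8), to_bits(b, 8))))   # 264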
  • Patent number: 11934798
    Abstract: The present disclosure is directed to systems and methods for a memory device such as, for example, a Processing-In-Memory Device that is configured to perform multiplication operations in memory using a popcount operation. A multiplication operation may include a summation of multipliers being multiplied with corresponding multiplicands. The inputs may be arranged in particular configurations within a memory array. Sense amplifiers may be used to perform the popcount by counting active bits along bit lines. One or more registers may accumulate results for performing the multiplication operations.
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: March 19, 2024
    Assignee: Micron Technology, Inc.
    Inventor: Dmitri Yudanov
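    Illustrative sketch (operand widths and data layout are assumptions): a sum of element-wise products is computed from popcounts; for each pair of bit positions, the number of operand pairs with both bits set is counted, as along a bit line, and the count is weighted by the corresponding power of two.
      import numpy as np

      def mac_by_popcount(multipliers, multiplicands, m=8):
          """Sum of element-wise products via popcounts: for bit positions (i, j),
          count how many operand pairs have both bits set, then weight by 2**(i+j)."""
          a = np.array(multipliers,  dtype=np.uint32)
          b = np.array(multiplicands, dtype=np.uint32)
          total = 0
          for i in range(m):
              for j in range(m):
                  active = ((a >> i) & 1) & ((b >> j) & 1)   # AND along the 'bit line'
                  total += int(active.sum()) << (i + j)      # popcount, then scale
          return total

      A, B = [3, 12, 255], [5, 7, 2]
      print(mac_by_popcount(A, B), sum(x * y for x, y in zip(A, B)))   # both 609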
  • Patent number: 11934479
    Abstract: A method for performing sparse quantum Fourier transform computation includes defining a set of quantum circuits, each quantum circuit comprising a Hadamard gate and a single frequency rotation operator, said set of quantum circuits being equivalent to a quantum Fourier transform circuit. The method includes constructing a subset of said quantum circuits in a quantum processor, said quantum processor having a quantum representation of a classical distribution loaded into a quantum state of said quantum processor. The method includes executing said subset of said quantum circuits on said quantum state, and performing a measurement in a frequency basis to obtain a frequency distribution corresponding to said quantum state.
    Type: Grant
    Filed: October 7, 2020
    Date of Patent: March 19, 2024
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Tal Kachman, Mark S. Squillante, Lior Horesh, Kenneth Lee Clarkson, John A. Gunnels, Ismail Yunus Akhalwaya, Jayram Thathachar
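    Illustrative sketch: the code below does not reproduce the patented per-frequency circuits (each a Hadamard gate plus one rotation); it only models the end-to-end behavior they target, applying a QFT (here a dense transform via numpy) to a state that is sparse in frequency and reading the frequency distribution from the measurement probabilities.
      import numpy as np

      n = 3                                   # qubits; state vector of length 2**n
      N = 2 ** n

      # A state that is sparse in the frequency domain: two frequency components.
      freq = np.zeros(N, dtype=complex)
      freq[1], freq[5] = 1.0, 1.0
      state = np.fft.ifft(freq, norm="ortho")        # load into the 'quantum state'

      # Dense stand-in for the QFT circuit (the patent builds it from Hadamard +
      # single-rotation sub-circuits; here we just apply the full transform).
      qft_state = np.fft.fft(state, norm="ortho")

      probs = np.abs(qft_state) ** 2                 # frequency-basis measurement
      print(np.round(probs, 3))                      # mass concentrated on 1 and 5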
  • Patent number: 11924338
    Abstract: A computing system may implement a split random number generator that may use a random number generator to generate and store seed values in a memory for retrieval and use by one or more core processors to generate random numbers for secure processes within each core processor.
    Type: Grant
    Filed: November 4, 2020
    Date of Patent: March 5, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: David A. Kaplan, Paul Moyer
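    Illustrative sketch (names and the seed-pool layout are assumptions, not AMD's design): a central generator fills a pool of seed values in memory, and each core retrieves its own seed to drive an independent per-core generator; a toy PRNG stands in for a secure one.
      import secrets
      import random

      # Central generator fills a pool of seed values "in memory".
      NUM_CORES = 4
      seed_pool = [secrets.randbits(64) for _ in range(NUM_CORES)]

      def per_core_rng(core_id):
          """Each core retrieves its stored seed and generates its own random stream.
          (random.Random is a toy stand-in for a secure per-core generator.)"""
          return random.Random(seed_pool[core_id])

      streams = [per_core_rng(c) for c in range(NUM_CORES)]
      print([rng.getrandbits(32) for rng in streams])   # independent per-core values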
  • Patent number: 11922131
    Abstract: A method for performing vector-matrix multiplication may include converting a digital input vector comprising a plurality of binary-encoded values into a plurality of analog signals using a plurality of one-bit digital to analog converters (DACs); sequentially performing, using an analog vector matrix multiplier and based on bit-order, vector-matrix multiplication operations using a weighting matrix for the plurality of analog signals to generate analog outputs of the analog vector matrix multiplier; sequentially performing an analog-to-digital (ADC) operation on the analog outputs of the analog vector matrix multiplier to generate binary partial output vectors; and combining the binary partial output vectors to generate a result of the vector-matrix multiplication.
    Type: Grant
    Filed: November 7, 2020
    Date of Patent: March 5, 2024
    Assignee: Applied Materials, Inc.
    Inventors: Xiaofeng Zhang, She-Hwa Yen
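    Illustrative sketch (a digital model with ADC quantization omitted): the input vector is decomposed into one-bit planes, each plane undergoes a vector-matrix multiply standing in for the analog crossbar, and the binary partial outputs are combined by bit order with a shift-and-add.
      import numpy as np

      def vmm_bit_serial(x, W, bits=8):
          """Vector-matrix multiply performed one input bit-plane at a time:
          each plane is 0/1 (one-bit DAC), multiplied by W (analog crossbar),
          'read out' (ADC), then the partial outputs are combined by bit order."""
          x = np.asarray(x, dtype=np.int64)
          result = np.zeros(W.shape[1], dtype=np.int64)
          for b in range(bits):
              plane = (x >> b) & 1                  # one-bit analog drive values
              partial = plane @ W                   # analog vector-matrix multiply
              result += partial << b                # shift-and-add of binary partials
          return result

      x = np.array([3, 200, 17, 90])
      W = np.arange(12).reshape(4, 3)
      print(vmm_bit_serial(x, W), x @ W)            # identical results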
  • Patent number: 11907326
    Abstract: A system for determining the frequency coefficients of a one- or multi-dimensional signal that is sparse in the frequency domain includes determining the locations of the non-zero frequency coefficients, and then determining values of the coefficients using the determined locations. If N is the total number of frequency coefficients across the one or more dimensions of the signal, and if R is an upper bound on the number of non-zero frequency coefficients, the system requires up to O(R log(R) log(N)) samples and has a computation complexity of up to O(R log^2(R) log(N)). The system and the processing technique are stable to low-level noise and can exhibit only a small probability of failure. The frequency coefficients can be real and positive or they can be complex numbers.
    Type: Grant
    Filed: December 22, 2021
    Date of Patent: February 20, 2024
    Assignee: QUALCOMM Incorporated
    Inventor: Pierre-David Letourneau
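    Illustrative sketch of the two-step structure, with the location step replaced by an assumed-known support: once the non-zero frequency locations are fixed, the coefficient values follow from a small linear solve against a handful of time-domain samples. Sizes and the random sampling are assumptions.
      import numpy as np

      N, support = 64, [3, 17, 40]                 # R = 3 non-zero frequencies
      rng = np.random.default_rng(0)
      true_vals = rng.normal(size=len(support)) + 1j * rng.normal(size=len(support))

      # A few time-domain samples of the sparse signal.
      t = rng.choice(N, size=12, replace=False)
      A = np.exp(2j * np.pi * np.outer(t, support) / N)    # sampled Fourier sub-matrix
      samples = A @ true_vals

      # Step two of the scheme: with the support located, the coefficient values
      # follow from a small (well-conditioned) linear solve.
      est_vals, *_ = np.linalg.lstsq(A, samples, rcond=None)
      print(np.allclose(est_vals, true_vals))      # True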
  • Patent number: 11900242
    Abstract: Provided are an integrated circuit chip apparatus and a related product, the integrated circuit chip apparatus being used for executing a multiplication operation, a convolution operation or a training operation of a neural network. The present technical solution has the advantages of a reduced amount of computation and low power consumption.
    Type: Grant
    Filed: March 7, 2022
    Date of Patent: February 13, 2024
    Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
    Inventors: Shaoli Liu, Xinkai Song, Bingrui Wang, Yao Zhang, Shuai Hu
  • Patent number: 11899746
    Abstract: The present disclosure relates generally to techniques for efficiently performing operations associated with artificial intelligence (AI), machine learning (ML), and/or deep learning (DL) applications, such as training and/or inference calculations, using an integrated circuit device. More specifically, the present disclosure relates to an integrated circuit design implemented to perform these operations with low latency and/or a high bandwidth of data. For example, embodiments of a computationally dense digital signal processing (DSP) circuitry, implemented to efficiently perform one or more arithmetic operations (e.g., a dot-product) on an input are disclosed. Moreover, embodiments described herein may relate to layout, design, and data scheduling of a processing element array implemented to compute matrix multiplications (e.g., systolic array multiplication).
    Type: Grant
    Filed: December 23, 2021
    Date of Patent: February 13, 2024
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Andrei-Mihai Hagiescu-Miriste
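    Illustrative sketch (array geometry and scheduling are assumptions): a cycle-by-cycle simulation of an output-stationary systolic array in which skewed operand streams flow right and down while each processing element performs one multiply-accumulate per cycle.
      import numpy as np

      def systolic_matmul(A, B):
          """Cycle-by-cycle simulation of an output-stationary systolic array:
          A streams in from the left (skewed by row), B from the top (skewed by
          column); each PE multiply-accumulates the pair passing through it."""
          n, k = A.shape
          k2, m = B.shape
          assert k == k2
          C = np.zeros((n, m))
          a_reg = np.zeros((n, m))          # value of A currently held by each PE
          b_reg = np.zeros((n, m))          # value of B currently held by each PE
          for t in range(k + n + m - 2):
              # Shift right/down from the previous cycle.
              a_reg[:, 1:], b_reg[1:, :] = a_reg[:, :-1].copy(), b_reg[:-1, :].copy()
              # Skewed injection at the array edges.
              for i in range(n):
                  a_reg[i, 0] = A[i, t - i] if 0 <= t - i < k else 0.0
              for j in range(m):
                  b_reg[0, j] = B[t - j, j] if 0 <= t - j < k else 0.0
              C += a_reg * b_reg            # every PE does one MAC this cycle
          return C

      A = np.random.rand(3, 4)
      B = np.random.rand(4, 5)
      print(np.allclose(systolic_matmul(A, B), A @ B))   # True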
  • Patent number: 11900241
    Abstract: Provided are an integrated circuit chip apparatus and a related product, the integrated circuit chip apparatus being used for executing a multiplication operation, a convolution operation or a training operation of a neural network. The present technical solution has the advantages of a reduced amount of computation and low power consumption.
    Type: Grant
    Filed: March 7, 2022
    Date of Patent: February 13, 2024
    Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
    Inventors: Shaoli Liu, Xinkai Song, Bingrui Wang, Yao Zhang, Shuai Hu
  • Patent number: 11893470
    Abstract: Techniques for neural network processing using specialized data representation are disclosed. Input data for manipulation in a layer of a neural network is obtained. The input data includes image data, where the image data is represented in bfloat16 format without loss of precision. The manipulation of the input data is performed on a processor that supports single-precision operations. The input data is converted to a 16-bit reduced floating-point representation, where the reduced floating-point representation comprises an alternative single-precision data representation mode. The input data is manipulated with one or more 16-bit reduced floating-point data elements. The manipulation includes a multiply and add-accumulate operation. The manipulation further includes a unary operation, a binary operation, or a conversion operation. A result of the manipulating is forwarded to a next layer of the neural network.
    Type: Grant
    Filed: December 5, 2019
    Date of Patent: February 6, 2024
    Assignee: MIPS Tech, LLC
    Inventor: Sanjay Patel
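    Illustrative sketch: float32 inputs are reduced to bfloat16 by keeping the sign, the 8 exponent bits and 7 mantissa bits, then multiplied and add-accumulated with the running sum kept in float32. Truncation stands in for the rounding a real unit would use; this is an assumption, not the patented data path.
      import numpy as np

      def to_bfloat16(x):
          """Truncate float32 to bfloat16: keep sign, 8 exponent bits, 7 mantissa bits
          (simple truncation; real hardware typically rounds to nearest even)."""
          bits = np.asarray(x, dtype=np.float32).view(np.uint32)
          return (bits & np.uint32(0xFFFF0000)).view(np.float32)

      def mac_bf16(a, b, acc=0.0):
          """Multiply bfloat16-truncated inputs and accumulate in float32."""
          prod = to_bfloat16(a) * to_bfloat16(b)       # reduced-precision operands
          return np.float32(acc) + prod.astype(np.float32).sum()

      a = np.random.rand(8).astype(np.float32)
      b = np.random.rand(8).astype(np.float32)
      print(mac_bf16(a, b), float(np.dot(a, b)))       # close, reduced precision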
  • Patent number: 11893388
    Abstract: An integrated circuit including a plurality of processing components to process image data of a plurality of image frames, wherein each image frame includes a plurality of stages. Each processing component includes a plurality of execution pipelines, wherein each pipeline includes a plurality of multiplier-accumulator circuits configurable to perform multiply and accumulate operations using image data and filter weights, wherein: (i) a first processing component is configured to process all of the data associated with a first plurality of stages of each image frame, and (ii) a second processing component of the plurality of processing components is configured to process all of the data associated with a second plurality of stages of each image frame. The first and second processing components process data associated with the first and second plurality of stages, respectively, of a first image frame concurrently.
    Type: Grant
    Filed: April 13, 2022
    Date of Patent: February 6, 2024
    Assignee: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang, Valentin Ossman
  • Patent number: 11886835
    Abstract: A multiplier-accumulator accepts A and B digital inputs and generates a dot product P by applying the bits of the A input and the bits of the B input to unit elements comprised of groups of AND gates coupled to charge transfer lines through a capacitor Cu. The number of bits in the B input is the number of AND-groups and the number of bits in A is the number of AND gates in an AND-group. Each unit element receives one bit of the B input, applied to all of the AND gates of the unit element, and has the bits of A applied to its associated AND gate inputs. The AND gates are coupled to charge transfer lines through a capacitor Cu, and the charge transfer lines couple to binary weighted charge summing capacitors which sum and scale the charges from the charge transfer lines; the summed charge is coupled to an analog-to-digital converter which forms the dot product output. The charge transfer lines may span multiple unit elements.
    Type: Grant
    Filed: December 31, 2020
    Date of Patent: January 30, 2024
    Assignee: Ceremorphic, Inc.
    Inventors: Ryan Boesch, Martin Kraemer, Wei Xiong
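    Illustrative sketch (a digital model of the charge path): every AND of a bit of A with a bit of B deposits one unit of 'charge' on the transfer line for that bit weight, the binary-weighted summing scales each line by its power of two, and the final sum plays the role of the ADC output. Operand widths are assumptions.
      import numpy as np

      def charge_domain_multiply(a, b, m=8):
          """Digital model of the charge-domain multiplier: each AND of bit a_i and
          bit b_j dumps one unit of 'charge' onto the transfer line of weight i+j;
          binary-weighted summing then scales each line by 2**(i+j) before the ADC."""
          line_charge = np.zeros(2 * m - 1)             # charge per transfer line
          for j in range(m):                            # one unit element per B bit
              b_bit = (b >> j) & 1                      # drives all AND gates in the group
              for i in range(m):                        # one AND gate per A bit
                  line_charge[i + j] += b_bit & ((a >> i) & 1)
          weights = 2.0 ** np.arange(2 * m - 1)         # binary-weighted summing caps
          return int(round(float(line_charge @ weights)))   # ADC output

      print(charge_domain_multiply(57, 215), 57 * 215)      # both 12255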
  • Patent number: 11861327
    Abstract: A processor for fine-grain sparse integer and floating-point operations and method of operation thereof are provided. In some embodiments, the method includes forming a first set of products and forming a second set of products. The forming of the first set of products may include: multiplying, in a first multiplier, a first activation value by a least significant sub-word and a most significant sub-word of a first weight to form a first partial product and a second partial product; and adding the first partial product and the second partial product. The forming of the second set of products may include: multiplying, in the first multiplier, a second activation value by a first sub-word and a second sub-word of a mantissa to form a third partial product and a fourth partial product; and adding the third partial product and the fourth partial product.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: January 2, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ali Shafiee Ardestani, Joseph Hassoun
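    Illustrative sketch (sub-word width is an assumption): the weight is split into a least-significant and a most-significant sub-word, the activation is multiplied by each sub-word to form two partial products, and the partial products are combined with a shift and an add; a floating-point mantissa would be handled the same way.
      def multiply_by_subwords(activation, weight, sub_bits=4):
          """Multiply via two sub-word partial products: activation * LSW and
          activation * MSW of the weight, then combine with the proper shift."""
          mask = (1 << sub_bits) - 1
          lsw = weight & mask                    # least significant sub-word
          msw = (weight >> sub_bits) & mask      # most significant sub-word
          p_low  = activation * lsw              # first partial product
          p_high = activation * msw              # second partial product
          return p_low + (p_high << sub_bits)    # shift-and-add of the partials

      act, w = 93, 0xB7
      print(multiply_by_subwords(act, w), act * w)     # both 17019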
  • Patent number: 11861323
    Abstract: Hardware logic arranged to normalise (or renormalise) an n-bit input number is described in which at least a proportion of a left shifting operation is performed in parallel with a leading zero count operation. In various embodiments the left shifting and the leading zero count are performed independently. In various other embodiments, a subset of the bits output by a leading zero counter are input to a left shifter and the output from the left shifter is input to a renormalisation block which completes the remainder of the left shifting operation independently of any further input from the leading zero counter.
    Type: Grant
    Filed: March 15, 2021
    Date of Patent: January 2, 2024
    Assignee: Imagination Technologies Limited
    Inventor: Theo Alan Drane
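    Illustrative sketch (the split of the count into coarse and fine bits is an assumption): the leading zero count is taken, its high bits drive a coarse left shift that hardware could begin early, and the remaining low bits complete the normalisation with a fine shift.
      def count_leading_zeros(x, n=16):
          """Leading-zero count of an n-bit value (0 is counted as n)."""
          for i in range(n):
              if x & (1 << (n - 1 - i)):
                  return i
          return n

      def normalise(x, n=16):
          """Normalise x so its MSB is set, shifting in two stages: a coarse shift
          driven by the high bits of the LZC (which settle first in hardware),
          then a fine shift from the low bits."""
          lzc = count_leading_zeros(x, n)
          coarse = lzc & ~0x3                    # high bits of the count -> coarse shift
          fine = lzc & 0x3                       # low bits -> residual shift
          y = (x << coarse) & ((1 << n) - 1)     # could start before fine bits are ready
          y = (y << fine) & ((1 << n) - 1)       # completes the normalisation
          return y, lzc

      x = 0b0000000101101001                     # 16-bit input with 7 leading zeros
      print(normalise(x))                        # MSB-aligned value and the LZC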