Patents by Inventor Jorge Albericio

Jorge Albericio has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SPARSE MATRIX MULTIPLICATION IN A NEURAL NETWORK

Publication number: 20250045107

Abstract: Apparatuses, systems, and methods to enable matrix multiplication acceleration by modifying an input to apply sparsity through sparse activation filtering. In at least one embodiment, a neural network modifies pixels within an image through sparse activation filtering to enable use of one or more matrix multiplication acceleration units to perform a sparse patch embedding operation.

Type: Application

Filed: August 30, 2023

Publication date: February 6, 2025

Inventors: Jorge Albericio Latorre, Chong Yu
IMAGE BLENDING USING ONE OR MORE NEURAL NETWORKS

Publication number: 20230196662

Abstract: Apparatuses, systems, and techniques are presented to reconstruct one or more images. In at least one embodiment, one or more circuits are to use one or more neural networks to adjust one or more pixel blending weights.

Type: Application

Filed: December 20, 2021

Publication date: June 22, 2023

Inventors: Pietari Kaskela, Andrew Tao, Michael Ranzinger, David Tarjan, Jonathan Filip Gustav Granskog, Jorge Albericio Latorre
ACCELERATOR FOR DEEP NEURAL NETWORKS

Publication number: 20230186065

Abstract: A system for bit-serial computation in a neural network is described. The system may be embodied on an integrated circuit and include one or more bit-serial tiles for performing bit-serial computations in which each bit-serial tile receives input neurons and synapses, and communicates output neurons. Also included is an activation memory for storing the neurons and a dispatcher. The dispatcher reads neurons and synapses from memory and communicates either the neurons or the synapses bit-serially to the one or more bit-serial tiles. The other of the neurons or the synapses are communicated bit-parallelly to the one or more bit-serial tiles, or according to a further embodiment, may also be communicated bit-serially to the one or more bit-serial tiles.

Type: Application

Filed: February 10, 2023

Publication date: June 15, 2023

Applicant: Samsung Electronics Co., Ltd.

Inventors: Patrick Judd, Jorge Albericio, Alberto Delmas Lascorz, Andreas Moshovos, Sayeh Sharify
Accelerator for deep neural networks

Patent number: 11610100

Abstract: A system for bit-serial computation in a neural network is described. The system may be embodied on an integrated circuit and include one or more bit-serial tiles for performing bit-serial computations in which each bit-serial tile receives input neurons and synapses, and communicates output neurons. Also included is an activation memory for storing the neurons and a dispatcher and a reducer. The dispatcher reads neurons and synapses from memory and communicates either the neurons or the synapses bit-serially to the one or more bit-serial tiles. The other of the neurons or the synapses are communicated bit-parallelly to the one or more bit-serial tiles, or according to a further embodiment, may also be communicated bit-serially to the one or more bit-serial tiles. The reducer receives the output neurons from the one or more tiles, and communicates the output neurons to the activation memory.

Type: Grant

Filed: July 7, 2019

Date of Patent: March 21, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Patrick Judd, Jorge Albericio, Alberto Delmas Lascorz, Andreas Moshovos, Sayeh Sharifymoghaddam
Compression techniques for data structures suitable for artificial neural networks

Patent number: 11489541

Abstract: In artificial neural networks, and other similar applications, there is typically a large amount of data involved that is considered sparse data. Due to the large size of the data involved in such applications, it is helpful to compress the data to save bandwidth resources when transmitting the data and save memory resources when storing the data. Introduced herein is a compression technique that selects elements with significant values from data and restructures them into a structured sparse format. By generating metadata that enforces the structured sparse format and organizing the data according to the metadata, the introduced technique not only reduces the size of the data but also consistently places the data in a particular format. As such, hardware can be simplified and optimized to process the data much faster and much more efficiently than the conventional compression techniques that rely on a non-structured sparsity format.

Type: Grant

Filed: May 30, 2019

Date of Patent: November 1, 2022

Assignee: NVIDIA Corporation

Inventors: Jorge Albericio Latorre, Ming Y. Siu
ACCELERATOR FOR DEEP NEURAL NETWORKS

Publication number: 20220327367

Abstract: Described is a system, integrated circuit and method for reducing ineffectual computations in the processing of layers in a neural network. One or more tiles perform computations where each tile receives input neurons, offsets and synapses, and where each input neuron has an associated offset. Each tile generates output neurons, and there is also an activation memory for storing neurons in communication with the tiles via a dispatcher and an encoder. The dispatcher reads neurons from the activation memory and communicates the neurons to the tiles and reads synapses from a memory and communicates the synapses to the tiles. The encoder receives the output neurons from the tiles, encodes them and communicates the output neurons to the activation memory. The offsets are processed by the tiles in order to perform computations only on non-zero neurons. Optionally, synapses may be similarly processed to skip ineffectual operations.

Type: Application

Filed: June 22, 2022

Publication date: October 13, 2022

Applicant: Samsung Electronics Co., Ltd.

Inventors: Patrick Judd, Jorge Albericio, Alberto Delmas Lascorz, Andreas Moshovos, Sayeh Sharifymoghaddam
Accelerator for deep neural networks

Patent number: 11423289

Abstract: Described is a system, integrated circuit and method for reducing ineffectual computations in the processing of layers in a neural network. One or more tiles perform computations where each tile receives input neurons, offsets and synapses, and where each input neuron has an associated offset. Each tile generates output neurons, and there is also an activation memory for storing neurons in communication with the tiles via a dispatcher and an encoder. The dispatcher reads neurons from the activation memory and communicates the neurons to the tiles and reads synapses from a memory and communicates the synapses to the tiles. The encoder receives the output neurons from the tiles, encodes them and communicates the output neurons to the activation memory. The offsets are processed by the tiles in order to perform computations only on non-zero neurons. Optionally, synapses may be similarly processed to skip ineffectual operations.

Type: Grant

Filed: June 14, 2017

Date of Patent: August 23, 2022

Assignee: Samsung Electronics Co., Ltd.

Inventors: Patrick Judd, Jorge Albericio, Alberto Delmas Lascorz, Andreas Moshovos, Sayeh Sharifymoghaddam
Managing data sparsity for neural networks

Patent number: 11392829

Abstract: Approaches in accordance with various embodiments provide for the processing of sparse matrices for mathematical and programmatic operations. In particular, various embodiments enforce sparsity constraints for performing sparse matrix multiply-add instruction (MMA) operations. Deep neural networks can exhibit significant sparsity in the data used in operations, both in the activations and weights. The computational load can be reduced by excluding zero-valued data elements. A sparsity constraint is applied across all submatrices of a sparse matrix, providing fine-grained structured sparsity that is evenly distributed across the matrix. The matrix may then be compressed since a minimum number of elements of the matrix are known to have zero value. Matrix operations are then performed using these matrices.

Type: Grant

Filed: April 2, 2019

Date of Patent: July 19, 2022

Assignee: NVIDIA Corporation

Inventors: Jeff Pool, Ganesh Venkatesh, Jorge Albericio Latorre, Jack Choquette, Ronny Krashinsky, John Tran, Feng Xie, Ming Y. Siu, Manan Patel
Decompression techniques for processing compressed data suitable for artificial neural networks

Patent number: 11379420

Abstract: Compressed data is oftentimes beneficial for reducing the computing resources required, for example, to transmit and store data. The compression of data is particularly useful when dealing with sparse data (data that includes numerous zeros or near-zero values) and only non-zero values above a certain threshold have significance. When dealing with compressed data, oftentimes the data needs to be decompressed for processing (e.g., by deep learning networks or other applications configured to operate on sparse, or other uncompressed data). Instructions are disclosed for supporting the decompression of compressed data by a processing unit such as a CPU and GPU.

Type: Grant

Filed: March 20, 2019

Date of Patent: July 5, 2022

Assignee: NVIDIA CORPORATION

Inventors: Jorge Albericio Latorre, Jack H. Choquette, Manan Maheshkumar Patel, Jeffrey Pool, Ming Y. Siu, Ronny Meir Krashinsky, Ganesh Venkatesh
Efficient matrix data format applicable for artificial neural network

Patent number: 11249727

Abstract: Many computing systems process data organized in a matrix format. For example, artificial neural networks (ANNs) perform numerous computations on data organized into matrices using conventional matrix arithmetic operations. One such operation, which is commonly performed, is the transpose operation. Additionally, many such systems need to process many matrices and/or matrices that are large in size. For sparse matrices that hold few significant values and many values that can be ignored, transmitting and processing all the values in such matrices is wasteful. Thus, techniques are introduced for storing a sparse matrix in a compressed format that allows for a matrix transpose operation to be performed on the compressed matrix without having to first decompress the compressed matrix. By utilizing the introduced techniques, more matrix operations can be performed than conventional systems.

Type: Grant

Filed: October 19, 2020

Date of Patent: February 15, 2022

Assignee: Nvidia Corporation

Inventors: Jorge Albericio Latorre, Jeff Pool, David Garcia
Efficient matrix format suitable for neural networks

Patent number: 11127167

Abstract: Many computing systems process data organized in a matrix format. For example, artificial neural networks perform numerous computations on data organized into matrices using conventional matrix arithmetic operations. One such operation is the transpose operation. Techniques are introduced for storing a matrix in a compressed format that allows, for example, a transpose operation to be performed during decompression. Thus, by utilizing the introduced techniques, transformations of compressed matrices such transposition can be achieved in a more effective way. Parallel processing may also be used to more efficiently compress and/or decompress.

Type: Grant

Filed: April 29, 2019

Date of Patent: September 21, 2021

Assignee: NVIDIA Corporation

Inventors: Michael Frumkin, Jeffrey Pool, Jorge Albericio Latorre
EFFICIENT MATRIX DATA FORMAT APPLICABLE FOR ARTIFICIAL NEURAL NETWORK

Publication number: 20210034332

Abstract: Many computing systems process data organized in a matrix format. For example, artificial neural networks (ANNs) perform numerous computations on data organized into matrices using conventional matrix arithmetic operations. One such operation, which is commonly performed, is the transpose operation. Additionally, many such systems need to process many matrices and/or matrices that are large in size. For sparse matrices that hold few significant values and many values that can be ignored, transmitting and processing all the values in such matrices is wasteful. Thus, techniques are introduced for storing a sparse matrix in a compressed format that allows for a matrix transpose operation to be performed on the compressed matrix without having to first decompress the compressed matrix. By utilizing the introduced techniques, more matrix operations can be performed than conventional systems.

Type: Application

Filed: October 19, 2020

Publication date: February 4, 2021

Inventors: Jorge Albericio Latorre, Jeff Pool, David Garcia
NEURAL NETWORK ACCELERATOR

Publication number: 20210004668

Abstract: Described is a neural network accelerator tile for exploiting input sparsity. The tile includes a weight memory to supply each weight lane with a weight and a weight selection metadata, an activation selection unit to receive a set of input activation values and rearrange the set of input activation values to supply each activation lane with a set of rearranged activation values, a set of multiplexers including at least one multiplexer per pair of activation and weight lanes, where each multiplexer is configured to select a combination activation value for the activation lane from the activation lane set of rearranged activation values based on the weight lane weight selection metadata, and a set of combination units including at least one combination unit per multiplexer, where each combination unit is configured to combine the activation lane combination value with the weight lane weight to output a weight lane product.

Type: Application

Filed: February 15, 2019

Publication date: January 7, 2021

Inventors: Andreas Moshovos, Alberto Delmas Lascorz, Zisis Poulos, Dylan Malone Stuart, Patrick Judd, Sayeh Sharify, Mostafa Mahmoud, Milos Nikolic, Kevin Chong Man Siu, Jorge Albericio
Efficient matrix data format applicable for artificial neural network

Patent number: 10860293

Abstract: Many computing systems process data organized in a matrix format. For example, artificial neural networks (ANNs) perform numerous computations on data organized into matrices using conventional matrix arithmetic operations. One such operation, which is commonly performed, is the transpose operation. Additionally, many such systems need to process many matrices and/or matrices that are large in size. For sparse matrices that hold few significant values and many values that can be ignored, transmitting and processing all the values in such matrices is wasteful. Thus, techniques are introduced for storing a sparse matrix in a compressed format that allows for a matrix transpose operation to be performed on the compressed matrix without having to first decompress the compressed matrix. By utilizing the introduced techniques, more matrix operations can be performed than conventional systems.

Type: Grant

Filed: February 27, 2019

Date of Patent: December 8, 2020

Assignee: Nvidia Corporation

Inventors: Jorge Albericio Latorre, Jeff Pool, David Garcia
COMPRESSION TECHNIQUES FOR DATA STRUCTURES SUITABLE FOR ARTIFICIAL NEURAL NETWORKS

Publication number: 20200373941

Abstract: In artificial neural networks, and other similar applications, there is typically a large amount of data involved that is considered sparse data. Due to the large size of the data involved in such applications, it is helpful to compress the data to save bandwidth resources when transmitting the data and save memory resources when storing the data. Introduced herein is a compression technique that selects elements with significant values from data and restructures them into a structured sparse format. By generating metadata that enforces the structured sparse format and organizing the data according to the metadata, the introduced technique not only reduces the size of the data but also consistently places the data in a particular format. As such, hardware can be simplified and optimized to process the data much faster and much more efficiently than the conventional compression techniques that rely on a non-structured sparsity format.

Type: Application

Filed: May 30, 2019

Publication date: November 26, 2020

Inventors: Jorge Albericio Latorre, Ming Y. Siu
EFFICIENT MATRIX FORMAT SUITABLE FOR NEURAL NETWORKS

Publication number: 20200342632

Abstract: Many computing systems process data organized in a matrix format. For example, artificial neural networks perform numerous computations on data organized into matrices using conventional matrix arithmetic operations. One such operation is the transpose operation. Techniques are introduced for storing a matrix in a compressed format that allows, for example, a transpose operation to be performed during decompression. Thus, by utilizing the introduced techniques, transformations of compressed matrices such transposition can be achieved in a more effective way. Parallel processing may also be used to more efficiently compress and/or decompress.

Type: Application

Filed: April 29, 2019

Publication date: October 29, 2020

Inventors: Michael FRUMKIN, Jeffrey POOL, Jorge ALBERICIO LATORRE
DECOMPRESSION TECHNIQUES FOR PROCESSING COMPRESSED DATA SUITABLE FOR ARTIFICIAL NEURAL NETWORKS

Publication number: 20200285618

Abstract: Compressed data is oftentimes beneficial for reducing the computing resources required, for example, to transmit and store data. The compression of data is particularly useful when dealing with sparse data (data that includes numerous zeros or near-zero values) and only non-zero values above a certain threshold have significance. When dealing with compressed data, oftentimes the data needs to be decompressed for processing (e.g., by deep learning networks or other applications configured to operate on sparse, or other uncompressed data). Instructions are disclosed for supporting the decompression of compressed data by a processing unit such as a CPU and GPU.

Type: Application

Filed: March 20, 2019

Publication date: September 10, 2020

Inventors: Jorge Albericio Latorre, Jack H. Choquette, Manan Maheshkumar Patel, Jeffrey Pool, Ming Y. Siu, Ronny Meir Krashinsky, Ganesh Venkatesh
EFFICIENT MATRIX DATA FORMAT APPLICABLE FOR ARTIFICIAL NEURAL NETWORK

Publication number: 20200272425

Abstract: Many computing systems process data organized in a matrix format. For example, artificial neural networks (ANNs) perform numerous computations on data organized into matrices using conventional matrix arithmetic operations. One such operation, which is commonly performed, is the transpose operation. Additionally, many such systems need to process many matrices and/or matrices that are large in size. For sparse matrices that hold few significant values and many values that can be ignored, transmitting and processing all the values in such matrices is wasteful. Thus, techniques are introduced for storing a sparse matrix in a compressed format that allows for a matrix transpose operation to be performed on the compressed matrix without having to first decompress the compressed matrix. By utilizing the introduced techniques, more matrix operations can be performed than conventional systems.

Type: Application

Filed: February 27, 2019

Publication date: August 27, 2020

Inventors: Jorge Albericio Latorre, Jeff Pool, David Garcia
ACCELERATOR FOR DEEP NEURAL NETWORKS

Publication number: 20200125931

Abstract: A system for bit-serial computation in a neural network is described. The system may be embodied on an integrated circuit and include one or more bit-serial tiles for performing bit-serial computations in which each bit-serial tile receives input neurons and synapses, and communicates output neurons. Also included is an activation memory for storing the neurons and a dispatcher and a reducer. The dispatcher reads neurons and synapses from memory and communicates either the neurons or the synapses bit-serially to the one or more bit-serial tiles. The other of the neurons or the synapses are communicated bit-parallelly to the one or more bit-serial tiles, or according to a further embodiment, may also be communicated bit-serially to the one or more bit-serial tiles. The reducer receives the output neurons from the one or more tiles, and communicates the output neurons to the activation memory.

Type: Application

Filed: July 7, 2019

Publication date: April 23, 2020

Inventors: Patrick Judd, Jorge Albericio, Alberto Delmas Lascorz, Andreas Moshovos, Sayeh Sharify
Accelerator for deep neural networks

Patent number: 10387771

Abstract: A system for bit-serial computation in a neural network is described. The system may be embodied on an integrated circuit and include one or more bit-serial tiles for performing bit-serial computations in which each bit-serial tile receives input neurons and synapses, and communicates output neurons. Also included is an activation memory for storing the neurons and a dispatcher and a reducer. The dispatcher reads neurons and synapses from memory and communicates either the neurons or the synapses bit-serially to the one or more bit-serial tiles. The other of the neurons or the synapses are communicated bit-parallelly to the one or more bit-serial tiles, or according to a further embodiment, may also be communicated bit-serially to the one or more bit-serial tiles. The reducer receives the output neurons from the one or more tiles, and communicates the output neurons to the activation memory.

Type: Grant

Filed: May 26, 2017

Date of Patent: August 20, 2019

Inventors: Patrick Judd, Jorge Albericio, Alberto Delmas Lascorz, Andreas Moshovos, Sayeh Sharify

1 2 next