Patents by Inventor Sharanyan Chetlur

Sharanyan Chetlur has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Performing multi-convolution operations in a parallel processing system

Patent number: 10223333

Abstract: In one embodiment of the present invention a convolution engine configures a parallel processing pipeline to perform multi-convolution operations. More specifically, the convolution engine configures the parallel processing pipeline to independently generate and process individual image tiles. In operation, for each image tile, the pipeline calculates source locations included in an input image batch. Notably, the source locations reflect the contribution of the image tile to an output tile of an output matrix—the result of the multi-convolution operation. Subsequently, the pipeline copies data from the source locations to the image tile. Similarly, the pipeline copies data from a filter stack to a filter tile. The pipeline then performs matrix multiplication operations between the image tile and the filter tile to generate data included in the corresponding output tile. To optimize both on-chip memory usage and execution time, the pipeline creates each image tile in on-chip memory as-needed.

Type: Grant

Filed: August 27, 2015

Date of Patent: March 5, 2019

Assignee: NVIDIA CORPORATION

Inventors: Sharanyan Chetlur, Bryan Catanzaro
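
Illustrative sketch (not taken from the patent text): the abstract above describes building each image tile on demand from computed source locations and multiplying it by a matching filter tile. The Python below is a minimal serial model of that idea. It makes several assumptions: stride 1, no padding, no filter flipping (cross-correlation), and one particular row/column layout for the virtual image matrix. The names source_location, tiled_multi_convolution and tile are hypothetical, and the nested loops stand in for what would be parallel work on real hardware.

```python
from itertools import product


def source_location(row, col, dims):
    """Map (row, col) of the virtual image matrix to (n, c, y, x) in the batch.

    Assumed layout: row = c*R*S + r*S + s, col = n*P*Q + p*Q + q.
    """
    C, R, S, P, Q = dims
    c, rs = divmod(row, R * S)
    r, s = divmod(rs, S)
    n, pq = divmod(col, P * Q)
    p, q = divmod(pq, Q)
    return n, c, p + r, q + s


def tiled_multi_convolution(images, filters, tile=4):
    """images[n][c][y][x], filters[k][c][r][s] -> output matrix of K x (N*P*Q)."""
    N, C = len(images), len(images[0])
    H, W = len(images[0][0]), len(images[0][0][0])
    K, R, S = len(filters), len(filters[0][0]), len(filters[0][0][0])
    P, Q = H - R + 1, W - S + 1
    dims = (C, R, S, P, Q)
    rows, cols = C * R * S, N * P * Q
    out = [[0] * cols for _ in range(K)]

    # Each (k0, col0) pair identifies one output tile.
    for k0, col0 in product(range(0, K, tile), range(0, cols, tile)):
        k1, col1 = min(k0 + tile, K), min(col0 + tile, cols)
        for row0 in range(0, rows, tile):
            row1 = min(row0 + tile, rows)
            # Build the image tile as needed: the full virtual matrix is
            # never materialized, only this small block.
            image_tile = []
            for row in range(row0, row1):
                line = []
                for col in range(col0, col1):
                    n, c, y, x = source_location(row, col, dims)
                    line.append(images[n][c][y][x])
                image_tile.append(line)
            # Copy the matching block of the filter stack into a filter tile.
            filter_tile = []
            for k in range(k0, k1):
                line = []
                for row in range(row0, row1):
                    c, rs = divmod(row, R * S)
                    r, s = divmod(rs, S)
                    line.append(filters[k][c][r][s])
                filter_tile.append(line)
            # Tile-level matrix multiply, accumulated into the output tile.
            for i, frow in enumerate(filter_tile):
                for j in range(col1 - col0):
                    out[k0 + i][col0 + j] += sum(
                        f * image_tile[t][j] for t, f in enumerate(frow)
                    )
    return out
```

In this model the only temporary storage is one image tile and one filter tile at a time, which mirrors the on-chip memory saving the abstract mentions. The tile size here is arbitrary.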

PERFORMING MULTI-CONVOLUTION OPERATIONS IN A PARALLEL PROCESSING SYSTEM

Publication number: 20160062947

Abstract: In one embodiment of the present invention a convolution engine configures a parallel processing pipeline to perform multi-convolution operations. More specifically, the convolution engine configures the parallel processing pipeline to independently generate and process individual image tiles. In operation, for each image tile, the pipeline calculates source locations included in an input image batch. Notably, the source locations reflect the contribution of the image tile to an output tile of an output matrix—the result of the multi-convolution operation. Subsequently, the pipeline copies data from the source locations to the image tile. Similarly, the pipeline copies data from a filter stack to a filter tile. The pipeline then performs matrix multiplication operations between the image tile and the filter tile to generate data included in the corresponding output tile. To optimize both on-chip memory usage and execution time, the pipeline creates each image tile in on-chip memory as-needed.

Type: Application

Filed: August 27, 2015

Publication date: March 3, 2016

Inventors: Sharanyan CHETLUR, Bryan CATANZARO

System and method for re-factorizing a square matrix into lower and upper triangular matrices on a parallel processor

Patent number: 9170836

Abstract: A system and method for re-factorizing a square input matrix on a parallel processor. In one embodiment, the system includes: (1) a matrix generator operable to generate an intermediate matrix by embedding a permuted form of the input matrix in a zeroed-out sparsity pattern of a combination of lower and upper triangular matrices resulting from a prior LU factorization of a previous matrix having a same sparsity pattern, reordering to minimize fill-in and pivoting strategy as the input matrix and (2) a re-factorizer associated with the matrix generator and operable to use parallel threads to apply an incomplete-LU factorization with zero fill-in on the intermediate matrix.

Type: Grant

Filed: January 9, 2013

Date of Patent: October 27, 2015

Assignee: NVIDIA CORPORATION

Inventors: Maxim Naumov, Sharanyan Chetlur, Lung Sheng Chien, Robert Strzodka, Philippe Vandermersch
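
Illustrative sketch (not taken from the patent text): the abstract above describes embedding a permuted input matrix in the zeroed-out sparsity pattern of a prior L and U, then running an incomplete-LU factorization with zero fill-in on the result. The Python below is a minimal serial model of that idea. It assumes that the row and column permutations are given as index lists, that no zero pivots occur, and that the pattern comes from a symbolic elimination of the previous matrix. All function names are hypothetical, and the patent's parallel-thread scheduling is not modeled.

```python
def permute(matrix, row_perm, col_perm):
    """Dense permuted copy: result[i][j] = matrix[row_perm[i]][col_perm[j]]."""
    return [[matrix[r][c] for c in col_perm] for r in row_perm]


def lu_sparsity_pattern(matrix):
    """Symbolic elimination: positions that are nonzero in L + U, fill-in included."""
    n = len(matrix)
    pattern = {(i, j) for i in range(n) for j in range(n) if matrix[i][j] != 0}
    pattern |= {(i, i) for i in range(n)}
    for k in range(n):
        below = [i for i in range(k + 1, n) if (i, k) in pattern]
        right = [j for j in range(k + 1, n) if (k, j) in pattern]
        pattern.update((i, j) for i in below for j in right)
    return pattern


def embed(matrix, pattern, row_perm, col_perm):
    """Intermediate matrix: every pattern slot holds the permuted input value.

    Slots that exist only because of fill-in receive 0, since the input
    matrix is zero there.
    """
    return {(i, j): matrix[row_perm[i]][col_perm[j]] for (i, j) in pattern}


def ilu0(values, n):
    """In-place ILU(0) restricted to the keys of `values`.

    Returns the combined factors: strictly lower entries are L (unit
    diagonal implied), diagonal and upper entries are U.
    """
    for i in range(1, n):
        for k in range(i):
            if (i, k) not in values:
                continue
            values[i, k] = values[i, k] / values[k, k]
            for j in range(k + 1, n):
                if (i, j) in values and (k, j) in values:
                    values[i, j] -= values[i, k] * values[k, j]
    return values


def refactorize(previous, current, row_perm, col_perm):
    """Reuse the pattern from `previous` to factorize `current`."""
    pattern = lu_sparsity_pattern(permute(previous, row_perm, col_perm))
    return ilu0(embed(current, pattern, row_perm, col_perm), len(current))
```

Because the pattern already contains every fill-in position from the earlier factorization, the zero-fill-in sweep discards nothing. Under the stated assumptions it reproduces the full LU factors of the permuted matrix.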

SYSTEM AND METHOD FOR RE-FACTORIZING A SQUARE MATRIX INTO LOWER AND UPPER TRIANGULAR MATRICES ON A PARALLEL PROCESSOR

Publication number: 20140196043

Abstract: A system and method for re-factorizing a square input matrix on a parallel processor. In one embodiment, the system includes: (1) a matrix generator operable to generate an intermediate matrix by embedding a permuted form of the input matrix in a zeroed-out sparsity pattern of a combination of lower and upper triangular matrices resulting from a prior LU factorization of a previous matrix having a same sparsity pattern, reordering to minimize fill-in and pivoting strategy as the input matrix and (2) a re-factorizer associated with the matrix generator and operable to use parallel threads to apply an incomplete-LU factorization with zero fill-in on the intermediate matrix.

Type: Application

Filed: January 9, 2013

Publication date: July 10, 2014

Applicant: NVIDIA CORPORATION

Inventors: Maxim Naumov, Sharanyan Chetlur, Lung Sheng Chien, Robert Strzodka, Philippe Vandermersch
