Patents by Inventor Jack H. Choquette

Jack H. Choquette has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Generalized acceleration of matrix multiply accumulate operations

Patent number: 11816482

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

Type: Grant

Filed: August 18, 2022

Date of Patent: November 14, 2023

Assignee: NVIDIA Corporation

Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman
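The dot-product steps named in the abstract above (form partial products, align them by exponent, accumulate with an adder) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented datapath: the 11-bit significand width, the guard-bit count, the truncating shift, and the names `decompose` and `aligned_dot` are all hypothetical choices made for illustration.

```python
import math

MANT_BITS = 11  # assumed significand width; not taken from the patent


def decompose(x):
    """Return (sig, exp) with x approximately equal to sig * 2**exp."""
    if x == 0.0:
        return 0, 0
    m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= |m| < 1
    sig = int(round(m * (1 << MANT_BITS)))
    return sig, e - MANT_BITS


def aligned_dot(a, b, guard_bits=3):
    """Dot product via integer partial products aligned to a shared exponent."""
    # Step 1: one partial product per element pair (integer multiply,
    # exponents added).
    products = []
    for x, y in zip(a, b):
        sx, ex = decompose(x)
        sy, ey = decompose(y)
        if sx != 0 and sy != 0:
            products.append((sx * sy, ex + ey))
    if not products:
        return 0.0

    # Step 2: align every product to the largest exponent. Smaller terms
    # are shifted right; guard bits keep a few of the shifted-out bits.
    max_exp = max(e for _, e in products)
    aligned = [(p << guard_bits) >> (max_exp - e) for p, e in products]

    # Step 3: accumulate the aligned integers with a plain adder.
    acc = sum(aligned)

    # Convert the fixed-point sum back to a float.
    return math.ldexp(acc, max_exp - guard_bits)


if __name__ == "__main__":
    print(aligned_dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 32.0
```

In this model, a term whose exponent is far below the largest one is shifted out entirely during alignment, which is the precision cost of adding all products against a single shared exponent.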

Generalized acceleration of matrix multiply accumulate operations

Patent number: 11816481

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

Type: Grant

Filed: August 18, 2022

Date of Patent: November 14, 2023

Assignee: NVIDIA Corporation

Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

Generalized acceleration of matrix multiply accumulate operations

Patent number: 11797301

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

Type: Grant

Filed: January 4, 2021

Date of Patent: October 24, 2023

Assignee: NVIDIA Corporation

Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

Generalized acceleration of matrix multiply accumulate operations

Patent number: 11797303

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

Type: Grant

Filed: June 17, 2021

Date of Patent: October 24, 2023

Assignee: NVIDIA Corporation

Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

Generalized acceleration of matrix multiply accumulate operations

Patent number: 11797302

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

Type: Grant

Filed: June 17, 2021

Date of Patent: October 24, 2023

Assignee: NVIDIA Corporation

Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS

Publication number: 20220405098

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

Type: Application

Filed: August 18, 2022

Publication date: December 22, 2022

Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS

Publication number: 20220391206

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

Type: Application

Filed: August 18, 2022

Publication date: December 8, 2022

Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

SYSTEM AND METHOD OF CONTROLLING CACHE MEMORY RESIDENCY

Publication number: 20220365882

Abstract: Apparatuses, systems, and techniques to control operation of a memory cache. In at least one embodiment, cache guidance is specified within application source code by associating guidance with declaration of a memory block, and then applying specified guidance to source code statements that access said memory block.

Type: Application

Filed: August 5, 2021

Publication date: November 17, 2022

Inventors: Harold Carter Edwards, Luke David Durant, Stephen Jones, Jack H. Choquette, Ronny Krashinsky, Dmitri Vainbrand, Olivier Giroux, Olivier Francois Joseph Harel, Shirish Gadre, Ze Long, Matthieu Tardy, David Dastous St Hilaire, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Jaewook Shin, Jayashree Venkatesh, Girish Bhaskar Bharambe

Decompression techniques for processing compressed data suitable for artificial neural networks

Patent number: 11379420

Abstract: Compressed data is oftentimes beneficial for reducing the computing resources required, for example, to transmit and store data. The compression of data is particularly useful when dealing with sparse data (data that includes numerous zeros or near-zero values) and only non-zero values above a certain threshold have significance. When dealing with compressed data, oftentimes the data needs to be decompressed for processing (e.g., by deep learning networks or other applications configured to operate on sparse, or other uncompressed data). Instructions are disclosed for supporting the decompression of compressed data by a processing unit such as a CPU and GPU.

Type: Grant

Filed: March 20, 2019

Date of Patent: July 5, 2022

Assignee: NVIDIA Corporation

Inventors: Jorge Albericio Latorre, Jack H. Choquette, Manan Maheshkumar Patel, Jeffrey Pool, Ming Y. Siu, Ronny Meir Krashinsky, Ganesh Venkatesh
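The abstract above does not state a storage format, so the following is a generic illustration of sparse compression and decompression rather than the disclosed instructions. It assumes a hypothetical layout of packed significant values plus a per-element keep mask; the names `compress` and `decompress` and the `threshold` parameter are illustrative only.

```python
def compress(dense, threshold=0.0):
    """Keep only values whose magnitude exceeds threshold, plus a keep mask."""
    mask = [abs(x) > threshold for x in dense]
    values = [x for x, keep in zip(dense, mask) if keep]
    return values, mask


def decompress(values, mask):
    """Scatter the packed values back to their original positions."""
    if sum(mask) != len(values):
        raise ValueError("mask and packed values disagree on element count")
    packed = iter(values)
    # Positions dropped during compression are restored as zeros.
    return [next(packed) if keep else 0.0 for keep in mask]


if __name__ == "__main__":
    dense = [0.0, 1.5, 0.0, 0.0, -2.0, 0.01]
    values, mask = compress(dense, threshold=0.1)
    print(values, mask)
    print(decompress(values, mask))
```

With a nonzero threshold the round trip is lossy: near-zero values such as `0.01` come back as `0.0`, which matches the abstract's point that only values above a threshold carry significance.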

GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS

Publication number: 20210311734

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

Type: Application

Filed: June 17, 2021

Publication date: October 7, 2021

Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS

Publication number: 20210311733

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

Type: Application

Filed: June 17, 2021

Publication date: October 7, 2021

Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS

Publication number: 20210303302

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

Type: Application

Filed: January 4, 2021

Publication date: September 30, 2021

Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

Generalized acceleration of matrix multiply accumulate operations

Patent number: 10884734

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

Type: Grant

Filed: July 1, 2019

Date of Patent: January 5, 2021

Assignee: NVIDIA Corporation

Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

DECOMPRESSION TECHNIQUES FOR PROCESSING COMPRESSED DATA SUITABLE FOR ARTIFICIAL NEURAL NETWORKS

Publication number: 20200285618

Abstract: Compressed data is oftentimes beneficial for reducing the computing resources required, for example, to transmit and store data. The compression of data is particularly useful when dealing with sparse data (data that includes numerous zeros or near-zero values) and only non-zero values above a certain threshold have significance. When dealing with compressed data, oftentimes the data needs to be decompressed for processing (e.g., by deep learning networks or other applications configured to operate on sparse, or other uncompressed data). Instructions are disclosed for supporting the decompression of compressed data by a processing unit such as a CPU and GPU.

Type: Application

Filed: March 20, 2019

Publication date: September 10, 2020

Inventors: Jorge Albericio Latorre, Jack H. Choquette, Manan Maheshkumar Patel, Jeffrey Pool, Ming Y. Siu, Ronny Meir Krashinsky, Ganesh Venkatesh

Persistent scratchpad memory for data exchange between programs

Patent number: 10725837

Abstract: Techniques are disclosed for sharing of data exchange among kernels (each a set of instructions) executing on a system having multiple processing units. In an embodiment, each processing unit includes an on-chip scratchpad memory that can be accessed by the kernels executing on the processing unit. All or a portion of the scratchpad memory can be allocated and configured, for example, such that the scratchpad is accessible to multiple kernels in parallel, to one or more kernels in serial, or a combination of both.

Type: Grant

Filed: November 7, 2019

Date of Patent: July 28, 2020

Assignee: NVIDIA Corporation

Inventors: Rajballav Dash, Jack H. Choquette, Ming Liang Milton Lei, Stephen Jones, Christopher Frederick Lamb

GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS

Publication number: 20190324747

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

Type: Application

Filed: July 1, 2019

Publication date: October 24, 2019

Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

Generalized acceleration of matrix multiply accumulate operations

Patent number: 10338919

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

Type: Grant

Filed: November 29, 2017

Date of Patent: July 2, 2019

Assignee: NVIDIA Corporation

Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

System, method, and computer program product for managing out-of-order execution of program instructions

Patent number: 10255075

Abstract: A method, system and computer program product embodied on a computer-readable medium are provided for managing the execution of out-of-order instructions. The method includes the steps of receiving a plurality of instructions and identifying a subset of instructions in the plurality of instructions to be executed out-of-order.

Type: Grant

Filed: July 18, 2013

Date of Patent: April 9, 2019

Assignee: NVIDIA Corporation

Inventors: Olivier Giroux, Robert Ohannessian, Jr., Jack H. Choquette, William Parsons Newhall, Jr.

GENERALIZED ACCELERATION OF MATRIX MULTIPLY ACCUMULATE OPERATIONS

Publication number: 20180321938

Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.

Type: Application

Filed: November 29, 2017

Publication date: November 8, 2018

Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman

Execution of divergent threads using a convergence barrier

Patent number: 10067768

Abstract: A method, system, and computer program product for executing divergent threads using a convergence barrier are disclosed. A first instruction in a program is executed by a plurality of threads, where the first instruction, when executed by a particular thread, indicates to a scheduler unit that the thread participates in a convergence barrier. A first path through the program is executed by a first divergent portion of the participating threads and a second path through the program is executed by a second divergent portion of the participating threads. The first divergent portion of the participating threads executes a second instruction in the program and transitions to a blocked state at the convergence barrier. The scheduler unit determines that all of the participating threads are synchronized at the convergence barrier and the convergence barrier is cleared.

Type: Grant

Filed: July 13, 2015

Date of Patent: September 4, 2018

Assignee: NVIDIA Corporation

Inventors: Gregory Frederick Diamos, Richard Craig Johnson, Vinod Grover, Olivier Giroux, Jack H. Choquette, Michael Alan Fetterman, Ajay S. Tirumala, Peter Nelson, Ronny Meir Krashinsky
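The barrier protocol described in the abstract above (threads register as participants, diverge, block at the barrier, and are released once all participants have arrived) can be modeled as a small state machine. This is a software sketch only, not the hardware scheduler: the class `ConvergenceBarrier` and its methods `add` and `wait` are hypothetical names chosen for illustration.

```python
class ConvergenceBarrier:
    """Toy model of a scheduler tracking one convergence barrier."""

    def __init__(self):
        self.participants = set()  # threads that executed the first instruction
        self.blocked = set()       # participants now waiting at the barrier

    def add(self, tid):
        """Model of the first instruction: thread tid joins the barrier."""
        self.participants.add(tid)

    def wait(self, tid):
        """Model of the second instruction: thread tid blocks at the barrier.

        Returns the sorted list of released thread ids if this arrival
        clears the barrier, otherwise an empty list.
        """
        if tid not in self.participants:
            raise ValueError(f"thread {tid} does not participate")
        self.blocked.add(tid)
        if self.blocked == self.participants:
            released = sorted(self.blocked)
            self.participants.clear()
            self.blocked.clear()
            return released
        return []


if __name__ == "__main__":
    barrier = ConvergenceBarrier()
    for tid in range(4):
        barrier.add(tid)
    # Even threads take one path, odd threads the other; order is arbitrary.
    for tid in (0, 2, 1, 3):
        print(tid, barrier.wait(tid))
```

Only the last arrival returns a non-empty list, which corresponds to the scheduler determining that all participating threads are synchronized and clearing the barrier.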
