Patents by Inventor Jeffrey Michael Pool

Jeffrey Michael Pool has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

INLINE DATA INSPECTION FOR WORKLOAD SIMPLIFICATION

Publication number: 20260044343

Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.

Type: Application

Filed: August 25, 2025

Publication date: February 12, 2026

Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
EXPRESSIVE SPARSE MATRIX REPRESENTATIONS WITH LIMITED METADATA

Publication number: 20260023811

Abstract: Disclosed are systems and techniques for compressing a dense matrix into an expressive sparse matrix representation with limited metadata. The techniques include generating a sparse matrix with corresponding metadata based on a dense matrix. Generating the sparse matrix with corresponding metadata includes identifying a first number (M) of elements to compress, a second number (N) of elements to retain, a third number (P) of positions, and a format; determining a metadata value for each of N elements of the dense matrix based on the identified P and the identified format, wherein the dense matrix includes at least M elements; and generating the sparse matrix containing the N elements of the dense matrix. The techniques include storing the sparse matrix and the corresponding metadata, wherein the corresponding metadata comprises the metadata value for each of the N elements of the dense matrix.

Type: Application

Filed: July 18, 2024

Publication date: January 22, 2026

Inventors: Jeffrey Michael Pool, Manan Patel, Ming Yiu Siu, Po-An Tsai
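As a rough illustration of the general idea in this abstract, the Python sketch below keeps N of every M consecutive elements and records each kept element's position within its group as metadata. The magnitude-based selection rule, the function names, and the position-index metadata are assumptions made for illustration. The abstract's P parameter and format are not modeled, so this is a generic N-of-M sketch and not the claimed method.

```python
def compress_n_of_m(dense, n, m):
    """Keep the n largest-magnitude values in each group of m elements.

    Hypothetical helper: returns (values, metadata), where metadata holds
    each kept value's position inside its group.
    """
    if len(dense) % m != 0:
        raise ValueError("length of dense must be a multiple of m")
    values, metadata = [], []
    for start in range(0, len(dense), m):
        group = dense[start:start + m]
        # Rank positions by magnitude, keep the top n, then restore order.
        keep = sorted(sorted(range(m), key=lambda i: -abs(group[i]))[:n])
        values.extend(group[i] for i in keep)
        metadata.extend(keep)
    return values, metadata


def decompress_n_of_m(values, metadata, n, m):
    """Rebuild a dense list, writing zeros where elements were dropped."""
    groups = len(values) // n
    dense = [0] * (groups * m)
    for k, (value, pos) in enumerate(zip(values, metadata)):
        dense[(k // n) * m + pos] = value
    return dense


values, metadata = compress_n_of_m([0, 5, 0, -3, 7, 0, 0, 1], n=2, m=4)
print(values, metadata)
```

With N=2 and M=4, each kept value needs only a 2-bit position, which is the sense in which the metadata stays small relative to the stored values.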
MATH OPERATIONS USING EXPRESSIVE SPARSE MATRIX REPRESENTATIONS WITH LIMITED METADATA

Publication number: 20260023813

Abstract: Disclosed are systems and techniques for performing matrix multiply operations on an expressive sparse matrix representation with limited metadata. The techniques include receiving a sparse matrix, metadata corresponding to the sparse matrix, and a matrix operand. The sparse matrix contains a first number (N) of elements to retain from a dense matrix which comprises at least a second number (M) of elements. The metadata corresponding to the sparse matrix is based on a third number (P) of positions and a format determined during compression of the dense matrix. The techniques include selecting, by one or more selection circuits, a subset of elements of the matrix operand based on the metadata corresponding to the sparse matrix and performing one or more matrix multiply operations on the sparse matrix and the subset of elements of the matrix operand.

Type: Application

Filed: July 18, 2024

Publication date: January 22, 2026

Inventors: Jeffrey Michael Pool, Manan Patel, Ming Yiu Siu, Po-An Tsai
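The selection step described in this abstract can be sketched in software, assuming the metadata is a per-element position within a group of M. The abstract describes selection circuits in hardware; the function name `sparse_dot` and the data layout here are hypothetical and only illustrate how metadata can pick the matching operand elements.

```python
def sparse_dot(values, metadata, operand, n, m):
    """Dot product of a compressed row with a dense operand vector.

    Hypothetical helper: metadata[k] is the position of values[k] inside
    its group of m, so it selects which operand element to multiply.
    """
    total = 0
    for k, (value, pos) in enumerate(zip(values, metadata)):
        # Selection step: pick the operand element matching the kept weight.
        selected = operand[(k // n) * m + pos]
        total += value * selected
    return total


# Compressed form of the row [0, 5, 0, -3, 7, 0, 0, 1] with n=2, m=4.
row_values = [5, -3, 7, 1]
row_metadata = [1, 3, 0, 3]
operand = [1, 2, 3, 4, 5, 6, 7, 8]
result = sparse_dot(row_values, row_metadata, operand, n=2, m=4)
print(result)
```

Only four multiplications are performed instead of eight, and the result matches the dense dot product of the original row with the operand.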
DECOMPRESSION OF EXPRESSIVE SPARSE MATRIX REPRESENTATIONS WITH LIMITED METADATA

Publication number: 20260023812

Abstract: Disclosed are systems and techniques for decompressing an expressive sparse matrix representation with limited metadata. The techniques include receiving a sparse matrix and metadata corresponding to the sparse matrix. The sparse matrix is a compressed representation of a dense matrix. The sparse matrix contains a first number (N) of elements to retain from the dense matrix which comprises at least a second number (M) of elements. The metadata corresponding to the sparse matrix is based on a third number (P) of positions and a format determined during compression of the dense matrix. The techniques include generating an uncompressed matrix based on the sparse matrix and the metadata corresponding to the sparse matrix.

Type: Application

Filed: July 18, 2024

Publication date: January 22, 2026

Inventors: Jeffrey Michael Pool, Manan Patel, Ming Yiu Siu, Po-An Tsai
FULLY-EXPRESSIVE SPARSE MATRIX REPRESENTATIONS WITH LIMITED METADATA

Publication number: 20260023814

Abstract: Disclosed are systems and techniques for compressing a dense matrix into a fully-expressive sparse matrix representation with limited metadata. The techniques include generating a sparse matrix with corresponding metadata based on a dense matrix. Generating the sparse matrix with corresponding metadata includes identifying a first number (M) of elements to compress, a second number (N) of elements to retain, and a third number (B) indicating the number of bits each metadata value uses; determining a metadata value for each of N elements of the dense matrix; packing a first metadata value having more than B bits into a second metadata value having B bits; and generating the sparse matrix containing the N elements of the dense matrix. The techniques include storing the sparse matrix and the corresponding metadata, wherein the corresponding metadata comprises the second metadata value.

Type: Application

Filed: July 18, 2024

Publication date: January 22, 2026

Inventors: Jeffrey Michael Pool, Manan Patel, Ming Yiu Siu, Po-An Tsai
DATA INSPECTION FOR COMPRESSION/DECOMPRESSION CONFIGURATION AND DATA TYPE DETERMINATION

Publication number: 20260010809

Abstract: Distribution of data in a neural network data set is used to determine an optimal compressor configuration for compressing the neural network data set and/or the underlying data type of the neural network data set. By using a generalizable optimization of examining the data prior to compressor invocation, the example non-limiting technology herein makes it possible to tune a compressor to better target the incoming data. For sparse data compression, this step may involve examining the distribution of data (e.g., in one example, zeros in the data). For other algorithms, it may involve other types of inspection. This changes the fundamental behavior of the compressor itself. By inspecting the distribution of data (e.g., zeros in the data), it is also possible to very accurately predict the data width of the underlying data.

Type: Application

Filed: May 30, 2025

Publication date: January 8, 2026

Inventor: Jeffrey Michael Pool
LEARNABLE SEMI-STRUCTURED SPARSITY FOR LARGE LANGUAGE MODELS

Publication number: 20250335769

Abstract: Apparatuses, systems, and techniques to losslessly compress neural networks via semi-structured sparsity. In at least one embodiment, a weighted average of candidate masks for semi-structured sparsity is learned for each parameter block of a neural network, and a composite mask is determined by selecting candidate masks based on the learned weighted averages. In at least one embodiment, computational resources required for inference are reduced, thereby contributing to more sustainable and environmentally friendly AI applications.

Type: Application

Filed: November 12, 2024

Publication date: October 30, 2025

Inventors: Gongfan Fang, Pavlo Molchanov, Hongxu Yin, Gregory Heinrich, Saurav Muralidharan, Chenhan Yu, Jeffrey Michael Pool, Jan Kautz, Jorge Albericio Latorre
Increasing sparcity in data sets

Patent number: 12443571

Abstract: Apparatuses, systems, and techniques to transform data sets, such as matrices representing layers of neural networks, to increase sparsity and/or other characteristics of said data sets to improve performance in computations, such as neural network computations. In at least one embodiment, one or more subsets of data in one or more sets of data are rearranged as part of a process to increase sparsity in said one or more sets of data to satisfy one or more structural sparsity constraints.

Type: Grant

Filed: May 18, 2021

Date of Patent: October 14, 2025

Assignee: NVIDIA CORPORATION

Inventors: Jeffrey Michael Pool, Chong Yu, Paulius Micikevicius
Inline data inspection for workload simplification

Patent number: 12399716

Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.

Type: Grant

Filed: April 3, 2024

Date of Patent: August 26, 2025

Assignee: NVIDIA Corporation

Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
COMPRESSION OF MACHINE LEARNING MODELS VIA SPARSIFICATION AND QUANTIZATION

Publication number: 20250094864

Abstract: Machine learning is a process that learns a model from a given dataset, where the model can then be used to make a prediction about new data. In order to reduce the size, computation, and latency of a machine learning model, a compression technique can be employed which includes model sparsification and quantization. To limit the extent to which the quality of the model is impacted when uniformly applying sparsification and quantization to all values of the model, the present disclosure provides for a hybrid sparsification and quantization of the model.

Type: Application

Filed: March 12, 2024

Publication date: March 20, 2025

Inventors: Po-An Tsai, Geonhwa Jeong, Jeffrey Michael Pool
INLINE DATA INSPECTION FOR WORKLOAD SIMPLIFICATION

Publication number: 20240248718

Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.

Type: Application

Filed: April 3, 2024

Publication date: July 25, 2024

Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
GENERATING SPARSE NEURAL NETWORKS

Publication number: 20240152407

Abstract: Apparatuses, systems, and techniques to determine a configuration based at least in part on data stored by at least one data structure of a workload at runtime, and transform the workload into a sparse workload based at least in part on the configuration. In at least one embodiment, one or more sparse workloads (e.g., one or more sparse neural networks) are generated based at least in part on, for example, one or more workloads (e.g., one or more neural networks).

Type: Application

Filed: July 17, 2023

Publication date: May 9, 2024

Inventors: Geonhwa Jeong, Po-An Tsai, Jeffrey Michael Pool
Inline data inspection for workload simplification

Patent number: 11977888

Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.

Type: Grant

Filed: February 22, 2023

Date of Patent: May 7, 2024

Assignee: NVIDIA Corporation

Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
INLINE DATA INSPECTION FOR WORKLOAD SIMPLIFICATION

Publication number: 20230221957

Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.

Type: Application

Filed: February 22, 2023

Publication date: July 13, 2023

Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
Inline data inspection for workload simplification

Patent number: 11609761

Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.

Type: Grant

Filed: December 9, 2019

Date of Patent: March 21, 2023

Assignee: NVIDIA CORPORATION

Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
Packed error correction code (ECC) for compressed data protection

Patent number: 11522565

Abstract: A packed error correction code (ECC) technique opportunistically embeds ECC check-bits with compressed data. When compressed, the data is encoded in fewer bits and is therefore fragmented when stored or transmitted compared with the uncompressed data. The ECC check-bits may be packed with compressed data at “source” points. The check-bits are transmitted along with the compressed data and, at any “intermediate” point between the source and a “destination” the check-bits may be used to detect and correct errors in the compressed data. In contrast with conventional systems, packed ECC enables end-to-end coverage for sufficiently-compressed data within the processor and also externally. While storage circuitry typically is protected by structure-specific ECC, protection is also beneficial for data as it is transmitted between processing and/or storage units.

Type: Grant

Filed: April 7, 2021

Date of Patent: December 6, 2022

Assignee: NVIDIA Corporation

Inventors: Michael Brendan Sullivan, Jeffrey Michael Pool, Yangxiang Huang, Timothy Kohchih Tsai, Siva Kumar Sastry Hari, Steven William Keckler
PACKED ERROR CORRECTION CODE (ECC) FOR COMPRESSED DATA PROTECTION

Publication number: 20220329265

Abstract: A packed error correction code (ECC) technique opportunistically embeds ECC check-bits with compressed data. When compressed, the data is encoded in fewer bits and is therefore fragmented when stored or transmitted compared with the uncompressed data. The ECC check-bits may be packed with compressed data at “source” points. The check-bits are transmitted along with the compressed data and, at any “intermediate” point between the source and a “destination” the check-bits may be used to detect and correct errors in the compressed data. In contrast with conventional systems, packed ECC enables end-to-end coverage for sufficiently-compressed data within the processor and also externally. While storage circuitry typically is protected by structure-specific ECC, protection is also beneficial for data as it is transmitted between processing and/or storage units.

Type: Application

Filed: April 7, 2021

Publication date: October 13, 2022

Inventors: Michael Brendan Sullivan, Jeffrey Michael Pool, Yangxiang Huang, Timothy Kohchih Tsai, Siva Kumar Sastry Hari, Steven William Keckler
INCREASING SPARCITY IN DATA SETS

Publication number: 20220327101

Abstract: Apparatuses, systems, and techniques to transform data sets, such as matrices representing layers of neural networks, to increase sparsity and/or other characteristics of said data sets to improve performance in computations, such as neural network computations. In at least one embodiment, one or more subsets of data in one or more sets of data are rearranged as part of a process to increase sparsity in said one or more sets of data to satisfy one or more structural sparsity constraints.

Type: Application

Filed: May 18, 2021

Publication date: October 13, 2022

Inventors: Jeffrey Michael Pool, Chong Yu, Paulius Micikevicius
INLINE DATA INSPECTION FOR WORKLOAD SIMPLIFICATION

Publication number: 20200125363

Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.

Type: Application

Filed: December 9, 2019

Publication date: April 23, 2020

Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
Inline data inspection for workload simplification

Patent number: 10503507

Abstract: A method, computer readable medium, and system are disclosed for inline data inspection. The method includes the steps of receiving, by a load/store unit, a load instruction and obtaining, by an inspection circuit that is coupled to the load/store unit, data specified by the load instruction. Additional steps include determining that the data equals zero and transmitting the data and a predicate signal to the load/store unit, wherein the predicate signal indicates that the data equals zero. Alternative additional steps include computing a predicate value based on a comparison between the data and a threshold value and transmitting the data and the predicate value to the load/store unit, wherein the predicate value is asserted when the data is less than the threshold value and is negated when the data is not less than the threshold value.

Type: Grant

Filed: August 31, 2017

Date of Patent: December 10, 2019

Assignee: NVIDIA Corporation

Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
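The predicate behavior described in this abstract can be modeled in a few lines of Python. This is a software sketch only: the abstract describes an inspection circuit coupled to a load/store unit, and the function names, the list standing in for memory, and the decision to skip flagged values in a dot product are all assumptions added for illustration.

```python
def load_with_predicate(memory, address, threshold=None):
    """Return (data, predicate) for a load.

    Hypothetical model: with no threshold, the predicate flags zero data;
    with a threshold, it is asserted when data < threshold.
    """
    data = memory[address]
    if threshold is None:
        predicate = (data == 0)
    else:
        predicate = (data < threshold)
    return data, predicate


def dot_skip_flagged(memory, weights, threshold=None):
    """Accumulate a dot product, skipping loads whose predicate is asserted."""
    total = 0.0
    for address, weight in enumerate(weights):
        data, predicate = load_with_predicate(memory, address, threshold)
        if predicate:
            continue  # assumed use of the predicate: skip the multiply
        total += data * weight
    return total


memory = [0.0, 2.0, 0.25, 4.0]
weights = [3.0, 1.0, 2.0, 0.5]
print(dot_skip_flagged(memory, weights))                 # zero test only
print(dot_skip_flagged(memory, weights, threshold=1.0))  # threshold test
```

With the zero test, skipping changes nothing in the result and only saves work. With a threshold, small nonzero values are dropped as well, so the result becomes an approximation.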
