Patents by Inventor Jeffrey Michael Pool

Jeffrey Michael Pool has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20260044343
    Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.
    Type: Application
    Filed: August 25, 2025
    Publication date: February 12, 2026
    Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
  • Publication number: 20260023811
    Abstract: Disclosed are systems and techniques for compressing a dense matrix into an expressive sparse matrix representation with limited metadata. The techniques include generating a sparse matrix with corresponding metadata based on a dense matrix. Generating the sparse matrix with corresponding metadata includes identifying a first number (M) of elements to compress, a second number (N) of elements to retain, a third number (P) of positions, and a format; determining a metadata value for each of N elements of the dense matrix based on the identified P and the identified format, wherein the dense matrix includes at least M elements; and generating the sparse matrix containing the N elements of the dense matrix. The techniques include storing the sparse matrix and the corresponding metadata, wherein the corresponding metadata comprises the metadata value for each of the N elements of the dense matrix.
    Type: Application
    Filed: July 18, 2024
    Publication date: January 22, 2026
    Inventors: Jeffrey Michael Pool, Manan Patel, Ming Yiu Siu, Po-An Tsai
  • Publication number: 20260023813
    Abstract: Disclosed are systems and techniques for performing matrix multiply operations on an expressive sparse matrix representation with limited metadata. The techniques include receiving a sparse matrix, metadata corresponding to the sparse matrix, and a matrix operand. The sparse matrix contains a first number (N) of elements to retain from a dense matrix which comprises at least a second number (M) of elements. The metadata corresponding to the sparse matrix is based on a third number (P) of positions and a format determined during compression of the dense matrix. The techniques include selecting, by one or more selection circuits, a subset of elements of the matrix operand based on the metadata corresponding to the sparse matrix and performing one or more matrix multiply operations on the sparse matrix and the subset of elements of the matrix operand.
    Type: Application
    Filed: July 18, 2024
    Publication date: January 22, 2026
    Inventors: Jeffrey Michael Pool, Manan Patel, Ming Yiu Siu, Po-An Tsai
  • Publication number: 20260023812
    Abstract: Disclosed are systems and techniques for decompressing an expressive sparse matrix representation with limited metadata. The techniques include receiving a sparse matrix and metadata corresponding to the sparse matrix. The sparse matrix is a compressed representation of a dense matrix. The sparse matrix contains a first number (N) of elements to retain from the dense matrix which comprises at least a second number (M) of elements. The metadata corresponding to the sparse matrix is based on a third number (P) of positions and a format determined during compression of the dense matrix. The techniques include generating an uncompressed matrix based on the sparse matrix and the metadata corresponding to the sparse matrix.
    Type: Application
    Filed: July 18, 2024
    Publication date: January 22, 2026
    Inventors: Jeffrey Michael Pool, Manan Patel, Ming Yiu Siu, Po-An Tsai
  • Publication number: 20260023814
    Abstract: Disclosed are systems and techniques for compressing a dense matrix into a fully-expressive sparse matrix representation with limited metadata. The techniques include generating a sparse matrix with corresponding metadata based on a dense matrix. Generating the sparse matrix with corresponding metadata includes identifying a first number (M) of elements to compress, a second number (N) of elements to retain, and a third number (B) indicating the number of bits each metadata value uses; determining a metadata value for each of N elements of the dense matrix; packing a first metadata value having more than B bits into a second metadata value having B bits; and generating the sparse matrix containing the N elements of the dense matrix. The techniques include storing the sparse matrix and the corresponding metadata, wherein the corresponding metadata comprises the second metadata value.
    Type: Application
    Filed: July 18, 2024
    Publication date: January 22, 2026
    Inventors: Jeffrey Michael Pool, Manan Patel, Ming Yiu Siu, Po-An Tsai
  • Publication number: 20260010809
    Abstract: Distribution of data in a neural network data set is used to determine an optimal compressor configuration for compressing the neural network data set and/or the underlying data type of the neural network data set. By using a generalizable optimization of examining the data prior to compressor invocation, the example non-limiting technology herein makes it possible to tune a compressor to better target the incoming data. For sparse data compression, this step may involve examining the distribution of data (e.g., in one example, zeros in the data). For other algorithms, it may involve other types of inspection. This changes the fundamental behavior of the compressor itself. By inspecting the distribution of data (e.g., zeros in the data), it is also possible to very accurately predict the data width of the underlying data.
    Type: Application
    Filed: May 30, 2025
    Publication date: January 8, 2026
    Inventor: Jeffrey Michael Pool
  • Publication number: 20250335769
    Abstract: Apparatuses, systems, and techniques to losslessly compress neural networks via semi-structured sparsity. In at least one embodiment, a weighted average of candidate masks for semi-structured sparsity is learned for each parameter block of a neural network, and a composite mask is determined by selecting candidate masks based on the learned weighted averages. In at least one embodiment, computational resources required for inference are reduced, thereby contributing to more sustainable and environmentally friendly AI applications.
    Type: Application
    Filed: November 12, 2024
    Publication date: October 30, 2025
    Inventors: Gongfan Fang, Pavlo Molchanov, Hongxu Yin, Gregory Heinrich, Saurav Muralidharan, Chenhan Yu, Jeffrey Michael Pool, Jan Kautz, Jorge Albericio Latorre
  • Patent number: 12443571
    Abstract: Apparatuses, systems, and techniques to transform data sets, such as matrices representing layers of neural networks, to increase sparsity and/or other characteristics of said data sets to improve performance in computations, such as neural network computations. In at least one embodiment, one or more subsets of data in one or more sets of data are rearranged as part of a process to increase sparsity in said one or more sets of data to satisfy one or more structural sparsity constraints.
    Type: Grant
    Filed: May 18, 2021
    Date of Patent: October 14, 2025
    Assignee: NVIDIA Corporation
    Inventors: Jeffrey Michael Pool, Chong Yu, Paulius Micikevicius
  • Patent number: 12399716
    Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.
    Type: Grant
    Filed: April 3, 2024
    Date of Patent: August 26, 2025
    Assignee: NVIDIA Corporation
    Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
  • Publication number: 20250094864
    Abstract: Machine learning is a process that learns a model from a given dataset, where the model can then be used to make a prediction about new data. In order to reduce the size, computation, and latency of a machine learning model, a compression technique can be employed which includes model sparsification and quantization. To limit the extent to which the quality of the model is impacted when uniformly applying sparsification and quantization to all values of the model, the present disclosure provides for a hybrid sparsification and quantization of the model.
    Type: Application
    Filed: March 12, 2024
    Publication date: March 20, 2025
    Inventors: Po-An Tsai, Geonhwa Jeong, Jeffrey Michael Pool
  • Publication number: 20240248718
    Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.
    Type: Application
    Filed: April 3, 2024
    Publication date: July 25, 2024
    Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
  • Publication number: 20240152407
    Abstract: Apparatuses, systems, and techniques to determine a configuration based at least in part on data stored by at least one data structure of a workload at runtime, and transform the workload into a sparse workload based at least in part on the configuration. In at least one embodiment, one or more sparse workloads (e.g., one or more sparse neural networks) are generated based at least in part on, for example, one or more workloads (e.g., one or more neural networks).
    Type: Application
    Filed: July 17, 2023
    Publication date: May 9, 2024
    Inventors: Geonhwa Jeong, Po-An Tsai, Jeffrey Michael Pool
  • Patent number: 11977888
    Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.
    Type: Grant
    Filed: February 22, 2023
    Date of Patent: May 7, 2024
    Assignee: NVIDIA Corporation
    Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
  • Publication number: 20230221957
    Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.
    Type: Application
    Filed: February 22, 2023
    Publication date: July 13, 2023
    Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
  • Patent number: 11609761
    Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.
    Type: Grant
    Filed: December 9, 2019
    Date of Patent: March 21, 2023
    Assignee: NVIDIA Corporation
    Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
  • Patent number: 11522565
    Abstract: A packed error correction code (ECC) technique opportunistically embeds ECC check-bits with compressed data. When compressed, the data is encoded in fewer bits and is therefore fragmented when stored or transmitted compared with the uncompressed data. The ECC check-bits may be packed with compressed data at “source” points. The check-bits are transmitted along with the compressed data and, at any “intermediate” point between the source and a “destination” the check-bits may be used to detect and correct errors in the compressed data. In contrast with conventional systems, packed ECC enables end-to-end coverage for sufficiently-compressed data within the processor and also externally. While storage circuitry typically is protected by structure-specific ECC, protection is also beneficial for data as it is transmitted between processing and/or storage units.
    Type: Grant
    Filed: April 7, 2021
    Date of Patent: December 6, 2022
    Assignee: NVIDIA Corporation
    Inventors: Michael Brendan Sullivan, Jeffrey Michael Pool, Yangxiang Huang, Timothy Kohchih Tsai, Siva Kumar Sastry Hari, Steven William Keckler
  • Publication number: 20220329265
    Abstract: A packed error correction code (ECC) technique opportunistically embeds ECC check-bits with compressed data. When compressed, the data is encoded in fewer bits and is therefore fragmented when stored or transmitted compared with the uncompressed data. The ECC check-bits may be packed with compressed data at “source” points. The check-bits are transmitted along with the compressed data and, at any “intermediate” point between the source and a “destination” the check-bits may be used to detect and correct errors in the compressed data. In contrast with conventional systems, packed ECC enables end-to-end coverage for sufficiently-compressed data within the processor and also externally. While storage circuitry typically is protected by structure-specific ECC, protection is also beneficial for data as it is transmitted between processing and/or storage units.
    Type: Application
    Filed: April 7, 2021
    Publication date: October 13, 2022
    Inventors: Michael Brendan Sullivan, Jeffrey Michael Pool, Yangxiang Huang, Timothy Kohchih Tsai, Siva Kumar Sastry Hari, Steven William Keckler
  • Publication number: 20220327101
    Abstract: Apparatuses, systems, and techniques to transform data sets, such as matrices representing layers of neural networks, to increase sparsity and/or other characteristics of said data sets to improve performance in computations, such as neural network computations. In at least one embodiment, one or more subsets of data in one or more sets of data are rearranged as part of a process to increase sparsity in said one or more sets of data to satisfy one or more structural sparsity constraints.
    Type: Application
    Filed: May 18, 2021
    Publication date: October 13, 2022
    Inventors: Jeffrey Michael Pool, Chong Yu, Paulius Micikevicius
  • Publication number: 20200125363
    Abstract: A method, computer readable medium, and processor are described herein for inline data inspection by using a decoder to decode a load instruction, including a signal to cause a circuit in a processor to indicate whether data loaded by a load instruction exceeds a threshold value. Moreover, an indication of whether data loaded by a load instruction exceeds a threshold value may be stored.
    Type: Application
    Filed: December 9, 2019
    Publication date: April 23, 2020
    Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
  • Patent number: 10503507
    Abstract: A method, computer readable medium, and system are disclosed for inline data inspection. The method includes the steps of receiving, by a load/store unit, a load instruction and obtaining, by an inspection circuit that is coupled to the load/store unit, data specified by the load instruction. Additional steps include determining that the data equals zero and transmitting the data and a predicate signal to the load/store unit, wherein the predicate signal indicates that the data equals zero. Alternative additional steps include computing a predicate value based on a comparison between the data and a threshold value and transmitting the data and the predicate value to the load/store unit, wherein the predicate value is asserted when the data is less than the threshold value and is negated when the data is not less than the threshold value.
    Type: Grant
    Filed: August 31, 2017
    Date of Patent: December 10, 2019
    Assignee: NVIDIA Corporation
    Inventors: Jeffrey Michael Pool, Andrew Kerr, John Tran, Ming Y. Siu, Stuart Oberman
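
As an illustration of the inline data inspection described in the abstract of patent 10503507 above, the following Python sketch models a load that returns both the data and a predicate. It is a software analogy under stated assumptions, not the patented circuit: the names InspectedLoad, load_with_zero_check, and load_with_threshold are hypothetical, and memory is modeled as a plain Python list.

```python
from typing import NamedTuple, Sequence


class InspectedLoad(NamedTuple):
    """Result of a load plus the predicate computed while loading (hypothetical type)."""
    data: float
    predicate: bool


def load_with_zero_check(memory: Sequence[float], address: int) -> InspectedLoad:
    """Load a value; the predicate indicates that the data equals zero."""
    data = memory[address]
    return InspectedLoad(data, data == 0)


def load_with_threshold(memory: Sequence[float], address: int,
                        threshold: float) -> InspectedLoad:
    """Load a value; the predicate is asserted when the data is less than
    the threshold and negated when it is not less than the threshold."""
    data = memory[address]
    return InspectedLoad(data, data < threshold)


if __name__ == "__main__":
    # Assumed toy memory contents, chosen only for illustration.
    memory = [0.0, 0.5, -2.0, 3.0]
    for address in range(len(memory)):
        print(address,
              load_with_zero_check(memory, address),
              load_with_threshold(memory, address, threshold=1.0))
```

A caller could use the returned predicate to skip work on values that are zero or below the threshold without comparing the data a second time; whether and how that is done in hardware is outside what the abstract states.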