Patents by Inventor Stuart F. Oberman

Stuart F. Oberman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230100785
    Abstract: In various embodiments, the maximum or minimum of multiple input values is determined. For each of a set of possible values, a corresponding detection result is set to indicate whether at least one of the input values matches the possible value. The detection results are used to ascertain the maximum or minimum of the multiple input values.
    Type: Application
    Filed: September 28, 2021
    Publication date: March 30, 2023
    Inventors: Ilyas ELKIN, Brent Ralph BOSWELL, Stuart F. OBERMAN, Ming Y. SIU
  • Patent number: 11150721
    Abstract: A system and method are described for providing hints to a processing unit that subsequent operations are likely. Responsively, the processing unit takes steps to prepare for the likely subsequent operations. Where the hints are more likely than not to be correct, the processing unit operates more efficiently. For example, in an embodiment, the processing unit consumes less power. In another embodiment, subsequent operations are performed more quickly because the processing unit is prepared to efficiently handle the subsequent operations.
    Type: Grant
    Filed: November 7, 2012
    Date of Patent: October 19, 2021
    Assignee: NVIDIA Corporation
    Inventors: David Conrad Tannenbaum, Ming Y. Siu, Stuart F Oberman, Colin Sprinkle, Srinivasan Iyer, Ian Chi Yan Kwong
  • Patent number: 10503513
    Abstract: A subsystem is configured to support a distributed instruction set architecture with primary and secondary execution pipelines. The primary execution pipeline supports the execution of a subset of instructions in the distributed instruction set architecture that are issued frequently. The secondary execution pipeline supports the execution of another subset of instructions in the distributed instruction set architecture that are issued less frequently. Both execution pipelines also support the execution of FFMA instructions as well as a common subset of instructions in the distributed instruction set architecture. When dispatching a requested instruction, an instruction scheduling unit is configured to select between the two execution pipelines based on various criteria. Those criteria may include power efficiency with which the instruction can be executed and availability of execution units to support execution of the instruction.
    Type: Grant
    Filed: October 23, 2013
    Date of Patent: December 10, 2019
    Assignee: NVIDIA CORPORATION
    Inventors: David Conrad Tannenbaum, Srinivasan (Vasu) Iyer, Stuart F. Oberman, Ming Y. Siu, Michael Alan Fetterman, John Matthew Burgess, Shirish Gadre
  • Patent number: 10217184
    Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.
    Type: Grant
    Filed: May 23, 2017
    Date of Patent: February 26, 2019
    Assignee: NVIDIA CORPORATION
    Inventors: John Erik Lindholm, Brett W. Coon, Stuart F. Oberman, Ming Y. Siu, Matthew P. Gerlach
  • Patent number: 9829956
    Abstract: An approach is provided for enabling power reduction in floating-point operations. In one example, a system receives floating-point numbers of a fused multiply-add instruction. The system determines the fused multiply-add instruction does not require compliance with a standard of precision for floating-point numbers. The system generates gating signals for an integrated circuit that is configured to perform operations of the fused multiply-add instruction. The system then sends the gating signals to the integrated circuit to turn off a plurality of logic gates included in the integrated circuit.
    Type: Grant
    Filed: November 21, 2012
    Date of Patent: November 28, 2017
    Assignee: NVIDIA Corporation
    Inventors: David Conrad Tannenbaum, Colin Sprinkle, Stuart F. Oberman, Ming Y. Siu, Srinivasan Iyer, Ian-Chi Yan Kwong
  • Publication number: 20170256022
    Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.
    Type: Application
    Filed: May 23, 2017
    Publication date: September 7, 2017
    Inventors: John Erik LINDHOLM, Brett W. COON, Stuart F. OBERMAN, Ming Y. SIU, Matthew P. GERLACH
  • Patent number: 9659339
    Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.
    Type: Grant
    Filed: March 25, 2013
    Date of Patent: May 23, 2017
    Assignee: NVIDIA CORPORATION
    Inventors: John Erik Lindholm, Brett W. Coon, Stuart F. Oberman, Ming Y. Siu, Matthew P. Gerlach
  • Publication number: 20160300319
    Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.
    Type: Application
    Filed: March 25, 2013
    Publication date: October 13, 2016
    Applicant: NVIDIA Corporation
    Inventors: John Erik LINDHOLM, Brett W. COON, Stuart F. OBERMAN, Ming Y. SIU, Matthew P. GERLACH
  • Patent number: 9465575
    Abstract: A fused floating-point multiply-add element includes a multiplier that generates a product, and a shifter that shifts an addend within a narrow range. Interpreting logic analyzes the magnitude of the addend relative to the product and then causes logic arrays to position the shifted addend within the left, center, or right portions of a composite register depending in the magnitude of the addend relative to the product. The interpreting logic also forces other portions of the composite register to zero. When the addend is zero, the interpreting logic forces all portions of the composite register to zero. Final combining logic then adds the contents of the composite register to the product.
    Type: Grant
    Filed: August 5, 2013
    Date of Patent: October 11, 2016
    Assignee: NVIDIA Corporation
    Inventors: Srinivasan Iyer, David Conrad Tannenbaum, Stuart F. Oberman, Ming (Michael) Y. Siu
  • Patent number: 9189242
    Abstract: One embodiment of the present invention sets forth a technique for ensuring cache access instructions are scheduled for execution in a multi-threaded system to improve cache locality and system performance. A credit-based technique may be used to control instruction by instruction scheduling for each warp in a group so that the group of warps is processed uniformly. A credit is computed for each warp and the credit contributes to a weight for each warp. The weight is used to select instructions for the warps that are issued for execution.
    Type: Grant
    Filed: September 17, 2010
    Date of Patent: November 17, 2015
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, Brett W. Coon, Jered Wierzbicki, Robert J. Stoll, Stuart F. Oberman
  • Publication number: 20150193203
    Abstract: A four cycle fused floating point multiply-add unit includes a radix 8 Booth encoder multiplier that is partitioned over two stages with the compression element allocated to the second stage. The unit further includes an improved shifter design. Processing logic analyzes the input operands, detects values of zero and one, and inhibits portions of the processing logic accordingly. When one of the multiplicand inputs has a value of zero or one, the required multiplication becomes trivial, and the unit inhibits the associated coding logic and data transfer to reduce power consumption. The unit then performs an add-only operation. When the addend input has a value of zero, the addition becomes trivial, and the unit inhibits the improved shifter and data transfer to further reduce power consumption. The unit then performs a multiply-only operation.
    Type: Application
    Filed: January 7, 2014
    Publication date: July 9, 2015
    Applicant: NVIDIA CORPORATION
    Inventors: Srinivasan (Vasu) IYER, David Conrad TANNENBAUM, Stuart F. OBERMAN
  • Patent number: 9058672
    Abstract: One embodiment of the present invention sets forth a technique controlling the pixel location at which the plane equation is evaluated. Multiple pixel offsets (dx, dy) may be specified that each define to a sub-pixel sample position. Attributes are then calculated for each sub-pixel sample position that is covered by a geometric primitive. One advantage of the technique is that anti-aliasing quality may be improved since high frequency color components may be selectively supersampled for particular geometric primitives.
    Type: Grant
    Filed: October 5, 2010
    Date of Patent: June 16, 2015
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, Henry Packard Moreton, Ming Y. Siu, Stuart F. Oberman
  • Publication number: 20150113254
    Abstract: A subsystem is configured to support a distributed instruction set architecture with primary and secondary execution pipelines. The primary execution pipeline supports the execution of a subset of instructions in the distributed instruction set architecture that are issued frequently. The secondary execution pipeline supports the execution of another subset of instructions in the distributed instruction set architecture that are issued less frequently. Both execution pipelines also support the execution of FFMA instructions as well a common subset of instructions in the distributed instruction set architecture. When dispatching a requested instruction, an instruction scheduling unit is configured to select between the two execution pipelines based on various criteria. Those criteria may include power efficiency with which the instruction can be executed and availability of execution units to support execution of the instruction.
    Type: Application
    Filed: October 23, 2013
    Publication date: April 23, 2015
    Applicant: NVIDIA CORPORATION
    Inventors: David Conrad TANNENBAUM, Srinivasan (Vasu) IYER, Stuart F. OBERMAN, Ming Y. SIU, Michael Alan FETTERMAN, John Matthew BURGESS, Shirish GADRE
  • Publication number: 20150039662
    Abstract: A fused floating-point multiply-add element includes a multiplier that generates a product, and a shifter that shifts an addend within a narrow range. Interpreting logic analyzes the magnitude of the addend relative to the product and then causes logic arrays to position the shifted addend within the left, center, or right portions of a composite register depending in the magnitude of the addend relative to the product. The interpreting logic also forces other portions of the composite register to zero. When the addend is zero, the interpreting logic forces all portions of the composite register to zero. Final combining logic then adds the contents of the composite register to the product.
    Type: Application
    Filed: August 5, 2013
    Publication date: February 5, 2015
    Applicant: NVIDIA CORPORATION
    Inventors: Srinivasan IYER, David Conrad TANNENBAUM, Stuart F. OBERMAN, Ming (Michael) Y. SIU
  • Patent number: 8860741
    Abstract: In contrast to a conventional computing system in which the graphics processor (graphics processing unit or GPU) is treated as a slave to one or several CPUs, systems and methods are provided that allow the GPU to be treated as a central processing unit (CPU) from the perspective of the operating system. The GPU can access a memory space shared by other CPUs in the computing system. Caches utilized by the GPU may be coherent with caches utilized by other CPUs in the computing system. The GPU may share execution of general-purpose computations with other CPUs in the computing system.
    Type: Grant
    Filed: December 8, 2006
    Date of Patent: October 14, 2014
    Assignee: NVIDIA Corporation
    Inventors: Norbert Juffa, Stuart F. Oberman
  • Patent number: 8860737
    Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.
    Type: Grant
    Filed: July 19, 2006
    Date of Patent: October 14, 2014
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, Brett W. Coon, Stuart F. Oberman, Ming Y. Siu, Matthew P. Gerlach
  • Publication number: 20140285500
    Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.
    Type: Application
    Filed: March 25, 2013
    Publication date: September 25, 2014
    Applicant: NVIDIA Corporation
    Inventors: John Erik LINDHOLM, Brett W. COON, Stuart F. OBERMAN, Ming Y. SIU, Matthew P. GERLACH
  • Publication number: 20140143564
    Abstract: An approach is provided for enabling power reduction in floating-point operations. In one example, a system receives floating-point numbers of a fused multiply-add instruction. The system determines the fused multiply-add instruction does not require compliance with a standard of precision for floating-point numbers. The system generates gating signals for an integrated circuit that is configured to perform operations of the fused multiply-add instruction. The system then sends the gating signals to the integrated circuit to turn off a plurality of logic gates included in the integrated circuit.
    Type: Application
    Filed: November 21, 2012
    Publication date: May 22, 2014
    Applicant: NVIDIA Corporation
    Inventors: David Conrad TANNENBAUM, Colin SPRINKLE, Stuart F. OBERMAN, Ming Y. SIU, Srinivasan IYER, Ian-Chi Yan KWONG
  • Publication number: 20140129807
    Abstract: A system and method are described for providing hints to a processing unit that subsequent operations are likely. Responsively, the processing unit takes steps to prepare for the likely subsequent operations. Where the hints are more likely than not to be correct, the processing unit operates more efficiently. For example, in an embodiment, the processing unit consumes less power. In another embodiment, subsequent operations are performed more quickly because the processing unit is prepared to efficiently handle the subsequent operations.
    Type: Application
    Filed: November 7, 2012
    Publication date: May 8, 2014
    Applicant: NVIDIA CORPORATION
    Inventors: David Conrad TANNENBAUM, Ming Y. SIU, Stuart F. OBERMAN, Colin SPRINKLE, Srinivasan IYER, Ian Chi Yan KWONG
  • Patent number: 8645638
    Abstract: A memory is used by concurrent threads in a multithreaded processor. Any addressable storage location is accessible by any of the concurrent threads, but only one location at a time is accessible. The memory is coupled to parallel processing engines that generate a group of parallel memory access requests, each specifying a target address that might be the same or different for different requests. Serialization logic selects one of the target addresses and determines which of the requests specify the selected target address. All such requests are allowed to proceed in parallel, while other requests are deferred. Deferred requests may be regenerated and processed through the serialization logic so that a group of requests can be satisfied by accessing each different target address in the group exactly once.
    Type: Grant
    Filed: May 7, 2012
    Date of Patent: February 4, 2014
    Assignee: NVIDIA Corporation
    Inventors: Brett W. Coon, Ming Y. Siu, Weizhong Xu, Stuart F. Oberman, John R. Nickolls, Peter C. Mills