Patents by Inventor Stuart F. Oberman

Stuart F. Oberman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

PRIORITY ENCODER-BASED TECHNIQUES FOR COMPUTING THE MINIMUM OR THE MAXIMUM OF MULTIPLE VALUES

Publication number: 20230100785

Abstract: In various embodiments, the maximum or minimum of multiple input values is determined. For each of a set of possible values, a corresponding detection result is set to indicate whether at least one of the input values matches the possible value. The detection results are used to ascertain the maximum or minimum of the multiple input values.

Type: Application

Filed: September 28, 2021

Publication date: March 30, 2023

Inventors: Ilyas ELKIN, Brent Ralph BOSWELL, Stuart F. OBERMAN, Ming Y. SIU
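
Illustrative sketch (not taken from the application): the abstract above describes setting one detection result per possible value and then reading the minimum or maximum off those results. A minimal software model, assuming unsigned inputs of a small fixed bit width, could look like the following. The function names and the use of a Python integer as the detection vector are assumptions made for illustration; the actual circuit may differ.

```python
def detect(values, width):
    """Build a detection vector: bit v is set when at least one input equals v.

    Assumes every input is an unsigned integer that fits in `width` bits.
    """
    limit = 1 << width
    detected = 0
    for v in values:
        if not 0 <= v < limit:
            raise ValueError(f"value {v} does not fit in {width} bits")
        detected |= 1 << v
    return detected


def max_from_detection(detected):
    """Return the index of the highest set bit (priority encode from the top)."""
    if detected == 0:
        raise ValueError("no input values were detected")
    return detected.bit_length() - 1


def min_from_detection(detected):
    """Return the index of the lowest set bit (priority encode from the bottom)."""
    if detected == 0:
        raise ValueError("no input values were detected")
    lowest = detected & -detected  # isolates the lowest set bit
    return lowest.bit_length() - 1


if __name__ == "__main__":
    d = detect([5, 2, 7, 2], width=3)
    print(max_from_detection(d), min_from_detection(d))
```

In this model the detection vector has one bit per possible value, so it is only practical for narrow inputs or for a slice of a wider value.
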
Providing hints to an execution unit to prepare for predicted subsequent arithmetic operations

Patent number: 11150721

Abstract: A system and method are described for providing hints to a processing unit that subsequent operations are likely. Responsively, the processing unit takes steps to prepare for the likely subsequent operations. Where the hints are more likely than not to be correct, the processing unit operates more efficiently. For example, in an embodiment, the processing unit consumes less power. In another embodiment, subsequent operations are performed more quickly because the processing unit is prepared to efficiently handle the subsequent operations.

Type: Grant

Filed: November 7, 2012

Date of Patent: October 19, 2021

Assignee: NVIDIA Corporation

Inventors: David Conrad Tannenbaum, Ming Y. Siu, Stuart F. Oberman, Colin Sprinkle, Srinivasan Iyer, Ian Chi Yan Kwong
Dispatching a stored instruction in response to determining that a received instruction is of a same instruction type

Patent number: 10503513

Abstract: A subsystem is configured to support a distributed instruction set architecture with primary and secondary execution pipelines. The primary execution pipeline supports the execution of a subset of instructions in the distributed instruction set architecture that are issued frequently. The secondary execution pipeline supports the execution of another subset of instructions in the distributed instruction set architecture that are issued less frequently. Both execution pipelines also support the execution of FFMA instructions as well as a common subset of instructions in the distributed instruction set architecture. When dispatching a requested instruction, an instruction scheduling unit is configured to select between the two execution pipelines based on various criteria. Those criteria may include power efficiency with which the instruction can be executed and availability of execution units to support execution of the instruction.

Type: Grant

Filed: October 23, 2013

Date of Patent: December 10, 2019

Assignee: NVIDIA CORPORATION

Inventors: David Conrad Tannenbaum, Srinivasan (Vasu) Iyer, Stuart F. Oberman, Ming Y. Siu, Michael Alan Fetterman, John Matthew Burgess, Shirish Gadre
Programmable graphics processor for multithreaded execution of programs

Patent number: 10217184

Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

Type: Grant

Filed: May 23, 2017

Date of Patent: February 26, 2019

Assignee: NVIDIA CORPORATION

Inventors: John Erik Lindholm, Brett W. Coon, Stuart F. Oberman, Ming Y. Siu, Matthew P. Gerlach
Approach to power reduction in floating-point operations

Patent number: 9829956

Abstract: An approach is provided for enabling power reduction in floating-point operations. In one example, a system receives floating-point numbers of a fused multiply-add instruction. The system determines the fused multiply-add instruction does not require compliance with a standard of precision for floating-point numbers. The system generates gating signals for an integrated circuit that is configured to perform operations of the fused multiply-add instruction. The system then sends the gating signals to the integrated circuit to turn off a plurality of logic gates included in the integrated circuit.

Type: Grant

Filed: November 21, 2012

Date of Patent: November 28, 2017

Assignee: NVIDIA Corporation

Inventors: David Conrad Tannenbaum, Colin Sprinkle, Stuart F. Oberman, Ming Y. Siu, Srinivasan Iyer, Ian-Chi Yan Kwong
PROGRAMMABLE GRAPHICS PROCESSOR FOR MULTITHREADED EXECUTION OF PROGRAMS

Publication number: 20170256022

Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

Type: Application

Filed: May 23, 2017

Publication date: September 7, 2017

Inventors: John Erik LINDHOLM, Brett W. COON, Stuart F. OBERMAN, Ming Y. SIU, Matthew P. GERLACH
Programmable graphics processor for multithreaded execution of programs

Patent number: 9659339

Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

Type: Grant

Filed: March 25, 2013

Date of Patent: May 23, 2017

Assignee: NVIDIA CORPORATION

Inventors: John Erik Lindholm, Brett W. Coon, Stuart F. Oberman, Ming Y. Siu, Matthew P. Gerlach
PROGRAMMABLE GRAPHICS PROCESSOR FOR MULTITHREADED EXECUTION OF PROGRAMS

Publication number: 20160300319

Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

Type: Application

Filed: March 25, 2013

Publication date: October 13, 2016

Applicant: NVIDIA Corporation

Inventors: John Erik LINDHOLM, Brett W. COON, Stuart F. OBERMAN, Ming Y. SIU, Matthew P. GERLACH
FFMA operations using a multi-step approach to data shifting

Patent number: 9465575

Abstract: A fused floating-point multiply-add element includes a multiplier that generates a product, and a shifter that shifts an addend within a narrow range. Interpreting logic analyzes the magnitude of the addend relative to the product and then causes logic arrays to position the shifted addend within the left, center, or right portions of a composite register depending on the magnitude of the addend relative to the product. The interpreting logic also forces other portions of the composite register to zero. When the addend is zero, the interpreting logic forces all portions of the composite register to zero. Final combining logic then adds the contents of the composite register to the product.

Type: Grant

Filed: August 5, 2013

Date of Patent: October 11, 2016

Assignee: NVIDIA Corporation

Inventors: Srinivasan Iyer, David Conrad Tannenbaum, Stuart F. Oberman, Ming (Michael) Y. Siu
Credit-based streaming multiprocessor warp scheduling

Patent number: 9189242

Abstract: One embodiment of the present invention sets forth a technique for ensuring cache access instructions are scheduled for execution in a multi-threaded system to improve cache locality and system performance. A credit-based technique may be used to control instruction-by-instruction scheduling for each warp in a group so that the group of warps is processed uniformly. A credit is computed for each warp and the credit contributes to a weight for each warp. The weight is used to select instructions for the warps that are issued for execution.

Type: Grant

Filed: September 17, 2010

Date of Patent: November 17, 2015

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, Brett W. Coon, Jered Wierzbicki, Robert J. Stoll, Stuart F. Oberman
EFFICIENCY IN A FUSED FLOATING-POINT MULTIPLY-ADD UNIT

Publication number: 20150193203

Abstract: A four cycle fused floating point multiply-add unit includes a radix 8 Booth encoder multiplier that is partitioned over two stages with the compression element allocated to the second stage. The unit further includes an improved shifter design. Processing logic analyzes the input operands, detects values of zero and one, and inhibits portions of the processing logic accordingly. When one of the multiplicand inputs has a value of zero or one, the required multiplication becomes trivial, and the unit inhibits the associated coding logic and data transfer to reduce power consumption. The unit then performs an add-only operation. When the addend input has a value of zero, the addition becomes trivial, and the unit inhibits the improved shifter and data transfer to further reduce power consumption. The unit then performs a multiply-only operation.

Type: Application

Filed: January 7, 2014

Publication date: July 9, 2015

Applicant: NVIDIA CORPORATION

Inventors: Srinivasan (Vasu) IYER, David Conrad TANNENBAUM, Stuart F. OBERMAN
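
Illustrative sketch (not taken from the application): the abstract above describes detecting operands of zero or one and reducing the fused multiply-add to an add-only or multiply-only operation. The behavioral model below shows only that path selection. The function names and path labels are hypothetical, inputs are assumed to be finite, signed zeros are not distinguished, and the full path uses an ordinary multiply followed by an add rather than a single-rounding fused operation.

```python
def classify_fma(a, b, c):
    """Pick the datapath for a*b + c based on trivial operand values.

    Path labels are illustrative names, not terms from the application.
    """
    if a in (0.0, 1.0) or b in (0.0, 1.0):
        return "add_only"       # the product is known without multiplying
    if c == 0.0:
        return "multiply_only"  # the addition contributes nothing
    return "full_fma"


def fma_with_bypass(a, b, c):
    """Compute a*b + c and report which path was taken.

    Assumes finite inputs; NaN and infinity handling is omitted.
    """
    path = classify_fma(a, b, c)
    if path == "add_only":
        if a == 0.0 or b == 0.0:
            product = 0.0
        elif a == 1.0:
            product = b
        else:
            product = a
        return product + c, path
    if path == "multiply_only":
        return a * b, path
    return a * b + c, path


if __name__ == "__main__":
    for operands in [(1.0, 3.0, 2.0), (0.0, 3.0, 2.0), (2.0, 3.0, 0.0), (2.0, 3.0, 1.0)]:
        print(operands, fma_with_bypass(*operands))
```

In hardware the benefit described is reduced power from inhibiting unused logic; this model only captures which operations would be skipped.
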
Using a pixel offset for evaluating a plane equation

Patent number: 9058672

Abstract: One embodiment of the present invention sets forth a technique for controlling the pixel location at which the plane equation is evaluated. Multiple pixel offsets (dx, dy) may be specified that each define a sub-pixel sample position. Attributes are then calculated for each sub-pixel sample position that is covered by a geometric primitive. One advantage of the technique is that anti-aliasing quality may be improved since high frequency color components may be selectively supersampled for particular geometric primitives.

Type: Grant

Filed: October 5, 2010

Date of Patent: June 16, 2015

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, Henry Packard Moreton, Ming Y. Siu, Stuart F. Oberman
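
Illustrative sketch (not taken from the patent): the abstract above describes evaluating a plane equation at sub-pixel sample positions given by offsets (dx, dy). Assuming the common attribute plane form A*x + B*y + C, a minimal model could look like the following. The coefficient layout, function names, and the coverage list are assumptions made for illustration.

```python
def evaluate_plane(coeffs, x, y, dx=0.0, dy=0.0):
    """Evaluate A*(x + dx) + B*(y + dy) + C.

    `coeffs` is an (A, B, C) tuple; (dx, dy) is the sub-pixel offset.
    """
    a, b, c = coeffs
    return a * (x + dx) + b * (y + dy) + c


def evaluate_covered_samples(coeffs, x, y, offsets, covered):
    """Return one attribute value per sample position that the primitive covers.

    `covered` holds one boolean per entry in `offsets` (hypothetical input).
    """
    return [
        evaluate_plane(coeffs, x, y, dx, dy)
        for (dx, dy), hit in zip(offsets, covered)
        if hit
    ]


if __name__ == "__main__":
    offsets = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
    covered = [True, False, True, True]
    print(evaluate_covered_samples((2.0, 4.0, 1.0), 3, 5, offsets, covered))
```
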
EFFICIENCY THROUGH A DISTRIBUTED INSTRUCTION SET ARCHITECTURE

Publication number: 20150113254

Abstract: A subsystem is configured to support a distributed instruction set architecture with primary and secondary execution pipelines. The primary execution pipeline supports the execution of a subset of instructions in the distributed instruction set architecture that are issued frequently. The secondary execution pipeline supports the execution of another subset of instructions in the distributed instruction set architecture that are issued less frequently. Both execution pipelines also support the execution of FFMA instructions as well as a common subset of instructions in the distributed instruction set architecture. When dispatching a requested instruction, an instruction scheduling unit is configured to select between the two execution pipelines based on various criteria. Those criteria may include power efficiency with which the instruction can be executed and availability of execution units to support execution of the instruction.

Type: Application

Filed: October 23, 2013

Publication date: April 23, 2015

Applicant: NVIDIA CORPORATION

Inventors: David Conrad TANNENBAUM, Srinivasan (Vasu) IYER, Stuart F. OBERMAN, Ming Y. SIU, Michael Alan FETTERMAN, John Matthew BURGESS, Shirish GADRE
FFMA OPERATIONS USING A MULTI-STEP APPROACH TO DATA SHIFTING

Publication number: 20150039662

Abstract: A fused floating-point multiply-add element includes a multiplier that generates a product, and a shifter that shifts an addend within a narrow range. Interpreting logic analyzes the magnitude of the addend relative to the product and then causes logic arrays to position the shifted addend within the left, center, or right portions of a composite register depending on the magnitude of the addend relative to the product. The interpreting logic also forces other portions of the composite register to zero. When the addend is zero, the interpreting logic forces all portions of the composite register to zero. Final combining logic then adds the contents of the composite register to the product.

Type: Application

Filed: August 5, 2013

Publication date: February 5, 2015

Applicant: NVIDIA CORPORATION

Inventors: Srinivasan IYER, David Conrad TANNENBAUM, Stuart F. OBERMAN, Ming (Michael) Y. SIU
Graphics processor with memory management unit and cache coherent link

Patent number: 8860741

Abstract: In contrast to a conventional computing system in which the graphics processor (graphics processing unit or GPU) is treated as a slave to one or several CPUs, systems and methods are provided that allow the GPU to be treated as a central processing unit (CPU) from the perspective of the operating system. The GPU can access a memory space shared by other CPUs in the computing system. Caches utilized by the GPU may be coherent with caches utilized by other CPUs in the computing system. The GPU may share execution of general-purpose computations with other CPUs in the computing system.

Type: Grant

Filed: December 8, 2006

Date of Patent: October 14, 2014

Assignee: NVIDIA Corporation

Inventors: Norbert Juffa, Stuart F. Oberman
Programmable graphics processor for multithreaded execution of programs

Patent number: 8860737

Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

Type: Grant

Filed: July 19, 2006

Date of Patent: October 14, 2014

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, Brett W. Coon, Stuart F. Oberman, Ming Y. Siu, Matthew P. Gerlach
PROGRAMMABLE GRAPHICS PROCESSOR FOR MULTITHREADED EXECUTION OF PROGRAMS

Publication number: 20140285500

Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

Type: Application

Filed: March 25, 2013

Publication date: September 25, 2014

Applicant: NVIDIA Corporation

Inventors: John Erik LINDHOLM, Brett W. COON, Stuart F. OBERMAN, Ming Y. SIU, Matthew P. GERLACH
APPROACH TO POWER REDUCTION IN FLOATING-POINT OPERATIONS

Publication number: 20140143564

Abstract: An approach is provided for enabling power reduction in floating-point operations. In one example, a system receives floating-point numbers of a fused multiply-add instruction. The system determines the fused multiply-add instruction does not require compliance with a standard of precision for floating-point numbers. The system generates gating signals for an integrated circuit that is configured to perform operations of the fused multiply-add instruction. The system then sends the gating signals to the integrated circuit to turn off a plurality of logic gates included in the integrated circuit.

Type: Application

Filed: November 21, 2012

Publication date: May 22, 2014

Applicant: NVIDIA Corporation

Inventors: David Conrad TANNENBAUM, Colin SPRINKLE, Stuart F. OBERMAN, Ming Y. SIU, Srinivasan IYER, Ian-Chi Yan KWONG
APPROACH FOR EFFICIENT ARITHMETIC OPERATIONS

Publication number: 20140129807

Abstract: A system and method are described for providing hints to a processing unit that subsequent operations are likely. Responsively, the processing unit takes steps to prepare for the likely subsequent operations. Where the hints are more likely than not to be correct, the processing unit operates more efficiently. For example, in an embodiment, the processing unit consumes less power. In another embodiment, subsequent operations are performed more quickly because the processing unit is prepared to efficiently handle the subsequent operations.

Type: Application

Filed: November 7, 2012

Publication date: May 8, 2014

Applicant: NVIDIA CORPORATION

Inventors: David Conrad TANNENBAUM, Ming Y. SIU, Stuart F. OBERMAN, Colin SPRINKLE, Srinivasan IYER, Ian Chi Yan KWONG
Shared single-access memory with management of multiple parallel requests

Patent number: 8645638

Abstract: A memory is used by concurrent threads in a multithreaded processor. Any addressable storage location is accessible by any of the concurrent threads, but only one location at a time is accessible. The memory is coupled to parallel processing engines that generate a group of parallel memory access requests, each specifying a target address that might be the same or different for different requests. Serialization logic selects one of the target addresses and determines which of the requests specify the selected target address. All such requests are allowed to proceed in parallel, while other requests are deferred. Deferred requests may be regenerated and processed through the serialization logic so that a group of requests can be satisfied by accessing each different target address in the group exactly once.

Type: Grant

Filed: May 7, 2012

Date of Patent: February 4, 2014

Assignee: NVIDIA Corporation

Inventors: Brett W. Coon, Ming Y. Siu, Weizhong Xu, Stuart F. Oberman, John R. Nickolls, Peter C. Mills
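
Illustrative sketch (not taken from the patent): the abstract above describes serialization logic that selects one target address, lets every request for that address proceed together, and defers the rest. A minimal model of that loop follows. The request format (lane, address) and the policy of selecting the first pending request's address are assumptions made for illustration; the patent may select the target differently.

```python
def serialize_requests(requests):
    """Group parallel requests into passes, one memory access per distinct address.

    `requests` is a list of (lane, address) pairs. Each pass is returned as
    (address, [lanes served]). Selecting the first pending address is an
    assumed policy.
    """
    pending = list(requests)
    passes = []
    while pending:
        target = pending[0][1]
        # All requests for the selected address proceed in parallel.
        served = [lane for lane, addr in pending if addr == target]
        # Everything else is deferred and re-presented on the next pass.
        pending = [(lane, addr) for lane, addr in pending if addr != target]
        passes.append((target, served))
    return passes


if __name__ == "__main__":
    group = [(0, 16), (1, 32), (2, 16), (3, 48), (4, 32)]
    for address, lanes in serialize_requests(group):
        print(address, lanes)
```

With this model the number of passes equals the number of distinct addresses in the group, so a group whose requests all target one address completes in a single pass.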
