Patents by Inventor John Erik Lindholm

John Erik Lindholm has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Register based queuing for texture requests

Patent number: 7864185

Abstract: A graphics processing unit can queue a large number of texture requests to balance out the variability of texture requests without the need for a large texture request buffer. A dedicated texture request buffer queues the relatively small texture commands and parameters. Additionally, for each queued texture command, an associated set of texture arguments, which are typically much larger than the texture command, are stored in a general purpose register. The texture unit retrieves texture commands from the texture request buffer and then fetches the associated texture arguments from the appropriate general purpose register. The texture arguments may be stored in the general purpose register designated as the destination of the final texture value computed by the texture unit. Because the destination register must be allocated for the final texture value as texture commands are queued, storing the texture arguments in this register does not consume any additional registers.

Type: Grant

Filed: October 23, 2008

Date of Patent: January 4, 2011

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, John R. Nickolls, Simon S. Moy, Brett W. Coon
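
The queuing scheme in the abstract above can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the patented hardware: the class name `TextureQueueSketch`, its methods, and the `sample` callback are all hypothetical and stand in for the texture request buffer, the general purpose register file, and the texture unit.

```python
from collections import deque


class TextureQueueSketch:
    """Toy model: small commands sit in a FIFO, bulky arguments sit in registers."""

    def __init__(self, num_registers):
        self.registers = [None] * num_registers  # general purpose register file
        self.commands = deque()                  # dedicated texture request buffer

    def enqueue(self, opcode, dest_reg, args):
        # The destination register is already reserved for the final value,
        # so parking the arguments there consumes no additional register.
        self.registers[dest_reg] = args
        self.commands.append((opcode, dest_reg))

    def service_one(self, sample):
        # The texture unit pops a command, fetches its arguments from the
        # destination register, then overwrites them with the computed value.
        opcode, dest_reg = self.commands.popleft()
        args = self.registers[dest_reg]
        self.registers[dest_reg] = sample(opcode, args)
        return dest_reg
```

In this model the FIFO only ever holds an opcode and a register index, which is why it can stay small while many requests are in flight.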
Offloading cube map calculations to a shader

Patent number: 7859548

Abstract: Systems and methods for performing cube mapping computations using a shader program may reduce the need for fixed function cube mapping computation units in graphics processors. Therefore, die area is used more efficiently since a general purpose processing unit may be configured using shader program instructions to perform the cube mapping computations and other computations. The general purpose processing unit is configured to perform floating point computations to identify the cube map face that will be read and process the cube map coordinates. A fixed function unit is also configured to identify the cube map face that will be read to avoid passing the cube map face information from the general purpose processing unit to the fixed function unit.

Type: Grant

Filed: October 19, 2006

Date of Patent: December 28, 2010

Assignee: NVIDIA Corporation

Inventor: John Erik Lindholm
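
As an illustration of the floating point face-selection step mentioned in the abstract above, the following sketch picks the cube map face from the largest-magnitude component of a direction vector. The abstract does not specify a face table; the one used here follows the OpenGL convention and is an assumption, as is the function name `cube_face`.

```python
def cube_face(x, y, z):
    """Return (face, s, t) for direction (x, y, z), using the OpenGL face table."""
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax == 0 and ay == 0 and az == 0:
        raise ValueError("direction vector must be non-zero")
    if ax >= ay and ax >= az:      # X is the major axis
        face, sc, tc, ma = ("+X", -z, -y, ax) if x > 0 else ("-X", z, -y, ax)
    elif ay >= az:                 # Y is the major axis
        face, sc, tc, ma = ("+Y", x, z, ay) if y > 0 else ("-Y", x, -z, ay)
    else:                          # Z is the major axis
        face, sc, tc, ma = ("+Z", x, -y, az) if z > 0 else ("-Z", -x, -y, az)
    # Project onto the face and remap from [-1, 1] to [0, 1].
    return face, 0.5 * (sc / ma + 1.0), 0.5 * (tc / ma + 1.0)
```

A shader program would run the same comparisons and divide per fragment; ties between axes are broken here in X, Y, Z order, which is also an assumption.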
Programmable graphics processor for generalized texturing

Patent number: 7852346

Abstract: A programmable graphics processor including an execution pipeline and a texture unit is described. The execution pipeline processes graphics data as specified by a fragment program. The fragment program may include one or more opcodes. The texture unit includes one or more sub-units which execute the opcodes to perform specific operations such as an LOD computation, generation of sample locations used to read texture map data, and address computation based on the sample locations.

Type: Grant

Filed: November 22, 2005

Date of Patent: December 14, 2010

Assignee: NVIDIA Corporation

Inventors: Walter E. Donovan, John Erik Lindholm
Operand collector architecture

Patent number: 7834881

Abstract: An apparatus and method for simulating a multi-ported memory using lower port count memories as banks. Collector units gather source operands from the banks as needed to process program instructions. The collector units also gather constants that are used as operands. When all of the source operands needed to process a program instruction have been gathered, a collector unit outputs the source operands to an execution unit while avoiding writeback conflicts to registers specified by the program instruction that may be accessed by other execution units.

Type: Grant

Filed: November 1, 2006

Date of Patent: November 16, 2010

Assignee: NVIDIA Corporation

Inventors: Samuel Liu, John Erik Lindholm, Ming Y Siu, Brett W. Coon, Stuart F. Oberman
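
The operand gathering described in the abstract above can be sketched as a cycle count over single-ported banks. This is a simplified model with assumptions labeled in the code: the register-to-bank mapping (`reg % num_banks`) and the function name `collect_cycles` are hypothetical, and constants and writeback conflicts are not modeled.

```python
def collect_cycles(source_regs, num_banks):
    """Cycles needed to gather all operands when each bank serves one read per cycle."""
    pending = list(source_regs)
    cycles = 0
    while pending:
        busy = set()
        leftover = []
        for reg in pending:
            bank = reg % num_banks      # assumed register-to-bank mapping
            if bank in busy:
                leftover.append(reg)    # bank conflict: retry next cycle
            else:
                busy.add(bank)          # operand gathered this cycle
        pending = leftover
        cycles += 1
    return cycles
```

Operands that fall in different banks are gathered in a single cycle, while operands that share a bank are spread over several cycles, which is the behavior a collector unit hides from the execution unit.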
System and method for processing thread groups in a SIMD architecture

Patent number: 7836276

Abstract: A SIMD processor efficiently utilizes its hardware resources to achieve higher data processing throughput. The effective width of a SIMD processor is extended by clocking the instruction processing side of the SIMD processor at a fraction of the rate of the data processing side and by providing multiple execution pipelines, each with multiple data paths. As a result, higher data processing throughput is achieved while an instruction is fetched and issued once per clock. This configuration also allows a large group of threads to be clustered and executed together through the SIMD processor so that greater memory efficiency can be achieved for certain types of operations like texture memory accesses performed in connection with graphics processing.

Type: Grant

Filed: December 2, 2005

Date of Patent: November 16, 2010

Assignee: NVIDIA Corporation

Inventors: Brett W. Coon, John Erik Lindholm
Processing an indirect branch instruction in a SIMD architecture

Patent number: 7761697

Abstract: One embodiment of a computing system configured to manage divergent threads in a thread group includes a stack configured to store at least one token and a multithreaded processing unit. The multithreaded processing unit is configured to perform the steps of fetching a program instruction, determining that the program instruction is an indirect branch instruction, and processing the indirect branch instruction as a sequence of two-way branches to execute an indirect branch instruction with multiple branch addresses. Indirect branch instructions may be used to allow greater flexibility since the branch address or multiple branch addresses do not need to be determined at compile time.

Type: Grant

Filed: November 6, 2006

Date of Patent: July 20, 2010

Assignee: NVIDIA Corporation

Inventors: Brett W. Coon, John Erik Lindholm, Peter C. Mills, John R. Nickolls
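
One way to picture "a sequence of two-way branches" from the abstract above is to peel off one target address at a time: threads whose address matches take the branch, and the rest are deferred. The sketch below replaces the token stack with a plain list and is an assumption-laden illustration; `split_indirect_branch` is a hypothetical name.

```python
def split_indirect_branch(targets, active):
    """Break one indirect branch into two-way (match / no match) steps.

    targets[i] is the branch address held by thread i;
    active is the list of thread ids currently executing.
    """
    steps = []
    remaining = list(active)
    while remaining:
        chosen = targets[remaining[0]]          # pick one pending address
        taken = [t for t in remaining if targets[t] == chosen]
        remaining = [t for t in remaining if targets[t] != chosen]
        steps.append((chosen, taken))           # 'remaining' is deferred
    return steps
```

With four threads holding addresses `0x10, 0x20, 0x10, 0x30`, the branch resolves in three steps, one per distinct address, and no address needs to be known at compile time.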
System, method and article of manufacture for a programmable processing model with instruction set

Patent number: 7755636

Abstract: A system, method and article of manufacture are provided for programmable processing in a computer graphics pipeline. Initially, data is received from a source buffer. Thereafter, programmable operations are performed on the data in order to generate output. The operations are programmable in that a user may utilize instructions from a predetermined instruction set for generating the same. Such output is stored in a register. During operation, the output stored in the register is used in performing the programmable operations on the data.

Type: Grant

Filed: November 19, 2007

Date of Patent: July 13, 2010

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, David B. Kirk, Henry P. Moreton, Simon Moy
System, method and computer program product for branching during programmable vertex processing

Patent number: 7755634

Abstract: A system, method and computer program product are provided for branching during graphics processing. Initially, a first operation is performed on data. In response to the first operation, a branching operation is performed to a second operation. The first operation and the second operation are associated with instructions selected from a predetermined instruction set.

Type: Grant

Filed: November 22, 2005

Date of Patent: July 13, 2010

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, Simon S. Moy, Robert Steven Glanville
Across-thread out-of-order instruction dispatch in a multithreaded microprocessor

Publication number: 20100122067

Abstract: Instruction dispatch in a multithreaded microprocessor such as a graphics processor is not constrained by an order among the threads. Instructions for each thread are fetched into a buffer, and a dispatch circuit determines which instructions in the buffer are ready to execute. The dispatch circuit may issue any ready instruction for execution, and an instruction from one thread may be issued prior to an instruction from another thread regardless of which instruction was fetched first. If multiple functional units are available, multiple instructions can be dispatched in parallel.

Type: Application

Filed: January 20, 2010

Publication date: May 13, 2010

Applicant: NVIDIA Corporation

Inventors: John Erik Lindholm, Brett Coon, Simon S. Moy
System, method and article of manufacture for a programmable processing model with instruction set

Patent number: 7697008

Abstract: A system, method and article of manufacture are provided for programmable processing in a computer graphics pipeline. Initially, data is received from a source buffer. Thereafter, programmable operations are performed on the data in order to generate output. The operations are programmable in that a user may utilize instructions from a predetermined instruction set for generating the same. Such output is stored in a register. During operation, the output stored in the register is used in performing the programmable operations on the data.

Type: Grant

Filed: February 28, 2007

Date of Patent: April 13, 2010

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, David B. Kirk, Henry P. Moreton, Simon Moy
Methods of processing graphics data including reading and writing buffers

Patent number: 7683905

Abstract: Apparatuses and methods for detecting position conflicts during fragment processing are described. Prior to executing a program on a fragment, a conflict detection unit within a fragment processor checks if there is a position conflict indicating a RAW (read after write) hazard may exist. A RAW hazard exists when there is a pending write to a destination location that source data will be read from during execution of the program. When the fragment enters a processing pipeline, each destination location that may be written during the processing of the fragment is entered in the conflict detection unit. During processing, the conflict detection unit is updated when a pending write to a destination location is completed.

Type: Grant

Filed: July 26, 2006

Date of Patent: March 23, 2010

Assignee: NVIDIA Corporation

Inventors: David B. Kirk, Matthew N. Papakipos, Rui M. Bastos, John Erik Lindholm, Steven E. Molnar
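
The bookkeeping described in the abstract above (enter pending writes, check reads against them, retire writes on completion) can be sketched as follows. The class and method names are hypothetical, and using a per-position counter is an assumption made so that overlapping writes to the same location are tracked correctly.

```python
class ConflictDetector:
    """Tracks destination positions that still have a write in flight."""

    def __init__(self):
        self.pending = {}  # position -> number of pending writes

    def enter(self, write_positions):
        # Called when a fragment enters the pipeline.
        for pos in write_positions:
            self.pending[pos] = self.pending.get(pos, 0) + 1

    def has_raw_hazard(self, read_positions):
        # A RAW hazard may exist if any source location has a pending write.
        return any(pos in self.pending for pos in read_positions)

    def complete(self, position):
        # Called when a pending write to 'position' has finished.
        self.pending[position] -= 1
        if self.pending[position] == 0:
            del self.pending[position]
```

A fragment whose reads hit a pending position would be held back until `complete` clears the entry.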
Across-thread out-of-order instruction dispatch in a multithreaded microprocessor

Patent number: 7676657

Abstract: Instruction dispatch in a multithreaded microprocessor such as a graphics processor is not constrained by an order among the threads. Instructions for each thread are fetched into a buffer, and a dispatch circuit determines which instructions in the buffer are ready to execute. The dispatch circuit may issue any ready instruction for execution, and an instruction from one thread may be issued prior to an instruction from another thread regardless of which instruction was fetched first. If multiple functional units are available, multiple instructions can be dispatched in parallel.

Type: Grant

Filed: October 10, 2006

Date of Patent: March 9, 2010

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, Brett Coon, Simon S. Moy
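
The dispatch policy in the abstract above can be sketched as a selection over a buffer of fetched instructions in which readiness, not fetch order, decides what issues. The tuple layout `(thread, fetch_order, ready)` and the function name `dispatch` are assumptions for illustration only.

```python
def dispatch(buffer, num_units):
    """Issue up to num_units ready instructions, ignoring fetch order.

    buffer is a list of (thread, fetch_order, ready) tuples and is
    modified in place: issued instructions are removed from it.
    """
    issued = [ins for ins in buffer if ins[2]][:num_units]
    for ins in issued:
        buffer.remove(ins)
    return issued
```

For example, if thread 0 was fetched first but is stalled, a ready instruction from thread 1 issues ahead of it, and with two functional units two ready instructions issue in the same cycle.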
Register file allocation

Patent number: 7634621

Abstract: Circuits, methods, and apparatus that provide the die area and power savings of a single-ported memory with the performance advantages of a multiported memory. One example provides register allocation methods for storing data in a multiple-bank register file. In a thin register allocation method, data for a process is stored in a single bank. In this way, different processes use different banks to avoid conflicts. In a fat register allocation method, processes store data in each bank. In this way, if one process uses a large number of registers, those registers are spread among the banks, avoiding a situation where one bank is filled and other processes are forced to share a reduced number of banks. In a hybrid register allocation method, processes store data in more than one bank, but fewer than all the banks. Each of these methods may be combined in varying ways.

Type: Grant

Filed: November 3, 2006

Date of Patent: December 15, 2009

Assignee: NVIDIA Corporation

Inventors: Brett W. Coon, John Erik Lindholm, Gary Tarolli, Svetoslav D. Tzvetkov, John R. Nickolls, Ming Y. Siu
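
The thin, fat, and hybrid allocation methods in the abstract above differ only in how a register maps to a bank. The mappings below are one possible sketch; the modulo-based formulas, the `span` parameter, and the function names are assumptions and are not taken from the patent.

```python
def thin_bank(pid, reg, num_banks):
    """Thin: every register of a process lives in a single bank."""
    return pid % num_banks


def fat_bank(pid, reg, num_banks):
    """Fat: a process's registers are striped across all banks."""
    return reg % num_banks


def hybrid_bank(pid, reg, num_banks, span):
    """Hybrid: a process uses 'span' banks, with 1 < span < num_banks."""
    base = (pid * span) % num_banks
    return (base + reg % span) % num_banks
```

With four banks and `span=2`, process 0 uses banks 0 and 1 while process 1 uses banks 2 and 3, which sits between the thin extreme (one bank each) and the fat extreme (all four banks each).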
Execution of parallel groups of threads with per-instruction serialization

Patent number: 7634637

Abstract: In a processor, a SIMD group (a group of threads for which instructions are issued in parallel using single instruction, multiple data instruction issue techniques) is logically divided into two or more “SIMD subsets,” each containing one or more of the threads in the SIMD group. Each SIMD subset is associated with a different instance of a variable state parameter. The processor determines which of the instructions to be executed for the SIMD group rely on the state variable and serializes execution of such instructions so that the instruction is executed separately for each SIMD subset. Instructions that do not rely on the state variable are advantageously not serialized.

Type: Grant

Filed: December 16, 2005

Date of Patent: December 15, 2009

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, Stuart F. Oberman
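
The serialization rule in the abstract above can be sketched as an issue plan: an instruction that relies on the state variable is issued once per SIMD subset, and any other instruction is issued once for the whole group. The `(name, uses_state)` encoding and the function name `issue_plan` are hypothetical.

```python
def issue_plan(instructions, subsets):
    """Return a list of (instruction name, thread ids) issue slots.

    instructions: list of (name, uses_state) pairs in program order.
    subsets: list of thread-id lists, one per SIMD subset.
    """
    all_threads = [t for subset in subsets for t in subset]
    plan = []
    for name, uses_state in instructions:
        if uses_state:
            # Serialized: one issue per subset, each with its own state instance.
            plan.extend((name, list(subset)) for subset in subsets)
        else:
            # Not serialized: the whole SIMD group executes together.
            plan.append((name, all_threads))
    return plan
```

Only the state-dependent instructions pay the serialization cost, so a program with few such instructions runs close to full SIMD width.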
Structured programming control flow using a disable mask in a SIMD architecture

Patent number: 7617384

Abstract: One embodiment of a computing system configured to manage divergent threads in a SIMD thread group includes a stack configured to store state information for processing control instructions. A parallel processing unit is configured to perform the steps of determining if one or more threads diverge during execution of a conditional control instruction. Threads that exit a program are identified as idle by a disable mask. Other threads that are disabled may be enabled once the divergent threads reach an instruction that enables the disabled threads. Use of the disable mask allows for the use of conditional return and break instructions in a multithreaded SIMD architecture.

Type: Grant

Filed: January 31, 2007

Date of Patent: November 10, 2009

Assignee: NVIDIA Corporation

Inventors: Brett W. Coon, John Erik Lindholm, Svetoslav D. Tzvetkov
Indirect function call instructions in a synchronous parallel thread processor

Publication number: 20090240931

Abstract: An indirect branch instruction takes an address register as an argument in order to provide indirect function call capability for single-instruction multiple-thread (SIMT) processor architectures. The indirect branch instruction is used to implement indirect function calls, virtual function calls, and switch statements to improve processing performance compared with using sequential chains of tests and branches.

Type: Application

Filed: March 24, 2008

Publication date: September 24, 2009

Inventors: Brett W. Coon, John R. Nickolls, Lars Nyland, Peter C. Mills, John Erik Lindholm
Apparatus and method for raster tile coalescing

Patent number: 7564456

Abstract: A graphics pipeline rasterizes primitives and generates a stream of groups of pixels, such as a stream of pixel quads. A tile coalesce unit receives the stream of groups of pixels and generates pixel tiles for use by downstream pixel processing units. The pixel tiles facilitate hazard checks and transaction coherency.

Type: Grant

Filed: January 13, 2006

Date of Patent: July 21, 2009

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, Henry Packard Moreton, John S. Montrym, Scott R. Whitman
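
A minimal sketch of the coalescing idea in the abstract above: consecutive pixel quads that land in the same screen tile are merged into one tile record. The 8-pixel tile size, the flush-on-tile-change policy, and the function name `coalesce_quads` are all assumptions; the abstract does not state them.

```python
def coalesce_quads(quads, tile_size=8):
    """Group a stream of quads into tiles.

    quads: iterable of (x, y) top-left pixel coordinates, in stream order.
    Returns a list of (tile_xy, [quads]); a new tile record starts whenever
    the incoming quad falls in a different tile than the previous one.
    """
    tiles = []
    for x, y in quads:
        key = (x // tile_size, y // tile_size)
        if tiles and tiles[-1][0] == key:
            tiles[-1][1].append((x, y))     # same tile: coalesce
        else:
            tiles.append((key, [(x, y)]))   # tile changed: start a new record
    return tiles
```

Downstream units can then run a hazard check once per tile record instead of once per quad.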
Subdividing a shader program

Patent number: 7542043

Abstract: Methods and apparatus for subdividing a shader program into regions or “phases” of instructions identifiable by phase identifiers (IDs) inserted into the shader program are provided. The phase IDs may be used to constrain execution of the shader program to prohibit texture fetches in later phases from being executed before a texture fetch in a current phase has completed. Other operations (e.g., math operations) within the current phase, however, may be allowed to execute while waiting for the current phase texture fetch to complete.

Type: Grant

Filed: May 23, 2005

Date of Patent: June 2, 2009

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, Brett W. Coon, Gary M. Tarolli
System and method for managing divergent threads using synchronization tokens and program instructions that include set-synchronization bits

Patent number: 7543136

Abstract: One embodiment of a computing system configured to manage divergent threads in a thread group includes a stack configured to store at least one token and a multithreaded processing unit. The multithreaded processing unit is configured to perform the steps of fetching a program instruction, determining that the program instruction is a branch instruction, determining that the program instruction is not a return or break instruction, determining whether the program instruction includes a set-synchronization bit, and updating an active program counter, where the manner in which the active program counter is updated depends on a branch instruction type.

Type: Grant

Filed: July 13, 2005

Date of Patent: June 2, 2009

Assignee: NVIDIA Corporation

Inventors: Brett W. Coon, John Erik Lindholm
System and method for synchronizing divergent samples in a programmable graphics processing unit

Patent number: 7477255

Abstract: A method for synchronizing divergent samples in a programmable graphics processing unit is described. In one embodiment, the method includes the steps of determining that a divergence has occurred and detecting that a first sample of a group of samples has encountered a first synch token. The method also includes the steps of determining whether each of the other samples of the group has encountered a synch token and determining whether the synch token encountered by each of the other samples of the group is the first synch token.

Type: Grant

Filed: April 12, 2004

Date of Patent: January 13, 2009

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, Harold Robert Feldman Zatz, Christian Rouet, Rui M. Bastos
