Patents by Inventor John Erik Lindholm

John Erik Lindholm has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7864185
    Abstract: A graphics processing unit can queue a large number of texture requests to balance out the variability of texture requests without the need for a large texture request buffer. A dedicated texture request buffer queues the relatively small texture commands and parameters. Additionally, for each queued texture command, an associated set of texture arguments, which are typically much larger than the texture command, are stored in a general purpose register. The texture unit retrieves texture commands from the texture request buffer and then fetches the associated texture arguments from the appropriate general purpose register. The texture arguments may be stored in the general purpose register designated as the destination of the final texture value computed by the texture unit. Because the destination register must be allocated for the final texture value as texture commands are queued, storing the texture arguments in this register does not consume any additional registers.
    Type: Grant
    Filed: October 23, 2008
    Date of Patent: January 4, 2011
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, John R. Nickolls, Simon S. Moy, Brett W. Coon
  • Patent number: 7859548
    Abstract: Systems and methods for performing cube mapping computations using a shader program may reduce the need for fixed function cube mapping computation units in graphics processors. Therefore, die area is used more efficiently since a general purpose processing unit may be configured using shader program instructions to perform the cube mapping computations and other computations. The general purpose processing unit is configured to perform floating point computations to identify the cube map face that will be read and process the cube map coordinates. A fixed function unit is also configured to identify the cube map face that will be read to avoid passing the cube map face information from the general purpose processing unit to the fixed function unit.
    Type: Grant
    Filed: October 19, 2006
    Date of Patent: December 28, 2010
    Assignee: NVIDIA Corporation
    Inventor: John Erik Lindholm
  • Patent number: 7852346
    Abstract: A programmable graphics processor including an execution pipeline and a texture unit is described. The execution pipeline processes graphics data as specified by a fragment program. The fragment program may include one or more opcodes. The texture unit includes one or more sub-units which execute the opcodes to perform specific operations such as an LOD computation, generation of sample locations used to read texture map data, and address computation based on the sample locations.
    Type: Grant
    Filed: November 22, 2005
    Date of Patent: December 14, 2010
    Assignee: NVIDIA Corporation
    Inventors: Walter E. Donovan, John Erik Lindholm
  • Patent number: 7834881
    Abstract: An apparatus and method for simulating a multi-ported memory using lower port count memories as banks. Collector units gather source operands from the banks as needed to process program instructions. The collector units also gather constants that are used as operands. When all of the source operands needed to process a program instruction have been gathered, a collector unit outputs the source operands to an execution unit while avoiding writeback conflicts to registers specified by the program instruction that may be accessed by other execution units.
    Type: Grant
    Filed: November 1, 2006
    Date of Patent: November 16, 2010
    Assignee: NVIDIA Corporation
    Inventors: Samuel Liu, John Erik Lindholm, Ming Y. Siu, Brett W. Coon, Stuart F. Oberman
  • Patent number: 7836276
    Abstract: A SIMD processor efficiently utilizes its hardware resources to achieve higher data processing throughput. The effective width of a SIMD processor is extended by clocking the instruction processing side of the SIMD processor at a fraction of the rate of the data processing side and by providing multiple execution pipelines, each with multiple data paths. As a result, higher data processing throughput is achieved while an instruction is fetched and issued once per clock. This configuration also allows a large group of threads to be clustered and executed together through the SIMD processor so that greater memory efficiency can be achieved for certain types of operations like texture memory accesses performed in connection with graphics processing.
    Type: Grant
    Filed: December 2, 2005
    Date of Patent: November 16, 2010
    Assignee: NVIDIA Corporation
    Inventors: Brett W. Coon, John Erik Lindholm
  • Patent number: 7761697
    Abstract: One embodiment of a computing system configured to manage divergent threads in a thread group includes a stack configured to store at least one token and a multithreaded processing unit. The multithreaded processing unit is configured to perform the steps of fetching a program instruction, determining that the program instruction is an indirect branch instruction, and processing the indirect branch instruction as a sequence of two-way branches to execute an indirect branch instruction with multiple branch addresses. Indirect branch instructions may be used to allow greater flexibility since the branch address or multiple branch addresses do not need to be determined at compile time.
    Type: Grant
    Filed: November 6, 2006
    Date of Patent: July 20, 2010
    Assignee: NVIDIA Corporation
    Inventors: Brett W. Coon, John Erik Lindholm, Peter C. Mills, John R. Nickolls
  • Patent number: 7755636
    Abstract: A system, method and article of manufacture are provided for programmable processing in a computer graphics pipeline. Initially, data is received from a source buffer. Thereafter, programmable operations are performed on the data in order to generate output. The operations are programmable in that a user may utilize instructions from a predetermined instruction set for generating the same. Such output is stored in a register. During operation, the output stored in the register is used in performing the programmable operations on the data.
    Type: Grant
    Filed: November 19, 2007
    Date of Patent: July 13, 2010
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, David B. Kirk, Henry P. Moreton, Simon Moy
  • Patent number: 7755634
    Abstract: A system, method and computer program product are provided for branching during graphics processing. Initially, a first operation is performed on data. In response to the first operation, a branching operation is performed to a second operation. The first operation and the second operation are associated with instructions selected from a predetermined instruction set.
    Type: Grant
    Filed: November 22, 2005
    Date of Patent: July 13, 2010
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, Simon S. Moy, Robert Steven Glanville
  • Publication number: 20100122067
    Abstract: Instruction dispatch in a multithreaded microprocessor such as a graphics processor is not constrained by an order among the threads. Instructions for each thread are fetched, and a dispatch circuit determines which instructions in the buffer are ready to execute. The dispatch circuit may issue any ready instruction for execution, and an instruction from one thread may be issued prior to an instruction from another thread regardless of which instruction was fetched first. If multiple functional units are available, multiple instructions can be dispatched in parallel.
    Type: Application
    Filed: January 20, 2010
    Publication date: May 13, 2010
    Applicant: NVIDIA Corporation
    Inventors: John Erik Lindholm, Brett Coon, Simon S. Moy
  • Patent number: 7697008
    Abstract: A system, method and article of manufacture are provided for programmable processing in a computer graphics pipeline. Initially, data is received from a source buffer. Thereafter, programmable operations are performed on the data in order to generate output. The operations are programmable in that a user may utilize instructions from a predetermined instruction set for generating the same. Such output is stored in a register. During operation, the output stored in the register is used in performing the programmable operations on the data.
    Type: Grant
    Filed: February 28, 2007
    Date of Patent: April 13, 2010
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, David B. Kirk, Henry P. Moreton, Simon Moy
  • Patent number: 7683905
    Abstract: Apparatuses and methods for detecting position conflicts during fragment processing are described. Prior to executing a program on a fragment, a conflict detection unit within a fragment processor checks whether there is a position conflict indicating that a RAW (read after write) hazard may exist. A RAW hazard exists when there is a pending write to a destination location that source data will be read from during execution of the program. When the fragment enters a processing pipeline, each destination location that may be written during the processing of the fragment is entered in the conflict detection unit. During processing, the conflict detection unit is updated when a pending write to a destination location is completed.
    Type: Grant
    Filed: July 26, 2006
    Date of Patent: March 23, 2010
    Assignee: NVIDIA Corporation
    Inventors: David B. Kirk, Matthew N. Papakipos, Rui M. Bastos, John Erik Lindholm, Steven E. Molnar
  • Patent number: 7676657
    Abstract: Instruction dispatch in a multithreaded microprocessor such as a graphics processor is not constrained by an order among the threads. Instructions for each thread are fetched, and a dispatch circuit determines which instructions in the buffer are ready to execute. The dispatch circuit may issue any ready instruction for execution, and an instruction from one thread may be issued prior to an instruction from another thread regardless of which instruction was fetched first. If multiple functional units are available, multiple instructions can be dispatched in parallel.
    Type: Grant
    Filed: October 10, 2006
    Date of Patent: March 9, 2010
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, Brett Coon, Simon S. Moy
  • Patent number: 7634621
    Abstract: Circuits, methods, and apparatus that provide the die area and power savings of a single-ported memory with the performance advantages of a multiported memory. One example provides register allocation methods for storing data in a multiple-bank register file. In a thin register allocation method, data for a process is stored in a single bank. In this way, different processes use different banks to avoid conflicts. In a fat register allocation method, processes store data in each bank. In this way, if one process uses a large number of registers, those registers are spread among the banks, avoiding a situation where one bank is filled and other processes are forced to share a reduced number of banks. In a hybrid register allocation method, processes store data in more than one bank, but fewer than all the banks. Each of these methods may be combined in varying ways.
    Type: Grant
    Filed: November 3, 2006
    Date of Patent: December 15, 2009
    Assignee: NVIDIA Corporation
    Inventors: Brett W. Coon, John Erik Lindholm, Gary Tarolli, Svetoslav D. Tzvetkov, John R. Nickolls, Ming Y. Siu
  • Patent number: 7634637
    Abstract: In a processor, a SIMD group (a group of threads for which instructions are issued in parallel using single instruction, multiple data instruction issue techniques) is logically divided into two or more “SIMD subsets,” each containing one or more of the threads in the SIMD group. Each SIMD subset is associated with a different instance of a variable state parameter. The processor determines which of the instructions to be executed for the SIMD group rely on the state variable and serializes execution of such instructions so that the instruction is executed separately for each SIMD subset. Instructions that do not rely on the state variable are advantageously not serialized.
    Type: Grant
    Filed: December 16, 2005
    Date of Patent: December 15, 2009
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, Stuart F. Oberman
  • Patent number: 7617384
    Abstract: One embodiment of a computing system configured to manage divergent threads in a SIMD thread group includes a stack configured to store state information for processing control instructions. A parallel processing unit is configured to perform the steps of determining if one or more threads diverge during execution of a conditional control instruction. Threads that exit a program are identified as idle by a disable mask. Other threads that are disabled may be enabled once the divergent threads reach an instruction that enables the disabled threads. Use of the disable mask allows for the use of conditional return and break instructions in a multithreaded SIMD architecture.
    Type: Grant
    Filed: January 31, 2007
    Date of Patent: November 10, 2009
    Assignee: NVIDIA Corporation
    Inventors: Brett W. Coon, John Erik Lindholm, Svetoslav D. Tzvetkov
  • Publication number: 20090240931
    Abstract: An indirect branch instruction takes an address register as an argument in order to provide indirect function call capability for single-instruction multiple-thread (SIMT) processor architectures. The indirect branch instruction is used to implement indirect function calls, virtual function calls, and switch statements to improve processing performance compared with using sequential chains of tests and branches.
    Type: Application
    Filed: March 24, 2008
    Publication date: September 24, 2009
    Inventors: Brett W. Coon, John R. Nickolls, Lars Nyland, Peter C. Mills, John Erik Lindholm
  • Patent number: 7564456
    Abstract: A graphics pipeline rasterizes primitives and generates a stream of groups of pixels, such as a stream of pixel quads. A tile coalesce unit receives the stream of groups of pixels and generates pixel tiles for use by downstream pixel processing units. The pixel tiles facilitate hazard checks and transaction coherency.
    Type: Grant
    Filed: January 13, 2006
    Date of Patent: July 21, 2009
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, Henry Packard Moreton, John S. Montrym, Scott R. Whitman
  • Patent number: 7542043
    Abstract: Methods and apparatus for subdividing a shader program into regions or “phases” of instructions identifiable by phase identifiers (IDs) inserted into the shader program are provided. The phase IDs may be used to constrain execution of the shader program to prohibit texture fetches in later phases from being executed before a texture fetch in a current phase has completed. Other operations (e.g., math operations) within the current phase, however, may be allowed to execute while waiting for the current phase texture fetch to complete.
    Type: Grant
    Filed: May 23, 2005
    Date of Patent: June 2, 2009
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, Brett W. Coon, Gary M. Tarolli
  • Patent number: 7543136
    Abstract: One embodiment of a computing system configured to manage divergent threads in a thread group includes a stack configured to store at least one token and a multithreaded processing unit. The multithreaded processing unit is configured to perform the steps of fetching a program instruction, determining that the program instruction is a branch instruction, determining that the program instruction is not a return or break instruction, determining whether the program instruction includes a set-synchronization bit, and updating an active program counter, where the manner in which the active program counter is updated depends on a branch instruction type.
    Type: Grant
    Filed: July 13, 2005
    Date of Patent: June 2, 2009
    Assignee: NVIDIA Corporation
    Inventors: Brett W. Coon, John Erik Lindholm
  • Patent number: 7477255
    Abstract: A method for synchronizing divergent samples in a programmable graphics processing unit is described. In one embodiment, the method includes the steps of determining that a divergence has occurred and detecting that a first sample of a group of samples has encountered a first synch token. The method also includes the steps of determining whether each of the other samples of the group has encountered a synch token and determining whether the synch token encountered by each of the other samples of the group is the first synch token.
    Type: Grant
    Filed: April 12, 2004
    Date of Patent: January 13, 2009
    Assignee: NVIDIA Corporation
    Inventors: John Erik Lindholm, Harold Robert Feldman Zatz, Christian Rouet, Rui M. Bastos
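
Illustrative note on patent 7634621: the thin, fat, and hybrid register allocation methods described in that abstract can be pictured with a minimal Python sketch. Nothing here is taken from the patent itself. The function name bank_for_register, the modulo-based choice of bank, and the hybrid_width parameter are all assumptions made for illustration, and a real register file would likely use a different mapping.

```python
def bank_for_register(process_id, reg_index, num_banks, mode, hybrid_width=2):
    """Return the bank that would hold one register of one process.

    Hypothetical mapping, for illustration only:
      thin   - every register of a process lives in a single bank
      fat    - a process's registers are striped across all banks
      hybrid - a process's registers are striped across hybrid_width banks
    """
    if mode == "thin":
        # Different processes land in different banks, which avoids conflicts.
        return process_id % num_banks
    if mode == "fat":
        # A register-hungry process cannot fill one bank on its own.
        return reg_index % num_banks
    if mode == "hybrid":
        if not 1 < hybrid_width < num_banks:
            raise ValueError("hybrid_width must be more than 1 and fewer than num_banks")
        # Assumed scheme: each process gets a window of adjacent banks.
        base = (process_id * hybrid_width) % num_banks
        return (base + reg_index % hybrid_width) % num_banks
    raise ValueError(f"unknown mode: {mode}")


def layout(process_id, num_regs, num_banks, mode):
    """List the bank index chosen for each register of a process."""
    return [bank_for_register(process_id, r, num_banks, mode) for r in range(num_regs)]
```

With four banks, layout(1, 4, 4, "thin") keeps process 1 entirely in bank 1, layout(1, 4, 4, "fat") spreads the same four registers over banks 0 to 3, and layout(1, 4, 4, "hybrid") alternates between banks 2 and 3.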