Patents by Inventor John Erik Lindholm

John Erik Lindholm has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Queue manager for streaming multiprocessor systems

Patent number: 10983699

Abstract: A queue manager apparatus converts inbound commands of a first width into scalar format commands to be queued in a command queue. Furthermore, the queue manager converts the scalar format commands residing in the command queue into outbound commands of a second width for transmission. Converting inbound commands to scalar format commands and then converting the scalar format commands to a target width for transmission allows the queue manager to advantageously provide efficient and programmable command transmission between arbitrary processing units, regardless of potentially mismatched native command widths.

Type: Grant

Filed: October 25, 2019

Date of Patent: April 20, 2021

Assignee: NVIDIA Corporation

Inventor: John Erik Lindholm
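The width conversion in the abstract above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `QueueManager`, the methods `push` and `pop`, and the use of integers as scalar commands are hypothetical and do not come from the patent, which describes a hardware apparatus.

```python
from collections import deque
from typing import Deque, Iterable, List


class QueueManager:
    """Toy model (hypothetical): the queue stores one scalar command per slot."""

    def __init__(self, outbound_width: int) -> None:
        if outbound_width < 1:
            raise ValueError("outbound_width must be at least 1")
        self.outbound_width = outbound_width
        self.queue: Deque[int] = deque()

    def push(self, command: Iterable[int]) -> None:
        # Split an inbound command of any width into scalar commands.
        for scalar in command:
            self.queue.append(scalar)

    def pop(self) -> List[int]:
        # Regroup queued scalars into one outbound command of the target
        # width (shorter if the queue runs dry).
        count = min(self.outbound_width, len(self.queue))
        return [self.queue.popleft() for _ in range(count)]


qm = QueueManager(outbound_width=2)
qm.push((1, 2, 3))      # inbound width 3
qm.push((4, 5, 6))
print(qm.pop(), qm.pop(), qm.pop())
```

Because the queue only ever holds scalars, the producer's width and the consumer's width never have to match; in this model that is the whole point of the intermediate scalar format.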
QUEUE MANAGER FOR STREAMING MULTIPROCESSOR SYSTEMS

Publication number: 20200057560

Abstract: A queue manager apparatus converts inbound commands of a first width into scalar format commands to be queued in a command queue. Furthermore, the queue manager converts the scalar format commands residing in the command queue into outbound commands of a second width for transmission. Converting inbound commands to scalar format commands and then converting the scalar format commands to a target width for transmission allows the queue manager to advantageously provide efficient and programmable command transmission between arbitrary processing units, regardless of potentially mismatched native command widths.

Type: Application

Filed: October 25, 2019

Publication date: February 20, 2020

Inventor: John Erik Lindholm
Queue manager for streaming multiprocessor systems

Patent number: 10489056

Abstract: A queue manager apparatus converts inbound commands of a first width into scalar format commands to be queued in a command queue. Furthermore, the queue manager converts the scalar format commands residing in the command queue into outbound commands of a second width for transmission. Converting inbound commands to scalar format commands and then converting the scalar format commands to a target width for transmission allows the queue manager to advantageously provide efficient and programmable command transmission between arbitrary processing units, regardless of potentially mismatched native command widths.

Type: Grant

Filed: November 9, 2017

Date of Patent: November 26, 2019

Assignee: NVIDIA Corporation

Inventor: John Erik Lindholm
Approach for a configurable phase-based priority scheduler

Patent number: 10346212

Abstract: A streaming multiprocessor (SM) in a parallel processing subsystem schedules priority among a plurality of threads. The SM retrieves a priority descriptor associated with a thread group, and determines whether the thread group and a second thread group are both operating in the same phase. If so, then the method determines whether the priority descriptor of the thread group indicates a higher priority than the priority descriptor of the second thread group. If so, the SM skews the thread group relative to the second thread group such that the thread groups operate in different phases; otherwise, the SM increases the priority of the thread group. If the thread groups are not operating in the same phase, then the SM increases the priority of the thread group. One advantage of the disclosed techniques is that thread groups execute with increased efficiency, resulting in improved processor performance.

Type: Grant

Filed: February 3, 2015

Date of Patent: July 9, 2019

Assignee: NVIDIA CORPORATION

Inventors: Jack Hilaire Choquette, Olivier Giroux, Robert J. Stoll, Gary M. Tarolli, John Erik Lindholm
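The decision flow in the abstract above can be written out as a short function. This is a minimal sketch, assuming that a phase is an integer in a fixed cycle, that a priority is a plain integer, and that "skewing" means moving a group to the next phase; the names `ThreadGroup`, `schedule_step`, and `num_phases` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ThreadGroup:
    phase: int      # assumed: index into a fixed cycle of phases
    priority: int   # assumed: larger value means higher priority


def schedule_step(group: ThreadGroup, other: ThreadGroup, num_phases: int = 2) -> str:
    """Apply one scheduling decision to `group` relative to `other`."""
    if group.phase == other.phase and group.priority > other.priority:
        # Same phase and higher priority: skew so the groups stop colliding.
        group.phase = (group.phase + 1) % num_phases
        return "skewed"
    # Same phase without higher priority, or different phases: raise priority.
    group.priority += 1
    return "priority increased"


a = ThreadGroup(phase=0, priority=5)
b = ThreadGroup(phase=0, priority=3)
print(schedule_step(a, b))   # the groups start in the same phase
print(schedule_step(a, b))   # now they are in different phases
```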
QUEUE MANAGER FOR STREAMING MULTIPROCESSOR SYSTEMS

Publication number: 20190138210

Abstract: A queue manager apparatus converts inbound commands of a first width into scalar format commands to be queued in a command queue. Furthermore, the queue manager converts the scalar format commands residing in the command queue into outbound commands of a second width for transmission. Converting inbound commands to scalar format commands and then converting the scalar format commands to a target width for transmission allows the queue manager to advantageously provide efficient and programmable command transmission between arbitrary processing units, regardless of potentially mismatched native command widths.

Type: Application

Filed: November 9, 2017

Publication date: May 9, 2019

Inventor: John Erik Lindholm
Beam tracing

Patent number: 10242485

Abstract: An apparatus, computer readable medium, and method are disclosed for performing an intersection query between a query beam and a target bounding volume. The target bounding volume may comprise an axis-aligned bounding box (AABB) associated with a bounding volume hierarchy (BVH) tree. An intersection query comprising beam information associated with the query beam and slab boundary information for a first dimension of a target bounding volume is received. Intersection parameter values are calculated for the first dimension based on the beam information and the slab boundary information and a slab intersection case is determined for the first dimension based on the beam information. A parametric variable range for the first dimension is assigned based on the slab intersection case and the intersection parameter values and it is determined whether the query beam intersects the target bounding volume based on at least the parametric variable range for the first dimension.

Type: Grant

Filed: December 28, 2016

Date of Patent: March 26, 2019

Assignee: NVIDIA CORPORATION

Inventors: Tero Tapani Karras, Timo Oskari Aila, Samuli Matias Laine, John Erik Lindholm
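For readers unfamiliar with slab tests, the sketch below shows the well-known ray versus axis-aligned bounding box version of the idea: compute intersection parameters per dimension, pick a case, assign a parametric range, and combine the ranges. It treats the query as a single ray, which is a simplification; the beam handling claimed in the patent is not reproduced here, and the function names are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple


def slab_range(origin: float, direction: float, lo: float, hi: float) -> Optional[Tuple[float, float]]:
    """Parametric range in which the ray lies inside one slab, or None."""
    if direction == 0.0:
        # Parallel case: inside the slab for every t, or for none.
        return (-math.inf, math.inf) if lo <= origin <= hi else None
    t0 = (lo - origin) / direction
    t1 = (hi - origin) / direction
    return (min(t0, t1), max(t0, t1))


def ray_hits_aabb(origin: Sequence[float], direction: Sequence[float],
                  box_min: Sequence[float], box_max: Sequence[float]) -> bool:
    t_near, t_far = 0.0, math.inf           # only t >= 0 lies on the ray
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        rng = slab_range(o, d, lo, hi)
        if rng is None:
            return False
        t_near = max(t_near, rng[0])
        t_far = min(t_far, rng[1])
        if t_near > t_far:                  # per-dimension ranges do not overlap
            return False
    return True


print(ray_hits_aabb((0, 0, 0), (1, 1, 1), (1, 1, 1), (2, 2, 2)))
```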
Programmable graphics processor for multithreaded execution of programs

Patent number: 10217184

Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

Type: Grant

Filed: May 23, 2017

Date of Patent: February 26, 2019

Assignee: NVIDIA CORPORATION

Inventors: John Erik Lindholm, Brett W. Coon, Stuart F. Oberman, Ming Y. Siu, Matthew P. Gerlach
BEAM TRACING

Publication number: 20180182158

Abstract: An apparatus, computer readable medium, and method are disclosed for performing an intersection query between a query beam and a target bounding volume. The target bounding volume may comprise an axis-aligned bounding box (AABB) associated with a bounding volume hierarchy (BVH) tree. An intersection query comprising beam information associated with the query beam and slab boundary information for a first dimension of a target bounding volume is received. Intersection parameter values are calculated for the first dimension based on the beam information and the slab boundary information and a slab intersection case is determined for the first dimension based on the beam information. A parametric variable range for the first dimension is assigned based on the slab intersection case and the intersection parameter values and it is determined whether the query beam intersects the target bounding volume based on at least the parametric variable range for the first dimension.

Type: Application

Filed: December 28, 2016

Publication date: June 28, 2018

Inventors: Tero Tapani Karras, Timo Oskari Aila, Samuli Matias Laine, John Erik Lindholm
Tree-based thread management

Patent number: 9921847

Abstract: In one embodiment of the present invention, a streaming multiprocessor (SM) uses a tree of nodes to manage threads. Each node specifies a set of active threads and a program counter. Upon encountering a conditional instruction that causes an execution path to diverge, the SM creates child nodes corresponding to each of the divergent execution paths. Based on the conditional instruction, the SM assigns each active thread included in the parent node to at most one child node, and the SM temporarily discontinues executing instructions specified by the parent node. Instead, the SM concurrently executes instructions specified by the child nodes. After all the divergent paths reconverge to the parent path, the SM resumes executing instructions specified by the parent node. Advantageously, the disclosed techniques enable the SM to execute divergent paths in parallel, thereby reducing undesirable program behavior associated with conventional techniques that serialize divergent paths across thread groups.

Type: Grant

Filed: January 21, 2014

Date of Patent: March 20, 2018

Assignee: NVIDIA Corporation

Inventor: John Erik Lindholm
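The node tree described in the abstract above can be modeled in a few lines. This is a sketch only, assuming two-way branches and integer thread IDs; `Node`, `diverge`, and `reconverge` are hypothetical names, and the model ignores how the hardware actually runs the child nodes concurrently.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass(eq=False)                 # compare nodes by identity, not by fields
class Node:
    pc: int                          # program counter for this path
    active: Set[int]                 # IDs of the threads on this path
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def diverge(parent: Node, taken: Callable[[int], bool],
            taken_pc: int, fallthrough_pc: int) -> List[Node]:
    """Split the parent's threads across child nodes, one per divergent path."""
    taken_threads = {t for t in parent.active if taken(t)}
    other_threads = parent.active - taken_threads
    for pc, threads in ((taken_pc, taken_threads), (fallthrough_pc, other_threads)):
        if threads:                  # each thread lands in at most one child
            parent.children.append(Node(pc, threads, parent))
    return list(parent.children)


def reconverge(child: Node, resume_pc: int) -> bool:
    """Retire a finished child; return True once the parent may resume."""
    parent = child.parent
    if parent is None:
        raise ValueError("the root node has no parent to rejoin")
    parent.children.remove(child)
    if parent.children:
        return False                 # other divergent paths are still running
    parent.pc = resume_pc
    return True


root = Node(pc=10, active={0, 1, 2, 3})
even, odd = diverge(root, lambda t: t % 2 == 0, taken_pc=20, fallthrough_pc=11)
print(sorted(even.active), sorted(odd.active))
```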
Tree-based thread management

Patent number: 9830161

Abstract: In one embodiment of the present invention, a streaming multiprocessor (SM) uses a tree of nodes to manage threads. Each node specifies a set of active threads and a program counter. Upon encountering a conditional instruction that causes an execution path to diverge, the SM creates child nodes corresponding to each of the divergent execution paths. Based on the conditional instruction, the SM assigns each active thread included in the parent node to at most one child node, and the SM temporarily discontinues executing instructions specified by the parent node. Instead, the SM concurrently executes instructions specified by the child nodes. After all the divergent paths reconverge to the parent path, the SM resumes executing instructions specified by the parent node. Advantageously, the disclosed techniques enable the SM to execute divergent paths in parallel, thereby reducing undesirable program behavior associated with conventional techniques that serialize divergent paths across thread groups.

Type: Grant

Filed: January 21, 2014

Date of Patent: November 28, 2017

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, Michael C. Shebanow
PROGRAMMABLE GRAPHICS PROCESSOR FOR MULTITHREADED EXECUTION OF PROGRAMS

Publication number: 20170256022

Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

Type: Application

Filed: May 23, 2017

Publication date: September 7, 2017

Inventors: John Erik LINDHOLM, Brett W. COON, Stuart F. OBERMAN, Ming Y. SIU, Matthew P. GERLACH
APPROACH FOR A CONFIGURABLE PHASE-BASED PRIORITY SCHEDULER

Publication number: 20170192822

Abstract: A streaming multiprocessor (SM) in a parallel processing subsystem schedules priority among a plurality of threads. The SM retrieves a priority descriptor associated with a thread group, and determines whether the thread group and a second thread group are both operating in the same phase. If so, then the method determines whether the priority descriptor of the thread group indicates a higher priority than the priority descriptor of the second thread group. If so, the SM skews the thread group relative to the second thread group such that the thread groups operate in different phases; otherwise, the SM increases the priority of the thread group. If the thread groups are not operating in the same phase, then the SM increases the priority of the thread group. One advantage of the disclosed techniques is that thread groups execute with increased efficiency, resulting in improved processor performance.

Type: Application

Filed: February 3, 2015

Publication date: July 6, 2017

Inventors: Jack Hilaire CHOQUETTE, Olivier GIROUX, Robert J. STOLL, Gary M. TAROLLI, John Erik LINDHOLM
Programmable graphics processor for multithreaded execution of programs

Patent number: 9659339

Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

Type: Grant

Filed: March 25, 2013

Date of Patent: May 23, 2017

Assignee: NVIDIA CORPORATION

Inventors: John Erik Lindholm, Brett W. Coon, Stuart F. Oberman, Ming Y. Siu, Matthew P. Gerlach
Indirect function call instructions in a synchronous parallel thread processor

Patent number: 9639365

Abstract: An indirect branch instruction takes an address register as an argument in order to provide indirect function call capability for single-instruction multiple-thread (SIMT) processor architectures. The indirect branch instruction is used to implement indirect function calls, virtual function calls, and switch statements to improve processing performance compared with using sequential chains of tests and branches.

Type: Grant

Filed: November 12, 2012

Date of Patent: May 2, 2017

Assignee: NVIDIA Corporation

Inventors: Brett W. Coon, John R. Nickolls, Lars Nyland, Peter C. Mills, John Erik Lindholm
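The contrast the abstract draws between an indirect branch and a sequential chain of tests has a familiar software analogue, sketched below. This is an analogy in a host language, not a model of SIMT hardware; the operation names and the `JUMP_TABLE` list are hypothetical.

```python
from typing import Callable, List


def increment(x: int) -> int:
    return x + 1


def double(x: int) -> int:
    return x * 2


def negate(x: int) -> int:
    return -x


def dispatch_chain(selector: int, x: int) -> int:
    # Sequential tests and branches: cost grows with the number of cases.
    if selector == 0:
        return increment(x)
    if selector == 1:
        return double(x)
    if selector == 2:
        return negate(x)
    raise IndexError("unknown selector")


# Table of call targets, standing in for addresses held in a register.
JUMP_TABLE: List[Callable[[int], int]] = [increment, double, negate]


def dispatch_indirect(selector: int, x: int) -> int:
    # One indexed lookup followed by one call, regardless of the case count.
    if not 0 <= selector < len(JUMP_TABLE):
        raise IndexError("unknown selector")
    return JUMP_TABLE[selector](x)


print([dispatch_indirect(s, 7) for s in range(3)])
```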
Beam tracing

Patent number: 9569559

Abstract: An apparatus, computer readable medium, and method are disclosed for performing an intersection query between a query beam and a target bounding volume. The target bounding volume may comprise an axis-aligned bounding box (AABB) associated with a bounding volume hierarchy (BVH) tree. An intersection query comprising beam information associated with the query beam and slab boundary information for a first dimension of a target bounding volume is received. Intersection parameter values are calculated for the first dimension based on the beam information and the slab boundary information and a slab intersection case is determined for the first dimension based on the beam information. A parametric variable range for the first dimension is assigned based on the slab intersection case and the intersection parameter values and it is determined whether the query beam intersects the target bounding volume based on at least the parametric variable range for the first dimension.

Type: Grant

Filed: March 18, 2015

Date of Patent: February 14, 2017

Assignee: NVIDIA Corporation

Inventors: Tero Tapani Karras, Timo Oskari Aila, Samuli Matias Laine, John Erik Lindholm
Architecture and instructions for accessing multi-dimensional formatted surface memory

Patent number: 9519947

Abstract: One embodiment of the present invention sets forth a technique for a program to access multi-dimensional formatted graphics surface memory. Multi-dimensional memory objects called “surfaces”, stored in a user-specified data or pixel format and arranged in a graphics-optimized layout, are accessed by programs using surface instructions. A set of memory access instructions (e.g., load, store, reduce, and atomic), referred to as surface instructions, may be used to access the surfaces. Coordinate bounds checking is performed with configurable clamping. Caching behavior may also be specified by the surface instructions. Data format conversion and packing to a specified storage format is supported for store, reduction, and atomic surface instructions. Data format conversion and unpacking from a specified storage format is supported for load and atomic surface instructions.

Type: Grant

Filed: September 24, 2010

Date of Patent: December 13, 2016

Assignee: NVIDIA Corporation

Inventors: John R. Nickolls, Brian Fahs, Lars Nyland, John Erik Lindholm, Richard Craig Johnson
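Two ideas from the abstract above, configurable coordinate clamping and format conversion on store and load, can be shown with a toy 2D surface. This is a sketch under assumptions: the `Surface2D` class, the boolean `clamp` switch, and the 8-bit-per-channel RGBA packing are hypothetical choices made for illustration and are not taken from the patent or from any instruction set.

```python
from typing import List, Sequence, Tuple


def pack_rgba8(rgba: Sequence[float]) -> int:
    """Convert four floats in [0, 1] to one 32-bit word, 8 bits per channel."""
    word = 0
    for i, channel in enumerate(rgba):
        level = int(min(max(channel, 0.0), 1.0) * 255 + 0.5)
        word |= level << (8 * i)
    return word


def unpack_rgba8(word: int) -> Tuple[float, ...]:
    return tuple(((word >> (8 * i)) & 0xFF) / 255 for i in range(4))


class Surface2D:
    def __init__(self, width: int, height: int) -> None:
        self.width, self.height = width, height
        self.words: List[int] = [0] * (width * height)

    def _index(self, x: int, y: int, clamp: bool) -> int:
        if clamp:
            # Clamp out-of-range coordinates to the nearest edge texel.
            x = min(max(x, 0), self.width - 1)
            y = min(max(y, 0), self.height - 1)
        elif not (0 <= x < self.width and 0 <= y < self.height):
            raise IndexError("coordinate out of bounds")
        return y * self.width + x

    def store(self, x: int, y: int, rgba: Sequence[float], clamp: bool = True) -> None:
        self.words[self._index(x, y, clamp)] = pack_rgba8(rgba)    # convert and pack

    def load(self, x: int, y: int, clamp: bool = True) -> Tuple[float, ...]:
        return unpack_rgba8(self.words[self._index(x, y, clamp)])  # unpack and convert


surface = Surface2D(4, 4)
surface.store(3, 3, (1.0, 0.0, 1.0, 0.0))
print(surface.load(10, 10))   # clamped to the corner texel (3, 3)
```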
PROGRAMMABLE GRAPHICS PROCESSOR FOR MULTITHREADED EXECUTION OF PROGRAMS

Publication number: 20160300319

Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

Type: Application

Filed: March 25, 2013

Publication date: October 13, 2016

Applicant: NVIDIA Corporation

Inventors: John Erik LINDHOLM, Brett W. COON, Stuart F. OBERMAN, Ming Y. SIU, Matthew P. GERLACH
System and method for hardware scheduling of conditional barriers and impatient barriers

Patent number: 9448803

Abstract: A method and a system are provided for hardware scheduling of barrier instructions. Execution of a plurality of threads to process instructions of a program that includes a barrier instruction is initiated, and when each thread reaches the barrier instruction during execution of the program, it is determined whether the thread participates in the barrier instruction. The threads that participate in the barrier instruction are then serially executed to process one or more instructions of the program that follow the barrier instruction. A method and system are also provided for impatient scheduling of barrier instructions. When a portion of the threads that is greater than a minimum number of threads and less than all of the threads in the plurality of threads reaches the barrier instruction, each of the threads in the portion is serially executed to process one or more instructions of the program that follow the barrier instruction.

Type: Grant

Filed: March 11, 2013

Date of Patent: September 20, 2016

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, Tero Tapani Karras, Timo Oskari Aila, Samuli Matias Laine
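The release conditions for the two barrier flavors in the abstract above can be stated as a small predicate. This is a sketch, assuming threads are identified by integers and that the participant set is known up front; the function `threads_to_release` and its `minimum` parameter are hypothetical.

```python
from typing import Optional, Set


def threads_to_release(arrived: Set[int], participants: Set[int],
                       minimum: Optional[int] = None) -> Set[int]:
    """Return the threads that may proceed past the barrier, if any.

    Conditional barrier: threads outside `participants` are ignored.
    Impatient barrier: when `minimum` is given, the waiting threads are
    released as soon as more than `minimum` of them have arrived.
    """
    waiting = arrived & participants
    if waiting == participants:
        return waiting               # every participant has arrived
    if minimum is not None and len(waiting) > minimum:
        return waiting               # enough have arrived; stop waiting
    return set()


participants = {0, 1, 2, 3}
print(threads_to_release({0, 1, 4}, participants))            # still waiting
print(threads_to_release({0, 1, 2}, participants, minimum=2)) # impatient release
```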
System and method for hardware scheduling of indexed barriers

Patent number: 9442755

Abstract: A method and a system are provided for hardware scheduling of indexed barrier instructions. Execution of a plurality of threads to process instructions of a program that includes a barrier instruction is initiated and when each thread reaches the barrier instruction, the thread pauses execution of the instructions. A first sub-group of threads in the plurality of threads is associated with a first sub-barrier index and a second sub-group of threads in the plurality of threads is associated with a second sub-barrier index. When the barrier instruction can be scheduled for execution, threads in the first sub-group are executed serially and threads in the second sub-group are executed serially and at least one thread in the first sub-group is executed in parallel with at least one thread in the second sub-group.

Type: Grant

Filed: March 15, 2013

Date of Patent: September 13, 2016

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, Tero Tapani Karras
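The execution order described in the abstract above, serial within a sub-group and parallel across sub-groups, can be sketched as a schedule of time steps. The function name `schedule_indexed_barrier` and the representation of the sub-barrier assignment as a dictionary are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import zip_longest
from typing import Dict, List


def schedule_indexed_barrier(sub_barrier_index: Dict[int, int]) -> List[List[int]]:
    """Map thread ID -> sub-barrier index to a list of parallel time steps.

    Threads that share an index run one after another; at each step, one
    thread from every sub-group that still has work runs in parallel.
    """
    groups: Dict[int, List[int]] = defaultdict(list)
    for thread, index in sorted(sub_barrier_index.items()):
        groups[index].append(thread)
    ordered = [groups[index] for index in sorted(groups)]
    return [[t for t in step if t is not None] for step in zip_longest(*ordered)]


# Threads 0 and 1 share sub-barrier 0; threads 2, 3, and 4 share sub-barrier 1.
print(schedule_indexed_barrier({0: 0, 1: 0, 2: 1, 3: 1, 4: 1}))
```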
APPROACH FOR A CONFIGURABLE PHASE-BASED PRIORITY SCHEDULER

Publication number: 20160224386

Abstract: A streaming multiprocessor (SM) in a parallel processing subsystem schedules priority among a plurality of threads. The SM retrieves a priority descriptor associated with a thread group, and determines whether the thread group and a second thread group are both operating in the same phase. If so, then the method determines whether the priority descriptor of the thread group indicates a higher priority than the priority descriptor of the second thread group. If so, the SM skews the thread group relative to the second thread group such that the thread groups operate in different phases; otherwise, the SM increases the priority of the thread group. If the thread groups are not operating in the same phase, then the SM increases the priority of the thread group. One advantage of the disclosed techniques is that thread groups execute with increased efficiency, resulting in improved processor performance.

Type: Application

Filed: February 3, 2015

Publication date: August 4, 2016

Inventors: Jack Hilaire CHOQUETTE, Olivier GIROUX, Robert J. STOLL, Gary M. TAROLLI, John Erik LINDHOLM
