Patents by Inventor Jayant B. Kolhe

Jayant B. Kolhe has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Execution of retargetted graphics processor accelerated code by a general purpose processor

Patent number: 9448779

Abstract: One embodiment of the present invention sets forth a technique for translating application programs written using a parallel programming model for execution on multi-core graphics processing unit (GPU) for execution by general purpose central processing unit (CPU). Portions of the application program that rely on specific features of the multi-core GPU are converted by a translator for execution by a general purpose CPU. The application program is partitioned into regions of synchronization independent instructions. The instructions are classified as convergent or divergent and divergent memory references that are shared between regions are replicated. Thread loops are inserted to ensure correct sharing of memory between various threads during execution by the general purpose CPU.

Type: Grant

Filed: March 20, 2009

Date of Patent: September 20, 2016

Assignee: NVIDIA Corporation

Inventors: Vinod Grover, Bastiaan Joannes Matheus Aarts, Michael Murphy, Jayant B. Kolhe, John Bryan Pormann, Douglas Saylor
Retargetting an application program for execution by a general purpose processor

Patent number: 8612732

Abstract: One embodiment of the present invention sets forth a technique for translating application programs written using a parallel programming model for execution on multi-core graphics processing unit (GPU) for execution by general purpose central processing unit (CPU). Portions of the application program that rely on specific features of the multi-core GPU are converted by a translator for execution by a general purpose CPU. The application program is partitioned into regions of synchronization independent instructions. The instructions are classified as convergent or divergent and divergent memory references that are shared between regions are replicated. Thread loops are inserted to ensure correct sharing of memory between various threads during execution by the general purpose CPU.

Type: Grant

Filed: March 19, 2009

Date of Patent: December 17, 2013

Assignee: NVIDIA Corporation

Inventors: Vinod Grover, Bastiaan Joannes Matheus Aarts, Michael Murphy, Boris Beylin, Jayant B. Kolhe, Douglas Saylor
Virtual architecture and instruction set for parallel thread computing

Patent number: 8321849

Abstract: A virtual architecture and instruction set support explicit parallel-thread computing. The virtual architecture defines a virtual processor that supports concurrent execution of multiple virtual threads with multiple levels of data sharing and coordination (e.g., synchronization) between different virtual threads, as well as a virtual execution driver that controls the virtual processor. A virtual instruction set architecture for the virtual processor is used to define behavior of a virtual thread and includes instructions related to parallel thread behavior, e.g., data sharing and synchronization. Using the virtual platform, programmers can develop application programs in which virtual threads execute concurrently to process data; virtual translators and drivers adapt the application code to particular hardware on which it is to execute, transparently to the programmer.

Type: Grant

Filed: January 26, 2007

Date of Patent: November 27, 2012

Assignee: NVIDIA Corporation

Inventors: John R. Nickolls, Henry P. Moreton, Lars S. Nyland, Ian A. Buck, Richard C. Johnson, Robert S. Glanville, Jayant B. Kolhe
Method and apparatus for register allocation in presence of hardware constraints

Patent number: 7681187

Abstract: A method and apparatus for optimizing register allocation during scheduling and execution of program code in a hardware environment. The program code can be compiled to optimize execution given predetermined hardware constraints. The hardware constraints can include the number of register read and write operations that can be performed in a given processor pass. The optimizer can initially schedule the program using virtual registers and a goal of minimizing the amount of active registers at any time. The optimizer reschedules the program to assign the virtual registers to actual physical registers in a manner that minimizes the number of processor passes used to execute the program.

Type: Grant

Filed: March 31, 2005

Date of Patent: March 16, 2010

Assignee: NVIDIA Corporation

Inventors: Michael G. Ludwig, Jayant B. Kolhe, Robert Steven Glanville, Geoffrey C. Berry, Boris Beylin, Michael T. Bunnell
EXECUTION OF RETARGETTED GRAPHICS PROCESSOR ACCELERATED CODE BY A GENERAL PURPOSE PROCESSOR

Publication number: 20090259828

Abstract: One embodiment of the present invention sets forth a technique for translating application programs written using a parallel programming model for execution on multi-core graphics processing unit (GPU) for execution by general purpose central processing unit (CPU). Portions of the application program that rely on specific features of the multi-core GPU are converted by a translator for execution by a general purpose CPU. The application program is partitioned into regions of synchronization independent instructions. The instructions are classified as convergent or divergent and divergent memory references that are shared between regions are replicated. Thread loops are inserted to ensure correct sharing of memory between various threads during execution by the general purpose CPU.

Type: Application

Filed: March 20, 2009

Publication date: October 15, 2009

Inventors: Vinod Grover, Bastiaan Joannes Matheus Aarts, Michael Murphy, Jayant B. Kolhe, John Bryan Pormann, Douglas Saylor
VIRTUAL ARCHITECTURE AND INSTRUCTION SET FOR PARALLEL THREAD COMPUTING

Publication number: 20080184211

Abstract: A virtual architecture and instruction set support explicit parallel-thread computing. The virtual architecture defines a virtual processor that supports concurrent execution of multiple virtual threads with multiple levels of data sharing and coordination (e.g., synchronization) between different virtual threads, as well as a virtual execution driver that controls the virtual processor. A virtual instruction set architecture for the virtual processor is used to define behavior of a virtual thread and includes instructions related to parallel thread behavior, e.g., data sharing and synchronization. Using the virtual platform, programmers can develop application programs in which virtual threads execute concurrently to process data; virtual translators and drivers adapt the application code to particular hardware on which it is to execute, transparently to the programmer.

Type: Application

Filed: January 26, 2007

Publication date: July 31, 2008

Applicant: NVIDIA Corporation

Inventors: John R. Nickolls, Henry P. Moreton, Lars S. Nyland, Ian A. Buck, Richard C. Johnson, Robert S. Glanville, Jayant B. Kolhe
Dynamic instruction sequence selection during scheduling

Patent number: 7330962

Abstract: A list scheduler in a compiler can select from a plurality of alternative instruction sequences for one or more computation performed within an application. A scheduler can initially identify and track one or more computations for which multiple alternative instruction sequences exist. An available instruction list can be populated with the alternative instruction sequences. The list scheduler can access the available instruction list during scheduling of the application. The list scheduler can perform a cost analysis while scheduling the instructions by performing a look ahead. The list scheduler may select alternate instruction sequences for similar computations occurring in different portions of the application based on the cost benefit analysis.

Type: Grant

Filed: November 14, 2005

Date of Patent: February 12, 2008

Assignee: NVIDIA Corporation

Inventors: Michael G. Ludwig, Jayant B. Kolhe

Execution of retargetted graphics processor accelerated code by a general purpose processor

Retargetting an application program for execution by a general purpose processor

Virtual architecture and instruction set for parallel thread computing

Method and apparatus for register allocation in presence of hardware constraints

EXECUTION OF RETARGETTED GRAPHICS PROCESSOR ACCELERATED CODE BY A GENERAL PURPOSE PROCESSOR

VIRTUAL ARCHITECTURE AND INSTRUCTION SET FOR PARALLEL THREAD COMPUTING

Dynamic instruction sequence selection during scheduling