Patents by Inventor Jayant B. Kolhe

Jayant B. Kolhe has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9448779
    Abstract: One embodiment of the present invention sets forth a technique for translating application programs written using a parallel programming model for execution on multi-core graphics processing unit (GPU) for execution by general purpose central processing unit (CPU). Portions of the application program that rely on specific features of the multi-core GPU are converted by a translator for execution by a general purpose CPU. The application program is partitioned into regions of synchronization independent instructions. The instructions are classified as convergent or divergent and divergent memory references that are shared between regions are replicated. Thread loops are inserted to ensure correct sharing of memory between various threads during execution by the general purpose CPU.
    Type: Grant
    Filed: March 20, 2009
    Date of Patent: September 20, 2016
    Assignee: NVIDIA Corporation
    Inventors: Vinod Grover, Bastiaan Joannes Matheus Aarts, Michael Murphy, Jayant B. Kolhe, John Bryan Pormann, Douglas Saylor
  • Patent number: 8612732
    Abstract: One embodiment of the present invention sets forth a technique for translating application programs written using a parallel programming model for execution on multi-core graphics processing unit (GPU) for execution by general purpose central processing unit (CPU). Portions of the application program that rely on specific features of the multi-core GPU are converted by a translator for execution by a general purpose CPU. The application program is partitioned into regions of synchronization independent instructions. The instructions are classified as convergent or divergent and divergent memory references that are shared between regions are replicated. Thread loops are inserted to ensure correct sharing of memory between various threads during execution by the general purpose CPU.
    Type: Grant
    Filed: March 19, 2009
    Date of Patent: December 17, 2013
    Assignee: NVIDIA Corporation
    Inventors: Vinod Grover, Bastiaan Joannes Matheus Aarts, Michael Murphy, Boris Beylin, Jayant B. Kolhe, Douglas Saylor
  • Patent number: 8321849
    Abstract: A virtual architecture and instruction set support explicit parallel-thread computing. The virtual architecture defines a virtual processor that supports concurrent execution of multiple virtual threads with multiple levels of data sharing and coordination (e.g., synchronization) between different virtual threads, as well as a virtual execution driver that controls the virtual processor. A virtual instruction set architecture for the virtual processor is used to define behavior of a virtual thread and includes instructions related to parallel thread behavior, e.g., data sharing and synchronization. Using the virtual platform, programmers can develop application programs in which virtual threads execute concurrently to process data; virtual translators and drivers adapt the application code to particular hardware on which it is to execute, transparently to the programmer.
    Type: Grant
    Filed: January 26, 2007
    Date of Patent: November 27, 2012
    Assignee: NVIDIA Corporation
    Inventors: John R. Nickolls, Henry P. Moreton, Lars S. Nyland, Ian A. Buck, Richard C. Johnson, Robert S. Glanville, Jayant B. Kolhe
  • Patent number: 7681187
    Abstract: A method and apparatus for optimizing register allocation during scheduling and execution of program code in a hardware environment. The program code can be compiled to optimize execution given predetermined hardware constraints. The hardware constraints can include the number of register read and write operations that can be performed in a given processor pass. The optimizer can initially schedule the program using virtual registers and a goal of minimizing the amount of active registers at any time. The optimizer reschedules the program to assign the virtual registers to actual physical registers in a manner that minimizes the number of processor passes used to execute the program.
    Type: Grant
    Filed: March 31, 2005
    Date of Patent: March 16, 2010
    Assignee: NVIDIA Corporation
    Inventors: Michael G. Ludwig, Jayant B. Kolhe, Robert Steven Glanville, Geoffrey C. Berry, Boris Beylin, Michael T. Bunnell
  • Publication number: 20090259832
    Abstract: One embodiment of the present invention sets forth a technique for translating application programs written using a parallel programming model for execution on multi-core graphics processing unit (GPU) for execution by general purpose central processing unit (CPU). Portions of the application program that rely on specific features of the multi-core GPU are converted by a translator for execution by a general purpose CPU. The application program is partitioned into regions of synchronization independent instructions. The instructions are classified as convergent or divergent and divergent memory references that are shared between regions are replicated. Thread loops are inserted to ensure correct sharing of memory between various threads during execution by the general purpose CPU.
    Type: Application
    Filed: March 19, 2009
    Publication date: October 15, 2009
    Inventors: Vinod Grover, Bastiaan Joannes Matheus Aarts, Michael Murphy, Boris Beylin, Jayant B. Kolhe, Douglas Saylor
  • Publication number: 20090259828
    Abstract: One embodiment of the present invention sets forth a technique for translating application programs written using a parallel programming model for execution on multi-core graphics processing unit (GPU) for execution by general purpose central processing unit (CPU). Portions of the application program that rely on specific features of the multi-core GPU are converted by a translator for execution by a general purpose CPU. The application program is partitioned into regions of synchronization independent instructions. The instructions are classified as convergent or divergent and divergent memory references that are shared between regions are replicated. Thread loops are inserted to ensure correct sharing of memory between various threads during execution by the general purpose CPU.
    Type: Application
    Filed: March 20, 2009
    Publication date: October 15, 2009
    Inventors: Vinod Grover, Bastiaan Joannes Matheus Aarts, Michael Murphy, Jayant B. Kolhe, John Bryan Pormann, Douglas Saylor
  • Publication number: 20080184211
    Abstract: A virtual architecture and instruction set support explicit parallel-thread computing. The virtual architecture defines a virtual processor that supports concurrent execution of multiple virtual threads with multiple levels of data sharing and coordination (e.g., synchronization) between different virtual threads, as well as a virtual execution driver that controls the virtual processor. A virtual instruction set architecture for the virtual processor is used to define behavior of a virtual thread and includes instructions related to parallel thread behavior, e.g., data sharing and synchronization. Using the virtual platform, programmers can develop application programs in which virtual threads execute concurrently to process data; virtual translators and drivers adapt the application code to particular hardware on which it is to execute, transparently to the programmer.
    Type: Application
    Filed: January 26, 2007
    Publication date: July 31, 2008
    Applicant: NVIDIA Corporation
    Inventors: John R. Nickolls, Henry P. Moreton, Lars S. Nyland, Ian A. Buck, Richard C. Johnson, Robert S. Glanville, Jayant B. Kolhe
  • Patent number: 7330962
    Abstract: A list scheduler in a compiler can select from a plurality of alternative instruction sequences for one or more computation performed within an application. A scheduler can initially identify and track one or more computations for which multiple alternative instruction sequences exist. An available instruction list can be populated with the alternative instruction sequences. The list scheduler can access the available instruction list during scheduling of the application. The list scheduler can perform a cost analysis while scheduling the instructions by performing a look ahead. The list scheduler may select alternate instruction sequences for similar computations occurring in different portions of the application based on the cost benefit analysis.
    Type: Grant
    Filed: November 14, 2005
    Date of Patent: February 12, 2008
    Assignee: NVIDIA Corporation
    Inventors: Michael G. Ludwig, Jayant B. Kolhe
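
Several of the abstracts above describe partitioning a GPU program into regions of synchronization-independent instructions, replicating divergent values that are shared between regions, and inserting thread loops so that a CPU can run the program. The following is a minimal sketch of that general idea, not the patented implementation. The kernel, the function name `translated_kernel`, and the variable names are all hypothetical, and the barrier is written as `__syncthreads()` only for illustration.

```python
def translated_kernel(inp):
    """CPU form of a hypothetical GPU kernel run by n threads:

        tmp = inp[tid]
        shared[tid] = 2 * tmp
        __syncthreads()                      # barrier splits the two regions
        out[tid] = shared[n - 1 - tid] + tmp
    """
    n = len(inp)
    shared = [0] * n  # memory visible to every thread in the block
    out = [0] * n

    # `tmp` holds a different value in each thread (divergent) and is used
    # on both sides of the barrier, so it is replicated: one slot per thread.
    tmp = [0] * n

    # Thread loop for region 1: every thread's pre-barrier work runs first.
    for tid in range(n):
        tmp[tid] = inp[tid]
        shared[tid] = 2 * tmp[tid]

    # The barrier needs no code here: the first loop has finished for all
    # threads before the second loop starts.

    # Thread loop for region 2: each thread reads another thread's slot.
    for tid in range(n):
        out[tid] = shared[n - 1 - tid] + tmp[tid]

    return out


print(translated_kernel([1, 2, 3, 4]))  # prints [9, 8, 7, 6]
```

If `tmp` were kept as a single scalar, the second loop would see only the value written by the last thread, which is why a value shared across regions gets one copy per thread in this sketch.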