Patents by Inventor Lacky V. Shah

Lacky V. Shah has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Techniques for configuring a processor to function as multiple, separate processors

Patent number: 11893423

Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.

Type: Grant

Filed: September 5, 2019

Date of Patent: February 6, 2024

Assignee: NVIDIA CORPORATION

Inventors: Jerome F. Duluk, Jr., Gregory Scott Palmer, Jonathon Stuart Ramsey Evans, Shailendra Singh, Samuel H. Duncan, Wishwesh Anil Gandhi, Lacky V. Shah, Sonata Gale Wen, Feiqi Su, James Leroy Deming, Alan Menezes, Pranav Vaidya, Praveen Joginipally, Timothy John Purcell, Manas Mandal
HIGH BANDWIDTH EXTENDED MEMORY IN A PARALLEL PROCESSING SYSTEM

Publication number: 20230315328

Abstract: Various embodiments include techniques for accessing extended memory in a parallel processing system via a high-bandwidth path to extended memory residing on a central processing unit. The disclosed extended memory system extends the directly addressable high-bandwidth memory local to a parallel processing system and avoids the performance penalties associated with low-bandwidth system memory. As a result, execution threads that are highly parallelizable and access a large memory space execute with increased performance on a parallel processing system relative to prior approaches.

Type: Application

Filed: March 18, 2022

Publication date: October 5, 2023

Inventors: Hemayet HOSSAIN, Steven E. MOLNAR, Jonathon Stuart Ramsay EVANS, Wishwesh Anil GANDHI, Lacky V. SHAH, Vyas VENKATARAMAN, Mark HAIRGROVE, Geoffrey GERFIN, Jeffrey M. SMITH, Terje BERGSTROM, Vikram SETHI, Piyush PATEL
Techniques for configuring a processor to function as multiple, separate processors

Patent number: 11663036

Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.

Type: Grant

Filed: September 5, 2019

Date of Patent: May 30, 2023

Assignee: NVIDIA CORPORATION

Inventors: Jerome F. Duluk, Jr., Gregory Scott Palmer, Jonathon Stuart Ramsey Evans, Shailendra Singh, Samuel H. Duncan, Wishwesh Anil Gandhi, Lacky V. Shah, Eric Rock, Feiqi Su, James Leroy Deming, Alan Menezes, Pranav Vaidya, Praveen Joginipally, Timothy John Purcell, Manas Mandal
Techniques for configuring a processor to function as multiple, separate processors

Patent number: 11635986

Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.

Type: Grant

Filed: September 5, 2019

Date of Patent: April 25, 2023

Assignee: NVIDIA CORPORATION

Inventors: Jerome F. Duluk, Jr., Gregory Scott Palmer, Jonathon Stuart Ramsey Evans, Shailendra Singh, Samuel H. Duncan, Wishwesh Anil Gandhi, Lacky V. Shah, Eric Rock, Feiqi Su, James Leroy Deming, Alan Menezes, Pranav Vaidya, Praveen Joginipally, Timothy John Purcell, Manas Mandal
Techniques for reconfiguring partitions in a parallel processing system

Patent number: 11579925

Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.

Type: Grant

Filed: September 5, 2019

Date of Patent: February 14, 2023

Assignee: NVIDIA CORPORATION

Inventors: Jerome F. Duluk, Jr., Gregory Scott Palmer, Jonathon Stuart Ramsey Evans, Shailendra Singh, Samuel H. Duncan, Wishwesh Anil Gandhi, Lacky V. Shah, Eric Rock, Feiqi Su, James Leroy Deming, Alan Menezes, Pranav Vaidya, Praveen Joginipally, Timothy John Purcell, Manas Mandal
Techniques for configuring a processor to function as multiple, separate processors

Patent number: 11249905

Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.

Type: Grant

Filed: September 5, 2019

Date of Patent: February 15, 2022

Assignee: NVIDIA CORPORATION

Inventors: Jerome F. Duluk, Jr., Gregory Scott Palmer, Jonathon Stuart Ramsey Evans, Shailendra Singh, Samuel H. Duncan, Wishwesh Anil Gandhi, Lacky V. Shah, Eric Rock, Feiqi Su, James Leroy Deming, Alan Menezes, Pranav Vaidya, Praveen Joginipally, Timothy John Purcell, Manas Mandal
TECHNIQUE FOR COMPUTATIONAL NESTED PARALLELISM

Publication number: 20210349763

Abstract: One embodiment of the present invention sets forth a technique for performing nested kernel execution within a parallel processing subsystem. The technique involves enabling a parent thread to launch a nested child grid on the parallel processing subsystem, and enabling the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid. This technique advantageously enables the parallel processing subsystem to perform a richer set of programming constructs, such as conditionally executed and nested operations and externally defined library functions without the additional complexity of CPU involvement.

Type: Application

Filed: February 5, 2021

Publication date: November 11, 2021

Inventors: Stephen Jones, Philip Alexander Cuadra, Daniel Elliot Wexler, Ignacio Llamas, Lacky V. Shah, Jerome F. Duluk, JR., Christopher Lamb
TECHNIQUES FOR CONFIGURING A PROCESSOR TO FUNCTION AS MULTIPLE, SEPARATE PROCESSORS IN A VIRTUALIZED ENVIRONMENT

Publication number: 20210157651

Abstract: A parallel processing unit (PPU), operating in a traditional processing environment or in a virtualized processing environment, can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.

Type: Application

Filed: February 1, 2021

Publication date: May 27, 2021

Inventors: Jerome F. DULUK, Jr., Gregory Scott PALMER, Jonathon Stuart Ramsay EVANS, Shailendra SINGH, Samuel H. DUNCAN, Wishwesh Anil GANDHI, Lacky V. SHAH, Eric ROCK, Feiqi SU, James Leroy DEMING, Alan MENEZES, Pranav VAIDYA, Praveen JOGINIPALLY, Timothy John PURCELL, Manas MANDAL
TECHNIQUES FOR CONFIGURING A PROCESSOR TO FUNCTION AS MULTIPLE, SEPARATE PROCESSORS

Publication number: 20210073042

Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.

Type: Application

Filed: September 5, 2019

Publication date: March 11, 2021

Inventors: Jerome F. DULUK, Jr., Gregory Scott PALMER, Jonathon Stuart Ramsey EVANS, Shailendra SINGH, Samuel H. DUNCAN, Wishwesh Anil GANDHI, Lacky V. SHAH, Eric ROCK, Feiqi SU, James Leroy DEMING, Alan MENEZES, Pranav VAIDYA, Praveen JOGINIPALLY, Timothy John PURCELL, Manas MANDAL
TECHNIQUES FOR CONFIGURING A PROCESSOR TO FUNCTION AS MULTIPLE, SEPARATE PROCESSORS

Publication number: 20210073125

Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.

Type: Application

Filed: September 5, 2019

Publication date: March 11, 2021

Inventors: Jerome F. DULUK, JR., Gregory Scott PALMER, Jonathon Stuart Ramsey EVANS, Shailendra SINGH, Samuel H. DUNCAN, Wishwesh Anil GANDHI, Lacky V. SHAH, Eric ROCK, Feiqi SU, James Leroy DEMING, Alan MENEZES, Pranav VAIDYA, Praveen JOGINIPALLY, Timothy John PURCELL, Manas MANDAL
TECHNIQUES FOR CONFIGURING A PROCESSOR TO FUNCTION AS MULTIPLE, SEPARATE PROCESSORS

Publication number: 20210073035

Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.

Type: Application

Filed: September 5, 2019

Publication date: March 11, 2021

Inventors: Jerome F. DULUK, Jr., Gregory Scott PALMER, Jonathon Stuart Ramsey EVANS, Shailendra SINGH, Samuel H. DUNCAN, Wishwesh Anil GANDHI, Lacky V. SHAH, Eric ROCK, Feiqi SU, James Leroy DEMING, Alan MENEZES, Pranav VAIDYA, Praveen JOGINIPALLY, Timothy John PURCELL, Manas MANDAL
TECHNIQUES FOR CONFIGURING A PROCESSOR TO FUNCTION AS MULTIPLE, SEPARATE PROCESSORS

Publication number: 20210073025

Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.

Type: Application

Filed: September 5, 2019

Publication date: March 11, 2021

Inventors: Jerome F. DULUK, JR., Gregory Scott PALMER, Jonathon Stuart Ramsey EVANS, Shailendra SINGH, Samuel H. DUNCAN, Wishwesh Anil GANDHI, Lacky V. SHAH, Eric ROCK, Feiqi SU, James Leroy DEMING, Alan MENEZES, Pranav VAIDYA, Praveen JOGINIPALLY, Timothy John PURCELL, Manas MANDAL
Technique for computational nested parallelism

Patent number: 10915364

Abstract: Apparatuses, systems, and techniques for performing nested kernel execution within a parallel processing subsystem. In at least one embodiment, a parent thread launches a nested child grid on the parallel processing subsystem, and enables the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid.

Type: Grant

Filed: December 2, 2016

Date of Patent: February 9, 2021

Assignee: NVIDIA Corporation

Inventors: Stephen Jones, Philip Alexander Cuadra, Daniel Elliot Wexler, Ignacio Llamas, Lacky V. Shah, Jerome F. Duluk, Christopher Lamb
COMPUTE TASK STATE ENCAPSULATION

Publication number: 20210019185

Abstract: One embodiment of the present invention sets forth a technique for encapsulating compute task state that enables out-of-order scheduling and execution of the compute tasks. The scheduling circuitry organizes the compute tasks into groups based on priority levels. The compute tasks may then be selected for execution using different scheduling schemes. Each group is maintained as a linked list of pointers to compute tasks that are encoded as task metadata (TMD) stored in memory. A TMD encapsulates the state and parameters needed to initialize, schedule, and execute a compute task.

Type: Application

Filed: October 5, 2020

Publication date: January 21, 2021

Inventors: Jerome F. DULUK, JR., Lacky V. SHAH, Sean J. TREICHLER
Compute task state encapsulation

Patent number: 10795722

Abstract: One embodiment of the present invention sets forth a technique for encapsulating compute task state that enables out-of-order scheduling and execution of the compute tasks. The scheduling circuitry organizes the compute tasks into groups based on priority levels. The compute tasks may then be selected for execution using different scheduling schemes. Each group is maintained as a linked list of pointers to compute tasks that are encoded as task metadata (TMD) stored in memory. A TMD encapsulates the state and parameters needed to initialize, schedule, and execute a compute task.

Type: Grant

Filed: November 9, 2011

Date of Patent: October 6, 2020

Assignee: NVIDIA Corporation

Inventors: Jerome F. Duluk, Jr., Lacky V. Shah, Sean J. Treichler
TECHNIQUE FOR COMPUTATIONAL NESTED PARALLELISM

Publication number: 20200151016

Abstract: One embodiment of the present invention sets forth a technique for performing nested kernel execution within a parallel processing subsystem. The technique involves enabling a parent thread to launch a nested child grid on the parallel processing subsystem, and enabling the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid. This technique advantageously enables the parallel processing subsystem to perform a richer set of programming constructs, such as conditionally executed and nested operations and externally defined library functions without the additional complexity of CPU involvement.

Type: Application

Filed: January 17, 2020

Publication date: May 14, 2020

Inventors: Stephen Jones, Philip Alexander Cuadra, Daniel Elliot Wexler, Ignacio Llamas, Lacky V. Shah, Jerome F. Duluk, Jr., Christopher Lamb
Software-assisted instruction level execution preemption

Patent number: 10552202

Abstract: One embodiment of the present invention sets forth a technique for instruction level execution preemption. Preempting at the instruction level does not require any draining of the processing pipeline. No new instructions are issued and the context state is unloaded from the processing pipeline. Any in-flight instructions that follow the preemption command in the processing pipeline are captured and stored in a processing task buffer to be reissued when the preempted program is resumed. The processing task buffer is designated as a high priority task to ensure the preempted instructions are reissued before any new instructions for the preempted context when execution of the preempted context is restored.

Type: Grant

Filed: May 12, 2017

Date of Patent: February 4, 2020

Assignee: NVIDIA CORPORATION

Inventors: Philip Alexander Cuadra, Christopher Lamb, Lacky V. Shah
Software-assisted instruction level execution preemption

Patent number: 10552201

Abstract: One embodiment of the present invention sets forth a technique for instruction level execution preemption. Preempting at the instruction level does not require any draining of the processing pipeline. No new instructions are issued and the context state is unloaded from the processing pipeline. Any in-flight instructions that follow the preemption command in the processing pipeline are captured and stored in a processing task buffer to be reissued when the preempted program is resumed. The processing task buffer is designated as a high priority task to ensure the preempted instructions are reissued before any new instructions for the preempted context when execution of the preempted context is restored.

Type: Grant

Filed: May 12, 2017

Date of Patent: February 4, 2020

Assignee: NVIDIA CORPORATION

Inventors: Philip Alexander Cuadra, Christopher Lamb, Lacky V. Shah
Cooperative thread array granularity context switch during trap handling

Patent number: 10289418

Abstract: Techniques are provided for handling a trap encountered in a thread that is part of a thread array that is being executed in a plurality of execution units. In these techniques, a data structure with an identifier associated with the thread is updated to indicate that the trap occurred during the execution of the thread array. Also in these techniques, the execution units execute a trap handling routine that includes a context switch. The execution units perform this context switch for at least one of the execution units as part of the trap handling routine while allowing the remaining execution units to exit the trap handling routine before the context switch. One advantage of the disclosed techniques is that the trap handling routine operates efficiently in parallel processors.

Type: Grant

Filed: December 27, 2012

Date of Patent: May 14, 2019

Assignee: NVIDIA CORPORATION

Inventors: Gerald F. Luiz, Philip Alexander Cuadra, Luke Durant, Shirish Gadre, Robert Ohannessian, Lacky V. Shah, Nicholas Wang, Arthur Danskin
Technique for saving and restoring thread group operating state

Patent number: 10235208

Abstract: A streaming multiprocessor (SM) included within a parallel processing unit (PPU) is configured to suspend a thread group executing on the SM and to save the operating state of the suspended thread group. A load-store unit (LSU) within the SM re-maps local memory associated with the thread group to a location in global memory. Subsequently, the SM may re-launch the suspended thread group. The LSU may then perform local memory access operations on behalf of the re-launched thread group with the re-mapped local memory that resides in global memory.

Type: Grant

Filed: December 11, 2012

Date of Patent: March 19, 2019

Assignee: NVIDIA CORPORATION

Inventors: Nicholas Wang, Lacky V. Shah, Gerald F. Luiz, Philip Alexander Cuadra, Luke Durant, Shirish Gadre

1 2 3 4 5 next