Patents by Inventor Prakash Bangalore Prabhakar

Prakash Bangalore Prabhakar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230315655
    Abstract: A new synchronization system synchronizes data exchanges between producer processes and consumer processes, which may be on the same or different processors in a multiprocessor system. The synchronization incurs less than one round trip of latency, in some implementations approximately 0.5 round-trip times. A key aspect of the fast synchronization is that the producer's data store is followed without delay by the update of a barrier on which the consumer is waiting.
    Type: Application
    Filed: March 10, 2022
    Publication date: October 5, 2023
    Inventors: Jack CHOQUETTE, Ronny KRASHINSKY, Timothy GUO, Carter EDWARDS, Steve HEINRICH, John EDMONDSON, Prakash Bangalore PRABHAKAR, Apoorv PARLE, Manan PATEL, Olivier GIROUX, Michael PELLAUER
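    A rough sketch of the producer/consumer pattern this abstract describes, using the CUDA libcu++ block-scope barrier: the producer stores its data and then immediately arrives on a barrier that the consumer is waiting on. The buffer size, the element-wise doubling, and the single-kernel producer/consumer split are illustrative assumptions, not details from the patent.

      // Minimal sketch, assuming a 256-thread block: store, then update the barrier.
      #include <cuda/barrier>
      #include <cuda/std/utility>

      __global__ void producer_consumer(const float* __restrict__ in,
                                        float* __restrict__ out, int n)
      {
          __shared__ float buf[256];                                 // staging buffer (assumed size)
          __shared__ cuda::barrier<cuda::thread_scope_block> bar;

          if (threadIdx.x == 0)
              init(&bar, blockDim.x);                                // every thread in the block participates
          __syncthreads();

          int i = blockIdx.x * blockDim.x + threadIdx.x;

          // Producer phase: the data store is followed without delay by a
          // non-blocking arrival on the barrier.
          if (i < n)
              buf[threadIdx.x] = in[i] * 2.0f;
          auto token = bar.arrive();

          // Consumer phase: wait until all producers have arrived, then read.
          bar.wait(cuda::std::move(token));
          if (i < n)
              out[i] = buf[(threadIdx.x + 1) % blockDim.x];
      }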
  • Publication number: 20230289189
    Abstract: Distributed shared memory (DSMEM) comprises blocks of memory that are distributed or scattered across a processor (such as a GPU). Threads executing on a processing core local to one memory block are able to access a memory block local to a different processing core. In one embodiment, shared access to these DSMEM allocations distributed across a collection of processing cores is implemented by communications between the processing cores. Such distributed shared memory provides very low latency memory access for processing cores located in proximity to the memory blocks, and also provides a way for more distant processing cores to access the memory blocks in a manner and using interconnects that do not interfere with the processing cores' access to main or global memory such as backed by an L2 cache.
    Type: Application
    Filed: March 10, 2022
    Publication date: September 14, 2023
    Inventors: Prakash BANGALORE PRABHAKAR, Gentaro HIROTA, Ronny KRASHINSKY, Ze LONG, Brian PHARRIS, Rajballav DASH, Jeff TUCKEY, Jerome F. DULUK, JR., Lacky SHAH, Luke DURANT, Jack CHOQUETTE, Eric WERNESS, Naman GOVIL, Manan PATEL, Shayani DEB, Sandeep NAVADA, John EDMONDSON, Greg PALMER, Wish GANDHI, Ravi MANYAM, Apoorv PARLE, Olivier GIROUX, Shirish GADRE, Steve HEINRICH
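    In CUDA 12 and later, distributed shared memory of this kind is exposed through thread block clusters on Hopper-class (sm_90) GPUs, which is one concrete way to picture the idea in this abstract. The neighbor exchange below, the 128-thread block size, and the 2-block cluster shape are illustrative assumptions.

      // Minimal sketch, assuming sm_90 and CUDA 12+: each block writes its own
      // shared memory, then reads its neighbor's shared memory via DSMEM.
      #include <cooperative_groups.h>
      namespace cg = cooperative_groups;

      __global__ void __cluster_dims__(2, 1, 1) cluster_exchange(float* out)
      {
          __shared__ float local[128];                   // this block's shared memory
          cg::cluster_group cluster = cg::this_cluster();

          local[threadIdx.x] = (float)(cluster.block_rank() * 1000 + threadIdx.x);
          cluster.sync();                                // make every block's writes visible

          // Map a pointer into the peer block's shared memory (the DSMEM access).
          unsigned peer = (cluster.block_rank() + 1) % cluster.num_blocks();
          float* remote = cluster.map_shared_rank(local, peer);

          out[blockIdx.x * blockDim.x + threadIdx.x] = remote[threadIdx.x];
          cluster.sync();                                // keep peer shared memory alive until all reads finish
      }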
  • Publication number: 20230289215
    Abstract: A new level of hierarchy, Cooperative Group Arrays (CGAs), and an associated new hardware-based work distribution/execution model are described. A CGA is a grid of thread blocks (also referred to as cooperative thread arrays (CTAs)). CGAs provide co-scheduling, e.g., control over where CTAs are placed/executed in a processor (such as a GPU), relative to the memory required by an application and relative to each other. Hardware support for such CGAs guarantees concurrency and enables applications to see more data locality, reduced latency, and better synchronization between all the threads in tightly cooperating collections of CTAs programmably distributed across different (e.g., hierarchical) hardware domains or partitions.
    Type: Application
    Filed: March 10, 2022
    Publication date: September 14, 2023
    Inventors: Greg PALMER, Gentaro HIROTA, Ronny KRASHINSKY, Ze LONG, Brian PHARRIS, Rajballav DASH, Jeff TUCKEY, Jerome F. DULUK, JR., Lacky SHAH, Luke DURANT, Jack CHOQUETTE, Eric WERNESS, Naman GOVIL, Manan PATEL, Shayani DEB, Sandeep NAVADA, John EDMONDSON, Prakash BANGALORE PRABHAKAR, Wish GANDHI, Ravi MANYAM, Apoorv PARLE, Olivier GIROUX, Shirish GADRE, Steve HEINRICH
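    On the host side, CUDA exposes this kind of co-scheduled grouping of thread blocks through cluster launch attributes, which gives a reasonable picture of what the abstract calls a CGA. The grid, block, and cluster dimensions below are illustrative assumptions.

      // Minimal host-side sketch, assuming CUDA 12+ and an sm_90 device:
      // every 4 consecutive thread blocks are co-scheduled as one cluster.
      #include <cuda_runtime.h>

      __global__ void cga_kernel(float* data) { /* cluster-cooperative work */ }

      void launch_with_clusters(float* d_data)
      {
          cudaLaunchConfig_t cfg = {};
          cfg.gridDim  = dim3(16, 1, 1);                 // 16 thread blocks in total
          cfg.blockDim = dim3(256, 1, 1);

          cudaLaunchAttribute attr[1];
          attr[0].id = cudaLaunchAttributeClusterDimension;
          attr[0].val.clusterDim.x = 4;                  // 4 blocks per cluster
          attr[0].val.clusterDim.y = 1;
          attr[0].val.clusterDim.z = 1;
          cfg.attrs    = attr;
          cfg.numAttrs = 1;

          // The hardware co-schedules the blocks of each cluster so they run
          // concurrently and can share data with low latency.
          cudaLaunchKernelEx(&cfg, cga_kernel, d_data);
      }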
  • Publication number: 20230289190
    Abstract: This specification describes a programmatic multicast technique enabling one thread (for example, in a cooperative group array (CGA) on a GPU) to request data on behalf of one or more other threads (for example, executing on respective processor cores of the GPU). The multicast is supported by tracking circuitry that interfaces between multicast requests received from processor cores and the available memory. The multicast is designed to reduce cache (for example, layer 2 cache) bandwidth utilization, enabling strong scaling and smaller tile sizes.
    Type: Application
    Filed: March 10, 2022
    Publication date: September 14, 2023
    Inventors: Apoorv PARLE, Ronny KRASHINSKY, John EDMONDSON, Jack CHOQUETTE, Shirish GADRE, Steve HEINRICH, Manan PATEL, Prakash Bangalore PRABHAKAR, Ravi MANYAM, Wish GANDHI, Lacky SHAH, Alexander L. MINKIN
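    The hardware multicast in this abstract relies on dedicated tracking circuitry and is not directly expressible in portable CUDA C++, but the underlying idea of one thread requesting data on behalf of its peers can be pictured with a much simpler block-local sketch. The tile size and the trivial add are illustrative assumptions; this is a concept-level illustration, not the patented mechanism.

      // Concept-level sketch only: one elected thread fetches a tile once, and
      // its peers read the staged copy instead of each issuing its own request.
      __global__ void leader_fetch(const float* __restrict__ src, float* __restrict__ dst)
      {
          __shared__ float tile[256];                    // assumed 256-element tile

          if (threadIdx.x == 0) {
              // The "leader" requests the whole tile on behalf of the block.
              for (int i = 0; i < 256; ++i)
                  tile[i] = src[blockIdx.x * 256 + i];
          }
          __syncthreads();                               // consumers wait for the staged copy

          dst[blockIdx.x * 256 + threadIdx.x] = tile[threadIdx.x] + 1.0f;
      }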
  • Patent number: 11429534
    Abstract: A system having M memory controllers between a first memory and a second memory with N operative memory slices, where N and M are not evenly divisible, includes logic to operate the M memory controllers to linearly distribute addresses of the second memory across the N operative memory slices. The system may be utilized in commercial applications such as data centers, autonomous vehicles, and machine learning.
    Type: Grant
    Filed: April 13, 2021
    Date of Patent: August 30, 2022
    Assignee: NVIDIA Corp.
    Inventors: Prakash Bangalore Prabhakar, James M. Van Dyke, Kun Fang
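    The address-distribution idea in this abstract can be illustrated with a small host-side calculation: consecutive fixed-size chunks of the address space are assigned to slices in round-robin order, so all N operative slices carry traffic even when N and M do not divide evenly. The 6-slice count, 256-byte stride, and modulo mapping are illustrative assumptions, not the patent's actual interleave.

      // Minimal host-side sketch of a linear (round-robin) address-to-slice mapping.
      #include <cstdint>
      #include <cstdio>

      constexpr int      kSlices = 6;      // N operative slices (assumed)
      constexpr uint64_t kStride = 256;    // bytes per slice before moving on (assumed)

      int slice_of(uint64_t addr) { return (int)((addr / kStride) % kSlices); }

      int main()
      {
          // Each consecutive 256-byte chunk lands on the next slice in turn.
          for (uint64_t addr = 0; addr < 8 * kStride; addr += kStride)
              printf("addr 0x%06llx -> slice %d\n", (unsigned long long)addr, slice_of(addr));
          return 0;
      }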
  • Publication number: 20210255963
    Abstract: A system having M memory controllers between a first memory and a second memory with N operative memory slices, where N and M are not evenly divisible, includes logic to operate the M memory controllers to linearly distribute addresses of the second memory across the N operative memory slices. The system may be utilized in commercial applications such as data centers, autonomous vehicles, and machine learning.
    Type: Application
    Filed: April 13, 2021
    Publication date: August 19, 2021
    Applicant: NVIDIA Corp.
    Inventors: Prakash Bangalore Prabhakar, James M. Van Dyke, Kun Fang
  • Patent number: 10983919
    Abstract: An addressing scheme for systems in which the number of operative memory slices in a last-level cache is not evenly divisible by the number of memory channels utilizes all of the operative slices, exposing the full last-level cache bandwidth and capacity to data processing logic in a high-performance graphics system.
    Type: Grant
    Filed: September 25, 2019
    Date of Patent: April 20, 2021
    Assignee: NVIDIA Corp.
    Inventors: Prakash Bangalore Prabhakar, James M. Van Dyke, Kun Fang
  • Publication number: 20210089465
    Abstract: An addressing scheme for systems in which the number of operative memory slices in a last-level cache is not evenly divisible by the number of memory channels utilizes all of the operative slices, exposing the full last-level cache bandwidth and capacity to data processing logic in a high-performance graphics system.
    Type: Application
    Filed: September 25, 2019
    Publication date: March 25, 2021
    Applicant: NVIDIA Corp.
    Inventors: Prakash Bangalore Prabhakar, James M. Van Dyke, Kun Fang