Patents by Inventor Poornachandra Rao

Poornachandra Rao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Techniques for efficiently transferring data to a processor

Patent number: 11907717

Abstract: A technique for block data transfer is disclosed that reduces data transfer and memory access overheads and significantly reduces multiprocessor activity and energy consumption. Threads executing on a multiprocessor needing data stored in global memory can request and store the needed data in on-chip shared memory, which can be accessed by the threads multiple times. The data can be loaded from global memory and stored in shared memory using an instruction which directs the data into the shared memory without storing the data in registers and/or cache memory of the multiprocessor during the data transfer.

Type: Grant

Filed: February 8, 2023

Date of Patent: February 20, 2024

Assignee: NVIDIA Corporation

Inventors: Andrew Kerr, Jack Choquette, Xiaogang Qiu, Omkar Paranjape, Poornachandra Rao, Shirish Gadre, Steven J. Heinrich, Manan Patel, Olivier Giroux, Alan Kaatz
TECHNIQUES FOR EFFICIENTLY TRANSFERRING DATA TO A PROCESSOR

Publication number: 20230185570

Abstract: A technique for block data transfer is disclosed that reduces data transfer and memory access overheads and significantly reduces multiprocessor activity and energy consumption. Threads executing on a multiprocessor needing data stored in global memory can request and store the needed data in on-chip shared memory, which can be accessed by the threads multiple times. The data can be loaded from global memory and stored in shared memory using an instruction which directs the data into the shared memory without storing the data in registers and/or cache memory of the multiprocessor during the data transfer.

Type: Application

Filed: February 8, 2023

Publication date: June 15, 2023

Inventors: Andrew KERR, Jack Choquette, Xiaogang Qiu, Omkar Paranjape, Poornachandra Rao, Shirish Gadre, Steven J. Heinrich, Manan Patel, Olivier Giroux, Alan Kaatz
Techniques for efficiently transferring data to a processor

Patent number: 11604649

Abstract: A technique for block data transfer is disclosed that reduces data transfer and memory access overheads and significantly reduces multiprocessor activity and energy consumption. Threads executing on a multiprocessor needing data stored in global memory can request and store the needed data in on-chip shared memory, which can be accessed by the threads multiple times. The data can be loaded from global memory and stored in shared memory using an instruction which directs the data into the shared memory without storing the data in registers and/or cache memory of the multiprocessor during the data transfer.

Type: Grant

Filed: June 30, 2021

Date of Patent: March 14, 2023

Assignee: NVIDIA Corporation

Inventors: Andrew Kerr, Jack Choquette, Xiaogang Qiu, Omkar Paranjape, Poornachandra Rao, Shirish Gadre, Steven J. Heinrich, Manan Patel, Olivier Giroux, Alan Kaatz
Unified cache for diverse memory traffic

Patent number: 11347668

Abstract: A unified cache subsystem includes a data memory configured as both a shared memory and a local cache memory. The unified cache subsystem processes different types of memory transactions using different data pathways. To process memory transactions that target shared memory, the unified cache subsystem includes a direct pathway to the data memory. To process memory transactions that do not target shared memory, the unified cache subsystem includes a tag processing pipeline configured to identify cache hits and cache misses. When the tag processing pipeline identifies a cache hit for a given memory transaction, the transaction is rerouted to the direct pathway to data memory. When the tag processing pipeline identifies a cache miss for a given memory transaction, the transaction is pushed into a first-in first-out (FIFO) until miss data is returned from external memory. The tag processing pipeline is also configured to process texture-oriented memory transactions.

Type: Grant

Filed: July 6, 2020

Date of Patent: May 31, 2022

Assignee: NVIDIA Corporation

Inventors: Xiaogang Qiu, Ronny Krashinsky, Steven Heinrich, Shirish Gadre, John Edmondson, Jack Choquette, Mark Gebhart, Ramesh Jandhyala, Poornachandra Rao, Omkar Paranjape, Michael Siu
METHOD AND SYSTEM FOR VEHICLE TRACTION

Publication number: 20210362751

Abstract: Various methods and systems are provided for automatically applying tractive material during operation of a vehicle. In one embodiment, the application of tractive material for a vehicle is adjusted based on calculated and expected tractive effort for each powered truck, axle, or wheel of the vehicle. In this way, tractive material application is controlled on a per truck, axle, or wheel basis in order to reduce excessive use of tractive material during operation of the vehicle.

Type: Application

Filed: May 19, 2021

Publication date: November 25, 2021

Inventors: Poornachandra Rao Gadipudi, Rajeev R. Verma
TECHNIQUES FOR EFFICIENTLY TRANSFERRING DATA TO A PROCESSOR

Publication number: 20210326137

Abstract: A technique for block data transfer is disclosed that reduces data transfer and memory access overheads and significantly reduces multiprocessor activity and energy consumption. Threads executing on a multiprocessor needing data stored in global memory can request and store the needed data in on-chip shared memory, which can be accessed by the threads multiple times. The data can be loaded from global memory and stored in shared memory using an instruction which directs the data into the shared memory without storing the data in registers and/or cache memory of the multiprocessor during the data transfer.

Type: Application

Filed: June 30, 2021

Publication date: October 21, 2021

Inventors: Andrew KERR, Jack CHOQUETTE, Xiaogang QIU, Omkar PARANJAPE, Poornachandra RAO, Shirish GADRE, Steven J. HEINRICH, Manan PATEL, Olivier GIROUX, Alan KAATZ
Techniques for efficiently transferring data to a processor

Patent number: 11080051

Abstract: A technique for block data transfer is disclosed that reduces data transfer and memory access overheads and significantly reduces multiprocessor activity and energy consumption. Threads executing on a multiprocessor needing data stored in global memory can request and store the needed data in on-chip shared memory, which can be accessed by the threads multiple times. The data can be loaded from global memory and stored in shared memory using an instruction which directs the data into the shared memory without storing the data in registers and/or cache memory of the multiprocessor during the data transfer.

Type: Grant

Filed: December 12, 2019

Date of Patent: August 3, 2021

Assignee: NVIDIA Corporation

Inventors: Andrew Kerr, Jack Choquette, Xiaogang Qiu, Omkar Paranjape, Poornachandra Rao, Shirish Gadre, Steven J. Heinrich, Manan Patel, Olivier Giroux, Alan Kaatz
TECHNIQUES FOR EFFICIENTLY TRANSFERRING DATA TO A PROCESSOR

Publication number: 20210124582

Abstract: A technique for block data transfer is disclosed that reduces data transfer and memory access overheads and significantly reduces multiprocessor activity and energy consumption. Threads executing on a multiprocessor needing data stored in global memory can request and store the needed data in on-chip shared memory, which can be accessed by the threads multiple times. The data can be loaded from global memory and stored in shared memory using an instruction which directs the data into the shared memory without storing the data in registers and/or cache memory of the multiprocessor during the data transfer.

Type: Application

Filed: December 12, 2019

Publication date: April 29, 2021

Inventors: Andrew Kerr, Jack Choquette, Xiaogang Qiu, Omkar Paranjape, Poornachandra Rao, Shirish Gadre, Steven J. Heinrich, Manan Patel, Olivier Giroux, Alan Kaatz
UNIFIED CACHE FOR DIVERSE MEMORY TRAFFIC

Publication number: 20200401541

Abstract: A unified cache subsystem includes a data memory configured as both a shared memory and a local cache memory. The unified cache subsystem processes different types of memory transactions using different data pathways. To process memory transactions that target shared memory, the unified cache subsystem includes a direct pathway to the data memory. To process memory transactions that do not target shared memory, the unified cache subsystem includes a tag processing pipeline configured to identify cache hits and cache misses. When the tag processing pipeline identifies a cache hit for a given memory transaction, the transaction is rerouted to the direct pathway to data memory. When the tag processing pipeline identifies a cache miss for a given memory transaction, the transaction is pushed into a first-in first-out (FIFO) until miss data is returned from external memory. The tag processing pipeline is also configured to process texture-oriented memory transactions.

Type: Application

Filed: July 6, 2020

Publication date: December 24, 2020

Inventors: Xiaogang QIU, Ronny KRASHINSKY, Steven HEINRICH, Shirish GADRE, John EDMONDSON, Jack CHOQUETTE, Mark GEBHART, Ramesh JANDHYALA, Poornachandra RAO, Omkar PARANJAPE, Michael SIU
Unified cache for diverse memory traffic

Patent number: 10705994

Abstract: A unified cache subsystem includes a data memory configured as both a shared memory and a local cache memory. The unified cache subsystem processes different types of memory transactions using different data pathways. To process memory transactions that target shared memory, the unified cache subsystem includes a direct pathway to the data memory. To process memory transactions that do not target shared memory, the unified cache subsystem includes a tag processing pipeline configured to identify cache hits and cache misses. When the tag processing pipeline identifies a cache hit for a given memory transaction, the transaction is rerouted to the direct pathway to data memory. When the tag processing pipeline identifies a cache miss for a given memory transaction, the transaction is pushed into a first-in first-out (FIFO) until miss data is returned from external memory. The tag processing pipeline is also configured to process texture-oriented memory transactions.

Type: Grant

Filed: May 4, 2017

Date of Patent: July 7, 2020

Assignee: NVIDIA Corporation

Inventors: Xiaogang Qiu, Ronny Krashinsky, Steven Heinrich, Shirish Gadre, John Edmondson, Jack Choquette, Mark Gebhart, Ramesh Jandhyala, Poornachandra Rao, Omkar Paranjape, Michael Siu
Unified cache for diverse memory traffic

Patent number: 10459861

Abstract: A unified cache subsystem includes a data memory configured as both a shared memory and a local cache memory. The unified cache subsystem processes different types of memory transactions using different data pathways. To process memory transactions that target shared memory, the unified cache subsystem includes a direct pathway to the data memory. To process memory transactions that do not target shared memory, the unified cache subsystem includes a tag processing pipeline configured to identify cache hits and cache misses. When the tag processing pipeline identifies a cache hit for a given memory transaction, the transaction is rerouted to the direct pathway to data memory. When the tag processing pipeline identifies a cache miss for a given memory transaction, the transaction is pushed into a first-in first-out (FIFO) until miss data is returned from external memory. The tag processing pipeline is also configured to process texture-oriented memory transactions.

Type: Grant

Filed: September 26, 2017

Date of Patent: October 29, 2019

Assignee: NVIDIA CORPORATION

Inventors: Xiaogang Qiu, Ronny Krashinsky, Steven Heinrich, Shirish Gadre, John Edmondson, Jack Choquette, Mark Gebhart, Ramesh Jandhyala, Poornachandra Rao, Omkar Paranjape, Michael Siu
UNIFIED CACHE FOR DIVERSE MEMORY TRAFFIC

Publication number: 20180322078

Abstract: A unified cache subsystem includes a data memory configured as both a shared memory and a local cache memory. The unified cache subsystem processes different types of memory transactions using different data pathways. To process memory transactions that target shared memory, the unified cache subsystem includes a direct pathway to the data memory. To process memory transactions that do not target shared memory, the unified cache subsystem includes a tag processing pipeline configured to identify cache hits and cache misses. When the tag processing pipeline identifies a cache hit for a given memory transaction, the transaction is rerouted to the direct pathway to data memory. When the tag processing pipeline identifies a cache miss for a given memory transaction, the transaction is pushed into a first-in first-out (FIFO) until miss data is returned from external memory. The tag processing pipeline is also configured to process texture-oriented memory transactions.

Type: Application

Filed: September 26, 2017

Publication date: November 8, 2018

Inventors: Xiaogang QIU, Ronny KRASHINSKY, Steven HEINRICH, Shirish GADRE, John EDMONDSON, Jack CHOQUETTE, Mark GEBHART, Ramesh JANDHYALA, Poornachandra RAO, Omkar PARANJAPE, Michael SIU
UNIFIED CACHE FOR DIVERSE MEMORY TRAFFIC

Publication number: 20180322077

Abstract: A unified cache subsystem includes a data memory configured as both a shared memory and a local cache memory. The unified cache subsystem processes different types of memory transactions using different data pathways. To process memory transactions that target shared memory, the unified cache subsystem includes a direct pathway to the data memory. To process memory transactions that do not target shared memory, the unified cache subsystem includes a tag processing pipeline configured to identify cache hits and cache misses. When the tag processing pipeline identifies a cache hit for a given memory transaction, the transaction is rerouted to the direct pathway to data memory. When the tag processing pipeline identifies a cache miss for a given memory transaction, the transaction is pushed into a first-in first-out (FIFO) until miss data is returned from external memory. The tag processing pipeline is also configured to process texture-oriented memory transactions.

Type: Application

Filed: May 4, 2017

Publication date: November 8, 2018

Inventors: Xiaogang QIU, Ronny KRASHINSKY, Steven HEINRICH, Shirish GADRE, John EDMONDSON, Jack CHOQUETTE, Mark GEBHART, Ramesh JANDHYALA, Poornachandra RAO, Omkar PARANJAPE, Michael SIU
Approach to caching decoded texture data with variable dimensions

Patent number: 10032246

Abstract: A texture processing pipeline is configured to store decoded texture data within a cache unit in order to expedite the processing of texture requests. When a texture request is processed, the texture processing pipeline queries the cache unit to determine whether the requested data is resident in the cache. If the data is not resident in the cache unit, a cache miss occurs. The texture processing pipeline then reads encoded texture data from global memory, decodes that data, and writes different portions of the decoded memory into the cache unit at specific locations according to a caching map. If the data is, in fact, resident in the cache unit, a cache hit occurs, and the texture processing pipeline then reads decoded portions of the requested texture data from the cache unit and combines those portions according to the caching map.

Type: Grant

Filed: October 9, 2013

Date of Patent: July 24, 2018

Assignee: NVIDIA CORPORATION

Inventors: Eric T. Anderson, Poornachandra Rao
Managing memory regions to support sparse mappings

Patent number: 9754561

Abstract: One embodiment of the present invention includes a memory management unit (MMU) that is configured to manage sparse mappings. The MMU processes requests to translate virtual addresses to physical addresses based on page table entries (PTEs) that indicate a sparse status. If the MMU determines that the PTE does not include a mapping from a virtual address to a physical address, then the MMU responds to the request based on the sparse status. If the sparse status is active, then the MMU determines the physical address based on whether the type of the request is a write operation and, subsequently, generates an acknowledgement of the request. By contrast, if the sparse status is not active, then the MMU generates a page fault. Advantageously, the disclosed embodiments enable the computer system to manage sparse mappings without incurring the performance degradation associated with both page faults and conventional software-based sparse mapping management.

Type: Grant

Filed: October 4, 2013

Date of Patent: September 5, 2017

Assignee: NVIDIA CORPORATION

Inventors: Jonathan Dunaisky, Henry Packard Moreton, Jeffrey A. Bolz, Yury Y. Uralsky, James Leroy Deming, Rui M. Bastos, Patrick R. Brown, Amanpreet Grewal, Christian Amsinck, Poornachandra Rao, Jerome F. Duluk, Jr., Andrew J. Tao
Load/store operations in texture hardware

Patent number: 9595075

Abstract: Approaches are disclosed for performing memory access operations in a texture processing pipeline having a first portion configured to process texture memory access operations and a second portion configured to process non-texture memory access operations. A texture unit receives a memory access request. The texture unit determines whether the memory access request includes a texture memory access operation. If the memory access request includes a texture memory access operation, then the texture unit processes the memory access request via at least the first portion of the texture processing pipeline, otherwise, the texture unit processes the memory access request via at least the second portion of the texture processing pipeline. One advantage of the disclosed approach is that the same processing and cache memory may be used for both texture operations and load/store operations to various other address spaces, leading to reduced surface area and power consumption.

Type: Grant

Filed: September 26, 2013

Date of Patent: March 14, 2017

Assignee: NVIDIA Corporation

Inventors: Steven J. Heinrich, Eric T. Anderson, Jeffrey A. Bolz, Jonathan Dunaisky, Ramesh Jandhyala, Joel McCormack, Alexander L. Minkin, Bryon S. Nordquist, Poornachandra Rao
CATEGORIZING DATA BASED ON CROSS-CATEGORY RELEVANCE

Publication number: 20150235281

Abstract: Techniques for auto-categorization data may be provided. For example, a computing service may be implemented to analyze data sets. A first data set may include data strings pre-categorized in various groups. For a group, the computing service may generate a relevant data string representative of the group by considering how relevant that data string may be to the group and to other groups. A second data set may include an uncategorized data string. The computing service may match the uncategorized data string to the relevant data string and, accordingly, may categorize the uncategorized data string as belonging to the group.

Type: Application

Filed: June 16, 2014

Publication date: August 20, 2015

Inventors: Sarthak Jain, Poornachandra Rao Purushottama Pesala, Sagar Chodapaneedi
MANAGING MEMORY REGIONS TO SUPPORT SPARSE MAPPINGS

Publication number: 20150097847

Abstract: One embodiment of the present invention includes a memory management unit (MMU) that is configured to manage sparse mappings. The MMU processes requests to translate virtual addresses to physical addresses based on page table entries (PTEs) that indicate a sparse status. If the MMU determines that the PTE does not include a mapping from a virtual address to a physical address, then the MMU responds to the request based on the sparse status. If the sparse status is active, then the MMU determines the physical address based on whether the type of the request is a write operation and, subsequently, generates an acknowledgement of the request. By contrast, if the sparse status is not active, then the MMU generates a page fault. Advantageously, the disclosed embodiments enable the computer system to manage sparse mappings without incurring the performance degradation associated with both page faults and conventional software-based sparse mapping management.

Type: Application

Filed: October 4, 2013

Publication date: April 9, 2015

Applicant: NVIDIA CORPORATION

Inventors: Jonathan DUNAISKY, Henry Packard MORETON, Jeffrey A. BOLZ, Yury Y. URALSKY, James Leroy DEMING, Rui M. BASTOS, Patrick R. BROWN, Amanpreet GREWAL, Christian AMSINCK, Poornachandra RAO, Jerome F. DULUK, JR., Andrew J. TAO
APPROACH TO CACHING DECODED TEXTURE DATA WITH VARIABLE DIMENSIONS

Publication number: 20150097851

Abstract: A texture processing pipeline is configured to store decoded texture data within a cache unit in order to expedite the processing of texture requests. When a texture request is processed, the texture processing pipeline queries the cache unit to determine whether the requested data is resident in the cache. If the data is not resident in the cache unit, a cache miss occurs. The texture processing pipeline then reads encoded texture data from global memory, decodes that data, and writes different portions of the decoded memory into the cache unit at specific locations according to a caching map. If the data is, in fact, resident in the cache unit, a cache hit occurs, and the texture processing pipeline then reads decoded portions of the requested texture data from the cache unit and combines those portions according to the caching map.

Type: Application

Filed: October 9, 2013

Publication date: April 9, 2015

Applicant: NVIDIA CORPORATION

Inventors: Eric T. ANDERSON, Poornachandra RAO
LOAD/STORE OPERATIONS IN TEXTURE HARDWARE

Publication number: 20150084975

Abstract: Approaches are disclosed for performing memory access operations in a texture processing pipeline having a first portion configured to process texture memory access operations and a second portion configured to process non-texture memory access operations. A texture unit receives a memory access request. The texture unit determines whether the memory access request includes a texture memory access operation. If the memory access request includes a texture memory access operation, then the texture unit processes the memory access request via at least the first portion of the texture processing pipeline, otherwise, the texture unit processes the memory access request via at least the second portion of the texture processing pipeline. One advantage of the disclosed approach is that the same processing and cache memory may be used for both texture operations and load/store operations to various other address spaces, leading to reduced surface area and power consumption.

Type: Application

Filed: September 26, 2013

Publication date: March 26, 2015

Applicant: NVIDIA CORPORATION

Inventors: Steven J. HEINRICH, Eric T. ANDERSON, Jeffrey A. BOLZ, Jonathan DUNAISKY, Ramesh JANDHYALA, Joel MCCORMACK, Alexander L. MINKIN, Bryon S. NORDQUIST, Poornachandra RAO

1 2 next