Patents by Inventor Naveen Cherukuri

Naveen Cherukuri has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11966480
    Abstract: Apparatuses, systems, and techniques for supporting fairness of cryptographic hardware shared by multiple contexts. An accelerator circuit includes a copy engine (CE) with AES-GCM hardware configured to perform both encryption and authentication of data transfers for multiple applications, or for multiple data streams in a single application or belonging to a single user. The CE splits a data transfer of a specified size into a set of partial transfers and sequentially executes the partial transfers using an application's context for a period of time (e.g., a timeslice). The CE stores, in a secure memory for the application, one or more data items used for encryption or decryption (e.g., a hash key, a block counter, etc.) computed from the last partial transfer. These data items are retrieved and used when the data transfer for the application is resumed by the CE. (A conceptual sketch of this scheme appears after this listing.)
    Type: Grant
    Filed: March 10, 2022
    Date of Patent: April 23, 2024
    Assignee: NVIDIA Corporation
    Inventors: Adam Hendrickson, Vaishali Kulkarni, Gobikrishna Dhanuskodi, Naveen Cherukuri, Wish Gandhi, Raymond Wong
  • Publication number: 20230297696
    Abstract: In examples, a parallel processing unit (PPU) operates within a trusted execution environment (TEE) implemented using a central processing unit (CPU). A virtual machine (VM) executing within the TEE is provided access to the PPU by a hypervisor. However, data of an application executed by the VM is inaccessible to the hypervisor and other untrusted entities outside of the TEE. To protect the data in transit, the VM and the PPU may encrypt or decrypt the data for secure communication between the devices. To protect the data within the PPU, a protected memory region may be created in PPU memory where compute engines of the PPU are prevented from writing outside of the protected memory region. A write protect memory region is generated where access to the PPU memory is blocked from other computing devices and/or device instances.
    Type: Application
    Filed: March 17, 2023
    Publication date: September 21, 2023
    Inventors: Philip Rogers, Mark Overby, Vyas Venkataraman, Naveen Cherukuri, James Leroy Deming, Gobikrishna Dhanuskodi, Dwayne Swoboda, Lucien Dunning, Aruna Manjunatha, Aaron Jiricek, Mark Hairgrove, Michael Woodmansee
  • Publication number: 20230297406
    Abstract: In examples, trusted execution environments (TEEs) are provided for an instance of a parallel processing unit (PPU) as PPU TEEs. Different instances of a PPU correspond to different PPU TEEs and provide accelerated confidential computing to a corresponding TEE. The processors of each PPU instance have separate and isolated paths through the memory system of the PPU, which are assigned uniquely to an individual PPU instance. Data in device memory of the PPU may be isolated and access controlled amongst the PPU instances using one or more hardware firewalls. A GPU hypervisor assigns hardware resources to runtimes and performs access control and context switching for the runtimes. A PPU instance uses a cryptographic key to protect data for secure communication. Compute engines of the PPU instance are prevented from writing outside of a protected memory region. Access to a write protected region in PPU memory is blocked from other computing devices and/or device instances.
    Type: Application
    Filed: March 17, 2023
    Publication date: September 21, 2023
    Inventors: Philip Rogers, Mark Overby, Vyas Venkataraman, Naveen Cherukuri, James Leroy Deming, Gobikrishna Dhanuskodi, Dwayne Swoboda, Lucien Dunning, Aruna Manjunatha, Aaron Jiricek, Mark Hairgrove, Mike Woodmansee
  • Publication number: 20230289453
    Abstract: Apparatuses, systems, and techniques for supporting fairness of cryptographic hardware shared by multiple contexts. An accelerator circuit includes a copy engine (CE) with AES-GCM hardware configured to perform both encryption and authentication of data transfers for multiple applications, or for multiple data streams in a single application or belonging to a single user. The CE splits a data transfer of a specified size into a set of partial transfers and sequentially executes the partial transfers using an application's context for a period of time (e.g., a timeslice). The CE stores, in a secure memory for the application, one or more data items used for encryption or decryption (e.g., a hash key, a block counter, etc.) computed from the last partial transfer. These data items are retrieved and used when the data transfer for the application is resumed by the CE.
    Type: Application
    Filed: March 10, 2022
    Publication date: September 14, 2023
    Inventors: Adam Hendrickson, Vaishali Kulkarni, Gobikrishna Dhanuskodi, Naveen Cherukuri, Wish Gandhi, Raymond Wong
  • Publication number: 20230267235
    Abstract: Apparatuses, systems, and techniques for handling faults by a direct memory access (DMA) engine. When a DMA engine detects an error associated with an encryption or decryption operation, it reports the error both to a CPU, which may be executing untrusted software that directed the DMA operation, and to a secure processor. The DMA engine then waits for clearance from the secure processor before responding to further directions from the potentially untrusted software. (A conceptual sketch of this flow appears after this listing.)
    Type: Application
    Filed: February 22, 2022
    Publication date: August 24, 2023
    Inventors: Anuj Rao, Adam Hendrickson, Vaishali Kulkarni, Gobikrishna Dhanuskodi, Naveen Cherukuri
  • Patent number: 11720440
    Abstract: Various embodiments include a parallel processing computer system that detects memory errors as a memory client loads data from memory and disables the memory client from storing data to memory, thereby reducing the likelihood that the memory error propagates to other memory clients. The memory client initiates a stall sequence, while other memory clients continue to execute instructions and the memory continues to service memory load and store operations. When a memory error is detected, a specific bit pattern is stored in conjunction with the data associated with the memory error. When the data is copied from one memory to another memory, the specific bit pattern is also copied, in order to identify the data as having a memory error. (A conceptual sketch of this containment behavior appears after this listing.)
    Type: Grant
    Filed: July 12, 2021
    Date of Patent: August 8, 2023
    Assignee: NVIDIA Corporation
    Inventors: Naveen Cherukuri, Saurabh Hukerikar, Paul Racunas, Nirmal Raj Saxena, David Charles Patrick, Yiyang Feng, Abhijeet Ghadge, Steven James Heinrich, Adam Hendrickson, Gentaro Hirota, Praveen Joginipally, Vaishali Kulkarni, Peter C. Mills, Sandeep Navada, Manan Patel, Liang Yin
  • Patent number: 11698869
    Abstract: Apparatuses, systems, and techniques are described for computing an authentication tag for a data transfer when the transfer is scheduled as partial transfers across a specified number of direct memory access (DMA) engines. An orchestration circuit stores partial authentication tags, computed by the DMA engines, and corresponding adjustment exponents during one or more rounds in which the partial transfers are scheduled and processed by the DMA engines. During a last round, a combined authentication tag is computed from the partial authentication tags and the corresponding adjustment exponents stored by the orchestration circuit during the rounds. (A conceptual sketch of the tag-combining step appears after this listing.)
    Type: Grant
    Filed: March 10, 2022
    Date of Patent: July 11, 2023
    Assignee: NVIDIA Corporation
    Inventors: Vaishali Kulkarni, Naveen Cherukuri, Raymond Wong, Adam Hendrickson, Gobikrishna Dhanuskodi, Wish Gandhi
  • Publication number: 20230103518
    Abstract: Apparatuses, systems, and techniques to generate a trusted execution environment including multiple accelerators. In at least one embodiment, a parallel processing unit (PPU), such as a graphics processing unit (GPU), operates in a secure execution mode that includes a protected memory region. Furthermore, in an embodiment, a cryptographic key is utilized to protect data during transmission between the accelerators. (A conceptual sketch of the protected-region check appears after this listing.)
    Type: Application
    Filed: September 24, 2021
    Publication date: April 6, 2023
    Inventors: Philip John Rogers, Mark Overby, Michael Asbury Woodmansee, Vyas Venkataraman, Naveen Cherukuri, Gobikrishna Dhanuskodi, Dwayne Frank Swoboda, Lucien Burton Dunning, Mark Hairgrove, Sudeshna Guha
  • Publication number: 20230094125
    Abstract: Apparatuses, systems, and techniques to generate a trusted execution environment including multiple accelerators. In at least one embodiment, a parallel processing unit (PPU), such as a graphics processing unit (GPU), operates in a secure execution mode that includes a protected memory region. Furthermore, in an embodiment, a cryptographic key is utilized to protect data during transmission between the accelerators.
    Type: Application
    Filed: September 24, 2021
    Publication date: March 30, 2023
    Inventors: Philip John Rogers, Mark Overby, Michael Asbury Woodmansee, Vyas Venkataraman, Naveen Cherukuri, Gobikrishna Dhanuskodi, Dwayne Frank Swoboda, Lucien Burton Dunning, Mark Hairgrove, Sudeshna Guha
  • Publication number: 20230011863
    Abstract: Various embodiments include a parallel processing computer system that detects memory errors as a memory client loads data from memory and disables the memory client from storing data to memory, thereby reducing the likelihood that the memory error propagates to other memory clients. The memory client initiates a stall sequence, while other memory clients continue to execute instructions and the memory continues to service memory load and store operations. When a memory error is detected, a specific bit pattern is stored in conjunction with the data associated with the memory error. When the data is copied from one memory to another memory, the specific bit pattern is also copied, in order to identify the data as having a memory error.
    Type: Application
    Filed: July 12, 2021
    Publication date: January 12, 2023
    Inventors: Naveen Cherukuri, Saurabh Hukerikar, Paul Racunas, Nirmal Raj Saxena, David Charles Patrick, Yiyang Feng, Abhijeet Ghadge, Steven James Heinrich, Adam Hendrickson, Gentaro Hirota, Praveen Joginipally, Vaishali Kulkarni, Peter C. Mills, Sandeep Navada, Manan Patel, Liang Yin
  • Publication number: 20210294707
    Abstract: Apparatuses, systems, and techniques to detect memory errors and isolate or migrate partitions on a parallel processing unit using an application programming interface to facilitate parallel computing, such as CUDA. In at least one embodiment, interrupts are intercepted and processed on a graphics processing unit indicating a memory error for one or more partitions, and a policy is applied to isolate that memory error from other partitions.
    Type: Application
    Filed: March 20, 2020
    Publication date: September 23, 2021
    Inventors: Jonathon Stuart Ramsay Evans, Naveen Cherukuri, Jerome Francis Duluk, Jr., Shailendra Singh, Vaibhav Vyas, Wishwesh Gandhi, Arvind Gopalakrishnan, Manas Mandal
  • Patent number: 10712809
    Abstract: Methods and apparatus relating to link power savings with state retention are described. In one embodiment, one or more components of two agents coupled via a serial link are turned off during idle periods while retaining link state in each agent. Other embodiments are also disclosed.
    Type: Grant
    Filed: January 7, 2019
    Date of Patent: July 14, 2020
    Assignee: Intel Corporation
    Inventors: Naveen Cherukuri, Jeffrey Wilcox, Venkatraman Iyer, Selim Bilgin, David S. Dunning, Robin Tim Frodsham, Theodore Z. Schoenborn, Sanjay Dabral
  • Publication number: 20190346909
    Abstract: Methods and apparatus relating to link power savings with state retention are described. In one embodiment, one or more components of two agents coupled via a serial link are turned off during idle periods while retaining link state in each agent. Other embodiments are also disclosed.
    Type: Application
    Filed: January 7, 2019
    Publication date: November 14, 2019
    Applicant: Intel Corporation
    Inventors: Naveen Cherukuri, Jeffrey Wilcox, Venkatraman Iyer, Selim Bilgin, David S. Dunning, Robin Tim Frodsham, Theodore Z. Schoenborn, Sanjay Dabral
  • Patent number: 10175744
    Abstract: Methods and apparatus relating to link power savings with state retention are described. In one embodiment, one or more components of two agents coupled via a serial link are turned off during idle periods while retaining link state in each agent. Other embodiments are also disclosed.
    Type: Grant
    Filed: March 7, 2017
    Date of Patent: January 8, 2019
    Assignee: Intel Corporation
    Inventors: Naveen Cherukuri, Jeffrey Wilcox, Venkatraman Iyer, Selim Bilgin, David S. Dunning, Robin Tim Frodsham, Theodore Z. Schoenborn, Sanjay Dabral
  • Publication number: 20170336853
    Abstract: Methods and apparatus relating to link power savings with state retention are described. In one embodiment, one or more components of two agents coupled via a serial link are turned off during idle periods while retaining link state in each agent. Other embodiments are also disclosed.
    Type: Application
    Filed: March 7, 2017
    Publication date: November 23, 2017
    Inventors: Naveen Cherukuri, Jeffrey Wilcox, Venkatraman Iyer, Selim Bilgin, David S. Dunning, Robin Tim Frodsham, Theodore Z. Schoenborn, Sanjay Dabral
  • Patent number: 9794349
    Abstract: Systems and methods of managing a link provide for receiving, during link initialization, a remote width capability corresponding to a remote port. A link between a local port and the remote port is operated at a plurality of link widths in accordance with the remote width capability. (A conceptual sketch of the width negotiation appears after this listing.)
    Type: Grant
    Filed: November 26, 2014
    Date of Patent: October 17, 2017
    Assignee: Intel Corporation
    Inventors: Naveen Cherukuri, Aaron T. Spink, Phanindra Mannava, Tim Frodsham, Jeffrey R. Wilcox, Sanjay Dabral, David Dunning, Theodore Z. Schoenborn
  • Patent number: 9588575
    Abstract: Methods and apparatus relating to link power savings with state retention are described. In one embodiment, one or more components of two agents coupled via a serial link are turned off during idle periods while retaining link state in each agent. Other embodiments are also disclosed.
    Type: Grant
    Filed: July 1, 2014
    Date of Patent: March 7, 2017
    Assignee: Intel Corporation
    Inventors: Naveen Cherukuri, Jeffrey Wilcox, Venkatraman Iyer, Selim Bilgin, David S. Dunning, Robin Tim Frodsham, Theodore Z. Schoenborn, Sanjay Dabral
  • Patent number: 9424191
    Abstract: An apparatus of an aspect includes a plurality of cores. The plurality of cores are logically grouped into a plurality of clusters. A cluster sharing map-based coherence directory is coupled with the plurality of cores and is to track sharing of data among the plurality of cores. The cluster sharing map-based coherence directory includes a tag array to store corresponding pairs of addresses and cluster identifiers. Each of the addresses is to identify data. Each of the cluster identifiers is to identify one of the clusters. The cluster sharing map-based coherence directory also includes a cluster sharing map array to store cluster sharing maps. Each of the cluster sharing maps corresponds to one of the pairs of addresses and cluster identifiers. Each of the cluster sharing maps is to indicate intra-cluster sharing of data identified by the corresponding address within a cluster identified by the corresponding cluster identifier. (A conceptual sketch of this directory structure appears after this listing.)
    Type: Grant
    Filed: June 29, 2012
    Date of Patent: August 23, 2016
    Assignee: Intel Corporation
    Inventors: Naveen Cherukuri, Mani Azimi
  • Patent number: 9418011
    Abstract: In one embodiment, the present invention includes a processor comprising a page tracker buffer (PTB), the PTB including a plurality of entries to store an address to a cache page and to store a signature to track an access to each cache line of the cache page, and a PTB handler, the PTB handler to load entries into the PTB and to update the signature. Other embodiments are also described and claimed. (A conceptual sketch of the page tracker buffer appears after this listing.)
    Type: Grant
    Filed: June 23, 2010
    Date of Patent: August 16, 2016
    Assignee: Intel Corporation
    Inventors: Livio B. Soares, Naveen Cherukuri, Akhilesh Kumar, Mani Azimi
  • Patent number: 8990506
    Abstract: In one embodiment, the present invention includes a cache memory including cache lines that each have a tag field including a state portion to store a cache coherency state of data stored in the line and a weight portion to store a weight corresponding to a relative importance of the data. In various implementations, the weight can be based on the cache coherency state and a recency of usage of the data. Other embodiments are described and claimed. (A conceptual sketch of this replacement policy appears after this listing.)
    Type: Grant
    Filed: December 16, 2009
    Date of Patent: March 24, 2015
    Assignee: Intel Corporation
    Inventors: Naveen Cherukuri, Dennis W. Brzezinski, Ioannis T. Schoinas, Anahita Shayesteh, Akhilesh Kumar, Mani Azimi
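
Conceptual sketches of selected mechanisms

Patent 11966480 and its published application 20230289453 describe a copy engine that timeslices AES-GCM work across applications by saving and restoring per-application cryptographic state between partial transfers. The Python sketch below models only that bookkeeping, under assumed names (CopyEngine, GcmContext, a four-partial timeslice); the arithmetic is a placeholder, not real AES-GCM, and nothing here is taken from the patented hardware.

    from dataclasses import dataclass

    BLOCK = 16       # AES block size in bytes
    TIMESLICE = 4    # partial transfers executed per scheduling turn (assumed)

    @dataclass
    class GcmContext:
        block_counter: int = 1     # counter value carried across partial transfers
        auth_accumulator: int = 0  # stand-in for the running GHASH state
        bytes_done: int = 0

    @dataclass
    class Transfer:
        app_id: str
        data: bytes
        offset: int = 0            # resume point within the overall transfer

    class CopyEngine:
        def __init__(self):
            self.secure_memory = {}    # per-application saved contexts

        def _process_partial(self, ctx, chunk):
            # Placeholder for encrypt-and-authenticate of one partial transfer.
            for i in range(0, len(chunk), BLOCK):
                ctx.auth_accumulator = (ctx.auth_accumulator * 31 +
                                        sum(chunk[i:i + BLOCK])) % ((1 << 61) - 1)
                ctx.block_counter += 1
            ctx.bytes_done += len(chunk)

        def run_timeslice(self, transfer, partial_size=64):
            # Restore the application's saved state, or start a fresh context.
            ctx = self.secure_memory.pop(transfer.app_id, None) or GcmContext()
            for _ in range(TIMESLICE):
                if transfer.offset >= len(transfer.data):
                    return ctx     # transfer complete; caller finalizes the tag
                chunk = transfer.data[transfer.offset:transfer.offset + partial_size]
                self._process_partial(ctx, chunk)
                transfer.offset += len(chunk)
            # Timeslice expired: park the context so another application can run.
            self.secure_memory[transfer.app_id] = ctx
            return None

    ce = CopyEngine()
    pending = [Transfer("app_a", bytes(range(256)) * 4), Transfer("app_b", bytes(1024))]
    while pending:                 # interleave the two applications' transfers
        for transfer in list(pending):
            if ce.run_timeslice(transfer) is not None:
                pending.remove(transfer)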
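
Publications 20230297696, 20230297406, 20230103518, and 20230094125 all rely on a protected region of PPU memory: compute engines may not write outside it, and other devices may not touch it. The sketch below shows that kind of address-range policy with hypothetical names and a deliberately simplified two-rule firewall; it is not the actual hardware mechanism.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Region:
        base: int
        size: int

        def contains(self, addr: int, length: int = 1) -> bool:
            return self.base <= addr and addr + length <= self.base + self.size

        def overlaps(self, addr: int, length: int) -> bool:
            return addr < self.base + self.size and addr + length > self.base

    class MemoryFirewall:
        def __init__(self, protected: Region):
            self.protected = protected

        def compute_write_allowed(self, addr: int, length: int) -> bool:
            # Compute engines of the PPU instance may write only inside the
            # protected region.
            return self.protected.contains(addr, length)

        def external_access_allowed(self, addr: int, length: int) -> bool:
            # Other devices and device instances must not touch the protected
            # region at all.
            return not self.protected.overlaps(addr, length)

    fw = MemoryFirewall(Region(base=0x1000_0000, size=0x4000_0000))
    assert fw.compute_write_allowed(0x1000_2000, 256)          # inside: allowed
    assert not fw.compute_write_allowed(0x0000_1000, 256)      # outside: blocked
    assert not fw.external_access_allowed(0x1000_2000, 4096)   # peer device blocked
    assert fw.external_access_allowed(0x0000_1000, 4096)       # untracked range: allowed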
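
Publication 20230267235 describes a DMA engine that, on a cryptographic error, notifies both the CPU and a secure processor and then ignores further work until the secure processor clears it. The sketch below is a minimal state machine for that flow; the class and method names, and the use of a Python exception to stand in for a hardware error, are illustrative assumptions.

    from enum import Enum, auto

    class EngineState(Enum):
        READY = auto()
        FAULTED = auto()       # waiting for clearance from the secure processor

    class DmaEngine:
        def __init__(self, notify_cpu, notify_secure_processor):
            self.state = EngineState.READY
            self.notify_cpu = notify_cpu
            self.notify_secure_processor = notify_secure_processor

        def submit(self, descriptor):
            if self.state is EngineState.FAULTED:
                raise RuntimeError("engine faulted; awaiting secure-processor clearance")
            try:
                self._run(descriptor)
            except ValueError as err:       # stand-in for an encrypt/decrypt error
                self.state = EngineState.FAULTED
                self.notify_cpu(str(err))               # CPU may run untrusted software
                self.notify_secure_processor(str(err))  # only this party may clear it
                raise

        def clear_fault(self, from_secure_processor: bool):
            # Directions from the (potentially untrusted) CPU software are ignored;
            # only the secure processor returns the engine to service.
            if from_secure_processor:
                self.state = EngineState.READY

        def _run(self, descriptor):
            if descriptor.get("bad_tag"):
                raise ValueError("authentication tag mismatch")

    engine = DmaEngine(notify_cpu=print, notify_secure_processor=print)
    try:
        engine.submit({"bad_tag": True})
    except ValueError:
        pass
    engine.clear_fault(from_secure_processor=False)   # ignored
    engine.clear_fault(from_secure_processor=True)    # engine usable again
    engine.submit({"bad_tag": False})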
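
Patent 11720440 and publication 20230011863 describe tagging data that hit an uncorrectable memory error with a specific bit pattern so the error stays contained as the data is copied. The sketch below is a toy model of that containment: a single poison marker stands in for the bit pattern, and the faulting client simply stops publishing ordinary data, which simplifies the stall sequence described in the abstract.

    POISON = object()                 # stand-in for the specific bit pattern

    class Memory:
        def __init__(self, size):
            self.cells = [0] * size
            self.bad = set()          # addresses that return uncorrectable errors

        def load(self, addr):
            if addr in self.bad or self.cells[addr] is POISON:
                return POISON
            return self.cells[addr]

        def store(self, addr, value):
            self.cells[addr] = value

    class MemoryClient:
        def __init__(self, memory):
            self.memory = memory
            self.store_enabled = True     # cleared once this client sees an error

        def load(self, addr):
            value = self.memory.load(addr)
            if value is POISON:
                self.store_enabled = False   # begin stall; stop publishing data
            return value

        def copy(self, src, dst):
            value = self.load(src)
            if self.store_enabled:
                self.memory.store(dst, value)    # normal copy
            elif value is POISON:
                self.memory.store(dst, POISON)   # propagate only the poison marker

    mem = Memory(16)
    mem.bad.add(3)
    client = MemoryClient(mem)
    client.copy(3, 7)                     # error detected: destination is poisoned
    assert mem.load(7) is POISON
    assert client.store_enabled is False  # client no longer stores ordinary data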
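
Patent 11698869 combines partial authentication tags from several DMA engines using adjustment exponents. The sketch below demonstrates the underlying algebra for a GHASH-style polynomial MAC: each partial tag is scaled by the key raised to the number of blocks that follow its slice. A toy prime field replaces GF(2^128) so the arithmetic stays readable; this illustrates the combining identity, not the patented circuit.

    P = (1 << 61) - 1              # toy field modulus (GHASH really uses GF(2^128))
    H = 0x1234567890ABCDEF % P     # assumed hash key shared by all engines

    def partial_tag(blocks):
        # Horner-style polynomial MAC over one engine's slice:
        #   tag = b0*H^n + b1*H^(n-1) + ... + b(n-1)*H
        tag = 0
        for b in blocks:
            tag = (tag + b) * H % P
        return tag

    blocks = [7, 1, 2, 9, 4, 4, 8, 5, 3]              # whole transfer, as field elements
    slices = [blocks[0:3], blocks[3:5], blocks[5:9]]  # scheduled across 3 DMA engines

    # Each engine reports its partial tag; the adjustment exponent for a slice is
    # the number of blocks that come after it in the original transfer order.
    partials = [partial_tag(s) for s in slices]
    adjustments = [sum(len(t) for t in slices[i + 1:]) for i in range(len(slices))]

    combined = sum(t * pow(H, e, P) for t, e in zip(partials, adjustments)) % P
    assert combined == partial_tag(blocks)            # matches the single-engine tag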
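
Patent 9794349 operates a link at widths drawn from the remote port's advertised width capability. The sketch below shows one way such a negotiation could look; the capability encoding (a set of lane counts) and the utilization thresholds are assumptions for illustration only.

    class LinkPort:
        def __init__(self, supported_widths):
            self.supported = set(supported_widths)   # e.g. lane counts {20, 10, 5}
            self.allowed = None
            self.current = None

        def initialize(self, remote_width_capability):
            # Keep only widths that both the local and the remote port can run.
            self.allowed = sorted(self.supported & set(remote_width_capability))
            self.current = self.allowed[-1]          # start at the widest common width

        def adjust_for_utilization(self, utilization):
            # Narrow the link when traffic is light, widen it when traffic is heavy.
            idx = self.allowed.index(self.current)
            if utilization < 0.2 and idx > 0:
                self.current = self.allowed[idx - 1]
            elif utilization > 0.8 and idx < len(self.allowed) - 1:
                self.current = self.allowed[idx + 1]
            return self.current

    local = LinkPort({20, 10, 5})
    local.initialize(remote_width_capability={20, 10})   # remote cannot do 5 lanes
    print(local.current)                                  # 20
    print(local.adjust_for_utilization(0.05))             # drops to 10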
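
Patent 9424191 describes a directory whose tag array pairs an address with a cluster identifier and whose map array records sharing inside that cluster. The sketch below models that pairing with an assumed eight-cores-per-cluster layout and no capacity management; it is a data-structure illustration, not the patented directory.

    CORES_PER_CLUSTER = 8    # assumed cluster geometry

    class ClusterSharingDirectory:
        def __init__(self):
            # (address, cluster_id) -> bitmap of sharer cores inside that cluster
            self.maps = {}

        def record_access(self, address, core_id):
            cluster_id, local_core = divmod(core_id, CORES_PER_CLUSTER)
            key = (address, cluster_id)
            self.maps[key] = self.maps.get(key, 0) | (1 << local_core)

        def sharers(self, address):
            # Expand every (address, cluster) map back into global core ids.
            cores = []
            for (addr, cluster_id), bitmap in self.maps.items():
                if addr == address:
                    cores.extend(cluster_id * CORES_PER_CLUSTER + i
                                 for i in range(CORES_PER_CLUSTER) if bitmap >> i & 1)
            return cores

    d = ClusterSharingDirectory()
    d.record_access(0x80, core_id=9)    # cluster 1, local core 1
    d.record_access(0x80, core_id=12)   # cluster 1, local core 4
    print(d.sharers(0x80))              # [9, 12]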
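
Patent 9418011 describes a page tracker buffer (PTB) whose entries pair a cache-page address with a signature of the cache lines accessed within that page. The sketch below uses a 4 KB page, 64-byte lines, and an unbounded dictionary in place of the PTB handler's load/evict logic, all assumptions for illustration.

    PAGE_SIZE = 4096
    LINE_SIZE = 64
    LINES_PER_PAGE = PAGE_SIZE // LINE_SIZE   # 64 lines -> 64-bit signature

    class PageTrackerBuffer:
        def __init__(self):
            self.entries = {}     # page address -> access-signature bitmask

        def record_access(self, address):
            page = address & ~(PAGE_SIZE - 1)
            line = (address & (PAGE_SIZE - 1)) // LINE_SIZE
            # A PTB handler would load and evict entries; here the dict just grows.
            self.entries[page] = self.entries.get(page, 0) | (1 << line)

        def touched_lines(self, page):
            return bin(self.entries.get(page, 0)).count("1")

    ptb = PageTrackerBuffer()
    for offset in (0, 8, 64, 200, 4032):
        ptb.record_access(0x7F_0000 + offset)
    print(ptb.touched_lines(0x7F_0000))   # 4 distinct lines touched (0, 1, 3, 63)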
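
Patent 8990506 stores a weight in each cache line's tag, derived from the line's coherence state and recency, and evicts the line with the lowest weight. The sketch below picks arbitrary per-state weights and a simple aging rule to show the shape of such a policy; the numbers and the decrement-on-miss aging are not taken from the patent.

    STATE_WEIGHT = {"M": 3, "E": 2, "S": 1, "I": 0}   # assumed per-state base weights
    MAX_WEIGHT = 7

    class WeightedCacheSet:
        def __init__(self, ways):
            self.ways = ways
            self.lines = {}          # tag -> {"state": ..., "weight": ...}

        def access(self, tag, state="S"):
            if tag in self.lines:
                line = self.lines[tag]
                line["state"] = state
                line["weight"] = min(MAX_WEIGHT, STATE_WEIGHT[state] + 1)  # refresh on hit
                return "hit"
            if len(self.lines) >= self.ways:
                # Age every resident line, then evict the least valuable one.
                for line in self.lines.values():
                    line["weight"] = max(0, line["weight"] - 1)
                victim = min(self.lines, key=lambda t: self.lines[t]["weight"])
                del self.lines[victim]
            self.lines[tag] = {"state": state, "weight": STATE_WEIGHT[state] + 1}
            return "miss"

    s = WeightedCacheSet(ways=2)
    s.access(0xA, "M"); s.access(0xB, "S"); s.access(0xC, "S")  # 0xB is evicted first
    print(sorted(hex(t) for t in s.lines))    # ['0xa', '0xc']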