Patents by Inventor Naveen Cherukuri
Naveen Cherukuri has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11966480Abstract: Apparatuses, systems, and techniques for supporting fairness of multiple context sharing cryptographic hardware. An accelerator circuit includes a copy engine (CE) with AES-GCM hardware configured to perform both encryption and authentication of data transfers for multiple applications or multiple data streams in a single application or belonging to a single user. The CE splits a data transfer of a specified size into a set of partial transfers. The CE sequentially executes the set of partial transfers using a context for a period of time (e.g., a timeslice) for an application. The CE stores in a secure memory for the application one or more data for encryption or decryption (e.g., a hash key, a block counter, etc.) computed from a last partial transfer. The one or more data for encryption or decryption are retrieved and used when data transfers for the application is resumed by the CE.Type: GrantFiled: March 10, 2022Date of Patent: April 23, 2024Assignee: Nvidia CorporationInventors: Adam Hendrickson, Vaishali Kulkarni, Gobikrishna Dhanuskodi, Naveen Cherukuri, Wish Gandhi, Raymond Wong
-
Publication number: 20230297696Abstract: In examples, a parallel processing unit (PPU) operates within a trusted execution environment (TEE) implemented using a central processing unit (CPU). A virtual machine (VM) executing within the TEE is provided access to the PPU by a hypervisor. However, data of an application executed by the VM is inaccessible to the hypervisor and other untrusted entities outside of the TEE. To protect the data in transit, the VM and the PPU may encrypt or decrypt the data for secure communication between the devices. To protect the data within the PPU, a protected memory region may be created in PPU memory where compute engines of the PPU are prevented from writing outside of the protected memory region. A write protect memory region is generated where access to the PPU memory is blocked from other computing devices and/or device instances.Type: ApplicationFiled: March 17, 2023Publication date: September 21, 2023Inventors: Philip Rogers, Mark Overby, Vyas Venkataraman, Naveen Cherukuri, James Leroy Deming, Gobikrishna Dhanuskodi, Dwayne Swoboda, Lucien Dunning, Aruna Manjunatha, Aaron Jiricek, Mark Hairgrove, Michael Woodmansee
-
Publication number: 20230297406Abstract: In examples, trusted execution environments (TEE) are provided for an instance of a parallel processing unit (PPU) as PPU TEEs. Different instances of a PPU correspond to different PPU TEEs, and provide accelerated confidential computing to a corresponding TEE. The processors of each PPU instance have separate and isolated paths through the memory system of the PPU which are assigned uniquely to an individual PPU instance. Data in device memory of the PPU may be isolated and access controlled amongst the PPU instances using one or more hardware firewalls. A GPU hypervisor assigns hardware resources to runtimes and performs access control and context switching for the runtimes. A PPU instance uses a cryptographic key to protect data for secure communication. Compute engines of the PPU instance are prevented from writing outside of a protected memory region. Access to a write protected region in PPU memory is blocked from other computing devices and/or device instances.Type: ApplicationFiled: March 17, 2023Publication date: September 21, 2023Inventors: Philip Rogers, Mark Overby, Vyas Venkataraman, Naveen Cherukuri, James Leroy Deming, Gobikrishna Dhanuskodi, Dwayne Swoboda, Lucien Dunning, Aruna Manjunatha, Aaron Jiricek, Mark Hairgrove, Mike Woodmansee
-
Publication number: 20230289453Abstract: Apparatuses, systems, and techniques for supporting fairness of multiple context sharing cryptographic hardware. An accelerator circuit includes a copy engine (CE) with AES-GCM hardware configured to perform both encryption and authentication of data transfers for multiple applications or multiple data streams in a single application or belonging to a single user. The CE splits a data transfer of a specified size into a set of partial transfers. The CE sequentially executes the set of partial transfers using a context for a period of time (e.g., a timeslice) for an application. The CE stores in a secure memory for the application one or more data for encryption or decryption (e.g., a hash key, a block counter, etc.) computed from a last partial transfer. The one or more data for encryption or decryption are retrieved and used when data transfers for the application is resumed by the CE.Type: ApplicationFiled: March 10, 2022Publication date: September 14, 2023Inventors: Adam Hendrickson, Vaishali Kulkarni, Gobikrishna Dhanuskodi, Naveen Cherukuri, Wish Gandhi, Raymond Wong
-
Publication number: 20230267235Abstract: Apparatuses, systems, and techniques for handling faults by a direct memory access (DMA) engine. When a DMA engine detects an error associated with an encryption or decryption operation, the DMA engine reports the error to a CPU, which may be executing an untrusted software directing a DMA operation, and the secure processor. The DMA engine waits for clearance from the secure processor before responding to further directions from the potentially untrusted software.Type: ApplicationFiled: February 22, 2022Publication date: August 24, 2023Inventors: Anuj Rao, Adam Hendrickson, Vaishali Kulkarni, Gobikrishna Dhanuskodi, Naveen Cherukuri
-
Patent number: 11720440Abstract: Various embodiments include a parallel processing computer system that detects memory errors as a memory client loads data from memory and disables the memory client from storing data to memory, thereby reducing the likelihood that the memory error propagates to other memory clients. The memory client initiates a stall sequence, while other memory clients continue to execute instructions and the memory continues to service memory load and store operations. When a memory error is detected, a specific bit pattern is stored in conjunction with the data associated with the memory error. When the data is copied from one memory to another memory, the specific bit pattern is also copied, in order to identify the data as having a memory error.Type: GrantFiled: July 12, 2021Date of Patent: August 8, 2023Assignee: NVIDIA CORPORATIONInventors: Naveen Cherukuri, Saurabh Hukerikar, Paul Racunas, Nirmal Raj Saxena, David Charles Patrick, Yiyang Feng, Abhijeet Ghadge, Steven James Heinrich, Adam Hendrickson, Gentaro Hirota, Praveen Joginipally, Vaishali Kulkarni, Peter C. Mills, Sandeep Navada, Manan Patel, Liang Yin
-
Patent number: 11698869Abstract: The subject application relates to computing an authentication tag for partial transfers scheduled across multiple direct memory access (DMA) engines. Apparatuses, systems, and techniques are described for computing an authentication tag for a data transfer when the data transfer is scheduled as partial transfers across a specified number of direct memory access (DMA) engines. An orchestration circuit stores partial authentication tags, computed by the DMA engines, and corresponding adjustment exponents during one or more rounds in which the partial transfers are scheduled and processed by the specified number of DMA engines. During a last round, a combined authentication tag can be computed based on the partial authentication tags and the corresponding adjustment exponents stored by the orchestration circuit during the rounds.Type: GrantFiled: March 10, 2022Date of Patent: July 11, 2023Assignee: NVIDIA CorporationInventors: Vaishali Kulkarni, Naveen Cherukuri, Raymond Wong, Adam Hendrickson, Gobikrishna Dhanuskodi, Wish Gandhi
-
Publication number: 20230103518Abstract: Apparatuses, systems, and techniques to generate a trusted execution environment including multiple accelerators. In at least one embodiment, a parallel processing unit (PPU), such as a graphics processing unit (GPU), operates in a secure execution mode including a protect memory region. Furthermore, in an embodiment, a cryptographic key is utilzed to protect data during transmission between the accelerators.Type: ApplicationFiled: September 24, 2021Publication date: April 6, 2023Inventors: Philip John Rogers, Mark Overby, Michael Asbury Woodmansee, Vyas Venkataraman, Naveen Cherukuri, Gobikrishna Dhanuskodi, Dwayne Frank Swoboda, Lucien Burton Dunning, Mark Hairgrove, Sudeshna Guha
-
Publication number: 20230094125Abstract: Apparatuses, systems, and techniques to generate a trusted execution environment including multiple accelerators. In at least one embodiment, a parallel processing unit (PPU), such as a graphics processing unit (GPU), operates in a secure execution mode including a protect memory region. Furthermore, in an embodiment, a cryptographic key is utilized to protect data during transmission between the accelerators.Type: ApplicationFiled: September 24, 2021Publication date: March 30, 2023Inventors: Philip John Rogers, Mark Overby, Michael Asbury Woodmansee, Vyas Venkataraman, Naveen Cherukuri, Gobikrishna Dhanuskodi, Dwayne Frank Swoboda, Lucien Burton Dunning, Mark Hairgrove, Sudeshna Guha
-
Publication number: 20230011863Abstract: Various embodiments include a parallel processing computer system that detects memory errors as a memory client loads data from memory and disables the memory client from storing data to memory, thereby reducing the likelihood that the memory error propagates to other memory clients. The memory client initiates a stall sequence, while other memory clients continue to execute instructions and the memory continues to service memory load and store operations. When a memory error is detected, a specific bit pattern is stored in conjunction with the data associated with the memory error. When the data is copied from one memory to another memory, the specific bit pattern is also copied, in order to identify the data as having a memory error.Type: ApplicationFiled: July 12, 2021Publication date: January 12, 2023Inventors: NAVEEN CHERUKURI, SAURABH HUKERIKAR, PAUL RACUNAS, NIRMAL RAJ SAXENA, DAVID CHARLES PATRICK, YIYANG FENG, ABHIJEET GHADGE, STEVEN JAMES HEINRICH, ADAM HENDRICKSON, GENTARO HIROTA, PRAVEEN JOGINIPALLY, VAISHALI KULKARNI, PETER C. MILLS, SANDEEP NAVADA, MANAN PATEL, LIANG YIN
-
Publication number: 20210294707Abstract: Apparatuses, systems, and techniques to detect memory errors and isolate or migrate partitions on a parallel processing unit using an application programming interface to facilitate parallel computing, such as CUDA. In at least one embodiment, interrupts are intercepted and processed on a graphics processing unit indicating a memory error for one or more partitions, and a policy is applied to isolate that memory error from other partitions.Type: ApplicationFiled: March 20, 2020Publication date: September 23, 2021Inventors: Jonathon Stuart Ramsay Evans, Naveen Cherukuri, Jerome Francis Duluk, JR., Shailendra Singh, Vaibhav Vyas, Wishwesh Gandhi, Arvind Gopalakrishnan, Manas Mandal
-
Patent number: 10712809Abstract: Methods and apparatus relating to link power savings with state retention are described. In one embodiment, one or more components of two agents coupled via a serial link are turned off during idle periods while retaining link state in each agent. Other embodiments are also disclosed.Type: GrantFiled: January 7, 2019Date of Patent: July 14, 2020Assignee: Intel CorporationInventors: Naveen Cherukuri, Jeffrey Wilcox, Venkatraman Iyer, Selim Bilgin, David S. Dunning, Robin Tim Frodsham, Theodore Z. Schoenborn, Sanjay Dabral
-
Publication number: 20190346909Abstract: Methods and apparatus relating to link power savings with state retention are described. In one embodiment, one or more components of two agents coupled via a serial link are turned off during idle periods while retaining link state in each agent. Other embodiments are also disclosed.Type: ApplicationFiled: January 7, 2019Publication date: November 14, 2019Applicant: Intel CorporationInventors: Naveen Cherukuri, Jeffrey Wilcox, Venkatraman Iyer, Selim Bilgin, David S. Dunning, Robin Tim Frodsham, Theodore Z. Schoenborn, Sanjay Dabral
-
Patent number: 10175744Abstract: Methods and apparatus relating to link power savings with state retention are described. In one embodiment, one or more components of two agents coupled via a serial link are turned off during idle periods while retaining link state in each agent. Other embodiments are also disclosed.Type: GrantFiled: March 7, 2017Date of Patent: January 8, 2019Assignee: Intel CorporationInventors: Naveen Cherukuri, Jeffrey Wilcox, Venkatraman Iyer, Selim Bilgin, David S. Dunning, Robin Tim Frodsham, Theodore Z. Schoenborn, Sanjay Dabral
-
Publication number: 20170336853Abstract: Methods and apparatus relating to link power savings with state retention are described. In one embodiment, one or more components of two agents coupled via a serial link are turned off during idle periods while retaining link state in each agent. Other embodiments are also disclosed.Type: ApplicationFiled: March 7, 2017Publication date: November 23, 2017Inventors: Naveen Cherukuri, Jeffrey WILCOX, Venkatraman Iyer, Selim BILGIN, David S. Dunning, Robin Tim FRODSHAM, Theodore Z. Schoenborn, Sanjay Dabral
-
Patent number: 9794349Abstract: Systems and methods of managing a link provide for receiving a remote width capability during a link initialization, the remote width capability corresponding to a remote port. A link between a local port and the remote port is operated at a plurality of link widths in accordance with the remote width capability.Type: GrantFiled: November 26, 2014Date of Patent: October 17, 2017Assignee: Intel CorporationInventors: Naveen Cherukuri, Aaron T. Spink, Phanindra Mannava, Tim Frodsham, Jeffrey R. Wilcox, Sanjay Dabral, David Dunning, Theodore Z. Schoenborn
-
Patent number: 9588575Abstract: Methods and apparatus relating to link power savings with state retention are described. In one embodiment, one or more components of two agents coupled via a serial link are turned off during idle periods while retaining link state in each agent. Other embodiments are also disclosed.Type: GrantFiled: July 1, 2014Date of Patent: March 7, 2017Assignee: Intel CorporationInventors: Naveen Cherukuri, Jeffrey Wilcox, Venkatraman Iyer, Selim Bilgin, David S. Dunning, Robin Tim Frodsham, Theodore Z. Schoenborn, Sanjay Dabral
-
Patent number: 9424191Abstract: An apparatus of an aspect includes a plurality of cores. The plurality of cores are logically grouped into a plurality of clusters. A cluster sharing map-based coherence directory is coupled with the plurality of cores and is to track sharing of data among the plurality of cores. The cluster sharing map-based coherence directory includes a tag array to store corresponding pairs of addresses and cluster identifiers. Each of the addresses is to identify data. Each of the cluster identifiers is to identify one of the clusters. The cluster sharing map-based coherence directory also includes a cluster sharing map array to store cluster sharing maps. Each of the cluster sharing maps corresponds to one of the pairs of addresses and cluster identifiers. Each of the cluster sharing maps is to indicate intra-cluster sharing of data identified by the corresponding address within a cluster identified by the corresponding cluster identifier.Type: GrantFiled: June 29, 2012Date of Patent: August 23, 2016Assignee: Intel CorporationInventors: Naveen Cherukuri, Mani Azimi
-
Patent number: 9418011Abstract: In one embodiment, the present invention includes a processor comprising a page tracker buffer (PTB), the PTB including a plurality of entries to store an address to a cache page and to store a signature to track an access to each cache line of the cache page, and a PTB handler, the PTB handler to load entries into the PTB and to update the signature. Other embodiments are also described and claimed.Type: GrantFiled: June 23, 2010Date of Patent: August 16, 2016Assignee: Intel CorporationInventors: Livio B. Soares, Naveen Cherukuri, Akhilesh Kumar, Mani Azimi
-
Patent number: 8990506Abstract: In one embodiment, the present invention includes a cache memory including cache lines that each have a tag field including a state portion to store a cache coherency state of data stored in the line and a weight portion to store a weight corresponding to a relative importance of the data. In various implementations, the weight can be based on the cache coherency state and a recency of usage of the data. Other embodiments are described and claimed.Type: GrantFiled: December 16, 2009Date of Patent: March 24, 2015Assignee: Intel CorporationInventors: Naveen Cherukuri, Dennis W. Brzezinski, Ioannis T. Schoinas, Anahita Shayesteh, Akhilesh Kumar, Mani Azimi