Patents by Inventor Saurabh Gayen

Saurabh Gayen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240126555
    Abstract: A method of an aspect includes receiving a request for a chained accelerator operation, and configuring a chain of accelerators to perform the chained accelerator operation. This may include configuring a first accelerator to access an input data from a source memory location in system memory, process the input data, and generate first intermediate data. This may also include configuring a second accelerator to receive the first intermediate data, without the first intermediate data having been sent to the system memory, process the first intermediate data, and generate additional data. Other apparatus, methods, systems, and machine-readable medium are disclosed.
    Type: Application
    Filed: October 17, 2022
    Publication date: April 18, 2024
    Inventors: Saurabh GAYEN, Christopher J. HUGHES, Utkarsh Y. KAKAIYA, Alexander F. HEINECKE
  • Publication number: 20240126613
    Abstract: A chip or other apparatus of an aspect includes a first accelerator and a second accelerator. The first accelerator has support for a chained accelerator operation. The first accelerator is to be controlled as part of the chained accelerator operation to access an input data from a source memory location in system memory, process the input data, and generate first intermediate data. The second accelerator also has support for the chained accelerator operation. The second accelerator is to be controlled as part of the chained accelerator operation to receive the first intermediate data, without the first intermediate data having been sent to the system memory, process the first intermediate data, and generate additional data. Other apparatus, methods, systems, and machine-readable medium are disclosed.
    Type: Application
    Filed: October 17, 2022
    Publication date: April 18, 2024
    Inventors: Saurabh GAYEN, Christopher J. HUGHES, Utkarsh Y. KAKAIYA, Alexander F. HEINECKE
  • Publication number: 20240127392
    Abstract: A chip or other apparatus of an aspect includes a first accelerator and a second accelerator. The first accelerator has support for a chained accelerator operation. The first accelerator is to be controlled as part of the chained accelerator operation to access an input data from a source memory location in system memory, process the input data, generate first intermediate data, and store the first intermediate data to a storage. The second accelerator also has support for the chained accelerator operation. The second accelerator is to be controlled as part of the chained accelerator operation to receive the first intermediate data from the storage, without the first intermediate data having been sent to the system memory, process the first intermediate data, and generate additional data. Other apparatus, methods, systems, and machine-readable medium are disclosed.
    Type: Application
    Filed: October 17, 2022
    Publication date: April 18, 2024
    Inventors: Christopher J. HUGHES, Saurabh GAYEN, Utkarsh Y. KAKAIYA, Alexander F. HEINECKE
  • Publication number: 20240054011
    Abstract: Methods and apparatus relating to data streaming accelerators are described. In an embodiment, a hardware accelerator such as a Data Streaming Accelerator (DSA) logic circuitry performs data movement and/or data transformation for data to be transferred between a processor (having one or more processor cores) and a storage device. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: August 12, 2023
    Publication date: February 15, 2024
    Applicant: Intel Corporation
    Inventors: Rajesh M. Sankaran, Philip R. Lantz, Narayan Ranganathan, Saurabh Gayen, Sanjay Kumar, Nikhil Rao, Dhananjay A. Joshi, Hai Ming Khor, Utkarsh Y. Kakaiya
  • Publication number: 20230289229
    Abstract: Methods and apparatus relating to confidential computing extensions for highly scalable accelerators are described. One or more embodiments provide extensions for scalable accelerator(s) to be able to directly assign accelerator work-queue(s) to Trusted Execution Environment (TEE) Virtual Machines (TVMs). Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: June 30, 2022
    Publication date: September 14, 2023
    Applicant: Intel Corporation
    Inventors: Utkarsh Y. Kakaiya, Saurabh Gayen, Kapil Sood, Naveen Lakkakula
  • Publication number: 20230251986
    Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.
    Type: Application
    Filed: April 6, 2023
    Publication date: August 10, 2023
    Applicant: Intel Corporation
    Inventors: Philip R. Lantz, Sanjay Kumar, Rajesh M. Sankaran, Saurabh Gayen
  • Publication number: 20230185603
    Abstract: Methods and apparatus relating to dynamic capability discovery and enforcement for accelerators and devices in multi-tenant systems are described. In an embodiment, a hardware accelerator device advertises one or more available operations and/or capabilities of the hardware accelerator device to one or more tenants. Logic circuitry controls access to the one or more available operations and/or capabilities of the one or more work queues on a per-tenant basis. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: December 14, 2021
    Publication date: June 15, 2023
    Applicant: Intel Corporation
    Inventors: Saurabh Gayen, Philip Lantz, Narayan Ranganathan, Dhananjay Joshi, Rajesh Sankaran, Utkarsh Kakaiya
  • Patent number: 11650947
    Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.
    Type: Grant
    Filed: August 24, 2021
    Date of Patent: May 16, 2023
    Assignee: Intel Corporation
    Inventors: Philip R. Lantz, Sanjay Kumar, Rajesh M. Sankaran, Saurabh Gayen
  • Publication number: 20230042934
    Abstract: Apparatus and method for high-performance page fault handling. For example, one embodiment of an apparatus comprises: one or more accelerator engines to process work descriptors submitted by clients to a plurality of work queues; fault processing hardware logic associated with the one or more accelerator engines, the fault processing hardware logic to implement a specified page fault handling mode for each work queue of the plurality of work queues, the page fault handling modes including a first page fault handling mode and a second page fault handling mode.
    Type: Application
    Filed: December 22, 2021
    Publication date: February 9, 2023
    Inventors: Utkarsh Y. KAKAIYA, Philip LANTZ, Sanjay KUMAR, Rajesh SANKARAN, Narayan RANGANATHAN, Saurabh GAYEN, Dhananjay JOSHI, Nikhil P. RAO
  • Publication number: 20230040226
    Abstract: Apparatus and method for managing pipeline depth of a data processing device. For example, one embodiment of an apparatus comprises: an interface to receive a plurality of work requests from a plurality of clients; and a plurality of engines to perform the plurality of work requests; wherein the work requests are to be dispatched to the plurality of engines from a plurality of work queues, the work queues to store a work descriptor per work request, each work descriptor to include information needed to perform a corresponding work request, wherein the plurality of work queues include a first work queue to store work descriptors associated with first latency characteristics and a second work queue to store work descriptors associated with second latency characteristics; engine configuration circuitry to configure a first engine to have a first pipeline depth based on the first latency characteristics and to configure a second engine to have a second pipeline depth based on the second latency characteristics.
    Type: Application
    Filed: December 22, 2021
    Publication date: February 9, 2023
    Inventors: Saurabh GAYEN, Dhananjay JOSHI, Philip LANTZ, Rajesh SANKARAN, Narayan RANGANATHAN
  • Publication number: 20230032236
    Abstract: Methods and apparatus relating to data streaming accelerators are described. In an embodiment, a hardware accelerator such as a Data Streaming Accelerator (DSA) logic circuitry provides high-performance data movement and/or data transformation for data to be transferred between a processor (having one or more processor cores) and a storage device. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: July 27, 2022
    Publication date: February 2, 2023
    Applicant: Intel Corporation
    Inventors: Rajesh M. Sankaran, Philip R. Lantz, Narayan Ranganathan, Saurabh Gayen, Sanjay Kumar, Nikhil Rao, Dhananjay A. Joshi, Hai Ming Khor, Utkarsh Y. Kakaiya
  • Publication number: 20230032586
    Abstract: Methods and apparatus relating to scalable access control checking for cross-address-space data movement are described. In an embodiment, a memory stores an InterDomain Permissions Table (IDPT) having a plurality of entries. At least one entry of the IDPT provides a relationship between a target address space identifier and a plurality of requester address space identifiers. A hardware accelerator device allows access to a target address space, corresponding to the target address space identifier, by one or more of requesters, corresponding to the plurality of requester address space identifiers, respectively, based at least in part on the relationship provided by the at least one entry of the IDPT. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: April 1, 2022
    Publication date: February 2, 2023
    Applicant: Intel Corporation
    Inventors: Narayan Ranganathan, Philip R. Lantz, Rajesh M. Sankaran, Sanjay Kumar, Saurabh Gayen, Nikhil Rao, Utkarsh Y. Kakaiya, Dhananjay A. Joshi, David Jiang, Ashok Raj
  • Publication number: 20220350499
    Abstract: As described herein, for a selected process identifier and virtual address, a page fault arising from multiple sources can be solved by a one-time operation. The selected process identifier can include a virtual function (VF) identifier or process address space identifier (PASID). In some examples, solving a page fault arising from multiple sources by a one-time operation comprises invoking a page fault handler to determine an address translation.
    Type: Application
    Filed: May 16, 2022
    Publication date: November 3, 2022
    Inventors: Shaopeng HE, Yadong LI, Anjali Singhai JAIN, Kenneth G. KEELS, Andrzej SAWULA, Kun TIAN, Ashok RAJ, Rupin H. VAKHARWALA, Rajesh M. SANKARAN, Saurabh GAYEN, Baolu LU, Yan ZHAO
  • Publication number: 20220261178
    Abstract: Examples described herein relate to a packet processing device that includes circuitry to receive an address translation for a virtual to physical address prior to receipt of a GPUDirect remote direct memory access (RDMA) operation, wherein the address translation is provided at initiation of a process executed by a host system and circuitry to apply the address translation for a received GPUDirect RDMA operation.
    Type: Application
    Filed: March 7, 2022
    Publication date: August 18, 2022
    Inventors: Shaopeng HE, Yadong LI, Anjali Singhai JAIN, Kun TIAN, Yan ZHAO, Yaozu DONG, Baolu LU, Rajesh M. SANKARAN, Eliel LOUZOUN, Rupin H. VAKHARWALA, David HARRIMAN, Saurabh GAYEN, Philip LANTZ, Israel BEN SHAHAR, Kenneth G. KEELS
  • Publication number: 20210382836
    Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.
    Type: Application
    Filed: August 24, 2021
    Publication date: December 9, 2021
    Applicant: Intel Corporation
    Inventors: Philip R. Lantz, Sanjay Kumar, Rajesh M. Sankaran, Saurabh Gayen
  • Publication number: 20210342182
    Abstract: In one embodiment, a data mover accelerator is to receive, from a first agent having a first address space and a first process address space identifier (PASID) to identify the first address space, a first job descriptor comprising a second PASID selector to specify a second PASID to identify a second address space. In response to the first job descriptor, the data mover accelerator is to securely access the first address space and the second address space. Other embodiments are described and claimed.
    Type: Application
    Filed: June 23, 2020
    Publication date: November 4, 2021
    Inventors: SANJAY K. KUMAR, PHILIP LANTZ, RAJESH SANKARAN, NARAYAN RANGANATHAN, SAURABH GAYEN, DAVID A. KOUFATY, UTKARSH Y. KAKAIYA
  • Patent number: 11106613
    Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.
    Type: Grant
    Filed: March 29, 2018
    Date of Patent: August 31, 2021
    Assignee: Intel Corporation
    Inventors: Philip R. Lantz, Sanjay Kumar, Rajesh M. Sankaran, Saurabh Gayen
  • Publication number: 20210149815
    Abstract: Techniques for offload device address translation fetching are disclosed. In the illustrative embodiment, a processor of a compute device sends a translation fetch descriptor to an offload device before sending a corresponding work descriptor to the offload device. The offload device can request translations for virtual memory address and cache the corresponding physical addresses for later use. While the offload device is fetching virtual address translations, the compute device can perform other tasks before sending the corresponding work descriptor, including operations that modify the contents of the memory addresses whose translation are being cached. Even if the offload device does not cache the translations, the fetching can warm up the cache in a translation lookaside buffer. Such an approach can reduce the latency overhead that the offload device may otherwise incur in sending memory address translation requests that would be required to execute the work descriptor.
    Type: Application
    Filed: December 21, 2020
    Publication date: May 20, 2021
    Applicant: Intel Corporation
    Inventors: Saurabh Gayen, Philip R. Lantz, Dhananjay A. Joshi, Rupin H. Vakharwala, Rajesh M. Sankaran, Narayan Ranganathan, Sanjay Kumar
  • Patent number: 10969992
    Abstract: Systems, methods, and devices can include a processing engine implemented at least partially in hardware, the processing engine to process memory transactions; a memory element to index physical address and virtual address translations; and a memory controller logic implemented at least partially in hardware, the memory controller logic to receive an index from the processing engine, the index corresponding to a physical address and a virtual address; identify a physical address based on the received index; and provide the physical address to the processing engine. The processing engine can use the physical address for memory transactions in response to a streaming workload job request.
    Type: Grant
    Filed: December 29, 2018
    Date of Patent: April 6, 2021
    Assignee: Intel Corporation
    Inventors: Saurabh Gayen, Dhananjay A. Joshi, Philip R. Lantz, Rajesh M. Sankaran
  • Publication number: 20190303324
    Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.
    Type: Application
    Filed: March 29, 2018
    Publication date: October 3, 2019
    Inventors: Philip R. Lantz, Sanjay Kumar, Rajesh M. Sankaran, Saurabh Gayen