Patents by Inventor Rajesh M Sankaran
Rajesh M Sankaran has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250123881Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.Type: ApplicationFiled: October 25, 2024Publication date: April 17, 2025Inventors: Rajesh M. SANKARAN, Gilbert NEIGER, Narayan RANGANATHAN, Stephen R. VAN DOREN, Joseph NUZMAN, Niall D. MCDONNELL, Michael A. O'HANLON, Lokpraveen B. MOSUR, Tracy Garrett DRYSDALE, Eriko NURVITADHI, Asit K. MISHRA, Ganesh VENKATESH, Deborah T. MARR, Nicholas P. CARTER, Jonathan D. PEARCE, Edward T. GROCHOWSKI, Richard J. GRECO, Robert VALENTINE, Jesus CORBAL, Thomas D. FLETCHER, Dennis R. BRADFORD, Dwight P. MANLEY, Mark J. CHARNEY, Jeffry J. COOK, Paul CAPRIOLI, Koichi YAMADA, Kent D. GLOSSOP, David B. SHEFFIELD
-
Publication number: 20250117264Abstract: Techniques for scalable virtualization of an Input/Output (I/O) device are described. An electronic device composes a virtual device comprising one or more assignable interface (AI) instances of a plurality of AI instances of a hosting function exposed by the I/O device. The electronic device emulates device resources of the I/O device via the virtual device. The electronic device intercepts a request from the guest pertaining to the virtual device, and determines whether the request from the guest is a fast-path operation to be passed directly to one of the one or more AI instances of the I/O device or a slow-path operation that is to be at least partially serviced via software executed by the electronic device. For a slow-path operation, the electronic device services the request at least partially via the software executed by the electronic device.Type: ApplicationFiled: November 1, 2024Publication date: April 10, 2025Applicant: Intel CorporationInventors: Utkarsh Y. KAKAIYA, Rajesh M. SANKARAN, Sanjay KUMAR, Kun TIAN, Philip LANTZ
-
Publication number: 20250103397Abstract: Techniques for quality of service (QoS) support for input/output devices and other agents are described. In embodiments, a processing device includes execution circuitry to execute a plurality of software threads; hardware to control monitoring or allocating, among the plurality of software threads, one or more shared resources; and configuration storage to enable the monitoring or allocating of the one or more shared resources among the plurality of software threads and one or more channels through which one or more devices are to be connected to the one or more shared resources.Type: ApplicationFiled: December 30, 2023Publication date: March 27, 2025Applicant: Intel CorporationInventors: Andrew J. Herdrich, Daniel Joe, Filip Schmole, Philip Abraham, Stephen R. Van Doren, Priya Autee, Rajesh M. Sankaran, Anthony Luck, Philip Lantz, Eric Wehage, Edwin Verplanke, James Coleman, Scott Oehrlein, David M. Lee, Lee Albion, David Harriman, Vinit Mathew Abraham, Yi-Feng Liu, Manjula Peddireddy, Robert G. Blankenship
-
Publication number: 20250053530Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.Type: ApplicationFiled: June 20, 2024Publication date: February 13, 2025Applicant: Intel CorporationInventors: Philip R. Lantz, Sanjay Kumar, Rajesh M. Sankaran, Saurabh Gayen
-
Patent number: 12197601Abstract: Examples described herein relate to offload circuitry comprising one or more compute engines that are configurable to perform a workload offloaded from a process executed by a processor based on a descriptor particular to the workload. In some examples, the offload circuitry is configurable to perform the workload, among multiple different workloads. In some examples, the multiple different workloads include one or more of: data transformation (DT) for data format conversion, Locality Sensitive Hashing (LSH) for neural network (NN), similarity search, sparse general matrix-matrix multiplication (SpGEMM) acceleration of hash based sparse matrix multiplication, data encode, data decode, or embedding lookup.Type: GrantFiled: December 22, 2021Date of Patent: January 14, 2025Assignee: Intel CorporationInventors: Ren Wang, Sameh Gobriel, Somnath Paul, Yipeng Wang, Priya Autee, Abhirupa Layek, Shaman Narayana, Edwin Verplanke, Mrittika Ganguli, Jr-Shian Tsai, Anton Sorokin, Suvadeep Banerjee, Abhijit Davare, Desmond Kirkpatrick, Rajesh M. Sankaran, Jaykant B. Timbadiya, Sriram Kabisthalam Muthukumar, Narayan Ranganathan, Nalini Murari, Brinda Ganesh, Nilesh Jain
-
Patent number: 12135981Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.Type: GrantFiled: June 9, 2023Date of Patent: November 5, 2024Assignee: Intel CorporationInventors: Rajesh M. Sankaran, Gilbert Neiger, Narayan Ranganathan, Stephen R. Van Doren, Joseph Nuzman, Niall D. McDonnell, Michael A. O'Hanlon, Lokpraveen B. Mosur, Tracy Garrett Drysdale, Eriko Nurvitadhi, Asit K. Mishra, Ganesh Venkatesh, Deborah T. Marr, Nicholas P. Carter, Jonathan D. Pearce, Edward T. Grochowski, Richard J. Greco, Robert Valentine, Jesus Corbal, Thomas D. Fletcher, Dennis R. Bradford, Dwight P. Manley, Mark J. Charney, Jeffrey J. Cook, Paul Caprioli, Koichi Yamada, Kent D. Glossop, David B. Sheffield
-
Publication number: 20240345865Abstract: Techniques for transferring virtual machines and resource management in a virtualized computing environment are described. In one embodiment, for example, an apparatus may include at least one memory, at least one processor, and logic for transferring a virtual machine (VM), at least a portion of the logic comprised in hardware coupled to the at least one memory and the at least one processor, the logic to generate a plurality of virtualized capability registers for a virtual device (VDEV) by virtualizing a plurality of device-specific capability registers of a physical device to be virtualized by the VM, the plurality of virtualized capability registers comprising a plurality of device-specific capabilities of the physical device, determine a version of the physical device to support via a virtual machine monitor (VMM), and expose a subset of the virtualized capability registers associated with the version to the VM. Other embodiments are described and claimed.Type: ApplicationFiled: April 23, 2024Publication date: October 17, 2024Applicant: Intel CorporationInventors: SANJAY KUMAR, PHILIP R. LANTZ, KUN TIAN, UTKARSH Y. KAKAIYA, RAJESH M. SANKARAN
-
Patent number: 12117910Abstract: Examples may include a method of instantiating a virtual machine, instantiating a virtual device to transmit data to and receive data from assigned resources of a shared physical device; and assigning the virtual device to the virtual machine, the virtual machine to transmit data to and receive data from the physical device via the virtual device.Type: GrantFiled: July 19, 2022Date of Patent: October 15, 2024Assignee: Intel CorporationInventors: Nrupal Jani, Manasi Deval, Anjali Singhai Jain, Parthasarathy Sarangam, Mitu Aggarwal, Neerav Parikh, Alexander H. Duyck, Kiran Patil, Rajesh M. Sankaran, Sanjay K. Kumar, Utkarsh Y. Kakaiya, Philip Lantz, Kun Tian
-
Publication number: 20240338319Abstract: Embodiments of apparatuses, methods, and systems for unified address translation for virtualization of input/output devices are described. In an embodiment, an apparatus includes first circuitry to use at least an identifier of a device to locate a context entry and second circuitry to use at least a process address space identifier (PASID) to locate a PASID-entry. The context entry is to include at least one of a page-table pointer to a page-table translation structure and a PASID. The PASID-entry is to include at least one of a first-level page-table pointer to a first-level translation structure and a second-level page-table pointer to a second-level translation structure. The PASID is to be supplied by the device. At least one of the apparatus, the context entry, and the PASID entry is to include one or more control fields to indicate whether the first-level page-table pointer or the second-level page-table pointer is to be used.Type: ApplicationFiled: June 17, 2024Publication date: October 10, 2024Applicant: Intel CorporationInventors: Utkarsh Y. Kakaiya, Sanjay Kumar, Rajesh M. Sankaran, Philip R. Lantz, Ashok Raj, Kun Tian
-
Publication number: 20240314072Abstract: A network interface controller can be programmed to direct write received data to a memory buffer via either a host-to-device fabric or an accelerator fabric. For packets received that are to be written to a memory buffer associated with an accelerator device, the network interface controller can determine an address translation of a destination memory address of the received packet and determine whether to use a secondary head. If a translated address is available and a secondary head is to be used, a direct memory access (DMA) engine is used to copy a portion of the received packet via the accelerator fabric to a destination memory buffer associated with the address translation. Accordingly, copying a portion of the received packet through the host-to-device fabric and to a destination memory can be avoided and utilization of the host-to-device fabric can be reduced for accelerator bound traffic.Type: ApplicationFiled: January 19, 2024Publication date: September 19, 2024Inventors: Pratik M. MAROLIA, Rajesh M. SANKARAN, Ashok RAJ, Nrupal JANI, Parthasarathy SARANGAM, Robert O. SHARP
-
Publication number: 20240289160Abstract: A processor of an aspect includes a decode unit to decode an aperture access instruction, and an execution unit coupled with the decode unit. The execution unit, in response to the aperture access instruction, is to read a host physical memory address, which is to be associated with an aperture that is to be in system memory, from an access protected structure, and access data within the aperture at a host physical memory address that is not to be obtained through address translation. Other processors are also disclosed, as are methods, systems, and machine-readable medium storing aperture access instructions.Type: ApplicationFiled: May 8, 2024Publication date: August 29, 2024Applicant: Intel CorporationInventors: Barry E. Huntley, Jr-Shian Tsai, Gilbert Neiger, Rajesh M. Sankaran, Mesut A. Ergin, Ravi L. Sahita, Andrew J. Herdrich, Wei Wang
-
Patent number: 12045185Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.Type: GrantFiled: April 6, 2023Date of Patent: July 23, 2024Assignee: Intel CorporationInventors: Philip R. Lantz, Sanjay Kumar, Rajesh M. Sankaran, Saurabh Gayen
-
Patent number: 12020031Abstract: A processor of an aspect includes a decode unit to decode a user-level suspend thread instruction that is to indicate a first alternate state. The processor also includes an execution unit coupled with the decode unit. The execution unit is to perform the instruction at a user privilege level. The execution unit in response to the instruction, is to: (a) suspend execution of a user-level thread, from which the instruction is to have been received; (b) transition a logical processor, on which the user-level thread was to have been running, to the indicated first alternate state; and (c) resume the execution of the user-level thread, when the logical processor is in the indicated first alternate state, with a latency that is to be less than half a latency that execution of a thread can be resumed when the logical processor is in a halt processor power state.Type: GrantFiled: May 31, 2021Date of Patent: June 25, 2024Assignee: Intel CorporationInventors: Michael Mishaeli, Jason W. Brandt, Gilbert Neiger, Asit K. Mallick, Rajesh M. Sankaran, Raghunandan Makaram, Benjamin C. Chaffin, James B. Crossland, H. Peter Anvin
-
Patent number: 12013790Abstract: Embodiments of apparatuses, methods, and systems for unified address translation for virtualization of input/output devices are described. In an embodiment, an apparatus includes first circuitry to use at least an identifier of a device to locate a context entry and second circuitry to use at least a process address space identifier (PASID) to locate a PASID-entry. The context entry is to include at least one of a page-table pointer to a page-table translation structure and a PASID. The PASID-entry is to include at least one of a first-level page-table pointer to a first-level translation structure and a second-level page-table pointer to a second-level translation structure. The PASID is to be supplied by the device. At least one of the apparatus, the context entry, and the PASID entry is to include one or more control fields to indicate whether the first-level page-table pointer or the second-level page-table pointer is to be used.Type: GrantFiled: May 22, 2023Date of Patent: June 18, 2024Assignee: Intel CorporationInventors: Utkarsh Y. Kakaiya, Sanjay Kumar, Rajesh M. Sankaran, Philip R. Lantz, Ashok Raj, Kun Tian
-
Patent number: 11995462Abstract: Techniques for transferring virtual machines and resource management in a virtualized computing environment are described. In one embodiment, for example, an apparatus may include at least one memory, at least one processor, and logic for transferring a virtual machine (VM), at least a portion of the logic comprised in hardware coupled to the at least one memory and the at least one processor, the logic to generate a plurality of virtualized capability registers for a virtual device (VDEV) by virtualizing a plurality of device-specific capability registers of a physical device to be virtualized by the VM, the plurality of virtualized capability registers comprising a plurality of device-specific capabilities of the physical device, determine a version of the physical device to support via a virtual machine monitor (VMM), and expose a subset of the virtualized capability registers associated with the version to the VM. Other embodiments are described and claimed.Type: GrantFiled: January 11, 2023Date of Patent: May 28, 2024Assignee: Intel CorporationInventors: Sanjay Kumar, Philip R. Lantz, Kun Tian, Utkarsh Y. Kakaiya, Rajesh M. Sankaran
-
Patent number: 11929927Abstract: A network interface controller can be programmed to direct write received data to a memory buffer via either a host-to-device fabric or an accelerator fabric. For packets received that are to be written to a memory buffer associated with an accelerator device, the network interface controller can determine an address translation of a destination memory address of the received packet and determine whether to use a secondary head. If a translated address is available and a secondary head is to be used, a direct memory access (DMA) engine is used to copy a portion of the received packet via the accelerator fabric to a destination memory buffer associated with the address translation. Accordingly, copying a portion of the received packet through the host-to-device fabric and to a destination memory can be avoided and utilization of the host-to-device fabric can be reduced for accelerator bound traffic.Type: GrantFiled: December 21, 2020Date of Patent: March 12, 2024Assignee: Intel CorporationInventors: Pratik M. Marolia, Rajesh M. Sankaran, Ashok Raj, Nrupal Jani, Parthasarathy Sarangam, Robert O. Sharp
-
Publication number: 20240054011Abstract: Methods and apparatus relating to data streaming accelerators are described. In an embodiment, a hardware accelerator such as a Data Streaming Accelerator (DSA) logic circuitry performs data movement and/or data transformation for data to be transferred between a processor (having one or more processor cores) and a storage device. Other embodiments are also disclosed and claimed.Type: ApplicationFiled: August 12, 2023Publication date: February 15, 2024Applicant: Intel CorporationInventors: Rajesh M. Sankaran, Philip R. Lantz, Narayan Ranganathan, Saurabh Gayen, Sanjay Kumar, Nikhil Rao, Dhananjay A. Joshi, Hai Ming Khor, Utkarsh Y. Kakaiya
-
Publication number: 20230418655Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.Type: ApplicationFiled: June 9, 2023Publication date: December 28, 2023Inventors: Rajesh M. SANKARAN, Gilbert NEIGER, Narayan RANGANATHAN, Stephen R. VAN DOREN, Joseph NUZMAN, Niall D. MCDONNELL, Michael A. O'HANLON, Lokpraveen B. MOSUR, Tracy Garrett DRYSDALE, Eriko NURVITADHI, Asit K. MISHRA, Ganesh VENKATESH, Deborah T. MARR, Nicholas P. CARTER, Jonathan D. PEARCE, Edward T. GROCHOWSKI, Richard J. GRECO, Robert VALENTINE, Jesus CORBAL, Thomas D. FLETCHER, Dennis R. BRADFORD, Dwight P. MANLEY, Mark J. CHARNEY, Jeffrey J. COOK, Paul CAPRIOLI, Koichi YAMADA, Kent D. GLOSSOP, David B. SHEFFIELD
-
Publication number: 20230418762Abstract: Embodiments of apparatuses, methods, and systems for unified address translation for virtualization of input/output devices are described. In an embodiment, an apparatus includes first circuitry to use at least an identifier of a device to locate a context entry and second circuitry to use at least a process address space identifier (PASID) to locate a PASID-entry. The context entry is to include at least one of a page-table pointer to a page-table translation structure and a PASID. The PASID-entry is to include at least one of a first-level page-table pointer to a first-level translation structure and a second-level page-table pointer to a second-level translation structure. The PASID is to be supplied by the device. At least one of the apparatus, the context entry, and the PASID entry is to include one or more control fields to indicate whether the first-level page-table pointer or the second-level page-table pointer is to be used.Type: ApplicationFiled: May 22, 2023Publication date: December 28, 2023Applicant: Intel CorporationInventors: Utkarsh Y. Kakaiya, Sanjay Kumar, Rajesh M. Sankaran, Philip R. Lantz, Ashok Raj, Kun Tian
-
Patent number: 11797464Abstract: Systems and methods for delivering interrupts to user-level applications. An example processing system comprises: a memory configured to store a plurality of user-level APIC data structures and a plurality of user-level interrupt handler address data structures corresponding to a plurality of user-level applications being executed by the processing system; and a processing core configured, responsive to receiving a notification of a user-level interrupt, to: set a pending interrupt bit flag having a position defined by an identifier of the user-level interrupt in a user-level APIC data structure associated with a user-level application that is currently being executed by the processing core, and invoke a user-level interrupt handler identified by a user-level interrupt handler address data structure associated with the user-level application, for a pending user-level interrupt having a highest priority among one or more pending user-level interrupts identified by the user-level APIC data structure.Type: GrantFiled: August 31, 2021Date of Patent: October 24, 2023Assignee: Intel CorporationInventors: Gilbert Neiger, Rajesh M. Sankaran