Patents by Inventor Vasanth Ranganathan

Vasanth Ranganathan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12657128
    Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.
    Type: Grant
    Filed: December 20, 2023
    Date of Patent: June 16, 2026
    Assignee: Intel Corporation
    Inventors: Vikranth Vemulapalli, Lakshminarayanan Striramassarma, Mike MacPherson, Aravindh Anantaraman, Ben Ashbaugh, Murali Ramadoss, William B. Sadler, Jonathan Pearce, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Jr., Prasoonkumar Surti, Nicolas Galoppo von Borries, Joydeep Ray, Abhishek R. Appu, ElMoustapha Ould-Ahmed-Vall, Altug Koker, Sungye Kim, Subramaniam Maiyuran, Valentin Andrei
  • Patent number: 12625814
    Abstract: One embodiment provides a graphics processor including a processing resource including a register file, memory, a cache, and load/store/cache circuitry to process load, store, and prefetch messages from the processing resource. The circuitry will sort received memory access messages into address sorted lists of reads and writes. The circuitry schedules a first set of address sorted requests from a first request buffer for a first period of time, then schedules a second set of address sorted requests from a second request buffer for a second period of time.
    Type: Grant
    Filed: September 24, 2021
    Date of Patent: May 12, 2026
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Abhishek R. Appu, Altug Koker, Aditya Navale, Varghese George, Vasanth Ranganathan, Fangwen Fu, Ben J. Ashbaugh, Vidhya Krishnan, Sabareesh Ganapathy, Prathamesh Raghunath Shinde
  • Publication number: 20260127702
    Abstract: A mechanism is described for detecting, at training time, information related to one or more tasks to be performed by the one or more processors according to a training dataset for a neural network, analyzing the information to determine one or more portions of hardware of a processor of the one or more processors that is configurable to support the one or more tasks, configuring the hardware to pre-select the one or more portions to perform the one or more tasks, while other portions of the hardware remain available for other tasks, and monitoring utilization of the hardware via a hardware unit of the graphics processor and, via a scheduler of the graphics processor, adjusting allocation of the one or more tasks to the one or more portions of the hardware based on the utilization.
    Type: Application
    Filed: October 6, 2025
    Publication date: May 7, 2026
    Applicant: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, John C. Weast, Mike B. Macpherson, Linda L. Hurd, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Kamal Sinha, Joydeep Ray, Balaji Vembu, Sanjeev Jahagirdar, Vasanth Ranganathan, Dukhwan Kim
  • Patent number: 12613739
    Abstract: Described herein is a partitionable graphics processor having multiple hard partitions with separate software execution and fault domains. One embodiment provides a graphics processor comprising a system interface and a plurality of graphics processing resources coupled with the system interface. The plurality of graphics processing resources is configurable to be partitioned into a plurality of isolated device partitions, each isolated device partition configured for fault isolation and independent concurrent execution of workloads associated with a plurality of clients, and the system interface is configured to present each of the plurality of isolated device partitions as a virtual function.
    Type: Grant
    Filed: May 27, 2022
    Date of Patent: April 28, 2026
    Assignee: Intel Corporation
    Inventors: David Cowperthwaite, Kenneth Daxer, Aditya Navale, Prasoonkumar Surti, Arthur Hunter, Hema Chand Nalluri, Jeffery S. Boles, Vasanth Ranganathan, Joydeep Ray, David Puffer, Aravindh Anantaraman, Ankur Shah, Vidhya Krishnan, Kritika Bala, Michael Apodaca
  • Patent number: 12585590
    Abstract: Embodiments described herein provide a technique to facilitate the broadcast or multicast of asynchronous loads to shared local memory of a plurality of graphics cores within a graphics core cluster. One embodiment provides a graphics processor including a cache memory and a graphics core cluster coupled with the cache memory. The graphics core cluster includes a plurality of graphics cores. The plurality of graphics cores includes a graphics core configured to receive a designation as a producer graphics core for a multicast load, read data from the cache memory, and transmit the data read from the cache memory to a consumer graphics core of the plurality of graphics cores.
    Type: Grant
    Filed: October 25, 2022
    Date of Patent: March 24, 2026
    Assignee: Intel Corporation
    Inventors: John A. Wiegert, Joydeep Ray, Vasanth Ranganathan, Biju George, Fangwen Fu, Abhishek R. Appu, Chunhui Mei, Changwon Rhee
  • Patent number: 12579072
    Abstract: One embodiment provides circuitry coupled with cache memory and a memory interface, the circuitry to compress compute data at multiple cache line granularity, and a processing resource coupled with the memory interface and the cache memory. The processing resource is configured to perform a general-purpose compute operation on compute data associated with multiple cache lines of the cache memory. The circuitry is configured to compress the compute data before a write of the compute data via the memory interface to the memory bus and, in association with a read of the compute data associated with the multiple cache lines via the memory interface, decompress the compute data and provide the decompressed compute data to the processing resource.
    Type: Grant
    Filed: January 5, 2024
    Date of Patent: March 17, 2026
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, David Puffer, Prasoonkumar Surti, Lakshminarayanan Striramassarma, Vasanth Ranganathan, Kiran C. Veernapu, Balaji Vembu, Pattabhiraman K
  • Patent number: 12572997
    Abstract: Embodiments described herein are generally directed to improvements relating to power, latency, bandwidth and/or performance issues relating to GPU processing/caching. According to one embodiment, a state of multiple intellectual property (IP) cores that have access to a common cache via a central fabric is observed. Responsive to the observed state being indicative of performance of a standalone workload by a first IP core of the multiple IP cores, the common cache is treated as a local cache of the first IP core by powering off the central fabric and causing the first IP core to access the common cache via a low power access path between the first IP core and the common cache that is outside of the central fabric.
    Type: Grant
    Filed: April 24, 2023
    Date of Patent: March 10, 2026
    Assignee: Intel Corporation
    Inventors: Subramaniam Maiyuran, Durgaprasad Bilagi, Joydeep Ray, Scott Janus, Sanjeev Jahagirdar, Brent Insko, Lidong Xu, Abhishek R Appu, James Holland, Vasanth Ranganathan, Nikos Kaburlasos, Altug Koker, Xinmin Tian, Guei-Yuan Lueh, Changliang Wang
  • Patent number: 12572392
    Abstract: Described herein is a partitionable graphics processor having a plurality of flexibly partitioned processing resources. One embodiment provides a graphics processor comprising a plurality of processing resources configurable to be flexibly partitioned into a plurality of resource partitions and circuitry to compose multiple graphics processor device partitions from the plurality of resource partitions. The multiple graphics processor device partitions are configurable to be asymmetrically composed of different types of functional units.
    Type: Grant
    Filed: May 27, 2022
    Date of Patent: March 10, 2026
    Assignee: Intel Corporation
    Inventors: David Cowperthwaite, Kenneth Daxer, Jeffery S. Boles, Hema Chand Nalluri, Aditya Navale, Prasoonkumar Surti, Arthur Hunter, Vasanth Ranganathan, Joydeep Ray, David Puffer, Aravindh Anantaraman, Ankur Shah, Vidhya Krishnan, Kritika Bala
  • Patent number: 12561276
    Abstract: Systems and methods for updating remote memory side caches in a multi-GPU configuration are disclosed herein. In one embodiment, a graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) having a first memory, a first memory side cache memory, a first communication fabric, and a first memory management unit (MMU). The graphics processor includes a second graphics processing unit (GPU) having a second memory, a second memory side cache memory, a second memory management unit (MMU), and a second communication fabric that is communicatively coupled to the first communication fabric. The first MMU is configured to control memory requests for the first memory, to update content in the first memory, to update content in the first memory side cache memory, and to determine whether to update the content in the second memory side cache memory.
    Type: Grant
    Filed: March 14, 2020
    Date of Patent: February 24, 2026
    Assignee: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Aravindh Anantaraman, Valentin Andrei, Abhishek Appu, Sean Coleman, Nicolas Galoppo Von Borries, Varghese George, Pattabhiraman K, SungYe Kim, Mike Macpherson, Subramaniam Maiyuran, Elmoustapha Ould-Ahmed-Vall, Vasanth Ranganathan, James Valerio
  • Patent number: 12554674
    Abstract: Methods and apparatus relating to techniques for multi-tile memory management. In an example, a graphics processor includes an interposer, a first chiplet coupled with the interposer, the first chiplet including a graphics processing resource and an interconnect network coupled with the graphics processing resource, cache circuitry coupled with the graphics processing resource via the interconnect network, and a second chiplet coupled with the first chiplet via the interposer, the second chiplet including a memory-side cache and a memory controller coupled with the memory-side cache. The memory controller is configured to enable access to a high-bandwidth memory (HBM) device, the memory-side cache is configured to cache data associated with a memory access performed via the memory controller, and the cache circuitry is logically positioned between the graphics processing resource and a chiplet interface.
    Type: Grant
    Filed: October 15, 2024
    Date of Patent: February 17, 2026
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Aravindh Anantaraman, Elmoustapha Ould-Ahmed-Vall, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Mike Macpherson, Subramaniam Maiyuran, Joydeep Ray, Lakshminarayanan Striramassarma, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Prasoonkumar Surti, David Puffer, James Valerio, Ankur N. Shah
  • Publication number: 20260037477
    Abstract: Post-synchronization operations in multi-tile processor computing are described. An example of an apparatus includes a memory to store data for processing, including data for an application; and one or more processors including a graphical processing unit (GPU), the GPU including multiple compute engine tiles including multiple processing resources, and a dispatcher for dispatching kernels for processing by the compute engine tiles, wherein each of the compute engine tiles is to write a signal to a location in the memory upon the compute engine tile completing processing of a partition of a first kernel, wherein the location is a same location for each of the plurality of compute engine tiles.
    Type: Application
    Filed: August 5, 2024
    Publication date: February 5, 2026
    Applicant: Intel Corporation
    Inventors: Michal Mrozek, Vasanth Ranganathan, Pierre Boudier, Jeffery S. Boles, Aditya Navale, Hema Chand Nalluri
  • Patent number: 12541908
    Abstract: Apparatus and method for stack throttling.
    Type: Grant
    Filed: February 26, 2024
    Date of Patent: February 3, 2026
    Assignee: Intel Corporation
    Inventors: Karthik Vaidyanathan, Abhishek Appu, Vasanth Ranganathan, Joydeep Ray, Prasoonkumar Surti
  • Patent number: 12504989
    Abstract: Bank aware thread scheduling and early dependency clearing techniques are described herein. In one example, bank aware thread scheduling involves arbitrating and scheduling threads based on the cache bank that is to be accessed by the instructions to avoid bank conflicts. Early dependency clearing involves clearing dependencies for cache loads in a scoreboard before the data is loaded. In early dependency clearing for loads, delays in operation can be reduced by clearing dependencies before data is required from the cache.
    Type: Grant
    Filed: March 21, 2022
    Date of Patent: December 23, 2025
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Joydeep Ray, Karthik Vaidyanathan, Sreedhar Chalasani, Vasanth Ranganathan
  • Patent number: 12499503
    Abstract: Described herein is a partitionable graphics processor having multiple render front ends. The partitions of the graphics processor maintain render functionality when partitioned and enable fault isolation and independent multi-client rendering.
    Type: Grant
    Filed: May 27, 2022
    Date of Patent: December 16, 2025
    Assignee: Intel Corporation
    Inventors: Hema Chand Nalluri, Jeffery S. Boles, David Cowperthwaite, Aditya Navale, Prasoonkumar Surti, Arthur Hunter, Vasanth Ranganathan, Joydeep Ray, David Puffer, Ankur Shah, Vidhya Krishnan, Kritika Bala, Aravindh Anantaraman, Michael Apodaca, Kenneth Daxer
  • Publication number: 20250378045
    Abstract: Embodiments described herein provide techniques to enable the dynamic reconfiguration of memory on a general-purpose graphics processing unit. One embodiment described herein enables dynamic reconfiguration of cache memory bank assignments based on hardware statistics. One embodiment enables for virtual memory address translation using mixed four kilobyte and sixty-four kilobyte pages within the same page table hierarchy and under the same page directory. One embodiment provides for a graphics processor and associated heterogenous processing system having near and far regions of the same level of a cache hierarchy.
    Type: Application
    Filed: May 22, 2025
    Publication date: December 11, 2025
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Niranjan Cooray, Subramaniam Maiyuran, Altug Koker, Prasoonkumar Surti, Varghese George, Valentin Andrei, Abhishek Appu, Guadalupe Garcia, Pattabhiraman K, Sungye Kim, Sanjay Kumar, Pratik Marolia, Elmoustapha Ould-Ahmed-Vall, Vasanth Ranganathan, William Sadler, Lakshminarayanan Striramassarma
  • Patent number: 12493922
    Abstract: Embodiments described herein are generally directed to improvements relating to power, latency, bandwidth and/or performance issues relating to GPU processing/caching. According to one embodiment, a system includes a producer intellectual property (IP) (e.g., a media IP), a compute core (e.g., a GPU or an AI-specific core of the GPU), and a streaming buffer logically interposed between the producer IP and the compute core. The producer IP is operable to consume data from memory and output results to the streaming buffer. The compute core is operable to perform AI inference processing based on data consumed from the streaming buffer and output AI inference processing results to the memory.
    Type: Grant
    Filed: October 19, 2023
    Date of Patent: December 9, 2025
    Assignee: Intel Corporation
    Inventors: Subramaniam Maiyuran, Durgaprasad Bilagi, Joydeep Ray, Scott Janus, Sanjeev Jahagirdar, Brent Insko, Lidong Xu, Abhishek R. Appu, James Holland, Vasanth Ranganathan, Nikos Kaburlasos, Altug Koker, Xinmin Tian, Guei-Yuan Lueh, Changliang Wang
  • Publication number: 20250363674
    Abstract: Embodiments described herein provide for an instruction and associated logic to enable a processing resource including a tensor accelerator to perform optimized computation of sparse submatrix operations. One embodiment provides a parallel processor comprising a processing cluster coupled with a cache memory. The processing cluster includes a plurality of multiprocessors coupled with a data interconnect, where a multiprocessor of the plurality of multiprocessors includes a tensor core configured to load tensor data and metadata associated with the tensor data from the cache memory, wherein the metadata indicates a first numerical transform applied to the tensor data, perform an inverse transform of the first numerical transform, perform a tensor operation on the tensor data after the inverse transform is performed, and write output of the tensor operation to a memory coupled with the processing cluster.
    Type: Application
    Filed: June 11, 2025
    Publication date: November 27, 2025
    Applicant: Intel Corporation
    Inventors: Abhishek R. Appu, Prasoonkumar Surti, Jill Boyce, Subramaniam Maiyuran, Michael Apodaca, Adam T. Lake, James Holland, Vasanth Ranganathan, Altug Koker, Lidong Xu, Nikos Kaburlasos
  • Publication number: 20250355670
    Abstract: Apparatus and method for thread dispatch throttle control. For example, a processor comprises: a plurality of graphics cores to execute instructions of a plurality of compute threads; and dispatch circuitry to dispatch each compute thread for execution on a graphics core of the plurality of graphics cores, the dispatch circuitry to track a number of compute threads of the plurality of compute threads dispatched to each graphics core of the plurality of graphics cores which have not completed; the dispatch circuitry to adjust a dispatch throttling threshold value based on stall metrics associated with each graphics core of the plurality of graphics cores, the stall metrics including a number of cycles for which the one or more compute threads of the plurality of compute threads are stalled within a clock window.
    Type: Application
    Filed: May 16, 2024
    Publication date: November 20, 2025
    Inventors: Deepak N K, Jain Philip, Vasanth Ranganathan
  • Publication number: 20250342384
    Abstract: Quantum-based dispatch of workgroups is described. An example of an apparatus includes a computer memory to store data for processing, including data for an application; and one or more processors including a graphical processing unit (GPU), the GPU including multiple chiplets, each of the multiple chiplets including compute containers and a cache, each compute container including a plurality of processing resources, and a dispatcher for dispatching workgroups to the processing resources of the GPU, wherein dispatching workgroups includes dispatching workgroups for the application according to a selected workgroup quantum, the selected workgroup quantum having a certain size and shape.
    Type: Application
    Filed: May 2, 2024
    Publication date: November 6, 2025
    Applicant: Intel Corporation
    Inventors: Milind Nemlekar, Changwon Rhee, Vasanth Ranganathan, Maxim Kazakov, Pierre Boudier, Wei Xiong, Moshe Maor, Deepak N K, Jain Philip, Abhishek Kumar Singh, Michal Mrozek
  • Patent number: 12462328
    Abstract: A mechanism is described for detecting, at training time, information related to one or more tasks to be performed by the one or more processors according to a training dataset for a neural network, analyzing the information to determine one or more portions of hardware of a processor of the one or more processors that is configurable to support the one or more tasks, configuring the hardware to pre-select the one or more portions to perform the one or more tasks, while other portions of the hardware remain available for other tasks, and monitoring utilization of the hardware via a hardware unit of the graphics processor and, via a scheduler of the graphics processor, adjusting allocation of the one or more tasks to the one or more portions of the hardware based on the utilization.
    Type: Grant
    Filed: July 13, 2023
    Date of Patent: November 4, 2025
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, John C. Weast, Mike B. Macpherson, Linda L. Hurd, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Kamal Sinha, Joydeep Ray, Balaji Vembu, Sanjeev Jahagirdar, Vasanth Ranganathan, Dukhwan Kim