Patents by Inventor Vasanth Ranganathan

Vasanth Ranganathan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220383444
    Abstract: Generation and storage of compressed z-planes in graphics processing is described. An example of a processor includes a rasterizer to generate a fragment of pixel data including blocks of pixel data; a depth pipeline to receive the fragment, the pipeline including a first and second depth test hardware, the first depth test hardware to perform a coarse depth test including determining minimum and maximum depths for each block; and a depth buffer, wherein the processor is to determine whether the fragment meets requirements that the fragment fully covers a tile of pixel data and passes a first depth test, and that each of the minimum and maximum depths of the fragment has a same sign and exponent, and, upon determining that the fragment meets the requirements, to generate a compressed depth plane utilizing the first depth test and update the depth buffer with the compressed depth plane.
    Type: Application
    Filed: May 27, 2021
    Publication date: December 1, 2022
    Applicant: Intel Corporation
    Inventors: Saikat Mandal, Karol Szerszen, Vasanth Ranganathan, Altug Koker, Michael Norris, Prasoonkumar Surti, Takahiro Murata
  • Patent number: 11508338
    Abstract: A mechanism is described for facilitating using of a shared local memory for register spilling/filling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes reserving one or more spaces of a shared local memory (SLM) to perform one or more of spilling and filling relating to registers associated with a graphics processor of a computing device.
    Type: Grant
    Filed: October 5, 2020
    Date of Patent: November 22, 2022
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Altug Koker, Balaji Vembu, Murali Ramadoss, Guei-Yuan Lueh, James A. Valerio, Prasoonkumar Surti, Abhishek R. Appu, Vasanth Ranganathan, Kalyan K. Bhiravabhatla, Arthur D. Hunter, Jr., Wei-Yu Chen, Subramaniam M. Maiyuran
  • Publication number: 20220366527
    Abstract: A mechanism is described for facilitating inference coordination and processing utilization for machine learning. A method of embodiments, as described herein, includes limiting execution of workloads for the respective contexts of a plurality of contexts to a specified subset of a plurality of processing resources of a processing system according to physical resource slices of the processing system that are associated with the respective contexts of the plurality of contexts.
    Type: Application
    Filed: July 22, 2022
    Publication date: November 17, 2022
    Applicant: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, John C. Weast, Mike B. Macpherson, Linda L. Hurd, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Kamal Sinha, Joydeep Ray, Balaji Vembu, Sanjeev Jahagirdar, Vasanth Ranganathan, DUKHWAN Kim
  • Publication number: 20220357945
    Abstract: One embodiment provides a graphics processor comprising a memory controller and a graphics processing resource coupled with the memory controller. The graphics processing resource includes circuitry configured to execute an instruction to perform a matrix operation on first input including weight data and second input including input activation data, generate intermediate data based on a result of the matrix operation, quantize the intermediate data to a floating-point format determined based on a statistical distribution of first output data, and output, as second output data, quantized intermediate data in a determined floating-point format.
    Type: Application
    Filed: June 7, 2022
    Publication date: November 10, 2022
    Applicant: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Publication number: 20220357742
    Abstract: A mechanism is described for facilitating barriers and synchronization for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting thread groups relating to machine learning associated with one or more processing devices. The method may further include facilitating barrier synchronization of the thread groups across multiple dies such that each thread in a thread group is scheduled across a set of compute elements associated with the multiple dies, where each die represents a processing device of the one or more processing devices, the processing device including a graphics processor.
    Type: Application
    Filed: May 23, 2022
    Publication date: November 10, 2022
    Applicant: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, Balaji Vembu, John C. Weast, Mike B. Macpherson, Dukhwan Kim, Linda L. Hurd, Sanjeev Jahagirdar, Vasanth Ranganathan
  • Patent number: 11494187
    Abstract: In an example, an apparatus comprises a plurality of execution units, and logic, at least partially including hardware logic, to assemble a general register file (GRF) message and hold the GRF message in storage in a data port until all data for the GRF message is received. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: October 19, 2020
    Date of Patent: November 8, 2022
    Assignee: INTEL CORPORATION
    Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, Ramkumar Ravikumar, Kiran C. Veernapu, Prasoonkumar Surti, Vasanth Ranganathan
  • Patent number: 11494868
    Abstract: An embodiment of a graphics apparatus may include a context engine to determine contextual information, a recommendation engine communicatively coupled to the context engine to determine a recommendation based on the contextual information, and a configuration engine communicatively coupled to the recommendation engine to adjust a configuration of a graphics operation based on the recommendation. Other embodiments are disclosed and claimed.
    Type: Grant
    Filed: February 19, 2021
    Date of Patent: November 8, 2022
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Ankur N. Shah, Abhishek R. Appu, Deepak S. Vembar, ElMoustapha Ould-Ahmed-Vall, Atsuo Kuwahara, Travis T. Schluessler, Linda L. Hurd, Josh B. Mastronarde, Vasanth Ranganathan
  • Publication number: 20220350751
    Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received. In one embodiment, the cache memory configured to be partitioned into multiple cache regions, wherein the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence.
    Type: Application
    Filed: July 12, 2022
    Publication date: November 3, 2022
    Applicant: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Elmoustapha Ould-Ahmed-Vall, Abhishek Appu, Aravindh Anantaraman, Valentin Andrei, Durgaprasad Bilagi, Varghese George, Brent Insko, Sanjeev Jahagirdar, Scott Janus, Pattabhiraman K, SungYe Kim, Subramaniam Maiyuran, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Xinmin Tian
  • Publication number: 20220350651
    Abstract: A mechanism is described for facilitating intelligent thread scheduling at autonomous machines. A method of embodiments, as described herein, includes detecting dependency information relating to a plurality of threads corresponding to a plurality of workloads associated with tasks relating to a processor including a graphics processor. The method may further include generating a tree of thread groups based on the dependency information, where each thread group includes multiple threads, and scheduling one or more of the thread groups associated a similar dependency to avoid dependency conflicts.
    Type: Application
    Filed: May 17, 2022
    Publication date: November 3, 2022
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Rajkishore Barik, Eriko Nurvitadhi, Nicolas Galoppo Von Borries, Tsung-Han Lin, Sanjeev Jahagirdar, Vasanth Ranganathan
  • Publication number: 20220327772
    Abstract: One embodiment provides for a graphics processing unit comprising a processing cluster to perform coarse pixel shading and output shaded coarse pixels for processing by a pixel processing pipeline and a render cache to store coarse pixel data for input to or output from a pixel processing pipeline.
    Type: Application
    Filed: April 18, 2022
    Publication date: October 13, 2022
    Applicant: Intel Corporation
    Inventors: Prasoonkumar Surti, Abhishek R. Appu, Subhajit Dasgupta, Srivallaba Mysore, Michael J. Norris, Vasanth Ranganathan, Joydeep Ray
  • Patent number: 11430082
    Abstract: A mechanism is described for facilitating inference coordination and processing utilization for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting, at training time, information relating to one or more tasks to be performed according to a training dataset relating to a processor including a graphics processor. The method may further include analyzing the information to determine one or more portions of hardware relating to the processor capable of supporting the one or more tasks, and configuring the hardware to pre-select the one or more portions to perform the one or more tasks, while other portions of the hardware remain available for other tasks.
    Type: Grant
    Filed: January 7, 2021
    Date of Patent: August 30, 2022
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, John C. Weast, Mike B. Macpherson, Linda L. Hurd, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Kamal Sinha, Joydeep Ray, Balaji Vembu, Sanjeev Jahagirdar, Vasanth Ranganathan, Dukhwan Kim
  • Publication number: 20220261347
    Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache coupled to the processing resources. The cache controller is configured to control cache priority by determining whether default settings or an instruction will control cache operations for the cache.
    Type: Application
    Filed: April 28, 2022
    Publication date: August 18, 2022
    Applicant: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Ben Ashbaugh, Jonathan Pearce, Abhishek Appu, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Elmoustapha Ould-Ahmed-Vall, Aravindh Anantaraman, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Yoav Harel, Arthur Hunter,, JR., Brent Insko, Scott Janus, Pattabhiraman K, Mike Macpherson, Subramaniam Maiyuran, Marian Alin Petre, Murali Ramadoss, Shailesh Shah, Kamal Sinha, Prasoonkumar Surti, Vikranth Vemulapalli
  • Publication number: 20220253317
    Abstract: A mechanism is described for facilitating fast data operations and for facilitating a finite state machine for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting input data to be used in computational tasks by a computation component of a processor including a graphics processor. The method may further include determining one or more frequently-used data values (FDVs) from the data, and pushing the one or more frequent data values to bypass the computational tasks.
    Type: Application
    Filed: March 1, 2022
    Publication date: August 11, 2022
    Applicant: Intel Corporation
    Inventors: Liwei Ma, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Eriko Nurvitadhi, Abhishek R. Appu, Altug Koker, Kamal Sinha, Joydeep Ray, Balaji Vembu, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 11409658
    Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.
    Type: Grant
    Filed: January 28, 2021
    Date of Patent: August 9, 2022
    Assignee: Intel Corporation
    Inventors: Vikranth Vemulapalli, Lakshminarayanan Striramassarma, Mike MacPherson, Aravindh Anantaraman, Ben Ashbaugh, Murali Ramadoss, William B. Sadler, Jonathan Pearce, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Jr., Prasoonkumar Surti, Nicolas Galoppo von Borries, Joydeep Ray, Abhishek R. Appu, ElMoustapha Ould-Ahmed-Vall, Altug Koker, Sungye Kim, Subramaniam Maiyuran, Valentin Andrei
  • Patent number: 11409579
    Abstract: An apparatus to facilitate thread barrier synchronization is disclosed. The apparatus includes a plurality of processing resources to execute a plurality of execution threads included in a thread workgroup and barrier synchronization hardware to assign a first named barrier to a first set of the plurality of execution threads in the thread workgroup, assign a second named barrier to a second set of the plurality of execution threads in the thread workgroup, synchronize execution of the first set of execution threads via the first named barrier and synchronize execution of the second set of execution threads via the second named barrier.
    Type: Grant
    Filed: February 24, 2020
    Date of Patent: August 9, 2022
    Assignee: Intel Corporation
    Inventors: James Valerio, Vasanth Ranganathan, Joydeep Ray
  • Patent number: 11393211
    Abstract: A mechanism is described for facilitating person tracking and data security in machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting, by a camera associated with one or more trackers, a person within a physical vicinity, where detecting includes capturing one or more images the person. The method may further include tracking, by the one or more trackers, the person based on the one or more images of the person, where tracking includes collect tracking data relating to the person. The method may further include selecting a tracker of the one or more trackers as a preferred tracker based on the tracking data.
    Type: Grant
    Filed: February 11, 2021
    Date of Patent: July 19, 2022
    Assignee: Intel Corporation
    Inventors: Mayuresh M. Varerkar, Barnan Das, Narayan Biswal, Stanley J. Baran, Gokcen Cilingir, Nilesh V. Shah, Archie Sharma, Sherine Abdelhak, Sachin Godse, Farshad Akhbari, Narayan Srinivasa, Altug Koker, Nadathur Rajagopalan Satish, Dukhwan Kim, Feng Chen, Abhishek R. Appu, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Patent number: 11379235
    Abstract: A mechanism is described for facilitating intelligent dispatching and vectorizing at autonomous machines. A method of embodiments, as described herein, includes detecting a plurality of threads corresponding to a plurality of workloads associated with tasks relating to a graphics processor. The method may further include determining a first set of threads of the plurality of threads that are similar to each other or have adjacent surfaces, and physically clustering the first set of threads close together using a first set of adjacent compute blocks.
    Type: Grant
    Filed: December 21, 2020
    Date of Patent: July 5, 2022
    Assignee: Intel Corporation
    Inventors: Feng Chen, Narayan Srinivasa, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Joydeep Ray, Nicolas C. Galoppo Von Borries, Prasoonkumar Surti, Ben J. Ashbaugh, Sanjeev Jahagirdar, Vasanth Ranganathan
  • Patent number: 11373269
    Abstract: An apparatus to facilitate cache replacement is disclosed. The apparatus includes a cache memory and cache replacement logic to manage data in the cache memory. The cache replacement logic includes tracking logic to track addresses accessed at the cache memory and replacement control logic to monitor the tracking logic and apply a replacement policy based on information received from the tracking logic.
    Type: Grant
    Filed: July 8, 2020
    Date of Patent: June 28, 2022
    Assignee: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Abhishek R. Appu, Vasanth Ranganathan
  • Publication number: 20220197800
    Abstract: Graphics processors of the present design provide hierarchical open sectors and variable cache sizes for cache operations. In one embodiment, a graphics processor comprises a cache memory having a hierarchical open sector design including a first hierarchy of upper and lower regions with each region including a second hierarchy of sectors. A cache controller is configured to initially open a first sector of the lower region, to receive a memory request that does not match an address in the first sector, and to open a second sector of the lower region.
    Type: Application
    Filed: March 14, 2020
    Publication date: June 23, 2022
    Applicant: Intel Corporation
    Inventors: Abhishek Appu, Lakshminarayanan Striramassarma, Altug Koker, Sean Coleman, Varghese George, Arthur Hunter, Jr., Brent Insko, Scott Janus, Elmoustapha Ould-Ahmed-Vall, Vasanth Ranganathan, Joydeep Ray, Kamal Sinha, Prasoonkumar Surti, Karthik Vaidyanathan
  • Patent number: 11360767
    Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.
    Type: Grant
    Filed: July 6, 2021
    Date of Patent: June 14, 2022
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar