Patents by Inventor Lakshminarayanan Striramassarma

Lakshminarayanan Striramassarma has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11620256
    Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache coupled to the processing resources. The cache controller is configured to control cache priority by determining whether default settings or an instruction will control cache operations for the cache.
    Type: Grant
    Filed: April 28, 2022
    Date of Patent: April 4, 2023
    Assignee: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Ben Ashbaugh, Jonathan Pearce, Abhishek Appu, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Elmoustapha Ould-Ahmed-Vall, Aravindh Anantaraman, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Yoav Harel, Arthur Hunter, Jr., Brent Insko, Scott Janus, Pattabhiraman K, Mike Macpherson, Subramaniam Maiyuran, Marian Alin Petre, Murali Ramadoss, Shailesh Shah, Kamal Sinha, Prasoonkumar Surti, Vikranth Vemulapalli
  • Patent number: 11593269
    Abstract: In an example, an apparatus comprises a plurality of execution units, and a cache memory communicatively coupled to the plurality of execution units, wherein the cache memory is structured into a plurality of sectors, wherein each sector in the plurality of sectors comprises at least two cache lines. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: August 12, 2021
    Date of Patent: February 28, 2023
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, David Puffer, Prasoonkumar Surti, Lakshminarayanan Striramassarma, Vasanth Ranganathan, Kiran C. Veernapu, Balaji Vembu, Pattabhiraman K
  • Patent number: 11586548
    Abstract: In an example, an apparatus comprises a plurality of execution units, and a cache memory communicatively coupled to the plurality of execution units, wherein the cache memory is structured into a plurality of sectors, wherein each sector in the plurality of sectors comprises at least two cache lines. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: March 3, 2021
    Date of Patent: February 21, 2023
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, David Puffer, Prasoonkumar Surti, Lakshminarayanan Striramassarma, Vasanth Ranganathan, Kiran C. Veernapu, Balaji Vembu, Pattabhiraman K
  • Publication number: 20230051190
    Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus is to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than the threshold value, allowing the prefetch of data to the L1 cache.
    Type: Application
    Filed: July 15, 2022
    Publication date: February 16, 2023
    Applicant: Intel Corporation
    Inventors: Vikranth Vemulapalli, Lakshminarayanan Striramassarma, Mike MacPherson, Aravindh Anantaraman, Ben Ashbaugh, Murali Ramadoss, William B. Sadler, Jonathan Pearce, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Jr., Prasoonkumar Surti, Nicolas Galoppo von Borries, Joydeep Ray, Abhishek R. Appu, ElMoustapha Ould-Ahmed-Vall, Altug Koker, Sungye Kim, Subramaniam Maiyuran, Valentin Andrei
  • Publication number: 20220350751
    Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received. In one embodiment, the cache memory is configured to be partitioned into multiple cache regions, wherein the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence.
    Type: Application
    Filed: July 12, 2022
    Publication date: November 3, 2022
    Applicant: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Elmoustapha Ould-Ahmed-Vall, Abhishek Appu, Aravindh Anantaraman, Valentin Andrei, Durgaprasad Bilagi, Varghese George, Brent Insko, Sanjeev Jahagirdar, Scott Janus, Pattabhiraman K, SungYe Kim, Subramaniam Maiyuran, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Xinmin Tian
  • Publication number: 20220261347
    Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache coupled to the processing resources. The cache controller is configured to control cache priority by determining whether default settings or an instruction will control cache operations for the cache.
    Type: Application
    Filed: April 28, 2022
    Publication date: August 18, 2022
    Applicant: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Ben Ashbaugh, Jonathan Pearce, Abhishek Appu, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Elmoustapha Ould-Ahmed-Vall, Aravindh Anantaraman, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Yoav Harel, Arthur Hunter, Jr., Brent Insko, Scott Janus, Pattabhiraman K, Mike Macpherson, Subramaniam Maiyuran, Marian Alin Petre, Murali Ramadoss, Shailesh Shah, Kamal Sinha, Prasoonkumar Surti, Vikranth Vemulapalli
  • Patent number: 11409658
    Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus is to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than the threshold value, allowing the prefetch of data to the L1 cache.
    Type: Grant
    Filed: January 28, 2021
    Date of Patent: August 9, 2022
    Assignee: Intel Corporation
    Inventors: Vikranth Vemulapalli, Lakshminarayanan Striramassarma, Mike MacPherson, Aravindh Anantaraman, Ben Ashbaugh, Murali Ramadoss, William B. Sadler, Jonathan Pearce, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Jr., Prasoonkumar Surti, Nicolas Galoppo von Borries, Joydeep Ray, Abhishek R. Appu, ElMoustapha Ould-Ahmed-Vall, Altug Koker, Sungye Kim, Subramaniam Maiyuran, Valentin Andrei
  • Patent number: 11410266
    Abstract: Embodiments described herein provide techniques to disaggregate an architecture of a system on a chip integrated circuit into multiple distinct chiplets that can be packaged onto a common chassis. In one embodiment, a graphics processing unit or parallel processor is composed from diverse silicon chiplets that are separately manufactured. A chiplet is an at least partially packaged integrated circuit that includes distinct units of logic that can be assembled with other chiplets into a larger package. A diverse set of chiplets with different IP core logic can be assembled into a single device.
    Type: Grant
    Filed: October 13, 2020
    Date of Patent: August 9, 2022
    Assignee: Intel Corporation
    Inventors: Naveen Matam, Lance Cheney, Eric Finley, Varghese George, Sanjeev Jahagirdar, Altug Koker, Josh Mastronarde, Iqbal Rajwani, Lakshminarayanan Striramassarma, Melaku Teshome, Vikranth Vemulapalli, Binoj Xavier
  • Patent number: 11386521
    Abstract: A disaggregated processor package can be configured to accept interchangeable chiplets. Interchangeability is enabled by specifying a standard physical interconnect for chiplets that can enable the chiplet to interface with a fabric or bridge interconnect. Chiplets from different IP designers can conform to the common interconnect, enabling such chiplets to be interchangeable during assembly. The fabric and bridge interconnect logic on the chiplet can then be configured to conform with the actual interconnect layout of the on-board logic of the chiplet. Additionally, data from chiplets can be transmitted across an inter-chiplet fabric using encapsulation, such that the actual data being transferred is opaque to the fabric, further enabling interchangeability of the individual chiplets. With such an interchangeable design, higher or lower density memory can be inserted into memory chiplet slots, while compute or graphics chiplets with a higher or lower core count can be inserted into logic chiplet slots.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: July 12, 2022
    Assignee: Intel Corporation
    Inventors: Altug Koker, Lance Cheney, Eric Finley, Varghese George, Sanjeev Jahagirdar, Josh Mastronarde, Naveen Matam, Iqbal Rajwani, Lakshminarayanan Striramassarma, Melaku Teshome, Vikranth Vemulapalli, Binoj Xavier
  • Publication number: 20220197800
    Abstract: Graphics processors of the present design provide hierarchical open sectors and variable cache sizes for cache operations. In one embodiment, a graphics processor comprises a cache memory having a hierarchical open sector design including a first hierarchy of upper and lower regions with each region including a second hierarchy of sectors. A cache controller is configured to initially open a first sector of the lower region, to receive a memory request that does not match an address in the first sector, and to open a second sector of the lower region.
    Type: Application
    Filed: March 14, 2020
    Publication date: June 23, 2022
    Applicant: Intel Corporation
    Inventors: Abhishek Appu, Lakshminarayanan Striramassarma, Altug Koker, Sean Coleman, Varghese George, Arthur Hunter, Jr., Brent Insko, Scott Janus, Elmoustapha Ould-Ahmed-Vall, Vasanth Ranganathan, Joydeep Ray, Kamal Sinha, Prasoonkumar Surti, Karthik Vaidyanathan
  • Publication number: 20220188967
    Abstract: A disaggregated processor package can be configured to accept interchangeable chiplets. Interchangeability is enabled by specifying a standard physical interconnect for chiplets that can enable the chiplet to interface with a fabric or bridge interconnect. Chiplets from different IP designers can conform to the common interconnect, enabling such chiplets to be interchangeable during assembly. The fabric and bridge interconnect logic on the chiplet can then be configured to conform with the actual interconnect layout of the on-board logic of the chiplet. Additionally, data from chiplets can be transmitted across an inter-chiplet fabric using encapsulation, such that the actual data being transferred is opaque to the fabric, further enabling interchangeability of the individual chiplets. With such an interchangeable design, cache or DRAM memory can be inserted into memory chiplet slots, while compute or graphics chiplets with a higher or lower core count can be inserted into logic chiplet slots.
    Type: Application
    Filed: March 2, 2022
    Publication date: June 16, 2022
    Applicant: Intel Corporation
    Inventors: Altug Koker, Lance Cheney, Eric Finley, Varghese George, Sanjeev Jahagirdar, Josh Mastronarde, Naveen Matam, Iqbal Rajwani, Lakshminarayanan Striramassarma, Melaku Teshome, Vikranth Vemulapalli, Binoj Xavier
  • Publication number: 20220180468
    Abstract: Embodiments described herein provide techniques to disaggregate an architecture of a system on a chip integrated circuit into multiple distinct chiplets that can be packaged onto a common chassis. In one embodiment, a graphics processing unit or parallel processor is composed from diverse silicon chiplets that are separately manufactured. A chiplet is an at least partially and distinctly packaged integrated circuit that includes distinct units of logic that can be assembled with other chiplets into a larger package. A diverse set of chiplets with different IP core logic can be assembled into a single device.
    Type: Application
    Filed: February 17, 2022
    Publication date: June 9, 2022
    Applicant: Intel Corporation
    Inventors: Naveen Matam, Lance Cheney, Eric Finley, Varghese George, Sanjeev Jahagirdar, Altug Koker, Josh Mastronarde, Iqbal Rajwani, Lakshminarayanan Striramassarma, Melaku Teshome, Vikranth Vemulapalli, Binoj Xavier
  • Publication number: 20220179787
    Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache coupled to the processing resources. The cache controller is configured to control cache priority by determining whether default settings or an instruction will control cache operations for the cache.
    Type: Application
    Filed: March 14, 2020
    Publication date: June 9, 2022
    Applicant: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Ben Ashbaugh, Jonathan Pearce, Abhishek Appu, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Elmoustapha Ould-Ahmed-Vall, Aravindh Anantaraman, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Yoav Harel, Arthur Hunter, Jr., Brent Insko, Scott Janus, Pattabhiraman K, Mike Macpherson, Subramaniam Maiyuran, Marian Alin Petre, Murali Ramadoss, Shailesh Shah, Kamal Sinha, Prasoonkumar Surti, Vikranth Vemulapalli
  • Publication number: 20220171710
    Abstract: Embodiments are generally directed to cache structure and utilization. An embodiment of an apparatus includes one or more processors including a graphics processor; a memory for storage of data for processing by the one or more processors; and a cache to cache data from the memory; wherein the apparatus is to provide for dynamic overfetching of cache lines for the cache, including receiving a read request and accessing the cache for the requested data, and upon a miss in the cache, overfetching data from memory or a higher level cache in addition to fetching the requested data, wherein the overfetching of data is based at least in part on a current overfetch boundary, and provides for data to be prefetched extending to the current overfetch boundary.
    Type: Application
    Filed: February 17, 2022
    Publication date: June 2, 2022
    Applicant: Intel Corporation
    Inventors: Altug Koker, Lakshminarayanan Striramassarma, Aravindh Anantaraman, Valentin Andrei, Abhishek R. Appu, Sean Coleman, Varghese George, Pattabhiraman K, Mike MacPherson, Subramaniam Maiyuran, ElMoustapha Ould-Ahmed-Vall, Vasanth Ranganathan, Joydeep Ray, Jayakrishna P S, Prasoonkumar Surti
  • Publication number: 20220156202
    Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received. In one embodiment, the cache memory is configured to be partitioned into multiple cache regions, wherein the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence.
    Type: Application
    Filed: February 1, 2022
    Publication date: May 19, 2022
    Applicant: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Elmoustapha Ould-Ahmed-Vall, Abhishek Appu, Aravindh Anantaraman, Valentin Andrei, Durgaprasad Bilagi, Varghese George, Brent Insko, Sanjeev Jahagirdar, Scott Janus, Pattabhiraman K, SungYe Kim, Subramaniam Maiyuran, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Xinmin Tian
  • Publication number: 20220137967
    Abstract: Embodiments are generally directed to graphics processor data access and sharing. An embodiment of an apparatus includes a circuit element to produce a result in processing of an application; a load-store unit to receive the result and generate pre-fetch information for a cache utilizing the result; and a prefetch generator to produce prefetch addresses based at least in part on the pre-fetch information; wherein the load-store unit is to receive software assistance for prefetching, and wherein generation of the pre-fetch information is based at least in part on the software assistance.
    Type: Application
    Filed: March 14, 2020
    Publication date: May 5, 2022
    Applicant: Intel Corporation
    Inventors: Altug Koker, Varghese George, Aravindh Anantaraman, Valentin Andrei, Abhishek R. Appu, Niranjan Cooray, Nicolas Galoppo Von Borries, Mike MacPherson, Subramaniam Maiyuran, ElMoustapha Ould-Ahmed-Vall, David Puffer, Vasanth Ranganathan, Joydeep Ray, Ankur N. Shah, Lakshminarayanan Striramassarma, Prasoonkumar Surti, Saurabh Tangri
  • Publication number: 20220138895
    Abstract: Embodiments are generally directed to compute optimization in graphics processing. An embodiment of an apparatus includes one or more processors including a multi-tile graphics processing unit (GPU) to process data, the multi-tile GPU including multiple processor tiles; and a memory for storage of data for processing, wherein the apparatus is to receive compute work for processing by the GPU, partition the compute work into multiple work units, assign each of multiple work units to one of the processor tiles, and process the compute work using the processor tiles assigned to the work units.
    Type: Application
    Filed: March 14, 2020
    Publication date: May 5, 2022
    Applicant: Intel Corporation
    Inventors: Vasanth Ranganathan, Abhishek R. Appu, Ben Ashbaugh, Peter Doyle, Brandon Fliflet, Arthur Hunter, Brent Insko, Scott Janus, Altug Koker, Aditya Navale, Joydeep Ray, Kamal Sinha, Lakshminarayanan Striramassarma, Prasoonkumar Surti, James Valerio
  • Publication number: 20220138104
    Abstract: Embodiments are generally directed to cache structure and utilization. An embodiment of an apparatus includes one or more processors including a graphics processor; a memory for storage of data for processing by the one or more processors; and a cache to cache data from the memory; wherein the apparatus is to provide for dynamic overfetching of cache lines for the cache, including receiving a read request and accessing the cache for the requested data, and upon a miss in the cache, overfetching data from memory or a higher level cache in addition to fetching the requested data, wherein the overfetching of data is based at least in part on a current overfetch boundary, and provides for data to be prefetched extending to the current overfetch boundary.
    Type: Application
    Filed: March 14, 2020
    Publication date: May 5, 2022
    Applicant: Intel Corporation
    Inventors: Altug Koker, Lakshminarayanan Striramassarma, Aravindh Anantaraman, Valentin Andrei, Abhishek R. Appu, Sean Coleman, Varghese George, K. Pattabhiraman, Mike MacPherson, Subramaniam Maiyuran, ElMoustapha Ould-Ahmed-Vall, Vasanth Ranganathan, Joydeep Ray, S. Jayakrishna P, Prasoonkumar Surti
  • Publication number: 20220129521
    Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. One embodiment provides techniques to optimize training and inference on a systolic array when using sparse data. One embodiment provides techniques to use decompression information when performing sparse compute operations. One embodiment enables the disaggregation of special function compute arrays via a shared register file. One embodiment enables packed data compress and expand operations on a GPGPU. One embodiment provides techniques to exploit block sparsity within the cache hierarchy of a GPGPU.
    Type: Application
    Filed: March 14, 2020
    Publication date: April 28, 2022
    Applicant: Intel Corporation
    Inventors: Prasoonkumar Surti, Subramaniam Maiyuran, Valentin Andrei, Abhishek Appu, Varghese George, Altug Koker, Mike Macpherson, Elmoustapha Ould-Ahmed-Vall, Vasanth Ranganathan, Joydeep Ray, Lakshminarayanan Striramassarma, SungYe Kim
  • Publication number: 20220114108
    Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received.
    Type: Application
    Filed: March 14, 2020
    Publication date: April 14, 2022
    Applicant: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Elmoustapha Ould-Ahmed-Vall, Abhishek Appu, Aravindh Anantaraman, Valentin Andrei, Durgaprasad Bilagi, Varghese George, Brent Insko, Sanjeev Jahagirdar, Scott Janus, Pattabhiraman K., SungYe Kim, Subramaniam Maiyuran, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Xinmin Tian