Patents by Inventor Tarun Nakra
Tarun Nakra has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12282428Abstract: In response to generating one or more speculative prefetch requests for a last-level cache, a processor determines prefetch analytics for the generated speculative prefetch requests and compares the determined prefetch analytics of the speculative prefetch requests to selection thresholds. In response to a speculative prefetch request meeting or exceeding a selection threshold, the processor selects the speculative prefetch request for issuance to a memory-side cache controller. When one or more system conditions meet one or more condition thresholds, the processor issues the selected speculative prefetch request to the memory-side cache controller. The memory-side cache controller then retrieves the data indicated in the selected speculative prefetch request from a memory and stores it in a memory-side cache in the data fabric coupled to the last-level cache.Type: GrantFiled: December 28, 2021Date of Patent: April 22, 2025Assignee: Advanced Micro Devices, Inc.Inventors: Tarun Nakra, Akhil Arunkumar, Paul Moyer, Jay Fleischman
-
Patent number: 11960399Abstract: Methods, systems, and devices maintain state information in a shadow tag memory for a plurality of cachelines in each of a plurality of private caches, with each of the private caches being associated with a corresponding one of multiple processing cores. One or more cache probes are generated based on a write operation associated with one or more cachelines of the plurality of cachelines, such that each of the cache probes is associated with cachelines of a particular private cache of the multiple private caches, the particular private cache being associated with an indicated processing core. Transmission of the cache probes to the particular private cache is prevented until, responsive to a scope acquire operation from the indicated processing core, the cache probes are released for transmission to the respectively associated cachelines in the particular private cache.Type: GrantFiled: December 21, 2021Date of Patent: April 16, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Akhil Arunkumar, Tarun Nakra, Maxim V. Kazakov, Milind N. Nemlekar
-
Patent number: 11847062Abstract: In response to eviction of a first clean data block from an intermediate level of cache in a multi-cache hierarchy of a processing system, a cache controller accesses an address of the first clean data block. The controller initiates a fetch of the first clean data block from a system memory into a last-level cache using the accessed address.Type: GrantFiled: December 16, 2021Date of Patent: December 19, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Tarun Nakra, Jay Fleischman, Gautam Tarasingh Hazari, Akhil Arunkumar, William L. Walker, Gabriel H. Loh, John Kalamatianos, Marko Scrbak
-
Publication number: 20230205700Abstract: In response to generating one or more speculative prefetch requests for a last-level cache, a processor determines prefetch analytics for the generated speculative prefetch requests and compares the determined prefetch analytics of the speculative prefetch requests to selection thresholds. In response to a speculative prefetch request meeting or exceeding a selection threshold, the processor selects the speculative prefetch request for issuance to a memory-side cache controller. When one or more system conditions meet one or more condition thresholds, the processor issues the selected speculative prefetch request to the memory-side cache controller. The memory-side cache controller then retrieves the data indicated in the selected speculative prefetch request from a memory and stores it in a memory-side cache in the data fabric coupled to the last-level cache.Type: ApplicationFiled: December 28, 2021Publication date: June 29, 2023Inventors: Tarun NAKRA, Akhil ARUNKUMAR, Paul MOYER, Jay FLEISCHMAN
-
Publication number: 20230195643Abstract: In response to eviction of a first clean data block from an intermediate level of cache in a multi-cache hierarchy of a processing system, a cache controller accesses an address of the first clean data block. The controller initiates a fetch of the first clean data block from a system memory into a last-level cache using the accessed address.Type: ApplicationFiled: December 16, 2021Publication date: June 22, 2023Inventors: Tarun Nakra, Jay Fleischman, Gautam Tarasingh Hazari, Akhil Arunkumar, William L. Walker, Gabriel H. Loh, John Kalamatianos, Marko Scrbak
-
Publication number: 20230195628Abstract: Methods, systems, and devices maintain state information in a shadow tag memory for a plurality of cachelines in each of a plurality of private caches, with each of the private caches being associated with a corresponding one of multiple processing cores. One or more cache probes are generated based on a write operation associated with one or more cachelines of the plurality of cachelines, such that each of the cache probes is associated with cachelines of a particular private cache of the multiple private caches, the particular private cache being associated with an indicated processing core. Transmission of the cache probes to the particular private cache is prevented until, responsive to a scope acquire operation from the indicated processing core, the cache probes are released for transmission to the respectively associated cachelines in the particular private cache.Type: ApplicationFiled: December 21, 2021Publication date: June 22, 2023Inventors: Akhil Arunkumar, Tarun Nakra, Maxim V. Kazakov, Milind N. Nemlekar
-
Patent number: 11609858Abstract: A system and a method to allocate data to a first cache increments a first counter if a reuse indicator for the data indicates that the data is likely to be reused and decremented the counter if the reuse indicator for the data indicates that the data is likely not to be reused. A second counter is incremented upon eviction of the data from the second cache, which is a higher level cache than the first cache. The data is allocated to the first cache if the value of the first counter is equal to or greater than the first predetermined threshold or the value of the second counter equals zero, and the data is bypassed from the first cache if the value of the first counter is less than the first predetermined threshold and the value of the second counter is not equal to zero.Type: GrantFiled: August 13, 2021Date of Patent: March 21, 2023Inventors: Yingying Tian, Tarun Nakra, Vikas Sinha, Hien Le
-
Patent number: 11580025Abstract: Systems and methods for coordinated memory-side cache prefetching and dynamic interleaving configuration modification involve modifying one or both of the prefetch distance or the prefetch degree used by prefetcher modules of one or more memory-side caches by modifying interleaving configuration data following detection of an interleaving reconfiguration trigger condition indicative, for example, of low prefetch accuracy, low prefetch coverage, high prefetch lateness, or a combination of these. In response an interleaving reconfiguration trigger condition, a processor modifies the interleaving configuration data for the processing system based on the prefetch performance characteristics associated with the interleaving reconfiguration trigger condition. In some embodiments, the interleaving configuration data is modified by changing which physical memory address indices are used to determine the bits that define the channel identification number to which that physical memory address is to be mapped.Type: GrantFiled: September 30, 2021Date of Patent: February 14, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Tarun Nakra, Akhil Arunkumar, Vydhyanathan Kalyanasundharam, Chintan S. Patel, Nithesh Kurella Lakshmi Narayanamurthy
-
Publication number: 20210374064Abstract: A system and a method to allocate data to a first cache increments a first counter if a reuse indicator for the data indicates that the data is likely to be reused and decremented the counter if the reuse indicator for the data indicates that the data is likely not to be reused. A second counter is incremented upon eviction of the data from the second cache, which is a higher level cache than the first cache. The data is allocated to the first cache if the value of the first counter is equal to or greater than the first predetermined threshold or the value of the second counter equals zero, and the data is bypassed from the first cache if the value of the first counter is less than the first predetermined threshold and the value of the second counter is not equal to zero.Type: ApplicationFiled: August 13, 2021Publication date: December 2, 2021Inventors: Yingying TIAN, Tarun NAKRA, Vikas SINHA, Hien LE
-
Patent number: 11169925Abstract: According to one general aspect, an apparatus may include a store stream detector configured to detect when the apparatus is streaming data to a memory system. The apparatus may also include a write generator configured to route a stream of data to either a near memory of the memory system or a far memory of the memory system based upon a cache threshold value and a size of the stream of data. The apparatus may be configured to dynamically vary the cache threshold value based upon a predetermined rule set, such that cache pollution caused by the stream of data is managed.Type: GrantFiled: January 13, 2016Date of Patent: November 9, 2021Inventors: Tarun Nakra, Kevin Lepak
-
Patent number: 11113207Abstract: A system and a method to allocate data to a first cache increments a first counter if a reuse indicator for the data indicates that the data is likely to be reused and decremented the counter if the reuse indicator for the data indicates that the data is likely not to be reused. A second counter is incremented upon eviction of the data from the second cache, which is a higher level cache than the first cache. The data is allocated to the first cache if the value of the first counter is equal to or greater than the first predetermined threshold or the value of the second counter equals zero, and the data is bypassed from the first cache if the value of the first counter is less than the first predetermined threshold and the value of the second counter is not equal to zero.Type: GrantFiled: February 28, 2019Date of Patent: September 7, 2021Inventors: Yingying Tian, Tarun Nakra, Vikas Sinha, Hien Le
-
Patent number: 11055221Abstract: According to one general aspect, an apparatus may include a processor configured to issue a first request for a piece of data from a cache memory and a second request for the piece of data from a system memory. The apparatus may include the cache memory configured to temporarily store a subset of data. The apparatus may include a memory interconnect. The a memory interconnect may be configured to receive the second request for the piece of data from the system memory. The a memory interconnect may be configured to determine if the piece of memory is stored in the cache memory. The a memory interconnect may be configured to, if the piece of memory is determined to be stored in the cache memory, cancel the second request for the piece of data from the system memory.Type: GrantFiled: May 28, 2019Date of Patent: July 6, 2021Inventors: Vikas Sinha, Hien Le, Tarun Nakra, Yingying Tian, Apurva Patel, Omar Torres
-
Patent number: 10963388Abstract: According to one general aspect, an apparatus may include a multi-tiered cache system that includes at least one upper cache tier relatively closer, hierarchically, to a processor and at least one lower cache tier relatively closer, hierarchically, to a system memory. The apparatus may include a memory interconnect circuit hierarchically between the multi-tiered cache system and the system memory. The apparatus may include a prefetcher circuit coupled with a lower cache tier of the multi-tiered cache system, and configured to issue a speculative prefetch request to the memory interconnect circuit for data to be placed into the lower cache tier. The memory interconnect circuit may be configured to cancel the speculative prefetch request if the data exists in an upper cache tier of the multi-tiered cache system.Type: GrantFiled: August 16, 2019Date of Patent: March 30, 2021Inventors: Vikas Sinha, Teik Tan, Tarun Nakra
-
Publication number: 20200401523Abstract: According to one general aspect, an apparatus may include a multi-tiered cache system that includes at least one upper cache tier relatively closer, hierarchically, to a processor and at least one lower cache tier relatively closer, hierarchically, to a system memory. The apparatus may include a memory interconnect circuit hierarchically between the multi-tiered cache system and the system memory. The apparatus may include a prefetcher circuit coupled with a lower cache tier of the multi-tiered cache system, and configured to issue a speculative prefetch request to the memory interconnect circuit for data to be placed into the lower cache tier. The memory interconnect circuit may be configured to cancel the speculative prefetch request if the data exists in an upper cache tier of the multi-tiered cache system.Type: ApplicationFiled: August 16, 2019Publication date: December 24, 2020Inventors: Vikas SINHA, Teik TAN, Tarun NAKRA
-
Publication number: 20200301838Abstract: According to one general aspect, an apparatus may include a processor configured to issue a first request for a piece of data from a cache memory and a second request for the piece of data from a system memory. The apparatus may include the cache memory configured to temporarily store a subset of data. The apparatus may include a memory interconnect. The a memory interconnect may be configured to receive the second request for the piece of data from the system memory. The a memory interconnect may be configured to determine if the piece of memory is stored in the cache memory. The a memory interconnect may be configured to, if the piece of memory is determined to be stored in the cache memory, cancel the second request for the piece of data from the system memory.Type: ApplicationFiled: May 28, 2019Publication date: September 24, 2020Inventors: Vikas SINHA, Hien LE, Tarun NAKRA, Yingying TIAN, Apurva PATEL, Omar TORRES
-
Publication number: 20200210347Abstract: A system and a method to allocate data to a first cache increments a first counter if a reuse indicator for the data indicates that the data is likely to be reused and decremented the counter if the reuse indicator for the data indicates that the data is likely not to be reused. A second counter is incremented upon eviction of the data from the second cache, which is a higher level cache than the first cache. The data is allocated to the first cache if the value of the first counter is equal to or greater than the first predetermined threshold or the value of the second counter equals zero, and the data is bypassed from the first cache if the value of the first counter is less than the first predetermined threshold and the value of the second counter is not equal to zero.Type: ApplicationFiled: February 28, 2019Publication date: July 2, 2020Inventors: Yingying TIAN, Tarun NAKRA, Vikas SINHA, Hien LE
-
Patent number: 10649900Abstract: According to one general aspect, an apparatus may include a first cache configured to store data. The apparatus may include a second cache configured to, in response to a fill request, supply the first cache with data, and an incoming fill signal. The apparatus may also include an execution circuit configured to, via a load request, retrieve data from the first cache. The first cache may be configured to: derive, from the incoming fill signal, address and timing information associated with the fill request, and based, at least partially, upon the address and timing information, schedule the load request to attempt to avoid a load-fill conflict.Type: GrantFiled: February 20, 2018Date of Patent: May 12, 2020Assignee: SAMSUNG ELECTRONICS CO., LTD.Inventors: Tarun Nakra, Hao Wang, Paul Kitchin
-
Patent number: 10606752Abstract: Embodiments include a method and system for coordinating cache management for an exclusive cache hierarchy. The method and system may include managing, by a coordinated cache logic section, a level three (L3) cache, a level two (L2) cache, and/or a level one (L1) cache. Managing the L3 cache and the L2 cache may include coordinating a cache block replacement policy among the L3 cache and the L2 cache by filtering data with lower reuse probability from data with higher reuse probability. The method and system may include tracking reuse patterns of demand requests separately from reuse patterns of prefetch requests. Accordingly, a coordinated cache management policy may be built across multiple levels of a cache hierarchy, rather than a cache replacement policy within one cache level. Higher-level cache behavior may be used to guide lower-level cache allocation, bringing greater visibility of cache behavior to exclusive last level caches (LLCs).Type: GrantFiled: February 6, 2018Date of Patent: March 31, 2020Assignee: SAMSUNG ELECTRONICS CO., LTD.Inventors: Yingying Tian, Tarun Nakra, Khang Nguyen, Ravikanth Reddy, Edwin Silvera
-
Patent number: 10360158Abstract: Embodiments of the present system and method provide cache replacement in a victim exclusive cache using a snoop filter where replacement information is not lost during a re-reference back to the CPU. Replacement information is stored in a snoop filter, meaning that historical access data may be fully preserved and allows for more flexibility in the LLC re-insertion points, without additional bits stored in a L2 cache. The present system and method further include snoop filter replacement technique. The present system and method passes replacement information between a snoop filter and a victim exclusive cache (e.g., LLC) when transactions move cachelines to and from a master CPU. This maintains and advances existing replacement information for a cacheline that is removed from the victim exclusive cache on a read, as well as intelligently replaces and ages cachelines in the snoop filter.Type: GrantFiled: June 7, 2017Date of Patent: July 23, 2019Assignee: SAMSUNG ELECTRONICS CO., LTD.Inventors: Eric C. Quinnell, Kevin C. Heuer, Tarun Nakra, Akhil Arunkumar
-
Patent number: 10360160Abstract: According to one general aspect, an apparatus may include a cache and a cache replacement unit. The cache may be arranged in a plurality of cache sets each configured to store data. A number of cache sets are designated as leader cache sets and each leader cache set is associated with a first replacement policy or a second replacement policy. The cache replacement unit may be configured to monitor an effectiveness of the first replacement policy and, at least, the second replacement policy to accurately predict cache line replacement. The cache replacement unit may be configured to select the first replacement policy or the second replacement policy to be a dominant replacement policy. The cache replacement unit may be configured to dynamically scale the number of cache sets that are designated as leader cache sets based at least in part upon the effectiveness of the dominant replacement policy.Type: GrantFiled: October 21, 2016Date of Patent: July 23, 2019Assignee: Samsung Electronics Co., Ltd.Inventors: Tarun Nakra, Kevin Heuer, Khang Nguyen