Patents by Inventor Paul James Moyer
Paul James Moyer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12204454Abstract: Systems, apparatuses, and methods for employing system probe filter aware last level cache insertion bypassing policies are disclosed. A system includes a plurality of processing nodes, a probe filter, and a shared cache. The probe filter monitors a rate of recall probes that are generated, and if the rate is greater than a first threshold, then the system initiates a cache partitioning and monitoring phase for the shared cache. Accordingly, the cache is partitioned into two portions. If the hit rate of a first portion is greater than a second threshold, then a second portion will have a non-bypass insertion policy since the cache is relatively useful in this scenario. However, if the hit rate of the first portion is less than or equal to the second threshold, then the second portion will have a bypass insertion policy since the cache is less useful in this case.Type: GrantFiled: October 29, 2021Date of Patent: January 21, 2025Assignee: Advanced Micro Devices, Inc.Inventors: Paul James Moyer, Jay Fleischman
-
Publication number: 20240411692Abstract: Cache replacement policies are described. In accordance with the described techniques, a request for data is received and a cache replacement policy controls how a controller responds to the request. The cache replacement policy assigns each cacheline a priority value, which indicates whether the cacheline should be preserved relative to other cachelines, in response to the request being a cache miss that necessitates eviction of at least one cacheline. The cache replacement policy decrements priority values until at least one cacheline achieves a minimum priority value, at which point a cacheline is evicted. The cache replacement policy designates certain cachelines as protected, either via a separate protected indicator or via the cacheline's priority value, which causes unprotected cachelines to be selected for eviction while favoring preservation of protected cachelines in the cache.Type: ApplicationFiled: June 9, 2023Publication date: December 12, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Gabriel Hsiuwei Loh, Joseph Lee Greathouse, William Louie Walker, Paul James Moyer
-
Patent number: 12032965Abstract: Systems, apparatuses, and methods for arbitrating threads in a computing system are disclosed. A computing system includes a processor with multiple cores, each capable of simultaneously processing instructions of multiple threads. When a thread throttling unit receives an indication that a shared cache has resource contention, the throttling unit sets a threshold number of cache misses for the cache. If the number of cache misses exceeds this threshold, then the throttling unit notifies a particular upstream computation unit to throttle the processing of instructions for the thread. After a time period elapses, if the cache continues to exceed the threshold, then the throttling unit notifies the upstream computation unit to more restrictively throttle the thread by performing one or more of reducing the selection rate and increasing the time period. Otherwise, the unit notifies the upstream computation unit to less restrictively throttle the thread.Type: GrantFiled: November 5, 2021Date of Patent: July 9, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Paul James Moyer, Douglas Benson Hunt, Kai Troester
-
Patent number: 12026099Abstract: A cache stores, along with data that is being transferred from a higher level cache to a lower level cache, information indicating the higher level cache location from which the data was transferred. Upon receiving a request for data that is stored at the location in the higher level cache, a cache controller stores the higher level cache location information in a status tag of the data. The cache controller then transfers the data with the status tag indicating the higher level cache location to a lower level cache. When the data is subsequently updated or evicted from the lower level cache, the cache controller reads the status tag location information and transfers the data back to the location in the higher level cache from which it was originally transferred.Type: GrantFiled: February 22, 2021Date of Patent: July 2, 2024Assignee: Advanced Micro Devices, Inc.Inventor: Paul James Moyer
-
Publication number: 20240202116Abstract: An entry of a last level cache shadow tag array to track pending last level cache misses to private data in a previous level cache (e.g., an L2 cache), that also are misses to an exclusive last level cache (e.g., an L3 cache) and to the last level cache shadow tag array. Accordingly, last level cache miss status holding registers need not be expended to track cache misses to private data that are already being tracked by a previous level cache miss status holding register. Additionally or alternatively, up to a threshold number of last level cache pending misses to the same shared data from different processor cores are tracked in the last level cache shadow tag array, and any additional last level cache pending misses are tracked in a last level cache miss status holding register.Type: ApplicationFiled: December 20, 2022Publication date: June 20, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Jagadish B. Kotra, John Kalamatianos, Paul James Moyer, Nicholas Dean Lance, Sriram Srinivasan, Patrick James Shyvers, William Louie Walker
-
Patent number: 11669457Abstract: Systems, apparatuses, and methods for generating a measurement of write memory bandwidth are disclosed. A control unit monitors writes to a cache hierarchy. If a write to a cache line is a first time that the cache line is being modified since entering the cache hierarchy, then the control unit increments a write memory bandwidth counter. Otherwise, if the write is to a cache line that has already been modified since entering the cache hierarchy, then the write memory bandwidth counter is not incremented. The first write to a cache line is a proxy for write memory bandwidth since this will eventually cause a write to memory. The control unit uses the value of the write memory bandwidth counter to generate a measurement of the write memory bandwidth. Also, the control unit can maintain multiple counters for different thread classes to calculate the write memory bandwidth per thread class.Type: GrantFiled: August 27, 2021Date of Patent: June 6, 2023Inventors: Paul James Moyer, Douglas Benson Hunt
-
Patent number: 11561895Abstract: Systems, apparatuses, and methods for dynamically adjusting cache policies to reduce execution core wait time are disclosed. A processor includes a cache subsystem. The cache subsystem includes one or more cache levels and one or more cache controllers. A cache controller partitions a cache level into two test portions and a remainder portion. The cache controller applies a first policy to the first test portion and applies a second policy to the second test portion. The cache controller determines the amount of time the execution core spends waiting on accesses to the first and second test portions. If the measured wait time is less for the first test portion than for the second test portion, then the cache controller applies the first policy to the remainder portion. Otherwise, the cache controller applies the second policy to the remainder portion.Type: GrantFiled: September 5, 2019Date of Patent: January 24, 2023Assignee: Advanced Micro Devices, Inc.Inventor: Paul James Moyer
-
Publication number: 20220100686Abstract: Systems, apparatuses, and methods for routing interrupts on a coherency probe network are disclosed. A computing system includes a plurality of processing nodes, a coherency probe network, and one or more control units. The coherency probe network carries coherency probe messages between coherent agents. Interrupts that are detected by a control unit are converted into messages that are compatible with coherency probe messages and then routed to a target destination via the coherency probe network. Interrupts are generated with a first encoding while coherency probe messages have a second encoding. Cache subsystems determine whether a message received via the coherency probe network is an interrupt message or a coherency probe message based on an encoding embedded in the received message. Interrupt messages are routed to interrupt controller(s) while coherency probe messages are processed in accordance with a coherence probe action field embedded in the message.Type: ApplicationFiled: December 10, 2021Publication date: March 31, 2022Inventors: Vydhyanathan Kalyanasundharam, Eric Christopher Morton, Bryan P. Broussard, Paul James Moyer, William Louie Walker
-
Publication number: 20220058025Abstract: Systems, apparatuses, and methods for arbitrating threads in a computing system are disclosed. A computing system includes a processor with multiple cores, each capable of simultaneously processing instructions of multiple threads. When a thread throttling unit receives an indication that a shared cache has resource contention, the throttling unit sets a threshold number of cache misses for the cache. If the number of cache misses exceeds this threshold, then the throttling unit notifies a particular upstream computation unit to throttle the processing of instructions for the thread. After a time period elapses, if the cache continues to exceed the threshold, then the throttling unit notifies the upstream computation unit to more restrictively throttle the thread by performing one or more of reducing the selection rate and increasing the time period. Otherwise, the unit notifies the upstream computation unit to less restrictively throttle the thread.Type: ApplicationFiled: November 5, 2021Publication date: February 24, 2022Inventors: Paul James Moyer, Douglas Benson Hunt, Kai Troester
-
Publication number: 20220050785Abstract: Systems, apparatuses, and methods for employing system probe filter aware last level cache insertion bypassing policies are disclosed. A system includes a plurality of processing nodes, a probe filter, and a shared cache. The probe filter monitors a rate of recall probes that are generated, and if the rate is greater than a first threshold, then the system initiates a cache partitioning and monitoring phase for the shared cache. Accordingly, the cache is partitioned into two portions. If the hit rate of a first portion is greater than a second threshold, then a second portion will have a non-bypass insertion policy since the cache is relatively useful in this scenario. However, if the hit rate of the first portion is less than or equal to the second threshold, then the second portion will have a bypass insertion policy since the cache is less useful in this case.Type: ApplicationFiled: October 29, 2021Publication date: February 17, 2022Inventors: Paul James Moyer, Jay Fleischman
-
Patent number: 11210246Abstract: Systems, apparatuses, and methods for routing interrupts on a coherency probe network are disclosed. A computing system includes a plurality of processing nodes, a coherency probe network, and one or more control units. The coherency probe network carries coherency probe messages between coherent agents. Interrupts that are detected by a control unit are converted into messages that are compatible with coherency probe messages and then routed to a target destination via the coherency probe network. Interrupts are generated with a first encoding while coherency probe messages have a second encoding. Cache subsystems determine whether a message received via the coherency probe network is an interrupt message or a coherency probe message based on an encoding embedded in the received message. Interrupt messages are routed to interrupt controller(s) while coherency probe messages are processed in accordance with a coherence probe action field embedded in the message.Type: GrantFiled: August 24, 2018Date of Patent: December 28, 2021Assignee: Advanced Micro Devices, Inc.Inventors: Vydhyanathan Kalyanasundharam, Eric Christopher Morton, Bryan P. Broussard, Paul James Moyer, William Louie Walker
-
Publication number: 20210390057Abstract: Systems, apparatuses, and methods for generating a measurement of write memory bandwidth are disclosed. A control unit monitors writes to a cache hierarchy. If a write to a cache line is a first time that the cache line is being modified since entering the cache hierarchy, then the control unit increments a write memory bandwidth counter. Otherwise, if the write is to a cache line that has already been modified since entering the cache hierarchy, then the write memory bandwidth counter is not incremented. The first write to a cache line is a proxy for write memory bandwidth since this will eventually cause a write to memory. The control unit uses the value of the write memory bandwidth counter to generate a measurement of the write memory bandwidth. Also, the control unit can maintain multiple counters for different thread classes to calculate the write memory bandwidth per thread class.Type: ApplicationFiled: August 27, 2021Publication date: December 16, 2021Inventors: Paul James Moyer, Douglas Benson Hunt
-
Patent number: 11169812Abstract: Systems, apparatuses, and methods for arbitrating threads in a computing system are disclosed. A computing system includes a processor with multiple cores, each capable of simultaneously processing instructions of multiple threads. When a thread throttling unit receives an indication that a shared cache has resource contention, the throttling unit sets a threshold number of cache misses for the cache. If the number of cache misses exceeds this threshold, then the throttling unit notifies a particular upstream computation unit to throttle the processing of instructions for the thread. After a time period elapses, if the cache continues to exceed the threshold, then the throttling unit notifies the upstream computation unit to more restrictively throttle the thread by performing one or more of reducing the selection rate and increasing the time period. Otherwise, the unit notifies the upstream computation unit to less restrictively throttle the thread.Type: GrantFiled: September 26, 2019Date of Patent: November 9, 2021Assignee: Advanced Micro Devices, Inc.Inventors: Paul James Moyer, Douglas Benson Hunt, Kai Troester
-
Patent number: 11163688Abstract: Systems, apparatuses, and methods for employing system probe filter aware last level cache insertion bypassing policies are disclosed. A system includes a plurality of processing nodes, a probe filter, and a shared cache. The probe filter monitors a rate of recall probes that are generated, and if the rate is greater than a first threshold, then the system initiates a cache partitioning and monitoring phase for the shared cache. Accordingly, the cache is partitioned into two portions. If the hit rate of a first portion is greater than a second threshold, then a second portion will have a non-bypass insertion policy since the cache is relatively useful in this scenario. However, if the hit rate of the first portion is less than or equal to the second threshold, then the second portion will have a bypass insertion policy since the cache is less useful in this case.Type: GrantFiled: September 24, 2019Date of Patent: November 2, 2021Assignee: Advanced Micro Devices, Inc.Inventors: Paul James Moyer, Jay Fleischman
-
Patent number: 11106594Abstract: Systems, apparatuses, and methods for generating a measurement of write memory bandwidth are disclosed. A control unit monitors writes to a cache hierarchy. If a write to a cache line is a first time that the cache line is being modified since entering the cache hierarchy, then the control unit increments a write memory bandwidth counter. Otherwise, if the write is to a cache line that has already been modified since entering the cache hierarchy, then the write memory bandwidth counter is not incremented. The first write to a cache line is a proxy for write memory bandwidth since this will eventually cause a write to memory. The control unit uses the value of the write memory bandwidth counter to generate a measurement of the write memory bandwidth. Also, the control unit can maintain multiple counters for different thread classes to calculate the write memory bandwidth per thread class.Type: GrantFiled: September 5, 2019Date of Patent: August 31, 2021Assignee: Advanced Micro Devices, Inc.Inventors: Paul James Moyer, Douglas Benson Hunt
-
Publication number: 20210173789Abstract: A cache stores, along with data that is being transferred from a higher level cache to a lower level cache, information indicating the higher level cache location from which the data was transferred. Upon receiving a request for data that is stored at the location in the higher level cache, a cache controller stores the higher level cache location information in a status tag of the data. The cache controller then transfers the data with the status tag indicating the higher level cache location to a lower level cache. When the data is subsequently updated or evicted from the lower level cache, the cache controller reads the status tag location information and transfers the data back to the location in the higher level cache from which it was originally transferred.Type: ApplicationFiled: February 22, 2021Publication date: June 10, 2021Inventor: Paul James MOYER
-
Publication number: 20210096873Abstract: Systems, apparatuses, and methods for arbitrating threads in a computing system are disclosed. A computing system includes a processor with multiple cores, each capable of simultaneously processing instructions of multiple threads. When a thread throttling unit receives an indication that a shared cache has resource contention, the throttling unit sets a threshold number of cache misses for the cache. If the number of cache misses exceeds this threshold, then the throttling unit notifies a particular upstream computation unit to throttle the processing of instructions for the thread. After a time period elapses, if the cache continues to exceed the threshold, then the throttling unit notifies the upstream computation unit to more restrictively throttle the thread by performing one or more of reducing the selection rate and increasing the time period. Otherwise, the unit notifies the upstream computation unit to less restrictively throttle the thread.Type: ApplicationFiled: September 26, 2019Publication date: April 1, 2021Inventors: Paul James Moyer, Douglas Benson Hunt, Kai Troester
-
Publication number: 20210089462Abstract: Systems, apparatuses, and methods for employing system probe filter aware last level cache insertion bypassing policies are disclosed. A system includes a plurality of processing nodes, a probe filter, and a shared cache. The probe filter monitors a rate of recall probes that are generated, and if the rate is greater than a first threshold, then the system initiates a cache partitioning and monitoring phase for the shared cache. Accordingly, the cache is partitioned into two portions. If the hit rate of a first portion is greater than a second threshold, then a second portion will have a non-bypass insertion policy since the cache is relatively useful in this scenario. However, if the hit rate of the first portion is less than or equal to the second threshold, then the second portion will have a bypass insertion policy since the cache is less useful in this case.Type: ApplicationFiled: September 24, 2019Publication date: March 25, 2021Inventors: Paul James Moyer, Jay Fleischman
-
Patent number: 10956339Abstract: A cache stores, along with data that is being transferred from a higher level cache to a lower level cache, information indicating the higher level cache location from which the data was transferred. Upon receiving a request for data that is stored at the location in the higher level cache, a cache controller stores the higher level cache location information in a status tag of the data. The cache controller then transfers the data with the status tag indicating the higher level cache location to a lower level cache. When the data is subsequently updated or evicted from the lower level cache, the cache controller reads the status tag location information and transfers the data back to the location in the higher level cache from which it was originally transferred.Type: GrantFiled: July 14, 2016Date of Patent: March 23, 2021Assignee: ADVANCED MICRO DEVICES, INC.Inventor: Paul James Moyer
-
Publication number: 20210073137Abstract: Systems, apparatuses, and methods for generating a measurement of write memory bandwidth are disclosed. A control unit monitors writes to a cache hierarchy. If a write to a cache line is a first time that the cache line is being modified since entering the cache hierarchy, then the control unit increments a write memory bandwidth counter. Otherwise, if the write is to a cache line that has already been modified since entering the cache hierarchy, then the write memory bandwidth counter is not incremented. The first write to a cache line is a proxy for write memory bandwidth since this will eventually cause a write to memory. The control unit uses the value of the write memory bandwidth counter to generate a measurement of the write memory bandwidth. Also, the control unit can maintain multiple counters for different thread classes to calculate the write memory bandwidth per thread class.Type: ApplicationFiled: September 5, 2019Publication date: March 11, 2021Inventors: Paul James Moyer, Douglas Benson Hunt