Patents by Inventor William L. Walker
William L. Walker has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240126552Abstract: Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request includes the instruction designated for remote execution. The target device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.Type: ApplicationFiled: December 21, 2023Publication date: April 18, 2024Inventors: JOHN KALAMATIANOS, MICHAEL T. CLARK, MARIUS EVERS, WILLIAM L. WALKER, PAUL MOYER, JAY FLEISCHMAN, JAGADISH B. KOTRA
-
Patent number: 11868777Abstract: Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request includes the instruction designated for remote execution. The target device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.Type: GrantFiled: December 16, 2020Date of Patent: January 9, 2024Assignee: ADVANCED MICRO DEVICES, INC.Inventors: John Kalamatianos, Michael T. Clark, Marius Evers, William L. Walker, Paul Moyer, Jay Fleischman, Jagadish B. Kotra
-
Patent number: 11847062Abstract: In response to eviction of a first clean data block from an intermediate level of cache in a multi-cache hierarchy of a processing system, a cache controller accesses an address of the first clean data block. The controller initiates a fetch of the first clean data block from a system memory into a last-level cache using the accessed address.Type: GrantFiled: December 16, 2021Date of Patent: December 19, 2023Assignee: Advanced Micro Devices, Inc.Inventors: Tarun Nakra, Jay Fleischman, Gautam Tarasingh Hazari, Akhil Arunkumar, William L. Walker, Gabriel H. Loh, John Kalamatianos, Marko Scrbak
-
Patent number: 11829196Abstract: An integrated circuit (IC) device includes a ring transport having a plurality of nodes and a wire interconnect coupling the plurality of nodes in a ring. The wire interconnect including a wire to transmit clock wake signals around the ring transport in advance of data signaling representing a data packet. Each node is to switch from a clock gated state to a clocked state responsive to receiving a clock wake signal. The ring transport further includes a sleep controller coupled to a select node of the plurality of nodes. The sleep controller is to configure the select node into a clock suppression state for a specified duration responsive to identifying an idle condition on the ring transport via monitoring of the wire. While in the clock suppression state the node suppresses further transmission of any clock wake signals received at the select node.Type: GrantFiled: October 22, 2019Date of Patent: November 28, 2023Assignee: Advanced Micro Devices, Inc.Inventor: William L. Walker
-
Patent number: 11704248Abstract: A processor core associated with a first cache initiates entry into a powered-down state. In response, information representing a set of entries of the first cache are stored in a retention region that receives a retention voltage while the processor core is in a powered-down state. Information indicating one or more invalidated entries of the set of entries is also stored in the retention region. In response to the processor core initiating exit from the powered-down state, entries of the first cache are restored using the stored information representing the entries and the stored information indicating the at least one invalidated entry.Type: GrantFiled: November 6, 2020Date of Patent: July 18, 2023Assignee: Advanced Micro Devices, Inc.Inventors: William L. Walker, Michael L. Golden, Marius Evers
-
Publication number: 20230195643Abstract: In response to eviction of a first clean data block from an intermediate level of cache in a multi-cache hierarchy of a processing system, a cache controller accesses an address of the first clean data block. The controller initiates a fetch of the first clean data block from a system memory into a last-level cache using the accessed address.Type: ApplicationFiled: December 16, 2021Publication date: June 22, 2023Inventors: Tarun Nakra, Jay Fleischman, Gautam Tarasingh Hazari, Akhil Arunkumar, William L. Walker, Gabriel H. Loh, John Kalamatianos, Marko Scrbak
-
Patent number: 11675703Abstract: A processing system includes an interconnect fabric coupleable to a local memory and at least one compute cluster coupled to the interconnect fabric. The compute cluster includes a processor core and a cache hierarchy. The cache hierarchy has a plurality of caches and a throttle controller configured to throttle a rate of memory requests issuable by the processor core based on at least one of an access latency metric and a prefetch accuracy metric. The access latency metric represents an average access latency for memory requests for the processor core and the prefetch accuracy metric represents an accuracy of a prefetcher of a cache of the cache hierarchy.Type: GrantFiled: March 28, 2022Date of Patent: June 13, 2023Assignee: Advanced Micro Devices, Inc.Inventors: William L. Walker, William E. Jones
-
Publication number: 20230094030Abstract: A processor sets the size of a processor cache based on an identified workload executing at the processor. The cache size is set in response to the processor exiting a low-power mode. By setting the size of the cache based on the workload, the processor is able to tailor the size of the cache to the characteristics of a particular workload while also reducing, for at least some workloads, the overhead associated with entering or exiting the low-power mode.Type: ApplicationFiled: September 30, 2021Publication date: March 30, 2023Inventors: Dilip Jha, William L. Walker
-
Patent number: 11561906Abstract: A processing system rinses, from a cache, those cache lines that share the same memory page as a cache line identified for eviction. A cache controller of the processing system identifies a cache line as scheduled for eviction. In response, the cache controller, identifies additional “dirty victim” cache lines (cache lines that have been modified at the cache and not yet written back to memory) that are associated with the same memory page, and writes each of the identified cache lines to the same memory page. By writing each of the dirty victim cache lines associated with the memory page to memory, the processing system reduces memory overhead and improves processing efficiency.Type: GrantFiled: December 12, 2017Date of Patent: January 24, 2023Assignee: Advanced Micro Devices, Inc.Inventors: William L. Walker, William E. Jones
-
Publication number: 20220292019Abstract: A processing system includes an interconnect fabric coupleable to a local memory and at least one compute cluster coupled to the interconnect fabric. The compute cluster includes a processor core and a cache hierarchy. The cache hierarchy has a plurality of caches and a throttle controller configured to throttle a rate of memory requests issuable by the processor core based on at least one of an access latency metric and a prefetch accuracy metric. The access latency metric represents an average access latency for memory requests for the processor core and the prefetch accuracy metric represents an accuracy of a prefetcher of a cache of the cache hierarchy.Type: ApplicationFiled: March 28, 2022Publication date: September 15, 2022Inventors: William L. Walker, William E. Jones
-
Publication number: 20220188117Abstract: Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request includes the instruction designated for remote execution. The target device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.Type: ApplicationFiled: December 16, 2020Publication date: June 16, 2022Inventors: JOHN KALAMATIANOS, MICHAEL T. CLARK, MARIUS EVERS, WILLIAM L. WALKER, PAUL MOYER, JAY FLEISCHMAN, JAGADISH B. KOTRA
-
Patent number: 11294810Abstract: A processing system includes an interconnect fabric coupleable to a local memory and at least one compute cluster coupled to the interconnect fabric. The compute cluster includes a processor core and a cache hierarchy. The cache hierarchy has a plurality of caches and a throttle controller configured to throttle a rate of memory requests issuable by the processor core based on at least one of an access latency metric and a prefetch accuracy metric. The access latency metric represents an average access latency for memory requests for the processor core and the prefetch accuracy metric represents an accuracy of a prefetcher of a cache of the cache hierarchy.Type: GrantFiled: December 12, 2017Date of Patent: April 5, 2022Assignee: Advanced Micro Devices, Inc.Inventors: William L. Walker, William E. Jones
-
Publication number: 20210116956Abstract: An integrated circuit (IC) device includes a ring transport having a plurality of nodes and a wire interconnect coupling the plurality of nodes in a ring. The wire interconnect including a wire to transmit clock wake signals around the ring transport in advance of data signaling representing a data packet. Each node is to switch from a clock gated state to a clocked state responsive to receiving a clock wake signal. The ring transport further includes a sleep controller coupled to a select node of the plurality of nodes. The sleep controller is to configure the select node into a clock suppression state for a specified duration responsive to identifying an idle condition on the ring transport via monitoring of the wire. While in the clock suppression state the node suppresses further transmission of any clock wake signals received at the select node.Type: ApplicationFiled: October 22, 2019Publication date: April 22, 2021Inventor: William L. WALKER
-
Patent number: 10956332Abstract: A processor core associated with a first cache initiates entry into a powered-down state. In response, information representing a set of entries of the first cache are stored in a retention region that receives a retention voltage while the processor core is in a powered-down state. Information indicating one or more invalidated entries of the set of entries is also stored in the retention region. In response to the processor core initiating exit from the powered-down state, entries of the first cache are restored using the stored information representing the entries and the stored information indicating the at least one invalidated entry.Type: GrantFiled: November 1, 2017Date of Patent: March 23, 2021Assignee: ADVANCED MICRO DEVICES, INC.Inventors: William L. Walker, Michael L. Golden, Marius Evers
-
Publication number: 20210056031Abstract: A processor core associated with a first cache initiates entry into a powered-down state. In response, information representing a set of entries of the first cache are stored in a retention region that receives a retention voltage while the processor core is in a powered-down state. Information indicating one or more invalidated entries of the set of entries is also stored in the retention region. In response to the processor core initiating exit from the powered-down state, entries of the first cache are restored using the stored information representing the entries and the stored information indicating the at least one invalidated entry.Type: ApplicationFiled: November 6, 2020Publication date: February 25, 2021Inventors: William L. WALKER, Michael L. GOLDEN, Marius EVERS
-
Patent number: 10740029Abstract: A processing system employs an expandable memory buffer that supports enlarging the memory buffer when the processing system generates a large number of long latency memory transactions. The hybrid structure of the memory buffer allows a memory controller of the processing system to store a larger number of memory transactions while still maintaining adequate transaction throughput and also ensuring a relatively small buffer footprint and power consumption. Further, the hybrid structure allows different portions of the buffer to be placed on separate integrated circuit dies, which in turn allows the memory controller to be used in a wide variety of integrated circuit configurations, including configurations that use only one portion of the memory buffer.Type: GrantFiled: November 28, 2017Date of Patent: August 11, 2020Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Gabriel H. Loh, William L. Walker
-
Patent number: 10705958Abstract: A processor partitions a coherency directory into different regions for different processor cores and manages the number of entries allocated to each region based at least in part on monitored recall costs indicating expected resource costs for reallocating entries. Examples of monitored recall costs include a number of cache evictions associated with entry reallocation, the hit rate of each region of the coherency directory, and the like, or a combination thereof. By managing the entries allocated to each region based on the monitored recall costs, the processor ensures that processor cores associated with denser memory access patterns (that is, memory access patterns that more frequently access cache lines associated with the same memory pages) are assigned more entries of the coherency directory.Type: GrantFiled: August 22, 2018Date of Patent: July 7, 2020Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Michael W. Boyer, Gabriel H. Loh, Yasuko Eckert, William L. Walker
-
Publication number: 20200065246Abstract: A processor partitions a coherency directory into different regions for different processor cores and manages the number of entries allocated to each region based at least in part on monitored recall costs indicating expected resource costs for reallocating entries. Examples of monitored recall costs include a number of cache evictions associated with entry reallocation, the hit rate of each region of the coherency directory, and the like, or a combination thereof. By managing the entries allocated to each region based on the monitored recall costs, the processor ensures that processor cores associated with denser memory access patterns (that is, memory access patterns that more frequently access cache lines associated with the same memory pages) are assigned more entries of the coherency directory.Type: ApplicationFiled: August 22, 2018Publication date: February 27, 2020Inventors: Michael W. BOYER, Gabriel H. LOH, Yasuko ECKERT, William L. WALKER
-
Patent number: 10366027Abstract: A method for steering data for an I/O write operation includes, in response to receiving the I/O write operation, identifying, at an interconnect fabric, a cache of one of a plurality of compute complexes as a target cache for steering the data based on at least one of: a software-provided steering indicator, a steering configuration implemented at boot initialization, and coherency information for a cacheline associated with the data. The method further includes directing, via the interconnect fabric, the identified target cache to cache the data from the I/O write operation. The data is temporarily buffered at the interconnect fabric, and if the target cache attempts to fetch the data while the data is still buffered at the interconnect fabric, the interconnect fabric provides a copy of the buffered data in response to the fetch operation instead of initiating a memory access operation to obtain the data from memory.Type: GrantFiled: November 29, 2017Date of Patent: July 30, 2019Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Eric Christopher Morton, Elizabeth Cooper, William L. Walker, Douglas Benson Hunt, Richard Martin Born, Richard H. Lee, Paul C. Miranda, Philip Ng, Paul Moyer
-
Publication number: 20190179770Abstract: A processing system rinses, from a cache, those cache lines that share the same memory page as a cache line identified for eviction. A cache controller of the processing system identifies a cache line as scheduled for eviction. In response, the cache controller, identifies additional “dirty victim” cache lines (cache lines that have been modified at the cache and not yet written back to memory) that are associated with the same memory page, and writes each of the identified cache lines to the same memory page. By writing each of the dirty victim cache lines associated with the memory page to memory, the processing system reduces memory overhead and improves processing efficiency.Type: ApplicationFiled: December 12, 2017Publication date: June 13, 2019Inventors: William L. WALKER, William E. JONES