Patents by Inventor Winnie W. Yeung
Winnie W. Yeung has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12360899
Abstract: Techniques are disclosed relating to graphics processor data caches. In some embodiments, datapath circuitry executes instructions that operate on input operands from architectural registers. Data cache circuitry caches architectural register data for the datapath circuitry. Scoreboard circuitry tracks, for a given architectural register: map information that indicates whether the architectural register is mapped to an entry of the data cache circuitry and a pointer to the entry of the data cache circuitry. Tiered scoreboard circuitry and data storage circuitry may be implemented (e.g., to provide fast scoreboard access for active threads and to give a landing spot for long-latency data retrieval operations). Various disclosed techniques may improve cache performance, reduce power consumption, reduce area, or some combination thereof.
Type: Grant
Filed: January 11, 2024
Date of Patent: July 15, 2025
Assignee: Apple Inc.
Inventors: Winnie W. Yeung, Zelin Zhang, Cheng Li, Hungse Cha, Leela Kishore Kothamasu
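The abstract describes per-register scoreboard state: a mapped flag plus a pointer into the data cache. The following is a minimal Python sketch of that bookkeeping only, not the patented circuit; the class name, the free-list allocation, and the dict-based records are illustrative assumptions.

```python
# Software model of a register-cache scoreboard: for each architectural
# register it tracks whether the register is mapped to a data-cache entry
# and, if so, a pointer (index) to that entry. Illustrative only.

class RegisterScoreboard:
    def __init__(self, num_cache_entries):
        self.free_entries = list(range(num_cache_entries))
        # per-register record: {"mapped": bool, "entry": cache-entry index}
        self.records = {}

    def lookup(self, reg):
        """Return the cache entry holding `reg`, or None on a miss."""
        rec = self.records.get(reg)
        if rec and rec["mapped"]:
            return rec["entry"]
        return None

    def map_register(self, reg):
        """Bind `reg` to a free data-cache entry and record the pointer."""
        if not self.free_entries:
            raise RuntimeError("no free entries; unmap a victim first")
        entry = self.free_entries.pop()
        self.records[reg] = {"mapped": True, "entry": entry}
        return entry

    def unmap_register(self, reg):
        """Release the entry, e.g. when its cache line is evicted."""
        rec = self.records.pop(reg, None)
        if rec and rec["mapped"]:
            self.free_entries.append(rec["entry"])
```

The tiered scoreboard mentioned in the abstract would layer a small fast structure for active threads over this; that tier is omitted here.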
-
Publication number: 20250104181
Abstract: Techniques are disclosed relating to data compression in graphics processors. In some embodiments, cache circuitry is coupled to shader processor circuitry and is configured to store graphics data that includes a compressed block of data associated with a surface and metadata for the compressed block of data. Metadata coherence circuitry may cache the metadata for the compressed block of data, receive an indication of a write command for non-compressed data associated with the surface, wherein the write command identifies the metadata and has a different address than the compressed block of data, and determine, based on the metadata and the indication, to invalidate the compressed block of data in the cache circuitry. This may maintain read/write coherence in a cache that stores both compressed and uncompressed data, in some embodiments.
Type: Application
Filed: August 6, 2024
Publication date: March 27, 2025
Inventors: Karthik Ramani, Tyson J. Bergland, Leela Kishore Kothamasu, Hongzhou Zhao, Winnie W. Yeung, Mladen Wilder
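A hedged Python sketch of the coherence rule in the abstract: when an uncompressed write hits a surface whose compressed block is cached at a different address, the compressed copy is stale and gets invalidated. The class name and the surface-keyed metadata dict are assumptions for illustration.

```python
# Software analogue of metadata-driven invalidation for a cache that holds
# both compressed and uncompressed data for the same surface.

class MetadataCoherence:
    def __init__(self):
        self.cache = {}     # address -> cached block (compressed or not)
        self.metadata = {}  # surface id -> address of its compressed block

    def cache_compressed(self, surface, addr, block):
        self.cache[addr] = block
        self.metadata[surface] = addr

    def write_uncompressed(self, surface, addr, data):
        comp_addr = self.metadata.get(surface)
        # Same surface, different address: the cached compressed block no
        # longer reflects the surface contents, so invalidate it.
        if comp_addr is not None and comp_addr != addr:
            self.cache.pop(comp_addr, None)
            self.metadata.pop(surface, None)
        self.cache[addr] = data
```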
-
Publication number: 20250103493
Abstract: Techniques are disclosed relating to graphics processor data caches. In some embodiments, datapath circuitry executes instructions that operate on input operands from architectural registers. Data cache circuitry caches architectural register data for the datapath circuitry. Scoreboard circuitry tracks, for a given architectural register: map information that indicates whether the architectural register is mapped to an entry of the data cache circuitry and a pointer to the entry of the data cache circuitry. Tiered scoreboard circuitry and data storage circuitry may be implemented (e.g., to provide fast scoreboard access for active threads and to give a landing spot for long-latency data retrieval operations). Various disclosed techniques may improve cache performance, reduce power consumption, reduce area, or some combination thereof.
Type: Application
Filed: January 11, 2024
Publication date: March 27, 2025
Inventors: Winnie W. Yeung, Zelin Zhang, Cheng Li, Hungse Cha, Leela Kishore Kothamasu
-
Publication number: 20250094357
Abstract: Techniques are disclosed relating to eviction control for cache lines that store register data. In some embodiments, memory hierarchy circuitry is configured to provide memory backing for register operand data in one or more cache circuits. Lock circuitry may control a first set of lock indicators for a set of registers for a first thread, including to assert one or more lock indicators for registers that are indicated, by decode circuitry, as being utilized by decoded instructions of the first thread. The lock circuitry may preserve register operand data in the one or more cache circuits, including to prevent eviction of a given cache line from a cache circuit based on an asserted lock indicator. The lock circuitry may clear the first set of lock indicators in response to a reset event. Disclosed techniques may advantageously retain relevant register information in the cache with limited control circuit area.
Type: Application
Filed: November 27, 2024
Publication date: March 20, 2025
Inventors: Jonathan M. Redshaw, Winnie W. Yeung, Benjiman L. Goodman, David K. Li, Zelin Zhang, Yoong Chert Foo
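An illustrative Python sketch of the lock-indicator mechanism: decode asserts locks for registers its instructions will use, eviction is refused while a lock is asserted, and a reset event clears the whole set at once. Names and the set-based representation are assumptions, not the hardware design.

```python
# Lock bits that pin cache lines holding register operands needed by
# decoded-but-not-yet-executed instructions of a thread.

class LockedRegisterCache:
    def __init__(self):
        self.lines = {}      # register -> cached operand data
        self.locked = set()  # registers whose lines must not be evicted

    def fill(self, reg, data):
        self.lines[reg] = data

    def on_decode(self, used_registers):
        """Decode circuitry reports registers an instruction will use."""
        self.locked.update(used_registers)

    def try_evict(self, reg):
        """Eviction is refused while the lock indicator is asserted."""
        if reg in self.locked:
            return False
        self.lines.pop(reg, None)
        return True

    def on_reset_event(self):
        """Clear the entire set of lock indicators in one step."""
        self.locked.clear()
```

Clearing locks in bulk on a reset event, rather than tracking per-instruction release, is what keeps the control area small in the abstract's framing.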
-
Patent number: 12248399
Abstract: Techniques are disclosed relating to multi-block fetches for cache misses. In some embodiments, cache tag circuitry maintains a tag value that is shared by multiple cache blocks. In response to a miss, the cache may initiate a fetch request to a next level cache or memory. Aggregation circuitry may aggregate multiple fetch requests for cache blocks that share the tag value and fetch circuitry may initiate a single multi-block fetch operation to the next level cache or memory that returns cache blocks for the aggregated multiple fetch requests. In various embodiments, disclosed techniques may improve performance (e.g., by reducing fetch bus transactions), reduce power consumption, or both, relative to traditional techniques.
Type: Grant
Filed: May 19, 2021
Date of Patent: March 11, 2025
Assignee: Apple Inc.
Inventors: Winnie W. Yeung, Cheng Li
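A minimal Python sketch of the aggregation idea: misses to blocks that share one tag are collected so that a single multi-block fetch goes to the next level instead of one bus transaction per block. The class, the drain model, and the fetch callback are illustrative assumptions.

```python
# Aggregate per-block fetch requests that share a cache tag into one
# multi-block fetch to the next-level cache or memory.

class FetchAggregator:
    def __init__(self, fetch_fn):
        self.pending = {}        # shared tag -> set of missed block indices
        self.fetch_fn = fetch_fn # issues one multi-block fetch

    def on_miss(self, tag, block_index):
        self.pending.setdefault(tag, set()).add(block_index)

    def drain(self):
        """One fetch per shared tag, covering all aggregated blocks."""
        for tag, blocks in self.pending.items():
            self.fetch_fn(tag, sorted(blocks))
        self.pending.clear()

# Example: two misses under tag 0x40 become a single two-block fetch.
agg = FetchAggregator(lambda tag, blocks: print(f"fetch tag={tag:#x} blocks={blocks}"))
agg.on_miss(0x40, 0)
agg.on_miss(0x40, 2)
agg.drain()  # -> fetch tag=0x40 blocks=[0, 2]
```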
-
Patent number: 12182037
Abstract: Techniques are disclosed relating to eviction control for cache lines that store register data. In some embodiments, memory hierarchy circuitry is configured to provide memory backing for register operand data in one or more cache circuits. Lock circuitry may control a first set of lock indicators for a set of registers for a first thread, including to assert one or more lock indicators for registers that are indicated, by decode circuitry, as being utilized by decoded instructions of the first thread. The lock circuitry may preserve register operand data in the one or more cache circuits, including to prevent eviction of a given cache line from a cache circuit based on an asserted lock indicator. The lock circuitry may clear the first set of lock indicators in response to a reset event. Disclosed techniques may advantageously retain relevant register information in the cache with limited control circuit area.
Type: Grant
Filed: February 23, 2023
Date of Patent: December 31, 2024
Assignee: Apple Inc.
Inventors: Jonathan M. Redshaw, Winnie W. Yeung, Benjiman L. Goodman, David K. Li, Zelin Zhang, Yoong Chert Foo
-
Publication number: 20240289282
Abstract: Techniques are disclosed relating to eviction control for cache lines that store register data. In some embodiments, memory hierarchy circuitry is configured to provide memory backing for register operand data in one or more cache circuits. Lock circuitry may control a first set of lock indicators for a set of registers for a first thread, including to assert one or more lock indicators for registers that are indicated, by decode circuitry, as being utilized by decoded instructions of the first thread. The lock circuitry may preserve register operand data in the one or more cache circuits, including to prevent eviction of a given cache line from a cache circuit based on an asserted lock indicator. The lock circuitry may clear the first set of lock indicators in response to a reset event. Disclosed techniques may advantageously retain relevant register information in the cache with limited control circuit area.
Type: Application
Filed: February 23, 2023
Publication date: August 29, 2024
Inventors: Jonathan M. Redshaw, Winnie W. Yeung, Benjiman L. Goodman, David K. Li, Zelin Zhang, Yoong Chert Foo
-
Patent number: 11947462
Abstract: Techniques are disclosed relating to cache footprint management. In some embodiments, execution circuitry is configured to perform operations for instructions from multiple threads in parallel. Cache circuitry may store information operated on by threads executed by the execution circuitry. Scheduling circuitry may arbitrate among threads to schedule threads for execution by the execution circuitry. Tracking circuitry may determine one or more performance metrics for the cache circuitry. Control circuitry may, based on the one or more performance metrics meeting a threshold, reduce a limit on a number of threads considered for arbitration by the scheduling circuitry, to control a footprint of information stored by the cache circuitry. Disclosed techniques may advantageously reduce or avoid cache thrashing for certain processor workloads.
Type: Grant
Filed: March 3, 2022
Date of Patent: April 2, 2024
Assignee: Apple Inc.
Inventors: Yoong Chert Foo, Terence M. Potter, Donald R. DeSota, Benjiman L. Goodman, Aroun Demeure, Cheng Li, Winnie W. Yeung
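A hedged Python sketch of the feedback loop described here: when a cache performance metric (miss rate is used below as a stand-in) crosses a threshold, the limit on threads considered for arbitration shrinks, which shrinks the cache footprint. The thresholds, step sizes, and hysteresis are invented for illustration.

```python
# Throttle the thread-arbitration limit based on a cache miss-rate metric
# to keep the cache footprint under control.

class ThreadThrottle:
    def __init__(self, max_threads, miss_rate_threshold=0.5):
        self.max_threads = max_threads
        self.limit = max_threads          # threads eligible for arbitration
        self.threshold = miss_rate_threshold

    def update(self, misses, accesses):
        miss_rate = misses / max(accesses, 1)
        if miss_rate > self.threshold and self.limit > 1:
            self.limit -= 1               # likely thrashing: consider fewer threads
        elif miss_rate < self.threshold / 2 and self.limit < self.max_threads:
            self.limit += 1               # healthy: restore parallelism

    def schedulable(self, ready_threads):
        """Scheduler arbitrates only among the first `limit` threads."""
        return ready_threads[: self.limit]
```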
-
Patent number: 11842436
Abstract: Techniques are disclosed relating to arbitration for computer memory resources. In some embodiments, an apparatus includes queue circuitry that implements multiple queues configured to queue requests to access a memory bus. Control circuitry may, in response to detecting a first threshold condition associated with the queue circuitry, generate a first snapshot that indicates numbers of requests in respective queues of the multiple queues at a first time. The control circuitry may generate a second snapshot that indicates numbers of requests in respective queues of the multiple queues at a second time that is subsequent to the first time. The control circuitry may arbitrate between requests from the multiple queues to select requests to access the memory bus, where the arbitration is based on snapshots to which requests from the multiple queues belong. Disclosed techniques may approximate age-based scheduling while reducing area and power consumption.
Type: Grant
Filed: August 1, 2022
Date of Patent: December 12, 2023
Assignee: Apple Inc.
Inventors: Winnie W. Yeung, Leela Kishore Kothamasu, Zelin Zhang, Guanlan Xu, Eddie M. Robinson
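An illustrative Python model of snapshot-based arbitration: rather than tracking an exact age per request, each request is tagged with the snapshot that was current when it was queued, and arbitration grants the oldest snapshot first. How ties are broken within a snapshot, and when snapshots are taken, are assumptions beyond the abstract.

```python
# Approximate age-based scheduling: group queued requests by snapshot and
# grant requests belonging to the oldest snapshot first.

from collections import deque

class SnapshotArbiter:
    def __init__(self, num_queues):
        self.queues = [deque() for _ in range(num_queues)]
        self.snapshot_id = 0

    def enqueue(self, queue_idx, request):
        # Tag the request with the snapshot current at enqueue time.
        self.queues[queue_idx].append((self.snapshot_id, request))

    def take_snapshot(self):
        """Called when a threshold condition on the queues is detected."""
        self.snapshot_id += 1

    def arbitrate(self):
        """Grant a request from the queue whose head is oldest by snapshot."""
        best = None
        for q in self.queues:
            if q and (best is None or q[0][0] < best[0][0]):
                best = q
        return best.popleft()[1] if best else None
```

Storing one small snapshot counter per request, instead of a full age, is what trades a little scheduling precision for the area and power savings the abstract claims.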
-
Publication number: 20220375161
Abstract: Techniques are disclosed relating to arbitration for computer memory resources. In some embodiments, an apparatus includes queue circuitry that implements multiple queues configured to queue requests to access a memory bus. Control circuitry may, in response to detecting a first threshold condition associated with the queue circuitry, generate a first snapshot that indicates numbers of requests in respective queues of the multiple queues at a first time. The control circuitry may generate a second snapshot that indicates numbers of requests in respective queues of the multiple queues at a second time that is subsequent to the first time. The control circuitry may arbitrate between requests from the multiple queues to select requests to access the memory bus, where the arbitration is based on snapshots to which requests from the multiple queues belong. Disclosed techniques may approximate age-based scheduling while reducing area and power consumption.
Type: Application
Filed: August 1, 2022
Publication date: November 24, 2022
Inventors: Winnie W. Yeung, Leela Kishore Kothamasu, Zelin Zhang, Guanlan Xu, Eddie M. Robinson
-
Publication number: 20220374359
Abstract: Techniques are disclosed relating to multi-block fetches for cache misses. In some embodiments, cache tag circuitry maintains a tag value that is shared by multiple cache blocks. In response to a miss, the cache may initiate a fetch request to a next level cache or memory. Aggregation circuitry may aggregate multiple fetch requests for cache blocks that share the tag value and fetch circuitry may initiate a single multi-block fetch operation to the next level cache or memory that returns cache blocks for the aggregated multiple fetch requests. In various embodiments, disclosed techniques may improve performance (e.g., by reducing fetch bus transactions), reduce power consumption, or both, relative to traditional techniques.
Type: Application
Filed: May 19, 2021
Publication date: November 24, 2022
Inventors: Winnie W. Yeung, Cheng Li
-
Patent number: 11488350
Abstract: Techniques are disclosed relating to compression of data stored at different cache levels. In some embodiments, a memory system implements a storage hierarchy that includes first cache circuitry and second cache circuitry at different levels of the hierarchy. Processor circuitry generates write data to be written to the memory system. In some embodiments, first compression circuitry is configured to compress a first block of write data in response to full accumulation of the first block in the first cache circuitry and second compression circuitry is configured to compress a second block of write data in response to full accumulation of the second block in the second cache circuitry. Write circuitry may write the first and second compressed blocks of data in a single combined write to a higher level in the storage hierarchy.
Type: Grant
Filed: June 4, 2021
Date of Patent: November 1, 2022
Assignee: Apple Inc.
Inventors: Anthony P. DeLaurier, Karl D. Mann, Tyson J. Bergland, Winnie W. Yeung
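A rough Python sketch of the accumulate-then-compress flow, heavily simplified: each cache level buffers write data, compresses a block only once it has fully accumulated, and blocks from two levels can be emitted in one combined upstream write. zlib stands in for whatever hardware compression the design actually uses; the block size and helper names are assumptions.

```python
# Per-level compression on full block accumulation, with a single combined
# write of two compressed blocks to the next level of the hierarchy.

import zlib

BLOCK_SIZE = 64  # bytes; illustrative, not from the patent

class CompressingLevel:
    def __init__(self):
        self.buf = bytearray()

    def write(self, data):
        """Accumulate writes; return a compressed block once one is full."""
        self.buf += data
        if len(self.buf) >= BLOCK_SIZE:
            block = bytes(self.buf[:BLOCK_SIZE])
            del self.buf[:BLOCK_SIZE]
            return zlib.compress(block)
        return None

def combined_write(upstream, block_a, block_b):
    """One combined write carrying two independently compressed blocks."""
    if block_a is not None and block_b is not None:
        upstream.append((block_a, block_b))
```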
-
Patent number: 11467959
Abstract: Techniques are disclosed relating to caching for address translation. In some embodiments, address translation circuitry is configured to process requests to translate addresses in a first address space to addresses in a second address space. The translation circuitry may include cache circuitry configured to store translation information, arbitration circuitry configured to arbitrate among ready requests for access to entries of the cache, and hazard circuitry. The hazard circuitry may assign a first request a ready status for the arbitration circuitry based on detection of an absence of hazards for a first address of the first request, and add a second request to a queue of requests for the arbitration circuitry based on detection of a hazard for a second address of the second request. Independent arbitration for requests without hazards may improve performance in various aspects, relative to traditional techniques.
Type: Grant
Filed: May 19, 2021
Date of Patent: October 11, 2022
Assignee: Apple Inc.
Inventors: Winnie W. Yeung, Cheng Li
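A minimal Python sketch of the hazard split in the abstract: a translation request with no address hazard goes straight into the ready pool the arbiter draws from, while a hazarded request waits in a queue until the conflicting address drains. The in-flight set and the resubmission scheme are assumptions about structure the abstract leaves open.

```python
# Route hazard-free translation requests to independent arbitration;
# queue requests whose address conflicts with one already in flight.

from collections import deque

class TranslationHazardTracker:
    def __init__(self):
        self.in_flight = set()  # addresses currently being translated
        self.ready = []         # hazard-free requests, arbitrated freely
        self.waiting = deque()  # requests stalled behind an address hazard

    def submit(self, request, addr):
        if addr in self.in_flight:
            self.waiting.append((request, addr))  # hazard detected
        else:
            self.in_flight.add(addr)
            self.ready.append((request, addr))    # eligible for arbitration

    def complete(self, addr):
        """The address is no longer hazardous; re-submit its waiters."""
        self.in_flight.discard(addr)
        pending = list(self.waiting)
        self.waiting.clear()
        for req, a in pending:
            self.submit(req, a)  # re-sorts into ready or waiting
```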
-
Patent number: 11443479
Abstract: Techniques are disclosed relating to arbitration for computer memory resources. In some embodiments, an apparatus includes queue circuitry that implements multiple queues configured to queue requests to access a memory bus. Control circuitry may, in response to detecting a first threshold condition associated with the queue circuitry, generate a first snapshot that indicates numbers of requests in respective queues of the multiple queues at a first time. The control circuitry may generate a second snapshot that indicates numbers of requests in respective queues of the multiple queues at a second time that is subsequent to the first time. The control circuitry may arbitrate between requests from the multiple queues to select requests to access the memory bus, where the arbitration is based on snapshots to which requests from the multiple queues belong. Disclosed techniques may approximate age-based scheduling while reducing area and power consumption.
Type: Grant
Filed: May 19, 2021
Date of Patent: September 13, 2022
Assignee: Apple Inc.
Inventors: Winnie W. Yeung, Leela Kishore Kothamasu, Zelin Zhang, Guanlan Xu, Eddie M. Robinson
-
Publication number: 20210295593
Abstract: Techniques are disclosed relating to compression of data stored at different cache levels. In some embodiments, a memory system implements a storage hierarchy that includes first cache circuitry and second cache circuitry at different levels of the hierarchy. Processor circuitry generates write data to be written to the memory system. In some embodiments, first compression circuitry is configured to compress a first block of write data in response to full accumulation of the first block in the first cache circuitry and second compression circuitry is configured to compress a second block of write data in response to full accumulation of the second block in the second cache circuitry. Write circuitry may write the first and second compressed blocks of data in a single combined write to a higher level in the storage hierarchy.
Type: Application
Filed: June 4, 2021
Publication date: September 23, 2021
Inventors: Anthony P. DeLaurier, Karl D. Mann, Tyson J. Bergland, Winnie W. Yeung
-
Patent number: 11062507
Abstract: Techniques are disclosed relating to compression of data stored at different cache levels. In some embodiments, programmable shader circuitry is configured to execute program instructions of compute kernels that write pixel data. In some embodiments, a first cache is configured to store pixel write data from the programmable shader circuitry and first compression circuitry is configured to compress a first block of pixel write data in response to full accumulation of the first block in the first cache circuitry. In some embodiments, second cache circuitry is configured to store pixel write data from the programmable shader circuitry at a higher level in a storage hierarchy than the first cache circuitry and second compression circuitry is configured to compress a second block of pixel write data in response to full accumulation of the second block in the second cache circuitry.
Type: Grant
Filed: November 4, 2019
Date of Patent: July 13, 2021
Assignee: Apple Inc.
Inventors: Anthony P. DeLaurier, Karl D. Mann, Tyson J. Bergland, Winnie W. Yeung
-
Publication number: 20210134052
Abstract: Techniques are disclosed relating to compression of data stored at different cache levels. In some embodiments, programmable shader circuitry is configured to execute program instructions of compute kernels that write pixel data. In some embodiments, a first cache is configured to store pixel write data from the programmable shader circuitry and first compression circuitry is configured to compress a first block of pixel write data in response to full accumulation of the first block in the first cache circuitry. In some embodiments, second cache circuitry is configured to store pixel write data from the programmable shader circuitry at a higher level in a storage hierarchy than the first cache circuitry and second compression circuitry is configured to compress a second block of pixel write data in response to full accumulation of the second block in the second cache circuitry.
Type: Application
Filed: November 4, 2019
Publication date: May 6, 2021
Inventors: Anthony P. DeLaurier, Karl D. Mann, Tyson J. Bergland, Winnie W. Yeung
-
Patent number: 10970223
Abstract: Systems, apparatuses, and methods for efficiently allocating data in a cache are described. In various embodiments, a processor decodes an indication in a software application identifying a temporal data set. The data set is flagged with a data set identifier (DSID) indicating temporal data to drop after consumption. When the data set is allocated in a cache, the data set is stored with a non-replaceable attribute to prevent a cache replacement policy from evicting the data set before it is dropped. A drop command with an indication of the DSID of the data set is later issued after the data set is read (consumed). A copy of the data set is not written back to the lower-level memory although the data set is removed from the cache. An interrupt is generated to notify firmware or other software of the completion of the drop command.
Type: Grant
Filed: May 13, 2019
Date of Patent: April 6, 2021
Assignee: Apple Inc.
Inventors: Wolfgang H. Klingauf, Kenneth C. Dyke, Karthik Ramani, Winnie W. Yeung, Anthony P. DeLaurier, Luc R. Semeria, David A. Gotwalt, Srinivasa Rangan Sridharan, Muditha Kanchana
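An illustrative Python model of the drop-on-consume flow: a data set tagged with a DSID is allocated with a non-replaceable attribute, a later drop command removes it without any write-back, and a callback stands in for the completion interrupt. Every identifier below is an assumption made for the sketch, not the patented design.

```python
# DSID-tagged temporal data: pin it against replacement while live, then
# drop it after consumption with no write-back to lower-level memory.

class DropCache:
    def __init__(self, backing_store, on_drop_complete):
        self.lines = {}              # DSID -> cached data set
        self.non_replaceable = set() # DSIDs the replacement policy must skip
        self.backing = backing_store # deliberately untouched by drop()
        self.on_drop_complete = on_drop_complete

    def allocate(self, dsid, data):
        self.lines[dsid] = data
        self.non_replaceable.add(dsid)  # protect until explicitly dropped

    def evict_candidates(self):
        """Replacement policy may only consider replaceable lines."""
        return [d for d in self.lines if d not in self.non_replaceable]

    def drop(self, dsid):
        """Remove the consumed data set; note that self.backing is never
        written, modeling the skipped write-back."""
        self.lines.pop(dsid, None)
        self.non_replaceable.discard(dsid)
        self.on_drop_complete(dsid)     # stands in for the interrupt
```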
-
Publication number: 20190266102
Abstract: Systems, apparatuses, and methods for efficiently allocating data in a cache are described. In various embodiments, a processor decodes an indication in a software application identifying a temporal data set. The data set is flagged with a data set identifier (DSID) indicating temporal data to drop after consumption. When the data set is allocated in a cache, the data set is stored with a non-replaceable attribute to prevent a cache replacement policy from evicting the data set before it is dropped. A drop command with an indication of the DSID of the data set is later issued after the data set is read (consumed). A copy of the data set is not written back to the lower-level memory although the data set is removed from the cache. An interrupt is generated to notify firmware or other software of the completion of the drop command.
Type: Application
Filed: May 13, 2019
Publication date: August 29, 2019
Inventors: Wolfgang H. Klingauf, Kenneth C. Dyke, Karthik Ramani, Winnie W. Yeung, Anthony P. DeLaurier, Luc R. Semeria, David A. Gotwalt, Srinivasa Rangan Sridharan, Muditha Kanchana
-
Patent number: 10289565
Abstract: Systems, apparatuses, and methods for efficiently allocating data in a cache are described. In various embodiments, a processor decodes an indication in a software application identifying a temporal data set. The data set is flagged with a data set identifier (DSID) indicating temporal data to drop after consumption. When the data set is allocated in a cache, the data set is stored with a non-replaceable attribute to prevent a cache replacement policy from evicting the data set before it is dropped. A drop command with an indication of the DSID of the data set is later issued after the data set is read (consumed). A copy of the data set is not written back to the lower-level memory although the data set is removed from the cache. An interrupt is generated to notify firmware or other software of the completion of the drop command.
Type: Grant
Filed: May 31, 2017
Date of Patent: May 14, 2019
Assignee: Apple Inc.
Inventors: Wolfgang H. Klingauf, Kenneth C. Dyke, Karthik Ramani, Winnie W. Yeung, Anthony P. DeLaurier, Luc R. Semeria, David A. Gotwalt, Srinivasa Rangan Sridharan, Muditha Kanchana