Patents by Inventor Yasuko ECKERT

Yasuko ECKERT has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10503640
    Abstract: A processor includes multiple processing units (e.g., processor cores), with each processing unit associated with at least one private, dedicated cache. The processor is also associated with a system memory that stores all data that can be accessed by the multiple processing units. A coherency manager (e.g., a coherence directory) of the processor enforces a specified coherency scheme to ensure data coherency between the different caches and between the caches and the system memory. In response to a memory access request to a given cache resulting in a cache miss, the coherency manager identifies the current access latency to the system memory as well as the current access latencies to other caches of the processor. The coherency manager transfers the targeted data to the given cache from the cache or system memory having the lower access latency.
    Type: Grant
    Filed: April 24, 2018
    Date of Patent: December 10, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Yasuko Eckert
  • Publication number: 20190370173
    Abstract: A miss in a cache by a thread in a wavefront is detected. The wavefront includes a plurality of threads that are executing a memory access request concurrently on a corresponding plurality of processor cores. A priority is assigned to the thread based on whether the memory access request is addressed to a local memory or a remote memory. The memory access request for the thread is performed based on the priority. In some cases, the cache is selectively bypassed depending on whether the memory access request is addressed to the local or remote memory. A cache block is requested in response to the miss. The cache block is biased towards a least recently used position in response to requesting the cache block from the local memory and towards a most recently used position in response to requesting the cache block from the remote memory.
    Type: Application
    Filed: May 30, 2018
    Publication date: December 5, 2019
    Inventors: Michael W. BOYER, Onur KAYIRAN, Yasuko ECKERT, Steven RAASCH, Muhammad SHOAIB BIN ALTAF
  • Publication number: 20190370174
    Abstract: A processing system includes a first set of one or more processing units including a first processing unit, a second set of one or more processing units including a second processing unit, and a memory having an address space shared by the first and second sets. The processing system further includes a distributed coherence directory subsystem having a first coherence directory to support a first subset of one or more address regions of the address space and a second coherence directory to support a second subset of one or more address regions of the address space. In some implementations, the first coherence directory is implemented in the system so as to have a lower access latency for the first set, whereas the second coherence directory is implemented in the system so as to have a lower access latency for the second set.
    Type: Application
    Filed: June 5, 2018
    Publication date: December 5, 2019
    Inventors: Yasuko ECKERT, Maurice B. STEINMAN, Steven RAASCH
  • Publication number: 20190324906
    Abstract: A processor includes multiple processing units (e.g., processor cores), with each processing unit associated with at least one private, dedicated cache. The processor is also associated with a system memory that stores all data that can be accessed by the multiple processing units. A coherency manager (e.g., a coherence directory) of the processor enforces a specified coherency scheme to ensure data coherency between the different caches and between the caches and the system memory. In response to a memory access request to a given cache resulting in a cache miss, the coherency manager identifies the current access latency to the system memory as well as the current access latencies to other caches of the processor. The coherency manager transfers the targeted data to the given cache from the cache or system memory having the lower access latency.
    Type: Application
    Filed: April 24, 2018
    Publication date: October 24, 2019
    Inventor: Yasuko ECKERT
  • Patent number: 10452437
    Abstract: Systems, apparatuses, and methods for performing temperature-aware task scheduling and proactive power management. A SoC includes a plurality of processing units and a task queue storing pending tasks. The SoC calculates a thermal metric for each pending task to predict an amount of heat the pending task will generate. The SoC also determines a thermal gradient for each processing unit to predict a rate at which the processing unit's temperature will change when executing a task. The SoC also monitors a thermal margin of how far each processing unit is from reaching its thermal limit. The SoC minimizes non-uniform heat generation on the SoC by scheduling pending tasks from the task queue to the processing units based on the thermal metrics for the pending tasks, the thermal gradients of each processing unit, and the thermal margin available on each processing unit.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: October 22, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Abhinandan Majumdar, Brian J. Kocoloski, Leonardo Piga, Wei Huang, Yasuko Eckert
  • Patent number: 10389251
    Abstract: The described embodiments include an apparatus that controls voltages for an integrated circuit chip having a set of circuits. The apparatus includes a switching voltage regulator separate from the integrated circuit chip and two or more low dropout (LDO) regulators fabricated on the integrated circuit chip. The switching voltage regulator provides an output voltage that is received as an input voltage by each of the two or more LDO regulators, and each of the two or more LDO regulators provides a local output voltage, each local output voltage received as a local input voltage by a different subset of the circuits in the set of circuits. During operation, a controller sets an operating point for each of the subsets of circuits based on a combined power efficiency for the subsets of the circuits and the LDO regulators, each operating point including a corresponding frequency and voltage.
    Type: Grant
    Filed: September 13, 2018
    Date of Patent: August 20, 2019
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Wei Huang, Yasuko Eckert, Xudong An, Muhammad Shoaib Bin Altaf, Jieming Yin
  • Patent number: 10339067
    Abstract: A technique for use in a memory system includes swapping a first plurality of pages of a first memory of the memory system with a second plurality of pages of a second memory of the memory system. The first memory has a first latency and the second memory has a second latency. The first latency is less than the second latency. The technique includes updating a page table and triggering a translation lookaside buffer shootdown to associate a virtual address of each of the first plurality of pages with a corresponding physical address in the second memory and to associate a virtual address for each of the second plurality of pages with a corresponding physical address in the first memory.
    Type: Grant
    Filed: June 19, 2017
    Date of Patent: July 2, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Yasuko Eckert, Thiruvengadam Vijayaraghavan, Gabriel H. Loh
  • Patent number: 10310981
    Abstract: A method and apparatus for performing memory prefetching includes determining whether to initiate prefetching. Upon a determination to initiate prefetching, a first memory row is determined as a suitable prefetch candidate, and it is determined whether a particular set of one or more cachelines of the first memory row is to be prefetched.
    Type: Grant
    Filed: September 19, 2016
    Date of Patent: June 4, 2019
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Yasuko Eckert, Nuwan Jayasena, Reena Panda, Onur Kayiran, Michael W. Boyer
  • Publication number: 20190163632
    Abstract: A method includes monitoring, at a cache coherence directory, states of cachelines stored in a cache hierarchy of a data processing system using a plurality of entries of the cache coherence directory. Each entry of the cache coherence directory is associated with a corresponding cache page of a plurality of cache pages, and each cache page representing a corresponding set of contiguous cachelines. The method further includes selectively evicting cachelines from a first cache of the cache hierarchy based on cacheline utilization densities of cache pages represented by the corresponding entries of the plurality of entries of the cache coherence directory.
    Type: Application
    Filed: November 29, 2017
    Publication date: May 30, 2019
    Inventors: William L. WALKER, Michael W. BOYER, Yasuko ECKERT, Gabriel H. LOH
  • Publication number: 20190163644
    Abstract: A first processor is configured to detect migration of a page from a second memory associated with a second processor to a first memory associated with the first processor or to detect duplication of the page in the first memory and the second memory. The first processor implements a translation lookaside buffer (TLB) and the first processor is configured to insert an entry in the TLB in response to the duplication or the migration of the page. The entry maps a virtual address of the page to a physical address in the first memory and the entry is inserted into the TLB without modifying a corresponding entry in a page table that maps the virtual address of the page to a physical address in the second memory. In some cases, a duplicate translation table (DTT) stores a copy of the entry that is accessed in response to a TLB miss.
    Type: Application
    Filed: November 29, 2017
    Publication date: May 30, 2019
    Inventors: Nuwan JAYASENA, Yasuko ECKERT
  • Patent number: 10303602
    Abstract: A processing system includes at least one central processing unit (CPU) core, at least one graphics processing unit (GPU) core, a main memory, and a coherence directory for maintaining cache coherence. The at least one CPU core receives a CPU cache flush command to flush cache lines stored in cache memory of the at least one CPU core prior to launching a GPU kernel. The coherence directory transfers data associated with a memory access request by the at least one GPU core from the main memory without issuing coherence probes to caches of the at least one CPU core.
    Type: Grant
    Filed: March 31, 2017
    Date of Patent: May 28, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Onur Kayiran, Gabriel H. Loh, Yasuko Eckert
  • Patent number: 10282295
    Abstract: A method includes monitoring, at a cache coherence directory, states of cachelines stored in a cache hierarchy of a data processing system using a plurality of entries of the cache coherence directory. Each entry of the cache coherence directory is associated with a corresponding cache page of a plurality of cache pages, and each cache page representing a corresponding set of contiguous cachelines. The method further includes selectively evicting cachelines from a first cache of the cache hierarchy based on cacheline utilization densities of cache pages represented by the corresponding entries of the plurality of entries of the cache coherence directory.
    Type: Grant
    Filed: November 29, 2017
    Date of Patent: May 7, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: William L. Walker, Michael W. Boyer, Yasuko Eckert, Gabriel H. Loh
  • Publication number: 20190123648
    Abstract: The described embodiments include an apparatus that controls voltages for an integrated circuit chip having a set of circuits. The apparatus includes a switching voltage regulator separate from the integrated circuit chip and two or more low dropout (LDO) regulators fabricated on the integrated circuit chip. The switching voltage regulator provides an output voltage that is received as an input voltage by each of the two or more LDO regulators, and each of the two or more LDO regulators provides a local output voltage, each local output voltage received as a local input voltage by a different subset of the circuits in the set of circuits. During operation, a controller sets an operating point for each of the subsets of circuits based on a combined power efficiency for the subsets of the circuits and the LDO regulators, each operating point including a corresponding frequency and voltage.
    Type: Application
    Filed: September 13, 2018
    Publication date: April 25, 2019
    Inventors: Wei Huang, Yasuko Eckert, Xudong An, Muhammad Shoaib Bin Altaf, Jieming Yin
  • Publication number: 20190065243
    Abstract: Systems, apparatuses, and methods for reducing memory power consumption without substantial performance impact by selectively delaying non-critical memory requests are disclosed. A system management unit transfers an amount of power allocated from a memory subsystem to other component(s) responsive to detecting a first condition. In one embodiment, the first condition is detecting one or more processors having tasks to execute. In response to the system management unit transferring the amount of power from the memory subsystem to one or more processors, a memory controller delays non-critical memory requests while performing critical memory requests to memory.
    Type: Application
    Filed: September 19, 2016
    Publication date: February 28, 2019
    Inventor: Yasuko Eckert
  • Patent number: 10198369
    Abstract: A data processing system includes a memory that includes a first memory bank and a second memory bank. The data processing system also includes a conflict detector connected to the memory and adapted to receive memory access information. The conflict detector tracks memory access statistics of the first memory bank, and determines if the first memory bank contains frequent row conflicts. The conflict detector also remaps a frequent row conflict in the first memory bank to the second memory bank. An indirection table is connected to the conflict detector and adapted to receive a memory access request, and redirects an address into a dynamically selected physical memory address in response to a remapping of the frequent row conflict to the second memory bank.
    Type: Grant
    Filed: March 24, 2017
    Date of Patent: February 5, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Yasuko Eckert, Reena Panda, Nuwan Jayasena
  • Patent number: 10162757
    Abstract: A distributed shared-memory system includes several nodes that each have one or more processor cores, caches, local main memory, and a directory. Each node further includes predictors that use historical memory access information to predict future coherence permission requirements and speculatively initiate coherence operations. In one embodiment, predictors are included at processor cores for monitoring a memory access stream (e.g., historical sequence of memory addresses referenced by a processor core) and predicting addresses of future accesses. In another embodiment, predictors are included at the directory of each node for monitoring memory access traffic and coherence-related activities for individual cache lines to predict future demands for particular cache lines. In other embodiments, predictors are included at both the processor cores and directory of each node.
    Type: Grant
    Filed: December 6, 2016
    Date of Patent: December 25, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Yasuko Eckert
  • Publication number: 20180365167
    Abstract: A technique for use in a memory system includes swapping a first plurality of pages of a first memory of the memory system with a second plurality of pages of a second memory of the memory system. The first memory has a first latency and the second memory has a second latency. The first latency is less than the second latency. The technique includes updating a page table and triggering a translation lookaside buffer shootdown to associate a virtual address of each of the first plurality of pages with a corresponding physical address in the second memory and to associate a virtual address for each of the second plurality of pages with a corresponding physical address in the first memory.
    Type: Application
    Filed: June 19, 2017
    Publication date: December 20, 2018
    Inventors: Yasuko Eckert, Thiruvengadam Vijayaraghavan, Gabriel H. Loh
  • Patent number: 10133678
    Abstract: In some embodiments, a method of managing cache memory includes identifying a group of cache lines in a cache memory, based on a correlation between the cache lines. The method also includes tracking evictions of cache lines in the group from the cache memory and, in response to a determination that a criterion regarding eviction of cache lines in the group from the cache memory is satisfied, selecting one or more (e.g., all) remaining cache lines in the group for eviction.
    Type: Grant
    Filed: August 28, 2013
    Date of Patent: November 20, 2018
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Yasuko Eckert, Syed Ali Jafri, Srilatha Manne, Gabriel Loh
  • Patent number: 10097091
    Abstract: The described embodiments include an apparatus that controls voltages for an integrated circuit chip having a set of circuits. The apparatus includes a switching voltage regulator separate from the integrated circuit chip and two or more low dropout (LDO) regulators fabricated on the integrated circuit chip. The switching voltage regulator provides an output voltage that is received as an input voltage by each of the two or more LDO regulators, and each of the two or more LDO regulators provides a local output voltage, each local output voltage received as a local input voltage by a different subset of the circuits in the set of circuits. During operation, a controller sets an operating point for each of the subsets of circuits based on a combined power efficiency for the subsets of the circuits and the LDO regulators, each operating point including a corresponding frequency and voltage.
    Type: Grant
    Filed: October 25, 2017
    Date of Patent: October 9, 2018
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Wei Huang, Yasuko Eckert, Xudong An, Muhammad Shoaib Bin Altaf, Jieming Yin
  • Publication number: 20180285264
    Abstract: A processing system includes at least one central processing unit (CPU) core, at least one graphics processing unit (GPU) core, a main memory, and a coherence directory for maintaining cache coherence. The at least one CPU core receives a CPU cache flush command to flush cache lines stored in cache memory of the at least one CPU core prior to launching a GPU kernel. The coherence directory transfers data associated with a memory access request by the at least one GPU core from the main memory without issuing coherence probes to caches of the at least one CPU core.
    Type: Application
    Filed: March 31, 2017
    Publication date: October 4, 2018
    Inventors: Onur KAYIRAN, Gabriel H. LOH, Yasuko ECKERT