Patents by Inventor Yasuko ECKERT

Yasuko ECKERT has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

VARIATION-AWARE QUBIT MOVEMENT SCHEME FOR NOISE INTERMEDIATE SCALE QUANTUM ERA COMPUTERS

Publication number: 20210182234

Abstract: Systems and methods for efficiently routing qubits in a quantum computing system include selecting bubble nodes and routing qubits to the bubble nodes. The systems and methods further include dividing a system of nodes into regions and selecting a bubble node for each region. The systems and methods further include using super bubble nodes with reliable links connected to other super bubble nodes and bubble nodes to improve cross-region operations.

Type: Application

Filed: December 16, 2019

Publication date: June 17, 2021

Applicant: Advanced Micro Devices, Inc.

Inventors: Majed Valad Beigi, Yasuko Eckert, Dongping Zhang
ADAPTIVE CACHE MANAGEMENT BASED ON PROGRAMMING MODEL INFORMATION

Publication number: 20210182193

Abstract: A processing system selectively allocates space to store a group of one or more cache lines at a cache level of a cache hierarchy having a plurality of cache levels based on memory access patterns of a software application executing at the processing system. The processing system generates bit vectors indicating which cache levels are to allocate space to store groups of one or more cache lines based on the memory access patterns, which are derived from data granularity and movement information. Based on the bit vectors, the processing system provides hints to the cache hierarchy indicating the lowest cache level that can exploit the reuse potential for a particular data.

Type: Application

Filed: December 13, 2019

Publication date: June 17, 2021

Inventors: Weon Taek NA, Jagadish B. KOTRA, Yasuko ECKERT, Steven RAASCH, Sergey BLAGODUROV
CACHE MANAGEMENT BASED ON ACCESS TYPE PRIORITY

Publication number: 20210182216

Abstract: Systems, apparatuses, and methods for cache management based on access type priority are disclosed. A system includes at least a processor and a cache. During a program execution phase, certain access types are more likely to cause demand hits in the cache than others. Demand hits are load and store hits to the cache. A run-time profiling mechanism is employed to find which access types are more likely to cause demand hits. Based on the profiling results, the cache lines that will likely be accessed in the future are retained based on their most recent access type. The goal is to increase demand hits and thereby improve system performance. An efficient cache replacement policy can potentially reduce redundant data movement, thereby improving system performance and reducing energy consumption.

Type: Application

Filed: December 16, 2019

Publication date: June 17, 2021

Inventors: Jieming Yin, Yasuko Eckert, Subhash Sethumurugan
MEMORY REQUEST PRIORITY ASSIGNMENT TECHNIQUES FOR PARALLEL PROCESSORS

Publication number: 20210173796

Abstract: Systems, apparatuses, and methods for implementing memory request priority assignment techniques for parallel processors are disclosed. A system includes at least a parallel processor coupled to a memory subsystem, where the parallel processor includes at least a plurality of compute units for executing wavefronts in lock-step. The parallel processor assigns priorities to memory requests of wavefronts on a per-work-item basis by indexing into a first priority vector, with the index generated based on lane-specific information. If a given event is detected, a second priority vector is generated by applying a given priority promotion vector to the first priority vector. Then, for subsequent wavefronts, memory requests are assigned priorities by indexing into the second priority vector with lane-specific information. The use of priority vectors to assign priorities to memory requests helps to reduce the memory divergence problem experienced by different work-items of a wavefront.

Type: Application

Filed: December 6, 2019

Publication date: June 10, 2021

Inventors: Sooraj Puthoor, Kishore Punniyamurthy, Onur Kayiran, Xianwei Zhang, Yasuko Eckert, Johnathan Alsop, Bradford Michael Beckmann
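
The priority-vector idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the index is derived or how the promotion vector is applied, so the choices here are assumptions: the lane ID modulo the vector length serves as the index, and promotion is an element-wise saturating add. The names `assign_priorities` and `promote` are hypothetical.

```python
def assign_priorities(priority_vector, lane_ids):
    """Give each work-item's memory request a priority by indexing into the
    priority vector. Assumption: the index is the lane ID modulo the vector length."""
    return [priority_vector[lane % len(priority_vector)] for lane in lane_ids]


def promote(priority_vector, promotion_vector, max_priority=3):
    """Build a second priority vector from the first one.
    Assumption: promotion is an element-wise add that saturates at max_priority."""
    return [min(p + d, max_priority) for p, d in zip(priority_vector, promotion_vector)]


first_vector = [0, 1, 2, 3]
before = assign_priorities(first_vector, range(8))

# After the (unspecified) event is detected, later wavefronts use the second vector.
second_vector = promote(first_vector, [1, 1, 0, 0])
after = assign_priorities(second_vector, range(4))
```

In this sketch the lanes of one wavefront no longer compete at equal priority, which is one plausible way to reduce memory divergence between work-items.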
Interconnect architecture for three-dimensional processing systems

Patent number: 10984838

Abstract: A processing system includes a plurality of processor cores formed in a first layer of an integrated circuit device and a plurality of partitions of memory formed in one or more second layers of the integrated circuit device. The one or more second layers are deployed in a stacked configuration with the first layer. Each of the partitions is associated with a subset of the processor cores that have overlapping footprints with the partitions. The processing system also includes first memory paths between the processor cores and their corresponding subsets of partitions. The processing system further includes second memory paths between the processor cores and the partitions.

Type: Grant

Filed: November 17, 2015

Date of Patent: April 20, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Nuwan S. Jayasena, Yasuko Eckert
CACHE MANAGEMENT BASED ON REUSE DISTANCE

Publication number: 20210109861

Abstract: A cache of a processor includes a cache controller to implement a cache management policy for the insertion and replacement of cache lines of the cache. The cache management policy assigns replacement priority levels to each cache line of at least a subset of cache lines in a region of the cache based on a comparison of a number of accesses to a cache set having a way that stores a cache line since the cache line was last accessed to a reuse distance determined for the region of the cache, wherein the reuse distance represents an average number of accesses to a given cache set of the region between accesses to any given cache line of the cache set.

Type: Application

Filed: October 14, 2019

Publication date: April 15, 2021

Inventors: Jieming YIN, Subhash SETHUMURUGAN, Yasuko ECKERT
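
As a rough illustration of the comparison described in the abstract above, the sketch below counts accesses to a cache set since a line was last touched and compares that count to the region's reuse distance. The two-level priority, the rule that lines exceeding the reuse distance are evicted first, and the names `Line`, `replacement_priority`, and `choose_victim` are all assumptions, not details from the filing.

```python
from dataclasses import dataclass


@dataclass
class Line:
    tag: int
    last_access: int  # value of the set's access counter when the line was last touched


def replacement_priority(line, set_accesses, reuse_distance):
    """Return 1 (evict first) when the set has been accessed more often since
    this line's last use than the region's reuse distance, otherwise 0."""
    accesses_since = set_accesses - line.last_access
    return 1 if accesses_since > reuse_distance else 0


def choose_victim(lines, set_accesses, reuse_distance):
    """Pick the line with the highest replacement priority; ties go to the
    line that was touched longest ago."""
    return max(
        lines,
        key=lambda l: (replacement_priority(l, set_accesses, reuse_distance), -l.last_access),
    )


cache_set = [Line(0xA, 2), Line(0xB, 9), Line(0xC, 7)]
victim = choose_victim(cache_set, set_accesses=10, reuse_distance=4)
```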
Method and system for opportunistic load balancing in neural networks using metadata

Patent number: 10970120

Abstract: Methods and systems for opportunistic load balancing in deep neural networks (DNNs) using metadata. Representative computational costs are captured, obtained or determined for a given architectural, functional or computational aspect of a DNN system. The representative computational costs are implemented as metadata for the given architectural, functional or computational aspect of the DNN system. In an implementation, the computed computational cost is implemented as the metadata. A scheduler detects whether there are neurons in subsequent layers that are ready to execute. The scheduler uses the metadata and neuron availability to schedule and load balance across compute resources and available resources.

Type: Grant

Filed: June 26, 2018

Date of Patent: April 6, 2021

Assignee: Advanced Micro Devices, Inc.

Inventors: Nicholas Malaya, Yasuko Eckert
Mechanism for dynamic latency-bandwidth trade-off for efficient broadcasts/multicasts

Patent number: 10938709

Abstract: A method includes receiving, from an origin computing node, a first communication addressed to multiple destination computing nodes in a processor interconnect fabric, measuring a first set of one or more communication metrics associated with a transmission path to one or more of the multiple destination computing nodes, and for each of the destination computing nodes, based on the set of communication metrics, selecting between a multicast transmission mode and unicast transmission mode as a transmission mode for transmitting the first communication to the destination computing node.

Type: Grant

Filed: December 18, 2018

Date of Patent: March 2, 2021

Assignee: Advanced Micro Devices, Inc.

Inventors: Mohamed Assem Ibrahim, Onur Kayiran, Yasuko Eckert, Jieming Yin
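
The per-destination mode selection in the abstract above might look like the following sketch. The abstract does not name the communication metrics or the decision rule, so this example assumes a single metric (path link utilization between 0 and 1) and a fixed threshold: busy paths use multicast to save bandwidth, and idle paths use unicast. `select_modes` and `threshold` are hypothetical names.

```python
def select_modes(path_utilization, threshold=0.7):
    """Choose a transmission mode for each destination node.

    path_utilization maps a destination to the measured utilization (0.0 to 1.0)
    of the path leading to it. Assumption: multicast is preferred on busy paths
    and unicast on lightly loaded ones.
    """
    return {
        dest: ("multicast" if util >= threshold else "unicast")
        for dest, util in path_utilization.items()
    }


modes = select_modes({"node1": 0.9, "node2": 0.2, "node3": 0.7})
```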
Hierarchical register file at a graphics processing unit

Patent number: 10853904

Abstract: A processor employs a hierarchical register file for a graphics processing unit (GPU). A top level of the hierarchical register file is stored at a local memory of the GPU (e.g., a memory on the same integrated circuit die as the GPU). Lower levels of the hierarchical register file are stored at a different, larger memory, such as a remote memory located on a different die than the GPU. A register file control module monitors the status of in-flight wavefronts at the GPU, and in particular whether each in-flight wavefront is active, predicted to become active, or inactive. The register file control module places execution data for active and predicted-active wavefronts in the top level of the hierarchical register file and places execution data for inactive wavefronts at lower levels of the hierarchical register file.

Type: Grant

Filed: March 24, 2016

Date of Patent: December 1, 2020

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Yasuko Eckert, Nuwan Jayasena
Prioritizing local and remote memory access in a non-uniform memory access architecture

Patent number: 10838864

Abstract: A miss in a cache by a thread in a wavefront is detected. The wavefront includes a plurality of threads that are executing a memory access request concurrently on a corresponding plurality of processor cores. A priority is assigned to the thread based on whether the memory access request is addressed to a local memory or a remote memory. The memory access request for the thread is performed based on the priority. In some cases, the cache is selectively bypassed depending on whether the memory access request is addressed to the local or remote memory. A cache block is requested in response to the miss. The cache block is biased towards a least recently used position in response to requesting the cache block from the local memory and towards a most recently used position in response to requesting the cache block from the remote memory.

Type: Grant

Filed: May 30, 2018

Date of Patent: November 17, 2020

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Michael W. Boyer, Onur Kayiran, Yasuko Eckert, Steven Raasch, Muhammad Shoaib Bin Altaf
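
The insertion bias described at the end of the abstract above can be sketched with a simple recency list. This is a simplification under stated assumptions: "biased towards" is modeled as inserting exactly at the LRU or MRU end, the set evicts its LRU entry when full, and `insert_block` is a hypothetical helper.

```python
from collections import deque


def insert_block(recency, block, from_remote, capacity):
    """Insert a newly fetched block into a recency list.

    recency[0] is the most recently used (MRU) position and recency[-1] is the
    least recently used (LRU) position. Blocks fetched from remote memory are
    costlier to refetch, so they enter at the MRU end; blocks from local memory
    enter at the LRU end and are evicted sooner.
    """
    if len(recency) >= capacity:
        recency.pop()  # evict the current LRU block to make room
    if from_remote:
        recency.appendleft(block)
    else:
        recency.append(block)


cache_set = deque(["a", "b"])
insert_block(cache_set, "r", from_remote=True, capacity=4)
insert_block(cache_set, "l", from_remote=False, capacity=4)
insert_block(cache_set, "x", from_remote=False, capacity=4)
```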
ADAPTIVE CACHE RECONFIGURATION VIA CLUSTERING

Publication number: 20200293445

Abstract: A method of dynamic cache configuration includes determining, for a first clustering configuration, whether a current cache miss rate exceeds a miss rate threshold. The first clustering configuration includes a plurality of graphics processing unit (GPU) compute units clustered into a first plurality of compute unit clusters. The method further includes clustering, based on the current cache miss rate exceeding the miss rate threshold, the plurality of GPU compute units into a second clustering configuration having a second plurality of compute unit clusters fewer than the first plurality of compute unit clusters.

Type: Application

Filed: March 15, 2019

Publication date: September 17, 2020

Inventors: Mohamed Assem IBRAHIM, Onur KAYIRAN, Yasuko ECKERT, Gabriel H. LOH
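
A minimal sketch of the reclustering step in the abstract above follows. The abstract only states that the second configuration has fewer clusters than the first; halving the cluster count and grouping contiguous compute units are assumptions made for illustration, as are the names `recluster` and `next_configuration`.

```python
def recluster(compute_units, num_clusters):
    """Split the compute units into contiguous clusters of near-equal size."""
    size = -(-len(compute_units) // num_clusters)  # ceiling division
    return [compute_units[i:i + size] for i in range(0, len(compute_units), size)]


def next_configuration(compute_units, num_clusters, miss_rate, threshold):
    """Move to a configuration with fewer, larger clusters when the current
    miss rate exceeds the threshold. Assumption: the cluster count is halved."""
    if miss_rate > threshold and num_clusters > 1:
        num_clusters = max(1, num_clusters // 2)
    return recluster(compute_units, num_clusters)


cus = list(range(8))
unchanged = next_configuration(cus, 4, miss_rate=0.3, threshold=0.5)
merged = next_configuration(cus, 4, miss_rate=0.6, threshold=0.5)
```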
DISTRIBUTED COHERENCE DIRECTORY SUBSYSTEM WITH EXCLUSIVE DATA REGIONS

Publication number: 20200278930

Abstract: A processing system includes a first set of one or more processing units including a first processing unit, a second set of one or more processing units including a second processing unit, and a memory having an address space shared by the first and second sets. The processing system further includes a distributed coherence directory subsystem having a first coherence directory to support a first subset of one or more address regions of the address space and a second coherence directory to support a second subset of one or more address regions of the address space. In some implementations, the first coherence directory is implemented in the system so as to have a lower access latency for the first set, whereas the second coherence directory is implemented in the system so as to have a lower access latency for the second set.

Type: Application

Filed: March 17, 2020

Publication date: September 3, 2020

Inventors: Yasuko ECKERT, Maurice B. STEINMAN, Steven RAASCH
Using Predictions of Outcomes of Cache Memory Access Requests for Controlling Whether A Request Generator Sends Memory Access Requests To A Memory In Parallel With Cache Memory Access Requests

Publication number: 20200257623

Abstract: An electronic device handles memory access requests for data in a memory. The electronic device includes a memory controller for the memory, a last-level cache memory, a request generator, and a predictor. The predictor determines a likelihood that a cache memory access request for data at a given address will hit in the last-level cache memory. Based on the likelihood, the predictor determines: whether a memory access request is to be sent by the request generator to the memory controller for the data in parallel with the cache memory access request being resolved in the last-level cache memory, and, when the memory access request is to be sent, a type of memory access request that is to be sent. When the memory access request is to be sent, the predictor causes the request generator to send a memory request of the type to the memory controller.

Type: Application

Filed: February 12, 2019

Publication date: August 13, 2020

Inventors: Jieming Yin, Yasuko Eckert, Matthew R. Poremba, Steven E. Raasch, Doug Hunt
Using predictions of outcomes of cache memory access requests for controlling whether a request generator sends memory access requests to a memory in parallel with cache memory access requests

Patent number: 10719441

Abstract: An electronic device handles memory access requests for data in a memory. The electronic device includes a memory controller for the memory, a last-level cache memory, a request generator, and a predictor. The predictor determines a likelihood that a cache memory access request for data at a given address will hit in the last-level cache memory. Based on the likelihood, the predictor determines: whether a memory access request is to be sent by the request generator to the memory controller for the data in parallel with the cache memory access request being resolved in the last-level cache memory, and, when the memory access request is to be sent, a type of memory access request that is to be sent. When the memory access request is to be sent, the predictor causes the request generator to send a memory request of the type to the memory controller.

Type: Grant

Filed: February 12, 2019

Date of Patent: July 21, 2020

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Jieming Yin, Yasuko Eckert, Matthew R. Poremba, Steven E. Raasch, Doug Hunt
Coherency directory entry allocation based on eviction costs

Patent number: 10705958

Abstract: A processor partitions a coherency directory into different regions for different processor cores and manages the number of entries allocated to each region based at least in part on monitored recall costs indicating expected resource costs for reallocating entries. Examples of monitored recall costs include a number of cache evictions associated with entry reallocation, the hit rate of each region of the coherency directory, and the like, or a combination thereof. By managing the entries allocated to each region based on the monitored recall costs, the processor ensures that processor cores associated with denser memory access patterns (that is, memory access patterns that more frequently access cache lines associated with the same memory pages) are assigned more entries of the coherency directory.

Type: Grant

Filed: August 22, 2018

Date of Patent: July 7, 2020

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Michael W. Boyer, Gabriel H. Loh, Yasuko Eckert, William L. Walker
MECHANISM FOR DYNAMIC LATENCY-BANDWIDTH TRADE-OFF FOR EFFICIENT BROADCASTS/MULTICASTS

Publication number: 20200195546

Abstract: A method includes receiving, from an origin computing node, a first communication addressed to multiple destination computing nodes in a processor interconnect fabric, measuring a first set of one or more communication metrics associated with a transmission path to one or more of the multiple destination computing nodes, and for each of the destination computing nodes, based on the set of communication metrics, selecting between a multicast transmission mode and unicast transmission mode as a transmission mode for transmitting the first communication to the destination computing node.

Type: Application

Filed: December 18, 2018

Publication date: June 18, 2020

Inventors: Mohamed Assem Ibrahim, Onur Kayiran, Yasuko Eckert, Jieming Yin
MECHANISM FOR DISTRIBUTED-SYSTEM-AWARE DIFFERENCE ENCODING/DECODING IN GRAPH ANALYTICS

Publication number: 20200167328

Abstract: A portion of a graph dataset is generated for each computing node in a distributed computing system by, for each subject vertex in a graph, recording for the computing node an offset for the subject vertex, where the offset references a first position in an edge array for the computing node, and for each edge of a set of edges coupled with the subject vertex in the graph, calculating an edge value for the edge based on a connected vertex identifier identifying a vertex coupled with the subject vertex via the edge. When the edge value is assigned to the first position, the edge value is determined by a first calculation, and when the edge value is assigned to a position subsequent to the first position, the edge value is determined by a second calculation. In the computing node, the edge value is recorded in the edge array.

Type: Application

Filed: November 27, 2018

Publication date: May 28, 2020

Inventors: Mohamed Assem Ibrahim, Onur Kayiran, Yasuko Eckert
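
The offset and edge-array layout in the abstract above resembles difference-encoded compressed sparse row (CSR) storage, which the sketch below illustrates. The two calculations are not specified in the abstract, so this example assumes the conventional ones: the first edge value is the neighbor ID minus the subject vertex ID, and each later value is the neighbor ID minus the previous neighbor ID. The functions `encode` and `decode` are hypothetical, and the distributed-system-aware aspects of the filing are not modeled.

```python
def encode(adjacency):
    """Difference-encode adjacency lists into (offsets, edges).

    adjacency[v] is the list of neighbor IDs of vertex v. offsets[v] is the
    position in the edge array where the values for vertex v begin.
    """
    offsets, edges = [], []
    for v, neighbors in enumerate(adjacency):
        offsets.append(len(edges))
        prev = None
        for n in sorted(neighbors):
            # First position: delta from the subject vertex (assumed first calculation).
            # Later positions: delta from the previous neighbor (assumed second calculation).
            edges.append(n - v if prev is None else n - prev)
            prev = n
    offsets.append(len(edges))  # sentinel marking the end of the last vertex
    return offsets, edges


def decode(offsets, edges):
    """Rebuild the sorted adjacency lists from the encoded form."""
    adjacency = []
    for v in range(len(offsets) - 1):
        current, neighbors = v, []
        for delta in edges[offsets[v]:offsets[v + 1]]:
            current += delta
            neighbors.append(current)
        adjacency.append(neighbors)
    return adjacency


graph = [[1, 2, 5], [0], [], [4, 7]]
offsets, edges = encode(graph)
restored = decode(offsets, edges)
```

The deltas are typically small, which is what makes the edge array compress well.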
Distributed coherence directory subsystem with exclusive data regions

Patent number: 10635588

Abstract: A processing system includes a first set of one or more processing units including a first processing unit, a second set of one or more processing units including a second processing unit, and a memory having an address space shared by the first and second sets. The processing system further includes a distributed coherence directory subsystem having a first coherence directory to support a first subset of one or more address regions of the address space and a second coherence directory to support a second subset of one or more address regions of the address space. In some implementations, the first coherence directory is implemented in the system so as to have a lower access latency for the first set, whereas the second coherence directory is implemented in the system so as to have a lower access latency for the second set.

Type: Grant

Filed: June 5, 2018

Date of Patent: April 28, 2020

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Yasuko Eckert, Maurice B. Steinman, Steven Raasch
COHERENCY DIRECTORY ENTRY ALLOCATION BASED ON EVICTION COSTS

Publication number: 20200065246

Abstract: A processor partitions a coherency directory into different regions for different processor cores and manages the number of entries allocated to each region based at least in part on monitored recall costs indicating expected resource costs for reallocating entries. Examples of monitored recall costs include a number of cache evictions associated with entry reallocation, the hit rate of each region of the coherency directory, and the like, or a combination thereof. By managing the entries allocated to each region based on the monitored recall costs, the processor ensures that processor cores associated with denser memory access patterns (that is, memory access patterns that more frequently access cache lines associated with the same memory pages) are assigned more entries of the coherency directory.

Type: Application

Filed: August 22, 2018

Publication date: February 27, 2020

Inventors: Michael W. BOYER, Gabriel H. LOH, Yasuko ECKERT, William L. WALKER
METHOD AND SYSTEM FOR OPPORTUNISTIC LOAD BALANCING IN NEURAL NETWORKS USING METADATA

Publication number: 20190391850

Abstract: Methods and systems for opportunistic load balancing in deep neural networks (DNNs) using metadata. Representative computational costs are captured, obtained or determined for a given architectural, functional or computational aspect of a DNN system. The representative computational costs are implemented as metadata for the given architectural, functional or computational aspect of the DNN system. In an implementation, the computed computational cost is implemented as the metadata. A scheduler detects whether there are neurons in subsequent layers that are ready to execute. The scheduler uses the metadata and neuron availability to schedule and load balance across compute resources and available resources.

Type: Application

Filed: June 26, 2018

Publication date: December 26, 2019

Applicant: Advanced Micro Devices, Inc.

Inventors: Nicholas Malaya, Yasuko Eckert
