Patents Assigned to Advanced Micro Devices

Selecting cache aging policy for prefetches based on cache test regions

Patent number: 11321245

Abstract: A cache controller applies an aging policy to a portion of a cache based on access metrics for different test regions of the cache, whereby each test region implements a different aging policy. The aging policy for each region establishes an initial age value for each entry of the cache, and a particular aging policy can set the age for a given entry based on whether the entry was placed in the cache in response to a demand request from a processor core or in response to a prefetch request. The cache controller can use the age value of each entry as a criterion in its cache replacement policy.

Type: Grant

Filed: November 12, 2019

Date of Patent: May 3, 2022

Assignee: Advanced Micro Devices, Inc.

Inventor: Paul Moyer
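
As a rough illustration of the abstract for patent 11321245 above, the sketch below models a small cache whose aging policy assigns a different initial age to demand fills and prefetch fills, and whose replacement policy evicts the oldest entry. The names `AgedCache`, `select_policy`, the age values, and the hit-rate metric are assumptions made for this example and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    tag: int
    age: int


class AgedCache:
    """Tiny fully associative cache; the replacement victim is the oldest entry."""

    def __init__(self, capacity, demand_age=0, prefetch_age=2, max_age=3):
        self.capacity = capacity
        self.demand_age = demand_age      # initial age for demand fills (assumed value)
        self.prefetch_age = prefetch_age  # initial age for prefetch fills (assumed value)
        self.max_age = max_age
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def fill(self, tag, is_prefetch):
        if tag in self.entries:
            return
        if len(self.entries) >= self.capacity:
            # Age is the replacement criterion: evict the oldest entry.
            victim = max(self.entries.values(), key=lambda e: e.age)
            del self.entries[victim.tag]
        for entry in self.entries.values():
            entry.age = min(entry.age + 1, self.max_age)
        initial = self.prefetch_age if is_prefetch else self.demand_age
        self.entries[tag] = Entry(tag, initial)

    def access(self, tag):
        if tag in self.entries:
            self.hits += 1
            self.entries[tag].age = 0
            return True
        self.misses += 1
        self.fill(tag, is_prefetch=False)
        return False


def select_policy(test_regions):
    """Return the name of the test region with the best hit rate (assumed metric)."""
    def hit_rate(cache):
        total = cache.hits + cache.misses
        return cache.hits / total if total else 0.0
    return max(test_regions, key=lambda name: hit_rate(test_regions[name]))
```

In this sketch a policy that starts prefetched entries at a higher age makes them the first candidates for eviction, which is one way the two fill types could be treated differently.
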
Techniques to improve translation lookaside buffer reach by leveraging idle resources

Patent number: 11321241

Abstract: Techniques are disclosed for processing address translations. The techniques include detecting a first miss for a first address translation request for a first address translation in a first translation lookaside buffer; in response to the first miss, fetching the first address translation into the first translation lookaside buffer and evicting a second address translation from the translation lookaside buffer into an instruction cache or local data share memory; detecting a second miss in the first translation lookaside buffer for a second address translation request referencing the second address translation; and, in response to the second miss, fetching the second address translation from the instruction cache or the local data share memory.

Type: Grant

Filed: August 31, 2020

Date of Patent: May 3, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Jagadish B. Kotra, Michael W. LeBeane
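
The flow in the abstract for patent 11321241 above can be sketched as a small TLB that spills evicted translations into a side store and checks that store before falling back to a page walk. The class name `TlbWithSpill`, the LRU eviction choice, and the dictionary standing in for idle instruction cache or local data share space are assumptions for illustration only.

```python
from collections import OrderedDict


class TlbWithSpill:
    def __init__(self, tlb_entries, page_table):
        self.tlb = OrderedDict()      # virtual page -> physical frame, in LRU order
        self.tlb_entries = tlb_entries
        self.spill = {}               # stands in for idle instruction cache / LDS space
        self.page_table = page_table  # hypothetical backing translations
        self.page_walks = 0

    def translate(self, vpage):
        if vpage in self.tlb:
            self.tlb.move_to_end(vpage)
            return self.tlb[vpage]
        # TLB miss: try the spill store first, then fall back to a page walk.
        if vpage in self.spill:
            frame = self.spill.pop(vpage)
        else:
            self.page_walks += 1
            frame = self.page_table[vpage]
        if len(self.tlb) >= self.tlb_entries:
            # Evict the least recently used translation into the spill store.
            victim, victim_frame = self.tlb.popitem(last=False)
            self.spill[victim] = victim_frame
        self.tlb[vpage] = frame
        return frame
```

A translation that was evicted earlier is recovered from the spill store on its next miss, so `page_walks` does not increase for it.
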
Region based split-directory scheme to adapt to large cache sizes

Patent number: 11314646

Abstract: Systems, apparatuses, and methods for maintaining region-based cache directories split between node and memory are disclosed. The system with multiple processing nodes includes cache directories split between the nodes and memory to help manage cache coherency among the nodes' cache subsystems. In order to reduce the number of entries in the cache directories, the cache directories track coherency on a region basis rather than on a cache line basis, wherein a region includes multiple cache lines. Each processing node includes a node-based cache directory to track regions which have at least one cache line cached in any cache subsystem in the node. The node-based cache directory includes a reference count field in each entry to track the aggregate number of cache lines that are cached per region. The memory-based cache directory includes entries for regions which have an entry stored in any node-based cache directory of the system.

Type: Grant

Filed: July 2, 2020

Date of Patent: April 26, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Vydhyanathan Kalyanasundharam, Kevin M. Lepak, Amit P. Apte, Ganesh Balakrishnan
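
A minimal sketch of the bookkeeping described in the abstract for patent 11314646 above: each node keeps a per-region reference count, and a memory-side directory records which nodes hold an entry for a region. The class `SplitDirectory`, the region size, and the assumption that a node caches a given line at most once are choices made for this example.

```python
class SplitDirectory:
    def __init__(self, num_nodes, lines_per_region=4):
        self.lines_per_region = lines_per_region  # assumed region size
        # One node-based directory per node: region -> count of cached lines.
        self.node_dirs = [dict() for _ in range(num_nodes)]
        # Memory-based directory: region -> set of nodes with an entry for it.
        self.memory_dir = {}

    def region_of(self, line_addr):
        return line_addr // self.lines_per_region

    def cache_line(self, node, line_addr):
        region = self.region_of(line_addr)
        counts = self.node_dirs[node]
        counts[region] = counts.get(region, 0) + 1
        self.memory_dir.setdefault(region, set()).add(node)

    def evict_line(self, node, line_addr):
        region = self.region_of(line_addr)
        counts = self.node_dirs[node]
        counts[region] -= 1
        if counts[region] == 0:
            # Last line of the region left this node: drop both entries.
            del counts[region]
            self.memory_dir[region].discard(node)
            if not self.memory_dir[region]:
                del self.memory_dir[region]
```

Tracking one counter per region rather than one entry per line is what keeps the directories small in this model.
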
Integrated circuit product customizations for identification code visibility

Patent number: 11315883

Abstract: An apparatus includes a substrate including an identification code on a first side of the substrate and near a perimeter of the substrate. The apparatus includes a stiffener structure attached to the first side of the substrate. The stiffener structure has a cutout in an outer perimeter of the stiffener structure. The stiffener structure is oriented with respect to the substrate to cause the cutout to expose the identification code. The cutout may have a first dimension and a second dimension orthogonal to the first dimension. The first dimension may exceed a corresponding first dimension of the identification code and the second dimension may exceed a corresponding second dimension of the identification code, thereby forming a void region between the identification code and edges of the stiffener structure.

Type: Grant

Filed: November 12, 2019

Date of Patent: April 26, 2022

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Suming Hu, Roden Topacio, Farshad Ghahghahi, Jianguo Li, Andrew Kwan Wai Leung
Refresh management for memory

Publication number: 20220122652

Abstract: A memory controller interfaces with a random access memory over a memory channel. A refresh control circuit monitors an activate counter which counts a rolling number of activate commands sent over the memory channel to a memory region of the memory. In response to the activate counter being above an intermediate management threshold value, the refresh control circuit only issues a refresh management (RFM) command if there is no REF command currently held at the refresh command circuit for the memory region.

Type: Application

Filed: December 29, 2021

Publication date: April 21, 2022

Applicant: Advanced Micro Devices, Inc.

Inventors: Kevin M. Brandl, Kedarnath Balakrishnan, Jing Wang, Guanhao Shen
Semiconductor chip with solder cap probe test pads

Patent number: 11309222

Abstract: Various semiconductor chips with solder capped probe test pads are disclosed. In accordance with one aspect of the present invention, a semiconductor chip is provided that includes a substrate, plural input/output (I/O) structures on the substrate and plural test pads on the substrate. Each of the test pads includes a first conductor pad and a first solder cap on the first conductor pad.

Type: Grant

Filed: August 29, 2019

Date of Patent: April 19, 2022

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Lei Fu, Milind S. Bhagavat, Chia-Hao Cheng
Dynamic voltage frequency scaling based on active memory barriers

Patent number: 11307631

Abstract: A processing unit includes compute units partitioned into one or more islands that are provided with operating voltages and clock signals having clock frequencies independent of providing operating voltages or clock signals to other islands of compute units. The processing unit also includes dynamic voltage and frequency scaling (DVFS) hardware configured to compute one or more numbers of active memory barriers in the one or more islands. The DVFS hardware is also configured to modify the operating voltages or clock frequencies provided to the one or more islands in response to a change in numbers of active memory barriers in the one or more islands. In some cases, the operating voltage or clock frequency provided to an island is increased in response to the number of active memory barriers in the island decreasing. The operating voltage or clock frequency provided to the island is decreased in response to the number of active memory barriers in the island increasing.

Type: Grant

Filed: May 29, 2019

Date of Patent: April 19, 2022

Assignee: ADVANCED MICRO DEVICES, INC.

Inventor: Vedula Venkata Srikant Bharadwaj
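
The scaling rule in the abstract for patent 11307631 above can be sketched as a per-island update: raise the clock when the number of active memory barriers falls and lower it when the number rises. The function name `rescale_islands`, the step size, and the frequency limits are hypothetical values chosen for the example.

```python
def rescale_islands(freqs_mhz, prev_barriers, curr_barriers,
                    step_mhz=100, min_mhz=500, max_mhz=2000):
    """Return new per-island clock frequencies based on barrier-count changes."""
    new_freqs = {}
    for island, freq in freqs_mhz.items():
        delta = curr_barriers[island] - prev_barriers[island]
        if delta < 0:
            freq += step_mhz   # fewer active barriers: speed the island up
        elif delta > 0:
            freq -= step_mhz   # more active barriers: slow the island down
        # Clamp to the assumed operating range.
        new_freqs[island] = max(min_mhz, min(max_mhz, freq))
    return new_freqs
```

Each island is updated independently, which mirrors the abstract's point that islands receive their own voltages and clock signals.
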
System and method for multiplexer tree indexing

Patent number: 11308057

Abstract: Described herein is a system and method for multiplexer tree (muxtree) indexing. Muxtree indexing performs hashing and row reduction in parallel by use of each select bit only once in a particular path of the muxtree. The muxtree indexing generates a different final index as compared to conventional hashed indexing but still results in a fair hash, where all table entries get used with equal distribution with uniformly random selects.

Type: Grant

Filed: November 28, 2017

Date of Patent: April 19, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Steven R. Havlir, Patrick J. Shyvers
Semi-sorting compression with encoding and decoding tables

Patent number: 11309911

Abstract: A data processing platform, method, and program product perform compression and decompression of a set of data items. Suffix data and a prefix are selected for each respective data item in the set of data items based on data content of the respective data item. The set of data items is sorted based on the prefixes. The prefixes are encoded by querying multiple encoding tables to create a code word containing compressed information representing values of all prefixes for the set of data items. The code word and suffix data for each of the data items are stored in memory. The code word is decompressed to recover the prefixes. The recovered prefixes are paired with their respective suffix data.

Type: Grant

Filed: August 16, 2019

Date of Patent: April 19, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Alexander D. Breslow, Nuwan Jayasena, John Kalamatianos
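
To make the abstract for patent 11309911 above more concrete, the sketch below splits each 8-bit item into a prefix and a suffix, sorts the group by prefix, and replaces the sorted prefixes with a single code word looked up in a table. It is a simplification under stated assumptions: the bit widths, the group size of four, and the use of one encoding table (the abstract mentions multiple tables) are all choices made for this example.

```python
from itertools import combinations_with_replacement

PREFIX_BITS = 2   # assumed split of an 8-bit item
SUFFIX_BITS = 6
GROUP_SIZE = 4    # assumed number of items per compressed group

# Every non-decreasing tuple of GROUP_SIZE prefixes gets one code word.
DECODE_TABLE = list(
    combinations_with_replacement(range(1 << PREFIX_BITS), GROUP_SIZE)
)
ENCODE_TABLE = {prefixes: code for code, prefixes in enumerate(DECODE_TABLE)}


def compress(items):
    if len(items) != GROUP_SIZE or any(not 0 <= v < 256 for v in items):
        raise ValueError("expected GROUP_SIZE items in the range 0..255")
    # Sort by prefix only; suffixes travel with their items.
    ordered = sorted(items, key=lambda v: v >> SUFFIX_BITS)
    prefixes = tuple(v >> SUFFIX_BITS for v in ordered)
    suffixes = [v & ((1 << SUFFIX_BITS) - 1) for v in ordered]
    return ENCODE_TABLE[prefixes], suffixes


def decompress(code_word, suffixes):
    prefixes = DECODE_TABLE[code_word]
    # Pair each recovered prefix with its stored suffix.
    return [(p << SUFFIX_BITS) | s for p, s in zip(prefixes, suffixes)]
```

Because sorted prefix tuples are far fewer than arbitrary ones (35 here, which fits in 6 bits instead of 8), the code word is smaller than the raw prefixes. The trade-off in this sketch is that the original item order is not preserved, only the set of items.
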
Compressing texture data on a per-channel basis

Patent number: 11308648

Abstract: Sampling circuitry independently accesses channels of texture data that represent a set of pixels. One or more processing units separately compress the channels of the texture data and store compressed data representative of the channels of the texture data for the set of pixels. The channels can include a red channel, a blue channel, and a green channel that represent color values of the set of pixels and an alpha channel that represents degrees of transparency of the set of pixels. Storing the compressed data can include writing the compressed data to portions of a cache. The processing units can identify a subset of the set of pixels that share a value of a first channel of the plurality of channels and represent the value of the first channel over the subset of the set of pixels using information representing the value, the first channel, and boundaries of the subset.

Type: Grant

Filed: September 23, 2020

Date of Patent: April 19, 2022

Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC

Inventors: Saurabh Sharma, Laurent Lefebvre, Sagar Shankar Bhandare, Ruijin Wu
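
The per-channel idea in the abstract for patent 11308648 above can be illustrated by splitting RGBA pixels into four channel arrays and compressing each one separately. Run-length encoding is used here only as a stand-in compressor that records a shared value and the extent of the pixels sharing it; the function names and the choice of run-length encoding are assumptions for this example.

```python
CHANNELS = ("r", "g", "b", "a")


def split_channels(pixels):
    """pixels: list of (r, g, b, a) tuples -> dict of channel name -> values."""
    return {name: [p[i] for p in pixels] for i, name in enumerate(CHANNELS)}


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(run) for run in runs]


def compress_texture(pixels):
    # Each channel is compressed on its own, independent of the others.
    return {name: run_length_encode(vals)
            for name, vals in split_channels(pixels).items()}


def decompress_texture(compressed):
    channels = {name: [v for v, n in runs for _ in range(n)]
                for name, runs in compressed.items()}
    return list(zip(*(channels[name] for name in CHANNELS)))
```

A channel that is constant across the block, such as a fully opaque alpha channel, collapses to a single run even when the other channels vary from pixel to pixel.
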
Dynamic remapping of virtual address ranges using remap vector

Patent number: 11307993

Abstract: For one or more stages of execution of a software application at a first processor, a remap vector of a second processor is reconfigured to represent a dynamic mapping of virtual address groups to physical address groups for that stage. Each bit position of the remap vector is configured to store a value indicating whether a corresponding virtual address group is actively mapped to a corresponding physical address group. Address translation operations issued during a stage of execution of the software application are selectively processed based on the configuration of the remap vector for that stage, with the particular value at the bit position of the remap vector associated with the corresponding virtual address group controlling whether processing of the address translation operation is continued to obtain a virtual-to-physical address translation sought by the address translation operation or processing of the address translation operation is ceased and a fault is issued.

Type: Grant

Filed: November 26, 2018

Date of Patent: April 19, 2022

Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC

Inventors: Anthony Asaro, Richard E. George
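
The gating behaviour in the abstract for patent 11307993 above can be sketched with an integer used as a bit vector: a translation proceeds only if the bit for its virtual address group is set, and otherwise a fault is raised. The names `translate` and `TranslationFault`, the group size, and the dictionary-based mapping table are hypothetical and chosen for illustration.

```python
class TranslationFault(Exception):
    """Raised when a virtual address group is not actively mapped."""


def translate(vaddr, remap_vector, group_map, group_size=0x1000):
    """Translate vaddr if its group's bit is set in remap_vector.

    group_map maps a virtual group index to a physical group index.
    """
    group = vaddr // group_size
    if not (remap_vector >> group) & 1:
        # Bit clear: stop processing the translation and signal a fault.
        raise TranslationFault(f"virtual group {group} is not mapped")
    return group_map[group] * group_size + vaddr % group_size


# Reconfiguring the vector per execution stage changes which groups are live.
STAGE_VECTORS = {"stage_1": 0b0101, "stage_2": 0b0010}
```

Switching stages then amounts to swapping one integer, for example passing `STAGE_VECTORS["stage_2"]` instead of `STAGE_VECTORS["stage_1"]`.
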
System performance management using prioritized compute units

Publication number: 20220114097

Abstract: Methods, devices, and systems for managing performance of a processor having multiple compute units are disclosed. An effective number of the multiple compute units may be determined to designate as having priority. On a condition that the effective number is nonzero, the effective number of the multiple compute units may each be designated as a priority compute unit. Priority compute units may have access to a shared cache whereas non-priority compute units may not. Workgroups may be preferentially dispatched to priority compute units. Memory access requests from priority compute units may be served ahead of requests from non-priority compute units.

Type: Application

Filed: December 20, 2021

Publication date: April 14, 2022

Applicant: Advanced Micro Devices, Inc.

Inventors: Zhe Wang, Sooraj Puthoor, Bradford M. Beckmann
System and method for page-conscious GPU instruction

Patent number: 11301256

Abstract: Embodiments disclose a system and method for reducing virtual address translation latency in a wide execution engine that implements virtual memory. One example method comprises receiving a wavefront, classifying the wavefront into a subset based on classification criteria selected to reduce virtual address translation latency associated with a memory support structure, and scheduling the wavefront for processing based on the classifying.

Type: Grant

Filed: August 22, 2014

Date of Patent: April 12, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Lisa R. Hsu, James Michael O'Connor
Scheduler queue assignment

Patent number: 11294678

Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment logic are disclosed. A processor includes at least a decode unit, scheduler queue assignment logic, scheduler queues, pickers, and execution units. The assignment logic receives a plurality of operations from a decode unit in each clock cycle. The assignment logic includes a separate logical unit for each different type of operation which is executable by the different execution units of the processor. For each different type of operation, the assignment logic determines which of the possible assignment permutations are valid for assigning different numbers of operations to scheduler queues in a given clock cycle. The assignment logic receives an indication of how many operations to assign in the given clock cycle, and then the assignment logic selects one of the valid assignment permutations for the number of operations specified by the indication.

Type: Grant

Filed: May 29, 2018

Date of Patent: April 5, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Matthew T. Sobel, Donald A. Priore, Alok Garg
Shared resource allocation in a multi-threaded microprocessor

Patent number: 11294724

Abstract: An approach is provided for allocating a shared resource to threads in a multi-threaded microprocessor based upon the usefulness of the shared resource to each of the threads. The usefulness of a shared resource to a thread is determined based upon the number of entries in the shared resource that are allocated to the thread and the number of active entries that the thread has in the shared resource. Threads that are allocated a large number of entries in the shared resource and have a small number of active entries in the shared resource, indicative of a low level of parallelism, can operate efficiently with fewer entries in the shared resource, and have their allocation limit in the shared resource reduced.

Type: Grant

Filed: September 27, 2019

Date of Patent: April 5, 2022

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Kai Troester, Neil Marketkar, Matthew T. Sobel, Srinivas Keshav
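
The heuristic in the abstract for patent 11294724 above can be sketched as a pass over per-thread counters: a thread that holds many entries in the shared resource but keeps few of them active has its allocation limit lowered. The function `adjust_limits` and all thresholds are assumptions made for this example.

```python
def adjust_limits(threads, high_alloc, low_active, min_limit, step):
    """Return new allocation limits per thread.

    threads maps a thread name to a dict with 'allocated', 'active',
    and 'limit' entry counts for the shared resource.
    """
    new_limits = {}
    for name, stats in threads.items():
        limit = stats["limit"]
        holds_many = stats["allocated"] >= high_alloc
        uses_few = stats["active"] <= low_active
        if holds_many and uses_few:
            # Low parallelism: the thread can run with fewer entries.
            limit = max(min_limit, limit - step)
        new_limits[name] = limit
    return new_limits
```

Threads that keep most of their entries active are left alone, so the freed capacity comes only from threads that were not benefiting from it.
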
Self-regulating power management for a neural network system

Patent number: 11294747

Abstract: A neural network runs a known input data set using an error-free power setting and using an error-prone power setting. The differences in the outputs of the neural network using the two different power settings determine a high-level error rate associated with the output of the neural network using the error-prone power setting. If the high-level error rate is excessive, the error-prone power setting is adjusted to reduce errors by changing voltage and/or clock frequency utilized by the neural network system. If the high-level error rate is within bounds, the error-prone power setting can remain, allowing the neural network to operate with an acceptable error tolerance and improved efficiency. The error tolerance can be specified by the neural network application.

Type: Grant

Filed: January 31, 2018

Date of Patent: April 5, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Andrew G. Kegel, David A. Roberts
Thread switch for accesses to slow memory

Patent number: 11294710

Abstract: A processing system suspends execution of a program thread based on an access latency required for a program thread to access memory. The processing system employs different memory modules having different memory technologies, located at different points in the processing system, and the like, or a combination thereof. The different memory modules therefore have different access latencies for memory transactions (e.g., memory reads and writes). When a program thread issues a memory transaction that results in an access to a memory module having a relatively long access latency (referred to as “slow” memory), the processor suspends execution of the program thread and releases processor resources used by the program thread. When the processor receives a response to the memory transaction from the memory module, the processor resumes execution of the suspended program thread.

Type: Grant

Filed: November 10, 2017

Date of Patent: April 5, 2022

Assignee: Advanced Micro Devices, Inc.

Inventor: Douglas Benson Hunt
Spatial partitioning in a multi-tenancy graphics processing unit

Patent number: 11295507

Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.

Type: Grant

Filed: November 6, 2020

Date of Patent: April 5, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Mark Leather, Michael Mantor
Memory request throttling to constrain memory bandwidth utilization

Patent number: 11294810

Abstract: A processing system includes an interconnect fabric coupleable to a local memory and at least one compute cluster coupled to the interconnect fabric. The compute cluster includes a processor core and a cache hierarchy. The cache hierarchy has a plurality of caches and a throttle controller configured to throttle a rate of memory requests issuable by the processor core based on at least one of an access latency metric and a prefetch accuracy metric. The access latency metric represents an average access latency for memory requests for the processor core and the prefetch accuracy metric represents an accuracy of a prefetcher of a cache of the cache hierarchy.

Type: Grant

Filed: December 12, 2017

Date of Patent: April 5, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: William L. Walker, William E. Jones
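
A minimal sketch of the throttling decision in the abstract for patent 11294810 above: compute an average access latency and a prefetch accuracy, then reduce the number of memory requests a core may have outstanding when either metric looks poor. The function names, the thresholds, and the halving policy are assumptions for illustration.

```python
def average_latency(latencies_ns):
    """Average access latency over a window of completed memory requests."""
    return sum(latencies_ns) / len(latencies_ns)


def prefetch_accuracy(useful_prefetches, issued_prefetches):
    """Fraction of issued prefetches that were later used by a demand access."""
    if issued_prefetches == 0:
        return 1.0
    return useful_prefetches / issued_prefetches


def allowed_requests(avg_latency_ns, accuracy, max_requests=16,
                     latency_limit_ns=200, accuracy_floor=0.5):
    """Return how many outstanding memory requests the core may issue."""
    allowed = max_requests
    if avg_latency_ns > latency_limit_ns:
        allowed //= 2   # memory looks congested: throttle
    if accuracy < accuracy_floor:
        allowed //= 2   # prefetches are wasting bandwidth: throttle further
    return max(1, allowed)
```

Either metric alone can trigger throttling in this sketch, matching the abstract's "at least one of" wording.
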
Compiler directed fine grained power management

Publication number: 20220100257

Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler are disclosed. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.

Type: Application

Filed: September 25, 2020

Publication date: March 31, 2022

Applicant: Advanced Micro Devices, Inc.

Inventors: Vedula Venkata Srikant Bharadwaj, Shomit N. Das, Anthony T. Gutierrez, Vignesh Adhinarayanan
