Patents Assigned to Advanced Micro Devices, Inc.

REFRESH MANAGEMENT FOR MEMORY

Publication number: 20220122652

Abstract: A memory controller interfaces with a random access memory over a memory channel. A refresh control circuit monitors an activate counter which counts a rolling number of activate commands sent over the memory channel to a memory region of the memory. In response to the activate counter being above an intermediate management threshold value, the refresh control circuit only issues a refresh management (RFM) command if there is no REF command currently held at the refresh command circuit for the memory region.

Type: Application

Filed: December 29, 2021

Publication date: April 21, 2022

Applicant: Advanced Micro Devices, Inc.

Inventors: Kevin M. Brandl, Kedarnath Balakrishnan, Jing Wang, Guanhao Shen
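
The decision rule in this abstract can be illustrated with a minimal sketch. It is not taken from the application: the `RegionState` fields and the function name are hypothetical, and only the single condition stated in the abstract is modeled (counter above the intermediate threshold, no REF command already held for the region).

```python
from dataclasses import dataclass


@dataclass
class RegionState:
    """Hypothetical per-region bookkeeping; field names are illustrative."""
    activate_count: int = 0    # rolling count of activate commands to the region
    ref_pending: bool = False  # True if a REF command is already held for it


def should_issue_rfm(state: RegionState, intermediate_threshold: int) -> bool:
    """Return True when an RFM command would be issued for the region.

    Models only the condition in the abstract: the activate counter is above
    the intermediate threshold and no REF command is held for the region.
    """
    if state.activate_count <= intermediate_threshold:
        return False
    return not state.ref_pending
```

With an assumed threshold of 32, a region at 40 activates gets an RFM command only while no REF command is pending for it.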

Dynamic voltage frequency scaling based on active memory barriers

Patent number: 11307631

Abstract: A processing unit includes compute units partitioned into one or more islands that are provided with operating voltages and clock signals having clock frequencies independent of providing operating voltages or clock signals to other islands of compute units. The processing unit also includes dynamic voltage and frequency scaling (DVFS) hardware configured to compute one or more numbers of active memory barriers in the one or more islands. The DVFS hardware is also configured to modify the operating voltages or clock frequencies provided to the one or more islands in response to a change in numbers of active memory barriers in the one or more islands. In some cases, the operating voltage or clock frequency provided to an island is increased in response to the number of active memory barriers in the island decreasing. The operating voltage or clock frequency provided to the island is decreased in response to the number of active memory barriers in the island increasing.

Type: Grant

Filed: May 29, 2019

Date of Patent: April 19, 2022

Assignee: ADVANCED MICRO DEVICES, INC.

Inventor: Vedula Venkata Srikant Bharadwaj
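
The direction of the adjustment described in this abstract can be sketched as follows. This is an illustration under assumptions, not the patented hardware: the function name, the fixed step size and the frequency limits are all hypothetical.

```python
def next_frequency_mhz(current_mhz, prev_barriers, curr_barriers,
                       step_mhz=100, min_mhz=500, max_mhz=2000):
    """Pick the next clock frequency for one island of compute units.

    Fewer active memory barriers than before -> raise the clock.
    More active memory barriers than before  -> lower the clock.
    The step size and limits are assumed values for illustration only.
    """
    if curr_barriers < prev_barriers:
        target = current_mhz + step_mhz
    elif curr_barriers > prev_barriers:
        target = current_mhz - step_mhz
    else:
        target = current_mhz
    # Clamp to the assumed legal frequency range of the island.
    return max(min_mhz, min(max_mhz, target))
```

Each island would call this with its own barrier counts, which matches the abstract's point that islands are scaled independently of one another.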

Semiconductor chip with solder cap probe test pads

Patent number: 11309222

Abstract: Various semiconductor chips with solder capped probe test pads are disclosed. In accordance with one aspect of the present invention, a semiconductor chip is provided that includes a substrate, plural input/output (I/O) structures on the substrate and plural test pads on the substrate. Each of the test pads includes a first conductor pad and a first solder cap on the first conductor pad.

Type: Grant

Filed: August 29, 2019

Date of Patent: April 19, 2022

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Lei Fu, Milind S. Bhagavat, Chia-Hao Cheng

System and method for multiplexer tree indexing

Patent number: 11308057

Abstract: Described herein is a system and method for multiplexer tree (muxtree) indexing. Muxtree indexing performs hashing and row reduction in parallel by use of each select bit only once in a particular path of the muxtree. The muxtree indexing generates a different final index as compared to conventional hashed indexing but still results in a fair hash, where all table entries get used with equal distribution with uniformly random selects.

Type: Grant

Filed: November 28, 2017

Date of Patent: April 19, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Steven R. Havlir, Patrick J. Shyvers

Semi-sorting compression with encoding and decoding tables

Patent number: 11309911

Abstract: A data processing platform, method, and program product perform compression and decompression of a set of data items. Suffix data and a prefix are selected for each respective data item in the set of data items based on data content of the respective data item. The set of data items is sorted based on the prefixes. The prefixes are encoded by querying multiple encoding tables to create a code word containing compressed information representing values of all prefixes for the set of data items. The code word and suffix data for each of the data items are stored in memory. The code word is decompressed to recover the prefixes. The recovered prefixes are paired with their respective suffix data.

Type: Grant

Filed: August 16, 2019

Date of Patent: April 19, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Alexander D. Breslow, Nuwan Jayasena, John Kalamatianos
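
The prefix/suffix flow in this abstract can be sketched roughly as below. This is a simplified stand-in, not the patented encoder: a histogram of prefix counts replaces the table-driven code word, and the 4-bit suffix width and the function names are assumptions. As in the abstract, the items are treated as a set, so the original order is not preserved.

```python
def compress(items, suffix_bits=4):
    """Split each item into prefix/suffix, sort by prefix, encode the prefixes.

    The "code word" here is just a tuple of (prefix, count) pairs, an assumed
    stand-in for the table-based code word described in the abstract.
    """
    mask = (1 << suffix_bits) - 1
    ordered = sorted(items, key=lambda x: x >> suffix_bits)  # sort on prefix only
    counts = {}
    for x in ordered:
        prefix = x >> suffix_bits
        counts[prefix] = counts.get(prefix, 0) + 1
    code_word = tuple(sorted(counts.items()))
    suffixes = [x & mask for x in ordered]  # stored alongside the code word
    return code_word, suffixes


def decompress(code_word, suffixes, suffix_bits=4):
    """Recover the sorted prefixes and pair each with its stored suffix."""
    prefixes = [p for p, n in code_word for _ in range(n)]
    return [(p << suffix_bits) | s for p, s in zip(prefixes, suffixes)]
```

Because the prefixes are sorted before encoding, only their multiset needs to be stored, which is where the space saving comes from.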

Compressing texture data on a per-channel basis

Patent number: 11308648

Abstract: Sampling circuitry independently accesses channels of texture data that represent a set of pixels. One or more processing units separately compress the channels of the texture data and store compressed data representative of the channels of the texture data for the set of pixels. The channels can include a red channel, a blue channel, and a green channel that represent color values of the set of pixels and an alpha channel that represents degrees of transparency of the set of pixels. Storing the compressed data can include writing the compressed data to portions of a cache. The processing units can identify a subset of the set of pixels that share a value of a first channel of the plurality of channels and represent the value of the first channel over the subset of the set of pixels using information representing the value, the first channel, and boundaries of the subset.

Type: Grant

Filed: September 23, 2020

Date of Patent: April 19, 2022

Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC

Inventors: Saurabh Sharma, Laurent Lefebvre, Sagar Shankar Bhandare, Ruijin Wu
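
A minimal sketch of per-channel compression, assuming pixels are `(r, g, b, a)` tuples in a one-dimensional row. Run-length encoding is used here as a simple stand-in for "the value, the first channel, and boundaries of the subset"; the patent does not say this is the scheme used, and all function names are hypothetical.

```python
def split_channels(pixels):
    """Turn [(r, g, b, a), ...] into one list of values per channel."""
    names = ("r", "g", "b", "a")
    return {name: [p[i] for p in pixels] for i, name in enumerate(names)}


def run_length_encode(values):
    """Encode one channel as (value, start, end) runs; end is exclusive."""
    runs = []
    for i, v in enumerate(values):
        if runs and runs[-1][0] == v:
            value, start, _ = runs[-1]
            runs[-1] = (value, start, i + 1)  # extend the current run
        else:
            runs.append((v, i, i + 1))        # start a new run
    return runs


def compress_texture(pixels):
    """Compress each channel separately, as the abstract describes."""
    return {name: run_length_encode(channel)
            for name, channel in split_channels(pixels).items()}
```

Handling channels separately lets a channel that is constant over a block, such as a fully opaque alpha channel, collapse to a single run even when the color channels vary.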

Dynamic remapping of virtual address ranges using remap vector

Patent number: 11307993

Abstract: For one or more stages of execution of a software application at a first processor, a remap vector of a second processor is reconfigured to represent a dynamic mapping of virtual address groups to physical address groups for that stage. Each bit position of the remap vector is configured to store a value indicating whether a corresponding virtual address group is actively mapped to a corresponding physical address group. Address translation operations issued during a stage of execution of the software application are selectively processed based on the configuration of the remap vector for that stage, with the particular value at the bit position of the remap vector associated with the corresponding virtual address group controlling whether processing of the address translation operation is continued to obtain a virtual-to-physical address translation sought by the address translation operation or processing of the address translation operation is ceased and a fault is issued.

Type: Grant

Filed: November 26, 2018

Date of Patent: April 19, 2022

Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC

Inventors: Anthony Asaro, Richard E. George
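
The gating role of the remap vector can be sketched as below. This is an illustration under assumptions: the vector is modeled as a Python integer bitmask, address groups are assumed to be 4 KiB wide, and `page_table`, `TranslationFault` and `translate` are hypothetical names.

```python
class TranslationFault(Exception):
    """Raised when a virtual address group is not actively mapped."""


def translate(virtual_addr, remap_vector, page_table, group_shift=12):
    """Translate an address only if its group's bit is set in the remap vector.

    remap_vector: int bitmask, bit i set means virtual group i is mapped.
    page_table:   dict of virtual group -> physical group (assumed layout).
    """
    group = virtual_addr >> group_shift
    if not (remap_vector >> group) & 1:
        # Bit clear: stop processing and fault, as in the abstract.
        raise TranslationFault(f"virtual group {group} is not mapped")
    offset = virtual_addr & ((1 << group_shift) - 1)
    return (page_table[group] << group_shift) | offset
```

Reconfiguring for a new stage of execution then amounts to swapping in a different `remap_vector` value, without rebuilding the page table itself.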

SYSTEM PERFORMANCE MANAGEMENT USING PRIORITIZED COMPUTE UNITS

Publication number: 20220114097

Abstract: Methods, devices, and systems for managing performance of a processor having multiple compute units. An effective number of the multiple compute units may be determined to designate as having priority. On a condition that the effective number is nonzero, the effective number of the multiple compute units may each be designated as a priority compute unit. Priority compute units may have access to a shared cache whereas non-priority compute units may not. Workgroups may be preferentially dispatched to priority compute units. Memory access requests from priority compute units may be served ahead of requests from non-priority compute units.

Type: Application

Filed: December 20, 2021

Publication date: April 14, 2022

Applicant: Advanced Micro Devices, Inc.

Inventors: Zhe Wang, Sooraj Puthoor, Bradford M. Beckmann

System and method for page-conscious GPU instruction

Patent number: 11301256

Abstract: Embodiments disclose a system and method for reducing virtual address translation latency in a wide execution engine that implements virtual memory. One example method comprises receiving a wavefront, classifying the wavefront into a subset based on classification criteria selected to reduce virtual address translation latency associated with a memory support structure, and scheduling the wavefront for processing based on the classifying.

Type: Grant

Filed: August 22, 2014

Date of Patent: April 12, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Lisa R. Hsu, James Michael O'Connor
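
One way to picture the classify-then-schedule flow in this abstract is the sketch below. The classification criterion used here, the most frequently touched virtual page, is an assumption chosen for illustration; the patent does not specify it, and the 4 KiB page size and function names are likewise hypothetical.

```python
from collections import Counter, defaultdict

PAGE_SHIFT = 12  # assumed 4 KiB pages


def dominant_page(addresses):
    """Classification key: the virtual page a wavefront touches most often."""
    pages = Counter(addr >> PAGE_SHIFT for addr in addresses)
    return pages.most_common(1)[0][0]


def schedule(wavefronts):
    """Order wavefront ids so that wavefronts sharing a page run back to back.

    wavefronts: dict of wavefront id -> list of virtual addresses it accesses.
    """
    subsets = defaultdict(list)
    for wf_id, addresses in wavefronts.items():
        subsets[dominant_page(addresses)].append(wf_id)
    return [wf_id for page in sorted(subsets) for wf_id in subsets[page]]
```

Running same-page wavefronts consecutively means a translation fetched for the first is likely still cached for the next, which is the latency effect the abstract targets.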

Self-regulating power management for a neural network system

Patent number: 11294747

Abstract: A neural network runs a known input data set using an error free power setting and using an error prone power setting. The differences in the outputs of the neural network using the two different power settings determine a high level error rate associated with the output of the neural network using the error prone power setting. If the high level error rate is excessive, the error prone power setting is adjusted to reduce errors by changing voltage and/or clock frequency utilized by the neural network system. If the high level error rate is within bounds, the error prone power setting can remain allowing the neural network to operate with an acceptable error tolerance and improved efficiency. The error tolerance can be specified by the neural network application.

Type: Grant

Filed: January 31, 2018

Date of Patent: April 5, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Andrew G. Kegel, David A. Roberts
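
The feedback loop in this abstract can be sketched as follows. It is a simplified illustration: outputs are compared as exact labels, and the adjustment is assumed to be a fixed voltage increase, whereas the abstract allows changing voltage and/or clock frequency. The function names and the step size are hypothetical.

```python
def error_rate(reference_outputs, test_outputs):
    """Fraction of outputs that differ between the two power settings."""
    mismatches = sum(r != t for r, t in zip(reference_outputs, test_outputs))
    return mismatches / len(reference_outputs)


def adjust_voltage(voltage_mv, rate, tolerance, step_mv=25):
    """Raise the error-prone setting's voltage only if errors are excessive.

    Raising voltage to reduce errors is an assumption of this sketch.
    """
    if rate > tolerance:
        return voltage_mv + step_mv
    return voltage_mv  # within bounds: keep the more efficient setting
```

The `tolerance` argument corresponds to the error tolerance that, per the abstract, the neural network application can specify.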

Scheduler queue assignment

Patent number: 11294678

Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment logic are disclosed. A processor includes at least a decode unit, scheduler queue assignment logic, scheduler queues, pickers, and execution units. The assignment logic receives a plurality of operations from a decode unit in each clock cycle. The assignment logic includes a separate logical unit for each different type of operation which is executable by the different execution units of the processor. For each different type of operation, the assignment logic determines which of the possible assignment permutations are valid for assigning different numbers of operations to scheduler queues in a given clock cycle. The assignment logic receives an indication of how many operations to assign in the given clock cycle, and then the assignment logic selects one of the valid assignment permutations for the number of operations specified by the indication.

Type: Grant

Filed: May 29, 2018

Date of Patent: April 5, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Matthew T. Sobel, Donald A. Priore, Alok Garg

Shared resource allocation in a multi-threaded microprocessor

Patent number: 11294724

Abstract: An approach is provided for allocating a shared resource to threads in a multi-threaded microprocessor based upon the usefulness of the shared resource to each of the threads. The usefulness of a shared resource to a thread is determined based upon the number of entries in the shared resource that are allocated to the thread and the number of active entries that the thread has in the shared resource. Threads that are allocated a large number of entries in the shared resource and have a small number of active entries in the shared resource, indicative of a low level of parallelism, can operate efficiently with fewer entries in the shared resource, and have their allocation limit in the shared resource reduced.

Type: Grant

Filed: September 27, 2019

Date of Patent: April 5, 2022

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Kai Troester, Neil Marketkar, Matthew T. Sobel, Srinivas Keshav
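
The usefulness test in this abstract can be sketched as below. All thresholds, the halving policy and the function name are assumptions made for illustration; the abstract only says that threads with many allocated entries and few active entries have their allocation limit reduced.

```python
def new_allocation_limit(allocated, active, limit,
                         large_alloc=16, low_use_ratio=0.25, min_limit=4):
    """Return a thread's new entry limit in the shared resource.

    A thread holding many entries (>= large_alloc) of which few are active
    (active/allocated <= low_use_ratio) gets its limit halved, down to
    min_limit. Every numeric parameter here is an assumed value.
    """
    if allocated >= large_alloc and active / allocated <= low_use_ratio:
        return max(min_limit, limit // 2)
    return limit
```

A thread with 32 allocated entries and only 4 active would drop from a limit of 32 to 16, while a thread keeping 24 of its 32 entries active keeps its limit.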

Thread switch for accesses to slow memory

Patent number: 11294710

Abstract: A processing system suspends execution of a program thread based on an access latency required for a program thread to access memory. The processing system employs different memory modules having different memory technologies, located at different points in the processing system, and the like, or a combination thereof. The different memory modules therefore have different access latencies for memory transactions (e.g., memory reads and writes). When a program thread issues a memory transaction that results in an access to a memory module having a relatively long access latency (referred to as “slow” memory), the processor suspends execution of the program thread and releases processor resources used by the program thread. When the processor receives a response to the memory transaction from the memory module, the processor resumes execution of the suspended program thread.

Type: Grant

Filed: November 10, 2017

Date of Patent: April 5, 2022

Assignee: Advanced Micro Devices, Inc.

Inventor: Douglas Benson Hunt

Spatial partitioning in a multi-tenancy graphics processing unit

Patent number: 11295507

Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.

Type: Grant

Filed: November 6, 2020

Date of Patent: April 5, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Mark Leather, Michael Mantor

Memory request throttling to constrain memory bandwidth utilization

Patent number: 11294810

Abstract: A processing system includes an interconnect fabric coupleable to a local memory and at least one compute cluster coupled to the interconnect fabric. The compute cluster includes a processor core and a cache hierarchy. The cache hierarchy has a plurality of caches and a throttle controller configured to throttle a rate of memory requests issuable by the processor core based on at least one of an access latency metric and a prefetch accuracy metric. The access latency metric represents an average access latency for memory requests for the processor core and the prefetch accuracy metric represents an accuracy of a prefetcher of a cache of the cache hierarchy.

Type: Grant

Filed: December 12, 2017

Date of Patent: April 5, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: William L. Walker, William E. Jones
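
A throttle decision driven by the two metrics in this abstract might look like the sketch below. The thresholds, the halve-or-increment policy and the function name are assumptions for illustration, not the patented controller.

```python
def next_request_limit(avg_latency_ns, prefetch_accuracy, current_limit,
                       latency_threshold_ns=200.0, accuracy_threshold=0.5,
                       min_limit=1, max_limit=32):
    """Return the new cap on outstanding memory requests for a core.

    Tighten the cap when average access latency is high or the prefetcher
    is inaccurate; otherwise relax it by one. All thresholds are assumed.
    """
    if (avg_latency_ns > latency_threshold_ns
            or prefetch_accuracy < accuracy_threshold):
        return max(min_limit, current_limit // 2)
    return min(max_limit, current_limit + 1)
```

Either metric alone can trigger throttling here, which reflects the abstract's "at least one of" wording.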

COMPILER DIRECTED FINE GRAINED POWER MANAGEMENT

Publication number: 20220100257

Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.

Type: Application

Filed: September 25, 2020

Publication date: March 31, 2022

Applicant: Advanced Micro Devices, Inc.

Inventors: Vedula Venkata Srikant Bharadwaj, Shomit N. Das, Anthony T. Gutierrez, Vignesh Adhinarayanan

COMBINED CODEC BUFFER MANAGEMENT

Publication number: 20220103907

Abstract: Techniques are provided herein for processing video data. The techniques include identifying one or more input factors including one or more of signal quality factors, video content complexity factors, and hardware buffering factors for one or more of a video encoding system and a video playback system; evaluating the one or more input factors to determine adjustments to apply to one or both of the video encoding system and the video playback system; and applying the determined adjustments to the one or both of the video encoding system and the video playback system.

Type: Application

Filed: September 25, 2020

Publication date: March 31, 2022

Applicant: Advanced Micro Devices, Inc.

Inventors: Adam H. Li, Eugene Kuznetsov, Girish P. Subramaniam, Jihyuk Choi

TECHNIQUES FOR HANDLING CACHE COHERENCY TRAFFIC FOR CONTENDED SEMAPHORES

Publication number: 20220100662

Abstract: The techniques described herein improve cache traffic performance in the context of contended lock instructions. More specifically, each core maintains a lock address contention table that stores addresses corresponding to contended lock instructions. The lock address contention table also includes a state value that indicates progress through a series of states meant to track whether a load by the core in a spin-loop associated with semaphore acquisition has obtained the semaphore in an exclusive state. Upon detecting that a load in a spin-loop has obtained the semaphore in an exclusive state, the core responds to incoming requests for access to the semaphore with negative acknowledgments. This allows the core to maintain the semaphore cache line in an exclusive state, which allows it to acquire the semaphore faster and to avoid transmitting that cache line to other cores unnecessarily.

Type: Application

Filed: December 9, 2021

Publication date: March 31, 2022

Applicant: Advanced Micro Devices, Inc.

Inventors: John M. King, Gregory W. Smaus

EFFICIENT MEMORY-SEMANTIC NETWORKING USING SCOPED MEMORY MODELS

Publication number: 20220100391

Abstract: A framework disclosed herein extends a relaxed, scoped memory model to a system that includes nodes across a commodity network and maintains coherency across the system. A new scope, cluster scope, is defined, that allows for memory accesses at scopes less than cluster scope to operate on locally cached versions of remote data from across the commodity network without having to issue expensive network operations. Cluster scope operations generate network commands that are used to synchronize memory across the commodity network.

Type: Application

Filed: September 25, 2020

Publication date: March 31, 2022

Applicant: Advanced Micro Devices, Inc.

Inventors: Michael W. LeBeane, Khaled Hamidouche, Hari S. Thangirala, Brandon Keith Potter

DIRECT-CONNECTED MACHINE LEARNING ACCELERATOR

Publication number: 20220101179

Abstract: Techniques are disclosed for communicating between a machine learning accelerator and one or more processing cores. The techniques include obtaining data at the machine learning accelerator via an input/output die; processing the data at the machine learning accelerator to generate machine learning processing results; and exporting the machine learning processing results via the input/output die, wherein the input/output die is coupled to one or more processor chiplets via one or more processor ports, and wherein the input/output die is coupled to the machine learning accelerator via an accelerator port.

Type: Application

Filed: September 25, 2020

Publication date: March 31, 2022

Applicant: Advanced Micro Devices, Inc.

Inventor: Maxim V. Kazakov
