Patents Assigned to Advanced Micro Devices
-
Patent number: 11321245Abstract: A cache controller applies an aging policy to a portion of a cache based on access metrics for different test regions of the cache, whereby each test region implements a different aging policy. The aging policy for each region establishes an initial age value for each entry of the cache, and a particular aging policy can set the age for a given entry based on whether the entry was placed in the cache in response to a demand request from a processor core or in response to a prefetch request. The cache controller can use the age value of each entry as a criterion in its cache replacement policy.Type: GrantFiled: November 12, 2019Date of Patent: May 3, 2022Assignee: Advanced Micro Devices, Inc.Inventor: Paul Moyer
-
Patent number: 11321241Abstract: Techniques are disclosed for processing address translations. The techniques include detecting a first miss for a first address translation request for a first address translation in a first translation lookaside buffer, in response to the first miss, fetching the first address translation into the first translation lookaside buffer and evicting a second address translation from the translation lookaside buffer into an instruction cache or local data share memory, detecting a second miss for a second address translation request referencing the second address translation, in the first translation lookaside buffer, and in response to the second miss, fetching the second address translation from the instruction cache or the local data share memory.Type: GrantFiled: August 31, 2020Date of Patent: May 3, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Jagadish B. Kotra, Michael W. LeBeane
-
Patent number: 11314646Abstract: Systems, apparatuses, and methods for maintaining region-based cache directories split between node and memory are disclosed. The system with multiple processing nodes includes cache directories split between the nodes and memory to help manage cache coherency among the nodes' cache subsystems. In order to reduce the number of entries in the cache directories, the cache directories track coherency on a region basis rather than on a cache line basis, wherein a region includes multiple cache lines. Each processing node includes a node-based cache directory to track regions which have at least one cache line cached in any cache subsystem in the node. The node-based cache directory includes a reference count field in each entry to track the aggregate number of cache lines that are cached per region. The memory-based cache directory includes entries for regions which have an entry stored in any node-based cache directory of the system.Type: GrantFiled: July 2, 2020Date of Patent: April 26, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Vydhyanathan Kalyanasundharam, Kevin M. Lepak, Amit P. Apte, Ganesh Balakrishnan
-
Patent number: 11315883Abstract: An apparatus includes a substrate including an identification code on a first side of the substrate and near a perimeter of the substrate. The apparatus includes a stiffener structure attached to the first side of the substrate. The stiffener structure has a cutout in an outer perimeter of the stiffener structure. The stiffener structure is oriented with respect to the substrate to cause the cutout to expose the identification code. The cutout may have a first dimension and a second dimension orthogonal to the first dimension. The first dimension may exceed a corresponding first dimension of the identification code and the second dimension may exceed a corresponding second dimension of the identification code, thereby forming a void region between the identification code and edges of the stiffener structure.Type: GrantFiled: November 12, 2019Date of Patent: April 26, 2022Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Suming Hu, Roden Topacio, Farshad Ghahghahi, Jianguo Li, Andrew Kwan Wai Leung
-
Publication number: 20220122652Abstract: A memory controller interfaces with a random access memory over a memory channel. A refresh control circuit monitors an activate counter which counts a rolling number of activate commands sent over the memory channel to a memory region of the memory. In response to the activate counter being above an intermediate management threshold value, the refresh control circuit only issue a refresh management (RFM) command if there is no REF command currently held at the refresh command circuit for the memory region.Type: ApplicationFiled: December 29, 2021Publication date: April 21, 2022Applicant: Advanced Micro Devices, Inc.Inventors: Kevin M. Brandl, Kedarnath Balakrishnan, Jing Wang, Guanhao Shen
-
Patent number: 11309222Abstract: Various semiconductor chips with solder capped probe test pads are disclosed. In accordance with one aspect of the present invention, a semiconductor chip is provided that includes a substrate, plural input/output (I/O) structures on the substrate and plural test pads on the substrate. Each of the test pads includes a first conductor pad and a first solder cap on the first conductor pad.Type: GrantFiled: August 29, 2019Date of Patent: April 19, 2022Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Lei Fu, Milind S. Bhagavat, Chia-Hao Cheng
-
Patent number: 11307631Abstract: A processing unit includes compute units partitioned into one or islands that are provided with operating voltages and clock signals having clock frequencies independent of providing operating voltages or clock signals to other islands of compute units. The processing unit also includes dynamic voltage and frequency scaling (DVFS) hardware configured to compute one or more numbers of active memory barriers in the one or more islands. The DVFS hardware is also configured to modify the operating voltages or clock frequencies provided to the one or more islands in response to a change in numbers of active memory barriers in the one or more islands. In some cases, the operating voltage or clock frequency provided to an island is increased in response to the number of active memory barriers in the island decreasing. The operating voltage or clock frequency provided to the island is decreased in response to the number of active memory barriers in the island increasing.Type: GrantFiled: May 29, 2019Date of Patent: April 19, 2022Assignee: ADVANCED MICRO DEVICES, INC.Inventor: Vedula Venkata Srikant Bharadwaj
-
Patent number: 11308057Abstract: Described herein is a system and method for multiplexer tree (muxtree) indexing. Muxtree indexing performs hashing and row reduction in parallel by use of each select bit only once in a particular path of the muxtree. The muxtree indexing generates a different final index as compared to conventional hashed indexing but still results in a fair hash, where all table entries get used with equal distribution with uniformly random selects.Type: GrantFiled: November 28, 2017Date of Patent: April 19, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Steven R. Havlir, Patrick J. Shyvers
-
Patent number: 11309911Abstract: A data processing platform, method, and program product perform compression and decompression of a set of data items. Suffix data and a prefix are selected for each respective data item in the set of data items based on data content of the respective data item. The set of data items is sorted based on the prefixes. The prefixes are encoded by querying multiple encoding tables to create a code word containing compressed information representing values of all prefixes for the set of data items. The code word and suffix data for each of the data items are stored in memory. The code word is decompressed to recover the prefixes. The recovered prefixes are paired with their respective suffix data.Type: GrantFiled: August 16, 2019Date of Patent: April 19, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Alexander D. Breslow, Nuwan Jayasena, John Kalamatianos
-
Patent number: 11308648Abstract: Sampling circuitry independently accesses channels of texture data that represent a set of pixels. One or more processing units separately compress the channels of the texture data and store compressed data representative of the channels of the texture data for the set of pixels. The channels can include a red channel, a blue channel, and a green channel that represent color values of the set of pixels and an alpha channel that represents degrees of transparency of the set of pixels. Storing the compressed data can include writing the compress data to portions of a cache. The processing units can identify a subset of the set of pixels that share a value of a first channel of the plurality of channels and represent the value of the first channel over the subset of the set of pixels using information representing the value, the first channel, and boundaries of the subset.Type: GrantFiled: September 23, 2020Date of Patent: April 19, 2022Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULCInventors: Saurabh Sharma, Laurent Lefebvre, Sagar Shankar Bhandare, Ruijin Wu
-
Patent number: 11307993Abstract: For one or more stages of execution of a software application at a first processor, a remap vector of a second processor is reconfigured to represent a dynamic mapping of virtual address groups to physical address groups for that stage. Each bit position of the remap vector is configured to store a value indicating whether a corresponding virtual address group is actively mapped to a corresponding physical address group. Address translation operations issued during a stage of execution of the software application are selectively processed based on the configuration of the remap vector for that stage, with the particular value at the bit position of the remap vector associated with the corresponding virtual address group controlling whether processing of the address translation operation is continued to obtain a virtual-to-physical address translation sought by the address translation operation or processing of the address translation operation is ceased and a fault is issued.Type: GrantFiled: November 26, 2018Date of Patent: April 19, 2022Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULCInventors: Anthony Asaro, Richard E. George
-
Publication number: 20220114097Abstract: Methods, devices, and systems for managing performance of a processor having multiple compute units. An effective number of the multiple compute units may be determined to designate as having priority. On a condition that the effective number is nonzero, the effective number of the multiple compute units may each be designated as a priority compute unit. Priority compute units may have access to a shared cache whereas non-priority compute units may not. Workgroups may be preferentially dispatched to priority compute units. Memory access requests from priority compute units may be served ahead of requests from non-priority compute units.Type: ApplicationFiled: December 20, 2021Publication date: April 14, 2022Applicant: Advanced Micro Devices, Inc.Inventors: Zhe Wang, Sooraj Puthoor, Bradford M. Beckmann
-
Patent number: 11301256Abstract: Embodiments disclose a system and method for reducing virtual address translation latency in a wide execution engine that implements virtual memory. One example method describes a method comprising receiving a wavefront, classifying the wavefront into a subset based on classification criteria selected to reduce virtual address translation latency associated with a memory support structure, and scheduling the wavefront for processing based on the classifying.Type: GrantFiled: August 22, 2014Date of Patent: April 12, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Lisa R. Hsu, James Michael O'Connor
-
Patent number: 11294678Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment logic are disclosed. A processor includes at least a decode unit, scheduler queue assignment logic, scheduler queues, pickers, and execution units. The assignment logic receives a plurality of operations from a decode unit in each clock cycle. The assignment logic includes a separate logical unit for each different type of operation which is executable by the different execution units of the processor. For each different type of operation, the assignment logic determines which of the possible assignment permutations are valid for assigning different numbers of operations to scheduler queues in a given clock cycle. The assignment logic receives an indication of how many operations to assign in the given clock cycle, and then the assignment logic selects one of the valid assignment permutations for the number of operations specified by the indication.Type: GrantFiled: May 29, 2018Date of Patent: April 5, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Matthew T. Sobel, Donald A. Priore, Alok Garg
-
Patent number: 11294724Abstract: An approach is provided for allocating a shared resource to threads in a multi-threaded microprocessor based upon the usefulness of the shared resource to each of the threads. The usefulness of a shared resource to a thread is determined based upon the number of entries in the shared resource that are allocated to the thread and the number of active entries that the thread has in the shared resource. Threads that are allocated a large number of entries in the shared resource and have a small number of active entries in the shared resource, indicative of a low level of parallelism, can operate efficiently with fewer entries in the shared resource, and have their allocation limit in the shared resource reduced.Type: GrantFiled: September 27, 2019Date of Patent: April 5, 2022Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Kai Troester, Neil Marketkar, Matthew T. Sobel, Srinivas Keshav
-
Patent number: 11294747Abstract: A neural network runs a known input data set using an error free power setting and using an error prone power setting. The differences in the outputs of the neural network using the two different power settings determine a high level error rate associated with the output of the neural network using the error prone power setting. If the high level error rate is excessive, the error prone power setting is adjusted to reduce errors by changing voltage and/or clock frequency utilized by the neural network system. If the high level error rate is within bounds, the error prone power setting can remain allowing the neural network to operate with an acceptable error tolerance and improved efficiency. The error tolerance can be specified by the neural network application.Type: GrantFiled: January 31, 2018Date of Patent: April 5, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Andrew G. Kegel, David A. Roberts
-
Patent number: 11294710Abstract: A processing system suspends execution of a program thread based on an access latency required for a program thread to access memory. The processing system employs different memory modules having different memory technologies, located at different points in the processing system, and the like, or a combination thereof. The different memory modules therefore have different access latencies for memory transactions (e.g., memory reads and writes). When a program thread issues a memory transaction that results in an access to a memory module having a relatively long access latency (referred to as “slow” memory), the processor suspends execution of the program thread and releases processor resources used by the program thread. When the processor receives a response to the memory transaction from the memory module, the processor resumes execution of the suspended program thread.Type: GrantFiled: November 10, 2017Date of Patent: April 5, 2022Assignee: Advanced Micro Devices, Inc.Inventor: Douglas Benson Hunt
-
Patent number: 11295507Abstract: A graphics processing unit (GPU) or other apparatus includes a plurality of shader engines. The apparatus also includes a first front end (FE) circuit and one or more second FE circuits. The first FE circuit is configured to schedule geometry workloads for the plurality of shader engines in a first mode. The first FE circuit is configured to schedule geometry workloads for a first subset of the plurality of shader engines and the one or more second FE circuits are configured to schedule geometry workloads for a second subset of the plurality of shader engines in a second mode. In some cases, a partition switch is configured to selectively connect the first FE circuit or the one or more second FE circuits to the second subset of the plurality of shader engines depending on whether the apparatus is in the first mode or the second mode.Type: GrantFiled: November 6, 2020Date of Patent: April 5, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Mark Leather, Michael Mantor
-
Patent number: 11294810Abstract: A processing system includes an interconnect fabric coupleable to a local memory and at least one compute cluster coupled to the interconnect fabric. The compute cluster includes a processor core and a cache hierarchy. The cache hierarchy has a plurality of caches and a throttle controller configured to throttle a rate of memory requests issuable by the processor core based on at least one of an access latency metric and a prefetch accuracy metric. The access latency metric represents an average access latency for memory requests for the processor core and the prefetch accuracy metric represents an accuracy of a prefetcher of a cache of the cache hierarchy.Type: GrantFiled: December 12, 2017Date of Patent: April 5, 2022Assignee: Advanced Micro Devices, Inc.Inventors: William L. Walker, William E. Jones
-
Publication number: 20220100257Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.Type: ApplicationFiled: September 25, 2020Publication date: March 31, 2022Applicant: Advanced Micro Devices, Inc.Inventors: Vedula Venkata Srikant Bharadwaj, Shomit N. Das, Anthony T. Gutierrez, Vignesh Adhinarayanan