Patents Assigned to Advanced Micros Devices, Inc.
  • Patent number: 11055150
    Abstract: A thread holding a lock notifies a sleeping thread that is waiting on the lock that the lock holding thread is “about” to release the lock. In response to the notification, the waiting thread is woken up. While the waiting thread is woken up, the lock holding thread completes other operations prior to actually releasing the lock and then releases the lock. The notification to the waiting thread hides latency associated with waking up the waiting thread by allowing operations that wake up the waiting thread to occur while the lock holding thread is performing the other operations prior to releasing the thread.
    Type: Grant
    Filed: April 12, 2018
    Date of Patent: July 6, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Amin Farmahini-Farahani, David A. Roberts
  • Patent number: 11055895
    Abstract: Described herein are techniques for reducing control flow divergence. The method includes identifying two or more shader programs having commonalities, generating a merged shader program that implements functionality of the identified two or more shader programs, wherein the functionality implemented includes a first execution option for a first shader program of the two or more shader programs and a second execution option for a second shader program of the two or more shader programs, modifying shader programs that call the first shader program to instead call the merged shader program and select the first execution option, modifying shader programs that call the second shader program to instead call the merged shader program and select the second execution option.
    Type: Grant
    Filed: August 29, 2019
    Date of Patent: July 6, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventor: David Ronald Oldcorn
  • Patent number: 11054887
    Abstract: Systems, apparatuses, and methods for performing efficient power management for a multi-node computing system are disclosed. A computing system includes multiple nodes. When power down negotiation is distributed, negotiation for system-wide power down occurs within a lower level of a node hierarchy prior to negotiation for power down occurring at a higher level of the node hierarchy. When power down negotiation is centralized, a given node combines a state of its clients with indications received on its downstream link and sends an indication on an upstream link based on the combining. Only a root node sends power down requests.
    Type: Grant
    Filed: December 28, 2017
    Date of Patent: July 6, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Benjamin Tsien, Greggory D. Donley, Bryan P. Broussard
  • Patent number: 11055098
    Abstract: A processor includes a branch target buffer (BTB) having a plurality of entries whereby each entry corresponds to an associated instruction pointer value that is predicted to be a branch instruction. Each BTB entry stores a predicted branch target address for the branch instruction, and further stores information indicating whether the next branch in the block of instructions associated with the predicted branch target address is predicted to be a return instruction. In response to the BTB indicating that the next branch is predicted to be a return instruction, the processor initiates an access to a return stack that stores the return address for the predicted return instruction. By initiating access to the return stack responsive to the return prediction stored at the BTB, the processor reduces the delay in identifying the return address, thereby improving processing efficiency.
    Type: Grant
    Filed: July 24, 2018
    Date of Patent: July 6, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Aparna Thyagarajan, Marius Evers, Arunachalam Annamalai
  • Patent number: 11055806
    Abstract: A method and system for directing image rendering, implemented in a computer system including a plurality of processors includes determining one or more processors in the system on which to execute one or more commands. A graphics processing unit (GPU) control application program interface (API) determines one or more processors in the system on which to execute one or more commands. A signal is transmitted to each of the one or more processors indicating which of the one or more commands are to be executed by that processor. The one or more processors execute their respective command. A request is transmitted to each of the one or more processors to transfer information to one another once processing is complete, and an image is rendered based upon the processed information by at least one processor and the received transferred information from at least another processor.
    Type: Grant
    Filed: February 24, 2016
    Date of Patent: July 6, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Gregory A. Grebe, Jonathan Lawrence Campbell, Layla A. Mah
  • Publication number: 20210200298
    Abstract: Methods, devices and systems for power management in a computer processing device are disclosed. The methods may include selecting, by a data fabric, D23 as target state, selecting D3 state by a memory controller, blocking memory access, reducing data fabric and memory controller clocks, reduce SoC voltage, and turning PHY voltage off. The methods may include signaling to wake up the SoC, starting exit flow by ramping up SoC voltage and ramping data fabric and memory controller clocks, unblocking memory access, propagating activity associated with the wake up event to memory, exiting D3 by PHY, and exiting self-refresh by a memory.
    Type: Application
    Filed: December 30, 2019
    Publication date: July 1, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Alexander J. Branover, Benjamin Tsien
  • Publication number: 20210200618
    Abstract: A memory controller includes a command queue, a memory interface queue, and a non-volatile error reporting circuit. The command queue receives memory access commands including volatile reads, volatile writes, non-volatile reads, and non-volatile writes, and an output. The memory interface queue has an input coupled to the output of the command queue, and an output for coupling to a non-volatile storage class memory (SCM) module. The non-volatile error reporting circuit identifies error conditions associated with the non-volatile SCM module and maps the error conditions from a first number of possible error conditions associated with the non-volatile SCM module to a second, smaller number of virtual error types for reporting to an error monitoring module of a host operating system, the mapping based at least on a classification that the error condition will or will not have a deleterious effect on an executable process running on the host operating system.
    Type: Application
    Filed: December 30, 2019
    Publication date: July 1, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: James R. Magro, Kedarnath Balakrishnan, Vilas Sridharan
  • Publication number: 20210200649
    Abstract: A memory controller includes a command queue, a memory interface queue, at least one storage queue, and a replay control circuit. The command queue has a first input for receiving memory access commands. The memory interface queue receives commands selected from the command queue and couples to a heterogeneous memory channel which is coupled to at least one non-volatile storage class memory (SCM) module. The at least one storage queue stores memory access commands that are placed in the memory interface queue. The replay control circuit detects that an error has occurred requiring a recovery sequence, and in response to the error, initiates the recovery sequence. In the recovery sequence, the replay control circuit transmits selected memory access commands from the at least one storage queue by grouping non-volatile read commands together separately from all pending volatile reads, volatile writes, and non-volatile writes.
    Type: Application
    Filed: December 30, 2019
    Publication date: July 1, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Jing Wang, James R. Magro, Kedarnath Balakrishnan
  • Publication number: 20210201986
    Abstract: Methods for reducing boot time of a system-on-a-chip (SOC) by reducing double data rate (DDR) memory training and memory context restore. Dynamic random access memory (DRAM) controller and DDR physical interface (PHY) settings are stored into a non-volatile memory and the DRAM controller and DDR PHY are powered down. On system resume, a basic input/output system restores the DRAM controller and DDR PHY settings from non-volatile memory, and finalizes the DRAM controller and DDR PHY settings for operation with the SOC. Reducing the boot time of the SOC by reducing DDR training includes setting DRAMs into self-refresh mode, and programing a self-refresh state machine memory operation (MOP) array to exit self-refresh mode and update any DRAM device state for the target power management state. The DRAM device is reset, and the self-refresh state machine MOP array reinitializes the DRAM device state for the target power management state.
    Type: Application
    Filed: December 30, 2019
    Publication date: July 1, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Kevin M. Brandl, Naveen Davanam, Oswin E. Housty
  • Publication number: 20210200468
    Abstract: Memory access commands are placed in a memory interface queue and transmitted from the memory interface queue to a heterogeneous memory channel coupled to a volatile dual in-line memory module (DIMM) and a non-volatile DIMM. Selected memory access commands that are placed in the memory interface queue are stored in a replay queue. The non-volatile reads that are placed in the memory interface queue are in a non-volatile command queue (NV queue). The method detects, based on information received over the heterogeneous memory channel, that an error has occurred requiring a recovery sequence. In response to the error, the method initiates the recovery sequence including (i) transmitting selected memory access commands that are stored in the replay queue, and (ii) transmitting non-volatile reads that are stored in the NV queue.
    Type: Application
    Filed: December 30, 2019
    Publication date: July 1, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Jing Wang, James R. Magro, Kedarnath Balakrishnan
  • Publication number: 20210200467
    Abstract: A memory controller interfaces with a non-volatile storage class memory (SCM) module over a heterogenous memory channel, and includes a command queue for receiving memory access commands. A memory interface queue is coupled to the command queue for holding outgoing commands. A non-volatile command queue is coupled to the command queue for storing non-volatile read commands that are placed in the memory interface queue. An arbiter selects entries from the command queue, and places them in the memory interface queue for transmission over a heterogenous memory channel. A control circuit is coupled to the heterogenous memory channel for receiving a ready response from the non-volatile SCM module indicating that responsive data is available for a non-volatile read command, and in response to receiving the ready response, causing a send command to be placed in the memory interface queue for commanding the non-volatile SCM module to send the responsive data.
    Type: Application
    Filed: December 30, 2019
    Publication date: July 1, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: James R. Magro, Kedarnath Balakrishnan
  • Patent number: 11049794
    Abstract: Various circuit board embodiments are disclosed. In one aspect, an apparatus is provided that includes a circuit board and a first phase change material pocket positioned on or in the circuit board and contacting a surface of the circuit board.
    Type: Grant
    Filed: March 1, 2014
    Date of Patent: June 29, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Manish Arora, Nuwan Jayasena
  • Patent number: 11048506
    Abstract: A system and method for tracking stores and loads to reduce load latency when forming the same memory address by bypassing a load store unit within an execution unit is disclosed. Store-load pairs which have a strong history of store-to-load forwarding are identified. Once identified, the load is memory renamed to the register stored by the store. The memory dependency predictor may also be used to detect loads that are dependent on a store but cannot be renamed. In such a configuration, the dependence is signaled to the load store unit and the load store unit uses the information to issue the load after the identified store has its physical address.
    Type: Grant
    Filed: June 24, 2019
    Date of Patent: June 29, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Krishnan V. Ramani, Kai Troester, Frank C. Galloway, David N. Suggs, Michael D. Achenbach, Betty Ann McDaniel, Marius Evers
  • Publication number: 20210192087
    Abstract: Devices, methods, and systems for secure communications on a computing device. A host operating system (OS) runs on a host processor in communication with a host memory. A secure OS runs on a coprocessor in communication with a secure memory. The coprocessor receives information from an external device over a secure peer-to-peer (P2P) connection. The secure P2P connection is managed by the secure OS and is not accessible by the host OS.
    Type: Application
    Filed: December 19, 2019
    Publication date: June 24, 2021
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Guhan Krishnan, Carl K. Wakeland, Saikishore Reddipalli, Philip Ng
  • Publication number: 20210191732
    Abstract: A processing device is provided which comprises memory and a processor. The processor is configured to receive an array of floating point numbers each having a plurality of bits used to represent a probability value. For each floating point number, the processor is configured to replace values in a portion of the bits used to represent the probability value with index values to represent an index corresponding to a location of a corresponding floating point number in the memory. The processor is also configured to process the floating point numbers using SIMD instructions to execute one of an argmax operation and an argmin operation.
    Type: Application
    Filed: December 19, 2019
    Publication date: June 24, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Michael L. Schmit, Lakshmi Kumar
  • Publication number: 20210192827
    Abstract: Techniques for performing shader operations are provided. The techniques include, performing pixel shading at a shading rate defined by pixel shader variable rate shading (“VRS”) data, updating the pixel VRS data that indicates one or more shading rates for one or more tiles based on whether the tiles of the one or more tiles include triangle edges or do not include triangle edges, to generate updated VRS data, and writing a VRS rate feedback buffer based on the updated VRS data.
    Type: Application
    Filed: December 20, 2019
    Publication date: June 24, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Skyler Jonathon Saleh, Vineet Goel, Pazhani Pillai, Ruijin Wu, Christopher J. Brennan, Andrew S. Pomianowski
  • Publication number: 20210191435
    Abstract: A technique for adjusting a power supply for a device is provided. The technique includes detecting a low-power trigger for a device; switching a power supply for the device from a high-power power supply to a low-power power supply; detecting a high-power trigger for a device; and switching a power supply for the device from the low-power power supply to the high-power power supply, wherein the high-power power supply consumes a larger amount of power than the low-power power supply, and wherein the high-power power supply provides a greater amount of noise reducing and a greater tolerance to temperature differences than the low-power power supply.
    Type: Application
    Filed: December 20, 2019
    Publication date: June 24, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Sonu Arora, Michael Arn Nix, Moises E. Robinson, Xiaojie He
  • Publication number: 20210191454
    Abstract: A method and apparatus for synchronizing a time stamp counter (TSC) associated with a processor core in a computer system includes initializing the TSC associated with the processor core by synchronizing the TSC associated with the processor core with at least one other TSC in a hierarchy of TSCs. One or more processor cores are powered down. Upon powering up of the one or more processor cores, the TSC associated with the processor core is synchronized with the at least one other TSC in the hierarchy of TSCs.
    Type: Application
    Filed: December 19, 2019
    Publication date: June 24, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Amitabh Mehra, David M. Dahle, Richard M. Born
  • Publication number: 20210191461
    Abstract: An external module is for use with a designated computing device. The external module includes a body forming a hollow chamber. An external air intake formed in the body and connected to a first portal of the chamber. An air outlet formed in the body along a wall of the chamber and adapted to align with a cooling air intake of the designated computing device when the external module is positioned in a designated relationship to the computing device. A blower is positioned to force air through the external air intake into the chamber and maintain a positive air pressure in the chamber such that the positive air pressure is maintained against at least part of the cooling air intake of the computing device when the air outlet is aligned with the cooling air intake.
    Type: Application
    Filed: December 23, 2019
    Publication date: June 24, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Christopher Jaggers, Constantine Conrad Peter Venizelos
  • Publication number: 20210191865
    Abstract: A coherency management device receives requests to read data from or write data to an address in a main memory. On a write, if the data includes zero data, an entry corresponding to the memory address is created in a cache directory if it does not already exist, is set to an invalid state, and indicates that the data includes zero data. The zero data is not written to main memory or a cache. On a read, the cache directory is checked for an entry corresponding to the memory address. If the entry exists in the cache directory, is invalid, and includes an indication that data corresponding to the memory address includes zero data, the coherency management device returns zero data in response to the request without fetching the data from main memory or a cache.
    Type: Application
    Filed: December 20, 2019
    Publication date: June 24, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Vydhyanathan Kalyanasundharam, Amit P. Apte