Patents Assigned to Advanced Micro Devices
  • Publication number: 20220103907
    Abstract: Techniques are provided herein for processing video data. The techniques include identifying one or more input factors including one or more of signal quality factors, video content complexity factors, and hardware buffering factors for one or more of a video encoding system and a video playback system; evaluating the one or more input factors to determine adjustments to apply to one or both of the video encoding system and the video playback system; and applying the determine adjustments to the one or both of the video encoding system and the video playback system.
    Type: Application
    Filed: September 25, 2020
    Publication date: March 31, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Adam H. Li, Eugene Kuznetsov, Girish P. Subramaniam, Jihyuk Choi
  • Publication number: 20220100662
    Abstract: The techniques described herein improve cache traffic performance in the context of contended lock instructions. More specifically, each core maintains a lock address contention table that stores addresses corresponding to contended lock instructions. The lock address contention table also includes a state value that indicates progress through a series of states meant to track whether a load by the core in a spin-loop associated with semaphore acquisition has obtained the semaphore in an exclusive state. Upon detecting that a load in a spin-loop has obtained the semaphore in an exclusive state, the core responds to incoming requests for access to the semaphore with negative acknowledgments. This allows the core to maintain the semaphore cache line in an exclusive state, which allows it to acquire the semaphore faster and to avoid transmitting that cache line to other cores unnecessarily.
    Type: Application
    Filed: December 9, 2021
    Publication date: March 31, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventors: John M. King, Gregory W. Smaus
  • Publication number: 20220101110
    Abstract: Techniques are disclosed for performing machine learning operations. The techniques include fetching weights for a first layer in a first format; performing matrix multiplication of the weights fetched in the first format with values provided by a prior layer in a forwards training pass; fetching the weights for the first layer in a second format different from the first format; and performing matrix multiplication for a backwards pass, the matrix multiplication including multiplication of the weights fetched in the second format with values corresponding to values provided as the result of the forwards training pass for the first layer.
    Type: Application
    Filed: September 25, 2020
    Publication date: March 31, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Swapnil P. Sakharshete, Maxim V. Kazakov
  • Publication number: 20220101179
    Abstract: Techniques are disclosed for communicating between a machine learning accelerator and one or more processing cores. The techniques include obtaining data at the machine learning accelerator via an input/output die; processing the data at the machine learning accelerator to generate machine learning processing results; and exporting the machine learning processing results via the input/output die, wherein the input/output die is coupled to one or more processor chiplets via one or more processor ports, and wherein the input/output die is coupled to the machine learning accelerator via an accelerator port.
    Type: Application
    Filed: September 25, 2020
    Publication date: March 31, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Maxim V. Kazakov
  • Publication number: 20220100391
    Abstract: A framework disclosed herein extends a relaxed, scoped memory model to a system that includes nodes across a commodity network and maintains coherency across the system. A new scope, cluster scope, is defined, that allows for memory accesses at scopes less than cluster scope to operate on locally cached versions of remote data from across the commodity network without having to issue expensive network operations. Cluster scope operations generate network commands that are used to synchronize memory across the commodity network.
    Type: Application
    Filed: September 25, 2020
    Publication date: March 31, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Michael W. LeBeane, Khaled Hamidouche, Hari S. Thangirala, Brandon Keith Potter
  • Patent number: 11288095
    Abstract: A technique for synchronizing workgroups is provided. The techniques comprise detecting that one or more non-executing workgroups are ready to execute, placing the one or more non-executing workgroups into one or more ready queues based on the synchronization status of the one or more workgroups, detecting that computing resources are available for execution of one or more ready workgroups, and scheduling for execution one or more ready workgroups from the one or more ready queues in an order that is based on the relative priority of the ready queues.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: March 29, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexandru Dutu, Matthew D. Sinclair, Bradford M. Beckmann, David A. Wood
  • Patent number: 11288205
    Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.
    Type: Grant
    Filed: June 23, 2015
    Date of Patent: March 29, 2022
    Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC
    Inventors: Benjamin T. Sander, Mark Fowler, Anthony Asaro, Gongxian Jeffrey Cheng, Mike Mantor
  • Patent number: 11289131
    Abstract: Systems, apparatuses, and methods for implementing dynamic control of a multi-region fabric are disclosed. A system includes at least one or more processing units, one or more memory devices, and a communication fabric coupled to the processing unit(s) and memory device(s). The system partitions the fabric into multiple regions based on different traffic types and/or periodicities of the clients connected to the regions. For example, the system partitions the fabric into a stutter region for predictable, periodic clients and a non-stutter region for unpredictable, non-periodic clients. The system power-gates the entirety of the fabric in response to detecting a low activity condition. After power-gating the entirety of the fabric, the system periodically wakes up one or more stutter regions while keeping the other non-stutter regions in power-gated mode. Each stutter region monitors stutter client(s) for activity and processes any requests before going back into power-gated mode.
    Type: Grant
    Filed: December 7, 2020
    Date of Patent: March 29, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Benjamin Tsien, Alexander J. Branover, Alan Dodson Smith, Chintan S. Patel
  • Patent number: 11290515
    Abstract: Systems, apparatuses, and methods for implementing real-time, low-latency packetization protocols for live compressed video data are disclosed. A wireless transmitter includes at least a codec and a media access control (MAC) layer unit. In order for the codec to communicate with the MAC layer unit, the codec encodes the compression ratio in a header embedded inside the encoded video stream. The MAC layer unit extracts the compression ratio from the header and determines a modulation coding scheme (MCS) for transferring the video stream based on the compression ratio. The MAC layer unit and the codec also implement a feedback loop such that the MAC layer unit can command the codec to adjust the compression ratio. Since the changes to the video might not be implemented immediately, the MAC layer unit relies on the header to determine when the video data is coming in with the requested compression ratio.
    Type: Grant
    Filed: December 7, 2017
    Date of Patent: March 29, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ngoc Vinh Vu, Darren Rae Di Cera, Adam William Lynch, Shane Bentley, Douglas Mammoser, David Robert Stark, Jr.
  • Publication number: 20220091974
    Abstract: A processing device and methods of controlling remote persistent writes are provided. Methods include receiving an instruction of a program to issue a persistent write to remote memory. The methods also include logging an entry in a local domain when the persistent write instruction is received and providing a first indication that the persistent write will be persisted to the remote memory. The methods also include executing the persistent write to the remote memory and providing a second indication that the persistent write to the remote memory is completed. The methods also include providing the first and second indications when it is determined not to execute the persistent write according to global ordering and providing the second indication without providing the first indication when it is determined to execute the persistent write to remote memory according to global ordering.
    Type: Application
    Filed: September 24, 2020
    Publication date: March 24, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Shaizeen Aga
  • Publication number: 20220091921
    Abstract: A data processor includes provides memory commands to a memory channel according to predetermined criteria. The data processor includes a first error code generation circuit, a second error code generation circuit, and a queue. The first error code generation circuit generates a first type of error code in response to data of a write request. The second error code generation circuit generates a second type of error code for the write request, the second type of error code different from the first type of error code. The queue is coupled to the first error code generation circuit and to the second error code generation circuit, for provides write commands to an interface, the write commands including the data, the first type of error code, and the second type of error code.
    Type: Application
    Filed: December 7, 2021
    Publication date: March 24, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Kedarnath Balakrishnan, James R. Magro, Kevin Michael Lepak, Vilas Sridharan
  • Publication number: 20220091784
    Abstract: A memory controller includes a command queue having a first input for receiving memory access requests, and a memory interface queue having an output for coupling to a memory channel adapted for connecting to at least one dynamic random access memory (DRAM) module. A refresh control circuit monitors activate commands to be sent over the memory channel. In response to an activate command meeting a designated condition, the refresh control circuit identifies a candidate aggressor row associated with the activate command. A command is sent to the DRAM requesting that the candidate aggressor row be queued for mitigation in a future refresh or refresh management event.
    Type: Application
    Filed: September 21, 2020
    Publication date: March 24, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Kevin M. Brandl
  • Publication number: 20220092001
    Abstract: Described is a method and apparatus for application migration between a dockable device and a docking station in a seamless manner. The dockable device includes a processor and the docking station includes a high-performance processor. The method includes determining a docking state of a dockable device while at least an application is running. Application migration from the dockable device to a docking station is initiated when the dockable device is moving to a docked state. Application migration from the docking station to the dockable device is initiated when the dockable device is moving to an undocked state. The application continues to run during the application migration from the dockable device to the docking station or during the application migration from the docking station to the dockable device.
    Type: Application
    Filed: December 6, 2021
    Publication date: March 24, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Jonathan Lawrence Campbell, Yuping Shen
  • Publication number: 20220091822
    Abstract: A multiply-accumulate computation is performed using digital logic circuits. To perform the computation, a plurality of target signals are received at a respective plurality of ripple counters. The counter outputs of the respective ripple counters are scaled by setting stop count values. Counter outputs of the respective ripple counters are adjusted with respective constant values by setting counter reset values at the respective ripple counters. Each count pulses of the target signals for an adjusted period. The count values of the ripple counters are summed. The results may be used to calculate an average value for an adaptive voltage and frequency scaling process.
    Type: Application
    Filed: September 22, 2020
    Publication date: March 24, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Ravinder Reddy Rachala, Stephen Victor Kosonocky, Miguel Rodriguez
  • Publication number: 20220091880
    Abstract: Techniques for executing workgroups are provided. The techniques include executing, for a first workgroup of a first kernel dispatch, a workgroup dependency instruction that includes an indication to prioritize execution of a second workgroup of a second kernel dispatch, and in response to the workgroup dependency instruction, dispatching the second workgroup of the second kernel dispatch prior to dispatching a third workgroup of the second kernel dispatch, wherein no workgroup dependency instruction including an indication to prioritize execution of the third workgroup has been executed.
    Type: Application
    Filed: September 24, 2020
    Publication date: March 24, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Alexandru Dutu, Marcus Nathaniel Chow, Matthew D. Sinclair, Bradford M. Beckmann, David A. Wood
  • Patent number: 11281466
    Abstract: A floating point unit includes a non-pickable scheduler queue (NSQ) that offers a load operation concurrently with a load store unit retrieving load data for an operand that is to be loaded by the load operation. The floating point unit also includes a renamer that renames architectural registers used by the load operation and allocates physical register numbers to the load operation in response to receiving the load operation from the NSQ. The floating point unit further includes a set of pickable scheduler queues that receive the load operation from the renamer and store the load operation prior to execution. A physical register file is implemented in the floating point unit and a free list is used to store physical register numbers of entries in the physical register file that are available for allocation.
    Type: Grant
    Filed: October 22, 2019
    Date of Patent: March 22, 2022
    Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC
    Inventors: Arun A. Nair, Michael Estlick, Erik Swanson, Sneha V. Desai, Donglin Ji
  • Patent number: 11284096
    Abstract: A host processor, such as a central processing unit (CPU), programmed to execute a software driver that causes the host processor to generate a motion compensation command for a plurality of cores of a massively parallel processor, such as a graphics processing unit (GPU), to provide motion compensation for encoded video. The motion compensation command for the plurality of cores of the massively parallel processor contains executable instructions for processing a plurality of motion vectors grouped by a plurality of prediction modes from a re-ordered motion vector buffer by the plurality of cores of the massively parallel processor.
    Type: Grant
    Filed: February 25, 2021
    Date of Patent: March 22, 2022
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Michael L. Schmit, Ashish Farmer, Radhakrishna Giduthuri
  • Patent number: 11281495
    Abstract: A system and method for providing security of sensitive information within chips using SIMD micro-architecture are described. A command processor within a parallel data processing unit, such as a graphics processing unit (GPU), schedules commands across multiple compute units based on state information. When the command processor determines a rescheduling condition is satisfied, it causes the overwriting of at least a portion of data stored in each of the one or more local memories used by the multiple compute units. The command processor also stores in the secure memory a copy of state information associated with a given group of commands and later checks it to ensure corruption by a malicious or careless program is prevented.
    Type: Grant
    Filed: August 27, 2018
    Date of Patent: March 22, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Rex Eldon McCrary
  • Patent number: 11281470
    Abstract: A processing device is provided which comprises memory and a processor. The processor is configured to receive an array of floating point numbers each having a plurality of bits used to represent a probability value. For each floating point number, the processor is configured to replace values in a portion of the bits used to represent the probability value with index values to represent an index corresponding to a location of a corresponding floating point number in the memory. The processor is also configured to process the floating point numbers using SIMD instructions to execute one of an argmax operation and an argmin operation.
    Type: Grant
    Filed: December 19, 2019
    Date of Patent: March 22, 2022
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Michael L. Schmit, Lakshmi Kumar
  • Patent number: 11281592
    Abstract: Memories that are configurable to operate in either a banked mode or a bit-separated mode. The memories include a plurality of memory banks; multiplexing circuitry; input circuitry; and output circuitry. The input circuitry inputs at least a portion of a memory address and configuration information to the multiplexing circuitry. The multiplexing circuitry generates read data by combining a selected subset of data corresponding to the address from each of the plurality of memory banks, the subset selected based on the configuration information, if the configuration information indicates a bit-separated mode. The multiplexing circuitry generates the read data by combining data corresponding to the address from one of the memory banks, the one of the memory banks selected based on the configuration information, if the configuration information indicates a banked mode. The output circuitry outputs the generated read data from the memory.
    Type: Grant
    Filed: November 11, 2019
    Date of Patent: March 22, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Russell J. Schreiber