Patents Assigned to Advanced Micro Devices
  • Patent number: 11119923
    Abstract: A cache coherence technique for operating a multi-processor system including shared memory includes allocating a cache line of a cache memory of a processor to a memory address in the shared memory in response to execution of an instruction of a program executing on the processor. The technique includes encoding a shared information state of the cache line to indicate whether the memory address is a shared memory address shared by the processor and a second processor, or a private memory address private to the processor, in response to whether the instruction is included in a critical section of the program, the critical section being a portion of the program that confines access to shared, writeable data.
    Type: Grant
    Filed: February 23, 2017
    Date of Patent: September 14, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amin Farmahini Farahani, Nuwan Jayasena
  • Patent number: 11119549
    Abstract: Control of power supplied to a machine intelligence (MI) processor is provided with an energy reservoir and power switching circuitry coupled to a power supply, the energy reservoir, and to power delivery circuitry of the MI processor. Control circuitry directs the power switching circuitry to charge the energy reservoir from the power supply or discharge the energy reservoir to the MI processor based on MI state information obtained from the MI processor. Processes for charging and discharging such an energy reservoir are provided. Processes for analyzing state information of the MI processor and configuring the control circuitry are also provided.
    Type: Grant
    Filed: October 30, 2018
    Date of Patent: September 14, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Greg Sadowski
  • Patent number: 11119926
    Abstract: Systems, apparatuses, and methods for maintaining a region-based cache directory are disclosed. A system includes multiple processing nodes, with each processing node including a cache subsystem. The system also includes a cache directory to help manage cache coherency among the different cache subsystems of the system. In order to reduce the number of entries in the cache directory, the cache directory tracks coherency on a region basis rather than on a cache line basis, wherein a region includes multiple cache lines. Accordingly, the system includes a region-based cache directory to track regions which have at least one cache line cached in any cache subsystem in the system. The cache directory includes a reference count in each entry to track the aggregate number of cache lines that are cached per region. If a reference count of a given entry goes to zero, the cache directory reclaims the given entry.
    Type: Grant
    Filed: December 18, 2017
    Date of Patent: September 14, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Vydhyanathan Kalyanasundharam, Kevin M. Lepak, Amit P. Apte, Ganesh Balakrishnan, Eric Christopher Morton, Elizabeth M. Cooper, Ravindra N. Bhargava
  • Publication number: 20210279065
    Abstract: Techniques are provided for performing memory operations. The techniques include issuing, by a processor, a fence primitive to a memory system, the fence primitive issued in a manner that indicates a program order of memory operation execution.
    Type: Application
    Filed: March 3, 2020
    Publication date: September 9, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Shaizeen Aga, Anirban Nag
  • Patent number: 11115693
    Abstract: Systems, apparatuses, and methods for performing efficient video transmission are disclosed. In a video processing system, a transmitter sends encoded pixel data to a receiver. The receiver stores the encoded pixel data in a buffer at an input data rate. A decoder of the receiver reads the pixel data from the buffer at an output data rate. Each of a transmitter and the receiver maintains a respective synchronization counter. When detecting a start of a frame, each of the transmitter and the receiver stores a respective frame start count as a copy of a current value of the respective synchronization counter. The transmitter sends its frame start count to the receiver. The receiver determines a difference between the respective frame start counts, and adjusts the decoding rate based on the difference.
    Type: Grant
    Filed: March 27, 2019
    Date of Patent: September 7, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Stephen Mark Ryan, Carson Ryley Reece Green
  • Patent number: 11113061
    Abstract: Described herein are techniques for saving registers in the event of a function call. The techniques include modifying a program including a block of code designated as a calling code that calls a function. The modifying includes modifying the calling code to set a register usage mask indicating which registers are in use at the time of the function call. The modifying also includes modifying the function to combine the information of the register usage mask with information indicating registers used by the function to generate registers to be saved and save the registers to be saved.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: September 7, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Michael John Bedy
  • Patent number: 11112926
    Abstract: Systems, apparatuses, and methods for implementing enhanced scaling techniques for display objects are disclosed. When graphical content is created by an application, display objects register with a scaling manager to be notified of display scaling events. These display scaling events can be caused by changing displays, changing resolution or other parameters on a display, changing a text size, resizing one or more graphical elements, or otherwise. When a display scaling event is detected, display objects are notified of the event by the scaling manager. If a given display object makes a decision to change the amount of space it occupies based on the event, the given display object notifies its parent object of the desired change. The parent can then decide whether to allow the change and/or to make adjustments to other display objects to accommodate the change sought by the given display object.
    Type: Grant
    Filed: October 30, 2020
    Date of Patent: September 7, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Peter James Lohrmann
  • Patent number: 11113056
    Abstract: A technique for performing store-to-load forwarding is provided. The technique includes determining a virtual address for data to be loaded for the load instruction, identifying a matching store instruction from one or more store instruction memories by comparing a virtual-address-based comparison value for the load instruction to one or more virtual-address-based comparison values of one or more store instructions, determining a physical address for the load instruction, and validating the load instruction based on a comparison between the physical address of the load instruction and a physical address of the matching store instruction.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: September 7, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John M. King, Matthew T. Sobel
  • Patent number: 11113065
    Abstract: A technique for speculatively executing load-dependent instructions includes detecting that a memory ordering consistency queue is full for a completed load instruction. The technique also includes storing data loaded by the completed load instruction into a storage location for storing data when the memory ordering consistency queue is full. The technique further includes speculatively executing instructions that are dependent on the completed load instruction. The technique also includes in response to a slot becoming available in the memory ordering consistency queue, replaying the load instruction. The technique further includes in response to receiving loaded data for the replayed load instruction, testing for a data mis-speculation by comparing the loaded data for the replayed load instruction with the data loaded by the completed load instruction that is stored in the storage location.
    Type: Grant
    Filed: October 31, 2019
    Date of Patent: September 7, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Susumu Mashimo, Krishnan V. Ramani, Scott Thomas Bingham
  • Publication number: 20210272354
    Abstract: Improvements to graphics processing pipelines are disclosed. More specifically, the vertex shader stage, which performs vertex transformations, and the hull or geometry shader stages, are combined. If tessellation is disabled and geometry shading is enabled, then the graphics processing pipeline includes a combined vertex and graphics shader stage. If tessellation is enabled, then the graphics processing pipeline includes a combined vertex and hull shader stage. If tessellation and geometry shading are both disabled, then the graphics processing pipeline does not use a combined shader stage. The combined shader stages improve efficiency by reducing the number of executing instances of shader programs and associated resources reserved.
    Type: Application
    Filed: April 19, 2021
    Publication date: September 2, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Mangesh P. Nijasure, Randy W. Ramsey, Todd Martin
  • Patent number: 11106594
    Abstract: Systems, apparatuses, and methods for generating a measurement of write memory bandwidth are disclosed. A control unit monitors writes to a cache hierarchy. If a write to a cache line is a first time that the cache line is being modified since entering the cache hierarchy, then the control unit increments a write memory bandwidth counter. Otherwise, if the write is to a cache line that has already been modified since entering the cache hierarchy, then the write memory bandwidth counter is not incremented. The first write to a cache line is a proxy for write memory bandwidth since this will eventually cause a write to memory. The control unit uses the value of the write memory bandwidth counter to generate a measurement of the write memory bandwidth. Also, the control unit can maintain multiple counters for different thread classes to calculate the write memory bandwidth per thread class.
    Type: Grant
    Filed: September 5, 2019
    Date of Patent: August 31, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Paul James Moyer, Douglas Benson Hunt
  • Patent number: 11106596
    Abstract: Methods, devices, and systems for determining an address in a physical memory which corresponds to a virtual address using a skewed-associative translation lookaside buffer (TLB) are described. A virtual address and a configuration indication are received using receiver circuitry. A physical address corresponding to the virtual address is output if a TLB hit occurs. A first subset of a plurality of ways of the TLB is configured to hold a first page size. The first subset includes a number of the ways based on the configuration indication. A physical address corresponding to the virtual address is retrieved from a page table if a TLB miss occurs, and at least a portion of the physical address is installed in a least recently used way of a subset of a plurality of ways the TLB, determined according to a replacement policy based on the configuration indication.
    Type: Grant
    Filed: December 23, 2016
    Date of Patent: August 31, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John M. King, Michael T. Clark
  • Patent number: 11108382
    Abstract: An oscillator circuit includes a first oscillator, a second oscillator, and a calibration circuit to calibrate the first and second oscillators. The first oscillator is supplied with a first supply voltage, and the second oscillator is supplied with a second supply voltage. The calibration includes setting a frequency control of the second oscillator at a target frequency. Then, a voltage control of the second supply voltage is adjusted incrementally until a first control value is identified at which a second oscillator output frequency matches the target frequency. Then, a voltage control of the first supply voltage is set to the first control value. Then, the voltage control for the first supply voltage is adjusted incrementally until a second control value is identified at which a first oscillator output frequency is as close to the second oscillator output frequency as is achievable, but does not exceed it.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: August 31, 2021
    Assignees: ATI Technologies ULC, Advanced Micro Devices, Inc.
    Inventors: Joyce Cheuk Wai Wong, Naeem Ibrahim Ally, Jonathan Hauke, Stephen Victor Kosonocky
  • Patent number: 11106600
    Abstract: A processing system adjusts a cache replacement priority of cache lines at a cache based on evictions of entries mapping virtual-to-physical address translations from a translation lookaside buffer (TLB). Upon eviction of a TLB entry, the processing system identifies cache lines corresponding to the physical addresses of the evicted TLB entry and evicts the cache lines or adjusts the cache replacement priority of the cache lines so that their eviction from the cache will be accelerated.
    Type: Grant
    Filed: January 24, 2019
    Date of Patent: August 31, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Gabriel H. Loh, Paul Moyer
  • Patent number: 11100604
    Abstract: Systems, apparatuses, and methods for scheduling jobs for multiple frame-based applications are disclosed. A computing system executes a plurality of frame-based applications for generating pixels for display. The applications convey signals to a scheduler to notify the scheduler of various events within a given frame being rendered. The scheduler adjusts the priorities of applications based on the signals received from the applications. The scheduler attempts to adjust priorities of applications and schedule jobs from these applications so as to minimize the perceived latency of each application. When an application has enqueued the last job for the current frame, the scheduler raises the priority of the application to high. This results in the scheduler attempting to schedule all remaining jobs for the application back-to-back. Once all jobs of the application have been completed, the priority of the application is reduced, permitting jobs of other applications to be executed.
    Type: Grant
    Filed: January 31, 2019
    Date of Patent: August 24, 2021
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Jeffrey Gongxian Cheng, Ahmed M. Abdelkhalek, Yinan Jiang, Xingsheng Wan, Anthony Asaro, David Martinez Nieto
  • Patent number: 11100004
    Abstract: A processor uses the same virtual address space for heterogeneous processing units of the processor. The processor employs different sets of page tables for different types of processing units, such as a CPU and a GPU, wherein a memory management unit uses each set of page tables to translate virtual addresses of the virtual address space to corresponding physical addresses of memory modules associated with the processor. As data is migrated between memory modules, the physical addresses in the page tables can be updated to reflect the physical location of the data for each processing unit.
    Type: Grant
    Filed: June 23, 2015
    Date of Patent: August 24, 2021
    Assignees: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULC
    Inventors: Gongxian Jeffrey Cheng, Mark Fowler, Philip J. Rogers, Benjamin T. Sander, Anthony Asaro, Mike Mantor, Raja Koduri
  • Patent number: 11099786
    Abstract: A memory controller interfaces with a non-volatile storage class memory (SCM) module over a heterogeneous memory channel, and includes a command queue for receiving memory access commands. A memory interface queue is coupled to the command queue for holding outgoing commands. A non-volatile command queue is coupled to the command queue for storing non-volatile read commands that are placed in the memory interface queue. An arbiter selects entries from the command queue, and places them in the memory interface queue for transmission over a heterogeneous memory channel. A control circuit is coupled to the heterogeneous memory channel for receiving a ready response from the non-volatile SCM module indicating that responsive data is available for a non-volatile read command, and in response to receiving the ready response, causing a send command to be placed in the memory interface queue for commanding the non-volatile SCM module to send the responsive data.
    Type: Grant
    Filed: December 30, 2019
    Date of Patent: August 24, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: James R. Magro, Kedarnath Balakrishnan
  • Patent number: 11099788
    Abstract: An approach is provided for implementing near-memory data reduction during store operations to off-chip or off-die memory. A Near-Memory Reduction (NMR) unit provides near-memory data reduction during write operations to a specified address range. The NMR unit is configured with a range of addresses to be reduced and when a store operation specifies an address within the range of addresses, the NRM unit performs data reduction by adding the data value specified by the store operation to an accumulated reduction result. According to an embodiment, the NRM unit maintains a count of the number of updates to the accumulated reduction result that are used to determine when data reduction has been completed.
    Type: Grant
    Filed: October 21, 2019
    Date of Patent: August 24, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Nuwan Jayasena, Shaizeen Aga
  • Patent number: 11099846
    Abstract: A method and apparatus generates control information that indicates whether to change a counter value associated with a particular load instruction. In response to the control information, the method and apparatus causes a hysteresis effect for operating between a speculative mode and a non-speculative mode based on the counter value. The hysteresis effect is in favor of the non-speculative mode. The method and apparatus causes the hysteresis effect by incrementing the counter value associated with the particular load instruction by a first value or decrementing the counter value by a second value. The first value is greater than the second value.
    Type: Grant
    Filed: June 20, 2018
    Date of Patent: August 24, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Krishnan V. Ramani, Chetana N. Keltcher
  • Publication number: 20210255871
    Abstract: A technique for processing qubits in a quantum computing device is provided. The technique includes determining that, in a first cycle, a first quantum processing region is to perform a first quantum operation that does not use a qubit that is stored in the first quantum processing region, identifying a second quantum processing region that is to perform a second quantum operation at a second cycle that is later than the first cycle, wherein the second quantum operation uses the qubit, determining that between the first cycle and the second cycle, no quantum operations are performed in the second quantum processing region, and moving the qubit from the first quantum processing region to the second quantum processing region.
    Type: Application
    Filed: February 18, 2020
    Publication date: August 19, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Onur Kayiran, Jieming Yin, Yasuko Eckert