Patents Assigned to Advanced Micro Devices

Thread selection at a processor based on branch prediction confidence

Patent number: 10223124

Abstract: A processor employs one or more branch predictors to issue branch predictions for each thread executing at an instruction pipeline. Based on the branch predictions, the processor determines a branch prediction confidence for each of the executing threads, whereby a lower confidence level indicates a smaller likelihood that the corresponding thread will actually take the predicted branch. Because speculative execution of an untaken branch wastes resources of the instruction pipeline, the processor prioritizes threads associated with a higher confidence level for selection at the stages of the instruction pipeline.

Type: Grant

Filed: January 11, 2013

Date of Patent: March 5, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Ramkumar Jayaseelan, Ravindra N Bhargava
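
The selection policy in the abstract above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `ThreadState` record, the numeric confidence scale, and the tie-breaking rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    thread_id: int
    confidence: float  # assumed score in [0, 1]; higher means the prediction is more trusted


def select_thread(threads):
    """Pick the thread with the highest branch prediction confidence.

    Ties go to the lowest thread_id (an assumption) so the choice is deterministic.
    """
    if not threads:
        return None
    return max(threads, key=lambda t: (t.confidence, -t.thread_id))


threads = [ThreadState(0, 0.55), ThreadState(1, 0.92), ThreadState(2, 0.92)]
chosen = select_thread(threads)
```

A real pipeline would presumably apply such a policy per stage and weigh it against fairness; the sketch only shows the ordering by confidence.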

Input/output memory map unit and northbridge

Patent number: 10223280

Abstract: A system including a gasket communicatively coupled between a unified northbridge (UNB) having a cache coherent interconnect (CCI) interface and a processor having an Advanced eXtensible Interface (AXI) coherency extension (ACE). The gasket is configured to translate requests from the processor that include ACE commands into equivalent CCI commands, wherein each request from the processor maps onto a specific CCI request type. The gasket is further configured to translate ACE tags into CCI tags. The gasket is further configured to translate CCI encoded probes from a system resource interface (SRI) into equivalent ACE snoop transactions. The gasket is further configured to translate the memory map to inter-operate with a UNB/coherent HyperTransport (cHT) environment. The gasket is further configured to receive a barrier transaction that is used to provide ordering for transactions.

Type: Grant

Filed: July 2, 2018

Date of Patent: March 5, 2019

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Vydhyanathan Kalyanasundharam, Yaniv Adiri, Philip Ng, Maggie Chan, Vincent Cueva, Anthony Asaro, Jimshed Mirza, Greggory D. Donley, Bryan Broussard, Benjamin Tsien

Mechanism for resource utilization metering in a computer system

Patent number: 10223162

Abstract: Systems, apparatuses, and methods for tracking system resource utilization of guest virtual machines (VMs). Counters may be maintained to track resource utilization of different system resources by different guest VMs executing on the system. When a guest VM initiates execution, stored values may be loaded into the resource utilization counters. While the guest VM executes, the counters may track the resource utilization of the guest VM. When the guest VM terminates execution, the counter values may be written to a virtual machine control block (VMCB) corresponding to the guest VM. Scaling factors may be applied to the counter values to normalize the values prior to writing the values to the VMCB. A cloud computing environment may utilize the tracking mechanisms to guarantee resource utilization levels in accordance with users' service level agreements.

Type: Grant

Filed: April 13, 2016

Date of Patent: March 5, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael T. Clark, Jay Fleischman, Thaddeus S. Fortenberry, Maurice B. Steinman
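
The load, track, and write-back cycle described in the abstract above might look roughly like this. The field names, the dictionary standing in for a VMCB, and the choice to scale only the usage accumulated during the current run are assumptions for illustration.

```python
class UtilizationCounters:
    """Hypothetical per-core counters shared by successive guest VMs."""

    def __init__(self, scale):
        self.scale = scale  # assumed per-resource scaling factors
        self.loaded = {}    # values restored when the guest starts running
        self.delta = {}     # raw usage accumulated during the current run

    def vm_enter(self, vmcb):
        # Load the stored values for this guest into the counters.
        self.loaded = dict(vmcb)
        self.delta = {name: 0 for name in vmcb}

    def record(self, resource, amount=1):
        # Track utilization while the guest executes.
        self.delta[resource] += amount

    def vm_exit(self, vmcb):
        # Normalize the new usage and write the totals back to the VMCB.
        for name in vmcb:
            vmcb[name] = self.loaded[name] + self.delta[name] * self.scale[name]


vmcb = {"cycles": 100, "mem_accesses": 10}
counters = UtilizationCounters(scale={"cycles": 2, "mem_accesses": 1})
counters.vm_enter(vmcb)
counters.record("cycles", 50)
counters.record("mem_accesses")
counters.vm_exit(vmcb)
```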

METHOD AND APPARATUS OF PERFORMING A MEMORY OPERATION IN A HIERARCHICAL MEMORY ASSEMBLY

Publication number: 20190065100

Abstract: A method and apparatus of performing a memory operation includes receiving a memory operation request at a first memory controller that is in communication with a second memory controller. The first memory controller forwards the memory operation request to the second memory controller. Upon receipt of the memory operation request, the second memory controller provides first information or second information depending on a condition of a pseudo-bank of the second memory controller and a type of the memory operation request.

Type: Application

Filed: August 24, 2017

Publication date: February 28, 2019

Applicant: Advanced Micro Devices, Inc.

Inventor: Dmitri Yudanov

VARIABLE RATE SHADING

Publication number: 20190066371

Abstract: A technique for performing rasterization and pixel shading with decoupled resolution is provided herein. The technique involves performing rasterization as normal to generate fine rasterization data and a set of (fine) quads. The quads are accumulated into a tile buffer and coarse quads are generated from the quads in the tile buffer based on a shading rate. The shading rate determines how many pixels of the fine quads are combined to generate coarse pixels of the coarse quads. Combination of fine pixels involves generating a single coarse pixel for each such fine pixel to be combined. The positions of the coarse pixels of the coarse quads are set based on the positions of the corresponding fine pixels. The coarse quads are shaded normally and the resulting shaded coarse quads are modified based on the fine rasterization data to generate shaded fine quads.

Type: Application

Filed: August 25, 2017

Publication date: February 28, 2019

Applicant: Advanced Micro Devices, Inc.

Inventors: Skyler Jonathon Saleh, Christopher J. Brennan, Andrew S. Pomianowski, Ruijin Wu
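
The decoupling of rasterization and shading resolution described in the abstract above can be illustrated with a toy software model. It is a sketch under stated assumptions: a square `rate` x `rate` shading block, a coarse position taken as the average of the covered fine pixel positions, and a plain coverage mask standing in for the fine rasterization data. Quads and the tile buffer are not modeled.

```python
def shade_variable_rate(coverage, rate, shader):
    """Shade once per coarse pixel, then copy the result to covered fine pixels.

    coverage: 2D list of booleans from (assumed) fine rasterization.
    rate:     number of fine pixels per coarse pixel along each axis.
    shader:   callable (x, y) -> color, invoked once per coarse pixel.
    """
    height, width = len(coverage), len(coverage[0])
    out = [[None] * width for _ in range(height)]
    invocations = 0
    for by in range(0, height, rate):
        for bx in range(0, width, rate):
            fine = [(y, x)
                    for y in range(by, min(by + rate, height))
                    for x in range(bx, min(bx + rate, width))
                    if coverage[y][x]]
            if not fine:
                continue  # nothing rasterized in this block, so no shading work
            # Assumed placement rule: centroid of the covered fine pixels.
            cy = sum(y for y, _ in fine) / len(fine)
            cx = sum(x for _, x in fine) / len(fine)
            color = shader(cx, cy)
            invocations += 1
            for y, x in fine:
                out[y][x] = color  # only covered fine pixels receive the color
    return out, invocations


coverage = [
    [True, True, False, False],
    [True, True, False, False],
    [False, False, False, False],
    [False, False, False, True],
]
image, shader_calls = shade_variable_rate(coverage, 2, lambda x, y: (x, y))
```

With a 2x2 rate, the five covered fine pixels here cost two shader invocations instead of five.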

Identifying primitives in input index stream

Patent number: 10217280

Abstract: Techniques for removing reset indices from, and identifying primitives in, an index stream that defines a set of primitives to be rendered, are disclosed. The index stream may be specified by an application program executing on the central processing unit. The technique involves classifying the primitive topology for the index stream as either requiring an offset-based technique or requiring a non-offset-based technique. This classification is done by determining whether, according to the primitive topology, each subsequent index can form a primitive with prior indices (e.g., line strip, triangle strip). If each subsequent index can form a primitive with prior indices, then the technique used is the non-offset-based technique. If each subsequent index does not form a primitive with prior indices, but instead at least two indices are required to form a new primitive (e.g., line list, triangle list), then the technique used is the offset-based technique.

Type: Grant

Filed: November 17, 2016

Date of Patent: February 26, 2019

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Saad Arrabi, Mangesh P. Nijasure, Todd Martin
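
The classification rule in the abstract above, plus a straightforward way to assemble primitives around reset indices, can be sketched as below. The reset value `0xFFFF`, the topology names, and the assembly loop are assumptions; the abstract does not spell out how the offset-based and non-offset-based techniques work internally, and strip winding order is ignored.

```python
RESET = 0xFFFF  # assumed reset index value

VERTS_PER_PRIM = {
    "line_list": 2,
    "triangle_list": 3,
    "line_strip": 2,
    "triangle_strip": 3,
}


def classify(topology):
    """Strips reuse prior indices, lists need a fresh group of indices."""
    return "non-offset-based" if topology.endswith("_strip") else "offset-based"


def identify_primitives(indices, topology):
    """Split the stream at reset indices and emit primitives for each run."""
    n = VERTS_PER_PRIM[topology]
    step = 1 if topology.endswith("_strip") else n
    prims, run = [], []
    for idx in list(indices) + [RESET]:  # trailing reset flushes the last run
        if idx != RESET:
            run.append(idx)
            continue
        for i in range(0, len(run) - n + 1, step):
            prims.append(tuple(run[i:i + n]))
        run = []  # leftover indices that cannot form a primitive are dropped
    return prims


strip = identify_primitives([0, 1, 2, 3, RESET, 4, 5, 6], "triangle_strip")
tri_list = identify_primitives([0, 1, 2, 3, 4, RESET, 5, 6, 7], "triangle_list")
```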

On die voltage regulation with distributed switches

Patent number: 10218273

Abstract: A distributed voltage regulator has switches that function as resistors and are distributed in rows in a grid pattern across a regulated voltage domain. The switches receive an unregulated voltage and supply the regulated voltage. Switch control lines selectively enable the switches to achieve the desired voltage regulation. Droop detect circuits are also distributed throughout the regulated voltage domain. The droop detect circuits detect when the regulated voltage is below a threshold and supply droop detect signals indicative thereof. A plurality of select circuits receive a first group of control lines to configure the switches for charge injection in response to a droop condition and a second group of control lines to configure the switches for other voltage regulation. The select circuits select one of the first and second groups of control lines as switch control lines to configure the switches based on the droop detect signals.

Type: Grant

Filed: June 26, 2017

Date of Patent: February 26, 2019

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Erhan Ergin, Dipanjan Sengupta, Elsie Lo, Stephen V. Kosonocky, Sree Rajesh Saha, Divya Guruja
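
The role of the select circuits in the abstract above is essentially a multiplexer driven by the droop detect signal. The following behavioral sketch assumes boolean control lines and a single voltage threshold; the function names and values are hypothetical.

```python
def droop_detect(regulated_voltage, threshold):
    """Assert the droop signal when the regulated voltage falls below the threshold."""
    return regulated_voltage < threshold


def select_circuit(droop, charge_injection_lines, regulation_lines):
    """Choose which group of control lines drives the local switches."""
    return charge_injection_lines if droop else regulation_lines


charge_injection = [True, True, True, True]  # assumed: enable every local switch
regulation = [True, False, True, False]      # assumed: normal regulation pattern

normal = select_circuit(droop_detect(0.95, 0.90), charge_injection, regulation)
drooping = select_circuit(droop_detect(0.85, 0.90), charge_injection, regulation)
```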

Method and apparatus of performing a memory operation in a hierarchical memory assembly

Patent number: 10216454

Abstract: A method and apparatus of performing a memory operation includes receiving a memory operation request at a first memory controller that is in communication with a second memory controller. The first memory controller forwards the memory operation request to the second memory controller. Upon receipt of the memory operation request, the second memory controller provides first information or second information depending on a condition of a pseudo-bank of the second memory controller and a type of the memory operation request.

Type: Grant

Filed: August 24, 2017

Date of Patent: February 26, 2019

Assignee: Advanced Micro Devices, Inc.

Inventor: Dmitri Yudanov

SHADER PIPELINES AND HIERARCHICAL SHADER RESOURCES

Publication number: 20190056958

Abstract: Shader resources may be specified for input to a shader using a hierarchical data structure which may be referred to as a descriptor set. The descriptor set may be bound to a bind point of the shader and may contain slots with pointers to memory containing shader resources. The shader may reference a particular slot of the descriptor set using an offset, and may change shader resources by referencing a different slot of the descriptor set or by binding or rebinding a new descriptor set. A graphics pipeline may be specified by creating a pipeline object which specifies a shader and a rendering context object, and linking the pipeline object. Part or all of the pipeline may be validated, cross-validated, or optimized during linking.

Type: Application

Filed: October 22, 2018

Publication date: February 21, 2019

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Guennadi Riguer, Brian K. Bennett
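
The descriptor set idea in the abstract above (slots addressed by offset, sets bound to a bind point and rebindable) can be modeled in a few lines. The class and method names are hypothetical and strings stand in for pointers to resource memory.

```python
class DescriptorSet:
    """Hypothetical descriptor set: ordered slots, each referring to a shader resource."""

    def __init__(self, slots):
        self.slots = list(slots)


class Shader:
    def __init__(self):
        self.bind_points = {}

    def bind(self, bind_point, descriptor_set):
        # Binding or rebinding swaps the whole set visible at this bind point.
        self.bind_points[bind_point] = descriptor_set

    def fetch(self, bind_point, offset):
        # The shader names a resource by bind point plus slot offset.
        return self.bind_points[bind_point].slots[offset]


shader = Shader()
shader.bind(0, DescriptorSet(["albedo_texture", "normal_texture"]))
first = shader.fetch(0, 1)
shader.bind(0, DescriptorSet(["shadow_map"]))  # rebind a new set at the same point
second = shader.fetch(0, 0)
```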

Conditional atomic operations in single instruction multiple data processors

Patent number: 10209990

Abstract: A conditional fetch-and-phi operation tests a memory location to determine if the memory location stores a specified value and, if so, modifies the value at the memory location. The conditional fetch-and-phi operation can be implemented so that it can be concurrently executed by a plurality of concurrently executing threads, such as the threads of a wavefront at a GPU. To execute the conditional fetch-and-phi operation, one of the concurrently executing threads is selected to execute a compare-and-swap (CAS) operation at the memory location, while the other threads await the results. The CAS operation tests the value at the memory location and, if the CAS operation is successful, the value is passed to each of the concurrently executing threads.

Type: Grant

Filed: June 2, 2015

Date of Patent: February 19, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: David A. Wood, Steven K. Reinhardt, Bradford M. Beckmann, Marc S. Orr
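
A sequential emulation of the scheme in the abstract above: one selected thread performs the CAS and the outcome is shared with the whole group. The dictionary memory model, the `phi` callback, and returning the outcome to every thread regardless of success are assumptions made for this sketch.

```python
def compare_and_swap(memory, addr, expected, new):
    """Return the old value; store `new` only if the old value equals `expected`."""
    old = memory[addr]
    if old == expected:
        memory[addr] = new
    return old


def conditional_fetch_and_phi(memory, addr, expected, phi, num_threads):
    """One selected thread issues the CAS; all threads receive the result."""
    # Assumed: a single leader thread is chosen and the others wait on it.
    old = compare_and_swap(memory, addr, expected, phi(expected))
    success = old == expected
    return [(success, old) for _ in range(num_threads)]  # broadcast to the group


memory = {0: 5}
first_try = conditional_fetch_and_phi(memory, 0, 5, lambda v: v + 1, num_threads=4)
second_try = conditional_fetch_and_phi(memory, 0, 5, lambda v: v + 1, num_threads=4)
```

The point of electing a single thread is that the memory location sees one atomic operation per group rather than one per thread.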

Integrated thermoelectric cooler for three-dimensional stacked DRAM and temperature-inverted cores

Patent number: 10210912

Abstract: Managing temperature of a semiconductor device having a temperature inverted processor core and stacked memory by operation of an integrated thermoelectric cooler. The thermoelectric cooler is operated to pump heat from a stacked memory device that requires a cool operating temperature to a temperature inverted processor core that maintains a higher operating temperature until threshold operating temperatures are achieved.

Type: Grant

Filed: June 9, 2017

Date of Patent: February 19, 2019

Assignee: Advanced Micro Devices, Inc.

Inventor: Wei Huang

Primitive level preemption using discrete non-real-time and real time pipelines

Patent number: 10210650

Abstract: Processing of non-real-time and real-time workloads is performed using discrete pipelines. A first pipeline includes a first shader and one or more fixed function hardware blocks. A second pipeline includes a second shader that is configured to emulate the at least one fixed function hardware block. First and second memory elements store first state information for the first pipeline and second state information for the second pipeline, respectively. A non-real-time workload executing in the first pipeline is preempted at a primitive boundary in response to a real-time workload being dispatched for execution in the second pipeline. The first memory element retains the first state information in response to preemption of the non-real-time workload. The first pipeline is configured to resume processing the subsequent primitive on the basis of the first state information stored in the first memory element.

Type: Grant

Filed: November 30, 2017

Date of Patent: February 19, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Anirudh R. Acharya, Swapnil Sakharshete, Michael Mantor, Mangesh P. Nijasure, Todd Martin, Vineet Goel

Instruction set and micro-architecture supporting asynchronous memory access

Patent number: 10209991

Abstract: A system and method for reducing latencies of main memory data accesses are described. A non-blocking load (NBLD) instruction identifies an address of requested data and a subroutine. The subroutine includes instructions dependent on the requested data. A processing unit verifies that address translations are available for both the address and the subroutine. The processing unit continues processing instructions with no stalls caused by younger-in-program-order instructions waiting for the requested data. The non-blocking load unit performs a cache coherent data read request on behalf of the NBLD instruction and requests that the processing unit perform an asynchronous jump to the subroutine upon return of the requested data from lower-level memory.

Type: Grant

Filed: November 16, 2016

Date of Patent: February 19, 2019

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Meenakshi Sundaram Bhaskaran, Elliot H. Mednick, David A. Roberts, Anthony Asaro, Amin Farmahini-Farahani

RING BUFFER INCLUDING A PRELOAD BUFFER

Publication number: 20190050198

Abstract: A system and method for managing data in a ring buffer is disclosed. The system includes a legacy ring buffer functioning as an on-chip ring buffer, a supplemental buffer for storing data in the ring buffer, a preload ring buffer that is on-chip and capable of receiving preload data from the supplemental buffer, a write controller that determines where to write data that is write requested by a write client of the ring buffer, and a read controller that controls a return of data to a read client pursuant to a read request to the ring buffer.

Type: Application

Filed: October 15, 2018

Publication date: February 14, 2019

Applicant: Advanced Micro Devices, Inc.

Inventors: XuHong Xiong, Pingping Shao, ZhongXiang Luo, ChenBin Wang

Exclusive access to shared registers in virtualized systems

Patent number: 10198283

Abstract: A request is sent from a new virtual function (VF) to a physical function for requesting the initialization of the new VF. The controlling physical function and the new VF establish a two-way communication channel to start and end the VF's exclusive accesses to registers in a configuration space. The physical function uses a timing control to monitor that exclusive register access by the new VF is completed within a predetermined time period. The new VF is only granted a predetermined time period of exclusive access to complete its initialization process. If the exclusive access period times out, the controlling physical function can terminate the VF to prevent GPU stalls.

Type: Grant

Filed: November 10, 2016

Date of Patent: February 5, 2019

Assignees: ATI Technologies ULC, Advanced Micro Devices (Shanghai) Co., LTD.

Inventors: Jeffrey G. Cheng, Yinan Jiang, Guangwen Yang, Kelly Donald Clark Zytaruk, LingFei Liu, XiaoWei Wang
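
The time-limited exclusive access described in the abstract above can be sketched with a logical clock instead of real timers. The class, its method names, and the tick-driven timeout check are illustrative assumptions, not the actual driver or hardware interface.

```python
class PhysicalFunction:
    """Hypothetical arbiter granting one VF at a time exclusive register access."""

    def __init__(self, time_limit):
        self.time_limit = time_limit
        self.owner = None
        self.start = None
        self.terminated = set()

    def request_exclusive(self, vf, now):
        if self.owner is not None:
            return False  # another VF is still initializing
        self.owner, self.start = vf, now
        return True

    def release(self, vf):
        # The VF signals that its initialization finished in time.
        if self.owner == vf:
            self.owner = None

    def tick(self, now):
        # Timing control: terminate a VF that overruns its exclusive window.
        if self.owner is not None and now - self.start > self.time_limit:
            self.terminated.add(self.owner)
            self.owner = None


pf = PhysicalFunction(time_limit=10)
granted_vf0 = pf.request_exclusive("vf0", now=0)
granted_vf1_early = pf.request_exclusive("vf1", now=1)
pf.tick(now=11)  # vf0 exceeded its window and is terminated
granted_vf1_late = pf.request_exclusive("vf1", now=12)
pf.release("vf1")
```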

System and method of testing processor units using cache resident testing

Patent number: 10198358

Abstract: Apparatuses, computer readable mediums, and methods of processor unit testing using cache resident testing are disclosed. The method may include loading a test program in a cache on a chip comprising one or more processor units. The method may include the one or more processor units executing the test program to generate one or more results. The method may include redirecting a first memory reference to the cache, wherein the first memory reference is generated during the execution of the test program. The method may include determining whether the one or more generated results match one or more test results. The method may include redirecting a memory request to a memory location resident in the cache if the memory request includes a memory location not resident in the cache. The method may include redirecting a memory request to the cache if the memory request is not directed to the cache.

Type: Grant

Filed: April 2, 2014

Date of Patent: February 5, 2019

Assignees: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULC

Inventors: Angel E. Socarras, Kostantinos Danny Christidis, Curtis Alan Gilgan, Alexander Fuad Ashkar

System and method for scheduling instructions in a multithread SIMD architecture with a fixed number of registers

Patent number: 10198259

Abstract: A method and apparatus for scheduling instructions of a shader program for a graphics processing unit (GPU) with a fixed number of registers. The method and apparatus include computing, via a processing unit (PU), a liveness-based register usage across all basic blocks in the shader program, computing, via the PU, the range of numbers of waves of a plurality of registers for the shader program, assessing the impact of available post-register allocation optimizations, computing, via the PU, the scoring data based on number of waves of the plurality of registers, and computing, via the PU, the number of waves for execution for the plurality of registers.

Type: Grant

Filed: June 23, 2016

Date of Patent: February 5, 2019

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Robert A. Gottlieb, Christopher L. Reeve, Michael John Bedy

Programming in-memory accelerators to improve the efficiency of datacenter operations

Patent number: 10198349

Abstract: Systems, apparatuses, and methods for utilizing in-memory accelerators to perform data conversion operations are disclosed. A system includes one or more main processors coupled to one or more memory modules. Each memory module includes one or more memory devices coupled to a processing in memory (PIM) device. The main processors are configured to generate an executable for a PIM device to accelerate data conversion tasks of data stored in the local memory devices. In one embodiment, the system detects a read request for data stored in a given memory module. In order to process the read request, the system determines that a conversion from a first format to a second format is required. In response to detecting the read request, the given memory module's PIM device performs the conversion of the data from the first format to the second format and then provides the data to a consumer application.

Type: Grant

Filed: September 19, 2016

Date of Patent: February 5, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Mauricio Breternitz, Walter B. Benton

Low power memory throttling

Patent number: 10198216

Abstract: In one form, a data processing system includes a memory channel having a plurality of ranks, and a data processor. The data processor is coupled to the memory channel and is adapted to access each of the plurality of ranks. In response to detecting a predetermined event, the data processor selects an active rank of the plurality of ranks and places other ranks besides the active rank in a low power state, wherein the other ranks include at least one rank with a pending request at a time of detection of the predetermined event. The data processor subsequently processes a memory access request to the active rank.

Type: Grant

Filed: May 28, 2016

Date of Patent: February 5, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Kedarnath Balakrishnan, Kevin M. Brandl, James R. Magro
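
The throttling step in the abstract above can be sketched as a single function. The policy for choosing the active rank (most pending requests, lowest rank number on ties) is an assumption; the abstract only requires that other ranks enter a low power state even when they have pending requests.

```python
def on_throttle_event(pending):
    """Keep one rank active, put the rest in low power, service one request.

    pending: dict mapping rank number -> list of queued requests.
    """
    # Assumed selection policy: busiest rank, lowest rank number on ties.
    active = max(pending, key=lambda rank: (len(pending[rank]), -rank))
    power = {rank: ("active" if rank == active else "low_power") for rank in pending}
    serviced = pending[active].pop(0) if pending[active] else None
    return active, power, serviced


pending = {0: ["read A"], 1: ["read B", "write C"], 2: []}
active, power, serviced = on_throttle_event(pending)
```

Rank 0 still holds a queued request after the call, matching the abstract's point that a rank with pending work may be powered down.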

Dynamic memory remapping to reduce row-buffer conflicts

Patent number: 10198369

Abstract: A data processing system includes a memory that includes a first memory bank and a second memory bank. The data processing system also includes a conflict detector connected to the memory and adapted to receive memory access information. The conflict detector tracks memory access statistics of the first memory bank, and determines if the first memory bank contains frequent row conflicts. The conflict detector also remaps a frequent row conflict in the first memory bank to the second memory bank. An indirection table is connected to the conflict detector and adapted to receive a memory access request, and redirects an address into a dynamically selected physical memory address in response to a remapping of the frequent row conflict to the second memory bank.

Type: Grant

Filed: March 24, 2017

Date of Patent: February 5, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Yasuko Eckert, Reena Panda, Nuwan Jayasena
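
A toy model of the conflict detector and indirection table from the abstract above. The conflict threshold, the per-row conflict counter, and remapping to the same row number in the second bank are assumptions chosen to keep the sketch small.

```python
from collections import Counter


class ConflictRemapper:
    """Hypothetical two-bank model: bank 0 is monitored, bank 1 receives remaps."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.open_row = {}          # bank -> row currently held in its row buffer
        self.conflicts = Counter()  # (bank, row) -> observed row-buffer conflicts
        self.indirection = {}       # logical (bank, row) -> physical (bank, row)

    def access(self, bank, row):
        logical = (bank, row)
        phys_bank, phys_row = self.indirection.get(logical, logical)
        opened = self.open_row.get(phys_bank)
        if opened is not None and opened != phys_row:
            self.conflicts[logical] += 1
            if phys_bank == 0 and self.conflicts[logical] >= self.threshold:
                # Frequent conflict: redirect this row to the second bank.
                self.indirection[logical] = (1, phys_row)
                phys_bank = 1
        self.open_row[phys_bank] = phys_row
        return phys_bank, phys_row


remapper = ConflictRemapper(threshold=2)
trace = [remapper.access(0, row) for row in (10, 20, 10, 20, 10, 20)]
```

Once row 20 moves to bank 1, the alternating accesses stop evicting each other's row buffer.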
