Patents Assigned to Advanced Micro Devices
  • Patent number: 10846095
    Abstract: A system and method for a virtual load queue is described. Load micro-operations are processed through an instruction pipeline without requiring an entry in a load queue (LDQ). An address generation scheduler queue (AGSQ) entry is allocated to the load micro-operation and a LDQ entry is not allocated to the load micro-operation. The LDQ entries are reserved for the N oldest load micro-operations, where N is the depth of the LDQ. Deallocation of the AGSQ entry is done if the load micro-operation is one of the N oldest load micro-operations, or upon successful completion of the load micro-operation. Deallocation of the AGSQ entry is not done if the load micro-operation gets a bad status and is not one of the N oldest micro-operations. Consequently, the AGSQ acts as a virtual queue for the LDQ and mitigates the limiting effect of the LDQ depth.
    Type: Grant
    Filed: November 28, 2017
    Date of Patent: November 24, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventor: John M. King
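The deallocation rule in the abstract above can be read as a small predicate. The following is a minimal sketch of that reading only; the function names and the list-based model of program order are illustrative assumptions, not taken from the patent.

```python
def n_oldest_loads(loads_in_program_order, ldq_depth):
    """Return the set of loads that currently qualify for an LDQ entry.

    Assumption: loads are given oldest first, and N equals the LDQ depth.
    """
    return set(loads_in_program_order[:ldq_depth])


def may_deallocate_agsq_entry(is_among_n_oldest, completed_ok):
    """Release the AGSQ entry if the load is among the N oldest loads,
    or if it completed successfully. A load with a bad status that is
    not among the N oldest keeps its AGSQ entry."""
    return is_among_n_oldest or completed_ok


loads = ["ld0", "ld1", "ld2", "ld3"]       # oldest first (hypothetical ids)
oldest = n_oldest_loads(loads, ldq_depth=2)
# ld3 got a bad status and is not among the 2 oldest: entry is kept.
keep_ld3 = not may_deallocate_agsq_entry("ld3" in oldest, completed_ok=False)
```

In this model the AGSQ entry is what keeps a younger, failed load alive until it either succeeds or becomes old enough to own an LDQ entry.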
  • Publication number: 20200364573
    Abstract: Systems, methods, and devices for pruning a convolutional neural network (CNN). A subset of layers of the CNN is chosen, and for each layer of the subset of layers, how salient each filter in the layer is to an output of the CNN is determined, a subset of the filters in the layer is determined based on the salience of each filter in the layer, and the subset of filters in the layer is pruned. In some implementations, the layers of the subset of layers of the CNN are non-contiguous. In some implementations, the subset of layers includes odd numbered layers of the CNN and excludes even numbered layers of the CNN. In some implementations, the subset of layers includes even numbered layers of the CNN and excludes odd numbered layers of the CNN.
    Type: Application
    Filed: June 28, 2019
    Publication date: November 19, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Arun Coimbatore Ramachandran, Chandra Kumar Ramasamy, Prakash Sathyanath Raghavendra, Keerthan Subraya Shagrithaya
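As a rough illustration of the layer-subset pruning described above, the sketch below prunes only odd- or even-numbered layers and ranks filters by a salience score. The L1-norm salience measure, the `keep_ratio` parameter, and the plain-list data layout are assumptions made for the example; the abstract does not specify them.

```python
def l1_salience(filter_weights):
    """Assumed salience proxy: sum of absolute weights in one filter."""
    return sum(abs(w) for w in filter_weights)


def prune_layers(layers, keep_ratio=0.5, parity="odd"):
    """Prune the least salient filters in every odd (or even) layer.

    layers: list of layers, each a list of filters (flat weight lists).
    Layers are numbered from 1. Layers of the other parity are copied
    unchanged, so the pruned subset is non-contiguous.
    """
    wanted_remainder = 1 if parity == "odd" else 0
    result = []
    for number, filters in enumerate(layers, start=1):
        if number % 2 != wanted_remainder:
            result.append(list(filters))
            continue
        keep = max(1, int(len(filters) * keep_ratio))
        ranked = sorted(range(len(filters)),
                        key=lambda i: l1_salience(filters[i]),
                        reverse=True)
        kept_indices = sorted(ranked[:keep])   # preserve original order
        result.append([filters[i] for i in kept_indices])
    return result


layers = [
    [[1.0, -1.0], [0.1, 0.0], [3.0, 0.0], [0.0, 0.2]],  # layer 1
    [[1.0], [2.0]],                                      # layer 2
    [[0.5], [0.1]],                                      # layer 3
]
pruned = prune_layers(layers)
```

A real network would also need the input channels of the following layer adjusted after filters are removed; the sketch leaves that step out.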
  • Patent number: 10838727
    Abstract: A processing device is provided which includes memory and at least one processor. The memory includes main memory and cache memory in communication with the main memory via a link. The at least one processor is configured to receive a request for a cache line and read the cache line from main memory. The at least one processor is also configured to compress the cache line according to a compression algorithm and, when the compressed cache line includes at least one byte predicted not to be accessed, drop the at least one byte from the compressed cache line based on whether the compression algorithm is determined to successfully compress the cache line according to a compression parameter.
    Type: Grant
    Filed: December 14, 2018
    Date of Patent: November 17, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Shomit N. Das, Kishore Punniyamurthy, Matthew Tomei, Bradford M. Beckmann
  • Patent number: 10839875
    Abstract: A timing circuit includes an input for receiving a control signal from a logic circuit operating with a first supply voltage and an output for supplying the control signal to a circuit operating with a second supply voltage different from the first supply voltage. The timing circuit also includes a plurality of delay elements connected in series between the input and output and supplied with the first supply voltage, and one or more NFET footer transistors that couple respective delay elements to a negative supply rail, the NFET footer transistors having the second supply voltage applied to their gates. A memory apparatus employing such a circuit is provided.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: November 17, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Srinivas R. Sathu, John Wuu, Russell Schreiber, Martin Piorkowski
  • Patent number: 10838864
    Abstract: A miss in a cache by a thread in a wavefront is detected. The wavefront includes a plurality of threads that are executing a memory access request concurrently on a corresponding plurality of processor cores. A priority is assigned to the thread based on whether the memory access request is addressed to a local memory or a remote memory. The memory access request for the thread is performed based on the priority. In some cases, the cache is selectively bypassed depending on whether the memory access request is addressed to the local or remote memory. A cache block is requested in response to the miss. The cache block is biased towards a least recently used position in response to requesting the cache block from the local memory and towards a most recently used position in response to requesting the cache block from the remote memory.
    Type: Grant
    Filed: May 30, 2018
    Date of Patent: November 17, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Michael W. Boyer, Onur Kayiran, Yasuko Eckert, Steven Raasch, Muhammad Shoaib Bin Altaf
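The insertion bias in the abstract above (local blocks toward the least recently used position, remote blocks toward the most recently used position) can be sketched with a recency list for one cache set. Inserting at the extreme ends of the list and the 4-way set size are simplifying assumptions for illustration.

```python
def insert_block(recency, tag, from_remote, ways=4):
    """Insert a newly fetched block into one cache set.

    recency: list of tags, index 0 = most recently used (MRU),
             last index = least recently used (LRU).
    Blocks from remote memory are costlier to refetch, so they enter
    at the MRU end; blocks from local memory enter at the LRU end.
    """
    if len(recency) >= ways:
        recency.pop()                 # evict the current LRU block
    if from_remote:
        recency.insert(0, tag)        # biased toward MRU
    else:
        recency.append(tag)           # biased toward LRU
    return recency


cache_set = ["a", "b", "c", "d"]
insert_block(cache_set, "local1", from_remote=False)
after_local = list(cache_set)
insert_block(cache_set, "remote1", from_remote=True)
after_remote = list(cache_set)
```

In this run the local block is the first to be evicted when the remote block arrives, which is the intended effect of the bias.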
  • Patent number: 10840167
    Abstract: Various integrated heat spreaders and methods of making the same are disclosed. In one aspect, an integrated heat spreader to provide thermal management of a first heat generating component on a circuit board is provided. The integrated heat spreader includes a shell that has an internal space, at least one inlet port to receive a coolant to cool the first heat generating component and at least one outlet port to discharge the coolant. Plural heat fins are connected to the shell in the internal space. The heat fins are selectively connectable to the shell in multiple arrangements to provide selected flow rates of the coolant in one or more regions of the internal space.
    Type: Grant
    Filed: November 19, 2018
    Date of Patent: November 17, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Andrew McNamara, Swagata Kalve
  • Publication number: 20200357093
    Abstract: Methods are provided for creating objects in a way that permits an API client to explicitly participate in memory management for an object created using the API. Methods for managing data object memory include requesting memory requirements for an object using an API and expressly allocating a memory location for the object based on the memory requirements. Methods are also provided for cloning objects such that a state of the object remains unchanged from the original object to the cloned object or can be explicitly specified.
    Type: Application
    Filed: July 30, 2020
    Publication date: November 12, 2020
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Guennadi Riguer, Brian K. Bennett
  • Publication number: 20200358447
    Abstract: A C-element circuit for use in an oscillator or the like includes a first input terminal for receiving a first input signal, a second input terminal for receiving a second input signal, and an output latch for providing an output signal based on a relationship between the two input signals. A stack of input transistors is included with an outer pair of input transistors with gates connected to the first input terminal and an inner pair of input transistors with gates connected to a second input terminal. A balancing circuit operates to equalize a first delay of a change in the first input signal affecting the output signal with a second delay of a change in the second input signal affecting the output signal. Bypass control techniques are provided for using the C-element circuit with a single input.
    Type: Application
    Filed: May 8, 2019
    Publication date: November 12, 2020
    Applicants: ATI Technologies ULC, Advanced Micro Devices, Inc.
    Inventors: Mikhail Rodionov, Stephen Victor Kosonocky, Joyce Cheuk Wai Wong
  • Patent number: 10832465
    Abstract: A technique for executing pixel shader programs is provided. The pixel shader programs are executed in workgroups, which allows access by work-items to a local data store and also allows program synchronization at barrier points. Utilizing workgroups allows for more flexible and efficient execution than previous implementations in the pixel shader stage. Several techniques for assigning fragments to wavefronts and workgroups are also provided. The techniques differ in the degree of geometric locality of fragments within wavefronts and/or workgroups. In some techniques, a greater degree of locality is enforced, which reduces processing unit occupancy but also reduces program complexity. In other techniques, a lower degree of locality is enforced, which increases processing unit occupancy.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: November 10, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Skyler Jonathon Saleh
  • Patent number: 10824349
    Abstract: A processing system includes a plurality of input/output (I/O) devices representing a plurality of I/O resources. Each I/O resource has at least one corresponding memory mapped I/O (MMIO) address range. A trap handler detects a write request targeting a configuration space of an identified I/O resource of the plurality of I/O resources and, responsive to determining the identified I/O resource is a protected I/O resource, selectively blocks the write request from further processing by the processing system based on whether the write request would change an MMIO address decoding of the identified I/O resource.
    Type: Grant
    Filed: December 17, 2018
    Date of Patent: November 3, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Maggie Chan, Philip Ng, David Kaplan
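One way to picture the trap handler's decision is as a filter on configuration-space writes. The sketch below assumes a PCI type 0 configuration header, where the base address registers occupy offsets 0x10 to 0x27 and bit 1 of the command register at offset 0x04 enables memory-space decoding. Treating exactly these two cases as "changing MMIO address decoding" is an assumption of the example, as are the function and parameter names.

```python
COMMAND_OFFSET = 0x04
MEMORY_SPACE_ENABLE = 1 << 1
BAR_FIRST, BAR_LAST = 0x10, 0x27


def should_block_write(protected, offset, length, value, current_command):
    """Decide whether a configuration-space write should be blocked.

    Unprotected resources are never filtered. For protected resources,
    the write is blocked if it overlaps a base address register
    (conservative: even a same-value write is blocked) or if it flips
    the memory-space enable bit of the command register.
    """
    if not protected:
        return False
    last = offset + length - 1
    if offset <= BAR_LAST and last >= BAR_FIRST:
        return True
    if offset == COMMAND_OFFSET:
        return ((value ^ current_command) & MEMORY_SPACE_ENABLE) != 0
    return False


blocked_bar = should_block_write(True, 0x10, 4, 0xFFFFFFFF, 0x0006)
blocked_disable = should_block_write(True, 0x04, 2, 0x0004, 0x0006)
allowed_other = should_block_write(True, 0x3C, 1, 0x05, 0x0006)
```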
  • Patent number: 10825692
    Abstract: Various semiconductor chips with gettering regions and methods of making the same are disclosed. In one aspect, an apparatus is provided that includes a semiconductor chip that has a first side and a second side opposite the first side. The first side has a plurality of laser ablation craters. Each of the ablation craters has a bottom. A gettering region is in the semiconductor chip beneath the laser ablation craters. The gettering region includes plural structural defects. At least some of the structural defects emanate from at least some of the bottoms of the laser ablation craters.
    Type: Grant
    Filed: December 20, 2018
    Date of Patent: November 3, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Rahul Agarwal, Milind S. Bhagavat, Ivor Barber, Venkatachalam Valliappan, Yuen Ting Cheng, Guan Sin Chok
  • Publication number: 20200344378
    Abstract: A computer vision processing device is provided which comprises memory configured to store data and a processor. The processor is configured to store captured image data in a first buffer and acquire access to the captured image data in the first buffer when the captured image data is available for processing. The processor is also configured to execute a first group of operations in a processing pipeline, each of which processes the captured image data accessed from the first buffer, and to return the first buffer for storing the next captured image data when the last operation of the first group of operations executes.
    Type: Application
    Filed: July 10, 2020
    Publication date: October 29, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Radhakrishna Giduthuri, Michael L. Schmit
  • Patent number: 10817422
    Abstract: In one form, a data processing system includes a host integrated circuit having a memory controller, a memory bus coupled to the memory controller, and a memory module. The memory module includes a bulk memory and a memory module scratchpad coupled to the bulk memory, wherein the memory module scratchpad has a lower access overhead than the bulk memory. The memory controller selectively provides predetermined commands over the memory bus to cause the memory module to copy data between the bulk memory and the memory module scratchpad without conducting data on the memory bus in response to a data movement decision.
    Type: Grant
    Filed: August 17, 2018
    Date of Patent: October 27, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Amin Farmahini Farahani, Michael Ignatowski
  • Patent number: 10818762
    Abstract: A system and method for laying out power grid connections for standard cells are described. In various implementations, gate metal is placed over non-planar vertical conducting structures, which are used to form non-planar devices (transistors). Gate contacts connect gate metal to gate extension metal (GEM) above the gate metal. GEM is placed above the gate metal and makes a connection with gate metal through the one or more gate contacts. Gate extension contacts are formed on the GEM above the active regions. Similar to gate contacts, gate extension contacts are formed with a less complex fabrication process than using a self-aligned contacts process. Gate extension contacts connect GEM to an interconnect layer such as a metal zero layer. Gate extension contacts are aligned vertically with one of the non-planar vertical conducting structures. Therefore, in an implementation, one or more gate extension contacts are located above the active region.
    Type: Grant
    Filed: May 25, 2018
    Date of Patent: October 27, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Richard T. Schultz
  • Patent number: 10817302
    Abstract: Systems, apparatuses, and methods for implementing a high bandwidth, low power vector register file for use by a parallel processor are disclosed. In one embodiment, a system includes at least a parallel processing unit with a plurality of processing pipelines. The parallel processing unit includes a vector arithmetic logic unit and a high bandwidth, low power, vector register file. The vector register file includes multi-bank high density random-access memories (RAMs) to satisfy register bandwidth requirements. The parallel processing unit also includes an instruction request queue and an instruction operand buffer to provide enough local bandwidth for VALU instructions and vector I/O instructions. Also, the parallel processing unit is configured to leverage the RAM's output flops as a last level cache to reduce duplicate operand requests between multiple instructions. The parallel processing unit includes a vector destination cache to provide additional R/W bandwidth for the vector register file.
    Type: Grant
    Filed: July 7, 2017
    Date of Patent: October 27, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jiasheng Chen, Bin He, Mark M. Leather, Michael J. Mantor, Yunxiao Zou
  • Publication number: 20200335142
    Abstract: A circuit includes a repeating series of first circuits and a repeating series of second circuits that is placed next to the repeating series of first circuits and interacts with corresponding portions of the first circuits in the series. The repeating series of second circuits is formed in diffusion regions and diffusion wells which extend along the direction in which the second circuits repeat. The repeating series of the first and second circuits is interrupted by at least one dummy circuit region, which occupies the space of one or more instances of the first and second repeating series. The dummy circuit region also includes taps for biasing the diffusion regions and diffusion wells of the second circuits.
    Type: Application
    Filed: June 27, 2019
    Publication date: October 22, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Russell Schreiber, Keith A. Kasprak
  • Publication number: 20200327715
    Abstract: A method and system for performing graphics processing are provided. The method and system include storing stencil buffer values in a stencil buffer; generating either or both of a reference value and a source value in a fragment shader; comparing the stencil buffer values against the reference value; and processing a fragment based on the comparison of the stencil buffer values against the reference value.
    Type: Application
    Filed: June 29, 2020
    Publication date: October 15, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Graham Sellers, Eric Zolnowski, Pierre Boudier, Juraj Obert
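A stencil test whose reference value comes from the fragment shader, rather than from fixed pipeline state, might look like the sketch below. It covers only the reference value, not the source value the abstract also mentions. The comparison table, the 8-bit mask, and the tuple layout of fragments are assumptions for illustration.

```python
import operator

# Comparison functions keyed by name; the reference is the left operand.
COMPARE = {
    "equal": operator.eq,
    "not_equal": operator.ne,
    "less": operator.lt,
    "greater": operator.gt,
}


def stencil_test(stencil_buffer, x, y, shader_reference,
                 func="equal", mask=0xFF):
    """Compare a per-fragment reference against the stored stencil value."""
    stored = stencil_buffer[y][x] & mask
    reference = shader_reference & mask
    return COMPARE[func](reference, stored)


def surviving_fragments(fragments, stencil_buffer, func="equal"):
    """fragments: iterable of (x, y, reference) tuples, where reference
    is the value the fragment shader produced for that fragment."""
    return [(x, y) for x, y, ref in fragments
            if stencil_test(stencil_buffer, x, y, ref, func)]


stencil = [[1, 0],
           [2, 2]]
frags = [(0, 0, 1), (1, 0, 1), (0, 1, 2)]
kept = surviving_fragments(frags, stencil)
```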
  • Patent number: 10802977
    Abstract: A processing system tracks counts of accesses to memory pages using a set of counters located at the memory module that stores the pages, wherein the counts are adjusted at least in part based on refreshes of the memory pages. This approach allows a processing system to efficiently maintain the counts with relatively small counters and with relatively low overhead. Furthermore, the rate at which the counters are adjusted, relative to the page refreshes, is adjustable, so that the access counts are useful for a wide variety of application types.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: October 13, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Georgios Mappouras, Amin Farmahini Farahani, Nuwan Jayasena
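The idea of small access counters that are adjusted on refresh can be sketched as saturating counters that decay every few refreshes. Halving as the adjustment, the counter width, and the class and parameter names are assumptions of this example; the abstract states only that counts are adjusted based on refreshes at an adjustable rate.

```python
class PageAccessCounters:
    """Saturating per-page access counters that decay on refresh."""

    def __init__(self, num_pages, bits=4, refreshes_per_decay=2):
        self.max_count = (1 << bits) - 1
        self.counts = [0] * num_pages
        self.refreshes_seen = [0] * num_pages
        self.refreshes_per_decay = refreshes_per_decay

    def access(self, page):
        # Saturate instead of wrapping so a small counter stays meaningful.
        self.counts[page] = min(self.max_count, self.counts[page] + 1)

    def refresh(self, page):
        # Every refreshes_per_decay refreshes, halve the count (assumed
        # adjustment) so old activity gradually stops dominating.
        self.refreshes_seen[page] += 1
        if self.refreshes_seen[page] >= self.refreshes_per_decay:
            self.refreshes_seen[page] = 0
            self.counts[page] >>= 1


counters = PageAccessCounters(num_pages=2, bits=2, refreshes_per_decay=2)
for _ in range(5):
    counters.access(0)
saturated = counters.counts[0]
counters.refresh(0)
after_one_refresh = counters.counts[0]
counters.refresh(0)
after_two_refreshes = counters.counts[0]
```

Raising `refreshes_per_decay` makes the counts remember older accesses for longer, which corresponds to the adjustable rate mentioned in the abstract.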
  • Patent number: 10803655
    Abstract: A method for enhanced forward rendering is disclosed which includes a depth pre-pass, light culling and a final shading. The depth pre-pass minimizes the cost of final shading by avoiding high pixel overdraw. The light culling stage calculates a list of light indices overlapping a pixel. The light indices are calculated on a per-tile basis, where the screen has been split into units of tiles. The final shading evaluates materials using information stored for each light. The forward rendering method may be executed on a processor, such as a single graphics processing unit (GPU) for example.
    Type: Grant
    Filed: May 13, 2013
    Date of Patent: October 13, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Takahiro Harada, Jerry McKee, Jason Yang
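The per-tile light culling stage can be illustrated in two dimensions: split the screen into tiles and record, for each tile, the indices of lights whose screen-space footprint overlaps it. Modelling each light as a circle in pixel coordinates and ignoring depth bounds are simplifications assumed here, and the function name is hypothetical.

```python
def cull_lights(width, height, tile, lights):
    """Build a per-tile list of overlapping light indices.

    lights: list of (cx, cy, radius) circles in screen-space pixels.
    Returns a dict mapping (tile_x, tile_y) to a list of light indices.
    """
    tiles_x = (width + tile - 1) // tile
    tiles_y = (height + tile - 1) // tile
    per_tile = {}
    for ty in range(tiles_y):
        for tx in range(tiles_x):
            x0, y0 = tx * tile, ty * tile
            x1, y1 = min(x0 + tile, width), min(y0 + tile, height)
            hits = []
            for index, (cx, cy, radius) in enumerate(lights):
                # Closest point of the tile rectangle to the light centre.
                nx = min(max(cx, x0), x1)
                ny = min(max(cy, y0), y1)
                if (cx - nx) ** 2 + (cy - ny) ** 2 <= radius * radius:
                    hits.append(index)
            per_tile[(tx, ty)] = hits
    return per_tile


light_lists = cull_lights(32, 16, 16, [(8, 8, 4), (16, 8, 2)])
```

The final shading pass would then loop only over `light_lists[(tx, ty)]` for the tile containing each pixel, instead of over every light in the scene.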
  • Patent number: 10802806
    Abstract: A reconverging control flow graph is generated by receiving an input control flow graph including a plurality of basic code blocks, determining an order of the basic code blocks, and traversing the input control flow graph. The input control flow graph is traversed by, for each basic code block B of the plurality of basic code blocks, according to the determined order of the basic code blocks, visiting the basic code block B prior to visiting a subsequent block C of the plurality of basic code blocks, and based on determining that the basic code block B has a prior block A and that the prior block A has an open edge AC to the subsequent block C, in the reconverging control flow graph, creating an edge AF between the prior block A and a flow block F1, and creating an edge FC between the flow block F1 and the subsequent block C.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: October 13, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Nicolai Haehnle