Patents Assigned to Advanced Micro Devices, Inc.
  • Patent number: 10620958
    Abstract: Systems, apparatuses, and methods for efficiently reducing power consumption in a crossbar of a computing system are disclosed. A data transfer crossbar uses a first interface for receiving data fetched from a data storage device that is partitioned into multiple banks. The crossbar uses a second interface for sending data fetched from the multiple banks to multiple compute units. Logic in the crossbar selects data from a most recent fetch operation for a given compute unit when the logic determines the given compute unit is an inactive compute unit for which no data is being fetched. The logic sends via the second interface the selected data for the given compute unit. Therefore, when the given compute unit is inactive, the data lines for the fetched data do not transition for each inactive clock cycle after the most recent active clock cycle.
    Type: Grant
    Filed: December 3, 2018
    Date of Patent: April 14, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Xianwen Cheng
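The behavior described in patent 10620958 above can be pictured with a tiny cycle-level model. This is a minimal sketch, assuming a hypothetical `CrossbarOutputPort` class and a simplified clock model; it only illustrates the idea that an inactive compute unit's output keeps re-driving the most recently fetched value so the data lines do not toggle on idle cycles.

```python
# Hypothetical sketch: hold the last fetched value on a compute unit's output
# port while it is inactive, so the (simulated) data lines do not transition.

class CrossbarOutputPort:
    def __init__(self):
        self.last_value = 0      # data driven on the port last cycle
        self.transitions = 0     # cycles on which the lines toggled

    def cycle(self, active, fetched_value=None):
        """One clock cycle: drive newly fetched data if active, otherwise hold."""
        value = fetched_value if active else self.last_value
        if value != self.last_value:
            self.transitions += 1
        self.last_value = value
        return value

port = CrossbarOutputPort()
port.cycle(active=True, fetched_value=0xABCD)   # active fetch: lines switch once
for _ in range(5):
    port.cycle(active=False)                    # idle cycles: value is held
print("transitions:", port.transitions)         # prints 1, not 6
```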
  • Patent number: 10620994
    Abstract: Systems, apparatuses, and methods for implementing continuation analysis tasks (CATs) are disclosed. In one embodiment, a system implements hardware acceleration of CATs to manage the dependencies and scheduling of an application composed of multiple tasks. In one embodiment, a continuation packet is referenced directly by a first task. When the first task completes, the first task enqueues a continuation packet on a first queue. The first task can specify on which queue to place the continuation packet. The agent responsible for the first queue dequeues and executes the continuation packet, which invokes an analysis phase performed before determining which dependent tasks to enqueue. If the analysis phase determines that a second task is now ready to be launched, the second task is enqueued on one of the queues. Then, an agent responsible for that queue dequeues and executes the second task.
    Type: Grant
    Filed: May 30, 2017
    Date of Patent: April 14, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Steven Tony Tye, Brian L. Sumner, Bradford Michael Beckmann, Sooraj Puthoor
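A speculative software caricature of the continuation-packet flow in patent 10620994 above. The queue names, the `analysis` function, and the packet tuples are invented for illustration; the patented mechanism is hardware-accelerated and is not reproduced here.

```python
# Hypothetical sketch of continuation analysis tasks (CATs): a finishing task
# enqueues a continuation packet on a queue it chooses; the agent for that
# queue runs an analysis phase that decides which dependent tasks to launch.
from collections import deque

queues = {"q0": deque(), "q1": deque()}

def analysis(completed_task):
    """Analysis phase: decide which dependent tasks are now ready to launch."""
    if completed_task == "first_task":
        queues["q1"].append(("task", "second_task"))   # dependent is ready

def first_task():
    # The finishing task names the queue its continuation packet goes on.
    print("first_task runs")
    queues["q0"].append(("continuation", "first_task"))

def run_agent(name):
    while queues[name]:
        kind, payload = queues[name].popleft()
        if kind == "continuation":
            analysis(payload)            # executed by the agent for this queue
        else:
            print(f"agent {name} executes {payload}")

first_task()
run_agent("q0")   # dequeues the continuation; analysis enqueues second_task on q1
run_agent("q1")   # executes second_task
```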
  • Publication number: 20200111248
    Abstract: A technique for compressing an original image is disclosed. According to the technique, an original image is obtained and a delta-encoded image is generated based on the original image. Next, a segregated image is generated based on the delta-encoded image, and then the segregated image is compressed to produce a compressed image. The segregated image is generated because it may be compressed more efficiently than either the original image or the delta-encoded image.
    Type: Application
    Filed: December 6, 2019
    Publication date: April 9, 2020
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Ruijin Wu, Skyler Jonathon Saleh, Christopher J. Brennan, Kei Ming Kwong, Anthony Hung-Cheong Chan
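A minimal sketch of the pipeline order described in publication 20200111248 above, using Python's standard `zlib` as a stand-in compressor. The row-free delta encoding and the interleaved "plane" segregation here are illustrative guesses at the general idea, not the patented scheme.

```python
# Hypothetical sketch: delta-encode an image, segregate the result, then hand
# the segregated stream to a generic compressor.
import zlib

def delta_encode(pixels):
    """Replace each pixel with its difference from the previous one (mod 256)."""
    out, prev = [], 0
    for p in pixels:
        out.append((p - prev) & 0xFF)
        prev = p
    return out

def segregate(deltas, planes=2):
    """Illustrative segregation: split the delta stream into interleaved planes."""
    return [bytes(deltas[i::planes]) for i in range(planes)]

pixels = [10, 11, 12, 12, 13, 200, 201, 202, 202, 203] * 100
deltas = delta_encode(pixels)
segregated = b"".join(segregate(deltas))

print("compressed original  :", len(zlib.compress(bytes(pixels))))
print("compressed segregated:", len(zlib.compress(segregated)))
```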
  • Publication number: 20200112731
    Abstract: A system and method for scalable video coding that include a base layer with lower-resolution encoding, an enhanced layer with higher-resolution encoding, and data transfer between the two layers. The system and method provide several techniques to reduce the bandwidth of inter-layer transfers while also reducing memory requirements. Because fewer memory accesses are needed, the system clock frequency can be lowered, which lowers system power consumption as well. The system avoids up-sampling prediction data transferred from the base layer to the enhanced layer to match the resolution of the enhanced layer, since transferring up-sampled data can impose a large burden on memory bandwidth.
    Type: Application
    Filed: December 6, 2019
    Publication date: April 9, 2020
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Lei Zhang, Ji Zhou, Zhen Chen, Min Yu
  • Patent number: 10613983
    Abstract: A method includes monitoring a request rate of speculative memory read requests from a penultimate-level cache to a main memory. The speculative memory read requests correspond to data read requests that missed in the penultimate-level cache. A hit rate of searches of a last-level cache for data requested by the data read requests is monitored. Core demand speculative memory read requests to the main memory are selectively enabled in parallel with searching of the last-level cache for data of a corresponding core demand data read request based on the request rate and the hit rate. Prefetch speculative memory read requests to the main memory are selectively enabled in parallel with searching of the last-level cache for data of a corresponding prefetch data read request based on the request rate and the hit rate.
    Type: Grant
    Filed: March 20, 2018
    Date of Patent: April 7, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Tanuj Kumar Agarwal, Anasua Bhowmik, Douglas Benson Hunt
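The gating decision in patent 10613983 above reduces to a small predicate. This sketch uses invented threshold names and treats the last-level cache hit rate and the speculative-request rate as already-measured inputs; it is not the patent's actual control logic, which gates core-demand and prefetch speculation separately.

```python
# Hypothetical sketch: enable speculative DRAM reads (issued in parallel with
# the last-level cache lookup) only when the LLC rarely hits and the memory
# bus is not already saturated with speculative traffic.

SPEC_RATE_LIMIT = 0.30   # invented: max fraction of traffic spent on speculation
LLC_HIT_LIMIT = 0.20     # invented: above this hit rate, speculation wastes bandwidth

def speculation_enabled(spec_request_rate, llc_hit_rate):
    return spec_request_rate < SPEC_RATE_LIMIT and llc_hit_rate < LLC_HIT_LIMIT

# Core demand and prefetch requests are gated separately in the patent; here
# both simply reuse the same predicate for illustration.
print(speculation_enabled(spec_request_rate=0.10, llc_hit_rate=0.05))  # True
print(speculation_enabled(spec_request_rate=0.10, llc_hit_rate=0.60))  # False
```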
  • Patent number: 10613764
    Abstract: Systems, apparatuses, and methods for performing efficient memory accesses for a computing system are disclosed. In various embodiments, a computing system includes a computing resource and a memory controller coupled to a memory device. The computing resource selectively generates a hint that includes a target address of a memory request generated by the processor. The hint is sent outside the primary communication fabric to the memory controller. The hint conditionally triggers a data access in the memory device. When no page in a bank targeted by the hint is open, the memory controller processes the hint by opening a target page of the hint without retrieving data. The memory controller drops the hint if there are other pending requests that target the same page or the target page is already open.
    Type: Grant
    Filed: November 20, 2017
    Date of Patent: April 7, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ravindra N. Bhargava, Philip S. Park, Vydhyanathan Kalyanasundharam, James Raymond Magro
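A sketch of the hint-handling rule described in patent 10613764 above, under assumed data structures (a per-bank open-page map and a pending-request set); the names and the handling of the "another page already open" case are assumptions, not the patent's specification.

```python
# Hypothetical sketch: a hint carries only a target address. The controller
# opens the target page early when the bank has no open page, and drops the
# hint when that page is already open or a demand request to it is pending.

open_page = {}          # bank -> currently open page (absent means bank closed)
pending = set()         # (bank, page) pairs with queued demand requests

def handle_hint(bank, page):
    if open_page.get(bank) == page or (bank, page) in pending:
        return "drop hint"
    if bank not in open_page:
        open_page[bank] = page     # activate the row; retrieve no data yet
        return "open page early"
    return "drop hint"             # another page is open (assumed: leave it alone)

print(handle_hint(bank=0, page=42))   # "open page early"
print(handle_hint(bank=0, page=42))   # "drop hint" (already open)
```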
  • Patent number: 10613957
    Abstract: Systems, apparatuses, and methods for achieving balanced execution in a multi-node cluster through runtime detection of performance variation are described. During a training phase, performance counters and the amount of time spent waiting for synchronization are monitored for a plurality of tasks on each node of the multi-node cluster. These values are used to generate a model which correlates the values of the performance counters to the amount of time spent waiting for synchronization. Once the model is built, the values of the performance counters are monitored for a period of time at the start of each task, and these values are input into the model. The model generates a prediction of whether a given node is on the critical path. If the given node is predicted to be on the critical path, the power allocation of the given node is increased.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: April 7, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Brian J. Kocoloski, Leonardo Piga, Wei Huang, Indrani Paul
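A toy version of the training/prediction loop in patent 10613957 above: fit a model from a performance-counter value to observed synchronization wait time, then predict at the start of a task whether the node is likely on the critical path. Using a single counter, ordinary least squares, and a fixed wait threshold are simplifying assumptions made for brevity.

```python
# Hypothetical sketch: correlate a performance counter with time spent waiting
# at barriers; a short predicted wait suggests the node is on the critical
# path and should get a larger power budget.

def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    slope = num / den
    return slope, my - slope * mx

# Training phase: (counter value, measured sync wait in ms) per task (made up).
samples = [(1.0e6, 9.0), (2.0e6, 7.5), (4.0e6, 4.0), (6.0e6, 1.0)]
slope, intercept = fit_line([s[0] for s in samples], [s[1] for s in samples])

def on_critical_path(counter_value, wait_threshold_ms=2.0):
    predicted_wait = slope * counter_value + intercept
    return predicted_wait < wait_threshold_ms   # barely waits => critical path

power_cap_watts = 95
if on_critical_path(counter_value=5.5e6):
    power_cap_watts += 10     # boost the slow node so the barrier clears sooner
print(power_cap_watts)
```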
  • Publication number: 20200104262
    Abstract: A processing device is provided which includes memory comprising data cache memory configured to store compressed data and metadata cache memory configured to store metadata, each portion of metadata comprising an encoding used to compress a portion of data. The processing device also includes at least one processor configured to compress portions of data and select, based on one or more utility level metrics, portions of metadata to be stored in the metadata cache memory. The at least one processor is also configured to store the selected portions of metadata in the metadata cache memory and to store, in the data cache memory, each portion of compressed data whose selected corresponding metadata is stored in the metadata cache memory. Each portion of compressed data having its corresponding metadata stored in the metadata cache memory is decompressed.
    Type: Application
    Filed: September 28, 2018
    Publication date: April 2, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shomit N. Das, Matthew Tomei, David A. Wood
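A sketch of the selection idea in publication 20200104262 above: only metadata whose utility metric clears a capacity-limited cut is kept in the metadata cache, and only the corresponding data blocks stay compressed. The utility formula, capacities, and container types are assumptions for illustration.

```python
# Hypothetical sketch: rank candidate metadata entries by a utility metric
# (here: bytes saved by compression times access frequency) and cache only
# the top entries; data whose metadata is cached is stored compressed.

METADATA_CACHE_SLOTS = 2   # invented capacity

candidates = [
    # (block id, bytes saved by compression, accesses per epoch)
    ("A", 48, 100),
    ("B", 60, 5),
    ("C", 8, 900),
    ("D", 56, 80),
]

def utility(entry):
    _, bytes_saved, accesses = entry
    return bytes_saved * accesses

cached = sorted(candidates, key=utility, reverse=True)[:METADATA_CACHE_SLOTS]
cached_ids = {blk for blk, _, _ in cached}

for blk, _, _ in candidates:
    form = "compressed" if blk in cached_ids else "uncompressed"
    print(f"block {blk}: stored {form} in the data cache")
```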
  • Patent number: 10606599
    Abstract: A system and method for using an operation (op) cache is disclosed. The system and method include an op cache for caching previously decoded instructions. The op cache includes a plurality of physically indexed and tagged instructions allowing sharing of instructions between threads. The op cache is chained through multiple ways allowing service of a plurality of instructions in a cache line. The op cache is stored between a shared operation storage and immediate/displacement storage to maximize capacity.
    Type: Grant
    Filed: December 9, 2016
    Date of Patent: March 31, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventor: David N. Suggs
  • Patent number: 10608633
    Abstract: An electronic device includes a die stack having a plurality of die. The die stack includes a die parity path spanning the plurality of die and configured to alternatingly identify each die as a first type or a second type. The die stack further includes an inter-die signal path spanning the plurality of die and configured to propagate an inter-die signal through the plurality of die, wherein the inter-die signal path is configured to invert a logic state of the inter-die signal between each die. Each die of the plurality of die includes signal formatting logic configured to selectively invert a logic state of the inter-die signal before providing it to other circuitry of the die responsive to whether the die is designated as the first type or the second type.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: March 31, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventor: Russell Schreiber
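The inversion scheme in patent 10608633 above is easy to see in a behavioral simulation: the inter-die signal is inverted at every hop, and each die re-inverts it (or not) according to its alternating type designation, so every die recovers the same logical value. This is only a logic-level sketch under an assumed starting polarity, not the circuit.

```python
# Hypothetical sketch: a signal propagates through a die stack, inverting
# between dies; each die's formatting logic conditionally un-inverts it based
# on the die's alternating "type" so all dies observe the original state.

NUM_DIES = 6
original = 1

signal = original
for die in range(NUM_DIES):
    die_type_is_odd = bool(die % 2)                     # alternating designation
    seen = signal ^ 1 if die_type_is_odd else signal    # selective re-inversion
    assert seen == original, f"die {die} decoded the wrong value"
    signal ^= 1                                         # path inverts between dies
print("all dies recovered the original logic state")
```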
  • Patent number: 10608076
    Abstract: A system and method for fabricating metal insulator metal capacitors while managing semiconductor processing yield and increasing capacitance per area are described. A semiconductor device fabrication process places a polysilicon layer on top of an oxide layer which is on top of a metal layer. The process etches trenches into areas of the polysilicon layer where the repeated trenches determine a frequency of an oscillating wave structure to be formed later. The top and bottom corners of the trenches are rounded. The process deposits a bottom metal, a dielectric, and a top metal on the polysilicon layer both on areas with the trenches and on areas without the trenches. A series of a barrier metal and a second polysilicon layer is deposited on the oscillating structure. The process completes the MIM capacitor with metal nodes contacting each of the top metal and the bottom metal of the oscillating structure.
    Type: Grant
    Filed: March 22, 2017
    Date of Patent: March 31, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Richard T. Schultz
  • Patent number: 10606740
    Abstract: Systems, apparatuses, and methods for generating flexibly addressed memory requests are disclosed. In one embodiment, a system includes a processor, control unit, and memory subsystem. The processor launches a plurality of threads on a plurality of compute units, wherein each thread generates memory requests without specifying target memory addresses. The threads executing on the plurality of compute units convey a plurality of memory requests to the control unit. The control unit generates target memory addresses for the plurality of received memory requests. In one embodiment, the memory requests are write requests, and the control unit interleaves write requests from the plurality of threads into a single output buffer stored in the memory subsystem. The control unit can be located in a cache, in a memory controller, or in another location within the system.
    Type: Grant
    Filed: May 26, 2017
    Date of Patent: March 31, 2020
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Yunpeng Zhu, Jimshed Mirza
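A sketch of the control unit in patent 10606740 above: threads issue writes with payloads but no target addresses, and the control unit generates addresses that interleave everything into one output buffer. The `ControlUnit` class, address arithmetic, and element size are invented for illustration.

```python
# Hypothetical sketch: a control unit hands out target addresses for writes
# that arrive without one, appending them all into a single output buffer.

class ControlUnit:
    def __init__(self, buffer_base=0x8000_0000, element_size=16):
        self.next_offset = 0
        self.base = buffer_base
        self.element_size = element_size
        self.log = []

    def write(self, thread_id, payload):
        """Assign the next free slot in the shared output buffer."""
        addr = self.base + self.next_offset
        self.next_offset += self.element_size
        self.log.append((thread_id, hex(addr), payload))
        return addr

cu = ControlUnit()
# Writes from different threads arrive interleaved, in any order.
for tid, data in [(0, "a0"), (3, "d0"), (1, "b0"), (0, "a1")]:
    cu.write(tid, data)
for entry in cu.log:
    print(entry)   # consecutive addresses regardless of the issuing thread
```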
  • Patent number: 10608943
    Abstract: Systems, apparatuses, and methods for dynamic buffer management in multi-client token flow control routers are disclosed. A system includes at least one or more processing units, a memory, and a communication fabric with a plurality of routers coupled to the processing unit(s) and the memory. A router servicing multiple active clients allocates a first number of tokens to each active client. The first number of tokens is less than a second number of tokens needed to saturate the bandwidth of each client to the router. The router also allocates a third number of tokens to a free pool, with tokens from the free pool being dynamically allocated to different clients. The third number of tokens is equal to the difference between the second number of tokens and the first number of tokens. An advantage of this approach is reducing the amount of buffer space needed at the router.
    Type: Grant
    Filed: October 27, 2017
    Date of Patent: March 31, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alan Dodson Smith, Chintan S. Patel, Eric Christopher Morton, Vydhyanathan Kalyanasundharam, Narendra Kamat
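The token arithmetic in patent 10608943 above, sketched with invented numbers: each active client gets a static allocation smaller than what saturating its bandwidth needs, and the shortfall is covered by a shared free pool sized as the difference, handed out on demand. The grant policy shown is an assumption; the patent defines only the sizing.

```python
# Hypothetical sketch of dynamic token management: static per-client tokens
# plus a shared free pool sized as (saturating - static).

STATIC_TOKENS = 4        # "first number": per-client static allocation
SATURATING_TOKENS = 10   # "second number": tokens needed to saturate one client

clients = ["cpu", "gpu", "dma"]
held = {c: STATIC_TOKENS for c in clients}
free_pool = SATURATING_TOKENS - STATIC_TOKENS   # "third number", shared

def grant_extra_token(client):
    """Lend a free-pool token so one busy client can approach saturation."""
    global free_pool
    if held[client] < SATURATING_TOKENS and free_pool > 0:
        held[client] += 1
        free_pool -= 1
        return True
    return False

# One busy client ramps toward full bandwidth using the shared pool, while
# total buffering stays well below clients x SATURATING_TOKENS.
for _ in range(8):
    grant_extra_token("gpu")
print(held, "free pool:", free_pool)
```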
  • Publication number: 20200098169
    Abstract: Described herein are techniques for improving the effectiveness of depth culling. In a first technique, a binner is used to sort primitives into depth bins. Each depth bin covers a range of depths. The binner transmits the depth bins to the screen space pipeline for processing in near-to-far order. Processing the near bins first results in the depth buffer being updated, allowing fragments for the primitives in the farther bins to be culled more aggressively than if the depth binning did not occur. In a second technique, a buffer is used to initiate two-pass processing through the screen space pipeline. In the first pass, primitives are sent down to update the depth block and are then culled. The fragments are processed normally in the second pass, with the benefit of the updated depth values.
    Type: Application
    Filed: September 21, 2018
    Publication date: March 26, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Ruijin Wu, Young In Yeo, Sagar S. Bhandare, Vineet Goel, Martin G. Sarov, Christopher J. Brennan
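A software caricature of the depth-binning pass in publication 20200098169 above: primitives are bucketed by depth range, bins are processed near-to-far, and a depth buffer culls fragments from the farther bins. The bin count, the single-depth-per-primitive model, and the one-pixel "screen" are simplifications, not the pipeline described in the publication.

```python
# Hypothetical sketch: sort primitives into depth bins and process the bins
# near-to-far so the depth buffer is already updated when far primitives
# arrive, letting their fragments be culled early.

NUM_BINS = 4
FAR = 1.0

primitives = [("tree", 0.9), ("wall", 0.55), ("player", 0.1), ("sky", 0.99)]

bins = [[] for _ in range(NUM_BINS)]
for name, depth in primitives:
    bins[min(int(depth * NUM_BINS), NUM_BINS - 1)].append((name, depth))

depth_buffer = FAR          # one "pixel" is enough to show the idea
for b in bins:              # near-to-far order
    for name, depth in b:
        if depth >= depth_buffer:
            print(f"culled  {name} (depth {depth})")
        else:
            print(f"shaded  {name} (depth {depth})")
            depth_buffer = depth
```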
  • Patent number: 10600142
    Abstract: A compute unit accesses a chunk of bits that represent indices of vertices of a graphics primitive. The compute unit sets values of a first bit to indicate whether the chunk is monotonic or ordinary, second bits to define an offset that is determined based on values of indices in the chunk, and sets of third bits that determine values of the indices in the chunk based on the offset defined by the second bits. The compute unit writes a compressed chunk represented by the first bit, the second bits, and the sets of third bits to a memory. The compressed chunk is decompressed and the decompressed indices are written to an index buffer. In some embodiments, the indices are decompressed based on metadata that includes offsets that are determined based on values of the indices and bitfields that indicate characteristics of the indices.
    Type: Grant
    Filed: December 5, 2017
    Date of Patent: March 24, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Usame Ceylan, Young In Yeo, Todd Martin, Vineet Goel
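A rough software analogue of the chunk format in patent 10600142 above: one flag marks a monotonic chunk, an offset field carries a base value, and small per-index fields carry residuals relative to that offset. The field widths and exact layout are invented; the patent's bit-level encoding is not reproduced.

```python
# Hypothetical sketch: compress a chunk of vertex indices as
# (monotonic_flag, offset, per-index residuals), then decompress it.

def compress_chunk(indices):
    monotonic = all(b >= a for a, b in zip(indices, indices[1:]))
    offset = min(indices)                      # base stored in the "second bits"
    residuals = [i - offset for i in indices]  # the "sets of third bits"
    return monotonic, offset, residuals

def decompress_chunk(chunk):
    _monotonic, offset, residuals = chunk
    return [offset + r for r in residuals]

indices = [100, 101, 101, 102, 104, 104, 105]
chunk = compress_chunk(indices)
assert decompress_chunk(chunk) == indices
print(chunk)   # residuals fit in far fewer bits than the raw 32-bit indices
```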
  • Patent number: 10601723
    Abstract: A computing system uses a memory for storing data, one or more clients for generating network traffic and a communication fabric with network switches. The network switches include centralized storage structures, rather than separate input and output storage structures. The network switches store particular metadata corresponding to received packets in a single, centralized collapsing queue where the age of the packets corresponds to a queue entry position. The payload data of the packets are stored in a separate memory, so the relatively large amount of data is not shifted during the lifetime of the packet in the network switch. The network switches select sparse queue entries in the collapsible queue, deallocate the selected queue entries, and shift remaining allocated queue entries toward a first end of the queue with a delay proportional to the radix of the network switches.
    Type: Grant
    Filed: April 12, 2018
    Date of Patent: March 24, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alan Dodson Smith, Vydhyanathan Kalyanasundharam, Bryan P. Broussard, Greggory D. Donley, Chintan S. Patel
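A behavioral sketch of the collapsing queue in patent 10601723 above: metadata entries sit in age order (position corresponds to age), arbitrary entries can be selected and deallocated, and the survivors shift toward the head, while payloads live in a separate memory keyed by a tag so they never move. The container choices and selection predicate are illustrative assumptions.

```python
# Hypothetical sketch: centralized collapsing queue for packet metadata; the
# large payload stays put in a side memory and only small metadata entries
# shift when selected entries are deallocated.

payload_memory = {}        # tag -> large payload (never shifted)
collapsing_queue = []      # index 0 is the oldest entry (age == position)

def enqueue(tag, metadata, payload):
    payload_memory[tag] = payload
    collapsing_queue.append((tag, metadata))

def dequeue_selected(predicate):
    """Remove possibly-sparse winning entries; remaining entries shift up."""
    global collapsing_queue
    picked = [e for e in collapsing_queue if predicate(e)]
    collapsing_queue = [e for e in collapsing_queue if not predicate(e)]
    return [(tag, meta, payload_memory.pop(tag)) for tag, meta in picked]

enqueue("p0", {"dst": 1}, b"A" * 64)
enqueue("p1", {"dst": 2}, b"B" * 64)
enqueue("p2", {"dst": 1}, b"C" * 64)

sent = dequeue_selected(lambda e: e[1]["dst"] == 1)   # pick sparse entries
print([tag for tag, _, _ in sent], "remaining:", collapsing_queue)
```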
  • Patent number: 10599578
    Abstract: A processing system fills a memory access request for data from a processor core by bypassing a cache when a write congestion condition is detected, and when transferring the data to the cache would cause eviction of a dirty cache line. The cache is bypassed by transferring the requested data to the processor core or to a different cache. Accordingly, the processing system can temporarily bypass the cache storing the dirty cache line when filling a memory access request, thereby avoiding the eviction and write back to main memory of a dirty cache line when a write congestion condition exists.
    Type: Grant
    Filed: December 13, 2016
    Date of Patent: March 24, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amin Farmahini Farahani, David A. Roberts
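The fill policy in patent 10599578 above comes down to one conditional. This sketch assumes invented inputs (`write_congested`, `victim_is_dirty`) and only shows the decision, not a cache model.

```python
# Hypothetical sketch: when filling a miss would evict a dirty line while the
# write path is congested, bypass this cache and hand the fill data straight
# to the requester (or a different cache) instead of installing it.

def fill_policy(write_congested, victim_is_dirty):
    if write_congested and victim_is_dirty:
        return "bypass cache: forward fill data to core / other cache"
    return "normal fill: install line (evicting the victim if needed)"

print(fill_policy(write_congested=True,  victim_is_dirty=True))
print(fill_policy(write_congested=False, victim_is_dirty=True))
print(fill_policy(write_congested=True,  victim_is_dirty=False))
```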
  • Publication number: 20200090736
    Abstract: A write driver includes a first write data driver, a second write driver, and a control circuit. The first (second) write data driver provides a true (complement) write data signal to an output thereof at a high voltage when a true (complement) data signal is in a first logic state, at a ground voltage when the true (complement) data signal is in a second logic state and a negative bit line enable signal is inactive, and at a voltage below the ground voltage when the true (complement) data signal is in the second logic state and the negative bit line enable signal is active. The control circuit provides the negative bit line enable signal in an active state when a power supply voltage is below a first threshold, and in an inactive state when the power supply voltage is above a second threshold higher than the first threshold.
    Type: Application
    Filed: September 14, 2018
    Publication date: March 19, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Alex Schaefer, Ravi Jotwani, David Hugh McIntyre
  • Patent number: 10593391
    Abstract: In one form, a memory controller includes a command queue, an arbiter, a refresh logic circuit, and a final arbiter. The command queue receives and stores memory access requests for a memory. The arbiter selectively picks accesses from the command queue according to a first type of accesses and a second type of accesses. The first type of accesses and the second type of accesses correspond to different page statuses of corresponding memory accesses in the memory. The refresh logic circuit generates a refresh command to a bank of the memory and provides a priority indicator with the refresh command whose value is set according to a number of pending refreshes. The final arbiter selectively orders the refresh command with respect to memory access requests of the first type accesses and the second type accesses based on the priority indicator.
    Type: Grant
    Filed: July 18, 2018
    Date of Patent: March 17, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Liang Zhao, YuBin Yao
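A sketch of the final-arbitration step in patent 10593391 above: the refresh logic tags its refresh command with a priority derived from how many refreshes are pending, and the final arbiter lets a high-priority refresh go ahead of the accesses the first arbiter picked. The threshold and the two access "types" shown are placeholders.

```python
# Hypothetical sketch: order a per-bank refresh against already-picked memory
# accesses based on a priority indicator set from the pending-refresh count.

URGENT_PENDING_REFRESHES = 4   # invented threshold

def refresh_priority(pending_refreshes):
    return "high" if pending_refreshes >= URGENT_PENDING_REFRESHES else "low"

def final_arbiter(picked_accesses, refresh_cmd, priority):
    """Low-priority refresh waits behind picked accesses; high-priority goes first."""
    if priority == "high":
        return [refresh_cmd] + picked_accesses
    return picked_accesses + [refresh_cmd]

accesses = ["page-hit read", "page-miss write"]   # stand-ins for the two access types
print(final_arbiter(accesses, "refresh bank 3", refresh_priority(1)))
print(final_arbiter(accesses, "refresh bank 3", refresh_priority(6)))
```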
  • Patent number: 10593111
    Abstract: A method, a system, and a computer-readable storage medium directed to performing high-speed parallel tessellation of 3D surface patches are disclosed. The method includes generating a plurality of primitives in parallel. Each primitive in the plurality is generated by a sequence of functional blocks, in which each sequence acts independently of all the other sequences.
    Type: Grant
    Filed: August 7, 2018
    Date of Patent: March 17, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Timour T. Paltashev, Boris Prokopenko, Vladimir V. Kibardin