Patents Assigned to Advanced Micro Devices
  • Publication number: 20180165872
    Abstract: Techniques for removing or identifying overlapping fragments in a fragment stream after z-culling are disclosed. The techniques include maintaining a first-in-first-out buffer that stores post-z-cull fragments. Each time a new fragment is received at the buffer, the screen position of the fragment is checked against all other fragments in the buffer. If the screen position of the fragment matches the screen position of a fragment in the buffer, then the fragment in the buffer is removed or marked as overlapping. If the screen position of the fragment does not match the screen position of any fragment in the buffer, then no modification is performed to fragments already in the buffer. In either case, the fragment is added to the buffer. The contents of the buffer are transmitted to the pixel shader for pixel shading at a later time.
    Type: Application
    Filed: December 9, 2016
    Publication date: June 14, 2018
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Laurent Lefebvre, Michael Mantor, Mark Fowler, Mikko Alho, Mika Tuomi, Kiia Kallio, Patrick Klas Rudolf Buss, Jari Antero Komppa, Kaj Tuomi, Christopher J. Brennan
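    Illustration (not part of the patent record): the buffer behavior described in the abstract can be sketched in software roughly as follows. The names `Fragment` and `FragmentFifo`, the `capacity` parameter, and the rule that the oldest fragment is emitted when the buffer is full are assumptions made for this sketch; the abstract only says the contents are sent to the pixel shader "at a later time".

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Fragment:
    x: int
    y: int
    payload: str
    overlapped: bool = False


class FragmentFifo:
    """Hypothetical software model of a post-z-cull fragment FIFO."""

    def __init__(self, capacity, remove=True):
        self.capacity = capacity  # assumed fixed buffer depth
        self.remove = remove      # True: drop older overlaps, False: mark them
        self.buffer = deque()
        self.emitted = []         # stands in for the pixel shader input

    def push(self, frag):
        pos = (frag.x, frag.y)
        if self.remove:
            # Remove any buffered fragment at the same screen position.
            self.buffer = deque(f for f in self.buffer if (f.x, f.y) != pos)
        else:
            # Keep older fragments but flag them as overlapping.
            for old in self.buffer:
                if (old.x, old.y) == pos:
                    old.overlapped = True
        # In either case the new fragment is added to the buffer.
        self.buffer.append(frag)
        if len(self.buffer) > self.capacity:
            # Assumption: the oldest fragment leaves when the buffer is full.
            self.emitted.append(self.buffer.popleft())

    def flush(self):
        while self.buffer:
            self.emitted.append(self.buffer.popleft())
        return self.emitted
```

    With `remove=True`, pushing fragments at (0,0), (1,0), and (0,0) again leaves only the second and third fragments for shading.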
  • Publication number: 20180165314
    Abstract: Described herein is a system and method for multiplexer tree (muxtree) indexing. Muxtree indexing performs hashing and row reduction in parallel by use of each select bit only once in a particular path of the muxtree. The muxtree indexing generates a different final index as compared to conventional hashed indexing but still results in a fair hash, where all table entries get used with equal distribution with uniformly random selects.
    Type: Application
    Filed: November 28, 2017
    Publication date: June 14, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Steven R. Havlir, Patrick J. Shyvers
  • Publication number: 20180165202
    Abstract: A data processing system includes a processor and a cache controller coupled to the processor, and adapted to be coupled to a memory. The cache controller uses the memory to form a pseudo direct mapped cache having a plurality of groups of pages. The memory forms a first number of selected pages, including a first page for storing a plurality of sets of tags and a plurality of remaining pages for storing data. Each tag, of the plurality of sets of tags, stores tags for respective entries in a corresponding one of the plurality of remaining pages.
    Type: Application
    Filed: December 12, 2016
    Publication date: June 14, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Ganesh Balakrishnan, Vydhyanathan Kalyanasundharam, Kevin M. Lepak
  • Publication number: 20180165790
    Abstract: Techniques for allowing cache access returns out of order are disclosed. A return ordering queue exists for each of several cache access types and stores outstanding cache accesses in the order in which those accesses were made. When a cache access request for a particular type is at the head of the return ordering queue for that type and the cache access is available for return to the wavefront that made that access, the cache system returns the cache access to the wavefront. Thus, cache accesses can be returned out of order with respect to cache accesses of different types. Allowing out-of-order returns can help to improve latency, for example in the situation where a relatively low-latency access type (e.g., a read) is issued after a relatively high-latency access type (e.g., a texture sampler operation).
    Type: Application
    Filed: December 13, 2016
    Publication date: June 14, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Daniel Schneider, Fataneh Ghodrat
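    Illustration (not part of the patent record): a minimal software sketch of per-type return ordering queues. The class name `ReturnOrderingQueues`, the string access IDs, and the `mark_ready`/`drain` interface are hypothetical; the abstract describes hardware queues and does not specify an interface.

```python
from collections import deque


class ReturnOrderingQueues:
    """Hypothetical model: one FIFO of outstanding accesses per access type."""

    def __init__(self, access_types):
        self.queues = {t: deque() for t in access_types}
        self.ready = set()  # accesses whose data is available for return

    def issue(self, access_type, access_id):
        # Record the access in issue order within its own type.
        self.queues[access_type].append(access_id)

    def mark_ready(self, access_id):
        self.ready.add(access_id)

    def drain(self):
        """Return every access that is ready and at the head of its queue."""
        returned = []
        for queue in self.queues.values():
            while queue and queue[0] in self.ready:
                access_id = queue.popleft()
                self.ready.discard(access_id)
                returned.append(access_id)
        return returned
```

    Order is preserved within a type, but a ready read can be returned while an earlier texture sample is still outstanding.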
  • Patent number: 9996478
    Abstract: A system and method for efficiently performing data allocation in a cache memory are described. A lookup is performed in a cache responsive to detecting an access request. If the targeted data is found in the cache and the targeted data is of a no allocate data type indicating the targeted data is not expected to be reused, then the targeted data is read from the cache without updating cache replacement policy information for the targeted data responsive to the access. If the lookup results in a miss, the targeted data is prevented from being allocated in the cache.
    Type: Grant
    Filed: December 9, 2016
    Date of Patent: June 12, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Mark Fowler
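    Illustration (not part of the patent record): the hit and miss handling for "no allocate" data can be sketched against a small LRU cache. The LRU policy, the `Cache` class, and the `no_allocate` flag are assumptions for this sketch; the abstract does not name a replacement policy.

```python
from collections import OrderedDict


class Cache:
    """Hypothetical LRU cache with a no-allocate access path."""

    def __init__(self, capacity, backing):
        self.capacity = capacity
        self.backing = backing      # dict standing in for lower-level memory
        self.lines = OrderedDict()  # least recently used entry first

    def read(self, addr, no_allocate=False):
        if addr in self.lines:
            if not no_allocate:
                # Normal hit: refresh replacement-policy information.
                self.lines.move_to_end(addr)
            # No-allocate hit: return data, leave LRU order untouched.
            return self.lines[addr]
        data = self.backing[addr]
        if not no_allocate:
            self.lines[addr] = data
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict least recently used
        # No-allocate miss: data is returned but never placed in the cache.
        return data
```

    A no-allocate hit therefore does not protect a line from eviction, and a no-allocate miss does not displace any resident line.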
  • Patent number: 9990203
    Abstract: Methods, devices, and systems for capturing an accuracy of an instruction executing on a processor. An instruction may be executed on the processor, and the accuracy of the instruction may be captured using a hardware counter circuit. The accuracy of the instruction may be captured by analyzing bits of at least one value of the instruction to determine a minimum or maximum precision datatype for representing the field, and determining whether to adjust a value of the hardware counter circuit accordingly. The representation may be output to a debugger or logfile for use by a developer, or may be output to a runtime or virtual machine to automatically adjust instruction precision or gating of portions of the processor datapath.
    Type: Grant
    Filed: December 28, 2015
    Date of Patent: June 5, 2018
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Leonardo de Paula Rosa Piga, Abhinandan Majumdar, Indrani Paul, Wei Huang, Manish Arora, Joseph L. Greathouse
  • Patent number: 9990289
    Abstract: A processing system having a multilevel cache hierarchy employs techniques for repurposing dead cache blocks so as to use otherwise wasted space in a cache hierarchy employing a write-back scheme. For a cache line containing invalid data with a valid tag, the valid tag is maintained for cache coherence purposes or otherwise, resulting in a valid tag for a dead cache block. A cache controller repurposes the dead cache block by storing any of a variety of new data at the dead cache block, while storing the new tag in a tag entry of a dead block tag way with an identifier indicating the location of the new data.
    Type: Grant
    Filed: September 19, 2014
    Date of Patent: June 5, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Gabriel H. Loh, Derek R. Hower, Shuai Che
  • Patent number: 9983652
    Abstract: Systems, apparatuses, and methods for balancing computation and communication power in power constrained environments. A data processing cluster with a plurality of compute nodes may perform parallel processing of a workload in a power constrained environment. Nodes that finish tasks early may be power-gated based on one or more conditions. In some scenarios, a node may predict a wait duration and go into a reduced power consumption state if the wait duration is predicted to be greater than a threshold. The power saved by power-gating one or more nodes may be reassigned for use by other nodes. A cluster agent may be configured to reassign the unused power to the active nodes to expedite workload processing.
    Type: Grant
    Filed: December 4, 2015
    Date of Patent: May 29, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Leonardo Piga, Indrani Paul, Wei Huang
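    Illustration (not part of the patent record): one way a cluster agent might reassign power is sketched below. The function name `reassign_power`, the even split among active nodes, and the dictionary inputs are assumptions; the abstract does not say how the saved power is divided.

```python
def reassign_power(budgets, predicted_wait, threshold, gated_power=0):
    """Hypothetical sketch: gate long-waiting nodes and share their power.

    budgets:        node name -> current power budget (watts)
    predicted_wait: node name -> predicted wait, only for nodes that
                    finished their task early
    threshold:      a node is gated if its predicted wait exceeds this value
    """
    gated = {n for n, wait in predicted_wait.items() if wait > threshold}
    active = [n for n in budgets if n not in gated]
    new_budgets = dict(budgets)

    freed = 0
    for node in gated:
        freed += budgets[node] - gated_power
        new_budgets[node] = gated_power

    if active:
        # Assumption: freed power is split evenly among active nodes.
        share = freed / len(active)
        for node in active:
            new_budgets[node] += share
    return new_budgets
```

    The total budget is unchanged whenever at least one node stays active, which matches the power-constrained setting the abstract describes.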
  • Patent number: 9983655
    Abstract: A method and apparatus for performing inter-lane power management includes de-energizing one or more execution lanes upon a determination that the one or more execution lanes are to be predicated. Energy from the predicated execution lanes is redistributed to one or more active execution lanes.
    Type: Grant
    Filed: December 9, 2015
    Date of Patent: May 29, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mitesh R. Meswani, David A. Roberts, Dmitri Yudanov, Arkaprava Basu, Sergey Blagodurov
  • Publication number: 20180143781
    Abstract: A processing apparatus is provided that includes a plurality of memory regions each corresponding to a memory address and configured to store data associated with the corresponding memory address. The processing apparatus also includes an accelerated processing device in communication with the memory regions and configured to determine a request to allocate an initial memory buffer comprising a number of contiguous memory regions, create a new memory buffer comprising one or more additional memory regions adjacent to the contiguous memory regions of the initial memory buffer, assign one or more values to the one or more additional memory regions and detect a change to the one or more values at the one or more additional memory regions.
    Type: Application
    Filed: November 23, 2016
    Publication date: May 24, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Joseph L. Greathouse, Christopher D. Erb, Michael G. Collins
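    Illustration (not part of the patent record): the idea of padding a buffer with known values and later checking them can be sketched as follows. The guard byte `GUARD`, the choice to place guard regions only after the buffer, and the function names are assumptions for this sketch.

```python
GUARD = 0xA5  # hypothetical known value written to the extra regions


def allocate_with_guard(memory, start, size, guard_len):
    """Reserve `size` bytes at `start` plus `guard_len` guard bytes after it."""
    end = start + size
    for i in range(end, end + guard_len):
        memory[i] = GUARD
    return (start, size, guard_len)


def overflow_detected(memory, buf):
    """Report whether any guard byte next to the buffer has changed."""
    start, size, guard_len = buf
    end = start + size
    return any(memory[i] != GUARD for i in range(end, end + guard_len))
```

    A write that runs past the end of the requested buffer lands in the guard region and changes its value, which the later check detects.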
  • Publication number: 20180144536
    Abstract: Techniques for removing duplicate indices from an index stream are disclosed. The techniques involve dividing the indices into chunks. For any particular chunk, the techniques involve examining each index in the chunk to determine whether a “match” exists for that index within a reuse depth sliding window. The reuse depth sliding window includes a fixed number of indices immediately prior to the index being examined for a match. If a match exists, then the index is marked as non-unique and is assigned a position value equal to the position value of the matching index. If a match does not exist, then the index is marked as unique and assigned the next available position value for the chunk. After assigning position values to indices in a chunk, the indices in the chunk are transmitted to a vertex shader stage for processing in the order indicated by the position values.
    Type: Application
    Filed: November 23, 2016
    Publication date: May 24, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Saad Arrabi, Mangesh P. Nijasure, Todd Martin
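    Illustration (not part of the patent record): the per-chunk position assignment can be sketched as below. The function name `assign_positions` is hypothetical, and the sketch assumes the sliding window does not reach back into a previous chunk, which the abstract does not state.

```python
def assign_positions(chunk, reuse_depth):
    """Assign a position value to each index in one chunk.

    An index that matches one of the `reuse_depth` indices immediately
    before it is non-unique and reuses the match's position value.
    Otherwise it is unique and takes the next free position value.
    """
    positions = []
    unique = []
    next_position = 0
    for i, index in enumerate(chunk):
        window_start = max(0, i - reuse_depth)
        match = None
        # Search the window from the most recent index backwards.
        for j in range(i - 1, window_start - 1, -1):
            if chunk[j] == index:
                match = j
                break
        if match is None:
            positions.append(next_position)
            unique.append(True)
            next_position += 1
        else:
            positions.append(positions[match])
            unique.append(False)
    return positions, unique
```

    For the chunk `[5, 7, 5, 9, 7, 5]`, a reuse depth of 2 only catches the second `5`, while a reuse depth of 4 also catches the later `7` and `5`.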
  • Patent number: 9977756
    Abstract: An internal bus architecture and method is described. Embodiments include a system with multiple bus endpoints coupled to a bus. In addition, the bus endpoints are directly coupled to each other. Embodiments are usable with known bus protocols.
    Type: Grant
    Filed: November 17, 2014
    Date of Patent: May 22, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Stephen Morein, Mark S. Grossman
  • Patent number: 9977609
    Abstract: Systems, apparatuses, and methods for implementing efficient queues and other data structures. A queue may be shared among multiple processors and/or threads without using explicit software atomic instructions to coordinate access to the queue. System software may allocate an atomic queue and corresponding queue metadata in system memory and return, to the requesting thread, a handle referencing the queue metadata. Any number of threads may utilize the handle for accessing the atomic queue. The logic for ensuring the atomicity of accesses to the atomic queue may reside in a management unit in the memory controller coupled to the memory where the atomic queue is allocated.
    Type: Grant
    Filed: March 7, 2016
    Date of Patent: May 22, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan S. Jayasena, Dong Ping Zhang, Paula Aguilera Diez
  • Patent number: 9977854
    Abstract: A computer-implemented method of fabricating an integrated circuit structure includes selecting a first cell from a standard cell library, the first cell having a cell boundary and comprising a metal segment at a first metal track at a metal layer, the metal segment extending along a direction and terminating a specified distance beyond a first edge of the cell boundary. The method further includes placing the first cell at a first location of a physical layout for the integrated circuit structure. The method also includes selecting a second cell from the standard cell library and placing the second cell at a second location of the physical layout such that a second edge of a cell boundary of the second cell abuts the first edge of the cell boundary of the first cell, and wherein the metal segment extends into a metal track at the metal layer of the second cell.
    Type: Grant
    Filed: July 12, 2016
    Date of Patent: May 22, 2018
    Assignees: ATI Technologies ULC, Advanced Micro Devices, Inc.
    Inventors: Omid Rowhani, Ioan Cordos, Kerry Hamel, Donald Clay
  • Publication number: 20180137676
    Abstract: Techniques for removing reset indices from, and identifying primitives in, an index stream that defines a set of primitives to be rendered, are disclosed. The index stream may be specified by an application program executing on the central processing unit. The technique involves classifying the primitive topology for the index stream as either requiring an offset-based technique or requiring a non-offset-based technique. This classification is done by determining whether, according to the primitive topology, each subsequent index can form a primitive with prior indices (e.g., line strip, triangle strip). If each subsequent index can form a primitive with prior indices, then the technique used is the non-offset-based technique. If each subsequent index does not form a primitive with prior indices, but instead at least two indices are required to form a new primitive (e.g., line list, triangle list), then the technique used is the offset-based technique.
    Type: Application
    Filed: November 17, 2016
    Publication date: May 17, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Saad Arrabi, Mangesh P. Nijasure, Todd Martin
  • Patent number: 9971700
    Abstract: A processing device includes a cache implementing a set of at least three cache slices. Each cache slice is to store a corresponding set of cache lines. The cache further includes cache control logic coupled to the set of at least three cache slices. The cache control logic is to map addresses of an address space to the cache such that each address within the address space maps to a corresponding strict subset of two or more cache slices of the set of cache slices.
    Type: Grant
    Filed: November 6, 2015
    Date of Patent: May 15, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Gabriel H. Loh
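    Illustration (not part of the patent record): one possible address-to-slice mapping that satisfies the stated constraint (each address maps to two or more slices, but never to all of them). The modulo-based choice of consecutive slices, the 64-byte line size, and the function name `candidate_slices` are assumptions; the abstract does not describe the mapping function.

```python
def candidate_slices(addr, num_slices=4, subset_size=2, line_bytes=64):
    """Return the slices that may hold the cache line containing `addr`."""
    if num_slices < 3:
        raise ValueError("the cache has at least three slices")
    if not 2 <= subset_size < num_slices:
        raise ValueError("subset must have >= 2 slices and exclude at least one")
    # Assumption: pick consecutive slices starting from the line number.
    base = (addr // line_bytes) % num_slices
    return [(base + i) % num_slices for i in range(subset_size)]
```

    Address 0 maps to slices 0 and 1 under these defaults, and the line at byte offset 192 maps to slices 3 and 0.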
  • Patent number: 9971708
    Abstract: Described is a method and apparatus for application migration between a dockable device and a docking station in a seamless manner. The dockable device includes a processor and the docking station includes a high-performance processor. The method includes determining a docking state of a dockable device while at least an application is running. Application migration from the dockable device to a docking station is initiated when the dockable device is moving to a docked state. Application migration from the docking station to the dockable device is initiated when the dockable device is moving to an undocked state. The application continues to run during the application migration from the dockable device to the docking station or during the application migration from the docking station to the dockable device.
    Type: Grant
    Filed: December 2, 2015
    Date of Patent: May 15, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jonathan Lawrence Campbell, Yuping Shen
  • Publication number: 20180129504
    Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
    Type: Application
    Filed: November 6, 2017
    Publication date: May 10, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
  • Patent number: 9965343
    Abstract: Disclosed is a method of determining concurrency factors for an application running on a parallel processor. Also disclosed is a system for implementing the method. In an embodiment, the method includes running at least a portion of the kernel as sequences of mini-kernels, each mini-kernel including a number of concurrently executing workgroups. The number of concurrently executing workgroups is defined as a concurrency factor of the mini-kernel. A performance measure is determined for each sequence of mini-kernels. From the sequences, a particular sequence is chosen that achieves a desired performance of the kernel, based on the performance measures. The kernel is executed with the particular sequence.
    Type: Grant
    Filed: May 13, 2015
    Date of Patent: May 8, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Rathijit Sen, Indrani Paul, Wei Huang
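    Illustration (not part of the patent record): a sketch of choosing a mini-kernel sequence from measured performance. It assumes each candidate sequence uses a single concurrency factor throughout and that lower measured cost is better; `split_into_mini_kernels`, `choose_sequence`, and the caller-supplied `measure` function are hypothetical names.

```python
def split_into_mini_kernels(total_workgroups, factor):
    """Split a kernel into mini-kernels of at most `factor` workgroups each."""
    sequence = []
    remaining = total_workgroups
    while remaining > 0:
        count = min(factor, remaining)
        sequence.append(count)
        remaining -= count
    return sequence


def choose_sequence(total_workgroups, factors, measure):
    """Pick the sequence with the lowest measured cost.

    `measure` is supplied by the caller and returns a cost (for example,
    run time) for a given sequence of mini-kernel sizes.
    """
    best_cost = None
    best_sequence = None
    for factor in factors:
        sequence = split_into_mini_kernels(total_workgroups, factor)
        cost = measure(sequence)
        if best_cost is None or cost < best_cost:
            best_cost, best_sequence = cost, sequence
    return best_sequence
```

    In practice `measure` would time a real run of each sequence; a synthetic cost model is enough to exercise the selection logic.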
  • Patent number: 9965222
    Abstract: A data processing system includes a memory channel and a data processor coupled to the memory channel. The data processor includes a memory controller coupled to the memory channel and is adapted to access at least one rank of double data rate memory. The memory controller includes a command queue for storing received memory access requests, and an arbiter for picking memory access requests from the command queue, and then providing the memory access requests to the memory channel. The memory access requests are selected based on predetermined criteria, and in response to a mode register access request to quiesce pending operations. Additionally, the memory controller includes a mode register access controller that in response to the mode register access request, generates at least one corresponding mode register set command to a memory bus. The memory controller then relinquishes control of the memory bus to the arbiter thereafter.
    Type: Grant
    Filed: October 21, 2016
    Date of Patent: May 8, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Kevin M. Brandl, Scott P. Murphy, James R. Magro, Paramjit K. Lubana