Patents Assigned to Advanced Micro Devices
  • Patent number: 10079916
    Abstract: Systems, apparatuses, and methods for reducing inter-node bandwidth are contemplated. A computer system includes requesting nodes sending transactions to target nodes. A requesting node sends a packet that includes a register identifier (ID) in place of a data value in the packet. The register ID indicates a register in the target node storing the data value. The register ID uses fewer bits in the packet than the data value. The data value may be a memory address referencing a memory location in the target node. The received packet may also include an opcode indicating an operation to perform on the targeted data value.
    Type: Grant
    Filed: April 26, 2016
    Date of Patent: September 18, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: David A. Roberts, Kevin Y. Cheng, Nathan Hu
  • Patent number: 10078588
    Abstract: The described embodiments include a computing device with two or more translation lookaside buffers (TLB) that performs operations for handling entries in the TLBs. During operation, the computing device maintains lease values for entries in the TLBs, the lease values representing times until leases for the entries expire, wherein a given entry in the TLB is invalid when the associated lease has expired. The computing device uses the lease value to control operations that are allowed to be performed using information from the entries in the TLBs. In addition, the computing device maintains, in a page table, longest lease values for page table entries indicating when corresponding longest leases for entries in TLBs expire. The longest lease values are used to determine when and if a TLB shootdown is to be performed.
    Type: Grant
    Filed: March 25, 2016
    Date of Patent: September 18, 2018
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Arkaprava Basu, Mark H. Oskin, Gabriel H. Loh, Andrew G. Kegel, David S. Christie, Kevin J. McGrath
  • Patent number: 10073785
    Abstract: In a processing system comprising a cache, a method includes monitoring demand cache accesses for a thread to maintain a first running count of a number of times demand cache accesses for the thread are directed to cachelines that are adjacent in a first direction to cachelines that are targets of a set of sampled cache accesses for the thread. In response to determining the first running count has exceeded a first threshold, the method further includes enabling a first prefetching mode in which a received demand cache access for the thread triggers a prefetch request for a cacheline adjacent in the first direction to a cacheline targeted by the received demand cache access.
    Type: Grant
    Filed: June 13, 2016
    Date of Patent: September 11, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventor: William Evan Jones, III
  • Patent number: 10073746
    Abstract: Methods and apparatus presented herein provide distributed checkpointing in a multi-node system, such as a network of servers in a data center. When checkpointing of application state data is needed in a node, the methods and apparatus determine whether checkpoint memory space is available in the node for checkpointing the application state data. If not enough checkpoint memory space is available in the node, the methods and apparatus request and find additional checkpoint memory space from other nodes in the system. In this manner, the methods and apparatus can checkpoint the application state data into available checkpoint memory spaces distributed among a plurality of nodes. This allows for high bandwidth and low latency checkpointing operations in the multi-node system.
    Type: Grant
    Filed: July 12, 2016
    Date of Patent: September 11, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sergey Blagodurov, Taniya Siddiqua, Vilas Sridharan
  • Patent number: 10075383
    Abstract: Systems, apparatuses, and methods for implementing an asynchronous router with virtual channel (VC) control. The asynchronous router may support multiple VCs for connections to other routers. The asynchronous router may include an interface unit on each switch boundary, with each interface unit including a data merge unit. The data merge unit may include a full detector unit for each VC, with the full detector unit counting the number of flits sent out on a respective VC and counting the number of credits released by the successor router. Whenever the successor router has no credits available, the full detector unit will assert the full signal to prevent any input requests from requesting to transmit over that particular VC. When the full signal is asserted, a timer unit may be activated to repeatedly check if any credits have been released in the successor router.
    Type: Grant
    Filed: March 30, 2016
    Date of Patent: September 11, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Weiwei Jiang, Greg Sadowski
  • Patent number: 10073783
    Abstract: A system and method for efficiently processing access requests for a shared resource are described. Each of many requestors are assigned to a partition of a shared resource. When a controller determines no requestor generates an access request for an unassigned partition, the controller permits simultaneous access to the assigned partitions for active requestors. When the controller determines at least one active requestor generates an access request for an unassigned partition, the controller allows a single active requestor to gain exclusive access to the entire shared resource while stalling access for the other active requestors. The controller alternatives exclusive access among the active requestors. In various embodiments, the shared resource is a local data store in a graphics processing unit and each of the multiple requestors is a single instruction multiple data (SIMD) compute unit.
    Type: Grant
    Filed: November 23, 2016
    Date of Patent: September 11, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Daniel Clifton, Michael J. Mantor, Hans Burton
  • Patent number: 10074600
    Abstract: Various resistor circuits and methods of making and using the same are disclosed. In one aspect, a method of manufacturing is provided that includes forming a resistor onboard an interposer. The resistor is adapted to dampen a capacitive network. The capacitive network has at least one capacitor positioned external to the interposer.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: September 11, 2018
    Assignees: ATI Technologies ULC, Advanced Micro Devices, Inc.
    Inventors: Fei Guo, Feng Zhu, Julius Din, Anwar Kashem, Sally Yeung
  • Patent number: 10068372
    Abstract: A method, a system, and a computer-readable storage medium directed to performing high-speed parallel tessellation of 3D surface patches are disclosed. The method includes generating a plurality of primitives in parallel. Each primitive in the plurality is generated by a sequence of functional blocks, in which each sequence acts independently of all the other sequences.
    Type: Grant
    Filed: December 30, 2015
    Date of Patent: September 4, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Timour T. Paltashev, Boris Prokopenko, Vladimir V. Kibardin
  • Patent number: 10067911
    Abstract: Systems, apparatuses, and methods for performing in-place matrix transpose operations are disclosed. Operations for transposing tiles of a matrix are scheduled in an order determined by moving diagonally through tiles of the matrix. When a diagonal line hits a boundary, then a tile on a new diagonal line of the matrix is selected and operations are scheduled for transposing this tile. Only tiles within a triangular region of the matrix are scheduled for being transposed. This allows memory access operations to be performed in parallel, expediting the matrix transpose operation compared to linear tile indexing.
    Type: Grant
    Filed: July 26, 2016
    Date of Patent: September 4, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amir Gholaminejad, Bragadeesh Natarajan
  • Patent number: 10067709
    Abstract: Systems, apparatuses, and methods for accelerating page migration using a two-level bloom filter are disclosed. In one embodiment, a system includes a GPU and a CPU and a multi-level memory hierarchy. When a memory request misses in a first memory, the GPU is configured to check a first level of a two-level bloom filter to determine if a page targeted by the memory request is located in a second memory. If the first level of the two-level bloom filter indicates that the page is not in the second memory, then the GPU generates a page fault and sends the memory request to a third memory. If the first level of the two-level bloom filter indicates that the page is in the second memory, then the GPU sends the memory request to the CPU.
    Type: Grant
    Filed: September 19, 2016
    Date of Patent: September 4, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Leonardo Piga, Mauricio Breternitz
  • Patent number: 10067710
    Abstract: A processing apparatus is provided that includes a plurality of memory regions each corresponding to a memory address and configured to store data associated with the corresponding memory address. The processing apparatus also includes an accelerated processing device in communication with the memory regions and configured to determine a request to allocate an initial memory buffer comprising a number of contiguous memory regions, create a new memory buffer comprising one or more additional memory regions adjacent to the contiguous memory regions of the initial memory buffer, assign one or more values to the one or more additional memory regions and detect a change to the one or more values at the one or more additional memory regions.
    Type: Grant
    Filed: November 23, 2016
    Date of Patent: September 4, 2018
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Joseph L. Greathouse, Christopher D. Erb, Michael G. Collins
  • Patent number: 10067555
    Abstract: An apparatus and a method for controlling power consumption associated with a computing device having first and second processors configured to perform different types of operations includes providing a user interface that allows, during normal operation of the computing device, at least one of: (i) a user selection of desired performance levels of the first and second processors relative to one another, such that higher desired performance levels of one processor correspond to lower desired performance levels of the other processor, and (ii) a user selection of a desired performance level of the first processor and a user selection of a desired performance level of the second processor, the two user selections being made independently of one another. The apparatus and method control, during normal operation of the computing device, performance levels of the processors in response to the one or more user selections of the desired performance levels.
    Type: Grant
    Filed: February 20, 2014
    Date of Patent: September 4, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventor: I-Ming Lin
  • Patent number: 10067718
    Abstract: Dynamic random access memory (DRAM) chips in memory modules include multi-purpose registers (MPRs) having pre-defined data patterns which, when selected, are accessed with read commands and output on data lines for performing read training. The MPRs are accessed by issuing read commands to specific register addresses to request reads from specific MPR locations. In some embodiments, read training for memory modules includes addressing, for a first half of a memory module, a read command to a first register address and performing read training using a first set of bit values received in response to addressing using the first register address. For a second half of the memory module, the same read command is used, but read training is performed using a second set of bit values received in response to addressing using the first register address.
    Type: Grant
    Filed: September 23, 2016
    Date of Patent: September 4, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Glennis Eliagh Covington, Kevin M. Brandl, Nienchi Hu, Shannon T. Kesner
  • Patent number: 10068794
    Abstract: A system and method for fabricating non-planar devices while managing semiconductor processing yield and cost are described. A semiconductor device fabrication process forms a stack of alternating semiconductor layers. A trench is etched and filled with at least an oxide layer with a length at least that of a device channel length while being bounded by sites for a source region and a drain region. The process places a second silicon substrate on top of both the oxide layer in the trench and the top-most semiconducting layer of the stack. The two surfaces making contact by wafer bonding use the same type of semiconducting layer. The device is flipped such that the first substrate and the stack are on top of the second substrate. The process forms nanowires of a gate region from the stack in the top first substrate.
    Type: Grant
    Filed: January 31, 2017
    Date of Patent: September 4, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Richard T. Schultz
  • Patent number: 10067872
    Abstract: A plurality of memory modules, which may be used to form a heterogeneous memory system, are connected to a plurality of prefetchers. Each prefetcher is independently configured to prefetch information from a corresponding one of the plurality of memory modules in response to feedback from the corresponding one of the plurality of memory modules.
    Type: Grant
    Filed: June 22, 2015
    Date of Patent: September 4, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventor: David A. Roberts
  • Publication number: 20180246655
    Abstract: Improvements in compute shader programs executed on parallel processing hardware are disclosed. An application or other entity defines a sequence of shader programs to execute. Each shader program defines inputs and outputs which would, if unmodified, execute as loads and stores to a general purpose memory, incurring high latency. A compiler combines the shader programs into groups that can operate in a lower-latency, but lower-capacity local data store memory. The boundaries of these combined shader programs are defined by several aspects including where memory barrier operations are to execute, whether combinations of shader programs can execute using only the local data store and not the global memory (except for initial reads and writes) and other aspects.
    Type: Application
    Filed: February 24, 2017
    Publication date: August 30, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Michael L. Schmit, Radhakrishna Giduthuri
  • Publication number: 20180246815
    Abstract: Techniques are provided for managing address translation request traffic where memory access requests can be made with differing quality-of-service levels, which specify latency and/or bandwidth requirements. The techniques involve translation lookaside buffers. Within the translation lookaside buffers, certain resources are reserved for specific quality-of-service levels. More specifically, translation lookaside buffer slots, which store the actual translations, as well as finite state machines in a work queue, are reserved for specific quality-of-service levels. The translation lookaside buffer receives multiple requests for address translation. The translation lookaside buffer selects requests having the highest quality-of-service level for which an available finite state machine is available.
    Type: Application
    Filed: February 24, 2017
    Publication date: August 30, 2018
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Wade K. Smith, Kostantinos Danny Christidis
  • Publication number: 20180246816
    Abstract: Techniques are provided for using a translation lookaside buffer to provide low latency memory address translations for data streams. Clients of a memory system first prepare the address translation cache hierarchy by requesting that a translation pre-fetch stream is initialized. After the translation pre-fetch stream is initialized, the cache hierarchy returns an acknowledgment of completion to the client, which then begins to access memory. Pre-fetch streams are specified in terms of address ranges and are performed for large contiguous portions of the virtual memory address space.
    Type: Application
    Filed: February 24, 2017
    Publication date: August 30, 2018
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Wade K. Smith, Kostantinos Danny Christidis
  • Publication number: 20180246657
    Abstract: Techniques for handling data compression in which metadata that indicates which portions of data are compressed are which portions of data are not compressed are disclosed. Segments of a buffer referred to as block groups store compressed blocks of data along with uncompressed blocks of data and hash blocks. If a block group includes a block that is a hash of another block in the block group, then the other block is considered to be compressed. If the block group does not include a block that is a hash of another block in the block group, then the blocks in the block group are uncompressed. The hash function to generate the hash is selected to prevent “collisions,” which occur when the data being stored in the buffer is such that it is possible for a hash block and an uncompressed block to be the same.
    Type: Application
    Filed: February 24, 2017
    Publication date: August 30, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Greg Sadowski
  • Patent number: 10062206
    Abstract: A parallel adaptable graphics rasterization system in which a primitive assembler includes a router to selectively route a primitive to a first rasterizer or one of a plurality of second rasterizers. The second rasterizers concurrently operate on different primitives and the primitive is selectively routed based on an area of the primitive. In some variations, a bounding box of the primitive is reduced to a predetermined number of pixels prior to providing the primitive to the one of the plurality of second rasterizers. Reducing the bounding box can include subtracting an origin of the bounding box from coordinates of points that represent the primitive.
    Type: Grant
    Filed: August 30, 2016
    Date of Patent: August 28, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Boris Prokopenko, Timour T. Paltashev, Vladimir V. Kibardin