Patents Assigned to Advanced Micro Devices
  • Publication number: 20210225060
    Abstract: A processing device and a method of tiled rendering of an image for display is provided. The processing device includes memory and a processor. The processor is configured to receive the image comprising one or more three dimensional (3D) objects, divide the image into tiles, execute coarse level tiling for the tiles of the image and execute fine level tiling for the tiles of the image. The processing device also includes same fixed function hardware used to execute the coarse level tiling and the fine level tiling. The processor is also configured to determine visibility information for a first one of the tiles. The visibility information is divided into draw call visibility information and triangle visibility information for each remaining tile of the image.
    Type: Application
    Filed: September 25, 2020
    Publication date: July 22, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Mika Tuomi, Kiia Kallio, Ruijin Wu, Anirudh R. Acharya, Vineet Goel
  • Publication number: 20210224130
    Abstract: Methods and systems for load balancing in a neural network system using metadata are disclosed. Any one or a combination of one or more kernels, one or more neurons, and one or more layers of the neural network system are tagged with metadata. A scheduler detects whether there are neurons that are available to execute. The scheduler uses the metadata to schedule and load balance computations across compute resources and available resources.
    Type: Application
    Filed: April 5, 2021
    Publication date: July 22, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Nicholas Malaya, Yasuko Eckert
  • Patent number: 11068368
    Abstract: Automatic part testing includes: booting a part under testing into a first operating environment; executing, via the first operating environment, one or more test patterns on the part; performing a comparison between one or more observed characteristics associated with the one or more test patterns and one or more expected characteristics; and modifying one or more operational parameters of a central processing unit of the part based on the comparison.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: July 20, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Amitabh Mehra, Anil Harwani, William R. Alverson, Grant E. Ley, Jerry A. Ahrens, Mustansir M. Pratapgarhwala, Scott E. Swanstrom
  • Patent number: 11068458
    Abstract: A portion of a graph dataset is generated for each computing node in a distributed computing system by, for each subject vertex in a graph, recording for the computing node an offset for the subject vertex, where the offset references a first position in an edge array for the computing node, and for each edge of a set of edges coupled with the subject vertex in the graph, calculating an edge value for the edge based on a connected vertex identifier identifying a vertex coupled with the subject vertex via the edge. When the edge value is assigned to the first position, the edge value is determined by a first calculation, and when the edge value is assigned to position subsequent to the first position, the edge value is determined by a second calculation. In the computing node, the edge value is recorded in the edge array.
    Type: Grant
    Filed: November 27, 2018
    Date of Patent: July 20, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mohamed Assem Ibrahim, Onur Kayiran, Yasuko Eckert
  • Patent number: 11061583
    Abstract: An electronic device includes a non-volatile memory and a controller. The controller receives data to be written to the non-volatile memory and determines a type of the data. Based on the type of the data, the controller selects a given duration of the data from among multiple durations of the data in the non-volatile memory. The controller sets values of one or more parameters for writing the data to the non-volatile memory based on the given duration. The controller writes the data to the non-volatile memory using the values of the one or more write parameters.
    Type: Grant
    Filed: July 30, 2019
    Date of Patent: July 13, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Andrew G. Kegel, Steven E. Raasch
  • Patent number: 11061753
    Abstract: Systems, apparatuses, and methods for implementing a hardware enforcement mechanism to enable platform-specific firmware visibility into an error state ahead of the operating system are disclosed. A system includes at least one or more processor cores, control logic, a plurality of registers, platform-specific firmware, and an operating system (OS). The control logic allows the platform-specific firmware to decide if and when the error state is visible to the OS. In some cases, the platform-specific firmware blocks the OS from accessing the error state. In other cases, the platform-specific firmware allows the OS to access the error state such as when the OS needs to unmap a page. The control logic enables the platform-specific firmware, rather than the OS, to make decisions about the replacement of faulty components in the system.
    Type: Grant
    Filed: March 29, 2018
    Date of Patent: July 13, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Dean A. Liberty, Vilas K. Sridharan, Michael T. Clark, Jelena Ilic, David S. Christie, James R. Williamson, Cristian Constantinescu
  • Patent number: 11062680
    Abstract: Systems, apparatuses, and methods for implementing raster order view enforcement techniques are disclosed. A processor includes a plurality of compute units coupled to one or more memories. A plurality of waves are launched in parallel for execution on the plurality of compute units, where each wave comprises a plurality of threads. A dependency chain is generated for each wave of the plurality of waves. The compute units wait for all older waves to complete dependency chain generation prior to executing any threads with dependencies. Responsive to all older waves completing dependency chain generation, a given thread with a dependency is executed only if all other threads upon which the given thread is dependent have become inactive. When executed, the plurality of waves generate a plurality of pixels to be driven to a display.
    Type: Grant
    Filed: December 20, 2018
    Date of Patent: July 13, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Pazhani Pillai, Christopher J. Brennan
  • Patent number: 11061429
    Abstract: A technique for fine-granularity speed binning for a processing device is provided. The processing device includes a plurality of clock domains, each of which may be clocked with independent clock signals. The clock frequency at which a particular clock domain may operate is determined based on the longest propagation delay between clocked elements in that particular clock domain. The processing device includes measurement circuits for each clock domain that measure such propagation delay. The measurement circuits are replica propagation delay paths of actual circuit elements within each particular clock domain. A speed bin for each clock domain is determined based on the propagation delay measured for the measurement circuits for a particular clock domain. Specifically, a speed bin is chosen that is associated with the fastest clock speed whose clock period is longer than the slowest propagation delay measured for the measurement circuit for the clock domain.
    Type: Grant
    Filed: October 26, 2017
    Date of Patent: July 13, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Greg Sadowski, Shomit N. Das
  • Patent number: 11061572
    Abstract: Described are a method and processing apparatus to tag and track objects related to memory allocation calls. An application or software adds a tag to a memory allocation call to enable object level tracking. An entry is made into an object tracking table, which stores the tag and a variety of statistics related to the object and associated memory devices. The object statistics may be queried by the application to tune power/performance characteristics either by the application making runtime placement decisions, or by off-line code tuning based on a previous run. The application may add a tag to a memory allocation call to specify the type of memory characteristics requested based on the object statistics.
    Type: Grant
    Filed: April 22, 2016
    Date of Patent: July 13, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: David A. Roberts, Michael Ignatowski
  • Patent number: 11064019
    Abstract: A server includes a plurality of nodes that are connected by a network that includes an on-chip network or an inter-chip network that connects the nodes. The server also includes a controller to configure the network based on relative priorities of workloads that are executing on the nodes. Configuring the network can include allocating buffers to virtual channels supported by the network based on the relative priorities of the workloads associated with the virtual channels, configuring routing tables that route the packets over the network based on the relative priorities of the workloads that generate the packets, or modifying arbitration weights to favor granting access to the virtual channels to packets generated by higher priority workloads.
    Type: Grant
    Filed: September 14, 2016
    Date of Patent: July 13, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventor: Sergey Blagodurov
  • Publication number: 20210209192
    Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
    Type: Application
    Filed: March 22, 2021
    Publication date: July 8, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
  • Publication number: 20210209832
    Abstract: A technique for performing ray tracing operations is provided. The technique includes initiating bounding volume hierarchy traversal for a ray against geometry represented by a bounding volume hierarchy; identifying multiple nodes of the bonding volume hierarchy for concurrent intersection tests; and performing operations for the concurrent intersection tests concurrently.
    Type: Application
    Filed: December 18, 2020
    Publication date: July 8, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Skyler Jonathon Saleh, Ruijin Wu
  • Publication number: 20210209831
    Abstract: A method, system, and non-transitory computer readable storage medium for rasterizing primitives are disclosed. The method, system, and non-transitory computer readable storage medium includes: generating a primitive batch from a sequence of one or more primitives, wherein the primitive batch includes primitives sorted into one or more row groups based on which row of a plurality of rows each primitive intersects; and processing each row group, the processing for each row group including: identifying one or more primitive column intercepts for each of the one or more primitives in the row group, wherein each combination of primitive column intercept and row identifies a bin; and rasterizing the one or more primitives that intersect the bin.
    Type: Application
    Filed: March 22, 2021
    Publication date: July 8, 2021
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Michael Mantor, Laurent Lefebvre, Mikko Alho, Mika Tuomi, Kiia Kallio
  • Patent number: 11055806
    Abstract: A method and system for directing image rendering, implemented in a computer system including a plurality of processors includes determining one or more processors in the system on which to execute one or more commands. A graphics processing unit (GPU) control application program interface (API) determines one or more processors in the system on which to execute one or more commands. A signal is transmitted to each of the one or more processors indicating which of the one or more commands are to be executed by that processor. The one or more processors execute their respective command. A request is transmitted to each of the one or more processors to transfer information to one another once processing is complete, and an image is rendered based upon the processed information by at least one processor and the received transferred information from at least another processor.
    Type: Grant
    Filed: February 24, 2016
    Date of Patent: July 6, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Gregory A. Grebe, Jonathan Lawrence Campbell, Layla A. Mah
  • Patent number: 11054887
    Abstract: Systems, apparatuses, and methods for performing efficient power management for a multi-node computing system are disclosed. A computing system includes multiple nodes. When power down negotiation is distributed, negotiation for system-wide power down occurs within a lower level of a node hierarchy prior to negotiation for power down occurring at a higher level of the node hierarchy. When power down negotiation is centralized, a given node combines a state of its clients with indications received on its downstream link and sends an indication on an upstream link based on the combining. Only a root node sends power down requests.
    Type: Grant
    Filed: December 28, 2017
    Date of Patent: July 6, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Benjamin Tsien, Greggory D. Donley, Bryan P. Broussard
  • Patent number: 11055895
    Abstract: Described herein are techniques for reducing control flow divergence. The method includes identifying two or more shader programs having commonalities, generating a merged shader program that implements functionality of the identified two or more shader programs, wherein the functionality implemented includes a first execution option for a first shader program of the two or more shader programs and a second execution option for a second shader program of the two or more shader programs, modifying shader programs that call the first shader program to instead call the merged shader program and select the first execution option, modifying shader programs that call the second shader program to instead call the merged shader program and select the second execution option.
    Type: Grant
    Filed: August 29, 2019
    Date of Patent: July 6, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventor: David Ronald Oldcorn
  • Patent number: 11055150
    Abstract: A thread holding a lock notifies a sleeping thread that is waiting on the lock that the lock holding thread is “about” to release the lock. In response to the notification, the waiting thread is woken up. While the waiting thread is woken up, the lock holding thread completes other operations prior to actually releasing the lock and then releases the lock. The notification to the waiting thread hides latency associated with waking up the waiting thread by allowing operations that wake up the waiting thread to occur while the lock holding thread is performing the other operations prior to releasing the thread.
    Type: Grant
    Filed: April 12, 2018
    Date of Patent: July 6, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Amin Farmahini-Farahani, David A. Roberts
  • Patent number: 11054883
    Abstract: A power management algorithm framework proposes: 1) a Quality-of-Service (QoS) metric for throughput-based workloads; 2) heuristics to differentiate between throughput and latency sensitive workloads; and 3) an algorithm that combines the heuristic and QoS metric to determine target frequency for minimizing idle time and improving power efficiency without any performance degradation. A management algorithm framework enables optimizing power efficiency in server-class throughput-based workloads while still providing desired performance for latency sensitive workloads. The power savings are achieved by identifying workloads in which one or more cores can be run at a lower frequency (and consequently lower power) without a significant negative performance impact.
    Type: Grant
    Filed: June 18, 2018
    Date of Patent: July 6, 2021
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Leonardo De Paula Rosa Piga, Samuel Naffziger, Ivan Matosevic, Indrani Paul
  • Patent number: 11055098
    Abstract: A processor includes a branch target buffer (BTB) having a plurality of entries whereby each entry corresponds to an associated instruction pointer value that is predicted to be a branch instruction. Each BTB entry stores a predicted branch target address for the branch instruction, and further stores information indicating whether the next branch in the block of instructions associated with the predicted branch target address is predicted to be a return instruction. In response to the BTB indicating that the next branch is predicted to be a return instruction, the processor initiates an access to a return stack that stores the return address for the predicted return instruction. By initiating access to the return stack responsive to the return prediction stored at the BTB, the processor reduces the delay in identifying the return address, thereby improving processing efficiency.
    Type: Grant
    Filed: July 24, 2018
    Date of Patent: July 6, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Aparna Thyagarajan, Marius Evers, Arunachalam Annamalai
  • Publication number: 20210200618
    Abstract: A memory controller includes a command queue, a memory interface queue, and a non-volatile error reporting circuit. The command queue receives memory access commands including volatile reads, volatile writes, non-volatile reads, and non-volatile writes, and an output. The memory interface queue has an input coupled to the output of the command queue, and an output for coupling to a non-volatile storage class memory (SCM) module. The non-volatile error reporting circuit identifies error conditions associated with the non-volatile SCM module and maps the error conditions from a first number of possible error conditions associated with the non-volatile SCM module to a second, smaller number of virtual error types for reporting to an error monitoring module of a host operating system, the mapping based at least on a classification that the error condition will or will not have a deleterious effect on an executable process running on the host operating system.
    Type: Application
    Filed: December 30, 2019
    Publication date: July 1, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: James R. Magro, Kedarnath Balakrishnan, Vilas Sridharan