Patents Assigned to Advanced Micro Devices, Incs.
-
Patent number: 12067640Abstract: Techniques for managing register allocation are provided. The techniques include detecting a first request to allocate first registers for a first wavefront; first determining, based on allocation information, that allocating the first registers to the first wavefront would result in a condition in which a deadlock is possible; in response to the first determining, refraining from allocating the first registers to the first wavefront; detecting a second request to allocate second registers for a second wavefront; second determining, based on the allocation information, that allocating the second registers to the second wavefront would result in a condition in which deadlock is not possible; and in response to the second determining, allocating the second registers to the second wavefront.Type: GrantFiled: March 26, 2021Date of Patent: August 20, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Pramod Vasant Argade, Martin G. Sarov, Milind N. Nemlekar
-
Publication number: 20240273040Abstract: Multi-stack compute chip and memory architecture is described. In accordance with the described techniques, a package includes a plurality of computing stacks, and each computing stack includes at least one compute chip and a memory. The package also includes one or more interconnects that couple the computing stacks to at least one other computing stack for sharing the memory in a coherent fashion across the plurality of computing stacks.Type: ApplicationFiled: December 20, 2023Publication date: August 15, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Michael Ignatowski, Michael J. Schulte, Gabriel Hsiuwei Loh
-
Publication number: 20240272791Abstract: Automatic generation of data layout instructions for locating data objects in memory that are involved in a sequence of operations for a computational task is described. In accordance with the described techniques, an interference graph is generated for the sequence of operations, where individual nodes in the interference graph represent data objects involved in the computational task. The interference graph includes edges connecting different pairs of nodes, such that an edge indicates the connected data objects are involved in a common operation of the sequence of operations. Weights are assigned to edges based on architectural characteristics of a system performing the computational task as well as a size of the data objects connected by an edge. Individual data objects are then assigned to locations in memory based on edge weights of edges connected to a node representing the data object, optimizing system performance during the computational task.Type: ApplicationFiled: February 12, 2023Publication date: August 15, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Benjamin Youngjae Cho, Armand Bahram Behroozi, Michael L. Chu, Emily Anne Furst
-
Patent number: 12062126Abstract: Systems, apparatuses, and methods for loading multiple primitives per thread in a graphics pipeline are disclosed. A system includes a graphics pipeline frontend with a geometry engine, shader processor input (SPI), and a plurality of compute units. The geometry engine generates primitives which are accumulated by the SPI into primitive groups. While accumulating primitives, the SPI tracks the number of vertices and primitives per group. The SPI determines wavefront boundaries based on mapping a single vertex to each thread of the wavefront while allowing more than one primitive per thread. The SPI launches wavefronts with one vertex per thread and potentially multiple primitives per thread. The compute units execute a vertex phase and a multi-cycle primitive phase for wavefronts with multiple primitives per thread.Type: GrantFiled: September 29, 2021Date of Patent: August 13, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Todd Martin, Tad Robert Litwiller, Nishank Pathak, Randy Wayne Ramsey
-
Patent number: 12061510Abstract: The computer system responds to a first trigger event to enter a partial off state in which a boot cycle is required to return to a working state. A device plugged into a serial bus port can be charged in the partial off state. A configuration register or runtime environment controls whether the computer system enters the partial off state in response to a trigger event. The computer system stays in the partial off state until another trigger event returns the computer system to the working state. In some implementations, the computer system leaves the partial off state and enters the shutdown state after an unplug event, a predetermined amount of time after an unplug event, a predetermined amount of time after entering the partial off state, a predetermined amount of time after charging of a device is complete, or any combination of such events.Type: GrantFiled: March 5, 2021Date of Patent: August 13, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Jyoti Raheja, Alexander J. Branover
-
Patent number: 12062214Abstract: Methods and systems are disclosed for encoding a Morton code. Techniques disclosed comprise receiving location vectors associated with primitives, where the primitives are graphical elements spatially located within a three-dimensional scene. Techniques further comprise determining a code pattern comprising a prefix pattern and a base pattern, and, then, coding each of the location vectors according to the code pattern.Type: GrantFiled: December 27, 2021Date of Patent: August 13, 2024Assignee: Advanced Micro Devices, Inc.Inventor: John Alexandre Tsakok
-
Publication number: 20240264900Abstract: An exemplary computing device includes a plurality of circuits and/or a plurality of in-situ monitors configured to generate outputs that indicate one or more operating conditions of the circuits. The computing device also includes a system management unit configured to detect a potentially faulty voltage-to-frequency ratio implemented by one of the circuits based at least in part on one or more of the outputs. The system management unit is also configured to modify the potentially faulty voltage-to-frequency ratio based at least in part on one or more of the outputs. Various other devices, systems, and methods are also disclosed.Type: ApplicationFiled: April 3, 2024Publication date: August 8, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Divya Madapusi Srinivas Prasad, Sudhanva Gurumurthi, Yasuko Eckert, Jeffrey Richard Rearick, Sankaranarayanan Gurumurthy, Amitabh Mehra, Shidhartha Das, Alex W. Schaefer, Vikram Ramachandra, Vilas Sridharan
-
Patent number: 12056352Abstract: Generating optimization instructions for data processing pipelines is described. A pipeline optimization system computes resource usage information that describes memory and compute usage metrics during execution of each stage of the data processing pipeline. The system additionally generates data storage information that describes how data output by each pipeline stage is utilized by other stages of the pipeline. The pipeline optimization system then generates the optimization instructions to control how memory operations are performed for a specific data processing pipeline during execution. In implementations, the optimization instructions cause a memory system to discard data (e.g., invalidate cache entries) without copying the discarded data to another storage location after the data is no longer needed by the pipeline. The optimization instructions alternatively or additionally control at least one of evicting, writing-back, or prefetching data to minimize latency during pipeline execution.Type: GrantFiled: September 28, 2022Date of Patent: August 6, 2024Assignee: Advanced Micro Devices, Inc.Inventor: Harris Eleftherios Gasparakis
-
Patent number: 12055991Abstract: A power supply monitor includes a droop detection circuit which receives a digital signal and converts the digital signal to an analog signal, compares the analog signal to a monitored supply voltage, and responsive to detecting a droop below a designated value relative to the analog signal, produces a droop detection signal. The droop detection circuit includes a first comparator circuit with a series of inverters including at least a first complimentary-metal-oxide-semiconductor (CMOS) inverter with an input for receiving the analog signal and a second CMOS inverter, which are both supplied with a monitored supply voltage. The inverters operate in a crowbar mode when the monitored voltage supply is near a designated level, and each include four pull-up transistors connected in two parallel legs of two transistors, and four pull-down transistors connected in two parallel legs of two transistors.Type: GrantFiled: December 22, 2021Date of Patent: August 6, 2024Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Kaushik Mazumdar, Miguel Rodriguez, Mikhail Rodionov, Stephen Victor Kosonocky
-
Patent number: 12056787Abstract: Methods and systems are disclosed for inline suspension of an accelerated processing unit (APU). Techniques include receiving a packet, including a mode of operation and commands to be executed by the APU; suspending execution of commands received in previous packets when the mode of operation is a suspension initiation mode; and executing, by the APU, the commands in the received packet. The execution of the suspended commands is restored when the mode of operation in a subsequently received packet is a suspension conclusion mode.Type: GrantFiled: December 28, 2021Date of Patent: August 6, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Alexander Fuad Ashkar, Mangesh P. Nijasure, Rakan Z. Khraisha, Manu Rastogi
-
Patent number: 12056522Abstract: Within a processing system, thread count asymmetries manifest when one or more cores of a processing device are disabled. To determine such thread count asymmetries, discovery operations are performed to determine thread count asymmetries for one or more hierarchy levels of a processing device based on a number of threads per enumerated instance within the hierarchy level. In response to the determining a thread count asymmetry, one thread identifier for each enumerated instance within the asymmetric hierarchy level is defined to determine a representation of the asymmetry. Using the representation of the symmetry, software tasks associated with one or more application within the processing system are performed.Type: GrantFiled: November 19, 2021Date of Patent: August 6, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Michael L. Golden, Paul Blinzer, Magiting M. Talisayon, Srikanth Masanam, Ripal Butani, Upasanah Swaminathan
-
Publication number: 20240257435Abstract: A processing device and a method of tiled rendering of an image for display is provided. The processing device includes memory and a processor. The processor is configured to receive the image comprising one or more three dimensional (3D) objects, divide the image into tiles, execute coarse level tiling for the tiles of the image and execute fine level tiling for the tiles of the image. The processing device also includes same fixed function hardware used to execute the coarse level tiling and the fine level tiling. The processor is also configured to determine visibility information for a first one of the tiles. The visibility information is divided into draw call visibility information and triangle visibility information for each remaining tile of the image.Type: ApplicationFiled: April 11, 2024Publication date: August 1, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Mika Tuomi, Kiia Kallio, Ruijin Wu, Anirudh R. Acharya, Vineet Goel
-
Patent number: 12050531Abstract: In accordance with the described techniques for data compression and decompression for processing in memory, a page address is received by a processing in memory component that maps to a first location in memory where data of a page is maintained. The data of the page is compressed by the processing in memory component. Further, compressed data of the page is written by the processing in memory component to a compressed block device responsive to the compressed data satisfying one or more compressibility criteria. The compressed block device is a portion of the memory dedicated to storing data in a compressed form.Type: GrantFiled: September 26, 2022Date of Patent: July 30, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Kishore Punniyamurthy, Jagadish B Kotra
-
Patent number: 12051154Abstract: Systems and methods for distributed rendering using two-level binning include processing primitives of a frame to be rendered at a first graphics processing unit (GPU) chiplet in a set of GPU chiplets to generate visibility information of primitives for each coarse bin and providing the visibility information to the other GPU chiplets in the set of GPU chiplets. Each coarse bin is then assigned to one of the GPU chiplets of the set of GPU chiplets and rendered at the assigned GPU chiplet based on the corresponding visibility information.Type: GrantFiled: December 27, 2021Date of Patent: July 30, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Anirudh R. Acharya, Ruijin Wu
-
Patent number: 12050916Abstract: Array of pointers prefetching is described. In accordance with described techniques, a pointer target instruction is detected by identifying that a destination location of a load instruction is used in an address compute for a memory operation and the load instruction is included in a sequence of load instructions having addresses separated by a step size. An instruction for fetching data of a future load instruction is injected in an instruction stream of a processor. The data of the future load instruction is stored in a temporary register. An additional instruction is injected in the instruction stream for prefetching a pointer target based on an address of the memory operation and the data of the future load instruction.Type: GrantFiled: March 25, 2022Date of Patent: July 30, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Chetana N Keltcher, Alok Garg, Paul S Keltcher
-
Patent number: 12052472Abstract: Techniques are provided herein for processing video data. The techniques include identifying one or more input factors including one or more of signal quality factors, video content complexity factors, and hardware buffering factors for one or more of a video encoding system and a video playback system; evaluating the one or more input factors to determine adjustments to apply to one or both of the video encoding system and the video playback system; and applying the determine adjustments to the one or both of the video encoding system and the video playback system.Type: GrantFiled: September 25, 2020Date of Patent: July 30, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Adam H. Li, Eugene Kuznetsov, Girish P. Subramaniam, Jihyuk Choi
-
Patent number: 12051144Abstract: An apparatus such as a graphics processing unit (GPU) includes a set of shader engines and a set of front end (FE) circuits. Subsets of the set of FE circuits schedule geometry workloads for subsets of the set of shader engines based on a mapping. The apparatus also includes a set of physical paths that convey information from the set of FE circuits to a memory via the set of shader engines. Subsets of the set of physical paths are allocated to the subsets of the set of FE circuits and the subsets of the set of shader engines based on the mapping. The mapping determines information stored in a set of registers used to configure the apparatus. In some cases, the set of registers store information indicating a spatial partitioning of the set of physical paths.Type: GrantFiled: February 28, 2020Date of Patent: July 30, 2024Assignee: Advanced Micro Devices, Inc.Inventor: Rex Eldon McCrary
-
Patent number: 12052153Abstract: Systems, apparatuses, and methods for enabling localized control of link states in a computing system are disclosed. A computing system includes at least a host processor, a communication fabric, one or more devices, one or more links, and a local link controller to monitor the one or more links. In various implementations, the local link controller detects and controls states of a link without requiring communication with, or intervention by, the host processor. In various implementations, this local control by the link controller includes control over the clock signals provided to the link. For example, the local link controller can directly control the frequency of a clock supplied to the link. In addition, in various implementations the link controller controls the power supplied to the link. For example, the link controller can control the voltage supplied to the link.Type: GrantFiled: August 31, 2018Date of Patent: July 30, 2024Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Alexander J. Branover, Thomas James Gibney, Michael J. Tresidder, Nat Barbiero
-
Patent number: 12045169Abstract: Techniques for identifying a hardware configuration for operation are disclosed. The techniques include applying feature measurements to a trained model; obtaining output values from the trained model, the output values corresponding to different hardware configurations; and operating according to the output values, wherein the output values include one of a certainty score, a ranking, or a regression value.Type: GrantFiled: December 23, 2020Date of Patent: July 23, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Furkan Eris, Paul S. Keltcher, John Kalamatianos, Mayank Chhablani, Alok Garg
-
Patent number: 12045362Abstract: A computer vision processor in an image cluster defines a fenced memory region (FMR) that controls access to image data stored in a first portion of a trusted memory region (TMR). The computer vision processor receives FMR requests from an application implemented in a processing cluster. The FMR requests are to access the image data in the first portion of the TMR. The computer vision processor selectively allows the requesting application to access the image data. In some cases, the computer vision processor acquires the image data and stores the image data in the first portion of the TMR, such as buffers in the TMR. A data fabric selectively permits the image processing application to access the data stored in the TMR based on whether the image cluster has opened or closed the FMR for the portion of the TMR.Type: GrantFiled: August 17, 2022Date of Patent: July 23, 2024Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Benjamin Koon Pan Chan, William Lloyd Atkinson, Tung Chuen Kwong, Guhan Krishnan