Patents Assigned to Advanced Micro Devices, Inc.
  • Patent number: 12169896
    Abstract: Systems, apparatuses, and methods for preemptively reserving buffer space for primitives and positions in a graphics pipeline are disclosed. A system includes a graphics pipeline frontend with any number of geometry engines coupled to corresponding shader engines. Each geometry engine launches shader wavefronts to execute on a corresponding shader engine. The geometry engine preemptively reserves buffer space for each wavefront prior to the wavefront being launched on the shader engine. When the shader engine executes a wavefront, the shader engine exports primitive and position data to the reserved buffer space. Multiple scan converters consume the primitive and position data, with each scan converter consuming the data that falls within its screen coverage. After consuming the primitive and position data, the scan converters mark the buffer space as freed so that the geometry engine can then allocate the freed buffer space to subsequent shader wavefronts.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: December 17, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Todd Martin, Tad Robert Litwiller, Nishank Pathak, Randy Wayne Ramsey, Michael J. Mantor, Christopher J. Brennan, Mark M. Leather, Ryan James Cash
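The reservation scheme above amounts to a counting allocator sitting between the geometry engine and the scan converters. Below is a minimal C++ sketch of that shape, not AMD's implementation; the slot-based accounting and the names (PositionPrimBuffer, TryLaunchWavefront) are illustrative assumptions.

```cpp
// Minimal sketch (not the patented implementation): the geometry engine
// reserves parameter-buffer space for a wavefront before launching it, and
// scan converters release that space once they have consumed the primitive
// and position data. Slot sizes and names are illustrative assumptions.
#include <cstdint>

class PositionPrimBuffer {
public:
    explicit PositionPrimBuffer(uint32_t total_slots) : free_slots_(total_slots) {}

    // Called by the geometry engine *before* a wavefront is launched.
    // Returns false if there is not enough space, in which case the launch stalls.
    bool Reserve(uint32_t slots_needed) {
        if (slots_needed > free_slots_) return false;
        free_slots_ -= slots_needed;
        return true;
    }

    // Called by a scan converter after it has consumed the primitive/position
    // data for its share of screen coverage; the space becomes reusable.
    void Release(uint32_t slots) { free_slots_ += slots; }

    uint32_t FreeSlots() const { return free_slots_; }

private:
    uint32_t free_slots_;
};

// The geometry engine only launches a wavefront once its export space is guaranteed.
bool TryLaunchWavefront(PositionPrimBuffer& buf, uint32_t export_slots) {
    if (!buf.Reserve(export_slots)) return false;  // back-pressure: retry later
    // launch wavefront on the shader engine; it exports into the reserved slots
    return true;
}
```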
  • Patent number: 12169430
    Abstract: Systems and methods are disclosed for reducing power consumed by capturing data from an I/O device. Techniques disclosed include receiving descriptors, by a controller of an I/O host of a system, including information associated with respective data chunks to be captured from an I/O device buffer of the I/O device. Techniques disclosed further include capturing, based on the descriptors, the data chunks. The capturing comprises pulling the data chunks from the I/O device buffer at a pulling rate, where the data chunks are transferred to a local buffer of the I/O host, and pushing segments of the pulled data chunks from the local buffer, where each segment is transferred to a data buffer of the system after a respective target time that precedes a time at which the data chunks in the segment are to be processed by an application executing on the system.
    Type: Grant
    Filed: May 25, 2022
    Date of Patent: December 17, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Raul Gutierrez
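The abstract above describes a pull-then-deferred-push capture flow. The following C++ sketch models it under stated assumptions: the Descriptor and CaptureController types, the byte-rate pacing, and the target-time check are invented for illustration, not taken from the patent.

```cpp
// Minimal sketch, assuming hypothetical types: the controller pulls chunks
// from the device buffer at a configured rate into a small local buffer, then
// defers pushing each segment to the system's data buffer until its target
// time, which still precedes the time the application will process it.
#include <cstdint>
#include <vector>

struct Descriptor {
    uint32_t chunk_bytes;     // size of the chunk to capture
    double   target_time_s;   // earliest push time; precedes the processing time
};

struct Segment {
    std::vector<uint8_t> bytes;
    double target_time_s;
};

class CaptureController {
public:
    explicit CaptureController(double pull_rate_bytes_per_s)
        : pull_rate_(pull_rate_bytes_per_s) {}

    // Pull phase: pace pulls so the device buffer drains at pull_rate_,
    // staging data in the local buffer of the I/O host.
    void PullChunk(const Descriptor& d, const uint8_t* device_buf, double now_s) {
        if (now_s < next_pull_time_) return;              // not yet time to pull
        next_pull_time_ = now_s + d.chunk_bytes / pull_rate_;
        local_.push_back({std::vector<uint8_t>(device_buf, device_buf + d.chunk_bytes),
                          d.target_time_s});
    }

    // Push phase: only forward segments whose target time has arrived, letting
    // the rest of the data path stay in a low-power state until then.
    void PushDue(std::vector<uint8_t>& system_buf, double now_s) {
        for (auto& seg : local_) {
            if (!seg.bytes.empty() && now_s >= seg.target_time_s) {
                system_buf.insert(system_buf.end(), seg.bytes.begin(), seg.bytes.end());
                seg.bytes.clear();  // mark as pushed
            }
        }
    }

private:
    double pull_rate_;
    double next_pull_time_ = 0.0;
    std::vector<Segment> local_;
};
```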
  • Patent number: 12169782
    Abstract: A processor determines losses of samples within an input volume that is provided to a neural network during a first epoch, groups the samples into subsets based on losses, and assigns the subsets to operands in the neural network that represent the samples at different precisions. Each subset is associated with a different precision. The processor then processes the subsets in the neural network at the different precisions during the first epoch. In some cases, the samples in the subsets are used in a forward pass and a backward pass through the neural network. A memory is configured to store information representing the samples in the subsets at the different precisions. In some cases, the processor stores information representing model parameters of the neural network in the memory at the different precisions of the subsets of the corresponding samples.
    Type: Grant
    Filed: May 29, 2019
    Date of Patent: December 17, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Shomit N. Das, Abhinav Vishnu
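A rough C++ sketch of the loss-based grouping step described above. The thresholds, the three-way precision split, and the assumption that lower-loss samples can use lower precision are illustrative choices, not details from the patent.

```cpp
// Illustrative sketch under assumed thresholds: after an epoch's losses are
// known, samples are bucketed by loss and each bucket is tagged with the
// numeric precision at which its samples (and corresponding operands) will be
// processed. The mapping of low loss to low precision is an assumption.
#include <cstddef>
#include <vector>

enum class Precision { FP8, FP16, FP32 };

struct SampleGroup {
    std::vector<size_t> indices;  // samples in this subset
    Precision precision;          // precision used for these operands
};

std::vector<SampleGroup> GroupByLoss(const std::vector<float>& losses,
                                     float low_thresh, float high_thresh) {
    SampleGroup low{{}, Precision::FP8};    // assumed: well-learned samples
    SampleGroup mid{{}, Precision::FP16};
    SampleGroup high{{}, Precision::FP32};  // assumed: hard samples keep full precision
    for (size_t i = 0; i < losses.size(); ++i) {
        if (losses[i] < low_thresh)       low.indices.push_back(i);
        else if (losses[i] < high_thresh) mid.indices.push_back(i);
        else                              high.indices.push_back(i);
    }
    return {low, mid, high};
}
```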
  • Patent number: 12169703
    Abstract: Systems, apparatuses, and methods for implementing graphics pipeline optimizations are disclosed. A user interface (UI) is generated to allow a user to analyze shaders and determine resource utilization on any of multiple different target graphic devices. The UI allows the user to manipulate the state associated with the target graphics device for a given graphics pipeline. After being edited by the user, the state of the graphics pipeline is converted into a textual representation and input into a meta-app. The meta-app creates an application programming interface (API) construct from the shader source code and textual representation of the state which is compiled by a driver component into machine-level instructions. Also, resource usage statistics are generated for a simulated run of the graphics pipeline on the target graphics device. Then, the machine-level instructions and resource usage statistics are displayed in the UI for the user to analyze.
    Type: Grant
    Filed: March 18, 2021
    Date of Patent: December 17, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amit Ben-Moshe, Brian Kenneth Bennett, Qun Lin, David Ronald Oldcorn
  • Patent number: 12170263
    Abstract: Various multi-die arrangements and methods of manufacturing the same are disclosed. In some embodiments, a method of manufacture includes a face-to-face process in which a first GPU chiplet and a second GPU chiplet are bonded to a temporary carrier wafer. A face surface of an active bridge chiplet is bonded to a face surface of the first and second GPU chiplets before mounting the GPU chiplets to a carrier substrate. In other embodiments, a method of manufacture includes a face-to-back process in which a face surface of an active bridge chiplet is bonded to a back surface of the first and second GPU chiplets.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: December 17, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Skyler J. Saleh, Ruijin Wu, Milind S. Bhagavat, Rahul Agarwal
  • Patent number: 12170801
    Abstract: In a cloud gaming system or other remote video streaming system, a client device and a server coordinate to introduce an adjustable delay in the frame start timing in the frame rendering pipeline at the server to reduce vertical synchronization (VSYNC) presentation latency, and thus reduce overall frame latency. In implementations, the coordination between the client device and the server includes the client device observing the current VSYNC presentation latencies in recently processed video frames and reporting this observed VSYNC presentation latency to the server. The server uses this feedback to determine a frame start delay that is then used to introduce a frame start shift for an upcoming frame and subsequent frames, thereby shifting the server rendering and encoding pipeline back so that the resulting video frames are made available to present at the client device closer to their respective VSYNC signal assertions.
    Type: Grant
    Filed: December 9, 2022
    Date of Patent: December 17, 2024
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Yuping Shen, Min Zhang
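The feedback loop above reduces to two small computations: the client averaging its recent VSYNC waits, and the server turning the report into a frame-start shift. The sketch below illustrates that loop with invented function names; the safety margin is an assumption.

```cpp
// Minimal sketch with invented names: the client reports how long decoded
// frames sat waiting for the next VSYNC, and the server converts that
// observation into a frame-start delay that shifts its render/encode pipeline
// so frames arrive just before the VSYNC at which they are presented.
#include <algorithm>
#include <vector>

// Client side: average the observed wait between "frame ready" and the VSYNC
// at which the frame was actually presented, over recently processed frames.
double ObservedVsyncLatencyMs(const std::vector<double>& recent_waits_ms) {
    if (recent_waits_ms.empty()) return 0.0;
    double sum = 0.0;
    for (double w : recent_waits_ms) sum += w;
    return sum / recent_waits_ms.size();
}

// Server side: push the start of upcoming frames later by most of the reported
// slack, keeping a safety margin so frames do not miss their VSYNC entirely.
double FrameStartDelayMs(double reported_latency_ms, double safety_margin_ms) {
    return std::max(0.0, reported_latency_ms - safety_margin_ms);
}
```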
  • Publication number: 20240412445
    Abstract: A technique for performing ray tracing operations is provided. The technique includes traversing through a bounding volume hierarchy for a ray to arrive at a well-fit bounding volume that is associated with a first node, wherein the first node is one of a traversal node or a procedural node, and wherein the well-fit bounding volume comprises geometry other than a single axis-aligned bounding box for the first node; evaluating the ray for intersection with the well-fit bounding volume; determining whether to execute a first shader program associated with the first node based on the evaluating, wherein the first shader program comprises a traversal shader program or a procedural shader program; and executing or not executing the first shader program based on the determining.
    Type: Application
    Filed: June 9, 2023
    Publication date: December 12, 2024
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: David William John Pankratz, David Ronald Oldcorn
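A hedged C++ sketch of the gating idea above: test the ray against a tighter, "well-fit" proxy volume before paying for the traversal or procedural shader. Modeling the well-fit volume as the intersection of several boxes is an assumption made for illustration, not the published geometry.

```cpp
// Illustrative sketch: instead of a single axis-aligned box, the node carries
// a better-fitting proxy volume; the ray is tested against it first, and the
// traversal/procedural shader program is launched only when that test passes.
#include <algorithm>
#include <array>
#include <utility>

struct Ray { std::array<float, 3> origin, dir_inv; float t_min, t_max; };

struct Aabb { std::array<float, 3> lo, hi; };

// The "well-fit" volume is approximated here (an assumption) as the convex
// intersection of several AABBs; the slab intervals are accumulated jointly.
bool HitsWellFitVolume(const Ray& r, const Aabb* boxes, int count) {
    float tmin = r.t_min, tmax = r.t_max;
    for (int b = 0; b < count; ++b) {
        for (int a = 0; a < 3; ++a) {
            float t0 = (boxes[b].lo[a] - r.origin[a]) * r.dir_inv[a];
            float t1 = (boxes[b].hi[a] - r.origin[a]) * r.dir_inv[a];
            if (t0 > t1) std::swap(t0, t1);
            tmin = std::max(tmin, t0);
            tmax = std::min(tmax, t1);
        }
    }
    return tmin <= tmax;
}

// The expensive shader program (traversal or procedural) runs only on a hit.
template <typename ShaderFn>
void VisitNode(const Ray& r, const Aabb* well_fit, int n, ShaderFn run_shader) {
    if (HitsWellFitVolume(r, well_fit, n)) run_shader();
}
```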
  • Publication number: 20240412446
    Abstract: A technique for performing ray tracing operations is provided. The technique includes detecting intersection of a ray with a split bounding volume of an instance of a bounding volume hierarchy; determining whether the split bounding volume meets an instance traversal limiting criterion; and continuing BVH traversal based on the determining.
    Type: Application
    Filed: June 9, 2023
    Publication date: December 12, 2024
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: David William John Pankratz, Christiaan Paul Gribble
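A minimal sketch of one way the "instance traversal limiting criterion" could look; the per-ray entry budget used here is an assumed criterion, chosen only to show where the check sits in traversal.

```cpp
// Illustrative sketch with an assumed criterion: when the ray hits one of an
// instance's split bounding volumes, an instance-traversal budget is checked
// before descending into that instance's bottom-level BVH again, capping the
// redundant re-traversal that split volumes can otherwise cause.
#include <cstdint>

struct TraversalState {
    uint32_t instance_entries = 0;   // times this ray has entered this instance
    uint32_t max_instance_entries;   // limiting criterion (assumed form)
};

// Returns true if BVH traversal should continue into the instance.
bool ContinueIntoInstance(TraversalState& st) {
    if (st.instance_entries >= st.max_instance_entries)
        return false;                // criterion not met: skip this entry
    ++st.instance_entries;
    return true;
}
```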
  • Publication number: 20240411462
    Abstract: Local and dynamic triggering of operations executed by a processing-in-memory component is described. In accordance with the described techniques, a processing-in-memory component receives a command from a host for execution by the processing-in-memory component. The processing-in-memory component references a tracking table that includes at least one entry associated with an operation performed as part of executing the command and identifies at least one additional command to be triggered locally after executing the command received from the host. Responsive to identifying that conditions associated with the at least one additional command are satisfied, the processing-in-memory component executes the at least one additional command, independent of instructions from the host.
    Type: Application
    Filed: June 8, 2023
    Publication date: December 12, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Mohamed Assem Abd ElMohsen Ibrahim, Shaizeen Dilawarhusen Aga, Mahzabeen Islam
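The local, dynamic triggering described above can be pictured as a small table consulted after each host command. The C++ sketch below uses invented structures (TrackedEntry, PimSequencer) and a simple counter-threshold condition as the assumed trigger.

```cpp
// Illustrative sketch with invented structures: after executing a host-issued
// PIM command, the PIM component consults a local tracking table; if the
// entry's condition is satisfied (here, a simple counter reaching a threshold),
// the chained command runs locally, independent of further host instructions.
#include <cstdint>
#include <functional>
#include <unordered_map>

struct TrackedEntry {
    uint32_t trigger_count = 0;       // how many times the host operation has run
    uint32_t threshold;               // condition: fire after this many runs (assumed)
    std::function<void()> local_cmd;  // command executed locally when triggered
};

class PimSequencer {
public:
    void Register(uint32_t host_opcode, TrackedEntry entry) {
        table_[host_opcode] = std::move(entry);
    }

    // Execute the host command, then check the tracking table for a locally
    // triggered follow-up command.
    void ExecuteHostCommand(uint32_t opcode, const std::function<void()>& host_cmd) {
        host_cmd();  // the operation the host asked for
        auto it = table_.find(opcode);
        if (it == table_.end()) return;
        TrackedEntry& e = it->second;
        if (++e.trigger_count >= e.threshold) {
            e.local_cmd();       // dynamic, host-independent trigger
            e.trigger_count = 0;
        }
    }

private:
    std::unordered_map<uint32_t, TrackedEntry> table_;
};
```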
  • Publication number: 20240411692
    Abstract: Cache replacement policies are described. In accordance with the described techniques, a request for data is received and a cache replacement policy controls how a controller responds to the request. The cache replacement policy assigns each cacheline a priority value, which indicates whether the cacheline should be preserved relative to other cachelines, in response to the request being a cache miss that necessitates eviction of at least one cacheline. The cache replacement policy decrements priority values until at least one cacheline achieves a minimum priority value, at which point a cacheline is evicted. The cache replacement policy designates certain cachelines as protected, either via a separate protected indicator or via the cacheline's priority value, which causes unprotected cachelines to be selected for eviction while favoring preservation of protected cachelines in the cache.
    Type: Application
    Filed: June 9, 2023
    Publication date: December 12, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Gabriel Hsiuwei Loh, Joseph Lee Greathouse, William Louie Walker, Paul James Moyer
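A compact sketch of the victim-selection loop implied by the abstract: decrement priorities until an unprotected line reaches the minimum, and fall back to a protected line only when nothing else is available. Field widths and the insertion priority are assumptions.

```cpp
// Illustrative sketch of the described policy shape (priorities, decrement-
// until-minimum, protected lines); widths and insertion value are assumptions.
#include <cstddef>
#include <cstdint>
#include <vector>

struct CacheLine {
    bool    valid = false;
    bool    prot  = false;   // protected lines are preserved when possible
    uint8_t priority = 0;    // higher value = keep longer
};

// New lines enter with a mid-range priority (assumed insertion value).
void InsertLine(CacheLine& line, bool is_protected) {
    line.valid = true;
    line.prot = is_protected;
    line.priority = 3;
}

// Pick a victim within one set: decrement priorities until some unprotected
// line reaches the minimum; evict a protected line only if every line in the
// set is protected.
int SelectVictim(std::vector<CacheLine>& set) {
    for (;;) {
        for (size_t i = 0; i < set.size(); ++i)
            if (!set[i].valid) return static_cast<int>(i);        // free slot
        for (size_t i = 0; i < set.size(); ++i)
            if (set[i].priority == 0 && !set[i].prot) return static_cast<int>(i);
        bool any_unprotected = false;
        for (const auto& l : set) any_unprotected |= !l.prot;
        if (!any_unprotected)
            for (size_t i = 0; i < set.size(); ++i)
                if (set[i].priority == 0) return static_cast<int>(i);
        for (auto& l : set)
            if (l.priority > 0) --l.priority;                     // age everyone
    }
}
```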
  • Patent number: 12164924
    Abstract: A method includes, in response to receiving an instruction to perform a first operation on first data stored in a memory device, obtaining first compression metadata from the memory device based on an address for the first data, and reducing a number of operations in a set of operations based on the first operation and one or more matching addresses, the one or more matching addresses corresponding to second compression metadata matching the first compression metadata.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Matthew Tomei, Shomit Das
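One plausible reading of the abstract, sketched below: if the compression metadata of the source and destination of a block copy already match (e.g., both are the compressed all-zeros state), the data movement can be dropped and only metadata is touched. The metadata encoding and the copy example are assumptions.

```cpp
// Illustrative sketch under one interpretation: when the compression metadata
// for the source of an operation (here, a block copy) matches the metadata
// already recorded for the destination, the actual data movement is eliminated
// from the set of operations. The metadata encoding is an assumption.
#include <cstdint>
#include <unordered_map>

enum class CompState : uint8_t { Uncompressed, AllZeros };

using Metadata = std::unordered_map<uint64_t, CompState>;  // block addr -> state

// Returns true if the copy still needs to move data, false if it was reduced
// to a metadata-only update (or skipped entirely) from matching metadata.
bool MaybeCopyBlock(Metadata& md, uint64_t src_addr, uint64_t dst_addr) {
    CompState src = md.count(src_addr) ? md[src_addr] : CompState::Uncompressed;
    CompState dst = md.count(dst_addr) ? md[dst_addr] : CompState::Uncompressed;
    if (src != CompState::Uncompressed && src == dst)
        return false;                 // metadata matches: drop the data movement
    md[dst_addr] = src;               // compressed blocks need only this update
    return src == CompState::Uncompressed;  // only uncompressed data must move
}
```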
  • Patent number: 12164923
    Abstract: Methods and systems are disclosed for processing a vector by a vector processor. Techniques disclosed include receiving predicated instructions by a scheduler, each of which is associated with an opcode, a vector of elements, and a predicate. The techniques further include executing the predicated instructions. Executing a predicated instruction includes compressing, based on an index derived from the predicate of the instruction, elements in the vector of the instruction so that the elements are contiguously mapped; then, after the mapped elements are processed, decompressing the processed mapped elements by reverse mapping them based on the index.
    Type: Grant
    Filed: June 29, 2022
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Elliott David Binder, Onur Kayiran, Masab Ahmad
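The compress/process/decompress pattern above, written out as scalar C++ standing in for the vector unit: the index is derived from the predicate, active elements are packed contiguously, and results are scattered back through the same index.

```cpp
// Scalar sketch (illustrative, not the hardware path): active lanes (predicate
// true) are packed contiguously using an index derived from the predicate, the
// packed lanes are processed, and results are reverse-mapped to their original
// positions via the same index.
#include <cstddef>
#include <vector>

template <typename T, typename Fn>
void ExecutePredicated(std::vector<T>& vec, const std::vector<bool>& pred, Fn op) {
    // Index derived from the predicate: original position of each active lane.
    std::vector<size_t> index;
    for (size_t i = 0; i < vec.size(); ++i)
        if (pred[i]) index.push_back(i);

    // Compress: map the active elements contiguously.
    std::vector<T> packed(index.size());
    for (size_t j = 0; j < index.size(); ++j) packed[j] = vec[index[j]];

    // Process only the packed (useful) lanes.
    for (T& x : packed) x = op(x);

    // Decompress: reverse-map the processed elements back via the index.
    for (size_t j = 0; j < index.size(); ++j) vec[index[j]] = packed[j];
}
```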
  • Patent number: 12164365
    Abstract: An apparatus and method for efficiently managing power consumption among multiple, replicated functional blocks of an integrated circuit. An integrated circuit includes multiple, replicated functional blocks that use separate power domains. Data of a given type is stored in an interleaved manner among at least two of the multiple functional blocks. In one implementation, a prior static allocation determines that only a subset of the functional blocks store the data of the given type. In another implementation, each of the functional blocks stores the data of the given type, and when an idle state has occurred, data of the given type is moved between the multiple functional blocks until one or more functional blocks no longer store data of the given type. When a transition to the idle state has occurred, the functional blocks that do not store the data of the given type are transitioned to a sleep state.
    Type: Grant
    Filed: December 27, 2022
    Date of Patent: December 10, 2024
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Gia Tung Phan, Ashish Jain, Shang Yang
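A small sketch of the idle-time consolidation variant described above: drain the data of the given type out of some replicated blocks into the rest, then mark the emptied blocks as eligible for sleep. The round-robin destination choice is an assumption.

```cpp
// Illustrative sketch: data of the given type starts interleaved across
// replicated functional blocks; on an idle transition it is drained into a
// subset of blocks, and any block left holding none of that data can be
// transitioned to a sleep state. Block contents are modeled as byte vectors.
#include <cstddef>
#include <cstdint>
#include <vector>

struct FunctionalBlock {
    std::vector<uint8_t> retained_data;  // data of the given type held here
    bool asleep = false;
};

void ConsolidateAndSleep(std::vector<FunctionalBlock>& blocks, size_t keep_awake) {
    if (keep_awake == 0 || keep_awake >= blocks.size()) return;  // nothing to do
    // Move data from the tail blocks into the first `keep_awake` blocks.
    for (size_t src = keep_awake; src < blocks.size(); ++src) {
        size_t dst = src % keep_awake;   // round-robin destination (assumption)
        auto& d = blocks[src].retained_data;
        blocks[dst].retained_data.insert(blocks[dst].retained_data.end(),
                                         d.begin(), d.end());
        d.clear();
        blocks[src].asleep = true;       // now safe to power-gate this block
    }
}
```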
  • Patent number: 12164353
    Abstract: A system and method for determining power-performance state transition thresholds in a computing system. A processor comprises several functional blocks and a power manager. Each of the functional blocks produces data corresponding to an activity level associated with the respective functional block. The power manager determines activity levels of the functional blocks and compares the activity level of a given functional block to a threshold to determine if a power-performance state (P-state) transition is indicated. The threshold is determined based in part on a current P-state of the given functional block. When the current P-state of the given functional block is relatively high, the threshold activity level to transition to a higher P-state is higher than it would be if the current P-state were relatively low. The power manager is further configured to determine the thresholds based in part on one or more of a type of circuit being monitored and a type of workload being executed.
    Type: Grant
    Filed: September 29, 2022
    Date of Patent: December 10, 2024
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Ashish Jain, Shang Yang
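The threshold behavior described above can be captured in a small table keyed by the current P-state. The numbers, the four-state count, and the P0-is-highest convention below are assumptions, and the sketch omits the circuit-type and workload-type factors the abstract also mentions.

```cpp
// Illustrative sketch with assumed values: the activity threshold for
// promoting a block to a higher P-state rises with the block's current
// P-state, so the final steps up require stronger evidence of demand.
#include <array>

constexpr int kNumPstates = 4;  // P0 (highest performance) .. P3 (lowest), assumed

// Per-P-state thresholds as fractions of peak activity (assumed numbers); the
// higher the current P-state, the higher the bar for going higher still.
constexpr std::array<float, kNumPstates> kUpThreshold   = {1.00f, 0.90f, 0.75f, 0.60f};
constexpr std::array<float, kNumPstates> kDownThreshold = {0.55f, 0.40f, 0.25f, 0.00f};

// Returns the new P-state for a block given its measured activity level.
int NextPstate(int current, float activity) {
    if (current > 0 && activity >= kUpThreshold[current])
        return current - 1;                       // promote toward P0
    if (current < kNumPstates - 1 && activity < kDownThreshold[current])
        return current + 1;                       // demote toward the lowest state
    return current;
}
```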
  • Patent number: 12165252
    Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.
    Type: Grant
    Filed: October 3, 2023
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak
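A sketch of the completion-coordination shape described above, with threads standing in for chiplets and atomic flags standing in for the cross-chiplet "done" notifications; the round-robin workgroup split and the choice of chiplet 0 as the one that notifies the client are assumptions.

```cpp
// Illustrative sketch: workgroups of one kernel dispatch packet are divided
// among chiplets; each chiplet raises a per-chiplet "done" flag visible to the
// others when its share finishes, and the packet completes (client notified,
// next packet started) once every flag is set. Threads model the chiplets.
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    const int kChiplets = 4;
    const int kWorkgroups = 64;
    std::vector<std::atomic<bool>> done(kChiplets);
    for (auto& d : done) d.store(false);
    std::atomic<int> packets_completed{0};

    auto chiplet = [&](int id) {
        // Execute the workgroups assigned to this chiplet (round-robin split).
        for (int wg = id; wg < kWorkgroups; wg += kChiplets) {
            /* run workgroup wg */
        }
        done[id].store(true, std::memory_order_release);  // notify other chiplets

        // Every chiplet observes completion of all shares; one (id 0 here)
        // notifies the client and advances to the next kernel dispatch packet.
        for (int c = 0; c < kChiplets; ++c)
            while (!done[c].load(std::memory_order_acquire)) std::this_thread::yield();
        if (id == 0) {
            packets_completed.fetch_add(1);
            std::puts("kernel dispatch packet complete; client notified");
        }
    };

    std::vector<std::thread> threads;
    for (int id = 0; id < kChiplets; ++id) threads.emplace_back(chiplet, id);
    for (auto& t : threads) t.join();
    return packets_completed.load() == 1 ? 0 : 1;
}
```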
  • Patent number: 12165016
    Abstract: Techniques are disclosed for communicating between a machine learning accelerator and one or more processing cores. The techniques include obtaining data at the machine learning accelerator via an input/output die; processing the data at the machine learning accelerator to generate machine learning processing results; and exporting the machine learning processing results via the input/output die, wherein the input/output die is coupled to one or more processor chiplets via one or more processor ports, and wherein the input/output die is coupled to the machine learning accelerator via an accelerator port.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Maxim V. Kazakov
  • Patent number: 12165700
    Abstract: A technique reduces power consumption of a bit cell in a memory and provides write assistance to the bit cell. When the bit cell is active, a power-saving write-assist circuit coupled to the bit cell is selectively sized according to a type of memory access. When the bit cell is inactive, the virtual power supply node floats to a predetermined voltage between a first voltage on a first power supply node coupled to the bit cell and a second voltage on a second power supply node coupled to the bit cell. A method for controlling power consumption of a bit cell and assisting a write to the bit cell includes providing a reference voltage to a virtual power supply node coupled to the bit cell. The reference voltage is provided based on an operational state of the bit cell and a type of memory access to the bit cell.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Russell J. Schreiber, John J. Wuu, Keith A. Kasprak
  • Patent number: 12165981
    Abstract: A semiconductor package includes a package substrate having a first surface and an opposing second surface, and further includes an integrated circuit (IC) die disposed at the second surface and having a third surface facing the second surface and an opposing fourth surface. The IC die has a first region comprising one or more metal layers and circuit components for one or more functions of the IC die and a second region offset from the first region in a direction parallel with the third and fourth surfaces. The semiconductor package further includes a voltage regulator disposed at the fourth surface in the second region and having an input configured to receive a supply voltage and an output configured to provide a regulated voltage, and also includes a conductive path coupling the output of the voltage regulator to a voltage input of circuitry of the IC die.
    Type: Grant
    Filed: December 20, 2021
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Gabriel H Loh, Raja Swaminathan, Rahul Agarwal, Brett P. Wilkerson
  • Patent number: 12167102
    Abstract: Systems, apparatuses, and methods for processing multi-cast messages are disclosed. A system includes at least one or more processing units, one or more memory controllers, and a communication fabric coupled to the processing unit(s) and the memory controller(s). The communication fabric includes a plurality of crossbars which connect various agents within the system. When a multi-cast message is received by a crossbar, the crossbar extracts a message type indicator and a recipient type indicator from the message. The crossbar uses the message type indicator to determine which set of masks to lookup using the recipient type indicator. Then, the crossbar determines which one or more masks to extract from the selected set of masks based on values of the recipient type indicator. The crossbar combines the one or more masks with a multi-cast route to create a port vector for determining on which ports to forward the multi-cast message.
    Type: Grant
    Filed: September 21, 2018
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Vydhyanathan Kalyanasundharam, Joe G. Cruz, Eric Christopher Morton, Alan Dodson Smith
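The mask selection and port-vector construction above reduce to a couple of table lookups and bitwise operations. The sketch below assumes 16-bit port masks, four message types, and eight recipient types purely for illustration.

```cpp
// Illustrative sketch with assumed widths: the message type picks which set of
// per-recipient-type masks to consult, the recipient type indicator selects
// one or more masks from that set, and the OR of those masks ANDed with the
// multi-cast route gives the port vector used to forward the message.
#include <cstdint>

constexpr int kNumMessageTypes   = 4;   // assumed
constexpr int kNumRecipientTypes = 8;   // assumed

// mask_sets[message_type][recipient_type] -> 16-bit port mask (assumed width)
using MaskSets = uint16_t[kNumMessageTypes][kNumRecipientTypes];

uint16_t BuildPortVector(const MaskSets& mask_sets,
                         uint8_t message_type,
                         uint8_t recipient_type_bits,  // one bit per recipient type
                         uint16_t multicast_route) {
    uint16_t combined = 0;
    for (int t = 0; t < kNumRecipientTypes; ++t)
        if (recipient_type_bits & (1u << t))
            combined |= mask_sets[message_type][t];   // extract each selected mask
    return combined & multicast_route;                // ports to forward on
}
```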
  • Publication number: 20240403065
    Abstract: The disclosed device includes multiple special purpose processors that are configured to perform, in parallel, a power on transition sequence for the device, which can involve restoring a data state of components of the device using data stored in local storages of the special purpose processors. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Application
    Filed: May 31, 2024
    Publication date: December 5, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Sriram Sambamurthy, Indrani Paul, Kevin M. Brandl, James R. Magro, Zhao Hui Yu, Oswin E. Housty
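A minimal sketch of the parallel power-on restore idea above: each special-purpose processor replays saved state for its own components from its local storage, concurrently with the others. Threads model the processors, and the register-blob form of the saved state is an assumption.

```cpp
// Illustrative sketch: during the power-on transition, several special-purpose
// processors (modeled here as threads) restore the data state of their own
// components in parallel from data held in their local storages.
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

struct LocalStorage {
    std::vector<uint32_t> saved_regs;  // component state captured at power-off (assumed form)
};

struct Component {
    std::vector<uint32_t> regs;
};

void RestoreComponent(const LocalStorage& store, Component& comp) {
    comp.regs = store.saved_regs;      // write saved state back into the block
}

void ParallelPowerOn(const std::vector<LocalStorage>& stores,
                     std::vector<Component>& comps) {
    std::vector<std::thread> workers;
    for (size_t i = 0; i < comps.size(); ++i)
        workers.emplace_back(RestoreComponent, std::cref(stores[i]), std::ref(comps[i]));
    for (auto& w : workers) w.join();  // power-on transition sequence complete
}
```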