Patents Assigned to Advanced Micro Devices, Inc.

DYNAMIC NODE TRAVERSAL ORDER FOR RAY TRACING

Publication number: 20240112392

Abstract: Devices and methods for node traversal for ray tracing are provided, which comprise casting a first ray in a space comprising objects represented by geometric shapes, traversing, for the first ray, at least one first node of an accelerated hierarchy structure representing an approximate volume of a group of the geometric shapes and a second node representing a volume of one of the geometric shapes, casting a second ray in the space, selecting, for the second ray, a starting node of traversal based on locations of intersection of the first ray and the second ray and an identifier which identifies one or more nodes intersected by the first ray and traversing, for the second ray, the accelerated hierarchy structure beginning at the starting node of traversal.

Type: Application

Filed: September 29, 2022

Publication date: April 4, 2024

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: David William John Pankratz, Konstantin I. Shkurko
SPECULATIVE DRAM REQUEST ENABLING AND DISABLING

Publication number: 20240111420

Abstract: Methods, devices, and systems for retrieving information based on cache miss prediction. It is predicted, based on a history of cache misses at a private cache, that a cache lookup for the information will miss a shared victim cache. A speculative memory request is enabled based on the prediction that the cache lookup for the information will miss the shared victim cache. The information is fetched based on the enabled speculative memory request.
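
As a rough illustration of the idea in this abstract, the sketch below enables a speculative memory request when recent private-cache lookups have mostly missed. The saturating counter, its threshold, and the names `MissPredictor` and `fetch` are assumptions made for this example and are not taken from the application.

```python
class MissPredictor:
    """Hypothetical saturating counter trained on private-cache lookup outcomes."""

    def __init__(self, threshold=2, max_count=3):
        self.count = 0
        self.threshold = threshold
        self.max_count = max_count

    def record_private_lookup(self, missed):
        # Count up on a private-cache miss, down on a hit, within [0, max_count].
        if missed:
            self.count = min(self.count + 1, self.max_count)
        else:
            self.count = max(self.count - 1, 0)

    def speculative_request_enabled(self):
        # Predict a shared-victim-cache miss once enough recent misses were seen.
        return self.count >= self.threshold


def fetch(address, predictor, victim_cache, dram):
    """Return (value, source); dicts stand in for the victim cache and DRAM."""
    speculate = predictor.speculative_request_enabled()
    # The speculative request is issued alongside the victim-cache lookup.
    speculative_value = dram[address] if speculate else None
    if address in victim_cache:
        return victim_cache[address], "victim_cache"
    if speculate:
        return speculative_value, "speculative_dram"
    # No speculation: DRAM is accessed only after the lookup has missed.
    return dram[address], "dram_after_miss"
```

In this toy model the benefit of speculation is only visible in the returned source label; in hardware the point would be to overlap the DRAM access with the victim-cache lookup.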

Type: Application

Filed: September 29, 2022

Publication date: April 4, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Jagadish B. Kotra, John Kalamatianos
Runtime Flushing to Persistency in Heterogenous Systems

Publication number: 20240111682

Abstract: Runtime flushing to persistency in heterogenous systems is described. In accordance with the described techniques, a system may include a persistent memory in electronic communication with at least one cache and a controller configured to command the at least one cache to flush dirty data to the persistent memory in response to a dirtiness of the at least one cache reaching a cache dirtiness threshold.
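
The threshold-triggered flush described in this abstract can be sketched as follows. This is a minimal model under stated assumptions: dirtiness is measured as the fraction of dirty lines, persistent memory is a plain dictionary, and the names `Cache`, `FlushController`, and `poll` are hypothetical.

```python
class Cache:
    """Toy cache that tracks only its dirty lines (line index -> data)."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.dirty = {}

    def write(self, line, data):
        self.dirty[line] = data

    def dirtiness(self):
        # Assumed metric: fraction of lines currently dirty.
        return len(self.dirty) / self.num_lines

    def flush_to(self, persistent_memory):
        # Write all dirty data back, then mark the cache clean.
        persistent_memory.update(self.dirty)
        self.dirty.clear()


class FlushController:
    """Commands a cache to flush once its dirtiness reaches the threshold."""

    def __init__(self, caches, persistent_memory, threshold):
        self.caches = caches
        self.persistent_memory = persistent_memory
        self.threshold = threshold

    def poll(self):
        flushed = 0
        for cache in self.caches:
            if cache.dirtiness() >= self.threshold:
                cache.flush_to(self.persistent_memory)
                flushed += 1
        return flushed
```

Calling `poll()` periodically stands in for whatever runtime trigger the controller would actually use.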

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Applicant: Advanced Micro Devices, Inc.

Inventor: Alexander Joseph Branover
Throttling hull shaders based on tessellation factors in a graphics pipeline

Patent number: 11948251

Abstract: A processing system includes hull shader circuitry that launches thread groups including one or more primitives. The hull shader circuitry also generates tessellation factors that indicate subdivisions of the primitives. The processing system also includes throttling circuitry that estimates a primitive launch time interval for the domain shader based on the tessellation factors and selectively throttles launching of the thread groups from the hull shader circuitry based on the primitive launch time interval of the domain shader and a hull shader latency. In some cases, the throttling circuitry includes a first counter that is incremented in response to launching a thread group from the buffer and a second counter that modifies the first counter based on a measured latency of the domain shader.

Type: Grant

Filed: October 26, 2022

Date of Patent: April 2, 2024

Assignee: Advanced Micro Devices, Inc.

Inventor: Nishank Pathak
Weak cache line invalidation requests for speculatively executing instructions

Patent number: 11947456

Abstract: Techniques for invalidating cache lines are provided. The techniques include issuing, to a first level of a memory hierarchy, a weak exclusive read request for a speculatively executing store instruction; determining whether to invalidate one or more cache lines associated with the store instruction in one or more memories; and issuing the weak invalidation request to additional levels of the memory hierarchy.

Type: Grant

Filed: September 30, 2021

Date of Patent: April 2, 2024

Assignee: Advanced Micro Devices, Inc.

Inventor: Paul J. Moyer
Machine learning inference engine scalability

Patent number: 11948073

Abstract: Systems, apparatuses, and methods for adaptively mapping a machine learning model to a multi-core inference accelerator engine are disclosed. A computing system includes a multi-core inference accelerator engine with multiple inference cores coupled to a memory subsystem. The system also includes a control unit which determines how to adaptively map a machine learning model to the multi-core inference accelerator engine. In one implementation, the control unit selects a mapping scheme which minimizes the memory bandwidth utilization of the multi-core inference accelerator engine. In one implementation, this mapping scheme involves having one inference core of the multi-core inference accelerator engine fetch given data and broadcast the given data to other inference cores of the inference accelerator engine. Each inference core fetches second data unique to the respective inference core.

Type: Grant

Filed: August 30, 2018

Date of Patent: April 2, 2024

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Lei Zhang, Sateesh Lagudu, Allen Rush
Gang scheduling for low-latency task synchronization

Patent number: 11948000

Abstract: Systems, apparatuses, and methods for performing command buffer gang submission are disclosed. A system includes at least first and second processors and a memory. The first processor (e.g., CPU) generates a command buffer and stores the command buffer in the memory. A mechanism is implemented where a granularity of work provided to the second processor (e.g., GPU) is increased which, in turn, increases the opportunities for parallel work. In gang submission mode, the user-mode driver (UMD) specifies a set of multiple queues and command buffers to execute on those multiple queues, and that work is guaranteed to execute as a single unit from the GPU operating system scheduler point of view. Using gang submission, synchronization between command buffers executing on multiple queues in the same submit is safe. This opens up optimization opportunities for application use (explicit gang submission) and for internal driver use (implicit gang submission).

Type: Grant

Filed: March 31, 2021

Date of Patent: April 2, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Mitchell Howard Singer, Derrick Trevor Owens
Duplicated registers in chiplet processing units

Patent number: 11947473

Abstract: Systems, apparatuses, and methods for implementing duplicated registers for access by initiators across multiple semiconductor dies are disclosed. A system includes multiple initiators on multiple semiconductor dies of a chiplet processor. One of the semiconductor dies is the master die, and this master die has copies of registers which can be accessed by the multiple initiators on the multiple semiconductor dies. When a given initiator on a given secondary die generates a register access, the register access is routed to the master die and a particular duplicate copy of the register maintained for the given secondary die. From the point of view of software, the multiple semiconductor dies appear as a single die, and the multiple initiators appear as a single initiator. Multiple types of registers can be maintained by the master die, with a flush register being one of the register types.

Type: Grant

Filed: October 12, 2021

Date of Patent: April 2, 2024

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Haikun Dong, Kostantinos Danny Christidis, Ling-Ling Wang, MinHua Wu, Gaojian Cong, Rui Wang
Separate clocking for components of a graphics processing unit

Patent number: 11947380

Abstract: Systems and methods related to controlling clock signals for clocking shader engine modules (SEs) and non-shader-engine modules (nSEs) of a graphics processing unit (GPU) are provided. One or more dividers receive a clock signal CLK and output a clock signal CLKA to the SEs and output a clock signal CLKB to the nSEs. The frequencies of CLKA and CLKB are independently selected based on sets of performance counter data monitored at the SEs and nSEs, respectively. The clock signal frequency for either the SEs or the nSEs is reduced when the corresponding sets of performance counter data indicate a comparatively lower processing workload for the SEs or for the nSEs.
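
A simplified sketch of the independent frequency selection in this abstract follows. The utilization values stand in for the performance counter data, and the threshold, divider ratios, and function names are assumptions chosen for illustration.

```python
def select_divider(utilization, low_threshold=0.3, full_divider=1, reduced_divider=2):
    """Pick a clock divider from a utilization estimate in [0, 1]."""
    # A comparatively low workload selects the larger divider (lower frequency).
    return reduced_divider if utilization < low_threshold else full_divider


def select_clocks(clk_hz, se_utilization, nse_utilization):
    """Derive CLKA (for the SEs) and CLKB (for the nSEs) independently from CLK."""
    clka = clk_hz / select_divider(se_utilization)
    clkb = clk_hz / select_divider(nse_utilization)
    return clka, clkb
```

For example, `select_clocks(2000, 0.9, 0.1)` keeps the busy shader engines at the full rate while halving the clock for the lightly loaded non-shader-engine modules.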

Type: Grant

Filed: August 18, 2022

Date of Patent: April 2, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Ranjith Kumar Sajja, Sreekanth Godey, Anirudh R. Acharya
Cross-chiplet performance data streaming

Patent number: 11947476

Abstract: Methods and systems are disclosed for cross-chiplet performance data streaming. Techniques disclosed include accumulating, by a subservient chiplet, event data associated with an event indicative of a performance aspect of the subservient chiplet; sending, by the subservient chiplet, the event data over a chiplet bus to a master chiplet; and adding, by the master chiplet, the received event data to an event record, the event record containing previously received, from the subservient chiplet over the chiplet bus, event data associated with the event.

Type: Grant

Filed: March 31, 2022

Date of Patent: April 2, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Bryan Broussard, Pravesh Gupta, Benjamin Tsien, Vydhyanathan Kalyanasundharam
Enabling accelerated processing units to perform dataflow execution

Patent number: 11947487

Abstract: Methods and systems are disclosed for performing dataflow execution by an accelerated processing unit (APU). Techniques disclosed include decoding information from one or more dataflow instructions. The decoded information is associated with dataflow execution of a computational task. Techniques disclosed further include configuring, based on the decoded information, dataflow circuitry, and, then, executing the dataflow execution of the computational task using the dataflow circuitry.

Type: Grant

Filed: June 28, 2022

Date of Patent: April 2, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Johnathan Robert Alsop, Karthik Ramu Sangaiah, Anthony T. Gutierrez
Method and apparatus for training memory

Patent number: 11947833

Abstract: A method and apparatus for training data in a computer system includes reading data stored in a first memory address in a memory and writing it to a buffer. Training data is generated for transmission to the first memory address. The data is transmitted to the first memory address. Information relating to the training data is read from the first memory address and the stored data is read from the buffer and written to the memory area where the training data was transmitted.
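
The save, train, and restore sequence in this abstract can be modeled roughly as below. Memory is a dictionary and the pass criterion (the read-back value equals the pattern written) is an assumption; the function name `train_with_preservation` is hypothetical.

```python
def train_with_preservation(memory, address, training_pattern):
    """Run one training step at `address` without losing the data stored there."""
    buffer = memory[address]              # read the stored data into a buffer
    memory[address] = training_pattern    # transmit the training data
    readback = memory[address]            # read information about the training data
    passed = readback == training_pattern
    memory[address] = buffer              # write the original data back
    return passed
```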

Type: Grant

Filed: June 21, 2022

Date of Patent: April 2, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Anwar Kashem, Craig Daniel Eaton, Pouya Najafi Ashtiani, Tsun Ho Liu
Redundancy method and apparatus for shader column repair

Patent number: 11948223

Abstract: Methods and systems are described. A system includes a redundant shader pipe array that performs rendering calculations on data provided thereto and a shader pipe array that includes a plurality of shader pipes, each of which performs rendering calculations on data provided thereto. The system also includes a circuit that identifies a defective shader pipe of the plurality of shader pipes in the shader pipe array. In response to identifying the defective shader pipe, the circuit generates a signal. The system also includes a redundant shader switch. The redundant shader switch receives the generated signal, and, in response to receiving the generated signal, transfers the data for the defective shader pipe to the redundant shader pipe array.
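
One way to picture the redundant shader switch in this abstract is as a routing step that diverts the defective pipe's data. The sketch below assumes a single redundant target and one data item per pipe; `route_to_pipes` and the `"redundant"` key are names invented for this example.

```python
def route_to_pipes(per_pipe_data, defective_pipe=None):
    """Map each pipe's data to a destination, diverting the defective pipe."""
    routed = {}
    for pipe_index, data in enumerate(per_pipe_data):
        if pipe_index == defective_pipe:
            # The switch transfers this pipe's data to the redundant array.
            routed["redundant"] = data
        else:
            routed[pipe_index] = data
    return routed
```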

Type: Grant

Filed: July 11, 2022

Date of Patent: April 2, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Jeffrey T. Brady, Angel E. Socarras
A/D Bit Storage, Processing, and Modes

Publication number: 20240104023

Abstract: A/D bit storage, processing, and mode management techniques through use of a dense A/D bit representation are described. In one example, a memory management unit employs an A/D bit representation generation module to generate the dense A/D bit representation. In an implementation, the A/D bit representation is stored adjacent to existing page table structures of the multilevel page table hierarchy. In another example, the memory management unit supports use of modes as part of A/D bit storage.

Type: Application

Filed: September 26, 2022

Publication date: March 28, 2024

Applicant: Advanced Micro Devices, Inc.

Inventor: William A. Moyes
Block Data Load with Transpose into Memory

Publication number: 20240103879

Abstract: Block data load with transpose techniques are described. In one example, an input is received, at a control unit, specifying an instruction to load a block of data to at least one memory module using a transpose operation. Responsive to receiving the input by the control unit, the block of data is caused to be loaded to the at least one memory module by transposing the block of data to form a transposed block of data and storing the transposed block of data in the at least one memory.
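
A load with transpose, as outlined in this abstract, might be sketched like this. The flat list used as memory, the row-major storage layout, and the name `load_block_transposed` are assumptions for the example only.

```python
def load_block_transposed(block, memory, base):
    """Transpose a 2-D block and store it row-major in `memory` starting at `base`."""
    rows, cols = len(block), len(block[0])
    # Element (r, c) of the source block becomes element (c, r) of the result.
    transposed = [[block[r][c] for r in range(rows)] for c in range(cols)]
    for i, row in enumerate(transposed):
        for j, value in enumerate(row):
            # Each transposed row holds `rows` elements, so that is the stride.
            memory[base + i * rows + j] = value
    return transposed
```

With a 2x3 block `[[1, 2, 3], [4, 5, 6]]`, the six values land in memory in column order of the source block.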

Type: Application

Filed: September 25, 2022

Publication date: March 28, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Bin He, Michael John Mantor, Brian Emberling, Liang Huang, Chao Liu
Reduction of Parallel Memory Operation Messages

Publication number: 20240103730

Abstract: In accordance with described techniques for reduction of parallel memory operation messages, a computing system or computing device includes a memory system that receives memory operation messages. A shared response component in the memory system receives responses to the memory operation messages, and identifies a set of the responses that are coalesceable. The shared response component then coalesces the set of the responses into a combined message for communication completion through a communication path in the memory system.
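
The coalescing step in this abstract can be illustrated with a small sketch. The abstract does not say which responses are coalesceable, so the rule used here (acknowledgements bound for the same destination) is purely an assumption, as are the field names and `coalesce_responses`.

```python
from collections import defaultdict


def coalesce_responses(responses):
    """Merge acknowledgements that share a destination into one combined message."""
    groups = defaultdict(list)
    passthrough = []
    for resp in responses:
        if resp["kind"] == "ack":
            # Assumed coalesceable condition: plain acks to the same destination.
            groups[resp["dest"]].append(resp["request_id"])
        else:
            passthrough.append(resp)
    combined = [
        {"kind": "combined_ack", "dest": dest, "request_ids": ids}
        for dest, ids in groups.items()
    ]
    return combined + passthrough
```

Responses that carry data are passed through unchanged in this model, so only the acknowledgement traffic shrinks.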

Type: Application

Filed: September 28, 2022

Publication date: March 28, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Johnathan Robert Alsop, Shaizeen Dilawarhusen Aga, Mohamed Assem Abd ElMohsen Ibrahim
DIVERSIFIED VIRTUAL MEMORY

Publication number: 20240103897

Abstract: Systems and methods are disclosed for managing diversified virtual memory by an engine. Techniques disclosed include receiving one or more request messages, each request message including a job descriptor that specifies an operation to be performed on a respective virtual memory space, processing the job descriptors by generating one or more commands for transmission to one or more virtual memory managers, and transmitting the one or more commands to the one or more virtual memory managers (VMMs) for processing.

Type: Application

Filed: September 27, 2022

Publication date: March 28, 2024

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Norman Vernon Douglas Stewart, Mihir Shaileshbhai Doctor, Omar Fakhri Ahmed
DEVICE AND METHOD OF IMPLEMENTING SUBPASS INTERLEAVING OF TILED IMAGE RENDERING

Publication number: 20240104685

Abstract: Devices and methods of tiled rendering are provided which comprise dividing a frame to be rendered into a plurality of tiles, receiving commands to execute a plurality of subpasses of the tiles, interleaving execution of same subpasses of multiple tiles of the frame by executing one or more subpasses as skip operations, storing visibility data, for subsequently ordered subpasses of the tiles, at memory addresses allocated for data of corresponding adjacent tiles in a first direction of traversal and rendering the tiles for the subsequently ordered subpasses using the visibility data stored at the memory addresses allocated for corresponding adjacent tiles in a second direction of traversal, opposite the first direction of traversal.

Type: Application

Filed: September 28, 2022

Publication date: March 28, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Ruijin Wu, Michael John Livesley, Kiia Kallio, Jan H. Achrenius, Mika Tuomi
Predicates for Processing-in-Memory

Publication number: 20240103860

Abstract: Predicates for processing in memory are described. In accordance with the described techniques, a predicate instruction to compute a conditional value based on data stored in a memory is provided to a processing-in-memory component. A response that includes the conditional value computed by the processing-in-memory component is received, and the conditional value is stored in a predicate register. One or more conditional instructions are provided to the processing-in-memory component based on the conditional value stored in the predicate register.
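
The predicate flow in this abstract can be modeled in a few lines. The comparison used as the predicate, the conditional add, and the class names `PimComponent` and `Host` are all assumptions made for this sketch.

```python
class PimComponent:
    """Toy processing-in-memory unit operating on a dict-backed memory."""

    def __init__(self, memory):
        self.memory = memory

    def execute_predicate(self, address, threshold):
        # Compute the conditional value next to the data and return it.
        return self.memory[address] > threshold

    def execute_conditional_add(self, address, amount):
        self.memory[address] += amount


class Host:
    """Holds the predicate register and decides which instructions to send."""

    def __init__(self, pim):
        self.pim = pim
        self.predicate_register = False

    def run(self, address, threshold, amount):
        # Store the conditional value returned by the PIM component.
        self.predicate_register = self.pim.execute_predicate(address, threshold)
        # Issue the conditional instruction only when the predicate holds.
        if self.predicate_register:
            self.pim.execute_conditional_add(address, amount)
        return self.predicate_register
```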

Type: Application

Filed: September 26, 2022

Publication date: March 28, 2024

Applicant: Advanced Micro Devices, Inc.

Inventor: Nuwan S. Jayasena
Filtered Responses of Memory Operation Messages

Publication number: 20240106782

Abstract: In accordance with described techniques for filtered responses to memory operation messages, a computing system or computing device includes a memory system that receives messages. A filter component in the memory system receives the responses to the memory operation messages, and filters one or more of the responses based on a filterable condition. A tracking logic component tracks the one or more responses as filtered responses for communication completion.

Type: Application

Filed: September 28, 2022

Publication date: March 28, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Johnathan Robert Alsop, Shaizeen Dilawarhusen Aga, Mohamed Assem Abd ElMohsen Ibrahim
