Patents Assigned to Advanced Micro Devices
-
Patent number: 12019499Abstract: A system and method for fast save/restore is disclosed. The system and method include one or more logical units (LUs) residing in independent power domains, one or more digital frequency synthesizers (DFS), each of the one or more DFS associated with one of the one or more LUs, the one or more DFSs configured to lock a system complex frequency and ramp the one or more LUs to system complex frequency, and one or more slave fast save/restore control (FSRC) units, each slave FSRC unit associated with one of the one or more LUs, the one or more slave FSRC units configured to save/restore the FSRC states of the one or more LUs.Type: GrantFiled: December 16, 2021Date of Patent: June 25, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Mom-Eng Ng, Dilip Kumar Jha
-
Patent number: 12019560Abstract: Process isolation for a PIM device includes: receiving, from a process, a call to allocate a virtual address space where the process stores a PIM configuration context; allocating the virtual address space including mapping a physical address space including PIM device configuration registers to the virtual address space only if the physical address space is not mapped to another process's virtual address space; and programming the PIM device configuration space according to the configuration context. When a PIM command is executed, a translation mechanism determines whether there is a valid mapping of a virtual address of the PIM command to a physical address of a PIM resource, such as a LIS entry. If a valid mapping exists, the translation is completed and the resource is accessed, but if there is not a valid mapping, the translation fails and the process is blocked from accessing the PIM resource.Type: GrantFiled: December 20, 2021Date of Patent: June 25, 2024Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Sooraj Puthoor, Muhammad Amber Hassaan, Ashwin Aji, Michael L. Chu, Nuwan Jayasena
-
Patent number: 12019904Abstract: One or both of read and write accesses to a fabric-attached memory module via a fabric interconnect are monitored. In one or more implementations, offloading of one or more tasks accessing the fabric-attached memory module to a processor of a routing system associated with the fabric-attached memory module is initiated based on the read and write accesses to the fabric-attached memory module. Additionally or alternatively, replicating memory of the fabric-attached memory module to a cache memory of a computing node in the disaggregated memory system executing one or more tasks of a host application is initiated based on the write accesses to the fabric-attached memory module.Type: GrantFiled: December 15, 2021Date of Patent: June 25, 2024Assignee: Advanced Micro Devices, Inc.Inventors: Vamsee Reddy Kommareddy, Seyedmohammad SeyedzadehDelcheh, Sergey Blagodurov
-
Publication number: 20240205092Abstract: The disclosed device includes a collective engine that can receive state information from nodes of a collective network. The collective engine can use the state information to initialize a topology of appropriate data routes between the nodes for the collective operation. Various other methods, systems, and computer-readable media are also disclosed.Type: ApplicationFiled: December 14, 2023Publication date: June 20, 2024Applicant: Advanced Micro Devices, Inc.Inventor: Josiah I. Clark
-
Publication number: 20240203033Abstract: A technique for performing ray tracing operations is provided. The technique includes, in a first iteration of a ray traversal technique, traversing to an instance node of a bounding volume hierarchy; in a second iteration of the ray traversal technique that is subsequent to the first iteration, transforming a ray based on an instance transform of the instance node to generate a transformed ray; and in the second iteration, performing a ray-box intersection test for box node data of the instance node based on the transformed ray.Type: ApplicationFiled: December 14, 2022Publication date: June 20, 2024Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: David William John Pankratz, David Kirk McAllister, David Ronald Oldcorn, Michael John Livesley, Daniel James Skinner
-
Publication number: 20240202047Abstract: The disclosed computer-implemented method can include reaching, by a chiplet involved in carrying out an operation for a process, a synchronization barrier. The method can additionally include receiving, by the chiplet, dedicated control messages pushed to the chiplet by other chiplets involved in carrying out the operation for the process, wherein the dedicated control messages are pushed over a control network by the other chiplets. The method can also include advancing, by the chiplet, the synchronization barrier in response to receipt of the dedicated control messages. Various other methods, systems, and computer-readable media are also disclosed.Type: ApplicationFiled: December 16, 2022Publication date: June 20, 2024Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Joseph L. Greathouse, Alan D. Smith, Anthony Asaro, Kostantinos Danny Christidis, Alexander Fuad Ashkar, Milind N. Nemlekar
-
Publication number: 20240202862Abstract: A processing device and a method of auto-tiled workload processing is provided. The processing device includes memory and a processor. The processor is configured to store instructions for operations to be executed on an image to be divided into a plurality of tiles, store information associated with the operations, select one of the operations for execution and execute an auto-tiling plan for the operation based on the information associated with the operations. The auto-tiling plan comprises, for example, determining a number of tiles used to divide the image and determining a size of one or more of the tiles of the image.Type: ApplicationFiled: December 20, 2022Publication date: June 20, 2024Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Guennadi Riguer, Mark Satterthwaite, Jeremy Lukacs, Zhuo Chen, Gareth Havard Thomas
-
Publication number: 20240202121Abstract: Programmable data storage memory hierarchy techniques are described. In one example, a data storage system includes a memory hierarchy and a data movement controller. The memory hierarchy includes a hierarchical arrangement of a plurality of memory buffers. The data movement controller is configured to receive a data movement command and control data movement between the plurality of memory buffers based on the data movement command.Type: ApplicationFiled: December 20, 2022Publication date: June 20, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Shaizeen Dilawarhusen Aga, Johnathan Robert Alsop, Nuwan S Jayasena
-
Publication number: 20240201777Abstract: The disclosed method includes observing a utilization of a target sub-component of a functional unit of a processor using a control circuit coupled to the target sub-component. The method also includes detecting that the utilization is outside a desired utilization range and throttling one or more sub-components of the functional unit to reduce a power consumption of the functional unit. Various other methods, systems, and computer-readable media are also disclosed.Type: ApplicationFiled: December 19, 2022Publication date: June 20, 2024Applicant: Advanced Micro Devices, Inc.Inventors: James Mossman, Robert Cohen, Sudherssen Kalaiselvan, Tzu-Wei Lin
-
Publication number: 20240202116Abstract: An entry of a last level cache shadow tag array to track pending last level cache misses to private data in a previous level cache (e.g., an L2 cache), that also are misses to an exclusive last level cache (e.g., an L3 cache) and to the last level cache shadow tag array. Accordingly, last level cache miss status holding registers need not be expended to track cache misses to private data that are already being tracked by a previous level cache miss status holding register. Additionally or alternatively, up to a threshold number of last level cache pending misses to the same shared data from different processor cores are tracked in the last level cache shadow tag array, and any additional last level cache pending misses are tracked in a last level cache miss status holding register.Type: ApplicationFiled: December 20, 2022Publication date: June 20, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Jagadish B. Kotra, John Kalamatianos, Paul James Moyer, Nicholas Dean Lance, Sriram Srinivasan, Patrick James Shyvers, William Louie Walker
-
Publication number: 20240205093Abstract: The disclosed device includes a collective engine that can select a communication cost model from multiple communication cost models for a collective operation and configure a topology of a collective network for performing the collective operation using the selected communication cost model. Various other methods, systems, and computer-readable media are also disclosed.Type: ApplicationFiled: December 14, 2023Publication date: June 20, 2024Applicant: Advanced Micro Devices, Inc.Inventor: Josiah I. Clark
-
Publication number: 20240201876Abstract: A method and apparatus of managing memory includes storing a first memory page at a shared memory location in response to the first memory page including data shared between a first virtual machine and a second virtual machine. A second memory page is stored at a memory location unique to the first virtual machine in response to the second memory page including data unique to the first virtual machine. The first memory page is accessed by the first virtual machine and the second virtual machine, and the second memory page is accessed by the first virtual machine and not the second virtual machine.Type: ApplicationFiled: December 16, 2022Publication date: June 20, 2024Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Lu Lu, Anthony Asaro, Yinan Jiang
-
Publication number: 20240202144Abstract: A coherent memory fabric includes a plurality of coherent master controllers and a coherent slave controller. The plurality of coherent master controllers each include a response data buffer. The coherent slave controller is coupled to the plurality of coherent master controllers. The coherent slave controller, responsive to determining a selected coherent block read command is guaranteed to have only one data response, sends a target request globally ordered message to the selected coherent master controller and transmits responsive data. The selected coherent master controller, responsive to receiving the target request globally ordered message, blocks any coherent probes to an address associated with the selected coherent block read command until receipt of the responsive data is acknowledged by a requesting client.Type: ApplicationFiled: January 11, 2024Publication date: June 20, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Vydhyanathan Kalyanasundharam, Amit P. Apte, Eric Christopher Morton, Ganesh Balakrishnan, Ann M. Ling
-
Publication number: 20240201948Abstract: A processing device for encoding floating point numbers comprising memory configured to store data comprising the floating point numbers and circuitry. The circuitry is configured to, for a set of the floating point numbers, identify which of the floating point numbers represent a zero value and which of the floating point numbers represent a non-zero value, convert the floating point numbers which represent a non-zero value into a block floating point format value and generate an encoded sparse block floating point format value. The circuitry is also configured to, decode floating point numbers. For an encoded block floating point format value, the circuitry converts the encoded block floating point format value to a set of non-zero floating point numbers based on a sparsity mask previously generated to encode the encoded block floating point format value and generates a non-sparse set of floating point values.Type: ApplicationFiled: December 16, 2022Publication date: June 20, 2024Applicant: Advanced Micro Devices, Inc.Inventor: Gabriel H. Loh
-
Publication number: 20240203034Abstract: A technique for performing ray tracing operations is provided. The technique includes, testing a plurality of bounding boxes for intersection with a ray in parallel, wherein the plurality of bounding boxes are specified by a plurality of box data items of a parent box node of a bounding volume hierarchy; determining that, for a first child node that is pointed to by a two or more node pointers specified by two or more box data items of the plurality of box data items, at least one bounding box specified by the two or more box data items is intersected by the ray; and in response to the determining, traversing to the first child node.Type: ApplicationFiled: December 14, 2022Publication date: June 20, 2024Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: David William John Pankratz, David Kirk McAllister, Daniel James Skinner, Michael John Livesley, David Ronald Oldcorn
-
Publication number: 20240205133Abstract: The disclosed device can perform a collective operation on received datasets, and split the result into chunks in accordance with a chunking scheme. The device can also forward the chunks in accordance with a routing scheme that can direct chunks to appropriate nodes of a collective network. Various other methods, systems, and computer-readable media are also disclosed.Type: ApplicationFiled: December 14, 2023Publication date: June 20, 2024Applicant: Advanced Micro Devices, Inc.Inventor: Josiah I. Clark
-
Publication number: 20240201990Abstract: Fused data generation and associated communication techniques are described. In an implementation, a system includes processing system having a plurality of processors. A data generation and communication tracking module is configured to track programmatically defined data generation and associated communication as performed by the plurality of processors. A targeted communication module is configured to trigger targeted communication of data between the plurality of processors based on the tracked programmatically defined data generation and associated communication.Type: ApplicationFiled: March 27, 2023Publication date: June 20, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Shaizeen Dilawarhusen Aga, Suchita Pati, Nuwan S. Jayasena
-
Publication number: 20240203032Abstract: A technique for performing ray tracing operations is provided. The technique includes identifying triangles to include in a compressed triangle block; storing data common to the identified triangles as common data of the compressed triangle block; and storing data unique to the identified triangles as unique data of the compressed triangle block.Type: ApplicationFiled: December 14, 2022Publication date: June 20, 2024Applicants: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: David William John Pankratz, David Ronald Oldcorn, Daniel James Skinner, Michael John Livesley, David Kirk McAllister
-
Publication number: 20240203036Abstract: A technique for building a bounding volume hierarchy is disclosed. The technique subdividing a candidate box node based on a resolution to generate a plurality of cells of the candidate box node; identifying a plurality of nodes of a triangle set collection that fit within the cells; generating a plurality of candidate splits based on the plurality of nodes; selecting a candidate split based on a selection criterion to obtain a selected candidate split; and generating child box nodes for a box node of a bounding volume hierarchy under construction, based on the selected candidate split.Type: ApplicationFiled: December 16, 2022Publication date: June 20, 2024Applicant: Advanced Micro Devices, Inc.Inventor: John Alexandre Tsakok
-
Publication number: 20240201993Abstract: Data evaluation using processing-in-memory is described. In accordance with the described techniques, data evaluation logic is loaded into a processing-in-memory component. The processing-in-memory component executes the data evaluation logic to evaluate a minimum number of bits required to retrieve data from, or store data to, at least one memory location. A result is output indicating the number of bits required to represent data at the at least one memory location based on the evaluation.Type: ApplicationFiled: December 16, 2022Publication date: June 20, 2024Applicant: Advanced Micro Devices, Inc.Inventors: Shaizeen Dilawarhusen Aga, Leopold Grinberg