Patents Assigned to Advanced Micro Devices, Inc.
-
Patent number: 12361628
Abstract: A graphics processing unit (GPU) of a processing system is partitioned into multiple dies (referred to as GPU chiplets) that are configurable to collectively function and interface with an application as a single GPU in a first mode and as multiple GPUs in a second mode. By dividing the GPU into multiple GPU chiplets, the processing system flexibly and cost-effectively configures an amount of active GPU physical resources based on an operating mode. In addition, a configurable number of GPU chiplets are assembled into a single GPU, such that multiple different GPUs having different numbers of GPU chiplets can be assembled using a small number of tape-outs and a multiple-die GPU can be constructed out of GPU chiplets that implement varying generations of technology.
Type: Grant
Filed: December 8, 2022
Date of Patent: July 15, 2025
Assignee: Advanced Micro Devices, Inc.
Inventors: Mark Fowler, Samuel Naffziger, Michael Mantor, Mark Leather
-
Patent number: 12360804
Abstract: A processing system flexibly schedules workgroups across kernels based on data dependencies between workgroups to enhance processing efficiency. The workgroups are partitioned into subsets based on the data dependencies and workgroups of a first subset that produces data are scheduled to execute immediately before workgroups of a second subset that consumes the data generated by the first subset. Thus, the processing system does not execute one kernel at a time, but instead schedules workgroups across kernels based on data dependencies across kernels. By limiting the sizes of the subsets to the amount of data that can be stored at local caches, the processing system increases the probability that data to be consumed by workgroups of a subset will be resident in a local cache and will not require a memory access.
Type: Grant
Filed: December 30, 2022
Date of Patent: July 15, 2025
Assignee: Advanced Micro Devices, Inc.
Inventor: Harris Gasparakis
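Illustrative sketch (not taken from the patent): the ordering idea in this abstract can be pictured with a toy model of two kernels, in which consumer workgroup i reads only what producer workgroup i wrote. The function name, the one-to-one dependency, and the byte-count parameters are all assumptions made for illustration.

```python
def interleave_by_dependency(num_workgroups, bytes_per_workgroup, cache_bytes):
    """Return an execution order that alternates producer and consumer subsets.

    Assumption: consumer workgroup i depends only on producer workgroup i,
    and each producer workgroup writes bytes_per_workgroup bytes.
    """
    # Limit each subset to what fits in the (hypothetical) local cache.
    subset = max(1, cache_bytes // bytes_per_workgroup)
    order = []
    for start in range(0, num_workgroups, subset):
        stop = min(start + subset, num_workgroups)
        # Producers of this subset run first ...
        order += [("producer", i) for i in range(start, stop)]
        # ... immediately followed by the consumers of the same data.
        order += [("consumer", i) for i in range(start, stop)]
    return order
```

With four workgroups of 1024 bytes each and a 2048-byte cache, the order alternates in subsets of two rather than running all producers before any consumer.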
-
Publication number: 20250224982
Abstract: In accordance with the described techniques, a scalable input/output virtualization (SIOV) device includes multiple hardware queues, backend hardware resources, and a command processor running scheduling firmware. The scheduling firmware selects a shared work queue of multiple shared work queues managed by the scheduling firmware from which to dispatch tasks based on one or more dispatch policies. In addition, the scheduling firmware selects a hardware queue of the multiple hardware queues in which to enqueue the tasks based on one or more queue policies. Further, the scheduler dispatches the tasks from the shared work queue to the hardware queue, and the tasks are read from the hardware queue by the backend hardware resources for execution.
Type: Application
Filed: January 10, 2024
Publication date: July 10, 2025
Applicant: Advanced Micro Devices, Inc.
Inventors: Anthony Thomas Gutierrez, Stephen Alexander Zekany, Ali Arda Eker
-
Publication number: 20250225017
Abstract: A system comprises a machine check architecture and a processor. The machine check architecture is configured to log hardware errors. The processor is configured to obtain a log of one or more of the hardware errors from the machine check architecture and/or to generate a copy of the log. The processor is further configured to either (1) deliver the log to an in-band agent and the copy of the log to an out-of-band agent or (2) deliver the copy of the log to the in-band agent and the log to the out-of-band agent. Various other devices, systems, and methods are also disclosed.
Type: Application
Filed: March 31, 2025
Publication date: July 10, 2025
Applicants: ATI Technologies ULC, Advanced Micro Devices, Inc.
Inventors: Vilas Sridharan, Vamsi Krishna Alla, Maher Mounir Moghabghab, Kabita Rani Saha, Carlos Vallin, Vignesh Vaidhyanathan Seshan
-
Patent number: 12353338
Abstract: A data processing node includes a processor element and a data fabric circuit. The data fabric circuit is coupled to the processor element and to a local memory element and includes a crossbar switch. The data fabric circuit is operable to bypass the crossbar switch for memory access requests between the processor element and the local memory element.
Type: Grant
Filed: June 29, 2023
Date of Patent: July 8, 2025
Assignee: Advanced Micro Devices, Inc.
Inventor: Gabriel H. Loh
-
Publication number: 20250217120
Abstract: Using artificial intelligence (AI)-based techniques to guide instruction scheduling in a compiler can improve the efficiency and code generation quality of the compiler. AI-guided scheduling of a basic block of a computer program can include obtaining first and second representations of the basic block; selecting K instruction scheduling procedures from a set of N instruction scheduling procedures based on analysis of the first representation of the basic block by a model, where 1≤K<N and N≥2; generating K candidate schedules of the basic block, including applying the K instruction scheduling procedures to the second representation of the basic block; and ordering the instructions of the second representation of the basic block in accordance with a candidate schedule included in the K candidate schedules.
Type: Application
Filed: December 28, 2023
Publication date: July 3, 2025
Applicant: Advanced Micro Devices, Inc.
Inventors: Jake Matthew Daly, Ian Charles Colbert, Andrei Rudolfovich Yershov, Ryan Mitchell, Robert A. Gottlieb, Norman Rubin
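Illustrative sketch (not taken from the application): the select-K-of-N flow in this abstract can be outlined in a few lines. Everything named here is hypothetical: `model_score` stands in for the model that analyzes the first representation, `procedures` maps made-up names to scheduling functions, and choosing the final schedule by a `schedule_cost` function is an assumption, since the abstract only says that one of the K candidates is used.

```python
def select_and_schedule(features, instructions, procedures,
                        model_score, schedule_cost, k):
    """Pick K of N scheduling procedures with a model, then keep one schedule.

    features      -- first representation of the block (input to the model)
    instructions  -- second representation (what the procedures reorder)
    procedures    -- dict: name -> function(instructions) -> schedule
    model_score   -- hypothetical model: (features, name) -> score
    schedule_cost -- hypothetical cost used to choose among candidates
    """
    n = len(procedures)
    if not (n >= 2 and 1 <= k < n):
        raise ValueError("need N >= 2 and 1 <= K < N")
    # Rank the N procedures by model score and keep the top K.
    ranked = sorted(procedures,
                    key=lambda name: model_score(features, name),
                    reverse=True)
    # Apply each chosen procedure to obtain K candidate schedules.
    candidates = [procedures[name](instructions) for name in ranked[:k]]
    # Assumption: the cheapest candidate is the one that gets used.
    return min(candidates, key=schedule_cost)
```

Only K procedures are run instead of all N, which is where the compile-time saving in such a scheme would come from.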
-
Publication number: 20250217298
Abstract: A method for reducing cache fills can include training a filter, by at least one processor and in response to at least one of eviction or rewrite of one or more entries of a cache, the filter indicating one or more cache loads from which the one or more entries were previously filled. The method can also include preventing, by the at least one processor and based on the trained filter, one or more subsequent fills to the cache from the one or more cache loads. Various other methods and systems are also disclosed.
Type: Application
Filed: December 28, 2023
Publication date: July 3, 2025
Applicant: Advanced Micro Devices, Inc.
Inventors: Alok Garg, Matthew Sobel, Alice Danielle Kivity
-
Publication number: 20250218104
Abstract: A device that defines and uses a bounding volume for testing for ray intersections with a displaced micro-mesh. The bounding volume is indirectly based on a twisted prism composed of two triangles and three bilinear patches that bounds the displaced micro-mesh. Instead of detecting intersection with the bilinear patches directly, tetrahedrons that circumscribe the bilinear patches can be used instead. The two bases and the three tetrahedra make fourteen triangles. The device tests for potential intersection with the displaced micro-mesh by testing for an intersection with any of the fourteen triangles. Various other methods and systems are also disclosed.
Type: Application
Filed: December 28, 2023
Publication date: July 3, 2025
Applicant: Advanced Micro Devices, Inc.
Inventors: David Kirk McAllister, Andrew Erin Kensler, Holger Gruen
-
Publication number: 20250216889
Abstract: The disclosed device includes various circuit blocks and a clock tree for sending a clock signal to the circuit blocks. The clock tree includes various clock drivers. The device also includes a control circuit that power gates, in response to one of the circuit blocks being power gated, a portion of the clock tree that includes one of the clock drivers. Various other methods, systems, and computer-readable media are also disclosed.
Type: Application
Filed: December 28, 2023
Publication date: July 3, 2025
Applicant: Advanced Micro Devices, Inc.
Inventors: Benjamin Tsien, Pravesh Gupta, Madhusudan Chilakam, Jeffrey Lynn Freeman, Indrani Paul, Guhan Krishnan, Ann M. Ling, Chandana Yerneni
-
Publication number: 20250217185
Abstract: A computer-implemented method for physical core-specific wear-based task scheduling can include obtaining a wear metric for each physical core of the plurality of physical cores of the at least one integrated circuit, wherein the wear metric is indicative of a physical condition of each physical core. The computer-implemented method can then schedule a plurality of tasks across at least one physical core of the plurality of physical cores based at least in part on the wear metric of each physical core of the plurality of cores. Various other methods, systems, and computer-readable media are also disclosed.
Type: Application
Filed: December 28, 2023
Publication date: July 3, 2025
Applicant: Advanced Micro Devices, Inc.
Inventors: Moumita Dey, Heather Lynn Hanson, Aarti Choudhary, Srilatha Manne
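Illustrative sketch (not taken from the application): one simple way to schedule "based at least in part on the wear metric" is to give each task to the least-worn core. The greedy policy, the integer wear scale, and the idea that running a task adds its cost to a core's wear are all assumptions made for illustration.

```python
import heapq

def schedule_by_wear(wear, tasks):
    """Greedily assign each task to the currently least-worn core.

    wear  -- dict: core id -> wear metric (higher means more worn)
    tasks -- list of (task name, assumed wear cost of running it)
    Returns a dict: task name -> core id.
    """
    # Min-heap keyed on wear, so the least-worn core is always on top.
    heap = [(w, core) for core, w in wear.items()]
    heapq.heapify(heap)
    assignment = {}
    for name, cost in tasks:
        w, core = heapq.heappop(heap)
        assignment[name] = core
        # Assumption: running the task increases that core's wear by its cost.
        heapq.heappush(heap, (w + cost, core))
    return assignment
```

A lightly worn core keeps receiving work until its projected wear catches up with the others, which spreads aging across the cores over time.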
-
Publication number: 20250217287
Abstract: An example device can include at least one network controller configured to receive a data request and to retrieve data based on the data request, and a cache agent configured to receive a data access parameter based on the data request, and reconfigure a cache for at least one memory cache based on the data access parameter. The data request can be received from a computer device and the data can be retrieved from at least one memory device. An example data access parameter can include a latency of at least one network-attached memory device to retrieve data from the at least one memory device based on the data request. An example device can further comprise a flit profiler configured to determine the data access parameter. Various other methods, systems, and computer-readable media are also disclosed.
Type: Application
Filed: December 28, 2023
Publication date: July 3, 2025
Applicant: Advanced Micro Devices, Inc.
Inventors: Sergey Blagodurov, Vamsee Reddy Kommareddy, Pratik Mishra, Nathaniel Morris, Kevin Y. Cheng
-
Publication number: 20250216888
Abstract: Temporary system adjustment for component overclocking is described. In accordance with the described techniques, a processor and/or memory are operated according to first settings. During operation of the processor and/or the memory according to the first settings, a signal triggers a temporary adjustment of operation of the processor and/or the memory according to second settings. Responsive to the request, operation of the processor and/or the memory is switched to the second settings without rebooting. After a duration, operation of the processor and/or the memory is switched back to the first settings. In one or more implementations, at least one of the first settings or the second settings overclock the processor and/or the memory.
Type: Application
Filed: December 28, 2023
Publication date: July 3, 2025
Applicants: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC
Inventors: Wayne Paul Rodrigue, Grant Evan Ley, Jerry Anton Ahrens, JR., Coralie So, Xianglong Du, Nicholas Carmine DeFiore, Ronald James Baughman, Joshua Taylor Knight, William Robert Alverson
-
Publication number: 20250217692
Abstract: A quantum computing device includes a plurality of quantum parallel processing units (Q-PPUs) configured to execute a set of quantum instructions of a quantum application program. The quantum computing device includes an adaptive quantum instruction scheduler to dynamically distribute the set of quantum instructions to the plurality of Q-PPUs based, at least in part, upon a measured probability of a desired result of executing the set of quantum instructions of the quantum application program and a decoherence time of a qubit.
Type: Application
Filed: December 28, 2023
Publication date: July 3, 2025
Applicant: Advanced Micro Devices, Inc.
Inventors: DaZheng WANG, Jie ZHANG, Zhenyu XU
-
Publication number: 20250217297
Abstract: A computing device includes detection circuitry configured to detect invalidation of a line of a cache array. The computing device additionally includes setting circuitry configured to set, in response to the detected invalidation, a spare state encoding in an entry of a partial line-based probe filter that indicates recent invalidation of the line of the cache array. The computing device also includes processing circuitry configured to process a transaction that hits on the entry of the partial line-based probe filter by avoiding a multicast probe of the cache array. Various other methods, systems, and computer-readable media are also disclosed.
Type: Application
Filed: November 22, 2022
Publication date: July 3, 2025
Applicant: Advanced Micro Devices, Inc.
Inventors: Amit P. Apte, Ganesh Balakrishnan
-
Publication number: 20250217292
Abstract: Adaptive system probe action to minimize input/output dirty data transfers is described. In one or more implementations, a system includes a processor, a memory configured to store data, and a cache configured to store a portion of the data stored in the memory for execution by the processor. The system also includes a cache coherence controller including a cache line history. The cache coherence controller is configured to detect a direct memory access request from an input/output device. The direct memory access request is associated with an input/output operation involving the data. The cache coherence controller is further configured to identify a cache line associated with the direct memory access request, and, in response to the cache line history including a dirty data transfer record corresponding to the cache line, selectively send a probe to the cache based on a state of the cache line.
Type: Application
Filed: December 28, 2023
Publication date: July 3, 2025
Applicant: Advanced Micro Devices, Inc.
Inventors: Li Ou, Ganesh Balakrishnan, Amit Apte
-
Publication number: 20250216456
Abstract: A system includes a first chiplet and a second chiplet connected via a plurality of interconnects. The system includes a pattern generator configured to generate a test pattern on behalf of the first chiplet. The system includes a pattern checker configured to check the test pattern on behalf of the second chiplet. The system includes a first repair multiplexer and a second repair multiplexer corresponding to the first chiplet and the second chiplet, respectively. The first repair multiplexer and the second repair multiplexer are configured to selectively enable a repair path responsive to a short fault between two interconnects of the plurality of interconnects based on the checked test pattern.
Type: Application
Filed: December 27, 2023
Publication date: July 3, 2025
Applicant: Advanced Micro Devices, Inc.
Inventors: Nehal R. Patel, Carl Dean Dietz, Michael Kevin Ciraula, John J. Wuu, Russell J. Schreiber
-
Publication number: 20250217246
Abstract: An exemplary apparatus for interfacing dies that use incompatible protocols includes a first die that uses a first protocol, a second die that uses a second protocol, and a die management unit communicatively coupled to both the first die and the second die in an integrated circuit. In some examples, the die management unit is configured to translate at least one message between the first protocol and the second protocol to support communication between the first die and the second die. Various other apparatuses, systems, and methods are also disclosed.
Type: Application
Filed: December 29, 2023
Publication date: July 3, 2025
Applicant: Advanced Micro Devices, Inc.
Inventors: Robert Landon Pelt, Nehal Patel, Lu Lu, Atul Subhash Athavale, Alexander J. Branover, Abin Thomas
-
Publication number: 20250217293
Abstract: Shared last level cache usage management for multiple clients is described. In one or more implementations, a system includes a shared last level cache coupled to multiple clients and a dynamic random access memory. The system further includes a linear dropout regulator that supplies power to the shared last level cache. A data fabric included in the system is configured to control a level of the power supplied from the linear dropout regulator to be either a first level or a second level based on usage of the shared last level cache.
Type: Application
Filed: December 20, 2024
Publication date: July 3, 2025
Applicant: Advanced Micro Devices, Inc.
Inventors: Indrani Paul, Benjamin Tsien, Mahesh Subramony, Oleksandr Khodorkovsky
-
Patent number: 12346265
Abstract: Systems, apparatuses, and methods for implementing cache line re-reference interval prediction using a physical page address are disclosed. When a cache line is accessed, a controller retrieves a re-reference interval counter value associated with the line. If the counter is less than a first threshold, then the address of the cache line is stored in a small re-use page buffer. If the counter is greater than a second threshold, then the address is stored in a large re-use page buffer. When a new cache line is inserted in the cache, if its address is stored in the small re-use page buffer, then the controller assigns a high priority to the line to cause it to remain in the cache to be re-used. If a match is found in the large re-use page buffer, then the controller assigns a low priority to the line to bias it towards eviction.
Type: Grant
Filed: December 16, 2019
Date of Patent: July 1, 2025
Assignee: Advanced Micro Devices, Inc.
Inventors: Jieming Yin, Yasuko Eckert, Subhash Sethumurugan
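Illustrative sketch (not taken from the patent): the two-buffer decision described in this abstract can be modeled in software. The threshold values, the buffer capacity, the FIFO replacement inside each buffer, the 4 KiB page size, and the choice to key the buffers on the page number (suggested by "physical page address" in the abstract's first sentence) are all assumptions.

```python
from collections import OrderedDict

PAGE_SHIFT = 12        # assumed 4 KiB pages
FIRST_THRESHOLD = 1    # assumed: counter below this means short re-use interval
SECOND_THRESHOLD = 2   # assumed: counter above this means long re-use interval

class PageBuffer:
    """Small bounded set of page numbers with FIFO replacement (assumed)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def add(self, page):
        self.pages.pop(page, None)          # refresh position if present
        self.pages[page] = True
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # drop the oldest entry

    def __contains__(self, page):
        return page in self.pages

small_reuse = PageBuffer(capacity=4)
large_reuse = PageBuffer(capacity=4)

def on_access(address, counter):
    """Record the page of an accessed line according to its counter value."""
    page = address >> PAGE_SHIFT
    if counter < FIRST_THRESHOLD:
        small_reuse.add(page)
    elif counter > SECOND_THRESHOLD:
        large_reuse.add(page)

def insertion_priority(address):
    """Choose the priority for a newly inserted line from the page buffers."""
    page = address >> PAGE_SHIFT
    if page in small_reuse:
        return "high"     # keep the line so it can be re-used
    if page in large_reuse:
        return "low"      # bias the line towards eviction
    return "default"
```

Lines whose page appears in neither buffer fall back to whatever default insertion policy the cache would otherwise use, which this sketch labels simply as "default".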
-
Patent number: 12346226
Abstract: Embodiments herein describe a circuit for detecting a single event upset (SEU). The circuit includes a latch including an output node, a first parity node, and a second parity node and correction circuitry configured to correct a single event upset (SEU) at the output node using the first and second parity nodes.
Type: Grant
Filed: October 4, 2023
Date of Patent: July 1, 2025
Assignees: XILINX, INC., Advanced Micro Devices, Inc.
Inventors: Kumar Rahul, Santosh Yachareni, Pierre Maillard, Mrinmoy Goswami, Tabrez Alam, Gokul Puthenpurayil Ravindran, Md Hussain, Sanat Kumar Dubey, John J. Wuu