Patents Assigned to Advanced Micro Devices, Inc.
  • Publication number: 20240411462
    Abstract: Local and dynamic triggering of operations executed by a processing-in-memory component is described. In accordance with the described techniques, a processing-in-memory component receives a command from a host for execution by the processing-in-memory component. The processing-in-memory component references a tracking table that includes at least one entry associated with an operation performed as part of executing the command and identifies at least one additional command to be triggered locally after executing the command received from the host. Responsive to identifying that conditions associated with the at least one additional command are satisfied, the processing-in-memory component executes the at least one additional command, independent of instructions from the host.
    Type: Application
    Filed: June 8, 2023
    Publication date: December 12, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Mohamed Assem Abd ElMohsen Ibrahim, Shaizeen Dilawarhusen Aga, Mahzabeen Islam
  • Publication number: 20240412446
    Abstract: A technique for performing ray tracing operations is provided. The technique includes detecting intersection of a ray with a split bounding volume of an instance of a bounding volume hierarchy; determining whether the split bounding volume meets an instance traversal limiting criterion; and continuing BVH traversal based on the determining.
    Type: Application
    Filed: June 9, 2023
    Publication date: December 12, 2024
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: David William John Pankratz, Christiaan Paul Gribble
  • Publication number: 20240411692
    Abstract: Cache replacement policies are described. In accordance with the described techniques, a request for data is received and a cache replacement policy controls how a controller responds to the request. The cache replacement policy assigns each cacheline a priority value, which indicates whether the cacheline should be preserved relative to other cachelines, in response to the request being a cache miss that necessitates eviction of at least one cacheline. The cache replacement policy decrements priority values until at least one cacheline achieves a minimum priority value, at which point a cacheline is evicted. The cache replacement policy designates certain cachelines as protected, either via a separate protected indicator or via the cacheline's priority value, which causes unprotected cachelines to be selected for eviction while favoring preservation of protected cachelines in the cache.
    Type: Application
    Filed: June 9, 2023
    Publication date: December 12, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Gabriel Hsiuwei Loh, Joseph Lee Greathouse, William Louie Walker, Paul James Moyer
  • Publication number: 20240412445
    Abstract: A technique for performing ray tracing operations is provided. The technique includes, traversing through a bounding volume hierarchy for a ray to arrive at a well-fit bounding volume that is associated with first node, wherein the first node is one of a traversal node or a procedural node, and wherein the well-fit bounding volume comprises geometry other than a single axis-aligned bounding box for the first node; evaluating the ray for intersection with the well-fit bounding volume; determining whether to execute a first shader program associated with the first node based on the evaluating, wherein the first shader program comprises a traversal shader program or a procedural shader program; and executing or not executing the first shader program based on the determining.
    Type: Application
    Filed: June 9, 2023
    Publication date: December 12, 2024
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: David William John Pankratz, David Ronald Oldcorn
  • Patent number: 12167102
    Abstract: Systems, apparatuses, and methods for processing multi-cast messages are disclosed. A system includes at least one or more processing units, one or more memory controllers, and a communication fabric coupled to the processing unit(s) and the memory controller(s). The communication fabric includes a plurality of crossbars which connect various agents within the system. When a multi-cast message is received by a crossbar, the crossbar extracts a message type indicator and a recipient type indicator from the message. The crossbar uses the message type indicator to determine which set of masks to lookup using the recipient type indicator. Then, the crossbar determines which one or more masks to extract from the selected set of masks based on values of the recipient type indicator. The crossbar combines the one or more masks with a multi-cast route to create a port vector for determining on which ports to forward the multi-cast message.
    Type: Grant
    Filed: September 21, 2018
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Vydhyanathan Kalyanasundharam, Joe G. Cruz, Eric Christopher Morton, Alan Dodson Smith
  • Patent number: 12165252
    Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.
    Type: Grant
    Filed: October 3, 2023
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak
  • Patent number: 12164923
    Abstract: Methods and systems are disclosed for processing a vector by a vector processor. Techniques disclosed include receiving predicated instructions by a scheduler, each of which is associated with an opcode, a vector of elements, and a predicate. The techniques further include executing the predicated instructions. Executing a predicated instruction includes compressing, based on an index derived from a predicate of the instruction, elements in a vector of the instruction, where the elements in the vector are contiguously mapped, then, after the mapped elements are processed, decompressing the processed mapped elements, where the processed mapped elements are reverse mapped based on the index.
    Type: Grant
    Filed: June 29, 2022
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Elliott David Binder, Onur Kayiran, Masab Ahmad
  • Patent number: 12165981
    Abstract: A semiconductor package includes a package substrate having a first surface and an opposing second surface, and further includes an integrated circuit (IC) die disposed at the second surface and having a third surface facing the second surface and an opposing fourth surface. The IC die has a first region comprising one or more metal layers and circuit components for one or more functions of the IC die and a second region offset from the first region in a direction parallel with the third and fourth surfaces. The semiconductor package further includes a voltage regulator disposed at the fourth surface in the second region and having an input configured to receive a supply voltage and an output configured to provide a regulated voltage, and also includes a conductive path coupling the output of the voltage regulator to a voltage input of circuitry of the IC die.
    Type: Grant
    Filed: December 20, 2021
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Gabriel H Loh, Raja Swaminathan, Rahul Agarwal, Brett P. Wilkerson
  • Patent number: 12164365
    Abstract: An apparatus and method for efficiently managing power consumption among multiple, replicated functional blocks of an integrated circuit. An integrated circuit includes multiple, replicated functional blocks that use separate power domains. Data of a given type is stored in an interleaved manner among at least two of the multiple functional blocks. In one implementation, a prior static allocation determines that only a subset of the functional blocks store the data of the given type. In another implementation, each of the functional blocks stores the data of the given type, and when an idle state has occurred, data of the given type is moved between the multiple functional blocks until one or more functional blocks no longer store data of the given type. When a transition to the idle state has occurred, the functional blocks that do not store the data of the given type are transitioned to a sleep state.
    Type: Grant
    Filed: December 27, 2022
    Date of Patent: December 10, 2024
    Assignees: Advanced Micro Devices, Inc, ATI Technologies ULC
    Inventors: Gia Tung Phan, Ashish Jain, Shang Yang
  • Patent number: 12165700
    Abstract: A technique reduces power consumption of a bit cell in a memory and provides write assistance to the bit cell. When the bit cell is active, a power-saving write-assist circuit coupled to the bit cell is selectively sized according to a type of memory access. When the bit cell is inactive, the virtual power supply node floats to a predetermined voltage between a first voltage on a first power supply node coupled to the bit cell and a second voltage on a second power supply node coupled to the bit cell. A method for controlling power consumption of a bit cell and assisting a write to the bit cell includes providing a reference voltage to a virtual power supply node coupled to the bit cell. The reference voltage is provided based on an operational state of the bit cell and a type of memory access to the bit cell.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Russell J. Schreiber, John J. Wuu, Keith A. Kasprak
  • Patent number: 12164353
    Abstract: A system and method for determining power-performance state transition thresholds in a computing system. A processor comprises several functional blocks and a power manager. Each of the functional blocks produces data corresponding to an activity level associated with the respective functional block. The power manager determines activity levels of the functional blocks and compares the activity level of a given functional block to a threshold to determine if a power-performance state (P-state) transition is indicated. The threshold is determined in part on a current P-state of the given functional block. When the current P-state of the given functional block is relatively high, the threshold activity level to transition to a higher P-state is higher than it would be if the current P-state were relatively low. The power manager is further configured to determine the thresholds based in part on one or more of a type of circuit being monitored and a type of workload being executed.
    Type: Grant
    Filed: September 29, 2022
    Date of Patent: December 10, 2024
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Ashish Jain, Shang Yang
  • Patent number: 12165016
    Abstract: Techniques are disclosed for communicating between a machine learning accelerator and one or more processing cores. The techniques include obtaining data at the machine learning accelerator via an input/output die; processing the data at the machine learning accelerator to generate machine learning processing results; and exporting the machine learning processing results via the input/output die, wherein the input/output die is coupled to one or more processor chiplets via one or more processor ports, and wherein the input/output die is coupled to the machine learning accelerator via an accelerator port.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Maxim V. Kazakov
  • Patent number: 12164924
    Abstract: A method includes, in response to receiving an instruction to perform a first operation on first data stored in a memory device, obtaining first compression metadata from the memory device based on an address for the first data, and reducing a number of operations in a set of operations based on the first operation and one or more matching addresses, the one or more matching addresses corresponding to second compression metadata matching the first compression metadata.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: December 10, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Matthew Tomei, Shomit Das
  • Publication number: 20240402907
    Abstract: A technique for operating a memory system is disclosed. The technique includes performing a first request, by a first memory client, to access data at a first memory address, wherein the first memory address refers to data in a first memory section that is coupled to the first memory client via a direct memory connection; servicing the first request via the direct memory connection; performing a second request, by the first client, to access data at a second memory address, wherein the second memory address refers to data in a second memory section that is coupled to the first client via a cross connection; and servicing the second request via the cross connection.
    Type: Application
    Filed: August 14, 2024
    Publication date: December 5, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Vidyashankar Viswanathan, Richard E. George, Michael Y. Chow
  • Publication number: 20240403065
    Abstract: The disclosed device includes multiple special purpose processors that are configured to perform, in parallel, a power on transition sequence for the device, which can involve restoring a data state of components of the device using data stored in local storages of the special purpose processors. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Application
    Filed: May 31, 2024
    Publication date: December 5, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Sriram Sambamurthy, Indrani Paul, Kevin M. Brandl, James R. Magro, Zhao Hui Yu, Oswin E. Housty
  • Patent number: 12158842
    Abstract: A processing system allocates memory to co-locate input and output operands for operations for processing in memory (PIM) execution in the same PIM-local memory while exploiting row-buffer locality and complying with conventional memory abstraction. The processing system identifies as “super rows” virtual rows that span all the banks of a memory device. Each super row has a different bank-interleaving pattern, referred to as a “color”. A group of contiguous super rows that has the same PIM-interleaving pattern is referred to as a “color group”. The processing system assigns memory addresses to each operand (e.g., vector) of an operation for PIM execution to a super row having a different color within the same color group to co-locate the operands for each PIM execution unit and uses address hashing to alternate between banks assigned to elements of a first operand and elements of a second operand of the operation.
    Type: Grant
    Filed: September 30, 2022
    Date of Patent: December 3, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Benjamin Youngjae Cho, Armand Bahram Behroozi, Michael L. Chu, Ashwin Aji
  • Patent number: 12160238
    Abstract: Systems, apparatuses, and methods for implementing a low-power single-phase logic gate latch for clock-gating are disclosed. A latch circuit includes shared clocked transistors without including clock inverters. The shared clocked transistors include a P-type clocked transistor and an N-type clocked transistor, with the clock input coupled to the gate of the P-type clocked transistor and to the gate of the N-type clocked transistor. The P-type clocked transistor is coupled between first and second transistor stacks of the latch. The N-type clocked transistor is coupled to a source gate of a first stack N-type transistor gated by a data input and to a source gate of a second stack N-type transistor gated by the inverted data input. The latch has a lower clock pin capacitance than a traditional logic gate latch while also avoiding having clock inverters which reduces dynamic power consumption.
    Type: Grant
    Filed: December 28, 2021
    Date of Patent: December 3, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nur Mohammad Baksh, Deepesh John
  • Patent number: 12158845
    Abstract: Systems, apparatuses, and methods for maintaining region-based cache directories split between node and memory are disclosed. The system with multiple processing nodes includes cache directories split between the nodes and memory to help manage cache coherency among the nodes' cache subsystems. In order to reduce the number of entries in the cache directories, the cache directories track coherency on a region basis rather than on a cache line basis, wherein a region includes multiple cache lines. Each processing node includes a node-based cache directory to track regions which have at least one cache line cached in any cache subsystem in the node. The node-based cache directory includes a reference count field in each entry to track the aggregate number of cache lines that are cached per region. The memory-based cache directory includes entries for regions which have an entry stored in any node-based cache directory of the system.
    Type: Grant
    Filed: April 15, 2022
    Date of Patent: December 3, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Vydhyanathan Kalyanasundharam, Kevin M. Lepak, Amit P. Apte, Ganesh Balakrishnan
  • Patent number: 12159341
    Abstract: A technique for performing ray tracing operations is provided. The technique includes traversing through a bounding volume hierarchy to an instance node; performing an instance node transform using common circuitry; traversing to a leaf node of the bounding volume hierarchy; and performing an intersection test for the leaf node using the common circuitry.
    Type: Grant
    Filed: December 28, 2021
    Date of Patent: December 3, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Skyler Jonathon Saleh, Fataneh F. Ghodrat, Jeffrey Christopher Allan
  • Patent number: 12158956
    Abstract: An apparatus and method for providing access to reliable boot firmware. In various implementations, a computing system includes an integrated circuit with a security processor. Prior to performing any steps of a bootup operation using one of multiple copies of boot firmware, the security processor determines whether multiple signatures exist where the signatures are based on the multiple copies of boot firmware. Each of the multiple copies of boot firmware is a copy of a particular version of boot firmware. If the multiple signatures do not yet exist, then the security processor generates the signatures using the multiple copies of boot firmware. During a bootup operation, when the security processor determines that the multiple signatures already exist, the security processor uses these signatures to validate one or more of the multiple copies of boot firmware. The security processor continues with the bootup operation using the validated copy of boot firmware.
    Type: Grant
    Filed: November 14, 2022
    Date of Patent: December 3, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Nimit Madhubhai Patel