Abstract: Described herein is a technique for performing ray tracing operations. The technique includes encountering, at a non-leaf node, a pointer to a bottom-level acceleration structure having one or more delta instances; identifying an index associated with the pointer, wherein the index identifies an instance within the bottom-level acceleration structure; and obtaining data for the instance based on the pointer and the index.
Type:
Application
Filed:
September 28, 2021
Publication date:
March 30, 2023
Applicant:
Advanced Micro Devices, Inc.
Inventors:
Konstantin I. Shkurko, Matthäus G. Chajdas, Michael Mantor
Abstract: A method and apparatus for isolating and restoring general-purpose input/output (GPIO) pads in a computer system includes, responsive to an entry into a power-down state of a region of a circuit, identifying GPIO pads associated with the region. The GPIO pads are isolated from one or more external circuits. Upon exit from the power-down state, each associated GPIO pad is restored to a current value.
Type:
Application
Filed:
September 24, 2021
Publication date:
March 30, 2023
Applicant:
Advanced Micro Devices, Inc.
Inventors:
Alexander J. Branover, Indrani Paul, Benjamin Tsien, Christopher T. Weaver, John P. Petry, Mihir Shaileshbhai Doctor, Thomas J. Gibney
Abstract: Devices and methods for linear addressing are provided. A device is provided which comprises a plurality of components having assigned registers used to store data to execute a program and a power management controller, in communication with the components. The power management controller is configured to send one of a request to remove power to the components and a request to reduce power to the components when it is determined that the components are idle, execute a first process of one of removing power and reducing power to the components and entering a reduced power state when an acknowledgement of the request is received and execute a second process of restoring power to the components when one or more of the components are indicated to be active.
Type:
Application
Filed:
September 23, 2021
Publication date:
March 30, 2023
Applicant:
Advanced Micro Devices, Inc.
Inventors:
Mihir Shaileshbhai Doctor, Alexander J. Branover, Benjamin Tsien, Indrani Paul, Christopher T. Weaver, Thomas J. Gibney, Stephen V. Kosonocky, John P. Petry
Abstract: A method and processing device for accessing data is provided. The processing device comprises a cache and a processor. The cache comprises a first data section having a first cache hit latency and a second data section having a second cache hit latency that is different from the first cache hit latency of the first data section. The processor is configured to request access to data in memory, the data corresponding to a memory address which includes an identifier that identifies the first data section of the cache. The processor is also configured to load the requested data, determined to be located in the first data section of the cache, according to the first cache hit latency of the first data section of the cache.
Abstract: Systems and methods are disclosed for reducing the power consumption of a system. Techniques are described that queue a message, sent by a source engine of the system, in a queue of a destination engine of the system that is in a sleep mode. Then, a priority level associated with the queued message is determined. If the priority level is at a maximum level, the destination engine is brought into an active mode. If the priority level is at an intermediate level, the destination engine is brought into an active mode when a time, associated with the intermediate level, has elapsed. When the destination engine is brought into an active mode it processes all messages accumulated in its queue in an order determined by their associated priority levels.
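The wake-up rule in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the applicant's implementation: the class name `DestinationEngine`, the three priority constants, the single `intermediate_delay` value, the explicit `tick(now)` clock, and the choice to return to sleep after draining the queue are all hypothetical.

```python
import heapq

# Hypothetical priority levels; the abstract only names "maximum" and "intermediate".
LOW, INTERMEDIATE, MAX_PRIORITY = 0, 1, 2


class DestinationEngine:
    """Toy model of a sleeping engine that wakes according to message priority."""

    def __init__(self, intermediate_delay):
        self.intermediate_delay = intermediate_delay  # assumed wait for intermediate messages
        self._queue = []            # min-heap of (-priority, arrival order, payload)
        self._seq = 0
        self._wake_deadline = None  # earliest pending timed wake-up, if any
        self.processed = []         # payloads in the order they were handled

    def enqueue(self, payload, priority, now):
        # Messages are always queued first, even while the engine sleeps.
        heapq.heappush(self._queue, (-priority, self._seq, payload))
        self._seq += 1
        if priority == MAX_PRIORITY:
            self._wake()  # maximum priority: wake immediately
        elif priority == INTERMEDIATE:
            deadline = now + self.intermediate_delay
            if self._wake_deadline is None or deadline < self._wake_deadline:
                self._wake_deadline = deadline  # wake once this time has elapsed

    def tick(self, now):
        # Called by the simulated clock; fires a pending timed wake-up.
        if self._wake_deadline is not None and now >= self._wake_deadline:
            self._wake()

    def _wake(self):
        # Active mode: drain everything accumulated, highest priority first.
        self._wake_deadline = None
        while self._queue:
            _, _, payload = heapq.heappop(self._queue)
            self.processed.append(payload)
        # Assumption: the engine goes back to sleep once its queue is empty.


engine = DestinationEngine(intermediate_delay=5)
engine.enqueue("low", LOW, now=0)            # stays queued, engine keeps sleeping
engine.enqueue("mid", INTERMEDIATE, now=1)   # schedules a wake-up at t = 6
engine.tick(3)                               # too early, nothing happens
processed_before_deadline = list(engine.processed)
engine.tick(6)                               # deadline reached: "mid" then "low"
engine.enqueue("a", LOW, now=7)
engine.enqueue("urgent", MAX_PRIORITY, now=8)  # immediate wake: "urgent" then "a"
```

Low-priority messages never wake the engine on their own in this sketch; they are handled only when a higher-priority message or a timer brings the engine into active mode, which is where the power saving would come from.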
Abstract: Disclosed is a system and method for use in a cache for suppressing modification of a cache line. The system and method include a processor and a memory operating cooperatively with a cache controller. The memory includes a coherence directory stored within a cache created to track at least one cache line in the cache via the cache controller. The processor instructs the cache controller to store first data in a cache line in the cache. The cache controller tags the cache line based on the first data. The processor instructs the cache controller to store second data in the cache line in the cache, causing eviction of the first data from the cache line. Based on the tagging, the processor compares the first data and the second data and suppresses modification of the cache line based on the comparison.
Abstract: A system and method for omission of probes when requesting data stored in memory where the omission includes creating a coherence directory entry, determining whether cache line data for the coherence directory entry is a trackable pattern, and setting an indication indicating that one or more reads for the cache line data can be serviced without sending probes. A system and method for providing extra data storage capacity in a coherence directory where the extra data storage capacity includes actively tracking cache lines, invalidating the cache line and informing the coherence directory, determining whether data is a trackable pattern, updating the coherence directory that the cache line is no longer in cache, updating the coherence directory to indicate cache line data is zero, and servicing reads to the cache line from the coherence directory and supplying the specified data.
Abstract: Techniques for invalidating cache lines are provided. The techniques include issuing, to a first level of a memory hierarchy, a weak exclusive read request for a speculatively executing store instruction; determining whether to invalidate one or more cache lines associated with the store instruction in one or more memories; and issuing the weak invalidation request to additional levels of the memory hierarchy.
Abstract: Systems and methods are disclosed that provide low latency augmented reality architecture for camera enabled devices. Systems and methods of communication between system components are presented that use a hybrid communication protocol. Techniques include communications between system components that involve one-way transactions. A hardware message controller is disclosed that controls out-buffers and in-buffers to facilitate the hybrid communication protocol.
Abstract: Systems and methods are disclosed that map quantum circuits to physical qubits of a quantum computer. Techniques are disclosed to generate a graph that characterizes the physical qubits of the quantum computer and to compute the resource requirements of each circuit of the quantum circuits. For each circuit, the graph is searched for a subgraph that matches the resource requirements of the circuit, based on a density matrix. Physical qubits, defined by the matching subgraph, are then allocated to the logical qubits of the circuit.
Type:
Application
Filed:
September 30, 2021
Publication date:
March 30, 2023
Applicant:
Advanced Micro Devices, Inc.
Inventors:
Anthony T. Gutierrez, Salonik Resch, Yasuko Eckert, Gabriel H. Loh, Mark Henry Oskin, Vedula Venkata Srikant Bharadwaj
Abstract: A memory controller includes a command queue with multiple entry stacks, each with a plurality of entries holding memory access commands, one or more parameter indicators each holding a respective characteristic common to the plurality of entries, and a head indicator designating a current entry for arbitration. An arbiter has a single command input for each entry stack. A command queue loader circuit receives incoming memory access commands and loads entries of respective entry stacks with memory access commands having the respective characteristic of each of the one or more parameter indicators in common.
Abstract: A processing device is provided which includes a processor and a data storage structure. The data storage structure comprises a data storage array comprising a plurality of lines. Each line comprises at least one A latch configured to store a data bit and a clock gater. The data storage structure also comprises a write data B latch configured to store, over different clock cycles, a different data bit, each to be written to the at least one A latch of one of the plurality of lines. The data storage structure also comprises a plurality of write index B latches shared by the clock gaters of the lines. The write index B latches are configured to store, over the different clock cycles, combinations of index bits having values which index one of the lines to which a corresponding data bit is to be stored.
Abstract: Systems and methods are disclosed for maintaining insertion policies of a lower-level cache. Techniques are described for selecting, based on metadata of an evicted data block received from an upper-level cache, an insertion policy out of the insertion policies. The techniques then determine, based on the selected insertion policy, whether to insert the data block into the lower-level cache. If it is determined to insert, the data block is inserted into the lower-level cache according to the selected insertion policy. Techniques for dynamically updating the insertion policies of the lower-level cache are also disclosed.
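The select-then-insert flow in the abstract above might look roughly like the following. This is a hedged sketch, not the disclosed design: the metadata field `hits`, the three policy names, the thresholds in `select_policy`, and the list-based `LowerLevelCache` are all assumptions made for illustration, and the dynamic updating of policies is not modeled.

```python
# Hypothetical policy names; the abstract does not enumerate the policies.
BYPASS, INSERT_LRU, INSERT_MRU = "bypass", "insert_lru", "insert_mru"


def select_policy(metadata):
    """Pick an insertion policy from the evicted block's metadata.

    Assumption: metadata carries a 'hits' count recorded by the upper-level cache.
    """
    hits = metadata.get("hits", 0)
    if hits == 0:
        return BYPASS        # never reused upstream: do not insert at all
    if hits == 1:
        return INSERT_LRU    # weak reuse: insert, but first in line for eviction
    return INSERT_MRU        # strong reuse: insert at the protected end


class LowerLevelCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = []  # index 0 is the LRU end, the last index is the MRU end

    def on_upper_level_eviction(self, address, metadata):
        """Handle a block evicted from the upper-level cache; return True if inserted."""
        policy = select_policy(metadata)
        if policy == BYPASS:
            return False
        if address in self.lines:
            self.lines.remove(address)        # re-position an existing line
        elif len(self.lines) >= self.capacity:
            self.lines.pop(0)                 # make room by evicting the LRU line
        if policy == INSERT_LRU:
            self.lines.insert(0, address)
        else:
            self.lines.append(address)
        return True


cache = LowerLevelCache(capacity=2)
cache.on_upper_level_eviction(0xA, {"hits": 3})             # inserted at MRU end
cache.on_upper_level_eviction(0xB, {"hits": 1})             # inserted at LRU end
bypassed = not cache.on_upper_level_eviction(0xC, {"hits": 0})  # not inserted
cache.on_upper_level_eviction(0xD, {"hits": 2})             # evicts 0xB, the LRU line
```

Separating `select_policy` from the cache itself is one way to leave room for the dynamic policy updates the abstract mentions: the selection function could be swapped or re-tuned without touching the insertion logic.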
Abstract: Methods and systems are disclosed for executing a collaborative task in a shader system. Techniques disclosed include receiving, by the system, input data and computing instructions associated with the collaborative task, as well as a configuration setting, causing the system to operate in a takeover mode. The system then launches, exclusively in one workgroup processor, a workgroup including wavefronts configured to execute the collaborative task.
Abstract: Systems and methods for cache replacement are disclosed. Techniques are described that determine a re-reference interval prediction (RRIP) value of respective data blocks in a cache, where an RRIP value represents a likelihood that a respective data block will be re-used within a time interval. Upon an access, by a processor, to a data segment in a memory, if the data segment is not stored in the cache, a data block in the cache to be replaced by the data segment is selected, utilizing a binary tree that tracks recency of data blocks in the cache.
Abstract: Described herein is a technique for performing operations for a bounding volume hierarchy. The technique includes: for a bounding box with quantized orientation, the bounding box being part of a bounding volume hierarchy, rotating a ray according to the quantized orientation to generate a rotated ray; performing an intersection test against the bounding box with the rotated ray; and according to the results of the intersection test, continuing traversal of the bounding volume hierarchy.
Type:
Application
Filed:
September 29, 2021
Publication date:
March 30, 2023
Applicant:
Advanced Micro Devices, Inc.
Inventors:
David Ronald Oldcorn, Matthäus G. Chajdas, Michael A. Kern
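The rotated-ray test described in the abstract of this application can be sketched as follows. The sketch makes several assumptions that are not in the source: the orientation is quantized to one of `steps` equal rotations about the z-axis only, the box is stored as axis-aligned bounds in its own local frame, and the intersection test is the conventional slab test. The function names are hypothetical.

```python
import math


def rotate_z(v, angle):
    """Rotate a 3D vector about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x - s * y, s * x + c * y, z)


def slab_test(origin, direction, box_min, box_max):
    """Conventional ray vs. axis-aligned box slab test for t >= 0."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if abs(d) < 1e-12:
            # Ray is parallel to this slab: it must already lie between the planes.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def hits_oriented_box(origin, direction, box_min, box_max, step, steps=16):
    """Test a ray against a box whose orientation is quantized to `steps` z-rotations.

    Instead of rotating the box, the ray is rotated by the inverse angle into
    the box's local frame, where the box is axis-aligned.
    """
    angle = 2.0 * math.pi * step / steps
    local_origin = rotate_z(origin, -angle)
    local_direction = rotate_z(direction, -angle)
    return slab_test(local_origin, local_direction, box_min, box_max)


# A thin box, long in local x and narrow in local y.
box_min, box_max = (-1.0, -0.1, -1.0), (1.0, 0.1, 1.0)
ray_origin, ray_direction = (0.5, 0.5, -5.0), (0.0, 0.0, 1.0)

hit_rotated = hits_oriented_box(ray_origin, ray_direction, box_min, box_max, step=2)    # box turned 45 degrees
hit_unrotated = hits_oriented_box(ray_origin, ray_direction, box_min, box_max, step=0)  # box axis-aligned
```

In a traversal loop, a `True` result would mean the children of this node are visited next, and `False` would mean the subtree is skipped. Storing only a small orientation index per box keeps the node compact while still letting the box fit diagonal geometry more tightly than an axis-aligned box would.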
Abstract: Methods and systems are disclosed for executing operations on single-instruction-multiple-data (SIMD) units. Techniques disclosed perform a dot product operation on input data during one computer cycle, including convolving the input data, generating intermediate data, and applying one or more transitional operations to the intermediate data to generate output data. In the aspects described, the input data is an input to a layer of a convolutional neural network and the generated output data is the output of the layer.
Type:
Application
Filed:
September 29, 2021
Publication date:
March 30, 2023
Applicant:
Advanced Micro Devices, Inc.
Inventors:
Brian Emberling, Michael Mantor, Michael Y. Chow, Bin He
Abstract: Techniques for performing cache operations are provided. The techniques include tracking re-references for cache lines of a cache, detecting that eviction is to occur, and selecting a cache line for eviction from the cache based on a re-reference indication.
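One simple way to picture eviction driven by a re-reference indication is a one-bit flag per line. The following is a minimal sketch under assumptions, not the disclosed technique: the class name `ReRefCache`, the single boolean flag, the oldest-first scan order, and the fallback of clearing all flags when every line has been re-referenced are all hypothetical choices.

```python
class ReRefCache:
    """Toy fully associative cache that prefers to evict lines never re-referenced."""

    def __init__(self, capacity):
        self.capacity = capacity
        # address -> re-referenced flag; dicts keep insertion order, oldest first.
        self.lines = {}

    def access(self, address):
        """Access an address; return the evicted address, or None if nothing was evicted."""
        if address in self.lines:
            self.lines[address] = True   # hit: record the re-reference
            return None
        victim = None
        if len(self.lines) >= self.capacity:
            victim = self._select_victim()   # eviction is about to occur
            del self.lines[victim]
        self.lines[address] = False          # new line has not been re-referenced yet
        return victim

    def _select_victim(self):
        # Prefer the oldest line that was never re-referenced.
        for address, re_referenced in self.lines.items():
            if not re_referenced:
                return address
        # Every line was re-referenced: clear the flags and fall back to the oldest line.
        for address in self.lines:
            self.lines[address] = False
        return next(iter(self.lines))


cache = ReRefCache(capacity=2)
cache.access(1)
cache.access(2)
cache.access(1)                  # line 1 is re-referenced
first_victim = cache.access(3)   # line 2 was never re-referenced, so it goes
cache.access(3)                  # line 3 is re-referenced too
second_victim = cache.access(4)  # all lines flagged: fall back to the oldest, line 1
```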
Abstract: Methods are provided for creating objects in a way that permits an API client to explicitly participate in memory management for an object created using the API. Methods for managing data object memory include requesting memory requirements for an object using an API and expressly allocating a memory location for the object based on the memory requirements. Methods are also provided for cloning objects such that a state of the object remains unchanged from the original object to the cloned object or can be explicitly specified.
Type:
Application
Filed:
December 2, 2022
Publication date:
March 30, 2023
Applicants:
Advanced Micro Devices, Inc., ATI Technologies ULC
Abstract: A voltage level-shifting circuit for an integrated circuit includes an input terminal receiving a voltage signal referenced to an input/output (I/O) voltage level. A transistor overvoltage protection circuit includes a first p-type metal oxide semiconductor (PMOS) transistor having a source coupled to the second voltage supply, a gate receiving an enable signal, and a drain connected to a central node. A first n-type metal oxide semiconductor (NMOS) transistor includes a drain connected to the central node, a gate connected to the input terminal, and a source connected to an output terminal. A second NMOS transistor includes a drain connected to the input terminal, a gate connected to the central node, and a source connected to the output terminal.