Patents Assigned to Advanced Micro Devices, Inc.
-
Publication number: 20230206509
Abstract: Methods and systems are disclosed for encoding a Morton code. Techniques disclosed comprise receiving location vectors associated with primitives, where the primitives are graphical elements spatially located within a three-dimensional scene. Techniques further comprise determining a code pattern comprising a prefix pattern and a base pattern, and, then, coding each of the location vectors according to the code pattern.
Type: Application
Filed: December 27, 2021
Publication date: June 29, 2023
Applicant: Advanced Micro Devices, Inc.
Inventor: John Alexandre Tsakok
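For orientation, the sketch below shows a conventional 3D Morton code, which interleaves the bits of quantized x, y, and z coordinates. It is not the prefix-and-base code pattern claimed in the application; the function names `quantize` and `morton3d` and the 10-bit-per-axis width are assumptions made for illustration.

```python
def quantize(value: float, lo: float, hi: float, bits: int = 10) -> int:
    """Map a coordinate in [lo, hi] to an integer grid cell (hypothetical helper)."""
    cells = (1 << bits) - 1
    q = int((value - lo) / (hi - lo) * cells)
    return max(0, min(cells, q))  # clamp out-of-range inputs


def morton3d(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, z: x in bit 3i, y in 3i+1, z in 3i+2."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


if __name__ == "__main__":
    # Encode one location vector inside a unit-cube scene.
    location = (0.25, 0.5, 0.75)
    cell = tuple(quantize(c, 0.0, 1.0) for c in location)
    print(cell, morton3d(*cell))
```

Sorting primitives by such a code places spatially nearby primitives close together in memory, which is the usual reason Morton codes appear in scene-hierarchy construction.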
-
Publication number: 20230205584
Abstract: A disclosed technique includes allocating a first set of resource slots for a first execution instance of a pipeline shader program; correlating the first set of resource slots with graphics pipeline passes; and on a second execution instance of the pipeline shader program, assigning resource slots, from the first set of resource slots, to the graphics pipeline passes, based on the correlating.
Type: Application
Filed: December 28, 2021
Publication date: June 29, 2023
Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Zhuo Chen, Steven J. Tovey
-
Publication number: 20230206113
Abstract: A technique for processing images is disclosed. The technique includes tracking accesses, by a machine learning system, to individual features of a set of features, to generate an access count for each of the individual features; generating a rank for at least one of the individual features of the set of features based on the access count; and assigning the at least one of the individual features to a level of a memory hierarchy based on the rank.
Type: Application
Filed: December 28, 2021
Publication date: June 29, 2023
Applicant: Advanced Micro Devices, Inc.
Inventor: Sergey Blagodurov
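The count, rank, and assign steps in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: `assign_levels` is a hypothetical name, levels are numbered from 0 (fastest), each level except the last has a fixed number of slots, and ties are broken by feature name.

```python
from collections import Counter


def assign_levels(accesses, capacities):
    """Place the most-accessed features in the fastest memory levels.

    accesses:   iterable of feature names, one entry per observed access.
    capacities: slots per level, fastest first; the last level is treated
                as unbounded (an assumption of this sketch).
    """
    counts = Counter(accesses)                                # access count per feature
    ranked = sorted(counts, key=lambda f: (-counts[f], f))    # rank by count
    placement, level, used = {}, 0, 0
    for feature in ranked:
        # Move to the next level once the current one is full.
        while level < len(capacities) - 1 and used >= capacities[level]:
            level, used = level + 1, 0
        placement[feature] = level
        used += 1
    return placement


if __name__ == "__main__":
    trace = ["a", "b", "a", "c", "a", "b", "d"]
    print(assign_levels(trace, [1, 1, 10]))
```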
-
Publication number: 20230205608
Abstract: A disclosed technique includes executing, for a first wavefront, a barrier arrival notification instruction, for a first barrier, indicating arrival at a first barrier point; performing, for the first wavefront, work prior to the first barrier point; executing, for the first wavefront, a barrier check instruction; and executing, for the first wavefront, at a control flow path based on a result of the barrier check instruction.
Type: Application
Filed: December 27, 2021
Publication date: June 29, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Brian Emberling, Joseph L. Greathouse
-
Publication number: 20230205252
Abstract: Methods and systems are disclosed for clock delay compensation in a multiple chiplet system. Techniques disclosed include distributing, by a clock generator, a clock signal across distribution trees of respective chiplets; measuring phases, by phase detectors, where each phase measurement is associated with a chiplet of the chiplets and is indicative of a propagation speed of the clock signal through the distribution tree of the chiplet. Then, for each chiplet, techniques are further disclosed that determine, by a microcontroller, based on the phase measurements associated with the chiplet, a delay offset, and that delay, based on the delay offset, the propagation of the clock signal through the distribution tree of the chiplet using a delay unit associated with the chiplet.
Type: Application
Filed: December 29, 2021
Publication date: June 29, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Anwar Kashem, Craig Daniel Eaton, Pouya Najafi Ashtiani, Deepesh John
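One plausible way to turn per-chiplet phase measurements into delay offsets is to align every chiplet to the slowest one. The abstract does not state how the offset is computed, so the averaging step, the align-to-slowest policy, the picosecond unit, and the name `delay_offsets` below are all assumptions of this sketch.

```python
def delay_offsets(phase_measurements):
    """Compute a delay offset per chiplet from its phase measurements.

    phase_measurements: dict mapping chiplet id -> list of measured clock
    arrival phases (assumed here to be in picoseconds).
    Returns a dict mapping chiplet id -> extra delay that would align the
    chiplet with the latest-arriving (slowest) distribution tree.
    """
    # Average the measurements associated with each chiplet.
    mean_phase = {
        chiplet: sum(samples) / len(samples)
        for chiplet, samples in phase_measurements.items()
    }
    slowest = max(mean_phase.values())
    # Faster chiplets are delayed more; the slowest gets no added delay.
    return {chiplet: slowest - phase for chiplet, phase in mean_phase.items()}


if __name__ == "__main__":
    measured = {"c0": [118, 122], "c1": [150, 150], "c2": [130, 140]}
    print(delay_offsets(measured))
```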
-
Publication number: 20230205544
Abstract: A processing device is provided which comprises memory configured to store data and a processor configured to execute a forward activation of a neural network using a low precision floating point (FP) format, scale up values of numbers represented by the low precision FP format and process the scaled up values of the numbers as non-zero values for the numbers. The processor is configured to scale up the values of one or more numbers, via scaling parameters, to a scaled up value equal to or greater than a floor of a dynamic range of the low precision FP format. The scaling parameters are, for example, static parameters or, alternatively, parameters determined during execution of the neural network.
Type: Application
Filed: December 29, 2021
Publication date: June 29, 2023
Applicant: Advanced Micro Devices, Inc.
Inventor: Hai Xiao
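The effect described here, small values flushing to zero in a low-precision format unless they are scaled up first, can be demonstrated with IEEE half precision. The sketch below is an illustration only: it assumes FP16 as the low-precision format, takes the smallest FP16 subnormal (2^-24) as the floor of the dynamic range, and uses hypothetical helper names `to_fp16` and `scale_for`. It does not guard against overflow of large values after scaling.

```python
import math
import struct

FP16_FLOOR = 2.0 ** -24  # smallest positive (subnormal) half-precision value


def to_fp16(x: float) -> float:
    """Round a Python float to IEEE half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def scale_for(values):
    """Pick a power-of-two scale that lifts the smallest nonzero magnitude
    to at least FP16_FLOOR (a scaling parameter determined at run time)."""
    smallest = min((abs(v) for v in values if v != 0.0), default=None)
    if smallest is None or smallest >= FP16_FLOOR:
        return 1.0
    return 2.0 ** math.ceil(math.log2(FP16_FLOOR / smallest))


if __name__ == "__main__":
    activations = [1e-9, 0.5]
    scale = scale_for(activations)
    print("unscaled:", [to_fp16(v) for v in activations])
    print("scale:", scale)
    print("scaled:", [to_fp16(v * scale) for v in activations])
```

A power-of-two scale is used so that scaling changes only the exponent and can be undone exactly in a higher-precision accumulator.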
-
Publication number: 20230206542
Abstract: A technique for performing ray tracing operations is provided. The technique includes processing small bounding box nodes in a box intersection test circuit to generate intersection test results for the small bounding box nodes; and processing large bounding box nodes in the box intersection test circuit to generate intersection test results for the large bounding box nodes.
Type: Application
Filed: December 28, 2021
Publication date: June 29, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Fataneh F. Ghodrat, Jeffrey Christopher Allan, Skyler Jonathon Saleh
-
Publication number: 20230206395
Abstract: A technique for performing convolution operations is disclosed. The technique includes performing a first convolution operation based on a first convolutional layer input image to generate at least a portion of a first convolutional layer output image; while performing the first convolution operation, performing a second convolution operation based on a second convolutional layer input image to generate at least a portion of a second convolutional layer output image, wherein the second convolutional layer input image is based on the first convolutional layer output image; storing the portion of the first convolutional layer output image in a first memory dedicated to storing image data for convolution operations; and storing the portion of the second convolutional layer output image in a second memory dedicated to storing image data for convolution operations.
Type: Application
Filed: December 29, 2021
Publication date: June 29, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Michael Y. Chow, Vidyashankar Viswanathan, Richard E. George
-
Publication number: 20230205435
Abstract: A phase training update circuit operates during a self-refresh cycle of a memory to perform a phase training update on individual bit lanes. The phase training update circuit adjusts a bit lane transmit phase offset forward a designated number of phase steps, transmits a training pattern, and determines a first number of errors in the transmission. It also adjusts the bit lane transmit phase offset backward the designated number of phase steps, transmits the training pattern, and determines a second number of errors in the transmission. Responsive to a difference between the first number of errors and the second number of errors, the phase training update circuit adjusts a center phase position for the bit lane transmit phase offset of the selected bit lane.
Type: Application
Filed: December 23, 2021
Publication date: June 29, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Scott P. Murphy, Huuhau M. Do
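The forward/backward probing described in this abstract can be modeled in a few lines. The sketch is a software illustration under stated assumptions, not the circuit itself: `update_center` and `count_errors` are hypothetical names, the center moves by one phase step per update, and it moves away from the side that produced more errors.

```python
def update_center(center, step, count_errors, threshold=0):
    """One phase training update for a single bit lane.

    center:       current center phase position (in phase steps).
    step:         designated number of phase steps to probe in each direction.
    count_errors: callable(phase) -> errors seen when the training pattern
                  is transmitted at that phase (stands in for the hardware).
    threshold:    minimum error difference that triggers an adjustment.
    """
    errors_forward = count_errors(center + step)    # first number of errors
    errors_backward = count_errors(center - step)   # second number of errors
    difference = errors_forward - errors_backward
    if difference > threshold:
        return center - 1   # forward edge is worse: shift center backward
    if difference < -threshold:
        return center + 1   # backward edge is worse: shift center forward
    return center           # balanced: keep the current center


if __name__ == "__main__":
    # Toy lane model: error-free within 3 steps of the true eye center at 10.
    def lane(phase):
        return max(0, abs(phase - 10) - 3)

    center = 12
    for _ in range(3):
        center = update_center(center, 4, lane)
        print(center)
```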
-
Publication number: 20230205420
Abstract: A technique for operating a memory system is disclosed. The technique includes performing a first request, by a first memory client, to access data at a first memory address, wherein the first memory address refers to data in a first memory section that is coupled to the first memory client via a direct memory connection; servicing the first request via the direct memory connection; performing a second request, by the first client, to access data at a second memory address, wherein the second memory address refers to data in a second memory section that is coupled to the first client via a cross connection; and servicing the second request via the cross connection.
Type: Application
Filed: December 29, 2021
Publication date: June 29, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Vidyashankar Viswanathan, Richard E. George, Michael Y. Chow
-
Publication number: 20230206973
Abstract: Methods and systems are disclosed for calibrating, by a memory interface system, an interface with dynamic random-access memory (DRAM) using a dynamically changing training clock. Techniques disclosed comprise receiving a system clock having a clock signal at a first pulse rate. Then, during the training of the interface, techniques disclosed comprise generating a training clock from the clock signal at the first pulse rate, the training clock having a clock signal at a second pulse rate, and sending, based on the generated training clock, command signals, including address data, to the DRAM.
Type: Application
Filed: December 29, 2021
Publication date: June 29, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Anwar Kashem, Craig Daniel Eaton, Pouya Najafi Ashtiani
-
Publication number: 20230205433
Abstract: Methods and systems are disclosed for frequency transitioning in a memory interface system. Techniques disclosed include receiving a signal indicative of a change in operating frequency, into a new frequency, in a processing unit interfacing with memory via the memory interface system; switching the system from a normal mode of operation into a transition mode of operation; updating control and state register (CSR) banks of respective transceivers of the system through a mission bus used during the normal mode of operation; and operating the system in the new frequency.
Type: Application
Filed: December 29, 2021
Publication date: June 29, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Anwar Kashem, Craig Daniel Eaton, Pouya Najafi Ashtiani
-
Publication number: 20230205680
Abstract: Methods and systems are disclosed for emulating, in a platform, the performance of a target platform. Techniques disclosed include receiving, by the platform, values of system features, associated with a target performance of the target platform; and setting, by the platform, one or more configuration knobs, based on the received values of system features, to match a performance of the platform to the target performance of the target platform.
Type: Application
Filed: December 28, 2021
Publication date: June 29, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Richard E. George, Vidyashankar Viswanathan, Michael Y. Chow
-
Patent number: 11687456
Abstract: An electronic device that handles memory accesses includes a memory and a processor that supports a plurality of streams. The processor acquires a graph that includes paths of operations in a set of operations for processing instances of data through a model, each path of operations including a separate sequence of operations from the set of operations that is to be executed using a respective stream from among the plurality of streams. The processor then identifies concurrent paths in the graph, the concurrent paths being paths of operations between split points at which two or more paths of operations diverge and merge points at which the two or more paths of operations merge. The processor next executes operations in each of the concurrent paths using a respective stream, the executing including using memory coloring for handling memory accesses in the memory for the operations in each concurrent path.
Type: Grant
Filed: December 21, 2021
Date of Patent: June 27, 2023
Assignee: Advanced Micro Devices, Inc.
Inventor: Mei Ye
-
Patent number: 11687281
Abstract: A memory controller includes a command queue and an arbiter for selecting entries from the command queue for transmission to a DRAM. The arbiter transacts streaks of consecutive read commands and streaks of consecutive write commands. The arbiter transacts a streak for at least a minimum burst length based on a number of commands of a designated type available to be selected by the arbiter. Following the minimum burst length, the arbiter decides to start a new streak of commands of a different type based on a first set of one or more conditions indicating intra-burst efficiency.
Type: Grant
Filed: March 31, 2021
Date of Patent: June 27, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Guanhao Shen, Ravindra Nath Bhargava
-
Patent number: 11687460
Abstract: Methods, devices, and systems for GPU cache injection. A GPU compute node includes a network interface controller (NIC) which includes NIC receiver circuitry which can receive data for processing on the GPU, NIC transmitter circuitry which can send the data to a main memory of the GPU compute node and which can send coherence information to a coherence directory of the GPU compute node based on the data. The GPU compute node also includes a GPU which includes GPU receiver circuitry which can receive the coherence information; GPU processing circuitry which can determine, based on the coherence information, whether the data satisfies a heuristic; and GPU loading circuitry which can load the data into a cache of the GPU from the main memory if the data satisfies the heuristic.
Type: Grant
Filed: April 26, 2017
Date of Patent: June 27, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Michael W. LeBeane, Walter B. Benton, Vinay Agarwala
-
Patent number: 11687251
Abstract: Systems and methods for dynamic repartitioning of physical memory address mapping involve relocating data stored at one or more physical memory locations of one or more memory devices to another memory device or mass storage device, repartitioning one or more corresponding physical memory maps to include new mappings between physical memory addresses and physical memory locations of the one or more memory devices, then loading the relocated data back onto the one or more memory devices at physical memory locations determined by the new physical address mapping. Such dynamic repartitioning of the physical memory address mapping does not require a processing system to be rebooted and has various applications in connection with interleaving reconfiguration and error correcting code (ECC) reconfiguration of the processing system.
Type: Grant
Filed: September 28, 2021
Date of Patent: June 27, 2023
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Joseph L. Greathouse, Alan D. Smith, Francisco L. Duran, Felix Kuehling, Anthony Asaro
-
Publication number: 20230195641
Abstract: Guided cache replacement is described. In accordance with the described techniques, a request to access a cache is received, and a cache replacement policy which controls loading data into the cache is accessed. The cache replacement policy includes a tree structure having nodes corresponding to cachelines of the cache and a traversal algorithm controlling traversal of the tree structure to select one of the cachelines. Traversal of the tree structure is guided using the traversal algorithm to select a cacheline to allocate to the request. The guided traversal modifies at least one decision of the traversal algorithm to avoid selection of a non-replaceable cacheline.
Type: Application
Filed: December 21, 2021
Publication date: June 22, 2023
Applicant: Advanced Micro Devices, Inc.
Inventor: Jeffrey Christopher Allan
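A guided traversal of this kind can be sketched on a tree pseudo-LRU structure. The abstract does not name the replacement policy, so the use of tree-PLRU, the heap layout of the decision bits, and the names `select_victim` and `locked` are assumptions of this sketch.

```python
def select_victim(bits, ways, locked):
    """Walk a binary decision tree to pick a cacheline, steering around
    non-replaceable lines.

    bits:   list of ways - 1 decision bits in heap order (children of node n
            are 2n + 1 and 2n + 2); 0 means go left, 1 means go right.
    ways:   number of cachelines in the set (assumed to be a power of two).
    locked: set of way indices that must not be replaced.
    Returns the selected way, or None if every way is non-replaceable.
    """
    def has_replaceable(lo, hi):
        return any(way not in locked for way in range(lo, hi))

    if not has_replaceable(0, ways):
        return None
    node, lo, hi = 0, 0, ways
    while hi - lo > 1:
        mid = (lo + hi) // 2
        go_right = bits[node] == 1          # decision of the unguided algorithm
        # Guidance: flip the decision when it leads only to locked lines.
        if go_right and not has_replaceable(mid, hi):
            go_right = False
        elif not go_right and not has_replaceable(lo, mid):
            go_right = True
        node = 2 * node + (2 if go_right else 1)
        lo, hi = (mid, hi) if go_right else (lo, mid)
    return lo


if __name__ == "__main__":
    print(select_victim([0, 0, 0], 4, set()))    # unguided choice
    print(select_victim([0, 0, 0], 4, {0}))      # steered away from way 0
    print(select_victim([0, 0, 0], 4, {0, 1}))   # steered to the right subtree
```

Checking whole subtrees before descending keeps the walk from reaching a dead end, at the cost of extra lookups that hardware would likely precompute.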
-
Publication number: 20230197123
Abstract: A method and apparatus for performing a simulated write in a computer system includes, responsive to a scheduled memory operation determined by a memory controller, sending a simulated write operation to physical layer circuitry (PHY) to increase circuit power without enabling the output of the PHY until the memory operation begins. Responsive to the memory operation being complete, a simulated write operation is sent to the PHY to decrease circuit power.
Type: Application
Filed: December 20, 2021
Publication date: June 22, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Anwar Kashem, Pouya Najafi Ashtiani, Craig Daniel Eaton, Kedarnath Balakrishnan
-
Publication number: 20230198528
Abstract: An apparatus includes a reference signal generator, a droop detection circuit, a digital frequency-locked loop (DFLL), and a DFLL control circuit. The reference signal generator receives a digital value and produces a pulse-density modulated signal based on the digital value. The droop detection circuit converts the pulse-density modulated signal to an analog signal, compares the analog signal to a monitored supply voltage, and responsive to detecting a droop of the monitored supply voltage below a designated value relative to the analog signal, produces a droop detection signal. The DFLL provides a clock signal for synchronizing circuitry within a domain of the monitored supply voltage. The DFLL control circuit, responsive to receiving the droop detection signal, causes the DFLL to slow the clock signal.
Type: Application
Filed: December 21, 2021
Publication date: June 22, 2023
Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Kaushik Mazumdar, Joyce Cheuk Wai Wong, Naeem Ibrahim Ally, Stephen Victor Kosonocky