Patents Assigned to Advanced Micro Devices

APPROXIMATION OF MATRICES FOR MATRIX MULTIPLY OPERATIONS

Publication number: 20220309126

Abstract: A processing device is provided which comprises memory configured to store data and a processor configured to receive a portion of data of a first matrix comprising a first plurality of elements and receive a portion of data of a second matrix comprising a second plurality of elements. The processor is also configured to determine values for a third matrix by dropping a number of products from products of pairs of elements of the first and second matrices based on approximating the products of the pairs of elements as a sum of the exponents of the pairs of elements and performing matrix multiplication on remaining products of the pairs of elements of the first and second matrices.

Type: Application

Filed: March 26, 2021

Publication date: September 29, 2022

Applicant: Advanced Micro Devices, Inc.

Inventors: Pramod Vasant Argade, Swapnil P. Sakharshete, Maxim V. Kazakov, Alexander M. Potapov
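
The abstract above describes dropping some element-pair products, chosen by approximating each product as the sum of the two elements' exponents, and multiplying only the remaining pairs. A minimal Python sketch of that idea follows. The function names, the `keep` parameter, and the rule of retaining the pairs with the largest exponent sums are assumptions made for illustration; the abstract does not state which products are dropped or how many.

```python
import math


def exponent(x):
    """Binary exponent of x as reported by math.frexp (0 when x is 0)."""
    return math.frexp(x)[1]


def approx_matmul(a, b, keep):
    """Multiply a (n x m) by b (m x p), keeping `keep` products per output element.

    Hypothetical selection rule: rank each pair by the sum of its exponents
    and multiply only the highest-ranked pairs.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            # The exponent sum stands in for the product's magnitude,
            # so no multiplication is needed to rank the pairs.
            ranked = sorted(
                (k for k in range(m) if a[i][k] != 0 and b[k][j] != 0),
                key=lambda k: exponent(a[i][k]) + exponent(b[k][j]),
                reverse=True,
            )
            # Only the retained pairs are actually multiplied and accumulated.
            c[i][j] = sum(a[i][k] * b[k][j] for k in ranked[:keep])
    return c
```

With `keep` equal to the inner dimension, the sketch reduces to an ordinary matrix multiply; smaller values trade accuracy for fewer multiplications.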
SYNCHRONIZATION FREE CROSS PASS BINNING THROUGH SUBPASS INTERLEAVING

Publication number: 20220309729

Abstract: A method of tiled rendering is provided which comprises dividing a frame to be rendered into a plurality of tiles, receiving commands to execute a plurality of subpasses of the tiles, and interleaving execution of same subpasses of multiple tiles of the frame. Interleaving execution of same subpasses of multiple tiles comprises executing a previously ordered first subpass of a second tile between execution of the previously ordered first subpass of a first tile and execution of a subsequently ordered second subpass of the first tile. The interleaving is performed, for example, by executing the plurality of subpasses in an order different from the order in which the commands to execute the plurality of subpasses are stored and issued. Alternatively, interleaving is performed by executing one or more subpasses as skip operations such that the plurality of subpasses are executed in the same order.

Type: Application

Filed: December 29, 2021

Publication date: September 29, 2022

Applicant: Advanced Micro Devices, Inc.

Inventors: Ruijin Wu, Mika Tuomi, Paavo Sampo Ilmari Pessi, Anirudh R. Acharya
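
The reordering variant in the abstract above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the tile and subpass labels, and the assumption that commands are stored tile by tile are not taken from the application.

```python
def interleave_subpasses(tiles, subpasses):
    """Return (issued, executed) command orders for the given tiles and subpasses.

    Assumption: commands are stored and issued tile by tile, with every
    subpass of one tile listed before the next tile starts.
    """
    issued = [(tile, sp) for tile in tiles for sp in subpasses]
    # Stable sort by subpass position: the same subpass of every tile runs
    # before any tile moves on to its next subpass.
    executed = sorted(issued, key=lambda cmd: subpasses.index(cmd[1]))
    return issued, executed
```

For two tiles and two subpasses, the executed order places the first subpass of the second tile between the first and second subpasses of the first tile, which is the ordering the abstract describes.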
Dynamic instances semantics

Patent number: 11455153

Abstract: A computing system includes a processor and a memory storing instructions for a compiler that, when executed by the processor, cause the processor to generate a control flow graph of program source code by receiving the program source code in the compiler, in the compiler, generating a structure point representation based on the received program source code by inserting into the program source code a set of structure points including an anchor structure point and a join structure point associated with the anchor structure point, and based on the structure point representation, generating the control flow graph including a plurality of blocks each representing a portion of the program source code. In the control flow graph, a block A between the anchor structure point and the join structure point post-dominates each of the one or more divergent branches between the anchor structure point and the join structure point.

Type: Grant

Filed: August 19, 2019

Date of Patent: September 27, 2022

Assignee: Advanced Micro Devices, Inc.

Inventor: Nicolai Haehnle
Variable precision computing system

Patent number: 11455766

Abstract: A processor selectively adjusts the precision of data for different functional units. Specified functional units of the processor, such as a shader processing unit of a graphics processing unit (GPU), include a zeroing module to store, based on the states of corresponding precision flags, a data value of zero at a specified portion of an input and/or output data operand. The functional unit then processes the data including the zeroed portion. Because a portion of the data has been zeroed, the functional unit consumes less power during data processing. Furthermore, the precision flags are set such that the reduced precision of the data does not significantly impact a user experience.

Type: Grant

Filed: September 18, 2018

Date of Patent: September 27, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Pramod V. Argade, Daniel Nikolai Peroni
Enhanced durability for systems on chip (SOCs)

Patent number: 11455251

Abstract: A system-on-chip with runtime global push to persistence includes a data processor having a cache, an external memory interface, and a microsequencer. The external memory interface is coupled to the cache and is adapted to be coupled to an external memory. The cache provides data to the external memory interface for storage in the external memory. The microsequencer is coupled to the data processor. In response to a trigger signal, the microsequencer causes the cache to flush the data by sending the data to the external memory interface for transmission to the external memory.

Type: Grant

Filed: November 11, 2020

Date of Patent: September 27, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Alexander J. Branover, Kevin M. Lepak, William A. Moyes
Power state transitions

Patent number: 11455025

Abstract: A computer processing device transitions among a plurality of power management states and at least one power management sub-state. From a first state, it is determined whether an entry condition for a third state is satisfied. If the entry condition for the third state is satisfied, the third state is entered. If the entry condition for the third state is not satisfied, it is determined whether an entry condition for the first sub-state is satisfied. If the entry condition for the first sub-state is determined to be satisfied, the first sub-state is entered, a first sub-state residency timer is started, and after expiry of the first sub-state residency timer, the first sub-state is exited, the first state is re-entered, and it is re-determined whether the entry condition for the third state is satisfied.

Type: Grant

Filed: September 14, 2020

Date of Patent: September 27, 2022

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Xiaojie He, Alexander J. Branover, Mihir Shaileshbhai Doctor, Evgeny Mintz, Fei Fei, Ming So, Felix Yat-Sum Ho, Biao Zhou
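
The transition sequence in the abstract above can be traced with a small Python sketch. The state labels, the function name, and the `max_checks` bound are hypothetical, and expiry of the residency timer is modeled as immediate. The abstract does not say what happens when neither entry condition holds; this sketch simply stays in the first state.

```python
def run_power_states(third_ready, substate_ready, max_checks=10):
    """Return the sequence of states visited, starting from the first state.

    third_ready and substate_ready are callables that report whether the
    entry condition for the third state or the first sub-state is satisfied.
    """
    trace = ["S1"]
    for _ in range(max_checks):
        if third_ready():
            trace.append("S3")  # entry condition met: enter the third state
            break
        if not substate_ready():
            break  # assumption: remain in the first state
        trace.append("S1.sub")  # enter the sub-state; residency timer starts
        # Timer expiry is modeled as immediate: exit the sub-state,
        # re-enter the first state, and re-check the third state's condition.
        trace.append("S1")
    return trace
```

Passing callables rather than booleans lets the entry conditions change between checks, which is what makes the re-determination step after the residency timer meaningful.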
Multi-class multi-label classification using clustered singular decision trees for hardware adaptation

Patent number: 11455252

Abstract: Techniques for generating a model for predicting when different hybrid prefetcher configurations should be used are disclosed. Techniques for using the model to predict when different hybrid prefetcher configurations should be used are also disclosed. The techniques for generating the model include obtaining a set of input data, and generating trees based on the training data. Each tree is associated with a different hybrid prefetcher configuration and the trees output certainty scores for the associated hybrid prefetcher configuration based on hardware feature measurements. To decide on a hybrid prefetcher configuration to use, a prefetcher traverses multiple trees to obtain certainty scores for different hybrid prefetcher configurations and identifies a hybrid prefetcher configuration to use based on a comparison of the certainty scores.

Type: Grant

Filed: June 26, 2019

Date of Patent: September 27, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: John Kalamatianos, Paul S. Keltcher, Mayank Chhablani, Alok Garg, Furkan Eris
Setting values of portions of registers based on bit values

Patent number: 11451241

Abstract: A processor employs a set of bits to indicate values of portions of registers of a register file. In response to a specified instruction indicating an expected change of instruction types to be executed, the processor sets one or more of the bits and, for subsequent instructions, interprets corresponding portions of the registers as having a specified value (e.g., zero). By employing the set of bits to set the values of the register portions, rather than setting the individual portions of the registers to the specified value, the processor conserves processor resources (e.g., power) when the processor transitions between executing instructions of different types.

Type: Grant

Filed: December 14, 2017

Date of Patent: September 20, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Erik Swanson, Sneha V. Desai, Michael Estlick
OVERLAPPED CURVE MAPPING FOR HISTOGRAM-BASED LOCAL TONE AND LOCAL CONTRAST

Publication number: 20220292653

Abstract: Methods and apparatuses are disclosed herein for performing tone mapping and/or contrast enhancement. In some examples, a block mapping curve is low-pass filtered with block mapping curves of surrounding blocks to form a smoothed block mapping curve. In some examples, overlapped curve mapping of block mapping curves, including smoothed block mapping curves, is performed, including weighting, based on a pixel location, block mapping curves of a group of blocks to generate an interpolated block mapping curve and applying the interpolated block mapping curve to a pixel to perform tone mapping and/or contrast enhancement.

Type: Application

Filed: June 1, 2022

Publication date: September 15, 2022

Applicant: Advanced Micro Devices, Inc.

Inventor: Ying-Ru Chen
Separate clocking for components of a graphics processing unit

Patent number: 11442495

Abstract: Systems and methods related to controlling clock signals for clocking shader engine modules (SEs) and non-shader-engine modules (nSEs) of a graphics processing unit (GPU) are provided. One or more dividers receive a clock signal CLK and output a clock signal CLKA to the SEs and output a clock signal CLKB to the nSEs. The frequencies of CLKA and CLKB are independently selected based on sets of performance counter data monitored at the SEs and nSEs, respectively. The clock signal frequency for either the SEs or the nSEs is reduced when the corresponding sets of performance counter data indicate a comparatively lower processing workload for the SEs or for the nSEs.

Type: Grant

Filed: September 25, 2020

Date of Patent: September 13, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Ranjith Kumar Sajja, Sreekanth Godey, Anirudh R. Acharya
Secure computer vision processing

Patent number: 11443051

Abstract: A computer vision processor in an image cluster defines a fenced memory region (FMR) that controls access to image data stored in a first portion of a trusted memory region (TMR). The computer vision processor receives FMR requests from an application implemented in a processing cluster. The FMR requests are to access the image data in the first portion of the TMR. The computer vision processor selectively allows the requesting application to access the image data. In some cases, the computer vision processor acquires the image data and stores the image data in the first portion of the TMR, such as buffers in the TMR. A data fabric selectively permits the image processing application to access the data stored in the TMR based on whether the image cluster has opened or closed the FMR for the portion of the TMR.

Type: Grant

Filed: December 20, 2018

Date of Patent: September 13, 2022

Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC

Inventors: Benjamin Koon Pan Chan, William Lloyd Atkinson, Tung Chuen Kwong, Guhan Krishnan
DATA CACHE REGION PREFETCHER

Publication number: 20220283955

Abstract: A method, system, and processing system for pre-fetching data are disclosed. The method, system, and processing system include data cache region prefetch circuitry for detecting a first access by a first instruction at a first instruction address to a first memory portion, detecting a first non-sequential access pattern to a set of addresses in the first memory portion, and in response to a miss by a second instruction at the first instruction address, and in response to the non-sequential access pattern occurring, pre-fetching data according to the first non-sequential access pattern.

Type: Application

Filed: May 24, 2022

Publication date: September 8, 2022

Applicant: Advanced Micro Devices, Inc.

Inventors: Donald W. McCauley, William E. Jones
Folded cell layout for 6T SRAM cell

Patent number: 11437316

Abstract: A layout for a 6T SRAM cell is disclosed. The cell layout takes a conventional 6T SRAM cell layout and restructures the layout into a more square cell layout with a single p-channel and a single n-channel across the width of the cell. Restructuring the cell layout reduces the height of wordlines and allows dual wordlines to be placed in the cell to reduce wordline resistance in the cell. Dual pairs of bitlines may also be placed in separate metal layers in the cell layout to reduce bitline resistance.

Type: Grant

Filed: September 24, 2020

Date of Patent: September 6, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Richard T. Schultz, John J. Wuu
Proactive management of inter-GPU network links

Patent number: 11436060

Abstract: Systems, apparatuses, and methods for proactively managing inter-processor network links are disclosed. A computing system includes at least a control unit and a plurality of processing units. Each processing unit of the plurality of processing units includes a compute module and a configurable link interface. The control unit dynamically adjusts a clock frequency and a link width of the configurable link interface of each processing unit based on a data transfer size and layer computation time of a plurality of layers of a neural network so as to reduce execution time of each layer. By adjusting the clock frequency and the link width of the link interface on a per-layer basis, the overlapping of communication and computation phases is closely matched, allowing layers to complete more quickly.

Type: Grant

Filed: August 27, 2019

Date of Patent: September 6, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Karthik Rao, Abhinav Vishnu
Neural network power management in a multi-GPU system

Patent number: 11435813

Abstract: Systems, apparatuses, and methods for managing power consumption for a neural network implemented on multiple graphics processing units (GPUs) are disclosed. A computing system includes a plurality of GPUs implementing a neural network. In one implementation, the plurality of GPUs draw power from a common power supply. To prevent the power consumption of the system from exceeding a power limit for long durations, the GPUs coordinate the scheduling of tasks of the neural network. At least one or more first GPUs schedule their computation tasks so as not to overlap with the computation tasks of one or more second GPUs. In this way, the system spends less time consuming power in excess of a power limit, allowing the neural network to be implemented in a more power efficient manner.

Type: Grant

Filed: August 29, 2018

Date of Patent: September 6, 2022

Assignee: Advanced Micro Devices, Inc.

Inventor: Greg Sadowski
Techniques for improving operand caching

Patent number: 11436016

Abstract: A technique for determining whether a register value should be written to an operand cache or whether the register value should remain in and not be evicted from the operand cache is provided. The technique includes executing an instruction that accesses an operand that comprises the register value, performing one or both of a lookahead technique and a prediction technique to determine whether the register value should be written to an operand cache or whether the register value should remain in and not be evicted from the operand cache, and based on the determining, updating the operand cache.

Type: Grant

Filed: December 4, 2019

Date of Patent: September 6, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Anthony T. Gutierrez, Bradford M. Beckmann, Marcus Nathaniel Chow
Offset-aligned three-dimensional integrated circuit

Patent number: 11437359

Abstract: A method for manufacturing a three-dimensional integrated circuit includes attaching a first side of a first die to a first carrier wafer. The method includes preparing a second side of the first die to generate a prepared second side of the first die. The method includes attaching the prepared second side of the first die to a second carrier wafer. The method includes removing the first carrier wafer from the first side of the first die to form a transitional three-dimensional integrated circuit. The method includes attaching a third carrier wafer to a first side of the transitional three-dimensional integrated circuit. The method includes attaching a first side of the second die to a second side of the transitional three-dimensional integrated circuit.

Type: Grant

Filed: February 24, 2020

Date of Patent: September 6, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Brett P. Wilkerson, Milind S. Bhagavat, Rahul Agarwal, Dmitri Yudanov
Neural network internal data fast access memory buffer

Patent number: 11436486

Abstract: Systems, apparatuses, and methods for optimizing neural network training with a first-in, last-out (FILO) buffer are disclosed. A processor executes a training run of a neural network implementation by performing multiple passes and adjusting weights of the neural network layers on each pass. Each training phase includes a forward pass and a backward pass. During the forward pass, each layer, in order from first layer to last layer, stores its weights in the FILO buffer. An error is calculated for the neural network at the end of the forward pass. Then, during the backward pass, each layer, in order from last layer to first layer, retrieves the corresponding weights from the FILO buffer. Gradients are calculated based on the error so as to update the weights of the layer for the next pass through the neural network.

Type: Grant

Filed: August 19, 2019

Date of Patent: September 6, 2022

Assignee: Advanced Micro Devices, Inc.

Inventor: Greg Sadowski
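
The reason a first-in, last-out buffer fits the abstract above is that the backward pass visits layers in exactly the reverse of the forward pass, so a plain stack returns each layer's weights without any addressing. A minimal Python sketch follows, assuming a list used as the stack and placeholder weight values; the function name is hypothetical.

```python
def forward_then_backward(layer_weights):
    """Push weights in forward order, pop them in backward order.

    Returns (layer_index, weights) pairs in the order the backward pass
    sees them.
    """
    filo = []
    # Forward pass: first layer to last layer, each stores its weights.
    for weights in layer_weights:
        filo.append(weights)
    retrieved = []
    # Backward pass: last layer to first layer, each pops its own weights.
    for index in reversed(range(len(layer_weights))):
        retrieved.append((index, filo.pop()))
    return retrieved
```

Each pop returns the weights of the layer currently being processed, which is the property the abstract relies on.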
HYBRID RENDER WITH DEFERRED PRIMITIVE BATCH BINNING

Publication number: 20220277508

Abstract: A method, computer system, and a non-transitory computer-readable storage medium for performing primitive batch binning are disclosed. The method, computer system, and non-transitory computer-readable storage medium include techniques for generating a primitive batch from a plurality of primitives, computing respective bin intercepts for each of the plurality of primitives in the primitive batch, and shading the primitive batch by iteratively processing each of the respective bin intercepts computed until all of the respective bin intercepts are processed.

Type: Application

Filed: May 16, 2022

Publication date: September 1, 2022

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Michael Mantor, Laurent Lefebvre, Mark Fowler, Timothy Kelley, Mikko Alho, Mika Tuomi, Kiia Kallio, Patrick Klas Rudolf Buss, Jari Antero Komppa, Kaj Tuomi
Speculative hint-triggered activation of pages in memory

Patent number: 11429281

Abstract: Systems, apparatuses, and methods for performing efficient memory accesses for a computing system are disclosed. In various embodiments, a computing system includes a computing resource and a memory controller coupled to a memory device. The computing resource selectively generates a hint that includes a target address of a memory request generated by the processor. The hint is sent outside the primary communication fabric to the memory controller. The hint conditionally triggers a data access in the memory device. When no page in a bank targeted by the hint is open, the memory controller processes the hint by opening a target page of the hint without retrieving data. The memory controller drops the hint if there are other pending requests that target the same page or the target page is already open.

Type: Grant

Filed: April 6, 2020

Date of Patent: August 30, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Ravindra N. Bhargava, Philip S. Park, Vydhyanathan Kalyanasundharam, James Raymond Magro
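
The hint-handling rules in the abstract above amount to a short decision procedure, sketched here in Python. The function name, the data structures, and the return labels are hypothetical. The abstract does not cover the case where a different page is already open in the targeted bank; this sketch drops the hint in that case as an assumption.

```python
def handle_hint(open_pages, pending_pages, bank, page):
    """Decide what a memory controller does with a hint for (bank, page).

    open_pages maps each bank to its open page, or None when no page is open.
    pending_pages is a set of (bank, page) pairs targeted by pending requests.
    """
    if (bank, page) in pending_pages:
        return "drop"  # a pending request already targets this page
    current = open_pages.get(bank)
    if current == page:
        return "drop"  # the target page is already open
    if current is None:
        open_pages[bank] = page  # open the page without retrieving data
        return "activate"
    return "drop"  # assumption: a different page is open in this bank
```

Because a hint only opens a page and never retrieves data, dropping one costs nothing in correctness; the later memory request still performs the actual access.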
