Patents Assigned to Advanced Micro Devices, Inc.

Kernel software driven color remapping of rendered primary surfaces

Patent number: 11915359

Abstract: Systems, apparatuses, and methods for implementing kernel software driven color remapping of rendered primary surfaces are disclosed. A system includes at least a general processor, a graphics processor, and a memory. The general processor executes a user-mode application, a user-mode driver, and a kernel-mode driver. A primary surface is rendered on the graphics processor on behalf of the user-mode application. The primary surface is stored in memory locations allocated for the primary surface by the user-mode driver, and the kernel-mode driver is notified when the primary surface is ready to be displayed. Rather than displaying the primary surface, the kernel-mode driver causes the pixels of the primary surface to be remapped on the graphics processor using a selected lookup table (LUT) so as to generate a remapped surface, which is stored in memory locations allocated for the remapped surface by the user-mode driver. Then, the remapped surface is displayed.

Type: Grant

Filed: December 12, 2019

Date of Patent: February 27, 2024

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Jason Wen-Tse Wu, Parimalkumar Patel, Jia Hui Li, Chao Zhan
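
As a rough illustration of the remapping step described in the abstract above, the following Python sketch applies a lookup table to every color channel of a small surface. The surface layout (rows of RGB tuples), the 256-entry table, and the names `remap_surface` and `invert_lut` are assumptions made for this example; the abstract does not specify them, and the patented system performs the remap on the graphics processor rather than in host code.

```python
def remap_surface(surface, lut):
    """Return a new surface whose channel values are looked up in `lut`.

    `surface` is a list of rows, each row a list of (r, g, b) tuples with
    8-bit channel values. The input is left untouched, mirroring the
    abstract's separate allocation for the remapped surface.
    """
    return [[tuple(lut[c] for c in pixel) for pixel in row] for row in surface]


# Hypothetical LUT that inverts each 8-bit channel.
invert_lut = [255 - v for v in range(256)]

primary = [[(0, 128, 255), (10, 20, 30)]]
remapped = remap_surface(primary, invert_lut)
```

The primary surface is kept as-is and only the remapped copy would be handed to the display path.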

Single pass downsampler

Patent number: 11915337

Abstract: Systems, apparatuses, and methods for implementing a downsampler in a single compute shader pass are disclosed. A central processing unit (CPU) issues a single-pass compute shader kernel to perform downsampling of a texture on a graphics processing unit (GPU). The GPU includes a plurality of compute units for executing thread groups of the kernel. Each thread group fetches a patch of the texture, and each individual thread downsamples four quads of texels to compute mip levels 1 and 2 independently of the other threads. For mip level 3, texel data is written back over one of the local data share (LDS) entries from which the texel data was loaded. This eliminates the need for a barrier between loads and stores for computing mip level 3. The remaining mip levels are computed in a similar fashion by the thread groups of the single-pass kernel.

Type: Grant

Filed: February 23, 2021

Date of Patent: February 27, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Lou Isabelle Kramer, Matthäus G. Chajdas

Stack-based ray traversal with dynamic multiple-node iterations

Patent number: 11908065

Abstract: A technique for performing ray tracing operations is provided. The technique includes, in response to detecting that a threshold number of traversal stage work-items of a wavefront have terminated, increasing intersection test parallelization for non-terminated work-items.

Type: Grant

Filed: June 20, 2022

Date of Patent: February 20, 2024

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Daniel James Skinner, Michael John Livesley, David William John Pankratz

Processor with multiple op cache pipelines

Patent number: 11907126

Abstract: A processor employs a plurality of op cache pipelines to concurrently provide previously decoded operations to a dispatch stage of an instruction pipeline. In response to receiving a first branch prediction at a processor, the processor selects a first op cache pipeline of the plurality of op cache pipelines of the processor based on the first branch prediction, and provides a first set of operations associated with the first branch prediction to the dispatch queue via the selected first op cache pipeline.

Type: Grant

Filed: December 9, 2020

Date of Patent: February 20, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Robert B. Cohen, Tzu-Wei Lin, Anthony J. Bybell, Sudherssen Kalaiselvan, James Mossman

Delay-locked loop offset calibration and correction

Patent number: 11909404

Abstract: A clocking circuit is provided using a master delay-locked loop (DLL) and a slave DLL. A master DLL code indicates a delay adjustment made at a master DLL. A delay of a slave DLL is adjusted based on the master DLL code. A replica phase detector at the slave DLL is temporarily enabled during an interface idle period. A slave DLL code is determined, and a configuration value is determined based on the slave DLL code relative to the master DLL code. The replica phase detector is then disabled.

Type: Grant

Filed: December 12, 2022

Date of Patent: February 20, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Andy Huei Chu, Karthik Gopalakrishnan, Pradeep Jayaraman

Chipset Attached Random Access Memory

Publication number: 20240053891

Abstract: Random access memory (RAM) is attached to an input/output (I/O) controller of a chipset (e.g., on a motherboard). This chipset attached RAM is optionally used as part of a tiered storage solution with other tiers including, for example, nonvolatile memory (e.g., a solid state drive (SSD)) or a hard disk drive. The chipset attached RAM is separate from the system memory, allowing the chipset attached RAM to be used to speed up access to frequently used data stored in the tiered storage solution without reducing the amount of system memory available to an operating system running on the one or more processing units.

Type: Application

Filed: August 12, 2022

Publication date: February 15, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: William Robert Alverson, Amitabh Mehra, Jerry Anton Ahrens, Grant Evan Ley, Anil Harwani, Joshua Taylor Knight

ADAPTIVE QUANTIZATION FOR NEURAL NETWORKS

Publication number: 20240054332

Abstract: Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the distribution of ANN information and reselect the quantization function from the set of quantization functions based on the resampled distribution if the output does not sufficiently correlate with a known correct output. In some examples, the ANN information includes a set of training data. In some examples, the ANN information includes a plurality of link weights.

Type: Application

Filed: October 27, 2023

Publication date: February 15, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Daniel I. Lowell, Sergey Voronov, Mayank Daga

Iterative indirect command buffers

Patent number: 11900499

Abstract: A technique for executing commands for an accelerated processing device is provided. The technique includes obtaining an iteration number and predication data from metadata for an iterative indirect command buffer; for each iteration indicated by the iteration number, performing commands of the iterative indirect command buffer as specified by the predication data; and ending processing of the iterative indirect command buffer in response to processing a number of iterations equal to the iteration number.

Type: Grant

Filed: September 22, 2020

Date of Patent: February 13, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Anirudh Rajendra Acharya, Ruijin Wu, Alexander Fuad Ashkar, Harry J. Wise
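
The loop described in the abstract above can be sketched in Python as follows. The metadata layout (an `"iterations"` count plus one list of per-command boolean flags per iteration) and the name `run_iterative_icb` are hypothetical choices for this example; the abstract only says that the iteration number and predication data come from metadata.

```python
def run_iterative_icb(commands, metadata):
    """Execute `commands` once per iteration, skipping predicated-off ones.

    `metadata` is a hypothetical dict with:
      - "iterations": how many times to walk the buffer
      - "predication": one list of booleans per iteration, one flag per command
    Returns the log of (iteration, command result) pairs that were executed.
    """
    log = []
    for i in range(metadata["iterations"]):
        enabled = metadata["predication"][i]
        for command, run_it in zip(commands, enabled):
            if run_it:
                log.append((i, command(i)))
    # Processing ends once the iteration count has been reached.
    return log


commands = [lambda i: f"draw{i}", lambda i: f"dispatch{i}"]
metadata = {"iterations": 2, "predication": [[True, False], [True, True]]}
log = run_iterative_icb(commands, metadata)
```

In this toy run the second command is predicated off during the first iteration and enabled during the second.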

Marker-based processor instruction grouping

Patent number: 11900123

Abstract: A system includes a processing unit such as a GPU that itself includes a command processor configured to receive instructions for execution from a software application. A processor pipeline coupled to the processing unit includes a set of parallel processing units for executing the instructions in sets. A set manager is coupled to one or more of the processor pipeline and the command processor. The set manager includes at least one table for storing a set start time, a set end time, and a set execution time. The set manager determines an execution time for one or more sets of instructions of a first window of sets of instructions submitted to the processor pipeline. Based on the execution time of the one or more sets of instructions, a set limit is determined and applied to one or more sets of instructions of a second window subsequent to the first window.

Type: Grant

Filed: December 13, 2019

Date of Patent: February 13, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Alexander Fuad Ashkar, Manu Rastogi, Harry J. Wise

Instant auto-focus with distance estimation

Patent number: 11902658

Abstract: Systems, apparatuses, and methods for implementing an instant auto-focus mechanism with distance estimation are disclosed. A camera includes at least an image sensor, one or more movement and/or orientation sensors, a timer, a lens, and a control circuit. The control circuit receives first and second images captured by the image sensor of a given scene. The control circuit calculates a distance between first and second camera locations when the first and second images, respectively, were captured based on the one or more movement and/or orientation sensors and the timer. Next, the control circuit calculates an estimate of a second distance between the camera and an object in the scene based on the distance between camera locations and angles between the camera and the object from the first and second locations. Then, the control circuit causes the lens to be adjusted to bring the object into focus for subsequent images.

Type: Grant

Filed: August 31, 2020

Date of Patent: February 13, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Po-Min Wang, Yu-Huai Chen
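
One way to picture the distance estimate described in the abstract above is plain triangulation with the law of sines. The sketch below is an assumption, not the patented method: the abstract says the estimate uses the distance between camera locations and the angles to the object, but gives no formula. The function name and the convention that both angles are interior angles measured against the baseline are choices made for this example.

```python
import math


def distance_to_object(baseline, angle_first, angle_second):
    """Estimate the distance from the second camera location to the object.

    `baseline` is the distance between the two camera locations.
    `angle_first` and `angle_second` are the interior angles (in radians)
    between the baseline and the line of sight to the object at the first
    and second locations.
    """
    angle_at_object = math.pi - angle_first - angle_second
    if angle_at_object <= 0:
        raise ValueError("lines of sight do not converge on an object")
    # Law of sines: the side opposite angle_first, divided by
    # sin(angle_first), equals the baseline divided by sin(angle_at_object).
    return baseline * math.sin(angle_first) / math.sin(angle_at_object)


d = distance_to_object(1.0, math.radians(90), math.radians(45))
```

With a baseline of 1 and angles of 90 and 45 degrees, the object sits directly ahead of the first location at distance 1, so the result is the diagonal of a unit square.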

Dynamic cache bypass for power savings

Patent number: 11899520

Abstract: A technique for operating a cache is disclosed. The technique includes, in response to a power down trigger that indicates that the cache effectiveness is considered to be low, powering down the cache.

Type: Grant

Filed: April 26, 2022

Date of Patent: February 13, 2024

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Ashish Jain, Benjamin Tsien, Chintan S. Patel, Vydhyanathan Kalyanasundharam, Shang Yang

METHODS FOR CONSTRUCTING PACKAGE SUBSTRATES WITH HIGH DENSITY

Publication number: 20240047228

Abstract: A disclosed method can include (i) positioning a first surface of a component of a semiconductor device on a first plated through-hole, (ii) covering, with a layer of dielectric material, at least a second surface of the component that is opposite the first surface of the component, (iii) removing a portion of the layer of dielectric material covering the second surface of the component to form at least one cavity, and (iv) depositing conductive material in the cavity to form a second plated through-hole on the second surface of the component. Various other apparatuses, systems, and methods are also disclosed.

Type: Application

Filed: August 2, 2022

Publication date: February 8, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Sri Ranga Sai Boyapati, Deepak Vasant Kulkarni, Raja Swaminathan, Brett P. Wilkerson, Arsalan Alam

FINE-GRAINED CONDITIONAL DISPATCHING

Publication number: 20240045718

Abstract: Techniques for executing workgroups are provided. The techniques include executing, for a first workgroup of a first kernel dispatch, a workgroup dependency instruction that includes an indication to prioritize execution of a second workgroup of a second kernel dispatch, and in response to the workgroup dependency instruction, dispatching the second workgroup of the second kernel dispatch prior to dispatching a third workgroup of the second kernel dispatch, wherein no workgroup dependency instruction including an indication to prioritize execution of the third workgroup has been executed.

Type: Application

Filed: October 17, 2023

Publication date: February 8, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Alexandru Dutu, Marcus Nathaniel Chow, Matthew D. Sinclair, Bradford M. Beckmann, David A. Wood

Dynamic hardware selection for experts in mixture-of-experts model

Patent number: 11893502

Abstract: A system assigns experts of a mixture-of-experts artificial intelligence model to processing devices in an automated manner. The system includes an orchestrator component that maintains priority data that stores, for each of a set of experts, and for each of a set of execution parameters, ranking information that ranks different processing devices for the particular execution parameter. In one example, for the execution parameter of execution speed, and for a first expert, the priority data indicates that a central processing unit (“CPU”) executes the first expert faster than a graphics processing unit (“GPU”). In this example, for the execution parameter of power consumption, and for the first expert, the priority data indicates that a GPU uses less power than a CPU. The priority data stores such information for one or more processing devices, one or more experts, and one or more execution characteristics.

Type: Grant

Filed: December 20, 2017

Date of Patent: February 6, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Nicholas Malaya, Nuwan Jayasena

DYNAMIC RANDOM-ACCESS MEMORY (DRAM) PHASE TRAINING UPDATE

Publication number: 20240036748

Abstract: A phase training update circuit operates to perform a phase training update on individual bit lanes. The phase training update circuit adjusts a bit lane transmit phase offset forward a designated number of phase steps, transmits a training pattern, and determines a first number of errors in the transmission. It also adjusts the bit lane transmit phase offset backward the designated number of phase steps, transmits the training pattern, and determines a second number of errors in the transmission. Responsive to a difference between the first number of errors and the second number of errors, the phase training update circuit adjusts a center phase position for the bit lane transmit phase offset of the selected bit lane.

Type: Application

Filed: October 11, 2023

Publication date: February 1, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Scott P. Murphy, Huuhau M. Do
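
The forward/backward probe described in the abstract above can be sketched in Python as follows. The abstract states only that the center is adjusted in response to a difference between the two error counts; moving the center by one phase step toward the side with fewer errors is an assumption of this sketch, as are the names `update_center_phase`, `count_errors`, and the toy lane model.

```python
def update_center_phase(center, step, count_errors):
    """Return a possibly adjusted center phase for one bit lane.

    `count_errors(phase)` is a hypothetical callback that transmits the
    training pattern at the given transmit phase offset and returns the
    number of errors observed.
    """
    errors_forward = count_errors(center + step)
    errors_backward = count_errors(center - step)
    if errors_forward < errors_backward:
        return center + 1  # fewer errors ahead: nudge the center forward
    if errors_backward < errors_forward:
        return center - 1  # fewer errors behind: nudge the center backward
    return center          # balanced: the center is left where it is


# Toy lane model: error-free within 5 phase steps of phase 20.
def toy_lane(phase):
    return max(0, abs(phase - 20) - 5)


center = 17
for _ in range(10):
    center = update_center_phase(center, 6, toy_lane)
```

Repeating the update walks the center toward the middle of the error-free window and then holds it there once both probes report the same error count.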

DYNAMIC PERFORMANCE ADJUSTMENT

Publication number: 20240037031

Abstract: A technique for operating a device is disclosed. The technique includes recording log data for the device; analyzing the log data to determine one or more performance settings adjustments to apply to the device; and applying the one or more performance settings adjustments to the device.

Type: Application

Filed: September 28, 2023

Publication date: February 1, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Christopher J. Brennan, Akshay Lahiry

Distributing power shared between an accelerated processing unit and a discrete graphics processing unit

Patent number: 11886878

Abstract: An integrated coprocessor such as an accelerated processing unit (APU) generates commands for execution on a discrete coprocessor such as a discrete graphics processing unit (dGPU). Power distribution circuitry selectively provides power to the APU and the dGPU based on characteristics of workloads executing on the APU and the dGPU and based on a platform power limit that is shared by the APU and the dGPU. In some cases, the power distribution circuitry determines a first power provided to the APU and a second power provided to the dGPU. The power distribution circuitry increases the second power provided to the dGPU in response to a sum of the first and second powers being less than the platform power limit. In some cases, the power distribution circuitry modifies the power provided to the APU, the dGPU, or both in response to changes in temperatures measured by a set of sensors.

Type: Grant

Filed: December 12, 2019

Date of Patent: January 30, 2024

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Sukesh Shenoy, Adam N. C. Clark, Indrani Paul
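
The headroom rule described in the abstract above (raise the dGPU's power when the sum of the two powers is below the shared platform limit) can be sketched in a few lines of Python. Handing all of the unused headroom to the dGPU is an assumption of this sketch; the abstract does not say by how much the power is increased, and the name `rebalance` is hypothetical.

```python
def rebalance(apu_power, dgpu_power, platform_limit):
    """Return (apu_power, dgpu_power) after giving unused headroom to the dGPU.

    All values are in watts. If the two powers already meet or exceed the
    shared platform limit, they are returned unchanged.
    """
    headroom = platform_limit - (apu_power + dgpu_power)
    if headroom > 0:
        dgpu_power += headroom
    return apu_power, dgpu_power


new_apu, new_dgpu = rebalance(15, 60, 100)
```

Workload characteristics and temperature, which the abstract also mentions as inputs, are left out of this sketch.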

Core selection based on usage policy and core constraints

Patent number: 11886224

Abstract: A processing unit of a processing system compiles a priority queue listing of a plurality of processor cores to run a workload based on a cost of running the workload on each of the processor cores. The cost is based on at least one of a system usage policy, characteristics of the workload, and one or more physical constraints of each processor core. The processing unit selects a processor core based on the cost to run the workload and communicates an identifier of the selected processor core to an operating system of the processing system.

Type: Grant

Filed: July 31, 2020

Date of Patent: January 30, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Leonardo De Paula Rosa Piga, Karthik Rao, Indrani Paul, Mahesh Subramony, Kenneth Mitchell, Dana Glenn Lewis, Sriram Sambamurthy, Wonje Choi
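
A cost-ordered priority queue of the kind described in the abstract above can be sketched with Python's `heapq` module. The cost formula, the field names (`max_freq_ghz`, `watts`, `temp_c`, `limit_c`), the policy weights, and the rule that an over-temperature core gets infinite cost are all assumptions made for this example; the abstract only says that the cost depends on a usage policy, workload characteristics, and physical constraints of each core.

```python
import heapq


def select_core(cores, workload, policy):
    """Return the id of the cheapest core for `workload` under `policy`."""
    heap = []
    for core in cores:
        if core["temp_c"] >= core["limit_c"]:
            cost = float("inf")  # physical constraint: core is too hot
        else:
            cost = (policy["perf_weight"] * workload["work"] / core["max_freq_ghz"]
                    + policy["power_weight"] * core["watts"])
        heapq.heappush(heap, (cost, core["id"]))
    # The smallest (cost, id) pair sits at the front of the heap.
    return heap[0][1]


cores = [
    {"id": 0, "max_freq_ghz": 4.5, "watts": 12.0, "temp_c": 70, "limit_c": 95},
    {"id": 1, "max_freq_ghz": 3.0, "watts": 5.0, "temp_c": 60, "limit_c": 95},
    {"id": 2, "max_freq_ghz": 5.0, "watts": 15.0, "temp_c": 96, "limit_c": 95},
]
workload = {"work": 1.0}
fast = select_core(cores, workload, {"perf_weight": 1.0, "power_weight": 0.0})
frugal = select_core(cores, workload, {"perf_weight": 0.0, "power_weight": 1.0})
```

Under the performance-first policy the fastest core that is within its thermal limit wins; under the power-first policy the lowest-wattage core wins. The returned identifier stands in for the value that the abstract says is communicated to the operating system.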

MULTI-ACCELERATOR COMPUTE DISPATCH

Publication number: 20240029336

Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.

Type: Application

Filed: October 3, 2023

Publication date: January 25, 2024

Applicant: Advanced Micro Devices, Inc.

Inventors: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak

Using multiple functional blocks for training neural networks

Patent number: 11880769

Abstract: A system is described that performs training operations for a neural network, the system including an analog circuit element functional block with an array of analog circuit elements, and a controller. The controller monitors error values computed using an output from each of one or more initial iterations of a neural network training operation, the one or more initial iterations being performed using neural network data acquired from the memory. When one or more error values are less than a threshold, the controller uses the neural network data from the memory to configure the analog circuit element functional block to perform remaining iterations of the neural network training operation. The controller then causes the analog circuit element functional block to perform the remaining iterations.

Type: Grant

Filed: November 14, 2018

Date of Patent: January 23, 2024

Assignee: Advanced Micro Devices, Inc.

Inventor: Sudhanva Gurumurthi
