Patents Assigned to Advanced Micro Devics, Inc.
  • Patent number: 11915359
    Abstract: Systems, apparatuses, and methods for implementing kernel software driven color remapping of rendered primary surfaces are disclosed. A system includes at least a general processor, a graphics processor, and a memory. The general processor executes a user-mode application, a user-mode driver, and a kernel-mode driver. A primary surface is rendered on the graphics processor on behalf of the user-mode application. The primary surface is stored in memory locations allocated for the primary surface by the user-mode driver and the kernel-mode driver is notified when the primary surface is ready to be displayed. Rather than displaying the primary surface, the kernel-mode driver causes the pixels of the primary surface to be remapped on the graphics processor using a selected lookup table (LUT) so as to generate a remapped surface which stored in memory locations allocated for the remapped surface by the user-mode driver. Then, the remapped surface is displayed.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: February 27, 2024
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Jason Wen-Tse Wu, Parimalkumar Patel, Jia Hui Li, Chao Zhan
  • Patent number: 11915337
    Abstract: Systems, apparatuses, and methods for implementing a downsampler in a single compute shader pass are disclosed. A central processing unit (CPU) issues a single-pass compute shader kernel to perform downsampling of a texture on a graphics processing unit (GPU). The GPU includes a plurality of compute units for executing thread groups of the kernel. Each thread group fetches a patch of the texture, and each individual thread downsamples four quads of texels to compute mip levels 1 and 2 independently of the other threads. For mip level 3, texel data is written back over one of the local data share (LDS) entries from which the texel data was loaded. This eliminates the need for a barrier between loads and stores for computing mip level 3. The remaining mip levels are computed in a similar fashion by the thread groups of the single-pass kernel.
    Type: Grant
    Filed: February 23, 2021
    Date of Patent: February 27, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Lou Isabelle Kramer, Matthäus G. Chajdas
  • Patent number: 11908065
    Abstract: A technique for performing ray tracing operations is provided. The technique includes, in response to detecting that a threshold number of traversal stage work-items of a wavefront have terminated, increasing intersection test parallelization for non-terminated work-items.
    Type: Grant
    Filed: June 20, 2022
    Date of Patent: February 20, 2024
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Daniel James Skinner, Michael John Livesley, David William John Pankratz
  • Patent number: 11907126
    Abstract: A processor employs a plurality of op cache pipelines to concurrently provide previously decoded operations to a dispatch stage of an instruction pipeline. In response to receiving a first branch prediction at a processor, the processor selects a first op cache pipeline of the plurality of op cache pipelines of the processor based on the first branch prediction, and provides a first set of operations associated with the first branch prediction to the dispatch queue via the selected first op cache pipeline.
    Type: Grant
    Filed: December 9, 2020
    Date of Patent: February 20, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Robert B. Cohen, Tzu-Wei Lin, Anthony J. Bybell, Sudherssen Kalaiselvan, James Mossman
  • Patent number: 11909404
    Abstract: A clocking circuit is provided using a master delay-locked loop (DLL) and a slave DLL. A master DLL code indicates a delay adjustment made at a master DLL. A delay of a slave DLL is adjusted based on the master DLL code. A replica phase detector at the slave DLL is temporarily enabled during an interface idle period. A slave DLL code is determined, and a configuration value is determined based on the slave DLL code to the master DLL code. The replica phase detector is then disabled.
    Type: Grant
    Filed: December 12, 2022
    Date of Patent: February 20, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Andy Huei Chu, Karthik Gopalakrishnan, Pradeep Jayaraman
  • Publication number: 20240053891
    Abstract: Random access memory (RAM) is attached to an input/output (I/O) controller of a chipset (e.g., on a motherboard). This chipset attached RAM is optionally used as part of a tiered storage solution with other tiers including, for example, nonvolatile memory (e.g., a solid state drive (SSD)) or a hard disk drive. The chipset attached RAM is separate from the system memory, allowing the chipset attached RAM to be used to speed up access to frequently used data stored in the tiered storage solution without reducing the amount of system memory available to an operating system running on the one or more processing units.
    Type: Application
    Filed: August 12, 2022
    Publication date: February 15, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: William Robert Alverson, Amitabh Mehra, Jerry Anton Ahrens, Grant Evan Ley, Anil Harwani, Joshua Taylor Knight
  • Publication number: 20240054332
    Abstract: Methods, devices, systems, and instructions for adaptive quantization in an artificial neural network (ANN) calculate a distribution of ANN information; select a quantization function from a set of quantization functions based on the distribution; apply the quantization function to the ANN information to generate quantized ANN information; load the quantized ANN information into the ANN; and generate an output based on the quantized ANN information. Some examples recalculate the distribution of ANN information and reselect the quantization function from the set of quantization functions based on the resampled distribution if the output does not sufficiently correlate with a known correct output. In some examples, the ANN information includes a set of training data. In some examples, the ANN information includes a plurality of link weights.
    Type: Application
    Filed: October 27, 2023
    Publication date: February 15, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Daniel I. Lowell, Sergey Voronov, Mayank Daga
  • Patent number: 11900499
    Abstract: A technique for executing commands for an accelerated processing device is provided. The technique includes obtaining an iteration number and predication data from metadata for an iterative indirect command buffer; for each iteration indicated by the iteration number, performing commands of the iterative indirect command buffer as specified by the predication data; and ending processing of the iterative indirect command buffer in response to processing a number of iterations equal to the iteration number.
    Type: Grant
    Filed: September 22, 2020
    Date of Patent: February 13, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Anirudh Rajendra Acharya, Ruijin Wu, Alexander Fuad Ashkar, Harry J. Wise
  • Patent number: 11900123
    Abstract: A system includes a processing unit such as a GPU that itself includes a command processor configured to receive instructions for execution from a software application. A processor pipeline coupled to the processing unit includes a set of parallel processing units for executing the instructions in sets. A set manager is coupled to one or more of the processor pipeline and the command processor. The set manager includes at least one table for storing a set start time, a set end time, and a set execution time. The set manager determines an execution time for one or more sets of instructions of a first window of sets of instructions submitted to the processor pipeline. Based on the execution time of the one or more sets of instructions, a set limit is determined and applied to one or more sets of instructions of a second window subsequent to the first window.
    Type: Grant
    Filed: December 13, 2019
    Date of Patent: February 13, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexander Fuad Ashkar, Manu Rastogi, Harry J. Wise
  • Patent number: 11902658
    Abstract: Systems, apparatuses, and methods for implementing an instant auto-focus mechanism with distance estimation are disclosed. A camera includes at least an image sensor, one or more movement and/or orientation sensors, a timer, a lens, and control circuit. The control circuit receives first and second images captured by the image sensor of a given scene. The control circuit calculates a distance between first and second camera locations when the first and second images, respectively, were captured based on the one or more movement and/or orientation sensors and the timer. Next, the control circuit calculates an estimate of a second distance between the camera and an object in the scene based on the distance between camera locations and angles between the camera and the object from the first and second locations. Then, the control circuit causes the lens to be adjusted to bring the object into focus for subsequent images.
    Type: Grant
    Filed: August 31, 2020
    Date of Patent: February 13, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Po-Min Wang, Yu-Huai Chen
  • Patent number: 11899520
    Abstract: A technique for operating a cache is disclosed. The technique includes in response to a power down trigger that indicates that the cache effectiveness is considered to be low, powering down the cache.
    Type: Grant
    Filed: April 26, 2022
    Date of Patent: February 13, 2024
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Ashish Jain, Benjamin Tsien, Chintan S. Patel, Vydhyanathan Kalyanasundharam, Shang Yang
  • Publication number: 20240047228
    Abstract: A disclosed method can include (i) positioning a first surface of a component of a semiconductor device on a first plated through-hole, (ii) covering, with a layer of dielectric material, at least a second surface of the component that is opposite the first surface of the component, (iii) removing a portion of the layer of dielectric material covering the second surface of the component to form at least one cavity, and (iv) depositing conductive material in the cavity to form a second plated through-hole on the second surface of the component. Various other apparatuses, systems, and methods are also disclosed.
    Type: Application
    Filed: August 2, 2022
    Publication date: February 8, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Sri Ranga Sai Boyapati, Deepak Vasant Kulkarni, Raja Swaminathan, Brett P. Wilkerson, Arsalan Alam
  • Publication number: 20240045718
    Abstract: Techniques for executing workgroups are provided. The techniques include executing, for a first workgroup of a first kernel dispatch, a workgroup dependency instruction that includes an indication to prioritize execution of a second workgroup of a second kernel dispatch, and in response to the workgroup dependency instruction, dispatching the second workgroup of the second kernel dispatch prior to dispatching a third workgroup of the second kernel dispatch, wherein no workgroup dependency instruction including an indication to prioritize execution of the third workgroup has been executed.
    Type: Application
    Filed: October 17, 2023
    Publication date: February 8, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Alexandru Dutu, Marcus Nathaniel Chow, Matthew D. Sinclair, Bradford M. Beckmann, David A. Wood
  • Patent number: 11893502
    Abstract: A system assigns experts of a mixture-of-experts artificial intelligence model to processing devices in an automated manner. The system includes an orchestrator component that maintains priority data that stores, for each of a set of experts, and for each of a set of execution parameters, ranking information that ranks different processing devices for the particular execution parameter. In one example, for the execution parameter of execution speed, and for a first expert, the priority data indicates that a central processing unit (“CPU”) executes the first expert faster than a graphics processing unit (“GPU”). In this example, for the execution parameter of power consumption, and for the first expert, the priority data indicates that a GPU uses less power than a CPU. The priority data stores such information for one or more processing devices, one or more experts, and one or more execution characteristics.
    Type: Grant
    Filed: December 20, 2017
    Date of Patent: February 6, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nicholas Malaya, Nuwan Jayasena
  • Publication number: 20240036748
    Abstract: A phase training update circuit operates to perform a phase training update on individual bit lanes. The phase training update circuit adjusts a bit lane transmit phase offset forward a designated number of phase steps, transmits a training pattern, and determines a first number of errors in the transmission. It also adjusts the bit lane transmit phase offset backward the designated number of phase steps, transmits the training pattern, and determines a second number of errors in the transmission. Responsive to a difference between the first number of errors and the second number of errors, the phase training update circuits adjusts a center phase position for the bit lane transmit phase offset of the selected bit lane.
    Type: Application
    Filed: October 11, 2023
    Publication date: February 1, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Scott P. Murphy, Huuhau M. Do
  • Publication number: 20240037031
    Abstract: A technique for operating a device is disclosed. The technique includes recording log data for the device; analyzing the log data to determine one or more performance settings adjustments to apply to the device; and applying the one or more performance settings adjustments to the device.
    Type: Application
    Filed: September 28, 2023
    Publication date: February 1, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Christopher J. Brennan, Akshay Lahiry
  • Patent number: 11886878
    Abstract: An integrated coprocessor such as an accelerated processing unit (APU) generates commands for execution on a discrete coprocessor such as a discrete graphics processing unit (dGPU). Power distribution circuitry selectively provides power to the APU and the dGPU based on characteristics of workloads executing on the APU and the dGPU and based on a platform power limit that is shared by the APU and the dGPU. In some cases, the power distribution circuitry determines a first power provided to the APU and a second power provided to the dGPU. The power distribution circuitry increases the second power provided to the dGPU in response to a sum of the first and second powers being less than the platform power limit. In some cases, the power distribution circuitry modifies the power provided to the APU, the dGPU, or both in response to changes in temperatures measured by a set of sensors.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: January 30, 2024
    Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC
    Inventors: Sukesh Shenoy, Adam N. C. Clark, Indrani Paul
  • Patent number: 11886224
    Abstract: A processing unit of a processing system compiles a priority queue listing of a plurality of processor cores to run a workload based on a cost of running the workload on each of the processor cores. The cost is based on at least one of a system usage policy, characteristics of the workload, and one or more physical constraints of each processor core. The processing unit selects a processor core based on the cost to run the workload and communicates an identifier of the selected processor core to an operating system of the processing system.
    Type: Grant
    Filed: July 31, 2020
    Date of Patent: January 30, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Leonardo De Paula Rosa Piga, Karthik Rao, Indrani Paul, Mahesh Subramony, Kenneth Mitchell, Dana Glenn Lewis, Sriram Sambamurthy, Wonje Choi
  • Publication number: 20240029336
    Abstract: Techniques for executing computing work by a plurality of chiplets are provided. The techniques include assigning workgroups of a kernel dispatch packet to the chiplets; by each chiplet, executing the workgroups assigned to that chiplet; for each chiplet, upon completion of all workgroups assigned to that chiplet for the kernel dispatch packet, notifying the other chiplets of such completion; and upon completion of all workgroups of the kernel dispatch packet, notifying a client of such completion and proceeding to a subsequent kernel dispatch packet.
    Type: Application
    Filed: October 3, 2023
    Publication date: January 25, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Milind N. Nemlekar, Maxim V. Kazakov, Prerit Dak
  • Patent number: 11880769
    Abstract: A system is described that performs training operations for a neural network, the system including an analog circuit element functional block with an array of analog circuit elements, and a controller. The controller monitors error values computed using an output from each of one or more initial iterations of a neural network training operation, the one or more initial iterations being performed using neural network data acquired from the memory. When one or more error values are less than a threshold, the controller uses the neural network data from the memory to configure the analog circuit element functional block to perform remaining iterations of the neural network training operation. The controller then causes the analog circuit element functional block to perform the remaining iterations.
    Type: Grant
    Filed: November 14, 2018
    Date of Patent: January 23, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Sudhanva Gurumurthi