Patents Assigned to Advanced Micro Devices
  • Patent number: 11262989
    Abstract: A computing system includes a compatibility graph builder to generate a compatibility graph based on a dependency graph representing program source code, where the compatibility graph indicates compatibility relationships between operations represented in the dependency graph, a clique generator coupled with the compatibility graph builder to generate a set of candidate vector packings based on the compatibility relationships indicated in the compatibility graph, a set cover generator coupled with the clique generator to select a subset of vector packings from the set of candidate vector packings, and a vector code generator coupled with the set cover generator to generate the vector code based on the selected subset of vector packings.
    Type: Grant
    Filed: October 24, 2019
    Date of Patent: March 1, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Abhilash Bhandari, Venugopal Raghavan, Mohammad Asghar Ahmad Shahid, Anupama Rajesh Rasale
  • Patent number: 11262924
    Abstract: Automatic memory overclocking, including: increasing a memory frequency setting for a memory module until a memory stability test fails; determining an overclocked memory frequency setting including a highest memory frequency setting passing the memory stability test; and generating a profile including the overclocked memory frequency setting.
    Type: Grant
    Filed: December 30, 2019
    Date of Patent: March 1, 2022
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: William R. Alverson, Amitabh Mehra, Anil Harwani, Jerry A. Ahrens, Grant E. Ley, Jayesh Joshi
  • Patent number: 11264998
    Abstract: A system and method for measuring power supply variations are described. A functional unit includes one or more power supply monitors capable of measuring power supply variations. The power supply monitors forego use of a clock signal from clock generating circuitry and forego use of a reference voltage from a reference power supply. The power supply monitors use an output of a source ring oscillator as a clock signal for the sequential elements of a counter. The counter measures a number of revolutions of a measuring ring oscillator within a period of the output of the source oscillator. The revolutions of the measuring ring oscillator are associated with a number of rising edges and falling edges of the output signal of the measuring ring oscillator. An encoder converts the output of the sequential elements to a binary value, and sends the binary value to an external age tracking unit.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: March 1, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Ravinder Reddy Rachala
  • Patent number: 11262949
    Abstract: An approach is provided for reducing command bus traffic between memory controllers and PIM-enabled memory modules using special PIM commands. The term “special PIM command” is used herein to describe embodiments and refers to a PIM command for which the corresponding module-specific command information is provided to memory modules via a non-command bus data path. A memory controller generates and issues a special PIM command to multiple PIM-enabled memory modules via a command bus and provides module-specific command information (e.g., address information) for the special PIM command to the PIM-enabled memory modules via the non-command bus data path that is shared by the PIM-enabled memory modules and the memory controller.
    Type: Grant
    Filed: May 28, 2020
    Date of Patent: March 1, 2022
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Johnathan Alsop, Shaizeen Aga, Nuwan Jayasena
  • Patent number: 11264115
    Abstract: An integrated circuit includes a memory core and a built-in self-test (BIST) controller. The memory core has an array of memory cells located at intersections of a plurality of word lines and a plurality of bit line pairs. The BIST controller is coupled to the memory core and has a mission mode and a built-in self-test mode. When in the mission mode, the BIST controller performs read and write accesses using precharge on demand. When in the built-in self-test mode, the BIST controller performs a floating bit line test by draining a voltage on true and complement bit lines of a selected bit line pair and subsequently precharging the true and complement bit lines of the selected bit line pair, before reading or writing data using the true and complement bit lines of the selected bit line pair.
    Type: Grant
    Filed: September 22, 2020
    Date of Patent: March 1, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Russell Schreiber, Keith A. Kasprak, Vance Threatt, James A. Wingfield, William A. Halliday, Srinivas R. Sathu, Arijit Banerjee
  • Patent number: 11257273
    Abstract: Techniques for processing pixel data are provided. The techniques include, in a first mode in which blending is enabled, reading in render target color data from a memory system; blending the render target color data with one or more fragments received from a pixel shader stage to generate blended color data; outputting the blended color data to the memory system utilizing a first amount of bandwidth; in a second mode in which blending is disabled and variable rate shading is enabled, amplifying shaded coarse fragments received from the pixel shader stage to generate fine fragments; and outputting the fine fragments to the memory system utilizing a second amount of bandwidth that is higher than the first amount of bandwidth.
    Type: Grant
    Filed: December 19, 2019
    Date of Patent: February 22, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Pazhani Pillai, Skyler Jonathon Saleh
  • Patent number: 11256505
    Abstract: A processor predicts a number of loop iterations associated with a set of loop instructions. In response to the predicted number of loop iterations exceeding a first loop iteration threshold, the set of loop instructions are executed in a loop mode that includes placing at least one component of an instruction pipeline of the processor in a low-power mode or state and executing the set of loop instructions from a loop buffer. In response to the predicted number of loop iterations being less than or equal to a second loop iteration threshold, the set of instructions are executed in a non-loop mode that includes maintaining at least one component of the instruction pipeline in a powered up state and executing the set of loop instructions from an instruction fetch unit of the instruction pipeline.
    Type: Grant
    Filed: February 5, 2021
    Date of Patent: February 22, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Arunachalam Annamalai, Marius Evers, Aparna Thyagarajan, Anthony Jarvis
  • Patent number: 11256522
    Abstract: Described herein are techniques for executing a heterogeneous code object executable. According to the techniques, a loader identifies a first memory appropriate for loading a first architecture-specific portion of the heterogeneous code object executable, wherein the first architecture specific portion includes instructions for a first architecture, identifies a second memory appropriate for loading a second architecture-specific portion of the heterogeneous code object executable, wherein the second architecture specific portion includes instructions for a second architecture that is different than the first architecture, loads the first architecture-specific portion into the first memory and the second architecture-specific portion into the second memory, and performs relocations on the first architecture-specific portion and on the second architecture-specific portion.
    Type: Grant
    Filed: November 22, 2019
    Date of Patent: February 22, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Steven Tony Tye, Brian Laird Sumner, Konstantin Zhuravlyov
  • Publication number: 20220050781
    Abstract: Techniques for prefetching are provided. The techniques include receiving a first prefetch command; in response to determining that a history buffer indicates that first information associated with the first prefetch command has not already been prefetched, prefetching the first information into a memory; receiving a second prefetch command; and in response to determining that the history buffer indicates that second information associated with the second prefetch command has already been prefetched, avoiding prefetching the second information into the memory.
    Type: Application
    Filed: March 31, 2021
    Publication date: February 17, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Anirudh R. Acharya, Alexander Fuad Ashkar
  • Patent number: 11249765
    Abstract: Techniques for improving performance of accelerated processing devices (“APDs”) when exceptions occur are provided. In APDs, the very large number of parallel processing execution units, and the complexity of the hardware used to execute a large number of work-items in parallel, means that APDs typically stall when an exception occurs (unlike in central processing units (“CPUs”), which are able to execute speculatively and out-of-order). However, the techniques provided herein allow at least some execution to occur past exceptions. Execution past an exception generating instruction occurs by executing instructions that would not lead to a corruption while skipping those that would lead to a corruption. After the exception has been satisfied, execution occurs in a replay mode in which the potentially exception-generating instruction is executed and in which instructions that did not execute in the exception-wait mode are executed. A mask and counter are used to control execution in replay mode.
    Type: Grant
    Filed: August 22, 2018
    Date of Patent: February 15, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Anthony T. Gutierrez
  • Patent number: 11252430
    Abstract: The present disclosure is directed a system and method for exploiting camera and depth information associated with rendered video frames, such as those rendered by a server operating as part of a cloud gaming service, to more efficiently encode the rendered video frames for transmission over a network. The method and system of the present disclosure can be used in a server operating in a cloud gaming service to improve, for example, the amount of latency, downstream bandwidth, and/or computational processing power associated with playing a video game over its service. The method and system of the present disclosure can be further used in other applications where camera and depth information of a rendered or captured video frame is available.
    Type: Grant
    Filed: November 1, 2019
    Date of Patent: February 15, 2022
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Khaled Mammou, Ihab Amer, Gabor Sines, Lei Zhang, Michael Schmit, Daniel Wong
  • Patent number: 11243904
    Abstract: A method and apparatus for scheduling instructions of a shader program for a graphics processing unit (GPU) with a fixed number of registers. The method and apparatus include computing, via a processing unit (PU), a liveness-based register usage across all basic blocks in the shader program, computing, via the PU, the range of numbers of waves of a plurality of registers for the shader program, assessing the impact of available post-register allocation optimizations, computing, via the PU, the scoring data based on number of waves of the plurality of registers, and computing, via the PU, the number of waves for execution for the plurality of registers.
    Type: Grant
    Filed: March 9, 2020
    Date of Patent: February 8, 2022
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Robert A. Gottlieb, Christopher L. Reeve, Michael John Bedy
  • Patent number: 11243799
    Abstract: An apparatus includes a plurality of virtual machines, a hypervisor coupled to the plurality of virtual machines, and a graphical processing unit (GPU) coupled to the hypervisor. The plurality of virtual machines are allocated a plurality of time slices. The hypervisor initiates a world switch to a first virtual machine of the plurality of virtual machines. The GPU makes a determination as to whether to adjust the time slice associated with the first virtual machine based on an assessment of time slice adjustment parameters related to an execution time of at least one of the plurality of virtual machines.
    Type: Grant
    Filed: August 30, 2019
    Date of Patent: February 8, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexander Fuad Ashkar, Hans Fernlund
  • Patent number: 11243752
    Abstract: Described herein are techniques for generating a stitched shader program. The techniques include identifying a set of shader programs to include in the stitched shader program, wherein the set includes at least one multiversion shader program that includes a first version of instructions and a second version of instructions, wherein the first version of instructions uses a first number of resources that is different than a second number of resources used by the second version of instructions. The techniques also include combining the set of shader programs to form the stitched shader program. The techniques further include determining a number of resources for the stitched shader program. The techniques also include based on the determined number of resources, modifying the instructions corresponding to the multiversion shader program to, when executed, execute either the first version of instructions, or the second version of instructions.
    Type: Grant
    Filed: July 11, 2019
    Date of Patent: February 8, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Sumesh Udayakumaran
  • Patent number: 11243884
    Abstract: A method of prefetching target data includes, in response to detecting a lock-prefixed instruction for execution in a processor, determining a predicted target memory location for the lock-prefixed instruction based on control flow information associating the lock-prefixed instruction with the predicted target memory location. Target data is prefetched from the predicted target memory location to a cache coupled with the processor, and after completion of the prefetching, the lock-prefixed instruction is executed in the processor using the prefetched target data.
    Type: Grant
    Filed: November 13, 2018
    Date of Patent: February 8, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Susumu Mashimo, John Kalamatianos
  • Publication number: 20220035765
    Abstract: An interconnect controller for a data processing platform includes a data link layer controller for selectively receiving data packets from and sending data packets to a higher protocol layer, and a physical layer controller coupled to the data link layer controller and adapted to be coupled to a communication link. The physical layer controller operates according to a predetermined protocol selectively at one of a plurality of enhanced speeds that are not specified by any published standard and are separated from each other by the same predetermined amount. In response to performing a link initialization, the interconnect controller performs at least one setup operation to select a speed, and subsequently operates the communication link using a selected speed.
    Type: Application
    Filed: October 18, 2021
    Publication date: February 3, 2022
    Applicants: ATI Technologies ULC, Advanced Micro Devices, Inc.
    Inventors: Gordon Caruk, Gerald R. Talbot
  • Publication number: 20220036629
    Abstract: A method of tiled rendering of an image for display is provided which comprises receiving an image comprising one or more three dimensional (3D) objects and executing a visibility pass for determining locations of primitives of the image. The method also comprises executing, concurrently with the executing of the visibility pass, front end geometry processing of one of the primitives determined, from the visibility pass, to be in a first one of a plurality of tiles of the image and executing, concurrently with the executing of the visibility pass, back end processing of the one primitive in the first tile.
    Type: Application
    Filed: July 28, 2020
    Publication date: February 3, 2022
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Mika Tuomi, Ruijin Wu, Anirudh R. Acharya, Kiia Kallio
  • Patent number: 11237827
    Abstract: A graphics processing unit (GPU) sequences provision of operands to a set of operand registers, thereby allowing the GPU to share at least one of the operand registers between processing. The GPU includes a plurality of arithmetic logic units (ALUs) with at least one of the ALUs configured to perform double precision operations. The GPU further includes a set of operand registers configured to store single precision operands. For a plurality of executing threads that request double precision operations, the GPU stores the corresponding operands at the operand registers. Over a plurality of execution cycles, the GPU sequences transfer of operands from the set of operand registers to a designated double precision operand register. During each execution cycle, the double-precision ALU executes a double precision operation using the operand stored at the double precision operand register.
    Type: Grant
    Filed: November 26, 2019
    Date of Patent: February 1, 2022
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Bin He, Jiasheng Chen, Jian Huang
  • Patent number: 11237220
    Abstract: In one form, a power supply monitor including a current controlled oscillator circuit, a time-to-digital converter, and an output divider. The current controlled oscillator circuit has an input for receiving a power supply voltage to be measured, and an output for providing a frequency signal having a frequency linearly proportional to the power supply voltage. The time-to-digital converter has an input coupled to the output of the current controlled oscillator circuit, and an output for providing a count signal representative of a number of cycles of a reference clock signal per cycle of the frequency signal. The output divider has an input coupled to the output of the time-to-digital converter, and an output for providing a divided count signal representative of a value of the power supply voltage, and provides the divided count signal by dividing a fixed number by the count signal.
    Type: Grant
    Filed: August 3, 2018
    Date of Patent: February 1, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ravinder Reddy Rachala, Stephen Victor Kosonocky, Miguel Rodriguez
  • Patent number: 11238640
    Abstract: A technique for performing ray tracing operations is provided. The technique includes reading descendant-shared type metadata for a non-leaf node of a bounding volume hierarchy; identifying one or more culling types for a ray-intersection test for a ray; and determining whether to treat the non-leaf node as not intersected based on whether the one or more culling types includes at least one type specified by the descendant-shared type metadata.
    Type: Grant
    Filed: August 31, 2020
    Date of Patent: February 1, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Skyler Jonathon Saleh, Sagar S. Bhandare, Fataneh F. Ghodrat, Paul Raymond Vella