Patents Assigned to Advanced Micros Devices, Inc.
-
Patent number: 11257273Abstract: Techniques for processing pixel data are provided. The techniques include, in a first mode in which blending is enabled, reading in render target color data from a memory system; blending the render target color data with one or more fragments received from a pixel shader stage to generate blended color data; outputting the blended color data to the memory system utilizing a first amount of bandwidth; in a second mode in which blending is disabled and variable rate shading is enabled, amplifying shaded coarse fragments received from the pixel shader stage to generate fine fragments; and outputting the fine fragments to the memory system utilizing a second amount of bandwidth that is higher than the first amount of bandwidth.Type: GrantFiled: December 19, 2019Date of Patent: February 22, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Pazhani Pillai, Skyler Jonathon Saleh
-
Patent number: 11256505Abstract: A processor predicts a number of loop iterations associated with a set of loop instructions. In response to the predicted number of loop iterations exceeding a first loop iteration threshold, the set of loop instructions are executed in a loop mode that includes placing at least one component of an instruction pipeline of the processor in a low-power mode or state and executing the set of loop instructions from a loop buffer. In response to the predicted number of loop iterations being less than or equal to a second loop iteration threshold, the set of instructions are executed in a non-loop mode that includes maintaining at least one component of the instruction pipeline in a powered up state and executing the set of loop instructions from an instruction fetch unit of the instruction pipeline.Type: GrantFiled: February 5, 2021Date of Patent: February 22, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Arunachalam Annamalai, Marius Evers, Aparna Thyagarajan, Anthony Jarvis
-
Patent number: 11256522Abstract: Described herein are techniques for executing a heterogeneous code object executable. According to the techniques, a loader identifies a first memory appropriate for loading a first architecture-specific portion of the heterogeneous code object executable, wherein the first architecture specific portion includes instructions for a first architecture, identifies a second memory appropriate for loading a second architecture-specific portion of the heterogeneous code object executable, wherein the second architecture specific portion includes instructions for a second architecture that is different than the first architecture, loads the first architecture-specific portion into the first memory and the second architecture-specific portion into the second memory, and performs relocations on the first architecture-specific portion and on the second architecture-specific portion.Type: GrantFiled: November 22, 2019Date of Patent: February 22, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Steven Tony Tye, Brian Laird Sumner, Konstantin Zhuravlyov
-
Publication number: 20220050781Abstract: Techniques for prefetching are provided. The techniques include receiving a first prefetch command; in response to determining that a history buffer indicates that first information associated with the first prefetch command has not already been prefetched, prefetching the first information into a memory; receiving a second prefetch command; and in response to determining that the history buffer indicates that second information associated with the second prefetch command has already been prefetched, avoiding prefetching the second information into the memory.Type: ApplicationFiled: March 31, 2021Publication date: February 17, 2022Applicant: Advanced Micro Devices, Inc.Inventors: Anirudh R. Acharya, Alexander Fuad Ashkar
-
Patent number: 11249765Abstract: Techniques for improving performance of accelerated processing devices (“APDs”) when exceptions occur are provided. In APDs, the very large number of parallel processing execution units, and the complexity of the hardware used to execute a large number of work-items in parallel, means that APDs typically stall when an exception occurs (unlike in central processing units (“CPUs”), which are able to execute speculatively and out-of-order). However, the techniques provided herein allow at least some execution to occur past exceptions. Execution past an exception generating instruction occurs by executing instructions that would not lead to a corruption while skipping those that would lead to a corruption. After the exception has been satisfied, execution occurs in a replay mode in which the potentially exception-generating instruction is executed and in which instructions that did not execute in the exception-wait mode are executed. A mask and counter are used to control execution in replay mode.Type: GrantFiled: August 22, 2018Date of Patent: February 15, 2022Assignee: Advanced Micro Devices, Inc.Inventor: Anthony T. Gutierrez
-
Patent number: 11252430Abstract: The present disclosure is directed a system and method for exploiting camera and depth information associated with rendered video frames, such as those rendered by a server operating as part of a cloud gaming service, to more efficiently encode the rendered video frames for transmission over a network. The method and system of the present disclosure can be used in a server operating in a cloud gaming service to improve, for example, the amount of latency, downstream bandwidth, and/or computational processing power associated with playing a video game over its service. The method and system of the present disclosure can be further used in other applications where camera and depth information of a rendered or captured video frame is available.Type: GrantFiled: November 1, 2019Date of Patent: February 15, 2022Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Khaled Mammou, Ihab Amer, Gabor Sines, Lei Zhang, Michael Schmit, Daniel Wong
-
Patent number: 11243799Abstract: An apparatus includes a plurality of virtual machines, a hypervisor coupled to the plurality of virtual machines, and a graphical processing unit (GPU) coupled to the hypervisor. The plurality of virtual machines are allocated a plurality of time slices. The hypervisor initiates a world switch to a first virtual machine of the plurality of virtual machines. The GPU makes a determination as to whether to adjust the time slice associated with the first virtual machine based on an assessment of time slice adjustment parameters related to an execution time of at least one of the plurality of virtual machines.Type: GrantFiled: August 30, 2019Date of Patent: February 8, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Alexander Fuad Ashkar, Hans Fernlund
-
Patent number: 11243904Abstract: A method and apparatus for scheduling instructions of a shader program for a graphics processing unit (GPU) with a fixed number of registers. The method and apparatus include computing, via a processing unit (PU), a liveness-based register usage across all basic blocks in the shader program, computing, via the PU, the range of numbers of waves of a plurality of registers for the shader program, assessing the impact of available post-register allocation optimizations, computing, via the PU, the scoring data based on number of waves of the plurality of registers, and computing, via the PU, the number of waves for execution for the plurality of registers.Type: GrantFiled: March 9, 2020Date of Patent: February 8, 2022Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Robert A. Gottlieb, Christopher L. Reeve, Michael John Bedy
-
Patent number: 11243752Abstract: Described herein are techniques for generating a stitched shader program. The techniques include identifying a set of shader programs to include in the stitched shader program, wherein the set includes at least one multiversion shader program that includes a first version of instructions and a second version of instructions, wherein the first version of instructions uses a first number of resources that is different than a second number of resources used by the second version of instructions. The techniques also include combining the set of shader programs to form the stitched shader program. The techniques further include determining a number of resources for the stitched shader program. The techniques also include based on the determined number of resources, modifying the instructions corresponding to the multiversion shader program to, when executed, execute either the first version of instructions, or the second version of instructions.Type: GrantFiled: July 11, 2019Date of Patent: February 8, 2022Assignee: Advanced Micro Devices, Inc.Inventor: Sumesh Udayakumaran
-
Patent number: 11243884Abstract: A method of prefetching target data includes, in response to detecting a lock-prefixed instruction for execution in a processor, determining a predicted target memory location for the lock-prefixed instruction based on control flow information associating the lock-prefixed instruction with the predicted target memory location. Target data is prefetched from the predicted target memory location to a cache coupled with the processor, and after completion of the prefetching, the lock-prefixed instruction is executed in the processor using the prefetched target data.Type: GrantFiled: November 13, 2018Date of Patent: February 8, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Susumu Mashimo, John Kalamatianos
-
Publication number: 20220035765Abstract: An interconnect controller for a data processing platform includes a data link layer controller for selectively receiving data packets from and sending data packets to a higher protocol layer, and a physical layer controller coupled to the data link layer controller and adapted to be coupled to a communication link. The physical layer controller operates according to a predetermined protocol selectively at one of a plurality of enhanced speeds that are not specified by any published standard and are separated from each other by the same predetermined amount. In response to performing a link initialization, the interconnect controller performs at least one setup operation to select a speed, and subsequently operates the communication link using a selected speed.Type: ApplicationFiled: October 18, 2021Publication date: February 3, 2022Applicants: ATI Technologies ULC, Advanced Micro Devices, Inc.Inventors: Gordon Caruk, Gerald R. Talbot
-
Publication number: 20220036629Abstract: A method of tiled rendering of an image for display is provided which comprises receiving an image comprising one or more three dimensional (3D) objects and executing a visibility pass for determining locations of primitives of the image. The method also comprises executing, concurrently with the executing of the visibility pass, front end geometry processing of one of the primitives determined, from the visibility pass, to be in a first one of a plurality of tiles of the image and executing, concurrently with the executing of the visibility pass, back end processing of the one primitive in the first tile.Type: ApplicationFiled: July 28, 2020Publication date: February 3, 2022Applicant: Advanced Micro Devices, Inc.Inventors: Mika Tuomi, Ruijin Wu, Anirudh R. Acharya, Kiia Kallio
-
Patent number: 11237827Abstract: A graphics processing unit (GPU) sequences provision of operands to a set of operand registers, thereby allowing the GPU to share at least one of the operand registers between processing. The GPU includes a plurality of arithmetic logic units (ALUs) with at least one of the ALUs configured to perform double precision operations. The GPU further includes a set of operand registers configured to store single precision operands. For a plurality of executing threads that request double precision operations, the GPU stores the corresponding operands at the operand registers. Over a plurality of execution cycles, the GPU sequences transfer of operands from the set of operand registers to a designated double precision operand register. During each execution cycle, the double-precision ALU executes a double precision operation using the operand stored at the double precision operand register.Type: GrantFiled: November 26, 2019Date of Patent: February 1, 2022Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Bin He, Jiasheng Chen, Jian Huang
-
Patent number: 11238640Abstract: A technique for performing ray tracing operations is provided. The technique includes reading descendant-shared type metadata for a non-leaf node of a bounding volume hierarchy; identifying one or more culling types for a ray-intersection test for a ray; and determining whether to treat the non-leaf node as not intersected based on whether the one or more culling types includes at least one type specified by the descendant-shared type metadata.Type: GrantFiled: August 31, 2020Date of Patent: February 1, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Skyler Jonathon Saleh, Sagar S. Bhandare, Fataneh F. Ghodrat, Paul Raymond Vella
-
Patent number: 11237220Abstract: In one form, a power supply monitor including a current controlled oscillator circuit, a time-to-digital converter, and an output divider. The current controlled oscillator circuit has an input for receiving a power supply voltage to be measured, and an output for providing a frequency signal having a frequency linearly proportional to the power supply voltage. The time-to-digital converter has an input coupled to the output of the current controlled oscillator circuit, and an output for providing a count signal representative of a number of cycles of a reference clock signal per cycle of the frequency signal. The output divider has an input coupled to the output of the time-to-digital converter, and an output for providing a divided count signal representative of a value of the power supply voltage, and provides the divided count signal by dividing a fixed number by the count signal.Type: GrantFiled: August 3, 2018Date of Patent: February 1, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Ravinder Reddy Rachala, Stephen Victor Kosonocky, Miguel Rodriguez
-
Patent number: 11237972Abstract: A method and apparatus physically partitions clean and dirty cache lines into separate memory partitions, such as one or more banks, so that during low power operation, a cache memory controller reduces power consumption of the cache memory containing the clean only data. The cache memory controller controls a refresh operation so that a data refresh does not occur for the clean data only banks or the refresh rate is reduced for the clean data only banks. Partitions that store dirty data can also store clean data; however, other partitions are designated for storing only clean data so that the partitions can have their refresh rate reduced or refresh stopped for periods of time. When multiple DRAM dies or packages are employed, the partition can occur on a die or package level as opposed to a bank level within a die.Type: GrantFiled: December 29, 2017Date of Patent: February 1, 2022Assignee: Advanced Micro Devices, Inc.Inventor: David A. Roberts
-
Patent number: 11237928Abstract: A method includes reserving memory capacity in a first memory device as patch memory region for backing faulted memory, receiving a memory error indication indicating an uncorrectable error in a faulted segment in a second memory device and, in response to the memory error indication, associating in a remapping table the faulted segment with a patch segment in the patch memory region. The faulted segment is smaller than a memory page size of the second memory device. The method also includes, in response to receiving a memory access request directed to the faulted memory segment, servicing the memory access request from the patch segment by querying the remapping table to determine a patch segment address corresponding to a requested memory address, where the patch segment address identifies the location of the patch segment, and based on the patch segment address, performing the requested memory access at the patch segment.Type: GrantFiled: December 2, 2019Date of Patent: February 1, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Sergey Blagodurov, Michael Ignatowski, Vilas Sridharan
-
Publication number: 20220028450Abstract: A method for performing stutter of dynamic random access memory (DRAM) where a system on a chip (SOC) initiates bursts of requests to the DRAM to fill buffers to allow the DRAM to self-refresh is disclosed. The method includes issuing, by a system management unit (SMU), a ForceZQCal command to the memory controller to initiate the stutter procedure in response to receiving a timeout request, such as an SMU ZQCal timeout request, periodically issuing a power platform threshold (PPT) request, by the SMU, to the memory controller, and sending a ForceZQCal command prior to a PPT request to ensure re-training occurs after ZQ Calibration. The ForceZQCal command issued prior to PPT request may reduce the latency of the stutter. The method may further include issuing a ForceZQCal command prior to each periodic re-training.Type: ApplicationFiled: July 24, 2020Publication date: January 27, 2022Applicant: Advanced Micro Devices, Inc.Inventors: Jing Wang, Kedarnath Balakrishnan, Kevin M. Brandl, James R. Magro
-
Publication number: 20220027674Abstract: A generator for generating artificial data, and training for the same. Data corresponding to a first label is altered within a reference labeled data set. A discriminator is trained based on the reference labeled data set to create a selectively poisoned discriminator. A generator is trained based on the selectively poisoned discriminator to create a selectively poisoned generator. The selectively poisoned generator is tested for the first label and tested for the second label to determine whether the generator is sufficiently poisoned for the first label and sufficiently accurate for the second label. If it is not, the generator is retrained based on the data set including the further altered data. The generator includes a first ANN to input first information and output a set of artificial data that is classifiable using a first label and not classifiable using a second label of the set of labeled data.Type: ApplicationFiled: August 9, 2021Publication date: January 27, 2022Applicant: Advanced Micro Devices, Inc.Inventor: Nicholas Malaya
-
Patent number: 11232039Abstract: Systems, apparatuses, and methods for efficiently performing memory accesses in a computing system are disclosed. A computing system includes one or more clients, a communication fabric and a last-level cache implemented with low latency, high bandwidth memory. The cache controller for the last-level cache determines a range of addresses corresponding to a first region of system memory with a copy of data stored in a second region of the last-level cache. The cache controller sends a selected memory access request to system memory when the cache controller determines a request address of the memory access request is not within the range of addresses. The cache controller services the selected memory request by accessing data from the last-level cache when the cache controller determines the request address is within the range of addresses.Type: GrantFiled: December 10, 2018Date of Patent: January 25, 2022Assignee: Advanced Micro Devices, Inc.Inventor: Gabriel H. Loh