Patents Assigned to Advanced Micro Devices, Inc.
-
Patent number: 10866790
Abstract: An electronic device acquires, from program code, two or more program code loops having specified data dependencies. The electronic device places each of the program code loops into a corresponding blocking loop, each blocking loop including at least one blocking loop induction variable that is incremented by a corresponding block size and used to specify a number of iterations for at least one internal loop induction variable of the respective program code loop. The electronic device fuses the blocking loops into a fused loop by placing all of the blocking loops in the fused loop and replacing the blocking loop induction variables of the blocking loops with a fused loop induction variable that is incremented by the corresponding block size and used to specify the number of iterations for respective internal loop induction variables in the blocking loops.
Type: Grant
Filed: November 30, 2018
Date of Patent: December 15, 2020
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Dibyendu Das, Pradeep H. Rao
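The blocking-then-fusing transformation described in this abstract can be illustrated with a small sketch. It is not taken from the patent: the two loops, the array names `a`, `b`, `c`, and the default `block_size` are assumptions chosen only to show two dependent loops sharing one block-stepping induction variable.

```python
def unfused(a):
    """Two separate loops; the second depends on the output of the first."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):
        b[i] = a[i] * 2
    for i in range(n):
        c[i] = b[i] + 1
    return c


def blocked_and_fused(a, block_size=4):
    """Same computation with both loops blocked and placed in one fused loop.

    `ii` plays the role of the fused loop induction variable: it advances by
    block_size and bounds the iterations of each internal loop.
    """
    n = len(a)
    b = [0] * n
    c = [0] * n
    for ii in range(0, n, block_size):
        hi = min(ii + block_size, n)
        for i in range(ii, hi):      # first original loop, restricted to one block
            b[i] = a[i] * 2
        for i in range(ii, hi):      # second original loop, same block
            c[i] = b[i] + 1
    return c
```

In this sketch the fusion is legal because `c[i]` depends only on `b[i]` from the same block; loops with other dependence patterns would need a separate legality check.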
-
Patent number: 10866768
Abstract: A cluster compute server stores different types of data at different storage volumes in order to reduce data duplication at the storage volumes. The storage volumes are categorized into two classes: common storage volumes and dedicated storage volumes, wherein the common storage volumes store data to be accessed and used by multiple compute nodes (or multiple virtual servers) of the cluster compute server. The dedicated storage volumes, in contrast, store data to be accessed only by a corresponding compute node (or virtual server).
Type: Grant
Filed: December 12, 2014
Date of Patent: December 15, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Mauricio Breternitz, Jr., Leonardo Piga
-
Patent number: 10861122
Abstract: Methods, systems and non-transitory computer readable media are described. A system includes a shader pipe array, a redundant shader pipe array, a sequencer and a redundant shader switch. The shader pipe array includes multiple shader pipes, each of which performs rendering calculations on data provided thereto. The redundant shader pipe array also performs rendering calculations on data provided thereto. The sequencer identifies at least one defective shader pipe in the shader pipe array, and, in response, generates a signal. The redundant shader switch receives the generated signal, and, in response, transfers the data destined for each shader pipe identified as being defective independently to the redundant shader pipe array.
Type: Grant
Filed: May 17, 2016
Date of Patent: December 8, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Michael J. Mantor, Jeffrey T. Brady, Angel E. Socarras
-
Patent number: 10861504
Abstract: Systems, apparatuses, and methods for implementing dynamic control of a multi-region fabric are disclosed. A system includes at least one or more processing units, one or more memory devices, and a communication fabric coupled to the processing unit(s) and memory device(s). The system partitions the fabric into multiple regions based on different traffic types and/or periodicities of the clients connected to the regions. For example, the system partitions the fabric into a stutter region for predictable, periodic clients and a non-stutter region for unpredictable, non-periodic clients. The system power-gates the entirety of the fabric in response to detecting a low activity condition. After power-gating the entirety of the fabric, the system periodically wakes up one or more stutter regions while keeping the other non-stutter regions in power-gated mode. Each stutter region monitors stutter client(s) for activity and processes any requests before going back into power-gated mode.
Type: Grant
Filed: October 5, 2017
Date of Patent: December 8, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Benjamin Tsien, Alexander J. Branover, Alan Dodson Smith, Chintan S. Patel
-
Patent number: 10861507
Abstract: Systems, apparatuses, and methods for implementing a sampling circuit with increased headroom are disclosed. A sampling circuit includes at least a pair of input signal transistors connected via their drains to a cross-coupled pair of state nodes. The cross-coupled pair of state nodes are coupled to a tail transistor device via the sources of N-type transistors. When clock goes low, the circuit precharges the cross-coupled pair of state nodes while simultaneously attempting to amplify the difference between the pair of input signals. The amplification is performed by a pair of transistors in series between a source of each input signal transistor and ground. Each gate of the pair of transistors is connected to an inverted clock signal. When clock goes high, the circuit stops precharging and a voltage difference between the pair of input signals is regenerated to create a resulting differential voltage on the pair of state nodes.
Type: Grant
Filed: March 28, 2019
Date of Patent: December 8, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Milam Paraschou, Jeffrey Cooper
-
Patent number: 10862809
Abstract: The described embodiments include an electronic device that handles network packets. During operation, the electronic device receives a carrier packet that includes a tunneled packet in a payload of the carrier packet, wherein the tunneled packet includes a packet priority of the tunneled packet and the carrier packet includes a packet priority of the carrier packet. The electronic device then updates the packet priority of the carrier packet based on the packet priority of the tunneled packet.
Type: Grant
Filed: May 19, 2017
Date of Patent: December 8, 2020
Assignee: ADVANCED MICRO DEVICES, INC.
Inventor: David A. Roberts
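As a rough illustration of the priority update described in this abstract, the sketch below models a carrier packet whose payload is a tunneled packet. The class names, the fields, and the specific policy (raise the carrier priority to at least the tunneled priority) are assumptions made for the example; the abstract says only that the carrier priority is updated based on the tunneled priority.

```python
from dataclasses import dataclass


@dataclass
class TunneledPacket:
    priority: int   # packet priority carried inside the tunneled packet
    data: bytes


@dataclass
class CarrierPacket:
    priority: int           # packet priority of the outer (carrier) packet
    payload: TunneledPacket


def update_carrier_priority(carrier: CarrierPacket) -> CarrierPacket:
    """Assumed policy: the carrier never has lower priority than its payload."""
    carrier.priority = max(carrier.priority, carrier.payload.priority)
    return carrier
```

Under this assumed policy, a carrier with priority 1 that wraps a tunneled packet with priority 5 would leave the function with priority 5, while a carrier that already outranks its payload is left unchanged.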
-
Patent number: 10860418
Abstract: A system and method for protecting memory instructions against faults are described. The system and method include converting the slave instructions to dummy operations, modifying memory arbiter to issue up to N master and N slave global/shared memory instructions per cycle, sending master memory requests to memory system, using slave requests for error checking, entering master requests to the GM/LM FIFO, storing slave requests in a register, and comparing the entered master requests with the stored slave requests.
Type: Grant
Filed: April 8, 2019
Date of Patent: December 8, 2020
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: John Kalamatianos, Michael Mantor, Sudhanva Gurumurthi
-
Patent number: 10860489
Abstract: Techniques are disclosed for designing cache compression algorithms that control how data in caches are compressed. The techniques generate a custom “byte select algorithm” by applying repeated transforms to an initial compression algorithm until a set of suitability criteria is met. The suitability criteria include that the “cost” is below a threshold and that a metadata constraint is met. The “cost” is the number of blocks that can be compressed by an algorithm as compared with the “ideal” algorithm. The metadata constraint is the number of bits required for metadata.
Type: Grant
Filed: October 31, 2018
Date of Patent: December 8, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Shomit N. Das, Matthew Tomei, David A. Wood
-
Publication number: 20200379814
Abstract: Techniques for scheduling resources on a managed computer system are provided herein. A generative adversarial network generates predicted resource utilization. An orchestrator trains the generative adversarial network and provides the predicted resource utilization from the generative adversarial network to a resource scheduler for usage when the quality of the predicted resource utilization is above a threshold. The quality is measured as the ability of a generator component of the generative adversarial network to “fool” a discriminator component of the generative adversarial network into misclassifying the predicted resource utilization as being real (i.e., being of the type that is actually measured from the computer system).
Type: Application
Filed: May 29, 2019
Publication date: December 3, 2020
Applicant: Advanced Micro Devices, Inc.
Inventors: Sergey Blagodurov, Abhinav Vishnu, Thaleia Dimitra Doudali, Jagadish B. Kotra
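The quality gate in this abstract can be sketched without any machine-learning library. In the example below, the discriminator is any callable that returns True when it classifies a sample as real; the function names, the "fool rate" metric as a simple fraction, and the default threshold are assumptions for illustration, not details from the application.

```python
def fool_rate(discriminator, generated_samples):
    """Fraction of generated samples the discriminator misclassifies as real."""
    fooled = sum(1 for sample in generated_samples if discriminator(sample))
    return fooled / len(generated_samples)


def predictions_for_scheduler(discriminator, generated_samples, threshold=0.5):
    """Release predictions to the scheduler only if their quality is above threshold.

    Returns the predictions when the fool rate exceeds `threshold`,
    otherwise None (the scheduler would fall back to some other input).
    """
    if not generated_samples:
        return None
    if fool_rate(discriminator, generated_samples) > threshold:
        return list(generated_samples)
    return None
```

A real orchestrator would also train both networks; this sketch covers only the decision of whether the generator's output is good enough to hand to the resource scheduler.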
-
Publication number: 20200379544
Abstract: Platform power management includes boosting performance in a platform power boost mode or restricting performance to keep a power or temperature under a desired threshold in a platform power cap mode. Platform power management exploits the mutually exclusive nature of activities and the associated headroom created in a temperature and/or power budget of a server platform to boost performance of a particular component while also keeping temperature and/or power below a threshold or budget.
Type: Application
Filed: May 31, 2019
Publication date: December 3, 2020
Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Indrani Paul, Sriram Sambamurthy, Larry David Hewitt, Kevin M. Lepak, Samuel D. Naffziger, Adam Neil Calder Clark, Aaron Joseph Grenat, Steven Frederick Liepe, Sandhya Shyamasundar, Wonje Choi, Dana Glenn Lewis, Leonardo de Paula Rosa Piga
-
Publication number: 20200379820
Abstract: A technique for synchronizing workgroups is provided. Multiple workgroups execute a wait instruction that specifies a condition variable and a condition. A workgroup scheduler stops execution of a workgroup that executes a wait instruction and an advanced controller begins monitoring the condition variable. In response to the advanced controller detecting that the condition is met, the workgroup scheduler determines whether there is a high contention scenario, which occurs when the wait instruction is part of a mutual exclusion synchronization primitive and is detected by determining that there is a low number of updates to the condition variable prior to detecting that the condition has been met. In a high contention scenario, the workgroup scheduler wakes up one workgroup and schedules another workgroup to be woken up at a time in the future. In a non-contention scenario, more than one workgroup can be woken up at the same time.
Type: Application
Filed: May 29, 2019
Publication date: December 3, 2020
Applicant: Advanced Micro Devices, Inc.
Inventors: Alexandru Dutu, Sergey Blagodurov, Anthony T. Gutierrez, Matthew D. Sinclair, David A. Wood, Bradford M. Beckmann
-
Publication number: 20200380063
Abstract: Techniques for performing in-memory matrix multiplication, taking into account temperature variations in the memory, are disclosed. In one example, the matrix multiplication memory uses ohmic multiplication and current summing to perform the dot products involved in matrix multiplication. One downside to this analog form of multiplication is that temperature affects the accuracy of the results. Thus techniques are provided herein to compensate for the effects of temperature increases on the accuracy of in-memory matrix multiplications. According to the techniques, portions of input matrices are classified as effective or ineffective. Effective portions are mapped to low temperature regions of the in-memory matrix multiplier and ineffective portions are mapped to high temperature regions of the in-memory matrix multiplier. The matrix multiplication is then performed.
Type: Application
Filed: May 31, 2019
Publication date: December 3, 2020
Applicant: Advanced Micro Devices, Inc.
Inventors: Majed Valad Beigi, Amin Farmahini-Farahani, Sudhanva Gurumurthi
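The mapping step in this abstract (effective portions to cooler regions, ineffective portions to hotter ones) can be sketched as a simple assignment. The abstract does not say how a portion is classified as effective, so the sketch takes that classification as an input; the function name, the data layout, and the one-portion-per-region pairing are assumptions for illustration.

```python
def map_portions_to_regions(portions, region_temps):
    """Assign matrix portions to multiplier regions by temperature.

    portions:     list of (portion_name, is_effective) pairs; the classification
                  itself is assumed to have been done elsewhere.
    region_temps: dict mapping region name to its measured temperature.
    Returns a dict mapping portion_name to region name.
    """
    if len(portions) > len(region_temps):
        raise ValueError("not enough regions for the given portions")
    # Coolest regions first.
    regions = sorted(region_temps, key=region_temps.get)
    # Effective portions first; sorted() is stable, so input order is kept otherwise.
    ordered = sorted(portions, key=lambda p: not p[1])
    return {name: region for (name, _), region in zip(ordered, regions)}
```

For example, with portions `[("A", False), ("B", True), ("C", True)]` and region temperatures `{"r0": 80, "r1": 40, "r2": 60}`, the effective portions B and C land on the two coolest regions and A on the hottest.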
-
Publication number: 20200380761
Abstract: Described herein are techniques for performing ray tracing operations. A command processor executes custom instructions for orchestrating a ray tracing pipeline. The custom instructions cause the command processor to perform a series of loop iterations, each at a particular recursion depth. In a first loop iteration, a ray generation shader is executed that triggers execution of a trace ray operation. In any other iteration, zero or more shaders are executed based on the contents of a shader queue. Any shader may trigger execution of a trace ray operation. The trace ray operation determines whether a ray specified by the shader intersects a triangle. The ray trace operation places shader entries into a shader queue, at the current recursion depth plus 1. The command processor updates the current recursion depth based on whether a trace ray operation is executed. The loop ends when the recursion depth is less than a threshold.
Type: Application
Filed: May 28, 2019
Publication date: December 3, 2020
Applicant: Advanced Micro Devices, Inc.
Inventor: Rohan Mehalwal
-
Patent number: 10855222
Abstract: A clock synthesizer has integrated voltage droop detection and clock stretching. An oscillator of the clock synthesizer receives a control current from a digital to analog converter and generates an oscillator output signal. A droop detector and clock stretching circuit responds to a voltage droop of a supply voltage supplying circuits coupled to the oscillator output signal, to cause a portion of the oscillator control current to be diverted from the oscillator to thereby cause the oscillator to reduce the first frequency. The diversion can be accomplished through shunt circuits or a current mirror circuit.
Type: Grant
Filed: June 29, 2018
Date of Patent: December 1, 2020
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Dirk J. Robinson, Andy Huei Chu, Yan Sun, Saket Sham Doshi
-
Patent number: 10853904
Abstract: A processor employs a hierarchical register file for a graphics processing unit (GPU). A top level of the hierarchical register file is stored at a local memory of the GPU (e.g., a memory on the same integrated circuit die as the GPU). Lower levels of the hierarchical register file are stored at a different, larger memory, such as a remote memory located on a different die than the GPU. A register file control module monitors the status of in-flight wavefronts at the GPU, and in particular whether each in-flight wavefront is active, predicted to become active, or inactive. The register file control module places execution data for active and predicted-active wavefronts in the top level of the hierarchical register file and places execution data for inactive wavefronts at lower levels of the hierarchical register file.
Type: Grant
Filed: March 24, 2016
Date of Patent: December 1, 2020
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Yasuko Eckert, Nuwan Jayasena
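The placement rule in this abstract reduces to a three-way status check per wavefront. The sketch below is only an illustration of that rule: the status strings, the function name, and the two-list result are assumptions, and capacity limits of the top level are ignored.

```python
TOP_LEVEL_STATUSES = {"active", "predicted_active"}   # assumed status labels


def place_wavefronts(status_by_wavefront):
    """Split in-flight wavefronts between register file levels by status.

    status_by_wavefront: dict mapping a wavefront id to one of
    "active", "predicted_active", or "inactive".
    Returns (top_level_ids, lower_level_ids).
    """
    top_level, lower_levels = [], []
    for wavefront, status in status_by_wavefront.items():
        if status in TOP_LEVEL_STATUSES:
            top_level.append(wavefront)      # keep execution data in local memory
        else:
            lower_levels.append(wavefront)   # spill execution data to larger memory
    return top_level, lower_levels
```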
-
Patent number: 10853075
Abstract: An electronic device handles accesses of a branch prediction functional block when executing instructions in program code. The electronic device includes a processor having the branch prediction functional block that provides branch prediction information for control transfer instructions (CTIs) in the program code and a minimum predictor use (MPU) functional block. The MPU functional block determines, based on a record associated with a given fetch group of instructions, that a specified number of subsequent fetch groups of instructions that were previously determined to include no CTIs or conditional CTIs that were not taken are to be fetched for execution in sequence following the given fetch group. The MPU functional block then, when each of the specified number of the subsequent fetch groups is fetched and prepared for execution, prevents corresponding accesses of the branch prediction functional block for acquiring branch prediction information for instructions in that subsequent fetch group.
Type: Grant
Filed: December 23, 2019
Date of Patent: December 1, 2020
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Varun Agrawal, John Kalamatianos, Adithya Yalavarti, Jingjie Qian
-
Patent number: 10848177
Abstract: Methods and apparatus are described. A method, implemented in a decoder, includes receiving two or more signals from an encoder over two or more respective wires. At least one of the two or more signals includes at least one code that was recoded by the encoder. The decoder receives a recoding table. The recoding table provides a mapping indicating the recoding for each code that was recoded by the encoder in the received two or more signals. The decoder decodes the two or more received signals using the received recoding table.
Type: Grant
Filed: February 17, 2017
Date of Patent: November 24, 2020
Assignee: Advanced Micro Devices, Inc.
Inventor: Greg Sadowski
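A minimal sketch of decoding with a received recoding table follows. It assumes, purely for illustration, that one table is shared by all wires, that the table maps each original code to its recoded value, and that the mapping is one-to-one with no collisions against codes that were left unrecoded (the example uses a swap of two codes, which satisfies this). None of these details come from the patent.

```python
def encode(codes_per_wire, recoding_table):
    """Replace each code that has a table entry with its recoded value."""
    return [[recoding_table.get(code, code) for code in wire]
            for wire in codes_per_wire]


def decode(signals_per_wire, recoding_table):
    """Undo the recoding using the table received from the encoder."""
    inverse = {recoded: original for original, recoded in recoding_table.items()}
    return [[inverse.get(code, code) for code in wire]
            for wire in signals_per_wire]


# Hypothetical table: swap two 4-bit codes, leave all others untouched.
EXAMPLE_TABLE = {0b1111: 0b0101, 0b0101: 0b1111}
```

With `EXAMPLE_TABLE`, encoding the per-wire codes `[[0b1111, 0b0001], [0b0101, 0b0010]]` and then decoding the result with the same table recovers the original codes.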
-
Patent number: 10848137
Abstract: A C-element circuit for use in an oscillator or the like includes a first input terminal for receiving a first input signal, a second input terminal for receiving a second input signal, and an output latch for providing an output signal based on a relationship between the two input signals. A stack of input transistors is included with an outer pair of input transistors with gates connected to the first input terminal and an inner pair of input transistors with gates connected to a second input terminal. A balancing circuit operates to equalize a first delay of a change in the first input signal affecting the output signal with a second delay of a change in the second input signal affecting the output signal. Bypass control techniques are provided for using the C-element circuit with a single input.
Type: Grant
Filed: May 8, 2019
Date of Patent: November 24, 2020
Assignees: ATI Technologies ULC, Advanced Micro Devices, Inc.
Inventors: Mikhail Rodionov, Stephen Victor Kosonocky, Joyce Cheuk Wai Wong
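For readers unfamiliar with the term, the textbook behaviour of a two-input C-element is that the output follows the inputs when they agree and holds its previous value when they differ. The sketch below models only that logical behaviour; it does not represent the transistor stack, the balancing circuit, or the bypass control described in the patent, and the class name is an assumption.

```python
class CElement:
    """Behavioural model of a two-input Muller C-element."""

    def __init__(self, initial=0):
        self.output = initial   # state held by the output latch

    def update(self, a, b):
        """Apply inputs a and b (0 or 1) and return the resulting output."""
        if a == b:
            self.output = a     # inputs agree: output takes their value
        return self.output      # inputs differ: output holds its last value
```

Driving the model with the input sequence (0,0), (1,0), (1,1), (0,1), (0,0) yields outputs 0, 0, 1, 1, 0: the output changes only once both inputs have changed.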
-
Patent number: 10848135
Abstract: A receiver circuit holds an output voltage at a first output voltage level using a first device of a first type coupled between a first node and a first power supply node, and a second device of a second type coupled between the first node and the first power supply node. The first device is selectively enabled using an input signal. The second device is selectively enabled using a feedback signal. The second device is substantially larger than the first device. The receiver circuit switches the output voltage from the first output voltage level to a second output voltage level responsive to an input voltage level transitioning across a first threshold voltage level from a first input voltage level to a second input voltage level.
Type: Grant
Filed: August 12, 2019
Date of Patent: November 24, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Ashish Sahu, Girish Anathahally Singrigowda, Aniket Bharat Waghide, Prasanth K. Vallur
-
Patent number: 10846095
Abstract: A system and method for a virtual load queue is described. Load micro-operations are processed through an instruction pipeline without requiring an entry in a load queue (LDQ). An address generation scheduler queue (AGSQ) entry is allocated to the load micro-operation and a LDQ entry is not allocated to the load micro-operation. The LDQ entries are reserved for the N oldest load micro-operations, where N is the depth of the LDQ. Deallocation of the AGSQ entry is done if the load micro-operation is one of the N oldest load micro-operations, or upon successful completion of the load micro-operation. Deallocation of the AGSQ entry is not done if the load micro-operation gets a bad status and is not one of the N oldest micro-operations. Consequently, the AGSQ acts as a virtual queue for the LDQ and mitigates the limiting effect of the LDQ depth.
Type: Grant
Filed: November 28, 2017
Date of Patent: November 24, 2020
Assignee: Advanced Micro Devices, Inc.
Inventor: John M. King