Patents Assigned to Advanced Micro Devices, Inc.
-
Patent number: 11113061
Abstract: Described herein are techniques for saving registers in the event of a function call. The techniques include modifying a program including a block of code designated as a calling code that calls a function. The modifying includes modifying the calling code to set a register usage mask indicating which registers are in use at the time of the function call. The modifying also includes modifying the function to combine the information of the register usage mask with information indicating registers used by the function to generate registers to be saved and save the registers to be saved.
Type: Grant
Filed: September 26, 2019
Date of Patent: September 7, 2021
Assignee: Advanced Micro Devices, Inc.
Inventor: Michael John Bedy
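One way to picture the mask combination in patent 11113061 is a bitwise AND of the caller's in-use mask with the set of registers the function writes. This is only an illustrative sketch: the abstract does not say how the two pieces of information are combined, and the function name and mask layout below are assumptions.

```python
def registers_to_save(caller_in_use_mask: int, callee_used_mask: int) -> list[int]:
    """Return the indices of registers the called function should save.

    Assumption: bit i of each mask stands for register i, and the two
    masks are combined with AND (live in the caller AND written by the
    function). The patent abstract only says the information is combined.
    """
    to_save = caller_in_use_mask & callee_used_mask
    return [i for i in range(to_save.bit_length()) if (to_save >> i) & 1]


caller_mask = 0b0110_1010  # registers 1, 3, 5, 6 are live at the call site
callee_mask = 0b0010_1100  # the function writes registers 2, 3, 5
print(registers_to_save(caller_mask, callee_mask))  # [3, 5]
```

Registers that are live but untouched by the function, or touched but not live, are skipped in this sketch.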
-
Patent number: 11113056
Abstract: A technique for performing store-to-load forwarding is provided. The technique includes determining a virtual address for data to be loaded for the load instruction, identifying a matching store instruction from one or more store instruction memories by comparing a virtual-address-based comparison value for the load instruction to one or more virtual-address-based comparison values of one or more store instructions, determining a physical address for the load instruction, and validating the load instruction based on a comparison between the physical address of the load instruction and a physical address of the matching store instruction.
Type: Grant
Filed: November 27, 2019
Date of Patent: September 7, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: John M. King, Matthew T. Sobel
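The two-step match in patent 11113056 (a fast comparison on a virtual-address-based value, then validation against physical addresses) can be sketched in software as follows. The `Store` record, the `va_tag` function (low 12 bits of the virtual address), and the `translate` callback are hypothetical stand-ins chosen for illustration, not details from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Store:
    vaddr: int
    paddr: int
    data: int


def va_tag(vaddr: int) -> int:
    # Hypothetical comparison value: the low 12 bits of the virtual address.
    return vaddr & 0xFFF


def forward(load_vaddr: int, store_queue: list[Store],
            translate: Callable[[int], int]) -> Optional[int]:
    """Return forwarded store data, or None if no valid forwarding exists."""
    tag = va_tag(load_vaddr)
    # Step 1: find the youngest store whose virtual-address tag matches.
    # Assumes store_queue is in program order (oldest first).
    match = next((s for s in reversed(store_queue) if va_tag(s.vaddr) == tag), None)
    if match is None:
        return None
    # Step 2: validate the speculative match using physical addresses.
    if translate(load_vaddr) != match.paddr:
        return None  # tag aliasing; real hardware would have to recover here
    return match.data
```

A load at a different virtual address that happens to share the tag is caught by the physical-address check and gets no forwarded data.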
-
Patent number: 11112926
Abstract: Systems, apparatuses, and methods for implementing enhanced scaling techniques for display objects are disclosed. When graphical content is created by an application, display objects register with a scaling manager to be notified of display scaling events. These display scaling events can be caused by changing displays, changing resolution or other parameters on a display, changing a text size, resizing one or more graphical elements, or otherwise. When a display scaling event is detected, display objects are notified of the event by the scaling manager. If a given display object makes a decision to change the amount of space it occupies based on the event, the given display object notifies its parent object of the desired change. The parent can then decide whether to allow the change and/or to make adjustments to other display objects to accommodate the change sought by the given display object.
Type: Grant
Filed: October 30, 2020
Date of Patent: September 7, 2021
Assignee: Advanced Micro Devices, Inc.
Inventor: Peter James Lohrmann
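The register, notify, and ask-the-parent flow described in patent 11112926 resembles an observer pattern with a parent veto. The sketch below is a hypothetical illustration: the class names, the width-only layout, and the "fits in the parent" acceptance rule are all assumptions, not the patented design.

```python
class ScalingManager:
    """Keeps the list of display objects that asked for scaling events."""

    def __init__(self):
        self._objects = []

    def register(self, obj):
        self._objects.append(obj)

    def notify(self, scale):
        for obj in self._objects:
            obj.on_scale(scale)


class Container:
    """Parent object that decides whether a child may change its size."""

    def __init__(self, width):
        self.width = width
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)

    def request_resize(self, child, new_width):
        # Assumed policy: allow the change only if all children still fit.
        used = sum(c.width for c in self.children if c is not child)
        if used + new_width <= self.width:
            child.width = new_width
            return True
        return False


class Label:
    def __init__(self, base_width):
        self.base_width = base_width
        self.width = base_width
        self.parent = None

    def on_scale(self, scale):
        desired = round(self.base_width * scale)
        if desired != self.width and self.parent is not None:
            self.parent.request_resize(self, desired)
```

A fuller version could let the parent shrink sibling objects to make room, which the abstract also mentions.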
-
Speculative instruction wakeup to tolerate draining delay of memory ordering violation check buffers
Patent number: 11113065
Abstract: A technique for speculatively executing load-dependent instructions includes detecting that a memory ordering consistency queue is full for a completed load instruction. The technique also includes storing data loaded by the completed load instruction into a storage location for storing data when the memory ordering consistency queue is full. The technique further includes speculatively executing instructions that are dependent on the completed load instruction. The technique also includes in response to a slot becoming available in the memory ordering consistency queue, replaying the load instruction. The technique further includes in response to receiving loaded data for the replayed load instruction, testing for a data mis-speculation by comparing the loaded data for the replayed load instruction with the data loaded by the completed load instruction that is stored in the storage location.
Type: Grant
Filed: October 31, 2019
Date of Patent: September 7, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: John Kalamatianos, Susumu Mashimo, Krishnan V. Ramani, Scott Thomas Bingham
-
Publication number: 20210272354
Abstract: Improvements to graphics processing pipelines are disclosed. More specifically, the vertex shader stage, which performs vertex transformations, and the hull or geometry shader stages, are combined. If tessellation is disabled and geometry shading is enabled, then the graphics processing pipeline includes a combined vertex and graphics shader stage. If tessellation is enabled, then the graphics processing pipeline includes a combined vertex and hull shader stage. If tessellation and geometry shading are both disabled, then the graphics processing pipeline does not use a combined shader stage. The combined shader stages improve efficiency by reducing the number of executing instances of shader programs and associated resources reserved.
Type: Application
Filed: April 19, 2021
Publication date: September 2, 2021
Applicant: Advanced Micro Devices, Inc.
Inventors: Mangesh P. Nijasure, Randy W. Ramsey, Todd Martin
-
Patent number: 11108382
Abstract: An oscillator circuit includes a first oscillator, a second oscillator, and a calibration circuit to calibrate the first and second oscillators. The first oscillator is supplied with a first supply voltage, and the second oscillator is supplied with a second supply voltage. The calibration includes setting a frequency control of the second oscillator at a target frequency. Then, a voltage control of the second supply voltage is adjusted incrementally until a first control value is identified at which a second oscillator output frequency matches the target frequency. Then, a voltage control of the first supply voltage is set to the first control value. Then, the voltage control for the first supply voltage is adjusted incrementally until a second control value is identified at which a first oscillator output frequency is as close to the second oscillator output frequency as is achievable, but does not exceed it.
Type: Grant
Filed: September 24, 2020
Date of Patent: August 31, 2021
Assignees: ATI Technologies ULC, Advanced Micro Devices, Inc.
Inventors: Joyce Cheuk Wai Wong, Naeem Ibrahim Ally, Jonathan Hauke, Stephen Victor Kosonocky
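The two-stage calibration in patent 11108382 can be sketched as two incremental searches over a voltage control code. The sketch assumes that each oscillator's frequency rises monotonically with its control code and models the oscillators as plain functions (`freq1`, `freq2`); these, the tolerance parameter, and the integer code range are illustrative assumptions.

```python
from typing import Callable


def calibrate(freq1: Callable[[int], float], freq2: Callable[[int], float],
              target_hz: float, max_code: int, tol_hz: float = 0.0) -> tuple[int, int]:
    """Return (first_value, second_value) voltage control codes.

    Assumes frequency increases monotonically with the control code.
    """
    # Stage 1: step the second oscillator's voltage control until its
    # output matches the target frequency (within tol_hz).
    first_value = None
    for code in range(max_code + 1):
        if abs(freq2(code) - target_hz) <= tol_hz:
            first_value = code
            break
    if first_value is None:
        raise ValueError("target frequency not reachable")
    f2 = freq2(first_value)

    # Stage 2: start the first oscillator at first_value, then step until it
    # is as close to f2 as possible without exceeding it.
    code = first_value
    while code > 0 and freq1(code) > f2:
        code -= 1
    while code < max_code and freq1(code + 1) <= f2:
        code += 1
    return first_value, code
```

If the first oscillator exceeds the second even at code 0, this sketch simply returns 0; handling that case properly is left out.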
-
Patent number: 11106594
Abstract: Systems, apparatuses, and methods for generating a measurement of write memory bandwidth are disclosed. A control unit monitors writes to a cache hierarchy. If a write to a cache line is a first time that the cache line is being modified since entering the cache hierarchy, then the control unit increments a write memory bandwidth counter. Otherwise, if the write is to a cache line that has already been modified since entering the cache hierarchy, then the write memory bandwidth counter is not incremented. The first write to a cache line is a proxy for write memory bandwidth since this will eventually cause a write to memory. The control unit uses the value of the write memory bandwidth counter to generate a measurement of the write memory bandwidth. Also, the control unit can maintain multiple counters for different thread classes to calculate the write memory bandwidth per thread class.
Type: Grant
Filed: September 5, 2019
Date of Patent: August 31, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: Paul James Moyer, Douglas Benson Hunt
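The counting rule in patent 11106594 (count only the first modification of a cache line since it entered the hierarchy) can be modeled in a few lines. The class below is a hypothetical software model: the 64-byte line size, the `on_evict` hook, and the bytes-per-second conversion are assumptions added for illustration.

```python
class WriteBandwidthMonitor:
    """Counts first-time modifications of cache lines, per thread class."""

    def __init__(self, line_bytes: int = 64):
        self.line_bytes = line_bytes
        self.dirty = set()     # lines already modified since entering the cache
        self.counters = {}     # thread class -> number of first writes

    def on_write(self, addr: int, thread_class: str = "default") -> None:
        line = addr // self.line_bytes
        if line not in self.dirty:
            # First modification: this line will eventually be written to memory.
            self.dirty.add(line)
            self.counters[thread_class] = self.counters.get(thread_class, 0) + 1

    def on_evict(self, addr: int) -> None:
        # Once the line leaves the hierarchy, a later write counts again.
        self.dirty.discard(addr // self.line_bytes)

    def bandwidth(self, thread_class: str, interval_s: float) -> float:
        """Estimated write bandwidth in bytes per second over the interval."""
        return self.counters.get(thread_class, 0) * self.line_bytes / interval_s
```

Repeated writes to an already-dirty line leave the counter unchanged, which is what makes the counter a proxy for eventual memory traffic rather than for store instructions.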
-
Patent number: 11106596
Abstract: Methods, devices, and systems for determining an address in a physical memory which corresponds to a virtual address using a skewed-associative translation lookaside buffer (TLB) are described. A virtual address and a configuration indication are received using receiver circuitry. A physical address corresponding to the virtual address is output if a TLB hit occurs. A first subset of a plurality of ways of the TLB is configured to hold a first page size. The first subset includes a number of the ways based on the configuration indication. A physical address corresponding to the virtual address is retrieved from a page table if a TLB miss occurs, and at least a portion of the physical address is installed in a least recently used way of a subset of a plurality of ways of the TLB, determined according to a replacement policy based on the configuration indication.
Type: Grant
Filed: December 23, 2016
Date of Patent: August 31, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: John M. King, Michael T. Clark
-
Patent number: 11106600
Abstract: A processing system adjusts a cache replacement priority of cache lines at a cache based on evictions of entries mapping virtual-to-physical address translations from a translation lookaside buffer (TLB). Upon eviction of a TLB entry, the processing system identifies cache lines corresponding to the physical addresses of the evicted TLB entry and evicts the cache lines or adjusts the cache replacement priority of the cache lines so that their eviction from the cache will be accelerated.
Type: Grant
Filed: January 24, 2019
Date of Patent: August 31, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Gabriel H. Loh, Paul Moyer
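The priority adjustment in patent 11106600 can be illustrated with an LRU-ordered cache in which lines belonging to an evicted TLB entry's page are moved to the front of the eviction order. This is a sketch under assumptions: a 4 KiB page, an `OrderedDict` standing in for the replacement state (first key is evicted first), and a function name invented for the example.

```python
from collections import OrderedDict

PAGE_BYTES = 4096  # assumed page size


def on_tlb_evict(cache: OrderedDict, ppn: int) -> None:
    """Accelerate eviction of cache lines that fall in physical page `ppn`.

    `cache` maps line address -> data and is ordered from least recently
    used (first) to most recently used (last).
    """
    for line_addr in [a for a in cache if a // PAGE_BYTES == ppn]:
        # Move the line to the LRU end so it is chosen first for replacement.
        cache.move_to_end(line_addr, last=False)
```

The abstract also allows evicting the lines outright; in this model that would be a `del cache[line_addr]` instead of the reordering.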
-
Patent number: 11100004
Abstract: A processor uses the same virtual address space for heterogeneous processing units of the processor. The processor employs different sets of page tables for different types of processing units, such as a CPU and a GPU, wherein a memory management unit uses each set of page tables to translate virtual addresses of the virtual address space to corresponding physical addresses of memory modules associated with the processor. As data is migrated between memory modules, the physical addresses in the page tables can be updated to reflect the physical location of the data for each processing unit.
Type: Grant
Filed: June 23, 2015
Date of Patent: August 24, 2021
Assignees: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULC
Inventors: Gongxian Jeffrey Cheng, Mark Fowler, Philip J. Rogers, Benjamin T. Sander, Anthony Asaro, Mike Mantor, Raja Koduri
-
Patent number: 11100604
Abstract: Systems, apparatuses, and methods for scheduling jobs for multiple frame-based applications are disclosed. A computing system executes a plurality of frame-based applications for generating pixels for display. The applications convey signals to a scheduler to notify the scheduler of various events within a given frame being rendered. The scheduler adjusts the priorities of applications based on the signals received from the applications. The scheduler attempts to adjust priorities of applications and schedule jobs from these applications so as to minimize the perceived latency of each application. When an application has enqueued the last job for the current frame, the scheduler raises the priority of the application to high. This results in the scheduler attempting to schedule all remaining jobs for the application back-to-back. Once all jobs of the application have been completed, the priority of the application is reduced, permitting jobs of other applications to be executed.
Type: Grant
Filed: January 31, 2019
Date of Patent: August 24, 2021
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Jeffrey Gongxian Cheng, Ahmed M. Abdelkhalek, Yinan Jiang, Xingsheng Wan, Anthony Asaro, David Martinez Nieto
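The priority rule in patent 11100604 (raise an application to high priority once it has enqueued the last job of its frame, then lower it when its jobs are done) can be sketched with per-application queues. The class, the two priority levels, and the simplification that a job counts as completed when it is dispatched are assumptions made for this example.

```python
NORMAL, HIGH = 0, 1


class FrameScheduler:
    def __init__(self):
        self.queues = {}    # app name -> list of pending jobs
        self.priority = {}  # app name -> NORMAL or HIGH

    def enqueue(self, app, job, last_in_frame=False):
        self.queues.setdefault(app, []).append(job)
        self.priority.setdefault(app, NORMAL)
        if last_in_frame:
            # Signal from the application: the frame's job list is complete.
            self.priority[app] = HIGH

    def next_job(self):
        ready = [a for a, q in self.queues.items() if q]
        if not ready:
            return None
        # Highest priority wins; ties go to the app that registered first.
        app = max(ready, key=lambda a: self.priority[a])
        job = self.queues[app].pop(0)
        if not self.queues[app]:
            # Simplification: treat an empty queue as "all jobs completed".
            self.priority[app] = NORMAL
        return app, job
```

With this rule, an application that has finished submitting its frame drains back-to-back before other applications' jobs are picked.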
-
Patent number: 11099786
Abstract: A memory controller interfaces with a non-volatile storage class memory (SCM) module over a heterogeneous memory channel, and includes a command queue for receiving memory access commands. A memory interface queue is coupled to the command queue for holding outgoing commands. A non-volatile command queue is coupled to the command queue for storing non-volatile read commands that are placed in the memory interface queue. An arbiter selects entries from the command queue, and places them in the memory interface queue for transmission over a heterogeneous memory channel. A control circuit is coupled to the heterogeneous memory channel for receiving a ready response from the non-volatile SCM module indicating that responsive data is available for a non-volatile read command, and in response to receiving the ready response, causing a send command to be placed in the memory interface queue for commanding the non-volatile SCM module to send the responsive data.
Type: Grant
Filed: December 30, 2019
Date of Patent: August 24, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: James R. Magro, Kedarnath Balakrishnan
-
Patent number: 11099788
Abstract: An approach is provided for implementing near-memory data reduction during store operations to off-chip or off-die memory. A Near-Memory Reduction (NMR) unit provides near-memory data reduction during write operations to a specified address range. The NMR unit is configured with a range of addresses to be reduced and when a store operation specifies an address within the range of addresses, the NMR unit performs data reduction by adding the data value specified by the store operation to an accumulated reduction result. According to an embodiment, the NMR unit maintains a count of the number of updates to the accumulated reduction result that are used to determine when data reduction has been completed.
Type: Grant
Filed: October 21, 2019
Date of Patent: August 24, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Nuwan Jayasena, Shaizeen Aga
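The behavior described for the reduction unit in patent 11099788 (accumulate values stored to a configured address range and count the updates) maps onto a small software model. The class name, the constructor parameters, and the "expected number of updates" completion test below are assumptions for illustration.

```python
class NearMemoryReducer:
    """Software model of a reduction unit watching one address range."""

    def __init__(self, base: int, size: int, expected_updates: int):
        self.base = base
        self.limit = base + size
        self.expected_updates = expected_updates
        self.accumulated = 0
        self.updates = 0

    def store(self, addr: int, value: int) -> bool:
        """Absorb the store if it targets the range; return True if absorbed."""
        if not (self.base <= addr < self.limit):
            return False  # ordinary store, passes through to memory
        self.accumulated += value
        self.updates += 1
        return True

    @property
    def done(self) -> bool:
        # Assumed completion rule: a known number of contributions arrived.
        return self.updates >= self.expected_updates
```

Stores outside the range are not absorbed, so they neither change the accumulated result nor advance the update count.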
-
Apparatus and method for resynchronization prediction with variable upgrade and downgrade capability
Patent number: 11099846
Abstract: A method and apparatus generates control information that indicates whether to change a counter value associated with a particular load instruction. In response to the control information, the method and apparatus causes a hysteresis effect for operating between a speculative mode and a non-speculative mode based on the counter value. The hysteresis effect is in favor of the non-speculative mode. The method and apparatus causes the hysteresis effect by incrementing the counter value associated with the particular load instruction by a first value or decrementing the counter value by a second value. The first value is greater than the second value.
Type: Grant
Filed: June 20, 2018
Date of Patent: August 24, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: Krishnan V. Ramani, Chetana N. Keltcher
-
Publication number: 20210255871
Abstract: A technique for processing qubits in a quantum computing device is provided. The technique includes determining that, in a first cycle, a first quantum processing region is to perform a first quantum operation that does not use a qubit that is stored in the first quantum processing region, identifying a second quantum processing region that is to perform a second quantum operation at a second cycle that is later than the first cycle, wherein the second quantum operation uses the qubit, determining that between the first cycle and the second cycle, no quantum operations are performed in the second quantum processing region, and moving the qubit from the first quantum processing region to the second quantum processing region.
Type: Application
Filed: February 18, 2020
Publication date: August 19, 2021
Applicant: Advanced Micro Devices, Inc.
Inventors: Onur Kayiran, Jieming Yin, Yasuko Eckert
-
Patent number: 11095274
Abstract: A pre-discharged edge-triggered flip-flop, in which internal nodes determinative of an output signal are discharged to VSS prior to an evaluation phase of a clock signal, is provided to enable improved clock-to-output response times when provided with a rising edge of a clock pulse. In operation, during a pre-discharge phase of the clock signal, multiple internal nodes of a differential master latch circuit of the flip-flop are discharged to VSS. In response to a rising edge of the clock signal signaling the beginning of an evaluation phase, one of the internal nodes (selected depending on the logical value of an input signal to the flip-flop) is charged to VDD while other of the internal nodes remain at VSS. A single clock signal inverter is disposed between the input clock signal and a multiplexer providing the output data signal.
Type: Grant
Filed: September 25, 2020
Date of Patent: August 17, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Nur Mohammad Baksh, Michael Q. Co
-
Patent number: 11095910
Abstract: A system and method for scalable video coding that includes a base layer having lower resolution encoding, an enhanced layer having higher resolution encoding, and data transfer between the two layers. The system and method provide several techniques to reduce the bandwidth of inter-layer transfers while at the same time reducing memory requirements. Due to less memory access, the system clock frequency can be lowered so that system power consumption is lowered as well. The system avoids up-sampling prediction data passed from the base layer to the enhanced layer to match the resolution of the enhanced layer, as transferring up-sampled data can impose a big burden on memory bandwidth.
Type: Grant
Filed: December 6, 2019
Date of Patent: August 17, 2021
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Lei Zhang, Ji Zhou, Zhen Chen, Min Yu
-
Patent number: 11093676
Abstract: Methods for debugging a processor based on executing a randomly created and randomly executed executable on a fabricated processor. The executable may execute via startup firmware. By implementing randomization at multiple levels in the testing of the processor, coupled with highly specific test generation constraint rules, highly focused tests on a micro-architectural feature are implemented while at the same time applying a high degree of random permutation in the way it stresses that specific feature. This allows for the detection and diagnosis of errors and bugs in the processor that elude traditional testing methods. Once the errors and bugs are detected and diagnosed, the processor can then be redesigned to no longer produce the anomalies. By eliminating the errors and bugs in the processor, a processor with improved computational efficiency and reliability can be fabricated.
Type: Grant
Filed: December 20, 2019
Date of Patent: August 17, 2021
Assignee: Advanced Micro Devices, Inc.
Inventor: Eric W. Schieve
-
Patent number: 11093580
Abstract: A processor sequences the application of submatrices at a matrix multiplier to reduce the number of input changes at an input register of the matrix multiplier. The matrix multiplier is configured to perform a matrix multiplication for a relatively small matrix. To multiply two larger matrices, the GPU decomposes the larger matrices into smaller submatrices and stores the submatrices at input registers of the matrix multiplier in a sequence, thereby calculating each column of a result matrix. The GPU sequences the storage of the submatrices at the input registers to maintain input data at one of the input registers over multiple calculation cycles of the matrix multiplier, thereby reducing power consumption at the GPU.
Type: Grant
Filed: October 31, 2018
Date of Patent: August 17, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Maxim V. Kazakov, Jian Mao
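The effect of sequencing submatrices in patent 11093580 can be illustrated by counting how often each of two input registers has to be reloaded under different tile orderings. The model below is an assumption-laden sketch: two registers (one per operand), tiles identified by index pairs, and two hand-picked loop orders; it does not reproduce the sequence the patent actually claims.

```python
def tile_sequence(m_tiles: int, k_tiles: int, n_tiles: int, hold_a: bool):
    """List the (A tile, B tile) pairs needed for C = A x B, in issue order."""
    if hold_a:
        # Keep one A tile in its register while sweeping across B tiles.
        return [((i, k), (k, j))
                for i in range(m_tiles) for k in range(k_tiles) for j in range(n_tiles)]
    # Conventional order: finish each output tile before starting the next.
    return [((i, k), (k, j))
            for i in range(m_tiles) for j in range(n_tiles) for k in range(k_tiles)]


def count_loads(sequence) -> int:
    """Count input-register writes when a register reloads only on a change."""
    reg_a = reg_b = None
    loads = 0
    for a, b in sequence:
        if a != reg_a:
            reg_a, loads = a, loads + 1
        if b != reg_b:
            reg_b, loads = b, loads + 1
    return loads


held = tile_sequence(2, 2, 2, hold_a=True)
naive = tile_sequence(2, 2, 2, hold_a=False)
print(count_loads(held), count_loads(naive))  # 12 16
```

Both orderings issue the same set of tile products; only the order, and therefore the number of register changes, differs.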
-
Patent number: 11086809
Abstract: Data transfer acceleration includes receiving, by a data transfer accelerator in a first node of a plurality of nodes, from a second node of the plurality of nodes, a request for data in a second state, wherein the second node stores an instance of the data in a first state; generating a message including one or more operations to transform the data from the first state to the second state; and sending the message to the second node in response to the request.
Type: Grant
Filed: November 25, 2019
Date of Patent: August 10, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventor: Anthony Gutierrez