Patents Assigned to Advanced Micro Devices, Inc.
-
Patent number: 11841803
Abstract: A chiplet system includes a central processing unit (CPU) communicably coupled to a first GPU chiplet of a GPU chiplet array. The GPU chiplet array includes the first GPU chiplet communicably coupled to the CPU via a bus and a second GPU chiplet communicably coupled to the first GPU chiplet via a passive crosslink. The passive crosslink is a passive interposer die dedicated for inter-chiplet communications and partitions systems-on-a-chip (SoC) functionality into smaller functional chiplet groupings.
Type: Grant
Filed: June 28, 2019
Date of Patent: December 12, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Skyler J. Saleh, Samuel Naffziger, Milind S. Bhagavat, Rahul Agarwal
-
Patent number: 11842200
Abstract: An apparatus includes a plurality of load buses and a load store unit that includes a plurality of load ports to access the plurality of load buses. The load store unit performs a gather operation to concurrently gather a plurality of subsets of data from a memory via the plurality of load buses in a first mode. The apparatus also includes a register that is partitioned into a plurality of portions to hold the plurality of subsets of data provided by the load store unit. The load store unit ignores exceptions or faults while performing the gather operation in the first mode and transitions to a second mode in response to an exception or fault. Two lanes are dispatched to concurrently perform the gather operation per clock cycle in the first mode and a single lane is dispatched to perform the gather operation per clock cycle in the second mode.
Type: Grant
Filed: September 27, 2019
Date of Patent: December 12, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: John M. King, Magiting Talisayon, Michael Estlick
-
Publication number: 20230393995
Abstract: Described is a method and apparatus for application migration between a dockable device and a docking station in a seamless manner. The dockable device includes a processor and the docking station includes a high-performance processor. The method includes executing at least one application in the dockable device using a first processor, and initiating an application migration for the at least one application from the first processor to a second processor in a docking station responsive to determining that the dockable device is in a docked state, wherein the at least one application continues to execute during the application migration from the first processor to the second processor.
Type: Application
Filed: August 11, 2023
Publication date: December 7, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Jonathan Lawrence Campbell, Yuping Shen
-
Patent number: 11836549
Abstract: Computer-implemented techniques for fast block-based parallel message passing interface (MPI) transpose are disclosed. The techniques achieve an in-place parallel matrix transpose of an input matrix in a distributed-memory multiprocessor environment with reduced consumption of computer processing time and storage media resources. An in-memory copy of the input matrix or a submatrix thereof to use as the send buffer for MPI send operations is not needed. Instead, by dividing the input matrix in-place into data blocks having up to at most a predetermined size and sending the corresponding data block(s) for a given submatrix using an MPI API before receiving any data block(s) for the given submatrix using an MPI API in the place of the sent data block(s), making the in-memory copy to use a send buffer can be avoided and yet the input matrix can be transposed in-place.
Type: Grant
Filed: October 15, 2020
Date of Patent: December 5, 2023
Assignee: Advanced Micro Devices, Inc.
Inventor: Samantray Biplab Raut
-
Patent number: 11836031
Abstract: Systems, apparatuses, and methods for performing a software override of a power estimation mechanism are disclosed. A computing system includes a plurality of tuned parameters for generating an estimate of power consumption. The tuned parameters are generated based on post-silicon characterization of the system. After deployment, the system executes a plurality of different applications. When launching a particular application, the system loads a corresponding set of override parameters which are used to replace the plurality of tuned parameters. The system generates an estimate of power consumption using the set of override parameters rather than the previously determined tuned parameters. Then while executing the particular application, the system makes adjustments to power and frequency values for the various system components based on the estimate of power consumption.
Type: Grant
Filed: November 10, 2020
Date of Patent: December 5, 2023
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Jonathan David Hauke, Adam Clark
-
Patent number: 11836091
Abstract: A processor supports secure memory access in a virtualized computing environment by employing requestor identifiers at bus devices (such as a graphics processing unit) to identify the virtual machine associated with each memory access request. The virtualized computing environment uses the requestor identifiers to control access to different regions of system memory, ensuring that each VM accesses only those regions of memory that the VM is allowed to access. The virtualized computing environment thereby supports efficient memory access by the bus devices while ensuring that the different regions of memory are protected from unauthorized access.
Type: Grant
Filed: October 31, 2018
Date of Patent: December 5, 2023
Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC
Inventors: Anthony Asaro, Jeffrey G. Cheng, Anirudh R. Acharya
-
Patent number: 11836085
Abstract: Techniques for performing cache operations are provided. The techniques include, recording an entry indicating that a cache line is exclusive-upgradeable; removing the cache line from a cache; and converting a request to insert the cache line into the cache into a request to insert the cache line in the cache in an exclusive state.
Type: Grant
Filed: October 29, 2021
Date of Patent: December 5, 2023
Assignee: Advanced Micro Devices, Inc.
Inventor: Paul J. Moyer
-
Patent number: 11836610
Abstract: An artificial neural network that includes first subnetworks to implement known functions and second subnetworks to implement unknown functions is trained. The first subnetworks are trained separately and in parallel on corresponding known training datasets to determine first parameter values that define the first subnetworks. The first subnetworks are executing on a plurality of processing elements in a processing system. Input values from a network training data set are provided to the artificial neural network including the trained first subnetworks. Error values are generated by comparing output values produced by the artificial neural network to labeled output values of the network training data set. The second subnetworks are trained by back propagating the error values to modify second parameter values that define the second subnetworks without modifying the first parameter values. The first and second parameter values are stored in a storage component.
Type: Grant
Filed: December 13, 2017
Date of Patent: December 5, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Dmitri Yudanov, Nicholas Penha Malaya
-
Patent number: 11835988
Abstract: A system and method for load fusion fuses small load operations into fewer, larger load operations. The system detects that a pair of adjacent operations are consecutive load operations, where the adjacent micro-operations refers to micro-operations flowing through adjacent dispatch slots and the consecutive load micro-operations refers to both of the adjacent micro-operations being load micro-operations. The consecutive load operations are then reviewed to determine if the data sizes are the same and if the load operation addresses are consecutive. The two load operations are then fused together to form one load micro-operation with twice the data size and one load data micro-operation with no load component.
Type: Grant
Filed: December 1, 2017
Date of Patent: December 5, 2023
Assignee: Advanced Micro Devices, Inc.
Inventor: John M. King
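The fusion check described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `MicroOp` record, the `try_fuse` helper, and the assumption that both load addresses are already known when the check runs are hypothetical choices made for illustration, and only the ascending-address case is handled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MicroOp:
    """Hypothetical micro-operation record used only for this sketch."""
    kind: str             # "load", "load_data", or any other operation kind
    addr: Optional[int]   # byte address for loads, None when there is no load component
    size: int             # data size in bytes


def try_fuse(a: MicroOp, b: MicroOp) -> Optional[Tuple[MicroOp, MicroOp]]:
    """Fuse two adjacent micro-ops if they are same-size loads at consecutive addresses.

    Returns (fused_load, load_data) on success, or None when the pair does not qualify.
    """
    # Both adjacent micro-ops must be loads.
    if a.kind != "load" or b.kind != "load":
        return None
    # The data sizes must match.
    if a.size != b.size:
        return None
    # The second load must start where the first one ends (ascending order only).
    if b.addr != a.addr + a.size:
        return None
    fused_load = MicroOp("load", a.addr, 2 * a.size)   # one load with twice the data size
    load_data = MicroOp("load_data", None, b.size)     # second op keeps no load component
    return fused_load, load_data
```

For example, two 8-byte loads at 0x100 and 0x108 would become one 16-byte load at 0x100 plus a load-data micro-op, while loads at 0x100 and 0x110 would be left unfused.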
-
Patent number: 11836088
Abstract: Guided cache replacement is described. In accordance with the described techniques, a request to access a cache is received, and a cache replacement policy which controls loading data into the cache is accessed. The cache replacement policy includes a tree structure having nodes corresponding to cachelines of the cache and a traversal algorithm controlling traversal of the tree structure to select one of the cachelines. Traversal of the tree structure is guided using the traversal algorithm to select a cacheline to allocate to the request. The guided traversal modifies at least one decision of the traversal algorithm to avoid selection of a non-replaceable cacheline.
Type: Grant
Filed: December 21, 2021
Date of Patent: December 5, 2023
Assignee: Advanced Micro Devices, Inc.
Inventor: Jeffrey Christopher Allan
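One way to picture guided traversal is with a tree pseudo-LRU policy, which is an assumption here: the abstract names only a tree structure and a traversal algorithm. In the sketch below, the `select_victim` function, the heap-ordered `bits` list, and the per-way `replaceable` flags are all hypothetical names, and updating the tree bits after an access is left out.

```python
from typing import List, Optional


def select_victim(bits: List[int], replaceable: List[bool]) -> Optional[int]:
    """Walk a binary decision tree to pick a cacheline (way) to replace.

    bits: one bit per internal node in heap order (root at index 0);
          0 means "go left", 1 means "go right". Length must be ways - 1.
    replaceable: one flag per way; False marks a non-replaceable cacheline.
    Returns the selected way, or None if no way is replaceable.
    """
    ways = len(replaceable)
    # This sketch assumes a power-of-two number of ways.
    assert ways >= 1 and ways & (ways - 1) == 0
    assert len(bits) == ways - 1
    if not any(replaceable):
        return None

    node, lo, hi = 0, 0, ways  # current node covers ways in [lo, hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        go_right = bits[node] == 1
        # Guided step: override the tree's decision when the chosen
        # subtree contains only non-replaceable cachelines.
        if go_right and not any(replaceable[mid:hi]):
            go_right = False
        elif not go_right and not any(replaceable[lo:mid]):
            go_right = True
        if go_right:
            node, lo = 2 * node + 2, mid
        else:
            node, hi = 2 * node + 1, mid
    return lo
```

With all bits pointing left and way 0 marked non-replaceable, the unguided walk would pick way 0, while this walk flips the last decision and picks way 1 instead.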
-
Publication number: 20230384855
Abstract: Systems and methods are disclosed for reducing power consumed by capturing data from an I/O device. Techniques disclosed include receiving descriptors, by a controller of an I/O host of a system, including information associated with respective data chunks to be captured from an I/O device buffer of the I/O device. Techniques disclosed further include capturing, based on the descriptors, the data chunks. The capturing comprises pulling the data chunks from the I/O device buffer at a pulling rate, where the data chunks are transferred to a local buffer of the I/O host, and pushing segments of the pulled data chunks from the local buffer, where each segment is transferred to a data buffer of the system after a respective target time that precedes a time at which the data chunks in the segment are to be processed by an application executing on the system.
Type: Application
Filed: May 25, 2022
Publication date: November 30, 2023
Applicant: Advanced Micro Devices, Inc.
Inventor: Raul Gutierrez
-
Publication number: 20230386593
Abstract: The disclosed method may include detecting, by a control circuit coupled to a first read only memory (ROM) device and a second ROM device, a failure of a first output signal from the first ROM device to a common output. The first ROM device is connected to the common output and the second ROM device is disconnected from the common output. The method also includes switching, by the control circuit in response to detecting the failure, the common output from the first ROM device to the second ROM device. Various other methods, systems, and computer-readable media are also disclosed.
Type: Application
Filed: June 30, 2022
Publication date: November 30, 2023
Applicant: Advanced Micro Devices, Inc.
Inventor: Cai YongFeng
-
Patent number: 11829196
Abstract: An integrated circuit (IC) device includes a ring transport having a plurality of nodes and a wire interconnect coupling the plurality of nodes in a ring. The wire interconnect including a wire to transmit clock wake signals around the ring transport in advance of data signaling representing a data packet. Each node is to switch from a clock gated state to a clocked state responsive to receiving a clock wake signal. The ring transport further includes a sleep controller coupled to a select node of the plurality of nodes. The sleep controller is to configure the select node into a clock suppression state for a specified duration responsive to identifying an idle condition on the ring transport via monitoring of the wire. While in the clock suppression state the node suppresses further transmission of any clock wake signals received at the select node.
Type: Grant
Filed: October 22, 2019
Date of Patent: November 28, 2023
Assignee: Advanced Micro Devices, Inc.
Inventor: William L. Walker
-
Patent number: 11829190
Abstract: Data routing for efficient decompressor use is described. In accordance with the described techniques, a cache controller receives requests from multiple requestors for elements of data stored in a compressed format in a cache. The requests include at least a first request from a first requestor and a second request from a second requestor. A decompression routing system identifies a redundant element of data requested by both the first requestor and the second requestor and causes decompressors to decompress the requested elements of data. The decompression includes performing a single decompression of the redundant element. After the decompression, the decompression routing system routes the decompressed elements to the plurality of requestors, which includes routing the decompressed redundant element to both the first requestor and the second requestor.
Type: Grant
Filed: December 21, 2021
Date of Patent: November 28, 2023
Assignee: Advanced Micro Devices, Inc.
Inventor: Jeffrey Christopher Allan
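The idea of decompressing a shared element once and routing it to every requestor can be sketched in software. This is only an analogy for the hardware described: `serve_requests` is a hypothetical helper, the cache is modeled as a dictionary, and zlib stands in for whatever compression format the cache actually uses.

```python
import zlib
from typing import Dict, List, Tuple


def serve_requests(
    cache: Dict[str, bytes],
    requests: Dict[str, List[str]],
) -> Tuple[Dict[str, List[bytes]], int]:
    """Decompress each requested element once, then route results to requestors.

    cache: element id -> compressed bytes (zlib is an assumption of this sketch).
    requests: requestor id -> list of element ids it asked for.
    Returns (routed data per requestor, number of decompressions performed).
    """
    # Collect the distinct element ids across all requestors, so an element
    # requested by several requestors appears only once.
    unique_ids: List[str] = []
    for ids in requests.values():
        for element_id in ids:
            if element_id not in unique_ids:
                unique_ids.append(element_id)

    # Perform a single decompression per distinct element.
    decompressed = {eid: zlib.decompress(cache[eid]) for eid in unique_ids}

    # Route the decompressed elements back to every requestor that asked for them.
    routed = {req: [decompressed[eid] for eid in ids] for req, ids in requests.items()}
    return routed, len(decompressed)
```

If requestor `r1` asks for elements `a` and `b` and requestor `r2` asks for `b` and `c`, four elements are delivered but only three decompressions are performed, because `b` is decompressed once and routed to both.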
-
Patent number: 11829222
Abstract: A system and method for updating power supply voltages due to variations from aging are described. A functional unit includes a power supply monitor capable of measuring power supply variations in a region of the functional unit. An age counter measures an age of the functional unit. A control unit notifies the power supply monitor to measure an operating voltage reference. When the control unit receives a measured operating voltage reference, the control unit determines an updated age of the region different from the current age based on the measured operating voltage reference. The control unit updates the age counter with the corresponding age, which is younger than the previous age in some cases due to the region not experiencing predicted stress and aging. The control unit is capable of determining a voltage adjustment for the operating voltage reference based on an age indicated by the age counter.
Type: Grant
Filed: December 18, 2020
Date of Patent: November 28, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Sriram Sambamurthy, Sriram Sundaram, Indrani Paul, Larry David Hewitt, Anil Harwani, Aaron Joseph Grenat, Dana Glenn Lewis, Leonardo Piga, Wonje Choi, Karthik Rao
-
Patent number: 11831565
Abstract: Systems, apparatuses, and methods for performing efficient data transfer in a computing system are disclosed. A computing system includes multiple fabric interfaces in clients and a fabric. A packet transmitter in the fabric interface includes multiple queues, each for storing packets of a respective type, and a corresponding address history cache for each queue. Queue arbiters in the packet transmitter select candidate packets for issue and determine when address history caches on both sides of the link store the upper portion of the address. The packet transmitter sends a source identifier and a pointer for the request in the packet on the link, rather than the entire request address, which reduces the size of the packet. The queue arbiters support out-of-order issue from the queues. The queue arbiters detect conflicts with out-of-order issue and adjust the outbound packets and fields stored in the queue entries to avoid data corruption.
Type: Grant
Filed: October 3, 2018
Date of Patent: November 28, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Greggory D. Donley, Bryan P. Broussard
-
Publication number: 20230376420
Abstract: A method includes recording a first set of consecutive memory access deltas, where each of the consecutive memory access deltas represents a difference between two memory addresses accessed by an application, updating values in a prefetch training table based on the first set of memory access deltas, and predicting one or more memory addresses for prefetching responsive to a second set of consecutive memory access deltas and based on values in the prefetch training table.
Type: Application
Filed: April 19, 2023
Publication date: November 23, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Susumu Mashimo, John Kalamatianos
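A delta-based prefetcher of this general kind can be sketched as a table that maps a recent pattern of address deltas to the delta most often seen next. The `DeltaPrefetcher` class, its `history` parameter, and the choice of simple occurrence counts as the table values are assumptions made for illustration; the abstract does not say what the training table stores.

```python
from collections import Counter, defaultdict, deque
from typing import Deque, Dict, Optional, Tuple


class DeltaPrefetcher:
    """Hypothetical delta-pattern prefetcher used only for this sketch."""

    def __init__(self, history: int = 2) -> None:
        self.history = history
        # Training table: tuple of recent deltas -> counts of the delta that followed.
        self.table: Dict[Tuple[int, ...], Counter] = defaultdict(Counter)
        self.deltas: Deque[int] = deque(maxlen=history)
        self.last_addr: Optional[int] = None

    def access(self, addr: int) -> Optional[int]:
        """Record one memory access; return a predicted prefetch address or None."""
        prediction = None
        if self.last_addr is not None:
            delta = addr - self.last_addr
            # Train: the previous full delta pattern was followed by this delta.
            if len(self.deltas) == self.history:
                self.table[tuple(self.deltas)][delta] += 1
            self.deltas.append(delta)
            # Predict: look up the updated pattern and use its most common successor.
            if len(self.deltas) == self.history:
                counts = self.table.get(tuple(self.deltas))
                if counts:
                    next_delta, _ = counts.most_common(1)[0]
                    prediction = addr + next_delta
        self.last_addr = addr
        return prediction
```

Feeding the addresses 0, 8, 16, 24, 32 into a prefetcher with `history=2` yields no prediction for the first three accesses, then predictions of 32 and 40 once the pattern (8, 8) has been seen to be followed by another delta of 8.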
-
Publication number: 20230377086
Abstract: A technique for rendering is provided. The technique includes for a set of primitives processed in a coarse binning pass, outputting early draw data to an early draw buffer; while processing the set of primitives in the coarse binning pass, processing the early draw data in a fine binning pass; and processing remaining primitives of the set of primitives in the fine binning pass.
Type: Application
Filed: December 13, 2022
Publication date: November 23, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Michael John Livesley, Ruijin Wu
-
Patent number: 11822479
Abstract: Techniques for performing cache operations are provided. The techniques include recording an indication that providing exclusive access of a first cache line to a first processor is deemed problematic; detecting speculative execution of a store instruction by the first processor to the first cache line; and in response to the detecting, refusing to provide exclusive access of the first cache line to the first processor, based on the indication.
Type: Grant
Filed: October 29, 2021
Date of Patent: November 21, 2023
Assignee: Advanced Micro Devices, Inc.
Inventor: Paul J. Moyer
-
Patent number: 11822923
Abstract: A load/store unit includes a first queue including a first entry for a store operation and a second queue including a second entry for a load operation that includes a return instruction that redirects a program flow to a location indicated by the return instruction. The load/store unit also includes a processor to determine that the store operation matches the load operation and selectively perform store-to-load forwarding (STLF) of a return address for the return instruction from the first entry to the second entry based on whether the store operation is associated with a call instruction. The load/store unit forwards the return address to the second entry in response to the store operation being associated with the call instruction. The load/store unit blocks forwarding until the store operation retires in response to the store operation not being associated with the call instruction.
Type: Grant
Filed: June 25, 2019
Date of Patent: November 21, 2023
Assignee: Advanced Micro Devices, Inc.
Inventor: David Kaplan