Patents Assigned to Advanced Micro Devices
-
Patent number: 11237928Abstract: A method includes reserving memory capacity in a first memory device as patch memory region for backing faulted memory, receiving a memory error indication indicating an uncorrectable error in a faulted segment in a second memory device and, in response to the memory error indication, associating in a remapping table the faulted segment with a patch segment in the patch memory region. The faulted segment is smaller than a memory page size of the second memory device. The method also includes, in response to receiving a memory access request directed to the faulted memory segment, servicing the memory access request from the patch segment by querying the remapping table to determine a patch segment address corresponding to a requested memory address, where the patch segment address identifies the location of the patch segment, and based on the patch segment address, performing the requested memory access at the patch segment.Type: GrantFiled: December 2, 2019Date of Patent: February 1, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Sergey Blagodurov, Michael Ignatowski, Vilas Sridharan
-
Patent number: 11237972Abstract: A method and apparatus physically partitions clean and dirty cache lines into separate memory partitions, such as one or more banks, so that during low power operation, a cache memory controller reduces power consumption of the cache memory containing the clean only data. The cache memory controller controls a refresh operation so that a data refresh does not occur for the clean data only banks or the refresh rate is reduced for the clean data only banks. Partitions that store dirty data can also store clean data; however, other partitions are designated for storing only clean data so that the partitions can have their refresh rate reduced or refresh stopped for periods of time. When multiple DRAM dies or packages are employed, the partition can occur on a die or package level as opposed to a bank level within a die.Type: GrantFiled: December 29, 2017Date of Patent: February 1, 2022Assignee: Advanced Micro Devices, Inc.Inventor: David A. Roberts
-
Publication number: 20220028450Abstract: A method for performing stutter of dynamic random access memory (DRAM) where a system on a chip (SOC) initiates bursts of requests to the DRAM to fill buffers to allow the DRAM to self-refresh is disclosed. The method includes issuing, by a system management unit (SMU), a ForceZQCal command to the memory controller to initiate the stutter procedure in response to receiving a timeout request, such as an SMU ZQCal timeout request, periodically issuing a power platform threshold (PPT) request, by the SMU, to the memory controller, and sending a ForceZQCal command prior to a PPT request to ensure re-training occurs after ZQ Calibration. The ForceZQCal command issued prior to PPT request may reduce the latency of the stutter. The method may further include issuing a ForceZQCal command prior to each periodic re-training.Type: ApplicationFiled: July 24, 2020Publication date: January 27, 2022Applicant: Advanced Micro Devices, Inc.Inventors: Jing Wang, Kedarnath Balakrishnan, Kevin M. Brandl, James R. Magro
-
Publication number: 20220027674Abstract: A generator for generating artificial data, and training for the same. Data corresponding to a first label is altered within a reference labeled data set. A discriminator is trained based on the reference labeled data set to create a selectively poisoned discriminator. A generator is trained based on the selectively poisoned discriminator to create a selectively poisoned generator. The selectively poisoned generator is tested for the first label and tested for the second label to determine whether the generator is sufficiently poisoned for the first label and sufficiently accurate for the second label. If it is not, the generator is retrained based on the data set including the further altered data. The generator includes a first ANN to input first information and output a set of artificial data that is classifiable using a first label and not classifiable using a second label of the set of labeled data.Type: ApplicationFiled: August 9, 2021Publication date: January 27, 2022Applicant: Advanced Micro Devices, Inc.Inventor: Nicholas Malaya
-
Patent number: 11232622Abstract: An apparatus includes a command buffer configured to temporarily store commands. The apparatus also includes processing units disposed at a substrate. The processing units are configured to access a plurality of copies of a command from the command buffer. The processing units include first processing units (such as fixed function hardware blocks) to perform geometry operations indicated by the command on a set of primitives. The geometry operations are performed concurrently by the first processing units. The processing units also include second processing units (such as shaders) to process mutually exclusive sets of pixels generated by rasterizing the set of primitives. The apparatus also includes a cache to temporarily store the pixels after shading by the shaders. The processing units stop or interrupt processing commands in response to detecting a synchronization point and resume processing the commands in response to all the processing units completing commands before synchronization point.Type: GrantFiled: November 27, 2019Date of Patent: January 25, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Skyler J. Saleh, Ruijin Wu
-
Patent number: 11232039Abstract: Systems, apparatuses, and methods for efficiently performing memory accesses in a computing system are disclosed. A computing system includes one or more clients, a communication fabric and a last-level cache implemented with low latency, high bandwidth memory. The cache controller for the last-level cache determines a range of addresses corresponding to a first region of system memory with a copy of data stored in a second region of the last-level cache. The cache controller sends a selected memory access request to system memory when the cache controller determines a request address of the memory access request is not within the range of addresses. The cache controller services the selected memory request by accessing data from the last-level cache when the cache controller determines the request address is within the range of addresses.Type: GrantFiled: December 10, 2018Date of Patent: January 25, 2022Assignee: Advanced Micro Devices, Inc.Inventor: Gabriel H. Loh
-
Patent number: 11231931Abstract: A processor includes a first core and a second core to execute computer instructions. Each of the cores includes its own private memory cache and speculative load queue. The speculative load queue stores cachelines for the computer instructions and data when the core is operating in a speculative state with respect to a process or thread. The processor includes a state tracking buffer having a state field to store a speculative exclusive ownership state for each cacheline in the speculative load queue when present therein.Type: GrantFiled: December 20, 2018Date of Patent: January 25, 2022Assignee: ADVANCED MICRO DEVICES, INC.Inventor: Sooraj Puthoor
-
Patent number: 11233510Abstract: Systems, apparatuses, and methods for efficiently performing operations system are disclosed. A computing system uses a memory for storing data, and one or more processing units. The memory includes multiple rows for storing the data with each intersection of a row and a column being a memory bit cell. The memory processes operations. For particular operations, the two or more operands are accessed simultaneously for generating a result without being read out and stored. Two indications are generated specifying at least a first row and a second row targeted by the operation. The memory generates a result by performing the operation for each of the one or more cells in the first row a stored value with a respective stored value in the one or more cells in the second row.Type: GrantFiled: April 27, 2018Date of Patent: January 25, 2022Assignee: Advanced Micro Devices, Inc.Inventors: John J. Wuu, Edward Chang
-
Patent number: 11231962Abstract: With the success of programming models such as OpenCL and CUDA, heterogeneous computing platforms are becoming mainstream. However, these heterogeneous systems are low-level, not composable, and their behavior is often implementation defined even for standardized programming models. In contrast, the method and system embodiments for the heterogeneous parallel primitives (HPP) programming model disclosed herein provide a flexible and composable programming platform that guarantees behavior even in the case of developing high-performance code.Type: GrantFiled: October 30, 2017Date of Patent: January 25, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Benedict R. Gaster, Lee W. Howes
-
Patent number: 11232847Abstract: Methods, devices, and systems for testing a number of combinations of memory in a computer system. A modular memory device is installed in a memory channel in communication with a processor. The modular memory device includes a number of memory storage devices. The number of memory storage devices include a number of pins. A subset of the number of memory storage devices is selected. A subset of the plurality of pins which do not correspond to the subset of the number of memory storage devices and are not part of a memory map of the computer system is selected. Each pin of the subset of the plurality of pins configured with a termination impedance. The subset of the number of memory storage devices is tested.Type: GrantFiled: September 20, 2019Date of Patent: January 25, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Glennis Eliagh Covington, Benjamin Lyle Winston, Santha Kumar Parameswaran, Shannon T. Kesner
-
Patent number: 11226819Abstract: A processing unit includes a plurality of processing elements and one or more caches. A first thread executes a program that includes one or more prefetch instructions to prefetch information into a first cache. Prefetching is selectively enabled when executing the first thread on a first processing element dependent upon whether one or more second threads previously executed the program on the first processing element. The first thread is then dispatched to execute the program on the first processing element. In some cases, a dispatcher receives the first thread four dispatching to the first processing element. The dispatcher modifies the prefetch instruction to disable prefetching into the first cache in response to the one or more second threads having previously executed the program on the first processing element.Type: GrantFiled: November 20, 2017Date of Patent: January 18, 2022Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Brian Emberling, Michael Mantor
-
Patent number: 11227651Abstract: A read path for reading data from a memory includes a sense amplifier having data (SAT) and data complement (SAC) output nodes and a latch. The latch includes an input tri-state inverter including first and second PMOS transistors connected between VDD and an intermediate node, and first and second NMOS transistors connected between VSS and the intermediate node. A gate connection of the first PMOS and NMOS transistors is connected to the SAT node; a gate connection of the second PMOS transistor is connected to a sense amplifier enable complement input; and a gate connection of the second NMOS transistor is connected to a sense amplifier enable input. The latch also includes an output driver with an input connected to the intermediate node and an output connected to a data output node. The latch thus has two gate delays between the SAT node and the data output node.Type: GrantFiled: November 22, 2019Date of Patent: January 18, 2022Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Arijit Banerjee, Russell Schreiber, Kyle Whittle
-
Patent number: 11226900Abstract: An approach for tracking data stored in caches uses a Bloom filter to reduce the number of addresses that need to be tracked by a coherence directory. When a requested address is determined to not be currently tracked by either the coherence directory or the Bloom filter, tracking of the address is initiated in the Bloom filter, but not in the coherence directory. Initiating tracking of the address in the Bloom filter includes setting hash bits in the Bloom filter so that subsequent requests for the address will “hit” the Bloom filter. When a requested address is determined to be tracked by the coherence directory, the Bloom filter is not used to track the address.Type: GrantFiled: January 29, 2020Date of Patent: January 18, 2022Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Weon Taek Na, Yasuko Eckert, Mark H. Oskin, Gabriel H. Loh, William Louie Walker, Michael Warren Boyer
-
Patent number: 11227214Abstract: Systems, apparatuses, and methods for implementing memory bandwidth reduction techniques for low power convolutional neural network inference applications are disclosed. A system includes at least a processing unit and an external memory coupled to the processing unit. The system detects a request to perform a convolution operation on input data from a plurality of channels. Responsive to detecting the request, the system partitions the input data from the plurality of channels into 3D blocks so as to minimize the external memory bandwidth utilization for the convolution operation being performed. Next, the system loads a selected 3D block from external memory into internal memory and then generates convolution output data for the selected 3D block for one or more features. Then, for each feature, the system adds convolution output data together across channels prior to writing the convolution output data to the external memory.Type: GrantFiled: November 14, 2017Date of Patent: January 18, 2022Assignees: Advanced Micro Devices, Inc., ATI Technologies ULCInventors: Sateesh Lagudu, Lei Zhang, Allen Rush
-
Publication number: 20220012933Abstract: Techniques for performing shader operations are provided. The techniques include, performing pixel shading at a shading rate defined by pixel shader variable rate shading (“VRS”) data, and updating the pixel VRS data that indicates one or more shading rates for one or more tiles based on whether the tiles of the one or more tiles include triangle edges or do not include triangle edges, to generate updated VRS data.Type: ApplicationFiled: September 23, 2021Publication date: January 13, 2022Applicant: Advanced Micro Devices, Inc.Inventors: Skyler Jonathon Saleh, Vineet Goel, Pazhani Pillai, Ruijin Wu, Christopher J. Brennan, Andrew S. Pomianowski
-
Patent number: 11221772Abstract: A system includes a memory system comprising a memory module and a processor adapted to access the memory module using a memory controller that includes a controller having an input for receiving a power state change request signal and an output for providing memory operations, and a memory operation array comprising a plurality of entries. Each entry includes a plurality of encoded fields. The memory operation array is programmable to store different sequences of commands for particular types of memory of a plurality of types of memory in the plurality of entries that initiate entry into and exit from supported low power modes for the particular types of memory. The controller is responsive to an activation of the power state change request signal to access the memory operation array to fetch at least one entry, and to issue at least one memory operation indicated by the at least one entry.Type: GrantFiled: January 7, 2019Date of Patent: January 11, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Kevin M. Brandl, Thomas H. Hamilton
-
Patent number: 11222685Abstract: A memory controller interfaces with a dynamic random access memory (DRAM) over a memory channel. A refresh control circuit monitors an activate counter which counts a rolling number of activate commands sent over the memory channel to a memory region of the DRAM. In response to the activate counter being above an intermediate management threshold value, the refresh control circuit only issue a refresh management (RFM) command if there is no REF command currently held at the refresh command circuit for the memory region.Type: GrantFiled: May 15, 2020Date of Patent: January 11, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Kevin M. Brandl, Kedarnath Balakrishnan, Jing Wang, Guanhao Shen
-
Patent number: 11223575Abstract: Systems, apparatuses, and methods for efficient data transfer in a computing system are disclosed. A source generates packets to send across a communication fabric (or fabric) to a destination. The source generates partition enable signals for the partitions of payload data. The source negates an enable signal for a particular partition when the source determines the packet type indicates the particular partition should have an associated asserted enable signal in the packet, but the source also determines the particular partition includes a particular data pattern. Routing components of the fabric disable clock signals to storage elements assigned to store the particular partition. The destination inserts the particular data pattern for the particular partition in the payload data.Type: GrantFiled: December 23, 2019Date of Patent: January 11, 2022Assignee: Advanced Micro Devices, Inc.Inventors: Greggory D. Donley, Vydhyanathan Kalyanasundharam, Mark A. Silla, Ashwin Chincholi
-
Patent number: 11221902Abstract: Error handling for resilient software includes: receiving data indicating a region of resilient memory; detecting an error associated with a region of memory; and preventing raising an exception for the error in response to the region of memory falling within the region of resilient memory by preventing the region of memory as being identified as including the error.Type: GrantFiled: December 16, 2019Date of Patent: January 11, 2022Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Sudhanva Gurumurthi, Vilas Sridharan
-
Patent number: 11216378Abstract: The techniques described herein improve cache traffic performance in the context of contended lock instructions. More specifically, each core maintains a lock address contention table that stores addresses corresponding to contended lock instructions. The lock address contention table also includes a state value that indicates progress through a series of states meant to track whether a load by the core in a spin-loop associated with semaphore acquisition has obtained the semaphore in an exclusive state. Upon detecting that a load in a spin-loop has obtained the semaphore in an exclusive state, the core responds to incoming requests for access to the semaphore with negative acknowledgments. This allows the core to maintain the semaphore cache line in an exclusive state, which allows it to acquire the semaphore faster and to avoid transmitting that cache line to other cores unnecessarily.Type: GrantFiled: September 19, 2016Date of Patent: January 4, 2022Assignee: Advanced Micro Devices, Inc.Inventors: John M. King, Gregory W. Smaus