Patents Assigned to Advanced Micro Devices
  • Publication number: 20230069890
    Abstract: An accelerated processing device is provided which comprises a plurality of compute units each including a plurality of SIMD units, and each SIMD unit comprises a register file. The accelerated processing device also comprises LDS in communication with each of the SIMD units. The accelerated processing device also comprises a first portion of cache memory, in communication with each of the SIMD units and a second cache portion of memory shared by the compute units. The compute units are configured to execute a program in which a storage portion of at least one of the register file of a SIMD unit, the first portion of cache memory and the LDS is reserved as part of another of the register file, the first portion of cache memory and the LDS.
    Type: Application
    Filed: September 3, 2021
    Publication date: March 9, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Maxim V. Kazakov
  • Publication number: 20230070744
    Abstract: A method and an apparatus for decoding an image are disclosed. A region of the image is selected and the decoding is selected region and associated metadata is performed. Pixels for a generated for a decoded image based on the decoded selected region and metadata.
    Type: Application
    Filed: November 11, 2022
    Publication date: March 9, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Andrew S. Pomianowski, Konstantine Iourcha
  • Patent number: 11599359
    Abstract: A processor in a data processing system includes a master-shadow physical register file and a renaming unit. The master-shadow physical register file has a master storage coupled to shadow storage. The renaming unit is coupled to the master-shadow physical register file. Based on an occurrence of shadow transfer activation conditions verified by the renaming unit, data in the master storage is transferred from the master storage to the shadow storage for storage. Data is transferred from the shadow storage back to the master storage based on the occurrence of a shadow-to-master transfer event, which includes, for example, a flush of the master storage by the processor.
    Type: Grant
    Filed: May 18, 2020
    Date of Patent: March 7, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Arun A. Nair, Ashok T. Venkatachar, Emil Talpes, Srikanth Arekapudi, Rajesh Kumar Arunachalam
  • Patent number: 11586441
    Abstract: Systems, apparatuses, and methods for virtualizing a micro-operation cache are disclosed. A processor includes at least a micro-operation cache, a conventional cache subsystem, a decode unit, and control logic. The decode unit decodes instructions into micro-operations which are then stored in the micro-operation cache. The micro-operation cache has limited capacity for storing micro-operations. When new micro-operations are decoded from pending instructions, existing micro-operations are evicted from the micro-operation cache to make room for the new micro-operations. Rather than being discarded, micro-operations evicted from the micro-operation cache are stored in the conventional cache subsystem. This prevents the original instruction from having to be decoded again on subsequent executions.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: February 21, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Jagadish B. Kotra
  • Patent number: 11586539
    Abstract: A processing system selectively allocates space to store a group of one or more cache lines at a cache level of a cache hierarchy having a plurality of cache levels based on memory access patterns of a software application executing at the processing system. The processing system generates bit vectors indicating which cache levels are to allocate space to store groups of one or more cache lines based on the memory access patterns, which are derived from data granularity and movement information. Based on the bit vectors, the processing system provides hints to the cache hierarchy indicating the lowest cache level that can exploit the reuse potential for a particular data.
    Type: Grant
    Filed: December 13, 2019
    Date of Patent: February 21, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Weon Taek Na, Jagadish B. Kotra, Yasuko Eckert, Steven Raasch, Sergey Blagodurov
  • Patent number: 11586563
    Abstract: A processor distributes memory timing parameters and data among different memory modules based upon memory access patterns. The memory access patterns indicate different types, or classes, of data for an executing workload, with each class associated with different memory access characteristics, such as different row buffer hit rate levels, different frequencies of access, different criticalities, and the like. The processor assigns each memory module to a data class and sets the memory timing parameters for each memory module according to the module's assigned data class, thereby tailoring the memory timing parameters for efficient access of the corresponding data.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: February 21, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Max Ruttenberg, Vendula Venkata Srikant Bharadwaj, Yasuko Eckert, Anthony Gutierrez, Mark H. Oskin
  • Patent number: 11586555
    Abstract: Systems, apparatuses, and methods for implementing flexible dictionary sharing techniques for caches are disclosed. A set-associative cache includes a dictionary for each data array set. When a cache line is to be allocated in the cache, a cache controller determines to which set a base index of the cache line address maps. Then, a selector unit determines which dictionary of a group of dictionaries stored by those sets neighboring this set would achieve the most compression for the cache line. This dictionary is then selected to compress the cache line. An offset is added to the base index of the cache line to generate a full index in order to map the cache line to the set corresponding to this chosen dictionary. The compressed cache line is stored in this set with the chosen dictionary, and the offset is stored in the corresponding tag array entry.
    Type: Grant
    Filed: April 15, 2021
    Date of Patent: February 21, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexander D. Breslow, John Kalamatianos
  • Patent number: 11586472
    Abstract: A method, system, and apparatus determines that one or more tasks should be relocated from a first processor to a second processor by comparing performance metrics to associated thresholds or by using other indications. To relocate the one or more tasks from the first processor to the second processor, the first processor is stalled and state information from the first processor is copied to the second processor. The second processor uses the state information and then services incoming tasks instead of the first processor.
    Type: Grant
    Filed: December 10, 2019
    Date of Patent: February 21, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexander J. Branover, Benjamin Tsien, Elliot H. Mednick
  • Publication number: 20230046477
    Abstract: A data transmission system includes a first circuit, a second circuit, and a reference voltage generation circuit. The first circuit includes a transmitter powered by a first power supply voltage and having an input for receiving a data output signal, and an output. The second circuit includes a receiver powered by a second power supply voltage and having a first input coupled to the output of the transmitter, a second input for receiving a reference voltage, and an output for providing a data input signal. The reference voltage generation circuit forms the reference voltage by mixing a first signal generated by the first circuit based on the first power supply voltage and a second signal generated by the second circuit based on the second power supply voltage.
    Type: Application
    Filed: December 8, 2021
    Publication date: February 16, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Ramon Mangaser, Karthik Gopalakrishnan, Andy Huei Chu, Pradeep Jayaraman
  • Patent number: 11580025
    Abstract: Systems and methods for coordinated memory-side cache prefetching and dynamic interleaving configuration modification involve modifying one or both of the prefetch distance or the prefetch degree used by prefetcher modules of one or more memory-side caches by modifying interleaving configuration data following detection of an interleaving reconfiguration trigger condition indicative, for example, of low prefetch accuracy, low prefetch coverage, high prefetch lateness, or a combination of these. In response an interleaving reconfiguration trigger condition, a processor modifies the interleaving configuration data for the processing system based on the prefetch performance characteristics associated with the interleaving reconfiguration trigger condition. In some embodiments, the interleaving configuration data is modified by changing which physical memory address indices are used to determine the bits that define the channel identification number to which that physical memory address is to be mapped.
    Type: Grant
    Filed: September 30, 2021
    Date of Patent: February 14, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Tarun Nakra, Akhil Arunkumar, Vydhyanathan Kalyanasundharam, Chintan S. Patel, Nithesh Kurella Lakshmi Narayanamurthy
  • Patent number: 11579922
    Abstract: Systems, apparatuses, and methods for dynamic graphics processing unit (GPU) register allocation are disclosed. A GPU includes at least a plurality of compute units (CUs), a control unit, and a plurality of registers for each CU. If a new wavefront requests more registers than are currently available on the CU, the control unit spills registers associated with stack frames at the bottom of a stack since they will not likely be used in the near future. The control unit has complete flexibility determining how many registers to spill based on dynamic demands and can prefetch the upcoming necessary fills without software involvement. Effectively, the control unit manages the physical register file as a cache. This allows younger workgroups to be dynamically descheduled so that older workgroups can allocate additional registers when needed to ensure improved fairness and better forward progress guarantees.
    Type: Grant
    Filed: December 29, 2020
    Date of Patent: February 14, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Bradford Michael Beckmann, Steven Tony Tye, Brian L. Sumner, Nicolai Hähnle
  • Patent number: 11579514
    Abstract: A system and method for controlling characteristics of collected image data are disclosed. The system and method include performing pre-processing of an image using GPUs, configuring an optic based on the pre-processing, the configuring being designed to account for features of the pre-processed image, acquiring an image using the configured optic, processing the acquired image using GPUs, and determining if the processed acquired image accounts for feature of the pre-processed image, and the determination is affirmative, outputting the image, wherein if the determination is negative repeating the configuring of the optic and re-acquiring the image.
    Type: Grant
    Filed: December 23, 2020
    Date of Patent: February 14, 2023
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Allen H. Rush, Hui Zhou
  • Patent number: 11579650
    Abstract: A method and apparatus for synchronizing a time stamp counter (TSC) associated with a processor core in a computer system includes initializing the TSC associated with the processor core by synchronizing the TSC associated with the processor core with at least one other TSC in a hierarchy of TSCs. One or more processor cores are powered down. Upon powering up of the one or more processor cores, the TSC associated with the processor core is synchronized with the at least one other TSC in the hierarchy of TSCs.
    Type: Grant
    Filed: December 19, 2019
    Date of Patent: February 14, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amitabh Mehra, David M. Dahle, Richard M. Born
  • Patent number: 11579884
    Abstract: Techniques for performing instruction fetch operations are provided. The techniques include determining instruction addresses for a primary branch prediction path; requesting that a level 0 translation lookaside buffer (“TLB”) caches address translations for the primary branch prediction path; determining either or both of alternate control flow path instruction addresses and lookahead control flow path instruction addresses; and requesting that either the level 0 TLB or an alternative level TLB caches address translations for either or both of the alternate control flow path instruction addresses and the lookahead control flow path instruction addresses.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: February 14, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ashok Tirupathy Venkatachar, Steven R. Havlir, Robert B. Cohen
  • Patent number: 11579876
    Abstract: A method of save-restore operations includes monitoring, by a power controller of a parallel processor (such as a graphics processing unit), of a register bus for one or more register write signals. The power controller determines that a register write signal is addressed to a state register that is designated to be saved prior to changing a power state of the parallel processor from a first state to a second state having a lower level of energy usage. The power controller instructs a copy of data corresponding to the state register to be written to a local memory module of the parallel processor. Subsequently, the parallel processor receives a power state change signal and writes state register data saved at the local memory module to an off-chip memory prior to changing the power state of the parallel processor.
    Type: Grant
    Filed: August 31, 2020
    Date of Patent: February 14, 2023
    Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC
    Inventors: Anirudh R. Acharya, Alexander Fuad Ashkar, Ashkan Hosseinzadeh Namin
  • Publication number: 20230039289
    Abstract: A data fabric routes requests between the plurality of requestors and the plurality of responders. The data fabric includes a crossbar router, a coherent slave controller coupled to the crossbar router, and a probe filter coupled to the coherent slave controller and tracking the state of cached lines of memory. Power state control circuitry operates, responsive to detecting any of a plurality of designated conditions, to cause the probe filter to enter a retention low power state in which a clock signal to the probe filter is gated while power is maintained to the probe filter. Entering the retention low power state is performed when all in-process probe filter lookups are complete.
    Type: Application
    Filed: October 25, 2022
    Publication date: February 9, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Benjamin Tsien, Amit P. Apte
  • Patent number: 11573853
    Abstract: Error checking data used in offloaded operations is disclosed. A remote execution device receives a request from a host to store a data block in a memory region. The data block includes data and host-generated error checking information for the data. The remote execution device updates the data block by overwriting the host-generated error checking information with locally generated error checking information for the data. The data block is then stored in the memory region.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: February 7, 2023
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Vilas Sridharan, Sudhanva Gurumurthi
  • Patent number: 11573765
    Abstract: A processing unit implements a convolutional neural network (CNN) by fusing at least a portion of a convolution phase of the CNN with at least a portion of a batch normalization phase. The processing unit convolves two input matrices representing inputs and weights of a portion of the CNN to generate an output matrix. The processing unit performs the convolution via a series of multiplication operations, with each multiplication operation generating a corresponding submatrix (or “tile”) of the output matrix at an output register of the processing unit. While an output submatrix is stored at the output register, the processing unit performs a reduction phase and an update phase of the batch normalization phase for the CNN. The processing unit thus fuses at least a portion of the batch normalization phase of the CNN with a portion of the convolution.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: February 7, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Milind N. Nemlekar, Prerit Dak
  • Patent number: 11573801
    Abstract: A processor includes a register file and control logic that detects multiple different sets of sequential zero bits of a register in the register file, wherein each of the multiple different sets has a bit length that corresponds to a partial instruction width and operates at a first partial instruction width or a second partial instruction width with the register file depending on number of sets of zero bits detected in the register. In certain examples, the control logic causes operating at first instruction width that avoids merging of a first bit length of data in the register and operating at the second instruction width that avoids merging of a second bit length of data in the register. In some examples, a register rename map table incudes multiple zero bits that identify the detected multiple different sets of bits of sequential zeros.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: February 7, 2023
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Eric Dixon, Erik Swanson, Theodore Carlson, Ruchir Dalal, Michael Estlick
  • Patent number: 11575916
    Abstract: An encoding method is provided which includes receiving a plurality of images, obtaining values of elements in a portion of the images, sorting the elements according to different values of the elements, sorting the elements according to a number of occurrences of the different values and encoding the elements using a subset of the different values having corresponding numbers of occurrences that are higher than corresponding numbers of occurrences of other values. Examples also include a processing device and method for use with palette mode encoding in which the elements are a portion of pixels in images and the values are color values of the portion of pixels in the images.
    Type: Grant
    Filed: October 30, 2020
    Date of Patent: February 7, 2023
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Shu-Hsien Wu, Crystal Yeong-Pian Sau, Yang Liu, Wei Gao, Feng Pan, Ihab M. A. Amer, Ying Luo, Edward A. Harold, Gabor Sines, Ehsan Mirhadi