Patents Assigned to Advanced Micro Devices, Inc.
-
Patent number: 11586472
Abstract: A method, system, and apparatus determines that one or more tasks should be relocated from a first processor to a second processor by comparing performance metrics to associated thresholds or by using other indications. To relocate the one or more tasks from the first processor to the second processor, the first processor is stalled and state information from the first processor is copied to the second processor. The second processor uses the state information and then services incoming tasks instead of the first processor.
Type: Grant
Filed: December 10, 2019
Date of Patent: February 21, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Alexander J. Branover, Benjamin Tsien, Elliot H. Mednick
-
Patent number: 11586563
Abstract: A processor distributes memory timing parameters and data among different memory modules based upon memory access patterns. The memory access patterns indicate different types, or classes, of data for an executing workload, with each class associated with different memory access characteristics, such as different row buffer hit rate levels, different frequencies of access, different criticalities, and the like. The processor assigns each memory module to a data class and sets the memory timing parameters for each memory module according to the module's assigned data class, thereby tailoring the memory timing parameters for efficient access of the corresponding data.
Type: Grant
Filed: December 22, 2020
Date of Patent: February 21, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Max Ruttenberg, Vendula Venkata Srikant Bharadwaj, Yasuko Eckert, Anthony Gutierrez, Mark H. Oskin
-
Patent number: 11586555
Abstract: Systems, apparatuses, and methods for implementing flexible dictionary sharing techniques for caches are disclosed. A set-associative cache includes a dictionary for each data array set. When a cache line is to be allocated in the cache, a cache controller determines to which set a base index of the cache line address maps. Then, a selector unit determines which dictionary of a group of dictionaries stored by those sets neighboring this set would achieve the most compression for the cache line. This dictionary is then selected to compress the cache line. An offset is added to the base index of the cache line to generate a full index in order to map the cache line to the set corresponding to this chosen dictionary. The compressed cache line is stored in this set with the chosen dictionary, and the offset is stored in the corresponding tag array entry.
Type: Grant
Filed: April 15, 2021
Date of Patent: February 21, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Alexander D. Breslow, John Kalamatianos
-
Patent number: 11586539
Abstract: A processing system selectively allocates space to store a group of one or more cache lines at a cache level of a cache hierarchy having a plurality of cache levels based on memory access patterns of a software application executing at the processing system. The processing system generates bit vectors indicating which cache levels are to allocate space to store groups of one or more cache lines based on the memory access patterns, which are derived from data granularity and movement information. Based on the bit vectors, the processing system provides hints to the cache hierarchy indicating the lowest cache level that can exploit the reuse potential for a particular data.
Type: Grant
Filed: December 13, 2019
Date of Patent: February 21, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Weon Taek Na, Jagadish B. Kotra, Yasuko Eckert, Steven Raasch, Sergey Blagodurov
-
Publication number: 20230046477
Abstract: A data transmission system includes a first circuit, a second circuit, and a reference voltage generation circuit. The first circuit includes a transmitter powered by a first power supply voltage and having an input for receiving a data output signal, and an output. The second circuit includes a receiver powered by a second power supply voltage and having a first input coupled to the output of the transmitter, a second input for receiving a reference voltage, and an output for providing a data input signal. The reference voltage generation circuit forms the reference voltage by mixing a first signal generated by the first circuit based on the first power supply voltage and a second signal generated by the second circuit based on the second power supply voltage.
Type: Application
Filed: December 8, 2021
Publication date: February 16, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Ramon Mangaser, Karthik Gopalakrishnan, Andy Huei Chu, Pradeep Jayaraman
-
Patent number: 11579922
Abstract: Systems, apparatuses, and methods for dynamic graphics processing unit (GPU) register allocation are disclosed. A GPU includes at least a plurality of compute units (CUs), a control unit, and a plurality of registers for each CU. If a new wavefront requests more registers than are currently available on the CU, the control unit spills registers associated with stack frames at the bottom of a stack since they will not likely be used in the near future. The control unit has complete flexibility determining how many registers to spill based on dynamic demands and can prefetch the upcoming necessary fills without software involvement. Effectively, the control unit manages the physical register file as a cache. This allows younger workgroups to be dynamically descheduled so that older workgroups can allocate additional registers when needed to ensure improved fairness and better forward progress guarantees.
Type: Grant
Filed: December 29, 2020
Date of Patent: February 14, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Bradford Michael Beckmann, Steven Tony Tye, Brian L. Sumner, Nicolai Hähnle
-
Patent number: 11580025
Abstract: Systems and methods for coordinated memory-side cache prefetching and dynamic interleaving configuration modification involve modifying one or both of the prefetch distance or the prefetch degree used by prefetcher modules of one or more memory-side caches by modifying interleaving configuration data following detection of an interleaving reconfiguration trigger condition indicative, for example, of low prefetch accuracy, low prefetch coverage, high prefetch lateness, or a combination of these. In response to an interleaving reconfiguration trigger condition, a processor modifies the interleaving configuration data for the processing system based on the prefetch performance characteristics associated with the interleaving reconfiguration trigger condition. In some embodiments, the interleaving configuration data is modified by changing which physical memory address indices are used to determine the bits that define the channel identification number to which that physical memory address is to be mapped.
Type: Grant
Filed: September 30, 2021
Date of Patent: February 14, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Tarun Nakra, Akhil Arunkumar, Vydhyanathan Kalyanasundharam, Chintan S. Patel, Nithesh Kurella Lakshmi Narayanamurthy
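The last sentence of the abstract above, on choosing which physical address bit positions form the channel identification number, can be illustrated with a minimal sketch. The function name `channel_id`, the list-of-bit-indices representation, and the 64-byte line size in the example are assumptions made for illustration only; they are not taken from the patent.

```python
def channel_id(physical_address, bit_indices):
    """Build a channel number from selected bits of a physical address.

    bit_indices lists the address bit positions (hypothetical configuration
    data) that supply the channel number, least significant channel bit first.
    """
    channel = 0
    for position, bit_index in enumerate(bit_indices):
        bit = (physical_address >> bit_index) & 1  # extract one address bit
        channel |= bit << position                 # place it in the channel number
    return channel


def channels_for(addresses, bit_indices):
    """Map a sequence of addresses to channels under one interleaving configuration."""
    return [channel_id(address, bit_indices) for address in addresses]
```

With four consecutive 64-byte lines (addresses 0, 64, 128, 192), selecting bits 6 and 7 spreads the lines across four channels, while selecting bits 8 and 9 keeps all four on one channel. Changing the selected bit indices is the kind of reconfiguration that alters how far a stream of sequential accesses travels within a single channel.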
-
Patent number: 11579650
Abstract: A method and apparatus for synchronizing a time stamp counter (TSC) associated with a processor core in a computer system includes initializing the TSC associated with the processor core by synchronizing the TSC associated with the processor core with at least one other TSC in a hierarchy of TSCs. One or more processor cores are powered down. Upon powering up of the one or more processor cores, the TSC associated with the processor core is synchronized with the at least one other TSC in the hierarchy of TSCs.
Type: Grant
Filed: December 19, 2019
Date of Patent: February 14, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Amitabh Mehra, David M. Dahle, Richard M. Born
-
Patent number: 11579514
Abstract: A system and method for controlling characteristics of collected image data are disclosed. The system and method include performing pre-processing of an image using GPUs, configuring an optic based on the pre-processing, the configuring being designed to account for features of the pre-processed image, acquiring an image using the configured optic, processing the acquired image using GPUs, and determining if the processed acquired image accounts for features of the pre-processed image, and, if the determination is affirmative, outputting the image, wherein if the determination is negative, repeating the configuring of the optic and re-acquiring the image.
Type: Grant
Filed: December 23, 2020
Date of Patent: February 14, 2023
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Allen H. Rush, Hui Zhou
-
Patent number: 11579876
Abstract: A method of save-restore operations includes monitoring, by a power controller of a parallel processor (such as a graphics processing unit), of a register bus for one or more register write signals. The power controller determines that a register write signal is addressed to a state register that is designated to be saved prior to changing a power state of the parallel processor from a first state to a second state having a lower level of energy usage. The power controller instructs a copy of data corresponding to the state register to be written to a local memory module of the parallel processor. Subsequently, the parallel processor receives a power state change signal and writes state register data saved at the local memory module to an off-chip memory prior to changing the power state of the parallel processor.
Type: Grant
Filed: August 31, 2020
Date of Patent: February 14, 2023
Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC
Inventors: Anirudh R. Acharya, Alexander Fuad Ashkar, Ashkan Hosseinzadeh Namin
-
Patent number: 11579884
Abstract: Techniques for performing instruction fetch operations are provided. The techniques include determining instruction addresses for a primary branch prediction path; requesting that a level 0 translation lookaside buffer (“TLB”) caches address translations for the primary branch prediction path; determining either or both of alternate control flow path instruction addresses and lookahead control flow path instruction addresses; and requesting that either the level 0 TLB or an alternative level TLB caches address translations for either or both of the alternate control flow path instruction addresses and the lookahead control flow path instruction addresses.
Type: Grant
Filed: June 26, 2020
Date of Patent: February 14, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Ashok Tirupathy Venkatachar, Steven R. Havlir, Robert B. Cohen
-
Publication number: 20230039289
Abstract: A data fabric routes requests between the plurality of requestors and the plurality of responders. The data fabric includes a crossbar router, a coherent slave controller coupled to the crossbar router, and a probe filter coupled to the coherent slave controller and tracking the state of cached lines of memory. Power state control circuitry operates, responsive to detecting any of a plurality of designated conditions, to cause the probe filter to enter a retention low power state in which a clock signal to the probe filter is gated while power is maintained to the probe filter. Entering the retention low power state is performed when all in-process probe filter lookups are complete.
Type: Application
Filed: October 25, 2022
Publication date: February 9, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Benjamin Tsien, Amit P. Apte
-
Patent number: 11573765
Abstract: A processing unit implements a convolutional neural network (CNN) by fusing at least a portion of a convolution phase of the CNN with at least a portion of a batch normalization phase. The processing unit convolves two input matrices representing inputs and weights of a portion of the CNN to generate an output matrix. The processing unit performs the convolution via a series of multiplication operations, with each multiplication operation generating a corresponding submatrix (or “tile”) of the output matrix at an output register of the processing unit. While an output submatrix is stored at the output register, the processing unit performs a reduction phase and an update phase of the batch normalization phase for the CNN. The processing unit thus fuses at least a portion of the batch normalization phase of the CNN with a portion of the convolution.
Type: Grant
Filed: December 13, 2018
Date of Patent: February 7, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Milind N. Nemlekar, Prerit Dak
-
Patent number: 11573853
Abstract: Error checking data used in offloaded operations is disclosed. A remote execution device receives a request from a host to store a data block in a memory region. The data block includes data and host-generated error checking information for the data. The remote execution device updates the data block by overwriting the host-generated error checking information with locally generated error checking information for the data. The data block is then stored in the memory region.
Type: Grant
Filed: March 31, 2021
Date of Patent: February 7, 2023
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Vilas Sridharan, Sudhanva Gurumurthi
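The store path described in the abstract above can be sketched roughly as follows. The abstract does not name an error checking code or a block layout, so this sketch assumes a CRC-32 (via Python's `zlib.crc32`) stored in a trailing 4-byte field, and models the memory region as a dictionary. The names `local_check`, `store_block`, and `CHECK_BYTES` are hypothetical.

```python
import zlib

CHECK_BYTES = 4  # assumed width of the trailing error checking field


def local_check(data: bytes) -> bytes:
    """Locally generated error checking information (CRC-32 as a stand-in)."""
    return zlib.crc32(data).to_bytes(CHECK_BYTES, "little")


def store_block(memory: dict, address: int, block: bytes) -> None:
    """Store a block, replacing the host's check field with a local one.

    The incoming block is assumed to be data followed by CHECK_BYTES of
    host-generated error checking information.
    """
    data = block[:-CHECK_BYTES]                   # keep the payload unchanged
    memory[address] = data + local_check(data)    # overwrite the host's check field
```

A later local read can then validate the block with `local_check` alone, without needing to know how the host computed its own check value.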
-
Patent number: 11573801
Abstract: A processor includes a register file and control logic that detects multiple different sets of sequential zero bits of a register in the register file, wherein each of the multiple different sets has a bit length that corresponds to a partial instruction width and operates at a first partial instruction width or a second partial instruction width with the register file depending on the number of sets of zero bits detected in the register. In certain examples, the control logic causes operating at the first instruction width that avoids merging of a first bit length of data in the register and operating at the second instruction width that avoids merging of a second bit length of data in the register. In some examples, a register rename map table includes multiple zero bits that identify the detected multiple different sets of bits of sequential zeros.
Type: Grant
Filed: September 29, 2021
Date of Patent: February 7, 2023
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Eric Dixon, Erik Swanson, Theodore Carlson, Ruchir Dalal, Michael Estlick
-
Patent number: 11573724
Abstract: A processing apparatus is provided that includes NVRAM and one or more processors configured to process a first set and a second set of instructions according to a hierarchical processing scope and process a scoped persistence barrier residing in the program after the first instruction set and before the second instruction set. The barrier includes an instruction to cause first data to persist in the NVRAM before second data persists in the NVRAM. The first data results from execution of each of the first set of instructions processed according to the one hierarchical processing scope. The second data results from execution of each of the second set of instructions processed according to the one hierarchical processing scope. The processing apparatus also includes a controller configured to cause the first data to persist in the NVRAM before the second data persists in the NVRAM based on the scoped persistence barrier.
Type: Grant
Filed: June 5, 2019
Date of Patent: February 7, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Arkaprava Basu, Mitesh R. Meswani, Dibakar Gope, Sooraj Puthoor
-
Patent number: 11573593
Abstract: A power regulator provides current to a processing unit. A clock distribution network provides a clock signal to the processing unit. A level-based droop detector monitors a voltage of the current provided to the processing unit and provides a droop detection signal to the clock distribution network in response to the voltage falling below a first threshold voltage. The clock distribution network decreases a frequency of a clock signal provided to the processing unit in response to receiving the droop detection signal. The level-based droop detector interrupts the droop detection signal that is provided to the clock distribution network in response to the voltage rising above a second threshold voltage. The clock distribution network increases the frequency of the clock signal provided to the processing unit in response to interruption of the droop detection signal.
Type: Grant
Filed: April 16, 2018
Date of Patent: February 7, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Richard Martin Born, Stephen Victor Kosonocky, Miguel Rodriguez
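The two-threshold behavior in the abstract above amounts to a detector with hysteresis, which a short software model can illustrate. This is only a sketch: the patent describes hardware, and the function name, the sampled-voltage input, the two fixed frequencies, and the assumption that the second threshold lies above the first are all illustrative choices rather than details from the source.

```python
def clock_frequencies(voltages, first_threshold, second_threshold,
                      nominal_mhz, reduced_mhz):
    """Return the clock frequency selected for each voltage sample.

    The droop detection signal is asserted when the voltage falls below
    first_threshold and is interrupted when the voltage rises above
    second_threshold (assumed here to be the higher of the two).
    """
    droop_signal = False
    frequencies = []
    for voltage in voltages:
        if not droop_signal and voltage < first_threshold:
            droop_signal = True    # droop detected: clock network slows down
        elif droop_signal and voltage > second_threshold:
            droop_signal = False   # voltage recovered: clock network speeds up
        frequencies.append(reduced_mhz if droop_signal else nominal_mhz)
    return frequencies
```

For samples of 1.00, 0.94, 0.96, 0.99, and 1.00 V with thresholds of 0.95 V and 0.98 V, the model holds the reduced frequency at 0.96 V, because that sample sits between the two thresholds and the signal has not yet been interrupted.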
-
Patent number: 11575916
Abstract: An encoding method is provided which includes receiving a plurality of images, obtaining values of elements in a portion of the images, sorting the elements according to different values of the elements, sorting the elements according to a number of occurrences of the different values and encoding the elements using a subset of the different values having corresponding numbers of occurrences that are higher than corresponding numbers of occurrences of other values. Examples also include a processing device and method for use with palette mode encoding in which the elements are a portion of pixels in images and the values are color values of the portion of pixels in the images.
Type: Grant
Filed: October 30, 2020
Date of Patent: February 7, 2023
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Shu-Hsien Wu, Crystal Yeong-Pian Sau, Yang Liu, Wei Gao, Feng Pan, Ihab M. A. Amer, Ying Luo, Edward A. Harold, Gabor Sines, Ehsan Mirhadi
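The palette selection described in the abstract above can be sketched as a frequency count followed by a top-N cut. This is a simplified software illustration, not the patented encoder: the function names, the `palette_size` parameter, and the escape representation for colors outside the palette are assumptions introduced here.

```python
from collections import Counter


def build_palette(pixels, palette_size):
    """Pick the palette_size most frequently occurring color values."""
    counts = Counter(pixels)  # groups pixels by distinct color value
    # most_common orders values by number of occurrences, highest first
    return [color for color, _ in counts.most_common(palette_size)]


def encode(pixels, palette):
    """Encode each pixel as a palette index, or as an escape with its raw color."""
    index_of = {color: index for index, color in enumerate(palette)}
    encoded = []
    for color in pixels:
        if color in index_of:
            encoded.append(("index", index_of[color]))
        else:
            encoded.append(("escape", color))  # assumed handling of rare colors
    return encoded
```

For a block with three red pixels, two green pixels, and one blue pixel and a two-entry palette, red and green receive palette indices and the single blue pixel falls through to the escape path.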
-
Publication number: 20230031388
Abstract: Systems, methods, and devices for integrated circuit power management. A mode of a power management state is entered, from the power management state, in response to an entry condition of the mode. A device that is otherwise powered off in the power management state is powered on in the mode of the power management state. In some implementations, the device includes a communications path between a second device and a third device. In some implementations, the device is in a power domain that is powered off in the power management state. In some implementations, the power domain is powered off in the mode. In some implementations, the device is powered on in the mode via a power rail that is specific to the mode. In some implementations, the entry condition of the mode includes an amount of data stored for display in a display buffer falling below a threshold amount.
Type: Application
Filed: July 30, 2021
Publication date: February 2, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Benjamin Tsien, Indrani Paul, Alexander J. Branover, Thomas J. Gibney, Mihir Shaileshbhai Doctor, John P. Petry, Stephen V. Kosonocky, Christopher T. Weaver
-
Publication number: 20230031295
Abstract: A disclosed technique includes triggering entry into a clock bypass mode, in which a bypass clock generator provides clock signals to functional elements and a primary clock generator does not provide clock signals to functional elements; and triggering exit from the clock bypass mode, in which the bypass clock generator does not provide clock signals to the functional elements and the primary clock generator does provide clock signals to the functional elements.
Type: Application
Filed: July 30, 2021
Publication date: February 2, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Thomas J. Gibney, Alexander J. Branover, Mihir Shaileshbhai Doctor, Xiaojie He, Indrani Paul, Benjamin Tsien, John P. Petry, Pitchaiah Katari