Patents Assigned to Advanced Micro Devices
  • Patent number: 10579557
    Abstract: A configurable computing system which uses near-memory and in-memory hardened logic blocks is described herein. The hardened logic blocks are incorporated into memory modules. The memory modules include an interface or communication logic to communicate between the configurable computing substrate and the memory module. In an implementation, the memory modules can include an on-die memory or other forms of non-configurable logic to enable more efficient processing for a variety of operations. In another implementation, the memory modules can include a portion of configurable computing substrate logic fabric to enable more efficient processing for a variety of operations. In another implementation, the memory modules can include an on-die memory and a portion of configurable computing substrate logic fabric to enable more efficient processing for a variety of operations.
    Type: Grant
    Filed: January 16, 2018
    Date of Patent: March 3, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Michael Ignatowski
  • Patent number: 10579388
    Abstract: A method for use in a processor for arbitrating between multiple processes to select wavefronts for execution on a shader core is provided. The processor includes a compute pipeline configured to issue wavefronts to the shader core for execution, a hardware queue descriptor associated with the compute pipeline, and the shader core. The shader core is configured to execute work for the compute pipeline corresponding to a first memory queue descriptor executed using data for the first memory queue descriptor that is loaded into a first hardware queue descriptor. The processor is configured to detect a context switch condition, and, responsive to the context switch condition, perform a context switch operation including loading data for a second memory queue descriptor into the first hardware queue descriptor. The shader core is configured to execute work corresponding to the second memory queue descriptor that is loaded into the first hardware queue descriptor.
    Type: Grant
    Filed: July 19, 2018
    Date of Patent: March 3, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Mark Leather, Michael Mantor, Rex McCrary, Sebastien Nussbaum, Philip J. Rogers, Ralph Clay Taylor, Thomas Woller
  • Patent number: 10581587
    Abstract: Systems, apparatuses, and methods for implementing a deskewing method for a physical layer interface on a multi-chip module are disclosed. A circuit connected to a plurality of communication lanes trains each lane to synchronize a local clock of the lane with a corresponding global clock at a beginning of a timing window. Next, the circuit symbol rotates each lane by a single step responsive to determining that all of the plurality of lanes have an incorrect symbol alignment. Responsive to determining that some but not all of the plurality of lanes have a correct symbol alignment, the circuit symbol rotates lanes which have an incorrect symbol alignment by a single step. When the end of the timing window has been reached, the circuit symbol rotates lanes which have a correct symbol alignment and adjusts a phase of a corresponding global clock to compensate for missed symbol rotations.
    Type: Grant
    Filed: April 29, 2019
    Date of Patent: March 3, 2020
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Varun Gupta, Milam Paraschou, Gerald R. Talbot, Gurunath Dollin, Damon Tohidi, Eric Ian Carpenter, Chad S. Gallun, Jeffrey Cooper, Hanwoo Cho, Thomas H. Likens, III, Scott F. Dow, Michael J. Tresidder
  • Patent number: 10582250
    Abstract: Systems, apparatuses, and methods for integrating a video codec with an inference engine are disclosed. A system is configured to implement an inference engine and a video codec while sharing at least a portion of its processing elements between the inference engine and the video codec. By sharing processing elements when combining the inference engine and the video codec, the silicon area of the combination is reduced. In one embodiment, the portion of processing elements which are shared include a motion prediction/motion estimation/MACs engine with a plurality of multiplier-accumulator (MAC) units, an internal memory, and peripherals. The peripherals include a memory interface, a direct memory access (DMA) engine, and a microprocessor. The system is configured to perform a context switch to reprogram the processing elements to switch between operating modes. The context switch can occur at a frame boundary or at a sub-frame boundary.
    Type: Grant
    Filed: July 24, 2017
    Date of Patent: March 3, 2020
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Lei Zhang, Sateesh Lagudu, Allen Rush, Razvan Dan-Dobre
  • Publication number: 20200066677
    Abstract: A data processor is implemented as an integrated circuit. The data processor includes a processor die. The processor die is connected to an integrated voltage regulator die using die-to-die bonding. The integrated voltage regulator die provides a regulated voltage to the processor die, and the processor die operates in response to the regulated voltage.
    Type: Application
    Filed: August 23, 2018
    Publication date: February 27, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Milind Bhagavat, David Hugh McIntyre, Rahul Agarwal
  • Publication number: 20200065113
    Abstract: Techniques for improving performance of accelerated processing devices (“APDs”) when exceptions occur are provided. In APDs, the very large number of parallel processing execution units, and the complexity of the hardware used to execute a large number of work-items in parallel, means that APDs typically stall when an exception occurs (unlike in central processing units (“CPUs”), which are able to execute speculatively and out-of-order). However, the techniques provided herein allow at least some execution to occur past exceptions. Execution past an exception generating instruction occurs by executing instructions that would not lead to a corruption while skipping those that would lead to a corruption. After the exception has been satisfied, execution occurs in a replay mode in which the potentially exception-generating instruction is executed and in which instructions that did not execute in the exception-wait mode are executed. A mask and counter are used to control execution in replay mode.
    Type: Application
    Filed: August 22, 2018
    Publication date: February 27, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Anthony T. Gutierrez
  • Patent number: 10572389
    Abstract: Systems, apparatuses, and methods for performing efficient memory accesses for a computing system are disclosed. External system memory is used as a last-level cache and includes one of a variety of types of dynamic random access memory (DRAM). A memory controller generates a tag request and a separate data request based on a same, single received memory request. The sending of the tag request is prioritized over sending the data request. A partial tag comparison is performed during processing of the tag request. If a tag miss is detected for the partial tag comparison, then the data request is cancelled, and the memory request is sent to main memory. If one or more tag hits are detected for the partial tag comparison, then processing of the data request is dependent upon the result of the full tag comparison.
    Type: Grant
    Filed: December 12, 2017
    Date of Patent: February 25, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ravindra N. Bhargava, Ganesh Balakrishnan
  • Patent number: 10572183
    Abstract: A data processing system includes a memory and a data processor. The data processor is connected to the memory and adapted to access the memory in response to scheduled memory access requests. The data processor has power management logic that, in response to detecting a memory power state change, determines whether to retrain or suppress retraining of at least one parameter related to accessing the memory based on an operating state of the memory. The power management logic further determines a retraining interval for retraining the at least one parameter related to accessing the memory, and initiates a retraining operation in response to the memory power state change based on the operating state of the memory being outside of a predetermined threshold.
    Type: Grant
    Filed: October 18, 2017
    Date of Patent: February 25, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sonu Arora, Guhan Krishnan, Kevin Brandl
  • Patent number: 10573630
    Abstract: A three-dimensional integrated circuit includes a first die having a first geometry. The first die includes a first region that operates with a first power density and a second region that operates with a second power density. The first power density is less than the second power density. The first die includes first electrical contacts disposed in the first region on a first side of the first die along a periphery of the first die. The three-dimensional integrated circuit includes a second die having a second geometry. The second die includes second electrical contacts disposed on a first side of the second die. A stacked portion of the second die is stacked within the periphery of the first die and an overhang portion of the second die extends beyond the periphery of the first die. The second electrical contacts are aligned with and coupled to the first electrical contacts.
    Type: Grant
    Filed: April 20, 2018
    Date of Patent: February 25, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Brett P. Wilkerson, Milind Bhagavat, Rahul Agarwal, Dmitri Yudanov
  • Publication number: 20200057717
    Abstract: In one form, a data processing system includes a host integrated circuit having a memory controller, a memory bus coupled to the memory controller, and a memory module. The memory module includes a bulk memory and a memory module scratchpad coupled to the bulk memory, wherein the memory module scratchpad has a lower access overhead than the bulk memory. The memory controller selectively provides predetermined commands over the memory bus to cause the memory module to copy data between the bulk memory and the memory module scratchpad without conducting data on the memory bus in response to a data movement decision.
    Type: Application
    Filed: August 17, 2018
    Publication date: February 20, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Amin Farmahini Farahani, Michael Ignatowski
  • Patent number: 10568025
    Abstract: An approach is provided for data processing methods wherein a PHY layer transmit operation is constructed by aggregating MAC layer data units (MPDUs) by means of a linked list. A link list of TX descriptors can be formed separately from the payloads. In order to minimize processor speed rates, the next transmission buffer is pre-fetched whenever appropriate. Interlocks are used to prevent conflict on descriptor fetches and payload fetches.
    Type: Grant
    Filed: July 13, 2015
    Date of Patent: February 18, 2020
    Assignees: ADVANCED MICRO DEVICES, INC., AMD FAR EAST LTD.
    Inventors: Sebastian Ahmed, Douglas A. Mammoser
  • Patent number: 10558489
    Abstract: Systems, apparatuses, and methods for suspending and restoring operations on a processor are disclosed. In one embodiment, a processor includes at least a control unit, multiple execution units, and multiple work creation units. In response to detecting a request to suspend a software application executing on the processor, the control unit sends requests to the plurality of work creation units to stop creating new work. The control unit waits until receiving acknowledgements from the work creation units prior to initiating a suspend operation. Once all work creation units have acknowledged that they have stopped creating new work, the control unit initiates the suspend operation. Also, when a restore operation is initiated, the control unit prevents any work creation units from launching new work-items until all previously in-flight work-items have been restored to the same work creation units and execution units to which they were previously allocated.
    Type: Grant
    Filed: February 21, 2017
    Date of Patent: February 11, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexander Fuad Ashkar, Michael J. Mantor, Randy Wayne Ramsey, Rex Eldon McCrary, Harry J. Wise
  • Patent number: 10560022
    Abstract: An apparatus includes an integrated circuit chip with a set of circuits having two or more subsets of circuits; an external voltage regulator separate from the integrated circuit chip; two or more integrated voltage regulators on the integrated circuit chip that each provide an input voltage to a respective subset of the circuits; and a controller. The controller determines, using an integrated voltage regulator power loss model, an electrical power loss for the integrated voltage regulators for a first combination of operating points for the subsets of the circuits. The controller then determines, based on the electrical power loss, a second combination of operating points for the subsets of the circuits that includes an adjustment to an operating point for at least one of the subsets of the circuits that compensates for an electrical power loss of the corresponding integrated voltage regulator.
    Type: Grant
    Filed: June 13, 2019
    Date of Patent: February 11, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Wei Huang, Miguel Rodriguez, Karthik Rao
  • Patent number: 10558499
    Abstract: Footprints, or resource allocations, of waves within resources that are shared by processor cores in a multithreaded processor are measured concurrently with the waves executing on the processor cores. The footprints are averaged over a time interval. A number of waves are spawned and dispatched for execution in the multithreaded processor based on the average footprint. In some cases, the waves are spawned at a rate that is determined based on the average value of the footprints of waves within the resources. The rate of spawning waves is modified in response to a change in the average value of the footprints of the waves within the resources.
    Type: Grant
    Filed: October 26, 2017
    Date of Patent: February 11, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Maxim V. Kazakov, Michael Mantor
  • Patent number: 10558466
    Abstract: Systems, apparatuses, and methods for adjusting group sizes to match a processor lane width are described. In early iterations of an algorithm, a processor partitions a dataset into groups of data points which are integer multiples of the processing lane width of the processor. For example, when performing a K-means clustering algorithm, the processor determines that a first plurality of data points belong to a first group during a given iteration. If the first plurality of data points is not an integer multiple of the number of processing lanes, then the processor reassigns a first number of data points from the first plurality of data points to one or more other groups. The processor then performs the next iteration with these first number of data points assigned to other groups even though the first number of data points actually meets the algorithmic criteria for belonging to the first group.
    Type: Grant
    Filed: June 23, 2016
    Date of Patent: February 11, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mauricio Breternitz, Mayank Daga
  • Patent number: 10558418
    Abstract: A technique for implementing synchronization monitors on an accelerated processing device (“APD”) is provided. Work on an APD includes workgroups that include one or more wavefronts. All wavefronts of a workgroup execute on a single compute unit. A monitor is a synchronization construct that allows workgroups to stall until a particular condition is met. Responsive to all wavefronts of a workgroup executing a wait instruction, the monitor coordinator records the workgroup in an “entry queue.” The workgroup begins saving its state to a general APD memory and, when such saving is complete, the monitor coordinator moves the workgroup to a “condition queue.” When the condition specified by the wait instruction is met, the monitor coordinator moves the workgroup to a “ready queue,” and, when sufficient resources are available on a compute unit, the APD schedules the ready workgroup for execution on a compute unit.
    Type: Grant
    Filed: July 27, 2017
    Date of Patent: February 11, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexandru Dutu, Bradford M. Beckmann
  • Patent number: 10558606
    Abstract: Systems, apparatuses, and methods for reliably transmitting data over voltage scaled links are disclosed. A computing system includes at least first and second devices connected via a link. In one implementation, if a data block can be compressed to less than or equal to half the original size of the data block, then the data block is compressed and sent on the link in a single clock cycle rather than two clock cycles. If the data block cannot be compressed to half the original size, but if the data block can be compressed enough to include error correction code (ECC) bits without exceeding the original size, then ECC bits are added to the compressed block which is sent on the link at a reduced voltage. The ECC bits help to correct for any errors that are generated as a result of operating the link at the reduced voltage.
    Type: Grant
    Filed: August 30, 2018
    Date of Patent: February 11, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Shomit N. Das, Matthew Tomei, Shrikanth Ganapathy, John Kalamatianos
  • Patent number: 10558591
    Abstract: Systems, apparatuses, and methods for implementing priority adjustment forwarding are disclosed. A system includes at least one or more processing units, a memory, and a communication fabric coupled to the processing unit(s) and the memory. The communication fabric includes a plurality of arbitration points. When a client determines that its bandwidth requirements are not being met, the client generates and sends an in-band priority adjustment request to the nearest arbitration point. This arbitration point receives the in-band priority adjustment request and then identifies any pending requests which are buffered at the arbitration point which meet the criteria specified by the in-band priority adjustment request. The arbitration point adjusts the priority of any identified requests, and then the arbitration point forwards the in-band priority adjustment request on the fabric to the next upstream arbitration point which processes the in-band priority adjustment request in the same manner.
    Type: Grant
    Filed: October 9, 2017
    Date of Patent: February 11, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alan Dodson Smith, Eric Christopher Morton, Vydhyanathan Kalyanasundharam, Joe G. Cruz
  • Publication number: 20200041577
    Abstract: In one form, a power supply monitor including a current controlled oscillator circuit, a time-to-digital converter, and an output divider. The current controlled oscillator circuit has an input for receiving a power supply voltage to be measured, and an output for providing a frequency signal having a frequency linearly proportional to the power supply voltage. The time-to-digital converter has an input coupled to the output of the current controlled oscillator circuit, and an output for providing a count signal representative of a number of cycles of a reference clock signal per cycle of the frequency signal. The output divider has an input coupled to the output of the time-to-digital converter, and an output for providing a divided count signal representative of a value of the power supply voltage, and provides the divided count signal by dividing a fixed number by the count signal.
    Type: Application
    Filed: August 3, 2018
    Publication date: February 6, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Ravinder Reddy Rachala, Stephen Victor Kosonocky, Miguel Rodriguez
  • Patent number: 10553258
    Abstract: Managing temperature of a semiconductor device having a temperature inverted processor core and stacked memory by operation of an integrated thermoelectric cooler. The thermoelectric cooler is operated to pump heat from a stacked memory device that requires a cool operating temperature to a temperature inverted processor core that maintains a higher operating temperature until threshold operating temperatures are achieved.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: February 4, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Wei Huang