Patents Assigned to Advanced Micro Devices, Inc.
  • Patent number: 10643369
    Abstract: Techniques for improving memory utilization for communication between stages of a graphics processing pipeline are disclosed. The techniques include analyzing output instructions of a first shader program to determine whether any such output instructions output some data that is not used by a second shader program. The compiler performs data packing if gaps exist between used output data to reduce memory footprint. The compiler generates optimized output instructions in the first shader program and optimized input instructions in the second shader program to output the used data from the first shader program and input that data in the second shader program in a packed format based on information about usage of output data and data packing. If needed, the compiler inserts instructions to perform runtime checking to identify unused output data of the first shader program based on information not known at compile-time.
    Type: Grant
    Filed: May 30, 2018
    Date of Patent: May 5, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Guohua Jin, Richard A. Burns, Todd Martin, Gianpaolo Tommasi
  • Patent number: 10644826
    Abstract: An integrated circuit includes first and second through-silicon via (TSV) circuits and a steering logic circuit. The first TSV circuit has a first TSV and a first multiplexer for selecting between a first TSV data signal received from the first TSV and a first local data signal for transmission to a first TSV output terminal. The second TSV circuit includes a second TSV and a second multiplexer for selecting between a second TSV data signal received from the second TSV and the first local data signal for transmission to a second TSV output terminal. The steering logic circuit controls the first multiplexer to select the first local data signal and the second multiplexer to select the second TSV data signal in a first mode, and the first multiplexer to select the first TSV data signal and the second multiplexer to select the first local data signal in a second mode.
    Type: Grant
    Filed: February 23, 2018
    Date of Patent: May 5, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John Wuu, Samuel Naffziger, Michael K. Ciraula, Russell Schreiber
  • Patent number: 10642734
    Abstract: Systems, apparatuses, and methods for managing a non-power of two memory configuration are disclosed. A computing system includes one or more clients, a control unit, and a memory subsystem with a non-power of two number of active memory channels. The control unit reduces a ratio of the number of active memory channels over the total number of physical memory channels to a ratio of a first number to a second number. If a first subset of physical address bits of a received memory request is greater than or equal to the first number, the control unit calculates a third number which is equal to a second subset of physical address bits modulo the first number and the control unit uses a concatenation of the third number and a third subset of physical address bits to select a memory channel for issuing the received memory request.
    Type: Grant
    Filed: December 3, 2018
    Date of Patent: May 5, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Pazhani Pillai
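The channel-selection rule in the abstract of patent 10642734 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the function name `select_channel`, the choice of which address bits form each subset, and the handling of the case where the first subset is already below the first number are all hypothetical.

```python
from math import gcd

def select_channel(addr, active, total):
    """Pick a channel in range(active) for an address (illustrative sketch).

    Assumes total is a power of two and addr is already expressed in
    interleave-granule units. The bit layout below is an assumption.
    """
    g = gcd(active, total)
    first, second = active // g, total // g      # e.g. 6/8 reduces to 3/4
    low_bits = (g - 1).bit_length()              # width of the "third subset"
    sel_bits = (second - 1).bit_length()         # width of the "first subset"

    low = addr & (g - 1)                         # third subset of address bits
    sel = (addr >> low_bits) & (second - 1)      # first subset of address bits
    if sel >= first:
        # Out-of-range selector: fold the higher address bits (the
        # "second subset") back into range with a modulo.
        sel = (addr >> (low_bits + sel_bits)) % first
    # Concatenate the selector with the low bits to form the channel number.
    return (sel << low_bits) | low

if __name__ == "__main__":
    print([select_channel(a, 6, 8) for a in range(16)])
```

With 6 active channels out of 8, the ratio reduces to 3:4, so a 2-bit selector value of 3 is the only one that needs remapping. The modulo does not spread remapped addresses perfectly evenly; a real design would choose the address bits to balance load.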
  • Patent number: 10642336
    Abstract: A processor adjusts frequencies of one or more clock signals in response to a voltage droop at the processor. The processor generates at least one clock signal by generating a plurality of base clock signals, each of the base clock signals having a common frequency but a different phase. The processor also generates a plurality of enable signals, wherein each enable signal governs whether a corresponding one of the base clock signals is used to generate the clock signal. The enable signals therefore determine the frequency of the clock signal. In response to detecting a voltage droop, the processor adjusts the enable signals used to generate the clock signal, thereby reducing the frequency of the clock signal.
    Type: Grant
    Filed: July 12, 2016
    Date of Patent: May 5, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Steven Kommrusch, Amitabh Mehra, Richard Martin Born, Bobby D. Young
  • Patent number: 10644680
    Abstract: Systems, apparatuses, and methods for applying duty cycle correction to a level shifter via a feedback common mode resistor are disclosed. A circuit includes a capacitor, an inverter, and at least one feedback resistor. An input signal is received and coupled through the capacitor to the inverter. To correct for duty cycle distortion on the input signal, a duty cycle correction signal is applied to the at least one feedback resistor in the feedback path. The duty cycle correction signal can be applied as a voltage or as a current. In one implementation, the location of the injection point for applying the duty cycle correction signal within the at least one feedback resistor is programmable.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: May 5, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Milam Paraschou, Tracy J. Feist
  • Patent number: 10644004
    Abstract: A modified 1C1T cell detects when the charge in the memory cell drops below a predetermined voltage due to leakage and asserts a refresh signal indicating that refresh needs to be performed on those memory cells associated with the modified 1C1T memory cell. The associated memory cells may be a row, a bank, or other groupings of memory cells. Because temperature affects leakage current, the modified memory cell automatically adjusts for temperature.
    Type: Grant
    Filed: February 13, 2018
    Date of Patent: May 5, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Dmitri Yudanov, David A. Roberts
  • Publication number: 20200133992
    Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
    Type: Application
    Filed: October 31, 2018
    Publication date: April 30, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
  • Publication number: 20200133518
    Abstract: Memory management circuitry and processes operate to improve reliability of a group of memory stacks, providing that if a memory stack or a portion thereof fails during the product's lifetime, the system may still recover with no errors or data loss. A front-end controller receives a block of data requested to be written to memory, divides the block into sub-blocks, and creates a new redundant reliability sub-block. The sub-blocks are then written to different memory stacks. When reading data from the memory stacks, the front-end controller detects errors indicating a failure within one of the memory stacks, and recovers corrected data using the reliability sub-block. The front-end controller may monitor errors for signs of a stack failure and disable the failed stack.
    Type: Application
    Filed: October 31, 2018
    Publication date: April 30, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Georgios Mappouras, Amin Farmahini Farahani, Michael Ignatowski
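The split-and-recover flow in the abstract of publication 20200133518 resembles parity striping. The sketch below assumes a simple XOR parity as the "reliability sub-block"; the abstract does not say which code is used, so XOR is a stand-in, and the function names are hypothetical.

```python
from functools import reduce

def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def split_with_parity(block, n_stacks):
    """Split block into n_stacks - 1 data sub-blocks plus one XOR parity
    sub-block, one per memory stack (XOR parity is an assumption)."""
    n_data = n_stacks - 1
    if n_data < 1 or len(block) % n_data:
        raise ValueError("block must divide evenly across the data stacks")
    size = len(block) // n_data
    subs = [block[i * size:(i + 1) * size] for i in range(n_data)]
    return subs + [reduce(xor_bytes, subs)]

def recover(stored, failed):
    """Rebuild the sub-block held by the failed stack from the survivors."""
    survivors = [s for i, s in enumerate(stored) if i != failed]
    return reduce(xor_bytes, survivors)

if __name__ == "__main__":
    stored = split_with_parity(bytes(range(64)), n_stacks=5)
    print(recover(stored, failed=2) == stored[2])
```

Because XOR is its own inverse, any single lost sub-block, data or parity, equals the XOR of the remaining ones. Tolerating more than one simultaneous stack failure would need a stronger code.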
  • Publication number: 20200134445
    Abstract: The deep Q learning technique trains weights of an artificial neural network using a number of unique features, including separate target and prediction networks, random experience replay to avoid issues with temporally correlated training samples, and others. A hardware architecture is described that is tuned to perform deep Q learning. Inference cores use a prediction network to determine an action to apply to an environment. A replay memory stores the results of the action. Training cores use a loss function derived from outputs from both the target and prediction networks to update weights of the prediction neural networks. A high speed copy engine periodically copies weights from the prediction neural network to the target neural network.
    Type: Application
    Filed: October 31, 2018
    Publication date: April 30, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shuai Che, Jieming Yin
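The training loop summarized in the abstract of publication 20200134445 (separate prediction and target networks, random experience replay, periodic weight copy) follows the standard deep Q learning recipe. The sketch below swaps the neural networks for plain lookup tables so it stays self-contained; the table representation and all function names are assumptions for illustration, not the described hardware.

```python
import random

def td_loss(q_pred, q_target, transition, gamma=0.99):
    """Squared temporal-difference error for one replayed transition.

    q_pred and q_target map state -> list of action values; they stand in
    for the prediction and target networks.
    """
    s, a, r, s_next, done = transition
    # The bootstrap target comes from the *target* table, not the prediction.
    target = r if done else r + gamma * max(q_target[s_next])
    return (target - q_pred[s][a]) ** 2

def sample_replay(replay_memory, batch_size, rng=random):
    """Draw a random batch to break temporal correlation between samples."""
    return rng.sample(replay_memory, min(batch_size, len(replay_memory)))

def sync_target(q_pred):
    """Periodic copy of prediction weights into the target (the copy step)."""
    return {s: list(values) for s, values in q_pred.items()}

if __name__ == "__main__":
    q_pred = {0: [1.0, 0.0], 1: [0.0, 0.0]}
    q_target = {0: [0.0, 0.0], 1: [2.0, 3.0]}
    print(td_loss(q_pred, q_target, (0, 0, 1.0, 1, False), gamma=0.5))
```

Keeping the target table frozen between copies is what stabilizes the bootstrap target while the prediction values are being updated.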
  • Publication number: 20200133993
    Abstract: A processing device is provided which includes memory and a processor comprising a plurality of processor cores in communication with each other via first and second hierarchical communication links. Each processor core in a group of the processor cores is in communication with each other via the first hierarchical communication links. Each processor core is configured to store, in the memory, one of a plurality of sub-portions of data of a first matrix, store, in the memory, one of a plurality of sub-portions of data of a second matrix, determine an outer product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core of the group of processor cores, another sub-portion of data of the second matrix and determine another outer product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
    Type: Application
    Filed: October 31, 2018
    Publication date: April 30, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
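The abstract of publication 20200133993 has each core accumulate outer products of matrix sub-portions. The identity this relies on, that a matrix product equals the sum of outer products of the columns of the first matrix with the rows of the second, can be checked with a short sketch. The single-process loop below is an assumption for illustration: it does not model the processor cores or the hierarchical links, and the function names are hypothetical.

```python
def outer(col, row):
    """Outer product of a column vector and a row vector."""
    return [[a * b for b in row] for a in col]

def matmul_by_outer_products(A, B):
    """Compute A @ B as a running sum of outer products.

    Step t uses column t of A and row t of B, which is the kind of
    sub-portion a core would hold locally or receive from a neighbor.
    """
    n, k, m = len(A), len(B), len(B[0])
    C = [[0] * m for _ in range(n)]
    for t in range(k):
        col = [A[i][t] for i in range(n)]
        partial = outer(col, B[t])
        for i in range(n):
            for j in range(m):
                C[i][j] += partial[i][j]
    return C

if __name__ == "__main__":
    print(matmul_by_outer_products([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Each partial result has the full output shape, so the work can be split by handing different column and row pairs to different workers and summing their partial outputs.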
  • Publication number: 20200134248
    Abstract: Methods for debugging a processor based on executing a randomly created and randomly executed executable on a fabricated processor. The executable may execute via startup firmware. By implementing randomization at multiple levels in the testing of the processor, coupled with highly specific test generation constraint rules, highly focused tests on a micro-architectural feature are implemented while at the same time applying a high degree of random permutation in the way it stresses that specific feature. This allows for the detection and diagnosis of errors and bugs in the processor that elude traditional testing methods. Once the errors and bugs are detected and diagnosed, the processor can then be redesigned to no longer produce the anomalies. By eliminating the errors and bugs in the processor, a processor with improved computational efficiency and reliability can be fabricated.
    Type: Application
    Filed: December 20, 2019
    Publication date: April 30, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Eric W. Schieve
  • Publication number: 20200133866
    Abstract: The disclosure herein provides techniques for designing cache compression algorithms that control how data in caches are compressed. The techniques generate a custom “byte select algorithm” by repeatedly applying transforms to an initial compression algorithm until a set of suitability criteria is met. The suitability criteria include that the “cost” is below a threshold and that a metadata constraint is met. The “cost” is the number of blocks that can be compressed by an algorithm as compared with the “ideal” algorithm. The metadata constraint is the number of bits required for metadata.
    Type: Application
    Filed: October 31, 2018
    Publication date: April 30, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shomit N. Das, Matthew Tomei, David A. Wood
  • Publication number: 20200133360
    Abstract: Control of power supplied to a machine intelligence (MI) processor is provided with an energy reservoir and power switching circuitry coupled to a power supply, the energy reservoir, and to power delivery circuitry of the MI processor. Control circuitry directs the power switching circuitry to charge the energy reservoir from the power supply or discharge the energy reservoir to the MI processor based on MI state information obtained from the MI processor. Processes for charging and discharging such an energy reservoir are provided. Processes for analyzing state information of the MI processor and configuring the control circuitry are also provided.
    Type: Application
    Filed: October 30, 2018
    Publication date: April 30, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Greg Sadowski
  • Patent number: 10637472
    Abstract: A reference voltage generation circuit for use with current mode logic includes a first transistor of a first conductivity type configured to operate as a diode-connected resistor with a source terminal coupled to a first voltage supply terminal for conducting a supply voltage and a gate terminal coupled to a drain terminal. Second and third transistors of a second conductivity type are coupled in series between the drain terminal of the first transistor and a second voltage supply terminal. Gate terminals of the second and third transistors are coupled to the gate terminal of the first transistor. A reference voltage is obtained between the second and third transistors. The first and second NMOS transistors are sized such that they remain in sub-threshold mode operation during operation with an expected range of the supply voltage. Current mode logic circuits are also provided using the reference voltage generation circuit.
    Type: Grant
    Filed: May 21, 2019
    Date of Patent: April 28, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Aditya Mitra, Animesh Jain
  • Patent number: 10635588
    Abstract: A processing system includes a first set of one or more processing units including a first processing unit, a second set of one or more processing units including a second processing unit, and a memory having an address space shared by the first and second sets. The processing system further includes a distributed coherence directory subsystem having a first coherence directory to support a first subset of one or more address regions of the address space and a second coherence directory to support a second subset of one or more address regions of the address space. In some implementations, the first coherence directory is implemented in the system so as to have a lower access latency for the first set, whereas the second coherence directory is implemented in the system so as to have a lower access latency for the second set.
    Type: Grant
    Filed: June 5, 2018
    Date of Patent: April 28, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Yasuko Eckert, Maurice B. Steinman, Steven Raasch
  • Patent number: 10635591
    Abstract: Systems and methods selectively filter, buffer, and process cache coherency probes. A processor includes a probe buffering unit that includes a cache coherency probe buffer. The probe buffering unit receives cache coherency probes and memory access requests for a cache. The probe buffering unit identifies and discards any of the probes that are directed to a memory block that is not cached in the cache, and buffers at least a subset of the remaining probes in the probe buffer. The probe buffering unit submits to the cache, in descending order of priority, one or more of: any buffered probes that are directed to the memory block to which a current memory access request is also directed; any current memory access requests that are directed to a memory block to which there is not a buffered probe also directed; and any buffered probes when there is not a current memory access request.
    Type: Grant
    Filed: December 5, 2018
    Date of Patent: April 28, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ashok T. Venkatachar, Anthony Jarvis
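The filtering and three-level priority order in the abstract of patent 10635591 can be modeled as a small software queue. This is a behavioral sketch only: the class name `ProbeBuffer`, its methods, and the first-in-first-out order among buffered probes are assumptions, not details taken from the patent.

```python
from collections import deque

class ProbeBuffer:
    """Toy model of the probe buffering unit's filter and arbitration."""

    def __init__(self, cached_blocks):
        self.cached = set(cached_blocks)   # blocks currently held in the cache
        self.probes = deque()              # buffered coherency probes

    def receive_probe(self, block):
        """Discard probes for uncached blocks; buffer the rest."""
        if block not in self.cached:
            return False
        self.probes.append(block)
        return True

    def next_submission(self, request_block=None):
        """Choose what to send to the cache next, highest priority first."""
        if request_block is not None:
            if request_block in self.probes:
                # 1. A buffered probe to the same block goes before the request.
                self.probes.remove(request_block)
                return ("probe", request_block)
            # 2. Otherwise the memory access request proceeds.
            return ("request", request_block)
        if self.probes:
            # 3. With no request pending, drain buffered probes.
            return ("probe", self.probes.popleft())
        return None

if __name__ == "__main__":
    pb = ProbeBuffer({1, 2})
    pb.receive_probe(1)
    pb.receive_probe(2)
    print(pb.next_submission(2), pb.next_submission(2), pb.next_submission())
```

Serving a same-block probe ahead of the request keeps the request from observing stale coherence state, while unrelated probes wait for idle cycles so they do not delay the request.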
  • Patent number: 10636736
    Abstract: An integrated circuit assembly includes an integrated circuit package substrate and a conductive land pad disposed on a surface of the integrated circuit package substrate. The conductive land pad comprises a conductor portion, an isolated conductor portion, and an isolation portion disposed between the conductor portion and the isolated conductor portion. The isolated conductor portion may surround a first side of the conductor portion and a second side of the conductor portion. The isolated conductor portion may surround a portion of a perimeter of the conductor portion. The isolation portion may include a gap between the conductor portion and the isolated conductor portion. The gap may have a width smaller than a radius of an interconnect structure of a receiving structure.
    Type: Grant
    Filed: December 8, 2017
    Date of Patent: April 28, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sanjay Dandia, Gerald R. Talbot, Mahesh S. Hardikar
  • Publication number: 20200125490
    Abstract: A data processing system includes a host processor, a local memory coupled to the host processor, a plurality of remote memory media, and a scalable data fabric coupled to the host processor and to the plurality of remote memory media. The scalable data fabric includes a filter for storing information indicating a location of data that is stored by the data processing system. The host processor includes a hardware sequencer coupled to the filter for selectively moving data stored by the filter to the local memory.
    Type: Application
    Filed: October 23, 2018
    Publication date: April 23, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Sergey Blagodurov, Timothy E. Landreth, Stanley Ames Lackey, Jr., Patrick Conway
  • Patent number: 10627883
    Abstract: A processor includes a plurality of voltage droop detectors positioned at multiple points of a processor. The detectors monitor voltage levels and alert the processor if a droop event has been detected in real time. Multiple droops can be detected simultaneously, with each detected droop event generating an alert that is sent to a processor module, such as a clock control module, to act based on the detected droop. Each detector employs a ring oscillator that generates a periodic signal and a corresponding count based on that signal, where the frequency of the signal varies based on a voltage at the corresponding point being monitored.
    Type: Grant
    Filed: February 28, 2018
    Date of Patent: April 21, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amitabh Mehra, Dana G. Lewis
  • Patent number: 10630271
    Abstract: A sampling circuit automatically resamples the data from another timing domain until the sampled data is represented correctly in the new domain by assuring that no metastable states exist. If a metastable state exists, a sampling signal recirculates through the sampling circuit until the metastable state no longer exists. A comparison of input data to sampled data is used to determine the existence of a metastable state.
    Type: Grant
    Filed: August 17, 2016
    Date of Patent: April 21, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Greg Sadowski