Patents Assigned to Advanced Micro Devices, Inc.
-
Patent number: 10643369
Abstract: Techniques for improving memory utilization for communication between stages of a graphics processing pipeline are disclosed. The techniques include analyzing output instructions of a first shader program to determine whether any such output instructions output some data that is not used by a second shader program. The compiler performs data packing if gaps exist between used output data to reduce memory footprint. The compiler generates optimized output instructions in the first shader program and optimized input instructions in the second shader program to output the used data from the first shader program and input that data in the second shader program in a packed format based on information about usage of output data and data packing. If needed, the compiler inserts instructions to perform runtime checking to identify unused output data of the first shader program based on information not known at compile-time.
Type: Grant
Filed: May 30, 2018
Date of Patent: May 5, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Guohua Jin, Richard A. Burns, Todd Martin, Gianpaolo Tommasi
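As a rough illustration of the packing idea described in this abstract, the sketch below (not taken from the patent; all names are hypothetical) shows how a compiler-like pass could drop outputs the second stage never reads and assign the remaining ones contiguous slots so the consumer stage reads them in packed form.

```python
# Hypothetical sketch of inter-stage output packing: outputs unused by the
# consumer stage are dropped and the rest get contiguous slots.

def pack_outputs(producer_outputs, consumer_inputs):
    """Map each output name used by the consumer to a packed slot index."""
    packed = {}
    next_slot = 0
    for name in producer_outputs:          # original (possibly sparse) order
        if name in consumer_inputs:        # keep only data the next stage reads
            packed[name] = next_slot       # assign the next contiguous slot
            next_slot += 1
    return packed, next_slot               # mapping plus packed footprint

# Example: the second stage only reads 'position' and 'uv1'.
producer = ["position", "normal", "uv0", "uv1"]
consumer = {"position", "uv1"}
layout, size = pack_outputs(producer, consumer)
print(layout, "slots used:", size)         # {'position': 0, 'uv1': 1} slots used: 2
```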
-
Patent number: 10642336
Abstract: A processor adjusts frequencies of one or more clock signals in response to a voltage droop at the processor. The processor generates at least one clock signal by generating a plurality of base clock signals, each of the base clock signals having a common frequency but a different phase. The processor also generates a plurality of enable signals, wherein each enable signal governs whether a corresponding one of the base clock signals is used to generate the clock signal. The enable signals therefore determine the frequency of the clock signal. In response to detecting a voltage droop, the processor adjusts the enable signals used to generate the clock signal, thereby reducing the frequency of the clock signal during the droop.
Type: Grant
Filed: July 12, 2016
Date of Patent: May 5, 2020
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Steven Kommrusch, Amitabh Mehra, Richard Martin Born, Bobby D. Young
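A loose way to picture the enable-signal mechanism is to treat each base clock as contributing one edge per base period at its own phase, with a per-phase enable bit deciding whether that edge reaches the output; masking enables during a droop lowers the average output frequency. The sketch below is purely illustrative and is not derived from the patent's circuitry; the phase count and frequencies are made up.

```python
# Illustrative model (not the patent circuit): N phase-shifted base clocks at a
# common frequency are combined, and per-phase enable bits decide which edges
# reach the output, so clearing enables lowers the effective output frequency.

def effective_frequency(base_freq_hz, enables):
    """Average output frequency when only enabled phase slices contribute edges."""
    return base_freq_hz * sum(enables)

base = 250e6                        # hypothetical 250 MHz base clocks, 8 phases
all_on = [1] * 8
droop = [1, 0, 1, 0, 1, 0, 1, 0]    # half the edges masked during a droop
print(effective_frequency(base, all_on) / 1e9, "GHz nominal")
print(effective_frequency(base, droop) / 1e9, "GHz during droop response")
```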
-
Patent number: 10644826
Abstract: An integrated circuit includes first and second through-silicon via (TSV) circuits and a steering logic circuit. The first TSV circuit has a first TSV and a first multiplexer for selecting between a first TSV data signal received from the first TSV and a first local data signal for transmission to a first TSV output terminal. The second TSV circuit includes a second TSV and a second multiplexer for selecting between a second TSV data signal received from the second TSV and the first local data signal for transmission to a second TSV output terminal. The steering logic circuit controls the first multiplexer to select the first local data signal and the second multiplexer to select the second TSV data signal in a first mode, and the first multiplexer to select the first TSV data signal and the second multiplexer to select the first local data signal in a second mode.
Type: Grant
Filed: February 23, 2018
Date of Patent: May 5, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: John Wuu, Samuel Naffziger, Michael K. Ciraula, Russell Schreiber
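The two multiplexer modes can be pictured with a small behavioral model. The sketch below is a hypothetical, purely functional rendering (not RTL from the patent) of which signal each TSV output terminal carries in each mode.

```python
# Behavioral sketch (hypothetical, not the patent RTL) of the two steering modes.

def steer(mode, tsv1_data, tsv2_data, local_data):
    """Return (first TSV output, second TSV output) for the selected mode."""
    if mode == 1:
        # First mode: first output takes the local data signal,
        # second output passes through the second TSV's data.
        return local_data, tsv2_data
    else:
        # Second mode: first output passes through the first TSV's data,
        # second output takes the local data signal.
        return tsv1_data, local_data

print(steer(1, "tsv1", "tsv2", "local"))   # ('local', 'tsv2')
print(steer(2, "tsv1", "tsv2", "local"))   # ('tsv1', 'local')
```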
-
Publication number: 20200133360Abstract: Control of power supplied to a machine intelligence (MI) processor is provided with an energy reservoir and power switching circuitry coupled to a power supply, the energy reservoir, and to power delivery circuitry of the MI processor. Control circuitry directs the power switching circuitry to charge the energy reservoir from the power supply or discharge the energy reservoir to the MI processor based on MI state information obtained from the MI processor. Processes for charging and discharging such an energy reservoir are provided. Processes for analyzing state information of the MI processor and configuring the control circuitry are also provided.Type: ApplicationFiled: October 30, 2018Publication date: April 30, 2020Applicant: Advanced Micro Devices, Inc.Inventor: Greg Sadowski
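One way to read the control policy is as a loop that charges the reservoir when the MI workload is light and discharges it into the processor's power delivery during bursts. The following sketch is a hypothetical illustration only; the state names and thresholds are not from the publication.

```python
# Hypothetical control-loop sketch: charge the energy reservoir from the supply
# when the MI processor is lightly loaded, discharge it during compute bursts.

def reservoir_action(mi_state, reservoir_charge, full_charge):
    """Pick a switching action from MI state information and reservoir level."""
    if mi_state == "burst" and reservoir_charge > 0:
        return "discharge_to_processor"      # supplement the supply during peaks
    if mi_state == "idle" and reservoir_charge < full_charge:
        return "charge_from_supply"          # top the reservoir back up
    return "hold"                            # neither charge nor discharge

for state, level in [("burst", 0.8), ("idle", 0.4), ("idle", 1.0)]:
    print(state, level, "->", reservoir_action(state, level, full_charge=1.0))
```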
-
Publication number: 20200133992
Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
Type: Application
Filed: October 31, 2018
Publication date: April 30, 2020
Applicant: Advanced Micro Devices, Inc.
Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
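The data movement can be visualized as a blocked matrix multiply in which each core holds one block of the first matrix and one block of the second, multiplies them, then receives a neighbor's block of the second matrix and multiplies again. The NumPy sketch below models the cores as list entries and illustrates only the pattern, not the publication's implementation.

```python
import numpy as np

# Illustrative blocked multiply (not the publication's implementation):
# "core" i holds A block i and B block i, computes a partial product, then
# receives the next core's B block and computes another partial product.

rng = np.random.default_rng(0)
n_cores, blk = 2, 4
A_blocks = [rng.standard_normal((blk, blk)) for _ in range(n_cores)]
B_blocks = [rng.standard_normal((blk, blk)) for _ in range(n_cores)]

partials = []
for i in range(n_cores):
    local = A_blocks[i] @ B_blocks[i]                  # product of the locally stored sub-portions
    neighbor = B_blocks[(i + 1) % n_cores]             # sub-portion received from another core
    partials.append((local, A_blocks[i] @ neighbor))   # product with the received sub-portion

print(partials[0][0].shape, partials[0][1].shape)      # each partial is a blk x blk tile
```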
-
Publication number: 20200133518
Abstract: Memory management circuitry and processes operate to improve reliability of a group of memory stacks, so that if a memory stack or a portion thereof fails during the product's lifetime, the system can still recover with no errors or data loss. A front-end controller receives a block of data requested to be written to memory, divides the block into sub-blocks, and creates a new redundant reliability sub-block. The sub-blocks are then written to different memory stacks. When reading data from the memory stacks, the front-end controller detects errors indicating a failure within one of the memory stacks, and recovers corrected data using the reliability sub-block. The front-end controller may monitor errors for signs of a stack failure and disable the failed stack.
Type: Application
Filed: October 31, 2018
Publication date: April 30, 2020
Applicant: Advanced Micro Devices, Inc.
Inventors: Georgios Mappouras, Amin Farmahini Farahani, Michael Ignatowski
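The redundant sub-block can be thought of as a parity block: an XOR of the data sub-blocks, written to its own stack, so that any one lost sub-block can be rebuilt from the survivors. The sketch below is a minimal illustration of that idea with byte strings; it is not the controller's actual encoding.

```python
from functools import reduce

# Minimal parity sketch (not the controller's actual encoding): split a block
# into sub-blocks, add an XOR reliability sub-block, and rebuild a lost one.

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def split_with_parity(block, n_sub):
    size = len(block) // n_sub
    subs = [block[i * size:(i + 1) * size] for i in range(n_sub)]
    parity = reduce(xor_bytes, subs)          # redundant reliability sub-block
    return subs, parity

def recover(subs, parity, failed_index):
    """Rebuild the sub-block lost when one memory stack fails."""
    survivors = [s for i, s in enumerate(subs) if i != failed_index]
    return reduce(xor_bytes, survivors, parity)

block = bytes(range(16))
subs, parity = split_with_parity(block, n_sub=4)
assert recover(subs, parity, failed_index=2) == subs[2]
print("recovered sub-block matches original")
```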
-
Publication number: 20200133866
Abstract: The disclosure herein provides techniques for designing cache compression algorithms that control how data in caches are compressed. The techniques generate a custom “byte select algorithm” by applying repeated transforms to an initial compression algorithm until a set of suitability criteria is met. The suitability criteria include that the “cost” is below a threshold and that a metadata constraint is met. The “cost” is the number of blocks that can be compressed by an algorithm as compared with the “ideal” algorithm. The metadata constraint is the number of bits required for metadata.
Type: Application
Filed: October 31, 2018
Publication date: April 30, 2020
Applicant: Advanced Micro Devices, Inc.
Inventors: Shomit N. Das, Matthew Tomei, David A. Wood
-
Publication number: 20200134248
Abstract: Methods are provided for debugging a processor by executing a randomly created and randomly executed executable on a fabricated processor. The executable may execute via startup firmware. By implementing randomization at multiple levels in the testing of the processor, coupled with highly specific test generation constraint rules, highly focused tests of a micro-architectural feature are implemented while at the same time a high degree of random permutation is applied in the way that specific feature is stressed. This allows for the detection and diagnosis of errors and bugs in the processor that elude traditional testing methods. Once the errors and bugs are detected and diagnosed, the processor can then be redesigned to no longer produce the anomalies. By eliminating the errors and bugs in the processor, a processor with improved computational efficiency and reliability can be fabricated.
Type: Application
Filed: December 20, 2019
Publication date: April 30, 2020
Applicant: Advanced Micro Devices, Inc.
Inventor: Eric W. Schieve
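The layered randomization can be sketched as a generator that fixes constraint rules targeting one micro-architectural feature while randomizing everything the rules leave open, including the seed used for each run. The code below is a hypothetical illustration, not the patent's test infrastructure; the rule fields and operation names are made up.

```python
import random

# Hypothetical sketch of constrained-random test generation: the constraint
# rules keep the test focused on one feature, while the randomness permutes
# how that feature is stressed on each run.

def generate_test(seed, n_ops, constraint_rules):
    rng = random.Random(seed)                                # randomization level 1: the seed
    ops = []
    for _ in range(n_ops):
        op = rng.choice(constraint_rules["allowed_ops"])     # stay on the targeted feature
        operand = rng.randrange(constraint_rules["operand_range"])
        ops.append((op, operand))                            # randomization level 2: the sequence
    return ops

rules = {"allowed_ops": ["load", "store", "prefetch"], "operand_range": 4096}
print(generate_test(seed=1234, n_ops=5, constraint_rules=rules))
```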
-
Publication number: 20200133993
Abstract: A processing device is provided which includes memory and a processor comprising a plurality of processor cores in communication with each other via first and second hierarchical communication links. Each processor core in a group of the processor cores is in communication with each other via the first hierarchical communication links. Each processor core is configured to store, in the memory, one of a plurality of sub-portions of data of a first matrix, store, in the memory, one of a plurality of sub-portions of data of a second matrix, determine an outer product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core of the group of processor cores, another sub-portion of data of the second matrix and determine another outer product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
Type: Application
Filed: October 31, 2018
Publication date: April 30, 2020
Applicant: Advanced Micro Devices, Inc.
Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
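The per-core work here can be pictured as accumulating outer products of the locally held sub-portions, then repeating with a sub-portion received from another core in the group. The NumPy lines below sketch that arithmetic only; they are not the publication's implementation.

```python
import numpy as np

# Sketch of the per-core arithmetic (illustrative only): accumulate the outer
# product of the local A and B sub-portions, then the outer product with a
# B sub-portion received from another core in the group.

rng = np.random.default_rng(1)
a_local = rng.standard_normal(4)        # sub-portion of the first matrix held locally
b_local = rng.standard_normal(4)        # sub-portion of the second matrix held locally
b_recv = rng.standard_normal(4)         # sub-portion received from another core

partial = np.outer(a_local, b_local)    # first outer product
partial += np.outer(a_local, b_recv)    # accumulate the outer product with received data
print(partial.shape)                    # (4, 4) partial result tile
```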
-
Publication number: 20200134445
Abstract: The deep Q learning technique trains weights of an artificial neural network using a number of unique features, including separate target and prediction networks, random experience replay to avoid issues with temporally correlated training samples, and others. A hardware architecture is described that is tuned to perform deep Q learning. Inference cores use a prediction network to determine an action to apply to an environment. A replay memory stores the results of the action. Training cores use a loss function derived from outputs of both the target and prediction networks to update weights of the prediction neural network. A high speed copy engine periodically copies weights from the prediction neural network to the target neural network.
Type: Application
Filed: October 31, 2018
Publication date: April 30, 2020
Applicant: Advanced Micro Devices, Inc.
Inventors: Shuai Che, Jieming Yin
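The abstract maps onto the standard deep Q-learning loop: a prediction network picks actions, transitions go into a replay memory, training samples the memory at random, the loss compares the prediction network against a target network, and the target weights are periodically copied from the prediction weights. The sketch below illustrates only that replay-and-copy skeleton with tiny linear "networks" standing in for neural networks; it is not the hardware architecture described here.

```python
import random
import numpy as np

# Skeleton of the deep Q-learning loop the abstract refers to (illustrative
# only, with trivial linear Q functions standing in for neural networks).

rng = np.random.default_rng(0)
pred_w = rng.standard_normal((4, 2))      # prediction network weights (4 state dims, 2 actions)
target_w = pred_w.copy()                  # target network starts as a copy
replay = []                               # replay memory of (state, action, reward, next_state)

def q_values(w, state):
    return state @ w

for step in range(200):
    state = rng.standard_normal(4)
    action = int(np.argmax(q_values(pred_w, state)))          # inference with the prediction net
    reward, next_state = rng.standard_normal(), rng.standard_normal(4)
    replay.append((state, action, reward, next_state))        # store the experience

    if len(replay) >= 32:
        s, a, r, s2 = random.choice(replay)                   # random, temporally uncorrelated sample
        target = r + 0.99 * np.max(q_values(target_w, s2))    # bootstrapped target from the target net
        td_error = target - q_values(pred_w, s)[a]
        pred_w[:, a] += 0.01 * td_error * s                   # gradient-style update of the prediction net

    if step % 50 == 0:
        target_w = pred_w.copy()                              # periodic copy, like the copy engine

print("trained prediction weights:\n", np.round(pred_w, 2))
```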
-
Patent number: 10635588
Abstract: A processing system includes a first set of one or more processing units including a first processing unit, a second set of one or more processing units including a second processing unit, and a memory having an address space shared by the first and second sets. The processing system further includes a distributed coherence directory subsystem having a first coherence directory to support a first subset of one or more address regions of the address space and a second coherence directory to support a second subset of one or more address regions of the address space. In some implementations, the first coherence directory is implemented in the system so as to have a lower access latency for the first set, whereas the second coherence directory is implemented in the system so as to have a lower access latency for the second set.
Type: Grant
Filed: June 5, 2018
Date of Patent: April 28, 2020
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Yasuko Eckert, Maurice B. Steinman, Steven Raasch
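The split can be pictured as a routing function that sends each coherence lookup to whichever directory covers the lookup's address region, with each directory placed near the set of processing units that accesses that region with the lowest latency. The sketch below is a hypothetical illustration of the address-to-directory mapping only; the region size and directory names are made up.

```python
# Hypothetical sketch of routing a coherence lookup to the directory that
# covers its address region (illustration only, not the patent's subsystem).

REGION_SIZE = 1 << 30          # assume 1 GiB address regions
DIRECTORY_FOR_REGION = {       # region index -> directory placed near the set that uses it most
    0: "directory_near_set_1",
    1: "directory_near_set_2",
}

def directory_for(address):
    region = address // REGION_SIZE
    return DIRECTORY_FOR_REGION.get(region, "directory_near_set_1")   # default home

print(directory_for(0x1000))          # region 0 -> directory_near_set_1
print(directory_for(0x5000_0000))     # region 1 -> directory_near_set_2
```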
-
Patent number: 10637472
Abstract: A reference voltage generation circuit for use with current mode logic includes a first transistor of a first conductivity type configured to operate as a diode-connected resistor with a source terminal coupled to a first voltage supply terminal for conducting a supply voltage and a gate terminal coupled to a drain terminal. Second and third transistors of a second conductivity type are coupled in series between the drain terminal of the first transistor and a second voltage supply terminal. Gate terminals of the second and third transistors are coupled to the gate terminal of the first transistor. A reference voltage is obtained between the second and third transistors. The first and second NMOS transistors are sized such that they remain in sub-threshold operation over an expected range of the supply voltage. Current mode logic circuits are also provided using the reference voltage generation circuit.
Type: Grant
Filed: May 21, 2019
Date of Patent: April 28, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Aditya Mitra, Animesh Jain
-
Patent number: 10635591
Abstract: Systems and methods selectively filter, buffer, and process cache coherency probes. A processor includes a probe buffering unit that includes a cache coherency probe buffer. The probe buffering unit receives cache coherency probes and memory access requests for a cache. The probe buffering unit identifies and discards any of the probes that are directed to a memory block that is not cached in the cache, and buffers at least a subset of the remaining probes in the probe buffer. The probe buffering unit submits to the cache, in descending order of priority, one or more of: any buffered probes that are directed to the memory block to which a current memory access request is also directed; any current memory access requests that are directed to a memory block to which there is not a buffered probe also directed; and any buffered probes when there is not a current memory access request.
Type: Grant
Filed: December 5, 2018
Date of Patent: April 28, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Ashok T. Venkatachar, Anthony Jarvis
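The priority ordering in the last sentence can be modeled as a small arbitration function. The sketch below is a hypothetical, simplified rendering of that ordering (blocks are represented as plain addresses) and not the probe buffering unit's actual logic.

```python
# Illustrative sketch (not the patent logic) of the submission priority:
# 1) buffered probes that target the block of the current request,
# 2) the current request when no buffered probe targets its block,
# 3) buffered probes when there is no current request.

def next_submission(buffered_probes, current_request, cached_blocks):
    # Probes to blocks not present in the cache were already discarded.
    probes = [p for p in buffered_probes if p in cached_blocks]
    if current_request is not None:
        matching = [p for p in probes if p == current_request]
        if matching:
            return ("probe", matching[0])       # highest priority: conflicting probe first
        return ("request", current_request)     # no conflicting probe: service the request
    if probes:
        return ("probe", probes[0])             # idle cycles drain the probe buffer
    return None

cached = {0x100, 0x200}
print(next_submission([0x100, 0x300], current_request=0x100, cached_blocks=cached))
print(next_submission([0x200], current_request=0x100, cached_blocks=cached))
print(next_submission([0x200], current_request=None, cached_blocks=cached))
```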
-
Patent number: 10636736
Abstract: An integrated circuit assembly includes an integrated circuit package substrate and a conductive land pad disposed on a surface of the integrated circuit package substrate. The conductive land pad comprises a conductor portion, an isolated conductor portion, and an isolation portion disposed between the conductor portion and the isolated conductor portion. The isolated conductor portion may surround a first side of the conductor portion and a second side of the conductor portion. The isolated conductor portion may surround a portion of a perimeter of the conductor portion. The isolation portion may include a gap between the conductor portion and the isolated conductor portion. The gap may have a width smaller than a radius of an interconnect structure of a receiving structure.
Type: Grant
Filed: December 8, 2017
Date of Patent: April 28, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Sanjay Dandia, Gerald R. Talbot, Mahesh S. Hardikar
-
Publication number: 20200125490
Abstract: A data processing system includes a host processor, a local memory coupled to the host processor, a plurality of remote memory media, and a scalable data fabric coupled to the host processor and to the plurality of remote memory media. The scalable data fabric includes a filter for storing information indicating a location of data that is stored by the data processing system. The host processor includes a hardware sequencer coupled to the filter for selectively moving data stored by the filter to the local memory.
Type: Application
Filed: October 23, 2018
Publication date: April 23, 2020
Applicant: Advanced Micro Devices, Inc.
Inventors: Sergey Blagodurov, Timothy E. Landreth, Stanley Ames Lackey, JR., Patrick Conway
-
Patent number: 10628063
Abstract: A method and device generate a slab identifier and a hash function identifier in response to a memory allocation request, with a request identifier and allocation size, from a memory allocation requestor. The slab identifier indicates a memory region associated with a base data size and the hash function identifier indicates a hash function. The method and device provide a bit string including the slab identifier and the hash function identifier to the memory allocation requestor.
Type: Grant
Filed: August 24, 2018
Date of Patent: April 21, 2020
Assignee: Advanced Micro Devices, Inc.
Inventor: Alexander Dodd Breslow
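The returned bit string can be pictured as two fields packed together: a slab identifier chosen from the allocation size and a hash function identifier. The sketch below uses a hypothetical encoding with made-up field widths and slab sizes, not the patent's format.

```python
# Hypothetical encoding sketch (field widths and slab sizes are made up):
# pick a slab by rounding the allocation size up to a base data size, pick a
# hash function, and pack both identifiers into one bit string.

SLAB_BASE_SIZES = [32, 64, 128, 256, 512]     # base data size per slab/memory region
HASH_ID_BITS = 3                              # low bits carry the hash function id

def allocate(request_size, hash_id):
    slab_id = next(i for i, base in enumerate(SLAB_BASE_SIZES) if base >= request_size)
    return (slab_id << HASH_ID_BITS) | hash_id           # packed bit string

def unpack(bit_string):
    return bit_string >> HASH_ID_BITS, bit_string & ((1 << HASH_ID_BITS) - 1)

token = allocate(request_size=100, hash_id=5)
print(bin(token), unpack(token))              # slab 2 (128-byte base), hash function 5
```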
-
Patent number: 10630271
Abstract: A sampling circuit automatically resamples the data from another timing domain until the sampled data is represented correctly in the new domain by assuring that no metastable states exist. If a metastable state exists, a sampling signal recirculates through the sampling circuit until the metastable state no longer exists. A comparison of input data to sampled data is used to determine the existence of a metastable state.
Type: Grant
Filed: August 17, 2016
Date of Patent: April 21, 2020
Assignee: Advanced Micro Devices, Inc.
Inventor: Greg Sadowski
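The resampling loop can be sketched in software as: sample, compare the sample against the live input, and sample again until the two agree, which stands in for "no metastable state detected". This is an illustrative model only, not the sampling circuit itself; the corruption probability is arbitrary.

```python
import random

# Illustrative model (not the circuit): keep resampling the cross-domain data
# until the sampled value matches the input, standing in for the comparison
# that detects a metastable sample.

def sample_with_possible_metastability(value, rng):
    # With some probability the captured value is corrupted (metastable).
    return value if rng.random() > 0.3 else None

def resolve_sample(value, rng):
    attempts = 0
    while True:
        attempts += 1
        sampled = sample_with_possible_metastability(value, rng)
        if sampled == value:          # comparison of input data to sampled data
            return sampled, attempts  # stable: the sample matches the source
        # otherwise recirculate the sampling signal and try again

rng = random.Random(7)
print(resolve_sample(0xBEEF, rng))
```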
-
Patent number: 10627883
Abstract: A processor includes a plurality of voltage droop detectors positioned at multiple points of a processor. The detectors monitor voltage levels and alert the processor if a droop event has been detected in real time. Multiple droops can be detected simultaneously, with each detected droop event generating an alert that is sent to a processor module, such as a clock control module, to act based on the detected droop. Each detector employs a ring oscillator that generates a periodic signal and a corresponding count based on that signal, where the frequency of the signal varies based on a voltage at the corresponding point being monitored.
Type: Grant
Filed: February 28, 2018
Date of Patent: April 21, 2020
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Amitabh Mehra, Dana G. Lewis
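The ring-oscillator counts give a simple detection rule to sketch: a point whose count over a sampling window falls well below its nominal value is reporting a droop. The code below is a hypothetical illustration of that thresholding, with made-up counts and threshold, not the detector circuit.

```python
# Illustrative model (not the detector circuit): the ring-oscillator count over
# a sampling window falls when the local supply voltage droops, so a count
# below a calibrated threshold raises an alert for that point of the die.

def check_droop(counts_by_point, nominal_count, threshold_fraction=0.95):
    alerts = []
    for point, count in counts_by_point.items():
        if count < nominal_count * threshold_fraction:   # oscillator slowed: voltage dropped
            alerts.append(point)                         # alert a module such as clock control
    return alerts

counts = {"core0": 1000, "core1": 930, "mem_ctrl": 998}  # hypothetical per-point counts
print(check_droop(counts, nominal_count=1000))           # ['core1']
```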
-
Patent number: 10628124
Abstract: Techniques and circuits are provided for stochastic rounding. In an embodiment, a circuit includes carry-save adder (CSA) logic having three or more CSA inputs, a CSA sum output, and a CSA carry output. One of the three or more CSA inputs is presented with a random number value, while other CSA inputs are presented with input values to be summed. The circuit further includes adder logic having adder inputs and a sum output. The CSA carry output of the CSA logic is coupled with one of the adder inputs of the adder logic, and the CSA sum output of the CSA logic is coupled with another input of the adder inputs of the adder logic. A particular number of most significant bits of the sum output of the adder logic represent a stochastically rounded sum of the input values.
Type: Grant
Filed: March 22, 2018
Date of Patent: April 21, 2020
Assignee: ADVANCED MICRO DEVICES, INC.
Inventor: Gabriel H. Loh
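The arithmetic behind the circuit can be checked with a scalar model: add a uniform random value spanning the bits that will be discarded, then keep only the most significant bits; the result rounds up with probability equal to the discarded fraction. The sketch below shows that behavior in integer arithmetic; it models the effect of feeding the random number into the adder tree, not the carry-save adder hardware itself.

```python
import random

# Scalar model of stochastic rounding: summing the inputs plus a random value
# spanning the discarded low bits, then keeping the most significant bits,
# rounds up with probability equal to the discarded fraction. (This models the
# effect of the random CSA input, not the carry-save adder hardware.)

def stochastic_round_sum(values, keep_msbs, total_bits, rng):
    discard_bits = total_bits - keep_msbs
    total = sum(values) + rng.randrange(1 << discard_bits)   # random value on one input
    return total >> discard_bits                             # keep the most significant bits

rng = random.Random(0)
samples = [stochastic_round_sum([5, 6], keep_msbs=2, total_bits=4, rng=rng) for _ in range(10000)]
# 5 + 6 = 11 = 0b1011; keeping 2 MSBs yields 2 or 3, rounding up about 75% of the time.
print(sum(s == 3 for s in samples) / len(samples))
```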
-
Publication number: 20200117617
Abstract: Described is a method and apparatus for application migration between a dockable device and a docking station in a seamless manner. The dockable device includes a processor and the docking station includes a high-performance processor. The method includes determining a docking state of a dockable device while at least an application is running. Application migration from the dockable device to a docking station is initiated when the dockable device is moving to a docked state. Application migration from the docking station to the dockable device is initiated when the dockable device is moving to an undocked state. The application continues to run during the application migration from the dockable device to the docking station or during the application migration from the docking station to the dockable device.
Type: Application
Filed: December 6, 2019
Publication date: April 16, 2020
Applicant: Advanced Micro Devices, Inc.
Inventors: Jonathan Lawrence Campbell, Yuping Shen
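The docking-state handling can be pictured as a handler that reacts to dock and undock transitions by migrating the still-running application toward whichever processor the new state selects. The sketch below is a purely hypothetical illustration of that control flow; the state and processor names are not from the publication.

```python
# Hypothetical control-flow sketch (not the described apparatus): migrate the
# still-running application toward the processor chosen by the docking state.

def on_dock_state_change(app, new_state):
    if new_state == "docking":
        target = "docking_station_processor"     # high-performance processor
    elif new_state == "undocking":
        target = "dockable_device_processor"
    else:
        return app                               # no transition, nothing to migrate
    # The application keeps running while its state is moved to the target.
    return dict(app, running_on=target, running=True)

app = {"name": "game", "running": True, "running_on": "dockable_device_processor"}
app = on_dock_state_change(app, "docking")
print(app["running_on"], "still running:", app["running"])
```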