Patents Assigned to ADVANCED MICRO DEVICES (AMD)
-
Patent number: 9727435
Abstract: A method for automatically scaling estimates of digital power consumed by a portion of an integrated circuit (IC) device by the operating frequency of the portion of the IC are described herein. The method may include obtaining an energy value which may correspond to an amount of energy used by the portion of the IC. A cumulative energy value may be generated by repeatedly, at a frequency proportional to the operating frequency of the portion of the IC, obtaining energy values and adding each obtained energy value to a sum of energy values for the portion of the IC. The cumulative energy value may be sampled at a time sample interval to generate an estimate of the portion of the IC's digital power consumption that is automatically scaled with the operating frequency of the portion of the IC.
Type: Grant
Filed: June 22, 2015
Date of Patent: August 8, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Samuel D. Naffziger, Suresh B. Periyacheri
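The accumulate-then-sample idea in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the function name `estimate_power`, the fixed per-cycle energy value, and the numbers in the usage lines are all hypothetical. Because energy is added once per clock cycle, the sampled estimate scales with the operating frequency without any explicit multiplication by frequency.

```python
def estimate_power(energy_per_cycle_j, freq_hz, sample_interval_s):
    """Toy model: add one energy value per clock cycle, then sample.

    All parameter names are illustrative assumptions, not terms from the patent.
    """
    # Number of accumulation steps that occur within one sample interval.
    cycles = round(freq_hz * sample_interval_s)
    accumulator = 0.0
    for _ in range(cycles):
        accumulator += energy_per_cycle_j  # cumulative energy value
    # Energy accumulated over the interval divided by the interval gives power.
    return accumulator / sample_interval_s


# Hypothetical numbers: 2 mJ per cycle, sampled once per second.
p_slow = estimate_power(0.002, 1000, 1.0)  # about 2 W
p_fast = estimate_power(0.002, 2000, 1.0)  # about 4 W: doubles with frequency
```

In this model the accumulation rate is the only place the frequency enters, which is the property the abstract describes as automatic scaling.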
-
Patent number: 9710034
Abstract: A method and apparatus using temperature margin to balance performance with power allocation. Nominal, middle and high power levels are determined for compute elements. A set of temperature thresholds are determined that drive the power allocation of the compute elements towards a balanced temperature profile. For a given workload, temperature differentials are determined for each of the compute elements relative the other compute elements, where the temperature differentials correspond to workload utilization of the compute element. If temperature overhead is available, and a compute element is below a temperature threshold, then particular compute elements are allocated power to match or drive toward the balanced temperature profile.
Type: Grant
Filed: June 8, 2015
Date of Patent: July 18, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Samuel D. Naffziger, Michael Osborn, Sebastien Nussbaum
-
Patent number: 9697176
Abstract: A method of multiplication of a sparse matrix and a vector to obtain a new vector and a system for implementing the method are claimed. Embodiments of the method are intended to optimize the performance of sparse matrix-vector multiplication in highly parallel processors, such as GPUs. The sparse matrix is stored in compressed sparse row (CSR) format.
Type: Grant
Filed: November 14, 2014
Date of Patent: July 4, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Mayank Daga, Joseph L. Greathouse
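For readers unfamiliar with the CSR format named in this abstract, the following is a minimal sequential sketch of sparse matrix-vector multiplication over CSR arrays. It shows the baseline computation only, not the patented GPU optimization; the function name `csr_matvec` and the example matrix are assumptions made for illustration.

```python
def csr_matvec(values, col_indices, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values      -- nonzero entries, row by row
    col_indices -- column index of each entry in `values`
    row_ptr     -- row_ptr[i]:row_ptr[i+1] is the slice of `values` for row i
    """
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        # Visit only the stored nonzeros of this row.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_indices[k]]
        y.append(acc)
    return y


# Example matrix (hypothetical):
# [[1, 0, 2],
#  [0, 0, 3],
#  [4, 5, 0]]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
col_indices = [0, 2, 2, 0, 1]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = csr_matvec(values, col_indices, row_ptr, x)  # [7.0, 9.0, 14.0]
```

Rows can hold very different numbers of nonzeros (here 2, 1, and 2), which is one reason mapping this loop efficiently onto highly parallel hardware is nontrivial.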
-
Patent number: 9685953
Abstract: In one form, a logic circuit includes an asynchronous logic circuit, a synchronous logic circuit, and an interface circuit coupled between the asynchronous logic circuit and the synchronous logic circuit. The asynchronous logic circuit has a plurality of asynchronous outputs for providing a corresponding plurality of asynchronous signals. The synchronous logic circuit has a plurality of synchronous inputs corresponding to the plurality of asynchronous outputs, a stretch input for receiving a stretch signal, and a clock output for providing a clock signal. The synchronous logic circuit provides the clock signal as a periodic signal but prolongs a predetermined state of the clock signal while the stretch signal is active. The asynchronous interface detects whether metastability could occur when latching any of the plurality of the asynchronous outputs of the asynchronous logic circuit using said clock signal, and activates the stretch signal while the metastability could occur.
Type: Grant
Filed: September 9, 2016
Date of Patent: June 20, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventor: Greg Sadowski
-
Patent number: 9679345
Abstract: A frame pacing method, computer program product, and computing system are provided for graphics processing. A method and system for frame pacing adds a delay which evenly spaces out the display of the subsequent frames, and a measurement mechanism which measures and adjusts the delay as application workload changes in an evenly spaced manner.
Type: Grant
Filed: August 6, 2015
Date of Patent: June 13, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Jonathan Lawrence Campbell, Mitchell H. Singer, Yuping Shen, Yue Zhuo
-
Patent number: 9672161
Abstract: The described embodiments include a cache controller that configures a cache management mechanism. In the described embodiments, the cache controller is configured to monitor at least one structure associated with a cache to determine at least one cache block that may be accessed during a future access in the cache. Based on the determination of the at least one cache block that may be accessed during a future access in the cache, the cache controller configures the cache management mechanism.
Type: Grant
Filed: December 9, 2012
Date of Patent: June 6, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Gabriel H. Loh, Yasuko Eckert
-
Patent number: 9658960
Abstract: A method and apparatus for controlling affinity of subcaches is disclosed. When a core compute unit evicts a line of victim data, a prioritized search for space allocation on available subcaches is executed, in order of proximity between the subcache and the compute unit. The victim data may be injected into an adjacent subcache if space is available. Otherwise, a line may be evicted from the adjacent subcache to make room for the victim data or the victim data may be sent to the next closest subcache. To retrieve data, a core compute unit sends a Tag Lookup Request message directly to the nearest subcache as well as to a cache controller, which controls routing of messages to all of the subcaches. A Tag Lookup Response message is sent back to the cache controller to indicate if the requested data is located in the nearest sub-cache.
Type: Grant
Filed: December 22, 2010
Date of Patent: May 23, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventor: Greggory D. Donley
-
Publication number: 20170123670
Abstract: A memory-to-memory copy operation control system includes a processor configured to receive an instruction to perform a memory-to-memory copy operation and a memory module network in communication with the processor. The memory module network has a plurality of memory modules that include a proximal memory module in direct communication with the processor and one or more additional memory modules in communication with the processor via the proximal memory module. The system also includes a memory controller in communication with the processor and the network of memory modules. The processor is configured to issue a first command causing data to be copied from a first memory module to a second memory module without sending the data to the processor or the memory controller.
Type: Application
Filed: October 28, 2015
Publication date: May 4, 2017
Applicant: ADVANCED MICRO DEVICES, INC.
Inventors: Nuwan Jayasena, David A. Roberts
-
Patent number: 9639280
Abstract: The disclosed embodiments provide a system for processing a memory command on a computer system. During operation, a command scheduler executing on a memory controller of the computer system obtains a predicted latency of the memory command based on a memory address to be accessed by the memory command. Next, the command scheduler orders the memory command with other memory commands in a command queue for subsequent processing by a memory resource on the computer system based on the predicted latency of the memory command.
Type: Grant
Filed: June 18, 2015
Date of Patent: May 2, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventor: David A. Roberts
-
Patent number: 9632848
Abstract: A system and method for allocating commands in processing is disclosed. The system and method includes an application running on a computer system that provides commands to be executed on one of a plurality of processors capable of executing the commands, the commands provided through an application programming interface, a device driver that buffers the streamed commands and converts the streamed commands into a format used by a GPU, and an operating system that builds a command buffer by grouping a plurality of converted commands based on an allocation for an available processor, wherein the available processor is determined in the interface between the device driver and the operating system. The available processor is one of the plurality of processors capable of executing the commands that receives the command buffer from the operating system, queues the command buffer and performs an asynchronous submission of the command buffer to the GPU, and the GPU executes the command buffer.
Type: Grant
Filed: December 29, 2015
Date of Patent: April 25, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: David Oldcorn, Timour T. Paltashev
-
Publication number: 20170102971
Abstract: The methods and apparatus can assign processing core workloads to processing cores from a heterogeneous instruction set architectures (ISA) pool of available processing cores based on processing core metric results. For example, the method and apparatus can obtain processing core metric results for one or more processing cores, such as processing cores within general purpose processors, from a heterogeneous ISA pool of available processing cores. The method and apparatus can also obtain one or more processing core workloads, such as software applications or software processes, from a pool of available processing core workloads to be assigned. The method and apparatus can then assign one or more processing core workloads that have higher priority than others from the pool of available processing core workloads to a processing core from the heterogeneous ISA pool of available processing cores based on its processing core metric result.
Type: Application
Filed: October 12, 2015
Publication date: April 13, 2017
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Sergey Blagodurov
-
Patent number: 9606177
Abstract: In one form, a scan flip-flop includes a clock gating cell and a dedicated clock flip-flop. The clock gating cell provides an input clock input signal as a scan clock signal when a scan shift enable signal is active, and provides the input clock signal as a data clock signal when the scan shift enable signal is inactive. The dedicated clock flip-flop stores a data input signal and provides the data input signal, so stored, as a data output signal in response to transitions of the data clock signal, and stores a scan data input signal and provides the scan data input signal, so stored, as the data output signal in response to transitions of the scan clock signal. The clock gating cell may further provide the input clock signal as the data clock signal when both a scan shift enable signal is inactive and a data enable signal is active.
Type: Grant
Filed: May 19, 2015
Date of Patent: March 28, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Daniel W. Bailey, Abhishek Sharma, Michael Q. Co
-
Publication number: 20170085472
Abstract: A communication device includes a data source that generates data for transmission over a bus, and a data encoder that receives and encodes outgoing data. An encoder system receives outgoing data from a data source and stores the outgoing data in a first queue. An encoder encodes outgoing data with a header type that is based upon a header type indication from a controller and stores the encoded data that may be a packet or a data word with at least one layered header in a second queue for transmission. The device is configured to receive at a payload extractor, a packet protocol change command from the controller and to remove the encoded data and to re-encode the data to create a re-encoded data packet and placing the re-encoded data packet in the second queue for transmission.
Type: Application
Filed: September 21, 2015
Publication date: March 23, 2017
Applicant: ADVANCED MICRO DEVICES, INC.
Inventors: David A. Roberts, Michael Ignatowski, Nuwan Jayasena, Gabriel H. Loh
-
Patent number: 9594521
Abstract: In one form, scheduling data migration comprises determining whether the data is likely to be used by an input/output (I/O) device, the data being at a location remote to the I/O device; and scheduling the data for migration from the remote location to a location local to the I/O device in response to determining that the data is likely to be used by the I/O device.
Type: Grant
Filed: February 23, 2015
Date of Patent: March 14, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Sergey Blagodurov, Andrew G. Kegel
-
Patent number: 9582402
Abstract: The described embodiments include a networking subsystem in a second computing device that is configured to receive a task message from a first computing device. Based on the task message, the networking subsystem updates an entry in a task queue with task information from the task message. A processing subsystem in the second computing device subsequently retrieves the task information from the task queue and performs the corresponding task. In these embodiments, the networking subsystem processes the task message (e.g., stores the task information in the task queue) without causing the processing subsystem to perform operations for processing the task message.
Type: Grant
Filed: January 26, 2014
Date of Patent: February 28, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Steven K. Reinhardt, Michael L. Chu, Vinod Tipparaju, Walter B. Benton
-
Patent number: 9576637
Abstract: A data processing system includes a memory channel and a data processor coupled to the memory channel. The data processor is adapted to access at least one rank and has refresh logic. In response to an activation of the refresh logic, the data processor generates refresh cycles to a bank of the memory channel. The data processor selects one of a first state corresponding to a first auto-refresh command that causes the data processor to auto-refresh the bank, and a second state corresponding to a second auto-refresh command that causes the data processor to auto-refresh a selected subset of the bank. The data processor initiates a switch between the first state and the second state in response to the refresh logic detecting a first condition related to the bank, and between the second state and the first state in response to the refresh logic circuit detecting a second condition.
Type: Grant
Filed: May 25, 2016
Date of Patent: February 21, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventor: Kedarnath Balakrishnan
-
Patent number: 9563402
Abstract: A method and apparatus for additive range reduction are disclosed. A constant may be pre-stored in a look-up table (LUT), and at least one section of the constant may be retrieved from the LUT for generating a product of an input argument and the constant such that a precision of the product may be controlled in any granularity. For a trigonometric function, 2/π is stored in the LUT, and at least one section of 2/π may be retrieved from the LUT. The argument is multiplied with the retrieved sections of 2/π. The retrieved sections are determined to correctly generate the two least significant bits (LSBs) of an integer portion and a scalable number of most significant bits of the multiplication result. An output of the trigonometric function is generated for the argument with a fractional portion of the multiplication result based on two LSBs of the integer portion of the multiplication result.
Type: Grant
Filed: September 1, 2011
Date of Patent: February 7, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Christopher L. Spencer, Yun-Xiao Zou, Brian L. Sumner
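The role of the integer and fractional portions of the product with 2/π can be seen in a simple double-precision sketch. It does not reproduce the sectioned look-up table or the precision control described in the abstract: it assumes an ordinary floating-point constant for 2/π, so it is only accurate for moderate arguments. The function name `sin_via_quadrant` is hypothetical.

```python
import math

TWO_OVER_PI = 2.0 / math.pi  # plain double constant, not a sectioned LUT


def sin_via_quadrant(x):
    """Evaluate sin(x) from the product x * (2/pi).

    The two least significant bits of the integer portion select the
    quadrant; the fractional portion gives the reduced argument.
    """
    t = x * TWO_OVER_PI
    n = math.floor(t)        # integer portion (a Python int)
    frac = t - n             # fractional portion, in [0, 1)
    quadrant = n & 3         # two LSBs of the integer portion
    r = frac * (math.pi / 2)  # reduced argument, in [0, pi/2)
    # sin(n*pi/2 + r) cycles through four cases as n increases.
    if quadrant == 0:
        return math.sin(r)
    if quadrant == 1:
        return math.cos(r)
    if quadrant == 2:
        return -math.sin(r)
    return -math.cos(r)


result = sin_via_quadrant(10.0)  # close to math.sin(10.0)
```

For very large arguments a single double-precision copy of 2/π loses the bits that determine the quadrant and fraction, which is the problem that retrieving higher-precision sections of the constant is meant to address.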
-
Publication number: 20170031853
Abstract: A communication device includes a data source that generates data for transmission over a bus, and that further includes a data encoder coupled to receive and encode outgoing data. The encoder further includes a coupling toggle rate (CTR) calculator configured to calculate a CTR for the outgoing data, a threshold calculator configured to determine an expected value of the CTR as a threshold value, a comparator configured to compare the calculated CTR to the threshold value wherein the comparison is used to determine whether to perform an encoding step by an encoding block configured to selectively encode said data. A method according to one embodiment includes determining and comparing a CTR and an expected CTR to determine whether to encode the outgoing data. Any one of a plurality different coding techniques may be used including bus inversion.
Type: Application
Filed: July 30, 2015
Publication date: February 2, 2017
Applicant: ADVANCED MICRO DEVICES, INC.
Inventors: Greg Sadowski, John Kalamatianos
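Bus inversion, the coding technique named at the end of this abstract, can be sketched in its classic form. The sketch below is an assumption-laden simplification: it counts plain bit toggles (Hamming distance against the previously transmitted word) and uses a fixed threshold of half the bus width, rather than the coupling toggle rate and expected-value threshold described in the application. The function names and the 8-bit width are hypothetical.

```python
def bus_invert_encode(words, width=8):
    """Classic bus-invert coding: send the complement when it toggles fewer wires.

    Returns a list of (transmitted_word, inverted_flag) pairs. Toggles on the
    extra invert line itself are ignored in this simplified model.
    """
    mask = (1 << width) - 1
    prev = 0  # assume the bus starts at all zeros
    encoded = []
    for word in words:
        toggles = bin((word ^ prev) & mask).count("1")
        if toggles > width // 2:
            sent, inverted = ~word & mask, True
        else:
            sent, inverted = word & mask, False
        encoded.append((sent, inverted))
        prev = sent
    return encoded


def bus_invert_decode(encoded, width=8):
    """Undo the inversion using the per-word flag."""
    mask = (1 << width) - 1
    return [(~sent & mask) if inverted else sent for sent, inverted in encoded]


words = [0x00, 0xFF, 0x0F, 0xF0]
encoded = bus_invert_encode(words)
decoded = bus_invert_decode(encoded)  # equals the original words
```

The compare-then-selectively-encode structure mirrors the abstract's comparator step; only the metric and the threshold differ in this simplified version.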
-
Patent number: 9552294
Abstract: The described embodiments include a main memory and a cache memory (or “cache”) with a cache controller that includes a mode-setting mechanism. In some embodiments, the mode-setting mechanism is configured to dynamically determine an access pattern for the main memory. Based on the determined access pattern, the mode-setting mechanism configures at least one region of the main memory in a write-back mode and configures other regions of the main memory in a write-through mode. In these embodiments, when performing a write operation in the cache memory, the cache controller determines whether a region in the main memory where the cache block is from is configured in the write-back mode or the write-through mode and then performs a corresponding write operation in the cache memory.
Type: Grant
Filed: January 7, 2013
Date of Patent: January 24, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Jaewoong Sim, Mithuna S. Thottethodi, Gabriel H. Loh
-
Patent number: 9552157
Abstract: A system has a plurality of functional modules including a first functional module and one or more other functional modules. The first functional module includes an embedded memory element and is configurable in a plurality of modes including a first mode and a second mode. When the first functional module is in the first mode, access to the embedded memory element is limited to the first functional module. At least one of the one or more other functional modules is provided with access to the embedded memory element based at least in part on the first functional module being in the second mode.
Type: Grant
Filed: April 23, 2014
Date of Patent: January 24, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Yunpeng Zhu, Xianshuai Shi, Yan Liu