Patents Assigned to Advanced Micro Devices, Inc.
  • Publication number: 20180069767
    Abstract: Techniques described herein improve processor performance in situations where a large number of system service requests are being received from other devices. More specifically, upon detecting that certain operating conditions that indicate a processor slowdown are present, the processor performs one or more system service adjustment techniques. These techniques include throttling (reducing the rate of handling) of such requests, coalescing (grouping multiple requests into a single group) the requests, disabling microarchitectural structures (such as caches or branch prediction units) or updates to those structures, and prefetching data for or pre-performing these requests. Each of these adjustment techniques helps to reduce the number of and/or workload associated with servicing requests for system services.
    Type: Application
    Filed: September 6, 2016
    Publication date: March 8, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Arkaprava Basu, Joseph L. Greathouse, Guru Prasadh V. Venkataramani, Jan Vesely
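    As a rough illustration of the throttling and coalescing techniques named in this abstract, the Python sketch below rate-limits a batch of service requests and merges requests that share a grouping key. The request format, rate limit, and grouping key are assumptions for the example, not details from the application.
```python
import time
from collections import defaultdict

def throttle_and_coalesce(requests, max_groups_per_second, group_key):
    """Illustrative model: merge requests that share a grouping key
    (e.g. the same page for an invalidation), then pace the merged groups."""
    groups = defaultdict(list)
    for req in requests:                       # coalesce: one group per key
        groups[group_key(req)].append(req)
    serviced = []
    for key, grouped in groups.items():        # throttle: limit the service rate
        serviced.append((key, len(grouped)))
        time.sleep(1.0 / max_groups_per_second)
    return serviced

# Example: coalesce requests that target the same page.
reqs = [{"page": 0x1000}, {"page": 0x1000}, {"page": 0x2000}]
print(throttle_and_coalesce(reqs, max_groups_per_second=100,
                            group_key=lambda r: r["page"]))
```
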
  • Patent number: 9910788
    Abstract: A processor device includes a cache and a memory storing a set of counters. Each counter of the set is associated with a corresponding block of a plurality of blocks of the cache. The processor device further includes a cache access monitor to, for each time quantum for a series of one or more time quanta, increment counter values of the set of counters based on accesses to the corresponding blocks of the cache. The processor device further includes a transfer engine to, after completion of each time quantum, transfer the counter values of the set of counters for the time quantum to a corresponding location in a system memory.
    Type: Grant
    Filed: September 22, 2015
    Date of Patent: March 6, 2018
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Philip J. Rogers, Benjamin T. Sander, Anthony Asaro
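    The counter scheme can be pictured with the toy model below: one counter per cache block, incremented on access and transferred to a stand-in for system memory at the end of each time quantum. The quantum boundary, block count, and dump format are illustrative assumptions.
```python
class CacheAccessMonitor:
    """Toy model: one counter per cache block, incremented on access and
    flushed to a 'system memory' buffer at the end of each time quantum."""
    def __init__(self, num_blocks):
        self.counters = [0] * num_blocks
        self.system_memory = []          # stand-in for the per-quantum dump area

    def record_access(self, block_index):
        self.counters[block_index] += 1

    def end_quantum(self):
        # Transfer engine: copy this quantum's counts out, then reset.
        self.system_memory.append(list(self.counters))
        self.counters = [0] * len(self.counters)

mon = CacheAccessMonitor(num_blocks=4)
for blk in (0, 0, 2, 3, 2):
    mon.record_access(blk)
mon.end_quantum()
print(mon.system_memory)   # [[2, 0, 2, 1]]
```
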
  • Patent number: 9910638
    Abstract: Square root operations in a computer processor are disclosed. A first iteration for calculating partial results of a square root operation is performed in a larger number of cycles than remaining iterations. The first iteration requires calculation of a first digit that is larger than the subsequent digits. The first iteration thus requires multiplication of values that are larger than the corresponding values for the subsequent digits. By splitting the first digit into two parts, the required multiplications can be performed in less time than if the first digit were not split. Performing these multiplications in less time reduces the total delay for clock cycles associated with the first digit calculations, which increases the achievable clock frequency. A multiply-and-accumulate unit that performs either packed-single operations or double-precision operations may be used, along with a combined division/square root unit for simultaneous execution of division and square root operations.
    Type: Grant
    Filed: August 25, 2016
    Date of Patent: March 6, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Hanbing Liu, John Kelley, Michael Estlick, Erik Swanson, Jay Fleischman
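    The sketch below is a generic schoolbook digit-by-digit square root, not the patented hardware datapath; it only illustrates the per-digit multiply whose first, widest instance the patent shortens by splitting the first digit into two parts. The radix and digit grouping are assumptions for the example.
```python
def digit_by_digit_isqrt(n, base=10):
    """Schoolbook digit-by-digit integer square root: each iteration finds the
    largest digit d with (2*root*base + d) * d <= remainder, so every digit
    costs a multiplication; the first digit is the widest and therefore the
    slowest, which is the step the patent targets."""
    groups = []
    while n > 0:
        groups.append(n % (base * base))     # split n into base**2-digit groups
        n //= base * base
    groups.reverse()
    root, remainder = 0, 0
    for g in groups:
        remainder = remainder * base * base + g
        d = 0
        while (2 * root * base + d + 1) * (d + 1) <= remainder:
            d += 1
        remainder -= (2 * root * base + d) * d
        root = root * base + d
    return root

print(digit_by_digit_isqrt(152399025))   # 12345
```
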
  • Patent number: 9910714
    Abstract: The described embodiments include a system for executing a load using a first processor and a second processor in a computer system. During operation, a load balancer executing on the first processor obtains one or more attributes of a load to be executed on the computer system. Next, the load balancer applies a set of configurable rules to the one or more attributes to select a processor from the first and second processors for executing the load. Finally, the system executes the load on the selected processor.
    Type: Grant
    Filed: June 29, 2015
    Date of Patent: March 6, 2018
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Kent F. Knox, Jian Liu
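    A minimal sketch of rule-based processor selection follows; the attribute names, rule predicates, and default target are hypothetical and stand in for the configurable rules described in the abstract.
```python
def select_processor(load_attributes, rules):
    """Apply configurable rules in order; the first rule whose predicate
    matches the load's attributes decides which processor runs the load."""
    for predicate, processor in rules:
        if predicate(load_attributes):
            return processor
    return "cpu"                     # assumed default target if no rule matches

# Hypothetical rule set: large data-parallel loads go to the GPU,
# latency-sensitive loads stay on the CPU.
rules = [
    (lambda a: a.get("data_parallel") and a["size"] > 1_000_000, "gpu"),
    (lambda a: a.get("latency_sensitive"), "cpu"),
]
print(select_processor({"data_parallel": True, "size": 5_000_000}, rules))  # gpu
```
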
  • Patent number: 9910605
    Abstract: A die-stacked hybrid memory device implements a first set of one or more memory dies implementing first memory cell circuitry of a first memory architecture type and a second set of one or more memory dies implementing second memory cell circuitry of a second memory architecture type different than the first memory architecture type. The die-stacked hybrid memory device further includes a set of one or more logic dies electrically coupled to the first and second sets of one or more memory dies, the set of one or more logic dies comprising a memory interface and a page migration manager, the memory interface coupleable to a device external to the die-stacked hybrid memory device, and the page migration manager to transfer memory pages between the first set of one or more memory dies and the second set of one or more memory dies.
    Type: Grant
    Filed: November 16, 2016
    Date of Patent: March 6, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan S. Jayasena, Gabriel H. Loh, James M. O'Connor, Niladrish Chatterjee
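    The page migration manager can be approximated by the toy model below, which promotes pages to the faster memory dies once their access count crosses a threshold. The threshold, the two page pools, and the promotion-only policy are assumptions, not details from the patent.
```python
class PageMigrationManager:
    """Toy model: pages start in the 'slow' memory dies and are migrated to
    the 'fast' dies once their access count reaches a hot threshold."""
    def __init__(self, hot_threshold=8):
        self.fast, self.slow = set(), set()
        self.access_counts = {}
        self.hot_threshold = hot_threshold

    def touch(self, page):
        self.access_counts[page] = self.access_counts.get(page, 0) + 1
        if page not in self.fast:
            self.slow.add(page)
        if self.access_counts[page] >= self.hot_threshold and page in self.slow:
            self.slow.discard(page)
            self.fast.add(page)       # migrate the hot page to the faster dies

mgr = PageMigrationManager(hot_threshold=3)
for _ in range(3):
    mgr.touch(0x4000)
print(0x4000 in mgr.fast)             # True
```
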
  • Publication number: 20180060124
    Abstract: With the success of programming models such as OpenCL and CUDA, heterogeneous computing platforms are becoming mainstream. However, these heterogeneous systems are low-level, not composable, and their behavior is often implementation-defined even for standardized programming models. In contrast, the method and system embodiments for the heterogeneous parallel primitives (HPP) programming model disclosed herein provide a flexible and composable programming platform that guarantees behavior even when developing high-performance code.
    Type: Application
    Filed: October 30, 2017
    Publication date: March 1, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Benedict R. Gaster, Lee W. Howes
  • Publication number: 20180060074
    Abstract: Disclosed are a method and a processing device directed to determining global branch history for branch prediction. The method includes shifting first bits of a branch signature into a current global branch history and performing a bitwise exclusive-or (XOR) function on second bits of the branch signature and shifted bits of the current global branch history. In this way, the current global branch history is updated. The processing device implements the method using shift logic configured to store and shift bits representing a current global branch history, a register configured to store the current global branch history, decision circuitry configured to determine whether or not a branch is taken, and XOR gates.
    Type: Application
    Filed: August 30, 2016
    Publication date: March 1, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Steven R. Havlir
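    A bit-level sketch of the described update follows: low bits of the branch signature are shifted into the history, and a second slice of the signature is XORed with the shifted history. The history width and how the signature splits into the two slices are assumptions for the example.
```python
HISTORY_BITS = 16   # assumed width of the global branch history register

def update_global_history(history, branch_signature, shift_in_bits=2, xor_bits=6):
    """Shift the 'first bits' of the branch signature into the history, then
    XOR the 'second bits' of the signature with the shifted history."""
    first = branch_signature & ((1 << shift_in_bits) - 1)
    second = (branch_signature >> shift_in_bits) & ((1 << xor_bits) - 1)
    shifted = ((history << shift_in_bits) | first) & ((1 << HISTORY_BITS) - 1)
    return shifted ^ second

h = 0
for sig in (0b101101, 0b010011, 0b111000):
    h = update_global_history(h, sig)
print(format(h, f"0{HISTORY_BITS}b"))
```
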
  • Publication number: 20180060039
    Abstract: Square root operations in a computer processor are disclosed. A first iteration for calculating partial results of a square root operation is performed in a larger number of cycles than remaining iterations. The first iteration requires calculation of a first digit that is larger than the subsequent digits. The first iteration thus requires multiplication of values that are larger than the corresponding values for the subsequent digits. By splitting the first digit into two parts, the required multiplications can be performed in less time than if the first digit were not split. Performing these multiplications in less time reduces the total delay for clock cycles associated with the first digit calculations, which increases the achievable clock frequency. A multiply-and-accumulate unit that performs either packed-single operations or double-precision operations may be used, along with a combined division/square root unit for simultaneous execution of division and square root operations.
    Type: Application
    Filed: August 25, 2016
    Publication date: March 1, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Hanbing Liu, John Kelley, Michael Estlick, Erik Swanson, Jay Fleischman
  • Publication number: 20180060073
    Abstract: Techniques for improving branch target buffer (“BTB”) operation are disclosed. A compressed BTB is included within a branch prediction unit along with an uncompressed BTB. To support prediction of up to two branch instructions per cycle, the uncompressed BTB includes entries that each store data for up to two branch predictions. The compressed BTB includes entries that store data for only a single branch instruction for situations where storing that single branch instruction in the uncompressed BTB would waste space in that buffer. Space would be wasted in the uncompressed BTB because, to support two branch lookups per cycle, prediction data for two branches must share certain features (such as cache line address) in order to be stored together in a single entry.
    Type: Application
    Filed: August 30, 2016
    Publication date: March 1, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Steven R. Havlir
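    The allocation decision can be illustrated with the toy policy below: a branch that can pair with another branch from the same cache line fills a two-branch uncompressed entry, while a lone branch goes to the compressed BTB so it does not occupy half of a two-branch entry. The pairing bookkeeping shown here is an assumption, not the patented mechanism.
```python
def allocate_btb_entry(new_branch, pending_by_line):
    """new_branch is (cache_line, target); pending_by_line holds branches still
    waiting for a same-cache-line partner. Returns which structure receives
    the prediction data."""
    line = new_branch[0]
    if line in pending_by_line:
        partner = pending_by_line.pop(line)
        return ("uncompressed", (partner, new_branch))   # two branches share one entry
    pending_by_line[line] = new_branch
    return ("compressed", (new_branch,))                 # single-branch entry

pending = {}
print(allocate_btb_entry((0x40, 0x100), pending))   # compressed, no partner yet
print(allocate_btb_entry((0x40, 0x180), pending))   # uncompressed pair, same cache line
```
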
  • Patent number: 9903913
    Abstract: A capture clock generation control mechanism is provided. The capture clock generation control mechanism controls the number of at-speed clocks generated and supplied to one or more scan chains during scan testing of a microcircuit based on control data stored in a JTAG or scan test register. The scan test register may be formed out of scan cells and comprise part of a scan chain. Automatic Test Pattern Generation (ATPG) tools may generate the data that is loaded into the scan test register to automatically configure the clock generation control mechanism. The clock control mechanism may include the ability to adjust the position of the at-speed clocks within a capture cycle, thereby facilitating transition fault detection.
    Type: Grant
    Filed: January 20, 2014
    Date of Patent: February 27, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Atchyuth Gorti, Anirudh Kadiyala, Bill K. C. Kwan, Venkat Kuchipudi
  • Patent number: 9904623
    Abstract: A system includes a functional unit, at least one cache coupled to the functional unit, and a power management unit coupled to the functional unit and the at least one cache, the power management unit configured to trigger the functional unit to initiate prefetching of data to repopulate the at least one cache prior to a predicted exit of the functional unit from an idle mode to an active mode. The system may further include a prediction unit to predict the exit from the idle mode for the functional unit as occurring a predetermined duration from an entry into the idle mode. The prediction unit may determine the predetermined duration based on a history of idle mode durations indicative of durations of previous instances in which the functional unit was in the idle mode.
    Type: Grant
    Filed: May 1, 2015
    Date of Patent: February 27, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Madhu Saravana Sibi Govindan, William Lloyd Bircher, Aniruddha Dasgupta, Dongyuan Zhan
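    A minimal sketch of the idle-exit prediction follows: the next idle duration is estimated from a short history of past idle durations, and a prefetch is scheduled a little before the predicted exit. The history depth, averaging rule, and lead time are illustrative assumptions.
```python
from collections import deque

class IdleExitPredictor:
    """Toy predictor: the next idle duration is estimated as the average of
    recent idle durations, and a cache-repopulating prefetch is scheduled
    shortly before the predicted exit."""
    def __init__(self, history_len=8, prefetch_lead=0.2):
        self.history = deque(maxlen=history_len)
        self.prefetch_lead = prefetch_lead

    def record_idle(self, duration):
        self.history.append(duration)

    def prefetch_time_after_idle_entry(self):
        if not self.history:
            return None                       # no prediction available yet
        predicted = sum(self.history) / len(self.history)
        return max(0.0, predicted - self.prefetch_lead)

p = IdleExitPredictor()
for d in (1.0, 1.2, 0.8):
    p.record_idle(d)
print(p.prefetch_time_after_idle_entry())     # ~0.8 (1.0 average minus the lead)
```
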
  • Publication number: 20180052779
    Abstract: A data cache region prefetcher creates a region when a data cache miss occurs. Each region includes a predetermined range of data lines proximate to each data cache miss and is tagged with an associated instruction pointer register (RIP). The data cache region prefetcher compares subsequent memory requests against the predetermined range of data lines for each of the existing regions. For each match, the data cache region prefetcher sets an access bit and attempts to identify a pseudo-random access pattern based on the set access bits. The data cache region prefetcher increments or decrements appropriate counters to track how often the pseudo-random access pattern occurs. If the pseudo-random access pattern occurs frequently, then the next time a memory request is processed with the same RIP and pattern, the data cache region prefetcher prefetches the data lines in accordance with the pseudo-random access pattern for that RIP.
    Type: Application
    Filed: October 13, 2016
    Publication date: February 22, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Donald W. McCauley, William E. Jones
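    The region tracking can be pictured with the toy model below: a miss opens a region tagged with the requesting RIP, later accesses in the region set access bits, and a counter per (RIP, pattern) pair triggers prefetching once the pattern recurs often enough. The region size and confidence threshold are assumptions for the example.
```python
from collections import defaultdict

REGION_LINES = 8    # assumed region size in cache lines

class RegionPrefetcher:
    """Toy model: regions opened on a miss, access bits set on later hits,
    and per-(RIP, pattern) counters that gate prefetching."""
    def __init__(self, threshold=2):
        self.regions = {}                         # base line -> (rip, access bits)
        self.pattern_counts = defaultdict(int)    # (rip, pattern) -> occurrence count
        self.threshold = threshold

    def on_miss(self, line, rip):
        self.regions[line] = (rip, 0)             # open a region around the miss

    def on_access(self, line, rip):
        for base, (r, bits) in list(self.regions.items()):
            if r != rip or not (0 <= line - base < REGION_LINES):
                continue
            bits |= 1 << (line - base)            # record the access pattern
            self.regions[base] = (r, bits)
            self.pattern_counts[(r, bits)] += 1
            if self.pattern_counts[(r, bits)] >= self.threshold:
                # Pattern seen often enough: prefetch the lines it covers.
                return [base + i for i in range(REGION_LINES) if bits >> i & 1]
        return []

rp = RegionPrefetcher(threshold=2)
rp.on_miss(100, rip=0x401000)
rp.on_access(101, rip=0x401000)
print(rp.on_access(101, rip=0x401000))   # [101] once the pattern recurs
```
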
  • Publication number: 20180052613
    Abstract: A system and method for tracking stores and loads to reduce load latency when forming the same memory address by bypassing a load store unit within an execution unit are disclosed. The system and method include storing data in one or more memory-dependent architectural register numbers (MdArns), allocating the one or more MdArns to a MEMFILE, and writing the allocated MdArns to a map file, wherein the map file contains an MdArn map to enable subsequent access to an entry in the MEMFILE. Upon receipt of a load request, a base, an index, a displacement, and a match/hit are checked via the map file to identify an entry in the MEMFILE and an associated store; on a hit, the entry is provided responsive to the load request from the one or more MdArns.
    Type: Application
    Filed: December 15, 2016
    Publication date: February 22, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Betty Ann McDaniel, Michael D. Achenbach, David N. Suggs, Frank C. Galloway, Kai Troester, Krishnan V. Ramani
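    A toy model of the MEMFILE lookup follows: a store's base, index, and displacement are recorded against an MdArn, and a later load with the same components receives the stored data without going through the load/store unit. The table layout and data payload are illustrative assumptions.
```python
class Memfile:
    """Toy store-to-load bypass: a map file keyed by (base, index, displacement)
    points at an MdArn whose data can satisfy a matching load directly."""
    def __init__(self):
        self.map_file = {}          # (base, index, disp) -> MdArn
        self.entries = {}           # MdArn -> stored data

    def record_store(self, base, index, disp, mdarn, data):
        self.map_file[(base, index, disp)] = mdarn
        self.entries[mdarn] = data

    def lookup_load(self, base, index, disp):
        mdarn = self.map_file.get((base, index, disp))
        if mdarn is None:
            return None             # miss: the load goes to the load/store unit
        return self.entries[mdarn]  # hit: bypass, forward the stored data

m = Memfile()
m.record_store(base="rbx", index="rsi", disp=16, mdarn=3, data=0xDEAD)
print(hex(m.lookup_load("rbx", "rsi", 16)))   # 0xdead
```
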
  • Publication number: 20180052778
    Abstract: A processing apparatus and a method of accessing data using cache hot set detection are provided that include receiving a plurality of requests to access data in a cache. The cache includes a plurality of cache sets each including N number of cache lines. Each request includes an address. The apparatus and method also include storing, in an HSVC array, cache line victims of one or more of the plurality of cache sets determined to be hot sets. Each cache line victim includes a corresponding address that is determined, using an HSD array, to belong to the one or more determined cache hot sets based on a hot set frequency of a plurality of addresses mapped to the set in the cache.
    Type: Application
    Filed: August 22, 2016
    Publication date: February 22, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: John Kalamatianos, Adithya Yalavarti, Johnsy Kanjirapallil John
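    The hot-set handling can be approximated with the sketch below: per-set access counts play the role of the HSD array, and lines evicted from sets whose count crosses a threshold are retained in a small victim store playing the role of the HSVC array. The threshold and victim-store capacity are assumptions.
```python
from collections import defaultdict, OrderedDict

class HotSetVictimCache:
    """Toy model: count accesses per cache set; victims evicted from sets whose
    count crosses the hot threshold are kept in a small FIFO victim store."""
    def __init__(self, hot_threshold=2, capacity=8):
        self.set_counts = defaultdict(int)
        self.victims = OrderedDict()
        self.hot_threshold = hot_threshold
        self.capacity = capacity

    def on_access(self, set_index):
        self.set_counts[set_index] += 1

    def on_evict(self, set_index, address, data):
        if self.set_counts[set_index] >= self.hot_threshold:
            self.victims[address] = data
            if len(self.victims) > self.capacity:
                self.victims.popitem(last=False)   # drop the oldest victim

    def lookup(self, address):
        return self.victims.get(address)

hsvc = HotSetVictimCache(hot_threshold=2)
hsvc.on_access(5)
hsvc.on_access(5)
hsvc.on_evict(5, address=0x80, data="line")
print(hsvc.lookup(0x80))    # 'line'
```
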
  • Publication number: 20180052631
    Abstract: A method and apparatus of compressing addresses for transmission includes receiving a transaction at a first device from a source that includes a memory address request for a memory location on a second device. It is determined whether a first part of the memory address is stored in a cache located on the first device. If the first part of the memory address is not stored in the cache, the first part of the memory address is stored in the cache and the entire memory address and information relating to the storage of the first part are transmitted to the second device. If the first part of the memory address is stored in the cache, only a second part of the memory address and an identifier that indicates the cache way in which the first part of the address is stored are transmitted to the second device.
    Type: Application
    Filed: November 8, 2016
    Publication date: February 22, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Vydhyanathan Kalyanasundharam, Greggory D. Donley
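    A minimal sketch of the link-address compression follows: the upper part of each address is cached at the sender, so on a hit only the lower part and a way identifier travel on the link, while on a miss the full address is sent and installed. The split point, way count, and replacement order are assumptions for the example.
```python
class AddressCompressor:
    """Toy sender-side model: cache the upper address bits in a few ways; a hit
    sends only the way index plus the lower bits, a miss sends the full address."""
    UPPER_SHIFT = 20     # assumed split between the first and second address parts
    NUM_WAYS = 4

    def __init__(self):
        self.ways = [None] * self.NUM_WAYS
        self.next_victim = 0

    def compress(self, address):
        upper = address >> self.UPPER_SHIFT
        lower = address & ((1 << self.UPPER_SHIFT) - 1)
        if upper in self.ways:
            return ("hit", self.ways.index(upper), lower)   # way id + lower bits only
        way = self.next_victim
        self.ways[way] = upper
        self.next_victim = (way + 1) % self.NUM_WAYS
        return ("miss", way, address)                       # full address + way installed

c = AddressCompressor()
print(c.compress(0x12345678))    # miss: full address goes out
print(c.compress(0x12349ABC))    # hit: only the way index and low 20 bits go out
```
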
  • Patent number: 9899074
    Abstract: A data processing system includes a memory channel and a data processor coupled to the memory channel. The data processor is adapted to access at least one rank and has refresh logic. In response to an activation of the refresh logic, the data processor generates refresh cycles to a bank of the memory channel. The data processor selects one of a first state corresponding to a first auto-refresh command that causes the data processor to auto-refresh the bank, and a second state corresponding to a second auto-refresh command that causes the data processor to auto-refresh a selected subset of the bank. The data processor initiates a switch between the first state and the second state in response to the refresh logic detecting a first condition related to the bank, and between the second state and the first state in response to the refresh logic detecting a second condition.
    Type: Grant
    Filed: January 17, 2017
    Date of Patent: February 20, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Kedarnath Balakrishnan
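    The switching behavior can be illustrated with the toy policy below, which refreshes the whole bank while it is idle and falls back to refreshing a subset of the bank when requests are queued against it. The patent only requires two switching conditions; the specific conditions used here are assumptions.
```python
def next_refresh_state(state, bank_idle, pending_requests):
    """Toy policy: full-bank auto-refresh while idle, subset auto-refresh while
    real requests are waiting, so refresh steals less time from accesses."""
    if state == "refresh_all" and pending_requests > 0:
        return "refresh_subset"      # first condition: traffic arrived
    if state == "refresh_subset" and bank_idle:
        return "refresh_all"         # second condition: bank went idle again
    return state

state = "refresh_all"
state = next_refresh_state(state, bank_idle=False, pending_requests=3)
print(state)    # refresh_subset
```
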
  • Patent number: 9898568
    Abstract: Systems, apparatuses, and methods for reducing the load on the bitlines of a ROM bitcell array are described. The connections between nets of a ROM bitcell array may be assigned based on their programmed values using a traditional approach. Then, a plurality of optimizations may be performed on the assignment of nets to reduce the load on the bitlines of the array. A first optimization may swap the connections between ground and bitline for the nets of a given column responsive to detecting that the number of connections to the corresponding bitline is greater than the number of connections to ground for the given column. A second optimization may remove the connection of a net to a bitline if three consecutive nets of a given column are connected to the bitline.
    Type: Grant
    Filed: June 23, 2015
    Date of Patent: February 20, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Naveen Chandra Srivastava, Janardhan Achanta, Pankaj Kumar, Shreekanth Karandoor Sampigethaya
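    The first optimization can be sketched as below: if more nets in a column connect to the bitline than to ground, the column's connections are swapped so the bitline carries the lighter load. The column representation is an assumption; the three-consecutive-nets optimization is not modeled.
```python
def optimize_rom_column(column_bits):
    """Toy version of the first optimization: a 1 means the net connects to the
    bitline, a 0 means it connects to ground. If bitline connections outnumber
    ground connections, swap the column's polarity and record the swap."""
    bitline_connections = sum(column_bits)
    ground_connections = len(column_bits) - bitline_connections
    if bitline_connections > ground_connections:
        return [1 - b for b in column_bits], True    # swapped column, flag set
    return list(column_bits), False

print(optimize_rom_column([1, 1, 1, 0]))   # ([0, 0, 0, 1], True)
```
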
  • Patent number: 9899992
    Abstract: A circuit adapts to the occurrence of metastable states. The circuit inhibits passing of the metastable state to circuits that follow, by clock gating the output stage. In order to determine whether or not to gate the clock of the output stage, two detect circuits may be used. One circuit detects metastability and another circuit detects metastability resolved to a wrong logic level. The results from one or both detector circuits are used to gate the next clock cycle if needed, waiting for the metastable situation to be resolved.
    Type: Grant
    Filed: August 17, 2016
    Date of Patent: February 20, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Greg Sadowski
  • Patent number: 9898287
    Abstract: A method, a non-transitory computer readable medium, and a processor for repacking dynamic wavefronts during program code execution on a processing unit, each dynamic wavefront including multiple threads, are presented. If a branch instruction is detected, a determination is made whether all wavefronts following a same control path in the program code have reached a compaction point, which is the branch instruction. If no branch instruction is detected in executing the program code, a determination is made whether all wavefronts following the same control path have reached a reconvergence point, which is a beginning of a program code segment to be executed by both a taken branch and a not taken branch from a previous branch instruction. The dynamic wavefronts are repacked with all threads that follow the same control path, if all wavefronts following the same control path have reached the branch instruction or the reconvergence point.
    Type: Grant
    Filed: April 9, 2015
    Date of Patent: February 20, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sooraj Puthoor, Bradford M. Beckmann, Dmitri Yudanov
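    A toy repacking step is sketched below: once the wavefronts on a control path reach the compaction or reconvergence point, their threads are regrouped so each new wavefront contains only threads following the same path. The wavefront size and thread labeling are assumptions for the example.
```python
from collections import defaultdict

WAVEFRONT_SIZE = 4    # assumed threads per wavefront for the example

def repack_wavefronts(threads):
    """Toy repacking: 'threads' maps thread id -> control path label; threads
    on the same path are packed together into new wavefronts."""
    by_path = defaultdict(list)
    for tid, path in threads.items():
        by_path[path].append(tid)
    wavefronts = []
    for path, tids in by_path.items():
        for i in range(0, len(tids), WAVEFRONT_SIZE):
            wavefronts.append((path, tids[i:i + WAVEFRONT_SIZE]))
    return wavefronts

threads = {0: "taken", 1: "not_taken", 2: "taken", 3: "taken", 4: "not_taken"}
print(repack_wavefronts(threads))
```
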
  • Publication number: 20180046583
    Abstract: Techniques for improving translation lookaside buffer (TLB) operation are disclosed. A particular entry of the TLB is to be updated with data associated with a large page size. The TLB updates replacement policy data for that TLB entry for that large page size to indicate that the TLB entry is not the least-recently-used. To prevent smaller pages from evicting the TLB entry for the large page size, the TLB also updates replacement policy data for that TLB entry for the smaller page size to indicate that the TLB entry is not the least-recently-used.
    Type: Application
    Filed: August 12, 2016
    Publication date: February 15, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Anthony J. Bybell, John M. King
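    The replacement-policy update can be pictured with the toy model below: filling an entry with a large-page translation marks it most recently used in the replacement state for both the large and the small page size, so small-page fills will not select it as the LRU victim. Two page sizes and a per-size LRU list are assumptions for the example.
```python
class TlbLru:
    """Toy model: one LRU recency list per page size (index 0 = least recently
    used). A large-page fill is marked most recently used in both lists so that
    small-page activity cannot evict it."""
    def __init__(self, num_entries=4):
        self.lru = {"4K": list(range(num_entries)), "2M": list(range(num_entries))}

    def touch(self, entry, page_size):
        order = self.lru[page_size]
        order.remove(entry)
        order.append(entry)                  # move to most-recently-used position

    def fill_large_page(self, entry):
        self.touch(entry, "2M")
        self.touch(entry, "4K")              # protect the entry from small-page evictions

tlb = TlbLru()
tlb.fill_large_page(2)
print(tlb.lru["4K"][0] != 2 and tlb.lru["2M"][-1] == 2)   # True
```
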