Patents Assigned to Advanced Micro Devices

Encoding valid data states in source synchronous bus interfaces using clock signal transitions

Patent number: 9639488

Abstract: Embodiments are described for a method of reducing power consumption in source synchronous bus systems by reducing signal transitions in the system. Instead of sending clock and data valid signals, only the start and end of valid data packets are marked by clock signal transitions, or only a number of clock pulses that corresponds to the number of data words is sent, or only a number of transitions on clock signals are sent. The clock signal transitions may comprise either clock pulses or exclusively rising edge or falling edge transitions of the clock signal.

Type: Grant

Filed: June 20, 2014

Date of Patent: May 2, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Gregory Sadowski, Sudha Thiruvengadam, Arun Iyer
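The variant in the abstract above that sends one clock pulse per data word can be illustrated with a small simulation. This is only a sketch under stated assumptions: the bus is modeled as a list of (clock, data) samples, and the function names `transmit` and `receive` are hypothetical, not taken from the patent.

```python
def transmit(words):
    """Emit one full clock pulse per data word (assumed sample model)."""
    samples = []
    for word in words:
        samples.append((1, word))  # clock goes high: data is valid
        samples.append((0, word))  # clock returns low, completing the pulse
    return samples


def receive(samples):
    """Latch data on each rising clock edge; no separate valid signal is used."""
    words = []
    prev_clk = 0
    for clk, data in samples:
        if prev_clk == 0 and clk == 1:  # rising edge marks a valid word
            words.append(data)
        prev_clk = clk
    return words


if __name__ == "__main__":
    idle = [(0, 9)]  # clock held low: whatever is on the data lines is ignored
    print(receive(idle + transmit([3, 1, 4]) + idle))
```

Because the clock stays low while the bus is idle, the receiver needs no data valid line: the count of rising edges equals the count of valid words.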
Ordering memory commands in a computer system

Patent number: 9639280

Abstract: The disclosed embodiments provide a system for processing a memory command on a computer system. During operation, a command scheduler executing on a memory controller of the computer system obtains a predicted latency of the memory command based on a memory address to be accessed by the memory command. Next, the command scheduler orders the memory command with other memory commands in a command queue for subsequent processing by a memory resource on the computer system based on the predicted latency of the memory command.

Type: Grant

Filed: June 18, 2015

Date of Patent: May 2, 2017

Assignee: ADVANCED MICRO DEVICES, INC.

Inventor: David A. Roberts
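As an illustration of the abstract above, the following sketch orders queued commands by a latency predicted from the target address. The latency model (a single open row, with assumed row-hit and row-miss costs), the shortest-latency-first policy, and all names are assumptions made for the example; the abstract does not specify them.

```python
import heapq
from itertools import count

ROW_SIZE = 1024        # assumed row size in bytes
ROW_HIT_LATENCY = 10   # assumed cycles for an access to the open row
ROW_MISS_LATENCY = 40  # assumed cycles for any other row


def predict_latency(address, open_row):
    """Predict latency from the address alone (illustrative model)."""
    return ROW_HIT_LATENCY if address // ROW_SIZE == open_row else ROW_MISS_LATENCY


class CommandQueue:
    """Keeps commands ordered by predicted latency, lowest first."""

    def __init__(self, open_row=0):
        self.open_row = open_row
        self._heap = []
        self._seq = count()  # tie-breaker: equal latencies keep arrival order

    def submit(self, address):
        latency = predict_latency(address, self.open_row)
        heapq.heappush(self._heap, (latency, next(self._seq), address))

    def drain(self):
        """Return the queued addresses in the order they would be issued."""
        order = []
        while self._heap:
            _, _, address = heapq.heappop(self._heap)
            order.append(address)
        return order


if __name__ == "__main__":
    queue = CommandQueue(open_row=0)
    for addr in (5000, 100, 9000, 200):
        queue.submit(addr)
    print(queue.drain())
```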
Integrated controller for training memory physical layer interface

Patent number: 9639495

Abstract: A controller integrated in a memory physical layer interface (PHY) can be used to control training used to configure the memory PHY for communication with an associated external memory such as a dynamic random access memory (DRAM), thereby removing the need to provide training sequences over a data pipeline between a BIOS and the memory PHY. For example, a controller integrated in the memory PHY can control read training and write training of the memory PHY for communication with the external memory based on a training algorithm. The training algorithm may be a seedless training algorithm that converges on a solution for a timing delay and a voltage offset between the memory PHY and the external memory without receiving, from a basic input/output system (BIOS), seed information that characterizes a signal path traversed by training sequences or commands generated by the training algorithm.

Type: Grant

Filed: June 27, 2014

Date of Patent: May 2, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Glenn A. Dearth, Gerry Talbot, Anwar Kashem, Edoardo Prete, Brian Amick
Thermal-aware compiler for parallel instruction execution in processors

Patent number: 9639359

Abstract: Embodiments are described for a method for compiling instruction code for execution in a processor having a number of functional units by determining a thermal constraint of the processor, and defining instruction words comprising both real instructions and one or more no operation (NOP) instructions to be executed by the functional units within a single clock cycle, wherein a number of NOP instructions executed over a number of consecutive clock cycles is configured to prevent exceeding the thermal constraint during execution of the instruction code.

Type: Grant

Filed: May 21, 2013

Date of Patent: May 2, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Yuan Xie, Junli Gu
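The idea in the abstract above of mixing real instructions with NOPs in each instruction word can be sketched roughly as follows. This is a deliberate simplification: the thermal constraint is reduced to a fixed cap on active functional units per cycle, data dependencies are ignored, and `pack`, `num_units`, and `max_active` are hypothetical names.

```python
NOP = "nop"


def pack(instructions, num_units, max_active):
    """Build instruction words of width num_units with at most max_active real
    instructions each; the remaining slots are filled with NOPs.

    Assumes 1 <= max_active <= num_units.
    """
    words = []
    for start in range(0, len(instructions), max_active):
        real = instructions[start:start + max_active]
        words.append(real + [NOP] * (num_units - len(real)))
    return words


if __name__ == "__main__":
    for word in pack(["a", "b", "c", "d", "e"], num_units=4, max_active=2):
        print(word)
```

A compiler along the lines of the abstract would presumably choose the NOP count over a window of consecutive cycles rather than per word; the per-word cap here is only the simplest instance of that idea.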
Power management of interactive workloads driven by direct and indirect user feedback

Patent number: 9639140

Abstract: A method of managing power state transitions for an interactive workload includes storing one or more parameters, each representing an electrical operating characteristic that controls power consumption of the processing unit, receiving a first user input requesting execution of a task by the processing unit, in response to receiving a second user input, modifying at least one of the one or more parameters, and executing the task in the processing unit while operating the processing unit according to the at least one modified parameter.

Type: Grant

Filed: September 17, 2015

Date of Patent: May 2, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Leonardo de Paula Rosa Piga, Mauricio Breternitz
Asynchronous submission of commands

Patent number: 9632848

Abstract: A system and method for allocating commands in processing is disclosed. The system and method includes an application running on a computer system that provides commands to be executed on one of a plurality of processors capable of executing the commands, the commands provided through an application programming interface, a device driver that buffers the streamed commands and converts the streamed commands into a format used by a GPU, and an operating system that builds a command buffer by grouping a plurality of converted commands based on an allocation for an available processor, wherein the available processor is determined in the interface between the device driver and the operating system. The available processor is one of the plurality of processors capable of executing the commands that receives the command buffer from the operating system, queues the command buffer and performs an asynchronous submission of the command buffer to the GPU, and the GPU executes the command buffer.

Type: Grant

Filed: December 29, 2015

Date of Patent: April 25, 2017

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: David Oldcorn, Timour T. Paltashev
Method and apparatus for floating point register caching

Patent number: 9626190

Abstract: The present invention provides a method and apparatus for floating-point register caching. One embodiment of the method includes mapping a first set of architected registers defined by a first instruction set to a memory outside of a plurality of physical registers. The plurality of physical registers are configured to map to the first set, a second set of architected registers defined by a second instruction set, and a set of rename registers. This embodiment of the method also includes adding the physical registers corresponding to the first set of architected registers to the set of rename registers.

Type: Grant

Filed: October 7, 2010

Date of Patent: April 18, 2017

Assignee: Advanced Micro Devices, Inc.

Inventor: Jeff Rupley
Apparatus and method for hash table access

Patent number: 9626428

Abstract: A system and method for accessing a hash table are provided. A hash table includes buckets where each bucket includes multiple chains. When a single instruction multiple data (SIMD) processor receives a group of threads configured to execute a key look-up instruction that accesses an element in the hash table, the threads executing on the SIMD processor identify a bucket that stores a key in the key look-up instruction. Once identified, the threads in the group traverse the multiple chains in the bucket, such that the elements at a chain level in the multiple chains are traversed in parallel. The traversal continues until a key look-up succeeds or fails.

Type: Grant

Filed: September 11, 2013

Date of Patent: April 18, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Mithuna Thottethodi, Steven Reinhardt
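The bucket layout and level-by-level traversal described in the abstract above can be mimicked on a CPU as a rough sketch. The table sizes, the insert policy (append to the shortest chain), and the function names are assumptions for illustration; the inner loop over chains stands in for SIMD lanes that would each inspect one chain at the same level simultaneously.

```python
NUM_BUCKETS = 4        # assumed table size
CHAINS_PER_BUCKET = 2  # assumed number of chains per bucket


def make_table():
    """Each bucket holds several chains; each chain is a list of (key, value)."""
    return [[[] for _ in range(CHAINS_PER_BUCKET)] for _ in range(NUM_BUCKETS)]


def insert(table, key, value):
    bucket = table[hash(key) % NUM_BUCKETS]
    # Assumed policy: append to the shortest chain so chains stay balanced.
    # Duplicate keys are not handled in this sketch.
    min(bucket, key=len).append((key, value))


def lookup(table, key):
    bucket = table[hash(key) % NUM_BUCKETS]
    depth = max(len(chain) for chain in bucket)
    for level in range(depth):
        # One "lane" per chain: all chains are checked at this level before
        # the traversal moves one level deeper.
        for chain in bucket:
            if level < len(chain) and chain[level][0] == key:
                return chain[level][1]
    return None  # every chain exhausted: the look-up fails


if __name__ == "__main__":
    table = make_table()
    for k in range(16):
        insert(table, k, k * 10)
    print(lookup(table, 7), lookup(table, 99))
```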
Integrated differential clock gater

Patent number: 9625938

Abstract: A technique implements differential digital logic circuits with a differential clock distribution network using standard cell differential clock gater circuits to reduce area, delay, and power consumption in integrated circuits. An apparatus includes a first terminal configured to receive a clock signal, a second terminal configured to receive a complementary clock signal, and a third terminal configured to receive a clock control signal. The apparatus includes a latch circuit configured to generate a latched version of the clock control signal based on a version of the clock control signal, a version of the clock signal, and a version of the complementary clock signal. The apparatus includes a combinatorial circuit configured to generate a gated clock signal and a gated complementary clock signal based on the version of the clock control signal, the version of the clock signal, and the version of the complementary clock signal.

Type: Grant

Filed: March 25, 2015

Date of Patent: April 18, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Hariprasad Thodukattil Thazhatheppattu, Prasant Vallur, Animesh Sharma
Implicit texture map parameterization for GPU rendering

Patent number: 9626789

Abstract: Embodiments are described for a method for processing textures for a mesh comprising quadrilateral polygons for real-time rendering of an object or model in a graphics processing unit (GPU), comprising associating an independent texture map with each face of the mesh to produce a plurality of face textures, packing the plurality of face textures into a single texture atlas, wherein the atlas is divided into a plurality of blocks based on a resolution of the face textures, adding a border to the texture map for each face comprising additional texels including at least border texels from an adjacent face texture map, and performing linear interpolation of a trilinear filtering operation on the face textures to resolve resolution discrepancies caused when crossing an edge of a polygon.

Type: Grant

Filed: May 7, 2013

Date of Patent: April 18, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Karl E. Hillesland, Justin Hensley
Data error correction device and methods thereof

Patent number: 9626243

Abstract: A method and device for error detection includes performing error detection for each data word received in a burst access to a memory. When no error is detected, the data words are written to a cache and indicated as valid data. In response to detecting an error in a data word, the error is corrected and the corrected data written to the cache without indicating the data as valid. In addition, the location of the detected error, indicating the data symbol associated with the error, is recorded in an error vector. The error vectors associated with each data word in the burst access are compared to determine whether a detected error was properly corrected.

Type: Grant

Filed: December 11, 2009

Date of Patent: April 18, 2017

Assignee: Advanced Micro Devices, Inc.

Inventor: James O. Nicholson
Method and Apparatus for Workload Placement on Heterogeneous Systems

Publication number: 20170102971

Abstract: The methods and apparatus can assign processing core workloads to processing cores from a heterogeneous instruction set architecture (ISA) pool of available processing cores based on processing core metric results. For example, the method and apparatus can obtain processing core metric results for one or more processing cores, such as processing cores within general purpose processors, from a heterogeneous ISA pool of available processing cores. The method and apparatus can also obtain one or more processing core workloads, such as software applications or software processes, from a pool of available processing core workloads to be assigned. The method and apparatus can then assign one or more processing core workloads that have higher priority than others from the pool of available processing core workloads to a processing core from the heterogeneous ISA pool of available processing cores based on its processing core metric result.

Type: Application

Filed: October 12, 2015

Publication date: April 13, 2017

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Sergey Blagodurov
MINIMIZING LATENCY FROM PERIPHERAL DEVICES TO COMPUTE ENGINES

Publication number: 20170102886

Abstract: Methods, systems, and computer program products are provided for minimizing latency in an implementation where a peripheral device is used as a capture device and a compute device such as a GPU processes the captured data in a computing environment. In embodiments, a peripheral device and GPU are tightly integrated and communicate at a hardware/firmware level. Peripheral device firmware can determine and store compute instructions specifically for the GPU, in a command queue. The compute instructions in the command queue are understood and consumed by firmware of the GPU. The compute instructions include but are not limited to generating low latency visual feedback for presentation to a display screen, and detecting the presence of gestures to be converted to OS messages that can be utilized by any application.

Type: Application

Filed: December 21, 2016

Publication date: April 13, 2017

Applicant: Advanced Micro Devices, Inc.

Inventor: Daniel W. WONG
Propagation simulation buffer for clock domain crossing

Patent number: 9621143

Abstract: Techniques are disclosed relating to detecting and minimizing timing problems created by clock domain crossing (CDC) in integrated circuits. In various embodiments, one or more timing parameters are associated with a path that crosses between clock domains in an integrated circuit, where the one or more timing parameters specify a propagation delay for the path. In one embodiment, the timing parameters may be distributed to different design stages using a configuration file. In some embodiments, the one or more parameters may be used in conjunction with an RTL model to simulate propagation of a data signal along the path. In some embodiments, the one or more parameters may be used in conjunction with a netlist to create a physical design for the integrated circuit, where the physical design includes a representation of the path that has the specified propagation delay.

Type: Grant

Filed: November 8, 2013

Date of Patent: April 11, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael J. Osborn, Michael J. Tresidder, Aaron J. Grenat, Joseph Kidd, Priyank Parakh, Steven J. Kommrusch
SIMD processing unit with local data share and access to a global data share of a GPU

Patent number: 9619428

Abstract: A graphics processing unit is disclosed, the graphics processing unit having a processor having one or more SIMD processing units, and a local data share corresponding to one of the one or more SIMD processing units, the local data share comprising one or more low latency accessible memory regions for each group of threads assigned to one or more execution wavefronts, and a global data share comprising one or more low latency memory regions for each group of threads.

Type: Grant

Filed: June 1, 2009

Date of Patent: April 11, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Michael J. Mantor, Brian Emberling
Hardware and runtime coordinated load balancing for parallel applications

Patent number: 9619290

Abstract: A method of balancing execution rates for a plurality of parallel program loops being executed concurrently by a processor may include estimating a completion time for each program loop of the plurality of program loops, determining a difference between the estimated completion time of a first program loop of the plurality of program loops and the estimated completion time of a second program loop of the plurality of program loops, and decreasing the difference by adjusting an execution rate of the first program loop.

Type: Grant

Filed: March 6, 2015

Date of Patent: April 11, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Peter Bailey, Indrani Paul, Manish Arora
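The three steps named in the abstract above (estimate completion times, take the difference, adjust the first loop's execution rate) can be sketched as follows. The linear completion-time estimate, the rate limits, and the names `Loop` and `balance` are assumptions for this example, not details from the patent.

```python
from dataclasses import dataclass


@dataclass
class Loop:
    remaining: int  # iterations still to run (assumed > 0)
    rate: float     # iterations per second (assumed > 0)

    def completion_time(self):
        """Estimated seconds until the loop finishes (assumed linear model)."""
        return self.remaining / self.rate


def balance(first, second, min_rate, max_rate):
    """Adjust first.rate so its estimated completion time approaches that of
    second, within the allowed rate range. Returns the remaining difference."""
    target = second.completion_time()
    desired = first.remaining / target
    first.rate = max(min_rate, min(max_rate, desired))
    return abs(first.completion_time() - second.completion_time())


if __name__ == "__main__":
    fast = Loop(remaining=1000, rate=100.0)  # would finish in 10 s
    slow = Loop(remaining=1000, rate=50.0)   # would finish in 20 s
    print(balance(fast, slow, min_rate=10.0, max_rate=200.0), fast.rate)
```

Slowing the faster loop, as in the usage example, is one way such a scheme could free shared resources for the slower loop; the abstract itself only states that the rate is adjusted to decrease the difference.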
Memory management in graphics and compute application programming interfaces

Patent number: 9612884

Abstract: Methods are provided for creating objects in a way that permits an API client to explicitly participate in memory management for an object created using the API. Methods for managing data object memory include requesting memory requirements for an object using an API and expressly allocating a memory location for the object based on the memory requirements. Methods are also provided for cloning objects such that a state of the object remains unchanged from the original object to the cloned object or can be explicitly specified.

Type: Grant

Filed: December 4, 2014

Date of Patent: April 4, 2017

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Guennadi Riguer, Brian K. Bennett
Generalized control registers

Patent number: 9606936

Abstract: Methods, systems, and computer readable media generalize control registers in the context of memory address translations for I/O devices. A method includes maintaining a table including a plurality of concurrently available control register base pointers each associated with a corresponding input/output (I/O) device, associating each control register base pointer with a first translation from a guest virtual address (GVA) to a guest physical address (GPA) and a second translation from the GPA to a system physical address (SPA), and operating the first and second translations concurrently for the plurality of I/O devices.

Type: Grant

Filed: December 2, 2011

Date of Patent: March 28, 2017

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Andy Kegel, Mark Hummel, Tony Asaro, Philip Ng
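The two-stage translation in the abstract above (GVA to GPA, then GPA to SPA, selected per I/O device) can be illustrated with a toy model. Here each table entry stands in for a control register base pointer and simply references two page maps held as dictionaries; the page size, the class name `DeviceContext`, and the function `translate` are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class DeviceContext:
    """Stand-in for one table entry: the per-device root of both translations."""

    def __init__(self, gva_to_gpa, gpa_to_spa):
        self.gva_to_gpa = gva_to_gpa  # guest virtual page -> guest physical page
        self.gpa_to_spa = gpa_to_spa  # guest physical page -> system physical page


def translate(table, device_id, gva):
    """Translate a guest virtual address to a system physical address."""
    ctx = table[device_id]                # pick the entry for this I/O device
    page, offset = divmod(gva, PAGE_SIZE)
    gpa_page = ctx.gva_to_gpa[page]       # first translation: GVA -> GPA
    spa_page = ctx.gpa_to_spa[gpa_page]   # second translation: GPA -> SPA
    return spa_page * PAGE_SIZE + offset


if __name__ == "__main__":
    table = {
        1: DeviceContext({0: 5}, {5: 9}),
        2: DeviceContext({0: 7}, {7: 3}),
    }
    # The same guest virtual address resolves differently for each device.
    print(hex(translate(table, 1, 0x10)), hex(translate(table, 2, 0x10)))
```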
Dependence-based replay suppression

Patent number: 9606806

Abstract: A method includes selecting for execution in a processor a load instruction having at least one dependent instruction. Responsive to selecting the load instruction, the at least one dependent instruction is selectively awakened based on a status of a store instruction associated with the load instruction to indicate that the at least one dependent instruction is eligible for execution. A processor includes an instruction pipeline having an execution unit to execute instructions, a scheduler, and a controller. The scheduler selects for execution in the execution unit a load instruction having at least one dependent instruction. The controller, responsive to the scheduler selecting the load instruction, selectively awakens the at least one dependent instruction based on a status of a store instruction associated with the load instruction to indicate that the at least one dependent instruction is eligible for execution by the execution unit.

Type: Grant

Filed: June 25, 2013

Date of Patent: March 28, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Gregory W. Smaus, Michael Achenbach, Christopher J. Burke, Francesco Spadini
Scan flip-flop circuit with dedicated clocks

Patent number: 9606177

Abstract: In one form, a scan flip-flop includes a clock gating cell and a dedicated clock flip-flop. The clock gating cell provides an input clock signal as a scan clock signal when a scan shift enable signal is active, and provides the input clock signal as a data clock signal when the scan shift enable signal is inactive. The dedicated clock flip-flop stores a data input signal and provides the data input signal, so stored, as a data output signal in response to transitions of the data clock signal, and stores a scan data input signal and provides the scan data input signal, so stored, as the data output signal in response to transitions of the scan clock signal. The clock gating cell may further provide the input clock signal as the data clock signal when both a scan shift enable signal is inactive and a data enable signal is active.

Type: Grant

Filed: May 19, 2015

Date of Patent: March 28, 2017

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Daniel W. Bailey, Abhishek Sharma, Michael Q. Co
