Patents by Inventor Nuwan S. Jayasena

Nuwan S. Jayasena has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

PROCESSING ENGINE FOR COMPLEX ATOMIC OPERATIONS

Publication number: 20140181421

Abstract: A system includes an atomic processing engine (APE) coupled to an interconnect. The interconnect is to couple to one or more processor cores. The APE receives a plurality of commands from the one or more processor cores through the interconnect. In response to a first command, the APE performs a first plurality of operations associated with the first command. The first plurality of operations references multiple memory locations, at least one of which is shared between two or more threads executed by the one or more processor cores.

Type: Application

Filed: December 21, 2012

Publication date: June 26, 2014

Applicant: Advanced Micro Devices, Inc.

Inventors: James M. O'Connor, Michael J. Schulte, Nuwan S. Jayasena, Gabriel H. Loh
Write Endurance Management Techniques in the Logic Layer of a Stacked Memory

Publication number: 20140181457

Abstract: A system, method, and memory device embodying some aspects of the present invention for remapping external memory addresses and internal memory locations in stacked memory are provided. The stacked memory includes one or more memory layers configured to store data. The stacked memory also includes a logic layer connected to the memory layer. The logic layer has an Input/Output (I/O) port configured to receive read and write commands from external devices, a memory map configured to maintain an association between external memory addresses and internal memory locations, and a controller coupled to the I/O port, memory map, and memory layers, configured to store data received from external devices to internal memory locations.

Type: Application

Filed: December 21, 2012

Publication date: June 26, 2014

Applicant: Advanced Micro Devices, Inc.

Inventors: Lisa R. Hsu, Gabriel H. Loh, Michael Ignatowski, Michael J. Schulte, Nuwan S. Jayasena, James M. O'Connor
MECHANISMS TO BOUND THE PRESENCE OF CACHE BLOCKS WITH SPECIFIC PROPERTIES IN CACHES

Publication number: 20140181414

Abstract: A system and method for efficiently limiting storage space for data with particular properties in a cache memory. A computing system includes a cache array and a corresponding cache controller. The cache array includes multiple banks, wherein a first bank is powered down. In response to a write request to a second bank for data indicated to be stored in the powered down first bank, the cache controller determines a respective bypass condition for the data. If the bypass condition exceeds a threshold, then the cache controller invalidates any copy of the data stored in the second bank. If the bypass condition does not exceed the threshold, then the cache controller stores the data with a clean state in the second bank. The cache controller writes the data in a lower-level memory for both cases.

Type: Application

Filed: October 16, 2013

Publication date: June 26, 2014

Applicant: Advanced Micro Devices, Inc.

Inventors: Yasuko Eckert, Gabriel H. Loh, Mauricio Breternitz, James M. O'Connor, Srilatha Manne, Nuwan S. Jayasena, Mithuna S. Thottethodi
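
The write path described in the abstract of publication 20140181414 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `BankedCacheSketch`, the dictionary-based storage, and the idea of passing the bypass condition in as a number are all assumptions made for illustration, since the abstract does not say how the bypass condition is computed.

```python
class BankedCacheSketch:
    """Toy model of a write redirected from a powered-down bank (hypothetical)."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.second_bank = {}   # address -> (data, state); assumed layout
        self.lower_level = {}   # address -> data; stands in for lower-level memory

    def write_redirected(self, address: int, data: bytes, bypass_condition: float) -> str:
        # The data is written to lower-level memory in both cases.
        self.lower_level[address] = data
        if bypass_condition > self.threshold:
            # Bypass: invalidate any copy held in the second bank.
            self.second_bank.pop(address, None)
            return "bypassed"
        # Otherwise keep a clean copy in the second bank.
        self.second_bank[address] = (data, "clean")
        return "cached"


cache = BankedCacheSketch(threshold=4)
print(cache.write_redirected(0x40, b"abc", bypass_condition=2))
print(cache.write_redirected(0x40, b"xyz", bypass_condition=9))
```

Because lower-level memory is always updated, the copy kept in the second bank can be marked clean and dropped later without a write-back.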
DIE-STACKED MEMORY DEVICE WITH RECONFIGURABLE LOGIC

Publication number: 20140176187

Abstract: A die-stacked memory device incorporates a reconfigurable logic device to provide implementation flexibility in performing various data manipulation operations and other memory operations that use data stored in the die-stacked memory device or that result in data that is to be stored in the die-stacked memory device. One or more configuration files representing corresponding logic configurations for the reconfigurable logic device can be stored in a configuration store at the die-stacked memory device, and a configuration controller can program a reconfigurable logic fabric of the reconfigurable logic device using a selected one of the configuration files. Due to the integration of the logic dies and the memory dies, the reconfigurable logic device can perform various data manipulation operations with higher bandwidth and lower latency and power consumption compared to devices external to the die-stacked memory device.

Type: Application

Filed: December 23, 2012

Publication date: June 26, 2014

Applicant: Advanced Micro Devices, Inc.

Inventors: Nuwan S. Jayasena, Michael J. Schulte, Gabriel H. Loh, Michael Ignatowski
Compound Memory Operations in a Logic Layer of a Stacked Memory

Publication number: 20140181427

Abstract: Some die-stacked memories will contain a logic layer in addition to one or more layers of DRAM (or other memory technology). This logic layer may be a discrete logic die or logic on a silicon interposer associated with a stack of memory dies. Additional circuitry/functionality is placed on the logic layer to implement functionality to perform various data movement and address calculation operations. This functionality would allow compound memory operations—a single request communicated to the memory that characterizes the accesses and movement of many data items. This eliminates the performance and power overheads associated with communicating address and control information on a fine-grain, per-data-item basis from a host processor (or other device) to the memory. This approach also provides better visibility of macro-level memory access patterns to the memory system and may enable additional optimizations in scheduling memory accesses.

Type: Application

Filed: December 21, 2012

Publication date: June 26, 2014

Applicant: Advanced Micro Devices, Inc.

Inventors: Nuwan S. Jayasena, James M. O'Connor, Gabriel H. Loh, Michael J. Schulte, Bradford M. Beckmann, Michael Ignatowski
DIE-STACKED MEMORY DEVICE PROVIDING DATA TRANSLATION

Publication number: 20140181458

Abstract: A die-stacked memory device incorporates a data translation controller at one or more logic dies of the device to provide data translation services for data to be stored at, or retrieved from, the die-stacked memory device. The data translation operations implemented by the data translation controller can include compression/decompression operations, encryption/decryption operations, format translations, wear-leveling translations, data ordering operations, and the like. Due to the tight integration of the logic dies and the memory dies, the data translation controller can perform data translation operations with higher bandwidth and lower latency and power consumption compared to operations performed by devices external to the die-stacked memory device.

Type: Application

Filed: December 23, 2012

Publication date: June 26, 2014

Applicant: Advanced Micro Devices, Inc.

Inventors: Gabriel H. Loh, Bradford M. Beckmann, James M. O'Connor, Michael Ignatowski, Michael J. Schulte, Lisa R. Hsu, Nuwan S. Jayasena
Invalidation of Dead Transient Data in Caches

Publication number: 20140173216

Abstract: Embodiments include methods, systems, and articles of manufacture directed to identifying transient data upon storing the transient data in a cache memory, and invalidating the identified transient data in the cache memory.

Type: Application

Filed: December 18, 2012

Publication date: June 19, 2014

Applicant: Advanced Micro Devices, Inc.

Inventors: Nuwan S. Jayasena, Mark D. Hill
Redundant Threading for Improved Reliability

Publication number: 20140156975

Abstract: In some embodiments, a method for improving reliability in a processor is provided. The method can include replicating input data for first and second lanes of a processor, the first and second lanes being located in a same cluster of the processor and the first and second lanes each generating a respective value associated with an instruction to be executed in the respective lane, and responsive to a determination that the generated values do not match, providing an indication that the generated values do not match.

Type: Application

Filed: November 30, 2012

Publication date: June 5, 2014

Applicant: Advanced Micro Devices, Inc.

Inventors: Vilas Sridharan, James M. O'Connor, Steven K. Reinhardt, Nuwan S. Jayasena, Michael J. Schulte, Dean A. Liberty
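
The compare step described in the abstract of publication 20140156975 can be sketched in Python as below. This is a software analogy under stated assumptions, not the hardware method: lanes are modeled as plain callables, and `healthy_lane`, `flaky_lane`, and `run_redundant` are hypothetical names introduced only for this example.

```python
import copy
from typing import Any, Callable, Tuple

Instruction = Callable[[Any], Any]
Lane = Callable[[Instruction, Any], Any]


def healthy_lane(instruction: Instruction, data: Any) -> Any:
    """A lane that executes the instruction without faults (hypothetical)."""
    return instruction(data)


def flaky_lane(instruction: Instruction, data: Any) -> Any:
    """A lane that flips the low bit of its result, to simulate a fault."""
    return instruction(data) ^ 1


def run_redundant(instruction: Instruction, input_data: Any,
                  lane_a: Lane = healthy_lane,
                  lane_b: Lane = healthy_lane) -> Tuple[Any, bool]:
    # Replicate the input so each lane works on its own copy.
    data_a = copy.deepcopy(input_data)
    data_b = copy.deepcopy(input_data)
    value_a = lane_a(instruction, data_a)
    value_b = lane_b(instruction, data_b)
    # Report a mismatch instead of silently trusting either value.
    return value_a, value_a != value_b


print(run_redundant(sum, [1, 2, 3]))
print(run_redundant(sum, [1, 2, 3], lane_b=flaky_lane))
```

The second element of the returned tuple plays the role of the "indication that the generated values do not match"; what a real processor does with that indication is outside this sketch.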
Prefetch Kernels on Data-Parallel Processors

Publication number: 20140149677

Abstract: Embodiments include methods, systems and computer readable media configured to execute a first kernel (e.g. compute or graphics kernel) with reduced intermediate state storage resource requirements. These include executing a first and second (e.g. prefetch) kernel on a data-parallel processor, such that the second kernel begins executing before the first kernel. The second kernel performs memory operations that are based upon at least a subset of memory operations in the first kernel.

Type: Application

Filed: November 26, 2012

Publication date: May 29, 2014

Applicant: Advanced Micro Devices, Inc.

Inventors: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor
Using a Linear Prediction to Configure an Idle State of an Entity in a Computing Device

Publication number: 20140149772

Abstract: The described embodiments include a computing device with one or more entities (processor cores, processors, etc.). In some embodiments, during operation, a thermal power management unit in the computing device uses a linear prediction to compute a predicted duration of a next idle period for an entity based on the duration of one or more previous idle periods for the entity. Based on the predicted duration of the next idle period, the thermal power management unit configures the entity to operate in a corresponding idle state.

Type: Application

Filed: November 8, 2013

Publication date: May 29, 2014

Applicant: Advanced Micro Devices, Inc.

Inventors: Manish Arora, Nuwan S. Jayasena, Yasuko Eckert, Madhu Saravana Sibi Govindan, William L. Bircher, Michael J. Schulte, Srilatha Manne
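
As a rough illustration of the idea in the abstract of publication 20140149772, the Python sketch below predicts the next idle duration as a weighted sum of recent idle durations and maps the prediction to an idle state. The coefficient values, the `IDLE_STATES` table, its thresholds, and the function names are all assumptions for illustration; the abstract does not specify them.

```python
from typing import Sequence

# Hypothetical idle states: (minimum predicted idle duration in microseconds, name).
# Thresholds are illustrative only.
IDLE_STATES = [(0.0, "shallow"), (100.0, "medium"), (1000.0, "deep")]


def predict_next_idle(history: Sequence[float], coeffs: Sequence[float]) -> float:
    """Linear prediction: weighted sum of the most recent idle durations.

    coeffs[0] weights the newest duration, coeffs[1] the one before it, and so on.
    """
    recent = list(history)[-len(coeffs):][::-1]  # newest first
    return sum(c * d for c, d in zip(coeffs, recent))


def choose_idle_state(predicted: float) -> str:
    """Pick the deepest state whose threshold the prediction meets."""
    chosen = IDLE_STATES[0][1]
    for threshold, name in IDLE_STATES:
        if predicted >= threshold:
            chosen = name
    return chosen


history = [200.0, 400.0, 800.0]   # previous idle durations, oldest first
coeffs = [0.5, 0.25, 0.25]        # assumed predictor weights
predicted = predict_next_idle(history, coeffs)
print(predicted, choose_idle_state(predicted))
```

In practice the weights would have to be fitted to observed idle behavior; fixed values are used here only to keep the example self-contained.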
Scheduling compute kernel workgroups to heterogeneous processors based on historical processor execution times and utilizations

Patent number: 8707314

Abstract: System and method embodiments for optimally allocating compute kernels to different types of processors, such as CPUs and GPUs, in a heterogeneous computer system are disclosed. These include comparing a kernel profile of a compute kernel to respective processor profiles of a plurality of processors in a heterogeneous computer system, selecting at least one processor from the plurality of processors based upon the comparing, and scheduling the compute kernel for execution in the selected at least one processor.

Type: Grant

Filed: December 16, 2011

Date of Patent: April 22, 2014

Assignee: Advanced Micro Devices, Inc.

Inventors: Jayanth Gummaraju, Nuwan S. Jayasena
REDUCING COLD TLB MISSES IN A HETEROGENEOUS COMPUTING SYSTEM

Publication number: 20140101405

Abstract: Methods and apparatuses are provided for avoiding cold translation lookaside buffer (TLB) misses in a computer system. A typical system is configured as a heterogeneous computing system having at least one central processing unit (CPU) and one or more graphics processing units (GPUs) that share a common memory address space. Each processing unit (CPU and GPU) has an independent TLB. When offloading a task from a particular CPU to a particular GPU, translation information is sent along with the task assignment. The translation information allows the GPU to load the address translation data into the TLB associated with the one or more GPUs prior to executing the task. Preloading the TLB of the GPUs reduces or avoids cold TLB misses that could otherwise occur without the benefits offered by the present disclosure.

Type: Application

Filed: October 5, 2012

Publication date: April 10, 2014

Applicant: Advanced Micro Devices, Inc.

Inventors: Misel-Myrto Papadopoulou, Lisa R. Hsu, Andrew G. Kegel, Nuwan S. Jayasena, Bradford M. Beckmann, Steven K. Reinhardt
STACKED MEMORY DEVICE WITH HELPER PROCESSOR

Publication number: 20140040532

Abstract: A processing system comprises one or more processor devices and other system components coupled to a stacked memory device having a set of stacked memory layers and a set of one or more logic layers. The set of logic layers implements a helper processor that executes instructions to perform tasks in response to a task request from the processor devices or otherwise on behalf of the other processor devices. The set of logic layers also includes a memory interface coupled to memory cell circuitry implemented in the set of stacked memory layers and coupleable to the processor devices. The memory interface operates to perform memory accesses for the processor devices and for the helper processor. By virtue of the helper processor's tight integration with the stacked memory layers, the helper processor may perform certain memory-intensive operations more efficiently than could be performed by the external processor devices.

Type: Application

Filed: August 6, 2012

Publication date: February 6, 2014

Applicant: Advanced Micro Devices, Inc.

Inventors: Yasuko Watanabe, Gabriel H. Loh, James M. O'Connor, Michael Ignatowski, Nuwan S. Jayasena
METHOD FOR URGENCY-BASED PREEMPTION OF A PROCESS

Publication number: 20140022263

Abstract: The desire to use an Accelerated Processing Device (APD) for general computation has increased due to the APD's exemplary performance characteristics. However, current systems incur high overhead when dispatching work to the APD because a process cannot be efficiently identified or preempted. A rogue process that occupies the APD for an arbitrary amount of time can prevent the effective utilization of the available system capacity and can reduce the processing progress of the system. Embodiments described herein can overcome this deficiency by enabling the system software to preempt a process executing on the APD for any reason. The APD provides an interface for initiating such a preemption. This interface exposes an urgency of the request, which determines whether the process being preempted is allowed a grace period to complete its issued work before being forced off the hardware.

Type: Application

Filed: July 23, 2012

Publication date: January 23, 2014

Inventors: Robert Scott Hartog, Ralph Clay Taylor, Michael Mantor, Kevin McGrath, Sebastien Nussbaum, Nuwan S. Jayasena, Rex Eldon McCrary, Mark Leather, Philip J. Rogers
MEMORY ARCHITECTURE FOR READ-MODIFY-WRITE OPERATIONS

Publication number: 20130159812

Abstract: According to one embodiment, a memory architecture implemented method is provided, where the memory architecture includes a logic chip and one or more memory chips on a single die, and where the method comprises: reading values of data from the one or more memory chips to the logic chip, where the one or more memory chips and the logic chip are on a single die; modifying, via the logic chip on the single die, the values of data; and writing, from the logic chip to the one or more memory chips, the modified values of data.

Type: Application

Filed: December 16, 2011

Publication date: June 20, 2013

Applicant: Advanced Micro Devices, Inc.

Inventors: Gabriel H. Loh, James M. O'Connor, Michael Ignatowski, Nuwan S. Jayasena, Bradford M. Beckmann
Allocating Compute Kernels to Processors in a Heterogeneous System

Publication number: 20130160016

Abstract: System and method embodiments for optimally allocating compute kernels to different types of processors, such as CPUs and GPUs, in a heterogeneous computer system are disclosed. These include comparing a kernel profile of a compute kernel to respective processor profiles of a plurality of processors in a heterogeneous computer system, selecting at least one processor from the plurality of processors based upon the comparing, and scheduling the compute kernel for execution in the selected at least one processor.

Type: Application

Filed: December 16, 2011

Publication date: June 20, 2013

Applicant: Advanced Micro Devices, Inc.

Inventors: Jayanth Gummaraju, Nuwan S. Jayasena
Software Mechanisms for Managing Task Scheduling on an Accelerated Processing Device (APD)

Publication number: 20130160017

Abstract: Embodiments described herein provide a method for managing task scheduling on an accelerated processing device (APD). The method includes executing a first task within the APD, monitoring for an interruption of the execution of the first task, and switching to a second task when an interruption is detected.

Type: Application

Filed: December 14, 2011

Publication date: June 20, 2013

Inventors: Robert Scott Hartog, Ralph Clay Taylor, Michael Mantor, Thomas Roy Woller, Kevin McGrath, Sebastien Nussbaum, Nuwan S. Jayasena, Rex McCrary, Philip J. Rogers, Mark Leather
Methods and Systems for Synchronous Operation of a Processing Device

Publication number: 20120198458

Abstract: Embodiments of the present invention provide a method of synchronous operation of a first processing device and a second processing device. The method includes executing a process on the first processing device, responsive to a determination that execution of the process on the first device has reached a serial-parallel boundary, passing an execution thread of the process from the first processing device to the second processing device, and executing the process on the second processing device.

Type: Application

Filed: November 30, 2011

Publication date: August 2, 2012

Applicant: Advanced Micro Devices, Inc.

Inventors: Robert Scott Hartog, Ralph Clay Taylor, Michael Mantor, Sebastien Nussbaum, Rex McCrary, Mark Leather, Nuwan S. Jayasena, Kevin McGrath, Philip J. Rogers, Thomas Woller
