Patents by Inventor Nuwan Jayasena

Nuwan Jayasena has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10162757
    Abstract: A distributed shared-memory system includes several nodes that each have one or more processor cores, caches, local main memory, and a directory. Each node further includes predictors that use historical memory access information to predict future coherence permission requirements and speculatively initiate coherence operations. In one embodiment, predictors are included at processor cores for monitoring a memory access stream (e.g., historical sequence of memory addresses referenced by a processor core) and predicting addresses of future accesses. In another embodiment, predictors are included at the directory of each node for monitoring memory access traffic and coherence-related activities for individual cache lines to predict future demands for particular cache lines. In other embodiments, predictors are included at both the processor cores and directory of each node.
    Type: Grant
    Filed: December 6, 2016
    Date of Patent: December 25, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Yasuko Eckert
  • Patent number: 10133672
    Abstract: Described is a system and method for efficient pointer chasing in systems having a single memory node or a network of memory nodes. In particular, a pointer chasing command is sent along with a memory request by an issuing node to a memory node. The pointer chasing command indicates the number of interdependent memory accesses and information needed for the identified interdependent memory accesses. An address computing unit associated with the memory node determines the relevant memory address for an interdependent memory access absent further interaction with the issuing node or without having to return to the issuing node.
    Type: Grant
    Filed: September 15, 2016
    Date of Patent: November 20, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Paula Aguilera Diez, Amin Farmahini-Farahani, Nuwan Jayasena
  • Publication number: 20180276150
    Abstract: A data processing system includes a memory that includes a first memory bank and a second memory bank. The data processing system also includes a conflict detector connected to the memory and adapted to receive memory access information. The conflict detector tracks memory access statistics of the first memory bank, and determines if the first memory bank contains frequent row conflicts. The conflict detector also remaps a frequent row conflict in the first memory bank to the second memory bank. An indirection table is connected to the conflict detector and adapted to receive a memory access request, and redirects an address into a dynamically selected physical memory address in response to a remapping of the frequent row conflict to the second memory bank.
    Type: Application
    Filed: March 24, 2017
    Publication date: September 27, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Yasuko Eckert, Reena Panda, Nuwan Jayasena
  • Publication number: 20180239702
    Abstract: A cache coherence technique for operating a multi-processor system including shared memory includes allocating a cache line of a cache memory of a processor to a memory address in the shared memory in response to execution of an instruction of a program executing on the processor. The technique includes encoding a shared information state of the cache line to indicate whether the memory address is a shared memory address shared by the processor and a second processor, or a private memory address private to the processor, in response to whether the instruction is included in a critical section of the program, the critical section being a portion of the program that confines access to shared, writeable data.
    Type: Application
    Filed: February 23, 2017
    Publication date: August 23, 2018
    Inventors: Amin Farmahini Farahani, Nuwan Jayasena
  • Patent number: 10049044
    Abstract: Proactive flush logic in a computing system is configured to perform a proactive flush operation to flush data from a first memory in a first computing device to a second memory in response to execution of a non-blocking flush instruction. Reactive flush logic in the computing system is configured to, in response to a memory request issued prior to completion of the proactive flush operation, interrupt the proactive flush operation and perform a reactive flush operation to flush requested data from the first memory to the second memory.
    Type: Grant
    Filed: June 14, 2016
    Date of Patent: August 14, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael Boyer, Gabriel Loh, Nuwan Jayasena
  • Patent number: 10042762
    Abstract: A data processing system includes a plurality of processors, local memories associated with a corresponding processor, and at least one inter-processor link. In response to a first processor performing a load or store operation on an address of a corresponding local memory that is not currently in the local cache, a local cache allocates a first cache line and encodes a local state with the first cache line. In response to a load operation from an address of a remote memory that is not currently in the local cache, the local cache allocates a second cache line and encodes a remote state with the second cache line. The first processor performs subsequent loads and stores on the first cache line in the local cache in response to the local state, and subsequent loads from the second cache line in the local cache in response to the remote state.
    Type: Grant
    Filed: September 14, 2016
    Date of Patent: August 7, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Michael Boyer
  • Publication number: 20180157589
    Abstract: A distributed shared-memory system includes several nodes that each have one or more processor cores, caches, local main memory, and a directory. Each node further includes predictors that use historical memory access information to predict future coherence permission requirements and speculatively initiate coherence operations. In one embodiment, predictors are included at processor cores for monitoring a memory access stream (e.g., historical sequence of memory addresses referenced by a processor core) and predicting addresses of future accesses. In another embodiment, predictors are included at the directory of each node for monitoring memory access traffic and coherence-related activities for individual cache lines to predict future demands for particular cache lines. In other embodiments, predictors are included at both the processor cores and directory of each node.
    Type: Application
    Filed: December 6, 2016
    Publication date: June 7, 2018
    Inventors: Nuwan Jayasena, Yasuko Eckert
  • Publication number: 20180113815
    Abstract: A processing system selects data for eviction at a cache based at least in part on a penalty associated with accessing the data at the memory location from which the data was transferred to the cache. The penalty reflects the amount of time and resources expended in copying the data from memory to the cache. By assigning priorities to the data stored at a cache based on the penalty incurred in accessing the data at the memory location from which it was transferred to the cache and selecting data for eviction from the cache based in part on the assigned priority, the processing system can preferentially select for eviction from the cache data that was transferred from a local memory to the cache rather than data that was transferred from a remote memory to the cache.
    Type: Application
    Filed: October 21, 2016
    Publication date: April 26, 2018
    Inventors: Yasuko Eckert, Bo Wu, Nuwan Jayasena, Dong Ping Zhang
  • Patent number: 9947386
    Abstract: A method of managing thermal levels in a memory system may include determining an expected thermal level associated with each of a plurality of locations in a memory structure, and for each operation of a plurality of operations addressed to the memory structure, assigning the operation to a target location of the plurality of physical locations in the memory structure based on a thermal penalty associated with the operation and the expected thermal level associated with the target location.
    Type: Grant
    Filed: September 21, 2014
    Date of Patent: April 17, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Manish Arora, Indrani Paul, Yasuko Eckert, Nuwan Jayasena, Dong Ping Zhang
  • Publication number: 20180081590
    Abstract: A processing system employs a memory module as a temporary write buffer to store write requests when a write buffer at a memory controller reaches a threshold capacity, and de-allocates the temporary write buffer when the write buffer capacity falls below the threshold. Upon receiving a write request, the memory controller stores the write request in a write buffer until the write request can be written to main memory. The memory controller can temporarily extend the memory controller's write buffer to the memory module, thereby accommodating temporary periods of high memory activity without requiring a large permanent write buffer at the memory controller.
    Type: Application
    Filed: September 22, 2016
    Publication date: March 22, 2018
    Inventors: Amin Farmahini Farahani, David A. Roberts, Nuwan Jayasena
  • Publication number: 20180074965
    Abstract: Described is a system and method for efficient pointer chasing in systems having a single memory node or a network of memory nodes. In particular, a pointer chasing command is sent along with a memory request by an issuing node to a memory node. The pointer chasing command indicates the number of interdependent memory accesses and information needed for the identified interdependent memory accesses. An address computing unit associated with the memory node determines the relevant memory address for an interdependent memory access absent further interaction with the issuing node or without having to return to the issuing node.
    Type: Application
    Filed: September 15, 2016
    Publication date: March 15, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Paula Aguilera Diez, Amin Farmahini-Farahani, Nuwan Jayasena
  • Publication number: 20180074958
    Abstract: A data processing system includes a plurality of processors, local memories associated with a corresponding processor, and at least one inter-processor link In response to a first processor performing a load or store operation on an address of a corresponding local memory that is not currently in the local cache, a local cache allocates a first cache line and encodes a local state with the first cache line. In response to a load operation from an address of a remote memory that is not currently in the local cache, the local cache allocates a second cache line and encodes a remote state with the second cache line. The first processor performs subsequent loads and stores on the first cache line in the local cache in response to the local state, and subsequent loads from the second cache line in the local cache in response to the remote state.
    Type: Application
    Filed: September 14, 2016
    Publication date: March 15, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Michael Boyer
  • Publication number: 20180018104
    Abstract: Methods and apparatus of dynamically determining a variable reset latency time based on a data pattern of the data to be written into memory is disclosed. A memory controller determines a variable reset latency time for a plurality of memory cells depending on the bit values to be written into the plurality of memory cells in response to a write request having corresponding bit values. A write latency for the plurality of memory cells is dependent on the bit values being written into the plurality of memory cells. The memory controller writes the bit values of the write request to the plurality of memory cells within the determined variable reset latency time.
    Type: Application
    Filed: July 15, 2016
    Publication date: January 18, 2018
    Inventors: Amin Farmahini Farahani, Benjamin Y. Cho, Nuwan Jayasena
  • Publication number: 20170371386
    Abstract: A cooling system is provided for a 3D integrated circuit (IC) to deliver fluid in x, y, and z dimensions to interior regions of the IC as a means to regulate heat. An IC includes a microfluidic network of channels, at least one sensor and at least one microelectromechanical system (MEMS)-based device that is disposed within the network of channels and that is configured to regulate a flow of fluid within the network of channels. Each sensor monitors a state of the IC. Each MEMS-based device receives control signals based on a state of the IC and regulates a flow of fluid within the network of channels based on control signals that area received on a real-time basis based on changes detected in a state of the IC.
    Type: Application
    Filed: June 24, 2016
    Publication date: December 28, 2017
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Alexander D. Breslow, Dong Ping Zhang, Nuwan Jayasena
  • Publication number: 20170371805
    Abstract: A method and apparatus for reducing TLB shootdown operation overheads in accelerator-based computing systems is described. The disclosed method and apparatus may also be used in the areas of near-memory and in-memory computing, where near-memory or in-memory compute units may need to share a host CPU's virtual address space. Metadata is associated with page table entries (PTEs) and mechanisms use the metadata to limit the number of processing elements that participate in a TLB shootdown operation.
    Type: Application
    Filed: June 23, 2016
    Publication date: December 28, 2017
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Andrew G. Kegel
  • Publication number: 20170357583
    Abstract: Proactive flush logic in a computing system is configured to perform a proactive flush operation to flush data from a first memory in a first computing device to a second memory in response to execution of a non-blocking flush instruction. Reactive flush logic in the computing system is configured to, in response to a memory request issued prior to completion of the proactive flush operation, interrupt the proactive flush operation and perform a reactive flush operation to flush requested data from the first memory to the second memory.
    Type: Application
    Filed: June 14, 2016
    Publication date: December 14, 2017
    Inventors: Michael Boyer, Gabriel Loh, Nuwan Jayasena
  • Publication number: 20170344479
    Abstract: A cache coherence bridge protocol provides an interface between a cache coherence protocol of a host processor and a cache coherence protocol of a processor-in-memory, thereby decoupling coherence mechanisms of the host processor and the processor-in-memory. The cache coherence bridge protocol requires limited change to existing host processor cache coherence protocols. The cache coherence bridge protocol may be used to facilitate interoperability between host processors and processor-in-memory devices designed by different vendors and both the host processors and processor-in-memory devices may implement coherence techniques among computing units within each processor. The cache coherence bridge protocol may support different granularity of cache coherence permissions than those used by cache coherence protocols of a host processor and/or a processor-in-memory.
    Type: Application
    Filed: May 31, 2016
    Publication date: November 30, 2017
    Inventors: Michael W. Boyer, Nuwan Jayasena
  • Patent number: 9792961
    Abstract: Various apparatus and methods using phase change materials are disclosed. In one aspect, a method of operating a computing device that has a first semiconductor chip with a first phase change material and a second semiconductor chip with a second phase change material is provided. The method includes determining if the first semiconductor chip phase change material has available thermal capacity. If the first semiconductor chip phase change material has available thermal capacity then the first semiconductor chip is instructed to operate in sprint mode. The first semiconductor chip is instructed to perform a first computing task while in sprint mode.
    Type: Grant
    Filed: July 21, 2014
    Date of Patent: October 17, 2017
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Manish Arora, Nuwan Jayasena, Gabriel H. Loh, Michael J. Schulte, Srilatha Manne
  • Publication number: 20170293560
    Abstract: A method and apparatus for performing memory prefetching includes determining whether to initiate prefetching. Upon a determination to initiate prefetching, a first memory row is determined as a suitable prefetch candidate, and it is determined whether a particular set of one or more cachelines of the first memory row is to be prefetched.
    Type: Application
    Filed: September 19, 2016
    Publication date: October 12, 2017
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Yasuko Eckert, Nuwan Jayasena, Reena Panda, Onur Kayiran, Michael W. Boyer
  • Publication number: 20170278213
    Abstract: A processor employs a hierarchical register file for a graphics processing unit (GPU). A top level of the hierarchical register file is stored at a local memory of the GPU (e.g., a memory on the same integrated circuit die as the GPU). Lower levels of the hierarchical register file are stored at a different, larger memory, such as a remote memory located on a different die than the GPU. A register file control module monitors the status of in-flight wavefronts at the GPU, and in particular whether each in-flight wavefront is active, predicted to be become active, or inactive. The register file control module places execution data for active and predicted-active wavefronts in the top level of the hierarchical register file and places execution data for inactive wavefronts at lower levels of the hierarchical register file.
    Type: Application
    Filed: March 24, 2016
    Publication date: September 28, 2017
    Inventors: Yasuko Eckert, Nuwan Jayasena