Patents by Inventor Nuwan Jayasena

Nuwan Jayasena has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200057717
    Abstract: In one form, a data processing system includes a host integrated circuit having a memory controller, a memory bus coupled to the memory controller, and a memory module. The memory module includes a bulk memory and a memory module scratchpad coupled to the bulk memory, wherein the memory module scratchpad has a lower access overhead than the bulk memory. The memory controller selectively provides predetermined commands over the memory bus to cause the memory module to copy data between the bulk memory and the memory module scratchpad without conducting data on the memory bus in response to a data movement decision.
    Type: Application
    Filed: August 17, 2018
    Publication date: February 20, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Amin Farmahini Farahani, Michael Ignatowski
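
The bulk-memory/scratchpad copy idea described in the entry above can be pictured with a small host-side model. The sketch below is an interpretation only, not the patented interface; the class, method names, and the use of Python byte arrays for the two memories are assumptions for illustration.

```python
class MemoryModule:
    """Toy model of a module that pairs bulk memory with a lower-overhead
    scratchpad and copies between them internally, so only the command,
    not the data, crosses the memory bus."""
    def __init__(self, bulk_size, scratch_size):
        self.bulk = bytearray(bulk_size)
        self.scratchpad = bytearray(scratch_size)

    def copy_to_scratchpad(self, bulk_addr, scratch_addr, length):
        # Triggered by a predetermined command from the memory controller;
        # the data moves entirely inside the module.
        self.scratchpad[scratch_addr:scratch_addr + length] = \
            self.bulk[bulk_addr:bulk_addr + length]

    def copy_to_bulk(self, scratch_addr, bulk_addr, length):
        self.bulk[bulk_addr:bulk_addr + length] = \
            self.scratchpad[scratch_addr:scratch_addr + length]
```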
  • Patent number: 10503641
    Abstract: A cache coherence bridge protocol provides an interface between a cache coherence protocol of a host processor and a cache coherence protocol of a processor-in-memory, thereby decoupling the coherence mechanisms of the host processor and the processor-in-memory. The cache coherence bridge protocol requires limited change to existing host processor cache coherence protocols. The cache coherence bridge protocol may be used to facilitate interoperability between host processors and processor-in-memory devices designed by different vendors; both the host processors and the processor-in-memory devices may implement their own coherence techniques among the computing units within each processor. The cache coherence bridge protocol may also support a different granularity of cache coherence permissions than that used by the cache coherence protocols of the host processor and/or the processor-in-memory.
    Type: Grant
    Filed: May 31, 2016
    Date of Patent: December 10, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael W. Boyer, Nuwan Jayasena
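
One duty such a bridge might perform is translating between coherence granularities, as the last sentence of the abstract above suggests. The sketch below illustrates only that idea, with assumed line and region sizes and invented state names; it is not the protocol defined in the patent.

```python
LINE = 64        # host coherence granularity in bytes (assumption)
REGION = 4096    # PIM-side coherence granularity in bytes (assumption)

class BridgeDirectory:
    """Track permissions at the PIM's coarser region granularity while the
    host requests them per cache line."""
    def __init__(self):
        self.region_state = {}    # region index -> "shared" | "exclusive"

    def host_request(self, line_index, want_write):
        region = (line_index * LINE) // REGION
        state = self.region_state.get(region)
        if want_write and state != "exclusive":
            self.region_state[region] = "exclusive"
            return ("pim_request_exclusive", region)   # message to the PIM side
        if not want_write and state is None:
            self.region_state[region] = "shared"
            return ("pim_request_shared", region)
        return ("no_pim_traffic", region)              # bridge absorbs the request
```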
  • Publication number: 20190317832
    Abstract: A thread holding a lock notifies a sleeping thread that is waiting on the lock that the lock-holding thread is “about” to release the lock. In response to the notification, the waiting thread is woken up. While the waiting thread is being woken up, the lock-holding thread completes other operations and then releases the lock. The notification to the waiting thread hides the latency associated with waking it up by allowing the operations that wake the waiting thread to occur while the lock-holding thread is performing its remaining operations prior to releasing the lock.
    Type: Application
    Filed: April 12, 2018
    Publication date: October 17, 2019
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Amin Farmahini-Farahani, David A. Roberts
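
The "about to release" notification in the entry above can be sketched in software. The class below is an illustrative stand-in, not the patented mechanism: the hint wakes a sleeping waiter early so its wake-up overlaps the holder's remaining work. All names are assumptions.

```python
import threading

class HintedLock:
    """Lock whose holder can signal that release is imminent."""
    def __init__(self):
        self._held = threading.Lock()
        self._soon = threading.Event()    # "about to release" notification

    def acquire(self):
        if self._held.acquire(blocking=False):
            self._soon.clear()
            return
        self._soon.wait()                 # sleep until the holder's hint (or release)
        self._held.acquire()              # release should follow shortly after the hint
        self._soon.clear()

    def notify_imminent_release(self):
        self._soon.set()                  # wake sleeping waiters early

    def release(self):
        self._soon.set()                  # also serves as the wake-up if no hint was sent
        self._held.release()
```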
  • Publication number: 20190317831
    Abstract: A memory fence or other similar operation is executed with reduced latency. An early fence operation is executed and acts as a hint to the processor executing the thread that executes the fence. This hint causes the processor to begin performing sub-operations for the fence earlier than if no such hint were executed. Examples of sub-operations for the fence include operations to make data written to by writes prior to the fence operation available to other threads. A resolving fence, which occurs after the early fence, performs the remaining sub-operations for the fence. By triggering some or all of the sub-operations for a memory fence that will occur in the future, the early fence operation reduces the amount of latency associated with that memory fence operation.
    Type: Application
    Filed: April 12, 2018
    Publication date: October 17, 2019
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Amin Farmahini-Farahani, David A. Roberts, Nuwan Jayasena
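
The early-fence/resolving-fence split described above can be modeled as starting an expensive sub-operation in the background and completing it later. The sketch below assumes that the costly sub-operation is draining a software write buffer; the names and the buffer model are illustrative, not taken from the patent.

```python
from concurrent.futures import ThreadPoolExecutor

class SplitFence:
    """Two-part fence: early_fence() hints, resolving_fence() completes."""
    def __init__(self, write_buffer, flush_fn):
        self._buffer = write_buffer           # writes not yet visible to others
        self._flush = flush_fn                # makes a batch of writes visible
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending = None

    def early_fence(self):
        # Hint: start making prior writes visible now, in the background,
        # while the thread keeps doing unrelated work.
        drained = list(self._buffer)
        self._buffer.clear()
        self._pending = self._pool.submit(self._flush, drained)

    def resolving_fence(self):
        # Complete whatever the early fence started; if no early fence was
        # issued, do the whole flush here, as an ordinary fence would.
        if self._pending is not None:
            self._pending.result()
            self._pending = None
        if self._buffer:
            self._flush(list(self._buffer))
            self._buffer.clear()
```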
  • Patent number: 10409343
    Abstract: A cooling system is provided for a 3D integrated circuit (IC) to deliver fluid in the x, y, and z dimensions to interior regions of the IC as a means of regulating heat. An IC includes a microfluidic network of channels, at least one sensor, and at least one microelectromechanical system (MEMS)-based device that is disposed within the network of channels and that is configured to regulate a flow of fluid within the network of channels. Each sensor monitors a state of the IC. Each MEMS-based device receives control signals based on the state of the IC and regulates the flow of fluid within the network of channels according to control signals that are received on a real-time basis as changes in the state of the IC are detected.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: September 10, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexander D. Breslow, Dong Ping Zhang, Nuwan Jayasena
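
A single real-time control step for the sensor/valve loop described above might look like the following. The proportional control law, the setpoint, and all parameter values are illustrative assumptions, not taken from the patent.

```python
def regulate_flow(sensors, valves, setpoint_c=85.0, gain=0.02):
    """Open each MEMS valve further as its local sensor reads hotter than the
    setpoint, and close it as the region cools."""
    for region, temperature_c in sensors.items():
        error = temperature_c - setpoint_c
        # Valve opening is clamped to [0.0, 1.0]; more coolant flows where hotter.
        valves[region] = min(1.0, max(0.0, valves[region] + gain * error))
    return valves

# Example: region "core1" runs hot, so its valve opens; "cache" is cool, so it closes.
valves = {"core1": 0.3, "cache": 0.3}
print(regulate_flow({"core1": 95.0, "cache": 70.0}, valves))
```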
  • Patent number: 10382410
    Abstract: A processing system includes a processing module having a first interface coupleable to an interconnect. The first interface includes a first cryptologic engine to encrypt a representation of store data of a store operation and a memory address using a first key and a first feedback-based cryptologic process to generate first encrypted data and an encrypted memory address. A memory module includes a second interface coupled to the interconnect. The second interface includes a second cryptologic engine to decrypt the first encrypted data and the encrypted memory address using a second key and a second feedback-based cryptologic process to generate a copy of the representation of the store data and a copy of the memory address. The second interface is further configured to store the copy of the representation of the store data to a memory location of the memory module's memory core based on the copy of the memory address.
    Type: Grant
    Filed: January 12, 2016
    Date of Patent: August 13, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Dong Ping Zhang
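
The encrypted store path in the entry above can be sketched end to end. In the sketch below AES-CBC (from the third-party cryptography package) stands in for the "feedback-based cryptologic process" on each side of the link, and the 8-byte-address/8-byte-data framing is an assumption made only so the payload fits one cipher block.

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key, iv = os.urandom(16), os.urandom(16)          # shared by both interfaces

def link_cipher():
    # AES-CBC stands in for each interface's feedback-based cryptologic process.
    return Cipher(algorithms.AES(key), modes.CBC(iv))

# Processor-side interface: encrypt the memory address and the store data together.
addr, data = 0x1000, b"8 bytes!"                  # one 16-byte CBC block in total
enc = link_cipher().encryptor()
wire = enc.update(addr.to_bytes(8, "little") + data) + enc.finalize()

# Memory-side interface: decrypt, then perform the store at the recovered address.
dec = link_cipher().decryptor()
block = dec.update(wire) + dec.finalize()
recovered_addr, recovered_data = int.from_bytes(block[:8], "little"), block[8:]
assert (recovered_addr, recovered_data) == (addr, data)
```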
  • Publication number: 20190220426
    Abstract: A configurable computing system which uses near-memory and in-memory hardened logic blocks is described herein. The hardened logic blocks are incorporated into memory modules. The memory modules include an interface or communication logic to communicate between the configurable computing substrate and the memory module. In an implementation, the memory modules can include an on-die memory or other forms of non-configurable logic to enable more efficient processing for a variety of operations. In another implementation, the memory modules can include a portion of configurable computing substrate logic fabric to enable more efficient processing for a variety of operations. In another implementation, the memory modules can include an on-die memory and a portion of configurable computing substrate logic fabric to enable more efficient processing for a variety of operations.
    Type: Application
    Filed: January 16, 2018
    Publication date: July 18, 2019
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Michael Ignatowski
  • Publication number: 20190188577
    Abstract: A system assigns experts of a mixture-of-experts artificial intelligence model to processing devices in an automated manner. The system includes an orchestrator component that maintains priority data that stores, for each of a set of experts, and for each of a set of execution parameters, ranking information that ranks different processing devices for the particular execution parameter. In one example, for the execution parameter of execution speed, and for a first expert, the priority data indicates that a central processing unit (“CPU”) executes the first expert faster than a graphics processing unit (“GPU”). In this example, for the execution parameter of power consumption, and for the first expert, the priority data indicates that a GPU uses less power than a CPU. The priority data stores such information for one or more processing devices, one or more experts, and one or more execution characteristics.
    Type: Application
    Filed: December 20, 2017
    Publication date: June 20, 2019
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Nicholas Malaya, Nuwan Jayasena
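
The priority data and device-selection step described in the entry above can be illustrated with a small table and lookup. The expert names, parameter names, and rankings below are hypothetical example values, not data from the patent.

```python
# For each (expert, execution parameter), devices ranked best-first.
PRIORITY = {
    ("expert_0", "speed"): ["CPU", "GPU"],
    ("expert_0", "power"): ["GPU", "CPU"],
    ("expert_1", "speed"): ["GPU", "CPU"],
    ("expert_1", "power"): ["GPU", "CPU"],
}

def assign_expert(expert, parameter, available_devices):
    """Pick the highest-ranked available device for this expert and parameter."""
    for device in PRIORITY[(expert, parameter)]:
        if device in available_devices:
            return device
    raise RuntimeError("no ranked device available")

print(assign_expert("expert_0", "speed", {"CPU", "GPU"}))   # -> CPU
print(assign_expert("expert_0", "power", {"CPU", "GPU"}))   # -> GPU
```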
  • Patent number: 10310981
    Abstract: A method and apparatus for performing memory prefetching includes determining whether to initiate prefetching. Upon a determination to initiate prefetching, a first memory row is determined as a suitable prefetch candidate, and it is determined whether a particular set of one or more cachelines of the first memory row is to be prefetched.
    Type: Grant
    Filed: September 19, 2016
    Date of Patent: June 4, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Yasuko Eckert, Nuwan Jayasena, Reena Panda, Onur Kayiran, Michael W. Boyer
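
The three-step flow in the abstract above (decide whether to prefetch, pick a candidate row, pick cachelines within it) can be sketched as one planning function. The thresholds, the row/line address encoding, and the "most frequently missed row" heuristic are illustrative assumptions only.

```python
def plan_prefetch(recent_misses, lines_per_row=16, miss_threshold=4):
    """Return (row, lines to prefetch) or None if prefetching is not initiated."""
    # 1. Decide whether to initiate prefetching at all.
    if len(recent_misses) < miss_threshold:
        return None
    # 2. Pick a suitable candidate memory row: here, the most frequently missed one.
    rows = [addr // lines_per_row for addr in recent_misses]
    row = max(set(rows), key=rows.count)
    # 3. Decide which cachelines of that row to prefetch: only the untouched ones.
    touched = {addr % lines_per_row for addr in recent_misses
               if addr // lines_per_row == row}
    return row, [line for line in range(lines_per_row) if line not in touched]
```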
  • Patent number: 10310997
    Abstract: A processing system employs a memory module as a temporary write buffer to store write requests when a write buffer at a memory controller reaches a threshold capacity, and de-allocates the temporary write buffer when the write buffer capacity falls below the threshold. Upon receiving a write request, the memory controller stores the write request in a write buffer until the write request can be written to main memory. The memory controller can temporarily extend the memory controller's write buffer to the memory module, thereby accommodating temporary periods of high memory activity without requiring a large permanent write buffer at the memory controller.
    Type: Grant
    Filed: September 22, 2016
    Date of Patent: June 4, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amin Farmahini Farahani, David A. Roberts, Nuwan Jayasena
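
The temporary spill of the write buffer into the memory module, as described above, can be modeled with two queues and a threshold. The names and the drain policy below are assumptions for illustration, not the patented design.

```python
class SpilloverWriteBuffer:
    """Extend a memory controller's write buffer into a temporary buffer on the
    memory module once a fill threshold is crossed."""
    def __init__(self, threshold, module_buffer):
        self.local = []                     # controller-side write buffer
        self.threshold = threshold
        self.module_buffer = module_buffer  # temporary buffer on the memory module

    def enqueue(self, write):
        if len(self.local) >= self.threshold:
            self.module_buffer.append(write)   # spill: local buffer is at threshold
        else:
            self.local.append(write)

    def drain_one(self, write_to_memory):
        if self.local:
            write_to_memory(self.local.pop(0))
        # Once back under the threshold, reclaim spilled writes so the
        # module-side buffer can eventually be de-allocated.
        while self.module_buffer and len(self.local) < self.threshold:
            self.local.append(self.module_buffer.pop(0))
```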
  • Publication number: 20190163644
    Abstract: A first processor is configured to detect migration of a page from a second memory associated with a second processor to a first memory associated with the first processor or to detect duplication of the page in the first memory and the second memory. The first processor implements a translation lookaside buffer (TLB) and the first processor is configured to insert an entry in the TLB in response to the duplication or the migration of the page. The entry maps a virtual address of the page to a physical address in the first memory and the entry is inserted into the TLB without modifying a corresponding entry in a page table that maps the virtual address of the page to a physical address in the second memory. In some cases, a duplicate translation table (DTT) stores a copy of the entry that is accessed in response to a TLB miss.
    Type: Application
    Filed: November 29, 2017
    Publication date: May 30, 2019
    Inventors: Nuwan Jayasena, Yasuko Eckert
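
The behavior described above, installing a local TLB entry (plus a duplicate-translation-table copy) on migration while leaving the shared page table untouched, can be sketched as follows. The data structures and method names are illustrative assumptions.

```python
class MigrationAwareTLB:
    """Per-processor translation state for migrated or duplicated pages."""
    def __init__(self, page_table):
        self.page_table = page_table   # shared page table, never modified here
        self.tlb = {}                  # virtual page -> local physical frame
        self.dtt = {}                  # duplicate translation table, survives TLB eviction

    def on_migration(self, vpage, local_frame):
        # Page was migrated or duplicated into this processor's memory.
        self.tlb[vpage] = local_frame
        self.dtt[vpage] = local_frame

    def translate(self, vpage):
        if vpage in self.tlb:
            return self.tlb[vpage]
        if vpage in self.dtt:          # TLB miss: consult the duplicate table first
            self.tlb[vpage] = self.dtt[vpage]
            return self.tlb[vpage]
        return self.page_table[vpage]  # fall back to the unmodified page table
```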
  • Patent number: 10282308
    Abstract: A method and apparatus for reducing TLB shootdown operation overheads in accelerator-based computing systems is described. The disclosed method and apparatus may also be used in the areas of near-memory and in-memory computing, where near-memory or in-memory compute units may need to share a host CPU's virtual address space. Metadata is associated with page table entries (PTEs) and mechanisms use the metadata to limit the number of processing elements that participate in a TLB shootdown operation.
    Type: Grant
    Filed: June 23, 2016
    Date of Patent: May 7, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Andrew G. Kegel
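
The metadata-limited shootdown described above can be illustrated with a single filtering step: per-PTE metadata records which processing elements may have cached the translation, and only those receive the shootdown. The field name "sharers" is a hypothetical stand-in for whatever metadata the patent defines.

```python
def tlb_shootdown_targets(pte_metadata, all_cores):
    """Return only the processing elements that need to participate."""
    sharers = pte_metadata.get("sharers", all_cores)   # conservative default: everyone
    return [core for core in all_cores if core in sharers]

# Example: only cores 2 and 5 ever cached this translation, so only they are signaled.
print(tlb_shootdown_targets({"sharers": {2, 5}}, range(8)))   # -> [2, 5]
```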
  • Patent number: 10268416
    Abstract: A memory-to-memory copy operation control system includes a processor configured to receive an instruction to perform a memory-to-memory copy operation and a memory module network in communication with the processor. The memory module network has a plurality of memory modules that include a proximal memory module in direct communication with the processor and one or more additional memory modules in communication with the processor via the proximal memory module. The system also includes a memory controller in communication with the processor and the network of memory modules. The processor is configured to issue a first command causing data to be copied from a first memory module to a second memory module without sending the data to the processor or the memory controller.
    Type: Grant
    Filed: October 28, 2015
    Date of Patent: April 23, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, David A. Roberts
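
The command-driven, processor-bypassing copy described above can be modeled with a toy memory network. The module names and dictionary-based memories below are illustrative only.

```python
class MemoryNetwork:
    """Copy between memory modules without routing the data through the
    processor or the memory controller."""
    def __init__(self):
        self.modules = {"proximal": {}, "remote": {}}

    def copy(self, src, src_addr, dst, dst_addr, length):
        # The processor issues a single command; the data itself moves only
        # between the modules inside the memory network.
        for i in range(length):
            self.modules[dst][dst_addr + i] = self.modules[src][src_addr + i]

net = MemoryNetwork()
net.modules["proximal"].update({0: "a", 1: "b"})
net.copy("proximal", 0, "remote", 100, 2)
print(net.modules["remote"])   # {100: 'a', 101: 'b'}
```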
  • Patent number: 10242420
    Abstract: Methods and apparatus are described. A method includes an accelerated processing device running a process. When a maximum time interval during which the process is permitted to run expires before the process completes, the accelerated processing device receives an operating-system-initiated instruction to stop running the process. The accelerated processing device stops the process from running in response to the received operating-system-initiated instruction.
    Type: Grant
    Filed: November 28, 2016
    Date of Patent: March 26, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Robert Scott Hartog, Ralph Clayton Taylor, Michael Mantor, Kevin John McGrath, Sebastien Nussbaum, Nuwan Jayasena, Rex McCrary, Mark Leather, Philip J. Rogers, Thomas Woller
  • Patent number: 10198369
    Abstract: A data processing system includes a memory that includes a first memory bank and a second memory bank. The data processing system also includes a conflict detector connected to the memory and adapted to receive memory access information. The conflict detector tracks memory access statistics of the first memory bank, and determines if the first memory bank contains frequent row conflicts. The conflict detector also remaps a frequent row conflict in the first memory bank to the second memory bank. An indirection table is connected to the conflict detector and adapted to receive a memory access request, and redirects an address into a dynamically selected physical memory address in response to a remapping of the frequent row conflict to the second memory bank.
    Type: Grant
    Filed: March 24, 2017
    Date of Patent: February 5, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Yasuko Eckert, Reena Panda, Nuwan Jayasena
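
The conflict detector plus indirection table described above can be sketched with a counter and a remap dictionary. The conflict threshold, the "same row index in a spare bank" policy, and all names are assumptions for illustration.

```python
from collections import Counter

class ConflictRemapper:
    """Count row conflicts per bank and remap frequently conflicting rows."""
    def __init__(self, conflict_threshold=8):
        self.conflicts = Counter()        # (bank, row) -> observed conflict count
        self.indirection = {}             # (bank, row) -> (bank, row) remapping
        self.threshold = conflict_threshold

    def record_conflict(self, bank, row, spare_bank):
        self.conflicts[(bank, row)] += 1
        if self.conflicts[(bank, row)] >= self.threshold:
            # Redirect future accesses to the spare bank.
            self.indirection[(bank, row)] = (spare_bank, row)

    def translate(self, bank, row):
        # Indirection step applied to each memory access request.
        return self.indirection.get((bank, row), (bank, row))
```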
  • Publication number: 20190018800
    Abstract: A host processor receives an address translation request from an accelerator, which may be trusted or un-trusted. The address translation request includes a virtual address in a virtual address space that is shared by the host processor and the accelerator. The host processor encrypts a physical address in a host memory indicated by the virtual address in response to the accelerator being permitted to access the physical address. The host processor then provides the encrypted physical address to the accelerator. The accelerator provides memory access requests including the encrypted physical address to the host processor, which decrypts the physical address and selectively accesses a location in the host memory indicated by the decrypted physical address depending upon whether the accelerator is permitted to access the location indicated by the decrypted physical address.
    Type: Application
    Filed: July 14, 2017
    Publication date: January 17, 2019
    Inventors: Nuwan Jayasena, Brandon K. Potter, Andrew G. Kegel
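
The translate-then-access exchange described above can be sketched with the host holding the key on both sides. Fernet (from the third-party cryptography package) stands in for the host's cipher, and the permission model and names are illustrative assumptions.

```python
from cryptography.fernet import Fernet

class HostTranslator:
    """Host-side handling of translation requests from an un-trusted accelerator."""
    def __init__(self, page_table, allowed_physical):
        self._key = Fernet(Fernet.generate_key())
        self._page_table = page_table          # virtual -> physical
        self._allowed = allowed_physical       # physical addresses the accelerator may use

    def translate(self, vaddr):
        # Return an encrypted physical address, only if access is permitted.
        paddr = self._page_table[vaddr]
        if paddr not in self._allowed:
            raise PermissionError("accelerator may not access this page")
        return self._key.encrypt(str(paddr).encode())

    def access(self, encrypted_paddr, memory):
        # Decrypt and re-check permission before touching host memory.
        paddr = int(self._key.decrypt(encrypted_paddr).decode())
        if paddr not in self._allowed:
            raise PermissionError("access revoked")
        return memory[paddr]
```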
  • Patent number: 10162757
    Abstract: A distributed shared-memory system includes several nodes that each have one or more processor cores, caches, local main memory, and a directory. Each node further includes predictors that use historical memory access information to predict future coherence permission requirements and speculatively initiate coherence operations. In one embodiment, predictors are included at processor cores for monitoring a memory access stream (e.g., historical sequence of memory addresses referenced by a processor core) and predicting addresses of future accesses. In another embodiment, predictors are included at the directory of each node for monitoring memory access traffic and coherence-related activities for individual cache lines to predict future demands for particular cache lines. In other embodiments, predictors are included at both the processor cores and directory of each node.
    Type: Grant
    Filed: December 6, 2016
    Date of Patent: December 25, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Yasuko Eckert
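
One simple form of the core-side predictor mentioned above is a stride detector over the memory access stream: when recent accesses show a constant stride, it predicts the next address so coherence permissions can be requested speculatively. This is a generic stand-in for the predictors described, not the patented design.

```python
class StridePredictor:
    """Predict the next referenced address from the last observed stride."""
    def __init__(self):
        self.last_addr = None
        self.last_stride = None

    def observe(self, addr):
        prediction = None
        if self.last_addr is not None:
            stride = addr - self.last_addr
            if stride == self.last_stride and stride != 0:
                prediction = addr + stride     # likely next access
            self.last_stride = stride
        self.last_addr = addr
        # Caller may speculatively initiate coherence operations for the prediction.
        return prediction
```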
  • Patent number: 10133672
    Abstract: Described is a system and method for efficient pointer chasing in systems having a single memory node or a network of memory nodes. In particular, a pointer chasing command is sent along with a memory request by an issuing node to a memory node. The pointer chasing command indicates the number of interdependent memory accesses and the information needed to perform them. An address computing unit associated with the memory node determines the relevant memory address for each interdependent memory access without further interaction with, or a return trip to, the issuing node.
    Type: Grant
    Filed: September 15, 2016
    Date of Patent: November 20, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Paula Aguilera Diez, Amin Farmahini-Farahani, Nuwan Jayasena
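
What the memory node's address computing unit does with such a command can be sketched as a loop of dependent loads executed locally. The flat dictionary memory, the depth/offset parameters, and the chain layout below are illustrative assumptions.

```python
def chase_pointers(memory, start_addr, depth, offset=0):
    """Follow 'depth' interdependent loads near memory, without returning to
    the issuing node between steps, then return the final value."""
    addr = start_addr
    for _ in range(depth):
        addr = memory[addr + offset]   # each address depends on the previous load
    return memory[addr]

# Example linked chain: 0 -> 10 -> 20, with the value stored at address 20.
mem = {0: 10, 10: 20, 20: 42}
print(chase_pointers(mem, 0, depth=2))   # -> 42
```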
  • Publication number: 20180276150
    Abstract: A data processing system includes a memory that includes a first memory bank and a second memory bank. The data processing system also includes a conflict detector connected to the memory and adapted to receive memory access information. The conflict detector tracks memory access statistics of the first memory bank, and determines if the first memory bank contains frequent row conflicts. The conflict detector also remaps a frequent row conflict in the first memory bank to the second memory bank. An indirection table is connected to the conflict detector and adapted to receive a memory access request, and redirects an address into a dynamically selected physical memory address in response to a remapping of the frequent row conflict to the second memory bank.
    Type: Application
    Filed: March 24, 2017
    Publication date: September 27, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Yasuko Eckert, Reena Panda, Nuwan Jayasena
  • Publication number: 20180239702
    Abstract: A cache coherence technique for operating a multi-processor system including shared memory includes allocating a cache line of a cache memory of a processor to a memory address in the shared memory in response to execution of an instruction of a program executing on the processor. The technique includes encoding a shared information state of the cache line to indicate whether the memory address is a shared memory address shared by the processor and a second processor, or a private memory address private to the processor, in response to whether the instruction is included in a critical section of the program, the critical section being a portion of the program that confines access to shared, writeable data.
    Type: Application
    Filed: February 23, 2017
    Publication date: August 23, 2018
    Inventors: Amin Farmahini Farahani, Nuwan Jayasena
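
The encoding rule in the entry above, tagging a newly allocated line as shared or private depending on whether the allocating instruction sits inside a critical section, can be illustrated as follows. The state names and the toy cache are assumptions; the patent's actual encoding may differ.

```python
def classify_cache_line(in_critical_section):
    """Critical sections confine access to shared, writeable data, so lines
    allocated there are tagged shared; all others are tagged private."""
    return "shared" if in_critical_section else "private"

class ToyCache:
    def __init__(self):
        self.lines = {}    # address -> {"state": ..., "data": ...}

    def allocate(self, addr, data, in_critical_section):
        # Coherence traffic can then be limited to lines whose state is "shared".
        self.lines[addr] = {"state": classify_cache_line(in_critical_section),
                            "data": data}
```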