Patents by Inventor Nuwan Jayasena

Nuwan Jayasena has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11055150
    Abstract: A thread holding a lock notifies a sleeping thread that is waiting on the lock that the lock holding thread is “about” to release the lock. In response to the notification, the waiting thread is woken up. While the waiting thread is woken up, the lock holding thread completes other operations prior to actually releasing the lock and then releases the lock. The notification to the waiting thread hides latency associated with waking up the waiting thread by allowing operations that wake up the waiting thread to occur while the lock holding thread is performing the other operations prior to releasing the thread.
    Type: Grant
    Filed: April 12, 2018
    Date of Patent: July 6, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Amin Farmahini-Farahani, David A. Roberts
  • Patent number: 11049794
    Abstract: Various circuit board embodiments are disclosed. In one aspect, an apparatus is provided that includes a circuit board and a first phase change material pocket positioned on or in the circuit board and contacting a surface of the circuit board.
    Type: Grant
    Filed: March 1, 2014
    Date of Patent: June 29, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Manish Arora, Nuwan Jayasena
  • Publication number: 20210182262
    Abstract: A method and apparatus perform a first hash operation on a first key wherein the first hash operation is biased to map the first key and associated value to a set of frequently-accessed buckets in a hash table. An entry for the first key and associated value is stored in the set of frequently-accessed buckets. A second hash operation is performed on a second key wherein the second hash operation is biased to map the second key and associated value to a set of less frequently-accessed buckets in the hash table. An entry for the second key and associated value is stored in the set of less frequently-accessed buckets. The method and apparatus perform a hash table look up of the requested key in the set of frequently-accessed buckets, if the requested key is not found, then a hash table lookup is performed in the set of less frequently-accessed buckets.
    Type: Application
    Filed: December 17, 2019
    Publication date: June 17, 2021
    Inventor: Nuwan Jayasena
  • Patent number: 11030117
    Abstract: A host processor receives an address translation request from an accelerator, which may be trusted or un-trusted. The address translation request includes a virtual address in a virtual address space that is shared by the host processor and the accelerator. The host processor encrypts a physical address in a host memory indicated by the virtual address in response to the accelerator being permitted to access the physical address. The host processor then provides the encrypted physical address to the accelerator. The accelerator provides memory access requests including the encrypted physical address to the host processor, which decrypts the physical address and selectively accesses a location in the host memory indicated by the decrypted physical address depending upon whether the accelerator is permitted to access the location indicated by the decrypted physical address.
    Type: Grant
    Filed: July 14, 2017
    Date of Patent: June 8, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Nuwan Jayasena, Brandon K. Potter, Andrew G. Kegel
  • Patent number: 10990453
    Abstract: A memory fence or other similar operation is executed with reduced latency. An early fence operation is executed and acts as a hint to the processor executing the thread that executes the fence. This hint causes the processor to begin performing sub-operations for the fence earlier than if no such hint were executed. Examples of sub-operations for the fence include operations to make data written to by writes prior to the fence operation available to other threads. A resolving fence, which occurs after the early fence, performs the remaining sub-operations for the fence. By triggering some or all of the sub-operations for a memory fence that will occur in the future, the early fence operation reduces the amount of latency associated with that memory fence operation.
    Type: Grant
    Filed: April 12, 2018
    Date of Patent: April 27, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amin Farmahini-Farahani, David A. Roberts, Nuwan Jayasena
  • Publication number: 20210117100
    Abstract: A hybrid mechanism for operating on a data item in connection with an associative structure combines first-fit and K-choice. The hybrid mechanism leverages advantages of both approaches by choosing whether to insert, retrieve, delete, or modify a data item using either first-fit or K-choice. Based on the data item, a function of the data item, and/or other factors such as the load statistics of the associative structure, one of either first-fit or K-choice is used to improve operation on the associative structure across a variety of different load states of the associative structure.
    Type: Application
    Filed: October 21, 2019
    Publication date: April 22, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Alexander D. Breslow, Nuwan Jayasena
  • Publication number: 20210117133
    Abstract: An approach is provided for implementing near-memory data reduction during store operations to off-chip or off-die memory. A Near-Memory Reduction (NMR) unit provides near-memory data reduction during write operations to a specified address range. The NMR unit is configured with a range of addresses to be reduced and when a store operation specifies an address within the range of addresses, the NRM unit performs data reduction by adding the data value specified by the store operation to an accumulated reduction result. According to an embodiment, the NRM unit maintains a count of the number of updates to the accumulated reduction result that are used to determine when data reduction has been completed.
    Type: Application
    Filed: October 21, 2019
    Publication date: April 22, 2021
    Inventors: Nuwan Jayasena, Shaizeen Aga
  • Patent number: 10956536
    Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: March 23, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
  • Publication number: 20210050864
    Abstract: A data processing platform, method, and program product perform compression and decompression of a set of data items. Suffix data and a prefix are selected for each respective data item in the set of data items based on data content of the respective data item. The set of data items is sorted based on the prefixes. The prefixes are encoded by querying multiple encoding tables to create a code word containing compressed information representing values of all prefixes for the set of data items. The code word and suffix data for each of the data items are stored in memory. The code word is decompressed to recover the prefixes. The recovered prefixes are paired with their respective suffix data.
    Type: Application
    Filed: August 16, 2019
    Publication date: February 18, 2021
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Alexander D. Breslow, Nuwan Jayasena, John Kalamatianos
  • Patent number: 10902087
    Abstract: A processing device is provided which includes memory and a processor comprising a plurality of processor cores in communication with each other via first and second hierarchical communication links. Each processor core in a group of the processor cores is in communication with each other via the first hierarchical communication links. Each processor core is configured to store, in the memory, one of a plurality of sub-portions of data of a first matrix, store, in the memory, one of a plurality of sub-portions of data of a second matrix, determine an outer product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core of the group of processor cores, another sub-portion of data of the second matrix and determine another outer product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: January 26, 2021
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
  • Patent number: 10853904
    Abstract: A processor employs a hierarchical register file for a graphics processing unit (GPU). A top level of the hierarchical register file is stored at a local memory of the GPU (e.g., a memory on the same integrated circuit die as the GPU). Lower levels of the hierarchical register file are stored at a different, larger memory, such as a remote memory located on a different die than the GPU. A register file control module monitors the status of in-flight wavefronts at the GPU, and in particular whether each in-flight wavefront is active, predicted to be become active, or inactive. The register file control module places execution data for active and predicted-active wavefronts in the top level of the hierarchical register file and places execution data for inactive wavefronts at lower levels of the hierarchical register file.
    Type: Grant
    Filed: March 24, 2016
    Date of Patent: December 1, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Yasuko Eckert, Nuwan Jayasena
  • Patent number: 10817422
    Abstract: In one form, a data processing system includes a host integrated circuit having a memory controller, a memory bus coupled to the memory controller, and a memory module. The memory module includes a bulk memory and a memory module scratchpad coupled to the bulk memory, wherein the memory module scratchpad has a lower access overhead than the bulk memory. The memory controller selectively provides predetermined commands over the memory bus to cause the memory module to copy data between the bulk memory and the memory module scratchpad without conducting data on the memory bus in response to a data movement decision.
    Type: Grant
    Filed: August 17, 2018
    Date of Patent: October 27, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Amin Farmahini Farahani, Michael Ignatowski
  • Patent number: 10802977
    Abstract: A processing system tracks counts of accesses to memory pages using a set of counters located at the memory module that stores the pages, wherein the counts are adjusted at least in part based on refreshes of the memory pages. This approach allows a processing system to efficiently maintain the counts with relatively small counters and with relatively low overhead. Furthermore, the rate at which the counters are adjusted, relative to the page refreshes, is adjustable, so that the access counts are useful for a wide variety of application types.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: October 13, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Georgios Mappouras, Amin Farmahini Farahani, Nuwan Jayasena
  • Patent number: 10782918
    Abstract: Methods, systems, and devices for near-memory data-dependent gathering and packing of data stored in a memory. A processing device extracts a function, a memory source address, and a memory destination address from a near-memory data-dependent gathering and packing primitive. A signal to perform gathering and packing operations based on the primitive is sent to near-memory processing circuitry of a memory device. The near-memory processing circuitry receives the signal, gathers data from the memory device based on the function and the memory source address, and packs the gathered data into the memory device based on the memory destination address.
    Type: Grant
    Filed: September 6, 2018
    Date of Patent: September 22, 2020
    Assignee: ADVANCED MICRO DEVICES, INC.
    Inventors: Shaizeen Aga, Nuwan Jayasena
  • Patent number: 10749545
    Abstract: A data storage system performs partial compression and decompression of a set of memory items. The memory items each include a data block and a tag with a prefix making up at least part of the tag. The memory items are ordered based on the prefixes. A code word is created containing compressed information representing values of the prefixes for the set of memory items. The code word and block data for each of the memory items are stored in a memory. The code word is decompressed to recover the prefixes.
    Type: Grant
    Filed: August 30, 2019
    Date of Patent: August 18, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexander D. Breslow, Nuwan Jayasena, John Kalamatianos
  • Publication number: 20200192809
    Abstract: A processing system tracks counts of accesses to memory pages using a set of counters located at the memory module that stores the pages, wherein the counts are adjusted at least in part based on refreshes of the memory pages. This approach allows a processing system to efficiently maintain the counts with relatively small counters and with relatively low overhead. Furthermore, the rate at which the counters are adjusted, relative to the page refreshes, is adjustable, so that the access counts are useful for a wide variety of application types.
    Type: Application
    Filed: December 12, 2018
    Publication date: June 18, 2020
    Inventors: Georgios MAPPOURAS, Amin FARMAHINI FARAHANI, Nuwan JAYASENA
  • Publication number: 20200133993
    Abstract: A processing device is provided which includes memory and a processor comprising a plurality of processor cores in communication with each other via first and second hierarchical communication links. Each processor core in a group of the processor cores is in communication with each other via the first hierarchical communication links. Each processor core is configured to store, in the memory, one of a plurality of sub-portions of data of a first matrix, store, in the memory, one of a plurality of sub-portions of data of a second matrix, determine an outer product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core of the group of processor cores, another sub-portion of data of the second matrix and determine another outer product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
    Type: Application
    Filed: October 31, 2018
    Publication date: April 30, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
  • Publication number: 20200133992
    Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
    Type: Application
    Filed: October 31, 2018
    Publication date: April 30, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
  • Publication number: 20200081651
    Abstract: Methods, systems, and devices for near-memory data-dependent gathering and packing of data stored in a memory. A processing device extracts a function, a memory source address, and a memory destination address from a near-memory data-dependent gathering and packing primitive. A signal to perform gathering and packing operations based on the primitive is sent to near-memory processing circuitry of a memory device. The near-memory processing circuitry receives the signal, gathers data from the memory device based on the function and the memory source address, and packs the gathered data into the memory device based on the memory destination address.
    Type: Application
    Filed: September 6, 2018
    Publication date: March 12, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shaizeen Aga, Nuwan Jayasena
  • Patent number: 10579557
    Abstract: A configurable computing system which uses near-memory and in-memory hardened logic blocks is described herein. The hardened logic blocks are incorporated into memory modules. The memory modules include an interface or communication logic to communicate between the configurable computing substrate and the memory module. In an implementation, the memory modules can include an on-die memory or other forms of non-configurable logic to enable more efficient processing for a variety of operations. In another implementation, the memory modules can include a portion of configurable computing substrate logic fabric to enable more efficient processing for a variety of operations. In another implementation, the memory modules can include an on-die memory and a portion of configurable computing substrate logic fabric to enable more efficient processing for a variety of operations.
    Type: Grant
    Filed: January 16, 2018
    Date of Patent: March 3, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Michael Ignatowski