Patents by Inventor Amin Farmahini-Farahani

Amin Farmahini-Farahani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

FAST THREAD WAKE-UP THROUGH EARLY LOCK RELEASE

Publication number: 20190317832

Abstract: A thread holding a lock notifies a sleeping thread that is waiting on the lock that the lock-holding thread is about to release the lock. In response to the notification, the waiting thread is woken up. While the waiting thread is being woken up, the lock-holding thread completes other operations and then releases the lock. The notification hides the latency associated with waking up the waiting thread by allowing the wake-up operations to occur while the lock-holding thread is performing the other operations prior to releasing the lock.

Type: Application

Filed: April 12, 2018

Publication date: October 17, 2019

Applicant: Advanced Micro Devices, Inc.

Inventors: Nuwan Jayasena, Amin Farmahini-Farahani, David A. Roberts
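A minimal sketch of the latency-hiding idea in the abstract above, not the claimed implementation. It assumes a simple timing model in which the holder still has `remaining_work` time units of work before it releases the lock and waking the waiter takes `wake_latency` units. The function name and both parameters are hypothetical.

```python
def handoff_latency(remaining_work: float, wake_latency: float, early_notify: bool) -> float:
    """Time until the waiter is awake and the lock is free (hypothetical model)."""
    if early_notify:
        # The wake-up starts immediately and overlaps the holder's remaining work.
        return max(remaining_work, wake_latency)
    # Conventional path: the wake-up starts only after the lock is released.
    return remaining_work + wake_latency


baseline = handoff_latency(5.0, 3.0, early_notify=False)
early = handoff_latency(5.0, 3.0, early_notify=True)
print(baseline, early)
```

Under this model the early notification saves the smaller of the two durations; the wake-up is fully hidden whenever the remaining work takes at least as long as the wake-up.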
TECHNIQUES FOR IMPROVED LATENCY OF THREAD SYNCHRONIZATION MECHANISMS

Publication number: 20190317831

Abstract: A memory fence or other similar operation is executed with reduced latency. An early fence operation is executed and acts as a hint to the processor executing the thread that executes the fence. This hint causes the processor to begin performing sub-operations for the fence earlier than if no such hint were executed. Examples of sub-operations for the fence include operations that make data written by writes preceding the fence operation available to other threads. A resolving fence, which occurs after the early fence, performs the remaining sub-operations for the fence. By triggering some or all of the sub-operations for a memory fence that will occur in the future, the early fence operation reduces the amount of latency associated with that memory fence operation.

Type: Application

Filed: April 12, 2018

Publication date: October 17, 2019

Applicant: Advanced Micro Devices, Inc.

Inventors: Amin Farmahini-Farahani, David A. Roberts, Nuwan Jayasena
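The early-fence and resolving-fence pairing described in the abstract above can be illustrated with a toy model. This is a sketch under stated assumptions, not the patented mechanism: it assumes each pending write takes one cycle to become visible, and the class `FenceModel`, its methods, and `stall_cycles` are hypothetical names.

```python
class FenceModel:
    """Toy store buffer: each pending write takes one cycle to drain (assumption)."""

    def __init__(self, pending_writes: int) -> None:
        self.pending = pending_writes
        self.draining = False

    def early_fence(self) -> None:
        # Hint: start making earlier writes visible to other threads now.
        self.draining = True

    def tick(self, cycles: int = 1) -> None:
        # Other work proceeds; the buffer drains only if the hint was given.
        if self.draining:
            self.pending = max(0, self.pending - cycles)

    def resolving_fence(self) -> int:
        # Perform whatever sub-operations remain and return the stall in cycles.
        stall = self.pending
        self.pending = 0
        self.draining = False
        return stall


def stall_cycles(pending_writes: int, work_cycles: int, use_early_fence: bool) -> int:
    """Stall seen at the resolving fence after `work_cycles` of unrelated work."""
    model = FenceModel(pending_writes)
    if use_early_fence:
        model.early_fence()
    model.tick(work_cycles)
    return model.resolving_fence()


print(stall_cycles(10, 6, False), stall_cycles(10, 6, True))
```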
High-performance on-module caching architectures for non-volatile dual in-line memory module (NVDIMM)

Patent number: 10431305

Abstract: A high-performance on-module caching architecture for hybrid memory modules is provided. A hybrid memory module includes a cache controller, a first volatile memory coupled to the cache controller, a first multiplexing data buffer coupled to the first volatile memory and the cache controller, and a first non-volatile memory coupled to the first multiplexing data buffer and the cache controller, wherein the first multiplexing data buffer multiplexes data between the first volatile memory and the first non-volatile memory and wherein the cache controller enables a tag checking operation to occur in parallel with a data movement operation. The hybrid memory module includes a volatile memory tag unit coupled to the cache controller, wherein the volatile memory tag unit includes a line connection that allows the cache controller to store a plurality of tags in the volatile memory tag unit and retrieve the plurality of tags from the volatile memory tag unit.

Type: Grant

Filed: December 14, 2017

Date of Patent: October 1, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Amin Farmahini Farahani, David A. Roberts
HIGH-PERFORMANCE ON-MODULE CACHING ARCHITECTURES FOR NON-VOLATILE DUAL IN-LINE MEMORY MODULE (NVDIMM)

Publication number: 20190189210

Abstract: A high-performance on-module caching architecture for hybrid memory modules is provided. A hybrid memory module includes a cache controller, a first volatile memory coupled to the cache controller, a first multiplexing data buffer coupled to the first volatile memory and the cache controller, and a first non-volatile memory coupled to the first multiplexing data buffer and the cache controller, wherein the first multiplexing data buffer multiplexes data between the first volatile memory and the first non-volatile memory and wherein the cache controller enables a tag checking operation to occur in parallel with a data movement operation. The hybrid memory module includes a volatile memory tag unit coupled to the cache controller, wherein the volatile memory tag unit includes a line connection that allows the cache controller to store a plurality of tags in the volatile memory tag unit and retrieve the plurality of tags from the volatile memory tag unit.

Type: Application

Filed: December 14, 2017

Publication date: June 20, 2019

Inventors: Amin Farmahini Farahani, David A. Roberts
System and method for dynamically allocating memory to hold pending write requests

Patent number: 10310997

Abstract: A processing system employs a memory module as a temporary write buffer to store write requests when a write buffer at a memory controller reaches a threshold capacity, and de-allocates the temporary write buffer when the write buffer capacity falls below the threshold. Upon receiving a write request, the memory controller stores the write request in a write buffer until the write request can be written to main memory. The memory controller can temporarily extend the memory controller's write buffer to the memory module, thereby accommodating temporary periods of high memory activity without requiring a large permanent write buffer at the memory controller.

Type: Grant

Filed: September 22, 2016

Date of Patent: June 4, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Amin Farmahini Farahani, David A. Roberts, Nuwan Jayasena
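A minimal sketch of the overflow behavior described in the abstract above, assuming a FIFO write buffer and a threshold counted in requests. The class `ExtendableWriteBuffer` and its methods are hypothetical, and the temporary buffer in the memory module is modeled as a second queue.

```python
from collections import deque


class ExtendableWriteBuffer:
    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.local = deque()   # write buffer at the memory controller
        self.overflow = None   # temporary buffer in the memory module, if allocated

    def enqueue(self, request) -> None:
        # While a temporary buffer exists, keep appending to it to preserve FIFO order.
        if self.overflow is not None or len(self.local) >= self.threshold:
            if self.overflow is None:
                self.overflow = deque()  # allocate the temporary buffer
            self.overflow.append(request)
        else:
            self.local.append(request)

    def drain_one(self):
        """Write the oldest request to main memory and return it."""
        request = self.local.popleft()
        if self.overflow:
            self.local.append(self.overflow.popleft())  # refill from the spill area
        if self.overflow is not None and not self.overflow and len(self.local) < self.threshold:
            self.overflow = None  # de-allocate once occupancy falls below the threshold
        return request


buf = ExtendableWriteBuffer(threshold=2)
for req in "abcd":
    buf.enqueue(req)
spilled = buf.overflow is not None
drained = [buf.drain_one() for _ in range(4)]
```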
Logical memory address regions

Patent number: 10255191

Abstract: Systems, apparatuses, and methods for implementing logical memory address regions in a computing system. The physical memory address space of a computing system may be partitioned into a plurality of logical memory address regions. Each logical memory address region may be dynamically configured at run-time to meet changing application needs of the system. Each logical memory address region may also be configured separately from the other logical memory address regions. Each logical memory address region may have associated parameters that identify region start address, region size, cell-level mode, physical-to-device mapping scheme, address masks, access permissions, wear-leveling data, encryption settings, and compression settings. These parameters may be stored in a table which may be used when processing memory access requests.

Type: Grant

Filed: April 19, 2016

Date of Patent: April 9, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Amin Farmahini-Farahani, David A. Roberts
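The region table mentioned in the abstract above might be sketched as follows. This is an illustration only: it keeps a small subset of the listed parameters, and the names `Region`, `RegionTable`, `lookup`, and `check_access`, as well as the example mode strings, are assumptions rather than anything stated in the patent.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int
    size: int
    cell_level_mode: str  # illustrative values such as "SLC" or "MLC"
    writable: bool
    encrypted: bool


class RegionTable:
    def __init__(self, regions) -> None:
        self.regions = sorted(regions, key=lambda r: r.start)
        self.starts = [r.start for r in self.regions]

    def lookup(self, address: int) -> Region:
        # Find the last region that starts at or before the address.
        i = bisect_right(self.starts, address) - 1
        if i >= 0:
            region = self.regions[i]
            if address < region.start + region.size:
                return region
        raise LookupError(f"address {address:#x} is not in any region")

    def check_access(self, address: int, is_write: bool) -> bool:
        # Consult the region's permissions when processing a memory request.
        region = self.lookup(address)
        return region.writable or not is_write


table = RegionTable([
    Region(0x0000, 0x1000, "SLC", writable=True, encrypted=False),
    Region(0x1000, 0x1000, "MLC", writable=False, encrypted=True),
])
```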
Instruction set and micro-architecture supporting asynchronous memory access

Patent number: 10209991

Abstract: A system and method for reducing latencies of main memory data accesses are described. A non-blocking load (NBLD) instruction identifies an address of requested data and a subroutine. The subroutine includes instructions dependent on the requested data. A processing unit verifies that address translations are available for both the address and the subroutine. The processing unit continues processing instructions with no stalls caused by younger-in-program-order instructions waiting for the requested data. The non-blocking load unit performs a cache coherent data read request on behalf of the NBLD instruction and requests that the processing unit perform an asynchronous jump to the subroutine upon return of the requested data from lower-level memory.

Type: Grant

Filed: November 16, 2016

Date of Patent: February 19, 2019

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Meenakshi Sundaram Bhaskaran, Elliot H. Mednick, David A. Roberts, Anthony Asaro, Amin Farmahini-Farahani
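A software analogy for the non-blocking load described in the abstract above, offered as a sketch rather than the hardware design. It assumes memory responds to requests in issue order; the class `NonBlockingLoadUnit` and its methods are hypothetical, and the "asynchronous jump" is modeled as a callback.

```python
from collections import deque


class NonBlockingLoadUnit:
    def __init__(self, memory: dict) -> None:
        self.memory = memory
        self.in_flight = deque()  # (address, subroutine) pairs awaiting data

    def nbld(self, address: int, subroutine) -> None:
        # Issue the load and return immediately, so the core does not stall.
        self.in_flight.append((address, subroutine))

    def data_returned(self) -> None:
        # Data arrives for the oldest request; "jump" to its subroutine.
        address, subroutine = self.in_flight.popleft()
        subroutine(self.memory[address])


log = []
unit = NonBlockingLoadUnit({0x100: 42})
unit.nbld(0x100, lambda value: log.append(("subroutine", value)))
log.append(("independent work", None))  # runs before the data comes back
unit.data_returned()
```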
System and method for efficient pointer chasing

Patent number: 10133672

Abstract: Described is a system and method for efficient pointer chasing in systems having a single memory node or a network of memory nodes. In particular, a pointer chasing command is sent along with a memory request by an issuing node to a memory node. The pointer chasing command indicates the number of interdependent memory accesses and information needed for the identified interdependent memory accesses. An address computing unit associated with the memory node determines the relevant memory address for an interdependent memory access absent further interaction with the issuing node or without having to return to the issuing node.

Type: Grant

Filed: September 15, 2016

Date of Patent: November 20, 2018

Assignee: Advanced Micro Devices, Inc.

Inventors: Paula Aguilera Diez, Amin Farmahini-Farahani, Nuwan Jayasena
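The pointer chasing command in the abstract above can be pictured with a short sketch. It assumes a single memory node modeled as a dictionary and a linked structure whose `next` field sits at a fixed offset; the function `chase` and its parameters are hypothetical.

```python
def chase(memory: dict, start_address: int, num_accesses: int, next_offset: int) -> int:
    """Follow `num_accesses` dependent loads entirely at the memory node."""
    address = start_address
    for _ in range(num_accesses):
        # Address computing unit: the next address is read from the current node,
        # without a round trip to the issuing node.
        address = memory[address + next_offset]
    return address


# Linked list 0x10 -> 0x20 -> 0x30, with the payload at offset 0 and `next` at offset 1.
memory = {0x10: "a", 0x11: 0x20, 0x20: "b", 0x21: 0x30, 0x30: "c", 0x31: 0}
final_address = chase(memory, 0x10, num_accesses=2, next_offset=1)
payload = memory[final_address]
```

The issuing node sends one command and receives one result, instead of one request and response per hop.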
COMPILER-ASSISTED INTER-SIMD-GROUP REGISTER SHARING

Publication number: 20180275991

Abstract: Systems, apparatuses, and methods for efficiently sharing registers among threads are disclosed. A system includes at least a processor, control logic, and a register file with a plurality of registers. The processor assigns a base set of registers to each thread of a plurality of threads executing on the processor. When a given thread needs more than the base set of registers to execute a given phase of program code, the given thread executes an acquire instruction to acquire exclusive access to an extended set of registers from a shared register pool. When the given thread no longer needs additional registers, the given thread executes a release instruction to release the extended set of registers back into the shared register pool for other threads to use. In one implementation, the compiler inserts acquire and release instructions into the program code based on a register liveness analysis performed during compilation.

Type: Application

Filed: March 26, 2018

Publication date: September 27, 2018

Inventors: Farzad Khorasani, Amin Farmahini-Farahani, Nuwan S. Jayasena
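A minimal sketch of the acquire and release semantics described in the abstract above. It assumes the shared pool holds a fixed number of extended register sets and that a failed acquire means the thread must wait and retry; the class `SharedRegisterPool` and its methods are hypothetical.

```python
class SharedRegisterPool:
    def __init__(self, num_extended_sets: int) -> None:
        self.free = list(range(num_extended_sets))
        self.owner = {}  # thread id -> extended set id

    def acquire(self, thread_id: int) -> bool:
        """Return True on success; False means the thread must wait and retry."""
        if thread_id in self.owner:
            return True  # already holds an extended set
        if not self.free:
            return False
        self.owner[thread_id] = self.free.pop()
        return True

    def release(self, thread_id: int) -> None:
        # Return the extended set to the pool for other threads to use.
        self.free.append(self.owner.pop(thread_id))


pool = SharedRegisterPool(num_extended_sets=1)
first = pool.acquire(thread_id=0)
second = pool.acquire(thread_id=1)   # pool exhausted while thread 0 holds the set
pool.release(thread_id=0)
retry = pool.acquire(thread_id=1)
```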
LOCALITY-AWARE AND SHARING-AWARE CACHE COHERENCE FOR COLLECTIONS OF PROCESSORS

Publication number: 20180239702

Abstract: A cache coherence technique for operating a multi-processor system including shared memory includes allocating a cache line of a cache memory of a processor to a memory address in the shared memory in response to execution of an instruction of a program executing on the processor. The technique includes encoding a shared information state of the cache line to indicate whether the memory address is a shared memory address shared by the processor and a second processor, or a private memory address private to the processor, in response to whether the instruction is included in a critical section of the program, the critical section being a portion of the program that confines access to shared, writeable data.

Type: Application

Filed: February 23, 2017

Publication date: August 23, 2018

Inventors: Amin Farmahini Farahani, Nuwan Jayasena
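The state encoding described in the abstract above can be sketched in a few lines. This is an illustration under the assumption that the sharing state is a single two-valued field set at allocation time; `Sharing` and `allocate_line` are hypothetical names.

```python
from enum import Enum


class Sharing(Enum):
    PRIVATE = 0
    SHARED = 1


def allocate_line(cache: dict, address: int, in_critical_section: bool) -> Sharing:
    """Allocate a line and tag it based on where the allocating instruction executes."""
    # Accesses inside a critical section touch shared, writeable data by definition.
    state = Sharing.SHARED if in_critical_section else Sharing.PRIVATE
    cache[address] = state
    return state


cache = {}
allocate_line(cache, 0x40, in_critical_section=True)
allocate_line(cache, 0x80, in_critical_section=False)
```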
DYNAMIC CACHE BYPASSING

Publication number: 20180165214

Abstract: A processing system fills a memory access request for data from a processor core by bypassing a cache when a write congestion condition is detected, and when transferring the data to the cache would cause eviction of a dirty cache line. The cache is bypassed by transferring the requested data to the processor core or to a different cache. Accordingly, the processing system can temporarily bypass the cache storing the dirty cache line when filling a memory access request, thereby avoiding the eviction and write back to main memory of a dirty cache line when a write congestion condition exists.

Type: Application

Filed: December 13, 2016

Publication date: June 14, 2018

Inventors: Amin Farmahini Farahani, David A. Roberts
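The bypass condition in the abstract above reduces to a conjunction of two checks. The sketch below assumes both conditions are already available as booleans; the function `fill_destination` and its return strings are hypothetical.

```python
def fill_destination(write_congested: bool, victim_is_dirty: bool) -> str:
    """Decide where the data for a memory access request should be placed."""
    if write_congested and victim_is_dirty:
        # Filling the cache would evict a dirty line and add a write-back
        # to an already congested write path, so skip this cache.
        return "bypass"
    return "cache"


decisions = {
    (congested, dirty): fill_destination(congested, dirty)
    for congested in (False, True)
    for dirty in (False, True)
}
```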
SYSTEM AND METHOD FOR DYNAMICALLY ALLOCATING MEMORY AT A MEMORY CONTROLLER

Publication number: 20180081590

Abstract: A processing system employs a memory module as a temporary write buffer to store write requests when a write buffer at a memory controller reaches a threshold capacity, and de-allocates the temporary write buffer when the write buffer capacity falls below the threshold. Upon receiving a write request, the memory controller stores the write request in a write buffer until the write request can be written to main memory. The memory controller can temporarily extend the memory controller's write buffer to the memory module, thereby accommodating temporary periods of high memory activity without requiring a large permanent write buffer at the memory controller.

Type: Application

Filed: September 22, 2016

Publication date: March 22, 2018

Inventors: Amin Farmahini Farahani, David A. Roberts, Nuwan Jayasena
DYNAMIC ADAPTATION OF MEMORY PAGE MANAGEMENT POLICY

Publication number: 20180074715

Abstract: Systems, apparatuses, and methods for determining preferred memory page management policies by software are disclosed. Software executing on one or more processing units generates a memory request. Software determines the preferred page management policy for the memory request based at least in part on the data access size and data access pattern of the memory request. Software conveys an indication of a preferred page management policy to a memory controller. Then, the memory controller accesses memory for the memory request using the preferred page management policy specified by software.

Type: Application

Filed: September 13, 2016

Publication date: March 15, 2018

Inventors: Amin Farmahini-Farahani, Alexander D. Breslow, Nuwan S. Jayasena
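One way software might derive the policy hint described in the abstract above is sketched here. The heuristic itself is an assumption, not taken from the patent: it prefers an open-page policy when an access is sequential and spans more than one burst, and a closed-page policy otherwise. The function name, the pattern strings, and the 64-byte burst size are all hypothetical.

```python
def preferred_page_policy(access_size: int, pattern: str, burst_bytes: int = 64) -> str:
    """Return the policy hint that software would convey to the memory controller."""
    if pattern == "sequential" and access_size > burst_bytes:
        # Follow-up bursts are likely to hit the same row, so keep it open.
        return "open"
    # Small or irregular accesses rarely reuse the row, so close it early.
    return "closed"


hints = [
    preferred_page_policy(4096, "sequential"),
    preferred_page_policy(64, "sequential"),
    preferred_page_policy(4096, "random"),
]
```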
SYSTEM AND METHOD FOR EFFICIENT POINTER CHASING

Publication number: 20180074965

Abstract: Described is a system and method for efficient pointer chasing in systems having a single memory node or a network of memory nodes. In particular, a pointer chasing command is sent along with a memory request by an issuing node to a memory node. The pointer chasing command indicates the number of interdependent memory accesses and information needed for the identified interdependent memory accesses. An address computing unit associated with the memory node determines the relevant memory address for an interdependent memory access absent further interaction with the issuing node or without having to return to the issuing node.

Type: Application

Filed: September 15, 2016

Publication date: March 15, 2018

Applicant: Advanced Micro Devices, Inc.

Inventors: Paula Aguilera Diez, Amin Farmahini-Farahani, Nuwan Jayasena
DYNAMIC WRITE LATENCY FOR MEMORY CONTROLLER USING DATA PATTERN EXTRACTION

Publication number: 20180018104

Abstract: Methods and apparatus for dynamically determining a variable reset latency time based on a data pattern of the data to be written into memory are disclosed. A memory controller determines a variable reset latency time for a plurality of memory cells depending on the bit values to be written into the plurality of memory cells in response to a write request having corresponding bit values. A write latency for the plurality of memory cells is dependent on the bit values being written into the plurality of memory cells. The memory controller writes the bit values of the write request to the plurality of memory cells within the determined variable reset latency time.

Type: Application

Filed: July 15, 2016

Publication date: January 18, 2018

Inventors: Amin Farmahini Farahani, Benjamin Y. Cho, Nuwan Jayasena
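A minimal sketch of data-dependent write latency as described in the abstract above. It assumes each bit value has its own cell-level latency and that a group of cells written together finishes when its slowest cell does. The latency figures in `LATENCY_NS` are invented for illustration, and `write_latency_ns` is a hypothetical name.

```python
# Hypothetical per-bit-value latencies in nanoseconds (not from the patent).
LATENCY_NS = {0: 50, 1: 150}


def write_latency_ns(bits) -> int:
    """Latency for writing a non-empty group of cells in parallel."""
    # The group completes when its slowest cell completes.
    return max(LATENCY_NS[b] for b in bits)


fast = write_latency_ns([0, 0, 0, 0])
slow = write_latency_ns([0, 1, 0, 0])
```

A controller using a fixed latency would always wait for the worst case; the pattern-dependent version only does so when a slow bit value is actually present.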
INSTRUCTION SET AND MICRO-ARCHITECTURE SUPPORTING ASYNCHRONOUS MEMORY ACCESS

Publication number: 20170212760

Abstract: A system and method for reducing latencies of main memory data accesses are described. A non-blocking load (NBLD) instruction identifies an address of requested data and a subroutine. The subroutine includes instructions dependent on the requested data. A processing unit verifies that address translations are available for both the address and the subroutine. The processing unit continues processing instructions with no stalls caused by younger-in-program-order instructions waiting for the requested data. The non-blocking load unit performs a cache coherent data read request on behalf of the NBLD instruction and requests that the processing unit perform an asynchronous jump to the subroutine upon return of the requested data from lower-level memory.

Type: Application

Filed: November 16, 2016

Publication date: July 27, 2017

Inventors: Meenakshi Sundaram Bhaskaran, Elliot H. Mednick, David A. Roberts, Anthony Asaro, Amin Farmahini-Farahani
DISTRIBUTED GATHER/SCATTER OPERATIONS ACROSS A NETWORK OF MEMORY NODES

Publication number: 20170048320

Abstract: Devices, methods, and systems for distributed gather and scatter operations in a network of memory nodes. A responding memory node includes a memory; a communications interface having circuitry configured to communicate with at least one other memory node; and a controller. The controller includes circuitry configured to receive a request message from a requesting node via the communications interface. The request message indicates a gather or scatter operation, and instructs the responding node to retrieve data elements from a source memory data structure and store the data elements to a destination memory data structure. The controller further includes circuitry configured to transmit a response message to the requesting node via the communications interface. The response message indicates that the data elements have been stored into the destination memory data structure.

Type: Application

Filed: July 27, 2016

Publication date: February 16, 2017

Applicant: Advanced Micro Devices, Inc.

Inventors: Amin Farmahini-Farahani, David A. Roberts
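The request and response exchange in the abstract above can be sketched with a single responding node. This is an illustration only: memory is modeled as a flat list, and the message fields (`op`, `src_base`, `dst_base`, `indices`, `status`, `stored`) and the function `handle_request` are assumptions.

```python
def handle_request(memory: list, request: dict) -> dict:
    """Perform a gather or scatter locally and return a completion message."""
    src, dst, indices = request["src_base"], request["dst_base"], request["indices"]
    if request["op"] == "gather":
        # Collect scattered source elements into a dense destination.
        for i, idx in enumerate(indices):
            memory[dst + i] = memory[src + idx]
    elif request["op"] == "scatter":
        # Spread dense source elements to indexed destination slots.
        for i, idx in enumerate(indices):
            memory[dst + idx] = memory[src + i]
    else:
        raise ValueError(f"unknown operation: {request['op']}")
    return {"status": "done", "stored": len(indices)}


gather_mem = [10, 11, 12, 13, 0, 0, 0, 0]
gather_resp = handle_request(
    gather_mem, {"op": "gather", "src_base": 0, "dst_base": 4, "indices": [3, 1]}
)

scatter_mem = [7, 8, 0, 0, 0, 0]
scatter_resp = handle_request(
    scatter_mem, {"op": "scatter", "src_base": 0, "dst_base": 2, "indices": [3, 0]}
)
```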
LOGICAL MEMORY ADDRESS REGIONS

Publication number: 20170046272

Abstract: Systems, apparatuses, and methods for implementing logical memory address regions in a computing system. The physical memory address space of a computing system may be partitioned into a plurality of logical memory address regions. Each logical memory address region may be dynamically configured at run-time to meet changing application needs of the system. Each logical memory address region may also be configured separately from the other logical memory address regions. Each logical memory address region may have associated parameters that identify region start address, region size, cell-level mode, physical-to-device mapping scheme, address masks, access permissions, wear-leveling data, encryption settings, and compression settings. These parameters may be stored in a table which may be used when processing memory access requests.

Type: Application

Filed: April 19, 2016

Publication date: February 16, 2017

Inventors: Amin Farmahini-Farahani, David A. Roberts
