Patents by Inventor Amin Farmahini

Amin Farmahini has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11775799
    Abstract: Systems, apparatuses, and methods for managing buffers in a neural network implementation with heterogeneous memory are disclosed. A system includes a neural network coupled to a first memory and a second memory. The first memory is a relatively low-capacity, high-bandwidth memory while the second memory is a relatively high-capacity, low-bandwidth memory. During a forward propagation pass of the neural network, a run-time manager monitors the usage of the buffers for the various layers of the neural network. During a backward propagation pass of the neural network, the run-time manager determines how to move the buffers between the first and second memories based on the monitored buffer usage during the forward propagation pass. As a result, the run-time manager is able to reduce memory access latency for the layers of the neural network during the backward propagation pass.
    Type: Grant
    Filed: November 19, 2018
    Date of Patent: October 3, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Georgios Mappouras, Amin Farmahini-Farahani, Sudhanva Gurumurthi, Abhinav Vishnu, Gabriel H. Loh
  • Patent number: 11748028
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing data on a memory controller. One of the methods comprises obtaining a first request and a second request to access respective data corresponding to the first and second requests at a first memory device of the plurality of memory devices; and initiating interleaved processing of the respective data; receiving an indication to stop processing requests to access data at the first memory device and to initiate processing requests to access data at a second memory device, determining that the respective data corresponding to the first and second requests have not yet been fully processed at the time of receiving the indication, and in response, storing, in memory accessible to the memory controller, data corresponding to the requests which have not yet been fully processed.
    Type: Grant
    Filed: November 23, 2022
    Date of Patent: September 5, 2023
    Assignee: Google LLC
    Inventors: Amin Farmahini, Benjamin Steel Gelb, Gurushankar Rajamani, Sukalpa Biswas
  • Patent number: 11537397
    Abstract: Systems, apparatuses, and methods for efficiently sharing registers among threads are disclosed. A system includes at least a processor, control logic, and a register file with a plurality of registers. The processor assigns a base set of registers to each thread of a plurality of threads executing on the processor. When a given thread needs more than the base set of registers to execute a given phase of program code, the given thread executes an acquire instruction to acquire exclusive access to an extended set of registers from a shared resource pool. When the given thread no longer needs additional registers, the given thread executes a release instruction to release the extended set of registers back into the shared register pool for other threads to use. In one implementation, the compiler inserts acquire and release instructions into the program code based on a register liveness analysis performed during compilation.
    Type: Grant
    Filed: March 26, 2018
    Date of Patent: December 27, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Farzad Khorasani, Amin Farmahini-Farahani, Nuwan S. Jayasena
  • Patent number: 11513724
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing data on a memory controller. One of the methods comprises obtaining a first request and a second request to access respective data corresponding to the first and second requests at a first memory device of the plurality of memory devices; and initiating interleaved processing of the respective data; receiving an indication to stop processing requests to access data at the first memory device and to initiate processing requests to access data at a second memory device, determining that the respective data corresponding to the first and second requests have not yet been fully processed at the time of receiving the indication, and in response, storing, in memory accessible to the memory controller, data corresponding to the requests which have not yet been fully processed.
    Type: Grant
    Filed: June 15, 2021
    Date of Patent: November 29, 2022
    Assignee: Google LLC
    Inventors: Amin Farmahini, Benjamin Steel Gelb, Gurushankar Rajamani, Sukalpa Biswas
  • Patent number: 11507641
    Abstract: Techniques for performing in-memory matrix multiplication, taking into account temperature variations in the memory, are disclosed. In one example, the matrix multiplication memory uses ohmic multiplication and current summing to perform the dot products involved in matrix multiplication. One downside to this analog form of multiplication is that temperature affects the accuracy of the results. Thus techniques are provided herein to compensate for the effects of temperature increases on the accuracy of in-memory matrix multiplications. According to the techniques, portions of input matrices are classified as effective or ineffective. Effective portions are mapped to low temperature regions of the in-memory matrix multiplier and ineffective portions are mapped to high temperature regions of the in-memory matrix multiplier. The matrix multiplication is then performed.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: November 22, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Majed Valad Beigi, Amin Farmahini-Farahani, Sudhanva Gurumurthi
  • Patent number: 11494087
    Abstract: Memory management circuitry and processes operate to improve reliability of a group of memory stacks, providing that if a memory stack or a portion thereof fails during the product's lifetime, the system may still recover with no errors or data loss. A front-end controller receives a block of data requested to be written to memory, divides the block into sub-blocks, and creates a new redundant reliability sub-block. The sub-blocks are then written to different memory stacks. When reading data from the memory stacks, the front-end controller detects errors indicating a failure within one of the memory stacks, and recovers corrected data using the reliability sub-block. The front-end controller may monitor errors for signs of a stack failure and disable the failed stack.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: November 8, 2022
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Georgios Mappouras, Amin Farmahini Farahani, Michael Ignatowski
  • Publication number: 20210311658
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing data on a memory controller. One of the methods comprises obtaining a first request and a second request to access respective data corresponding to the first and second requests at a first memory device of the plurality of memory devices; and initiating interleaved processing of the respective data; receiving an indication to stop processing requests to access data at the first memory device and to initiate processing requests to access data at a second memory device, determining that the respective data corresponding to the first and second requests have not yet been fully processed at the time of receiving the indication, and in response, storing, in memory accessible to the memory controller, data corresponding to the requests which have not yet been fully processed.
    Type: Application
    Filed: June 15, 2021
    Publication date: October 7, 2021
    Inventors: Amin Farmahini, Benjamin Steel Gelb, Gurushankar Rajamani, Sukalpa Biswas
  • Patent number: 11137936
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing data on a memory controller. One of the methods comprises obtaining a first request and a second request to access respective data corresponding to the first and second requests at a first memory device of the plurality of memory devices; and initiating interleaved processing of the respective data; receiving an indication to stop processing requests to access data at the first memory device and to initiate processing requests to access data at a second memory device, determining that the respective data corresponding to the first and second requests have not yet been fully processed at the time of receiving the indication, and in response, storing, in memory accessible to the memory controller, data corresponding to the requests which have not yet been fully processed.
    Type: Grant
    Filed: July 15, 2020
    Date of Patent: October 5, 2021
    Assignee: Google LLC
    Inventors: Amin Farmahini, Benjamin Steel Gelb, Gurushankar Rajamani, Sukalpa Biswas
  • Patent number: 11119923
    Abstract: A cache coherence technique for operating a multi-processor system including shared memory includes allocating a cache line of a cache memory of a processor to a memory address in the shared memory in response to execution of an instruction of a program executing on the processor. The technique includes encoding a shared information state of the cache line to indicate whether the memory address is a shared memory address shared by the processor and a second processor, or a private memory address private to the processor, in response to whether the instruction is included in a critical section of the program, the critical section being a portion of the program that confines access to shared, writeable data.
    Type: Grant
    Filed: February 23, 2017
    Date of Patent: September 14, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amin Farmahini Farahani, Nuwan Jayasena
  • Publication number: 20210223985
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing data on a memory controller. One of the methods comprises obtaining a first request and a second request to access respective data corresponding to the first and second requests at a first memory device of the plurality of memory devices; and initiating interleaved processing of the respective data; receiving an indication to stop processing requests to access data at the first memory device and to initiate processing requests to access data at a second memory device, determining that the respective data corresponding to the first and second requests have not yet been fully processed at the time of receiving the indication, and in response, storing, in memory accessible to the memory controller, data corresponding to the requests which have not yet been fully processed.
    Type: Application
    Filed: July 15, 2020
    Publication date: July 22, 2021
    Inventors: Amin Farmahini, Benjamin Steel Gelb, Gurushankar Rajamani, Sukalpa Biswas
  • Patent number: 11055150
    Abstract: A thread holding a lock notifies a sleeping thread that is waiting on the lock that the lock holding thread is “about” to release the lock. In response to the notification, the waiting thread is woken up. While the waiting thread is woken up, the lock holding thread completes other operations prior to actually releasing the lock and then releases the lock. The notification to the waiting thread hides latency associated with waking up the waiting thread by allowing operations that wake up the waiting thread to occur while the lock holding thread is performing the other operations prior to releasing the thread.
    Type: Grant
    Filed: April 12, 2018
    Date of Patent: July 6, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Amin Farmahini-Farahani, David A. Roberts
  • Patent number: 10990453
    Abstract: A memory fence or other similar operation is executed with reduced latency. An early fence operation is executed and acts as a hint to the processor executing the thread that executes the fence. This hint causes the processor to begin performing sub-operations for the fence earlier than if no such hint were executed. Examples of sub-operations for the fence include operations to make data written to by writes prior to the fence operation available to other threads. A resolving fence, which occurs after the early fence, performs the remaining sub-operations for the fence. By triggering some or all of the sub-operations for a memory fence that will occur in the future, the early fence operation reduces the amount of latency associated with that memory fence operation.
    Type: Grant
    Filed: April 12, 2018
    Date of Patent: April 27, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amin Farmahini-Farahani, David A. Roberts, Nuwan Jayasena
  • Publication number: 20200380063
    Abstract: Techniques for performing in-memory matrix multiplication, taking into account temperature variations in the memory, are disclosed. In one example, the matrix multiplication memory uses ohmic multiplication and current summing to perform the dot products involved in matrix multiplication. One downside to this analog form of multiplication is that temperature affects the accuracy of the results. Thus techniques are provided herein to compensate for the effects of temperature increases on the accuracy of in-memory matrix multiplications. According to the techniques, portions of input matrices are classified as effective or ineffective. Effective portions are mapped to low temperature regions of the in-memory matrix multiplier and ineffective portions are mapped to high temperature regions of the in-memory matrix multiplier. The matrix multiplication is then performed.
    Type: Application
    Filed: May 31, 2019
    Publication date: December 3, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Majed Valad Beigi, Amin Farmahini-Farahani, Sudhanva Gurumurthi
  • Patent number: 10817422
    Abstract: In one form, a data processing system includes a host integrated circuit having a memory controller, a memory bus coupled to the memory controller, and a memory module. The memory module includes a bulk memory and a memory module scratchpad coupled to the bulk memory, wherein the memory module scratchpad has a lower access overhead than the bulk memory. The memory controller selectively provides predetermined commands over the memory bus to cause the memory module to copy data between the bulk memory and the memory module scratchpad without conducting data on the memory bus in response to a data movement decision.
    Type: Grant
    Filed: August 17, 2018
    Date of Patent: October 27, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Nuwan Jayasena, Amin Farmahini Farahani, Michael Ignatowski
  • Patent number: 10802977
    Abstract: A processing system tracks counts of accesses to memory pages using a set of counters located at the memory module that stores the pages, wherein the counts are adjusted at least in part based on refreshes of the memory pages. This approach allows a processing system to efficiently maintain the counts with relatively small counters and with relatively low overhead. Furthermore, the rate at which the counters are adjusted, relative to the page refreshes, is adjustable, so that the access counts are useful for a wide variety of application types.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: October 13, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Georgios Mappouras, Amin Farmahini Farahani, Nuwan Jayasena
  • Patent number: 10805392
    Abstract: Devices, methods, and systems for distributed gather and scatter operations in a network of memory nodes. A responding memory node includes a memory; a communications interface having circuitry configured to communicate with at least one other memory node; and a controller. The controller includes circuitry configured to receive a request message from a requesting node via the communications interface. The request message indicates a gather or scatter operation, and instructs the responding node to retrieve data elements from a source memory data structure and store the data elements to a destination memory data structure. The controller further includes circuitry configured to transmit a response message to the requesting node via the communications interface. The response message indicates that the data elements have been stored into the destination memory data structure.
    Type: Grant
    Filed: July 27, 2016
    Date of Patent: October 13, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amin Farmahini-Farahani, David A. Roberts
  • Patent number: 10705972
    Abstract: Systems, apparatuses, and methods for determining preferred memory page management policies by software are disclosed. Software executing on one or more processing units generates a memory request. Software determines the preferred page management policy for the memory request based at least in part on the data access size and data access pattern of the memory request. Software conveys an indication of a preferred page management policy to a memory controller. Then, the memory controller accesses memory for the memory request using the preferred page management policy specified by software.
    Type: Grant
    Filed: September 13, 2016
    Date of Patent: July 7, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amin Farmahini-Farahani, Alexander D. Breslow, Nuwan S. Jayasena
  • Publication number: 20200193268
    Abstract: A computer processing system having a first memory with a first set of memory pages resident therein and a second memory coupled to the first memory. A resource tracker provides information to instances of a long short-term memory (LSTM) recurrent neural network (RNN). A predictor identifies memory pages from the first set of memory pages for prediction by the one or more LSTM RNN instances. The system groups the memory pages of the identified plurality of memory pages into a number of patterns based on a number of memory accesses per time. An LSTM RNN instance predicts a number of page accesses for each pattern. A second set of memory pages is selected for moving from the first memory to the second memory.
    Type: Application
    Filed: December 14, 2018
    Publication date: June 18, 2020
    Inventors: Sergey Blagodurov, Thaleia Dimitra Doudali, Amin Farmahini Farahani
  • Publication number: 20200192809
    Abstract: A processing system tracks counts of accesses to memory pages using a set of counters located at the memory module that stores the pages, wherein the counts are adjusted at least in part based on refreshes of the memory pages. This approach allows a processing system to efficiently maintain the counts with relatively small counters and with relatively low overhead. Furthermore, the rate at which the counters are adjusted, relative to the page refreshes, is adjustable, so that the access counts are useful for a wide variety of application types.
    Type: Application
    Filed: December 12, 2018
    Publication date: June 18, 2020
    Inventors: Georgios Mappouras, Amin Farmahini Farahani, Nuwan Jayasena
  • Patent number: 10672474
    Abstract: A high-performance on-module caching architecture for hybrid memory modules is provided. A hybrid memory module includes a cache controller, a first volatile memory coupled to the cache controller, a first multiplexing data buffer coupled to the first volatile memory and the cache controller, and a first non-volatile memory coupled to the first multiplexing data buffer and the cache controller, wherein the first multiplexing data buffer multiplexes data between the first volatile memory and the first non-volatile memory and wherein the cache controller enables a tag checking operation to occur in parallel with a data movement operation. The hybrid memory module includes a volatile memory tag unit coupled to the cache controller, wherein the volatile memory tag unit includes a line connection that allows the cache controller to store a plurality of tags in the volatile memory tag unit and retrieve the plurality of tags from the volatile memory tag unit.
    Type: Grant
    Filed: August 6, 2019
    Date of Patent: June 2, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Amin Farmahini Farahani, David A. Roberts
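
Illustrative sketch (not taken from any patent text): the abstract of patent 11494087 in the listing above describes dividing a block into sub-blocks, adding a redundant reliability sub-block, writing each sub-block to a different memory stack, and recovering data when one stack fails. The abstract does not say how the reliability sub-block is computed. The Python below assumes a simple byte-wise XOR parity, a common textbook choice, and the names `split_with_parity` and `recover` are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def split_with_parity(block: bytes, n_stacks: int) -> List[bytes]:
    """Split block into n_stacks - 1 data sub-blocks plus one XOR parity sub-block.

    Assumption: XOR parity stands in for the "reliability sub-block".
    """
    n_data = n_stacks - 1
    if n_data < 1:
        raise ValueError("need at least two stacks")
    size = -(-len(block) // n_data)  # ceiling division
    padded = block.ljust(size * n_data, b"\x00")  # zero-pad to equal sub-block sizes
    subs = [padded[i * size:(i + 1) * size] for i in range(n_data)]
    # XOR the bytes at each position across all data sub-blocks.
    parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*subs))
    return subs + [parity]


def recover(subs: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the original block when at most one sub-block (one stack) is missing.

    A missing sub-block is passed as None; `length` is the original block length.
    """
    missing = [i for i, s in enumerate(subs) if s is None]
    if len(missing) > 1:
        raise ValueError("cannot recover more than one failed stack")
    if missing:
        present = [s for s in subs if s is not None]
        # XOR of all surviving sub-blocks (data and parity) yields the lost one.
        rebuilt = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*present))
        subs = list(subs)
        subs[missing[0]] = rebuilt
    # Drop the parity sub-block and the zero padding.
    return b"".join(subs[:-1])[:length]
```

With four stacks, for example, `split_with_parity(data, 4)` yields three data sub-blocks and one parity sub-block, and `recover` can rebuild `data` after any single entry is replaced by `None`. Detecting which stack has failed, which the abstract assigns to the front-end controller, is outside this sketch.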