Patents by Inventor Mohamed Assem Abd ElMohsen Ibrahim

Mohamed Assem Abd ElMohsen Ibrahim has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11977782
    Abstract: An approach allows concurrent execution of near-memory processing commands, referred to herein as “PIM commands,” and host memory commands. A memory controller determines and issues a plurality of register-only PIM commands that do not reference memory with host memory commands to allow concurrent execution of the register-only PIM commands and the host memory commands. The approach allows concurrent execution of register-only PIM commands and host memory commands without interference, even when the register-only PIM commands and the host memory commands are interleaved, and even for the same memory module, which improves resource utilization and performance. Further improvement of resource utilization and performance is achieved by extending a register-only phase by reordering register-only PIM commands before non-register-only PIM commands, subject to dependency constraints, and using shadow row buffers to provide local working copies of data from memory to near-memory compute elements.
    Type: Grant
    Filed: June 30, 2022
    Date of Patent: May 7, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mohamed Assem Abd ElMohsen Ibrahim, Meysam Taassori, Mahzabeen Islam, Shaizeen Aga
  • Publication number: 20240103730
    Abstract: In accordance with described techniques for reduction of parallel memory operation messages, a computing system or computing device includes a memory system that receives memory operation messages. A shared response component in the memory system receives responses to the memory operation messages, and identifies a set of the responses that are coalesceable. The shared response component then coalesces the set of the responses into a combined message for communication completion through a communication path in the memory system.
    Type: Application
    Filed: September 28, 2022
    Publication date: March 28, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Johnathan Robert Alsop, Shaizeen Dilawarhusen Aga, Mohamed Assem Abd ElMohsen Ibrahim
  • Publication number: 20240106782
    Abstract: In accordance with described techniques for filtered responses to memory operation messages, a computing system or computing device includes a memory system that receives messages. A filter component in the memory system receives the responses to the memory operation messages, and filters one or more of the responses based on a filterable condition. A tracking logic component tracks the one or more responses as filtered responses for communication completion.
    Type: Application
    Filed: September 28, 2022
    Publication date: March 28, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Johnathan Robert Alsop, Shaizeen Dilawarhusen Aga, Mohamed Assem Abd ElMohsen Ibrahim
  • Publication number: 20240103763
    Abstract: In accordance with the described techniques for bank-level parallelism for processing in memory, a plurality of commands are received for execution by a processing in memory component embedded in a memory. The memory includes a first bank and a second bank. The plurality of commands include a first stream of commands which cause the processing in memory component to perform operations that access the first bank and a second stream of commands which cause the processing in memory component to perform operations that access the second bank. A next row of the first bank that is to be accessed by the processing in memory component is identified. Further, a precharge command is scheduled to close a first row of the first bank and an activate command is scheduled to open the next row of the first bank in parallel with execution of the second stream of commands.
    Type: Application
    Filed: September 27, 2022
    Publication date: March 28, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Mahzabeen Islam, Shaizeen Dilawarhusen Aga, Johnathan Robert Alsop, Mohamed Assem Abd ElMohsen Ibrahim, Nuwan S Jayasena
  • Publication number: 20240069915
    Abstract: A virtual padding unit provides a virtual padded data structure (e.g., virtually padded matrix) that provides output values for a padded data structure without storing all of the padding elements in memory. When the virtual padding unit receives a virtual memory address of a location in the virtual padded data structure, the virtual padding unit checks whether the location is a non-padded location in the virtual padded data structure or a padded location in the virtual padded data structure. If the location is a padded location in the virtual padded data structure, the virtual padding unit outputs a padding value rather than a value stored in the virtual padded data structure. If the location is a non-padded location in the virtual padded data structure, a value stored at the location is output.
    Type: Application
    Filed: August 30, 2022
    Publication date: February 29, 2024
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Meysam Taassori, Shaizeen Dilawarhusen Aga, Mohamed Assem Abd ElMohsen Ibrahim, Johnathan Robert Alsop
  • Publication number: 20240004585
    Abstract: An approach allows concurrent execution of near-memory processing commands, referred to herein as “PIM commands,” and host memory commands. A memory controller determines and issues a plurality of register-only PIM commands that do not reference memory with host memory commands to allow concurrent execution of the register-only PIM commands and the host memory commands. The approach allows concurrent execution of register-only PIM commands and host memory commands without interference, even when the register-only PIM commands and the host memory commands are interleaved, and even for the same memory module, which improves resource utilization and performance. Further improvement of resource utilization and performance is achieved by extending a register-only phase by reordering register-only PIM commands before non-register-only PIM commands, subject to dependency constraints, and using shadow row buffers to provide local working copies of data from memory to near-memory compute elements.
    Type: Application
    Filed: June 30, 2022
    Publication date: January 4, 2024
    Inventors: Mohamed Assem Abd ElMohsen Ibrahim, Meysam Taassori, Mahzabeen Islam, Shaizeen Aga
  • Publication number: 20230401154
    Abstract: A system and method for efficiently accessing sparse data for a workload are described. In various implementations, a computing system includes an integrated circuit and a memory for storing tasks of a workload that includes sparse accesses of data items stored in one or more tables. The integrated circuit receives a user query, and generates a result based on multiple data items targeted by the user query. To reduce the latency of processing the workload even with sparse lookup operations performed on the one or more tables, a prefetch engine of the integrated circuit stores a subset of data items in prefetch data storage. The prefetch engine also determines which data items to store in the prefetch data storage based on one or more of a frequency of reuse, a distance or latency of access of a corresponding table of the one or more tables, or other criteria.
    Type: Application
    Filed: June 8, 2022
    Publication date: December 14, 2023
    Inventors: Mohamed Assem Abd ElMohsen Ibrahim, Onur Kayiran, Shaizeen Dilawarhusen Aga, Yasuko Eckert
  • Publication number: 20230359558
    Abstract: An approach is provided for skipping, i.e., not processing and/or deleting, near-memory processing commands when one or more skip criteria are satisfied. Examples of skip criteria include, without limitation, specific operations, specific operands, and combinations of specific operations and specific operands. The approach is implemented at one or more memory command processing elements in the memory pipeline of a processor, such as memory controllers, caches, queues, and buffers, etc. Implementations include exceptions to skipping in certain situations and software support for configuring skip criteria, including particular operations and operands for which skip checking is performed. The approach provides the benefits of reducing command bus traffic and power consumption while maintaining functional correctness.
    Type: Application
    Filed: May 9, 2022
    Publication date: November 9, 2023
    Inventors: Shaizeen Aga, Mohamed Assem Abd ElMohsen Ibrahim
  • Publication number: 20230098421
    Abstract: Methods and apparatuses include a processing unit which helps control the speed and computational resources required for arithmetic operations of two numbers in a first format. The control unit of the processing unit approximates the arithmetic operations using a plurality of decomposed numbers in a second format that facilitates faster calculations than the first format, such that performing arithmetic operations using the decomposed numbers is capable of approximating the results of the arithmetic operations of the two numbers in the first format.
    Type: Application
    Filed: September 30, 2021
    Publication date: March 30, 2023
    Inventors: Onur Kayiran, Mohamed Assem Abd ElMohsen Ibrahim, Shaizeen Aga
  • Publication number: 20230065546
    Abstract: An electronic device includes a plurality of nodes, each node having a processor that performs operations for processing instances of input data through a model, a local memory that stores a separate portion of model data for the model, and a controller. The controller identifies model data that meets one or more predetermined conditions in the separate portion of the model data in the local memory in some or all of the nodes that is accessible by the processors when processing the instances of input data through the model. The controller then copies the model data that meets the one or more predetermined conditions from the separate portion of the model data in the local memory in the some or all of the nodes to local memories in other nodes. In this way, the controller distributes model data that meets the one or more predetermined conditions among the nodes, making the model data that meets the one or more predetermined conditions available to the nodes without performing remote memory accesses.
    Type: Application
    Filed: September 29, 2021
    Publication date: March 2, 2023
    Inventors: Mohamed Assem Abd ElMohsen Ibrahim, Onur Kayiran, Shaizeen Aga
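
Illustrative sketch (not taken from any of the filings above): the virtual padding idea described in publication 20240069915 can be modeled in software as a lookup that returns a padding value for padded locations and the stored value otherwise. The class name `VirtualPaddedMatrix`, the `read` method, the uniform padding width, and the use of (row, column) indices in place of virtual memory addresses are all assumptions made for this example; the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VirtualPaddedMatrix:
    """Hypothetical software model of a virtually padded matrix.

    Only the unpadded values are stored; padding elements are never
    materialized in `data`.
    """
    data: List[List[float]]   # stored (non-padded) values
    pad: int                  # assumed: same padding width on every side
    pad_value: float = 0.0    # value reported for padded locations

    def _inner_shape(self) -> Tuple[int, int]:
        rows = len(self.data)
        cols = len(self.data[0]) if rows else 0
        return rows, cols

    @property
    def shape(self) -> Tuple[int, int]:
        """Shape of the padded view, including the virtual border."""
        rows, cols = self._inner_shape()
        return rows + 2 * self.pad, cols + 2 * self.pad

    def read(self, row: int, col: int) -> float:
        """Return the value at (row, col) of the padded view."""
        n_rows, n_cols = self.shape
        if not (0 <= row < n_rows and 0 <= col < n_cols):
            raise IndexError("location is outside the padded structure")
        rows, cols = self._inner_shape()
        r, c = row - self.pad, col - self.pad
        # Non-padded location: output the stored value.
        if 0 <= r < rows and 0 <= c < cols:
            return self.data[r][c]
        # Padded location: output the padding value instead of reading storage.
        return self.pad_value


matrix = VirtualPaddedMatrix(data=[[1.0, 2.0], [3.0, 4.0]], pad=1)
corner = matrix.read(0, 0)      # falls in the virtual border
first = matrix.read(1, 1)       # maps to data[0][0]
last = matrix.read(2, 2)        # maps to data[1][1]
```

In this model the padded view is 4 x 4 while only four values are stored, which mirrors the abstract's point that output values for the padded structure are provided without storing all of the padding elements.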