Patents by Inventor Aamer Jaleel

Aamer Jaleel has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250190373
    Abstract: Metadata generally refers to data that describes, or gives information about, other data. Metadata can be used for a wide variety of purposes, including for ensuring the safety of memory accesses. For example, to prevent memory safety errors, metadata, which indicates the base address and size of the data, can be used to validate memory access requests as a prerequisite to allowing the memory access. While there are many useful applications of metadata, including for memory safety as mentioned above, the underlying metadata storage and retrieval processes that have been developed to date suffer from various problems. The present disclosure provides an object-level metadata locator, which can allow for an internal object layout to be maintained and which can scale to an arbitrary number of objects while requiring lower memory overhead than that required in the prior art.
    Type: Application
    Filed: July 16, 2024
    Publication date: June 12, 2025
    Inventors: Mohamed Tarek Bnziad Mohamed Hassan, Aamer Jaleel
  • Publication number: 20250190632
    Abstract: Applications written in memory unsafe languages, such as C, C++, and CUDA, are vulnerable to a variety of memory safety errors because they do not validate the bounds and lifetime of memory accesses. For example, spatial memory safety errors occur when a pointer is used to access an object beyond its intended bounds while temporal memory safety errors occur when a pointer is used to access an object beyond its lifetime. Memory safety errors can lead to control-flow hijacking, silent data corruption, difficult-to-diagnose crashes, and security exploitation. Unfortunately, existing software-based solutions either provide low error detection coverage or come with significant runtime overheads, and existing hardware-accelerated GPU-based solutions have poor scalability or intrusive hardware changes. The present disclosure provides memory safety using a combination of hardware and software.
    Type: Application
    Filed: June 17, 2024
    Publication date: June 12, 2025
    Inventors: Mohamed Tarek Bnziad Mohamed Hassan, Aamer Jaleel, Sana Damani, Mark Stephenson, Stephen William Keckler
  • Patent number: 12321230
    Abstract: Implicit Memory Tagging (IMT) mechanisms utilizing alias-free memory tags that enable hardware-assisted memory tagging without incurring storage overhead above those incurred by conventional tagging mechanisms, while providing enhanced data integrity and memory security. The IMT mechanisms enhance the utility of error correcting codes (ECCs) to test memory tags in addition to the traditional utility of ECCs for detecting and correcting data errors and enable a finer granularity of memory tagging than many conventional approaches.
    Type: Grant
    Filed: October 11, 2023
    Date of Patent: June 3, 2025
    Assignee: NVIDIA Corp.
    Inventors: Michael B. Sullivan, Mohamed Tarek Bnziad Mohamed Hassan, Aamer Jaleel
  • Publication number: 20250021642
    Abstract: While a compiler compiles source code to create an executable binary, code is added into the compiled source code that, when executed, identifies and stores in a metadata table base and bounds information associated with memory allocations. Additionally, additional code is added into the compiled source code that enables hardware to determine a safety of memory access requests during an implementation of the compiled source code by performing an out-of-bounds (OOB) check in hardware using the base and bounds information stored in the metadata table. This enables the identification and avoidance of unsafe memory operations during the implementation of the executable by a GPU.
    Type: Application
    Filed: September 30, 2024
    Publication date: January 16, 2025
    Inventors: Aamer Jaleel, Mohamed Tarek Bnziad Mohamed Hassan, Mark Stephenson
  • Publication number: 20240403417
    Abstract: Rowhammer attacks, which are malicious processes that rapidly issue access requests to memory, can impose serious security threats including being used to tamper with data, take control of entire systems, and even breach confidentiality. Current solutions to defend against these attacks are limited, as they typically employ a deterministic tracker to track the portions of memory accessed and to mitigate potential attacks accordingly. However, the deterministic nature of these trackers results in their own vulnerability. The present disclosure provides probabilistic tracker management for mitigation of rowhammer attacks and/or other memory attacks in which a row (or other defined portion of memory) is maliciously targeted to disturb contents of neighboring rows, which can prevent these types of attacks that otherwise take advantage of the determinism in prior used tracker designs.
    Type: Application
    Filed: December 19, 2023
    Publication date: December 5, 2024
    Inventors: Aamer Jaleel, Gururaj Saileshwar
  • Patent number: 12135781
    Abstract: While a compiler compiles source code to create an executable binary, code is added into the compiled source code that, when executed, identifies and stores in a metadata table base and bounds information associated with memory allocations. Additionally, additional code is added into the compiled source code that enables hardware to determine a safety of memory access requests during an implementation of the compiled source code by performing an out-of-bounds (OOB) check in hardware using the base and bounds information stored in the metadata table. This enables the identification and avoidance of unsafe memory operations during the implementation of the executable by a GPU.
    Type: Grant
    Filed: December 29, 2021
    Date of Patent: November 5, 2024
    Assignee: NVIDIA CORPORATION
    Inventors: Aamer Jaleel, Mohamed Tarek Bnziad Mohamed Hassan, Mark Stephenson
  • Publication number: 20240184670
    Abstract: Implicit Memory Tagging (IMT) mechanisms utilizing alias-free memory tags that enable hardware-assisted memory tagging without incurring storage overhead above those incurred by conventional tagging mechanisms, while providing enhanced data integrity and memory security. The IMT mechanisms enhance the utility of error correcting codes (ECCs) to test memory tags in addition to the traditional utility of ECCs for detecting and correcting data errors and enable a finer granularity of memory tagging than many conventional approaches.
    Type: Application
    Filed: October 11, 2023
    Publication date: June 6, 2024
    Applicant: NVIDIA Corp.
    Inventors: Michael B. Sullivan, Mohamed Tarek Bnziad Mohamed Hassan, Aamer Jaleel
  • Patent number: 11836361
    Abstract: While a compiler compiles source code to create an executable binary, code is added into the compiled source code that, when executed, identifies and stores in a metadata table base and bounds information associated with memory allocations. Additionally, additional code is added into the compiled source code that performs memory safety checks during execution. This updated compiled source code automatically determines a safety of memory access requests during execution by performing an out-of-bounds (OOB) check using the base and bounds information retrieved and stored in the metadata table. This enables the identification and avoidance of unsafe memory operations during the implementation of the executable by a GPU.
    Type: Grant
    Filed: December 29, 2021
    Date of Patent: December 5, 2023
    Assignee: NVIDIA CORPORATION
    Inventors: Mohamed Tarek Bnziad Mohamed Hassan, Aamer Jaleel, Mark Stephenson, Michael Sullivan
  • Publication number: 20230061154
    Abstract: While a compiler compiles source code to create an executable binary, code is added into the compiled source code that, when executed, identifies and stores in a metadata table base and bounds information associated with memory allocations. Additionally, additional code is added into the compiled source code that enables hardware to determine a safety of memory access requests during an implementation of the compiled source code by performing an out-of-bounds (OOB) check in hardware using the base and bounds information stored in the metadata table. This enables the identification and avoidance of unsafe memory operations during the implementation of the executable by a GPU.
    Type: Application
    Filed: December 29, 2021
    Publication date: March 2, 2023
    Inventors: Aamer Jaleel, Mohamed Tarek Bnziad Mohamed Hassan, Mark Stephenson
  • Publication number: 20230063568
    Abstract: While a compiler compiles source code to create an executable binary, code is added into the compiled source code that, when executed, identifies and stores in a metadata table base and bounds information associated with memory allocations. Additionally, additional code is added into the compiled source code that performs memory safety checks during execution. This updated compiled source code automatically determines a safety of memory access requests during execution by performing an out-of-bounds (OOB) check using the base and bounds information retrieved and stored in the metadata table. This enables the identification and avoidance of unsafe memory operations during the implementation of the executable by a GPU.
    Type: Application
    Filed: December 29, 2021
    Publication date: March 2, 2023
    Inventors: Mohamed Tarek Bnziad Mohamed Hassan, Aamer Jaleel, Mark Stephenson, Michael Sullivan
  • Patent number: 11513957
    Abstract: Methods and apparatus implementing Hardware/Software co-optimization to improve performance and energy for inter-VM communication for NFVs and other producer-consumer workloads. The apparatus include multi-core processors with multi-level cache hierarchies including an L1 and L2 cache for each core and a shared last-level cache (LLC). One or more machine-level instructions are provided for proactively demoting cachelines from lower cache levels to higher cache levels, including demoting cachelines from L1/L2 caches to an LLC. Techniques are also provided for implementing hardware/software co-optimization in multi-socket NUMA architecture system, wherein cachelines may be selectively demoted and pushed to an LLC in a remote socket. In addition, techniques are disclosed for implementing early snooping in multi-socket systems to reduce latency when accessing cachelines on remote sockets.
    Type: Grant
    Filed: September 21, 2020
    Date of Patent: November 29, 2022
    Assignee: Intel Corporation
    Inventors: Ren Wang, Andrew J. Herdrich, Yen-cheng Liu, Herbert H. Hum, Jong Soo Park, Christopher J. Hughes, Namakkal N. Venkatesan, Adrian C. Moga, Aamer Jaleel, Zeshan A. Chishti, Mesut A. Ergin, Jr-shian Tsai, Alexander W. Min, Tsung-yuan C. Tai, Christian Maciocco, Rajesh Sankaran
  • Publication number: 20210004328
    Abstract: Methods and apparatus implementing Hardware/Software co-optimization to improve performance and energy for inter-VM communication for NFVs and other producer-consumer workloads. The apparatus include multi-core processors with multi-level cache hierarchies including an L1 and L2 cache for each core and a shared last-level cache (LLC). One or more machine-level instructions are provided for proactively demoting cachelines from lower cache levels to higher cache levels, including demoting cachelines from L1/L2 caches to an LLC. Techniques are also provided for implementing hardware/software co-optimization in multi-socket NUMA architecture system, wherein cachelines may be selectively demoted and pushed to an LLC in a remote socket. In addition, techniques are disclosed for implementing early snooping in multi-socket systems to reduce latency when accessing cachelines on remote sockets.
    Type: Application
    Filed: September 21, 2020
    Publication date: January 7, 2021
    Inventors: Ren Wang, Andrew J. Herdrich, Yen-cheng Liu, Herbert H. Hum, Jong Soo Park, Christopher J. Hughes, Namakkal N. Venkatesan, Adrian C. Moga, Aamer Jaleel, Zeshan A. Chishti, Mesut A. Ergin, Jr-shian Tsai, Alexander W. Min, Tsung-yuan C. Tai, Christian Maciocco, Rajesh Sankaran
  • Patent number: 10853276
    Abstract: A technology for implementing a method for distributed memory operations. A method of the disclosure includes obtaining distributed channel information for an algorithm to be executed by a plurality of spatially distributed processing elements. For each distributed channel in the distributed channel information, the method further associates one or more of the plurality of spatially distributed processing elements with the distributed channel based on the algorithm.
    Type: Grant
    Filed: June 17, 2019
    Date of Patent: December 1, 2020
    Assignee: Intel Corporation
    Inventors: Bushra Ahsan, Michael C. Adler, Neal C. Crago, Joel S. Emer, Aamer Jaleel, Angshuman Parashar, Michael I. Pellauer
  • Patent number: 10817425
    Abstract: Methods and apparatus implementing Hardware/Software co-optimization to improve performance and energy for inter-VM communication for NFVs and other producer-consumer workloads. The apparatus include multi-core processors with multi-level cache hierarchies including an L1 and L2 cache for each core and a shared last-level cache (LLC). One or more machine-level instructions are provided for proactively demoting cachelines from lower cache levels to higher cache levels, including demoting cachelines from L1/L2 caches to an LLC. Techniques are also provided for implementing hardware/software co-optimization in multi-socket NUMA architecture system, wherein cachelines may be selectively demoted and pushed to an LLC in a remote socket. In addition, techniques are disclosed for implementing early snooping in multi-socket systems to reduce latency when accessing cachelines on remote sockets.
    Type: Grant
    Filed: December 26, 2014
    Date of Patent: October 27, 2020
    Assignee: Intel Corporation
    Inventors: Ren Wang, Andrew J. Herdrich, Yen-cheng Liu, Herbert H. Hum, Jong Soo Park, Christopher J. Hughes, Namakkal N. Venkatesan, Adrian C. Moga, Aamer Jaleel, Zeshan A. Chishti, Mesut A. Ergin, Jr-shian Tsai, Alexander W. Min, Tsung-yuan C. Tai, Christian Maciocco, Rajesh Sankaran
  • Publication number: 20190303312
    Abstract: A technology for implementing a method for distributed memory operations. A method of the disclosure includes obtaining distributed channel information for an algorithm to be executed by a plurality of spatially distributed processing elements. For each distributed channel in the distributed channel information, the method further associates one or more of the plurality of spatially distributed processing elements with the distributed channel based on the algorithm.
    Type: Application
    Filed: June 17, 2019
    Publication date: October 3, 2019
    Inventors: Bushra Ahsan, Michael C. Adler, Neal C. Crago, Joel S. Emer, Aamer Jaleel, Angshuman Parashar, Michael I. Pellauer
  • Patent number: 10387319
    Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements. The processor also includes a streamer element to prefetch the incoming operand set from two or more levels of a memory system.
    Type: Grant
    Filed: July 1, 2017
    Date of Patent: August 20, 2019
    Assignee: Intel Corporation
    Inventors: Michael C. Adler, Chiachen Chou, Neal C. Crago, Kermin Fleming, Kent D. Glossop, Aamer Jaleel, Pratik M. Marolia, Simon C. Steely, Jr., Samantika S. Sury
  • Patent number: 10331583
    Abstract: A processing device for executing distributed memory operations using spatial processing units (SPU) connected by distributed channels is disclosed. A distributed channel may or may not be associated with memory operations, such as load operations or store operations. Distributed channel information is obtained for an algorithm to be executed by a group of spatially distributed processing elements. The group of spatially distributed processing elements can be connected to a shared memory controller. For each distributed channel in the distributed channel information, one or more of the group of spatially distributed processing elements may be associated with the distributed channel based on the algorithm. By associating the spatially distributed processing elements to a distributed channel, the functionality of the processing element can vary depending on the algorithm mapped onto the SPU.
    Type: Grant
    Filed: September 26, 2013
    Date of Patent: June 25, 2019
    Assignee: Intel Corporation
    Inventors: Bushra Ahsan, Michael C. Adler, Neal C. Crago, Joel S. Emer, Aamer Jaleel, Angshuman Parashar, Michael I. Pellauer
  • Patent number: 10284470
    Abstract: Technologies for managing network flow lookups of a network device include a network controller and a target device, each communicatively coupled to the network device. The network device includes a cache for a processor of the network device and a main memory. The network device additionally includes a multi-level hash table having a first-level hash table stored in the cache of the network device and a second-level hash table stored in the main memory of the network device. The network device is configured to determine whether to store a network flow hash corresponding to a network flow indicating the target device in the first-level or second-level hash table based on a priority of the network flow provided to the network device by the network controller.
    Type: Grant
    Filed: December 23, 2014
    Date of Patent: May 7, 2019
    Assignee: Intel Corporation
    Inventors: Ren Wang, Namakkal N. Venkatesan, Aamer Jaleel, Tsung-Yuan C. Tai, Sameh Gobriel, Christian Maciocco
  • Publication number: 20190004955
    Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements. The processor also includes a streamer element to prefetch the incoming operand set from two or more levels of a memory system.
    Type: Application
    Filed: July 1, 2017
    Publication date: January 3, 2019
    Inventors: Michael C. Adler, Chiachen Chou, Neal C. Crago, Kermin Fleming, Kent D. Glossop, Aamer Jaleel, Pratik M. Marolia, Simon C. Steely, Jr., Samantika S. Sury
  • Patent number: 10102134
    Abstract: A processor includes a cache, a prefetcher module to select information according to a prefetcher algorithm, and a prefetcher algorithm selection module. The prefetcher algorithm selection module includes logic to select a candidate prefetcher algorithm, determine and store memory addresses of predicted memory accesses of the candidate prefetcher algorithm when performed by the prefetcher module, determine cache lines accessed during memory operations, and evaluate whether the determined cache lines match the stored memory addresses. The prefetcher algorithm selection module further includes logic to adjust an accuracy ratio of the candidate prefetcher algorithm, compare the accuracy ratio with a threshold accuracy ratio, and determine whether to apply the first candidate prefetcher algorithm to the prefetcher module.
    Type: Grant
    Filed: June 23, 2016
    Date of Patent: October 16, 2018
    Assignee: Intel Corporation
    Inventors: Zeshan A. Chishti, Christopher B. Wilkerson, Seth Pugsley, Peng-Fei Chuang, Robert L. Scott, Aamer Jaleel, Shih-Lien L. Lu, Kingsum Chow
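
The abstract of patent 10102134 above describes evaluating a candidate prefetcher algorithm by storing its predicted addresses, matching them against the cache lines actually accessed, and comparing an accuracy ratio with a threshold. The Python sketch below is only an illustration of that general idea, not the patented design. The 64-byte line size, the 0.5 threshold, the definition of accuracy as hits divided by predictions, and all names (CandidateEvaluator, next_line, evaluate) are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set

CACHE_LINE_BYTES = 64  # assumed cache line size, not taken from the abstract


def line_of(address: int) -> int:
    """Map a byte address to its cache line index."""
    return address // CACHE_LINE_BYTES


@dataclass
class CandidateEvaluator:
    """Scores one candidate prefetcher without letting it issue real prefetches."""
    predict: Callable[[int], List[int]]  # candidate algorithm: address -> predicted addresses
    threshold: float = 0.5               # assumed threshold accuracy ratio
    predicted_lines: Set[int] = field(default_factory=set)
    predictions: int = 0
    hits: int = 0

    def observe(self, address: int) -> None:
        """Record one demand access and update the accuracy counters."""
        line = line_of(address)
        # Does the accessed cache line match a stored prediction?
        if line in self.predicted_lines:
            self.hits += 1
            self.predicted_lines.discard(line)
        # Store the lines the candidate would have prefetched next.
        for predicted in self.predict(address):
            predicted_line = line_of(predicted)
            if predicted_line not in self.predicted_lines:
                self.predicted_lines.add(predicted_line)
                self.predictions += 1

    @property
    def accuracy(self) -> float:
        """Accuracy ratio, taken here as matched predictions over all predictions."""
        return self.hits / self.predictions if self.predictions else 0.0

    def should_apply(self) -> bool:
        """Apply the candidate only if its accuracy meets the threshold."""
        return self.accuracy >= self.threshold


def next_line(address: int) -> List[int]:
    """Hypothetical candidate algorithm: always predict the following cache line."""
    return [address + CACHE_LINE_BYTES]


def evaluate(trace: Iterable[int],
             predict: Callable[[int], List[int]],
             threshold: float = 0.5) -> CandidateEvaluator:
    """Run a candidate over an address trace and return the filled-in evaluator."""
    evaluator = CandidateEvaluator(predict=predict, threshold=threshold)
    for address in trace:
        evaluator.observe(address)
    return evaluator


if __name__ == "__main__":
    sequential = evaluate([0, 64, 128, 192], next_line)
    strided = evaluate([0, 4096, 8192, 12288], next_line)
    print(sequential.accuracy, sequential.should_apply())
    print(strided.accuracy, strided.should_apply())
```

In this sketch the next-line candidate scores well on a sequential trace and poorly on a page-strided one, so only the first case would be selected. A hardware implementation would use bounded storage for the predicted addresses rather than an unbounded set.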