Patents by Inventor Mostafa Hagog

Mostafa Hagog has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12282526
    Abstract: Apparatuses, systems, and techniques to determine a matrix multiplication algorithm for a matrix multiplication operation. In at least one embodiment, a matrix multiplication operation is analyzed to determine an appropriate matrix multiplication algorithm to perform the matrix multiplication algorithm.
    Type: Grant
    Filed: March 28, 2024
    Date of Patent: April 22, 2025
    Assignee: NVIDIA Corporation
    Inventors: Piotr Majcher, Mostafa Hagog, Philippe Vandermersch
  • Publication number: 20250110737
    Abstract: An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.
    Type: Application
    Filed: September 6, 2024
    Publication date: April 3, 2025
    Applicant: Intel Corporation
    Inventors: Eran Shifer, Mostafa Hagog, Eliyahu Turiel
  • Patent number: 12086603
    Abstract: An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.
    Type: Grant
    Filed: October 27, 2022
    Date of Patent: September 10, 2024
    Assignee: Intel Corporation
    Inventors: Eran Shifer, Mostafa Hagog, Eliyahu Turiel
  • Publication number: 20240256633
    Abstract: Apparatuses, systems, and techniques to determine a matrix multiplication algorithm for a matrix multiplication operation. In at least one embodiment, a matrix multiplication operation is analyzed to determine an appropriate matrix multiplication algorithm to perform the matrix multiplication algorithm.
    Type: Application
    Filed: March 28, 2024
    Publication date: August 1, 2024
    Inventors: Piotr Majcher, Mostafa Hagog, Philippe Vandermersch
  • Publication number: 20240086491
    Abstract: Apparatuses, systems, and techniques to determine a matrix multiplication algorithm for a matrix multiplication operation. In at least one embodiment, a matrix multiplication operation is analyzed to determine an appropriate matrix multiplication algorithm to perform the matrix multiplication algorithm.
    Type: Application
    Filed: November 20, 2023
    Publication date: March 14, 2024
    Inventors: Piotr Majcher, Mostafa Hagog, Philippe Vandermersch
  • Publication number: 20230052630
    Abstract: An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.
    Type: Application
    Filed: October 27, 2022
    Publication date: February 16, 2023
    Inventors: Eran Shifer, Mostafa Hagog, Eliyahu Turiel
  • Patent number: 11494194
    Abstract: An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: November 8, 2022
    Assignee: Intel Corporation
    Inventors: Eran Shifer, Mostafa Hagog, Eliyahu Turiel
  • Publication number: 20220300578
    Abstract: Apparatuses, systems, and techniques to determine a matrix multiplication algorithm for a matrix multiplication operation. In at least one embodiment, a matrix multiplication operation is analyzed to determine an appropriate matrix multiplication algorithm to perform the matrix multiplication algorithm.
    Type: Application
    Filed: June 7, 2022
    Publication date: September 22, 2022
    Inventors: Piotr Majcher, Mostafa Hagog, Philippe Vandermersch
  • Publication number: 20220179703
    Abstract: Apparatuses, systems, and techniques to improve neural network computations. In at least one embodiment, a deep neural network library receives computation descriptors from one or more users and generates an optimized execution plan comprising one or more optimized operations to facilitate neural network computing.
    Type: Application
    Filed: December 7, 2020
    Publication date: June 9, 2022
    Inventors: Kevin Vincent, Yang Xu, Scott A. Yokim, Mostafa Hagog, Lingfeng Zhang, Seth Erickson Walters, Anerudhan Gopal
  • Publication number: 20220004391
    Abstract: An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.
    Type: Application
    Filed: March 29, 2021
    Publication date: January 6, 2022
    Inventors: Eran Shifer, Mostafa Hagog, Eliyahu Turiel
  • Publication number: 20210406342
    Abstract: Apparatuses, systems, and techniques to determine a matrix multiplication algorithm for a matrix multiplication operation. In at least one embodiment, a matrix multiplication operation is analyzed to determine an appropriate matrix multiplication algorithm to perform the matrix multiplication algorithm.
    Type: Application
    Filed: September 9, 2021
    Publication date: December 30, 2021
    Inventors: Piotr Majcher, Mostafa Hagog, Philippe Vandermersch
  • Publication number: 20210256092
    Abstract: Apparatuses, systems, and techniques to determine a matrix multiplication algorithm for a matrix multiplication operation. In at least one embodiment, a matrix multiplication operation is analyzed to determine an appropriate matrix multiplication algorithm to perform the matrix multiplication algorithm.
    Type: Application
    Filed: February 19, 2020
    Publication date: August 19, 2021
    Inventors: Piotr Majcher, Mostafa Hagog, Philippe Vandermersch
  • Publication number: 20210103433
    Abstract: Apparatuses, systems, and techniques are presented to compile code. In at least one embodiment, one or more compilers are to compile one or more compiled portions of code with one or more intermediate representations of one or more portions of code.
    Type: Application
    Filed: October 2, 2019
    Publication date: April 8, 2021
    Inventors: Andrew Kerr, Mike Murphy, Mostafa Hagog, Julien Demouth, John Tran
  • Patent number: 10963263
    Abstract: An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.
    Type: Grant
    Filed: August 8, 2018
    Date of Patent: March 30, 2021
    Assignee: Intel Corporation
    Inventors: Eran Shifer, Mostafa Hagog, Eliyahu Turiel
  • Patent number: 10901748
    Abstract: An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.
    Type: Grant
    Filed: October 1, 2018
    Date of Patent: January 26, 2021
    Assignee: Intel Corporation
    Inventors: Eran Shifer, Mostafa Hagog, Eliyahu Turiel
  • Publication number: 20200334076
    Abstract: An application binary interface (ABI) can be exposed in a processor to enable blocks of threads, which may correspond to separately compiled operators, to communicate without storing data to global memory external to the processor. The ABI can define how results of one computation, corresponding to a first thread block, will be organized in registers and shared memory of a processor at the end of one operator (i.e., kernel). The start of the next operator (i.e., kernel), corresponding to a second thread block, can consume the results from the registers and shared memory. Data can be stored to processor local storage for individual threads as they exit the block. Once published, libraries can be separately compiled, optimized, and tested as long as they adhere to the published ABI.
    Type: Application
    Filed: April 19, 2019
    Publication date: October 22, 2020
    Inventors: Brian Fahs, Michael Lightstone, Mostafa Hagog
  • Patent number: 10303471
    Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector double block packed sum of absolute differences (SAD) in response to a single vector double block packed sum of absolute differences instruction that includes a destination vector register operand, first and second source operands, an immediate, and an opcode are described.
    Type: Grant
    Filed: February 28, 2017
    Date of Patent: May 28, 2019
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Mostafa Hagog, Robert Valentine, Amit Gradstein, Simon Rubanovich, Zeev Sperber
  • Publication number: 20190114176
    Abstract: An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.
    Type: Application
    Filed: October 1, 2018
    Publication date: April 18, 2019
    Inventors: Eran Shifer, Mostafa Hagog, Eliyahu Turiel
  • Publication number: 20190012178
    Abstract: An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.
    Type: Application
    Filed: August 8, 2018
    Publication date: January 10, 2019
    Inventors: Eran Shifer, Mostafa Hagog, Eliyahu Turiel
  • Patent number: 10061593
    Abstract: An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.
    Type: Grant
    Filed: February 7, 2017
    Date of Patent: August 28, 2018
    Assignee: Intel Corporation
    Inventors: Eran Shifer, Mostafa Hagog, Eliyahu Turiel
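
For readers unfamiliar with the operation named in the abstract of patent 10303471, the following is a minimal sketch of a plain sum of absolute differences (SAD) computed per fixed-size block of bytes. It is an illustration only and does not reproduce the claimed instruction: the function names `sad` and `packed_sad`, the 8-byte block size, and the list-of-sums result are assumptions made for this example, and the immediate operand and double-block selection described in the abstract are not modeled.

```python
def sad(block_a, block_b):
    """Return the sum of absolute differences of two equal-length byte sequences."""
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same length")
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def packed_sad(src1, src2, block_size=8):
    """Split two equal-length byte vectors into blocks and return one SAD per block.

    block_size=8 is an assumption for illustration, not a value taken from the patent.
    """
    if len(src1) != len(src2):
        raise ValueError("sources must have the same length")
    if len(src1) % block_size != 0:
        raise ValueError("source length must be a multiple of block_size")
    return [
        sad(src1[i:i + block_size], src2[i:i + block_size])
        for i in range(0, len(src1), block_size)
    ]


if __name__ == "__main__":
    a = bytes(range(16))   # 0, 1, ..., 15
    b = bytes(16)          # sixteen zero bytes
    print(packed_sad(a, b))
```

SAD of this kind is commonly used to compare pixel blocks, for example in motion estimation, where a smaller sum indicates a closer match between two blocks.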