Patents by Inventor Mohammad Ghasemzadeh

Mohammad Ghasemzadeh has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12230361
    Abstract: An apparatus includes an in-memory compute circuit that includes a memory circuit configured to generate a set of products by combining received input values with respective weight values stored in rows of the memory circuit, and to combine the set of products to generate an accumulated output value. The in-memory compute circuit may further include a control circuit and a plurality of routing circuits, including a first routing circuit coupled to a first set of rows of the memory circuit. The control circuit may be configured to cause the first routing circuit to route groups of input values to different ones of the first set of rows over a plurality of clock cycles, and the memory circuit to generate, on a clock cycle following the plurality of clock cycles, a particular accumulated output value that is computed based on the routed groups of input values.
    Type: Grant
    Filed: July 3, 2023
    Date of Patent: February 18, 2025
    Assignee: Apple Inc.
    Inventors: Paolo Di Febbo, Mohamed H. Abu-Rahma, Jelam K. Parekh, Yildiz Sinangil, Mohammad Ghasemzadeh, Anthony Ghannoum, Chaminda N. Vidanagamachchi
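
For orientation, here is a minimal behavioral sketch in Python, not the patented circuit itself, of the accumulation scheme described in the abstract above: groups of input values are routed to different rows of a weight array over several clock cycles, and the accumulated output is read out on the cycle after the last group arrives. The class and method names are illustrative assumptions.

```python
import numpy as np

class InMemoryComputeModel:
    """Software stand-in for an in-memory compute array: rows hold weight
    vectors, and route_group() models one clock cycle in which a routing
    circuit steers a group of input values onto a subset of rows."""

    def __init__(self, weights: np.ndarray):
        self.weights = weights          # shape: (num_rows, row_width)
        self._partial_sum = 0.0

    def route_group(self, row_indices, input_values):
        # One cycle: multiply a routed input group against its target rows
        # and fold the products into the running accumulation.
        products = self.weights[row_indices] * np.asarray(input_values)
        self._partial_sum += products.sum()

    def read_accumulated_output(self) -> float:
        # Following cycle: emit the accumulated output and clear the state.
        out, self._partial_sum = self._partial_sum, 0.0
        return out

# Example: a 4-row array of width 3, with inputs delivered as two groups
# over two cycles and the result read on the third.
rng = np.random.default_rng(0)
array = InMemoryComputeModel(rng.normal(size=(4, 3)))
array.route_group([0, 1], rng.normal(size=(2, 3)))  # cycle 1
array.route_group([2, 3], rng.normal(size=(2, 3)))  # cycle 2
print(array.read_accumulated_output())              # cycle 3
```
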
  • Publication number: 20250021808
    Abstract: Embodiments relate to a neural processor circuit that may include a fetch circuit that fetches coefficient data of a machine learning model from a memory source. The neural processor circuit may also include one or more neural engine circuits that are coupled to the fetch circuit. A neural engine circuit may include a buffer circuit that stores the coefficient data. The neural engine circuit may also include a coefficient organizing circuit that generates at least a first mapping and a second mapping of the stored coefficient data according to one or more control signals. The neural engine circuit may also include a computation circuit that receives and processes at least a portion of input data with the coefficient data as mapped according to the first mapping, or processes at least the portion of the input data with the coefficient data as mapped according to the second mapping.
    Type: Application
    Filed: October 1, 2024
    Publication date: January 16, 2025
    Applicant: Apple Inc.
    Inventors: Waleed Abdulla, Paolo Di Febbo, Mohammad Ghasemzadeh, Yohan Rajan
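
As a rough software analogue (the function names and the particular reorderings below are assumptions for illustration, not taken from the patent), this sketch keeps one coefficient buffer, defines two alternative mappings of it, and lets a control signal pick which mapping the computation uses:

```python
import numpy as np

def make_mappings(num_coeffs: int):
    """Two example reorderings of the same coefficient buffer."""
    identity = np.arange(num_coeffs)                 # "first mapping"
    interleaved = identity.reshape(2, -1).T.ravel()  # "second mapping"
    return {"first": identity, "second": interleaved}

def compute(inputs, coeff_buffer, mappings, control_signal: str):
    """Apply the stored coefficients to the inputs under the selected mapping."""
    mapped = coeff_buffer[mappings[control_signal]]
    return float(np.dot(inputs, mapped))

coeffs = np.arange(8, dtype=float)   # stands in for fetched coefficient data
x = np.arange(8, dtype=float)        # a portion of input data
maps = make_mappings(coeffs.size)
print(compute(x, coeffs, maps, "first"))   # coefficients in their stored order
print(compute(x, coeffs, maps, "second"))  # same buffer, different pairing of
                                           # weights with inputs
```
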
  • Patent number: 12141679
    Abstract: Embodiments relate to a neural processor circuit that may include a fetch circuit that fetches coefficient data of a machine learning model from a memory source. The neural processor circuit may also include one or more neural engine circuits that are coupled to the fetch circuit. A neural engine circuit may include a buffer circuit that stores the coefficient data. The neural engine circuit may also include a coefficient organizing circuit that generates at least a first mapping and a second mapping of the stored coefficient data according to one or more control signals. The neural engine circuit may also include a computation circuit that receives and processes at least a portion of input data with the coefficient data as mapped according to the first mapping, or processes at least the portion of the input data with the coefficient data as mapped according to the second mapping.
    Type: Grant
    Filed: October 7, 2020
    Date of Patent: November 12, 2024
    Assignee: Apple Inc.
    Inventors: Waleed Abdulla, Paolo Di Febbo, Mohammad Ghasemzadeh, Yohan Rajan
  • Publication number: 20240005972
    Abstract: An apparatus includes an in-memory compute circuit that includes a memory circuit configured to generate a set of products by combining received input values with respective weight values stored in rows of the memory circuit, and to combine the set of products to generate an accumulated output value. The in-memory compute circuit may further include a control circuit and a plurality of routing circuits, including a first routing circuit coupled to a first set of rows of the memory circuit. The control circuit may be configured to cause the first routing circuit to route groups of input values to different ones of the first set of rows over a plurality of clock cycles, and the memory circuit to generate, on a clock cycle following the plurality of clock cycles, a particular accumulated output value that is computed based on the routed groups of input values.
    Type: Application
    Filed: July 3, 2023
    Publication date: January 4, 2024
    Inventors: Paolo Di Febbo, Mohamed H. Abu-Rahma, Jelam K. Parekh, Yildiz Sinangil, Mohammad Ghasemzadeh, Anthony Ghannoum, Chaminda N. Vidanagamachchi
  • Patent number: 11694733
    Abstract: An apparatus includes an in-memory compute circuit that includes a memory circuit configured to generate a set of products by combining received input values with respective weight values stored in rows of the memory circuit, and to combine the set of products to generate an accumulated output value. The in-memory compute circuit may further include a control circuit and a plurality of routing circuits, including a first routing circuit coupled to a first set of rows of the memory circuit. The control circuit may be configured to cause the first routing circuit to route groups of input values to different ones of the first set of rows over a plurality of clock cycles, and the memory circuit to generate, on a clock cycle following the plurality of clock cycles, a particular accumulated output value that is computed based on the routed groups of input values.
    Type: Grant
    Filed: August 19, 2021
    Date of Patent: July 4, 2023
    Assignee: Apple Inc.
    Inventors: Paolo Di Febbo, Mohamed H. Abu-Rahma, Jelam K. Parekh, Yildiz Sinangil, Mohammad Ghasemzadeh, Anthony Ghannoum, Chaminda N. Vidanagamachchi
  • Patent number: 11651260
    Abstract: A method for hardware-based machine learning acceleration is provided. The method may include partitioning, into a first batch of data and a second batch of data, input data received at a hardware accelerator implementing a machine learning model. The input data may be a continuous stream of data samples. The input data may be partitioned based at least on a resource constraint of the hardware accelerator. An update of a probability density function associated with the machine learning model may be performed in real time. The probability density function may be updated by at least processing, by the hardware accelerator, the first batch of data before the second batch of data. An output may be generated based at least on the updated probability density function. The output may include a probability of encountering a data value. Related systems and articles of manufacture, including computer program products, are also provided.
    Type: Grant
    Filed: January 31, 2018
    Date of Patent: May 16, 2023
    Assignee: The Regents of the University of California
    Inventors: Bita Darvish Rouhani, Mohammad Ghasemzadeh, Farinaz Koushanfar
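
A minimal sketch, under assumed specifics (a histogram density estimate and a batch-size stand-in for the resource constraint), of the streaming update scheme this abstract describes: the input stream is split into batches, the density estimate is updated batch by batch with earlier batches processed before later ones, and the result can then be queried for the probability of encountering a value.

```python
import numpy as np

class StreamingDensityEstimator:
    def __init__(self, bins: np.ndarray, batch_size: int):
        self.bins = bins                       # histogram bin edges
        self.batch_size = batch_size           # stands in for the resource constraint
        self.counts = np.zeros(len(bins) - 1)
        self.total = 0

    def process_stream(self, samples: np.ndarray):
        """Partition the stream into batches and update the PDF after each batch."""
        for start in range(0, len(samples), self.batch_size):
            batch = samples[start:start + self.batch_size]
            hist, _ = np.histogram(batch, bins=self.bins)
            self.counts += hist
            self.total += len(batch)

    def probability(self, value: float) -> float:
        """Estimated probability of encountering the bin that contains `value`."""
        idx = np.searchsorted(self.bins, value, side="right") - 1
        if idx < 0 or idx >= len(self.counts) or self.total == 0:
            return 0.0
        return self.counts[idx] / self.total

rng = np.random.default_rng(1)
est = StreamingDensityEstimator(bins=np.linspace(-4, 4, 33), batch_size=256)
est.process_stream(rng.normal(size=1000))   # continuous stream of samples
print(est.probability(0.1))                 # probability of that data value's bin
```
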
  • Publication number: 20230059200
    Abstract: An apparatus includes an in-memory compute circuit that includes a memory circuit configured to generate a set of products by combining received input values with respective weight values stored in rows of the memory circuit, and to combine the set of products to generate an accumulated output value. The in-memory compute circuit may further include a control circuit and a plurality of routing circuits, including a first routing circuit coupled to a first set of rows of the memory circuit. The control circuit may be configured to cause the first routing circuit to route groups of input values to different ones of the first set of rows over a plurality of clock cycles, and the memory circuit to generate, on a clock cycle following the plurality of clock cycles, a particular accumulated output value that is computed based on the routed groups of input values.
    Type: Application
    Filed: August 19, 2021
    Publication date: February 23, 2023
    Inventors: Paolo Di Febbo, Mohamed H. Abu-Rahma, Jelam K. Parekh, Yildiz Sinangil, Mohammad Ghasemzadeh, Anthony Ghannoum, Chaminda N. Vidanagamachchi
  • Patent number: 11386326
    Abstract: A method may include transforming a trained machine learning model, including by replacing at least one layer of the trained machine learning model with a dictionary matrix and a coefficient matrix. The dictionary matrix and the coefficient matrix may be formed by decomposing a weight matrix associated with the at least one layer of the trained machine learning model. A product of the dictionary matrix and the coefficient matrix may form a reduced-dimension representation of the weight matrix associated with the at least one layer of the trained machine learning model. The transformed machine learning model may be deployed to a client. Related systems and computer program products are also provided.
    Type: Grant
    Filed: June 11, 2019
    Date of Patent: July 12, 2022
    Assignee: The Regents of the University of California
    Inventors: Fang Lin, Mohammad Ghasemzadeh, Bita Darvish Rouhani, Farinaz Koushanfar
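
As a hedged illustration of the layer factorization this abstract describes, the sketch below uses a truncated SVD as a stand-in for whatever decomposition the patent actually employs: the layer's weight matrix W is replaced by a dictionary matrix D and a coefficient matrix C whose product is a reduced-dimension approximation of W.

```python
import numpy as np

def factorize_layer(W: np.ndarray, rank: int):
    """Decompose W (out_dim x in_dim) into D (out_dim x rank) and C (rank x in_dim)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    D = U[:, :rank] * s[:rank]        # dictionary matrix
    C = Vt[:rank, :]                  # coefficient matrix
    return D, C

rng = np.random.default_rng(2)
W = rng.normal(size=(256, 512))       # weight matrix of one layer
D, C = factorize_layer(W, rank=32)

x = rng.normal(size=512)
y_full = W @ x                        # original layer
y_factored = D @ (C @ x)              # two smaller matmuls replace one large one
print(np.linalg.norm(y_full - y_factored) / np.linalg.norm(y_full))
```
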
  • Publication number: 20220108155
    Abstract: Embodiments relate to a neural processor circuit that may include a fetch circuit that fetches coefficient data of a machine learning model from a memory source. The neural processor circuit may also include one or more neural engine circuits that are coupled to the fetch circuit. A neural engine circuit may include a buffer circuit that stores the coefficient data. The neural engine circuit may also include a coefficient organizing circuit that generates at least a first mapping and a second mapping of the stored coefficient data according to one or more control signals. The neural engine circuit may also include a computation circuit that receives and processes at least a portion of input data with the coefficient data as mapped according to the first mapping, or processes at least the portion of the input data with the coefficient data as mapped according to the second mapping.
    Type: Application
    Filed: October 7, 2020
    Publication date: April 7, 2022
    Inventors: Waleed Abdulla, Paolo Di Febbo, Mohammad Ghasemzadeh, Yohan Rajan
  • Publication number: 20210166106
    Abstract: A method may include training, based at least on a training dataset, a machine learning model. The machine learning model may include a neuron configured to generate an output by applying, to one or more inputs to the neuron, an activation function. The output of the activation function may be subject to a multi-level binarization function configured to generate an estimate of the output. The estimate of the output may include a first bit providing a first binary representation of the output and a second bit providing a second binary representation of a first residual error associated with the first binary representation of the output. In response to determining that the training of the machine learning model is complete, the trained machine learning model may be deployed to perform a cognitive task. Related systems and articles of manufacture, including computer program products, are also provided.
    Type: Application
    Filed: December 12, 2018
    Publication date: June 3, 2021
    Inventors: Mohammad Ghasemzadeh, Farinaz Koushanfar, Mohammad Samragh Razlighi
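
A minimal sketch of the residual (multi-level) binarization this abstract describes: the first bit encodes the sign of the activation output, and the second bit encodes the sign of the residual error left by the first level. The per-level scale factors used here (mean absolute value) are an assumption for illustration, not necessarily the patent's choice.

```python
import numpy as np

def two_level_binarize(x: np.ndarray):
    """Approximate x with two sign bits plus per-level scale factors."""
    alpha1 = np.mean(np.abs(x))            # scale for the first level
    b1 = np.sign(x)                        # first bit: sign of the output
    residual = x - alpha1 * b1             # error left by the first level
    alpha2 = np.mean(np.abs(residual))     # scale for the second level
    b2 = np.sign(residual)                 # second bit: sign of the residual
    estimate = alpha1 * b1 + alpha2 * b2
    return estimate, (b1, b2), (alpha1, alpha2)

rng = np.random.default_rng(3)
activations = np.tanh(rng.normal(size=1000))     # example activation outputs
est, bits, scales = two_level_binarize(activations)
print(np.mean((activations - est) ** 2))         # error remaining after two levels
```
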
  • Publication number: 20200027016
    Abstract: A method for hardware-based machine learning acceleration is provided. The method may include partitioning, into a first batch of data and a second batch of data, input data received at a hardware accelerator implementing a machine learning model. The input data may be a continuous stream of data samples. The input data may be partitioned based at least on a resource constraint of the hardware accelerator. An update of a probability density function associated with the machine learning model may be performed in real time. The probability density function may be updated by at least processing, by the hardware accelerator, the first batch of data before the second batch of data. An output may be generated based at least on the updated probability density function. The output may include a probability of encountering a data value. Related systems and articles of manufacture, including computer program products, are also provided.
    Type: Application
    Filed: January 31, 2018
    Publication date: January 23, 2020
    Inventors: Bita Darvish Rouhani, Mohammad Ghasemzadeh, Farinaz Koushanfar
  • Publication number: 20190378015
    Abstract: A method may include transforming a trained machine learning model, including by replacing at least one layer of the trained machine learning model with a dictionary matrix and a coefficient matrix. The dictionary matrix and the coefficient matrix may be formed by decomposing a weight matrix associated with the at least one layer of the trained machine learning model. A product of the dictionary matrix and the coefficient matrix may form a reduced-dimension representation of the weight matrix associated with the at least one layer of the trained machine learning model. The transformed machine learning model may be deployed to a client. Related systems and computer program products are also provided.
    Type: Application
    Filed: June 11, 2019
    Publication date: December 12, 2019
    Inventors: Fang Lin, Mohammad Ghasemzadeh, Bita Darvish Rouhani, Farinaz Koushanfar