Patents by Inventor Mohammad Ghasemzadeh
Mohammad Ghasemzadeh has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
- Patent number: 12230361
  Abstract: An apparatus includes an in-memory compute circuit that includes a memory circuit configured to generate a set of products by combining received input values with respective weight values stored in rows of the memory circuit, and to combine the set of products to generate an accumulated output value. The in-memory compute circuit may further include a control circuit and a plurality of routing circuits, including a first routing circuit coupled to a first set of rows of the memory circuit. The control circuit may be configured to cause the first routing circuit to route groups of input values to different ones of the first set of rows over a plurality of clock cycles, and the memory circuit to generate, on a clock cycle following the plurality of clock cycles, a particular accumulated output value that is computed based on the routed groups of input values. (An illustrative software sketch of this route-then-accumulate scheme appears after the listing.)
  Type: Grant
  Filed: July 3, 2023
  Date of Patent: February 18, 2025
  Assignee: Apple Inc.
  Inventors: Paolo Di Febbo, Mohamed H. Abu-Rahma, Jelam K. Parekh, Yildiz Sinangil, Mohammad Ghasemzadeh, Anthony Ghannoum, Chaminda N. Vidanagamachchi
- Publication number: 20250021808
  Abstract: Embodiments relate to a neural processor circuit that may include a fetch circuit that fetches coefficient data of a machine learning model from a memory source. The neural processor circuit may also include one or more neural engine circuits that are coupled to the fetch circuit. A neural engine circuit may include a buffer circuit that stores the coefficient data. The neural engine circuit may also include a coefficient organizing circuit that generates at least a first mapping and a second mapping of the stored coefficient data according to one or more control signals. The neural engine may also include a computation circuit that receives and processes at least a portion of input data with the coefficient data as mapped according to the first mapping, or processes at least the portion of the input data with the coefficient data as mapped according to the second mapping. (An illustrative sketch of selecting between coefficient mappings appears after the listing.)
  Type: Application
  Filed: October 1, 2024
  Publication date: January 16, 2025
  Applicant: Apple Inc.
  Inventors: Waleed Abdulla, Paolo Di Febbo, Mohammad Ghasemzadeh, Yohan Rajan
- Patent number: 12141679
  Abstract: Embodiments relate to a neural processor circuit that may include a fetch circuit that fetches coefficient data of a machine learning model from a memory source. The neural processor circuit may also include one or more neural engine circuits that are coupled to the fetch circuit. A neural engine circuit may include a buffer circuit that stores the coefficient data. The neural engine circuit may also include a coefficient organizing circuit that generates at least a first mapping and a second mapping of the stored coefficient data according to one or more control signals. The neural engine may also include a computation circuit that receives and processes at least a portion of input data with the coefficient data as mapped according to the first mapping, or processes at least the portion of the input data with the coefficient data as mapped according to the second mapping.
  Type: Grant
  Filed: October 7, 2020
  Date of Patent: November 12, 2024
  Assignee: Apple Inc.
  Inventors: Waleed Abdulla, Paolo Di Febbo, Mohammad Ghasemzadeh, Yohan Rajan
- Publication number: 20240005972
  Abstract: An apparatus includes an in-memory compute circuit that includes a memory circuit configured to generate a set of products by combining received input values with respective weight values stored in rows of the memory circuit, and to combine the set of products to generate an accumulated output value. The in-memory compute circuit may further include a control circuit and a plurality of routing circuits, including a first routing circuit coupled to a first set of rows of the memory circuit. The control circuit may be configured to cause the first routing circuit to route groups of input values to different ones of the first set of rows over a plurality of clock cycles, and the memory circuit to generate, on a clock cycle following the plurality of clock cycles, a particular accumulated output value that is computed based on the routed groups of input values.
  Type: Application
  Filed: July 3, 2023
  Publication date: January 4, 2024
  Inventors: Paolo Di Febbo, Mohamed H. Abu-Rahma, Jelam K. Parekh, Yildiz Sinangil, Mohammad Ghasemzadeh, Anthony Ghannoum, Chaminda N. Vidanagamachchi
- Patent number: 11694733
  Abstract: An apparatus includes an in-memory compute circuit that includes a memory circuit configured to generate a set of products by combining received input values with respective weight values stored in rows of the memory circuit, and to combine the set of products to generate an accumulated output value. The in-memory compute circuit may further include a control circuit and a plurality of routing circuits, including a first routing circuit coupled to a first set of rows of the memory circuit. The control circuit may be configured to cause the first routing circuit to route groups of input values to different ones of the first set of rows over a plurality of clock cycles, and the memory circuit to generate, on a clock cycle following the plurality of clock cycles, a particular accumulated output value that is computed based on the routed groups of input values.
  Type: Grant
  Filed: August 19, 2021
  Date of Patent: July 4, 2023
  Assignee: Apple Inc.
  Inventors: Paolo Di Febbo, Mohamed H. Abu-Rahma, Jelam K. Parekh, Yildiz Sinangil, Mohammad Ghasemzadeh, Anthony Ghannoum, Chaminda N. Vidanagamachchi
- Patent number: 11651260
  Abstract: A method for hardware-based machine learning acceleration is provided. The method may include partitioning, into a first batch of data and a second batch of data, input data received at a hardware accelerator implementing a machine learning model. The input data may be a continuous stream of data samples. The input data may be partitioned based at least on a resource constraint of the hardware accelerator. An update of a probability density function associated with the machine learning model may be performed in real time. The probability density function may be updated by at least processing, by the hardware accelerator, the first batch of data before the second batch of data. An output may be generated based at least on the updated probability density function. The output may include a probability of encountering a data value. Related systems and articles of manufacture, including computer program products, are also provided. (An illustrative streaming sketch of this batched density update appears after the listing.)
  Type: Grant
  Filed: January 31, 2018
  Date of Patent: May 16, 2023
  Assignee: The Regents of the University of California
  Inventors: Bita Darvish Rouhani, Mohammad Ghasemzadeh, Farinaz Koushanfar
- Publication number: 20230059200
  Abstract: An apparatus includes an in-memory compute circuit that includes a memory circuit configured to generate a set of products by combining received input values with respective weight values stored in rows of the memory circuit, and to combine the set of products to generate an accumulated output value. The in-memory compute circuit may further include a control circuit and a plurality of routing circuits, including a first routing circuit coupled to a first set of rows of the memory circuit. The control circuit may be configured to cause the first routing circuit to route groups of input values to different ones of the first set of rows over a plurality of clock cycles, and the memory circuit to generate, on a clock cycle following the plurality of clock cycles, a particular accumulated output value that is computed based on the routed groups of input values.
  Type: Application
  Filed: August 19, 2021
  Publication date: February 23, 2023
  Inventors: Paolo Di Febbo, Mohamed H. Abu-Rahma, Jelam K. Parekh, Yildiz Sinangil, Mohammad Ghasemzadeh, Anthony Ghannoum, Chaminda N. Vidanagamachchi
- Patent number: 11386326
  Abstract: A method may include transforming a trained machine learning model, including by replacing at least one layer of the trained machine learning model with a dictionary matrix and a coefficient matrix. The dictionary matrix and the coefficient matrix may be formed by decomposing a weight matrix associated with the at least one layer of the trained machine learning model. A product of the dictionary matrix and the coefficient matrix may form a reduced-dimension representation of the weight matrix associated with the at least one layer of the trained machine learning model. The transformed machine learning model may be deployed to a client. Related systems and computer program products are also provided. (An illustrative factorization sketch appears after the listing.)
  Type: Grant
  Filed: June 11, 2019
  Date of Patent: July 12, 2022
  Assignee: The Regents of the University of California
  Inventors: Fang Lin, Mohammad Ghasemzadeh, Bita Darvish Rouhani, Farinaz Koushanfar
- Publication number: 20220108155
  Abstract: Embodiments relate to a neural processor circuit that may include a fetch circuit that fetches coefficient data of a machine learning model from a memory source. The neural processor circuit may also include one or more neural engine circuits that are coupled to the fetch circuit. A neural engine circuit may include a buffer circuit that stores the coefficient data. The neural engine circuit may also include a coefficient organizing circuit that generates at least a first mapping and a second mapping of the stored coefficient data according to one or more control signals. The neural engine may also include a computation circuit that receives and processes at least a portion of input data with the coefficient data as mapped according to the first mapping, or processes at least the portion of the input data with the coefficient data as mapped according to the second mapping.
  Type: Application
  Filed: October 7, 2020
  Publication date: April 7, 2022
  Inventors: Waleed Abdulla, Paolo Di Febbo, Mohammad Ghasemzadeh, Yohan Rajan
- Publication number: 20210166106
  Abstract: A method may include training, based on a training dataset, a machine learning model. The machine learning model may include a neuron configured to generate an output by applying, to one or more inputs to the neuron, an activation function. The output of the activation function may be subject to a multi-level binarization function configured to generate an estimate of the output. The estimate of the output may include a first bit providing a first binary representation of the output and a second bit providing a second binary representation of a first residual error associated with the first binary representation of the output. In response to determining that the training of the machine learning model is complete, the trained machine learning model may be deployed to perform a cognitive task. Related systems and articles of manufacture, including computer program products, are also provided. (An illustrative residual-binarization sketch appears after the listing.)
  Type: Application
  Filed: December 12, 2018
  Publication date: June 3, 2021
  Inventors: Mohammad Ghasemzadeh, Farinaz Koushanfar, Mohammad Samragh Razlighi
- Publication number: 20200027016
  Abstract: A method for hardware-based machine learning acceleration is provided. The method may include partitioning, into a first batch of data and a second batch of data, input data received at a hardware accelerator implementing a machine learning model. The input data may be a continuous stream of data samples. The input data may be partitioned based at least on a resource constraint of the hardware accelerator. An update of a probability density function associated with the machine learning model may be performed in real time. The probability density function may be updated by at least processing, by the hardware accelerator, the first batch of data before the second batch of data. An output may be generated based at least on the updated probability density function. The output may include a probability of encountering a data value. Related systems and articles of manufacture, including computer program products, are also provided.
  Type: Application
  Filed: January 31, 2018
  Publication date: January 23, 2020
  Inventors: Bita Darvish Rouhani, Mohammad Ghasemzadeh, Farinaz Koushanfar
- Publication number: 20190378015
  Abstract: A method may include transforming a trained machine learning model, including by replacing at least one layer of the trained machine learning model with a dictionary matrix and a coefficient matrix. The dictionary matrix and the coefficient matrix may be formed by decomposing a weight matrix associated with the at least one layer of the trained machine learning model. A product of the dictionary matrix and the coefficient matrix may form a reduced-dimension representation of the weight matrix associated with the at least one layer of the trained machine learning model. The transformed machine learning model may be deployed to a client. Related systems and computer program products are also provided.
  Type: Application
  Filed: June 11, 2019
  Publication date: December 12, 2019
  Inventors: Fang Lin, Mohammad Ghasemzadeh, Bita Darvish Rouhani, Farinaz Koushanfar
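The sketches below are illustrative software models keyed to the abstracts above; they are not the patented circuits or claimed methods, and all class, function, and parameter names are assumptions. For the in-memory compute entries (patent numbers 12230361 and 11694733 and the related publications), here is a minimal behavioral sketch assuming four rows of two weights each, split into two routing groups, with a toy per-cycle routing schedule.

```python
import numpy as np

class InMemoryComputeModel:
    """Behavioral stand-in for the in-memory compute array: weights live in
    rows, routed inputs are latched per row, and a later cycle accumulates."""

    def __init__(self, weights, rows_per_group):
        # One weight vector is stored per row of the array.
        self.weights = np.asarray(weights, dtype=float)
        self.rows_per_group = rows_per_group
        self.latched_inputs = np.zeros_like(self.weights)

    def route_cycle(self, group_index, row_in_group, input_values):
        # Model one clock cycle: the routing circuit for this group drives a
        # group of input values onto one row of its set of rows.
        row = group_index * self.rows_per_group + row_in_group
        self.latched_inputs[row] = input_values

    def accumulate(self):
        # Model the cycle after routing: each row multiplies its stored
        # weights by the inputs routed to it, and all products are combined
        # into a single accumulated output value.
        return float(np.sum(self.weights * self.latched_inputs))

# Usage: route inputs to two rows over two cycles, then accumulate.
model = InMemoryComputeModel(weights=np.arange(8).reshape(4, 2), rows_per_group=2)
model.route_cycle(group_index=0, row_in_group=0, input_values=[1.0, 2.0])
model.route_cycle(group_index=1, row_in_group=1, input_values=[0.5, 0.5])
print(model.accumulate())  # weighted sum over the routed rows
```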
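For the coefficient-organizing entries (patent number 12141679 and publications 20250021808 and 20220108155), here is a minimal sketch of choosing between two mappings of buffered coefficient data before a multiply-accumulate step. The specific mappings used here (a row-major view versus a transposed view) and the function names are assumptions for illustration only.

```python
import numpy as np

def fetch_coefficients(memory, offset, count):
    # Stand-in for the fetch circuit: read coefficient data from a memory
    # source into the engine's coefficient buffer.
    return np.array(memory[offset:offset + count], dtype=float)

def organize(buffer, control_signal):
    # Stand-in for the coefficient organizing circuit: produce one of two
    # mappings of the same buffered coefficients, selected by a control signal.
    if control_signal == "mapping_1":
        return buffer.reshape(2, 4)      # first mapping: row-major view
    return buffer.reshape(4, 2).T        # second mapping: reordered view

def compute(inputs, mapped_coefficients):
    # Stand-in for the computation circuit: multiply-accumulate the input
    # data with the coefficients as currently mapped.
    return mapped_coefficients @ inputs

memory = list(range(100))                # toy memory source
buf = fetch_coefficients(memory, offset=8, count=8)
x = np.ones(4)                           # toy input data
print(compute(x, organize(buf, "mapping_1")))
print(compute(x, organize(buf, "mapping_2")))
```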
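For the hardware-based machine learning acceleration entries (patent number 11651260 and publication 20200027016), here is a streaming sketch of partitioning an input stream into batches and folding each batch into a running probability density estimate in arrival order. The histogram estimator and the fixed batch size standing in for the accelerator's resource constraint are illustrative choices, not the claimed method.

```python
import numpy as np

class StreamingDensityEstimator:
    """Histogram-based density estimate updated batch by batch."""

    def __init__(self, bins, value_range, batch_size):
        self.edges = np.linspace(value_range[0], value_range[1], bins + 1)
        self.counts = np.zeros(bins)
        self.batch_size = batch_size     # stands in for the resource constraint

    def consume(self, stream):
        # Partition the incoming stream into batches and fold each batch into
        # the running histogram in arrival order (first batch before second).
        stream = np.asarray(stream, dtype=float)
        for start in range(0, len(stream), self.batch_size):
            batch = stream[start:start + self.batch_size]
            hist, _ = np.histogram(batch, bins=self.edges)
            self.counts += hist

    def probability(self, value):
        # Probability mass of the bin containing `value` under the current
        # (continuously updated) density estimate.
        total = self.counts.sum()
        if total == 0:
            return 0.0
        bin_index = int(np.clip(np.searchsorted(self.edges, value) - 1,
                                0, len(self.counts) - 1))
        return float(self.counts[bin_index] / total)

rng = np.random.default_rng(0)
estimator = StreamingDensityEstimator(bins=16, value_range=(-4.0, 4.0), batch_size=256)
estimator.consume(rng.normal(size=1000))   # stand-in for the continuous data stream
print(estimator.probability(0.0))          # probability of encountering a value near 0
```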
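For the dictionary/coefficient decomposition entries (patent number 11386326 and publication 20190378015), here is a hedged sketch of replacing a layer's weight matrix W with a dictionary matrix D and a coefficient matrix C whose product approximates W. A truncated SVD stands in for the factorization; the claimed method may form the factors differently.

```python
import numpy as np

def decompose_layer(weights, rank):
    # Factor an (out_features x in_features) weight matrix into a dictionary
    # matrix D (out_features x rank) and a coefficient matrix C (rank x in_features).
    u, s, vt = np.linalg.svd(weights, full_matrices=False)
    dictionary = u[:, :rank] * s[:rank]
    coefficients = vt[:rank, :]
    return dictionary, coefficients

def layer_forward(dictionary, coefficients, x):
    # The transformed layer applies C and then D, so the full weight matrix
    # is never materialized at inference time.
    return dictionary @ (coefficients @ x)

rng = np.random.default_rng(1)
w = rng.normal(size=(64, 128))            # toy layer weight matrix
d, c = decompose_layer(w, rank=16)
x = rng.normal(size=128)
print(d.size + c.size, "parameters vs", w.size)          # reduced-dimension representation
print(np.linalg.norm(layer_forward(d, c, x) - w @ x))    # approximation error on one input
```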
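For the multi-level binarization entry (publication 20210166106), here is a minimal sketch in which a first binary plane estimates an activation vector and a second binary plane estimates the residual error left by the first. The per-level mean-absolute scaling is an illustrative choice.

```python
import numpy as np

def residual_binarize(activations, levels=2):
    # First level: a binary plane (+1/-1) scaled to estimate the activations.
    # Each later level binarizes the residual error left by the previous ones.
    residual = np.asarray(activations, dtype=float).copy()
    estimate = np.zeros_like(residual)
    bit_planes, scales = [], []
    for _ in range(levels):
        scale = np.mean(np.abs(residual))    # per-level scaling factor (illustrative)
        bits = np.sign(residual)             # one binary plane
        estimate += scale * bits
        residual -= scale * bits             # error the next level must capture
        bit_planes.append(bits)
        scales.append(scale)
    return estimate, bit_planes, scales

x = np.array([0.9, -0.2, 0.4, -1.3])        # toy activation outputs
approx, planes, gammas = residual_binarize(x, levels=2)
print(approx)                                # two-level binary estimate of the output
print(np.abs(x - approx).mean())             # remaining error after two levels
```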