Patents by Inventor Reza Ghasemi

Reza Ghasemi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Architecture of crossbar of inference engine

Patent number: 11256517

Abstract: A programmable hardware system for machine learning (ML) includes a core and an inference engine. The core receives commands from a host. The commands are in a first instruction set architecture (ISA) format. The core divides the commands into a first set for performance-critical operations, in the first ISA format, and a second set for performance non-critical operations, in the first ISA format. The core executes the second set to perform the performance non-critical operations of the ML operations and streams the first set to the inference engine. The inference engine generates a stream of the first set of commands in a second ISA format based on the first set of commands in the first ISA format. The first set of commands in the second ISA format programs components within the inference engine to execute the ML operations to infer data.

Type: Grant

Filed: December 19, 2018

Date of Patent: February 22, 2022

Assignee: Marvell Asia Pte Ltd

Inventors: Avinash Sodani, Ulf Hanebutte, Senad Durakovic, Hamid Reza Ghasemi, Chia-Hsin Chen
Single instruction set architecture (ISA) format for multiple ISAS in machine learning inference engine

Patent number: 11086633

Abstract: A programmable hardware system for machine learning (ML) includes a core and an inference engine. The core receives commands from a host. The commands are in a first instruction set architecture (ISA) format. The core divides the commands into a first set for performance-critical operations, in the first ISA format, and a second set for performance non-critical operations, in the first ISA format. The core executes the second set to perform the performance non-critical operations of the ML operations and streams the first set to the inference engine. The inference engine generates a stream of the first set of commands in a second ISA format based on the first set of commands in the first ISA format. The first set of commands in the second ISA format programs components within the inference engine to execute the ML operations to infer data.

Type: Grant

Filed: December 19, 2018

Date of Patent: August 10, 2021

Assignee: Marvell Asia Pte, Ltd.

Inventors: Avinash Sodani, Ulf Hanebutte, Senad Durakovic, Hamid Reza Ghasemi, Chia-Hsin Chen
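
The command flow described in the abstract of patent 11086633 can be illustrated with a minimal sketch. Everything named here (`Command`, `partition`, `translate_to_engine_format`, `run`, and the `ENGINE::` prefix) is hypothetical and chosen only for illustration; the abstract does not specify how commands are classified or how the second ISA format is encoded.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Command:
    """A host command in the first ISA format (hypothetical representation)."""
    opcode: str
    performance_critical: bool


def partition(commands: Iterable[Command]) -> Tuple[List[Command], List[Command]]:
    """Split commands into (performance-critical, performance non-critical) sets."""
    critical, non_critical = [], []
    for command in commands:
        (critical if command.performance_critical else non_critical).append(command)
    return critical, non_critical


def translate_to_engine_format(command: Command) -> str:
    """Stand-in for the inference engine's translation into a second ISA format.

    The 'ENGINE::' prefix is an assumption made purely for this sketch.
    """
    return f"ENGINE::{command.opcode.upper()}"


def run(commands: Iterable[Command]) -> Tuple[List[str], List[str]]:
    """Return (opcodes executed on the core, translated stream for the engine)."""
    critical, non_critical = partition(commands)
    executed_on_core = [c.opcode for c in non_critical]
    engine_stream = [translate_to_engine_format(c) for c in critical]
    return executed_on_core, engine_stream


core_ops, engine_ops = run([
    Command("matmul", True),
    Command("log", False),
    Command("conv", True),
])
```

In this sketch the core keeps the non-critical command for itself, while the two critical commands reach the engine only after translation, mirroring the two-format split the abstract describes.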
Architecture for irregular operations in machine learning inference engine

Patent number: 11029963

Abstract: A processing unit of an inference engine for machine learning (ML) includes a first data load streamer, a second data load streamer, an operator component, and a store streamer. The first data load streamer streams a first data stream from an on-chip memory (OCM) to the operator component. The second data load streamer streams a second data stream from the OCM to the operator component. The operator component performs a matrix operation on the first data stream and the second data stream. The store streamer receives a data output stream from the operator component and stores the data output stream in a buffer.

Type: Grant

Filed: December 19, 2018

Date of Patent: June 8, 2021

Assignee: Marvell Asia Pte, Ltd.

Inventors: Avinash Sodani, Ulf Hanebutte, Senad Durakovic, Hamid Reza Ghasemi, Chia-Hsin Chen, Rishan Tan
Systems and methods for programmable hardware architecture for machine learning

Patent number: 10970080

Abstract: A programmable hardware architecture for machine learning (ML) is proposed, which includes at least a host, a memory, a core, a data streaming engine, an instruction-streaming engine, and an inference engine. The core interprets a plurality of ML commands for an ML operation and/or data received from the host and coordinates activities of the engines based on the data in the received ML commands. The instruction-streaming engine translates the ML commands received from the core and provides a set of programming instructions to the data streaming engine and the inference engine based on the translated parameters. The data streaming engine sends one or more data streams to the inference engine in response to the received programming instructions. The inference engine then processes the data streams received from the data streaming engine according to the programming instructions received from the instruction-streaming engine.

Type: Grant

Filed: November 9, 2018

Date of Patent: April 6, 2021

Assignee: Marvell Asia Pte, Ltd.

Inventors: Avinash Sodani, Chia-Hsin Chen, Ulf R. Hanebutte, Hamid Reza Ghasemi, Senad Durakovic
ARRAY-BASED INFERENCE ENGINE FOR MACHINE LEARNING

Publication number: 20210055934

Abstract: An array-based inference engine includes a plurality of processing tiles arranged in a two-dimensional array of a plurality of rows and a plurality of columns. Each processing tile comprises at least one or more of an on-chip memory (OCM) configured to load and maintain data from the input data stream for local access by components in the processing tile and further configured to maintain and output result of the ML operation performed by the processing tile as an output data stream. The array includes a first processing unit (POD) configured to perform a dense and/or regular computation task of the ML operation on the data in the OCM. The array also includes a second processing unit/element (PE) configured to perform a sparse and/or irregular computation task of the ML operation on the data in the OCM and/or from the POD.

Type: Application

Filed: October 2, 2020

Publication date: February 25, 2021

Inventors: Avinash Sodani, Ulf Hanebutte, Senad Durakovic, Hamid Reza Ghasemi, Chia-Hsin Chen
Architecture for dense operations in machine learning inference engine

Patent number: 10896045

Abstract: A processing unit of an inference engine for machine learning (ML) includes a first, a second, and a third register, and a matrix multiplication block. The first register receives a first stream of data associated with a first matrix data that is read only once. The second register receives a second stream of data associated with a second matrix data that is read only once. The matrix multiplication block performs a multiplication operation based on data from the first register and the second register resulting in an output matrix. A row associated with the first matrix is maintained while rows associated with the second matrix are fed to the matrix multiplication block to perform a multiplication operation. The process is repeated for each row of the first matrix. The third register receives the output matrix from the matrix multiplication block and stores the output matrix.

Type: Grant

Filed: December 19, 2018

Date of Patent: January 19, 2021

Assignee: Marvell Asia Pte, Ltd.

Inventors: Avinash Sodani, Ulf Hanebutte, Senad Durakovic, Hamid Reza Ghasemi, Chia-Hsin Chen
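
The row-holding scheme in the abstract of patent 10896045 can be sketched in software. This is only an illustration under stated assumptions, not the patented hardware: the function name `row_stationary_matmul` is hypothetical, and the sketch assumes each fed row of the second matrix is combined with the held row by a dot product, which means the second matrix is supplied in transposed form.

```python
from typing import List

Matrix = List[List[float]]


def row_stationary_matmul(first: Matrix, second_rows: Matrix) -> Matrix:
    """Hold one row of `first` while every row of `second_rows` is fed past it.

    Assumption for this sketch: each fed row is a column of the right-hand
    operand, so the result equals first @ transpose(second_rows).
    """
    output: Matrix = []
    for held_row in first:  # the row that is maintained
        out_row = []
        for fed_row in second_rows:  # rows streamed to the multiplication block
            if len(fed_row) != len(held_row):
                raise ValueError("row lengths must match")
            out_row.append(sum(x * y for x, y in zip(held_row, fed_row)))
        output.append(out_row)  # repeat for each row of the first matrix
    return output


a = [[1, 2], [3, 4]]
# Rows below are the columns of B = [[5, 6], [7, 8]].
b_transposed = [[5, 7], [6, 8]]
result = row_stationary_matmul(a, b_transposed)  # [[19, 22], [43, 50]]
```

Keeping one operand row fixed across the inner loop is what lets that row be fetched once per output row, which is the software analogue of the reuse the abstract describes.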
Array-based inference engine for machine learning

Patent number: 10824433

Abstract: An array-based inference engine includes a plurality of processing tiles arranged in a two-dimensional array of a plurality of rows and a plurality of columns. Each processing tile comprises at least one or more of an on-chip memory (OCM) configured to load and maintain data from the input data stream for local access by components in the processing tile and further configured to maintain and output result of the ML operation performed by the processing tile as an output data stream. The array includes a first processing unit (POD) configured to perform a dense and/or regular computation task of the ML operation on the data in the OCM. The array also includes a second processing unit/element (PE) configured to perform a sparse and/or irregular computation task of the ML operation on the data in the OCM and/or from the POD.

Type: Grant

Filed: December 19, 2018

Date of Patent: November 3, 2020

Assignee: Marvell Asia Pte, Ltd.

Inventors: Avinash Sodani, Ulf Hanebutte, Senad Durakovic, Hamid Reza Ghasemi, Chia-Hsin Chen
STREAMING ENGINE FOR MACHINE LEARNING ARCHITECTURE

Publication number: 20190244117

Abstract: A programmable hardware system for machine learning (ML) includes a core and a streaming engine. The core receives a plurality of commands and a plurality of data from a host to be analyzed and inferred via machine learning. The core transmits a first subset of the plurality of commands, corresponding to performance-critical operations, together with the associated data of the plurality of data, for efficient processing. The first subset of commands and the associated data are passed through via a function call. The streaming engine is coupled to the core and receives the first subset of commands and the associated data from the core. The streaming engine streams a second subset of commands of the first subset of commands and its associated data to an inference engine by executing a single instruction.

Type: Application

Filed: December 19, 2018

Publication date: August 8, 2019

Inventors: Avinash SODANI, Ulf HANEBUTTE, Senad DURAKOVIC, Hamid Reza GHASEMI, Chia-Hsin CHEN
ARCHITECTURE OF CROSSBAR OF INFERENCE ENGINE

Publication number: 20190244118

Abstract: A programmable hardware system for machine learning (ML) includes a core and an inference engine. The core receives commands from a host. The commands are in a first instruction set architecture (ISA) format. The core divides the commands into a first set for performance-critical operations, in the first ISA format, and a second set for performance non-critical operations, in the first ISA format. The core executes the second set to perform the performance non-critical operations of the ML operations and streams the first set to the inference engine. The inference engine generates a stream of the first set of commands in a second ISA format based on the first set of commands in the first ISA format. The first set of commands in the second ISA format programs components within the inference engine to execute the ML operations to infer data.

Type: Application

Filed: December 19, 2018

Publication date: August 8, 2019

Inventors: Avinash SODANI, Ulf HANEBUTTE, Senad DURAKOVIC, Hamid Reza GHASEMI, Chia-Hsin CHEN
SYSTEMS AND METHODS FOR PROGRAMMABLE HARDWARE ARCHITECTURE FOR MACHINE LEARNING

Publication number: 20190244141

Abstract: A programmable hardware architecture for machine learning (ML) is proposed, which includes at least a host, a memory, a core, a data streaming engine, an instruction-streaming engine, and an inference engine. The core interprets a plurality of ML commands for an ML operation and/or data received from the host and coordinates activities of the engines based on the data in the received ML commands. The instruction-streaming engine translates the ML commands received from the core and provides a set of programming instructions to the data streaming engine and the inference engine based on the translated parameters. The data streaming engine sends one or more data streams to the inference engine in response to the received programming instructions. The inference engine then processes the data streams received from the data streaming engine according to the programming instructions received from the instruction-streaming engine.

Type: Application

Filed: November 9, 2018

Publication date: August 8, 2019

Inventors: Avinash SODANI, Chia-Hsin CHEN, Ulf R. HANEBUTTE, Hamid Reza GHASEMI, Senad DURAKOVIC
ARCHITECTURE FOR DENSE OPERATIONS IN MACHINE LEARNING INFERENCE ENGINE

Publication number: 20190244130

Abstract: A processing unit of an inference engine for machine learning (ML) includes a first, a second, and a third register, and a matrix multiplication block. The first register receives a first stream of data associated with a first matrix data that is read only once. The second register receives a second stream of data associated with a second matrix data that is read only once. The matrix multiplication block performs a multiplication operation based on data from the first register and the second register resulting in an output matrix. A row associated with the first matrix is maintained while rows associated with the second matrix are fed to the matrix multiplication block to perform a multiplication operation. The process is repeated for each row of the first matrix. The third register receives the output matrix from the matrix multiplication block and stores the output matrix.

Type: Application

Filed: December 19, 2018

Publication date: August 8, 2019

Inventors: Avinash SODANI, Ulf HANEBUTTE, Senad DURAKOVIC, Hamid Reza GHASEMI, Chia-Hsin CHEN
SINGLE INSTRUCTION SET ARCHITECTURE (ISA) FORMAT FOR MULTIPLE ISAS IN MACHINE LEARNING INFERENCE ENGINE

Publication number: 20190243653

Abstract: A programmable hardware system for machine learning (ML) includes a core and an inference engine. The core receives commands from a host. The commands are in a first instruction set architecture (ISA) format. The core divides the commands into a first set for performance-critical operations, in the first ISA format, and a second set for performance non-critical operations, in the first ISA format. The core executes the second set to perform the performance non-critical operations of the ML operations and streams the first set to the inference engine. The inference engine generates a stream of the first set of commands in a second ISA format based on the first set of commands in the first ISA format. The first set of commands in the second ISA format programs components within the inference engine to execute the ML operations to infer data.

Type: Application

Filed: December 19, 2018

Publication date: August 8, 2019

Inventors: Avinash SODANI, Ulf HANEBUTTE, Senad DURAKOVIC, Hamid Reza GHASEMI, Chia-Hsin CHEN
ARRAY-BASED INFERENCE ENGINE FOR MACHINE LEARNING

Publication number: 20190243800

Abstract: An array-based inference engine includes a plurality of processing tiles arranged in a two-dimensional array of a plurality of rows and a plurality of columns. Each processing tile comprises at least one or more of an on-chip memory (OCM) configured to load and maintain data from the input data stream for local access by components in the processing tile and further configured to maintain and output result of the ML operation performed by the processing tile as an output data stream. The array includes a first processing unit (POD) configured to perform a dense and/or regular computation task of the ML operation on the data in the OCM. The array also includes a second processing unit/element (PE) configured to perform a sparse and/or irregular computation task of the ML operation on the data in the OCM and/or from the POD.

Type: Application

Filed: December 19, 2018

Publication date: August 8, 2019

Inventors: Avinash SODANI, Ulf HANEBUTTE, Senad DURAKOVIC, Hamid Reza GHASEMI, Chia-Hsin CHEN
ARCHITECTURE FOR IRREGULAR OPERATIONS IN MACHINE LEARNING INFERENCE ENGINE

Publication number: 20190243871

Abstract: A processing unit of an inference engine for machine learning (ML) includes a first data load streamer, a second data load streamer, an operator component, and a store streamer. The first data load streamer streams a first data stream from an on-chip memory (OCM) to the operator component. The second data load streamer streams a second data stream from the OCM to the operator component. The operator component performs a matrix operation on the first data stream and the second data stream. The store streamer receives a data output stream from the operator component and stores the data output stream in a buffer.

Type: Application

Filed: December 19, 2018

Publication date: August 8, 2019

Inventors: Avinash SODANI, Ulf HANEBUTTE, Senad DURAKOVIC, Hamid Reza GHASEMI, Chia-Hsin CHEN, Rishan TAN
Cognitive-based dynamic tuning

Patent number: 10373072

Abstract: A method, system, and computer program product for performing cognitive-based dynamic tuning of a software-based system include monitoring live operation of the system, and determining whether tuning is needed based on the monitoring. Analyzing information and suggesting a change in one or more parameters is based on the determining, the information including an output of a learning algorithm that learns an effect of changes in one or more of the one or more parameters on performance of the system.

Type: Grant

Filed: January 8, 2016

Date of Patent: August 6, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Diane Britton, Reza Ghasemi, Chon N. Lei, Robert Maher, Vanessa V. Michelini
COGNITIVE-BASED DYNAMIC TUNING

Publication number: 20170200091

Abstract: A method, system, and computer program product for performing cognitive-based dynamic tuning of a software-based system include monitoring live operation of the system, and determining whether tuning is needed based on the monitoring. Analyzing information and suggesting a change in one or more parameters is based on the determining, the information including an output of a learning algorithm that learns an effect of changes in one or more of the one or more parameters on performance of the system.

Type: Application

Filed: January 8, 2016

Publication date: July 13, 2017

Inventors: Diane Britton, Reza Ghasemi, Chon N. Lei, Robert Maher, Vanessa V. Michelini
Quality evaluation tool for dynamic voice portals

Patent number: 8050918

Abstract: A method and system for evaluating the quality of voice input recognition by a voice portal is provided. An analysis interface extracts a set of current grammars from the voice portal. A test pattern generator generates a test input for each current grammar. The test input includes a test pattern and a set of active grammars corresponding to each current grammar. The system further includes a text-to-speech engine for entering each test pattern into the voice server. A results collector analyzes each test pattern entered into the voice server with the speech recognition engine against the set of active grammars corresponding to the current grammar for said test pattern. A results analyzer derives a set of statistics of a quality of recognition of each current grammar.

Type: Grant

Filed: December 11, 2003

Date of Patent: November 1, 2011

Assignee: Nuance Communications, Inc.

Inventors: Reza Ghasemi, Walter Haenel
Effortless association between services in a communication system and methods thereof

Patent number: 7831656

Abstract: A communication system (100) includes a portal (110), a subscriber (108), a service processor (112), and a communication network (102-104, 107) for providing communication between the portal (110), the subscriber (108) and the service processor (112).

Type: Grant

Filed: December 29, 2004

Date of Patent: November 9, 2010

Assignee: International Business Machines Corporation

Inventors: John J. Cazzolla, Reza Ghasemi, Walter Haenel, Joseph A. Hansen
Voice enabled network communications

Patent number: 7739350

Abstract: A method of communicating with a remote user. The method can include receiving a plurality of server requests from the remote user via a communications network. The plurality of server requests can be processed in a single user session without re-authenticating the user, and can include at least one server request that includes voice data and at least one server request that includes non-audio data. A portlet can be provided to process the voice data and the non-audio data server requests. Responsive to the server requests, data can be provided to the remote user via the communications network.

Type: Grant

Filed: December 10, 2003

Date of Patent: June 15, 2010

Assignee: International Business Machines Corporation

Inventors: John J. Cazzolla, Reza Ghasemi, Walter Haenel, Joseph A. Hansen
Effortless registration with content providers and methods thereof

Patent number: 7730128

Abstract: A communication system (100) has a portal (110), a subscriber (108), a plurality of content providers (112), and a communication network for providing communication between the portal, the subscriber and the plurality of content providers. The components of the communication system are programmed to transmit to the subscriber from the portal an available selection of the plurality of content providers, select at the subscriber a select one of the plurality of content providers, and transmit content provider registration corresponding to the selected content provider from the portal to the selected content provider.

Type: Grant

Filed: August 28, 2008

Date of Patent: June 1, 2010

Assignee: International Business Machines Corporation

Inventors: Thomas E. Creamer, Reza Ghasemi, Walter Haenel
