Patents by Inventor Aliasger Zaidy

Aliasger Zaidy has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Deep neural networks compiler for a trace-based accelerator

Patent number: 11861337

Abstract: A method of compiling neural network code to executable instructions for execution by a computational acceleration system having a memory circuit and one or more acceleration circuits having a maps data buffer and a kernel data buffer is disclosed, such as for execution by an inference engine circuit architecture which includes a matrix-matrix (MM) accelerator circuit having multiple operating modes to provide a complete matrix multiplication. A representative compiling method includes generating a list of neural network layer model objects; fusing available functions and layers in the list; selecting a cooperative mode, an independent mode, or a combined cooperative and independent mode for execution; selecting a data movement mode and an ordering of computations which reduces usage of the memory circuit; generating an ordered sequence of load objects, compute objects, and store objects; and converting the ordered sequence of load objects, compute objects, and store objects into the executable instructions.

Type: Grant

Filed: August 26, 2020

Date of Patent: January 2, 2024

Assignee: Micron Technology, Inc.

Inventors: Andre Xian Ming Chang, Aliasger Zaidy, Eugenio Culurciello, Marko Vitez
Hardware accelerator for convolutional neural networks and method of operation thereof

Patent number: 11775313

Abstract: An accelerator for processing of a convolutional neural network (CNN) includes a compute core having a plurality of compute units. Each compute unit includes a first memory cache configured to store at least one vector in a map trace, a second memory cache configured to store at least one vector in a kernel trace, and a plurality of vector multiply-accumulate units (vMACs) connected to the first and second memory caches. Each vMAC includes a plurality of multiply-accumulate units (MACs). Each MAC includes a multiplier unit configured to multiply a first word that of the at least one vector in the map trace by a second word of the at least one vector in the kernel trace to produce an intermediate product, and an adder unit that adds the intermediate product to a third word to generate a sum of the intermediate product and the third word.

Type: Grant

Filed: May 25, 2018

Date of Patent: October 3, 2023

Assignee: Purdue Research Foundation

Inventors: Eugenio Culurciello, Vinayak Gokhale, Aliasger Zaidy, Andre Chang
Inference engine circuit architecture

Patent number: 11675624

Abstract: An inference engine circuit architecture is disclosed which includes a matrix-matrix (MM) processor circuit and a MM accelerator circuit having multiple operating modes to provide a complete matrix multiplication. A representative MM accelerator circuit includes a first buffer circuit storing maps data; a first data network; multiple second buffer circuits each storing different kernel data; multiple second, serial data networks, with each coupled to a corresponding second buffer circuit; and a plurality of vector-vector (VV) acceleration circuits arranged in a plurality of arrays. Each VV acceleration circuit includes multiply and accumulate circuits; a shift register; a control multiplexer to provide a selected output, in response to a mode control word, of a bias parameter or a first accumulation sum; and a second adder circuit which adds the multiplicative product to the bias parameter or to the first accumulation sum to generate a second or next accumulation sum.

Type: Grant

Filed: March 29, 2020

Date of Patent: June 13, 2023

Assignee: Micron Technology, Inc.

Inventors: Aliasger Zaidy, Andre Xian Ming Chang, Eugenio Culurciello
Deep Neural Networks Compiler for a Trace-Based Accelerator

Publication number: 20220066760

Abstract: A method of compiling neural network code to executable instructions for execution by a computational acceleration system having a memory circuit and one or more acceleration circuits having a maps data buffer and a kernel data buffer is disclosed, such as for execution by an inference engine circuit architecture which includes a matrix-matrix (MM) accelerator circuit having multiple operating modes to provide a complete matrix multiplication. A representative compiling method includes generating a list of neural network layer model objects; fusing available functions and layers in the list; selecting a cooperative mode, an independent mode, or a combined cooperative and independent mode for execution; selecting a data movement mode and an ordering of computations which reduces usage of the memory circuit; generating an ordered sequence of load objects, compute objects, and store objects; and converting the ordered sequence of load objects, compute objects, and store objects into the executable instructions.

Type: Application

Filed: August 26, 2020

Publication date: March 3, 2022

Inventors: Andre Xian Ming Chang, Aliasger Zaidy, Eugenio Culurciello, Marko Vitez
Inference Engine Circuit Architecture

Publication number: 20210303358

Abstract: An inference engine circuit architecture is disclosed which includes a matrix-matrix (MM) processor circuit and a MM accelerator circuit having multiple operating modes to provide a complete matrix multiplication. A representative MM accelerator circuit includes a first buffer circuit storing maps data; a first data network; multiple second buffer circuits each storing different kernel data; multiple second, serial data networks, with each coupled to a corresponding second buffer circuit; and a plurality of vector-vector (VV) acceleration circuits arranged in a plurality of arrays. Each VV acceleration circuit includes multiply and accumulate circuits; a shift register; a control multiplexer to provide a selected output, in response to a mode control word, of a bias parameter or a first accumulation sum; and a second adder circuit which adds the multiplicative product to the bias parameter or to the first accumulation sum to generate a second or next accumulation sum.

Type: Application

Filed: March 29, 2020

Publication date: September 30, 2021

Inventors: Aliasger Zaidy, Andre Xian Ming Chang, Eugenio Culurciello
Hardware Accelerator for Convolutional Neural Networks and Method of Operation Thereof

Publication number: 20180341495

Abstract: An accelerator for processing of a convolutional neural network (CNN) includes a compute core having a plurality of compute units. Each compute unit includes a first memory cache configured to store at least one vector in a map trace, a second memory cache configured to store at least one vector in a kernel trace, and a plurality of vector multiply-accumulate units (vMACs) connected to the first and second memory caches. Each vMAC includes a plurality of multiply-accumulate units (MACs). Each MAC includes a multiplier unit configured to multiply a first word that of the at least one vector in the map trace by a second word of the at least one vector in the kernel trace to produce an intermediate product, and an adder unit that adds the intermediate product to a third word to generate a sum of the intermediate product and the third word.

Type: Application

Filed: May 25, 2018

Publication date: November 29, 2018

Inventors: Eugenio Culurciello, Vinayak Gokhale, Aliasger Zaidy, Andre Chang

Deep neural networks compiler for a trace-based accelerator

Hardware accelerator for convolutional neural networks and method of operation thereof

Inference engine circuit architecture

Deep Neural Networks Compiler for a Trace-Based Accelerator

Inference Engine Circuit Architecture

Hardware Accelerator for Convolutional Neural Networks and Method of Operation Thereof