Patents by Inventor Aleksandar ZLATESKI

Aleksandar ZLATESKI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11216732
    Abstract: A system and method may generate code to be used when executing neural networks (NNs), for example convolutional neural networks (CNNs) which may include one or more convolutional layers. For at least one convolutional layer, for each non-zero element in a kernel tensor or matrix associated with the convolutional layer, instructions may be generated or issued. For example, for each non-zero element, a vector broadcast instruction may be generated, and a fused multiply-add (FMA) instruction may be generated, having as parameters a register representing a portion of the output for the convolutional layer, a register storing input data for the convolutional layer, and a register or reference to memory storing the non-zero element. The software or code produced may be executed during convolutional operations, for example as part of a larger application such as a NN inference application.
    Type: Grant
    Filed: February 18, 2021
    Date of Patent: January 4, 2022
    Assignee: NEURALMAGIC INC.
    Inventors: Aleksandar Zlateski, Justin Kopinsky
  • Publication number: 20210182676
    Abstract: A system and method may generate code to be used when executing neural networks (NNs), for example convolutional neural networks (CNNs) which may include one or more convolutional layers. For at least one convolutional layer, for each non-zero element in a kernel tensor or matrix associated with the convolutional layer, instructions may be generated or issued. For example, for each non-zero element, a vector broadcast instruction may be generated, and a fused multiply-add (FMA) instruction may be generated, having as parameters a register representing a portion of the output for the convolutional layer, a register storing input data for the convolutional layer, and a register or reference to memory storing the non-zero element. The software or code produced may be executed during convolutional operations, for example as part of a larger application such as a NN inference application.
    Type: Application
    Filed: February 18, 2021
    Publication date: June 17, 2021
    Applicant: Neuralmagic Inc.
    Inventors: Aleksandar ZLATESKI, Justin KOPINSKY
  • Patent number: 10963787
    Abstract: A system and method may generate code to be used when executing neural networks (NNs), for example convolutional neural networks (CNNs) which may include one or more convolutional layers. For at least one convolutional layer, for each non-zero element in a kernel tensor or matrix associated with the convolutional layer, instructions may be generated or issued. For example, for each non-zero element, a vector broadcast instruction may be generated, and a fused multiply-add (FMA) instruction may be generated, having as parameters a register representing a portion of the output for the convolutional layer, a register storing input data for the convolutional layer, and a register or reference to memory storing the non-zero element. The software or code produced may be executed during convolutional operations, for example as part of a larger application such as a NN inference application.
    Type: Grant
    Filed: January 24, 2020
    Date of Patent: March 30, 2021
    Assignee: NEURALMAGIC INC.
    Inventors: Aleksandar Zlateski, Justin Kopinsky
  • Publication number: 20200160182
    Abstract: A system and method of inferring a neural network (NN) on one or more target computing devices. The NN may include a plurality of layers, where at least one layer includes one or more kernels. Embodiments may include: receiving a data structure representing the NN; analyzing the data structure to produce one or more tasks, where each task may include computations pertaining to a kernel of the NN; selecting a sparse version of at least one kernel and replacing the at least one kernel with the sparse version; and compiling the one or more tasks to produce one or more respective tensor columns, The one or more tensor columns are adapted to fit in respective one or more cache memories of the one or more target computing devices, and include task instruction code that represents at least one computation of the kernel of the NN.
    Type: Application
    Filed: January 24, 2020
    Publication date: May 21, 2020
    Applicant: Neuralmagic Inc.
    Inventors: Alexander MATVEEV, Nir Shavit, Aleksandar Zlateski
  • Publication number: 20200160181
    Abstract: A system and method may generate code to be used when executing neural networks (NNs), for example convolutional neural networks (CNNs) which may include one or more convolutional layers. For at least one convolutional layer, for each non-zero element in a kernel tensor or matrix associated with the convolutional layer, instructions may be generated or issued. For example, for each non-zero element, a vector broadcast instruction may be generated, and a fused multiply-add (FMA) instruction may be generated, having as parameters a register representing a portion of the output for the convolutional layer, a register storing input data for the convolutional layer, and a register or reference to memory storing the non-zero element. The software or code produced may be executed during convolutional operations, for example as part of a larger application such as a NN inference application.
    Type: Application
    Filed: January 24, 2020
    Publication date: May 21, 2020
    Applicant: Neuralmagic Inc.
    Inventors: Aleksandar ZLATESKI, Justin KOPINSKY