Patents by Inventor Nitin Garegrat

Nitin Garegrat has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11748251
    Abstract: Embodiments of the present disclosure include systems and methods for storing tensors in memory based on depth. In some embodiments, for each of a plurality of sets of elements in a three-dimensional (3D) matrix, a position is determined along a height axis and width axis of the 3D matrix. At the determined position, a set of elements are identified along a depth axis of the 3D matrix. The set of elements are stored in a contiguous block of memory.
    Type: Grant
    Filed: January 8, 2021
    Date of Patent: September 5, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Nitin Garegrat, Shankar Narayan, Derek Gladding
  • Publication number: 20220222174
    Abstract: Embodiments of the present disclosure include systems and methods for storing tensors in memory based on depth. In some embodiments, for each of a plurality of sets of elements in a three-dimensional (3D) matrix, a position is determined along a height axis and width axis of the 3D matrix. At the determined position, a set of elements are identified along a depth axis of the 3D matrix. The set of elements are stored in a contiguous block of memory.
    Type: Application
    Filed: January 8, 2021
    Publication date: July 14, 2022
    Inventors: Nitin Garegrat, Shankar Narayan, Derek Gladding
  • Publication number: 20220222318
    Abstract: Embodiments of the present disclosure include systems and methods for performing tensor operations using a programmable control engine. A command queue is configured to receive a command from a software application. A configuration storage is configured to store a plurality of configurations. A matrix multiplication unit is configured to perform matrix multiplication operations. Memory is configured to store matrices. A control engine is configured to retrieve the command from the command queue; retrieve a configuration from the configuration storage based on the command; generate, based on the command and the configuration, instructions for the matrix multiplication unit to perform a set of matrix multiplication operations on first and second matrices stored in the memory; send the instructions to the matrix multiplication unit to configure the matrix multiplication unit to output results of the set of matrix multiplication operations; and store the results in a third matrix in the memory.
    Type: Application
    Filed: January 8, 2021
    Publication date: July 14, 2022
    Inventors: Nitin Garegrat, Derek Gladding, Shankar Narayan, Sujatha Santhanaraman, Jayadev Velagandula
  • Patent number: 10929503
    Abstract: An apparatus and method for a masked multiply instruction to support neural network pruning operations. For example, one embodiment of a processor comprises: a decoder to decode a matrix multiplication with masking (GEMM) instruction identifying a destination matrix register to store a result, and source registers storing an A-matrix, a B-matrix, and a matrix mask; execution circuitry to execute the GEMM instruction, the execution circuitry to multiply a plurality of B-matrix elements with a plurality of A-matrix elements, each of the B-matrix elements associated with a mask value in the matrix mask, wherein if the mask value is set to a first value, then the execution circuitry is to multiply the B-matrix element with one or more of the A-matrix elements to generate a first partial result, and if the mask value is set to a second value, then the execution circuitry is to multiply an alternate B-matrix element with one or more of the A-matrix elements to generate a second partial result.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: February 23, 2021
    Assignee: Intel Corporation
    Inventors: Omid Azizi, Chen Koren, Nitin Garegrat
  • Patent number: 10761757
    Abstract: An apparatus and method for converting tensor data. For example, one embodiment of a method comprises: fetching source tensor blocks of a source tensor data structure, each source tensor block comprising a plurality of source tensor data elements having a first numeric representation, wherein the source tensor data structure comprises a predefined structural arrangement of source tensor blocks; converting the one or more source tensor blocks into one or more destination tensor blocks comprising a plurality of destination tensor data elements having a second numeric representation different from the first numeric representation, wherein the sets of one or more source tensor blocks are converted to one or more corresponding destination tensor blocks in a specified order based on the first and second numeric representations; and storing each individual destination tensor block in a designated memory region to maintain coherency with the predefined structural arrangement of the source tensor blocks.
    Type: Grant
    Filed: June 30, 2018
    Date of Patent: September 1, 2020
    Assignee: Intel Corporation
    Inventors: Krishnakumar Nair, Andrew Yang, Michael Rotzin, Nitin Garegrat, Tom Schebye, Tony Werner
  • Publication number: 20190121837
    Abstract: An apparatus and method for a masked multiply instruction to support neural network pruning operations. For example, one embodiment of a processor comprises: a decoder to decode a matrix multiplication with masking (GEMM) instruction identifying a destination matrix register to store a result, and source registers storing an A-matrix, a B-matrix, and a matrix mask; execution circuitry to execute the GEMM instruction, the execution circuitry to multiply a plurality of B-matrix elements with a plurality of A-matrix elements, each of the B-matrix elements associated with a mask value in the matrix mask, wherein if the mask value is set to a first value, then the execution circuitry is to multiply the B-matrix element with one or more of the A-matrix elements to generate a first partial result, and if the mask value is set to a second value, then the execution circuitry is to multiply an alternate B-matrix element with one or more of the A-matrix elements to generate a second partial result.
    Type: Application
    Filed: December 21, 2018
    Publication date: April 25, 2019
    Inventors: Omid Azizi, Chen Koren, Nitin Garegrat
  • Publication number: 20190042094
    Abstract: An apparatus and method for converting tensor data. For example, one embodiment of a method comprises: fetching source tensor blocks of a source tensor data structure, each source tensor block comprising a plurality of source tensor data elements having a first numeric representation, wherein the source tensor data structure comprises a predefined structural arrangement of source tensor blocks; converting the one or more source tensor blocks into one or more destination tensor blocks comprising a plurality of destination tensor data elements having a second numeric representation different from the first numeric representation, wherein the sets of one or more source tensor blocks are converted to one or more corresponding destination tensor blocks in a specified order based on the first and second numeric representations; and storing each individual destination tensor block in a designated memory region to maintain coherency with the predefined structural arrangement of the source tensor blocks.
    Type: Application
    Filed: June 30, 2018
    Publication date: February 7, 2019
    Inventors: Krishnakumar Nair, Andrew Yang, Michael Rotzin, Nitin Garegrat, Tom Schebye, Tony Werner
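
Illustrative sketch: the abstracts for patent 11748251 and publication 20220222174 describe choosing a position along the height and width axes of a 3D matrix and storing the elements found along the depth axis at that position in a contiguous block of memory. The Python below is a minimal sketch of that idea only, not the patented implementation. The function names (store_depth_major, flat_offset), the nested-list indexing tensor[h][w][d], and the flat-list stand-in for memory are all assumptions made for illustration.

```python
def store_depth_major(tensor):
    """Flatten a 3D matrix so each depth run is contiguous.

    Assumes (hypothetically) that the matrix is a nested list indexed
    as tensor[h][w][d] and that a flat Python list stands in for memory.
    """
    memory = []
    for row in tensor:              # walk positions along the height axis
        for depth_elems in row:     # walk positions along the width axis
            # All elements along the depth axis at this (h, w) position
            # are appended together, so they occupy one contiguous block.
            memory.extend(depth_elems)
    return memory


def flat_offset(h, w, d, width, depth):
    """Offset of element (h, w, d) in the layout produced above."""
    return (h * width + w) * depth + d


# Small example: element values encode their own (h, w, d) coordinates.
height, width, depth = 2, 3, 4
tensor = [
    [[100 * h + 10 * w + d for d in range(depth)] for w in range(width)]
    for h in range(height)
]
memory = store_depth_major(tensor)
```

With this layout, the depth elements for position (h, w) sit at offsets flat_offset(h, w, 0, width, depth) through flat_offset(h, w, depth - 1, width, depth), so reading every channel at one spatial position touches a single contiguous run.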