Patents by Inventor Amol Ashok Ambardekar

Amol Ashok Ambardekar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11909422
    Abstract: A deep neural network (“DNN”) module compresses and decompresses neuron-generated activation data to reduce the utilization of memory bus bandwidth. The compression unit receives an uncompressed chunk of data generated by a neuron in the DNN module. The compression unit generates a mask portion and a data portion of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit receives a compressed chunk of data from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion and the data portion. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: November 11, 2022
    Date of Patent: February 20, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Joseph Leon Corkery, Benjamin Eliot Lundell, Larry Marvin Wall, Chad Balling McBride, Amol Ashok Ambardekar, George Petre, Kent D. Cedola, Boris Bobrov
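
As a rough illustration of the compression scheme described in the abstract above, the sketch below packs a chunk of activation bytes into a zero/non-zero mask plus the non-zero bytes, and reverses the process. The chunk width is an invented parameter, and the sketch stores full non-zero bytes, omitting the truncation step the abstract mentions.

```python
CHUNK_BYTES = 8  # assumed chunk width, for illustration only

def compress_chunk(chunk: bytes) -> tuple[int, bytes]:
    """Split a chunk into a mask portion and a data portion.

    Bit i of the mask is 1 when byte i of the chunk is non-zero;
    the data portion stores only the non-zero bytes, in order.
    """
    assert len(chunk) == CHUNK_BYTES
    mask = 0
    data = bytearray()
    for i, b in enumerate(chunk):
        if b != 0:
            mask |= 1 << i
            data.append(b)
    return mask, bytes(data)

def decompress_chunk(mask: int, data: bytes) -> bytes:
    """Rebuild the original chunk from the mask and data portions."""
    out = bytearray(CHUNK_BYTES)
    it = iter(data)
    for i in range(CHUNK_BYTES):
        if mask & (1 << i):
            out[i] = next(it)
    return bytes(out)

chunk = bytes([0, 7, 0, 0, 42, 0, 0, 3])
mask, data = compress_chunk(chunk)
assert decompress_chunk(mask, data) == chunk  # lossless round trip
```

When activations are sparse (many zero bytes), the mask-plus-data form is much smaller than the raw chunk, which is what reduces memory bus traffic.
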
  • Patent number: 11750212
    Abstract: The performance of a neural network (NN) and/or deep neural network (DNN) can be limited by the number of operations being performed as well as by memory data management of a NN/DNN. Using vector quantization of neuron weight values, the processing of data by neurons can be optimized in both operation count and memory utilization to enhance the overall performance of a NN/DNN. Operatively, one or more contiguous segments of weight values can be converted into one or more vectors of arbitrary length, and each of the one or more vectors can be assigned an index. The generated indexes can be stored in a vector quantization lookup table and retrieved by fast weight lookup hardware at run time, as part of an inline de-quantization operation within the data processing function of the NN, to obtain the needed neuron weight values. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: April 15, 2021
    Date of Patent: September 5, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Amol Ashok Ambardekar, Aleksandar Tomic, Chad Balling McBride, George Petre, Kent D. Cedola, Larry Marvin Wall, Boris Bobrov
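
A minimal sketch of weight vector quantization with inline de-quantization, assuming an invented segment length and codebook; the patent does not fix these values.

```python
import numpy as np

SEG = 4  # assumed segment length
codebook = np.array([             # hypothetical lookup table of weight vectors
    [0.00, 0.00, 0.00, 0.00],
    [0.25, -0.50, 0.25, 0.00],
    [1.00, 0.75, -0.25, 0.50],
], dtype=np.float32)

def quantize(weights: np.ndarray) -> np.ndarray:
    """Map each length-SEG segment of weights to its nearest codeword index."""
    segs = weights.reshape(-1, SEG)
    dists = ((segs[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def dequantize(indexes: np.ndarray) -> np.ndarray:
    """Inline de-quantization: recover weights by table lookup at run time."""
    return codebook[indexes].reshape(-1)

w = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 0.8, -0.2, 0.4], dtype=np.float32)
idx = quantize(w)           # compact indexes are what gets stored
restored = dequantize(idx)  # looked up on the fly during processing
```

Storing one small index per segment, rather than the segment itself, is what cuts both the memory footprint and the memory traffic.
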
  • Patent number: 11722147
    Abstract: Optimized memory usage and management is crucial to the overall performance of a neural network (NN) or deep neural network (DNN) computing environment. Using various characteristics of the input data's dimensions, an apportionment sequence is calculated for the input data to be processed by the NN or DNN that optimizes the use of the local and external memory components. The apportionment sequence can describe how to parcel the input data (and its associated processing parameters, e.g., processing weights) into one or more portions, as well as how such portions of input data (and its associated processing parameters) are passed between the local memory, external memory, and processing unit components of the NN or DNN. Additionally, the apportionment sequence can include instructions to store generated output data in the local and/or external memory components so as to optimize their use. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: January 25, 2022
    Date of Patent: August 8, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Kent D. Cedola, Larry Marvin Wall, Boris Bobrov, George Petre, Chad Balling McBride, Amol Ashok Ambardekar
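
One way to picture an apportionment sequence, as a sketch only: split the input into portions that fit an assumed local-memory budget alongside the weights, then step through load/process/store for each portion. The budget and sizes below are invented for illustration.

```python
LOCAL_MEM_BYTES = 64 * 1024  # assumed local memory capacity

def apportion(total_rows: int, bytes_per_row: int, weight_bytes: int):
    """Yield (start_row, n_rows) portions whose data plus weights fit locally."""
    budget = LOCAL_MEM_BYTES - weight_bytes
    rows_per_portion = max(1, budget // bytes_per_row)
    start = 0
    while start < total_rows:
        n = min(rows_per_portion, total_rows - start)
        yield (start, n)
        start += n

for start, n in apportion(total_rows=1000, bytes_per_row=512, weight_bytes=8192):
    # load rows [start, start + n) from external to local memory,
    # process them, then store the output back to external memory
    pass
```
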
  • Publication number: 20230071352
    Abstract: A deep neural network (“DNN”) module can compress and decompress neuron-generated activation data to reduce the utilization of memory bus bandwidth. The compression unit can receive an uncompressed chunk of data generated by a neuron in the DNN module. The compression unit generates a mask portion and a data portion of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit can receive a compressed chunk of data from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion and the data portion. This can reduce memory bus utilization, allow a DNN module to complete processing operations more quickly, and reduce power consumption.
    Type: Application
    Filed: November 11, 2022
    Publication date: March 9, 2023
    Inventors: Joseph Leon Corkery, Benjamin Eliot Lundell, Larry Marvin Wall, Chad Balling McBride, Amol Ashok Ambardekar, George Petre, Kent D. Cedola, Boris Bobrov
  • Patent number: 11528033
    Abstract: A deep neural network (“DNN”) module compresses and decompresses neuron-generated activation data to reduce the utilization of memory bus bandwidth. The compression unit receives an uncompressed chunk of data generated by a neuron in the DNN module. The compression unit generates a mask portion and a data portion of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit receives a compressed chunk of data from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion and the data portion.
    Type: Grant
    Filed: April 13, 2018
    Date of Patent: December 13, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Joseph Leon Corkery, Benjamin Eliot Lundell, Larry Marvin Wall, Chad Balling McBride, Amol Ashok Ambardekar, George Petre, Kent D. Cedola, Boris Bobrov
  • Patent number: 11526581
    Abstract: A method of performing matrix computations includes receiving a compression-encoded matrix including a plurality of rows. Each row of the compression-encoded matrix has a plurality of defined element values and, for each such defined element value, a schedule tag indicating a schedule for using the defined element value in a scheduled matrix computation. The method further includes loading the plurality of rows of the compression-encoded matrix into a corresponding plurality of work memory banks, and providing decoded input data to a matrix computation module configured for performing the scheduled matrix computation. For each work memory bank, a next defined element value and a corresponding schedule tag are read. If the schedule tag meets a scheduling condition, the next defined element value is provided to the matrix computation module. Otherwise, a default element value is provided to the matrix computation module. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: October 30, 2020
    Date of Patent: December 13, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shuayb M. Zarar, Amol Ashok Ambardekar, Jun Zhang
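
The sketch below illustrates how a schedule tag might gate what is fed to the computation module: when a row's next defined element is due at the current step it is provided, otherwise a default value is. The tag-equals-step test is an assumed scheduling condition, not the patented one.

```python
def stream_row(encoded_row, n_steps, default=0.0):
    """encoded_row: list of (schedule_tag, value) pairs, sorted by tag."""
    pos = 0
    for step in range(n_steps):
        if pos < len(encoded_row) and encoded_row[pos][0] == step:
            yield encoded_row[pos][1]  # defined element is due this step
            pos += 1
        else:
            yield default              # otherwise feed the default value

row = [(1, 3.5), (4, -2.0)]            # non-zeros scheduled at steps 1 and 4
print(list(stream_row(row, 6)))        # [0.0, 3.5, 0.0, 0.0, -2.0, 0.0]
```
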
  • Patent number: 11514648
    Abstract: An image data annotation system automatically annotates a physical object within individual image frames of an image sequence with relevant object annotations based on a three-dimensional (3D) model of the physical object. Annotating the individual image frames with object annotations includes updating individual image frames within image input data to generate annotated image data that is suitable for reliably training a DNN object detection architecture. Exemplary object annotations that the image data annotation system can automatically apply to individual image frames include, inter alia, object pose, image pose, object masks, 3D bounding boxes composited over the physical object, 2D bounding boxes composited over the physical object, and/or depth map information. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: December 23, 2020
    Date of Patent: November 29, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Harpreet Singh Sawhney, Ning Xu, Amol Ashok Ambardekar, Moses Obadeji Olafenwa
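
As one concrete example of an annotation such a system could derive, the sketch below computes a 2D bounding box by projecting a 3D model's box corners through an assumed pinhole camera. The intrinsics K, rotation R, and translation t are invented; this is not the patented pipeline itself.

```python
import numpy as np

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])  # assumed intrinsics

def bbox_2d(corners_3d: np.ndarray, R: np.ndarray, t: np.ndarray):
    """Project the 8 corners of a 3D box and take their 2D extent."""
    cam = corners_3d @ R.T + t           # object -> camera coordinates
    uvw = cam @ K.T
    uv = uvw[:, :2] / uvw[:, 2:3]        # perspective divide
    (x0, y0), (x1, y1) = uv.min(axis=0), uv.max(axis=0)
    return x0, y0, x1, y1

corners = np.array([[x, y, z] for x in (-.1, .1) for y in (-.1, .1) for z in (-.1, .1)])
print(bbox_2d(corners, np.eye(3), np.array([0.0, 0.0, 1.0])))
```
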
  • Patent number: 11493985
    Abstract: A computer system comprising a scheduled computation module, a work memory storage device, and a controller. The scheduled computation module is configured to receive and process data values according to a predetermined access pattern. The work memory storage device includes one or more work memory banks. The controller is configured to, based on scheduling information associated with the predetermined access pattern, (1) provide data values held by the one or more work memory banks to the scheduled computation module, and (2) selectively control a power state of the one or more work memory banks. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: March 15, 2019
    Date of Patent: November 8, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Amol Ashok Ambardekar, Shuayb M. Zarar, Jun Zhang
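
A sketch of the controller idea: the scheduling information tells the controller when each work memory bank is next needed, and banks that will sit idle long enough can be dropped to a lower power state. The states, threshold, and function names are assumptions.

```python
from enum import Enum

class PowerState(Enum):
    ACTIVE = "active"
    RETENTION = "retention"  # contents preserved at lower power
    OFF = "off"              # powered down; contents lost

def plan_power_states(next_access, now, retention_after=8):
    """next_access maps bank id -> cycle of next scheduled use (None = never)."""
    states = {}
    for bank, cycle in next_access.items():
        if cycle is None:
            states[bank] = PowerState.OFF        # bank is never used again
        elif cycle - now > retention_after:
            states[bank] = PowerState.RETENTION  # idle long enough to doze
        else:
            states[bank] = PowerState.ACTIVE     # needed soon; keep it hot
    return states

print(plan_power_states({0: 3, 1: 50, 2: None}, now=0))
```
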
  • Patent number: 11487342
    Abstract: Techniques provide for improved (i.e., reduced) power consumption in an exemplary neural network (NN) and/or deep neural network (DNN) environment using data management. Improved power consumption in the NN/DNN may be achieved by reducing the number of bit flips needed to process operands associated with one or more storages. Reducing the number of bit flips associated with the NN/DNN may be achieved by multiplying an operand associated with a first storage with a plurality of individual operands associated with a plurality of kernels of the NN/DNN. The operand associated with the first storage may be neuron input data, and the plurality of individual operands associated with the second storage may be weight values for multiplication with the neuron input data. The plurality of kernels may be arranged or sorted and subsequently processed in a manner that improves power consumption in the NN/DNN. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: April 16, 2021
    Date of Patent: November 1, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Amol Ashok Ambardekar, Chad Balling McBride, George Petre, Kent D. Cedola, Larry Marvin Wall
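
One plausible reading of "arranged or sorted … in a manner that improves power consumption" is ordering the kernels so consecutive weight operands differ in few bits, since fewer toggled bits means fewer bit flips on the multiplier inputs. The greedy Hamming-distance ordering below is an illustrative stand-in, not the patented method.

```python
def hamming(a: bytes, b: bytes) -> int:
    """Number of bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def sort_kernels(kernels: list[bytes]) -> list[int]:
    """Greedy ordering that keeps successive kernels close in Hamming distance."""
    order = [0]
    remaining = set(range(1, len(kernels)))
    while remaining:
        last = kernels[order[-1]]
        nxt = min(remaining, key=lambda i: hamming(last, kernels[i]))
        order.append(nxt)
        remaining.remove(nxt)
    return order

kernels = [bytes([0b11110000]), bytes([0b00001111]), bytes([0b11110001])]
print(sort_kernels(kernels))  # [0, 2, 1]: 0 -> 2 flips 1 bit instead of 8
```
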
  • Patent number: 11476869
    Abstract: A deep neural network (DNN) module is disclosed that can dynamically partition neuron workload to reduce power consumption. The DNN module includes neurons and a group partitioner and scheduler unit. The group partitioner and scheduler unit divides a workload for the neurons into partitions in order to maximize the number of neurons that can simultaneously process the workload. The group partitioner and scheduler unit then assigns a group of neurons to each of the partitions. The groups of neurons in the DNN module process the workload in their assigned partition to generate a partial output value. The neurons in each group can then sum their partial output values to generate a final output value for the workload. The neurons can be powered down once the groups of neurons have completed processing their assigned workload to reduce power consumption. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: April 13, 2018
    Date of Patent: October 18, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Amol Ashok Ambardekar, Boris Bobrov, Chad Balling McBride, George Petre, Kent D. Cedola, Larry Marvin Wall
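
A sketch of the partition-then-reduce idea with invented sizes: divide the workload so as many neuron groups as possible are busy, let each produce a partial output, then sum the partials into the final value.

```python
def partition(work_items: int, n_groups: int) -> list[range]:
    """Divide work into n_groups near-equal partitions (some may be empty)."""
    base, extra = divmod(work_items, n_groups)
    parts, start = [], 0
    for i in range(n_groups):
        n = base + (1 if i < extra else 0)
        parts.append(range(start, start + n))
        start += n
    return parts

inputs = [0.5] * 103
partials = [sum(inputs[j] for j in part) for part in partition(len(inputs), 8)]
final = sum(partials)  # groups sum their partial outputs into the final value
assert abs(final - sum(inputs)) < 1e-9
```
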
  • Patent number: 11405051
    Abstract: An exemplary artificial intelligence/machine learning hardware computing environment having an exemplary DNN module cooperating with one or more memory components can perform data sharing and distribution, as well as reuse of buffered data, to reduce the number of memory component reads/writes, thereby enhancing overall hardware performance and reducing power consumption. Illustratively, data from a cooperating memory component is read according to a selected operation of the exemplary hardware and written to a corresponding other memory component for use by one or more processing elements (e.g., neurons). The data is read in such a manner as to optimize the engagement of the one or more processing elements for each processing cycle, as well as to reuse data previously stored in the one or more cooperating memory components. Operatively, the written data is copied to a shadow memory buffer prior to being consumed by the processing elements. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: April 13, 2018
    Date of Patent: August 2, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Chad Balling McBride, Amol Ashok Ambardekar, Kent D. Cedola, Boris Bobrov, George Petre, Larry Marvin Wall
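
The shadow-memory-buffer step resembles double buffering. The sketch below, with invented names, copies freshly written data to a shadow copy so the primary buffer can be refilled (reused) while the processing elements consume the shadow.

```python
class ShadowedBuffer:
    def __init__(self, size: int):
        self.primary = [0.0] * size
        self.shadow = [0.0] * size

    def write(self, values: list[float]) -> None:
        self.primary[:len(values)] = values

    def commit(self) -> None:
        """Copy primary into shadow before the neurons consume it."""
        self.shadow = list(self.primary)

    def consume(self) -> list[float]:
        """Neurons read the shadow copy; primary is free for the next load."""
        return self.shadow

buf = ShadowedBuffer(4)
buf.write([1.0, 2.0, 3.0, 4.0])
buf.commit()                      # shadow now holds this cycle's data
buf.write([5.0, 6.0, 7.0, 8.0])   # the next load can begin immediately
print(buf.consume())              # [1.0, 2.0, 3.0, 4.0]
```
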
  • Publication number: 20220198753
    Abstract: An image data annotation system automatically annotates a physical object within individual image frames of an image sequence with relevant object annotations based on a three-dimensional (3D) model of the physical object. Annotating the individual image frames with object annotations includes updating individual image frames within image input data to generate annotated image data that is suitable for reliably training a DNN object detection architecture. Exemplary object annotations that the image data annotation system can automatically apply to individual image frames include, inter alia, object pose, image pose, object masks, 3D bounding boxes composited over the physical object, 2D bounding boxes composited over the physical object, and/or depth map information.
    Type: Application
    Filed: December 23, 2020
    Publication date: June 23, 2022
    Inventors: Harpreet Singh Sawhney, Ning Xu, Amol Ashok Ambardekar, Moses Obadeji Olafenwa
  • Patent number: 11341399
    Abstract: A deep neural network (“DNN”) module can determine whether processing of certain values in an input buffer or a weight buffer by neurons can be skipped. For example, the DNN module might determine whether neurons can skip the processing of values in entire columns of a neuron buffer. Processing of these values might be skipped if an entire column of an input buffer or a weight buffer consists of zeros, for example. The DNN module can also determine whether processing of single values in rows of the input buffer or the weight buffer can be skipped (e.g., if the values are zero). Neurons that complete their processing early as a result of skipping operations can assist other neurons with their processing. A combination operation can be performed following the completion of processing that transfers the results of the processing operations performed by a neuron to their correct owner. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: April 13, 2018
    Date of Patent: May 24, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Amol Ashok Ambardekar, Chad Balling McBride, George Petre, Larry Marvin Wall, Kent D. Cedola, Boris Bobrov
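
A sketch of zero skipping on a toy dot product: columns whose input value is zero, or whose weight column is entirely zero, contribute nothing and are skipped. The buffer shapes are invented, and the real hardware operates on neuron-buffer columns rather than NumPy arrays.

```python
import numpy as np

def dot_with_skipping(inputs: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Per-neuron dot product that skips all-zero columns of either buffer."""
    out = np.zeros(weights.shape[0])
    for col in range(inputs.shape[0]):
        if inputs[col] == 0 or not weights[:, col].any():
            continue                       # nothing to accumulate: skip
        out += weights[:, col] * inputs[col]
    return out

x = np.array([1.0, 0.0, 2.0, 0.0])         # zeros at columns 1 and 3
W = np.array([[1.0, 5.0, 0.0, 0.0],
              [2.0, 6.0, 0.0, 0.0]])       # column 2 is all zeros too
print(dot_with_skipping(x, W))             # only column 0 is processed
```
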
  • Publication number: 20220147833
    Abstract: Optimized memory usage and management is crucial to the overall performance of a neural network (NN) or deep neural network (DNN) computing environment. Using various characteristics of the input data's dimensions, an apportionment sequence is calculated for the input data to be processed by the NN or DNN that optimizes the use of the local and external memory components. The apportionment sequence can describe how to parcel the input data (and its associated processing parameters, e.g., processing weights) into one or more portions, as well as how such portions of input data (and its associated processing parameters) are passed between the local memory, external memory, and processing unit components of the NN or DNN. Additionally, the apportionment sequence can include instructions to store generated output data in the local and/or external memory components so as to optimize their use.
    Type: Application
    Filed: January 25, 2022
    Publication date: May 12, 2022
    Inventors: Kent D. Cedola, Larry Marvin Wall, Boris Bobrov, George Petre, Chad Balling McBride, Amol Ashok Ambardekar
  • Patent number: 11256976
    Abstract: Optimized memory usage and management is crucial to the overall performance of a neural network (NN) or deep neural network (DNN) computing environment. Using various characteristics of the input data's dimensions, an apportionment sequence is calculated for the input data to be processed by the NN or DNN that optimizes the use of the local and external memory components. The apportionment sequence can describe how to parcel the input data (and its associated processing parameters, e.g., processing weights) into one or more portions, as well as how such portions of input data (and its associated processing parameters) are passed between the local memory, external memory, and processing unit components of the NN or DNN. Additionally, the apportionment sequence can include instructions to store generated output data in the local and/or external memory components so as to optimize their use.
    Type: Grant
    Filed: September 28, 2017
    Date of Patent: February 22, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Kent D. Cedola, Larry Marvin Wall, Boris Bobrov, George Petre, Chad Balling McBride, Amol Ashok Ambardekar
  • Patent number: 11205118
    Abstract: A deep neural network (DNN) module utilizes parallel kernel and parallel input processing to decrease bandwidth utilization, reduce power consumption, improve neuron multiplier stability, and provide other technical benefits. Parallel kernel processing enables the DNN module to load input data only once for processing by multiple kernels. Parallel input processing enables the DNN module to load kernel data only once for processing with multiple input data. The DNN module can implement other power-saving techniques like clock-gating (i.e., removing the clock from) and power-gating (i.e., removing the power from) banks of accumulators based upon usage of the accumulators. For example, individual banks of accumulators can be power-gated when all accumulators in a bank are not in use and do not store data for a future calculation. Banks of accumulators can also be clock-gated when all accumulators in a bank are not in use but store data for a future calculation. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: April 12, 2018
    Date of Patent: December 21, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Amol Ashok Ambardekar, Chad Balling McBride, George Petre, Larry Marvin Wall, Kent D. Cedola, Boris Bobrov
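
The accumulator-gating policy in this abstract reduces to a small decision rule. The sketch below restates it with invented names: idle banks holding no future data are power-gated, while idle banks that must retain data are clock-gated.

```python
from dataclasses import dataclass

@dataclass
class Bank:
    in_use: bool             # any accumulator in the bank currently active?
    holds_future_data: bool  # does the bank store data needed later?

def gate(bank: Bank) -> str:
    if bank.in_use:
        return "run"         # bank is actively accumulating
    if bank.holds_future_data:
        return "clock-gate"  # stop the clock, keep the stored values
    return "power-gate"      # remove power entirely; contents not needed

for b in (Bank(True, False), Bank(False, True), Bank(False, False)):
    print(gate(b))           # run, clock-gate, power-gate
```
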
  • Patent number: 11182667
    Abstract: The performance of a neural network (NN) and/or deep neural network (DNN) can be limited by the number of operations being performed as well as by management of data among the various memory components of the NN/DNN. By inserting a selected padding in the input data to align the input data in memory, data reads/writes can be optimized for processing by the NN/DNN, thereby enhancing the overall performance of the NN/DNN. Operatively, an operations controller/iterator can generate one or more instructions that insert the selected padding into the data. The data padding can be calculated using various characteristics of the input data and of the NN/DNN, as well as characteristics of the cooperating memory components. Padding on the output data can be utilized to support the data alignment at the memory components and the cooperating processing units of the NN/DNN. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: November 15, 2017
    Date of Patent: November 23, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: George Petre, Chad Balling McBride, Amol Ashok Ambardekar, Kent D. Cedola, Larry Marvin Wall, Boris Bobrov
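
A sketch of calculating the selected padding, assuming the goal is to start each input row on a fixed alignment boundary; the 64-byte width is an invented parameter.

```python
ALIGN_BYTES = 64  # assumed memory transaction width

def padded_row_bytes(row_elems: int, elem_bytes: int) -> tuple[int, int]:
    """Return (padding_bytes, padded_row_bytes) so each row starts aligned."""
    raw = row_elems * elem_bytes
    pad = (-raw) % ALIGN_BYTES
    return pad, raw + pad

pad, stride = padded_row_bytes(row_elems=100, elem_bytes=1)
print(pad, stride)  # 28 bytes of padding -> rows stride 128, a multiple of 64
```
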
  • Patent number: 11176448
    Abstract: An exemplary computing environment having a DNN module can maintain one or more bandwidth throttling mechanisms. Illustratively, a first throttling mechanism can specify the number of cycles to wait between transactions on a cooperating fabric component (e.g., data bus). Illustratively, a second throttling mechanism can be a transaction count limiter that operatively sets a threshold on the number of transactions to be processed during a given transaction sequence and limits the number of transactions in flight so as not to exceed the set threshold. In an illustrative operation, in executing these two exemplary calculated throttling parameters, the average bandwidth usage and the peak bandwidth usage can be limited. Operatively, with this fabric bandwidth control, the processing units of the DNN are optimized to process data across each transaction cycle, resulting in enhanced processing and lower power consumption. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: April 8, 2020
    Date of Patent: November 16, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Chad Balling McBride, Timothy Hume Heil, Amol Ashok Ambardekar, George Petre, Kent D. Cedola, Larry Marvin Wall, Boris Bobrov
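
The two throttling mechanisms can be sketched as an issue loop, shown below with invented parameters: a transaction is issued only if the configured number of cycles has elapsed since the last issue and the count of transactions in flight is under the threshold.

```python
def issue_schedule(n_requests: int, wait_cycles: int, max_in_flight: int,
                   latency: int = 10) -> list[int]:
    """Return the cycle at which each transaction is issued."""
    issued, in_flight, cycle, last_issue = [], [], 0, -10**9
    while len(issued) < n_requests:
        in_flight = [c for c in in_flight if c + latency > cycle]  # retire
        if (cycle - last_issue >= wait_cycles          # throttle 1: cycle gap
                and len(in_flight) < max_in_flight):   # throttle 2: count cap
            issued.append(cycle)
            in_flight.append(cycle)
            last_issue = cycle
        cycle += 1
    return issued

print(issue_schedule(5, wait_cycles=4, max_in_flight=2))  # [0, 4, 10, 14, 20]
```

Together the two limits bound both average bandwidth (via the cycle gap) and peak bandwidth (via the in-flight cap).
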
  • Patent number: 11132845
    Abstract: A method for object recognition includes, at a computing device, receiving an image of a real-world object. An identity of the real-world object is recognized using an object recognition model trained on a plurality of computer-generated training images. A digital augmentation model corresponding to the real-world object is retrieved, the digital augmentation model including a set of augmentation-specific instructions. A pose of the digital augmentation model is aligned with a pose of the real-world object. An augmentation is provided, the augmentation associated with the real-world object and specified by the augmentation-specific instructions. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: May 22, 2019
    Date of Patent: September 28, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Harpreet Singh Sawhney, Andrey Konin, Bilha-Catherine W. Githinji, Amol Ashok Ambardekar, William Douglas Guyman, Muhammad Zeeshan Zia, Ning Xu, Sheng Kai Tang, Pedro Urbina Escos
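
The method reads as a four-step pipeline. The skeleton below names those steps with hypothetical placeholders (MODELS, recognize, align_pose); it is not the patented API.

```python
from dataclasses import dataclass

@dataclass
class AugmentationModel:
    object_id: str
    instructions: list[str]  # augmentation-specific instructions

MODELS = {"coffee_maker": AugmentationModel(
    "coffee_maker", ["highlight water tank", "show descale steps"])}

def recognize(image) -> str:
    """Stand-in for a recognition model trained on computer-generated images."""
    return "coffee_maker"

def align_pose(model: AugmentationModel, image) -> None:
    """Stand-in: align the digital model's pose with the real-world object."""

def augment(image):
    identity = recognize(image)       # 1. recognize the object's identity
    model = MODELS[identity]          # 2. retrieve its augmentation model
    align_pose(model, image)          # 3. align model pose to object pose
    for step in model.instructions:   # 4. provide the specified augmentations
        print("render:", step)

augment(image=None)
```
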
  • Patent number: 11100390
    Abstract: A deep neural network (DNN) processor is configured to execute layer descriptors in layer descriptor lists. The descriptors define instructions for performing a forward pass of a DNN by the DNN processor. The layer descriptors can also be utilized to manage the flow of descriptors through the DNN module. For example, layer descriptors can define dependencies upon other descriptors. Descriptors defining a dependency will not execute until the descriptors upon which they are dependent have completed. Layer descriptors can also define a “fence,” or barrier, function that can be used to prevent the processing of upstream layer descriptors until the processing of all downstream layer descriptors is complete. The fence bit guarantees that no other layer descriptors remain in the DNN processing pipeline before a layer descriptor whose fence bit is asserted is processed. (An illustrative sketch follows this entry.)
    Type: Grant
    Filed: April 11, 2018
    Date of Patent: August 24, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Chad Balling McBride, Amol Ashok Ambardekar, Kent D. Cedola, George Petre, Larry Marvin Wall, Boris Bobrov
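
A sketch of descriptor flow control with invented fields: a descriptor runs once its dependencies have completed, and a descriptor with the fence bit set additionally waits for every descriptor ahead of it in the list.

```python
from dataclasses import dataclass, field

@dataclass
class LayerDescriptor:
    name: str
    depends_on: set[str] = field(default_factory=set)
    fence: bool = False

def execute(descriptors: list[LayerDescriptor]) -> list[str]:
    pos = {d.name: i for i, d in enumerate(descriptors)}
    completed: set[str] = set()
    order: list[str] = []
    pending = list(descriptors)
    while pending:
        for i, d in enumerate(pending):
            # fence: everything earlier in the list must already be done
            ahead_done = all(o.name in completed
                             for o in descriptors if pos[o.name] < pos[d.name])
            if d.depends_on <= completed and (not d.fence or ahead_done):
                order.append(d.name)
                completed.add(d.name)
                del pending[i]
                break
        else:
            raise RuntimeError("deadlock: unsatisfiable dependencies")
    return order

layers = [LayerDescriptor("conv1"), LayerDescriptor("conv2"),
          LayerDescriptor("fc", depends_on={"conv1", "conv2"}, fence=True)]
print(execute(layers))  # ['conv1', 'conv2', 'fc']
```
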