Patents by Inventor Amol A. AMBARDEKAR

Amol A. AMBARDEKAR has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230196086
    Abstract: Neural processing elements are configured with a hardware AND gate that performs a logical AND operation between a sign extend signal and the most significant bit (“MSB”) of an operand. The state of the sign extend signal can be based upon the type of the layer of a deep neural network (“DNN”) that generated the operand. If the sign extend signal is logical FALSE, no sign extension is performed. If the sign extend signal is logical TRUE, a concatenator concatenates the output of the hardware AND gate and the operand, thereby extending the operand from an N-bit unsigned binary value to an (N+1)-bit signed binary value. The neural processing element can also include another hardware AND gate and another concatenator for processing another operand similarly. The outputs of the concatenators for both operands are provided to a hardware binary multiplier. (A software sketch of this sign-extension step appears after this listing.)
    Type: Application
    Filed: February 24, 2023
    Publication date: June 22, 2023
    Inventors: Amol A. AMBARDEKAR, Boris BOBROV, Kent D. CEDOLA, Chad Balling MCBRIDE, George PETRE, Larry Marvin WALL
  • Patent number: 11604972
    Abstract: Neural processing elements are configured with a hardware AND gate that performs a logical AND operation between a sign extend signal and the most significant bit (“MSB”) of an operand. The state of the sign extend signal can be based upon the type of the layer of a deep neural network (“DNN”) that generated the operand. If the sign extend signal is logical FALSE, no sign extension is performed. If the sign extend signal is logical TRUE, a concatenator concatenates the output of the hardware AND gate and the operand, thereby extending the operand from an N-bit unsigned binary value to an (N+1)-bit signed binary value. The neural processing element can also include another hardware AND gate and another concatenator for processing another operand similarly. The outputs of the concatenators for both operands are provided to a hardware binary multiplier.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: March 14, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Amol A. Ambardekar, Boris Bobrov, Kent D. Cedola, Chad Balling McBride, George Petre, Larry Marvin Wall
  • Patent number: 11507349
    Abstract: An architecture is disclosed for a neural processing element having single instruction, multiple data (“SIMD”) compute lanes. The neural processing element includes compute lanes having hardware binary multipliers configured to multiply a binary operand with another binary operand to generate a binary output. The neural processing element also includes a single hardware binary adder tree for summing the binary outputs of the hardware binary multipliers, and a storage element for storing the binary output of the adder tree. (A software sketch of this SIMD dataflow appears after this listing.)
    Type: Grant
    Filed: June 26, 2019
    Date of Patent: November 22, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Chad Balling McBride, Amol A. Ambardekar, Boris Bobrov, Kent D. Cedola, George Petre, Larry Marvin Wall
  • Patent number: 11494237
    Abstract: A computing system includes processor cores for executing applications that utilize functionality provided by a deep neural network (“DNN”) processor. One of the cores operates as a resource and power management (“RPM”) processor core. When the RPM processor receives a request to execute a DNN workload, it divides the DNN workload into workload fragments. The RPM processor then determines whether a workload fragment is to be statically allocated or dynamically allocated to a DNN processor. Once the RPM processor has selected a DNN processor, the RPM processor enqueues the workload fragment on a queue maintained by the selected DNN processor. The DNN processor dequeues workload fragments from its queue for execution. Once execution of a workload fragment has completed, the DNN processor generates an interrupt indicating that execution of the workload fragment has completed. The RPM processor can then notify the processor core that originally requested execution of the workload fragment. (A software sketch of this workload-management flow appears after this listing.)
    Type: Grant
    Filed: June 26, 2019
    Date of Patent: November 8, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Chad Balling McBride, Amol A. Ambardekar, Boris Bobrov, Kent D. Cedola, George Petre, Larry Marvin Wall
  • Publication number: 20200409663
    Abstract: An architecture is disclosed for a neural processing element having single instruction, multiple data (“SIMD”) compute lanes. The neural processing element includes compute lanes having hardware binary multipliers configured to multiply a binary operand with another binary operand to generate a binary output. The neural processing element also includes a single hardware binary adder tree for summing the binary outputs of the hardware binary multipliers, and a storage element for storing the binary output of the adder tree.
    Type: Application
    Filed: June 26, 2019
    Publication date: December 31, 2020
    Inventors: Chad Balling MCBRIDE, Amol A. AMBARDEKAR, Boris BOBROV, Kent D. CEDOLA, George PETRE, Larry Marvin WALL
  • Publication number: 20200410329
    Abstract: Neural processing elements are configured with a hardware AND gate that performs a logical AND operation between a sign extend signal and the most significant bit (“MSB”) of an operand. The state of the sign extend signal can be based upon the type of the layer of a deep neural network (“DNN”) that generated the operand. If the sign extend signal is logical FALSE, no sign extension is performed. If the sign extend signal is logical TRUE, a concatenator concatenates the output of the hardware AND gate and the operand, thereby extending the operand from an N-bit unsigned binary value to an (N+1)-bit signed binary value. The neural processing element can also include another hardware AND gate and another concatenator for processing another operand similarly. The outputs of the concatenators for both operands are provided to a hardware binary multiplier.
    Type: Application
    Filed: June 28, 2019
    Publication date: December 31, 2020
    Inventors: Amol A. AMBARDEKAR, Boris BOBROV, Kent D. CEDOLA, Chad Balling MCBRIDE, George PETRE, Larry Marvin WALL
  • Publication number: 20200409757
    Abstract: A computing system includes processor cores for executing applications that utilize functionality provided by a deep neural network (“DNN”) processor. One of the cores operates as a resource and power management (“RPM”) processor core. When the RPM processor receives a request to execute a DNN workload, it divides the DNN workload into workload fragments. The RPM processor then determines whether a workload fragment is to be statically allocated or dynamically allocated to a DNN processor. Once the RPM processor has selected a DNN processor, the RPM processor enqueues the workload fragment on a queue maintained by the selected DNN processor. The DNN processor dequeues workload fragments from its queue for execution. Once execution of a workload fragment has completed, the DNN processor generates an interrupt indicating that execution of the workload fragment has completed. The RPM processor can then notify the processor core that originally requested execution of the workload fragment.
    Type: Application
    Filed: June 26, 2019
    Publication date: December 31, 2020
    Inventors: Chad Balling MCBRIDE, Amol A. AMBARDEKAR, Boris BOBROV, Kent D. CEDOLA, George PETRE, Larry Marvin WALL
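
The sign-extension scheme described in publication 20230196086 and patent 11604972 can be illustrated in software. The Python sketch below models the AND gate and concatenator for a single operand; the 8-bit width, the function name, and the signed reinterpretation of the result are illustrative assumptions, not details taken from the claims.

    def extend_operand(operand: int, sign_extend: bool, n_bits: int = 8) -> int:
        # Most significant bit of the N-bit operand.
        msb = (operand >> (n_bits - 1)) & 1
        # Hardware AND gate: the extension bit is (sign_extend AND MSB).
        ext_bit = msb & int(sign_extend)
        # Concatenator: the extension bit becomes the new MSB of an (N+1)-bit value.
        extended = (ext_bit << n_bits) | operand
        # Interpret the (N+1)-bit pattern as a two's-complement signed integer.
        if extended & (1 << n_bits):
            extended -= 1 << (n_bits + 1)
        return extended

    # With the signal FALSE the operand keeps its unsigned value (240);
    # with the signal TRUE the same bit pattern is treated as signed (-16).
    assert extend_operand(0b11110000, sign_extend=False) == 240
    assert extend_operand(0b11110000, sign_extend=True) == -16

Per the abstract, both operands are extended in this way before being provided to the hardware binary multiplier.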
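
Patent 11507349 and publication 20200409663 describe SIMD compute lanes whose multiplier outputs feed a single adder tree. Below is a minimal Python sketch of that dataflow, assuming four lanes and modeling the storage element as the returned value; these are illustrative assumptions only.

    def simd_dot_product(lane_a, lane_b):
        # One hardware binary multiplier per SIMD compute lane.
        products = [a * b for a, b in zip(lane_a, lane_b)]
        # A single adder tree sums the per-lane products.
        lane_sum = sum(products)
        # The storage element holds the adder-tree output (modeled as the return value).
        return lane_sum

    # Four lanes of operands reduced to a single sum: 1*5 + 2*6 + 3*7 + 4*8 = 70.
    print(simd_dot_product([1, 2, 3, 4], [5, 6, 7, 8]))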
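
The workload-management flow in patent 11494237 and publication 20200409757 can be sketched as a fragment/enqueue/dequeue/notify loop. In the Python below, the round-robin selection policy, the fragment size, and the callback standing in for a hardware interrupt are assumptions made for illustration; only the overall structure comes from the abstract.

    from collections import deque

    class DnnProcessor:
        def __init__(self, name):
            self.name = name
            self.queue = deque()  # queue of (workload fragment, requesting core)

        def run_next(self, on_complete):
            # Dequeue a workload fragment, execute it, then signal completion
            # (the hardware interrupt is modeled here as a callback).
            fragment, requester = self.queue.popleft()
            on_complete(requester, fragment)

    class RpmProcessor:
        def __init__(self, dnn_processors):
            self.dnn_processors = dnn_processors
            self._next = 0

        def submit(self, workload, requester, fragment_size=4):
            # Divide the DNN workload into fragments and enqueue each one on a
            # selected DNN processor (round-robin selection is assumed here).
            for i in range(0, len(workload), fragment_size):
                fragment = workload[i:i + fragment_size]
                target = self.dnn_processors[self._next % len(self.dnn_processors)]
                self._next += 1
                target.queue.append((fragment, requester))

        def notify(self, requester, fragment):
            # Notify the processor core that originally requested execution.
            print(f"core {requester}: fragment {fragment} completed")

    # A workload of eight items is split across two DNN processors and drained.
    processors = [DnnProcessor("dnn0"), DnnProcessor("dnn1")]
    rpm = RpmProcessor(processors)
    rpm.submit(list(range(8)), requester="core0")
    while any(p.queue for p in processors):
        for p in processors:
            if p.queue:
                p.run_next(rpm.notify)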