Patents by Inventor Animesh Jain
Animesh Jain has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12321849
Abstract: A method of generating executable instructions for a computing system is provided. The method comprises: receiving a first set of instructions including a kernel of a first operator and a kernel of a second operator, the kernel of the first operator including instructions of the first operator and write instructions to a virtual data node, the kernel of the second operator including instructions of the second operator and read instructions to the virtual data node; determining, based on a mapping between the write instructions and read instructions, instructions of data transfer operations between the first operator and the second operator; and generating a second set of instructions representing a fused operator of the first operator and the second operator, the second set of instructions including the instructions of the first operator, the instructions of the second operator, and the instructions of the data transfer operations.
Type: Grant
Filed: August 28, 2023
Date of Patent: June 3, 2025
Assignee: Amazon Technologies, Inc.
Inventors: Animesh Jain, Tobias Joseph Kastulus Edler von Koch, Yizhi Liu, Taemin Kim, Jindrich Zejda, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang
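The abstract above describes fusing two operator kernels through a "virtual data node": writes to the node by the first kernel are mapped to reads by the second, and each mapping becomes an explicit data-transfer instruction in the fused kernel. A minimal sketch of that idea, with a hypothetical instruction encoding not taken from the patent:

```python
# Hypothetical sketch of operator fusion via a virtual data node.
# Kernels are lists of (opcode, target) tuples; "write"/"read" to the
# virtual node are replaced by concrete transfer instructions.

def fuse_kernels(kernel_a, kernel_b, vnode="vnode"):
    # Positions in kernel_a that write to the virtual data node.
    writes = [i for i, (op, t) in enumerate(kernel_a)
              if op == "write" and t == vnode]
    # Keep the first operator's real instructions, dropping the writes.
    fused = [ins for ins in kernel_a
             if ins[0] != "write" or ins[1] != vnode]
    for op, t in kernel_b:
        if op == "read" and t == vnode:
            # Map the read back to its producing write: this mapping is
            # what becomes a data-transfer instruction in the fused op.
            fused.append(("transfer", f"from_write_{writes.pop(0)}"))
        else:
            fused.append((op, t))
    return fused

kernel_a = [("mul", "x"), ("write", "vnode")]
kernel_b = [("read", "vnode"), ("add", "y")]
print(fuse_kernels(kernel_a, kernel_b))
# → [('mul', 'x'), ('transfer', 'from_write_1'), ('add', 'y')]
```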
-
Patent number: 12198041
Abstract: Generating instructions for programming a processing element array to implement a convolution operation can include determining that the convolution operation under-utilizes the processing element array. The convolution operation involves using the processing element array to perform a series of matrix multiplications between a set of filters and a set of input matrices. Each filter comprises a weight matrix. Each input matrix is assigned to a respective row in the processing element array. Under-utilization can be determined through detecting that less than a threshold number of rows would be used concurrently. In response to determining that the convolution operation under-utilizes the processing element array, instructions can be added for modifying the convolution operation to increase the number of rows used concurrently. The added instructions are executable to cause at least one input matrix to be processed in parallel across more rows compared to processing without modifying the convolution operation.
Type: Grant
Filed: July 14, 2023
Date of Patent: January 14, 2025
Assignee: Amazon Technologies, Inc.
Inventors: Jeffrey T. Huynh, Ron Diamant, Hongbin Zheng, Yizhi Liu, Animesh Jain, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang, Sundeep Amirineni, Drazen Borkovic
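The check described above — each input matrix occupies one processing-element (PE) array row, and the operation is under-utilized when fewer than a threshold number of rows would be active — can be sketched as follows. The function name and the specific splitting heuristic are assumptions for illustration, not taken from the patent:

```python
# Hypothetical sketch: detect PE-array under-utilization and pick a
# factor by which to spread each input matrix across additional rows.

def row_utilization_plan(num_input_rows, array_rows, threshold):
    """Each input matrix occupies one PE-array row. If fewer than
    `threshold` rows would be used concurrently, return a split factor
    that spreads each input over more rows; otherwise return 1."""
    if num_input_rows >= threshold:
        return 1  # utilization is acceptable; no modification needed
    # Largest integer split that still fits within the array.
    return max(1, array_rows // num_input_rows)

print(row_utilization_plan(3, 128, 64))    # → 42 (under-utilized)
print(row_utilization_plan(100, 128, 64))  # → 1  (well utilized)
```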
-
Patent number: 12182688
Abstract: Methods and apparatuses for hierarchical partitioning of operators of a neural network for execution on an acceleration engine are provided. Neural networks are built in machine learning frameworks using neural network operators. The neural network operators are compiled into executable code for the acceleration engine. Development of new framework-level operators can exceed the capability to map the newly developed framework-level operators onto the acceleration engine. To enable neural networks to be executed on an acceleration engine, hierarchical partitioning can be used to partition the operators of the neural network. The hierarchical partitioning can identify operators that are supported by a compiler for execution on the acceleration engine, operators to be compiled for execution on a host processor, and operators to be executed on the machine learning framework.
Type: Grant
Filed: November 27, 2019
Date of Patent: December 31, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Animesh Jain, Yizhi Liu, Hongbin Zheng, Jeffrey T. Huynh, Haichen Li, Drazen Borkovic, Jindrich Zejda, Richard John Heaton, Randy Renfu Huang, Zhi Chen, Yida Wang
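The three-way split described in the abstract — accelerator-supported operators, host-compiled operators, and framework-fallback operators — can be sketched as a simple assignment pass. The operator names and support sets below are illustrative assumptions, not from the patent:

```python
# Hypothetical sketch of hierarchical partitioning: assign each
# neural-network operator to the most capable target that supports it
# (acceleration engine, then host processor, then the ML framework).

ACCEL_OPS = {"conv2d", "matmul", "relu"}   # compiler-supported on the engine
HOST_OPS = {"softmax", "topk"}             # compiled for the host processor

def partition(operators):
    plan = {"accelerator": [], "host": [], "framework": []}
    for op in operators:
        if op in ACCEL_OPS:
            plan["accelerator"].append(op)
        elif op in HOST_OPS:
            plan["host"].append(op)
        else:
            plan["framework"].append(op)   # fall back to the framework
    return plan

print(partition(["conv2d", "softmax", "custom_op"]))
# → {'accelerator': ['conv2d'], 'host': ['softmax'], 'framework': ['custom_op']}
```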
-
Patent number: 12159218
Abstract: A single instruction multiple data (SIMD) processor is used to implement a dropout layer between a first layer and a second layer of a neural network. The SIMD processor can implement the dropout layer by setting one or more elements in an output tensor of the first layer to zero before providing it as an input tensor to the second layer. Setting of the one or more elements to zero is based on a dropout rate and pseudo-random numbers generated by a random number generator in the SIMD processor.
Type: Grant
Filed: July 27, 2020
Date of Patent: December 3, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Jiading Gai, Hongbin Zheng, Animesh Jain, Randy Renfu Huang, Vignesh Vivekraja
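The dropout mechanism described above — zero out elements of the first layer's output, with probability given by the dropout rate, before passing it to the second layer — can be sketched in NumPy. The patent uses a hardware random number generator in the SIMD processor; a software RNG stands in here:

```python
# Hypothetical NumPy sketch of dropout as element masking: each element
# is kept with probability (1 - rate) and zeroed otherwise, based on
# pseudo-random numbers.
import numpy as np

def dropout(tensor, rate, rng):
    mask = rng.random(tensor.shape) >= rate  # keep with prob. 1 - rate
    return tensor * mask

rng = np.random.default_rng(0)
x = np.ones((2, 4))   # stand-in for the first layer's output tensor
y = dropout(x, 0.5, rng)
print(y)              # roughly half the entries zeroed
```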
-
Publication number: 20240210449
Abstract: A power sensing circuit in a first voltage domain senses an input voltage from a second voltage domain and provides a power OK signal. The maximum supply voltage of the first voltage domain is above a maximum tolerance for devices in the first voltage domain. Accordingly, protection techniques are employed to ensure that the potential difference between any two terminals of devices in the power sensing circuit does not exceed the maximum tolerance limit. The protection techniques utilize reference voltage-based techniques including level shifting and use of protection devices in transistor stacks. An over-voltage tolerant Schmitt trigger circuit is also employed in the power sensing circuit. A trip point device on the input of the power sensing circuit utilizes a programmable bias voltage to adjust the trip point of the power sensing circuit to accommodate different maximum input voltages from the second voltage domain.
Type: Application
Filed: December 27, 2022
Publication date: June 27, 2024
Inventors: Thanapandi Ganesan, Prateek Mishra, Pramod Baliga Kokkada, Rajesh Mangalore Anand, Aniket Bharat Waghide, Animesh Jain, Girish Anathahally Singrigowda, Dhruvin Devangbhai Shah
-
Patent number: 11923852
Abstract: A voltage level-shifting circuit for an integrated circuit includes an input terminal receiving a voltage signal referenced to an input/output (I/O) voltage level. A transistor overvoltage protection circuit includes a first p-type metal oxide semiconductor (PMOS) transistor with a source coupled to the second voltage supply, a gate receiving an enable signal, and a drain connected to a central node. A first n-type metal oxide semiconductor (NMOS) transistor includes a drain connected to the central node, a gate connected to the input terminal, and a source connected to an output terminal. A second NMOS transistor includes a drain connected to the input terminal, a gate connected to the central node, and a source connected to the output terminal.
Type: Grant
Filed: September 28, 2021
Date of Patent: March 5, 2024
Assignee: Advanced Micro Devices, Inc.
Inventors: Prateek Mishra, Thanapandi G, Jagadeesh Anathahalli Singrigowda, Dhruvin Devangbhai Shah, Girish Anathahalli Singrigowda, Animesh Jain
-
Publication number: 20230359876
Abstract: Generating instructions for programming a processing element array to implement a convolution operation can include determining that the convolution operation under-utilizes the processing element array. The convolution operation involves using the processing element array to perform a series of matrix multiplications between a set of filters and a set of input matrices. Each filter comprises a weight matrix. Each input matrix is assigned to a respective row in the processing element array. Under-utilization can be determined through detecting that less than a threshold number of rows would be used concurrently. In response to determining that the convolution operation under-utilizes the processing element array, instructions can be added for modifying the convolution operation to increase the number of rows used concurrently. The added instructions are executable to cause at least one input matrix to be processed in parallel across more rows compared to processing without modifying the convolution operation.
Type: Application
Filed: July 14, 2023
Publication date: November 9, 2023
Inventors: Jeffrey T. Huynh, Ron Diamant, Hongbin Zheng, Yizhi Liu, Animesh Jain, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang, Sundeep Amirineni, Drazen Borkovic
-
Patent number: 11809981
Abstract: A method of generating executable instructions for a computing system is provided. The method comprises: receiving a first set of instructions including a kernel of a first operator and a kernel of a second operator, the kernel of the first operator including instructions of the first operator and write instructions to a virtual data node, the kernel of the second operator including instructions of the second operator and read instructions to the virtual data node; determining, based on a mapping between the write instructions and read instructions, instructions of data transfer operations between the first operator and the second operator; and generating a second set of instructions representing a fused operator of the first operator and the second operator, the second set of instructions including the instructions of the first operator, the instructions of the second operator, and the instructions of the data transfer operations.
Type: Grant
Filed: November 27, 2019
Date of Patent: November 7, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Animesh Jain, Tobias Joseph Kastulus Edler von Koch, Yizhi Liu, Taemin Kim, Jindrich Zejda, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang
-
Patent number: 11741350
Abstract: A computer-implemented method includes receiving a neural network model for implementation using a processing element array, where the neural network model includes a convolution operation on a set of input feature maps and a set of filters. The method also includes determining, based on the neural network model, that the convolution operation utilizes less than a threshold number of rows in the processing element array for applying a set of filter elements to the set of input feature maps, where the set of filter elements includes one filter element in each filter of the set of filters. The method further includes generating, for the convolution operation and based on the neural network model, a first instruction and a second instruction for execution by respective rows in the processing element array, where the first instruction and the second instruction use different filter elements of a filter in the set of filters.
Type: Grant
Filed: November 27, 2019
Date of Patent: August 29, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Jeffrey T. Huynh, Ron Diamant, Hongbin Zheng, Yizhi Liu, Animesh Jain, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang, Sundeep Amirineni, Drazen Borkovic
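The instruction-generation step above — emitting instructions for different processing-element rows that each apply a different filter element of the same filter — can be sketched as a small generator. The instruction encoding below is a hypothetical stand-in, not the patent's format:

```python
# Hypothetical sketch: emit one instruction per filter element so that
# different PE-array rows apply different elements of the same filter
# in parallel, increasing row utilization.

def gen_row_instructions(filter_elems, base_row):
    """Return per-row instructions, one per filter element."""
    return [{"row": base_row + i, "filter_elem": e}
            for i, e in enumerate(filter_elems)]

print(gen_row_instructions(["w00", "w01"], base_row=0))
# → [{'row': 0, 'filter_elem': 'w00'}, {'row': 1, 'filter_elem': 'w01'}]
```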
-
Patent number: 11715002
Abstract: Functions are added to a deep neural network ("DNN") computation graph for encoding data structures during a forward training pass of the DNN and decoding previously-encoded data structures during a backward training pass of the DNN. The functions added to the DNN computation graph can be selected based upon the specific layer pairs specified in the DNN computation graph. Once a modified DNN computation graph has been generated, the DNN can be trained using the modified DNN computation graph. The functions added to the modified DNN computation graph can reduce the utilization of memory during training of the DNN.
Type: Grant
Filed: June 29, 2018
Date of Patent: August 1, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Amar Phanishayee, Gennady Pekhimenko, Animesh Jain
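The graph rewrite described above — encode a tensor when the forward pass stashes it, decode it only when the backward pass needs it — can be sketched as a pass over an ordered list of graph nodes. The node names and the encode/decode pair are illustrative assumptions, not from the patent:

```python
# Hypothetical sketch of rewriting a DNN computation graph so that a
# tensor saved in the forward pass is stored encoded (e.g. compressed)
# and decoded just before its use in the backward pass, reducing the
# memory held between the two passes.

def insert_encode_decode(graph, encode="encode", decode="decode"):
    """graph is an ordered list of node names; 'save' marks a tensor
    stashed for the backward pass and 'restore' its later retrieval."""
    out = []
    for node in graph:
        if node == "save":
            out += [encode, node]   # encode before stashing
        elif node == "restore":
            out += [node, decode]   # decode after retrieval
        else:
            out.append(node)
    return out

fwd_bwd = ["relu", "save", "pool", "restore", "relu_grad"]
print(insert_encode_decode(fwd_bwd))
# → ['relu', 'encode', 'save', 'pool', 'restore', 'decode', 'relu_grad']
```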
-
Publication number: 20230098336
Abstract: A voltage level-shifting circuit for an integrated circuit includes an input terminal receiving a voltage signal referenced to an input/output (I/O) voltage level. A transistor overvoltage protection circuit includes a first p-type metal oxide semiconductor (PMOS) transistor with a source coupled to the second voltage supply, a gate receiving an enable signal, and a drain connected to a central node. A first n-type metal oxide semiconductor (NMOS) transistor includes a drain connected to the central node, a gate connected to the input terminal, and a source connected to an output terminal. A second NMOS transistor includes a drain connected to the input terminal, a gate connected to the central node, and a source connected to the output terminal.
Type: Application
Filed: September 28, 2021
Publication date: March 30, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Prateek Mishra, Thanapandi G, Jagadeesh Anathahalli Singrigowda, Dhruvin Devangbhai Shah, Girish Anathahalli Singrigowda, Animesh Jain
-
Patent number: 11463084
Abstract: A level shifting output circuit converts a signal from a core voltage to an I/O voltage without causing voltage overstress on transistor terminals in the level shifting output circuit. The output circuit includes protection transistors to protect various transistors in the output circuit from overvoltage conditions, including those transistors coupled to I/O power supply nodes.
Type: Grant
Filed: September 1, 2021
Date of Patent: October 4, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Thanapandi Ganesan, Prateek Mishra, Jagadeesh Anathahalli Singrigowda, Dhruvin Devangbhai Shah, Animesh Jain, Girish Anathahalli Singrigowda
-
Publication number: 20210158131
Abstract: Methods and apparatuses for hierarchical partitioning of operators of a neural network for execution on an acceleration engine are provided. Neural networks are built in machine learning frameworks using neural network operators. The neural network operators are compiled into executable code for the acceleration engine. Development of new framework-level operators can exceed the capability to map the newly developed framework-level operators onto the acceleration engine. To enable neural networks to be executed on an acceleration engine, hierarchical partitioning can be used to partition the operators of the neural network. The hierarchical partitioning can identify operators that are supported by a compiler for execution on the acceleration engine, operators to be compiled for execution on a host processor, and operators to be executed on the machine learning framework.
Type: Application
Filed: November 27, 2019
Publication date: May 27, 2021
Inventors: Animesh Jain, Yizhi Liu, Hongbin Zheng, Jeffrey T. Huynh, Haichen Li, Drazen Borkovic, Jindrich Zejda, Richard John Heaton, Randy Renfu Huang, Zhi Chen, Yida Wang
-
Publication number: 20210158132
Abstract: A computer-implemented method includes receiving a neural network model for implementation using a processing element array, where the neural network model includes a convolution operation on a set of input feature maps and a set of filters. The method also includes determining, based on the neural network model, that the convolution operation utilizes less than a threshold number of rows in the processing element array for applying a set of filter elements to the set of input feature maps, where the set of filter elements includes one filter element in each filter of the set of filters. The method further includes generating, for the convolution operation and based on the neural network model, a first instruction and a second instruction for execution by respective rows in the processing element array, where the first instruction and the second instruction use different filter elements of a filter in the set of filters.
Type: Application
Filed: November 27, 2019
Publication date: May 27, 2021
Inventors: Jeffrey T. Huynh, Ron Diamant, Hongbin Zheng, Yizhi Liu, Animesh Jain, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang, Sundeep Amirineni, Drazen Borkovic
-
Patent number: 10637472
Abstract: A reference voltage generation circuit for use with current mode logic includes a first transistor of a first conductivity type configured to operate as a diode-connected resistor, with a source terminal coupled to a first voltage supply terminal for conducting a supply voltage and a gate terminal coupled to a drain terminal. Second and third transistors of a second conductivity type are coupled in series between the drain terminal of the first transistor and a second voltage supply terminal. Gate terminals of the second and third transistors are coupled to the gate terminal of the first transistor. A reference voltage is obtained between the second and third transistors. The second and third transistors are sized such that they remain in sub-threshold mode operation across the expected range of the supply voltage. Current mode logic circuits are also provided using the reference voltage generation circuit.
Type: Grant
Filed: May 21, 2019
Date of Patent: April 28, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Aditya Mitra, Animesh Jain
-
Publication number: 20190347549
Abstract: Functions are added to a deep neural network ("DNN") computation graph for encoding data structures during a forward training pass of the DNN and decoding previously-encoded data structures during a backward training pass of the DNN. The functions added to the DNN computation graph can be selected based upon the specific layer pairs specified in the DNN computation graph. Once a modified DNN computation graph has been generated, the DNN can be trained using the modified DNN computation graph. The functions added to the modified DNN computation graph can reduce the utilization of memory during training of the DNN.
Type: Application
Filed: June 29, 2018
Publication date: November 14, 2019
Inventors: Amar Phanishayee, Gennady Pekhimenko, Animesh Jain
-
Patent number: 9881723
Abstract: The invention provides an alternative to the standard 2-piece or 4-piece quadrupole. For example, an 8-piece and a 10-piece quadrupole are provided whereby the tips of each pole may be adjustable. Also provided is a method for producing a quadrupole using standard machining techniques but which results in a final tolerance accuracy of the resulting construct which is better than that obtained using standard machining techniques.
Type: Grant
Filed: January 13, 2017
Date of Patent: January 30, 2018
Assignee: UCHICAGO ARGONNE, LLC
Inventors: Mark S. Jaski, Jie Liu, Aric T. Donnelly, Joshua S. Downey, Jeremy J. Nudell, Animesh Jain
-
Patent number: 8504866
Abstract: Embodiments of systems and methods are described for reducing the effects of hysteresis in the operation of data processing circuitry. In this embodiment of the invention, adaptive control circuitry is used to reduce the effects of hysteresis. The embodiment disclosed herein provides a significant reduction in the effects of hysteresis and, therefore, a significant reduction in the amount of guard band needed to compensate for hysteresis effects in SOI processes, thereby improving the performance/power characteristics of the circuit.
Type: Grant
Filed: July 30, 2010
Date of Patent: August 6, 2013
Assignee: Advanced Micro Devices, Inc.
Inventors: Arun Iyer, Bhawna Tomar, Animesh Jain, Krishna Sethupathy Leela
-
Patent number: 8238187
Abstract: Embodiments of systems and methods for improved first-in-first-out (FIFO), last-in-first-out (LIFO) and full-cycle decoders are described herein. In the various embodiments of the system, a clock generator is operable to generate a clock signal having an active phase and an inactive phase. A set of monotonic flip-flops are operable to capture a set of incoming data addresses during the active phase of the clock and to generate therefrom data corresponding to single bits in the addresses that have changed compared to the data addresses received by the set of monotonic flip-flops during an immediately preceding data capture cycle. A set of static flip-flops are operable to capture a set of incoming data addresses during the inactive phase of the clock cycle and to generate set output data therefrom. A decoder is operable to process the set output data from the set of static flip-flops and to generate a set of old wordlines corresponding to a set of data addresses in the immediately preceding data capture cycle.
Type: Grant
Filed: July 30, 2010
Date of Patent: August 7, 2012
Assignee: Advanced Micro Devices, Inc.
Inventors: Animesh Jain, Nagendra Chandrakar, Sonia Ghosh
-
Patent number: 8120406
Abstract: A pulsed latch circuit with conditional shutoff prevents an input node of the pulsed latch circuit, such as a node receiving data, from latching data based on a delayed input control signal, such as an internal clocking signal, and based on a feedback latch state transition detection signal indicating that a current state of input data is stored in the latch. As such, two control conditions are used to shut down the latch. In one example, a condition generator detects when the latch has captured data correctly and outputs a signal to disable the input node. In addition, a variable delay circuit is used to adjust the width of the allowable input signal to set a worst-case shutoff time. If data is latched early, a feedback latch state transition detection signal causes the input node to be disabled. If data is not latched early, the maximum allowable latch time is set by the variable delay circuit.
Type: Grant
Filed: July 29, 2009
Date of Patent: February 21, 2012
Assignee: ATI Technologies ULC
Inventors: Arun Iyer, Shibashish Patel, Animesh Jain