Patents by Inventor Animesh Jain
Animesh Jain has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12321849
Abstract: A method of generating executable instructions for a computing system is provided. The method comprises: receiving a first set of instructions including a kernel of a first operator and a kernel of a second operator, the kernel of the first operator including instructions of the first operator and write instructions to a virtual data node, the kernel of the second operator including instructions of the second operator and read instructions to the virtual data node; determining, based on a mapping between the write instructions and read instructions, instructions of data transfer operations between the first operator and the second operator; and generating a second set of instructions representing a fused operator of the first operator and the second operator, the second set of instructions including the instructions of the first operator, the instructions of the second operator, and the instructions of the data transfer operations.
Type: Grant
Filed: August 28, 2023
Date of Patent: June 3, 2025
Assignee: Amazon Technologies, Inc.
Inventors: Animesh Jain, Tobias Joseph Kastulus Edler von Koch, Yizhi Liu, Taemin Kim, Jindrich Zejda, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang
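The abstract above describes fusing two operator kernels through a "virtual data node": writes to the node by the first kernel are mapped to reads by the second, and each mapping becomes an explicit data-transfer instruction in the fused kernel. A minimal sketch of that idea, with a hypothetical instruction encoding not taken from the patent:

```python
# Hypothetical sketch of operator fusion via a virtual data node.
# Kernels are lists of (opcode, target) tuples; "write"/"read" to the
# virtual node are replaced by concrete transfer instructions.

def fuse_kernels(kernel_a, kernel_b, vnode="vnode"):
    # Positions in kernel_a that write to the virtual data node.
    writes = [i for i, (op, t) in enumerate(kernel_a)
              if op == "write" and t == vnode]
    # Keep the first operator's real instructions, dropping the writes.
    fused = [ins for ins in kernel_a
             if ins[0] != "write" or ins[1] != vnode]
    for op, t in kernel_b:
        if op == "read" and t == vnode:
            # Map the read back to its producing write: this mapping is
            # what becomes a data-transfer instruction in the fused op.
            fused.append(("transfer", f"from_write_{writes.pop(0)}"))
        else:
            fused.append((op, t))
    return fused

kernel_a = [("mul", "x"), ("write", "vnode")]
kernel_b = [("read", "vnode"), ("add", "y")]
print(fuse_kernels(kernel_a, kernel_b))
# → [('mul', 'x'), ('transfer', 'from_write_1'), ('add', 'y')]
```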
-
Patent number: 12198041
Abstract: Generating instructions for programming a processing element array to implement a convolution operation can include determining that the convolution operation under-utilizes the processing element array. The convolution operation involves using the processing element array to perform a series of matrix multiplications between a set of filters and a set of input matrices. Each filter comprises a weight matrix. Each input matrix is assigned to a respective row in the processing element array. Under-utilization can be determined through detecting that less than a threshold number of rows would be used concurrently. In response to determining that the convolution operation under-utilizes the processing element array, instructions can be added for modifying the convolution operation to increase the number of rows used concurrently. The added instructions are executable to cause at least one input matrix to be processed in parallel across more rows compared to processing without modifying the convolution operation.
Type: Grant
Filed: July 14, 2023
Date of Patent: January 14, 2025
Assignee: Amazon Technologies, Inc.
Inventors: Jeffrey T. Huynh, Ron Diamant, Hongbin Zheng, Yizhi Liu, Animesh Jain, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang, Sundeep Amirineni, Drazen Borkovic
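The check described above — each input matrix occupies one processing-element (PE) array row, and the operation is under-utilized when fewer than a threshold number of rows would be active — can be sketched as follows. The function name and the specific splitting heuristic are assumptions for illustration, not taken from the patent:

```python
# Hypothetical sketch: detect PE-array under-utilization and pick a
# factor by which to spread each input matrix across additional rows.

def row_utilization_plan(num_input_rows, array_rows, threshold):
    """Each input matrix occupies one PE-array row. If fewer than
    `threshold` rows would be used concurrently, return a split factor
    that spreads each input over more rows; otherwise return 1."""
    if num_input_rows >= threshold:
        return 1  # utilization is acceptable; no modification needed
    # Largest integer split that still fits within the array.
    return max(1, array_rows // num_input_rows)

print(row_utilization_plan(3, 128, 64))    # → 42 (under-utilized)
print(row_utilization_plan(100, 128, 64))  # → 1  (well utilized)
```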
-
Patent number: 12182688
Abstract: Methods and apparatuses for hierarchical partitioning of operators of a neural network for execution on an acceleration engine are provided. Neural networks are built in machine learning frameworks using neural network operators. The neural network operators are compiled into executable code for the acceleration engine. Development of new framework-level operators can exceed the capability to map the newly developed framework-level operators onto the acceleration engine. To enable neural networks to be executed on an acceleration engine, hierarchical partitioning can be used to partition the operators of the neural network. The hierarchical partitioning can identify operators that are supported by a compiler for execution on the acceleration engine, operators to be compiled for execution on a host processor, and operators to be executed on the machine learning framework.
Type: Grant
Filed: November 27, 2019
Date of Patent: December 31, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Animesh Jain, Yizhi Liu, Hongbin Zheng, Jeffrey T. Huynh, Haichen Li, Drazen Borkovic, Jindrich Zejda, Richard John Heaton, Randy Renfu Huang, Zhi Chen, Yida Wang
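The three-way split described in the abstract — accelerator-supported operators, host-compiled operators, and framework-fallback operators — can be sketched as a simple assignment pass. The operator names and support sets below are illustrative assumptions, not from the patent:

```python
# Hypothetical sketch of hierarchical partitioning: assign each
# neural-network operator to the most capable target that supports it
# (acceleration engine, then host processor, then the ML framework).

ACCEL_OPS = {"conv2d", "matmul", "relu"}   # compiler-supported on the engine
HOST_OPS = {"softmax", "topk"}             # compiled for the host processor

def partition(operators):
    plan = {"accelerator": [], "host": [], "framework": []}
    for op in operators:
        if op in ACCEL_OPS:
            plan["accelerator"].append(op)
        elif op in HOST_OPS:
            plan["host"].append(op)
        else:
            plan["framework"].append(op)   # fall back to the framework
    return plan

print(partition(["conv2d", "softmax", "custom_op"]))
# → {'accelerator': ['conv2d'], 'host': ['softmax'], 'framework': ['custom_op']}
```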
-
Patent number: 12159218
Abstract: A single instruction multiple data (SIMD) processor is used to implement a dropout layer between a first layer and a second layer of a neural network. The SIMD processor can implement the dropout layer by setting one or more elements in an output tensor of the first layer to zero before providing it as an input tensor to the second layer. Setting of the one or more elements to zero is based on a dropout rate and pseudo-random numbers generated by a random number generator in the SIMD processor.
Type: Grant
Filed: July 27, 2020
Date of Patent: December 3, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Jiading Gai, Hongbin Zheng, Animesh Jain, Randy Renfu Huang, Vignesh Vivekraja
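The dropout mechanism described above — zero out elements of the first layer's output, with probability given by the dropout rate, before passing it to the second layer — can be sketched in NumPy. The patent uses a hardware random number generator in the SIMD processor; a software RNG stands in here:

```python
# Hypothetical NumPy sketch of dropout as element masking: each element
# is kept with probability (1 - rate) and zeroed otherwise, based on
# pseudo-random numbers.
import numpy as np

def dropout(tensor, rate, rng):
    mask = rng.random(tensor.shape) >= rate  # keep with prob. 1 - rate
    return tensor * mask

rng = np.random.default_rng(0)
x = np.ones((2, 4))   # stand-in for the first layer's output tensor
y = dropout(x, 0.5, rng)
print(y)              # roughly half the entries zeroed
```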
-
Publication number: 20240210449
Abstract: A power sensing circuit in a first voltage domain senses an input voltage from a second voltage domain and provides a power OK signal. The maximum supply voltage of the first voltage domain is above a maximum tolerance for devices in the first voltage domain. Accordingly, protection techniques are employed to ensure that the potential difference between any two terminals of devices in the power sensing circuit does not exceed the maximum tolerance limit. The protection techniques utilize reference voltage-based techniques including level shifting and use of protection devices in transistor stacks. An over-voltage tolerant Schmitt trigger circuit is also employed in the power sensing circuit. A trip point device on the input of the power sensing circuit utilizes a programmable bias voltage to adjust the trip point of the power sensing circuit to accommodate different maximum input voltages from the second voltage domain.
Type: Application
Filed: December 27, 2022
Publication date: June 27, 2024
Inventors: Thanapandi Ganesan, Prateek Mishra, Pramod Baliga Kokkada, Rajesh Mangalore Anand, Aniket Bharat Waghide, Animesh Jain, Girish Anathahally Singrigowda, Dhruvin Devangbhai Shah
-
Patent number: 11923852
Abstract: A voltage level-shifting circuit for an integrated circuit includes an input terminal receiving a voltage signal referenced to an input/output (I/O) voltage level. A transistor overvoltage protection circuit includes a first p-type metal oxide semiconductor (PMOS) transistor with a source coupled to the second voltage supply, a gate receiving an enable signal, and a drain connected to a central node. A first n-type metal oxide semiconductor (NMOS) transistor includes a drain connected to the central node, a gate connected to the input terminal, and a source connected to an output terminal. A second NMOS transistor includes a drain connected to the input terminal, a gate connected to the central node, and a source connected to the output terminal.
Type: Grant
Filed: September 28, 2021
Date of Patent: March 5, 2024
Assignee: Advanced Micro Devices, Inc.
Inventors: Prateek Mishra, Thanapandi G, Jagadeesh Anathahalli Singrigowda, Dhruvin Devangbhai Shah, Girish Anathahalli Singrigowda, Animesh Jain
-
Publication number: 20230359876
Abstract: Generating instructions for programming a processing element array to implement a convolution operation can include determining that the convolution operation under-utilizes the processing element array. The convolution operation involves using the processing element array to perform a series of matrix multiplications between a set of filters and a set of input matrices. Each filter comprises a weight matrix. Each input matrix is assigned to a respective row in the processing element array. Under-utilization can be determined through detecting that less than a threshold number of rows would be used concurrently. In response to determining that the convolution operation under-utilizes the processing element array, instructions can be added for modifying the convolution operation to increase the number of rows used concurrently. The added instructions are executable to cause at least one input matrix to be processed in parallel across more rows compared to processing without modifying the convolution operation.
Type: Application
Filed: July 14, 2023
Publication date: November 9, 2023
Inventors: Jeffrey T. Huynh, Ron Diamant, Hongbin Zheng, Yizhi Liu, Animesh Jain, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang, Sundeep Amirineni, Drazen Borkovic
-
Patent number: 11809981
Abstract: A method of generating executable instructions for a computing system is provided. The method comprises: receiving a first set of instructions including a kernel of a first operator and a kernel of a second operator, the kernel of the first operator including instructions of the first operator and write instructions to a virtual data node, the kernel of the second operator including instructions of the second operator and read instructions to the virtual data node; determining, based on a mapping between the write instructions and read instructions, instructions of data transfer operations between the first operator and the second operator; and generating a second set of instructions representing a fused operator of the first operator and the second operator, the second set of instructions including the instructions of the first operator, the instructions of the second operator, and the instructions of the data transfer operations.
Type: Grant
Filed: November 27, 2019
Date of Patent: November 7, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Animesh Jain, Tobias Joseph Kastulus Edler von Koch, Yizhi Liu, Taemin Kim, Jindrich Zejda, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang
-
Patent number: 11741350
Abstract: A computer-implemented method includes receiving a neural network model for implementation using a processing element array, where the neural network model includes a convolution operation on a set of input feature maps and a set of filters. The method also includes determining, based on the neural network model, that the convolution operation utilizes less than a threshold number of rows in the processing element array for applying a set of filter elements to the set of input feature maps, where the set of filter elements includes one filter element in each filter of the set of filters. The method further includes generating, for the convolution operation and based on the neural network model, a first instruction and a second instruction for execution by respective rows in the processing element array, where the first instruction and the second instruction use different filter elements of a filter in the set of filters.
Type: Grant
Filed: November 27, 2019
Date of Patent: August 29, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Jeffrey T. Huynh, Ron Diamant, Hongbin Zheng, Yizhi Liu, Animesh Jain, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang, Sundeep Amirineni, Drazen Borkovic
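The instruction-generation step above — emitting instructions for different processing-element rows that each apply a different filter element of the same filter — can be sketched as a small generator. The instruction encoding below is a hypothetical stand-in, not the patent's format:

```python
# Hypothetical sketch: emit one instruction per filter element so that
# different PE-array rows apply different elements of the same filter
# in parallel, increasing row utilization.

def gen_row_instructions(filter_elems, base_row):
    """Return per-row instructions, one per filter element."""
    return [{"row": base_row + i, "filter_elem": e}
            for i, e in enumerate(filter_elems)]

print(gen_row_instructions(["w00", "w01"], base_row=0))
# → [{'row': 0, 'filter_elem': 'w00'}, {'row': 1, 'filter_elem': 'w01'}]
```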
-
Patent number: 11715002
Abstract: Functions are added to a deep neural network ("DNN") computation graph for encoding data structures during a forward training pass of the DNN and decoding previously-encoded data structures during a backward training pass of the DNN. The functions added to the DNN computation graph can be selected based upon the specific layer pairs specified in the DNN computation graph. Once a modified DNN computation graph has been generated, the DNN can be trained using the modified DNN computation graph. The functions added to the modified DNN computation graph can reduce the utilization of memory during training of the DNN.
Type: Grant
Filed: June 29, 2018
Date of Patent: August 1, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Amar Phanishayee, Gennady Pekhimenko, Animesh Jain
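The graph rewrite described above — encode a tensor when the forward pass stashes it, decode it only when the backward pass needs it — can be sketched as a pass over an ordered list of graph nodes. The node names and the encode/decode pair are illustrative assumptions, not from the patent:

```python
# Hypothetical sketch of rewriting a DNN computation graph so that a
# tensor saved in the forward pass is stored encoded (e.g. compressed)
# and decoded just before its use in the backward pass, reducing the
# memory held between the two passes.

def insert_encode_decode(graph, encode="encode", decode="decode"):
    """graph is an ordered list of node names; 'save' marks a tensor
    stashed for the backward pass and 'restore' its later retrieval."""
    out = []
    for node in graph:
        if node == "save":
            out += [encode, node]   # encode before stashing
        elif node == "restore":
            out += [node, decode]   # decode after retrieval
        else:
            out.append(node)
    return out

fwd_bwd = ["relu", "save", "pool", "restore", "relu_grad"]
print(insert_encode_decode(fwd_bwd))
# → ['relu', 'encode', 'save', 'pool', 'restore', 'decode', 'relu_grad']
```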
-
Publication number: 20230098336
Abstract: A voltage level-shifting circuit for an integrated circuit includes an input terminal receiving a voltage signal referenced to an input/output (I/O) voltage level. A transistor overvoltage protection circuit includes a first p-type metal oxide semiconductor (PMOS) transistor with a source coupled to the second voltage supply, a gate receiving an enable signal, and a drain connected to a central node. A first n-type metal oxide semiconductor (NMOS) transistor includes a drain connected to the central node, a gate connected to the input terminal, and a source connected to an output terminal. A second NMOS transistor includes a drain connected to the input terminal, a gate connected to the central node, and a source connected to the output terminal.
Type: Application
Filed: September 28, 2021
Publication date: March 30, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Prateek Mishra, Thanapandi G, Jagadeesh Anathahalli Singrigowda, Dhruvin Devangbhai Shah, Girish Anathahalli Singrigowda, Animesh Jain
-
Patent number: 11463084
Abstract: A level shifting output circuit converts a signal from a core voltage to an I/O voltage without causing voltage overstress on transistor terminals in the level shifting output circuit. The output circuit includes protection transistors to protect various transistors in the output circuit from overvoltage conditions, including those transistors coupled to I/O power supply nodes.
Type: Grant
Filed: September 1, 2021
Date of Patent: October 4, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Thanapandi Ganesan, Prateek Mishra, Jagadeesh Anathahalli Singrigowda, Dhruvin Devangbhai Shah, Animesh Jain, Girish Anathahalli Singrigowda
-
Publication number: 20210158131
Abstract: Methods and apparatuses for hierarchical partitioning of operators of a neural network for execution on an acceleration engine are provided. Neural networks are built in machine learning frameworks using neural network operators. The neural network operators are compiled into executable code for the acceleration engine. Development of new framework-level operators can exceed the capability to map the newly developed framework-level operators onto the acceleration engine. To enable neural networks to be executed on an acceleration engine, hierarchical partitioning can be used to partition the operators of the neural network. The hierarchical partitioning can identify operators that are supported by a compiler for execution on the acceleration engine, operators to be compiled for execution on a host processor, and operators to be executed on the machine learning framework.
Type: Application
Filed: November 27, 2019
Publication date: May 27, 2021
Inventors: Animesh Jain, Yizhi Liu, Hongbin Zheng, Jeffrey T. Huynh, Haichen Li, Drazen Borkovic, Jindrich Zejda, Richard John Heaton, Randy Renfu Huang, Zhi Chen, Yida Wang
-
Publication number: 20210158132
Abstract: A computer-implemented method includes receiving a neural network model for implementation using a processing element array, where the neural network model includes a convolution operation on a set of input feature maps and a set of filters. The method also includes determining, based on the neural network model, that the convolution operation utilizes less than a threshold number of rows in the processing element array for applying a set of filter elements to the set of input feature maps, where the set of filter elements includes one filter element in each filter of the set of filters. The method further includes generating, for the convolution operation and based on the neural network model, a first instruction and a second instruction for execution by respective rows in the processing element array, where the first instruction and the second instruction use different filter elements of a filter in the set of filters.
Type: Application
Filed: November 27, 2019
Publication date: May 27, 2021
Inventors: Jeffrey T. Huynh, Ron Diamant, Hongbin Zheng, Yizhi Liu, Animesh Jain, Yida Wang, Vinod Sharma, Richard John Heaton, Randy Renfu Huang, Sundeep Amirineni, Drazen Borkovic
-
Patent number: 10637472
Abstract: A reference voltage generation circuit for use with current mode logic includes a first transistor of a first conductivity type configured to operate as a diode-connected resistor, with a source terminal coupled to a first voltage supply terminal for conducting a supply voltage and a gate terminal coupled to a drain terminal. Second and third transistors of a second conductivity type are coupled in series between the drain terminal of the first transistor and a second voltage supply terminal. Gate terminals of the second and third transistors are coupled to the gate terminal of the first transistor. A reference voltage is obtained between the second and third transistors. The second and third transistors are sized such that they remain in sub-threshold mode operation across the expected range of the supply voltage. Current mode logic circuits are also provided using the reference voltage generation circuit.
Type: Grant
Filed: May 21, 2019
Date of Patent: April 28, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Aditya Mitra, Animesh Jain
-
Publication number: 20190347549
Abstract: Functions are added to a deep neural network ("DNN") computation graph for encoding data structures during a forward training pass of the DNN and decoding previously-encoded data structures during a backward training pass of the DNN. The functions added to the DNN computation graph can be selected based upon the specific layer pairs specified in the DNN computation graph. Once a modified DNN computation graph has been generated, the DNN can be trained using the modified DNN computation graph. The functions added to the modified DNN computation graph can reduce the utilization of memory during training of the DNN.
Type: Application
Filed: June 29, 2018
Publication date: November 14, 2019
Inventors: Amar Phanishayee, Gennady Pekhimenko, Animesh Jain
-
Patent number: 9881723
Abstract: The invention provides an alternative to the standard 2-piece or 4-piece quadrupole. For example, an 8-piece and a 10-piece quadrupole are provided whereby the tips of each pole may be adjustable. Also provided is a method for producing a quadrupole using standard machining techniques but which results in a final tolerance accuracy of the resulting construct which is better than that obtained using standard machining techniques.
Type: Grant
Filed: January 13, 2017
Date of Patent: January 30, 2018
Assignee: UCHICAGO ARGONNE, LLC
Inventors: Mark S. Jaski, Jie Liu, Aric T. Donnelly, Joshua S. Downey, Jeremy J. Nudell, Animesh Jain
-
Patent number: 8504866
Abstract: Embodiments of systems and methods are described for reducing the effects of hysteresis in the operation of data processing circuitry. In this embodiment of the invention, adaptive control circuitry is used to reduce the effects of hysteresis. The embodiment disclosed herein provides a significant reduction in the effects of hysteresis and, therefore, a significant reduction in the amount of guard band needed to compensate for hysteresis effects in SOI processes, thereby improving the performance/power characteristics of the circuit.
Type: Grant
Filed: July 30, 2010
Date of Patent: August 6, 2013
Assignee: Advanced Micro Devices, Inc.
Inventors: Arun Iyer, Bhawna Tomar, Animesh Jain, Krishna Sethupathy Leela
-
Patent number: 8238187
Abstract: Embodiments of systems and methods for improved first-in-first-out (FIFO), last-in-first-out (LIFO) and full-cycle decoders are described herein. In the various embodiments of the system, a clock generator is operable to generate a clock signal having an active phase and an inactive phase. A set of monotonic flip-flops are operable to capture a set of incoming data addresses during the active phase of the clock and to generate therefrom data corresponding to single bits in the addresses that have changed compared to the data addresses received by the set of monotonic flip-flops during an immediately preceding data capture cycle. A set of static flip-flops are operable to capture a set of incoming data addresses during the inactive phase of the clock cycle and to generate set output data therefrom. A decoder is operable to process the set output data from the set of static flip-flops and to generate a set of old wordlines corresponding to a set of data addresses in the immediately preceding data capture cycle.
Type: Grant
Filed: July 30, 2010
Date of Patent: August 7, 2012
Assignee: Advanced Micro Devices, Inc.
Inventors: Animesh Jain, Nagendra Chandrakar, Sonia Ghosh
-
Patent number: 8120406
Abstract: A pulsed latch circuit with conditional shutoff prevents an input node of the pulsed latch circuit, such as a node receiving data, from latching data based on a delayed input control signal, such as an internal clocking signal, and based on a feedback latch state transition detection signal indicating that a current state of input data is stored in the latch. As such, two control conditions are used to shut down the latch. In one example, a condition generator detects when the latch has captured data correctly and outputs a signal to disable the input node. In addition, a variable delay circuit is used to adjust the width of the allowable input signal to set a worst-case shutoff time. If data is latched early, a feedback latch state transition detection signal causes the input node to be disabled. If data is not latched early, the maximum allowable latch time is set by the variable delay circuit.
Type: Grant
Filed: July 29, 2009
Date of Patent: February 21, 2012
Assignee: ATI Technologies ULC
Inventors: Arun Iyer, Shibashish Patel, Animesh Jain