Patents by Inventor Mahesh Mehendale
Mahesh Mehendale has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250077230
Abstract: Disclosed herein are improvements to instructions and hardware for performing neural network operations. In an embodiment, a processing device includes instruction fetch circuitry, decoder circuitry, and neural network operation circuitry. The instruction fetch circuitry is configured to fetch from memory a neural network instruction that specifies an operation from a group of operations and a set of values that enable sub-circuits of the neural network operation circuitry for use with one or more operations of the group, and to provide the neural network instruction to the decoder circuitry. The decoder circuitry is configured to cause the neural network operation circuitry to perform, based on the operation, a convolution operation using a first sub-circuit of the neural network operation circuitry and a first subset of the set of values, or a batch normalization operation using a second sub-circuit of the neural network operation circuitry and a second subset of the set of values.
Type: Application
Filed: April 23, 2024
Publication date: March 6, 2025
Inventors: Atul Lele, Mahesh Mehendale, Uri Weinrib, Anurag Choudhury
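The dispatch idea in this abstract can be modeled in software. The sketch below is purely illustrative (the opcode names, value layout, and arithmetic are assumptions, not the patented encoding): a toy instruction carries an opcode plus a set of values, and a decoder routes the relevant subset of values to one of two "sub-circuits".

```python
def decode_and_execute(opcode, values):
    """Dispatch a toy NN instruction to one of two modeled 'sub-circuits'."""
    if opcode == "conv":
        # First subset of values: a 1-D signal and a kernel (valid convolution).
        signal, kernel = values["signal"], values["kernel"]
        n = len(signal) - len(kernel) + 1
        return [sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
                for i in range(n)]
    elif opcode == "batchnorm":
        # Second subset of values: mean, variance, scale (gamma), shift (beta).
        x, mean, var = values["x"], values["mean"], values["var"]
        gamma, beta, eps = values["gamma"], values["beta"], 1e-5
        return [gamma * (v - mean) / (var + eps) ** 0.5 + beta for v in x]
    raise ValueError(f"unknown opcode {opcode!r}")

out = decode_and_execute("conv", {"signal": [1, 2, 3, 4], "kernel": [1, 0, -1]})
print(out)  # correlation of [1,2,3,4] against [1,0,-1]
```

The point of the shared "set of values" in the abstract is that one instruction format can feed either sub-circuit; here that is mimicked by one `values` dictionary from which each branch reads only its own subset.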
-
Publication number: 20250004762
Abstract: A system for accelerating binary convolution operations of a neural network includes a set of destination registers, binary convolution circuitry, a decoder coupled to the binary convolution circuitry, and instruction fetch circuitry coupled to the decoder and configured to fetch a binary convolution instruction from an associated memory. The binary convolution instruction specifies input data, weight data, and the set of destination registers for performing a binary convolution operation. The decoder receives the binary convolution instruction from the instruction fetch circuitry and causes the input data and the weight data to be provided to the binary convolution circuitry. In response, the binary convolution circuitry performs the binary convolution operation on the input data and the weight data to produce output data and stores the output data in the set of destination registers.
Type: Application
Filed: June 29, 2023
Publication date: January 2, 2025
Inventors: Mahesh Mehendale, Uri Weinrib, Avi Berkovich
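The abstract does not spell out the arithmetic, but binarized convolutions are commonly computed by encoding +1/−1 values as bits and replacing multiply-accumulate with XNOR plus popcount — assumed here only to illustrate why dedicated circuitry pays off:

```python
def pack(vec):
    """Pack a list of +1/-1 values into an integer, bit i = 1 iff vec[i] == +1."""
    return sum((1 << i) for i, v in enumerate(vec) if v == 1)

def binary_dot(a_bits, b_bits, n):
    """Dot product of two n-element {+1,-1} vectors packed as integers."""
    xnor = ~(a_bits ^ b_bits) & ((1 << n) - 1)  # bit is 1 where signs agree
    matches = bin(xnor).count("1")              # popcount of agreements
    return 2 * matches - n                      # agreements minus disagreements

a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
print(binary_dot(pack(a), pack(b), 4))  # 1 - 1 - 1 + 1 = 0
```

A binary convolution is then just this dot product slid across the packed input, which is why hardware with wide XNOR/popcount datapaths can replace many multipliers.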
-
Patent number: 10423414
Abstract: In an embodiment, a device including a processor, a plurality of hardware accelerator engines, and a hardware scheduler is disclosed. The processor is configured to schedule an execution of a plurality of instruction threads, where each instruction thread includes a plurality of instructions associated with an execution sequence. The plurality of hardware accelerator engines performs the scheduled execution of the plurality of instruction threads. The hardware scheduler is configured to control the scheduled execution such that each hardware accelerator engine executes a corresponding instruction and the plurality of instructions are executed by the plurality of hardware accelerator engines in a sequential manner. The plurality of instruction threads are executed by the plurality of hardware accelerator engines in a parallel manner based on the execution sequence and an availability status of each of the plurality of hardware accelerator engines.
Type: Grant
Filed: November 12, 2014
Date of Patent: September 24, 2019
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Ajit Deepak Gupte, Mahesh Mehendale, Navin Acharya, Mel Alan Phipps
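The scheduling behavior described — instructions sequential within a thread, threads parallel across engines subject to engine availability — can be sketched as a cycle-by-cycle software model (a simplification with made-up engine names and one-cycle instructions, not the patented scheduler):

```python
def schedule(threads, engines):
    """Greedy dispatch: each thread's next instruction runs as soon as
    the engine it names is free; returns a (thread, step, cycle) log."""
    pc = [0] * len(threads)          # next instruction index per thread
    busy = {e: 0 for e in engines}   # cycle at which each engine frees up
    log, cycle = [], 0
    while any(pc[t] < len(threads[t]) for t in range(len(threads))):
        for t, thread in enumerate(threads):
            if pc[t] < len(thread):
                engine = thread[pc[t]]
                if busy[engine] <= cycle:    # availability status check
                    busy[engine] = cycle + 1 # occupy the engine this cycle
                    log.append((t, pc[t], cycle))
                    pc[t] += 1               # in-order within the thread
        cycle += 1
    return log

# Two threads using engines "A" and "B" in opposite orders overlap fully.
log = schedule([["A", "B"], ["B", "A"]], ["A", "B"])
```

In this toy run both threads finish in two cycles because at every cycle each thread's next instruction targets a different engine — the overlap the abstract attributes to availability-based parallel execution.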
-
Patent number: 9734896
Abstract: In described examples, a memory controller circuit controls accesses to an SRAM circuit. Precharge mode control circuitry outputs: a burst mode enable signal to the SRAM circuit indicating that a series of SRAM cells along a selected row of SRAM cells will be accessed; a precharge first mode signal to the SRAM circuit indicating that a first access along the selected row will occur; and a precharge last mode signal to the SRAM circuit indicating that a last access along the selected row will occur. The SRAM circuit includes an array of SRAM cells arranged in rows and columns to store data. Each SRAM cell is coupled to: a corresponding word line along a row of SRAM cells; and a corresponding pair of complementary bit lines.
Type: Grant
Filed: June 30, 2016
Date of Patent: August 15, 2017
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Per Torstein Roine, Vinod Menezes, Mahesh Mehendale, Vamsi Gullapalli, Premkumar Seetharaman
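A rough behavioral model hints at why the first/last precharge signals matter (the exact precharge semantics are an assumption here, not stated in the abstract): without burst mode the bit lines are precharged around every access, while in burst mode only the first and last accesses of a same-row burst trigger precharge events.

```python
def precharge_count(accesses, burst_mode):
    """Count bit-line precharge events for a run of same-row accesses
    (assumed model: one per access normally; first + last in burst mode)."""
    if not burst_mode:
        return len(accesses)      # precharge around every access
    return 2 if accesses else 0   # precharge-first + precharge-last only

burst = list(range(8))            # eight consecutive accesses along one row
print(precharge_count(burst, burst_mode=False))  # 8
print(precharge_count(burst, burst_mode=True))   # 2
```

Since bit-line precharge is a major dynamic-power contributor in SRAM, skipping the intermediate precharges of a long burst is where the savings would come from under this model.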
-
Publication number: 20160314832
Abstract: In described examples, a memory controller circuit controls accesses to an SRAM circuit. Precharge mode control circuitry outputs: a burst mode enable signal to the SRAM circuit indicating that a series of SRAM cells along a selected row of SRAM cells will be accessed; a precharge first mode signal to the SRAM circuit indicating that a first access along the selected row will occur; and a precharge last mode signal to the SRAM circuit indicating that a last access along the selected row will occur. The SRAM circuit includes an array of SRAM cells arranged in rows and columns to store data. Each SRAM cell is coupled to: a corresponding word line along a row of SRAM cells; and a corresponding pair of complementary bit lines.
Type: Application
Filed: June 30, 2016
Publication date: October 27, 2016
Inventors: Per Torstein Roine, Vinod Menezes, Mahesh Mehendale, Vamsi Gullapalli, Premkumar Seetharaman
-
Patent number: 9384826
Abstract: In aspects of the present application, circuitry for storing data is provided, including a static random access memory (SRAM) circuit operable to store data in an array of SRAM cell circuits arranged in rows and columns, each SRAM cell coupled to a pair of complementary bit lines disposed along the columns of SRAM cell circuits, and one or more precharge circuits in the SRAM circuit coupled to one or more pairs of the complementary bit lines and operable to charge the pairs of complementary bit lines to a precharge voltage, responsive to a precharge control signal. The precharge control signal within the SRAM circuit is operable to cause coupling transistors within the SRAM circuit to couple a pair of complementary bit lines to the precharge voltage, responsive to mode signals output from a memory controller circuit external to the SRAM circuit indicating that a bit line precharge is to be performed.
Type: Grant
Filed: December 5, 2014
Date of Patent: July 5, 2016
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Per Torstein Roine, Vinod Menezes, Mahesh Mehendale, Vamsi Gullapalli, Premkumar Seetharaman
-
Publication number: 20160163379
Abstract: In aspects of the present application, circuitry for storing data is provided, including a static random access memory (SRAM) circuit operable to store data in an array of SRAM cell circuits arranged in rows and columns, each SRAM cell coupled to a pair of complementary bit lines disposed along the columns of SRAM cell circuits, and one or more precharge circuits in the SRAM circuit coupled to one or more pairs of the complementary bit lines and operable to charge the pairs of complementary bit lines to a precharge voltage, responsive to a precharge control signal. The precharge control signal within the SRAM circuit is operable to cause coupling transistors within the SRAM circuit to couple a pair of complementary bit lines to the precharge voltage, responsive to mode signals output from a memory controller circuit external to the SRAM circuit indicating that a bit line precharge is to be performed.
Type: Application
Filed: December 5, 2014
Publication date: June 9, 2016
Inventors: Per Torstein Roine, Vinod Menezes, Mahesh Mehendale, Vamsi Gullapalli, Premkumar Seetharaman
-
Publication number: 20160132329
Abstract: In an embodiment, a device including a processor, a plurality of hardware accelerator engines, and a hardware scheduler is disclosed. The processor is configured to schedule an execution of a plurality of instruction threads, where each instruction thread includes a plurality of instructions associated with an execution sequence. The plurality of hardware accelerator engines performs the scheduled execution of the plurality of instruction threads. The hardware scheduler is configured to control the scheduled execution such that each hardware accelerator engine executes a corresponding instruction and the plurality of instructions are executed by the plurality of hardware accelerator engines in a sequential manner. The plurality of instruction threads are executed by the plurality of hardware accelerator engines in a parallel manner based on the execution sequence and an availability status of each of the plurality of hardware accelerator engines.
Type: Application
Filed: November 12, 2014
Publication date: May 12, 2016
Inventors: Ajit Deepak Gupte, Mahesh Mehendale, Navin Acharya, Mel Alan Phipps
-
Publication number: 20040205326
Abstract: This invention reduces redundant power consumption by early detection of predicate register values. The hardware detects pending writes to the predicate registers. When there are no pending predicate register updates, the predicate value is read in the decode stage and a decision whether to nullify the instruction is made. When a write is pending, the instruction executes normally and only the result write-back is dependent upon the newly written predicate value. In the former case, nullifying the instruction before completion saves power. The compiler attempts to increase the distance between the predicate definition and the predicate use by the number of cycles required by the architecture. This scheduling increases the conditions under which early predicate detection is possible and hence enhances the possibility of power saving.
Type: Application
Filed: March 12, 2004
Publication date: October 14, 2004
Inventors: Vijay K. G. Sindagi, Mahesh Mehendale
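The two cases in this abstract can be sketched as a tiny pipeline model (register names, stage names, and the resolution rule are hypothetical simplifications): when no in-flight write targets the predicate register, a false predicate kills the instruction at decode; otherwise the instruction must execute and only the write-back is gated.

```python
def issue(pred_regs, pending_writes, instr):
    """Return the pipeline stages a predicated instruction consumes."""
    p = instr["pred"]
    if p not in pending_writes:                 # predicate value is final
        if not pred_regs[p]:
            return ["decode"]                   # nullified early: execute skipped
        return ["decode", "execute", "writeback"]
    # Predicate still being produced: execute normally, gate only write-back
    # on the newly written value (modeled here as the final register state).
    stages = ["decode", "execute"]
    if pred_regs[p]:
        stages.append("writeback")
    return stages

regs = {"p0": False, "p1": True}
print(issue(regs, set(), {"pred": "p0"}))   # early nullification path
print(issue(regs, {"p1"}, {"pred": "p1"}))  # full-execution path
```

The compiler scheduling mentioned in the abstract amounts to making the first branch of `issue` the common case: the farther the predicate write is from its use, the more often the predicate is already final at decode.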
-
Patent number: 6341344
Abstract: A method and apparatus for manipulating data from a processor on a stack memory is disclosed. The method and apparatus comprises aligning a stack pointer (104) in the stack memory (110) to a first memory address (126). The method further comprises incrementing the stack pointer (104) to a second memory address (128). The method further comprises saving data from a register (102) into the stack memory (110) at the second memory address (128). The method further comprises aligning the stack pointer (104) to a next even address if at an odd address when the saving step is complete. The method further comprises performing processor operations. The method further comprises unaligning the stack pointer (104) from the even address back to the odd address. The method further comprises restoring data from the stack memory (110) into the register (102). The method further comprises decrementing the stack pointer (104) from the second memory address (128) to the first memory address (126).
Type: Grant
Filed: March 18, 1999
Date of Patent: January 22, 2002
Assignee: Texas Instruments Incorporated
Inventors: Alexander Tessarolo, Mahesh Mehendale
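The alignment arithmetic at the heart of this method is simple to state concretely. The sketch below is a loose illustration with made-up addresses and a 16-bit word size (the patent's exact save/restore sequence and widths are not reproduced): an odd stack pointer is bumped to the next even address so a word-sized register save lands aligned, and the bump is undone symmetrically on restore.

```python
def align_even(sp):
    """Advance an odd stack pointer to the next even address (no-op if even)."""
    return sp + (sp & 1)

def push_word(sp, mem, value):
    """Save a 16-bit value at the even-aligned stack pointer; return new sp."""
    sp = align_even(sp)
    mem[sp] = value      # aligned word store never straddles a boundary
    return sp + 2        # pointer now past the saved word

sp, mem = 0x1001, {}     # hypothetical odd starting address
sp = push_word(sp, mem, 0xBEEF)
print(hex(sp), mem)      # pointer advanced past an even, aligned slot
```

Keeping the save address even is what lets the word-wide store complete in a single aligned access; the cost is remembering the original odd address so the pointer can be "unaligned" back when execution resumes.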
-
Patent number: 5751162
Abstract: A logic module 400 for use in a field programmable gate array 100 can be selectively reconfigured to perform over 2,200 Boolean combinational functions on output 431, to operate as a full adder with sum and carry outputs, or to perform the sequential function of a D latch or a D flip-flop. Logic module 400 is comprised of 2-input multiplexers 500 and 600, which are used to form both the combinational and sequential circuits, thereby efficiently utilizing space on gate array 100.
Type: Grant
Filed: June 7, 1996
Date of Patent: May 12, 1998
Assignee: Texas Instruments Incorporated
Inventors: Mahesh Mehendale, Shivaling Mahant-Shetti, Manisha Agarwala, Mark G. Harward, Robert J. Landers
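Why a handful of 2-input multiplexers can realize so many functions follows from Shannon expansion — a generic argument, not the specific wiring of the patented module. A 2:1 mux computes `a if sel else b`, and a small tree of them evaluates any 2-input Boolean function from its truth table:

```python
def mux2(sel, a, b):
    """2:1 multiplexer: selects a when sel is 1, else b."""
    return a if sel else b

def two_input_fn(truth_table, x, y):
    """Evaluate any 2-input Boolean function given as (f00, f01, f10, f11),
    using only 2:1 muxes via Shannon expansion on x, then y."""
    f00, f01, f10, f11 = truth_table
    return mux2(x, mux2(y, f11, f10), mux2(y, f01, f00))

XOR = (0, 1, 1, 0)
print([two_input_fn(XOR, x, y) for x in (0, 1) for y in (0, 1)])
```

Tying mux inputs to constants, variables, or other mux outputs multiplies the reachable function count rapidly, and feeding an output back to an input yields the latch/flip-flop behavior — which is how one mux-based cell can cover both combinational and sequential roles.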