Patents by Inventor Mihir Mody

Mihir Mody has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220114120
    Abstract: A processing accelerator includes a shared memory, and a stream accelerator, a memory-to-memory accelerator, and a common DMA controller coupled to the shared memory. The stream accelerator is configured to process a real-time data stream, and to store stream accelerator output data generated by processing the real-time data stream in the shared memory. The memory-to-memory accelerator is configured to retrieve input data from the shared memory, to process the input data, and to store, in the shared memory, memory-to-memory accelerator output data generated by processing the input data. The common DMA controller is configured to retrieve stream accelerator output data from the shared memory and transfer the stream accelerator output data to memory external to the processing accelerator; and to retrieve the memory-to-memory accelerator output data from the shared memory and transfer the memory-to-memory accelerator output data to memory external to the processing accelerator.
    Type: Application
    Filed: December 21, 2021
    Publication date: April 14, 2022
    Inventors: Mihir MODY, Niraj NANDAN, Hetul SANGHVI, Brian CHAE, Rajasekhar Reddy ALLU, Jason A.T. JONES, Anthony LELL, Anish REGHUNATH
  • Patent number: 11237991
    Abstract: A processing accelerator includes a shared memory, and a stream accelerator, a memory-to-memory accelerator, and a common DMA controller coupled to the shared memory. The stream accelerator is configured to process a real-time data stream, and to store stream accelerator output data generated by processing the real-time data stream in the shared memory. The memory-to-memory accelerator is configured to retrieve input data from the shared memory, to process the input data, and to store, in the shared memory, memory-to-memory accelerator output data generated by processing the input data. The common DMA controller is configured to retrieve stream accelerator output data from the shared memory and transfer the stream accelerator output data to memory external to the processing accelerator; and to retrieve the memory-to-memory accelerator output data from the shared memory and transfer the memory-to-memory accelerator output data to memory external to the processing accelerator.
    Type: Grant
    Filed: August 17, 2020
    Date of Patent: February 1, 2022
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Mihir Mody, Niraj Nandan, Hetul Sanghvi, Brian Chae, Rajasekhar Reddy Allu, Jason A. T. Jones, Anthony Lell, Anish Reghunath
  • Publication number: 20220012312
    Abstract: In some examples, a system includes storage storing a machine learning model, wherein the machine learning model comprises a plurality of layers comprising multiple weights. The system also includes a processing unit coupled to the storage and operable to group the weights in each layer into a plurality of partitions; determine a number of least significant bits to be used for watermarking in each of the plurality of partitions; insert one or more watermark bits into the determined least significant bits for each of the plurality of partitions; and scramble one or more of the weight bits to produce watermarked and scrambled weights. The system also includes an output device to provide the watermarked and scrambled weights to another device.
    Type: Application
    Filed: September 28, 2021
    Publication date: January 13, 2022
    Inventors: Deepak Kumar PODDAR, Mihir MODY, Veeramanikandan RAJU, Jason A.T. JONES
  • Publication number: 20210397528
    Abstract: A system to implement debugging for a multi-threaded processor is provided. The system includes a hardware thread scheduler configured to schedule processing of data, and a plurality of schedulers, each configured to schedule a given pipeline for processing instructions. The system further includes a debug control configured to control at least one of the plurality of schedulers to halt, step, or resume the given pipeline of the at least one of the plurality of schedulers for the data to enable debugging thereof. The system further includes a plurality of hardware accelerators configured to implement a series of tasks in accordance with a schedule provided by a respective scheduler in accordance with a command from the debug control. Each of the plurality of hardware accelerators is coupled to at least one of the plurality of schedulers to execute the instructions for the given pipeline and to a shared memory.
    Type: Application
    Filed: August 31, 2021
    Publication date: December 23, 2021
    Inventors: NIRAJ NANDAN, HETUL SANGHVI, MIHIR MODY, GARY COOPER, ANTHONY LELL
  • Patent number: 11163861
    Abstract: In some examples, a system includes storage storing a machine learning model, wherein the machine learning model comprises a plurality of layers comprising multiple weights. The system also includes a processing unit coupled to the storage and operable to group the weights in each layer into a plurality of partitions; determine a number of least significant bits to be used for watermarking in each of the plurality of partitions; insert one or more watermark bits into the determined least significant bits for each of the plurality of partitions; and scramble one or more of the weight bits to produce watermarked and scrambled weights. The system also includes an output device to provide the watermarked and scrambled weights to another device.
    Type: Grant
    Filed: November 13, 2018
    Date of Patent: November 2, 2021
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Deepak Kumar Poddar, Mihir Mody, Veeramanikandan Raju, Jason A. T. Jones
  • Publication number: 20210326276
    Abstract: Methods and apparatus to extend local buffer of a hardware accelerator are disclosed herein. In some examples, an apparatus, including a local memory, a first hardware accelerator (HWA), a second HWA, the second HWA and the first HWA connected in a flexible data pipeline, and a spare scheduler to manage, in response to the spare scheduler inserted in the flexible data pipeline, data movement between the first HWA and the second HWA through the local memory and a memory. Local buffer extension may be performed by software to control data movement between local memory and other system memory. The other system memory may be on-chip memory and/or external memory. The HWA sub-system includes a set of spare schedulers to manage the data movement. Data aggregation may be performed in the other system memory. Additionally, the other system memory may be utilized for conversion between data line and data block.
    Type: Application
    Filed: December 30, 2020
    Publication date: October 21, 2021
    Inventors: Niraj Nandan, Mihir Mody, Rajat Sagar
  • Publication number: 20210326229
    Abstract: Methods, apparatus, systems and articles of manufacture for an example event processor are disclosed to retrieve an input event and an input event timestamp corresponding to the input event, generate an output event based on the input event and the input event timestamp, in response to determination that an input event threshold is exceeded within a threshold of time, and an anomaly detector to retrieve the output event, determine whether the output event indicates threat to functional safety of a system on a chip, and in response to determining the output event indicates threat to functional safety of the system on a chip, adapt a process for the system on a chip to preserve functional safety.
    Type: Application
    Filed: December 28, 2020
    Publication date: October 21, 2021
    Inventors: Rajat Sagar, Niraj Nandan, Kedar Chitnis, Brijesh Jadav, Mihir Mody
  • Publication number: 20210326174
    Abstract: A device includes a hardware data processing node configured to execute a respective task, and a hardware thread scheduler including a hardware task scheduler. The hardware task scheduler is coupled to the hardware data processing node and has a producer socket, a consumer socket, and a spare socket. The spare socket is configured to provide data control signals also provided by a first socket of the producer and consumer sockets responsive to a memory-mapped register being a first value. The spare socket is configured to provide data control signals also provided by a second socket of the producer and consumer sockets responsive to the memory-mapped register being a second value.
    Type: Application
    Filed: December 30, 2020
    Publication date: October 21, 2021
    Inventors: Niraj NANDAN, Mihir MODY
  • Patent number: 11144417
    Abstract: A system to implement debugging for a multi-threaded processor is provided. The system includes a hardware thread scheduler configured to schedule processing of data, and a plurality of schedulers, each configured to schedule a given pipeline for processing instructions. The system further includes a debug control configured to control at least one of the plurality of schedulers to halt, step, or resume the given pipeline of the at least one of the plurality of schedulers for the data to enable debugging thereof. The system further includes a plurality of hardware accelerators configured to implement a series of tasks in accordance with a schedule provided by a respective scheduler in accordance with a command from the debug control. Each of the plurality of hardware accelerators is coupled to at least one of the plurality of schedulers to execute the instructions for the given pipeline and to a shared memory.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: October 12, 2021
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Niraj Nandan, Hetul Sanghvi, Mihir Mody, Gary Cooper, Anthony Lell
  • Publication number: 20210209390
    Abstract: An image data frame is received from an external source. An error concealment operation is performed on the received image data frame in response to determining that a first frame size of the received image data frame is erroneous. The first frame size of the image data frame is determined to be erroneous based on at least one frame synchronization signal associated with the image data frame. An image processing operation is performed on the received image data frame on which the error concealment operation has been performed, thereby enabling an image processing module to perform the image processing operation without entering into a deadlock state and thereby prevent a host processor from having to execute hardware resets of deadlocked modules.
    Type: Application
    Filed: January 17, 2020
    Publication date: July 8, 2021
    Inventors: Brian Okchon CHAE, Niraj NANDAN, Anthony Joseph LELL, Mihir MODY
  • Publication number: 20210209041
    Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed herein to enable data aggregation and pattern adaptation in hardware acceleration subsystems. In some examples, a hardware acceleration subsystem includes a first scheduler, a first hardware accelerator coupled to the first scheduler to process at least a first data element and a second data element, and a first load store engine coupled to the first hardware accelerator, the first load store engine configured to communicate with the first scheduler at a superblock level by sending a done signal to the first scheduler in response to determining that a block count is equal to a first BPR value and aggregate the first data element and the second data element based on the first BPR value to generate a first aggregated data element.
    Type: Application
    Filed: December 31, 2020
    Publication date: July 8, 2021
    Inventors: Niraj Nandan, Rajasekhar Reddy Allu, Brian Chae, Mihir Mody
  • Publication number: 20210136358
    Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed for image frame freeze detection. An example hardware accelerator includes a core logic circuit to generate second image data based on first image data associated with a first image frame, the second image data corresponding to at least one of processed image data, transformed image data, or one or more image data statistics, a load/store engine (LSE) coupled to the core logic circuit, the LSE to determine a first CRC value based on the second image data obtained from the core logic circuit, and a first interface coupled to a second interface, the second interface coupled to memory, the first interface to transmit the first CRC value obtained from the memory to a host device.
    Type: Application
    Filed: October 30, 2019
    Publication date: May 6, 2021
    Inventors: Niraj Nandan, Brian Chae, Mihir Mody, Rajasekhar Reddy Allu
  • Publication number: 20210005005
    Abstract: A method and system for dynamically transferring graphical image processing operations from a graphical processing unit (GPU) to a digital signal processor (DSP). The method includes estimating the number of operations needed for the processing a set of image data; determining the operational limits of a GPU and compare with estimated number of operations and if the operational limits are exceeded; transfer the processing operations to the DSP from the GPU. The transfer can include transferring a portion of executable code for performing the processing operations, and generating a replacement code for the GPU. The DSP can then process a portion of the image data before sending it to the GPU for further processing.
    Type: Application
    Filed: September 22, 2020
    Publication date: January 7, 2021
    Inventors: Mihir Mody, Hemant Hariyani, Anand Balagopalakrishnan, Jason Jones, Ajay Jayaraj, Manoj Koul
  • Publication number: 20200379928
    Abstract: A processing accelerator includes a shared memory, and a stream accelerator, a memory-to-memory accelerator, and a common DMA controller coupled to the shared memory. The stream accelerator is configured to process a real-time data stream, and to store stream accelerator output data generated by processing the real-time data stream in the shared memory. The memory-to-memory accelerator is configured to retrieve input data from the shared memory, to process the input data, and to store, in the shared memory, memory-to-memory accelerator output data generated by processing the input data. The common DMA controller is configured to retrieve stream accelerator output data from the shared memory and transfer the stream accelerator output data to memory external to the processing accelerator; and to retrieve the memory-to-memory accelerator output data from the shared memory and transfer the memory-to-memory accelerator output data to memory external to the processing accelerator.
    Type: Application
    Filed: August 17, 2020
    Publication date: December 3, 2020
    Inventors: Mihir MODY, Niraj NANDAN, Hetul SANGHVI, Brian CHAE, Rajasekhar Reddy ALLU, Jason A.T. JONES, Anthony LELL, Anish REGHUNATH
  • Publication number: 20200363210
    Abstract: Information communication circuitry, including a first integrated circuit for coupling to a second integrated circuit in a package on package configuration. The first integrated circuit comprises processing circuitry for communicating information bits, and the information bits comprise data bits and error correction bits, where the error correction bits are for indicating whether data bits are received correctly. The second integrated circuit comprises a memory for receiving and storing at least some of the information bits. The information communication circuitry also includes interfacing circuitry for selectively communicating, along a number of conductors, between the package on package configuration. In a first instance, the interfacing circuitry selectively communicates only data bits along the number of conductors.
    Type: Application
    Filed: August 3, 2020
    Publication date: November 19, 2020
    Inventors: Rahul Gulati, Aishwarya Dubey, Nainala Vyagrheswarudu, Vasant Easwaran, Prashant Dinkar Karandikar, Mihir Mody
  • Patent number: 10818067
    Abstract: A method and system for dynamically transferring graphical image processing operations from a graphical processing unit (GPU) to a digital signal processor (DSP). The method includes estimating the number of operations needed for the processing a set of image data; determining the operational limits of a GPU and compare with estimated number of operations and if the operational limits are exceeded; transfer the processing operations to the DSP from the GPU. The transfer can include transferring a portion of executable code for performing the processing operations, and generating a replacement code for the GPU. The DSP can then process a portion of the image data before sending it to the GPU for further processing.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: October 27, 2020
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Mihir Mody, Hemant Hariyani, Anand Balagopalakrishnan, Jason Jones, Ajay Jayaraj, Manoj Koul
  • Patent number: 10767998
    Abstract: Information communication circuitry, including a first integrated circuit for coupling to a second integrated circuit in a package on package configuration. The first integrated circuit comprises processing circuitry for communicating information bits, and the information bits comprise data bits and error correction bits, where the error correction bits are for indicating whether data bits are received correctly. The second integrated circuit comprises a memory for receiving and storing at least some of the information bits. The information communication circuitry also includes interfacing circuitry for selectively communicating, along a number of conductors, between the package on package configuration. In a first instance, the interfacing circuitry selectively communicates only data bits along the number of conductors.
    Type: Grant
    Filed: August 28, 2018
    Date of Patent: September 8, 2020
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Rahul Gulati, Aishwarya Dubey, Nainala Vyagrheswarudu, Vasant Easwaran, Prashant Dinkar Karandikar, Mihir Mody
  • Patent number: 10747692
    Abstract: A processing accelerator includes a shared memory, and a stream accelerator, a memory-to-memory accelerator, and a common DMA controller coupled to the shared memory. The stream accelerator is configured to process a real-time data stream, and to store stream accelerator output data generated by processing the real-time data stream in the shared memory. The memory-to-memory accelerator is configured to retrieve input data from the shared memory, to process the input data, and to store, in the shared memory, memory-to-memory accelerator output data generated by processing the input data. The common DMA controller is configured to retrieve stream accelerator output data from the shared memory and transfer the stream accelerator output data to memory external to the processing accelerator; and to retrieve the memory-to-memory accelerator output data from the shared memory and transfer the memory-to-memory accelerator output data to memory external to the processing accelerator.
    Type: Grant
    Filed: December 27, 2018
    Date of Patent: August 18, 2020
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Mihir Mody, Niraj Nandan, Hetul Sanghvi, Brian Chae, Rajasekhar Reddy Allu, Jason A. T. Jones, Anthony Lell, Anish Reghunath
  • Publication number: 20200210301
    Abstract: A system to implement debugging for a multi-threaded processor is provided. The system includes a hardware thread scheduler configured to schedule processing of data, and a plurality of schedulers, each configured to schedule a given pipeline for processing instructions. The system further includes a debug control configured to control at least one of the plurality of schedulers to halt, step, or resume the given pipeline of the at least one of the plurality of schedulers for the data to enable debugging thereof. The system further includes a plurality of hardware accelerators configured to implement a series of tasks in accordance with a schedule provided by a respective scheduler in accordance with a command from the debug control. Each of the plurality of hardware accelerators is coupled to at least one of the plurality of schedulers to execute the instructions for the given pipeline and to a shared memory.
    Type: Application
    Filed: December 31, 2018
    Publication date: July 2, 2020
    Inventors: NIRAJ NANDAN, Hetul Sanghvi, Mihir Mody, Gary Cooper, Anthony Lell
  • Publication number: 20200210351
    Abstract: A processing accelerator includes a shared memory, and a stream accelerator, a memory-to-memory accelerator, and a common DMA controller coupled to the shared memory. The stream accelerator is configured to process a real-time data stream, and to store stream accelerator output data generated by processing the real-time data stream in the shared memory. The memory-to-memory accelerator is configured to retrieve input data from the shared memory, to process the input data, and to store, in the shared memory, memory-to-memory accelerator output data generated by processing the input data. The common DMA controller is configured to retrieve stream accelerator output data from the shared memory and transfer the stream accelerator output data to memory external to the processing accelerator; and to retrieve the memory-to-memory accelerator output data from the shared memory and transfer the memory-to-memory accelerator output data to memory external to the processing accelerator.
    Type: Application
    Filed: December 27, 2018
    Publication date: July 2, 2020
    Inventors: Mihir MODY, Niraj NANDAN, Hetul SANGHVI, Brian CHAE, Rajasekhar Reddy ALLU, Jason A.T. JONES, Anthony LELL, Anish REGHUNATH