Patents by Inventor Dipan Kumar Mandal

Dipan Kumar Mandal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10395381
    Abstract: Disclosed techniques relate to forming a block sum of picture elements by employing a vector dot product instruction to sum packed picture elements with a mask, producing a vector of masked horizontal picture element sums. The block sum is formed from plural horizontal sums via vector single instruction multiple data (SIMD) addition.
    Type: Grant
    Filed: March 4, 2019
    Date of Patent: August 27, 2019
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Jayasree Sankaranarayanan, Dipan Kumar Mandal
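A minimal Python sketch of the block-sum idea summarized in this entry's abstract, with NumPy standing in for the vector dot product and SIMD addition; the tile size, block size, and function name are illustrative assumptions rather than details from the patent.

```python
# Scalar NumPy model of the block-sum technique: each row of packed picture
# elements is reduced with a dot product against a 0/1 mask (the mask drops
# lanes outside the block), and the per-row horizontal sums are then added.
# Tile and block sizes here are illustrative, not taken from the patent.
import numpy as np

def block_sum(pixels: np.ndarray, block_w: int, block_h: int) -> int:
    """Sum the top-left block_h x block_w block of a packed pixel tile."""
    lanes = pixels.shape[1]
    mask = np.zeros(lanes, dtype=np.int32)
    mask[:block_w] = 1                                # keep only the block's columns
    # one "vector dot product" per row gives a masked horizontal sum
    horizontal_sums = pixels[:block_h].astype(np.int32) @ mask
    # SIMD-style addition of the per-row sums yields the block sum
    return int(horizontal_sums.sum())

tile = np.arange(64, dtype=np.uint8).reshape(8, 8)    # an 8x8 packed pixel tile
print(block_sum(tile, block_w=4, block_h=4))          # 216
```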
  • Publication number: 20190197718
    Abstract: Disclosed techniques relate to forming a block sum of picture elements by employing a vector dot product instruction to sum packed picture elements with a mask, producing a vector of masked horizontal picture element sums. The block sum is formed from plural horizontal sums via vector single instruction multiple data (SIMD) addition.
    Type: Application
    Filed: March 4, 2019
    Publication date: June 27, 2019
    Inventors: Jayasree Sankaranarayanan, Dipan Kumar Mandal
  • Patent number: 10331347
    Abstract: This disclosure is directed to the problem of parallelizing random read access within a reasonably sized block of data for a vector SIMD processor. The invention sets up plural parallel look up tables, moves data from main memory into each of the parallel look up tables, and then employs a look up table read instruction to simultaneously move data from each parallel look up table to a corresponding part of a vector destination register. This enables data processing by vector single instruction multiple data (SIMD) operations. This vector destination register load can be repeated if the tables store additional data to be used. New data can be loaded into the original tables if appropriate. A level one memory is preferably partitioned as part data cache and part directly addressable memory. The look up tables are stored in the directly addressable memory.
    Type: Grant
    Filed: May 29, 2018
    Date of Patent: June 25, 2019
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Jayasree Sankaranarayanan, Dipan Kumar Mandal
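The parallel look up table scheme in this entry's abstract can be modeled in a few lines of NumPy; the per-lane table copies, lane count, and function name below are illustrative assumptions, not the patent's memory layout.

```python
# NumPy model of the parallel look up table read: each SIMD lane has its own
# copy of the table, so one vector LUT-read fills every lane of the destination
# register with an independently addressed element.
import numpy as np

def parallel_table_lookup(table: np.ndarray, indices: np.ndarray) -> np.ndarray:
    """Model of a vector LUT read: lane i reads its table copy at indices[i]."""
    lanes = len(indices)
    # "plural parallel look up tables": one copy per lane, as if each had been
    # moved from main memory into directly addressable level-one memory
    table_copies = np.tile(table, (lanes, 1))
    # a single LUT-read instruction loads all lanes of the destination register
    return table_copies[np.arange(lanes), indices]

lut = np.array([10, 20, 30, 40, 50, 60, 70, 80])
idx = np.array([3, 0, 7, 7, 1, 2, 5, 4])              # random per-lane indices
print(parallel_table_lookup(lut, idx))                # [40 10 80 80 20 30 60 50]
```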
  • Patent number: 10324689
    Abstract: Systems and methods for matrix-solve applications include a memory-optimized hardware acceleration (HWA) solution with scalable architecture (i.e., specialized circuitry) for HWA matrix-solve operations. The matrix-solve solutions described herein may include a scalable hardware architecture with parallel processing (e.g., “within column” processing), which provides the ability to compute several output values in parallel. The HWA matrix-solve solutions described herein may include simultaneous multi-column processing, which provides a lower execution cycle count and a reduced total number of memory accesses. This HWA matrix-solve approach provides low-latency, energy-efficient matrix-solve solutions, which may be used to reduce energy consumption and improve performance in various matrix-based applications, such as computer vision, SLAM, AR/VR/mixed-reality, machine learning, and data analytics.
    Type: Grant
    Filed: November 21, 2017
    Date of Patent: June 18, 2019
    Assignee: Intel IP Corporation
    Inventors: Gurpreet Singh Kalsi, Om Ji Omer, Dipan Kumar Mandal, Santhosh Kumar Rethinagiri, Gopi Neela
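As a rough reference for the “within column” parallelism this abstract mentions, the sketch below models a triangular solve in which each fixed unknown updates all remaining rows of its column in one step; the choice of forward substitution and every name in the code are assumptions, since the abstract does not commit to a specific matrix-solve kernel.

```python
# Forward substitution (L x = b) written so that, once x[j] is fixed, the whole
# remaining part of column j updates every pending row at once -- the kind of
# "within column" parallel step a matrix-solve accelerator can do in hardware.
import numpy as np

def forward_substitution(L: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Solve L x = b for lower-triangular L, one column at a time."""
    n = L.shape[0]
    x = np.zeros(n)
    r = b.astype(float).copy()            # running right-hand side
    for j in range(n):
        x[j] = r[j] / L[j, j]
        # "within column" step: all rows below j are updated in parallel
        r[j + 1:] -= L[j + 1:, j] * x[j]
    return x

L = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [4.0, 1.0, 5.0]])
b = np.array([2.0, 5.0, 14.0])
x = forward_substitution(L, b)
print(np.allclose(L @ x, b))              # True
```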
  • Patent number: 10318834
    Abstract: One embodiment provides image processing circuitry that includes feature extraction circuitry and optimization circuitry. The feature extraction circuitry is to determine a feature descriptor based, at least in part, on a feature point location and a corresponding scale. The optimization circuitry is to optimize an operation of the feature extraction circuitry. Each optimization is to accelerate the operation of the feature extraction circuitry, reduce its power consumption, and/or reduce the system memory bandwidth it uses.
    Type: Grant
    Filed: May 1, 2017
    Date of Patent: June 11, 2019
    Assignee: Intel Corporation
    Inventors: Gurpreet S. Kalsi, Om J. Omer, Biji George, Gopi Neela, Dipan Kumar Mandal, Sreenivas Subramoney
  • Publication number: 20190171909
    Abstract: An example apparatus for selecting keypoints in an image includes a keypoint detector to detect keypoints in a plurality of received images. The apparatus also includes a score calculator to calculate a keypoint score for each of the detected keypoints based on a descriptor score indicating descriptor invariance. The apparatus includes a keypoint selector to select keypoints based on the calculated keypoint scores. The apparatus further includes a descriptor calculator to calculate descriptors for each of the selected keypoints, and a descriptor matcher to match corresponding descriptors between images in the plurality of received images. The apparatus also includes a feature tracker to track a feature in the plurality of images based on the matched descriptors.
    Type: Application
    Filed: December 26, 2018
    Publication date: June 6, 2019
    Applicant: INTEL CORPORATION
    Inventors: Dipan Kumar Mandal, Gurpreet Kalsi, Om J. Omer, Prashant Laddha, Sreenivas Subramoney
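A toy model of the selection step in this entry's abstract: each detected keypoint gets a combined score that folds in a descriptor-invariance term, and only the best-scoring points proceed to descriptor computation and matching. The Keypoint fields and the 50/50 weighting are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Keypoint:
    y: int
    x: int
    detector_response: float      # e.g. corner strength from the detector
    descriptor_score: float       # how invariant the local descriptor is

def select_keypoints(keypoints: list[Keypoint], budget: int,
                     w_desc: float = 0.5) -> list[Keypoint]:
    """Keep the `budget` keypoints with the highest combined keypoint score."""
    def score(kp: Keypoint) -> float:
        return (1.0 - w_desc) * kp.detector_response + w_desc * kp.descriptor_score
    return sorted(keypoints, key=score, reverse=True)[:budget]

kps = [Keypoint(10, 12, 0.9, 0.2), Keypoint(40, 7, 0.5, 0.8), Keypoint(3, 3, 0.1, 0.1)]
print([(k.y, k.x) for k in select_keypoints(kps, budget=2)])   # [(40, 7), (10, 12)]
```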
  • Publication number: 20190043204
    Abstract: An example apparatus for tracking features in image data includes an image data receiver to receive initial image data corresponding to an image from a camera and store the image data in a circular buffer. The apparatus also includes a feature detector to detect features in the image data. The apparatus further includes a feature sorter to sort the detected features to generate sorted feature points. The apparatus includes a feature tracker to track the sorted feature points in subsequent image data corresponding to the image received at the image data receiver. The subsequent image data is to replace the initial image data in the circular buffer.
    Type: Application
    Filed: January 8, 2018
    Publication date: February 7, 2019
    Applicant: Intel IP Corporation
    Inventors: Dipan Kumar Mandal, Nagadastagiri Reddy C., Mahesh Mamidipaka, Om J. Omer
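A minimal model of the buffering scheme described in this abstract: incoming image data is written into a fixed-size circular buffer, so subsequent frames overwrite the oldest slot while previously detected features continue to be tracked. The two-slot depth and class name are illustrative assumptions.

```python
import numpy as np

class FrameRing:
    """Circular buffer holding the most recent camera frames."""
    def __init__(self, slots: int, height: int, width: int):
        self.buf = np.zeros((slots, height, width), dtype=np.uint8)
        self.write_idx = 0

    def push(self, frame: np.ndarray) -> None:
        # subsequent image data replaces the oldest frame in the buffer
        self.buf[self.write_idx] = frame
        self.write_idx = (self.write_idx + 1) % self.buf.shape[0]

    def latest(self) -> np.ndarray:
        return self.buf[(self.write_idx - 1) % self.buf.shape[0]]

ring = FrameRing(slots=2, height=4, width=4)
ring.push(np.full((4, 4), 1, dtype=np.uint8))
ring.push(np.full((4, 4), 2, dtype=np.uint8))
ring.push(np.full((4, 4), 3, dtype=np.uint8))     # overwrites the first frame
print(int(ring.latest()[0, 0]))                   # 3
```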
  • Publication number: 20190042195
    Abstract: Systems and methods for matrix-solve applications include a memory-optimized hardware acceleration (HWA) solution with scalable architecture (i.e., specialized circuitry) for HWA matrix-solve operations. The matrix-solve solutions described herein may include a scalable hardware architecture with parallel processing (e.g., “within column” processing), which provides the ability to compute several output values in parallel. The HWA matrix-solve solutions described herein may include simultaneous multi-column processing, which provides a lower execution cycle count and a reduced total number of memory accesses. This HWA matrix-solve approach provides low-latency, energy-efficient matrix-solve solutions, which may be used to reduce energy consumption and improve performance in various matrix-based applications, such as computer vision, SLAM, AR/VR/mixed-reality, machine learning, and data analytics.
    Type: Application
    Filed: November 21, 2017
    Publication date: February 7, 2019
    Inventors: Gurpreet Singh Kalsi, Om Ji Omer, Dipan Kumar Mandal, Santhosh Kumar Rethinagiri, Gopi Neela
  • Publication number: 20190042539
    Abstract: Systems and methods for a hardware-accelerated matrix decomposition circuit are described herein. This matrix decomposition circuit splits matrix decomposition operations into parallel operation circuits and serial operation circuits, and joins the parallel and serial operation circuits using specific dependency handling logic for efficient parallel execution. This provides fast matrix decomposition with low power consumption, reduced memory footprint, and reduced memory bandwidth.
    Type: Application
    Filed: December 29, 2017
    Publication date: February 7, 2019
    Inventors: Gurpreet Singh Kalsi, Om Ji Omer, Santhosh Kumar Rethinagiri, Anish N K, Dipan Kumar Mandal
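A software sketch of the parallel/serial split this abstract describes, using Cholesky factorization as a representative decomposition (the abstract does not name one). The pivot step is inherently serial, while the trailing-submatrix update is fully data parallel and is the part a hardware circuit can spread across many elements at once.

```python
import numpy as np

def cholesky(a: np.ndarray) -> np.ndarray:
    """Lower-triangular L with L @ L.T == a, for symmetric positive-definite a."""
    a = a.astype(float).copy()
    n = a.shape[0]
    L = np.zeros_like(a)
    for k in range(n):
        # serial operation: the pivot depends on everything computed so far
        L[k, k] = np.sqrt(a[k, k])
        L[k + 1:, k] = a[k + 1:, k] / L[k, k]
        # parallel operation: every trailing element updates independently
        a[k + 1:, k + 1:] -= np.outer(L[k + 1:, k], L[k + 1:, k])
    return L

A = np.array([[4.0, 2.0], [2.0, 3.0]])
L = cholesky(A)
print(np.allclose(L @ L.T, A))            # True
```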
  • Publication number: 20180314903
    Abstract: One embodiment provides image processing circuitry that includes feature extraction circuitry and optimization circuitry. The feature extraction circuitry is to determine a feature descriptor based, at least in part, on a feature point location and a corresponding scale. The optimization circuitry is to optimize an operation of the feature extraction circuitry. Each optimization is to accelerate the operation of the feature extraction circuitry, reduce its power consumption, and/or reduce the system memory bandwidth it uses.
    Type: Application
    Filed: May 1, 2017
    Publication date: November 1, 2018
    Applicant: INTEL CORPORATION
    Inventors: Gurpreet S. Kalsi, Om J. Omer, Biji George, Gopi Neela, Dipan Kumar Mandal, Sreenivas Subramoney
  • Publication number: 20180275878
    Abstract: This disclosure is directed to the problem of parallelizing random read access within a reasonably sized block of data for a vector SIMD processor. The invention sets up plural parallel look up tables, moves data from main memory into each of the parallel look up tables, and then employs a look up table read instruction to simultaneously move data from each parallel look up table to a corresponding part of a vector destination register. This enables data processing by vector single instruction multiple data (SIMD) operations. This vector destination register load can be repeated if the tables store additional data to be used. New data can be loaded into the original tables if appropriate. A level one memory is preferably partitioned as part data cache and part directly addressable memory. The look up tables are stored in the directly addressable memory.
    Type: Application
    Filed: May 29, 2018
    Publication date: September 27, 2018
    Inventors: Jayasree Sankaranarayanan, Dipan Kumar Mandal
  • Publication number: 20180189587
    Abstract: Aspects of the present disclosure relate to technologies (systems, devices, methods, etc.) for performing feature detection and/or feature tracking based on image data. In embodiments, the technologies include or leverage a SLAM hardware accelerator (SWA) that includes a feature detection component and optionally a feature tracking component. The feature detection component may be configured to perform feature detection on working data encompassed by a sliding window. The feature tracking component is configured to perform feature tracking operations to track one or more detected features, e.g., using normalized cross correlation (NCC) or another method.
    Type: Application
    Filed: November 29, 2017
    Publication date: July 5, 2018
    Applicant: Intel Corporation
    Inventors: Dipan Kumar Mandal, Om J. Omer, Lance E. Hacking, James Radford, Sreenivas Subramoney, Eagle Jones, Gautham N. Chinya
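A small reference implementation of normalized cross correlation (NCC), which this abstract names as one option for tracking a detected feature between frames. The window size, search radius, and function names are illustrative choices, not details from the publication.

```python
import numpy as np

def ncc(patch_a: np.ndarray, patch_b: np.ndarray) -> float:
    """Normalized cross correlation between two equally sized patches."""
    a = patch_a.astype(float) - patch_a.mean()
    b = patch_b.astype(float) - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def track_feature(prev: np.ndarray, cur: np.ndarray, y: int, x: int,
                  win: int = 3, radius: int = 4) -> tuple[int, int]:
    """Best NCC match in `cur` for the window around (y, x) in `prev`."""
    ref = prev[y - win:y + win + 1, x - win:x + win + 1]   # (y, x) must be interior
    best, best_pos = -2.0, (y, x)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cy, cx = y + dy, x + dx
            if cy - win < 0 or cx - win < 0:
                continue                                   # window falls off the frame
            cand = cur[cy - win:cy + win + 1, cx - win:cx + win + 1]
            if cand.shape != ref.shape:
                continue
            score = ncc(ref, cand)
            if score > best:
                best, best_pos = score, (cy, cx)
    return best_pos
```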
  • Patent number: 9973754
    Abstract: A low power video hardware engine is disclosed. The video hardware engine includes a video hardware accelerator unit. A shared memory is coupled to the video hardware accelerator unit, and a scrambler is coupled to the shared memory. A vDMA (video direct memory access) engine is coupled to the scrambler, and an external memory is coupled to the vDMA engine. The scrambler receives an LCU (largest coding unit) from the vDMA engine. The LCU comprises N×N pixels, and the scrambler scrambles N×N pixels in the LCU to generate a plurality of blocks with M×M pixels. N and M are integers and M is less than N.
    Type: Grant
    Filed: March 18, 2015
    Date of Patent: May 15, 2018
    Assignee: Texas Instruments Incorporated
    Inventors: Hetul Sanghvi, Mihir Narendra Mody, Niraj Nandan, Mahesh Madhukar Mehendale, Subrangshu Das, Dipan Kumar Mandal, Pavan Venkata Shastry
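A behavioral model of the scrambling step in this abstract: an N×N LCU is rearranged into a sequence of M×M blocks (M < N), the granularity the downstream accelerator works on. N = 64 and M = 16 are illustrative HEVC-like sizes, not values taken from the patent.

```python
import numpy as np

def scramble_lcu(lcu: np.ndarray, m: int) -> list[np.ndarray]:
    """Split an N x N LCU into raster-ordered M x M blocks."""
    n = lcu.shape[0]
    assert lcu.shape == (n, n) and n % m == 0 and m < n
    return [lcu[r:r + m, c:c + m]
            for r in range(0, n, m)
            for c in range(0, n, m)]

lcu = np.arange(64 * 64, dtype=np.uint16).reshape(64, 64)
blocks = scramble_lcu(lcu, m=16)
print(len(blocks), blocks[0].shape)       # 16 blocks of shape (16, 16)
```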
  • Patent number: 9898805
    Abstract: A method is disclosed for efficiently calculating a median value of a high-order array in a Single Instruction Multiple Data (SIMD) processor. Values of the high-order array are sorted vertically in each column followed by sorts on each individual row. After the sort, selective diagonal values of the sorted high-order array are used to form a low-order array to calculate the median of the high-order array. The median calculation using selective diagonal values of the high-order array in a low-order array significantly improves SIMD processor efficiency and throughput.
    Type: Grant
    Filed: February 10, 2016
    Date of Patent: February 20, 2018
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Sanmati S. Kamath, Dipan Kumar Mandal, Wonki Choi
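A NumPy model of the 3×3 instance of the method in this abstract: sort each column, then each row, and the median of all nine values equals the median of the anti-diagonal of the doubly sorted array. The 3×3 size is an illustrative instance of the "high-order array".

```python
import numpy as np

def median9(block: np.ndarray) -> int:
    """Median of a 3 x 3 block via column sorts, row sorts, and a diagonal."""
    a = np.sort(block, axis=0)            # vertical sort within each column
    a = np.sort(a, axis=1)                # sort of each individual row
    diag = np.array([a[0, 2], a[1, 1], a[2, 0]])   # selective diagonal values
    return int(np.sort(diag)[1])          # median of the low-order array

block = np.array([[7, 2, 9],
                  [4, 8, 1],
                  [5, 3, 6]])
print(median9(block), int(np.median(block)))       # 5 5
```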
  • Publication number: 20170318304
    Abstract: A video hardware engine that supports dynamic frame padding is disclosed. The video hardware engine includes an external memory that stores a reference frame, which includes a plurality of reference pixels. A motion estimation (ME) engine receives a current LCU (largest coding unit) and defines a search area around the current LCU for motion estimation. The ME engine receives a set of reference pixels corresponding to the current LCU; the set of reference pixels is received from the external memory. The ME engine pads a set of duplicate pixels along an edge of the reference frame when part of the search area is outside the reference frame.
    Type: Application
    Filed: July 19, 2017
    Publication date: November 2, 2017
    Inventors: Hetul Sanghvi, Mihir Narendra Mody, Niraj Nandan, Mahesh Madhukar Mehendale, Subrangshu Das, Dipan Kumar Mandal, Nainala Vyagrheswarudu, Vijayavardhan Baireddy, Pavan Venkata Shastry
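A scalar model of the padding behavior in this abstract: when the motion-estimation search area reaches outside the reference frame, the missing samples are filled by duplicating the nearest edge pixel (coordinate clamping). Function and parameter names are illustrative.

```python
import numpy as np

def fetch_search_area(ref: np.ndarray, top: int, left: int,
                      height: int, width: int) -> np.ndarray:
    """Read a search window, padding out-of-frame samples with edge pixels."""
    h, w = ref.shape
    rows = np.clip(np.arange(top, top + height), 0, h - 1)
    cols = np.clip(np.arange(left, left + width), 0, w - 1)
    return ref[np.ix_(rows, cols)]        # duplicate pixels along the frame edge

ref = np.arange(16, dtype=np.uint8).reshape(4, 4)
# a search area that starts above and to the left of the reference frame
print(fetch_search_area(ref, top=-2, left=-1, height=4, width=4))
```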
  • Patent number: 9798645
    Abstract: An electronic tracing process includes packing both stall (215) and reason (219) data into a single high priority timing information stream. An integrated circuit includes an electronic processor (110), and a tracing circuit (120) operable to pack both stall and events data into a single timing information stream. Other circuits, processes and systems are also disclosed.
    Type: Grant
    Filed: November 10, 2015
    Date of Patent: October 24, 2017
    Assignee: Texas Instruments Incorporated
    Inventors: Kanika Ghai Bansal, Dipan Kumar Mandal, Gary A. Cooper, Bryan J. Thome
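A purely illustrative packing scheme in the spirit of this abstract (not the patent's actual trace format): each timing-stream entry packs a stall-cycle count together with a small reason code into one word, so a single high-priority stream carries both. The 12-bit/4-bit split is an assumption for the example.

```python
def pack_timing_entry(stall_cycles: int, reason: int) -> int:
    """Pack a stall count (12 bits here) and a reason code (4 bits) into one word."""
    assert 0 <= stall_cycles < (1 << 12) and 0 <= reason < (1 << 4)
    return (reason << 12) | stall_cycles

def unpack_timing_entry(word: int) -> tuple[int, int]:
    return word & 0xFFF, (word >> 12) & 0xF

word = pack_timing_entry(stall_cycles=37, reason=0x3)   # e.g. a cache-miss stall
print(hex(word), unpack_timing_entry(word))             # 0x3025 (37, 3)
```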
  • Patent number: 9681150
    Abstract: An image processing system includes a processor and optical flow determination logic. The optical flow determination logic is to quantify relative motion of a feature present in a first frame of video and a second frame of video with respect to the two frames of video. The optical flow determination logic configures the processor to convert each of the frames of video into a hierarchical image pyramid. The image pyramid comprises a plurality of image levels. Image resolution is reduced at each higher one of the image levels. For each image level and for each pixel in the first frame, the processor is configured to establish an initial estimate of a location of the pixel in the second frame and to apply a plurality of sequential searches, starting from the initial estimate, that establish refined estimates of the location of the pixel in the second frame.
    Type: Grant
    Filed: June 12, 2015
    Date of Patent: June 13, 2017
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Hrushikesh Tukaram Garud, Soyeb Noormohammed Nagori, Dipan Kumar Mandal
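A compact sketch of the coarse-to-fine idea in this abstract: each frame is reduced into an image pyramid, an initial location estimate is formed at the coarsest level, and the estimate is refined at every finer level by a small local search (sum of absolute differences here; the patent's refinement steps may differ). Window size, search radius, and level count are illustrative.

```python
import numpy as np

def build_pyramid(img: np.ndarray, levels: int) -> list[np.ndarray]:
    """Hierarchical image pyramid: resolution halves at each higher level."""
    pyr = [img.astype(float)]
    for _ in range(levels - 1):
        pyr.append(pyr[-1][::2, ::2])                 # simple 2x decimation
    return pyr

def refine(prev_lvl, cur_lvl, y, x, ey, ex, win=2, radius=2):
    """Search around the estimate (ey, ex) for the best match of (y, x)."""
    if min(y - win, x - win) < 0 or y + win + 1 > prev_lvl.shape[0] \
            or x + win + 1 > prev_lvl.shape[1]:
        return ey, ex                                 # too close to the border
    ref = prev_lvl[y - win:y + win + 1, x - win:x + win + 1]
    best, best_pos = np.inf, (ey, ex)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cy, cx = ey + dy, ex + dx
            if cy - win < 0 or cx - win < 0:
                continue
            cand = cur_lvl[cy - win:cy + win + 1, cx - win:cx + win + 1]
            if cand.shape != ref.shape:
                continue
            sad = np.abs(ref - cand).sum()
            if sad < best:
                best, best_pos = sad, (cy, cx)
    return best_pos

def track_pixel(prev: np.ndarray, cur: np.ndarray, y: int, x: int, levels: int = 3):
    p_prev, p_cur = build_pyramid(prev, levels), build_pyramid(cur, levels)
    ey, ex = y >> (levels - 1), x >> (levels - 1)     # initial coarse estimate
    for lvl in range(levels - 1, -1, -1):
        ey, ex = refine(p_prev[lvl], p_cur[lvl], y >> lvl, x >> lvl, ey, ex)
        if lvl:
            ey, ex = ey * 2, ex * 2                   # propagate to the finer level
    return ey, ex
```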
  • Patent number: 9652686
    Abstract: This invention enables effective corner detection of pixels of an image using the FAST algorithm on a vector SIMD processor. The invention loads an 8×8 pixel block that includes four 7×7 pixel blocks containing the 16 peripheral pixels to be tested for each of four center pixels. It rearranges the 64 pixels of the 8×8 block to form a 16-element array for each center pixel, preferably using a vector permutation instruction, and uses vector SIMD subtraction-and-compare and vector SIMD addition-and-compare to make the FAST algorithm comparisons. The N-consecutive-pixels determination of the FAST algorithm is made from the results of plural shift and AND operations, and the corresponding center pixel is marked a corner or not a corner depending upon those results.
    Type: Grant
    Filed: November 8, 2016
    Date of Patent: May 16, 2017
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Jayasree Sankaranarayanan, Dipan Kumar Mandal, Prashanth R Viswanath
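A scalar reference of the FAST test that this patent vectorizes: the 16 pixels on a radius-3 circle around a candidate center are compared against the center value plus or minus a threshold, and the center is a corner if N consecutive circle pixels are all brighter or all darker. N = 9 and t = 20 below are illustrative parameters, not values from the patent.

```python
import numpy as np

# offsets of the 16 peripheral pixels (Bresenham circle of radius 3)
CIRCLE = [(-3, 0), (-3, 1), (-2, 2), (-1, 3), (0, 3), (1, 3), (2, 2), (3, 1),
          (3, 0), (3, -1), (2, -2), (1, -3), (0, -3), (-1, -3), (-2, -2), (-3, -1)]

def is_fast_corner(img: np.ndarray, y: int, x: int, t: int = 20, n: int = 9) -> bool:
    """FAST-n corner test at (y, x); the point must be at least 3 pixels from the border."""
    centre = int(img[y, x])
    ring = np.array([int(img[y + dy, x + dx]) for dy, dx in CIRCLE])
    brighter = ring > centre + t          # the addition-and-compare lanes
    darker = ring < centre - t            # the subtraction-and-compare lanes
    for mask in (brighter, darker):
        run = 0
        for v in np.concatenate([mask, mask]):   # the circle wraps around
            run = run + 1 if v else 0
            if run >= n:
                return True
    return False

img = np.zeros((9, 9), dtype=np.uint8)
img[:5, :5] = 200                         # bright square whose corner is at (4, 4)
print(is_fast_corner(img, 4, 4))          # True
```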
  • Publication number: 20170076173
    Abstract: This invention enables effective corner detection of pixels of an image using the FAST algorithm on a vector SIMD processor. The invention loads an 8×8 pixel block that includes four 7×7 pixel blocks containing the 16 peripheral pixels to be tested for each of four center pixels. It rearranges the 64 pixels of the 8×8 block to form a 16-element array for each center pixel, preferably using a vector permutation instruction, and uses vector SIMD subtraction-and-compare and vector SIMD addition-and-compare to make the FAST algorithm comparisons. The N-consecutive-pixels determination of the FAST algorithm is made from the results of plural shift and AND operations, and the corresponding center pixel is marked a corner or not a corner depending upon those results.
    Type: Application
    Filed: November 8, 2016
    Publication date: March 16, 2017
    Inventors: Jayasree Sankaranarayanan, Dipan Kumar Mandal, Prashanth R. Viswanath
  • Publication number: 20160232641
    Abstract: A method is disclosed for efficiently calculating a median value of a high-order array in a Single Instruction Multiple Data (SIMD) processor. Values of the high-order array are sorted vertically in each column followed by sorts on each individual row. After the sort, selective diagonal values of the sorted high-order array are used to form a low-order array to calculate the median of the high-order array. The median calculation using selective diagonal values of the high-order array in a low-order array significantly improves SIMD processor efficiency and throughput.
    Type: Application
    Filed: February 10, 2016
    Publication date: August 11, 2016
    Inventors: Sanmati S. Kamath, Dipan Kumar Mandal, Wonki Choi