Patents by Inventor Manoj Karunakaran Nambiar

Manoj Karunakaran Nambiar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

OPTIMAL DEPLOYMENT OF TRANSFORMER MODELS FOR HIGH PERFORMANCE INFERENCE ON FIELD PROGRAMMABLE GATE ARRAY (FPGA)

Publication number: 20250190757

Abstract: Existing techniques fail to deploy transformer models by optimally allocating Field Programmable Gate Array resources to each fundamental block of transformer-based models for maximum performance in terms of low latency and high throughput. This disclosure relates to a system and method which constructs a plurality of parameterized transformer model templates from input parameters comprising templates, one or more transformer-based models, one or more data types corresponding to one or more transformer-based models, table comprising one or more latency values and one or more resource utilization values and feedback mode. One or more values are assigned to each plurality of parameters comprised in a plurality of parameterized transformer model templates to obtain plurality of optimal parameters. An optimal template is obtained from the final template selector for each of the plurality of parameterized transformer model templates, having maximum performance in terms of low latency and maximum throughput.

Type: Application

Filed: November 25, 2024

Publication date: June 12, 2025

Applicant: Tata Consultancy Services Limited

Inventors: Ashwin KRISHNAN, Manoj Karunakaran NAMBIAR, Madan Yelandur NANJUNDASWAMY
METHOD AND SYSTEM FOR PERFORMING CONTENT AWARE MULTI-OBJECT TRACKING

Publication number: 20250139790

Abstract: Multi-object tracking (MOT) in video sequences plays a critical role in various computer vision applications. The primary objective of MOT is to accurately localize and track objects across consecutive frames. However, existing MOT approaches often suffer from computational limitations and low frame rates in commodity machines, which hinders real-time performance. Present disclosure provides method and system for performing content aware multi-object tracking. The system first classifies video into slow and fast moving object content videos depending on features of objects to be tracked in frames. Then, system applies a computationally intensive deep sort algorithm to perform tracking of objects by selectively skipping frames.

Type: Application

Filed: September 9, 2024

Publication date: May 1, 2025

Applicant: Tata Consultancy Services Limited

Inventors: RATUL KISHORE SAHA, REKHA SINGHAL, MANOJ KARUNAKARAN NAMBIAR
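
Illustrative note: the abstract above describes classifying a video as slow or fast moving content and then skipping frames selectively before running the tracker. The following is a minimal sketch of that idea only, not the method claimed in the application. The function names, the displacement threshold, and the skip intervals are all hypothetical values chosen for illustration.

```python
def classify_motion(displacements, threshold=5.0):
    """Label content "fast" or "slow" from per-frame object displacements.

    `displacements` holds hypothetical pixel displacements of tracked
    objects between consecutive frames; `threshold` is an assumed cutoff.
    """
    if not displacements:
        raise ValueError("need at least one displacement sample")
    mean_displacement = sum(displacements) / len(displacements)
    return "fast" if mean_displacement > threshold else "slow"


def frames_to_track(num_frames, motion_class, slow_skip=4, fast_skip=0):
    """Return the frame indices on which the expensive tracker would run.

    For slow content, `slow_skip` frames are skipped between tracked
    frames; for fast content, `fast_skip` frames are skipped. Both
    defaults are assumptions, not values from the application.
    """
    skip = slow_skip if motion_class == "slow" else fast_skip
    return list(range(0, num_frames, skip + 1))
```

With these assumed defaults, slow content is tracked on every fifth frame while fast content is tracked on every frame, which is where the computational saving would come from.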
PRE-OPTIMIZER AND OPTIMIZER BASED FRAMEWORK FOR OPTIMAL DEPLOYMENT OF EMBEDDING TABLES ACROSS HETEROGENEOUS MEMORY ARCHITECTURE

Publication number: 20250086111

Abstract: High-performance deployment of DNN recommendation models (RMs) relies heavily on embedding tables, and the performance bottleneck lies in the latency of embedding access. To optimize the deployment of RMs, a method and system are disclosed that leverage heterogeneous memory types on FPGAs to improve overall performance by maximizing the availability of frequently accessed data in faster memory. Using an optimizer, the system dynamically allocates table partitions of the embedding tables based on the history of input accesses. A pre-optimizer block determines whether smaller tables should be partitioned or placed entirely in smaller memories, improving overall efficiency. The performance of the RM is improved through improvement in average embedding fetch latency and, effectively, inference latency via a modified Round Trip computation.

Type: Application

Filed: August 14, 2024

Publication date: March 13, 2025

Applicant: Tata Consultancy Services Limited

Inventors: ASHWIN KRISHNAN, MANOJ KARUNAKARAN NAMBIAR, REKHA SINGHAL
Optimal deployment of embeddings tables across heterogeneous memory architecture for high-speed recommendations inference

Patent number: 12182029

Abstract: Works in the literature fail to leverage embedding access patterns and memory units' access/storage capabilities, which when combined can yield high-speed heterogeneous systems by dynamically re-organizing embedding tables partitions across hardware during inference. A method and system for optimal deployment of embeddings tables across heterogeneous memory architecture for high-speed recommendations inference is disclosed, which dynamically partitions and organizes embedding tables across fast memory architectures to reduce access time. Partitions are chosen to take advantage of the past access patterns of those tables to ensure that frequently accessed data is available in the fast memory most of the time. Partition and replication is used to co-optimize memory access time and resources.

Type: Grant

Filed: August 25, 2023

Date of Patent: December 31, 2024

Assignee: TATA CONSULTANCY SERVICES LIMITED

Inventors: Ashwin Krishnan, Manoj Karunakaran Nambiar, Chinmay Narendra Mahajan, Rekha Singhal
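
Illustrative note: the abstract above says partitions are chosen from past access patterns so that frequently accessed data sits in fast memory most of the time. The sketch below shows one simple way to express that idea, assuming a frequency-count heuristic and a fixed fast-memory capacity measured in rows. It is not the patented partitioning or replication scheme, and every name in it is hypothetical.

```python
from collections import Counter


def place_embedding_rows(access_history, fast_capacity):
    """Split row ids into (hot, cold) lists by past access frequency.

    `access_history` is an assumed list of embedding row ids seen in
    earlier requests; `fast_capacity` is the assumed number of rows
    that fit in fast memory.
    """
    counts = Counter(access_history)
    # The most frequently accessed rows go to fast memory.
    hot = [row for row, _ in counts.most_common(fast_capacity)]
    hot_set = set(hot)
    # Everything else stays in slower memory.
    cold = [row for row in counts if row not in hot_set]
    return hot, cold


def fast_hit_rate(access_history, hot_rows):
    """Fraction of accesses in the history that fast memory would serve."""
    if not access_history:
        return 0.0
    hot_set = set(hot_rows)
    hits = sum(1 for row in access_history if row in hot_set)
    return hits / len(access_history)
```

For example, with the history `[3, 3, 3, 7, 7, 1, 9, 3, 7, 2]` and room for two rows, rows 3 and 7 would be placed in fast memory and would serve seven of the ten accesses.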
OPTIMAL DEPLOYMENT OF EMBEDDINGS TABLES ACROSS HETEROGENEOUS MEMORY ARCHITECTURE FOR HIGH-SPEED RECOMMENDATIONS INFERENCE

Publication number: 20240119008

Abstract: Works in the literature fail to leverage embedding access patterns and memory units' access/storage capabilities, which when combined can yield high-speed heterogeneous systems by dynamically re-organizing embedding tables partitions across hardware during inference. A method and system for optimal deployment of embeddings tables across heterogeneous memory architecture for high-speed recommendations inference is disclosed, which dynamically partitions and organizes embedding tables across fast memory architectures to reduce access time. Partitions are chosen to take advantage of the past access patterns of those tables to ensure that frequently accessed data is available in the fast memory most of the time. Partition and replication is used to co-optimize memory access time and resources.

Type: Application

Filed: August 25, 2023

Publication date: April 11, 2024

Applicant: Tata Consultancy Services Limited

Inventors: Ashwin KRISHNAN, Manoj Karunakaran Nambiar, Chinmay Narendra Mahajan, Rekha Singhal
FIELD PROGRAMMABLE GATE ARRAY (FPGA) BASED ONLINE 3D BIN PACKING

Publication number: 20240112095

Abstract: The disclosure generally relates to an FPGA-based online 3D bin packing. Online 3D bin packing is the process of packing boxes into larger bins-Long Distance Containers (LDCs) such that the space inside each LDC is used to the maximum extent. The use of deep reinforcement learning (Deep RL) for this process is effective and popular. However, since the existing processor-based implementations are limited by Von-Neumann architecture and take a long time to evaluate each alignment for a box, only a few potential alignments are considered, resulting in sub-optimal packing efficiency. This disclosure describes an architecture for bin packing which leverages pipelining and parallel processing on FPGA for faster and exhaustive evaluation of all alignments for each box resulting in increased efficiency. In addition, a suitable generic purpose processor is employed to train the neural network within the algorithm to make the disclosed techniques computationally light, faster and efficient.

Type: Application

Filed: August 25, 2023

Publication date: April 4, 2024

Applicant: Tata Consultancy Services Limited

Inventors: ASHWIN KRISHNAN, HARSHAD KHADILKAR, REKHA SINGHAL, ANSUMA BASUMATARY, MANOJ KARUNAKARAN NAMBIAR, ARIJIT MUKHERJEE, KAVYA BORRA
METHOD AND SYSTEM FOR LATENCY OPTIMIZED HETEROGENEOUS DEPLOYMENT OF CONVOLUTIONAL NEURAL NETWORK

Publication number: 20240062045

Abstract: This disclosure relates generally to a method and system for latency optimized heterogeneous deployment of convolutional neural network (CNN). State-of-the-art methods for optimal deployment of convolutional neural network provide a reasonable accuracy. However, for unseen networks the same level of accuracy is not attained. The disclosed method provides an automated and unified framework for the convolutional neural network (CNN) that optimally partitions the CNN and maps these partitions to hardware accelerators yielding a latency optimized deployment configuration. The method provides an optimal partitioning of the CNN for deployment on heterogeneous hardware platforms by searching network partition and hardware pair optimized for latency while including communication cost between hardware. The method employs performance model-based optimization algorithm to optimally deploy components of a deep learning pipeline across right heterogeneous hardware for high performance.

Type: Application

Filed: July 27, 2023

Publication date: February 22, 2024

Applicant: Tata Consultancy Services Limited

Inventors: Nupur SUMEET, Manoj Karunakaran NAMBIAR, Rekha SINGHAL, Karan RAWAT
METHOD AND SYSTEM FOR GENERATING A DATA MODEL FOR TEXT EXTRACTION FROM DOCUMENTS

Publication number: 20240005686

Abstract: State of the art techniques used for document processing and particularly for handling processing of images for data extraction have the disadvantage that they have large computational load and memory footprint. The disclosure herein generally relates to text processing, and, more particularly, to a method and system for generating a data model for text extraction from documents. The system prunes a pretrained base model using a Lottery Ticket Hypothesis (LTH) algorithm, to generate a LTH pruned data model. The system further trims the LTH pruned data model to obtain a structured pruned data model, which involves discarding filters that have filter sparsity exceeding a threshold of filter sparsity. The structured pruned data model is then trained from a teacher model in a Knowledge Distillation algorithm, wherein a resultant data model obtained after training the structured pruned data model forms the data model for text detection.

Type: Application

Filed: March 31, 2023

Publication date: January 4, 2024

Applicant: Tata Consultancy Services Limited

Inventors: Nupur SUMEET, Manoj Karunakaran NAMBIAR, Karan RAWAT
METHOD AND SYSTEM TO ESTIMATE PERFORMANCE OF SESSION BASED RECOMMENDATION MODEL LAYERS ON FPGA

Publication number: 20230325647

Abstract: This disclosure relates generally to method and system to estimate performance of session based recommendation model layers on FPGA. Profiling is easy to perform on software based platforms such as a CPU and a GPU which have development frameworks and tool sets but on systems such as a FPGA, implementation risks are higher and important to model the performance prior to implementation. The disclosed method analyses a session based recommendation (SBR) model layers for performance estimation. Further, a network bandwidth is determined to process each layer of the SBR model based on dimensions. Performance of each layer of the SBR model is estimated at a predefined frequency by creating a layer profile comprising a throughput and a latency in one or more batches. Further, the method deploys an optimal layer on at least one of a heterogeneous hardware based on the estimated performance of each layer profile on the FPGA.

Type: Application

Filed: January 9, 2023

Publication date: October 12, 2023

Applicant: Tata Consultancy Services Limited

Inventors: ASHWIN KRISHNAN, MANOJ KARUNAKARAN NAMBIAR, NUPUR SUMEET
METHOD AND SYSTEM FOR NON-INTRUSIVE PROFILING OF HIGH-LEVEL SYNTHESIS (HLS) BASED APPLICATIONS

Publication number: 20230305814

Abstract: State of the art techniques provide dedicated High-Level Synthesis (HLS) performance estimator tools that can give insights on performance bottlenecks, stall rate, stall cause etc., in HLS designs. These estimators often limit themselves to simple loop topologies and limited pragma use which makes them unreliable for large designs with complex datapaths. Embodiments herein provide a method and system for non-intrusive profiling for high-level synthesis HLS based applications. The method provides a cycle-accurate, fine-grained performance profiling framework that is non-intrusive and provides an end-to-end profile of the design. Such profiling tool can help the designer/DSE tool to quickly identify the performance bottlenecks and have a guided approach towards tuning it.

Type: Application

Filed: December 21, 2022

Publication date: September 28, 2023

Applicant: Tata Consultancy Services Limited

Inventors: Nupur SUMEET, Manoj Karunakaran NAMBIAR, Deeksha KASHYAP
Method and system for message based communication and failure recovery for FPGA middleware framework

Patent number: 11212218

Abstract: The disclosure herein describes a method and a system for message based communication and failure recovery for FPGA middleware framework. A combination of FPGA and middleware framework provides a high throughput, low latency messaging and can reduce development time as most of the components can be re-used. Further the message based communication architecture built on a FPGA framework performs middleware activities that would enable reliable communication using TCP/UDP between different platforms regardless of their deployment. The proposed FPGA middleware framework provides for reliable communication of UDP based on TCP as well as failure recovery with minimum latency during a failover of an active FPGA framework during its operation, by using a passive FPGA in real-time and dynamic synchronization with the active FPGA.

Type: Grant

Filed: August 8, 2019

Date of Patent: December 28, 2021

Assignee: Tata Consultancy Services Limited

Inventors: Manoj Karunakaran Nambiar, Swapnil Rodi, Sunil Puranik, Mahesh Damodar Barve
Systems and methods for performance evaluation of input/output (I/O) intensive enterprise applications

Patent number: 11151013

Abstract: The present disclosure provides systems and methods for performance evaluation of Input/Output (I/O) intensive enterprise applications. Representative workloads may be generated for enterprise applications using synthetic benchmarks that can be used across multiple platforms with different storage systems. I/O traces are captured for an application of interest at low concurrencies and features that affect performance significantly are extracted, fed to a synthetic benchmark and replayed on a target system thereby accurately creating the same behavior of the application. Statistical methods are used to extrapolate the extract features to predict performance at higher concurrency level without generating traces at those concurrency levels. The method does not require deploying the application or database on the target system since performance of system is dependent on access patterns instead of actual data.

Type: Grant

Filed: January 29, 2018

Date of Patent: October 19, 2021

Assignee: Tata Consultancy Services Limited

Inventors: Dheeraj Chahal, Manoj Karunakaran Nambiar
DATA META-MODEL BASED FEATURE VECTOR SET GENERATION FOR TRAINING MACHINE LEARNING MODELS

Publication number: 20210232971

Abstract: This disclosure relates generally to data meta model and meta file generation for feature engineering and training of machine learning models thereof. Conventional methods do not facilitate appropriate relevant data identification for feature engineering and also do not implement standardization for use of solution across domains. Embodiments of the present disclosure provide systems and methods wherein datasets from various sources/domains are utilized for meta file generation that is based on mapping of the dataset with a data meta model based on the domains, the meta file comprises meta data and information pertaining to action(s) being performed. Further functions are generated using the meta file and the functions are assigned to corresponding data characterized in the meta file. Further functions are invoked to generate feature vector set and machine learning model(s) are trained using the features vector set. Implementation of the generated data meta-model enables re-using of feature engineering code.

Type: Application

Filed: January 27, 2021

Publication date: July 29, 2021

Applicant: Tata Consultancy Services Limited

Inventors: Mayank MISHRA, Shruti KUNDE, Sharod ROY CHOUDHURY, Amey PANDIT, Manoj Karunakaran NAMBIAR, Siddharth VERMA, Gautam SHROFF, Pankaj MALHOTRA, Rekha SINGHAL
Exactly-once transaction semantics for fault tolerant FPGA based transaction systems

Patent number: 10965519

Abstract: This disclosure relates generally to methods and systems for providing exactly-once transaction semantics for fault tolerant FPGA based transaction systems. The systems comprise middleware components in a server as well as client end. The server comprises Hosts and FPGAs. The FPGAs control transaction execution (the application processing logic also resides in the FPGA) and provide fault tolerance with high performance by means of a modified TCP implementation. The Hosts buffer and persist transaction records for failure recovery and achieving exactly-once transaction semantics. The monitoring and fault detecting components are distributed across the FPGAs and Hosts. Exactly-once transaction semantics is implemented without sacrificing performance by switching between a high performance mode and a conservative mode depending on component failures. PCIE switches for connectivity between FPGAs and Hosts ensure FPGAs are available even if Hosts fail.

Type: Grant

Filed: February 22, 2019

Date of Patent: March 30, 2021

Assignee: TATA CONSULTANCY SERVICES LIMITED

Inventors: Manoj Karunakaran Nambiar, Swapnil Rodi, Sunil Anant Puranik, Mahesh Damodar Barve
METHOD AND SYSTEM FOR MESSAGE BASED COMMUNICATION AND FAILURE RECOVERY FOR FPGA MIDDLEWARE FRAMEWORK

Publication number: 20200053004

Abstract: The disclosure herein describes a method and a system for message based communication and failure recovery for FPGA middleware framework. A combination of FPGA and middleware framework provides a high throughput, low latency messaging and can reduce development time as most of the components can be re-used. Further the message based communication architecture built on a FPGA framework performs middleware activities that would enable reliable communication using TCP/UDP between different platforms regardless of their deployment. The proposed FPGA middleware framework provides for reliable communication of UDP based on TCP as well as failure recovery with minimum latency during a failover of an active FPGA framework during its operation, by using a passive FPGA in real-time and dynamic synchronization with the active FPGA.

Type: Application

Filed: August 8, 2019

Publication date: February 13, 2020

Applicant: Tata Consultancy Services Limited

Inventors: Manoj Karunakaran NAMBIAR, Swapnil RODI, Sunil PURANIK, Mahesh Damodar BARVE
Method and system for pre-deployment performance estimation of input-output intensive workloads

Patent number: 10558549

Abstract: A method and system is provided for pre-deployment performance estimation of input-output intensive workloads. Particularly, the present application provides a method and system for predicting the performance of input-output intensive distributed enterprise application on multiple storage devices without deploying the application and the complete database in the target environment. The present method comprises generating the input-output traces of an application on a source system with varying concurrencies; replaying the generated traces from the source system on a target system where the application needs to be migrated; gathering performance data in the form of resource utilization, throughput and response time from the target system; extrapolating the data gathered from the target system in order to accurately predict the performance of multi-threaded input-output intensive applications in the target system for higher concurrencies.

Type: Grant

Filed: November 25, 2016

Date of Patent: February 11, 2020

Assignee: Tata Consultancy Services Limited

Inventors: Dheeraj Chahal, Rupinder Singh Virk, Manoj Karunakaran Nambiar
EXACTLY-ONCE TRANSACTION SEMANTICS FOR FAULT TOLERANT FPGA BASED TRANSACTION SYSTEMS

Publication number: 20190296964

Abstract: This disclosure relates generally to methods and systems for providing exactly-once transaction semantics for fault tolerant FPGA based transaction systems. The systems comprise middleware components in a server as well as client end. The server comprises Hosts and FPGAs. The FPGAs control transaction execution (the application processing logic also resides in the FPGA) and provide fault tolerance with high performance by means of a modified TCP implementation. The Hosts buffer and persist transaction records for failure recovery and achieving exactly-once transaction semantics. The monitoring and fault detecting components are distributed across the FPGAs and Hosts. Exactly-once transaction semantics is implemented without sacrificing performance by switching between a high performance mode and a conservative mode depending on component failures. PCIE switches for connectivity between FPGAs and Hosts ensure FPGAs are available even if Hosts fail.

Type: Application

Filed: February 22, 2019

Publication date: September 26, 2019

Applicant: Tata Consultancy Services Limited

Inventors: Manoj Karunakaran NAMBIAR, Swapnil RODI, Sunil PURANIK, Mahesh BARVE
Systems and methods for predicting performance of applications on an internet of things (IoT) platform

Patent number: 10338967

Abstract: Performance prediction systems and method of an Internet of Things (IoT) platform and applications includes obtaining input(s) comprising one of (i) user requests and (ii) sensor observations from sensor(s); invoking Application Programming Interface (APIs) of the platform based on input(s); identifying open flow (OF) and closed flow (CF) requests of system(s) connected to the platform; identifying workload characteristics of the OF and CF requests to obtain segregated OF and segregated CF requests, and a combination of open and closed flow requests; executing performance tests with the APIs based on the workload characteristics; measuring resource utilization of the system(s) and computing service demands of resource(s) from measured utilization, and user requests processed by the platform per unit time; executing the performance tests with the invoked APIs based on volume of workload characteristics pertaining to the application(s); and predicting, using queuing network, performance of the application(s) fo

Type: Grant

Filed: January 19, 2017

Date of Patent: July 2, 2019

Assignee: Tata Consultancy Services Limited

Inventors: Subhasri Duttagupta, Mukund Kumar, Manoj Karunakaran Nambiar
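
Illustrative note: the abstract above mentions computing service demands of resources from measured utilization and the number of requests processed per unit time. In standard operational analysis this relationship is the Service Demand Law, D = U / X, where U is the utilization of a resource and X is the system throughput. The sketch below applies that law; whether the patent uses exactly this formulation is an assumption, and the function name is hypothetical.

```python
def service_demand(utilization, throughput):
    """Service Demand Law: D = U / X.

    `utilization` is the measured busy fraction of a resource (0.0 to 1.0)
    and `throughput` is requests completed per second. The result is the
    average resource time, in seconds, consumed per request.
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return utilization / throughput
```

As a usage example, a CPU measured at 50% utilization while the platform completes 25 requests per second has a service demand of 0.02 seconds per request.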
Systems and methods for benchmark based cross platform service demand prediction

Patent number: 10241902

Abstract: Systems and methods for benchmark based cross platform service demand prediction includes generation of performance mimicking benchmarks that require only application level profiling and provide a representative value of service demand of an application under consideration on a production platform, thereby eliminating need for actually deploying the application under consideration on a production platform. The PMBs require only a representative estimate of service demand of the application under test and can be reused to represent multiple applications. The PMBs are generated based on a skeletal benchmark corresponding to the technology stack used by the application under test and an input file generated based on application profiling that provides pre-defined lower level method calls, data flow sequences between multi-tiers of the application under test and send and receive network calls made by the application under consideration.

Type: Grant

Filed: August 3, 2016

Date of Patent: March 26, 2019

Assignee: Tata Consultancy Services Limited

Inventors: Subhasri Duttagupta, Mukund Kumar, Dhaval Shah, Manoj Karunakaran Nambiar
Service demand based performance prediction using a single workload

Patent number: 10146656

Abstract: Systems and methods for service demand based performance prediction using a single workload is provided to eliminate need for load testing. The process involves identifying a range of concurrencies for the application under test; capturing a single workload pertaining to the application under test; and iteratively performing for the identified range of concurrencies: generating an array of one or more predefined CPU performance metrics based on the captured single workload; generating an array of service demands based on the captured single workload and the generated array of the one or more pre-defined CPU performance metrics; computing an array of throughput based on the generated array of service demands; and updating the generated array of the one or more pre-defined CPU performance metrics based on the computed array of throughput.

Type: Grant

Filed: February 15, 2017

Date of Patent: December 4, 2018

Assignee: Tata Consultancy Services Limited

Inventors: Ajay Kattepur, Manoj Karunakaran Nambiar
