Patents by Inventor Brojeshwar Bhowmick
Brojeshwar Bhowmick has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Audio-speech driven animated talking face generation using a cascaded generative adversarial network
Patent number: 11551394
Abstract: Conventional state-of-the-art methods are limited in their ability to generate realistic animation from audio for unknown faces and cannot be easily generalized to different facial characteristics and voice accents. Further, these methods fail to produce realistic facial animation for subjects whose facial characteristics differ significantly from the distribution the network has seen during training. Embodiments of the present disclosure provide systems and methods that generate an audio-speech-driven animated talking face using a cascaded generative adversarial network (CGAN), wherein a first GAN is used to transfer lip motion from a canonical face to a person-specific face. A second GAN-based texture generator network is conditioned on person-specific landmarks to generate a high-fidelity face corresponding to the motion. The texture generator GAN is made more flexible using meta-learning to adapt to an unknown subject's traits and face orientation during inference.
Type: Grant
Filed: March 11, 2021
Date of Patent: January 10, 2023
Assignee: TATA CONSULTANCY SERVICES LIMITED
Inventors: Sandika Biswas, Dipanjan Das, Sanjana Sinha, Brojeshwar Bhowmick
-
Patent number: 11526174
Abstract: The disclosure herein generally relates to the field of autonomous navigation and, more particularly, to diverse trajectory proposal for autonomous navigation. The embodiment discloses a hierarchical-network-based diverse trajectory proposal for autonomous navigation. The hierarchical two-stage neural network architecture maps the perceived surroundings to diverse trajectories, in the form of trajectory waypoints, that an autonomous navigation system can choose to traverse. The first stage of the disclosed architecture is a Trajectory Proposal Network, which generates a set of diverse traversable regions in an environment that the autonomous navigation system can occupy in the future. The second stage is a Trajectory Sampling Network, which predicts fine-grained trajectory waypoints over the diverse traversable regions proposed by the Trajectory Proposal Network.
Type: Grant
Filed: June 5, 2020
Date of Patent: December 13, 2022
Assignee: TATA CONSULTANCY SERVICES LIMITED
Inventors: Brojeshwar Bhowmick, Krishnam Madhava Krishna, Sriram Nochur Narayanan, Gourav Kumar, Abhay Singh, Siva Karthik Mustikovela, Saket Saurav
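The two-stage decomposition described in this abstract can be illustrated with a toy, non-learned stand-in: a "proposal" step that enumerates diverse coarse headings and a "sampling" step that refines each into waypoints. This is a minimal sketch only; in the patent both stages are neural networks, and the fan-of-headings geometry here is an assumption for illustration.

```python
import math

def propose_regions(n_proposals):
    """Toy stand-in for the Trajectory Proposal Network: return diverse
    coarse headings (radians) fanned out in front of the vehicle."""
    spread = math.pi / 2  # 90-degree fan (illustrative choice)
    if n_proposals == 1:
        return [0.0]
    return [-spread / 2 + i * spread / (n_proposals - 1)
            for i in range(n_proposals)]

def sample_waypoints(heading, n_points=5, length=5.0):
    """Toy stand-in for the Trajectory Sampling network: refine one coarse
    heading into evenly spaced (x, y) waypoints along a straight corridor."""
    step = length / n_points
    return [(step * (i + 1) * math.cos(heading),
             step * (i + 1) * math.sin(heading)) for i in range(n_points)]

# Stage 1: diverse traversable regions; Stage 2: fine-grained waypoints.
proposals = propose_regions(3)
trajectories = [sample_waypoints(h) for h in proposals]
```

The point of the hierarchy is that diversity is enforced at the coarse stage, so the fine-grained stage only has to refine within each region rather than explore the whole space.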
-
Publication number: 20220368882
Abstract: This disclosure relates generally to a method and system for draping a 3D garment on a 3D human body. Dressing digital humans in 3D has gained much attention due to its use in online shopping, and draping 3D garments over the 3D human body has immense applications in virtual try-on and animation, where accurate fitment of the 3D garment is of utmost importance. The proposed disclosure is a single unified garment deformation model that learns the shared space of variations for body shape, body pose, and garment style. The method receives a plurality of human body inputs to construct 3D skinned garments for the subject. The deep draper network, trained using a plurality of losses, provides an efficient deep-neural-network-based method that predicts fast and accurate 3D garment images. The method couples geometric and multi-view perceptual constraints to efficiently learn the high-frequency geometry of garment deformation.
Type: Application
Filed: December 29, 2021
Publication date: November 17, 2022
Applicant: Tata Consultancy Services Limited
Inventors: LOKENDER TIWARI, BROJESHWAR BHOWMICK
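The "3D skinned garment" idea above builds on linear blend skinning, where each garment vertex is deformed by a weighted blend of body-joint transforms. The sketch below shows only that generic skinning step, not the disclosed deformation model; the vertex/weight/transform values are made up for the demo.

```python
import numpy as np

def skin_garment(verts, weights, joint_transforms):
    """Linear blend skinning: verts (V,3), per-vertex joint weights (V,J),
    joint_transforms (J,4,4) homogeneous matrices. Returns deformed (V,3)."""
    V = verts.shape[0]
    homo = np.hstack([verts, np.ones((V, 1))])                     # (V,4)
    blended = np.einsum('vj,jab->vab', weights, joint_transforms)  # (V,4,4)
    out = np.einsum('vab,vb->va', blended, homo)                   # (V,4)
    return out[:, :3]

# Demo: two vertices, each bound fully to one of two joints;
# joint 1 translates +0.5 along x, joint 0 stays at rest.
rest = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
w = np.array([[1.0, 0.0], [0.0, 1.0]])
T0, T1 = np.eye(4), np.eye(4)
T1[0, 3] = 0.5
deformed = skin_garment(rest, w, np.stack([T0, T1]))
```

A learned draping network can be seen as predicting corrective displacements on top of such a skinned base, which is where the high-frequency geometry comes from.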
-
Patent number: 11501777
Abstract: The disclosure herein relates to methods and systems for enabling human-robot interaction (HRI) to resolve task ambiguity. Conventional techniques that initiate a continuous dialogue with the human, asking a suitable question based on the observed scene until the ambiguity is resolved, are limited. The present disclosure uses the concept of Talk-to-Resolve (TTR), which initiates a continuous dialogue with the user based on visual uncertainty analysis, asking a suitable question that conveys the nature of the problem to the user and seeking guidance until all ambiguities are resolved. The suitable question is formulated based on the scene understanding and the argument spans present in the natural language instruction. The present disclosure asks questions in a natural way that not only ensures the user can understand the type of confusion the robot is facing, but also ensures minimal and relevant questioning to resolve the ambiguities.
Type: Grant
Filed: January 29, 2021
Date of Patent: November 15, 2022
Assignee: Tata Consultancy Services Limited
Inventors: Chayan Sarkar, Pradip Pramanick, Snehasis Banerjee, Brojeshwar Bhowmick
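The dialogue pattern in this abstract can be sketched as a tiny policy: match the instruction's argument span against detected scene objects, and either act, ask a disambiguating question, or report failure. This is an illustrative toy in the spirit of TTR, not the patented question-formulation logic; the `resolve` function and the scene schema are invented for the example.

```python
def resolve(target, scene):
    """Toy dialogue policy: inspect the observed scene for the object named
    in the instruction's argument span and either act or ask a question."""
    matches = [o for o in scene if o["name"] == target]
    if not matches:
        return f"I could not find a {target}. Could you point me to it?"
    if len(matches) > 1:
        colors = " or ".join(o["color"] for o in matches)
        return f"I see {len(matches)} {target}s. Do you mean the {colors} one?"
    return f"OK, picking up the {matches[0]['color']} {target}."

scene = [{"name": "cup", "color": "red"},
         {"name": "cup", "color": "blue"},
         {"name": "book", "color": "green"}]
reply = resolve("cup", scene)
```

The question names the attribute that distinguishes the candidates, which is what lets the user understand the robot's confusion from the question alone.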
-
Patent number: 11429467
Abstract: This disclosure relates generally to a method and system for prediction of correct discrete sensor data, thus enabling a continuous flow of data even when a discrete sensor fails. The activities of humans/subjects housed in a smart environment are continuously monitored by a plurality of non-intrusive discrete sensors embedded in the living infrastructure. The collected discrete sensor data is usually sparse and largely unbalanced, with most samples being 'No' and comparatively few being 'Yes', which makes prediction very challenging. The proposed prediction technique, based on the introduction of temporal uncertainty, is performed in several stages: pre-processing of the received discrete sensor data, introduction of temporal uncertainty, and prediction based on neural network techniques that learn patterns from historical data.
Type: Grant
Filed: December 27, 2019
Date of Patent: August 30, 2022
Assignee: Tata Consultancy Services Limited
Inventors: Avik Ghose, Brojeshwar Bhowmick
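One simple way to picture "introducing temporal uncertainty" into a sparse binary sensor stream is to spread each rare 'Yes' sample over neighbouring time slots, softening the extreme class imbalance before a model learns from it. This is an assumed interpretation for illustration only; the patent's actual technique may differ.

```python
def add_temporal_uncertainty(series, halfwidth=1):
    """Spread each positive ('Yes' = 1) sample over +/- halfwidth neighbouring
    time slots with linearly decaying weight. A simple stand-in for a
    temporal-uncertainty step on sparse, unbalanced binary sensor data."""
    out = [0.0] * len(series)
    for t, v in enumerate(series):
        if v:
            for dt in range(-halfwidth, halfwidth + 1):
                if 0 <= t + dt < len(series):
                    # keep the strongest contribution at each slot
                    out[t + dt] = max(out[t + dt],
                                      1.0 - abs(dt) / (halfwidth + 1))
    return out

smoothed = add_temporal_uncertainty([0, 0, 1, 0, 0, 0], halfwidth=1)
```

After smoothing, three slots carry signal instead of one, giving a downstream neural network more positive mass to learn from.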
-
Publication number: 20220219325
Abstract: This disclosure relates generally to navigation of a tele-robot in a dynamic environment using in-situ intelligence. Tele-robotics is the area of robotics concerned with the control of robots (tele-robots) in a remote environment from a distance. In reality, the remote environment where the tele-robot navigates may be dynamic in nature, with unpredictable movements, making navigation extremely challenging. The disclosure proposes in-situ intelligent navigation of a tele-robot in a dynamic environment. The disclosed in-situ intelligence enables the tele-robot to understand the dynamic environment by identifying objects and estimating their future locations based on generating/training a motion model. Further, the disclosed techniques also enable communication between a master and the tele-robot (whenever necessary) based on an application-layer communication semantic.
Type: Application
Filed: March 11, 2021
Publication date: July 14, 2022
Applicant: Tata Consultancy Services Limited
Inventors: Abhijan BHATTACHARYYA, Ruddra dev ROYCHOUDHURY, Sanjana SINHA, Sandika BISWAS, Ashis SAU, Madhurima GANGULY, Sayan PAUL, Brojeshwar BHOWMICK
-
Publication number: 20220148586
Abstract: The disclosure herein relates to methods and systems for enabling human-robot interaction (HRI) to resolve task ambiguity. Conventional techniques that initiate a continuous dialogue with the human, asking a suitable question based on the observed scene until the ambiguity is resolved, are limited. The present disclosure uses the concept of Talk-to-Resolve (TTR), which initiates a continuous dialogue with the user based on visual uncertainty analysis, asking a suitable question that conveys the nature of the problem to the user and seeking guidance until all ambiguities are resolved. The suitable question is formulated based on the scene understanding and the argument spans present in the natural language instruction. The present disclosure asks questions in a natural way that not only ensures the user can understand the type of confusion the robot is facing, but also ensures minimal and relevant questioning to resolve the ambiguities.
Type: Application
Filed: January 29, 2021
Publication date: May 12, 2022
Applicant: Tata Consultancy Services Limited
Inventors: Chayan SARKAR, Pradip Pramanick, Snehasis Banerjee, Brojeshwar Bhowmick
-
Patent number: 11295501
Abstract: Most prior art references that generate animations fail to determine and consider head movement data. The prior art references that do consider head movement data for generating animations rely on a sample video to determine the head movements and, as a result, fail to capture the changing head motions throughout the course of a speech given by a subject in an actual full-length video. The disclosure herein generally relates to generating facial animations and, more particularly, to a method and system for generating facial animations from a speech signal of a subject. The system determines the head movement, lip movements, and eyeball movements of the subject by processing a speech signal collected as input, and uses these movements to generate an animation.
Type: Grant
Filed: March 1, 2021
Date of Patent: April 5, 2022
Assignee: Tata Consultancy Services Limited
Inventors: Sandika Biswas, Dipanjan Das, Sanjana Sinha, Brojeshwar Bhowmick
-
Patent number: 11288769
Abstract: The present disclosure provides a system and a method for stitching images using non-linear optimization and multi-constraint cost function minimization. Most conventional homography-based transformation approaches for image alignment calculate transformations using linear algorithms that ignore parameters such as lens distortion and are unable to handle parallax for non-planar images, resulting in improper image stitching with misalignments. The disclosed system and method generate an initial stitched image by estimating a global homography for each image, using an estimated pairwise homography matrix and feature point correspondences for each pair of images, based on a non-linear optimization. Local-warping-based image alignment is applied to the initial stitched image, using multi-constraint cost function minimization, to mitigate aberrations caused by noise in the global homography estimation and generate the refined stitched image.
Type: Grant
Filed: March 26, 2020
Date of Patent: March 29, 2022
Assignee: Tata Consultancy Services Limited
Inventors: Arindam Saha, Soumyadip Maity, Brojeshwar Bhowmick
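The geometric core of homography-based stitching is the transfer (reprojection) error over feature correspondences, which is what a non-linear optimizer drives down. The sketch below shows that single cost term with a toy translation homography; the patent's multi-constraint cost adds further terms beyond this.

```python
import numpy as np

def apply_homography(H, pts):
    """Map Nx2 points through a 3x3 homography (projective transform)."""
    homo = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homo @ H.T
    return mapped[:, :2] / mapped[:, 2:3]  # divide out the projective scale

def transfer_error(H, src, dst):
    """Mean reprojection error ||H(src) - dst|| over correspondences: the
    kind of geometric cost a non-linear optimizer would minimize."""
    diff = apply_homography(H, src) - dst
    return float(np.mean(np.linalg.norm(diff, axis=1)))

# A pure-translation homography that shifts points by (2, 1).
H = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
dst = src + np.array([2.0, 1.0])
err = transfer_error(H, src, dst)
```

With the correct homography the error vanishes; a wrong one (e.g. identity) leaves a residual, which is the signal the optimizer follows.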
-
Publication number: 20220076431
Abstract: This disclosure relates generally to a system and method for forecasting the location of a target in monocular first-person view. Conventional systems for location forecasting utilize complex neural networks and hence are computationally intensive and require high compute power. The disclosed system includes an efficient and lightweight RNN-based network model for predicting the motion of targets in first-person monocular videos. The network model includes an auto-encoder in the encoding phase, and a regularizing layer at the end helps achieve better accuracy. The disclosed method relies entirely on detection bounding boxes, for both prediction and training of the network model, and is still capable of zero-shot transfer to a different dataset.
Type: Application
Filed: August 18, 2021
Publication date: March 10, 2022
Applicant: Tata Consultancy Services Limited
Inventors: Junaid Ahmed ANSARI, Brojeshwar Bhowmick
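To see what "prediction from detection bounding boxes alone" means, here is the standard non-learned baseline over (x, y, w, h) boxes: extrapolate at constant velocity. The disclosed model replaces this extrapolation with a lightweight RNN, but the input/output contract is the same; this baseline is illustrative, not the patented network.

```python
def forecast_bbox(history, n_future=3):
    """Constant-velocity baseline over bounding boxes (x, y, w, h):
    extrapolate the centre motion of the last two observed boxes,
    keeping the box size fixed."""
    (x0, y0, w0, h0), (x1, y1, w1, h1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0  # per-frame displacement
    return [(x1 + k * vx, y1 + k * vy, w1, h1)
            for k in range(1, n_future + 1)]

preds = forecast_bbox([(0, 0, 10, 20), (2, 1, 10, 20)], n_future=2)
```

Because both the baseline and the learned model consume only box coordinates, no appearance features are needed, which is also what makes zero-shot transfer across datasets plausible.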
-
Publication number: 20220058850
Abstract: This disclosure relates generally to a method and system for generating 2D animated lip images synchronized to an audio signal for an unseen subject. Recent advances in Convolutional Neural Network (CNN) based approaches generate convincing talking heads. Personalization of such talking heads requires training the model with a large number of samples of the target person, which is time consuming. The lip generator system receives an audio signal and a target lip image of an unseen target subject as inputs from a user and processes these inputs to extract a plurality of high-dimensional audio-image features. The lip generator system is meta-trained with a training dataset that covers a large variety of subject ethnicities and vocabularies. The meta-trained model generates realistic animation for a previously unseen face and unseen audio when fine-tuned with only a few samples for a predefined interval of time. Additionally, the method protects intrinsic features of the unseen target subject.
Type: Application
Filed: August 18, 2021
Publication date: February 24, 2022
Applicant: Tata Consultancy Services Limited
Inventors: Swapna AGARWAL, Dipanjan DAS, Brojeshwar BHOWMICK
-
Patent number: 11256962
Abstract: Estimating 3D human pose from monocular images is a challenging problem due to the variety and complexity of human poses and the inherent ambiguity in recovering depth from a single view. Recent deep-learning-based methods show promising results by using supervised learning on 3D-pose-annotated datasets. However, the lack of large-scale 3D annotated training data makes 3D pose estimation difficult in the wild. Embodiments of the present disclosure provide a method that can effectively predict 3D human poses from only 2D poses in a weakly supervised manner, using both ground-truth 3D pose and ground-truth 2D pose, with re-projection error minimization as a constraint to predict the 3D joint locations. The method may further utilize additional geometric constraints on reconstructed body parts to regularize the pose in 3D, along with minimizing re-projection error, to improve the estimation of an accurate 3D pose.
Type: Grant
Filed: March 11, 2020
Date of Patent: February 22, 2022
Assignee: Tata Consultancy Services Limited
Inventors: Sandika Biswas, Sanjana Sinha, Kavya Gupta, Brojeshwar Bhowmick
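The re-projection constraint at the heart of this weakly supervised setup can be written in a few lines: project the predicted 3D joints back to the image plane and penalize their distance to the 2D annotations. The sketch below uses a weak-perspective camera for simplicity (an assumption; the patent does not fix a camera model here) and omits the additional body-part geometric priors the abstract mentions.

```python
import numpy as np

def project(joints_3d, scale=1.0):
    """Weak-perspective projection of (J,3) joints onto the image plane:
    drop depth and apply a uniform scale."""
    return scale * joints_3d[:, :2]

def reprojection_loss(joints_3d, joints_2d, scale=1.0):
    """Mean squared distance between projected 3D joints and ground-truth
    2D joints: the constraint minimized during weakly supervised training."""
    diff = project(joints_3d, scale) - joints_2d
    return float(np.mean(np.sum(diff ** 2, axis=1)))

# A 3D pose whose projection matches its 2D annotation has zero loss.
pose_3d = np.array([[0.0, 0.0, 2.0],
                    [1.0, 1.0, 2.0]])
pose_2d = pose_3d[:, :2]
loss = reprojection_loss(pose_3d, pose_2d)
```

Because many 3D poses share the same projection, this loss alone is ambiguous, which is exactly why the method adds geometric regularizers on reconstructed body parts.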
-
AUDIO-SPEECH DRIVEN ANIMATED TALKING FACE GENERATION USING A CASCADED GENERATIVE ADVERSARIAL NETWORK
Publication number: 20220036617
Abstract: Conventional state-of-the-art methods are limited in their ability to generate realistic animation from audio for unknown faces and cannot be easily generalized to different facial characteristics and voice accents. Further, these methods fail to produce realistic facial animation for subjects whose facial characteristics differ significantly from the distribution the network has seen during training. Embodiments of the present disclosure provide systems and methods that generate an audio-speech-driven animated talking face using a cascaded generative adversarial network (CGAN), wherein a first GAN is used to transfer lip motion from a canonical face to a person-specific face. A second GAN-based texture generator network is conditioned on person-specific landmarks to generate a high-fidelity face corresponding to the motion. The texture generator GAN is made more flexible using meta-learning to adapt to an unknown subject's traits and face orientation during inference.
Type: Application
Filed: March 11, 2021
Publication date: February 3, 2022
Applicant: Tata Consultancy Services Limited
Inventors: Sandika BISWAS, Dipanjan DAS, Sanjana SINHA, Brojeshwar BHOWMICK
-
Systems and methods for coupled representation using transform learning for solving inverse problems
Patent number: 11216692
Abstract: This disclosure relates to systems and methods for solving generic inverse problems by providing a coupled representation architecture using transform learning. Conventional solutions are complex, require long training and testing times, and their reconstruction quality may not be suitable for all applications. Furthermore, these inherent lacunae preclude application to real-time scenarios. The methods provided herein involve very low computational complexity, with a need for only three matrix-vector products, and require very short training and testing times, which makes them applicable for real-time applications. Unlike conventional learning architectures using inductive approaches, the CASC of the present disclosure can learn directly from the source domain, and the number of features in a source domain need not be equal to the number of features in a target domain.
Type: Grant
Filed: July 3, 2019
Date of Patent: January 4, 2022
Assignee: Tata Consultancy Services Limited
Inventors: Kavya Gupta, Brojeshwar Bhowmick, Angshul Majumdar
-
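The "only three matrix-vector products" claim can be pictured as a pipeline of three learned linear operators: a source-domain analysis transform, a coupling map between the two coefficient spaces, and a target-domain synthesis operator. The matrices below are random placeholders standing in for learned transforms, and the three-operator factorization is an assumed reading of the abstract, not the patented architecture; note the source (6) and target (3) feature counts deliberately differ, echoing the abstract's last point.

```python
import numpy as np

rng = np.random.default_rng(0)
T_src = rng.standard_normal((4, 6))      # analysis transform, source domain
M = rng.standard_normal((5, 4))          # coupling map between coefficient spaces
T_tgt_syn = rng.standard_normal((3, 5))  # synthesis operator, target domain

def solve_inverse(x):
    """Test-time inference as exactly three matrix-vector products,
    mirroring the low computational cost the disclosure highlights."""
    z_src = T_src @ x         # 1) source-domain coefficients
    z_tgt = M @ z_src         # 2) coupled target-domain coefficients
    return T_tgt_syn @ z_tgt  # 3) target-domain reconstruction

y = solve_inverse(np.ones(6))
```

All the expensive work happens once, at training time; at test time the cost is fixed and tiny, which is what enables real-time use.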
Publication number: 20210370516
Abstract: The disclosure generally relates to methods and systems for enabling human-robot interaction by cognition sharing, which includes gesture and audio. Conventional techniques that use gestures and speech require an extra hardware setup and are limited to navigation in structured outdoor driving environments. The present disclosure herein provides methods and systems that solve the technical problem of enabling human-robot interaction with a two-step approach, by transferring the cognitive load from the human to the robot. In the first step, an accurate shared perspective associated with the task is determined by computing relative frame transformations based on an understanding of the navigational gestures of the subject. The shared perspective is then transformed into the robot's field of view. In the second step, the transformed shared perspective is given to a language grounding technique to accurately determine the final goal associated with the task.
Type: Application
Filed: February 4, 2021
Publication date: December 2, 2021
Applicant: Tata Consultancy Services Limited
Inventors: Soumyadip MAITY, Gourav KUMAR, Ruddra Dev ROY CHOUDHURY, Brojeshwar BHOWMICK
-
Publication number: 20210366173
Abstract: Speech-driven facial animation is useful for a variety of applications such as telepresence, chatbots, etc. The necessary attributes of a realistic face animation are: (1) audio-visual synchronization, (2) identity preservation of the target individual, (3) plausible mouth movements, and (4) presence of natural eye blinks. Existing methods mostly address audio-visual lip synchronization and the synthesis of natural facial gestures for overall video realism. However, existing approaches are not accurate. The present disclosure provides a system and method that learn the motion of facial landmarks as an intermediate step before generating texture. Person-independent facial landmarks are generated from audio for invariance to different voices, accents, etc. Eye blinks are imposed on the facial landmarks, and the person-independent landmarks are retargeted to person-specific landmarks to preserve identity-related facial structure.
Type: Application
Filed: September 29, 2020
Publication date: November 25, 2021
Applicant: Tata Consultancy Services Limited
Inventors: Sanjana SINHA, Sandika BISWAS, Brojeshwar BHOWMICK
-
Patent number: 11176724
Abstract: Speech-driven facial animation is useful for a variety of applications such as telepresence, chatbots, etc. The necessary attributes of a realistic face animation are: (1) audio-visual synchronization, (2) identity preservation of the target individual, (3) plausible mouth movements, and (4) presence of natural eye blinks. Existing methods mostly address audio-visual lip synchronization and the synthesis of natural facial gestures for overall video realism. However, existing approaches are not accurate. The present disclosure provides a system and method that learn the motion of facial landmarks as an intermediate step before generating texture. Person-independent facial landmarks are generated from audio for invariance to different voices, accents, etc. Eye blinks are imposed on the facial landmarks, and the person-independent landmarks are retargeted to person-specific landmarks to preserve identity-related facial structure.
Type: Grant
Filed: September 29, 2020
Date of Patent: November 16, 2021
Assignee: Tata Consultancy Services Limited
Inventors: Sanjana Sinha, Sandika Biswas, Brojeshwar Bhowmick
-
Publication number: 20210291363
Abstract: Conventional tele-presence robots have their own limitations with respect to task execution, information processing, and management. Embodiments of the present disclosure provide a tele-presence robot (TPR) that communicates with a master device associated with a user via an edge device for task execution, wherein a control command from the master device is parsed to determine the instruction set and task type for execution. Based on this determination, the TPR queries for information across storage devices until a response sufficient to execute the task is obtained. Upon execution, the task is validated with the master device and the user. Knowledge acquired during querying, task execution, and validation of the executed task is dynamically partitioned by the TPR across storage devices, namely the on-board memory of the tele-presence robot, an edge device, a cloud, and a web interface, depending upon the task type, the operating environment of the tele-presence robot, and other performance-affecting parameters.
Type: Application
Filed: September 9, 2020
Publication date: September 23, 2021
Applicant: Tata Consultancy Services Limited
Inventors: Chayan Sarkar, Snehasis Banerjee, Pradip Pramanick, Hrishav Bakul Barua, Soumyadip Maity, Dipanjan Das, Brojeshwar Bhowmick, Ashis Sau, Abhijan Bhattacharyya, Arpan Pal, Balamuralidhar PURUSHOTHAMAN, Ruddra Roy Chowdhury
-
Publication number: 20210208581
Abstract: Robotic platforms for tele-presence applications, such as remote meetings, group discussions, and the like, have gained paramount importance and attracted much attention. While some robotic platforms exist for such tele-presence applications, they lack efficacy in the communication and interaction between the remote person and the avatar robot deployed in another geographic location, thus adding network overhead. Embodiments of the present disclosure provide an edge-centric communication protocol for remotely maneuvering a tele-presence robot in a geographically distributed environment.
Type: Application
Filed: August 7, 2020
Publication date: July 8, 2021
Applicant: Tata Consultancy Services Limited
Inventors: Abhijan BHATTACHARYYA, Ashis SAU, Ruddra Dev ROYCHOUDHURY, Hrishav Bakul BARUA, Chayan SARKAR, Sayan PAUL, Brojeshwar BHOWMICK, Arpan PAL, Balamuralidhar PURUSHOTHAMAN
-
Patent number: 11033205
Abstract: A method and system are provided for finding and analyzing gait parameters and the postural balance of a person using a Kinect system. The system is easy to use and can be installed at home as well as in a clinic. The system includes a Kinect sensor, a software development kit (SDK), and a processor. The temporal skeleton information obtained from the Kinect sensor is used to evaluate gait parameters including stride length, stride time, stance time, and swing time. Eigenvector-based curvature detection is used to analyze the gait pattern at different speeds. In another embodiment, eigenvector-based curvature detection is employed to detect the static single limb stance (SLS) duration, along with gait variables, for evaluating body balance.
Type: Grant
Filed: February 8, 2017
Date of Patent: June 15, 2021
Assignee: TATA CONSULTANCY SERVICES LIMITED
Inventors: Kingshuk Chakravarty, Brojeshwar Bhowmick, Aniruddha Sinha, Abhijit Das
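Temporal gait parameters such as stride time fall out directly once foot events are located in the skeleton stream: a stride time is the interval between successive heel strikes of the same foot. The sketch below detects heel strikes as local minima of an ankle-height signal, a deliberately simple stand-in for the eigenvector-based curvature detection the patent describes; the signal values are synthetic.

```python
def heel_strikes(ankle_heights, times):
    """Detect heel strikes as strict local minima of the ankle-height
    signal from the skeleton stream (simplified event detector)."""
    return [times[i] for i in range(1, len(ankle_heights) - 1)
            if ankle_heights[i] < ankle_heights[i - 1]
            and ankle_heights[i] < ankle_heights[i + 1]]

def stride_times(strikes):
    """Stride time: interval between successive heel strikes of one foot."""
    return [b - a for a, b in zip(strikes, strikes[1:])]

# Synthetic ankle height over 0.8 s, sampled every 0.1 s: two heel strikes.
heights = [5, 3, 1, 3, 5, 3, 1, 3, 5]
t = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
strikes = heel_strikes(heights, t)
strides = stride_times(strikes)
```

Stance and swing times follow the same pattern, using toe-off events in addition to heel strikes to split each stride into its two phases.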