Patents by Inventor Vinay Namboodiri

Vinay Namboodiri has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SYSTEM AND METHOD FOR AUTOMATICALLY GENERATING A SIGN LANGUAGE VIDEO WITH AN INPUT SPEECH USING A MACHINE LEARNING MODEL

Publication number: 20230290371

Abstract: Embodiments herein provide a system and method for automatically generating a sign language video from an input speech using a machine learning model. The method includes (i) extracting a plurality of spectrograms of an input speech by (a) encoding, using an encoder, a time domain series of the input speech to a frequency domain series, and (b) decoding, using a decoder, a plurality of tokens for time steps of the frequency domain series, (ii) generating a plurality of pose sequences for a current time step of the plurality of spectrograms using a first machine learning model, and (iii) automatically generating, using a discriminator of a second machine learning model, a sign language video for the input speech using the plurality of pose sequences and the plurality of spectrograms when the plurality of pose sequences are matched with the corresponding plurality of spectrograms that are extracted.

Type: Application

Filed: March 11, 2023

Publication date: September 14, 2023

Inventors: C.V. Jawahar, Parul Kapoor, Sindhu B. Hegde, Rudrabha Mukhopadhyay, Vinay Namboodiri

SYSTEM AND METHOD FOR AUTOMATICALLY GENERATING SYNTHETIC HEAD VIDEOS USING A MACHINE LEARNING MODEL

Publication number: 20230290332

Abstract: Embodiments herein provide a system and a method for automatically generating at least one synthetic talking head video using a machine learning model. The method includes (i) extracting features from each frame of a video that is extracted from data sources, (ii) analyzing, using a face-detection model, the video to determine a driving face video if a number of identities and faces of speakers are equal to one in all frames of the video, (iii) generating, using a text to speech model, synthetic speech utterances by automatically selecting a vocabulary of words and sentences from the data sources, (iv) modifying lip movements that are originally present in the driving face video corresponding to the synthetic speech utterances, and (v) generating, using the machine learning model, the synthetic talking head video based on the lip movements that are modified corresponding to the synthetic speech utterances.

Type: Application

Filed: March 11, 2023

Publication date: September 14, 2023

Inventors: C.V. Jawahar, Aditya Agarwal, Bipasha Sen, Rudrabha Mukhopadhyay, Vinay Namboodiri

SYSTEM AND METHOD FOR LIP-SYNCING A FACE TO TARGET SPEECH USING A MACHINE LEARNING MODEL

Publication number: 20220215830

Abstract: A processor-implemented method for generating a lip-sync for a face to a target speech of a live session to a speech in one or more languages in-sync with improved visual quality using a machine learning model and a pre-trained lip-sync model is provided. The method includes (i) determining a visual representation of the face and an audio representation, the visual representation includes crops of the face; (ii) modifying the crops of the face to obtain masked crops; (iii) obtaining a reference frame from the visual representation at a second timestamp; (iv) combining the masked crops at the first timestamp with the reference frame to obtain lower half crops; (v) training the machine learning model by providing historical lower half crops and historical audio representations as training data; (vi) generating lip-synced frames for the face to the target speech, and (vii) generating in-sync lip-synced frames by the pre-trained lip-sync model.

Type: Application

Filed: January 1, 2022

Publication date: July 7, 2022

Inventors: C.V. Jawahar, Rudrabha Mukhopadhyay, K R Prajwal, Vinay Namboodiri

Efficient and scalable caching and representation of media with cross-similarities

Patent number: 10200429

Abstract: A system for making available at an end-user a media file, from a media provider comprising a media file patch related to at least one object, the system comprising: an encoding module at the media provider configured for determining at least one representation which resembles the media file patch, by comparing the media file patch with representations of said at least one object, and for including at least one identification corresponding with said representation in a skeleton file; a storage medium storing a dictionary including the representations of the at least one object at at least one of the end-user and an intermediate node between the media provider and the end-user; a decoding module configured for decoding the skeleton file using the identification for looking up the corresponding representation in the dictionary of the storage medium and for rendering the media file patch based on the looked-up corresponding representation.

Type: Grant

Filed: February 9, 2015

Date of Patent: February 5, 2019

Assignee: ALCATEL LUCENT

Inventors: Maarten Aerts, Patrice Rondao Alface, Vinay Namboodiri
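
The encode, look-up, and render flow described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the feature vectors, the squared-distance similarity measure, and the names `encode`, `decode`, and `dictionary` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

Patch = Sequence[float]  # assumption: a patch is a small feature vector


def squared_distance(a: Patch, b: Patch) -> float:
    """Squared Euclidean distance, used here as a stand-in similarity measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(patches: List[Patch], dictionary: Dict[str, Patch]) -> List[str]:
    """Build a 'skeleton': one dictionary identifier per patch (nearest entry)."""
    return [
        min(dictionary, key=lambda key: squared_distance(dictionary[key], patch))
        for patch in patches
    ]


def decode(skeleton: List[str], dictionary: Dict[str, Patch]) -> List[List[float]]:
    """Render each patch by looking its identifier up in the dictionary."""
    return [list(dictionary[identifier]) for identifier in skeleton]


# Hypothetical dictionary of object representations, shared by encoder and decoder.
dictionary = {"sky": [0.1, 0.2, 0.9], "grass": [0.1, 0.8, 0.2]}
patches = [[0.15, 0.25, 0.85], [0.05, 0.75, 0.25]]

skeleton = encode(patches, dictionary)    # only identifiers need to be sent
rendered = decode(skeleton, dictionary)   # receiver rebuilds from its own copy
```

In this sketch only the identifiers travel from provider to end-user, which is the reason for keeping the dictionary in a storage medium near the decoder.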

Capturing an environment with objects

Patent number: 10078913

Abstract: Method for capturing an environment with objects, using a 3D camera, wherein the images of the cameras captured at different moments in time are used to generate 3D models, and wherein accuracy values are assigned to segments of the models allowing efficient refining of the models using the accuracy values.

Type: Grant

Filed: April 9, 2015

Date of Patent: September 18, 2018

Assignee: Alcatel Lucent

Inventors: Patrice Rondao Alface, Vinay Namboodiri, Maarten Aerts

CAPTURING AN ENVIRONMENT WITH OBJECTS

Publication number: 20170032569

Abstract: Method for capturing an environment with objects, using a 3D camera, wherein the images of the cameras captured at different moments in time are used to generate 3D models, and wherein accuracy values are assigned to segments of the models allowing efficient refining of the models using the accuracy values.

Type: Application

Filed: April 9, 2015

Publication date: February 2, 2017

Applicant: Alcatel Lucent

Inventors: Patrice Rondao Alface, Vinay Namboodiri, Maarten Aerts

Method and arrangement for image retrieval based on multiple images

Patent number: 9535929

Abstract: A method for retrieving at least one image from a database (DB) of images based on at least two input images (image i, image j), comprises the steps of determining (120) first low level feature correspondences (LC1) between said at least two input images (image i, image j), searching (200) within said database (DB) for at least two sets (Mi, Mj) of images respectively matching said at least two input images (image i, image j), determining (120) second low level feature correspondences (LC2) between respective images from said at least two sets of images (Mi, Mj), determining (130) a first set of relationships (RLC1) between entities of said at least two input images (image i, image j) based on said first low level feature correspondences (LC1), determining (130) a second set of relationships (RLC2) between respective entities of said respective images from said at least two sets of images (Mi, Mj) based on said second low level feature correspondences (LC2), identifying (300) matching relationships between said fir

Type: Grant

Filed: December 17, 2013

Date of Patent: January 3, 2017

Assignee: Alcatel Lucent

Inventors: Vinay Namboodiri, Mohamed Ali Feki, Erwin Six

EFFICIENT AND SCALABLE CACHING AND REPRESENTATION OF MEDIA WITH CROSS-SIMILARITIES

Publication number: 20160359934

Abstract: A system for making available at an end-user a media file, from a media provider comprising a media file patch related to at least one object, the system comprising: an encoding module at the media provider configured for determining at least one representation which resembles the media file patch, by comparing the media file patch with representations of said at least one object, and for including at least one identification corresponding with said representation in a skeleton file; a storage medium storing a dictionary including the representations of the at least one object at at least one of the end-user and an intermediate node between the media provider and the end-user; a decoding module configured for decoding the skeleton file using the identification for looking up the corresponding representation in the dictionary of the storage medium and for rendering the media file patch based on the looked-up corresponding representation.

Type: Application

Filed: February 9, 2015

Publication date: December 8, 2016

Applicant: Alcatel Lucent

Inventors: Maarten Aerts, Patrice Rondao Alface, Vinay Namboodiri

MEDIA CONTENT ORDERING SYSTEM AND METHOD FOR ORDERING MEDIA CONTENT

Publication number: 20160306882

Abstract: The present invention relates to a media content ordering system and to a method for ordering media content. According to the invention, media content items are ordered in two different spaces, i.e. metadata space and feature space. This allows a user to select and retrieve desired content more easily. Media content that is clustered in either space represents similar media content. Suggestions can be made to the user taking into account the preferences of the user with respect to features and metadata particulars. By minimizing the difference in order in both spaces, it is ensured that suggestions to a user are close both in feature space and metadata space.

Type: Application

Filed: October 28, 2014

Publication date: October 20, 2016

Applicant: ALCATEL LUCENT

Inventors: Vinay Namboodiri, Patrice Rondao Alface, Maarten Aerts

METHOD AND SYSTEM FOR PROVIDING ACCESS TO AUXILIARY INFORMATION

Publication number: 20160247522

Abstract: The invention relates to a system and method for tag based access to auxiliary information during a video- and/or audio-conference. The system comprises a mapping between tags and associated portions of auxiliary information. The method may comprise transmitting video data and/or audio data from the video- and/or audio-conferencing system to participants of the video-conference. During the transmission tags may be extracted from the video data and/or audio data. As soon as a request for auxiliary information is received from a participant, the method may comprise selecting at least one of the tags extracted from the transmitted video data and/or audio data, retrieving at least one auxiliary information portion associated with the selected at least one tag, and transmitting the at least one retrieved auxiliary information portion to the participant that has requested the auxiliary information.

Type: Application

Filed: October 28, 2014

Publication date: August 25, 2016

Applicant: Alcatel Lucent

Inventors: Vinay Namboodiri, Donny Tytgat, Maarten Aerts, Sammy Lievens

METHOD AND DEVICE FOR ROTATING A MULTIDIMENSIONAL SPACE

Publication number: 20160171742

Abstract: Embodiments relate to a method for rotating a multidimensional space and a device for rotating a multidimensional space. The method may comprise: obtaining (S1) 2D coordinates of a first point (p1) and a second point (p2) of a 2D visualization space input by a user; determining (S2) a first D-dimensional vector (p1) and a second D-dimensional vector (p2) by projecting said first point (p1) and said second point (p2) into a D-dimensional space, wherein D is equal to or greater than 4; and determining (S3) a rotation matrix (R) as a function of said first D-dimensional vector (p1) and said second D-dimensional vector (p2). The D-dimensional space has a first dimension, a second dimension and D-2 other dimensions, and the 2D visualization space has D-2 predefined regions associated with respective dimensions among said D-2 other dimensions.

Type: Application

Filed: July 7, 2014

Publication date: June 16, 2016

Applicant: Alcatel Lucent

Inventors: Maarten Aerts, Sammy Lievens, Vinay Namboodiri, Donny Tytgat
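
Step (S3) of this abstract, deriving a rotation matrix from two D-dimensional vectors, can be sketched with one standard construction: the rotation in the plane spanned by the two vectors that maps the first direction onto the second. The abstract does not say which construction the application uses, so the formula R = I + K + K²/(1 + c) below, the helper names, and the example vectors are assumptions. The projection from 2D input to D dimensions (step S2) is not modeled.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def normalize(v: Vector) -> Vector:
    """Scale v to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rotation_between(p1: Vector, p2: Vector) -> Matrix:
    """Rotation in the plane spanned by p1 and p2 taking direction p1 to p2."""
    a, b = normalize(p1), normalize(p2)
    d = len(a)
    c = sum(x * y for x, y in zip(a, b))  # cosine of the angle between a and b
    if math.isclose(c, -1.0):
        raise ValueError("opposite vectors: rotation plane is not unique")
    # K = b a^T - a b^T is skew-symmetric and acts only in the (a, b) plane.
    k = [[b[i] * a[j] - a[i] * b[j] for j in range(d)] for i in range(d)]
    k2 = matmul(k, k)
    return [[(1.0 if i == j else 0.0) + k[i][j] + k2[i][j] / (1.0 + c)
             for j in range(d)] for i in range(d)]


def apply(r: Matrix, v: Vector) -> Vector:
    """Matrix-vector product."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in r]


# Hypothetical 4-dimensional example (D = 4).
p1 = [1.0, 0.0, 0.0, 0.0]
p2 = [1.0, 0.0, 1.0, 0.0]
R = rotation_between(p1, p2)
rotated = apply(R, normalize(p1))
```

Vectors orthogonal to both inputs are left unchanged by this matrix, so the other dimensions of the space stay fixed while the chosen plane turns.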

METHOD AND SYSTEM FOR GENERATING A HIGH-RESOLUTION VIDEO STREAM

Publication number: 20150324952

Abstract: A process for generating a high-resolution video stream, the process comprising: receiving (310) a low-resolution video stream; receiving (320) at least one high-resolution video stream; selecting (331) first image patches from said at least one high-resolution video stream; generating (332) respective first low-resolution counterparts of said first high-resolution image patches; storing (333) said first high-resolution image patches indexed by said first low-resolution counterparts in a first data storage; and improving (350) said low-resolution video stream by substituting (351) portions of said low-resolution video stream that are similar to one or more of said first low-resolution counterparts with first high-resolution patches obtained from said first data storage in accordance with said indexing; wherein said low-resolution video stream and said at least one high-resolution video stream are substantially synchronized video streams.

Type: Application

Filed: June 24, 2013

Publication date: November 12, 2015

Applicant: ALCATEL LUCENT

Inventors: Vinay Namboodiri, Jean-Francois Macq, Patrice Rondao Alface, Nico Verzijp, Erwin Six
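
The indexing and substitution steps in this abstract (332, 333, 351) can be illustrated on toy one-dimensional patches. This is a sketch under stated assumptions, not the claimed process: pair averaging as the downsampling operator, the squared-distance similarity test, the `threshold` value, and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

Patch = Tuple[float, ...]


def downsample(patch: Patch) -> Patch:
    """Low-resolution counterpart: average each adjacent pair of samples."""
    return tuple((patch[i] + patch[i + 1]) / 2 for i in range(0, len(patch), 2))


def upsample(patch: Patch) -> Patch:
    """Fallback when no match is found: repeat every sample twice."""
    return tuple(value for value in patch for _ in range(2))


def distance(a: Patch, b: Patch) -> float:
    """Squared Euclidean distance between two patches of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_index(high_res_patches: List[Patch]) -> Dict[Patch, Patch]:
    """Store high-resolution patches indexed by their low-resolution counterparts."""
    return {downsample(patch): patch for patch in high_res_patches}


def enhance(low_res_patches: List[Patch], index: Dict[Patch, Patch],
            threshold: float = 0.01) -> List[Patch]:
    """Substitute a stored high-resolution patch wherever a close match exists."""
    output = []
    for patch in low_res_patches:
        key = min(index, key=lambda candidate: distance(candidate, patch))
        if distance(key, patch) <= threshold:
            output.append(index[key])
        else:
            output.append(upsample(patch))
    return output


# Hypothetical patches taken from a synchronized high-resolution stream.
index = build_index([(0.0, 1.0, 0.0, 1.0), (1.0, 1.0, 0.0, 0.0)])
enhanced = enhance([(1.0, 0.0), (0.2, 0.9)], index)
```

The first low-resolution patch matches a stored counterpart and is replaced by its high-resolution original; the second has no close match and is simply upsampled.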

METHOD AND ARRANGEMENT FOR IMAGE RETRIEVAL BASED ON MULTIPLE IMAGES

Publication number: 20150302028

Abstract: A method for retrieving at least one image from a database (DB) of images based on at least two input images (image i, image j), comprises the steps of determining (120) first low level feature correspondences (LC1) between said at least two input images (image i, image j), searching (200) within said database (DB) for at least two sets (Mi, Mj) of images respectively matching said at least two input images (image i, image j), determining (120) second low level feature correspondences (LC2) between respective images from said at least two sets of images (Mi, Mj), determining (130) a first set of relationships (RLC1) between entities of said at least two input images (image i, image j) based on said first low level feature correspondences (LC1), determining (130) a second set of relationships (RLC2) between respective entities of said respective images from said at least two sets of images (Mi, Mj) based on said second low level feature correspondences (LC2), identifying (300) matching relationships between said fir

Type: Application

Filed: December 17, 2013

Publication date: October 22, 2015

Applicant: ALCATEL LUCENT

Inventors: Vinay Namboodiri, Mohamed Ali Feki, Erwin Six

METHOD AND APPARATUS FOR ENABLING VISUAL MUTE OF A PARTICIPANT DURING VIDEO CONFERENCING

Publication number: 20150208031

Abstract: A method for adapting video data (video room1) recorded by a camera on a location (room1) during a video conference such as to hide the presence of a participant at said location of said video conference, comprises a step of registering a predefined gesture possibly to be performed by any participant of said video conference at said location (room1), a step of detecting said gesture, and upon detection thereof, identifying the at least one participant having performed said gesture at said location (room1), adapting said video data such as to eliminate data relating to said at least one participant having performed said gesture from said video data, thereby generating adapted video data (videoa room1), for being transmitted to other participants of said video conference on other locations (room2, . . . , roomn).

Type: Application

Filed: July 30, 2013

Publication date: July 23, 2015

Applicant: ALCATEL LUCENT

Inventors: Sammy Lievens, Donny Tytgat, Maarten Aerts, Vinay Namboodiri, Erwin Six