Patents by Inventor Chalapathy Venkata Neti
Chalapathy Venkata Neti has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7295979
Abstract: Bootstrapping of a system from one language to another often works well when the two languages share a similar acoustic space. However, when the new language has sounds that do not occur in the language from which the bootstrapping is to be done, bootstrapping does not produce good initial models and the new language data is not properly aligned to these models. The present invention provides techniques to generate context-dependent labeling of the new language data using the recognition system of another language. This labeled data is then used to generate models for the new language phones.
Type: Grant
Filed: February 22, 2001
Date of Patent: November 13, 2007
Assignee: International Business Machines Corporation
Inventors: Chalapathy Venkata Neti, Nitendra Rajput, L. Venkata Subramaniam, Ashish Verma
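The labeling step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented method: the phone inventory, the `PHONE_MAP` table, and the `source_align` stand-in for a trained source-language aligner are all hypothetical.

```python
# Sketch of bootstrapping new-language phone labels from a
# source-language recognizer, assuming a hand-made phone mapping.

# Hypothetical mapping from source-language phones to new-language phones.
PHONE_MAP = {"AA": "aa", "IY": "ii", "T": "t", "D": "d"}

def source_align(utterance_id):
    """Stand-in for a trained source-language aligner: returns
    (source_phone, start_frame, end_frame) triples.  Here it just
    replays a canned alignment keyed by the utterance id."""
    canned = {
        "utt1": [("AA", 0, 10), ("T", 10, 18), ("IY", 18, 30)],
    }
    return canned[utterance_id]

def relabel(utterance_id):
    """Map a source-language alignment onto new-language phone labels.
    Segments whose source phone has no counterpart are dropped; those
    phones would need new-language-only modelling."""
    labels = []
    for phone, start, end in source_align(utterance_id):
        if phone in PHONE_MAP:
            labels.append((PHONE_MAP[phone], start, end))
    return labels

print(relabel("utt1"))  # [('aa', 0, 10), ('t', 10, 18), ('ii', 18, 30)]
```

The relabeled segments would then serve as the seed alignment for training the new-language phone models.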
-
Patent number: 7251603
Abstract: Techniques for performing audio-visual speech recognition, with improved recognition performance, in a degraded visual environment. For example, in one aspect of the invention, a technique for use in accordance with an audio-visual speech recognition system for improving a recognition performance thereof includes the steps/operations of: (i) selecting between an acoustic-only data model and an acoustic-visual data model based on a condition associated with a visual environment; and (ii) decoding at least a portion of an input spoken utterance using the selected data model. Advantageously, during periods of degraded visual conditions, the audio-visual speech recognition system is able to decode (recognize) input speech data using audio-only data, thus avoiding recognition inaccuracies that may result from performing speech recognition based on acoustic-visual data models and degraded visual data.
Type: Grant
Filed: June 23, 2003
Date of Patent: July 31, 2007
Assignee: International Business Machines Corporation
Inventors: Jonathan H. Connell, Norman Haas, Etienne Marcheret, Chalapathy Venkata Neti, Gerasimos Potamianos
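The model-selection step (i) above can be sketched as a simple switch on a visual-condition score. The quality metric, its inputs, and the 0.5 threshold are illustrative assumptions, not details from the patent.

```python
# Sketch of model selection for audio-visual speech recognition:
# fall back to an audio-only model when the visual channel is degraded.

def visual_quality(face_confidence, illumination):
    """Toy visual-condition score in [0, 1]: limited by the weaker
    of face-tracking confidence and scene illumination."""
    return min(face_confidence, illumination)

def choose_model(face_confidence, illumination, threshold=0.5):
    """Return which model family should decode the utterance."""
    if visual_quality(face_confidence, illumination) >= threshold:
        return "audio-visual"
    return "audio-only"

print(choose_model(0.9, 0.8))  # audio-visual
print(choose_model(0.9, 0.2))  # audio-only: poor lighting degrades video
```

The decoder would then run the selected model family over the utterance, so degraded video never corrupts the acoustic evidence.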
-
Patent number: 6964023
Abstract: Systems and methods are provided for performing focus detection, referential ambiguity resolution and mood classification in accordance with multi-modal input data, in varying operating conditions, in order to provide an effective conversational computing environment for one or more users.
Type: Grant
Filed: February 5, 2001
Date of Patent: November 8, 2005
Assignee: International Business Machines Corporation
Inventors: Stephane Herman Maes, Chalapathy Venkata Neti
-
Publication number: 20040260554
Abstract: Techniques for performing audio-visual speech recognition, with improved recognition performance, in a degraded visual environment. For example, in one aspect of the invention, a technique for use in accordance with an audio-visual speech recognition system for improving a recognition performance thereof includes the steps/operations of: (i) selecting between an acoustic-only data model and an acoustic-visual data model based on a condition associated with a visual environment; and (ii) decoding at least a portion of an input spoken utterance using the selected data model. Advantageously, during periods of degraded visual conditions, the audio-visual speech recognition system is able to decode (recognize) input speech data using audio-only data, thus avoiding recognition inaccuracies that may result from performing speech recognition based on acoustic-visual data models and degraded visual data.
Type: Application
Filed: June 23, 2003
Publication date: December 23, 2004
Applicant: International Business Machines Corporation
Inventors: Jonathan H. Connell, Norman Haas, Etienne Marcheret, Chalapathy Venkata Neti, Gerasimos Potamianos
-
Patent number: 6816836
Abstract: Techniques for providing speech recognition comprise the steps of processing a video signal associated with an arbitrary content video source, processing an audio signal associated with the video signal, and recognizing at least a portion of the processed audio signal, using at least a portion of the processed video signal, to generate an output signal representative of the audio signal.
Type: Grant
Filed: August 30, 2002
Date of Patent: November 9, 2004
Assignee: International Business Machines Corporation
Inventors: Sankar Basu, Philippe Christian de Cuetos, Stephane Herman Maes, Chalapathy Venkata Neti, Andrew William Senior
-
Publication number: 20040111432
Abstract: An apparatus and method for analyzing multimedia content to identify the presence of audio, visual and textual cues that together correspond to one or more high-level semantics are provided. The apparatus and method make use of one or more analysis models that are trained to analyze audio, visual and textual portions of multimedia content to generate scores associated with the audio, visual and textual portions with respect to various high-level semantic concepts. These scores are used to generate a vector of scores. The apparatus is trained with regard to relationships between audio, visual and textual scores to thereby take the vector of scores generated for the multimedia content and classify the multimedia content into one or more high-level semantic concepts. Based on the scores for the various audio, video and textual portions of the multimedia content, a level of certainty regarding the high-level semantic concepts may be generated.
Type: Application
Filed: December 10, 2002
Publication date: June 10, 2004
Applicant: International Business Machines Corporation
Inventors: Hugh William Adams, Giridharan Iyengar, Ching-Yung Lin, Milind R. Naphade, Chalapathy Venkata Neti, Harriet Jane Nock, John Richard Smith, Belle L. Tseng
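The score-vector fusion described in the abstract can be sketched with a weighted combination per concept. The weights, bias, and threshold are hypothetical stand-ins; the patent learns the cross-modality relationships from training data rather than fixing them by hand.

```python
# Sketch of fusing per-modality concept scores into a vector and
# classifying a high-level semantic concept, with a certainty level.

def concept_score(audio, visual, textual,
                  weights=(0.3, 0.5, 0.2), bias=0.0):
    """Weighted fusion of modality scores for one semantic concept."""
    vector = (audio, visual, textual)
    return sum(w * s for w, s in zip(weights, vector)) + bias

def classify(audio, visual, textual, threshold=0.5):
    """Decide concept presence and report a crude certainty level
    (distance of the fused score from the decision boundary)."""
    score = concept_score(audio, visual, textual)
    present = score >= threshold
    certainty = round(abs(score - threshold), 3)
    return present, certainty

print(classify(0.9, 0.8, 0.4))  # (True, 0.25)
```

In the patented apparatus the fused decision would be made jointly over many concepts, with one trained relationship model per concept.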
-
Patent number: 6594629
Abstract: In a first aspect of the invention, methods and apparatus for providing speech recognition comprise the steps of processing a video signal associated with an arbitrary content video source, processing an audio signal associated with the video signal, and decoding the processed audio signal in conjunction with the processed video signal to generate a decoded output signal representative of the audio signal. In a second aspect of the invention, methods and apparatus for providing speech detection in accordance with a speech recognition system comprise the steps of processing a video signal associated with a video source to detect whether one or more features associated with the video signal are representative of speech, and processing an audio signal associated with the video signal in accordance with the speech recognition system to generate a decoded output signal representative of the audio signal when the one or more features associated with the video signal are representative of speech.
Type: Grant
Filed: August 6, 1999
Date of Patent: July 15, 2003
Assignee: International Business Machines Corporation
Inventors: Sankar Basu, Philippe Christian de Cuetos, Stephane Herman Maes, Chalapathy Venkata Neti, Andrew William Senior
-
Publication number: 20030018475
Abstract: Techniques for providing speech recognition comprise the steps of processing a video signal associated with an arbitrary content video source, processing an audio signal associated with the video signal, and recognizing at least a portion of the processed audio signal, using at least a portion of the processed video signal, to generate an output signal representative of the audio signal.
Type: Application
Filed: August 30, 2002
Publication date: January 23, 2003
Applicant: International Business Machines Corporation
Inventors: Sankar Basu, Philippe Christian de Cuetos, Stephane Herman Maes, Chalapathy Venkata Neti, Andrew William Senior
-
Publication number: 20020152068
Abstract: Bootstrapping of a system from one language to another often works well when the two languages share a similar acoustic space. However, when the new language has sounds that do not occur in the language from which the bootstrapping is to be done, bootstrapping does not produce good initial models and the new language data is not properly aligned to these models. The present invention provides techniques to generate context-dependent labeling of the new language data using the recognition system of another language. This labeled data is then used to generate models for the new language phones.
Type: Application
Filed: February 22, 2001
Publication date: October 17, 2002
Applicant: International Business Machines Corporation
Inventors: Chalapathy Venkata Neti, Nitendra Rajput, L. Venkata Subramaniam, Ashish Verma
-
Publication number: 20020135618
Abstract: Systems and methods are provided for performing focus detection, referential ambiguity resolution and mood classification in accordance with multi-modal input data, in varying operating conditions, in order to provide an effective conversational computing environment for one or more users.
Type: Application
Filed: February 5, 2001
Publication date: September 26, 2002
Applicant: International Business Machines Corporation
Inventors: Stephane Herman Maes, Chalapathy Venkata Neti
-
Patent number: 6219640
Abstract: Methods and apparatus for performing speaker recognition comprise processing a video signal associated with an arbitrary content video source and processing an audio signal associated with the video signal. Then, an identification and/or verification decision is made based on the processed audio signal and the processed video signal. Various decision making embodiments may be employed including, but not limited to, a score combination approach, a feature combination approach, and a re-scoring approach. In another aspect of the invention, a method of verifying a speech utterance comprises processing a video signal associated with a video source and processing an audio signal associated with the video signal. Then, the processed audio signal is compared with the processed video signal to determine a level of correlation between the signals. This is referred to as unsupervised utterance verification.
Type: Grant
Filed: August 6, 1999
Date of Patent: April 17, 2001
Assignee: International Business Machines Corporation
Inventors: Sankar Basu, Homayoon S. M. Beigi, Stephane Herman Maes, Benoit Emmanuel Ghislain Maison, Chalapathy Venkata Neti, Andrew William Senior
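Of the decision-making embodiments listed, the score combination approach is the simplest to illustrate: per-speaker audio and video scores are blended before the identity decision. The speaker names, scores, and the 0.7 weight below are made up for the sketch.

```python
# Sketch of the score-combination approach to audio-visual speaker
# identification: linearly blend per-speaker modality scores, then
# pick the speaker with the highest combined score.

def combine_scores(audio_scores, video_scores, audio_weight=0.7):
    """Linearly combine per-speaker audio and video scores."""
    return {
        spk: audio_weight * audio_scores[spk]
             + (1.0 - audio_weight) * video_scores[spk]
        for spk in audio_scores
    }

def identify(audio_scores, video_scores):
    """Identification decision: the speaker with the best combined score."""
    combined = combine_scores(audio_scores, video_scores)
    return max(combined, key=combined.get)

audio = {"alice": 0.6, "bob": 0.9}
video = {"alice": 0.95, "bob": 0.2}
# Audio favours bob, but strong video evidence tips the decision to alice.
print(identify(audio, video))
```

The feature-combination and re-scoring embodiments would instead merge modalities earlier (at the feature level) or later (by re-ranking candidate decisions).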
-
Patent number: 5953701
Abstract: A method of gender dependent speech recognition includes the steps of identifying phone state models common to both genders, identifying gender specific phone state models, identifying a gender of a speaker and recognizing acoustic data from the speaker.
Type: Grant
Filed: January 22, 1998
Date of Patent: September 14, 1999
Assignee: International Business Machines Corporation
Inventors: Chalapathy Venkata Neti, Salim Estephan Roukos