Patents by Inventor Chalapathy Venkata Neti

Chalapathy Venkata Neti has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7295979
    Abstract: Bootstrapping of a system from one language to another often works well when the two languages share a similar acoustic space. However, when the new language has sounds that do not occur in the language from which the bootstrapping is done, bootstrapping does not produce good initial models and the new-language data is not properly aligned to these models. The present invention provides techniques to generate context-dependent labeling of the new-language data using the recognition system of another language. This labeled data is then used to generate models for the new-language phones.
    Type: Grant
    Filed: February 22, 2001
    Date of Patent: November 13, 2007
    Assignee: International Business Machines Corporation
    Inventors: Chalapathy Venkata Neti, Nitendra Rajput, L. Venkata Subramaniam, Ashish Verma
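
As a rough illustration of the bootstrapping idea in the abstract above, the Python sketch below relabels source-language alignments with new-language phones and estimates trivial per-phone statistics. The phone inventory, the PHONE_MAP table, and the align_with_source_recognizer stub are hypothetical stand-ins, not details taken from the patent.

```python
# Minimal sketch (not the patented method itself): bootstrapping acoustic models
# for a new language by labeling its data with a source-language recognizer.
# All names (phone inventories, align_with_source_recognizer) are hypothetical.

import statistics
from collections import defaultdict

# Hypothetical mapping from new-language phones to their closest source-language phones.
PHONE_MAP = {"AA_hi": "AA_en", "RX_hi": "R_en", "TT_hi": "T_en"}

def align_with_source_recognizer(utterance):
    """Stand-in for forced alignment with the source-language system.
    Returns (source_phone_label, feature_value) pairs per frame."""
    return utterance  # assume the data is already aligned for this sketch

def bootstrap_new_language_models(aligned_utterances):
    """Relabel source-language alignments with new-language phones and
    estimate a trivial one-dimensional Gaussian model per new phone."""
    inverse_map = {src: new for new, src in PHONE_MAP.items()}
    frames_per_phone = defaultdict(list)
    for utterance in aligned_utterances:
        for src_phone, feature in align_with_source_recognizer(utterance):
            new_phone = inverse_map.get(src_phone)
            if new_phone is not None:          # keep only mappable sounds
                frames_per_phone[new_phone].append(feature)
    # Initial models: mean and variance of the features assigned to each phone.
    return {
        phone: (statistics.mean(feats), statistics.pvariance(feats))
        for phone, feats in frames_per_phone.items() if len(feats) > 1
    }

if __name__ == "__main__":
    data = [[("AA_en", 1.0), ("R_en", 0.4)], [("AA_en", 1.2), ("T_en", 0.9)]]
    print(bootstrap_new_language_models(data))
```
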
  • Patent number: 7251603
    Abstract: Techniques for performing audio-visual speech recognition, with improved recognition performance, in a degraded visual environment. For example, in one aspect of the invention, a technique for use in accordance with an audio-visual speech recognition system for improving a recognition performance thereof includes the steps/operations of: (i) selecting between an acoustic-only data model and an acoustic-visual data model based on a condition associated with a visual environment; and (ii) decoding at least a portion of an input spoken utterance using the selected data model. Advantageously, during periods of degraded visual conditions, the audio-visual speech recognition system is able to decode (recognize) input speech data using audio-only data, thus avoiding recognition inaccuracies that may result from performing speech recognition based on acoustic-visual data models and degraded visual data.
    Type: Grant
    Filed: June 23, 2003
    Date of Patent: July 31, 2007
    Assignee: International Business Machines Corporation
    Inventors: Jonathan H. Connell, Norman Haas, Etienne Marcheret, Chalapathy Venkata Neti, Gerasimos Potamianos
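
A minimal sketch of the model-selection step described in the abstract above, assuming a hypothetical visual-quality score and stubbed decoders; it is an outline of falling back to an acoustic-only model when the visual channel is degraded, not the actual implementation.

```python
# A minimal illustration of switching between an audio-only and an
# audio-visual decoder based on visual-channel quality.
# The quality threshold and decoder stubs are assumptions for this sketch.

VISUAL_QUALITY_THRESHOLD = 0.6  # hypothetical face-tracking confidence cutoff

def decode_audio_only(audio):
    return f"audio-only hypothesis for {len(audio)} frames"

def decode_audio_visual(audio, video):
    return f"audio-visual hypothesis for {len(audio)} frames"

def recognize(audio_frames, video_frames, visual_quality):
    """Pick the data model per the measured visual condition, then decode."""
    if visual_quality < VISUAL_QUALITY_THRESHOLD or not video_frames:
        # Degraded or missing visual input: fall back to the acoustic-only model.
        return decode_audio_only(audio_frames)
    return decode_audio_visual(audio_frames, video_frames)

if __name__ == "__main__":
    print(recognize(audio_frames=[0.1] * 100, video_frames=[], visual_quality=0.2))
    print(recognize(audio_frames=[0.1] * 100, video_frames=[1] * 25, visual_quality=0.9))
```
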
  • Patent number: 6964023
    Abstract: Systems and methods are provided for performing focus detection, referential ambiguity resolution and mood classification in accordance with multi-modal input data, in varying operating conditions, in order to provide an effective conversational computing environment for one or more users.
    Type: Grant
    Filed: February 5, 2001
    Date of Patent: November 8, 2005
    Assignee: International Business Machines Corporation
    Inventors: Stephane Herman Maes, Chalapathy Venkata Neti
  • Publication number: 20040260554
    Abstract: Techniques for performing audio-visual speech recognition, with improved recognition performance, in a degraded visual environment. For example, in one aspect of the invention, a technique for use in accordance with an audio-visual speech recognition system for improving a recognition performance thereof includes the steps/operations of: (i) selecting between an acoustic-only data model and an acoustic-visual data model based on a condition associated with a visual environment; and (ii) decoding at least a portion of an input spoken utterance using the selected data model. Advantageously, during periods of degraded visual conditions, the audio-visual speech recognition system is able to decode (recognize) input speech data using audio-only data, thus avoiding recognition inaccuracies that may result from performing speech recognition based on acoustic-visual data models and degraded visual data.
    Type: Application
    Filed: June 23, 2003
    Publication date: December 23, 2004
    Applicant: International Business Machines Corporation
    Inventors: Jonathan H. Connell, Norman Haas, Etienne Marcheret, Chalapathy Venkata Neti, Gerasimos Potamianos
  • Patent number: 6816836
    Abstract: Techniques for providing speech recognition comprise the steps of processing a video signal associated with an arbitrary content video source, processing an audio signal associated with the video signal, and recognizing at least a portion of the processed audio signal, using at least a portion of the processed video signal, to generate an output signal representative of the audio signal.
    Type: Grant
    Filed: August 30, 2002
    Date of Patent: November 9, 2004
    Assignee: International Business Machines Corporation
    Inventors: Sankar Basu, Philippe Christian de Cuetos, Stephane Herman Maes, Chalapathy Venkata Neti, Andrew William Senior
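
The sketch below illustrates one plausible reading of recognizing the processed audio signal "using at least a portion of the processed video signal": per-frame acoustic and visual features are concatenated and scored against word prototypes. The feature extractors, prototypes, and nearest-prototype scoring are assumptions for illustration only.

```python
# Rough sketch (assumptions throughout) of fusing acoustic and visual features
# before decoding, as in audio-visual speech recognition: per-frame audio
# features are concatenated with mouth-region features and scored jointly.

import numpy as np

def extract_audio_features(audio_frame):
    # Stand-in for e.g. cepstral features.
    return np.asarray(audio_frame, dtype=float)

def extract_visual_features(video_frame):
    # Stand-in for e.g. mouth-region appearance features.
    return np.asarray(video_frame, dtype=float)

def fuse_and_score(audio_frames, video_frames, models):
    """Concatenate the two feature streams frame by frame and pick the
    model (word) whose prototype vector is closest on average."""
    fused = [
        np.concatenate([extract_audio_features(a), extract_visual_features(v)])
        for a, v in zip(audio_frames, video_frames)
    ]
    def avg_distance(prototype):
        return float(np.mean([np.linalg.norm(f - prototype) for f in fused]))
    return min(models, key=lambda word: avg_distance(models[word]))

if __name__ == "__main__":
    models = {"yes": np.array([1.0, 1.0, 0.5]), "no": np.array([0.0, 0.0, 0.0])}
    audio = [[0.9, 1.1], [1.0, 0.9]]
    video = [[0.4], [0.6]]
    print(fuse_and_score(audio, video, models))  # -> "yes"
```
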
  • Publication number: 20040111432
    Abstract: An apparatus and method for analyzing multimedia content to identify the presence of audio, visual and textual cues that together correspond to one or more high-level semantics are provided. The apparatus and method make use of one or more analysis models that are trained to analyze audio, visual and textual portions of multimedia content to generate scores associated with the audio, visual and textual portions with respect to various high-level semantic concepts. These scores are used to generate a vector of scores. The apparatus is trained with regard to relationships between audio, visual and textual scores to thereby take the vector of scores generated for the multimedia content and classify the multimedia content into one or more high-level semantic concepts. Based on the scores for the various audio, visual and textual portions of the multimedia content, a level of certainty regarding the high-level semantic concepts may be generated.
    Type: Application
    Filed: December 10, 2002
    Publication date: June 10, 2004
    Applicant: International Business Machines Corporation
    Inventors: Hugh William Adams, Giridharan Iyengar, Ching-Yung Lin, Milind R. Naphade, Chalapathy Venkata Neti, Harriet Jane Nock, John Richard Smith, Belle L. Tseng
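
A toy sketch of the two-stage flow described in the abstract above: modality-specific analyzers score the content against semantic concepts, the scores form a vector, and a fusion stage classifies it. The concept list, the per-modality scores, and the weighted-average fusion are invented stand-ins for the trained models the abstract refers to.

```python
# Illustrative sketch (hypothetical names and weights) of the two-stage idea:
# modality-specific analyzers score the audio, visual and textual streams
# against semantic concepts, and a fusion stage classifies the concatenated
# score vector into high-level concepts with a confidence.

CONCEPTS = ["outdoor", "sports", "speech"]

def score_audio(audio):   return {"outdoor": 0.2, "sports": 0.7, "speech": 0.9}
def score_visual(video):  return {"outdoor": 0.8, "sports": 0.6, "speech": 0.1}
def score_text(captions): return {"outdoor": 0.1, "sports": 0.9, "speech": 0.3}

# Hypothetical learned fusion weights per modality.
FUSION_WEIGHTS = {"audio": 0.3, "visual": 0.4, "text": 0.3}

def classify(audio, video, captions, threshold=0.5):
    """Build the score vector and return concepts whose fused score
    (a simple weighted average here) clears the threshold."""
    per_modality = {
        "audio": score_audio(audio),
        "visual": score_visual(video),
        "text": score_text(captions),
    }
    fused = {
        concept: sum(FUSION_WEIGHTS[m] * per_modality[m][concept] for m in per_modality)
        for concept in CONCEPTS
    }
    return {c: round(s, 2) for c, s in fused.items() if s >= threshold}

if __name__ == "__main__":
    print(classify(audio=None, video=None, captions=None))  # -> {'sports': 0.72}
```
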
  • Patent number: 6594629
    Abstract: In a first aspect of the invention, methods and apparatus for providing speech recognition comprise the steps of processing a video signal associated with an arbitrary content video source, processing an audio signal associated with the video signal, and decoding the processed audio signal in conjunction with the processed video signal to generate a decoded output signal representative of the audio signal. In a second aspect of the invention, methods and apparatus for providing speech detection in accordance with a speech recognition system comprise the steps of processing a video signal associated with a video source to detect whether one or more features associated with the video signal are representative of speech, and processing an audio signal associated with the video signal in accordance with the speech recognition system to generate a decoded output signal representative of the audio signal when the one or more features associated with the video signal are representative of speech.
    Type: Grant
    Filed: August 6, 1999
    Date of Patent: July 15, 2003
    Assignee: International Business Machines Corporation
    Inventors: Sankar Basu, Philippe Christian de Cuetos, Stephane Herman Maes, Chalapathy Venkata Neti, Andrew William Senior
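
The second aspect of the abstract above, visual speech detection gating the recognizer, might look roughly like the sketch below; the lip-motion measure and threshold are assumptions, not the patented features.

```python
# A small sketch of using visual features (here, a crude lip-motion measure)
# to decide whether the audio should be passed to the recognizer at all.
# Thresholds and feature choices are assumed for illustration.

MOTION_THRESHOLD = 0.15  # hypothetical minimum mouth-region change per frame

def mouth_motion(video_frames):
    """Mean absolute change of a scalar mouth-opening measure between frames."""
    diffs = [abs(b - a) for a, b in zip(video_frames, video_frames[1:])]
    return sum(diffs) / len(diffs) if diffs else 0.0

def recognize_if_speaking(audio_frames, video_frames, recognizer):
    """Decode the audio only when the visual evidence suggests speech."""
    if mouth_motion(video_frames) >= MOTION_THRESHOLD:
        return recognizer(audio_frames)
    return None  # visual features not speech-like; skip decoding

if __name__ == "__main__":
    recognizer = lambda audio: f"decoded {len(audio)} frames"
    print(recognize_if_speaking([0.2] * 50, [0.1, 0.5, 0.1, 0.6], recognizer))  # speech
    print(recognize_if_speaking([0.2] * 50, [0.1, 0.1, 0.1, 0.1], recognizer))  # silence
```
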
  • Publication number: 20030018475
    Abstract: Techniques for providing speech recognition comprise the steps of processing a video signal associated with an arbitrary content video source, processing an audio signal associated with the video signal, and recognizing at least a portion of the processed audio signal, using at least a portion of the processed video signal, to generate an output signal representative of the audio signal.
    Type: Application
    Filed: August 30, 2002
    Publication date: January 23, 2003
    Applicant: International Business Machines Corporation
    Inventors: Sankar Basu, Philippe Christian de Cuetos, Stephane Herman Maes, Chalapathy Venkata Neti, Andrew William Senior
  • Publication number: 20020152068
    Abstract: Bootstrapping of a system from one language to another often works well when the two languages share a similar acoustic space. However, when the new language has sounds that do not occur in the language from which the bootstrapping is done, bootstrapping does not produce good initial models and the new-language data is not properly aligned to these models. The present invention provides techniques to generate context-dependent labeling of the new-language data using the recognition system of another language. This labeled data is then used to generate models for the new-language phones.
    Type: Application
    Filed: February 22, 2001
    Publication date: October 17, 2002
    Applicant: International Business Machines Corporation
    Inventors: Chalapathy Venkata Neti, Nitendra Rajput, L. Venkata Subramaniam, Ashish Verma
  • Publication number: 20020135618
    Abstract: Systems and methods are provided for performing focus detection, referential ambiguity resolution and mood classification in accordance with multi-modal input data, in varying operating conditions, in order to provide an effective conversational computing environment for one or more users.
    Type: Application
    Filed: February 5, 2001
    Publication date: September 26, 2002
    Applicant: International Business Machines Corporation
    Inventors: Stephane Herman Maes, Chalapathy Venkata Neti
  • Patent number: 6219640
    Abstract: Methods and apparatus for performing speaker recognition comprise processing a video signal associated with an arbitrary content video source and processing an audio signal associated with the video signal. Then, an identification and/or verification decision is made based on the processed audio signal and the processed video signal. Various decision making embodiments may be employed including, but not limited to, a score combination approach, a feature combination approach, and a re-scoring approach. In another aspect of the invention, a method of verifying a speech utterance comprises processing a video signal associated with a video source and processing an audio signal associated with the video signal. Then, the processed audio signal is compared with the processed video signal to determine a level of correlation between the signals. This is referred to as unsupervised utterance verification.
    Type: Grant
    Filed: August 6, 1999
    Date of Patent: April 17, 2001
    Assignee: International Business Machines Corporation
    Inventors: Sankar Basu, Homayoon S. M. Beigi, Stephane Herman Maes, Benoit Emmanuel Ghislain Maison, Chalapathy Venkata Neti, Andrew William Senior
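
Of the decision-making approaches listed in the abstract above, the score-combination one is the simplest to sketch: fuse an acoustic speaker score with a face score and threshold the result. The weights and threshold below are hypothetical, and the feature-combination and re-scoring variants are not shown.

```python
# Sketch of a score-combination decision for audio-visual speaker verification,
# with hypothetical scores, weights, and threshold.

AUDIO_WEIGHT, VIDEO_WEIGHT = 0.6, 0.4   # assumed fusion weights
ACCEPT_THRESHOLD = 0.5                   # assumed verification threshold

def verify_speaker(audio_score, video_score):
    """Fuse the acoustic speaker score with the face score and accept the
    claimed identity if the combined score clears the threshold."""
    combined = AUDIO_WEIGHT * audio_score + VIDEO_WEIGHT * video_score
    return combined >= ACCEPT_THRESHOLD, round(combined, 3)

if __name__ == "__main__":
    print(verify_speaker(audio_score=0.7, video_score=0.4))  # (True, 0.58)
    print(verify_speaker(audio_score=0.3, video_score=0.2))  # (False, 0.26)
```
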
  • Patent number: 5953701
    Abstract: A method of gender-dependent speech recognition includes the steps of identifying phone state models common to both genders, identifying gender-specific phone state models, identifying the gender of a speaker, and recognizing acoustic data from the speaker.
    Type: Grant
    Filed: January 22, 1998
    Date of Patent: September 14, 1999
    Assignee: International Business Machines Corporation
    Inventors: Chalapathy Venkata Neti, Salim Estephan Roukos
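
A minimal sketch of the flow in the abstract above, with an invented pitch-based gender classifier and made-up model tables: identify the speaker's gender, then recognize with the shared phone state models plus the matching gender-specific ones.

```python
# Minimal sketch (all model tables are invented) of gender-dependent recognition:
# identify the speaker's gender, then recognize using phone state models that
# are shared across genders plus the matching gender-specific models.

SHARED_MODELS = {"sil": "shared-sil", "f": "shared-f"}
GENDER_SPECIFIC_MODELS = {
    "female": {"aa": "aa-female", "iy": "iy-female"},
    "male":   {"aa": "aa-male",   "iy": "iy-male"},
}

def identify_gender(acoustic_data):
    """Stand-in for a gender classifier (here, a crude mean-pitch rule in Hz)."""
    return "female" if sum(acoustic_data) / len(acoustic_data) > 180 else "male"

def recognize(acoustic_data):
    """Pick the model inventory for the identified gender and 'decode'."""
    gender = identify_gender(acoustic_data)
    models = {**SHARED_MODELS, **GENDER_SPECIFIC_MODELS[gender]}
    return gender, sorted(models.values())

if __name__ == "__main__":
    print(recognize([210, 220, 190]))  # higher pitch -> female inventory
    print(recognize([110, 120, 100]))  # lower pitch  -> male inventory
```
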