Patents by Inventor Xavier Menendez-Pidal

Xavier Menendez-Pidal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

System and method for Mandarin Chinese speech recognition using an optimized phone set

Patent number: 7353173

Abstract: The present invention comprises a system and method for implementing a Mandarin Chinese speech recognizer with an optimized phone set, and may include a recognizer configured to compare input speech data to phone strings from a vocabulary dictionary that is implemented according to an optimized Mandarin Chinese phone set. The optimized Mandarin Chinese phone set may be implemented with a phonetic technique to separately include consonantal phones and vocalic phones. For reasons of system efficiency, the optimized Mandarin Chinese phone set may preferably be implemented in a compact manner to include only a minimum required number of consonantal phones and vocalic phones to accurately represent Mandarin Chinese speech during the speech recognition procedure.

Type: Grant

Filed: March 31, 2003

Date of Patent: April 1, 2008

Assignees: Sony Corporation, Sony Electronics Inc.

Inventors: Xavier Menendez-Pidal, Lei Duan, Jingwen Lu, Lex Olorenshaw
System and method for effectively implementing a Mandarin Chinese speech recognition dictionary

Patent number: 7353174

Abstract: The present invention comprises a system and method for effectively implementing a Mandarin Chinese speech recognition dictionary, and may include a recognizer configured to compare input speech data to phone strings from a vocabulary dictionary that is implemented according to an optimized Mandarin Chinese phone set. The optimized Mandarin Chinese phone set may efficiently be implemented by utilizing an allophone and phonemic variation technique. In addition, the foregoing vocabulary dictionary may be implemented by utilizing unified dictionary optimization techniques to provide robust and accurate speech recognition. Furthermore, the vocabulary dictionary may be implemented as an optimized dictionary to accurately recognize either Northern Mandarin Chinese speech or Southern Mandarin Chinese speech during the speech recognition procedure.

Type: Grant

Filed: March 31, 2003

Date of Patent: April 1, 2008

Assignees: Sony Corporation, Sony Electronics Inc.

Inventors: Xavier Menendez-Pidal, Lei Duan, Jingwen Lu, Lex Olorenshaw
System and method for Cantonese speech recognition using an optimized phone set

Patent number: 7353172

Abstract: The present invention comprises a system and method for implementing a Cantonese speech recognizer with an optimized phone set, and may include a recognizer configured to compare input speech data to phone strings from a vocabulary dictionary that is implemented according to an optimized Cantonese phone set. The optimized Cantonese phone set may be implemented with a phonetic technique to separately include consonantal phones and vocalic phones. For reasons of system efficiency, the optimized Cantonese phone set may preferably be implemented in a compact manner to include only a minimum required number of consonantal phones and vocalic phones to accurately represent Cantonese speech during the speech recognition procedure.

Type: Grant

Filed: March 24, 2003

Date of Patent: April 1, 2008

Assignees: Sony Corporation, Sony Electronics Inc.

Inventors: Michael Emonts, Xavier Menendez-Pidal, Lex Olorenshaw
Methodology for performing a refinement procedure to implement a speech recognition dictionary

Patent number: 7272560

Abstract: A system and method for performing a refinement procedure to effectively implement a speech recognition dictionary for spontaneous speech recognition may include a problematic word identifier configured to divide vocabulary words from an initial speech recognition dictionary into problematic words and non-problematic words according to pre-defined identification criteria. A candidate generator may analyze the problematic words to produce one or more pronunciation candidates for each of the problematic words. An optimization module may then perform an optimization process for refining one or more pronunciation candidates according to certain optimization criteria to thereby generate optimized problematic pronunciations. A dictionary refinement manager may finally combine the optimized problematic pronunciations with non-problematic pronunciations of the non-problematic words to produce a refined speech recognition dictionary for use by the speech recognition system.

Type: Grant

Filed: March 22, 2004

Date of Patent: September 18, 2007

Assignees: Sony Corporation, Sony Electronics Inc.

Inventors: Gustavo Hernandez Abrego, Xavier Menendez-Pidal, Lex Olorenshaw
Audio, video, simulation, and user interface paradigms

Publication number: 20070061142

Abstract: Consumer electronic devices have been developed with enormous information processing capabilities, high quality audio and video outputs, large amounts of memory, and may also include wired and/or wireless networking capabilities. Additionally, relatively unsophisticated and inexpensive sensors, such as microphones, video cameras, GPS or other position sensors, when coupled with devices having these enhanced capabilities, can be used to detect subtle features about users and their environments. A variety of audio, video, simulation and user interface paradigms have been developed to utilize the enhanced capabilities of these devices. These paradigms can be used separately or together in any combination. One paradigm automatically creates user identities using speaker identification. Another paradigm includes a control button with 3-axis pressure sensitivity for use with game controllers and other input devices.

Type: Application

Filed: September 15, 2006

Publication date: March 15, 2007

Applicant: Sony Computer Entertainment Inc.

Inventors: Gustavo Hernandez-Abrego, Xavier Menendez-Pidal, Steven Osman, Ruxin Chen, Rishi Deshpande, Care Michaud-Wideman, Richard Marks, Eric Larsen, Xiaodong Mao
System and method for speech recognition utilizing a merged dictionary

Patent number: 7181396

Abstract: The present invention comprises a system and method for speech recognition utilizing a merged dictionary, and may include a recognizer that is configured to compare input speech data to a series of dictionary entries from the merged dictionary to detect a recognized phrase or command. The merged dictionary may be implemented by utilizing a merging technique that maps two or more related phrases or commands with similar meanings to a single one of the dictionary entries. The recognizer may thus achieve more accurate speech recognition by merging phrases or commands which might otherwise be erroneously mistaken for each other.

Type: Grant

Filed: March 24, 2003

Date of Patent: February 20, 2007

Assignees: Sony Corporation, Sony Electronics Inc.

Inventors: Michael Emonts, Xavier Menendez-Pidal, Lex Olorenshaw
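
As a rough illustration of the merging technique described in the abstract above, the following Python sketch maps several related phrases to a single dictionary entry. The phrases, the entry names, and the `recognize` helper are hypothetical; they stand in for the recognizer's real scoring, which the abstract does not specify.

```python
# Hypothetical merged dictionary: related phrases with similar meanings
# share a single entry, so confusing them with each other is harmless.
MERGED_DICTIONARY = {
    "HELLO": ["hello", "hi", "hey there"],
    "GOODBYE": ["goodbye", "bye", "see you"],
}

# Invert the dictionary so each surface phrase points at its merged entry.
PHRASE_TO_ENTRY = {
    phrase: entry
    for entry, phrases in MERGED_DICTIONARY.items()
    for phrase in phrases
}


def recognize(scores):
    """Return the merged entry for the best-scoring phrase.

    `scores` maps candidate phrases to recognition scores (higher is better);
    a real recognizer would derive these from the input speech data.
    """
    best_phrase = max(scores, key=scores.get)
    return PHRASE_TO_ENTRY[best_phrase]
```

With scores such as `{"hi": 0.41, "hello": 0.40, "bye": 0.05}`, the two near-tied candidates resolve to the same entry, so the close race between them no longer produces a recognition error.
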
System and method for speech verification using a robust confidence measure

Patent number: 7103543

Abstract: The present invention comprises a system and method for speech verification using a robust confidence measure, and includes a speech verifier which compares a confidence measure for a recognized word to a predetermined threshold value in order to determine whether the recognized word is valid, where a recognized word corresponds to a word model that produces a highest recognition score. In accordance with the present invention, the foregoing confidence measure may be calculated using the recognition score for the recognized word, a background score of a worst recognition candidate, and a pseudo filler score that may be based upon selected average recognition scores from an N-best list of recognition candidates.

Type: Grant

Filed: August 13, 2002

Date of Patent: September 5, 2006

Assignees: Sony Corporation, Sony Electronics Inc.

Inventors: Gustavo Hernandez-Abrego, Xavier Menendez-Pidal
System and method for utilizing distance measures to perform text classification

Publication number: 20060142993

Abstract: A system and method for utilizing distance measures to perform text classification includes text classification categories that each have reference models of reference N-grams. Input text that includes input N-grams is accessed for performing the text classification. A text classifier calculates distance measures between the input N-grams and the reference N-grams. The text classifier then utilizes the distance measures to identify a matching category for the input text. In certain embodiments, a verification module performs a verification procedure to determine whether the initially-selected matching category is a valid classification result for the text classification.

Type: Application

Filed: December 28, 2004

Publication date: June 29, 2006

Inventors: Xavier Menendez-Pidal, Lei Duan, Michael Emonts
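
The classification flow in the abstract above (reference N-grams per category, a distance measure, then a verification step) might be sketched as follows. This is a minimal Python sketch under stated assumptions: word bigrams, cosine distance, the `max_distance` threshold, and the two example categories are all illustrative choices, not details taken from the application.

```python
import math
from collections import Counter


def ngrams(text, n=2):
    """Count word N-grams (bigrams by default) in a piece of text."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def cosine_distance(a, b):
    """Cosine distance between two N-gram count vectors (assumed measure)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return 1.0 if norm == 0 else 1.0 - dot / norm


def classify(text, reference_models, max_distance=0.9):
    """Pick the closest category, then verify it against a threshold.

    Returns None when even the best category is too far away, which plays
    the role of the verification procedure mentioned in the abstract.
    """
    input_grams = ngrams(text)
    distances = {
        category: cosine_distance(input_grams, model)
        for category, model in reference_models.items()
    }
    best = min(distances, key=distances.get)
    return best if distances[best] <= max_distance else None


# Hypothetical reference models built from one example sentence per category.
REFERENCE_MODELS = {
    "weather": ngrams("what is the weather forecast for today"),
    "music": ngrams("play the next song on the album"),
}
```

Here `classify("what is the weather today", REFERENCE_MODELS)` selects the `weather` category, while input that shares no bigrams with any reference model is rejected by the threshold check.
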
Methodology for generating enhanced demiphone acoustic models for speech recognition

Publication number: 20060136209

Abstract: A system and method for effectively performing speech recognition procedures includes enhanced demiphone acoustic models that a speech recognition engine utilizes to perform the speech recognition procedures. The enhanced demiphone acoustic models each have three states that are collectively arranged to form a preceding demiphone and a succeeding demiphone. An acoustic model generator may utilize a decision tree for analyzing speech context information from a training database. The acoustic model generator then effectively configures each of the enhanced demiphone acoustic models as either a succeeding-dominant enhanced demiphone acoustic model or a preceding-dominant enhanced demiphone acoustic model to accurately model speech characteristics.

Type: Application

Filed: December 16, 2004

Publication date: June 22, 2006

Inventors: Xavier Menendez-Pidal, Lex Olorenshaw, Gustavo Abrego
System and method for tying variance vectors for speech recognition

Publication number: 20060136210

Abstract: A system and method for implementing a speech recognition engine includes acoustic models that the speech recognition engine utilizes to perform speech recognition procedures. An acoustic model optimizer performs a vector quantization procedure upon original variance vectors initially associated with the acoustic models. In certain embodiments, the vector quantization procedure may be performed as a block vector quantization procedure or as a subgroup vector quantization procedure. The vector quantization procedure produces a reduced number of tied variance vectors for optimally implementing the acoustic models.

Type: Application

Filed: December 16, 2004

Publication date: June 22, 2006

Inventors: Xavier Menendez-Pidal, Ajay Patrikar
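
To make the idea of tied variance vectors concrete, here is a small Python sketch that quantizes a set of variance vectors into a smaller codebook. Plain k-means is an assumption standing in for the vector quantization procedure; the block and subgroup variants named in the abstract are not shown, and the function names are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def tie_variance_vectors(variances, codebook_size, iterations=20):
    """Quantize variance vectors into `codebook_size` tied vectors.

    Returns (codebook, assignment), where assignment[i] is the index of the
    tied vector that replaces variances[i]. Plain k-means is an assumption
    here; the abstract only says "vector quantization".
    """
    # Deterministic initialization: seed the codebook with the first vectors.
    codebook = [list(v) for v in variances[:codebook_size]]
    assignment = [0] * len(variances)
    for _ in range(iterations):
        # Assign each original vector to its nearest codebook entry.
        assignment = [
            min(range(len(codebook)),
                key=lambda k: squared_distance(v, codebook[k]))
            for v in variances
        ]
        # Move each codebook entry to the mean of its assigned vectors.
        for k in range(len(codebook)):
            members = [v for v, a in zip(variances, assignment) if a == k]
            if members:
                codebook[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return codebook, assignment
```

After tying, each acoustic model would store only an index into the codebook instead of its own variance vector, which is where the reduction in stored vectors comes from.
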
Supervised automatic text generation based on word classes for language modeling

Patent number: 7035789

Abstract: A system and method is provided that randomly generates text with a given structure. The structure is taken from a number of learning examples. The structure of training examples is captured by word classification and the definition of the relationships between word classes in a given language. The text generated with this procedure is intended to replicate the information given by the original learning examples. The resulting text may be used to better model the structure of a language in a stochastic language model.

Type: Grant

Filed: September 4, 2001

Date of Patent: April 25, 2006

Assignees: Sony Corporation, Sony Electronics, Inc.

Inventors: Gustavo Hernandez Abrego, Xavier Menendez-Pidal
System and method for effectively implementing an optimized language model for speech recognition

Publication number: 20050228667

Abstract: A system and method for effectively implementing an optimized language model for speech recognition includes initial language models each created by combining source models according to selectable interpolation coefficients that define proportional relationships for combining the source models. A rescoring module iteratively utilizes the initial language models to process input development data for calculating word-error rates that each correspond to a different one of the initial language models. An optimized language model is then selected from the initial language models by identifying an optimal word-error rate from among the foregoing word-error rates. The speech recognizer may then utilize the optimized language model for effectively performing various speech recognition procedures.

Type: Application

Filed: March 30, 2004

Publication date: October 13, 2005

Inventors: Lei Duan, Gustavo Abrego, Xavier Menendez-Pidal, Lex Olorenshaw
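
The selection loop described in the abstract above can be sketched in Python as follows: build one candidate model per set of interpolation coefficients, rescore development data with each, and keep the candidate with the lowest word-error rate. Unigram source models, N-best rescoring by log probability, and all function names are assumptions made for illustration; the application does not specify them.

```python
import math


def interpolate(source_models, coefficients):
    """Linearly combine unigram source models with the given coefficients."""
    vocabulary = set().union(*source_models)
    return {
        w: sum(c * m.get(w, 0.0) for c, m in zip(coefficients, source_models))
        for w in vocabulary
    }


def word_errors(reference, hypothesis):
    """Word-level edit distance between two lists of words."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            current.append(min(
                previous[j] + 1,                           # deletion
                current[j - 1] + 1,                        # insertion
                previous[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        previous = current
    return previous[-1]


def word_error_rate(model, development_data):
    """Rescore each N-best list with `model` and measure the resulting WER."""
    errors = total = 0
    for reference, n_best in development_data:
        # A small floor keeps unseen words from producing log(0).
        best = max(n_best, key=lambda hyp: sum(
            math.log(model.get(w, 0.0) + 1e-9) for w in hyp.split()))
        errors += word_errors(reference.split(), best.split())
        total += len(reference.split())
    return errors / total


def select_language_model(source_models, coefficient_sets, development_data):
    """Return the coefficient set with the lowest WER, and that WER."""
    candidates = [interpolate(source_models, c) for c in coefficient_sets]
    rates = [word_error_rate(m, development_data) for m in candidates]
    best = min(range(len(rates)), key=rates.__getitem__)
    return coefficient_sets[best], rates[best]
```

Each entry in `development_data` is a `(reference, n_best)` pair, where `n_best` is a list of competing hypothesis strings for one utterance. Ties between coefficient sets resolve to the earliest one in the list.
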
Methodology for performing a refinement procedure to implement a speech recognition dictionary

Publication number: 20050209854

Abstract: A system and method for performing a refinement procedure to effectively implement a speech recognition dictionary for spontaneous speech recognition may include a problematic word identifier configured to divide vocabulary words from an initial speech recognition dictionary into problematic words and non-problematic words according to pre-defined identification criteria. A candidate generator may analyze the problematic words to produce one or more pronunciation candidates for each of the problematic words. An optimization module may then perform an optimization process for refining one or more pronunciation candidates according to certain optimization criteria to thereby generate optimized problematic pronunciations. A dictionary refinement manager may finally combine the optimized problematic pronunciations with non-problematic pronunciations of the non-problematic words to produce a refined speech recognition dictionary for use by the speech recognition system.

Type: Application

Filed: March 22, 2004

Publication date: September 22, 2005

Inventors: Gustavo Abrego, Xavier Menendez-Pidal, Lex Olorenshaw
System and method for automatically cataloguing data by utilizing speech recognition procedures

Publication number: 20050209849

Abstract: A system and method for automatically cataloguing data by utilizing speech recognition procedures includes an electronic device that captures audio/video data and corresponding verbal narration. A speech recognition engine coupled to the electronic device automatically performs a speech recognition process upon the audio/video data and verbal narration to generate labels that correspond to respective subject matter locations in the audio/video data. A label manager of the electronic device manages a label mode for generating and storing the foregoing labels. The label manager also controls a label search mode during which a system user utilizes the labels to automatically locate corresponding subject matter locations in the captured audio/video data.

Type: Application

Filed: March 22, 2004

Publication date: September 22, 2005

Inventors: Gustavo Abrego, Lex Olorenshaw, Lei Duan, Xavier Menendez-Pidal
System and method for performing speech recognition by utilizing a multi-language dictionary

Publication number: 20050038654

Abstract: The present invention comprises a system and method for speech recognition utilizing a multi-language dictionary, and may include a recognizer that is configured to compare input speech data to a series of dictionary entries from the multi-language dictionary to detect a recognized phrase or command. The multi-language dictionary may be implemented with a mixed-language technique that utilizes dictionary entries which incorporate multiple different languages such as Cantonese and English. The speech recognizer may thus advantageously achieve more accurate speech recognition in an efficient and compact manner.

Type: Application

Filed: August 11, 2003

Publication date: February 17, 2005

Inventors: Michael Emonts, Xavier Menendez-Pidal, Lex Olorenshaw
System and method for speech verification using an efficient confidence measure

Patent number: 6850886

Abstract: The present invention comprises a system and method for speech verification using an efficient confidence measure, and includes a speech verifier which compares a confidence measure for a recognized word to a predetermined threshold value in order to determine whether the recognized word is valid, where a recognized word corresponds to a word model that produces a highest recognition score. In accordance with the present invention, the foregoing confidence measure may be calculated using the recognition score for the recognized word and a pseudo filler score that may be based upon selected average recognition scores from an N-best list of recognition candidates.

Type: Grant

Filed: May 31, 2001

Date of Patent: February 1, 2005

Assignees: Sony Corporation, Sony Electronics Inc.

Inventors: Gustavo Hernandez Abrego, Xavier Menendez-Pidal
Weighted frequency-channel background noise suppressor

Patent number: 6826528

Abstract: A method for implementing a noise suppressor in a speech recognition system comprises a filter bank for separating source speech data into discrete frequency sub-bands to generate filtered channel energy, and a noise suppressor for weighting the frequency sub-bands to improve the signal-to-noise ratio of the resultant noise-suppressed channel energy. The noise suppressor preferably includes a noise calculator for calculating background noise values, a speech energy calculator for calculating speech energy values for each channel of the filter bank, and a weighting module for applying calculated weighting values to the projected channel energy to generate the noise-suppressed channel energy.

Type: Grant

Filed: October 18, 2000

Date of Patent: November 30, 2004

Assignees: Sony Corporation, Sony Electronics Inc.

Inventors: Duanpei Wu, Miyuki Tanaka, Xavier Menendez-Pidal
System and method for Cantonese speech recognition using an optimized phone set

Publication number: 20040193418

Abstract: The present invention comprises a system and method for implementing a Cantonese speech recognizer with an optimized phone set, and may include a recognizer configured to compare input speech data to phone strings from a vocabulary dictionary that is implemented according to an optimized Cantonese phone set. The optimized Cantonese phone set may be implemented with a phonetic technique to separately include consonantal phones and vocalic phones. For reasons of system efficiency, the optimized Cantonese phone set may preferably be implemented in a compact manner to include only a minimum required number of consonantal phones and vocalic phones to accurately represent Cantonese speech during the speech recognition procedure.

Type: Application

Filed: March 24, 2003

Publication date: September 30, 2004

Applicant: Sony Corporation and Sony Electronics Inc.

Inventors: Michael Emonts, Xavier Menendez-Pidal, Lex Olorenshaw
System and method for effectively implementing a Mandarin Chinese speech recognition dictionary

Publication number: 20040193417

Abstract: The present invention comprises a system and method for effectively implementing a Mandarin Chinese speech recognition dictionary, and may include a recognizer configured to compare input speech data to phone strings from a vocabulary dictionary that is implemented according to an optimized Mandarin Chinese phone set. The optimized Mandarin Chinese phone set may efficiently be implemented by utilizing an allophone and phonemic variation technique. In addition, the foregoing vocabulary dictionary may be implemented by utilizing unified dictionary optimization techniques to provide robust and accurate speech recognition. Furthermore, the vocabulary dictionary may be implemented as an optimized dictionary to accurately recognize either Northern Mandarin Chinese speech or Southern Mandarin Chinese speech during the speech recognition procedure.

Type: Application

Filed: March 31, 2003

Publication date: September 30, 2004

Inventors: Xavier Menendez-Pidal, Lei Duan, Jingwen Lu, Lex Olorenshaw
System and method for speech recognition utilizing a merged dictionary

Publication number: 20040193416

Abstract: The present invention comprises a system and method for speech recognition utilizing a merged dictionary, and may include a recognizer that is configured to compare input speech data to a series of dictionary entries from the merged dictionary to detect a recognized phrase or command. The merged dictionary may be implemented by utilizing a merging technique that maps two or more related phrases or commands with similar meanings to a single one of the dictionary entries. The recognizer may thus achieve more accurate speech recognition by merging phrases or commands which might otherwise be erroneously mistaken for each other.

Type: Application

Filed: March 24, 2003

Publication date: September 30, 2004

Applicants: Sony Corporation, Sony Electronics Inc.

Inventors: Michael Emonts, Xavier Menendez-Pidal, Lex Olorenshaw
