Patents by Inventor Jean-Claude Junqua

Jean-Claude Junqua has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20020152074
    Abstract: A library of mouth shapes is created by separating speaker-dependent and speaker-independent variability. Preferably, speaker-dependent variability is modeled by a speaker space, while speaker-independent variability (i.e., context dependency) is modeled by a set of normalized mouth shapes that need be built only once. Given a small amount of data from a new speaker, it is possible to construct a corresponding mouth shape library by estimating a point in speaker space that maximizes the likelihood of the adaptation data and by combining speaker-dependent and speaker-independent variability. Creation of talking heads is simplified because a library of mouth shapes can be built from only a few mouth shape instances. To build the speaker space, a context-independent mouth shape parametric representation is obtained. Then a supervector containing the set of context-independent mouth shapes is formed for each speaker included in the speaker space.
    Type: Application
    Filed: March 12, 2002
    Publication date: October 17, 2002
    Inventor: Jean-Claude Junqua
  • Patent number: 6463413
    Abstract: A distributed speech processing system for constructing speech recognition reference models that are to be used by a speech recognizer in a small hardware device, such as a personal digital assistant or cellular telephone. The speech processing system includes a speech recognizer residing on a first computing device and a speech model server residing on a second computing device. The speech recognizer receives speech training data and processes it into an intermediate representation of the speech training data. The intermediate representation is then communicated to the speech model server. The speech model server generates a speech reference model by using the intermediate representation of the speech training data and then communicates the speech reference model back to the first computing device for storage in a lexicon associated with the speech recognizer.
    Type: Grant
    Filed: April 20, 1999
    Date of Patent: October 8, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Ted H. Applebaum, Jean-Claude Junqua
  • Publication number: 20020133348
    Abstract: A speech synthesizer customization system provides a mechanism for generating a hierarchical customized user database. The customization system has a template management tool for generating the templates based on customization data from a user and associated replicated dynamic synthesis data from a text-to-speech (TTS) synthesizer. The replicated dynamic synthesis data is arranged in a dynamic data structure having hierarchical levels. The customization system further includes a user database that supplements a standard database of the synthesizer. The tool populates the user database with the templates such that the templates enable the user database to uniformly override subsequently generated speech synthesis data at all hierarchical levels of the dynamic data structure.
    Type: Application
    Filed: March 15, 2001
    Publication date: September 19, 2002
    Inventors: Steve Pearson, Peter Veprek, Jean-Claude Junqua
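The uniform-override behavior described in this abstract can be sketched as a two-tier lookup consulted per hierarchical level; the level names, dictionary entries, and pronunciations below are invented for illustration and are not taken from the patent:

```python
# A minimal sketch of the override idea: synthesis data is looked up per
# hierarchical level (e.g. word vs. phoneme), with the user database
# consulted before the synthesizer's standard database at every level.
standard_db = {
    "word": {"tomato": "t ah m ey t ow"},
    "phoneme": {"ah": "short"},
}
user_db = {
    "word": {"tomato": "t ah m aa t ow"},  # a user template overrides this entry
}

def lookup(level, key):
    # The user database uniformly overrides the standard database at
    # whatever level the entry exists.
    if key in user_db.get(level, {}):
        return user_db[level][key]
    return standard_db.get(level, {}).get(key)

pron = lookup("word", "tomato")   # user override wins
dur = lookup("phoneme", "ah")     # falls back to the standard database
```

The same lookup order applies at every level, which is what makes the override "uniform" in the abstract's sense.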
  • Publication number: 20020120450
    Abstract: The speech synthesizer is personalized to sound like or mimic the speech characteristics of an individual speaker. The individual speaker provides a quantity of enrollment data, which can be extracted from a short sample of speech, and the system modifies the base synthesis parameters to more closely resemble those of the new speaker. More specifically, the synthesis parameters may be decomposed into speaker-dependent parameters, such as context-independent parameters, and speaker-independent parameters, such as context-dependent parameters. The speaker-dependent parameters are adapted using enrollment data from the new speaker. After adaptation, the speaker-dependent parameters are combined with the speaker-independent parameters to provide a set of personalized synthesis parameters.
    Type: Application
    Filed: February 26, 2001
    Publication date: August 29, 2002
    Inventors: Jean-Claude Junqua, Florent Perronnin, Roland Kuhn, Patrick Nguyen
  • Patent number: 6415257
    Abstract: Speech input supplied by the user is evaluated by the speaker verification/identification module, and based on the evaluation, parameters are retrieved from a user profile database. These parameters adapt the speech models of the speech recognizer and also supply the natural language parser with customized dialog grammars. The user's speech is then interpreted by the speech recognizer and natural language parser to determine the meaning of the user's spoken input in order to control the television tuner. The parser works in conjunction with a command module that mediates the dialog with the user, providing on-screen prompts or synthesized speech queries to elicit further input from the user when needed. The system integrates with an electronic program guide, so that the natural language parser is made aware of what programs are available when conducting the synthetic dialog with the user.
    Type: Grant
    Filed: August 26, 1999
    Date of Patent: July 2, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Jean-Claude Junqua, Roland Kuhn, Tony Davis, Yi Zhao, Weiying Li
  • Patent number: 6411927
    Abstract: The audio source is spectrally shaped by filtering in the time domain to approximate or emulate a standardized or target microphone input channel. The background level is adjusted by adding noise to the time domain signal prior to the onset of speech to set a predetermined background noise level based on a predetermined target. The audio source is then monitored in real time and the signal-to-noise ratio is adjusted by adding noise to the time domain signal, in real time, to maintain a signal-to-noise ratio based on a predetermined target value. The normalized audio signal may be applied to both training speech and test speech. The resultant normalization minimizes the mismatch between training and testing and also improves other speech processing functions, such as speech endpoint detection.
    Type: Grant
    Filed: September 4, 1998
    Date of Patent: June 25, 2002
    Assignee: Matsushita Electric Corporation of America
    Inventors: Philippe Morin, Philippe Gelin, Jean-Claude Junqua
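The noise-addition step of this normalization scheme can be illustrated numerically; the target SNR value, signal, and variable names below are assumptions for the sketch, not details from the patent:

```python
import numpy as np

# Hypothetical sketch: add white noise to a time-domain signal so that
# the overall signal-to-noise ratio matches a predetermined target
# (here 20 dB), mirroring the patent's idea of normalizing toward a
# target SNR rather than any specific filter design.
rng = np.random.default_rng(2)

speech = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)  # stand-in "speech"
target_snr_db = 20.0

signal_power = float(np.mean(speech ** 2))
# Noise power needed so that 10*log10(signal_power / noise_power) == target.
noise_power = signal_power / (10 ** (target_snr_db / 10))
noise = rng.normal(0.0, np.sqrt(noise_power), size=speech.shape)

normalized = speech + noise
measured_snr_db = 10 * np.log10(signal_power / float(np.mean(noise ** 2)))
```

Applying the same target to both training and test audio is what minimizes the train/test mismatch the abstract mentions.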
  • Patent number: 6343267
    Abstract: A set of speaker dependent models or adapted models is trained upon a comparatively large number of training speakers, one model per speaker, and model parameters are extracted in a predefined order to construct a set of supervectors, one per speaker. Dimensionality reduction is then performed on the set of supervectors to generate a set of eigenvectors that define an eigenvoice space. If desired, the number of vectors may be reduced to achieve data compression. Thereafter, a new speaker provides adaptation data from which a supervector is constructed by constraining this supervector to be in the eigenvoice space based on a maximum likelihood estimation. The resulting coefficients in the eigenspace of this new speaker may then be used to construct a new set of model parameters from which an adapted model is constructed for that speaker. The adapted model may then be further adapted via MAP, MLLR, MLED or the like.
    Type: Grant
    Filed: September 4, 1998
    Date of Patent: January 29, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua
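The supervector/eigenvoice pipeline in this abstract can be sketched with PCA via SVD; for simplicity the sketch constrains the new speaker by least-squares projection, a stand-in for the maximum likelihood estimation the patent describes, and all dimensions and data are synthetic:

```python
import numpy as np

# Hypothetical illustration of eigenvoice adaptation: one supervector per
# training speaker, dimensionality reduction, then a new speaker's data
# is constrained to lie in the resulting eigenvoice space.
rng = np.random.default_rng(0)

# 20 training speakers, each a 50-dimensional supervector of model
# parameters extracted in a fixed, predefined order.
supervectors = rng.normal(size=(20, 50))

# PCA: center the supervectors and keep the top K eigenvectors.
mean = supervectors.mean(axis=0)
centered = supervectors - mean
_, _, vt = np.linalg.svd(centered, full_matrices=False)
K = 5
eigenvoices = vt[:K]                 # (K, 50) basis of the eigenvoice space

# A new speaker supplies adaptation data summarized as a rough supervector
# estimate; project it onto the eigenvoice space (least squares here,
# maximum likelihood in the patent).
new_speaker = rng.normal(size=50)
coeffs = eigenvoices @ (new_speaker - mean)    # coordinates in eigenspace
adapted = mean + coeffs @ eigenvoices          # adapted supervector
```

The adapted supervector is then unpacked (in the same predefined order) into model parameters, which the abstract notes can be refined further via MAP or MLLR.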
  • Patent number: 6341264
    Abstract: Electronic commerce (E-commerce) and Voice commerce (V-commerce) proceeds by having the user speak into the system. The user's speech is converted by speech recognizer into a form required by the transaction processor that effects the electronic commerce operation. A dimensionality reduction processor converts the user's input speech into a reduced dimensionality set of values termed eigenvoice parameters. These parameters are compared with a set of previously stored eigenvoice parameters representing a speaker population (the eigenspace representing speaker space) and the comparison is used by the speech model adaptation system to rapidly adapt the speech recognizer to the user's speech characteristics. The user's eigenvoice parameters are also stored for subsequent use by the speaker verification and speaker identification modules.
    Type: Grant
    Filed: February 25, 1999
    Date of Patent: January 22, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Roland Kuhn, Jean-Claude Junqua
  • Patent number: 6330537
    Abstract: Speech recognition and natural language parsing components are used to extract the meaning of the user's spoken input. The system stores a semantic representation of an electronic program guide, and the contents of the program guide can be mapped into the grammars used by the natural language parser. Thus, when the user wishes to navigate through the complex menu structure of the electronic program guide, he or she only needs to speak in natural language sentences. The system automatically filters the contents of the program guide and supplies the user with on-screen display or synthesized speech responses to the user's request.
    Type: Grant
    Filed: August 26, 1999
    Date of Patent: December 11, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Tony Davis, Jean-Claude Junqua, Roland Kuhn, Weiying Li, Yi Zhao
  • Patent number: 6327565
    Abstract: A set of speaker dependent models is trained upon a comparatively large number of training speakers, one model per speaker, and model parameters are extracted in a predefined order to construct a set of supervectors, one per speaker. Principal component analysis is then performed on the set of supervectors to generate a set of eigenvectors that define an eigenvoice space. If desired, the number of vectors may be reduced to achieve data compression. Thereafter, a new speaker provides adaptation data from which a supervector is constructed by constraining this supervector to be in the eigenvoice space. The resulting coefficients in the eigenspace of this new speaker may then be used to construct a new set of model parameters from which an adapted model is constructed for that speaker. Environmental adaptation may be performed by including environmental variations in the training data.
    Type: Grant
    Filed: April 30, 1998
    Date of Patent: December 4, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Roland Kuhn, Jean-Claude Junqua
  • Patent number: 6327564
    Abstract: An accurate and reliable method is provided for detecting speech from an input speech signal. A probabilistic approach is used to classify each frame of the speech signal as speech or non-speech. The speech detection method is based on a frequency spectrum extracted from each frame, such that the value for each frequency band is considered to be a random variable and each frame is considered to be an occurrence of these random variables. Using the frequency spectrums from a non-speech part of the speech signal, a known set of random variables is constructed. Next, each unknown frame is evaluated as to whether or not it belongs to this known set of random variables. To do so, a unique random variable (preferably a chi-square value) is formed from the set of random variables associated with the unknown frame. The unique variable is normalized with respect to the known set of random variables and then classified as either speech or non-speech using the “Test of Hypothesis”.
    Type: Grant
    Filed: March 5, 1999
    Date of Patent: December 4, 2001
    Assignee: Matsushita Electric Corporation of America
    Inventors: Philippe Gelin, Jean-Claude Junqua
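The per-frame chi-square test this abstract describes can be sketched as follows; the band count, threshold rule, and synthetic frames are assumptions for illustration, not the patented implementation:

```python
import numpy as np

# Sketch of frame-wise speech/non-speech classification: noise statistics
# per frequency band are estimated from an initial non-speech segment, and
# each frame's standardized deviation is summed into a chi-square-like
# statistic that is compared against a threshold.
rng = np.random.default_rng(1)

n_bands = 16
noise_frames = rng.normal(0.0, 1.0, size=(50, n_bands))  # leading non-speech
speech_frame = rng.normal(3.0, 1.0, size=n_bands)        # energetic frame

mu = noise_frames.mean(axis=0)
sigma = noise_frames.std(axis=0)

def chi_square_stat(frame):
    # Sum of squared standardized deviations across frequency bands;
    # under the noise hypothesis this is roughly chi-square(n_bands).
    z = (frame - mu) / sigma
    return float(np.sum(z ** 2))

# Threshold set a few standard deviations above the chi-square mean
# (mean = n_bands, variance = 2 * n_bands).
threshold = n_bands + 4 * np.sqrt(2 * n_bands)

is_speech = chi_square_stat(speech_frame) > threshold
```

In the patent's terms, comparing the normalized statistic to the threshold is the "Test of Hypothesis" deciding whether the frame belongs to the known (noise) set.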
  • Patent number: 6324512
    Abstract: Users of the system can access the TV contents and program media recorder by speaking in natural language sentences. The user interacts with the television and with other multimedia equipment, such as media recorders and VCRs, through the unified access controller. A speaker verification/identification module determines the identity of the speaker and this information is used to control how the dialog between user and system proceeds. Speech can be input through either a microphone or over the telephone. In addition, the user can interact with the system using a suitable computer attached via the internet. Regardless of the mode of access, the unified access controller interprets the semantic content of the user's request and supplies the appropriate control signals to the television tuner and/or recorder.
    Type: Grant
    Filed: August 26, 1999
    Date of Patent: November 27, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Jean-Claude Junqua, Roland Kuhn, Tony Davis, Yi Zhao, Weiying Li
  • Publication number: 20010041980
    Abstract: Speech recognition and natural language parsing components are used to extract the meaning of the user's spoken input. The system stores a semantic representation of an electronic activity guide, and the contents of the guide can be mapped into the grammars used by the natural language parser. Thus, when the user wishes to navigate through the complex menu structure of the electronic activity guide, he or she only needs to speak in natural language sentences. The system automatically filters the contents of the guide and supplies the user with on-screen display or synthesized speech responses to the user's request. The system allows the user to communicate in a natural way with a variety of devices communicating with the home network or home gateway.
    Type: Application
    Filed: June 6, 2001
    Publication date: November 15, 2001
    Inventors: John K. Howard, Jean-Claude Junqua
  • Patent number: 6314398
    Abstract: A speech understanding system for receiving a spoken request from a user and processing the request against a knowledge base of programming information for automatically selecting a television program is disclosed. The speech understanding system includes a knowledge extractor for receiving electronic programming guide (EPG) information and processing the EPG information for creating a program database. The system also includes a speech recognizer for receiving the spoken request and translating the spoken request into a text stream having a plurality of words. A natural language processor is provided for receiving the text stream and processing the words for resolving a semantic content of the spoken request. The natural language processor places the meaning of the words into a task frame having a plurality of key word slots.
    Type: Grant
    Filed: March 1, 1999
    Date of Patent: November 6, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Jean-Claude Junqua, Matteo Contolini
  • Patent number: 6314165
    Abstract: An automated hotel attendant is provided for coordinating room-to-room calling over a telephone switching system that supports multiple telephone extensions. A hotel registration system receives and stores the spelled names of hotel guests as well as assigns each guest an associated telephone extension. A lexicon training system is connected to the hotel registration system for generating pronunciations for each spelled name by converting the characters that spell those names into word-phoneme data. This word-phoneme data is in turn stored in a lexicon that is used by a speech recognition system. In particular, a phoneticizer in conjunction with a Hidden Markov Model (HMM) based model trainer serves as the basis for the lexicon training system, such that one or several HMM models associated with each guest name are stored in the lexicon.
    Type: Grant
    Filed: April 30, 1998
    Date of Patent: November 6, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Jean-Claude Junqua, Matteo Contolini
  • Patent number: 6272462
    Abstract: Supervised adaptation speech is supplied to the recognizer and the recognizer generates the N-best transcriptions of the adaptation speech. These transcriptions include the one transcription known to be correct, based on a priori knowledge of the adaptation speech, and the remaining transcriptions known to be incorrect. The system applies weights to each transcription: a positive weight to the correct transcription and negative weights to the incorrect transcriptions. These weights have the effect of moving the incorrect transcriptions away from the correct one, rendering the recognition system more discriminative for the new speaker's speaking characteristics. Weights applied to the incorrect solutions are based on the respective likelihood scores generated by the recognizer. The sum of all weights (positive and negative) is a positive number. This ensures that the system will converge.
    Type: Grant
    Filed: February 25, 1999
    Date of Patent: August 7, 2001
    Assignee: Panasonic Technologies, Inc.
    Inventors: Patrick Nguyen, Philippe Gelin, Jean-Claude Junqua
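The weighting scheme above can be sketched with a simple rule that keeps the sum of weights positive; the exact formula, the margin parameter, and the scores are invented for this example and are not the patented weighting:

```python
# Assumed form: the known-correct transcription gets weight +1, each
# incorrect transcription gets a negative weight proportional to its
# recognizer likelihood score, scaled so all weights sum to +margin.
def nbest_weights(incorrect_scores, margin=0.1):
    total = sum(incorrect_scores)
    neg = [-(1.0 - margin) * s / total for s in incorrect_scores]
    return 1.0, neg  # (positive weight, negative weights)

pos, neg = nbest_weights(incorrect_scores=[0.5, 0.3, 0.2])
total = pos + sum(neg)  # positive by construction, the convergence condition
```

Keeping the total positive ensures the correct transcription's pull dominates, which is what the abstract credits for convergence.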
  • Patent number: 6263309
    Abstract: A set of speaker dependent models is trained upon a comparatively large number of training speakers, one model per speaker, and model parameters are extracted in a predefined order to construct a set of supervectors, one per speaker. Principal component analysis is then performed on the set of supervectors to generate a set of eigenvectors that define an eigenvoice space. If desired, the number of vectors may be reduced to achieve data compression. Thereafter, a new speaker provides adaptation data from which a supervector is constructed by constraining this supervector to be in the eigenvoice space based on a maximum likelihood estimation. The resulting coefficients in the eigenspace of this new speaker may then be used to construct a new set of model parameters from which an adapted model is constructed for that speaker. Environmental adaptation may be performed by including environmental variations in the training data.
    Type: Grant
    Filed: April 30, 1998
    Date of Patent: July 17, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Patrick Nguyen, Roland Kuhn, Jean-Claude Junqua
  • Patent number: 6253181
    Abstract: The recognizer tests input utterances using a confidence measure to select words of high recognition confidence for use in the adaptation process. Adaptation is performed rapidly using a priori knowledge about the class of speakers who will be using the system. This a priori knowledge can be expressed using eigenvoice basis vectors that capture information about the entire targeted user population. The dialogue system may also use the confidence measure to output a pronunciation example to the user, based on the confidence that the system has in the results of recognition, given the different possibilities that can be recognized. The dialogue system may also provide voiced prompts that teach the user how to correctly pronounce words.
    Type: Grant
    Filed: January 22, 1999
    Date of Patent: June 26, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventor: Jean-Claude Junqua
  • Patent number: 6233561
    Abstract: A computer-implemented method and apparatus is provided for processing a spoken request from a user. A speech recognizer converts the spoken request into a digital format. A frame data structure associates semantic components of the digitized spoken request with predetermined slots. The slots are indicative of data which are used to achieve a predetermined goal. A speech understanding module which is connected to the speech recognizer and to the frame data structure determines semantic components of the spoken request. The slots are populated based upon the determined semantic components. A dialog manager which is connected to the speech understanding module may determine at least one slot which is unpopulated based upon the determined semantic components and in a preferred embodiment may provide confirmation of the populated slots. A computer-generated request is formulated in order for the user to provide data related to the unpopulated slot.
    Type: Grant
    Filed: April 12, 1999
    Date of Patent: May 15, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Jean-Claude Junqua, Roland Kuhn, Matteo Contolini, Murat Karaorman, Ken Field, Michael Galler, Yi Zhao
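The frame-and-slot flow in this abstract can be sketched directly; the slot names, the sample request, and the prompt wording are invented for the example:

```python
# A task frame with predetermined slots; None marks an unpopulated slot.
task_frame = {"title": None, "channel": None, "time": None}

def fill_slots(frame, semantic_components):
    # Populate slots for which the understanding module produced a value.
    for slot, value in semantic_components.items():
        if slot in frame:
            frame[slot] = value
    return frame

def unpopulated(frame):
    return [slot for slot, value in frame.items() if value is None]

def follow_up_prompt(frame):
    # Formulate a computer-generated request for the first missing slot,
    # playing the dialog manager's role from the abstract.
    missing = unpopulated(frame)
    return f"What {missing[0]} would you like?" if missing else None

fill_slots(task_frame, {"title": "evening news", "channel": "4"})
prompt = follow_up_prompt(task_frame)  # asks about the remaining "time" slot
```

The dialog manager loops this way until no slots remain unpopulated and the goal can be executed.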
  • Patent number: 6233553
    Abstract: New entries are added to the lexicon by entering them as spelled words. A transcription generator, such as a decision-tree-based phoneme or morpheme transcription generator, converts each spelled word into a set of n-best transcriptions or sequences. Meanwhile, user input or automatically generated speech corresponding to the spelled word is processed by an automatic speech recognizer and the recognizer rescores the transcriptions or sequences produced by the transcription generator. One or more of the highest scored (highest confidence) transcriptions may be added to the lexicon to update it. If desired, the spelled word-pronunciation pairs generated by the system can be used to retrain the transcription generator, making the system adaptive or self-learning.
    Type: Grant
    Filed: September 4, 1998
    Date of Patent: May 15, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Matteo Contolini, Jean-Claude Junqua, Roland Kuhn
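The rescoring step in the last abstract can be sketched as a weighted combination of generator and recognizer scores; the pronunciations, scores, and the simple weighted-sum combination are assumptions for illustration, not the patented method:

```python
# N-best pronunciations proposed by a letter-to-sound generator, each
# with a generator confidence score.
candidates = [
    ("f o n e t i k s", 0.6),
    ("f o n e t i x", 0.3),
    ("p h o n e t i c s", 0.1),
]
# Acoustic scores from the speech recognizer for the same candidates.
acoustic = {"f o n e t i k s": 0.8, "f o n e t i x": 0.4, "p h o n e t i c s": 0.1}

def rescore(cands, acoustic_scores, w=0.5):
    # Rank candidates by a weighted combination of generator and
    # recognizer confidence (highest combined score first).
    return sorted(
        ((p, w * g + (1 - w) * acoustic_scores[p]) for p, g in cands),
        key=lambda item: item[1],
        reverse=True,
    )

best_pronunciation, best_score = rescore(candidates, acoustic)[0]
```

The top-scoring pronunciation(s) would then be added to the lexicon, and the resulting spelling-pronunciation pairs could retrain the generator, giving the self-learning behavior the abstract describes.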