Patents by Inventor Jean-Claude Junqua

Jean-Claude Junqua has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20020152074
    Abstract: A library of mouth shapes is created by separating speaker-dependent and speaker-independent variability. Preferably, speaker-dependent variability is modeled by a speaker space, while speaker-independent variability (i.e., context dependency) is modeled by a set of normalized mouth shapes that need be built only once. Given a small amount of data from a new speaker, it is possible to construct a corresponding mouth shape library by estimating a point in speaker space that maximizes the likelihood of the adaptation data and by combining speaker-dependent and speaker-independent variability. Creation of talking heads is simplified because a library of mouth shapes can be built from only a few mouth shape instances. To build the speaker space, a context-independent mouth shape parametric representation is obtained. Then a supervector containing the set of context-independent mouth shapes is formed for each speaker included in the speaker space.
    Type: Application
    Filed: March 12, 2002
    Publication date: October 17, 2002
    Inventor: Jean-Claude Junqua
  • Patent number: 6463413
    Abstract: A distributed speech processing system for constructing speech recognition reference models that are to be used by a speech recognizer in a small hardware device, such as a personal digital assistant or cellular telephone. The speech processing system includes a speech recognizer residing on a first computing device and a speech model server residing on a second computing device. The speech recognizer receives speech training data and processes it into an intermediate representation of the speech training data. The intermediate representation is then communicated to the speech model server. The speech model server generates a speech reference model by using the intermediate representation of the speech training data and then communicates the speech reference model back to the first computing device for storage in a lexicon associated with the speech recognizer.
    Type: Grant
    Filed: April 20, 1999
    Date of Patent: October 8, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Ted H. Applebaum, Jean-Claude Junqua
  • Publication number: 20020133348
    Abstract: A speech synthesizer customization system provides a mechanism for generating a hierarchical customized user database. The customization system has a template management tool for generating the templates based on customization data from a user and associated replicated dynamic synthesis data from a text-to-speech (TTS) synthesizer. The replicated dynamic synthesis data is arranged in a dynamic data structure having hierarchical levels. The customization system further includes a user database that supplements a standard database of the synthesizer. The tool populates the user database with the templates such that the templates enable the user database to uniformly override subsequently generated speech synthesis data at all hierarchical levels of the dynamic data structure.
    Type: Application
    Filed: March 15, 2001
    Publication date: September 19, 2002
    Inventors: Steve Pearson, Peter Veprek, Jean-Claude Junqua
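The uniform-override behavior described in this abstract can be sketched as a two-tier lookup consulted per hierarchical level; the level names, dictionary entries, and pronunciations below are invented for illustration and are not taken from the patent:

```python
# A minimal sketch of the override idea: synthesis data is looked up per
# hierarchical level (e.g. word vs. phoneme), with the user database
# consulted before the synthesizer's standard database at every level.
standard_db = {
    "word": {"tomato": "t ah m ey t ow"},
    "phoneme": {"ah": "short"},
}
user_db = {
    "word": {"tomato": "t ah m aa t ow"},  # a user template overrides this entry
}

def lookup(level, key):
    # The user database uniformly overrides the standard database at
    # whatever level the entry exists.
    if key in user_db.get(level, {}):
        return user_db[level][key]
    return standard_db.get(level, {}).get(key)

pron = lookup("word", "tomato")   # user override wins
dur = lookup("phoneme", "ah")     # falls back to the standard database
```

The same lookup order applies at every level, which is what makes the override "uniform" in the abstract's sense.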
  • Publication number: 20020120450
    Abstract: The speech synthesizer is personalized to sound like or mimic the speech characteristics of an individual speaker. The individual speaker provides a quantity of enrollment data, which can be extracted from a short sample of speech, and the system modifies the base synthesis parameters to more closely resemble those of the new speaker. More specifically, the synthesis parameters may be decomposed into speaker-dependent parameters, such as context-independent parameters, and speaker-independent parameters, such as context-dependent parameters. The speaker-dependent parameters are adapted using enrollment data from the new speaker. After adaptation, the speaker-dependent parameters are combined with the speaker-independent parameters to provide a set of personalized synthesis parameters.
    Type: Application
    Filed: February 26, 2001
    Publication date: August 29, 2002
    Inventors: Jean-Claude Junqua, Florent Perronnin, Roland Kuhn, Patrick Nguyen
  • Patent number: 6415257
    Abstract: Speech input supplied by the user is evaluated by the speaker verification/identification module, and based on the evaluation, parameters are retrieved from a user profile database. These parameters adapt the speech models of the speech recognizer and also supply the natural language parser with customized dialog grammars. The user's speech is then interpreted by the speech recognizer and natural language parser to determine the meaning of the user's spoken input in order to control the television tuner. The parser works in conjunction with a command module that mediates the dialog with the user, providing on-screen prompts or synthesized speech queries to elicit further input from the user when needed. The system integrates with an electronic program guide, so that the natural language parser is made aware of what programs are available when conducting the synthetic dialog with the user.
    Type: Grant
    Filed: August 26, 1999
    Date of Patent: July 2, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Jean-Claude Junqua, Roland Kuhn, Tony Davis, Yi Zhao, Weiying Li
  • Patent number: 6411927
    Abstract: The audio source is spectrally shaped by filtering in the time domain to approximate or emulate a standardized or target microphone input channel. The background level is adjusted by adding noise to the time domain signal prior to the onset of speech to set a predetermined background noise level based on a predetermined target. The audio source is then monitored in real time and the signal-to-noise ratio is adjusted by adding noise to the time domain signal, in real time, to maintain a signal-to-noise ratio based on a predetermined target value. The normalized audio signal may be applied to both training speech and test speech. The resultant normalization minimizes the mismatch between training and testing and also improves other speech processing functions, such as speech endpoint detection.
    Type: Grant
    Filed: September 4, 1998
    Date of Patent: June 25, 2002
    Assignee: Matsushita Electric Corporation of America
    Inventors: Philippe Morin, Philippe Gelin, Jean-Claude Junqua
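The noise-addition step of this normalization scheme can be illustrated numerically; the target SNR value, signal, and variable names below are assumptions for the sketch, not details from the patent:

```python
import numpy as np

# Hypothetical sketch: add white noise to a time-domain signal so that
# the overall signal-to-noise ratio matches a predetermined target
# (here 20 dB), mirroring the patent's idea of normalizing toward a
# target SNR rather than any specific filter design.
rng = np.random.default_rng(2)

speech = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)  # stand-in "speech"
target_snr_db = 20.0

signal_power = float(np.mean(speech ** 2))
# Noise power needed so that 10*log10(signal_power / noise_power) == target.
noise_power = signal_power / (10 ** (target_snr_db / 10))
noise = rng.normal(0.0, np.sqrt(noise_power), size=speech.shape)

normalized = speech + noise
measured_snr_db = 10 * np.log10(signal_power / float(np.mean(noise ** 2)))
```

Applying the same target to both training and test audio is what minimizes the train/test mismatch the abstract mentions.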
  • Patent number: 6343267
    Abstract: A set of speaker dependent models or adapted models is trained upon a comparatively large number of training speakers, one model per speaker, and model parameters are extracted in a predefined order to construct a set of supervectors, one per speaker. Dimensionality reduction is then performed on the set of supervectors to generate a set of eigenvectors that define an eigenvoice space. If desired, the number of vectors may be reduced to achieve data compression. Thereafter, a new speaker provides adaptation data from which a supervector is constructed by constraining this supervector to be in the eigenvoice space based on a maximum likelihood estimation. The resulting coefficients in the eigenspace of this new speaker may then be used to construct a new set of model parameters from which an adapted model is constructed for that speaker. The adapted model may then be further adapted via MAP, MLLR, MLED or the like.
    Type: Grant
    Filed: September 4, 1998
    Date of Patent: January 29, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua
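The supervector/eigenvoice pipeline in this abstract can be sketched with PCA via SVD; for simplicity the sketch constrains the new speaker by least-squares projection, a stand-in for the maximum likelihood estimation the patent describes, and all dimensions and data are synthetic:

```python
import numpy as np

# Hypothetical illustration of eigenvoice adaptation: one supervector per
# training speaker, dimensionality reduction, then a new speaker's data
# is constrained to lie in the resulting eigenvoice space.
rng = np.random.default_rng(0)

# 20 training speakers, each a 50-dimensional supervector of model
# parameters extracted in a fixed, predefined order.
supervectors = rng.normal(size=(20, 50))

# PCA: center the supervectors and keep the top K eigenvectors.
mean = supervectors.mean(axis=0)
centered = supervectors - mean
_, _, vt = np.linalg.svd(centered, full_matrices=False)
K = 5
eigenvoices = vt[:K]                 # (K, 50) basis of the eigenvoice space

# A new speaker supplies adaptation data summarized as a rough supervector
# estimate; project it onto the eigenvoice space (least squares here,
# maximum likelihood in the patent).
new_speaker = rng.normal(size=50)
coeffs = eigenvoices @ (new_speaker - mean)    # coordinates in eigenspace
adapted = mean + coeffs @ eigenvoices          # adapted supervector
```

The adapted supervector is then unpacked (in the same predefined order) into model parameters, which the abstract notes can be refined further via MAP or MLLR.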
  • Patent number: 6341264
    Abstract: Electronic commerce (E-commerce) and Voice commerce (V-commerce) proceeds by having the user speak into the system. The user's speech is converted by speech recognizer into a form required by the transaction processor that effects the electronic commerce operation. A dimensionality reduction processor converts the user's input speech into a reduced dimensionality set of values termed eigenvoice parameters. These parameters are compared with a set of previously stored eigenvoice parameters representing a speaker population (the eigenspace representing speaker space) and the comparison is used by the speech model adaptation system to rapidly adapt the speech recognizer to the user's speech characteristics. The user's eigenvoice parameters are also stored for subsequent use by the speaker verification and speaker identification modules.
    Type: Grant
    Filed: February 25, 1999
    Date of Patent: January 22, 2002
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Roland Kuhn, Jean-Claude Junqua
  • Patent number: 6330537
    Abstract: Speech recognition and natural language parsing components are used to extract the meaning of the user's spoken input. The system stores a semantic representation of an electronic program guide, and the contents of the program guide can be mapped into the grammars used by the natural language parser. Thus, when the user wishes to navigate through the complex menu structure of the electronic program guide, he or she only needs to speak in natural language sentences. The system automatically filters the contents of the program guide and supplies the user with on-screen display or synthesized speech responses to the user's request.
    Type: Grant
    Filed: August 26, 1999
    Date of Patent: December 11, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Tony Davis, Jean-Claude Junqua, Roland Kuhn, Weiying Li, Yi Zhao
  • Patent number: 6327565
    Abstract: A set of speaker dependent models is trained upon a comparatively large number of training speakers, one model per speaker, and model parameters are extracted in a predefined order to construct a set of supervectors, one per speaker. Principal component analysis is then performed on the set of supervectors to generate a set of eigenvectors that define an eigenvoice space. If desired, the number of vectors may be reduced to achieve data compression. Thereafter, a new speaker provides adaptation data from which a supervector is constructed by constraining this supervector to be in the eigenvoice space. The resulting coefficients in the eigenspace of this new speaker may then be used to construct a new set of model parameters from which an adapted model is constructed for that speaker. Environmental adaptation may be performed by including environmental variations in the training data.
    Type: Grant
    Filed: April 30, 1998
    Date of Patent: December 4, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Roland Kuhn, Jean-Claude Junqua
  • Patent number: 6327564
    Abstract: An accurate and reliable method is provided for detecting speech from an input speech signal. A probabilistic approach is used to classify each frame of the speech signal as speech or non-speech. The speech detection method is based on a frequency spectrum extracted from each frame, such that the value for each frequency band is considered to be a random variable and each frame is considered to be an occurrence of these random variables. Using the frequency spectrums from a non-speech part of the speech signal, a known set of random variables is constructed. Next, each unknown frame is evaluated as to whether or not it belongs to this known set of random variables. To do so, a unique random variable (preferably a chi-square value) is formed from the set of random variables associated with the unknown frame. The unique variable is normalized with respect to the known set of random variables and then classified as either speech or non-speech using the “Test of Hypothesis”.
    Type: Grant
    Filed: March 5, 1999
    Date of Patent: December 4, 2001
    Assignee: Matsushita Electric Corporation of America
    Inventors: Philippe Gelin, Jean-Claude Junqua
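The per-frame chi-square test this abstract describes can be sketched as follows; the band count, threshold rule, and synthetic frames are assumptions for illustration, not the patented implementation:

```python
import numpy as np

# Sketch of frame-wise speech/non-speech classification: noise statistics
# per frequency band are estimated from an initial non-speech segment, and
# each frame's standardized deviation is summed into a chi-square-like
# statistic that is compared against a threshold.
rng = np.random.default_rng(1)

n_bands = 16
noise_frames = rng.normal(0.0, 1.0, size=(50, n_bands))  # leading non-speech
speech_frame = rng.normal(3.0, 1.0, size=n_bands)        # energetic frame

mu = noise_frames.mean(axis=0)
sigma = noise_frames.std(axis=0)

def chi_square_stat(frame):
    # Sum of squared standardized deviations across frequency bands;
    # under the noise hypothesis this is roughly chi-square(n_bands).
    z = (frame - mu) / sigma
    return float(np.sum(z ** 2))

# Threshold set a few standard deviations above the chi-square mean
# (mean = n_bands, variance = 2 * n_bands).
threshold = n_bands + 4 * np.sqrt(2 * n_bands)

is_speech = chi_square_stat(speech_frame) > threshold
```

In the patent's terms, comparing the normalized statistic to the threshold is the "Test of Hypothesis" deciding whether the frame belongs to the known (noise) set.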
  • Patent number: 6324512
    Abstract: Users of the system can access the TV contents and program media recorder by speaking in natural language sentences. The user interacts with the television and with other multimedia equipment, such as media recorders and VCRs, through the unified access controller. A speaker verification/identification module determines the identity of the speaker and this information is used to control how the dialog between user and system proceeds. Speech can be input through either a microphone or over the telephone. In addition, the user can interact with the system using a suitable computer attached via the internet. Regardless of the mode of access, the unified access controller interprets the semantic content of the user's request and supplies the appropriate control signals to the television tuner and/or recorder.
    Type: Grant
    Filed: August 26, 1999
    Date of Patent: November 27, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Jean-Claude Junqua, Roland Kuhn, Tony Davis, Yi Zhao, Weiying Li
  • Publication number: 20010041980
    Abstract: Speech recognition and natural language parsing components are used to extract the meaning of the user's spoken input. The system stores a semantic representation of an electronic activity guide, and the contents of the guide can be mapped into the grammars used by the natural language parser. Thus, when the user wishes to navigate through the complex menu structure of the electronic activity guide, he or she only needs to speak in natural language sentences. The system automatically filters the contents of the guide and supplies the user with on-screen display or synthesized speech responses to the user's request. The system allows the user to communicate in a natural way with a variety of devices communicating with the home network or home gateway.
    Type: Application
    Filed: June 6, 2001
    Publication date: November 15, 2001
    Inventors: John K. Howard, Jean-Claude Junqua
  • Patent number: 6314398
    Abstract: A speech understanding system for receiving a spoken request from a user and processing the request against a knowledge base of programming information for automatically selecting a television program is disclosed. The speech understanding system includes a knowledge extractor for receiving electronic programming guide (EPG) information and processing the EPG information for creating a program database. The system also includes a speech recognizer for receiving the spoken request and translating the spoken request into a text stream having a plurality of words. A natural language processor is provided for receiving the text stream and processing the words for resolving a semantic content of the spoken request. The natural language processor places the meaning of the words into a task frame having a plurality of key word slots.
    Type: Grant
    Filed: March 1, 1999
    Date of Patent: November 6, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Jean-Claude Junqua, Matteo Contolini
  • Patent number: 6314165
    Abstract: An automated hotel attendant is provided for coordinating room-to-room calling over a telephone switching system that supports multiple telephone extensions. A hotel registration system receives and stores the spelled names of hotel guests as well as assigns each guest an associated telephone extension. A lexicon training system is connected to the hotel registration system for generating pronunciations for each spelled name by converting the characters that spell those names into word-phoneme data. This word-phoneme data is in turn stored in a lexicon that is used by a speech recognition system. In particular, a phoneticizer in conjunction with a Hidden Markov Model (HMM) based model trainer serves as the basis for the lexicon training system, such that one or several HMM models associated with each guest name are stored in the lexicon.
    Type: Grant
    Filed: April 30, 1998
    Date of Patent: November 6, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Jean-Claude Junqua, Matteo Contolini
  • Patent number: 6272462
    Abstract: Supervised adaptation speech is supplied to the recognizer and the recognizer generates the N-best transcriptions of the adaptation speech. These transcriptions include the one transcription known to be correct, based on a priori knowledge of the adaptation speech, and the remaining transcriptions known to be incorrect. The system applies weights to each transcription: a positive weight to the correct transcription and negative weights to the incorrect transcriptions. These weights have the effect of moving the incorrect transcriptions away from the correct one, rendering the recognition system more discriminative for the new speaker's speaking characteristics. Weights applied to the incorrect solutions are based on the respective likelihood scores generated by the recognizer. The sum of all weights (positive and negative) is a positive number. This ensures that the system will converge.
    Type: Grant
    Filed: February 25, 1999
    Date of Patent: August 7, 2001
    Assignee: Panasonic Technologies, Inc.
    Inventors: Patrick Nguyen, Philippe Gelin, Jean-Claude Junqua
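The weighting scheme above can be sketched with a simple rule that keeps the sum of weights positive; the exact formula, the margin parameter, and the scores are invented for this example and are not the patented weighting:

```python
# Assumed form: the known-correct transcription gets weight +1, each
# incorrect transcription gets a negative weight proportional to its
# recognizer likelihood score, scaled so all weights sum to +margin.
def nbest_weights(incorrect_scores, margin=0.1):
    total = sum(incorrect_scores)
    neg = [-(1.0 - margin) * s / total for s in incorrect_scores]
    return 1.0, neg  # (positive weight, negative weights)

pos, neg = nbest_weights(incorrect_scores=[0.5, 0.3, 0.2])
total = pos + sum(neg)  # positive by construction, the convergence condition
```

Keeping the total positive ensures the correct transcription's pull dominates, which is what the abstract credits for convergence.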
  • Patent number: 6263309
    Abstract: A set of speaker dependent models is trained upon a comparatively large number of training speakers, one model per speaker, and model parameters are extracted in a predefined order to construct a set of supervectors, one per speaker. Principal component analysis is then performed on the set of supervectors to generate a set of eigenvectors that define an eigenvoice space. If desired, the number of vectors may be reduced to achieve data compression. Thereafter, a new speaker provides adaptation data from which a supervector is constructed by constraining this supervector to be in the eigenvoice space based on a maximum likelihood estimation. The resulting coefficients in the eigenspace of this new speaker may then be used to construct a new set of model parameters from which an adapted model is constructed for that speaker. Environmental adaptation may be performed by including environmental variations in the training data.
    Type: Grant
    Filed: April 30, 1998
    Date of Patent: July 17, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Patrick Nguyen, Roland Kuhn, Jean-Claude Junqua
  • Patent number: 6253181
    Abstract: The recognizer tests input utterances using a confidence measure to select words of high recognition confidence for use in the adaptation process. Adaptation is performed rapidly using a priori knowledge about the class of speakers who will be using the system. This a priori knowledge can be expressed using eigenvoice basis vectors that capture information about the entire targeted user population. The dialogue system may also use the confidence measure to output a pronunciation example to the user, based on the confidence that the system has in the results of recognition, given the different possibilities that can be recognized. The dialogue system may also provide voiced prompts that teach the user how to correctly pronounce words.
    Type: Grant
    Filed: January 22, 1999
    Date of Patent: June 26, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventor: Jean-Claude Junqua
  • Patent number: 6233561
    Abstract: A computer-implemented method and apparatus is provided for processing a spoken request from a user. A speech recognizer converts the spoken request into a digital format. A frame data structure associates semantic components of the digitized spoken request with predetermined slots. The slots are indicative of data which are used to achieve a predetermined goal. A speech understanding module which is connected to the speech recognizer and to the frame data structure determines semantic components of the spoken request. The slots are populated based upon the determined semantic components. A dialog manager which is connected to the speech understanding module may determine at least one slot which is unpopulated based upon the determined semantic components and in a preferred embodiment may provide confirmation of the populated slots. A computer-generated request is formulated in order for the user to provide data related to the unpopulated slot.
    Type: Grant
    Filed: April 12, 1999
    Date of Patent: May 15, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Jean-Claude Junqua, Roland Kuhn, Matteo Contolini, Murat Karaorman, Ken Field, Michael Galler, Yi Zhao
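The frame-and-slot flow in this abstract can be sketched directly; the slot names, the sample request, and the prompt wording are invented for the example:

```python
# A task frame with predetermined slots; None marks an unpopulated slot.
task_frame = {"title": None, "channel": None, "time": None}

def fill_slots(frame, semantic_components):
    # Populate slots for which the understanding module produced a value.
    for slot, value in semantic_components.items():
        if slot in frame:
            frame[slot] = value
    return frame

def unpopulated(frame):
    return [slot for slot, value in frame.items() if value is None]

def follow_up_prompt(frame):
    # Formulate a computer-generated request for the first missing slot,
    # playing the dialog manager's role from the abstract.
    missing = unpopulated(frame)
    return f"What {missing[0]} would you like?" if missing else None

fill_slots(task_frame, {"title": "evening news", "channel": "4"})
prompt = follow_up_prompt(task_frame)  # asks about the remaining "time" slot
```

The dialog manager loops this way until no slots remain unpopulated and the goal can be executed.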
  • Patent number: 6233553
    Abstract: New entries are added to the lexicon by entering them as spelled words. A transcription generator, such as a decision-tree-based phoneme or morpheme transcription generator, converts each spelled word into a set of n-best transcriptions or sequences. Meanwhile, user input or automatically generated speech corresponding to the spelled word is processed by an automatic speech recognizer and the recognizer rescores the transcriptions or sequences produced by the transcription generator. One or more of the highest scored (highest confidence) transcriptions may be added to the lexicon to update it. If desired, the spelled word-pronunciation pairs generated by the system can be used to retrain the transcription generator, making the system adaptive or self-learning.
    Type: Grant
    Filed: September 4, 1998
    Date of Patent: May 15, 2001
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Matteo Contolini, Jean-Claude Junqua, Roland Kuhn
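The rescoring step in the last abstract can be sketched as a weighted combination of generator and recognizer scores; the pronunciations, scores, and the simple weighted-sum combination are assumptions for illustration, not the patented method:

```python
# N-best pronunciations proposed by a letter-to-sound generator, each
# with a generator confidence score.
candidates = [
    ("f o n e t i k s", 0.6),
    ("f o n e t i x", 0.3),
    ("p h o n e t i c s", 0.1),
]
# Acoustic scores from the speech recognizer for the same candidates.
acoustic = {"f o n e t i k s": 0.8, "f o n e t i x": 0.4, "p h o n e t i c s": 0.1}

def rescore(cands, acoustic_scores, w=0.5):
    # Rank candidates by a weighted combination of generator and
    # recognizer confidence (highest combined score first).
    return sorted(
        ((p, w * g + (1 - w) * acoustic_scores[p]) for p, g in cands),
        key=lambda item: item[1],
        reverse=True,
    )

best_pronunciation, best_score = rescore(candidates, acoustic)[0]
```

The top-scoring pronunciation(s) would then be added to the lexicon, and the resulting spelling-pronunciation pairs could retrain the generator, giving the self-learning behavior the abstract describes.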