Patents by Inventor Jean-Claude Junqua

Jean-Claude Junqua has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Text selection and recording by feedback and adaptation for development of personalized text-to-speech systems

Patent number: 6792407

Abstract: A new speaker provides speech from which comparison snippets are extracted. The comparison snippets are compared with initial snippets stored in a recorded snippet database that is associated with a concatenative synthesizer. The comparison of the snippets to the initial snippets produces required sound units. A greedy selection algorithm is performed with the required sound units for identifying the smallest subset of the input text that contains all of the text for the new speaker to read. The new speaker then reads the optimally selected text and sound units are extracted from the human speech such that the recorded snippet database is modified and the speech synthesized adopts the voice quality and characteristics of the new speaker.

Type: Grant

Filed: March 30, 2001

Date of Patent: September 14, 2004

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Nicholas Kibre, Steven Pearson, Brian Hanson, Jean-Claude Junqua
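
The greedy selection step described in the abstract above can be illustrated with a generic greedy set-cover heuristic. This is a minimal sketch, not the patented method: the function name `greedy_select`, the sentence identifiers, and the sound-unit labels are all hypothetical, and a greedy pass generally yields a small covering subset rather than a provably smallest one.

```python
from typing import Dict, List, Set


def greedy_select(candidates: Dict[str, Set[str]], required: Set[str]) -> List[str]:
    """Pick sentences until every coverable required sound unit is covered."""
    remaining = set(required)
    chosen: List[str] = []
    while remaining:
        # Choose the sentence that covers the most still-missing units.
        best = max(
            candidates,
            key=lambda s: len(candidates[s] & remaining),
            default=None,
        )
        if best is None:
            break  # no candidate sentences at all
        gained = candidates[best] & remaining
        if not gained:
            break  # the remaining units appear in no candidate sentence
        chosen.append(best)
        remaining -= gained
    return chosen


# Hypothetical data: sentence id -> sound units that sentence contains.
candidates = {
    "s1": {"a-b", "b-c"},
    "s2": {"b-c", "c-d", "d-e"},
    "s3": {"a-b"},
    "s4": {"e-f"},
}
required = {"a-b", "c-d", "d-e", "e-f"}

selected = greedy_select(candidates, required)
print(selected)  # ['s2', 's1', 's4']
```

Here the new speaker would be asked to read only the three selected sentences, which together contain every required unit.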

Gaussian model-based dynamic time warping system and method for speech processing

Publication number: 20040122672

Abstract: The Gaussian Dynamic Time Warping model provides a hierarchical statistical model for representing an acoustic pattern. The first layer of the model represents the general acoustic space; the second layer represents each speaker space; and the third layer represents the temporal structure information contained in each enrollment speech utterance, based on equally-spaced time intervals. These three layers are hierarchically developed: the second layer is derived from the first, and the third layer is derived from the second. The model is useful in speech processing applications, particularly in applications such as word and speaker recognition, using a spotting recognition mode.

Type: Application

Filed: December 18, 2002

Publication date: June 24, 2004

Inventors: Jean-Francois Bonastre, Philippe Morin, Jean-Claude Junqua

Distributed apparatus to improve safety and communication for law enforcement applications

Publication number: 20040085203

Abstract: A wearable, computerized apparatus for use with law enforcement has an evidence collector adapted to collect evidentiary information of a type collected according to law enforcement procedures and useful for identification of a suspect. It further has a safety monitor adapted to collect safety information relating to well-being of an officer. A wireless communications link communicates the evidentiary information and the safety information to a centralized component of a distributed communications system to assist in identifying suspects and dispatching assistance.

Type: Application

Filed: November 5, 2002

Publication date: May 6, 2004

Inventor: Jean-Claude Junqua

Technique for developing discriminative sound units for speech recognition and allophone modeling

Patent number: 6711541

Abstract: A set of models is developed to represent sound units and these models are then used with the incorrect sound units to determine which generate high likelihood scores. The models generating high likelihood scores for the incorrect sound units represent those that are more likely to be confused. The resulting confusability data may then be used in generating more discriminative speech models and in subsequent pruning of the acoustic decision tree. The confusability data may also be used to develop confusability predictors used for rejection during search and in developing continuous speech recognition models that are optimized to minimize confusability.

Type: Grant

Filed: September 7, 1999

Date of Patent: March 23, 2004

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Roland Kuhn, Jean-Claude Junqua, Matteo Contolini

Client-server voice customization

Publication number: 20040054534

Abstract: A user customizes a synthesized voice in a distributed speech synthesis system. The user selects voice criteria at a local device. The voice criteria represent characteristics that the user desires for a synthesized voice. The voice criteria are communicated to a network device. The network device generates a set of synthesized voice rules based on the voice criteria. The synthesized voice rules represent prosodic aspects and other characteristics of the synthesized voice. The synthesized voice rules are communicated to the local device and used to create the synthesized voice.

Type: Application

Filed: September 13, 2002

Publication date: March 18, 2004

Inventor: Jean-Claude Junqua

System and method of media file access and retrieval using speech recognition

Publication number: 20040054541

Abstract: An embedded device for playing media files is capable of generating a play list of media files based on input speech from a user. It includes an indexer generating a plurality of speech recognition grammars. According to one aspect of the invention, the indexer generates speech recognition grammars based on contents of a media file header of the media file. According to another aspect of the invention, the indexer generates speech recognition grammars based on categories in a file path for retrieving the media file to a user location. When a speech recognizer receives an input speech from a user while in a selection mode, a media file selector compares the input speech received while in the selection mode to the plurality of speech recognition grammars, thereby selecting the media file.

Type: Application

Filed: September 16, 2002

Publication date: March 18, 2004

Inventors: David Kryze, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua
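
The indexing idea in the abstract above (deriving recognizable phrases from a media file's header and from categories in its file path) might be sketched as follows. Everything here is an assumption for illustration: `build_index`, `select_files`, the example paths, and the header fields are hypothetical, and plain substring matching stands in for real speech recognition grammars.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Set


def build_index(paths: List[str], headers: Dict[str, Dict[str, str]]) -> Dict[str, Set[str]]:
    """Map each phrase (path category, file stem, header value) to files."""
    index: Dict[str, Set[str]] = {}
    for p in paths:
        path = PurePosixPath(p)
        # Directory names act as categories; skip the root separator.
        phrases = [part.lower() for part in path.parent.parts if part != "/"]
        phrases.append(path.stem.lower())
        # Header fields (e.g. artist) are hypothetical metadata values.
        phrases += [value.lower() for value in headers.get(p, {}).values()]
        for phrase in phrases:
            index.setdefault(phrase, set()).add(p)
    return index


def select_files(index: Dict[str, Set[str]], recognized_text: str) -> List[str]:
    """Return files whose indexed phrases occur in the recognized text."""
    text = recognized_text.lower()
    hits: Set[str] = set()
    for phrase, files in index.items():
        if phrase in text:  # stand-in for grammar matching
            hits |= files
    return sorted(hits)


paths = ["/music/jazz/so_what.mp3", "/music/rock/paranoid.mp3"]
headers = {"/music/jazz/so_what.mp3": {"artist": "Miles Davis"}}
index = build_index(paths, headers)

print(select_files(index, "play miles davis"))  # ['/music/jazz/so_what.mp3']
print(select_files(index, "play rock"))         # ['/music/rock/paranoid.mp3']
```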

Multimodal concierge for secure and convenient access to a home or building

Publication number: 20040046641

Abstract: An improved method is provided for enrolling with a resource security system. The method includes: providing an access code to a system user; accessing the resource security system using the access code; prompting the user to input a biometric feature which identifies the user; capturing a biometric feature associated with the user; and associating the captured biometric feature with the identity of the user for subsequent verification. The method further includes subsequently granting access to the secured resource based on biometric feature data input by the user.

Type: Application

Filed: September 9, 2002

Publication date: March 11, 2004

Inventors: Jean-Claude Junqua, Philippe Morin

Speaker verification and speaker identification based on a priori knowledge

Patent number: 6697778

Abstract: Client speaker locations in a speaker space are used to generate speech models for comparison with test speaker data or test speaker speech models. The speaker space can be constructed using training speakers that are entirely separate from the population of client speakers, or from client speakers, or from a mix of training and client speakers. Reestimation of the speaker space based on client environment information is also provided to improve the likelihood that the client data will fall within the speaker space. During enrollment of the clients into the speaker space, additional client speech can be obtained when predetermined conditions are met. The speaker distribution can also be used in the client enrollment step.

Type: Grant

Filed: July 5, 2000

Date of Patent: February 24, 2004

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Roland Kuhn, Olivier Thyes, Patrick Nguyen, Jean-Claude Junqua, Robert Boman

Method for additive and convolutional noise adaptation in automatic speech recognition using transformed matrices

Patent number: 6691091

Abstract: A noise adaptation system and method provide for noise adaptation in a speech recognition system. The method includes the steps of generating a reference model based on a training speech signal, and compensating the reference model for additive noise in the cepstral domain. The reference model is also compensated for convolutional noise in the cepstral domain. In one embodiment, the convolutional noise is compensated for by estimating a convolutional bias between the reference model and a target speech signal. The estimated convolutional bias is transformed with a channel adaptation matrix, and the transformed convolutional bias is added to the reference model in the cepstral domain.

Type: Grant

Filed: July 31, 2000

Date of Patent: February 10, 2004

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Christophe Cerisara, Luca Rigazio, Robert Boman, Jean-Claude Junqua
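
The last step of the abstract above (estimate a convolutional bias, transform it with a channel adaptation matrix, and add it to the reference model in the cepstral domain) can be sketched numerically. This is only an illustration under stated assumptions: the abstract does not say how the bias is estimated or how the matrix is obtained, so the mean-difference estimate, the function names, and the example values below are all hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def estimate_bias(model_mean: Vector, target_frames: List[Vector]) -> Vector:
    """Assumed estimator: mean target cepstrum minus the model mean."""
    n = len(target_frames)
    target_mean = [
        sum(frame[i] for frame in target_frames) / n
        for i in range(len(model_mean))
    ]
    return [t - m for t, m in zip(target_mean, model_mean)]


def mat_vec(matrix: Matrix, vec: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def adapt(model_mean: Vector, bias: Vector, channel_matrix: Matrix) -> Vector:
    """Add the matrix-transformed bias to the reference model mean."""
    transformed = mat_vec(channel_matrix, bias)
    return [m + t for m, t in zip(model_mean, transformed)]


# Hypothetical two-coefficient cepstral vectors.
model_mean = [1.0, 2.0]
target_frames = [[1.5, 2.0], [2.5, 3.0]]
channel_matrix = [[0.5, 0.0], [0.0, 1.0]]  # placeholder adaptation matrix

bias = estimate_bias(model_mean, target_frames)
adapted = adapt(model_mean, bias, channel_matrix)
print(bias)     # [1.0, 0.5]
print(adapted)  # [1.5, 2.5]
```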

Methods and apparatus for blind channel estimation based upon speech correlation structure

Patent number: 6687672

Abstract: Methods and apparatus for blind channel estimation of a speech signal corrupted by a communication channel are provided. One method includes converting a noisy speech signal into either a cepstral representation or a log-spectral representation; estimating a correlation of the representation of the noisy speech signal; determining an average of the noisy speech signal; constructing and solving, subject to a minimization constraint, a system of linear equations utilizing a correlation structure of a clean speech training signal, the correlation of the representation of the noisy speech signal, and the average of the noisy speech signal; and selecting a sign of the solution of the system of linear equations to estimate an average clean speech signal in a processing window.

Type: Grant

Filed: March 15, 2002

Date of Patent: February 3, 2004

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua

Small footprint language and vocabulary independent word recognizer using registration by word spelling

Patent number: 6684185

Abstract: A phoneticizer converts spelled words or names into one or an n-best number of phonetic transcriptions. The n-best transcriptions may be generated from a single transcription using a confusion matrix. These n-best transcriptions are then transformed into hybrid units. Preferably only the most frequently encountered units are stored as syllables, with the remainder being stored as smaller units such as demi-syllables or phonemes. Voice input is then used to rescore the n-best transcriptions and these are stored preferably as speaker-independent, similarity-based hybrid units concatenated into a string representing the spelled word.

Type: Grant

Filed: September 4, 1998

Date of Patent: January 27, 2004

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Jean-Claude Junqua, Ted Applebaum, Roland Kuhn

Method and apparatus for communicating between a portable device and a server

Publication number: 20030212465

Abstract: A control system including a portable device and a server. The portable device includes: (1) a body; (2) a microphone for receiving a first audio data; (3) an audio coder for converting the first audio data to first audio data signals; (4) an optical sensor for reading a first optical data; (5) an optical coder for converting the first optical data to first optical data signals; and (6) a transmitter for transmitting at least the first audio data signals or the first optical data signals to the server.

Type: Application

Filed: May 9, 2002

Publication date: November 13, 2003

Inventors: John K. Howard, Dwayne Escola, Jim Pollock, Jean-Claude Junqua

Voice activated controller for recording and retrieving audio/video programs

Patent number: 6643620

Abstract: The system includes a database of program records representing A/V programs which are available for recording. The system also includes an A/V recording device for receiving a recording command and recording the A/V program. A speech recognizer is provided for receiving the spoken request and translating the spoken request into a text stream having a plurality of words. A natural language processor receives the text stream and processes the words for resolving a semantic content of the spoken request. The natural language processor places the meaning of the words into a task frame having a plurality of key word slots. A dialogue system analyzes the task frame for determining if a sufficient number of key word slots have been filled and prompts the user for additional information for filling empty slots.

Type: Grant

Filed: March 15, 1999

Date of Patent: November 4, 2003

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Matteo Contolini, Jean-Claude Junqua, Roland Kuhn
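
The task-frame mechanism in the abstract above (fill key word slots from the recognized words, then prompt for whatever is still missing) can be sketched as below. The slot names, the word-to-slot lexicon, and the threshold of three filled slots are assumptions for illustration; the abstract only speaks of "a sufficient number of key word slots".

```python
from typing import Dict, List, Optional

# Hypothetical slot names for a recording request.
SLOTS = ("program", "channel", "date", "time")

Frame = Dict[str, Optional[str]]


def fill_frame(words: List[str], lexicon: Dict[str, str]) -> Frame:
    """Place each recognized key word into its slot, first match wins."""
    frame: Frame = {slot: None for slot in SLOTS}
    for word in words:
        slot = lexicon.get(word.lower())
        if slot in frame and frame[slot] is None:
            frame[slot] = word
    return frame


def next_prompt(frame: Frame, min_filled: int = 3) -> Optional[str]:
    """Return a question for the first empty slot, or None if enough are filled."""
    filled = sum(value is not None for value in frame.values())
    if filled >= min_filled:
        return None
    missing = [slot for slot, value in frame.items() if value is None]
    return f"Please specify the {missing[0]}."


# Hypothetical lexicon mapping key words to slots.
lexicon = {"seinfeld": "program", "friday": "date", "nbc": "channel"}

frame = fill_frame("record seinfeld on friday".split(), lexicon)
print(frame)               # program and date filled, channel and time empty
print(next_prompt(frame))  # Please specify the channel.
```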

Method and apparatus for natural language parsing using multiple passes and tags

Patent number: 6631346

Abstract: A computer-implemented speech parsing method and apparatus for processing an input phrase. The method and apparatus include providing a plurality of grammars that are indicative of predetermined topics. A plurality of parse forests are generated using the grammars. Tags are associated with words preferably according to a scoring scheme utilizing the generated parse forests while parsing the input phrase. The tags that are associated with the words are used as a parsed representation of the input phrase.

Type: Grant

Filed: April 7, 1999

Date of Patent: October 7, 2003

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Murat Karaorman, Jean-Claude Junqua

Speaker authentication system and method

Publication number: 20030182119

Abstract: A speaker authentication system includes a data fuser operable to fuse information to assist in authenticating a speaker providing audio input. In other aspects, the system includes a data store of speaker voiceprints and a voiceprint matching module adapted to receive an audio input and operable to attempt to assist in authenticating a speaker by matching the audio input to at least one of the speaker voiceprints.

Type: Application

Filed: March 20, 2003

Publication date: September 25, 2003

Inventors: Jean-Claude Junqua, Matteo Contolini

Methods and apparatus for blind channel estimation based upon speech correlation structure

Publication number: 20030177003

Abstract: Methods and apparatus for blind channel estimation of a speech signal corrupted by a communication channel are provided. One method includes converting a noisy speech signal into either a cepstral representation or a log-spectral representation; estimating a correlation of the representation of the noisy speech signal; determining an average of the noisy speech signal; constructing and solving, subject to a minimization constraint, a system of linear equations utilizing a correlation structure of a clean speech training signal, the correlation of the representation of the noisy speech signal, and the average of the noisy speech signal; and selecting a sign of the solution of the system of linear equations to estimate an average clean speech signal in a processing window.

Type: Application

Filed: March 15, 2002

Publication date: September 18, 2003

Inventors: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua

Computer telephony system to access secure resources

Publication number: 20030171930

Abstract: User interaction with a secure resource is controlled or mediated by the security server that includes a telephony interface by which the server is either coupled to the telephone system or provides messages to the telephone system directly or through an intermediate component. A biometric data store stores biometric data, such as speech data or visual recognition data. If desired, the biometric data may also be stored in association with the extension identifiers of the telephone system. A biometric verification/identification system accesses this data store and evaluates provided user biometric data vis-à-vis the stored biometric data to determine if the user may control or interact with the secure resource. If interaction is permitted, the security server sends control signals to the secure resource. The telephone system provides an interface through which the user trains the system to store the biometric verification/identification data of that user.

Type: Application

Filed: March 7, 2002

Publication date: September 11, 2003

Inventor: Jean-Claude Junqua

Customizing the speaking style of a speech synthesizer based on semantic analysis

Publication number: 20030163314

Abstract: A method is provided for customizing the speaking style of a speech synthesizer. The method includes: receiving input text; determining semantic information for the input text; determining a speaking style for rendering the input text based on the semantic information; and customizing the audible speech output of the speech synthesizer based on the identified speaking style.

Type: Application

Filed: February 27, 2002

Publication date: August 28, 2003

Inventor: Jean-Claude Junqua

Personalized agent for portable devices and cellular phone

Publication number: 20030157968

Abstract: Personalized agent services are provided in a personal messaging device, such as a cellular telephone or personal digital assistant, through services of a speech recognizer that converts speech into text and a text-to-speech synthesizer that converts text to speech. Both recognizer and synthesizer may be server-based or locally deployed within the device. The user dictates an e-mail message which is converted to text and stored. The stored text is sent back to the user as text or as synthesized speech, to allow the user to edit the message and correct transcription errors before sending as e-mail. The system includes a summarization module that prepares short summaries of incoming e-mail and voice mail. The user may access these summaries, and retrieve and organize email and voice mail using speech commands.

Type: Application

Filed: February 18, 2002

Publication date: August 21, 2003

Inventors: Robert Boman, Kirill Stoimenov, Roland Kuhn, Jean-Claude Junqua

Method for natural dialog interface to car devices

Patent number: 6598018

Abstract: A computer-implemented method and apparatus for processing a spoken request from a user to control an automobile device. A speech recognizer recognizes a user's speech input and a speech understanding module determines semantic components of the speech input. A dialogue manager determines insufficiency in the input speech, and also provides the user with information about a device in response to the input speech.

Type: Grant

Filed: December 15, 1999

Date of Patent: July 22, 2003

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventor: Jean-Claude Junqua
