Patents by Inventor Patrick An Nguyen

Patrick An Nguyen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Voice tagging, voice annotation, and speech recognition for portable devices with optional post processing

Patent number: 7324943

Abstract: A media capture device has an audio input receptive of user speech relating to a media capture activity in close temporal relation to the media capture activity. A plurality of focused speech recognition lexica respectively relating to media capture activities are stored on the device, and a speech recognizer recognizes the user speech based on a selected one of the focused speech recognition lexica. A media tagger tags captured media with generated speech recognition text, and a media annotator annotates the captured media with a sample of the user speech that is suitable for input to a speech recognizer. Tagging and annotating are based on close temporal relation between receipt of the user speech and capture of the captured media. Annotations may be converted to tags during post processing, employed to edit a lexicon using letter-to-sound rules and spelled word input, or matched directly to speech to retrieve captured media.

Type: Grant

Filed: October 2, 2003

Date of Patent: January 29, 2008

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Luca Rigazio, Robert Boman, Patrick Nguyen, Jean-Claude Junqua

Multi-slot dialog systems and methods

Publication number: 20070255566

Abstract: Systems and methods for constructing a series of interactions with a user to collect multiple pieces of related information for the purpose of accomplishing a specific goal or topic (a multi-slot dialog) using a component-based approach are disclosed. The method generally includes outputting a primary header prompt to elicit values for slots in a segment from the user, receiving a primary user response containing a value for each slot in at least a subset of the slots in the segment, processing the primary user response to determine at least one possible recognition value for each slot contained in the primary user response, filling each slot contained in the primary user response with a matched value selected from the corresponding possible recognition values, and repeating the outputting, receiving, processing and filling for any unfilled slots in the segment until all slots in the segment of slots are filled.

Type: Application

Filed: April 17, 2007

Publication date: November 1, 2007

Applicant: Voxify, Inc.

Inventors: Patrick Nguyen, Jesus Lopez-Amaro, Amit Desai, Adeeb Shana'a
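
The prompt, receive, process, fill, and repeat loop described in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method claimed in the application: the names `fill_segment`, `ask`, `recognize`, `toy_recognize`, and `max_turns` are hypothetical, the "recognizer" simply parses `slot=value` tokens, and the matched value is taken to be the first candidate.

```python
from typing import Callable, Dict, List


def fill_segment(
    slots: List[str],
    ask: Callable[[List[str]], str],
    recognize: Callable[[str, List[str]], Dict[str, List[str]]],
    max_turns: int = 10,
) -> Dict[str, str]:
    """Prompt repeatedly until every slot in the segment has a value."""
    filled: Dict[str, str] = {}
    for _ in range(max_turns):  # hypothetical safety limit, not in the abstract
        unfilled = [s for s in slots if s not in filled]
        if not unfilled:
            break
        # Output a prompt for the remaining slots and receive the response.
        response = ask(unfilled)
        # Process the response into candidate values for each mentioned slot.
        candidates = recognize(response, unfilled)
        # Fill each mentioned slot with a matched value (here: first candidate).
        for slot, values in candidates.items():
            if slot in unfilled and values:
                filled[slot] = values[0]
    return filled


def toy_recognize(response: str, unfilled: List[str]) -> Dict[str, List[str]]:
    """Stand-in recognizer (assumption): reads 'slot=value' tokens."""
    out: Dict[str, List[str]] = {}
    for token in response.split():
        if "=" in token:
            slot, value = token.split("=", 1)
            if slot in unfilled:
                out.setdefault(slot, []).append(value)
    return out


# Scripted user: the first answer covers two slots, the second covers the rest.
scripted = iter(["from=Boston date=Friday", "to=Denver"])
result = fill_segment(["from", "to", "date"], lambda unfilled: next(scripted), toy_recognize)
print(result)
```

In this sketch a single response may fill any subset of the open slots, and only the slots still unfilled are prompted for on the next turn.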

Time-anchored posterior indexing of speech

Publication number: 20070143110

Abstract: A computer-implemented method of indexing a speech lattice for search of audio corresponding to the speech lattice is provided. The method includes identifying at least two speech recognition hypotheses for a word which have time ranges satisfying a criterion. The method further includes merging the at least two speech recognition hypotheses to generate a merged speech recognition hypothesis for the word.

Type: Application

Filed: December 15, 2005

Publication date: June 21, 2007

Applicant: Microsoft Corporation

Inventors: Alejandro Acero, Asela Gunawardana, Ciprian Chelba, Erik Selberg, Frank Torsten Seide, Patrick Nguyen, Roger Yu
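
As a rough illustration of the merging step in the abstract above, the sketch below merges hypotheses for the same word when their time ranges overlap. The abstract does not say what the time-range criterion is or how the merged hypothesis is scored, so the overlap test, the union of the time spans, and the capped sum of posteriors are all assumptions, as are the names `Hypothesis` and `merge_hypotheses`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    word: str
    start: float      # seconds
    end: float        # seconds
    posterior: float  # probability mass assigned to this hypothesis


def merge_hypotheses(hyps: List[Hypothesis]) -> List[Hypothesis]:
    """Merge same-word hypotheses whose time ranges overlap (assumed criterion)."""
    merged: List[Hypothesis] = []
    for h in sorted(hyps, key=lambda x: (x.word, x.start)):
        last = merged[-1] if merged else None
        if last is not None and last.word == h.word and h.start <= last.end:
            # Assumed merge rule: union of the spans, summed posterior capped at 1.
            last.end = max(last.end, h.end)
            last.posterior = min(1.0, last.posterior + h.posterior)
        else:
            merged.append(Hypothesis(h.word, h.start, h.end, h.posterior))
    return merged


lattice = [
    Hypothesis("speech", 1.0, 1.5, 0.25),
    Hypothesis("speech", 1.25, 1.75, 0.5),
    Hypothesis("speech", 3.0, 3.5, 0.125),
    Hypothesis("beach", 1.0, 1.5, 0.125),
]
index_entries = merge_hypotheses(lattice)
for entry in index_entries:
    print(entry)
```

The two overlapping "speech" hypotheses collapse into one index entry, while the later "speech" occurrence and the competing "beach" hypothesis stay separate.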

Speech recognition using adaptation and prior knowledge

Publication number: 20070129943

Abstract: A speech recognition system includes a feature extraction component that receives a speech signal and extracts feature vectors from the speech signal. Also included is a decoder having a speech acoustic model, a feature modification component, and a comparison component. The feature modification component changes the feature vectors, using adaptation data and prior data, to more closely match the speech acoustic model. The comparison component utilizes the modified feature vectors and the speech acoustic model to recognize the speech signal.

Type: Application

Filed: December 6, 2005

Publication date: June 7, 2007

Applicant: Microsoft Corporation

Inventors: Xin Lei, Jonathan Hamaker, Xiaodong He, Patrick Nguyen

Use of a silicon carbide-based ceramic material in aggressive environments

Publication number: 20070086937

Abstract: A SiC-based composite material capable of use as an inner coating for an aluminium smelting furnace or as an inner coating for a fused salt electrolytic cell, wherein said composite material has been prepared from a precursor mixture comprising at least one β-SiC precursor and at least one carbonated resin, and wherein said composite material contains inclusions, and wherein at least one part thereof comprises α-SiC, in a β-SiC matrix.

Type: Application

Filed: April 15, 2004

Publication date: April 19, 2007

Applicants: Universite Louis Pasteur de Strasbourg, SICAT, Centre National de la Recherche Scientifique

Inventors: Charlotte Pham, Cuong Pham-Huu, Marc-Jacques Ledoux, Patrick Nguyen

Method and apparatus for feature domain joint channel and additive noise compensation

Patent number: 7089182

Abstract: A method for performing noise adaptation of a target speech signal input to a speech recognition system, where the target speech signal contains both additive and convolutional noises. The method includes estimating an additive noise bias and a convolutional noise bias in the target speech signal, and jointly compensating the target speech signal for the additive and convolutional noise biases in a feature domain.

Type: Grant

Filed: March 15, 2002

Date of Patent: August 8, 2006

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Younes Souilmi, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua

Recognition system using lexical trees

Patent number: 7035802

Abstract: The dynamic programming technique employs a lexical tree that is encoded in computer memory as a flat representation in which the nodes of each generation occupy contiguous memory locations. The traversal algorithm employs a set of traversal rules whereby nodes of a given generation are processed before the parent nodes of that generation. The deepest child generation is processed first and traversal among nodes of each generation proceeds in the same topological direction.

Type: Grant

Filed: July 31, 2000

Date of Patent: April 25, 2006

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Luca Rigazio, Patrick Nguyen
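
The flat, generation-by-generation layout and the child-before-parent traversal order described in the abstract above can be illustrated with a small sketch. It is not the patented encoding: the parallel-array layout and the names `flatten_lexicon` and `bottom_up_order` are assumptions chosen for the example, and no dynamic programming scores are computed.

```python
from typing import Dict, List, Tuple


def flatten_lexicon(words: List[str]) -> Tuple[List[str], List[int], List[int]]:
    """Lay out a lexical tree so that each generation is contiguous."""
    # Build an ordinary nested-dict trie first.
    trie: Dict[str, dict] = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})

    labels = ["<root>"]   # labels[i]: letter stored at flat index i
    parents = [-1]        # parents[i]: flat index of node i's parent
    level_starts = [0]    # level_starts[g]: first flat index of generation g
    frontier = [(0, trie)]
    while frontier:
        next_frontier = []
        start = len(labels)
        for index, node in frontier:
            for ch in sorted(node):
                labels.append(ch)
                parents.append(index)
                next_frontier.append((len(labels) - 1, node[ch]))
        if next_frontier:
            level_starts.append(start)
        frontier = next_frontier
    level_starts.append(len(labels))  # sentinel: one past the last node
    return labels, parents, level_starts


def bottom_up_order(level_starts: List[int]) -> List[int]:
    """Visit the deepest generation first, left to right within each one."""
    order: List[int] = []
    for g in range(len(level_starts) - 2, -1, -1):
        order.extend(range(level_starts[g], level_starts[g + 1]))
    return order


labels, parents, level_starts = flatten_lexicon(["cat", "car", "do"])
order = bottom_up_order(level_starts)
print([labels[i] for i in order])
```

Because every generation occupies one contiguous slice of the arrays, each pass of the traversal is a linear scan in a single direction, and every node is visited before its parent.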

Multi-slot dialog systems and methods

Publication number: 20060009973

Abstract: Systems and methods for constructing a series of interactions with a user to collect multiple pieces of related information for the purpose of accomplishing a specific goal or topic (a multi-slot dialog) using a component-based approach are disclosed. The method generally includes outputting a primary header prompt to elicit values for slots in a segment from the user, receiving a primary user response containing a value for each slot in at least a subset of the slots in the segment, processing the primary user response to determine at least one possible recognition value for each slot contained in the primary user response, filling each slot contained in the primary user response with a matched value selected from the corresponding possible recognition values, and repeating the outputting, receiving, processing and filling for any unfilled slots in the segment until all slots in the segment of slots are filled.

Type: Application

Filed: July 6, 2004

Publication date: January 12, 2006

Applicant: Voxify, Inc. A CALIFORNIA CORPORATION

Inventors: Patrick Nguyen, Jesus Amaro, Amit Desai, Adeeb Shana'a

Voice personalization of speech synthesizer

Patent number: 6970820

Abstract: The speech synthesizer is personalized to sound like or mimic the speech characteristics of an individual speaker. The individual speaker provides a quantity of enrollment data, which can be extracted from a short quantity of speech, and the system modifies the base synthesis parameters to more closely resemble those of the new speaker. More specifically, the synthesis parameters may be decomposed into speaker dependent parameters, such as context-independent parameters, and speaker independent parameters, such as context dependent parameters. The speaker dependent parameters are adapted using enrollment data from the new speaker. After adaptation, the speaker dependent parameters are combined with the speaker independent parameters to provide a set of personalized synthesis parameters.

Type: Grant

Filed: February 26, 2001

Date of Patent: November 29, 2005

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Jean-Claude Junqua, Florent Perronnin, Roland Kuhn, Patrick Nguyen

Media production system using time alignment to scripts

Publication number: 20050228663

Abstract: A media production system includes a textual alignment module aligning multiple speech recordings to textual lines of a script based on speech recognition results. A navigation module responds to user navigation selections respective of the textual lines of the script by communicating to the user corresponding, line-specific portions of the multiple speech recordings. An editing module responds to user associations of multiple speech recordings with textual lines by accumulating line-specific portions of the multiple speech recordings in a combination recording based on at least one of relationships of textual lines in the script to the combination recording, and temporal alignments between the multiple speech recordings and the combination recording.

Type: Application

Filed: March 31, 2004

Publication date: October 13, 2005

Inventors: Robert Boman, Patrick Nguyen, Jean-Claude Junqua

Pattern matching for large vocabulary speech recognition with packed distribution and localized trellis access

Publication number: 20050159952

Abstract: A method is provided for improving pattern matching in a speech recognition system having a plurality of acoustic models (20). Similarity measures for acoustic feature vectors (54) are determined in groups that are then buffered into cache memory (59). To further reduce computational processing, the acoustic data may be partitioned amongst a plurality of processing nodes (66, 67, 68). In addition, a priori knowledge of the spoken order may be used to establish the access order (124) used to copy records from the main speech parameter table (120, 200) into a sub-table (130, 204). The sub-table is processed such that the entries are in contiguous memory locations (206) and sorted according to the processing order (208). The speech processing algorithm is then directed to operate upon the sub-table (210) which causes the processor to load the sub-table into high speed cache memory (104, 212).

Type: Application

Filed: March 19, 2003

Publication date: July 21, 2005

Applicant: Matsushita Electric Industrial Co., Ltd.

Inventors: Patrick Nguyen, Luca Rigazio

Speaker and environment adaptation based on linear separation of variability sources

Patent number: 6915259

Abstract: Linear approximation of the background noise is applied after feature extraction and prior to speaker adaptation to allow the speaker adaptation system to adapt the speech models to the enrolling user without distortion from background noise. The linear approximation is applied in the feature domain, such as in the cepstral domain. Any adaptation technique that is commutative in the feature domain may be used.

Type: Grant

Filed: May 24, 2001

Date of Patent: July 5, 2005

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Luca Rigazio, Patrick Nguyen, David Kryze, Jean-Claude Junqua

System and method of media file access and retrieval using speech recognition

Patent number: 6907397

Abstract: An embedded device for playing media files is capable of generating a play list of media files based on input speech from a user. It includes an indexer generating a plurality of speech recognition grammars. According to one aspect of the invention, the indexer generates speech recognition grammars based on contents of a media file header of the media file. According to another aspect of the invention, the indexer generates speech recognition grammars based on categories in a file path for retrieving the media file to a user location. When a speech recognizer receives an input speech from a user while in a selection mode, a media file selector compares the input speech received while in the selection mode to the plurality of speech recognition grammars, thereby selecting the media file.

Type: Grant

Filed: September 16, 2002

Date of Patent: June 14, 2005

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: David Kryze, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua

Focused language models for improved speech input of structured documents

Patent number: 6901364

Abstract: An e-mail message process is provided for use with a personal digital assistant which allows for the use of input speech messaging which is converted to text using a focused language model which is downloaded by a cellular phone connection to an Internet server which provides the focused language model based upon a topic for the intended e-mail message. The text that is generated from the input speech method can be summarized by the e-mail message processor and can be edited by the user. The generated e-mail message can then be transmitted again via cellular connection to an Internet e-mail server for transmitting the e-mail message to a recipient.

Type: Grant

Filed: September 13, 2001

Date of Patent: May 31, 2005

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Patrick Nguyen, Luca Rigazio, Jean-Claude Junqua

Eigenvoice re-estimation technique of acoustic models for speech recognition, speaker identification and speaker verification

Patent number: 6895376

Abstract: A reduced dimensionality eigenvoice analytical technique is used during training to develop context-dependent acoustic models for allophones. Re-estimation processes are performed to more strongly separate speaker-dependent and speaker-independent components of the speech model. The eigenvoice technique is also used during run time upon the speech of a new speaker. The technique removes individual speaker idiosyncrasies, to produce more universally applicable and robust allophone models. In one embodiment the eigenvoice technique is used to identify the centroid of each speaker, which may then be “subtracted out” of the recognition equation.

Type: Grant

Filed: May 4, 2001

Date of Patent: May 17, 2005

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Florent Perronnin, Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua

Pattern matching for large vocabulary speech recognition systems

Patent number: 6879954

Abstract: A method is provided for improving pattern matching in a speech recognition system having a plurality of acoustic models. The improved method includes: receiving continuous speech input; generating a sequence of acoustic feature vectors that represent temporal and spectral behavior of the speech input; loading a first group of acoustic feature vectors from the sequence of acoustic feature vectors into a memory workspace accessible to a processor; loading an acoustic model from the plurality of acoustic models into the memory workspace; and determining a similarity measure for each acoustic feature vector of the first group of acoustic feature vectors in relation to the acoustic model. Prior to retrieving another group of acoustic feature vectors, similarity measures are computed for the first group of acoustic feature vectors in relation to each of the acoustic models employed by the speech recognition system.

Type: Grant

Filed: April 22, 2002

Date of Patent: April 12, 2005

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Patrick Nguyen, Luca Rigazio

Voice tagging, voice annotation, and speech recognition for portable devices with optional post processing

Publication number: 20050075881

Abstract: A media capture device has an audio input receptive of user speech relating to a media capture activity in close temporal relation to the media capture activity. A plurality of focused speech recognition lexica respectively relating to media capture activities are stored on the device, and a speech recognizer recognizes the user speech based on a selected one of the focused speech recognition lexica. A media tagger tags captured media with generated speech recognition text, and a media annotator annotates the captured media with a sample of the user speech that is suitable for input to a speech recognizer. Tagging and annotating are based on close temporal relation between receipt of the user speech and capture of the captured media. Annotations may be converted to tags during post processing, employed to edit a lexicon using letter-to-sound rules and spelled word input, or matched directly to speech to retrieve captured media.

Type: Application

Filed: October 2, 2003

Publication date: April 7, 2005

Inventors: Luca Rigazio, Robert Boman, Patrick Nguyen, Jean-Claude Junqua

Bubble splitting for compact acoustic modeling

Publication number: 20050038655

Abstract: An improved method is provided for constructing compact acoustic models for use in a speech recognizer. The method includes: partitioning speech data from a plurality of training speakers according to at least one speech related criteria (i.e., vocal tract length); grouping together the partitioned speech data from training speakers having a similar speech characteristic; and training an acoustic bubble model for each group using the speech data within the group.

Type: Application

Filed: August 13, 2003

Publication date: February 17, 2005

Inventors: Ambroise Mutel, Patrick Nguyen, Luca Rigazio

Speech data mining for call center management

Publication number: 20050010411

Abstract: A speech data mining system for use in generating a rich transcription having utility in call center management includes a speech differentiation module differentiating between speech of interacting speakers, and a speech recognition module improving automatic recognition of speech of one speaker based on interaction with another speaker employed as a reference speaker. A transcript generation module generates a rich transcript based on recognized speech of the speakers. Focused, interactive language models improve recognition of a customer on a low quality channel using context extracted from speech of a call center operator on a high quality channel with a speech model adapted to the operator. Mined speech data includes number of interaction turns, customer frustration phrases, operator polity, interruptions, and/or contexts extracted from speech recognition results, such as topics, complaints, solutions, and resolutions.

Type: Application

Filed: July 9, 2003

Publication date: January 13, 2005

Inventors: Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua, Robert Boman

Pressure/flow control valve and system using same

Publication number: 20040211422

Abstract: A pressure/flow control valve that includes a first valve member and a second valve member moveable relative to one another and cooperating with one another so as to define a valve opening having a size that varies with a relative position between these members. A first magnet is operatively coupled to the first valve member, and a second magnet is operatively coupled to the second valve member and magnetically coupled to the first magnet. The first and second magnets are disposed such that a repulsive force between them increases asymptotically as the magnets move together, thereby providing a dampening force between the first and second valve members that allows these valve members to be moved rapidly from one position to the next in a highly controlled fashion.

Type: Application

Filed: April 26, 2004

Publication date: October 28, 2004

Applicant: RIC Investments, Inc.

Inventors: Mabini M. Arcilla, Mehdi M. Jafari, Patrick Nguyen
