Patents by Inventor David Kryze

David Kryze has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Speaker and environment adaptation based on linear separation of variability sources

Patent number: 6915259

Abstract: Linear approximation of the background noise is applied after feature extraction and prior to speaker adaptation to allow the speaker adaptation system to adapt the speech models to the enrolling user without distortion from background noise. The linear approximation is applied in the feature domain, such as in the cepstral domain. Any adaptation technique that is commutative in the feature domain may be used.

Type: Grant

Filed: May 24, 2001

Date of Patent: July 5, 2005

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Luca Rigazio, Patrick Nguyen, David Kryze, Jean-Claude Junqua

System and method of media file access and retrieval using speech recognition

Patent number: 6907397

Abstract: An embedded device for playing media files is capable of generating a play list of media files based on input speech from a user. It includes an indexer generating a plurality of speech recognition grammars. According to one aspect of the invention, the indexer generates speech recognition grammars based on contents of a media file header of the media file. According to another aspect of the invention, the indexer generates speech recognition grammars based on categories in a file path for retrieving the media file to a user location. When a speech recognizer receives an input speech from a user while in a selection mode, a media file selector compares the input speech received while in the selection mode to the plurality of speech recognition grammars, thereby selecting the media file.

Type: Grant

Filed: September 16, 2002

Date of Patent: June 14, 2005

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: David Kryze, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua
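
The indexing idea in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the `header` dictionary, the function names, and the sample library are hypothetical, and the recognizer output is stood in for by plain text rather than audio matched against real speech recognition grammars.

```python
import re
from pathlib import PurePosixPath


def tokens(text):
    """Lowercase a string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_grammars(path, header):
    """Collect the phrases that should select one media file.

    `header` is a hypothetical dict of tag fields (title, artist, ...).
    Each directory in `path` is treated as a category.
    """
    phrases = set()
    for value in header.values():                  # header-based phrases
        phrases.add(" ".join(tokens(value)))
    for part in PurePosixPath(path).parent.parts:  # path-based phrases
        phrases.add(" ".join(tokens(part)))
    phrases.discard("")                            # e.g. the root "/" yields no words
    return phrases


def build_index(library):
    """Map every file path to its set of selectable phrases."""
    return {path: build_grammars(path, header) for path, header in library.items()}


def select_files(index, recognized_text):
    """Return the files whose phrases contain the recognized utterance."""
    spoken = " ".join(tokens(recognized_text))
    return sorted(path for path, phrases in index.items() if spoken in phrases)


# Hypothetical library used only for illustration.
LIBRARY = {
    "/music/Jazz/Miles Davis/so_what.mp3": {"title": "So What", "artist": "Miles Davis"},
    "/music/Jazz/John Coltrane/naima.mp3": {"title": "Naima", "artist": "John Coltrane"},
    "/music/Rock/Queen/bicycle_race.mp3": {"title": "Bicycle Race", "artist": "Queen"},
}
INDEX = build_index(LIBRARY)
```

With this sketch, `select_files(INDEX, "Jazz")` returns both jazz files (a path category match), while `select_files(INDEX, "bicycle race")` returns the single file whose header title matches.
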

Apparatus and method for voice-tagging lexicon

Publication number: 20050114131

Abstract: A voice-tag editor develops voice-tag “sounds like” pairs for a voice-tagging lexicon. The voice-tag editor is receptive of alphanumeric characters input by a user. The alphanumeric characters are indicative of a voice tag and/or “sounds like” text. The voice-tag editor is configured to allow the user to view and edit the alphanumeric characters. A text parser connected to the voice-tag editor generates normalized text corresponding to the “sounds like” text. The normalized text serves as recognition text for the voice tag and is displayed by the voice-tag editor. A storage mechanism is connected to the editor. The storage mechanism updates the lexicon with the alphanumeric characters which represent voice-tag “sounds like” pairs.

Type: Application

Filed: November 24, 2003

Publication date: May 26, 2005

Inventors: Kirill Stoimenov, David Kryze, Peter Veprek
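
As a rough illustration of the parsing step in the abstract above, the sketch below normalizes "sounds like" text and stores the resulting pair. The normalization rules (spell out digits, drop punctuation, lowercase) and all names are assumptions made for this example; the abstract does not specify them.

```python
import re

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def normalize(sounds_like):
    """Turn user-typed "sounds like" text into plain recognition text.

    Assumed rules: each digit becomes its spoken word, everything that
    is not a letter is dropped, and the result is lowercased.
    """
    spaced = re.sub(r"[0-9]",
                    lambda m: f" {DIGIT_WORDS[int(m.group())]} ",
                    sounds_like)
    return " ".join(re.findall(r"[a-z]+", spaced.lower()))


def update_lexicon(lexicon, voice_tag, sounds_like):
    """Store a voice-tag / "sounds like" pair with its normalized text."""
    recognition_text = normalize(sounds_like)
    lexicon[voice_tag] = {
        "sounds_like": sounds_like,
        "recognition_text": recognition_text,
    }
    return recognition_text


lexicon = {}
shown_to_user = update_lexicon(lexicon, "GATE42B", "Gate 42-B")
```

Here `shown_to_user` is the normalized text that an editor would display back to the user for confirmation.
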

Methods and apparatus for speech end-point detection

Publication number: 20040064314

Abstract: In one aspect, the present invention provides a method for detecting speech end points in an input signal containing speech portions and non-speech (noise) portions. The method includes processing signal frames of a digital input signal containing speech and non-speech portions to extract features from the signal frames, comparing at least one property of the processed signal frames to a noise model and a speech model to determine whether a processed signal frame contains speech or noise, generating a signal indicative of the speech or noise determination, and updating either the speech model or the noise model depending upon whether a processed signal frame is determined to contain speech or noise, respectively. In some configurations, the method also includes resetting the speech and noise models dependent upon whether a number of zero crossings in a determined inter-frame correlation is greater than a threshold number.

Type: Application

Filed: September 27, 2002

Publication date: April 1, 2004

Inventors: Nicolas de Saint Aubert, David Kryze
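
The classify-then-update loop described in the abstract above might look like the following minimal sketch. It makes several simplifying assumptions that are not in the source: the only feature is per-frame log energy, each model is a single running mean, and the zero-crossing-based model reset is omitted. All names and initial values are hypothetical.

```python
import math


def frame_log_energy(frame):
    """Feature extraction: log of the mean squared sample value."""
    return math.log(sum(s * s for s in frame) / len(frame) + 1e-12)


def detect_speech(frames, noise_mean=-10.0, speech_mean=0.0, rate=0.1):
    """Label each frame True (speech) or False (noise).

    Each frame is assigned to whichever model mean is closer, and only
    that model is then updated with exponential smoothing.
    """
    labels = []
    for frame in frames:
        e = frame_log_energy(frame)
        is_speech = abs(e - speech_mean) < abs(e - noise_mean)
        if is_speech:
            speech_mean += rate * (e - speech_mean)   # update speech model
        else:
            noise_mean += rate * (e - noise_mean)     # update noise model
        labels.append(is_speech)
    return labels


def end_points(labels):
    """Return (start, end) frame indices of speech runs, end exclusive."""
    segments, start = [], None
    for i, is_speech in enumerate(labels):
        if is_speech and start is None:
            start = i
        elif not is_speech and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(labels)))
    return segments


# Synthetic input: three quiet frames, four loud frames, two quiet frames.
quiet = [0.001] * 160
loud = [0.5, -0.5] * 80
labels = detect_speech([quiet] * 3 + [loud] * 4 + [quiet] * 2)
segments = end_points(labels)
```

On this synthetic input the loud frames are labeled as speech, so `segments` holds a single run covering frames 3 through 6.
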

System and method of media file access and retrieval using speech recognition

Publication number: 20040054541

Abstract: An embedded device for playing media files is capable of generating a play list of media files based on input speech from a user. It includes an indexer generating a plurality of speech recognition grammars. According to one aspect of the invention, the indexer generates speech recognition grammars based on contents of a media file header of the media file. According to another aspect of the invention, the indexer generates speech recognition grammars based on categories in a file path for retrieving the media file to a user location. When a speech recognizer receives an input speech from a user while in a selection mode, a media file selector compares the input speech received while in the selection mode to the plurality of speech recognition grammars, thereby selecting the media file.

Type: Application

Filed: September 16, 2002

Publication date: March 18, 2004

Inventors: David Kryze, Luca Rigazio, Patrick Nguyen, Jean-Claude Junqua

Speaker and environment adaptation based on linear separation of variability sources

Publication number: 20030050780

Abstract: Linear approximation of the background noise is applied after feature extraction and prior to speaker adaptation to allow the speaker adaptation system to adapt the speech models to the enrolling user without distortion from background noise. The linear approximation is applied in the feature domain, such as in the cepstral domain. Any adaptation technique that is commutative in the feature domain may be used.

Type: Application

Filed: May 24, 2001

Publication date: March 13, 2003

Inventors: Luca Rigazio, Patrick Nguyen, David Kryze, Jean-Claude Junqua

Optimized local feature extraction for automatic speech recognition

Patent number: 6513004

Abstract: The acoustic speech signal is decomposed into wavelets arranged in an asymmetrical tree data structure from which individual nodes may be selected to best extract local features, as needed to model specific classes of sound units. The wavelet packet transformation is smoothed through integration and compressed to apply a non-linearity prior to discrete cosine transformation. The resulting subband features such as cepstral coefficients may then be used to construct the speech recognizer's speech models. Using the local feature information extracted in this manner allows a single recognizer to be optimized for several different classes of sound units, thereby eliminating the need for parallel path recognizers.

Type: Grant

Filed: November 24, 1999

Date of Patent: January 28, 2003

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Luca Rigazio, David Kryze, Ted Applebaum, Jean-Claude Junqua
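
The processing chain in the abstract above (wavelet packet decomposition, node selection, integration, compression, discrete cosine transform) can be sketched as follows. This is only an illustration under stated assumptions: the Haar filter, the log non-linearity, the per-band mean as the integration step, and the particular asymmetric set of tree nodes are all choices made for this example, not details taken from the patent.

```python
import math


def haar_split(signal):
    """One Haar analysis step: return (low band, high band), each half length."""
    s = 1 / math.sqrt(2)
    pairs = list(zip(signal[0::2], signal[1::2]))
    return [s * (a + b) for a, b in pairs], [s * (a - b) for a, b in pairs]


def packet_node(signal, path):
    """Follow a path such as "aad" down the tree ('a' = low, 'd' = high).

    The signal length must be divisible by 2 ** len(path).
    """
    band = list(signal)
    for step in path:
        low, high = haar_split(band)
        band = low if step == "a" else high
    return band


def dct2(values):
    """Unnormalized type-II discrete cosine transform."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n)]


def local_features(frame, paths):
    """Cepstral-like features from the selected wavelet packet nodes."""
    log_energies = []
    for path in paths:
        band = packet_node(frame, path)
        energy = sum(x * x for x in band) / len(band)   # integrate over the band
        log_energies.append(math.log(energy + 1e-10))   # assumed log compression
    return dct2(log_energies)


# Asymmetric tree: the low band is split three times, the high band once.
PATHS = ["aaa", "aad", "ad", "d"]
frame = [math.sin(0.3 * i) for i in range(16)]
features = local_features(frame, PATHS)
```

Choosing a different set of paths changes which parts of the tree contribute features, which is how a single front end could be tuned toward different classes of sound units.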
