Patents Examined by David D. Knepper

Computationally efficient method and apparatus for speaker recognition

Patent number: 6772119

Abstract: A speaker recognition technique is provided that can operate within the memory and processing constraints of existing portable computing devices. A smaller memory footprint and computational efficiency are achieved using single Gaussian models for each enrolled speaker. During enrollment, features are extracted from one or more enrollment utterances from each enrolled speaker, to generate a target speaker model based on a sample covariance matrix. During a recognition phase, features are extracted from one or more test utterances to generate a test utterance model that is also based on the sample covariance matrix. A sphericity ratio is computed that compares the test utterance model to the target speaker model, as well as a background model. The sphericity ratio indicates how similar test utterance speech is to the speech used when the user was enrolled, as represented by the target speaker model, and how dissimilar the test utterance speech is from the background model.

Type: Grant

Filed: December 10, 2002

Date of Patent: August 3, 2004

Assignee: International Business Machines Corporation

Inventors: Upendra V. Chaudhari, Ganesh N. Ramaswamy, Ran Zilca

DATA PROCESSING APPARATUS FOR PROCESSING SOUND DATA, A DATA PROCESSING METHOD FOR PROCESSING SOUND DATA, A PROGRAM PROVIDING MEDIUM FOR PROCESSING SOUND DATA, AND A RECORDING MEDIUM FOR PROCESSING SOUND DATA

Patent number: 6772113

Abstract: A data processing apparatus and method in which spectral characteristic information and waveform characteristic information within a time area are detected from inputted audio data and the detected spectral characteristic information and waveform characteristic information are recorded together with information indicating a correspondence relationship with the audio data. As a result, an efficient search can be achieved when searching audio data.

Type: Grant

Filed: January 21, 2000

Date of Patent: August 3, 2004

Assignee: Sony Corporation

Inventors: Noriaki Fujita, Yasuhiro Toguri

Method and a device for recognizing speech

Patent number: 6772117

Abstract: In a speech recognition method and apparatus, according to the present invention, feature vectors produced by an analysing unit of a speech recognition device are modified for compensating the effects of noise. According to the invention, feature vectors are normalized using a sliding normalization buffer (31). By means of the method according to the invention, the performance of the speech recognition device improves in situations, wherein the speech recognition device's training phase has been carried out in a noise environment that differs from the noise environment of the actual speech recognition phase.

Type: Grant

Filed: April 9, 1998

Date of Patent: August 3, 2004

Assignee: Nokia Mobile Phones Limited

Inventors: Kari Laurila, Olli Viikki
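
The abstract of patent 6772117 says only that feature vectors are normalized using a sliding normalization buffer. The sketch below illustrates one common reading of that idea, assuming per-dimension mean and standard-deviation normalization over the most recent frames. The function name `sliding_normalize`, the buffer length, and the `eps` guard are illustrative assumptions, not details taken from the patent.

```python
from collections import deque
from math import sqrt


def sliding_normalize(frames, buffer_len=5, eps=1e-8):
    """Normalize each feature vector using statistics of a sliding buffer.

    Assumption: mean/standard-deviation normalization per dimension over
    the last `buffer_len` frames, including the current one.
    """
    buffer = deque(maxlen=buffer_len)  # oldest frame drops out automatically
    normalized = []
    for frame in frames:
        buffer.append(frame)
        n = len(buffer)
        dims = range(len(frame))
        means = [sum(f[d] for f in buffer) / n for d in dims]
        stds = [sqrt(sum((f[d] - means[d]) ** 2 for f in buffer) / n) for d in dims]
        # eps avoids division by zero while the buffer holds identical frames
        normalized.append([(frame[d] - means[d]) / (stds[d] + eps) for d in dims])
    return normalized


clean = sliding_normalize([[1.0], [2.0], [3.0]])
shifted = sliding_normalize([[101.0], [102.0], [103.0]])  # same data plus a constant offset
```

Under these assumptions a constant offset added to every frame, such as a stationary noise bias, leaves the normalized output unchanged, which is the kind of training/test mismatch the abstract targets.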

Content-driven speech- or audio-browser

Patent number: 6772124

Abstract: The Internet is searched in order to find resources that provide streamable audio such as live Internet broadcasts. The resources are identified based on their file extension and are categorized according to, e.g., the natural language or music style. The user is enabled to browse the collection based on textual or musical input.

Type: Grant

Filed: November 5, 2002

Date of Patent: August 3, 2004

Assignee: Koninklijke Philips Electronics N.V.

Inventors: Mark B. Hoffberg, Yevgeniy Eugene Shteyn

Character animation

Patent number: 6772122

Abstract: The present invention provides a method and apparatus for generating an animated character representation. This is achieved by using marked-up data including both content data and presentation data. The system then uses this information to generate phoneme and viseme data representing the speech to be presented by the character. By providing the presentation data this ensures that at least some variation in character appearance will automatically occur beyond that of the visemes required to make the character appear to speak. This contributes to the character having a far more lifelike appearance.

Type: Grant

Filed: February 11, 2003

Date of Patent: August 3, 2004

Assignee: Ananova Limited

Inventors: Jonathan Simon Jowitt, William James Cooper, Andrew Robert Burgess

Apparatus and method for noise attenuation in a speech recognition system

Patent number: 6768979

Abstract: The noise suppressor utilizes statistical characteristics of the noise signal to attenuate amplitude values of the noisy speech signal that have a probability of containing noise. In one embodiment, the noise suppressor utilizes an attenuation function having a shape determined in part by a noise average and a noise standard deviation. In a further embodiment, the noise suppressor also utilizes an adaptive attenuation coefficient that depends on signal-to-noise conditions in the speech recognition system.

Type: Grant

Filed: March 31, 1999

Date of Patent: July 27, 2004

Assignees: Sony Corporation, Sony Electronics Inc.

Inventors: Xavier Menéndez-Pidal, Miyuki Tanaka, Ruxin Chen

Sentence recognition apparatus, sentence recognition method, program, and medium

Patent number: 6763331

Abstract: In the prior art, it has been difficult to perform proper sentence recognition by using speech recognition or text sentence recognition. The present invention provides a sentence recognition apparatus comprising: a data base for storing a plurality of predetermined standard content word pairs each formed from a plurality of predetermined content words; a speech recognition means of recognizing an input sentence made up of a plurality of words; a content word selection means of selecting content words from among the plurality of words forming the recognized sentence; a judging means of judging whether a content word pair arbitrarily formed from the selected content words matches any one of the standard content word pairs stored in the data base; and an erroneously recognized content word determining means 105 of determining, based on the result of the judgement, an erroneously recognized content word for which the recognition failed from among the selected content words.

Type: Grant

Filed: April 9, 2003

Date of Patent: July 13, 2004

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Yumi Wakita, Kenji Matsui
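
The content-word-pair check described in the abstract of patent 6763331 can be sketched roughly as follows. The abstract does not state how the judgement result is turned into a decision, so the rule used here (flag a content word that forms no stored standard pair with any other selected word) is an assumption, as are the name `find_misrecognized` and the example vocabulary.

```python
from itertools import combinations


def find_misrecognized(content_words, standard_pairs):
    """Return content words that match no standard pair with any other word.

    Assumed decision rule: a word supported by at least one stored pair is
    treated as correctly recognized; the rest are flagged.
    """
    known = {frozenset(pair) for pair in standard_pairs}  # pairs are unordered
    supported = set()
    for a, b in combinations(content_words, 2):
        if frozenset((a, b)) in known:
            supported.update((a, b))
    return [w for w in content_words if w not in supported]


# Hypothetical database of standard content word pairs.
pairs = [("book", "flight"), ("flight", "tokyo"), ("book", "tokyo")]
flagged = find_misrecognized(["book", "fright", "tokyo"], pairs)
```

Here `flagged` contains only `"fright"`, since it pairs with neither `"book"` nor `"tokyo"` in the stored database.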

Speech detection system and method

Patent number: 6757651

Abstract: A system, method and computer program product for performing speech detection. The method first receives a sound signal and determines if the energy value of the sound signal is above a threshold energy value. If the energy level of the signal is above the threshold energy value, the method determines a predictive signal of the received signal, subtracts the predictive signal from the signal, and determines if the result of the subtraction indicates the presence of speech. If it is determined that no presence of speech is indicated, the threshold energy value is set to the energy level of the present received signal. If it is determined that the result of the subtraction indicates the presence of speech, the received signal is sent to a speech recognition engine. The speech recognition engine generates control system commands for controlling one or more system components. The system components are vehicle system components.

Type: Grant

Filed: December 17, 2001

Date of Patent: June 29, 2004

Assignee: Intellisist, LLC

Inventor: Julien Rivarol Vergin
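
The control flow in the abstract of patent 6757651 (energy gate, prediction, subtraction, threshold update) can be outlined as below. This is a minimal sketch under stated assumptions: the abstract does not specify the predictor or the decision test, so a first-order linear predictor and a residual-to-frame energy ratio are used as stand-ins. The names `SpeechDetector`, `coeff`, and `residual_ratio` are hypothetical.

```python
import math


def energy(samples):
    """Mean squared amplitude of a frame."""
    return sum(s * s for s in samples) / len(samples)


def predict(samples, coeff=0.9):
    """Assumed predictor: each sample is predicted from the previous one."""
    return [0.0] + [coeff * s for s in samples[:-1]]


class SpeechDetector:
    def __init__(self, threshold=0.01, residual_ratio=0.1):
        self.threshold = threshold            # adaptive energy threshold
        self.residual_ratio = residual_ratio  # assumed decision parameter

    def process(self, frame):
        """Return True if the frame should be passed to the recognizer."""
        e = energy(frame)
        if e <= self.threshold:
            return False  # below the energy gate: nothing further is done
        residual = [s - p for s, p in zip(frame, predict(frame))]
        # Assumption: a well-predicted (low-residual) frame indicates speech.
        is_speech = energy(residual) < self.residual_ratio * e
        if not is_speech:
            self.threshold = e  # raise the gate to the current energy level
        return is_speech


detector = SpeechDetector()
tone = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(160)]
noise = [0.5, -0.5] * 80
results = [detector.process(tone), detector.process(noise)]
```

A caller would forward frames for which `process` returns `True` to the speech recognition engine; the threshold update on rejected frames follows the abstract, while everything about the decision itself is illustrative.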

System for selling a product utilizing audio content identification

Patent number: 6748360

Abstract: It is determined whether audio identifying information generated for an audio content image matches audio identifying information in an audio content database. If the audio identifying information generated for the audio content image matches audio identifying information in the audio content database, at least one product containing or relating to audio content that corresponds to the matching audio identifying information is identified. In one embodiment, the audio content image is received, and the audio identifying information is generated for the audio content image. In another embodiment, the audio identifying information for the audio content image is received. Also provided is a system for selling products.

Type: Grant

Filed: February 21, 2002

Date of Patent: June 8, 2004

Assignee: International Business Machines Corporation

Inventors: Michael C. Pitman, Blake G. Fitch, Steven Abrams, Robert S. Germain

Differential stereo using two coding techniques

Patent number: 6741965

Abstract: A first audio signal is generated from a number of stereo input channels (such as a left and a right channel). A signal level that corresponds to one of the plurality of input channels and another signal level from another of the plurality of input channels are determined. A second audio signal is selected on the basis of the signal levels such that the second audio signal is selected from the group consisting of the one input channel, the other input channel, and a signal generated from the number of input channels that is different than the first audio signal. The first audio signal and the selected second audio signal are separately coded.

Type: Grant

Filed: March 12, 1999

Date of Patent: May 25, 2004

Assignee: Sony Corporation

Inventors: Osamu Shimoyoshi, Kyoya Tsutsui

Spectral enhancement of acoustic signals to provide improved recognition of speech

Patent number: 6732073

Abstract: A method and apparatus for enhancing an auditory signal to make sounds, particularly speech sounds, more distinguishable. An input auditory signal is divided into a plurality of spectral channels. An output gain for each channel is derived based on the time varying history of the energy in the channel and, preferably, the time varying history of energy in neighboring channels. The magnitude of the output gain for each channel thus derived is preferably inversely related to the history of energy in the channel. The output gain derived for each channel is applied to the channel to form a plurality of modified spectral channel signals. The plurality of modified spectral channel signals are combined to form an enhanced output auditory signal. The present invention is particularly applicable to electronic hearing aid devices, speech recognition systems, and the like.

Type: Grant

Filed: September 7, 2000

Date of Patent: May 4, 2004

Assignee: Wisconsin Alumni Research Foundation

Inventors: Keith R. Kluender, Rick L. Jenison
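
The per-channel gain rule in the abstract of patent 6732073 (gain inversely related to the channel's energy history) can be illustrated with a small sketch. The exponential moving average used as the history, the gain formula `1 / (1 + history)`, and the name `enhance` are assumptions chosen for simplicity; the neighboring-channel term the abstract mentions as preferable is omitted.

```python
def enhance(channel_frames, decay=0.9):
    """Apply a history-dependent gain to each spectral channel.

    channel_frames: list of frames, each a list of per-channel amplitudes.
    Assumption: history is an exponential moving average of past energy,
    and gain = 1 / (1 + history), so sustained energy is attenuated.
    """
    n_channels = len(channel_frames[0])
    history = [0.0] * n_channels
    output = []
    for frame in channel_frames:
        gains = [1.0 / (1.0 + h) for h in history]
        output.append([g * x for g, x in zip(gains, frame)])
        # Update the history only after the gain for this frame is applied.
        history = [decay * h + (1.0 - decay) * x * x
                   for h, x in zip(history, frame)]
    return output


# Channel 0 is active throughout; channel 1 becomes active in the last frame.
frames = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
enhanced = enhance(frames)
```

In the final frame the newly active channel passes at full gain while the long-active channel is attenuated, which emphasizes spectral change in the way the abstract describes.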

Speech recognizing GIS/GPS/AVL system

Patent number: 6732077

Abstract: A speech recognition equipped geographic information recording apparatus and method. In one embodiment, a mobile data terminal has a communication node therein. A geographic mapping system is integral with the mobile data terminal, and is coupled to the communication node. A speech recognition system adapted to receive verbal information is also coupled to the mobile data terminal. The speech recognition system is adapted to receive attribute data verbalized by an operator of the mobile data terminal. Additionally, the speech recognition system is adapted to receive operating commands verbalized by an operator of the mobile data terminal. The communication node of the mobile data terminal includes a transmitter for sending information from the mobile data terminal to a desired location, and a receiver for receiving information from a desired location. In the present embodiment, a real-time communication link exists between the mobile data terminal and the desired location.

Type: Grant

Filed: May 28, 1996

Date of Patent: May 4, 2004

Assignee: Trimble Navigation Limited

Inventors: Charles Gilbert, James M. Janky, Charles N. Branch, Mark E. Nichols

Apparatus and method using speech recognition and scripts to capture, author and playback synchronized audio and video

Patent number: 6728682

Abstract: Audio associated with a video program, such as an audio track or live or recorded commentary, may be analyzed to recognize or detect one or more predetermined sound patterns, such as words or sound effects. The recognized or detected sound patterns may be used to enhance video processing, by controlling video capture and/or delivery during editing, or to facilitate selection of clips or splice points during editing.

Type: Grant

Filed: October 25, 2001

Date of Patent: April 27, 2004

Assignee: Avid Technology, Inc.

Inventor: Peter Fasciano

Conversion scheme for use between DTX and non-DTX speech coding systems

Patent number: 6721712

Abstract: In an exemplary conversion scheme, a frame of a first speech signal comprising a plurality of frames encoded at a plurality of first rates, including a first non-speech rate, is received. The rate of the received frame is determined, and if the received frame is encoded at the first non-speech rate, then the received frame is re-encoded at either a second or third non-speech rate to generate a frame of a second speech signal. Moreover, a system for converting a speech signal comprises a receiver for receiving a frame of a first speech signal and a processor capable of determining the encoding rate of the received frame and re-encoding the received frame at either a second or third non-speech rate if the received frame was originally encoded at a first non-speech rate.

Type: Grant

Filed: January 24, 2002

Date of Patent: April 13, 2004

Assignee: Mindspeed Technologies, Inc.

Inventors: Adil Benyassine, Eyal Shlomot, Huan-Yu Su

Method and apparatus for reducing aliasing in cascaded filter banks

Patent number: 6718300

Abstract: A method and apparatus are disclosed for reducing aliasing between neighboring subbands in cascaded filter banks. An alias reduction filter bank is included to reduce the aliasing components between different subbands. Generally, the magnitude response and phase of the alias reduction filter bank is similar to the magnitude response of the synthesis filter bank of the first stage filter bank. The alias reduction filter bank filters and adds the signals from a set of M2 subbands from the M1 subbands of the first stage analysis filter bank. A higher frequency resolution is obtained after the alias reduction stage by a following analysis filter bank. The signals of these subbands are first fed into an alias reduction filter bank to reduce the aliasing.

Type: Grant

Filed: June 2, 2000

Date of Patent: April 6, 2004

Assignee: Agere Systems Inc.

Inventor: Gerald Dietrich Schuller

Speech band division decoder

Patent number: 6718295

Abstract: A volume control unit 3 obtains, from a scale factor updating unit 2, data about how much the playback scale factor of sub-bands adjacent to a sub-band to be emphasized has been reduced, and increases the analog audio signal output level of a volume control 77 to an extent corresponding to the playback scale factor necessary for restoring the playback scale factor before the reduction. As a result, the signal level of the desired sub-band (i.e., sub-band to be emphasized) is increased to obtain a sufficient sense of emphasis of the desired sub-band.

Type: Grant

Filed: November 5, 1998

Date of Patent: April 6, 2004

Assignee: NEC Corporation

Inventor: Satoshi Hasegawa

Voiceband signal classifier

Patent number: 6708146

Abstract: A method and apparatus for classifying signals into a multiplicity of signal classes which employs discriminant functions of low-complexity discriminant variables that are computed directly from the passband signal. The method can be applied to the problem of classifying voiceband data (VBD), facsimile (FAX), native binary data, and speech on a 64 Kbps digital channel. In a hybrid two stage classification system, the first stage employs linear discriminant functions to make classification decisions into a smaller number of possible preliminary signal classes. The decisions of the first stage are then refined by a second stage that uses nonlinear discriminant functions such as quadratic or pseudo-quadratic functions. The second stage of a hybrid classifier then assigns the signal into a larger number of possible classes than does the first stage of the classifier alone.

Type: Grant

Filed: April 30, 1999

Date of Patent: March 16, 2004

Assignee: Telecommunications Research Laboratories

Inventors: Jeremy S. Sewall, Bruce F. Cockburn, Deepak P. Sarda

Speech recognition user interface

Patent number: 6697777

Abstract: A system and method for generating a user interface for a speech recognition program module which provides user feedback by inserting a place mark or bar into the text of the document at the insertion point. The place mark indicates to the user that the speech recognition program module has recorded the dictated speech string and is in the process of translating the speech string. The place mark consists of a string of characters, such as a string of ellipses. The place mark has a length that is proportional in length to the expected length of the text that the user has dictated. The length of the place mark is based on the elapsed time of the speech string dictated by the user. When the speech recognition engine has completed the translation of the speech string into text, the final text replaces the place mark in the document. The place mark may be highlighted in different colors or the characters rendered in different colors to indicate to the user the volume level of the speech string being translated.

Type: Grant

Filed: June 28, 2000

Date of Patent: February 24, 2004

Assignee: Microsoft Corporation

Inventors: Sebastian Sai Dun Ho, Jeffrey C. Reynar
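
The place-mark behavior in the abstract of patent 6697777 can be sketched as three small steps: size the mark from the elapsed dictation time, insert it at the insertion point, and replace it with the final text. The function names and the `chars_per_second` rate below are illustrative assumptions; the abstract gives no specific rate.

```python
def place_mark(elapsed_seconds, chars_per_second=12, mark_char="."):
    """Build a place mark whose length scales with elapsed dictation time."""
    length = max(1, round(elapsed_seconds * chars_per_second))
    return mark_char * length


def insert_at(document, pos, text):
    """Insert text at the insertion point."""
    return document[:pos] + text + document[pos:]


def finalize(document, pos, mark, final_text):
    """Replace the place mark at `pos` with the recognized text."""
    return document[:pos] + final_text + document[pos + len(mark):]


doc = "Dear Bob, "
pos = len(doc)
mark = place_mark(2.0)                      # two seconds of dictation
pending = insert_at(doc, pos, mark)         # shown while recognition runs
done = finalize(pending, pos, mark, "thanks for the note")
```

Tracking the insertion position, rather than searching the document for the mark, avoids replacing an unrelated run of dots that the user typed earlier.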

Speaker recognition using a hierarchical speaker model tree

Patent number: 6684186

Abstract: In an illustrative embodiment, a speaker model is generated for each of a number of speakers from which speech samples have been obtained. Each speaker model contains a collection of distributions of audio feature data derived from the speech sample of the associated speaker. A hierarchical speaker model tree is created by merging similar speaker models on a layer by layer basis. Each time two or more speaker models are merged, a corresponding parent speaker model is created in the next higher layer of the tree. The tree is useful in applications such as speaker verification and speaker identification.

Type: Grant

Filed: January 26, 1999

Date of Patent: January 27, 2004

Assignee: International Business Machines Corporation

Inventors: Homayoon S. M. Beigi, Stephane H. Maes, Jeffrey S. Sorensen
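
The layer-by-layer merging in the abstract of patent 6684186 can be illustrated with a simplified sketch. Each speaker model is reduced here to a single mean vector, similarity is squared Euclidean distance, and a parent is the unweighted average of its two children. All of these are assumptions made for brevity, since the abstract describes each model as a collection of distributions and does not state the merge rule. `Node` and `build_tree` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity comparison, so list.remove targets one node
class Node:
    mean: list
    children: list = field(default_factory=list)


def distance(a, b):
    """Assumed similarity measure: squared Euclidean distance of the means."""
    return sum((x - y) ** 2 for x, y in zip(a.mean, b.mean))


def merge(a, b):
    """Create a parent model for two similar models (unweighted average)."""
    return Node([(x + y) / 2 for x, y in zip(a.mean, b.mean)], [a, b])


def build_tree(models):
    """Merge a non-empty list of models layer by layer into a single root."""
    layer = list(models)
    while len(layer) > 1:
        remaining, next_layer = list(layer), []
        while len(remaining) > 1:
            a = remaining.pop(0)
            closest = min(remaining, key=lambda m: distance(a, m))
            remaining.remove(closest)
            next_layer.append(merge(a, closest))
        next_layer.extend(remaining)  # an unpaired model moves up unchanged
        layer = next_layer
    return layer[0]


speakers = [Node([0.0]), Node([5.0]), Node([0.1]), Node([5.2])]
root = build_tree(speakers)
```

With these four one-dimensional models the two low-valued speakers share one parent and the two high-valued speakers share another, so a verification or identification search can descend from the root instead of scoring every enrolled speaker.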

Noise reduced speech recognition parameters

Patent number: 6678656

Abstract: A voice sample characterization front-end suitable for use in a distributed speech recognition context. A digitized voice sample 31 is split between a low frequency path 32 and a high frequency path 33. Both paths are used to determine spectral content suitable for use when determining speech recognition parameters (such as cepstral coefficients) that characterize the speech sample for recognition purposes. The low frequency path 32 has a thorough noise reduction capability. In one embodiment, the results of this noise reduction are used by the high frequency path 33 to aid in de-noising without requiring the same level of resource capacity as used by the low frequency path 32.

Type: Grant

Filed: January 30, 2002

Date of Patent: January 13, 2004

Assignee: Motorola, Inc.

Inventors: Dusan Macho, Yan Ming Cheng
