Patents Examined by Tālivaldis Ivars Šmits

Process and device for evaluating the quality of a transmitted voice signal

Patent number: 6427133

Abstract: For the automatic evaluation of the transmission quality of a voice signal that is transmitted by a digital transmission system (1) frequency components that correspond to a data frame rate (fr) of the digital transmission system (1) are extracted from the received voice signal and analyzed. In order to obtain a measurement value that is independent of signal strength, a standardization with the spectral output values occurring in the middle of the quoted frequencies can be conducted. For comparison purposes, an undistorted voice signal is processed the same way in advance. The measurement values of the transmitted voice signal are placed at a ratio to the reference values and then evaluated, e.g., by a neuronal network (13).

Type: Grant

Filed: April 19, 1999

Date of Patent: July 30, 2002

Assignee: Ascom Infrasys AG

Inventors: Martin Paping, Thomas Fahnle
Non-interactive enrollment in speech recognition

Patent number: 6424943

Abstract: A computer enrolls a user in a speech recognition system by obtaining data representing a user's speech, the speech including multiple user utterances and generally corresponding to an enrollment text, and analyzing acoustic content of data corresponding to a user utterance. The computer determines, based on the analysis, whether the user utterance matches a portion of the enrollment text. If so, the computer uses the acoustic content of the user utterance to update acoustic models corresponding to the portion of the enrollment text. The computer may determine that the user utterance matches a portion of the enrollment text even when the user has skipped or repeated words of the enrollment text.

Type: Grant

Filed: July 24, 2000

Date of Patent: July 23, 2002

Assignee: Scansoft, Inc.

Inventors: Stefan Sherwood, David Wilsberg Parmenter, Joel M. Gould, Toffee A. Albina, Allan Gold
Quantization using frequency and mean compensated frequency input data for robust speech recognition

Patent number: 6418412

Abstract: A speech recognition system utilizes multiple quantizers to process frequency parameters and mean compensated frequency parameters derived from an input signal. The quantizers may be matrix and vector quantizer pairs, and such quantizer pairs may also function as front ends to second stage speech classifiers such as hidden Markov models (HMMs) and/or utilize neural network postprocessing to, for example, improve speech recognition performance. Mean compensating the frequency parameters can remove noise frequency components that remain approximately constant during the duration of the input signal. HMM initial state and state transition probabilities derived from common quantizer types and the same input signal may be consolidated to improve recognition system performance and efficiency. Matrix quantization exploits the “evolution” of the speech short-term spectral envelopes as well as frequency domain information, and vector quantization (VQ) primarily operates on frequency domain information.

Type: Grant

Filed: August 28, 2000

Date of Patent: July 9, 2002

Assignee: Legerity, Inc.

Inventors: Safdar M. Asghar, Lin Cong
System and method for effectively implementing fixed masking thresholds in an audio encoder device

Patent number: 6418404

Abstract: A system and method for effectively implementing fixed masking thresholds in an audio encoder device comprises a filter bank for filtering source audio data to produce frequency sub-bands, a lookup table for storing masking threshold corresponding to the frequency sub-bands, and a bit allocator for using the masking thresholds to identify and discard masked audio data to thereby reduce the total amount of audio data that requires processing.

Type: Grant

Filed: December 28, 1998

Date of Patent: July 9, 2002

Assignees: Sony Corporation, Sony Electronics Inc.

Inventor: Lin Yin
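
The lookup-table idea in the abstract of patent 6418404 above can be illustrated with a minimal sketch. It assumes one fixed threshold per sub-band and a simple "keep if louder than the threshold" rule; the threshold values, the dB units, and the name `select_audible_subbands` are hypothetical and are not taken from the patent.

```python
# Hypothetical fixed masking thresholds (in dB), one entry per frequency sub-band.
# A real encoder would derive these from a psychoacoustic model ahead of time.
MASKING_THRESHOLD_DB = [30.0, 24.0, 18.0, 20.0]


def select_audible_subbands(subband_levels_db, thresholds_db=MASKING_THRESHOLD_DB):
    """Return the indices of sub-bands whose level exceeds the stored threshold.

    Sub-bands at or below their threshold are treated as masked and dropped,
    so later stages have less data to process.
    """
    if len(subband_levels_db) != len(thresholds_db):
        raise ValueError("need exactly one level per sub-band threshold")
    return [
        index
        for index, (level, threshold) in enumerate(zip(subband_levels_db, thresholds_db))
        if level > threshold
    ]


levels = [42.0, 10.0, 18.5, 5.0]
print(select_audible_subbands(levels))  # -> [0, 2]
```

Because the thresholds are fixed, the per-frame work is a table lookup and a comparison rather than a full masking computation.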
Subband coder or decoder band-limiting the overlap region between a processed subband and an adjacent non-processed one

Patent number: 6415251

Abstract: If all original subbands are not selected for processing in conventional subband coders or decoders aliasing distortion is generated by the characteristics of their subband band-splitting filters or subband band synthesis filters. To improve sound quality in a subband decoder the decoded frequency components in the overlap region adjacent to a subband selected not to be decoded are band-limited prior to synthesis. Alternatively, in a subband coder the sound quality in a processed subband adjacent to one not to be coded is improved by band-limiting the filtering frequency overlap region between these subbands prior to coding. By thus decoding only the non-overlapping part of the subband adjacent to an omitted subband signal distortion is reduced.

Type: Grant

Filed: June 3, 1999

Date of Patent: July 2, 2002

Assignee: Sony Corporation

Inventors: Yoshiaki Oikawa, Mitsuyuki Hatanaka, Kenzo Akagiri
Frame-based subband Kalman filtering method and apparatus for speech enhancement

Patent number: 6408269

Abstract: A method and apparatus for enhancing a speech signal contaminated by additive noise through Kalman filtering. The speech is decomposed into subband speech signals by a multichannel analysis filter bank including bandpass filters and decimation filters. Each subband speech signal is converted into a sequence of voice frames. A plurality of low-order Kalman filters are respectively applied to filter each of the subband speech signals. The autoregression (AR) parameters which are required for each Kalman filter are estimated frame-by-frame by using a correlation subtraction method to estimate the autocorrelation function and solving the corresponding Yule-Walker equations for each of the subband speech signals, respectively. The filtered subband speech signals are then combined or synthesized by a multichannel synthesis filter bank including interpolation filters and bandpass filters, and the outputs of the multichannel synthesis filter bank are summed in an adder to produce the enhanced fullband speech signal.

Type: Grant

Filed: March 3, 1999

Date of Patent: June 18, 2002

Assignee: Industrial Technology Research Institute

Inventors: Wen-Rong Wu, Po-Cheng Chen, Hwai-Tsu Chang, Chun-Hung Kuo
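
The abstract of patent 6408269 above mentions solving Yule-Walker equations to obtain AR parameters for each subband. As a hedged illustration of that one step only, the sketch below uses the standard Levinson-Durbin recursion; the abstract does not say which solver is used, and the correlation subtraction step and the Kalman filter itself are not shown. The function name and argument layout are assumptions.

```python
def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for AR coefficients.

    r     : autocorrelation values r[0], r[1], ..., r[order]
    order : AR model order p
    Returns (a, err), where x[n] is predicted as sum(a[k] * x[n - 1 - k])
    and err is the remaining prediction error power.
    """
    if len(r) < order + 1:
        raise ValueError("need order + 1 autocorrelation values")
    if r[0] <= 0:
        raise ValueError("r[0] must be positive")
    a = [0.0] * order
    err = r[0]
    for m in range(order):
        # Reflection coefficient for growing the model from order m to m + 1.
        acc = r[m + 1] - sum(a[k] * r[m - k] for k in range(m))
        k_m = acc / err
        prev = a[:]
        a[m] = k_m
        for k in range(m):
            a[k] = prev[k] - k_m * prev[m - 1 - k]
        err *= 1.0 - k_m * k_m
    return a, err


# Autocorrelation of an ideal AR(1) process with coefficient 0.5.
coeffs, error_power = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, error_power)  # -> [0.5, 0.0] 0.75
```

In a frame-based scheme such as the one described, a routine like this would be called once per frame and per subband, which is one reason low model orders are attractive.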
Natural language dialogue system automatically continuing conversation on behalf of a user who does not respond

Patent number: 6397188

Abstract: A natural language conversation system makes interaction between the user and the system smooth by continuing the conversation, either by presenting guidance or by having the system carry out speech for the user, when the user cannot respond to a presentation of the system, and can reduce the load required for the user's speech. The natural language conversation system includes a conversation control portion for outputting data in a conversation scenario, and an alternative conversation portion for inputting preliminarily prepared natural language data to the conversation control portion on behalf of a user when the user fails to input within a predetermined period.

Type: Grant

Filed: July 20, 1999

Date of Patent: May 28, 2002

Assignee: NEC Corporation

Inventor: Tohru Iwasawa
Method and system for performing speech recognition based on best-word scoring of repeated speech attempts

Patent number: 6397180

Abstract: A method and system for improving word classification performance of a speech recognition system having a predefined vocabulary of acceptable words and allowing multiple speech attempts from a user. At least one best word and corresponding word and non-word scores are determined for each speech attempt by the user. At least one common best word is then determined among all the speech attempts. If the highest ranking best word is common across all attempts, that word is used to accept the speech input. Otherwise, an objective measure representing a confidence level of the corresponding word and non-word scores is determined for each of the common best words for each speech attempt by the user. Each of the objective measures is then compared to a predetermined threshold. The speech attempts by the user are then classified based on the comparison of the objective measure to the predetermined threshold.

Type: Grant

Filed: May 22, 1996

Date of Patent: May 28, 2002

Assignee: Qwest Communications International Inc.

Inventors: Paul D. Jaramillo, Frank Haiqing Wu
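
The decision flow in the abstract of patent 6397180 above can be sketched roughly as follows. The abstract does not define the objective measure, so this sketch assumes a simple margin (word score minus non-word score) and takes the worst margin across attempts; the data layout, the function name `classify_attempts`, and the threshold value are all hypothetical.

```python
def classify_attempts(attempts, threshold=1.0):
    """Pick a word from repeated speech attempts, or return None to reject.

    attempts: one ranked list per attempt, best candidate first, where each
              candidate is a (word, word_score, nonword_score) tuple.
    """
    if not attempts:
        return None
    # Step 1: if every attempt has the same top-ranked word, accept it.
    top_words = {ranked[0][0] for ranked in attempts}
    if len(top_words) == 1:
        return top_words.pop()
    # Step 2: otherwise consider only words that appear in every attempt.
    common = set.intersection(*({w for w, _, _ in ranked} for ranked in attempts))
    accepted, best_margin = None, threshold
    for word in sorted(common):
        # Assumed confidence measure: the smallest word/non-word margin
        # observed for this word over all attempts.
        margin = min(
            word_score - nonword_score
            for ranked in attempts
            for w, word_score, nonword_score in ranked
            if w == word
        )
        if margin > best_margin:
            accepted, best_margin = word, margin
    return accepted


first = [("call", 5.0, 1.0), ("cancel", 4.0, 2.0)]
second = [("cancel", 4.5, 1.0), ("call", 4.0, 1.5)]
print(classify_attempts([first, second]))  # -> call
```

Using the worst-case margin is a conservative choice made for this sketch: a word is accepted only if it looked reasonably confident in every attempt.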
Handwriting and speech recognizer using neural network with separate start and continuation output scores

Patent number: 6393395

Abstract: A method and system for recognizing user input information including cursive handwriting and spoken words. A time-delayed neural network having an improved architecture is trained at the word level with an improved method, which, along with preprocessing improvements, results in a recognizer with greater recognition accuracy. Preprocessing is performed on the input data and, for example, may include resampling the data with sample points based on the second derivative to focus the recognizer on areas of the input data where the slope change per time is greatest. The input data is segmented, featurized and fed to the time-delayed neural network which outputs a matrix of character scores per segment. The neural network architecture outputs a separate score for the start and the continuation of a character.

Type: Grant

Filed: January 7, 1999

Date of Patent: May 21, 2002

Assignee: Microsoft Corporation

Inventors: Angshuman Guha, Patrick M. Haluptzok, James A. Pittman
Method for producing remotely a picture display device storing one or more associated audio messages

Patent number: 6393402

Abstract: A picture frame and accompanying audio message circuit is provided such that one or more desired audio messages stored in the audio message circuit associated with one or more display pictures can be played upon the touching of the pictures or the frame, or in response to a voice recognition device sensing an audio command associated with the particular audio message and/or pictures. When audio message playback is desired, a switch on the frame or under a protective cover for the picture is activated by touching, or a position sensitive device may be used to sense whether a particular position on the picture has been touched. Digital or analog information representing the desired audio message is retrieved from a memory device, which is subsequently transmitted to a speaker which produces the desired audio message perceptible to a human.

Type: Grant

Filed: December 6, 2001

Date of Patent: May 21, 2002

Assignee: LJ Talk LLC

Inventors: Alan R. Loudermilk, Wayne D. Jung
System and method for generating a phonetic baseform for a word and using the generated baseform for speech recognition

Patent number: 6389395

Abstract: Out-of-vocabulary word models for a speech recognizer vocabulary are generated by forming phonemic transcriptions (phonetic baseforms) of user's utterances in terms of existing reference phonemes by using a speech recognition algorithm to match input sub-word feature sample sequences to suitably-constrained allowable sequences of existing reference phoneme features. The resultant new-vocabulary-word phonetic baseform models are stored for subsequent speech recognition using the same recognition algorithm.

Type: Grant

Filed: April 4, 1997

Date of Patent: May 14, 2002

Assignee: British Telecommunications public limited company

Inventor: Simon P. Ringland
Speech recognition using unequally-weighted subvector error measures for determining a codebook vector index to represent plural speech parameters

Patent number: 6389389

Abstract: Quantization unit (108) comprises evaluator (120) and comparator (122) in signal processing for identifying an utterance in system (100). The evaluator (120) weights a first intermediate result of an operation on a first set of a plurality of speech parameters (104) differently than a second intermediate result of an operation on a second set of the plurality of speech parameters (104) in a weighted representation of the plurality of speech parameters (104). The comparator (122) employs the weighted representation of the plurality of speech parameters (104) to determine a vector index to represent the plurality of speech parameters (104). The quantization unit (108), in one example, can employ split vector quantization in conjunction with the weighted representation to determine a vector index to represent the plurality of speech parameters (104).

Type: Grant

Filed: October 13, 1999

Date of Patent: May 14, 2002

Assignee: Motorola, Inc.

Inventors: Jeffrey A. Meunier, William M. Kushner, David John Pearce
Speech encoding/decoding method using reduced subframe pulse positions having density related to pitch

Patent number: 6385576

Abstract: A speech encoding method in which information representing characteristics of a synthesis filter is generated based on an input speech signal in units of one frame. A pitch vector is generated from an adaptive codebook containing past excitation signals, and a first number of reduced pulse position candidates are generated by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame, where a density of the reduced pulse position candidates is high where the pitch vector has a large power and decreases in accordance with a decrease in the power.

Type: Grant

Filed: December 23, 1998

Date of Patent: May 7, 2002

Assignee: Kabushiki Kaisha Toshiba

Inventors: Tadashi Amada, Kimio Miseki
Constant bitrate real-time lossless audio encoding and decoding by moving excess data amounts

Patent number: 6385587

Abstract: A lossless encoding apparatus encodes audio data and a lossless decoding apparatus restores the losslessly compression encoded audio data on a real-time basis, and a method therefor. The lossless encoding apparatus includes a lossless compression unit which losslessly compression encodes the audio data stored in an input buffer in units of predetermined data and outputs the encoded data in sequence, and an output buffer which stores the encoded audio data output from the lossless compression unit.

Type: Grant

Filed: May 6, 1999

Date of Patent: May 7, 2002

Assignee: Samsung Electronics Co., Ltd.

Inventor: Jae-Hoon Heo
Adaptive two-threshold method for discriminating noise from speech in a communication signal

Patent number: 6381570

Abstract: A method of discriminating noise and voice energy in a communication signal. A signal is measured in a plurality of block periods, which are sampled to obtain a measurement of the block energy value for the signal. The blocks are compared to a noise threshold and to a voice threshold to discriminate between noise and voice. The thresholds for noise and voice are periodically updated based on the minimum and maximum energy levels measured for block energies. In a preferred embodiment, the voice energy threshold and noise energy threshold values are updated according to a formula where the revised thresholds are based upon a factor of the minimum and maximum energy levels of the current block and the most recent past block and the average energy of the previous blocks. Updating of threshold levels allows for more accurate estimation of noise and voice during changes in either noise, voice or both to avoid misclassification of noise and/or voice.

Type: Grant

Filed: February 12, 1999

Date of Patent: April 30, 2002

Assignee: Telogy Networks, Inc.

Inventors: Dunling Li, Zoran Mladenovic, Bogdan Kosanovic
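
The two-threshold scheme in the abstract of patent 6381570 above can be illustrated with a small sketch. The abstract does not give the update formula, so the version below simply places both thresholds at fixed fractions of the range between the running minimum and maximum block energy; `noise_frac`, `voice_frac`, and the function name are hypothetical.

```python
def classify_blocks(energies, noise_frac=0.25, voice_frac=0.5):
    """Label each block energy as 'noise', 'voice', or 'uncertain'.

    Both thresholds are re-derived for every block from the running minimum
    and maximum energies seen so far (an assumed, simplified update rule).
    """
    labels = []
    if not energies:
        return labels
    e_min = e_max = energies[0]
    for energy in energies:
        e_min, e_max = min(e_min, energy), max(e_max, energy)
        span = e_max - e_min
        noise_threshold = e_min + noise_frac * span
        voice_threshold = e_min + voice_frac * span
        if energy <= noise_threshold:
            labels.append("noise")
        elif energy >= voice_threshold:
            labels.append("voice")
        else:
            labels.append("uncertain")
    return labels


print(classify_blocks([1.0, 1.0, 9.0, 4.0, 2.0]))
# -> ['noise', 'noise', 'voice', 'uncertain', 'noise']
```

The gap between the two thresholds gives a band where neither decision is forced, which is the practical benefit of using two thresholds rather than one.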
Method for producing remotely a display device storing one or more audio messages

Patent number: 6377926

Abstract: A picture frame and accompanying audio message circuit is provided such that one or more desired audio messages stored in the audio message circuit associated with one or more display pictures can be played upon the touching of the pictures or the frame, or in response to a voice recognition device sensing an audio command associated with the particular audio message and/or pictures. When audio message playback is desired, a switch on the frame or under a protective cover for the picture is activated by touching, or a position sensitive device may be used to sense whether a particular position on the picture has been touched. Digital or analog information representing the desired audio message is retrieved from a memory device, which is subsequently transmitted to a speaker which produces the desired audio message perceptible to a human.

Type: Grant

Filed: February 13, 2001

Date of Patent: April 23, 2002

Assignee: LJ Laboratories, L.L.C.

Inventor: Alan R. Loudermilk
Electronic translator for assisting communications

Patent number: 6377925

Abstract: An electronic translator translates input speech into multiple streams of data that are simultaneously delivered to the user, such as a hearing impaired individual. Preferably, the data is delivered in audible, visual and text formats. These multiple data streams are delivered to the hearing-impaired individual in a synchronized fashion, thereby creating a cognitive response. Preferably, the system of the present invention converts the input speech to a text format, and then translates the text to any of three other forms, including sign language, animation and computer generated speech. The sign language and animation translations are preferably implemented by using the medium of digital movies in which videos of a person signing words, phrases and finger spelled words, and of animations corresponding to the words, are selectively accessed from databases and displayed.

Type: Grant

Filed: July 7, 2000

Date of Patent: April 23, 2002

Assignee: Interactive Solutions, Inc.

Inventors: Morgan Greene, Jr., Virginia Greene, Harry E. Newman, Mark J. Yuhas, Michael F. Dorety
Solid-state audio recording unit

Patent number: 6377929

Abstract: A solid-state audio recording unit, capable of checking whether normal audio recording is performed or not on a real-time basis, includes one input buffer for receiving incoming audio data operating at a standard speed, another input buffer for writing audio data into a memory 9 operating at a high speed, one output buffer for receiving audio data from the memory 9 operating at a high speed, and another output buffer for delivering audio data as output operating at a standard speed. As such, it is possible to write/read data into/from the memory at a high speed; and thus operation is ensured enabling, on appearance, parallel processing of input and output, even though, in reality, a single memory is shared by the input and output ports; and thus it becomes possible to deliver audio data stored in a memory as an output on a real-time basis.

Type: Grant

Filed: August 25, 1999

Date of Patent: April 23, 2002

Assignee: U.S. Philips Corporation

Inventor: Yoshinori Takisawa
Method of determining the voicing probability of speech signals

Patent number: 6377920

Abstract: A voicing probability determination method is provided for estimating a percentage of unvoiced and voiced energy for each harmonic within each of a plurality of bands of a speech signal spectrum. Initially, a synthetic speech spectrum is generated based on the assumption that speech is purely voiced. The original and synthetic speech spectra are then divided into a plurality of bands. The synthetic and original speech spectra are compared harmonic by harmonic, and a voicing determination is made based on this comparison. In one embodiment, each harmonic of the original speech spectrum is assigned a voicing decision as either completely voiced or unvoiced by comparing the difference with an adaptive threshold. If the difference for each harmonic is less than the adaptive threshold, the corresponding harmonic is declared as voiced; otherwise the harmonic is declared as unvoiced. The voicing probability for each band is then computed based on the amount of energy in the voiced harmonics in that decision band.

Type: Grant

Filed: February 28, 2001

Date of Patent: April 23, 2002

Assignee: Comsat Corporation

Inventor: Suat Yeldener
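
The per-band computation in the abstract of patent 6377920 above can be sketched as follows. This is a simplified illustration: it uses a fixed threshold where the abstract describes an adaptive one, and it assumes a normalized squared difference as the per-harmonic comparison. The function name and the input layout (one magnitude per harmonic, bands given as index ranges) are hypothetical.

```python
def band_voicing_probability(original, synthetic, band_edges, threshold=0.2):
    """Estimate a voicing probability for each band.

    original, synthetic : per-harmonic magnitudes of the two spectra
    band_edges          : (start, stop) harmonic index ranges, stop exclusive
    threshold           : fixed stand-in for the adaptive threshold
    """
    probabilities = []
    for start, stop in band_edges:
        voiced_energy = 0.0
        total_energy = 0.0
        for h in range(start, stop):
            energy = original[h] ** 2
            total_energy += energy
            if energy > 0:
                # Assumed error measure: squared difference relative to energy.
                difference = (original[h] - synthetic[h]) ** 2 / energy
                if difference < threshold:
                    voiced_energy += energy
        probabilities.append(voiced_energy / total_energy if total_energy > 0 else 0.0)
    return probabilities


original = [2.0, 1.0, 1.0, 1.0]
synthetic = [2.0, 1.0, 0.0, 1.0]
print(band_voicing_probability(original, synthetic, [(0, 2), (2, 4)]))  # -> [1.0, 0.5]
```

Each harmonic gets a hard voiced or unvoiced decision, while the band-level value is a ratio of energies, which is how a binary per-harmonic decision turns into a probability per band.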
Adaptive speech rate conversion without extension of input data duration, using speech interval detection

Patent number: 6374213

Abstract: Frame power of an input signal is calculated to discriminate speech frame intervals from non-speech intervals, by thresholding current frame power using an adaptive speech-detection threshold based on the past maximum frame power value and the difference between past maximum and the minimum frame power values, adaptively updated using a predetermined number of frames prior to the current one.

Type: Grant

Filed: February 12, 2001

Date of Patent: April 16, 2002

Assignee: Nippon Hoso Kyokai

Inventors: Atsushi Imai, Nobumasa Seiyama, Tohru Takagi
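
The adaptive threshold described in the abstract of patent 6374213 above can be illustrated with a rough sketch. The abstract says the threshold depends on the past maximum frame power and on the difference between the past maximum and minimum over a fixed number of earlier frames, but gives no formula; the rule `p_max - alpha * (p_max - p_min)`, the parameters `history` and `alpha`, and the function name are assumptions made for illustration.

```python
from collections import deque


def detect_speech_frames(frame_powers, history=4, alpha=0.5):
    """Flag frames whose power exceeds an adaptive threshold.

    The threshold is recomputed for every frame from the `history` frames
    that precede it. The first frame has no history and is flagged False.
    """
    past = deque(maxlen=history)
    flags = []
    for power in frame_powers:
        if past:
            p_max, p_min = max(past), min(past)
            threshold = p_max - alpha * (p_max - p_min)
            flags.append(power > threshold)
        else:
            flags.append(False)
        past.append(power)
    return flags


print(detect_speech_frames([1.0, 1.0, 1.0, 10.0, 9.0, 1.0]))
# -> [False, False, False, True, True, False]
```

This simple rule would misbehave on long stretches of continuous speech, since the window would then contain only speech frames; a practical detector needs a longer history or additional safeguards.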
