Patents Examined by Tālivaldis Ivars Šmits
  • Patent number: 6427133
    Abstract: For the automatic evaluation of the transmission quality of a voice signal that is transmitted by a digital transmission system (1), frequency components that correspond to a data frame rate (fr) of the digital transmission system (1) are extracted from the received voice signal and analyzed. In order to obtain a measurement value that is independent of signal strength, a standardization with the spectral output values occurring in the middle of the quoted frequencies can be conducted. For comparison purposes, an undistorted voice signal is processed the same way in advance. The measurement values of the transmitted voice signal are expressed as a ratio to the reference values and then evaluated, e.g., by a neural network (13).
    Type: Grant
    Filed: April 19, 1999
    Date of Patent: July 30, 2002
    Assignee: Ascom Infrasys AG
    Inventors: Martin Paping, Thomas Fahnle
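The measurement in patent 6427133 can be illustrated with a minimal sketch: compute the spectral power at multiples of the frame rate, normalize it by the power midway between those frequencies, and relate the result to the same measure taken on an undistorted reference. The function names, the number of harmonics, and the reading of "in the middle of the quoted frequencies" as the midpoints between adjacent frame-rate multiples are assumptions made for illustration, not details from the patent.

```python
import cmath
import math


def bin_power(signal, fs, freq):
    """Power of a single DFT-style bin at `freq` Hz (direct evaluation)."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(signal))
    return abs(acc / n) ** 2


def frame_rate_measure(signal, fs, frame_rate, harmonics=3):
    """Power at frame-rate multiples divided by power at the midpoints.

    Dividing by the in-between power makes the value independent of the
    overall signal level (assumed interpretation of the abstract).
    """
    on = sum(bin_power(signal, fs, h * frame_rate)
             for h in range(1, harmonics + 1))
    between = sum(bin_power(signal, fs, (h + 0.5) * frame_rate)
                  for h in range(1, harmonics + 1))
    return on / (between + 1e-12)  # small constant avoids division by zero


def relative_measure(received, reference, fs, frame_rate):
    """Ratio of the received-signal measure to the reference measure."""
    return (frame_rate_measure(received, fs, frame_rate)
            / frame_rate_measure(reference, fs, frame_rate))
```

A value well above 1 from `relative_measure` would suggest that frame-rate artifacts are stronger in the received signal than in the reference. The patent then passes such values to a classifier, which this sketch does not attempt.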
  • Patent number: 6424943
    Abstract: A computer enrolls a user in a speech recognition system by obtaining data representing a user's speech, the speech including multiple user utterances and generally corresponding to an enrollment text, and analyzing acoustic content of data corresponding to a user utterance. The computer determines, based on the analysis, whether the user utterance matches a portion of the enrollment text. If so, the computer uses the acoustic content of the user utterance to update acoustic models corresponding to the portion of the enrollment text. The computer may determine that the user utterance matches a portion of the enrollment text even when the user has skipped or repeated words of the enrollment text.
    Type: Grant
    Filed: July 24, 2000
    Date of Patent: July 23, 2002
    Assignee: Scansoft, Inc.
    Inventors: Stefan Sherwood, David Wilsberg Parmenter, Joel M. Gould, Toffee A. Albina, Allan Gold
  • Patent number: 6418412
    Abstract: A speech recognition system utilizes multiple quantizers to process frequency parameters and mean compensated frequency parameters derived from an input signal. The quantizers may be matrix and vector quantizer pairs, and such quantizer pairs may also function as front ends to second-stage speech classifiers such as hidden Markov models (HMMs), and/or the system may utilize neural network postprocessing to, for example, improve speech recognition performance. Mean compensating the frequency parameters can remove noise frequency components that remain approximately constant during the duration of the input signal. HMM initial state and state transition probabilities derived from common quantizer types and the same input signal may be consolidated to improve recognition system performance and efficiency. Matrix quantization exploits the “evolution” of the speech short-term spectral envelopes as well as frequency domain information, and vector quantization (VQ) primarily operates on frequency domain information.
    Type: Grant
    Filed: August 28, 2000
    Date of Patent: July 9, 2002
    Assignee: Legerity, Inc.
    Inventors: Safdar M. Asghar, Lin Cong
  • Patent number: 6418404
    Abstract: A system and method for effectively implementing fixed masking thresholds in an audio encoder device comprises a filter bank for filtering source audio data to produce frequency sub-bands, a lookup table for storing masking thresholds corresponding to the frequency sub-bands, and a bit allocator for using the masking thresholds to identify and discard masked audio data to thereby reduce the total amount of audio data that requires processing.
    Type: Grant
    Filed: December 28, 1998
    Date of Patent: July 9, 2002
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventor: Lin Yin
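The lookup-table idea in patent 6418404 can be sketched in a few lines: each sub-band has a fixed, precomputed masking threshold, and a sub-band whose level does not exceed its threshold is dropped before bit allocation. The function name, the use of decibel levels, and the table values in the usage note are hypothetical.

```python
def select_audible_subbands(subband_levels_db, masking_table_db):
    """Return indices of sub-bands whose level exceeds the fixed threshold.

    `masking_table_db` plays the role of the lookup table: one fixed
    threshold per sub-band, with no per-frame psychoacoustic analysis.
    """
    if len(subband_levels_db) != len(masking_table_db):
        raise ValueError("one threshold is required per sub-band")
    return [index
            for index, level in enumerate(subband_levels_db)
            if level > masking_table_db[index]]
```

For example, with levels `[60, 10, 35, 5]` and a table `[20, 15, 30, 25]`, only sub-bands 0 and 2 would be passed on for further processing.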
  • Patent number: 6415251
    Abstract: If not all original subbands are selected for processing in conventional subband coders or decoders, aliasing distortion is generated by the characteristics of their subband band-splitting filters or subband band synthesis filters. To improve sound quality in a subband decoder, the decoded frequency components in the overlap region adjacent to a subband selected not to be decoded are band-limited prior to synthesis. Alternatively, in a subband coder, the sound quality in a processed subband adjacent to one not to be coded is improved by band-limiting the filtering frequency overlap region between these subbands prior to coding. By thus decoding only the non-overlapping part of the subband adjacent to an omitted subband, signal distortion is reduced.
    Type: Grant
    Filed: June 3, 1999
    Date of Patent: July 2, 2002
    Assignee: Sony Corporation
    Inventors: Yoshiaki Oikawa, Mitsuyuki Hatanaka, Kenzo Akagiri
  • Patent number: 6408269
    Abstract: A method and apparatus for enhancing a speech signal contaminated by additive noise through Kalman filtering. The speech is decomposed into subband speech signals by a multichannel analysis filter bank including bandpass filters and decimation filters. Each subband speech signal is converted into a sequence of voice frames. A plurality of low-order Kalman filters are respectively applied to filter each of the subband speech signals. The autoregression (AR) parameters which are required for each Kalman filter are estimated frame-by-frame by using a correlation subtraction method to estimate the autocorrelation function and solving the corresponding Yule-Walker equations for each of the subband speech signals, respectively. The filtered subband speech signals are then combined or synthesized by a multichannel synthesis filter bank including interpolation filters and bandpass filters, and the outputs of the multichannel synthesis filter bank are summed in an adder to produce the enhanced fullband speech signal.
    Type: Grant
    Filed: March 3, 1999
    Date of Patent: June 18, 2002
    Assignee: Industrial Technology Research Institute
    Inventors: Wen-Rong Wu, Po-Cheng Chen, Hwai-Tsu Chang, Chun-Hung Kuo
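The parameter-estimation step in patent 6408269 (correlation subtraction followed by solving the Yule-Walker equations) can be sketched as follows. This is a minimal illustration, not the patented procedure: the function names are hypothetical, the noise autocorrelation is assumed to be known in advance, and the Levinson-Durbin recursion is used here as one standard way to solve the Yule-Walker system.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation estimate r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag)) / n
            for lag in range(max_lag + 1)]


def subtract_noise_correlation(r_noisy, r_noise):
    """Correlation subtraction: estimate the clean-speech autocorrelation."""
    return [a - b for a, b in zip(r_noisy, r_noise)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for x[n] = sum_k a[k] x[n-1-k] + e[n].

    Returns the AR coefficients `a` and the prediction error power.
    """
    a = [0.0] * order
    err = r[0]
    for m in range(order):
        acc = r[m + 1] - sum(a[k] * r[m - k] for k in range(m))
        reflection = acc / err
        updated = a[:]
        updated[m] = reflection
        for k in range(m):
            updated[k] = a[k] - reflection * a[m - 1 - k]
        a = updated
        err *= 1.0 - reflection * reflection
    return a, err
```

In the scheme the abstract describes, coefficients like these would be recomputed frame by frame for each subband and handed to that subband's low-order Kalman filter.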
  • Patent number: 6397188
    Abstract: A natural language conversation system keeps interaction between the user and the system smooth by continuing the conversation when the user cannot respond to a presentation by the system, either by presenting guidance or by having the system speak on the user's behalf, and can thereby reduce the speaking load on the user. The natural language conversation system includes a conversation control portion for outputting data in a conversation scenario, and an alternative conversation portion for inputting preliminarily prepared natural language data to the conversation control portion on behalf of the user when the user fails to input within a predetermined period.
    Type: Grant
    Filed: July 20, 1999
    Date of Patent: May 28, 2002
    Assignee: NEC Corporation
    Inventor: Tohru Iwasawa
  • Patent number: 6397180
    Abstract: A method and system for improving word classification performance of a speech recognition system having a predefined vocabulary of acceptable words and allowing multiple speech attempts from a user. At least one best word and corresponding word and non-word scores are determined for each speech attempt by the user. At least one common best word is then determined among all the speech attempts. If the highest ranking best word is common across all attempts, that word is used to accept the speech input. Otherwise, an objective measure representing a confidence level of the corresponding word and non-word scores is determined for each of the common best words for each speech attempt by the user. Each of the objective measures is then compared to a predetermined threshold. The speech attempts by the user are then classified based on the comparison of the objective measure to the predetermined threshold.
    Type: Grant
    Filed: May 22, 1996
    Date of Patent: May 28, 2002
    Assignee: Qwest Communications International Inc.
    Inventors: Paul D. Jaramillo, Frank Haiqing Wu
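The decision logic in patent 6397180 can be sketched roughly as below. The data layout (a best-first list of `(word, word_score, non_word_score)` tuples per attempt), the confidence measure (word score minus non-word score), and the tie-breaking rule are assumptions chosen for illustration; the abstract does not specify them.

```python
def classify_attempts(attempts, threshold):
    """Return the accepted word, or None if the input is rejected.

    attempts: list of attempts, each a best-first list of
              (word, word_score, non_word_score) tuples.
    """
    # Case 1: the top-ranked word is the same in every attempt.
    top_words = {attempt[0][0] for attempt in attempts}
    if len(top_words) == 1:
        return top_words.pop()

    # Case 2: look at words that appear among the best words of all attempts.
    common = set.intersection(*({w for w, _, _ in attempt}
                                for attempt in attempts))
    best_word, best_conf = None, None
    for word in common:
        # Assumed confidence measure: word score minus non-word score.
        confs = [ws - ns
                 for attempt in attempts
                 for w, ws, ns in attempt if w == word]
        if all(c >= threshold for c in confs):
            worst = min(confs)
            if best_conf is None or worst > best_conf:
                best_word, best_conf = word, worst
    return best_word
```

Requiring every attempt to clear the threshold is a conservative choice made for this sketch; a real system could weight attempts differently.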
  • Patent number: 6393395
    Abstract: A method and system for recognizing user input information including cursive handwriting and spoken words. A time-delayed neural network having an improved architecture is trained at the word level with an improved method, which, along with preprocessing improvements, results in a recognizer with greater recognition accuracy. Preprocessing is performed on the input data and, for example, may include resampling the data with sample points based on the second derivative to focus the recognizer on areas of the input data where the slope change per time is greatest. The input data is segmented, featurized and fed to the time-delayed neural network which outputs a matrix of character scores per segment. The neural network architecture outputs a separate score for the start and the continuation of a character.
    Type: Grant
    Filed: January 7, 1999
    Date of Patent: May 21, 2002
    Assignee: Microsoft Corporation
    Inventors: Angshuman Guha, Patrick M. Haluptzok, James A. Pittman
  • Patent number: 6393402
    Abstract: A picture frame and accompanying audio message circuit is provided such that one or more desired audio messages stored in the audio message circuit associated with one or more display pictures can be played upon the touching of the pictures or the frame, or in response to a voice recognition device sensing an audio command associated with the particular audio message and/or pictures. When audio message playback is desired, a switch on the frame or under a protective cover for the picture is activated by touching, or a position sensitive device may be used to sense whether a particular position on the picture has been touched. Digital or analog information representing the desired audio message is retrieved from a memory device, which is subsequently transmitted to a speaker which produces the desired audio message perceptible to a human.
    Type: Grant
    Filed: December 6, 2001
    Date of Patent: May 21, 2002
    Assignee: LJ Talk LLC
    Inventors: Alan R. Loudermilk, Wayne D. Jung
  • Patent number: 6389395
    Abstract: Out-of-vocabulary word models for a speech recognizer vocabulary are generated by forming phonemic transcriptions (phonetic baseforms) of user's utterances in terms of existing reference phonemes by using a speech recognition algorithm to match input sub-word feature sample sequences to suitably-constrained allowable sequences of existing reference phoneme features. The resultant new-vocabulary-word phonetic baseform models are stored for subsequent speech recognition using the same recognition algorithm.
    Type: Grant
    Filed: April 4, 1997
    Date of Patent: May 14, 2002
    Assignee: British Telecommunications public limited company
    Inventor: Simon P. Ringland
  • Patent number: 6389389
    Abstract: Quantization unit (108) comprises evaluator (120) and comparator (122) in signal processing for identifying an utterance in system (100). The evaluator (120) weights a first intermediate result of an operation on a first set of a plurality of speech parameters (104) differently than a second intermediate result of an operation on a second set of the plurality of speech parameters (104) in a weighted representation of the plurality of speech parameters (104). The comparator (122) employs the weighted representation of the plurality of speech parameters (104) to determine a vector index to represent the plurality of speech parameters (104). The quantization unit (108), in one example, can employ split vector quantization in conjunction with the weighted representation to determine a vector index to represent the plurality of speech parameters (104).
    Type: Grant
    Filed: October 13, 1999
    Date of Patent: May 14, 2002
    Assignee: Motorola, Inc.
    Inventors: Jeffrey A. Meunier, William M. Kushner, David John Pearce
  • Patent number: 6385576
    Abstract: A speech encoding method in which information representing characteristics of a synthesis filter is generated based on an input speech signal in units of one frame. A pitch vector is generated from an adaptive codebook containing past excitation signals, and a first number of reduced pulse position candidates are generated by selecting a first number of pulse positions from a number of possible pulse positions in each of sub-frames obtained by dividing the frame, where a density of the reduced pulse position candidates is high where the pitch vector has a large power and decreases in accordance with a decrease in the power.
    Type: Grant
    Filed: December 23, 1998
    Date of Patent: May 7, 2002
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Tadashi Amada, Kimio Miseki
  • Patent number: 6385587
    Abstract: A lossless encoding apparatus encodes audio data and a lossless decoding apparatus restores the losslessly compression encoded audio data on a real-time basis, and a method therefor. The lossless encoding apparatus includes a lossless compression unit which losslessly compression encodes the audio data stored in an input buffer in units of predetermined data and outputs the encoded data in sequence, and an output buffer which stores the encoded audio data output from the lossless compression unit.
    Type: Grant
    Filed: May 6, 1999
    Date of Patent: May 7, 2002
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Jae-Hoon Heo
  • Patent number: 6381570
    Abstract: A method of discriminating noise and voice energy in a communication signal. A signal is measured in a plurality of block periods, which are sampled to obtain a measurement of the block energy value for the signal. The blocks are compared to a noise threshold and to a voice threshold to discriminate between noise and voice. The thresholds for noise and voice are periodically updated based on the minimum and maximum energy levels measured for block energies. In a preferred embodiment, the voice energy threshold and noise energy threshold values are updated according to a formula where the revised thresholds are based upon a factor of the minimum and maximum energy levels of the current block and the most recent past block and the average energy of the previous blocks. Updating of threshold levels allows for more accurate estimation of noise and voice during changes in either noise, voice or both to avoid misclassification of noise and/or voice.
    Type: Grant
    Filed: February 12, 1999
    Date of Patent: April 30, 2002
    Assignee: Telogy Networks, Inc.
    Inventors: Dunling Li, Zoran Mladenovic, Bogdan Kosanovic
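The two-threshold scheme in patent 6381570 can be illustrated with a small sketch in which both thresholds are re-derived from the minimum and maximum energies of recent blocks. The update formula here (fixed fractions of the min-to-max span over a sliding window) is an assumption for illustration and is not the formula of the preferred embodiment; the function name and default parameters are likewise hypothetical.

```python
def classify_blocks(energies, window=4, noise_frac=0.2, voice_frac=0.6):
    """Label each block energy as 'noise', 'voice', or 'unknown'.

    Thresholds are recomputed for every block from the minimum and
    maximum energies seen in the last `window` blocks.
    """
    labels = []
    recent = []
    for energy in energies:
        recent = (recent + [energy])[-window:]
        e_min, e_max = min(recent), max(recent)
        span = e_max - e_min
        noise_threshold = e_min + noise_frac * span
        voice_threshold = e_min + voice_frac * span
        if span == 0:
            labels.append("unknown")  # no contrast yet to separate classes
        elif energy <= noise_threshold:
            labels.append("noise")
        elif energy >= voice_threshold:
            labels.append("voice")
        else:
            labels.append("unknown")
    return labels
```

Because the thresholds follow the recent energy range, a rise in background noise moves both thresholds upward instead of causing steady noise to be labelled as voice.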
  • Patent number: 6377926
    Abstract: A picture frame and accompanying audio message circuit is provided such that one or more desired audio messages stored in the audio message circuit associated with one or more display pictures can be played upon the touching of the pictures or the frame, or in response to a voice recognition device sensing an audio command associated with the particular audio message and/or pictures. When audio message playback is desired, a switch on the frame or under a protective cover for the picture is activated by touching, or a position sensitive device may be used to sense whether a particular position on the picture has been touched. Digital or analog information representing the desired audio message is retrieved from a memory device, which is subsequently transmitted to a speaker which produces the desired audio message perceptible to a human.
    Type: Grant
    Filed: February 13, 2001
    Date of Patent: April 23, 2002
    Assignee: LJ Laboratories, L.L.C.
    Inventor: Alan R. Loudermilk
  • Patent number: 6377925
    Abstract: An electronic translator translates input speech into multiple streams of data that are simultaneously delivered to the user, such as a hearing impaired individual. Preferably, the data is delivered in audible, visual and text formats. These multiple data streams are delivered to the hearing-impaired individual in a synchronized fashion, thereby creating a cognitive response. Preferably, the system of the present invention converts the input speech to a text format, and then translates the text to any of three other forms, including sign language, animation and computer generated speech. The sign language and animation translations are preferably implemented by using the medium of digital movies in which videos of a person signing words, phrases and finger-spelled words, and of animations corresponding to the words, are selectively accessed from databases and displayed.
    Type: Grant
    Filed: July 7, 2000
    Date of Patent: April 23, 2002
    Assignee: Interactive Solutions, Inc.
    Inventors: Morgan Greene, Jr., Virginia Greene, Harry E. Newman, Mark J. Yuhas, Michael F. Dorety
  • Patent number: 6377929
    Abstract: A solid-state audio recording unit, capable of checking whether normal audio recording is performed or not on a real-time basis, includes one input buffer for receiving incoming audio data operating at a standard speed, another input buffer for writing audio data into a memory 9 operating at a high speed, one output buffer for receiving audio data from the memory 9 operating at a high speed, and another output buffer for delivering audio data as output operating at a standard speed. As such, it is possible to write/read data into/from the memory at a high speed; and thus operation is ensured enabling, on appearance, parallel processing of input and output, even though, in reality, a single memory is shared by the input and output ports; and thus it becomes possible to deliver audio data stored in a memory as an output on a real-time basis.
    Type: Grant
    Filed: August 25, 1999
    Date of Patent: April 23, 2002
    Assignee: U.S. Philips Corporation
    Inventor: Yoshinori Takisawa
  • Patent number: 6377920
    Abstract: A voicing probability determination method is provided for estimating a percentage of unvoiced and voiced energy for each harmonic within each of a plurality of bands of a speech signal spectrum. Initially, a synthetic speech spectrum is generated based on the assumption that speech is purely voiced. The original and synthetic speech spectra are then divided into a plurality of bands. The synthetic and original speech spectra are compared harmonic by harmonic, and a voicing determination is made based on this comparison. In one embodiment, each harmonic of the original speech spectrum is assigned a voicing decision as either completely voiced or unvoiced by comparing the difference with an adaptive threshold. If the difference for each harmonic is less than the adaptive threshold, the corresponding harmonic is declared as voiced; otherwise the harmonic is declared as unvoiced. The voicing probability for each band is then computed based on the amount of energy in the voiced harmonics in that decision band.
    Type: Grant
    Filed: February 28, 2001
    Date of Patent: April 23, 2002
    Assignee: Comsat Corporation
    Inventor: Suat Yeldener
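The per-band computation in patent 6377920 can be sketched as follows, assuming the harmonic amplitudes of the original and synthetic spectra are already available. The normalized squared difference used as the comparison measure, the externally supplied per-harmonic thresholds, and the function name are assumptions; the abstract does not state how the difference or the adaptive threshold is computed.

```python
def band_voicing_probabilities(original, synthetic, thresholds, band_edges):
    """Voiced-energy fraction per band.

    original, synthetic: harmonic amplitudes of the two spectra.
    thresholds: one (adaptive) threshold per harmonic, supplied by the caller.
    band_edges: list of (start, stop) harmonic index ranges, stop exclusive.
    """
    probabilities = []
    for start, stop in band_edges:
        voiced_energy = 0.0
        total_energy = 0.0
        for h in range(start, stop):
            energy = original[h] ** 2
            # Assumed measure: squared error relative to harmonic energy.
            difference = (original[h] - synthetic[h]) ** 2 / (energy + 1e-12)
            total_energy += energy
            if difference < thresholds[h]:
                voiced_energy += energy  # harmonic declared voiced
        probabilities.append(voiced_energy / total_energy
                             if total_energy > 0 else 0.0)
    return probabilities
```

A harmonic that the purely voiced synthetic spectrum reproduces well counts as voiced, so a band dominated by such harmonics gets a probability close to 1.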
  • Patent number: 6374213
    Abstract: Frame power of an input signal is calculated to discriminate speech frame intervals from non-speech intervals, by thresholding current frame power using an adaptive speech-detection threshold based on the past maximum frame power value and the difference between the past maximum and minimum frame power values, adaptively updated using a predetermined number of frames prior to the current one.
    Type: Grant
    Filed: February 12, 2001
    Date of Patent: April 16, 2002
    Assignee: Nippon Hoso Kyokai
    Inventors: Atsushi Imai, Nobumasa Seiyama, Tohru Takagi
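The adaptive threshold in patent 6374213 can be sketched as below: the threshold for the current frame is derived from the maximum frame power and the max-to-min difference over a fixed number of preceding frames. The specific formula (`p_max - ratio * (p_max - p_min)`), the decibel scale, and the parameter defaults are assumptions for illustration only.

```python
import math
from collections import deque


def frame_power_db(frame):
    """Mean power of one frame in decibels."""
    energy = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(energy + 1e-12)


def detect_speech(frame_powers_db, history=50, ratio=0.5):
    """Flag each frame as speech (True) or non-speech (False).

    The threshold uses only the `history` frames before the current one.
    """
    past = deque(maxlen=history)
    flags = []
    for power in frame_powers_db:
        if len(past) < 2:
            flags.append(False)  # not enough history to set a threshold
        else:
            p_max, p_min = max(past), min(past)
            threshold = p_max - ratio * (p_max - p_min)
            flags.append(power > threshold)
        past.append(power)
    return flags
```

Because the threshold is tied to the recent dynamic range, it adapts when the recording level or the background noise changes.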