Patents Examined by Michael Sartori
  • Patent number: 5459814
    Abstract: A voice activity detector (VAD) which determines whether an input signal contains speech by deriving parameters measuring short term time domain characteristics of the input signal, including the average signal level and the absolute value of any change in average signal level, and comparing the derived parameter values with corresponding predetermined threshold values. In order to further minimize clipping and false alarms, the VAD periodically monitors and updates the threshold values to reflect changes in the level of background noise.
    Type: Grant
    Filed: March 26, 1993
    Date of Patent: October 17, 1995
    Assignee: Hughes Aircraft Company
    Inventors: Prabhat K. Gupta, Shrirang Jangi, Allan B. Lamkin, W. Robert Kepley, III, Adrian J. Morris
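The threshold logic described in this abstract can be sketched in a few lines. This is an illustrative reconstruction, not the patented implementation; the function name, thresholds, and adaptation constant are all assumptions:

```python
def vad(frames, level_thresh=0.1, delta_thresh=0.05, noise_adapt=0.05):
    """Frame-level voice activity decision from short-term time-domain features.

    frames: list of frames, each a list of samples in [-1, 1].
    Returns one True/False (speech/no-speech) decision per frame.
    """
    decisions = []
    prev_level = 0.0
    noise_floor = level_thresh
    for frame in frames:
        level = sum(abs(s) for s in frame) / len(frame)   # average signal level
        delta = abs(level - prev_level)                   # change in average level
        is_speech = level > noise_floor or delta > delta_thresh
        if not is_speech:
            # during non-speech, track the background noise and keep the
            # detection threshold a little above it
            noise_floor = (1 - noise_adapt) * noise_floor + noise_adapt * (level * 2)
        decisions.append(is_speech)
        prev_level = level
    return decisions
```

The key idea from the abstract is the last step: the threshold is not fixed but is periodically pulled toward the measured background level, reducing both clipped speech onsets and false alarms.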
  • Patent number: 5457769
    Abstract: The presence of human voice signals in audio signals is detected by a method and apparatus based on the recognition that fundamental frequency components of human voice signals are separated from one another by a characteristic frequency difference ranging from about 120 hertz to about 180 hertz. A limited frequency band portion of the audio signals is mixed and filtered to produce a signal containing the difference frequencies of the frequency components included in the limited frequency band portion of the audio signals, and the latter signal is processed to determine whether it contains a component of significant magnitude representing the human voice characteristic difference frequency.
    Type: Grant
    Filed: December 8, 1994
    Date of Patent: October 10, 1995
    Assignee: Earmark, Inc.
    Inventor: Robert A. Valley
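The mixing-and-filtering step can be approximated by squaring the signal (a simple nonlinear mixer produces components at the difference frequencies) and then probing the 120-180 Hz band. A minimal sketch under that assumption, with illustrative thresholds:

```python
import math

def goertzel_mag(x, freq, fs):
    """Magnitude of the DFT of x at a single frequency (Goertzel-style probe)."""
    n = len(x)
    re = sum(x[i] * math.cos(2 * math.pi * freq * i / fs) for i in range(n))
    im = sum(x[i] * math.sin(2 * math.pi * freq * i / fs) for i in range(n))
    return math.hypot(re, im) / n

def has_voice_difference_tone(x, fs, band=(120.0, 180.0), step=10.0, thresh=0.01):
    """Square the signal to generate difference frequencies, then test for a
    component of significant magnitude in the 120-180 Hz band."""
    mixed = [s * s for s in x]
    mean = sum(mixed) / len(mixed)
    mixed = [s - mean for s in mixed]          # remove the DC term from squaring
    f = band[0]
    while f <= band[1]:
        if goertzel_mag(mixed, f, fs) > thresh:
            return True
        f += step
    return False
```

Two tones 150 Hz apart produce a strong 150 Hz difference component after squaring; a single tone does not, so only the former is flagged.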
  • Patent number: 5448679
    Abstract: A method and system for creating a compressed data representation of a human speech utterance which may be utilized to accurately regenerate the human speech utterance. First, the location and occurrence of each period of silence, voiced sound and unvoiced sound within the speech utterance is detected. Next, a single representative data frame which may be repetitively utilized to approximate each voiced sound is iteratively determined, along with the duration of each voiced sound. The spectral content of each unvoiced sound, along with variations in the amplitude thereof, is also determined. A compressed data representation is then created which includes encoded representations of a duration of each period of silence, a duration and single representative data frame for each voiced sound, and a spectral content and amplitude variations for each unvoiced sound. The compressed data representation may then be utilized to regenerate the speech utterance without substantial loss in intelligibility.
    Type: Grant
    Filed: December 30, 1992
    Date of Patent: September 5, 1995
    Assignee: International Business Machines Corporation
    Inventor: Frank A. McKiel, Jr.
  • Patent number: 5444817
    Abstract: A speech recognizing apparatus designed to predict the duration of the recognition unit to be recognized next, using both the durations of the recognition units already recognized within the input speech and the learned duration of each recognition unit, and then to perform matching using the predicted duration, so that a hypothesis is established as a recognition candidate only when the durations of the recognition units within the input speech are realistic.
    Type: Grant
    Filed: October 2, 1992
    Date of Patent: August 22, 1995
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventor: Yumi Takizawa
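One plausible reading of the duration check: estimate the speaker's tempo from the units already recognized, scale the learned duration of the candidate unit by that tempo, and keep the candidate only if its observed duration is close to the prediction. A sketch under those assumptions (tolerance and data layout are illustrative):

```python
def duration_is_plausible(recognized, candidate_unit, candidate_ms,
                          learned_ms, tol=0.3):
    """recognized: list of (unit, observed_ms) already decoded from the input.
    learned_ms: dict mapping unit -> learned average duration in ms.
    Returns True if candidate_ms fits the speaker's tempo within tol."""
    obs = sum(ms for _, ms in recognized)
    exp = sum(learned_ms[u] for u, _ in recognized)
    rate = obs / exp if exp else 1.0            # speaker-specific tempo factor
    predicted = rate * learned_ms[candidate_unit]
    return abs(candidate_ms - predicted) <= tol * predicted
```

For example, a speaker running 20% slower than the learned averages makes a 90 ms unit predicted at 108 ms, so a 110 ms observation passes while a 300 ms one is rejected.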
  • Patent number: 5434949
    Abstract: A score evaluation display device for an electronic song accompaniment apparatus has an audio signal processing unit to evaluate a user's singing. A sampling processor samples the difference between an input song signal from a microphone and a reference song signal, a volume difference detector detects the volume difference between these two signals, a rhythm difference detector detects the difference in rhythm between these two signals, and an adder sums the outputs of the volume difference detector and the rhythm difference detector, thereby producing a final evaluated score. Accordingly, a more reliable and accurate evaluation of the user's singing can be performed, based on the difference between the microphone's input song signal and the reference song signal.
    Type: Grant
    Filed: August 13, 1993
    Date of Patent: July 18, 1995
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Tae-hwa Jeong
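The adder structure maps directly to a weighted sum of the two difference measures. A toy sketch of the scoring path, where levels, onset times in ms, and weights are all illustrative assumptions:

```python
def evaluate_score(mic_levels, ref_levels, mic_onsets, ref_onsets,
                   volume_weight=50.0, rhythm_weight=0.1):
    """Combine a per-frame volume difference and a per-note onset-time
    (rhythm) difference into a single 0-100 score, as the adder does."""
    n = min(len(mic_levels), len(ref_levels))
    volume_diff = sum(abs(m - r) for m, r in zip(mic_levels, ref_levels)) / n
    rhythm_diff = (sum(abs(m - r) for m, r in zip(mic_onsets, ref_onsets))
                   / max(1, len(ref_onsets)))
    penalty = volume_weight * volume_diff + rhythm_weight * rhythm_diff
    return max(0.0, 100.0 - penalty)
```

A performance matching the reference exactly scores 100; volume or timing deviations lower the score in proportion to the summed differences.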
  • Patent number: 5432884
    Abstract: Disclosed herein are methods and apparatus for improving the quality of synthesized speech that is transmitted through a channel that is susceptible to transmission errors. In a presently preferred embodiment of the invention a speech signal is assumed to be first encoded using a Linear Predictive Coding (LPC) technique prior to transmission. The parameters that describe the short-term spectral behavior of the speech signal are received and then applied to and processed by a non-linear median processing block only on an occurrence of a predetermined number of transmission errors in the received LPC speech signal. The median-processed short term speech parameters are subsequently employed, together with a received excitation signal, in a synthesis filter to synthesize a speech signal of improved quality over what would be obtained if the short term speech parameters were not median processed to compensate for the transmission errors.
    Type: Grant
    Filed: March 22, 1993
    Date of Patent: July 11, 1995
    Assignees: Nokia Mobile Phones Ltd., Nokia Telecommunications Oy
    Inventors: Pekka Kapanen, Yrjo Neuvo, Kari Jarvinen
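The conditional median-processing block can be sketched as a per-parameter median over neighbouring frames, applied only when enough transmission errors have been detected. This is an illustrative reconstruction; frame layout, window size, and error limit are assumptions:

```python
import statistics

def smooth_lpc_params(frames, error_counts, error_limit=2, window=3):
    """frames: per-frame short-term parameter vectors (e.g. LPC-derived values).
    error_counts: detected transmission errors per frame.
    Median-filter a frame's parameters across its neighbours only when the
    error count reaches error_limit; otherwise pass the frame through."""
    out = []
    for i, params in enumerate(frames):
        if error_counts[i] >= error_limit:
            lo = max(0, i - window // 2)
            hi = min(len(frames), i + window // 2 + 1)
            neighbourhood = frames[lo:hi]
            params = [statistics.median(f[k] for f in neighbourhood)
                      for k in range(len(params))]
        out.append(params)
    return out
```

Because the median ignores outliers, a frame whose parameters were corrupted in transit is replaced by values consistent with its error-free neighbours, while clean frames are left untouched.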
  • Patent number: 5430827
    Abstract: Passwords are spoken by users and stored as speech models in a database. The database also contains a plurality of reference voice (RV) speech models based on speech inputs by various persons; each RV speech model includes the characters, digits, or phrases comprising a user's assigned password. Preferably, a group of the RV speech models is selected based upon a predetermined level of difference between those models and a speech model of the user's spoken password. In requesting access to the system, a user speaks the assigned password. The password entered by the user to obtain access is compared with the user's own speech models and with the selected RV speech models to determine a measure of similarity. The validity of the password is determined based upon this measure of similarity.
    Type: Grant
    Filed: April 23, 1993
    Date of Patent: July 4, 1995
    Assignee: AT&T Corp.
    Inventor: Eugene L. Rissanen
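The comparison against both the user's own model and the cohort of reference-voice models amounts to a relative-distance test. A minimal sketch, assuming fixed-length feature vectors and an illustrative margin:

```python
def euclidean(a, b):
    """Plain Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def verify_password(attempt, user_model, cohort_models, margin=1.0):
    """Accept only if the spoken attempt is closer to the user's own model
    than to every reference-voice (cohort) model, by at least `margin`."""
    own = euclidean(attempt, user_model)
    return all(own + margin <= euclidean(attempt, rv) for rv in cohort_models)
```

Scoring against a cohort rather than a single absolute threshold is what makes the similarity measure robust: an impostor whose voice resembles one of the RV models fails even if the raw distance to the user's model is moderate.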
  • Patent number: 5408581
    Abstract: In an apparatus for speech signal processing, first a coefficient calculation is performed to determine a value for suppressing a change of level of an input signal. Next, an input signal delay is performed to delay the input signal by a time required for the coefficient calculation. Then an output of the input signal delay is multiplied by the value obtained by the coefficient calculation, thereby obtaining an output signal.
    Type: Grant
    Filed: March 10, 1992
    Date of Patent: April 18, 1995
    Assignee: Technology Research Association of Medical and Welfare Apparatus
    Inventors: Ryoji Suzuki, Yoshiyuki Yoshizumi, Tsuyoshi Mekata, Yoshinori Yamada, Masayuki Misaki
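The three stages, coefficient calculation, compensating input delay, and multiplication, can be sketched as follows. The smoothing constant, delay length, and gain rule are illustrative assumptions, not the patented design:

```python
from collections import deque

def suppress_level_changes(samples, delay=4, alpha=0.2):
    """Compute a level-suppressing coefficient, delay the input by the time
    the computation needs, then multiply the delayed input by the coefficient."""
    buf = deque([0.0] * delay, maxlen=delay)  # input-signal delay line
    target = 0.0                              # slowly tracked signal level
    out = []
    for s in samples:
        level = abs(s)
        target = (1 - alpha) * target + alpha * level     # coefficient calculation
        gain = target / level if level > 1e-9 else 1.0    # resists sudden jumps
        delayed = buf[0]                                  # input delayed by `delay`
        buf.append(s)
        out.append(delayed * min(gain, 1.0))
    return out
```

Because the gain tracks a smoothed level, an abrupt loud sample is attenuated toward the recent average, which is useful in hearing-assistance devices where sudden level changes are unpleasant.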
  • Patent number: 5392381
    Abstract: An acoustic analysis device using a reduced scale model comprises a model of a reduced scale of 1/n, a loudspeaker and a microphone provided in the model, a sound signal forming circuit for forming a signal to be radiated by AD-converting an analog signal to be measured at a predetermined sampling frequency, storing the converted signal in a first RAM, reading out the stored signal at a first sampling clock and DA-converting the read out signal, a collected sound reproducing circuit for reproducing collected sound by AD-converting an output signal from the sound collecting device at a second sampling clock, storing the converted signal in a second RAM and reading out the stored signal at a third sampling clock, a sampling clock control circuit for setting the first sampling clock frequency at a value which is n times as high as the predetermined sampling frequency and setting the second sampling clock frequency at a value which is n times as high as the third sampling clock frequency, and an acoustic analysis
    Type: Grant
    Filed: December 17, 1993
    Date of Patent: February 21, 1995
    Assignee: Yamaha Corporation
    Inventors: Hiroshi Furuya, Yasushi Shimizu, Fukushi Kawakami
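The clock relationships stated in the abstract (first clock = n times the predetermined sampling frequency; second clock = n times the third) reduce to a few lines of arithmetic. The choice of the third clock below, equal to the base rate so reproduced sound plays back at original speed, is an illustrative assumption:

```python
def model_clocks(base_fs, n):
    """Sampling clocks for a 1/n scale model, per the stated ratios:
    radiate the stored signal n times faster, and digitize the collected
    sound with a clock n times the read-out (reproduction) clock."""
    first_clock = n * base_fs        # read-out clock for the radiated signal
    third_clock = base_fs            # read-out clock for reproduction (assumed)
    second_clock = n * third_clock   # AD clock for the collected sound
    return first_clock, second_clock, third_clock
```

Scaling the clocks by n compensates for the 1/n geometry: wavelengths and propagation times in the model shrink by n, so time-compressed playback and time-expanded capture restore the full-scale acoustics.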
  • Patent number: 5384892
    Abstract: A method of speech recognition which determines acoustic features in a sound sample; recognizes words comprising the acoustic features based on a language model, which determines the possible sequences of words that may be recognized; and selects an appropriate response based on the words recognized. Information about what words may be recognized, under which conditions those words may be recognized, and what response is appropriate when the words are recognized, is stored, in a preferred embodiment, in a data structure called a speech rule. These speech rules are partitioned according to the context in which they are active. When speech is detected, concurrent with acoustic feature extraction, the current state of the computer system is used to determine which rules are active and how they are to be combined in order to generate a language model for word recognition. A language model is dynamically generated and used to find the best interpretation of the acoustic features as a word sequence.
    Type: Grant
    Filed: December 31, 1992
    Date of Patent: January 24, 1995
    Assignee: Apple Computer, Inc.
    Inventor: Robert D. Strong
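The context-partitioned speech rules can be sketched as a filter over a rule list: only rules whose context conditions are satisfied by the current system state contribute phrases to the dynamically built language model. Data layout and field names below are illustrative assumptions:

```python
def active_language_model(speech_rules, system_state):
    """speech_rules: list of dicts with 'context' (a set of required state
    flags), 'phrases' (recognizable word sequences), and 'action'.
    Returns the phrase -> action map for rules active in this state."""
    model = {}
    for rule in speech_rules:
        if rule['context'] <= system_state:     # all required conditions hold
            for phrase in rule['phrases']:
                model[phrase] = rule['action']
    return model
```

Rebuilding this map whenever the system state changes is what keeps the recognizer's vocabulary small and its responses context-appropriate, which is the core of the abstract's scheme.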
  • Patent number: 5377301
    Abstract: A signal processing arrangement uses a codebook of first vector quantized speech feature signals formed responsive to a large collection of speech feature signals. The codebook is altered by combining the first speech feature signals of the codebook with second speech feature signals generated responsive to later input speech patterns during normal speech processing. A speaker recognition template can be updated in this fashion to take account of change which may occur in the voice and speaking characteristics of a known speaker.
    Type: Grant
    Filed: January 21, 1994
    Date of Patent: December 27, 1994
    Assignee: AT&T Corp.
    Inventors: Aaron E. Rosenberg, Frank K.-P. Soong
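Combining old codebook entries with feature vectors from later speech can be sketched as a nearest-codeword exponential update. The weight and distance measure are illustrative assumptions:

```python
def update_codebook(codebook, new_vectors, weight=0.1):
    """Pull each codeword toward the new feature vectors that quantize to it,
    so the speaker template tracks gradual changes in the known speaker's
    voice. Returns an updated copy; the input codebook is not modified."""
    updated = [list(c) for c in codebook]
    for v in new_vectors:
        # find the nearest codeword (squared Euclidean distance)
        i = min(range(len(updated)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(updated[k], v)))
        # blend the new observation into that codeword
        updated[i] = [(1 - weight) * a + weight * b
                      for a, b in zip(updated[i], v)]
    return updated
```

A small weight means each verification session nudges the template only slightly, so the model adapts to aging and channel drift without being hijacked by a single noisy session.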
  • Patent number: 5377303
    Abstract: Voice utterances are substituted for manipulation of a pointing device, the pointing device being of the kind which is manipulated to control motion of a cursor on a computer display and to indicate desired actions associated with the position of the cursor on the display, the cursor being moved and the desired actions being aided by an operating system in the computer in response to control signals received from the pointing device, the computer also having an alphanumeric keyboard, the operating system being separately responsive to control signals received from the keyboard in accordance with a predetermined format specific to the keyboard; in the system, a voice recognizer recognizes the voice utterance, and an interpreter converts the voice utterance into control signals which will directly create a desired action aided by the operating system without first being converted into control signals expressed in the predetermined format specific to the keyboard.
    Type: Grant
    Filed: December 9, 1993
    Date of Patent: December 27, 1994
    Assignee: Articulate Systems, Inc.
    Inventor: Thomas R. Firman
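The interpreter's job, mapping a recognized phrase straight to a pointing-device-style event rather than to keyboard codes, can be sketched as a lookup table. Phrases, step sizes, and event tuples here are illustrative assumptions:

```python
def utterance_to_control(utterance):
    """Convert a recognized phrase directly into a pointer control event,
    bypassing any keyboard-specific encoding."""
    table = {
        'move up':      ('move', 0, -10),
        'move down':    ('move', 0, 10),
        'move left':    ('move', -10, 0),
        'move right':   ('move', 10, 0),
        'click':        ('button', 'left', 'press_release'),
        'double click': ('button', 'left', 'double'),
    }
    phrase = utterance.lower().strip()
    if phrase not in table:
        raise ValueError('unrecognized utterance: %r' % utterance)
    return table[phrase]
```

The point of the patent's design is visible in the output type: the events are in the pointer's own control vocabulary, so the operating system handles them exactly as if the mouse had moved, with no detour through keyboard scan codes.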