Patents Examined by Susan Wieland
  • Patent number: 5907822
    Abstract: A method and device for extrapolating past signal-history data for insertion into missing data segments in order to conceal digital speech frame errors. The extrapolation method uses past-signal history that is stored in a buffer. The method is implemented with a device that utilizes a finite-impulse response (FIR) multi-layer feed-forward artificial neural network that is trained by back-propagation for one-step extrapolation of speech compression algorithm (SCA) parameters. Once a speech connection has been established, the speech compression algorithm device begins sending encoded speech frames. As the speech frames are received, they are decoded and converted back into speech signal voltages. During the normal decoding process, pre-processing of the required SCA parameters will occur and the results stored in the past-history buffer.
    Type: Grant
    Filed: April 4, 1997
    Date of Patent: May 25, 1999
    Assignee: Lincom Corporation
    Inventor: Jaime L. Prieto, Jr.
  • Patent number: 5899975
    Abstract: The presentation of audio information, particularly audio information generated by a voice synthesizer from text using a text or screen reader, is controlled using a style sheet. The style sheet permits default presentation styles, such as voice-family, voice-pitch, voice-variant, voice speed and volume to be set, and then varied based on embedded text presentation commands such as those found in hypertext markup language and in desktop publishing.
    Type: Grant
    Filed: April 3, 1997
    Date of Patent: May 4, 1999
    Assignee: Sun Microsystems, Inc.
    Inventor: Jakob Nielsen
  • Patent number: 5899973
    Abstract: In this speech recognition system, the size of the language model is reduced by discarding those n-grams that the acoustic part of the system can recognize most accurately without support from a language model. The n-grams can be discarded dynamically during the running of the system or during the build or setup-time of the system. Trigrams occurring infrequently in the text corpora are substituted for the discarded n-grams to increase the accuracy of the word recognitions.
    Type: Grant
    Filed: September 25, 1997
    Date of Patent: May 4, 1999
    Assignee: International Business Machines Corporation
    Inventors: Upali Bandara, Siegfried Kunzmann, Karlheinz Mohr, Burn L. Lewis
  • Patent number: 5897616
    Abstract: A method and apparatus for securing access to a service or facility employing automatic speech recognition, text-independent speaker identification, natural language understanding techniques and additional dynamic and static features. The method includes the steps of receiving and decoding speech containing indicia of the speaker such as a name, address or customer number; accessing a database containing information on candidate speakers; questioning the speaker based on the information; receiving, decoding and verifying an answer to the question; obtaining a voice sample of the speaker and verifying the voice sample against a model; generating a score based on the answer and the voice sample; and granting access if the score is equal to or greater than a threshold.
    Type: Grant
    Filed: June 11, 1997
    Date of Patent: April 27, 1999
    Assignee: International Business Machines Corporation
    Inventors: Dimitri Kanevsky, Stephane Herman Maes
  • Patent number: 5895449
    Abstract: A singing sound-synthesizing apparatus sequentially synthesizes vocal sounds based on singing data including lyric data of a lyric formed of a plurality of phonemes and sounding data designating a sounding time period over which the lyric data is sounded. A designating device designates a predetermined voiced phoneme from the plurality of phonemes of the lyric data. A sounding control device carries out sounding control such that sounding of the predetermined voiced phoneme designated by the designating device is started within the sounding time period designated for the plurality of phonemes by the sounding data and continued until the sounding time period designated for the plurality of phonemes elapses. In another form, ones of phoneme parameter sets and ones of coarticulation parameter sets corresponding to singing data are read from a phoneme database storing the phoneme parameter sets and the coarticulation parameter sets.
    Type: Grant
    Filed: July 22, 1997
    Date of Patent: April 20, 1999
    Assignee: Yamaha Corporation
    Inventors: Yasuyoshi Nakajima, Masahiro Koyama
  • Patent number: 5893066
    Abstract: An audio decoder in a multimedia processor improves decoding efficiency and performance through the usage of multiple parallel processors including a scalar processor and a vector processor. The scalar processor most efficiently performs tasks including bit manipulation, indexing and conditional operations. The vector processor most efficiently performs operations involving multiple data calculations, operating on unconditional, sequential data. Improved performance is achieved by executing as many operations as possible on the vector processor, rather than the scalar processor, so long as the data is sequential data. The audio decoder includes a requantization program code that shifts data handling operations from the scalar processor to the vector processor through the conversion of nonsequential data to sequential data, thereby "vectorizing" the data.
    Type: Grant
    Filed: October 15, 1996
    Date of Patent: April 6, 1999
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Kicheon Hong
  • Patent number: 5890114
    Abstract: An HMM training method comprising a first parameter predicting step, a centroid state set calculating step, a reconstructing step, a second parameter predicting step and a control step. In the first parameter predicting step, a parameter of an HMM (hidden Markov model) is predicted based on training data. In the centroid state set calculating step, a centroid state set is calculated by clustering the states of said HMM whose parameter is predicted in the first parameter predicting step. In the reconstructing step, an HMM is reconstructed using the centroid state set calculated in the centroid state set calculating step. In the second parameter predicting step, a parameter of the HMM reconstructed in the reconstructing step is predicted using the training data. The centroid state set calculating step is then re-executed by the control step if the likelihood of the HMM whose parameter is predicted in the second parameter predicting step does not satisfy a predetermined condition.
    Type: Grant
    Filed: February 28, 1997
    Date of Patent: March 30, 1999
    Assignee: Oki Electric Industry Co., Ltd.
    Inventor: Jie Yi
  • Patent number: 5884262
    Abstract: The computer document audio access and conversion system allows a user to access information originally formatted for audio/visual interfacing on a computer network via a simple telephone. Of course, files formatted specifically for audio interfacing can also be accessed by the system. A user can call a designated telephone number and request a file via dual-tone multi-frequency (DTMF) signalling or through voice commands. The system analyzes the request and accesses a predetermined document. The document may be in a standard document file format, such as hyper-text mark-up language (HTML) which is used on the World Wide Web. The document is analyzed by the system, and depending on the different types of formats used in the document, information is translated from an audio/visual format to an audio format and played to the user via the telephone interface. The document may contain links to other documents which can be invoked to access such other documents.
    Type: Grant
    Filed: March 28, 1996
    Date of Patent: March 16, 1999
    Assignee: Bell Atlantic Network Services, Inc.
    Inventors: Laird H. Wise, Efstathios Mavrotheris, James E. Curry
  • Patent number: 5878390
    Abstract: A speech recognition apparatus includes a speech recognition section for performing a speech recognition process on an uttered speech sentence with reference to a predetermined statistical language model, based on a series of speech signals of the uttered speech sentence composed of a series of input words. The speech recognition section calculates a functional value of a predetermined erroneous sentence judging function with respect to speech recognition candidates, where the erroneous sentence judging function represents a degree of unsuitability of the speech recognition candidates. When the calculated functional value exceeds a predetermined threshold value, the speech recognition section performs the speech recognition process by eliminating the speech recognition candidate corresponding to that calculated functional value.
    Type: Grant
    Filed: June 23, 1997
    Date of Patent: March 2, 1999
    Assignee: ATR Interpreting Telecommunications Research Laboratories
    Inventors: Jun Kawai, Yumi Wakita
  • Patent number: 5875428
    Abstract: A reading system includes a computer and a mass storage device including software comprising instructions for causing a computer to accept an image file generated from optically scanning an image of a document. The software converts the image file into a converted text file that includes text information, and positional information associating the text with the position of its representation in the image file. The reading system has the ability therefore to display the image representation of the scanned image on a computer monitor and permit a user to control operation of the reader with respect to the displayed image representation of the document by using the locational information associated with the converted text file. Also described are techniques for dual highlighting spoken text and a technique for determining the nearest word to a position selected by use of mouse or other pointing device operating on the image representation as displayed on the monitor.
    Type: Grant
    Filed: June 27, 1997
    Date of Patent: February 23, 1999
    Assignee: Kurzweil Educational Systems, Inc.
    Inventors: Raymond C. Kurzweil, Firdaus Bhathena
  • Patent number: 5873060
    Abstract: Wide-band speech signals and also music signals are coded with relatively little computational effort and less sound quality deterioration even at low bit rates. A spectral parameter calculator obtains a spectral parameter from sub-frames of an input signal from a sub-frame divider, and quantizes the obtained spectral parameter. A divider divides the difference result from a subtractor into a plurality of sub-bands. Adaptive codebook circuits obtain a pitch prediction signal by obtaining pitch data in at least one of the sub-bands. Judging circuits execute pitch prediction judgment by using the pitch data in at least one of the sub-bands. A synthesizer synthesizes a pitch prediction signal. A subtractor subtracts the pitch prediction signal from the difference result obtained from a subtractor and thus obtains an excitation signal. An excitation quantizer quantizes the excitation signal with reference to an excitation codebook.
    Type: Grant
    Filed: May 27, 1997
    Date of Patent: February 16, 1999
    Assignee: NEC Corporation
    Inventor: Kazunori Ozawa
  • Patent number: 5873059
    Abstract: A method and apparatus for reproducing speech signals at a controlled speed and for synthesizing speech includes a dividing unit that divides the input speech into time segments and an encoding unit that discriminates whether each of the speech segments is voiced or unvoiced. Based on the results of the discrimination, the encoding unit performs sinusoidal synthesis and encoding for voiced segments and vector quantization by closed-loop search for an optimum vector using an analysis-by-synthesis method for unvoiced segments in order to find encoded parameters. A period modification unit modifies the length of time associated with each signal segment and calculates a set of modified encoded parameters.
    Type: Grant
    Filed: October 25, 1996
    Date of Patent: February 16, 1999
    Assignee: Sony Corporation
    Inventors: Kazuyuki Iijima, Masayuki Nishiguchi, Jun Matsumoto, Shiro Omori
  • Patent number: 5864811
    Abstract: An audio circuit for a computer includes a bidirectional modem connection, a microphone input, first and second audio output channels, and an audio synthesizing circuit arranged to produce first and second synthesized audio channels. In a first mode of operation the first synthesized audio channel is applied to the first audio output channel and the second synthesized audio channel is applied to the second audio output channel. In a second mode of operation the first and second synthesized audio channels are combined into a monophonic signal and applied to the second audio output channel, and audio signals from the bidirectional modem connection are applied to the first audio output channel.
    Type: Grant
    Filed: November 13, 1996
    Date of Patent: January 26, 1999
    Assignee: Compaq Computer Corporation
    Inventors: Thanh T. Tran, John A. Landry, Robert F. Watts
  • Patent number: 5864819
    Abstract: Method for representing a target software application program to a voice recognition navigator program on a computer system. The method requires analyzing an application program to determine a plurality of application states. Each of the application states is defined as a set of window objects within the application for performing a specific user task. According to the invention, each of the application states is preferably represented by a sub-context tree, comprised of a plurality of sub-context objects. The tree allows the navigator to associate decoded spoken commands to specific window objects.
    Type: Grant
    Filed: November 8, 1996
    Date of Patent: January 26, 1999
    Assignee: International Business Machines Corporation
    Inventors: Mario E. De Armas, Harvey M. Ruback
  • Patent number: 5864807
    Abstract: A method and apparatus for training a system to assess the identity of a person through the audio characteristics of their voice. The system inserts an audio input (10) into an A/D Converter (20) for processing in a digital signal processor (30). The system then applies Neural network type processing by using a polynomial pattern classifier (60) for training the speaker recognition system.
    Type: Grant
    Filed: February 25, 1997
    Date of Patent: January 26, 1999
    Assignee: Motorola, Inc.
    Inventors: William Michael Campbell, Khaled Talal Assaleh
  • Patent number: 5864791
    Abstract: A method of extracting at least one pitch from every frame of a speech signal, which includes the steps of generating a number of residual signals revealing high and low points of the speech signal within a frame, and taking one of those residual signals which satisfies a predetermined condition among the generated residual signals, as the pitch. In the step of generating the residual signals, the speech is filtered using a FIR-STREAK filter which is a combination of the finite impulse response (FIR) filter and a STREAK filter, and the filtration result is output as the residual signal. In the step of generating the pitch, only the residual signal whose amplitude is over a predetermined value, and the residual signal whose temporal interval is within a predetermined period of time is generated as the pitch.
    Type: Grant
    Filed: February 28, 1997
    Date of Patent: January 26, 1999
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: See-Woo Lee
  • Patent number: 5864798
    Abstract: Adjusting the shape of a spectrum of a speech signal includes the steps of using a first filter with pole-zero transfer function A(z)/B(z) for subjecting a speech signal to a spectrum envelope emphasis and a second filter, cascade-connected with the first filter, for compensating for a spectral tilt due to the first filter; independently deriving two filter coefficients used in the second filter for compensating for the spectral tilt from the pole-zero transfer function; and compensating for the spectral tilt corresponding to the pole-zero transfer function according to the derived filter coefficients.
    Type: Grant
    Filed: September 17, 1996
    Date of Patent: January 26, 1999
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kimio Miseki, Masahiro Oshikiri, Akinobu Yamashita, Masami Akamine, Tadashi Amada
  • Patent number: 5860063
    Abstract: A system and method for automated task selection is provided where a selected task is identified from the natural speech of the user making the selection. The system and method incorporate the selection of meaningful phrases through the use of a test for significance. The selected meaningful phrases are then clustered. The meaningful phrase clusters are input to a speech recognizer that determines whether any meaningful phrase clusters are present in the input speech. Task-type decisions are then made on the basis of the recognized meaningful phrase clusters.
    Type: Grant
    Filed: July 11, 1997
    Date of Patent: January 12, 1999
    Assignee: AT&T Corp
    Inventors: Allen Louis Gorin, Jeremy Huntley Wright
  • Patent number: 5860062
    Abstract: A speech recognition apparatus and method learns in advance a plurality of kinds of noises that can occur in the environment of use to determine a plurality of noise HMMs, synthesizes these noise HMMs into one noise HMM, generates a NOVO-HMM by executing NOVO (voice mixed with noise) conversion for a speech HMM of a reference pattern by using this composite noise HMM, and uses this NOVO-HMM for a speech recognition processing. Since a plurality of noises are incorporated in the NOVO-HMM generated in this manner, the speech can be recognized with high accuracy even when the noise changes.
    Type: Grant
    Filed: June 13, 1997
    Date of Patent: January 12, 1999
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Kenichi Taniguchi, Nobuyuki Kono, Toshimichi Tokuda, Yoshio Ikura
  • Patent number: 5855002
    Abstract: A system for interfacing a human user to a data processor which receives inputs from the user and includes associated storage resource information. The data processor generates outputs to the user and associated output devices. The interface system includes structure for receiving a statement generated by the human user in natural language on a word-by-word basis. The system analyzes the statement to identify a subject and object and searches the stored resource information for data related to the identified subject. Output devices are provided for generating to the user the data from the stored resource information related to the identified subject.
    Type: Grant
    Filed: June 11, 1996
    Date of Patent: December 29, 1998
    Assignee: Pegasus Micro-Technologies, Inc.
    Inventor: Alan A. Armstrong
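
Illustrative note on patent 5897616 (listed above): the abstract describes generating a score from a verified answer and a voice sample, and granting access if the score is equal to or greater than a threshold. The abstract does not say how the two pieces of evidence are combined, so the sketch below assumes a simple weighted sum. The names VerificationResult, combined_score and grant_access, the [0, 1] score range, and the default weight and threshold are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class VerificationResult:
    # Hypothetical container; field names are illustrative only.
    answer_correct: bool  # did the decoded answer match the database record?
    voice_score: float    # assumed similarity of the voice sample to the model, in [0, 1]


def combined_score(result: VerificationResult, answer_weight: float = 0.5) -> float:
    """Combine answer and voice evidence as a weighted sum (an assumption)."""
    answer_score = 1.0 if result.answer_correct else 0.0
    return answer_weight * answer_score + (1.0 - answer_weight) * result.voice_score


def grant_access(result: VerificationResult,
                 threshold: float = 0.75,
                 answer_weight: float = 0.5) -> bool:
    """Grant access when the score is equal to or greater than the threshold."""
    return combined_score(result, answer_weight) >= threshold
```

With these assumed defaults, a correct answer paired with a voice score of 0.5 sits exactly on the threshold and is accepted, while a perfect voice match with a wrong answer is rejected. A real system would presumably tune the weighting and threshold against its own error rates.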