Patents Examined by Robert Louis Sax
  • Patent number: 6014621
    Abstract: A speech compression system called "Transform Predictive Coding", or TPC, provides for encoding 7 kHz wideband speech (16 kHz sampling) at a target bit-rate range of 16 to 32 kb/s (1 to 2 bits/sample). The system uses short-term and long-term prediction to remove the redundancy in speech. A prediction residual is transformed and coded in the frequency domain to take advantage of knowledge in human auditory perception. The TPC coder uses only open-loop quantization and therefore has a fairly low complexity. The speech quality of TPC is essentially transparent at 32 kb/s, very good at 24 kb/s, and acceptable at 16 kb/s.
    Type: Grant
    Filed: April 2, 1997
    Date of Patent: January 11, 2000
    Assignee: Lucent Technologies Inc.
    Inventor: Juin-Hwey Chen
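The bit-rate figures quoted in the abstract of patent 6014621 can be checked with simple arithmetic: bits per sample is the bit rate divided by the sampling rate. The sketch below is only an illustration of that relationship; the function name is hypothetical and not taken from the patent.

```python
def bits_per_sample(bit_rate_bps: float, sampling_rate_hz: float) -> float:
    """Average number of coded bits spent on each input sample."""
    if sampling_rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    return bit_rate_bps / sampling_rate_hz


# 16 kHz sampling, as stated in the abstract.
SAMPLING_RATE_HZ = 16_000

# The three bit rates the abstract mentions, in kb/s.
for kbps in (16, 24, 32):
    print(f"{kbps} kb/s: {bits_per_sample(kbps * 1000, SAMPLING_RATE_HZ)} bits/sample")
```

At 16 and 32 kb/s this gives the 1 and 2 bits/sample end points stated in the abstract, with 24 kb/s falling halfway between them.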
  • Patent number: 6006181
    Abstract: A continuous speech decoder that is built up of multiple layers. Each of the layers uses independent knowledge sources and rules, but all the layers cooperate to quickly decode the speech input into words. A first layer is concerned with acoustic data, a second layer with phone data of speech and a third layer concerns word data and word sequences. By separating these layers, the higher layers can be made time independent and asynchronous. Thus the asynchronous layers can process data quickly and give fast support to the first layer which keeps a dynamic record called a dynamic network of the most likely continuous speech results. The speed and separation of this decoder allows better memory efficiency and better decoder results compared to previously known continuous speech decoders.
    Type: Grant
    Filed: September 12, 1997
    Date of Patent: December 21, 1999
    Assignee: Lucent Technologies Inc.
    Inventors: Eric Rolse Buhrke, Wu Chou
  • Patent number: 6006185
    Abstract: A speaker independent, continuous speech, word spotting voice recognition system and method. The edges of phonemes in an utterance are quickly and accurately isolated. The utterance is broken into wave segments based upon the edges of the phonemes. A voice recognition engine is consulted multiple times for several wave segments and the results are analyzed to correctly identify the words in the utterance.
    Type: Grant
    Filed: May 9, 1997
    Date of Patent: December 21, 1999
    Inventor: Peter Immarco
  • Patent number: 6003001
    Abstract: In encoding in which an adaptive codebook such as PSI-CELP or a fixed codebook is used on switching selection, waveform distortion caused by selection of the fixed codebook in case input speech frequency components are changed significantly is diminished. An output of an adaptive codebook 21 or an output of a fixed codebook 22 is selected by a changeover selection switch 26 and summed to an output of noise codebooks 23, 24 so as to be sent to a linear prediction synthesis filter 16. A switching control circuit 19 for controlling the switching of a changeover control switch 26 operates in response to a prediction gain which is a ratio of the linear prediction residual energy to the initial signal energy from a linear prediction analysis circuit 14 so that, if the prediction gain is smaller than a pre-set threshold value, the switching control circuit 19 judges the input signal to be voiced and controls the changeover control switch 26 for compulsorily selecting the output of the adaptive codebook 21.
    Type: Grant
    Filed: June 25, 1997
    Date of Patent: December 14, 1999
    Assignee: Sony Corporation
    Inventor: Yuji Maeda
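The switching rule in the abstract of patent 6003001 can be sketched as a threshold test. This is a minimal illustration, not the patented circuit: it follows the abstract's wording, in which the "prediction gain" is the ratio of linear prediction residual energy to input signal energy, and a value below a pre-set threshold forces the adaptive codebook. The function names, sample values, and threshold are hypothetical, and the residual is assumed to have been computed elsewhere by a linear prediction analysis stage.

```python
from typing import Sequence


def energy(signal: Sequence[float]) -> float:
    """Sum of squared samples."""
    return sum(s * s for s in signal)


def prediction_gain(residual: Sequence[float], original: Sequence[float]) -> float:
    """Residual energy divided by input energy, as the abstract defines the term."""
    input_energy = energy(original)
    if input_energy == 0:
        raise ValueError("input frame has zero energy")
    return energy(residual) / input_energy


def force_adaptive_codebook(
    residual: Sequence[float], original: Sequence[float], threshold: float
) -> bool:
    """True when the frame is judged voiced and the adaptive codebook is forced."""
    return prediction_gain(residual, original) < threshold


# Hypothetical frame: the residual is much weaker than the input signal.
frame = [1.0, -1.0, 1.0, -1.0]
residual = [0.1, -0.1, 0.1, -0.1]
print(force_adaptive_codebook(residual, frame, threshold=0.1))
```

When the test returns False, the abstract leaves the choice between the adaptive and fixed codebooks to the normal selection process, which this sketch does not model.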
  • Patent number: 5999901
    Abstract: The present invention provides a telephone network apparatus for performing speech recognition services in a telephone system in substantially real time. The apparatus uses a telephone channel signal to determine the echo delay of a telephone channel and then uses this delay to configure an echo cancellation filter for use in performing speech recognition. Use of echo delay in configuring the filter allows the echo cancellation function to be done using much less computational time than would be needed without its use, thereby granting a speech recognition unit greater access to a resident microprocessor to perform its function in substantially real time.
    Type: Grant
    Filed: December 12, 1997
    Date of Patent: December 7, 1999
    Assignees: MediaOne Group, Inc.; U S West, Inc.
    Inventors: Curtis D. Knittle, Paul D. Jaramillo, Frank H. Wu
  • Patent number: 5991726
    Abstract: A voice recognition system which controls industrial equipment or machinery. A proximity detector is attached to automatically adjust microphone sensitivity and to control automatic shutdown when the machine operator is not present. An enhanced barge-in feature uses a data switch that includes an input audio delay storage. The delay storage prevents loss of initial input data by delaying the input until the data switch switches from output to input modes. A variety of audio/video responses are provided to vary output and enhance attention span. Rules based input data handling provides a flexible response to user input.
    Type: Grant
    Filed: May 9, 1997
    Date of Patent: November 23, 1999
    Inventors: Peter Immarco, Lawrence Cohen, Theodore J. Gordon
  • Patent number: 5991720
    Abstract: The input speech is segmented using plural grammar networks, including a network that includes a filler model designed to represent noise or extraneous speech. Recognition processing results in plural lists of candidates, each list containing the N-best candidates generated. The lists are then separately aligned with the dictionary of valid names to generate two lists of valid names. The final recognition pass combines these two lists of names into a dynamic grammar and this dynamic grammar may be used to find the best candidate name using Viterbi recognition. A telephone call routing application based on the recognition system selects the best candidate name corresponding to the name spelled by the user, whether the user pronounces the name prior to spelling, or not.
    Type: Grant
    Filed: April 16, 1997
    Date of Patent: November 23, 1999
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Michael Galler, Jean-Claude Junqua
  • Patent number: 5987408
    Abstract: The invention relates to an automated directory assistance system that utilizes a heuristics model for predicting the most likely requested number. Each orthography link in the speech recognition dictionary pointing toward an entry in the white pages is associated with a data structure that provides a probability value of that link pointing toward the telephone number intended by the user on the basis of locality information specified by the user of the automated directory assistance system.
    Type: Grant
    Filed: December 16, 1996
    Date of Patent: November 16, 1999
    Assignee: Nortel Networks Corporation
    Inventor: Vishwa Gupta
  • Patent number: 5987415
    Abstract: The invention is embodied in a computer user interface including an observer capable of observing user behavior, an agent capable of conveying emotion and personality by exhibiting corresponding behavior to a user, and a network linking user behavior observed by said observer and emotion and personality conveyed by said agent. The network can include an observing network facilitating inferencing user emotional and personality states from the behavior observed by the observer as well as an agent network facilitating inferencing of agent behavior from emotion and personality states to be conveyed by the agent. In addition, a policy module can dictate to the agent network desired emotion and personality states to be conveyed by the agent based upon user emotion and personality states inferred by the observing network. Typically, each network is a stochastic model.
    Type: Grant
    Filed: June 30, 1998
    Date of Patent: November 16, 1999
    Assignee: Microsoft Corporation
    Inventors: John S. Breese, John Eugene Ball
  • Patent number: 5983176
    Abstract: A method and apparatus for searching for multimedia files in a distributed database and for displaying results of the search based on the context and content of the multimedia files.
    Type: Grant
    Filed: April 30, 1997
    Date of Patent: November 9, 1999
    Assignee: Magnifi, Inc.
    Inventors: Eric M. Hoffert, Karl Cremin, Leo Degen
  • Patent number: 5983192
    Abstract: A method and apparatus for distributing audio signals from one of a plurality of audio sources to an output connect compressed audio signals from each one of the plurality of audio sources to an audio processor. Uncompressed audio signals are derived from the compressed audio signals. The compressed audio signal from one of the plurality of audio sources is selectively coupled to the output based upon the uncompressed audio signals. In a preferred embodiment, the compressed audio signal from one of the plurality of audio sources is coupled to the output selectively in accordance with speech information detected in the uncompressed audio signals from the plurality of audio sources.
    Type: Grant
    Filed: January 22, 1999
    Date of Patent: November 9, 1999
    Assignee: PictureTel Corporation
    Inventors: Stephen C. Botzko, David M. Franklin
  • Patent number: 5983184
    Abstract: The present invention enables a visually impaired user to freely and easily control hyper text. A voice synthesis program orally reads hyper text on the Internet. In synchronization with this reading, the system focuses on a link keyword that is most closely related to the location where reading is currently being performed. When an instruction "jump to link destination" is input (by voice or with a key), the program control can jump to the link destination for the link keyword that is being focused on. Further, the reading of only a link keyword can be instructed.
    Type: Grant
    Filed: July 29, 1997
    Date of Patent: November 9, 1999
    Assignee: International Business Machines Corporation
    Inventor: Atsushi Noguchi
  • Patent number: 5978755
    Abstract: A dictation device for the storage of speech signals comprises processors and generators. The processors are adapted to digitize applied speech signals into data blocks (DB) comprising digital data, each with a header portion (HP) and a data portion (DP). The data in each data block (DB) is organized in accordance with a specification comprising specification information. The generators are adapted to generate designation records which are each associated with a specification. The processors are adapted to generate data blocks (DB) whose header portion (HP) has a first section (HSP-"1") and at least one further section (HSP-"2", HSP-"3", HSP-"4"). The generators are adapted to generate a designation record (DS3) associated with a specification partner code associated with a specification partner, which designation record can be inserted in the first section (HSP-"1") of a data block (DB).
    Type: Grant
    Filed: March 6, 1997
    Date of Patent: November 2, 1999
    Assignee: U.S. Philips Corporation
    Inventor: Gerhard Podhradsky
  • Patent number: 5974382
    Abstract: A method for configuring an audio interface for a voice recognition application in a computer system comprises the steps of: (a) displaying a first graphical user interface having a message prompting a user to initiate testing of a microphone connection, a user activatable button and at least one icon representing recording status; (b) in response to activation of the button: (1) modifying the message section to instruct the user to remain silent; (2) recording background noise through the microphone; (3) initiating an animation of the at least one icon to represent an active recording status; and, (4) terminating the background noise recording and the animation of the at least one icon; (c) upon termination of the background noise recording: (1) modifying the message to instruct the user to speak into the microphone; (2) recording the user speaking through the microphone; (3) reinitiating the animation of the at least one icon; (4) terminating the recording of the user speaking and the animation of the icon;
    Type: Grant
    Filed: October 29, 1997
    Date of Patent: October 26, 1999
    Assignee: International Business Machines Corporation
    Inventors: Frank Fado, Peter Guasti, Amado Nassiff, Ronald VanBuskirk
  • Patent number: 5974381
    Abstract: To avoid a predetermined amount of time and/or a certain amount of processing time prior to determining a number of frames for each speech input portion, a fast voice recognition system enables real-time frame counting based upon a comparison between a decreasing number of frames and an increasing time-dependent threshold. The real-time voice recognition also enables a substantially reduced rate for erroneous partial matching.
    Type: Grant
    Filed: December 19, 1997
    Date of Patent: October 26, 1999
    Assignee: Ricoh Company, Ltd.
    Inventor: Syuji Kubota
  • Patent number: 5974383
    Abstract: A method for configuring an audio mixer in an audio interface for a speech recognition application in a computer system in accordance with an inventive arrangement comprises the steps of: (a) displaying at least one graphical user interface for selecting an audio input and output device from a plurality of audio input and output devices compatible with the speech recognition application, including: a headset with a microphone and an earpiece speaker; a microphone not forming part of a headset; at least one external speaker connected to the computer system; at least one internal speaker connected to the computer system; and, different combinations of the input and output audio devices; (b) selecting and adjusting microphone inputs of the mixer in accordance with an audio input device selected in step (a); and, (c) deselecting audio inputs and outputs of the audio mixer in accordance with the audio input device and an audio output device selected in step (a), whereby unneeded audio inputs and outputs of the mix
    Type: Grant
    Filed: October 29, 1997
    Date of Patent: October 26, 1999
    Assignee: International Business Machines Corporation
    Inventors: Frank Fado, Peter Guasti, Amado Nassiff, Ronald VanBuskirk
  • Patent number: 5970456
    Abstract: The apparatus comprises a microcontroller (7) which receives codes representing vocabulary elements, a speech synthesizer (20) which generates, in analog form, phonemes for a loudspeaker (5) which correspond to the vocabulary elements represented by said codes, and a vocabulary memory (8, 23) which can be addressed by means of codes. In accordance with the invention the memory contains, in correspondence with a given code, the ASCII characters of the word or the groups of words designated by this code, to be displayed on a display screen (10), as well as a sequence of digital data which defines the pronunciation thereof. The apparatus comprises means (22, 7, 21) for applying said sequence of digital data to the speech synthesizer (20) when the latter is to supply the loudspeaker (5) with the vocabulary element represented by the code. Moreover, a memory containing proper names and controlled in accordance with the invention is accommodated on a removable card (23).
    Type: Grant
    Filed: April 12, 1996
    Date of Patent: October 19, 1999
    Assignee: Mannesmann VDO AG
    Inventors: Jean-Marc Patillot, Bernard De Vergnette, Donald Zeegers
  • Patent number: 5963906
    Abstract: A method and system performs speech recognition training using Hidden Markov Models. Initially, preprocessed speech signals that include a plurality of observations are stored by the system. Initial Hidden Markov Model (HMM) parameters are then assigned. Summations are then calculated using modified equations derived substantially from the following equations, wherein $u \le v < w$: $P(x_u^w) = P(x_u^v)\,P(x_{v+1}^w)$ and $\Omega_{ij}(x_u^w) = \Omega_{ij}(x_u^v)\,P(x_{v+1}^w) + P(x_u^v)\,\Omega_{ij}(x_{v+1}^w)$. The calculated summations are then used to perform HMM parameter reestimation. The system then determines whether the HMM parameters have converged. If they have, the HMM parameters are then stored. However, if the HMM parameters have not converged, the system again calculates summations, performs HMM parameter reestimation using the summations, and determines whether the parameters have converged.
    Type: Grant
    Filed: May 20, 1997
    Date of Patent: October 5, 1999
    Assignee: AT&T Corp.
    Inventor: William Turin
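The product identity in the abstract of patent 5963906 says that the quantity P for a whole observation span can be obtained from the quantities for two adjacent sub-spans split at any point v. The sketch below illustrates this under an assumption the abstract does not state: that P for a span is the ordered product of one matrix per observation, in which case the identity is just associativity of matrix multiplication. The helper names and the 2x2 matrices are hypothetical, and the companion recursion for the summation terms is not modeled.

```python
from functools import reduce
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two conformable matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def span_matrix(per_observation: List[Matrix]) -> Matrix:
    """Ordered product of the per-observation matrices for one span (assumed form of P)."""
    return reduce(matmul, per_observation)


# Hypothetical per-observation matrices for a four-observation sequence.
observations: List[Matrix] = [
    [[0.5, 0.25], [0.125, 0.5]],
    [[0.25, 0.5], [0.5, 0.125]],
    [[0.5, 0.125], [0.25, 0.25]],
    [[0.125, 0.5], [0.5, 0.25]],
]

whole = span_matrix(observations)

# Split the span after the second observation and recombine the two halves.
left = span_matrix(observations[:2])
right = span_matrix(observations[2:])
recombined = matmul(left, right)

print(whole)
print(recombined)
```

Under that assumption the two printed matrices agree, which is what allows partial results for sub-spans to be computed separately and combined.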
  • Patent number: 5960391
    Abstract: A signal extraction system for extracting one or more signal components from an input signal including a plurality of signal components. This system is equipped with a neural network arithmetic section designed to process information through the use of a recurrent neural network. The neural network arithmetic section extracts one or more signal components, for example, a speech signal component and a noise signal component, from an input signal including a plurality of signal components such as speech and noise, and outputs the extracted signal components. Owing to this neural network arithmetic section, signal extraction becomes possible with high accuracy.
    Type: Grant
    Filed: December 13, 1996
    Date of Patent: September 28, 1999
    Assignee: Denso Corporation
    Inventors: Masahiko Tateishi, Shinichi Tamura
  • Patent number: 5956463
    Abstract: The invention relates to an automated system for monitoring wildlife auditory data and recording same for subsequent analysis and identification. The system comprises one or more microphones coupled to a recording apparatus for recording wildlife vocalizations in digital format. The resultant recorded data is preprocessed, segmented, and analyzed by means of a neural network to identify the respective species. The system minimizes the need for human intervention and subjective interpretation of the recorded sounds.
    Type: Grant
    Filed: October 7, 1996
    Date of Patent: September 21, 1999
    Assignee: Ontario Hydro
    Inventors: Paul H. Patrick, Narayan Ramani, William G. Hanson, Ronald W. Sheehan, Robert L. Jennette