Patents Examined by David D. Knepper
  • Patent number: 7110940
    Abstract: Efficient recursive audio processing of one or more input data streams using a multistage processor for performing one or more predetermined functions and programmable audio effects. A first stage performs a first predetermined function, such as a frequency shifting function. Intermediate results are preferably mixed. The second stage applies programmable audio effects to the mixed data, such as a reverberation effect, and stores the second stage output in a destination mix bin. The second stage output is preferably transferred to a main memory accessible to a primary processor. The second stage output is directed back to the first stage of the multistage processor to perform a second predetermined function, such as three-dimensional spatialization. The primary processor modifies parameters of the first predetermined function to efficiently perform dynamic operations, such as Doppler shifts and volume transitions between multiple sound sources and a mixture of those sounds as a single point source.
    Type: Grant
    Filed: October 30, 2002
    Date of Patent: September 19, 2006
    Assignee: Microsoft Corporation
    Inventors: Derek H. Smith, Brian L. Schmidt, Georgios Chrysanthakopoulos
  • Patent number: 7103542
    Abstract: Systems and methods for automatically improving a voice recognition system are provided. In one embodiment, the systems and methods retrieve voice recognition information produced by a voice recognition system in response to recognizing a user utterance. The voice recognition information comprises a recognized voice command associated with the user utterance and a reference to an audio file that includes the user utterance. The audio file is played and it is determined if the recognized voice command matches the user utterance included in the audio file. The user utterance is then transcribed to create a transcribed utterance, if the recognized voice command does not match the user utterance. The transcribed utterance is then recorded in association with the recognized voice command to monitor recognition accuracy.
    Type: Grant
    Filed: December 14, 2001
    Date of Patent: September 5, 2006
    Assignee: Ben Franklin Patent Holding LLC
    Inventor: Sean Doyle
  • Patent number: 7096182
    Abstract: In order to enhance the quality of a communication signal derived from speech and noise, a filter divides the communication signal into a plurality of frequency band signals. A calculator generates a plurality of power band signals each having a power band value and corresponding to one of the frequency band signals. The power band values are based on estimating, over a time period, the power of one of the frequency band signals. The time period is different for different ones of the frequency band signals. The power band values are used to calculate weighting factors which are used to alter the frequency band signals that are combined to generate an improved communication signal.
    Type: Grant
    Filed: February 28, 2003
    Date of Patent: August 22, 2006
    Assignee: Tellabs Operations, Inc.
    Inventors: Ravi Chandran, Bruce E. Dunne, Daniel J. Marchok
  • Patent number: 7092880
    Abstract: A system and a method for quantitatively measuring voice transmission quality within a voice-over-data-network such as a telephony-enabled LAN utilize speech recognition to measure the quality of voice transmission. A first aspect of the invention involves determining the suitability of the LAN for voice communications. A voice sample is selected by a first terminal and is transmitted to a second terminal on the LAN a number of times. The first terminal introduces an incrementally larger quantity of noise into each transmission of the voice packet. The second terminal performs speech recognition for each successively received voice sample and determines the accuracy for each speech recognition session. The amount of noise which drops the speech recognition accuracy below a threshold level provides a measure of the suitability of the LAN for voice communications. During normal operation of the LAN, speech recognition accuracy tests are performed between various endpoints to monitor voice transmission quality.
    Type: Grant
    Filed: September 25, 2002
    Date of Patent: August 15, 2006
    Assignee: Siemens Communications, Inc.
    Inventors: Branislav Ivanic, Eli Jacobi, Peter Kozdon, Noboru Nishiya, Christoph A. Aktas
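The noise-sweep procedure in the abstract above can be illustrated with a minimal sketch. It assumes a hypothetical callback `accuracy_at` that stands in for one transmit-and-recognize round trip at a given noise level; the toy accuracy model below is invented for illustration and does not come from the patent.

```python
from typing import Callable, Optional, Sequence


def find_noise_tolerance(
    noise_levels: Sequence[float],
    accuracy_at: Callable[[float], float],
    threshold: float,
) -> Optional[float]:
    """Return the smallest noise level at which accuracy drops below threshold.

    noise_levels: candidate noise amounts, tried in increasing order.
    accuracy_at:  hypothetical stand-in for "send the sample with this much
                  noise and measure speech recognition accuracy".
    Returns None if accuracy never falls below the threshold.
    """
    for level in sorted(noise_levels):
        if accuracy_at(level) < threshold:
            return level
    return None


def toy_accuracy(noise: float) -> float:
    """Invented model: accuracy falls linearly as noise increases."""
    return max(0.0, 1.0 - 0.1 * noise)
```

For example, `find_noise_tolerance([0, 1, 2, 3, 4, 5], toy_accuracy, 0.75)` stops at the first level where the toy model's accuracy falls under 0.75. A larger returned value would suggest a network that tolerates more added noise.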
  • Patent number: 7092883
    Abstract: Systems and methods for determining word confidence scores. Speech recognition systems generate a word lattice for speech input. Posterior probabilities of the words in the word lattice are determined using a forward-backward algorithm. Next, time slots are defined for the word lattice, and for all transitions that at least partially overlap a particular time slot, the posterior probabilities of transitions that have the same word label are combined for those transitions. The combined posterior probabilities are used as confidence scores. A local entropy can be computed on the competitor transitions of a particular time slot and also used as a confidence score.
    Type: Grant
    Filed: September 25, 2002
    Date of Patent: August 15, 2006
    Assignee: AT&T
    Inventors: Roberto Gretter, Giuseppe Riccardi
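As a rough sketch of the time-slot step described in the abstract above: the code assumes the lattice transitions already carry posterior probabilities (for instance from a forward-backward pass, which is not shown), and represents each transition as a hypothetical `(word, start, end, posterior)` tuple. The entropy normalization is an assumption, not taken from the patent.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical representation: (word label, start time, end time, posterior).
Transition = Tuple[str, float, float, float]


def slot_confidences(
    transitions: Iterable[Transition], slot_start: float, slot_end: float
) -> Dict[str, float]:
    """Sum posteriors per word label over transitions overlapping the slot."""
    scores: Dict[str, float] = defaultdict(float)
    for word, start, end, posterior in transitions:
        # A transition overlaps the slot if the two intervals intersect.
        if start < slot_end and end > slot_start:
            scores[word] += posterior
    return dict(scores)


def local_entropy(scores: Dict[str, float]) -> float:
    """Entropy (in nats) of the competing word scores in one time slot."""
    total = sum(scores.values())
    if total <= 0:
        return 0.0
    entropy = 0.0
    for score in scores.values():
        p = score / total  # normalize so the competitors sum to 1
        if p > 0:
            entropy -= p * math.log(p)
    return entropy
```

With two overlapping "hello" transitions (posteriors 0.4 and 0.3) and one "yellow" transition (0.3) in the same slot, the combined score for "hello" is about 0.7. A low entropy indicates that one word dominates its competitors in that slot.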
  • Patent number: 7089176
    Abstract: A method, system and computer readable medium for increasing the audio perceptual loudness includes shifting at least one frequency of a first audio signal to create a second audio signal so as to increase the audio perceptual loudness. The power level of the second audio signal is not more than a power level of the first audio signal. The method also includes generating high-audio perceptual loudness tone alert sequences based on psychoacoustic and audiometric data. It further includes acquiring a listener's threshold audio profile; adding the listener's audio profile to the loudness sensitivity curve for producing the listener's tonal sensitivity curve; determining a required dB scaling for critical band tones from the listener's tonal sensitivity curve; normalizing the tonal sensitivity curve for creating a decibel curve; selecting a frequency range of the tones by using the tonal sensitivity curve; and spacing the sequence of tones along a critical band scale.
    Type: Grant
    Filed: March 27, 2003
    Date of Patent: August 8, 2006
    Assignee: Motorola, Inc.
    Inventors: Marc Andre Boillot, Dennis Anson, Audley F. Patterson
  • Patent number: 7089187
    Abstract: A voice synthesizing system keeps both the required amount of calculation and the required file size small. The system includes a compressed pitch segment database storing compressed voice waveform segments; a pitch developing portion that, when a voice waveform segment needed for voice waveform synthesis is requested, reads the segment from the database and decompresses it to reproduce the original voice waveform segment; and a cache processing portion that temporarily stores voice waveform segments already used in voice waveform synthesis. When a voice waveform segment needed for synthesis is requested, the cache processing portion returns it to the requester if it is already stored; otherwise, it obtains the segment from the database via the pitch developing portion, holds the obtained segment, and returns it to the requester.
    Type: Grant
    Filed: September 26, 2002
    Date of Patent: August 8, 2006
    Assignee: NEC Corporation
    Inventors: Reishi Kondo, Hiroaki Hattori
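The cache behavior in the abstract above can be sketched as follows. This is only an illustration: `zlib` stands in for whatever compression the system actually uses, and the class name, the dictionary-backed database, and the `decompress_calls` counter are assumptions introduced here.

```python
import zlib
from typing import Dict


class SegmentCache:
    """Return decompressed waveform segments, decompressing each at most once."""

    def __init__(self, compressed_db: Dict[str, bytes]) -> None:
        self._db = compressed_db            # stand-in for the compressed database
        self._cache: Dict[str, bytes] = {}  # segments already used in synthesis
        self.decompress_calls = 0           # counter for illustration only

    def get(self, segment_id: str) -> bytes:
        # Cache hit: return the stored segment without touching the database.
        if segment_id in self._cache:
            return self._cache[segment_id]
        # Cache miss: read and decompress (the "pitch developing" step),
        # hold the result, then return it to the requester.
        self.decompress_calls += 1
        segment = zlib.decompress(self._db[segment_id])
        self._cache[segment_id] = segment
        return segment
```

Requesting the same segment twice decompresses it only once, which is where the saving in calculation would come from. This sketch never evicts entries; a real system would presumably bound the cache size.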
  • Patent number: 7082391
    Abstract: A system for responding to spoken commands enables a computer system to avoid some synchronization problems. In certain embodiments, object oriented programming may be utilized to create an object which seamlessly coordinates between an application program and a speech engine. An object may respond to a given command which is applied tactilely or through spoken commands in a fashion which allows the application to act in the same way whatever the form of the input. In this way, an application may efficiently utilize spoken commands. For example, the spoken commands may be recognized as applying to one of two currently active tasks or windows. Through coordination with the application itself, the speech engine can be advised of a sub-vocabulary within its overall vocabulary which should be consulted in connection with certain ongoing commands. This improves the accuracy of the speech recognition process, since the spoken words need only be compared to a smaller sub-vocabulary for speech recognition purposes.
    Type: Grant
    Filed: July 14, 1998
    Date of Patent: July 25, 2006
    Assignee: Intel Corporation
    Inventor: John W. Merrill
  • Patent number: 7080007
    Abstract: An apparatus and a method for computing a Speech Absence Probability (SAP), and an apparatus and a method for removing noise by using the SAP computing device and method are provided.
    Type: Grant
    Filed: September 25, 2002
    Date of Patent: July 18, 2006
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Chang-yong Son, Vladimir Shin, Sang-ryong Kim
  • Patent number: 7080016
    Abstract: Audio information is read out of a recording medium which records musical pieces as audio information. The audio information read out is processed to detect BPM values and positions of beats in the musical piece. A musical piece is reproduced from the recording medium by reproducing audio information in accordance with the detected BPM values and positions of beats. The BPM value indicates the tempo of a musical piece, and the beat indicates a strength of a sound which repeatedly appears in each musical piece. When the audio information is reproduced in accordance with the detected BPM values and positions of beats, the musical piece is reproduced at the correct tempo and beats without giving an unnatural feeling to the listener.
    Type: Grant
    Filed: September 23, 2002
    Date of Patent: July 18, 2006
    Assignee: Pioneer Corporation
    Inventors: Masahiko Miyashita, Koji Ogura, Kensuke Chiba, Takeaki Funada
  • Patent number: 7069211
    Abstract: A voice channel data processor (207) and corresponding method (600), operable in the receiver and transmitter of a wireless communications unit (200), facilitate data transmission on a voice channel. An encoder (301) encodes data traffic as a transmit voice frame having a predetermined vocoder parameter and inserts the transmit voice frame into a stream of transmit voice frames with voice traffic. A decoder (303) parses a stream of received voice frames to obtain a vocoder parameter for each, compares the vocoder parameter for each received frame to the predetermined vocoder parameter, routes the received voice frame for processing as data traffic when the comparison is favorable, and otherwise routes the received voice frame for processing as voice traffic.
    Type: Grant
    Filed: April 30, 2003
    Date of Patent: June 27, 2006
    Assignee: Motorola, Inc.
    Inventors: Gordon W. Chiu, Daniel J. Landron, Vincent Vigna, Chin P. Wong, David R. Heeschen
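A minimal sketch of the encode-and-route idea in the abstract above, assuming a frame is just a vocoder parameter plus a payload. The `VoiceFrame` type, the `DATA_MARKER` value, and treating "favorable comparison" as simple equality are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Hypothetical predetermined vocoder parameter value that marks data frames.
DATA_MARKER = 0x7F


@dataclass
class VoiceFrame:
    vocoder_param: int
    payload: bytes


def encode_data(payload: bytes) -> VoiceFrame:
    """Wrap data traffic in a voice frame carrying the predetermined parameter."""
    return VoiceFrame(vocoder_param=DATA_MARKER, payload=payload)


def route(frames: Iterable[VoiceFrame]) -> Tuple[List[bytes], List[bytes]]:
    """Split received frames into (data payloads, voice payloads)."""
    data: List[bytes] = []
    voice: List[bytes] = []
    for frame in frames:
        # Assumption: the comparison is "favorable" when the values are equal.
        if frame.vocoder_param == DATA_MARKER:
            data.append(frame.payload)
        else:
            voice.append(frame.payload)
    return data, voice
```

Interleaving `encode_data(b"msg")` with ordinary voice frames and passing the stream to `route` yields the data payloads in one list and the voice payloads in the other, in their original order.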
  • Patent number: 7062439
    Abstract: A speech synthesizer has a language generator for generating a text-form utterance from input semantic information and a text-to-speech converter for converting the text-form utterance into speech form. The overall quality of the speech-form utterance produced by the text-to-speech converter is assessed and, if judged inadequate, the language generator is triggered to produce a new version of the text-form utterance. The assessment of the overall quality of the speech-form utterance is preferably effected by a classifier fed with feature values generated during the conversion process operated by the text-to-speech converter.
    Type: Grant
    Filed: August 11, 2003
    Date of Patent: June 13, 2006
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Paul St John Brittan, Roger Cecil Ferry Tucker
  • Patent number: 7058572
    Abstract: Acoustic noise for wireless or landline telephony is reduced through optimal filtering in which each frequency band of every time frame is filtered as a function of the estimated signal-to-noise ratio and the estimated total noise energy for the frame. Non-speech bands and other special frames are further attenuated by one or more predetermined multiplier values. Noise in a transmitted signal formed of frames each formed of frequency bands is reduced. A respective total signal energy and a respective current estimate of the noise energy for at least one of the frequency bands are determined. A respective local signal-to-noise ratio for at least one of the frequency bands is determined as a function of the respective signal energy and the respective current estimate of the noise energy. A respective smoothed signal-to-noise ratio is determined from the respective local signal-to-noise ratio and another respective signal-to-noise ratio estimated for a previous frame.
    Type: Grant
    Filed: January 28, 2000
    Date of Patent: June 6, 2006
    Assignee: Nortel Networks Limited
    Inventor: Elias J. Nemer
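The per-band SNR steps in the abstract above might look like the sketch below. The abstract does not give the formulas, so both the ratio form of the local SNR and the first-order recursive smoothing with weight `alpha` are assumptions, as are the function names.

```python
def local_snr(signal_energy: float, noise_energy: float, floor: float = 1e-12) -> float:
    """Local SNR for one frequency band, assumed here to be a plain ratio.

    The floor guards against division by zero when the noise estimate is 0.
    """
    return signal_energy / max(noise_energy, floor)


def smoothed_snr(local: float, previous: float, alpha: float = 0.9) -> float:
    """Blend the previous frame's SNR estimate with the current local SNR.

    alpha close to 1 weights the previous frame heavily (more smoothing);
    this recursive form is an assumed choice, not taken from the patent.
    """
    return alpha * previous + (1.0 - alpha) * local
```

For a band with signal energy 8.0 and noise estimate 2.0 the local SNR is 4.0. With a previous-frame SNR of 2.0 and `alpha=0.5`, the smoothed value lies halfway between the two.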
  • Patent number: 7058576
    Abstract: The invention relates to HMM-based speech recognition in which vector quantization is performed and an output probability is obtained by table reference, minimizing the amount of computation and memory use while achieving high recognition performance. Exemplary codebooks used for vector quantization can be provided as follows: if phonemes are used as subwords, a codebook is provided for each phoneme, such that codebook CB1 is the codebook for phoneme /a/ and codebook CB2 is the codebook for phoneme /i/, and these codebooks are associated with the respective phoneme HMMs.
    Type: Grant
    Filed: July 18, 2002
    Date of Patent: June 6, 2006
    Assignee: Seiko Epson Corporation
    Inventors: Yasunaga Miyazawa, Hiroshi Hasegawa
  • Patent number: 7058578
    Abstract: A media handler is used in a transaction processing system, where the system is configured to route incoming calls from callers to agents associated with the transaction processing system, and the incoming calls are based on voice-mode communication and text-mode communication. The media handler includes a media translator operatively incorporated into the transaction processing system and configured to facilitate translation between the voice-mode communication and the text-mode communication. An agent preference setting is selectable by the agent between a voice-mode and a text-mode. Also included is a speech recognition unit configured to convert the voice-mode communication to the text-mode communication and a speech synthesizer configured to convert the text-mode communication to the voice-mode communication.
    Type: Grant
    Filed: September 24, 2002
    Date of Patent: June 6, 2006
    Assignee: Rockwell Electronic Commerce Technologies, L.L.C.
    Inventors: Mark J. Michelson, Roger A. Sumner, Mark J. Power, Carlo Bonifazi, Jeffrey D. Hodson, Craig R. Shambaugh, Robert P. Beckstrom, Anthony J. Dezonno
  • Patent number: 7054818
    Abstract: A multi-modal system enables documents to be interpreted as voice browser-based documents, visual browser-based documents, or both, for a thin client such as a portable telephone. Special techniques and additions are incorporated into the document to enable it to be converted between voice markup and visual markup languages. In addition, the document can be viewed simultaneously in both the voice markup and the visual markup languages. Special techniques are used to keep track of the browsing position within the document.
    Type: Grant
    Filed: January 14, 2004
    Date of Patent: May 30, 2006
    Assignee: V-Enablo, Inc.
    Inventors: Dipanshu Sharma, Sunil Kumar, Chandra Kholia
  • Patent number: 7047197
    Abstract: The invention generally relates to a method, apparatus, and system capable of changing a voice user interface possessing both operational characteristics and security characteristics based upon user-specific contextual information. The voice processing system consists of at least the following components: a voice user interface possessing both operational characteristics and security characteristics; a database to store user-specific contextual information; and a computer program to use the user-specific contextual information to dynamically change the operational characteristics of the voice user interface.
    Type: Grant
    Filed: September 29, 2000
    Date of Patent: May 16, 2006
    Assignee: Intel Corporation
    Inventor: Steven M. Bennett
  • Patent number: 7047184
    Abstract: A speech coding apparatus comprises a repetition period pre-selecting unit for generating a plurality of candidates for the repetition period of a driving excitation source by multiplying the repetition period of an adaptive excitation source by a plurality of constant numbers, respectively, and for pre-selecting a predetermined number of candidates from all the candidates generated. A driving excitation source coding unit provides both excitation source location information and excitation source polarity information that minimize a coding distortion, for each of the predetermined number of candidates, and provides an evaluation value associated with the minimum coding distortion for each of the predetermined number of candidates.
    Type: Grant
    Filed: November 7, 2000
    Date of Patent: May 16, 2006
    Assignee: Mitsubishi Denki Kabushiki Kaisha
    Inventors: Hirohisa Tasaki, Tadashi Yamaura
  • Patent number: 7047195
    Abstract: A translation device is realized that combines the advantages of a table look-up translation device with those of a machine translation device by leading the user's utterance through a sentence template suited to the user's intent of speech. Since the translation device searches for sentence templates suitable for the user's intent of speech with an orally inputted keyword and displays the retrieved sentences, the user's utterance can be led. In addition, the user is spared the troublesome manipulation of replacing a word, since an expression uttered by the user is inserted into a replaceable portion (slot) within the sentence template, and the translation device translates the resulting sentence with the replaced expression embedded in the slot.
    Type: Grant
    Filed: January 26, 2005
    Date of Patent: May 16, 2006
    Assignee: Hitachi, Ltd.
    Inventors: Atsuko Koizumi, Hiroyuki Kaji, Yasunari Obuchi, Yoshinori Kitahara
  • Patent number: 7047183
    Abstract: A method and apparatus perform semantic parsing by designating one or more words in an input text stream as wildcards. Under some embodiments, partially constructed parses formed from other words in the text stream are used to control when a later word will be identified as a wildcard. In particular, if a partial parse is expecting a semantic token that begins with a wildcard, the next word in the input text segment is designated as a wildcard term.
    Type: Grant
    Filed: August 21, 2001
    Date of Patent: May 16, 2006
    Assignee: Microsoft Corporation
    Inventor: YeYi Wang