Patents Examined by V. Paul Harper
  • Patent number: 7013271
    Abstract: A method and system for implementing a low complexity spectrum estimation technique for comfort noise generation are disclosed. Another aspect of the present invention involves segregating filter parameter encoding from an adaptation process for transmission in the form of silence insertion descriptors. A method for implementing a spectrum estimation for comfort noise generation comprises the steps of receiving an input noise signal; approximating a spectrum of the input noise signal using an algorithm over a period of time; detecting an absence of speech signals; and generating comfort noise based on the approximating step when the absence of speech signals is detected; wherein the spectrum of the input noise signal is substantially constant over the period of time.
    Type: Grant
    Filed: June 5, 2002
    Date of Patent: March 14, 2006
    Assignee: GlobespanVirata Incorporated
    Inventor: Vasudev S. Nayak
  • Patent number: 7013265
    Abstract: A language processing system includes a unified language model. The unified language model comprises a plurality of context-free grammars having non-terminal tokens representing semantic or syntactic concepts and terminals, and an N-gram language model having non-terminal tokens. A language processing module capable of receiving an input signal indicative of language accesses the unified language model to recognize the language. The language processing module generates hypotheses for the received language as a function of words of the unified language model and/or provides an output signal indicative of the language and at least some of the semantic or syntactic concepts contained therein.
    Type: Grant
    Filed: December 3, 2004
    Date of Patent: March 14, 2006
    Assignee: Microsoft Corporation
    Inventors: Xuedong D. Huang, Milind V. Mahajan, Ye-Yi Wang, Xiaolong Mou
  • Patent number: 7003458
    Abstract: An automated voice pattern filtering method implemented in a system having a client side and a server side is disclosed. At the client side, a speech signal is transformed into a first set of spectral parameters which are encoded into a set of spectral shapes that are compared to a second set of spectral parameters corresponding to one or more keywords. From the comparison, the client side determines if the speech signal is acceptable. If so, spectral information indicating a difference in a voice pattern between the speech signal and the keyword(s) is encoded and utilized as a basis to generate a voice pattern filter.
    Type: Grant
    Filed: January 15, 2002
    Date of Patent: February 21, 2006
    Assignee: General Motors Corporation
    Inventors: Kai-Ten Feng, Jane F. MacFarlane, Stephen C. Habermas
  • Patent number: 6999922
    Abstract: The present invention (110) permits a user to speed up and slow down speech without changing the speaker's pitch (102, 110, 112, 128, 402–416). It is a user-adjustable feature to change the spoken rate to the listener's preferred listening rate or comfort. It can be included on the phone as a customer convenience feature without changing any characteristics of the speaker's voice besides the speaking rate with soft key button (202) combinations (in interconnect or normal). From the user's perspective, it would seem only that the talker changed his speaking rate, and not that the speech was digitally altered in any way. The pitch and general prosody of the speaker are preserved. The following uses of the time expansion/compression feature are listed to complement already existing technologies or applications in progress: messaging services, messaging applications and games, and a real-time feature to slow down the listening rate.
    Type: Grant
    Filed: June 27, 2003
    Date of Patent: February 14, 2006
    Assignee: Motorola, Inc.
    Inventors: Marc Andre Boillot, John Gregory Harris, Thomas Lawrence Reinke
  • Patent number: 6988068
    Abstract: A method of automatically adjusting volume of speech generated by a text-to-speech application can include measuring an ambient noise level of an audio environment. A target volume for speech output generated by a text-to-speech application can be calculated based in part upon the ambient noise level. A volume of speech generated by the text-to-speech application can be automatically adjusted responsive to the performed calculation.
    Type: Grant
    Filed: March 25, 2003
    Date of Patent: January 17, 2006
    Assignee: International Business Machines Corporation
    Inventors: Francis Fado, Peter J. Guasti
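    Illustration (not from the patent): the measure, calculate, adjust loop in the abstract above can be sketched as follows. The function names, the fixed margin above the noise floor, and the clamping limits are all assumptions made for this example; the patent's actual calculation is not given in the abstract.

```python
import math


def rms_dbfs(samples):
    """Level of a block of samples in dB relative to full scale (1.0).

    Hypothetical helper: returns -inf for an empty or all-zero block.
    """
    if not samples:
        return float("-inf")
    mean_sq = sum(s * s for s in samples) / len(samples)
    if mean_sq == 0:
        return float("-inf")
    return 10 * math.log10(mean_sq)


def target_volume_db(noise_db, margin_db=15.0, min_db=-40.0, max_db=-3.0):
    """Target speech level: ambient noise plus a margin, clamped to a range.

    margin_db, min_db and max_db are illustrative values, not patent values.
    """
    return max(min_db, min(max_db, noise_db + margin_db))


# Measure the ambient level, then derive the volume for the next utterance.
ambient = [0.1, -0.1, 0.1, -0.1]
volume = target_volume_db(rms_dbfs(ambient))
```

    A quiet room falls back to the lower clamp, and a very loud one is capped at the upper clamp so the output never clips.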
  • Patent number: 6988069
    Abstract: An arrangement is provided for generating a reduced unit database of a desired size to be used in text to speech operations. A reduced unit database with a desired size is generated based on a full unit database. The reduction is carried out with respect to a text database with a plurality of sentences. Units from the full database are pruned to minimize an overall cost associated with using alternative units other than the units in the reduced unit database.
    Type: Grant
    Filed: January 31, 2003
    Date of Patent: January 17, 2006
    Assignee: Speechworks International, Inc.
    Inventor: Michael Stuart Phillips
  • Patent number: 6983240
    Abstract: A method generates normalized representations of strings, in particular sentences. The method, which can be used for translation, receives an input string. The input string is subjected to a first operation out of a plurality of operating functions for linguistically processing the input string to generate a first normalized representation of the input string that includes linguistic information. The first normalized representation is then subjected to a second operation for replacing linguistic information in the first normalized representation by abstract variables and to generate a second normalized representation.
    Type: Grant
    Filed: December 18, 2000
    Date of Patent: January 3, 2006
    Assignee: Xerox Corporation
    Inventors: Salah Ait-Mokhtar, Jean-Pierre Chanod, Eric Gaussier
  • Patent number: 6983243
    Abstract: A multiple description coder generates a number of different descriptions of a given portion of a signal in a wireless communication system, using multiple description scalar quantization (MDSQ) or another type of multiple description coding. The different descriptions of the given portion of the signal are then arranged into packets such that at least a first description of the given portion is placed in a first packet and a second description is placed in a second packet. Each of the packets is then transmitted using a frequency hopping modulator, and the hopping rate of the modulator is selected or otherwise configured based at least in part on the number of descriptions generated for the different portions of the signal.
    Type: Grant
    Filed: October 27, 2000
    Date of Patent: January 3, 2006
    Assignee: Lucent Technologies Inc.
    Inventors: Vivek K. Goyal, Jelena Kovacevic, Francois Masson
  • Patent number: 6980953
    Abstract: A system and method are provided for real time transcription or translation services. In one aspect of the system and method, a user requests transcription/translation service with certain service parameters. It is determined what resources can be used for such service, and, if all the service parameters can be met, the service is performed. Resources include live stenographers or translators, and computer processing power. If all the service parameters cannot be met, it is determined whether to perform the service by meeting only some of the service parameters. These determinations may be programmed into the system beforehand. In another aspect of the system and method, a user makes a request for transcription/translation service, and the request is displayed so that stenographers or translators may make bids to perform the transcription/translation service. In some embodiments, the request is only displayed to those stenographers or translators who are determined to be able to perform the service.
    Type: Grant
    Filed: October 31, 2000
    Date of Patent: December 27, 2005
    Assignee: International Business Machines Corp.
    Inventors: Dimitri Kanevsky, Sara H. Basson, Edward Adam Epstein, Alexander Zlatsin
  • Patent number: 6978240
    Abstract: A speech translation system and method are disclosed utilizing a holographic storage medium having a plurality of frames therein, each frame containing one or more discrete speech wave forms thereon for comparison with a spoken word wave form to select a wave form of a second language of equivalent meaning which through a digital audio player can be made audible for speech translation.
    Type: Grant
    Filed: January 31, 2002
    Date of Patent: December 20, 2005
    Inventor: Gregory R. Brotz
  • Patent number: 6973424
    Abstract: A speech coder capable of achieving an excellent sound quality even at a low bit rate. A mode judging circuit 800 of the speech coder judges a mode by the use of a feature quantity of an input speech signal for each subframe. In case of a predetermined mode, an excitation quantization circuit 350 searches combinations of every code vector stored in codebooks 351 and 352 for simultaneously quantizing amplitudes or polarities of a plurality of pulses and each of a plurality of shift amounts for temporally shifting predetermined pulse positions, and selects a combination of the code vector and the shift amount which minimizes distortion from an input speech. A gain quantization circuit 365 quantizes a gain by the use of a gain codebook 380.
    Type: Grant
    Filed: June 29, 1999
    Date of Patent: December 6, 2005
    Assignee: NEC Corporation
    Inventor: Kazunori Ozawa
  • Patent number: 6973425
    Abstract: The invention concerns a method and apparatus for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder that does not have a built-in or standard FEC process. A receiver with a decoder receives encoded frames of compressed speech information transmitted from an encoder. A lost frame detector at the receiver determines if an encoded frame has been lost or corrupted in transmission, or erased. If the encoded frame is not erased, the encoded frame is decoded by a decoder and a temporary memory is updated with the decoder's output. A predetermined delay period is applied and the audio frame is then output. If the lost frame detector determines that the encoded frame is erased, a FEC module applies a frame concealment process to the signal. The FEC processing produces natural sounding synthetic speech for the erased frames.
    Type: Grant
    Filed: April 19, 2000
    Date of Patent: December 6, 2005
    Assignee: AT&T Corp.
    Inventor: David A. Kapilow
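    Illustration (not from the patent): the receive path in the abstract above (decode good frames and update a memory, conceal erased ones) can be sketched as below. The concealment used here, repeating the last good frame with increasing attenuation, is a deliberately simple stand-in chosen for this example and is not the patent's method; the function name, the use of None to mark an erased frame, and the attenuation value are assumptions. The predetermined output delay is omitted.

```python
def receive(frames, decode, attenuation=0.5):
    """Decode a stream of frames, concealing erased ones.

    frames: iterable of encoded frames; None marks an erased frame
            (an assumption of this sketch).
    decode: callable turning an encoded frame into a list of samples.
    """
    history = None   # last successfully decoded audio frame
    gain = 1.0       # decays while consecutive frames are erased
    output = []
    for frame in frames:
        if frame is not None:
            audio = decode(frame)
            history = audio      # update the temporary memory
            gain = 1.0
        else:
            gain *= attenuation
            # Stand-in concealment: replay the last good frame, faded.
            audio = [gain * s for s in history] if history is not None else []
        output.append(audio)
    return output


# Example with an identity "decoder": frames 2 and 3 were lost in transit.
played = receive([[1.0, 2.0], None, None, [4.0]], decode=list)
```

    Keeping the concealment outside the decoder mirrors the abstract's setting of a coder with no built-in concealment process: only decoded output and a history buffer are needed.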
  • Patent number: 6963840
    Abstract: In a speech recognition system, a method for using multiple cursors for dictation and correction of text can include a series of steps. The steps can include detecting whether a correction marker has been included within a body of text and searching for a user specified portion of text to be corrected within the body of text. The method further can include selecting the user specified portion of text and substituting an alternate user specified portion of text for the user specified portion of text within the body of text. The step of locating the correction marker within the body of text at a location defined by the alternate user specified portion of text further can be included. Additionally, the method can include the step of relocating an insertion cursor to the end of the body of text.
    Type: Grant
    Filed: January 12, 2001
    Date of Patent: November 8, 2005
    Assignee: International Business Machines Corporation
    Inventor: Charles Sumner
  • Patent number: 6961697
    Abstract: The invention concerns a method and apparatus for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder that does not have a built-in or standard FEC process. A receiver with a decoder receives encoded frames of compressed speech information transmitted from an encoder. A lost frame detector at the receiver determines if an encoded frame has been lost or corrupted in transmission, or erased. If the encoded frame is not erased, the encoded frame is decoded by a decoder and a temporary memory is updated with the decoder's output. A predetermined delay period is applied and the audio frame is then output. If the lost frame detector determines that the encoded frame is erased, a FEC module applies a frame concealment process to the signal. The FEC processing produces natural sounding synthetic speech for the erased frames.
    Type: Grant
    Filed: April 19, 2000
    Date of Patent: November 1, 2005
    Assignee: AT&T Corp.
    Inventor: David A. Kapilow
  • Patent number: 6952668
    Abstract: The invention concerns a method and apparatus for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder that does not have a built-in or standard FEC process. A receiver with a decoder receives encoded frames of compressed speech information transmitted from an encoder. A lost frame detector at the receiver determines if an encoded frame has been lost or corrupted in transmission, or erased. If the encoded frame is not erased, the encoded frame is decoded by a decoder and a temporary memory is updated with the decoder's output. A predetermined delay period is applied and the audio frame is then output. If the lost frame detector determines that the encoded frame is erased, a FEC module applies a frame concealment process to the signal. The FEC processing produces natural sounding synthetic speech for the erased frames.
    Type: Grant
    Filed: April 19, 2000
    Date of Patent: October 4, 2005
    Assignee: AT&T Corp.
    Inventor: David A. Kapilow
  • Patent number: 6947896
    Abstract: A method for marking dictated text for deferred correction or review of dictated text in a speech recognition system proofreader, comprising the steps of: displaying previously dictated text; sequentially highlighting words in the text; selectively establishing a mark for different ones of the sequentially highlighted words responsive to user commands; and, storing the marks in an ordered list, each of the marks including a current position and length of a corresponding marked word, whereby the marked words can be later recalled for correction in accordance with the ordered list. The method can further comprise the steps of: displaying the previously dictated text in a first display window; sequentially displaying in a second display window a portion of the previously dictated text including the sequentially highlighted word; and, sequentially displaying in a third display window within the second display window the sequentially highlighted word.
    Type: Grant
    Filed: January 15, 2002
    Date of Patent: September 20, 2005
    Assignee: International Business Machines Corporation
    Inventor: Gary Robert Hanson
  • Patent number: 6931372
    Abstract: The invention provides methods and apparatus for processing information, e.g., audio, video or image information, for transmission in a communication system. In an illustrative embodiment, a joint multiple program coder determines the value of a single-bit or multiple-bit criticality measure, e.g., criticality flag, in a designated interval, e.g., the duration of an audio frame, for each of the programs in a set of multiple programs to be transmitted in the system. The joint multiple program coder allocates a pool of available bits to the programs based at least in part on the determined values of the criticality measures, such that a program with a higher-valued criticality measure in the designated time interval is allocated a greater percentage of the available bits for that interval than another one of the programs with a lower-valued criticality measure. The joint multiple program coder repeats the determination and allocation operations for each of a number of time intervals, e.g.
    Type: Grant
    Filed: January 27, 1999
    Date of Patent: August 16, 2005
    Assignee: Agere Systems Inc.
    Inventors: Deepen Sinha, Carl-Erik Wilhelm Sundberg
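    Illustration (not from the patent): one way to give programs with a higher criticality measure a larger share of a bit pool in each interval. The proportional weighting rule, the base weight, and the function name are assumptions for this sketch; the abstract only requires that a higher-valued measure receive a greater percentage of the available bits.

```python
def allocate_bits(pool, criticality, base_weight=1):
    """Split `pool` bits among programs for one time interval.

    criticality: one non-negative integer per program (for example a
                 single-bit flag, 0 or 1).
    base_weight: hypothetical floor so non-critical programs still get bits.
    """
    if not criticality:
        return []
    weights = [base_weight + c for c in criticality]
    total = sum(weights)
    # Integer shares first, then hand leftover bits to the most critical.
    shares = [pool * w // total for w in weights]
    leftover = pool - sum(shares)
    by_weight = sorted(range(len(weights)), key=lambda i: -weights[i])
    for i in by_weight[:leftover]:
        shares[i] += 1
    return shares


# Three programs; the second is flagged critical in this interval.
frame_bits = allocate_bits(1000, [0, 1, 0])
```

    Calling the function once per interval with fresh criticality values corresponds to the repeated determination and allocation described in the abstract.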
  • Patent number: 6915257
    Abstract: This invention presents a voicing determination algorithm for classification of a speech signal segment as voiced or unvoiced. The algorithm is based on a normalized autocorrelation where the length of the window is proportional to the pitch period. The speech segment to be classified is further divided into a number of sub-segments, and the normalized autocorrelation is calculated for each sub-segment. If a certain number of the normalized autocorrelation values are above a predetermined threshold, the speech segment is classified as voiced. To improve the performance of the voicing determination algorithm in unvoiced to voiced transients, the normalized autocorrelations of the last sub-segments are emphasized. The performance of the voicing decision algorithm can also be enhanced by utilizing any available lookahead information.
    Type: Grant
    Filed: December 21, 2000
    Date of Patent: July 5, 2005
    Assignee: Nokia Mobile Phones Limited
    Inventors: Ari Heikkinen, Samuli Pietila, Vesa Ruoppila
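    Illustration (not from the patent): a minimal sketch of the voicing decision in the abstract above, with a window proportional to the pitch period, one normalized autocorrelation per sub-segment, and a count against a threshold. The parameter values, the way sub-segments are laid out, and the function names are assumptions; the emphasis on the last sub-segments and the lookahead are left out.

```python
import math


def normalized_autocorrelation(x, lag):
    """Normalized correlation between x and itself shifted by `lag`."""
    a = x[lag:]
    b = x[:len(x) - lag]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0 else 0.0


def is_voiced(segment, pitch_period, num_subsegments=4,
              window_factor=2, threshold=0.5, min_count=2):
    """Classify a segment as voiced if enough sub-segments correlate."""
    window_len = window_factor * pitch_period  # proportional to the pitch
    hop = len(segment) // num_subsegments
    count = 0
    for i in range(num_subsegments):
        start = i * hop
        end = start + window_len + pitch_period
        if end > len(segment):
            break  # not enough samples left for a full window
        window = segment[start:end]
        if normalized_autocorrelation(window, pitch_period) >= threshold:
            count += 1
    return count >= min_count


# A tone whose period matches the pitch estimate is classified as voiced.
tone = [math.sin(2 * math.pi * n / 40) for n in range(400)]
voiced = is_voiced(tone, pitch_period=40)
```

    Tying the window length to the pitch period keeps roughly the same number of pitch cycles in every correlation, whatever the speaker's pitch.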
  • Patent number: 6910004
    Abstract: The invention relates to a method and a computer system for enhanced part-of-speech (POS-) tagging as well as grammatically disambiguating a phrase. A phrase is usually a short multiword expression that may be ambiguous. By introducing grammatical constraints the invention supports POS-tagging as well as grammatically disambiguating the phrase. According to an identifier for the phrase, the phrase is supplemented with artificial context information. The supplemented phrase is then POS-tagged or grammatically disambiguated. Important applications are POS-tagging, Automatic Term Encoding, Headword Detection and Information Retrieval.
    Type: Grant
    Filed: December 19, 2000
    Date of Patent: June 21, 2005
    Assignee: Xerox Corporation
    Inventors: Nelly Tarbouriech, Herve Poirier
  • Patent number: 6898567
    Abstract: A system and method for multi-level distributed speech recognition includes a terminal (122) having a terminal speech recognizer (136) coupled to a microphone (130). The terminal speech recognizer (136) receives an audio command (37), generating at least one terminal recognized audio command having a terminal confidence value. A network element (124) having at least one network speech recognizer (150) also receives the audio command (149), generating at least one network recognized audio command having a network confidence value. A comparator (152) receives the recognized audio commands, comparing the speech recognition confidence values. The comparator (152) provides an output (162) to a dialog manager (160) of at least one recognized audio command, wherein the dialog manager then executes an operation based on the at least one recognized audio command, such as presenting the at least one recognized audio command to a user for verification or accessing a content server.
    Type: Grant
    Filed: December 29, 2001
    Date of Patent: May 24, 2005
    Assignee: Motorola, Inc.
    Inventor: Senaka Balasuriya
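    Illustration (not from the patent): the comparator step in the abstract above can be pictured as merging the terminal and network hypotheses and ranking them by confidence. The record type, field names, and the optional confidence floor are assumptions made for this sketch.

```python
from typing import NamedTuple


class Recognition(NamedTuple):
    command: str       # recognized audio command
    confidence: float  # recognizer confidence value
    source: str        # "terminal" or "network" (labels assumed here)


def compare(terminal, network, min_confidence=0.0):
    """Merge both recognizers' hypotheses, best confidence first."""
    candidates = [r for r in terminal + network
                  if r.confidence >= min_confidence]
    return sorted(candidates, key=lambda r: r.confidence, reverse=True)


ranked = compare(
    [Recognition("call home", 0.62, "terminal")],
    [Recognition("call Homer", 0.81, "network"),
     Recognition("call home", 0.40, "network")],
    min_confidence=0.5,
)
best = ranked[0]
```

    A dialog manager could act on the top entry directly or show the whole ranked list to the user for verification.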