Patents Examined by V. Paul Harper
  • Patent number: 6898565
    Abstract: A system and terminal for facilitating a “virtual presence” allows users on a communication network to simply begin speaking to other users. The system immediately detects the destination party's name and begins routing the audio signal to the particular destination without any noticeable call set-up. Additionally, the system performs pitch-corrected speed control in order to allow the detection and processing of a speech pattern without causing delay to the end user.
    Type: Grant
    Filed: January 6, 2003
    Date of Patent: May 24, 2005
    Assignee: Intel Corporation
    Inventor: Howard Bubb
  • Patent number: 6892175
    Abstract: Methods and apparatus for encoding an arbitrary digital message, e.g., a watermark, into a speech signal are provided. In one aspect of the invention, a method of embedding digital information in a speech signal comprises the steps of: (i) generating a spread spectrum signal, wherein the spread spectrum signal is representative of the digital information and further wherein the spread spectrum signal is within a frequency bandwidth corresponding to speech; and (ii) embedding the spread spectrum signal in the speech signal. By making use of spread spectrum technology and speech analysis techniques in the signal generation and embedding operations, respectively, significantly higher bit rates can be embedded into the speech signal without affecting the perceived quality of the recording. The invention also provides methods and apparatus for recovering the digital information embedded in the speech signal.
    Type: Grant
    Filed: November 2, 2000
    Date of Patent: May 10, 2005
    Assignee: International Business Machines Corporation
    Inventors: Qiang Cheng, Jeffrey Scott Sorensen
  • Patent number: 6879954
    Abstract: A method is provided for improving pattern matching in a speech recognition system having a plurality of acoustic models. The improved method includes: receiving continuous speech input; generating a sequence of acoustic feature vectors that represent temporal and spectral behavior of the speech input; loading a first group of acoustic feature vectors from the sequence of acoustic feature vectors into a memory workspace accessible to a processor; loading an acoustic model from the plurality of acoustic models into the memory workspace; and determining a similarity measure for each acoustic feature vector of the first group of acoustic feature vectors in relation to the acoustic model. Prior to retrieving another group of acoustic feature vectors, similarity measures are computed for the first group of acoustic feature vectors in relation to each of the acoustic models employed by the speech recognition system.
    Type: Grant
    Filed: April 22, 2002
    Date of Patent: April 12, 2005
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Patrick Nguyen, Luca Rigazio
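    The block-wise evaluation order the abstract describes (score one group of feature vectors against every model before fetching the next group) can be sketched as follows; the negative-squared-distance score and the mean-vector "models" are illustrative stand-ins for real acoustic likelihoods:

```python
def blocked_similarities(feature_vectors, models, block_size=4):
    """Score every vector of one block against every model before the
    next block is loaded (a cache-friendly traversal order)."""
    scores = {}  # (frame_index, model_name) -> similarity
    for start in range(0, len(feature_vectors), block_size):
        block = feature_vectors[start:start + block_size]  # "load" one group
        for name, mean in models.items():                  # one model at a time
            for j, vec in enumerate(block):
                # similarity here: negative squared Euclidean distance
                # to the model's mean vector
                d = sum((a - b) ** 2 for a, b in zip(vec, mean))
                scores[(start + j, name)] = -d
    return scores
```

    Keeping the block resident in fast memory while all models visit it is the point of the loop ordering.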
  • Patent number: 6868381
    Abstract: A speech recognition system having an input for receiving an input signal indicative of a spoken utterance containing at least one speech element. The system further includes a first processing unit operative for processing the input signal to derive from a speech recognition dictionary a speech model associated with a given speech element that constitutes a potential match to the at least one speech element. The system further comprises a second processing unit for generating a modified version of the speech model on the basis of the input signal. The system further provides a third processing unit for processing the input signal on the basis of the modified version of the speech model to generate a recognition result indicative of whether the modified version of the speech model constitutes a match to the input signal.
    Type: Grant
    Filed: December 21, 1999
    Date of Patent: March 15, 2005
    Assignee: Nortel Networks Limited
    Inventors: Stephen Douglas Peters, Daniel Boies, Benoit Dumoulin
  • Patent number: 6865529
    Abstract: A method of estimating the pitch of a speech signal comprises the steps of dividing the speech signal into segments, calculating for each segment a conformity function, and detecting peaks in the conformity function. The method further comprises the steps of estimating an average distance between said peaks, and using the estimated average distance as an estimate of the pitch. In this way a method less complex than prior art methods, and thus suitable for small digital signal processors, is provided. The method also avoids the pitch halving situation. Because the method relies on the fact that the identified peaks in the conformity function show periodic behavior and that the true pitch period corresponds to the distance between the peaks, a simpler algorithm is achieved which provides the true pitch period independent of the occurrence of pitch halving, pitch doubling, etc. A corresponding device is also provided.
    Type: Grant
    Filed: April 5, 2001
    Date of Patent: March 8, 2005
    Assignee: Telefonaktiebolaget L M Ericsson (publ)
    Inventors: Cecilia Brandel, Henrik Johannisson
  • Patent number: 6865528
    Abstract: A language processing system includes a unified language model. The unified language model comprises a plurality of context-free grammars having non-terminal tokens representing semantic or syntactic concepts and terminals, and an N-gram language model having non-terminal tokens. A language processing module capable of receiving an input signal indicative of language accesses the unified language model to recognize the language. The language processing module generates hypotheses for the received language as a function of words of the unified language model and/or provides an output signal indicative of the language and at least some of the semantic or syntactic concepts contained therein.
    Type: Grant
    Filed: June 1, 2000
    Date of Patent: March 8, 2005
    Assignee: Microsoft Corporation
    Inventors: Xuedong D. Huang, Milind V. Mahajan, Ye-Yi Wang, Xiaolong Mou
  • Patent number: 6842730
    Abstract: A method and a system for extracting information from a natural language text corpus based on a natural language query are disclosed. In the method, the natural language text corpus is analyzed with respect to surface structure of word tokens and surface syntactic roles of constituents, and the analyzed natural language text corpus is then indexed and stored. Furthermore, a natural language query is analyzed with respect to surface structure of word tokens and surface syntactic roles of constituents. From the analyzed natural language query, one or more surface variants are then created, where these surface variants are equivalent to the natural language query with respect to lexical meaning of word tokens and surface syntactic roles of constituents.
    Type: Grant
    Filed: June 23, 2000
    Date of Patent: January 11, 2005
    Assignee: Hapax Limited
    Inventors: Eva Ingegord Ejerhed, Peter A. Braroe
  • Patent number: 6839665
    Abstract: A system, method, and computer program for automatically generating text analysis systems is disclosed. Individual passes of a multi-pass text analyzer are created by generating rules from samples supplied by users. Successive passes are created in a cascading fashion by performing partial text analyses employing existing passes. A complete text analyzer interleaves the generated passes with a framework of existing passes. The complete text analysis system can then process texts to identify patterns similar to samples added by users. Generation of rules from samples encompasses a wide range of constructs and granularities that occur in text, from individual words to intrasentential patterns, to sentential, paragraph, section, and other formats that occur in text documents.
    Type: Grant
    Filed: June 27, 2000
    Date of Patent: January 4, 2005
    Assignee: Text Analysis International, Inc.
    Inventor: Amnon Meyers
  • Patent number: 6836758
    Abstract: A method and system for speech recognition combines different types of engines in order to recognize user-defined digits and control words, predefined digits and control words, and nametags. Speaker-independent engines are combined with speaker-dependent engines. A Hidden Markov Model (HMM) engine is combined with Dynamic Time Warping (DTW) engines.
    Type: Grant
    Filed: January 9, 2001
    Date of Patent: December 28, 2004
    Assignee: Qualcomm Incorporated
    Inventors: Ning Bi, Andrew P. DeJaco, Harinath Garudadri, Chienchung Chang, William Yee-Ming Huang, Narendranath Malayath, Suhail Jalil, David Puig Oses, Yingyong Qi
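    The DTW side of such an engine combination is the classic dynamic-programming recurrence; the sketch below is the textbook algorithm, not the patented engine, and the scalar frames with absolute-difference cost stand in for acoustic feature vectors:

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Dynamic time warping: minimum cumulative frame distance over all
    monotonic alignments of two sequences."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = dist(a[i - 1], b[j - 1])
            D[i][j] = c + min(D[i - 1][j],      # stretch a
                              D[i][j - 1],      # stretch b
                              D[i - 1][j - 1])  # advance both
    return D[n][m]
```

    A speaker-dependent nametag recognizer of this kind would compare an utterance against each stored template and pick the smallest distance, while an HMM engine handles the speaker-independent vocabulary.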
  • Patent number: 6813604
    Abstract: A text to speech system modeling durational characteristics of a target speaker is addressed herein. A body of target speaker training text is selected having maximum possible information about speaker-specific characteristics. The body of target speaker training text is read by a target speaker to produce a target speaker training corpus. A previously generated source model reflecting characteristics of a source speaker is retrieved and the target speaker training corpus is processed to produce modification parameters reflecting differences between durational characteristics of the target speaker and those predicted by the source model. The modification parameters are applied to the source model to produce a target model. Text inputs are processed using the target model to produce speech outputs reflecting durational characteristics of the target speaker.
    Type: Grant
    Filed: November 13, 2000
    Date of Patent: November 2, 2004
    Assignee: Lucent Technologies Inc.
    Inventors: Chi-Lin Shih, Jan Pieter Hendrik van Santen
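    In the simplest reading, the modification parameters can be pictured as per-unit duration ratios between the target speaker's observations and the source model's predictions; treating durations as plain per-unit means, as below, is a deliberate simplification of the patent's durational model:

```python
def duration_modifiers(source_durations, target_durations):
    """Modification parameters: per-unit ratio of the target speaker's
    observed duration to the source model's prediction."""
    return {u: target_durations[u] / source_durations[u]
            for u in target_durations}

def adapt_model(source_durations, modifiers, default=1.0):
    """Apply the modification parameters to the source model; units the
    target corpus never covered keep the source prediction."""
    return {u: d * modifiers.get(u, default)
            for u, d in source_durations.items()}
```

    The appeal of the scheme is that a small target corpus only has to estimate the deltas from the source model, not a full durational model from scratch.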
  • Patent number: 6801891
    Abstract: A system is provided for decoding one or more sequences of sub-word units output by a speech recognition system into one or more representative words. The system uses a dynamic programming technique to align the sequence of sub-word units output by the recognition system with a number of dictionary sub-word unit sequences representative of dictionary words to identify the most likely word or words corresponding to the spoken input.
    Type: Grant
    Filed: November 13, 2001
    Date of Patent: October 5, 2004
    Assignee: Canon Kabushiki Kaisha
    Inventors: Philip Neil Garner, Jason Peter Andrew Charlesworth
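    The dictionary alignment the abstract describes can be sketched as edit distance over sub-word unit sequences, with the best-aligning pronunciation winning. The unit costs of 1 and the tiny lexicon below are illustrative assumptions:

```python
def edit_distance(seq, ref):
    """Levenshtein distance between a recognized sub-word unit sequence
    and a dictionary pronunciation (insertions, deletions,
    substitutions of units), using a rolling one-row table."""
    n, m = len(seq), len(ref)
    D = list(range(m + 1))
    for i in range(1, n + 1):
        prev, D[0] = D[0], i
        for j in range(1, m + 1):
            cur = D[j]
            D[j] = min(D[j] + 1,       # extra unit in seq
                       D[j - 1] + 1,   # missing unit in seq
                       prev + (seq[i - 1] != ref[j - 1]))  # (mis)match
            prev = cur
    return D[m]

def decode_word(units, lexicon):
    """Pick the dictionary word whose pronunciation aligns best with
    the recognizer's sub-word unit output."""
    return min(lexicon, key=lambda w: edit_distance(units, lexicon[w]))
```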
  • Patent number: 6801886
    Abstract: A system for improved digital data compression in an audio encoder. A threshold is established which depends on the bit rate of the input data. A determination is made whether the bit rate is above or below the established threshold. A masking index is calculated for the input data according to a first formula if the input data is being transmitted at a rate at or below the threshold. A second formula is used to calculate the masking index if the input data is being transmitted at a rate above the threshold. The masking index is used to generate a masking threshold, and data deemed insignificant relative to the masking threshold is ignored. In the preferred embodiment of the present invention, a psycho-acoustic modeler, which is included in the encoding section of an encoding/decoding (CODEC) circuit, is used to determine a masking index. The masking index is then used to generate a masking threshold.
    Type: Grant
    Filed: November 17, 2000
    Date of Patent: October 5, 2004
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Wan-Chieh Pai, Fengduo Hu
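    The rate-dependent switch between two masking-index formulas can be sketched as below; the formulas, coefficients, and 96 kbit/s threshold are illustrative placeholders, not the patent's psycho-acoustic model:

```python
def masking_index(bit_rate, tonality, rate_threshold=96_000):
    """Pick one of two masking-index formulas by input bit rate."""
    if bit_rate <= rate_threshold:
        return 24.5 - 10.0 * tonality   # low-rate formula: mask aggressively
    return 18.0 - 6.0 * tonality        # high-rate formula: keep more detail

def compress_band(energies_db, bit_rate, tonality):
    """Drop components deemed insignificant relative to the masking
    threshold derived from the strongest component."""
    mask = max(energies_db) - masking_index(bit_rate, tonality)
    return [e if e >= mask else None for e in energies_db]
```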
  • Patent number: 6799158
    Abstract: A characteristic identifier for digital data is generated. The information contained in a digital data set is reduced such that the resulting identifier is comparable to another identifier generated in the same manner. The generated identifiers are used to detect identical digital data or to identify inexact copies of digital data. In one embodiment of the invention, the digital data is a digital audio signal and the characteristic identifier is called an audio signature. The comparison of identical audio data according to the invention can be carried out without a person actually listening to the audio data. The present invention can be used to establish automated processes that find potential unauthorized copies of audio data, e.g., music recordings, and therefore enables better enforcement of copyrights in the audio industry.
    Type: Grant
    Filed: December 18, 2000
    Date of Patent: September 28, 2004
    Assignee: International Business Machines Corporation
    Inventors: Uwe Fischer, Stefan Hoffmann, Werner Kriechbaum, Gerhard Stenzel
  • Patent number: 6792405
    Abstract: A feature extraction process for use in a wireless communication system provides automatic speech recognition based on both spectral envelope and voicing information. The shape of the spectral envelope is used to determine the line spectral pairs (LSPs) of the incoming bitstream, and the adaptive gain coefficients and fixed gain coefficients are used to generate the “voiced” and “unvoiced” feature parameter information.
    Type: Grant
    Filed: December 5, 2000
    Date of Patent: September 14, 2004
    Assignee: AT&T Corp.
    Inventors: Richard Vandervoort Cox, Hong Kook Kim
  • Patent number: 6785657
    Abstract: A digital signal processor includes a coded audio data generation unit for generating coded audio data; an audio data generation unit for generating audio data; a signal switching unit for outputting one of the coded audio data and the audio data respectively supplied from the coded audio data generation unit and the audio data generation unit, and switching the output between these data; and a signal switching control unit for detecting the periodicity of the coded audio data outputted from the coded audio data generation unit, and controlling the signal switching unit so as to switch the output at the boundary of periods of the coded audio data.
    Type: Grant
    Filed: November 29, 2000
    Date of Patent: August 31, 2004
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventor: Norio Hatanaka
  • Patent number: 6785645
    Abstract: An efficient and accurate classification method for classifying speech and music signals, or other diverse signal types, is provided. The method and system are especially, although not exclusively, suited for use in real-time applications. Long-term and short-term features are extracted relative to each frame, whereby short-term features are used to detect a potential switching point at which to switch a coder operating mode, and long-term features are used to classify each frame and validate the potential switch at the potential switch point according to the classification and a predefined criterion.
    Type: Grant
    Filed: November 29, 2001
    Date of Patent: August 31, 2004
    Assignee: Microsoft Corporation
    Inventors: Hosam Adel Khalil, Vladimir Cuperman, Tian Wang
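    The two-stage decision the abstract describes (a short-term feature flags a candidate switch point, a long-term feature validates it) can be sketched on a per-frame scalar feature; the window sizes and thresholds below are illustrative assumptions:

```python
def detect_switch_points(features, short_win=2, long_win=8,
                         jump=0.5, margin=0.2):
    """Short-term means flag candidate coder-mode switch points; each
    candidate is kept only if the long-term means before and after it
    actually differ (the classification-based validation step)."""
    switches = []
    for t in range(short_win, len(features) - long_win):
        recent = sum(features[t:t + short_win]) / short_win
        past = sum(features[t - short_win:t]) / short_win
        if abs(recent - past) > jump:                       # candidate
            before = sum(features[max(0, t - long_win):t]) / min(t, long_win)
            after = sum(features[t:t + long_win]) / long_win
            if abs(after - before) > margin:                # validated
                switches.append(t)
    return switches
```

    The short window reacts quickly, as a real-time coder needs, while the long window keeps a momentary transient from flipping the operating mode.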
  • Patent number: 6766297
    Abstract: A method for integrating a picture archiving and communication system (“PACS”) and a voice dictating or speech recognition system (collectively “voice dictation system”) is provided. The method includes logging in to a PACS. The method also includes automatically logging in to a voice dictation system responsive to the logging in step and opening an image file in the PACS. The integration method further includes automatically creating a dictation file associated with the image file in the voice dictation system responsive to at least one of the logging in step, the automatic logging in step or the opening step. The method may optionally include sending a predetermined identifier to the dictation file to associate the image file with the dictation file. The method may additionally optionally include performing the automatic logging in step responsive to the PACS being in a dictation mode.
    Type: Grant
    Filed: December 29, 1999
    Date of Patent: July 20, 2004
    Assignee: General Electric Company
    Inventors: Roland Lamer, Jeremy Malecha
  • Patent number: 6766294
    Abstract: A performance gauge for use in conjunction with a transcription system including a speech processor linked to at least one speech recognition engine and at least one transcriptionist. The speech processor includes an input for receiving speech files and storage means for storing the received speech files until they are forwarded to a selected speech recognition engine or transcriptionist for processing. The system includes a transcriptionist text file database in which manually transcribed transcriptionist text files are stored, each stored transcriptionist text file including time-stamped data indicative of position within an original speech file. The system further includes a recognition engine text file database in which recognition engine text files transcribed via the at least one speech recognition engine are stored, each stored recognition engine text file including time-stamped data indicative of position within an original speech file.
    Type: Grant
    Filed: November 30, 2001
    Date of Patent: July 20, 2004
    Assignee: Dictaphone Corporation
    Inventors: Andrew MacGinite, James Cyr, Martin Hold, Channell Greene, Regina Kuhnen
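    One way to picture such a gauge is a word-level match score between the engine's draft and the manual transcript of the same speech file; the whole-text alignment below is a simplification that ignores the patent's time-stamped positioning:

```python
import difflib

def word_accuracy(engine_text, transcriptionist_text):
    """Fraction of the manual transcript's words that the recognition
    engine's draft reproduced, via longest-matching-block alignment."""
    hyp = engine_text.lower().split()
    ref = transcriptionist_text.lower().split()
    matcher = difflib.SequenceMatcher(None, ref, hyp)
    matched = sum(b.size for b in matcher.get_matching_blocks())
    return matched / max(len(ref), 1)
```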
  • Patent number: 6766299
    Abstract: Methods, systems and apparatuses directed toward an authoring tool that gives users the ability to make high-quality, speech-driven animation in which the animated character speaks in the user's voice. Embodiments of the present invention allow the animation to be sent as a message over the Internet or used as a set of instructions for various applications including Internet chat rooms. According to one embodiment, the user chooses a character and a scene from a menu, then speaks into the computer's microphone to generate a personalized message. Embodiments of the present invention use speech-recognition technology to match the audio input to the appropriate animated mouth shapes creating a professional looking 2D or 3D animated scene with lip-synced audio characteristics.
    Type: Grant
    Filed: December 20, 1999
    Date of Patent: July 20, 2004
    Assignee: Thrillionaire Productions, Inc.
    Inventors: Victor Cyril Bellomo, Eric Manaolana Herrmann
  • Patent number: 6745161
    Abstract: Disclosed is a method for linguistic pattern recognition of information. Initially, textual information is retrieved from a data source utilizing a network. The textual information is then segmented into a plurality of phrases, which are then scanned for patterns of interest. For each pattern of interest found, a corresponding event structure is built. Event structures that provide information about essentially the same incident are then merged.
    Type: Grant
    Filed: July 10, 2000
    Date of Patent: June 1, 2004
    Assignee: Discern Communications, Inc.
    Inventors: James F. Arnold, Loren L. Voss
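    The segment/scan/build/merge pipeline can be sketched end to end; the acquisition regex and the actor/target merge key below are illustrative placeholders, not the patent's linguistic patterns:

```python
import re

def extract_events(text, pattern=r"(\w+) (acquired|bought) (\w+)"):
    """Segment text into phrases, scan each for a pattern of interest,
    and build one event structure per match."""
    events = []
    for phrase in re.split(r"[.;]\s*", text):
        m = re.search(pattern, phrase)
        if m:
            events.append({"actor": m.group(1), "target": m.group(3)})
    return events

def merge_events(events):
    """Merge event structures describing essentially the same incident
    (here: the same actor/target pair)."""
    merged = {}
    for e in events:
        key = (e["actor"], e["target"])
        merged.setdefault(key, {}).update(e)
    return list(merged.values())
```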