Patents Examined by David D. Knepper
  • Patent number: 7219057
    Abstract: A speech recognition method includes receiving signals derived from indices of a codebook corresponding to recognition feature vectors extracted from speech to be recognized. The signals include an indication of the number of bits per codebook index. The method also includes obtaining the string of indices from the received signals, obtaining the corresponding recognition feature vectors from the string of indices, and applying the recognition feature vectors to a word-level recognition process. To conserve network capacity, the size of the codebook and the corresponding number of bits per codebook index are adapted on a dialogue-by-dialogue basis. The adaptation accomplishes a tradeoff between expected recognition rate and expected bitrate by optimizing a metric that is a function of both.
    Type: Grant
    Filed: June 8, 2005
    Date of Patent: May 15, 2007
    Assignee: Koninklijke Philips Electronics
    Inventor: Yin-Pin Yang
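The dialogue-level adaptation described in the abstract above can be sketched as a search over candidate codebook sizes that maximizes a combined metric of expected recognition rate and expected bitrate. This is a minimal illustration, not the patent's actual formulation; the weighting scheme and all numbers are invented:

```python
def choose_codebook_bits(candidates, rec_rate, bitrate, weight=0.5):
    """candidates: bits-per-index options; rec_rate/bitrate: dicts mapping
    bits -> expected recognition rate and normalized bitrate (both 0..1)."""
    def metric(bits):
        # Reward expected recognition rate, penalize expected bitrate.
        return weight * rec_rate[bits] - (1 - weight) * bitrate[bits]
    return max(candidates, key=metric)

# Illustrative values: larger codebooks recognize better but cost more bits.
rec = {6: 0.90, 8: 0.95, 10: 0.96}
bps = {6: 0.60, 8: 0.80, 10: 1.00}
best = choose_codebook_bits([6, 8, 10], rec, bps)  # trades accuracy for bits
```

With equal weighting the small gain from a larger codebook does not justify its bitrate, so the smallest candidate wins; weighting recognition more heavily shifts the choice toward larger codebooks.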
  • Patent number: 7209884
    Abstract: A system and method are provided for the real-time voice input of a destination, using a defined input dialog, into a destination guiding system. The system includes devices by which an entered voice statement of a user is detected via a voice recognition device, compared with stored voice statements, and classified according to its recognition probability, and by which the stored voice statement with the highest recognition probability is recognized as the entered voice statement. The stored voice statements assigned to a destination are composed of at least the destination name and at least one regionally limiting information unit that unambiguously identifies the destination name. Each destination name is stored with a flag symbol in a first database, and each additional information unit is stored in a second database.
    Type: Grant
    Filed: March 8, 2001
    Date of Patent: April 24, 2007
    Assignee: Bayerische Motoren Werke Aktiengesellschaft
    Inventors: Mihai Steingruebner, Tarek Said
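The two-database disambiguation described above can be sketched as a lookup in which candidate entries for a spoken destination name are narrowed by a regionally limiting information unit. The data layout below is a guess for illustration only; the patent's actual storage and flag scheme may differ:

```python
def resolve_destination(name_db, region_db, spoken_name, spoken_region):
    """name_db: destination name -> list of destination ids (first database).
    region_db: destination id -> regional information unit (second database).
    Returns the unique destination id consistent with both inputs, or None."""
    for dest_id in name_db.get(spoken_name, []):
        if region_db.get(dest_id) == spoken_region:
            return dest_id          # region unambiguously identifies the name
    return None
```

Many towns share a name, so the regional unit is what makes the lookup unambiguous.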
  • Patent number: 7209879
    Abstract: A network noise suppressor includes a decoder for partially decoding a CELP coded bit-stream. A noise suppressing filter H(z) is determined from the decoded parameters. The filter is used to determine modified LP and gain parameters. Corresponding parameters in the coded bit-stream are overwritten with the modified parameters.
    Type: Grant
    Filed: March 26, 2002
    Date of Patent: April 24, 2007
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Anders Eriksson, Tönu Trump
  • Patent number: 7191135
    Abstract: A speech recognition system that includes a host computer which is operative to communicate at least one graphical user interface (GUI) display file to a mobile terminal of the system. The mobile terminal includes a microphone for receiving speech input; wherein the at least one GUI display file is operative to be associated with at least one of a dictionary file and syntax file to facilitate speech recognition in connection with the at least one GUI display file.
    Type: Grant
    Filed: April 8, 1998
    Date of Patent: March 13, 2007
    Assignee: Symbol Technologies, Inc.
    Inventor: Timothy P. O'Hagan
  • Patent number: 7184955
    Abstract: A system and method for indexing multimedia files utilizes audio characteristics of predefined audio content contained in selected multimedia segments of the multimedia files to distinguish the selected multimedia segments. In the exemplary embodiment, the predefined audio content is speech contained in video segments of video files. Furthermore, the audio characteristics are speaker characteristics. The speech-containing video segments are detected by analyzing the audio contents of the video files. The audio contents of the speech-containing video segments are then characterized to distinguish the video segments according to speakers. The indexing of speech-containing video segments based on speakers allows users to selectively access video segments that contain speech from a particular speaker without having to manually search all the speech-containing video segments.
    Type: Grant
    Filed: March 25, 2002
    Date of Patent: February 27, 2007
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Pere Obrador, Tong Zhang
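The speaker-based indexing described above can be sketched as grouping speech-containing segments by speaker label so that a user can pull every segment for one speaker. In a real system the label would be derived from the audio's speaker characteristics; here the segments and labels are invented inputs:

```python
def index_by_speaker(segments):
    """segments: list of (start, end, speaker) for speech-containing
    video segments. Returns a speaker -> list of (start, end) index."""
    index = {}
    for start, end, speaker in segments:
        index.setdefault(speaker, []).append((start, end))
    return index
```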
  • Patent number: 7177801
    Abstract: A method of communicating speech across a communication link using very low digital data bandwidth is disclosed, having the steps of: translating speech into text at a source terminal; communicating the text across the communication link to a destination terminal; and translating the text into reproduced speech at the destination terminal. In a preferred embodiment, a speech profile corresponding to the speaker is used to reproduce the speech at the destination terminal so that the reproduced speech more closely approximates the original speech of the speaker. A default voice profile is used to recreate speech when a user profile is unavailable. User specific profiles can be created during training prior to communication or can be created during communication from actual speech. The user profiles can be updated to improve accuracy of recognition and to enhance reproduction of speech. The updated user profiles are transmitted to the destination terminals as needed.
    Type: Grant
    Filed: December 21, 2001
    Date of Patent: February 13, 2007
    Assignee: Texas Instruments Incorporated
    Inventors: Kieth Krasnanski, Doug Wescott, William Taboada
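The low-bandwidth protocol described above can be sketched as sending recognized text plus a speaker-profile identifier instead of audio, with the destination falling back to a default voice profile when the named profile is unavailable. The message format and profile names are invented for illustration:

```python
DEFAULT_PROFILE = "default_voice"   # used when no user profile is available

def encode_message(text, profile_id):
    # Text plus a profile id costs far fewer bits than the speech waveform.
    return {"text": text, "profile": profile_id}

def choose_profile(message, available_profiles):
    """Pick the speaker's profile if the destination has it, else the default."""
    pid = message["profile"]
    return pid if pid in available_profiles else DEFAULT_PROFILE
```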
  • Patent number: 7174293
    Abstract: A method and system for direct audio capture and identification of the captured audio. A user may then be offered the opportunity to purchase recordings directly over the Internet or similar outlet. The system preferably includes one or more user-carried portable audio capture devices that employ a microphone, analog to digital converter, signal processor, and memory to store samples of ambient audio or audio features calculated from the audio. Users activate their capture devices when they hear a recording that they would like to identify or purchase. Later, the user may connect the capture device to a personal computer to transfer the audio samples or audio feature samples to an Internet site for identification. The Internet site preferably uses automatic pattern recognition techniques to identify the captured samples from a library of recordings offered for sale. The user can then verify that the sample is from the desired recording and place an order online.
    Type: Grant
    Filed: July 13, 2001
    Date of Patent: February 6, 2007
    Assignee: Iceberg Industries LLC
    Inventors: Stephen C. Kenyon, Laura Simkins
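The identification step described above can be sketched as matching a captured feature vector against a library of reference recordings. Plain Euclidean distance stands in for the patent's automatic pattern recognition, and the library is invented:

```python
import math

def nearest_recording(sample_features, library):
    """library: dict mapping recording title -> reference feature vector.
    Returns the title whose features are closest to the captured sample."""
    def dist(vec):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(sample_features, vec)))
    return min(library, key=lambda title: dist(library[title]))
```

In practice the user would then confirm the match before ordering the recording.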
  • Patent number: 7158935
    Abstract: The invention concerns a system and method of predicting problematic dialogs in an automated dialog system based on the user's input communications. The method may include determining whether a probability of conducting a successful dialog with the user exceeds a first threshold. The successful dialog may be defined as a dialog exchange between an automated dialog system and the user that results in at least one of processing of the user's input communication and routing the user's input communication. The method may further operate such that if the first threshold is exceeded, further dialog is conducted with the user. Otherwise, the user may be directed to a human for assistance.
    Type: Grant
    Filed: November 15, 2000
    Date of Patent: January 2, 2007
    Assignee: AT&T Corp.
    Inventors: Allen Louis Gorin, Irene Langkilde Geary, Diane Judith Litman, Marilyn Ann Walker, Jeremy H. Wright
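The routing rule described above reduces to a threshold test: if the estimated probability of a successful automated dialog exceeds the first threshold, the dialog continues; otherwise the user is directed to a human. The probability estimator itself is omitted; the threshold value is an invented placeholder:

```python
def route(success_probability, threshold=0.7):
    """Continue the automated dialog only when success looks likely enough."""
    return "continue_dialog" if success_probability > threshold else "human_agent"
```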
  • Patent number: 7149689
    Abstract: A speech recognition system comprises exactly two automated speech recognition (ASR) engines connected to receive the same inputs. Each engine produces a recognition output, or hypothesis. The system implements one or both of two methods for combining the output of the two engines. In the first method, a confusion matrix statistically generated for each speech recognition engine is converted into an alternatives matrix in which every column is ordered from highest to lowest probability. A program loop is set up in which the recognition outputs of the speech recognition engines are cross-compared with the alternatives matrices. If the output from the first ASR engine matches an alternative, its output is adopted as the final output. If the vectors provided by the alternatives matrices are exhausted without finding a match, the output from the first speech recognition engine is likewise adopted as the final output. In the second method, the confusion matrix for each ASR engine is converted into a Bayesian probability matrix.
    Type: Grant
    Filed: January 30, 2003
    Date of Patent: December 12, 2006
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Sherif Yacoub
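The first combination method described above can be sketched as a cross-comparison: engine 1's output is checked against the ordered alternatives for engine 2's output, and engine 1's hypothesis is adopted either way, with the match recording whether the alternatives matrix confirmed it. The alternatives table here is invented, not a statistically generated confusion matrix:

```python
def combine(hyp1, hyp2, alternatives):
    """hyp1/hyp2: word hypotheses from ASR engines 1 and 2.
    alternatives: word -> its confusion alternatives, ordered
    highest-to-lowest probability. Returns (final_output, confirmed)."""
    for alt in alternatives.get(hyp2, []):
        if alt == hyp1:
            return hyp1, True    # engine 1's output confirmed via the matrix
    return hyp1, False           # alternatives exhausted: default to engine 1
```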
  • Patent number: 7149684
    Abstract: An improved method and system for performing speech reception threshold testing includes calibrating one or more recorded spoken words to have substantially the same sound energy and presenting the one or more calibrated recorded spoken words to a test subject. A speech reception threshold of the test subject is measured by utilizing the one or more calibrated recorded spoken words wherein the speech reception threshold measured is indicative of a sound level at which the test subject can recognize the presented recorded spoken word or words.
    Type: Grant
    Filed: December 18, 2001
    Date of Patent: December 12, 2006
    Assignee: The United States of America as represented by the Secretary of the Army
    Inventor: William A. Ahroon
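The calibration step described above can be sketched as scaling each recorded word so that all words have substantially the same sound energy, here expressed as equal RMS level. Real inputs would be sampled waveforms; the toy signals and target level below are invented:

```python
import math

def rms(signal):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def calibrate(words, target_rms=0.1):
    """words: list of sample lists. Returns copies scaled to target_rms."""
    out = []
    for w in words:
        gain = target_rms / rms(w)          # per-word gain to equalize energy
        out.append([x * gain for x in w])
    return out
```

The threshold measurement then presents these equal-energy words at decreasing levels until the subject can no longer recognize them.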
  • Patent number: 7149687
    Abstract: State-of-the-art speech recognition systems are trained using transcribed utterances, preparation of which is labor-intensive and time-consuming. The present invention is an iterative method for reducing the transcription effort for training in automatic speech recognition (ASR). Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples and then selecting the most informative ones with respect to a given cost function for a human to label. The method comprises automatically estimating a confidence score for each word of the utterance and exploiting the lattice output of a speech recognizer, which was trained on a small set of transcribed data. An utterance confidence score is computed based on these word confidence scores; then the utterances are selectively sampled to be transcribed using the utterance confidence scores.
    Type: Grant
    Filed: December 24, 2002
    Date of Patent: December 12, 2006
    Assignee: AT&T Corp.
    Inventors: Allen Louis Gorin, Dilek Z. Hakkani-Tur, Giuseppe Riccardi
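The selective-sampling step described above can be sketched as follows: combine per-word confidence scores into an utterance confidence (a simple mean here; the patent computes it from the recognizer's lattice output), then pick the least-confident utterances for human transcription. The scores are invented stand-ins:

```python
def utterance_confidence(word_scores):
    # Mean of per-word confidence scores as the utterance-level score.
    return sum(word_scores) / len(word_scores)

def select_for_transcription(utterances, k):
    """utterances: list of (id, [per-word confidence scores]).
    Returns the k utterance ids the recognizer is least confident about,
    i.e. the most informative ones for a human to label."""
    ranked = sorted(utterances, key=lambda u: utterance_confidence(u[1]))
    return [uid for uid, _ in ranked[:k]]
```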
  • Patent number: 7149688
    Abstract: An approach to multi-lingual speech recognition that permits different words in an utterance to be from different languages. Words from different languages are represented using different sets of sub-word units, each associated with the corresponding language. Despite the use of different sets of sub-word units, the approach enables use of cross-word context at boundaries between words from different languages (cross-language context) to select appropriate variants of the sub-word units to match the context.
    Type: Grant
    Filed: November 4, 2002
    Date of Patent: December 12, 2006
    Assignee: SpeechWorks International, Inc.
    Inventor: Johan Schalkwyk
  • Patent number: 7143042
    Abstract: A computer-implemented graphical design tool allows a developer to graphically author a dialog flow for use in a voice response system and to graphically create an operational link between a hypermedia page and a speech object. The hypermedia page may be a Web site, and the speech object may define a spoken dialog interaction between a person and a machine. Using a drag-and-drop interface, the developer can graphically define a dialog as a sequence of speech objects. The developer can also create a link between a property of any speech object and any field of a Web page, to voice-enable the Web page, or to enable a speech application to access Web site data.
    Type: Grant
    Filed: October 4, 1999
    Date of Patent: November 28, 2006
    Assignee: Nuance Communications
    Inventors: Julian Sinai, Steven C. Ehrlich, Rajesh Ragoobeer
  • Patent number: 7143031
    Abstract: An improved method and system for performing speech intelligibility testing includes calibrating one or more recorded spoken words to have substantially the same sound energy and presenting the one or more calibrated recorded spoken words to a test subject. Speech intelligibility of the test subject is measured by utilizing the one or more calibrated recorded spoken words wherein the speech intelligibility measured is indicative of a percentage of the calibrated word or words that the test subject successfully identified.
    Type: Grant
    Filed: December 18, 2001
    Date of Patent: November 28, 2006
    Assignee: The United States of America as represented by the Secretary of the Army
    Inventor: William A. Ahroon
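The scoring described above reduces to a percentage: the share of calibrated test words the subject identified correctly. A minimal sketch with invented word lists:

```python
def intelligibility_percent(presented, responses):
    """Percentage of presented words the subject identified correctly."""
    correct = sum(1 for w, r in zip(presented, responses) if w == r)
    return 100.0 * correct / len(presented)
```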
  • Patent number: 7139712
    Abstract: A second phoneme is generated in consideration of a phonemic context with respect to a first phoneme as a search target. Phonemic piece data corresponding to the second phoneme is searched out from a database. A third phoneme is generated by changing the phonemic context on the basis of the search result, and phonemic piece data corresponding to the third phoneme is re-searched out from the database. The search or re-search result is registered in a table in correspondence with the second or third phoneme.
    Type: Grant
    Filed: March 5, 1999
    Date of Patent: November 21, 2006
    Assignee: Canon Kabushiki Kaisha
    Inventor: Masayuki Yamada
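The search-with-fallback described above can be sketched as: look up a phoneme under its full phonemic context; if the database has no matching phonemic piece, change (here, simply relax) the context and re-search; then register the result in a table. The database keys and contents are invented:

```python
def find_piece(db, phoneme, context):
    """db: (phoneme, context) -> phonemic piece data.
    Returns the (possibly context-relaxed) key and its piece data."""
    key = (phoneme, context)
    if key in db:
        return key, db[key]              # search hit for the full context
    relaxed = (phoneme, None)            # third phoneme: context changed
    return relaxed, db.get(relaxed)      # re-search with the new context

def build_table(db, queries):
    """Register each search or re-search result against its phoneme key."""
    table = {}
    for phoneme, context in queries:
        key, piece = find_piece(db, phoneme, context)
        table[key] = piece
    return table
```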
  • Patent number: 7139708
    Abstract: A system and method for speech recognition using an enhanced phone set comprises speech data, an enhanced phone set, and a transcription generated by a transcription process. The transcription process selects appropriate phones from the enhanced phone set to represent acoustic-phonetic content of the speech data. The enhanced phone set includes base-phones and composite-phones. A phone dataset includes the speech data and the transcription. The present invention also comprises a transformer that applies transformation rules to the phone dataset to produce a transformed phone dataset. The transformed phone dataset may be utilized in training a speech recognizer, such as a Hidden Markov Model. Various types of transformation rules may be applied to the phone dataset of the present invention to find an optimum transformed phone dataset for training a particular speech recognizer.
    Type: Grant
    Filed: August 4, 1999
    Date of Patent: November 21, 2006
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Lex S. Olorenshaw, Mariscela Amador-Hernandez
  • Patent number: 7139710
    Abstract: A system 10 is provided which is receptive to selective tuning at particular frequencies. The system 10 includes a display device 15, an audio synthesizer 14, and a controller 20. The controller 20 is in communication with the display device 15 and the audio synthesizer 14. The controller 20 communicates with the audio synthesizer 14 when a malfunction is detected with respect to the display device 15. In one embodiment, the audio synthesizer 14 produces an audible announcement of a frequency at which the system 10 is currently tuned.
    Type: Grant
    Filed: November 9, 2000
    Date of Patent: November 21, 2006
    Assignee: Honeywell International, Inc.
    Inventor: Rich Bontrager
  • Patent number: 7136816
    Abstract: A method for generating a prosody model that predicts prosodic parameters is disclosed. Upon receiving text annotated with acoustic features, the method comprises generating first classification and regression trees (CARTs) that predict durations and F0 from text by generating initial boundary labels by considering pauses, generating initial accent labels by applying a simple rule on text-derived features only, adding the predicted accent and boundary labels to feature vectors, and using the feature vectors to generate the first CARTs. The first CARTs are used to predict accent and boundary labels. Next, the first CARTs are used to generate second CARTs that predict durations and F0 from text and acoustic features by using lengthened accented syllables and phrase-final syllables, refining accent and boundary models simultaneously, comparing actual and predicted duration of a whole prosodic phrase to normalize speaking rate, and generating the second CARTs that predict the normalized speaking rate.
    Type: Grant
    Filed: December 24, 2002
    Date of Patent: November 14, 2006
    Assignee: AT&T Corp.
    Inventor: Volker Franz Strom
  • Patent number: 7124076
    Abstract: The present invention relates to encoding apparatuses which allow encoding to be performed such that the occurrence of pre-echo and post-echo is suppressed. A predetermined waveform analysis is applied to a low-frequency-component input time-sequential signal which includes a high-frequency component occurring at a specific time, and a low-frequency-component time-sequential signal like that shown in FIG. 9A is generated according to a result of the analysis. The low-frequency-component time-sequential signal is removed from the input time-sequential signal to generate a residual time-sequential signal like that shown in FIG. 9B. An amplitude control process is applied such that the amplitude of the residual time-sequential signal is made almost constant in a block which serves as a unit of encoding to generate a time-sequential signal to be quantized, like that shown in FIG. 9C. The time-sequential signal to be quantized is quantized and encoded.
    Type: Grant
    Filed: December 14, 2001
    Date of Patent: October 17, 2006
    Assignee: Sony Corporation
    Inventors: Minoru Tsuji, Shiro Suzuki, Keisuke Toyama
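The amplitude-control step described above can be sketched as splitting the residual signal into fixed-size coding blocks and scaling each block so its amplitude is roughly constant before quantization. Block size and target level are invented; the low-frequency separation step is omitted:

```python
def normalize_blocks(residual, block_size=4, target_peak=1.0):
    """Scale each coding block of the residual to a common peak amplitude.
    Returns (normalized samples, per-block gains); the gains would have to
    accompany the coded stream so the decoder can undo the scaling."""
    out, gains = [], []
    for i in range(0, len(residual), block_size):
        block = residual[i:i + block_size]
        peak = max(abs(x) for x in block) or 1.0   # avoid dividing by zero
        gain = target_peak / peak
        gains.append(gain)
        out.extend(x * gain for x in block)
    return out, gains
```

Flattening the amplitude within each block is what keeps a sharp transient from forcing coarse quantization (and hence audible pre-echo) across the whole block.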
  • Patent number: 7124085
    Abstract: A constraint-based speech recognition system for use with a form-filling application employed over a telephone system is disclosed. The system comprises an input signal, wherein the input signal includes both speech input and non-speech input of a type generated by a user via a manually operated device. The system further comprises a constraint module operable to access an information database containing information suitable for use with speech recognition, and to generate candidate information based on the non-speech input and the information database, wherein the candidate information corresponds to a portion of the information. The system further comprises a speech recognition module operable to recognize speech based on the speech input and the candidate information. In an exemplary embodiment, the manually operated device is a touch-tone telephone keypad, and the information database is a lexicon encoded according to classes defined by the keys of the keypad.
    Type: Grant
    Filed: December 13, 2001
    Date of Patent: October 17, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Jean-Claude Junqua, Matteo Contolini
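The constraint described above can be sketched as follows: the user keys a word on a touch-tone keypad, each key defining a class of letters, and the lexicon is narrowed to words consistent with that key sequence, giving the recognizer a small candidate set. The toy lexicon is invented; the key-to-letter classes are the standard keypad layout:

```python
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}

def word_to_keys(word):
    """Encode a word as the keypad digits of its letters' classes."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())

def candidates(lexicon, keyed_digits):
    """Lexicon words consistent with the user's keypad input; the speech
    recognizer then only has to distinguish among these."""
    return [w for w in lexicon if word_to_keys(w) == keyed_digits]
```

Keying "228" collapses a large lexicon to the handful of words whose letters fall in classes 2-2-8, which is exactly the candidate information the recognizer consumes.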