Patents Examined by David D. Knepper
  • Patent number: 7219057
    Abstract: A speech recognition method includes receiving signals derived from indices of a codebook corresponding to recognition feature vectors extracted from speech to be recognized. The signals include an indication of the number of bits per codebook index. The method also includes obtaining the string of indices from the received signals, obtaining the corresponding recognition feature vectors from the string of indices, and applying the recognition feature vectors to a word-level recognition process. To conserve network capacity, the size of the codebook and the corresponding number of bits per codebook index are adapted on a dialogue-by-dialogue basis. The adaptation accomplishes a tradeoff between expected recognition rate and expected bitrate by optimizing a metric that is a function of both.
    Type: Grant
    Filed: June 8, 2005
    Date of Patent: May 15, 2007
    Assignee: Koninklijke Philips Electronics
    Inventor: Yin-Pin Yang
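The dialogue-level adaptation described in the abstract above can be sketched as a search over candidate codebook sizes that maximizes a combined metric of expected recognition rate and expected bitrate. This is a minimal illustration, not the patent's actual formulation; the weighting scheme and all numbers are invented:

```python
def choose_codebook_bits(candidates, rec_rate, bitrate, weight=0.5):
    """candidates: bits-per-index options; rec_rate/bitrate: dicts mapping
    bits -> expected recognition rate and normalized bitrate (both 0..1)."""
    def metric(bits):
        # Reward expected recognition rate, penalize expected bitrate.
        return weight * rec_rate[bits] - (1 - weight) * bitrate[bits]
    return max(candidates, key=metric)

# Illustrative values: larger codebooks recognize better but cost more bits.
rec = {6: 0.90, 8: 0.95, 10: 0.96}
bps = {6: 0.60, 8: 0.80, 10: 1.00}
best = choose_codebook_bits([6, 8, 10], rec, bps)  # trades accuracy for bits
```

With equal weighting the small gain from a larger codebook does not justify its bitrate, so the smallest candidate wins; weighting recognition more heavily shifts the choice toward larger codebooks.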
  • Patent number: 7209884
    Abstract: A system and method are provided for the real-time voice input of a destination, using a defined input dialog, into a destination guiding system. The system includes devices by which an entered voice statement of a user is detected via a voice recognition device, compared with stored voice statements, and classified according to its recognition probability, and by which the stored voice statement with the highest recognition probability is recognized as the entered voice statement. The stored voice statements assigned to a destination are composed of at least the destination name and at least one regionally limiting information unit that unambiguously identifies the destination name. Each destination name is stored with a flag symbol in a first database, and each additional information unit is stored in a second database.
    Type: Grant
    Filed: March 8, 2001
    Date of Patent: April 24, 2007
    Assignee: Bayerische Motoren Werke Aktiengesellschaft
    Inventors: Mihai Steingruebner, Tarek Said
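The two-database disambiguation described above can be sketched as a lookup in which candidate entries for a spoken destination name are narrowed by a regionally limiting information unit. The data layout below is a guess for illustration only; the patent's actual storage and flag scheme may differ:

```python
def resolve_destination(name_db, region_db, spoken_name, spoken_region):
    """name_db: destination name -> list of destination ids (first database).
    region_db: destination id -> regional information unit (second database).
    Returns the unique destination id consistent with both inputs, or None."""
    for dest_id in name_db.get(spoken_name, []):
        if region_db.get(dest_id) == spoken_region:
            return dest_id          # region unambiguously identifies the name
    return None
```

Many towns share a name, so the regional unit is what makes the lookup unambiguous.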
  • Patent number: 7209879
    Abstract: A network noise suppressor includes a decoder for partially decoding a CELP coded bit-stream. A noise suppressing filter H(z) is determined from the decoded parameters. The filter is used to determine modified LP and gain parameters. Corresponding parameters in the coded bit-stream are overwritten with the modified parameters.
    Type: Grant
    Filed: March 26, 2002
    Date of Patent: April 24, 2007
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Anders Eriksson, Tönu Trump
  • Patent number: 7191135
    Abstract: A speech recognition system that includes a host computer which is operative to communicate at least one graphical user interface (GUI) display file to a mobile terminal of the system. The mobile terminal includes a microphone for receiving speech input; wherein the at least one GUI display file is operative to be associated with at least one of a dictionary file and syntax file to facilitate speech recognition in connection with the at least one GUI display file.
    Type: Grant
    Filed: April 8, 1998
    Date of Patent: March 13, 2007
    Assignee: Symbol Technologies, Inc.
    Inventor: Timothy P. O'Hagan
  • Patent number: 7184955
    Abstract: A system and method for indexing multimedia files utilizes audio characteristics of predefined audio content contained in selected multimedia segments of the multimedia files to distinguish the selected multimedia segments. In the exemplary embodiment, the predefined audio content is speech contained in video segments of video files. Furthermore, the audio characteristics are speaker characteristics. The speech-containing video segments are detected by analyzing the audio contents of the video files. The audio contents of the speech-containing video segments are then characterized to distinguish the video segments according to speakers. The indexing of speech-containing video segments based on speakers allows users to selectively access video segments that contain speech from a particular speaker without having to manually search all the speech-containing video segments.
    Type: Grant
    Filed: March 25, 2002
    Date of Patent: February 27, 2007
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Pere Obrador, Tong Zhang
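The speaker-based indexing described above can be sketched as grouping speech-containing segments by speaker label so that a user can pull every segment for one speaker. In a real system the label would be derived from the audio's speaker characteristics; here the segments and labels are invented inputs:

```python
def index_by_speaker(segments):
    """segments: list of (start, end, speaker) for speech-containing
    video segments. Returns a speaker -> list of (start, end) index."""
    index = {}
    for start, end, speaker in segments:
        index.setdefault(speaker, []).append((start, end))
    return index
```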
  • Patent number: 7177801
    Abstract: A method of communicating speech across a communication link using very low digital data bandwidth is disclosed, having the steps of: translating speech into text at a source terminal; communicating the text across the communication link to a destination terminal; and translating the text into reproduced speech at the destination terminal. In a preferred embodiment, a speech profile corresponding to the speaker is used to reproduce the speech at the destination terminal so that the reproduced speech more closely approximates the original speech of the speaker. A default voice profile is used to recreate speech when a user profile is unavailable. User specific profiles can be created during training prior to communication or can be created during communication from actual speech. The user profiles can be updated to improve accuracy of recognition and to enhance reproduction of speech. The updated user profiles are transmitted to the destination terminals as needed.
    Type: Grant
    Filed: December 21, 2001
    Date of Patent: February 13, 2007
    Assignee: Texas Instruments Incorporated
    Inventors: Kieth Krasnanski, Doug Wescott, William Taboada
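The low-bandwidth protocol described above can be sketched as sending recognized text plus a speaker-profile identifier instead of audio, with the destination falling back to a default voice profile when the named profile is unavailable. The message format and profile names are invented for illustration:

```python
DEFAULT_PROFILE = "default_voice"   # used when no user profile is available

def encode_message(text, profile_id):
    # Text plus a profile id costs far fewer bits than the speech waveform.
    return {"text": text, "profile": profile_id}

def choose_profile(message, available_profiles):
    """Pick the speaker's profile if the destination has it, else the default."""
    pid = message["profile"]
    return pid if pid in available_profiles else DEFAULT_PROFILE
```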
  • Patent number: 7174293
    Abstract: A method and system for direct audio capture and identification of the captured audio. A user may then be offered the opportunity to purchase recordings directly over the Internet or similar outlet. The system preferably includes one or more user-carried portable audio capture devices that employ a microphone, analog to digital converter, signal processor, and memory to store samples of ambient audio or audio features calculated from the audio. Users activate their capture devices when they hear a recording that they would like to identify or purchase. Later, the user may connect the capture device to a personal computer to transfer the audio samples or audio feature samples to an Internet site for identification. The Internet site preferably uses automatic pattern recognition techniques to identify the captured samples from a library of recordings offered for sale. The user can then verify that the sample is from the desired recording and place an order online.
    Type: Grant
    Filed: July 13, 2001
    Date of Patent: February 6, 2007
    Assignee: Iceberg Industries LLC
    Inventors: Stephen C. Kenyon, Laura Simkins
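The identification step described above can be sketched as matching a captured feature vector against a library of reference recordings. Plain Euclidean distance stands in for the patent's automatic pattern recognition, and the library is invented:

```python
import math

def nearest_recording(sample_features, library):
    """library: dict mapping recording title -> reference feature vector.
    Returns the title whose features are closest to the captured sample."""
    def dist(vec):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(sample_features, vec)))
    return min(library, key=lambda title: dist(library[title]))
```

In practice the user would then confirm the match before ordering the recording.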
  • Patent number: 7158935
    Abstract: The invention concerns a system and method of predicting problematic dialogs in an automated dialog system based on the user's input communications. The method may include determining whether a probability of conducting a successful dialog with the user exceeds a first threshold. The successful dialog may be defined as a dialog exchange between an automated dialog system and the user that results in at least one of processing of the user's input communication and routing the user's input communication. The method may further operate such that if the first threshold is exceeded, further dialog is conducted with the user. Otherwise, the user may be directed to a human for assistance.
    Type: Grant
    Filed: November 15, 2000
    Date of Patent: January 2, 2007
    Assignee: AT&T Corp.
    Inventors: Allen Louis Gorin, Irene Langkilde Geary, Diane Judith Litman, Marilyn Ann Walker, Jeremy H. Wright
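The routing rule described above reduces to a threshold test: if the estimated probability of a successful automated dialog exceeds the first threshold, the dialog continues; otherwise the user is directed to a human. The probability estimator itself is omitted; the threshold value is an invented placeholder:

```python
def route(success_probability, threshold=0.7):
    """Continue the automated dialog only when success looks likely enough."""
    return "continue_dialog" if success_probability > threshold else "human_agent"
```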
  • Patent number: 7149689
    Abstract: A speech recognition system comprises exactly two automated speech recognition (ASR) engines connected to receive the same inputs. Each engine produces a recognition output, or hypothesis. The system implements one or both of two methods for combining the output of the two engines. In the first method, a confusion matrix statistically generated for each speech recognition engine is converted into an alternatives matrix in which every column is ordered from highest to lowest probability. A program loop is set up in which the recognition outputs of the speech recognition engines are cross-compared with the alternatives matrices. If the output from the first ASR engine matches an alternative, its output is adopted as the final output. If the vectors provided by the alternatives matrices are exhausted without finding a match, the output from the first speech recognition engine is likewise adopted as the final output. In the second method, the confusion matrix for each ASR engine is converted into a Bayesian probability matrix.
    Type: Grant
    Filed: January 30, 2003
    Date of Patent: December 12, 2006
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Sherif Yacoub
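The first combination method described above can be sketched as a cross-comparison: engine 1's output is checked against the ordered alternatives for engine 2's output, and engine 1's hypothesis is adopted either way, with the match recording whether the alternatives matrix confirmed it. The alternatives table here is invented, not a statistically generated confusion matrix:

```python
def combine(hyp1, hyp2, alternatives):
    """hyp1/hyp2: word hypotheses from ASR engines 1 and 2.
    alternatives: word -> its confusion alternatives, ordered
    highest-to-lowest probability. Returns (final_output, confirmed)."""
    for alt in alternatives.get(hyp2, []):
        if alt == hyp1:
            return hyp1, True    # engine 1's output confirmed via the matrix
    return hyp1, False           # alternatives exhausted: default to engine 1
```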
  • Patent number: 7149684
    Abstract: An improved method and system for performing speech reception threshold testing includes calibrating one or more recorded spoken words to have substantially the same sound energy and presenting the one or more calibrated recorded spoken words to a test subject. A speech reception threshold of the test subject is measured by utilizing the one or more calibrated recorded spoken words wherein the speech reception threshold measured is indicative of a sound level at which the test subject can recognize the presented recorded spoken word or words.
    Type: Grant
    Filed: December 18, 2001
    Date of Patent: December 12, 2006
    Assignee: The United States of America as represented by the Secretary of the Army
    Inventor: William A. Ahroon
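The calibration step described above can be sketched as scaling each recorded word so that all words have substantially the same sound energy, here expressed as equal RMS level. Real inputs would be sampled waveforms; the toy signals and target level below are invented:

```python
import math

def rms(signal):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def calibrate(words, target_rms=0.1):
    """words: list of sample lists. Returns copies scaled to target_rms."""
    out = []
    for w in words:
        gain = target_rms / rms(w)          # per-word gain to equalize energy
        out.append([x * gain for x in w])
    return out
```

The threshold measurement then presents these equal-energy words at decreasing levels until the subject can no longer recognize them.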
  • Patent number: 7149687
    Abstract: State-of-the-art speech recognition systems are trained using transcribed utterances, preparation of which is labor-intensive and time-consuming. The present invention is an iterative method for reducing the transcription effort for training in automatic speech recognition (ASR). Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples and then selecting the most informative ones with respect to a given cost function for a human to label. The method comprises automatically estimating a confidence score for each word of the utterance and exploiting the lattice output of a speech recognizer, which was trained on a small set of transcribed data. An utterance confidence score is computed based on these word confidence scores; then the utterances are selectively sampled to be transcribed using the utterance confidence scores.
    Type: Grant
    Filed: December 24, 2002
    Date of Patent: December 12, 2006
    Assignee: AT&T Corp.
    Inventors: Allen Louis Gorin, Dilek Z. Hakkani-Tur, Giuseppe Riccardi
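The selective-sampling step described above can be sketched as follows: combine per-word confidence scores into an utterance confidence (a simple mean here; the patent computes it from the recognizer's lattice output), then pick the least-confident utterances for human transcription. The scores are invented stand-ins:

```python
def utterance_confidence(word_scores):
    # Mean of per-word confidence scores as the utterance-level score.
    return sum(word_scores) / len(word_scores)

def select_for_transcription(utterances, k):
    """utterances: list of (id, [per-word confidence scores]).
    Returns the k utterance ids the recognizer is least confident about,
    i.e. the most informative ones for a human to label."""
    ranked = sorted(utterances, key=lambda u: utterance_confidence(u[1]))
    return [uid for uid, _ in ranked[:k]]
```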
  • Patent number: 7149688
    Abstract: An approach to multi-lingual speech recognition that permits different words in an utterance to be from different languages. Words from different languages are represented using different sets of sub-word units, each associated with the corresponding language. Despite the use of different sets of sub-word units, the approach enables use of cross-word context at boundaries between words from different languages (cross-language context) to select appropriate variants of the sub-word units to match the context.
    Type: Grant
    Filed: November 4, 2002
    Date of Patent: December 12, 2006
    Assignee: SpeechWorks International, Inc.
    Inventor: Johan Schalkwyk
  • Patent number: 7143042
    Abstract: A computer-implemented graphical design tool allows a developer to graphically author a dialog flow for use in a voice response system and to graphically create an operational link between a hypermedia page and a speech object. The hypermedia page may be a Web site, and the speech object may define a spoken dialog interaction between a person and a machine. Using a drag-and-drop interface, the developer can graphically define a dialog as a sequence of speech objects. The developer can also create a link between a property of any speech object and any field of a Web page, to voice-enable the Web page, or to enable a speech application to access Web site data.
    Type: Grant
    Filed: October 4, 1999
    Date of Patent: November 28, 2006
    Assignee: Nuance Communications
    Inventors: Julian Sinai, Steven C. Ehrlich, Rajesh Ragoobeer
  • Patent number: 7143031
    Abstract: An improved method and system for performing speech intelligibility testing includes calibrating one or more recorded spoken words to have substantially the same sound energy and presenting the one or more calibrated recorded spoken words to a test subject. Speech intelligibility of the test subject is measured by utilizing the one or more calibrated recorded spoken words wherein the speech intelligibility measured is indicative of a percentage of the calibrated word or words that the test subject successfully identified.
    Type: Grant
    Filed: December 18, 2001
    Date of Patent: November 28, 2006
    Assignee: The United States of America as represented by the Secretary of the Army
    Inventor: William A. Ahroon
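The scoring described above reduces to a percentage: the share of calibrated test words the subject identified correctly. A minimal sketch with invented word lists:

```python
def intelligibility_percent(presented, responses):
    """Percentage of presented words the subject identified correctly."""
    correct = sum(1 for w, r in zip(presented, responses) if w == r)
    return 100.0 * correct / len(presented)
```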
  • Patent number: 7139712
    Abstract: A second phoneme is generated in consideration of a phonemic context with respect to a first phoneme as a search target. Phonemic piece data corresponding to the second phoneme is searched out from a database. A third phoneme is generated by changing the phonemic context on the basis of the search result, and phonemic piece data corresponding to the third phoneme is re-searched out from the database. The search or re-search result is registered in a table in correspondence with the second or third phoneme.
    Type: Grant
    Filed: March 5, 1999
    Date of Patent: November 21, 2006
    Assignee: Canon Kabushiki Kaisha
    Inventor: Masayuki Yamada
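The search-with-fallback described above can be sketched as: look up a phoneme under its full phonemic context; if the database has no matching phonemic piece, change (here, simply relax) the context and re-search; then register the result in a table. The database keys and contents are invented:

```python
def find_piece(db, phoneme, context):
    """db: (phoneme, context) -> phonemic piece data.
    Returns the (possibly context-relaxed) key and its piece data."""
    key = (phoneme, context)
    if key in db:
        return key, db[key]              # search hit for the full context
    relaxed = (phoneme, None)            # third phoneme: context changed
    return relaxed, db.get(relaxed)      # re-search with the new context

def build_table(db, queries):
    """Register each search or re-search result against its phoneme key."""
    table = {}
    for phoneme, context in queries:
        key, piece = find_piece(db, phoneme, context)
        table[key] = piece
    return table
```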
  • Patent number: 7139708
    Abstract: A system and method for speech recognition using an enhanced phone set comprises speech data, an enhanced phone set, and a transcription generated by a transcription process. The transcription process selects appropriate phones from the enhanced phone set to represent acoustic-phonetic content of the speech data. The enhanced phone set includes base-phones and composite-phones. A phone dataset includes the speech data and the transcription. The present invention also comprises a transformer that applies transformation rules to the phone dataset to produce a transformed phone dataset. The transformed phone dataset may be utilized in training a speech recognizer, such as a Hidden Markov Model. Various types of transformation rules may be applied to the phone dataset of the present invention to find an optimum transformed phone dataset for training a particular speech recognizer.
    Type: Grant
    Filed: August 4, 1999
    Date of Patent: November 21, 2006
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Lex S. Olorenshaw, Mariscela Amador-Hernandez
  • Patent number: 7139710
    Abstract: A system 10 is provided which is receptive to selective tuning at particular frequencies. The system 10 includes a display device 15, an audio synthesizer 14, and a controller 20. The controller 20 is in communication with the display device 15 and the audio synthesizer 14. The controller 20 communicates with the audio synthesizer 14 when a malfunction is detected with respect to the display device 15. In one embodiment, the audio synthesizer 14 produces an audible announcement of a frequency at which the system 10 is currently tuned.
    Type: Grant
    Filed: November 9, 2000
    Date of Patent: November 21, 2006
    Assignee: Honeywell International, Inc.
    Inventor: Rich Bontrager
  • Patent number: 7136816
    Abstract: A method for generating a prosody model that predicts prosodic parameters is disclosed. Upon receiving text annotated with acoustic features, the method comprises generating first classification and regression trees (CARTs) that predict durations and F0 from text by generating initial boundary labels by considering pauses, generating initial accent labels by applying a simple rule on text-derived features only, adding the predicted accent and boundary labels to feature vectors, and using the feature vectors to generate the first CARTs. The first CARTs are used to predict accent and boundary labels. Next, the first CARTs are used to generate second CARTs that predict durations and F0 from text and acoustic features by using lengthened accented syllables and phrase-final syllables, refining accent and boundary models simultaneously, comparing actual and predicted duration of a whole prosodic phrase to normalize speaking rate, and generating the second CARTs that predict the normalized speaking rate.
    Type: Grant
    Filed: December 24, 2002
    Date of Patent: November 14, 2006
    Assignee: AT&T Corp.
    Inventor: Volker Franz Strom
  • Patent number: 7124076
    Abstract: The present invention relates to encoding apparatuses which allow encoding to be performed such that the occurrence of pre-echo and post-echo is suppressed. A predetermined waveform analysis is applied to a low-frequency-component input time-sequential signal which includes a high-frequency component occurring at a specific time, and a low-frequency-component time-sequential signal like that shown in FIG. 9A is generated according to a result of the analysis. The low-frequency-component time-sequential signal is removed from the input time-sequential signal to generate a residual time-sequential signal like that shown in FIG. 9B. An amplitude control process is applied such that the amplitude of the residual time-sequential signal is made almost constant in a block which serves as a unit of encoding to generate a time-sequential signal to be quantized, like that shown in FIG. 9C. The time-sequential signal to be quantized is quantized and encoded.
    Type: Grant
    Filed: December 14, 2001
    Date of Patent: October 17, 2006
    Assignee: Sony Corporation
    Inventors: Minoru Tsuji, Shiro Suzuki, Keisuke Toyama
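The amplitude-control step described above can be sketched as splitting the residual signal into fixed-size coding blocks and scaling each block so its amplitude is roughly constant before quantization. Block size and target level are invented; the low-frequency separation step is omitted:

```python
def normalize_blocks(residual, block_size=4, target_peak=1.0):
    """Scale each coding block of the residual to a common peak amplitude.
    Returns (normalized samples, per-block gains); the gains would have to
    accompany the coded stream so the decoder can undo the scaling."""
    out, gains = [], []
    for i in range(0, len(residual), block_size):
        block = residual[i:i + block_size]
        peak = max(abs(x) for x in block) or 1.0   # avoid dividing by zero
        gain = target_peak / peak
        gains.append(gain)
        out.extend(x * gain for x in block)
    return out, gains
```

Flattening the amplitude within each block is what keeps a sharp transient from forcing coarse quantization (and hence audible pre-echo) across the whole block.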
  • Patent number: 7124085
    Abstract: A constraint-based speech recognition system for use with a form-filling application employed over a telephone system is disclosed. The system comprises an input signal, wherein the input signal includes both speech input and non-speech input of a type generated by a user via a manually operated device. The system further comprises a constraint module operable to access an information database containing information suitable for use with speech recognition, and to generate candidate information based on the non-speech input and the information database, wherein the candidate information corresponds to a portion of the information. The system further comprises a speech recognition module operable to recognize speech based on the speech input and the candidate information. In an exemplary embodiment, the manually operated device is a touch-tone telephone keypad, and the information database is a lexicon encoded according to classes defined by the keys of the keypad.
    Type: Grant
    Filed: December 13, 2001
    Date of Patent: October 17, 2006
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Jean-Claude Junqua, Matteo Contolini
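The constraint described above can be sketched as follows: the user keys a word on a touch-tone keypad, each key defining a class of letters, and the lexicon is narrowed to words consistent with that key sequence, giving the recognizer a small candidate set. The toy lexicon is invented; the key-to-letter classes are the standard keypad layout:

```python
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}

def word_to_keys(word):
    """Encode a word as the keypad digits of its letters' classes."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())

def candidates(lexicon, keyed_digits):
    """Lexicon words consistent with the user's keypad input; the speech
    recognizer then only has to distinguish among these."""
    return [w for w in lexicon if word_to_keys(w) == keyed_digits]
```

Keying "228" collapses a large lexicon to the handful of words whose letters fall in classes 2-2-8, which is exactly the candidate information the recognizer consumes.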