Update Patterns Patents (Class 704/244)
-
Patent number: 7957968
Abstract: The invention includes a computer-based system or method for automatically generating a grammar associated with a first task comprising the steps of: receiving first data representing the first task based on responses received from a distributed network; automatically tagging the first data into parts of speech to form first tagged data; identifying filler words and core words from said first tagged data; modeling sentence structure based upon said first tagged data using a first set of rules; identifying synonyms of said core words; and creating the grammar for the first task using said modeled sentence structure, first tagged data and said synonyms.
Type: Grant
Filed: December 12, 2006
Date of Patent: June 7, 2011
Assignee: Honda Motor Co., Ltd.
Inventors: Rakesh Gupta, Ken Hennacy
-
Publication number: 20110119059
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
Type: Application
Filed: November 13, 2009
Publication date: May 19, 2011
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Andrej LJOLJE, Bernard S. RENGER, Steven Neil TISCHER
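The model-selection fallback described in this abstract (user-specific supervised model, then unsupervised model, then generic model) can be sketched as follows. The dictionary-based model stores and the name `select_model` are illustrative assumptions, not the publication's actual API.

```python
def select_model(user_id, supervised, unsupervised, generic="generic"):
    """Return the best available speech model for a user.

    supervised and unsupervised map user_id -> model name; the generic
    model is the last resort when no user-specific model exists.
    """
    if user_id in supervised:        # user-specific supervised model first
        return supervised[user_id]
    if user_id in unsupervised:      # fall back to the unsupervised model
        return unsupervised[user_id]
    return generic                   # finally, the generic model
```

Recognition then proceeds with whichever model the cascade returned.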
-
Patent number: 7933777
Abstract: A hybrid speech recognition system uses a client-side speech recognition engine and a server-side speech recognition engine to produce speech recognition results for the same speech. An arbitration engine produces speech recognition output based on one or both of the client-side and server-side speech recognition results.Type: Grant
Filed: August 30, 2009
Date of Patent: April 26, 2011
Assignee: Multimodal Technologies, Inc.
Inventor: Detlef Koll
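A minimal sketch of the arbitration step in such a hybrid system. Choosing by confidence score is an assumption made for illustration; the patent does not commit to one specific arbitration rule.

```python
def arbitrate(client_result, server_result):
    """Pick one recognition output from client- and server-side results.

    Each result is a (text, confidence) pair or None if that engine
    produced nothing; the more confident available result wins.
    """
    if client_result is None:
        return server_result
    if server_result is None:
        return client_result
    return max(client_result, server_result, key=lambda r: r[1])
```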
-
Patent number: 7933774
Abstract: A system and method is provided for rapidly generating a new spoken dialog application. In one embodiment, a user experience person labels the transcribed data (e.g., 3000 utterances) using a set of interactive tools. The labeled data is then stored in a processed data database. During the labeling process, the user experience person not only groups utterances in various call type categories, but also flags (e.g., 100-200) specific utterances as positive and negative examples for use in an annotation guide. The labeled data in the processed data database can also be used to generate an initial natural language understanding (NLU) model.
Type: Grant
Filed: March 18, 2004
Date of Patent: April 26, 2011
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Lee Begeja, Mazin G. Rahim, Allen Louis Gorin, Behzad Shahraray, David Crawford Gibbon, Zhu Liu, Bernard S. Renger, Patrick Guy Haffner, Harris Drucker, Steven Hart Lewis
-
Patent number: 7925505
Abstract: Architecture is disclosed herewith for minimizing an empirical error rate by discriminative adaptation of a statistical language model in a dictation and/or dialog application. The architecture allows assignment of an improved weighting value to each term or phrase to reduce empirical error. Empirical errors are minimized whether a user provides correction results or not based on criteria for discriminatively adapting the user language model (LM)/context-free grammar (CFG) to the target. Moreover, algorithms are provided for the training and adaptation processes of LM/CFG parameters for criteria optimization.
Type: Grant
Filed: April 10, 2007
Date of Patent: April 12, 2011
Assignee: Microsoft Corporation
Inventor: Jian Wu
-
Publication number: 20110082697
Abstract: A method is described for correcting and improving the functioning of certain devices for the diagnosis and treatment of speech that dynamically measure the functioning of the velum in the control of nasality during speech. The correction method uses an estimate of the vowel frequency spectrum to greatly reduce the variation of nasalance with the vowel being spoken, so as to result in a corrected value of nasalance that reflects with greater accuracy the degree of velar opening. Correction is also described for reducing the effect on nasalance values of energy from the oral and nasal channels crossing over into the other channel because of imperfect acoustic separation.
Type: Application
Filed: October 6, 2009
Publication date: April 7, 2011
Applicant: Rothenberg Enterprises
Inventor: Martin ROTHENBERG
-
Publication number: 20110077942
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for handling expected repeat speech queries or other inputs. The method causes a computing device to detect a misrecognized speech query from a user, determine a tendency of the user to repeat speech queries based on previous user interactions, and adapt a speech recognition model based on the determined tendency before an expected repeat speech query. The method can further include recognizing the expected repeat speech query from the user based on the adapted speech recognition model. Adapting the speech recognition model can include modifying an acoustic model, a language model, and/or a semantic model. Adapting the speech recognition model can also include preparing a personalized search speech recognition model for the expected repeat query based on usage history and entries in a recognition lattice. The method can include retaining unmodified speech recognition models with adapted speech recognition models.
Type: Application
Filed: September 30, 2009
Publication date: March 31, 2011
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Andrej LJOLJE, Diamantino Antonio Caseiro
-
Patent number: 7912828
Abstract: A computer-based method for identifying patterns in computer text using structures defining types of patterns which are to be identified, wherein a structure comprises one or more definition items, the method comprising assigning a weighting to each structure and each definition item; searching the computer text for a pattern to be identified on the basis of a particular structure, a pattern being provisionally identified if it matches the definition given by said particular structure; in a provisionally identified pattern, determining those of the definition items making up said particular structure that have been identified in the provisionally identified pattern; combining the weightings of the determined definition items and optionally, the weighting of the particular structure, to a single quantity; assessing whether the single quantity fulfils a given condition; depending on the result of said assessment, rejecting or confirming the provisionally identified pattern.
Type: Grant
Filed: February 23, 2007
Date of Patent: March 22, 2011
Assignee: Apple Inc.
Inventors: Olivier Bonnet, Frédéric De Jaeger, Toby Paterson
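The confirmation step in this abstract (combine the weights of the matched definition items, optionally the structure's own weight, and test the result against a condition) can be sketched as below. Summing the weights and comparing against a fixed threshold are illustrative assumptions; the patent leaves the combining rule and condition open.

```python
def confirm_pattern(matched_items, item_weights, structure_weight=0.0,
                    threshold=1.0):
    """Confirm or reject a provisionally identified pattern.

    matched_items: definition items found in the provisional match.
    item_weights: mapping from definition item to its weighting.
    """
    score = structure_weight + sum(item_weights[i] for i in matched_items)
    return score >= threshold
```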
-
Patent number: 7904298
Abstract: This disclosure describes a practical system/method for predicting spoken text (a spoken word or a spoken sentence/phrase) given that text's partial spelling (example, initial characters forming the spelling of a word/sentence). The partial spelling may be given using "Speech" or may be inputted using the keyboard/keypad or may be obtained using other input methods. The disclosed system is an alternative method for inputting text into devices; the method is faster (especially for long words or phrases) compared to existing predictive-text-input and/or word-completion methods.
Type: Grant
Filed: November 16, 2007
Date of Patent: March 8, 2011
Inventor: Ashwin P. Rao
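The core idea, predicting a word from its initial characters, reduces to prefix matching over a vocabulary. A frequency-ranked vocabulary is an assumption made for this sketch; the patent's actual prediction model is not specified here.

```python
def predict(prefix, vocabulary):
    """Return vocabulary words starting with prefix, most frequent first.

    vocabulary maps word -> usage count (assumed available from, e.g.,
    the user's input history).
    """
    candidates = [w for w in vocabulary if w.startswith(prefix)]
    return sorted(candidates, key=lambda w: -vocabulary[w])
```

For example, a spoken or typed partial spelling "rec" would rank "recognize" above "record" if the former is more frequent.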
-
Patent number: 7899826
Abstract: Determining a semantic relationship is disclosed. Source content is received. Cluster analysis is performed at least in part by using at least a portion of the source content. At least a portion of a result of the cluster analysis is used to determine the semantic relationship between two or more content elements comprising the source content.
Type: Grant
Filed: August 31, 2009
Date of Patent: March 1, 2011
Assignee: Apple Inc.
Inventors: Philip Andrew Mansfield, Michael Robert Levy, Yuri Khramov, Darryl Will Fuller
-
Patent number: 7890328
Abstract: Disclosed herein are methods and systems for recognizing speech. A method embodiment comprises comparing received speech with a precompiled grammar based on a database and if the received speech matches data in the precompiled grammar then returning a result based on the matched data. If the received speech does not match data in the precompiled grammar, then dynamically compiling a new grammar based only on new data added to the database after the compiling of the precompiled grammar. The database may comprise a directory of names.
Type: Grant
Filed: September 7, 2006
Date of Patent: February 15, 2011
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Harry Blanchard, Steven Lewis, Shankarnarayan Sivaprasad, Lan Zhang
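The two-tier lookup described above (match against the precompiled grammar first; on a miss, compile a grammar only from records added after precompilation) can be sketched as follows. The data structures and timestamp-based "new data" test are illustrative assumptions.

```python
def recognize(speech, precompiled, database, compiled_at):
    """Match speech against a precompiled grammar, then new entries only.

    precompiled: set of names in the precompiled grammar.
    database: list of (name, added_at) records, e.g. a name directory.
    compiled_at: timestamp of the precompiled grammar.
    """
    if speech in precompiled:
        return speech
    # Dynamically compile only entries newer than the precompiled grammar.
    new_grammar = {name for name, added in database if added > compiled_at}
    return speech if speech in new_grammar else None
```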
-
Patent number: 7885812
Abstract: Parameters for a feature extractor and acoustic model of a speech recognition module are trained. An objective function is utilized to determine values for the feature extractor parameters and the acoustic model parameters.
Type: Grant
Filed: November 15, 2006
Date of Patent: February 8, 2011
Assignee: Microsoft Corporation
Inventors: Alejandro Acero, James G. Droppo, Milind V. Mahajan
-
Patent number: 7881932
Abstract: The present invention extends the VoiceXML language model to natively support voice enrolled grammars. Specifically, three VoiceXML tags can be added to the language model to add, modify, and delete acoustically provided phrases to voice enrolled grammars. Once created, the voice enrolled grammars can be used in normal speaker dependent speech recognition operations. That is, the voice enrolled grammars can be referenced and utilized just like text enrolled grammars can be referenced and utilized. For example, using the present invention, voice enrolled grammars can be referenced by standard text-based Speech Recognition Grammar Specification (SRGS) grammars to create more complex, usable grammars.
Type: Grant
Filed: October 2, 2006
Date of Patent: February 1, 2011
Assignee: Nuance Communications, Inc.
Inventor: Brien H. Muschett
-
Patent number: 7877258
Abstract: Systems, methods, and apparatuses, including computer program products, are provided for representing language models. In some implementations, a computer-implemented method is provided. The method includes generating a compact language model including receiving a collection of n-grams from the corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus and generating a trie representing the collection of n-grams. The method also includes using the language model to identify a second probability of a particular string of words occurring.
Type: Grant
Filed: March 29, 2007
Date of Patent: January 25, 2011
Assignee: Google Inc.
Inventors: Ciprian Chelba, Thorsten Brants
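A trie over n-grams, as mentioned in this abstract, shares prefixes between n-grams so the collection is stored compactly. The minimal structure below only illustrates the trie idea; the patent's actual encoding is more compact than plain dictionaries.

```python
class NgramTrie:
    """A trie keyed by words; shared prefixes are stored once."""

    def __init__(self):
        self.children = {}   # word -> NgramTrie
        self.prob = None     # probability of the n-gram ending at this node

    def insert(self, ngram, prob):
        node = self
        for word in ngram:
            node = node.children.setdefault(word, NgramTrie())
        node.prob = prob

    def lookup(self, ngram):
        node = self
        for word in ngram:
            if word not in node.children:
                return None
            node = node.children[word]
        return node.prob
```

Inserting ("the", "cat") and ("the", "dog") stores the shared prefix "the" only once.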
-
Publication number: 20100318355
Abstract: Techniques and systems for training an acoustic model are described. In an embodiment, a technique for training an acoustic model includes dividing a corpus of training data that includes transcription errors into N parts, and on each part, decoding an utterance with an incremental acoustic model and an incremental language model to produce a decoded transcription. The technique may further include inserting silence between a pair of words into the decoded transcription and aligning an original transcription corresponding to the utterance with the decoded transcription according to time for each part. The technique may further include selecting a segment from the utterance having at least Q contiguous matching aligned words, and training the incremental acoustic model with the selected segment. The trained incremental acoustic model may then be used on a subsequent part of the training data. Other embodiments are described and claimed.
Type: Application
Filed: June 10, 2009
Publication date: December 16, 2010
Applicant: MICROSOFT CORPORATION
Inventors: Jinyu Li, Yifan Gong, Chaojun Liu, Kaisheng Yao
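The segment-selection step above (keep only runs of at least Q contiguous matching aligned words as trusted training material) can be sketched as below. The time alignment and silence insertion are omitted; the function assumes the two transcriptions are already aligned word-by-word, which is a simplification of the publication's time-based alignment.

```python
def matching_segments(original, decoded, q):
    """Return (start, end) index pairs of runs of >= q matching words.

    original and decoded are equal-length, already-aligned word lists;
    end is exclusive.
    """
    segments, start = [], None
    for i, (a, b) in enumerate(zip(original, decoded)):
        if a == b:
            if start is None:
                start = i           # a matching run begins
        else:
            if start is not None and i - start >= q:
                segments.append((start, i))
            start = None            # run broken by a mismatch
    if start is not None and len(original) - start >= q:
        segments.append((start, len(original)))
    return segments
```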
-
Patent number: 7853448
Abstract: An electronic instrument includes: a display control unit for displaying a control content corresponding to the command information based on the result of the speech recognition; an instruction unit for instructing that a control for the control content displayed by the display control unit is cancelled; a control unit for performing the control based on the command information based on the result of the speech recognition after a predetermined standby time elapses since the control content corresponding to the command information based on the result of the speech recognition starts to be displayed by the display control unit when the instruction unit does not instruct that the control for the control content is cancelled within the predetermined standby time, and for canceling the control based on the command information based on the result of the speech recognition when the instruction unit instructs that the control is cancelled within the predetermined standby time.
Type: Grant
Filed: April 16, 2007
Date of Patent: December 14, 2010
Assignee: Funai Electric Co., Ltd.
Inventors: Shusuke Narita, Susumu Tokoshima
-
Publication number: 20100312556
Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions.
Type: Application
Filed: June 9, 2009
Publication date: December 9, 2010
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Andrej LJOLJE, Alistair D. CONKIE, Ann K. SYRDAL
-
Patent number: 7844457
Abstract: Methods are disclosed for automatic accent labeling without manually labeled data. The methods are designed to exploit accent distribution between function and content words.
Type: Grant
Filed: February 20, 2007
Date of Patent: November 30, 2010
Assignee: Microsoft Corporation
Inventors: YiNing Chen, Frank Kao-ping Soong, Min Chu
-
Patent number: 7823144
Abstract: A method, apparatus and computer program product for comparing two computer program codes is disclosed. For each code, a stream of lexemes is generated for the program text of each code. The streams are concatenated in the same order as the program text. The two concatenated streams of lexemes are compared on a language-type by language-type basis to identify lexemes present only in one stream. The comparison derives a set of edit operations including minimal text block moves needed to convert one program code into the other program code.
Type: Grant
Filed: December 29, 2005
Date of Patent: October 26, 2010
Assignee: International Business Machines Corporation
Inventors: Donald P Pazel, Pradeep Varma
-
Patent number: 7813926
Abstract: A training system for a speech recognition application is disclosed. In embodiments described, the training system is used to train a classification model or language model. The classification model is trained using an adaptive language model generated by an iterative training process. In embodiments described, the training data is recognized by the speech recognition component and the recognized text is used to create the adaptive language model which is used for speech recognition in a following training iteration.
Type: Grant
Filed: March 16, 2006
Date of Patent: October 12, 2010
Assignee: Microsoft Corporation
Inventors: Ye-Yi Wang, John Sie Yuen Lee, Alex Acero
-
Publication number: 20100256978
Abstract: A method for performing speech recognition relating to an object for the purpose of affecting automatic processing of the object by a processing system. The object carries information with at least a character string of processing information. The character string spoken by an operator is processed by way of a speech recognition procedure to generate a first result. Based on the need for more information about an element of the first result, additional processing data is requested. An operator's response generates a second result. The first result is then modified to achieve consistency with the operator's response.
Type: Application
Filed: April 6, 2010
Publication date: October 7, 2010
Applicant: SIEMENS AKTIENGESELLSCHAFT
Inventor: Walter Rosenbaum
-
Patent number: 7797156
Abstract: Presented herein are systems and methods for generating an adaptive noise codebook for use with electronic speech systems. The noise codebook includes a plurality of entries which may be updated based on environmental noise sounds. The speech system includes a speech codebook and the adaptive noise codebook. The system identifies speech sounds in an audio signal using the speech and noise codebooks.
Type: Grant
Filed: February 15, 2006
Date of Patent: September 14, 2010
Assignee: Raytheon BBN Technologies Corp.
Inventors: Robert David Preuss, Darren Ross Fabbri, Daniel Ramsay Cruthirds
-
Publication number: 20100204988
Abstract: A speech recognition method includes receiving a speech input signal in a first noise environment which includes a sequence of observations, determining the likelihood of a sequence of words arising from the sequence of observations using an acoustic model, adapting the model trained in a second noise environment to that of the first environment, wherein adapting the model trained in the second environment to that of the first environment includes using second order or higher order Taylor expansion coefficients derived for a group of probability distributions and the same expansion coefficient is used for the whole group.
Type: Application
Filed: April 20, 2010
Publication date: August 12, 2010
Inventors: Haitian XU, Kean Kheong Chin
-
Patent number: 7769588
Abstract: The method of operating a man-machine interface unit includes classifying at least one utterance of a speaker to be of a first type or of a second type. If the utterance is classified to be of the first type, the utterance belongs to a known speaker of a speaker data base, and if the utterance is classified to be of the second type, the utterance belongs to an unknown speaker that is not included in the speaker data base. The method also includes storing a set of utterances of the second type, clustering the set of utterances into clusters, wherein each cluster comprises utterances having similar features, and automatically adding a new speaker to the speaker data base based on utterances of one of the clusters.
Type: Grant
Filed: August 20, 2008
Date of Patent: August 3, 2010
Assignee: Sony Deutschland GmbH
Inventors: Ralf Kompe, Thomas Kemp
-
Publication number: 20100191530
Abstract: A speech understanding apparatus includes a speech recognition unit which performs speech recognition of an utterance using multiple language models, and outputs multiple speech recognition results obtained by the speech recognition, a language understanding unit which uses multiple language understanding models to perform language understanding for each of the multiple speech recognition results output from the speech recognition unit, and outputs multiple speech understanding results obtained from the language understanding, and an integrating unit which calculates, based on values representing features of the speech understanding results, utterance batch confidences that numerically express accuracy of the speech understanding results for each of the multiple speech understanding results output from the language understanding unit, and selects one of the speech understanding results with a highest utterance batch confidence among the calculated utterance batch confidences.
Type: Application
Filed: January 22, 2010
Publication date: July 29, 2010
Applicant: HONDA MOTOR CO., LTD.
Inventors: Mikio NAKANO, Masaki KATSUMARU, Kotaro FUNAKOSHI, Hiroshi OKUNO
-
Patent number: 7761302
Abstract: A method for generating output data identifying a class of a predetermined plurality of classes. The method comprises receiving data representing an acoustic signal; determining an amplitude of at least a first predetermined frequency component of said acoustic signal; and comparing the or each amplitude with a respective primary threshold; and generating output data identifying one of said classes to which said acoustic signal should be allocated, based upon said comparison.
Type: Grant
Filed: June 7, 2005
Date of Patent: July 20, 2010
Assignee: South Manchester University Hospitals NHS Trust
Inventors: Ashley Arthur Woodcock, Jaclyn Ann Smith, Kevin McGuinness
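The measurement and comparison steps above can be sketched with a direct DFT projection onto each predetermined frequency. The two-class decision rule (all thresholds exceeded or not) and the class names are assumptions for illustration; the patent only requires classification based on the threshold comparison.

```python
import math

def component_amplitude(signal, sample_rate, freq):
    """Amplitude of one frequency component via a direct DFT projection."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / sample_rate)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / sample_rate)
             for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n

def classify(signal, sample_rate, freqs, thresholds):
    """Allocate the signal to a class by comparing each predetermined
    frequency component's amplitude with its respective threshold."""
    amps = [component_amplitude(signal, sample_rate, f) for f in freqs]
    exceeded = all(a > t for a, t in zip(amps, thresholds))
    return "class_a" if exceeded else "class_b"
```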
-
Publication number: 20100179812
Abstract: Provided are an apparatus and method for recognizing voice commands, the apparatus including: a voice command recognition unit which recognizes an input voice command; a voice command recognition learning unit which learns a recognition-targeted voice command; and a controller which controls the voice command recognition unit to recognize the recognition-targeted voice command from an input voice command, controls the voice command recognition learning unit to learn the input voice command if the voice command recognition is unsuccessful, and performs a particular operation corresponding to the recognized voice command if the voice command recognition is successful.
Type: Application
Filed: September 2, 2009
Publication date: July 15, 2010
Applicant: Samsung Electronics Co., Ltd.
Inventors: Jong-hyuk Jang, Seung-kwon Park, Jong-ho Lea
-
Patent number: 7756708
Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
Type: Grant
Filed: April 3, 2006
Date of Patent: July 13, 2010
Assignee: Google Inc.
Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno
-
Publication number: 20100169094
Abstract: A speaker adaptation apparatus includes an acquiring unit configured to acquire an acoustic model including HMMs and decision trees for estimating what type of the phoneme or the word is included in a feature value used for speech recognition, the HMMs having a plurality of states on a phoneme-to-phoneme basis or a word-to-word basis, and the decision trees being configured to reply to questions relating to the feature value and output likelihoods in the respective states of the HMMs, and a speaker adaptation unit configured to adapt the decision trees to a speaker, the decision trees being adapted using speaker adaptation data vocalized by the speaker of an input speech.
Type: Application
Filed: September 17, 2009
Publication date: July 1, 2010
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Masami Akamine, Jitendra Ajmera, Partha Lal
-
Publication number: 20100161332
Abstract: A method and apparatus are provided that use narrowband data and wideband data to train a wideband acoustic model.
Type: Application
Filed: March 8, 2010
Publication date: June 24, 2010
Applicant: MICROSOFT CORPORATION
Inventors: Michael L. Seltzer, Alejandro Acero
-
Publication number: 20100161331
Abstract: In many application environments, it is desirable to provide voice access to tables on Internet pages, where the user asks a subject-related question in a natural language and receives an adequate answer from the table read out to him in a natural language. A method is disclosed for preparing information presented in a tabular form for a speech dialogue system so that the information of the table can be consulted in a user dialogue in a targeted manner.
Type: Application
Filed: October 25, 2006
Publication date: June 24, 2010
Applicant: Siemens Aktiengesellschaft
Inventors: Hans-Ulrich Block, Manfred Gehrke, Stefanie Schachchti
-
Publication number: 20100161327
Abstract: A computer-implemented method for automatically analyzing, predicting, and/or modifying acoustic units of prosodic human speech utterances for use in speech synthesis or speech recognition. Possible steps include: initiating analysis of acoustic wave data representing the human speech utterances, via the phase state of the acoustic wave data; using one or more phase state defined acoustic wave metrics as common elements for analyzing, and optionally modifying, pitch, amplitude, duration, and other measurable acoustic parameters of the acoustic wave data, at predetermined time intervals; analyzing acoustic wave data representing a selected acoustic unit to determine the phase state of the acoustic unit; and analyzing the acoustic wave data representing the selected acoustic unit to determine at least one acoustic parameter of the acoustic unit with reference to the determined phase state of the selected acoustic unit. Also included are systems for implementing the described and related methods.
Type: Application
Filed: December 16, 2009
Publication date: June 24, 2010
Inventors: Nishant CHANDRA, Reiner Wilhelms-Tricarico, Rattima Nitisaroj, Brian Mottershead, Gary A. Marple, John B. Reichenbach
-
Patent number: 7742919
Abstract: The present invention provides various elements of a toolkit used for generating a TTS voice for use in a spoken dialog system. The embodiments in each case may be in the form of the system, a computer-readable medium or a method for generating the TTS voice. One embodiment of the invention relates to a method of correcting a database associated with the development of a text-to-speech (TTS) voice. The method comprises generating a pronunciation dictionary for use with a TTS voice, generating a TTS voice to a stage wherein it is prepared to be tested before being deployed, identifying mislabeled phonetic units associated with the TTS voice, for each identified mislabeled phonetic unit, linking to an entry within the pronunciation dictionary to correct the entry and deleting utterances and all associated data for unacceptable utterances.
Type: Grant
Filed: September 27, 2005
Date of Patent: June 22, 2010
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Steven Lawrence Davis, Shane Fetters, David Eugene Schulz, Beverly Gustafson, Louise Loney
-
Patent number: 7734466
Abstract: A method for reducing a computational complexity of an m-stage adaptive filter is provided by updating recursively forward and backward error prediction square terms for a first portion of a length of the adaptive filter, and keeping the updated forward and backward error prediction square terms constant for a second portion of the length of the adaptive filter.
Type: Grant
Filed: April 7, 2006
Date of Patent: June 8, 2010
Assignee: Motorola, Inc.
Inventors: David L. Barron, Kyle K. Iwai, James B. Piket
-
Patent number: 7725316
Abstract: A speech recognition adaptation method for a vehicle having a telematics unit with an embedded speech recognition system. Speech is received and pre-processed to generate acoustic feature vectors, and an adaptation parameter is applied to the acoustic feature vectors to yield transformed acoustic feature vectors. The transformed acoustic feature vectors are decoded and a hypothesis of the speech is selected, and the adaptation parameter is trained using acoustic feature vectors from the hypothesis. The method also includes one or more of the following steps: the speech is observed for a certain characteristic and the trained adaptation parameter is saved in accordance with the certain characteristic for use in transforming feature vectors of subsequent speech having the certain characteristic; use of the trained adaptation parameter persists from one vehicle ignition cycle to the next; and use of the trained adaptation parameter is ceased upon detection of a system fault.
Type: Grant
Filed: July 5, 2006
Date of Patent: May 25, 2010
Assignee: General Motors LLC
Inventors: Rathinavelu Chengalvarayan, John J Correia, Scott M Pennock
-
Patent number: 7720681
Abstract: Generally described, the present invention is directed toward generating, maintaining, updating, and applying digital voice profiles. Voice profiles may be generated for individuals. The voice profiles include information that is unique to each individual and which may be applied to digital representations of that individual's voice to improve the quality of a transmitted digital representation of that individual's voice. A voice profile may include, but is not limited to, basic information about the individual, and filter definitions relating to the individual's voice patterns, such as a frequency range and amplitude range. The voice profile may also include a speech definition that includes digital representations of the individual's unique speech patterns.
Type: Grant
Filed: March 23, 2006
Date of Patent: May 18, 2010
Assignee: Microsoft Corporation
Inventors: David Milstein, Kuansan Wang, Linda Criddle
-
Patent number: 7716049
Abstract: An apparatus for providing adaptive language model scaling includes an adaptive scaling element and an interface element. The adaptive scaling element is configured to receive input speech comprising a sequence of spoken words and to determine a plurality of candidate sequences of text words in which each of the candidate sequences has a corresponding sentence score representing a probability that a candidate sequence matches the sequence of spoken words. Each corresponding sentence score is calculated using an adaptive scaling factor. The interface element is configured to receive a user input selecting one of the candidate sequences. The adaptive scaling element is further configured to estimate an objective function based on the user input and to modify the adaptive scaling factor based on the estimated objective function.
Type: Grant
Filed: June 30, 2006
Date of Patent: May 11, 2010
Assignee: Nokia Corporation
Inventor: Jilei Tian
-
Publication number: 20100100380
Abstract: A system, method and computer-readable medium provide a multitask learning method for intent or call-type classification in a spoken language understanding system. Multitask learning aims at training tasks in parallel while using a shared representation. A computing device automatically re-uses the existing labeled data from various applications, which are similar but may have different call-types, intents or intent distributions to improve the performance. An automated intent mapping algorithm operates across applications. In one aspect, active learning is employed to selectively sample the data to be re-used.
Type: Application
Filed: December 28, 2009
Publication date: April 22, 2010
Applicant: AT&T Corp.
Inventor: Gokhan Tur
-
Patent number: 7702509
Abstract: Pronunciation for an input word is modeled by generating a set of candidate phoneme strings having pronunciations close to the input word in an orthographic space. Phoneme sub-strings in the set are selected as the pronunciation. In one aspect, a first closeness measure between phoneme strings for words chosen from a dictionary and contexts within the input word is used to determine the candidate phoneme strings. The words are chosen from the dictionary based on a second closeness measure between a representation of the input word in the orthographic space and orthographic anchors corresponding to the words in the dictionary. In another aspect, the phoneme sub-strings are selected by aligning the candidate phoneme strings on common phoneme sub-strings to produce an occurrence count, which is used to choose the phoneme sub-strings for the pronunciation.
Type: Grant
Filed: November 21, 2006
Date of Patent: April 20, 2010
Assignee: Apple Inc.
Inventor: Jerome R. Bellegarda
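For illustration only, the "second closeness measure" between an input word and dictionary entries in orthographic space can be stood in for by character n-gram overlap; the patent's orthographic anchors are a different construction, and the function names here are invented:

```python
def char_ngrams(word, n=3):
    """Character n-grams with boundary markers, e.g. '#na', 'nat', ..."""
    word = f"#{word}#"
    return {word[i:i + n] for i in range(len(word) - n + 1)}

def orthographic_closeness(a, b, n=3):
    """Jaccard overlap of character n-grams: a simple stand-in for a
    closeness measure in orthographic space."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    return len(ga & gb) / len(ga | gb)

def nearest_dictionary_words(word, dictionary, k=3):
    """The k dictionary words orthographically closest to the input word,
    whose phoneme strings would seed the candidate set."""
    return sorted(dictionary, key=lambda w: -orthographic_closeness(word, w))[:k]
```

The phoneme strings of the returned neighbors would then be aligned and voted over to pick the output pronunciation.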
-
Publication number: 20100094629
Abstract: A weighting factor learning system includes an audio recognition section that recognizes learning audio data and outputs the recognition result; a weighting factor updating section that updates a weighting factor applied to a score obtained from an acoustic model and a language model so that the difference between a correct-answer score, calculated using a correct-answer text of the learning audio data, and a score of the recognition result becomes large; a convergence determination section that determines, using the score after updating, whether to return to the weighting factor updating section to update the weighting factor again; and a weighting factor convergence determination section that determines, using the score after updating, whether to return to the audio recognition section to perform the process again and update the weighting factor using the weighting factor updating section.
Type: Application
Filed: February 19, 2008
Publication date: April 15, 2010
Inventors: Tadashi Emori, Yoshifumi Onishi
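A toy version of the discriminative update, assuming a single language-model weight and log-domain scores; the gradient step and tolerance are illustrative assumptions, not the publication's actual procedure:

```python
def combined_score(acoustic_logprob, lm_logprob, weight):
    """Score of a hypothesis under the current weighting factor."""
    return acoustic_logprob + weight * lm_logprob

def update_weight(weight, correct, hypo, lr=0.05):
    """correct, hypo: (acoustic_logprob, lm_logprob) for the correct-answer
    text and the current 1-best recognition result. Take a gradient step
    that widens the margin correct_score - hypo_score."""
    grad = correct[1] - hypo[1]  # d(margin)/d(weight)
    return weight + lr * grad

def converged(old_weight, new_weight, tol=1e-3):
    """Stop iterating once the weight barely moves."""
    return abs(new_weight - old_weight) < tol
```

Re-recognizing with the new weight and repeating until `converged` holds mirrors the two determination loops in the abstract.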
-
Publication number: 20100076968
Abstract: Implementations relate to systems and methods for aggregating and presenting data related to geographic locations. Geotag data related to geographic locations and associated features or attributes can be collected to build a regional profile characterizing a set of locations within the region. Geotag data related to the constituent locations, such as user ratings or popularity ranks for restaurants, shops, parks, or other features, sites, or attractions, can be combined to generate a profile of characteristics of locations in the region. The platform can generate recommendations of locations to transmit to the user of a mobile device, based for instance on the location of the device in the region as reported by GPS or other location service and the regional profile. Geotag data can include audio data analyzed using region-specific terms, and user recommendations can be presented via dynamic menus based on regional profiles, user preferences or other criteria.
Type: Application
Filed: May 21, 2009
Publication date: March 25, 2010
Inventors: Mark R. BOYNS, Chand MEHTA, Jeffrey C. TSAY, Giridhar D. MANDYAM
-
Patent number: 7676366
Abstract: A speaker adaptation system and method for speech models of symbols displays a multi-word symbol to be spoken as a symbol. The supervised adaptation system and method has unsupervised adaptation for multi-word symbols, limited to the set of words associated with each multi-word symbol.
Type: Grant
Filed: January 13, 2003
Date of Patent: March 9, 2010
Assignee: Art Advanced Recognition Technologies Inc.
Inventors: Ran Mochary, Sasi Solomon, Tal El-Hay, Tal Yadid, Itamar Bartur
-
Patent number: 7664640
Abstract: A signal processing system is disclosed which is implemented using a Gaussian Mixture Model (GMM) based Hidden Markov Model (HMM), or a GMM alone, the parameters of which are constrained during its optimization procedure. Also disclosed is a constraint system applied to input vectors representing the input signal to the system. The invention is particularly, but not exclusively, related to speech recognition systems. The invention reduces the tendency, common in prior-art systems, to get caught in local minima associated with highly anisotropic Gaussian components, which degrades recognizer performance, by employing the constraint system above, whereby the anisotropy of such components is minimized. The invention also covers a method of processing a signal, and a speech recognizer trained according to the method.
Type: Grant
Filed: March 24, 2003
Date of Patent: February 16, 2010
Assignee: Qinetiq Limited
Inventor: Christopher John St. Clair Webber
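One simple way to bound a Gaussian component's anisotropy, offered only as a sketch of the general idea (the patent's actual constraint system is not reproduced here), is to floor the diagonal variances so the largest-to-smallest ratio cannot exceed a cap:

```python
def constrain_variances(variances, max_ratio=100.0):
    """Floor each diagonal variance of a Gaussian component so that
    max(var) / min(var) <= max_ratio, limiting how elongated (anisotropic)
    the component can become during EM-style optimization."""
    floor = max(variances) / max_ratio
    return [max(v, floor) for v in variances]
```

Applied after each re-estimation step, this keeps components from collapsing along single dimensions, which is one common source of the degenerate local minima the abstract mentions.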
-
Patent number: 7660715
Abstract: A system and method to improve the automatic adaptation of one or more speech models in automatic speech recognition systems. After a dialog begins, the dialog asks the customer to provide spoken input, which is recorded. If the speech recognizer determines it may not have correctly transcribed the verbal response, i.e., the voice input, the invention uses monitoring and, if necessary, intervention to guarantee that the next transcription of the verbal response is correct. The dialog asks the customer to repeat the verbal response, which is recorded, and a transcription of the input is sent to a human monitor, i.e., an agent or operator. If the transcription of the spoken input is correct, the human does not intervene and the transcription remains unmodified. If the transcription of the verbal response is incorrect, the human intervenes and the transcription of the misrecognized word is corrected. In both cases, the dialog asks the customer to confirm the unmodified or corrected transcription.
Type: Grant
Filed: January 12, 2004
Date of Patent: February 9, 2010
Assignee: Avaya Inc.
Inventor: David Preshan Thambiratnam
-
Patent number: 7660717
Abstract: Speech recognition is performed by matching, for each speech frame of an inputted speech, a characteristic quantity of the speech against a composite HMM obtained by synthesizing a speech HMM (hidden Markov model) and a noise HMM.
Type: Grant
Filed: January 9, 2008
Date of Patent: February 9, 2010
Assignee: Nuance Communications, Inc.
Inventors: Tetsuya Takiguchi, Masafumi Nishimura
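Composite speech+noise models typically rest on the additive-noise assumption: in the linear spectral domain the speech and noise contributions add, which in the log-spectral domain becomes a log-add of the two models' means. A minimal per-bin sketch of that combination, not the patented synthesis itself:

```python
import math

def combine_log_means(speech_log_mean, noise_log_mean):
    """Combine a speech-state mean and a noise-state mean (both in the
    log-spectral domain, one value per frequency bin) under the additive
    assumption: log(exp(s) + exp(n)) per bin."""
    return [math.log(math.exp(s) + math.exp(n))
            for s, n in zip(speech_log_mean, noise_log_mean)]
```

Doing this for every (speech state, noise state) pair yields the states of a composite model that can be matched directly against noisy input frames.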
-
Patent number: 7657430
Abstract: An apparatus inputs an utterance and performs speech recognition on the input utterance. The speech processing apparatus determines whether the recognition result contains an unknown word. If it is determined that the recognition result contains an unknown word, it is then determined whether the recognition result is rejected or not. If it is determined that the recognition result is not rejected, a word corresponding to the unknown word contained in the recognition result is acquired. The apparatus can be used as a speech processing apparatus.
Type: Grant
Filed: July 20, 2005
Date of Patent: February 2, 2010
Assignee: Sony Corporation
Inventor: Hiroaki Ogawa
-
Publication number: 20100023329
Abstract: Speech recognition of even a speaker who uses a speech recognition system is enabled by using an extended recognition dictionary suited to the speaker without requiring any previous learning using an utterance label corresponding to the speech of the speaker.
Type: Application
Filed: January 15, 2008
Publication date: January 28, 2010
Applicant: NEC CORPORATION
Inventor: Yoshifumi Onishi
-
Publication number: 20100004931
Abstract: An apparatus is provided for speech utterance verification. The apparatus is configured to compare a first prosody component from a recorded speech with a second prosody component from a reference speech. The apparatus determines a prosodic verification evaluation for the recorded speech utterance in dependence on the comparison.
Type: Application
Filed: September 15, 2006
Publication date: January 7, 2010
Inventors: Bin Ma, Haizhou Li, Minghui Dong
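A hedged sketch of what a prosodic comparison could look like: cosine similarity of mean-removed pitch (F0) contours, with an illustrative acceptance threshold. The actual prosody components and evaluation in the publication may differ, and every name below is invented:

```python
import math

def _centered(xs):
    """Remove the mean so only the contour shape is compared."""
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

def prosody_similarity(recorded_f0, reference_f0):
    """Cosine similarity of two mean-removed pitch contours in [-1, 1]."""
    a, b = _centered(recorded_f0), _centered(reference_f0)
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def verify_utterance(recorded_f0, reference_f0, threshold=0.7):
    """Accept the utterance if its pitch contour matches the reference."""
    return prosody_similarity(recorded_f0, reference_f0) >= threshold
```

A rising contour matched against the same rising contour scores 1.0, while the same contour reversed (falling) scores -1.0 and is rejected.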
-
Patent number: 7643985
Abstract: Architecture that interacts with a user, or users of different tongues, to enhance speech translation. A recognized concept or situation is sensed and/or converged upon, and disambiguated with mixed-initiative user interaction with a device to provide simplified inferences about user communication goals in working with others who speak another language. Reasoning is applied about communication goals based on the concept or situation at the current focus of attention, or the probability distribution over the likely focus of attention, and the user or the user's conversational partner is provided with appropriately triaged choices and images, text and/or speech translations for review or perception. The inferences can also process an utterance or other input from a user as part of the evidence in reasoning about a concept, situation, goals, and/or disambiguating the latter.
Type: Grant
Filed: June 27, 2005
Date of Patent: January 5, 2010
Assignee: Microsoft Corporation
Inventor: Eric J. Horvitz
-
Publication number: 20090313017
Abstract: A framework is included in which the numerical value representing the statistical appearance tendency of each word in a language model is set not only as a constant, but also as an update function that changes in time. The numerical value representing a word's statistical appearance tendency is then automatically updated with the passage of time.
Type: Application
Filed: July 6, 2007
Publication date: December 17, 2009
Inventors: Satoshi Nakazawa, Hitoshi Yamamoto, Tasuku Kitade
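One way such an update function could behave, offered purely as an illustration (the half-life parameterization and names are assumptions, not the publication's formula), is an exponential decay of a word's count since it was last observed:

```python
def word_weight(base_count, days_since_seen, half_life_days=30.0):
    """Appearance tendency that decays over time since the word was last
    observed: an update function of time rather than a fixed constant."""
    return base_count * 0.5 ** (days_since_seen / half_life_days)

def unigram_prob(word, counts, last_seen, today, half_life_days=30.0):
    """Time-adjusted unigram probability: decay every word's count by its
    age, then renormalize."""
    weights = {w: word_weight(c, today - last_seen[w], half_life_days)
               for w, c in counts.items()}
    return weights[word] / sum(weights.values())
```

A word unseen for one half-life contributes half its raw count, so recently observed vocabulary automatically gains probability mass as time passes.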