Update Patterns Patents (Class 704/244)
  • Patent number: 7957968
    Abstract: The invention includes a computer-based system or method for automatically generating a grammar associated with a first task comprising the steps of: receiving first data representing the first task based on responses received from a distributed network; automatically tagging the first data into parts of speech to form first tagged data; identifying filler words and core words from said first tagged data; modeling sentence structure based upon said first tagged data using a first set of rules; identifying synonyms of said core words; and creating the grammar for the first task using said modeled sentence structure, first tagged data and said synonyms.
    Type: Grant
    Filed: December 12, 2006
    Date of Patent: June 7, 2011
    Assignee: Honda Motor Co., Ltd.
    Inventors: Rakesh Gupta, Ken Hennacy
  • Publication number: 20110119059
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
    Type: Application
    Filed: November 13, 2009
    Publication date: May 19, 2011
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Andrej LJOLJE, Bernard S. RENGER, Steven Neil TISCHER
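The fallback order in the abstract above (user-specific supervised model, then unsupervised model, then generic model) can be sketched as a simple selection routine. The dictionary-based lookups and model names here are hypothetical, not the patent's actual interfaces.

```python
def select_speech_model(user_id, supervised, unsupervised, generic_model):
    """Pick the most specific recognition model available for a user.

    `supervised` and `unsupervised` are hypothetical dicts mapping user
    IDs to trained models; `generic_model` is the shared fallback.
    """
    if user_id in supervised:       # best case: user-specific supervised model
        return supervised[user_id]
    if user_id in unsupervised:     # next: user-specific unsupervised model
        return unsupervised[user_id]
    return generic_model            # last resort: generic model

# Only an unsupervised model exists for "alice", so it is retrieved.
model = select_speech_model("alice", {}, {"alice": "unsup-alice"}, "generic")
```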
  • Patent number: 7933777
    Abstract: A hybrid speech recognition system uses a client-side speech recognition engine and a server-side speech recognition engine to produce speech recognition results for the same speech. An arbitration engine produces speech recognition output based on one or both of the client-side and server-side speech recognition results.
    Type: Grant
    Filed: August 30, 2009
    Date of Patent: April 26, 2011
    Assignee: Multimodal Technologies, Inc.
    Inventor: Detlef Koll
  • Patent number: 7933774
    Abstract: A system and method is provided for rapidly generating a new spoken dialog application. In one embodiment, a user experience person labels the transcribed data (e.g., 3000 utterances) using a set of interactive tools. The labeled data is then stored in a processed data database. During the labeling process, the user experience person not only groups utterances in various call type categories, but also flags (e.g., 100-200) specific utterances as positive and negative examples for use in an annotation guide. The labeled data in the processed data database can also be used to generate an initial natural language understanding (NLU) model.
    Type: Grant
    Filed: March 18, 2004
    Date of Patent: April 26, 2011
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Lee Begeja, Mazin G. Rahim, Allen Louis Gorin, Behzad Shahraray, David Crawford Gibbon, Zhu Liu, Bernard S. Renger, Patrick Guy Haffner, Harris Drucker, Steven Hart Lewis
  • Patent number: 7925505
    Abstract: Architecture is disclosed herewith for minimizing an empirical error rate by discriminative adaptation of a statistical language model in a dictation and/or dialog application. The architecture allows assignment of an improved weighting value to each term or phrase to reduce empirical error. Empirical errors are minimized whether a user provides correction results or not based on criteria for discriminatively adapting the user language model (LM)/context-free grammar (CFG) to the target. Moreover, algorithms are provided for the training and adaptation processes of LM/CFG parameters for criteria optimization.
    Type: Grant
    Filed: April 10, 2007
    Date of Patent: April 12, 2011
    Assignee: Microsoft Corporation
    Inventor: Jian Wu
  • Publication number: 20110082697
    Abstract: A method is described for correcting and improving the functioning of certain devices for the diagnosis and treatment of speech that dynamically measure the functioning of the velum in the control of nasality during speech. The correction method uses an estimate of the vowel frequency spectrum to greatly reduce the variation of nasalance with the vowel being spoken, so as to result in a corrected value of nasalance that reflects with greater accuracy the degree of velar opening. Correction is also described for reducing the effect on nasalance values of energy from the oral and nasal channels crossing over into the other channel because of imperfect acoustic separation.
    Type: Application
    Filed: October 6, 2009
    Publication date: April 7, 2011
    Applicant: Rothenberg Enterprises
    Inventor: Martin ROTHENBERG
  • Publication number: 20110077942
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for handling expected repeat speech queries or other inputs. The method causes a computing device to detect a misrecognized speech query from a user, determine a tendency of the user to repeat speech queries based on previous user interactions, and adapt a speech recognition model based on the determined tendency before an expected repeat speech query. The method can further include recognizing the expected repeat speech query from the user based on the adapted speech recognition model. Adapting the speech recognition model can include modifying an acoustic model, a language model, and/or a semantic model. Adapting the speech recognition model can also include preparing a personalized search speech recognition model for the expected repeat query based on usage history and entries in a recognition lattice. The method can include retaining unmodified speech recognition models with adapted speech recognition models.
    Type: Application
    Filed: September 30, 2009
    Publication date: March 31, 2011
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Andrej LJOLJE, Diamantino Antonio Caseiro
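The "tendency to repeat" decision in the abstract above could be driven by a simple running ratio over past interactions. The counts and the 0.5 threshold below are hypothetical illustrations, not values from the patent.

```python
def should_adapt(repeats_after_error, misrecognitions, min_tendency=0.5):
    """Decide whether to adapt the recognition model in anticipation of a
    repeat, based on how often this user has historically repeated a
    query after a misrecognition (threshold is a hypothetical choice)."""
    if misrecognitions == 0:
        return False                 # no history: no basis for adaptation
    return repeats_after_error / misrecognitions >= min_tendency

# A user who repeated 8 of 10 misrecognized queries triggers adaptation.
adapt = should_adapt(repeats_after_error=8, misrecognitions=10)
```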
  • Patent number: 7912828
    Abstract: A computer-based method for identifying patterns in computer text using structures defining types of patterns which are to be identified, wherein a structure comprises one or more definition items, the method comprising assigning a weighting to each structure and each definition item; searching the computer text for a pattern to be identified on the basis of a particular structure, a pattern being provisionally identified if it matches the definition given by said particular structure; in a provisionally identified pattern, determining those of the definition items making up said particular structure that have been identified in the provisionally identified pattern; combining the weightings of the determined definition items and optionally, the weighting of the particular structure, to a single quantity; assessing whether the single quantity fulfils a given condition; depending on the result of said assessment, rejecting or confirming the provisionally identified pattern.
    Type: Grant
    Filed: February 23, 2007
    Date of Patent: March 22, 2011
    Assignee: Apple Inc.
    Inventors: Olivier Bonnet, Frédéric De Jaeger, Toby Paterson
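The weighting scheme in the abstract above (combine the weights of the matched definition items, optionally with the structure's own weight, then test the sum against a condition) can be sketched as follows; the item names and threshold are hypothetical.

```python
def confirm_pattern(structure_weight, item_weights, matched_items, threshold):
    """Combine the weights of the definition items found in a
    provisionally identified pattern (plus the structure's own weight)
    and confirm the match only if the combined score meets the
    threshold; otherwise it is rejected."""
    score = structure_weight + sum(item_weights[i] for i in matched_items)
    return score >= threshold

# A structure with three definition items, two of which matched.
weights = {"prefix": 0.4, "digits": 0.5, "suffix": 0.3}
ok = confirm_pattern(0.2, weights, ["prefix", "digits"], threshold=1.0)
```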
  • Patent number: 7904298
    Abstract: This disclosure describes a practical system/method for predicting spoken text (a spoken word or a spoken sentence/phrase) given that text's partial spelling (for example, initial characters forming the spelling of a word or sentence). The partial spelling may be given using “Speech” or may be inputted using the keyboard/keypad or may be obtained using other input methods. The disclosed system is an alternative method for inputting text into devices; the method is faster (especially for long words or phrases) compared to existing predictive-text-input and/or word-completion methods.
    Type: Grant
    Filed: November 16, 2007
    Date of Patent: March 8, 2011
    Inventor: Ashwin P. Rao
  • Patent number: 7899826
    Abstract: Determining a semantic relationship is disclosed. Source content is received. Cluster analysis is performed at least in part by using at least a portion of the source content. At least a portion of a result of the cluster analysis is used to determine the semantic relationship between two or more content elements comprising the source content.
    Type: Grant
    Filed: August 31, 2009
    Date of Patent: March 1, 2011
    Assignee: Apple Inc.
    Inventors: Philip Andrew Mansfield, Michael Robert Levy, Yuri Khramov, Darryl Will Fuller
  • Patent number: 7890328
    Abstract: Disclosed herein are methods and systems for recognizing speech. A method embodiment comprises comparing received speech with a precompiled grammar based on a database and if the received speech matches data in the precompiled grammar then returning a result based on the matched data. If the received speech does not match data in the precompiled grammar, then dynamically compiling a new grammar based only on new data added to the database after the compiling of the precompiled grammar. The database may comprise a directory of names.
    Type: Grant
    Filed: September 7, 2006
    Date of Patent: February 15, 2011
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Harry Blanchard, Steven Lewis, Shankarnarayan Sivaprasad, Lan Zhang
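The two-stage lookup in the abstract above (match the precompiled grammar first; on a miss, compile a delta grammar from only the entries added since the precompile) can be sketched like this. The data shapes are hypothetical: `precompiled` is a set of names, and `directory` maps names to the time they were added to the database.

```python
def recognize(speech, precompiled, directory, compiled_at):
    """Return a match from the precompiled grammar, or fall back to a
    grammar compiled on the fly from entries added after `compiled_at`."""
    if speech in precompiled:
        return speech                # hit in the precompiled grammar
    # Delta grammar: only names added after the precompile timestamp.
    delta = {name for name, added in directory.items() if added > compiled_at}
    return speech if speech in delta else None

directory = {"alice": 1, "bob": 5}   # "bob" was added after the precompile
result = recognize("bob", {"alice"}, directory, compiled_at=2)
```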
  • Patent number: 7885812
    Abstract: Parameters for a feature extractor and acoustic model of a speech recognition module are trained. An objective function is utilized to determine values for the feature extractor parameters and the acoustic model parameters.
    Type: Grant
    Filed: November 15, 2006
    Date of Patent: February 8, 2011
    Assignee: Microsoft Corporation
    Inventors: Alejandro Acero, James G. Droppo, Milind V. Mahajan
  • Patent number: 7881932
    Abstract: The present invention extends the VoiceXML language model to natively support voice enrolled grammars. Specifically, three VoiceXML tags can be added to the language model to add, modify, and delete acoustically provided phrases to voice enrolled grammars. Once created, the voice enrolled grammars can be used in normal speaker dependent speech recognition operations. That is, the voice enrolled grammars can be referenced and utilized just like text enrolled grammars can be referenced and utilized. For example using the present invention, voice enrolled grammars can be referenced by standard text-based Speech Recognition Grammar Specification (SRGS) grammars to create more complex, usable grammars.
    Type: Grant
    Filed: October 2, 2006
    Date of Patent: February 1, 2011
    Assignee: Nuance Communications, Inc.
    Inventor: Brien H. Muschett
  • Patent number: 7877258
    Abstract: Systems, methods, and apparatuses, including computer program products, are provided for representing language models. In some implementations, a computer-implemented method is provided. The method includes generating a compact language model including receiving a collection of n-grams from the corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus and generating a trie representing the collection of n-grams. The method also includes using the language model to identify a second probability of a particular string of words occurring.
    Type: Grant
    Filed: March 29, 2007
    Date of Patent: January 25, 2011
    Assignee: Google Inc.
    Inventors: Ciprian Chelba, Thorsten Brants
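A trie over word n-grams, with each node storing the probability of the n-gram that ends there, can be sketched as below. This is a minimal illustration of the data structure named in the abstract, not the compact encoding the patent describes.

```python
class NgramTrie:
    """Minimal trie over word n-grams; each node holds the probability
    of the n-gram ending at that node, or None if no n-gram ends there."""

    def __init__(self):
        self.children = {}
        self.prob = None

    def insert(self, ngram, prob):
        node = self
        for word in ngram:
            node = node.children.setdefault(word, NgramTrie())
        node.prob = prob

    def lookup(self, ngram):
        node = self
        for word in ngram:
            node = node.children.get(word)
            if node is None:
                return None          # n-gram not in the collection
        return node.prob

trie = NgramTrie()
trie.insert(("the", "cat"), 0.02)    # hypothetical corpus probability
p = trie.lookup(("the", "cat"))
```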
  • Publication number: 20100318355
    Abstract: Techniques and systems for training an acoustic model are described. In an embodiment, a technique for training an acoustic model includes dividing a corpus of training data that includes transcription errors into N parts, and on each part, decoding an utterance with an incremental acoustic model and an incremental language model to produce a decoded transcription. The technique may further include inserting silence between a pair of words into the decoded transcription and aligning an original transcription corresponding to the utterance with the decoded transcription according to time for each part. The technique may further include selecting a segment from the utterance having at least Q contiguous matching aligned words, and training the incremental acoustic model with the selected segment. The trained incremental acoustic model may then be used on a subsequent part of the training data. Other embodiments are described and claimed.
    Type: Application
    Filed: June 10, 2009
    Publication date: December 16, 2010
    Applicant: MICROSOFT CORPORATION
    Inventors: Jinyu Li, Yifan Gong, Chaojun Liu, Kaisheng Yao
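The segment-selection step in the abstract above (keep only stretches of at least Q contiguous words where the decoded transcription agrees with the original) can be sketched as follows; real alignment is time-based, so this word-by-word comparison is a simplification.

```python
def select_training_segments(original, decoded, q=3):
    """Keep runs of at least `q` contiguous words on which the original
    and decoded transcriptions agree; mismatches (likely transcription
    errors) break the run and short runs are discarded."""
    segments, run = [], []
    for o, d in zip(original, decoded):
        if o == d:
            run.append(o)
        else:
            if len(run) >= q:
                segments.append(run)
            run = []
    if len(run) >= q:
        segments.append(run)
    return segments

orig = "the cat sat on the mat today".split()
dec = "the cat sat on a mat today".split()
kept = select_training_segments(orig, dec)   # one 4-word matching run
```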
  • Patent number: 7853448
    Abstract: An electronic instrument includes: a display control unit that displays the control content corresponding to command information based on the result of the speech recognition; an instruction unit for instructing that the control for the displayed content be cancelled; and a control unit that performs the control after a predetermined standby time has elapsed since the control content was first displayed, provided the instruction unit has not instructed cancellation within that time, and that cancels the control when cancellation is instructed within the standby time.
    Type: Grant
    Filed: April 16, 2007
    Date of Patent: December 14, 2010
    Assignee: Funai Electric Co., Ltd.
    Inventors: Shusuke Narita, Susumu Tokoshima
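The standby-window logic in the abstract above reduces to a simple timed veto: execute the recognized command only if no cancellation arrives within the window. The 3-second window and return strings below are hypothetical.

```python
def handle_command(command, cancel_requested_at, standby=3.0):
    """Execute a recognized command unless a cancel instruction arrives
    within the standby window after the control content is displayed.
    `cancel_requested_at` is seconds after display, or None if the user
    never requested cancellation."""
    if cancel_requested_at is not None and cancel_requested_at <= standby:
        return "cancelled"           # user vetoed within the window
    return f"executed:{command}"     # window elapsed without a veto

outcome = handle_command("volume_up", cancel_requested_at=1.5)
```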
  • Publication number: 20100312556
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions.
    Type: Application
    Filed: June 9, 2009
    Publication date: December 9, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Andrej LJOLJE, Alistair D. CONKIE, Ann K. SYRDAL
  • Patent number: 7844457
    Abstract: Methods are disclosed for automatic accent labeling without manually labeled data. The methods are designed to exploit accent distribution between function and content words.
    Type: Grant
    Filed: February 20, 2007
    Date of Patent: November 30, 2010
    Assignee: Microsoft Corporation
    Inventors: YiNing Chen, Frank Kao-ping Soong, Min Chu
  • Patent number: 7823144
    Abstract: A method, apparatus and computer program product for comparing two computer program codes is disclosed. For each code, a stream of lexemes is generated for the program text of each code. The streams are concatenated in the same order as the program text. The two concatenated streams of lexemes are compared on a language-type by language-type basis to identify lexemes present only in one stream. The comparison derives a set of edit operations including minimal text block moves needed to convert one program code into the other program code.
    Type: Grant
    Filed: December 29, 2005
    Date of Patent: October 26, 2010
    Assignee: International Business Machines Corporation
    Inventors: Donald P Pazel, Pradeep Varma
  • Patent number: 7813926
    Abstract: A training system for a speech recognition application is disclosed. In embodiments described, the training system is used to train a classification model or language model. The classification model is trained using an adaptive language model generated by an iterative training process. In embodiments described, the training data is recognized by the speech recognition component and the recognized text is used to create the adaptive language model which is used for speech recognition in a following training iteration.
    Type: Grant
    Filed: March 16, 2006
    Date of Patent: October 12, 2010
    Assignee: Microsoft Corporation
    Inventors: Ye-Yi Wang, John Sie Yuen Lee, Alex Acero
  • Publication number: 20100256978
    Abstract: A method for performing speech recognition relating to an object for the purpose of affecting automatic processing of the object by a processing system. The object carries information with at least a character string of processing information. The character string spoken by an operator is processed by way of a speech recognition procedure to generate a first result. Based on the need for more information about an element of the first result, additional processing data is requested. An operator's response generates a second result. The first result is then modified to achieve consistency with the operator's response.
    Type: Application
    Filed: April 6, 2010
    Publication date: October 7, 2010
    Applicant: SIEMENS AKTIENGESELLSCHAFT
    Inventor: Walter Rosenbaum
  • Patent number: 7797156
    Abstract: Presented herein are systems and methods for generating an adaptive noise codebook for use with electronic speech systems. The noise codebook includes a plurality of entries which may be updated based on environmental noise sounds. The speech system includes a speech codebook and the adaptive noise codebook. The system identifies speech sounds in an audio signal using the speech and noise codebooks.
    Type: Grant
    Filed: February 15, 2006
    Date of Patent: September 14, 2010
    Assignee: Raytheon BBN Technologies Corp.
    Inventors: Robert David Preuss, Darren Ross Fabbri, Daniel Ramsay Cruthirds
  • Publication number: 20100204988
    Abstract: A speech recognition method includes receiving a speech input signal in a first noise environment which includes a sequence of observations, determining the likelihood of a sequence of words arising from the sequence of observations using an acoustic model, adapting the model trained in a second noise environment to that of the first environment, wherein adapting the model trained in the second environment to that of the first environment includes using second order or higher order Taylor expansion coefficients derived for a group of probability distributions and the same expansion coefficient is used for the whole group.
    Type: Application
    Filed: April 20, 2010
    Publication date: August 12, 2010
    Inventors: Haitian XU, Kean Kheong Chin
  • Patent number: 7769588
    Abstract: The method of operating a man-machine interface unit includes classifying at least one utterance of a speaker to be of a first type or of a second type. If the utterance is classified to be of the first type, the utterance belongs to a known speaker of a speaker data base, and if the utterance is classified to be of the second type, the utterance belongs to an unknown speaker that is not included in the speaker data base. The method also includes storing a set of utterances of the second type, clustering the set of utterances into clusters, wherein each cluster comprises utterances having similar features, and automatically adding a new speaker to the speaker data base based on utterances of one of the clusters.
    Type: Grant
    Filed: August 20, 2008
    Date of Patent: August 3, 2010
    Assignee: Sony Deutschland GmbH
    Inventors: Ralf Kompe, Thomas Kemp
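The first-type/second-type classification in the abstract above amounts to a distance test against the known speakers' models, with unknown utterances buffered for later clustering. The centroid representation, Euclidean distance, and threshold below are all hypothetical simplifications.

```python
def euclidean(a, b):
    """Plain Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def classify_utterance(features, speaker_db, threshold):
    """First type: the utterance is close enough to a known speaker's
    centroid in `speaker_db`. Second type: returns None, signalling the
    caller to buffer the utterance for clustering and possible
    enrollment of a new speaker."""
    best = min(speaker_db, key=lambda s: euclidean(features, speaker_db[s]),
               default=None)
    if best is not None and euclidean(features, speaker_db[best]) <= threshold:
        return best
    return None                      # unknown speaker: store for clustering

db = {"spk1": (0.0, 0.0)}            # hypothetical per-speaker centroids
who = classify_utterance((0.1, 0.1), db, threshold=0.5)
```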
  • Publication number: 20100191530
    Abstract: A speech understanding apparatus includes a speech recognition unit which performs speech recognition of an utterance using multiple language models, and outputs multiple speech recognition results obtained by the speech recognition, a language understanding unit which uses multiple language understanding models to perform language understanding for each of the multiple speech recognition results output from the speech recognition unit, and outputs multiple speech understanding results obtained from the language understanding, and an integrating unit which calculates, based on values representing features of the speech understanding results, utterance batch confidences that numerically express accuracy of the speech understanding results for each of the multiple speech understanding results output from the language understanding unit, and selects one of the speech understanding results with a highest utterance batch confidence among the calculated utterance batch confidences.
    Type: Application
    Filed: January 22, 2010
    Publication date: July 29, 2010
    Applicant: HONDA MOTOR CO., LTD.
    Inventors: Mikio NAKANO, Masaki KATSUMARU, Kotaro FUNAKOSHI, Hiroshi OKUNO
  • Patent number: 7761302
    Abstract: A method for generating output data identifying a class of a predetermined plurality of classes. The method comprises receiving data representing an acoustic signal; determining an amplitude of at least a first predetermined frequency component of said acoustic signal; and comparing the or each amplitude with a respective primary threshold; and generating output data identifying one of said classes to which said acoustic signal should be allocated, based upon said comparison.
    Type: Grant
    Filed: June 7, 2005
    Date of Patent: July 20, 2010
    Assignee: South Manchester University Hospitals NHS Trust
    Inventors: Ashley Arthur Woodcock, Jaclyn Ann Smith, Kevin McGuinness
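The core comparison in the abstract above (amplitude of a predetermined frequency component versus a primary threshold) can be sketched with a single DFT bin. The class labels, sample rate, and threshold are hypothetical.

```python
import math

def component_amplitude(samples, rate, freq):
    """Amplitude of one frequency component, computed as a single DFT bin."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq * i / rate)
             for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq * i / rate)
             for i, s in enumerate(samples))
    return 2 * math.sqrt(re * re + im * im) / n

def classify(samples, rate, freq, threshold):
    """Allocate the signal to a class by comparing the amplitude of the
    predetermined frequency component with a primary threshold."""
    if component_amplitude(samples, rate, freq) >= threshold:
        return "class_a"
    return "class_b"

# A pure 100 Hz tone of amplitude 1.0 clears a 0.5 threshold.
tone = [math.sin(2 * math.pi * 100 * i / 8000) for i in range(8000)]
label = classify(tone, rate=8000, freq=100, threshold=0.5)
```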
  • Publication number: 20100179812
    Abstract: Provided are an apparatus and method for recognizing voice commands, the apparatus including: a voice command recognition unit which recognizes an input voice command; a voice command recognition learning unit which learns a recognition-targeted voice command; and a controller which controls the voice command recognition unit to recognize the recognition-targeted voice command from an input voice command, controls the voice command recognition learning unit to learn the input voice command if the voice command recognition is unsuccessful, and performs a particular operation corresponding to the recognized voice command if the voice command recognition is successful.
    Type: Application
    Filed: September 2, 2009
    Publication date: July 15, 2010
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Jong-hyuk Jang, Seung-kwon Park, Jong-ho Lea
  • Patent number: 7756708
    Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model, includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.
    Type: Grant
    Filed: April 3, 2006
    Date of Patent: July 13, 2010
    Assignee: Google Inc.
    Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno
  • Publication number: 20100169094
    Abstract: A speaker adaptation apparatus includes an acquiring unit configured to acquire an acoustic model including HMMs and decision trees for estimating what type of the phoneme or the word is included in a feature value used for speech recognition, the HMMs having a plurality of states on a phoneme-to-phoneme basis or a word-to-word basis, and the decision trees being configured to reply to questions relating to the feature value and output likelihoods in the respective states of the HMMs, and a speaker adaptation unit configured to adapt the decision trees to a speaker, the decision trees being adapted using speaker adaptation data vocalized by the speaker of an input speech.
    Type: Application
    Filed: September 17, 2009
    Publication date: July 1, 2010
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masami Akamine, Jitendra Ajmera, Partha Lal
  • Publication number: 20100161332
    Abstract: A method and apparatus are provided that use narrowband data and wideband data to train a wideband acoustic model.
    Type: Application
    Filed: March 8, 2010
    Publication date: June 24, 2010
    Applicant: MICROSOFT CORPORATION
    Inventors: Michael L. Seltzer, Alejandro Acero
  • Publication number: 20100161331
    Abstract: In many application environments, it is desirable to provide voice access to tables on Internet pages, where the user asks a subject-related question in a natural language and receives an adequate answer from the table, read out in a natural language. A method is disclosed for preparing information presented in tabular form for a speech dialogue system, so that the information in the table can be consulted in a targeted manner during a user dialogue.
    Type: Application
    Filed: October 25, 2006
    Publication date: June 24, 2010
    Applicant: Siemens Aktiengesellschaft
    Inventors: Hans-Ulrich Block, Manfred Gehrke, Stefanie Schachchti
  • Publication number: 20100161327
    Abstract: A computer-implemented method for automatically analyzing, predicting, and/or modifying acoustic units of prosodic human speech utterances for use in speech synthesis or speech recognition. Possible steps include: initiating analysis of acoustic wave data representing the human speech utterances, via the phase state of the acoustic wave data; using one or more phase state defined acoustic wave metrics as common elements for analyzing, and optionally modifying, pitch, amplitude, duration, and other measurable acoustic parameters of the acoustic wave data, at predetermined time intervals; analyzing acoustic wave data representing a selected acoustic unit to determine the phase state of the acoustic unit; and analyzing the acoustic wave data representing the selected acoustic unit to determine at least one acoustic parameter of the acoustic unit with reference to the determined phase state of the selected acoustic unit. Also included are systems for implementing the described and related methods.
    Type: Application
    Filed: December 16, 2009
    Publication date: June 24, 2010
    Inventors: Nishant CHANDRA, Reiner Wilhelms-Tricarico, Rattima Nitisaroj, Brian Mottershead, Gary A. Marple, John B. Reichenbach
  • Patent number: 7742919
    Abstract: The present invention provides various elements of a toolkit used for generating a TTS voice for use in a spoken dialog system. The embodiments in each case may be in the form of the system, a computer-readable medium or a method for generating the TTS voice. One embodiment of the invention relates to a method of correcting a database associated with the development of a text-to-speech (TTS) voice. The method comprises generating a pronunciation dictionary for use with a TTS voice, generating a TTS voice to a stage wherein it is prepared to be tested before being deployed, identifying mislabeled phonetic units associated with the TTS voice, for each identified mislabeled phonetic unit, linking to an entry within the pronunciation dictionary to correct the entry and deleting utterances and all associated data for unacceptable utterances.
    Type: Grant
    Filed: September 27, 2005
    Date of Patent: June 22, 2010
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Steven Lawrence Davis, Shane Fetters, David Eugene Schulz, Beverly Gustafson, Louise Loney
  • Patent number: 7734466
    Abstract: A method for reducing a computational complexity of an m-stage adaptive filter is provided by updating recursively forward and backward error prediction square terms for a first portion of a length of the adaptive filter, and keeping the updated forward and backward error prediction square terms constant for a second portion of the length of the adaptive filter.
    Type: Grant
    Filed: April 7, 2006
    Date of Patent: June 8, 2010
    Assignee: Motorola, Inc.
    Inventors: David L. Barron, Kyle K. Iwai, James B. Piket
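The complexity reduction in the abstract above (recursively update the forward and backward prediction error squares only over the first portion of the filter, holding the rest constant) can be sketched as below. The forgetting factor `lam` and the list representation are hypothetical.

```python
def update_prediction_squares(fwd, bwd, new_fwd, new_bwd, split, lam=0.99):
    """Recursively update forward/backward error prediction square terms
    for stages 0..split-1; stages >= split are kept constant, which is
    where the computational savings come from."""
    out_f, out_b = list(fwd), list(bwd)
    for m in range(split):                        # first portion: recursive
        out_f[m] = lam * fwd[m] + new_fwd[m] ** 2
        out_b[m] = lam * bwd[m] + new_bwd[m] ** 2
    return out_f, out_b                           # second portion unchanged

# Two-stage example: only stage 0 is updated, stage 1 stays constant.
f, b = update_prediction_squares([1.0, 2.0], [1.0, 2.0],
                                 [1.0, 1.0], [1.0, 1.0], split=1)
```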
  • Patent number: 7725316
    Abstract: A speech recognition adaptation method for a vehicle having a telematics unit with an embedded speech recognition system. Speech is received and pre-processed to generate acoustic feature vectors, and an adaptation parameter is applied to the acoustic feature vectors to yield transformed acoustic feature vectors. The transformed acoustic feature vectors are decoded and a hypothesis of the speech is selected, and the adaptation parameter is trained using acoustic feature vectors from the hypothesis. The method also includes one or more of the following steps: the speech is observed for a certain characteristic and the trained adaptation parameter is saved in accordance with the certain characteristic for use in transforming feature vectors of subsequent speech having the certain characteristic; use of the trained adaptation parameter persists from one vehicle ignition cycle to the next; and use of the trained adaptation parameter is ceased upon detection of a system fault.
    Type: Grant
    Filed: July 5, 2006
    Date of Patent: May 25, 2010
    Assignee: General Motors LLC
    Inventors: Rathinavelu Chengalvarayan, John J Correia, Scott M Pennock
  • Patent number: 7720681
    Abstract: Generally described, the present invention is directed toward generating, maintaining, updating, and applying digital voice profiles. Voice profiles may be generated for individuals. The voice profiles include information that is unique to each individual and which may be applied to digital representations of that individual's voice to improve the quality of a transmitted digital representation of that individual's voice. A voice profile may include, but is not limited to, basic information about the individual, and filter definitions relating to the individual's voice patterns, such as a frequency range and amplitude range. The voice profile may also include a speech definition that includes digital representations of the individual's unique speech patterns.
    Type: Grant
    Filed: March 23, 2006
    Date of Patent: May 18, 2010
    Assignee: Microsoft Corporation
    Inventors: David Milstein, Kuansan Wang, Linda Criddle
  • Patent number: 7716049
    Abstract: An apparatus for providing adaptive language model scaling includes an adaptive scaling element and an interface element. The adaptive scaling element is configured to receive input speech comprising a sequence of spoken words and to determine a plurality of candidate sequences of text words in which each of the candidate sequences has a corresponding sentence score representing a probability that a candidate sequence matches the sequence of spoken words. Each corresponding sentence score is calculated using an adaptive scaling factor. The interface element is configured to receive a user input selecting one of the candidate sequences. The adaptive scaling element is further configured to estimate an objective function based on the user input and to modify the adaptive scaling factor based on the estimated objective function.
    Type: Grant
    Filed: June 30, 2006
    Date of Patent: May 11, 2010
    Assignee: Nokia Corporation
    Inventor: Jilei Tian
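The scaled sentence score, and one plausible way a user selection could drive the scaling factor, can be sketched as follows; the perceptron-style update is an assumption, since the abstract only says an objective function is estimated from the user input:

```python
def sentence_score(acoustic, lm, scale):
    """Combined hypothesis score: acoustic log-probability plus the
    language-model log-probability weighted by the adaptive scaling factor."""
    return acoustic + scale * lm

def update_scale(candidates, chosen_idx, scale, lr=0.1):
    """If the user-selected candidate is not the current top scorer, nudge the
    scaling factor toward favoring it.  candidates is a list of
    (acoustic_logprob, lm_logprob) pairs."""
    scores = [sentence_score(a, l, scale) for a, l in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    if best != chosen_idx:
        # d(score_chosen - score_best)/d(scale) = lm_chosen - lm_best
        scale += lr * (candidates[chosen_idx][1] - candidates[best][1])
    return scale
```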
  • Publication number: 20100100380
    Abstract: A system, method and computer-readable medium provide a multitask learning method for intent or call-type classification in a spoken language understanding system. Multitask learning aims at training tasks in parallel while using a shared representation. A computing device automatically re-uses the existing labeled data from various applications, which are similar but may have different call-types, intents or intent distributions to improve the performance. An automated intent mapping algorithm operates across applications. In one aspect, active learning is employed to selectively sample the data to be re-used.
    Type: Application
    Filed: December 28, 2009
    Publication date: April 22, 2010
    Applicant: AT&T Corp.
    Inventor: Gokhan Tur
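Re-using labeled data across applications via an intent mapping can be sketched minimally; the patent describes an automated mapping algorithm, whereas this sketch takes the mapping as a given dict:

```python
def pool_training_data(datasets, intent_map):
    """Re-use labeled utterances from similar applications by mapping their
    call-types onto the target application's intents.  datasets maps an app
    name to (utterance, intent) pairs; intent_map maps (app, intent) to a
    target intent, or omits pairs with no counterpart."""
    pooled = []
    for app, examples in datasets.items():
        for text, intent in examples:
            target = intent_map.get((app, intent))
            if target is not None:          # drop call-types with no counterpart
                pooled.append((text, target))
    return pooled
```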
  • Patent number: 7702509
    Abstract: Pronunciation for an input word is modeled by generating a set of candidate phoneme strings having pronunciations close to the input word in an orthographic space. Phoneme sub-strings in the set are selected as the pronunciation. In one aspect, a first closeness measure between phoneme strings for words chosen from a dictionary and contexts within the input word is used to determine the candidate phoneme strings. The words are chosen from the dictionary based on a second closeness measure between a representation of the input word in the orthographic space and orthographic anchors corresponding to the words in the dictionary. In another aspect, the phoneme sub-strings are selected by aligning the candidate phoneme strings on common phoneme sub-strings to produce an occurrence count, which is used to choose the phoneme sub-strings for the pronunciation.
    Type: Grant
    Filed: November 21, 2006
    Date of Patent: April 20, 2010
    Assignee: Apple Inc.
    Inventor: Jerome R. Bellegarda
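The occurrence-count step can be illustrated with phoneme n-grams; this is a simplification of the patent's alignment of candidate strings on common sub-strings:

```python
from collections import Counter

def phoneme_ngram_counts(candidates, n=2):
    """Count phoneme n-grams across all candidate pronunciations; frequently
    shared n-grams serve as evidence when assembling the final pronunciation."""
    counts = Counter()
    for phones in candidates:
        for i in range(len(phones) - n + 1):
            counts[tuple(phones[i:i + n])] += 1
    return counts
```

For candidates ["k","ae","t"], ["k","ae","d"], ["k","ah","t"], the bigram ("k","ae") occurs twice, so it would outrank the alternatives when sub-strings are chosen.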
  • Publication number: 20100094629
    Abstract: A weighting factor learning system includes an audio recognition section that recognizes learning audio data and outputs the recognition result; a weighting factor updating section that updates a weighting factor applied to a score obtained from an acoustic model and a language model so that the difference between a correct-answer score calculated with the use of a correct-answer text of the learning audio data and a score of the recognition result becomes large; a convergence determination section that determines, with the use of the score after updating, whether to return to the weighting factor updating section to update the weighting factor again; and a weighting factor convergence determination section that determines, with the use of the score after updating, whether to return to the audio recognition section to perform the process again and update the weighting factor using the weighting factor updating section.
    Type: Application
    Filed: February 19, 2008
    Publication date: April 15, 2010
    Inventors: Tadashi Emori, Yoshifumi Onishi
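A toy version of the weight update, assuming a linear score combination s = acoustic + w · lm; both the linear form and the damped gradient step are assumptions, not the patent's exact rule:

```python
import math

def learn_weight(correct, hyp, w=1.0, lr=0.05, steps=50):
    """Grow the margin between the correct-answer score and the recognition
    score, where score = acoustic + w * lm.  correct and hyp are
    (acoustic_logprob, lm_logprob) pairs; the step shrinks as the margin
    grows, playing the role of a convergence check."""
    (ac_c, lm_c), (ac_h, lm_h) = correct, hyp
    for _ in range(steps):
        margin = (ac_c - ac_h) + w * (lm_c - lm_h)
        w += lr * (lm_c - lm_h) / (1.0 + math.exp(margin))  # damped ascent step
    return w
```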
  • Publication number: 20100076968
    Abstract: Implementations relate to systems and methods for aggregating and presenting data related to geographic locations. Geotag data related to geographic locations and associated features or attributes can be collected to build a regional profile characterizing a set of locations within the region. Geotag data related to the constituent locations, such as user ratings or popularity ranks for restaurants, shops, parks, or other features, sites, or attractions, can be combined to generate a profile of characteristics of locations in the region. The platform can generate recommendations of locations to transmit to the user of a mobile device, based for instance on the location of the device in the region as reported by GPS or other location service and the regional profile. Geotag data can include audio data analyzed using region-specific terms, and user recommendations can be presented via dynamic menus based on regional profiles, user preferences or other criteria.
    Type: Application
    Filed: May 21, 2009
    Publication date: March 25, 2010
    Inventors: Mark R. BOYNS, Chand MEHTA, Jeffrey C. TSAY, Giridhar D. MANDYAM
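Combining per-location ratings into a regional profile can be sketched as a per-category mean; the geotag schema here is illustrative, not taken from the application:

```python
from collections import defaultdict

def regional_profile(geotags):
    """Combine per-location user ratings into a mean rating per category
    for the region.  Each geotag is a dict with 'category' and 'rating'."""
    sums = defaultdict(lambda: [0.0, 0])
    for tag in geotags:
        acc = sums[tag["category"]]
        acc[0] += tag["rating"]
        acc[1] += 1
    return {cat: total / count for cat, (total, count) in sums.items()}
```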
  • Patent number: 7676366
    Abstract: A speaker adaptation system and method for speech models of symbols displays a multi-word symbol to be spoken. The supervised adaptation system and method also performs unsupervised adaptation for multi-word symbols, limited to the set of words associated with each multi-word symbol.
    Type: Grant
    Filed: January 13, 2003
    Date of Patent: March 9, 2010
    Assignee: Art Advanced Recognition Technologies Inc.
    Inventors: Ran Mochary, Sasi Solomon, Tal El-Hay, Tal Yadid, Itamar Bartur
  • Patent number: 7664640
    Abstract: A signal processing system is disclosed which is implemented using Gaussian Mixture Model (GMM) based Hidden Markov Model (HMM), or a GMM alone, parameters of which are constrained during its optimization procedure. Also disclosed is a constraint system applied to input vectors representing the input signal to the system. The invention is particularly, but not exclusively, related to speech recognition systems. The invention reduces the tendency, common in prior art systems, to get caught in local minima associated with highly anisotropic Gaussian components—which reduces the recognizer performance—by employing the constraint system as above whereby the anisotropy of such components is minimized. The invention also covers a method of processing a signal, and a speech recognizer trained according to the method.
    Type: Grant
    Filed: March 24, 2003
    Date of Patent: February 16, 2010
    Assignee: Qinetiq Limited
    Inventor: Christopher John St. Clair Webber
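One simple way to bound a component's anisotropy is to floor its smallest variances relative to the largest; the patent's actual constraint system is more general, so treat this as an illustration only:

```python
def constrain_anisotropy(variances, max_ratio=10.0):
    """Floor the smallest diagonal variances of a Gaussian component so no
    axis is more than max_ratio times narrower than the widest one -- a simple
    stand-in for a constraint on highly anisotropic components."""
    floor = max(variances) / max_ratio
    return [max(v, floor) for v in variances]
```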
  • Patent number: 7660715
    Abstract: A system and method to improve the automatic adaptation of one or more speech models in automatic speech recognition systems. After a dialog begins, the dialog asks the customer to provide spoken input, which is recorded. If the speech recognizer determines it may not have correctly transcribed the verbal response, i.e., voice input, the invention uses monitoring and, if necessary, intervention to guarantee that the next transcription of the verbal response is correct. The dialog asks the customer to repeat his verbal response, which is recorded, and a transcription of the input is sent to a human monitor, i.e., agent or operator. If the transcription of the spoken input is correct, the human does not intervene and the transcription remains unmodified. If the transcription of the verbal response is incorrect, the human intervenes and the transcription of the misrecognized word is corrected. In both cases, the dialog asks the customer to confirm the unmodified or corrected transcription.
    Type: Grant
    Filed: January 12, 2004
    Date of Patent: February 9, 2010
    Assignee: Avaya Inc.
    Inventor: David Preshan Thambiratnam
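The monitored-correction step can be sketched as a small routing function; the monitor interface (returning None to accept, or a corrected string) is an assumption:

```python
def confirm_transcription(transcript, confident, monitor):
    """Sketch of the monitored re-prompt step: a low-confidence transcription
    is routed to a human monitor, who returns either None (correct, leave it)
    or a corrected string."""
    if confident:
        return transcript                  # recognizer was sure; no monitoring
    correction = monitor(transcript)       # human agent reviews the transcript
    return transcript if correction is None else correction
```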
  • Patent number: 7660717
    Abstract: Speech recognition is performed by matching a characteristic quantity of the input speech against a composite HMM obtained by synthesizing a speech HMM (hidden Markov model) and a noise HMM for each speech frame of the input speech.
    Type: Grant
    Filed: January 9, 2008
    Date of Patent: February 9, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Tetsuya Takiguchi, Masafumi Nishimura
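Synthesizing a speech model and a noise model per frame resembles parallel model combination; the following sketches the mean combination in the log-spectral domain (a common approximation, not necessarily the patent's exact synthesis):

```python
import math

def combine_log_means(speech_mu, noise_mu):
    """Composite mean in the log-spectral domain: the log of the summed
    linear-domain energies of the speech and noise means, per dimension."""
    return [math.log(math.exp(s) + math.exp(n))
            for s, n in zip(speech_mu, noise_mu)]
```

When the noise energy is negligible in a band (e.g. -20 in log units against a speech mean of 2), the composite mean stays essentially the speech mean.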
  • Patent number: 7657430
    Abstract: An apparatus inputs an utterance and performs speech recognition on the input utterance. The apparatus determines whether the recognition result contains an unknown word. If it is determined that the recognition result contains an unknown word, it is then determined whether the recognition result is rejected. If it is determined that the recognition result is not rejected, a word corresponding to the unknown word contained in the recognition result is acquired. The apparatus can be used as a speech processing apparatus.
    Type: Grant
    Filed: July 20, 2005
    Date of Patent: February 2, 2010
    Assignee: Sony Corporation
    Inventor: Hiroaki Ogawa
  • Publication number: 20100023329
    Abstract: Speech recognition is enabled even for a new speaker of a speech recognition system by using an extended recognition dictionary suited to that speaker, without requiring any prior learning from an utterance label corresponding to the speaker's speech.
    Type: Application
    Filed: January 15, 2008
    Publication date: January 28, 2010
    Applicant: NEC CORPORATION
    Inventor: Yoshifumi Onishi
  • Publication number: 20100004931
    Abstract: An apparatus is provided for speech utterance verification. The apparatus is configured to compare a first prosody component from a recorded speech with a second prosody component for a reference speech. The apparatus determines a prosodic verification evaluation for the recorded speech utterance based on the comparison.
    Type: Application
    Filed: September 15, 2006
    Publication date: January 7, 2010
    Inventors: Bin Ma, Haizhou Li, Minghui Dong
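A prosody comparison can be sketched as a contour distance; the application does not specify a metric, so the mean absolute difference over length-matched pitch contours here is an assumption:

```python
def prosody_distance(ref_f0, rec_f0):
    """Mean absolute difference between the reference and recorded pitch
    contours, truncated to the shorter contour's length."""
    n = min(len(ref_f0), len(rec_f0))
    return sum(abs(a - b) for a, b in zip(ref_f0, rec_f0)) / n
```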
  • Patent number: 7643985
    Abstract: Architecture that interacts with a user, or users of different tongues, to enhance speech translation. A recognized concept or situation is sensed and/or converged upon, and disambiguated with mixed-initiative user interaction with a device to provide simplified inferences about user communication goals in working with others who speak another language. Reasoning is applied about communication goals based on the concept or situation at the current focus of attention or the probability distribution over the likely focus of attention, and the user or user's conversational partner is provided with appropriately triaged choices and images, text, and/or speech translations for review or perception. The inferences can also process an utterance or other input from a user as part of the evidence in reasoning about a concept, situation, goals, and/or disambiguating the latter.
    Type: Grant
    Filed: June 27, 2005
    Date of Patent: January 5, 2010
    Assignee: Microsoft Corporation
    Inventor: Eric J. Horvitz
  • Publication number: 20090313017
    Abstract: A framework is provided in which the numerical value representing the statistical appearance tendency of each word in a language model is set not only as a constant, but also as an update function that changes over time. The value representing a word's statistical appearance tendency is then updated automatically as time passes.
    Type: Application
    Filed: July 6, 2007
    Publication date: December 17, 2009
    Inventors: Satoshi Nakazawa, Hitoshi Yamamoto, Tasuku Kitade
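An update function for a word's appearance tendency might look like a decaying topical boost; the exponential form and the half-life parameter are assumptions, since the abstract only posits that some time-dependent function exists:

```python
def word_weight(base, boost, t, half_life=30.0):
    """A word's appearance tendency as a function of time t (e.g. in days):
    a topical boost that decays toward the constant base value with the
    given half-life."""
    return base + boost * 0.5 ** (t / half_life)
```

At t = 0 the word carries its full boost; after one half-life half of it remains, and for large t the tendency returns to the static unigram value.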