Update Patterns Patents (Class 704/244)
-
Patent number: 8731924
Abstract: An apparatus and a method are provided for building a spoken language understanding model. Labeled data may be obtained for a target application. A new classification model may be formed for use with the target application by using the labeled data for adaptation of an existing classification model. In some implementations, the existing classification model may be used to determine the most informative examples to label.
Type: Grant
Filed: August 8, 2011
Date of Patent: May 20, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventor: Gokhan Tur
-
Patent number: 8731922
Abstract: A method of accessing a dial-up service is disclosed. An example method of providing access to a service includes receiving a first speech signal from a user to form a first utterance; recognizing the first utterance using speaker independent speaker recognition; requesting the user to enter a personal identification number; and when the personal identification number is valid, receiving a second speech signal to form a second utterance and providing access to the service.
Type: Grant
Filed: April 30, 2013
Date of Patent: May 20, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Robert Wesley Bossemeyer, Jr.
-
Patent number: 8732845
Abstract: Systems, methods and articles of manufacture for generating a video such that when another person views the video, the other person can view non-private information but not private information of the person who generated the video. A first interview screen is generated by a financial application and displayed to a first person or user of a financial application. The screen includes private data related to the first person. A video of the interview screen is generated and may be transmitted over a network to a second person who may also utilize a financial application. The video is displayed to the second person, but the second person cannot view the private data related to the first person.
Type: Grant
Filed: May 18, 2012
Date of Patent: May 20, 2014
Assignee: Intuit Inc.
Inventors: Steven C. Barker, Benjamin J. Kanspedos
-
Publication number: 20140136200
Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: processing a spoken command with one or more models of one or more model types to achieve model results; evaluating a frequency of the model results; and selectively updating the one or more models of the one or more model types based on the evaluating.
Type: Application
Filed: October 22, 2013
Publication date: May 15, 2014
Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
Inventors: UTE WINTER, RON M. HECHT, TIMOTHY J. GROST, ROBERT D. SIMS, III
-
Publication number: 20140136202
Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: logging speech data from the speech system; processing the speech data for a pattern of a user competence associated with at least one of task requests and interaction behavior; and selectively updating at least one of a system prompt and an interaction sequence based on the user competence.
Type: Application
Filed: October 22, 2013
Publication date: May 15, 2014
Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
Inventors: ROBERT D. SIMS, III, TIMOTHY J. GROST, RON M. HECHT, UTE WINTER
-
Publication number: 20140136201
Abstract: Methods and systems are provided for adapting a speech system. In one example a method includes: logging speech data from the speech system; detecting a user characteristic from the speech data; and selectively updating a language model based on the user characteristic.
Type: Application
Filed: October 22, 2013
Publication date: May 15, 2014
Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
Inventors: RON M. HECHT, TIMOTHY J. GROST, ROBERT D. SIMS, III, UTE WINTER
-
Patent number: 8725511
Abstract: Disclosed herein are methods and systems for recognizing speech. A method embodiment comprises comparing received speech with a precompiled grammar based on a database and if the received speech matches data in the precompiled grammar then returning a result based on the matched data. If the received speech does not match data in the precompiled grammar, then dynamically compiling a new grammar based only on new data added to the database after the compiling of the precompiled grammar. The database may comprise a directory of names.
Type: Grant
Filed: July 2, 2013
Date of Patent: May 13, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Harry Blanchard, Steven H. Lewis, Shankarnarayan Sivaprasad, Lan Zhang
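The two-stage lookup in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: plain string matching stands in for acoustic matching against a grammar, and the names `recognize`, `precompiled`, and `compiled_count` are hypothetical.

```python
def recognize(heard, precompiled, database, compiled_count):
    """Try the precompiled grammar first; on a miss, build a grammar
    only from entries added after the precompiled grammar was compiled."""
    key = heard.strip().lower()
    if key in precompiled:
        return precompiled[key]
    # Assumption: entries are appended, so everything past compiled_count is new.
    new_grammar = {name.lower(): name for name in database[compiled_count:]}
    return new_grammar.get(key)


# A directory of names, as the abstract suggests the database may be.
directory = ["Alice Smith", "Bob Jones"]
precompiled = {name.lower(): name for name in directory}
compiled_count = len(directory)
directory.append("Carol Diaz")  # added after the grammar was compiled
```

Only the newly added names are compiled on a miss, which keeps the fallback cheap compared with recompiling the whole directory.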
-
Patent number: 8725509
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, relating to language models stored for digital language processing. In one aspect, a method includes the actions of generating a language model, including: receiving a collection of n-grams from a corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus, and generating a trie representing the collection of n-grams, the trie being represented using one or more arrays of integers, and compressing an array representation of the trie using block encoding; and using the language model to identify a second probability of a particular string of words occurring.
Type: Grant
Filed: June 17, 2009
Date of Patent: May 13, 2014
Assignee: Google Inc.
Inventors: Boulos Harb, Ciprian Chelba, Jeffrey A. Dean, Sanjay Ghemawat
-
Patent number: 8719017
Abstract: Speech recognition models are dynamically re-configurable based on user information, background information such as background noise and transducer information such as transducer response characteristics to provide users with alternate input modes to keyboard text entry. The techniques of dynamic re-configurable speech recognition provide for deployment of speech recognition on small devices such as mobile phones and personal digital assistants as well as environments such as office, home or vehicle while maintaining the accuracy of the speech recognition.
Type: Grant
Filed: May 15, 2008
Date of Patent: May 6, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Richard C Rose, Bojana Gajic
-
Patent number: 8712757
Abstract: A method for communication management includes receiving at least one keyword and receiving a replay time span input. Further, the method includes receiving a plurality of communication inputs including at least a first communication input and a second communication input, monitoring at least the first communication input and second communication input for the at least one keyword, and determining an instantiation of the at least one keyword in at least one of the first communication input and second communication input. Additionally, the method includes associating the determined instantiation with one of the first communication input and second communication input, and providing at least a portion of the communication associated with the determined instantiation based on the replay time span input responsive to the instantiation.
Type: Grant
Filed: January 10, 2007
Date of Patent: April 29, 2014
Assignee: Nuance Communications, Inc.
Inventors: Rick A. Hamilton, II, Peter G. Finn, Christopher J. Dawson, John S. Langford
-
Publication number: 20140114661
Abstract: Methods and systems for speech recognition processing are described. In an example, a computing device may be configured to receive information indicative of a frequency of submission of a search query to a search engine for a search query composed of a sequence of words. Based on the frequency of submission of the search query exceeding a threshold, the computing device may be configured to determine groupings of one or more words of the search query based on an order in which the one or more words occur in the sequence of words of the search query. Further, the computing device may be configured to provide information indicating the groupings to a speech recognition system.
Type: Application
Filed: September 24, 2013
Publication date: April 24, 2014
Applicant: Google Inc.
Inventors: Pedro J. Moreno Mengibar, Jeffrey Scott Sorensen, Eugene Weinstein
-
Patent number: 8700370
Abstract: A method, system and program storage device for history matching and forecasting of subterranean reservoirs is provided. Reservoir parameters and probability models associated with a reservoir model are defined. A likelihood function associated with observed data is also defined. A usable likelihood proxy for the likelihood function is constructed. Reservoir model parameters are sampled utilizing the usable proxy for the likelihood function and utilizing the probability models to determine a set of retained models. Forecasts are estimated for the retained models using a forecast proxy. Finally, computations are made on the parameters and forecasts associated with the retained models to obtain at least one of probability density functions, cumulative density functions and histograms for the reservoir model parameters and forecasts. The system carries out the above method and the program storage device carries instructions for carrying out the method.
Type: Grant
Filed: December 19, 2007
Date of Patent: April 15, 2014
Assignee: Chevron U.S.A. Inc.
Inventor: Jorge L. Landa
-
Patent number: 8700400
Abstract: Subspace speech adaptation may be utilized for facilitating the recognition of speech containing short utterances. Speech training data may be received in a speech model by a computer. A first matrix may be determined for preconditioning speech statistics based on the speech training data. A second matrix may be determined for representing a basis for the speech to be recognized. A set of basis matrices may then be determined from the first matrix and the second matrix. Speech test data including a short utterance may then be received by the computer. The computer may then apply the set of basis matrices to the speech test data to produce a transcription. The transcription may represent speech recognition of the short utterance.
Type: Grant
Filed: December 30, 2010
Date of Patent: April 15, 2014
Assignee: Microsoft Corporation
Inventors: Daniel Povey, Kaisheng Yao, Yifan Gong
-
Patent number: 8700402
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for tracking multiple dialog states. A system practicing the method receives an N-best list of speech recognition candidates, a list of current partitions, and a belief for each of the current partitions. A partition is a group of dialog states. In an outer loop, the system iterates over the N-best list of speech recognition candidates. In an inner loop, the system performs a split, update, and recombination process to generate a fixed number of partitions after each speech recognition candidate in the N-best list. The system recognizes speech based on the N-best list and the fixed number of partitions. The split process can perform all possible splits on all partitions. The update process can compute an estimated new belief. The estimated new belief can be a product of ASR reliability, user likelihood to produce this action, and an original belief.
Type: Grant
Filed: June 4, 2013
Date of Patent: April 15, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Jason Williams
-
Patent number: 8694313
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar.
Type: Grant
Filed: May 19, 2010
Date of Patent: April 8, 2014
Assignee: Google Inc.
Inventors: Matthew I. Lloyd, Willard Van Tuyl Rusch, II
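One way to picture an affinity score that combines frequency and recency is a sum of time-decayed interactions. The sketch below is an assumption for illustration only: the abstract does not specify a formula, and the exponential half-life decay, the normalization into probabilities, and the names `affinity` and `intent_probabilities` are all hypothetical.

```python
def affinity(interaction_ages_days, half_life_days=30.0):
    """Each past interaction contributes a weight that halves every
    half_life_days, so frequent and recent contact both raise the score."""
    return sum(0.5 ** (age / half_life_days) for age in interaction_ages_days)


def intent_probabilities(history):
    """Map each item of contact information to a probability proportional
    to its affinity score. history: item -> list of interaction ages in days."""
    scores = {item: affinity(ages) for item, ages in history.items()}
    total = sum(scores.values())
    if total == 0:
        # No interactions at all: fall back to a uniform guess.
        return {item: 1.0 / len(scores) for item in scores}
    return {item: score / total for item, score in scores.items()}


history = {"Bob mobile": [1, 2, 3], "Bob work": [200]}
probs = intent_probabilities(history)
```

Such probabilities could then weight the alternatives in a communication initiation grammar.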
-
Patent number: 8688450
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information are described. A method includes determining, for each of multiple communications that were initiated by a user of a mobile device, a time when the communication was initiated or received; determining, for each of multiple contacts associated with the user, a probability associated with the contact based at least on the times when the communications were initiated or received; weighting a contact disambiguation grammar according to the probabilities; and processing audio data using the contact disambiguation grammar to select a particular contact.
Type: Grant
Filed: July 10, 2012
Date of Patent: April 1, 2014
Assignee: Google Inc.
Inventors: Matthew I. Lloyd, Willard Van Tuyl Rusch, II
-
Patent number: 8688444
Abstract: A system and method of updating automatic speech recognition parameters on a mobile device are disclosed. The method comprises storing user account-specific adaptation data associated with ASR on a computing device associated with a wireless network, generating new ASR adaptation parameters based on transmitted information from the mobile device when a communication channel between the computing device and the mobile device becomes available and transmitting the new ASR adaptation data to the mobile device when a communication channel between the computing device and the mobile device becomes available. The new ASR adaptation data on the mobile device more accurately recognizes user utterances.
Type: Grant
Filed: October 22, 2012
Date of Patent: April 1, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Sarangarajan Parthasarathy, Richard Cameron Rose
-
Patent number: 8676580
Abstract: A method, an apparatus and an article of manufacture for automatic speech recognition. The method includes obtaining at least one language model word and at least one rule-based grammar word, determining an acoustic similarity of at least one pair of language model word and rule-based grammar word, and increasing a transition cost to the at least one language model word based on the acoustic similarity of the at least one language model word with the at least one rule-based grammar word to generate a modified language model for automatic speech recognition.
Type: Grant
Filed: August 16, 2011
Date of Patent: March 18, 2014
Assignee: International Business Machines Corporation
Inventors: Om D. Deshmukh, Etienne Marcheret, Shajith I. Mohamed, Ashish Verma, Karthik Visweswariah
-
Publication number: 20140074470
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for improved pronunciation. One of the methods includes receiving data that represents an audible pronunciation of the name of an individual from a user device. The method includes identifying one or more other users that are members of a social circle of which the individual is a member. The method includes identifying one or more devices associated with the other users. The method also includes providing information that identifies the individual and the data representing the audible pronunciation to the one or more identified devices.
Type: Application
Filed: July 23, 2013
Publication date: March 13, 2014
Applicant: Google Inc.
Inventors: Martin Jansche, Mark Edward Epstein, Ciprian I. Chelba
-
Patent number: 8670983
Abstract: A method for determining a similarity between a first audio source and a second audio source includes: for the first audio source, determining a first frequency of occurrence for each of a plurality of phoneme sequences and determining a first weighted frequency for each of the plurality of phoneme sequences based on the first frequency of occurrence for the phoneme sequence; for the second audio source, determining a second frequency of occurrence for each of a plurality of phoneme sequences and determining a second weighted frequency for each of the plurality of phoneme sequences based on the second frequency of occurrence for the phoneme sequence; comparing the first weighted frequency for each phoneme sequence with the second weighted frequency for the corresponding phoneme sequence; and generating a similarity score representative of a similarity between the first audio source and the second audio source based on the results of the comparing.
Type: Grant
Filed: August 30, 2011
Date of Patent: March 11, 2014
Assignee: Nexidia Inc.
Inventors: Jacob B. Garland, Jon A. Arrowood, Drew Lanham, Marsal Gavalda
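The count, weight, compare, and score steps in this abstract can be sketched as follows. The specific choices here are assumptions, not taken from the patent: phoneme sequences are taken to be n-grams over an already decoded phoneme list, the weighting is a sublinear `1 + log(count)`, and the comparison is cosine similarity. The function names are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(phonemes, n=2):
    """Frequency of occurrence of each phoneme sequence of length n."""
    return Counter(tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1))


def weighted(counts):
    """Assumed weighting: dampen very frequent sequences with 1 + log(count)."""
    return {seq: 1.0 + math.log(c) for seq, c in counts.items()}


def similarity(source_a, source_b, n=2):
    """Cosine similarity between the weighted frequency vectors of two sources."""
    wa = weighted(ngram_counts(source_a, n))
    wb = weighted(ngram_counts(source_b, n))
    dot = sum(w * wb.get(seq, 0.0) for seq, w in wa.items())
    norm_a = math.sqrt(sum(w * w for w in wa.values()))
    norm_b = math.sqrt(sum(w * w for w in wb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


first = ["k", "ae", "t", "s", "ae", "t"]
second = ["d", "ao", "g"]
```

With these inputs, a source compared with itself scores 1 (up to floating-point error) and two sources with no shared phoneme pairs score 0.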
-
Publication number: 20140067394
Abstract: The system and method for speech decoding in speech recognition systems provides decoding for speech variants common to such languages. These variants include within-word and cross-word variants. For decoding of within-word variants, a data-driven approach is used, in which phonetic variants are identified, and a pronunciation dictionary and language model of a dynamic programming speech recognition system are updated based upon these identifications. Cross-word variants are handled with a knowledge-based approach, applying phonological rules, part-of-speech tagging or tagging of small words to a speech transcription corpus and updating the pronunciation dictionary and language model of the dynamic programming speech recognition system based upon identified cross-word variants.
Type: Application
Filed: August 28, 2012
Publication date: March 6, 2014
Applicants: KING ABDULAZIZ CITY FOR SCIENCE AND TECHNOLOGY, KING FAHD UNIVERSITY OF PETROLEUM AND MINERALS
Inventors: DIA EDDIN M. ABUZEINA, MOUSTAFA ELSHAFEI, HUSNI AL-MUHTASEB, WASFI G. AL-KHATIB
-
Patent number: 8666740
Abstract: An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.
Type: Grant
Filed: June 22, 2012
Date of Patent: March 4, 2014
Assignee: Google Inc.
Inventors: Matthew I. Lloyd, Trausti T. Kristjansson
-
Patent number: 8660842
Abstract: A speech recognition device uses visual information to narrow down the range of likely adaptation parameters even before a speaker makes an utterance. Images of the speaker and/or the environment are collected using an image capturing device, and then processed to extract biometric features and environmental features. The extracted biometric and environmental features are then used to estimate adaptation parameters. A voice sample may also be collected to refine the adaptation parameters for more accurate speech recognition.
Type: Grant
Filed: March 9, 2010
Date of Patent: February 25, 2014
Assignee: Honda Motor Co., Ltd.
Inventor: Antoine R. Raux
-
Patent number: 8655664
Abstract: According to an embodiment, a text presentation apparatus presenting text for a speaker to read aloud for voice recording includes: a text storing unit for storing first text; a presenting unit for presenting the first text; a determination unit for determining whether or not the first text needs to be replaced, on the basis of a speaker's input for the first text presented; a preliminary text storing unit for storing preliminary text; a select unit configured to select, if it is determined that the first text needs to be replaced, second text to replace the first text from among the preliminary text, the selecting being performed on the basis of attribute information describing an attribute of the first text and on the basis of at least one of attribute information describing pronunciation of the first text and attribute information describing a stress type of the first text; and a control unit configured to control the presenting unit so that the presenting unit presents the second text.
Type: Grant
Filed: August 11, 2011
Date of Patent: February 18, 2014
Assignee: Kabushiki Kaisha Toshiba
Inventors: Kentaro Tachibana, Gou Hirabayashi, Takehiko Kagoshima
-
Publication number: 20140046663
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving speech recognition accuracy using textual context. The method includes retrieving a recorded utterance, capturing text from a device display associated with the spoken dialog and viewed by one party to the recorded utterance, and identifying words in the captured text that are relevant to the recorded utterance. The method further includes adding the identified words to a dynamic language model, and recognizing the recorded utterance using the dynamic language model. The recorded utterance can be a spoken dialog. A time stamp can be assigned to each identified word. The method can include adding identified words to and/or removing identified words from the dynamic language model based on their respective time stamps. A screen scraper can capture text from the device display associated with the recorded utterance. The device display can contain customer service data.
Type: Application
Filed: October 24, 2013
Publication date: February 13, 2014
Applicant: AT&T Intellectual Property I, L.P.
Inventor: Dan Melamed
-
Patent number: 8645136
Abstract: A system and method for efficiently reducing transcription error using hybrid voice transcription is provided. A voice stream is parsed from a call into utterances. An initial transcribed value and corresponding recognition score are assigned to each utterance. A transcribed message is generated for the call and includes the initial transcribed values. A threshold is applied to the recognition scores to identify those utterances with recognition scores below the threshold as questionable utterances. At least one questionable utterance is compared to other questionable utterances from other calls and a group of similar questionable utterances is formed. One or more of the similar questionable utterances is selected from the group. A common manual transcription value is received for the selected similar questionable utterances. The common manual transcription value is assigned to the remaining similar questionable utterances in the group.
Type: Grant
Filed: July 20, 2010
Date of Patent: February 4, 2014
Assignee: Intellisist, Inc.
Inventor: David Milstein
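The threshold, group, and propagate flow described here can be sketched in a few lines. This is only an illustration under stated assumptions: the similarity comparison is replaced by a hypothetical `group_key` function, a human transcriber is represented by the hypothetical callback `transcribe_manually`, and the dictionary layout of an utterance is invented for the example.

```python
def propagate_manual_transcriptions(utterances, threshold, group_key, transcribe_manually):
    """Flag low-scoring utterances, group the similar ones, obtain one manual
    transcription per group, and assign it to every member of that group."""
    groups = {}
    for utt in utterances:
        if utt["score"] < threshold:  # questionable utterance
            groups.setdefault(group_key(utt), []).append(utt)
    for members in groups.values():
        value = transcribe_manually(members[0])  # one manual pass per group
        for utt in members:
            utt["text"] = value
    return utterances


# Hypothetical data: "sig" stands in for whatever makes utterances similar.
calls = []
utterances = [
    {"text": "for tea", "score": 0.3, "sig": "A"},
    {"text": "forty", "score": 0.9, "sig": "A"},
    {"text": "four tee", "score": 0.2, "sig": "A"},
]


def manual(utt):
    calls.append(utt)
    return "forty"


result = propagate_manual_transcriptions(utterances, 0.5, lambda u: u["sig"], manual)
```

The efficiency gain comes from the manual step running once per group rather than once per questionable utterance.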
-
Patent number: 8645138
Abstract: Disclosed are apparatus and methods for processing spoken speech. Input speech can be received at a computing system. During a first pass of speech recognition, a plurality of language model outputs can be determined by: providing the input speech to each of a plurality of language models and responsively receiving a language model output from each language model. A language model of the plurality of language models can be selected using a classifier operating on the plurality of language model outputs. During a second pass of speech recognition, a revised language model output can be determined by: providing the input speech and the language model output from the selected language model to the selected language model and responsively receiving the revised language model output from the selected language model. The computing system can generate a result based on the revised language model output.
Type: Grant
Filed: December 20, 2012
Date of Patent: February 4, 2014
Assignee: Google Inc.
Inventors: Eugene Weinstein, Austin Waters
-
Publication number: 20140032216
Abstract: A method for a portable device includes receiving a spoken utterance of a word or phrase, generating a plurality of alternative pronunciations of the spoken utterance, scoring one or more pronunciations of the plurality of alternative pronunciations using the spoken utterance, and updating a lexicon with at least one scored pronunciation.
Type: Application
Filed: September 30, 2013
Publication date: January 30, 2014
Applicant: Nuance Communications, Inc.
Inventors: Daniel L. Roth, Laurence S. Gillick, Michael L. Shire
-
Publication number: 20140006024
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
Type: Application
Filed: September 5, 2013
Publication date: January 2, 2014
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
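The model selection order in this abstract is a simple fallback chain. A minimal sketch, assuming models are held in per-user dictionaries (the storage layout and the name `select_model` are hypothetical, and strings stand in for real model objects):

```python
def select_model(user_id, supervised, unsupervised, generic):
    """Prefer the user's supervised model, then the unsupervised one,
    and fall back to the generic model when neither is available."""
    if user_id in supervised:
        return supervised[user_id]
    if user_id in unsupervised:
        return unsupervised[user_id]
    return generic


supervised = {"ana": "ana-supervised"}
unsupervised = {"ana": "ana-unsupervised", "ben": "ben-unsupervised"}
generic = "generic"
```

Here `ana` gets her supervised model, `ben` gets his unsupervised model, and any other user gets the generic one.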
-
Patent number: 8615393
Abstract: A noise suppressor for altering a speech signal is trained based on a speech recognition system. An objective function can be utilized to adjust parameters of the noise suppressor. The noise suppressor can be used to alter speech signals for the speech recognition system.
Type: Grant
Filed: November 15, 2006
Date of Patent: December 24, 2013
Assignee: Microsoft Corporation
Inventors: Ivan J. Tashev, Alejandro Acero, James G. Droppo
-
Patent number: 8612225
Abstract: A voice recognition device that recognizes a voice of an input voice signal, comprises a voice model storage unit that stores in advance a predetermined voice model having a plurality of detail levels, the plurality of detail levels being information indicating a feature property of a voice for the voice model; a detail level selection unit that selects a detail level, closest to a feature property of an input voice signal, from the detail levels of the voice model stored in the voice model storage unit; and a parameter setting unit that sets parameters for recognizing the voice of an input voice according to the detail level selected by the detail level selection unit.
Type: Grant
Filed: February 26, 2008
Date of Patent: December 17, 2013
Assignee: NEC Corporation
Inventors: Takayuki Arakawa, Ken Hanazawa, Masanori Tsujikawa
-
Patent number: 8612229
Abstract: A method (300) and system (100) are provided to add the creation of examples at a developer level in the generation of Natural Language Understanding (NLU) models, tying the examples into a NLU sentence database (130), automatically validating (310) a correct outcome of using the examples, and automatically resolving (316) problems the user has using the examples. The method (300) can convey examples of what a caller can say to a Natural Language Understanding (NLU) application. The method includes entering at least one example associated with an existing routing destination, and ensuring an NLU model correctly interprets the example unambiguously for correctly routing a call to the routing destination. The method can include presenting the example sentence in a help message (126) within an NLU dialogue as an example of what a caller can say for connecting the caller to a desired routing destination.
Type: Grant
Filed: December 15, 2005
Date of Patent: December 17, 2013
Assignee: Nuance Communications, Inc.
Inventors: Rajesh Balchandran, Linda M. Boyer, James R. Lewis, Brent D. Metz
-
Patent number: 8606580
Abstract: To provide a data process unit and data process unit control program that are suitable for generating acoustic models for unspecified speakers taking distribution of diversifying feature parameters into consideration under such specific conditions as the type of speaker, speech lexicons, speech styles, and speech environment and that are suitable for providing acoustic models intended for unspecified speakers and adapted to speech of a specific person. The data process unit comprises a data classification section, data storing section, pattern model generating section, data control section, mathematical distance calculating section, pattern model converting section, pattern model display section, region dividing section, division changing section, region selecting section, and specific pattern model generating section.
Type: Grant
Filed: December 30, 2008
Date of Patent: December 10, 2013
Assignee: Asahi Kasei Kabushiki Kaisha
Inventors: Makoto Shozakai, Goshu Nagino
-
Publication number: 20130325471
Abstract: Some aspects include transforming data, at least a portion of which has been processed to determine at least one representative vector associated with each of a plurality of classifications associated with the data to obtain a plurality of representative vectors. Techniques comprise determining a first transformation based, at least in part, on the plurality of representative vectors, applying at least the first transformation to the data to obtain transformed data, and fitting a plurality of clusters to the transformed data to obtain a plurality of established clusters. Some aspects include classifying input data by transforming the input data using at least the first transformation and comparing the transformed input data to the established clusters.
Type: Application
Filed: August 8, 2012
Publication date: December 5, 2013
Applicant: Nuance Communications, Inc.
Inventors: Leonid Rachevsky, Dimitri Kanevsky, Bhuvana Ramabhadran
-
Patent number: 8600749
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for training adaptation-specific acoustic models. A system practicing the method receives speech and generates a full size model and a reduced size model, the reduced size model starting with a single distribution for each speech sound in the received speech. The system finds speech segment boundaries in the speech using the full size model and adapts features of the speech data using the reduced size model based on the speech segment boundaries and an overall centroid for each speech sound. The system then recognizes speech using the adapted features of the speech. The model can be a Hidden Markov Model (HMM). The reduced size model can also be of a reduced complexity, such as having fewer mixture components than a model of full complexity. Adapting features of speech can include moving the features closer to an overall feature distribution center.
Type: Grant
Filed: December 8, 2009
Date of Patent: December 3, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Andrej Ljolje
-
Patent number: 8600741
Abstract: A system and method for tuning a speech recognition engine to an individual microphone using a database containing acoustical models for a plurality of microphones. Microphone performance characteristics are obtained from a microphone at a speech recognition engine, the database is searched for an acoustical model that matches the characteristics, and the speech recognition engine is then modified based on the matching acoustical model.
Type: Grant
Filed: August 20, 2008
Date of Patent: December 3, 2013
Assignee: General Motors LLC
Inventors: Gaurav Talwar, Rathinavelu Chengalvarayan, Jesse T. Gratke, Subhash B. Gullapalli, Dana B. Fecher
-
Publication number: 20130317822Abstract: A model adaptation device includes a recognition unit which creates a recognition result of recognizing data that complies with a target domain which is an assumed condition of recognition target data, based on at least two models and a candidate of a weighting factor indicating a weight of each model on a recognition process. A weighting factor determination unit determines the weighting factor so as to assign a smaller weight to a model having higher reliability. A model update unit updates at least one model out of the models, using the recognition result as the truth label.Type: ApplicationFiled: January 31, 2012Publication date: November 28, 2013Inventor: Takafumi Koshinaka
-
Patent number: 8595006Abstract: A speech recognition method and system includes receiving, in a first noise environment, a speech input having a sequence of observations; determining a likelihood of a sequence of words arising from the sequence of observations using an acoustic model trained to recognize speech in a second noise environment, the model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to an observation; and adapting the model trained in the second environment to that of the first environment.Type: GrantFiled: March 26, 2010Date of Patent: November 26, 2013Assignee: Kabushiki Kaisha ToshibaInventors: Haitian Xu, Mark John Francis Gales
-
Patent number: 8589163Abstract: Disclosed herein are systems, methods, and computer-readable storage media for performing speech recognition based on a masked language model. A system configured to practice the method receives a masked language model including a plurality of words, wherein a bit mask identifies whether each of the plurality of words is allowed or disallowed with regard to an adaptation subset, receives input speech, generates a speech recognition lattice based on the received input speech using the masked language model, removes from the generated lattice words identified as disallowed by the bit mask for the adaptation subset, and recognizes the received speech based on the lattice. Alternatively during the generation step, the system can only add words indicated as allowed by the bit mask. The bit mask can be separate from or incorporated as part of the masked language model. The system can dynamically update the adaptation subset and bit mask.Type: GrantFiled: December 4, 2009Date of Patent: November 19, 2013Assignee: AT&T Intellectual Property I, L.P.Inventors: Andrej Ljolje, Mazin Gilbert
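The bit-mask filtering in this abstract can be sketched roughly as follows. This is an illustration under assumptions, not the patented implementation: the vocabulary, the bit layout (one bit per adaptation subset), and the names `is_allowed` and `filter_lattice` are hypothetical, and the lattice is simplified to a list of time slots that each hold candidate words.

```python
# Hypothetical sketch: every word carries an integer bit mask, and bit k set
# means the word is allowed for adaptation subset k. The vocabulary and the
# bit layout are illustrative assumptions.
MASKED_VOCAB = {
    "call": 0b11,      # allowed in subsets 0 and 1
    "dial": 0b01,      # allowed in subset 0 only
    "navigate": 0b10,  # allowed in subset 1 only
    "home": 0b11,      # allowed in subsets 0 and 1
}

def is_allowed(word, subset, vocab=MASKED_VOCAB):
    """Return True if the word's mask has the bit for `subset` set."""
    return bool((vocab.get(word, 0) >> subset) & 1)

def filter_lattice(lattice, subset, vocab=MASKED_VOCAB):
    """Remove disallowed words from a simplified lattice.

    The lattice is a list of slots; each slot is a list of (word, score).
    """
    return [
        [(word, score) for (word, score) in slot if is_allowed(word, subset, vocab)]
        for slot in lattice
    ]

lattice = [
    [("call", 0.6), ("dial", 0.3), ("navigate", 0.1)],
    [("home", 0.9)],
]
pruned = filter_lattice(lattice, subset=1)
```

Keeping the mask as a plain integer per word means a single shared vocabulary can serve several adaptation subsets, and updating a subset only requires flipping bits rather than rebuilding the model.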
-
Patent number: 8589164Abstract: Methods and systems for speech recognition processing are described. In an example, a computing device may be configured to receive information indicative of a frequency of submission of a search query to a search engine for a search query composed of a sequence of words. Based on the frequency of submission of the search query exceeding a threshold, the computing device may be configured to determine groupings of one or more words of the search query based on an order in which the one or more words occur in the sequence of words of the search query. Further, the computing device may be configured to provide information indicating the groupings to a speech recognition system.Type: GrantFiled: March 15, 2013Date of Patent: November 19, 2013Assignee: Google Inc.Inventors: Pedro J. Moreno Mengibar, Jeffrey Scott Sorensen, Eugene Weinstein
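One way to read the thresholding and grouping steps in this abstract is sketched below. It assumes that a "grouping" is a contiguous, order-preserving run of words from the query; the abstract does not spell this out, so that interpretation, the function names, and the sample counts are all hypothetical.

```python
def contiguous_groupings(query):
    """Return every contiguous word grouping of a query, preserving word order.

    Treating groupings as contiguous runs is an assumption of this sketch.
    """
    words = query.split()
    return [
        tuple(words[i:j])
        for i in range(len(words))
        for j in range(i + 1, len(words) + 1)
    ]

def groupings_for_frequent_queries(query_counts, threshold):
    """Compute groupings only for queries submitted more often than `threshold`."""
    return {
        query: contiguous_groupings(query)
        for query, count in query_counts.items()
        if count > threshold
    }

# Made-up submission counts for illustration.
counts = {"weather in boston": 1200, "obscure query text": 3}
result = groupings_for_frequent_queries(counts, threshold=100)
```

The resulting groupings are what would be handed to the speech recognition system; how that system consumes them is not modeled here.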
-
Patent number: 8589156Abstract: A system, method, computer-readable medium, and computer-implemented system for optimizing allocation of speech recognition tasks among multiple speech recognizers and combining recognizer results is described. An allocation determination is performed to allocate speech recognition among multiple speech recognizers using at least one of an accuracy-based allocation mechanism, a complexity-based allocation mechanism, and an availability-based allocation mechanism. The speech recognition is allocated among the speech recognizers based on the determined allocation. Recognizer results received from multiple speech recognizers in accordance with the speech recognition task allocation are combined.Type: GrantFiled: July 12, 2004Date of Patent: November 19, 2013Assignee: Hewlett-Packard Development Company, L.P.Inventors: Paul M. Burke, Sherif Yacoub
-
Patent number: 8583434Abstract: Computer-implemented methods and apparatus are provided to facilitate the recognition of the content of a body of speech data. In one embodiment, a method for analyzing verbal communication is provided, comprising acts of producing an electronic recording of a plurality of spoken words; processing the electronic recording to identify a plurality of word alternatives for each of the spoken words, each of the plurality of word alternatives being identified by comparing a portion of the electronic recording with a lexicon, and each of the plurality of word alternatives being assigned a probability of correctly identifying a spoken word; loading the word alternatives and the probabilities to a database for subsequent analysis; and examining the word alternatives and the probabilities to determine at least one characteristic of the plurality of spoken words.Type: GrantFiled: January 29, 2008Date of Patent: November 12, 2013Assignee: CallMiner, Inc.Inventor: Jeffrey A. Gallino
-
Patent number: 8577681Abstract: A method of generating an alternative pronunciation for a word or phrase, given an initial pronunciation and a spoken example of the word or phrase, includes providing the initial pronunciation of the word or phrase, and generating the alternative pronunciation by searching a neighborhood of pronunciations about the initial pronunciation via a constrained hypothesis, wherein the neighborhood includes pronunciations that differ from the initial pronunciation by at most one phoneme. The method further includes selecting a highest scoring pronunciation within the neighborhood of pronunciations.Type: GrantFiled: September 13, 2004Date of Patent: November 5, 2013Assignee: Nuance Communications, Inc.Inventors: Daniel L. Roth, Laurence S. Gillick, Mike Shire
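The neighborhood search in this abstract can be sketched as follows. The sketch assumes that "differ by at most one phoneme" means a single substitution, insertion, or deletion; the phoneme inventory, the function names, and the toy scoring function are hypothetical. A real system would score each candidate against the spoken example with an acoustic model, which is not shown.

```python
# Hypothetical phoneme inventory, kept tiny for illustration.
PHONEMES = ["AH", "EH", "IY", "T", "D", "M", "OW"]

def neighborhood(pron, inventory=PHONEMES):
    """Yield `pron` and every variant one phoneme edit away from it."""
    pron = tuple(pron)
    seen = {pron}
    yield pron
    for i in range(len(pron) + 1):
        candidates = []
        for p in inventory:
            candidates.append(pron[:i] + (p,) + pron[i:])          # insertion
            if i < len(pron):
                candidates.append(pron[:i] + (p,) + pron[i + 1:])  # substitution
        if i < len(pron):
            candidates.append(pron[:i] + pron[i + 1:])             # deletion
        for cand in candidates:
            if cand and cand not in seen:
                seen.add(cand)
                yield cand

def best_pronunciation(initial, score):
    """Return the highest-scoring pronunciation in the neighborhood."""
    return max(neighborhood(initial), key=score)

# Toy scorer standing in for an acoustic match against the spoken example.
TARGET = ("T", "AH", "M", "EH", "T", "OW")

def toy_score(pron):
    if len(pron) != len(TARGET):
        return -1
    return sum(a == b for a, b in zip(pron, TARGET))

initial = ("T", "AH", "M", "AH", "T", "OW")
best = best_pronunciation(initial, toy_score)
```

Restricting the search to one edit keeps the candidate set small (linear in the pronunciation length times the inventory size), which is what makes exhaustive scoring practical.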
-
Patent number: 8571866Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving speech recognition accuracy using textual context. The method includes retrieving a recorded utterance, capturing text from a device display associated with the spoken dialog and viewed by one party to the recorded utterance, and identifying words in the captured text that are relevant to the recorded utterance. The method further includes adding the identified words to a dynamic language model, and recognizing the recorded utterance using the dynamic language model. The recorded utterance can be a spoken dialog. A time stamp can be assigned to each identified word. The method can include adding identified words to and/or removing identified words from the dynamic language model based on their respective time stamps. A screen scraper can capture text from the device display associated with the recorded utterance. The device display can contain customer service data.Type: GrantFiled: October 23, 2009Date of Patent: October 29, 2013Assignee: AT&T Intellectual Property I, L.P.Inventors: Dan Melamed, Srinivas Bangalore, Michael Johnston
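The time-stamped additions and removals in this abstract can be sketched with a small data structure. This is a hedged illustration only: the class name `DynamicVocabulary`, the fixed maximum-age expiry rule, and the sample words are assumptions, and the step of feeding the active words into an actual language model is omitted.

```python
class DynamicVocabulary:
    """Hypothetical store of screen-captured words with time stamps.

    Words older than `max_age_seconds` are dropped; this expiry rule is an
    assumption of the sketch, not a detail taken from the patent.
    """

    def __init__(self, max_age_seconds):
        self.max_age = max_age_seconds
        self._added_at = {}

    def add_words(self, words, timestamp):
        """Record each word with the time it was captured (refreshing repeats)."""
        for word in words:
            self._added_at[word.lower()] = timestamp

    def expire(self, now):
        """Remove words whose time stamp is older than the allowed age."""
        stale = [w for w, t in self._added_at.items() if now - t > self.max_age]
        for word in stale:
            del self._added_at[word]

    def active_words(self, now):
        """Return the words that should currently bias recognition."""
        self.expire(now)
        return set(self._added_at)

vocab = DynamicVocabulary(max_age_seconds=60)
vocab.add_words(["invoice", "refund"], timestamp=0)
vocab.add_words(["refund", "upgrade"], timestamp=50)
current = vocab.active_words(now=70)
```

Re-adding a word refreshes its time stamp, so terms that stay on screen remain active while terms from earlier screens age out.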
-
Publication number: 20130262106Abstract: A system and method for adapting a language model to a specific environment by receiving interactions captured in the specific environment, generating a collection of documents from documents retrieved from external resources, detecting in the collection of documents terms related to the environment that are not included in an initial language model, and adapting the initial language model to include the detected terms.Type: ApplicationFiled: March 29, 2012Publication date: October 3, 2013Inventors: Eyal HURVITZ, Ezra Daya, Oren Pereg, Moshe Wasserblat
-
Patent number: 8548808Abstract: A speech understanding apparatus includes a speech recognition unit which performs speech recognition of an utterance using multiple language models, and outputs multiple speech recognition results obtained by the speech recognition, a language understanding unit which uses multiple language understanding models to perform language understanding for each of the multiple speech recognition results output from the speech recognition unit, and outputs multiple speech understanding results obtained from the language understanding, and an integrating unit which calculates, based on values representing features of the speech understanding results, utterance batch confidences that numerically express accuracy of the speech understanding results for each of the multiple speech understanding results output from the language understanding unit, and selects one of the speech understanding results with a highest utterance batch confidence among the calculated utterance batch confidences.Type: GrantFiled: January 22, 2010Date of Patent: October 1, 2013Assignee: Honda Motor Co., Ltd.Inventors: Mikio Nakano, Masaki Katsumaru, Kotaro Funakoshi, Hiroshi Okuno
-
Publication number: 20130246065Abstract: A method for generating a speech recognition model includes accessing a baseline speech recognition model, obtaining information related to recent language usage from search queries, and modifying the speech recognition model to revise probabilities of a portion of a sound occurrence based on the information. The portion of a sound may include a word. Also, a method for generating a speech recognition model, includes receiving at a search engine from a remote device an audio recording and a transcript that substantially represents at least a portion of the audio recording, synchronizing the transcript with the audio recording, extracting one or more letters from the transcript and extracting the associated pronunciation of the one or more letters from the audio recording, and generating a dictionary entry in a pronunciation dictionary.Type: ApplicationFiled: May 7, 2013Publication date: September 19, 2013Inventors: Michael H. Cohen, Shumeet Baluja, Pedro J. Moreno
-
Publication number: 20130246064Abstract: A system and method for real-time processing of a signal of a voice interaction. In an embodiment, a digital representation of a portion of an interaction may be analyzed in real-time and a segment may be selected. The segment may be associated with a source based on a model of the source. The model may be updated based on the segment. The updated model is used to associate subsequent segments with the source. Other embodiments are described and claimed.Type: ApplicationFiled: March 13, 2012Publication date: September 19, 2013Inventors: Moshe WASSERBLAT, Tzachi ASHKENAZI, Merav BEN-ASHER, Oren PEREG
-
Publication number: 20130238333Abstract: Disclosed herein are systems, methods, and computer-readable storage media for automatically generating a dialog manager for use in a spoken dialog system. A system practicing the method receives a set of user interactions having features, identifies an initial policy, evaluates all of the features in a linear evaluation step of the algorithm to identify a set of most important features, performs a cubic policy improvement step on the identified set of most important features, repeats the previous two steps one or more times, and generates a dialog manager for use in a spoken dialog system based on the resulting policy and/or set of most important features. Evaluating all of the features can include estimating a weight for each feature which indicates how much each feature contributes to at least one of the identified policies. The system can ignore features not in the set of most important features.Type: ApplicationFiled: April 30, 2013Publication date: September 12, 2013Applicant: AT&T Intellectual Property I, L.P.Inventors: Jason William, Suhrid Balakrishnan, Lihong Li
-
Publication number: 20130238334Abstract: A device and method for pass-phrase modeling for speaker verification and a speaker verification system are provided. The device comprises a front end which receives enrollment speech from a target speaker, and a template generation unit which generates a pass-phrase template with a general speaker model based on the enrollment speech. With the device, method and system of the present disclosure, by taking the rich variations contained in a general speaker model into account, robust pass-phrase modeling is ensured even when the enrollment data is insufficient, even when just one pass-phrase is available from a target speaker.Type: ApplicationFiled: December 10, 2010Publication date: September 12, 2013Applicant: Panasonic CorporationInventors: Long Ma, Haifeng Shen, Bingqi Zhang