Update Patterns Patents (Class 704/244)
-
Patent number: 8489399Abstract: An audible based electronic challenge system is used to control access to a computing resource by using a test to identify an origin of a voice. The test is based on analyzing a spoken utterance to determine if it was articulated by an unauthorized human or a text to speech (TTS) system.Type: GrantFiled: June 15, 2009Date of Patent: July 16, 2013Assignee: John Nicholas and Kristin Gross TrustInventor: John Nicholas Gross
-
Patent number: 8484024Abstract: Techniques are disclosed for using phonetic features for speech recognition. For example, a method comprises the steps of obtaining a first dictionary and a training data set associated with a speech recognition system, computing one or more support parameters from the training data set, transforming the first dictionary into a second dictionary, wherein the second dictionary is a function of one or more phonetic labels of the first dictionary, and using the one or more support parameters to select one or more samples from the second dictionary to create a set of one or more exemplar-based class identification features for a pattern recognition task.Type: GrantFiled: February 24, 2011Date of Patent: July 9, 2013Assignee: Nuance Communications, Inc.Inventors: Dimitri Kanevsky, David Nahamoo, Bhuvana Ramabhadran, Tara N. Sainath
-
Patent number: 8478593Abstract: Disclosed herein are methods and systems for recognizing speech. A method embodiment comprises comparing received speech with a precompiled grammar based on a database and if the received speech matches data in the precompiled grammar then returning a result based on the matched data. If the received speech does not match data in the precompiled grammar, then dynamically compiling a new grammar based only on new data added to the database after the compiling of the precompiled grammar. The database may comprise a directory of names.Type: GrantFiled: July 18, 2012Date of Patent: July 2, 2013Assignee: AT&T Intellectual Property II, L.P.Inventors: Harry Blanchard, Steven Lewis, Shankarnarayan Sivaprasad, Lan Zhang
-
Patent number: 8478589Abstract: A machine-readable medium may include a group of reusable components for building a spoken dialog system. The reusable components may include a group of previously collected audible utterances. A machine-implemented method to build a library of reusable components for use in building a natural language spoken dialog system may include storing a dataset in a database. The dataset may include a group of reusable components for building a spoken dialog system. The reusable components may further include a group of previously collected audible utterances. A second method may include storing at least one set of data. Each one of the at least one set of data may include ones of the reusable components associated with audible data collected during a different collection phase.Type: GrantFiled: January 5, 2005Date of Patent: July 2, 2013Assignee: AT&T Intellectual Property II, L.P.Inventors: Lee Begeja, Giuseppe Di Fabbrizio, David Crawford Gibbon, Dilek Z. Hakkani-Tur, Zhu Liu, Bernard S. Renger, Behzad Shahraray, Gokhan Tur
-
Patent number: 8473293Abstract: This specification describes technologies relating to systems, methods, and articles for updating a speech recognition dictionary based on, at least in part, both search query and market data metrics. In general, one innovative aspect of the subject matter described in this specification can be embodied in a method comprising (i) identifying a candidate term for possible inclusion in a speech recognition dictionary, (ii) identifying at least one search query metric associated with the identified candidate term, (iii) identifying at least one market data metric associated with the identified candidate term, and (iv) generating a candidate term score for the identified candidate term based, at least in part, on a weighted combination of the at least one identified search query metric and the at least one identified market data metric.Type: GrantFiled: October 23, 2012Date of Patent: June 25, 2013Assignee: Google Inc.Inventors: Pedro J. Mengibar, Jeffrey S. Sorensen
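The weighted combination in step (iv) can be illustrated with a minimal sketch. The function name, the default weights, and the assumption that both metrics are already normalized to a common scale are hypothetical; the abstract does not specify them.

```python
def candidate_term_score(search_metric: float, market_metric: float,
                         search_weight: float = 0.5,
                         market_weight: float = 0.5) -> float:
    """Score a candidate dictionary term as a weighted sum of two metrics.

    Both metrics are assumed (for illustration only) to lie on the same scale.
    """
    return search_weight * search_metric + market_weight * market_metric


# Example: a term with a strong search signal and a weaker market signal.
score = candidate_term_score(0.5, 0.25, search_weight=0.75, market_weight=0.25)
```

A term would then be added to the dictionary when its score clears some cutoff; that cutoff is likewise not given in the abstract.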
-
Patent number: 8473300Abstract: Methods and systems for log mining for grammar-based text processing are provided. A method may comprise receiving, from a device, an activity log. The activity log may comprise one or more of an input instruction, a determined function based at least in part on a match of the input instruction to a grammar-based textual pattern including associations of a given function based on one or more grammars, and a response determination based on an acknowledgement of the determined function. The method may also comprise comparing at least a portion of the activity log with stored activity logs in order to determine a correlation between the activity log and the stored activity logs. The method may also comprise modifying the grammar-based textual pattern based on the determined correlation and providing information indicative of the modification to the device so as to update the grammar-based textual pattern.Type: GrantFiled: October 8, 2012Date of Patent: June 25, 2013Assignee: Google Inc.Inventors: Pedro J. Moreno Mengibar, Martin Jansche, Fadi Biadsy
-
Patent number: 8463608Abstract: A method and apparatus for updating a speech model on a multi-user speech recognition system with a personal speech model for a single user. A speech recognition system, for instance in a car, can include a generic speech model for comparison with the user speech input. A way of identifying a personal speech model, for instance in a mobile phone, is connected to the system. A mechanism is included for receiving personal speech model components, for instance a BLUETOOTH connection. The generic speech model is updated using the received personal speech model components. Speech recognition can then be performed on user speech using the updated generic speech model.Type: GrantFiled: March 12, 2012Date of Patent: June 11, 2013Assignee: Nuance Communications, Inc.Inventors: Barry Neil Dow, Eric William Janke, Daniel Lee Yuk Cheung, Benjamin Terrick Staniford
-
Patent number: 8457968Abstract: Disclosed herein are systems, methods, and computer-readable storage media for tracking multiple dialog states. A system practicing the method receives an N-best list of speech recognition candidates, a list of current partitions, and a belief for each of the current partitions. A partition is a group of dialog states. In an outer loop, the system iterates over the N-best list of speech recognition candidates. In an inner loop, the system performs a split, update, and recombination process to generate a fixed number of partitions after each speech recognition candidate in the N-best list. The system recognizes speech based on the N-best list and the fixed number of partitions. The split process can perform all possible splits on all partitions. The update process can compute an estimated new belief. The estimated new belief can be a product of ASR reliability, user likelihood to produce this action, and an original belief.Type: GrantFiled: December 8, 2009Date of Patent: June 4, 2013Assignee: AT&T Intellectual Property I, L.P.Inventor: Jason Williams
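The update step described above, in which the estimated new belief is a product of ASR reliability, user likelihood, and the original belief, might be sketched as follows. The dictionary layout, the function name, and the final renormalization are assumptions made for illustration; the abstract only states the product.

```python
def update_beliefs(beliefs: dict, user_likelihood: dict,
                   asr_reliability: float) -> dict:
    """Return new partition beliefs for one recognition candidate.

    beliefs: partition name -> original belief.
    user_likelihood: partition name -> likelihood that the user produces
        this action in that partition.
    asr_reliability: reliability assigned to the recognition candidate.
    """
    raw = {name: asr_reliability * user_likelihood[name] * belief
           for name, belief in beliefs.items()}
    total = sum(raw.values())
    # Renormalizing so beliefs sum to 1 is an assumption, not stated in the abstract.
    return {name: value / total for name, value in raw.items()} if total > 0 else raw


new_beliefs = update_beliefs({"a": 0.5, "b": 0.5}, {"a": 0.75, "b": 0.25}, 0.5)
```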
-
Patent number: 8457962Abstract: This invention provides remote audio surveillance by recording audio data via three microphones and storage on a removable digital mass storage device, operating on battery power. The housing is of a weather resistant design to withstand outdoor conditions. Recording can be done in person or recording times can be defined so that the unit will only ‘listen’ during the desired times of the day, on a day to day basis. The user does not have to be in the vicinity but simply programs the record time(s) and leaves the device in the woods. The device also has play back capabilities for any recorded audio data and can interface with personal computers via the removable digital mass storage device. In addition to the audio collection and playback capabilities, PC software will be provided with the device which will analyze the data and provide direction of sound (based upon relative amplitude of the 3 microphones) and distance of sound (based on absolute and relative recorded amplitudes).Type: GrantFiled: August 4, 2006Date of Patent: June 4, 2013Inventor: Lawrence P. Jones
-
Patent number: 8457965Abstract: A method is described for correcting and improving the functioning of certain devices for the diagnosis and treatment of speech that dynamically measure the functioning of the velum in the control of nasality during speech. The correction method uses an estimate of the vowel frequency spectrum to greatly reduce the variation of nasalance with the vowel being spoken, so as to result in a corrected value of nasalance that reflects with greater accuracy the degree of velar opening. Correction is also described for reducing the effect on nasalance values of energy from the oral and nasal channels crossing over into the other channel because of imperfect acoustic separation.Type: GrantFiled: October 6, 2009Date of Patent: June 4, 2013Assignee: Rothenberg EnterprisesInventor: Martin Rothenberg
-
Publication number: 20130132084Abstract: A system and method for performing dual mode speech recognition, employing a local recognition module on a mobile device and a remote recognition engine on a server device. The system accepts a spoken query from a user, and both the local recognition module and the remote recognition engine perform speech recognition operations on the query, returning a transcription and confidence score, subject to a latency cutoff time. If both sources successfully transcribe the query, then the system accepts the result having the higher confidence score. If only one source succeeds, then that result is accepted. In either case, if the remote recognition engine does succeed in transcribing the query, then a client vocabulary is updated if the remote system result includes information not present in the client vocabulary.Type: ApplicationFiled: June 21, 2012Publication date: May 23, 2013Applicant: SOUNDHOUND, INC.Inventors: Timothy Stonehocker, Keyvan Mohajer, Bernard Mont-Reynaud
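The arbitration between the local and remote recognizers can be sketched as below. Here `None` stands for a source that failed or missed the latency cutoff, and the `Result` type, the tie-breaking rule, and the word-level vocabulary update are hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Result:
    text: str
    confidence: float


def choose_result(local: Optional[Result],
                  remote: Optional[Result]) -> Optional[Result]:
    """Pick the higher-confidence result, or whichever source succeeded."""
    if local is not None and remote is not None:
        # Ties go to the local result (an arbitrary choice for this sketch).
        return remote if remote.confidence > local.confidence else local
    return local if local is not None else remote


def update_vocabulary(vocabulary: Set[str], remote: Optional[Result]) -> Set[str]:
    """Add words from a successful remote result that the client lacks."""
    if remote is not None:
        vocabulary |= set(remote.text.lower().split())
    return vocabulary


local = Result("play some jazz", 0.6)
remote = Result("play some jazz fusion", 0.9)
best = choose_result(local, remote)
vocab = update_vocabulary({"play", "some", "jazz"}, remote)
```

Note that the vocabulary update runs whenever the remote engine succeeds, regardless of which result was accepted, which matches the "in either case" wording above.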
-
Patent number: 8447604Abstract: Provided in some embodiments is a method including receiving ordered script words that are indicative of dialogue words to be spoken, receiving audio data corresponding to at least a portion of the dialogue words to be spoken and including timecodes associated with dialogue words, generating a matrix of the ordered script words versus the dialogue words, aligning the matrix to determine hard alignment points that include matching consecutive sequences of ordered script words with corresponding sequences of dialogue words, partitioning the matrix of ordered script words into sub-matrices bounded by adjacent hard-alignment points and including corresponding sub-sets of the script and dialogue words between the hard-alignment points, and aligning each of the sub-matrices.Type: GrantFiled: May 28, 2010Date of Patent: May 21, 2013Assignee: Adobe Systems IncorporatedInventor: Walter W. Chang
-
Patent number: 8442826Abstract: Architecture for integrating application-dependent information into a constraints component at deployment time or when available. In terms of a general grammar, the constraints component can include or be a general grammar that comprises application-independent information and is structured in such a way that application-dependent information can be integrated into the general grammar without loss of fidelity. The general grammar includes a probability space and reserves a section of the probability space for the integration of application-dependent information. An integration component integrates the application-dependent information into the reserved section of the probability space for recognition processing. The application-dependent information is integrated into the reserved section of the probability space at deployment time or when available. The general grammar is structured to support the integration and improve the overall system.Type: GrantFiled: June 10, 2009Date of Patent: May 14, 2013Assignee: Microsoft CorporationInventors: Jonathan E. Hamaker, Julian James Odell, Michael D. Plumpe, Sandeep Manocha, Keith C. Herold
-
Publication number: 20130117023Abstract: A system and method of updating automatic speech recognition parameters on a mobile device are disclosed. The method comprises storing user account-specific adaptation data associated with ASR on a computing device associated with a wireless network, generating new ASR adaptation parameters based on transmitted information from the mobile device when a communication channel between the computing device and the mobile device becomes available and transmitting the new ASR adaptation data to the mobile device when a communication channel between the computing device and the mobile device becomes available. The new ASR adaptation data on the mobile device more accurately recognizes user utterances.Type: ApplicationFiled: October 22, 2012Publication date: May 9, 2013Applicant: AT&T INTELLECTUAL PROPERTY II, L.P.Inventor: AT&T INTELLECTUAL PROPERTY II, L.P.
-
Patent number: 8438027Abstract: An object of the invention is to conveniently increase standard patterns registered in a voice recognition device to efficiently extend the amount of words that can be voice-recognized. New standard patterns are generated by modifying a part of an existing standard pattern. A pattern matching unit 16 of a modifying-part specifying unit 14 performs pattern matching process to specify a part to be modified in the existing standard pattern of a usage source. A standard pattern generating unit 18 generates the new standard patterns by cutting or deleting voice data of the modifying part of the usage-source standard pattern, substituting the voice data of the modifying part of the usage-source standard pattern for another voice data, or combining the voice data of the modifying part of the usage-source standard pattern with another voice data. A standard pattern database update unit 20 adds the new standard patterns to a standard pattern database 24.Type: GrantFiled: May 25, 2006Date of Patent: May 7, 2013Assignee: Panasonic CorporationInventors: Toshiyuki Teranishi, Kouji Hatano
-
Patent number: 8438026Abstract: The invention describes a method and a system for generating training data (DT) for an automatic speech recogniser (2) for operating at a particular first sampling frequency (fH), comprising steps of deriving spectral characteristics (SL) from audio data (DL) sampled at a second frequency (fL) lower than the first sampling frequency (fH), extending the bandwidth of the spectral characteristics (SL) by retrieving bandwidth extending information (IBE) from a codebook (6), and processing the bandwidth extended spectral characteristics (SLE) to give the required training data (DT). Moreover, a method and a system (5) for generating a codebook (6) for extending the bandwidth of spectral characteristics (SL) for audio data (DL) sampled at a second sampling frequency (fL) to spectral characteristics (SH) for a first sampling frequency (fH) higher than the second sampling frequency (fL) are described.Type: GrantFiled: February 10, 2005Date of Patent: May 7, 2013Assignee: Nuance Communications, Inc.Inventors: Alexander Fischer, Rolf Dieter Bippus
-
Patent number: 8423354Abstract: A device extracts prosodic information including a power value from speech data and an utterance section including a period with a power value equal to or larger than a threshold, from the speech data, divides the utterance section into sections in which the power value is equal to or larger than another threshold, acquires phoneme sequence data for each divided speech data by phoneme recognition, generates clusters, each of which is a set of the classified phoneme sequence data, by clustering, calculates an evaluation value for each cluster, selects clusters for which the evaluation value is equal to or larger than a given value as candidate clusters, determines one of the phoneme sequence data from the phoneme sequence data constituting the cluster for each candidate cluster to be a representative phoneme sequence, and selects the divided speech data corresponding to the representative phoneme sequence as listening target speech data.Type: GrantFiled: November 5, 2010Date of Patent: April 16, 2013Assignee: Fujitsu LimitedInventor: Sachiko Onodera
-
Patent number: 8407052Abstract: Methods and systems for correcting transcribed text. One method includes receiving audio data from one or more audio data sources and transcribing the audio data based on a voice model to generate text data. The method also includes making the text data available to a plurality of users over at least one computer network and receiving corrected text data over the at least one computer network from the plurality of users. In addition, the method can include modifying the voice model based on the corrected text data.Type: GrantFiled: April 17, 2007Date of Patent: March 26, 2013Assignee: Vovision, LLCInventor: Paul M. Hager
-
Publication number: 20130073286Abstract: Candidate interpretations resulting from application of speech recognition algorithms to spoken input are presented in a consolidated manner that reduces redundancy. A list of candidate interpretations is generated, and each candidate interpretation is subdivided into time-based portions, forming a grid. Those time-based portions that duplicate portions from other candidate interpretations are removed from the grid. A user interface is provided that presents the user with an opportunity to select among the candidate interpretations; the user interface is configured to present these alternatives without duplicate elements.Type: ApplicationFiled: September 20, 2011Publication date: March 21, 2013Applicant: APPLE INC.Inventors: Marcello Bastea-Forte, David A. Winarsky
-
Patent number: 8401851Abstract: A system and method of targeted tuning of a speech recognition system are disclosed. In a particular embodiment, a method includes determining a frequency of occurrence of a particular type of utterance and includes determining whether the frequency of occurrence exceeds a threshold. The method further includes tuning a speech recognition system to improve recognition of the particular type of utterance when the frequency of occurrence of the particular type of utterance exceeds the threshold.Type: GrantFiled: July 15, 2009Date of Patent: March 19, 2013Assignee: AT&T Intellectual Property I, L.P.Inventors: Robert R. Bushey, Benjamin Anthony Knott, John Mills Martin
-
Patent number: 8396710Abstract: A distributed voice user interface system includes a local device which receives speech input issued from a user. Such speech input may specify a command or a request by the user. The local device performs preliminary processing of the speech input and also provides the speech input to a remote system. The local device is able to update its recognition capabilities based on analysis of the speech input by the remote system.Type: GrantFiled: November 23, 2011Date of Patent: March 12, 2013Assignee: Ben Franklin Patent Holding LLCInventors: George M. White, James J. Buteau, Glen E. Shires, Kevin J. Surace, Steven Markman
-
Patent number: 8392188Abstract: The invention concerns a method and corresponding system for building a phonotactic model for domain independent speech recognition. The method may include recognizing phones from a user's input communication using a current phonotactic model, detecting morphemes (acoustic and/or non-acoustic) from the recognized phones, and outputting the detected morphemes for processing. The method also updates the phonotactic model with the detected morphemes and stores the new model in a database for use by the system during the next user interaction. The method may also include making task-type classification decisions based on the detected morphemes from the user's input communication.Type: GrantFiled: September 21, 2001Date of Patent: March 5, 2013Assignee: AT&T Intellectual Property II, L.P.Inventor: Giuseppe Riccardi
-
Patent number: 8386251Abstract: A speech recognition system is provided with iteratively refined multiple passes through the received data to enhance the accuracy of the results by introducing constraints and adaptation from initial passes into subsequent recognition operations. The multiple passes are performed on an initial utterance received from a user. The iteratively enhanced subsequent passes are also performed on following utterances received from the user increasing an overall system efficiency and accuracy.Type: GrantFiled: June 8, 2009Date of Patent: February 26, 2013Assignee: Microsoft CorporationInventors: Nikko Strom, Julian Odell, Jon Hamaker
-
Patent number: 8386248Abstract: A method of tuning reusable dialog components within a speech application can include detecting speech recognition events generated from a plurality of recognitions performed for a field of a reusable dialog component. The speech recognition events can be generated over a plurality of interactive voice response sessions. The method also can include automatically computing a suggested value for a tuning parameter corresponding to the field of the reusable dialog component according, at least in part, to the speech recognition events.Type: GrantFiled: September 22, 2006Date of Patent: February 26, 2013Assignee: Nuance Communications, Inc.Inventors: Girish Dhanakshirur, Baiju D. Mandalia, Aimee Silva
-
Patent number: 8386250Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar.Type: GrantFiled: September 30, 2011Date of Patent: February 26, 2013Assignee: Google Inc.Inventors: Matthew I. Lloyd, Willard Van Tuyl Rusch, II
-
Patent number: 8380502Abstract: A system receives a voice search query from a user, derives recognition hypotheses from the voice search query, and determines scores associated with the recognition hypotheses, the scores being based on a comparison of the recognition hypotheses to previously received search queries. The system discards at least one of the recognition hypotheses that is associated with a first score that is less than a threshold value, and constructs a first query using at least one non-discarded recognition hypothesis, where the at least one first non-discarded recognition hypothesis is associated with a second score that at least meets the threshold value. The system forwards the first query to a search system, receives first results associated with the first query, and provides the first results to the user.Type: GrantFiled: October 14, 2011Date of Patent: February 19, 2013Assignee: Google Inc.Inventors: Alexander Mark Franz, Monika H. Henzinger, Sergey Brin, Brian Christopher Milch
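The discard-then-construct step can be sketched as a simple threshold filter. The scores here are taken as given (the abstract derives them from previously received search queries), and joining the surviving hypotheses with `OR` is a hypothetical way to build the query.

```python
from typing import List, Tuple


def build_query(hypotheses: List[Tuple[str, float]], threshold: float) -> str:
    """Drop hypotheses scoring below the threshold and combine the rest."""
    kept = [text for text, score in hypotheses if score >= threshold]
    # How the kept hypotheses are combined is an assumption of this sketch.
    return " OR ".join(kept)


query = build_query(
    [("weather boston", 0.9), ("whether boston", 0.2), ("weather austin", 0.5)],
    threshold=0.5,
)
```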
-
Patent number: 8380484Abstract: A method (50) of dynamically changing a sentence structure of a message can include the step of receiving (51) a user request for information, retrieving (52) data based on the information requested, and altering (53) among an intonation and/or the language conveying the information based on the context of the information to be presented. The intonation can optionally be altered by altering (54) a volume, a speed, and/or a pitch based on the information to be presented. The language can be altered by selecting (55) among a finite set of synonyms based on the information to be presented to the user or by selecting (56) among key verbs, adjectives or adverbs that vary along a continuum.Type: GrantFiled: August 10, 2004Date of Patent: February 19, 2013Assignee: International Business Machines CorporationInventors: Brent L. Davis, Stephen W. Hanley, Vanessa V. Michelini, Melanie D. Polkosky
-
Patent number: 8380503Abstract: Challenge items for an audible based electronic challenge system are generated using a variety of techniques to identify optimal candidates. The challenge items are intended for use in a computing system that discriminates between humans and text to speech (TTS) systems.Type: GrantFiled: June 15, 2009Date of Patent: February 19, 2013Assignee: John Nicholas and Kristin Gross TrustInventor: John Nicholas Gross
-
Patent number: 8374867Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.Type: GrantFiled: November 13, 2009Date of Patent: February 12, 2013Assignee: AT&T Intellectual Property I, L.P.Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
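The model-selection order described above (supervised, then unsupervised, then generic) amounts to a fallback chain. In this minimal sketch the model stores are plain dictionaries keyed by a hypothetical user ID, and models are represented by strings.

```python
from typing import Dict


def select_model(user_id: str,
                 supervised: Dict[str, str],
                 unsupervised: Dict[str, str],
                 generic: str) -> str:
    """Return the most specific speech model available for the user."""
    if user_id in supervised:
        return supervised[user_id]
    if user_id in unsupervised:
        return unsupervised[user_id]
    return generic


supervised = {"alice": "alice-supervised"}
unsupervised = {"alice": "alice-unsupervised", "bob": "bob-unsupervised"}
model_for_bob = select_model("bob", supervised, unsupervised, "generic")
```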
-
Patent number: 8374868Abstract: A method for recognizing speech involves reciting, into a speech recognition system, an utterance including a numeric sequence that contains a digit string including a plurality of tokens and detecting a co-articulation problem related to at least two potentially co-articulated tokens in the digit string. The numeric sequence may be identified using i) a dynamically generated possible numeric sequence that potentially corresponds with the numeric sequence, and/or ii) at least one supplemental acoustic model. Also disclosed herein is a system for accomplishing the same.Type: GrantFiled: August 21, 2009Date of Patent: February 12, 2013Assignee: General Motors LLCInventors: Uma Arun, Sherri J Voran-Nowak, Rathinavelu Chengalvarayan, Gaurav Talwar
-
Patent number: 8374864Abstract: In one embodiment, a method includes receiving at a communication device an audio communication and a transcribed text created from the audio communication, and generating a mapping of the transcribed text to the audio communication independent of transcribing the audio. The mapping identifies locations of portions of the text in the audio communication. An apparatus for mapping the text to the audio is also disclosed.Type: GrantFiled: March 17, 2010Date of Patent: February 12, 2013Assignee: Cisco Technology, Inc.Inventor: Jim Kerr
-
Patent number: 8370139Abstract: A noise-environment storing unit stores therein a compensation vector for compensating a feature vector of a speech. A feature-vector extracting unit extracts the feature vector of the speech in each of a plurality of frames. A noise-environment-series estimating unit estimates a noise-environment series based on a feature-vector series and a degree of similarity. A calculating unit obtains a compensation vector corresponding to each noise environment in estimated noise-environment series based on the compensation vector present in the noise-environment storing unit. A compensating unit compensates the extracted feature vector of the speech based on obtained compensation vector.Type: GrantFiled: March 19, 2007Date of Patent: February 5, 2013Assignee: Kabushiki Kaisha ToshibaInventors: Masami Akamine, Takashi Masuko, Daniel Barreda, Remco Teunen
-
Patent number: 8355915Abstract: The disclosure describes an overall system/method for text-input using a multimodal interface with speech recognition. Specifically, pluralities of modes interact with the main speech mode to provide the speech-recognition system with partial knowledge of the text corresponding to the spoken utterance forming the input to the speech recognition system. The knowledge from other modes is used to dynamically change the ASR system's active vocabulary thereby significantly increasing recognition accuracy and significantly reducing processing requirements. Additionally, the speech recognition system is configured using three different system configurations (always listening, partially listening, and push-to-speak) and for each one of those three different user-interfaces are proposed (speak-and-type, type-and-speak, and speak-while-typing).Type: GrantFiled: November 30, 2007Date of Patent: January 15, 2013Inventor: Ashwin P. Rao
-
Patent number: 8352263Abstract: The invention can recognize all languages and input words. It needs m unknown voices to represent m categories of known words with similar pronunciations. Words can be pronounced in any language, dialect or accent. Each will be classified into one of m categories represented by its most similar unknown voice. When a user pronounces a word, the invention finds its F most similar unknown voices. All words in F categories represented by F unknown voices will be arranged according to their pronunciation similarity and alphabetic letters. The pronounced word should be among the top words. Since we only find the F most similar unknown voices from m (=500) unknown voices and since the same word can be classified into several categories, our recognition method is stable for all users and can quickly and accurately recognize all languages (English, Chinese, etc.) and input many more words without using samples.Type: GrantFiled: September 29, 2009Date of Patent: January 8, 2013Inventors: Tze-Fen Li, Tai-Jan Lee Li, Shih-Tzung Li, Shih-Hon Li, Li-Chuan Liao
-
Publication number: 20130006632Abstract: The invention involves the loading and unloading of dynamic section grammars and language models in a speech recognition system. The values of the sections of the structured document are either determined in advance from a collection of documents of the same domain, document type, and speaker; or collected incrementally from documents of the same domain, document type, and speaker; or added incrementally to an already existing set of values. Speech recognition in the context of the given field is constrained to the contents of these dynamic values. If speech recognition fails or produces a poor match within this grammar or section language model, speech recognition against a larger, more general vocabulary that is not constrained to the given section is performed.Type: ApplicationFiled: September 12, 2012Publication date: January 3, 2013Inventors: Alwin B. Carus, Larissa Lapshina, Raghu Vemula
-
Patent number: 8346551Abstract: A method for adapting a codebook for speech recognition, wherein the codebook is from a set of codebooks comprising a speaker-independent codebook and at least one speaker dependent codebook. A speech input is received and a feature vector based on the received speech input is determined. For each of the Gaussian densities, a first mean vector is estimated using an expectation process and taking into account the determined feature vector. For each of the Gaussian densities, a second mean vector using an Eigenvoice adaptation is determined taking into account the determined feature vector. For each of the Gaussian densities, the mean vector is set to a convex combination of the first and the second mean vector. Thus, this process allows for adaptation during operation and does not require a lengthy training phase.Type: GrantFiled: November 20, 2009Date of Patent: January 1, 2013Assignee: Nuance Communications, Inc.Inventors: Tobias Herbig, Franz Gerl
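The final step, setting each Gaussian's mean vector to a convex combination of the two estimates, can be sketched as follows. The weight name `alpha` and the use of plain Python lists are assumptions; how the weight is chosen is not stated in the abstract.

```python
from typing import List


def combine_means(expectation_mean: List[float],
                  eigenvoice_mean: List[float],
                  alpha: float) -> List[float]:
    """Return alpha * expectation_mean + (1 - alpha) * eigenvoice_mean."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1] for a convex combination")
    return [alpha * a + (1.0 - alpha) * b
            for a, b in zip(expectation_mean, eigenvoice_mean)]


new_mean = combine_means([2.0, 4.0], [0.0, 0.0], alpha=0.25)
```

With `alpha` near 0 the result follows the Eigenvoice estimate, and with `alpha` near 1 it follows the expectation-based estimate.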
-
Publication number: 20120284025Abstract: Disclosed herein are methods and systems for recognizing speech. A method embodiment comprises comparing received speech with a precompiled grammar based on a database and if the received speech matches data in the precompiled grammar then returning a result based on the matched data. If the received speech does not match data in the precompiled grammar, then dynamically compiling a new grammar based only on new data added to the database after the compiling of the precompiled grammar. The database may comprise a directory of names.Type: ApplicationFiled: July 18, 2012Publication date: November 8, 2012Applicant: AT&T Intellectual Property II, L.P.Inventors: Harry Blanchard, Steven LEWIS, Shankarnarayan SIVAPRASAD, Lan ZHANG
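The control flow in this abstract can be sketched as follows. This is a minimal sketch, not the claimed method: speech matching is reduced to exact string lookup, and the function name `recognize`, the parameter `compiled_upto`, and the sample names are all hypothetical.

```python
def recognize(utterance, precompiled_grammar, database, compiled_upto):
    """Look up `utterance`, falling back to a grammar built only from new rows.

    precompiled_grammar: set of names compiled earlier (assumed shape)
    database:            list of names, oldest first
    compiled_upto:       number of database rows the precompiled grammar covers
    """
    if utterance in precompiled_grammar:
        return utterance
    # Compile a small grammar from only the rows added since the last compile.
    dynamic_grammar = set(database[compiled_upto:])
    return utterance if utterance in dynamic_grammar else None


database = ["Alice Jones", "Bob Smith"]   # made-up directory of names
precompiled = set(database)               # compiled when the database had 2 rows
database.append("Carol Wu")               # added after compilation
```

Here `recognize("Carol Wu", precompiled, database, 2)` misses the precompiled grammar and is resolved from the one-entry dynamic grammar, which is the saving the abstract points at: only the new rows are compiled on demand.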
-
Patent number: 8306819Abstract: Techniques for enhanced automatic speech recognition are described. An enhanced ASR system may be operative to generate an error correction function. The error correction function may represent a mapping between a supervised set of parameters and an unsupervised training set of parameters generated using a same set of acoustic training data, and apply the error correction function to an unsupervised testing set of parameters to form a corrected set of parameters used to perform speaker adaptation. Other embodiments are described and claimed.Type: GrantFiled: March 9, 2009Date of Patent: November 6, 2012Assignee: Microsoft CorporationInventors: Chaojun Liu, Yifan Gong
-
Patent number: 8296145Abstract: A voice dialing method includes the steps of receiving an utterance from a user, decoding the utterance to identify a recognition result for the utterance, and communicating to the user the recognition result. If an indication is received from the user that the communicated recognition result is incorrect, then it is added to a rejection reference. Then, when the user repeats the misunderstood utterance, the rejection reference can be used to eliminate the incorrect recognition result as a potential subsequent recognition result. The method can be used for single or multiple digits or digit strings.Type: GrantFiled: November 7, 2011Date of Patent: October 23, 2012Assignee: General Motors LLCInventors: Jason W. Clark, Rathinavelu Chengalvarayan, Timothy J. Grost, Dana Fecher, Jeremy Spaulding
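One way to picture the rejection reference in this abstract is as a set of results the user has already marked wrong. The sketch below assumes the decoder returns a ranked list of hypotheses; that ranked list, the function name `pick_result`, and the sample digit strings are assumptions, not details taken from the abstract.

```python
def pick_result(ranked_hypotheses, rejection_reference):
    """Return the best-ranked hypothesis not already rejected, or None."""
    for hypothesis in ranked_hypotheses:
        if hypothesis not in rejection_reference:
            return hypothesis
    return None


rejection_reference = set()
hypotheses = ["555-1234", "555-1284"]     # made-up decoder output, best first

first = pick_result(hypotheses, rejection_reference)
rejection_reference.add(first)            # user indicates the result is wrong
second = pick_result(hypotheses, rejection_reference)  # after the user repeats
```

After the first result is rejected, the repeated utterance can no longer yield the same answer, so the next candidate is offered instead.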
-
Publication number: 20120245940Abstract: A method for speech recognition is implemented in the specific form of computer processes that function in a computer processor. That is, one or more computer processes: process a speech input to produce a sequence of representative speech vectors and perform multiple recognition passes to determine a recognition output corresponding to the speech input. At least one generic recognition pass is based on a generic speech recognition arrangement using generic modeling of a broad general class of input speech. And at least one adapted recognition pass is based on a speech adapted arrangement using pre-adapted modeling of a specific sub-class of the general class of input speech.Type: ApplicationFiled: December 8, 2009Publication date: September 27, 2012Applicant: NUANCE COMMUNICATIONS, INC.Inventors: Daniel Willett, Lambert Mathias, Chuang He, Jianxiong Wu
-
Patent number: 8275615Abstract: A translation method and system include a recognition engine having a plurality of models each being employed to decode a same utterance to provide an output. A model combiner is configured to assign probabilities to each model output and configured to assign weights to the outputs of the plurality of models based on the probabilities to provide a best performing model for the context of the utterance.Type: GrantFiled: July 13, 2007Date of Patent: September 25, 2012Assignee: International Business Machines CorporationInventors: Suleyman S. Kozat, Ruhi Sarikaya
-
Patent number: 8271578Abstract: A method of transferring data objects over a network comprises intercepting a network transfer message with a passing object, creating a unique identifier for the object using a predetermined function, the same function having been used to provide identifiers for objects stored at predetermined nodes of said network, removing the object and sending on the network transfer message with the unique identifier in place of the object. Then, at the recipient end it is possible to obtain the unique identifier and use it as a key to search for a corresponding object in the local nodes. The search starts with a node closest to the recipient and steadily spreads outwards. The object when found is reattached for the benefit of the recipient and network bandwidth has been saved by the avoidance of redundant transfer since the object is brought to the recipient from the node which is the closest to him.Type: GrantFiled: December 8, 2005Date of Patent: September 18, 2012Assignee: B-Obvious Ltd.Inventors: Guy Sheffi, Ovadi Somech
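The identifier-for-object substitution in this abstract resembles content addressing, and can be sketched as below. SHA-256 stands in for the "predetermined function", and the message layout, function names, and per-node dictionaries are all assumptions made for illustration.

```python
import hashlib


def object_id(data: bytes) -> str:
    """Derive an identifier from the object's bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def strip_object(message: dict) -> dict:
    """Replace the object carried in `message` with its identifier."""
    stripped = {k: v for k, v in message.items() if k != "object"}
    stripped["object_id"] = object_id(message["object"])
    return stripped


def reattach_object(message: dict, nodes_nearest_first: list):
    """Search node caches outward from the recipient and reattach the object.

    Each node cache is a dict mapping identifier -> bytes. Returns None when
    no node holds the object; the abstract does not describe that case.
    """
    for cache in nodes_nearest_first:
        data = cache.get(message["object_id"])
        if data is not None:
            restored = {k: v for k, v in message.items() if k != "object_id"}
            restored["object"] = data
            return restored
    return None
```

Because the search list is ordered nearest first, the first hit is also the closest copy, which matches the abstract's point that the object is brought from the node closest to the recipient.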
-
Patent number: 8270588Abstract: A method and system for automatic incoming call management uses function test results to build call signatures that are stored for later use in incoming call analysis. The function test results are used to compute a suspect score and confidence level associated with each incoming call, and are also used for making incoming call management decisions. A call treatment is selected based on the function test results and/or the computed suspect score and confidence level.Type: GrantFiled: January 5, 2007Date of Patent: September 18, 2012Inventor: Ronald Schwartz
-
Patent number: 8271280Abstract: A voice recognition apparatus can reduce false recognition caused by matching with respect to the phrases composed of a small number of syllables, when it performs a recognition process, by a pronunciation unit, for voice data based on voice produced by a speaker such as a syllable and further performs recognition by a method such as the Word Spotting for matching with respect to the phrases stored in the phrase database. The voice recognition apparatus performs a recognition process for comparing a result of the recognition process by a pronunciation unit with the extended phrases obtained by adding the additional phrase before and/or behind the respective phrases.Type: GrantFiled: October 1, 2008Date of Patent: September 18, 2012Assignee: Fujitsu LimitedInventor: Kenji Abe
-
Patent number: 8249868Abstract: An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.Type: GrantFiled: September 30, 2011Date of Patent: August 21, 2012Assignee: Google Inc.Inventors: Matthew I. Lloyd, Trausti Kristjansson
-
Patent number: 8249870Abstract: A semi-automatic speech transcription system of the invention leverages the complementary capabilities of human and machine, building a system which combines automatic and manual approaches. With the invention, collected audio data is automatically distilled into speech segments, using signal processing and pattern recognition algorithms. The detected speech segments are presented to a human transcriber using a transcription tool with a streamlined transcription interface, requiring the transcriber to simply “listen and type”. This eliminates the need to manually navigate the audio, coupling the human effort to the amount of speech, rather than the amount of audio. Errors produced by the automatic system can be quickly identified by the human transcriber, which are used to improve the automatic system performance. The automatic system is tuned to maximize the human transcriber efficiency.Type: GrantFiled: November 12, 2008Date of Patent: August 21, 2012Assignee: Massachusetts Institute of TechnologyInventors: Brandon Cain Roy, Deb Kumar Roy
-
Patent number: 8239199Abstract: A method includes identifying a first syllable in a first audio of a first word and a second syllable in a second audio of a second word, the first syllable having a first set of properties and the second syllable having a second set of properties; detecting the first syllable in a first instance of the first word in an audio file, the first syllable in the first instance having a third set of properties; determining one or more transformations for transforming the first set of properties to the third set of properties; applying the one or more transformations to the second set of properties of the second syllable to yield a transformed second syllable; and replacing the first syllable in the first instance of the first word with the transformed second syllable in the audio file.Type: GrantFiled: October 16, 2009Date of Patent: August 7, 2012Assignee: Yahoo! Inc.Inventor: Narayan Lakshmi Bhamidipati
-
Patent number: 8234111Abstract: An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.Type: GrantFiled: June 14, 2010Date of Patent: July 31, 2012Assignee: Google Inc.Inventors: Matthew I. Lloyd, Trausti Kristjansson
-
Patent number: 8234112Abstract: Provided are an apparatus and method for generating a noise adaptive acoustic model including a noise adaptive discriminative adaptation method. The method includes: generating a baseline model parameter from large-capacity speech training data including various noise environments; and receiving the generated baseline model parameter and applying a discriminative adaptation method to the generated results to generate a migrated acoustic model parameter suitable for an actually applied environment.Type: GrantFiled: April 25, 2008Date of Patent: July 31, 2012Assignee: Electronics and Telecommunications Research InstituteInventors: Byung Ok Kang, Ho Young Jung, Yun Keun Lee
-
Patent number: RE43866Abstract: Method for improving a TFCI transportation performance, including the steps of (1) coding TFCI information bits to be transported through each radio frame, (2) repeating a TFCI code word produced by the coding an arbitrary number of times, (3) applying puncturing patterns different from each other to the repeated code words, one pattern per repetition, and puncturing the repeated code words at locations different from each other, and (4) dividing, inserting, and transporting the punctured fixed length repeated code words in each slot of the radio frame, thereby improving TFCI information transportation performance, and embodying the receiver side decoder to be identical to a case when a 32 bit code word is transported perfectly.Type: GrantFiled: November 13, 2009Date of Patent: December 18, 2012Assignee: LG Electronics Inc.Inventors: Sung Kwon Hong, Sung Lark Kwon, Young Woo Yun, Ki Jun Kim