Update Patterns Patents (Class 704/244)
  • Patent number: 8489399
    Abstract: An audible based electronic challenge system is used to control access to a computing resource by using a test to identify an origin of a voice. The test is based on analyzing a spoken utterance to determine if it was articulated by an unauthorized human or a text to speech (TTS) system.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: July 16, 2013
    Assignee: John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Patent number: 8484024
    Abstract: Techniques are disclosed for using phonetic features for speech recognition. For example, a method comprises the steps of obtaining a first dictionary and a training data set associated with a speech recognition system, computing one or more support parameters from the training data set, transforming the first dictionary into a second dictionary, wherein the second dictionary is a function of one or more phonetic labels of the first dictionary, and using the one or more support parameters to select one or more samples from the second dictionary to create a set of one or more exemplar-based class identification features for a pattern recognition task.
    Type: Grant
    Filed: February 24, 2011
    Date of Patent: July 9, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Dimitri Kanevsky, David Nahamoo, Bhuvana Ramabhadran, Tara N. Sainath
  • Patent number: 8478593
    Abstract: Disclosed herein are methods and systems for recognizing speech. A method embodiment comprises comparing received speech with a precompiled grammar based on a database and if the received speech matches data in the precompiled grammar then returning a result based on the matched data. If the received speech does not match data in the precompiled grammar, then dynamically compiling a new grammar based only on new data added to the database after the compiling of the precompiled grammar. The database may comprise a directory of names.
    Type: Grant
    Filed: July 18, 2012
    Date of Patent: July 2, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Harry Blanchard, Steven Lewis, Shankarnarayan Sivaprasad, Lan Zhang
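    The fallback flow described in the abstract above can be sketched in a few lines of Python. This is a minimal illustration only: the "grammars" here are plain sets of directory names and the helper names are hypothetical, whereas the patent concerns compiled speech grammars.

      # Sketch of the precompiled-grammar fallback in patent 8478593.
      # The "grammars" are plain string sets, not real compiled speech grammars.

      def recognize_name(heard, precompiled, database, compiled_at_index):
          """Match against the precompiled grammar; if that fails, match against a
          grammar dynamically compiled from only the entries added after compilation."""
          if heard in precompiled:
              return heard
          new_entries = set(database[compiled_at_index:])   # data added after compile
          return heard if heard in new_entries else None

      directory = ["alice", "bob", "carol", "dave"]         # "dave" added after compile
      precompiled = set(directory[:3])
      print(recognize_name("dave", precompiled, directory, compiled_at_index=3))  # dave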
  • Patent number: 8478589
    Abstract: A machine-readable medium may include a group of reusable components for building a spoken dialog system. The reusable components may include a group of previously collected audible utterances. A machine-implemented method to build a library of reusable components for use in building a natural language spoken dialog system may include storing a dataset in a database. The dataset may include a group of reusable components for building a spoken dialog system. The reusable components may further include a group of previously collected audible utterances. A second method may include storing at least one set of data. Each one of the at least one set of data may include ones of the reusable components associated with audible data collected during a different collection phase.
    Type: Grant
    Filed: January 5, 2005
    Date of Patent: July 2, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Lee Begeja, Giuseppe Di Fabbrizio, David Crawford Gibbon, Dilek Z. Hakkani-Tur, Zhu Liu, Bernard S. Renger, Behzad Shahraray, Gokhan Tur
  • Patent number: 8473293
    Abstract: This specification describes technologies relating to system, methods, and articles for updating a speech recognition dictionary based on, at least in part, both search query and market data metrics. In general, one innovative aspect of the subject matter described in this specification can be embodied in a method comprising (i) identifying a candidate term for possible inclusion in a speech recognition dictionary, (ii) identifying at least one search query metric associated with the identified candidate term, (iii) identifying at least one market data metric associated with the identified candidate term, and (iv) generating a candidate term score for the identified candidate term based, at least in part, on a weighted combination of the at least one identified search query metric and the at least one identified market data metric.
    Type: Grant
    Filed: October 23, 2012
    Date of Patent: June 25, 2013
    Assignee: Google Inc.
    Inventors: Pedro J. Mengibar, Jeffrey S. Sorensen
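    The scoring step in the abstract above is a weighted combination of metrics; a minimal sketch follows, with illustrative weights and metric values that are not taken from the patent.

      # Sketch of the candidate-term scoring in patent 8473293: a weighted
      # combination of a search-query metric and a market-data metric.

      def candidate_term_score(query_metric, market_metric, w_query=0.7, w_market=0.3):
          return w_query * query_metric + w_market * market_metric

      # A term trending in both search queries and market data scores highly and
      # becomes a stronger candidate for the speech recognition dictionary.
      print(candidate_term_score(query_metric=0.9, market_metric=0.4))  # 0.75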
  • Patent number: 8473300
    Abstract: Methods and systems for log mining for grammar-based text processing are provided. A method may comprise receiving, from a device, an activity log. The activity log may comprise one or more of an input instruction, a determined function based at least in part on a match of the input instruction to a grammar-based textual pattern including associations of a given function based on one or more grammars, and a response determination based on an acknowledgement of the determined function. The method may also comprise comparing at least a portion of the activity log with stored activity logs in order to determine a correlation between the activity log and the stored activity logs. The method may also comprise modifying the grammar-based textual pattern based on the determined correlation and providing information indicative of the modification to the device so as to update the grammar-based textual pattern.
    Type: Grant
    Filed: October 8, 2012
    Date of Patent: June 25, 2013
    Assignee: Google Inc.
    Inventors: Pedro J. Moreno Mengibar, Martin Jansche, Fadi Biadsy
  • Patent number: 8463608
    Abstract: A method and apparatus for updating a speech model on a multi-user speech recognition system with a personal speech model for a single user. A speech recognition system, for instance in a car, can include a generic speech model for comparison with the user speech input. A way of identifying a personal speech model, for instance in a mobile phone, is connected to the system. A mechanism is included for receiving personal speech model components, for instance a BLUETOOTH connection. The generic speech model is updated using the received personal speech model components. Speech recognition can then be performed on user speech using the updated generic speech model.
    Type: Grant
    Filed: March 12, 2012
    Date of Patent: June 11, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Barry Neil Dow, Eric William Janke, Daniel Lee Yuk Cheung, Benjamin Terrick Staniford
  • Patent number: 8457968
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for tracking multiple dialog states. A system practicing the method receives an N-best list of speech recognition candidates, a list of current partitions, and a belief for each of the current partitions. A partition is a group of dialog states. In an outer loop, the system iterates over the N-best list of speech recognition candidates. In an inner loop, the system performs a split, update, and recombination process to generate a fixed number of partitions after each speech recognition candidate in the N-best list. The system recognizes speech based on the N-best list and the fixed number of partitions. The split process can perform all possible splits on all partitions. The update process can compute an estimated new belief. The estimated new belief can be a product of ASR reliability, user likelihood to produce this action, and an original belief.
    Type: Grant
    Filed: December 8, 2009
    Date of Patent: June 4, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Jason Williams
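    The estimated new belief in the abstract above is a product of three factors. The sketch below shows only that product; the split and recombination steps, the outer loop over the N-best list, and the actual probability models are omitted.

      # Sketch of the belief update in patent 8457968: for a partition (a group of
      # dialog states) and one N-best candidate, the estimated new belief is
      # ASR reliability * likelihood the user produced this action * original belief.

      def estimated_new_belief(asr_reliability, user_action_likelihood, original_belief):
          return asr_reliability * user_action_likelihood * original_belief

      print(estimated_new_belief(asr_reliability=0.8,
                                 user_action_likelihood=0.5,
                                 original_belief=0.6))   # 0.24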
  • Patent number: 8457962
    Abstract: This invention provides remote audio surveillance by recording audio data via three microphones and storage on a removable digital mass storage device, operating on battery power. The housing is of a weather resistant design to withstand outdoor conditions. Recording can be done in person or recording times can be defined so that the unit will only ‘listen’ during the desired times of the day, on a day to day basis. The user does not have to be in the vicinity but simply programs the record time(s) and leaves the device in the woods. The device also has play back capabilities for any recorded audio data and can interface with personal computers via the removable digital mass storage device. In addition to the audio collection and playback capabilities, PC software will be provided with the device which will analyze the data and provide direction of sound (based upon relative amplitude of the 3 microphones) and distance of sound (based on absolute and relative recorded amplitudes).
    Type: Grant
    Filed: August 4, 2006
    Date of Patent: June 4, 2013
    Inventor: Lawrence P. Jones
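    The abstract above only states that direction is derived from the relative amplitudes of the three microphones. The sketch below is one plausible reading, assuming known microphone bearings and an amplitude-weighted average; the actual analysis in the PC software is not disclosed at this level of detail.

      # Rough sketch of amplitude-based direction estimation (patent 8457962):
      # weight each microphone's bearing by its recorded amplitude and take the
      # direction of the resulting vector. Geometry and weighting are assumptions.
      import math

      mic_bearings = [0.0, 2 * math.pi / 3, 4 * math.pi / 3]   # three mics, 120 deg apart
      amplitudes = [0.9, 0.4, 0.2]                              # relative recorded amplitudes

      x = sum(a * math.cos(b) for a, b in zip(amplitudes, mic_bearings))
      y = sum(a * math.sin(b) for a, b in zip(amplitudes, mic_bearings))
      print(f"estimated sound bearing: {math.degrees(math.atan2(y, x)) % 360:.1f} degrees")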
  • Patent number: 8457965
    Abstract: A method is described for correcting and improving the functioning of certain devices for the diagnosis and treatment of speech that dynamically measure the functioning of the velum in the control of nasality during speech. The correction method uses an estimate of the vowel frequency spectrum to greatly reduce the variation of nasalance with the vowel being spoken, so as to result in a corrected value of nasalance that reflects with greater accuracy the degree of velar opening. Correction is also described for reducing the effect on nasalance values of energy from the oral and nasal channels crossing over into the other channel because of imperfect acoustic separation.
    Type: Grant
    Filed: October 6, 2009
    Date of Patent: June 4, 2013
    Assignee: Rothenberg Enterprises
    Inventor: Martin Rothenberg
  • Publication number: 20130132084
    Abstract: A system and method for performing dual mode speech recognition, employing a local recognition module on a mobile device and a remote recognition engine on a server device. The system accepts a spoken query from a user, and both the local recognition module and the remote recognition engine perform speech recognition operations on the query, returning a transcription and confidence score, subject to a latency cutoff time. If both sources successfully transcribe the query, then the system accepts the result having the higher confidence score. If only one source succeeds, then that result is accepted. In either case, if the remote recognition engine does succeed in transcribing the query, then a client vocabulary is updated if the remote system result includes information not present in the client vocabulary.
    Type: Application
    Filed: June 21, 2012
    Publication date: May 23, 2013
    Applicant: SOUNDHOUND, INC.
    Inventors: Timothy Stonehocker, Keyvan Mohajer, Bernard Mont-Reynaud
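    A minimal sketch of the arbitration rule in the abstract above, assuming each recognizer returns a (transcription, confidence) pair or None once the latency cutoff expires; the result objects and the vocabulary-update hook are simplified.

      # Sketch of dual-mode arbitration (publication 20130132084): both recognizers
      # transcribe the query; if both succeed take the higher confidence, otherwise
      # take whichever succeeded; a successful remote result updates the client vocabulary.

      def arbitrate(local_result, remote_result, client_vocab):
          if remote_result is not None:
              client_vocab.update(remote_result[0].split())   # add unseen words
          if local_result and remote_result:
              return max(local_result, remote_result, key=lambda r: r[1])
          return local_result or remote_result

      vocab = {"play", "music"}
      print(arbitrate(("play music", 0.62), ("play some music", 0.87), vocab))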
  • Patent number: 8447604
    Abstract: Provided in some embodiments is a method including receiving ordered script words that are indicative of dialogue words to be spoken, receiving audio data corresponding to at least a portion of the dialogue words to be spoken and including timecodes associated with dialogue words, generating a matrix of the ordered script words versus the dialogue words, aligning the matrix to determine hard alignment points that include matching consecutive sequences of ordered script words with corresponding sequences of dialogue words, partitioning the matrix of ordered script words into sub-matrices bounded by adjacent hard-alignment points and including corresponding sub-sets of the script and dialogue words between the hard-alignment points, and aligning each of the sub-matrices.
    Type: Grant
    Filed: May 28, 2010
    Date of Patent: May 21, 2013
    Assignee: Adobe Systems Incorporated
    Inventor: Walter W. Chang
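    A rough sketch of the anchor-finding and partitioning steps in the abstract above: hard alignment points are taken to be exact matches of n consecutive script words against the dialogue words, and the remaining alignment work is confined to the spans between adjacent anchors. The value of n and the exact-match rule are simplifications of the patent's matrix alignment.

      # Sketch of anchor-based alignment partitioning (patent 8447604).

      def hard_alignment_points(script, dialogue, n=3):
          """(script index, dialogue index) pairs where n consecutive words match exactly."""
          positions = {tuple(dialogue[j:j + n]): j for j in range(len(dialogue) - n + 1)}
          return [(i, positions[tuple(script[i:i + n])])
                  for i in range(len(script) - n + 1)
                  if tuple(script[i:i + n]) in positions]

      def sub_spans(points, script_len, dialogue_len):
          """Sub-matrices (script span, dialogue span) bounded by adjacent anchors."""
          bounds = [(0, 0)] + points + [(script_len, dialogue_len)]
          return [((a[0], b[0]), (a[1], b[1])) for a, b in zip(bounds, bounds[1:])]

      script = "to be or not to be that is the question".split()
      dialogue = "to be or not to be um that is the question".split()
      points = hard_alignment_points(script, dialogue)
      print(sub_spans(points, len(script), len(dialogue)))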
  • Patent number: 8442826
    Abstract: Architecture for integrating application-dependent information into a constraints component at deployment time or when available. In terms of a general grammar, the constraints component can include or be a general grammar that comprises application-independent information and is structured in such a way that application-dependent information can be integrated into the general grammar without loss of fidelity. The general grammar includes a probability space and reserves a section of the probability space for the integration of application-dependent information. An integration component integrates the application-dependent information into the reserved section of the probability space for recognition processing. The application-dependent information is integrated into the reserved section of the probability space at deployment time or when available. The general grammar is structured to support the integration and improve the overall system.
    Type: Grant
    Filed: June 10, 2009
    Date of Patent: May 14, 2013
    Assignee: Microsoft Corporation
    Inventors: Jonathan E. Hamaker, Julian James Odell, Michael D. Plumpe, Sandeep Manocha, Keith C. Herold
  • Publication number: 20130117023
    Abstract: A system and method of updating automatic speech recognition parameters on a mobile device are disclosed. The method comprises storing user account-specific adaptation data associated with ASR on a computing device associated with a wireless network, generating new ASR adaptation parameters based on transmitted information from the mobile device when a communication channel between the computing device and the mobile device becomes available and transmitting the new ASR adaptation data to the mobile device when a communication channel between the computing device and the mobile device becomes available. The new ASR adaptation data on the mobile device more accurately recognizes user utterances.
    Type: Application
    Filed: October 22, 2012
    Publication date: May 9, 2013
    Applicant: AT&T INTELLECTUAL PROPERTY II, L.P.
    Inventor: AT&T INTELLECTUAL PROPERTY II, L.P.
  • Patent number: 8438027
    Abstract: An object of the invention is to conveniently increase the standard patterns registered in a voice recognition device so as to efficiently extend the number of words that can be voice-recognized. New standard patterns are generated by modifying a part of an existing standard pattern. A pattern matching unit 16 of a modifying-part specifying unit 14 performs a pattern matching process to specify a part to be modified in the existing standard pattern of a usage source. A standard pattern generating unit 18 generates the new standard patterns by cutting or deleting voice data of the modifying part of the usage-source standard pattern, substituting the voice data of the modifying part of the usage-source standard pattern with other voice data, or combining the voice data of the modifying part of the usage-source standard pattern with other voice data. A standard pattern database update unit 20 adds the new standard patterns to a standard pattern database 24.
    Type: Grant
    Filed: May 25, 2006
    Date of Patent: May 7, 2013
    Assignee: Panasonic Corporation
    Inventors: Toshiyuki Teranishi, Kouji Hatano
  • Patent number: 8438026
    Abstract: The invention describes a method and a system for generating training data (DT) for an automatic speech recogniser (2) for operating at a particular first sampling frequency (fH), comprising steps of deriving spectral characteristics (SL) from audio data (DL) sampled at a second frequency (fL) lower than the first sampling frequency (fH), extending the bandwidth of the spectral characteristics (SL) by retrieving bandwidth extending information (OBE) from a codebook (6), and processing the bandwidth extended spectral characteristics (SLE) to give the required training data (DT). Moreover a method and a system (5) for generating a codebook (6) for extending the bandwidth of spectral characteristics (SL) for audio data (DL) sampled at a second sampling frequency (fL) to spectral characteristics (SH) for a first sampling frequency (fH) higher than the second sampling frequency (fL) are described.
    Type: Grant
    Filed: February 10, 2005
    Date of Patent: May 7, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Alexander Fischer, Rolf Dieter Bippus
  • Patent number: 8423354
    Abstract: A device extracts prosodic information, including a power value, from speech data, and extracts an utterance section, namely a period with a power value equal to or larger than a threshold, from the speech data; divides the utterance section into sections in which the power value is equal to or larger than another threshold; acquires phoneme sequence data for each piece of divided speech data by phoneme recognition; generates clusters, each a set of the classified phoneme sequence data, by clustering; calculates an evaluation value for each cluster; selects clusters whose evaluation value is equal to or larger than a given value as candidate clusters; determines, for each candidate cluster, one of the phoneme sequence data constituting the cluster to be a representative phoneme sequence; and selects the divided speech data corresponding to the representative phoneme sequence as the listening-target speech data.
    Type: Grant
    Filed: November 5, 2010
    Date of Patent: April 16, 2013
    Assignee: Fujitsu Limited
    Inventor: Sachiko Onodera
  • Patent number: 8407052
    Abstract: Methods and systems for correcting transcribed text. One method includes receiving audio data from one or more audio data sources and transcribing the audio data based on a voice model to generate text data. The method also includes making the text data available to a plurality of users over at least one computer network and receiving corrected text data over the at least one computer network from the plurality of users. In addition, the method can include modifying the voice model based on the corrected text data.
    Type: Grant
    Filed: April 17, 2007
    Date of Patent: March 26, 2013
    Assignee: Vovision, LLC
    Inventor: Paul M. Hager
  • Publication number: 20130073286
    Abstract: Candidate interpretations resulting from application of speech recognition algorithms to spoken input are presented in a consolidated manner that reduces redundancy. A list of candidate interpretations is generated, and each candidate interpretation is subdivided into time-based portions, forming a grid. Those time-based portions that duplicate portions from other candidate interpretations are removed from the grid. A user interface is provided that presents the user with an opportunity to select among the candidate interpretations; the user interface is configured to present these alternatives without duplicate elements.
    Type: Application
    Filed: September 20, 2011
    Publication date: March 21, 2013
    Applicant: APPLE INC.
    Inventors: Marcello Bastea-Forte, David A. Winarsky
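    A minimal sketch of the consolidation idea in the abstract above, using word positions as a stand-in for time-based portions; the grid and the duplicate-removal rule are simplified.

      # Sketch of consolidating candidate interpretations (publication 20130073286):
      # split each candidate into portions and hide portions that duplicate the top
      # candidate, so the user is shown only the parts that actually differ.

      def consolidate(candidates):
          top = candidates[0]
          grid = [[w if i >= len(top) or w != top[i] else None   # None = duplicate, hidden
                   for i, w in enumerate(cand)]
                  for cand in candidates[1:]]
          return top, grid

      top, alternatives = consolidate([["call", "mom", "now"],
                                       ["call", "tom", "now"],
                                       ["call", "mom", "mao"]])
      print(top)           # ['call', 'mom', 'now']
      print(alternatives)  # [[None, 'tom', None], [None, None, 'mao']]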
  • Patent number: 8401851
    Abstract: A system and method of targeted tuning of a speech recognition system are disclosed. In a particular embodiment, a method includes determining a frequency of occurrence of a particular type of utterance and determining whether the frequency of occurrence exceeds a threshold. The method further includes tuning a speech recognition system to improve recognition of the particular type of utterance when the frequency of occurrence of the particular type of utterance exceeds the threshold.
    Type: Grant
    Filed: July 15, 2009
    Date of Patent: March 19, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Robert R. Bushey, Benjamin Anthony Knott, John Mills Martin
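    The trigger condition in the abstract above reduces to a frequency count against a threshold; a minimal sketch with made-up utterance types:

      # Sketch of the tuning trigger in patent 8401851: utterance types whose
      # frequency of occurrence exceeds a threshold are flagged for targeted tuning.
      from collections import Counter

      def tuning_targets(utterance_types, threshold):
          return [t for t, n in Counter(utterance_types).items() if n > threshold]

      log = ["billing", "agent", "billing", "billing", "outage"]
      print(tuning_targets(log, threshold=2))  # ['billing']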
  • Patent number: 8396710
    Abstract: A distributed voice user interface system includes a local device which receives speech input issued from a user. Such speech input may specify a command or a request by the user. The local device performs preliminary processing of the speech input and also provides the speech input to a remote system. The local device is able to update its recognition capabilities based on the remote system's analysis of the speech input.
    Type: Grant
    Filed: November 23, 2011
    Date of Patent: March 12, 2013
    Assignee: Ben Franklin Patent Holding LLC
    Inventors: George M. White, James J. Buteau, Glen E. Shires, Kevin J. Surace, Steven Markman
  • Patent number: 8392188
    Abstract: The invention concerns a method and corresponding system for building a phonotactic model for domain independent speech recognition. The method may include recognizing phones from a user's input communication using a current phonotactic model, detecting morphemes (acoustic and/or non-acoustic) from the recognized phones, and outputting the detected morphemes for processing. The method also updates the phonotactic model with the detected morphemes and stores the new model in a database for use by the system during the next user interaction. The method may also include making task-type classification decisions based on the detected morphemes from the user's input communication.
    Type: Grant
    Filed: September 21, 2001
    Date of Patent: March 5, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Giuseppe Riccardi
  • Patent number: 8386251
    Abstract: A speech recognition system is provided with iteratively refined multiple passes through the received data to enhance the accuracy of the results by introducing constraints and adaptation from initial passes into subsequent recognition operations. The multiple passes are performed on an initial utterance received from a user. The iteratively enhanced subsequent passes are also performed on following utterances received from the user increasing an overall system efficiency and accuracy.
    Type: Grant
    Filed: June 8, 2009
    Date of Patent: February 26, 2013
    Assignee: Microsoft Corporation
    Inventors: Nikko Strom, Julian Odell, Jon Hamaker
  • Patent number: 8386248
    Abstract: A method of tuning reusable dialog components within a speech application can include detecting speech recognition events generated from a plurality of recognitions performed for a field of a reusable dialog component. The speech recognition events can be generated over a plurality of interactive voice response sessions. The method also can include automatically computing a suggested value for a tuning parameter corresponding to the field of the reusable dialog component according, at least in part, to the speech recognition events.
    Type: Grant
    Filed: September 22, 2006
    Date of Patent: February 26, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Girish Dhanakshirur, Baiju D. Mandalia, Aimee Silva
  • Patent number: 8386250
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: February 26, 2013
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Willard Van Tuyl Rusch, II
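    A minimal sketch of an affinity score built from frequency and recency, as the abstract above describes; the exponential decay, half-life, and normalization into a probability are assumptions rather than the patent's formula.

      # Sketch of contact disambiguation via affinity scores (patent 8386250).
      import math
      import time

      def affinity(frequency, last_contact_ts, now, half_life_days=30.0):
          age_days = (now - last_contact_ts) / 86400.0
          recency = math.exp(-math.log(2) * age_days / half_life_days)   # decays with age
          return frequency * recency

      now = time.time()
      scores = {"Alice": affinity(25, now - 2 * 86400, now),     # frequent and recent
                "Albert": affinity(3, now - 90 * 86400, now)}    # rare and stale
      total = sum(scores.values())
      print({name: round(s / total, 3) for name, s in scores.items()})  # P(intended contact)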
  • Patent number: 8380502
    Abstract: A system receives a voice search query from a user, derives recognition hypotheses from the voice search query, and determines scores associated with the recognition hypotheses, the scores being based on a comparison of the recognition hypotheses to previously received search queries. The system discards at least one of the recognition hypotheses that is associated with a first score that is less than a threshold value, and constructs a first query using at least one non-discarded recognition hypothesis, where the at least one first non-discarded recognition hypothesis is associated with a second score that at least meets the threshold value. The system forwards the first query to a search system, receives first results associated with the first query, and provides the first results to the user.
    Type: Grant
    Filed: October 14, 2011
    Date of Patent: February 19, 2013
    Assignee: Google Inc.
    Inventors: Alexander Mark Franz, Monika H. Henzinger, Sergey Brin, Brian Christopher Milch
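    A minimal sketch of the hypothesis filtering in the abstract above, standing in for the comparison against previously received search queries with a simple frequency lookup and an illustrative threshold:

      # Sketch of recognition-hypothesis filtering (patent 8380502): hypotheses whose
      # score against past queries falls below a threshold are discarded before the
      # remaining hypothesis is forwarded to the search system.
      from collections import Counter

      past_queries = Counter({"weather boston": 120, "whether boston": 2})

      def keep_hypotheses(hypotheses, threshold):
          return [h for h in hypotheses if past_queries[h] >= threshold]

      print(keep_hypotheses(["weather boston", "whether boston"], threshold=10))
      # ['weather boston'] is used to construct the query sent to the search system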
  • Patent number: 8380484
    Abstract: A method (50) of dynamically changing a sentence structure of a message can include the step of receiving (51) a user request for information, retrieving (52) data based on the information requested, and altering (53) an intonation and/or the language conveying the information based on the context of the information to be presented. The intonation can optionally be altered by altering (54) a volume, a speed, and/or a pitch based on the information to be presented. The language can be altered by selecting (55) among a finite set of synonyms based on the information to be presented to the user or by selecting (56) among key verbs, adjectives or adverbs that vary along a continuum.
    Type: Grant
    Filed: August 10, 2004
    Date of Patent: February 19, 2013
    Assignee: International Business Machines Corporation
    Inventors: Brent L. Davis, Stephen W. Hanley, Vanessa V. Michelini, Melanie D. Polkosky
  • Patent number: 8380503
    Abstract: Challenge items for an audible based electronic challenge system are generated using a variety of techniques to identify optimal candidates. The challenge items are intended for use in a computing system that discriminates between humans and text to speech (TTS) systems.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: February 19, 2013
    Assignee: John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Patent number: 8374867
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
    Type: Grant
    Filed: November 13, 2009
    Date of Patent: February 12, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
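    The model-selection order in the abstract above is a simple fallback chain; a minimal sketch in which plain dicts stand in for the speech recognition infrastructure:

      # Sketch of speech model selection (patent 8374867): supervised user model,
      # then unsupervised user model, then the generic model.

      def select_model(user_id, supervised, unsupervised, generic):
          if user_id in supervised:
              return supervised[user_id]
          if user_id in unsupervised:
              return unsupervised[user_id]
          return generic

      model = select_model("u42", supervised={},
                           unsupervised={"u42": "unsupervised-model-u42"},
                           generic="generic-model")
      print(model)  # 'unsupervised-model-u42'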
  • Patent number: 8374868
    Abstract: A method for recognizing speech involves reciting, into a speech recognition system, an utterance including a numeric sequence that contains a digit string including a plurality of tokens and detecting a co-articulation problem related to at least two potentially co-articulated tokens in the digit string. The numeric sequence may be identified using i) a dynamically generated possible numeric sequence that potentially corresponds with the numeric sequence, and/or ii) at least one supplemental acoustic model. Also disclosed herein is a system for accomplishing the same.
    Type: Grant
    Filed: August 21, 2009
    Date of Patent: February 12, 2013
    Assignee: General Motors LLC
    Inventors: Uma Arun, Sherri J Voran-Nowak, Rathinavelu Chengalvarayan, Gaurav Talwar
  • Patent number: 8374864
    Abstract: In one embodiment, a method includes receiving at a communication device an audio communication and a transcribed text created from the audio communication, and generating a mapping of the transcribed text to the audio communication independent of transcribing the audio. The mapping identifies locations of portions of the text in the audio communication. An apparatus for mapping the text to the audio is also disclosed.
    Type: Grant
    Filed: March 17, 2010
    Date of Patent: February 12, 2013
    Assignee: Cisco Technology, Inc.
    Inventor: Jim Kerr
  • Patent number: 8370139
    Abstract: A noise-environment storing unit stores therein a compensation vector for compensating a feature vector of a speech. A feature-vector extracting unit extracts the feature vector of the speech in each of a plurality of frames. A noise-environment-series estimating unit estimates a noise-environment series based on a feature-vector series and a degree of similarity. A calculating unit obtains a compensation vector corresponding to each noise environment in the estimated noise-environment series based on the compensation vector present in the noise-environment storing unit. A compensating unit compensates the extracted feature vector of the speech based on the obtained compensation vector.
    Type: Grant
    Filed: March 19, 2007
    Date of Patent: February 5, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masami Akamine, Takashi Masuko, Daniel Barreda, Remco Teunen
  • Patent number: 8355915
    Abstract: The disclosure describes an overall system/method for text input using a multimodal interface with speech recognition. Specifically, a plurality of modes interact with the main speech mode to provide the speech-recognition system with partial knowledge of the text corresponding to the spoken utterance forming the input to the speech recognition system. The knowledge from other modes is used to dynamically change the ASR system's active vocabulary, thereby significantly increasing recognition accuracy and significantly reducing processing requirements. Additionally, the speech recognition system is configured using three different system configurations (always listening, partially listening, and push-to-speak), and for each of these, three different user interfaces are proposed (speak-and-type, type-and-speak, and speak-while-typing).
    Type: Grant
    Filed: November 30, 2007
    Date of Patent: January 15, 2013
    Inventor: Ashwin P. Rao
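    A minimal sketch of how partial knowledge from another mode can narrow the recognizer's active vocabulary, as the abstract above describes; the prefix rule and the vocabulary are illustrative.

      # Sketch of active-vocabulary narrowing (patent 8355915): letters already typed
      # restrict the ASR vocabulary to consistent words, shrinking the search space.

      def active_vocabulary(base_vocab, typed_prefix):
          return [w for w in base_vocab if w.startswith(typed_prefix.lower())]

      vocab = ["meeting", "message", "metric", "program", "project"]
      print(active_vocabulary(vocab, "me"))  # ['meeting', 'message', 'metric']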
  • Patent number: 8352263
    Abstract: The invention can recognize all languages and input words. It needs m unknown voices to represent m categories of known words with similar pronunciations. Words can be pronounced in any languages, dialects or accents. Each will be classified into one of the m categories represented by its most similar unknown voice. When a user pronounces a word, the invention finds its F most similar unknown voices. All words in the F categories represented by those F unknown voices will be arranged according to their pronunciation similarity and alphabetic letters. The pronounced word should be among the top words. Since we only find the F most similar unknown voices from m (=500) unknown voices, and since the same word can be classified into several categories, our recognition method is stable for all users and can quickly and accurately recognize all languages (English, Chinese, etc.) and input many more words without using samples.
    Type: Grant
    Filed: September 29, 2009
    Date of Patent: January 8, 2013
    Inventors: Tze-Fen Li, Tai-Jan Lee Li, Shih-Tzung Li, Shih-Hon Li, Li-Chuan Liao
  • Publication number: 20130006632
    Abstract: The invention involves the loading and unloading of dynamic section grammars and language models in a speech recognition system. The values of the sections of the structured document are either determined in advance from a collection of documents of the same domain, document type, and speaker; or collected incrementally from documents of the same domain, document type, and speaker; or added incrementally to an already existing set of values. Speech recognition in the context of the given field is constrained to the contents of these dynamic values. If speech recognition fails or produces a poor match within this grammar or section language model, speech recognition against a larger, more general vocabulary that is not constrained to the given section is performed.
    Type: Application
    Filed: September 12, 2012
    Publication date: January 3, 2013
    Inventors: Alwin B. Carus, Larissa Lapshina, Raghu Vemula
  • Patent number: 8346551
    Abstract: A method for adapting a codebook for speech recognition, wherein the codebook is from a set of codebooks comprising a speaker-independent codebook and at least one speaker-dependent codebook. A speech input is received and a feature vector based on the received speech input is determined. For each of the Gaussian densities, a first mean vector is estimated using an expectation process and taking into account the determined feature vector. For each of the Gaussian densities, a second mean vector is determined using an Eigenvoice adaptation, taking into account the determined feature vector. For each of the Gaussian densities, the mean vector is set to a convex combination of the first and the second mean vectors. Thus, this process allows for adaptation during operation and does not require a lengthy training phase.
    Type: Grant
    Filed: November 20, 2009
    Date of Patent: January 1, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Tobias Herbig, Franz Gerl
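    The final step in the abstract above is a convex combination of two mean vectors; a minimal sketch with an assumed mixing weight:

      # Sketch of the mean-vector update in patent 8346551: for each Gaussian density,
      # the adapted mean is a convex combination of the expectation-step estimate and
      # the Eigenvoice estimate. The weight alpha is an assumption, not the patent's value.

      def convex_combination(mean_expectation, mean_eigenvoice, alpha):
          """alpha in [0, 1] weights the expectation-step mean vector."""
          return [alpha * a + (1.0 - alpha) * b
                  for a, b in zip(mean_expectation, mean_eigenvoice)]

      print(convex_combination([1.0, 2.0], [3.0, 0.0], alpha=0.25))  # [2.5, 0.5]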
  • Publication number: 20120284025
    Abstract: Disclosed herein are methods and systems for recognizing speech. A method embodiment comprises comparing received speech with a precompiled grammar based on a database and if the received speech matches data in the precompiled grammar then returning a result based on the matched data. If the received speech does not match data in the precompiled grammar, then dynamically compiling a new grammar based only on new data added to the database after the compiling of the precompiled grammar. The database may comprise a directory of names.
    Type: Application
    Filed: July 18, 2012
    Publication date: November 8, 2012
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Harry Blanchard, Steven LEWIS, Shankarnarayan SIVAPRASAD, Lan ZHANG
  • Patent number: 8306819
    Abstract: Techniques for enhanced automatic speech recognition are described. An enhanced ASR system may be operative to generate an error correction function representing a mapping between a supervised set of parameters and an unsupervised training set of parameters generated using the same set of acoustic training data, and to apply the error correction function to an unsupervised testing set of parameters to form a corrected set of parameters used to perform speaker adaptation. Other embodiments are described and claimed.
    Type: Grant
    Filed: March 9, 2009
    Date of Patent: November 6, 2012
    Assignee: Microsoft Corporation
    Inventors: Chaojun Liu, Yifan Gong
  • Patent number: 8296145
    Abstract: A voice dialing method includes the steps of receiving an utterance from a user, decoding the utterance to identify a recognition result for the utterance, and communicating to the user the recognition result. If an indication is received from the user that the communicated recognition result is incorrect, then it is added to a rejection reference. Then, when the user repeats the misunderstood utterance, the rejection reference can be used to eliminate the incorrect recognition result as a potential subsequent recognition result. The method can be used for single or multiple digits or digit strings.
    Type: Grant
    Filed: November 7, 2011
    Date of Patent: October 23, 2012
    Assignee: General Motors LLC
    Inventors: Jason W. Clark, Rathinavelu Chengalvarayan, Timothy J. Grost, Dana Fecher, Jeremy Spaulding
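    A minimal sketch of the rejection-reference idea in the abstract above; the recognizer is faked as a ranked candidate list, and the helper names are hypothetical.

      # Sketch of the rejection reference in patent 8296145: a result the user rejects
      # is recorded so that decoding the repeated utterance cannot return it again.

      def recognize(ranked_candidates, rejected):
          for candidate in ranked_candidates:        # ordered best-first
              if candidate not in rejected:
                  return candidate
          return None

      rejected = set()
      first = recognize(["555-0123", "555-0128"], rejected)   # '555-0123'
      rejected.add(first)                                     # user: "no, that's wrong"
      second = recognize(["555-0123", "555-0128"], rejected)  # '555-0128'
      print(first, second)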
  • Publication number: 20120245940
    Abstract: A method for speech recognition is implemented in the specific form of computer processes that function in a computer processor. That is, one or more computer processes: process a speech input to produce a sequence of representative speech vectors and perform multiple recognition passes to determine a recognition output corresponding to the speech input. At least one generic recognition pass is based on a generic speech recognition arrangement using generic modeling of a broad general class of input speech. And at least one adapted recognition pass is based on a speech adapted arrangement using pre-adapted modeling of a specific sub-class of the general class of input speech.
    Type: Application
    Filed: December 8, 2009
    Publication date: September 27, 2012
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Daniel Willett, Lambert Mathias, Chuang He, Jianxiong Wu
  • Patent number: 8275615
    Abstract: A translation method and system include a recognition engine having a plurality of models each being employed to decode a same utterance to provide an output. A model combiner is configured to assign probabilities to each model output and configured to assign weights to the outputs of the plurality of models based on the probabilities to provide a best performing model for the context of the utterance.
    Type: Grant
    Filed: July 13, 2007
    Date of Patent: September 25, 2012
    Assignee: International Business Machines Corporation
    Inventors: Suleyman S. Kozat, Ruhi Sarikaya
  • Patent number: 8271578
    Abstract: A method of transferring data objects over a network comprises intercepting a network transfer message with a passing object, creating a unique identifier for the object using a predetermined function, the same function having been used to provide identifiers for objects stored at predetermined nodes of said network, removing the object and sending on the network transfer message with the unique identifier in place of the object. Then, at the recipient end it is possible to obtain the unique identifier and use it as a key to search for a corresponding object in the local nodes. The search starts with a node closest to the recipient and steadily spreads outwards. The object when found is reattached for the benefit of the recipient and network bandwidth has been saved by the avoidance of redundant transfer since the object is brought to the recipient from the node which is the closest to him.
    Type: Grant
    Filed: December 8, 2005
    Date of Patent: September 18, 2012
    Assignee: B-Obvious Ltd.
    Inventors: Guy Sheffi, Ovadi Somech
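    A minimal sketch of the transfer scheme in the abstract above, using SHA-256 as the predetermined identifier function and plain dicts as the network nodes ordered from closest to farthest; both choices are assumptions for illustration.

      # Sketch of identifier-based object transfer (patent 8271578): the object is
      # replaced by its identifier in transit, and the recipient side searches nearby
      # nodes for a stored copy, reattaching it and saving the redundant transfer.
      import hashlib

      def send(message):
          obj = message.pop("object")
          message["object_id"] = hashlib.sha256(obj).hexdigest()
          return message

      def receive(message, nodes_by_distance):
          for node in nodes_by_distance:             # search the closest node first
              obj = node.get(message["object_id"])
              if obj is not None:
                  return obj                         # reattached without re-transfer
          return None                                # would fall back to a full fetch

      payload = b"large attachment"
      local_node = {hashlib.sha256(payload).hexdigest(): payload}
      msg = send({"to": "bob", "object": payload})
      print(receive(msg, [local_node]) == payload)   # True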
  • Patent number: 8270588
    Abstract: A method and system for automatic incoming call management uses function test results to build call signatures that are stored for later use in incoming call analysis. The function test results are used to compute a suspect score and confidence level associated with each incoming call, and are also used for making incoming call management decisions. A call treatment is selected based on the function test results and/or the computed suspect score and confidence level.
    Type: Grant
    Filed: January 5, 2007
    Date of Patent: September 18, 2012
    Inventor: Ronald Schwartz
  • Patent number: 8271280
    Abstract: A voice recognition apparatus can reduce false recognition caused by matching against phrases composed of a small number of syllables when it performs a recognition process, by a pronunciation unit such as a syllable, on voice data based on voice produced by a speaker, and further performs recognition by a method such as Word Spotting for matching against the phrases stored in the phrase database. The voice recognition apparatus performs a recognition process that compares the result of the recognition process by a pronunciation unit with extended phrases obtained by adding an additional phrase before and/or behind the respective phrases.
    Type: Grant
    Filed: October 1, 2008
    Date of Patent: September 18, 2012
    Assignee: Fujitsu Limited
    Inventor: Kenji Abe
  • Patent number: 8249868
    Abstract: An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed, and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: August 21, 2012
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Trausti Kristjansson
  • Patent number: 8249870
    Abstract: A semi-automatic speech transcription system of the invention leverages the complementary capabilities of human and machine, building a system which combines automatic and manual approaches. With the invention, collected audio data is automatically distilled into speech segments, using signal processing and pattern recognition algorithms. The detected speech segments are presented to a human transcriber using a transcription tool with a streamlined transcription interface, requiring the transcriber to simply “listen and type”. This eliminates the need to manually navigate the audio, coupling the human effort to the amount of speech, rather than the amount of audio. Errors produced by the automatic system can be quickly identified by the human transcriber, which are used to improve the automatic system performance. The automatic system is tuned to maximize the human transcriber efficiency.
    Type: Grant
    Filed: November 12, 2008
    Date of Patent: August 21, 2012
    Assignee: Massachusetts Institute of Technology
    Inventors: Brandon Cain Roy, Deb Kumar Roy
  • Patent number: 8239199
    Abstract: A method includes identifying a first syllable in a first audio of a first word and a second syllable in a second audio of a second word, the first syllable having a first set of properties and the second syllable having a second set of properties; detecting the first syllable in a first instance of the first word in an audio file, the first syllable in the first instance having a third set of properties; determining one or more transformations for transforming the first set of properties to the third set of properties; applying the one or more transformations to the second set of properties of the second syllable to yield a transformed second syllable; and replacing the first syllable in the first instance of the first word with the transformed second syllable in the audio file.
    Type: Grant
    Filed: October 16, 2009
    Date of Patent: August 7, 2012
    Assignee: Yahoo! Inc.
    Inventor: Narayan Lakshmi Bhamidipati
  • Patent number: 8234111
    Abstract: An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed, and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.
    Type: Grant
    Filed: June 14, 2010
    Date of Patent: July 31, 2012
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Trausti Kristjansson
  • Patent number: 8234112
    Abstract: Provided are an apparatus and method for generating a noise adaptive acoustic model, including a noise adaptive discriminative adaptation method. The method includes: generating a baseline model parameter from large-capacity speech training data including various noise environments; and receiving the generated baseline model parameter and applying a discriminative adaptation method to the generated results to generate a migrated acoustic model parameter suitable for the actually applied environment.
    Type: Grant
    Filed: April 25, 2008
    Date of Patent: July 31, 2012
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Byung Ok Kang, Ho Young Jung, Yun Keun Lee
  • Patent number: RE43866
    Abstract: Method for improving TFCI transportation performance, including the steps of (1) coding TFCI information bits to be transported through each radio frame, (2) repeating a TFCI code word produced by the coding an arbitrary number of times, (3) applying puncturing patterns different from each other to the repeated code words, as many patterns as the number of repetitions, and puncturing the repeated code words at locations different from each other, and (4) dividing, inserting, and transporting the punctured fixed-length repeated code words in each slot of the radio frame, thereby improving TFCI information transportation performance and making the receiver-side decoder identical to the case when a 32-bit code word is transported perfectly.
    Type: Grant
    Filed: November 13, 2009
    Date of Patent: December 18, 2012
    Assignee: LG Electronics Inc.
    Inventors: Sung Kwon Hong, Sung Lark Kwon, Young Woo Yun, Ki Jun Kim