Update Patterns Patents (Class 704/244)
  • Patent number: 8489399
    Abstract: An audible based electronic challenge system is used to control access to a computing resource by using a test to identify an origin of a voice. The test is based on analyzing a spoken utterance to determine if it was articulated by an unauthorized human or a text to speech (TTS) system.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: July 16, 2013
    Assignee: John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Patent number: 8484024
    Abstract: Techniques are disclosed for using phonetic features for speech recognition. For example, a method comprises the steps of obtaining a first dictionary and a training data set associated with a speech recognition system, computing one or more support parameters from the training data set, transforming the first dictionary into a second dictionary, wherein the second dictionary is a function of one or more phonetic labels of the first dictionary, and using the one or more support parameters to select one or more samples from the second dictionary to create a set of one or more exemplar-based class identification features for a pattern recognition task.
    Type: Grant
    Filed: February 24, 2011
    Date of Patent: July 9, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Dimitri Kanevsky, David Nahamoo, Bhuvana Ramabhadran, Tara N. Sainath
  • Patent number: 8478593
    Abstract: Disclosed herein are methods and systems for recognizing speech. A method embodiment comprises comparing received speech with a precompiled grammar based on a database and if the received speech matches data in the precompiled grammar then returning a result based on the matched data. If the received speech does not match data in the precompiled grammar, then dynamically compiling a new grammar based only on new data added to the database after the compiling of the precompiled grammar. The database may comprise a directory of names.
    Type: Grant
    Filed: July 18, 2012
    Date of Patent: July 2, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Harry Blanchard, Steven Lewis, Shankarnarayan Sivaprasad, Lan Zhang
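    The fallback flow described in the abstract above can be sketched in a few lines of Python. This is a minimal illustration only: the "grammars" here are plain sets of directory names and the helper names are hypothetical, whereas the patent concerns compiled speech grammars.

      # Sketch of the precompiled-grammar fallback in patent 8478593.
      # The "grammars" are plain string sets, not real compiled speech grammars.

      def recognize_name(heard, precompiled, database, compiled_at_index):
          """Match against the precompiled grammar; if that fails, match against a
          grammar dynamically compiled from only the entries added after compilation."""
          if heard in precompiled:
              return heard
          new_entries = set(database[compiled_at_index:])   # data added after compile
          return heard if heard in new_entries else None

      directory = ["alice", "bob", "carol", "dave"]         # "dave" added after compile
      precompiled = set(directory[:3])
      print(recognize_name("dave", precompiled, directory, compiled_at_index=3))  # dave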
  • Patent number: 8478589
    Abstract: A machine-readable medium may include a group of reusable components for building a spoken dialog system. The reusable components may include a group of previously collected audible utterances. A machine-implemented method to build a library of reusable components for use in building a natural language spoken dialog system may include storing a dataset in a database. The dataset may include a group of reusable components for building a spoken dialog system. The reusable components may further include a group of previously collected audible utterances. A second method may include storing at least one set of data. Each one of the at least one set of data may include ones of the reusable components associated with audible data collected during a different collection phase.
    Type: Grant
    Filed: January 5, 2005
    Date of Patent: July 2, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Lee Begeja, Giuseppe Di Fabbrizio, David Crawford Gibbon, Dilek Z. Hakkani-Tur, Zhu Liu, Bernard S. Renger, Behzad Shahraray, Gokhan Tur
  • Patent number: 8473293
    Abstract: This specification describes technologies relating to system, methods, and articles for updating a speech recognition dictionary based on, at least in part, both search query and market data metrics. In general, one innovative aspect of the subject matter described in this specification can be embodied in a method comprising (i) identifying a candidate term for possible inclusion in a speech recognition dictionary, (ii) identifying at least one search query metric associated with the identified candidate term, (iii) identifying at least one market data metric associated with the identified candidate term, and (iv) generating a candidate term score for the identified candidate term based, at least in part, on a weighted combination of the at least one identified search query metric and the at least one identified market data metric.
    Type: Grant
    Filed: October 23, 2012
    Date of Patent: June 25, 2013
    Assignee: Google Inc.
    Inventors: Pedro J. Mengibar, Jeffrey S. Sorensen
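    The scoring step in the abstract above is a weighted combination of metrics; a minimal sketch follows, with illustrative weights and metric values that are not taken from the patent.

      # Sketch of the candidate-term scoring in patent 8473293: a weighted
      # combination of a search-query metric and a market-data metric.

      def candidate_term_score(query_metric, market_metric, w_query=0.7, w_market=0.3):
          return w_query * query_metric + w_market * market_metric

      # A term trending in both search queries and market data scores highly and
      # becomes a stronger candidate for the speech recognition dictionary.
      print(candidate_term_score(query_metric=0.9, market_metric=0.4))  # 0.75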
  • Patent number: 8473300
    Abstract: Methods and systems for log mining for grammar-based text processing are provided. A method may comprise receiving, from a device, an activity log. The activity log may comprise one or more of an input instruction, a determined function based at least in part on a match of the input instruction to a grammar-based textual pattern including associations of a given function based on one or more grammars, and a response determination based on an acknowledgement of the determined function. The method may also comprise comparing at least a portion of the activity log with stored activity logs in order to determine a correlation between the activity log and the stored activity logs. The method may also comprise modifying the grammar-based textual pattern based on the determined correlation and providing information indicative of the modification to the device so as to update the grammar-based textual pattern.
    Type: Grant
    Filed: October 8, 2012
    Date of Patent: June 25, 2013
    Assignee: Google Inc.
    Inventors: Pedro J. Moreno Mengibar, Martin Jansche, Fadi Biadsy
  • Patent number: 8463608
    Abstract: A method and apparatus for updating a speech model on a multi-user speech recognition system with a personal speech model for a single user. A speech recognition system, for instance in a car, can include a generic speech model for comparison with the user speech input. A way of identifying a personal speech model, for instance in a mobile phone, is connected to the system. A mechanism is included for receiving personal speech model components, for instance a BLUETOOTH connection. The generic speech model is updated using the received personal speech model components. Speech recognition can then be performed on user speech using the updated generic speech model.
    Type: Grant
    Filed: March 12, 2012
    Date of Patent: June 11, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Barry Neil Dow, Eric William Janke, Daniel Lee Yuk Cheung, Benjamin Terrick Staniford
  • Patent number: 8457968
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for tracking multiple dialog states. A system practicing the method receives an N-best list of speech recognition candidates, a list of current partitions, and a belief for each of the current partitions. A partition is a group of dialog states. In an outer loop, the system iterates over the N-best list of speech recognition candidates. In an inner loop, the system performs a split, update, and recombination process to generate a fixed number of partitions after each speech recognition candidate in the N-best list. The system recognizes speech based on the N-best list and the fixed number of partitions. The split process can perform all possible splits on all partitions. The update process can compute an estimated new belief. The estimated new belief can be a product of ASR reliability, user likelihood to produce this action, and an original belief.
    Type: Grant
    Filed: December 8, 2009
    Date of Patent: June 4, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Jason Williams
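    The estimated new belief in the abstract above is a product of three factors. The sketch below shows only that product; the split and recombination steps, the outer loop over the N-best list, and the actual probability models are omitted.

      # Sketch of the belief update in patent 8457968: for a partition (a group of
      # dialog states) and one N-best candidate, the estimated new belief is
      # ASR reliability * likelihood the user produced this action * original belief.

      def estimated_new_belief(asr_reliability, user_action_likelihood, original_belief):
          return asr_reliability * user_action_likelihood * original_belief

      print(estimated_new_belief(asr_reliability=0.8,
                                 user_action_likelihood=0.5,
                                 original_belief=0.6))   # 0.24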
  • Patent number: 8457962
    Abstract: This invention provides remote audio surveillance by recording audio data via three microphones and storage on a removable digital mass storage device, operating on battery power. The housing is of a weather resistant design to withstand outdoor conditions. Recording can be done in person or recording times can be defined so that the unit will only ‘listen’ during the desired times of the day, on a day to day basis. The user does not have to be in the vicinity but simply programs the record time(s) and leaves the device in the woods. The device also has play back capabilities for any recorded audio data and can interface with personal computers via the removable digital mass storage device. In addition to the audio collection and playback capabilities, PC software will be provided with the device which will analyze the data and provide direction of sound (based upon relative amplitude of the 3 microphones) and distance of sound (based on absolute and relative recorded amplitudes).
    Type: Grant
    Filed: August 4, 2006
    Date of Patent: June 4, 2013
    Inventor: Lawrence P. Jones
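    The abstract above only states that direction is derived from the relative amplitudes of the three microphones. The sketch below is one plausible reading, assuming known microphone bearings and an amplitude-weighted average; the actual analysis in the PC software is not disclosed at this level of detail.

      # Rough sketch of amplitude-based direction estimation (patent 8457962):
      # weight each microphone's bearing by its recorded amplitude and take the
      # direction of the resulting vector. Geometry and weighting are assumptions.
      import math

      mic_bearings = [0.0, 2 * math.pi / 3, 4 * math.pi / 3]   # three mics, 120 deg apart
      amplitudes = [0.9, 0.4, 0.2]                              # relative recorded amplitudes

      x = sum(a * math.cos(b) for a, b in zip(amplitudes, mic_bearings))
      y = sum(a * math.sin(b) for a, b in zip(amplitudes, mic_bearings))
      print(f"estimated sound bearing: {math.degrees(math.atan2(y, x)) % 360:.1f} degrees")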
  • Patent number: 8457965
    Abstract: A method is described for correcting and improving the functioning of certain devices for the diagnosis and treatment of speech that dynamically measure the functioning of the velum in the control of nasality during speech. The correction method uses an estimate of the vowel frequency spectrum to greatly reduce the variation of nasalance with the vowel being spoken, so as to result in a corrected value of nasalance that reflects with greater accuracy the degree of velar opening. Correction is also described for reducing the effect on nasalance values of energy from the oral and nasal channels crossing over into the other channel because of imperfect acoustic separation.
    Type: Grant
    Filed: October 6, 2009
    Date of Patent: June 4, 2013
    Assignee: Rothenberg Enterprises
    Inventor: Martin Rothenberg
  • Publication number: 20130132084
    Abstract: A system and method for performing dual mode speech recognition, employing a local recognition module on a mobile device and a remote recognition engine on a server device. The system accepts a spoken query from a user, and both the local recognition module and the remote recognition engine perform speech recognition operations on the query, returning a transcription and confidence score, subject to a latency cutoff time. If both sources successfully transcribe the query, then the system accepts the result having the higher confidence score. If only one source succeeds, then that result is accepted. In either case, if the remote recognition engine does succeed in transcribing the query, then a client vocabulary is updated if the remote system result includes information not present in the client vocabulary.
    Type: Application
    Filed: June 21, 2012
    Publication date: May 23, 2013
    Applicant: SOUNDHOUND, INC.
    Inventors: Timothy Stonehocker, Keyvan Mohajer, Bernard Mont-Reynaud
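    A minimal sketch of the arbitration rule in the abstract above, assuming each recognizer returns a (transcription, confidence) pair or None once the latency cutoff expires; the result objects and the vocabulary-update hook are simplified.

      # Sketch of dual-mode arbitration (publication 20130132084): both recognizers
      # transcribe the query; if both succeed take the higher confidence, otherwise
      # take whichever succeeded; a successful remote result updates the client vocabulary.

      def arbitrate(local_result, remote_result, client_vocab):
          if remote_result is not None:
              client_vocab.update(remote_result[0].split())   # add unseen words
          if local_result and remote_result:
              return max(local_result, remote_result, key=lambda r: r[1])
          return local_result or remote_result

      vocab = {"play", "music"}
      print(arbitrate(("play music", 0.62), ("play some music", 0.87), vocab))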
  • Patent number: 8447604
    Abstract: Provided in some embodiments is a method including receiving ordered script words that are indicative of dialogue words to be spoken, receiving audio data corresponding to at least a portion of the dialogue words to be spoken and including timecodes associated with dialogue words, generating a matrix of the ordered script words versus the dialogue words, aligning the matrix to determine hard alignment points that include matching consecutive sequences of ordered script words with corresponding sequences of dialogue words, partitioning the matrix of ordered script words into sub-matrices bounded by adjacent hard-alignment points and including corresponding sub-sets of the script and dialogue words between the hard-alignment points, and aligning each of the sub-matrices.
    Type: Grant
    Filed: May 28, 2010
    Date of Patent: May 21, 2013
    Assignee: Adobe Systems Incorporated
    Inventor: Walter W. Chang
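    A rough sketch of the anchor-finding and partitioning steps in the abstract above: hard alignment points are taken to be exact matches of n consecutive script words against the dialogue words, and the remaining alignment work is confined to the spans between adjacent anchors. The value of n and the exact-match rule are simplifications of the patent's matrix alignment.

      # Sketch of anchor-based alignment partitioning (patent 8447604).

      def hard_alignment_points(script, dialogue, n=3):
          """(script index, dialogue index) pairs where n consecutive words match exactly."""
          positions = {tuple(dialogue[j:j + n]): j for j in range(len(dialogue) - n + 1)}
          return [(i, positions[tuple(script[i:i + n])])
                  for i in range(len(script) - n + 1)
                  if tuple(script[i:i + n]) in positions]

      def sub_spans(points, script_len, dialogue_len):
          """Sub-matrices (script span, dialogue span) bounded by adjacent anchors."""
          bounds = [(0, 0)] + points + [(script_len, dialogue_len)]
          return [((a[0], b[0]), (a[1], b[1])) for a, b in zip(bounds, bounds[1:])]

      script = "to be or not to be that is the question".split()
      dialogue = "to be or not to be um that is the question".split()
      points = hard_alignment_points(script, dialogue)
      print(sub_spans(points, len(script), len(dialogue)))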
  • Patent number: 8442826
    Abstract: Architecture for integrating application-dependent information into a constraints component at deployment time or when available. In terms of a general grammar, the constraints component can include or be a general grammar that comprises application-independent information and is structured in such a way that application-dependent information can be integrated into the general grammar without loss of fidelity. The general grammar includes a probability space and reserves a section of the probability space for the integration of application-dependent information. An integration component integrates the application-dependent information into the reserved section of the probability space for recognition processing. The application-dependent information is integrated into the reserved section of the probability space at deployment time or when available. The general grammar is structured to support the integration and improve the overall system.
    Type: Grant
    Filed: June 10, 2009
    Date of Patent: May 14, 2013
    Assignee: Microsoft Corporation
    Inventors: Jonathan E. Hamaker, Julian James Odell, Michael D. Plumpe, Sandeep Manocha, Keith C. Herold
  • Publication number: 20130117023
    Abstract: A system and method of updating automatic speech recognition parameters on a mobile device are disclosed. The method comprises storing user account-specific adaptation data associated with ASR on a computing device associated with a wireless network, generating new ASR adaptation parameters based on transmitted information from the mobile device when a communication channel between the computing device and the mobile device becomes available and transmitting the new ASR adaptation data to the mobile device when a communication channel between the computing device and the mobile device becomes available. The new ASR adaptation data on the mobile device more accurately recognizes user utterances.
    Type: Application
    Filed: October 22, 2012
    Publication date: May 9, 2013
    Applicant: AT&T INTELLECTUAL PROPERTY II, L.P.
    Inventor: AT&T INTELLECTUAL PROPERTY II, L.P.
  • Patent number: 8438027
    Abstract: An object of the invention is to conveniently increase the standard patterns registered in a voice recognition device so as to efficiently extend the number of words that can be voice-recognized. New standard patterns are generated by modifying a part of an existing standard pattern. A pattern matching unit 16 of a modifying-part specifying unit 14 performs a pattern matching process to specify a part to be modified in the existing standard pattern of a usage source. A standard pattern generating unit 18 generates the new standard patterns by cutting or deleting voice data of the modifying part of the usage-source standard pattern, substituting the voice data of the modifying part of the usage-source standard pattern with other voice data, or combining the voice data of the modifying part of the usage-source standard pattern with other voice data. A standard pattern database update unit 20 adds the new standard patterns to a standard pattern database 24.
    Type: Grant
    Filed: May 25, 2006
    Date of Patent: May 7, 2013
    Assignee: Panasonic Corporation
    Inventors: Toshiyuki Teranishi, Kouji Hatano
  • Patent number: 8438026
    Abstract: The invention describes a method and a system for generating training data (DT) for an automatic speech recogniser (2) for operating at a particular first sampling frequency (fH), comprising steps of deriving spectral characteristics (SL) from audio data (DL) sampled at a second frequency (fL) lower than the first sampling frequency (fH), extending the bandwidth of the spectral characteristics (SL) by retrieving bandwidth extending information (OBE) from a codebook (6), and processing the bandwidth extended spectral characteristics (SLE) to give the required training data (DT). Moreover a method and a system (5) for generating a codebook (6) for extending the bandwidth of spectral characteristics (SL) for audio data (DL) sampled at a second sampling frequency (fL) to spectral characteristics (SH) for a first sampling frequency (fH) higher than the second sampling frequency (fL) are described.
    Type: Grant
    Filed: February 10, 2005
    Date of Patent: May 7, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Alexander Fischer, Rolf Dieter Bippus
  • Patent number: 8423354
    Abstract: A device extracts prosodic information, including a power value, from speech data, and extracts an utterance section, namely a period with a power value equal to or larger than a threshold, from the speech data; divides the utterance section into sections in which the power value is equal to or larger than another threshold; acquires phoneme sequence data for each piece of divided speech data by phoneme recognition; generates clusters, each a set of the classified phoneme sequence data, by clustering; calculates an evaluation value for each cluster; selects clusters whose evaluation value is equal to or larger than a given value as candidate clusters; determines, for each candidate cluster, one of the phoneme sequence data constituting the cluster to be a representative phoneme sequence; and selects the divided speech data corresponding to the representative phoneme sequence as the listening-target speech data.
    Type: Grant
    Filed: November 5, 2010
    Date of Patent: April 16, 2013
    Assignee: Fujitsu Limited
    Inventor: Sachiko Onodera
  • Patent number: 8407052
    Abstract: Methods and systems for correcting transcribed text. One method includes receiving audio data from one or more audio data sources and transcribing the audio data based on a voice model to generate text data. The method also includes making the text data available to a plurality of users over at least one computer network and receiving corrected text data over the at least one computer network from the plurality of users. In addition, the method can include modifying the voice model based on the corrected text data.
    Type: Grant
    Filed: April 17, 2007
    Date of Patent: March 26, 2013
    Assignee: Vovision, LLC
    Inventor: Paul M. Hager
  • Publication number: 20130073286
    Abstract: Candidate interpretations resulting from application of speech recognition algorithms to spoken input are presented in a consolidated manner that reduces redundancy. A list of candidate interpretations is generated, and each candidate interpretation is subdivided into time-based portions, forming a grid. Those time-based portions that duplicate portions from other candidate interpretations are removed from the grid. A user interface is provided that presents the user with an opportunity to select among the candidate interpretations; the user interface is configured to present these alternatives without duplicate elements.
    Type: Application
    Filed: September 20, 2011
    Publication date: March 21, 2013
    Applicant: APPLE INC.
    Inventors: Marcello Bastea-Forte, David A. Winarsky
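    A minimal sketch of the consolidation idea in the abstract above, using word positions as a stand-in for time-based portions; the grid and the duplicate-removal rule are simplified.

      # Sketch of consolidating candidate interpretations (publication 20130073286):
      # split each candidate into portions and hide portions that duplicate the top
      # candidate, so the user is shown only the parts that actually differ.

      def consolidate(candidates):
          top = candidates[0]
          grid = [[w if i >= len(top) or w != top[i] else None   # None = duplicate, hidden
                   for i, w in enumerate(cand)]
                  for cand in candidates[1:]]
          return top, grid

      top, alternatives = consolidate([["call", "mom", "now"],
                                       ["call", "tom", "now"],
                                       ["call", "mom", "mao"]])
      print(top)           # ['call', 'mom', 'now']
      print(alternatives)  # [[None, 'tom', None], [None, None, 'mao']]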
  • Patent number: 8401851
    Abstract: A system and method of targeted tuning of a speech recognition system are disclosed. In a particular embodiment, a method includes determining a frequency of occurrence of a particular type of utterance and determining whether the frequency of occurrence exceeds a threshold. The method further includes tuning a speech recognition system to improve recognition of the particular type of utterance when the frequency of occurrence of the particular type of utterance exceeds the threshold.
    Type: Grant
    Filed: July 15, 2009
    Date of Patent: March 19, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Robert R. Bushey, Benjamin Anthony Knott, John Mills Martin
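    The trigger condition in the abstract above reduces to a frequency count against a threshold; a minimal sketch with made-up utterance types:

      # Sketch of the tuning trigger in patent 8401851: utterance types whose
      # frequency of occurrence exceeds a threshold are flagged for targeted tuning.
      from collections import Counter

      def tuning_targets(utterance_types, threshold):
          return [t for t, n in Counter(utterance_types).items() if n > threshold]

      log = ["billing", "agent", "billing", "billing", "outage"]
      print(tuning_targets(log, threshold=2))  # ['billing']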
  • Patent number: 8396710
    Abstract: A distributed voice user interface system includes a local device which receives speech input issued from a user. Such speech input may specify a command or a request by the user. The local device performs preliminary processing of the speech input and also provides the speech input to a remote system. The local device is able to update its recognition capabilities based on the remote system's analysis of the speech input.
    Type: Grant
    Filed: November 23, 2011
    Date of Patent: March 12, 2013
    Assignee: Ben Franklin Patent Holding LLC
    Inventors: George M. White, James J. Buteau, Glen E. Shires, Kevin J. Surace, Steven Markman
  • Patent number: 8392188
    Abstract: The invention concerns a method and corresponding system for building a phonotactic model for domain independent speech recognition. The method may include recognizing phones from a user's input communication using a current phonotactic model, detecting morphemes (acoustic and/or non-acoustic) from the recognized phones, and outputting the detected morphemes for processing. The method also updates the phonotactic model with the detected morphemes and stores the new model in a database for use by the system during the next user interaction. The method may also include making task-type classification decisions based on the detected morphemes from the user's input communication.
    Type: Grant
    Filed: September 21, 2001
    Date of Patent: March 5, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Giuseppe Riccardi
  • Patent number: 8386251
    Abstract: A speech recognition system is provided with iteratively refined multiple passes through the received data to enhance the accuracy of the results by introducing constraints and adaptation from initial passes into subsequent recognition operations. The multiple passes are performed on an initial utterance received from a user. The iteratively enhanced subsequent passes are also performed on following utterances received from the user increasing an overall system efficiency and accuracy.
    Type: Grant
    Filed: June 8, 2009
    Date of Patent: February 26, 2013
    Assignee: Microsoft Corporation
    Inventors: Nikko Strom, Julian Odell, Jon Hamaker
  • Patent number: 8386248
    Abstract: A method of tuning reusable dialog components within a speech application can include detecting speech recognition events generated from a plurality of recognitions performed for a field of a reusable dialog component. The speech recognition events can be generated over a plurality of interactive voice response sessions. The method also can include automatically computing a suggested value for a tuning parameter corresponding to the field of the reusable dialog component according, at least in part, to the speech recognition events.
    Type: Grant
    Filed: September 22, 2006
    Date of Patent: February 26, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Girish Dhanakshirur, Baiju D. Mandalia, Aimee Silva
  • Patent number: 8386250
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: February 26, 2013
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Willard Van Tuyl Rusch, II
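    A minimal sketch of an affinity score built from frequency and recency, as the abstract above describes; the exponential decay, half-life, and normalization into a probability are assumptions rather than the patent's formula.

      # Sketch of contact disambiguation via affinity scores (patent 8386250).
      import math
      import time

      def affinity(frequency, last_contact_ts, now, half_life_days=30.0):
          age_days = (now - last_contact_ts) / 86400.0
          recency = math.exp(-math.log(2) * age_days / half_life_days)   # decays with age
          return frequency * recency

      now = time.time()
      scores = {"Alice": affinity(25, now - 2 * 86400, now),     # frequent and recent
                "Albert": affinity(3, now - 90 * 86400, now)}    # rare and stale
      total = sum(scores.values())
      print({name: round(s / total, 3) for name, s in scores.items()})  # P(intended contact)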
  • Patent number: 8380502
    Abstract: A system receives a voice search query from a user, derives recognition hypotheses from the voice search query, and determines scores associated with the recognition hypotheses, the scores being based on a comparison of the recognition hypotheses to previously received search queries. The system discards at least one of the recognition hypotheses that is associated with a first score that is less than a threshold value, and constructs a first query using at least one non-discarded recognition hypothesis, where the at least one first non-discarded recognition hypothesis is associated with a second score that at least meets the threshold value. The system forwards the first query to a search system, receives first results associated with the first query, and provides the first results to the user.
    Type: Grant
    Filed: October 14, 2011
    Date of Patent: February 19, 2013
    Assignee: Google Inc.
    Inventors: Alexander Mark Franz, Monika H. Henzinger, Sergey Brin, Brian Christopher Milch
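    A minimal sketch of the hypothesis filtering in the abstract above, standing in for the comparison against previously received search queries with a simple frequency lookup and an illustrative threshold:

      # Sketch of recognition-hypothesis filtering (patent 8380502): hypotheses whose
      # score against past queries falls below a threshold are discarded before the
      # remaining hypothesis is forwarded to the search system.
      from collections import Counter

      past_queries = Counter({"weather boston": 120, "whether boston": 2})

      def keep_hypotheses(hypotheses, threshold):
          return [h for h in hypotheses if past_queries[h] >= threshold]

      print(keep_hypotheses(["weather boston", "whether boston"], threshold=10))
      # ['weather boston'] is used to construct the query sent to the search system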
  • Patent number: 8380484
    Abstract: A method (50) of dynamically changing a sentence structure of a message can include the step of receiving (51) a user request for information, retrieving (52) data based on the information requested, and altering (53) an intonation and/or the language conveying the information based on the context of the information to be presented. The intonation can optionally be altered by altering (54) a volume, a speed, and/or a pitch based on the information to be presented. The language can be altered by selecting (55) among a finite set of synonyms based on the information to be presented to the user or by selecting (56) among key verbs, adjectives or adverbs that vary along a continuum.
    Type: Grant
    Filed: August 10, 2004
    Date of Patent: February 19, 2013
    Assignee: International Business Machines Corporation
    Inventors: Brent L. Davis, Stephen W. Hanley, Vanessa V. Michelini, Melanie D. Polkosky
  • Patent number: 8380503
    Abstract: Challenge items for an audible based electronic challenge system are generated using a variety of techniques to identify optimal candidates. The challenge items are intended for use in a computing system that discriminates between humans and text to speech (TTS) systems.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: February 19, 2013
    Assignee: John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Patent number: 8374867
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
    Type: Grant
    Filed: November 13, 2009
    Date of Patent: February 12, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
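    The model-selection order in the abstract above is a simple fallback chain; a minimal sketch in which plain dicts stand in for the speech recognition infrastructure:

      # Sketch of speech model selection (patent 8374867): supervised user model,
      # then unsupervised user model, then the generic model.

      def select_model(user_id, supervised, unsupervised, generic):
          if user_id in supervised:
              return supervised[user_id]
          if user_id in unsupervised:
              return unsupervised[user_id]
          return generic

      model = select_model("u42", supervised={},
                           unsupervised={"u42": "unsupervised-model-u42"},
                           generic="generic-model")
      print(model)  # 'unsupervised-model-u42'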
  • Patent number: 8374868
    Abstract: A method for recognizing speech involves reciting, into a speech recognition system, an utterance including a numeric sequence that contains a digit string including a plurality of tokens and detecting a co-articulation problem related to at least two potentially co-articulated tokens in the digit string. The numeric sequence may be identified using i) a dynamically generated possible numeric sequence that potentially corresponds with the numeric sequence, and/or ii) at least one supplemental acoustic model. Also disclosed herein is a system for accomplishing the same.
    Type: Grant
    Filed: August 21, 2009
    Date of Patent: February 12, 2013
    Assignee: General Motors LLC
    Inventors: Uma Arun, Sherri J Voran-Nowak, Rathinavelu Chengalvarayan, Gaurav Talwar
  • Patent number: 8374864
    Abstract: In one embodiment, a method includes receiving at a communication device an audio communication and a transcribed text created from the audio communication, and generating a mapping of the transcribed text to the audio communication independent of transcribing the audio. The mapping identifies locations of portions of the text in the audio communication. An apparatus for mapping the text to the audio is also disclosed.
    Type: Grant
    Filed: March 17, 2010
    Date of Patent: February 12, 2013
    Assignee: Cisco Technology, Inc.
    Inventor: Jim Kerr
  • Patent number: 8370139
    Abstract: A noise-environment storing unit stores therein a compensation vector for compensating a feature vector of a speech. A feature-vector extracting unit extracts the feature vector of the speech in each of a plurality of frames. A noise-environment-series estimating unit estimates a noise-environment series based on a feature-vector series and a degree of similarity. A calculating unit obtains a compensation vector corresponding to each noise environment in the estimated noise-environment series based on the compensation vector present in the noise-environment storing unit. A compensating unit compensates the extracted feature vector of the speech based on the obtained compensation vector.
    Type: Grant
    Filed: March 19, 2007
    Date of Patent: February 5, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masami Akamine, Takashi Masuko, Daniel Barreda, Remco Teunen
  • Patent number: 8355915
    Abstract: The disclosure describes an overall system/method for text input using a multimodal interface with speech recognition. Specifically, a plurality of modes interact with the main speech mode to provide the speech-recognition system with partial knowledge of the text corresponding to the spoken utterance forming the input to the speech recognition system. The knowledge from other modes is used to dynamically change the ASR system's active vocabulary, thereby significantly increasing recognition accuracy and significantly reducing processing requirements. Additionally, the speech recognition system is configured using three different system configurations (always listening, partially listening, and push-to-speak), and for each of these, three different user interfaces are proposed (speak-and-type, type-and-speak, and speak-while-typing).
    Type: Grant
    Filed: November 30, 2007
    Date of Patent: January 15, 2013
    Inventor: Ashwin P. Rao
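    A minimal sketch of how partial knowledge from another mode can narrow the recognizer's active vocabulary, as the abstract above describes; the prefix rule and the vocabulary are illustrative.

      # Sketch of active-vocabulary narrowing (patent 8355915): letters already typed
      # restrict the ASR vocabulary to consistent words, shrinking the search space.

      def active_vocabulary(base_vocab, typed_prefix):
          return [w for w in base_vocab if w.startswith(typed_prefix.lower())]

      vocab = ["meeting", "message", "metric", "program", "project"]
      print(active_vocabulary(vocab, "me"))  # ['meeting', 'message', 'metric']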
  • Patent number: 8352263
    Abstract: The invention can recognize all languages and input words. It needs m unknown voices to represent m categories of known words with similar pronunciations. Words can be pronounced in any languages, dialects or accents. Each will be classified into one of the m categories represented by its most similar unknown voice. When a user pronounces a word, the invention finds its F most similar unknown voices. All words in the F categories represented by those F unknown voices will be arranged according to their pronunciation similarity and alphabetic letters. The pronounced word should be among the top words. Since we only find the F most similar unknown voices from m (=500) unknown voices, and since the same word can be classified into several categories, our recognition method is stable for all users and can quickly and accurately recognize all languages (English, Chinese, etc.) and input many more words without using samples.
    Type: Grant
    Filed: September 29, 2009
    Date of Patent: January 8, 2013
    Inventors: Tze-Fen Li, Tai-Jan Lee Li, Shih-Tzung Li, Shih-Hon Li, Li-Chuan Liao
  • Publication number: 20130006632
    Abstract: The invention involves the loading and unloading of dynamic section grammars and language models in a speech recognition system. The values of the sections of the structured document are either determined in advance from a collection of documents of the same domain, document type, and speaker; or collected incrementally from documents of the same domain, document type, and speaker; or added incrementally to an already existing set of values. Speech recognition in the context of the given field is constrained to the contents of these dynamic values. If speech recognition fails or produces a poor match within this grammar or section language model, speech recognition against a larger, more general vocabulary that is not constrained to the given section is performed.
    Type: Application
    Filed: September 12, 2012
    Publication date: January 3, 2013
    Inventors: Alwin B. Carus, Larissa Lapshina, Raghu Vemula
  • Patent number: 8346551
    Abstract: A method for adapting a codebook for speech recognition, wherein the codebook is from a set of codebooks comprising a speaker-independent codebook and at least one speaker-dependent codebook. A speech input is received and a feature vector based on the received speech input is determined. For each of the Gaussian densities, a first mean vector is estimated using an expectation process and taking into account the determined feature vector. For each of the Gaussian densities, a second mean vector is determined using an Eigenvoice adaptation, taking into account the determined feature vector. For each of the Gaussian densities, the mean vector is set to a convex combination of the first and the second mean vectors. Thus, this process allows for adaptation during operation and does not require a lengthy training phase.
    Type: Grant
    Filed: November 20, 2009
    Date of Patent: January 1, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Tobias Herbig, Franz Gerl
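    The final step in the abstract above is a convex combination of two mean vectors; a minimal sketch with an assumed mixing weight:

      # Sketch of the mean-vector update in patent 8346551: for each Gaussian density,
      # the adapted mean is a convex combination of the expectation-step estimate and
      # the Eigenvoice estimate. The weight alpha is an assumption, not the patent's value.

      def convex_combination(mean_expectation, mean_eigenvoice, alpha):
          """alpha in [0, 1] weights the expectation-step mean vector."""
          return [alpha * a + (1.0 - alpha) * b
                  for a, b in zip(mean_expectation, mean_eigenvoice)]

      print(convex_combination([1.0, 2.0], [3.0, 0.0], alpha=0.25))  # [2.5, 0.5]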
  • Publication number: 20120284025
    Abstract: Disclosed herein are methods and systems for recognizing speech. A method embodiment comprises comparing received speech with a precompiled grammar based on a database and if the received speech matches data in the precompiled grammar then returning a result based on the matched data. If the received speech does not match data in the precompiled grammar, then dynamically compiling a new grammar based only on new data added to the database after the compiling of the precompiled grammar. The database may comprise a directory of names.
    Type: Application
    Filed: July 18, 2012
    Publication date: November 8, 2012
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Harry Blanchard, Steven LEWIS, Shankarnarayan SIVAPRASAD, Lan ZHANG
  • Patent number: 8306819
    Abstract: Techniques for enhanced automatic speech recognition are described. An enhanced ASR system may be operative to generate an error correction function representing a mapping between a supervised set of parameters and an unsupervised training set of parameters generated using the same set of acoustic training data, and to apply the error correction function to an unsupervised testing set of parameters to form a corrected set of parameters used to perform speaker adaptation. Other embodiments are described and claimed.
    Type: Grant
    Filed: March 9, 2009
    Date of Patent: November 6, 2012
    Assignee: Microsoft Corporation
    Inventors: Chaojun Liu, Yifan Gong
  • Patent number: 8296145
    Abstract: A voice dialing method includes the steps of receiving an utterance from a user, decoding the utterance to identify a recognition result for the utterance, and communicating to the user the recognition result. If an indication is received from the user that the communicated recognition result is incorrect, then it is added to a rejection reference. Then, when the user repeats the misunderstood utterance, the rejection reference can be used to eliminate the incorrect recognition result as a potential subsequent recognition result. The method can be used for single or multiple digits or digit strings.
    Type: Grant
    Filed: November 7, 2011
    Date of Patent: October 23, 2012
    Assignee: General Motors LLC
    Inventors: Jason W. Clark, Rathinavelu Chengalvarayan, Timothy J. Grost, Dana Fecher, Jeremy Spaulding
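    A minimal sketch of the rejection-reference idea in the abstract above; the recognizer is faked as a ranked candidate list, and the helper names are hypothetical.

      # Sketch of the rejection reference in patent 8296145: a result the user rejects
      # is recorded so that decoding the repeated utterance cannot return it again.

      def recognize(ranked_candidates, rejected):
          for candidate in ranked_candidates:        # ordered best-first
              if candidate not in rejected:
                  return candidate
          return None

      rejected = set()
      first = recognize(["555-0123", "555-0128"], rejected)   # '555-0123'
      rejected.add(first)                                     # user: "no, that's wrong"
      second = recognize(["555-0123", "555-0128"], rejected)  # '555-0128'
      print(first, second)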
  • Publication number: 20120245940
    Abstract: A method for speech recognition is implemented in the specific form of computer processes that function in a computer processor. That is, one or more computer processes: process a speech input to produce a sequence of representative speech vectors and perform multiple recognition passes to determine a recognition output corresponding to the speech input. At least one generic recognition pass is based on a generic speech recognition arrangement using generic modeling of a broad general class of input speech. And at least one adapted recognition pass is based on a speech adapted arrangement using pre-adapted modeling of a specific sub-class of the general class of input speech.
    Type: Application
    Filed: December 8, 2009
    Publication date: September 27, 2012
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Daniel Willett, Lambert Mathias, Chuang He, Jianxiong Wu
  • Patent number: 8275615
    Abstract: A translation method and system include a recognition engine having a plurality of models each being employed to decode a same utterance to provide an output. A model combiner is configured to assign probabilities to each model output and configured to assign weights to the outputs of the plurality of models based on the probabilities to provide a best performing model for the context of the utterance.
    Type: Grant
    Filed: July 13, 2007
    Date of Patent: September 25, 2012
    Assignee: International Business Machines Corporation
    Inventors: Suleyman S. Kozat, Ruhi Sarikaya
  • Patent number: 8271578
    Abstract: A method of transferring data objects over a network comprises intercepting a network transfer message with a passing object, creating a unique identifier for the object using a predetermined function, the same function having been used to provide identifiers for objects stored at predetermined nodes of said network, removing the object and sending on the network transfer message with the unique identifier in place of the object. Then, at the recipient end it is possible to obtain the unique identifier and use it as a key to search for a corresponding object in the local nodes. The search starts with a node closest to the recipient and steadily spreads outwards. The object when found is reattached for the benefit of the recipient and network bandwidth has been saved by the avoidance of redundant transfer since the object is brought to the recipient from the node which is the closest to him.
    Type: Grant
    Filed: December 8, 2005
    Date of Patent: September 18, 2012
    Assignee: B-Obvious Ltd.
    Inventors: Guy Sheffi, Ovadi Somech
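    A minimal sketch of the transfer scheme in the abstract above, using SHA-256 as the predetermined identifier function and plain dicts as the network nodes ordered from closest to farthest; both choices are assumptions for illustration.

      # Sketch of identifier-based object transfer (patent 8271578): the object is
      # replaced by its identifier in transit, and the recipient side searches nearby
      # nodes for a stored copy, reattaching it and saving the redundant transfer.
      import hashlib

      def send(message):
          obj = message.pop("object")
          message["object_id"] = hashlib.sha256(obj).hexdigest()
          return message

      def receive(message, nodes_by_distance):
          for node in nodes_by_distance:             # search the closest node first
              obj = node.get(message["object_id"])
              if obj is not None:
                  return obj                         # reattached without re-transfer
          return None                                # would fall back to a full fetch

      payload = b"large attachment"
      local_node = {hashlib.sha256(payload).hexdigest(): payload}
      msg = send({"to": "bob", "object": payload})
      print(receive(msg, [local_node]) == payload)   # True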
  • Patent number: 8270588
    Abstract: A method and system for automatic incoming call management uses function test results to build call signatures that are stored for later use in incoming call analysis. The function test results are used to compute a suspect score and confidence level associated with each incoming call, and are also used for making incoming call management decisions. A call treatment is selected based on the function test results and/or the computed suspect score and confidence level.
    Type: Grant
    Filed: January 5, 2007
    Date of Patent: September 18, 2012
    Inventor: Ronald Schwartz
  • Patent number: 8271280
    Abstract: A voice recognition apparatus can reduce false recognition caused by matching against phrases composed of a small number of syllables when it performs a recognition process, by a pronunciation unit such as a syllable, on voice data based on voice produced by a speaker, and further performs recognition by a method such as Word Spotting for matching against the phrases stored in the phrase database. The voice recognition apparatus performs a recognition process that compares the result of the recognition process by a pronunciation unit with extended phrases obtained by adding an additional phrase before and/or behind the respective phrases.
    Type: Grant
    Filed: October 1, 2008
    Date of Patent: September 18, 2012
    Assignee: Fujitsu Limited
    Inventor: Kenji Abe
  • Patent number: 8249868
    Abstract: An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed, and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: August 21, 2012
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Trausti Kristjansson
  • Patent number: 8249870
    Abstract: A semi-automatic speech transcription system of the invention leverages the complementary capabilities of human and machine, building a system which combines automatic and manual approaches. With the invention, collected audio data is automatically distilled into speech segments, using signal processing and pattern recognition algorithms. The detected speech segments are presented to a human transcriber using a transcription tool with a streamlined transcription interface, requiring the transcriber to simply “listen and type”. This eliminates the need to manually navigate the audio, coupling the human effort to the amount of speech, rather than the amount of audio. Errors produced by the automatic system can be quickly identified by the human transcriber, which are used to improve the automatic system performance. The automatic system is tuned to maximize the human transcriber efficiency.
    Type: Grant
    Filed: November 12, 2008
    Date of Patent: August 21, 2012
    Assignee: Massachusetts Institute of Technology
    Inventors: Brandon Cain Roy, Deb Kumar Roy
  • Patent number: 8239199
    Abstract: A method includes identifying a first syllable in a first audio of a first word and a second syllable in a second audio of a second word, the first syllable having a first set of properties and the second syllable having a second set of properties; detecting the first syllable in a first instance of the first word in an audio file, the first syllable in the first instance having a third set of properties; determining one or more transformations for transforming the first set of properties to the third set of properties; applying the one or more transformations to the second set of properties of the second syllable to yield a transformed second syllable; and replacing the first syllable in the first instance of the first word with the transformed second syllable in the audio file.
    Type: Grant
    Filed: October 16, 2009
    Date of Patent: August 7, 2012
    Assignee: Yahoo! Inc.
    Inventor: Narayan Lakshmi Bhamidipati
  • Patent number: 8234111
    Abstract: An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed, and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.
    Type: Grant
    Filed: June 14, 2010
    Date of Patent: July 31, 2012
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Trausti Kristjansson
  • Patent number: 8234112
    Abstract: Provided are an apparatus and method for generating a noise adaptive acoustic model, including a noise adaptive discriminative adaptation method. The method includes: generating a baseline model parameter from large-capacity speech training data including various noise environments; and receiving the generated baseline model parameter and applying a discriminative adaptation method to the generated results to generate a migrated acoustic model parameter suitable for the actually applied environment.
    Type: Grant
    Filed: April 25, 2008
    Date of Patent: July 31, 2012
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Byung Ok Kang, Ho Young Jung, Yun Keun Lee
  • Patent number: RE43866
    Abstract: Method for improving TFCI transportation performance, including the steps of (1) coding TFCI information bits to be transported through each radio frame, (2) repeating a TFCI code word produced by the coding an arbitrary number of times, (3) applying puncturing patterns different from each other to the repeated code words, as many patterns as the number of repetitions, and puncturing the repeated code words at locations different from each other, and (4) dividing, inserting, and transporting the punctured fixed-length repeated code words in each slot of the radio frame, thereby improving TFCI information transportation performance and making the receiver-side decoder identical to the case when a 32-bit code word is transported perfectly.
    Type: Grant
    Filed: November 13, 2009
    Date of Patent: December 18, 2012
    Assignee: LG Electronics Inc.
    Inventors: Sung Kwon Hong, Sung Lark Kwon, Young Woo Yun, Ki Jun Kim