Probability Patents (Class 704/240)

Turbo Processing of Speech Recognition

Publication number: 20130006631

Abstract: Environmental recognition systems may improve recognition accuracy by leveraging local and nonlocal features in a recognition target. A local decoder may be used to analyze local features, and a nonlocal decoder may be used to analyze nonlocal features. Local and nonlocal estimates may then be exchanged to improve the accuracy of the local and nonlocal decoders. Additional iterations of analysis and exchange may be performed until a predetermined threshold is reached. In some embodiments, the system may comprise extrinsic information extractors to prevent positive feedback loops from causing the system to adhere to erroneous previous decisions.

Type: Application

Filed: June 28, 2012

Publication date: January 3, 2013

Applicant: UTAH STATE UNIVERSITY

Inventors: Jacob Gunther, Todd Moon
Automatic grammar tuning using statistical language model generation

Patent number: 8346555

Abstract: The present invention discloses a speech processing solution that utilizes an original speech recognition grammar in a speech recognition system to perform speech recognition operations for multiple recognition instances. Instance data associated with the recognition operations can be stored. A replacement grammar can be automatically generated from the stored instance data, where the replacement grammar is a statistical language model grammar. The original speech recognition grammar, which can be a grammar-based language model grammar or a statistical language model grammar, can be selectively replaced with the replacement grammar. For example, when tested performance for the replacement grammar is better than that for the original grammar, the replacement grammar can replace the original grammar.

Type: Grant

Filed: August 22, 2006

Date of Patent: January 1, 2013

Assignee: Nuance Communications, Inc.

Inventor: Brent D. Metz
Information processing apparatus, information processing method, and program

Patent number: 8332208

Abstract: An information processing apparatus includes: morphological analysis means for performing morphological analysis on a text document; managing means for managing a connection pattern indicating a connection relationship of a morpheme of a predetermined part of speech; and extracting means extracting, from a string of morphemes obtained by performing morphological analysis by the morphological analysis means, a phrase including a plurality of morphemes having a same connection relationship as the connection relationship indicated by the connection pattern managed by the managing means.

Type: Grant

Filed: September 3, 2008

Date of Patent: December 11, 2012

Assignee: Sony Corporation

Inventor: Mitsuhiro Miyazaki
Viterbi decoder and speech recognition method using same using non-linear filter for observation probabilities

Patent number: 8332222

Abstract: A Viterbi decoder includes: an observation vector sequence generator for generating an observation vector sequence by converting an input speech to a sequence of observation vectors; a local optimal state calculator for obtaining a partial state sequence having a maximum similarity up to a current observation vector as an optimal state; an observation probability calculator for obtaining, as a current observation probability, a probability for observing the current observation vector in the optimal state; a buffer for storing therein a specific number of previous observation probabilities; a non-linear filter for calculating a filtered probability by using the previous observation probabilities stored in the buffer and the current observation probability; and a maximum likelihood calculator for calculating a partial maximum likelihood by using the filtered probability.
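The buffer and non-linear filter in this abstract can be illustrated with a small sketch. The abstract does not say which non-linear filter is used; a median filter over log probabilities is assumed here purely for illustration, and the class name `ObservationFilter` and the `history` parameter are hypothetical.

```python
from collections import deque
from statistics import median


class ObservationFilter:
    """Median-filter the current observation log probability against a
    short buffer of previous ones (the median is an assumed choice)."""

    def __init__(self, history=2):
        # Buffer holding a fixed number of previous observation probabilities.
        self.buffer = deque(maxlen=history)

    def filter(self, current_log_prob):
        # Combine buffered values with the current one, then store the
        # raw current value for use in later frames.
        filtered = median([*self.buffer, current_log_prob])
        self.buffer.append(current_log_prob)
        return filtered


def partial_log_likelihood(log_probs, history=2):
    """Accumulate filtered log probabilities along a state sequence."""
    obs_filter = ObservationFilter(history)
    return sum(obs_filter.filter(p) for p in log_probs)
```

With a median, a single outlying frame (for example a burst of noise) has limited influence on the accumulated likelihood, which is one plausible motivation for filtering before the maximum likelihood step.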

Type: Grant

Filed: July 21, 2009

Date of Patent: December 11, 2012

Assignee: Electronics and Telecommunications Research Institute

Inventors: Hoon Chung, Jeon Gue Park, Yunkeun Lee, Ho-Young Jung, Hyung-Bae Jeon, Jeom Ja Kang, Sung Joo Lee, Euisok Chung, Ji Hyun Wang, Byung Ok Kang, Ki-young Park, Jong Jin Kim
Large language models in machine translation

Patent number: 8332207

Abstract: Systems, methods, and computer program products for machine translation are provided. In some implementations a system is provided. The system includes a language model including a collection of n-grams from a corpus, each n-gram having a corresponding relative frequency in the corpus and an order n corresponding to a number of tokens in the n-gram, each n-gram corresponding to a backoff n-gram having an order of n-1 and a collection of backoff scores, each backoff score associated with an n-gram, the backoff score determined as a function of a backoff factor and a relative frequency of a corresponding backoff n-gram in the corpus.
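The backoff scoring described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and the backoff factor of 0.4 is an assumed value that does not come from the abstract.

```python
from collections import Counter


def count_ngrams(tokens, max_order):
    """Count every n-gram of order 1..max_order in a token list."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def backoff_score(ngram, counts, total_tokens, backoff_factor=0.4):
    """Score an n-gram by relative frequency; if it is unseen, back off
    to the (n-1)-gram and multiply by the backoff factor."""
    ngram = tuple(ngram)
    penalty = 1.0
    while len(ngram) > 1:
        if counts[ngram] > 0:
            # Relative frequency of the n-gram given its context.
            return penalty * counts[ngram] / counts[ngram[:-1]]
        ngram = ngram[1:]          # drop the oldest token
        penalty *= backoff_factor  # one penalty per backoff step
    # Unigram case: relative frequency in the whole corpus.
    return penalty * counts[ngram] / total_tokens
```

For example, with the corpus `the cat sat on the mat`, the seen bigram `("the", "cat")` is scored by its relative frequency, while the unseen bigram `("sat", "the")` falls back to the unigram frequency of `the` scaled by the backoff factor.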

Type: Grant

Filed: June 22, 2007

Date of Patent: December 11, 2012

Assignee: Google Inc.

Inventors: Thorsten Brants, Ashok C. Popat, Peng Xu, Franz J. Och, Jeffrey Dean
System and method of semi-supervised learning for spoken language understanding using semantic role labeling

Patent number: 8321220

Abstract: A system and method are disclosed for providing semi-supervised learning for a spoken language understanding module using semantic role labeling. The method embodiment relates to a method of generating a spoken language understanding module. Steps in the method comprise selecting at least one predicate/argument pair as an intent from a set of the most frequent predicate/argument pairs for a domain, labeling training data using mapping rules associated with the selected at least one predicate/argument pair, training a call-type classification model using the labeled training data, re-labeling the training data using the call-type classification model, and iteratively repeating several of the above steps until the training set labels converge.

Type: Grant

Filed: November 30, 2005

Date of Patent: November 27, 2012

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
Rescoring speech recognition hypothesis using prosodic likelihood

Patent number: 8315870

Abstract: A distance calculation unit (16) obtains the acoustic distance between the feature amount of input speech and each phonetic model. A word search unit (17) performs a word search based on the acoustic distance and a language model including the phoneme and prosodic label of a word, and outputs a word hypothesis and a first score representing the likelihood of the word hypothesis. The word search unit (17) also outputs a vowel interval and its tone label in the input speech, when assuming that the recognition result of the input speech is the word hypothesis. A tone recognition unit (21) outputs a second score representing the likelihood of the tone label output from the word search unit (17) based on a feature amount corresponding to the vowel interval output from the word search unit (17). A rescore unit (22) corrects the first score of the word hypothesis output from the word search unit (17) using the second score output from the tone recognition unit (21).

Type: Grant

Filed: August 22, 2008

Date of Patent: November 20, 2012

Assignee: NEC Corporation

Inventor: Ken Hanazawa
Automatic speech recognition method and apparatus

Patent number: 8311825

Abstract: A system for calculating the look ahead probabilities at the nodes in a language model look ahead tree, wherein the words of the vocabulary of the language are located at the leaves of the tree, said system comprising: means to assign a language model probability to each of the words of the vocabulary using a first low order language model; means to calculate the language look ahead probabilities for all nodes in said tree using said first language model; means to determine if the language model probability of one or more words of said vocabulary can be calculated using a higher order language model and to update said words with the higher order language model; and means to update the look ahead probability at only the nodes which are affected by the words where the language model has been updated.
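The selective update described in this abstract can be sketched with a small prefix tree. The sketch assumes that a node's look-ahead probability is the maximum over the words reachable below it, which is a common convention but is not stated in the abstract; all names (`Node`, `build_tree`, `update_word`) are hypothetical.

```python
class Node:
    """One node of the look-ahead tree; words sit on the leaves."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = {}
        self.lookahead = 0.0


def build_tree(lexicon, word_probs):
    """lexicon maps each word to its unit sequence (letters here);
    word_probs holds the low-order language model probabilities."""
    root, leaves = Node(), {}
    for word, units in lexicon.items():
        node = root
        for unit in units:
            if unit not in node.children:
                node.children[unit] = Node(node)
            node = node.children[unit]
        leaf = Node(node)
        leaf.lookahead = word_probs[word]
        node.children[("<end>", word)] = leaf
        leaves[word] = leaf
    return root, leaves


def compute_lookahead(node):
    """Full pass: each internal node takes the max over its children."""
    if node.children:
        node.lookahead = max(compute_lookahead(c) for c in node.children.values())
    return node.lookahead


def update_word(leaf, new_prob):
    """Apply a higher-order probability to one word and refresh only the
    ancestors whose look-ahead value actually changes."""
    leaf.lookahead = new_prob
    touched = 1
    node = leaf.parent
    while node is not None:
        best = max(c.lookahead for c in node.children.values())
        if best == node.lookahead:
            break  # nothing above this node is affected
        node.lookahead = best
        touched += 1
        node = node.parent
    return touched
```

The return value of `update_word` counts the nodes that were rewritten, which makes it easy to see that an update often stops well below the root.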

Type: Grant

Filed: October 3, 2008

Date of Patent: November 13, 2012

Assignee: Kabushiki Kaisha Toshiba

Inventor: Langzhou Chen
Discriminative training of language models for text and speech classification

Patent number: 8306818

Abstract: Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.

Type: Grant

Filed: April 15, 2008

Date of Patent: November 6, 2012

Assignee: Microsoft Corporation

Inventors: Ciprian Chelba, Alejandro Acero, Milind Mahajan
DISAMBIGUATION OF CONTACT INFORMATION USING HISTORICAL DATA

Publication number: 20120278076

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information are described. A method includes determining, for each of multiple communications that were initiated by a user of a mobile device, a time when the communication was initiated or received; determining, for each of multiple contacts associated with the user, a probability associated with the contact based at least on the times when the communications were initiated or received; weighting a contact disambiguation grammar according to the probabilities; and processing audio data using the contact disambiguation grammar to select a particular contact.

Type: Application

Filed: July 10, 2012

Publication date: November 1, 2012

Applicant: GOOGLE INC.

Inventors: Matthew I. Lloyd, Willard Van Tuyl Rusch, II
System and method for discriminative pronunciation modeling for voice search

Patent number: 8296141

Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by (1) identifying word and phone alignments and corresponding likelihood scores, and (2) discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function.

Type: Grant

Filed: November 19, 2008

Date of Patent: October 23, 2012

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje
SPEAKER STATE DETECTING APPARATUS AND SPEAKER STATE DETECTING METHOD

Publication number: 20120253807

Abstract: A speaker state detecting apparatus comprises: an audio input unit for acquiring, at least, a first voice emanated by a first speaker and a second voice emanated by a second speaker; a speech interval detecting unit for detecting an overlap period between a first speech period of the first speaker included in the first voice and a second speech period of the second speaker included in the second voice, which starts before the first speech period, or an interval between the first speech period and the second speech period; a state information extracting unit for extracting state information representing a state of the first speaker from the first speech period; and a state detecting unit for detecting the state of the first speaker in the first speech period based on the overlap period or the interval and the first state information.

Type: Application

Filed: February 3, 2012

Publication date: October 4, 2012

Applicant: FUJITSU LIMITED

Inventor: Akira KAMANO
Model weighting, selection and hypotheses combination for automatic speech recognition and machine translation

Patent number: 8275615

Abstract: A translation method and system include a recognition engine having a plurality of models each being employed to decode a same utterance to provide an output. A model combiner is configured to assign probabilities to each model output and configured to assign weights to the outputs of the plurality of models based on the probabilities to provide a best performing model for the context of the utterance.

Type: Grant

Filed: July 13, 2007

Date of Patent: September 25, 2012

Assignee: International Business Machines Corporation

Inventors: Suleyman S. Kozat, Ruhi Sarikaya
System for detecting speech interval and recognizing continuous speech in a noisy environment through real-time recognition of call commands

Patent number: 8275616

Abstract: The present invention relates to a continuous speech recognition system that is very robust in a noisy environment. In order to recognize continuous speech smoothly in a noisy environment, the system selects call commands, configures a minimum recognition network in token, which consists of the call commands and mute intervals including noises, recognizes the inputted speech continuously in real time, analyzes the reliability of speech recognition continuously and recognizes the continuous speech from a speaker. When a speaker delivers a call command, the system for detecting the speech interval and recognizing continuous speech in a noisy environment through the real-time recognition of call commands measures the reliability of the speech after recognizing the call command, and recognizes the speech from the speaker by transferring the speech interval following the call command to a continuous speech-recognition engine at the moment when the system recognizes the call command.

Type: Grant

Filed: April 22, 2009

Date of Patent: September 25, 2012

Assignee: KoreaPowerVoice Co., Ltd.

Inventors: Heui-Suck Jung, Se-Hoon Chin, Tae-Young Roh
AUTOMATIC SPOKEN LANGUAGE IDENTIFICATION BASED ON PHONEME SEQUENCE PATTERNS

Publication number: 20120232901

Abstract: A language identification system that includes a universal phoneme decoder (UPD) is described. The UPD contains a universal phoneme set that both 1) represents all phonemes occurring in the set of two or more spoken languages, and 2) captures phoneme correspondences across languages, such that a set of unique phoneme patterns and probabilities is calculated in order to identify the most likely phoneme occurring each time in the audio files in the set of two or more potential languages on which the UPD was trained. Each statistical language model (SLM) uses the set of unique phoneme patterns created for each language in the set to distinguish between spoken human languages in the set of languages. The run-time language identifier module identifies a particular human language being spoken by utilizing the linguistic probabilities supplied by the SLMs that are based on the set of unique phoneme patterns created for each language.

Type: Application

Filed: May 24, 2012

Publication date: September 13, 2012

Applicant: Autonomy Corporation Ltd.

Inventors: Mahapathy Kadirkamanathan, Christopher John Waple
Signal processing method and processor

Patent number: 8255214

Abstract: A first signal of two signals to be compared for similarity is divided into small areas and one small area is selected for calculating the correlation with a second signal using a correlative method. Then, the quantity of translation, expansion rate and similarity in an area where the similarity, which is the square of the correlation value, reaches its maximum, are found. Values based on the similarity are integrated at a position represented by the quantity of translation and expansion rate. Similar processing is performed with respect to all the small areas, and at a peak where the maximum integral value of the similarity is obtained, its magnitude is compared with a threshold value to evaluate the similarity. The small area voted for that peak can be extracted.

Type: Grant

Filed: October 15, 2002

Date of Patent: August 28, 2012

Assignee: Sony Corporation

Inventors: Mototsugu Abe, Masayuki Nishiguchi
Language understanding device

Patent number: 8244522

Abstract: A language understanding device includes: a language understanding model storing unit configured to store word transition data including pre-transition states, input words, predefined outputs corresponding to the input words, word weight information, and post-transition states, and concept weighting data including concepts obtained from language understanding results for at least one word, and concept weight information corresponding to the concepts; a finite state transducer processing unit configured to output understanding result candidates including the predefined outputs, to accumulate word weights so as to obtain a cumulative word weight, and to sequentially perform state transition operations; a concept weighting processing unit configured to accumulate concept weights so as to obtain a cumulative concept weight; and an understanding result determination unit configured to determine an understanding result from the understanding result candidates by referring to the cumulative word weight and the cumul

Type: Grant

Filed: May 20, 2008

Date of Patent: August 14, 2012

Assignee: Honda Motor Co., Ltd.

Inventors: Mikio Nakano, Hiroshi Okuno, Kazunori Komatani, Yuichiro Fukubayashi, Kotaro Funakoshi
Supplier deduplication engine

Patent number: 8234107

Abstract: Disclosed herein is a method of grouping similar supplier names together in a database. The syntactical errors in the supplier names are corrected. The supplier names are grouped after correcting the syntactical errors. The abbreviations in the supplier names are captured. The ordering, pronunciation and stemming errors in the supplier names are corrected. A matching algorithm that matches and compares two supplier names is applied that comprises the steps of grouping supplier names based on a first set of characters in the supplier names and calculating a matching score between the two supplier names using the Levenshtein distance between them, along with the supplier names' sound codes obtained from a modified metaphone algorithm, length of each word, position of matching and mismatching characters, and stem of words in the supplier names. The matching scores are compared with set thresholds in order to further group the supplier names into clusters.
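The Levenshtein part of the matching score can be sketched as below. This is a simplified illustration only: it omits the sound codes, word lengths, character positions and stems mentioned in the abstract, and the threshold of 0.8, the prefix length of 3 and all function names are assumptions.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def matching_score(name_a, name_b):
    """Similarity in [0, 1]: 1 minus the length-normalized edit distance."""
    a, b = name_a.lower().strip(), name_b.lower().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def group_names(names, threshold=0.8, prefix_len=3):
    """Bucket names by their first characters, then greedily cluster
    names within a bucket whose score meets the threshold."""
    buckets = {}
    for name in names:
        buckets.setdefault(name.lower()[:prefix_len], []).append(name)
    clusters = []
    for bucket in buckets.values():
        bucket_clusters = []
        for name in bucket:
            for cluster in bucket_clusters:
                if matching_score(name, cluster[0]) >= threshold:
                    cluster.append(name)
                    break
            else:
                bucket_clusters.append([name])
        clusters.extend(bucket_clusters)
    return clusters
```

Bucketing by prefix first keeps the number of pairwise comparisons small, since only names sharing their leading characters are compared.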

Type: Grant

Filed: February 12, 2008

Date of Patent: July 31, 2012

Assignee: Ketera Technologies, Inc.

Inventor: Ram Dayal Goyal
Creating a mixed-initiative grammar from directed dialog grammars

Patent number: 8229745

Abstract: A method of building a mixed-initiative grammar can include receiving one or more conjoin phrases, wherein each conjoin phrase is associated with a selected one of the plurality of directed dialog grammars, and receiving a user input specifying a selected grammar generation technique. The mixed-initiative grammar can be automatically generated, in accordance with the selected grammar generation technique, such that the mixed-initiative grammar specifies an allowable ordering of sets when interpreting a user spoken utterance and whether duplicative phrases are allowable within the user spoken utterance.

Type: Grant

Filed: October 21, 2005

Date of Patent: July 24, 2012

Assignee: Nuance Communications, Inc.

Inventors: Soonthorn Ativanichayaphong, David Jaramillo, Gerald M. McCobb
Spell-check for a keyboard system with automatic correction

Patent number: 8225203

Abstract: User input is received, specifying a continuous traced path across a keyboard presented on a touch sensitive display. An input sequence is resolved, including traced keys and auxiliary keys proximate to the traced keys by prescribed criteria. For each of one or more candidate entries of a prescribed vocabulary, a set-edit-distance metric is computed between said input sequence and the candidate entry. Various rules specify when penalties are imposed, or not, in computing the set-edit-distance metric. Candidate entries are ranked and displayed according to the computed metric.

Type: Grant

Filed: November 4, 2010

Date of Patent: July 17, 2012

Assignee: Nuance Communications, Inc.

Inventor: Erland Unruh
METHOD AND APPARATUS FOR RECOGNIZING SPEECH

Publication number: 20120166194

Abstract: Disclosed herein are an apparatus and method for recognizing speech. The apparatus includes a frame-based speech recognition unit, a segment division unit, a segment feature extraction unit, a segment speech recognition performance unit, and a combination and synchronization unit. The frame-based speech recognition unit extracts frame speech feature vectors from a speech signal, and performs speech recognition on frames of the speech signal using the frame speech feature vectors and a frame-based probability model. The segment division unit divides the speech signal into segments. The segment feature extraction unit extracts segment speech feature vectors around a boundary between the segments. The segment speech recognition performance unit performs speech recognition on the segments of the speech signal using the segment speech feature vectors and a segment-based probability model.

Type: Application

Filed: December 22, 2011

Publication date: June 28, 2012

Applicant: Electronics and Telecommunications Research Institute

Inventors: Ho-Young JUNG, Jeon-Gue PARK, Hoon CHUNG
STATE DETECTION DEVICE AND STATE DETECTING METHOD

Publication number: 20120166195

Abstract: A state detection device includes: a first model generation unit to generate a first specific speaker model obtained by modeling speech features of a specific speaker in an undepressed state; a second model generation unit to generate a second specific speaker model obtained by modeling speech features of the specific speaker in the depressed state; a likelihood calculation unit to calculate a first likelihood as a likelihood of the first specific speaker model with respect to input voice, and a second likelihood as a likelihood of the second specific speaker model with respect to the input voice; and a state determination unit to determine a state of the speaker of the input voice using the first likelihood and the second likelihood.

Type: Application

Filed: October 5, 2011

Publication date: June 28, 2012

Applicant: FUJITSU LIMITED

Inventors: Shoji HAYAKAWA, Naoshi Matsuo
Pattern identification method, apparatus, and program

Patent number: 8209172

Abstract: Pattern recognition capable of robust identification for the variance of an input pattern is performed with a low processing cost while the possibility of identification errors is decreased. In a pattern recognition apparatus which identifies the pattern of input data from a data input unit (11) by using a hierarchical feature extraction processor (12) which hierarchically extracts features, an extraction result distribution analyzer (13) analyzes a distribution of at least one feature extraction result obtained by a primary feature extraction processor (121). On the basis of the analytical result, a secondary feature extraction processor (122) performs predetermined secondary feature extraction.

Type: Grant

Filed: December 16, 2004

Date of Patent: June 26, 2012

Assignee: Canon Kabushiki Kaisha

Inventors: Yusuke Mitarai, Masakazu Matsuga, Katsuhiko Mori
System and method for building emotional machines

Patent number: 8204749

Abstract: A system, method and computer-readable medium for practicing a method of emotion detection during a natural language dialog between a human and a computing device are disclosed. The method includes receiving an utterance from a user in a natural language dialog, receiving contextual information regarding the natural language dialog which is related to changes of emotion over time in the dialog, and detecting an emotion of the user based on the received contextual information. Examples of contextual information include differential statistics, joint statistics and distance statistics.

Type: Grant

Filed: March 21, 2011

Date of Patent: June 19, 2012

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Dilek Z. Hakkani-Tur, Jackson J. Liscombe, Guiseppe Riccardi
Open class noun classification

Patent number: 8195458

Abstract: A method of semantically classifying a data set of open class nouns and a system for executing the method. The method includes loading, by a processing device, a data set comprising one or more open class nouns from a computer readable medium operably connected to the processing device; extracting, by the processing device, the one or more open class nouns from the data set; for each open class noun, querying, by the processing device, one or more application programming interfaces (APIs) to produce one or more results; deriving, by the processing device, a confidence score for the data set based upon the one or more results; and determining, by the processing device, a classification for the data set based upon the derived confidence score.

Type: Grant

Filed: August 17, 2010

Date of Patent: June 5, 2012

Assignee: Xerox Corporation

Inventors: Michael David Shepherd, Kirk J. Ocke, Barry Glynn Gombert, Dale Ellen Gaucas
Multi-infrastructure modeling system

Patent number: 8195436

Abstract: A system for simulating interdependencies between multiple critical physical infrastructure models, including a first infrastructure data model that models a first critical physical infrastructure, a second infrastructure data model that models a second critical physical infrastructure, wherein the second critical physical infrastructure is a different critical physical infrastructure from the first critical physical infrastructure, a simulation engine including a visualization application and adapted to automatically produce a change in the second infrastructure data model in response to a change in the first infrastructure data model, and a user interface permitting a user to interact with the simulation engine.

Type: Grant

Filed: January 11, 2010

Date of Patent: June 5, 2012

Assignee: Intepoint, LLC

Inventors: William J Tolone, Bei-tseng Chu
Augmentation and calibration of output from non-deterministic text generators by modeling its characteristics in specific environments

Patent number: 8195459

Abstract: Outputs of an automatic probabilistic event detection system, such as a fact extraction system, a speech-to-text engine or an automatic character recognition system, are matched with comparable results produced manually or by a different system. This comparison allows statistical modeling of the run-time behavior of the event detection system. This model can subsequently be used to give supplemental or replacement data for an output sequence of the system. In particular, the model can effectively calibrate the system for use with data of a particular statistical nature.

Type: Grant

Filed: September 6, 2010

Date of Patent: June 5, 2012

Assignee: Verint Americas, Inc.

Inventor: Michael Brand
Apparatus and method for sound recognition in portable device

Patent number: 8195455

Abstract: Provided are an apparatus and a method capable of recognizing a sound through a reduced burden of computations and a noise-tolerant technique. The sound recognition apparatus in a portable device includes a memory unit that stores at least one base sound and a sound input unit that receives a sound input. The sound recognition apparatus also includes a control unit that receives the sound input from the sound input unit, extracts peak values of the sound input, calculates statistical data by using the peak values, and determines whether the sound input is equal to a base sound by using the statistical data.

Type: Grant

Filed: February 18, 2009

Date of Patent: June 5, 2012

Assignee: Samsung Electronics Co., Ltd

Inventor: Hyun Soo Kim
Method and system for using input signal quality in speech recognition

Patent number: 8190430

Abstract: A method and system for using input signal quality in an automatic speech recognition system. The method includes measuring the quality of an input signal into a speech recognition system and varying a rejection threshold of the speech recognition system at runtime in dependence on the measurement of the input signal quality. If the measurement of the input signal quality is low, the rejection threshold is reduced and, if the measurement of the input signal quality is high, the rejection threshold is increased. The measurement of the input signal quality may be based on one or more of the measurements of signal-to-noise ratio, loudness, including clipping, and speech signal duration.
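The threshold adjustment described in this abstract can be sketched as follows, using signal-to-noise ratio as the only quality measurement. The SNR cut-offs, the step size and the function names are illustrative assumptions, not values from the patent.

```python
def adjust_rejection_threshold(base_threshold, snr_db,
                               low_snr_db=10.0, high_snr_db=25.0, step=0.1):
    """Lower the rejection threshold for poor input, raise it for good input."""
    if snr_db < low_snr_db:
        threshold = base_threshold - step
    elif snr_db > high_snr_db:
        threshold = base_threshold + step
    else:
        threshold = base_threshold
    # Keep the threshold inside the valid confidence range.
    return min(1.0, max(0.0, threshold))


def accept(confidence, base_threshold, snr_db):
    """Accept a recognition result if its confidence meets the
    quality-adjusted threshold."""
    return confidence >= adjust_rejection_threshold(base_threshold, snr_db)
```

Under these assumed settings, a result with confidence 0.45 is accepted on a noisy input (threshold lowered to 0.4) but rejected on a clean one (threshold raised to 0.6).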

Type: Grant

Filed: August 9, 2011

Date of Patent: May 29, 2012

Assignee: Nuance Communications, Inc.

Inventors: John Doyle, John Brian Pickering
Personalized voice activity detection

Patent number: 8175874

Abstract: A method of transferring a real-time audio signal transmission, including: registering voice patterns (or other characteristics) of one or more users to be used to identify the voices of the users, accepting an audio signal as it is created as a sequence of segments, analyzing each segment of the accepted audio signal to determine if it contains voice activity (314), determining a probability level that the voice activity of the segment is of a registered user (320 & 322); and selectively transferring the contents of a segment responsive to the determined probability level (324).

Type: Grant

Filed: July 18, 2006

Date of Patent: May 8, 2012

Inventor: Shaul Shimhi
Representing n-gram language models for compact storage and fast retrieval

Patent number: 8175878

Abstract: Systems, methods, and apparatuses, including computer program products, are provided for representing language models. In some implementations, a computer-implemented method is provided. The method includes generating a compact language model including receiving a collection of n-grams from the corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus and generating a trie representing the collection of n-grams. The method also includes using the language model to identify a second probability of a particular string of words occurring.
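A trie over n-grams, as mentioned in this abstract, can be sketched in a few lines. This shows only the basic idea of sharing prefixes between n-grams; the compact encoding that the patent is concerned with is not reproduced, and the names `TrieNode`, `build_trie` and `lookup` are hypothetical.

```python
class TrieNode:
    """A trie node keyed by token; prob is set where an n-gram ends."""
    __slots__ = ("children", "prob")

    def __init__(self):
        self.children = {}
        self.prob = None


def build_trie(ngram_probs):
    """Build a trie from a mapping of n-gram tuples to probabilities."""
    root = TrieNode()
    for ngram, prob in ngram_probs.items():
        node = root
        for token in ngram:
            node = node.children.setdefault(token, TrieNode())
        node.prob = prob
    return root


def lookup(root, ngram):
    """Return the stored probability of an n-gram, or None if absent."""
    node = root
    for token in ngram:
        node = node.children.get(token)
        if node is None:
            return None
    return node.prob
```

Because `("the",)` and `("the", "cat")` share the node for `the`, each distinct prefix is stored once, which is the source of the space saving.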

Type: Grant

Filed: December 14, 2010

Date of Patent: May 8, 2012

Assignee: Google Inc.

Inventors: Ciprian Chelba, Thorsten Brants
DATA RETRIEVAL AND INDEXING METHOD AND APPARATUS

Publication number: 20120109651

Abstract: A method of searching a plurality of data files, wherein each data file includes a plurality of features. The method: determines a plurality of feature groups, wherein each feature group includes n features and n is an integer of 2 or more; expresses each data file as a file vector, wherein each component of the vector indicates the frequency of a feature group within the data file, wherein the n features which constitute a feature group do not have to be located adjacent to one another; expresses a search query using the feature groups as a vector; and searches the plurality of data files by comparing the search query expressed as a vector with the file vectors.
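The vector comparison in this abstract can be sketched for feature groups of n = 2. The sketch assumes that features are words, that a group is an unordered pair co-occurring within a small window (so the two features need not be adjacent), and that vectors are compared by cosine similarity; none of these choices is specified in the abstract, and the function names are hypothetical.

```python
import math
from collections import Counter


def feature_group_vector(tokens, window=4):
    """Count unordered word pairs that co-occur within `window` tokens."""
    vec = Counter()
    for i in range(len(tokens)):
        span = tokens[i:i + window]
        for other in span[1:]:
            vec[tuple(sorted((span[0], other)))] += 1
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def search(query_tokens, files, window=4):
    """Rank files (name -> token list) by similarity to the query."""
    query_vec = feature_group_vector(query_tokens, window)
    scored = [(cosine(query_vec, feature_group_vector(toks, window)), name)
              for name, toks in files.items()]
    return sorted(scored, reverse=True)
```

Counting pairs rather than single words lets the query `language model` match a file only where the two words occur near each other, not merely somewhere in the same file.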

Type: Application

Filed: April 16, 2009

Publication date: May 3, 2012

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventor: Langzhou Chen
Comparing events in word spotting

Patent number: 8170873

Abstract: An approach to comparing events in word spotting, such as comparing putative and reference instances of a keyword, makes use of a set of models of subword units. For each of two acoustic events and for each of a series of times in each of the events, a probability associated with each of the models of the set of subword units is computed. Then, a quantity characterizing a comparison of the two acoustic events, one occurring in each of the two acoustic signals, is computed using the computed probabilities associated with each of the models.

Type: Grant

Filed: July 22, 2004

Date of Patent: May 1, 2012

Assignee: Nexidia Inc.

Inventor: Robert W. Morris
MULTI-STATE BARGE-IN MODELS FOR SPOKEN DIALOG SYSTEMS

Publication number: 20120101820

Abstract: A method is disclosed for applying a multi-state barge-in acoustic model in a spoken dialogue system. The method includes receiving an audio speech input from the user during the presentation of a prompt, accumulating the audio speech input from the user, applying a non-speech component having at least two one-state Hidden Markov Models (HMMs) to the audio speech input from the user, applying a speech component having at least five three-state HMMs to the audio speech input from the user, in which each of the five three-state HMMs represents a different phonetic category, determining whether the audio speech input is a barge-in-speech input from the user, and if the audio speech input is determined to be the barge-in-speech input from the user, terminating the presentation of the prompt.

Type: Application

Filed: October 24, 2011

Publication date: April 26, 2012

Applicant: AT&T Intellectual Property I, L.P.

Inventor: Andrej Ljolje
Confidence measure generation for speech related searching

Patent number: 8165877

Abstract: A voice search system has a speech recognizer, a search component, and a dialog manager. A confidence measure generator receives speech recognition features from the speech recognizer, search features from the search component, and dialog features from the dialog manager, and calculates an overall confidence measure for voice search results based upon the features received. The invention can be extended to include the generation of additional features, based on those received from the individual components of the voice search system.

Type: Grant

Filed: August 3, 2007

Date of Patent: April 24, 2012

Assignee: Microsoft Corporation

Inventors: Ye-Yi Wang, Yun-Cheng Ju, Dong Yu
FRONT-END PROCESSOR FOR SPEECH RECOGNITION, AND SPEECH RECOGNIZING APPARATUS AND METHOD USING THE SAME

Publication number: 20120095762

Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting at least one frame from among the frames that are previously positioned with respect to a frame of the first speech.

Type: Application

Filed: October 19, 2011

Publication date: April 19, 2012

Applicants: SEOUL NATIONAL UNIVERSITY INDUSTRY FOUNDATION, SAMSUNG ELECTRONICS CO., LTD.

Inventors: Ki-wan EOM, Chang-woo HAN, Tae-gyoon KANG, Nam-soo KIM, Doo-hwa HONG, Jae-won LEE, Hyung-joon LIM
Grammar weighting voice recognition information

Patent number: 8145485

Abstract: A device receives a voice recognition statistic from a voice recognition application and applies a grammar improvement rule based on the voice recognition statistic. The device also automatically adjusts a weight of the voice recognition statistic based on the grammar improvement rule, and outputs the weight adjusted voice recognition statistic for use in the voice recognition application.

Type: Grant

Filed: April 29, 2011

Date of Patent: March 27, 2012

Assignee: Verizon Patent and Licensing Inc.

Inventor: Kevin W. Brown
Speech processing with predictive language modeling

Patent number: 8145484

Abstract: The described implementations relate to speech spelling by a user. One method identifies one or more symbols that may match a user utterance and displays an individual symbol for confirmation by the user.

Type: Grant

Filed: November 11, 2008

Date of Patent: March 27, 2012

Assignee: Microsoft Corporation

Inventor: Geoffrey Zweig
FULL-SEQUENCE TRAINING OF DEEP STRUCTURES FOR SPEECH RECOGNITION

Publication number: 20120072215

Abstract: A method is disclosed herein that includes an act of causing a processor to access a deep-structured model retained in a computer-readable medium, wherein the deep-structured model comprises a plurality of layers with weights assigned thereto, transition probabilities between states, and language model scores. The method can further include the act of jointly substantially optimizing the weights, the transition probabilities, and the language model scores of the deep-structured model using the optimization criterion based on a sequence rather than a set of unrelated frames.

Type: Application

Filed: September 21, 2010

Publication date: March 22, 2012

Applicant: Microsoft Corporation

Inventors: Dong Yu, Li Deng, Abdel-rahman Samir Abdel-rahman Mohamed
AGE DETERMINATION USING SPEECH

Publication number: 20120072216

Abstract: A method and device are configured to receive voice data from a user and perform speech recognition on the received voice data. A confidence score is calculated that represents the likelihood that received voice data has been accurately recognized. A likely age range associated with the user is determined based on the confidence score.

Type: Application

Filed: November 30, 2011

Publication date: March 22, 2012

Applicant: VERIZON PATENT AND LICENSING INC.

Inventor: Kevin R. Witzman
Hidden Markov model (“HMM”)-based user authentication using keystroke dynamics

Patent number: 8136154

Abstract: Hidden Markov Models (“HMMs”) are used to analyze keystroke dynamics measurements collected as a user types a predetermined string on a keyboard. A user enrolls by typing the predetermined string several times; the enrollment samples are used to train an HMM to identify the user. A candidate who claims to be the user provides a typing sample, and the HMM produces a probability to estimate the likelihood that the candidate is the user he claims to be. A computationally-efficient method for preparing HMMs to analyze certain types of processes is also described.

Type: Grant

Filed: May 6, 2008

Date of Patent: March 13, 2012

Assignees: The Penn State Foundation, Louisiana Tech University Research Foundation

Inventors: Vir V. Phoha, Shashi Phoha, Asok Ray, Shrijit Sudhakar Joshi, Sampath Kumar Vuyyuru
Summarization systems and methods

Patent number: 8135699

Abstract: A server-side summarization system includes a function for acquiring material to be summarized, along with source information about the material, a converter for converting the acquired material to machine-readable form, if not in that form when acquired, a summarizer for creating a summary from the acquired material, and a storage function for storing a copy of the acquired material and the summary created as separate files, associated and cross-referenced using the source information.

Type: Grant

Filed: June 21, 2006

Date of Patent: March 13, 2012

Inventors: Puneet K. Gupta, Mark A. Boys
Method and apparatus for estimating noise by using harmonics of voice signal

Patent number: 8135586

Abstract: Disclosed is a method and an apparatus for estimating noise included in a sound signal during sound signal processing. The method includes estimating harmonics components in a frame of an input sound signal; using the estimated harmonics components, computing a Voice Presence Probability (VPP) on the frame of the input sound signal; determining a weight of an equation necessary to estimate a noise spectrum, depending on the computed VPP; and using the determined weight and the equation necessary to estimate a noise spectrum, estimating the noise spectrum, and updating the noise spectrum.

Type: Grant

Filed: March 21, 2008

Date of Patent: March 13, 2012

Assignees: Samsung Electronics Co., Ltd, Korea University Industrial & Academic Collaboration Foundation

Inventors: Hyun-Soo Kim, Hanseok Ko, Sung-Joo Ahn, Jounghoon Beh, Hyun-Jin Yoon
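
One common way to make a noise update depend on a voice presence probability, as the abstract above (patent 8135586) describes in general terms, is recursive averaging with a VPP-dependent smoothing weight. The Python sketch below assumes that form. The update equation, the `alpha_min` parameter, and the function name are assumptions for illustration, and the harmonics-based VPP computation itself is not shown.

```python
from typing import List, Sequence


def update_noise_spectrum(prev_noise: Sequence[float],
                          frame_power: Sequence[float],
                          vpp: float,
                          alpha_min: float = 0.85) -> List[float]:
    """Update a per-bin noise power estimate for one frame.

    The smoothing weight grows with the voice presence probability, so the
    estimate is nearly frozen when speech is likely (vpp near 1) and tracks
    the input more quickly when speech is unlikely (vpp near 0).
    """
    if not 0.0 <= vpp <= 1.0:
        raise ValueError("vpp must lie in [0, 1]")
    alpha = alpha_min + (1.0 - alpha_min) * vpp  # assumed weighting rule
    return [alpha * n + (1.0 - alpha) * p
            for n, p in zip(prev_noise, frame_power)]
```

With `vpp = 0.0` and the default `alpha_min`, a bin with previous noise power 1.0 and frame power 2.0 moves to about 1.15; with `vpp = 1.0` it stays at 1.0.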
Speech detection

Patent number: 8131543

Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal, determining an energy-independent component of a portion of the audio signal associated with a spectral shape of the portion, and determining an energy-dependent component of the portion associated with a gain level of the portion. The method also comprises comparing the energy-independent and energy-dependent components to a speech model, comparing the energy-independent and energy-dependent components to a noise model, and outputting an indication whether the portion of the audio signal more closely corresponds to the speech model or to the noise model based on the comparisons.

Type: Grant

Filed: April 14, 2008

Date of Patent: March 6, 2012

Assignee: Google Inc.

Inventors: Ron J. Weiss, Trausti Kristjansson
System and method for user-specific speech recognition

Patent number: 8112275

Abstract: The systems and methods described herein may recognize natural language utterances that include queries and/or commands and execute the queries and/or commands based on user-specific profiles. The systems and methods described herein may include a complete speech-based information query, retrieval, presentation and command environment that makes significant use of context, prior information, domain knowledge, and the user-specific profiles to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a complete speech-based natural language query and response environment can be created and tailored to specific users. For example, the systems and methods described herein may create, store, and use extensive personal profile information for different users, thereby improving the reliability of determining the context and presenting the results that the specific users may expect for a particular question or command.

Type: Grant

Filed: April 22, 2010

Date of Patent: February 7, 2012

Assignee: VoiceBox Technologies, Inc.

Inventors: Robert A. Kennewick, David Locke, Michael R. Kennewick, Sr., Michael R. Kennewick, Jr., Richard Kennewick, Tom Freeman
Method of determining model-specific factors for pattern recognition, in particular for speech patterns

Patent number: 8112274

Abstract: A method for recognizing a pattern that comprises a set of physical stimuli, said method comprising the steps of: providing a set of training observations and through applying a plurality of association models ascertaining various measuring values p_j(k|x), j = 1, ..., M, that each pertain to assigning a particular training observation to one or more associated pattern classes; setting up a log-linear association distribution by combining all association models of the plurality according to respective weight factors, and joining thereto a normalization quantity to produce a compound association distribution; optimizing said weight factors for thereby minimizing a detected error rate of the actual assigning to said compound distribution; and recognizing target observations representing a target pattern with the help of said compound distribution.

Type: Grant

Filed: April 30, 2002

Date of Patent: February 7, 2012

Assignee: Nuance Communications, Inc.

Inventor: Peter Beyerlein
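
The log-linear combination named in the abstract above (patent 8112274) is usually written as p(k|x) = exp(sum_j lambda_j log p_j(k|x)) / Z(x), where Z(x) is the normalization quantity. The Python sketch below computes that compound distribution for fixed weights. The function name and the toy inputs are assumptions, every model probability is assumed to be strictly positive, and the error-rate-minimizing optimization of the weights is not shown.

```python
import math
from typing import Dict, Sequence


def log_linear_combine(model_probs: Sequence[Dict[str, float]],
                       weights: Sequence[float]) -> Dict[str, float]:
    """Combine M class posteriors p_j(k|x) with weights lambda_j.

    Each dict maps a class label k to p_j(k|x) > 0 for one model j.
    """
    classes = list(model_probs[0].keys())
    scores = {
        k: sum(w * math.log(p[k]) for w, p in zip(weights, model_probs))
        for k in classes
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    unnormalized = {k: math.exp(s - top) for k, s in scores.items()}
    z = sum(unnormalized.values())  # normalization quantity Z(x)
    return {k: v / z for k, v in unnormalized.items()}
```

Setting one weight to zero removes that model from the combination, which is one way to see why tuning the weights can change the error rate.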
Systems And Methods for Manipulating Electronic Content Based On Speech Recognition

Publication number: 20120010884

Abstract: Systems and methods are disclosed for displaying electronic multimedia content to a user. One computer-implemented method for manipulating electronic multimedia content includes generating, using a processor, a speech model and at least one speaker model of an individual speaker. The method further includes receiving electronic media content over a network; extracting an audio track from the electronic media content; and detecting speech segments within the electronic media content based on the speech model. The method further includes detecting a speaker segment within the electronic media content and calculating a probability of the detected speaker segment involving the individual speaker based on the at least one speaker model.

Type: Application

Filed: June 9, 2011

Publication date: January 12, 2012

Inventors: Peter F. Kocks, Guoning Hu, Ping-Hao Wu
Method and system for predicting understanding errors in a task classification system

Patent number: 8095363

Abstract: A method and system for monitoring an automated dialog system for the automatic recognition of language understanding errors based on a user's input communications in a task classification system. If the user's input communication cannot be understood and a task classification decision cannot be made, then further dialog may be conducted with the user if a probability of understanding the user's input communication exceeds a first threshold. Otherwise, the user may be directed to a human for assistance. In another possible embodiment, the method operates as above except that if the probability exceeds a second threshold, then further dialog may be conducted with the user using the current dialog strategy. However, if the probability falls between a first threshold and a second threshold, the dialog strategy may be adapted in order to improve the chances of conducting a successful dialog with the user.

Type: Grant

Filed: January 6, 2009

Date of Patent: January 10, 2012

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Allen Louis Gorin, Irene Langkilde Geary, Marilyn Ann Walker, Jeremy H. Wright
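
The two-threshold embodiment in the abstract above (patent 8095363) amounts to a three-way decision on the probability of understanding. A minimal Python sketch follows; the function name, the action labels, and the treatment of values exactly equal to a threshold are assumptions.

```python
def choose_dialog_action(p_understand: float,
                         first_threshold: float,
                         second_threshold: float) -> str:
    """Pick a dialog action from the probability of understanding the user.

    Above the second threshold: continue with the current dialog strategy.
    Between the thresholds: continue the dialog with an adapted strategy.
    At or below the first threshold: direct the user to a human.
    """
    if first_threshold > second_threshold:
        raise ValueError("first_threshold must not exceed second_threshold")
    if p_understand > second_threshold:
        return "continue_current_strategy"
    if p_understand > first_threshold:
        return "adapt_strategy"
    return "route_to_human"
```

For example, with thresholds 0.3 and 0.7, a probability of 0.5 selects the adapted strategy and a probability of 0.1 routes the user to a human.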
METHOD AND SYSTEM FOR USING INPUT SIGNAL QUALITY IN SPEECH RECOGNITION

Publication number: 20120004912

Abstract: A method and system for using input signal quality in an automatic speech recognition system. The method includes measuring the quality of an input signal into a speech recognition system and varying a rejection threshold of the speech recognition system at runtime in dependence on the measurement of the input signal quality. If the measurement of the input signal quality is low, the rejection threshold is reduced and, if the measurement of the input signal quality is high, the rejection threshold is increased. The measurement of the input signal quality may be based on one or more of the measurements of signal-to-noise ratio, loudness, including clipping, and speech signal duration.

Type: Application

Filed: August 9, 2011

Publication date: January 5, 2012

Applicant: Nuance Communications, Inc.

Inventors: John Doyle, John Brian Pickering
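
The runtime adjustment in the abstract above (publication 20120004912) can be sketched as follows: the rejection threshold is lowered when measured input quality is low and raised when it is high. The quality scale of 0 to 1, the cut-offs, the step size, and the function names in this Python sketch are all assumptions chosen for illustration.

```python
def adjusted_rejection_threshold(base_threshold: float,
                                 quality: float,
                                 low_quality: float = 0.3,
                                 high_quality: float = 0.7,
                                 step: float = 0.1) -> float:
    """Return the rejection threshold adapted to the input signal quality."""
    if quality < low_quality:
        threshold = base_threshold - step   # poor signal: reject less eagerly
    elif quality > high_quality:
        threshold = base_threshold + step   # clean signal: demand more confidence
    else:
        threshold = base_threshold
    return min(1.0, max(0.0, threshold))    # keep within [0, 1]


def accept_result(confidence: float, base_threshold: float, quality: float) -> bool:
    """Accept a recognition result if its confidence meets the adapted threshold."""
    return confidence >= adjusted_rejection_threshold(base_threshold, quality)
```

With a base threshold of 0.5, a result with confidence 0.45 is accepted on a low-quality input (quality 0.2) but rejected on a high-quality one (quality 0.9).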
Apparatus for creating speaker model, and computer program product

Patent number: 8078462

Abstract: A transformation-parameter calculating unit calculates a first model parameter, which is a parameter of a speaker model that maximizes a first likelihood for a clean feature, and calculates a transformation parameter that maximizes the first likelihood. For each of the speakers, the transformation parameter transforms a distribution of the clean feature corresponding to the identification information of the speaker to a distribution represented by the speaker model of the first model parameter. A model-parameter calculating unit transforms a noisy feature corresponding to the identification information of each of the speakers by using the transformation parameter, and calculates a second model parameter, which is a parameter of the speaker model that maximizes a second likelihood for the transformed noisy feature.

Type: Grant

Filed: October 2, 2008

Date of Patent: December 13, 2011

Assignee: Kabushiki Kaisha Toshiba

Inventors: Yusuke Shinohara, Masami Akamine
