Probability Patents (Class 704/240)
  • Publication number: 20130006631
    Abstract: Environmental recognition systems may improve recognition accuracy by leveraging local and nonlocal features in a recognition target. A local decoder may be used to analyze local features, and a nonlocal decoder may be used to analyze nonlocal features. Local and nonlocal estimates may then be exchanged to improve the accuracy of the local and nonlocal decoders. Additional iterations of analysis and exchange may be performed until a predetermined threshold is reached. In some embodiments, the system may comprise extrinsic information extractors to prevent positive feedback loops from causing the system to adhere to erroneous previous decisions.
    Type: Application
    Filed: June 28, 2012
    Publication date: January 3, 2013
    Applicant: UTAH STATE UNIVERSITY
    Inventors: Jacob Gunther, Todd Moon
  • Patent number: 8346555
    Abstract: The present invention discloses a speech processing solution that utilizes an original speech recognition grammar in a speech recognition system to perform speech recognition operations for multiple recognition instances. Instance data associated with the recognition operations can be stored. A replacement grammar can be automatically generated from the stored instance data, where the replacement grammar is a statistical language model grammar. The original speech recognition grammar, which can be a grammar-based language model grammar or a statistical language model grammar, can be selectively replaced with the replacement grammar. For example, when tested performance for the replacement grammar is better than that for the original grammar, the replacement grammar can replace the original grammar.
    Type: Grant
    Filed: August 22, 2006
    Date of Patent: January 1, 2013
    Assignee: Nuance Communications, Inc.
    Inventor: Brent D. Metz
  • Patent number: 8332208
    Abstract: An information processing apparatus includes: morphological analysis means for performing morphological analysis on a text document; managing means for managing a connection pattern indicating a connection relationship of a morpheme of a predetermined part of speech; and extracting means extracting, from a string of morphemes obtained by performing morphological analysis by the morphological analysis means, a phrase including a plurality of morphemes having a same connection relationship as the connection relationship indicated by the connection pattern managed by the managing means.
    Type: Grant
    Filed: September 3, 2008
    Date of Patent: December 11, 2012
    Assignee: Sony Corporation
    Inventor: Mitsuhiro Miyazaki
  • Patent number: 8332222
    Abstract: A Viterbi decoder includes: an observation vector sequence generator for generating an observation vector sequence by converting an input speech to a sequence of observation vectors; a local optimal state calculator for obtaining a partial state sequence having a maximum similarity up to a current observation vector as an optimal state; an observation probability calculator for obtaining, as a current observation probability, a probability for observing the current observation vector in the optimal state; a buffer for storing therein a specific number of previous observation probabilities; a non-linear filter for calculating a filtered probability by using the previous observation probabilities stored in the buffer and the current observation probability; and a maximum likelihood calculator for calculating a partial maximum likelihood by using the filtered probability.
    Type: Grant
    Filed: July 21, 2009
    Date of Patent: December 11, 2012
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Hoon Chung, Jeon Gue Park, Yunkeun Lee, Ho-Young Jung, Hyung-Bae Jeon, Jeom Ja Kang, Sung Joo Lee, Euisok Chung, Ji Hyun Wang, Byung Ok Kang, Ki-young Park, Jong Jin Kim
  • Patent number: 8332207
    Abstract: Systems, methods, and computer program products for machine translation are provided. In some implementations a system is provided. The system includes a language model including a collection of n-grams from a corpus, each n-gram having a corresponding relative frequency in the corpus and an order n corresponding to a number of tokens in the n-gram, each n-gram corresponding to a backoff n-gram having an order of n-1 and a collection of backoff scores, each backoff score associated with an n-gram, the backoff score determined as a function of a backoff factor and a relative frequency of a corresponding backoff n-gram in the corpus.
    Type: Grant
    Filed: June 22, 2007
    Date of Patent: December 11, 2012
    Assignee: Google Inc.
    Inventors: Thorsten Brants, Ashok C. Popat, Peng Xu, Franz J. Och, Jeffrey Dean
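    Illustrative sketch (not taken from the patent): the abstract describes a backoff score computed from a backoff factor and the relative frequency of the shorter "backoff" n-gram. One way this could look in Python is shown below. The function names, the toy corpus, and the factor value `alpha=0.4` are assumptions made only for illustration.

```python
from collections import Counter


def count_ngrams(tokens, max_order):
    """Count every n-gram of order 1..max_order in a token list."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def backoff_score(word, context, counts, total_tokens, alpha=0.4):
    """Score `word` after `context`, backing off to shorter contexts.

    If the full n-gram was seen, return its relative frequency. Otherwise
    drop the oldest context token and multiply by the (assumed) backoff
    factor `alpha`, repeating until a seen n-gram or the unigram is reached.
    """
    factor = 1.0
    context = tuple(context)
    while True:
        ngram = context + (word,)
        if counts[ngram] > 0:
            denom = counts[context] if context else total_tokens
            return factor * counts[ngram] / denom
        if not context:
            return 0.0  # word never seen at all
        context = context[1:]
        factor *= alpha


tokens = "the cat sat on the mat".split()
counts = count_ngrams(tokens, max_order=2)
seen = backoff_score("cat", ("the",), counts, len(tokens))
unseen = backoff_score("mat", ("on",), counts, len(tokens))
```

    Here `seen` uses the bigram's relative frequency directly, while `unseen` falls back to the unigram frequency of "mat" scaled by the backoff factor. The resulting values are scores rather than normalized probabilities.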
  • Patent number: 8321220
    Abstract: A system and method are disclosed for providing semi-supervised learning for a spoken language understanding module using semantic role labeling. The method embodiment relates to a method of generating a spoken language understanding module. Steps in the method comprise selecting at least one predicate/argument pair as an intent from a set of the most frequent predicate/argument pairs for a domain, labeling training data using mapping rules associated with the selected at least one predicate/argument pair, training a call-type classification model using the labeled training data, re-labeling the training data using the call-type classification model, and iteratively repeating several of the above steps until training set labels converge.
    Type: Grant
    Filed: November 30, 2005
    Date of Patent: November 27, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
  • Patent number: 8315870
    Abstract: A distance calculation unit (16) obtains the acoustic distance between the feature amount of input speech and each phonetic model. A word search unit (17) performs a word search based on the acoustic distance and a language model including the phoneme and prosodic label of a word, and outputs a word hypothesis and a first score representing the likelihood of the word hypothesis. The word search unit (17) also outputs a vowel interval and its tone label in the input speech, when assuming that the recognition result of the input speech is the word hypothesis. A tone recognition unit (21) outputs a second score representing the likelihood of the tone label output from the word search unit (17) based on a feature amount corresponding to the vowel interval output from the word search unit (17). A rescore unit (22) corrects the first score of the word hypothesis output from the word search unit (17) using the second score output from the tone recognition unit (21).
    Type: Grant
    Filed: August 22, 2008
    Date of Patent: November 20, 2012
    Assignee: NEC Corporation
    Inventor: Ken Hanazawa
  • Patent number: 8311825
    Abstract: A system for calculating the look ahead probabilities at the nodes in a language model look ahead tree, wherein the words of the vocabulary of the language are located at the leaves of the tree, said apparatus comprising: means to assign a language model probability to each of the words of the vocabulary using a first low order language model; means to calculate the language look ahead probabilities for all nodes in said tree using said first language model; means to determine if the language model probability of one or more words of said vocabulary can be calculated using a higher order language model and updating said words with the higher order language model; and means to update the look ahead probability at only the nodes which are affected by the words where the language model has been updated.
    Type: Grant
    Filed: October 3, 2008
    Date of Patent: November 13, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Langzhou Chen
  • Patent number: 8306818
    Abstract: Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.
    Type: Grant
    Filed: April 15, 2008
    Date of Patent: November 6, 2012
    Assignee: Microsoft Corporation
    Inventors: Ciprian Chelba, Alejandro Acero, Milind Mahajan
  • Publication number: 20120278076
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information are described. A method includes determining, for each of multiple communications that were initiated by a user of a mobile device, a time when the communication was initiated or received; determining, for each of multiple contacts associated with the user, a probability associated with the contact based at least on the times when the communications were initiated or received; weighting a contact disambiguation grammar according to the probabilities; and processing audio data using the contact disambiguation grammar to select a particular contact.
    Type: Application
    Filed: July 10, 2012
    Publication date: November 1, 2012
    Applicant: GOOGLE INC.
    Inventors: Matthew I. Lloyd, Willard Van Tuyl Rusch, II
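    Illustrative sketch (not taken from the patent): the abstract derives a per-contact probability from the times at which communications occurred and uses it to weight a grammar. The Python below shows one possible weighting, assuming an exponential recency decay. The decay rule, the `half_life_days` parameter, and the function name are hypothetical.

```python
from collections import defaultdict


def contact_probabilities(communications, now, half_life_days=30.0):
    """Turn (contact, time_in_days) pairs into normalized weights.

    Each communication contributes a score that halves every
    `half_life_days` (an assumed decay rule); scores are then
    normalized so that they sum to 1 across contacts.
    """
    scores = defaultdict(float)
    for contact, when in communications:
        age = max(0.0, now - when)
        scores[contact] += 0.5 ** (age / half_life_days)
    total = sum(scores.values())
    if total == 0:
        return {}
    return {contact: score / total for contact, score in scores.items()}


history = [("alice", 100.0), ("bob", 70.0)]
weights = contact_probabilities(history, now=100.0)
```

    With this history, the contact reached today receives twice the weight of the contact reached one half-life ago. Such weights could then bias a recognizer toward more likely contact names.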
  • Patent number: 8296141
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by (1) identifying word and phone alignments and corresponding likelihood scores, and (2) discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function.
    Type: Grant
    Filed: November 19, 2008
    Date of Patent: October 23, 2012
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje
  • Publication number: 20120253807
    Abstract: A speaker state detecting apparatus comprises: an audio input unit for acquiring, at least, a first voice emanated by a first speaker and a second voice emanated by a second speaker; a speech interval detecting unit for detecting an overlap period between a first speech period of the first speaker included in the first voice and a second speech period of the second speaker included in the second voice, which starts before the first speech period, or an interval between the first speech period and the second speech period; a state information extracting unit for extracting state information representing a state of the first speaker from the first speech period; and a state detecting unit for detecting the state of the first speaker in the first speech period based on the overlap period or the interval and the first state information.
    Type: Application
    Filed: February 3, 2012
    Publication date: October 4, 2012
    Applicant: FUJITSU LIMITED
    Inventor: Akira KAMANO
  • Patent number: 8275615
    Abstract: A translation method and system include a recognition engine having a plurality of models each being employed to decode a same utterance to provide an output. A model combiner is configured to assign probabilities to each model output and configured to assign weights to the outputs of the plurality of models based on the probabilities to provide a best performing model for the context of the utterance.
    Type: Grant
    Filed: July 13, 2007
    Date of Patent: September 25, 2012
    Assignee: International Business Machines Corporation
    Inventors: Suleyman S. Kozat, Ruhi Sarikaya
  • Patent number: 8275616
    Abstract: The present invention relates to a continuous speech recognition system that is very robust in a noisy environment. In order to recognize continuous speech smoothly in a noisy environment, the system selects call commands, configures a minimum recognition network in token, which consists of the call commands and mute intervals including noises, recognizes the inputted speech continuously in real time, analyzes the reliability of speech recognition continuously and recognizes the continuous speech from a speaker. When a speaker delivers a call command, the system for detecting the speech interval and recognizing continuous speech in a noisy environment through the real-time recognition of call commands measures the reliability of the speech after recognizing the call command, and recognizes the speech from the speaker by transferring the speech interval following the call command to a continuous speech-recognition engine at the moment when the system recognizes the call command.
    Type: Grant
    Filed: April 22, 2009
    Date of Patent: September 25, 2012
    Assignee: KoreaPowerVoice Co., Ltd.
    Inventors: Heui-Suck Jung, Se-Hoon Chin, Tae-Young Roh
  • Publication number: 20120232901
    Abstract: A language identification system that includes a universal phoneme decoder (UPD) is described. The UPD contains a universal phoneme set that both 1) represents all phonemes occurring in the set of two or more spoken languages, and 2) captures phoneme correspondences across languages, such that a set of unique phoneme patterns and probabilities is calculated in order to identify the most likely phoneme occurring at each time in the audio files in the set of two or more potential languages on which the UPD was trained. Each statistical language model (SLM) uses the set of unique phoneme patterns created for each language in the set to distinguish between spoken human languages in the set of languages. The run-time language identifier module identifies a particular human language being spoken by utilizing the linguistic probabilities supplied by the SLMs that are based on the set of unique phoneme patterns created for each language.
    Type: Application
    Filed: May 24, 2012
    Publication date: September 13, 2012
    Applicant: Autonomy Corporation Ltd.
    Inventors: Mahapathy Kadirkamanathan, Christopher John Waple
  • Patent number: 8255214
    Abstract: A first signal of two signals to be compared for similarity is divided into small areas and one small area is selected for calculating the correlation with a second signal using a correlative method. Then, the quantity of translation, expansion rate and similarity in an area where the similarity, which is the square of the correlation value, reaches its maximum, are found. Values based on the similarity are integrated at a position represented by the quantity of translation and expansion rate. Similar processing is performed with respect to all the small areas, and at a peak where the maximum integral value of the similarity is obtained, its magnitude is compared with a threshold value to evaluate the similarity. The small area voted for that peak can be extracted.
    Type: Grant
    Filed: October 15, 2002
    Date of Patent: August 28, 2012
    Assignee: Sony Corporation
    Inventors: Mototsugu Abe, Masayuki Nishiguchi
  • Patent number: 8244522
    Abstract: A language understanding device includes: a language understanding model storing unit configured to store word transition data including pre-transition states, input words, predefined outputs corresponding to the input words, word weight information, and post-transition states, and concept weighting data including concepts obtained from language understanding results for at least one word, and concept weight information corresponding to the concepts; a finite state transducer processing unit configured to output understanding result candidates including the predefined outputs, to accumulate word weights so as to obtain a cumulative word weight, and to sequentially perform state transition operations; a concept weighting processing unit configured to accumulate concept weights so as to obtain a cumulative concept weight; and an understanding result determination unit configured to determine an understanding result from the understanding result candidates by referring to the cumulative word weight and the cumul
    Type: Grant
    Filed: May 20, 2008
    Date of Patent: August 14, 2012
    Assignee: Honda Motor Co., Ltd.
    Inventors: Mikio Nakano, Hiroshi Okuno, Kazunori Komatani, Yuichiro Fukubayashi, Kotaro Funakoshi
  • Patent number: 8234107
    Abstract: Disclosed herein is a method of grouping similar supplier names together in a database. The syntactical errors in the supplier names are corrected. The supplier names are grouped after correcting the syntactical errors. The abbreviations in the supplier names are captured. The ordering, pronunciation and stemming errors in the supplier names are corrected. A matching algorithm that matches and compares two supplier names is applied that comprises the steps of grouping supplier names based on a first set of characters in the supplier names and calculating a matching score between the two supplier names using the Levenshtein distance between them, along with the supplier names' sound codes obtained from a modified metaphone algorithm, length of each word, position of matching and mismatching characters, and stem of words in the supplier names. The matching scores are compared with set thresholds in order to further group the supplier names into clusters.
    Type: Grant
    Filed: February 12, 2008
    Date of Patent: July 31, 2012
    Assignee: Ketera Technologies, Inc.
    Inventor: Ram Dayal Goyal
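    Illustrative sketch (not taken from the patent): the abstract groups names by their leading characters and scores pairs using, among other signals, the Levenshtein distance. The Python below covers only those two pieces. The sound codes, stemming, and position features are omitted, and the normalization in `match_score` and the prefix length `k=3` are assumptions.

```python
from collections import defaultdict


def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def match_score(name_a, name_b):
    """Similarity in [0, 1]: 1 minus distance over the longer length (assumed form)."""
    a, b = name_a.lower().strip(), name_b.lower().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def group_by_prefix(names, k=3):
    """Bucket names by their first k characters, ignoring case."""
    groups = defaultdict(list)
    for name in names:
        groups[name.lower().strip()[:k]].append(name)
    return dict(groups)


groups = group_by_prefix(["Acme Corp", "ACME Corporation", "Zenith Ltd"])
score = match_score("Acme Corp", "Acme Corp.")
```

    Comparing `score` against a chosen threshold would then decide whether two names in the same prefix bucket belong to one cluster.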
  • Patent number: 8229745
    Abstract: A method of building a mixed-initiative grammar can include receiving one or more conjoin phrases, wherein each conjoin phrase is associated with a selected one of the plurality of directed dialog grammars, and receiving a user input specifying a selected grammar generation technique. The mixed-initiative grammar can be automatically generated, in accordance with the selected grammar generation technique, such that the mixed-initiative grammar specifies an allowable ordering of sets when interpreting a user spoken utterance and whether duplicative phrases are allowable within the user spoken utterance.
    Type: Grant
    Filed: October 21, 2005
    Date of Patent: July 24, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, David Jaramillo, Gerald M. McCobb
  • Patent number: 8225203
    Abstract: User input is received, specifying a continuous traced path across a keyboard presented on a touch sensitive display. An input sequence is resolved, including traced keys and auxiliary keys proximate to the traced keys by prescribed criteria. For each of one or more candidate entries of a prescribed vocabulary, a set-edit-distance metric is computed between said input sequence and the candidate entry. Various rules specify when penalties are imposed, or not, in computing the set-edit-distance metric. Candidate entries are ranked and displayed according to the computed metric.
    Type: Grant
    Filed: November 4, 2010
    Date of Patent: July 17, 2012
    Assignee: Nuance Communications, Inc.
    Inventor: Erland Unruh
  • Publication number: 20120166194
    Abstract: Disclosed herein are an apparatus and method for recognizing speech. The apparatus includes a frame-based speech recognition unit, a segment division unit, a segment feature extraction unit, a segment speech recognition performance unit, and a combination and synchronization unit. The frame-based speech recognition unit extracts frame speech feature vectors from a speech signal, and performs speech recognition on frames of the speech signal using the frame speech feature vectors and a frame-based probability model. The segment division unit divides the speech signal into segments. The segment feature extraction unit extracts segment speech feature vectors around a boundary between the segments. The segment speech recognition performance unit performs speech recognition on the segments of the speech signal using the segment speech feature vectors and a segment-based probability model.
    Type: Application
    Filed: December 22, 2011
    Publication date: June 28, 2012
    Applicant: Electronics and Telecommunications Research Institute
    Inventors: Ho-Young JUNG, Jeon-Gue PARK, Hoon CHUNG
  • Publication number: 20120166195
    Abstract: A state detection device includes: a first model generation unit to generate a first specific speaker model obtained by modeling speech features of a specific speaker in an undepressed state; a second model generation unit to generate a second specific speaker model obtained by modeling speech features of the specific speaker in the depressed state; a likelihood calculation unit to calculate a first likelihood as a likelihood of the first specific speaker model with respect to input voice, and a second likelihood as a likelihood of the second specific speaker model with respect to the input voice; and a state determination unit to determine a state of the speaker of the input voice using the first likelihood and the second likelihood.
    Type: Application
    Filed: October 5, 2011
    Publication date: June 28, 2012
    Applicant: FUJITSU LIMITED
    Inventors: Shoji HAYAKAWA, Naoshi Matsuo
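    Illustrative sketch (not taken from the patent): the abstract compares the likelihood of input voice under two speaker models, one per state. The Python below assumes the simplest possible model, a single one-dimensional Gaussian per state, only to show the comparison step. The feature values and function names are hypothetical.

```python
import math


def fit_gaussian(samples):
    """Estimate mean and variance of a 1-D feature (variance floored)."""
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    return mean, max(var, 1e-6)


def log_likelihood(samples, model):
    """Sum of Gaussian log-densities of the samples under (mean, var)."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
               for x in samples)


def detect_state(samples, undepressed_model, depressed_model):
    """Pick the state whose model gives the input the higher likelihood."""
    first = log_likelihood(samples, undepressed_model)
    second = log_likelihood(samples, depressed_model)
    return "undepressed" if first >= second else "depressed"


undepressed = fit_gaussian([1.0, 1.2, 0.8])
depressed = fit_gaussian([3.0, 3.1, 2.9])
state = detect_state([1.1, 0.9], undepressed, depressed)
```

    A practical system would likely use richer models over multi-dimensional speech features. The decision rule, comparing the first and second likelihoods, has the same shape.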
  • Patent number: 8209172
    Abstract: Pattern recognition capable of robust identification for the variance of an input pattern is performed with a low processing cost while the possibility of identification errors is decreased. In a pattern recognition apparatus which identifies the pattern of input data from a data input unit (11) by using a hierarchical feature extraction processor (12) which hierarchically extracts features, an extraction result distribution analyzer (13) analyzes a distribution of at least one feature extraction result obtained by a primary feature extraction processor (121). On the basis of the analytical result, a secondary feature extraction processor (122) performs predetermined secondary feature extraction.
    Type: Grant
    Filed: December 16, 2004
    Date of Patent: June 26, 2012
    Assignee: Canon Kabushiki Kaisha
    Inventors: Yusuke Mitarai, Masakazu Matsuga, Katsuhiko Mori
  • Patent number: 8204749
    Abstract: A system, method and computer-readable medium for practicing a method of emotion detection during a natural language dialog between a human and a computing device are disclosed. The method includes receiving an utterance from a user in a natural language dialog, receiving contextual information regarding the natural language dialog which is related to changes of emotion over time in the dialog, and detecting an emotion of the user based on the received contextual information. Examples of contextual information include differential statistics, joint statistics and distance statistics.
    Type: Grant
    Filed: March 21, 2011
    Date of Patent: June 19, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Z. Hakkani-Tur, Jackson J. Liscombe, Guiseppe Riccardi
  • Patent number: 8195458
    Abstract: A method of semantically classifying a data set of open class nouns and a system for executing the method. The method includes loading, by a processing device, a data set comprising one or more open class nouns from a computer readable medium operably connected to the processing device; extracting, by the processing device, the one or more open class nouns from the data set; for each open class noun, querying, by the processing device, one or more application programming interfaces (APIs) to produce one or more results; deriving, by the processing device, a confidence score for the data set based upon the one or more results; and determining, by the processing device, a classification for the data set based upon the derived confidence score.
    Type: Grant
    Filed: August 17, 2010
    Date of Patent: June 5, 2012
    Assignee: Xerox Corporation
    Inventors: Michael David Shepherd, Kirk J. Ocke, Barry Glynn Gombert, Dale Ellen Gaucas
  • Patent number: 8195436
    Abstract: A system for simulating interdependencies between multiple critical physical infrastructure models, including a first infrastructure data model that models a first critical physical infrastructure, a second infrastructure data model that models a second critical physical infrastructure, wherein the second critical physical infrastructure is a different critical physical infrastructure from the first critical physical infrastructure, a simulation engine including a visualization application and adapted to automatically produce a change in the second infrastructure data model in response to a change in the first infrastructure data model, and a user interface permitting a user to interact with the simulation engine.
    Type: Grant
    Filed: January 11, 2010
    Date of Patent: June 5, 2012
    Assignee: Intepoint, LLC
    Inventors: William J Tolone, Bei-tseng Chu
  • Patent number: 8195459
    Abstract: Outputs of an automatic probabilistic event detection system, such as a fact extraction system, a speech-to-text engine or an automatic character recognition system, are matched with comparable results produced manually or by a different system. This comparison allows statistical modeling of the run-time behavior of the event detection system. This model can subsequently be used to give supplemental or replacement data for an output sequence of the system. In particular, the model can effectively calibrate the system for use with data of a particular statistical nature.
    Type: Grant
    Filed: September 6, 2010
    Date of Patent: June 5, 2012
    Assignee: Verint Americas, Inc.
    Inventor: Michael Brand
  • Patent number: 8195455
    Abstract: Provided are an apparatus and a method capable of recognizing a sound through a reduced burden of computations and a noise-tolerant technique. The sound recognition apparatus in a portable device includes a memory unit that stores at least one base sound and a sound input unit that receives a sound input. The sound recognition apparatus also includes a control unit that receives the sound input from the sound input unit, extracts peak values of the sound input, calculates statistical data by using the peak values, and determines whether the sound input is equal to a base sound by using the statistical data.
    Type: Grant
    Filed: February 18, 2009
    Date of Patent: June 5, 2012
    Assignee: Samsung Electronics Co., Ltd
    Inventor: Hyun Soo Kim
  • Patent number: 8190430
    Abstract: A method and system for using input signal quality in an automatic speech recognition system. The method includes measuring the quality of an input signal into a speech recognition system and varying a rejection threshold of the speech recognition system at runtime in dependence on the measurement of the input signal quality. If the measurement of the input signal quality is low, the rejection threshold is reduced and, if the measurement of the input signal quality is high, the rejection threshold is increased. The measurement of the input signal quality may be based on one or more of the measurements of signal-to-noise ratio, loudness, including clipping, and speech signal duration.
    Type: Grant
    Filed: August 9, 2011
    Date of Patent: May 29, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: John Doyle, John Brian Pickering
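    Illustrative sketch (not taken from the patent): the abstract lowers the rejection threshold when input signal quality is low and raises it when quality is high. The Python below uses signal-to-noise ratio as the only quality measure and a linear mapping. The parameter names, default values, and the linear form are assumptions.

```python
def adjusted_rejection_threshold(snr_db, base_threshold=0.5,
                                 low_snr_db=10.0, high_snr_db=30.0,
                                 max_shift=0.2):
    """Shift the rejection threshold according to measured SNR.

    At or below `low_snr_db` the threshold is reduced by `max_shift`;
    at or above `high_snr_db` it is increased by `max_shift`; in between
    it moves linearly. All constants here are illustrative only.
    """
    position = (snr_db - low_snr_db) / (high_snr_db - low_snr_db)
    position = min(1.0, max(0.0, position))  # clamp to [0, 1]
    return base_threshold + (2.0 * position - 1.0) * max_shift


def accept(confidence, snr_db):
    """Accept a recognition result if its confidence clears the threshold."""
    return confidence >= adjusted_rejection_threshold(snr_db)


noisy = adjusted_rejection_threshold(5.0)
clean = adjusted_rejection_threshold(35.0)
```

    With these assumed defaults, a result with confidence 0.4 would be accepted on a noisy input but rejected on a clean one.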
  • Patent number: 8175874
    Abstract: A method of transferring a real-time audio signal transmission, including: registering voice patterns (or other characteristics) of one or more users to be used to identify the voices of the users, accepting an audio signal as it is created as a sequence of segments, analyzing each segment of the accepted audio signal to determine if it contains voice activity (314), determining a probability level that the voice activity of the segment is of a registered user (320 & 322); and selectively transferring the contents of a segment responsive to the determined probability level (324).
    Type: Grant
    Filed: July 18, 2006
    Date of Patent: May 8, 2012
    Inventor: Shaul Shimhi
  • Patent number: 8175878
    Abstract: Systems, methods, and apparatuses, including computer program products, are provided for representing language models. In some implementations, a computer-implemented method is provided. The method includes generating a compact language model including receiving a collection of n-grams from the corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus and generating a trie representing the collection of n-grams. The method also includes using the language model to identify a second probability of a particular string of words occurring.
    Type: Grant
    Filed: December 14, 2010
    Date of Patent: May 8, 2012
    Assignee: Google Inc.
    Inventors: Ciprian Chelba, Thorsten Brants
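    Illustrative sketch (not taken from the patent): the abstract stores n-grams and their probabilities in a trie. The Python below shows the basic idea with nested dictionaries, where n-grams sharing a prefix share nodes. It makes no attempt at the compact encoding the patent is concerned with, and the class and method names are hypothetical.

```python
class NgramTrie:
    """Trie keyed by tokens; a node may hold the probability of its n-gram."""

    def __init__(self):
        self.children = {}
        self.prob = None

    def insert(self, ngram, prob):
        """Store `prob` at the node reached by following the n-gram's tokens."""
        node = self
        for token in ngram:
            node = node.children.setdefault(token, NgramTrie())
        node.prob = prob

    def lookup(self, ngram):
        """Return the stored probability, or None if the n-gram is absent."""
        node = self
        for token in ngram:
            node = node.children.get(token)
            if node is None:
                return None
        return node.prob


trie = NgramTrie()
trie.insert(("the",), 0.1)
trie.insert(("the", "cat"), 0.5)
trie.insert(("the", "mat"), 0.25)
found = trie.lookup(("the", "cat"))
missing = trie.lookup(("the", "dog"))
```

    All three n-grams above share the single root child for "the", which is the prefix sharing a trie provides.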
  • Publication number: 20120109651
    Abstract: A method of searching a plurality of data files, wherein each data file includes a plurality of features. The method: determines a plurality of feature groups, wherein each feature group includes n features and n is an integer of 2 or more; expresses each data file as a file vector, wherein each component of the vector indicates the frequency of a feature group within the data file, wherein the n features which constitute a feature group do not have to be located adjacent to one another; expresses a search query using the feature groups as a vector; and searches the plurality of data files by comparing the search query expressed as a vector with the file vectors.
    Type: Application
    Filed: April 16, 2009
    Publication date: May 3, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventor: Langzhou Chen
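    Illustrative sketch (not taken from the patent): the abstract represents each file as a vector of feature-group frequencies, where the features in a group need not be adjacent, and compares a query vector against the file vectors. The Python below assumes groups of n = 2 words co-occurring within a small window and uses cosine similarity for the comparison. The window size and the similarity measure are assumptions.

```python
import math
from collections import Counter


def group_vector(tokens, window=3):
    """Count ordered word pairs that occur within `window` positions."""
    vec = Counter()
    for i, first in enumerate(tokens):
        for second in tokens[i + 1:i + 1 + window]:
            vec[(first, second)] += 1
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query_tokens, files, window=3):
    """Rank files (name -> token list) by similarity to the query."""
    query = group_vector(query_tokens, window)
    scored = [(cosine(query, group_vector(tokens, window)), name)
              for name, tokens in files.items()]
    return sorted(scored, reverse=True)


files = {
    "a": "speech recognition language model".split(),
    "b": "patent filing date".split(),
}
results = search("speech language".split(), files)
```

    The query pair ("speech", "language") matches file "a" even though the two words are not adjacent there, which is the point of allowing non-adjacent feature groups.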
  • Patent number: 8170873
    Abstract: An approach to comparing events in word spotting, such as comparing putative and reference instances of a keyword, makes use of a set of models of subword units. For each of two acoustic events and for each of a series of times in each of the events, a probability associated with each of the models of the set of subword units is computed. Then, a quantity characterizing a comparison of the two acoustic events, one occurring in each of the two acoustic signals, is computed using the computed probabilities associated with each of the models.
    Type: Grant
    Filed: July 22, 2004
    Date of Patent: May 1, 2012
    Assignee: Nexidia Inc.
    Inventor: Robert W. Morris
  • Publication number: 20120101820
    Abstract: A method is disclosed for applying a multi-state barge-in acoustic model in a spoken dialogue system. The method includes receiving an audio speech input from the user during the presentation of a prompt, accumulating the audio speech input from the user, applying a non-speech component having at least two one-state Hidden Markov Models (HMMs) to the audio speech input from the user, applying a speech component having at least five three-state HMMs to the audio speech input from the user, in which each of the five three-state HMMs represents a different phonetic category, determining whether the audio speech input is a barge-in-speech input from the user, and if the audio speech input is determined to be the barge-in-speech input from the user, terminating the presentation of the prompt.
    Type: Application
    Filed: October 24, 2011
    Publication date: April 26, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventor: Andrej Ljolje
  • Patent number: 8165877
    Abstract: A voice search system has a speech recognizer, a search component, and a dialog manager. A confidence measure generator receives speech recognition features from the speech recognizer, search features from the search component, and dialog features from the dialog manager, and calculates an overall confidence measure for voice search results based upon the features received. The invention can be extended to include the generation of additional features, based on those received from the individual components of the voice search system.
    Type: Grant
    Filed: August 3, 2007
    Date of Patent: April 24, 2012
    Assignee: Microsoft Corporation
    Inventors: Ye-Yi Wang, Yun-Cheng Ju, Dong Yu
  • Publication number: 20120095762
    Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting at least one frame from among the frames that are previously positioned with respect to a frame of the first speech.
    Type: Application
    Filed: October 19, 2011
    Publication date: April 19, 2012
    Applicants: SEOUL NATIONAL UNIVERSITY INDUSTRY FOUNDATION, SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ki-wan EOM, Chang-woo HAN, Tae-gyoon KANG, Nam-soo KIM, Doo-hwa HONG, Jae-won LEE, Hyung-joon LIM
  • Patent number: 8145485
    Abstract: A device receives a voice recognition statistic from a voice recognition application and applies a grammar improvement rule based on the voice recognition statistic. The device also automatically adjusts a weight of the voice recognition statistic based on the grammar improvement rule, and outputs the weight adjusted voice recognition statistic for use in the voice recognition application.
    Type: Grant
    Filed: April 29, 2011
    Date of Patent: March 27, 2012
    Assignee: Verizon Patent and Licensing Inc.
    Inventor: Kevin W. Brown
  • Patent number: 8145484
    Abstract: The described implementations relate to speech spelling by a user. One method identifies one or more symbols that may match a user utterance and displays an individual symbol for confirmation by the user.
    Type: Grant
    Filed: November 11, 2008
    Date of Patent: March 27, 2012
    Assignee: Microsoft Corporation
    Inventor: Geoffrey Zweig
  • Publication number: 20120072215
    Abstract: A method is disclosed herein that includes an act of causing a processor to access a deep-structured model retained in a computer-readable medium, wherein the deep-structured model comprises a plurality of layers with weights assigned thereto, transition probabilities between states, and language model scores. The method can further include the act of jointly substantially optimizing the weights, the transition probabilities, and the language model scores of the deep-structured model using the optimization criterion based on a sequence rather than a set of unrelated frames.
    Type: Application
    Filed: September 21, 2010
    Publication date: March 22, 2012
    Applicant: Microsoft Corporation
    Inventors: Dong Yu, Li Deng, Abdel-rahman Samir Abdel-rahman Mohamed
  • Publication number: 20120072216
    Abstract: A method and device are configured to receive voice data from a user and perform speech recognition on the received voice data. A confidence score is calculated that represents the likelihood that the received voice data has been accurately recognized. A likely age range associated with the user is determined based on the confidence score.
    Type: Application
    Filed: November 30, 2011
    Publication date: March 22, 2012
    Applicant: VERIZON PATENT AND LICENSING INC.
    Inventor: Kevin R. Witzman
  • Patent number: 8136154
    Abstract: Hidden Markov Models (“HMMs”) are used to analyze keystroke dynamics measurements collected as a user types a predetermined string on a keyboard. A user enrolls by typing the predetermined string several times; the enrollment samples are used to train a HMM to identify the user. A candidate who claims to be the user provides a typing sample, and the HMM produces a probability to estimate the likelihood that the candidate is the user he claims to be. A computationally-efficient method for preparing HMMs to analyze certain types of processes is also described.
    Type: Grant
    Filed: May 6, 2008
    Date of Patent: March 13, 2012
    Assignees: The Penn State Foundation, Louisiana Tech University Research Foundation
    Inventors: Vir V. Phoha, Shashi Phoha, Asok Ray, Shrijit Sudhakar Joshi, Sampath Kumar Vuyyuru
  • Patent number: 8135699
    Abstract: A server-side summarization system includes a function for acquiring material to be summarized, along with source information about the material, a converter for converting the acquired material to machine-readable form, if not in that form when acquired, a summarizer for creating a summary from the acquired material, and a storage function for storing a copy of the acquired material and the summary created as separate files, associated and cross-referenced using the source information.
    Type: Grant
    Filed: June 21, 2006
    Date of Patent: March 13, 2012
    Inventors: Puneet K. Gupta, Mark A. Boys
  • Patent number: 8135586
    Abstract: Disclosed are a method and an apparatus for estimating noise included in a sound signal during sound signal processing. The method includes estimating harmonics components in a frame of an input sound signal; using the estimated harmonics components, computing a Voice Presence Probability (VPP) on the frame of the input sound signal; determining a weight of an equation necessary to estimate a noise spectrum, depending on the computed VPP; and using the determined weight and the equation necessary to estimate a noise spectrum, estimating the noise spectrum, and updating the noise spectrum.
    Type: Grant
    Filed: March 21, 2008
    Date of Patent: March 13, 2012
    Assignees: Samsung Electronics Co., Ltd, Korea University Industrial & Academic Collaboration Foundation
    Inventors: Hyun-Soo Kim, Hanseok Ko, Sung-Joo Ahn, Jounghoon Beh, Hyun-Jin Yoon
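
The VPP-dependent noise update in the abstract above might look roughly like the sketch below. It assumes a simple recursive-averaging equation whose smoothing weight grows with the Voice Presence Probability; the abstract does not give the actual equation, so the form of the update, the `alpha_min` parameter, and the function name are hypothetical. The harmonics-based VPP computation is not shown and `vpp` is taken as an input.

```python
def update_noise_spectrum(noise, frame_power, vpp, alpha_min=0.8):
    """Update a per-bin noise spectrum estimate for one frame.

    noise       -- previous noise estimate, one value per frequency bin
    frame_power -- power spectrum of the current frame, same length
    vpp         -- voice presence probability for the frame, in [0, 1]
    alpha_min   -- smoothing weight used when no voice is present (assumed)
    """
    # The weight approaches 1 when speech is likely, so the old noise
    # estimate is kept; it falls to alpha_min when the frame is likely noise.
    weight = alpha_min + (1.0 - alpha_min) * vpp
    return [weight * n + (1.0 - weight) * p for n, p in zip(noise, frame_power)]
```

With `vpp` close to 1 the estimate is effectively frozen, which keeps speech energy from leaking into the noise spectrum.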
  • Patent number: 8131543
    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal, determining an energy-independent component of a portion of the audio signal associated with a spectral shape of the portion, and determining an energy-dependent component of the portion associated with a gain level of the portion. The method also comprises comparing the energy-independent and energy-dependent components to a speech model, comparing the energy-independent and energy-dependent components to a noise model, and outputting an indication whether the portion of the audio signal more closely corresponds to the speech model or to the noise model based on the comparisons.
    Type: Grant
    Filed: April 14, 2008
    Date of Patent: March 6, 2012
    Assignee: Google Inc.
    Inventors: Ron J. Weiss, Trausti Kristjansson
  • Patent number: 8112275
    Abstract: The systems and methods described herein may recognize natural language utterances that include queries and/or commands and execute the queries and/or commands based on user-specific profiles. The systems and methods described herein may include a complete speech-based information query, retrieval, presentation and command environment that makes significant use of context, prior information, domain knowledge, and the user-specific profiles to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a complete speech-based natural language query and response environment can be created and tailored to specific users. For example, the systems and methods described herein may create, store, and use extensive personal profile information for different users, thereby improving the reliability of determining the context and presenting the results that the specific users may expect for a particular question or command.
    Type: Grant
    Filed: April 22, 2010
    Date of Patent: February 7, 2012
    Assignee: VoiceBox Technologies, Inc.
    Inventors: Robert A. Kennewick, David Locke, Michael R. Kennewick, Sr., Michael R. Kennewick, Jr., Richard Kennewick, Tom Freeman
  • Patent number: 8112274
    Abstract: A method for recognizing a pattern that comprises a set of physical stimuli, said method comprising the steps of: providing a set of training observations and through applying a plurality of association models ascertaining various measuring values pj(k|x), j=1 . . . M, that each pertain to assigning a particular training observation to one or more associated pattern classes; setting up a log/linear association distribution by combining all association models of the plurality according to respective weight factors, and joining thereto a normalization quantity to produce a compound association distribution; optimizing said weight factors for thereby minimizing a detected error rate of the actual assigning to said compound distribution; recognizing target observations representing a target pattern with the help of said compound distribution.
    Type: Grant
    Filed: April 30, 2002
    Date of Patent: February 7, 2012
    Assignee: Nuance Communications, Inc.
    Inventor: Peter Beyerlein
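
One common way to write the log-linear combination described in the abstract above is shown below. The weight symbols \lambda_j and the sum over competing classes k' are notation assumed here for illustration; the abstract names only the model scores p_j(k|x), j = 1 ... M, the weight factors, and a normalization quantity.

```latex
p_{\Lambda}(k \mid x)
  = \frac{\exp\!\Big(\sum_{j=1}^{M} \lambda_j \log p_j(k \mid x)\Big)}
         {\sum_{k'} \exp\!\Big(\sum_{j=1}^{M} \lambda_j \log p_j(k' \mid x)\Big)}
```

The denominator plays the role of the normalization quantity, and the weights \lambda_j are the factors that the method tunes to minimize the error rate on the training observations.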
  • Publication number: 20120010884
    Abstract: Systems and methods are disclosed for displaying electronic multimedia content to a user. One computer-implemented method for manipulating electronic multimedia content includes generating, using a processor, a speech model and at least one speaker model of an individual speaker. The method further includes receiving electronic media content over a network; extracting an audio track from the electronic media content; and detecting speech segments within the electronic media content based on the speech model. The method further includes detecting a speaker segment within the electronic media content and calculating a probability of the detected speaker segment involving the individual speaker based on the at least one speaker model.
    Type: Application
    Filed: June 9, 2011
    Publication date: January 12, 2012
    Inventors: Peter F. Kocks, Guoning Hu, Ping-Hao Wu
  • Patent number: 8095363
    Abstract: A method and system for monitoring an automated dialog system for the automatic recognition of language understanding errors based on a user's input communications in a task classification system. If the user's input communication cannot be understood and a task classification decision cannot be made, then further dialog may be conducted with the user if a probability of understanding the user's input communication exceeds a first threshold. Otherwise, the user may be directed to a human for assistance. In another possible embodiment, the method operates as above except that if the probability exceeds a second threshold, then further dialog may be conducted with the user using the current dialog strategy. However, if the probability falls between a first threshold and a second threshold, the dialog strategy may be adapted in order to improve the chances of conducting a successful dialog with the user.
    Type: Grant
    Filed: January 6, 2009
    Date of Patent: January 10, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Allen Louis Gorin, Irene Langkilde Geary, Marilyn Ann Walker, Jeremy H. Wright
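
The two-threshold embodiment in the abstract above reduces to a small decision rule, sketched here. The threshold values, the function name, and the action labels are placeholders chosen for illustration and do not come from the patent.

```python
def next_action(p_understanding, first_threshold=0.3, second_threshold=0.7):
    """Choose the next dialog step from the probability of understanding.

    Threshold defaults are illustrative assumptions only.
    """
    if p_understanding > second_threshold:
        # Confident enough: keep going with the current dialog strategy.
        return "continue_current_strategy"
    if p_understanding > first_threshold:
        # Between the two thresholds: adapt the strategy to improve the odds.
        return "adapt_strategy"
    # At or below the first threshold: hand the user over to a human.
    return "transfer_to_human"
```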
  • Publication number: 20120004912
    Abstract: A method and system for using input signal quality in an automatic speech recognition system. The method includes measuring the quality of an input signal into a speech recognition system and varying a rejection threshold of the speech recognition system at runtime in dependence on the measurement of the input signal quality. If the measurement of the input signal quality is low, the rejection threshold is reduced and, if the measurement of the input signal quality is high, the rejection threshold is increased. The measurement of the input signal quality may be based on one or more of the measurements of signal-to-noise ratio, loudness, including clipping, and speech signal duration.
    Type: Application
    Filed: August 9, 2011
    Publication date: January 5, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: John Doyle, John Brian Pickering
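
The runtime threshold adjustment in the abstract above could be sketched as follows, using signal-to-noise ratio as the only quality measure. The linear mapping, the SNR range, and the `base` and `swing` values are assumptions of this sketch; the abstract says only that the rejection threshold is reduced for low-quality input and increased for high-quality input.

```python
def rejection_threshold(snr_db, base=0.5, low_snr=10.0, high_snr=30.0, swing=0.2):
    """Map a signal-to-noise ratio (dB) to a rejection threshold.

    Poor signals get a lower threshold and good signals a higher one.
    All numeric defaults are illustrative assumptions.
    """
    # Normalize quality to [0, 1], clamping outside the assumed SNR range.
    quality = min(max((snr_db - low_snr) / (high_snr - low_snr), 0.0), 1.0)
    return base + swing * (2.0 * quality - 1.0)


def accept(confidence, snr_db):
    """Accept a recognition result if its confidence meets the threshold."""
    return confidence >= rejection_threshold(snr_db)
```

Under these assumed defaults, a result with confidence 0.4 is accepted on a noisy 5 dB signal but rejected on a clean 40 dB signal.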
  • Patent number: 8078462
    Abstract: A transformation-parameter calculating unit calculates a first model parameter, which is a parameter of a speaker model that maximizes a first likelihood for a clean feature, and calculates a transformation parameter that maximizes the first likelihood. The transformation parameter transforms, for each of the speakers, a distribution of the clean feature corresponding to the identification information of the speaker to a distribution represented by the speaker model of the first model parameter. A model-parameter calculating unit transforms a noisy feature corresponding to identification information for each of the speakers by using the transformation parameter, and calculates a second model parameter, which is a parameter of the speaker model that maximizes a second likelihood for the transformed noisy feature.
    Type: Grant
    Filed: October 2, 2008
    Date of Patent: December 13, 2011
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Yusuke Shinohara, Masami Akamine