Probability Patents (Class 704/240)
-
Publication number: 20130006631
Abstract: Environmental recognition systems may improve recognition accuracy by leveraging local and nonlocal features in a recognition target. A local decoder may be used to analyze local features, and a nonlocal decoder may be used to analyze nonlocal features. Local and nonlocal estimates may then be exchanged to improve the accuracy of the local and nonlocal decoders. Additional iterations of analysis and exchange may be performed until a predetermined threshold is reached. In some embodiments, the system may comprise extrinsic information extractors to prevent positive feedback loops from causing the system to adhere to erroneous previous decisions.
Type: Application
Filed: June 28, 2012
Publication date: January 3, 2013
Applicant: UTAH STATE UNIVERSITY
Inventors: Jacob Gunther, Todd Moon
-
Patent number: 8346555
Abstract: The present invention discloses a speech processing solution that utilizes an original speech recognition grammar in a speech recognition system to perform speech recognition operations for multiple recognition instances. Instance data associated with the recognition operations can be stored. A replacement grammar can be automatically generated from the stored instance data, where the replacement grammar is a statistical language model grammar. The original speech recognition grammar, which can be a grammar-based language model grammar or a statistical language model grammar, can be selectively replaced with the replacement grammar. For example, when tested performance for the replacement grammar is better than that for the original grammar, the replacement grammar can replace the original grammar.
Type: Grant
Filed: August 22, 2006
Date of Patent: January 1, 2013
Assignee: Nuance Communications, Inc.
Inventor: Brent D. Metz
-
Patent number: 8332208
Abstract: An information processing apparatus includes: morphological analysis means for performing morphological analysis on a text document; managing means for managing a connection pattern indicating a connection relationship of a morpheme of a predetermined part of speech; and extracting means for extracting, from a string of morphemes obtained by performing morphological analysis by the morphological analysis means, a phrase including a plurality of morphemes having a same connection relationship as the connection relationship indicated by the connection pattern managed by the managing means.
Type: Grant
Filed: September 3, 2008
Date of Patent: December 11, 2012
Assignee: Sony Corporation
Inventor: Mitsuhiro Miyazaki
-
Patent number: 8332222
Abstract: A Viterbi decoder includes: an observation vector sequence generator for generating an observation vector sequence by converting an input speech to a sequence of observation vectors; a local optimal state calculator for obtaining a partial state sequence having a maximum similarity up to a current observation vector as an optimal state; an observation probability calculator for obtaining, as a current observation probability, a probability for observing the current observation vector in the optimal state; a buffer for storing therein a specific number of previous observation probabilities; a non-linear filter for calculating a filtered probability by using the previous observation probabilities stored in the buffer and the current observation probability; and a maximum likelihood calculator for calculating a partial maximum likelihood by using the filtered probability.
Type: Grant
Filed: July 21, 2009
Date of Patent: December 11, 2012
Assignee: Electronics and Telecommunications Research Institute
Inventors: Hoon Chung, Jeon Gue Park, Yunkeun Lee, Ho-Young Jung, Hyung-Bae Jeon, Jeom Ja Kang, Sung Joo Lee, Euisok Chung, Ji Hyun Wang, Byung Ok Kang, Ki-young Park, Jong Jin Kim
-
Patent number: 8332207
Abstract: Systems, methods, and computer program products for machine translation are provided. In some implementations a system is provided. The system includes a language model including a collection of n-grams from a corpus, each n-gram having a corresponding relative frequency in the corpus and an order n corresponding to a number of tokens in the n-gram, each n-gram corresponding to a backoff n-gram having an order of n-1 and a collection of backoff scores, each backoff score associated with an n-gram, the backoff score determined as a function of a backoff factor and a relative frequency of a corresponding backoff n-gram in the corpus.
Type: Grant
Filed: June 22, 2007
Date of Patent: December 11, 2012
Assignee: Google Inc.
Inventors: Thorsten Brants, Ashok C. Popat, Peng Xu, Franz J. Och, Jeffrey Dean
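Illustration (not taken from the patent): the abstract describes scoring an n-gram by its relative frequency and, when it is unseen, by a backoff factor times the score of the order n-1 n-gram. A minimal sketch of that idea follows. The function names `count_ngrams` and `backoff_score` and the factor `alpha=0.4` are assumptions chosen for the example.

```python
from collections import Counter


def count_ngrams(tokens, max_order):
    """Count every n-gram of order 1..max_order in a token list."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def backoff_score(ngram, counts, total_tokens, alpha=0.4):
    """Relative frequency if the n-gram was seen; otherwise alpha times
    the score of the backoff n-gram (the n-gram minus its first token).
    alpha is a hypothetical backoff factor."""
    ngram = tuple(ngram)
    factor = 1.0
    while len(ngram) > 1:
        if counts[ngram] > 0:
            # Seen n-gram: frequency relative to its context.
            return factor * counts[ngram] / counts[ngram[:-1]]
        ngram = ngram[1:]   # drop the oldest token
        factor *= alpha     # pay the backoff factor once per step
    return factor * counts[ngram] / total_tokens


tokens = "the cat sat on the mat".split()
counts = count_ngrams(tokens, max_order=3)
score = backoff_score(("sat", "the", "cat"), counts, len(tokens))
```

The scores are not normalized probabilities; the sketch only shows how one backoff factor and the lower-order relative frequency combine.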
-
Patent number: 8321220
Abstract: A system and method are disclosed for providing semi-supervised learning for a spoken language understanding module using semantic role labeling. The method embodiment relates to a method of generating a spoken language understanding module. Steps in the method comprise selecting at least one predicate/argument pair as an intent from a set of the most frequent predicate/argument pairs for a domain, labeling training data using mapping rules associated with the selected at least one predicate/argument pair, training a call-type classification model using the labeled training data, re-labeling the training data using the call-type classification model and iteratively repeating several of the above steps until training set labels converge.
Type: Grant
Filed: November 30, 2005
Date of Patent: November 27, 2012
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
-
Patent number: 8315870
Abstract: A distance calculation unit (16) obtains the acoustic distance between the feature amount of input speech and each phonetic model. A word search unit (17) performs a word search based on the acoustic distance and a language model including the phoneme and prosodic label of a word, and outputs a word hypothesis and a first score representing the likelihood of the word hypothesis. The word search unit (17) also outputs a vowel interval and its tone label in the input speech, when assuming that the recognition result of the input speech is the word hypothesis. A tone recognition unit (21) outputs a second score representing the likelihood of the tone label output from the word search unit (17) based on a feature amount corresponding to the vowel interval output from the word search unit (17). A rescore unit (22) corrects the first score of the word hypothesis output from the word search unit (17) using the second score output from the tone recognition unit (21).
Type: Grant
Filed: August 22, 2008
Date of Patent: November 20, 2012
Assignee: NEC Corporation
Inventor: Ken Hanazawa
-
Patent number: 8311825
Abstract: A system for calculating the look ahead probabilities at the nodes in a language model look ahead tree, wherein the words of the vocabulary of the language are located at the leaves of the tree, said apparatus comprising: means to assign a language model probability to each of the words of the vocabulary using a first low order language model; means to calculate the language look ahead probabilities for all nodes in said tree using said first language model; means to determine if the language model probability of one or more words of said vocabulary can be calculated using a higher order language model and updating said words with the higher order language model; and means to update the look ahead probability at only the nodes which are affected by the words where the language model has been updated.
Type: Grant
Filed: October 3, 2008
Date of Patent: November 13, 2012
Assignee: Kabushiki Kaisha Toshiba
Inventor: Langzhou Chen
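Illustration (not taken from the patent): a rough sketch of the look-ahead idea in the abstract, assuming the look-ahead value at a node is the maximum probability of any word below it, and that words are spelled as letter sequences instead of phonemes. `Node`, `build_tree`, and `update_word` are hypothetical names. After a word receives a higher-order probability, only its ancestors are revisited, and the walk stops as soon as a node's value is unchanged.

```python
class Node:
    def __init__(self, parent=None):
        self.parent = parent
        self.children = {}
        self.word_prob = None   # set only where a word ends
        self.lookahead = 0.0    # best word probability below this node


def _local_max(node):
    best = node.word_prob or 0.0
    for child in node.children.values():
        best = max(best, child.lookahead)
    return best


def _fill(node):
    for child in node.children.values():
        _fill(child)
    node.lookahead = _local_max(node)


def build_tree(low_order_probs):
    """Build the tree and fill every node from the low-order model."""
    root, leaves = Node(), {}
    for word, prob in low_order_probs.items():
        node = root
        for symbol in word:
            if symbol not in node.children:
                node.children[symbol] = Node(parent=node)
            node = node.children[symbol]
        node.word_prob = prob
        leaves[word] = node
    _fill(root)
    return root, leaves


def update_word(leaves, word, new_prob):
    """Apply a higher-order probability; refresh only affected ancestors.
    Returns the number of nodes visited."""
    node = leaves[word]
    node.word_prob = new_prob
    visited = 0
    while node is not None:
        visited += 1
        value = _local_max(node)
        if value == node.lookahead:
            break               # nothing above this node can change
        node.lookahead = value
        node = node.parent
    return visited


root, leaves = build_tree({"cat": 0.5, "car": 0.3, "dog": 0.2})
visited = update_word(leaves, "car", 0.4)
```

In this example the update to "car" stops after two nodes, because the shared prefix still has "cat" as its best continuation.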
-
Patent number: 8306818
Abstract: Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.
Type: Grant
Filed: April 15, 2008
Date of Patent: November 6, 2012
Assignee: Microsoft Corporation
Inventors: Ciprian Chelba, Alejandro Acero, Milind Mahajan
-
Publication number: 20120278076
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information are described. A method includes determining, for each of multiple communications that were initiated by a user of a mobile device, a time when the communication was initiated or received; determining, for each of multiple contacts associated with the user, a probability associated with the contact based at least on the times when the communications were initiated or received; weighting a contact disambiguation grammar according to the probabilities; and processing audio data using the contact disambiguation grammar to select a particular contact.
Type: Application
Filed: July 10, 2012
Publication date: November 1, 2012
Applicant: GOOGLE INC.
Inventors: Matthew I. Lloyd, Willard Van Tuyl Rusch, II
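Illustration (not taken from the application): one way to turn communication times into per-contact probabilities is to let each communication contribute a weight that decays with its age, then normalize. The exponential decay, the `half_life_days` parameter, and both function names are assumptions made for this sketch. The abstract does not say which function of time is used.

```python
def contact_probabilities(call_log, now, half_life_days=7.0):
    """call_log: list of (contact, time_in_days) pairs.
    Returns contact -> probability, favoring recent communications."""
    scores = {}
    for contact, when in call_log:
        age = max(0.0, now - when)
        # Hypothetical decay: a call loses half its weight per half-life.
        scores[contact] = scores.get(contact, 0.0) + 0.5 ** (age / half_life_days)
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()} if total else {}


def pick_contact(recognizer_scores, priors):
    """Weight the recognizer's scores for confusable names by the priors."""
    return max(recognizer_scores,
               key=lambda c: recognizer_scores[c] * priors.get(c, 0.0))


priors = contact_probabilities([("Ann", 14.0), ("Bob", 0.0)], now=14.0)
choice = pick_contact({"Ann": 0.4, "Bob": 0.6}, priors)
```

Here the recognizer slightly prefers "Bob", but the more recent communication with "Ann" outweighs that preference.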
-
Patent number: 8296141
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by (1) identifying word and phone alignments and corresponding likelihood scores, and (2) discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function.
Type: Grant
Filed: November 19, 2008
Date of Patent: October 23, 2012
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Mazin Gilbert, Alistair D. Conkie, Andrej Ljolje
-
Publication number: 20120253807
Abstract: A speaker state detecting apparatus comprises: an audio input unit for acquiring, at least, a first voice emanated by a first speaker and a second voice emanated by a second speaker; a speech interval detecting unit for detecting an overlap period between a first speech period of the first speaker included in the first voice and a second speech period of the second speaker included in the second voice, which starts before the first speech period, or an interval between the first speech period and the second speech period; a state information extracting unit for extracting state information representing a state of the first speaker from the first speech period; and a state detecting unit for detecting the state of the first speaker in the first speech period based on the overlap period or the interval and the first state information.
Type: Application
Filed: February 3, 2012
Publication date: October 4, 2012
Applicant: FUJITSU LIMITED
Inventor: Akira KAMANO
-
Patent number: 8275615
Abstract: A translation method and system include a recognition engine having a plurality of models each being employed to decode a same utterance to provide an output. A model combiner is configured to assign probabilities to each model output and configured to assign weights to the outputs of the plurality of models based on the probabilities to provide a best performing model for the context of the utterance.
Type: Grant
Filed: July 13, 2007
Date of Patent: September 25, 2012
Assignee: International Business Machines Corporation
Inventors: Suleyman S. Kozat, Ruhi Sarikaya
-
Patent number: 8275616
Abstract: The present invention relates to a continuous speech recognition system that is very robust in a noisy environment. In order to recognize continuous speech smoothly in a noisy environment, the system selects call commands, configures a minimum recognition network in token, which consists of the call commands and mute intervals including noises, recognizes the inputted speech continuously in real time, analyzes the reliability of speech recognition continuously and recognizes the continuous speech from a speaker. When a speaker delivers a call command, the system for detecting the speech interval and recognizing continuous speech in a noisy environment through the real-time recognition of call commands measures the reliability of the speech after recognizing the call command, and recognizes the speech from the speaker by transferring the speech interval following the call command to a continuous speech-recognition engine at the moment when the system recognizes the call command.
Type: Grant
Filed: April 22, 2009
Date of Patent: September 25, 2012
Assignee: KoreaPowerVoice Co., Ltd.
Inventors: Heui-Suck Jung, Se-Hoon Chin, Tae-Young Roh
-
Publication number: 20120232901
Abstract: A language identification system that includes a universal phoneme decoder (UPD) is described. The UPD contains a universal phoneme set that both 1) represents all phonemes occurring in the set of two or more spoken languages, and 2) captures phoneme correspondences across languages, such that a set of unique phoneme patterns and probabilities are calculated in order to identify a most likely phoneme occurring each time in the audio files in the set of two or more potential languages on which the UPD was trained. Each statistical language model (SLM) uses the set of unique phoneme patterns created for each language in the set to distinguish between spoken human languages in the set of languages. The run-time language identifier module identifies a particular human language being spoken by utilizing the linguistic probabilities supplied by the SLMs that are based on the set of unique phoneme patterns created for each language.
Type: Application
Filed: May 24, 2012
Publication date: September 13, 2012
Applicant: Autonomy Corporation Ltd.
Inventors: Mahapathy Kadirkamanathan, Christopher John Waple
-
Patent number: 8255214
Abstract: A first signal of two signals to be compared for similarity is divided into small areas and one small area is selected for calculating the correlation with a second signal using a correlative method. Then, the quantity of translation, expansion rate and similarity in an area where the similarity, which is the square of the correlation value, reaches its maximum, are found. Values based on the similarity are integrated at a position represented by the quantity of translation and expansion rate. Similar processing is performed with respect to all the small areas, and at a peak where the maximum integral value of the similarity is obtained, its magnitude is compared with a threshold value to evaluate the similarity. The small area voted for that peak can be extracted.
Type: Grant
Filed: October 15, 2002
Date of Patent: August 28, 2012
Assignee: Sony Corporation
Inventors: Mototsugu Abe, Masayuki Nishiguchi
-
Patent number: 8244522
Abstract: A language understanding device includes: a language understanding model storing unit configured to store word transition data including pre-transition states, input words, predefined outputs corresponding to the input words, word weight information, and post-transition states, and concept weighting data including concepts obtained from language understanding results for at least one word, and concept weight information corresponding to the concepts; a finite state transducer processing unit configured to output understanding result candidates including the predefined outputs, to accumulate word weights so as to obtain a cumulative word weight, and to sequentially perform state transition operations; a concept weighting processing unit configured to accumulate concept weights so as to obtain a cumulative concept weight; and an understanding result determination unit configured to determine an understanding result from the understanding result candidates by referring to the cumulative word weight and the cumul
Type: Grant
Filed: May 20, 2008
Date of Patent: August 14, 2012
Assignee: Honda Motor Co., Ltd.
Inventors: Mikio Nakano, Hiroshi Okuno, Kazunori Komatani, Yuichiro Fukubayashi, Kotaro Funakoshi
-
Patent number: 8234107
Abstract: Disclosed herein is a method of grouping similar supplier names together in a database. The syntactical errors in the supplier names are corrected. The supplier names are grouped after correcting the syntactical errors. The abbreviations in the supplier names are captured. The ordering, pronunciation and stemming errors in the supplier names are corrected. A matching algorithm that matches and compares two supplier names is applied that comprises the steps of grouping supplier names based on a first set of characters in the supplier names and calculating a matching score between the two supplier names using Levenshtein distance between the two supplier names, along with the supplier names' sound codes obtained from a modified metaphone algorithm, length of each word, position of matching and mismatching characters, and stem of words in the supplier names. The matching scores are compared with set thresholds in order to further group the supplier names into clusters.
Type: Grant
Filed: February 12, 2008
Date of Patent: July 31, 2012
Assignee: Ketera Technologies, Inc.
Inventor: Ram Dayal Goyal
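Illustration (not taken from the patent): a reduced sketch of the matching step that uses only two of the listed signals, the first character and the Levenshtein distance. The sound codes, word lengths, character positions, and stems named in the abstract are left out. The normalization in `match_score` and the `threshold=0.8` default are assumptions for the example.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def match_score(a, b):
    """Hypothetical score in [0, 1]; 1.0 means identical after lowercasing."""
    a, b = a.lower().strip(), b.lower().strip()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def group_names(names, threshold=0.8):
    """Greedy clustering: join the first cluster whose representative
    shares the first character and scores at or above the threshold."""
    clusters = []
    for name in names:
        for cluster in clusters:
            rep = cluster[0]
            if rep[:1].lower() == name[:1].lower() and match_score(rep, name) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


clusters = group_names(["Acme Corp", "Acme Corp.", "Apex Ltd", "Bolt Inc"])
```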
-
Patent number: 8229745
Abstract: A method of building a mixed-initiative grammar can include receiving one or more conjoin phrases, wherein each conjoin phrase is associated with a selected one of the plurality of directed dialog grammars, and receiving a user input specifying a selected grammar generation technique. The mixed-initiative grammar can be automatically generated, in accordance with the selected grammar generation technique, such that the mixed-initiative grammar specifies an allowable ordering of sets when interpreting a user spoken utterance and whether duplicative phrases are allowable within the user spoken utterance.
Type: Grant
Filed: October 21, 2005
Date of Patent: July 24, 2012
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, David Jaramillo, Gerald M. McCobb
-
Patent number: 8225203
Abstract: User input is received, specifying a continuous traced path across a keyboard presented on a touch sensitive display. An input sequence is resolved, including traced keys and auxiliary keys proximate to the traced keys by prescribed criteria. For each of one or more candidate entries of a prescribed vocabulary, a set-edit-distance metric is computed between said input sequence and the candidate entry. Various rules specify when penalties are imposed, or not, in computing the set-edit-distance metric. Candidate entries are ranked and displayed according to the computed metric.
Type: Grant
Filed: November 4, 2010
Date of Patent: July 17, 2012
Assignee: Nuance Communications, Inc.
Inventor: Erland Unruh
-
Publication number: 20120166194
Abstract: Disclosed herein are an apparatus and method for recognizing speech. The apparatus includes a frame-based speech recognition unit, a segment division unit, a segment feature extraction unit, a segment speech recognition performance unit, and a combination and synchronization unit. The frame-based speech recognition unit extracts frame speech feature vectors from a speech signal, and performs speech recognition on frames of the speech signal using the frame speech feature vectors and a frame-based probability model. The segment division unit divides the speech signal into segments. The segment feature extraction unit extracts segment speech feature vectors around a boundary between the segments. The segment speech recognition performance unit performs speech recognition on the segments of the speech signal using the segment speech feature vectors and a segment-based probability model.
Type: Application
Filed: December 22, 2011
Publication date: June 28, 2012
Applicant: Electronics and Telecommunications Research Institute
Inventors: Ho-Young JUNG, Jeon-Gue PARK, Hoon CHUNG
-
Publication number: 20120166195
Abstract: A state detection device includes: a first model generation unit to generate a first specific speaker model obtained by modeling speech features of a specific speaker in an undepressed state; a second model generation unit to generate a second specific speaker model obtained by modeling speech features of the specific speaker in the depressed state; a likelihood calculation unit to calculate a first likelihood as a likelihood of the first specific speaker model with respect to input voice, and a second likelihood as a likelihood of the second specific speaker model with respect to the input voice; and a state determination unit to determine a state of the speaker of the input voice using the first likelihood and the second likelihood.
Type: Application
Filed: October 5, 2011
Publication date: June 28, 2012
Applicant: FUJITSU LIMITED
Inventors: Shoji HAYAKAWA, Naoshi Matsuo
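Illustration (not taken from the application): the two-model comparison can be sketched with a deliberately simple stand-in, one Gaussian per state over a single scalar feature per frame. The application does not say what the speaker models are, so the Gaussian form, the scalar feature, and the names `fit_gaussian`, `log_likelihood`, and `detect_state` are all assumptions.

```python
import math
from statistics import mean, pvariance


def fit_gaussian(samples):
    """Model one state as (mean, variance) of a scalar speech feature."""
    return mean(samples), max(pvariance(samples), 1e-6)


def log_likelihood(model, frames):
    """Sum of per-frame Gaussian log densities under the model."""
    mu, var = model
    return sum(-0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
               for x in frames)


def detect_state(undepressed_model, depressed_model, frames, margin=0.0):
    """Compare the two likelihoods; margin is a hypothetical decision bias."""
    first = log_likelihood(undepressed_model, frames)
    second = log_likelihood(depressed_model, frames)
    return "depressed" if second - first > margin else "undepressed"


# Made-up feature values, used only to exercise the functions.
undepressed = fit_gaussian([200, 210, 190, 205, 195])
depressed = fit_gaussian([150, 155, 145, 152, 148])
state = detect_state(undepressed, depressed, [151, 149, 153])
```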
-
Patent number: 8209172
Abstract: Pattern recognition capable of robust identification for the variance of an input pattern is performed with a low processing cost while the possibility of identification errors is decreased. In a pattern recognition apparatus which identifies the pattern of input data from a data input unit (11) by using a hierarchical feature extraction processor (12) which hierarchically extracts features, an extraction result distribution analyzer (13) analyzes a distribution of at least one feature extraction result obtained by a primary feature extraction processor (121). On the basis of the analytical result, a secondary feature extraction processor (122) performs predetermined secondary feature extraction.
Type: Grant
Filed: December 16, 2004
Date of Patent: June 26, 2012
Assignee: Canon Kabushiki Kaisha
Inventors: Yusuke Mitarai, Masakazu Matsuga, Katsuhiko Mori
-
Patent number: 8204749
Abstract: A system, method and computer-readable medium for practicing a method of emotion detection during a natural language dialog between a human and a computing device are disclosed. The method includes receiving an utterance from a user in a natural language dialog, receiving contextual information regarding the natural language dialog which is related to changes of emotion over time in the dialog, and detecting an emotion of the user based on the received contextual information. Examples of contextual information include differential statistics, joint statistics and distance statistics.
Type: Grant
Filed: March 21, 2011
Date of Patent: June 19, 2012
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Dilek Z. Hakkani-Tur, Jackson J. Liscombe, Guiseppe Riccardi
-
Patent number: 8195458
Abstract: A method of semantically classifying a data set of open class nouns and a system for executing the method. The method includes loading, by a processing device, a data set comprising one or more open class nouns from a computer readable medium operably connected to the processing device; extracting, by the processing device, the one or more open class nouns from the data set; for each open class noun, querying, by the processing device, one or more application programming interfaces (APIs) to produce one or more results; deriving, by the processing device, a confidence score for the data set based upon the one or more results; and determining, by the processing device, a classification for the data set based upon the derived confidence score.
Type: Grant
Filed: August 17, 2010
Date of Patent: June 5, 2012
Assignee: Xerox Corporation
Inventors: Michael David Shepherd, Kirk J. Ocke, Barry Glynn Gombert, Dale Ellen Gaucas
-
Patent number: 8195436
Abstract: A system for simulating interdependencies between multiple critical physical infrastructure models, including a first infrastructure data model that models a first critical physical infrastructure, a second infrastructure data model that models a second critical physical infrastructure, wherein the second critical physical infrastructure is a different critical physical infrastructure from the first critical physical infrastructure, a simulation engine including a visualization application and adapted to automatically produce a change in the second infrastructure data model in response to a change in the first infrastructure data model, and a user interface permitting a user to interact with the simulation engine.
Type: Grant
Filed: January 11, 2010
Date of Patent: June 5, 2012
Assignee: Intepoint, LLC
Inventors: William J Tolone, Bei-tseng Chu
-
Patent number: 8195459
Abstract: Outputs of an automatic probabilistic event detection system, such as a fact extraction system, a speech-to-text engine or an automatic character recognition system, are matched with comparable results produced manually or by a different system. This comparison allows statistical modeling of the run-time behavior of the event detection system. This model can subsequently be used to give supplemental or replacement data for an output sequence of the system. In particular, the model can effectively calibrate the system for use with data of a particular statistical nature.
Type: Grant
Filed: September 6, 2010
Date of Patent: June 5, 2012
Assignee: Verint Americas, Inc.
Inventor: Michael Brand
-
Patent number: 8195455
Abstract: Provided are an apparatus and a method capable of recognizing a sound through a reduced burden of computations and a noise-tolerant technique. The sound recognition apparatus in a portable device includes a memory unit that stores at least one base sound and a sound input unit that receives a sound input. The sound recognition apparatus also includes a control unit that receives the sound input from the sound input unit, extracts peak values of the sound input, calculates statistical data by using the peak values, and determines whether the sound input is equal to a base sound by using the statistical data.
Type: Grant
Filed: February 18, 2009
Date of Patent: June 5, 2012
Assignee: Samsung Electronics Co., Ltd
Inventor: Hyun Soo Kim
-
Patent number: 8190430
Abstract: A method and system for using input signal quality in an automatic speech recognition system. The method includes measuring the quality of an input signal into a speech recognition system and varying a rejection threshold of the speech recognition system at runtime in dependence on the measurement of the input signal quality. If the measurement of the input signal quality is low, the rejection threshold is reduced and, if the measurement of the input signal quality is high, the rejection threshold is increased. The measurement of the input signal quality may be based on one or more of the measurements of signal-to-noise ratio, loudness, including clipping, and speech signal duration.
Type: Grant
Filed: August 9, 2011
Date of Patent: May 29, 2012
Assignee: Nuance Communications, Inc.
Inventors: John Doyle, John Brian Pickering
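Illustration (not taken from the patent): a small sketch of a rejection threshold that moves with signal quality, using signal-to-noise ratio as the only quality measure. The linear mapping and every numeric default (`base`, `low_snr`, `high_snr`, `swing`) are assumptions. The abstract only states the direction of the adjustment.

```python
def rejection_threshold(snr_db, base=0.5, low_snr=10.0, high_snr=30.0, swing=0.2):
    """Map SNR to a threshold: lower for poor signals, higher for clean ones."""
    position = (snr_db - low_snr) / (high_snr - low_snr)
    position = min(1.0, max(0.0, position))   # clamp to [0, 1]
    return base + swing * (2.0 * position - 1.0)


def accept(confidence, snr_db):
    """Keep a recognition result only if it clears the current threshold."""
    return confidence >= rejection_threshold(snr_db)


noisy = accept(0.4, snr_db=5.0)    # threshold is reduced for a noisy input
clean = accept(0.4, snr_db=35.0)   # threshold is raised for a clean input
```

With these assumed defaults, the same confidence of 0.4 is accepted on the noisy input and rejected on the clean one.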
-
Patent number: 8175874
Abstract: A method of transferring a real-time audio signal transmission, including: registering voice patterns (or other characteristics) of one or more users to be used to identify the voices of the users, accepting an audio signal as it is created as a sequence of segments, analyzing each segment of the accepted audio signal to determine if it contains voice activity (314), determining a probability level that the voice activity of the segment is of a registered user (320 & 322); and selectively transferring the contents of a segment responsive to the determined probability level (324).
Type: Grant
Filed: July 18, 2006
Date of Patent: May 8, 2012
Inventor: Shaul Shimhi
-
Patent number: 8175878
Abstract: Systems, methods, and apparatuses, including computer program products, are provided for representing language models. In some implementations, a computer-implemented method is provided. The method includes generating a compact language model including receiving a collection of n-grams from the corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus and generating a trie representing the collection of n-grams. The method also includes using the language model to identify a second probability of a particular string of words occurring.
Type: Grant
Filed: December 14, 2010
Date of Patent: May 8, 2012
Assignee: Google Inc.
Inventors: Ciprian Chelba, Thorsten Brants
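Illustration (not taken from the patent): a plain dictionary-based trie that stores one probability per n-gram and multiplies the longest stored match at each position to score a word string. It does not attempt the compact encoding the patent is about, and it applies no backoff penalty when it falls back to a shorter n-gram. `NgramTrie`, `max_order`, and `floor` are hypothetical names.

```python
class TrieNode:
    __slots__ = ("children", "prob")

    def __init__(self):
        self.children = {}
        self.prob = None   # probability of the n-gram ending at this node


class NgramTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, ngram, prob):
        node = self.root
        for token in ngram:
            node = node.children.setdefault(token, TrieNode())
        node.prob = prob

    def lookup(self, ngram):
        node = self.root
        for token in ngram:
            node = node.children.get(token)
            if node is None:
                return None
        return node.prob

    def string_probability(self, words, max_order=2, floor=1e-6):
        """Multiply, per word, the longest stored n-gram ending at it."""
        total = 1.0
        for i in range(len(words)):
            prob = None
            for n in range(min(max_order, i + 1), 0, -1):
                prob = self.lookup(words[i - n + 1:i + 1])
                if prob is not None:
                    break
            total *= prob if prob is not None else floor
        return total


trie = NgramTrie()
trie.insert(("the",), 0.5)
trie.insert(("cat",), 0.1)
trie.insert(("the", "cat"), 0.4)
p = trie.string_probability(["the", "cat"])
```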
-
Publication number: 20120109651
Abstract: A method of searching a plurality of data files, wherein each data file includes a plurality of features. The method: determines a plurality of feature groups, wherein each feature group includes n features and n is an integer of 2 or more; expresses each data file as a file vector, wherein each component of the vector indicates the frequency of a feature group within the data file, wherein the n features which constitute a feature group do not have to be located adjacent to one another; expresses a search query using the feature groups as a vector; and searches the plurality of data files by comparing the search query expressed as a vector with the file vectors.
Type: Application
Filed: April 16, 2009
Publication date: May 3, 2012
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventor: Langzhou Chen
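Illustration (not taken from the application): a sketch in which the features are words, a feature group is an ordered tuple of n words that co-occur within a short window (so they need not be adjacent), and vectors are compared by cosine similarity. The window size, the ordering of group members, and the choice of cosine similarity are assumptions. The abstract only says that the query vector is compared with the file vectors.

```python
import math
from collections import Counter
from itertools import combinations


def group_vector(features, n=2, window=4):
    """Count groups of n features that start at each position and whose
    other members fall inside the following window."""
    counts = Counter()
    for start in range(len(features)):
        head = features[start]
        tail = features[start + 1:start + window]
        for rest in combinations(tail, n - 1):
            counts[(head,) + rest] += 1
    return counts


def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def search(files, query, n=2, window=4):
    """Rank files by similarity between their vectors and the query vector."""
    q = group_vector(query, n, window)
    ranked = [(cosine(q, group_vector(feats, n, window)), name)
              for name, feats in files.items()]
    return sorted(ranked, reverse=True)


files = {
    "a": "speech recognition uses a language model".split(),
    "b": "the weather is sunny today".split(),
}
ranking = search(files, "speech language model".split())
```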
-
Patent number: 8170873
Abstract: An approach to comparing events in word spotting, such as comparing putative and reference instances of a keyword, makes use of a set of models of subword units. For each of two acoustic events and for each of a series of times in each of the events, a probability associated with each of the models of the set of subword units is computed. Then, a quantity characterizing a comparison of the two acoustic events, one occurring in each of the two acoustic signals, is computed using the computed probabilities associated with each of the models.
Type: Grant
Filed: July 22, 2004
Date of Patent: May 1, 2012
Assignee: Nexidia Inc.
Inventor: Robert W. Morris
-
Publication number: 20120101820
Abstract: A method is disclosed for applying a multi-state barge-in acoustic model in a spoken dialogue system. The method includes receiving an audio speech input from the user during the presentation of a prompt, accumulating the audio speech input from the user, applying a non-speech component having at least two one-state Hidden Markov Models (HMMs) to the audio speech input from the user, applying a speech component having at least five three-state HMMs to the audio speech input from the user, in which each of the five three-state HMMs represents a different phonetic category, determining whether the audio speech input is a barge-in-speech input from the user, and if the audio speech input is determined to be the barge-in-speech input from the user, terminating the presentation of the prompt.
Type: Application
Filed: October 24, 2011
Publication date: April 26, 2012
Applicant: AT&T Intellectual Property I, L.P.
Inventor: Andrej Ljolje
-
Patent number: 8165877
Abstract: A voice search system has a speech recognizer, a search component, and a dialog manager. A confidence measure generator receives speech recognition features from the speech recognizer, search features from the search component, and dialog features from the dialog manager, and calculates an overall confidence measure for voice search results based upon the features received. The invention can be extended to include the generation of additional features, based on those received from the individual components of the voice search system.
Type: Grant
Filed: August 3, 2007
Date of Patent: April 24, 2012
Assignee: Microsoft Corporation
Inventors: Ye-Yi Wang, Yun-Cheng Ju, Dong Yu
-
Publication number: 20120095762
Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting at least one frame from among the frames that are previously positioned with respect to a frame of the first speech.
Type: Application
Filed: October 19, 2011
Publication date: April 19, 2012
Applicants: SEOUL NATIONAL UNIVERSITY INDUSTRY FOUNDATION, SAMSUNG ELECTRONICS CO., LTD.
Inventors: Ki-wan EOM, Chang-woo HAN, Tae-gyoon KANG, Nam-soo KIM, Doo-hwa HONG, Jae-won LEE, Hyung-joon LIM
-
Patent number: 8145485
Abstract: A device receives a voice recognition statistic from a voice recognition application and applies a grammar improvement rule based on the voice recognition statistic. The device also automatically adjusts a weight of the voice recognition statistic based on the grammar improvement rule, and outputs the weight adjusted voice recognition statistic for use in the voice recognition application.
Type: Grant
Filed: April 29, 2011
Date of Patent: March 27, 2012
Assignee: Verizon Patent and Licensing Inc.
Inventor: Kevin W. Brown
-
Patent number: 8145484
Abstract: The described implementations relate to speech spelling by a user. One method identifies one or more symbols that may match a user utterance and displays an individual symbol for confirmation by the user.
Type: Grant
Filed: November 11, 2008
Date of Patent: March 27, 2012
Assignee: Microsoft Corporation
Inventor: Geoffrey Zweig
-
Publication number: 20120072215
Abstract: A method is disclosed herein that includes an act of causing a processor to access a deep-structured model retained in a computer-readable medium, wherein the deep-structured model comprises a plurality of layers with weights assigned thereto, transition probabilities between states, and language model scores. The method can further include the act of jointly substantially optimizing the weights, the transition probabilities, and the language model scores of the deep-structured model using the optimization criterion based on a sequence rather than a set of unrelated frames.
Type: Application
Filed: September 21, 2010
Publication date: March 22, 2012
Applicant: Microsoft Corporation
Inventors: Dong Yu, Li Deng, Abdel-rahman Samir Abdel-rahman Mohamed
-
Publication number: 20120072216
Abstract: A method and device are configured to receive voice data from a user and perform speech recognition on the received voice data. A confidence score is calculated that represents the likelihood that received voice data has been accurately recognized. A likely age range is determined associated with the user based on the confidence score.
Type: Application
Filed: November 30, 2011
Publication date: March 22, 2012
Applicant: VERIZON PATENT AND LICENSING INC.
Inventor: Kevin R. Witzman
-
Patent number: 8136154
Abstract: Hidden Markov Models (“HMMs”) are used to analyze keystroke dynamics measurements collected as a user types a predetermined string on a keyboard. A user enrolls by typing the predetermined string several times; the enrollment samples are used to train a HMM to identify the user. A candidate who claims to be the user provides a typing sample, and the HMM produces a probability to estimate the likelihood that the candidate is the user he claims to be. A computationally-efficient method for preparing HMMs to analyze certain types of processes is also described.
Type: Grant
Filed: May 6, 2008
Date of Patent: March 13, 2012
Assignees: The Penn State Foundation, Louisiana Tech University Research Foundation
Inventors: Vir V. Phoha, Shashi Phoha, Asok Ray, Shrijit Sudhakar Joshi, Sampath Kumar Vuyyuru
-
Patent number: 8135699
Abstract: A server-side summarization system includes a function for acquiring material to be summarized, along with source information about the material, a converter for converting the acquired material to machine-readable form, if not in that form when acquired, a summarizer for creating a summary from the acquired material, and a storage function for storing a copy of the acquired material and the summary created as separate files, associated and cross-referenced using the source information.
Type: Grant
Filed: June 21, 2006
Date of Patent: March 13, 2012
Inventors: Puneet K. Gupta, Mark A. Boys
-
Patent number: 8135586
Abstract: Disclosed is a method and an apparatus for estimating noise included in a sound signal during sound signal processing. The method includes estimating harmonics components in a frame of an input sound signal; using the estimated harmonics components, computing a Voice Presence Probability (VPP) on the frame of the input sound signal; determining a weight of an equation necessary to estimate a noise spectrum, depending on the computed VPP; and using the determined weight and the equation necessary to estimate a noise spectrum, estimating the noise spectrum, and updating the noise spectrum.
Type: Grant
Filed: March 21, 2008
Date of Patent: March 13, 2012
Assignees: Samsung Electronics Co., Ltd, Korea University Industrial & Academic Collaboration Foundation
Inventors: Hyun-Soo Kim, Hanseok Ko, Sung-Joo Ahn, Jounghoon Beh, Hyun-Jin Yoon
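The abstract above does not give the estimation equation itself. As a rough illustration only, the sketch below assumes a common recursive-averaging form in which the smoothing weight grows with the Voice Presence Probability, so the noise estimate is frozen when speech is likely and tracks the frame spectrum when it is not. The function name, the `alpha_min` parameter, and the linear mapping from VPP to weight are all assumptions, not taken from the patent.

```python
def update_noise_spectrum(noise_psd, frame_psd, vpp, alpha_min=0.85):
    """Update a per-bin noise power estimate using a VPP-dependent weight.

    Assumed form (not from the patent):
        alpha = alpha_min + (1 - alpha_min) * vpp
        noise[k] <- alpha * noise[k] + (1 - alpha) * frame[k]
    """
    if not 0.0 <= vpp <= 1.0:
        raise ValueError("vpp must be a probability in [0, 1]")
    if len(noise_psd) != len(frame_psd):
        raise ValueError("spectra must have the same number of bins")

    # High VPP -> alpha near 1 -> keep the old noise estimate.
    # Low VPP  -> alpha near alpha_min -> follow the current frame.
    alpha = alpha_min + (1.0 - alpha_min) * vpp
    return [alpha * n + (1.0 - alpha) * y for n, y in zip(noise_psd, frame_psd)]
```

With `vpp=1.0` the estimate is left unchanged, and with `vpp=0.0` each bin moves toward the current frame by a fraction `1 - alpha_min`.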
-
Patent number: 8131543
Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving an audio signal, determining an energy-independent component of a portion of the audio signal associated with a spectral shape of the portion, and determining an energy-dependent component of the portion associated with a gain level of the portion. The method also comprises comparing the energy-independent and energy-dependent components to a speech model, comparing the energy-independent and energy-dependent components to a noise model, and outputting an indication whether the portion of the audio signal more closely corresponds to the speech model or to the noise model based on the comparisons.
Type: Grant
Filed: April 14, 2008
Date of Patent: March 6, 2012
Assignee: Google Inc.
Inventors: Ron J. Weiss, Trausti Kristjansson
-
Patent number: 8112275
Abstract: The systems and methods described herein may recognize natural language utterances that include queries and/or commands and execute the queries and/or commands based on user-specific profiles. The systems and methods described herein may include a complete speech-based information query, retrieval, presentation and command environment that makes significant use of context, prior information, domain knowledge, and the user-specific profiles to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a complete speech-based natural language query and response environment can be created and tailored to specific users. For example, the systems and methods described herein may create, store, and use extensive personal profile information for different users, thereby improving the reliability of determining the context and presenting the results that the specific users may expect for a particular question or command.
Type: Grant
Filed: April 22, 2010
Date of Patent: February 7, 2012
Assignee: VoiceBox Technologies, Inc.
Inventors: Robert A. Kennewick, David Locke, Michael R. Kennewick, Sr., Michael R. Kennewick, Jr., Richard Kennewick, Tom Freeman
-
Patent number: 8112274
Abstract: A method for recognizing a pattern that comprises a set of physical stimuli, said method comprising the steps of: providing a set of training observations and through applying a plurality of association models ascertaining various measuring values p_j(k|x), j = 1, ..., M, that each pertain to assigning a particular training observation to one or more associated pattern classes; setting up a log/linear association distribution by combining all association models of the plurality according to respective weight factors, and joining thereto a normalization quantity to produce a compound association distribution; optimizing said weight factors for thereby minimizing a detected error rate of the actual assigning to said compound distribution; recognizing target observations representing a target pattern with the help of said compound distribution.
Type: Grant
Filed: April 30, 2002
Date of Patent: February 7, 2012
Assignee: Nuance Communications, Inc.
Inventor: Peter Beyerlein
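A log-linear combination of this kind is usually written as p(k|x) = exp(sum_j lambda_j * log p_j(k|x)) / Z(x), where Z(x) is the normalization quantity. The sketch below shows only that combination step, under the assumption that each model supplies strictly positive class probabilities; the function name and the `floor` guard are hypothetical, and the weight optimization against error rate described in the abstract is not shown.

```python
import math


def log_linear_combine(model_posteriors, weights, floor=1e-12):
    """Combine M models' class probabilities p_j(k|x) into one distribution.

    model_posteriors: list of M lists, each holding K class probabilities.
    weights: list of M weight factors lambda_j.
    Returns a list of K probabilities that sums to 1.
    """
    if len(model_posteriors) != len(weights):
        raise ValueError("need exactly one weight per model")
    num_classes = len(model_posteriors[0])

    # Weighted sum of log-probabilities for each class k.
    scores = [
        sum(w * math.log(max(p[k], floor)) for w, p in zip(weights, model_posteriors))
        for k in range(num_classes)
    ]

    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    unnormalized = [math.exp(s - top) for s in scores]
    z = sum(unnormalized)  # normalization quantity Z(x)
    return [u / z for u in unnormalized]
```

Setting one weight to 1 and the rest to 0 recovers that single model's distribution, which is a convenient sanity check before tuning the weights.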
-
Publication number: 20120010884
Abstract: Systems and methods are disclosed for displaying electronic multimedia content to a user. One computer-implemented method for manipulating electronic multimedia content includes generating, using a processor, a speech model and at least one speaker model of an individual speaker. The method further includes receiving electronic media content over a network; extracting an audio track from the electronic media content; and detecting speech segments within the electronic media content based on the speech model. The method further includes detecting a speaker segment within the electronic media content and calculating a probability of the detected speaker segment involving the individual speaker based on the at least one speaker model.
Type: Application
Filed: June 9, 2011
Publication date: January 12, 2012
Inventors: Peter F. Kocks, Guoning Hu, Ping-Hao Wu
-
Patent number: 8095363
Abstract: A method and system for monitoring an automated dialog system for the automatic recognition of language understanding errors based on a user's input communications in a task classification system. If the user's input communication cannot be understood and a task classification decision cannot be made, then further dialog may be conducted with the user if a probability of understanding the user's input communication exceeds a first threshold. Otherwise, the user may be directed to a human for assistance. In another possible embodiment, the method operates as above except that if the probability exceeds a second threshold, then further dialog may be conducted with the user using the current dialog strategy. However, if the probability falls between a first threshold and a second threshold, the dialog strategy may be adapted in order to improve the chances of conducting a successful dialog with the user.
Type: Grant
Filed: January 6, 2009
Date of Patent: January 10, 2012
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Allen Louis Gorin, Irene Langkilde Geary, Marilyn Ann Walker, Jeremy H. Wright
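The two-threshold embodiment described above can be sketched as a simple routing function. The action labels and the function name are hypothetical, and the handling of values exactly equal to a threshold (treated here as not exceeding it) is an assumption, since the abstract does not say.

```python
def choose_dialog_action(p_understand, first_threshold, second_threshold):
    """Pick the next step after a failed task classification.

    p_understand: estimated probability of understanding the user's input.
    first_threshold <= second_threshold is assumed.
    """
    if first_threshold > second_threshold:
        raise ValueError("first_threshold must not exceed second_threshold")

    if p_understand > second_threshold:
        # Confident enough: keep talking with the current dialog strategy.
        return "continue_current_strategy"
    if p_understand > first_threshold:
        # In between: keep talking, but adapt the dialog strategy.
        return "adapt_strategy"
    # Too unlikely to succeed: hand the user over to a human.
    return "route_to_human"
```

For example, with thresholds of 0.3 and 0.7, a probability of 0.5 falls between the two and selects the adapted strategy.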
-
Publication number: 20120004912
Abstract: A method and system for using input signal quality in an automatic speech recognition system. The method includes measuring the quality of an input signal into a speech recognition system and varying a rejection threshold of the speech recognition system at runtime in dependence on the measurement of the input signal quality. If the measurement of the input signal quality is low, the rejection threshold is reduced and, if the measurement of the input signal quality is high, the rejection threshold is increased. The measurement of the input signal quality may be based on one or more of the measurements of signal-to-noise ratio, loudness, including clipping, and speech signal duration.
Type: Application
Filed: August 9, 2011
Publication date: January 5, 2012
Applicant: Nuance Communications, Inc.
Inventors: John Doyle, John Brian Pickering
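One way to picture a quality-dependent rejection threshold is a linear mapping from a single quality measure to a threshold shift. The sketch below uses signal-to-noise ratio only; the function names, the linear mapping, and every numeric default are illustrative assumptions rather than values from the application.

```python
def rejection_threshold(snr_db, base=0.5, max_shift=0.2, snr_low=5.0, snr_high=25.0):
    """Return a confidence threshold that rises with input signal quality.

    Below snr_low the threshold is base - max_shift, above snr_high it is
    base + max_shift, and it is interpolated linearly in between.
    """
    clamped = min(max(snr_db, snr_low), snr_high)
    position = (clamped - snr_low) / (snr_high - snr_low)  # 0.0 .. 1.0
    return base + max_shift * (2.0 * position - 1.0)


def accept_result(confidence, snr_db):
    """Accept a recognition result if its confidence meets the threshold."""
    return confidence >= rejection_threshold(snr_db)
```

Under these defaults a result with confidence 0.4 is accepted for a noisy 5 dB input but rejected for a clean 25 dB input, which matches the direction of adjustment the abstract describes.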
-
Patent number: 8078462
Abstract: A transformation-parameter calculating unit calculates a first model parameter indicating a parameter of a speaker model for causing a first likelihood for a clean feature to maximum, and calculates a transformation parameter for causing the first likelihood to maximum. The transformation parameter transforms, for each of the speakers, a distribution of the clean feature corresponding to the identification information of the speaker to a distribution represented by the speaker model of the first model parameter. A model-parameter calculating unit transforms a noisy feature corresponding to identification information for each of speakers by using the transformation parameter, and calculates a second model parameter indicating a parameter of the speaker model for causing a second likelihood for the transformed noisy feature to maximum.
Type: Grant
Filed: October 2, 2008
Date of Patent: December 13, 2011
Assignee: Kabushiki Kaisha Toshiba
Inventors: Yusuke Shinohara, Masami Akamine