Probability Patents (Class 704/240)

Single-Sided Speech Quality Measurement

Publication number: 20110288865

Abstract: A non-intrusive speech quality estimation technique is based on statistical or probability models such as Gaussian Mixture Models (“GMMs”). Perceptual features are extracted from the received speech signal and assessed by an artificial reference model formed using statistical models. The models characterize the statistical behavior of speech features. Consistency measures between the input speech features and the models are calculated to form indicators of speech quality. The consistency values are mapped to a speech quality score using a mapping optimized using machine learning algorithms, such as Multivariate Adaptive Regression Splines (“MARS”). The technique provides competitive or better quality estimates relative to known techniques while having lower computational complexity.

Type: Application

Filed: August 1, 2011

Publication date: November 24, 2011

Inventors: Wai-Yip Chan, Tiago H. Falk, Qingfeng Xu
Methods and apparatus for audio data analysis and data mining using speech recognition

Patent number: 8055503

Abstract: A system and method provide an audio analysis intelligence tool with ad-hoc search capabilities using spoken words as an organized data form. An SQL-like interface is used to process and search audio data and combine it with other traditional data forms to enhance searching of audio segments to identify those audio segments satisfying minimum confidence levels for a match.

Type: Grant

Filed: November 1, 2006

Date of Patent: November 8, 2011

Assignee: Siemens Enterprise Communications, Inc.

Inventors: Robert Scarano, Lawrence Mark
Method and system of optimal selection strategy for statistical classifications in dialog systems

Patent number: 8050929

Abstract: An optimal selection or decision strategy is described through an example that includes use in dialog systems. The selection strategy or method includes receiving multiple predictions and multiple probabilities. The received predictions predict the content of a received input and each of the probabilities corresponds to one of the predictions. In an example dialog system, the received input includes an utterance. The selection method includes dynamically selecting a set of predictions from the received predictions by generating ranked predictions. The ranked predictions are generated by ordering the plurality of predictions according to descending probability.

Type: Grant

Filed: August 24, 2007

Date of Patent: November 1, 2011

Assignee: Robert Bosch GmbH

Inventors: Junling Hu, Fabrizio Morbini, Fuliang Weng, Xue Liu
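
The ranking step described in the abstract of patent 8050929 can be sketched as follows. This is a minimal illustration, not the patented method: the function names, the `mass` parameter, and the cumulative-probability cutoff used for the "dynamic" selection are assumptions added here, since the abstract only states that predictions are ordered by descending probability.

```python
def rank_predictions(predictions, probabilities):
    """Pair each prediction with its probability and sort by descending probability."""
    if len(predictions) != len(probabilities):
        raise ValueError("each prediction needs exactly one probability")
    return sorted(zip(predictions, probabilities),
                  key=lambda pair: pair[1], reverse=True)


def select_top(ranked, mass=0.8):
    """Hypothetical cutoff rule (an assumption, not from the abstract):
    keep ranked predictions until their cumulative probability reaches `mass`."""
    selected = []
    cumulative = 0.0
    for prediction, probability in ranked:
        selected.append((prediction, probability))
        cumulative += probability
        if cumulative >= mass:
            break
    return selected
```

With this rule the size of the selected set varies with how the probability is spread: a confident classifier yields a single prediction, an uncertain one yields several.
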
Apparatus for estimating phase error and phase error correction system using the same

Patent number: 8045646

Abstract: Provided are an apparatus for estimating a phase error and a phase error correcting system using the phase error estimating apparatus. The apparatus includes: a probability value estimating unit for estimating a negative log probability value for each transmission symbol by transforming soft output information transferred from the outside to a log a posteriori probability ratio (LAPPR) value; an APP value calculating unit for calculating an a posteriori probability (APP) value by applying a negative exponential function to the transmission symbol; an average value deciding unit for deciding an average value for each transmission symbol using the probability information entirely, partially, or selectively according to a probability information type; and a symbol phase estimating unit for estimating a phase of a symbol based on the decided average value.

Type: Grant

Filed: September 11, 2006

Date of Patent: October 25, 2011

Assignee: Electronics and Telecommunications Research Institute

Inventors: Pan-Soo Kim, Byoung-Hak Kim, Yun-Jeong Song, Deock-Gil Oh, Ho-Jin Lee, Jun Heo, Joong-Gon Ryoo
Method and system of optimal selection strategy for statistical classifications

Patent number: 8024188

Abstract: An optimal selection or decision strategy is described through an example that includes use in dialog systems. The selection strategy or method includes receiving multiple predictions and multiple probabilities. The received predictions predict the content of a received input and each of the probabilities corresponds to one of the predictions. In an example dialog system, the received input includes an utterance. The selection method includes dynamically selecting a set of predictions from the received predictions by generating ranked predictions. The ranked predictions are generated by ordering the plurality of predictions according to descending probability.

Type: Grant

Filed: August 24, 2007

Date of Patent: September 20, 2011

Assignee: Robert Bosch GmbH

Inventors: Junling Hu, Fabrizio Morbini, Fuliang Weng, Xue Liu
N-Gram Model Smoothing with Independently Controllable Parameters

Publication number: 20110224983

Abstract: Described is a technology by which a probability is estimated for a token in a sequence of tokens based upon a number of zero or more times (actual counts) that the sequence was observed in training data. The token may be a word in a word sequence, and the estimated probability may be used in a statistical language model. A discount parameter is set independently of interpolation parameters. If the sequence was observed at least once in the training data, a discount probability and an interpolation probability are computed and summed to provide the estimated probability. If the sequence was not observed, the probability is estimated by computing a backoff probability. Also described are various ways to obtain the discount parameter and interpolation parameters.

Type: Application

Filed: March 11, 2010

Publication date: September 15, 2011

Applicant: Microsoft Corporation

Inventor: Robert Carter Moore
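
The structure described in the abstract of publication 20110224983 (a discounted term plus an interpolation term for observed sequences, a backoff term for unobserved ones) can be illustrated with a small bigram sketch. Everything specific here is an assumption for illustration: the fixed `discount` and `interp` values, the unigram lower-order model, and the way the backoff weight is derived so the distribution sums to one. The application itself describes other ways to obtain these parameters.

```python
def bigram_probability(word, history, bigrams, unigrams, discount=0.5, interp=0.1):
    """Estimate P(word | history) from raw counts.

    bigrams:  dict mapping (history, word) -> count
    unigrams: dict mapping word -> count
    `discount` and `interp` are set independently of each other (hypothetical values).
    """
    total = sum(unigrams.values())

    def lower(w):
        # Lower-order (unigram) probability.
        return unigrams.get(w, 0) / total

    # Counts of every word observed after this history.
    seen = {w: c for (h, w), c in bigrams.items() if h == history}
    history_count = sum(seen.values())
    if history_count == 0:
        return lower(word)  # unknown history: fall back to the unigram model

    if word in seen:
        # Observed sequence: discounted term plus interpolation term.
        return (seen[word] - discount) / history_count + interp * lower(word)

    # Unobserved sequence: share the left-over mass among unseen words.
    observed_mass = sum((c - discount) / history_count + interp * lower(w)
                        for w, c in seen.items())
    unseen_lower = sum(lower(w) for w in unigrams if w not in seen)
    if unseen_lower == 0:
        return 0.0
    return max(0.0, 1.0 - observed_mass) * lower(word) / unseen_lower
```

The sketch assumes `interp` is small enough that the observed mass stays below one; a production smoother would enforce that constraint rather than clamp.
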
METHOD AND SYSTEM FOR ASSESSING INTELLIGIBILITY OF SPEECH REPRESENTED BY A SPEECH SIGNAL

Publication number: 20110218803

Abstract: A method for assessing intelligibility of speech represented by a speech signal includes providing a speech signal and performing a feature extraction on at least one frame of the speech signal so as to obtain a feature vector for each of the at least one frame of the speech signal. The feature vector is input to a statistical machine learning model so as to obtain an estimated posterior probability of phonemes in the at least one frame as an output including a vector of phoneme posterior probabilities of different phonemes for each of the at least one frame of the speech signal. An entropy estimation is performed on the vector of phoneme posterior probabilities of the at least one frame of the speech signal so as to evaluate intelligibility of the at least one frame of the speech signal. An intelligibility measure is output for the at least one frame of the speech signal.

Type: Application

Filed: March 4, 2011

Publication date: September 8, 2011

Applicant: DEUTSCHE TELEKOM AG

Inventors: Hamed Ketabdar, Juan-Pablo Ramirez
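
The entropy step in the abstract of publication 20110218803 can be sketched as below. The Shannon entropy of a posterior vector is standard; the `intelligibility` mapping (one minus the normalized entropy) is a hypothetical choice made here for illustration, and the abstract does not say how entropy is turned into the output measure.

```python
import math


def frame_entropy(posteriors):
    """Shannon entropy, in bits, of one frame's phoneme posterior vector."""
    total = sum(posteriors)
    if total <= 0:
        raise ValueError("posterior vector must have positive mass")
    entropy = 0.0
    for p in posteriors:
        q = p / total  # renormalize in case the vector does not sum to 1
        if q > 0:
            entropy -= q * math.log2(q)
    return entropy


def intelligibility(posteriors):
    """Hypothetical score in [0, 1]: 1 for a one-hot posterior, 0 for a uniform one."""
    n = len(posteriors)
    if n < 2:
        return 1.0
    return 1.0 - frame_entropy(posteriors) / math.log2(n)
```

A sharply peaked posterior (the model is sure which phoneme was spoken) gives low entropy and a high score; a flat posterior gives high entropy and a low score.
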
Frequency compensation for perceptual speech analysis

Patent number: 8014999

Abstract: The invention provides a softscaled frequency compensation function that allows the evaluation of a first quality measure indicating a global impact of all distortions in an audio transmission system, including linear frequency response distortions, and a second quality measure that only takes into account the impact of linear frequency response distortions. The softscaled frequency compensation function is derived from a softscaled ratio between a time integrated output and a time integrated input power density functions. The first quality measure is derived from the difference loudness density function as function of time and frequency, using the frequency compensated input loudness density function and the gain compensated output loudness density function both as a function of time and frequency, in the same manner as carried out in ITU standard P.862.

Type: Grant

Filed: September 20, 2005

Date of Patent: September 6, 2011

Assignee: Nederlandse Organisatie voor toegepast-natuurwetenschappelijk Onderzoek TNO

Inventor: John Gerard Beerends
Audio source separation based on flexible pre-trained probabilistic source models

Patent number: 8014536

Abstract: Improved audio source separation is provided by providing an audio dictionary for each source to be separated. Thus the invention can be regarded as providing “partially blind” source separation as opposed to the more commonly considered “blind” source separation problem, where no prior information about the sources is given. The audio dictionaries are probabilistic source models, and can be derived from training data from the sources to be separated, or from similar sources. Thus a library of audio dictionaries can be developed to aid in source separation. An unmixing and deconvolutive transformation can be inferred by maximum likelihood (ML) given the received signals and the selected audio dictionaries as input to the ML calculation. Optionally, frequency-domain filtering of the separated signal estimates can be performed prior to reconstructing the time-domain separated signal estimates. Such filtering can be regarded as providing an “audio skin” for a recovered signal.

Type: Grant

Filed: December 1, 2006

Date of Patent: September 6, 2011

Assignee: Golden Metallic, Inc.

Inventor: Hagai Thomas Attias
Speech recognition apparatus and method thereof

Patent number: 8015007

Abstract: A speech recognition apparatus includes a first grammar storage unit configured to store one or more grammar segments, a second grammar storage unit configured to store one or more grammar segments, a first decoder configured to carry out a decoding process by referring to the grammar segment stored in the second grammar storage unit, a grammar transfer unit configured to transfer a trailing grammar segment from the first grammar storage unit to the second grammar storage unit, a second decoder configured to operate in parallel to the grammar transfer unit and carry out the decoding process by referring to the grammar segment stored in the second grammar storage unit, and a recognition control unit configured to monitor the state of transfer of the trailing grammar segment carried out by the grammar transfer unit and activate the both decoders by switching the operation thereof according to the state of transfer of the grammar segment.

Type: Grant

Filed: March 13, 2008

Date of Patent: September 6, 2011

Assignee: Kabushiki Kaisha Toshiba

Inventor: Masaru Sakai
Method and apparatus for generating a language independent document abstract

Patent number: 8005665

Abstract: A method of extracting significant phrases from one or more documents stored in a computer-readable medium. A sequence of words is read from the one or more documents and a score is determined for each word in the sequence based on the length of the word. The score for each word in the sequence is compared against a threshold score. The sequence of words is indicated to be a significant phrase if the number of words in the sequences that have a score greater than the threshold score equals or exceeds a predetermined number. A sentence containing the sequence of words is retrieved from the document, if the sequence of words is a significant phrase. An abstract of the document is searched to determine if the sentence has been previously included in the abstract. If not, the sentence is added to the abstract.

Type: Grant

Filed: July 23, 2010

Date of Patent: August 23, 2011

Assignee: Schukhaus Group GmbH, LLC

Inventors: Garnet R. Chaney, Robert F. Richardson, Seymour I. Rubinstein
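
The phrase test in the abstract of patent 8005665 can be sketched as follows. The abstract says only that each word's score is "based on the length of the word"; using the length itself as the score, the sliding window of three words, and the default `threshold` and `min_hits` values are all assumptions made for this illustration.

```python
def is_significant_phrase(words, threshold=6, min_hits=2):
    """A word sequence is significant if at least `min_hits` of its words
    score above `threshold`. Score = word length (an assumption)."""
    hits = sum(1 for word in words if len(word) > threshold)
    return hits >= min_hits


def build_abstract(sentences, window=3, threshold=6, min_hits=2):
    """Collect each sentence that contains a significant phrase, skipping
    sentences that are already in the abstract."""
    abstract = []
    for sentence in sentences:
        words = sentence.split()
        for start in range(max(1, len(words) - window + 1)):
            if is_significant_phrase(words[start:start + window], threshold, min_hits):
                if sentence not in abstract:
                    abstract.append(sentence)
                break
    return abstract
```

Because the test depends only on word length, it needs no dictionary or stop-word list, which is what makes the approach language independent.
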
Method and system for using input signal quality in speech recognition

Patent number: 8000962

Abstract: A method and system for using input signal quality in an automatic speech recognition system. The method includes measuring the quality of an input signal into a speech recognition system and varying a rejection threshold of the speech recognition system at runtime in dependence on the measurement of the input signal quality. If the measurement of the input signal quality is low, the rejection threshold is reduced and, if the measurement of the input signal quality is high, the rejection threshold is increased. The measurement of the input signal quality may be based on one or more of the measurements of signal-to-noise ratio, loudness, including clipping, and speech signal duration.

Type: Grant

Filed: May 19, 2006

Date of Patent: August 16, 2011

Assignee: Nuance Communications, Inc.

Inventors: John Doyle, John Brian Pickering
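
The threshold adjustment in the abstract of patent 8000962 can be sketched as below. The direction of the adjustment (lower threshold for low quality, higher for high quality) comes from the abstract; the normalized quality scale, the `low` and `high` cut points, and the fixed `step` are hypothetical values chosen for illustration.

```python
def adjusted_rejection_threshold(base, quality, low=0.3, high=0.7, step=0.1):
    """Return a rejection threshold adapted to input signal quality.

    base:    default rejection threshold, in [0, 1]
    quality: signal quality measurement normalized to [0, 1] (an assumption)
    """
    if quality < low:
        threshold = base - step   # poor signal: reject less aggressively
    elif quality > high:
        threshold = base + step   # clean signal: demand more confidence
    else:
        threshold = base
    return min(1.0, max(0.0, threshold))
```

In practice `quality` could be derived from signal-to-noise ratio, loudness, or utterance duration, as the abstract lists, before being passed to a function like this at runtime.
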
System and method of exploiting prosodic features for dialog act tagging in a discriminative modeling framework

Patent number: 7996214

Abstract: Disclosed are a system and method for exploiting information in an utterance for dialog act tagging. An exemplary method includes receiving a user utterance, computing at periodic intervals at least one parameter in the user utterance, quantizing the at least one parameter at each periodic interval, approximating conditional probabilities using an n-gram over a sliding window over the periodic intervals and tagging the utterance as a dialog act based on the approximated conditional probabilities.

Type: Grant

Filed: November 1, 2007

Date of Patent: August 9, 2011

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Srinivas Bangalore, Vivek Kumar Rangarajan Sridhar
SPEECH RECOGNITION ANALYSIS VIA IDENTIFICATION INFORMATION

Publication number: 20110184735

Abstract: Embodiments are disclosed that relate to the use of identity information to help avoid the occurrence of false positive speech recognition events in a speech recognition system. One embodiment provides a method comprising receiving speech recognition data comprising a recognized speech segment, acoustic locational data related to a location of origin of the recognized speech segment as determined via signals from the microphone array, and confidence data comprising a recognition confidence value, and also receiving image data comprising visual locational information related to a location of each person in an image. The acoustic locational data is compared to the visual locational data to determine whether the recognized speech segment originated from a person in the field of view of the image sensor, and the confidence data is adjusted depending on this determination.

Type: Application

Filed: January 22, 2010

Publication date: July 28, 2011

Applicant: MICROSOFT CORPORATION

Inventors: Jason Flaks, Dax Hawkins, Christian Klein, Mitchell Stephen Dernis, Tommer Leyvand, Ali M. Vassigh, Duncan McKay
WORD CATEGORY ESTIMATION APPARATUS, WORD CATEGORY ESTIMATION METHOD, SPEECH RECOGNITION APPARATUS, SPEECH RECOGNITION METHOD, PROGRAM, AND RECORDING MEDIUM

Publication number: 20110173000

Abstract: A word category estimation apparatus (100) includes a word category model (5) which is formed from a probability model having a plurality of kinds of information about a word category as features, and includes information about an entire word category graph as at least one of the features. A word category estimation unit (4) receives the word category graph of a speech recognition hypothesis to be processed, computes scores by referring to the word category model for respective arcs that form the word category graph, and outputs a word category sequence candidate based on the scores.

Type: Application

Filed: December 19, 2008

Publication date: July 14, 2011

Inventors: Hitoshi Yamamoto, Miki Kiyokazu
Method and system for Gaussian probability data bit reduction and computation

Patent number: 7970613

Abstract: Use of runtime memory may be reduced in a data processing algorithm that uses one or more probability distribution functions. Each probability distribution function may be characterized by one or more uncompressed mean values and one or more variance values. The uncompressed mean and variance values may be represented by α-bit floating point numbers, where α is an integer greater than 1. The probability distribution functions are converted to compressed probability functions having compressed mean and/or variance values represented as β-bit integers, where β is less than α, whereby the compressed mean and/or variance values occupy less memory space than the uncompressed mean and/or variance values. Portions of the data processing algorithm can be performed with the compressed mean and variance values.

Type: Grant

Filed: November 12, 2005

Date of Patent: June 28, 2011

Assignee: Sony Computer Entertainment Inc.

Inventor: Ruxin Chen
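
The idea in the abstract of patent 7970613, storing Gaussian means and variances as small integers instead of floating point numbers, can be illustrated with plain uniform quantization. This is a generic sketch under stated assumptions: linear scaling over the observed range and the 8-bit default are choices made here, and the patent's actual conversion may differ.

```python
def quantize(values, bits=8):
    """Map floating point values to integer codes in [0, 2**bits - 1].

    Returns (codes, offset, scale) so that value ~= offset + code * scale.
    """
    lowest, highest = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (highest - lowest) / levels if highest > lowest else 1.0
    codes = [round((v - lowest) / scale) for v in values]
    return codes, lowest, scale


def dequantize(codes, offset, scale):
    """Recover approximate floating point values from integer codes."""
    return [offset + code * scale for code in codes]
```

Each reconstructed value differs from the original by at most half a quantization step, which is the accuracy traded for the smaller memory footprint.
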
Continuous adaptation in detection systems via self-tuning from target population subsets

Patent number: 7970614

Abstract: The present invention provides a system and method for treating distortion propagated through a detection system. The system includes a compensation module that compensates for untreated distortions propagating through the detection compensation system, a user model pool that comprises a plurality of model sets, and a model selector that selects at least one model set from the plurality of model sets in the user model pool. The compensation is accomplished by continually producing scores distributed according to a prescribed distribution for the at least one model set and mitigating the adverse effects of the scores being distorted and lying off a pre-set operating point. The method for treating distortion propagated through a detection system includes receiving a signal from a remote device, and compensating the signal for untreated distortions.

Type: Grant

Filed: May 8, 2007

Date of Patent: June 28, 2011

Assignee: Nuance Communications, Inc.

Inventors: Janice J. Kim, Jiri Navratil, Jason W. Pelecanos, Ganesh N. Ramaswamy
SYSTEM AND METHOD FOR COMPUTING AND TRANSMITTING PARAMETERS IN A DISTRIBUTED VOICE RECOGNITION SYSTEM

Publication number: 20110153326

Abstract: A system and method for extracting acoustic features and speech activity on a device and transmitting them in a distributed voice recognition system. The distributed voice recognition system includes a local VR engine in a subscriber unit and a server VR engine on a server. The local VR engine comprises a feature extraction (FE) module that extracts features from a speech signal, and a voice activity detection module (VAD) that detects voice activity within a speech signal. The system includes filters, framing and windowing modules, power spectrum analyzers, a neural network, a nonlinear element, and other components to selectively provide an advanced front end vector including predetermined portions of the voice activity detection indication and extracted features from the subscriber unit to the server. The system also includes a module to generate additional feature vectors on the server from the received features using a feed-forward multilayer perceptron (MLP) and providing the same to the speech server.

Type: Application

Filed: February 9, 2011

Publication date: June 23, 2011

Applicant: QUALCOMM INCORPORATED

Inventors: HARINATH GARUDADRI, HYNEK HERMANSKY, LUKAS BURGET, PRATIBHA JAIN, SACHIN KAJAREKAR, SUNIL SIVADAS, STEPHANE N. DUPONT, MARIA CARMEN BENITEZ ORTUZAR, NELSON H. MORGAN
CONFIDENCE CALIBRATION IN AUTOMATIC SPEECH RECOGNITION SYSTEMS

Publication number: 20110144986

Abstract: Described is a calibration model for use in a speech recognition system. The calibration model adjusts the confidence scores output by a speech recognition engine to thereby provide an improved calibrated confidence score for use by an application. The calibration model is one that has been trained for a specific usage scenario, e.g., for that application, based upon a calibration training set obtained from a previous similar/corresponding usage scenario or scenarios. Different calibration models may be used with different usage scenarios, e.g., during different conditions. The calibration model may comprise a maximum entropy classifier with distribution constraints, trained with continuous raw confidence scores and multi-valued word tokens, and/or other distributions and extracted features.

Type: Application

Filed: December 10, 2009

Publication date: June 16, 2011

Applicant: Microsoft Corporation

Inventors: Dong Yu, Li Deng, Jinyu Li
System and Method for Processing Speech Recognition

Publication number: 20110137651

Abstract: An automatic speech recognition (ASR) system and method is provided for controlling the recognition of speech utterances generated by an end user operating a communications device. The ASR system and method can be used with a mobile device that is used in a communications network. The ASR system can be used for ASR of speech utterances input into a mobile device, to perform compensating techniques using at least one characteristic and for updating an ASR speech recognizer associated with the ASR system by determining and using a background noise value and a distortion value that is based on the features of the mobile device. The ASR system can be used to augment a limited data input capability of a mobile device, for example, caused by limited input devices physically located on the mobile device.

Type: Application

Filed: February 14, 2011

Publication date: June 9, 2011

Applicant: AT&T Intellectual Property II, L.P.

Inventors: Richard C. ROSE, Sarangarajan PATHASARATHY, Aaron Edward ROSENBERG, Shrikanth Sambasivan NARAYANAN
DIALOGUE SPEECH RECOGNITION SYSTEM, DIALOGUE SPEECH RECOGNITION METHOD, AND RECORDING MEDIUM FOR STORING DIALOGUE SPEECH RECOGNITION PROGRAM

Publication number: 20110131042

Abstract: Disclosed is a dialogue speech recognition system that can expand the scope of applications by employing a universal dialogue structure as the condition for speech recognition of dialogue speech between persons. An acoustic likelihood computation means (701) provides a likelihood that a speech signal input from a given phoneme sequence will occur. A linguistic likelihood computation means (702) provides a likelihood that a given word sequence will occur. A maximum likelihood candidate search means (703) uses the likelihoods provided by the acoustic likelihood computation means and the linguistic likelihood computation means to provide a word sequence with the maximum likelihood of occurring from a speech signal. Further, the linguistic likelihood computation means (702) provides different linguistic likelihoods when the speaker who generated the acoustic signal input to the speech recognition means does and does not have the turn to speak.

Type: Application

Filed: May 12, 2009

Publication date: June 2, 2011

Inventor: Kentaro Nagatomo
Grammar weighting voice recognition information

Patent number: 7953598

Abstract: A device receives a voice recognition statistic from a voice recognition application and applies a grammar improvement rule based on the voice recognition statistic. The device also automatically adjusts a weight of the voice recognition statistic based on the grammar improvement rule, and outputs the weight adjusted voice recognition statistic for use in the voice recognition application.

Type: Grant

Filed: December 17, 2007

Date of Patent: May 31, 2011

Assignee: Verizon Patent and Licensing Inc.

Inventor: Kevin W. Brown
System of effectively searching text for keyword, and method thereof

Patent number: 7945552

Abstract: A system of the present invention stores: a first index which designates lists of keywords contained in texts from identifications of the respective texts; a second index which designates lists of texts containing keywords from identifications of the respective keywords; and the number of texts containing the respective keywords. Then, upon receiving an input of a text search condition, the system calculates an estimation of search time by the first index and an estimation of search time by the second index, and determines which one of the first and second indexes makes a search faster. Then, by using the index which has been determined to make the search faster, the system searches for keywords which appear in texts satisfying the text search condition with higher frequency.

Type: Grant

Filed: March 26, 2008

Date of Patent: May 17, 2011

Assignee: International Business Machines Corporation

Inventors: Daisuke Takuma, Issei Yoshida, Yuta Tsuboi
Low latency real-time speech transcription

Patent number: 7941317

Abstract: Systems and methods for low-latency real-time speech recognition/transcription. A discriminative feature extraction, such as a heteroscedastic discriminant analysis transform, in combination with a maximum likelihood linear transform is applied during front-end processing of a digital speech signal. The extracted features reduce the word error rate. A discriminative acoustic model is applied by generating state-level lattices using Maximum Mutual Information Estimation. Recognition networks of language models are replaced by their closure. Latency is reduced by eliminating segmentation such that a number of words/sentences can be recognized as a single utterance. Latency is further reduced by performing front-end normalization in a causal fashion.

Type: Grant

Filed: June 5, 2007

Date of Patent: May 10, 2011

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Vincent Goffin, Michael Dennis Riley, Murat Saraclar
SYSTEM AND METHOD FOR ESTIMATING THE RELIABILITY OF ALTERNATE SPEECH RECOGNITION HYPOTHESES IN REAL TIME

Publication number: 20110099012

Abstract: Disclosed herein are systems, methods, and computer-readable storage media for estimating reliability of alternate speech recognition hypotheses. A system configured to practice the method receives an N-best list of speech recognition hypotheses and features describing the N-best list, determines a first probability of correctness for each hypothesis in the N-best list based on the received features, determines a second probability that the N-best list does not contain a correct hypothesis, and uses the first probability and the second probability in a spoken dialog. The features can describe properties of at least one of a lattice, a word confusion network, and a garbage model. In one aspect, the N-best lists are not reordered according to reranking scores. The determination of the first probability of correctness can include a first stage of training a probabilistic model and a second stage of distributing mass over items in a tail of the N-best list.

Type: Application

Filed: October 23, 2009

Publication date: April 28, 2011

Applicant: AT&T Intellectual Property I, L.P.

Inventors: Jason WILLIAMS, Suhrid BALAKRISHNAN
Natural language understanding monitoring system for identifying a task

Patent number: 7933773

Abstract: A natural language understanding monitoring system adapted to conduct an automated dialog with a user. If the system is unable to identify from the automated dialog, to at least a predetermined level of confidence, any one of a plurality of predetermined tasks as being a particular task that the user wants to have performed, the system makes a determination of the value of a probability that further automated dialog will enable the system to identify the particular task, and determines whether or not to conduct further automated dialog with the user, in an attempt to identify the particular task, based on the relative values of the determined probability and a predetermined threshold value. The probability value determination is based on inputs from the user during the automated dialog.

Type: Grant

Filed: March 25, 2009

Date of Patent: April 26, 2011

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Allen Louis Gorin, Irene Langkilde Geary, Marilyn Ann Walker, Jeremy H. Wright
Low latency real-time speech transcription

Patent number: 7930181

Abstract: Systems and methods for low-latency real-time speech recognition/transcription. A discriminative feature extraction, such as a heteroscedastic discriminant analysis transform, in combination with a maximum likelihood linear transform is applied during front-end processing of a digital speech signal. The extracted features reduce the word error rate. A discriminative acoustic model is applied by generating state-level lattices using Maximum Mutual Information Estimation. Recognition networks of language models are replaced by their closure. Latency is reduced by eliminating segmentation such that a number of words/sentences can be recognized as a single utterance. Latency is further reduced by performing front-end normalization in a causal fashion.

Type: Grant

Filed: November 21, 2002

Date of Patent: April 19, 2011

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Vincent Goffin, Michael Dennis Riley, Murat Saraclar
SPEECH RECOGNITION SYSTEM, METHOD FOR RECOGNIZING SPEECH AND ELECTRONIC APPARATUS

Publication number: 20110087492

Abstract: A speech characteristic-amount calculation circuit 31 calculates an amount of speech characteristics of each phrase in input speech. An estimation process likelihood calculation circuit 33 compares the calculated speech characteristic amount of a phrase with speech pattern sequence information of a plurality of phrases stored in a storage unit 34 to select a plurality of candidates having from a higher likelihood value to a lower likelihood value for the phrases. A recognition filtering device 4 determines whether to reject or not reject the extracted candidates based on the likelihood difference ratio between the difference in likelihood values between the first candidate and the second candidate and the difference in likelihood values between the second candidate and the third candidate.

Type: Application

Filed: May 11, 2009

Publication date: April 14, 2011

Applicant: RayTron, Inc.

Inventors: Mitsuji Yoshida, Kazutaka Hyodo
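
The rejection test in the abstract of publication 20110087492 compares two likelihood gaps: first versus second candidate, and second versus third. A minimal sketch follows. The orientation of the ratio, the handling of a zero denominator, and the `min_ratio` acceptance threshold are assumptions for illustration; the abstract does not specify them.

```python
def likelihood_difference_ratio(first, second, third):
    """Ratio of the (first - second) gap to the (second - third) gap."""
    lower_gap = second - third
    upper_gap = first - second
    if lower_gap == 0:
        # Hypothetical convention: a clear winner over two tied runners-up
        # gets an infinite ratio; a three-way tie gets zero.
        return float("inf") if upper_gap > 0 else 0.0
    return upper_gap / lower_gap


def accept_top_candidate(likelihoods, min_ratio=1.0):
    """Accept the best candidate only if it stands out from the runners-up."""
    top = sorted(likelihoods, reverse=True)[:3]
    if len(top) < 3:
        raise ValueError("at least three candidate likelihoods are required")
    return likelihood_difference_ratio(*top) >= min_ratio
```

The intuition is that a correct recognition tends to separate from the second candidate by more than the second separates from the third, while noise or out-of-vocabulary input produces closely bunched top candidates.
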
Apparatus and method for speech recognition using probability and mixed distributions

Patent number: 7921012

Abstract: A speech recognition apparatus includes a first storing unit configured to store a first acoustic model invariable regardless of speaker and environment, a second storing unit configured to store a classification model that has shared parameters and non-shared parameters with the first acoustic model to classify second acoustic models, a recognizing unit configured to calculate a first likelihood with regard to the input speech by applying the first acoustic model to the input speech and obtain calculation result on the shared parameter and a plurality of candidate words that have relatively large values as the first likelihood, and a calculating unit configured to calculate a second likelihood for each of the groups with regard to the input speech by use of the calculation result on the shared parameters and the non-shared parameters of the classification model.

Type: Grant

Filed: September 18, 2007

Date of Patent: April 5, 2011

Assignee: Kabushiki Kaisha Toshiba

Inventors: Hiroshi Fujimura, Takashi Masuko
Method for uncovering hidden Markov models

Patent number: 7912717

Abstract: The invention uses the ModelGrower program to generate possible candidates from an original or aggregated model. An isomorphic reduction program operates on the candidates to identify and exclude isomorphic models. A Markov model evaluation and optimization program operates on the remaining non-isomorphic candidates. The candidates are optimized and the ones that most closely conform to the data are kept. The best optimized candidate of one stage becomes the starting candidate for the next stage where ModelGrower and the other programs operate on the optimized candidate to generate a new optimized candidate. The invention repeats the steps of growing, excluding isomorphs, evaluating and optimizing until such repetitions yield no significantly better results.

Type: Grant

Filed: November 18, 2005

Date of Patent: March 22, 2011

Inventor: Albert Galick
System and method for building emotional machines

Patent number: 7912720

Abstract: A system, method and computer-readable medium for practicing a method of emotion detection during a natural language dialog between a human and a computing device are disclosed. The method includes receiving an utterance from a user in a natural language dialog between a human and a computing device, receiving contextual information regarding the natural language dialog which is related to changes of emotion over time in the dialog, and detecting an emotion of the user based on the received contextual information. Examples of contextual information include differential statistics, joint statistics and distance statistics.

Type: Grant

Filed: July 20, 2005

Date of Patent: March 22, 2011

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Dilek Z. Hakkani-Tur, Jackson J. Liscombe, Guiseppe Riccardi
METHOD OF RECOGNIZING SPEECH

Publication number: 20110046953

Abstract: A method for recognizing speech involves reciting, into a speech recognition system, an utterance including a numeric sequence that contains a digit string including a plurality of tokens and detecting a co-articulation problem related to at least two potentially co-articulated tokens in the digit string. The numeric sequence may be identified using i) a dynamically generated possible numeric sequence that potentially corresponds with the numeric sequence, and/or ii) at least one supplemental acoustic model. Also disclosed herein is a system for accomplishing the same.

Type: Application

Filed: August 21, 2009

Publication date: February 24, 2011

Applicant: GENERAL MOTORS COMPANY

Inventors: Uma Arun, Sherri J. Voran-Nowak, Rathinavelu Chengalvarayan, Gaurav Talwar
Device and method of modeling acoustic characteristics with HMM and collating the same with a voice characteristic vector sequence

Patent number: 7895040

Abstract: According to an embodiment, a voice recognition apparatus includes units for acoustic processing, voice interval detecting, dictionary, collating, search target selecting, storing and determining, and a voice recognition method includes processes of: selecting a search range on the basis of a beam search, setting and storing a standard frame, storing an output probability of a certain transition path, and determining whether or not the output probability of a certain path is stored. The number of output probability calculations is reduced by selecting the search range on the basis of the beam search, calculating the output probability of the certain transition path only once in an interval from when the standard frame is set to when the standard frame is renewed, and storing and using the calculated value as an approximate value of the output probability in subsequent frames.

Type: Grant

Filed: March 30, 2007

Date of Patent: February 22, 2011

Assignee: Kabushiki Kaisha Toshiba

Inventors: Masaru Sakai, Shinichi Tanaka
Intersession variability compensation for automatic extraction of information from voice

Publication number: 20110040561

Abstract: A method for compensating inter-session variability for automatic extraction of information from an input voice signal representing an utterance of a speaker, includes: processing the input voice signal to provide feature vectors each formed by acoustic features extracted from the input voice signal at a time frame; computing an intersession variability compensation feature vector; and computing compensated feature vectors based on the extracted feature vectors and the intersession variability compensation feature vector.

Type: Application

Filed: May 16, 2006

Publication date: February 17, 2011

Inventors: Claudio Vair, Daniele Colibro, Pietro Laface
Subword unit posterior probability for measuring confidence

Patent number: 7890325

Abstract: Speech recognition such as command and control speech recognition generally uses a context free grammar to constrain the decoding process. Word or subword background models are constructed to repopulate dynamic hypothesis space, especially when word sparseness is at issue. The background models can be later used in speech recognition. During speech recognition, background and conventional context free grammar decoding are used to measure confidence. The discussion above is merely provided for general background information and is not intended to be used as an aid in determining the scope of the claimed subject matter.

Type: Grant

Filed: March 16, 2006

Date of Patent: February 15, 2011

Assignee: Microsoft Corporation

Inventors: Peng Liu, Ye Tian, Jian-Lai Zhou, Frank Kao-Ping K. Soong
SPEECH RECOGNITION METHOD FOR ALL LANGUAGES WITHOUT USING SAMPLES

Publication number: 20110035216

Abstract: The invention can recognize any several languages at the same time without using samples. The important skill is that features of known words in any language are extracted from unknown words or continuous voices. These unknown words represented by matrices are spread in the 144-dimensional space. The feature of a known word of any language represented by a matrix is simulated by the surrounding unknown words. The invention includes 12 elastic frames of equal length without filter and without overlap to normalize the signal waveform of variable length for a word, which has one to several syllables, into a 12×12 matrix as a feature of the word. The invention can improve the feature such that the speech recognition of an unknown sentence is correct. The invention can correctly recognize any languages without samples, such as English, Chinese, German, French, Japanese, Korean, Russian, Cantonese, Taiwanese, etc.

Type: Application

Filed: August 5, 2009

Publication date: February 10, 2011

Inventors: Tze Fen LI, Tai-Jan Lee Li, Shih-Tzung Li, Shih-Hon Li, Li-Chuan Liao
Representing n-gram language models for compact storage and fast retrieval

Patent number: 7877258

Abstract: Systems, methods, and apparatuses, including computer program products, are provided for representing language models. In some implementations, a computer-implemented method is provided. The method includes generating a compact language model including receiving a collection of n-grams from the corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus and generating a trie representing the collection of n-grams. The method also includes using the language model to identify a second probability of a particular string of words occurring.

Type: Grant

Filed: March 29, 2007

Date of Patent: January 25, 2011

Assignee: Google Inc.

Inventors: Ciprian Chelba, Thorsten Brants
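As a rough illustration of the trie idea in the abstract of patent 7877258 above, the following sketch stores each n-gram along a path of words and keeps its probability at the final node. It is a minimal dictionary-based sketch, not the compact encoding the patent describes; the names `Node`, `NGramTrie`, `insert` and `lookup` are hypothetical, and the probability in the usage lines is a made-up value.

```python
class Node:
    """One trie node: children keyed by word, plus an optional probability."""
    __slots__ = ("children", "prob")

    def __init__(self):
        self.children = {}
        self.prob = None  # set only if an n-gram ends at this node


class NGramTrie:
    """Hypothetical n-gram store; each n-gram is a path of words from the root."""

    def __init__(self):
        self.root = Node()

    def insert(self, ngram, prob):
        node = self.root
        for word in ngram:
            node = node.children.setdefault(word, Node())
        node.prob = prob

    def lookup(self, ngram):
        """Return the stored probability, or None if the n-gram was never inserted."""
        node = self.root
        for word in ngram:
            node = node.children.get(word)
            if node is None:
                return None
        return node.prob


trie = NGramTrie()
trie.insert(("the", "cat"), 0.2)  # illustrative probability, not real data
print(trie.lookup(("the", "cat")))
```

N-grams that share a prefix share nodes, which is what makes a trie a natural starting point for compact storage.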
SPEECH RECOGNITION SYSTEM AND METHOD

Publication number: 20110015925

Abstract: A speech recognition method, comprising: receiving a speech input in a first noise environment which comprises a sequence of observations; determining the likelihood of a sequence of words arising from the sequence of observations using an acoustic model, comprising: providing an acoustic model for performing speech recognition on an input signal which comprises a sequence of observations, wherein said model has been trained to recognise speech in a second noise environment, said model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to an observation; adapting the model trained in the second environment to that of the first environment; the speech recognition method further comprising determining the likelihood of a sequence of observations occurring in a given language using a language model; combining the likelihoods determined by the acoustic model and the language model and outputting a sequence of words identified from said spee

Type: Application

Filed: March 26, 2010

Publication date: January 20, 2011

Applicant: Kabushiki Kaisha Toshiba

Inventors: Haitian Xu, Mark John Francis Gales
Partially filling mixed-initiative forms from utterances having sub-threshold confidence scores based upon word-level confidence data

Patent number: 7870000

Abstract: The present disclosure relates to prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be speech-to-text converted to derive values for each of the multiple elements. An utterance level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold and a second set can have scores below. Values can be stored in data fields mapped to the first set. A prompt for input for the second set can be played.

Type: Grant

Filed: March 28, 2007

Date of Patent: January 11, 2011

Assignee: Nuance Communications, Inc.

Inventors: Soonthorn Ativanichayaphong, Gerald M. McCobb, Paritosh D. Patel, Marc White
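The partial-fill logic in the abstract of patent 7870000 above might be sketched as follows. This is only an illustration under stated assumptions: the function name `fill_form`, the field names and the threshold values are hypothetical, and treating a score exactly equal to the threshold as "below" is a choice the abstract does not specify.

```python
def fill_form(elements, utterance_conf, utterance_threshold, element_threshold):
    """Split recognized elements into stored values and fields to re-prompt.

    elements: dict mapping field name -> (value, element-level confidence).
    Returns (stored, reprompt): a dict of accepted values and a list of
    fields that still need input.
    """
    if utterance_conf >= utterance_threshold:
        # Whole utterance is trusted: accept every element.
        return {field: value for field, (value, _) in elements.items()}, []

    stored, reprompt = {}, []
    for field, (value, conf) in elements.items():
        if conf > element_threshold:
            stored[field] = value   # confident enough to keep
        else:
            reprompt.append(field)  # assumption: ties count as "below"
    return stored, reprompt


# Hypothetical travel form with made-up confidence scores.
stored, reprompt = fill_form(
    {"city": ("Boston", 0.9), "date": ("May 3", 0.4)},
    utterance_conf=0.5,
    utterance_threshold=0.7,
    element_threshold=0.6,
)
print(stored, reprompt)
```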
NOISE ADAPTIVE TRAINING FOR SPEECH RECOGNITION

Publication number: 20100318354

Abstract: Technologies are described herein for noise adaptive training to achieve robust automatic speech recognition. Through the use of these technologies, a noise adaptive training (NAT) approach may use both clean and corrupted speech for training. The NAT approach may normalize the environmental distortion as part of the model training. A set of underlying “pseudo-clean” model parameters may be estimated directly. This may be done without point estimation of clean speech features as an intermediate step. The pseudo-clean model parameters learned from the NAT technique may be used with a Vector Taylor Series (VTS) adaptation. Such adaptation may support decoding noisy utterances during the operating phase of an automatic voice recognition system.

Type: Application

Filed: June 12, 2009

Publication date: December 16, 2010

Applicant: Microsoft Corporation

Inventors: Michael Lewis Seltzer, James Garnet Droppo, Ozlem Kalinli, Alejandro Acero
System, method, and program for correcting misrecognized spoken words by selecting appropriate correction word from one or more competitive words

Patent number: 7848926

Abstract: A speech recognition system is provided where a user may more efficiently and easily correct a recognition error resulting from speech recognition. The system compares multiple inputted words with multiple stored words and determines a most-competitive word candidate. The system selects one or more competitive words that have competitive probabilities close to the competitive probability of the most-competitive word candidate and displays the one or more competitive words adjacent to the most-competitive word candidate. The system selects an appropriate correction word from the one or more competitive words and replaces the most-competitive word candidate with the correction word.

Type: Grant

Filed: November 18, 2005

Date of Patent: December 7, 2010

Assignee: National Institute of Advanced Industrial Science and Technology

Inventors: Masataka Goto, Jun Ogata
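The selection of competitive words in the abstract of patent 7848926 above can be illustrated with a short sketch. It assumes that "close" means within a fixed probability margin of the top candidate, which the abstract does not define; the function name `competitive_words`, the `margin` parameter and the example probabilities are all hypothetical.

```python
def competitive_words(candidates, margin):
    """Pick the top candidate and the alternatives whose probability is close to it.

    candidates: dict mapping word -> competitive probability.
    margin: assumed closeness criterion (maximum allowed probability gap).
    Returns (best_word, list of competitive words in descending probability).
    """
    best = max(candidates, key=candidates.get)
    best_p = candidates[best]
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    competitors = [w for w, p in ranked if w != best and best_p - p <= margin]
    return best, competitors


# Made-up probabilities for one recognized word slot.
best, alternatives = competitive_words(
    {"speech": 0.50, "beach": 0.45, "peach": 0.10}, margin=0.1
)
print(best, alternatives)
```

A user interface could then show `alternatives` next to `best` so that a misrecognized word is corrected with a single selection.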
Method and apparatus for normalizing voice feature vector by backward cumulative histogram

Patent number: 7835909

Abstract: A method and apparatus for normalizing a histogram utilizing a backward cumulative histogram which can cumulate a probability distribution function in an order from a greatest to smallest value so as to estimate a noise robust histogram. A method of normalizing a speech feature vector includes: extracting the speech feature vector from a speech signal; calculating a probability distribution function using the extracted speech feature vector; calculating a backward cumulative distribution function by cumulating the probability distribution function in an order from a largest to smallest value; and normalizing a histogram using the backward cumulative distribution function.

Type: Grant

Filed: December 12, 2006

Date of Patent: November 16, 2010

Assignee: Samsung Electronics Co., Ltd.

Inventors: So-Young Jeong, Gil Jin Jang, Kwang Cheol Oh
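The backward cumulation step named in the abstract of patent 7835909 above can be sketched in a few lines. This is a minimal illustration that assumes equal-width bins over a known value range and covers only the cumulation from the largest to the smallest bin, not the full normalization; the function name `backward_cdf` and its parameters are hypothetical.

```python
def backward_cdf(values, bins, lo, hi):
    """Return (pdf, bcdf) for values binned into equal-width bins on [lo, hi].

    pdf[i] is the fraction of values in bin i; bcdf[i] is the sum of pdf
    over bins i..bins-1, i.e. cumulated from the largest bin downwards.
    """
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = int((v - lo) / width)
        idx = max(0, min(idx, bins - 1))  # clamp so v == hi lands in the last bin
        counts[idx] += 1

    total = len(values)
    pdf = [c / total for c in counts]

    bcdf = [0.0] * bins
    running = 0.0
    for i in range(bins - 1, -1, -1):  # largest bin first
        running += pdf[i]
        bcdf[i] = running
    return pdf, bcdf


# Toy feature values, chosen only for illustration.
pdf, bcdf = backward_cdf([0.1, 0.2, 0.6, 0.9], bins=2, lo=0.0, hi=1.0)
print(pdf, bcdf)
```

Unlike a forward cumulative distribution, `bcdf` starts at the high end of the range, so `bcdf[0]` is the total mass and the last entry is the mass of the top bin alone.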
Device control, speech recognition device, agent device, control method

Patent number: 7822614

Abstract: A language analyzer performs speech recognition on a speech input by a speech input unit, specifies a possible word which is represented by the speech, and the score thereof, and supplies word data representing them to an agent processing unit. The agent processing unit stores process item data which defines a data acquisition process to acquire word data or the like, a discrimination process, and an input/output process, and wires or data defining transition from one process to another and giving a weighting factor to the transition, and executes a flow represented generally by the process item data and the wires to thereby control devices belonging to an input/output target device group. To which process in the flow the transition takes place is determined by the weighting factor of each wire, which is determined by the connection relationship between a point where the process has proceeded and the wire, and the score of word data.

Type: Grant

Filed: December 6, 2004

Date of Patent: October 26, 2010

Assignee: Kabushikikaisha Kenwood

Inventor: Rika Koyama
State output probability calculating method and apparatus for mixture distribution HMM

Patent number: 7813925

Abstract: When adjacent times or a small change in the observation signal is determined, the distribution that maximizes the output probability of a mixture distribution is highly unlikely to change. Using this fact, when the output probability of the mixture distribution HMM is obtained, the distribution yielding the maximum output probability is stored. When adjacent times or a small change in the observation signal is determined, the output probability of the stored distribution serves as the output probability of the mixture distribution. This avoids calculating the output probabilities of the other distributions in the mixture, thereby reducing the calculation amount required for output probabilities.

Type: Grant

Filed: April 6, 2006

Date of Patent: October 12, 2010

Assignee: Canon Kabushiki Kaisha

Inventors: Hiroki Yamamoto, Masayuki Yamada
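The caching idea in the abstract of patent 7813925 above might look like the following for a one-dimensional Gaussian mixture. Several details are assumptions that the abstract does not state: the class name `CachedMixture`, the use of the single best component as the mixture output on a full evaluation, and the rule that a "small change" means the observation is within `threshold` of the observation at the last full evaluation.

```python
import math


def gaussian(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


class CachedMixture:
    """Hypothetical mixture that reuses the best component while the input barely changes."""

    def __init__(self, weights, means, variances, threshold):
        self.weights, self.means, self.variances = weights, means, variances
        self.threshold = threshold
        self.best = None     # index of the stored (maximizing) component
        self.last_x = None   # observation at the last full evaluation
        self.full_evals = 0  # counts how often every component was evaluated

    def _score(self, k, x):
        return self.weights[k] * gaussian(x, self.means[k], self.variances[k])

    def output_prob(self, x):
        if self.best is not None and abs(x - self.last_x) < self.threshold:
            # Small change: evaluate only the stored component.
            return self._score(self.best, x)
        # Otherwise evaluate all components and store the maximizing one.
        scores = [self._score(k, x) for k in range(len(self.weights))]
        self.best = max(range(len(scores)), key=scores.__getitem__)
        self.last_x = x
        self.full_evals += 1
        return scores[self.best]


gmm = CachedMixture([0.5, 0.5], [0.0, 5.0], [1.0, 1.0], threshold=0.1)
for x in (0.0, 0.05, 5.0):
    gmm.output_prob(x)
print(gmm.full_evals)
```

In this toy run, only the first and third observations trigger a full evaluation of both components; the second reuses the stored one.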
MAXIMUM ENTROPY MODEL WITH CONTINUOUS FEATURES

Publication number: 20100256977

Abstract: Described is a technology by which a maximum entropy (MaxEnt) model, such as used as a classifier or in a conditional random field or hidden conditional random field that embed the maximum entropy model, uses continuous features with continuous weights that are continuous functions of the feature values (instead of single-valued weights). The continuous weights may be approximated by a spline-based solution. In general, this converts the optimization problem into a standard log-linear optimization problem without continuous weights at a higher-dimensional space.

Type: Application

Filed: April 1, 2009

Publication date: October 7, 2010

Applicant: Microsoft Corporation

Inventors: Dong Yu, Li Deng, Alejandro Acero
Systems and methods for responding to natural language speech utterance

Patent number: 7809570

Abstract: Systems and methods for receiving natural language queries and/or commands and executing the queries and/or commands. The systems and methods overcome the deficiencies of prior art speech query and response systems through the application of a complete speech-based information query, retrieval, presentation and command environment. This environment makes significant use of context, prior information, domain knowledge, and user specific profile data to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a complete speech-based natural language query and response environment can be created. The systems and methods create, store and use extensive personal profile information for each user, thereby improving the reliability of determining the context and presenting the expected results for a particular question or command.

Type: Grant

Filed: July 7, 2008

Date of Patent: October 5, 2010

Assignee: VoiceBox Technologies, Inc.

Inventors: Robert A. Kennewick, David Locke, Michael R. Kennewick, Sr., Michael R. Kennewick, Jr., Richard Kennewick, Tom Freeman
One-step repair of misrecognized recognition strings

Patent number: 7809566

Abstract: A method for use in automatic speech recognition corrects erroneous recognition elements within a recognition hypothesis. A user input is recognized as a correction hypothesis which contains various recognition elements. A non-deterministic alignment is performed to align at least a portion of the correction hypothesis with an earlier recognition hypothesis which also contains various recognition elements such that the recognition elements in the aligned portion of the correction hypothesis are determined to most likely correspond to a range of recognition elements in the earlier recognition hypothesis. The recognition elements in the range of recognition elements in the earlier recognition hypothesis are replaced with the recognition elements in the aligned portion of the correction hypothesis.

Type: Grant

Filed: October 13, 2006

Date of Patent: October 5, 2010

Assignee: Nuance Communications, Inc.

Inventor: Ralf Meermeier
Augmentation and calibration of output from non-deterministic text generators by modeling its characteristics in specific environments

Patent number: 7792671

Abstract: Outputs of an automatic probabilistic event detection system, such as a fact extraction system, a speech-to-text engine or an automatic character recognition system, are matched with comparable results produced manually or by a different system. This comparison allows statistical modeling of the run-time behavior of the event detection system. This model can subsequently be used to give supplemental or replacement data for an output sequence of the system. In particular, the model can effectively calibrate the system for use with data of a particular statistical nature.

Type: Grant

Filed: February 5, 2004

Date of Patent: September 7, 2010

Assignee: Verint Americas Inc.

Inventor: Michael Brand
Method and apparatus for generating a language independent document abstract

Patent number: 7792667

Abstract: A method of extracting significant phrases from one or more documents stored in a computer-readable medium. A sequence of words is read from the one or more documents and a score is determined for each word in the sequence based on the length of the word. The score for each word in the sequence is compared against a threshold score. The sequence of words is indicated to be a significant phrase if the number of words in the sequence that have a score greater than the threshold score equals or exceeds a predetermined number. A sentence containing the sequence of words is retrieved from the document, if the sequence of words is a significant phrase. An abstract of the document is searched to determine if the sentence has been previously included in the abstract. If not, the sentence is added to the abstract.

Type: Grant

Filed: September 26, 2008

Date of Patent: September 7, 2010

Inventors: Garnet R. Chaney, Robert F. Richardson, Seymour I. Rubinstein
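The steps in the abstract of patent 7792667 above can be sketched roughly as follows. The abstract says only that a word's score is based on its length, so using the raw character count as the score is an assumption, as are the sliding-window reading of "sequence of words", the function names `is_significant` and `build_abstract`, and the example parameters.

```python
def is_significant(words, threshold, min_count):
    """True if at least min_count words score above threshold (score = word length)."""
    return sum(1 for w in words if len(w) > threshold) >= min_count


def build_abstract(sentences, window, threshold, min_count):
    """Collect sentences that contain a significant phrase, skipping repeats."""
    abstract = []
    for sentence in sentences:
        words = sentence.split()
        # Slide a fixed-size window over the sentence (assumed phrase length).
        starts = range(max(1, len(words) - window + 1))
        has_phrase = any(
            is_significant(words[i:i + window], threshold, min_count) for i in starts
        )
        # Add the sentence only if it is not already in the abstract.
        if has_phrase and sentence not in abstract:
            abstract.append(sentence)
    return abstract


# Toy document; the third sentence repeats the second and is not added twice.
doc = [
    "The cat sat.",
    "Probabilistic acoustic models improve recognition.",
    "Probabilistic acoustic models improve recognition.",
]
print(build_abstract(doc, window=3, threshold=6, min_count=2))
```

Because the score depends only on word length, the sketch needs no dictionary or stop-word list, which is consistent with the language-independent framing in the title.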
Apparatus, method and system for maximum entropy modeling for uncertain observations

Patent number: 7788094

Abstract: A method for performing conditional maximum entropy modeling includes constructing a conditional maximum entropy model, and incorporating an observation confidence score into the model to reduce an effect due to an uncertain observation.

Type: Grant

Filed: January 29, 2007

Date of Patent: August 31, 2010

Assignee: Robert Bosch GmbH

Inventors: Farhad Farahani, Fuliang Weng, Qi Zhang
