Specialized Equations Or Comparisons Patents (Class 704/236)
  • Patent number: 7925505
    Abstract: Architecture is disclosed herein for minimizing an empirical error rate by discriminative adaptation of a statistical language model in a dictation and/or dialog application. The architecture allows assignment of an improved weighting value to each term or phrase to reduce empirical error. Empirical errors are minimized whether or not a user provides correction results, based on criteria for discriminatively adapting the user language model (LM)/context-free grammar (CFG) to the target. Moreover, algorithms are provided for the training and adaptation processes of LM/CFG parameters for criteria optimization.
    Type: Grant
    Filed: April 10, 2007
    Date of Patent: April 12, 2011
    Assignee: Microsoft Corporation
    Inventor: Jian Wu
  • Publication number: 20110082695
    Abstract: An electronic device includes a call analysis module that is configured to analyze characteristics of a phone call and to generate an indicium that represents a prevailing mood associated with the phone call based on the analyzed characteristics.
    Type: Application
    Filed: October 2, 2009
    Publication date: April 7, 2011
    Inventor: Emil Morgan Billing Bengt
  • Patent number: 7917361
    Abstract: A method for training a spoken language identification system to identify an unknown language as one of a plurality of known candidate languages includes the process of creating a sound inventory comprising a plurality of sound tokens, the collective plurality of sound tokens provided from a subset of the known candidate languages. The method further includes providing a plurality of training samples, each training sample composed in one of the known candidate languages. Further included is the process of generating one or more training vectors from each training sample, wherein each training vector is defined as a function of said plurality of sound tokens provided from said subset of the known candidate languages. The method further includes associating each training vector with the candidate language of the corresponding training sample.
    Type: Grant
    Filed: September 19, 2005
    Date of Patent: March 29, 2011
    Assignee: Agency for Science, Technology and Research
    Inventors: Haizhou Li, Bin Ma, George M. White
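The vector construction in patent 7917361 above lends itself to a compact illustration: each training sample is tokenized against a shared sound inventory and mapped to a normalized token-count vector tagged with its candidate language. The sketch below is an assumption-laden reading of the abstract; the inventory, tokenizer, and normalization are illustrative, not taken from the patent.

```python
# Illustrative sketch: bag-of-sound-tokens training vectors (names and
# inventory are assumptions, not from the patent).
from collections import Counter

SOUND_INVENTORY = ["aa", "iy", "k", "t", "sh", "n"]  # tokens drawn from a language subset

def training_vector(sound_tokens):
    """Map a token sequence to a fixed-length, normalized count vector."""
    counts = Counter(sound_tokens)
    total = sum(counts[t] for t in SOUND_INVENTORY) or 1
    return [counts[t] / total for t in SOUND_INVENTORY]

def build_training_set(samples):
    """samples: list of (candidate_language, sound_token_sequence) pairs."""
    return [(lang, training_vector(tokens)) for lang, tokens in samples]

# Each training vector stays associated with its candidate language.
train = build_training_set([
    ("en", ["k", "aa", "t", "t", "iy"]),
    ("zh", ["sh", "iy", "n", "n", "aa"]),
])
```

A language identifier would then score an unknown utterance's vector against these labelled training vectors.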
  • Patent number: 7912713
    Abstract: An automatic speech recognition method for identifying words from an input speech signal includes providing at least one hypothesis recognition based on the input speech signal, the hypothesis recognition being an individual hypothesis word or a sequence of individual hypothesis words, and computing a confidence measure for the hypothesis recognition based on the input speech signal. Computing the confidence measure includes computing differential contributions to the confidence measure, each as a difference between a constrained acoustic score and an unconstrained acoustic score; weighting each differential contribution by applying to it a cumulative distribution function of the differential contribution, so as to make the distributions of the confidence measures homogeneous in terms of rejection capability as the language, vocabulary and grammar vary; and computing the confidence measure by averaging the weighted differential contributions.
    Type: Grant
    Filed: December 28, 2004
    Date of Patent: March 22, 2011
    Assignee: Loquendo S.p.A.
    Inventors: Claudio Vair, Daniele Colibro
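One reading of the confidence measure in patent 7912713 above can be sketched directly: each differential contribution is a constrained acoustic score minus an unconstrained one, a cumulative distribution function maps each contribution onto [0, 1] so the measure stays comparable across languages, vocabularies and grammars, and the mapped values are averaged. The empirical CDF and the choice to average the CDF-mapped values are assumptions; the abstract does not fix either.

```python
# Sketch: differential contributions weighted by an empirical CDF and
# averaged (the CDF form and the averaging of CDF-mapped values are
# assumptions drawn from the abstract).
import bisect

def make_empirical_cdf(reference_differences):
    """Build a CDF from constrained-minus-unconstrained score differences
    observed on held-out data."""
    ref = sorted(reference_differences)
    def cdf(x):
        return bisect.bisect_right(ref, x) / len(ref)
    return cdf

def confidence_measure(constrained_scores, unconstrained_scores, cdf):
    diffs = [c - u for c, u in zip(constrained_scores, unconstrained_scores)]
    weighted = [cdf(d) for d in diffs]   # each contribution mapped onto [0, 1]
    return sum(weighted) / len(weighted)
```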
  • Publication number: 20110064302
    Abstract: A method is disclosed for recognition of high-dimensional data in the presence of occlusion, including: receiving target data that includes an occlusion and is of an unknown class, wherein the target data includes a known object; sampling a plurality of training data files comprising a plurality of distinct classes of the same object as that of the target data; and identifying the class of the target data through linear superposition of the sampled training data files using l1 minimization, wherein the linear superposition with the sparsest number of coefficients is used to identify the class of the target data.
    Type: Application
    Filed: January 29, 2009
    Publication date: March 17, 2011
    Inventors: Yi Ma, Allen Yang Yang, John Norbert Wright, Andrew William Wagner
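Publication 20110064302 above describes the sparse-representation classification recipe, which can be sketched with a generic l1 solver: solve min ||x||_1 subject to Ax = y (cast here as a linear program), then assign the class whose training columns reconstruct y with the smallest residual. The occlusion-handling extension (augmenting A with an error basis) is omitted, and the solver choice is an assumption.

```python
# Sketch of sparse-representation classification: l1-minimal superposition
# of training columns, class chosen by smallest reconstruction residual.
# The LP formulation and solver are assumptions.
import numpy as np
from scipy.optimize import linprog

def l1_superposition(A, y):
    """Solve min ||x||_1 s.t. A x = y by splitting x into x+ and x-."""
    n = A.shape[1]
    c = np.ones(2 * n)                 # objective: sum of |x| components
    A_eq = np.hstack([A, -A])          # A (x+ - x-) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
    return res.x[:n] - res.x[n:]

def classify(A, labels, y):
    """labels[i] is the class of training column A[:, i]."""
    x = l1_superposition(A, y)
    residuals = {}
    for cls in set(labels):
        keep = np.array([lab == cls for lab in labels])
        residuals[cls] = np.linalg.norm(y - A @ np.where(keep, x, 0.0))
    return min(residuals, key=residuals.get)
```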
  • Publication number: 20110066433
    Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.
    Type: Application
    Filed: September 16, 2009
    Publication date: March 17, 2011
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Andrej LJOLJE, Diamantino Antonio CASEIRO, Alistair D. CONKIE
  • Patent number: 7904294
    Abstract: An automatic speech recognition (ASR) system and method is provided for controlling the recognition of speech utterances generated by an end user operating a communications device. The ASR system and method can be used with a mobile device that is used in a communications network. The ASR system can be used for ASR of speech utterances input into a mobile device, to perform compensating techniques using at least one characteristic, and to update an ASR speech recognizer associated with the ASR system by determining and using a background noise value and a distortion value that are based on the features of the mobile device. The ASR system can be used to augment a limited data input capability of a mobile device, for example, caused by limited input devices physically located on the mobile device.
    Type: Grant
    Filed: April 9, 2007
    Date of Patent: March 8, 2011
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Richard C. Rose, Sarangarajan Parthasarathy, Aaron Edward Rosenberg, Shrikanth Sambasivan Narayanan
  • Publication number: 20110046951
    Abstract: A system and a method to generate statistical utterance classifiers optimized for the individual states of a spoken dialog system is disclosed. The system and method make use of large databases of transcribed and annotated utterances from calls collected in a production dialog system, together with log data recording the association between each utterance and the state of the system at the moment the utterance was recorded. From the system state, being a vector of multiple system variables, subsets of these variables, certain variable ranges, quantized variable values, etc. can be extracted to produce a multitude of distinct utterance subsets matching every possible system state. For each of these subset and variable combinations, statistical classifiers can be trained, tuned, and tested, and the classifiers can be stored together with the performance results and the state subset and variable combination.
    Type: Application
    Filed: August 21, 2009
    Publication date: February 24, 2011
    Inventors: David Suendermann, Jackson Liscombe, Krishna Dayanidhi, Roberto Pieraccini
  • Patent number: 7894849
    Abstract: Methods, systems, and apparatus, including computer program products, for generating feedback. In one aspect, a method includes receiving sensor data from a plurality of sensors, wherein at least one of the plurality of sensors is associated with a mobile device of a user; aggregating the received sensor data to generate aggregated sensor data; processing the aggregated sensor data to determine an aggregated metric; comparing the aggregated metric to a target associated with the user to determine a measure of performance; and generating feedback based on the determined measure of performance. Further, the mobile device can comprise a mobile personal services device that includes one or more of an audio sensor, a video sensor, an environmental sensor, a biometric sensor, a location sensor, an activity detector, and a health monitor. The feedback can be displayed on the mobile personal services device. The feedback also can be displayed in near real-time.
    Type: Grant
    Filed: July 9, 2007
    Date of Patent: February 22, 2011
    Assignee: Accenture Global Services Limited
    Inventors: Alex M. Kass, Lucian P. Hughes, Owen E. Richter, Dana Le, Daniel Farina
  • Patent number: 7865364
    Abstract: A method for improving speech recognition accuracy includes utilizing skiplists, or lists of values that cannot occur because of improbability or impossibility. A table or list is stored in a dialog manager module. The table includes a plurality of information items and a corresponding list of improbable values for each of the plurality of information items. A plurality of recognized ordered interpretations is received from an automatic speech recognition (ASR) engine. Each of the plurality of recognized ordered interpretations includes a number of information items. A value of one or more of the received information items for a first recognized ordered interpretation is compared to the table to determine if the value of the one of the received information items matches any of the list of improbable values for the corresponding information item.
    Type: Grant
    Filed: May 5, 2006
    Date of Patent: January 4, 2011
    Assignee: Nuance Communications, Inc.
    Inventor: Marc Helbing
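The skiplist mechanism in patent 7865364 above reduces to a small lookup-and-filter step, sketched below with hypothetical field names: interpretations whose values appear in the improbable-value table are skipped in favour of the next-best interpretation.

```python
# Sketch: drop interpretations whose field values appear in the
# improbable-value table (field names are hypothetical).
SKIPLIST = {
    "departure_city": {"Atlantis"},
    "passenger_count": {"0", "1000"},
}

def first_plausible(ordered_interpretations):
    """ordered_interpretations: list of dicts from the ASR engine, best first."""
    for interp in ordered_interpretations:
        hits_skiplist = any(
            value in SKIPLIST.get(item, set())
            for item, value in interp.items()
        )
        if not hits_skiplist:
            return interp
    return None  # every interpretation matched an improbable value
```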
  • Publication number: 20100332228
    Abstract: According to some embodiments, a method and apparatus are provided to buffer N audio frames of a plurality of audio frames associated with an audio signal, pre-compute scores for a subset of context dependent models (CDMs), and perform a graphical model search associated with the N audio frames where a score of a context independent model (CIM) associated with a CDM is used in lieu of a score for the CDM when a score for the CDM is needed and has not been pre-computed.
    Type: Application
    Filed: June 25, 2009
    Publication date: December 30, 2010
    Inventors: Michael Eugene Deisher, Tao Ma
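The scoring shortcut in publication 20100332228 above can be shown in a few lines: scores for a subset of context dependent models (CDMs) are pre-computed for the buffered frames, and when the graphical-model search needs a CDM score that is not in that subset, the score of the corresponding context independent model (CIM) is substituted. The helper callables below are placeholders.

```python
# Sketch: use the pre-computed CDM score when available, otherwise fall
# back to the parent CIM's score (all helpers are placeholders).
def model_score(frame_index, cdm_id, precomputed, cim_score, cim_of):
    """precomputed: dict {(frame_index, cdm_id): score} for a CDM subset.
    cim_of(cdm_id) returns the context independent model behind the CDM."""
    key = (frame_index, cdm_id)
    if key in precomputed:
        return precomputed[key]
    # Score was not pre-computed: substitute the CIM score for this frame.
    return cim_score(frame_index, cim_of(cdm_id))
```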
  • Publication number: 20100332227
    Abstract: A method of detecting pre-determined phrases to determine compliance quality is provided. The method includes determining whether at least one of an event or a precursor event has occurred based on a comparison between pre-determined phrases and a communication between a sender and a recipient in a communications network, and rating the recipient based on the presence of the pre-determined phrases associated with the event or the presence of the pre-determined phrases associated with the precursor event in the communication.
    Type: Application
    Filed: June 24, 2009
    Publication date: December 30, 2010
    Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: I. Dan MELAMED, Yeon-Jun KIM, Andrej LJOLJE, Bernard S. RENGER, David J. SMITH
  • Patent number: 7860713
    Abstract: Systems and methods for annotating speech data. The present invention reduces the time required to annotate speech data by selecting utterances for annotation that will be of greatest benefit. A selection module uses speech models, including speech recognition models and spoken language understanding models, to identify utterances that should be annotated based on criteria such as confidence scores generated by the models. These utterances are placed in an annotation list along with a type of annotation to be performed for the utterances and an order in which the annotation should proceed. The utterances in the annotation list can be annotated for speech recognition purposes, spoken language understanding purposes, labeling purposes, etc. The selection module can also select utterances for annotation based on previously annotated speech data and deficiencies in the various models.
    Type: Grant
    Filed: July 1, 2008
    Date of Patent: December 28, 2010
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Tirso M. Alonso, Ilana Bromberg, Dilek Z. Hakkani-Tur, Barbara B. Hollister, Mazin G. Rahim, Giuseppe Riccardi, Lawrence Lyon Rose, Daniel Leon Stern, Gokhan Tur, James M. Wilson
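A minimal sketch of the selection step in patent 7860713 above, under the assumption that low model confidence is the selection criterion: utterances whose recognition or understanding confidence falls below a threshold are queued for annotation, together with the type of annotation required, lowest confidence first.

```python
# Sketch: queue low-confidence utterances for annotation, lowest first,
# together with the type of annotation to perform (thresholds and field
# names are assumptions).
def build_annotation_list(utterances, asr_threshold=0.5, slu_threshold=0.5):
    """utterances: list of dicts with 'id', 'asr_conf' and 'slu_conf' keys."""
    queue = []
    for utt in utterances:
        if utt["asr_conf"] < asr_threshold:
            queue.append((utt["asr_conf"], utt["id"], "transcription"))
        if utt["slu_conf"] < slu_threshold:
            queue.append((utt["slu_conf"], utt["id"], "semantic_label"))
    queue.sort(key=lambda item: item[0])        # least confident first
    return [(utt_id, task) for _, utt_id, task in queue]
```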
  • Patent number: 7853448
    Abstract: An electronic instrument includes: a display control unit for displaying a control content corresponding to command information based on the result of speech recognition; an instruction unit for instructing that the control for the control content displayed by the display control unit be cancelled; and a control unit. The control unit performs the control based on the command information once a predetermined standby time has elapsed since the control content started to be displayed by the display control unit, provided the instruction unit has not instructed cancellation of the control within that standby time, and cancels the control based on the command information when the instruction unit instructs cancellation within the predetermined standby time.
    Type: Grant
    Filed: April 16, 2007
    Date of Patent: December 14, 2010
    Assignee: Funai Electric Co., Ltd.
    Inventors: Shusuke Narita, Susumu Tokoshima
  • Patent number: 7831422
    Abstract: Methods and systems for handling speech recognition processing in effectively real-time, via the Internet, in order that users do not experience noticeable delays from the start of an exercise until they receive responsive feedback. A user uses a client to access the Internet and a server supporting speech recognition processing, e.g., for language learning activities. The user inputs speech to the client, which transmits the user speech to the server in approximate real-time. The server evaluates the user speech in the context of the current speech recognition exercise being executed, and provides responsive feedback to the client, again, in approximate real-time, with minimal latency. The client, upon receiving responsive feedback from the server, displays, or otherwise provides, the feedback to the user.
    Type: Grant
    Filed: October 26, 2007
    Date of Patent: November 9, 2010
    Assignee: GlobalEnglish Corporation
    Inventor: Christopher S. Jochumson
  • Publication number: 20100280827
    Abstract: Embodiments for implementing a speech recognition system that includes a speech classifier ensemble are disclosed. In accordance with one embodiment, the speech recognition system includes a classifier ensemble to convert feature vectors that represent a speech vector into log probability sets. The classifier ensemble includes a plurality of classifiers. The speech recognition system includes a decoder ensemble to transform the log probability sets into output symbol sequences. The speech recognition system further includes a query component to retrieve one or more speech utterances from a speech database using the output symbol sequences.
    Type: Application
    Filed: April 30, 2009
    Publication date: November 4, 2010
    Applicant: Microsoft Corporation
    Inventors: Kunal Mukerjee, Kazuhito Koishida, Shankar Regunathan
  • Publication number: 20100268535
    Abstract: The problem to be solved is to robustly detect pronunciation variation examples and to acquire pronunciation variation rules having a high generalization property, with little effort. The problem can be solved by a pronunciation variation rule extraction apparatus including a speech data storage unit, a base form pronunciation storage unit, a sub word language model generation unit, a speech recognition unit, and a difference extraction unit. The speech data storage unit stores speech data. The base form pronunciation storage unit stores base form pronunciation data representing the base form pronunciation of the speech data. The sub word language model generation unit generates a sub word language model from the base form pronunciation data. The speech recognition unit recognizes the speech data by using the sub word language model.
    Type: Application
    Filed: November 27, 2008
    Publication date: October 21, 2010
    Inventor: Takafumi Koshinaka
  • Patent number: 7818172
    Abstract: The method of recognizing speech in an acoustic signal comprises developing acoustic stochastic models of voice units in the form of a set of states of an acoustic signal and using the acoustic models for recognition by a comparison of the signal with predetermined acoustic models obtained via a prior learning process. While developing the acoustic models, the voice units are modeled by means of a first portion of the states independent of adjacent voice units and by means of a second portion of the states dependent on adjacent voice units. The second portion of states dependent on adjacent voice units shares common parameters with a plurality of units sharing the same phonemes.
    Type: Grant
    Filed: April 20, 2004
    Date of Patent: October 19, 2010
    Assignee: France Telecom
    Inventors: Ronaldo Messina, Denis Jouvet
  • Patent number: 7813928
    Abstract: A speech recognition device presents whether a user's utterance is an unregistered word and whether the utterance should be repeated. The device includes a vocabulary storage unit (102) defining a vocabulary for speech recognition, and a speech recognition unit (101) checking the uttered speech against registered words. The device also includes a similarity calculation unit (103) calculating a similarity between the uttered speech and acoustic units, a judgment unit (104) judging, based on the check by the speech recognition unit (101) and the calculation performed by the similarity calculation unit (103), whether the uttered speech is a registered or unregistered word, an unregistered word unit (106) storing unregistered words, an unregistered word candidate search unit (105) searching the unregistered word unit (106) for unregistered word candidates when the judgment unit (104) judges that the uttered speech is an unregistered word, and a display unit (107) displaying the result.
    Type: Grant
    Filed: June 2, 2005
    Date of Patent: October 12, 2010
    Assignee: Panasonic Corporation
    Inventors: Yoshiyuki Okimoto, Tsuyoshi Inoue, Takashi Tsuzuki
  • Patent number: 7813925
    Abstract: When observation signals at adjacent times are similar, or when the change in the observation signal is small, the component that maximizes the output probability of a mixture distribution is very unlikely to change. Exploiting this fact, when the output probability of a mixture-distribution HMM is obtained, the component yielding the maximum output probability is stored. At adjacent times, or when the change in the observation signal is small, the output probability of the stored component is then used as the output probability of the mixture distribution. This avoids evaluating the other components when calculating the output probability of the mixture distribution, thereby reducing the amount of calculation required for output probabilities.
    Type: Grant
    Filed: April 6, 2006
    Date of Patent: October 12, 2010
    Assignee: Canon Kabushiki Kaisha
    Inventors: Hiroki Yamamoto, Masayuki Yamada
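The caching trick in patent 7813925 above is easy to sketch for a diagonal-covariance Gaussian mixture: when consecutive observations barely change, only the previously dominant component is re-evaluated and its score stands in for the mixture output probability; otherwise all components are evaluated and the new best one is cached. The change threshold and the max-component approximation of the mixture are assumptions.

```python
# Sketch: reuse the previously dominant Gaussian when the observation
# changes little; otherwise rescore all components and cache the best.
import numpy as np

def log_gauss(x, mean, var):
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def mixture_logprob(x, prev_x, weights, means, vars_, cache, threshold=1e-3):
    """cache: dict carrying the index of the previously dominant component."""
    if prev_x is not None and "best" in cache and np.max(np.abs(x - prev_x)) < threshold:
        k = cache["best"]
        return np.log(weights[k]) + log_gauss(x, means[k], vars_[k])
    scores = [np.log(w) + log_gauss(x, m, v) for w, m, v in zip(weights, means, vars_)]
    cache["best"] = int(np.argmax(scores))
    return scores[cache["best"]]  # max component stands in for the mixture
```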
  • Patent number: 7801280
    Abstract: Described are methods, systems, and devices that include obtaining a first measured perceptual quality by measuring, at a first location associated with a communications network, a perceptual quality of a first communication transmitted from the first location to a second location associated with the communications network, obtaining a second measured perceptual quality by measuring perceptual quality of the first communication at the second location; and, based on the first measured perceptual quality and the second measured perceptual quality, generating a first value representative of degradation in the quality of the first communication.
    Type: Grant
    Filed: December 15, 2004
    Date of Patent: September 21, 2010
    Assignee: Verizon Laboratories Inc.
    Inventor: Adrian Evans Conway
  • Patent number: 7797157
    Abstract: Channel normalization for automatic speech recognition is provided. Statistics are measured from an initial portion of a speech utterance. Feature normalization parameters are estimated based on the measured statistics and a statistically derived mapping relating measured statistics and feature normalization parameters. In some examples, the measured statistics comprise measures of an energy from the initial portion of the speech utterance. In some examples, measures of the energy comprise extreme values of the energy.
    Type: Grant
    Filed: January 10, 2005
    Date of Patent: September 14, 2010
    Assignee: Voice Signal Technologies, Inc.
    Inventors: Igor Zlokarnik, Laurence S. Gillick, Jordan Cohen
  • Publication number: 20100217592
    Abstract: The present invention provides a method for identifying a turn, such as a sentence or phrase, for addition to a platform dialog comprising a plurality of turns. Lexical features of each of a set of candidate turns relative to one or more turns in the platform dialog are determined. Semantic features associated with each candidate turn and associated with the platform dialog are determined to identify one or more topics associated with each candidate turn and with the platform dialog. Lexical features of each candidate turn are compared to lexical features of the platform dialog and semantic features associated with each candidate turn are compared to semantic features of the platform dialog to rank the candidate turns based on similarity of lexical features and semantic features of each candidate turn to lexical features and semantic features of the platform dialog.
    Type: Application
    Filed: October 14, 2009
    Publication date: August 26, 2010
    Applicant: HONDA MOTOR CO., LTD.
    Inventors: Rakesh Gupta, Lev-Arie Ratinov
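The ranking step in publication 20100217592 above can be illustrated with two Jaccard similarities, one over words (lexical) and one over topics (semantic), combined into a single score. The equal weighting and the Jaccard measure itself are illustrative assumptions.

```python
# Sketch: combine lexical (word-overlap) and semantic (topic-overlap)
# similarity to rank candidate turns against the platform dialog.
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def rank_candidates(candidates, dialog_words, dialog_topics):
    """candidates: list of dicts with 'text' (str) and 'topics' (list) keys."""
    scored = []
    for cand in candidates:
        lexical = jaccard(cand["text"].lower().split(), dialog_words)
        semantic = jaccard(cand["topics"], dialog_topics)
        scored.append((0.5 * lexical + 0.5 * semantic, cand))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [cand for _, cand in scored]
```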
  • Patent number: 7783488
    Abstract: Methods and systems are provided for remote tuning and debugging of an automatic speech recognition system. Trace files are generated on-site from input speech by efficient, lossless compression of MFCC data, which is merged with compressed pitch and voicing information and stored as trace files. The trace files are transferred to a remote site where human-intelligible speech is reconstructed and analyzed. Based on the analysis, parameters of the automatic speech recognition system are remotely adjusted.
    Type: Grant
    Filed: December 19, 2005
    Date of Patent: August 24, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Shay Ben-David, Baiju Dhirajlal Mandalia, Zohar Sivan, Alexander Sorin
  • Publication number: 20100204985
    Abstract: A warping factor estimation system comprises a label information generation unit that outputs voice/non-voice label information, a warp model storage unit in which a probability model representing voice and non-voice occurrence probabilities is stored, and a warp estimation unit that calculates a warping factor in the frequency-axis direction using the probability model representing voice and non-voice occurrence probabilities, the voice and non-voice labels, and a cepstrum.
    Type: Application
    Filed: September 22, 2008
    Publication date: August 12, 2010
    Inventor: Tadashi Emori
  • Patent number: 7774202
    Abstract: A speech activated control system for controlling aerial vehicle components, program product, and associated methods are provided. The system can include a host processor adapted to develop speech recognition models and to provide speech command recognition. The host processor can be positioned in communication with a database for storing and retrieving speech recognition models. The system can include an avionic computer in communication with the host processor and adapted to provide command function management, a display and control processor in communication with the avionic computer adapted to provide a user interface between a user and the avionic computer, and a data interface positioned in communication with the avionic computer and the host processor provided to divorce speech command recognition functionality from vehicle or aircraft-related speech-command functionality.
    Type: Grant
    Filed: June 12, 2006
    Date of Patent: August 10, 2010
    Assignee: Lockheed Martin Corporation
    Inventors: Richard P. Spengler, Jon C. Russo, Gregory W. Barnett, Kermit L. Armbruster
  • Publication number: 20100198597
    Abstract: Methods, speech recognition systems, and computer readable media are provided that recognize speech using dynamic pruning techniques. A search network is expanded based on a frame from a speech signal, a best hypothesis is determined in the search network, a default beam threshold is modified, and the search network is pruned using the modified beam threshold. The search network may be further pruned based on the search depth of the best hypothesis and/or the average number of frames per state for a search path.
    Type: Application
    Filed: January 30, 2009
    Publication date: August 5, 2010
    Inventor: Qifeng ZHU
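One way to read the dynamic pruning in publication 20100198597 above is sketched below: hypotheses are kept only if their score lies within a beam of the best hypothesis, and the default beam is scaled according to how deep the best hypothesis has progressed. The specific scaling rule is invented for illustration.

```python
# Sketch: beam pruning with a beam scaled by search depth (the scaling
# rule is invented for illustration).
def prune(hypotheses, default_beam=200.0, target_depth=None):
    """hypotheses: list of dicts with 'score' (higher is better) and 'depth'."""
    best = max(hypotheses, key=lambda h: h["score"])
    beam = default_beam
    if target_depth:
        # Narrow the beam when the best hypothesis lags the expected depth,
        # widen it when the search is ahead.
        beam *= min(1.5, max(0.5, best["depth"] / target_depth))
    threshold = best["score"] - beam
    return [h for h in hypotheses if h["score"] >= threshold]
```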
  • Patent number: 7761293
    Abstract: Systems and methods are disclosed to operate a mobile device. The system includes a message center; an engine coupled to the message center; and a mobile device wirelessly coupled to the message center, wherein the engine specifies one or more meeting locations and wherein at least one meeting location comprises a location designated by an advertiser.
    Type: Grant
    Filed: March 6, 2006
    Date of Patent: July 20, 2010
    Inventor: Bao Q. Tran
  • Patent number: 7752036
    Abstract: Recognizing a stream of speech received as speech vectors over a lossy communications link includes constructing for a speech recognizer a series of speech vectors from packets received over a lossy packetized transmission link, wherein some of the packets associated with each speech vector are lost or corrupted during transmission. Each constructed speech vector is multi-dimensional and includes associated features. After waiting for a predetermined time, speech vectors are generated and potentially corrupted features within the speech vector are indicated to the speech recognizer when present. Speech recognition is attempted at the speech recognizer on the speech vectors when corrupted features are present. This recognition may be based only on certain or valid features within each speech vector. Retransmission of a missing or corrupted packet is requested when corrupted values are indicated by the indicating step and when the attempted recognition step fails.
    Type: Grant
    Filed: December 29, 2008
    Date of Patent: July 6, 2010
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Richard Vandervoort Cox, Stephen Michael Marcus, Mazin G. Rahim, Nambirajan Seshadri, Robert Douglas Sharp
  • Patent number: 7752045
    Abstract: A method for comparing a first audio data source with a plurality of audio data sources, wherein the first audio data source has an utterance spoken by a first person and the plurality of audio data sources have the same utterance spoken by a second person. The method includes performing a speech recognition function on the first audio data source to isolate at least one element of the first audio data source. The method also includes comparing the isolated element with a corresponding element in the plurality of audio data sources and determining whether the utterance spoken by the first person contained an error based on the comparison.
    Type: Grant
    Filed: October 7, 2002
    Date of Patent: July 6, 2010
    Assignees: Carnegie Mellon University, Carnegie Speech Company, Inc.
    Inventor: Maxine Eskenazi
  • Publication number: 20100161328
    Abstract: Embodiments are provided for utilizing a client-side cache for utterance processing to facilitate network based speech recognition. An utterance comprising a query is received in a client computing device. The query is sent from the client to a network server for results processing. The utterance is processed to determine a speech profile. A cache lookup is performed based on the speech profile to determine whether results data for the query is stored in the cache. If the results data is stored in the cache, then a query is sent to cancel the results processing on the network server and the cached results data is displayed on the client computing device.
    Type: Application
    Filed: December 18, 2008
    Publication date: June 24, 2010
    Applicant: Microsoft Corporation
    Inventors: Andrew K. Krumel, Shuangyu Chang, Robert L. Chambers
  • Patent number: 7738635
    Abstract: A method for improving the recognition confidence of alphanumeric spoken input, suitable for use in a speech recognition telephony application such as a voice response system. An alphanumeric candidate is determined from the spoken input, which may be the best available representation of the spoken input. Recognition confidence is compared with a preestablished threshold. If the recognition confidence exceeds the threshold, the alphanumeric candidate is selected to represent the spoken input. Otherwise, present call data associated with the spoken input is determined. Call data may include automatic number identification (ANI) information, caller-ID information, and/or dialed number information service (DNIS) information. Information associated with the alphanumeric candidate and information associated with the present call data are correlated in order to select alphanumeric information that best represents the spoken input.
    Type: Grant
    Filed: January 6, 2005
    Date of Patent: June 15, 2010
    Assignees: International Business Machines Corporation, Nuance Communications, Inc.
    Inventors: Christopher Ryan Groves, Kevin James Muterspaugh
  • Publication number: 20100145680
    Abstract: A speech recognition method using a domain ontology includes: constructing a domain ontology DB; forming a speech recognition grammar using the constructed domain ontology DB; extracting a feature vector from a speech signal; and modeling the speech signal using an acoustic model. The method performs speech recognition by using the acoustic model, a speech recognition dictionary, and the speech recognition grammar on the basis of the feature vector.
    Type: Application
    Filed: September 1, 2009
    Publication date: June 10, 2010
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Seung YUN, Soo Jong Lee, Jeong Se Kim, Il Bin Lee, Jun Park, Sang Kyu Park
  • Patent number: 7734472
    Abstract: The invention concerns a speech recognition enhancer (51) and a speech recognition system comprising such speech recognition enhancer (51), an audio input unit (41) and a speech recognizer (61, 3). The speech recognition enhancer (51) is arranged between the audio input unit (41) and the speech recognizer (61, 3). The speech recognition enhancer (51) has a parametrizable pre-filtering unit (511), a parametrizable dynamic voice level control unit (512), a parametrizable noise reduction unit (513) and a parametrizable voice level control unit (514). The parameters of these parametrizable units (511, 512, 513, 514) are adjusted to the characteristics of the specific audio input unit (41) and/or the characteristics of the specific speech recognizer (61, 3) for adapting the audio input unit (41) to the speech recognizer (61, 3).
    Type: Grant
    Filed: September 29, 2004
    Date of Patent: June 8, 2010
    Assignee: Alcatel
    Inventor: Michael Walker
  • Patent number: 7729910
    Abstract: The invention comprises a method for lossy data compression, akin to vector quantization, in which there is no explicit codebook and no search, i.e. the codebook memory and associated search computation are eliminated. Some memory and computation are still required, but these are dramatically reduced, compared to systems that do not exploit this method. For this reason, both the memory and computation requirements of the method are exponentially smaller than comparable methods that do not exploit the invention. Because there is no explicit codebook to be stored or searched, no such codebook need be generated either. This makes the method well suited to adaptive coding schemes, where the compression system adapts to the statistics of the data presented for processing: both the complexity of the algorithm executed for adaptation, and the amount of data transmitted to synchronize the sender and receiver, are exponentially smaller than comparable existing methods.
    Type: Grant
    Filed: June 25, 2004
    Date of Patent: June 1, 2010
    Assignee: Agiletv Corporation
    Inventor: Harry Printz
  • Patent number: 7729912
    Abstract: A system and method is provided for reducing latency for automatic speech recognition. In one embodiment, intermediate results produced by multiple search passes are used to update a display of transcribed text.
    Type: Grant
    Filed: December 23, 2003
    Date of Patent: June 1, 2010
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Michiel Adriaan Unico Bacchiani, Brian Scott Amento
  • Publication number: 20100131262
    Abstract: Embodiments of the invention relate to methods for generating a multilingual acoustic model. A main acoustic model having probability distribution functions and a probabilistic state sequence model including first states is provided to a processor. At least one second acoustic model including probability distribution functions and a probabilistic state sequence model including states is also provided to the processor. The processor replaces each of the probability distribution functions of the at least one second acoustic model with one of the probability distribution functions of the main acoustic model, and/or each of the states of the probabilistic state sequence model of the at least one second acoustic model with a state of the probabilistic state sequence model of the main acoustic model, based on a criteria set, to obtain at least one modified second acoustic model. The criteria set may be a distance measurement.
    Type: Application
    Filed: November 25, 2009
    Publication date: May 27, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Rainer Gruhn, Martin Raab, Raymond Brueckner
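The replacement step in publication 20100131262 above can be sketched as a nearest-neighbour mapping: every Gaussian of the second acoustic model is replaced by the closest Gaussian of the main model under some distance. Plain Euclidean distance between means is used below, which is only one possible "criteria set".

```python
# Sketch: map every Gaussian of the second model to the nearest Gaussian
# of the main model (Euclidean distance between means is one possible
# "criteria set").
import numpy as np

def replace_with_main(second_means, main_means):
    """Return, for each second-model Gaussian, the index of the main-model
    Gaussian that replaces it."""
    return [int(np.argmin(np.linalg.norm(main_means - mu, axis=1)))
            for mu in second_means]

# Example: a 3-Gaussian second model mapped onto a 5-Gaussian main model
# of 13-dimensional cepstral means.
main = np.random.randn(5, 13)
second = np.random.randn(3, 13)
print(replace_with_main(second, main))
```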
  • Patent number: 7725317
    Abstract: An interactive control system is disclosed with which the recognition rate and the responsiveness when operating a plurality of interactive services in parallel can be improved. A recognition lexicon and consolidated and reorganized information are generated for each individual interaction. Thus, excessive growth of the recognition lexicon can be avoided and a lowering of the recognition rate can be prevented. Moreover, based on the consolidated and reorganized information, it is possible to specify interactive services that may respond to the same input information, so that responses that are unexpected for the user can be prevented.
    Type: Grant
    Filed: July 14, 2004
    Date of Patent: May 25, 2010
    Assignee: Fujitsu Limited
    Inventors: Eiji Kitagawa, Toshiyuki Fukuoka, Ryosuke Miyata
  • Patent number: 7725314
    Abstract: A method and apparatus identify a clean speech signal from a noisy speech signal. To do this, a clean speech value and a noise value are estimated from the noisy speech signal. The clean speech value and the noise value are then used to define a gain on a filter. The noisy speech signal is applied to the filter to produce the clean speech signal. Under some embodiments, the noise value and the clean speech value are used in both the numerator and the denominator of the filter gain, with the numerator being guaranteed to be positive.
    Type: Grant
    Filed: February 16, 2004
    Date of Patent: May 25, 2010
    Assignee: Microsoft Corporation
    Inventors: Jian Wu, James G. Droppo, Li Deng, Alejandro Acero
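A hedged sketch of the gain described in patent 7725314 above: the estimated clean-speech and noise values appear in both numerator and denominator of a Wiener-style gain, with the numerator floored so it is guaranteed positive. This general form is an illustrative reading of the abstract, not the patent's exact equations.

```python
# Sketch: Wiener-style gain with clean and noise estimates in numerator
# and denominator, numerator floored to stay positive (an illustrative
# form, not the patent's equations).
import numpy as np

def filter_gain(clean_est, noise_est, floor=1e-10):
    numerator = np.maximum(clean_est, floor)          # guaranteed positive
    return numerator / (numerator + np.maximum(noise_est, 0.0))

def apply_filter(noisy_spectrum, clean_est, noise_est):
    return filter_gain(clean_est, noise_est) * noisy_spectrum
```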
  • Patent number: 7716047
    Abstract: A system and method for an automatic set-up of speech recognition engines may include a speech recognizer configured to perform speech recognition procedures to identify input speech data according to one or more operating parameters. A merit manager may be utilized to automatically calculate merit values corresponding to the foregoing recognition procedures. These merit values may incorporate recognition accuracy information, recognition speed information, and a user-specified weighting factor that shifts the relative effect of the recognition accuracy information and the recognition speed information on the merit values. The merit manager may then automatically perform a merit value optimization procedure to select operating parameters that correspond to an optimal one of the merit values.
    Type: Grant
    Filed: March 31, 2003
    Date of Patent: May 11, 2010
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Gustavo Hernandez-Abrego, Xavier Menendez-Pidal, Thomas Kemp, Katsuki Minamino, Helmut Lucke
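The merit computation in patent 7716047 above combines recognition accuracy and speed through a user-specified weighting factor; a linear combination, as sketched below, is one natural instantiation (the abstract only says the factor shifts the relative effect of the two terms).

```python
# Sketch: merit as a weighted combination of accuracy and speed, used to
# pick operating parameters (the linear form is an assumption).
def merit(accuracy, speed, weight):
    """weight in [0, 1]: 1.0 rewards accuracy only, 0.0 rewards speed only."""
    return weight * accuracy + (1.0 - weight) * speed

def best_parameters(candidates, weight=0.7):
    """candidates: list of (params, accuracy, speed) triples with both
    metrics normalized to [0, 1]."""
    return max(candidates, key=lambda c: merit(c[1], c[2], weight))[0]
```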
  • Patent number: 7707032
    Abstract: A method and system used to determine the similarity between an input speech data and a sample speech data is provided. First, the input speech data is segmented into a plurality of input speech frames and the sample speech data is segmented into a plurality of sample speech frames. Then, the input speech frames and the sample speech frames are used to build a matching matrix, wherein the matching matrix comprises the distance values between each of the input speech frames and each of the sample speech frames. Next, the distance values are used to calculate a matching score. Finally, the similarity between the input speech data and the sample speech data is determined according to this matching score.
    Type: Grant
    Filed: October 20, 2005
    Date of Patent: April 27, 2010
    Assignee: National Cheng Kung University
    Inventors: Jhing-Fa Wang, Po-Chuan Lin, Li-Chang Wen
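The matching-matrix comparison in patent 7707032 above maps naturally onto a dynamic-time-warping style computation: frame-wise distances between the input and sample utterances form the matrix, and a dynamic-programming alignment over it yields the matching score. The DTW recursion below is an assumed concrete form.

```python
# Sketch: frame-wise distance matrix plus a DTW-style alignment producing
# a length-normalized matching score (lower means more similar).
import numpy as np

def matching_score(input_frames, sample_frames):
    """Both arguments: 2-D arrays of shape (num_frames, num_features)."""
    n, m = len(input_frames), len(sample_frames)
    dist = np.linalg.norm(
        input_frames[:, None, :] - sample_frames[None, :, :], axis=2
    )                                              # the matching matrix
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[n, m] / (n + m)
```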
  • Patent number: 7684987
    Abstract: A phone set for use in speech processing such as speech recognition or text-to-speech conversion is used to model or form syllables of a tonal language having a plurality of different tones. Each syllable includes an initial part that can be glide dependent and a final part. The final part includes a plurality of phones. Each phone carries partial tonal information such that the phones taken together implicitly and jointly represent the different tones.
    Type: Grant
    Filed: January 21, 2004
    Date of Patent: March 23, 2010
    Assignee: Microsoft Corporation
    Inventors: Min Chu, Chao Huang
  • Patent number: 7680659
    Abstract: A method of training language model parameters trains discriminative model parameters in the language model based on a performance measure having discrete values.
    Type: Grant
    Filed: June 1, 2005
    Date of Patent: March 16, 2010
    Assignee: Microsoft Corporation
    Inventors: Jianfeng Gao, Hisami Suzuki
  • Patent number: 7680658
    Abstract: A method and apparatus for enhancing the performance of speech recognition by adaptively changing the process of determining the final recognized word depending on a user's selection in a list of alternative words produced by speech recognition. The speech recognition method comprises: inputting speech uttered by a user; recognizing the input speech and creating a predetermined number of alternative words, ordered by similarity; and displaying a list of the alternative words arranged in a predetermined order and determining the alternative word that the cursor currently indicates as the final recognized word if the user's selection from the list of alternative words has not been changed within a predetermined standby time.
    Type: Grant
    Filed: December 31, 2003
    Date of Patent: March 16, 2010
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Seung-nyung Chung, Myung-hyun Yoo, Jay-woo Kim, Joon-ah Park
  • Publication number: 20100063816
    Abstract: A method for processing an analog speech signal for speech recognition. The analog speech signal is sampled to produce a sampled speech signal. The sampled speech signal is framed into multiple frames of the sampled speech signal. The absolute value of the sampled speech signal is integrated within the frames and respective integrated-absolute values of the frames are determined. Based on the integrated-absolute values, the sampled speech signal is cut into segments of non-uniform duration. The segments are not yet identified as parts of speech prior to and during the cutting.
    Type: Application
    Filed: September 7, 2008
    Publication date: March 11, 2010
    Inventors: Ronen Faifkov, Rabin Cohen-Tov
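The segmentation front end in publication 20100063816 above can be sketched directly: frame the sampled signal, integrate (sum) the absolute value per frame, and place cut points where the integrated value drops below a threshold, yielding segments of non-uniform duration. Frame length and threshold are assumptions.

```python
# Sketch: per-frame integrated absolute value, cuts placed at low-energy
# frames (frame length and threshold are assumptions).
import numpy as np

def cut_segments(samples, frame_len=160, threshold_ratio=0.1):
    n_frames = len(samples) // frame_len
    if n_frames == 0:
        return []
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    integrated = np.sum(np.abs(frames), axis=1)       # integrated |x| per frame
    threshold = threshold_ratio * np.max(integrated)
    segments, start = [], None
    for i, value in enumerate(integrated):
        if value >= threshold and start is None:
            start = i * frame_len                     # a segment begins
        elif value < threshold and start is not None:
            segments.append((start, i * frame_len))   # it ends at a quiet frame
            start = None
    if start is not None:
        segments.append((start, n_frames * frame_len))
    return segments
```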
  • Patent number: 7668716
    Abstract: A device improves speech recognition accuracy by utilizing an external knowledge source. The device receives a speech recognition result from an automatic speech recognition (ASR) engine, the speech recognition result including a plurality of ordered interpretations, wherein each of the ordered interpretations includes a plurality of information items. The device analyzes and filters the plurality of interpretations using an external knowledge source to create a filtered plurality of ordered interpretations. The device stores the filtered plurality of ordered interpretations to a memory. The device transmits the filtered plurality of ordered interpretations to a dialog manager module to create a textual output. Alternatively, the dialog manager module retrieves the filtered plurality of ordered interpretations from a memory.
    Type: Grant
    Filed: May 5, 2006
    Date of Patent: February 23, 2010
    Assignee: Dictaphone Corporation
    Inventors: Marc Helbing, Klaus Reifenrath
  • Patent number: 7657433
    Abstract: A speech recognition system uses multiple confidence thresholds to improve the quality of speech recognition results. The choice of which confidence threshold to use for a particular utterance may be based on one or more features relating to the utterance. In one particular implementation, the speech recognition system includes a speech recognition engine that provides speech recognition results and a confidence score for an input utterance. The system also includes a threshold selection component that determines, based on the received input utterance, a threshold value corresponding to the input utterance. The system further includes a threshold component that accepts the recognition results based on a comparison of the confidence score to the threshold value.
    Type: Grant
    Filed: September 8, 2006
    Date of Patent: February 2, 2010
    Assignee: TellMe Networks, Inc.
    Inventor: Shuangyu Chang
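The threshold selection in patent 7657433 above can be illustrated with a single utterance feature; duration is used below purely as an example of a feature that picks which confidence threshold to apply before a recognition result is accepted.

```python
# Sketch: the utterance's duration (one possible feature) selects the
# confidence threshold used to accept or reject the recognition result.
def select_threshold(utterance_duration_s):
    # Short utterances are easier to confuse, so demand higher confidence.
    if utterance_duration_s < 1.0:
        return 0.80
    if utterance_duration_s < 3.0:
        return 0.65
    return 0.50

def accept(result_text, confidence, utterance_duration_s):
    threshold = select_threshold(utterance_duration_s)
    return result_text if confidence >= threshold else None
```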
  • Publication number: 20100017207
    Abstract: A signal is used to form intermediate feature vectors which are subjected to high-pass filtering. The high-pass-filtered intermediate feature vectors have a respective prescribed addition feature vector added to them.
    Type: Application
    Filed: September 24, 2009
    Publication date: January 21, 2010
    Applicant: Infineon Technologies AG
    Inventors: Werner Hemmert, Marcus Holmberg
  • Patent number: 7647224
    Abstract: A speech recognition apparatus includes a sound information acquiring unit that acquires sound information, and a unit segment dividing unit that divides the sound information into plural unit segments. A segment information acquiring unit acquires segment information that indicates a feature of each unit segment. A segment relation value calculating unit calculates a segment relation value that indicates a relative feature of a target segment (the unit segment to be processed) with respect to an adjacent segment, based on the segment information of the target segment and the segment information of the adjacent segment. A recognition candidate storing unit stores recognition candidates that are targets of speech recognition, and a recognition result selecting unit selects a recognition result from the stored recognition candidates by utilizing the segment relation value.
    Type: Grant
    Filed: November 23, 2005
    Date of Patent: January 12, 2010
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masahide Ariu, Shinichi Tanaka, Takashi Masuko
  • Publication number: 20090319268
    Abstract: A method and an apparatus for measuring the intelligibility level of an audio announcement device (40) employ at least one speech recognition module (418; 518) for analyzing the reconstructed verbal content of the audio message announced by the audio announcement device (40), optionally by comparison with the verbal content of an original audio message.
    Type: Application
    Filed: June 19, 2009
    Publication date: December 24, 2009
    Applicant: ARCHEAN TECHNOLOGIES
    Inventors: Xavier AUMONT, Antoine WILHELM-JAUREGUIBERRY
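The measurement idea in publication 20090319268 above reduces to recognizing the announcement as heard through the public-address chain and comparing it with the original verbal content; the word-level match rate below serves as a stand-in intelligibility score (the use of difflib and the scoring formula are illustrative).

```python
# Sketch: word-level match rate between the recognized announcement and
# the original message as a stand-in intelligibility score.
import difflib

def intelligibility(recognized_text, original_text):
    rec = recognized_text.lower().split()
    ref = original_text.lower().split()
    matcher = difflib.SequenceMatcher(None, rec, ref)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref) if ref else 0.0

# 1.0 means every word of the original message was recovered in order.
print(intelligibility("please proceed to the nearest exit",
                      "please proceed to the nearest exit"))
```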