Creation Of Reference Templates; Training Of Speech Recognition Systems, E.g., Adaptation To The Characteristics Of The Speaker's Voice, Etc. (EPO) Patents (Class 704/E15.007)
  • Publication number: 20100312556
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions.
    Type: Application
    Filed: June 9, 2009
    Publication date: December 9, 2010
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Andrej LJOLJE, Alistair D. CONKIE, Ann K. SYRDAL
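    A minimal sketch of the adaptation loop described in 20100312556, assuming a hypothetical policy (thresholds, scaling factors, and field names are illustrative, not from the patent): record metrics for a session, then modify the allocated resources commensurate with them.

```python
# Hypothetical sketch: adjust allocated resources from recorded recognition metrics.
from dataclasses import dataclass

@dataclass
class Resources:
    bandwidth_kbps: int = 64
    cpu_ms_per_utterance: int = 200
    memory_mb: int = 128
    storage_mb: int = 512

@dataclass
class Metrics:
    confidence: float          # mean recognition confidence score, 0..1
    repeat_requests: int       # "please repeat" events in the session
    task_completed: bool

def adapt_resources(res: Resources, m: Metrics) -> Resources:
    """Scale resources commensurate with the recorded metrics (assumed policy)."""
    if m.confidence < 0.6 or m.repeat_requests > 2 or not m.task_completed:
        # Struggling speaker: grant more compute and model memory.
        res.cpu_ms_per_utterance = int(res.cpu_ms_per_utterance * 1.5)
        res.memory_mb = int(res.memory_mb * 1.25)
    elif m.confidence > 0.9 and m.task_completed:
        # Easy speaker: reclaim resources for other sessions.
        res.cpu_ms_per_utterance = max(100, int(res.cpu_ms_per_utterance * 0.8))
    return res

print(adapt_resources(Resources(), Metrics(confidence=0.5, repeat_requests=3, task_completed=False)))
```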
  • Publication number: 20100250235
    Abstract: In one example, a phrase analyzer may analyze a text input stream to identify phrases contained in the stream. The phrase analyzer may receive a specification, which includes dictionaries of phrases and synonyms, and a specification of the phrases or sequences of phrases to be matched. The phrase analyzer may compare the input stream to the specification and may produce, as output, an identification of which phrases appear in the input stream and where in the input stream those phrases occur.
    Type: Application
    Filed: March 24, 2009
    Publication date: September 30, 2010
    Applicant: MICROSOFT CORPORATION
    Inventor: Umesh Madan
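    A toy sketch of the phrase analyzer from 20100250235, assuming a hypothetical specification format (the `PHRASES` and `SYNONYMS` dictionaries and their labels are illustrative): expand synonyms, then report each matched phrase and where it occurs.

```python
# Hypothetical sketch: match a specification of phrases and synonyms against an input stream.
PHRASES = {"credit card": "PAYMENT", "gift card": "PAYMENT", "order status": "STATUS"}
SYNONYMS = {"cc": "credit card", "voucher": "gift card"}

def analyze(text: str):
    tokens = text.lower().split()
    # Expand synonyms first, then scan the normalized stream for phrases.
    tokens = [SYNONYMS.get(t, t) for t in tokens]
    stream = " ".join(tokens)
    hits = []
    for phrase, label in PHRASES.items():
        start = stream.find(phrase)
        while start != -1:
            hits.append((label, phrase, start))   # which phrase, and where
            start = stream.find(phrase, start + 1)
    return hits

print(analyze("I paid with my cc and checked order status"))
```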
  • Publication number: 20100250251
    Abstract: Architecture that suppresses the unexpected appearance of words by applying appropriate restrictions to long-term and short-term memory. Quick adaptation is also realized by leveraging these restrictions. The architecture includes a history component for processing user input history for conversion of a phonetic string by a conversion process that outputs conversion results, and an adaptation component for adapting the conversion process to the user input history based on restrictions applied to short-term memory that impact word appearances during the conversion process. The architecture performs probability boosting based on context-dependent probability differences (short-term memory), and dynamic linear interpolation between the long-term memory and baseline language models based on the frequency of a word's preceding context (long-term memory).
    Type: Application
    Filed: March 30, 2009
    Publication date: September 30, 2010
    Applicant: Microsoft Corporation
    Inventors: Katsutoshi Ohtsuki, Takashi Umeoka
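    The long-term-memory half of 20100250251 lends itself to a worked sketch. This assumes a simple interpolation weight that grows with how often the preceding context has been observed (the `capacity` knob and count model are illustrative, not from the patent):

```python
# Hypothetical sketch: dynamic linear interpolation between a user-history
# ("long-term memory") model and the baseline language model, weighted by
# the frequency of the word's preceding context.
from collections import defaultdict

history_counts = defaultdict(int)   # (context, word) -> count
context_counts = defaultdict(int)   # context -> count

def observe(context: str, word: str) -> None:
    history_counts[(context, word)] += 1
    context_counts[context] += 1

def p_adapted(word: str, context: str, p_baseline: float, capacity: int = 20) -> float:
    n = context_counts[context]
    lam = min(1.0, n / capacity)    # more evidence -> trust the user history more
    p_history = history_counts[(context, word)] / n if n else 0.0
    return lam * p_history + (1.0 - lam) * p_baseline

observe("watashi no", "namae")
observe("watashi no", "namae")
print(p_adapted("namae", "watashi no", p_baseline=0.01))   # -> 0.109
```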
  • Publication number: 20100241430
    Abstract: Disclosed are systems and methods for providing a spoken dialog system using meta-data to build language models to improve speech processing. Meta-data is generally defined as data outside received speech; for example, meta-data may be a customer profile having a name, address and purchase history of a caller to a spoken dialog system. The method comprises building tree clusters from meta-data and estimating a language model using the built tree clusters. The language model may be used by various modules in the spoken dialog system, such as the automatic speech recognition module and/or the dialog management module. Building the tree clusters from the meta-data may involve generating projections from the meta-data and further may comprise computing counts as a result of unigram tree clustering and then building both unigram trees and higher-order trees from the meta-data as well as computing node distances within the built trees that are used for estimating the language model.
    Type: Application
    Filed: June 3, 2010
    Publication date: September 23, 2010
    Applicant: AT&T Intellectual Property II, L.P., via transfer from AT&T Corp.
    Inventors: Michiel A. U. Bacchiani, Brian E. Roark
  • Publication number: 20100204990
    Abstract: A speech analyzer includes a vocal tract and sound source separating unit which separates a vocal tract feature and a sound source feature from an input speech, based on a speech generation model, a fundamental frequency stability calculating unit which calculates a temporal stability of a fundamental frequency of the input speech in the sound source feature, from the separated sound source feature, a stable analyzed period extracting unit which extracts time information of a stable period, based on the temporal stability, and a vocal tract feature interpolation unit which interpolates a vocal tract feature which is not included in the stable period, using a vocal tract feature included in the extracted stable period.
    Type: Application
    Filed: May 3, 2010
    Publication date: August 12, 2010
    Inventors: Yoshifumi Hirose, Takahiro Kamai
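    A rough sketch of the stability-and-interpolation idea in 20100204990, under stated assumptions: "temporal stability" is approximated here as low local standard deviation of F0, and unstable frames get their vocal tract features linearly interpolated from the surrounding stable frames. Both choices are illustrative stand-ins for the patent's units.

```python
# Hypothetical sketch: flag frames with temporally stable F0, then interpolate
# vocal tract features across the unstable gaps using only stable frames.
import numpy as np

def stable_mask(f0: np.ndarray, win: int = 1, tol: float = 5.0) -> np.ndarray:
    """A frame is 'stable' if local F0 standard deviation stays under tol (Hz)."""
    mask = np.zeros(len(f0), dtype=bool)
    for i in range(len(f0)):
        seg = f0[max(0, i - win): i + win + 1]
        mask[i] = seg.std() < tol and f0[i] > 0
    return mask

def interpolate_features(feat: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Replace features of unstable frames by interpolating between stable ones."""
    if not mask.any():
        return feat.copy()          # no stable frames to interpolate from
    idx = np.arange(len(feat))
    out = feat.copy()
    for d in range(feat.shape[1]):
        out[~mask, d] = np.interp(idx[~mask], idx[mask], feat[mask, d])
    return out

f0 = np.array([120, 121, 119, 300, 122, 120], dtype=float)   # one octave error
feat = np.random.rand(6, 3)                                   # toy vocal tract features
print(interpolate_features(feat, stable_mask(f0)))
```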
  • Publication number: 20100169093
    Abstract: An information processing apparatus for speech recognition includes a first speech dataset storing speech data uttered by low recognition rate speakers; a second speech dataset storing speech data uttered by a plurality of speakers; a third speech dataset storing speech data to be mixed with the speech data of the second speech dataset; a similarity calculating part obtaining, for each piece of the speech data in the second speech dataset, a degree of similarity to a given average voice in the first speech dataset; a speech data selecting part recording the speech data, the degree of similarity of which is within a given selection range, as selected speech data in the third speech dataset; and an acoustic model generating part generating a first acoustic model using the speech data recorded in the second speech dataset and the third speech dataset.
    Type: Application
    Filed: December 22, 2009
    Publication date: July 1, 2010
    Applicant: FUJITSU LIMITED
    Inventor: Nobuyuki Washio
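    A minimal sketch of the data-selection step in 20100169093, assuming cosine similarity over utterance embeddings and an illustrative selection range; the patent does not specify the similarity measure used here.

```python
# Hypothetical sketch: keep only speech data whose similarity to the average
# voice of low-recognition-rate speakers falls inside a selection range.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def select(second_dataset: list[np.ndarray], average_voice: np.ndarray,
           low: float = 0.7, high: float = 0.95) -> list[np.ndarray]:
    """Return data similar enough to the target average voice to help,
    but not so close that it adds no new variation (assumed rationale)."""
    return [x for x in second_dataset if low <= cosine(x, average_voice) <= high]

avg = np.array([1.0, 0.0, 0.5])
pool = [np.array([0.8, 0.5, 0.2]), np.array([-1.0, 0.2, 0.0])]
print(len(select(pool, avg)))   # -> 1
```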
  • Publication number: 20100169094
    Abstract: A speaker adaptation apparatus includes an acquiring unit configured to acquire an acoustic model including HMMs and decision trees for estimating what type of the phoneme or the word is included in a feature value used for speech recognition, the HMMs having a plurality of states on a phoneme-to-phoneme basis or a word-to-word basis, and the decision trees being configured to reply to questions relating to the feature value and output likelihoods in the respective states of the HMMs, and a speaker adaptation unit configured to adapt the decision trees to a speaker, the decision trees being adapted using speaker adaptation data vocalized by the speaker of an input speech.
    Type: Application
    Filed: September 17, 2009
    Publication date: July 1, 2010
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masami Akamine, Jitendra Ajmera, Partha Lal
  • Publication number: 20100161327
    Abstract: A computer-implemented method for automatically analyzing, predicting, and/or modifying acoustic units of prosodic human speech utterances for use in speech synthesis or speech recognition. Possible steps include: initiating analysis of acoustic wave data representing the human speech utterances, via the phase state of the acoustic wave data; using one or more phase state defined acoustic wave metrics as common elements for analyzing, and optionally modifying, pitch, amplitude, duration, and other measurable acoustic parameters of the acoustic wave data, at predetermined time intervals; analyzing acoustic wave data representing a selected acoustic unit to determine the phase state of the acoustic unit; and analyzing the acoustic wave data representing the selected acoustic unit to determine at least one acoustic parameter of the acoustic unit with reference to the determined phase state of the selected acoustic unit. Also included are systems for implementing the described and related methods.
    Type: Application
    Filed: December 16, 2009
    Publication date: June 24, 2010
    Inventors: Nishant CHANDRA, Reiner Wilhelms-Tricarico, Rattima Nitisaroj, Brian Mottershead, Gary A. Marple, John B. Reichenbach
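    One common way to obtain a per-sample "phase state" of an acoustic wave is via the analytic signal; the sketch below uses that as an assumed stand-in for the patent's phase-state metrics (20100161327 does not say it uses the Hilbert transform) and reads amplitude and instantaneous frequency at fixed intervals with reference to the phase.

```python
# Hypothetical sketch: phase state via the analytic signal, then acoustic
# parameters sampled at predetermined time intervals.
import numpy as np
from scipy.signal import hilbert

fs = 16000
t = np.arange(0, 0.05, 1 / fs)
x = 0.6 * np.sin(2 * np.pi * 150 * t)            # toy voiced segment at 150 Hz

analytic = hilbert(x)
phase = np.unwrap(np.angle(analytic))            # phase state of the waveform
amplitude = np.abs(analytic)                     # instantaneous amplitude
inst_freq = np.diff(phase) * fs / (2 * np.pi)    # instantaneous frequency (Hz)

step = fs // 100                                 # sample every 10 ms
for i in range(0, len(inst_freq), step):
    print(f"{t[i]*1000:5.1f} ms  phase={phase[i]:6.2f} rad  "
          f"amp={amplitude[i]:.2f}  f={inst_freq[i]:6.1f} Hz")
```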
  • Publication number: 20100153108
    Abstract: The present invention is a system and method for generating a personal voice font, including monitoring voice segments automatically from phone conversations of a user by a voice learning processor to generate a personalized voice font (PVF), and delivering the PVF to a server.
    Type: Application
    Filed: February 10, 2009
    Publication date: June 17, 2010
    Inventors: Zsolt Szalai, Philippe Bazot, Bernard Pucci, Joel Viale
  • Publication number: 20100138222
    Abstract: A method for adapting a codebook for speech recognition is disclosed, wherein the codebook is from a set of codebooks comprising a speaker-independent codebook and at least one speaker-dependent codebook. A speech input is received and a feature vector based on the received speech input is determined. For each of the Gaussian densities, a first mean vector is estimated using an expectation process that takes into account the determined feature vector. For each of the Gaussian densities, a second mean vector is determined using an Eigenvoice adaptation, again taking into account the determined feature vector. For each of the Gaussian densities, the mean vector is then set to a convex combination of the first and second mean vectors. This process thus allows adaptation during operation and does not require a lengthy training phase.
    Type: Application
    Filed: November 20, 2009
    Publication date: June 3, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Tobias Herbig, Franz Gerl
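    The combination step in 20100138222 maps directly to a one-liner. In this sketch the convex weight grows with the number of observations (the count-based weighting and the `tau` constant are assumptions; the patent only requires a convex combination):

```python
# Hypothetical sketch: per-Gaussian adapted mean as a convex combination of
# the EM estimate and the Eigenvoice estimate.
import numpy as np

def combine_means(mu_em: np.ndarray, mu_ev: np.ndarray, n_obs: float,
                  tau: float = 10.0) -> np.ndarray:
    """With little data, lean on the constrained Eigenvoice estimate; with
    more data, trust the expectation-process estimate (assumed weighting)."""
    alpha = n_obs / (n_obs + tau)
    return alpha * mu_em + (1.0 - alpha) * mu_ev

mu_em = np.array([1.2, -0.3])   # first mean vector (expectation process)
mu_ev = np.array([1.0, 0.0])    # second mean vector (Eigenvoice adaptation)
print(combine_means(mu_em, mu_ev, n_obs=5.0))   # lies between the two
```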
  • Publication number: 20100131272
    Abstract: Apparatuses and methods for generating and verifying a voice signature of a message and computer readable medium thereof are provided. The generation and verification ends both use the same set of pronounceable symbols. The set of pronounceable symbols comprises a plurality of pronounceable units, and each of the pronounceable units comprises an index and a pronounceable symbol. The generation end converts the message into a message digest by a hash function and generates a plurality of designated pronounceable symbols according to the message digest. A user utters the designated pronounceable symbols to generate the voice signature. After receiving the message and the voice signature, the verification end performs voice authentication to determine a user identity of the voice signature, performs speech recognition to determine the relation between the message and the voice signature, and determines whether the user generates the voice signature for the message.
    Type: Application
    Filed: January 6, 2009
    Publication date: May 27, 2010
    Applicant: INSTITUTE FOR INFORMATION INDUSTRY
    Inventor: Jui-Ming WU
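    A compact sketch of the generation end of 20100131272, assuming SHA-256 as the hash function and a toy symbol table (the patent specifies neither): digest bytes index into the shared set of pronounceable units, and the user utters the resulting symbols to form the voice signature.

```python
# Hypothetical sketch: message -> digest -> designated pronounceable symbols.
import hashlib

# Shared set of pronounceable units: index -> pronounceable symbol.
SYMBOLS = ["ba", "ko", "mi", "ren", "su", "ta", "vo", "zu"]

def designated_symbols(message: str, n: int = 6) -> list[str]:
    digest = hashlib.sha256(message.encode("utf-8")).digest()
    return [SYMBOLS[b % len(SYMBOLS)] for b in digest[:n]]

print(designated_symbols("transfer $100 to Alice"))
# The verification end recomputes this list from the received message and
# checks, via speech recognition, that the uttered signature matches it.
```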
  • Publication number: 20100121640
    Abstract: The present invention relates to a method for modeling common-language speech recognition, by a computer, under the influence of multiple dialects, and concerns the technical field of speech recognition by a computer. In this method, a triphone standard common-language model is first generated based on training data of the standard common language, and first and second monophone dialectal-accented common-language models are generated based on development data of dialectal-accented common language of the first kind and the second kind, respectively. A temporary merged model is then obtained by merging the first dialectal-accented common-language model into the standard common-language model according to a first confusion matrix, obtained by recognizing the development data of the first dialectal-accented common language using the standard common-language model.
    Type: Application
    Filed: October 29, 2009
    Publication date: May 13, 2010
    Applicants: SONY COMPUTER ENTERTAINMENT INC., TSINGHUA UNIVERSITY
    Inventors: Fang Zheng, Xi Xiao, Linquan Liu, Zhan You, Wenxiao Cao, Makoto Akabane, Ruxin Chen, Yoshikazu Takahashi
  • Publication number: 20100121638
    Abstract: Speech recognition is performed in near-real-time and improved by exploiting events and event sequences, employing machine learning techniques including boosted classifiers, ensembles, detectors and cascades and using perceptual clusters. Speech recognition is also improved using tandem processing. An automatic punctuator injects punctuation into recognized text streams.
    Type: Application
    Filed: November 11, 2009
    Publication date: May 13, 2010
    Inventors: Mark PINSON, David Pinson, SR., Mary Flanagan, Shahrokh Makanvand
  • Publication number: 20100106501
    Abstract: Updating a voice template for recognizing a speaker on the basis of a voice uttered by the speaker is disclosed. Stored voice templates indicate distinctive characteristics of utterances from speakers. Distinctive characteristics are extracted for a specific speaker based on a voice message utterance received from that speaker. The distinctive characteristics are compared to the characteristics indicated by the stored voice templates to select a template that matches within a predetermined threshold. The selected template is updated on the basis of the extracted characteristics.
    Type: Application
    Filed: October 27, 2009
    Publication date: April 29, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Yukari Miki, Masami Noguchi
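    A small sketch of the select-and-update flow in 20100106501, under stated assumptions: Euclidean distance as the matching measure and a running-average update rule, neither of which the abstract specifies.

```python
# Hypothetical sketch: pick the closest stored voice template; if it matches
# within a threshold, nudge it toward the new utterance's characteristics.
import numpy as np

def update_templates(templates: dict[str, np.ndarray], feats: np.ndarray,
                     threshold: float = 1.0, rate: float = 0.1) -> str | None:
    """Return the id of the updated template, or None if nothing matched."""
    best_id, best_dist = None, float("inf")
    for tid, tmpl in templates.items():
        d = float(np.linalg.norm(tmpl - feats))
        if d < best_dist:
            best_id, best_dist = tid, d
    if best_id is not None and best_dist <= threshold:
        templates[best_id] = (1 - rate) * templates[best_id] + rate * feats
        return best_id
    return None

db = {"alice": np.array([0.2, 1.0]), "bob": np.array([3.0, -1.0])}
print(update_templates(db, np.array([0.3, 0.9])))   # -> 'alice'
```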
  • Publication number: 20100100380
    Abstract: A system, method and computer-readable medium provide a multitask learning method for intent or call-type classification in a spoken language understanding system. Multitask learning aims at training tasks in parallel while using a shared representation. A computing device automatically re-uses the existing labeled data from various applications, which are similar but may have different call-types, intents or intent distributions to improve the performance. An automated intent mapping algorithm operates across applications. In one aspect, active learning is employed to selectively sample the data to be re-used.
    Type: Application
    Filed: December 28, 2009
    Publication date: April 22, 2010
    Applicant: AT&T Corp.
    Inventor: Gokhan Tur
  • Publication number: 20100076765
    Abstract: Described is a technology by which a structured model of repetition is used to determine the words spoken by a user, and/or a corresponding database entry, based in part on a prior utterance. For a repeated utterance, a joint probability analysis is performed on (at least some of) the corresponding word sequences, as recognized by one or more recognizers, and the associated acoustic data. For example, a generative probabilistic model or a maximum entropy model may be used in the analysis. The second utterance may be a repetition of the first utterance using the exact words, or another structural transformation thereof relative to the first utterance, such as an extension that adds one or more words, a truncation that removes one or more words, or a whole or partial spelling of one or more words.
    Type: Application
    Filed: September 19, 2008
    Publication date: March 25, 2010
    Applicant: MICROSOFT CORPORATION
    Inventors: Geoffrey G. Zweig, Xiao Li, Dan Bohus, Alejandro Acero, Eric J. Horvitz
  • Publication number: 20100076968
    Abstract: Implementations relate to systems and methods for aggregating and presenting data related to geographic locations. Geotag data related to geographic locations and associated features or attributes can be collected to build a regional profile characterizing a set of locations within the region. Geotag data related to the constituent locations, such as user ratings or popularity ranks for restaurants, shops, parks, or other features, sites, or attractions, can be combined to generate a profile of characteristics of locations in the region. The platform can generate recommendations of locations to transmit to the user of a mobile device, based for instance on the location of the device in the region as reported by GPS or other location service and the regional profile. Geotag data can include audio data analyzed using region-specific terms, and user recommendations can be presented via dynamic menus based on regional profiles, user preferences or other criteria.
    Type: Application
    Filed: May 21, 2009
    Publication date: March 25, 2010
    Inventors: Mark R. BOYNS, Chand MEHTA, Jeffrey C. TSAY, Giridhar D. MANDYAM
  • Publication number: 20100063817
    Abstract: An acoustic model registration apparatus, a talker recognition apparatus, an acoustic model registration method, and an acoustic model registration processing program are provided, each of which reliably prevents an acoustic model having a low talker-recognition capability from being registered.
    Type: Application
    Filed: March 14, 2007
    Publication date: March 11, 2010
    Applicant: Pioneer Corporation
    Inventors: Soichi Toyama, Ikuo Fujita, Yukio Kamoshida
  • Publication number: 20100057461
    Abstract: In a method and a system (20) for creating or updating entries in a speech recognition (SR) lexicon (7) of a speech recognition system, the entries map SR phoneme sequences to words. The method comprises entering a respective word and, in the case that the word is a new word to be added to the SR lexicon, also entering at least one associated SR phoneme sequence through input means (26). The SR phoneme sequence associated with the respective word is converted into speech by phoneme-to-speech conversion means (4.4), and the speech is played back by playback means (28), to control the match between the phoneme sequence and the word.
    Type: Application
    Filed: February 4, 2008
    Publication date: March 4, 2010
    Inventors: Andreas Neubacher, Gerhard Grobauer
  • Publication number: 20100049519
    Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.
    Type: Application
    Filed: November 5, 2009
    Publication date: February 25, 2010
    Applicant: AT&T Corp.
    Inventors: Mazin G. Rahim, Giuseppe Riccardi, Jeremy Huntley Wright, Bruce Melvin Buntschuh, Allen Louis Gorin
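    A toy sketch of the downstream stages of 20100049519, assuming an illustrative word-to-digit table, one sample conversion rule ("double"), and a stand-in validation set; the patent's numeric language and rule classes are richer than this.

```python
# Hypothetical sketch: word string -> digit sequence -> validity check.
WORD_TO_DIGIT = {"zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
                 "four": "4", "five": "5", "six": "6", "seven": "7",
                 "eight": "8", "nine": "9"}

def words_to_digits(words: str) -> str:
    digits, repeat = [], 1
    for w in words.lower().split():
        if w == "double":                # one example of a conversion rule
            repeat = 2
        elif w in WORD_TO_DIGIT:
            digits.append(WORD_TO_DIGIT[w] * repeat)
            repeat = 1
    return "".join(digits)

VALID = {"40852", "18000"}               # stand-in for the validation database

seq = words_to_digits("four oh eight five two")
print(seq, "valid" if seq in VALID else "invalid")   # -> 40852 valid
```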
  • Publication number: 20100036661
    Abstract: A computing system, comprising: an I/O platform for interfacing with a user; and a processing entity configured to implement a dialog with the user via the I/O platform. The processing entity is further configured for: identifying a grammar template and an instantiation context associated with a current point in the dialog; causing creation of an instantiated grammar model from the grammar template and the instantiation context; storing the instantiated grammar model in a memory; and interpreting user input received via the I/O platform in accordance with the instantiated grammar model. Also, a grammar authoring environment supporting a variety of grammar development tools is disclosed.
    Type: Application
    Filed: July 15, 2009
    Publication date: February 11, 2010
    Applicant: NU ECHO INC.
    Inventors: Dominique Boucher, Yves Normandin
  • Publication number: 20090326952
    Abstract: [Problem] To convert a non-audible murmur signal obtained through an in-vivo conduction microphone into a speech signal that a receiving person can recognize with maximum accuracy (and is unlikely to misrecognize).
    Type: Application
    Filed: February 7, 2007
    Publication date: December 31, 2009
    Applicant: NATIONAL UNIVERSITY CORPORATION NARA INSTITUTE OF SCIENCE AND TECHNOLOGY
    Inventors: Tomoki Toda, Mikihiro Nakagiri, Hideki Kashioka, Kiyohiro Shikano
  • Publication number: 20090265166
    Abstract: A boundary estimation apparatus includes a first boundary estimation unit which estimates a first boundary separating a speech into first meaning units, a second boundary estimation unit configured to estimate a second boundary separating a related speech into second meaning units related to the first meaning units, a pattern generating unit configured to generate a representative pattern showing a representative characteristic in an analysis interval, and a similarity calculation unit configured to calculate a similarity between the representative pattern and a characteristic pattern showing a feature in a calculation interval of the related speech. The second boundary estimation unit estimates the second boundary from the calculation intervals in which the similarity is higher than a threshold value or relatively high.
    Type: Application
    Filed: June 30, 2009
    Publication date: October 22, 2009
    Inventor: Kazuhiko Abe
  • Publication number: 20090259467
    Abstract: A voice recognition apparatus 10 carries out voice recognition of an inputted voice with reference to a voice recognition dictionary, and outputs a voice recognition result. In this voice recognition apparatus, a plurality of voice recognition dictionaries 23-1 to 23-N are provided according to predetermined classification items.
    Type: Application
    Filed: August 16, 2006
    Publication date: October 15, 2009
    Inventors: Yuki Sumiyoshi, Reiko Okada
  • Publication number: 20090210230
    Abstract: A system and method is provided for recognizing a speech input and selecting an entry from a list of entries. The method includes recognizing a speech input. A fragment list of fragmented entries is provided and compared to the recognized speech input to generate a candidate list of best matching entries based on the comparison result. The system includes a speech recognition module, and a data base for storing the list of entries and the fragmented list. The speech recognition module may obtain the fragmented list from the data base and store a candidate list of best matching entries in memory. A display may also be provided to allow the user to select from a list of best matching entries.
    Type: Application
    Filed: January 16, 2009
    Publication date: August 20, 2009
    Applicant: Harman Becker Automotive Systems GmbH
    Inventor: Markus Schwarz
  • Publication number: 20090204398
    Abstract: The fluency of a spoken utterance or passage is measured and presented to the speaker and to others. In one embodiment, a method is described that includes recording a spoken utterance, evaluating the spoken utterance for accuracy, evaluating the spoken utterance for duration, and assigning a score to the spoken utterance based on the accuracy and the duration.
    Type: Application
    Filed: June 24, 2005
    Publication date: August 13, 2009
    Inventors: Robert Du, Lingfei Song, Nan N. Li, Minerva Yeung
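    The scoring step of 20090204398 can be illustrated with a simple combination rule; the weights and the duration penalty below are assumptions, since the abstract only says the score is based on accuracy and duration.

```python
# Hypothetical sketch: fluency score from word accuracy plus a duration
# penalty relative to an expected reading time.
def fluency_score(correct_words: int, total_words: int,
                  duration_s: float, expected_s: float) -> float:
    accuracy = correct_words / total_words
    # Penalize deviation from the expected duration in either direction.
    pace = max(0.0, 1.0 - abs(duration_s - expected_s) / expected_s)
    return round(100 * (0.7 * accuracy + 0.3 * pace), 1)   # assumed weighting

print(fluency_score(correct_words=48, total_words=50,
                    duration_s=22.0, expected_s=20.0))     # -> 94.2
```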
  • Publication number: 20090171663
    Abstract: The present invention discloses creating and using speech recognition grammars of reduced size. The reduced speech recognition grammars can include a set of entries, each entry having a unique identifier and a phonetic representation that is used when matching speech input against the entries. Each entry can lack a textual spelling corresponding to the phonetic representation. The reduced speech recognition grammar can be digitally encoded and stored in a computer readable media, such as a hard drive or flash memory of a portable speech enabled device.
    Type: Application
    Filed: January 2, 2008
    Publication date: July 2, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: DANIEL E. BADT, VLADIMIR BERGL, JOHN W. ECKHART, RADEK HAMPL, JONATHAN PALGON, HARVEY M. RUBACK
  • Publication number: 20090157401
    Abstract: An intelligent query system for processing voiced-based queries is disclosed, which uses semantic based processing to identify the question posed by the user by understanding the meaning of the user's utterance. Based on identifying the meaning of the utterance, the system selects a single answer that best matches the user's query. The answer that is paired to this single question is then retrieved and presented to the user. The system, as implemented, accepts environmental variables selected by the user and is scalable to provide answers to a variety and quantity of user-initiated queries.
    Type: Application
    Filed: June 23, 2008
    Publication date: June 18, 2009
    Inventor: Ian M. Bennett
  • Publication number: 20090150153
    Abstract: Described is the use of acoustic data to improve grapheme-to-phoneme conversion for speech recognition, such as to more accurately recognize spoken names in a voice-dialing system. A joint model of acoustics and graphonemes (acoustic data, phonemes sequences, grapheme sequences and an alignment between phoneme sequences and grapheme sequences) is described, as is retraining by maximum likelihood training and discriminative training in adapting graphoneme model parameters using acoustic data. Also described is the unsupervised collection of grapheme labels for received acoustic data, thereby automatically obtaining a substantial number of actual samples that may be used in retraining. Speech input that does not meet a confidence threshold may be filtered out so as to not be used by the retrained model.
    Type: Application
    Filed: December 7, 2007
    Publication date: June 11, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Xiao Li, Asela J. R. Gunawardana, Alejandro Acero
  • Publication number: 20090150148
    Abstract: A voice recognition apparatus reduces false recognition caused by matching against phrases composed of a small number of syllables. The apparatus performs a recognition process on voice data, produced by a speaker, by pronunciation unit (such as a syllable), and further performs recognition by a method such as Word Spotting to match against the phrases stored in a phrase database. To reduce false matches, it compares the result of the per-pronunciation-unit recognition process with extended phrases obtained by adding an additional phrase before and/or behind the respective stored phrases.
    Type: Application
    Filed: October 1, 2008
    Publication date: June 11, 2009
    Applicant: FUJITSU LIMITED
    Inventor: Kenji ABE
  • Publication number: 20090138263
    Abstract: Provided are a data process unit and a data process unit control program that are suitable for generating acoustic models for unspecified speakers, taking the distribution of diversifying feature parameters into consideration under such specific conditions as the type of speaker, speech lexicons, speech styles, and speech environment, and that are suitable for providing acoustic models intended for unspecified speakers yet adapted to the speech of a specific person. A data process unit 1 comprises a data classification section 1a, data storing section 1b, pattern model generating section 1c, data control section 1d, mathematical distance calculating section 1e, pattern model converting section 1f, pattern model display section 1g, region dividing section 1h, division changing section 1i, region selecting section 1j, and specific pattern model generating section 1k.
    Type: Application
    Filed: December 30, 2008
    Publication date: May 28, 2009
    Inventors: Makoto Shozakai, Goshu Nagino
  • Publication number: 20090119104
    Abstract: Systems and methods are described that automatically control modules of dialog systems. The systems and methods include a dialog module that receives and processes utterances from a speaker and outputs data used to generate synthetic speech outputs as responses to the utterances. A controller is coupled to the dialog module, and the controller detects an abnormal output of the dialog module when the dialog module is processing in an automatic mode. The controller comprises a mode control for an agent to control the dialog module by correcting the abnormal output and transferring a corrected output to a downstream dialog module that follows, in a processing path, the dialog module. The corrected output is used in further processing the utterances.
    Type: Application
    Filed: November 7, 2007
    Publication date: May 7, 2009
    Applicant: Robert Bosch GmbH
    Inventors: Fuliang Weng, Baoshi Yan, Zhe Feng
  • Publication number: 20090119103
    Abstract: A method automatically recognizes speech received through an input. The method accesses one or more speaker-independent speaker models and detects whether the received speech input matches a speaker model according to an adaptable predetermined criterion. When no match occurs, the method creates a speaker model based on the input and assigns it to a speaker model set.
    Type: Application
    Filed: October 10, 2008
    Publication date: May 7, 2009
    Inventors: Franz Gerl, Tobias Herbig
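    A minimal sketch of the match-or-create step in 20090119103, under stated assumptions: cosine similarity over feature vectors as the matching score and a fixed criterion value, both stand-ins for the patent's adaptable criterion.

```python
# Hypothetical sketch: score the input against stored speaker models; enroll a
# new model into the speaker model set when nothing clears the criterion.
import numpy as np

def match_or_create(models: dict[str, np.ndarray], feats: np.ndarray,
                    criterion: float = 0.8) -> str:
    def score(m: np.ndarray) -> float:    # cosine similarity as a stand-in
        return float(m @ feats / (np.linalg.norm(m) * np.linalg.norm(feats)))
    if models:
        best = max(models, key=lambda k: score(models[k]))
        if score(models[best]) >= criterion:
            return best                    # recognized an existing speaker
    new_id = f"speaker_{len(models) + 1}"
    models[new_id] = feats.copy()          # create and assign a new model
    return new_id

speaker_set: dict[str, np.ndarray] = {}
print(match_or_create(speaker_set, np.array([0.1, 0.9])))   # -> 'speaker_1'
print(match_or_create(speaker_set, np.array([0.1, 0.9])))   # -> 'speaker_1'
```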
  • Publication number: 20080300865
    Abstract: In a natural language, mixed-initiative system, a method of processing user dialogue can include receiving a user input and determining whether the user input specifies an action to be performed or a token of an action. The user input can be selectively routed to an action interpreter or a token interpreter according to the determining step.
    Type: Application
    Filed: April 30, 2008
    Publication date: December 4, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Rajesh Balchandran, Linda Boyer
  • Publication number: 20080281582
    Abstract: An input system for mobile search and a method therefor are provided. The input system includes an input module that receives a code input for a specific term and a corresponding voice input; a database including a glossary and an acoustic model, wherein the glossary includes a plurality of terms and a sequence list, and each term has a search weight based on its order in the sequence list; a process module that selects a first number of candidate terms from the glossary according to the code input using an input algorithm, and obtains a second number of candidate terms by using a speech recognition algorithm to compare the voice input with the first number of candidate terms via the acoustic model, the second number of candidate terms being listed in a particular order based on their respective search weights; and an output module that shows the second number of candidate terms in that order for selecting the specific term therefrom.
    Type: Application
    Filed: October 1, 2007
    Publication date: November 13, 2008
    Inventors: Tien-ming Hsu, Ming-hong Wang, Yuan-chia Lu, Jia-lin Shen
  • Publication number: 20080177546
    Abstract: A novel system for speech recognition uses differential cepstra over time frames as acoustic features, together with the traditional static cepstral features, for hidden trajectory modeling, and provides greater accuracy and performance in automatic speech recognition. According to one illustrative embodiment, an automatic speech recognition method includes receiving a speech input, generating an interpretation of the speech, and providing an output based at least in part on the interpretation of the speech input. The interpretation of the speech uses hidden trajectory modeling with observation vectors that are based on cepstra and on differential cepstra derived from the cepstra. A method is developed that can automatically train the hidden trajectory model's parameters corresponding to the components of the differential cepstra in the full acoustic feature vectors.
    Type: Application
    Filed: January 19, 2007
    Publication date: July 24, 2008
    Applicant: Microsoft Corporation
    Inventors: Li Deng, Dong Yu
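    The feature construction in 20080177546 is easy to show concretely. This sketch derives differential cepstra from static cepstra with a central difference and appends them; the difference scheme and edge padding are common conventions, not taken from the patent.

```python
# Hypothetical sketch: append differential cepstra (frame deltas) to static
# cepstral vectors to form the full acoustic feature vectors.
import numpy as np

def add_differential_cepstra(cepstra: np.ndarray) -> np.ndarray:
    """cepstra: (frames, coeffs) static features -> (frames, 2*coeffs)."""
    delta = np.zeros_like(cepstra)
    delta[1:-1] = (cepstra[2:] - cepstra[:-2]) / 2.0    # central difference
    delta[0], delta[-1] = delta[1], delta[-2]           # edge padding
    return np.hstack([cepstra, delta])

static = np.random.rand(100, 13)           # e.g. 13 MFCCs over 100 frames
full = add_differential_cepstra(static)
print(full.shape)                           # -> (100, 26)
```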
  • Publication number: 20080172232
    Abstract: A voice recognition emergency system and method. The system includes a microphone, a speaker, and a voice recognition emergency device comprising a processor and a transmitter. The processor analyzes sound received from the microphone, continuously listening for any of a set of pre-defined, spoken emergency phrases; upon detecting that one has been spoken, it triggers an alert condition and conveys it to a gateway device via the transmitter. The alert condition may be conveyed to the gateway device through Power Line Communications (PLC), radio communications, Wi-Fi communications, or Ethernet communications. The processor may also recognize an emergency phrase spoken by a particular person.
    Type: Application
    Filed: January 14, 2008
    Publication date: July 17, 2008
    Inventor: Scott A. Gurley
  • Publication number: 20080162135
    Abstract: A method, article of manufacture, and apparatus for monitoring data traffic on a network is disclosed. In an embodiment, this includes obtaining intrinsic data from at least a portion of the traffic, obtaining extrinsic data from at least a portion of the traffic, associating the intrinsic data with the extrinsic data, and logging the intrinsic data and extrinsic data. The portion of the traffic from which the intrinsic data and extrinsic data are derived may not be stored, or may be stored in encrypted form.
    Type: Application
    Filed: December 30, 2006
    Publication date: July 3, 2008
    Inventors: Christopher Hercules Claudatos, William Dale Andruss, Scott R. Bevan
  • Publication number: 20080154613
    Abstract: A voice processing system for a vehicle environment is provided for detecting a sound signal in the vehicle environment and identifying a voice command that originates from a vehicle user outside the vehicle. In detecting the sound signal, the voice processing system takes into account position information relating to the position of the vehicle user in the vehicle environment. Information on the position of the vehicle user may be obtained from a keyless-go system or another monitoring device of the motor vehicle, for example an optical imaging device of a parking-assistance system or a driver-assistance system.
    Type: Application
    Filed: August 6, 2007
    Publication date: June 26, 2008
    Applicant: Harman Becker Automotive Systems GmbH
    Inventors: Tim Haulick, Markus Buck, Hans-Joerg Koepf
  • Publication number: 20080103771
    Abstract: A method for the distributed construction of a voice recognition model that is intended to be used by a device comprising a model base and a reference base in which the modeling elements are stored. The method includes the steps of obtaining the entity to be modeled, transmitting data representative of the entity over a communication link to a server, determining a set of modeling parameters indicating the modeling elements, transmitting the modeling parameters to the device, determining the voice recognition model of the entity to be modeled as a function of at least the modeling parameters received and at least one modeling element that is stored in the reference base and indicated in the transmitted parameters, and subsequently saving the voice recognition model in the model base.
    Type: Application
    Filed: October 27, 2005
    Publication date: May 1, 2008
    Applicant: France Telecom
    Inventors: Denis Jouvet, Jean Monne
  • Publication number: 20080077405
    Abstract: A password grammar for speech recognition is described. A password is normalized into a list of strings of a plurality of character types such as letters and numerals. For each string of letters, one or more corresponding letter permutations are determined which represent pronounceable combinations of that string. Then, for each letter permutation, a corresponding recognition grammar entry is created for a speech recognition grammar.
    Type: Application
    Filed: September 21, 2007
    Publication date: March 27, 2008
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventor: Richard Breuer
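    A sketch of the normalization and permutation steps in 20080077405, with stated assumptions: the split into letter and digit runs follows the abstract, while the "pronounceability" test (presence of a vowel) and the two variant forms are illustrative stand-ins for the patent's permutation rules.

```python
# Hypothetical sketch: normalize a password into letter/digit strings and
# generate candidate pronunciations for each letter run.
import re

def normalize(password: str) -> list[str]:
    return re.findall(r"[A-Za-z]+|[0-9]+", password)

def letter_permutations(run: str) -> list[str]:
    spelled = " ".join(run.lower())           # "abc" -> "a b c"
    variants = [spelled]
    if re.search(r"[aeiou]", run.lower()):    # crude pronounceability test
        variants.append(run.lower())          # read the run as one word
    return variants

for part in normalize("sun42xyz"):
    if part.isalpha():
        print(part, "->", letter_permutations(part))   # grammar entries
    else:
        print(part, "-> digits", list(part))
```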
  • Publication number: 20080015858
    Abstract: A speech reference enrollment method involves the following steps: (a) requesting a user speak a vocabulary word; (b) detecting a first utterance (354); (c) requesting the user speak the vocabulary word; (d) detecting a second utterance (358); (e) determining a first similarity between the first utterance and the second utterance (362); (f) when the first similarity is less than a predetermined similarity, requesting the user speak the vocabulary word; (g) detecting a third utterance (366); (h) determining a second similarity between the first utterance and the third utterance (370); and (i) when the second similarity is greater than or equal to the predetermined similarity, creating a reference (364).
    Type: Application
    Filed: July 9, 2007
    Publication date: January 17, 2008
    Inventor: Robert Bossemeyer
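    The enrollment steps (a) through (i) in 20080015858 form a simple control loop, sketched below. The `capture` and `similarity` functions are assumed stand-ins (the patent does not define the similarity measure), and averaging the two matching utterances into the reference is an illustrative choice.

```python
# Hypothetical sketch of the enrollment loop: prompt, capture, compare
# utterance pairs, and create the reference once two utterances agree.
import numpy as np

def capture(prompt: str) -> np.ndarray:
    print(prompt)
    return np.random.rand(16)            # stand-in for detecting an utterance

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def enroll(word: str, threshold: float = 0.8, max_tries: int = 5):
    first = capture(f"Please say '{word}'")                    # steps (a)-(b)
    for _ in range(max_tries):
        nxt = capture(f"Please say '{word}' again")            # (c)-(d), (f)-(g)
        if similarity(first, nxt) >= threshold:                # (e), (h)
            return (first + nxt) / 2.0                         # (i) create reference
    return None                                                # enrollment failed

print(enroll("geranium") is not None)
```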