Specialized Models Patents (Class 704/255)
  • Patent number: 8612212
    Abstract: The invention concerns a method and corresponding system for building a phonotactic model for domain independent speech recognition. The method may include recognizing phones from a user's input communication using a current phonotactic model, detecting morphemes (acoustic and/or non-acoustic) from the recognized phones, and outputting the detected morphemes for processing. The method also updates the phonotactic model with the detected morphemes and stores the new model in a database for use by the system during the next user interaction. The method may also include making task-type classification decisions based on the detected morphemes from the user's input communication.
    Type: Grant
    Filed: March 4, 2013
    Date of Patent: December 17, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Giuseppe Riccardi
  • Patent number: 8612207
    Abstract: Language analysis means 21 analyzes texts read from a text DB 11 and generates a sentence structure as the analysis result. Similar-structure generation adjustment means 25 generates, from an input of an input device, a determination item for deciding, for every type of difference between sentence structures, whether or not the structures are identical. Similar-structure determination adjustment means 26 generates, from an input of the input device 6, a determination item for deciding, for every type of attribute value, whether or not a difference between attribute values is ignored. Similar-structure generating means 22 generates a similar structure of a partial structure forming the sentence structure obtained by language analysis means 21, in accordance with the determination item from similar-structure generation adjustment means 25, and sets the generated similar structure as an equivalence class of the generation-source partial structure.
    Type: Grant
    Filed: March 17, 2005
    Date of Patent: December 17, 2013
    Assignee: NEC Corporation
    Inventors: Yousuke Sakao, Kenji Satoh, Susumu Akamine
  • Patent number: 8606580
    Abstract: A data process unit and data process unit control program are provided that are suitable for generating acoustic models for unspecified speakers, taking the distribution of diversifying feature parameters into consideration under such specific conditions as the type of speaker, speech lexicons, speech styles, and speech environment, and that are suitable for providing acoustic models intended for unspecified speakers and adapted to the speech of a specific person. The data process unit comprises a data classification section, data storing section, pattern model generating section, data control section, mathematical distance calculating section, pattern model converting section, pattern model display section, region dividing section, division changing section, region selecting section, and specific pattern model generating section.
    Type: Grant
    Filed: December 30, 2008
    Date of Patent: December 10, 2013
    Assignee: Asahi Kasei Kabushiki Kaisha
    Inventors: Makoto Shozakai, Goshu Nagino
  • Patent number: 8605885
    Abstract: Systems and methods for handling information communicated by voice. The method may comprise: (i) receiving a call from a caller, the call comprising utterances from the caller; (ii) verbally communicating information to the caller through a customer service representative, the representative interacting with a display; (iii) processing the utterances with a computing device; (iv) determining content of the utterances; and (v) displaying information on the display based on the content.
    Type: Grant
    Filed: October 23, 2009
    Date of Patent: December 10, 2013
    Assignee: Next IT Corporation
    Inventor: Charles C. Wooters
  • Patent number: 8600743
    Abstract: Systems, methods, and devices for noise profile determination for a voice-related feature of an electronic device are provided. In one example, an electronic device capable of such noise profile determination may include a microphone and data processing circuitry. When a voice-related feature of the electronic device is not in use, the microphone may obtain ambient sounds. The data processing circuitry may determine a noise profile based at least in part on the obtained ambient sounds. The noise profile may enable the data processing circuitry to at least partially filter other ambient sounds obtained when the voice-related feature of the electronic device is in use.
    Type: Grant
    Filed: January 6, 2010
    Date of Patent: December 3, 2013
    Assignee: Apple Inc.
    Inventors: Aram Lindahl, Joseph M. Williams, Gints Valdis Klimanis
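The idle-time noise-profile idea above can be sketched as a magnitude-spectrum average followed by subtraction during active use; the function names, the use of magnitude subtraction, and the spectral floor are assumptions for illustration, not details from the patent.

```python
def build_noise_profile(ambient_frames):
    """Average the magnitude spectra captured while the voice-related
    feature is idle to form a noise profile."""
    n_frames = len(ambient_frames)
    profile = [0.0] * len(ambient_frames[0])
    for frame in ambient_frames:
        for i, mag in enumerate(frame):
            profile[i] += mag / n_frames
    return profile

def filter_with_profile(frame, profile, floor=0.0):
    """Partially filter a frame obtained during active use by
    subtracting the noise profile, clamped at a spectral floor."""
    return [max(mag - noise, floor) for mag, noise in zip(frame, profile)]

ambient = [[0.2, 0.4, 0.1], [0.4, 0.2, 0.3]]   # idle-time magnitude spectra
profile = build_noise_profile(ambient)
speech = [1.0, 0.5, 0.2]
print([round(x, 3) for x in filter_with_profile(speech, profile)])
# [0.7, 0.2, 0.0]
```

The same profile would then be applied to every frame captured while the voice-related feature is active.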
  • Patent number: 8595010
    Abstract: An information storage medium stores a program for generating Hidden Markov Models to be used for speech recognition with a given speech recognition system. The program causes a computer to function as a scheduled-to-be-used model group storage section, which stores a scheduled-to-be-used model group including a plurality of Hidden Markov Models scheduled to be used by the given speech recognition system, and a filler model generation section, which generates Hidden Markov Models to be used as filler models by the given speech recognition system based on all or at least a part of the Hidden Markov Model group in the scheduled-to-be-used model group.
    Type: Grant
    Filed: February 5, 2010
    Date of Patent: November 26, 2013
    Assignee: Seiko Epson Corporation
    Inventors: Paul W. Shields, Matthew E. Dunnachie, Yasutoshi Takizawa
  • Patent number: 8589334
    Abstract: Methods and systems are provided for developing decision information relating to a single system based on data received from a plurality of sensors. The method includes receiving first data from a first sensor that defines first information of a first type that is related to a system, receiving second data from a second sensor that defines second information of a second type that is related to said system, wherein the first type is different from the second type, generating a first decision model, a second decision model, and a third decision model, determining whether data is available from only the first sensor, only the second sensor, or both the first and second sensors, and selecting, based on the determination of availability, an additional model to which to apply the available data, wherein the additional model is selected from a plurality of additional decision models including the third decision model.
    Type: Grant
    Filed: January 18, 2011
    Date of Patent: November 19, 2013
    Assignee: Telcordia Technologies, Inc.
    Inventor: Akshay Vashist
  • Patent number: 8589163
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for performing speech recognition based on a masked language model. A system configured to practice the method receives a masked language model including a plurality of words, wherein a bit mask identifies whether each of the plurality of words is allowed or disallowed with regard to an adaptation subset, receives input speech, generates a speech recognition lattice based on the received input speech using the masked language model, removes from the generated lattice words identified as disallowed by the bit mask for the adaptation subset, and recognizes the received speech based on the lattice. Alternatively during the generation step, the system can only add words indicated as allowed by the bit mask. The bit mask can be separate from or incorporated as part of the masked language model. The system can dynamically update the adaptation subset and bit mask.
    Type: Grant
    Filed: December 4, 2009
    Date of Patent: November 19, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Mazin Gilbert
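As a rough illustration, the bit-mask filtering step might look like the following sketch. The mask layout (a per-subset mapping of word to allowed flag) and the default-allow behavior for unlisted words are assumptions; the patent only specifies that the mask marks words as allowed or disallowed per adaptation subset.

```python
def prune_lattice(lattice_words, bit_mask, adaptation_subset):
    """Remove words the bit mask marks as disallowed for the given
    adaptation subset; unlisted words are kept by default."""
    allowed = bit_mask.get(adaptation_subset, {})
    return [w for w in lattice_words if allowed.get(w, True)]

bit_mask = {"banking": {"transfer": True, "touchdown": False}}
lattice = ["transfer", "touchdown", "balance"]
print(prune_lattice(lattice, bit_mask, "banking"))
# ['transfer', 'balance']
```

The alternative path described in the abstract, adding only mask-allowed words during lattice generation, would use the same lookup at insertion time instead of as a post-filter.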
  • Patent number: 8583436
    Abstract: A word category estimation apparatus (100) includes a word category model (5) which is formed from a probability model having a plurality of kinds of information about a word category as features, and includes information about an entire word category graph as at least one of the features. A word category estimation unit (4) receives the word category graph of a speech recognition hypothesis to be processed, computes scores by referring to the word category model for respective arcs that form the word category graph, and outputs a word category sequence candidate based on the scores.
    Type: Grant
    Filed: December 19, 2008
    Date of Patent: November 12, 2013
    Assignee: NEC Corporation
    Inventors: Hitoshi Yamamoto, Kiyokazu Miki
  • Publication number: 20130297313
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.
    Type: Application
    Filed: April 12, 2013
    Publication date: November 7, 2013
    Inventors: Matthew I. Lloyd, Trausti T. Kristjansson
  • Patent number: 8577670
    Abstract: A statistical language model (SLM) may be iteratively refined by considering N-gram counts in new data, and blending the information contained in the new data with the existing SLM. A first group of documents is evaluated to determine the probabilities associated with the different N-grams observed in the documents. An SLM is constructed based on these probabilities. A second group of documents is then evaluated to determine the probabilities associated with each N-gram in that second group. The existing SLM is then evaluated to determine how well it explains the probabilities in the second group of documents, and a weighting parameter is calculated from that evaluation. Using the weighting parameter, a new SLM is then constructed as a weighted average of the existing SLM and the new probabilities.
    Type: Grant
    Filed: January 8, 2010
    Date of Patent: November 5, 2013
    Assignee: Microsoft Corporation
    Inventors: Kuansan Wang, Xiaolong Li, Jiangbo Miao, Frederic H. Behr, Jr.
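The blending step above can be sketched as a weighted average over the union of observed N-grams. In this sketch the weighting parameter is passed in directly; the patent derives it by evaluating how well the existing SLM explains the second group of documents.

```python
from collections import Counter

def ngram_probs(tokens, n=2):
    """Relative frequencies of the n-grams observed in a token list."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}

def blend_slm(existing, new, weight):
    """New SLM as a weighted average of the existing model and the
    probabilities observed in the new document group."""
    grams = set(existing) | set(new)
    return {g: weight * existing.get(g, 0.0) + (1 - weight) * new.get(g, 0.0)
            for g in grams}

old = ngram_probs("the cat sat".split())   # each bigram has probability 0.5
new = ngram_probs("the cat ran".split())
blended = blend_slm(old, new, weight=0.5)
print(blended[("the", "cat")], blended[("cat", "ran")])
# 0.5 0.25
```

Iterating this over successive document groups refines the model while retaining mass on N-grams seen only in earlier data.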
  • Patent number: 8577678
    Abstract: A speech recognition system according to the present invention includes a sound source separating section which separates mixed speeches from multiple sound sources from one another; a mask generating section which generates a soft mask which can take continuous values between 0 and 1 for each frequency spectral component of a separated speech signal using distributions of speech signal and noise against separation reliability of the separated speech signal; and a speech recognizing section which recognizes speeches separated by the sound source separating section using soft masks generated by the mask generating section.
    Type: Grant
    Filed: March 10, 2011
    Date of Patent: November 5, 2013
    Assignee: Honda Motor Co., Ltd.
    Inventors: Kazuhiro Nakadai, Toru Takahashi, Hiroshi Okuno
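A continuous mask in (0, 1) per spectral component can be sketched with a logistic squashing of the separation-reliability score. The patent derives its mask from distributions of speech signal and noise against separation reliability, so the logistic form and its parameters here are assumptions that only match the mask's shape, not its derivation.

```python
import math

def soft_mask(reliability, midpoint=0.5, steepness=10.0):
    """Map a per-bin separation-reliability score to a continuous
    mask value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * (reliability - midpoint)))

spectrum      = [1.0, 1.0, 1.0, 1.0]   # magnitudes of one separated frame
reliabilities = [0.9, 0.6, 0.4, 0.1]   # per-bin separation reliability
masked = [s * soft_mask(r) for s, r in zip(spectrum, reliabilities)]
print([round(m, 2) for m in masked])
# [0.98, 0.73, 0.27, 0.02]
```

Unreliable bins are attenuated rather than zeroed, which is the advantage of a soft mask over a binary one for the downstream recognizer.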
  • Patent number: 8571866
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving speech recognition accuracy using textual context. The method includes retrieving a recorded utterance, capturing text from a device display associated with the spoken dialog and viewed by one party to the recorded utterance, and identifying words in the captured text that are relevant to the recorded utterance. The method further includes adding the identified words to a dynamic language model, and recognizing the recorded utterance using the dynamic language model. The recorded utterance can be a spoken dialog. A time stamp can be assigned to each identified word. The method can include adding identified words to and/or removing identified words from the dynamic language model based on their respective time stamps. A screen scraper can capture text from the device display associated with the recorded utterance. The device display can contain customer service data.
    Type: Grant
    Filed: October 23, 2009
    Date of Patent: October 29, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Dan Melamed, Srinivas Bangalore, Michael Johnston
  • Patent number: 8566095
    Abstract: Systems, methods, and apparatuses including computer program products for segmenting words using scaled probabilities. In one implementation, a method is provided. The method includes receiving a probability of a n-gram identifying a word, determining a number of atomic units in the corresponding n-gram, identifying a scaling weight depending on the number of atomic units in the n-gram, and applying the scaling weight to the probability of the n-gram identifying a word to determine a scaled probability of the n-gram identifying a word.
    Type: Grant
    Filed: October 11, 2011
    Date of Patent: October 22, 2013
    Assignee: Google Inc.
    Inventor: Mark Davis
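The scaling step can be sketched as below. The abstract says only that a weight keyed on the number of atomic units is applied to the n-gram's probability; the specific weights, and the choice to apply them as multipliers in log-probability space, are assumptions for illustration.

```python
import math

# Illustrative scaling weights keyed on the number of atomic units.
SCALING = {1: 0.8, 2: 1.0, 3: 1.2}

def scaled_log_prob(prob, n_atomic):
    """Apply the length-dependent scaling weight to an n-gram's
    word probability in log space."""
    weight = SCALING.get(n_atomic, 1.2)
    return weight * math.log(prob)

# Two segmentation candidates with equal raw probability no longer tie
# once scaled (with these weights the shorter candidate is favored):
print(scaled_log_prob(0.1, 1) > scaled_log_prob(0.1, 3))
# True
```

In a real segmenter the weights would be tuned so that the scaled probabilities compare fairly across candidate words of different lengths.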
  • Patent number: 8560319
    Abstract: The present invention provides a method and apparatus for segmenting a multi-media program based upon audio events. In an embodiment, a method of classifying an audio stream is provided. The method includes receiving an audio stream, sampling the audio stream at a predetermined rate, and combining a predetermined number of samples into a clip. A plurality of features are then determined for the clip and analyzed using a linear approximation algorithm. The clip is then characterized based upon the results of the analysis conducted with the linear approximation algorithm.
    Type: Grant
    Filed: January 15, 2008
    Date of Patent: October 15, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Qian Huang, Zhu Liu
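The per-clip pipeline (samples combined into a clip, features computed, then a linear analysis) can be sketched as follows. The two features, the class labels, and the linear weights are illustrative stand-ins for the patent's feature set and trained linear approximation.

```python
def clip_features(samples):
    """Short-time energy and zero-crossing rate for one clip."""
    energy = sum(s * s for s in samples) / len(samples)
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return [energy, crossings / (len(samples) - 1)]

def classify_clip(features, weights, bias):
    """Characterize a clip with a linear score over its features."""
    score = sum(f * w for f, w in zip(features, weights)) + bias
    return "speech" if score > 0 else "music"

clip = [0.5, -0.5, 0.4, -0.4, 0.3, -0.3]   # toy clip of combined samples
feats = clip_features(clip)
print(classify_clip(feats, weights=[1.0, 2.0], bias=-1.0))
# speech
```

Segment boundaries in the program would then be placed where consecutive clips change class.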
  • Patent number: 8560324
    Abstract: A mobile terminal including an input unit configured to receive an input to activate a voice recognition function on the mobile terminal, a memory configured to store information related to operations performed on the mobile terminal, and a controller configured to activate the voice recognition function upon receiving the input to activate the voice recognition function, to determine a meaning of an input voice instruction based on at least one prior operation performed on the mobile terminal and a language included in the voice instruction, and to provide operations related to the determined meaning of the input voice instruction based on the at least one prior operation performed on the mobile terminal and the language included in the voice instruction and based on a probability that the determined meaning of the input voice instruction matches the information related to the operations of the mobile terminal.
    Type: Grant
    Filed: January 31, 2012
    Date of Patent: October 15, 2013
    Assignee: LG Electronics Inc.
    Inventors: Jong-Ho Shin, Jae-Do Kwak, Jong-Keun Youn
  • Publication number: 20130262117
    Abstract: The invention presents a method for analyzing speech in a spoken dialog system, comprising the steps of: accepting an utterance by at least one means for accepting acoustical signals, in particular a microphone, analyzing the utterance and obtaining prosodic cues from the utterance using at least one processing engine, wherein the utterance is evaluated based on the prosodic cues to determine a prominence of parts of the utterance, and wherein the utterance is analyzed to detect at least one marker feature, e.g. a negative statement, indicative of the utterance containing at least one part to replace at least one part in a previous utterance, the part to be replaced in the previous utterance being determined based on the prominence determined for the parts of the previous utterance and the replacement parts being determined based on the prominence of the parts in the utterance, and wherein the previous utterance is evaluated with the replacement part(s).
    Type: Application
    Filed: March 18, 2013
    Publication date: October 3, 2013
    Applicant: HONDA RESEARCH INSTITUTE EUROPE GMBH
    Inventor: Martin HECKMANN
  • Patent number: 8548807
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: June 9, 2009
    Date of Patent: October 1, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
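The "weighted sum of acoustic models" idea can be sketched with one-dimensional Gaussians standing in for full acoustic models; the phoneme labels, mixture weights, and Gaussian parameters below are illustrative assumptions, not values from the patent.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian, standing in for a phoneme's
    acoustic model."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def mixture_pdf(x, components):
    """Custom phoneme model: a weighted sum of plausible phonemes'
    models. components: list of (weight, mean, var), weights sum to 1."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)

# The dictionary entry for 'ae' keeps its pronunciation, but its
# acoustic space becomes mostly 'ae' blended with some 'eh', as heard
# from the new speaker:
custom_ae = [(0.7, 1.0, 0.5), (0.3, 1.8, 0.5)]
print(round(mixture_pdf(1.0, custom_ae), 4))
# 0.4842
```

This captures the abstract's key point: the pronouncing dictionary is unchanged, and only the acoustic model behind each phoneme is restructured.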
  • Patent number: 8543400
    Abstract: Voice processing methods and systems are provided. An utterance is received. The utterance is compared with teaching materials according to at least one matching algorithm to obtain a plurality of matching values corresponding to a plurality of voice units of the utterance. Respective voice units are scored in at least one first scoring item according to the matching values and a personified voice scoring algorithm. The personified voice scoring algorithm is generated according to training utterances corresponding to at least one training sentence in a phonetic-balanced sentence set of a plurality of learners and at least one real teacher, and scores corresponding to the respective voice units of the training utterances of the learners in the first scoring item provided by the real teacher.
    Type: Grant
    Filed: June 6, 2008
    Date of Patent: September 24, 2013
    Assignee: National Taiwan University
    Inventors: Lin-Shan Lee, Che-Kuang Lin, Chia-Lin Chang, Yi-Jing Lin, Yow-Bang Wang, Yun-Huan Lee, Li-Wei Cheng
  • Patent number: 8538757
    Abstract: In embodiments of the present invention, a system and computer-implemented method for enabling a user to interact with a computer platform using a voice command may include the steps of defining a structured grammar for generating a global voice command, defining a global voice command of the structured grammar, wherein defining the global voice command includes building a custom list of objects, and mapping at least one function of a listed object from the custom list of objects to the global voice command, wherein upon receiving voice input from the user the platform recognizes at least one global voice command in the voice input and executes the function on the listed object in accordance with the recognized global voice command.
    Type: Grant
    Filed: December 21, 2009
    Date of Patent: September 17, 2013
    Assignee: Redstart Systems, Inc.
    Inventor: Kimberly Patch
  • Patent number: 8538752
    Abstract: The invention comprises a method and apparatus for predicting word accuracy. Specifically, the method comprises obtaining an utterance in speech data where the utterance comprises an actual word string, processing the utterance for generating an interpretation of the actual word string, processing the utterance to identify at least one utterance frame, and predicting a word accuracy associated with the interpretation according to at least one stationary signal-to-noise ratio and at least one non-stationary signal-to-noise ratio, wherein the at least one stationary signal-to-noise ratio and the at least one non-stationary signal-to-noise ratio are determined according to a frame energy associated with each of the at least one utterance frame.
    Type: Grant
    Filed: May 7, 2012
    Date of Patent: September 17, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Mazin Gilbert, Hong Kook Kim
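The frame-energy inputs to the prediction can be sketched as below: a stationary SNR against a single time-averaged noise energy, and a non-stationary SNR that tracks noise frame by frame. The abstract does not define the estimators, so these formulas are illustrative assumptions.

```python
import math

def frame_energy(frame):
    """Mean squared amplitude of one utterance frame."""
    return sum(s * s for s in frame) / len(frame)

def stationary_snr_db(speech_frames, noise_frames):
    """SNR against a single time-averaged noise energy."""
    s = sum(frame_energy(f) for f in speech_frames) / len(speech_frames)
    n = sum(frame_energy(f) for f in noise_frames) / len(noise_frames)
    return 10 * math.log10(s / n)

def nonstationary_snr_db(speech_frames, noise_frames):
    """Average of per-frame SNRs, following noise that varies in time."""
    ratios = [frame_energy(s) / frame_energy(n)
              for s, n in zip(speech_frames, noise_frames)]
    return 10 * math.log10(sum(ratios) / len(ratios))

speech = [[0.8, -0.8], [0.4, -0.4]]
noise  = [[0.1, -0.1], [0.2, -0.2]]
print(round(stationary_snr_db(speech, noise), 2))
# 12.04
```

The two SNR figures would then feed a trained predictor of word accuracy for the recognizer's interpretation.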
  • Publication number: 20130238336
    Abstract: Speech recognition systems may perform the following operations: receiving audio; recognizing the audio using language models for different languages to produce recognition candidates for the audio, where the recognition candidates are associated with corresponding recognition scores; identifying a candidate language for the audio; selecting a recognition candidate based on the recognition scores and the candidate language; and outputting data corresponding to the selected recognition candidate as a recognized version of the audio.
    Type: Application
    Filed: December 26, 2012
    Publication date: September 12, 2013
    Inventors: Yun-hsuan Sung, Francoise Beaufays, Brian Strope, Hui Lin, Jui-Ting Huang
  • Patent number: 8532995
    Abstract: A method, system and machine-readable medium are provided. Speech input is received at a speech recognition component and recognized output is produced. A common dialog cue from the received speech input or input from a second source is recognized. An action is performed corresponding to the recognized common dialog cue. The performed action includes sending a communication from the speech recognition component to the speech generation component while bypassing a dialog component.
    Type: Grant
    Filed: May 21, 2012
    Date of Patent: September 10, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Vincent J. Goffin, Sarangarajan Parthasarathy
  • Patent number: 8532991
    Abstract: Speech models are trained using one or more of three different training systems. They include competitive training which reduces a distance between a recognized result and a true result, data boosting which divides and weights training data, and asymmetric training which trains different model components differently.
    Type: Grant
    Filed: March 10, 2010
    Date of Patent: September 10, 2013
    Assignee: Microsoft Corporation
    Inventors: Xiaodong He, Jian Wu
  • Patent number: 8532992
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
    Type: Grant
    Filed: February 8, 2013
    Date of Patent: September 10, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Bernard S. Renger, Steven Neil Tischer
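The retrieval order described above is a straightforward fallback chain; the store layout and names in this sketch are assumptions.

```python
def select_model(user_id, supervised, unsupervised, generic):
    """Prefer the user's supervised model, then an unsupervised model
    for that user, then the generic model."""
    if user_id in supervised:
        return supervised[user_id]
    if user_id in unsupervised:
        return unsupervised[user_id]
    return generic

supervised = {"alice": "alice-supervised"}
unsupervised = {"bob": "bob-unsupervised"}
print(select_model("alice", supervised, unsupervised, "generic"))
print(select_model("bob", supervised, unsupervised, "generic"))
print(select_model("carol", supervised, unsupervised, "generic"))
# alice-supervised
# bob-unsupervised
# generic
```

The selected model is then handed to the recognizer for the user's speech, per the abstract.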
  • Patent number: 8532993
    Abstract: A system and method for performing speech recognition is disclosed. The method comprises receiving an utterance, applying the utterance to a recognizer with a language model having pronunciation probabilities associated with unique word identifiers for words given their pronunciations and presenting a recognition result for the utterance. Recognition improvement is found by moving a pronunciation model from a dictionary to the language model.
    Type: Grant
    Filed: July 2, 2012
    Date of Patent: September 10, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Andrej Ljolje
  • Patent number: 8527273
    Abstract: Systems and methods for identifying the N-best strings of a weighted automaton. A potential for each state of an input automaton to a set of destination states of the input automaton is first determined. Then, the N-best paths are found in the result of an on-the-fly determinization of the input automaton. Only the portion of the input automaton needed to identify the N-best paths is determinized. As the input automaton is determinized, a potential for each new state of the partially determinized automaton is determined and is used in identifying the N-best paths of the determinized automaton, which correspond exactly to the N-best strings of the input automaton.
    Type: Grant
    Filed: July 30, 2012
    Date of Patent: September 3, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Mehryar Mohri, Michael Dennis Riley
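N-best path search over an already-deterministic graph can be sketched with a priority queue; the on-the-fly determinization and state potentials that make the patent's method efficient on nondeterministic weighted automata are omitted from this illustration.

```python
import heapq

def n_best_paths(graph, start, goal, n):
    """graph: {state: [(next_state, weight), ...]}, assumed acyclic
    with nonnegative weights. Returns up to n (cost, path) pairs in
    increasing cost order."""
    results = []
    heap = [(0.0, [start])]
    while heap and len(results) < n:
        cost, path = heapq.heappop(heap)
        state = path[-1]
        if state == goal:
            results.append((cost, path))
            continue
        for nxt, w in graph.get(state, []):
            heapq.heappush(heap, (cost + w, path + [nxt]))
    return results

g = {"s": [("a", 1.0), ("b", 2.0)], "a": [("t", 1.0)], "b": [("t", 0.5)]}
for cost, path in n_best_paths(g, "s", "t", 2):
    print(cost, "->".join(path))
# 2.0 s->a->t
# 2.5 s->b->t
```

In the patent's setting the input automaton is determinized lazily, so only the states actually reached by the N-best search are ever expanded.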
  • Patent number: 8527279
    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving geographical information derived from a non-verbal user action associated with a first computing device. The non-verbal user action implies an interest of a user in a geographic location. The method also includes identifying a grammar associated with the geographic location using the derived geographical information and outputting a grammar indicator for use in selecting the identified grammar for voice recognition processing of vocal input from the user.
    Type: Grant
    Filed: August 23, 2012
    Date of Patent: September 3, 2013
    Assignee: Google Inc.
    Inventors: David P. Singleton, Debajit Ghosh
  • Patent number: 8521527
    Abstract: A computer-implemented system and method for processing audio in a voice response environment is provided. A database of host scripts each comprising signature files of audio phrases and actions to take when one of the audio phrases is recognized is maintained. The host scripts are loaded and a call to a voice mail server is initiated. Incoming audio buffers are received during the call from voice messages stored on the voice mail server. The incoming audio buffers are processed. A signature data structure is created for each audio buffer. The signature data structure is compared with signatures of expected phrases in the host scripts. The actions stored in the host scripts are executed when the signature data structure matches the signature of the expected phrase.
    Type: Grant
    Filed: September 10, 2012
    Date of Patent: August 27, 2013
    Assignee: Intellisist, Inc.
    Inventor: Martin R. M. Dunsmuir
  • Patent number: 8515753
    Abstract: The example embodiment of the present invention provides an acoustic model adaptation method for enhancing recognition performance for a non-native speaker's speech. To adapt the acoustic models, pronunciation variations are first examined by analyzing a non-native speaker's speech. Thereafter, based on the pronunciation variations of a non-native speaker's speech, acoustic models are adapted in a state-tying step during the acoustic model training process. When the present invention for adapting acoustic models is combined with a conventional acoustic model adaptation scheme, further-enhanced recognition performance can be obtained. The example embodiment of the present invention enhances recognition performance for a non-native speaker's speech while reducing the degradation of recognition performance for a native speaker's speech.
    Type: Grant
    Filed: March 30, 2007
    Date of Patent: August 20, 2013
    Assignee: Gwangju Institute of Science and Technology
    Inventors: Hong Kook Kim, Yoo Rhee Oh, Jae Sam Yoon
  • Patent number: 8515734
    Abstract: An integrated language model includes an upper-level language model component and a lower-level language model component, with the upper-level language model component including a non-terminal and the lower-level language model component being applied to the non-terminal. The upper-level and lower-level language model components can be of the same or different language model formats, including finite state grammar (FSG) and statistical language model (SLM) formats. Systems and methods for making integrated language models allow designation of language model formats for the upper-level and lower-level components and identification of non-terminals. Automatic non-terminal replacement and retention criteria can be used to facilitate the generation of one or both language model components, which can include the modification of existing language models.
    Type: Grant
    Filed: February 8, 2010
    Date of Patent: August 20, 2013
    Assignee: Adacel Systems, Inc.
    Inventors: Chang-Qing Shu, Han Shu, John M. Mervin
  • Patent number: 8510111
    Abstract: A speech recognition apparatus includes a generating unit generating a speech-feature vector expressing a feature for each of frames obtained by dividing an input speech, a storage unit storing a first acoustic model obtained by modeling a feature of each word by using a state transition model, a storage unit configured to store at least one second acoustic model, a calculation unit calculating, for each state, a first probability of transition to an at-end-frame state to obtain first probabilities, and select a maximum probability of the first probabilities, a selection unit selecting a maximum-probability-transition path, a conversion unit converting the maximum-probability-transition path into a corresponding-transition-path corresponding to the second acoustic model, a calculation unit calculating a second probability of transition to the at-end-frame state on the corresponding-transition-path, and a finding unit finding to which word the input speech corresponds based on the maximum probability and the second probability.
    Type: Grant
    Filed: February 8, 2008
    Date of Patent: August 13, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masaru Sakai, Hiroshi Fujimura, Shinichi Tanaka
  • Patent number: 8504364
    Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: August 6, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: William K. Bodin, Michael John Burkhart, Daniel G. Eisenhauer, Thomas James Watson, Daniel Mark Schumacher
  • Patent number: 8504366
    Abstract: Method, system, and computer program product are provided for Joint Factor Analysis (JFA) scoring in speech processing systems. The method includes: carrying out an enrollment session offline to enroll a speaker model in a speech processing system using JFA, including: extracting speaker factors from the enrollment session; estimating first components of channel factors from the enrollment session. The method further includes: carrying out a test session including: calculating second components of channel factors strongly dependent on the test session; and generating a score based on speaker factors, channel factors, and test session Gaussian mixture model sufficient statistics to provide a log-likelihood ratio for a test session.
    Type: Grant
    Filed: November 16, 2011
    Date of Patent: August 6, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Hagai Aronowitz, Oren Barkan
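As a rough intuition for the scoring step, the log-likelihood ratio in factor-analysis systems is often approximated by a linear score between the channel-compensated speaker offset and the test session's first-order statistics. The sketch below is that common simplification, not the patent's exact computation:

```python
def linear_jfa_score(speaker_offset, channel_offset, first_order_stats):
    """Toy linear scoring: subtract the session's channel offset from
    the enrolled speaker offset, then take the inner product with the
    test session's first-order statistics. A higher score means the
    test session more likely matches the enrolled speaker."""
    compensated = [s - c for s, c in zip(speaker_offset, channel_offset)]
    return sum(m * f for m, f in zip(compensated, first_order_stats))

# The statistics point in the enrolled speaker's direction -> high score.
same = linear_jfa_score([1.0, -0.5], [0.1, 0.0], [0.9, -0.4])
# A mismatched speaker offset -> low score.
diff = linear_jfa_score([-1.0, 0.5], [0.1, 0.0], [0.9, -0.4])
print(same > diff)  # True
```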
  • Patent number: 8503662
    Abstract: A method includes receiving speech of a call from a caller at a processor of a call routing system. The method includes using the processor to determine a first call destination for the call based on the speech. The method includes using the processor to determine whether the caller is in compliance with at least one business rule related to an account of the caller. The method includes routing the call to the first call destination when the caller is in compliance with the at least one business rule and routing the call to a second call destination when the caller is not in compliance with the at least one business rule.
    Type: Grant
    Filed: May 26, 2010
    Date of Patent: August 6, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Robert R. Bushey, Benjamin Anthony Knott, Sarah Korth
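The routing logic described in the abstract can be sketched as a compliance gate over the speech-derived destination. Destination names and the example rule below are hypothetical:

```python
def route_call(speech_destination: str, caller_account: dict,
               business_rules: list) -> str:
    """Route a call to the destination derived from the caller's
    speech only if the account satisfies every business rule;
    otherwise divert to a second, fallback destination."""
    compliant = all(rule(caller_account) for rule in business_rules)
    return speech_destination if compliant else "compliance_desk"

# Hypothetical rule: the account must have no past-due balance.
rules = [lambda acct: acct.get("past_due", 0) == 0]

print(route_call("billing", {"past_due": 0}, rules))    # billing
print(route_call("billing", {"past_due": 120}, rules))  # compliance_desk
```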
  • Patent number: 8504363
    Abstract: A system and method is provided for combining active and unsupervised learning for automatic speech recognition. This process reduces the amount of human supervision required for training acoustic and language models and increases performance given both the transcribed and un-transcribed data.
    Type: Grant
    Filed: April 9, 2012
    Date of Patent: August 6, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Dilek Zeynep Hakkani-Tur, Giuseppe Riccardi
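A common way to combine active and unsupervised learning, sketched below under assumed names and an illustrative confidence threshold: low-confidence hypotheses are routed to human transcribers (active learning), while high-confidence hypotheses are kept with their machine transcripts (unsupervised learning).

```python
def split_for_training(hypotheses, threshold=0.8):
    """Split recognizer output into data needing human transcription
    and data usable as-is. `hypotheses` is a list of
    (machine_transcript, confidence) pairs; the threshold is an
    illustrative choice, not a value from the patent."""
    to_transcribe = [h for h in hypotheses if h[1] < threshold]
    auto_labeled = [h for h in hypotheses if h[1] >= threshold]
    return to_transcribe, auto_labeled

hyps = [("call my bank", 0.95), ("uh check balanse", 0.42)]
manual, auto = split_for_training(hyps)
print(len(manual), len(auto))  # 1 1
```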
  • Patent number: 8504359
    Abstract: A speech recognition method using a domain ontology includes: constructing a domain ontology DB; forming a speech recognition grammar using the constructed domain ontology DB; extracting a feature vector from a speech signal; and modeling the speech signal using an acoustic model. The method performs speech recognition by using the acoustic model, the speech recognition dictionary, and the speech recognition grammar on the basis of the feature vector.
    Type: Grant
    Filed: September 1, 2009
    Date of Patent: August 6, 2013
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Seung Yun, Soo Jong Lee, Jeong Se Kim, Il Bin Lee, Jun Park, Sang Kyu Park
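The grammar-forming step can be illustrated by harvesting lexical entries from ontology triples. A real grammar would also encode structure; this sketch (with hypothetical triples) only gathers vocabulary from the ontology DB:

```python
def grammar_terms_from_ontology(triples):
    """Collect the vocabulary for a flat speech-recognition grammar
    from domain-ontology triples of (subject, relation, object)."""
    terms = set()
    for subject, _relation, obj in triples:
        terms.update((subject, obj))
    return sorted(terms)

triples = [("seoul", "capital_of", "korea"), ("busan", "city_in", "korea")]
print(grammar_terms_from_ontology(triples))  # ['busan', 'korea', 'seoul']
```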
  • Patent number: 8484025
    Abstract: Disclosed embodiments relate to mapping an utterance to an action using a classifier. One illustrative computing device includes a user interface having an input component. The computing device further includes a processor and a computer-readable storage medium, having stored thereon program instructions that, upon execution by the processor, cause the computing device to perform a set of operations including: receiving an audio utterance via the input component; determining a text string based on the utterance; determining a string-feature vector based on the text string; selecting a target classifier from a set of classifiers, wherein the target classifier is selected based on a determination that a string-feature criteria of the target classifier corresponds to at least one string-feature of the string-feature vector; and initiating a target action that corresponds to the target classifier.
    Type: Grant
    Filed: October 4, 2012
    Date of Patent: July 9, 2013
    Assignee: Google Inc.
    Inventors: Pedro J. Moreno Mengibar, Martin Jansche, Fadi Biadsy
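The selection step in the abstract can be sketched with a bag-of-words string-feature vector and a first-match criterion check. Classifier names, criteria, and actions below are hypothetical:

```python
def select_action(text, classifiers):
    """Map a transcribed utterance to an action: build a string-feature
    vector (here, the set of lowercased words), then return the action
    of the first classifier whose feature criterion appears in that
    vector; None if no classifier matches."""
    features = set(text.lower().split())
    for criterion, action in classifiers:
        if criterion in features:
            return action
    return None

classifiers = [("call", "place_call"), ("play", "play_media")]
print(select_action("Call mom now", classifiers))  # place_call
print(select_action("what time is it", classifiers))  # None
```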
  • Patent number: 8468013
    Abstract: Disclosed is a method, system and computer readable recording medium for correcting an OCR result. According to an exemplary embodiment of the present invention, there is provided a method for correcting an OCR result, the method including performing character recognition on content including character information using an OCR technique, removing extra carriage return information from the content, outputting the character recognition result, and correcting word spacing on the outputted result.
    Type: Grant
    Filed: December 30, 2009
    Date of Patent: June 18, 2013
    Assignee: NHN Corporation
    Inventors: Byoung Seok Yang, Hee Cheol Seo, Do Gil Lee, Ki Joon Sung
  • Patent number: 8442828
    Abstract: A conditional model is used in spoken language understanding. One such model is a conditional random field model.
    Type: Grant
    Filed: March 17, 2006
    Date of Patent: May 14, 2013
    Assignee: Microsoft Corporation
    Inventors: Ye-Yi Wang, Alejandro Acero, John Sie Yuen Lee, Milind V. Mahajan
  • Patent number: 8442831
    Abstract: A speech recognition capability in which words of spoken text are identified based on the contour of sound waves representing the spoken text. Variations in the contour of the sound waves are identified, features are assigned to those variations, and then the features are mapped to sound constructs to provide the words.
    Type: Grant
    Filed: October 31, 2008
    Date of Patent: May 14, 2013
    Assignee: International Business Machines Corporation
    Inventor: Mukundan Sundararajan
  • Patent number: 8438030
    Abstract: A method of and system for automated distortion classification. The method includes steps of (a) receiving audio including a user speech signal and at least some distortion associated with the signal; (b) pre-processing the received audio to generate acoustic feature vectors; (c) decoding the generated acoustic feature vectors to produce a plurality of hypotheses for the distortion; and (d) post-processing the plurality of hypotheses to identify at least one distortion hypothesis of the plurality of hypotheses as the received distortion. The system can include one or more distortion models including distortion-related acoustic features representative of various types of distortion and used by a decoder to compare the acoustic feature vectors with the distortion-related acoustic features to produce the plurality of hypotheses for the distortion.
    Type: Grant
    Filed: November 25, 2009
    Date of Patent: May 7, 2013
    Assignee: General Motors LLC
    Inventors: Gaurav Talwar, Rathinavelu Chengalvarayan
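The decode-and-post-process pipeline can be sketched by scoring feature vectors against per-distortion models and keeping the best hypothesis. The models here are simple mean vectors and the score is a negative Euclidean distance, a stand-in for the patent's decoder:

```python
import math

def classify_distortion(feature_vectors, distortion_models):
    """Produce a hypothesis score for each distortion model by
    comparing it with the acoustic feature vectors, then post-process
    by returning the best-scoring distortion label."""
    def score(model):
        # Closer frames -> smaller distances -> higher (less negative) score.
        return -sum(math.dist(frame, model) for frame in feature_vectors)
    hypotheses = {name: score(m) for name, m in distortion_models.items()}
    return max(hypotheses, key=hypotheses.get)

models = {"wind_noise": [0.9, 0.1], "clipping": [0.1, 0.9]}
frames = [[0.85, 0.15], [0.8, 0.2]]
print(classify_distortion(frames, models))  # wind_noise
```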
  • Patent number: 8438031
    Abstract: A conversation manager processes spoken utterances from a user of a computer. The conversation manager includes a semantics analysis module and a syntax manager. A domain model that is used in processing the spoken utterances includes an ontology (i.e., world view for the relevant domain of the spoken utterances), lexicon, and syntax definitions. The syntax manager combines the ontology, lexicon, and syntax definitions to generate a grammatic specification. The semantics module uses the grammatic specification and the domain model to develop a set of frames (i.e., internal representation of the spoken utterance). The semantics module then develops a set of propositions from the set of frames. The conversation manager then uses the set of propositions in further processing to provide a reply to the spoken utterance.
    Type: Grant
    Filed: June 7, 2007
    Date of Patent: May 7, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Steven I. Ross, Robert C. Armes, Julie F. Alweis, Elizabeth A. Brownholtz, Jeffrey G. MacAllister
  • Patent number: 8433558
    Abstract: Disclosed herein are systems and methods to incorporate human knowledge when developing and using statistical models for natural language understanding. The disclosed systems and methods embrace a data-driven approach to natural language understanding which progresses seamlessly along the continuum of availability of annotated collected data, from when there is no available annotated collected data to when there is any amount of annotated collected data.
    Type: Grant
    Filed: July 25, 2005
    Date of Patent: April 30, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Srinivas Bangalore, Mazin Gilbert, Narendra K. Gupta
  • Patent number: 8433572
    Abstract: A method for multiple value confirmation and correction in spoken dialog systems. A user is allowed to correct errors in values captured by the spoken dialog system, such that the interaction necessary for error correction between the system and the user is reduced. When the spoken dialog system collects a set of values from a user, the system provides a spoken confirmation of the set of values to the user. The spoken confirmation comprises the set of values and possibly a pause associated with each value. Upon hearing an incorrect value, the user may react, barge in on the spoken confirmation, and provide a corrected value. Responsive to detecting the user interruption during the pause or after the system speaks a value, the system halts the spoken confirmation and collects the corrected value. The system then provides a new spoken confirmation to the user, wherein the new spoken confirmation includes the corrected value.
    Type: Grant
    Filed: April 2, 2008
    Date of Patent: April 30, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Sasha Porto Caskey, Juan Manuel Huerta, Roberto Pieraccini
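The confirm-and-correct loop can be sketched as follows. The `barge_in` callback stands in for the audio front end: it "speaks" a value, waits through the pause, and returns a corrected value if the user interrupted, else None. Restarting confirmation from the first value after a correction is an assumption of this sketch:

```python
def confirm_values(values, barge_in):
    """Confirm collected values one at a time; on a barge-in, halt,
    substitute the corrected value, and issue a new confirmation
    that includes the correction."""
    values = list(values)
    i = 0
    while i < len(values):
        correction = barge_in(i, values[i])  # speak value, then pause
        if correction is not None:
            values[i] = correction           # collect the corrected value
            i = 0                            # new confirmation from the top
        else:
            i += 1
    return values

# Simulated user: interrupts on "Boston" and corrects it to "Austin".
corrections = {"Boston": "Austin"}
result = confirm_values(["Boston", "Friday"], lambda i, v: corrections.get(v))
print(result)  # ['Austin', 'Friday']
```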
  • Patent number: 8433573
    Abstract: A prosody modification device includes: a real voice prosody input part that receives real voice prosody information extracted from an utterance of a human; a regular prosody generating part that generates regular prosody information having a regular phoneme boundary that determines a boundary between phonemes and a regular phoneme length of a phoneme by using data representing a regular or statistical phoneme length in an utterance of a human with respect to a section including at least a phoneme or a phoneme string to be modified in the real voice prosody information; and a real voice prosody modification part that resets a real voice phoneme boundary by using the generated regular prosody information so that the real voice phoneme boundary and a real voice phoneme length of the phoneme or the phoneme string to be modified in the real voice prosody information are approximate to an actual phoneme boundary and an actual phoneme length of the utterance of the human, thereby modifying the real voice prosody information.
    Type: Grant
    Filed: February 11, 2008
    Date of Patent: April 30, 2013
    Assignee: Fujitsu Limited
    Inventors: Kentaro Murase, Nobuyuki Katae
  • Patent number: 8428944
    Abstract: A speech recognition system prompts a user to provide a first utterance, which is recorded. Speech recognition is performed on the first user utterance to yield a recognition result. The user is prompted to provide a second user utterance, which is recorded, processed, and compared to the first utterance to detect an acoustic difference for each of a plurality of acoustic parameters. The acoustic model used by the speech recognition engine is modified as a function of the acoustic differences.
    Type: Grant
    Filed: May 7, 2007
    Date of Patent: April 23, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Timothy David Poultney, Matthew Whitbourne, Kamourudeen Larry Yusuf
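The adaptation step can be illustrated as shifting acoustic-model parameter means by a fraction of the per-parameter difference between the two utterances. The update rate is an illustrative choice; the patent only says the model is modified as a function of the differences:

```python
def adapt_model(model_means, utt1_params, utt2_params, rate=0.5):
    """Return new acoustic-model means, each moved toward the
    observed per-parameter difference between the two recorded
    utterances by the given update rate."""
    diffs = [b - a for a, b in zip(utt1_params, utt2_params)]
    return [m + rate * d for m, d in zip(model_means, diffs)]

adapted = adapt_model([1.0, 2.0], [0.8, 2.0], [1.2, 2.4])
print(adapted)  # approximately [1.2, 2.2]
```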
  • Patent number: 8428951
    Abstract: A speech recognition apparatus includes a speech recognition dictionary and a speech recognition unit. The speech recognition dictionary includes comparison data used to recognize a voice input. The speech recognition unit is adapted to calculate the score for each comparison data by comparing voice input data generated based on the voice input with each comparison data, recognize the voice input based on the score, and produce the recognition result of the voice input. The speech recognition apparatus further includes data indicating score weights associated with particular comparison data, used to weight the scores calculated for the particular comparison data. After the score is calculated for each comparison data, the score weights are added to the scores of the particular comparison data, and the voice input is recognized based on total scores including the added score weights.
    Type: Grant
    Filed: July 6, 2006
    Date of Patent: April 23, 2013
    Assignee: Alpine Electronics, Inc.
    Inventor: Toshiyuki Hyakumoto
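The weighted-total scoring in the abstract can be sketched as below. The base score here is a stand-in (negative distance between a scalar feature and a template); the patent does not specify the comparison metric:

```python
def recognize(voice_feature, dictionary, score_weights):
    """Score each comparison entry against the voice input, add the
    per-entry score weight where one is defined, and return the entry
    with the best total score."""
    best, best_total = None, float("-inf")
    for word, template in dictionary.items():
        score = -abs(voice_feature - template)         # base score
        total = score + score_weights.get(word, 0.0)   # weighted total
        if total > best_total:
            best, best_total = word, total
    return best

dictionary = {"home": 1.0, "office": 1.3}
# Without weights the closer template wins; a weight can tip the result.
print(recognize(1.1, dictionary, {}))               # home
print(recognize(1.1, dictionary, {"office": 0.2}))  # office
```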
  • Patent number: 8423348
    Abstract: A method and system is disclosed herein for generating a plurality of equivalent sentence patterns from a declared sentence pattern for a specific language. The declared pattern is fed into a pattern selector. The pattern selector reads a predetermined library of equivalent pattern sets and selects an equivalent pattern set for the declared pattern. The selected equivalent pattern set corresponds to the declared pattern and represents a set of equivalent declared patterns. The set of equivalent declared patterns and the declared pattern are fed to a rules generator. The rules generator outputs executable semantic pattern recognition rules. A reader module, using the generated executable semantic pattern recognition rules, reads the given information source to determine the information of interest.
    Type: Grant
    Filed: June 10, 2006
    Date of Patent: April 16, 2013
    Assignee: Trigent Software Ltd.
    Inventors: Charles Rehberg, Krishnamurthy Satyanarayana, Rengarajan Seshadri, Vasudevan Comandur, Abhishek Mehta, Amit Goel
  • Patent number: 8423363
    Abstract: Occurrences of one or more keywords in audio data are identified using a speech recognizer employing a language model to derive a transcript of the keywords. The transcript is converted into a phoneme sequence. The phonemes of the phoneme sequence are mapped to the audio data to derive a time-aligned phoneme sequence that is searched for occurrences of keyword phoneme sequences corresponding to the phonemes of the keywords. Searching includes computing a confusion matrix. The language model used by the speech recognizer is adapted to keywords by increasing the likelihoods of the keywords in the language model. For each potential occurrence of a keyword detected, a corresponding subset of the audio data may be played back to an operator to confirm whether the potential occurrence corresponds to an actual occurrence of the keyword.
    Type: Grant
    Filed: January 13, 2010
    Date of Patent: April 16, 2013
    Assignee: CRIM (Centre de Recherche Informatique de Montréal)
    Inventors: Vishwa Nath Gupta, Gilles Boulianne
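The language-model adaptation step can be sketched for a unigram model: multiply keyword probabilities by a boost factor and renormalize, so the recognizer becomes more likely to emit the keywords. The boost factor is an illustrative value; the patent does not fix one:

```python
def boost_keywords(lm_probs, keywords, factor=5.0):
    """Return a new unigram language model in which the given
    keywords' probabilities are multiplied by `factor` and the whole
    distribution is renormalized to sum to 1."""
    boosted = {w: p * (factor if w in keywords else 1.0)
               for w, p in lm_probs.items()}
    total = sum(boosted.values())
    return {w: p / total for w, p in boosted.items()}

lm = {"the": 0.5, "revenue": 0.1, "cat": 0.4}
adapted = boost_keywords(lm, {"revenue"})
print(adapted["revenue"] > lm["revenue"])  # True
```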