Creating Patterns For Matching Patents (Class 704/243)
  • Patent number: 8676574
    Abstract: In a spoken language processing method for tone/intonation recognition, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more tonal characteristics corresponding to the input window of sound can be determined by mapping the cumulative gist vector to one or more tonal characteristics using a machine learning algorithm.
    Type: Grant
    Filed: November 10, 2010
    Date of Patent: March 18, 2014
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Ozlem Kalinli
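    The pipeline above (feature maps → per-map auditory gist vectors → augmented cumulative gist vector) can be sketched as follows; block averaging over a coarse grid and the 4×5 grid size are illustrative assumptions, not details fixed by the abstract:

```python
import numpy as np

def gist_vector(feature_map, grid=(4, 5)):
    """Auditory gist of one feature map: mean over each cell of a
    coarse grid, flattened into a vector (grid size is an assumed
    choice)."""
    return np.array([block.mean()
                     for row in np.array_split(feature_map, grid[0], axis=0)
                     for block in np.array_split(row, grid[1], axis=1)])

def cumulative_gist(feature_maps, grid=(4, 5)):
    """Augment (concatenate) the gist vectors of all feature maps into
    a single cumulative gist vector."""
    return np.concatenate([gist_vector(f, grid) for f in feature_maps])
```

    The cumulative gist vector would then be fed to the machine learning stage that maps it to tonal characteristics.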
  • Patent number: 8670980
    Abstract: A tone determination device, which determines the tonality of an input signal, is capable of reducing calculation complexity. Therein a frequency conversion unit (101) converts the frequency of an input signal; a downsampling unit (102) carries out shortening processing which shortens the vector series length of the frequency-converted signal; a constancy determination unit (107) determines the constancy of the input signal; depending on the constancy of the input signal, a vector selection unit (104) selects either the vector series of the post-frequency conversion signal or the vector series after the shortening of the vector series length; a correlation analysis unit (105) uses the vector series selected by the vector selection unit (104) to obtain correlations; and a tone determination unit (106) uses the correlations to determine the tonality of the input signal.
    Type: Grant
    Filed: October 26, 2010
    Date of Patent: March 11, 2014
    Assignee: Panasonic Corporation
    Inventor: Kaoru Satoh
  • Patent number: 8670983
    Abstract: A method for determining a similarity between a first audio source and a second audio source includes: for the first audio source, determining a first frequency of occurrence for each of a plurality of phoneme sequences and determining a first weighted frequency for each of the plurality of phoneme sequences based on the first frequency of occurrence for the phoneme sequence; for the second audio source, determining a second frequency of occurrence for each of a plurality of phoneme sequences and determining a second weighted frequency for each of the plurality of phoneme sequences based on the second frequency of occurrence for the phoneme sequence; comparing the first weighted frequency for each phoneme sequence with the second weighted frequency for the corresponding phoneme sequence; and generating a similarity score representative of a similarity between the first audio source and the second audio source based on the results of the comparing.
    Type: Grant
    Filed: August 30, 2011
    Date of Patent: March 11, 2014
    Assignee: Nexidia Inc.
    Inventors: Jacob B. Garland, Jon A. Arrowood, Drew Lanham, Marsal Gavalda
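    A minimal sketch of the comparison described above, assuming log-scaled counts as the weighting and cosine similarity as the comparison step (the abstract leaves both open):

```python
import math
from collections import Counter

def weighted_freqs(phoneme_seqs):
    """Frequency of occurrence of each phoneme sequence, weighted by
    log scaling (the exact weighting function is left open by the
    abstract)."""
    return {seq: math.log1p(c) for seq, c in Counter(phoneme_seqs).items()}

def similarity(source_a, source_b):
    """Cosine similarity between the weighted-frequency vectors of two
    audio sources, each given as a list of detected phoneme sequences."""
    wa, wb = weighted_freqs(source_a), weighted_freqs(source_b)
    dot = sum(wa[k] * wb.get(k, 0.0) for k in wa)
    na = math.sqrt(sum(v * v for v in wa.values()))
    nb = math.sqrt(sum(v * v for v in wb.values()))
    return dot / (na * nb) if na and nb else 0.0
```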
  • Publication number: 20140067393
    Abstract: According to an embodiment, a model learning device learns a model having a full covariance matrix shared among a plurality of Gaussian distributions. The device includes a first calculator to calculate, from training data, frequencies of occurrence and sufficient statistics of the Gaussian distributions contained in the model; and a second calculator to select, on the basis of the frequencies of occurrence and the sufficient statistics, a sharing structure in which a covariance matrix is shared among Gaussian distributions, and calculate the full covariance matrix shared in the selected sharing structure.
    Type: Application
    Filed: August 15, 2013
    Publication date: March 6, 2014
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventor: Takashi Masuko
  • Patent number: 8667230
    Abstract: A digital memory architecture for recognition and recall in support of a host comprises a plurality of pattern processors, each of which has its own random access memory (RAM) and controller, an external data bus and external data bus controller, a results bus and results bus controller, an internal data bus and internal data bus controller, and an external control bus and external control bus controller. Each of the pattern processors may be a general purpose set theoretic processor (GPSTP) operating in interrupt and block modes.
    Type: Grant
    Filed: October 19, 2010
    Date of Patent: March 4, 2014
    Inventor: Curtis L. Harris
  • Patent number: 8666737
    Abstract: A noise power estimation system for estimating noise power of each frequency spectral component includes a cumulative histogram generating section for generating a cumulative histogram for each frequency spectral component of a time series signal, in which the horizontal axis indicates index of power level and the vertical axis indicates cumulative frequency and which is weighted by exponential moving average; and a noise power estimation section for determining an estimated value of noise power for each frequency spectral component of the time series signal based on the cumulative histogram.
    Type: Grant
    Filed: September 14, 2011
    Date of Patent: March 4, 2014
    Assignee: Honda Motor Co., Ltd.
    Inventors: Hirofumi Nakajima, Kazuhiro Nakadai, Yuji Hasegawa
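    The weighted cumulative histogram can be sketched as follows for a single frequency bin; the 1 dB bin width, the decay rate, and the cumulative-fraction cut point used for the estimate are assumed values:

```python
import numpy as np

class NoiseEstimator:
    """Histogram-based noise-power tracker for one frequency bin,
    weighted by an exponential moving average (sketch)."""
    def __init__(self, n_bins=100, alpha=0.05, quantile=0.3):
        self.hist = np.zeros(n_bins)   # index = power level (assumed 1 dB bins)
        self.alpha = alpha             # EMA weight for each new frame
        self.quantile = quantile       # cumulative-frequency cut point (assumed)

    def update(self, power_db):
        idx = int(np.clip(power_db, 0, len(self.hist) - 1))
        self.hist *= (1.0 - self.alpha)   # exponentially forget old frames
        self.hist[idx] += self.alpha      # weight in the new observation
        return self.estimate()

    def estimate(self):
        """Power level at a fixed fraction of the cumulative histogram."""
        cum = np.cumsum(self.hist)
        if cum[-1] == 0:
            return 0.0
        return float(np.searchsorted(cum, self.quantile * cum[-1]))
```

    One estimator instance would be maintained per frequency spectral component of the time series signal.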
  • Patent number: 8666731
    Abstract: Embodiments of the invention provide a method, computer program and apparatus for processing a computer message, the method comprising: upon receipt of a computer message at a computer, classifying the computer message and assigning it a message cluster identification in dependence thereon; and, utilizing a message template to trans-denotate the message, wherein the message template is selected in dependence on the message cluster identification.
    Type: Grant
    Filed: September 22, 2010
    Date of Patent: March 4, 2014
    Assignee: Oracle International Corporation
    Inventors: Stephen Anthony Moyle, Graham Kenneth Thwaites
  • Patent number: 8666739
    Abstract: A method of the present invention may include receiving a speech feature vector converted from a speech signal; performing a first search by applying a first language model to the received speech feature vector and outputting a word lattice and a first acoustic score of the word lattice as a continuous speech recognition result; outputting a second acoustic score as a phoneme recognition result by applying an acoustic model to the speech feature vector; comparing the first acoustic score of the continuous speech recognition result with the second acoustic score of the phoneme recognition result; outputting a first language model weight when the first acoustic score of the continuous speech recognition result is better than the second acoustic score of the phoneme recognition result; and performing a second search by applying a second language model weight, which is the same as the output first language model weight, to the word lattice.
    Type: Grant
    Filed: December 13, 2011
    Date of Patent: March 4, 2014
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Hyung Bae Jeon, Yun Keun Lee, Eui Sok Chung, Jong Jin Kim, Hoon Chung, Jeon Gue Park, Ho Young Jung, Byung Ok Kang, Ki Young Park, Sung Joo Lee, Jeom Ja Kang, Hwa Jeon Song
  • Publication number: 20140058731
    Abstract: A system and method are presented for selectively biased linear discriminant analysis in automatic speech recognition systems. Linear Discriminant Analysis (LDA) may be used to improve the discrimination between the hidden Markov model (HMM) tied-states in the acoustic feature space. The between-class and within-class covariance matrices may be biased based on the observed recognition errors of the tied-states, such as shared HMM states of the context dependent tri-phone acoustic model. The recognition errors may be obtained from a trained maximum-likelihood acoustic model utilizing the tied-states which may then be used as classes in the analysis.
    Type: Application
    Filed: August 23, 2013
    Publication date: February 27, 2014
    Applicant: INTERACTIVE INTELLIGENCE, INC.
    Inventors: Vivek Tyagi, Aravind Ganapathiraju, Felix Immanuel Wyss
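    One way the described biasing could look in code, assuming a `1 + bias * error_rate` weighting of each class's contribution to the between-class scatter (the exact biasing function is not given in the abstract):

```python
import numpy as np

def biased_lda(X, y, error_rates, bias=1.0):
    """Selectively biased LDA (sketch): each class's contribution to the
    between-class scatter is up-weighted by its observed error rate, so
    frequently confused classes (tied-states) receive more of the
    discriminative budget.  The weighting form is an assumption."""
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))          # (biased) between-class scatter
    Sw = np.zeros((d, d))          # within-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        w = 1.0 + bias * error_rates.get(c, 0.0)
        diff = (mc - mu)[:, None]
        Sb += w * len(Xc) * (diff @ diff.T)
        Sw += (Xc - mc).T @ (Xc - mc)
    # Discriminant directions: leading eigenvectors of pinv(Sw) @ Sb.
    vals, vecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-vals.real)
    return vecs[:, order].real
```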
  • Patent number: 8660842
    Abstract: A speech recognition device uses visual information to narrow down the range of likely adaptation parameters even before a speaker makes an utterance. Images of the speaker and/or the environment are collected using an image capturing device, and then processed to extract biometric features and environmental features. The extracted features and environmental features are then used to estimate adaptation parameters. A voice sample may also be collected to refine the adaptation parameters for more accurate speech recognition.
    Type: Grant
    Filed: March 9, 2010
    Date of Patent: February 25, 2014
    Assignee: Honda Motor Co., Ltd.
    Inventor: Antoine R. Raux
  • Publication number: 20140052444
    Abstract: A system and methods for matching at least one word of an utterance against a set of template hierarchies to select the best matching template or set of templates corresponding to the utterance. Certain embodiments of the system and methods determine at least one exact, inexact, and partial match between the at least one word of the utterance and at least one term within the template hierarchy to select and populate a template or set of templates corresponding to the utterance. The populated template or set of templates may then be used to generate a narrative template or a report template.
    Type: Application
    Filed: October 28, 2013
    Publication date: February 20, 2014
    Inventor: James Roberge
  • Patent number: 8655660
    Abstract: The present invention is a system and method for generating a personal voice font, including monitoring voice segments automatically from phone conversations of a user by a voice learning processor to generate a personalized voice font, and delivering the personalized voice font (PVF) to a server.
    Type: Grant
    Filed: February 10, 2009
    Date of Patent: February 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Zsolt Szalai, Philippe Bazot, Bernard Pucci, Joel Vitale
  • Patent number: 8655662
    Abstract: Disclosed herein are systems, methods, and computer readable-media for answering a communication notification. The method for answering a communication notification comprises receiving a notification of communication from a user, converting information related to the notification to speech, outputting the information as speech to the user, and receiving from the user an instruction to accept or ignore the incoming communication associated with the notification. In one embodiment, information related to the notification comprises one or more of a telephone number, an area code, a geographic origin of the request, caller id, a voice message, address book information, a text message, an email, a subject line, an importance level, a photograph, a video clip, metadata, an IP address, or a domain name. Another embodiment involves a notification assigned an importance level, with repeat attempts at notification if it is of high importance.
    Type: Grant
    Filed: November 29, 2012
    Date of Patent: February 18, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst Schroeter
  • Patent number: 8655647
    Abstract: Described is a technology by which a statistical N-gram (e.g., language) model is trained using an N-gram selection technique that helps reduce the size of the final N-gram model. During training, a higher-order probability estimate for an N-gram is only added to the model when the training data justifies adding the estimate. To this end, if a backoff probability estimate is within a maximum likelihood set determined by that N-gram and the N-gram's associated context, or is between the higher-order estimate and the maximum likelihood set, then the higher-order estimate is not included in the model. The backoff probability estimate may be determined via an iterative process such that the backoff probability estimate is based on the final model rather than any lower-order model. Also described is additional pruning referred to as modified weighted difference pruning.
    Type: Grant
    Filed: March 11, 2010
    Date of Patent: February 18, 2014
    Assignee: Microsoft Corporation
    Inventor: Robert Carter Moore
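    The selection rule can be sketched as a predicate; the bounds used here for the maximum likelihood set are one common formalization and may differ from the patent's exact definition:

```python
def keep_higher_order(count, context_count, backoff_prob, higher_prob):
    """Decide whether the training data justifies storing an explicit
    higher-order probability estimate for an N-gram seen `count` times
    in `context_count` occurrences of its context (sketch)."""
    lo = count / (context_count + 1)        # assumed ML-set lower bound
    hi = (count + 1) / (context_count + 1)  # assumed ML-set upper bound
    if lo <= backoff_prob <= hi:
        return False    # backoff estimate already consistent with the data
    if higher_prob <= backoff_prob <= lo or hi <= backoff_prob <= higher_prob:
        return False    # backoff lies between the higher estimate and the set
    return True         # explicit higher-order estimate is warranted
```

    Only N-grams for which the predicate returns `True` would be added to the model, which is what keeps the final model small.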
  • Publication number: 20140046662
    Abstract: A system and method are presented for acoustic data selection of a particular quality for training the parameters of an acoustic model, such as a Hidden Markov Model and Gaussian Mixture Model, for example, in automatic speech recognition systems in the speech analytics field. A raw acoustic model may be trained using a given speech corpus and maximum likelihood criteria. A series of operations are performed, such as a forced Viterbi-alignment, calculations of likelihood scores, and phoneme recognition, for example, to form a subset corpus of training data. During the process, audio files of a quality that does not meet a criterion, such as poor quality audio files, may be automatically rejected from the corpus. The subset may then be used to train a new acoustic model.
    Type: Application
    Filed: August 5, 2013
    Publication date: February 13, 2014
    Applicant: Interactive Intelligence, Inc.
    Inventors: Vivek Tyagi, Aravind Ganapathiraju, Felix Immanuel Wyss
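    The final selection step reduces to filtering on quality scores; the two criteria used here (per-frame alignment log-likelihood and phoneme-recognition agreement) and their thresholds are assumed stand-ins for the operations listed above:

```python
def select_training_subset(corpus, min_loglik=-5.0, min_agreement=0.6):
    """corpus: list of (name, per-frame forced-alignment log-likelihood,
    fraction of phonemes on which free phoneme recognition agrees with
    the alignment).  Files failing either quality criterion are
    rejected; both thresholds are illustrative values."""
    return [name for name, loglik, agree in corpus
            if loglik >= min_loglik and agree >= min_agreement]
```

    The surviving subset would then be used to train the new acoustic model.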
  • Patent number: 8650031
    Abstract: Techniques disclosed herein include systems and methods for voice-enabled searching. Techniques include a co-occurrence based approach to improve accuracy of the 1-best hypothesis for non-phrase voice queries, as well as for phrased voice queries. A co-occurrence model is used in addition to a statistical natural language model and acoustic model to recognize spoken queries, such as spoken queries for searching a search engine. Given an utterance and an associated list of automated speech recognition n-best hypotheses, the system rescores the different hypotheses using co-occurrence information. For each hypothesis, the system estimates a frequency of co-occurrence within web documents. Scores from the speech recognizer and a co-occurrence engine can be combined to select a best hypothesis with a lower word error rate.
    Type: Grant
    Filed: July 31, 2011
    Date of Patent: February 11, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Jonathan Mamou, Abhinav Sethy, Bhuvana Ramabhadran, Ron Hoory, Paul Joseph Vozila, Nathan Bodenstab
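    The rescoring idea can be sketched as follows; the pairwise log-count co-occurrence score and the interpolation weight are assumptions, since the abstract does not specify how the two scores are combined:

```python
import math
from itertools import combinations

def cooc_score(words, cooc):
    """Mean log(1 + document co-occurrence count) over all word pairs.
    `cooc` maps alphabetically sorted word pairs to document counts
    (a toy stand-in for web-scale statistics)."""
    pairs = list(combinations(sorted(set(words)), 2))
    if not pairs:
        return 0.0
    return sum(math.log1p(cooc.get(p, 0)) for p in pairs) / len(pairs)

def rescore(nbest, cooc, lam=0.5):
    """Rerank (hypothesis, recognizer_score) pairs by interpolating the
    recognizer score with the co-occurrence score; `lam` is an assumed
    tuning weight."""
    best_score, best_hyp = max(
        (lam * asr + (1.0 - lam) * cooc_score(hyp.split(), cooc), hyp)
        for hyp, asr in nbest)
    return best_hyp
```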
  • Patent number: 8650033
    Abstract: State-of-the-art speech recognition systems are trained using transcribed utterances, preparation of which is labor-intensive and time-consuming. The present invention is an iterative method for reducing the transcription effort for training in automatic speech recognition (ASR). Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples and then selecting the most informative ones with respect to a given cost function for a human to label. The method comprises automatically estimating a confidence score for each word of the utterance and exploiting the lattice output of a speech recognizer, which was trained on a small set of transcribed data. An utterance confidence score is computed based on these word confidence scores; then the utterances are selectively sampled to be transcribed using the utterance confidence scores.
    Type: Grant
    Filed: October 13, 2006
    Date of Patent: February 11, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Allen Louis Gorin, Dilek Z. Hakkani-Tur, Giuseppe Riccardi
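    A minimal sketch of the selective-sampling step, assuming a simple mean of word confidences as the utterance confidence (the patent covers other aggregations) and a fixed threshold:

```python
def utterance_confidence(word_confidences):
    """Aggregate per-word confidence scores into an utterance-level
    score (mean is one simple choice)."""
    return sum(word_confidences) / len(word_confidences)

def select_for_transcription(utterances, threshold=0.7):
    """Selective sampling: keep the utterances the recognizer is least
    confident about -- the most informative ones for a human to label."""
    return [utt for utt, confs in utterances
            if utterance_confidence(confs) < threshold]
```

    After the selected utterances are transcribed, the recognizer is retrained and the loop repeats, which is what makes the method iterative.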
  • Patent number: 8650030
    Abstract: A method for receiving processed information at a remote device is described. The method includes transmitting from the remote device a verbal request to a first information provider and receiving a digital message from the first information provider in response to the transmitted verbal request. The digital message includes a symbolic representation indicator associated with a symbolic representation of the verbal request and data used to control an application. The method also includes transmitting, using the application, the symbolic representation indicator to a second information provider for generating results to be displayed on the remote device.
    Type: Grant
    Filed: April 2, 2007
    Date of Patent: February 11, 2014
    Assignee: Google Inc.
    Inventors: Gudmundur Hafsteinsson, Michael J. LeBeau, Natalia Marmasse, Sumit Agarwal, Dipchand Nishar
  • Patent number: 8645130
    Abstract: A processing unit is provided which executes speech recognition on speech signals captured by a microphone for capturing sounds uttered in an environment. The processing unit has: an initial reflection component extraction portion that extracts initial reflection components by removing diffuse reverberation components from a reverberation pattern of an impulse response generated in the environment; and an acoustic model learning portion that learns an acoustic model for the speech recognition by reflecting the initial reflection components to speech data for learning.
    Type: Grant
    Filed: November 20, 2008
    Date of Patent: February 4, 2014
    Assignees: Toyota Jidosha Kabushiki Kaisha, National University Corporation Nara Institute of Science and Technology
    Inventors: Narimasa Watanabe, Kiyohiro Shikano, Randy Gomez
  • Patent number: 8645135
    Abstract: A transformation can be derived which would represent the processing required to convert a male speech model to a female speech model. That transformation is subjected to a predetermined modification, and the modified transformation is applied to a female speech model to produce a synthetic children's speech model. The male and female models can be expressed in terms of a vector representing key values defining each speech model, and the derived transformation can be in the form of a matrix that would transform the vector of the male model to the vector of the female model. The modification to the derived matrix comprises applying an exponent p which has a value greater than zero and less than 1.
    Type: Grant
    Filed: September 12, 2008
    Date of Patent: February 4, 2014
    Assignee: Rosetta Stone, Ltd.
    Inventors: Andreas Hagen, Bryan Pellom, Kadri Hacioglu
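    With a diagonal (per-dimension) transform the scheme reduces to a few lines; the diagonal restriction and the sample value of p are simplifications of the matrix formulation described above:

```python
import numpy as np

def child_model(male_vec, female_vec, p=0.25):
    """Synthetic child speech model (sketch, diagonal case): derive the
    per-dimension male->female transform T, soften it with an exponent
    0 < p < 1 (p=0.25 is an illustrative value), and apply the modified
    transform to the female model."""
    T = female_vec / male_vec        # transform taking male to female
    return female_vec * T ** p       # push the female model further
```

    For example, if a model dimension doubles from male to female, applying the p-th power of that transform to the female value extrapolates it beyond the female model, in the direction a child's voice would lie.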
  • Patent number: 8639508
    Abstract: A method of automatic speech recognition includes receiving an utterance from a user via a microphone that converts the utterance into a speech signal, pre-processing the speech signal using a processor to extract acoustic data from the received speech signal, and identifying at least one user-specific characteristic in response to the extracted acoustic data. The method also includes determining a user-specific confidence threshold responsive to the at least one user-specific characteristic, and using the user-specific confidence threshold to recognize the utterance received from the user and/or to assess confusability of the utterance with stored vocabulary.
    Type: Grant
    Filed: February 14, 2011
    Date of Patent: January 28, 2014
    Assignee: General Motors LLC
    Inventors: Xufang Zhao, Gaurav Talwar
  • Patent number: 8639513
    Abstract: An apparatus includes a plurality of applications and an integrator having a voice recognition module configured to identify at least one voice command from a user. The integrator is configured to integrate information from a remote source into at least one of the plurality of applications based on the identified voice command. A method includes analyzing speech from a first user of a first mobile device having a plurality of applications, identifying a voice command based on the analyzed speech using a voice recognition module, and incorporating information from the remote source into at least one of a plurality of applications based on the identified voice command.
    Type: Grant
    Filed: August 5, 2009
    Date of Patent: January 28, 2014
    Assignee: Verizon Patent and Licensing Inc.
    Inventor: Robert Edward Opaluch
  • Patent number: 8638190
    Abstract: In general, techniques and systems for defining a gesture with a computing device using short-range communication are described. In one example, a method includes obtaining position information from an array of position devices using near-field communication (NFC) during a movement of the computing device with respect to the array, wherein the position information identifies unique positions within the array for each position device from which position information was obtained. The method may also include determining sequence information associated with the position information, wherein the sequence information is representative of an order in which the position information was obtained from each position device, and performing, by the computing device, an action based at least in part on the position information and the sequence information, wherein the position information and the sequence information are representative of a gesture input associated with the movement of the computing device.
    Type: Grant
    Filed: September 12, 2012
    Date of Patent: January 28, 2014
    Assignee: Google Inc.
    Inventors: Roy Want, Yang Li, William Noah Schilit
  • Patent number: 8635058
    Abstract: The present invention relates to increasing the relevance of media content communicated to consumers who are consuming the media content. In this regard, at least one personal device can be synced with a media device, each personal device being associated with at least one consumer who is proximate the media device. At least one preferred human language associated with at least one of the personal devices can be determined. The media device or media content can be configured and/or caused to be communicated in at least one of the preferred human languages to increase the relevance of the media content communicated to the consumer. Other embodiments can include communicating at least a portion of the media content on the personal device and selecting relevant media content based in part on language, cultural, ethnic, time, day, occasion, or geography.
    Type: Grant
    Filed: March 2, 2010
    Date of Patent: January 21, 2014
    Inventor: Nilang Patel
  • Patent number: 8635065
    Abstract: The present invention discloses an apparatus for automatic extraction of important events in audio signals comprising: signal input means for supplying audio signals; audio signal fragmenting means for partitioning audio signals supplied by the signal input means into audio fragments of a predetermined length and for allocating a sequence of one or more audio fragments to a respective audio window; feature extracting means for analyzing acoustic characteristics of the audio signals comprised in the audio fragments and for analyzing acoustic characteristics of the audio signals comprised in the audio windows; and important event extraction means for extracting important events in audio signals supplied by the audio signal fragmenting means based on predetermined important event classifying rules depending on acoustic characteristics of the audio signals comprised in the audio fragments and on acoustic characteristics of the audio signals comprised in the audio windows, wherein each important event extracted
    Type: Grant
    Filed: November 10, 2004
    Date of Patent: January 21, 2014
    Assignee: Sony Deutschland GmbH
    Inventors: Silke Goronzy-Thomae, Thomas Kemp, Ralf Kompe, Yin Hay Lam, Krzysztof Marasek, Raquel Tato
  • Patent number: 8630852
    Abstract: An image processing apparatus includes a speech input portion that receives an input of speech from a user, a dictionary storage portion that stores a dictionary configured by phrase information pieces for recognizing the speech, a compound phrase generation portion that generates a plurality of compound phrases formed by all combinations of a plurality of predetermined phrases in different orders, a compound phrase registration portion that registers the plurality of compound phrases that have been generated in the dictionary as the phrase information pieces, a speech recognition portion that, in a case where speech including a speech phrase formed by the plurality of predetermined phrases said in an arbitrary order has been input, performs speech recognition on the speech by searching the dictionary for a compound phrase that matches the speech phrase.
    Type: Grant
    Filed: September 16, 2010
    Date of Patent: January 14, 2014
    Assignee: Konica Minolta Business Technologies, Inc.
    Inventor: Ayumi Itoh
  • Patent number: 8626506
    Abstract: A method for dynamic nametag scoring includes receiving at least one confusion table including at least one circumstantial condition wherein the confusion table is based on a plurality of phonetically balanced utterances, determining a plurality of templates for the nametag based on the received confusion tables, and determining a global nametag score for the nametag based on the determined templates. A computer usable medium with suitable computer program code is employed for dynamic nametag scoring.
    Type: Grant
    Filed: January 20, 2006
    Date of Patent: January 7, 2014
    Assignee: General Motors LLC
    Inventors: Rathinavelu Chengalvarayan, John J. Correia
  • Patent number: 8626507
    Abstract: Multimodal utterances contain a number of different modes. These modes can include speech, gestures, and pen, haptic, and gaze inputs, and the like. This invention uses recognition results from one or more of these modes to provide compensation to the recognition process of one or more other ones of these modes. In various exemplary embodiments, a multimodal recognition system inputs one or more recognition lattices from one or more of these modes, and generates one or more models to be used by one or more mode recognizers to recognize the one or more other modes. In one exemplary embodiment, a gesture recognizer inputs a gesture input and outputs a gesture recognition lattice to a multimodal parser. The multimodal parser generates a language model and outputs it to an automatic speech recognition system, which uses the received language model to recognize the speech input that corresponds to the recognized gesture input.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: January 7, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Srinivas Bangalore, Michael J. Johnston
  • Patent number: 8626508
    Abstract: Provided are a speech search device and speech search method whose search speed is very fast, whose search performance is also excellent, and which perform fuzzy search. Beyond the fuzzy search itself, the distance between phoneme discrimination features included in speech data is calculated to determine the similarity with respect to the speech, using both a suffix array and dynamic programming. The object to be searched for is narrowed by means of search keyword division based on phonemes and search thresholds for each of the divided search keywords; the object is searched for repeatedly while increasing the search thresholds in order, and whether or not to divide the keyword is determined according to the length of the search keywords, thereby implementing speech search whose search speed is very fast and whose search performance is also excellent.
    Type: Grant
    Filed: February 10, 2010
    Date of Patent: January 7, 2014
    Assignee: National University Corporation TOYOHASHI UNIVERSITY OF TECHNOLOGY
    Inventors: Koichi Katsurada, Tsuneo Nitta, Shigeki Teshima
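    The dynamic-programming half of the search can be sketched as approximate substring matching over phoneme strings; the suffix-array narrowing and the increasing-threshold schedule described above are omitted from this sketch:

```python
def fuzzy_find(keyword, phones, max_dist=1):
    """Approximate substring search: edit distance of `keyword` against
    every ending position in `phones` (both lists of phoneme symbols),
    with free starting position.  Returns the 1-based end indices where
    the keyword matches within `max_dist` edits."""
    m, n = len(keyword), len(phones)
    prev = [0] * (n + 1)              # row 0: a match may start anywhere
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if keyword[i - 1] == phones[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # delete from keyword
                         cur[j - 1] + 1,     # insert into keyword
                         prev[j - 1] + cost) # match / substitute
        prev = cur
    return [j for j in range(1, n + 1) if prev[j] <= max_dist]
```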
  • Patent number: 8620655
    Abstract: A speech processing method, comprising: receiving a speech input which comprises a sequence of feature vectors; determining the likelihood of a sequence of words arising from the sequence of feature vectors using an acoustic model and a language model, comprising: providing an acoustic model for performing speech recognition on an input signal which comprises a sequence of feature vectors, said model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to a feature vector, wherein said speech input is a mismatched speech input which is received from a speaker in an environment which is not matched to the speaker or environment under which the acoustic model was trained; and adapting the acoustic model to the mismatched speech input, the speech processing method further comprising determining the likelihood of a sequence of features occurring in a given language using a language model; and combining the likelihoods determined by the acoustic
    Type: Grant
    Filed: August 10, 2011
    Date of Patent: December 31, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Haitian Xu, Kean Kheong Chin, Mark John Francis Gales
  • Patent number: 8615397
    Abstract: Embodiments of a system for identifying audio content are described. During operation, the system receives a data stream from an electronic device via a communication network. Then, the system distorts a set of target patterns which are used to identify the audio content based on characteristics of the electronic device and/or the communication network. Next, the system identifies the audio content in the data stream based on the set of distorted target patterns.
    Type: Grant
    Filed: April 4, 2008
    Date of Patent: December 24, 2013
    Assignee: Intuit Inc.
    Inventor: Matt E. Hart
  • Patent number: 8612224
    Abstract: A method for identifying a plurality of speakers in audio data and for decoding the speech spoken by said speakers; the method comprising: receiving speech; dividing the speech into segments as it is received; processing the received speech segment by segment in the order received to identify the speaker and to decode the speech, processing comprising: performing primary decoding of the segment using an acoustic model and a language model; obtaining segment parameters indicating the differences between the speaker of the segment and a base speaker during the primary decoding; comparing the segment parameters with a plurality of stored speaker profiles to determine the identity of the speaker, and selecting a speaker profile for said speaker; updating the selected speaker profile; performing a further decoding of the segment using a speaker independent acoustic model, adapted using the updated speaker profile; outputting the decoded speech for the identified speaker, wherein the speaker profiles are upd
    Type: Grant
    Filed: August 23, 2011
    Date of Patent: December 17, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Catherine Breslin, Mark John Francis Gales, Kean Kheong Chin, Katherine Mary Knill
  • Patent number: 8612212
    Abstract: The invention concerns a method and corresponding system for building a phonotactic model for domain independent speech recognition. The method may include recognizing phones from a user's input communication using a current phonotactic model, detecting morphemes (acoustic and/or non-acoustic) from the recognized phones, and outputting the detected morphemes for processing. The method also updates the phonotactic model with the detected morphemes and stores the new model in a database for use by the system during the next user interaction. The method may also include making task-type classification decisions based on the detected morphemes from the user's input communication.
    Type: Grant
    Filed: March 4, 2013
    Date of Patent: December 17, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Giuseppe Riccardi
  • Patent number: 8612223
    Abstract: There is provided a voice processing device. The device includes: score calculation unit configured to calculate a score indicating compatibility of a voice signal input on the basis of an utterance of a user with each of plural pieces of intention information indicating each of a plurality of intentions; intention selection unit configured to select the intention information indicating the intention of the utterance of the user among the plural pieces of intention information on the basis of the score calculated by the score calculation unit; and intention reliability calculation unit configured to calculate the reliability with respect to the intention information selected by the intention selection unit on the basis of the score calculated by the score calculation unit.
    Type: Grant
    Filed: June 17, 2010
    Date of Patent: December 17, 2013
    Assignee: Sony Corporation
    Inventors: Katsuki Minamino, Hitoshi Honda, Yoshinori Maeda, Hiroaki Ogawa
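As a toy illustration of the score-based intention selection with a reliability value described above: the intention with the highest score is chosen, and its reliability is taken as its share of the total score. The normalization is invented for this sketch and is not Sony's patented calculation:

```python
def select_intention(scores):
    """scores: dict mapping intention name -> non-negative score.
    Returns the best-scoring intention and a simple reliability value."""
    best = max(scores, key=scores.get)
    total = sum(scores.values())
    reliability = scores[best] / total if total > 0 else 0.0
    return best, reliability

best, rel = select_intention({"play_music": 6.0, "set_alarm": 3.0, "other": 1.0})
# best == "play_music", rel == 0.6
```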
  • Patent number: 8612211
    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving two or more data sets each representing speech of a corresponding individual attending an internet-based social networking video conference session, decoding the received data sets to produce corresponding text for each individual attending the internet-based social networking video conference, and detecting characteristics of the session from a coalesced transcript produced from the decoded text of the attending individuals for providing context to the internet-based social networking video conference session.
    Type: Grant
    Filed: January 17, 2013
    Date of Patent: December 17, 2013
    Assignee: Google Inc.
    Inventors: Glen Shires, Sterling Swigart, Jonathan Zolla, Jason J. Gauci
  • Patent number: 8612229
    Abstract: A method (300) and system (100) are provided to add the creation of examples at a developer level in the generation of Natural Language Understanding (NLU) models, tying the examples into an NLU sentence database (130), automatically validating (310) a correct outcome of using the examples, and automatically resolving (316) problems the user encounters when using the examples. The method (300) can convey examples of what a caller can say to a Natural Language Understanding (NLU) application. The method includes entering at least one example associated with an existing routing destination, and ensuring that an NLU model interprets the example unambiguously for correctly routing a call to the routing destination. The method can include presenting the example sentence in a help message (126) within an NLU dialogue as an example of what a caller can say for connecting the caller to a desired routing destination.
    Type: Grant
    Filed: December 15, 2005
    Date of Patent: December 17, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Rajesh Balchandran, Linda M. Boyer, James R. Lewis, Brent D. Metz
  • Publication number: 20130332164
    Abstract: A speech recognition system uses, in one embodiment, an extended phonetic dictionary that is obtained by processing words in a user's set of databases, such as a user's contacts database, with a set of pronunciation guessers. The speech recognition system can use a conventional phonetic dictionary and the extended phonetic dictionary to recognize speech inputs that are user requests to use the contacts database, for example, to make a phone call, etc. The extended phonetic dictionary can be updated in response to changes in the contacts database, and the set of pronunciation guessers can include pronunciation guessers for a plurality of locales, each locale having its own pronunciation guesser.
    Type: Application
    Filed: June 8, 2012
    Publication date: December 12, 2013
    Inventor: Devang K. Naik
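The idea of extending a base phonetic dictionary with guessed pronunciations for contact names, as described above, can be sketched roughly like this. All names here are hypothetical, and the "guesser" is a trivial one-letter-per-phone stand-in, not Apple's locale-aware pronunciation guessers:

```python
def naive_guesser(word):
    """Trivial stand-in guesser: one pseudo-phone per letter."""
    return " ".join(word.lower())

def extend_dictionary(base_dict, contacts, guesser=naive_guesser):
    """Return a copy of base_dict augmented with guessed pronunciations
    for contact names not already present."""
    extended = dict(base_dict)
    for name in contacts:
        extended.setdefault(name, guesser(name))
    return extended

d = extend_dictionary({"call": "K AO L"}, ["Ines", "Xochitl"])
# existing entries are kept; new names get guessed pronunciations
```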
  • Patent number: 8606580
    Abstract: To provide a data process unit and data process unit control program that are suitable for generating acoustic models for unspecified speakers taking distribution of diversifying feature parameters into consideration under such specific conditions as the type of speaker, speech lexicons, speech styles, and speech environment and that are suitable for providing acoustic models intended for unspecified speakers and adapted to speech of a specific person. The data process unit comprises a data classification section, data storing section, pattern model generating section, data control section, mathematical distance calculating section, pattern model converting section, pattern model display section, region dividing section, division changing section, region selecting section, and specific pattern model generating section.
    Type: Grant
    Filed: December 30, 2008
    Date of Patent: December 10, 2013
    Assignee: Asahi Kasei Kabushiki Kaisha
    Inventors: Makoto Shozakai, Goshu Nagino
  • Patent number: 8606578
    Abstract: According to some embodiments, a method and apparatus are provided to buffer N audio frames of a plurality of audio frames associated with an audio signal, pre-compute scores for a subset of context dependent models (CDMs), and perform a graphical model search associated with the N audio frames where a score of a context independent model (CIM) associated with a CDM is used in lieu of a score for the CDM when a score for the CDM is needed and has not been pre-computed.
    Type: Grant
    Filed: June 25, 2009
    Date of Patent: December 10, 2013
    Assignee: Intel Corporation
    Inventors: Michael Eugene Deisher, Tao Ma
  • Patent number: 8606575
    Abstract: Example embodiments of the present invention may include a method that provides transcribing spoken utterances occurring during a call and automatically assigning each of the spoken utterances with a corresponding set of first classifications. The method may also include determining a confidence rating associated with each of the spoken utterances and the assigned set of first classifications, and performing at least one of reclassifying the spoken utterances with new classifications based on at least one additional classification operation, and adding the assigned first classifications and the corresponding plurality of spoken utterances to a training data set.
    Type: Grant
    Filed: September 6, 2011
    Date of Patent: December 10, 2013
    Assignee: West Corporation
    Inventor: Silke Witt-Ehsani
  • Patent number: 8600741
    Abstract: A system and method for tuning a speech recognition engine to an individual microphone using a database containing acoustical models for a plurality of microphones. Microphone performance characteristics are obtained from a microphone at a speech recognition engine, the database is searched for an acoustical model that matches the characteristics, and the speech recognition engine is then modified based on the matching acoustical model.
    Type: Grant
    Filed: August 20, 2008
    Date of Patent: December 3, 2013
    Assignee: General Motors LLC
    Inventors: Gaurav Talwar, Rathinavelu Chengalvarayan, Jesse T. Gratke, Subhash B. Gullapalli, Dana B. Fecher
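The database lookup described above, matching measured microphone characteristics against stored acoustical models, might look like the following sketch. The representation (feature tuples, Euclidean distance) is invented for illustration and is not the patented matching procedure:

```python
import math

def match_acoustic_model(mic_features, model_db):
    """mic_features: tuple of numbers describing the microphone;
    model_db: dict mapping model name -> feature tuple.
    Returns the name of the closest stored model."""
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(model_db, key=lambda name: distance(mic_features, model_db[name]))

db = {"mic_a": (1.0, 0.2), "mic_b": (0.1, 0.9)}
match_acoustic_model((0.9, 0.3), db)  # -> "mic_a"
```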
  • Patent number: 8600748
    Abstract: A system and methods for matching at least one word of an utterance against a set of template hierarchies to select the best matching template or set of templates corresponding to the utterance. The system and methods determine at least one exact, inexact, or partial match between the at least one word of the utterance and at least one term within the template hierarchy to select and populate a template or set of templates corresponding to the utterance. The populated template or set of templates may then be used to generate a narrative template or a report template.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: December 3, 2013
    Assignee: Cyberpulse L.L.C.
    Inventors: James Roberge, Jeffrey Soble
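The exact / inexact / partial match distinction from the abstract above can be illustrated with a small sketch. The terminology is reused from the abstract, but the matching rules here (case-insensitive for "inexact", substring for "partial") are invented for illustration:

```python
def match_term(word, terms):
    """Classify how a word matches a list of template terms:
    'exact' (identical), 'inexact' (case-insensitive),
    'partial' (substring either way), or None."""
    if word in terms:
        return "exact"
    lowered = [t.lower() for t in terms]
    if word.lower() in lowered:
        return "inexact"
    if any(word.lower() in t or t in word.lower() for t in lowered):
        return "partial"
    return None

terms = ["Fracture", "no fracture"]
match_term("Fracture", terms)   # -> "exact"
match_term("fracture", terms)   # -> "inexact"
match_term("fractures", terms)  # -> "partial"
```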
  • Patent number: 8600037
    Abstract: Echo cancellation is handled using a pattern classification technique. During a training phase, the communications device is trained to learn patterns of signal inputs that correspond to certain communication modes. After numerous different patterns have been classified by mode, the classified patterns are used during real-time use of the communications device in order to determine, dynamically, the mode in which the communications device is currently operating. The communications device may then apply, to the microphone-produced signal, a suppression action.
    Type: Grant
    Filed: June 1, 2012
    Date of Patent: December 3, 2013
    Assignee: Apple Inc.
    Inventor: Arvindh Krishnaswamy
  • Patent number: 8583432
    Abstract: Methods and systems for automatic speech recognition and methods and systems for training acoustic language models are disclosed. One system for automatic speech recognition includes a dialect recognition unit and a controller. The dialect recognition unit is configured to analyze acoustic input data to identify portions of the acoustic input data that conform to a general language and to identify portions of the acoustic input data that conform to at least one dialect of the general language. In addition, the controller is configured to apply a general language model and at least one dialect language model to the input data to perform speech recognition by dynamically selecting between the models in accordance with each of the identified portions.
    Type: Grant
    Filed: July 25, 2012
    Date of Patent: November 12, 2013
    Assignee: International Business Machines Corporation
    Inventors: Fadi Biadsy, Lidia Mangu, Hagen Soltau
  • Patent number: 8583436
    Abstract: A word category estimation apparatus (100) includes a word category model (5) which is formed from a probability model having a plurality of kinds of information about a word category as features, and includes information about an entire word category graph as at least one of the features. A word category estimation unit (4) receives the word category graph of a speech recognition hypothesis to be processed, computes scores by referring to the word category model for respective arcs that form the word category graph, and outputs a word category sequence candidate based on the scores.
    Type: Grant
    Filed: December 19, 2008
    Date of Patent: November 12, 2013
    Assignee: NEC Corporation
    Inventors: Hitoshi Yamamoto, Kiyokazu Miki
  • Publication number: 20130297310
    Abstract: This document describes methods, systems, techniques, and computer program products for generating and/or modifying acoustic models. Acoustic models and/or transformations for a target language/dialect can be generated and/or modified using acoustic models and/or transformations from a source language/dialect.
    Type: Application
    Filed: November 8, 2011
    Publication date: November 7, 2013
    Inventors: Eugene Weinstein, Pedro J. Moreno Mangibar
  • Patent number: 8577680
    Abstract: A method, article of manufacture, and apparatus for monitoring data traffic on a network is disclosed. In an embodiment, this includes obtaining intrinsic data from at least a portion of the traffic, obtaining extrinsic data from at least a portion of the traffic, associating the intrinsic data with the extrinsic data, and logging the intrinsic data and extrinsic data. The portion of the traffic from which the intrinsic data and extrinsic data are derived may not be stored, or may be stored in encrypted form.
    Type: Grant
    Filed: December 30, 2006
    Date of Patent: November 5, 2013
    Assignee: EMC Corporation
    Inventors: Christopher Hercules Claudatos, William Dale Andruss, Scott R. Bevan
  • Patent number: 8577681
    Abstract: A method of generating an alternative pronunciation for a word or phrase, given an initial pronunciation and a spoken example of the word or phrase, includes providing the initial pronunciation of the word or phrase, and generating the alternative pronunciation by searching a neighborhood of pronunciations about the initial pronunciation via a constrained hypothesis, wherein the neighborhood includes pronunciations that differ from the initial pronunciation by at most one phoneme. The method further includes selecting a highest scoring pronunciation within the neighborhood of pronunciations.
    Type: Grant
    Filed: September 13, 2004
    Date of Patent: November 5, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Daniel L. Roth, Laurence S. Gillick, Mike Shire
  • Patent number: 8577678
    Abstract: A speech recognition system according to the present invention includes a sound source separating section which separates mixed speeches from multiple sound sources from one another; a mask generating section which generates a soft mask which can take continuous values between 0 and 1 for each frequency spectral component of a separated speech signal using distributions of speech signal and noise against separation reliability of the separated speech signal; and a speech recognizing section which recognizes speeches separated by the sound source separating section using soft masks generated by the mask generating section.
    Type: Grant
    Filed: March 10, 2011
    Date of Patent: November 5, 2013
    Assignee: Honda Motor Co., Ltd.
    Inventors: Kazuhiro Nakadai, Toru Takahashi, Hiroshi Okuno
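A soft mask taking continuous values between 0 and 1 per frequency component, as described in the abstract above, can be sketched as follows. Here a logistic function of a per-bin reliability value stands in for the patented distribution-based rule; the parameters are illustrative:

```python
import math

def soft_mask(reliabilities, slope=4.0, threshold=0.5):
    """Map each separation-reliability value to a continuous mask
    weight in (0, 1): low reliability -> near 0, high -> near 1."""
    return [1.0 / (1.0 + math.exp(-slope * (r - threshold)))
            for r in reliabilities]

mask = soft_mask([0.1, 0.5, 0.9])
# weights rise smoothly with reliability instead of a hard 0/1 cutoff
```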
  • Publication number: 20130289990
    Abstract: A dialog manager and spoken dialog service having a dialog manager generated according to a method comprising selecting a top level flow controller based on application type, selecting available reusable subdialogs for each application part, developing a subdialog for each application part not having an available subdialog, and testing and deploying the spoken dialog service using the selected top level flow controller, selected reusable subdialogs, and developed subdialogs. The dialog manager is capable of handling context shifts in a spoken dialog with a user. Application dependencies are established in the top level flow controller, thus enabling the subdialogs to be reusable and capable of managing context shifts and mixed initiative dialogs.
    Type: Application
    Filed: June 25, 2013
    Publication date: October 31, 2013
    Inventors: Giuseppe Di Fabbrizio, Charles Alfred Lewis