Creating Patterns For Matching Patents (Class 704/243)
  • Patent number: 8676574
    Abstract: In a spoken language processing method for tone/intonation recognition, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more tonal characteristics corresponding to the input window of sound can be determined by mapping the cumulative gist vector to one or more tonal characteristics using a machine learning algorithm.
    Type: Grant
    Filed: November 10, 2010
    Date of Patent: March 18, 2014
    Assignee: Sony Computer Entertainment Inc.
    Inventor: Ozlem Kalinli
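    The pipeline above (feature maps → per-map auditory gist vectors → augmented cumulative gist vector) can be sketched as follows; block averaging over a coarse grid and the 4×5 grid size are illustrative assumptions, not details fixed by the abstract:

```python
import numpy as np

def gist_vector(feature_map, grid=(4, 5)):
    """Auditory gist of one feature map: mean over each cell of a
    coarse grid, flattened into a vector (grid size is an assumed
    choice)."""
    return np.array([block.mean()
                     for row in np.array_split(feature_map, grid[0], axis=0)
                     for block in np.array_split(row, grid[1], axis=1)])

def cumulative_gist(feature_maps, grid=(4, 5)):
    """Augment (concatenate) the gist vectors of all feature maps into
    a single cumulative gist vector."""
    return np.concatenate([gist_vector(f, grid) for f in feature_maps])
```

    The cumulative gist vector would then be fed to the machine learning stage that maps it to tonal characteristics.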
  • Patent number: 8670980
    Abstract: A tone determination device, which determines the tonality of an input signal, is capable of reducing calculation complexity. Therein a frequency conversion unit (101) converts the frequency of an input signal; a downsampling unit (102) carries out shortening processing which shortens the vector series length of the frequency-converted signal; a constancy determination unit (107) determines the constancy of the input signal; depending on the constancy of the input signal, a vector selection unit (104) selects either the vector series of the post-frequency conversion signal or the vector series after the shortening of the vector series length; a correlation analysis unit (105) uses the vector series selected by the vector selection unit (104) to obtain correlations; and a tone determination unit (106) uses the correlations to determine the tonality of the input signal.
    Type: Grant
    Filed: October 26, 2010
    Date of Patent: March 11, 2014
    Assignee: Panasonic Corporation
    Inventor: Kaoru Satoh
  • Patent number: 8670983
    Abstract: A method for determining a similarity between a first audio source and a second audio source includes: for the first audio source, determining a first frequency of occurrence for each of a plurality of phoneme sequences and determining a first weighted frequency for each of the plurality of phoneme sequences based on the first frequency of occurrence for the phoneme sequence; for the second audio source, determining a second frequency of occurrence for each of a plurality of phoneme sequences and determining a second weighted frequency for each of the plurality of phoneme sequences based on the second frequency of occurrence for the phoneme sequence; comparing the first weighted frequency for each phoneme sequence with the second weighted frequency for the corresponding phoneme sequence; and generating a similarity score representative of a similarity between the first audio source and the second audio source based on the results of the comparing.
    Type: Grant
    Filed: August 30, 2011
    Date of Patent: March 11, 2014
    Assignee: Nexidia Inc.
    Inventors: Jacob B. Garland, Jon A. Arrowood, Drew Lanham, Marsal Gavalda
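    A minimal sketch of the comparison described above, assuming log-scaled counts as the weighting and cosine similarity as the comparison step (the abstract leaves both open):

```python
import math
from collections import Counter

def weighted_freqs(phoneme_seqs):
    """Frequency of occurrence of each phoneme sequence, weighted by
    log scaling (the exact weighting function is left open by the
    abstract)."""
    return {seq: math.log1p(c) for seq, c in Counter(phoneme_seqs).items()}

def similarity(source_a, source_b):
    """Cosine similarity between the weighted-frequency vectors of two
    audio sources, each given as a list of detected phoneme sequences."""
    wa, wb = weighted_freqs(source_a), weighted_freqs(source_b)
    dot = sum(wa[k] * wb.get(k, 0.0) for k in wa)
    na = math.sqrt(sum(v * v for v in wa.values()))
    nb = math.sqrt(sum(v * v for v in wb.values()))
    return dot / (na * nb) if na and nb else 0.0
```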
  • Publication number: 20140067393
    Abstract: According to an embodiment, a model learning device learns a model having a full covariance matrix shared among a plurality of Gaussian distributions. The device includes a first calculator to calculate, from training data, frequencies of occurrence and sufficient statistics of the Gaussian distributions contained in the model; and a second calculator to select, on the basis of the frequencies of occurrence and the sufficient statistics, a sharing structure in which a covariance matrix is shared among Gaussian distributions, and calculate the full covariance matrix shared in the selected sharing structure.
    Type: Application
    Filed: August 15, 2013
    Publication date: March 6, 2014
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventor: Takashi Masuko
  • Patent number: 8667230
    Abstract: A digital memory architecture for recognition and recall in support of a host comprises a plurality of pattern processors, each of which has its own random access memory (RAM) and controller, an external data bus and external data bus controller, a results bus and results bus controller, an internal data bus and internal data bus controller, and an external control bus and external control bus controller. Each of the pattern processors may be a general purpose set theoretic processor (GPSTP) operating in interrupt and block modes.
    Type: Grant
    Filed: October 19, 2010
    Date of Patent: March 4, 2014
    Inventor: Curtis L. Harris
  • Patent number: 8666737
    Abstract: A noise power estimation system for estimating noise power of each frequency spectral component includes a cumulative histogram generating section for generating a cumulative histogram for each frequency spectral component of a time series signal, in which the horizontal axis indicates index of power level and the vertical axis indicates cumulative frequency and which is weighted by exponential moving average; and a noise power estimation section for determining an estimated value of noise power for each frequency spectral component of the time series signal based on the cumulative histogram.
    Type: Grant
    Filed: September 14, 2011
    Date of Patent: March 4, 2014
    Assignee: Honda Motor Co., Ltd.
    Inventors: Hirofumi Nakajima, Kazuhiro Nakadai, Yuji Hasegawa
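    The weighted cumulative histogram can be sketched as follows for a single frequency bin; the 1 dB bin width, the decay rate, and the cumulative-fraction cut point used for the estimate are assumed values:

```python
import numpy as np

class NoiseEstimator:
    """Histogram-based noise-power tracker for one frequency bin,
    weighted by an exponential moving average (sketch)."""
    def __init__(self, n_bins=100, alpha=0.05, quantile=0.3):
        self.hist = np.zeros(n_bins)   # index = power level (assumed 1 dB bins)
        self.alpha = alpha             # EMA weight for each new frame
        self.quantile = quantile       # cumulative-frequency cut point (assumed)

    def update(self, power_db):
        idx = int(np.clip(power_db, 0, len(self.hist) - 1))
        self.hist *= (1.0 - self.alpha)   # exponentially forget old frames
        self.hist[idx] += self.alpha      # weight in the new observation
        return self.estimate()

    def estimate(self):
        """Power level at a fixed fraction of the cumulative histogram."""
        cum = np.cumsum(self.hist)
        if cum[-1] == 0:
            return 0.0
        return float(np.searchsorted(cum, self.quantile * cum[-1]))
```

    One estimator instance would be maintained per frequency spectral component of the time series signal.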
  • Patent number: 8666731
    Abstract: Embodiments of the invention provide a method, computer program and apparatus for processing a computer message, the method comprising: upon receipt of a computer message at a computer, classifying the computer message and assigning it a message cluster identification in dependence thereon; and, utilizing a message template to trans-denotate the message, wherein the message template is selected in dependence on the message cluster identification.
    Type: Grant
    Filed: September 22, 2010
    Date of Patent: March 4, 2014
    Assignee: Oracle International Corporation
    Inventors: Stephen Anthony Moyle, Graham Kenneth Thwaites
  • Patent number: 8666739
    Abstract: A method of the present invention may include receiving a speech feature vector converted from a speech signal; performing a first search by applying a first language model to the received speech feature vector and outputting a word lattice and a first acoustic score of the word lattice as a continuous speech recognition result; outputting a second acoustic score as a phoneme recognition result by applying an acoustic model to the speech feature vector; comparing the first acoustic score of the continuous speech recognition result with the second acoustic score of the phoneme recognition result; outputting a first language model weight when the first acoustic score of the continuous speech recognition result is better than the second acoustic score of the phoneme recognition result; and performing a second search by applying a second language model weight, which is the same as the output first language model weight, to the word lattice.
    Type: Grant
    Filed: December 13, 2011
    Date of Patent: March 4, 2014
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Hyung Bae Jeon, Yun Keun Lee, Eui Sok Chung, Jong Jin Kim, Hoon Chung, Jeon Gue Park, Ho Young Jung, Byung Ok Kang, Ki Young Park, Sung Joo Lee, Jeom Ja Kang, Hwa Jeon Song
  • Publication number: 20140058731
    Abstract: A system and method are presented for selectively biased linear discriminant analysis in automatic speech recognition systems. Linear Discriminant Analysis (LDA) may be used to improve the discrimination between the hidden Markov model (HMM) tied-states in the acoustic feature space. The between-class and within-class covariance matrices may be biased based on the observed recognition errors of the tied-states, such as shared HMM states of the context dependent tri-phone acoustic model. The recognition errors may be obtained from a trained maximum-likelihood acoustic model utilizing the tied-states which may then be used as classes in the analysis.
    Type: Application
    Filed: August 23, 2013
    Publication date: February 27, 2014
    Applicant: INTERACTIVE INTELLIGENCE, INC.
    Inventors: Vivek Tyagi, Aravind Ganapathiraju, Felix Immanuel Wyss
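    One way the described biasing could look in code, assuming a `1 + bias * error_rate` weighting of each class's contribution to the between-class scatter (the exact biasing function is not given in the abstract):

```python
import numpy as np

def biased_lda(X, y, error_rates, bias=1.0):
    """Selectively biased LDA (sketch): each class's contribution to the
    between-class scatter is up-weighted by its observed error rate, so
    frequently confused classes (tied-states) receive more of the
    discriminative budget.  The weighting form is an assumption."""
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))          # (biased) between-class scatter
    Sw = np.zeros((d, d))          # within-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        w = 1.0 + bias * error_rates.get(c, 0.0)
        diff = (mc - mu)[:, None]
        Sb += w * len(Xc) * (diff @ diff.T)
        Sw += (Xc - mc).T @ (Xc - mc)
    # Discriminant directions: leading eigenvectors of pinv(Sw) @ Sb.
    vals, vecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-vals.real)
    return vecs[:, order].real
```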
  • Patent number: 8660842
    Abstract: A speech recognition device uses visual information to narrow down the range of likely adaptation parameters even before a speaker makes an utterance. Images of the speaker and/or the environment are collected using an image capturing device, and then processed to extract biometric features and environmental features. The extracted features and environmental features are then used to estimate adaptation parameters. A voice sample may also be collected to refine the adaptation parameters for more accurate speech recognition.
    Type: Grant
    Filed: March 9, 2010
    Date of Patent: February 25, 2014
    Assignee: Honda Motor Co., Ltd.
    Inventor: Antoine R. Raux
  • Publication number: 20140052444
    Abstract: A system and methods for matching at least one word of an utterance against a set of template hierarchies to select the best matching template or set of templates corresponding to the utterance. Certain embodiments of the system and methods determine at least one exact, inexact, and partial match between the at least one word of the utterance and at least one term within the template hierarchy to select and populate a template or set of templates corresponding to the utterance. The populated template or set of templates may then be used to generate a narrative template or a report template.
    Type: Application
    Filed: October 28, 2013
    Publication date: February 20, 2014
    Inventor: James Roberge
  • Patent number: 8655660
    Abstract: The present invention is a system and method for generating a personal voice font, including monitoring voice segments automatically from phone conversations of a user by a voice learning processor to generate a personalized voice font, and delivering the personalized voice font (PVF) to a server.
    Type: Grant
    Filed: February 10, 2009
    Date of Patent: February 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Zsolt Szalai, Philippe Bazot, Bernard Pucci, Joel Vitale
  • Patent number: 8655662
    Abstract: Disclosed herein are systems, methods, and computer readable-media for answering a communication notification. The method for answering a communication notification comprises receiving a notification of communication from a user, converting information related to the notification to speech, outputting the information as speech to the user, and receiving from the user an instruction to accept or ignore the incoming communication associated with the notification. In one embodiment, information related to the notification comprises one or more of a telephone number, an area code, a geographic origin of the request, caller id, a voice message, address book information, a text message, an email, a subject line, an importance level, a photograph, a video clip, metadata, an IP address, or a domain name. Another embodiment involves a notification assigned an importance level, with repeat attempts at notification if it is of high importance.
    Type: Grant
    Filed: November 29, 2012
    Date of Patent: February 18, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst Schroeter
  • Patent number: 8655647
    Abstract: Described is a technology by which a statistical N-gram (e.g., language) model is trained using an N-gram selection technique that helps reduce the size of the final N-gram model. During training, a higher-order probability estimate for an N-gram is only added to the model when the training data justifies adding the estimate. To this end, if a backoff probability estimate is within a maximum likelihood set determined by that N-gram and the N-gram's associated context, or is between the higher-order estimate and the maximum likelihood set, then the higher-order estimate is not included in the model. The backoff probability estimate may be determined via an iterative process such that the backoff probability estimate is based on the final model rather than any lower-order model. Also described is additional pruning referred to as modified weighted difference pruning.
    Type: Grant
    Filed: March 11, 2010
    Date of Patent: February 18, 2014
    Assignee: Microsoft Corporation
    Inventor: Robert Carter Moore
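    The selection rule can be sketched as a predicate; the bounds used here for the maximum likelihood set are one common formalization and may differ from the patent's exact definition:

```python
def keep_higher_order(count, context_count, backoff_prob, higher_prob):
    """Decide whether the training data justifies storing an explicit
    higher-order probability estimate for an N-gram seen `count` times
    in `context_count` occurrences of its context (sketch)."""
    lo = count / (context_count + 1)        # assumed ML-set lower bound
    hi = (count + 1) / (context_count + 1)  # assumed ML-set upper bound
    if lo <= backoff_prob <= hi:
        return False    # backoff estimate already consistent with the data
    if higher_prob <= backoff_prob <= lo or hi <= backoff_prob <= higher_prob:
        return False    # backoff lies between the higher estimate and the set
    return True         # explicit higher-order estimate is warranted
```

    Only N-grams for which the predicate returns `True` would be added to the model, which is what keeps the final model small.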
  • Publication number: 20140046662
    Abstract: A system and method are presented for acoustic data selection of a particular quality for training the parameters of an acoustic model, such as a Hidden Markov Model and Gaussian Mixture Model, for example, in automatic speech recognition systems in the speech analytics field. A raw acoustic model may be trained using a given speech corpus and maximum likelihood criteria. A series of operations are performed, such as a forced Viterbi-alignment, calculations of likelihood scores, and phoneme recognition, for example, to form a subset corpus of training data. During the process, audio files of a quality that does not meet a criterion, such as poor quality audio files, may be automatically rejected from the corpus. The subset may then be used to train a new acoustic model.
    Type: Application
    Filed: August 5, 2013
    Publication date: February 13, 2014
    Applicant: Interactive Intelligence, Inc.
    Inventors: Vivek Tyagi, Aravind Ganapathiraju, Felix Immanuel Wyss
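    The final selection step reduces to filtering on quality scores; the two criteria used here (per-frame alignment log-likelihood and phoneme-recognition agreement) and their thresholds are assumed stand-ins for the operations listed above:

```python
def select_training_subset(corpus, min_loglik=-5.0, min_agreement=0.6):
    """corpus: list of (name, per-frame forced-alignment log-likelihood,
    fraction of phonemes on which free phoneme recognition agrees with
    the alignment).  Files failing either quality criterion are
    rejected; both thresholds are illustrative values."""
    return [name for name, loglik, agree in corpus
            if loglik >= min_loglik and agree >= min_agreement]
```

    The surviving subset would then be used to train the new acoustic model.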
  • Patent number: 8650031
    Abstract: Techniques disclosed herein include systems and methods for voice-enabled searching. Techniques include a co-occurrence based approach to improve accuracy of the 1-best hypothesis for non-phrase voice queries, as well as for phrased voice queries. A co-occurrence model is used in addition to a statistical natural language model and acoustic model to recognize spoken queries, such as spoken queries for searching a search engine. Given an utterance and an associated list of automated speech recognition n-best hypotheses, the system rescores the different hypotheses using co-occurrence information. For each hypothesis, the system estimates a frequency of co-occurrence within web documents. Scores from the speech recognizer and a co-occurrence engine can be combined to select a best hypothesis with a lower word error rate.
    Type: Grant
    Filed: July 31, 2011
    Date of Patent: February 11, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Jonathan Mamou, Abhinav Sethy, Bhuvana Ramabhadran, Ron Hoory, Paul Joseph Vozila, Nathan Bodenstab
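    The rescoring idea can be sketched as follows; the pairwise log-count co-occurrence score and the interpolation weight are assumptions, since the abstract does not specify how the two scores are combined:

```python
import math
from itertools import combinations

def cooc_score(words, cooc):
    """Mean log(1 + document co-occurrence count) over all word pairs.
    `cooc` maps alphabetically sorted word pairs to document counts
    (a toy stand-in for web-scale statistics)."""
    pairs = list(combinations(sorted(set(words)), 2))
    if not pairs:
        return 0.0
    return sum(math.log1p(cooc.get(p, 0)) for p in pairs) / len(pairs)

def rescore(nbest, cooc, lam=0.5):
    """Rerank (hypothesis, recognizer_score) pairs by interpolating the
    recognizer score with the co-occurrence score; `lam` is an assumed
    tuning weight."""
    best_score, best_hyp = max(
        (lam * asr + (1.0 - lam) * cooc_score(hyp.split(), cooc), hyp)
        for hyp, asr in nbest)
    return best_hyp
```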
  • Patent number: 8650033
    Abstract: State-of-the-art speech recognition systems are trained using transcribed utterances, preparation of which is labor-intensive and time-consuming. The present invention is an iterative method for reducing the transcription effort for training in automatic speech recognition (ASR). Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples and then selecting the most informative ones with respect to a given cost function for a human to label. The method comprises automatically estimating a confidence score for each word of the utterance and exploiting the lattice output of a speech recognizer, which was trained on a small set of transcribed data. An utterance confidence score is computed based on these word confidence scores; then the utterances are selectively sampled to be transcribed using the utterance confidence scores.
    Type: Grant
    Filed: October 13, 2006
    Date of Patent: February 11, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Allen Louis Gorin, Dilek Z. Hakkani-Tur, Giuseppe Riccardi
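    A minimal sketch of the selective-sampling step, assuming a simple mean of word confidences as the utterance confidence (the patent covers other aggregations) and a fixed threshold:

```python
def utterance_confidence(word_confidences):
    """Aggregate per-word confidence scores into an utterance-level
    score (mean is one simple choice)."""
    return sum(word_confidences) / len(word_confidences)

def select_for_transcription(utterances, threshold=0.7):
    """Selective sampling: keep the utterances the recognizer is least
    confident about -- the most informative ones for a human to label."""
    return [utt for utt, confs in utterances
            if utterance_confidence(confs) < threshold]
```

    After the selected utterances are transcribed, the recognizer is retrained and the loop repeats, which is what makes the method iterative.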
  • Patent number: 8650030
    Abstract: A method for receiving processed information at a remote device is described. The method includes transmitting from the remote device a verbal request to a first information provider and receiving a digital message from the first information provider in response to the transmitted verbal request. The digital message includes a symbolic representation indicator associated with a symbolic representation of the verbal request and data used to control an application. The method also includes transmitting, using the application, the symbolic representation indicator to a second information provider for generating results to be displayed on the remote device.
    Type: Grant
    Filed: April 2, 2007
    Date of Patent: February 11, 2014
    Assignee: Google Inc.
    Inventors: Gudmundur Hafsteinsson, Michael J. LeBeau, Natalia Marmasse, Sumit Agarwal, Dipchand Nishar
  • Patent number: 8645130
    Abstract: A processing unit is provided which executes speech recognition on speech signals captured by a microphone for capturing sounds uttered in an environment. The processing unit has: an initial reflection component extraction portion that extracts initial reflection components by removing diffuse reverberation components from a reverberation pattern of an impulse response generated in the environment; and an acoustic model learning portion that learns an acoustic model for the speech recognition by reflecting the initial reflection components to speech data for learning.
    Type: Grant
    Filed: November 20, 2008
    Date of Patent: February 4, 2014
    Assignees: Toyota Jidosha Kabushiki Kaisha, National University Corporation Nara Institute of Science and Technology
    Inventors: Narimasa Watanabe, Kiyohiro Shikano, Randy Gomez
  • Patent number: 8645135
    Abstract: A transformation can be derived which would represent the processing required to convert a male speech model to a female speech model. That transformation is subjected to a predetermined modification, and the modified transformation is applied to a female speech model to produce a synthetic children's speech model. The male and female models can be expressed in terms of a vector representing key values defining each speech model, and the derived transformation can be in the form of a matrix that would transform the vector of the male model to the vector of the female model. The modification to the derived matrix comprises applying an exponent p which has a value greater than zero and less than 1.
    Type: Grant
    Filed: September 12, 2008
    Date of Patent: February 4, 2014
    Assignee: Rosetta Stone, Ltd.
    Inventors: Andreas Hagen, Bryan Pellom, Kadri Hacioglu
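    With a diagonal (per-dimension) transform the scheme reduces to a few lines; the diagonal restriction and the sample value of p are simplifications of the matrix formulation described above:

```python
import numpy as np

def child_model(male_vec, female_vec, p=0.25):
    """Synthetic child speech model (sketch, diagonal case): derive the
    per-dimension male->female transform T, soften it with an exponent
    0 < p < 1 (p=0.25 is an illustrative value), and apply the modified
    transform to the female model."""
    T = female_vec / male_vec        # transform taking male to female
    return female_vec * T ** p       # push the female model further
```

    For example, if a model dimension doubles from male to female, applying the p-th power of that transform to the female value extrapolates it beyond the female model, in the direction a child's voice would lie.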
  • Patent number: 8639508
    Abstract: A method of automatic speech recognition includes receiving an utterance from a user via a microphone that converts the utterance into a speech signal, pre-processing the speech signal using a processor to extract acoustic data from the received speech signal, and identifying at least one user-specific characteristic in response to the extracted acoustic data. The method also includes determining a user-specific confidence threshold responsive to the at least one user-specific characteristic, and using the user-specific confidence threshold to recognize the utterance received from the user and/or to assess confusability of the utterance with stored vocabulary.
    Type: Grant
    Filed: February 14, 2011
    Date of Patent: January 28, 2014
    Assignee: General Motors LLC
    Inventors: Xufang Zhao, Gaurav Talwar
  • Patent number: 8639513
    Abstract: An apparatus includes a plurality of applications and an integrator having a voice recognition module configured to identify at least one voice command from a user. The integrator is configured to integrate information from a remote source into at least one of the plurality of applications based on the identified voice command. A method includes analyzing speech from a first user of a first mobile device having a plurality of applications, identifying a voice command based on the analyzed speech using a voice recognition module, and incorporating information from the remote source into at least one of a plurality of applications based on the identified voice command.
    Type: Grant
    Filed: August 5, 2009
    Date of Patent: January 28, 2014
    Assignee: Verizon Patent and Licensing Inc.
    Inventor: Robert Edward Opaluch
  • Patent number: 8638190
    Abstract: In general, techniques and systems for defining a gesture with a computing device using short-range communication are described. In one example, a method includes obtaining position information from an array of position devices using near-field communication (NFC) during a movement of the computing device with respect to the array, wherein the position information identifies unique positions within the array for each position device from which position information was obtained. The method may also include determining sequence information associated with the position information, wherein the sequence information is representative of an order in which the position information was obtained from each position device, and performing, by the computing device, an action based at least in part on the position information and the sequence information, wherein the position information and the sequence information are representative of a gesture input associated with the movement of the computing device.
    Type: Grant
    Filed: September 12, 2012
    Date of Patent: January 28, 2014
    Assignee: Google Inc.
    Inventors: Roy Want, Yang Li, William Noah Schilit
  • Patent number: 8635058
    Abstract: The present invention relates to increasing the relevance of media content communicated to consumers who are consuming the media content. In this regard, at least one personal device can be synced with a media device, each personal device being associated with at least one consumer who is proximate the media device. At least one preferred human language associated with at least one of the personal devices can be determined. The media device or media content can be configured and/or caused to be communicated in at least one of the preferred human languages to increase the relevance of the media content communicated to the consumer. Other embodiments can include communicating at least a portion of the media content on the personal device and selecting relevant media content based in part on language, cultural, ethnic, time, day, occasion, or geography.
    Type: Grant
    Filed: March 2, 2010
    Date of Patent: January 21, 2014
    Inventor: Nilang Patel
  • Patent number: 8635065
    Abstract: The present invention discloses an apparatus for automatic extraction of important events in audio signals comprising: signal input means for supplying audio signals; audio signal fragmenting means for partitioning audio signals supplied by the signal input means into audio fragments of a predetermined length and for allocating a sequence of one or more audio fragments to a respective audio window; feature extracting means for analyzing acoustic characteristics of the audio signals comprised in the audio fragments and for analyzing acoustic characteristics of the audio signals comprised in the audio windows; and important event extraction means for extracting important events in audio signals supplied by the audio signal fragmenting means based on predetermined important event classifying rules depending on acoustic characteristics of the audio signals comprised in the audio fragments and on acoustic characteristics of the audio signals comprised in the audio windows, wherein each important event extracted
    Type: Grant
    Filed: November 10, 2004
    Date of Patent: January 21, 2014
    Assignee: Sony Deutschland GmbH
    Inventors: Silke Goronzy-Thomae, Thomas Kemp, Ralf Kompe, Yin Hay Lam, Krzysztof Marasek, Raquel Tato
  • Patent number: 8630852
    Abstract: An image processing apparatus includes a speech input portion that receives an input of speech from a user, a dictionary storage portion that stores a dictionary configured by phrase information pieces for recognizing the speech, a compound phrase generation portion that generates a plurality of compound phrases formed by all combinations of a plurality of predetermined phrases in different orders, a compound phrase registration portion that registers the plurality of compound phrases that have been generated in the dictionary as the phrase information pieces, a speech recognition portion that, in a case where speech including a speech phrase formed by the plurality of predetermined phrases said in an arbitrary order has been input, performs speech recognition on the speech by searching the dictionary for a compound phrase that matches the speech phrase.
    Type: Grant
    Filed: September 16, 2010
    Date of Patent: January 14, 2014
    Assignee: Konica Minolta Business Technologies, Inc.
    Inventor: Ayumi Itoh
  • Patent number: 8626506
    Abstract: A method for dynamic nametag scoring includes receiving at least one confusion table including at least one circumstantial condition wherein the confusion table is based on a plurality of phonetically balanced utterances, determining a plurality of templates for the nametag based on the received confusion tables, and determining a global nametag score for the nametag based on the determined templates. A computer usable medium with suitable computer program code is employed for dynamic nametag scoring.
    Type: Grant
    Filed: January 20, 2006
    Date of Patent: January 7, 2014
    Assignee: General Motors LLC
    Inventors: Rathinavelu Chengalvarayan, John J. Correia
  • Patent number: 8626507
    Abstract: Multimodal utterances contain a number of different modes. These modes can include speech, gestures, and pen, haptic, and gaze inputs, and the like. This invention uses recognition results from one or more of these modes to provide compensation to the recognition process of one or more other ones of these modes. In various exemplary embodiments, a multimodal recognition system inputs one or more recognition lattices from one or more of these modes, and generates one or more models to be used by one or more mode recognizers to recognize the one or more other modes. In one exemplary embodiment, a gesture recognizer inputs a gesture input and outputs a gesture recognition lattice to a multimodal parser. The multimodal parser generates a language model and outputs it to an automatic speech recognition system, which uses the received language model to recognize the speech input that corresponds to the recognized gesture input.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: January 7, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Srinivas Bangalore, Michael J. Johnston
  • Patent number: 8626508
    Abstract: Provided are a speech search device and speech search method whose search speed is very fast, whose search performance is also excellent, and which perform fuzzy search. Beyond the fuzzy search itself, the distance between phoneme discrimination features included in speech data is calculated to determine the similarity with respect to the speech, using both a suffix array and dynamic programming. The object to be searched for is narrowed by means of search keyword division based on phonemes and search thresholds for each of the divided search keywords; the object is searched for repeatedly while increasing the search thresholds in order, and whether or not to divide the keyword is determined according to the length of the search keywords, thereby implementing speech search whose search speed is very fast and whose search performance is also excellent.
    Type: Grant
    Filed: February 10, 2010
    Date of Patent: January 7, 2014
    Assignee: National University Corporation TOYOHASHI UNIVERSITY OF TECHNOLOGY
    Inventors: Koichi Katsurada, Tsuneo Nitta, Shigeki Teshima
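    The dynamic-programming half of the search can be sketched as approximate substring matching over phoneme strings; the suffix-array narrowing and the increasing-threshold schedule described above are omitted from this sketch:

```python
def fuzzy_find(keyword, phones, max_dist=1):
    """Approximate substring search: edit distance of `keyword` against
    every ending position in `phones` (both lists of phoneme symbols),
    with free starting position.  Returns the 1-based end indices where
    the keyword matches within `max_dist` edits."""
    m, n = len(keyword), len(phones)
    prev = [0] * (n + 1)              # row 0: a match may start anywhere
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if keyword[i - 1] == phones[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # delete from keyword
                         cur[j - 1] + 1,     # insert into keyword
                         prev[j - 1] + cost) # match / substitute
        prev = cur
    return [j for j in range(1, n + 1) if prev[j] <= max_dist]
```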
  • Patent number: 8620655
    Abstract: A speech processing method, comprising: receiving a speech input which comprises a sequence of feature vectors; determining the likelihood of a sequence of words arising from the sequence of feature vectors using an acoustic model and a language model, comprising: providing an acoustic model for performing speech recognition on an input signal which comprises a sequence of feature vectors, said model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to a feature vector, wherein said speech input is a mismatched speech input which is received from a speaker in an environment which is not matched to the speaker or environment under which the acoustic model was trained; and adapting the acoustic model to the mismatched speech input, the speech processing method further comprising determining the likelihood of a sequence of features occurring in a given language using a language model; and combining the likelihoods determined by the acoustic
    Type: Grant
    Filed: August 10, 2011
    Date of Patent: December 31, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Haitian Xu, Kean Kheong Chin, Mark John Francis Gales
  • Patent number: 8615397
    Abstract: Embodiments of a system for identifying audio content are described. During operation, the system receives a data stream from an electronic device via a communication network. Then, the system distorts a set of target patterns which are used to identify the audio content based on characteristics of the electronic device and/or the communication network. Next, the system identifies the audio content in the data stream based on the set of distorted target patterns.
    Type: Grant
    Filed: April 4, 2008
    Date of Patent: December 24, 2013
    Assignee: Intuit Inc.
    Inventor: Matt E. Hart
  • Patent number: 8612224
    Abstract: A method for identifying a plurality of speakers in audio data and for decoding the speech spoken by said speakers; the method comprising: receiving speech; dividing the speech into segments as it is received; processing the received speech segment by segment in the order received to identify the speaker and to decode the speech, processing comprising: performing primary decoding of the segment using an acoustic model and a language model; obtaining segment parameters indicating the differences between the speaker of the segment and a base speaker during the primary decoding; comparing the segment parameters with a plurality of stored speaker profiles to determine the identity of the speaker, and selecting a speaker profile for said speaker; updating the selected speaker profile; performing a further decoding of the segment using a speaker independent acoustic model, adapted using the updated speaker profile; outputting the decoded speech for the identified speaker, wherein the speaker profiles are upd
    Type: Grant
    Filed: August 23, 2011
    Date of Patent: December 17, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Catherine Breslin, Mark John Francis Gales, Kean Kheong Chin, Katherine Mary Knill
  • Patent number: 8612212
    Abstract: The invention concerns a method and corresponding system for building a phonotactic model for domain independent speech recognition. The method may include recognizing phones from a user's input communication using a current phonotactic model, detecting morphemes (acoustic and/or non-acoustic) from the recognized phones, and outputting the detected morphemes for processing. The method also updates the phonotactic model with the detected morphemes and stores the new model in a database for use by the system during the next user interaction. The method may also include making task-type classification decisions based on the detected morphemes from the user's input communication.
    Type: Grant
    Filed: March 4, 2013
    Date of Patent: December 17, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Giuseppe Riccardi
  • Patent number: 8612223
    Abstract: There is provided a voice processing device. The device includes: score calculation unit configured to calculate a score indicating compatibility of a voice signal input on the basis of an utterance of a user with each of plural pieces of intention information indicating each of a plurality of intentions; intention selection unit configured to select the intention information indicating the intention of the utterance of the user among the plural pieces of intention information on the basis of the score calculated by the score calculation unit; and intention reliability calculation unit configured to calculate the reliability with respect to the intention information selected by the intention selection unit on the basis of the score calculated by the score calculation unit.
    Type: Grant
    Filed: June 17, 2010
    Date of Patent: December 17, 2013
    Assignee: Sony Corporation
    Inventors: Katsuki Minamino, Hitoshi Honda, Yoshinori Maeda, Hiroaki Ogawa
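As a toy illustration of the score-based intention selection with a reliability value described above: the intention with the highest score is chosen, and its reliability is taken as its share of the total score. The normalization is invented for this sketch and is not Sony's patented calculation:

```python
def select_intention(scores):
    """scores: dict mapping intention name -> non-negative score.
    Returns the best-scoring intention and a simple reliability value."""
    best = max(scores, key=scores.get)
    total = sum(scores.values())
    reliability = scores[best] / total if total > 0 else 0.0
    return best, reliability

best, rel = select_intention({"play_music": 6.0, "set_alarm": 3.0, "other": 1.0})
# best == "play_music", rel == 0.6
```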
  • Patent number: 8612211
    Abstract: The subject matter of this specification can be embodied in, among other things, a method that includes receiving two or more data sets each representing speech of a corresponding individual attending an internet-based social networking video conference session, decoding the received data sets to produce corresponding text for each individual attending the internet-based social networking video conference, and detecting characteristics of the session from a coalesced transcript produced from the decoded text of the attending individuals for providing context to the internet-based social networking video conference session.
    Type: Grant
    Filed: January 17, 2013
    Date of Patent: December 17, 2013
    Assignee: Google Inc.
    Inventors: Glen Shires, Sterling Swigart, Jonathan Zolla, Jason J. Gauci
  • Patent number: 8612229
    Abstract: A method (300) and system (100) are provided to add the creation of examples at a developer level in the generation of Natural Language Understanding (NLU) models, tying the examples into an NLU sentence database (130), automatically validating (310) a correct outcome of using the examples, and automatically resolving (316) problems the user encounters when using the examples. The method (300) can convey examples of what a caller can say to a Natural Language Understanding (NLU) application. The method includes entering at least one example associated with an existing routing destination, and ensuring that an NLU model interprets the example unambiguously for correctly routing a call to the routing destination. The method can include presenting the example sentence in a help message (126) within an NLU dialogue as an example of what a caller can say for connecting the caller to a desired routing destination.
    Type: Grant
    Filed: December 15, 2005
    Date of Patent: December 17, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Rajesh Balchandran, Linda M. Boyer, James R. Lewis, Brent D. Metz
  • Publication number: 20130332164
    Abstract: A speech recognition system uses, in one embodiment, an extended phonetic dictionary that is obtained by processing words in a user's set of databases, such as a user's contacts database, with a set of pronunciation guessers. The speech recognition system can use a conventional phonetic dictionary and the extended phonetic dictionary to recognize speech inputs that are user requests to use the contacts database, for example, to make a phone call, etc. The extended phonetic dictionary can be updated in response to changes in the contacts database, and the set of pronunciation guessers can include pronunciation guessers for a plurality of locales, each locale having its own pronunciation guesser.
    Type: Application
    Filed: June 8, 2012
    Publication date: December 12, 2013
    Inventor: Devang K. Naik
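The idea of extending a base phonetic dictionary with guessed pronunciations for contact names, as described above, can be sketched roughly like this. All names here are hypothetical, and the "guesser" is a trivial one-letter-per-phone stand-in, not Apple's locale-aware pronunciation guessers:

```python
def naive_guesser(word):
    """Trivial stand-in guesser: one pseudo-phone per letter."""
    return " ".join(word.lower())

def extend_dictionary(base_dict, contacts, guesser=naive_guesser):
    """Return a copy of base_dict augmented with guessed pronunciations
    for contact names not already present."""
    extended = dict(base_dict)
    for name in contacts:
        extended.setdefault(name, guesser(name))
    return extended

d = extend_dictionary({"call": "K AO L"}, ["Ines", "Xochitl"])
# existing entries are kept; new names get guessed pronunciations
```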
  • Patent number: 8606580
    Abstract: To provide a data process unit and data process unit control program that are suitable for generating acoustic models for unspecified speakers taking distribution of diversifying feature parameters into consideration under such specific conditions as the type of speaker, speech lexicons, speech styles, and speech environment and that are suitable for providing acoustic models intended for unspecified speakers and adapted to speech of a specific person. The data process unit comprises a data classification section, data storing section, pattern model generating section, data control section, mathematical distance calculating section, pattern model converting section, pattern model display section, region dividing section, division changing section, region selecting section, and specific pattern model generating section.
    Type: Grant
    Filed: December 30, 2008
    Date of Patent: December 10, 2013
    Assignee: Asahi Kasei Kabushiki Kaisha
    Inventors: Makoto Shozakai, Goshu Nagino
  • Patent number: 8606578
    Abstract: According to some embodiments, a method and apparatus are provided to buffer N audio frames of a plurality of audio frames associated with an audio signal, pre-compute scores for a subset of context dependent models (CDMs), and perform a graphical model search associated with the N audio frames where a score of a context independent model (CIM) associated with a CDM is used in lieu of a score for the CDM when a score for the CDM is needed and has not been pre-computed.
    Type: Grant
    Filed: June 25, 2009
    Date of Patent: December 10, 2013
    Assignee: Intel Corporation
    Inventors: Michael Eugene Deisher, Tao Ma
  • Patent number: 8606575
    Abstract: Example embodiments of the present invention may include a method that provides transcribing spoken utterances occurring during a call and automatically assigning each of the spoken utterances with a corresponding set of first classifications. The method may also include determining a confidence rating associated with each of the spoken utterances and the assigned set of first classifications, and performing at least one of reclassifying the spoken utterances with new classifications based on at least one additional classification operation, and adding the assigned first classifications and the corresponding plurality of spoken utterances to a training data set.
    Type: Grant
    Filed: September 6, 2011
    Date of Patent: December 10, 2013
    Assignee: West Corporation
    Inventor: Silke Witt-Ehsani
  • Patent number: 8600741
    Abstract: A system and method for tuning a speech recognition engine to an individual microphone using a database containing acoustical models for a plurality of microphones. Microphone performance characteristics are obtained from a microphone at a speech recognition engine, the database is searched for an acoustical model that matches the characteristics, and the speech recognition engine is then modified based on the matching acoustical model.
    Type: Grant
    Filed: August 20, 2008
    Date of Patent: December 3, 2013
    Assignee: General Motors LLC
    Inventors: Gaurav Talwar, Rathinavelu Chengalvarayan, Jesse T. Gratke, Subhash B. Gullapalli, Dana B. Fecher
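The database lookup described above, matching measured microphone characteristics against stored acoustical models, might look like the following sketch. The representation (feature tuples, Euclidean distance) is invented for illustration and is not the patented matching procedure:

```python
import math

def match_acoustic_model(mic_features, model_db):
    """mic_features: tuple of numbers describing the microphone;
    model_db: dict mapping model name -> feature tuple.
    Returns the name of the closest stored model."""
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(model_db, key=lambda name: distance(mic_features, model_db[name]))

db = {"mic_a": (1.0, 0.2), "mic_b": (0.1, 0.9)}
match_acoustic_model((0.9, 0.3), db)  # -> "mic_a"
```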
  • Patent number: 8600748
    Abstract: A system and methods for matching at least one word of an utterance against a set of template hierarchies to select the best matching template or set of templates corresponding to the utterance. The system and methods determine at least one exact, inexact, or partial match between the at least one word of the utterance and at least one term within the template hierarchy to select and populate a template or set of templates corresponding to the utterance. The populated template or set of templates may then be used to generate a narrative template or a report template.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: December 3, 2013
    Assignee: Cyberpulse L.L.C.
    Inventors: James Roberge, Jeffrey Soble
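The exact / inexact / partial match distinction from the abstract above can be illustrated with a small sketch. The terminology is reused from the abstract, but the matching rules here (case-insensitive for "inexact", substring for "partial") are invented for illustration:

```python
def match_term(word, terms):
    """Classify how a word matches a list of template terms:
    'exact' (identical), 'inexact' (case-insensitive),
    'partial' (substring either way), or None."""
    if word in terms:
        return "exact"
    lowered = [t.lower() for t in terms]
    if word.lower() in lowered:
        return "inexact"
    if any(word.lower() in t or t in word.lower() for t in lowered):
        return "partial"
    return None

terms = ["Fracture", "no fracture"]
match_term("Fracture", terms)   # -> "exact"
match_term("fracture", terms)   # -> "inexact"
match_term("fractures", terms)  # -> "partial"
```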
  • Patent number: 8600037
    Abstract: Echo cancellation is handled using a pattern classification technique. During a training phase, the communications device is trained to learn patterns of signal inputs that correspond to certain communication modes. After numerous different patterns have been classified by mode, the classified patterns are used during real-time use of the communications device in order to determine, dynamically, the mode in which the communications device is currently operating. The communications device may then apply, to the microphone-produced signal, a suppression action.
    Type: Grant
    Filed: June 1, 2012
    Date of Patent: December 3, 2013
    Assignee: Apple Inc.
    Inventor: Arvindh Krishnaswamy
  • Patent number: 8583432
    Abstract: Methods and systems for automatic speech recognition and methods and systems for training acoustic language models are disclosed. One system for automatic speech recognition includes a dialect recognition unit and a controller. The dialect recognition unit is configured to analyze acoustic input data to identify portions of the acoustic input data that conform to a general language and to identify portions of the acoustic input data that conform to at least one dialect of the general language. In addition, the controller is configured to apply a general language model and at least one dialect language model to the input data to perform speech recognition by dynamically selecting between the models in accordance with each of the identified portions.
    Type: Grant
    Filed: July 25, 2012
    Date of Patent: November 12, 2013
    Assignee: International Business Machines Corporation
    Inventors: Fadi Biadsy, Lidia Mangu, Hagen Soltau
  • Patent number: 8583436
    Abstract: A word category estimation apparatus (100) includes a word category model (5) which is formed from a probability model having a plurality of kinds of information about a word category as features, and includes information about an entire word category graph as at least one of the features. A word category estimation unit (4) receives the word category graph of a speech recognition hypothesis to be processed, computes scores by referring to the word category model for respective arcs that form the word category graph, and outputs a word category sequence candidate based on the scores.
    Type: Grant
    Filed: December 19, 2008
    Date of Patent: November 12, 2013
    Assignee: NEC Corporation
    Inventors: Hitoshi Yamamoto, Kiyokazu Miki
  • Publication number: 20130297310
    Abstract: This document describes methods, systems, techniques, and computer program products for generating and/or modifying acoustic models. Acoustic models and/or transformations for a target language/dialect can be generated and/or modified using acoustic models and/or transformations from a source language/dialect.
    Type: Application
    Filed: November 8, 2011
    Publication date: November 7, 2013
    Inventors: Eugene Weinstein, Pedro J. Moreno Mangibar
  • Patent number: 8577680
    Abstract: A method, article of manufacture, and apparatus for monitoring data traffic on a network is disclosed. In an embodiment, this includes obtaining intrinsic data from at least a portion of the traffic, obtaining extrinsic data from at least a portion of the traffic, associating the intrinsic data with the extrinsic data, and logging the intrinsic data and extrinsic data. The portion of the traffic from which the intrinsic data and extrinsic data are derived may not be stored, or may be stored in encrypted form.
    Type: Grant
    Filed: December 30, 2006
    Date of Patent: November 5, 2013
    Assignee: EMC Corporation
    Inventors: Christopher Hercules Claudatos, William Dale Andruss, Scott R. Bevan
  • Patent number: 8577681
    Abstract: A method of generating an alternative pronunciation for a word or phrase, given an initial pronunciation and a spoken example of the word or phrase, includes providing the initial pronunciation of the word or phrase, and generating the alternative pronunciation by searching a neighborhood of pronunciations about the initial pronunciation via a constrained hypothesis, wherein the neighborhood includes pronunciations that differ from the initial pronunciation by at most one phoneme. The method further includes selecting a highest scoring pronunciation within the neighborhood of pronunciations.
    Type: Grant
    Filed: September 13, 2004
    Date of Patent: November 5, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Daniel L. Roth, Laurence S. Gillick, Mike Shire
  • Patent number: 8577678
    Abstract: A speech recognition system according to the present invention includes a sound source separating section which separates mixed speeches from multiple sound sources from one another; a mask generating section which generates a soft mask which can take continuous values between 0 and 1 for each frequency spectral component of a separated speech signal using distributions of speech signal and noise against separation reliability of the separated speech signal; and a speech recognizing section which recognizes speeches separated by the sound source separating section using soft masks generated by the mask generating section.
    Type: Grant
    Filed: March 10, 2011
    Date of Patent: November 5, 2013
    Assignee: Honda Motor Co., Ltd.
    Inventors: Kazuhiro Nakadai, Toru Takahashi, Hiroshi Okuno
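A soft mask taking continuous values between 0 and 1 per frequency component, as described in the abstract above, can be sketched as follows. Here a logistic function of a per-bin reliability value stands in for the patented distribution-based rule; the parameters are illustrative:

```python
import math

def soft_mask(reliabilities, slope=4.0, threshold=0.5):
    """Map each separation-reliability value to a continuous mask
    weight in (0, 1): low reliability -> near 0, high -> near 1."""
    return [1.0 / (1.0 + math.exp(-slope * (r - threshold)))
            for r in reliabilities]

mask = soft_mask([0.1, 0.5, 0.9])
# weights rise smoothly with reliability instead of a hard 0/1 cutoff
```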
  • Publication number: 20130289990
    Abstract: A dialog manager and spoken dialog service having a dialog manager generated according to a method comprising selecting a top level flow controller based on application type, selecting available reusable subdialogs for each application part, developing a subdialog for each application part not having an available subdialog, and testing and deploying the spoken dialog service using the selected top level flow controller, selected reusable subdialogs, and developed subdialogs. The dialog manager is capable of handling context shifts in a spoken dialog with a user. Application dependencies are established in the top level flow controller, thus enabling the subdialogs to be reusable and capable of managing context shifts and mixed initiative dialogs.
    Type: Application
    Filed: June 25, 2013
    Publication date: October 31, 2013
    Inventors: Giuseppe Di Fabbrizio, Charles Alfred Lewis