Specialized Equations Or Comparisons Patents (Class 704/236)
  • Patent number: 8364484
    Abstract: An input voice is detected after starting a voice input waiting state; the detected voice is recognized; an elapsed time from the start of the voice input waiting state is counted; an informative sound which urges a user to input the voice is outputted when the elapsed time reaches a preset output set time; and the output of the informative sound is stopped when the elapsed time at the time of inputting the voice is shorter than the output set time.
    Type: Grant
    Filed: April 14, 2009
    Date of Patent: January 29, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Takehide Yano, Tadashi Amada, Kazunori Imoto, Koichi Yamamoto
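    A minimal Python sketch of the elapsed-time prompt logic described in the abstract above: the informative sound is output only when the preset time elapses without voice input, and is stopped when voice arrives earlier. The class structure, tick-based polling and 5-second default are illustrative assumptions, not taken from the patent.

      import time

      class VoicePromptTimer:
          def __init__(self, output_set_time=5.0):
              self.output_set_time = output_set_time   # preset output set time (seconds)
              self.start = None
              self.prompt_active = False

          def start_waiting(self):
              # Entering the voice input waiting state starts the elapsed-time count.
              self.start = time.monotonic()
              self.prompt_active = False

          def on_tick(self):
              # Output the informative sound once the elapsed time reaches the set time.
              if not self.prompt_active and time.monotonic() - self.start >= self.output_set_time:
                  self.prompt_active = True
                  print("informative sound: please speak")

          def on_voice_input(self):
              # Voice arriving before the set time means the prompt is stopped / never output.
              if time.monotonic() - self.start < self.output_set_time:
                  self.prompt_active = False
                  print("prompt stopped; recognizing input")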
  • Patent number: 8359199
    Abstract: A frame erasure concealment technique for a bitstream-based feature extractor in a speech recognition system particularly suited for use in a wireless communication system operates to “delete” each frame in which an erasure is declared. The deletions thus reduce the length of the observation sequence, but have been found to provide for sufficient speech recognition based on both single word and “string” tests of the deletion technique.
    Type: Grant
    Filed: November 29, 2011
    Date of Patent: January 22, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Richard Vandervoort Cox, Hong Kook Kim
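    The deletion step described above lends itself to a very small sketch; the frame and erasure values below are made up for illustration.

      def conceal_erasures(feature_frames, erasure_flags):
          # Drop the feature frames of erased transmission frames, shortening the
          # observation sequence rather than interpolating the missing data.
          return [f for f, erased in zip(feature_frames, erasure_flags) if not erased]

      frames = [[0.1, 0.2], [0.0, 0.0], [0.3, 0.1], [0.0, 0.0], [0.2, 0.4]]
      erased = [False, True, False, True, False]
      print(conceal_erasures(frames, erased))   # three frames survive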
  • Patent number: 8352257
    Abstract: The present system proposes a technique called the spectro-temporal varying technique to compute the suppression gain. This method is motivated by the perceptual properties of the human auditory system; specifically, that the human ear has higher frequency resolution in the lower frequency bands and less frequency resolution in the higher ones, and also that the important speech information in the high frequencies is carried by consonants, which usually have a random, noise-like spectral shape. A second property of the human auditory system is that the human ear has lower temporal resolution in the lower frequencies and higher temporal resolution in the higher frequencies. Based on that, the system uses a spectro-temporal varying method which introduces the concept of frequency smoothing by modifying the estimation of the a posteriori SNR. In addition, the system also makes the a priori SNR time-smoothing factor depend on frequency.
    Type: Grant
    Filed: December 20, 2007
    Date of Patent: January 8, 2013
    Assignee: QNX Software Systems Limited
    Inventors: Phil A. Hetherington, Xueman Li
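    One plausible reading of the frequency-smoothed a posteriori SNR and the frequency-dependent a priori SNR time-smoothing can be sketched as follows; the smoothing window, the alpha range and the Wiener-type gain are assumptions for illustration, not the patented formulas.

      import numpy as np

      def suppression_gains(noisy_power, noise_power, prev_clean_power,
                            alpha_low=0.98, alpha_high=0.90, smooth_bins=3):
          # A posteriori SNR, smoothed across neighbouring frequency bins.
          post_snr = noisy_power / np.maximum(noise_power, 1e-12)
          kernel = np.ones(smooth_bins) / smooth_bins
          post_snr = np.convolve(post_snr, kernel, mode="same")

          # Decision-directed a priori SNR with a frequency-dependent
          # time-smoothing factor (heavier smoothing at low frequencies).
          alpha = np.linspace(alpha_low, alpha_high, len(noisy_power))
          prio_snr = (alpha * prev_clean_power / np.maximum(noise_power, 1e-12)
                      + (1.0 - alpha) * np.maximum(post_snr - 1.0, 0.0))

          # Wiener-type suppression gain per frequency bin.
          return prio_snr / (1.0 + prio_snr)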
  • Patent number: 8351706
    Abstract: Document data corresponding to each page included in a document is stored, and furthermore, feature data indicative of a feature of the document data and a document index indicating the document are associated with the document data. A document extracting apparatus obtains input document data, calculates feature data from the input document data, judges similarity between the input document data and the stored document data based on the feature data, obtains a document index associated with document data similar to the input document data, and extracts a plurality of pieces of document data associated with the document index. Thus, document data for the document that includes a page corresponding to the document data similar to the input document data is extracted, covering a plurality of pages.
    Type: Grant
    Filed: July 23, 2008
    Date of Patent: January 8, 2013
    Assignee: Sharp Kabushiki Kaisha
    Inventor: Hitoshi Hirohata
  • Patent number: 8352263
    Abstract: The invention can recognize all languages and input words. It needs m unknown voices to represent m categories of known words with similar pronunciations. Words can be pronounced in any language, dialect or accent. Each will be classified into one of the m categories represented by its most similar unknown voice. When a user pronounces a word, the invention finds its F most similar unknown voices. All words in the F categories represented by those F unknown voices will be arranged according to their pronunciation similarity and alphabetic letters. The pronounced word should be among the top words. Since we only find the F most similar unknown voices from m (=500) unknown voices, and since the same word can be classified into several categories, our recognition method is stable for all users and can quickly and accurately recognize all languages (English, Chinese, etc.) and accept many more input words without using samples.
    Type: Grant
    Filed: September 29, 2009
    Date of Patent: January 8, 2013
    Inventors: Tze-Fen Li, Tai-Jan Lee Li, Shih-Tzung Li, Shih-Hon Li, Li-Chuan Liao
  • Publication number: 20130006629
    Abstract: The present invention relates to a searching device, searching method, and program whereby searching for a word string corresponding to input voice can be performed in a robust manner. A voice recognition unit 11 subjects an input voice to voice recognition. A matching unit 16 matches, for each of multiple search-result word strings (the word strings that are candidates to be returned for the input voice), a search-result pronunciation symbol string, which is an array of pronunciation symbols expressing the pronunciation of that word string, against a recognition-result pronunciation symbol string, which is an array of pronunciation symbols expressing the pronunciation of the voice recognition result of the input voice.
    Type: Application
    Filed: December 2, 2010
    Publication date: January 3, 2013
    Applicant: SONY CORPORATION
    Inventors: Hitoshi Honda, Yoshinori Maeda, Satoshi Asakawa
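    The pronunciation-symbol matching can be pictured with a simple token-level edit distance; the phoneme strings and candidate titles below are invented, and the actual matching measure used by the application may differ.

      def edit_distance(a, b):
          # Levenshtein distance between two pronunciation-symbol sequences.
          d = list(range(len(b) + 1))
          for i, ca in enumerate(a, 1):
              prev, d[0] = d[0], i
              for j, cb in enumerate(b, 1):
                  prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (ca != cb))
          return d[-1]

      def rank_search_results(recognition_pron, candidates):
          # Rank candidate word strings by how closely their pronunciation-symbol
          # strings match the recognition-result pronunciation-symbol string.
          scored = [(edit_distance(recognition_pron.split(), pron.split()), words)
                    for words, pron in candidates]
          return [words for _, words in sorted(scored)]

      candidates = [("world news", "w er l d n uw z"), ("word nest", "w er d n eh s t")]
      print(rank_search_results("w er d n uw z", candidates))   # "world news" ranks first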
  • Patent number: 8346549
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving automatic speech recognition performance. A system practicing the method identifies idle speech recognition resources and establishes a supplemental speech recognizer on the idle resources based on overall speech recognition demand. The supplemental speech recognizer can differ from a main speech recognizer, and, along with the main speech recognizer, can be associated with a particular speaker. The system performs speech recognition on speech received from the particular speaker with the main speech recognizer and the supplemental speech recognizer in parallel and combines results from the main and supplemental speech recognizers. The system recognizes the received speech based on the combined results. The system can use beam adjustment in place of or in combination with a supplemental speech recognizer.
    Type: Grant
    Filed: December 4, 2009
    Date of Patent: January 1, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Mazin Gilbert
  • Patent number: 8346550
    Abstract: An automatic speech recognition (ASR) system and method is provided for controlling the recognition of speech utterances generated by an end user operating a communications device. The ASR system and method can be used with a mobile device that is used in a communications network. The ASR system can be used for ASR of speech utterances input into a mobile device, to perform compensating techniques using at least one characteristic, and to update an ASR speech recognizer associated with the ASR system by determining and using a background noise value and a distortion value that are based on the features of the mobile device. The ASR system can be used to augment a limited data input capability of a mobile device, for example, caused by limited input devices physically located on the mobile device.
    Type: Grant
    Filed: February 14, 2011
    Date of Patent: January 1, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Richard C. Rose, Sarangarajan Pathasarathy, Aaron Edward Rosenberg, Shrikanth Sambasivan Narayanan
  • Patent number: 8340968
    Abstract: A computer-implemented method for automatically training the diction of a person acquires a speech data stream of the person as the person is speaking, compares the words in the speech data stream to a set of predefined undesirable phrases provided in a look-up table and, upon detection of one of the predefined undesirable phrases in the speech data stream, alerts the person by an alarm.
    Type: Grant
    Filed: January 9, 2008
    Date of Patent: December 25, 2012
    Assignee: Lockheed Martin Corporation
    Inventor: Vladimir Gershman
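    A small sketch of the look-up-table check on a running transcript; the phrase table and the alert mechanism are placeholders, not taken from the patent.

      UNDESIRABLE = {"um", "uh", "you know"}          # hypothetical look-up table

      def monitor_diction(transcript_words, max_phrase_len=2, alert=lambda p: print("alarm:", p)):
          # Scan the recognized word stream for any phrase in the look-up table
          # and raise an alarm when one is detected.
          words = [w.lower() for w in transcript_words]
          for n in range(1, max_phrase_len + 1):
              for i in range(len(words) - n + 1):
                  phrase = " ".join(words[i:i + n])
                  if phrase in UNDESIRABLE:
                      alert(phrase)

      monitor_diction("So um I was you know thinking".split())   # alarms on "um" and "you know"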
  • Publication number: 20120323573
    Abstract: A method for scoring non-native speech includes receiving a speech sample spoken by a non-native speaker and performing automatic speech recognition and metric extraction on the speech sample to generate a transcript of the speech sample and a speech metric associated with the speech sample. The method further includes determining whether the speech sample is scorable or non-scorable based upon the transcript and speech metric, where the determination is based on an audio quality of the speech sample, an amount of speech of the speech sample, a degree to which the speech sample is off-topic, whether the speech sample includes speech from an incorrect language, or whether the speech sample includes plagiarized material. When the sample is determined to be non-scorable, an indication of non-scorability is associated with the speech sample. When the sample is determined to be scorable, the sample is provided to a scoring model for scoring.
    Type: Application
    Filed: March 23, 2012
    Publication date: December 20, 2012
    Inventors: Su-Youn Yoon, Derrick Higgins, Klaus Zechner, Shasha Xie, Je Hun Jeon, Keelan Evanini
  • Patent number: 8332221
    Abstract: The invention relates to a method, a computer program product, a segmentation system and a user interface for structuring an unstructured text by making use of statistical models trained on annotated training data. The method performs text segmentation into text sections and assigns labels to text sections as section headings. The performed segmentation and assignment is provided to a user for general review. Additionally, alternative segmentations and label assignments are provided to the user, who can select alternative segmentations and alternative labels as well as enter a user-defined segmentation and user-defined label. In response to the modifications introduced by the user, a plurality of different actions are initiated incorporating the re-segmentation and re-labeling of successive parts of the document or the entire document.
    Type: Grant
    Filed: August 15, 2011
    Date of Patent: December 11, 2012
    Assignee: Nuance Communications Austria GmbH
    Inventors: Jochen Peters, Evgeny Matusov, Carsten Meyer, Dietrich Klakow
  • Patent number: 8331583
    Abstract: A noise reducing apparatus includes: a voice signal inputting unit inputting an input voice signal; a noise occurrence period detecting unit detecting a noise occurrence period; a noise removing unit removing a noise for the noise occurrence period; a generation source signal acquiring unit acquiring a generation source signal with a time duration corresponding to the noise occurrence period; a pitch calculating unit calculating a pitch of an input voice signal interval; an interval signal setting unit setting interval signals divided in each unit period interval; an interpolation signal generating unit generating an interpolation signal with the time duration corresponding to the noise occurrence period and alternately arranging the interval signal in a forward time direction and the interval signal in a backward time direction; and a combining unit combining the interpolation signal and the input voice signal, from which the noise is removed.
    Type: Grant
    Filed: February 18, 2010
    Date of Patent: December 11, 2012
    Assignee: Sony Corporation
    Inventor: Kazuhiko Ozawa
  • Patent number: 8321219
    Abstract: Embodiments of the present invention improve methods of performing speech recognition using human gestures. In one embodiment, the present invention includes a speech recognition method comprising detecting a gesture, selecting a first recognition set based on the gesture, receiving a speech input signal, and recognizing the speech input signal in the context of the first recognition set.
    Type: Grant
    Filed: September 25, 2008
    Date of Patent: November 27, 2012
    Assignee: Sensory, Inc.
    Inventor: Todd F. Mozer
  • Publication number: 20120296638
    Abstract: In embodiments of the present invention, capabilities are described for understanding and responding to user intent and questions quickly, wherein the understanding is based on supervised system learning, intelligent layered semantic and syntactic information processing, and a personalized adaptive semantic interface. Supervised system learning creates a reference pattern set for the intent repository and possible question categories. Each layer in the layered processing increases the probability of the intent/question recognition. The personalized adaptive voice interface learns from the user's interactions over time by enriching the pattern sets and personal index for successfully resolved user intents and questions. Collectively, all these technologies improve the response time for correctly recognizing and responding to the user's intents and questions.
    Type: Application
    Filed: May 18, 2012
    Publication date: November 22, 2012
    Inventor: Ashish Patwa
  • Publication number: 20120296648
    Abstract: Systems and methods for identifying the N-best strings of a weighted automaton. A potential for each state of an input automaton to a set of destination states of the input automaton is first determined. Then, the N-best paths are found in the result of an on-the-fly determinization of the input automaton. Only the portion of the input automaton needed to identify the N-best paths is determinized. As the input automaton is determinized, a potential for each new state of the partially determinized automaton is determined and is used in identifying the N-best paths of the determinized automaton, which correspond exactly to the N-best strings of the input automaton.
    Type: Application
    Filed: July 30, 2012
    Publication date: November 22, 2012
    Applicant: AT&T Corp.
    Inventors: Mehryar Mohri, Michael Dennis Riley
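    The role of the state potentials can be sketched with an A*-style N-best search in the tropical (min, +) semiring, where a state's potential is its shortest distance to a final state; the on-the-fly determinization of the described method is omitted here, and the toy automaton is invented.

      import heapq
      from collections import defaultdict

      def n_best_paths(arcs, start, finals, n):
          # arcs: state -> list of (label, weight, next_state).
          # Potentials: shortest distance from every state to a final state (backward Dijkstra).
          rev = defaultdict(list)
          for s, outs in arcs.items():
              for _, w, t in outs:
                  rev[t].append((w, s))
          pot = {f: 0.0 for f in finals}
          heap = [(0.0, f) for f in finals]
          while heap:
              d, s = heapq.heappop(heap)
              if d > pot.get(s, float("inf")):
                  continue
              for w, p in rev[s]:
                  if d + w < pot.get(p, float("inf")):
                      pot[p] = d + w
                      heapq.heappush(heap, (d + w, p))

          # Forward A*: priority = accumulated weight + potential of the current state,
          # so complete paths are popped in order of total weight.
          results = []
          heap = [(pot.get(start, float("inf")), 0.0, start, [])]
          while heap and len(results) < n:
              _, cost, s, labels = heapq.heappop(heap)
              if s in finals:
                  results.append((cost, " ".join(labels)))
              for label, w, t in arcs.get(s, []):
                  heapq.heappush(heap, (cost + w + pot.get(t, float("inf")),
                                        cost + w, t, labels + [label]))
          return results

      arcs = {0: [("a", 1.0, 1), ("b", 3.0, 2)], 1: [("c", 1.0, 3)], 2: [("c", 0.5, 3)]}
      print(n_best_paths(arcs, start=0, finals={3}, n=2))   # [(2.0, 'a c'), (3.5, 'b c')]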
  • Publication number: 20120290302
    Abstract: A Chinese speech recognition system and method is disclosed. Firstly, a speech signal is received and recognized to output a word lattice. Next, the word lattice is received, and word arcs of the word lattice are rescored and reranked with a prosodic break model, a prosodic state model, a syllable prosodic-acoustic model, a syllable-juncture prosodic-acoustic model and a factored language model, so as to output a language tag, a prosodic tag and a phonetic segmentation tag, which correspond to the speech signal. The present invention performs rescoring in a two-stage way to promote the recognition rate of basic speech information and labels the language tag, prosodic tag and phonetic segmentation tag to provide the prosodic structure and language information for the rear-stage voice conversion and voice synthesis.
    Type: Application
    Filed: April 13, 2012
    Publication date: November 15, 2012
    Inventors: Jyh-Her YANG, Chen-Yu Chiang, Ming-Chieh Liu, Yih-Ru Wang, Yuan-Fu Liao, Sin-Horng Chen
  • Patent number: 8311842
    Abstract: A method and apparatus for expanding a bandwidth of an input narrowband voice signal is provided. The narrowband voice signal is analyzed separately for each frame, and a Degree of Voicing (DV) and a Degree of Stationary (DS) are calculated depending on the analysis. A Degree of Difficulty of Bandwidth Expansion (DDBWE) of the narrowband voice signal is calculated based on DV and DS. Bandwidth expansion is controlled according to DDBWE.
    Type: Grant
    Filed: March 3, 2008
    Date of Patent: November 13, 2012
    Assignee: Samsung Electronics Co., Ltd
    Inventors: Geun-Bae Song, Min-Sung Kim, Hee-Jin Oh, Austin Kim, Jae-Bum Kim
  • Patent number: 8306817
    Abstract: In an automatic speech recognition system, a feature extractor extracts features from a speech signal, and speech is recognized by the automatic speech recognition system based on the extracted features. Noise reduction as part of the feature extractor is provided by feature enhancement in which feature-domain noise reduction in the form of Mel-frequency cepstra is provided based on the minimum mean square error criterion. Specifically, the devised method takes into account the random phase between the clean speech and the mixing noise. The feature-domain noise reduction is performed in a dimension-wise fashion to the individual dimensions of the feature vectors input to the automatic speech recognition system, in order to perform environment-robust speech recognition.
    Type: Grant
    Filed: January 8, 2008
    Date of Patent: November 6, 2012
    Assignee: Microsoft Corporation
    Inventors: Dong Yu, Alejandro Acero, James G. Droppo, Li Deng
  • Patent number: 8306818
    Abstract: Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.
    Type: Grant
    Filed: April 15, 2008
    Date of Patent: November 6, 2012
    Assignee: Microsoft Corporation
    Inventors: Ciprian Chelba, Alejandro Acero, Milind Mahajan
  • Publication number: 20120278075
    Abstract: A system and method for collecting, from an ASR, a first rating of the intelligibility of human speech, and collecting a second intelligibility rating of such speech from networked listeners to that speech. The first rating and the second rating are weighted based on their importance to a user of the ratings, and a third rating is created from the two weighted ratings.
    Type: Application
    Filed: April 25, 2012
    Publication date: November 1, 2012
    Inventors: Sherrie Ellen Shammass, Eyal Eshed, Ariel Velikovsky
  • Publication number: 20120253806
    Abstract: A system and method for distributed speech recognition is provided. Audio data is obtained from a caller participating in a call with an agent. A main recognizer receives a main grammar template and the audio data. A plurality of secondary recognizers each receive the audio data and a reference that identifies a secondary grammar, which is a non-overlapping section of the main grammar template. Speech recognition is performed on each of the secondary recognizers and speech recognition results are identified by applying the secondary grammar to the audio data. An n number of most likely speech recognition results are selected. The main recognizer constructs a new grammar based on the main grammar template using the speech recognition results from each of the secondary recognizers as a new vocabulary. Further speech recognition results are identified by applying the new grammar to the audio data.
    Type: Application
    Filed: June 18, 2012
    Publication date: October 4, 2012
    Inventor: Gilad Odinak
  • Publication number: 20120253807
    Abstract: A speaker state detecting apparatus comprises: an audio input unit for acquiring, at least, a first voice emanated by a first speaker and a second voice emanated by a second speaker; a speech interval detecting unit for detecting an overlap period between a first speech period of the first speaker included in the first voice and a second speech period of the second speaker included in the second voice, which starts before the first speech period, or an interval between the first speech period and the second speech period; a state information extracting unit for extracting state information representing a state of the first speaker from the first speech period; and a state detecting unit for detecting the state of the first speaker in the first speech period based on the overlap period or the interval and the first state information.
    Type: Application
    Filed: February 3, 2012
    Publication date: October 4, 2012
    Applicant: FUJITSU LIMITED
    Inventor: Akira KAMANO
  • Publication number: 20120253805
    Abstract: Systems, methods, and media for determining fraud risk from audio signals and non-audio data are provided herein. Some exemplary methods include receiving an audio signal and an associated audio signal identifier, receiving a fraud event identifier associated with a fraud event, determining a speaker model based on the received audio signal, determining a channel model based on a path of the received audio signal, using a server system, updating a fraudster channel database to include the determined channel model based on a comparison of the audio signal identifier and the fraud event identifier, and updating a fraudster voice database to include the determined speaker model based on a comparison of the audio signal identifier and the fraud event identifier.
    Type: Application
    Filed: March 8, 2012
    Publication date: October 4, 2012
    Inventors: Anthony Rajakumar, Torsten Zeppenfeld, Lisa Guerra, Vipul Vyas
  • Patent number: 8265932
    Abstract: A system and method for identifying audio command prompts for use in a voice response environment is provided. A signature is generated for audio samples each having preceding audio, reference phrase audio, and trailing audio segments. The trailing segment is removed and each of the preceding and reference phrase segments are divided into buffers. The buffers are transformed into discrete Fourier transform buffers. One of the discrete Fourier transform buffers from the reference phrase segment that is dissimilar to each of the discrete Fourier transform buffers from the preceding segment is selected as the signature. Audio command prompts are processed to generate a discrete Fourier transform. Each discrete Fourier transform for the audio command prompts is compared with each of the signatures and a correlation value is determined. One such audio command prompt matches one such signature when the correlation value for that audio command prompt satisfies a threshold.
    Type: Grant
    Filed: October 3, 2011
    Date of Patent: September 11, 2012
    Assignee: Intellisist, Inc.
    Inventor: Martin R. M. Dunsmuir
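    A compact sketch of the signature selection and matching described above; the buffer length, the use of magnitude spectra and the correlation threshold are illustrative assumptions.

      import numpy as np

      def dft_buffers(audio, buf_len=256):
          n = len(audio) // buf_len
          return [np.abs(np.fft.rfft(audio[i * buf_len:(i + 1) * buf_len])) for i in range(n)]

      def build_signature(preceding, reference_phrase, buf_len=256):
          # Pick the reference-phrase DFT buffer least similar to every buffer of
          # the preceding audio segment.
          pre = dft_buffers(preceding, buf_len)
          ref = dft_buffers(reference_phrase, buf_len)
          worst_similarity = lambda b: max(abs(np.corrcoef(b, p)[0, 1]) for p in pre)
          return min(ref, key=worst_similarity)

      def prompt_matches(prompt_audio, signature, threshold=0.8, buf_len=256):
          # A command prompt matches when any of its DFT buffers correlates with
          # the signature above the threshold.
          return any(abs(np.corrcoef(b, signature)[0, 1]) >= threshold
                     for b in dft_buffers(prompt_audio, buf_len))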
  • Patent number: 8255216
    Abstract: A method of and a system for processing speech. A spoken utterance of a plurality of characters can be received. A plurality of known character sequences that potentially correspond to the spoken utterance can be selected. Each selected known character sequence can be scored based on, at least in part, a weighting of individual characters that comprise the known character sequence.
    Type: Grant
    Filed: October 30, 2006
    Date of Patent: August 28, 2012
    Assignee: Nuance Communications, Inc.
    Inventor: Kenneth D. White
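    The per-character weighting can be pictured as below; the weight table (down-weighting acoustically confusable letters) and the scoring rule are invented for illustration and are not the patented scheme.

      # Hypothetical per-character weights: confusable letters contribute less.
      CHAR_WEIGHT = {"b": 0.4, "d": 0.4, "e": 0.5, "p": 0.4, "t": 0.5, "v": 0.4}

      def score_sequence(recognized, known):
          # Score a known character sequence against the per-character recognition
          # result, weighting each position by the known character's weight.
          score = 0.0
          for r, k in zip(recognized, known):
              w = CHAR_WEIGHT.get(k.lower(), 1.0)
              score += w if r.lower() == k.lower() else -w
          return score / max(len(known), 1)

      candidates = ["BD42X", "BT42X", "PD42X"]
      recognized = "BD42X"                      # character-level ASR output
      print(max(candidates, key=lambda k: score_sequence(recognized, k)))   # BD42X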
  • Publication number: 20120209603
    Abstract: Techniques for acoustic voice activity detection (AVAD) is described, including detecting a signal associated with a subband from a microphone, performing an operation on data associated with the signal, the operation generating a value associated with the subband, and determining whether the value distinguishes the signal from noise by using the value to determine a signal-to-noise ratio and comparing the value to a threshold.
    Type: Application
    Filed: January 9, 2012
    Publication date: August 16, 2012
    Inventor: Zhinian Jing
  • Publication number: 20120197641
    Abstract: A signal portion is extracted from an input signal for each frame having a specific duration to generate a per-frame input signal. The per-frame input signal in a time domain is converted into a per-frame input signal in a frequency domain, thereby generating a spectral pattern. Subband average energy is derived in each of the subbands, which are adjacent to one another in the spectral pattern. The subband average energy is compared in at least one subband pair of a first subband and a second subband that is a higher frequency band than the first subband, the first and second subbands being consecutive subbands in the spectral pattern. It is determined that the per-frame input signal includes a consonant segment if the subband average energy of the second subband is higher than the subband average energy of the first subband.
    Type: Application
    Filed: February 1, 2012
    Publication date: August 2, 2012
    Applicant: JVC KENWOOD Corporation
    Inventors: Akiko Akechi, Takaaki Yamabe
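    A rough sketch of the subband-energy comparison for one frame; the band edges and the sample rate are illustrative choices, not taken from the application.

      import numpy as np

      def is_consonant_frame(frame, sample_rate=16000, band_edges=(1000, 2500, 4000, 8000)):
          # Spectral pattern of the per-frame input signal.
          spectrum = np.abs(np.fft.rfft(frame)) ** 2
          freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
          # Subband average energies over consecutive bands.
          edges = (0,) + band_edges
          avg = [spectrum[(freqs >= lo) & (freqs < hi)].mean()
                 for lo, hi in zip(edges[:-1], edges[1:])]
          # Consonant segment if a higher band carries more average energy than the band below it.
          return any(hi > lo for lo, hi in zip(avg[:-1], avg[1:]))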
  • Patent number: 8233590
    Abstract: The present invention relates to a method of automatically controlling the volume level of communication speech for Mean Opinion Score (MOS) measurement, which, before evaluating the quality of communication speech using a MOS measurement method, automatically controls the volume level of actual communication speech to a predetermined optimal level, thus improving the reliability of MOS values.
    Type: Grant
    Filed: November 28, 2006
    Date of Patent: July 31, 2012
    Assignee: Innowireless Co., Ltd.
    Inventors: Jong Tae Chung, Jin Soup Joung, Young Su Kwak, Jin Man Kim, Hyun Seok Cho
  • Patent number: 8229744
    Abstract: A method, system, and computer program for class detection and time-mediated averaging of class-dependent models. A technique is described to take advantage of gender information in training data and how to obtain female, male, and gender-independent models from this information. By using a probability value to average male and female Gaussian Mixture Models (GMMs), dramatic deterioration in cross-gender decoding performance is avoided.
    Type: Grant
    Filed: August 26, 2003
    Date of Patent: July 24, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Satyanarayana Dharanipragada, Peder A. Olsen
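    The probability-weighted averaging of the class-dependent models can be sketched as follows; mixing frame likelihoods with a temporally smoothed gender posterior is one assumed reading of "time mediated averaging", and the smoothing constant is illustrative.

      import numpy as np

      def blended_log_likelihood(ll_male, ll_female, p_female, smooth=0.9):
          # Per frame, mix the male and female GMM likelihoods with a smoothed
          # gender posterior instead of hard-selecting one gender model.
          total, p = 0.0, p_female[0]
          for m, f, p_t in zip(ll_male, ll_female, p_female):
              p = smooth * p + (1.0 - smooth) * p_t
              total += np.log(p * np.exp(f) + (1.0 - p) * np.exp(m))
          return total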
  • Publication number: 20120185251
    Abstract: A method and system for candidate matching, such as used in match-making services, assesses narrative responses to measure candidate qualities. A candidate database includes self-assessment data and narrative data. Narrative data concerning a defined topic is analyzed to determine candidate qualities separate from topical information. Candidate qualities thus determined are included in candidate profiles and used to identify desirable candidates.
    Type: Application
    Filed: March 26, 2012
    Publication date: July 19, 2012
    Applicant: HOSHIKO LLC
    Inventor: Gary Stephen Shuster
  • Patent number: 8224644
    Abstract: Embodiments are provided for utilizing a client-side cache for utterance processing to facilitate network based speech recognition. An utterance comprising a query is received in a client computing device. The query is sent from the client to a network server for results processing. The utterance is processed to determine a speech profile. A cache lookup is performed based on the speech profile to determine whether results data for the query is stored in the cache. If the results data is stored in the cache, then a query is sent to cancel the results processing on the network server and the cached results data is displayed on the client computing device.
    Type: Grant
    Filed: December 18, 2008
    Date of Patent: July 17, 2012
    Assignee: Microsoft Corporation
    Inventors: Andrew K. Krumel, Shuangyu Chang, Robert L. Chambers
  • Patent number: 8219386
    Abstract: The Arabic poetry meter identification system and method produces coded Al-Khalyli transcriptions of Arabic poetry. The meters (Wazn, Awzan) of the Arabic poem units (Bayt, Abyate) are identified. A spoken or written poem is accepted as input. A coded transcription of the poetry pattern forms is produced from input processing. The system identifies and distinguishes between proper spoken poetic meter and improper poetic meter. Errors in the poem meters (Bahr, Buhur) and the ending rhyme pattern ("Qafiya") are detected and verified. The system accepts user selection of a desired poem meter and then interactively aids the user in the composition of poetry in the selected meter, suggesting alternative words and word groups that follow the desired poem pattern and dactyl components. The system can be in a stand-alone device or integrated with other computing devices.
    Type: Grant
    Filed: January 21, 2009
    Date of Patent: July 10, 2012
    Assignee: King Fahd University of Petroleum and Minerals
    Inventors: Al-Zahrani Abdul Kareem Saleh, Moustafa Elshafei
  • Patent number: 8214211
    Abstract: In a voice processing device, a male voice index calculator calculates a male voice index indicating a similarity of the input sound relative to a male speaker sound model. A female voice index calculator calculates a female voice index indicating a similarity of the input sound relative to a female speaker sound model. A first discriminator discriminates the input sound between a non-human-voice sound and a human voice sound, which may be either a male voice sound or a female voice sound. A second discriminator discriminates the input sound between the male voice sound and the female voice sound, based on the male voice index and the female voice index, in case the first discriminator determines that the input sound is a human voice sound.
    Type: Grant
    Filed: August 26, 2008
    Date of Patent: July 3, 2012
    Assignee: Yamaha Corporation
    Inventor: Yasuo Yoshioka
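    The two-stage decision can be pictured as below; using the larger of the two indices for the human/non-human stage and the 0.5 threshold are assumptions made for illustration only.

      def classify_sound(male_index, female_index, voice_threshold=0.5):
          # Stage 1: human voice vs. non-human sound (approximated here by
          # thresholding the larger similarity index).
          if max(male_index, female_index) < voice_threshold:
              return "non-human sound"
          # Stage 2: male vs. female, only for sounds judged to be human voice.
          return "male voice" if male_index >= female_index else "female voice"

      print(classify_sound(0.2, 0.1))   # non-human sound
      print(classify_sound(0.4, 0.7))   # female voice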
  • Patent number: 8214210
    Abstract: A system for processing a query operates by receiving a first query segment that includes audio speech. Next, the system generates a representation for this first query segment, where the representation includes at least two paths associated with alternative phrase sequences for an ambiguity in the audio speech. The system then compares the paths in the representation to a group of documents and determines matching scores for the group of documents based on the comparisons. Finally, the system presents a ranking of the group of documents, where the ranking is based on the matching scores for the group of documents.
    Type: Grant
    Filed: September 19, 2006
    Date of Patent: July 3, 2012
    Assignee: Oracle America, Inc.
    Inventor: William A. Woods
  • Patent number: 8209172
    Abstract: Pattern recognition capable of robust identification for the variance of an input pattern is performed with a low processing cost while the possibility of identification errors is decreased. In a pattern recognition apparatus which identifies the pattern of input data from a data input unit (11) by using a hierarchical feature extraction processor (12) which hierarchically extracts features, an extraction result distribution analyzer (13) analyzes a distribution of at least one feature extraction result obtained by a primary feature extraction processor (121). On the basis of the analytical result, a secondary feature extraction processor (122) performs predetermined secondary feature extraction.
    Type: Grant
    Filed: December 16, 2004
    Date of Patent: June 26, 2012
    Assignee: Canon Kabushiki Kaisha
    Inventors: Yusuke Mitarai, Masakazu Matsuga, Katsuhiko Mori
  • Patent number: 8204738
    Abstract: A method of removing bias from an action classifier within a natural language understanding system can include identifying a sentence having a target embedded grammar that overlaps with at least one other embedded grammar and selecting a group of overlapping embedded grammars including the target embedded grammar and at least one additional embedded grammar. A sentence expansion can be created that includes the sentence including the target embedded grammar and a copy of the sentence for each additional embedded grammar of the group. Each copy of the sentence can include a different additional embedded grammar from the group in place of the target embedded grammar. The sentence expansion can be included within action classifier training data.
    Type: Grant
    Filed: November 3, 2006
    Date of Patent: June 19, 2012
    Assignee: Nuance Communications, Inc.
    Inventor: Ilya Skuratovsky
  • Publication number: 20120150539
    Abstract: The method of the present invention may include receiving a speech feature vector converted from a speech signal; performing a first search by applying a first language model to the received speech feature vector and outputting a word lattice and a first acoustic score of the word lattice as a continuous speech recognition result; outputting a second acoustic score as a phoneme recognition result by applying an acoustic model to the speech feature vector; comparing the first acoustic score of the continuous speech recognition result with the second acoustic score of the phoneme recognition result; outputting a first language model weight when the first acoustic score of the continuous speech recognition result is better than the second acoustic score of the phoneme recognition result; and performing a second search by applying a second language model weight, which is the same as the output first language model weight, to the word lattice.
    Type: Application
    Filed: December 13, 2011
    Publication date: June 14, 2012
    Applicant: Electronics and Telecommunications Research Institute
    Inventors: Hyung Bae Jeon, Yun Keun Lee, Eui Sok Chung, Jong Jin Kim, Hoon Chung, Jeon Gue Park, Ho Young Jung, Byung Ok Kang, Ki Young Park, Sung Joo Lee, Jeom Ja Kang, Hwa Jeon Song
  • Patent number: 8200487
    Abstract: The invention relates to a method, a computer program product, a segmentation system and a user interface for structuring an unstructured text by making use of statistical models trained on annotated training data. The method performs text segmentation into text sections and assigns labels to text sections as section headings. The performed segmentation and assignment is provided to a user for general review. Additionally, alternative segmentations and label assignments are provided to the user, who can select alternative segmentations and alternative labels as well as enter a user-defined segmentation and user-defined label. In response to the modifications introduced by the user, a plurality of different actions are initiated incorporating the re-segmentation and re-labelling of successive parts of the document or the entire document.
    Type: Grant
    Filed: November 12, 2004
    Date of Patent: June 12, 2012
    Assignee: Nuance Communications Austria GmbH
    Inventors: Jochen Peters, Evgeny Matusov, Carsten Meyer, Dietrich Klakow
  • Patent number: 8200486
    Abstract: Method and system for processing and identifying a sub-audible signal formed by a source of sub-audible sounds. Sequences of samples of sub-audible sound patterns (“SASPs”) for known words/phrases in a selected database are received for overlapping time intervals, and Signal Processing Transforms (“SPTs”) are formed for each sample, as part of a matrix of entry values. The matrix is decomposed into contiguous, non-overlapping two-dimensional cells of entries, and neural net analysis is applied to estimate reference sets of weight coefficients that provide sums with optimal matches to reference sets of values. The reference sets of weight coefficients are used to determine a correspondence between a new (unknown) word/phrase and a word/phrase in the database.
    Type: Grant
    Filed: June 5, 2003
    Date of Patent: June 12, 2012
    Assignee: The United States of America as represented by the Administrator of the National Aeronautics & Space Administration (NASA)
    Inventors: Charles C. Jorgensen, Diana D. Lee, Shane T. Agabon
  • Patent number: 8200488
    Abstract: The invention provides a method for processing speech comprising the steps of receiving a speech input (SI) of a speaker, generating speech parameters (SP) from said speech input (SI), determining parameters describing an absolute loudness (L) of said speech input (SI), and evaluating (EV) said speech input (SI) and/or said speech parameters (SP) using said parameters describing the absolute loudness (L). In particular, the step of evaluation (EV) comprises a step of emotion recognition and/or speaker identification. Further, a microphone array comprising a plurality of microphones is used for determining said parameters describing the absolute loudness. With a microphone array the distance of the speaker from the microphone array can be determined and the loudness can be normalized by the distance.
    Type: Grant
    Filed: December 10, 2003
    Date of Patent: June 12, 2012
    Assignee: Sony Deutschland GmbH
    Inventors: Thomas Kemp, Ralf Kompe, Raquel Tato
  • Patent number: 8185390
    Abstract: The invention comprises a method for lossy data compression, akin to vector quantization, in which there is no explicit codebook and no search, i.e. the codebook memory and associated search computation are eliminated. Some memory and computation are still required, but these are dramatically reduced, compared to systems that do not exploit this method. For this reason, both the memory and computation requirements of the method are exponentially smaller than comparable methods that do not exploit the invention. Because there is no explicit codebook to be stored or searched, no such codebook need be generated either. This makes the method well suited to adaptive coding schemes, where the compression system adapts to the statistics of the data presented for processing: both the complexity of the algorithm executed for adaptation, and the amount of data transmitted to synchronize the sender and receiver, are exponentially smaller than comparable existing methods.
    Type: Grant
    Filed: April 23, 2009
    Date of Patent: May 22, 2012
    Assignee: Promptu Systems Corporation
    Inventor: Harry Printz
  • Patent number: 8185392
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving voice queries, obtaining, for one or more of the voice queries, feedback information that references an action taken by a user that submitted the voice query after reviewing a result of the voice query, generating, for the one or more voice queries, a posterior recognition confidence measure that reflects a probability that the voice query was correctly recognized, wherein the posterior recognition confidence measure is generated based at least on the feedback information for the voice query, selecting a subset of the one or more voice queries based on the posterior recognition confidence measures, and adapting an acoustic model using the subset of the voice queries.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: May 22, 2012
    Assignee: Google Inc.
    Inventors: Brian Strope, Douglas H. Beeferman
  • Patent number: 8180637
    Abstract: A method of compensating for additive and convolutive distortions applied to a signal indicative of an utterance is discussed. The method includes receiving a signal and initializing noise mean and channel mean vectors. Gaussian dependent matrix and Hidden Markov Model (HMM) parameters are calculated or updated to account for additive noise from the noise mean vector or convolutive distortion from the channel mean vector. The HMM parameters are adapted by decoding the utterance using the previously calculated HMM parameters and adjusting the Gaussian dependent matrix and the HMM parameters based upon data received during the decoding. The adapted HMM parameters are applied to decode the input utterance and provide a transcription of the utterance.
    Type: Grant
    Filed: December 3, 2007
    Date of Patent: May 15, 2012
    Assignee: Microsoft Corporation
    Inventors: Dong Yu, Li Deng, Alejandro Acero, Yifan Gong, Jinyu Li
  • Patent number: 8175730
    Abstract: In order to analyze an information signal, a significant short-time spectrum is extracted from the information signal, the means for extracting being configured to extract such short-time spectra which come closer to a specific characteristic than other short-time spectra of the information signal. The short-time spectra extracted are then decomposed into component signals using ICA analysis, a component signal spectrum representing a profile spectrum of a tone source which generates a tone corresponding to the characteristic sought for. From a sequence of short-time spectra of the information signal and from the profile spectra determined, an amplitude envelope is eventually calculated for each profile spectrum, the amplitude envelope indicating how a profile spectrum of a tone source all in all changes over time.
    Type: Grant
    Filed: June 30, 2009
    Date of Patent: May 8, 2012
    Assignee: SONY Corporation
    Inventors: Christian Dittmar, Christian Uhle, Jürgen Herre
  • Patent number: 8175236
    Abstract: Providing for inter-working between SMS network architectures and IMS network architectures in a mobile environment is described herein. By way of example, a next generation (NG) short message service center (SMSC) is provided that can receive SMS messages in mobile application protocol (MAP) and convert such messages to IMS protocol. In addition, the NG SMSC can also receive IMS data and convert the IMS data to an SMS MAP message. The NG SMSC can reference an IMS or an SMS location registry to determine a location of the target device, and convert from IMS to SMS MAP, and vice versa, as suitable. Accordingly, the NG SMSC can provide an efficient interface between legacy SMS and NG IMS network components while preserving legacy protocols associated with such networks.
    Type: Grant
    Filed: January 15, 2008
    Date of Patent: May 8, 2012
    Assignee: AT&T Mobility II LLC
    Inventors: Vinod Kumar Pandey, Karl J. Schlieber, Matthew Wayne Stafford, Jianrong Wang
  • Publication number: 20120109649
    Abstract: Automatic speech recognition including receiving speech via a microphone, pre-processing the received speech to generate acoustic feature vectors, classifying dialect of the received speech, selecting at least one of an acoustic model or a lexicon specific to the classified dialect, decoding the acoustic feature vectors using a processor and at least one of the selected dialect-specific acoustic model or selected lexicon to produce a plurality of hypotheses for the received speech, and post-processing the plurality of hypotheses to identify one of the plurality of hypotheses as the received speech.
    Type: Application
    Filed: November 1, 2010
    Publication date: May 3, 2012
    Applicant: GENERAL MOTORS LLC
    Inventors: Gaurav Talwar, Rathinavelu Chengalvarayan
  • Publication number: 20120095762
    Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting at least one frame from among the frames that are previously positioned with respect to a frame of the first speech.
    Type: Application
    Filed: October 19, 2011
    Publication date: April 19, 2012
    Applicants: SEOUL NATIONAL UNIVERSITY INDUSTRY FOUNDATION, SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ki-wan EOM, Chang-woo HAN, Tae-gyoon KANG, Nam-soo KIM, Doo-hwa HONG, Jae-won LEE, Hyung-joon LIM
  • Patent number: 8160866
    Abstract: The present invention can recognize both English and Chinese at the same time. The most important skill is that the features of all English words (without samples) are entirely extracted from the features of Chinese syllables. The invention normalizes the signal waveforms of variable lengths for English words (Chinese syllables) such that the same words (syllables) can have the same features at the same time position. Hence the Bayesian classifier can recognize both the fast and slow utterance of sentences. The invention can improve the feature such that the speech recognition of the unknown English (Chinese) is guaranteed to be correct. Furthermore, since the invention can create the features of English words from the features of Chinese syllables, it can also create the features of other languages from the features of Chinese syllables and hence it can also recognize other languages, such as German, French, Japanese, Korean, Russian, etc.
    Type: Grant
    Filed: October 10, 2008
    Date of Patent: April 17, 2012
    Inventors: Tze Fen Li, Tai-Jan Lee Li, Shih-Tzung Li, Shih-Hon Li, Li-Chuan Liao
  • Patent number: 8155962
    Abstract: The methods and systems described herein may asynchronously process natural language utterances to provide real-time response performance and natural interaction with users. In particular, the methods and systems described herein may use various natural language speech recognition and interpretation components to identify a request (e.g., a query or command) in an utterance. The request identified in the utterance may then be processed with one or more domain agents, which may submit duplicate queries to multiple different data sources to process the request. The domain agents may then asynchronously evaluate responses to the duplicate queries to return results to users in a timely and natural manner, and further to account for the fact that the different data sources may respond to the queries at different speeds, provide unsatisfactory responses to the queries, or fail to respond to the queries at all.
    Type: Grant
    Filed: July 19, 2010
    Date of Patent: April 10, 2012
    Assignee: VoiceBox Technologies, Inc.
    Inventors: Robert A. Kennewick, David Locke, Michael R. Kennewick, Sr., Michael R. Kennewick, Jr., Richard Kennewick, Tom Freeman
  • Patent number: 8155959
    Abstract: Systems and methods are described that automatically control modules of dialog systems. The systems and methods include a dialog module that receives and processes utterances from a speaker and outputs data used to generate synthetic speech outputs as responses to the utterances. A controller is coupled to the dialog module, and the controller detects an abnormal output of the dialog module when the dialog module is processing in an automatic mode. The controller comprises a mode control for an agent to control the dialog module by correcting the abnormal output and transferring a corrected output to a downstream dialog module that follows, in a processing path, the dialog module. The corrected output is used in further processing the utterances.
    Type: Grant
    Filed: November 7, 2007
    Date of Patent: April 10, 2012
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Baoshi Yan, Zhe Feng