Specialized Equations Or Comparisons Patents (Class 704/236)
  • Patent number: 8924212
    Abstract: A method, apparatus and machine-readable medium are provided. A phonotactic grammar is utilized to perform speech recognition on received speech and to generate a phoneme lattice. A document shortlist is generated based on using the phoneme lattice to query an index. A grammar is generated from the document shortlist. Data for each of at least one input field is identified based on the received speech and the generated grammar.
    Type: Grant
    Filed: August 26, 2005
    Date of Patent: December 30, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Cyril Georges Luc Allauzen, Sarangarajan Parthasarathy
  • Patent number: 8918406
    Abstract: A method of processing content files may include receiving the content file, employing processing circuitry to determine an identity score of a source of at least a portion of the content file, to determine a word score for the content file and to determine a metadata score for the content file, determining a composite priority score based on the identity score, the word score and the metadata score, and associating the composite priority score with the content file for electronic provision of the content file together with the composite priority score to a human analyst.
    Type: Grant
    Filed: December 14, 2012
    Date of Patent: December 23, 2014
    Assignee: Second Wind Consulting LLC
    Inventor: Donna Rober
  • Patent number: 8918319
    Abstract: In a speech recognition device and a speech recognition method, a key phrase containing at least one key word is received. The speech recognition method comprises the steps of: receiving a sound source signal of a key word and generating a plurality of audio signals; transforming the audio signals into a plurality of frequency signals; receiving the frequency signals to obtain a space-frequency spectrum and an angular estimation value thereof; receiving the space-frequency spectrum to define and output at least one spatial eigenparameter, and using the angular estimation value and the frequency signals to perform spotting and evaluation and outputting a Bhattacharyya distance; and receiving the spatial eigenparameter and the Bhattacharyya distance and using corresponding thresholds to determine the correctness of the key phrase. Thereby, the invention robustly achieves a high speech recognition rate under very low SNR conditions.
    Type: Grant
    Filed: July 7, 2011
    Date of Patent: December 23, 2014
    Assignee: National Chiao Tung University
    Inventors: Jwu-Sheng Hu, Ming-Tang Lee, Ting-Chao Wang, Chia Hsin Yang
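The abstract above gates key-phrase acceptance on a Bhattacharyya distance compared against a threshold. A minimal sketch of that idea, using the closed form for two univariate Gaussians (the patent's actual feature distributions and thresholds are not specified here; all names are illustrative):

```python
import math

def bhattacharyya_gaussian(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two univariate Gaussians."""
    term_var = 0.25 * math.log(0.25 * (var1 / var2 + var2 / var1 + 2.0))
    term_mean = 0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
    return term_var + term_mean

def accept_keyword(distance, threshold):
    """Accept the spotted key phrase only if it is close enough to the model."""
    return distance <= threshold
```

Identical distributions yield a distance of zero, and the distance grows as the means separate, which is what makes it usable as an acceptance score.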
  • Patent number: 8909522
    Abstract: A voice activity detector (100) includes a frame divider (201) for dividing frames of an input signal into consecutive sub-frames, an energy level estimator (202) for estimating an energy level of the input signal in each of the consecutive sub-frames, a noise eliminator (203) for analyzing the estimated energy levels of sets of the sub-frames to detect and eliminate from enhancement noise sub-frames and to indicate remaining sub-frames as speech sub-frames, and an energy level enhancer (205) for enhancing the estimated energy level for each of the indicated speech sub-frames by an amount which relates to a detected change of the estimated energy level for a current speech sub-frame relative to that for neighboring speech sub-frames.
    Type: Grant
    Filed: July 8, 2008
    Date of Patent: December 9, 2014
    Assignee: Motorola Solutions, Inc.
    Inventors: Itzhak Shperling, Sergey Bondarenko, Eitan Koren, Yosi Rahamim, Tomer Yablonka
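The sub-frame energy estimation and noise/speech labeling described above can be sketched as follows (a simplified illustration; the patent's enhancement step and neighbor-relative adjustment are omitted, and all names are assumptions):

```python
def subframe_energies(signal, subframe_len):
    """Split a signal into consecutive sub-frames and estimate energy in each."""
    energies = []
    for start in range(0, len(signal) - subframe_len + 1, subframe_len):
        sub = signal[start:start + subframe_len]
        energies.append(sum(x * x for x in sub) / subframe_len)
    return energies

def classify_subframes(energies, noise_floor):
    """Mark sub-frames at or below the noise floor as noise; the rest as speech."""
    return ['speech' if e > noise_floor else 'noise' for e in energies]
```

In the patented detector the speech sub-frames would then be enhanced by an amount tied to the energy change relative to neighboring speech sub-frames.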
  • Patent number: 8903724
    Abstract: A speech recognition device includes, a speech recognition section that conducts a search, by speech recognition, on audio data stored in a first memory section to extract word-spoken portions where plural words transferred are each spoken and, of the word-spoken portions extracted, rejects the word-spoken portion for the word designated as a rejecting object; an acquisition section that obtains a derived word of a designated search target word, the derived word being generated in accordance with a derived word generation rule stored in a second memory section or read out from the second memory section; a transfer section that transfers the derived word and the search target word to the speech recognition section, the derived word being set to the outputting object or the rejecting object by the acquisition section; and an output section that outputs the word-spoken portion extracted and not rejected in the search.
    Type: Grant
    Filed: February 1, 2012
    Date of Patent: December 2, 2014
    Assignee: Fujitsu Limited
    Inventors: Nobuyuki Washio, Shouji Harada
  • Patent number: 8892436
    Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting at least one frame from among the frames that are previously positioned with respect to a frame of the first speech.
    Type: Grant
    Filed: October 19, 2011
    Date of Patent: November 18, 2014
    Assignees: Samsung Electronics Co., Ltd., Seoul National University Industry Foundation
    Inventors: Ki-wan Eom, Chang-woo Han, Tae-gyoon Kang, Nam-soo Kim, Doo-hwa Hong, Jae-won Lee, Hyung-joon Lim
  • Patent number: 8892424
    Abstract: An audio analysis system includes a terminal apparatus and a host system. The terminal apparatus acquires an audio signal of a sound containing utterances of a user and another person, discriminates between portions of the audio signal corresponding to the utterances of the user and the other person, detects an utterance feature based on the portion corresponding to the utterance of the user or the other person, and transmits utterance information including the discrimination and detection results to the host system. The host system detects a part corresponding to a conversation from the received utterance information, detects portions of the part of the utterance information corresponding to the user and the other person, compares a combination of plural utterance features corresponding to the portions of the part of the utterance information of the user and the other person with relation information to estimate an emotion, and outputs estimation information.
    Type: Grant
    Filed: February 10, 2012
    Date of Patent: November 18, 2014
    Assignee: Fuji Xerox Co., Ltd.
    Inventors: Haruo Harada, Hirohito Yoneyama, Kei Shimotani, Yohei Nishino, Kiyoshi Iida, Takao Naito
  • Patent number: 8886533
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for combining frame and segment level processing, via temporal pooling, for phonetic classification. A frame processor unit receives an input and extracts the time-dependent features from the input. A plurality of pooling interface units generates a plurality of feature vectors based on pooling the time-dependent features and selecting a plurality of time-dependent features according to a plurality of selection strategies. Next, a plurality of segmental classification units generates scores for the feature vectors. Each segmental classification unit (SCU) can be dedicated to a specific pooling interface unit (PIU) to form a PIU-SCU combination. Multiple PIU-SCU combinations can be further combined to form an ensemble of combinations, and the ensemble can be diversified by varying the pooling operations used by the PIU-SCU combinations.
    Type: Grant
    Filed: October 25, 2011
    Date of Patent: November 11, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Sumit Chopra, Dimitrios Dimitriadis, Patrick Haffner
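Temporal pooling of frame-level features into segment-level vectors, as described above, might look like this in outline (mean and max pooling stand in for the patent's unspecified selection strategies; names are illustrative):

```python
def mean_pool(frames):
    """Average frame-level feature vectors over time into one segment vector."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / n for d in range(dim)]

def max_pool(frames):
    """Keep, per dimension, the largest value seen across all frames."""
    dim = len(frames[0])
    return [max(f[d] for f in frames) for d in range(dim)]

def pooled_ensemble_inputs(frames, strategies):
    """Each pooling strategy feeds its own segmental classifier (PIU-SCU pair);
    varying the strategies diversifies the ensemble."""
    return [pool(frames) for pool in strategies]
```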
  • Patent number: 8886532
    Abstract: On a computing device a speech utterance is received from a user. The speech utterance is a section of a speech dialog that includes a plurality of speech utterances. One or more features from the speech utterance are identified. Each identified feature from the speech utterance is a specific characteristic of the speech utterance. One or more features from the speech dialog are identified. Each identified feature from the speech dialog is associated with one or more events in the speech dialog. The one or more events occur prior to the speech utterance. One or more identified features from the speech utterance and one or more identified features from the speech dialog are used to calculate a confidence score for the speech utterance.
    Type: Grant
    Filed: October 27, 2010
    Date of Patent: November 11, 2014
    Assignee: Microsoft Corporation
    Inventors: Michael Levit, Bruce Melvin Buntschuh
  • Publication number: 20140330563
    Abstract: Some aspects of the invention may include a computer-implemented method for enrolling voice prints generated from audio streams, in a database. The method may include receiving an audio stream of a communication session and creating a preliminary association between the audio stream and an identity of a customer that has engaged in the communication session based on identification information. The method may further include determining a confidence level of the preliminary association based on authentication information related to the customer and if the confidence level is higher than a threshold, sending a request to compare the audio stream to a database of voice prints of known fraudsters. If the audio stream does not match any known fraudsters, sending a request to generate from the audio stream a current voice print associated with the customer and enrolling the voice print in a customer voice print database.
    Type: Application
    Filed: May 2, 2013
    Publication date: November 6, 2014
    Applicant: NICE-SYSTEMS LTD.
    Inventors: Shahar FAIANS, Avraham Lousky, Elad Hoffman, Alon Sabban, Jade Tarni Kahn, Roie Mandler
  • Patent number: 8880399
    Abstract: In the field of language learning systems, proper pronunciation of words and phrases is an integral aspect of language learning, and determining the proximity of the language learner's pronunciation to a standardized, i.e. ‘perfect’, pronunciation is utilized to guide the learner from imperfect toward perfect pronunciation. In this regard, a phoneme lattice scoring system is utilized, whereby an input from a user is transduced into the perfect pronunciation example in a phoneme lattice. The cost of this transduction may be determined based on a summation of the substitutions, deletions and insertions of phonemes needed to transduce from the input to the perfect pronunciation of the utterance.
    Type: Grant
    Filed: September 27, 2010
    Date of Patent: November 4, 2014
    Assignee: Rosetta Stone, Ltd.
    Inventors: Andreas Hagen, Bryan Pellom
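The transduction cost described above, a summation of substitution, deletion and insertion costs, is essentially a weighted edit distance over phoneme sequences. A minimal dynamic-programming sketch (unit costs and names are assumptions; the patented system works over a phoneme lattice rather than a single sequence):

```python
def transduction_cost(hyp, ref, sub=1.0, ins=1.0, dele=1.0):
    """Minimum total cost of substitutions, insertions and deletions
    needed to turn the learner's phoneme sequence into the reference."""
    m, n = len(hyp), len(ref)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * dele
    for j in range(1, n + 1):
        d[0][j] = j * ins
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0.0 if hyp[i - 1] == ref[j - 1] else sub
            d[i][j] = min(d[i - 1][j - 1] + cost,  # substitute or match
                          d[i - 1][j] + dele,      # delete a learner phone
                          d[i][j - 1] + ins)       # insert a reference phone
    return d[m][n]
```

A zero cost corresponds to a pronunciation matching the reference exactly; higher costs guide the learner feedback.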
  • Publication number: 20140324425
    Abstract: A voice control method is applied in an electronic device. The electronic device includes a voice input unit, a play unit, and a storage unit storing a conversation database and an association table between different ranges of voice characteristics and styles of response voice. The method includes the following steps. Obtaining voice signals input via the voice input unit. Determining which content is input according to the obtained voice signals. Searching in the conversation database to find a response corresponding to the input content. Analyzing voice characteristics of the obtained voice signals. Comparing the voice characteristics of the obtained voice signals with the pre-stored ranges. Selecting the associated response voice. Finally, outputting the found response using the associated response voice via the play unit.
    Type: Application
    Filed: May 21, 2013
    Publication date: October 30, 2014
    Applicants: HON HAI PRECISION INDUSTRY CO., LTD., FU TAI HUA INDUSTRY (SHENZHEN) CO., LTD.
    Inventor: REN-WEN HUANG
  • Patent number: 8856002
    Abstract: A universal pattern processing system receives input data and produces output patterns that are best associated with said data. The system uses input means receiving and processing input data, a universal pattern decoder means transforming models using the input data and associating output patterns with original models that are changed least during transforming, and output means outputting best associated patterns chosen by a pattern decoder means.
    Type: Grant
    Filed: April 11, 2008
    Date of Patent: October 7, 2014
    Assignee: International Business Machines Corporation
    Inventors: Dimitri Kanevsky, David Nahamoo, Tara N. Sainath
  • Patent number: 8843367
    Abstract: An adaptive equalization system that adjusts the spectral shape of a speech signal based on an intelligibility measurement of the speech signal may improve the intelligibility of the output speech signal. Such an adaptive equalization system may include a speech intelligibility measurement module, a spectral shape adjustment module, and an adaptive equalization module. The speech intelligibility measurement module is configured to calculate a speech intelligibility measurement of a speech signal. The spectral shape adjustment module is configured to generate a weighted long-term speech curve based on a first predetermined long-term average speech curve, a second predetermined long-term average speech curve, and the speech intelligibility measurement. The adaptive equalization module is configured to adapt equalization coefficients for the speech signal based on the weighted long-term speech curve.
    Type: Grant
    Filed: May 4, 2012
    Date of Patent: September 23, 2014
    Assignee: 8758271 Canada Inc.
    Inventors: Phillip Alan Hetherington, Xueman Li
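The weighted long-term speech curve, built from two predetermined curves and an intelligibility measurement, can be sketched as a simple convex blend (the actual weighting function in the patent is not specified; this linear form and all names are assumptions):

```python
def weighted_speech_curve(curve_a, curve_b, intelligibility):
    """Blend two predetermined long-term average speech curves; a low
    intelligibility score pulls the target toward the second curve."""
    w = max(0.0, min(1.0, intelligibility))  # clamp the measurement to [0, 1]
    return [w * a + (1.0 - w) * b for a, b in zip(curve_a, curve_b)]
```

The adaptive equalizer would then adapt its coefficients toward the blended target curve.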
  • Patent number: 8831941
    Abstract: Disclosed are systems, methods, and computer readable media for comparing customer voice prints comprising uncommonly spoken words with a database of known fraudulent voice signatures and continually updating the database to decrease the risk of identity theft. The method embodiment comprises comparing a received voice signal against a database of known fraudulent voice signatures, denying the caller's transaction if the voice signal substantially matches the database of known fraudulent voice signatures, and adding the caller's voice signal to the database of known fraudulent voice signatures if the voice signal does not substantially match a separate speaker verification database and received additional information is not verified.
    Type: Grant
    Filed: May 29, 2007
    Date of Patent: September 9, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Mazin Gilbert, Jay Wilpon
  • Publication number: 20140249816
    Abstract: An automatic speech recognition (ASR) system includes a speech-responsive application and a recognition engine. The ASR system generates user prompts to elicit certain spoken inputs, and the speech-responsive application performs operations when the spoken inputs are recognised. The recognition engine compares sounds within an input audio signal with phones within an acoustic model, to identify candidate matching phones. A recognition confidence score is calculated for each candidate matching phone, and the confidence scores are used to help identify one or more likely sequences of matching phones that appear to match a word within the grammar of the speech-responsive application. The per-phone confidence scores are evaluated against predefined confidence score criteria (for example, identifying scores below a ‘low confidence’ threshold) and the results of the evaluation are used to influence subsequent selection of user prompts.
    Type: Application
    Filed: February 26, 2014
    Publication date: September 4, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: John Brian Pickering, Timothy David Poultney, Benjamin Terrick Staniford, Matthew Whitbourne
  • Patent number: 8825479
    Abstract: A computerized method, software, and system for recognizing emotions from a speech signal, wherein statistical and MFCC features are extracted from the speech signal, the MFCC features are sorted to provide a basis for comparison between the speech signal and reference samples, the statistical and MFCC features are compared between the speech signal and reference samples, a scoring system is used to compare relative correlation to different emotions, a probable emotional state is assigned to the speech signal based on the scoring system and the probable emotional state is communicated to a user.
    Type: Grant
    Filed: October 24, 2013
    Date of Patent: September 2, 2014
    Assignee: Simple Emotion, Inc.
    Inventors: Akash Krishnan, Matthew Fernandez
  • Patent number: 8812315
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: October 1, 2013
    Date of Patent: August 19, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 8812291
    Abstract: Systems, methods, and computer program products for machine translation are provided. In some implementations a system is provided. The system includes a language model including a collection of n-grams from a corpus, each n-gram having a corresponding relative frequency in the corpus and an order n corresponding to a number of tokens in the n-gram, each n-gram corresponding to a backoff n-gram having an order of n-1, and a collection of backoff scores, each backoff score associated with an n-gram, the backoff score determined as a function of a backoff factor and a relative frequency of a corresponding backoff n-gram in the corpus.
    Type: Grant
    Filed: December 10, 2012
    Date of Patent: August 19, 2014
    Assignee: Google Inc.
    Inventors: Thorsten Brants, Ashok C. Popat, Peng Xu, Franz J. Och, Jeffrey Dean
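The backoff score described above, a function of a backoff factor and the relative frequency of the corresponding lower-order n-gram, resembles the "Stupid Backoff" scheme. A rough sketch over a count table (the alpha value, tuple-keyed dictionary layout, and function names are assumptions):

```python
def backoff_score(ngram, counts, alpha=0.4):
    """Score an n-gram by relative frequency; if the n-gram or its history
    is unseen, back off to the (n-1)-gram score scaled by alpha."""
    hist = ngram[:-1]
    if ngram in counts and hist in counts and counts[hist] > 0:
        return counts[ngram] / counts[hist]
    if len(ngram) == 1:
        total = sum(c for k, c in counts.items() if len(k) == 1)
        return counts.get(ngram, 0) / total if total else 0.0
    return alpha * backoff_score(ngram[1:], counts, alpha)
```

Unlike a smoothed probability, these scores are not normalized; in large-scale translation systems that trade-off keeps training a simple counting pass.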
  • Patent number: 8812321
    Abstract: Disclosed herein are systems, methods and non-transitory computer-readable media for performing speech recognition across different applications or environments without model customization or prior knowledge of the domain of the received speech. The disclosure includes recognizing received speech with a collection of domain-specific speech recognizers, determining a speech recognition confidence for each of the speech recognition outputs, selecting speech recognition candidates based on a respective speech recognition confidence for each speech recognition output, and combining selected speech recognition candidates to generate text based on the combination.
    Type: Grant
    Filed: September 30, 2010
    Date of Patent: August 19, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Mazin Gilbert, Srinivas Bangalore, Patrick Haffner, Robert Bell
  • Patent number: 8812923
    Abstract: A decoder and method of decoding a sub-band coded digital audio signal. The decoder comprises: an input, for receiving sub-band coefficients for a plurality of sub-bands of the audio signal; an error detection unit, adapted to analyze the content of a sequence of coefficients in one of the sub-bands, to derive for each coefficient an indication of whether the coefficient has been corrupted by an error of a predefined type; an error masking unit, adapted to generate from the sequence a modified sequence of coefficients for the sub-band, wherein errors of the predefined type are attenuated; a coefficient combination unit, adapted to combine the received coefficients and the modified coefficients, in dependence upon the indication of error; and a signal reconstruction unit, adapted to reconstruct the audio signal using the combined coefficients.
    Type: Grant
    Filed: November 22, 2011
    Date of Patent: August 19, 2014
    Assignee: NXP, B.V.
    Inventor: Christophe Marc Macours
  • Publication number: 20140229177
    Abstract: Methods of incrementally modifying a word-level finite state transducer (FST) are described for adding and removing sentences. A prefix subset of states and arcs in the FST is determined that matches a prefix portion of the sentence. A suffix subset of states and arcs in the FST is determined that matches a suffix portion of the sentence. A new sentence can then be added to the FST by appending a new sequence of states and arcs to the FST corresponding to a remainder of the sentence between the prefix and suffix. An existing sentence can be removed from the FST by removing any arcs and states between the prefix subset and the suffix subset. The resulting modified FST is locally efficient but does not satisfy global optimization criteria such as minimization.
    Type: Application
    Filed: September 21, 2011
    Publication date: August 14, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: Stephan Kanthak, Oliver Bender
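The prefix-matching half of the incremental FST update can be illustrated with a plain trie: new arcs are appended only for the part of the sentence not already covered by a matching prefix (suffix sharing and the arc/state machinery of a real FST are omitted; names are illustrative):

```python
def add_sentence(trie, words):
    """Walk the longest matching prefix, then append new arcs
    only for the remainder of the sentence."""
    node = trie
    for w in words:
        node = node.setdefault(w, {})
    node['<final>'] = True

def contains(trie, words):
    """Check whether the exact sentence was added."""
    node = trie
    for w in words:
        if w not in node:
            return False
        node = node[w]
    return node.get('<final>', False)
```

As the abstract notes, such local updates are efficient but leave the structure short of global minimization.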
  • Publication number: 20140222423
    Abstract: Most speaker recognition systems use i-vectors, which are compact representations of speaker voice characteristics. Typical i-vector extraction procedures are complex in terms of computation and memory usage. According to an embodiment, a method and corresponding apparatus for speaker identification comprise determining a representation for each component of a variability operator, representing statistical inter- and intra-speaker variability of voice features with respect to a background statistical model, in terms of an orthogonal operator common to all components of the variability operator and having a first dimension larger than a second dimension of the components of the variability operator; computing statistical voice characteristics of a particular speaker using the determined representations; and employing the statistical voice characteristics of the particular speaker in performing speaker recognition.
    Type: Application
    Filed: February 7, 2013
    Publication date: August 7, 2014
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Sandro Cumani, Pietro Laface
  • Publication number: 20140207456
    Abstract: A waveform analysis of speech is disclosed. Embodiments include methods for analyzing captured sounds produced by animals, such as human vowel sounds, and accurately determining the sound produced. Some embodiments utilize computer processing to identify the location of the sound within a waveform, select a particular time within the sound, and measure a fundamental frequency and one or more formants at the particular time. Embodiments compare the fundamental frequency and the one or more formants to known thresholds and multiples of the fundamental frequency, such as by a computer-run algorithm. The results of this comparison identify the sound with a high degree of accuracy.
    Type: Application
    Filed: March 24, 2014
    Publication date: July 24, 2014
    Applicant: Waveform Communications, LLC
    Inventor: Michael A. Stokes
  • Patent number: 8788266
    Abstract: The present invention uses a language model creation device 200 that creates a new language model using a standard language model created from standard language text. The language model creation device 200 includes a transformation rule storage section 201 that stores transformation rules used for transforming dialect-containing word strings into standard language word strings, and a dialect language model creation section 203 that creates dialect-containing n-grams by applying the transformation rules to word n-grams in the standard language model and, furthermore, creates the new language model (dialect language model) by adding the created dialect-containing n-grams to the word n-grams.
    Type: Grant
    Filed: March 16, 2010
    Date of Patent: July 22, 2014
    Assignee: NEC Corporation
    Inventors: Tasuku Kitade, Takafumi Koshinaka, Yoshifumi Onishi
  • Publication number: 20140200890
    Abstract: Embodiments reduce the complexity of speaker-dependent speech recognition systems and methods by representing the code word (i.e., the word to be recognized) using a single Gaussian Mixture Model (GMM) which is adapted from a Universal Background Model (UBM). Only the parameters of the GMM need to be stored. Further reduction in computation is achieved by checking only the GMM component that is relevant to the keyword template. In this scheme, the keyword template is represented by a sequence of the indices of the best-performing components of the GMM of the keyword model. Only one template is saved, by combining the registration templates using the Longest Common Subsequence algorithm. The quality of the word model is continuously updated by performing expectation-maximization iterations using test words that are accepted as matching the keyword model.
    Type: Application
    Filed: March 31, 2013
    Publication date: July 17, 2014
    Applicant: STMicroelectronics Asia Pacific Pte Ltd.
    Inventors: Evelyn Kurniawati, Sapna George
  • Patent number: 8781825
    Abstract: Embodiments of the present invention improve methods of performing speech recognition. In one embodiment, the present invention includes a method comprising receiving a spoken utterance, processing the spoken utterance in a speech recognizer to generate a recognition result, determining consistencies of one or more parameters of component sounds of the spoken utterance, wherein the parameters are selected from the group consisting of duration, energy, and pitch, and wherein each component sound of the spoken utterance has a corresponding value of said parameter, and validating the recognition result based on the consistency of at least one of said parameters.
    Type: Grant
    Filed: August 24, 2011
    Date of Patent: July 15, 2014
    Assignee: Sensory, Incorporated
    Inventors: Jonathan Shaw, Pieter Vermeulen, Stephen Sutton, Robert Savoie
  • Publication number: 20140188470
    Abstract: A disclosed speech processor includes a front end to receive a speech input and generate a feature vector indicative of a portion of the speech input, and a Gaussian mixture model (GMM) circuit to receive the feature vector, model any one of a plurality of GMM speech recognition algorithms, and generate a GMM score for the feature vector based on the GMM speech recognition algorithm modeled. In at least one embodiment, the GMM circuit includes a common compute block to generate a feature vector sum indicative of a weighted sum of squared differences between the feature vector and a mixture component of the GMM speech recognition algorithm. In at least one embodiment, the GMM speech recognition algorithm being modeled includes a plurality of Gaussian mixture components and the common compute block is operable to generate feature vector scores corresponding to each of the plurality of mixture components.
    Type: Application
    Filed: December 31, 2012
    Publication date: July 3, 2014
    Inventors: Jenny Chang, Michael E. Deisher, Ravishankar Iyer
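The common compute block's weighted sum of squared differences is the core of a diagonal-covariance Gaussian log-likelihood. A software sketch of that scoring path (the hardware block, normalization constants, and mixture weights are abstracted into a per-component `log_const`; all names are assumptions):

```python
import math

def component_score(x, mean, inv_var, log_const):
    """Per-component log score via a weighted sum of squared differences
    between the feature vector and the mixture-component mean."""
    wssd = sum(iv * (xi - mi) ** 2 for xi, mi, iv in zip(x, mean, inv_var))
    return log_const - 0.5 * wssd

def gmm_score(x, components):
    """Log-sum-exp over the per-component scores gives the frame's GMM score."""
    scores = [component_score(x, m, iv, lc) for (m, iv, lc) in components]
    mx = max(scores)
    return mx + math.log(sum(math.exp(s - mx) for s in scores))
```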
  • Patent number: 8768697
    Abstract: In some embodiments, a method includes measuring a disparity between two speech samples by segmenting both a reference speech sample and a student speech sample into speech units. A duration disparity can be determined for units that are not adjacent to each other in the reference speech sample. A duration disparity can also be determined for the corresponding units in the student speech sample. A difference can then be calculated between the student speech sample duration disparity and the reference speech sample duration disparity.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: July 1, 2014
    Assignee: Rosetta Stone, Ltd.
    Inventors: Joseph Tepperman, Theban Stanley, Kadri Hacioglu
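The duration-disparity comparison above reduces to simple differences over unit durations; a minimal sketch (unit segmentation is assumed to be done already, and all names are illustrative):

```python
def duration_disparity(durations, i, j):
    """Duration difference between two (possibly non-adjacent) speech units."""
    return durations[i] - durations[j]

def disparity_difference(ref_durations, student_durations, i, j):
    """How far the student's timing contrast between units i and j
    deviates from the same contrast in the reference sample."""
    return (duration_disparity(student_durations, i, j)
            - duration_disparity(ref_durations, i, j))
```

A value near zero means the student reproduced the reference's relative timing for that unit pair, even if absolute speaking rate differs.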
  • Patent number: 8768706
    Abstract: Techniques are disclosed for facilitating the process of proofreading draft transcripts of spoken audio streams. In general, proofreading of a draft transcript is facilitated by playing back the corresponding spoken audio stream with an emphasis on those regions in the audio stream that are highly relevant or likely to have been transcribed incorrectly. Regions may be emphasized by, for example, playing them back more slowly than regions that are of low relevance and likely to have been transcribed correctly. Emphasizing those regions of the audio stream that are most important to transcribe correctly and those regions that are most likely to have been transcribed incorrectly increases the likelihood that the proofreader will accurately correct any errors in those regions, thereby improving the overall accuracy of the transcript.
    Type: Grant
    Filed: August 20, 2010
    Date of Patent: July 1, 2014
    Assignee: Multimodal Technologies, LLC
    Inventors: Kjell Schubert, Juergen Fritsch, Michael Finke, Detlef Koll
  • Patent number: 8768692
    Abstract: A speech recognition apparatus predicts, based on the occurrence cycle and duration time of impulse noise that occurs periodically, a segment in which impulse noise occurs, and executes speech recognition processing based on the feature components of the remaining frames excluding a feature component of a frame corresponding to the predicted segment, or the feature components extracted from frames created from sound data excluding a part corresponding to the predicted segment.
    Type: Grant
    Filed: May 3, 2007
    Date of Patent: July 1, 2014
    Assignee: Fujitsu Limited
    Inventor: Shoji Hayakawa
  • Patent number: 8768700
    Abstract: A system may receive a voice search query and may determine word hypotheses for the voice query. Each word hypothesis may include one or more terms. The system may obtain a search query log and may determine, for each word hypothesis, a quantity of other search queries, in the search query log, that include the one or more terms. The system may determine weights based on the determined quantities. The system may generate, based on the weights, a first search query from the word hypotheses and may obtain a first set of search results. The system may modify, based on the first set of search results, one or more of the weights. The system may generate a second search query from the word hypotheses and obtain, based on the second search query, a second set of search results for the voice query.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: July 1, 2014
    Assignee: Google Inc.
    Inventors: Alexander Mark Franz, Monika H. Henzinger, Sergey Brin, Brian Christopher Milch
  • Publication number: 20140180688
    Abstract: A speech recognition device comprises a corpus processor, which includes a refiner to classify collected corpora into domains corresponding to functions of the speech recognition device and an extractor which extracts collected basic sentences based on functions of the speech recognition device with respect to the corpora in the domains; a database (DB) which stores the extracted basic sentences based on functions of the speech recognition device; a corpus receiver which receives a user's corpora; and a controller which compares a received basic sentence extracted by the extractor with the collected basic sentences stored in the DB and determines the function intended by the user's corpora.
    Type: Application
    Filed: September 12, 2013
    Publication date: June 26, 2014
    Inventors: Oh-yun KWON, Jae-cheol KIM, Seung-il YOON, Cheon-seong LEE
  • Patent number: 8762152
    Abstract: Methods and systems for performing speech recognition using an electronic interactive agent are disclosed. In embodiments of the invention, an electronic agent is presented in a form perceptible to a user. The electronic agent is used to solicit speech input from a user and to respond to the user's recognized speech, and mimics the behavior of a human agent in a natural language query session with the user. The electronic agent may be implemented in a distributed speech recognition system in which speech recognition tasks are divided between client and server.
    Type: Grant
    Filed: October 1, 2007
    Date of Patent: June 24, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Ian Bennett, Bandi Ramesh Babu, Kishor Morkhandikar, Pallaki Gururaj
  • Patent number: 8762147
    Abstract: A signal portion is extracted from an input signal for each frame having a specific duration to generate a per-frame input signal. The per-frame input signal in a time domain is converted into a per-frame input signal in a frequency domain, thereby generating a spectral pattern. Subband average energy is derived in each of subbands adjacent one another in the spectral pattern. The subband average energy is compared in at least one subband pair of a first subband and a second subband that is a higher frequency band than the first subband, the first and second subbands being consecutive subbands in the spectral pattern. It is determined that the per-frame input signal includes a consonant segment if the subband average energy of the second subband is higher than the subband average energy of the first subband.
    Type: Grant
    Filed: February 1, 2012
    Date of Patent: June 24, 2014
    Assignee: JVC KENWOOD Corporation
    Inventors: Akiko Akechi, Takaaki Yamabe
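The subband-energy test in the abstract above is concrete enough to sketch. The frame length and subband count below are assumptions; the core idea is the stated one: flag a frame as a consonant segment when a higher-frequency subband carries more average energy than the subband just below it.

```python
import numpy as np

def is_consonant_frame(frame, num_subbands=8):
    """Flag a frame as a consonant segment if any consecutive subband
    pair has more average energy in the higher band than the lower."""
    spectrum = np.abs(np.fft.rfft(frame))          # spectral pattern
    bands = np.array_split(spectrum, num_subbands)
    avg_energy = [np.mean(b ** 2) for b in bands]  # subband average energy
    return any(avg_energy[i + 1] > avg_energy[i]
               for i in range(num_subbands - 1))
```

Vowels concentrate energy in low subbands, so their energies decay with frequency; noise-like consonants tilt energy upward somewhere in the spectrum, triggering the test.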
  • Publication number: 20140172427
    Abstract: A method for processing messages pertaining to an event includes receiving a plurality of messages pertaining to the event from electronic communication devices associated with a plurality of observers of the event, generating a first message stream that includes only a portion of the plurality of messages corresponding to a first participant in the event, identifying a first sub-event in the first message stream with reference to a time distribution of messages and content distribution of messages in the first message stream, generating a sub-event summary with reference to a portion of the plurality of messages in the first message stream that are associated with the first sub-event, and transmitting the sub-event summary to a plurality of electronic communication devices associated with a plurality of users who are not observers of the event.
    Type: Application
    Filed: December 13, 2013
    Publication date: June 19, 2014
    Applicant: Robert Bosch GmbH
    Inventors: Fei Liu, Fuliang Weng, Chao Shen, Lin Zhao
  • Patent number: 8751229
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for handling missing speech data. The computer-implemented method includes receiving speech with a missing segment, generating a plurality of hypotheses for the missing segment, identifying a best hypothesis for the missing segment, and recognizing the received speech by inserting the identified best hypothesis for the missing segment. In another method embodiment, the final step is replaced with synthesizing the received speech by inserting the identified best hypothesis for the missing segment. In one aspect, the method further includes identifying a duration for the missing segment and generating the plurality of hypotheses of the identified duration for the missing segment. The step of identifying the best hypothesis for the missing segment can be based on speech context, a pronouncing lexicon, and/or a language model. Each hypothesis can have an identical acoustic score.
    Type: Grant
    Filed: November 21, 2008
    Date of Patent: June 10, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie
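Since the abstract above notes that each hypothesis can have an identical acoustic score, selection reduces to a language-model comparison. The sketch below assumes a bigram log-probability callable; the real system could equally use speech context or a pronouncing lexicon, as the abstract states.

```python
def best_hypothesis(left_context, hypotheses, bigram_logprob):
    """Pick the hypothesis for the missing segment that maximizes the
    bigram language-model score given the preceding context.
    bigram_logprob: callable (prev_word, word) -> log probability."""
    def lm_score(words):
        prev = left_context[-1]
        total = 0.0
        for w in words:
            total += bigram_logprob(prev, w)
            prev = w
        return total
    return max(hypotheses, key=lm_score)
```

The chosen hypothesis would then be inserted into the missing segment for recognition or synthesis.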
  • Patent number: 8744845
    Abstract: A noise estimation method for a noisy speech signal according to an embodiment of the present invention includes the steps of approximating a transformation spectrum by transforming an input noisy speech signal to a frequency domain, calculating a smoothed magnitude spectrum having a decreased difference in a magnitude of the transformation spectrum between neighboring frames, calculating a search spectrum to represent an estimated noise component of the smoothed magnitude spectrum, and estimating a noise spectrum by a recursive average method using an adaptive forgetting factor defined from the search spectrum. According to an embodiment of the present invention, the amount of calculation for noise estimation is small, and large-capacity memory is not required. Accordingly, the present invention can be easily implemented in hardware or software. Further, the accuracy of noise estimation can be increased because an adaptive procedure can be performed on each frequency sub-band.
    Type: Grant
    Filed: March 31, 2009
    Date of Patent: June 3, 2014
    Assignee: Transono Inc.
    Inventors: Sung Il Jung, Dong Gyung Ha
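The recursive-average update in the abstract above can be sketched as below. The smoothing constant, the floor bounds on the forgetting factor, and the use of a running minimum as the "search spectrum" are all illustrative assumptions; only the overall shape (smoothed magnitude spectrum, search spectrum, adaptive forgetting factor, recursive average) follows the abstract.

```python
import numpy as np

def estimate_noise(frames, alpha_min=0.7, alpha_max=0.995):
    """Estimate a noise magnitude spectrum by recursive averaging with a
    per-bin adaptive forgetting factor."""
    smoothed = search = noise = None
    for frame in frames:
        mag = np.abs(np.fft.rfft(frame))
        # Smoothed magnitude spectrum: damp frame-to-frame differences.
        smoothed = mag if smoothed is None else 0.9 * smoothed + 0.1 * mag
        # Search spectrum: running minimum as a crude noise-floor tracker.
        search = smoothed if search is None else np.minimum(search, smoothed)
        if noise is None:
            noise = smoothed.copy()
            continue
        # Bins near the floor (likely noise-only) get a smaller alpha,
        # i.e. a faster noise update; speech-dominated bins update slowly.
        ratio = search / (smoothed + 1e-12)           # in (0, 1]
        alpha = alpha_max - (alpha_max - alpha_min) * ratio
        noise = alpha * noise + (1.0 - alpha) * smoothed
    return noise
```

Because each step is a per-bin multiply-add, the memory and computation stay small, which is the advantage the abstract claims.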
  • Patent number: 8737571
    Abstract: A method, apparatus and computer readable medium for call quality testing is presented. A query is transmitted over a communications network from a first location to a second location. The query results in an audio signal at the second location, which is received at the first location. The audio signal is analyzed by comparing the signal with a reference signal clip. A statistical parameter is generated, the statistical parameter indicative of a quality of the received signal.
    Type: Grant
    Filed: June 29, 2004
    Date of Patent: May 27, 2014
    Assignee: Empirix Inc.
    Inventors: Albert R. Seeley, Nathan David, Zhongyi Chen, Douglas C. Williams, Andrew Ullmann
  • Patent number: 8738374
    Abstract: Described is a speech-to-text conversion system and method that provides secure, real-time and high-accuracy conversion of general-quality speech into text. The system is designed to interface with external devices and services, providing a simple and convenient manner to transcribe audio that may be stored elsewhere, such as a wireless phone's voice mail, or occurring between two or more parties, such as a conference call. The first step in the system's process ensures secure and private transcription by separating an audio stream into many audio shreds, each of which has a duration of only a few seconds and cannot reveal the context of the conversation. A workforce of geographically distributed transcription agents who transcribe the audio shreds is able to generate transcription in real time, with many agents working in parallel on a single conversation. No one agent (or group of agents) receives a sufficient number of audio shreds to reconstruct the context of any conversation.
    Type: Grant
    Filed: May 22, 2009
    Date of Patent: May 27, 2014
    Assignee: j2 Global Communications, Inc.
    Inventor: Jon Jaroker
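The shredding step described in the abstract above can be sketched simply. Round-robin assignment is an illustrative simplification: a real deployment would presumably randomize or otherwise constrain assignment so no agent accumulates enough consecutive shreds to recover context.

```python
def shred_audio(samples, sample_rate, shred_seconds, num_agents):
    """Split an audio stream into short shreds and deal them across
    agents; each agent receives (index, shred) pairs so transcripts can
    be reassembled in order afterwards."""
    shred_len = int(sample_rate * shred_seconds)
    shreds = [samples[i:i + shred_len]
              for i in range(0, len(samples), shred_len)]
    assignment = {a: [] for a in range(num_agents)}
    for idx, shred in enumerate(shreds):
        assignment[idx % num_agents].append((idx, shred))
    return assignment
```

With many agents transcribing in parallel, the full transcript is produced in near real time while no single agent sees the whole conversation.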
  • Patent number: 8731928
    Abstract: A phonetic vocabulary for a speech recognition system is adapted to a particular speaker's pronunciation. A speaker can be attributed specific pronunciation styles, which can be identified from specific pronunciation examples. Consequently, a phonetic vocabulary can be reduced in size, which can improve recognition accuracy and recognition speed.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: May 20, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Nitendra Rajput, Ashish Verma
  • Patent number: 8731921
    Abstract: A frame erasure concealment technique for a bitstream-based feature extractor in a speech recognition system particularly suited for use in a wireless communication system operates to “delete” each frame in which an erasure is declared. The deletions thus reduce the length of the observation sequence, but have been found to provide for sufficient speech recognition based on both single word and “string” tests of the deletion technique.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: May 20, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Richard Vandervoort Cox, Hong Kook Kim
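The deletion technique in the abstract above amounts to filtering the observation sequence, which a one-line sketch captures:

```python
def conceal_erasures(feature_frames, erasure_flags):
    """Drop every frame in which an erasure is declared, shortening the
    observation sequence passed to the recognizer."""
    return [f for f, erased in zip(feature_frames, erasure_flags)
            if not erased]
```

The abstract reports that the shortened sequence still supports adequate recognition in both single-word and string tests.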
  • Patent number: 8725518
    Abstract: A system for providing automatic quality management regarding a level of conformity to a specific accent, including a recording system, a statistical model database with statistical models representing speech data of different levels of conformity to a specific accent, a speech analysis system, and a quality management system. The recording system is adapted to record one or more samples of a speaker's speech and provide them to the speech analysis system for analysis, and the speech analysis system is adapted to provide a score of the speaker's speech samples to the quality management system by analyzing the recorded speech samples relative to the statistical models in the statistical model database.
    Type: Grant
    Filed: April 25, 2006
    Date of Patent: May 13, 2014
    Assignee: Nice Systems Ltd.
    Inventors: Moshe Waserblat, Barak Eilam
  • Patent number: 8725829
    Abstract: A method and system are described which allow users to identify (pre-recorded) sounds such as music, radio broadcasts, commercials, and other audio signals in almost any environment. The audio signal (or sound) must be a recording represented in a database of recordings. The service can quickly identify the signal from just a few seconds of excerpted audio, while tolerating high noise and distortion. Once the signal is identified to the user, the user may perform transactions interactively in real-time or offline using the identification information.
    Type: Grant
    Filed: April 26, 2004
    Date of Patent: May 13, 2014
    Assignee: Shazam Investments Limited
    Inventors: Avery Li-Chun Wang, Christopher Jacques Penrose Barton, Dheeraj Shankar Mukherjee, Philip Inghelbrecht
  • Patent number: 8719021
    Abstract: A speech recognition dictionary compilation assisting system can create and update speech recognition dictionaries and language models efficiently so as to reduce speech recognition errors by utilizing text data available at a low cost. The system includes speech recognition dictionary storage section 105, language model storage section 106 and acoustic model storage section 107. A virtual speech recognition processing section 102 processes analyzed text data generated by the text analyzing section 101 by making reference to the recognition dictionary, language models and acoustic models so as to generate virtual speech-recognition result text data, and compares that virtual result text data with the analyzed text data. The update processing section 103 updates the recognition dictionary and language models so as to reduce the differences between the two sets of text data.
    Type: Grant
    Filed: February 2, 2007
    Date of Patent: May 6, 2014
    Assignee: NEC Corporation
    Inventor: Takafumi Koshinaka
  • Patent number: 8719035
    Abstract: Techniques are disclosed for recognizing user personality in accordance with a speech recognition system. For example, a technique for recognizing a personality trait associated with a user interacting with a speech recognition system includes the following steps/operations. One or more decoded spoken utterances of the user are obtained. The one or more decoded spoken utterances are generated by the speech recognition system. The one or more decoded spoken utterances are analyzed to determine one or more linguistic attributes (morphological and syntactic filters) that are associated with the one or more decoded spoken utterances. The personality trait associated with the user is then determined based on the analyzing step/operation.
    Type: Grant
    Filed: March 26, 2008
    Date of Patent: May 6, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Osamuyimen Thompson Stewart, Liwei Dai
  • Patent number: 8712775
    Abstract: A method and system for generating a finite state grammar is provided. The method comprises receiving user input of at least two sample phrases; analyzing the sample phrases to determine common words that occur in each of the sample phrases and optional words that occur in only some of the sample phrases; creating a mathematical expression representing the sample phrases, the expression including each word found in the sample phrases and an indication of whether a word is a common word or an optional word; displaying the mathematical expression to a user; allowing the user to alter the mathematical expression; generating a finite state grammar corresponding to the altered mathematical expression; and displaying the finite state grammar to the user.
    Type: Grant
    Filed: January 31, 2013
    Date of Patent: April 29, 2014
    Assignee: West Interactive Corporation II
    Inventor: Ashok Mitter Khosla
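The common/optional word analysis in the abstract above can be sketched as follows. The `(word)?` rendering is an illustrative notation for optional words, not the patented expression format, and word ordering follows first occurrence across the sample phrases.

```python
def analyze_phrases(phrases):
    """Classify words as common (in every sample phrase) or optional
    (in only some), and render a simple expression over them."""
    tokenized = [p.lower().split() for p in phrases]
    common = set(tokenized[0])
    for words in tokenized[1:]:
        common &= set(words)
    expression, seen = [], set()
    for words in tokenized:
        for w in words:
            if w not in seen:
                seen.add(w)
                expression.append(w if w in common else "(%s)?" % w)
    return " ".join(expression)
```

In the patented flow, this expression would be shown to the user for editing before a finite state grammar is generated from the edited form.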
  • Patent number: 8711015
    Abstract: The invention relates to compressing sparse data sets containing sequences of data values and position information therefor. The position information may be in the form of position indices defining active positions of the data values in a sparse vector of length N. The position information is encoded into the data values by adjusting one or more of the data values within a pre-defined tolerance range, so that a pre-defined mapping function of the data values and their positions is close to a target value. In one embodiment, the mapping function is defined using a sub-set of N filler values whose elements are used to fill empty positions in the input sparse data vector. At the decoder, the correct data positions are identified by searching through possible sub-sets of filler values.
    Type: Grant
    Filed: August 24, 2011
    Date of Patent: April 29, 2014
    Assignee: Her Majesty the Queen in Right of Canada as represented by the Minister of Industry, through the Communications Research Centre Canada
    Inventors: Frederic Mustiere, Hossein Najaf-Zadeh, Ramin Pishehvar, Hassan Lahdili, Louis Thibault, Martin Bouchard
  • Patent number: 8706489
    Abstract: A system and method for selecting audio contents by using speech recognition to obtain a textual phrase from a series of audio contents are provided. The system includes an output module outputting the audio contents, an input module receiving a speech input from a user, a buffer temporarily storing the audio contents within a desired period and the speech input, and a recognizing module performing speech recognition between the audio contents within the desired period and the speech input to generate an audio phrase and the corresponding textual phrase matching the speech input.
    Type: Grant
    Filed: August 8, 2006
    Date of Patent: April 22, 2014
    Assignee: Delta Electronics Inc.
    Inventors: Jia-lin Shen, Chien-Chou Hung
  • Patent number: 8704649
    Abstract: Disclosed herein is a vibrotactile device intuitively providing information by inducing a tactile sense to a user, and a method using the same. The device according to an embodiment includes a vibrating contact panel contacting with a user's hand; a plurality of vibratory modules that are attached to the lower part of the vibrating contact panel and vibrate with different intensities according to the amount of supplied power; and a plurality of vibration isolating links that are coupled, respectively, to an end of each of the modules to support the modules and to isolate the vibration from the modules.
    Type: Grant
    Filed: January 21, 2009
    Date of Patent: April 22, 2014
    Assignee: Korea Institute of Science and Technology
    Inventors: Dong Seok Ryu, Sung Chul Kang, Gi Hun Yang