Specialized Equations Or Comparisons Patents (Class 704/236)
-
Patent number: 8706491
Abstract: One feature of the present invention uses the parsing capabilities of a structured language model in the information extraction process. During training, the structured language model is first initialized with syntactically annotated training data. The model is then trained by generating parses on semantically annotated training data enforcing annotated constituent boundaries. The syntactic labels in the parse trees generated by the parser are then replaced with joint syntactic and semantic labels. The model is then trained by generating parses on the semantically annotated training data enforcing the semantic tags or labels found in the training data. The trained model can then be used to extract information from test data using the parses generated by the model.
Type: Grant
Filed: August 24, 2010
Date of Patent: April 22, 2014
Assignee: Microsoft Corporation
Inventors: Ciprian Chelba, Milind Mahajan
-
Patent number: 8706499
Abstract: Client devices periodically capture ambient audio waveforms, generate waveform fingerprints, and upload the fingerprints to a server for analysis. The server compares the waveforms to a database of stored waveform fingerprints, and upon finding a match, pushes content or other information to the client device. The fingerprints in the database may be uploaded by other users, and compared to the received client waveform fingerprint based on common location or other social factors. Thus a client's location may be enhanced if the location of users whose fingerprints match the client's is known. In particular embodiments, the server may instruct clients whose fingerprints partially match to capture waveform data at a particular time and duration for further analysis and increased match confidence.
Type: Grant
Filed: August 16, 2011
Date of Patent: April 22, 2014
Assignee: Facebook, Inc.
Inventors: Matthew Nicholas Papakipos, David Harry Garcia
-
Patent number: 8700399
Abstract: In one embodiment the present invention includes a method comprising receiving an acoustic input signal and processing the acoustic input signal with a plurality of acoustic recognition processes configured to recognize the same target sound. Different acoustic recognition processes start processing different segments of the acoustic input signal at different time points in the acoustic input signal. In one embodiment, initial states in the recognition processes may be configured on each time step.
Type: Grant
Filed: July 6, 2010
Date of Patent: April 15, 2014
Assignee: Sensory, Inc.
Inventors: Pieter J. Vermeulen, Jonathan Shaw, Todd F. Mozer
-
Patent number: 8700397
Abstract: A method of and a system for processing speech. A spoken utterance of a plurality of characters can be received. A plurality of known character sequences that potentially correspond to the spoken utterance can be selected. Each selected known character sequence can be scored based on, at least in part, a weighting of individual characters that comprise the known character sequence.
Type: Grant
Filed: July 30, 2012
Date of Patent: April 15, 2014
Assignee: Nuance Communications, Inc.
Inventor: Kenneth D. White
-
Patent number: 8700392
Abstract: A user can provide input to a computing device through various combinations of speech, movement, and/or gestures. A computing device can capture audio data and analyze that data to determine any speech information in the audio data. The computing device can simultaneously capture image or video information which can be used to assist in analyzing the audio information. For example, image information can be utilized by the device to determine when someone is speaking, and the movement of the person's lips can be analyzed to assist in determining the words that were spoken. Any gestures or other motions can assist in the determination as well. By combining various types of data to determine user input, the accuracy of a process such as speech recognition can be improved, and the need for lengthy application training processes can be avoided.
Type: Grant
Filed: September 10, 2010
Date of Patent: April 15, 2014
Assignee: Amazon Technologies, Inc.
Inventors: Gregory M. Hart, Ian W. Freed, Gregg Elliott Zehr, Jeffrey P. Bezos
-
Patent number: 8700398
Abstract: An interactive user interface is described for setting confidence score thresholds in a language processing system. There is a display of a first system confidence score curve characterizing system recognition performance associated with a high confidence threshold, a first user control for adjusting the high confidence threshold and an associated visual display highlighting a point on the first system confidence score curve representing the selected high confidence threshold, a display of a second system confidence score curve characterizing system recognition performance associated with a low confidence threshold, and a second user control for adjusting the low confidence threshold and an associated visual display highlighting a point on the second system confidence score curve representing the selected low confidence threshold. The operation of the second user control is constrained to require that the low confidence threshold must be less than or equal to the high confidence threshold.
Type: Grant
Filed: November 29, 2011
Date of Patent: April 15, 2014
Assignee: Nuance Communications, Inc.
Inventors: Jeffrey N. Marcus, Amy E. Ulug, William Bridges Smith, Jr.
-
Publication number: 20140100847
Abstract: Disclosed is a voice recognition device including: first through Mth voice recognition parts each for detecting a voice interval from sound data stored in a sound data storage unit 2 to extract a feature quantity of the sound data within the voice interval, and each for carrying out a recognition process on the basis of the feature quantity extracted thereby while referring to a recognition dictionary; a voice recognition switching unit 4 for switching among the first through Mth voice recognition parts; a recognition control unit 5 for controlling the switching among the voice recognition parts by the voice recognition switching unit 4 to acquire recognition results acquired by a voice recognition part selected; and a recognition result selecting unit 6 for selecting a recognition result to be presented to a user from the recognition results acquired by the recognition control unit 5.
Type: Application
Filed: July 5, 2011
Publication date: April 10, 2014
Applicant: MITSUBISHI ELECTRIC CORPORATION
Inventors: Jun Ishii, Michihiro Yamazaki
-
Publication number: 20140100848
Abstract: Methods and systems for identifying specified phrases within audio streams are provided. More particularly, a phrase is specified. An audio stream is then monitored for the phrase. In response to determining that the audio stream contains the phrase, verification from a user that the phrase was in fact included in the audio stream is requested. If such verification is received, the portion of the audio stream including the phrase is recorded. The recorded phrase can then be applied to identify future instances of the phrase in monitored audio streams.
Type: Application
Filed: October 5, 2012
Publication date: April 10, 2014
Applicant: AVAYA INC.
Inventors: Shmuel Shaffer, Keith Ponting, Valentine C. Matula
-
Patent number: 8694317
Abstract: Methods for processing audio data containing speech to produce a searchable index file and for subsequently searching such an index file are provided. The processing method uses a phonetic approach and models each frame of the audio data with a set of reference phones. A score for each of the reference phones, representing the difference of the audio from the phone model, is stored in the searchable data file for each of the phones in the reference set. A consequence of storing information regarding each of the reference phones is that the accuracy of searches carried out on the index file is not compromised by the rejection of information about particular phones. A subsequent search method is also provided which uses a simple and efficient dynamic programming search to locate instances of a search term in the audio. The methods of the present invention have particular application to the field of audio data mining.
Type: Grant
Filed: February 6, 2006
Date of Patent: April 8, 2014
Assignee: Aurix Limited
Inventors: Adrian I Skilling, Howard A K Wright
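The core idea of the index above — keep a score for every reference phone in every frame, then search with a window over the phone sequence — can be sketched as follows. The phone labels, the lower-is-better score convention, and the one-phone-per-frame alignment are illustrative assumptions; the patented method uses a more general dynamic-programming alignment over variable phone durations.

```python
# Hypothetical phonetic search index: every frame stores a score for every
# reference phone (smaller = closer to that phone's model), so no phone
# information is discarded before search time.

def best_match(index, term_phones):
    """Return (start_frame, total_score) of the best-scoring placement,
    aligning one phone of the search term per frame (a simplification)."""
    best = None
    for start in range(len(index) - len(term_phones) + 1):
        score = sum(index[start + i][p] for i, p in enumerate(term_phones))
        if best is None or score < best[1]:
            best = (start, score)
    return best

# Toy index: 4 frames scored against a 3-phone reference set.
index = [
    {"k": 0.2, "ae": 0.9, "t": 0.8},
    {"k": 0.7, "ae": 0.1, "t": 0.9},
    {"k": 0.8, "ae": 0.8, "t": 0.2},
    {"k": 0.5, "ae": 0.6, "t": 0.7},
]
start, score = best_match(index, ["k", "ae", "t"])  # search for "cat"
```

Because the index keeps all phone scores, the same index serves any later query without re-decoding the audio.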
-
Patent number: 8688451
Abstract: A speech recognition method includes receiving input speech from a user, processing the input speech using a first grammar to obtain parameter values of a first N-best list of vocabulary, comparing a parameter value of a top result of the first N-best list to a threshold value, and if the compared parameter value is below the threshold value, then additionally processing the input speech using a second grammar to obtain parameter values of a second N-best list of vocabulary. Other preferred steps include: determining the input speech to be in-vocabulary if any of the results of the first N-best list is also present within the second N-best list, but out-of-vocabulary if none of the results of the first N-best list is within the second N-best list; and providing audible feedback to the user if the input speech is determined to be out-of-vocabulary.
Type: Grant
Filed: May 11, 2006
Date of Patent: April 1, 2014
Assignee: General Motors LLC
Inventors: Timothy J. Grost, Rathinavelu Chengalvarayan
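The two-pass decision described above can be sketched in a few lines. The (word, confidence) list shape, the 0-to-1 confidence scale, and the threshold value are assumptions for illustration, not details from the patent.

```python
# Minimal sketch of the two-grammar N-best decision: trust the first
# grammar when its top confidence clears the threshold; otherwise run a
# second grammar and check for overlap between the two N-best lists.

def classify_utterance(first_nbest, second_nbest_fn, threshold=0.5):
    top_word, top_conf = first_nbest[0]
    if top_conf >= threshold:
        return "in-vocabulary", top_word
    # Low confidence: invoke the second grammar only when needed.
    second_words = {w for w, _ in second_nbest_fn()}
    if any(w in second_words for w, _ in first_nbest):
        return "in-vocabulary", top_word
    return "out-of-vocabulary", None  # caller would play audible feedback

status, word = classify_utterance(
    [("dial", 0.3), ("file", 0.2)],
    lambda: [("call", 0.6), ("dial", 0.4)],
)
```

Passing the second N-best list as a callable mirrors the conditional nature of the second pass: it is only computed when the first pass is uncertain.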
-
Patent number: 8688448
Abstract: The invention relates to a method, a computer program product, a segmentation system and a user interface for structuring an unstructured text by making use of statistical models trained on annotated training data. The method performs text segmentation into text sections and assigns labels to text sections as section headings. The performed segmentation and assignment is provided to a user for general review. Additionally, alternative segmentations and label assignments are provided to the user, who can select alternative segmentations and alternative labels as well as enter a user-defined segmentation and user-defined label. In response to the modifications introduced by the user, a plurality of different actions are initiated incorporating the re-segmentation and re-labeling of successive parts of the document or the entire document.
Type: Grant
Filed: September 14, 2012
Date of Patent: April 1, 2014
Assignee: Nuance Communications Austria GmbH
Inventors: Jochen Peters, Evgeny Matusov, Carsten Meyer, Dietrich Klakow
-
Patent number: 8682669
Abstract: A system and a method to generate statistical utterance classifiers optimized for the individual states of a spoken dialog system is disclosed. The system and method make use of large databases of transcribed and annotated utterances from calls collected in a dialog system in production and log data reporting the association between the state of the system at the moment when the utterances were recorded and the utterance. From the system state, being a vector of multiple system variables, subsets of these variables, certain variable ranges, quantized variable values, etc. can be extracted to produce a multitude of distinct utterance subsets matching every possible system state. For each of these subset and variable combinations, statistical classifiers can be trained, tuned, and tested, and the classifiers can be stored together with the performance results and the state subset and variable combination.
Type: Grant
Filed: August 21, 2009
Date of Patent: March 25, 2014
Assignee: Synchronoss Technologies, Inc.
Inventors: David Suendermann, Jackson Liscombe, Krishna Dayanidhi, Roberto Pieraccini
-
Patent number: 8682664
Abstract: The present invention discloses a method and a device for audio signal classification, and relates to the field of communications technologies, which solve a problem of high complexity of type classification of audio signals in the prior art. In the present invention, after an audio signal to be classified is received, a tonal characteristic parameter of the audio signal to be classified, where the tonal characteristic parameter of the audio signal to be classified is in at least one sub-band, is obtained, and a type of the audio signal to be classified is determined according to the obtained characteristic parameter. The present invention is mainly applied to an audio signal classification scenario, and implements audio signal classification through a relatively simple method.
Type: Grant
Filed: September 27, 2011
Date of Patent: March 25, 2014
Assignee: Huawei Technologies Co., Ltd.
Inventors: Lijing Xu, Shunmei Wu, Liwei Chen, Qing Zhang
-
Publication number: 20140081636
Abstract: System and method to adjust an automatic speech recognition (ASR) engine, the method including: receiving social network information from a social network; data mining the social network information to extract one or more characteristics; inferring a trend from the extracted one or more characteristics; and adjusting the ASR engine based upon the inferred trend. Embodiments of the method may further include: receiving a speech signal from a user; and recognizing the speech signal by use of the adjusted ASR engine. Further embodiments of the method may further include: producing a list of candidate matching words; and ranking the list of candidate matching words by use of the inferred trend.
Type: Application
Filed: September 15, 2012
Publication date: March 20, 2014
Applicant: Avaya Inc.
Inventors: George W. Erhart, Valentine C. Matula, David J. Skiba
-
Patent number: 8676578
Abstract: According to one embodiment, a meeting support apparatus includes a storage unit, a determination unit, and a generation unit. The storage unit is configured to store storage information for each of words, the storage information indicating a word of the words, pronunciation information on the word, and pronunciation recognition frequency. The determination unit is configured to generate emphasis determination information including an emphasis level that represents whether a first word should be highlighted and represents a degree of highlighting determined in accordance with a pronunciation recognition frequency of a second word when the first word is highlighted, based on whether the storage information includes a second set corresponding to a first set and based on the pronunciation recognition frequency of the second word when the second set is included. The generation unit is configured to generate an emphasis character string based on the emphasis determination information when the first word is highlighted.
Type: Grant
Filed: March 25, 2011
Date of Patent: March 18, 2014
Assignee: Kabushiki Kaisha Toshiba
Inventors: Tomoo Ikeda, Nobuhiro Shimogori, Kouji Ueno
-
Patent number: 8676580
Abstract: A method, an apparatus and an article of manufacture for automatic speech recognition. The method includes obtaining at least one language model word and at least one rule-based grammar word, determining an acoustic similarity of at least one pair of language model word and rule-based grammar word, and increasing a transition cost to the at least one language model word based on the acoustic similarity of the at least one language model word with the at least one rule-based grammar word to generate a modified language model for automatic speech recognition.
Type: Grant
Filed: August 16, 2011
Date of Patent: March 18, 2014
Assignee: International Business Machines Corporation
Inventors: Om D. Deshmukh, Etienne Marcheret, Shajith I. Mohamed, Ashish Verma, Karthik Visweswariah
-
Patent number: 8676588
Abstract: Streaming voice signals, such as might be received at a contact center or similar operation, are analyzed to detect the occurrence of one or more unprompted, predetermined utterances. The predetermined utterances preferably constitute a vocabulary of words and/or phrases having particular meaning within the context in which they are uttered. Detection of one or more of the predetermined utterances during a call causes a determination of response-determinative significance of the detected utterance(s). Based on the response-determinative significance of the detected utterance(s), a responsive action may be further determined. Additionally, long term storage of the call corresponding to the detected utterance may also be initiated. Conversely, calls in which no predetermined utterances are detected may be deleted from short term storage.
Type: Grant
Filed: May 22, 2009
Date of Patent: March 18, 2014
Assignee: Accenture Global Services Limited
Inventors: Thomas J. Ryan, Biji K. Janan
-
Publication number: 20140074468
Abstract: An embodiment according to the invention provides a capability of automatically predicting how favorable a given speech signal is for statistical modeling, which is advantageous in a variety of different contexts. In Multi-Form Segment (MFS) synthesis, for example, an embodiment according to the invention uses prediction capability to provide an automatic acoustic driven template versus model decision maker with an output quality that is high, stable and depends gradually on the system footprint. In speaker selection for a statistical Text-to-Speech synthesis (TTS) system build, as another example context, an embodiment according to the invention enables a fast selection of the most appropriate speaker among several available ones for the full voice dataset recording and preparation, based on a small amount of recorded speech material.
Type: Application
Filed: September 7, 2012
Publication date: March 13, 2014
Applicant: Nuance Communications, Inc.
Inventors: Alexander Sorin, Slava Shechtman, Vincent Pollet
-
Publication number: 20140074469
Abstract: Method and apparatus for generating compact signatures of acoustic signals are disclosed. A method of generating acoustic signal signatures comprises the steps of dividing the input signal into multiple frames, computing the Fourier transform of each frame, computing the difference between non-negative Fourier transform output values for the current frame and non-negative Fourier transform output values for one of the previous frames, combining difference values into subgroups, accumulating difference values within a subgroup, combining accumulated subgroup values into groups, and finding an extreme value within each group.
Type: Application
Filed: September 8, 2013
Publication date: March 13, 2014
Inventor: Sergey Zhidkov
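The pipeline of steps above (frame spectrum, inter-frame difference, subgroup accumulation, per-group extreme) can be sketched as follows. The frame size, subgroup/group widths, and the use of the extreme value's *index* as the signature element are illustrative assumptions; the naive DFT stands in for a real FFT.

```python
import cmath

def magnitude_spectrum(frame):
    """Naive DFT magnitudes (non-negative values); fine for tiny frames."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2)]

def frame_signature(cur, prev, subgroup=2, group=2):
    """One signature element per group: the position of the extreme
    accumulated spectral difference, loosely following the steps above."""
    diff = [c - p for c, p in zip(magnitude_spectrum(cur),
                                  magnitude_spectrum(prev))]
    subs = [sum(diff[i:i + subgroup]) for i in range(0, len(diff), subgroup)]
    groups = [subs[i:i + group] for i in range(0, len(subs), group)]
    return [max(range(len(g)), key=lambda i: abs(g[i])) for g in groups]

# A pure tone appearing where the previous frame was silence: the
# signature records which accumulated-difference slot changed most.
sig = frame_signature([0, 1, 0, -1, 0, 1, 0, -1], [0.0] * 8)
```

Keeping only per-group extremes is what makes such signatures compact and tolerant of small spectral perturbations.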
-
Publication number: 20140067392
Abstract: A method of providing hands-free services using a mobile device having wireless access to computer-based services includes receiving speech in a vehicle from a vehicle occupant; recording the speech using a mobile device; transmitting the recorded speech from the mobile device to a cloud speech service; receiving automatic speech recognition (ASR) results from the cloud speech service at the mobile device; and comparing the recorded speech with the received ASR results at the mobile device to identify one or more error conditions.
Type: Application
Filed: September 5, 2012
Publication date: March 6, 2014
Applicant: GM GLOBAL TECHNOLOGY OPERATIONS LLC
Inventors: Denis R. Burke, Danilo Gurovich, Daniel E. Rudman, Keith A. Fry, Shane M. McCutchen, Marco T. Carnevale, Mukesh Gupta
-
Publication number: 20140067391
Abstract: A system and method are presented for predicting speech recognition performance using accuracy scores in speech recognition systems within the speech analytics field. A keyword set is selected. Figure of Merit (FOM) is computed for the keyword set. Relevant features that describe the word individually and in relation to other words in the language are computed. A mapping from these features to FOM is learned. This mapping can be generalized via a suitable machine learning algorithm and be used to predict FOM for a new keyword. In at least one embodiment, the predicted FOM may be used to adjust internals of the speech recognition engine to achieve a consistent behavior for all inputs for various settings of confidence values.
Type: Application
Filed: August 30, 2012
Publication date: March 6, 2014
Applicant: INTERACTIVE INTELLIGENCE, INC.
Inventors: Aravind Ganapathiraju, Yingyi Tan, Felix Immanuel Wyss, Scott Allen Randal
-
Patent number: 8666737
Abstract: A noise power estimation system for estimating noise power of each frequency spectral component includes a cumulative histogram generating section for generating a cumulative histogram for each frequency spectral component of a time series signal, in which the horizontal axis indicates index of power level and the vertical axis indicates cumulative frequency and which is weighted by exponential moving average; and a noise power estimation section for determining an estimated value of noise power for each frequency spectral component of the time series signal based on the cumulative histogram.
Type: Grant
Filed: September 14, 2011
Date of Patent: March 4, 2014
Assignee: Honda Motor Co., Ltd.
Inventors: Hirofumi Nakajima, Kazuhiro Nakadai, Yuji Hasegawa
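A minimal sketch of the idea above, for a single frequency bin: a histogram of quantized power levels is updated with an exponential moving average, and the noise estimate is read off at a chosen cumulative fraction. The number of levels, the decay constant, and the median (0.5) read-off fraction are illustrative assumptions.

```python
# EMA-weighted cumulative-histogram noise estimator (one frequency bin).
# Steady background levels dominate the histogram; short loud bursts
# (speech) decay away, so the cumulative read-off tracks the noise floor.

class NoiseEstimator:
    def __init__(self, n_levels=10, alpha=0.1):
        self.hist = [0.0] * n_levels
        self.alpha = alpha

    def update(self, level_index):
        # Exponential moving average: old counts decay each step,
        # the currently observed power level gains weight.
        self.hist = [(1 - self.alpha) * h for h in self.hist]
        self.hist[level_index] += self.alpha

    def estimate(self, fraction=0.5):
        # Walk the cumulative histogram to the requested fraction.
        total = sum(self.hist)
        cum = 0.0
        for i, h in enumerate(self.hist):
            cum += h
            if cum >= fraction * total:
                return i  # index of the estimated noise power level
        return len(self.hist) - 1

est = NoiseEstimator()
for _ in range(20):
    est.update(2)   # steady background noise at level 2
for _ in range(3):
    est.update(8)   # brief loud burst at level 8
noise_level = est.estimate()
```

Because the burst at level 8 carries little cumulative weight, the median read-off still lands on the background level.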
-
Publication number: 20140052443
Abstract: A voice control method of an electronic device is provided. The method includes detecting external voices in the surroundings and changing the detected voices into voice signals; periodically sensing non-vocal physical actions of the user and identifying the action; comparing the identified action with a preset action to determine whether the identified action is the same as the preset action; extracting a voice characteristic of linguistic meaning from the voice signals when the signals are received; determining whether the extracted voice characteristic of linguistic meaning matches voice templates stored in a storage unit; and performing a particular function associated with the voice template when the storage unit stores the voice template corresponding to the voice characteristic of linguistic meaning. The electronic device is also provided.
Type: Application
Filed: November 1, 2012
Publication date: February 20, 2014
Inventor: TZU-CHIAO SUNG
-
Patent number: 8655656
Abstract: A method for assessing intelligibility of speech represented by a speech signal includes providing a speech signal and performing a feature extraction on at least one frame of the speech signal so as to obtain a feature vector for each of the at least one frame of the speech signal. The feature vector is input to a statistical machine learning model so as to obtain an estimated posterior probability of phonemes in the at least one frame as an output including a vector of phoneme posterior probabilities of different phonemes for each of the at least one frame of the speech signal. An entropy estimation is performed on the vector of phoneme posterior probabilities of the at least one frame of the speech signal so as to evaluate intelligibility of the at least one frame of the speech signal. An intelligibility measure is output for the at least one frame of the speech signal.
Type: Grant
Filed: March 4, 2011
Date of Patent: February 18, 2014
Assignee: Deutsche Telekom AG
Inventors: Hamed Ketabdar, Juan-Pablo Ramirez
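The entropy step above has a compact form: a sharply peaked phoneme posterior vector (low entropy) suggests an intelligible frame, while a flat vector (high entropy) suggests the acoustic model cannot tell phonemes apart. The posterior vectors below are made up for illustration; a real system would obtain them from the statistical model mentioned in the abstract.

```python
import math

def entropy(posteriors):
    """Shannon entropy (bits) of a phoneme posterior probability vector."""
    return -sum(p * math.log(p, 2) for p in posteriors if p > 0)

clear_frame = [0.94, 0.02, 0.02, 0.02]   # one phoneme clearly dominates
noisy_frame = [0.25, 0.25, 0.25, 0.25]   # model is maximally uncertain

clear_bits = entropy(clear_frame)
noisy_bits = entropy(noisy_frame)        # 4 equal outcomes -> 2.0 bits
```

A per-utterance intelligibility measure can then be built by averaging (or otherwise pooling) the per-frame entropies.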
-
Publication number: 20140039890
Abstract: The present document relates to methods and systems for encoding an audio signal. The method comprises determining a spectral representation of the audio signal. The determining a spectral representation step may comprise determining modified discrete cosine transform, MDCT, coefficients, or a Quadrature Mirror Filter, QMF, filter bank representation of the audio signal. The method further comprises encoding the audio signal using the determined spectral representation; and classifying parts of the audio signal to be speech or non-speech based on the determined spectral representation. Finally, a loudness measure for the audio signal based on the speech parts is determined.
Type: Application
Filed: April 27, 2012
Publication date: February 6, 2014
Applicant: DOLBY INTERNATIONAL AB
Inventors: Harald H. Mundt, Arijit Biswas, Rolf Meissner
-
Patent number: 8639506
Abstract: Method, system and computer program for determining the matching between a first and a second sampled signal using an improved Dynamic Time Warping algorithm, called Unbounded DTW. It uses a dynamic programming algorithm to find exact start-end alignment points that are unknown a priori; the initial subsampling of the similarity matrix is made via the definition of optimal synchronization points, allowing a very fast process.
Type: Grant
Filed: December 10, 2010
Date of Patent: January 28, 2014
Assignee: Telefonica, S.A.
Inventors: Xavier Anguera Miro, Robert Macrae
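For background, the classic dynamic-programming DTW that the "Unbounded" variant builds on can be written in a few lines. This is the textbook algorithm with fixed start and end points; the patent's contribution (discovering start-end alignment points and subsampling the similarity matrix via synchronization points) is not reproduced here.

```python
# Textbook DTW distance between two 1-D sequences using absolute
# difference as the local cost. cost[i][j] holds the best cumulative
# cost of aligning a[:i] with b[:j].

def dtw_distance(a, b):
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[len(a)][len(b)]

# A time-stretched copy aligns at zero cost.
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])  # -> 0.0
```

The quadratic cost of filling the full matrix is exactly what subsampling schemes such as the one above aim to avoid.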
-
Patent number: 8639509
Abstract: In a confidence computing method and system, a processor may interpret speech signals as a text string or directly receive a text string as input, generate a syntactical parse tree representing the interpreted string and including a plurality of sub-trees which each represents a corresponding section of the interpreted text string, determine for each sub-tree whether the sub-tree is accurate, obtain replacement speech signals for each sub-tree determined to be inaccurate, and provide output based on corresponding text string sections of at least one sub-tree determined to be accurate.
Type: Grant
Filed: July 27, 2007
Date of Patent: January 28, 2014
Assignee: Robert Bosch GmbH
Inventors: Fuliang Weng, Feng Lin, Zhe Feng
-
Patent number: 8639508
Abstract: A method of automatic speech recognition includes receiving an utterance from a user via a microphone that converts the utterance into a speech signal, pre-processing the speech signal using a processor to extract acoustic data from the received speech signal, and identifying at least one user-specific characteristic in response to the extracted acoustic data. The method also includes determining a user-specific confidence threshold responsive to the at least one user-specific characteristic, and using the user-specific confidence threshold to recognize the utterance received from the user and/or to assess confusability of the utterance with stored vocabulary.
Type: Grant
Filed: February 14, 2011
Date of Patent: January 28, 2014
Assignee: General Motors LLC
Inventors: Xufang Zhao, Gaurav Talwar
-
Patent number: 8630860
Abstract: Techniques disclosed herein include systems and methods for open-domain voice-enabled searching that is speaker sensitive. Techniques include using speech information, speaker information, and information associated with a spoken query to enhance open voice search results. This includes integrating a textual index with a voice index to support the entire search cycle. Given a voice query, the system can execute two matching processes simultaneously. This can include a text matching process based on the output of speech recognition, as well as a voice matching process based on characteristics of a caller or user voicing a query. Characteristics of the caller can include output of voice feature extraction and metadata about the call. The system clusters callers according to these characteristics. The system can use specific voice and text clusters to modify speech recognition results, as well as modifying search results.
Type: Grant
Filed: March 3, 2011
Date of Patent: January 14, 2014
Assignee: Nuance Communications, Inc.
Inventors: Shilei Zhang, Shenghua Bao, Wen Liu, Yong Qin, Zhiwei Shuang, Jian Chen, Zhong Su, Qin Shi, William F. Ganong, III
-
Patent number: 8620655
Abstract: A speech processing method, comprising: receiving a speech input which comprises a sequence of feature vectors; determining the likelihood of a sequence of words arising from the sequence of feature vectors using an acoustic model and a language model, comprising: providing an acoustic model for performing speech recognition on an input signal which comprises a sequence of feature vectors, said model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to a feature vector, wherein said speech input is a mismatched speech input which is received from a speaker in an environment which is not matched to the speaker or environment under which the acoustic model was trained; and adapting the acoustic model to the mismatched speech input, the speech processing method further comprising determining the likelihood of a sequence of features occurring in a given language using a language model; and combining the likelihoods determined by the acoustic
Type: Grant
Filed: August 10, 2011
Date of Patent: December 31, 2013
Assignee: Kabushiki Kaisha Toshiba
Inventors: Haitian Xu, Kean Kheong Chin, Mark John Francis Gales
-
Publication number: 20130339018
Abstract: A system and method of verifying the identity of an authorized user in an authorized user group through a voice user interface for enabling secure access to one or more services via a mobile device includes receiving first voice information from a speaker through the voice user interface of the mobile device, calculating a confidence score based on a comparison of the first voice information with a stored voice model associated with the authorized user and specific to the authorized user, interpreting the first voice information as a specific service request, identifying a minimum confidence score for initiating the specific service request, determining whether or not the confidence score exceeds the minimum confidence score, and initiating the specific service request if the confidence score exceeds the minimum confidence score.
Type: Application
Filed: July 27, 2012
Publication date: December 19, 2013
Applicant: SRI INTERNATIONAL
Inventors: Nicolas Scheffer, Yun Lei, Douglas A. Bercow
-
Patent number: 8612223
Abstract: There is provided a voice processing device. The device includes: score calculation unit configured to calculate a score indicating compatibility of a voice signal input on the basis of an utterance of a user with each of plural pieces of intention information indicating each of a plurality of intentions; intention selection unit configured to select the intention information indicating the intention of the utterance of the user among the plural pieces of intention information on the basis of the score calculated by the score calculation unit; and intention reliability calculation unit configured to calculate the reliability with respect to the intention information selected by the intention selection unit on the basis of the score calculated by the score calculation unit.
Type: Grant
Filed: June 17, 2010
Date of Patent: December 17, 2013
Assignee: Sony Corporation
Inventors: Katsuki Minamino, Hitoshi Honda, Yoshinori Maeda, Hiroaki Ogawa
-
Publication number: 20130332163
Abstract: The voiced sound interval classification device comprises a vector calculation unit which calculates, from a power spectrum time series of voice signals, a multidimensional vector series as a vector series of a power spectrum having as many dimensions as the number of microphones, a difference calculation unit which calculates, with respect to each time of the multidimensional vector series, a vector of a difference between the time and the preceding time, a sound source direction estimation unit which estimates, as a sound source direction, a main component of the differential vector, and a voiced sound interval determination unit which determines whether each sound source direction is in a voiced sound interval or a voiceless sound interval by using a predetermined voiced sound index indicative of a likelihood of a voiced sound interval of the voice signal applied at each time.
Type: Application
Filed: January 25, 2012
Publication date: December 12, 2013
Applicant: NEC CORPORATION
Inventor: Yoshifumi Onishi
-
Patent number: 8606578
Abstract: According to some embodiments, a method and apparatus are provided to buffer N audio frames of a plurality of audio frames associated with an audio signal, pre-compute scores for a subset of context dependent models (CDMs), and perform a graphical model search associated with the N audio frames where a score of a context independent model (CIM) associated with a CDM is used in lieu of a score for the CDM when a score for the CDM is needed and has not been pre-computed.
Type: Grant
Filed: June 25, 2009
Date of Patent: December 10, 2013
Assignee: Intel Corporation
Inventors: Michael Eugene Deisher, Tao Ma
-
Publication number: 20130325469Abstract: A method for providing a voice recognition function and an electronic device thereof are provided. The method provides a voice recognition function in an electronic device that includes outputting, when a voice instruction is input, a list of prediction instructions that are candidate instructions similar to the input voice instruction, updating, when a correction instruction correcting the output candidate instructions is input, the list of prediction instructions, and performing, if the correction instruction matches with an instruction of high similarity in the updated list of prediction instructions, a voice recognition function corresponding to the voice instruction.Type: ApplicationFiled: May 24, 2013Publication date: December 5, 2013Applicant: Samsung Electronics Co., Ltd.Inventors: Hee-Woon KIM, Yu-Mi AHN, Seon-Hwa KIM, Ha-Young JEON
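A minimal sketch of building the candidate-instruction list by similarity to the recognized input, using string matching as a stand-in for the device's recognizer; the command set is invented.

```python
import difflib

COMMANDS = ["call mom", "call tom", "open camera", "open calendar"]

def predict(voice_text, n=3, cutoff=0.5):
    # Rank known instructions by similarity to the recognized input;
    # a real system would rescore acoustically rather than on strings.
    return difflib.get_close_matches(voice_text, COMMANDS, n=n, cutoff=cutoff)

candidates = predict("call mon")
```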
-
Patent number: 8600765Abstract: Embodiments of the present invention provide a signal classification method and device, and encoding and decoding methods and devices. The encoding method includes: dividing a current frame into a low-frequency band signal and a high-frequency band signal; attenuating the high-frequency band signal or a to-be-encoded characteristic parameter of the high-frequency band signal according to an energy attenuation value of the low-frequency band signal, where the energy attenuation value indicates energy attenuation of the low-frequency band signal caused by encoding of the low-frequency band signal; and encoding the attenuated high-frequency band signal or the attenuated to-be-encoded characteristic parameter of the high-frequency band signal. The technical solutions according to the embodiments of the present invention can improve the effect of combining the low-frequency band signal and the high-frequency band signal at the decoder.Type: GrantFiled: December 27, 2012Date of Patent: December 3, 2013Assignee: Huawei Technologies Co., Ltd.Inventors: Zexin Liu, Lei Miao, Anisse Taleb
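One way to picture the attenuation step: scale the high-band signal by a gain derived from the low-band energy attenuation so that both bands recombine smoothly at the decoder. The linear gain rule below is an assumption, not the patented formula.

```python
def attenuate_high_band(high_band, low_band_attenuation):
    # Reduce the high band in proportion to the energy the low-band
    # encoder lost (illustrative rule; values are made up).
    gain = max(0.0, 1.0 - low_band_attenuation)
    return [s * gain for s in high_band]

attenuated = attenuate_high_band([1.0, -0.5, 0.25], low_band_attenuation=0.2)
```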
-
Publication number: 20130317821Abstract: Various arrangements for detecting a type of sound, such as speech, are presented. A plurality of audio snippets may be sampled. A period of time may elapse between consecutive audio snippets. A hypothetical test may be performed using the sampled plurality of audio snippets. Such a hypothetical test may include weighting one or more hypothetical values greater than one or more other hypothetical values. Each hypothetical value may correspond to an audio snippet of the plurality of audio snippets. The hypothetical test may further include using at least the greater weighted one or more hypothetical values to determine whether at least one audio snippet of the plurality of audio snippets comprises the type of sound.Type: ApplicationFiled: January 2, 2013Publication date: November 28, 2013Applicant: QUALCOMM INCORPORATEDInventors: Shankar Sadasivam, Minho Jin, Leonard Henry Grokop, Edward Harrison Teague
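The weighted combination of per-snippet values can be sketched as a weighted average compared against a threshold; the weights (favoring the most recent snippet) and the threshold are illustrative, not from the application.

```python
def detect_speech(snippet_scores, weights, threshold=0.5):
    # Weight per-snippet speech-likelihood values and compare the
    # combined value to a threshold.
    combined = sum(s * w for s, w in zip(snippet_scores, weights)) / sum(weights)
    return combined > threshold

speechy = detect_speech([0.2, 0.4, 0.9], weights=[1.0, 1.0, 3.0])
```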
-
Publication number: 20130317820Abstract: An automatic speech recognition dictation application is described that includes a dictation module for performing automatic speech recognition in a dictation session with a speaker user to determine representative text corresponding to input speech from the speaker user. A post-processing module develops a session level metric correlated to verbatim recognition error rate of the dictation session, and determines if recognition performance degraded during the dictation session based on a comparison of the session metric to a baseline metric.Type: ApplicationFiled: May 24, 2012Publication date: November 28, 2013Applicant: NUANCE COMMUNICATIONS, INC.Inventors: Xiaoqiang Xiao, Venkatesh Nagesha
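The degradation check reduces to comparing the session metric against the speaker's baseline; the tolerance value below is an assumption.

```python
def recognition_degraded(session_metric, baseline_metric, tolerance=0.05):
    # Flag a dictation session whose error-correlated metric exceeds
    # the baseline by more than a tolerance (illustrative threshold).
    return session_metric > baseline_metric + tolerance

degraded = recognition_degraded(0.30, baseline_metric=0.20)
```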
-
Patent number: 8595004Abstract: A problem to be solved is to robustly detect a pronunciation variation example and acquire a pronunciation variation rule having a high generalization property, with less effort. The problem can be solved by a pronunciation variation rule extraction apparatus including a speech data storage unit, a base form pronunciation storage unit, a sub word language model generation unit, a speech recognition unit, and a difference extraction unit. The speech data storage unit stores speech data. The base form pronunciation storage unit stores base form pronunciation data representing base form pronunciation of the speech data. The sub word language model generation unit generates a sub word language model from the base form pronunciation data. The speech recognition unit recognizes the speech data by using the sub word language model.Type: GrantFiled: November 27, 2008Date of Patent: November 26, 2013Assignee: NEC CorporationInventor: Takafumi Koshinaka
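The difference-extraction step, aligning the base-form pronunciation against the recognized sub-word sequence and keeping the substitutions as variation-rule candidates, can be sketched with `difflib`; the phone symbols are hypothetical.

```python
import difflib

def variation_rules(base_form, recognized):
    # Align the two sub-word sequences and return each substituted
    # span as a (base, variant) candidate pronunciation-variation rule.
    sm = difflib.SequenceMatcher(a=base_form, b=recognized)
    return [(tuple(base_form[i1:i2]), tuple(recognized[j1:j2]))
            for op, i1, i2, j1, j2 in sm.get_opcodes() if op == "replace"]

rules = variation_rules(["s", "e", "N", "t", "a", "k", "u"],
                        ["s", "e", "N", "d", "a", "k", "u"])
```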
-
Patent number: 8595005Abstract: A computerized method, software, and system for recognizing emotions from a speech signal, wherein statistical and MFCC features are extracted from the speech signal, the MFCC features are sorted to provide a basis for comparison between the speech signal and reference samples, the statistical and MFCC features are compared between the speech signal and reference samples, a scoring system is used to compare relative correlation to different emotions, a probable emotional state is assigned to the speech signal based on the scoring system, and the probable emotional state is communicated to a user.Type: GrantFiled: April 22, 2011Date of Patent: November 26, 2013Assignee: Simple Emotion, Inc.Inventors: Akash Krishnan, Matthew Fernandez
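A toy version of the scoring system: score each reference emotion by negative distance between the utterance's feature vector and a per-emotion reference vector, then take the best. The two-dimensional features and reference centroids are made up; real MFCC statistics would take their place.

```python
import math

def score_emotions(features, references):
    # Higher (less negative) score means a closer match to that
    # emotion's reference vector.
    scores = {emo: -math.dist(features, ref) for emo, ref in references.items()}
    return max(scores, key=scores.get), scores

state, scores = score_emotions(
    [0.9, 0.2],
    {"angry": [1.0, 0.1], "calm": [0.1, 0.8]},
)
```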
-
Patent number: 8595006Abstract: A speech recognition method and system, includes receiving in a first noise environment a speech input having a sequence of observations; determining a likelihood of a sequence of words arising from the sequence of observations using an acoustic model trained to recognize speech in a second noise environment, the model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to an observation; and adapting the model trained in the second environment to that of the first environment.Type: GrantFiled: March 26, 2010Date of Patent: November 26, 2013Assignee: Kabushiki Kaisha ToshibaInventors: Haitian Xu, Mark John Francis Gales
-
Patent number: 8589152Abstract: A voice detection device includes a band-based power calculation unit that calculates the total signal power (sub-band power) of the signals entering from the microphones for each preset frequency width (sub-band). The voice detection device also includes a band-based noise estimation unit that estimates the sub-band noise power, and a sub-band SNR calculation unit that calculates an SNR for each sub-band and outputs the largest of the sub-band SNRs as the SNR for the microphone of interest. The voice detection device further includes a voice/non-voice decision unit that makes the voice/non-voice decision using the SNR for the microphone of interest.Type: GrantFiled: May 26, 2009Date of Patent: November 19, 2013Assignee: NEC CorporationInventors: Tadashi Emori, Masanori Tsujikawa
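The per-sub-band SNR and max-selection step is directly computable; the power values and the voice/non-voice threshold below are illustrative.

```python
def microphone_snr(subband_powers, subband_noise):
    # SNR per sub-band, with the maximum taken as the SNR for the
    # microphone of interest, as the abstract describes.
    snrs = [p / n for p, n in zip(subband_powers, subband_noise)]
    return max(snrs)

snr = microphone_snr([4.0, 9.0, 1.0], [2.0, 1.5, 1.0])
is_voice = snr > 3.0  # hypothetical voice/non-voice decision threshold
```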
-
Publication number: 20130304468Abstract: A method for contextual voice query dilation in a Spoken Web search includes determining a context in which a voice query is created, generating a set of multiple voice query terms based on the context and information derived by a speech recognizer component pertaining to the voice query, and processing the set of query terms with at least one dilation operator to produce a dilated set of queries. A method for performing a search on a voice query is also provided, including generating a set of multiple query terms based on information derived by a speech recognizer component processing a voice query, processing the set with multiple dilation operators to produce multiple dilated sub-sets of query terms, selecting at least one query term from each dilated sub-set to compose a query set, and performing a search on the query set.Type: ApplicationFiled: August 8, 2012Publication date: November 14, 2013Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Nitendra Rajput, Kundan Shrivastava
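A sketch of applying dilation operators to a recognized query. The two operators here, dropping the last term and appending a wildcard, are invented stand-ins for the patent's operators.

```python
def drop_last(terms):
    # Dilation operator: broaden the query by removing its last term.
    return [" ".join(terms[:-1])] if len(terms) > 1 else []

def add_wildcard(terms):
    # Dilation operator: broaden the query with a trailing wildcard.
    return [" ".join(terms) + " *"]

def dilate(query_terms, operators):
    # Union the output of every dilation operator into one query set.
    dilated = set()
    for op in operators:
        dilated.update(op(query_terms))
    return dilated

queries = dilate(["crop", "price"], [drop_last, add_wildcard])
```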
-
Patent number: 8583436Abstract: A word category estimation apparatus (100) includes a word category model (5) which is formed from a probability model having a plurality of kinds of information about a word category as features, and includes information about an entire word category graph as at least one of the features. A word category estimation unit (4) receives the word category graph of a speech recognition hypothesis to be processed, computes scores by referring to the word category model for respective arcs that form the word category graph, and outputs a word category sequence candidate based on the scores.Type: GrantFiled: December 19, 2008Date of Patent: November 12, 2013Assignee: NEC CorporationInventors: Hitoshi Yamamoto, Kiyokazu Miki
-
Publication number: 20130297306Abstract: An adaptive equalization system that adjusts the spectral shape of a speech signal based on an intelligibility measurement of the speech signal may improve the intelligibility of the output speech signal. Such an adaptive equalization system may include a speech intelligibility measurement module, a spectral shape adjustment module, and an adaptive equalization module. The speech intelligibility measurement module is configured to calculate a speech intelligibility measurement of a speech signal. The spectral shape adjustment module is configured to generate a weighted long-term speech curve based on a first predetermined long-term average speech curve, a second predetermined long-term average speech curve, and the speech intelligibility measurement. The adaptive equalization module is configured to adapt equalization coefficients for the speech signal based on the weighted long-term speech curve.Type: ApplicationFiled: May 4, 2012Publication date: November 7, 2013Applicant: QNX Software Systems LimitedInventors: Phillip Alan Hetherington, Xueman Li
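The weighted long-term curve reads naturally as an intelligibility-weighted blend of the two predetermined curves; the linear mix below is an assumption about the combination rule, and the curve values are invented.

```python
def weighted_curve(curve_a, curve_b, intelligibility):
    # Blend two predetermined long-term average speech curves with the
    # intelligibility measurement (clamped to 0..1) as the weight; the
    # equalizer would then adapt its coefficients toward the blend.
    w = min(max(intelligibility, 0.0), 1.0)
    return [w * a + (1.0 - w) * b for a, b in zip(curve_a, curve_b)]

target = weighted_curve([0.0, -3.0, -6.0], [2.0, 0.0, -2.0], intelligibility=0.25)
```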
-
Patent number: 8577678Abstract: A speech recognition system according to the present invention includes a sound source separating section which separates mixed speeches from multiple sound sources from one another; a mask generating section which generates a soft mask which can take continuous values between 0 and 1 for each frequency spectral component of a separated speech signal using distributions of speech signal and noise against separation reliability of the separated speech signal; and a speech recognizing section which recognizes speeches separated by the sound source separating section using soft masks generated by the mask generating section.Type: GrantFiled: March 10, 2011Date of Patent: November 5, 2013Assignee: Honda Motor Co., Ltd.Inventors: Kazuhiro Nakadai, Toru Takahashi, Hiroshi Okuno
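A soft mask taking continuous values between 0 and 1 can be sketched as a logistic function of the separation reliability. The patent derives its mask from distributions of speech and noise against that reliability, so the logistic form, midpoint, and slope here are pure assumptions.

```python
import math

def soft_mask(reliability, midpoint=0.5, slope=10.0):
    # Map separation reliability to a mask weight in (0, 1): reliable
    # components pass nearly unattenuated, unreliable ones are damped.
    return 1.0 / (1.0 + math.exp(-slope * (reliability - midpoint)))

low, high = soft_mask(0.1), soft_mask(0.9)
```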
-
Publication number: 20130289987Abstract: A system and method are presented for negative example based performance improvements for speech recognition. The presently disclosed embodiments address identified false positives and the identification of negative examples of keywords in an Automatic Speech Recognition (ASR) system. Various methods may be used to identify negative examples of keywords. Such methods may include, for example, human listening and learning possible negative examples from a large domain specific text source. In at least one embodiment, negative examples of keywords may be used to improve the performance of an ASR system by reducing false positives.Type: ApplicationFiled: April 26, 2013Publication date: October 31, 2013Applicant: Interactive Intelligence, Inc.Inventors: Aravind Ganapathiraju, Ananth Nagaraja Iyer, Felix Immanuel Wyss
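A toy illustration of using a negative example to suppress a false positive: accept the keyword only when the hypothesis matches it better than any known confusable phrase. String similarity stands in for the ASR system's acoustic scores, and the phrases are invented.

```python
import difflib

def spot_keyword(hypothesis, keyword, negative_examples):
    # Compare the hypothesis against the keyword and against each
    # negative example; reject when a negative example fits better.
    kw = difflib.SequenceMatcher(a=hypothesis, b=keyword).ratio()
    neg = max((difflib.SequenceMatcher(a=hypothesis, b=n).ratio()
               for n in negative_examples), default=0.0)
    return kw > neg

hit = spot_keyword("cancel", "cancel", ["can sell"])
miss = spot_keyword("can sell", "cancel", ["can sell"])
```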
-
Patent number: 8571865Abstract: Systems, methods performed by data processing apparatus and computer storage media encoded with computer programs for receiving information relating to (i) a communication device that has received an utterance and (ii) a voice associated with the received utterance, comparing the received voice information with voice signatures in a comparison group, the comparison group including one or more individuals identified from one or more connections arising from the received information relating to the communication device, attempting to identify the voice associated with the utterance as matching one of the individuals in the comparison group, and based on a result of the attempt to identify, selectively providing the communication device with access to one or more resources associated with the matched individual.Type: GrantFiled: August 10, 2012Date of Patent: October 29, 2013Assignee: Google Inc.Inventor: Philip Hewinson
-
Publication number: 20130282374Abstract: A speech recognition device has: hypothesis search means which searches for an optimal solution of inputted speech data by generating a hypothesis which is a bundle of words which are searched for as recognition result candidates; self-repair decision means which calculates a self-repair likelihood of a word or a word sequence included in the hypothesis which is being searched for by the hypothesis search means, and decides whether or not self-repair of the word or the word sequence is performed; and transparent word hypothesis generation means which, when the self-repair decision means decides that the self-repair is performed, generates a transparent word hypothesis which is a hypothesis which regards as a transparent word a word or a word sequence included in an un-repaired interval related to the word or the word sequence, and the hypothesis search means searches hypotheses for an optimal solution, the hypotheses including as search target hypotheses the transparent word hypothesis generated by the transparent word hypothesis generation means.Type: ApplicationFiled: January 5, 2012Publication date: October 24, 2013Applicant: NEC CORPORATIONInventors: Koji Okabe, Ken Hanazawa, Seiya Osada
-
Patent number: 8566091Abstract: A speech recognition system is provided for selecting, via a speech input, an item from a list of items. The speech recognition system detects a first speech input, recognizes the first speech input, compares the recognized first speech input with the list of items and generates a first candidate list of best matching items based on the comparison result. The system then informs the speaker of at least one of the best matching items of the first candidate list for a selection of an item by the speaker. If the intended item is not one of the best matching items presented to the speaker, the system then detects a second speech input, recognizes the second speech input, and generates a second candidate list of best matching items taking into account the comparison result obtained with the first speech input.Type: GrantFiled: December 12, 2007Date of Patent: October 22, 2013Assignee: Nuance Communications, Inc.Inventors: Andreas Löw, Lars König, Christian Hillebrecht
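The two-pass flow can be sketched by excluding already-rejected items when building the second candidate list, so the second result takes the first comparison into account; the item list and string matching are illustrative stand-ins for the recognizer.

```python
import difflib

ITEMS = ["Berlin", "Bern", "Munich", "Hamburg"]

def candidates(recognized, exclude=()):
    # Best-matching items for a recognized speech input, skipping any
    # items the speaker has already rejected.
    pool = [item for item in ITEMS if item not in exclude]
    return difflib.get_close_matches(recognized, pool, n=2, cutoff=0.4)

first = candidates("Bern")                        # presented to the speaker
second = candidates("Bern", exclude=[first[0]])   # top item was rejected
```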