Preliminary Matching Patents (Class 704/252)
  • Patent number: 8521523
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting training data. In one aspect, a method comprises: selecting a target out of vocabulary rate; selecting a target percentage of user sessions; and determining a minimum training data freshness for a vocabulary of words, the minimum training data freshness corresponding to the target percentage of user sessions experiencing the target out of vocabulary rate.
    Type: Grant
    Filed: August 24, 2012
    Date of Patent: August 27, 2013
    Assignee: Google Inc.
    Inventors: Maryam Garrett, Ciprian I. Chelba
  • Patent number: 8515746
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting training data. In an aspect, a method comprises: selecting a target out of vocabulary rate; selecting a target percentage of user sessions; and determining a minimum training data collection duration for a vocabulary of words, the minimum training data collection duration corresponding to the target percentage of user sessions experiencing the target out of vocabulary rate.
    Type: Grant
    Filed: August 24, 2012
    Date of Patent: August 20, 2013
    Assignee: Google Inc.
    Inventors: Maryam Garrett, Ciprian I. Chelba
  • Patent number: 8515745
    Abstract: Methods, systems, and apparatus for selecting training data. In an aspect, a method comprises: obtaining search session data comprising search sessions that include search queries, wherein each search query comprises words; determining a threshold out of vocabulary rate indicating a rate at which a word in a search query is not included in a vocabulary; determining a threshold session out of vocabulary rate, the session out of vocabulary rate indicating a rate at which search sessions have an out of vocabulary rate that meets the threshold out of vocabulary rate; selecting a vocabulary of words that, for a set of test data, has a session out of vocabulary rate that meets the threshold session out of vocabulary rate, the vocabulary of words being selected from the one or more words included in each of the search queries included in the search sessions.
    Type: Grant
    Filed: August 24, 2012
    Date of Patent: August 20, 2013
    Assignee: Google Inc.
    Inventors: Maryam Garrett, Ciprian I. Chelba
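The three Garrett/Chelba patents above all turn on the same quantity: the fraction of search sessions whose out-of-vocabulary (OOV) rate exceeds a target. A minimal sketch of that vocabulary-selection loop follows; the greedy frequency ordering, function names, and data layout (sessions as lists of queries, queries as lists of words) are illustrative assumptions, not taken from the patents.

```python
from collections import Counter

def session_oov_rate(session, vocab):
    """Fraction of words across a session's queries that fall outside vocab."""
    words = [w for query in session for w in query]
    if not words:
        return 0.0
    return sum(w not in vocab for w in words) / len(words)

def select_vocabulary(sessions, max_oov_rate, max_session_oov_rate):
    """Grow a frequency-ranked vocabulary until the share of sessions whose
    OOV rate exceeds max_oov_rate drops to max_session_oov_rate or below."""
    counts = Counter(w for s in sessions for q in s for w in q)
    vocab = set()
    for word, _ in counts.most_common():
        vocab.add(word)
        bad = sum(session_oov_rate(s, vocab) > max_oov_rate for s in sessions)
        if bad / len(sessions) <= max_session_oov_rate:
            break
    return vocab
```

The same session-level OOV statistic, computed over data windows of varying age or duration, yields the "freshness" and "collection duration" variants claimed in 8521523 and 8515746.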
  • Patent number: 8510097
    Abstract: Computer methods, apparatus and articles of manufacture therefor are disclosed for text-characterization using a finite state transducer that along each path accepts on a first side an n-gram of text-characterization (e.g., a language or a topic) and outputs on a second side a sequence of symbols identifying one or more text-characterizations from a set of text-characterizations. The finite state transducer is applied to input data. For each n-gram accepted by the finite state transducer, a frequency counter associated with the n-gram of the one or more text-characterizations in the set of text-characterizations is incremented. The input data is classified as one or more text-characterizations from the set of text-characterizations using the frequency counters associated therewith.
    Type: Grant
    Filed: December 18, 2008
    Date of Patent: August 13, 2013
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Lauri J Karttunen, Ji Fang
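The counting scheme in 8510097 can be approximated without a transducer: a plain dictionary standing in for the FST's accepting paths maps each n-gram to the labels it identifies, and per-label frequency counters are incremented as n-grams are accepted. This is a simplified sketch (the dict index, trigram default, and toy corpora are assumptions; the patent's FST handles this more compactly).

```python
from collections import Counter

def char_ngrams(text, n=3):
    """All overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def build_ngram_index(labeled_corpora, n=3):
    """Map each n-gram to the set of labels whose corpus contains it
    (a dict standing in for the transducer's accepting paths)."""
    index = {}
    for label, text in labeled_corpora.items():
        for g in char_ngrams(text, n):
            index.setdefault(g, set()).add(label)
    return index

def classify(text, index, n=3):
    """Increment a per-label frequency counter for every accepted n-gram
    and return the label(s) with the highest count, or None."""
    counts = Counter()
    for g in char_ngrams(text, n):
        for label in index.get(g, ()):
            counts[label] += 1
    if not counts:
        return None
    best = max(counts.values())
    return {label for label, c in counts.items() if c == best}
```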
  • Patent number: 8504369
    Abstract: A device, for use by a transcriptionist in a transcription editing system for editing transcriptions dictated by speakers, includes, in combination, a monitor configured to display visual text of transcribed dictations, an audio mechanism configured to cause playback of portions of an audio file associated with a dictation, and a cursor-control module coupled to the audio mechanism and to the monitor and configured to cause the monitor to display multiple cursors in the text.
    Type: Grant
    Filed: June 2, 2004
    Date of Patent: August 6, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Benjamin Chigier, Edward A. Brody, Daniel Edward Chernin, Roger S. Zimmerman
  • Patent number: 8504365
    Abstract: Disclosed herein are systems, methods, and tangible computer readable-media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrates little variance over time or the samples are identical, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
    Type: Grant
    Filed: April 11, 2008
    Date of Patent: August 6, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst Schroeter
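The core liveness test in 8504365 is that two genuine utterances of the same phrase never match exactly. A toy sketch of that check, assuming samples are already aligned feature vectors and using mean squared difference as the variance measure (both are my assumptions, not the patent's):

```python
def verify_liveness(samples, min_variance=1e-3):
    """Deny verification when repeated samples of the same phrase are
    (near-)identical, suggesting a replayed or synthesized recording."""
    diffs = []
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            a, b = samples[i], samples[j]
            # Mean squared difference between two aligned feature vectors.
            diffs.append(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))
    # Natural speech should show at least min_variance between any two takes.
    return all(d >= min_variance for d in diffs)
```

Raising `min_variance` corresponds to the claimed embodiment in which the variance threshold is tightened when greater authentication certainty is needed.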
  • Patent number: 8498865
    Abstract: An enhanced speech recognition system and method are provided that may be used with a voice recognition wireless communication system. The enhanced speech recognition system and method take advantage of group to group calling statistics to improve the recognition of names by the speech recognition system.
    Type: Grant
    Filed: February 17, 2011
    Date of Patent: July 30, 2013
    Assignee: Vocera Communications, Inc.
    Inventor: Robert E. Shostak
  • Publication number: 20130158999
    Abstract: A voice recognition apparatus creates a voice recognition dictionary of words which are cut out from address data constituting words that are a voice recognition target, and which have an occurrence frequency not less than a predetermined value, compares a time series of acoustic features of an input voice with the voice recognition dictionary, selects the most likely word string as the input voice from the voice recognition dictionary, carries out partial matching between the selected word string and the address data, and outputs the word that partially matches as a voice recognition result.
    Type: Application
    Filed: November 30, 2010
    Publication date: June 20, 2013
    Applicant: Mitsubishi Electric Corporation
    Inventors: Yuzo Maruta, Jun Ishii
  • Patent number: 8468012
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.
    Type: Grant
    Filed: May 26, 2010
    Date of Patent: June 18, 2013
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Trausti Kristjansson
  • Patent number: 8463608
    Abstract: A method and apparatus for updating a speech model on a multi-user speech recognition system with a personal speech model for a single user. A speech recognition system, for instance in a car, can include a generic speech model for comparison with the user speech input. A way of identifying a personal speech model, for instance in a mobile phone, is connected to the system. A mechanism is included for receiving personal speech model components, for instance a BLUETOOTH connection. The generic speech model is updated using the received personal speech model components. Speech recognition can then be performed on user speech using the updated generic speech model.
    Type: Grant
    Filed: March 12, 2012
    Date of Patent: June 11, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Barry Neil Dow, Eric William Janke, Daniel Lee Yuk Cheung, Benjamin Terrick Staniford
  • Patent number: 8442824
    Abstract: Device, system, and method of liveness detection using voice biometrics. For example, a method comprises: generating a first matching score based on a comparison between: (a) a voice-print from a first text-dependent audio sample received at an enrollment stage, and (b) a second text-dependent audio sample received at an authentication stage; generating a second matching score based on a text-independent audio sample; and generating a liveness score by taking into account at least the first matching score and the second matching score.
    Type: Grant
    Filed: November 25, 2009
    Date of Patent: May 14, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Almog Aley-Raz, Nir Moshe Krause, Michael Itzhak Salmon, Ran Yehoshua Gazit
  • Patent number: 8438029
    Abstract: Disclosed are apparatus and methods for generating synthesized utterances. A computing device can receive speech data corresponding to spoken utterances of a particular speaker. Textual elements of an input text corresponding to the speech data can be recognized. Confidence levels associated with the recognized textual elements can be determined. Speech-synthesis parameters of decision trees can be adapted based on the speech data, recognized textual elements, and associated confidence levels. Each adapted decision tree can map individual elements of a text to individual of the speech-synthesis parameters. A second input text can be received. The second input text can be mapped to speech-synthesis parameters using the adapted decision trees. A synthesized spoken utterance can be generated corresponding to the second input text using the speech-synthesis parameters. At least some of the speech-synthesis parameters are configured to simulate the particular speaker.
    Type: Grant
    Filed: August 22, 2012
    Date of Patent: May 7, 2013
    Assignee: Google Inc.
    Inventors: Matthew Nicholas Stuttle, Byungha Chun
  • Patent number: 8433568
    Abstract: A method for measuring speech intelligibility includes inputting a speech waveform to a system. At least one acoustic feature is extracted from the waveform. From the acoustic feature, at least one phoneme is segmented. At least one acoustic correlate measure is extracted from the at least one phoneme and at least one intelligibility measure is determined. The at least one acoustic correlate measure is mapped to the at least one intelligibility measure.
    Type: Grant
    Filed: March 29, 2010
    Date of Patent: April 30, 2013
    Assignee: Cochlear Limited
    Inventors: Lee Krause, Mark Skowranski, Bonny Banerjee
  • Patent number: 8433567
    Abstract: A method, system, and computer program product for compensation of intra-speaker variability in speaker diarization are provided. The method includes: dividing a speech session into segments of duration less than an average duration between speaker change; parameterizing each segment by a time dependent probability density function supervector, for example, using a Gaussian Mixture Model; computing a difference between successive segment supervectors; and computing a scatter measure such as a covariance matrix of the difference as an estimate of intra-speaker variability. The method further includes compensating the speech session for intra-speaker variability using the estimate of intra-speaker variability.
    Type: Grant
    Filed: April 8, 2010
    Date of Patent: April 30, 2013
    Assignee: International Business Machines Corporation
    Inventor: Hagai Aronowitz
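The estimation step in 8433567 maps directly onto two array operations: successive differences of the segment supervectors, then their covariance. A brief sketch, with whitening shown as one plausible compensation (the GMM supervector extraction itself is omitted, and the whitening form is my assumption):

```python
import numpy as np

def intra_speaker_variability(supervectors):
    """Estimate intra-speaker variability as the covariance of differences
    between successive segment supervectors (rows of `supervectors`)."""
    diffs = np.diff(supervectors, axis=0)   # successive segment differences
    return np.cov(diffs, rowvar=False)      # scatter measure of the differences

def compensate(supervectors, variability, eps=1e-6):
    """Whiten the session supervectors with the estimated variability
    (one way to realize the patent's compensation step)."""
    d = variability.shape[0]
    inv_sqrt = np.linalg.inv(np.linalg.cholesky(variability + eps * np.eye(d)))
    return supervectors @ inv_sqrt.T
```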
  • Patent number: 8417526
    Abstract: One or more embodiments include a speech recognition learning system for improved speech recognition. The learning system may include a speech optimizing system. The optimizing system may receive a first stimulus data package including spoken utterances having at least one phoneme, and contextual information. A number of result data packages may be retrieved which include stored spoken utterances and contextual information. A determination may be made as to whether the first stimulus data package requires improvement. A second stimulus data package may be generated based on the determination. A number of speech recognition implementation rules for implementing the second stimulus data package may be received. The rules may be associated with the contextual information. A determination may be made as to whether the second stimulus data package requires further improvement.
    Type: Grant
    Filed: March 13, 2009
    Date of Patent: April 9, 2013
    Assignee: Adacel, Inc.
    Inventor: Francois Bourdon
  • Patent number: 8407057
    Abstract: A machine, system and method for user-guided teaching and modifications of voice commands and actions to be executed by a conversational learning system. The machine includes a system bus for communicating data and control signals received from the conversational learning system to a computer system, a vehicle data and control bus for connecting devices and sensors in the machine, a bridge module for connecting the vehicle data and control bus to the system bus, machine subsystems coupled to the vehicle data and control bus having a respective user interface for receiving a voice command or input signal from a user, a memory coupled to the system bus for storing action command sequences learned for a new voice command and a processing unit coupled to the system bus for automatically executing the action command sequences learned when the new voice command is spoken.
    Type: Grant
    Filed: January 21, 2009
    Date of Patent: March 26, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Liam D. Comerford, Mahesh Viswanathan
  • Patent number: 8401854
    Abstract: A system and method is provided for recognizing a speech input and selecting an entry from a list of entries. The method includes recognizing a speech input. A fragment list of fragmented entries is provided and compared to the recognized speech input to generate a candidate list of best matching entries based on the comparison result. The system includes a speech recognition module, and a database for storing the list of entries and the fragmented list. The speech recognition module may obtain the fragmented list from the database and store a candidate list of best matching entries in memory. A display may also be provided to allow the user to select from a list of best matching entries.
    Type: Grant
    Filed: January 16, 2009
    Date of Patent: March 19, 2013
    Assignee: Nuance Communications, Inc.
    Inventor: Markus Schwarz
  • Patent number: 8396713
    Abstract: A method (and system) of handling out-of-grammar utterances includes building a statistical language model for a dialog state, generating sentences and semantic interpretations for the sentences using finite state grammar, building a statistical action classifier, receiving user input, carrying out recognition with the finite state grammar, carrying out recognition with the statistical language model, using the statistical action classifier to find semantic interpretations, comparing an output from the finite state grammar and an output from the statistical language model, deciding which output of the output from the finite state grammar and the output from the statistical language model to keep as a final recognition output, selecting the final recognition output, and outputting the final recognition result, wherein the statistical action classifier, the finite state grammar and the statistical language model are used in conjunction to carry out speech recognition and interpretation.
    Type: Grant
    Filed: April 30, 2007
    Date of Patent: March 12, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Vaibhava Goel, Ramesh Gopinath, Ea-Ee Jan, Karthik Visweswariah
  • Patent number: 8392193
    Abstract: A method for performing speech recognition includes receiving a voice input and generating at least one possible result corresponding to the voice input. The method may also include calculating a value for the speech recognition result and comparing the calculated value to a particular portion of the speech recognition result. The method may further include retrieving information based on one or more factors associated with the voice input and using the retrieved information to determine a likelihood that the speech recognition result is correct.
    Type: Grant
    Filed: June 1, 2004
    Date of Patent: March 5, 2013
    Assignee: Verizon Business Global LLC
    Inventors: Paul T. Schultz, Robert A. Sartini
  • Patent number: 8386251
    Abstract: A speech recognition system is provided with iteratively refined multiple passes through the received data to enhance the accuracy of the results by introducing constraints and adaptation from initial passes into subsequent recognition operations. The multiple passes are performed on an initial utterance received from a user. The iteratively enhanced subsequent passes are also performed on following utterances received from the user increasing an overall system efficiency and accuracy.
    Type: Grant
    Filed: June 8, 2009
    Date of Patent: February 26, 2013
    Assignee: Microsoft Corporation
    Inventors: Nikko Strom, Julian Odell, Jon Hamaker
  • Patent number: 8374869
    Abstract: An utterance verification method for an isolated word N-best speech recognition result includes: calculating log likelihoods of a context-dependent phoneme and an anti-phoneme model based on an N-best speech recognition result for an input utterance; measuring a confidence score of an N-best speech-recognized word using the log likelihoods; calculating distance between phonemes for the N-best speech-recognized word; comparing the confidence score with a threshold and the distance with a predetermined mean of distances; and accepting the N-best speech-recognized word when the compared results for the confidence score and the distance correspond to acceptance.
    Type: Grant
    Filed: August 4, 2009
    Date of Patent: February 12, 2013
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Jeom Ja Kang, Yunkeun Lee, Jeon Gue Park, Ho-Young Jung, Hyung-Bae Jeon, Hoon Chung, Sung Joo Lee, Euisok Chung, Ji Hyun Wang, Byung Ok Kang, Ki-young Park, Jong Jin Kim
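The acceptance test in 8374869 combines two checks: a log-likelihood-ratio confidence score against a threshold, and an inter-phoneme distance against a predetermined mean. A minimal sketch, assuming per-phoneme log likelihoods are already available and that the confidence score is their averaged ratio (the averaging and all names are illustrative assumptions):

```python
def confidence_score(ll_phoneme, ll_anti):
    """Log-likelihood ratio of the context-dependent phoneme models against
    the anti-phoneme models, averaged over the word's phonemes."""
    ratios = [p - a for p, a in zip(ll_phoneme, ll_anti)]
    return sum(ratios) / len(ratios)

def verify_word(ll_phoneme, ll_anti, inter_phoneme_distance,
                score_threshold, mean_distance):
    """Accept an N-best hypothesis only when both its confidence score and
    its inter-phoneme distance clear their respective references."""
    score = confidence_score(ll_phoneme, ll_anti)
    return score >= score_threshold and inter_phoneme_distance >= mean_distance
```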
  • Patent number: 8355914
    Abstract: A method for selecting text created in a mobile terminal by word and correcting it or changing it to another word, and a mobile terminal implementing the same are disclosed. The mobile terminal includes: a display unit to display one or more words of text, and to display tags for each of the one or more words; an input unit to select at least one of the tagged one or more words as selected one word; and a controller to display candidate words having a similar pronunciation to that of the word selected via the input unit, select one of the candidate words as selected one candidate word, and change the selected one word from the text to the selected one candidate word.
    Type: Grant
    Filed: April 17, 2009
    Date of Patent: January 15, 2013
    Assignee: LG Electronics Inc.
    Inventors: Jae-Min Joh, Jae-Do Kwak, Jong-Keun Youn
  • Patent number: 8352266
    Abstract: The invention provides a system and method for improving speech recognition. A computer software system is provided for implementing the system and method. A user of the computer software system may speak to the system directly and the system may respond, in spoken language, with an appropriate response. Grammar rules may be generated automatically from sample utterances when implementing the system for a particular application. Dynamic grammar rules may also be generated during interaction between the user and the system. In addition to arranging searching order of grammar files based on a predetermined hierarchy, a dynamically generated searching order based on history of contexts of a single conversation may be provided for further improved speech recognition.
    Type: Grant
    Filed: March 8, 2011
    Date of Patent: January 8, 2013
    Assignee: Inago Corporation
    Inventors: Gary Farmaner, Ron DiCarlantonio, Huw Leonard
  • Patent number: 8352261
    Abstract: A communication system includes at least one transmitting device and at least one receiving device, one or more network systems for connecting the transmitting device to the receiving device, and an automatic speech recognition (“ASR”) system, including an ASR engine. A user speaks an utterance into the transmitting device, and the recorded speech audio is sent to the ASR engine. The ASR engine returns intermediate transcription results to the transmitting device, which displays the intermediate transcription results in real-time to the user. The intermediate transcription results are also correlated by utterance fragment to final transcription results and displayed to the user. The user may use the information thus presented to make decisions as to whether to edit the final transcription results or to speak the utterance again, thereby repeating the process. The intermediate transcription results may also be used by the user to edit the final transcription results.
    Type: Grant
    Filed: March 9, 2009
    Date of Patent: January 8, 2013
    Assignee: Canyon IP Holdings, LLC
    Inventors: James Richard Terrell, II, Marc White
  • Patent number: 8346554
    Abstract: A method for automatic speech recognition includes determining for an input signal a plurality of scores representative of certainties that the input signal is associated with corresponding states of a speech recognition model, using the speech recognition model and the determined scores to compute an average signal, computing a difference value representative of a difference between the input signal and the average signal, and processing the input signal in accordance with the difference value.
    Type: Grant
    Filed: September 15, 2010
    Date of Patent: January 1, 2013
    Assignee: Nuance Communications, Inc.
    Inventor: Igor Zlokarnik
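The average-signal computation in 8346554 can be read as a score-weighted combination of the model states' expected observations, with the difference value measuring how far the input frame sits from that reconstruction. A sketch under that reading (state means as vectors and Euclidean distance are my assumptions):

```python
import math

def average_signal(state_means, scores):
    """Model-based reconstruction: weight each state's mean observation by
    the certainty that the input frame belongs to that state."""
    total = sum(scores)
    weights = [s / total for s in scores]
    dim = len(state_means[0])
    return [sum(w * m[d] for w, m in zip(weights, state_means))
            for d in range(dim)]

def difference_value(frame, state_means, scores):
    """Euclidean distance between the input frame and its reconstruction;
    a large value can flag the frame for special processing."""
    avg = average_signal(state_means, scores)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(frame, avg)))
```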
  • Patent number: 8332225
    Abstract: Techniques to create and share custom voice fonts are described. An apparatus may include a preprocessing component to receive voice audio data and a corresponding text script from a client and to process the voice audio data to produce prosody labels and a rich script. The apparatus may further include a verification component to automatically verify the voice audio data and the text script. The apparatus may further include a training component to train a custom voice font from the verified voice audio data and rich script and to generate custom voice font data usable by the TTS component. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 4, 2009
    Date of Patent: December 11, 2012
    Assignee: Microsoft Corporation
    Inventors: Sheng Zhao, Zhi Li, Shenghao Qin, Chiwei Che, Jingyang Xu, Binggong Ding
  • Patent number: 8326626
    Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: December 4, 2012
    Assignee: West Corporation
    Inventors: Mark J. Pettay, Fonda J. Narke
  • Patent number: 8321197
    Abstract: System and method for electronically identifying and analyzing the type and frequency of errors and mismatches in a stenographically or voice written text against a stored master file and dynamically creating personalized user feedback, drills, and practice based on identified errors and mismatches from within the context of the stored master file. The system provides the user with a plurality of methods to enter a text file for error identification and analysis including both realtime and non-realtime input. The text input is then compared to a stored master file through a word-by-word iterative process which produces a comparison of writing input and stored master wherein errors and mismatches are identified and grouped in a plurality of pre-defined and user-selected categories, each of which is color-coded to facilitate pattern recognition of type and frequency of errors and mismatches in the submitted writing.
    Type: Grant
    Filed: October 17, 2007
    Date of Patent: November 27, 2012
    Inventors: Teresa Ruth Gaudet, Gordon James Gaudet, Gary Kenneth Pollreis, Sandra Joyce Natale
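The word-by-word iterative comparison in 8321197 is essentially a sequence alignment of the writing against the stored master, with mismatches grouped into categories. A sketch using the standard library's `difflib` (the three category names are illustrative stand-ins for the patent's pre-defined and user-selected categories):

```python
import difflib

def compare_to_master(written, master):
    """Word-by-word comparison of a writing against the stored master file,
    grouping mismatches into dropped / inserted / substituted categories."""
    errors = {"dropped": [], "inserted": [], "substituted": []}
    matcher = difflib.SequenceMatcher(a=master, b=written)
    for op, a0, a1, b0, b1 in matcher.get_opcodes():
        if op == "delete":
            errors["dropped"].extend(master[a0:a1])
        elif op == "insert":
            errors["inserted"].extend(written[b0:b1])
        elif op == "replace":
            # Pair up substitutions; leftovers count as drops or insertions.
            pairs = min(a1 - a0, b1 - b0)
            errors["substituted"].extend(
                zip(master[a0:a0 + pairs], written[b0:b0 + pairs]))
            errors["dropped"].extend(master[a0 + pairs:a1])
            errors["inserted"].extend(written[b0 + pairs:b1])
    return errors
```

Per-category counts from this structure would drive the color-coding and personalized drills the abstract describes.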
  • Patent number: 8315857
    Abstract: Systems and methods for modification of an audio input signal are provided. In exemplary embodiments, an adaptive multiple-model optimizer is configured to generate at least one source model parameter for facilitating modification of an analyzed signal. The adaptive multiple-model optimizer comprises a segment grouping engine and a source grouping engine. The segment grouping engine is configured to group simultaneous feature segments to generate at least one segment model. The at least one segment model is used by the source grouping engine to generate at least one source model, which comprises the at least one source model parameter. Control signals for modification of the analyzed signal may then be generated based on the at least one source model parameter.
    Type: Grant
    Filed: May 30, 2006
    Date of Patent: November 20, 2012
    Assignee: Audience, Inc.
    Inventors: David Klein, Stephen Malinowski, Lloyd Watts, Bernard Mont-Reynaud
  • Patent number: 8315864
    Abstract: Provided herein are systems and methods for using context-sensitive speech recognition logic in a computer to create a software program, including context-aware voice entry of instructions that make up a software program, automatic context-sensitive instruction formatting, and automatic context-sensitive insertion-point positioning.
    Type: Grant
    Filed: April 24, 2012
    Date of Patent: November 20, 2012
    Inventor: Lunis Orcutt
  • Patent number: 8311814
    Abstract: The present invention is directed to a voice activity detector that uses the periodicity of amplitude peaks and valleys to identify signals of substantially fixed power or having periodicity.
    Type: Grant
    Filed: September 19, 2006
    Date of Patent: November 13, 2012
    Assignee: Avaya Inc.
    Inventors: Mei-Sing Ong, Luke A. Tucker
  • Patent number: 8306814
    Abstract: A method for classifying a pair of audio signals into an agent audio signal and a customer audio signal. One embodiment relates to unsupervised training, in which the training corpus comprises a multiplicity of audio signal pairs, wherein each pair comprises an agent signal and a customer signal, and wherein it is unknown for each signal if it is by the agent or by the customer. Training is based on the agent signals being more similar to one another than the customer signals. An agent cluster and a customer cluster are determined. The input signals are associated with the agent or the customer according to the higher score combination of the input signals and the clusters. Another embodiment relates to supervised training, wherein an agent model is generated, and the input signal that yields higher score against the model is the agent signal, while the other is the customer signal.
    Type: Grant
    Filed: May 11, 2010
    Date of Patent: November 6, 2012
    Assignee: Nice-Systems Ltd.
    Inventors: Gil Dobry, Hila Lam, Moshe Wasserblat
  • Patent number: 8301455
    Abstract: A user identification method is described in which, in a first identification procedure, identification data (ID1) of a first type belonging to a target individual to be identified are determined and are compared with previously stored user identification data (ND1) of the first type assigned to an authorized user. In addition, identification data (ID2) of a second type that belong with a certain probability to the same target individual are automatically determined. After a successful confirmation of the identity of the target individual with the authorized user from the identification data (ID1) of the first type, user identification data (ND2) of the second type are stored for the respective authorized user using the determined identification data (ID2) of the second type in order to use said data in a subsequent identification procedure. In addition, a corresponding user identification device is disclosed.
    Type: Grant
    Filed: December 17, 2002
    Date of Patent: October 30, 2012
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Volker Steinbiss
  • Patent number: 8296144
    Abstract: Embodiments of an automated dialog system testing method and component are described. This automated testing method and system supplements real human-based testing with simulated user input and incorporates a set of evaluation measures that focus on three basic aspects of task-oriented dialog systems, namely, understanding ability, efficiency, and the appropriateness of system actions. These measures are first applied on a corpus generated between a dialog system and a group of human users to demonstrate the validity of these measures with the human users' satisfaction levels. Results generally show that these measures are significantly correlated with these satisfaction levels. A regression model is then built to predict the user satisfaction scores using these evaluation measures.
    Type: Grant
    Filed: June 4, 2008
    Date of Patent: October 23, 2012
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Hua Ai
  • Patent number: 8285537
    Abstract: Recognition of proper nouns by an automated speech recognition system is improved by augmenting the pronunciation of each proper noun or name in the natural language of the speech recognition system with at least one “native” pronunciation in another natural language. To maximize recognition, preferably the pronunciations are predicted based on information not available to the speech recognition system. Prediction of pronunciation may be based on a location derived from a telephone number or postal address associated with the name and the language or dialect spoken in the country or region of that location. The “native” pronunciation(s) may be added to a dictionary of the speech recognition system or directly to the grammar used for recognizing speech.
    Type: Grant
    Filed: January 31, 2003
    Date of Patent: October 9, 2012
    Assignee: Comverse, Inc.
    Inventors: Marc D. Tanner, Erin M. Panttaja
  • Patent number: 8284905
    Abstract: A method and apparatus for monitoring conversations to control telecommunications is provided. In one embodiment, a method for identifying undesirable data within voice communications to control telecommunications comprises processing voice communications between at least two entities and analyzing the voice communications to determine indicia of undesirable data.
    Type: Grant
    Filed: June 30, 2008
    Date of Patent: October 9, 2012
    Assignee: Symantec Corporation
    Inventor: Sourabh Suri
  • Patent number: 8271282
    Abstract: A voice recognition apparatus includes an extraction unit extracting a feature amount from a voice signal, a word dictionary storing a plurality of recognition words; a reject word generation unit storing reject words in the word dictionary in association with the recognition words and a collation unit calculating a degree of similarity between the voice signal and each of the recognition words and reject words stored in the word dictionary by using the feature amount extracted by the extraction unit, determining whether or not a word having a high calculated degree of similarity corresponds to a reject word, when the word is determined as the reject word, excluding the recognition word stored in the word dictionary in association with the reject word from a result of recognition, and outputting a recognition word having a high calculated degree of similarity as a result of recognition.
    Type: Grant
    Filed: April 30, 2009
    Date of Patent: September 18, 2012
    Assignee: Fujitsu Limited
    Inventor: Shouji Harada
  • Patent number: 8265936
    Abstract: A method for creating and editing an XML-based speech synthesis document for input to a text-to-speech engine is provided. The method includes recording voice utterances of a user reading a pre-selected text and parsing the recorded voice utterances into individual words and periods of silence. The method also includes recording a synthesized speech output generated by a text-to-speech engine, the synthesized speech output being an audible rendering of the pre-selected text, and parsing the synthesized speech output into individual words and periods of silence. The method further includes annotating the XML-based speech synthesis document based upon a comparison of the recorded voice utterances and the recorded synthesized speech output.
    Type: Grant
    Filed: June 3, 2008
    Date of Patent: September 11, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ciprian Agapi, Oswaldo Gago, Maria Elena Smith, Roberto Vila
  • Patent number: 8249873
    Abstract: Tonal correction of speech is provided. Received speech is analyzed and compared to a table of commonly mispronounced phrases. These phrases are mapped to the phrase likely intended by the speaker. The phrase determined to be the one the user likely intended can be suggested to the user. If the user approves of the suggestion, tonal correction can be applied to the speech before that speech is delivered to a recipient.
    Type: Grant
    Filed: August 12, 2005
    Date of Patent: August 21, 2012
    Assignee: Avaya Inc.
    Inventors: Colin Blair, Kevin Chan, Christopher R. Gentle, Neil Hepworth, Andrew W. Lang, Paul R. Michaelis
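The lookup-and-confirm flow in the abstract above reduces to a small sketch. The table contents and the `approve` callback are hypothetical; the pinyin-with-tone-number strings are only illustrative placeholders for tonal phrases.

```python
# Hypothetical sketch: map commonly mispronounced phrases to the likely
# intended phrase, ask the speaker to confirm, and apply the correction
# only on approval.

def tonal_correct(phrase, table, approve):
    """table: dict mispronounced phrase -> likely intended phrase.
    approve(suggestion): callback asking the user to accept it."""
    suggestion = table.get(phrase)
    if suggestion is not None and approve(suggestion):
        # The corrected phrase would then drive tonal resynthesis
        # before delivery to the recipient.
        return suggestion
    return phrase  # no table entry, or user declined the suggestion
```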
  • Patent number: 8234114
    Abstract: The present invention relates to a speech interactive system and method. The system comprises a target information receiving module, an interactive mode setting and speech processing module, an interactive information update module, a decision module, and an output response module. It receives target information and sets corresponding target text sentence information. It also receives a user's speech signal, sets an interactive mode, decides the speech's target text sentence information, and generates an assessment for the target text sentence. Under the set interactive mode, the system updates the information in an interactive information recording table according to the assessment and a timing count. According to the interactive mode and the recorded information, an output mode for the target text sentence information is generated. According to the output mode and the recorded information, the response information is generated.
    Type: Grant
    Filed: August 14, 2009
    Date of Patent: July 31, 2012
    Assignee: Industrial Technology Research Institute
    Inventors: Yao-Yuan Chang, Sen-Chia Chang, Shih-Chieh Chien, Jia-Jang Tu
  • Patent number: 8234118
    Abstract: A dialog prosody structure generating method and apparatus, and a speech synthesis method and system employing the dialog prosody structure generation method and apparatus, are provided. The speech synthesis method using the dialog prosody structure generation method includes: determining a system speaking style based on a user utterance; if the system speaking style is dialog speech, generating dialog prosody information by reflecting discourse information between a user and a system; and synthesizing a system utterance based on the generated dialog prosody information.
    Type: Grant
    Filed: May 19, 2005
    Date of Patent: July 31, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kyoungnan Pyo, Jaewon Lee
  • Patent number: 8224649
    Abstract: A method and apparatus for remote access to a target application is disclosed where a system administrator may establish telephonic contact with an interactive voice response system and obtain access to the target application by speech communication. The interactive response system may authenticate the system administrator by implementing various measures including biometric measures. Once access is granted, the interactive response system may broker a communication between the target application using text/data and the system administrator using natural language.
    Type: Grant
    Filed: June 2, 2004
    Date of Patent: July 17, 2012
    Assignee: International Business Machines Corporation
    Inventors: Upendra V. Chaudhari, Ryan L. Osborn, Jason W. Pelecanos, Ganesh N. Ramaswamy, Ran D. Zilca
  • Patent number: 8219384
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: July 10, 2012
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Trausti Kristjansson
  • Patent number: 8209170
    Abstract: Provided herein are systems and methods for using context-sensitive speech recognition logic in a computer to create a software program, including context-aware voice entry of instructions that make up a software program, automatic context-sensitive instruction formatting, and automatic context-sensitive insertion-point positioning.
    Type: Grant
    Filed: June 2, 2011
    Date of Patent: June 26, 2012
    Assignee: Lunis ORCUTT
    Inventor: Lunis Orcutt
  • Patent number: 8209176
    Abstract: A system and method is provided for reducing latency for automatic speech recognition. In one embodiment, intermediate results produced by multiple search passes are used to update a display of transcribed text.
    Type: Grant
    Filed: August 27, 2011
    Date of Patent: June 26, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Michiel Adriaan Unico Bacchiani, Brian Scott Amento
  • Patent number: 8200490
    Abstract: A method of searching music using speech recognition in a mobile device, the method including: recognizing a speech signal uttered by a user as a phoneme sequence; and searching music information by performing partial symbol matching between the recognized phoneme sequence and a standard pronunciation sequence.
    Type: Grant
    Filed: February 9, 2007
    Date of Patent: June 12, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: In Jeong Choi, Nam Hoon Kim, Ick Sang Han, Sang Bae Jeong
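The partial symbol matching in the abstract above can be illustrated with longest-common-substring scoring over phoneme strings. This is one plausible reading, not the patented algorithm: the toy phoneme strings and the use of `difflib.SequenceMatcher` as the matcher are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical sketch: rank each song title by the longest contiguous run
# of phonemes its standard pronunciation shares with the recognized
# phoneme sequence, so a partial or noisy utterance can still match.

def partial_match_score(query, target):
    """Length of the longest contiguous symbol run shared by the two strings."""
    m = SequenceMatcher(None, query, target)
    match = m.find_longest_match(0, len(query), 0, len(target))
    return match.size

def search_music(recognized, pronunciations):
    """pronunciations: dict title -> standard phoneme string (assumed shape)."""
    return max(pronunciations,
               key=lambda title: partial_match_score(recognized, pronunciations[title]))
```

Even though the recognized sequence covers only part of a title's pronunciation, the shared run dominates the score, which is the point of partial rather than whole-sequence matching.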
  • Patent number: 8195461
    Abstract: A voice recognition system for onboard equipment has a genre database (DB) that stores search target vocabularies in accordance with respective genres. It has a mike 1 for outputting speech sounds as spoken data; a first voice recognition dictionary 2a for recognizing words of search target genres in the genre DB; a second voice recognition dictionary 2b for recognizing words outside the search target genres; a voice recognition unit 3 for recognizing the speech sounds by collating the spoken data delivered from the mike with the vocabularies contained in the first and second voice recognition dictionaries; an interactive control unit 4 for outputting, when a word delivered from the voice recognition unit as a recognition result is a word obtained using the second voice recognition dictionary, a message so stating as presentation information; and a presentation unit 5 for presenting the presentation information to the outside.
    Type: Grant
    Filed: October 4, 2007
    Date of Patent: June 5, 2012
    Assignee: Mitsubishi Electric Corporation
    Inventor: Takayoshi Chikuri
  • Patent number: 8185392
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving voice queries, obtaining, for one or more of the voice queries, feedback information that references an action taken by a user that submitted the voice query after reviewing a result of the voice query, generating, for the one or more voice queries, a posterior recognition confidence measure that reflects a probability that the voice query was correctly recognized, wherein the posterior recognition confidence measure is generated based at least on the feedback information for the voice query, selecting a subset of the one or more voice queries based on the posterior recognition confidence measures, and adapting an acoustic model using the subset of the voice queries.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: May 22, 2012
    Assignee: Google Inc.
    Inventors: Brian Strope, Douglas H. Beeferman
  • Patent number: 8180638
    Abstract: Disclosed herein is a method for emotion recognition based on a minimum classification error. In the method, a speaker's neutral emotion is extracted using a Gaussian mixture model (GMM), and the other emotions are classified using the GMM with a discriminative weight that minimizes the loss function of a classification error for the emotion-recognition feature vector. Recognition is performed by applying a discriminative weight, evaluated using the GMM based on minimum classification error, to the feature vectors of emotions that are difficult to classify, thereby enhancing the performance of emotion recognition.
    Type: Grant
    Filed: February 23, 2010
    Date of Patent: May 15, 2012
    Assignee: Korea Institute of Science and Technology
    Inventors: Hyoung Gon Kim, Ig Jae Kim, Joon-Hyuk Chang, Kye Hwan Lee, Chang Seok Bae
  • Patent number: 8180641
    Abstract: Sequential speech recognition using two unequal automatic speech recognition (ASR) systems may be provided. The system may provide two sets of vocabulary data. A determination may be made as to whether entries in one set of vocabulary data are likely to be confused with entries in the other set of vocabulary data. If confusion is likely, a decoy entry from one set of the vocabulary data may be placed in the other set of vocabulary data to ensure more efficient and accurate speech recognition processing may take place.
    Type: Grant
    Filed: September 29, 2008
    Date of Patent: May 15, 2012
    Assignee: Microsoft Corporation
    Inventors: Michael Levit, Shuangyu Chang, Bruce Melvin Buntschuh
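The decoy-placement idea in the last abstract can be sketched with a simple confusability test. This is a hedged illustration only: using edit distance over spellings as the confusability measure, and a threshold of 1, are assumptions; a real system would compare acoustic or phonetic similarity.

```python
# Hypothetical sketch: when a first-pass vocabulary entry is easily
# confused with a second-pass entry, copy it into the second-pass
# vocabulary as a decoy so the second recognizer can detect (and discard)
# the confusion instead of being forced into a wrong match.

def edit_distance(a, b):
    """Standard Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def place_decoys(vocab_a, vocab_b, threshold=1):
    """Return vocab_b augmented with decoy entries drawn from vocab_a."""
    decoys = {a for a in vocab_a
              for b in vocab_b
              if edit_distance(a, b) <= threshold}
    return set(vocab_b) | decoys
```

If the second pass then recognizes a decoy, the result can be rejected or routed back to the first pass rather than accepted as a spurious second-vocabulary match.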