Preliminary Matching Patents (Class 704/252)
  • Patent number: 8521523
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting training data. In one aspect, a method comprises: selecting a target out of vocabulary rate; selecting a target percentage of user sessions; and determining a minimum training data freshness for a vocabulary of words, the minimum training data freshness corresponding to the target percentage of user sessions experiencing the target out of vocabulary rate.
    Type: Grant
    Filed: August 24, 2012
    Date of Patent: August 27, 2013
    Assignee: Google Inc.
    Inventors: Maryam Garrett, Ciprian I. Chelba
  • Patent number: 8515746
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting training data. In an aspect, a method comprises: selecting a target out of vocabulary rate; selecting a target percentage of user sessions; and determining a minimum training data collection duration for a vocabulary of words, the minimum training data collection duration corresponding to the target percentage of user sessions experiencing the target out of vocabulary rate.
    Type: Grant
    Filed: August 24, 2012
    Date of Patent: August 20, 2013
    Assignee: Google Inc.
    Inventors: Maryam Garrett, Ciprian I. Chelba
  • Patent number: 8515745
    Abstract: Methods, systems, and apparatus for selecting training data. In an aspect, a method comprises: obtaining search session data comprising search sessions that include search queries, wherein each search query comprises words; determining a threshold out of vocabulary rate indicating a rate at which a word in a search query is not included in a vocabulary; determining a threshold session out of vocabulary rate, the session out of vocabulary rate indicating a rate at which search sessions have an out of vocabulary rate that meets the threshold out of vocabulary rate; selecting a vocabulary of words that, for a set of test data, has a session out of vocabulary rate that meets the threshold session out of vocabulary rate, the vocabulary of words being selected from the one or more words included in each of the search queries included in the search sessions.
    Type: Grant
    Filed: August 24, 2012
    Date of Patent: August 20, 2013
    Assignee: Google Inc.
    Inventors: Maryam Garrett, Ciprian I. Chelba
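The three Garrett/Chelba patents above all turn on the same quantity: the fraction of search sessions whose out-of-vocabulary (OOV) rate exceeds a target. A minimal sketch of that vocabulary-selection loop follows; the greedy frequency ordering, function names, and data layout (sessions as lists of queries, queries as lists of words) are illustrative assumptions, not taken from the patents.

```python
from collections import Counter

def session_oov_rate(session, vocab):
    """Fraction of words across a session's queries that fall outside vocab."""
    words = [w for query in session for w in query]
    if not words:
        return 0.0
    return sum(w not in vocab for w in words) / len(words)

def select_vocabulary(sessions, max_oov_rate, max_session_oov_rate):
    """Grow a frequency-ranked vocabulary until the share of sessions whose
    OOV rate exceeds max_oov_rate drops to max_session_oov_rate or below."""
    counts = Counter(w for s in sessions for q in s for w in q)
    vocab = set()
    for word, _ in counts.most_common():
        vocab.add(word)
        bad = sum(session_oov_rate(s, vocab) > max_oov_rate for s in sessions)
        if bad / len(sessions) <= max_session_oov_rate:
            break
    return vocab
```

The same session-level OOV statistic, computed over data windows of varying age or duration, yields the "freshness" and "collection duration" variants claimed in 8521523 and 8515746.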
  • Patent number: 8510097
    Abstract: Computer methods, apparatus and articles of manufacture therefor are disclosed for text-characterization using a finite state transducer that along each path accepts on a first side an n-gram of text-characterization (e.g., a language or a topic) and outputs on a second side a sequence of symbols identifying one or more text-characterizations from a set of text-characterizations. The finite state transducer is applied to input data. For each n-gram accepted by the finite state transducer, a frequency counter associated with the n-gram of the one or more text-characterizations in the set of text-characterizations is incremented. The input data is classified as one or more text-characterizations from the set of text-characterizations using the frequency counters associated therewith.
    Type: Grant
    Filed: December 18, 2008
    Date of Patent: August 13, 2013
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Lauri J Karttunen, Ji Fang
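The counting scheme in 8510097 can be approximated without a transducer: a plain dictionary standing in for the FST's accepting paths maps each n-gram to the labels it identifies, and per-label frequency counters are incremented as n-grams are accepted. This is a simplified sketch (the dict index, trigram default, and toy corpora are assumptions; the patent's FST handles this more compactly).

```python
from collections import Counter

def char_ngrams(text, n=3):
    """All overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def build_ngram_index(labeled_corpora, n=3):
    """Map each n-gram to the set of labels whose corpus contains it
    (a dict standing in for the transducer's accepting paths)."""
    index = {}
    for label, text in labeled_corpora.items():
        for g in char_ngrams(text, n):
            index.setdefault(g, set()).add(label)
    return index

def classify(text, index, n=3):
    """Increment a per-label frequency counter for every accepted n-gram
    and return the label(s) with the highest count, or None."""
    counts = Counter()
    for g in char_ngrams(text, n):
        for label in index.get(g, ()):
            counts[label] += 1
    if not counts:
        return None
    best = max(counts.values())
    return {label for label, c in counts.items() if c == best}
```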
  • Patent number: 8504369
    Abstract: A device, for use by a transcriptionist in a transcription editing system for editing transcriptions dictated by speakers, includes, in combination, a monitor configured to display visual text of transcribed dictations, an audio mechanism configured to cause playback of portions of an audio file associated with a dictation, and a cursor-control module coupled to the audio mechanism and to the monitor and configured to cause the monitor to display multiple cursors in the text.
    Type: Grant
    Filed: June 2, 2004
    Date of Patent: August 6, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Benjamin Chigier, Edward A. Brody, Daniel Edward Chernin, Roger S. Zimmerman
  • Patent number: 8504365
    Abstract: Disclosed herein are systems, methods, and tangible computer readable-media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrates little variance over time or the samples are identical, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
    Type: Grant
    Filed: April 11, 2008
    Date of Patent: August 6, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst Schroeter
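The core liveness test in 8504365 is that two genuine utterances of the same phrase never match exactly. A toy sketch of that check, assuming samples are already aligned feature vectors and using mean squared difference as the variance measure (both are my assumptions, not the patent's):

```python
def verify_liveness(samples, min_variance=1e-3):
    """Deny verification when repeated samples of the same phrase are
    (near-)identical, suggesting a replayed or synthesized recording."""
    diffs = []
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            a, b = samples[i], samples[j]
            # Mean squared difference between two aligned feature vectors.
            diffs.append(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))
    # Natural speech should show at least min_variance between any two takes.
    return all(d >= min_variance for d in diffs)
```

Raising `min_variance` corresponds to the claimed embodiment in which the variance threshold is tightened when greater authentication certainty is needed.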
  • Patent number: 8498865
    Abstract: An enhanced speech recognition system and method are provided that may be used with a voice recognition wireless communication system. The enhanced speech recognition system and method take advantage of group to group calling statistics to improve the recognition of names by the speech recognition system.
    Type: Grant
    Filed: February 17, 2011
    Date of Patent: July 30, 2013
    Assignee: Vocera Communications, Inc.
    Inventor: Robert E. Shostak
  • Publication number: 20130158999
    Abstract: A voice recognition apparatus creates a voice recognition dictionary of words which are cut out from address data constituting words that are a voice recognition target, and which have an occurrence frequency not less than a predetermined value, compares a time series of acoustic features of an input voice with the voice recognition dictionary, selects the most likely word string as the input voice from the voice recognition dictionary, carries out partial matching between the selected word string and the address data, and outputs the word that partially matches as a voice recognition result.
    Type: Application
    Filed: November 30, 2010
    Publication date: June 20, 2013
    Applicant: Mitsubishi Electric Corporation
    Inventors: Yuzo Maruta, Jun Ishii
  • Patent number: 8468012
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.
    Type: Grant
    Filed: May 26, 2010
    Date of Patent: June 18, 2013
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Trausti Kristjansson
  • Patent number: 8463608
    Abstract: A method and apparatus for updating a speech model on a multi-user speech recognition system with a personal speech model for a single user. A speech recognition system, for instance in a car, can include a generic speech model for comparison with the user speech input. A way of identifying a personal speech model, for instance in a mobile phone, is connected to the system. A mechanism is included for receiving personal speech model components, for instance a BLUETOOTH connection. The generic speech model is updated using the received personal speech model components. Speech recognition can then be performed on user speech using the updated generic speech model.
    Type: Grant
    Filed: March 12, 2012
    Date of Patent: June 11, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Barry Neil Dow, Eric William Janke, Daniel Lee Yuk Cheung, Benjamin Terrick Staniford
  • Patent number: 8442824
    Abstract: Device, system, and method of liveness detection using voice biometrics. For example, a method comprises: generating a first matching score based on a comparison between: (a) a voice-print from a first text-dependent audio sample received at an enrollment stage, and (b) a second text-dependent audio sample received at an authentication stage; generating a second matching score based on a text-independent audio sample; and generating a liveness score by taking into account at least the first matching score and the second matching score.
    Type: Grant
    Filed: November 25, 2009
    Date of Patent: May 14, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Almog Aley-Raz, Nir Moshe Krause, Michael Itzhak Salmon, Ran Yehoshua Gazit
  • Patent number: 8438029
    Abstract: Disclosed are apparatus and methods for generating synthesized utterances. A computing device can receive speech data corresponding to spoken utterances of a particular speaker. Textual elements of an input text corresponding to the speech data can be recognized. Confidence levels associated with the recognized textual elements can be determined. Speech-synthesis parameters of decision trees can be adapted based on the speech data, recognized textual elements, and associated confidence levels. Each adapted decision tree can map individual elements of a text to individual of the speech-synthesis parameters. A second input text can be received. The second input text can be mapped to speech-synthesis parameters using the adapted decision trees. A synthesized spoken utterance can be generated corresponding to the second input text using the speech-synthesis parameters. At least some of the speech-synthesis parameters are configured to simulate the particular speaker.
    Type: Grant
    Filed: August 22, 2012
    Date of Patent: May 7, 2013
    Assignee: Google Inc.
    Inventors: Matthew Nicholas Stuttle, Byungha Chun
  • Patent number: 8433568
    Abstract: A method for measuring speech intelligibility includes inputting a speech waveform to a system. At least one acoustic feature is extracted from the waveform. From the acoustic feature, at least one phoneme is segmented. At least one acoustic correlate measure is extracted from the at least one phoneme and at least one intelligibility measure is determined. The at least one acoustic correlate measure is mapped to the at least one intelligibility measure.
    Type: Grant
    Filed: March 29, 2010
    Date of Patent: April 30, 2013
    Assignee: Cochlear Limited
    Inventors: Lee Krause, Mark Skowranski, Bonny Banerjee
  • Patent number: 8433567
    Abstract: A method, system, and computer program product for compensation of intra-speaker variability in speaker diarization are provided. The method includes: dividing a speech session into segments of duration less than an average duration between speaker change; parameterizing each segment by a time dependent probability density function supervector, for example, using a Gaussian Mixture Model; computing a difference between successive segment supervectors; and computing a scatter measure such as a covariance matrix of the difference as an estimate of intra-speaker variability. The method further includes compensating the speech session for intra-speaker variability using the estimate of intra-speaker variability.
    Type: Grant
    Filed: April 8, 2010
    Date of Patent: April 30, 2013
    Assignee: International Business Machines Corporation
    Inventor: Hagai Aronowitz
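The estimation step in 8433567 maps directly onto two array operations: successive differences of the segment supervectors, then their covariance. A brief sketch, with whitening shown as one plausible compensation (the GMM supervector extraction itself is omitted, and the whitening form is my assumption):

```python
import numpy as np

def intra_speaker_variability(supervectors):
    """Estimate intra-speaker variability as the covariance of differences
    between successive segment supervectors (rows of `supervectors`)."""
    diffs = np.diff(supervectors, axis=0)   # successive segment differences
    return np.cov(diffs, rowvar=False)      # scatter measure of the differences

def compensate(supervectors, variability, eps=1e-6):
    """Whiten the session supervectors with the estimated variability
    (one way to realize the patent's compensation step)."""
    d = variability.shape[0]
    inv_sqrt = np.linalg.inv(np.linalg.cholesky(variability + eps * np.eye(d)))
    return supervectors @ inv_sqrt.T
```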
  • Patent number: 8417526
    Abstract: One or more embodiments include a speech recognition learning system for improved speech recognition. The learning system may include a speech optimizing system. The optimizing system may receive a first stimulus data package including spoken utterances having at least one phoneme, and contextual information. A number of result data packages may be retrieved which include stored spoken utterances and contextual information. A determination may be made as to whether the first stimulus data package requires improvement. A second stimulus data package may be generated based on the determination. A number of speech recognition implementation rules for implementing the second stimulus data package may be received. The rules may be associated with the contextual information. A determination may be made as to whether the second stimulus data package requires further improvement.
    Type: Grant
    Filed: March 13, 2009
    Date of Patent: April 9, 2013
    Assignee: Adacel, Inc.
    Inventor: Francois Bourdon
  • Patent number: 8407057
    Abstract: A machine, system and method for user-guided teaching and modifications of voice commands and actions to be executed by a conversational learning system. The machine includes a system bus for communicating data and control signals received from the conversational learning system to a computer system, a vehicle data and control bus for connecting devices and sensors in the machine, a bridge module for connecting the vehicle data and control bus to the system bus, machine subsystems coupled to the vehicle data and control bus having a respective user interface for receiving a voice command or input signal from a user, a memory coupled to the system bus for storing action command sequences learned for a new voice command and a processing unit coupled to the system bus for automatically executing the action command sequences learned when the new voice command is spoken.
    Type: Grant
    Filed: January 21, 2009
    Date of Patent: March 26, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Liam D. Comerford, Mahesh Viswanathan
  • Patent number: 8401854
    Abstract: A system and method is provided for recognizing a speech input and selecting an entry from a list of entries. The method includes recognizing a speech input. A fragment list of fragmented entries is provided and compared to the recognized speech input to generate a candidate list of best matching entries based on the comparison result. The system includes a speech recognition module, and a database for storing the list of entries and the fragmented list. The speech recognition module may obtain the fragmented list from the database and store a candidate list of best matching entries in memory. A display may also be provided to allow the user to select from a list of best matching entries.
    Type: Grant
    Filed: January 16, 2009
    Date of Patent: March 19, 2013
    Assignee: Nuance Communications, Inc.
    Inventor: Markus Schwarz
  • Patent number: 8396713
    Abstract: A method (and system) of handling out-of-grammar utterances includes building a statistical language model for a dialog state, generating sentences and semantic interpretations for the sentences using finite state grammar, building a statistical action classifier, receiving user input, carrying out recognition with the finite state grammar, carrying out recognition with the statistical language model, using the statistical action classifier to find semantic interpretations, comparing an output from the finite state grammar and an output from the statistical language model, deciding which output of the output from the finite state grammar and the output from the statistical language model to keep as a final recognition output, selecting the final recognition output, and outputting the final recognition result, wherein the statistical action classifier, the finite state grammar and the statistical language model are used in conjunction to carry out speech recognition and interpretation.
    Type: Grant
    Filed: April 30, 2007
    Date of Patent: March 12, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Vaibhava Goel, Ramesh Gopinath, Ea-Ee Jan, Karthik Visweswariah
  • Patent number: 8392193
    Abstract: A method for performing speech recognition includes receiving a voice input and generating at least one possible result corresponding to the voice input. The method may also include calculating a value for the speech recognition result and comparing the calculated value to a particular portion of the speech recognition result. The method may further include retrieving information based on one or more factors associated with the voice input and using the retrieved information to determine a likelihood that the speech recognition result is correct.
    Type: Grant
    Filed: June 1, 2004
    Date of Patent: March 5, 2013
    Assignee: Verizon Business Global LLC
    Inventors: Paul T. Schultz, Robert A. Sartini
  • Patent number: 8386251
    Abstract: A speech recognition system is provided with iteratively refined multiple passes through the received data to enhance the accuracy of the results by introducing constraints and adaptation from initial passes into subsequent recognition operations. The multiple passes are performed on an initial utterance received from a user. The iteratively enhanced subsequent passes are also performed on following utterances received from the user increasing an overall system efficiency and accuracy.
    Type: Grant
    Filed: June 8, 2009
    Date of Patent: February 26, 2013
    Assignee: Microsoft Corporation
    Inventors: Nikko Strom, Julian Odell, Jon Hamaker
  • Patent number: 8374869
    Abstract: An utterance verification method for an isolated word N-best speech recognition result includes: calculating log likelihoods of a context-dependent phoneme and an anti-phoneme model based on an N-best speech recognition result for an input utterance; measuring a confidence score of an N-best speech-recognized word using the log likelihoods; calculating distance between phonemes for the N-best speech-recognized word; comparing the confidence score with a threshold and the distance with a predetermined mean of distances; and accepting the N-best speech-recognized word when the compared results for the confidence score and the distance correspond to acceptance.
    Type: Grant
    Filed: August 4, 2009
    Date of Patent: February 12, 2013
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Jeom Ja Kang, Yunkeun Lee, Jeon Gue Park, Ho-Young Jung, Hyung-Bae Jeon, Hoon Chung, Sung Joo Lee, Euisok Chung, Ji Hyun Wang, Byung Ok Kang, Ki-young Park, Jong Jin Kim
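The acceptance test in 8374869 combines two checks: a log-likelihood-ratio confidence score against a threshold, and an inter-phoneme distance against a predetermined mean. A minimal sketch, assuming per-phoneme log likelihoods are already available and that the confidence score is their averaged ratio (the averaging and all names are illustrative assumptions):

```python
def confidence_score(ll_phoneme, ll_anti):
    """Log-likelihood ratio of the context-dependent phoneme models against
    the anti-phoneme models, averaged over the word's phonemes."""
    ratios = [p - a for p, a in zip(ll_phoneme, ll_anti)]
    return sum(ratios) / len(ratios)

def verify_word(ll_phoneme, ll_anti, inter_phoneme_distance,
                score_threshold, mean_distance):
    """Accept an N-best hypothesis only when both its confidence score and
    its inter-phoneme distance clear their respective references."""
    score = confidence_score(ll_phoneme, ll_anti)
    return score >= score_threshold and inter_phoneme_distance >= mean_distance
```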
  • Patent number: 8355914
    Abstract: A method for selecting text created in a mobile terminal by word and correcting it or changing it to another word, and a mobile terminal implementing the same are disclosed. The mobile terminal includes: a display unit to display one or more words of text, and to display tags for each of the one or more words; an input unit to select at least one of the tagged one or more words as selected one word; and a controller to display candidate words having a similar pronunciation to that of the word selected via the input unit, select one of the candidate words as selected one candidate word, and change the selected one word from the text to the selected one candidate word.
    Type: Grant
    Filed: April 17, 2009
    Date of Patent: January 15, 2013
    Assignee: LG Electronics Inc.
    Inventors: Jae-Min Joh, Jae-Do Kwak, Jong-Keun Youn
  • Patent number: 8352266
    Abstract: The invention provides a system and method for improving speech recognition. A computer software system is provided for implementing the system and method. A user of the computer software system may speak to the system directly and the system may respond, in spoken language, with an appropriate response. Grammar rules may be generated automatically from sample utterances when implementing the system for a particular application. Dynamic grammar rules may also be generated during interaction between the user and the system. In addition to arranging searching order of grammar files based on a predetermined hierarchy, a dynamically generated searching order based on history of contexts of a single conversation may be provided for further improved speech recognition.
    Type: Grant
    Filed: March 8, 2011
    Date of Patent: January 8, 2013
    Assignee: Inago Corporation
    Inventors: Gary Farmaner, Ron DiCarlantonio, Huw Leonard
  • Patent number: 8352261
    Abstract: A communication system includes at least one transmitting device and at least one receiving device, one or more network systems for connecting the transmitting device to the receiving device, and an automatic speech recognition (“ASR”) system, including an ASR engine. A user speaks an utterance into the transmitting device, and the recorded speech audio is sent to the ASR engine. The ASR engine returns intermediate transcription results to the transmitting device, which displays the intermediate transcription results in real-time to the user. The intermediate transcription results are also correlated by utterance fragment to final transcription results and displayed to the user. The user may use the information thus presented to make decisions as to whether to edit the final transcription results or to speak the utterance again, thereby repeating the process. The intermediate transcription results may also be used by the user to edit the final transcription results.
    Type: Grant
    Filed: March 9, 2009
    Date of Patent: January 8, 2013
    Assignee: Canyon IP Holdings, LLC
    Inventors: James Richard Terrell, II, Marc White
  • Patent number: 8346554
    Abstract: A method for automatic speech recognition includes determining for an input signal a plurality of scores representative of certainties that the input signal is associated with corresponding states of a speech recognition model, using the speech recognition model and the determined scores to compute an average signal, computing a difference value representative of a difference between the input signal and the average signal, and processing the input signal in accordance with the difference value.
    Type: Grant
    Filed: September 15, 2010
    Date of Patent: January 1, 2013
    Assignee: Nuance Communications, Inc.
    Inventor: Igor Zlokarnik
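The average-signal computation in 8346554 can be read as a score-weighted combination of the model states' expected observations, with the difference value measuring how far the input frame sits from that reconstruction. A sketch under that reading (state means as vectors and Euclidean distance are my assumptions):

```python
import math

def average_signal(state_means, scores):
    """Model-based reconstruction: weight each state's mean observation by
    the certainty that the input frame belongs to that state."""
    total = sum(scores)
    weights = [s / total for s in scores]
    dim = len(state_means[0])
    return [sum(w * m[d] for w, m in zip(weights, state_means))
            for d in range(dim)]

def difference_value(frame, state_means, scores):
    """Euclidean distance between the input frame and its reconstruction;
    a large value can flag the frame for special processing."""
    avg = average_signal(state_means, scores)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(frame, avg)))
```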
  • Patent number: 8332225
    Abstract: Techniques to create and share custom voice fonts are described. An apparatus may include a preprocessing component to receive voice audio data and a corresponding text script from a client and to process the voice audio data to produce prosody labels and a rich script. The apparatus may further include a verification component to automatically verify the voice audio data and the text script. The apparatus may further include a training component to train a custom voice font from the verified voice audio data and rich script and to generate custom voice font data usable by the TTS component. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 4, 2009
    Date of Patent: December 11, 2012
    Assignee: Microsoft Corporation
    Inventors: Sheng Zhao, Zhi Li, Shenghao Qin, Chiwei Che, Jingyang Xu, Binggong Ding
  • Patent number: 8326626
    Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: December 4, 2012
    Assignee: West Corporation
    Inventors: Mark J. Pettay, Fonda J. Narke
  • Patent number: 8321197
    Abstract: System and method for electronically identifying and analyzing the type and frequency of errors and mismatches in a stenographically or voice written text against a stored master file and dynamically creating personalized user feedback, drills, and practice based on identified errors and mismatches from within the context of the stored master file. The system provides the user with a plurality of methods to enter a text file for error identification and analysis including both realtime and non-realtime input. The text input is then compared to a stored master file through a word-by-word iterative process which produces a comparison of writing input and stored master wherein errors and mismatches are identified and grouped in a plurality of pre-defined and user-selected categories, each of which is color-coded to facilitate pattern recognition of type and frequency of errors and mismatches in the submitted writing.
    Type: Grant
    Filed: October 17, 2007
    Date of Patent: November 27, 2012
    Inventors: Teresa Ruth Gaudet, Gordon James Gaudet, Gary Kenneth Pollreis, Sandra Joyce Natale
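The word-by-word iterative comparison in 8321197 is essentially a sequence alignment of the writing against the stored master, with mismatches grouped into categories. A sketch using the standard library's `difflib` (the three category names are illustrative stand-ins for the patent's pre-defined and user-selected categories):

```python
import difflib

def compare_to_master(written, master):
    """Word-by-word comparison of a writing against the stored master file,
    grouping mismatches into dropped / inserted / substituted categories."""
    errors = {"dropped": [], "inserted": [], "substituted": []}
    matcher = difflib.SequenceMatcher(a=master, b=written)
    for op, a0, a1, b0, b1 in matcher.get_opcodes():
        if op == "delete":
            errors["dropped"].extend(master[a0:a1])
        elif op == "insert":
            errors["inserted"].extend(written[b0:b1])
        elif op == "replace":
            # Pair up substitutions; leftovers count as drops or insertions.
            pairs = min(a1 - a0, b1 - b0)
            errors["substituted"].extend(
                zip(master[a0:a0 + pairs], written[b0:b0 + pairs]))
            errors["dropped"].extend(master[a0 + pairs:a1])
            errors["inserted"].extend(written[b0 + pairs:b1])
    return errors
```

Per-category counts from this structure would drive the color-coding and personalized drills the abstract describes.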
  • Patent number: 8315857
    Abstract: Systems and methods for modification of an audio input signal are provided. In exemplary embodiments, an adaptive multiple-model optimizer is configured to generate at least one source model parameter for facilitating modification of an analyzed signal. The adaptive multiple-model optimizer comprises a segment grouping engine and a source grouping engine. The segment grouping engine is configured to group simultaneous feature segments to generate at least one segment model. The at least one segment model is used by the source grouping engine to generate at least one source model, which comprises the at least one source model parameter. Control signals for modification of the analyzed signal may then be generated based on the at least one source model parameter.
    Type: Grant
    Filed: May 30, 2006
    Date of Patent: November 20, 2012
    Assignee: Audience, Inc.
    Inventors: David Klein, Stephen Malinowski, Lloyd Watts, Bernard Mont-Reynaud
  • Patent number: 8315864
    Abstract: Provided herein are systems and methods for using context-sensitive speech recognition logic in a computer to create a software program, including context-aware voice entry of instructions that make up a software program, automatic context-sensitive instruction formatting, and automatic context-sensitive insertion-point positioning.
    Type: Grant
    Filed: April 24, 2012
    Date of Patent: November 20, 2012
    Inventor: Lunis Orcutt
  • Patent number: 8311814
    Abstract: The present invention is directed to a voice activity detector that uses the periodicity of amplitude peaks and valleys to identify signals of substantially fixed power or having periodicity.
    Type: Grant
    Filed: September 19, 2006
    Date of Patent: November 13, 2012
    Assignee: Avaya Inc.
    Inventors: Mei-Sing Ong, Luke A. Tucker
  • Patent number: 8306814
    Abstract: A method for classifying a pair of audio signals into an agent audio signal and a customer audio signal. One embodiment relates to unsupervised training, in which the training corpus comprises a multiplicity of audio signal pairs, wherein each pair comprises an agent signal and a customer signal, and wherein it is unknown for each signal if it is by the agent or by the customer. Training is based on the agent signals being more similar to one another than the customer signals. An agent cluster and a customer cluster are determined. The input signals are associated with the agent or the customer according to the higher score combination of the input signals and the clusters. Another embodiment relates to supervised training, wherein an agent model is generated, and the input signal that yields higher score against the model is the agent signal, while the other is the customer signal.
    Type: Grant
    Filed: May 11, 2010
    Date of Patent: November 6, 2012
    Assignee: Nice-Systems Ltd.
    Inventors: Gil Dobry, Hila Lam, Moshe Wasserblat
  • Patent number: 8301455
    Abstract: A user identification method is described in which, in a first identification procedure, identification data (ID1) of a first type belonging to a target individual to be identified are determined and are compared with previously stored user identification data (ND1) of the first type assigned to an authorized user. In addition, identification data (ID2) of a second type that belong with a certain probability to the same target individual are automatically determined. After a successful confirmation of the identity of the target individual with the authorized user from the identification data (ID1) of the first type, user identification data (ND2) of the second type are stored for the respective authorized user using the determined identification data (ID2) of the second type in order to use said data in a subsequent identification procedure. In addition, a corresponding user identification device is disclosed.
    Type: Grant
    Filed: December 17, 2002
    Date of Patent: October 30, 2012
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Volker Steinbiss
  • Patent number: 8296144
    Abstract: Embodiments of an automated dialog system testing method and component are described. This automated testing method and system supplements real human-based testing with simulated user input and incorporates a set of evaluation measures that focus on three basic aspects of task-oriented dialog systems, namely, understanding ability, efficiency, and the appropriateness of system actions. These measures are first applied on a corpus generated between a dialog system and a group of human users to demonstrate the validity of these measures with the human users' satisfaction levels. Results generally show that these measures are significantly correlated with these satisfaction levels. A regression model is then built to predict the user satisfaction scores using these evaluation measures.
    Type: Grant
    Filed: June 4, 2008
    Date of Patent: October 23, 2012
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Hua Ai
  • Patent number: 8285537
    Abstract: Recognition of proper nouns by an automated speech recognition system is improved by augmenting the pronunciation of each proper noun or name in the natural language of the speech recognition system with at least one “native” pronunciation in another natural language. To maximize recognition, preferably the pronunciations are predicted based on information not available to the speech recognition system. Prediction of pronunciation may be based on a location derived from a telephone number or postal address associated with the name and the language or dialect spoken in the country or region of that location. The “native” pronunciation(s) may be added to a dictionary of the speech recognition system or directly to the grammar used for recognizing speech.
    Type: Grant
    Filed: January 31, 2003
    Date of Patent: October 9, 2012
    Assignee: Comverse, Inc.
    Inventors: Marc D. Tanner, Erin M. Panttaja
  • Patent number: 8284905
    Abstract: A method and apparatus for monitoring conversations to control telecommunications is provided. In one embodiment, a method for identifying undesirable data within voice communications to control telecommunications comprises processing voice communications between at least two entities and analyzing the voice communications to determine indicia of undesirable data.
    Type: Grant
    Filed: June 30, 2008
    Date of Patent: October 9, 2012
    Assignee: Symantec Corporation
    Inventor: Sourabh Suri
  • Patent number: 8271282
    Abstract: A voice recognition apparatus includes an extraction unit extracting a feature amount from a voice signal, a word dictionary storing a plurality of recognition words; a reject word generation unit storing reject words in the word dictionary in association with the recognition words and a collation unit calculating a degree of similarity between the voice signal and each of the recognition words and reject words stored in the word dictionary by using the feature amount extracted by the extraction unit, determining whether or not a word having a high calculated degree of similarity corresponds to a reject word, when the word is determined as the reject word, excluding the recognition word stored in the word dictionary in association with the reject word from a result of recognition, and outputting a recognition word having a high calculated degree of similarity as a result of recognition.
    Type: Grant
    Filed: April 30, 2009
    Date of Patent: September 18, 2012
    Assignee: Fujitsu Limited
    Inventor: Shouji Harada
  • Patent number: 8265936
    Abstract: A method for creating and editing an XML-based speech synthesis document for input to a text-to-speech engine is provided. The method includes recording voice utterances of a user reading a pre-selected text and parsing the recorded voice utterances into individual words and periods of silence. The method also includes recording a synthesized speech output generated by a text-to-speech engine, the synthesized speech output being an audible rendering of the pre-selected text, and parsing the synthesized speech output into individual words and periods of silence. The method further includes annotating the XML-based speech synthesis document based upon a comparison of the recorded voice utterances and the recorded synthesized speech output.
    Type: Grant
    Filed: June 3, 2008
    Date of Patent: September 11, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ciprian Agapi, Oswaldo Gago, Maria Elena Smith, Roberto Vila
  • Patent number: 8249873
    Abstract: Tonal correction of speech is provided. Received speech is analyzed and compared to a table of commonly mispronounced phrases. These phrases are mapped to the phrase likely intended by the speaker. The phrase determined to be the one the user likely intended can be suggested to the user. If the user approves of the suggestion, tonal correction can be applied to the speech before that speech is delivered to a recipient.
    Type: Grant
    Filed: August 12, 2005
    Date of Patent: August 21, 2012
    Assignee: Avaya Inc.
    Inventors: Colin Blair, Kevin Chan, Christopher R. Gentle, Neil Hepworth, Andrew W. Lang, Paul R. Michaelis
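The lookup-and-confirm flow in the abstract above reduces to a small sketch. The table contents and the `approve` callback are hypothetical; the pinyin-with-tone-number strings are only illustrative placeholders for tonal phrases.

```python
# Hypothetical sketch: map commonly mispronounced phrases to the likely
# intended phrase, ask the speaker to confirm, and apply the correction
# only on approval.

def tonal_correct(phrase, table, approve):
    """table: dict mispronounced phrase -> likely intended phrase.
    approve(suggestion): callback asking the user to accept it."""
    suggestion = table.get(phrase)
    if suggestion is not None and approve(suggestion):
        # The corrected phrase would then drive tonal resynthesis
        # before delivery to the recipient.
        return suggestion
    return phrase  # no table entry, or user declined the suggestion
```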
  • Patent number: 8234114
    Abstract: The present invention relates to a speech interactive system and method. The system comprises a target information receiving module, an interactive mode setting and speech processing module, an interactive information update module, a decision module, and an output response module. It receives target information and sets corresponding target text sentence information. It also receives a user's speech signal, sets an interactive mode, decides the speech's target text sentence information, and generates an assessment for the target text sentence. Under the set interactive mode, the system updates the information in an interactive information recording table according to the assessment and a timing count. According to the interactive mode and the recorded information, an output mode for the target text sentence information is generated. According to the output mode and the recorded information, the response information is generated.
    Type: Grant
    Filed: August 14, 2009
    Date of Patent: July 31, 2012
    Assignee: Industrial Technology Research Institute
    Inventors: Yao-Yuan Chang, Sen-Chia Chang, Shih-Chieh Chien, Jia-Jang Tu
  • Patent number: 8234118
    Abstract: A dialog prosody structure generating method and apparatus, and a speech synthesis method and system employing the dialog prosody structure generation method and apparatus, are provided. The speech synthesis method using the dialog prosody structure generation method includes: determining a system speaking style based on a user utterance; if the system speaking style is dialog speech, generating dialog prosody information by reflecting discourse information between a user and a system; and synthesizing a system utterance based on the generated dialog prosody information.
    Type: Grant
    Filed: May 19, 2005
    Date of Patent: July 31, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kyoungnan Pyo, Jaewon Lee
  • Patent number: 8224649
    Abstract: A method and apparatus for remote access to a target application is disclosed where a system administrator may establish telephonic contact with an interactive voice response system and obtain access to the target application by speech communication. The interactive response system may authenticate the system administrator by implementing various measures including biometric measures. Once access is granted, the interactive response system may broker a communication between the target application using text/data and the system administrator using natural language.
    Type: Grant
    Filed: June 2, 2004
    Date of Patent: July 17, 2012
    Assignee: International Business Machines Corporation
    Inventors: Upendra V. Chaudhari, Ryan L. Osborn, Jason W. Pelecanos, Ganesh N. Ramaswamy, Ran D. Zilca
  • Patent number: 8219384
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: July 10, 2012
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Trausti Kristjansson
  • Patent number: 8209170
    Abstract: Provided herein are systems and methods for using context-sensitive speech recognition logic in a computer to create a software program, including context-aware voice entry of instructions that make up a software program, automatic context-sensitive instruction formatting, and automatic context-sensitive insertion-point positioning.
    Type: Grant
    Filed: June 2, 2011
    Date of Patent: June 26, 2012
    Assignee: Lunis ORCUTT
    Inventor: Lunis Orcutt
  • Patent number: 8209176
    Abstract: A system and method is provided for reducing latency for automatic speech recognition. In one embodiment, intermediate results produced by multiple search passes are used to update a display of transcribed text.
    Type: Grant
    Filed: August 27, 2011
    Date of Patent: June 26, 2012
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Michiel Adriaan Unico Bacchiani, Brian Scott Amento
  • Patent number: 8200490
    Abstract: A method of searching music using speech recognition in a mobile device, the method including: recognizing a speech signal uttered by a user as a phoneme sequence; and searching music information by performing partial symbol matching between the recognized phoneme sequence and a standard pronunciation sequence.
    Type: Grant
    Filed: February 9, 2007
    Date of Patent: June 12, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: In Jeong Choi, Nam Hoon Kim, Ick Sang Han, Sang Bae Jeong
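The partial symbol matching in the abstract above can be illustrated with longest-common-substring scoring over phoneme strings. This is one plausible reading, not the patented algorithm: the toy phoneme strings and the use of `difflib.SequenceMatcher` as the matcher are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical sketch: rank each song title by the longest contiguous run
# of phonemes its standard pronunciation shares with the recognized
# phoneme sequence, so a partial or noisy utterance can still match.

def partial_match_score(query, target):
    """Length of the longest contiguous symbol run shared by the two strings."""
    m = SequenceMatcher(None, query, target)
    match = m.find_longest_match(0, len(query), 0, len(target))
    return match.size

def search_music(recognized, pronunciations):
    """pronunciations: dict title -> standard phoneme string (assumed shape)."""
    return max(pronunciations,
               key=lambda title: partial_match_score(recognized, pronunciations[title]))
```

Even though the recognized sequence covers only part of a title's pronunciation, the shared run dominates the score, which is the point of partial rather than whole-sequence matching.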
  • Patent number: 8195461
    Abstract: A voice recognition system for onboard equipment has a genre database (DB) that stores search target vocabularies in accordance with respective genres. It has a mike 1 for outputting speech sounds as spoken data; a first voice recognition dictionary 2a for recognizing words of search target genres in the genre DB; a second voice recognition dictionary 2b for recognizing words outside the search target genres; a voice recognition unit 3 for recognizing the speech sounds by collating the spoken data delivered from the mike with the vocabularies contained in the first and second voice recognition dictionaries; an interactive control unit 4 for outputting, when a word delivered from the voice recognition unit as a recognition result is a word obtained using the second voice recognition dictionary, a message so stating as presentation information; and a presentation unit 5 for presenting the presentation information to the outside.
    Type: Grant
    Filed: October 4, 2007
    Date of Patent: June 5, 2012
    Assignee: Mitsubishi Electric Corporation
    Inventor: Takayoshi Chikuri
  • Patent number: 8185392
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving voice queries, obtaining, for one or more of the voice queries, feedback information that references an action taken by a user that submitted the voice query after reviewing a result of the voice query, generating, for the one or more voice queries, a posterior recognition confidence measure that reflects a probability that the voice query was correctly recognized, wherein the posterior recognition confidence measure is generated based at least on the feedback information for the voice query, selecting a subset of the one or more voice queries based on the posterior recognition confidence measures, and adapting an acoustic model using the subset of the voice queries.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: May 22, 2012
    Assignee: Google Inc.
    Inventors: Brian Strope, Douglas H. Beeferman
  • Patent number: 8180638
    Abstract: Disclosed herein is a method for emotion recognition based on a minimum classification error. In the method, a speaker's neutral emotion is extracted using a Gaussian mixture model (GMM), and the other emotions are classified using the GMM with a discriminative weight that minimizes the loss function of a classification error for the emotion-recognition feature vector. Recognition is performed by applying a discriminative weight, evaluated using the GMM based on minimum classification error, to the feature vectors of emotions that are difficult to classify, thereby enhancing the performance of emotion recognition.
    Type: Grant
    Filed: February 23, 2010
    Date of Patent: May 15, 2012
    Assignee: Korea Institute of Science and Technology
    Inventors: Hyoung Gon Kim, Ig Jae Kim, Joon-Hyuk Chang, Kye Hwan Lee, Chang Seok Bae
  • Patent number: 8180641
    Abstract: Sequential speech recognition using two unequal automatic speech recognition (ASR) systems may be provided. The system may provide two sets of vocabulary data. A determination may be made as to whether entries in one set of vocabulary data are likely to be confused with entries in the other set of vocabulary data. If confusion is likely, a decoy entry from one set of the vocabulary data may be placed in the other set of vocabulary data to ensure more efficient and accurate speech recognition processing may take place.
    Type: Grant
    Filed: September 29, 2008
    Date of Patent: May 15, 2012
    Assignee: Microsoft Corporation
    Inventors: Michael Levit, Shuangyu Chang, Bruce Melvin Buntschuh
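The decoy-placement idea in the last abstract can be sketched with a simple confusability test. This is a hedged illustration only: using edit distance over spellings as the confusability measure, and a threshold of 1, are assumptions; a real system would compare acoustic or phonetic similarity.

```python
# Hypothetical sketch: when a first-pass vocabulary entry is easily
# confused with a second-pass entry, copy it into the second-pass
# vocabulary as a decoy so the second recognizer can detect (and discard)
# the confusion instead of being forced into a wrong match.

def edit_distance(a, b):
    """Standard Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def place_decoys(vocab_a, vocab_b, threshold=1):
    """Return vocab_b augmented with decoy entries drawn from vocab_a."""
    decoys = {a for a in vocab_a
              for b in vocab_b
              if edit_distance(a, b) <= threshold}
    return set(vocab_b) | decoys
```

If the second pass then recognizes a decoy, the result can be rejected or routed back to the first pass rather than accepted as a spurious second-vocabulary match.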