Preliminary Matching Patents (Class 704/252)
-
Patent number: 8175877
Abstract: The invention comprises a method and apparatus for predicting word accuracy. Specifically, the method comprises obtaining an utterance in speech data where the utterance comprises an actual word string, processing the utterance for generating an interpretation of the actual word string, processing the utterance to identify at least one utterance frame, and predicting a word accuracy associated with the interpretation according to at least one stationary signal-to-noise ratio and at least one non-stationary signal-to-noise ratio, wherein the at least one stationary signal-to-noise ratio and the at least one non-stationary signal-to-noise ratio are determined according to a frame energy associated with each of the at least one utterance frame.
Type: Grant
Filed: February 2, 2005
Date of Patent: May 8, 2012
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Mazin Gilbert, Hong Kook Kim
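The frame-energy-based SNR computation this abstract describes can be sketched roughly as follows. This is an illustrative reading of the abstract only, not the patented method: the frame length, the use of the minimum frame energy as a noise-floor estimate, and the particular stationary/non-stationary split are all assumptions.

```python
import math

def frame_energies(samples, frame_len=160):
    """Split a signal into fixed-size frames and return per-frame energy."""
    return [sum(s * s for s in samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def snr_db(signal_energy, noise_energy):
    """Energy ratio expressed in decibels."""
    return 10.0 * math.log10(signal_energy / noise_energy)

def stationary_and_nonstationary_snr(energies):
    """Stationary SNR: one global figure from the mean frame energy.
    Non-stationary SNR: average of per-frame SNRs, so it reflects
    frame-to-frame variation (both against a crude global noise floor)."""
    noise_floor = min(energies)
    stationary = snr_db(sum(energies) / len(energies), noise_floor)
    per_frame = [snr_db(e, noise_floor) for e in energies]
    nonstationary = sum(per_frame) / len(per_frame)
    return stationary, nonstationary
```

Either figure could then feed a regression that predicts word accuracy for the interpretation, which is the use the abstract describes.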
-
Patent number: 8170881
Abstract: The present invention can include a method of call processing using a distributed voice browser including allocating a plurality of service processors configured to interpret parsed voice markup language data and allocating a plurality of voice markup language parsers configured to retrieve and parse voice markup language data representing a telephony service. The plurality of service processors and the plurality of markup language parsers can be registered with one or more session managers. Accordingly, components of received telephony service requests can be distributed to the voice markup language parsers and the parsed voice markup language data can be distributed to the service processors.
Type: Grant
Filed: July 26, 2011
Date of Patent: May 1, 2012
Assignee: Nuance Communications, Inc.
Inventors: Thomas E. Creamer, Victor S. Moore, Glen R. Walters, Scott Lee Winters
-
Patent number: 8160876
Abstract: A method and apparatus for updating a speech model on a multi-user speech recognition system with a personal speech model for a single user. A speech recognition system, for instance in a car, can include a generic speech model for comparison with the user speech input. A way of identifying a personal speech model, for instance in a mobile phone, is connected to the system. A mechanism is included for receiving personal speech model components, for instance a BLUETOOTH connection. The generic speech model is updated using the received personal speech model components. Speech recognition can then be performed on user speech using the updated generic speech model.
Type: Grant
Filed: September 29, 2004
Date of Patent: April 17, 2012
Assignee: Nuance Communications, Inc.
Inventors: Barry Neil Dow, Eric William Janke, Daniel Lee Yuk Cheung, Benjamin Terrick Staniford
-
Patent number: 8155960
Abstract: A system and method is provided for combining active and unsupervised learning for automatic speech recognition. This process enables a reduction in the amount of human supervision required for training acoustic and language models and an increase in the performance given the transcribed and un-transcribed data.
Type: Grant
Filed: September 19, 2011
Date of Patent: April 10, 2012
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Dilek Zeynep Hakkani-Tur, Giuseppe Riccardi
-
Patent number: 8150691
Abstract: During the replaying of audio data stored in a memory, which audio data corresponds to text data from a text composed of words, the replaying of the audio data in forward and reverse modes is controlled. Starting from a particular momentary replay position in the audio data, a backward jump over a return distance corresponding to the length of about at least two words, to a target position, is automatically initiated for the replaying of the audio data in the reverse mode. Then, starting from the particular target position, a replay of the audio data in the forward sequence for just one part of the return distance is undertaken.
Type: Grant
Filed: October 13, 2003
Date of Patent: April 3, 2012
Assignee: Nuance Communications Austria GmbH
Inventor: Kwaku Frimpong-Ansah
-
Patent number: 8145487
Abstract: A voice recognition apparatus recognizes speaker's voice collected by a microphone, determines whether a telephone number is grouped into categories based on an inclusion of vocabulary in the telephone number that divides the telephone number into groups such as an area code, a city code and a subscriber number, and displays the telephone number in a display part in a grouped form of the area code, city code and subscriber number.
Type: Grant
Filed: January 24, 2008
Date of Patent: March 27, 2012
Assignee: DENSO CORPORATION
Inventors: Ryuichi Suzuki, Manabu Otsuka, Katsushi Asami
-
Patent number: 8140328
Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for using alternate recognition hypotheses to improve whole-dialog understanding accuracy. The method includes receiving an utterance as part of a user dialog, generating an N-best list of recognition hypotheses for the user dialog turn, selecting an underlying user intention based on a belief distribution across the generated N-best list and at least one contextually similar N-best list, and responding to the user based on the selected underlying user intention. Selecting an intention can further be based on confidence scores associated with recognition hypotheses in the generated N-best lists, and also on the probability of a user's action given their underlying intention. A belief or cumulative confidence score can be assigned to each inferred user intention.
Type: Grant
Filed: December 1, 2008
Date of Patent: March 20, 2012
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Jason Williams
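The core idea — pooling confidence mass over intents across the current N-best list and contextually similar ones — can be sketched as below. This is a minimal reading of the abstract, not the patented belief-update machinery; treating summed confidence scores as the belief distribution is an assumption.

```python
from collections import defaultdict

def select_intent(nbest_lists):
    """Accumulate confidence mass for each hypothesized intent across one
    or more N-best lists, normalize into a belief distribution, and return
    the most probable underlying intent together with the distribution."""
    belief = defaultdict(float)
    for nbest in nbest_lists:
        for hypothesis, confidence in nbest:
            belief[hypothesis] += confidence
    total = sum(belief.values()) or 1.0
    belief = {h: c / total for h, c in belief.items()}
    return max(belief, key=belief.get), belief
```

Note how a hypothesis that is second-best in every individual list can still win overall, which is exactly the whole-dialog gain the abstract claims over trusting each turn's top hypothesis.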
-
Patent number: 8126704
Abstract: An apparatus, a server, a method, and a tangible machine-readable medium thereof for processing and recognizing a sound signal are provided. The apparatus is configured to sense the sound signal of the environment and to dynamically derive and to transmit a feature signal and a sound feature message of the sound signal to the server. The server is configured to retrieve the stored sound models according to the sound feature message and to compare each of the sound models with the feature signal to determine whether the sound signal is abnormal after receiving the feature signal and the sound feature message.
Type: Grant
Filed: February 20, 2008
Date of Patent: February 28, 2012
Assignee: Institute for Information Industry
Inventor: Ing-Jr Ding
-
Patent number: 8117035
Abstract: A method and device for verification of an identity of a subscriber of a communication service on a telecommunications network is provided. The communication service requires authentication of the subscriber. The verification includes comparing a reference biometric with at least one biometric characteristic detected from a biometric sample of the subscriber, in order to provide the subscriber with access to the restricted communication service. The reference biometric can be adapted and used for verification purposes based on the different security requirements of the various communication services provided on the telecommunications network.
Type: Grant
Filed: April 20, 2007
Date of Patent: February 14, 2012
Assignee: Deutsche Telekom AG
Inventors: Fred Runge, Juergen Emhardt
-
Patent number: 8108215
Abstract: An apparatus and method for recognizing paraphrases of uttered phrases, such as place names. At least one keyword contained in a speech utterance is recognized. Then, the keyword(s) contained in the speech utterance are re-recognized using a phrase including the keyword(s). Based on both recognition results, it is determined whether a paraphrase could have been uttered. If a paraphrase could have been uttered, a phrase corresponding to the paraphrase is determined as a result of speech recognition of the speech utterance.
Type: Grant
Filed: October 22, 2007
Date of Patent: January 31, 2012
Assignee: Nissan Motor Co., Ltd.
Inventors: Keiko Katsuragawa, Minoru Tomikashi, Takeshi Ono, Daisuke Saitoh, Eiji Tozuka
-
Patent number: 8108213
Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script. In still further aspects of the invention, the duration of a given interaction can be analyzed, either apart from or in combination with the script compliance analysis above, to seek to identify instances of agent non-compliance, of fraud, or of quality-analysis issues.
Type: Grant
Filed: January 13, 2010
Date of Patent: January 31, 2012
Assignee: West Corporation
Inventors: Mark J Pettay, Fonda J Narke
-
Patent number: 8103504
Abstract: An electronic appliance includes a speaker which outputs a first sound wave based on a first voice signal generated from the electronic appliance, and a microphone to detect a second sound wave on which a sound wave generated for control of the electronic appliance is superimposed to output a second voice signal. A first waveform generator generates a first waveform signal based on the first voice signal, and a second waveform generator generates a second waveform signal based on the second voice signal. A waveform shaping unit outputs a third waveform signal in which the first waveform signal is enlarged in a time axis direction, and a subtracter subtracts the third waveform signal from the second waveform signal.
Type: Grant
Filed: August 24, 2007
Date of Patent: January 24, 2012
Assignee: Victor Company of Japan, Limited
Inventors: Hirokazu Ohguri, Masahiro Kitaura
-
Patent number: 8103508
Abstract: A voice activated language translation system that is accessed by telephones where voice messages of a caller are translated into a selected language and returned to the caller or optionally sent to another caller. A voice recognition system converts the voice messages into text of a first language. The text is then translated into text of the selected language. The text of the selected language is then converted into voice.
Type: Grant
Filed: February 19, 2003
Date of Patent: January 24, 2012
Assignee: Mitel Networks Corporation
Inventor: John Raymond Lord
-
Patent number: 8099279
Abstract: A method of aiding a speech recognition program developer by grouping calls passing through an identified question-answer (QA) state or transition into clusters based on causes of problems associated with the calls is provided. The method includes determining a number of clusters into which a plurality of calls will be grouped. Then, the plurality of calls is at least partially randomly assigned to the different clusters. Model parameters are estimated using clustering information based upon the assignment of the plurality of calls to the different clusters. Individual probabilities are calculated for each of the plurality of calls using the estimated model parameters. The individual probabilities are indicative of a likelihood that the corresponding call belongs to a particular cluster. The plurality of calls is then re-assigned to the different clusters based upon the calculated probabilities. These steps are then repeated until the grouping of the plurality of calls achieves a desired stability.
Type: Grant
Filed: February 9, 2005
Date of Patent: January 17, 2012
Assignee: Microsoft Corporation
Inventors: Alejandro Acero, Dong Yu
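The iterate-until-stable loop described here (random assignment, parameter estimation, probabilistic re-assignment) is the familiar EM/k-means pattern. A minimal hard-assignment sketch over a one-dimensional per-call feature is shown below; the scalar feature, one-mean-per-cluster model, and nearest-mean re-assignment are simplifying assumptions, not the patented model.

```python
import random

def cluster_calls(features, k, iterations=20, seed=0):
    """Hard-EM sketch: randomly assign calls to k clusters, estimate one
    mean per cluster, re-assign each call to its most likely (nearest)
    cluster, and repeat until the grouping stops changing."""
    rng = random.Random(seed)
    assign = [rng.randrange(k) for _ in features]
    for _ in range(iterations):
        # Estimate model parameters: one mean per cluster.
        means = []
        for c in range(k):
            members = [f for f, a in zip(features, assign) if a == c]
            means.append(sum(members) / len(members) if members
                         else rng.choice(features))
        # Re-assign each call to the cluster whose mean is closest.
        new_assign = [min(range(k), key=lambda c: abs(f - means[c]))
                      for f in features]
        if new_assign == assign:   # desired stability reached
            break
        assign = new_assign
    return assign
```

In the patent's setting the feature would encode a cause-of-problem signature rather than a single number, but the convergence loop is the same shape.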
-
Patent number: 8090584
Abstract: Methods, systems, and computer program products are provided for modifying a grammar of a hierarchical multimodal menu that include monitoring a user invoking a speech command in a first tier grammar, and adding the speech command to a second tier grammar in dependence upon the frequency of the user invoking the speech command. Adding the speech command to a second tier grammar may be carried out by adding the speech command to a higher tier grammar or by adding the speech command to a lower tier grammar. Adding the speech command to a second tier grammar may include storing the speech command in a grammar cache in the second tier grammar.
Type: Grant
Filed: June 16, 2005
Date of Patent: January 3, 2012
Assignee: Nuance Communications, Inc.
Inventors: Charles W. Cross, Jr., Michael C. Hollinger, Igor R. Jablokov, Benjamin D. Lewis, Hilary A. Pike, Daniel M. Smith, David W. Wintermute, Michael A. Zaitzeff
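The frequency-driven promotion this abstract describes can be sketched in a few lines. The threshold value and the single promotion direction (first tier to a second-tier cache) are assumptions for illustration; the patent covers promotion in either direction.

```python
from collections import Counter

class TieredGrammar:
    """Sketch: speech commands invoked often enough in the first-tier
    grammar are promoted into a second-tier grammar cache."""

    def __init__(self, threshold=3):
        self.threshold = threshold      # invocations before promotion
        self.usage = Counter()          # per-command invocation counts
        self.second_tier = set()        # the grammar cache

    def invoke(self, command):
        self.usage[command] += 1
        if self.usage[command] >= self.threshold:
            self.second_tier.add(command)
```

A command the user repeats then becomes reachable in the second tier without navigating the menu hierarchy, which is the usability gain the abstract is after.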
-
Patent number: 8090579
Abstract: A system and method are described for recognizing repeated audio material within at least one media stream without prior knowledge of the nature of the repeated material. The system and method are able to create a screening database from the media stream or streams. An unknown sample audio fragment is taken from the media stream and compared against the screening database to find if there are matching fragments within the media streams by determining if the unknown sample matches any samples in the screening database.
Type: Grant
Filed: February 8, 2006
Date of Patent: January 3, 2012
Assignee: Landmark Digital Services
Inventors: David L. DeBusk, Darren P. Briggs, Michael Karliner, Richard Wing Cheong Tang, Avery Li-Chun Wang
-
Patent number: 8055503
Abstract: A system and method provide an audio analysis intelligence tool with ad-hoc search capabilities using spoken words as an organized data form. An SQL-like interface is used to process and search audio data and combine it with other traditional data forms to enhance searching of audio segments to identify those audio segments satisfying minimum confidence levels for a match.
Type: Grant
Filed: November 1, 2006
Date of Patent: November 8, 2011
Assignee: Siemens Enterprise Communications, Inc.
Inventors: Robert Scarano, Lawrence Mark
-
Patent number: 8055504
Abstract: Exemplary methods, systems, and products are disclosed for synchronizing visual and speech events in a multimodal application, including receiving from a user speech; determining a semantic interpretation of the speech; calling a global application update handler; identifying, by the global application update handler, an additional processing function in dependence upon the semantic interpretation; and executing the additional function. Typical embodiments may include updating a visual element after executing the additional function. Typical embodiments may include updating a voice form after executing the additional function. Typical embodiments also may include updating a state table after updating the voice form. Typical embodiments also may include restarting the voice form after executing the additional function.
Type: Grant
Filed: April 3, 2008
Date of Patent: November 8, 2011
Assignee: Nuance Communications, Inc.
Inventors: Charles W. Cross, Michael C. Hollinger, Igor R. Jablokov, David B. Lewis, Hilary A. Pike, Daniel M. Smith, David W. Wintermute, Michael A. Zaitzeff
-
Patent number: 8046223
Abstract: To improve the accuracy of the voice recognition system for an AV system, the present invention includes a reflected sound remover having a plurality of filters, the reflected sound remover being configured to receive an input sound signal including a reflected AV system audio, a user's voice, and a noise, and being configured to remove the reflected audio from the input sound according to user's voice information; a voice detector detecting the user's voice from a signal outputted from the reflected sound remover and obtaining the user's voice information based on the detected user's voice; and a voice recognition unit comparing the detected user's voice with voice patterns that belong to at least one model.
Type: Grant
Filed: July 6, 2004
Date of Patent: October 25, 2011
Assignee: LG Electronics Inc.
Inventors: Min Ho Jin, Jong Keun Shin, Chang D. Yoo, Sang Gyun Kim, Jong Uk Kim
-
Patent number: 8024190
Abstract: A system and method is provided for combining active and unsupervised learning for automatic speech recognition. This process enables a reduction in the amount of human supervision required for training acoustic and language models and an increase in the performance given the transcribed and un-transcribed data.
Type: Grant
Filed: March 30, 2009
Date of Patent: September 20, 2011
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Dilek Zeynep Hakkani-Tur, Giuseppe Riccardi
-
Patent number: 8010359
Abstract: Provided are a speech recognition system, a method and a storage medium capable of, even in a case where plural speakers input superimposed speeches, recognizing the speech of each individual speaker and making a single application program sharable among the speakers in execution. In a speech recognition system receiving speeches of plural speakers to execute a predetermined application program, the received speeches are separated according to the respective speakers if necessary, the received speeches of individual speakers are speech-recognized, results of speech recognition are matched with data items necessary for executing the application program, one of results of recognition of plural speeches which are found as a result of the matching to be overlapping is selected, and the results of recognition of plural speeches which are found as a result of the matching not to be overlapping are linked to the selected result of speech recognition.
Type: Grant
Filed: June 24, 2005
Date of Patent: August 30, 2011
Assignee: Fujitsu Limited
Inventor: Naoshi Matsuo
-
Patent number: 8010360
Abstract: A system and method is provided for reducing latency for automatic speech recognition. In one embodiment, intermediate results produced by multiple search passes are used to update a display of transcribed text.
Type: Grant
Filed: December 15, 2009
Date of Patent: August 30, 2011
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Michiel Adriaan Unico Bacchiani, Brian Scott Amento
-
Patent number: 8010338
Abstract: A method for dynamically modifying an outgoing message language includes receiving a message from a sender. A language associated with the received message is identified and an outgoing message language is automatically set to the identified language associated with the received message.
Type: Grant
Filed: November 27, 2006
Date of Patent: August 30, 2011
Assignee: Sony Ericsson Mobile Communications AB
Inventor: Ola Karl Thörn
-
Patent number: 8005675
Abstract: An apparatus and method for an improved audio analysis process is disclosed. The improvement concerns the accuracy level of the results and the rate of false alarms produced by the audio analysis process. The proposed apparatus and method provides a three-stage audio analysis route. The three-stage analysis process includes a pre-analysis stage, a main analysis stage and a post-analysis stage.
Type: Grant
Filed: March 17, 2005
Date of Patent: August 23, 2011
Assignee: Nice Systems, Ltd.
Inventors: Moshe Wasserblat, Oren Pereg
-
Patent number: 8000452
Abstract: A method for a predictive interactive voice recognition system includes receiving a voice call, associating said voice call with a behavioral pattern, and invoking a service context responsive to said behavioral pattern. The system provides advantages of improved voice recognition and more efficient use of the voice user interface to obtain services.
Type: Grant
Filed: July 26, 2004
Date of Patent: August 16, 2011
Assignee: General Motors LLC
Inventors: Gary A. Watkins, James M. Smith
-
Patent number: 7996223
Abstract: A system and method may be disclosed for facilitating the conversion of dictation into usable and formatted documents by providing a method of post processing speech recognition output. In particular, the post processing system may be configured to implement rewrite rules and process raw speech recognition output or other raw data according to those rewrite rules. The application of the rewrite rules may format and/or normalize the raw speech recognition output into formatted or finalized documents and reports. The system may thereby reduce or eliminate the need for post processing by transcriptionists or dictation authors.
Type: Grant
Filed: September 29, 2004
Date of Patent: August 9, 2011
Assignee: Dictaphone Corporation
Inventors: Alan Frankel, Ana Santisteban
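A rewrite-rule post-processor of the kind described can be sketched as an ordered list of pattern/replacement pairs applied to the raw transcript. The specific rules below (spoken punctuation, paragraph commands, spacing cleanup) are hypothetical examples, not the patent's rule set.

```python
import re

# Hypothetical rewrite rules; each pair normalizes one aspect of raw
# dictation output. Order matters: punctuation words are rewritten
# first, then spacing around the resulting punctuation is fixed.
REWRITE_RULES = [
    (re.compile(r"\bnew paragraph\b", re.I), "\n\n"),
    (re.compile(r"\bperiod\b", re.I), "."),
    (re.compile(r"\bcomma\b", re.I), ","),
    (re.compile(r"[ \t]+([.,])"), r"\1"),   # drop space before punctuation
]

def post_process(raw):
    """Apply each rewrite rule in order to the raw recognition output."""
    text = raw
    for pattern, replacement in REWRITE_RULES:
        text = pattern.sub(replacement, text)
    return text
```

In a real deployment the rule set would be far larger (dates, units, report section headings) and likely data-driven, but the apply-in-order structure is the essence of the approach.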
-
Patent number: 7991619
Abstract: A system, method and computer program product for performing blind change detection audio segmentation that combines hypothesized boundaries from several segmentation algorithms to achieve the final segmentation of the audio stream. Automatic segmentation of the audio streams according to the system and method of the invention may be used for many applications like speech recognition, speaker recognition, audio data mining, online audio indexing, and information retrieval systems, where the actual boundaries of the audio segments are required.
Type: Grant
Filed: June 19, 2008
Date of Patent: August 2, 2011
Assignee: International Business Machines Corporation
Inventors: Upendra V. Chaudhari, Mohamed Kamal Omar, Ganesh N. Ramaswamy
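One simple way to combine hypothesized boundaries from several segmenters is voting within a time tolerance, sketched below. The tolerance, the vote threshold, and averaging the grouped times are illustrative assumptions; the patent does not commit to this particular combination scheme.

```python
def combine_boundaries(hypotheses, tolerance=0.25, min_votes=2):
    """Merge boundary hypotheses (times in seconds) from several
    segmentation algorithms: boundaries within `tolerance` of each other
    form a group, and a group supported by at least `min_votes`
    hypotheses is kept, placed at the group's mean time."""
    times = sorted(t for hyp in hypotheses for t in hyp)
    merged, group = [], []
    for t in times:
        if group and t - group[0] > tolerance:
            if len(group) >= min_votes:
                merged.append(sum(group) / len(group))
            group = []
        group.append(t)
    if len(group) >= min_votes:
        merged.append(sum(group) / len(group))
    return merged
```

A boundary proposed by only one algorithm (like a spurious 7.0 s hypothesis) is discarded, which is the robustness benefit of combining segmenters.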
-
Patent number: 7983913
Abstract: In one embodiment, the present system recognizes a user's speech input using an automatically generated probabilistic context free grammar for street names that maps all pronunciation variations of a street name to a single canonical representation during recognition. A tokenizer expands the representation using position-dependent phonetic tokens and an intersection classifier classifies an intersection, despite the presence of recognition errors and incomplete street names.
Type: Grant
Filed: July 31, 2007
Date of Patent: July 19, 2011
Assignee: Microsoft Corporation
Inventors: Michael L. Seltzer, Yun-Cheng Ju, Ivan J. Tashev
-
Patent number: 7979278
Abstract: A user term information extraction unit extracts term information of a user out of information that has been input by the user to an application for use other than speech recording beforehand, and a speech recognition dictionary management unit expands a vocabulary of a speech recognition dictionary according to the term information of the user. Next, the user inputs speech via a speech input unit, and a speech recognition unit executes speech recognition using the speech recognition dictionary. A representative term information selection unit extracts the term information of the user contained in the speech recognition result, and selects one or a plurality of pieces of representative term information from the term information of the user. A speech file recording unit records the speech data as a speech file, and renders a file name of the speech file according to the representative term information.
Type: Grant
Filed: November 1, 2002
Date of Patent: July 12, 2011
Assignee: Fujitsu Limited
Inventor: Naoshi Matsuo
-
Patent number: 7979279
Abstract: A system and method for providing enhanced security through the monitoring of communications. In one embodiment, the monitoring process is aided through an automatic speech recognition process that is focused on the recognition of words from a limited vocabulary.
Type: Grant
Filed: December 30, 2003
Date of Patent: July 12, 2011
Assignee: AT&T Intellectual Property I, LP
Inventor: Vicki Karen McKinney
-
Patent number: 7974843
Abstract: The invention relates to an operating method for an automated language recognizer intended for the speaker-independent language recognition of words from different languages, particularly for recognizing names from different languages. The method is based on a language defined as the mother tongue and has an input phase for establishing a language recognizer vocabulary. Phonetic transcripts are determined for words in various languages in order to obtain phoneme sequences for pronunciation variants. The phonemes of each relevant phoneme set of the mother tongue are then specifically mapped to determine phoneme sequences that correspond to pronunciation variants.
Type: Grant
Filed: January 2, 2003
Date of Patent: July 5, 2011
Assignee: Siemens Aktiengesellschaft
Inventor: Tobias Schneider
-
Publication number: 20110161084
Abstract: Apparatus, method and system for generating a threshold for utterance verification are introduced herein. When a processing object is determined, a recommendation threshold is generated according to an expected utterance verification result. In addition, extra collection of corpuses or training models is not necessary for the utterance verification introduced here. The processing object can be a recognition object or an utterance verification object. In the apparatus, method and system for generating a threshold for utterance verification, at least one of the processing objects is received and then a speech unit sequence is generated therefrom. One or more values corresponding to each speech unit of the speech unit sequence are obtained accordingly, and then a recommendation threshold is generated based on an expected utterance verification result.
Type: Application
Filed: June 24, 2010
Publication date: June 30, 2011
Applicant: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE
Inventors: Cheng-Hsien Lin, Sen-Chia Chang, Chi-Tien Chiu
-
Patent number: 7970611
Abstract: Example embodiments provide a speaker authentication technology that compensates for mismatches between enrollment process conditions and test process conditions using correction parameters or correction models, which allow for correcting one of the test voice characterizing parameter set and the enrollment voice characterizing parameter set according to a mismatch between the test process conditions and the enrollment process conditions, thereby obtaining values for the test voice characterizing parameter set and the enrollment voice characterizing parameter set that are based on the same or at least similar process conditions. Alternatively, each of the enrollment and test voice characterizing parameter sets may be normalized to predetermined standard process conditions by using the correction parameters or correction models.
Type: Grant
Filed: May 2, 2006
Date of Patent: June 28, 2011
Assignee: Voice.Trust AG
Inventors: Raja Kuppuswamy, Christian S Pilz
-
Patent number: 7966183
Abstract: Automatic speech recognition verification using a combination of two or more confidence scores based on UV features which reuse computations of the original recognition.
Type: Grant
Filed: May 4, 2007
Date of Patent: June 21, 2011
Assignee: Texas Instruments Incorporated
Inventors: Kaisheng Yao, Lorin Paul Netsch, Vishu Viswanathan
-
Patent number: 7966182
Abstract: Provided herein are systems and methods for using context-sensitive speech recognition logic in a computer to create a software program, including context-aware voice entry of instructions that make up a software program, automatic context-sensitive instruction formatting, and automatic context-sensitive insertion-point positioning.
Type: Grant
Filed: June 20, 2006
Date of Patent: June 21, 2011
Inventor: Lunis Orcutt
-
Patent number: 7945444
Abstract: In detection systems, such as speaker verification systems, for a given operating point range, with an associated detection “cost”, the detection cost is preferably reduced by essentially trading off the system error in the area of interest with areas essentially “outside” that interest. Among the advantages achieved thereby are higher optimization gain and better generalization. From a measurable Detection Error Tradeoff (DET) curve of the given detection system, a criterion is preferably derived, such that its minimization provably leads to detection cost reduction in the area of interest. The criterion allows for selective access to the slope and offset of the DET curve (a line in case of normally distributed detection scores, a curve approximated by mixture of Gaussians in case of other distributions). By modifying the slope of the DET curve, the behavior of the detection system is changed favorably with respect to the given area of interest.
Type: Grant
Filed: September 8, 2008
Date of Patent: May 17, 2011
Assignee: Nuance Communications, Inc.
Inventors: Jiri Navratil, Ganesh N. Ramaswamy
-
Patent number: 7925506
Abstract: The invention provides a system and method for improving speech recognition. A computer software system is provided for implementing the system and method. A user of the computer software system may speak to the system directly and the system may respond, in spoken language, with an appropriate response. Grammar rules may be generated automatically from sample utterances when implementing the system for a particular application. Dynamic grammar rules may also be generated during interaction between the user and the system. In addition to arranging searching order of grammar files based on a predetermined hierarchy, a dynamically generated searching order based on history of contexts of a single conversation may be provided for further improved speech recognition.
Type: Grant
Filed: October 5, 2004
Date of Patent: April 12, 2011
Assignee: Inago Corporation
Inventors: Gary Farmaner, Ron Dicarlantonio, Huw Leonard
-
Patent number: 7917365
Abstract: Exemplary methods, systems, and products are disclosed for synchronizing visual and speech events in a multimodal application, including receiving from a user speech; determining a semantic interpretation of the speech; calling a global application update handler; identifying, by the global application update handler, an additional processing function in dependence upon the semantic interpretation; and executing the additional function. Typical embodiments may include updating a visual element after executing the additional function. Typical embodiments may include updating a voice form after executing the additional function. Typical embodiments also may include updating a state table after updating the voice form. Typical embodiments also may include restarting the voice form after executing the additional function.
Type: Grant
Filed: June 16, 2005
Date of Patent: March 29, 2011
Assignee: Nuance Communications, Inc.
Inventors: Charles W. Cross, Jr., Michael C. Hollinger, Igor R. Jablokov, Benjamin D. Lewis, Hilary A. Pike, Daniel M. Smith, David W. Wintermute, Michael A. Zaitzeff
-
Patent number: 7917364
Abstract: A system comprises a computer system comprising a central processing unit coupled to a memory and resource management application. A plurality of different automatic speech recognition (ASR) engines is coupled to the computer system. The computer system is adapted to select ASR engines to analyze a speech utterance based on resources available on the system.
Type: Grant
Filed: September 23, 2003
Date of Patent: March 29, 2011
Assignee: Hewlett-Packard Development Company, L.P.
Inventor: Sherif Yacoub
-
Patent number: 7912721
Abstract: A system and method for automatic recognition of foreign speakers by performing analysis of a speech sample to produce a signal representative thereof and attempting to match the representative signal to one of a plurality of predetermined sounds to produce recognition; and determining whether a gap exists in a table of predetermined sounds in a predetermined language and, if a gap exists, substituting for matching a sound from a position in the table near the gap. This automatic substitution of known foreign pronunciation characteristics improves and modifies the output from standard, monolingual automatic speech recognition. There is no requirement to implement either adaptation or to impose additional manual modification of the baseform pool on the application developer. In addition, the danger of increased ambiguity is removed by introducing an explicit accent and/or language checker which queries known differences across accents.
Type: Grant
Filed: December 29, 2005
Date of Patent: March 22, 2011
Assignee: Nuance Communications, Inc.
Inventors: Barry Neil Dow, Stephen Graham Copinger Lawrence, John Brian Pickering
-
Patent number: 7912716
Abstract: Generating words and/or names, comprising: receiving at least one corpus based on a given language; generating a plurality of N-grams of phonemes and a plurality of frequencies of occurrence using the corpus, such that each frequency of occurrence corresponds to a respective pair of phonemes and indicates the frequency of the second phoneme in the pair following the first phoneme in the pair; generating a phoneme tree using the plurality of N-grams of phonemes and the plurality of frequencies of occurrence; performing a random walk on the phoneme tree using the frequencies of occurrence to generate a sequence of phonemes; and mapping the sequence of phonemes into a sequence of graphemes.Type: Grant
Filed: October 6, 2005
Date of Patent: March 22, 2011
Assignee: Sony Online Entertainment LLC
Inventor: Patrick McCuller
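The steps above reduce to a bigram (N=2) Markov walk over phonemes. A minimal sketch with a toy corpus and phoneme-to-grapheme map (both invented for illustration):

```python
import random
from collections import defaultdict

def build_bigram_counts(corpus):
    """Count how often each phoneme follows another (N-grams with N=2)."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in corpus:
        for a, b in zip(word, word[1:]):
            counts[a][b] += 1
    return counts

def random_walk(counts, start, length, rng=None):
    """Walk the implicit phoneme tree, sampling each next phoneme in
    proportion to its observed follow frequency."""
    rng = rng or random.Random(0)
    seq = [start]
    while len(seq) < length:
        followers = counts.get(seq[-1])
        if not followers:
            break
        phones = list(followers)
        weights = [followers[p] for p in phones]
        seq.append(rng.choices(phones, weights=weights)[0])
    return seq

# Hypothetical mini-corpus of phoneme sequences and grapheme map.
corpus = [["k", "a", "t"], ["k", "a", "r"], ["b", "a", "t"]]
graphemes = {"k": "c", "a": "a", "t": "t", "r": "r", "b": "b"}
counts = build_bigram_counts(corpus)
name = "".join(graphemes[p] for p in random_walk(counts, "k", 3))
```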
-
Patent number: 7899671
Abstract: Systems and techniques for analyzing voice recognition results in order to improve efficiency and accuracy of voice recognition. When a voice activated module undertakes a voice recognition attempt, it invokes a voice recognition module that constructs a list of voice recognition results. The list is analyzed by a results postprocessor that employs information relating to past recognition results and user information to make changes to the list. The results postprocessor may delete results that have been previously rejected during a current recognition transaction and may further alter and reorder the results list based on historical results. The results postprocessor may further alter and reorder the results list based on information relating to the user engaging in the recognition attempt.
Type: Grant
Filed: February 5, 2004
Date of Patent: March 1, 2011
Assignee: Avaya, Inc.
Inventors: Robert S. Cooper, Derek Sanders, Vladimir Sergeyevich Tokarev
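A toy version of such a results postprocessor might look like the following; the `(text, score)` tuple format and the history-first ordering rule are illustrative assumptions:

```python
def postprocess(results, rejected, history):
    """Drop results already rejected in the current transaction, then
    reorder the rest so historically confirmed results rank first,
    breaking ties with the recognizer's own score."""
    kept = [(text, score) for text, score in results if text not in rejected]
    kept.sort(key=lambda ts: (history.get(ts[0], 0), ts[1]), reverse=True)
    return [text for text, _ in kept]

# Hypothetical recognition list, best-first by recognizer score.
results = [("Austin", 0.9), ("Boston", 0.8), ("Houston", 0.7)]
ranked = postprocess(results, rejected={"Austin"}, history={"Houston": 5})
```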
-
Patent number: 7885818
Abstract: An apparatus with a speech control unit includes a microphone array having multiple microphones for receiving respective audio signals, and a beam forming module for extracting a speech signal of a user from the audio signals. A keyword recognition system recognizes a predetermined keyword spoken by the user, represented by a particular audio signal, and is arranged to control the beam forming module on the basis of that recognition. A speech recognition unit creates an instruction for the apparatus based on recognized speech items of the speech signal. As a consequence, the speech control unit is more selective for those parts of the audio signals for speech recognition which correspond to speech items spoken by the user.
Type: Grant
Filed: September 22, 2003
Date of Patent: February 8, 2011
Assignee: Koninklijke Philips Electronics N.V.
Inventor: Fabio Vignoli
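The beam forming step the keyword recognizer steers can be illustrated with a toy delay-and-sum beamformer: each microphone signal is shifted by a per-channel sample delay (here chosen as if the keyword had localized the speaker) and averaged, reinforcing the steered direction. The integer delays and two-microphone signals are invented for illustration:

```python
def delay_and_sum(signals, delays):
    """Toy delay-and-sum beamformer: shift each microphone signal by an
    integer sample delay and average across microphones, so a source
    arriving with those relative delays adds up coherently."""
    n = len(signals[0])
    out = []
    for t in range(n):
        vals = [sig[t - d] if 0 <= t - d < n else 0.0
                for sig, d in zip(signals, delays)]
        out.append(sum(vals) / len(signals))
    return out

# A unit impulse reaching microphone 1 one sample before microphone 2;
# delaying microphone 1 by one sample aligns the two channels.
signals = [[0, 1, 0, 0],
           [0, 0, 1, 0]]
beamformed = delay_and_sum(signals, delays=[1, 0])
```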
-
Patent number: 7881935
Abstract: A speech recognition apparatus in which speech recognition accuracy is improved while resource usage is prevented from increasing. Words that are probable as the speech recognition result are selected on the basis of an acoustic score and a linguistic score, while word selection is also performed on the basis of measures different from the acoustic score, such as the number of phonemes being small, the part of speech being a pre-set one, inclusion in past speech recognition results, or the linguistic score being not less than a pre-set value. The words so selected are subjected to matching processing.
Type: Grant
Filed: February 16, 2001
Date of Patent: February 1, 2011
Assignee: Sony Corporation
Inventors: Yasuharu Asano, Katsuki Minamino, Hiroaki Ogawa, Helmut Lucke
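The multi-measure selection could be sketched as follows; the candidate fields, the thresholds, and the combined-score rule are illustrative assumptions rather than the patented values:

```python
def select_words(candidates, past_results, preset_pos=("noun",),
                 min_linguistic=0.5, max_phonemes=3):
    """Keep a word if its combined acoustic+linguistic score is high, or
    if any secondary measure flags it: few phonemes, a pre-set part of
    speech, appearance in past results, or a high linguistic score."""
    selected = []
    for w in candidates:
        by_score = w["acoustic"] + w["linguistic"] >= 1.0
        by_other = (len(w["phonemes"]) <= max_phonemes
                    or w["pos"] in preset_pos
                    or w["word"] in past_results
                    or w["linguistic"] >= min_linguistic)
        if by_score or by_other:
            selected.append(w["word"])
    return selected

# Hypothetical candidates.
candidates = [
    {"word": "go", "phonemes": ["g", "ow"], "pos": "verb",
     "acoustic": 0.3, "linguistic": 0.2},
    {"word": "cat", "phonemes": ["k", "ae", "t"], "pos": "noun",
     "acoustic": 0.2, "linguistic": 0.1},
    {"word": "xylophonist", "phonemes": ["z", "ay", "l", "ax", "f", "ow",
                                         "n", "ih", "s", "t"], "pos": "verb",
     "acoustic": 0.1, "linguistic": 0.1},
]
picked = select_words(candidates, past_results=set())
```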
-
Patent number: 7877255
Abstract: A method for automatic speech recognition includes determining for an input signal a plurality of scores representative of certainties that the input signal is associated with corresponding states of a speech recognition model, using the speech recognition model and the determined scores to compute an average signal, computing a difference value representative of a difference between the input signal and the average signal, and processing the input signal in accordance with the difference value.
Type: Grant
Filed: March 31, 2006
Date of Patent: January 25, 2011
Assignee: Voice Signal Technologies, Inc.
Inventor: Igor Zlokarnik
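Numerically, the idea can be sketched like this; the Euclidean difference measure and the per-state mean vectors standing in for the model are assumptions made for illustration:

```python
def average_signal(state_means, scores):
    """Average of model state mean vectors, weighted by the certainty
    score assigned to each state for the current input frame."""
    total = sum(scores)
    dim = len(state_means[0])
    return [sum(s * m[i] for s, m in zip(scores, state_means)) / total
            for i in range(dim)]

def difference_value(frame, state_means, scores):
    """Euclidean distance between the input frame and the average."""
    avg = average_signal(state_means, scores)
    return sum((f - a) ** 2 for f, a in zip(frame, avg)) ** 0.5

# Two model states with 2-dimensional mean vectors.
state_means = [[1.0, 2.0], [3.0, 4.0]]
d_match = difference_value([1.0, 2.0], state_means, scores=[1.0, 0.0])
d_mixed = difference_value([1.0, 2.0], state_means, scores=[0.5, 0.5])
```

A frame that sits exactly on a confidently scored state gives a zero difference; ambiguous scores pull the average away from the frame and the difference grows.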
-
Patent number: 7865364
Abstract: A method for improving speech recognition accuracy includes utilizing skiplists, or lists of values that cannot occur because of improbability or impossibility. A table or list is stored in a dialog manager module. The table includes a plurality of information items and a corresponding list of improbable values for each of the plurality of information items. A plurality of recognized ordered interpretations is received from an automatic speech recognition (ASR) engine. Each of the plurality of recognized ordered interpretations includes a number of information items. A value of one or more of the received information items for a first recognized ordered interpretation is compared against the table to determine if the value of the one of the received information items matches any of the list of improbable values for the corresponding information item.
Type: Grant
Filed: May 5, 2006
Date of Patent: January 4, 2011
Assignee: Nuance Communications, Inc.
Inventor: Marc Helbing
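In code, the skiplist check amounts to a simple filter over interpretations; the table contents and the dict-based interpretation format below are hypothetical:

```python
# Hypothetical skiplist table: for each information item, values that
# cannot occur (e.g. a destination that does not exist, zero passengers).
SKIPLIST = {
    "destination": {"Atlantis"},
    "passengers": {"0", "-1"},
}

def filter_interpretations(interpretations, skiplist=SKIPLIST):
    """Discard any recognized interpretation in which some information
    item takes a value listed as improbable for that item."""
    kept = []
    for interp in interpretations:
        if all(value not in skiplist.get(item, ())
               for item, value in interp.items()):
            kept.append(interp)
    return kept

# Two ordered interpretations from the ASR engine, best first.
hypotheses = [
    {"destination": "Atlantis", "passengers": "2"},
    {"destination": "Atlanta", "passengers": "2"},
]
valid = filter_interpretations(hypotheses)
```

The effect is that the second-best hypothesis survives when the top hypothesis hits an improbable value.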
-
Publication number: 20100305947
Abstract: The invention provides a speech recognition method for selecting a combination of list elements via a speech input, wherein a first list element of the combination is part of a first set of list elements and a second list element of the combination is part of a second set of list elements, the method comprising the steps of receiving the speech input, comparing each list element of the first set with the speech input to obtain a first candidate list of best matching list elements, processing the second set using the first candidate list to obtain a subset of the second set, comparing each list element of the subset of the second set with the speech input to obtain a second candidate list of best matching list elements, and selecting a combination of list elements using the first and the second candidate list.
Type: Application
Filed: June 2, 2010
Publication date: December 2, 2010
Applicant: NUANCE COMMUNICATIONS, INC.
Inventors: Markus Schwarz, Matthias Schulz, Marc Biedert, Christian Hillebrecht, Franz Gerl, Udo Haiber
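The two-pass scheme (match the first set, restrict the second set, then match the restricted set) can be sketched with a toy character-overlap score standing in for real acoustic matching; the city/street data and the scoring function are invented for illustration:

```python
def toy_score(candidate, utterance):
    """Toy stand-in for an acoustic match score: the fraction of the
    candidate's characters that occur anywhere in the utterance."""
    utt = set(utterance.lower())
    cand = candidate.lower()
    return sum(ch in utt for ch in cand) / len(cand)

def select_combination(cities, streets_by_city, utterance, shortlist=2):
    """Pass 1: rank the first set (cities) against the speech input.
    Restrict the second set to streets of the shortlisted cities, then
    pass 2: pick the best-scoring (city, street) combination."""
    ranked = sorted(cities, key=lambda c: -toy_score(c, utterance))
    subset = [(city, street) for city in ranked[:shortlist]
              for street in streets_by_city[city]]
    return max(subset, key=lambda cs: toy_score(cs[0], utterance)
               + toy_score(cs[1], utterance))

cities = ["Munich", "Berlin", "Hamburg"]
streets_by_city = {
    "Munich": ["Leopoldstrasse", "Marienplatz"],
    "Berlin": ["Unter den Linden"],
    "Hamburg": ["Reeperbahn"],
}
best = select_combination(cities, streets_by_city, "marienplatz munich")
```

The payoff of the restriction step is that only streets in plausible cities are ever scored in the second pass, which is what keeps the combined search tractable for large lists.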
-
Patent number: 7844458
Abstract: A speech recognition apparatus that enables efficient multimodal input in setting a plurality of items by one utterance is provided. An input unit inputs a setting instruction by speech. A speech interpretation unit recognizes and interprets the contents of the setting instruction by speech to generate first structured data containing candidates of the interpretation result. An instruction input detecting unit detects a setting instruction input by a user. An instruction input interpretation unit interprets the contents of the setting instruction input to generate second structured data. A selection unit selects one of the interpretation candidates contained in the first structured data based on the second structured data.
Type: Grant
Filed: October 30, 2006
Date of Patent: November 30, 2010
Assignee: Canon Kabushiki Kaisha
Inventors: Makoto Hirota, Hiroki Yamamoto
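The selection unit's job, reduced to a sketch, is to pick the speech-interpretation candidate most consistent with the second modality's structured data. The agreement measure (count of matching key/value pairs) and the data shapes are assumptions:

```python
def select_interpretation(first_candidates, second):
    """Pick the speech-interpretation candidate (first structured data)
    that agrees with the most fields of the other-modality instruction
    (second structured data)."""
    def agreement(cand):
        return sum(cand.get(k) == v for k, v in second.items())
    return max(first_candidates, key=agreement)

# Hypothetical: speech was ambiguous between 2 and 10 copies; the user
# also tapped "10" on the panel.
first = [{"item": "copies", "value": 2},
         {"item": "copies", "value": 10}]
second = {"item": "copies", "value": 10}
chosen = select_interpretation(first, second)
```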
-
Patent number: 7831426
Abstract: A network based interactive speech system responds in real-time to speech-based queries addressed to a set of topic entries. A best matching response is provided based on speech recognition processing and natural language processing performed on recognized speech utterances to identify a selected set of phrases related to the set of topic entries. Another routine converts the selected set of phrases into a search query suitable for identifying a first group of one or more topic entries corresponding to the speech-based query. The words/phrases can be assigned different weightings and can include entries which are not actually in the set of topic entries.
Type: Grant
Filed: June 23, 2006
Date of Patent: November 9, 2010
Assignee: Phoenix Solutions, Inc.
Inventor: Ian M. Bennett
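The phrase-to-query conversion can be sketched as emitting a weighted disjunction; the Lucene-like `"phrase"^weight` syntax is only one plausible target format, chosen here for illustration:

```python
def build_query(phrases, weights):
    """Convert selected phrases into a weighted OR query string,
    highest-weighted phrase first (default weight 1.0)."""
    terms = sorted(phrases, key=lambda p: -weights.get(p, 1.0))
    return " OR ".join(f'"{p}"^{weights.get(p, 1.0)}' for p in terms)

query = build_query(["refund policy", "shipping"],
                    weights={"refund policy": 2.0})
```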
-
Patent number: 7831425
Abstract: A computer-implemented method of indexing a speech lattice for search of audio corresponding to the speech lattice is provided. The method includes identifying at least two speech recognition hypotheses for a word which have time ranges satisfying a criterion. The method further includes merging the at least two speech recognition hypotheses to generate a merged speech recognition hypothesis for the word.
Type: Grant
Filed: December 15, 2005
Date of Patent: November 9, 2010
Assignee: Microsoft Corporation
Inventors: Alejandro Acero, Asela J. Gunawardana, Ciprian I. Chelba, Erik W. Selberg, Frank Torsten B. Seide, Patrick Nguyen, Roger Peng Yu
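One plausible merging rule, sketched under the assumption that hypotheses are `(word, start, end, posterior)` tuples and that the patent's time-range criterion is "overlap at least half the shorter range" (an invented stand-in):

```python
def merge_hypotheses(hyps, min_overlap=0.5):
    """Merge same-word hypotheses whose time ranges overlap by at least
    `min_overlap` of the shorter range; the merged hypothesis spans the
    union of the ranges and sums the posteriors."""
    merged = []
    for word, start, end, prob in sorted(hyps):
        for m in merged:
            if m[0] != word:
                continue
            overlap = min(m[2], end) - max(m[1], start)
            shorter = min(m[2] - m[1], end - start)
            if shorter > 0 and overlap / shorter >= min_overlap:
                m[1] = min(m[1], start)
                m[2] = max(m[2], end)
                m[3] += prob
                break
        else:
            merged.append([word, start, end, prob])
    return [tuple(m) for m in merged]

# Two overlapping "hello" hypotheses collapse; "yellow" stays separate.
hyps = [("hello", 0.0, 0.5, 0.4),
        ("hello", 0.1, 0.6, 0.3),
        ("yellow", 0.0, 0.5, 0.2)]
lattice = merge_hypotheses(hyps)
```

Merging like this shrinks the index while keeping one entry per word occurrence with an aggregated confidence.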