Preliminary Matching Patents (Class 704/252)
  • Patent number: 8818807
    Abstract: This invention describes methods for implementing human speech recognition. The methods use sub-events: sounds between spaces (typically a fully spoken word) that are then compared with a library of sub-events. Each sub-event is packaged with its own speech recognition function as an individual unit. This invention illustrates how this model can be used as a large-vocabulary speech recognition system.
    Type: Grant
    Filed: May 24, 2010
    Date of Patent: August 26, 2014
    Inventor: Darrell Poirier
  • Patent number: 8812326
    Abstract: A computer-driven device assists a user in self-regulating speech control of the device. The device processes an input signal representing human speech to compute acoustic signal quality indicators indicating conditions likely to be problematic to speech recognition, and advises the user of those conditions.
    Type: Grant
    Filed: August 6, 2013
    Date of Patent: August 19, 2014
    Assignee: Promptu Systems Corporation
    Inventors: Naren Chittar, Vikas Gulati, Matthew Pratt, Harry Printz
  • Patent number: 8805677
    Abstract: Creating and processing a natural language grammar set of data based on an input text string are disclosed. The method may include tagging the input text string, and examining, via a processor, the input text string for at least one first set of substitutions based on content of the input text string. The method may also include determining whether the input text string is a substring of a previously tagged input text string by comparing the input text string to a previously tagged input text string, such that the substring determination operation determines whether the input text string is wholly included in the previously tagged input text string.
    Type: Grant
    Filed: February 4, 2014
    Date of Patent: August 12, 2014
    Assignee: West Corporation
    Inventor: Steven John Schanbacher
  • Patent number: 8805685
    Abstract: Disclosed herein are systems, methods, and tangible computer readable-media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
    Type: Grant
    Filed: August 5, 2013
    Date of Patent: August 12, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst J. Schroeter
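A minimal sketch of the variance check this abstract describes: repeated samples of the same phrase that are identical (or nearly so) suggest synthetic or replayed audio. The feature representation and the variance threshold here are illustrative assumptions, not taken from the patent.

```python
def verify_samples(samples, min_variance=1e-4):
    """Deny verification when repeated speech samples of the same word or
    phrase show too little variance (likely synthetic or replayed audio).

    `samples` is a list of equal-length feature vectors; `min_variance`
    is a hypothetical tuning parameter."""
    n = len(samples)
    if n < 2:
        return None  # inconclusive: request additional samples
    # Mean pairwise squared distance serves as a simple variance measure
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            pairs += 1
    variance = total / pairs
    return variance >= min_variance  # True = verified, False = denied
```

The `None` return mirrors the embodiment in which an inconclusive comparison triggers collection of additional samples.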
  • Patent number: 8781825
    Abstract: Embodiments of the present invention improve methods of performing speech recognition. In one embodiment, the present invention includes a method comprising receiving a spoken utterance, processing the spoken utterance in a speech recognizer to generate a recognition result, determining consistencies of one or more parameters of component sounds of the spoken utterance, wherein the parameters are selected from the group consisting of duration, energy, and pitch, and wherein each component sound of the spoken utterance has a corresponding value of said parameter, and validating the recognition result based on the consistency of at least one of said parameters.
    Type: Grant
    Filed: August 24, 2011
    Date of Patent: July 15, 2014
    Assignee: Sensory, Incorporated
    Inventors: Jonathan Shaw, Pieter Vermeulen, Stephen Sutton, Robert Savoie
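The consistency check described above can be sketched with the coefficient of variation of one parameter (duration, energy, or pitch) across the component sounds; the specific statistic and threshold are assumptions for illustration.

```python
def validate_by_consistency(values, max_cv=0.5):
    """Validate a recognition result by checking that one parameter
    (e.g. per-sound duration, energy, or pitch) is consistent across
    the component sounds of the utterance.

    Uses the coefficient of variation (std/mean) as an illustrative
    consistency measure; `max_cv` is a hypothetical threshold."""
    mean = sum(values) / len(values)
    if mean == 0:
        return False
    var = sum((v - mean) ** 2 for v in values) / len(values)
    cv = var ** 0.5 / mean
    return cv <= max_cv
```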
  • Patent number: 8775181
    Abstract: Interpretation from a first language to a second language via one or more communication devices is performed through a communication network (e.g. phone network or the internet) using a server for performing recognition and interpretation tasks, comprising the steps of: receiving an input speech utterance in a first language on a first mobile communication device; conditioning said input speech utterance; first transmitting said conditioned input speech utterance to a server; recognizing said first transmitted speech utterance to generate one or more recognition results; interpreting said recognition results to generate one or more interpretation results in an interlingua; mapping the interlingua to a second language in a first selected format; second transmitting said interpretation results in the first selected format to a second mobile communication device; and presenting said interpretation results in a second selected format on said second communication device.
    Type: Grant
    Filed: July 2, 2013
    Date of Patent: July 8, 2014
    Assignee: Fluential, LLC
    Inventors: Farzad Ehsani, Demitrios Master, Elaine Drom Zuber
  • Patent number: 8775180
    Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script. In yet still further aspects of the invention, the duration of a given interaction can be analyzed, either apart from or in combination with the script compliance analysis above, to seek to identify instances of agent non-compliance, of fraud, or of quality-analysis issues.
    Type: Grant
    Filed: November 26, 2012
    Date of Patent: July 8, 2014
    Assignee: West Corporation
    Inventors: Mark J. Pettay, Fonda J. Narke
  • Patent number: 8768700
    Abstract: A system may receive a voice search query and may determine word hypotheses for the voice query. Each word hypothesis may include one or more terms. The system may obtain a search query log and may determine, for each word hypothesis, a quantity of other search queries, in the search query log, that include the one or more terms. The system may determine weights based on the determined quantities. The system may generate, based on the weights, a first search query from the word hypotheses and may obtain a first set of search results. The system may modify, based on the first set of search results, one or more of the weights. The system may generate a second search query from the word hypotheses and obtain, based on the second search query, a second set of search results for the voice query.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: July 1, 2014
    Assignee: Google Inc.
    Inventors: Alexander Mark Franz, Monika H. Henzinger, Sergey Brin, Brian Christopher Milch
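The log-based weighting above can be sketched as follows: count how often each hypothesis term appears in other logged queries, weight each word hypothesis by those counts, and build the first search query from the best-weighted hypothesis. Function names and the averaging scheme are assumptions, not from the patent.

```python
from collections import Counter

def weight_hypotheses(hypotheses, query_log):
    """Weight word hypotheses for a voice query by how often their
    terms appear in other logged search queries."""
    term_counts = Counter()
    for query in query_log:
        term_counts.update(set(query.split()))
    weights = {}
    for hyp in hypotheses:
        terms = hyp.split()
        # Average per-term log frequency as the hypothesis weight
        weights[hyp] = sum(term_counts[t] for t in terms) / max(len(terms), 1)
    return weights

def best_query(hypotheses, query_log):
    """Generate the first search query from the highest-weighted hypothesis."""
    weights = weight_hypotheses(hypotheses, query_log)
    return max(weights, key=weights.get)
```

The abstract's second stage, re-weighting based on the first result set, would adjust `weights` before generating the second query.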
  • Patent number: 8751145
    Abstract: A voice recognition method that is used for finding a street uses a database including information about a plurality of streets. The streets are characterized by respective street names and street types. A user provides a voice input for the street that the user tries to find. The voice input includes a street name and a street type. The street type is recognized by processing the voice input. Streets having the recognized street type are then selected from the database and a street name of at least one of the streets selected from the database is recognized by processing the voice input.
    Type: Grant
    Filed: November 30, 2005
    Date of Patent: June 10, 2014
    Assignees: Volkswagen of America, Inc., Audi AG
    Inventors: Ramon Eduardo Prieto, Carsten Bergmann, William B. Lathrop, M. Kashif Imam, Gerd Gruchalski, Markus Möhrle
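The two-stage lookup this abstract describes can be sketched as: recognize the street type first, then match the name only against streets of that type. Plain string equality stands in for acoustic matching, and the data layout is an assumption.

```python
def find_street(voice_street_type, voice_street_name, streets):
    """Two-stage street lookup: narrow the database by recognized street
    type, then recognize the street name within that subset.

    `streets` is a list of (name, type) pairs; exact string comparison
    is a stand-in for real acoustic scoring."""
    # Stage 1: select only streets with the recognized type
    candidates = [name for name, stype in streets if stype == voice_street_type]
    # Stage 2: recognize the name within the narrowed candidate set
    for name in candidates:
        if name == voice_street_name:
            return name, voice_street_type
    return None
```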
  • Patent number: 8731927
    Abstract: A system and method are provided for recognizing a speech input and selecting an entry from a list of entries. The method includes recognizing a speech input. A fragment list of fragmented entries is provided and compared to the recognized speech input to generate a candidate list of best matching entries based on the comparison result. The system includes a speech recognition module and a database for storing the list of entries and the fragmented list. The speech recognition module may obtain the fragmented list from the database and store a candidate list of best matching entries in memory. A display may also be provided to allow a user to select from a list of best matching entries.
    Type: Grant
    Filed: March 18, 2013
    Date of Patent: May 20, 2014
    Assignee: Nuance Communications, Inc.
    Inventor: Markus Schwarz
  • Patent number: 8719023
    Abstract: An apparatus to improve robustness to environmental changes of a context dependent speech recognizer for an application, that includes a training database to store sounds for speech recognition training, a dictionary to store words supported by the speech recognizer, and a speech recognizer training module to train a set of one or more multiple state Hidden Markov Models (HMMs) with use of the training database and the dictionary. The speech recognizer training module performs a non-uniform state clustering process on each of the states of each HMM, which includes using a different non-uniform cluster threshold for at least some of the states of each HMM to more heavily cluster and correspondingly reduce a number of observation distributions for those of the states of each HMM that are less empirically affected by one or more contextual dependencies.
    Type: Grant
    Filed: May 21, 2010
    Date of Patent: May 6, 2014
    Assignee: Sony Computer Entertainment Inc.
    Inventors: Xavier Menendez-Pidal, Ruxin Chen
  • Patent number: 8719016
    Abstract: A method for converting speech to text in a speech analytics system is provided. The method includes receiving audio data containing speech made up of sounds from an audio source, processing the sounds with a phonetic module resulting in symbols corresponding to the sounds, and processing the symbols with a language module and occurrence table resulting in text. The method also includes determining a probability of correct translation for each word in the text, comparing the probability of correct translation for each word in the text to the occurrence table, and adjusting the occurrence table based on the probability of correct translation for each word in the text.
    Type: Grant
    Filed: April 7, 2010
    Date of Patent: May 6, 2014
    Assignee: Verint Americas Inc.
    Inventors: Omer Ziv, Ran Achituv, Ido Shapira
  • Patent number: 8712774
    Abstract: A hybrid text generator is disclosed that generates a hybrid text string from multiple text strings that are produced from an audio input by multiple automated speech recognition systems. The hybrid text generator receives metadata that describes a time-location that each word from the multiple text strings is located in the audio input. The hybrid text generator matches words between the multiple text strings using the metadata and generates a hybrid text string that includes the matched words. The hybrid text generator utilizes confidence scores associated with words that do not match between the multiple text strings to determine whether to add an unmatched word to the hybrid text string.
    Type: Grant
    Filed: March 29, 2010
    Date of Patent: April 29, 2014
    Assignee: Nuance Communications, Inc.
    Inventor: Jonathan Wiggs
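The merging step above can be sketched as: align words from two ASR outputs by their time metadata, keep matched words, and resolve disagreements by confidence score. This simplified version assumes the two streams are already time-aligned one-to-one.

```python
def hybrid_text(stream_a, stream_b):
    """Merge two ASR outputs into a hybrid transcript.

    Each stream is a list of (word, start_time, confidence) tuples;
    words are paired by position, standing in for matching on the
    time-location metadata the abstract describes."""
    merged = []
    for (wa, _ta, ca), (wb, _tb, cb) in zip(stream_a, stream_b):
        if wa == wb:
            merged.append(wa)  # matched word: keep it directly
        else:
            # Unmatched word: keep the hypothesis with higher confidence
            merged.append(wa if ca >= cb else wb)
    return " ".join(merged)
```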
  • Patent number: 8706489
    Abstract: A system and method for selecting audio contents by using the speech recognition to obtain a textual phrase from a series of audio contents are provided. The system includes an output module outputting the audio contents, an input module receiving a speech input from a user, a buffer temporarily storing the audio contents within a desired period and the speech input, and a recognizing module performing a speech recognition between the audio contents within the desired period and the speech input to generate an audio phrase and the corresponding textual phrase matching with the speech input.
    Type: Grant
    Filed: August 8, 2006
    Date of Patent: April 22, 2014
    Assignee: Delta Electronics Inc.
    Inventors: Jia-lin Shen, Chien-Chou Hung
  • Patent number: 8706503
    Abstract: Methods, systems, and computer readable storage medium related to operating an intelligent digital assistant are disclosed. A text string is obtained from a speech input received from a user. Information is derived from a communication event that occurred at the electronic device prior to receipt of the speech input. The text string is interpreted to derive a plurality of candidate interpretations of user intent. One of the candidate user intents is selected based on the information relating to the communication event.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: April 22, 2014
    Assignee: Apple Inc.
    Inventors: Adam John Cheyer, Didier Rene Guzzoni, Thomas Robert Gruber, Christopher Dean Brigham
  • Patent number: 8706491
    Abstract: One feature of the present invention uses the parsing capabilities of a structured language model in the information extraction process. During training, the structured language model is first initialized with syntactically annotated training data. The model is then trained by generating parses on semantically annotated training data enforcing annotated constituent boundaries. The syntactic labels in the parse trees generated by the parser are then replaced with joint syntactic and semantic labels. The model is then trained by generating parses on the semantically annotated training data enforcing the semantic tags or labels found in the training data. The trained model can then be used to extract information from test data using the parses generated by the model.
    Type: Grant
    Filed: August 24, 2010
    Date of Patent: April 22, 2014
    Assignee: Microsoft Corporation
    Inventors: Ciprian Chelba, Milind Mahajan
  • Patent number: 8700259
    Abstract: A system for selecting music includes a mobile system for processing and transmitting through a wireless link a continuous voice stream spoken by a user of the mobile system, the continuous voice stream including a music request, and a data center for processing the continuous voice stream received through the wireless link into voice music information. The data center can perform automated voice recognition processing on the voice music information to recognize music components of the music request, confirm the recognized music components through interactive speech exchanges with the mobile system user through the wireless link and the mobile system, selectively allow human data center operator intervention to assist in identifying the selected recognized music components having a recognition confidence below a selected threshold value, and download music information pertaining to the music request for transmission to the mobile system derived from the confirmed recognized music components.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: April 15, 2014
    Assignee: Agero Connected Services, Inc.
    Inventor: Thomas Barton Schalk
  • Patent number: 8700398
    Abstract: An interactive user interface is described for setting confidence score thresholds in a language processing system. There is a display of a first system confidence score curve characterizing system recognition performance associated with a high confidence threshold, a first user control for adjusting the high confidence threshold and an associated visual display highlighting a point on the first system confidence score curve representing the selected high confidence threshold, a display of a second system confidence score curve characterizing system recognition performance associated with a low confidence threshold, and a second user control for adjusting the low confidence threshold and an associated visual display highlighting a point on the second system confidence score curve representing the selected low confidence threshold. The operation of the second user control is constrained to require that the low confidence threshold must be less than or equal to the high confidence threshold.
    Type: Grant
    Filed: November 29, 2011
    Date of Patent: April 15, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Jeffrey N. Marcus, Amy E. Ulug, William Bridges Smith, Jr.
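The constraint the abstract states (low threshold ≤ high threshold) and the three-way decision it implies can be sketched as follows; clamping the controls rather than rejecting input mirrors a constrained UI slider, and the default values are assumptions.

```python
class ConfidenceThresholds:
    """Paired high/low confidence thresholds for a language processing
    system, enforcing low <= high as described in the abstract."""

    def __init__(self, low=0.3, high=0.7):
        if low > high:
            raise ValueError("low threshold must be <= high threshold")
        self.low, self.high = low, high

    def set_low(self, value):
        # Clamp to the high threshold, mirroring a constrained UI control
        self.low = min(value, self.high)

    def set_high(self, value):
        self.high = max(value, self.low)

    def classify(self, score):
        """Accept above high, reject below low, confirm in between."""
        if score >= self.high:
            return "accept"
        if score < self.low:
            return "reject"
        return "confirm"
```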
  • Patent number: 8694317
    Abstract: Methods for processing audio data containing speech to produce a searchable index file and for subsequently searching such an index file are provided. The processing method uses a phonetic approach and models each frame of the audio data with a set of reference phones. A score for each of the reference phones, representing the difference of the audio from the phone model, is stored in the searchable data file for each of the phones in the reference set. A consequence of storing information regarding each of the reference phones is that the accuracy of searches carried out on the index file is not compromised by the rejection of information about particular phones. A subsequent search method is also provided which uses a simple and efficient dynamic programming search to locate instances of a search term in the audio. The methods of the present invention have particular application to the field of audio data mining.
    Type: Grant
    Filed: February 6, 2006
    Date of Patent: April 8, 2014
    Assignee: Aurix Limited
    Inventors: Adrian I Skilling, Howard A K Wright
  • Patent number: 8688451
    Abstract: A speech recognition method includes receiving input speech from a user, processing the input speech using a first grammar to obtain parameter values of a first N-best list of vocabulary, comparing a parameter value of a top result of the first N-best list to a threshold value, and if the compared parameter value is below the threshold value, then additionally processing the input speech using a second grammar to obtain parameter values of a second N-best list of vocabulary. Other preferred steps include: determining the input speech to be in-vocabulary if any of the results of the first N-best list is also present within the second N-best list, but out-of-vocabulary if none of the results of the first N-best list is within the second N-best list; and providing audible feedback to the user if the input speech is determined to be out-of-vocabulary.
    Type: Grant
    Filed: May 11, 2006
    Date of Patent: April 1, 2014
    Assignee: General Motors LLC
    Inventors: Timothy J. Grost, Rathinavelu Chengalvarayan
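The two-pass flow above can be sketched as: accept a confident first-pass result outright; otherwise run a second grammar and call the input in-vocabulary only if the two N-best lists overlap. The `recognize` callable and the threshold value are hypothetical stand-ins for a real ASR engine.

```python
def two_pass_recognition(speech, recognize, threshold=0.6):
    """Two-pass sketch of the abstract's method.

    `recognize(speech, grammar)` is a stand-in ASR call returning an
    N-best list of (word, confidence) pairs. Returns (top word, bool)
    where the bool indicates the input is in-vocabulary."""
    nbest1 = recognize(speech, "grammar1")
    top_word, top_score = nbest1[0]
    if top_score >= threshold:
        return top_word, True  # confident first-pass result
    # Low confidence: consult the second grammar
    nbest2 = recognize(speech, "grammar2")
    words2 = {w for w, _ in nbest2}
    in_vocab = any(w in words2 for w, _ in nbest1)
    return top_word, in_vocab  # False -> give out-of-vocabulary feedback
```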
  • Patent number: 8688453
    Abstract: According to example configurations, a speech processing system can include a syntactic parser, a word extractor, word extraction rules, and an analyzer. The syntactic parser of the speech processing system parses the utterance to identify syntactic relationships amongst words in the utterance. The word extractor utilizes word extraction rules to identify groupings of related words in the utterance that most likely represent an intended meaning of the utterance. The analyzer in the speech processing system maps each set of the sets of words produced by the word extractor to a respective candidate intent value to produce a list of candidate intent values for the utterance. The analyzer is configured to select, from the list of candidate intent values (i.e., possible intended meanings) of the utterance, a particular candidate intent value as being representative of the intent (i.e., intended meaning) of the utterance.
    Type: Grant
    Filed: February 28, 2011
    Date of Patent: April 1, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Sachindra Joshi, Shantanu Godbole
  • Patent number: 8682669
    Abstract: A system and a method to generate statistical utterance classifiers optimized for the individual states of a spoken dialog system is disclosed. The system and method make use of large databases of transcribed and annotated utterances from calls collected in a dialog system in production and log data reporting the association between the state of the system at the moment when the utterances were recorded and the utterance. From the system state, being a vector of multiple system variables, subsets of these variables, certain variable ranges, quantized variable values, etc. can be extracted to produce a multitude of distinct utterance subsets matching every possible system state. For each of these subset and variable combinations, statistical classifiers can be trained, tuned, and tested, and the classifiers can be stored together with the performance results and the state subset and variable combination.
    Type: Grant
    Filed: August 21, 2009
    Date of Patent: March 25, 2014
    Assignee: Synchronoss Technologies, Inc.
    Inventors: David Suendermann, Jackson Liscombe, Krishna Dayanidhi, Roberto Pieraccini
  • Patent number: 8676580
    Abstract: A method, an apparatus and an article of manufacture for automatic speech recognition. The method includes obtaining at least one language model word and at least one rule-based grammar word, determining an acoustic similarity of at least one pair of language model word and rule-based grammar word, and increasing a transition cost to the at least one language model word based on the acoustic similarity of the at least one language model word with the at least one rule-based grammar word to generate a modified language model for automatic speech recognition.
    Type: Grant
    Filed: August 16, 2011
    Date of Patent: March 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Om D. Deshmukh, Etienne Marcheret, Shajith I. Mohamed, Ashish Verma, Karthik Visweswariah
  • Patent number: 8676579
    Abstract: A method of authenticating a user of a mobile device having a first microphone and a second microphone, the method comprising receiving voice input from the user at the first and second microphones, determining a position of the user relative to the mobile device based on the voice input received by the first and second microphones, and authenticating the user based on the position of the user.
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: March 18, 2014
    Assignee: BlackBerry Limited
    Inventor: James Allen Hymel
  • Patent number: 8676577
    Abstract: A method of utilizing metadata stored in a computer-readable medium to assist in the conversion of an audio stream to a text stream. The method compares personally identifiable data, such as a user's electronic address book and/or Caller/Recipient ID information (in the case of processing voice mail to text), to the n-best results generated by a speech recognition engine for each word that is output by the engine. A goal of this comparison is to correct a possible misrecognition of a spoken proper noun such as a name or company with its proper textual form or a spoken phone number to correctly formatted phone number with Arabic numerals to improve the overall accuracy of the output of the voice recognition system.
    Type: Grant
    Filed: March 31, 2009
    Date of Patent: March 18, 2014
    Assignee: Canyon IP Holdings, LLC
    Inventors: Igor Roditis Jablokov, Clifford J. Strohofer, III, Marc White, Victor Roditis Jablokov
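The metadata-assisted correction above can be sketched for a single word position: if any n-best hypothesis matches a name from the user's address book, prefer it over the raw top hypothesis. The case-insensitive exact match is an illustrative simplification.

```python
def correct_with_contacts(nbest, contacts):
    """Post-process one word position of ASR output: prefer any n-best
    hypothesis that matches a known contact name (case-insensitively),
    otherwise fall back to the engine's top hypothesis."""
    known = {c.lower() for c in contacts}
    for word in nbest:  # nbest is ordered best-first
        if word.lower() in known:
            return word
    return nbest[0]  # no contact match: keep the top hypothesis
```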
  • Patent number: 8666743
    Abstract: The invention provides a speech recognition method for selecting a combination of list elements via a speech input, wherein a first list element of the combination is part of a first set of list elements and a second list element of the combination is part of a second set of list elements, the method comprising the steps of receiving the speech input, comparing each list element of the first set with the speech input to obtain a first candidate list of best matching list elements, processing the second set using the first candidate list to obtain a subset of the second set, comparing each list element of the subset of the second set with the speech input to obtain a second candidate list of best matching list elements, and selecting a combination of list elements using the first and the second candidate list.
    Type: Grant
    Filed: June 2, 2010
    Date of Patent: March 4, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Markus Schwarz, Matthias Schulz, Marc Biedert, Christian Hillebrecht, Franz Gerl, Udo Haiber
  • Patent number: 8666744
    Abstract: A method and apparatus are provided for automatically acquiring grammar fragments for recognizing and understanding fluently spoken language. Grammar fragments representing a set of syntactically and semantically similar phrases may be generated using three probability distributions: of succeeding words, of preceding words, and of associated call-types. The similarity between phrases may be measured by applying Kullback-Leibler distance to these three probability distributions. Phrases being close in all three distances may be clustered into a grammar fragment.
    Type: Grant
    Filed: September 21, 2000
    Date of Patent: March 4, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Kazuhiro Arai, Allen L. Gorin, Giuseppe Riccardi, Jeremy H. Wright
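The similarity measure above can be sketched with a smoothed, symmetrized Kullback-Leibler divergence over the three distributions (succeeding words, preceding words, call-types); two phrases cluster into one fragment only if they are close in all three, hence the max below. The smoothing and symmetrization are illustrative choices, not details from the patent.

```python
import math

def kl(p, q, eps=1e-9):
    """Smoothed Kullback-Leibler divergence between two discrete
    distributions given as item->probability dicts."""
    keys = set(p) | set(q)
    return sum((p.get(k, 0) + eps) * math.log((p.get(k, 0) + eps) /
                                              (q.get(k, 0) + eps))
               for k in keys)

def phrase_distance(a, b):
    """Distance between two phrases, each described by three
    distributions (succeeding words, preceding words, call-types).
    Taking the max means phrases must be close in ALL three
    distances to be clustered together."""
    return max(kl(a[i], b[i]) + kl(b[i], a[i]) for i in range(3))
```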
  • Patent number: 8666729
    Abstract: Creating and processing a natural language grammar set of data based on an input text string are disclosed. The method may include tagging the input text string, and examining, via a processor, the input text string for at least one first set of substitutions based on content of the input text string. The method may also include determining whether the input text string is a substring of a previously tagged input text string by comparing the input text string to a previously tagged input text string, such that the substring determination operation determines whether the input text string is wholly included in the previously tagged input text string.
    Type: Grant
    Filed: February 10, 2010
    Date of Patent: March 4, 2014
    Assignee: West Corporation
    Inventor: Steven John Schanbacher
  • Patent number: 8660845
    Abstract: Systems and methods for audio editing are provided. In one implementation, a computer-implemented method is provided. The method includes receiving digital audio data including a plurality of distinct vocal components. Each distinct vocal component is automatically identified using one or more attributes that uniquely identify each distinct vocal component. The audio data is separated into two or more individual tracks where each individual track comprises audio data corresponding to one distinct vocal component. The separated individual tracks are then made available for further processing.
    Type: Grant
    Filed: October 16, 2007
    Date of Patent: February 25, 2014
    Assignee: Adobe Systems Incorporated
    Inventors: Nariman Sodeifi, David E. Johnston
  • Patent number: 8639509
    Abstract: In a confidence computing method and system, a processor may interpret speech signals as a text string or directly receive a text string as input, generate a syntactical parse tree representing the interpreted string and including a plurality of sub-trees which each represents a corresponding section of the interpreted text string, determine for each sub-tree whether the sub-tree is accurate, obtain replacement speech signals for each sub-tree determined to be inaccurate, and provide output based on corresponding text string sections of at least one sub-tree determined to be accurate.
    Type: Grant
    Filed: July 27, 2007
    Date of Patent: January 28, 2014
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Feng Lin, Zhe Feng
  • Patent number: 8639507
    Abstract: The present invention enables high-speed recognition even when the grammar includes a large amount of garbage. A first voice recognition processing unit executes a voice recognition process on a voice feature amount of the input voice based on a first grammar, generating a recognition hypothesis graph that indicates a structure of hypotheses derived according to the first grammar, together with a score associated with the respective connections of each recognition unit. A second voice recognition processing unit executes a voice recognition process according to a second grammar, which is specified to accept sections of the input voice other than keywords as garbage sections, and outputs the recognition result from the total score of a hypothesis derived according to the second grammar, acquiring the structure and score of each garbage section from the recognition hypothesis graph.
    Type: Grant
    Filed: December 22, 2008
    Date of Patent: January 28, 2014
    Assignee: NEC Corporation
    Inventors: Fumihiro Adachi, Ryosuke Isotani, Ken Hanazawa
  • Patent number: 8630860
    Abstract: Techniques disclosed herein include systems and methods for open-domain voice-enabled searching that is speaker sensitive. Techniques include using speech information, speaker information, and information associated with a spoken query to enhance open voice search results. This includes integrating a textual index with a voice index to support the entire search cycle. Given a voice query, the system can execute two matching processes simultaneously. This can include a text matching process based on the output of speech recognition, as well as a voice matching process based on characteristics of a caller or user voicing a query. Characteristics of the caller can include output of voice feature extraction and metadata about the call. The system clusters callers according to these characteristics. The system can use specific voice and text clusters to modify speech recognition results, as well as modifying search results.
    Type: Grant
    Filed: March 3, 2011
    Date of Patent: January 14, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Shilei Zhang, Shenghua Bao, Wen Liu, Yong Qin, Zhiwei Shuang, Jian Chen, Zhong Su, Qin Shi, William F. Ganong, III
  • Patent number: 8620654
    Abstract: A system in one embodiment includes a server associated with a unified messaging system (UMS). The server records speech of a user as an audio data file, translates the audio data file into a text data file, and maps each word within the text data file to a corresponding segment of audio data in the audio data file. A graphical user interface (GUI) of a message editor running on an endpoint associated with the user displays the text data file on the endpoint and allows the user to identify a portion of the text data file for replacement. The server being further operable to record new speech of the user as new audio data and to replace one or more segments of the audio data file corresponding to the portion of the text with the new audio data.
    Type: Grant
    Filed: July 20, 2007
    Date of Patent: December 31, 2013
    Assignee: Cisco Technology, Inc.
    Inventors: Joseph F. Khouri, Laurent Philonenko, Mukul Jain, Shmuel Shaffer
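The word-to-audio mapping and splice described above can be sketched as: each transcript word carries a (start, end) sample range, and replacing words i..j swaps out both the text and the corresponding audio segment. All names and the list-based audio representation are illustrative.

```python
def edit_message(text_words, word_times, audio, replace_range, new_text, new_audio):
    """Sketch of the message editor: each transcript word maps to a
    (start, end) sample range in `audio`; replacing words i..j splices
    out the matching audio segment and inserts the newly recorded audio.

    `audio` and `new_audio` are plain sample lists for illustration."""
    i, j = replace_range
    start, _ = word_times[i]   # first sample of the replaced span
    _, end = word_times[j]     # one past the last sample of the span
    new_words = text_words[:i] + new_text + text_words[j + 1:]
    new_full_audio = audio[:start] + new_audio + audio[end:]
    return new_words, new_full_audio
```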
  • Patent number: 8612221
    Abstract: A portable terminal that exchanges data with a server is achieved with a simple configuration. The portable terminal (1) includes an audio pickup means that acquires sound, an absolute position detection unit (1-1) that detects the absolute position of the portable terminal, a relative position detection unit (1-2) that detects the relative position of the portable terminal, and a speech recognition and synthesis unit (1-3) that recognizes the audio acquired by the audio pickup means as speech.
    Type: Grant
    Filed: February 2, 2010
    Date of Patent: December 17, 2013
    Assignee: Seiko Epson Corporation
    Inventors: Junichi Yoshizawa, Tetsuo Ozawa, Koji Koseki
  • Patent number: 8606560
    Abstract: An interpretation system that includes an optical or audio acquisition device for acquiring a sentence written or spoken in a source language and an audio restoration device for generating, from an input signal acquired by the acquisition device, a source sentence that is a transcription of the sentence in the source language. The interpretation system further includes a translation device for generating, from the source sentence, a target sentence that is a translation of the source sentence in a target language, and a speech synthesis device for generating, from the target sentence, an output audio signal reproduced by the audio restoration device. The interpretation system includes a smoothing device for calling the recognition, translation and speech synthesis devices in order to produce in real time an interpretation in the target language of the sentence in the source language.
    Type: Grant
    Filed: November 18, 2008
    Date of Patent: December 10, 2013
    Inventor: Jean Grenier
  • Patent number: 8606568
    Abstract: Methods, computer program products, and systems are described for receiving, by a speech recognition engine, audio data that encodes an utterance and determining, by the speech recognition engine, that a transcription of the utterance includes one or more keywords associated with a command, and a pronoun. In addition, the methods, computer program products, and systems described herein pertain to transmitting a disambiguation request to an application, wherein the disambiguation request identifies the pronoun, receiving, by the speech recognition engine, a response to the disambiguation request, wherein the response references an item of content identified by the application, and generating, by the speech recognition engine, the command using the keywords and the response.
    Type: Grant
    Filed: October 23, 2012
    Date of Patent: December 10, 2013
    Assignee: Google Inc.
    Inventors: Simon Tickner, Richard Z. Cohen
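The keyword-plus-pronoun flow in the 8606568 abstract can be sketched as follows. The keyword table, pronoun set, and the `resolve_pronoun` callback are all hypothetical stand-ins; the callback plays the role of the application's response to the disambiguation request:

```python
COMMAND_KEYWORDS = {"share": "SHARE_CONTENT", "play": "PLAY_MEDIA"}  # hypothetical
PRONOUNS = {"it", "this", "that", "them"}

def parse_utterance(transcription):
    """Return (command, pronoun) if the transcription pairs a command
    keyword with a pronoun, else (None, None) for either missing part."""
    tokens = transcription.lower().split()
    command = next((COMMAND_KEYWORDS[t] for t in tokens if t in COMMAND_KEYWORDS), None)
    pronoun = next((t for t in tokens if t in PRONOUNS), None)
    return command, pronoun

def build_command(transcription, resolve_pronoun):
    """Generate the final command from the keywords and the application's
    answer to the disambiguation request (the resolved item of content)."""
    command, pronoun = parse_utterance(transcription)
    if command and pronoun:
        return (command, resolve_pronoun(pronoun))
    return None
```

The point of the round trip is that the recognizer never guesses the referent itself; it asks the application which item of content the pronoun denotes.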
  • Patent number: 8589161
    Abstract: A system and method for an integrated, multi-modal, multi-device natural language voice services environment may be provided. In particular, the environment may include a plurality of voice-enabled devices each having intent determination capabilities for processing multi-modal natural language inputs in addition to knowledge of the intent determination capabilities of other devices in the environment. Further, the environment may be arranged in a centralized manner, a distributed peer-to-peer manner, or various combinations thereof. As such, the various devices may cooperate to determine intent of multi-modal natural language inputs, and commands, queries, or other requests may be routed to one or more of the devices best suited to take action in response thereto.
    Type: Grant
    Filed: May 27, 2008
    Date of Patent: November 19, 2013
    Assignee: VoiceBox Technologies, Inc.
    Inventors: Robert A. Kennewick, Chris Weider
  • Patent number: 8583436
    Abstract: A word category estimation apparatus (100) includes a word category model (5) which is formed from a probability model having a plurality of kinds of information about a word category as features, and includes information about an entire word category graph as at least one of the features. A word category estimation unit (4) receives the word category graph of a speech recognition hypothesis to be processed, computes scores by referring to the word category model for respective arcs that form the word category graph, and outputs a word category sequence candidate based on the scores.
    Type: Grant
    Filed: December 19, 2008
    Date of Patent: November 12, 2013
    Assignee: NEC Corporation
    Inventors: Hitoshi Yamamoto, Kiyokazu Miki
  • Patent number: 8571869
    Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
    Type: Grant
    Filed: May 15, 2008
    Date of Patent: October 29, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Sabine Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
  • Patent number: 8571849
    Abstract: Disclosed herein are systems, methods, and computer readable-media for enriching spoken language translation with prosodic information in a statistical speech translation framework. The method includes receiving speech for translation to a target language, generating pitch accent labels representing segments of the received speech which are prosodically prominent, and injecting pitch accent labels with word tokens within the translation engine to create enriched target language output text. A further step may be added of synthesizing speech in the target language based on the prosody enriched target language output text. An automatic prosody labeler can generate pitch accent labels. An automatic prosody labeler can exploit lexical, syntactic, and prosodic information of the speech. A maximum entropy model may be used to determine which segments of the speech are prosodically prominent.
    Type: Grant
    Filed: September 30, 2008
    Date of Patent: October 29, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Srinivas Bangalore, Vivek Kumar Rangarajan Sridhar
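The injection of pitch accent labels alongside word tokens (patent 8571849) can be sketched minimally. The `_ACC` label convention and the suffix encoding are assumptions for illustration, not the patent's actual tag set:

```python
def enrich_with_prosody(words, accents):
    """Attach pitch-accent labels to word tokens before translation.

    words   -- word tokens from the recognizer
    accents -- parallel list: a label such as "ACC" for prosodically
               prominent words, None otherwise (an assumed label set,
               as produced by an automatic prosody labeler)
    Returns the enriched token sequence fed to the translation engine.
    """
    return [f"{w}_{a}" if a else w for w, a in zip(words, accents)]

enrich_with_prosody(["the", "red", "car"], [None, "ACC", None])
# ["the", "red_ACC", "car"]
```

Because the labels travel with the tokens through the statistical translation engine, the target-language output retains prominence marks that a synthesizer can exploit.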
  • Patent number: 8571858
    Abstract: For classifying different segments of a signal which has segments of at least a first type and second type, e.g. audio and speech segments, the signal is short-term classified on the basis of the at least one short-term feature extracted from the signal and a short-term classification result is delivered. The signal is also long-term classified on the basis of the at least one short-term feature and at least one long-term feature extracted from the signal and a long-term classification result is delivered. The short-term classification result and the long-term classification result are combined to provide an output signal indicating whether a segment of the signal is of the first type or of the second type.
    Type: Grant
    Filed: January 11, 2011
    Date of Patent: October 29, 2013
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Guillaume Fuchs, Stefan Bayer, Jens Hirschfeld, Juergen Herre, Jeremie Lecomte, Frederik Nagel, Nikolaus Rettelbach, Stefan Wabnik, Yoshikazu Yokotani
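The combination step in the 8571858 abstract can be sketched as a weighted fusion of the two classifier results. The score range, the 0.5 threshold, and the weighting are assumed conventions, not the patent's actual scheme:

```python
def classify_segment(st_score: float, lt_score: float, weight: float = 0.5) -> str:
    """Combine a short-term and a long-term classification result.

    Both scores are assumed to lie in [0, 1], higher meaning more
    speech-like; `weight` balances the short-term result against the
    long-term one before thresholding.
    """
    combined = weight * st_score + (1.0 - weight) * lt_score
    return "speech" if combined >= 0.5 else "audio"
```

The long-term score smooths out frames where short-term features alone are ambiguous, which is the motivation for fusing the two rather than using either alone.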
  • Patent number: 8566097
    Abstract: A lexical acquisition apparatus includes: a phoneme recognition section 2 for preparing a phoneme sequence candidate from an inputted speech; a word matching section 3 for preparing a plurality of word sequences based on the phoneme sequence candidate; a discrimination section 4 for selecting, from among a plurality of word sequences, a word sequence having a high likelihood in a recognition result; an acquisition section 5 for acquiring a new word based on the word sequence selected by the discrimination section 4; a teaching word list 4A used to teach a name; and a probability model 4B of the teaching word and an unknown word, wherein the discrimination section 4 calculates, for each word sequence, a first evaluation value showing how much words in the word sequence correspond to teaching words in the list 4A and a second evaluation value showing a probability at which the words in the word sequence are adjacent to one another and selects a word sequence for which a sum of the first evaluation value and the
    Type: Grant
    Filed: June 1, 2010
    Date of Patent: October 22, 2013
    Assignees: Honda Motor Co., Ltd., Advanced Telecommunications Research Institute International
    Inventors: Mikio Nakano, Takashi Nose, Ryo Taguchi, Kotaro Funakoshi, Naoto Iwahashi
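The selection criterion in the 8566097 abstract, choosing the word sequence that maximizes the sum of the two evaluation values, can be sketched as follows. The concrete forms of both evaluations (match fraction, bigram log-probability) are assumptions standing in for the patent's teaching-word list 4A and probability model 4B:

```python
import math

def first_eval(words, teaching_words) -> float:
    """Fraction of words matching the teaching-word list (assumed form)."""
    return sum(w in teaching_words for w in words) / len(words)

def second_eval(words, bigram_prob) -> float:
    """Log-probability that adjacent words co-occur, from an assumed
    bigram model; unseen pairs get a small floor probability."""
    return sum(math.log(bigram_prob.get((a, b), 1e-6))
               for a, b in zip(words, words[1:]))

def select_sequence(sequences, teaching_words, bigram_prob):
    """Pick the word sequence maximizing the sum of both evaluations."""
    return max(sequences,
               key=lambda ws: first_eval(ws, teaching_words) + second_eval(ws, bigram_prob))
```

The first term rewards agreement with known teaching words while the second penalizes implausible adjacencies, so an unknown word is acquired only when it sits in an otherwise well-supported sequence.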
  • Patent number: 8564721
    Abstract: The addition of temporal positions to an inverted index allows for temporal queries in addition to phrase queries. Storing additional binary data for each term instance in the word-level index, to prepare for searching in response to time-based queries from a user, is accomplished through the use of Lucene's binary payload feature, where the payload structure is defined for use in such searches. The pre-defined payload fields consist of three integers, which account for 12 extra bytes that must be stored for each term instance. A content database on the Master/Administrator server node provides the indexes for search into content in response to user events, returning results in JSON format. The search results may then be used to locate and present content segments to a user containing both the requested search term results and the time location and duration within a content asset where the search term(s) are found.
    Type: Grant
    Filed: August 28, 2012
    Date of Patent: October 22, 2013
    Inventors: Matthew Berry, Changwen Yang
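The three-integer, 12-byte payload described in the 8564721 abstract can be sketched with fixed-width packing. The specific field meanings (start time, duration, segment id) and the byte order are assumptions; only the three-integers-per-term-instance layout comes from the abstract:

```python
import struct

# Hypothetical payload layout: three 32-bit integers per term instance,
# e.g. start time (ms), duration (ms), and segment id -- 12 bytes total,
# matching the three-integer payload the abstract describes.
PAYLOAD_FMT = ">iii"  # big-endian, 3 x 4 bytes

def encode_payload(start_ms: int, duration_ms: int, segment_id: int) -> bytes:
    """Pack the per-term temporal data into a 12-byte payload."""
    return struct.pack(PAYLOAD_FMT, start_ms, duration_ms, segment_id)

def decode_payload(payload: bytes) -> tuple:
    """Unpack a 12-byte payload back into its three integers."""
    return struct.unpack(PAYLOAD_FMT, payload)

raw = encode_payload(61_250, 430, 7)
assert len(raw) == 12                       # 12 extra bytes per term instance
assert decode_payload(raw) == (61_250, 430, 7)
```

In Lucene itself such bytes would be attached per posting via the payload feature; decoding them at query time is what lets a phrase hit carry its time location and duration back to the caller.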
  • Patent number: 8560373
    Abstract: A method for direct marketing comprising establishing a first communications link between a prospective customer using a device having a unique identification number and a communications device, automatically transmitting the unique identification number associated with the prospective customer's device to the communications device, establishing a second communications link between the communications device and a computer operably connected to a memory apparatus having a prospective customer database comprising prospective customer information associated with the unique identification number of the prospective customer's device, in which the information in the database determines prospective customer value which can be used to determine subsequent operations and marketing actions with the prospective customer.
    Type: Grant
    Filed: September 25, 2008
    Date of Patent: October 15, 2013
    Inventor: Eileen A. Fraser
  • Patent number: 8560324
    Abstract: A mobile terminal including an input unit configured to receive an input to activate a voice recognition function on the mobile terminal, a memory configured to store information related to operations performed on the mobile terminal, and a controller configured to activate the voice recognition function upon receiving the input to activate the voice recognition function, to determine a meaning of an input voice instruction based on at least one prior operation performed on the mobile terminal and a language included in the voice instruction, and to provide operations related to the determined meaning of the input voice instruction based on the at least one prior operation performed on the mobile terminal and the language included in the voice instruction and based on a probability that the determined meaning of the input voice instruction matches the information related to the operations of the mobile terminal.
    Type: Grant
    Filed: January 31, 2012
    Date of Patent: October 15, 2013
    Assignee: LG Electronics Inc.
    Inventors: Jong-Ho Shin, Jae-Do Kwak, Jong-Keun Youn
  • Patent number: 8548806
    Abstract: A voice recognition device, a voice recognition method and a voice recognition program capable of appropriately restricting recognition objects based on voice input from a user to recognize the input voice with accuracy are provided.
    Type: Grant
    Filed: September 11, 2007
    Date of Patent: October 1, 2013
    Assignee: Honda Motor Co., Ltd.
    Inventor: Hisayuki Nagashima
  • Patent number: 8527271
    Abstract: A method for the voice recognition of a spoken expression to be recognized, comprising a plurality of expression parts that are to be recognized. Partial voice recognition takes place on a first selected expression part, and depending on a selection of hits for the first expression part detected by the partial voice recognition, voice recognition on the first and further expression parts is executed.
    Type: Grant
    Filed: June 18, 2008
    Date of Patent: September 3, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Michael Wandinger, Jesus Fernando Guitarte Perez, Bernhard Littel
  • Patent number: 8521526
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing spoken query terms. In one aspect, a method includes performing speech recognition on an audio signal to select two or more textual, candidate transcriptions that match a spoken query term, and to establish a speech recognition confidence value for each candidate transcription, obtaining a search history for a user who spoke the spoken query term, where the search history references one or more past search queries that have been submitted by the user, generating one or more n-grams from each candidate transcription, where each n-gram is a subsequence of n phonemes, syllables, letters, characters, words or terms from a respective candidate transcription, and determining, for each n-gram, a frequency with which the n-gram occurs in the past search queries, and a weighting value that is based on the respective frequency.
    Type: Grant
    Filed: July 28, 2010
    Date of Patent: August 27, 2013
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Johan Schalkwyk, Pankaj Risbood
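The history-based weighting in the 8521526 abstract can be sketched for one n-gram type. The abstract allows n-grams over phonemes, syllables, letters, characters, words, or terms; the character-trigram choice, the averaging, and the `alpha` mixing factor below are all illustrative assumptions:

```python
def char_ngrams(text: str, n: int = 3):
    """All character n-grams of a string (one of the n-gram kinds listed)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def ngram_frequency(ngram: str, past_queries) -> int:
    """How often an n-gram occurs across the user's past search queries."""
    return sum(q.count(ngram) for q in past_queries)

def rescore(candidates, past_queries, alpha: float = 0.1):
    """Boost each candidate transcription's recognition confidence by a
    history-based n-gram weight; alpha is a hypothetical mixing factor."""
    rescored = {}
    for text, confidence in candidates.items():
        grams = char_ngrams(text)
        freq = sum(ngram_frequency(g, past_queries) for g in grams)
        weight = freq / max(len(grams), 1)   # average frequency per n-gram
        rescored[text] = confidence + alpha * weight
    return rescored
```

A candidate whose fragments recur in the user's own search history is promoted over an acoustically tied alternative, which is the abstract's core idea.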
  • Patent number: 8521537
    Abstract: A computer-driven device assists a user in self-regulating speech control of the device. The device processes an input signal representing human speech to compute acoustic signal quality indicators indicating conditions likely to be problematic to speech recognition, and advises the user of those conditions.
    Type: Grant
    Filed: April 3, 2007
    Date of Patent: August 27, 2013
    Assignee: Promptu Systems Corporation
    Inventors: Naren Chittar, Vikas Gulati, Matthew Pratt, Harry Printz
  • Patent number: 8521531
    Abstract: A speech search method for a display device is discussed. The method includes the steps of outputting media data, receiving a speech search command from a user, and determining whether the speech search command includes a query term. If the speech search command does not include a query term, the method further comprises the step of extracting a query term which is full and searchable from audio data of the media data which is outputted immediately prior to the speech search command. Finally, the method includes the step of performing a speech search using the extracted query term.
    Type: Grant
    Filed: February 6, 2013
    Date of Patent: August 27, 2013
    Assignee: LG Electronics Inc.
    Inventor: Yongsin Kim