Preliminary Matching Patents (Class 704/252)
  • Patent number: 8818807
    Abstract: This invention describes methods for implementing human speech recognition. The methods use sub-events: sounds between spaces (typically a fully spoken word) that are then compared with a library of sub-events. Each sub-event is packaged with its own speech recognition function as an individual unit. This invention illustrates how this model can be used as a large-vocabulary speech recognition system.
    Type: Grant
    Filed: May 24, 2010
    Date of Patent: August 26, 2014
    Inventor: Darrell Poirier
  • Patent number: 8812326
    Abstract: A computer-driven device assists a user in self-regulating speech control of the device. The device processes an input signal representing human speech to compute acoustic signal quality indicators indicating conditions likely to be problematic to speech recognition, and advises the user of those conditions.
    Type: Grant
    Filed: August 6, 2013
    Date of Patent: August 19, 2014
    Assignee: Promptu Systems Corporation
    Inventors: Naren Chittar, Vikas Gulati, Matthew Pratt, Harry Printz
  • Patent number: 8805677
    Abstract: Creating and processing a natural language grammar set of data based on an input text string are disclosed. The method may include tagging the input text string, and examining, via a processor, the input text string for at least one first set of substitutions based on content of the input text string. The method may also include determining whether the input text string is a substring of a previously tagged input text string by comparing the input text string to a previously tagged input text string, such that the substring determination operation determines whether the input text string is wholly included in the previously tagged input text string.
    Type: Grant
    Filed: February 4, 2014
    Date of Patent: August 12, 2014
    Assignee: West Corporation
    Inventor: Steven John Schanbacher
  • Patent number: 8805685
    Abstract: Disclosed herein are systems, methods, and tangible computer readable-media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
    Type: Grant
    Filed: August 5, 2013
    Date of Patent: August 12, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst J. Schroeter
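A minimal sketch of the variance check this abstract describes: repeated samples of the same phrase that are identical (or nearly so) suggest synthetic or replayed audio. The feature representation and the variance threshold here are illustrative assumptions, not taken from the patent.

```python
def verify_samples(samples, min_variance=1e-4):
    """Deny verification when repeated speech samples of the same word or
    phrase show too little variance (likely synthetic or replayed audio).

    `samples` is a list of equal-length feature vectors; `min_variance`
    is a hypothetical tuning parameter."""
    n = len(samples)
    if n < 2:
        return None  # inconclusive: request additional samples
    # Mean pairwise squared distance serves as a simple variance measure
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            pairs += 1
    variance = total / pairs
    return variance >= min_variance  # True = verified, False = denied
```

The `None` return mirrors the embodiment in which an inconclusive comparison triggers collection of additional samples.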
  • Patent number: 8781825
    Abstract: Embodiments of the present invention improve methods of performing speech recognition. In one embodiment, the present invention includes a method comprising receiving a spoken utterance, processing the spoken utterance in a speech recognizer to generate a recognition result, determining consistencies of one or more parameters of component sounds of the spoken utterance, wherein the parameters are selected from the group consisting of duration, energy, and pitch, and wherein each component sound of the spoken utterance has a corresponding value of said parameter, and validating the recognition result based on the consistency of at least one of said parameters.
    Type: Grant
    Filed: August 24, 2011
    Date of Patent: July 15, 2014
    Assignee: Sensory, Incorporated
    Inventors: Jonathan Shaw, Pieter Vermeulen, Stephen Sutton, Robert Savoie
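The consistency check described above can be sketched with the coefficient of variation of one parameter (duration, energy, or pitch) across the component sounds; the specific statistic and threshold are assumptions for illustration.

```python
def validate_by_consistency(values, max_cv=0.5):
    """Validate a recognition result by checking that one parameter
    (e.g. per-sound duration, energy, or pitch) is consistent across
    the component sounds of the utterance.

    Uses the coefficient of variation (std/mean) as an illustrative
    consistency measure; `max_cv` is a hypothetical threshold."""
    mean = sum(values) / len(values)
    if mean == 0:
        return False
    var = sum((v - mean) ** 2 for v in values) / len(values)
    cv = var ** 0.5 / mean
    return cv <= max_cv
```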
  • Patent number: 8775181
    Abstract: Interpretation from a first language to a second language via one or more communication devices is performed through a communication network (e.g. phone network or the internet) using a server for performing recognition and interpretation tasks, comprising the steps of: receiving an input speech utterance in a first language on a first mobile communication device; conditioning said input speech utterance; first transmitting said conditioned input speech utterance to a server; recognizing said first transmitted speech utterance to generate one or more recognition results; interpreting said recognition results to generate one or more interpretation results in an interlingua; mapping the interlingua to a second language in a first selected format; second transmitting said interpretation results in the first selected format to a second mobile communication device; and presenting said interpretation results in a second selected format on said second communication device.
    Type: Grant
    Filed: July 2, 2013
    Date of Patent: July 8, 2014
    Assignee: Fluential, LLC
    Inventors: Farzad Ehsani, Demitrios Master, Elaine Drom Zuber
  • Patent number: 8775180
    Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script. In yet still further aspects of the invention, the duration of a given interaction can be analyzed, either apart from or in combination with the script compliance analysis above, to seek to identify instances of agent non-compliance, of fraud, or of quality-analysis issues.
    Type: Grant
    Filed: November 26, 2012
    Date of Patent: July 8, 2014
    Assignee: West Corporation
    Inventors: Mark J. Pettay, Fonda J. Narke
  • Patent number: 8768700
    Abstract: A system may receive a voice search query and may determine word hypotheses for the voice query. Each word hypothesis may include one or more terms. The system may obtain a search query log and may determine, for each word hypothesis, a quantity of other search queries, in the search query log, that include the one or more terms. The system may determine weights based on the determined quantities. The system may generate, based on the weights, a first search query from the word hypotheses and may obtain a first set of search results. The system may modify, based on the first set of search results, one or more of the weights. The system may generate a second search query from the word hypotheses and obtain, based on the second search query, a second set of search results for the voice query.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: July 1, 2014
    Assignee: Google Inc.
    Inventors: Alexander Mark Franz, Monika H. Henzinger, Sergey Brin, Brian Christopher Milch
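The log-based weighting above can be sketched as follows: count how often each hypothesis term appears in other logged queries, weight each word hypothesis by those counts, and build the first search query from the best-weighted hypothesis. Function names and the averaging scheme are assumptions, not from the patent.

```python
from collections import Counter

def weight_hypotheses(hypotheses, query_log):
    """Weight word hypotheses for a voice query by how often their
    terms appear in other logged search queries."""
    term_counts = Counter()
    for query in query_log:
        term_counts.update(set(query.split()))
    weights = {}
    for hyp in hypotheses:
        terms = hyp.split()
        # Average per-term log frequency as the hypothesis weight
        weights[hyp] = sum(term_counts[t] for t in terms) / max(len(terms), 1)
    return weights

def best_query(hypotheses, query_log):
    """Generate the first search query from the highest-weighted hypothesis."""
    weights = weight_hypotheses(hypotheses, query_log)
    return max(weights, key=weights.get)
```

The abstract's second stage, re-weighting based on the first result set, would adjust `weights` before generating the second query.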
  • Patent number: 8751145
    Abstract: A voice recognition method that is used for finding a street uses a database including information about a plurality of streets. The streets are characterized by respective street names and street types. A user provides a voice input for the street that the user tries to find. The voice input includes a street name and a street type. The street type is recognized by processing the voice input. Streets having the recognized street type are then selected from the database and a street name of at least one of the streets selected from the database is recognized by processing the voice input.
    Type: Grant
    Filed: November 30, 2005
    Date of Patent: June 10, 2014
    Assignees: Volkswagen of America, Inc., Audi AG
    Inventors: Ramon Eduardo Prieto, Carsten Bergmann, William B. Lathrop, M. Kashif Imam, Gerd Gruchalski, Markus Möhrle
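The two-stage lookup this abstract describes can be sketched as: recognize the street type first, then match the name only against streets of that type. Plain string equality stands in for acoustic matching, and the data layout is an assumption.

```python
def find_street(voice_street_type, voice_street_name, streets):
    """Two-stage street lookup: narrow the database by recognized street
    type, then recognize the street name within that subset.

    `streets` is a list of (name, type) pairs; exact string comparison
    is a stand-in for real acoustic scoring."""
    # Stage 1: select only streets with the recognized type
    candidates = [name for name, stype in streets if stype == voice_street_type]
    # Stage 2: recognize the name within the narrowed candidate set
    for name in candidates:
        if name == voice_street_name:
            return name, voice_street_type
    return None
```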
  • Patent number: 8731927
    Abstract: A system and method are provided for recognizing a speech input and selecting an entry from a list of entries. The method includes recognizing a speech input. A fragment list of fragmented entries is provided and compared to the recognized speech input to generate a candidate list of best matching entries based on the comparison result. The system includes a speech recognition module and a database for storing the list of entries and the fragmented list. The speech recognition module may obtain the fragmented list from the database and store a candidate list of best matching entries in memory. A display may also be provided to allow a user to select from a list of best matching entries.
    Type: Grant
    Filed: March 18, 2013
    Date of Patent: May 20, 2014
    Assignee: Nuance Communications, Inc.
    Inventor: Markus Schwarz
  • Patent number: 8719023
    Abstract: An apparatus to improve robustness to environmental changes of a context dependent speech recognizer for an application, that includes a training database to store sounds for speech recognition training, a dictionary to store words supported by the speech recognizer, and a speech recognizer training module to train a set of one or more multiple state Hidden Markov Models (HMMs) with use of the training database and the dictionary. The speech recognizer training module performs a non-uniform state clustering process on each of the states of each HMM, which includes using a different non-uniform cluster threshold for at least some of the states of each HMM to more heavily cluster and correspondingly reduce a number of observation distributions for those of the states of each HMM that are less empirically affected by one or more contextual dependencies.
    Type: Grant
    Filed: May 21, 2010
    Date of Patent: May 6, 2014
    Assignee: Sony Computer Entertainment Inc.
    Inventors: Xavier Menendez-Pidal, Ruxin Chen
  • Patent number: 8719016
    Abstract: A method for converting speech to text in a speech analytics system is provided. The method includes receiving audio data containing speech made up of sounds from an audio source, processing the sounds with a phonetic module resulting in symbols corresponding to the sounds, and processing the symbols with a language module and occurrence table resulting in text. The method also includes determining a probability of correct translation for each word in the text, comparing the probability of correct translation for each word in the text to the occurrence table, and adjusting the occurrence table based on the probability of correct translation for each word in the text.
    Type: Grant
    Filed: April 7, 2010
    Date of Patent: May 6, 2014
    Assignee: Verint Americas Inc.
    Inventors: Omer Ziv, Ran Achituv, Ido Shapira
  • Patent number: 8712774
    Abstract: A hybrid text generator is disclosed that generates a hybrid text string from multiple text strings that are produced from an audio input by multiple automated speech recognition systems. The hybrid text generator receives metadata that describes a time-location that each word from the multiple text strings is located in the audio input. The hybrid text generator matches words between the multiple text strings using the metadata and generates a hybrid text string that includes the matched words. The hybrid text generator utilizes confidence scores associated with words that do not match between the multiple text strings to determine whether to add an unmatched word to the hybrid text string.
    Type: Grant
    Filed: March 29, 2010
    Date of Patent: April 29, 2014
    Assignee: Nuance Communications, Inc.
    Inventor: Jonathan Wiggs
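The merging step above can be sketched as: align words from two ASR outputs by their time metadata, keep matched words, and resolve disagreements by confidence score. This simplified version assumes the two streams are already time-aligned one-to-one.

```python
def hybrid_text(stream_a, stream_b):
    """Merge two ASR outputs into a hybrid transcript.

    Each stream is a list of (word, start_time, confidence) tuples;
    words are paired by position, standing in for matching on the
    time-location metadata the abstract describes."""
    merged = []
    for (wa, _ta, ca), (wb, _tb, cb) in zip(stream_a, stream_b):
        if wa == wb:
            merged.append(wa)  # matched word: keep it directly
        else:
            # Unmatched word: keep the hypothesis with higher confidence
            merged.append(wa if ca >= cb else wb)
    return " ".join(merged)
```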
  • Patent number: 8706489
    Abstract: A system and method for selecting audio contents by using the speech recognition to obtain a textual phrase from a series of audio contents are provided. The system includes an output module outputting the audio contents, an input module receiving a speech input from a user, a buffer temporarily storing the audio contents within a desired period and the speech input, and a recognizing module performing a speech recognition between the audio contents within the desired period and the speech input to generate an audio phrase and the corresponding textual phrase matching with the speech input.
    Type: Grant
    Filed: August 8, 2006
    Date of Patent: April 22, 2014
    Assignee: Delta Electronics Inc.
    Inventors: Jia-lin Shen, Chien-Chou Hung
  • Patent number: 8706503
    Abstract: Methods, systems, and computer readable storage medium related to operating an intelligent digital assistant are disclosed. A text string is obtained from a speech input received from a user. Information is derived from a communication event that occurred at the electronic device prior to receipt of the speech input. The text string is interpreted to derive a plurality of candidate interpretations of user intent. One of the candidate user intents is selected based on the information relating to the communication event.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: April 22, 2014
    Assignee: Apple Inc.
    Inventors: Adam John Cheyer, Didier Rene Guzzoni, Thomas Robert Gruber, Christopher Dean Brigham
  • Patent number: 8706491
    Abstract: One feature of the present invention uses the parsing capabilities of a structured language model in the information extraction process. During training, the structured language model is first initialized with syntactically annotated training data. The model is then trained by generating parses on semantically annotated training data enforcing annotated constituent boundaries. The syntactic labels in the parse trees generated by the parser are then replaced with joint syntactic and semantic labels. The model is then trained by generating parses on the semantically annotated training data enforcing the semantic tags or labels found in the training data. The trained model can then be used to extract information from test data using the parses generated by the model.
    Type: Grant
    Filed: August 24, 2010
    Date of Patent: April 22, 2014
    Assignee: Microsoft Corporation
    Inventors: Ciprian Chelba, Milind Mahajan
  • Patent number: 8700259
    Abstract: A system for selecting music includes a mobile system for processing and transmitting through a wireless link a continuous voice stream spoken by a user of the mobile system, the continuous voice stream including a music request, and a data center for processing the continuous voice stream received through the wireless link into voice music information. The data center can perform automated voice recognition processing on the voice music information to recognize music components of the music request, confirm the recognized music components through interactive speech exchanges with the mobile system user through the wireless link and the mobile system, selectively allow human data center operator intervention to assist in identifying the selected recognized music components having a recognition confidence below a selected threshold value, and download music information pertaining to the music request for transmission to the mobile system derived from the confirmed recognized music components.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: April 15, 2014
    Assignee: Agero Connected Services, Inc.
    Inventor: Thomas Barton Schalk
  • Patent number: 8700398
    Abstract: An interactive user interface is described for setting confidence score thresholds in a language processing system. There is a display of a first system confidence score curve characterizing system recognition performance associated with a high confidence threshold, a first user control for adjusting the high confidence threshold and an associated visual display highlighting a point on the first system confidence score curve representing the selected high confidence threshold, a display of a second system confidence score curve characterizing system recognition performance associated with a low confidence threshold, and a second user control for adjusting the low confidence threshold and an associated visual display highlighting a point on the second system confidence score curve representing the selected low confidence threshold. The operation of the second user control is constrained to require that the low confidence threshold must be less than or equal to the high confidence threshold.
    Type: Grant
    Filed: November 29, 2011
    Date of Patent: April 15, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Jeffrey N. Marcus, Amy E. Ulug, William Bridges Smith, Jr.
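The constraint the abstract states (low threshold ≤ high threshold) and the three-way decision it implies can be sketched as follows; clamping the controls rather than rejecting input mirrors a constrained UI slider, and the default values are assumptions.

```python
class ConfidenceThresholds:
    """Paired high/low confidence thresholds for a language processing
    system, enforcing low <= high as described in the abstract."""

    def __init__(self, low=0.3, high=0.7):
        if low > high:
            raise ValueError("low threshold must be <= high threshold")
        self.low, self.high = low, high

    def set_low(self, value):
        # Clamp to the high threshold, mirroring a constrained UI control
        self.low = min(value, self.high)

    def set_high(self, value):
        self.high = max(value, self.low)

    def classify(self, score):
        """Accept above high, reject below low, confirm in between."""
        if score >= self.high:
            return "accept"
        if score < self.low:
            return "reject"
        return "confirm"
```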
  • Patent number: 8694317
    Abstract: Methods for processing audio data containing speech to produce a searchable index file and for subsequently searching such an index file are provided. The processing method uses a phonetic approach and models each frame of the audio data with a set of reference phones. A score for each of the reference phones, representing the difference of the audio from the phone model, is stored in the searchable data file for each of the phones in the reference set. A consequence of storing information regarding each of the reference phones is that the accuracy of searches carried out on the index file is not compromised by the rejection of information about particular phones. A subsequent search method is also provided which uses a simple and efficient dynamic programming search to locate instances of a search term in the audio. The methods of the present invention have particular application to the field of audio data mining.
    Type: Grant
    Filed: February 6, 2006
    Date of Patent: April 8, 2014
    Assignee: Aurix Limited
    Inventors: Adrian I Skilling, Howard A K Wright
  • Patent number: 8688451
    Abstract: A speech recognition method includes receiving input speech from a user, processing the input speech using a first grammar to obtain parameter values of a first N-best list of vocabulary, comparing a parameter value of a top result of the first N-best list to a threshold value, and if the compared parameter value is below the threshold value, then additionally processing the input speech using a second grammar to obtain parameter values of a second N-best list of vocabulary. Other preferred steps include: determining the input speech to be in-vocabulary if any of the results of the first N-best list is also present within the second N-best list, but out-of-vocabulary if none of the results of the first N-best list is within the second N-best list; and providing audible feedback to the user if the input speech is determined to be out-of-vocabulary.
    Type: Grant
    Filed: May 11, 2006
    Date of Patent: April 1, 2014
    Assignee: General Motors LLC
    Inventors: Timothy J. Grost, Rathinavelu Chengalvarayan
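The two-pass flow above can be sketched as: accept a confident first-pass result outright; otherwise run a second grammar and call the input in-vocabulary only if the two N-best lists overlap. The `recognize` callable and the threshold value are hypothetical stand-ins for a real ASR engine.

```python
def two_pass_recognition(speech, recognize, threshold=0.6):
    """Two-pass sketch of the abstract's method.

    `recognize(speech, grammar)` is a stand-in ASR call returning an
    N-best list of (word, confidence) pairs. Returns (top word, bool)
    where the bool indicates the input is in-vocabulary."""
    nbest1 = recognize(speech, "grammar1")
    top_word, top_score = nbest1[0]
    if top_score >= threshold:
        return top_word, True  # confident first-pass result
    # Low confidence: consult the second grammar
    nbest2 = recognize(speech, "grammar2")
    words2 = {w for w, _ in nbest2}
    in_vocab = any(w in words2 for w, _ in nbest1)
    return top_word, in_vocab  # False -> give out-of-vocabulary feedback
```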
  • Patent number: 8688453
    Abstract: According to example configurations, a speech processing system can include a syntactic parser, a word extractor, word extraction rules, and an analyzer. The syntactic parser of the speech processing system parses the utterance to identify syntactic relationships amongst words in the utterance. The word extractor utilizes word extraction rules to identify groupings of related words in the utterance that most likely represent an intended meaning of the utterance. The analyzer in the speech processing system maps each set of the sets of words produced by the word extractor to a respective candidate intent value to produce a list of candidate intent values for the utterance. The analyzer is configured to select, from the list of candidate intent values (i.e., possible intended meanings) of the utterance, a particular candidate intent value as being representative of the intent (i.e., intended meaning) of the utterance.
    Type: Grant
    Filed: February 28, 2011
    Date of Patent: April 1, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Sachindra Joshi, Shantanu Godbole
  • Patent number: 8682669
    Abstract: A system and a method to generate statistical utterance classifiers optimized for the individual states of a spoken dialog system is disclosed. The system and method make use of large databases of transcribed and annotated utterances from calls collected in a dialog system in production and log data reporting the association between the state of the system at the moment when the utterances were recorded and the utterance. From the system state, being a vector of multiple system variables, subsets of these variables, certain variable ranges, quantized variable values, etc. can be extracted to produce a multitude of distinct utterance subsets matching every possible system state. For each of these subset and variable combinations, statistical classifiers can be trained, tuned, and tested, and the classifiers can be stored together with the performance results and the state subset and variable combination.
    Type: Grant
    Filed: August 21, 2009
    Date of Patent: March 25, 2014
    Assignee: Synchronoss Technologies, Inc.
    Inventors: David Suendermann, Jackson Liscombe, Krishna Dayanidhi, Roberto Pieraccini
  • Patent number: 8676580
    Abstract: A method, an apparatus and an article of manufacture for automatic speech recognition. The method includes obtaining at least one language model word and at least one rule-based grammar word, determining an acoustic similarity of at least one pair of language model word and rule-based grammar word, and increasing a transition cost to the at least one language model word based on the acoustic similarity of the at least one language model word with the at least one rule-based grammar word to generate a modified language model for automatic speech recognition.
    Type: Grant
    Filed: August 16, 2011
    Date of Patent: March 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Om D. Deshmukh, Etienne Marcheret, Shajith I. Mohamed, Ashish Verma, Karthik Visweswariah
  • Patent number: 8676579
    Abstract: A method of authenticating a user of a mobile device having a first microphone and a second microphone, the method comprising receiving voice input from the user at the first and second microphones, determining a position of the user relative to the mobile device based on the voice input received by the first and second microphones, and authenticating the user based on the position of the user.
    Type: Grant
    Filed: April 30, 2012
    Date of Patent: March 18, 2014
    Assignee: BlackBerry Limited
    Inventor: James Allen Hymel
  • Patent number: 8676577
    Abstract: A method of utilizing metadata stored in a computer-readable medium to assist in the conversion of an audio stream to a text stream. The method compares personally identifiable data, such as a user's electronic address book and/or Caller/Recipient ID information (in the case of processing voice mail to text), to the n-best results generated by a speech recognition engine for each word that is output by the engine. A goal of this comparison is to correct a possible misrecognition of a spoken proper noun such as a name or company with its proper textual form or a spoken phone number to correctly formatted phone number with Arabic numerals to improve the overall accuracy of the output of the voice recognition system.
    Type: Grant
    Filed: March 31, 2009
    Date of Patent: March 18, 2014
    Assignee: Canyon IP Holdings, LLC
    Inventors: Igor Roditis Jablokov, Clifford J. Strohofer, III, Marc White, Victor Roditis Jablokov
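The metadata-assisted correction above can be sketched for a single word position: if any n-best hypothesis matches a name from the user's address book, prefer it over the raw top hypothesis. The case-insensitive exact match is an illustrative simplification.

```python
def correct_with_contacts(nbest, contacts):
    """Post-process one word position of ASR output: prefer any n-best
    hypothesis that matches a known contact name (case-insensitively),
    otherwise fall back to the engine's top hypothesis."""
    known = {c.lower() for c in contacts}
    for word in nbest:  # nbest is ordered best-first
        if word.lower() in known:
            return word
    return nbest[0]  # no contact match: keep the top hypothesis
```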
  • Patent number: 8666743
    Abstract: The invention provides a speech recognition method for selecting a combination of list elements via a speech input, wherein a first list element of the combination is part of a first set of list elements and a second list element of the combination is part of a second set of list elements, the method comprising the steps of receiving the speech input, comparing each list element of the first set with the speech input to obtain a first candidate list of best matching list elements, processing the second set using the first candidate list to obtain a subset of the second set, comparing each list element of the subset of the second set with the speech input to obtain a second candidate list of best matching list elements, and selecting a combination of list elements using the first and the second candidate list.
    Type: Grant
    Filed: June 2, 2010
    Date of Patent: March 4, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Markus Schwarz, Matthias Schulz, Marc Biedert, Christian Hillebrecht, Franz Gerl, Udo Haiber
  • Patent number: 8666744
    Abstract: A method and apparatus are provided for automatically acquiring grammar fragments for recognizing and understanding fluently spoken language. Grammar fragments representing a set of syntactically and semantically similar phrases may be generated using three probability distributions: of succeeding words, of preceding words, and of associated call-types. The similarity between phrases may be measured by applying Kullback-Leibler distance to these three probability distributions. Phrases being close in all three distances may be clustered into a grammar fragment.
    Type: Grant
    Filed: September 21, 2000
    Date of Patent: March 4, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Kazuhiro Arai, Allen L. Gorin, Giuseppe Riccardi, Jeremy H. Wright
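The similarity measure above can be sketched with a smoothed, symmetrized Kullback-Leibler divergence over the three distributions (succeeding words, preceding words, call-types); two phrases cluster into one fragment only if they are close in all three, hence the max below. The smoothing and symmetrization are illustrative choices, not details from the patent.

```python
import math

def kl(p, q, eps=1e-9):
    """Smoothed Kullback-Leibler divergence between two discrete
    distributions given as item->probability dicts."""
    keys = set(p) | set(q)
    return sum((p.get(k, 0) + eps) * math.log((p.get(k, 0) + eps) /
                                              (q.get(k, 0) + eps))
               for k in keys)

def phrase_distance(a, b):
    """Distance between two phrases, each described by three
    distributions (succeeding words, preceding words, call-types).
    Taking the max means phrases must be close in ALL three
    distances to be clustered together."""
    return max(kl(a[i], b[i]) + kl(b[i], a[i]) for i in range(3))
```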
  • Patent number: 8666729
    Abstract: Creating and processing a natural language grammar set of data based on an input text string are disclosed. The method may include tagging the input text string, and examining, via a processor, the input text string for at least one first set of substitutions based on content of the input text string. The method may also include determining whether the input text string is a substring of a previously tagged input text string by comparing the input text string to a previously tagged input text string, such that the substring determination operation determines whether the input text string is wholly included in the previously tagged input text string.
    Type: Grant
    Filed: February 10, 2010
    Date of Patent: March 4, 2014
    Assignee: West Corporation
    Inventor: Steven John Schanbacher
  • Patent number: 8660845
    Abstract: Systems and methods for audio editing are provided. In one implementation, a computer-implemented method is provided. The method includes receiving digital audio data including a plurality of distinct vocal components. Each distinct vocal component is automatically identified using one or more attributes that uniquely identify each distinct vocal component. The audio data is separated into two or more individual tracks where each individual track comprises audio data corresponding to one distinct vocal component. The separated individual tracks are then made available for further processing.
    Type: Grant
    Filed: October 16, 2007
    Date of Patent: February 25, 2014
    Assignee: Adobe Systems Incorporated
    Inventors: Nariman Sodeifi, David E. Johnston
  • Patent number: 8639509
    Abstract: In a confidence computing method and system, a processor may interpret speech signals as a text string or directly receive a text string as input, generate a syntactical parse tree representing the interpreted string and including a plurality of sub-trees which each represents a corresponding section of the interpreted text string, determine for each sub-tree whether the sub-tree is accurate, obtain replacement speech signals for each sub-tree determined to be inaccurate, and provide output based on corresponding text string sections of at least one sub-tree determined to be accurate.
    Type: Grant
    Filed: July 27, 2007
    Date of Patent: January 28, 2014
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Feng Lin, Zhe Feng
  • Patent number: 8639507
    Abstract: The present invention enables high-speed recognition even when the grammar includes a large amount of garbage. A first voice recognition processing unit executes a voice recognition process on a voice feature amount of the input voice based on a first grammar, generating a recognition hypothesis graph that indicates a structure of hypotheses derived according to the first grammar, together with a score associated with the respective connections of each recognition unit. A second voice recognition processing unit executes a voice recognition process according to a second grammar, which is specified to accept sections of the input voice other than keywords as garbage sections, and outputs the recognition result from the total score of a hypothesis derived according to the second grammar, acquiring the structure and score of each garbage section from the recognition hypothesis graph.
    Type: Grant
    Filed: December 22, 2008
    Date of Patent: January 28, 2014
    Assignee: NEC Corporation
    Inventors: Fumihiro Adachi, Ryosuke Isotani, Ken Hanazawa
  • Patent number: 8630860
    Abstract: Techniques disclosed herein include systems and methods for open-domain voice-enabled searching that is speaker sensitive. Techniques include using speech information, speaker information, and information associated with a spoken query to enhance open voice search results. This includes integrating a textual index with a voice index to support the entire search cycle. Given a voice query, the system can execute two matching processes simultaneously. This can include a text matching process based on the output of speech recognition, as well as a voice matching process based on characteristics of a caller or user voicing a query. Characteristics of the caller can include output of voice feature extraction and metadata about the call. The system clusters callers according to these characteristics. The system can use specific voice and text clusters to modify speech recognition results, as well as modifying search results.
    Type: Grant
    Filed: March 3, 2011
    Date of Patent: January 14, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Shilei Zhang, Shenghua Bao, Wen Liu, Yong Qin, Zhiwei Shuang, Jian Chen, Zhong Su, Qin Shi, William F. Ganong, III
  • Patent number: 8620654
    Abstract: A system in one embodiment includes a server associated with a unified messaging system (UMS). The server records speech of a user as an audio data file, translates the audio data file into a text data file, and maps each word within the text data file to a corresponding segment of audio data in the audio data file. A graphical user interface (GUI) of a message editor running on an endpoint associated with the user displays the text data file on the endpoint and allows the user to identify a portion of the text data file for replacement. The server being further operable to record new speech of the user as new audio data and to replace one or more segments of the audio data file corresponding to the portion of the text with the new audio data.
    Type: Grant
    Filed: July 20, 2007
    Date of Patent: December 31, 2013
    Assignee: Cisco Technology, Inc.
    Inventors: Joseph F. Khouri, Laurent Philonenko, Mukul Jain, Shmuel Shaffer
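The word-to-audio mapping and splice described above can be sketched as: each transcript word carries a (start, end) sample range, and replacing words i..j swaps out both the text and the corresponding audio segment. All names and the list-based audio representation are illustrative.

```python
def edit_message(text_words, word_times, audio, replace_range, new_text, new_audio):
    """Sketch of the message editor: each transcript word maps to a
    (start, end) sample range in `audio`; replacing words i..j splices
    out the matching audio segment and inserts the newly recorded audio.

    `audio` and `new_audio` are plain sample lists for illustration."""
    i, j = replace_range
    start, _ = word_times[i]   # first sample of the replaced span
    _, end = word_times[j]     # one past the last sample of the span
    new_words = text_words[:i] + new_text + text_words[j + 1:]
    new_full_audio = audio[:start] + new_audio + audio[end:]
    return new_words, new_full_audio
```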
  • Patent number: 8612221
    Abstract: A portable terminal that exchanges data with a server is achieved with a simple configuration. The portable terminal (1) includes an audio pickup means that acquires sound, an absolute position detection unit (1-1) that detects the absolute position of the portable terminal, a relative position detection unit (1-2) that detects the relative position of the portable terminal, and a speech recognition and synthesis unit (1-3) that recognizes the audio acquired by the audio pickup means as speech.
    Type: Grant
    Filed: February 2, 2010
    Date of Patent: December 17, 2013
    Assignee: Seiko Epson Corporation
    Inventors: Junichi Yoshizawa, Tetsuo Ozawa, Koji Koseki
  • Patent number: 8606560
    Abstract: An interpretation system that includes an optical or audio acquisition device for acquiring a sentence written or spoken in a source language and an audio restoration device for generating, from an input signal acquired by the acquisition device, a source sentence that is a transcription of the sentence in the source language. The interpretation system further includes a translation device for generating, from the source sentence, a target sentence that is a translation of the source sentence in a target language, and a speech synthesis device for generating, from the target sentence, an output audio signal reproduced by the audio restoration device. The interpretation system includes a smoothing device for calling the recognition, translation and speech synthesis devices in order to produce in real time an interpretation in the target language of the sentence in the source language.
    Type: Grant
    Filed: November 18, 2008
    Date of Patent: December 10, 2013
    Inventor: Jean Grenier
  • Patent number: 8606568
    Abstract: Methods, computer program products, and systems are described for receiving, by a speech recognition engine, audio data that encodes an utterance and determining, by the speech recognition engine, that a transcription of the utterance includes one or more keywords associated with a command, and a pronoun. In addition, the methods, computer program products, and systems described herein pertain to transmitting a disambiguation request to an application, wherein the disambiguation request identifies the pronoun, receiving, by the speech recognition engine, a response to the disambiguation request, wherein the response references an item of content identified by the application, and generating, by the speech recognition engine, the command using the keywords and the response.
    Type: Grant
    Filed: October 23, 2012
    Date of Patent: December 10, 2013
    Assignee: Google Inc.
    Inventors: Simon Tickner, Richard Z. Cohen
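The keyword-plus-pronoun flow in the 8606568 abstract can be sketched as follows. The keyword table, pronoun set, and the `resolve_pronoun` callback are all hypothetical stand-ins; the callback plays the role of the application's response to the disambiguation request:

```python
COMMAND_KEYWORDS = {"share": "SHARE_CONTENT", "play": "PLAY_MEDIA"}  # hypothetical
PRONOUNS = {"it", "this", "that", "them"}

def parse_utterance(transcription):
    """Return (command, pronoun) if the transcription pairs a command
    keyword with a pronoun, else (None, None) for either missing part."""
    tokens = transcription.lower().split()
    command = next((COMMAND_KEYWORDS[t] for t in tokens if t in COMMAND_KEYWORDS), None)
    pronoun = next((t for t in tokens if t in PRONOUNS), None)
    return command, pronoun

def build_command(transcription, resolve_pronoun):
    """Generate the final command from the keywords and the application's
    answer to the disambiguation request (the resolved item of content)."""
    command, pronoun = parse_utterance(transcription)
    if command and pronoun:
        return (command, resolve_pronoun(pronoun))
    return None
```

The point of the round trip is that the recognizer never guesses the referent itself; it asks the application which item of content the pronoun denotes.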
  • Patent number: 8589161
    Abstract: A system and method for an integrated, multi-modal, multi-device natural language voice services environment may be provided. In particular, the environment may include a plurality of voice-enabled devices each having intent determination capabilities for processing multi-modal natural language inputs in addition to knowledge of the intent determination capabilities of other devices in the environment. Further, the environment may be arranged in a centralized manner, a distributed peer-to-peer manner, or various combinations thereof. As such, the various devices may cooperate to determine intent of multi-modal natural language inputs, and commands, queries, or other requests may be routed to one or more of the devices best suited to take action in response thereto.
    Type: Grant
    Filed: May 27, 2008
    Date of Patent: November 19, 2013
    Assignee: VoiceBox Technologies, Inc.
    Inventors: Robert A. Kennewick, Chris Weider
  • Patent number: 8583436
    Abstract: A word category estimation apparatus (100) includes a word category model (5) which is formed from a probability model having a plurality of kinds of information about a word category as features, and includes information about an entire word category graph as at least one of the features. A word category estimation unit (4) receives the word category graph of a speech recognition hypothesis to be processed, computes scores by referring to the word category model for respective arcs that form the word category graph, and outputs a word category sequence candidate based on the scores.
    Type: Grant
    Filed: December 19, 2008
    Date of Patent: November 12, 2013
    Assignee: NEC Corporation
    Inventors: Hitoshi Yamamoto, Kiyokazu Miki
  • Patent number: 8571869
    Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
    Type: Grant
    Filed: May 15, 2008
    Date of Patent: October 29, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Sabine Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
  • Patent number: 8571849
    Abstract: Disclosed herein are systems, methods, and computer readable-media for enriching spoken language translation with prosodic information in a statistical speech translation framework. The method includes receiving speech for translation to a target language, generating pitch accent labels representing segments of the received speech which are prosodically prominent, and injecting pitch accent labels with word tokens within the translation engine to create enriched target language output text. A further step may be added of synthesizing speech in the target language based on the prosody enriched target language output text. An automatic prosody labeler can generate pitch accent labels. An automatic prosody labeler can exploit lexical, syntactic, and prosodic information of the speech. A maximum entropy model may be used to determine which segments of the speech are prosodically prominent.
    Type: Grant
    Filed: September 30, 2008
    Date of Patent: October 29, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Srinivas Bangalore, Vivek Kumar Rangarajan Sridhar
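The injection of pitch accent labels alongside word tokens (patent 8571849) can be sketched minimally. The `_ACC` label convention and the suffix encoding are assumptions for illustration, not the patent's actual tag set:

```python
def enrich_with_prosody(words, accents):
    """Attach pitch-accent labels to word tokens before translation.

    words   -- word tokens from the recognizer
    accents -- parallel list: a label such as "ACC" for prosodically
               prominent words, None otherwise (an assumed label set,
               as produced by an automatic prosody labeler)
    Returns the enriched token sequence fed to the translation engine.
    """
    return [f"{w}_{a}" if a else w for w, a in zip(words, accents)]

enrich_with_prosody(["the", "red", "car"], [None, "ACC", None])
# ["the", "red_ACC", "car"]
```

Because the labels travel with the tokens through the statistical translation engine, the target-language output retains prominence marks that a synthesizer can exploit.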
  • Patent number: 8571858
    Abstract: For classifying different segments of a signal which has segments of at least a first type and second type, e.g. audio and speech segments, the signal is short-term classified on the basis of the at least one short-term feature extracted from the signal and a short-term classification result is delivered. The signal is also long-term classified on the basis of the at least one short-term feature and at least one long-term feature extracted from the signal and a long-term classification result is delivered. The short-term classification result and the long-term classification result are combined to provide an output signal indicating whether a segment of the signal is of the first type or of the second type.
    Type: Grant
    Filed: January 11, 2011
    Date of Patent: October 29, 2013
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
    Inventors: Guillaume Fuchs, Stefan Bayer, Jens Hirschfeld, Juergen Herre, Jeremie Lecomte, Frederik Nagel, Nikolaus Rettelbach, Stefan Wabnik, Yoshikazu Yokotani
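The combination step in the 8571858 abstract can be sketched as a weighted fusion of the two classifier results. The score range, the 0.5 threshold, and the weighting are assumed conventions, not the patent's actual scheme:

```python
def classify_segment(st_score: float, lt_score: float, weight: float = 0.5) -> str:
    """Combine a short-term and a long-term classification result.

    Both scores are assumed to lie in [0, 1], higher meaning more
    speech-like; `weight` balances the short-term result against the
    long-term one before thresholding.
    """
    combined = weight * st_score + (1.0 - weight) * lt_score
    return "speech" if combined >= 0.5 else "audio"
```

The long-term score smooths out frames where short-term features alone are ambiguous, which is the motivation for fusing the two rather than using either alone.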
  • Patent number: 8566097
    Abstract: A lexical acquisition apparatus includes: a phoneme recognition section 2 for preparing a phoneme sequence candidate from an inputted speech; a word matching section 3 for preparing a plurality of word sequences based on the phoneme sequence candidate; a discrimination section 4 for selecting, from among a plurality of word sequences, a word sequence having a high likelihood in a recognition result; an acquisition section 5 for acquiring a new word based on the word sequence selected by the discrimination section 4; a teaching word list 4A used to teach a name; and a probability model 4B of the teaching word and an unknown word, wherein the discrimination section 4 calculates, for each word sequence, a first evaluation value showing how much words in the word sequence correspond to teaching words in the list 4A and a second evaluation value showing a probability at which the words in the word sequence are adjacent to one another and selects a word sequence for which a sum of the first evaluation value and the
    Type: Grant
    Filed: June 1, 2010
    Date of Patent: October 22, 2013
    Assignees: Honda Motor Co., Ltd., Advanced Telecommunications Research Institute International
    Inventors: Mikio Nakano, Takashi Nose, Ryo Taguchi, Kotaro Funakoshi, Naoto Iwahashi
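The selection criterion in the 8566097 abstract, choosing the word sequence that maximizes the sum of the two evaluation values, can be sketched as follows. The concrete forms of both evaluations (match fraction, bigram log-probability) are assumptions standing in for the patent's teaching-word list 4A and probability model 4B:

```python
import math

def first_eval(words, teaching_words) -> float:
    """Fraction of words matching the teaching-word list (assumed form)."""
    return sum(w in teaching_words for w in words) / len(words)

def second_eval(words, bigram_prob) -> float:
    """Log-probability that adjacent words co-occur, from an assumed
    bigram model; unseen pairs get a small floor probability."""
    return sum(math.log(bigram_prob.get((a, b), 1e-6))
               for a, b in zip(words, words[1:]))

def select_sequence(sequences, teaching_words, bigram_prob):
    """Pick the word sequence maximizing the sum of both evaluations."""
    return max(sequences,
               key=lambda ws: first_eval(ws, teaching_words) + second_eval(ws, bigram_prob))
```

The first term rewards agreement with known teaching words while the second penalizes implausible adjacencies, so an unknown word is acquired only when it sits in an otherwise well-supported sequence.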
  • Patent number: 8564721
    Abstract: The addition of temporal positions to an inverted index allows for temporal queries in addition to phrase queries. Storing additional binary data for each term instance in the word-level index, to prepare for searching in response to time-based queries from a user, is accomplished through the use of Lucene's binary payload feature, where the payload structure is defined for use in such searches. The pre-defined payload fields consist of three integers, which account for 12 extra bytes that must be stored for each term instance. A content database on the Master/Administrator server node provides the indexes for search into content in response to user events, returning results in JSON format. The search results may then be used to locate and present content segments to a user containing both the requested search term results and the time location and duration within a content asset where the search term(s) are found.
    Type: Grant
    Filed: August 28, 2012
    Date of Patent: October 22, 2013
    Inventors: Matthew Berry, Changwen Yang
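The three-integer, 12-byte payload described in the 8564721 abstract can be sketched with fixed-width packing. The specific field meanings (start time, duration, segment id) and the byte order are assumptions; only the three-integers-per-term-instance layout comes from the abstract:

```python
import struct

# Hypothetical payload layout: three 32-bit integers per term instance,
# e.g. start time (ms), duration (ms), and segment id -- 12 bytes total,
# matching the three-integer payload the abstract describes.
PAYLOAD_FMT = ">iii"  # big-endian, 3 x 4 bytes

def encode_payload(start_ms: int, duration_ms: int, segment_id: int) -> bytes:
    """Pack the per-term temporal data into a 12-byte payload."""
    return struct.pack(PAYLOAD_FMT, start_ms, duration_ms, segment_id)

def decode_payload(payload: bytes) -> tuple:
    """Unpack a 12-byte payload back into its three integers."""
    return struct.unpack(PAYLOAD_FMT, payload)

raw = encode_payload(61_250, 430, 7)
assert len(raw) == 12                       # 12 extra bytes per term instance
assert decode_payload(raw) == (61_250, 430, 7)
```

In Lucene itself such bytes would be attached per posting via the payload feature; decoding them at query time is what lets a phrase hit carry its time location and duration back to the caller.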
  • Patent number: 8560373
    Abstract: A method for direct marketing comprising establishing a first communications link between a prospective customer using a device having a unique identification number and a communications device, automatically transmitting the unique identification number associated with the prospective customer's device to the communications device, establishing a second communications link between the communications device and a computer operably connected to a memory apparatus having a prospective customer database comprising prospective customer information associated with the unique identification number of the prospective customer's device, in which the information in the database determines prospective customer value which can be used to determine subsequent operations and marketing actions with the prospective customer.
    Type: Grant
    Filed: September 25, 2008
    Date of Patent: October 15, 2013
    Inventor: Eileen A. Fraser
  • Patent number: 8560324
    Abstract: A mobile terminal including an input unit configured to receive an input to activate a voice recognition function on the mobile terminal, a memory configured to store information related to operations performed on the mobile terminal, and a controller configured to activate the voice recognition function upon receiving the input to activate the voice recognition function, to determine a meaning of an input voice instruction based on at least one prior operation performed on the mobile terminal and a language included in the voice instruction, and to provide operations related to the determined meaning of the input voice instruction based on the at least one prior operation performed on the mobile terminal and the language included in the voice instruction and based on a probability that the determined meaning of the input voice instruction matches the information related to the operations of the mobile terminal.
    Type: Grant
    Filed: January 31, 2012
    Date of Patent: October 15, 2013
    Assignee: LG Electronics Inc.
    Inventors: Jong-Ho Shin, Jae-Do Kwak, Jong-Keun Youn
  • Patent number: 8548806
    Abstract: A voice recognition device, a voice recognition method and a voice recognition program capable of appropriately restricting recognition objects based on voice input from a user to recognize the input voice with accuracy are provided.
    Type: Grant
    Filed: September 11, 2007
    Date of Patent: October 1, 2013
    Assignee: Honda Motor Co., Ltd.
    Inventor: Hisayuki Nagashima
  • Patent number: 8527271
    Abstract: A method for the voice recognition of a spoken expression to be recognized, comprising a plurality of expression parts that are to be recognized. Partial voice recognition takes place on a first selected expression part, and depending on a selection of hits for the first expression part detected by the partial voice recognition, voice recognition on the first and further expression parts is executed.
    Type: Grant
    Filed: June 18, 2008
    Date of Patent: September 3, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Michael Wandinger, Jesus Fernando Guitarte Perez, Bernhard Littel
  • Patent number: 8521526
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing spoken query terms. In one aspect, a method includes performing speech recognition on an audio signal to select two or more textual, candidate transcriptions that match a spoken query term, and to establish a speech recognition confidence value for each candidate transcription, obtaining a search history for a user who spoke the spoken query term, where the search history references one or more past search queries that have been submitted by the user, generating one or more n-grams from each candidate transcription, where each n-gram is a subsequence of n phonemes, syllables, letters, characters, words or terms from a respective candidate transcription, and determining, for each n-gram, a frequency with which the n-gram occurs in the past search queries, and a weighting value that is based on the respective frequency.
    Type: Grant
    Filed: July 28, 2010
    Date of Patent: August 27, 2013
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Johan Schalkwyk, Pankaj Risbood
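The history-based weighting in the 8521526 abstract can be sketched for one n-gram type. The abstract allows n-grams over phonemes, syllables, letters, characters, words, or terms; the character-trigram choice, the averaging, and the `alpha` mixing factor below are all illustrative assumptions:

```python
def char_ngrams(text: str, n: int = 3):
    """All character n-grams of a string (one of the n-gram kinds listed)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def ngram_frequency(ngram: str, past_queries) -> int:
    """How often an n-gram occurs across the user's past search queries."""
    return sum(q.count(ngram) for q in past_queries)

def rescore(candidates, past_queries, alpha: float = 0.1):
    """Boost each candidate transcription's recognition confidence by a
    history-based n-gram weight; alpha is a hypothetical mixing factor."""
    rescored = {}
    for text, confidence in candidates.items():
        grams = char_ngrams(text)
        freq = sum(ngram_frequency(g, past_queries) for g in grams)
        weight = freq / max(len(grams), 1)   # average frequency per n-gram
        rescored[text] = confidence + alpha * weight
    return rescored
```

A candidate whose fragments recur in the user's own search history is promoted over an acoustically tied alternative, which is the abstract's core idea.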
  • Patent number: 8521537
    Abstract: A computer-driven device assists a user in self-regulating speech control of the device. The device processes an input signal representing human speech to compute acoustic signal quality indicators indicating conditions likely to be problematic to speech recognition, and advises the user of those conditions.
    Type: Grant
    Filed: April 3, 2007
    Date of Patent: August 27, 2013
    Assignee: Promptu Systems Corporation
    Inventors: Naren Chittar, Vikas Gulati, Matthew Pratt, Harry Printz
  • Patent number: 8521531
    Abstract: A speech search method for a display device is discussed. The method includes the steps of outputting media data, receiving a speech search command from a user, and determining whether the speech search command includes a query term. If the speech search command does not include a query term, the method further comprises the step of extracting a query term which is full and searchable from audio data of the media data which is outputted immediately prior to the speech search command. Finally, the method includes the step of performing a speech search using the extracted query term.
    Type: Grant
    Filed: February 6, 2013
    Date of Patent: August 27, 2013
    Assignee: LG Electronics Inc.
    Inventor: Yongsin Kim