Preliminary Matching Patents (Class 704/252)
-
Patent number: 8818807
Abstract: This invention describes methods for implementing human speech recognition. The methods use sub-events, sounds between spaces (typically a fully spoken word), that are then compared with a library of sub-events. Each sub-event is packaged with its own speech recognition function as an individual unit. This invention illustrates how this model can be used as a Large Vocabulary Speech Recognition System.
Type: Grant
Filed: May 24, 2010
Date of Patent: August 26, 2014
Inventor: Darrell Poirier
-
Patent number: 8812326
Abstract: A computer-driven device assists a user in self-regulating speech control of the device. The device processes an input signal representing human speech to compute acoustic signal quality indicators indicating conditions likely to be problematic to speech recognition, and advises the user of those conditions.
Type: Grant
Filed: August 6, 2013
Date of Patent: August 19, 2014
Assignee: Promptu Systems Corporation
Inventors: Naren Chittar, Vikas Gulati, Matthew Pratt, Harry Printz
-
Patent number: 8805677
Abstract: Creating and processing a natural language grammar set of data based on an input text string are disclosed. The method may include tagging the input text string, and examining, via a processor, the input text string for at least one first set of substitutions based on content of the input text string. The method may also include determining whether the input text string is a substring of a previously tagged input text string by comparing the input text string to a previously tagged input text string, such that the substring determination operation determines whether the input text string is wholly included in the previously tagged input text string.
Type: Grant
Filed: February 4, 2014
Date of Patent: August 12, 2014
Assignee: West Corporation
Inventor: Steven John Schanbacher
-
Patent number: 8805685
Abstract: Disclosed herein are systems, methods, and tangible computer-readable media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if they demonstrate sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold, or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
Type: Grant
Filed: August 5, 2013
Date of Patent: August 12, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Horst J. Schroeter
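The variance check described in this abstract can be sketched in a few lines. The feature representation (per-sample feature vectors), the pooled-variance statistic, and the `min_variance` threshold below are illustrative assumptions, not the patented method:

```python
import statistics

def verify_samples(feature_vectors, min_variance=1e-4):
    """Deny verification when repeated samples of the same phrase are
    suspiciously identical (a sign of replayed or synthetic speech).
    `feature_vectors` is a list of per-sample feature lists; the
    threshold is a hypothetical tuning parameter."""
    if len(feature_vectors) < 2:
        return None  # inconclusive: request additional samples
    # variance of each feature dimension across the samples
    per_dim_variance = [statistics.pvariance(dim) for dim in zip(*feature_vectors)]
    mean_variance = sum(per_dim_variance) / len(per_dim_variance)
    return mean_variance >= min_variance
```

Identical samples yield zero variance and are rejected; naturally varying repetitions pass.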
-
Patent number: 8781825
Abstract: Embodiments of the present invention improve methods of performing speech recognition. In one embodiment, the present invention includes a method comprising receiving a spoken utterance, processing the spoken utterance in a speech recognizer to generate a recognition result, determining consistencies of one or more parameters of component sounds of the spoken utterance, wherein the parameters are selected from the group consisting of duration, energy, and pitch, and wherein each component sound of the spoken utterance has a corresponding value of said parameter, and validating the recognition result based on the consistency of at least one of said parameters.
Type: Grant
Filed: August 24, 2011
Date of Patent: July 15, 2014
Assignee: Sensory, Incorporated
Inventors: Jonathan Shaw, Pieter Vermeulen, Stephen Sutton, Robert Savoie
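A minimal sketch of this kind of consistency-based validation, assuming a relative-spread test and a tolerance that the abstract does not specify:

```python
def consistent(values, max_rel_spread=0.5):
    """Check one parameter (e.g. per-component duration, energy, or
    pitch) for consistency: the spread relative to the mean must stay
    below a hypothetical tolerance."""
    mean = sum(values) / len(values)
    if mean == 0:
        return all(v == 0 for v in values)
    return (max(values) - min(values)) / mean <= max_rel_spread

def validate_result(component_params):
    """Accept a recognition result only if every measured parameter of
    the component sounds is consistent. `component_params` maps a
    parameter name to its per-component values."""
    return all(consistent(v) for v in component_params.values())
```

An utterance whose component durations or pitch values swing wildly would fail validation under this rule.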
-
Patent number: 8775181
Abstract: Interpretation from a first language to a second language via one or more communication devices is performed through a communication network (e.g. phone network or the internet) using a server for performing recognition and interpretation tasks, comprising the steps of: receiving an input speech utterance in a first language on a first mobile communication device; conditioning said input speech utterance; first transmitting said conditioned input speech utterance to a server; recognizing said first transmitted speech utterance to generate one or more recognition results; interpreting said recognition results to generate one or more interpretation results in an interlingua; mapping the interlingua to a second language in a first selected format; second transmitting said interpretation results in the first selected format to a second mobile communication device; and presenting said interpretation results in a second selected format on said second communication device.
Type: Grant
Filed: July 2, 2013
Date of Patent: July 8, 2014
Assignee: Fluential, LLC
Inventors: Farzad Ehsani, Demitrios Master, Elaine Drom Zuber
-
Patent number: 8775180
Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script. In still further aspects of the invention, the duration of a given interaction can be analyzed, either apart from or in combination with the script compliance analysis above, to seek to identify instances of agent non-compliance, of fraud, or of quality-analysis issues.
Type: Grant
Filed: November 26, 2012
Date of Patent: July 8, 2014
Assignee: West Corporation
Inventors: Mark J. Pettay, Fonda J. Narke
-
Patent number: 8768700
Abstract: A system may receive a voice search query and may determine word hypotheses for the voice query. Each word hypothesis may include one or more terms. The system may obtain a search query log and may determine, for each word hypothesis, a quantity of other search queries, in the search query log, that include the one or more terms. The system may determine weights based on the determined quantities. The system may generate, based on the weights, a first search query from the word hypotheses and may obtain a first set of search results. The system may modify, based on the first set of search results, one or more of the weights. The system may generate a second search query from the word hypotheses and obtain, based on the second search query, a second set of search results for the voice query.
Type: Grant
Filed: September 14, 2012
Date of Patent: July 1, 2014
Assignee: Google Inc.
Inventors: Alexander Mark Franz, Monika H. Henzinger, Sergey Brin, Brian Christopher Milch
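The log-based weighting step might look like the following sketch; the simple term-frequency scoring is a hypothetical stand-in for the patent's weighting, not its actual formula:

```python
from collections import Counter

def hypothesis_weights(hypotheses, query_log):
    """Weight each word hypothesis of a voice query by how often its
    terms appear in other logged search queries. Hypotheses and log
    entries are plain whitespace-separated strings."""
    counts = Counter(word for query in query_log for word in query.split())
    weights = {}
    for hyp in hypotheses:
        terms = hyp.split()
        # average log frequency of the hypothesis's terms
        weights[hyp] = sum(counts[t] for t in terms) / len(terms)
    return weights
```

A misrecognized hypothesis whose terms rarely occur in the query log ("sheep flights") scores lower than the plausible one ("cheap flights"), so the system can prefer the latter when building the first search query.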
-
Patent number: 8751145
Abstract: A voice recognition method that is used for finding a street uses a database including information about a plurality of streets. The streets are characterized by respective street names and street types. A user provides a voice input for the street that the user tries to find. The voice input includes a street name and a street type. The street type is recognized by processing the voice input. Streets having the recognized street type are then selected from the database and a street name of at least one of the streets selected from the database is recognized by processing the voice input.
Type: Grant
Filed: November 30, 2005
Date of Patent: June 10, 2014
Assignees: Volkswagen of America, Inc., Audi AG
Inventors: Ramon Eduardo Prieto, Carsten Bergmann, William B. Lathrop, M. Kashif Imam, Gerd Gruchalski, Markus Möhrle
-
Patent number: 8731927
Abstract: A system and method is provided for recognizing a speech input and selecting an entry from a list of entries. The method includes recognizing a speech input. A fragment list of fragmented entries is provided and compared to the recognized speech input to generate a candidate list of best matching entries based on the comparison result. The system includes a speech recognition module, and a database for storing the list of entries and the fragmented list. The speech recognition module may obtain the fragmented list from the database and store a candidate list of best matching entries in memory. A display may also be provided to allow a user to select from a list of best matching entries.
Type: Grant
Filed: March 18, 2013
Date of Patent: May 20, 2014
Assignee: Nuance Communications, Inc.
Inventor: Markus Schwarz
-
Patent number: 8719023
Abstract: An apparatus to improve robustness to environmental changes of a context dependent speech recognizer for an application, that includes a training database to store sounds for speech recognition training, a dictionary to store words supported by the speech recognizer, and a speech recognizer training module to train a set of one or more multiple state Hidden Markov Models (HMMs) with use of the training database and the dictionary. The speech recognizer training module performs a non-uniform state clustering process on each of the states of each HMM, which includes using a different non-uniform cluster threshold for at least some of the states of each HMM to more heavily cluster and correspondingly reduce a number of observation distributions for those of the states of each HMM that are less empirically affected by one or more contextual dependencies.
Type: Grant
Filed: May 21, 2010
Date of Patent: May 6, 2014
Assignee: Sony Computer Entertainment Inc.
Inventors: Xavier Menendez-Pidal, Ruxin Chen
-
Patent number: 8719016
Abstract: A method for converting speech to text in a speech analytics system is provided. The method includes receiving audio data containing speech made up of sounds from an audio source, processing the sounds with a phonetic module resulting in symbols corresponding to the sounds, and processing the symbols with a language module and occurrence table resulting in text. The method also includes determining a probability of correct translation for each word in the text, comparing the probability of correct translation for each word in the text to the occurrence table, and adjusting the occurrence table based on the probability of correct translation for each word in the text.
Type: Grant
Filed: April 7, 2010
Date of Patent: May 6, 2014
Assignee: Verint Americas Inc.
Inventors: Omer Ziv, Ran Achituv, Ido Shapira
-
Patent number: 8712774
Abstract: A hybrid text generator is disclosed that generates a hybrid text string from multiple text strings that are produced from an audio input by multiple automated speech recognition systems. The hybrid text generator receives metadata that describes a time-location that each word from the multiple text strings is located in the audio input. The hybrid text generator matches words between the multiple text strings using the metadata and generates a hybrid text string that includes the matched words. The hybrid text generator utilizes confidence scores associated with words that do not match between the multiple text strings to determine whether to add an unmatched word to the hybrid text string.
Type: Grant
Filed: March 29, 2010
Date of Patent: April 29, 2014
Assignee: Nuance Communications, Inc.
Inventor: Jonathan Wiggs
-
Patent number: 8706489
Abstract: A system and method for selecting audio contents by using the speech recognition to obtain a textual phrase from a series of audio contents are provided. The system includes an output module outputting the audio contents, an input module receiving a speech input from a user, a buffer temporarily storing the audio contents within a desired period and the speech input, and a recognizing module performing a speech recognition between the audio contents within the desired period and the speech input to generate an audio phrase and the corresponding textual phrase matching with the speech input.
Type: Grant
Filed: August 8, 2006
Date of Patent: April 22, 2014
Assignee: Delta Electronics Inc.
Inventors: Jia-lin Shen, Chien-Chou Hung
-
Patent number: 8706503
Abstract: Methods, systems, and computer-readable storage media related to operating an intelligent digital assistant are disclosed. A text string is obtained from a speech input received from a user. Information is derived from a communication event that occurred at the electronic device prior to receipt of the speech input. The text string is interpreted to derive a plurality of candidate interpretations of user intent. One of the candidate user intents is selected based on the information relating to the communication event.
Type: Grant
Filed: December 21, 2012
Date of Patent: April 22, 2014
Assignee: Apple Inc.
Inventors: Adam John Cheyer, Didier Rene Guzzoni, Thomas Robert Gruber, Christopher Dean Brigham
-
Patent number: 8706491
Abstract: One feature of the present invention uses the parsing capabilities of a structured language model in the information extraction process. During training, the structured language model is first initialized with syntactically annotated training data. The model is then trained by generating parses on semantically annotated training data enforcing annotated constituent boundaries. The syntactic labels in the parse trees generated by the parser are then replaced with joint syntactic and semantic labels. The model is then trained by generating parses on the semantically annotated training data enforcing the semantic tags or labels found in the training data. The trained model can then be used to extract information from test data using the parses generated by the model.
Type: Grant
Filed: August 24, 2010
Date of Patent: April 22, 2014
Assignee: Microsoft Corporation
Inventors: Ciprian Chelba, Milind Mahajan
-
Patent number: 8700259
Abstract: A system for selecting music includes a mobile system for processing and transmitting through a wireless link a continuous voice stream spoken by a user of the mobile system, the continuous voice stream including a music request, and a data center for processing the continuous voice stream received through the wireless link into voice music information. The data center can perform automated voice recognition processing on the voice music information to recognize music components of the music request, confirm the recognized music components through interactive speech exchanges with the mobile system user through the wireless link and the mobile system, selectively allow human data center operator intervention to assist in identifying the selected recognized music components having a recognition confidence below a selected threshold value, and download music information pertaining to the music request for transmission to the mobile system derived from the confirmed recognized music components.
Type: Grant
Filed: March 15, 2013
Date of Patent: April 15, 2014
Assignee: Agero Connected Services, Inc.
Inventor: Thomas Barton Schalk
-
Patent number: 8700398
Abstract: An interactive user interface is described for setting confidence score thresholds in a language processing system. There is a display of a first system confidence score curve characterizing system recognition performance associated with a high confidence threshold, a first user control for adjusting the high confidence threshold and an associated visual display highlighting a point on the first system confidence score curve representing the selected high confidence threshold, a display of a second system confidence score curve characterizing system recognition performance associated with a low confidence threshold, and a second user control for adjusting the low confidence threshold and an associated visual display highlighting a point on the second system confidence score curve representing the selected low confidence threshold. The operation of the second user control is constrained to require that the low confidence threshold must be less than or equal to the high confidence threshold.
Type: Grant
Filed: November 29, 2011
Date of Patent: April 15, 2014
Assignee: Nuance Communications, Inc.
Inventors: Jeffrey N. Marcus, Amy E. Ulug, William Bridges Smith, Jr.
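The constraint that the low threshold never exceeds the high one can be modeled as below; the clamping behavior and the three-way accept/confirm/reject classification are assumptions for illustration, not taken from the patent:

```python
class ConfidenceThresholds:
    """Hold a high and a low confidence threshold, enforcing the
    constraint low <= high, as a hypothetical model of two coupled
    UI controls."""

    def __init__(self, low=0.3, high=0.7):
        if low > high:
            raise ValueError("low threshold must not exceed high threshold")
        self.low, self.high = low, high

    def set_low(self, value):
        # clamp instead of reject, mirroring a constrained slider
        self.low = min(value, self.high)

    def set_high(self, value):
        self.high = max(value, self.low)

    def classify(self, score):
        """Accept above high, reject below low, confirm in between."""
        if score >= self.high:
            return "accept"
        if score < self.low:
            return "reject"
        return "confirm"
```

Attempting to drag the low threshold past the high one simply pins it at the high value, which is one common way such a UI constraint is implemented.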
-
Patent number: 8694317
Abstract: Methods for processing audio data containing speech to produce a searchable index file and for subsequently searching such an index file are provided. The processing method uses a phonetic approach and models each frame of the audio data with a set of reference phones. A score for each of the reference phones, representing the difference of the audio from the phone model, is stored in the searchable data file for each of the phones in the reference set. A consequence of storing information regarding each of the reference phones is that the accuracy of searches carried out on the index file is not compromised by the rejection of information about particular phones. A subsequent search method is also provided which uses a simple and efficient dynamic programming search to locate instances of a search term in the audio. The methods of the present invention have particular application to the field of audio data mining.
Type: Grant
Filed: February 6, 2006
Date of Patent: April 8, 2014
Assignee: Aurix Limited
Inventors: Adrian I Skilling, Howard A K Wright
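A toy version of such a dynamic-programming search over per-frame phone scores; the score semantics (lower is a closer match) and the alignment rules (each phone spans at least one frame, matches may start at any frame) are assumptions, not the patented algorithm:

```python
def search_cost(index, term_phones):
    """Align a search term's phone sequence against a phonetic index
    and return the best total cost. `index` is a list of frames, each
    a dict mapping every reference phone to a score (lower = better)."""
    INF = float("inf")
    m = len(term_phones)
    # prev[j]: best cost of matching the first j phones, ending at the
    # previous frame; prev[0] = 0 lets a match start at any frame.
    prev = [0.0] + [INF] * m
    best = INF
    for frame in index:
        cur = [0.0] + [INF] * m
        for j in range(1, m + 1):
            step = frame[term_phones[j - 1]]
            # either extend the same phone or advance to the next one
            cur[j] = step + min(prev[j], prev[j - 1])
        best = min(best, cur[m])
        prev = cur
    return best
```

Because the index keeps a score for every reference phone at every frame, the search can consider any phone at any point, which is the property the abstract highlights.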
-
Patent number: 8688451
Abstract: A speech recognition method includes receiving input speech from a user, processing the input speech using a first grammar to obtain parameter values of a first N-best list of vocabulary, comparing a parameter value of a top result of the first N-best list to a threshold value, and if the compared parameter value is below the threshold value, then additionally processing the input speech using a second grammar to obtain parameter values of a second N-best list of vocabulary. Other preferred steps include: determining the input speech to be in-vocabulary if any of the results of the first N-best list is also present within the second N-best list, but out-of-vocabulary if none of the results of the first N-best list is within the second N-best list; and providing audible feedback to the user if the input speech is determined to be out-of-vocabulary.
Type: Grant
Filed: May 11, 2006
Date of Patent: April 1, 2014
Assignee: General Motors LLC
Inventors: Timothy J. Grost, Rathinavelu Chengalvarayan
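The two-pass N-best comparison can be sketched as follows, with hypothetical (word, confidence) inputs and threshold; the second-pass results are assumed to be precomputed for simplicity:

```python
def classify_vocabulary(nbest_first, nbest_second, threshold=0.5):
    """Sketch of the two-grammar check: if the top result of the first
    grammar's N-best list scores below a threshold, fall back to the
    second grammar's N-best list and call the input in-vocabulary only
    if the two lists overlap. `nbest_first` holds (word, confidence)
    pairs; `nbest_second` is a word list."""
    ranked = sorted(nbest_first, key=lambda pair: pair[1], reverse=True)
    _, top_score = ranked[0]
    if top_score >= threshold:
        return "in-vocabulary"
    # second pass: overlap between the two N-best lists
    if any(word in nbest_second for word, _ in ranked):
        return "in-vocabulary"
    return "out-of-vocabulary"  # would trigger audible feedback to the user
```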
-
Patent number: 8688453
Abstract: According to example configurations, a speech processing system can include a syntactic parser, a word extractor, word extraction rules, and an analyzer. The syntactic parser of the speech processing system parses the utterance to identify syntactic relationships amongst words in the utterance. The word extractor utilizes word extraction rules to identify groupings of related words in the utterance that most likely represent an intended meaning of the utterance. The analyzer in the speech processing system maps each set of the sets of words produced by the word extractor to a respective candidate intent value to produce a list of candidate intent values for the utterance. The analyzer is configured to select, from the list of candidate intent values (i.e., possible intended meanings) of the utterance, a particular candidate intent value as being representative of the intent (i.e., intended meaning) of the utterance.
Type: Grant
Filed: February 28, 2011
Date of Patent: April 1, 2014
Assignee: Nuance Communications, Inc.
Inventors: Sachindra Joshi, Shantanu Godbole
-
Patent number: 8682669
Abstract: A system and a method to generate statistical utterance classifiers optimized for the individual states of a spoken dialog system is disclosed. The system and method make use of large databases of transcribed and annotated utterances from calls collected in a dialog system in production and log data reporting the association between the state of the system at the moment when the utterances were recorded and the utterance. From the system state, being a vector of multiple system variables, subsets of these variables, certain variable ranges, quantized variable values, etc. can be extracted to produce a multitude of distinct utterance subsets matching every possible system state. For each of these subset and variable combinations, statistical classifiers can be trained, tuned, and tested, and the classifiers can be stored together with the performance results and the state subset and variable combination.
Type: Grant
Filed: August 21, 2009
Date of Patent: March 25, 2014
Assignee: Synchronoss Technologies, Inc.
Inventors: David Suendermann, Jackson Liscombe, Krishna Dayanidhi, Roberto Pieraccini
-
Patent number: 8676580
Abstract: A method, an apparatus and an article of manufacture for automatic speech recognition. The method includes obtaining at least one language model word and at least one rule-based grammar word, determining an acoustic similarity of at least one pair of language model word and rule-based grammar word, and increasing a transition cost to the at least one language model word based on the acoustic similarity of the at least one language model word with the at least one rule-based grammar word to generate a modified language model for automatic speech recognition.
Type: Grant
Filed: August 16, 2011
Date of Patent: March 18, 2014
Assignee: International Business Machines Corporation
Inventors: Om D. Deshmukh, Etienne Marcheret, Shajith I. Mohamed, Ashish Verma, Karthik Visweswariah
-
Patent number: 8676579
Abstract: A method of authenticating a user of a mobile device having a first microphone and a second microphone, the method comprising receiving voice input from the user at the first and second microphones, determining a position of the user relative to the mobile device based on the voice input received by the first and second microphones, and authenticating the user based on the position of the user.
Type: Grant
Filed: April 30, 2012
Date of Patent: March 18, 2014
Assignee: BlackBerry Limited
Inventor: James Allen Hymel
-
Patent number: 8676577
Abstract: A method of utilizing metadata stored in a computer-readable medium to assist in the conversion of an audio stream to a text stream. The method compares personally identifiable data, such as a user's electronic address book and/or Caller/Recipient ID information (in the case of processing voice mail to text), to the n-best results generated by a speech recognition engine for each word that is output by the engine. A goal of this comparison is to replace a possible misrecognition of a spoken proper noun, such as a name or company, with its proper textual form, or a spoken phone number with a correctly formatted phone number in Arabic numerals, to improve the overall accuracy of the output of the voice recognition system.
Type: Grant
Filed: March 31, 2009
Date of Patent: March 18, 2014
Assignee: Canyon IP Holdings, LLC
Inventors: Igor Roditis Jablokov, Clifford J. Strohofer, III, Marc White, Victor Roditis Jablokov
-
Patent number: 8666743
Abstract: The invention provides a speech recognition method for selecting a combination of list elements via a speech input, wherein a first list element of the combination is part of a first set of list elements and a second list element of the combination is part of a second set of list elements, the method comprising the steps of receiving the speech input, comparing each list element of the first set with the speech input to obtain a first candidate list of best matching list elements, processing the second set using the first candidate list to obtain a subset of the second set, comparing each list element of the subset of the second set with the speech input to obtain a second candidate list of best matching list elements, and selecting a combination of list elements using the first and the second candidate list.
Type: Grant
Filed: June 2, 2010
Date of Patent: March 4, 2014
Assignee: Nuance Communications, Inc.
Inventors: Markus Schwarz, Matthias Schulz, Marc Biedert, Christian Hillebrecht, Franz Gerl, Udo Haiber
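A sketch of this two-stage candidate-list selection, assuming a generic scoring function `match` and a mapping from first-list elements to their dependent second-list elements (e.g. city to streets); both are illustrative stand-ins for the patent's comparison step:

```python
def select_combination(first_set, second_sets, speech, match):
    """Score the first set against the speech input, keep the best
    candidates, restrict the second set to entries belonging to those
    candidates, then score that subset and pick the best combination.
    `match(speech, element)` returns a similarity score; `second_sets`
    maps each first-list element to its dependent elements."""
    first_candidates = sorted(first_set, key=lambda e: match(speech, e),
                              reverse=True)[:3]
    subset = [(f, s) for f in first_candidates for s in second_sets.get(f, [])]
    if not subset:
        return None
    return max(subset, key=lambda pair: match(speech, pair[0]) + match(speech, pair[1]))
```

The payoff of the two stages is that the expensive second comparison runs only over the subset belonging to the best first-stage candidates, not over every street of every city.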
-
Patent number: 8666744
Abstract: A method and apparatus are provided for automatically acquiring grammar fragments for recognizing and understanding fluently spoken language. Grammar fragments representing a set of syntactically and semantically similar phrases may be generated using three probability distributions: of succeeding words, of preceding words, and of associated call-types. The similarity between phrases may be measured by applying Kullback-Leibler distance to these three probability distributions. Phrases being close in all three distances may be clustered into a grammar fragment.
Type: Grant
Filed: September 21, 2000
Date of Patent: March 4, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Kazuhiro Arai, Allen L. Gorin, Giuseppe Riccardi, Jeremy H. Wright
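The Kullback-Leibler comparison over the three distributions can be sketched like this; the symmetric summing and the epsilon smoothing for unseen events are assumptions, not details from the patent:

```python
import math

def kl(p, q, eps=1e-9):
    """Kullback-Leibler distance between two discrete distributions
    given as dicts mapping events to probabilities; eps smooths events
    missing from one side."""
    events = set(p) | set(q)
    return sum(p.get(e, eps) * math.log(p.get(e, eps) / q.get(e, eps))
               for e in events)

def phrase_distance(a, b):
    """Phrases are close when all three distributions (succeeding
    words, preceding words, call types) are close; here a symmetrized
    sum of the three KL distances."""
    return sum(kl(a[name], b[name]) + kl(b[name], a[name])
               for name in ("succ", "prec", "calltype"))
```

Phrases with identical context and call-type distributions get distance zero and would land in the same grammar fragment; phrases differing in any of the three distributions move apart.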
-
Patent number: 8666729
Abstract: Creating and processing a natural language grammar set of data based on an input text string are disclosed. The method may include tagging the input text string, and examining, via a processor, the input text string for at least one first set of substitutions based on content of the input text string. The method may also include determining whether the input text string is a substring of a previously tagged input text string by comparing the input text string to a previously tagged input text string, such that the substring determination operation determines whether the input text string is wholly included in the previously tagged input text string.
Type: Grant
Filed: February 10, 2010
Date of Patent: March 4, 2014
Assignee: West Corporation
Inventor: Steven John Schanbacher
-
Patent number: 8660845
Abstract: Systems and methods for audio editing are provided. In one implementation, a computer-implemented method is provided. The method includes receiving digital audio data including a plurality of distinct vocal components. Each distinct vocal component is automatically identified using one or more attributes that uniquely identify each distinct vocal component. The audio data is separated into two or more individual tracks where each individual track comprises audio data corresponding to one distinct vocal component. The separated individual tracks are then made available for further processing.
Type: Grant
Filed: October 16, 2007
Date of Patent: February 25, 2014
Assignee: Adobe Systems Incorporated
Inventors: Nariman Sodeifi, David E. Johnston
-
Patent number: 8639509
Abstract: In a confidence computing method and system, a processor may interpret speech signals as a text string or directly receive a text string as input, generate a syntactical parse tree representing the interpreted string and including a plurality of sub-trees which each represents a corresponding section of the interpreted text string, determine for each sub-tree whether the sub-tree is accurate, obtain replacement speech signals for each sub-tree determined to be inaccurate, and provide output based on corresponding text string sections of at least one sub-tree determined to be accurate.
Type: Grant
Filed: July 27, 2007
Date of Patent: January 28, 2014
Assignee: Robert Bosch GmbH
Inventors: Fuliang Weng, Feng Lin, Zhe Feng
-
Patent number: 8639507
Abstract: The present invention enables high-speed recognition even when the grammar includes a large amount of garbage. The first voice recognition processing unit executes a voice recognition process on a voice feature amount of the input voice based on a first grammar, and generates a recognition hypothesis graph that indicates the structure of hypotheses derived according to the first grammar, together with a score associated with each connection of a recognition unit. The second voice recognition processing unit executes a voice recognition process according to a second grammar, which is specified to accept sections of the input voice other than keywords as garbage sections, acquires the structure and scores of the garbage sections from the recognition hypothesis graph, and outputs the recognition result from the total score of a hypothesis derived according to the second grammar.
Type: Grant
Filed: December 22, 2008
Date of Patent: January 28, 2014
Assignee: NEC Corporation
Inventors: Fumihiro Adachi, Ryosuke Isotani, Ken Hanazawa
-
Patent number: 8630860
Abstract: Techniques disclosed herein include systems and methods for open-domain voice-enabled searching that is speaker sensitive. Techniques include using speech information, speaker information, and information associated with a spoken query to enhance open voice search results. This includes integrating a textual index with a voice index to support the entire search cycle. Given a voice query, the system can execute two matching processes simultaneously. This can include a text matching process based on the output of speech recognition, as well as a voice matching process based on characteristics of a caller or user voicing a query. Characteristics of the caller can include output of voice feature extraction and metadata about the call. The system clusters callers according to these characteristics. The system can use specific voice and text clusters to modify speech recognition results, as well as modifying search results.
Type: Grant
Filed: March 3, 2011
Date of Patent: January 14, 2014
Assignee: Nuance Communications, Inc.
Inventors: Shilei Zhang, Shenghua Bao, Wen Liu, Yong Qin, Zhiwei Shuang, Jian Chen, Zhong Su, Qin Shi, William F. Ganong, III
-
Patent number: 8620654
Abstract: A system in one embodiment includes a server associated with a unified messaging system (UMS). The server records speech of a user as an audio data file, translates the audio data file into a text data file, and maps each word within the text data file to a corresponding segment of audio data in the audio data file. A graphical user interface (GUI) of a message editor running on an endpoint associated with the user displays the text data file on the endpoint and allows the user to identify a portion of the text data file for replacement. The server is further operable to record new speech of the user as new audio data and to replace one or more segments of the audio data file corresponding to the portion of the text with the new audio data.
Type: Grant
Filed: July 20, 2007
Date of Patent: December 31, 2013
Assignee: Cisco Technology, Inc.
Inventors: Joseph F. Khouri, Laurent Philonenko, Mukul Jain, Shmuel Shaffer
-
Patent number: 8612221
Abstract: A portable terminal having an audio pickup means that acquires sound, an absolute position detection unit that detects the absolute position of the portable terminal, a relative position detection unit that detects the relative position of the portable terminal, and a speech recognition and synthesis unit that recognizes the audio acquired by the audio pickup means as speech, is achieved with a simple configuration. A portable terminal (1) that exchanges data with a server (2) has disposed to the portable terminal an audio pickup means that acquires sound, an absolute position detection unit (1-1) that detects the absolute position of the portable terminal, a relative position detection unit (1-2) that detects the relative position of the portable terminal, and a speech recognition and synthesis unit (1-3) that recognizes the audio acquired by the audio pickup means as speech.
Type: Grant
Filed: February 2, 2010
Date of Patent: December 17, 2013
Assignee: Seiko Epson Corporation
Inventors: Junichi Yoshizawa, Tetsuo Ozawa, Koji Koseki
-
Patent number: 8606560
Abstract: An interpretation system includes an optical or audio acquisition device for acquiring a sentence written or spoken in a source language, and an audio restoration device for generating, from an input signal acquired by the acquisition device, a source sentence that is a transcription of the sentence in the source language. The interpretation system further includes a translation device for generating, from the source sentence, a target sentence that is a translation of the source sentence in a target language, and a speech synthesis device for generating, from the target sentence, an output audio signal reproduced by the audio restoration device. The interpretation system includes a smoothing device for calling the recognition, translation and speech synthesis devices in order to produce in real time an interpretation in the target language of the sentence in the source language.
Type: Grant
Filed: November 18, 2008
Date of Patent: December 10, 2013
Inventor: Jean Grenier
-
Patent number: 8606568
Abstract: Methods, computer program products, and systems are described for receiving, by a speech recognition engine, audio data that encodes an utterance and determining, by the speech recognition engine, that a transcription of the utterance includes one or more keywords associated with a command, and a pronoun. In addition, the methods, computer program products, and systems described herein pertain to transmitting a disambiguation request to an application, wherein the disambiguation request identifies the pronoun, receiving, by the speech recognition engine, a response to the disambiguation request, wherein the response references an item of content identified by the application, and generating, by the speech recognition engine, the command using the keywords and the response.
Type: Grant
Filed: October 23, 2012
Date of Patent: December 10, 2013
Assignee: Google Inc.
Inventors: Simon Tickner, Richard Z. Cohen
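As a rough illustration of the keyword-plus-pronoun flow this abstract describes, the sketch below rewrites a transcribed command by asking the application, via an assumed `resolve_pronoun` callback, what the pronoun refers to. The pronoun set and the callback interface are illustrative, not taken from the patent.

```python
def build_command(transcription: str, resolve_pronoun) -> str:
    """Replace a pronoun in a transcribed command with the content item
    the foreground application reports (resolve_pronoun is an assumed
    callback standing in for the patent's disambiguation request)."""
    pronouns = {"it", "this", "that"}
    words = transcription.lower().split()
    return " ".join(resolve_pronoun() if w in pronouns else w for w in words)
```

If no pronoun is present, the command passes through unchanged, which matches the abstract's framing of disambiguation as an extra round-trip only when needed.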
-
Patent number: 8589161
Abstract: A system and method for an integrated, multi-modal, multi-device natural language voice services environment may be provided. In particular, the environment may include a plurality of voice-enabled devices each having intent determination capabilities for processing multi-modal natural language inputs in addition to knowledge of the intent determination capabilities of other devices in the environment. Further, the environment may be arranged in a centralized manner, a distributed peer-to-peer manner, or various combinations thereof. As such, the various devices may cooperate to determine intent of multi-modal natural language inputs, and commands, queries, or other requests may be routed to one or more of the devices best suited to take action in response thereto.
Type: Grant
Filed: May 27, 2008
Date of Patent: November 19, 2013
Assignee: VoiceBox Technologies, Inc.
Inventors: Robert A. Kennewick, Chris Weider
-
Patent number: 8583436
Abstract: A word category estimation apparatus (100) includes a word category model (5) which is formed from a probability model having a plurality of kinds of information about a word category as features, and includes information about an entire word category graph as at least one of the features. A word category estimation unit (4) receives the word category graph of a speech recognition hypothesis to be processed, computes scores by referring to the word category model for respective arcs that form the word category graph, and outputs a word category sequence candidate based on the scores.
Type: Grant
Filed: December 19, 2008
Date of Patent: November 12, 2013
Assignee: NEC Corporation
Inventors: Hitoshi Yamamoto, Kiyokazu Miki
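The arc-scoring step can be pictured as a best-path search over a small category graph. The sketch below is a generic dynamic program over a DAG whose arcs already carry scores; the word category model that actually produces those scores is out of scope, and all names and the graph encoding are assumptions.

```python
from functools import lru_cache

def best_category_sequence(arcs, start, end):
    """arcs: dict mapping node -> list of (category, next_node, score).
    Returns (total_score, category_tuple) for the highest-scoring path
    from start to end. A stand-in for picking a word category sequence
    candidate from per-arc model scores."""
    @lru_cache(maxsize=None)
    def best(node):
        if node == end:
            return (0.0, ())
        options = [(score + best(nxt)[0], (cat,) + best(nxt)[1])
                   for cat, nxt, score in arcs.get(node, [])]
        # dead ends score -inf so they are never chosen
        return max(options) if options else (float("-inf"), ())
    return best(start)
```

Memoization keeps the search linear in the number of arcs, mirroring how lattice rescoring is normally done over recognition hypotheses.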
-
Patent number: 8571869
Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
Type: Grant
Filed: May 15, 2008
Date of Patent: October 29, 2013
Assignee: Nuance Communications, Inc.
Inventors: Sabine Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
-
Patent number: 8571849
Abstract: Disclosed herein are systems, methods, and computer-readable media for enriching spoken language translation with prosodic information in a statistical speech translation framework. The method includes receiving speech for translation to a target language, generating pitch accent labels representing segments of the received speech which are prosodically prominent, and injecting pitch accent labels with word tokens within the translation engine to create enriched target language output text. A further step may be added of synthesizing speech in the target language based on the prosody enriched target language output text. An automatic prosody labeler can generate pitch accent labels. An automatic prosody labeler can exploit lexical, syntactic, and prosodic information of the speech. A maximum entropy model may be used to determine which segments of the speech are prosodically prominent.
Type: Grant
Filed: September 30, 2008
Date of Patent: October 29, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Srinivas Bangalore, Vivek Kumar Rangarajan Sridhar
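The label-injection step can be illustrated by tagging prosodically prominent word tokens before they enter the translation engine. The `_PA` suffix convention below is an assumed placeholder for a pitch accent label, not the patent's actual token format.

```python
def enrich_tokens(words, accents):
    """Attach a pitch-accent marker to each prominent word token.
    words: list of word strings; accents: parallel list of booleans
    (True where an automatic prosody labeler marked prominence)."""
    return [w + "_PA" if accented else w
            for w, accented in zip(words, accents)]
```

Downstream, the translation engine would treat `market` and `market_PA` as distinct tokens, which is one simple way enriched input can influence target-side output.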
-
Patent number: 8571858
Abstract: For classifying different segments of a signal which has segments of at least a first type and second type, e.g. audio and speech segments, the signal is short-term classified on the basis of the at least one short-term feature extracted from the signal and a short-term classification result is delivered. The signal is also long-term classified on the basis of the at least one short-term feature and at least one long-term feature extracted from the signal and a long-term classification result is delivered. The short-term classification result and the long-term classification result are combined to provide an output signal indicating whether a segment of the signal is of the first type or of the second type.
Type: Grant
Filed: January 11, 2011
Date of Patent: October 29, 2013
Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
Inventors: Guillaume Fuchs, Stefan Bayer, Jens Hirschfeld, Juergen Herre, Jeremie Lecomte, Frederik Nagel, Nikolaus Rettelbach, Stefan Wabnik, Yoshikazu Yokotani
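One way to picture the combination step is a weighted merge of the two classifiers' scores. The linear combination rule, the weight, and the threshold below are illustrative assumptions; the abstract only states that the two results are combined.

```python
def classify_segment(short_term_score: float, long_term_score: float,
                     weight: float = 0.5, threshold: float = 0.0) -> str:
    """Merge a short-term and a long-term classification score into one
    segment-type decision. Positive scores vote for the first type
    ('speech'), negative for the second ('music'); the weighted-sum
    rule is an assumed stand-in for the patent's combiner."""
    combined = weight * short_term_score + (1.0 - weight) * long_term_score
    return "speech" if combined > threshold else "music"
```

The point of the two-stage scheme is that the long-term classifier can veto short-term misfires, which a weighted sum captures in the simplest possible form.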
-
Patent number: 8566097
Abstract: A lexical acquisition apparatus includes: a phoneme recognition section 2 for preparing a phoneme sequence candidate from an inputted speech; a word matching section 3 for preparing a plurality of word sequences based on the phoneme sequence candidate; a discrimination section 4 for selecting, from among a plurality of word sequences, a word sequence having a high likelihood in a recognition result; an acquisition section 5 for acquiring a new word based on the word sequence selected by the discrimination section 4; a teaching word list 4A used to teach a name; and a probability model 4B of the teaching word and an unknown word, wherein the discrimination section 4 calculates, for each word sequence, a first evaluation value showing how much words in the word sequence correspond to teaching words in the list 4A and a second evaluation value showing a probability at which the words in the word sequence are adjacent to one another and selects a word sequence for which a sum of the first evaluation value and the second evaluation value is maximized.
Type: Grant
Filed: June 1, 2010
Date of Patent: October 22, 2013
Assignees: Honda Motor Co., Ltd., Advanced Telecommunications Research Institute International
Inventors: Mikio Nakano, Takashi Nose, Ryo Taguchi, Kotaro Funakoshi, Naoto Iwahashi
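The selection criterion, a sum of the two evaluation values, can be sketched as follows. The `known_fraction` scoring and the stubbed adjacency score are simplified stand-ins for the patent's teaching word list (4A) and adjacency probability model (4B).

```python
def select_word_sequence(candidates, teaching_words,
                         adjacency_score=lambda seq: 0.0):
    """Pick the candidate word sequence maximizing the sum of:
    - a first value: fraction of words that are known teaching words
      (a crude stand-in for the patent's first evaluation value), and
    - a second value: an adjacency score (stubbed to 0.0 here; the
      patent uses a probability model over adjacent words)."""
    def known_fraction(seq):
        return sum(w in teaching_words for w in seq) / len(seq)
    return max(candidates, key=lambda s: known_fraction(s) + adjacency_score(s))
```

Passing a real bigram model as `adjacency_score` would recover the full two-term criterion the abstract describes.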
-
Patent number: 8564721
Abstract: The addition of temporal positions to an inverted index allows for temporal queries in addition to phrase queries. Storing additional binary data for each term instance in the word-level index, to prepare for searching in response to time-based queries from a user, is accomplished through the use of Lucene's binary payload feature, where the payload structure is defined for use in such searches. The pre-defined payload fields consist of three integers, which account for 12 extra bytes that must be stored for each term instance. A content database on the Master/Administrator server node provides the indexes for search into content in response to user events, returning results in JSON format. The search results may then be used to locate and present content segments to a user containing both the requested search term results and the time location and duration within a content asset where the search term(s) is found.
Type: Grant
Filed: August 28, 2012
Date of Patent: October 22, 2013
Inventors: Matthew Berry, Changwen Yang
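The three-integer, 12-byte payload is easy to reproduce with fixed-width packing. The field meanings below (start time, duration, segment id) are assumptions for illustration; the abstract only fixes the count and size, and the real system stores these via Lucene's payload API rather than raw bytes.

```python
import struct

# Three big-endian 32-bit signed ints = exactly 12 bytes per term
# instance, matching the payload size the abstract describes.
def pack_payload(start_ms: int, duration_ms: int, segment_id: int) -> bytes:
    return struct.pack(">iii", start_ms, duration_ms, segment_id)

def unpack_payload(payload: bytes):
    return struct.unpack(">iii", payload)
```

At query time, unpacking the payload of each matching term instance yields the time location and duration needed to jump to the right content segment.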
-
Patent number: 8560373
Abstract: A method for direct marketing comprising establishing a first communications link between a prospective customer using a device having a unique identification number and a communications device, automatically transmitting the unique identification number associated with the prospective customer's device to the communications device, establishing a second communications link between the communications device and a computer operably connected to a memory apparatus having a prospective customer database comprising prospective customer information associated with the unique identification number of the prospective customer's device, in which the information in the database determines prospective customer value which can be used to determine subsequent operations and marketing actions with the prospective customer.
Type: Grant
Filed: September 25, 2008
Date of Patent: October 15, 2013
Inventor: Eileen A. Fraser
-
Patent number: 8560324
Abstract: A mobile terminal including an input unit configured to receive an input to activate a voice recognition function on the mobile terminal, a memory configured to store information related to operations performed on the mobile terminal, and a controller configured to activate the voice recognition function upon receiving the input to activate the voice recognition function, to determine a meaning of an input voice instruction based on at least one prior operation performed on the mobile terminal and a language included in the voice instruction, and to provide operations related to the determined meaning of the input voice instruction based on the at least one prior operation performed on the mobile terminal and the language included in the voice instruction and based on a probability that the determined meaning of the input voice instruction matches the information related to the operations of the mobile terminal.
Type: Grant
Filed: January 31, 2012
Date of Patent: October 15, 2013
Assignee: LG Electronics Inc.
Inventors: Jong-Ho Shin, Jae-Do Kwak, Jong-Keun Youn
-
Patent number: 8548806
Abstract: A voice recognition device, a voice recognition method and a voice recognition program capable of appropriately restricting recognition objects based on voice input from a user to recognize the input voice with accuracy are provided.
Type: Grant
Filed: September 11, 2007
Date of Patent: October 1, 2013
Assignee: Honda Motor Co. Ltd.
Inventor: Hisayuki Nagashima
-
Patent number: 8527271
Abstract: A method for the voice recognition of a spoken expression that comprises a plurality of expression parts to be recognized. Partial voice recognition takes place on a first selected expression part, and depending on a selection of hits for the first expression part detected by the partial voice recognition, voice recognition on the first and further expression parts is executed.
Type: Grant
Filed: June 18, 2008
Date of Patent: September 3, 2013
Assignee: Nuance Communications, Inc.
Inventors: Michael Wandinger, Jesus Fernando Guitarte Perez, Bernhard Littel
-
Patent number: 8521526
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing spoken query terms. In one aspect, a method includes performing speech recognition on an audio signal to select two or more textual, candidate transcriptions that match a spoken query term, and to establish a speech recognition confidence value for each candidate transcription, obtaining a search history for a user who spoke the spoken query term, where the search history references one or more past search queries that have been submitted by the user, generating one or more n-grams from each candidate transcription, where each n-gram is a subsequence of n phonemes, syllables, letters, characters, words or terms from a respective candidate transcription, and determining, for each n-gram, a frequency with which the n-gram occurs in the past search queries, and a weighting value that is based on the respective frequency.
Type: Grant
Filed: July 28, 2010
Date of Patent: August 27, 2013
Assignee: Google Inc.
Inventors: Matthew I. Lloyd, Johan Schalkwyk, Pankaj Risbood
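The n-gram weighting idea can be sketched by counting character n-grams from the user's past queries and boosting candidate transcriptions whose n-grams recur in that history. The additive combination and the 0.1 weight are illustrative assumptions; the patent leaves the weighting function abstract and also allows phoneme, syllable, or word n-grams.

```python
from collections import Counter

def char_ngrams(text, n=3):
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def rescore(candidates, history, n=3):
    """candidates: {transcription: asr_confidence}; history: list of
    past search query strings. Each candidate's confidence is boosted
    by the average frequency of its n-grams in the history, then the
    best candidate is returned."""
    freq = Counter(g for q in history for g in char_ngrams(q, n))
    scored = {}
    for cand, conf in candidates.items():
        grams = char_ngrams(cand, n)
        overlap = sum(freq[g] for g in grams) / max(len(grams), 1)
        scored[cand] = conf + 0.1 * overlap
    return max(scored, key=scored.get)
```

The effect is that a slightly lower-confidence transcription the user has searched for before can outrank an acoustically stronger but historically unseen alternative.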
-
Patent number: 8521537
Abstract: A computer-driven device assists a user in self-regulating speech control of the device. The device processes an input signal representing human speech to compute acoustic signal quality indicators indicating conditions likely to be problematic to speech recognition, and advises the user of those conditions.
Type: Grant
Filed: April 3, 2007
Date of Patent: August 27, 2013
Assignee: Promptu Systems Corporation
Inventors: Naren Chittar, Vikas Gulati, Matthew Pratt, Harry Printz
-
Patent number: 8521531
Abstract: A speech search method for a display device is discussed. The method includes the steps of outputting media data, receiving a speech search command from a user, and determining whether the speech search command includes a query term. If the speech search command does not include a query term, the method further comprises the step of extracting a full, searchable query term from the audio data of the media data output immediately prior to the speech search command. Finally, the method includes the step of performing a speech search using the extracted query term.
Type: Grant
Filed: February 6, 2013
Date of Patent: August 27, 2013
Assignee: LG Electronics Inc.
Inventor: Yongsin Kim
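The fallback behavior (no query term spoken, so one is pulled from the just-played audio) might be sketched like this. The trigger phrases and the last-word heuristic are illustrative assumptions, not the patent's extraction method.

```python
def resolve_query(command: str, prior_transcript: str) -> str:
    """Return the query term for a speech search. If the command
    carries an explicit query term after a trigger phrase, use it;
    otherwise fall back to the transcript of the media audio output
    just before the command (last word, as a toy heuristic)."""
    triggers = ("search for ", "look up ")
    for t in triggers:
        if command.lower().startswith(t):
            rest = command[len(t):].strip()
            if rest:
                return rest  # explicit query term present
    # no query term spoken: extract one from the preceding audio
    words = prior_transcript.split()
    return words[-1] if words else ""
```

A real system would need the ASR transcript of the media audio buffered continuously so the extraction step has something to draw on when the command arrives.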