Natural Language Patents (Class 704/257)
  • Patent number: 8762469
    Abstract: An electronic device may capture a voice command from a user. The electronic device may store contextual information about the state of the electronic device when the voice command is received. The electronic device may transmit the voice command and the contextual information to computing equipment such as a desktop computer or a remote server. The computing equipment may perform a speech recognition operation on the voice command and may process the contextual information. The computing equipment may respond to the voice command. The computing equipment may also transmit information to the electronic device that allows the electronic device to respond to the voice command.
    Type: Grant
    Filed: September 5, 2012
    Date of Patent: June 24, 2014
    Assignee: Apple Inc.
    Inventor: Aram M. Lindahl
  • Patent number: 8762153
    Abstract: Disclosed herein are systems, methods, and computer-readable media for improving name dialer performance. The method includes receiving a speech query for a name in a directory of names, retrieving matches to the query, if the matches are uniquely spelled homophones or near-homophones, identifying information that is unique to all retrieved matches, and presenting a spoken disambiguation statement to a user that incorporates the identified unique information. Identifying information can include multiple pieces of unique information if necessary to completely disambiguate the matches. A hierarchy can establish priority of multiple pieces of unique information for use in the spoken disambiguation statement.
    Type: Grant
    Filed: September 12, 2008
    Date of Patent: June 24, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Steven Hart Lewis, Michael T. Czahor, III, Ramkishore Dudi, Susan Helen Pearsall
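Conceptually, the disambiguation step above might look like the following sketch. The record fields, names, and the priority hierarchy are illustrative assumptions, not taken from the patent: among homophone matches, pick the first attribute (per the hierarchy) whose values differ for every match, and build the spoken statement from it.

```python
# Illustrative disambiguation sketch: fields and hierarchy are made up.
HIERARCHY = ["department", "location", "title"]  # assumed priority order

def disambiguate(matches):
    """Return (attribute, prompt lines) that uniquely tell matches apart."""
    for attr in HIERARCHY:
        values = [m[attr] for m in matches]
        if len(set(values)) == len(values):  # unique value for every match
            return attr, [f"{m['name']} in {m[attr]}" for m in matches]
    return None, []  # no single attribute disambiguates

matches = [
    {"name": "John Smith", "department": "Sales", "location": "Austin"},
    {"name": "Jon Smyth",  "department": "Legal", "location": "Austin"},
]
attr, prompt = disambiguate(matches)
```

A fuller version would fall back to combining attributes when no single one disambiguates, matching the patent's "multiple pieces of unique information".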
  • Patent number: 8762152
    Abstract: Methods and systems for performing speech recognition using an electronic interactive agent are disclosed. In embodiments of the invention, an electronic agent is presented in a form perceptible to a user. The electronic agent is used to solicit speech input from a user and to respond to the user's recognized speech, and mimics the behavior of a human agent in a natural language query session with the user. The electronic agent may be implemented in a distributed speech recognition system in which speech recognition tasks are divided between client and server.
    Type: Grant
    Filed: October 1, 2007
    Date of Patent: June 24, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Ian Bennett, Bandi Ramesh Babu, Kishor Morkhandikar, Pallaki Gururaj
  • Patent number: 8756063
    Abstract: A handheld voice activated spelling device includes a housing and a cover secured to the housing. A power source and microphone are mounted within the housing and control switches are operable from the front surface of the housing. A first memory has a plurality of words stored therein. A second memory has a plurality of word definitions stored therein, each associated with a respective word. A speech recognition apparatus is coupled to the microphone and to the first memory and responsive to the electronic signals generated by the microphone for selecting at least one word from the first memory representative of the specific word spoken by the user. A display is provided for displaying the plurality of words and the plurality of word definitions when the user operates a selected control switch. A related word classification can also be selected from a third memory and displayed on the display.
    Type: Grant
    Filed: November 20, 2007
    Date of Patent: June 17, 2014
    Inventors: Samuel A. McDonald, William H. McDonald, Regina McDonald
  • Publication number: 20140163989
    Abstract: An integrated language model includes an upper-level language model component and a lower-level language model component, with the upper-level language model component including a non-terminal and the lower-level language model component being applied to the non-terminal. The upper-level and lower-level language model components can be of the same or different language model formats, including finite state grammar (FSG) and statistical language model (SLM) formats. Systems and methods for making integrated language models allow designation of language model formats for the upper-level and lower-level components and identification of non-terminals. Automatic non-terminal replacement and retention criteria can be used to facilitate the generation of one or both language model components, which can include the modification of existing language models.
    Type: Application
    Filed: July 30, 2013
    Publication date: June 12, 2014
    Applicant: ADACEL SYSTEMS, INC.
    Inventors: Chang Qing Shu, Han Shu, John M. Merwin
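The non-terminal mechanism described above can be illustrated with a toy expansion: an upper-level FSG-style template contains a non-terminal symbol, and the lower-level component supplies its expansions. The `$CITY` symbol, templates, and word list below are illustrative assumptions, not from the application.

```python
# Minimal sketch of applying a lower-level model to a non-terminal in an
# upper-level grammar (symbols and formats are illustrative).
upper = ["fly to $CITY", "weather in $CITY"]   # upper-level FSG templates
lower = {"$CITY": ["boston", "denver"]}        # lower-level component

def expand(templates, components):
    """Expand each non-terminal with every entry of its component."""
    out = []
    for t in templates:
        expansions = [t]
        for nt, words in components.items():
            expansions = [e.replace(nt, w) for e in expansions for w in words]
        out.extend(expansions)
    return out

sentences = expand(upper, lower)
```

In the application's terms, the lower-level component could equally be an SLM rather than a word list; the sketch only shows the substitution step.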
  • Publication number: 20140163975
    Abstract: Disclosed are a speech recognition error correction method and an apparatus thereof. The speech recognition error correction method includes determining a likelihood that a speech recognition result is erroneous, and if the likelihood that the speech recognition result is erroneous is higher than a predetermined standard, generating a parallel corpus according to whether the speech recognition result matches the correct answer corpus, generating a speech recognition model based on the parallel corpus, and correcting an erroneous speech recognition result based on the speech recognition model and the language model. Accordingly, speech recognition errors are corrected.
    Type: Application
    Filed: November 22, 2013
    Publication date: June 12, 2014
    Applicant: POSTECH ACADEMY - INDUSTRY FOUNDATION
    Inventors: Geun Bae Lee, Jun Hwi Choi, In Jae Lee, Dong Hyeon Lee, Hong Suck Seo, Yong Hee Kim, Seong Han Ryu, Sang Jun Koo
  • Patent number: 8751217
    Abstract: A computer-implemented input-method editor process includes receiving a request from a user for an application-independent input method editor having written and spoken input capabilities, identifying that the user is about to provide spoken input to the application-independent input method editor, and receiving a spoken input from the user. The spoken input corresponds to input to an application and is converted to text that represents the spoken input. The text is provided as input to the application.
    Type: Grant
    Filed: September 29, 2011
    Date of Patent: June 10, 2014
    Assignee: Google Inc.
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, William J. Byrne, Gudmundur Hafsteinsson, Michael J. LeBeau
  • Patent number: 8751232
    Abstract: A system and method of targeted tuning of a speech recognition system are disclosed. A particular method includes detecting that a frequency of occurrence of a particular type of utterance satisfies a threshold. The method further includes tuning a speech recognition system with respect to the particular type of utterance.
    Type: Grant
    Filed: February 6, 2013
    Date of Patent: June 10, 2014
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Robert R. Bushey, Benjamin Anthony Knott, John Mills Martin
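The detection step above reduces to counting utterance types and flagging any type whose frequency satisfies a threshold. The threshold value and type labels below are illustrative assumptions.

```python
# Rough sketch of the tuning trigger: flag utterance types whose
# frequency of occurrence meets a threshold (values are made up).
from collections import Counter

THRESHOLD = 3

def types_to_tune(utterance_types):
    counts = Counter(utterance_types)
    return {t for t, n in counts.items() if n >= THRESHOLD}

log = ["billing", "billing", "outage", "billing", "outage"]
flagged = types_to_tune(log)
```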
  • Patent number: 8751234
    Abstract: A method and communication device for determining contextual information is provided. Textual information is received from at least one of an input device and a communication interface at the communication device. The textual information is processed to automatically extract contextual data embedded in the textual information in response to the receiving. Supplementary contextual data is automatically retrieved based on the contextual data from a remote data source via the communication interface in response to the processing. The supplementary contextual data is automatically rendered at the display device in association with the contextual data in response to receiving the supplementary contextual data.
    Type: Grant
    Filed: April 27, 2011
    Date of Patent: June 10, 2014
    Assignee: BlackBerry Limited
    Inventors: Jasjit Singh, Suzanne Abellera, Shakila Shahul Hameed, Ankur Aggarwal, Carol C. Wu, Paxton Ronald Cooper, Robert Felice Mori
  • Publication number: 20140156279
    Abstract: According to one embodiment, a content searching apparatus includes: a search condition generator configured to perform voice recognition in parallel with the input of a natural language voice instruction to search for a piece of content, and to generate search conditions sequentially; a searching module configured to perform a content search, updating the search condition used in the search as each condition is generated; and a search result display configured to update both the displayed search condition and the corresponding search result as each condition is generated.
    Type: Application
    Filed: September 11, 2013
    Publication date: June 5, 2014
    Applicant: Kabushiki Kaisha Toshiba
    Inventors: Masayuki OKAMOTO, Hiroko FUJII, Daisuke SANO, Masaru SAKAI
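The incremental behavior above can be sketched as filtering a content list each time the recognizer emits another word, so the result set narrows while the user is still speaking. The content titles and word-level conditions below are illustrative assumptions.

```python
# Illustrative sketch of searching while recognition is in progress:
# each new recognized word extends the condition set and narrows results.
CONTENT = ["cooking show monday", "cooking movie", "news monday"]

def search(conditions):
    """Return items matching every condition accumulated so far."""
    return [c for c in CONTENT if all(term in c for term in conditions)]

conditions = []
results_over_time = []
for word in ["cooking", "monday"]:   # words arriving from the recognizer
    conditions.append(word)
    results_over_time.append(search(conditions))
```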
  • Patent number: 8744839
    Abstract: Target word recognition includes: obtaining a candidate word set and corresponding characteristic computation data, the candidate word set comprising text data, and characteristic computation data being associated with the candidate word set; performing segmentation of the characteristic computation data to generate a plurality of text segments; combining the plurality of text segments to form a text data combination set; determining an intersection of the candidate word set and the text data combination set, the intersection comprising a plurality of text data combinations; determining a plurality of designated characteristic values for the plurality of text data combinations; based at least in part on the plurality of designated characteristic values and according to at least a criterion, recognizing among the plurality of text data combinations target words whose characteristic values fulfill the criterion.
    Type: Grant
    Filed: September 22, 2011
    Date of Patent: June 3, 2014
    Assignee: Alibaba Group Holding Limited
    Inventors: Haibo Sun, Yang Yang, Yining Chen
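The pipeline above (segment, combine, intersect with candidates, score against a criterion) can be shown in a much-simplified form. The whitespace segmentation and frequency-based characteristic value below are illustrative stand-ins for the patent's segmentation and characteristic computation.

```python
# Simplified target-word pipeline: segment, combine adjacent segments,
# intersect with the candidate set, keep combinations meeting a criterion.
from collections import Counter

def target_words(candidates, documents, min_count=2):
    combos = Counter()
    for doc in documents:
        segments = doc.split()                  # naive segmentation
        for i in range(len(segments)):
            for j in range(i + 1, len(segments) + 1):
                combos[" ".join(segments[i:j])] += 1
    intersection = {c: combos[c] for c in candidates if c in combos}
    return {c for c, n in intersection.items() if n >= min_count}

docs = ["red phone case", "red phone cover", "blue phone case"]
found = target_words({"red phone", "green phone"}, docs)
```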
  • Patent number: 8738380
    Abstract: A system and method for processing multi-modal device interactions in a natural language voice services environment may be provided. In particular, one or more multi-modal device interactions may be received in a natural language voice services environment that includes one or more electronic devices. The multi-modal device interactions may include a non-voice interaction with at least one of the electronic devices or an application associated therewith, and may further include a natural language utterance relating to the non-voice interaction. Context relating to the non-voice interaction and the natural language utterance may be extracted and combined to determine an intent of the multi-modal device interaction, and a request may then be routed to one or more of the electronic devices based on the determined intent of the multi-modal device interaction.
    Type: Grant
    Filed: December 3, 2012
    Date of Patent: May 27, 2014
    Assignee: VoiceBox Technologies Corporation
    Inventors: Larry Baldwin, Chris Weider
  • Patent number: 8738377
    Abstract: Predicting and learning users' intended actions on an electronic device based on free-form speech input. Users' actions can be monitored to develop a list of carrier phrases having one or more actions that correspond to the carrier phrases. A user can speak a command into a device to initiate an action. The spoken command can be parsed and compared to a list of carrier phrases. If the spoken command matches one of the known carrier phrases, the corresponding action(s) can be presented to the user for selection. If the spoken command does not match one of the known carrier phrases, search results (e.g., Internet search results) corresponding to the spoken command can be presented to the user. The actions of the user in response to the presented action(s) and/or the search results can be monitored to update the list of carrier phrases.
    Type: Grant
    Filed: June 7, 2010
    Date of Patent: May 27, 2014
    Assignee: Google Inc.
    Inventors: William J. Byrne, Alexander H. Gruenstein, Douglas Beeferman
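The match-or-search flow above can be sketched directly: if the command begins with a known carrier phrase, return its action with the remainder as the argument; otherwise fall back to a search. The phrases and action names below are illustrative assumptions.

```python
# Hedged sketch of the carrier-phrase flow (phrases/actions are made up).
CARRIER_PHRASES = {
    "call": "place_call",
    "navigate to": "open_maps",
}

def interpret(command):
    # Try longer phrases first so "navigate to" beats any shorter prefix.
    for phrase, action in sorted(CARRIER_PHRASES.items(),
                                 key=lambda kv: -len(kv[0])):
        if command.startswith(phrase + " "):
            return action, command[len(phrase) + 1:]
    return "web_search", command   # no carrier phrase matched

result = interpret("navigate to main street")
```

The patent's learning step would then update `CARRIER_PHRASES` from which presented action or search result the user actually selects.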
  • Patent number: 8738379
    Abstract: Systems for improving or generating a spoken language understanding system using a multitask learning method for intent or call-type classification. The multitask learning method aims at training tasks in parallel while using a shared representation. A computing device automatically re-uses the existing labeled data from various applications, which are similar but may have different call-types, intents or intent distributions to improve the performance. An automated intent mapping algorithm operates across applications. In one aspect, active learning is employed to selectively sample the data to be re-used.
    Type: Grant
    Filed: December 28, 2009
    Date of Patent: May 27, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Gokhan Tur
  • Patent number: 8738384
    Abstract: Grammars for interactive voice response systems using natural language understanding can be created using information which is available on websites. These grammars can be created in an automated manner and can have various tuning measures applied to obtain optimal results when deployed in a customer contact environment. These grammars can allow a variety of statements to be appropriately handled by the system.
    Type: Grant
    Filed: November 16, 2012
    Date of Patent: May 27, 2014
    Assignee: Convergys CMG Utah Inc.
    Inventors: Dhananjay Bansal, Nancy Gardner, Chang-Qing Shu, Kristie Goss, Matthew Yuschik, Sunil Issar, Woosung Kim, Jayant M. Naik
  • Patent number: 8731930
    Abstract: A method for contextual voice query dilation in a Spoken Web search includes determining a context in which a voice query is created, generating a set of multiple voice query terms based on the context and information derived by a speech recognizer component pertaining to the voice query, and processing the set of query terms with at least one dilation operator to produce a dilated set of queries. A method for performing a search on a voice query is also provided, including generating a set of multiple query terms based on information derived by a speech recognizer component processing a voice query, processing the set with multiple dilation operators to produce multiple dilated sub-sets of query terms, selecting at least one query term from each dilated sub-set to compose a query set, and performing a search on the query set.
    Type: Grant
    Filed: August 8, 2012
    Date of Patent: May 20, 2014
    Assignee: International Business Machines Corporation
    Inventors: Nitendra Rajput, Kundan Shrivastava
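The dilation step above can be sketched as a set of operators, each mapping a recognized query term to a broader set of variants, with the union forming the dilated query set. The two operators below (synonym lookup and prefix truncation) are assumptions for illustration, not operators named in the patent.

```python
# Illustrative dilation sketch: operators broaden each query term.
SYNONYMS = {"crop": ["harvest"], "price": ["rate", "cost"]}

def synonym_op(term):
    return set(SYNONYMS.get(term, []))

def prefix_op(term, n=4):
    # Truncate long terms to tolerate recognizer errors at word endings.
    return {term[:n]} if len(term) > n else set()

def dilate(terms, operators):
    dilated = set(terms)
    for term in terms:
        for op in operators:
            dilated |= op(term)
    return dilated

queries = dilate({"crop", "price"}, [synonym_op, prefix_op])
```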
  • Patent number: 8731920
    Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: May 20, 2014
    Assignee: MModal IP LLC
    Inventors: Girija Yegnanarayanan, Michael Finke, Juergen Fritsch, Detlef Koll, Monika Woszczyna
  • Patent number: 8731929
    Abstract: Systems and methods for receiving natural language queries and/or commands and executing those queries and/or commands. The systems and methods overcome the deficiencies of prior art speech query and response systems through the application of a complete speech-based information query, retrieval, presentation and command environment. This environment makes significant use of context, prior information, domain knowledge, and user specific profile data to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a complete speech-based natural language query and response environment can be created. The systems and methods create, store and use extensive personal profile information for each user, thereby improving the reliability of determining the context and presenting the expected results for a particular question or command.
    Type: Grant
    Filed: February 4, 2009
    Date of Patent: May 20, 2014
    Assignee: VoiceBox Technologies Corporation
    Inventors: Robert A. Kennewick, David Locke, Michael R. Kennewick, Sr., Michael R. Kennewick, Jr., Richard Kennewick, Tom Freeman
  • Patent number: 8732114
    Abstract: A method is described for semantic classification in human-machine dialog applications, for example, call routing. Utterances in a new training corpus of a new semantic classification application are tagged using a pre-existing semantic classifier and associated pre-existing classification tags trained for an earlier semantic classification application.
    Type: Grant
    Filed: April 13, 2009
    Date of Patent: May 20, 2014
    Assignee: Nuance Communications, Inc.
    Inventor: Ding Liu
  • Patent number: 8731905
    Abstract: The present invention is a text display system with speech output that uses a method of text segmentation in which segments of text are presented one after another for reading text sequentially. To indicate the location of text a user is currently reading, the current sentence is emphasized by presenting the surrounding text in faded colors. The current sentence is segmented into phrases where the points of segmentation are chosen by a series of grammatical rules and the desired number of words in each segment. When the text is presented sequentially, each segment is highlighted within the current sentence. With the use of a text-to-speech output system, each segment is spoken out with a pause before the next segment is presented. In a non-linear/selective reading scenario, a user can select a text segment, for which the span of the segment can be automatically generated or manually selected by the user.
    Type: Grant
    Filed: February 22, 2013
    Date of Patent: May 20, 2014
    Assignee: Quillsoft Ltd.
    Inventors: Vivian Tsang, David Jacob, Fraser Shein
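The segmentation described above can be approximated with a toy rule set: break the current sentence at conjunction boundaries (a stand-in for the patent's grammatical rules) and cap each segment at a desired word count. The break-word list and the cap are illustrative assumptions.

```python
# Simplified phrase segmentation: break at conjunctions or when a
# segment reaches the desired word count (rules are illustrative).
BREAK_WORDS = {"and", "but", "because"}

def segment(sentence, max_words=4):
    words = sentence.split()
    segments, current = [], []
    for w in words:
        if w in BREAK_WORDS and current:
            segments.append(current)   # close segment before conjunction
            current = []
        current.append(w)
        if len(current) == max_words:
            segments.append(current)   # close segment at the word cap
            current = []
    if current:
        segments.append(current)
    return [" ".join(s) for s in segments]

phrases = segment("the cat sat down and then it slept")
```

In the patent's system each returned phrase would then be highlighted and spoken in turn, with a pause between segments.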
  • Patent number: 8725513
    Abstract: Methods, apparatus, and products are disclosed for providing expressive user interaction with a multimodal application, the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of user interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to a speech engine through a VoiceXML interpreter, including: receiving, by the multimodal browser, user input from a user through a particular mode of user interaction; determining, by the multimodal browser, user output for the user in dependence upon the user input; determining, by the multimodal browser, a style for the user output in dependence upon the user input, the style specifying expressive output characteristics for at least one other mode of user interaction; and rendering, by the multimodal browser, the user output in dependence upon the style.
    Type: Grant
    Filed: April 12, 2007
    Date of Patent: May 13, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Charles W. Cross, Jr., Ellen M. Eide, Igor R. Jablokov
  • Patent number: 8725492
    Abstract: Semantically distinct items are extracted from a single utterance by repeatedly recognizing the same utterance using constraints provided by semantic items already recognized. User feedback for selection or correction of a partially recognized utterance may be used in a hierarchical, multi-modal, or single-step manner. Recognition accuracy is preserved while the less structured and more natural single-utterance form of recognition is allowed.
    Type: Grant
    Filed: March 5, 2008
    Date of Patent: May 13, 2014
    Assignee: Microsoft Corporation
    Inventors: Julian J Odell, Robert L. Chambers, Oliver Scholz
  • Patent number: 8725511
    Abstract: Disclosed herein are methods and systems for recognizing speech. A method embodiment comprises comparing received speech with a precompiled grammar based on a database and if the received speech matches data in the precompiled grammar then returning a result based on the matched data. If the received speech does not match data in the precompiled grammar, then dynamically compiling a new grammar based only on new data added to the database after the compiling of the precompiled grammar. The database may comprise a directory of names.
    Type: Grant
    Filed: July 2, 2013
    Date of Patent: May 13, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Harry Blanchard, Steven H. Lewis, Shankarnarayan Sivaprasad, Lan Zhang
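The fallback described above can be sketched as a two-stage match: check the precompiled grammar first, and only on a miss compile a grammar from records added since the precompile and retry. The directory contents and set-based "grammars" below are illustrative stand-ins for real compiled grammars.

```python
# Sketch of precompiled-grammar matching with a dynamic delta fallback.
DIRECTORY = ["alice adams", "bob brown"]   # records at precompile time
precompiled = set(DIRECTORY)               # stand-in for compiled grammar
NEW_RECORDS = ["carol chen"]               # added after the precompile

def recognize(utterance):
    if utterance in precompiled:
        return utterance, "precompiled"
    delta_grammar = set(NEW_RECORDS)       # compiled on demand, delta only
    if utterance in delta_grammar:
        return utterance, "dynamic"
    return None, "no match"

hit = recognize("carol chen")
```

Compiling only the delta keeps the on-demand step small even when the underlying directory is large, which is the point of the claimed method.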
  • Patent number: 8725496
    Abstract: A method, an apparatus and an article of manufacture for customizing a natural language processing engine. The method includes enabling selection of one or more parameters of a desired natural language processing task, the one or more parameters intended for use by a trained and an untrained user, mapping the one or more selected parameters to a collection of one or more intervals of an input parameter to an optimization algorithm, and applying the optimization algorithm with the collection of one or more intervals of an input parameter to a model used by a natural language processing engine to produce a customized model.
    Type: Grant
    Filed: July 26, 2011
    Date of Patent: May 13, 2014
    Assignee: International Business Machines Corporation
    Inventors: Bing Zhao, Vittorio Castelli
  • Patent number: 8718622
    Abstract: Methods and systems that support the receipt of location data and/or touch data from a mobile communication device are provided. More particularly, a mobile customer service server is provided that can receive location data from or regarding a mobile communication device, and associate that location data with recognition data. The recognition data can in turn be delivered to other server side components, and used to select content to be returned to the mobile communication device. The mobile customer service server can also receive touch data input to the mobile communication device, and can provide recognition data related to the touch input to other server side components. Server side components provided with location or touch data by the mobile customer service server do not themselves need to natively support location recognition or touch recognition capabilities.
    Type: Grant
    Filed: September 22, 2010
    Date of Patent: May 6, 2014
    Assignee: Avaya Inc.
    Inventors: George Erhart, Valentine Matula, David Skiba
  • Patent number: 8719021
    Abstract: A speech recognition dictionary compilation assisting system can create and update a speech recognition dictionary and language models efficiently so as to reduce speech recognition errors by utilizing text data available at a low cost. The system includes speech recognition dictionary storage section 105, language model storage section 106 and acoustic model storage section 107. A virtual speech recognition processing section 102 processes analyzed text data generated by the text analyzing section 101 by making reference to the recognition dictionary, language models and acoustic models so as to generate virtual text data resulting from speech recognition, and compares the virtual text data resulting from speech recognition with the analyzed text data. The update processing section 103 updates the recognition dictionary and language models so as to reduce the difference(s) between the two sets of text data.
    Type: Grant
    Filed: February 2, 2007
    Date of Patent: May 6, 2014
    Assignee: NEC Corporation
    Inventor: Takafumi Koshinaka
  • Patent number: 8719025
    Abstract: An apparatus and an article of manufacture for contextual voice query dilation in a Spoken Web search include determining a context in which a voice query is created, generating a set of multiple voice query terms based on the context and information derived by a speech recognizer component pertaining to the voice query, and processing the set of query terms with at least one dilation operator to produce a dilated set of queries.
    Type: Grant
    Filed: May 14, 2012
    Date of Patent: May 6, 2014
    Assignee: International Business Machines Corporation
    Inventors: Nitendra Rajput, Kundan Shrivastava
  • Patent number: 8719023
    Abstract: An apparatus to improve robustness to environmental changes of a context-dependent speech recognizer for an application, which includes a training database to store sounds for speech recognition training, a dictionary to store words supported by the speech recognizer, and a speech recognizer training module to train a set of one or more multiple-state Hidden Markov Models (HMMs) with use of the training database and the dictionary. The speech recognizer training module performs a non-uniform state clustering process on each of the states of each HMM, which includes using a different non-uniform cluster threshold for at least some of the states of each HMM to more heavily cluster, and correspondingly reduce the number of observation distributions for, those states of each HMM that are less empirically affected by one or more contextual dependencies.
    Type: Grant
    Filed: May 21, 2010
    Date of Patent: May 6, 2014
    Assignee: Sony Computer Entertainment Inc.
    Inventors: Xavier Menendez-Pidal, Ruxin Chen
  • Patent number: 8719024
    Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method including receiving audio data and a transcript of the audio data. The method further includes generating a language model including a factor automaton that includes automaton states and arcs, each of the automaton arcs corresponding to a language element from the transcript. The method further includes receiving language elements recognized from the received audio data and times at which each of the recognized language elements occur in the audio data. The method further includes comparing the recognized language elements to one or more of the language elements from the factor automaton to identify times at which the one or more of the language elements from the transcript occur in the audio data. The method further includes aligning a portion of the transcript with a portion of the audio data using the identified times.
    Type: Grant
    Filed: March 5, 2012
    Date of Patent: May 6, 2014
    Assignee: Google Inc.
    Inventors: Pedro J. Moreno, Christopher Alberti
  • Patent number: 8719027
    Abstract: An automated method of providing a pronunciation of a word to a remote device is disclosed. The method includes receiving an input indicative of the word to be pronounced. The method further includes searching a database having a plurality of records. Each of the records has an indication of a textual representation and an associated indication of an audible representation. At least one output is provided to the remote device of an audible representation of the word to be pronounced.
    Type: Grant
    Filed: February 28, 2007
    Date of Patent: May 6, 2014
    Assignee: Microsoft Corporation
    Inventors: Yining Chen, Yusheng Li, Min Chu, Frank Kao-Ping Soong
  • Patent number: 8719026
    Abstract: A conversational, natural language voice user interface may provide an integrated voice navigation services environment. The voice user interface may enable a user to make natural language requests relating to various navigation services, and further, may interact with the user in a cooperative, conversational dialogue to resolve the requests. Through dynamic awareness of context, available sources of information, domain knowledge, user behavior and preferences, and external systems and devices, among other things, the voice user interface may provide an integrated environment in which the user can speak conversationally, using natural language, to issue queries, commands, or other requests relating to the navigation services provided in the environment.
    Type: Grant
    Filed: February 4, 2013
    Date of Patent: May 6, 2014
    Assignee: VoiceBox Technologies Corporation
    Inventors: Michael R. Kennewick, Catherine Cheung, Larry Baldwin, Ari Salomon, Michael Tjalve, Sheetal Guttigoli, Lynn Armstrong, Philippe DiChristo, Bernie Zimmerman, Sam Menaker
  • Patent number: 8719004
    Abstract: A system, method and software product punctuates voicemail transcription text. A transcription text of the voicemail message is generated and the pauses between words of the transcribed text are determined. Ellipses are inserted into the transcription text at the position of “er” and “ahh” type words and pauses between words of the transcribed text.
    Type: Grant
    Filed: March 19, 2010
    Date of Patent: May 6, 2014
    Assignee: Ditech Networks, Inc.
    Inventor: James Siminoff
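The punctuation rule above is simple enough to show directly: replace filler tokens and long inter-word pauses with ellipses. The filler list and pause threshold below are assumptions for illustration, not values from the patent.

```python
# Toy sketch of voicemail-transcript punctuation: ellipses replace
# filler words and long pauses (threshold and fillers are made up).
FILLERS = {"er", "ahh", "um"}
PAUSE_THRESHOLD = 0.7   # seconds

def punctuate(words):
    """words: list of (token, pause_before_seconds) pairs."""
    out = []
    for token, pause in words:
        if pause >= PAUSE_THRESHOLD:
            out.append("...")                       # pause before the word
        out.append("..." if token in FILLERS else token)
    return " ".join(out)

text = punctuate([("call", 0.0), ("er", 0.2), ("me", 0.1), ("back", 0.9)])
```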
  • Patent number: 8719009
    Abstract: A system and method for processing multi-modal device interactions in a natural language voice services environment may be provided. In particular, one or more multi-modal device interactions may be received in a natural language voice services environment that includes one or more electronic devices. The multi-modal device interactions may include a non-voice interaction with at least one of the electronic devices or an application associated therewith, and may further include a natural language utterance relating to the non-voice interaction. Context relating to the non-voice interaction and the natural language utterance may be extracted and combined to determine an intent of the multi-modal device interaction, and a request may then be routed to one or more of the electronic devices based on the determined intent of the multi-modal device interaction.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: May 6, 2014
    Assignee: VoiceBox Technologies Corporation
    Inventors: Larry Baldwin, Chris Weider
  • Patent number: 8712773
    Abstract: The present invention relates to a method for modeling a common-language speech recognition, by a computer, under the influence of multiple dialects and concerns the technical field of speech recognition by a computer. In this method, a triphone standard common-language model is first generated based on training data of standard common language, and first and second monophone dialectal-accented common-language models are generated based on development data of dialectal-accented common languages of a first kind and a second kind, respectively. Then a temporary merged model is obtained in a manner that the first dialectal-accented common-language model is merged into the standard common-language model according to a first confusion matrix obtained by recognizing the development data of the first dialectal-accented common language using the standard common-language model.
    Type: Grant
    Filed: October 29, 2009
    Date of Patent: April 29, 2014
    Assignees: Sony Computer Entertainment Inc., Tsinghua University
    Inventors: Fang Zheng, Xi Xiao, Linquan Liu, Zhan You, Wenxiao Cao, Makoto Akabane, Ruxin Chen, Yoshikazu Takahashi
  • Patent number: 8712781
    Abstract: A method for providing an audible prompt to a user within a vehicle. The method includes retrieving one or more data files from a memory device. The data files define certain characteristics of an audio prompt. The method also includes creating the audio prompt from the data files and outputting the audio prompt as an audio signal.
    Type: Grant
    Filed: January 4, 2008
    Date of Patent: April 29, 2014
    Assignee: Johnson Controls Technology Company
    Inventors: Mark Zeinstra, Richard J. Chutorash, Jeffrey Golden, Jon M. Skekloff
  • Patent number: 8712775
    Abstract: A method and system for generating a finite state grammar is provided. The method comprises receiving user input of at least two sample phrases; analyzing the sample phrases to determine common words that occur in each of the sample phrases and optional words that occur in only some of the sample phrases; creating a mathematical expression representing the sample phrases, the expression including each word found in the sample phrases and an indication of whether a word is a common word or an optional word; displaying the mathematical expression to a user; allowing the user to alter the mathematical expression; generating a finite state grammar corresponding to the altered mathematical expression; and displaying the finite state grammar to the user.
    Type: Grant
    Filed: January 31, 2013
    Date of Patent: April 29, 2014
    Assignee: West Interactive Corporation II
    Inventor: Ashok Mitter Khosla
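The common/optional word analysis described in the abstract above could be sketched like this. The bracket notation for optional words is an assumption for illustration, not the patent's actual expression syntax.

```python
def analyze_phrases(phrases):
    """Mark words in every sample phrase as common, the rest as optional."""
    tokenized = [p.lower().split() for p in phrases]
    common = set(tokenized[0]).intersection(*tokenized[1:])
    expression = []
    for word in tokenized[0]:  # preserve the word order of the first sample
        expression.append(word if word in common else f"[{word}]")
    return " ".join(expression)

expr = analyze_phrases(["check my account balance", "check account balance"])
# "my" occurs in only one sample phrase, so it is marked optional.
```

The resulting expression could then be edited by the user and compiled into a finite state grammar, per the abstract's workflow.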
  • Patent number: 8712759
    Abstract: Disambiguation of the meaning of a natural language expression proceeds by constructing an initial meaning representation for the expression and then incrementally specializing that representation to more specific meanings as more information and constraints are obtained, in accordance with one or more specialization hierarchies between semantic descriptors. The method generalizes to disjunctive sets of interpretations that can be specialized hierarchically.
    Type: Grant
    Filed: October 22, 2010
    Date of Patent: April 29, 2014
    Assignee: Clausal Computing Oy
    Inventor: Tatu J. Ylonen
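The incremental specialization the abstract describes might be sketched as a descent through a hierarchy of semantic descriptors, narrowing only while the constraints pick out a unique child. The hierarchy and constraint test below are illustrative assumptions.

```python
# Toy specialization hierarchy between semantic descriptors.
HIERARCHY = {
    "entity": ["organization", "person"],
    "organization": ["bank", "school"],
}

def specialize(meaning, constraint_fits):
    """Descend while exactly one child descriptor satisfies the constraints."""
    while True:
        children = [c for c in HIERARCHY.get(meaning, []) if constraint_fits(c)]
        if len(children) != 1:
            return meaning  # ambiguous or terminal: stop specializing
        meaning = children[0]

# New information arrives: the expression refers to a financial institution.
result = specialize("entity", lambda d: d in {"organization", "bank"})
```

When constraints leave more than one child viable, the meaning stays at the more general descriptor until further information arrives.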
  • Patent number: 8713119
    Abstract: An electronic device may capture a voice command from a user. The electronic device may store contextual information about the state of the electronic device when the voice command is received. The electronic device may transmit the voice command and the contextual information to computing equipment such as a desktop computer or a remote server. The computing equipment may perform a speech recognition operation on the voice command and may process the contextual information. The computing equipment may respond to the voice command. The computing equipment may also transmit information to the electronic device that allows the electronic device to respond to the voice command.
    Type: Grant
    Filed: September 13, 2012
    Date of Patent: April 29, 2014
    Assignee: Apple Inc.
    Inventor: Aram M. Lindahl
  • Patent number: 8705705
    Abstract: Tags, such as XML tags, are inserted into email to separate email content from signature blocks, privacy notices and confidentiality notices, and to separate original email messages from replies and replies from further replies. The tags are detected by a system that renders email as speech, such as voice command platform or network-based virtual assistant or message center. The system can render an original email message in one voice mode and the reply in a different voice mode. The tags can be inserted to identify a voice memo in which a user responds to a particular portion of an email message.
    Type: Grant
    Filed: April 3, 2012
    Date of Patent: April 22, 2014
    Assignee: Sprint Spectrum L.P.
    Inventors: Balaji S. Thenthiruperai, Elizabeth Roche, Brian Landers, Jesse Kates
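The tag-driven voice switching the abstract describes could be sketched as below. The tag names (`reply`, `original`) and voice labels are assumptions for illustration, not the patent's actual schema.

```python
import re

def render_sections(tagged_email):
    """Yield (voice, text) pairs for each tagged section of an email."""
    pattern = re.compile(r"<(reply|original)>(.*?)</\1>", re.DOTALL)
    # A text-to-speech renderer switches voice mode per section type.
    voices = {"reply": "voice_a", "original": "voice_b"}
    return [(voices[m.group(1)], m.group(2).strip())
            for m in pattern.finditer(tagged_email)]

email = "<reply>Sounds good.</reply><original>Lunch at noon?</original>"
sections = render_sections(email)
```

A renderer iterating over `sections` would speak the reply and the quoted original in different voices, as the abstract suggests.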
  • Patent number: 8706491
    Abstract: One feature of the present invention uses the parsing capabilities of a structured language model in the information extraction process. During training, the structured language model is first initialized with syntactically annotated training data. The model is then trained by generating parses on semantically annotated training data enforcing annotated constituent boundaries. The syntactic labels in the parse trees generated by the parser are then replaced with joint syntactic and semantic labels. The model is then trained by generating parses on the semantically annotated training data enforcing the semantic tags or labels found in the training data. The trained model can then be used to extract information from test data using the parses generated by the model.
    Type: Grant
    Filed: August 24, 2010
    Date of Patent: April 22, 2014
    Assignee: Microsoft Corporation
    Inventors: Ciprian Chelba, Milind Mahajan
  • Patent number: 8706489
    Abstract: A system and method for selecting audio contents by using the speech recognition to obtain a textual phrase from a series of audio contents are provided. The system includes an output module outputting the audio contents, an input module receiving a speech input from a user, a buffer temporarily storing the audio contents within a desired period and the speech input, and a recognizing module performing a speech recognition between the audio contents within the desired period and the speech input to generate an audio phrase and the corresponding textual phrase matching with the speech input.
    Type: Grant
    Filed: August 8, 2006
    Date of Patent: April 22, 2014
    Assignee: Delta Electronics Inc.
    Inventors: Jia-lin Shen, Chien-Chou Hung
  • Patent number: 8706472
    Abstract: Disambiguating multiple readings in language conversion is disclosed, including: receiving input data to be converted into a set of characters comprising a symbolic representation of the input data in a target symbolic system; and using a language model that distinguishes between a first reading and a second reading of a heteronymous character of the target symbolic system to determine a probability that the character should be used to represent a corresponding portion of the input data.
    Type: Grant
    Filed: August 11, 2011
    Date of Patent: April 22, 2014
    Assignee: Apple Inc.
    Inventors: Brent D. Ramerth, Devang K. Naik, Douglas R. Davidson, Jannes G. A. Dolfing, Jia Pu
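A language model that distinguishes readings, as the abstract describes, can be pictured as scoring each reading in its context and keeping the best. The toy bigram scores below are invented for illustration; "read" (present vs. past tense) stands in for a heteronymous character.

```python
def choose_reading(prev_word, readings, bigram):
    """Pick the reading the language model scores highest after prev_word."""
    return max(readings, key=lambda r: bigram.get((prev_word, r), 0.0))

# Toy bigram probabilities distinguishing two readings of "read".
bigram = {("will", "read_present"): 0.7, ("will", "read_past"): 0.1,
          ("have", "read_past"): 0.8, ("have", "read_present"): 0.1}

# "read" after "will" is the present-tense reading; after "have", the past.
best = choose_reading("will", ["read_present", "read_past"], bigram)
```

In a real converter the readings would be per-character pronunciations in the target symbolic system, scored by a trained model rather than a hand-written table.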
  • Patent number: 8700395
    Abstract: A computer program product, for performing data determination from medical record transcriptions, resides on a computer-readable medium and includes computer-readable instructions for causing a computer to: obtain a medical transcription of a dictation, the dictation being from medical personnel and concerning a patient; analyze the transcription for an indicating phrase associated with a type of data desired to be determined from the transcription, the type of desired data being relevant to medical records; determine whether data indicated by text disposed proximately to the indicating phrase is of the desired type; and store an indication of the data if the data is of the desired type.
    Type: Grant
    Filed: September 13, 2012
    Date of Patent: April 15, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Roger S. Zimmerman, Paul Egerman, George Zavaliagkos
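The indicating-phrase approach the abstract outlines could be sketched as follows: scan the transcription for a phrase associated with a data type, then test whether nearby text matches that type's pattern. The phrase table, window size, and regex are illustrative assumptions.

```python
import re

# Indicating phrases mapped to patterns for the desired data type.
INDICATORS = {"blood pressure": re.compile(r"(\d{2,3})\s*/\s*(\d{2,3})")}

def extract(transcription):
    """Return data of the desired type found near each indicating phrase."""
    found = {}
    text = transcription.lower()
    for phrase, pattern in INDICATORS.items():
        idx = text.find(phrase)
        if idx == -1:
            continue
        # Only text disposed proximately to the phrase is examined.
        window = text[idx + len(phrase): idx + len(phrase) + 40]
        m = pattern.search(window)
        if m:  # the proximate data is of the desired type: store it
            found[phrase] = m.group(0).replace(" ", "")
    return found

data = extract("Patient's blood pressure was 120/80 today.")
```

If the text near the indicating phrase does not match the expected type, nothing is stored, which mirrors the abstract's type check.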
  • Patent number: 8700404
    Abstract: Disclosed herein is a system, method, and computer-readable medium storing instructions related to semantic and syntactic information in a language understanding system. The method embodiment classifies utterances during a natural language dialog between a human and a computing device. The method comprises receiving a user utterance; generating a semantic and syntactic graph associated with the received utterance; extracting all n-grams as features from the generated graph; and classifying the utterance. Classifying the utterance may be performed in any number of ways, such as by using the extracted n-grams, the syntactic and semantic graph, or written rules.
    Type: Grant
    Filed: August 27, 2005
    Date of Patent: April 15, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Ananlada Chotimongkol, Dilek Z. Hakkani-Tur, Gokhan Tur
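The n-gram feature extraction step the abstract mentions could be sketched as below, assuming the semantic/syntactic graph has already been flattened into a token sequence; the tokens shown are invented for illustration.

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def all_ngrams(tokens, max_n=2):
    """Extract every n-gram up to max_n as a classification feature."""
    feats = []
    for n in range(1, max_n + 1):
        feats.extend(ngrams(tokens, n))
    return feats

# Unigrams and bigrams extracted as features for utterance classification.
feats = all_ngrams(["book", "a", "flight"])
```

In the patent's setting the tokens would carry joint semantic and syntactic labels from the graph rather than bare words, but the extraction mechanics are the same.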
  • Patent number: 8700702
    Abstract: This invention provides a terminal that searches for web pages on the web and extracts prescribed data from those pages, and a server that verifies and accumulates the extracted data. The prescribed data can be extracted from the web pages in a manner that distributes the extraction process between the terminal and the server; because the processes required for data extraction are distributed, the burden placed on each apparatus is lessened. Further, new data not formerly found in the web pages can be discovered and extracted from pages that have been updated or newly created.
    Type: Grant
    Filed: August 24, 2012
    Date of Patent: April 15, 2014
    Assignee: Kabushiki Kaisha Square Enix
    Inventor: Kengo Nakajima
  • Patent number: 8700385
    Abstract: Providing for generation of a task oriented data structure that can correlate natural language descriptions of computer related tasks to application level commands and functions is described herein. By way of example, a system can include an activity translation component that can receive a natural language description of an application level task. Furthermore, the system can include a language modeling component that can generate the data structure based on an association between the description of the task and at least one application level command utilized in executing the computer related task. Once generated, the data structure can be utilized to automate computer related tasks by input of a human centric description of those tasks. According to further embodiments, machine learning can be employed to train classifiers and heuristic models to optimize task/description relationships and/or tailor such relationships to the needs of particular users.
    Type: Grant
    Filed: April 4, 2008
    Date of Patent: April 15, 2014
    Assignee: Microsoft Corporation
    Inventors: Susan T. Dumais, William H. Gates, III, Srikanth Shoroff, Michael Ehrenberg, Jensen M. Harris, Richard J. Wolf, Joshua T. Goodman, Eran Megiddo
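A toy version of the task-oriented data structure the abstract describes might look like the sketch below. The task descriptions, command strings, and word-overlap matcher are all invented stand-ins for the trained classifiers and heuristic models the abstract mentions.

```python
# Hypothetical structure correlating task descriptions with commands.
TASKS = {
    "make the text bigger": ["Format.Font.Size += 2"],
    "email this document": ["File.Share.Email"],
}

def commands_for(description):
    """Match a human-centric description to commands by word overlap."""
    words = set(description.lower().split())
    best, best_overlap = None, 0
    for desc, cmds in TASKS.items():
        overlap = len(words & set(desc.split()))
        if overlap > best_overlap:
            best, best_overlap = cmds, overlap
    return best

cmds = commands_for("please email the document")
```

Machine learning, as the abstract notes, would replace this crude overlap score with models tuned to each user's phrasing.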
  • Patent number: 8694312
    Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model using discriminative training techniques, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.
    Type: Grant
    Filed: February 22, 2013
    Date of Patent: April 8, 2014
    Assignee: MModal IP LLC
    Inventors: Lambert Mathias, Girija Yegnanarayanan, Juergen Fritsch
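The multiple-spoken-forms problem the abstract describes can be pictured by expanding each non-literal transcript token into its candidate spoken forms, so the form best matching the audio can be selected. The expansion table below is a made-up assumption.

```python
# Hypothetical table of tokens with multiple spoken forms.
SPOKEN_FORMS = {
    "Dr.": ["doctor", "drive"],
    "50": ["fifty", "five zero"],
}

def spoken_variants(tokens):
    """Enumerate every spoken-form expansion of a token sequence."""
    variants = [[]]
    for tok in tokens:
        forms = SPOKEN_FORMS.get(tok, [tok])
        variants = [v + [f] for v in variants for f in forms]
    return [" ".join(v) for v in variants]

# Candidate spoken forms to align against the audio stream.
forms = spoken_variants(["Dr.", "Smith"])
```

The variant that best aligns with the audio would replace the original token in the revised transcript used for acoustic model training.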
  • Patent number: 8688446
    Abstract: Systems, methods, and computer readable media providing a speech input interface. The interface can receive speech input and non-speech input from a user through a user interface. The speech input can be converted to text data and the text data can be combined with the non-speech input for presentation to a user.
    Type: Grant
    Filed: November 18, 2011
    Date of Patent: April 1, 2014
    Assignee: Apple Inc.
    Inventor: Kazuhisa Yanagihara
  • Patent number: 8688434
    Abstract: A system and method for automatically generating a narrative story receives data and information pertaining to a domain event. The received data and information and/or one or more derived features are then used to identify a plurality of angles for the narrative story. The plurality of angles is then filtered, for example through use of parameters that specify a focus for the narrative story, length of the narrative story, etc. Points associated with the filtered plurality of angles are then assembled and the narrative story is rendered using the filtered plurality of angles and the assembled points.
    Type: Grant
    Filed: May 13, 2010
    Date of Patent: April 1, 2014
    Assignee: Narrative Science Inc.
    Inventors: Lawrence A. Birnbaum, Kristian J. Hammond, Nicholas D. Allen, John R. Templon
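The angle identification and filtering steps the abstract lists might be sketched as below. The angle names, their applicability conditions, and the length parameter are invented for illustration.

```python
# Candidate story angles with conditions over derived event features.
ANGLES = [
    {"name": "comeback_win", "applies": lambda d: d["deficit"] >= 10 and d["won"]},
    {"name": "blowout", "applies": lambda d: abs(d["margin"]) >= 20},
    {"name": "close_game", "applies": lambda d: abs(d["margin"]) <= 3},
]

def pick_angles(event_data, max_angles=1):
    """Identify applicable angles, then filter by story parameters."""
    candidates = [a["name"] for a in ANGLES if a["applies"](event_data)]
    return candidates[:max_angles]  # a length parameter limits the focus

angles = pick_angles({"deficit": 12, "won": True, "margin": 4})
```

Points associated with the surviving angles would then be assembled and rendered into the narrative, per the abstract.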
  • Patent number: 8688453
    Abstract: According to example configurations, a speech processing system can include a syntactic parser, a word extractor, word extraction rules, and an analyzer. The syntactic parser parses an utterance to identify syntactic relationships amongst its words. The word extractor applies the word extraction rules to identify groupings of related words in the utterance that most likely represent an intended meaning of the utterance. The analyzer maps each of the word sets produced by the word extractor to a respective candidate intent value, producing a list of candidate intent values (i.e., possible intended meanings) for the utterance. The analyzer is configured to select from this list a particular candidate intent value as being representative of the intent (i.e., intended meaning) of the utterance.
    Type: Grant
    Filed: February 28, 2011
    Date of Patent: April 1, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Sachindra Joshi, Shantanu Godbole
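The mapping-and-selection step the abstract outlines could be sketched as follows. The intent table and its scores are illustrative assumptions; a real analyzer would score candidates with a trained model.

```python
# Hypothetical table mapping word groupings to (intent, score) candidates.
INTENT_TABLE = {
    frozenset(["cancel", "flight"]): ("cancel_flight", 0.9),
    frozenset(["change", "flight"]): ("change_flight", 0.8),
    frozenset(["flight"]): ("flight_info", 0.4),
}

def candidate_intents(word_sets):
    """Map each grouping of related words to a candidate intent value."""
    return [INTENT_TABLE[frozenset(ws)] for ws in word_sets
            if frozenset(ws) in INTENT_TABLE]

def select_intent(word_sets):
    """Pick the highest-scoring candidate as the utterance's intent."""
    candidates = candidate_intents(word_sets)
    return max(candidates, key=lambda c: c[1])[0] if candidates else None

# Word groupings as a word extractor might produce for
# "I need to cancel my flight".
intent = select_intent([["cancel", "flight"], ["flight"]])
```

Each word grouping contributes one candidate; the selection step then resolves competing interpretations of the same utterance.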