Abstract: A specification of a first natural language understanding (NLU) machine learning model for a first human communication language is received. The specification specifies language content associated with one or more intents of the first NLU machine learning model in the first human communication language. An identification of an association between the first NLU machine learning model and a second NLU machine learning model for a second human communication language is received. The first NLU machine learning model and the second NLU machine learning model are managed together. This includes detecting a change to the first NLU machine learning model in the first human communication language and to software using the first NLU model, and in response automatically assisting in maintaining consistency in the second NLU machine learning model in the second human communication language by updating the software with respect to the detected change.
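As a rough illustration of the management step described above, the sketch below pairs two per-language model specifications, detects which intents changed in the first, and propagates the change to the second through a translation hook. All class and function names are hypothetical; the abstract does not prescribe an implementation.

```python
# Illustrative sketch (not the patent's implementation): pairing two
# per-language NLU model specifications and flagging the linked model
# for update when the primary one changes. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class NLUModelSpec:
    language: str
    # intent name -> example utterances in this language
    intents: dict[str, list[str]] = field(default_factory=dict)

@dataclass
class LinkedModels:
    primary: NLUModelSpec
    secondary: NLUModelSpec

    def detect_changes(self, new_primary: NLUModelSpec) -> list[str]:
        """Return intents whose language content changed in the primary model."""
        return [intent for intent, utterances in new_primary.intents.items()
                if self.primary.intents.get(intent) != utterances]

    def sync(self, new_primary: NLUModelSpec, translate) -> None:
        """Propagate changed intents to the secondary model via a translator."""
        for intent in self.detect_changes(new_primary):
            self.secondary.intents[intent] = [
                translate(u, self.secondary.language)
                for u in new_primary.intents[intent]
            ]
        self.primary = new_primary

en = NLUModelSpec("en", {"greet": ["hello", "hi"]})
fr = NLUModelSpec("fr", {"greet": ["bonjour", "salut"]})
pair = LinkedModels(en, fr)

# A stand-in translator; a real system would call an MT service.
pair.sync(NLUModelSpec("en", {"greet": ["hello", "hi", "hey there"]}),
          translate=lambda text, lang: f"[{lang}] {text}")
print(fr.intents["greet"])
```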
Abstract: A taxi management device includes: an information acquisition unit that acquires first information, including information specifying a language used by a first person, and second information, including information specifying a language interpretable by a second person; and a discount setting unit that sets a discount on the second person's fare when the first person and the second person share a taxi ride, based on the first information and the second information.
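The discount logic reduces to a language-matching rule; a minimal sketch follows, with an invented discount rate standing in for whatever the device would actually apply.

```python
# Toy sketch of the matching rule described above (names hypothetical):
# discount the second rider's fare when they can interpret for the first.
def discount_rate(rider_language: str, interpreter_languages: set[str],
                  base_rate: float = 0.2) -> float:
    """Return the fare discount for the second rider on a shared trip."""
    return base_rate if rider_language in interpreter_languages else 0.0

print(discount_rate("ja", {"en", "ja"}))  # 0.2: languages match
print(discount_rate("de", {"en", "ja"}))  # 0.0: no match, no discount
```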
Abstract: In some embodiments, the recognition results produced by a speech processing system (which may include two or more recognition results, including a top recognition result and one or more alternative recognition results) based on an analysis of a speech input are evaluated for indications of potentially significant errors. In some embodiments, the recognition results may be evaluated to determine whether a meaning of any of the alternative recognition results differs from a meaning of the top recognition result in a manner that is significant for a domain, such as the medical domain. In some embodiments, words and/or phrases that may be confused by an ASR system may be determined and associated in sets of words and/or phrases. Words and/or phrases that may be so determined include those that change the meaning of a phrase or sentence when included in it.
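One plausible reading of the evaluation step is sketched below: an alternative hypothesis is flagged when it and the top hypothesis pick different members of the same domain-critical confusable set. The confusable sets and function names are illustrative, not the patent's.

```python
# Hypothetical sketch of the evaluation step: flag an alternative
# hypothesis when it swaps in a word from a domain-critical confusable set.
CONFUSABLE_SETS = [
    {"hypertension", "hypotension"},   # opposite meanings in medicine
    {"fifteen", "fifty"},
]

def significant_difference(top: str, alternative: str) -> bool:
    top_words = set(top.lower().split())
    alt_words = set(alternative.lower().split())
    for conf in CONFUSABLE_SETS:
        # A significant error: the two hypotheses choose different
        # members of the same confusable set.
        if (conf & top_words) and (conf & alt_words) and \
           (conf & top_words) != (conf & alt_words):
            return True
    return False

print(significant_difference("patient has hypertension",
                             "patient has hypotension"))  # True
```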
Type:
Application
Filed:
July 9, 2012
Publication date:
January 9, 2014
Applicant:
Nuance Communications, Inc.
Inventors:
William F. Ganong, III, Raghu Vemula, Robert Fleming
Abstract: Different advantageous embodiments provide a crowdsourcing method for modeling user intent in conversational interfaces. One or more stimuli are presented to a plurality of describers. One or more sets of describer data are captured from the plurality of describers using a data collection mechanism. The one or more sets of describer data are processed to generate one or more models. Each of the one or more models is associated with a specific stimulus from the one or more stimuli.
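A toy sketch of that pipeline, assuming describers are callables that return a free-text description per stimulus; the "model" here is just a phrase-frequency table, standing in for whatever model the embodiment actually trains.

```python
# Minimal sketch: show stimuli, collect free-text descriptions, and
# build one simple model per stimulus. All names are illustrative.
from collections import Counter

def collect(stimuli, describers):
    """Map each stimulus to the describer phrases captured for it."""
    return {s: [describe(s) for describe in describers] for s in stimuli}

def build_models(describer_data):
    """One model per stimulus: a frequency distribution over phrases."""
    return {s: Counter(phrases) for s, phrases in describer_data.items()}

stimuli = ["photo_of_thermostat", "photo_of_lamp"]
describers = [lambda s: f"turn on the {s.split('_')[-1]}",
              lambda s: f"switch the {s.split('_')[-1]} on"]
models = build_models(collect(stimuli, describers))
print(models["photo_of_lamp"].most_common(1))
```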
Type:
Application
Filed:
April 3, 2012
Publication date:
October 3, 2013
Applicant:
MICROSOFT CORPORATION
Inventors:
Christopher John Brockett, Piali Choudhury, William Brennan Dolan, Yun-Cheng Ju, Patrick Pantel, Noelle Mallory Sophy, Svitlana Volkova
Abstract: A method, system and computer program product for recognizing terms in a specified corpus. In one embodiment, the method comprises providing a set of known terms t ∈ T, each of the known terms t belonging to a set of types τ(t) = {τ1, …}, wherein each of the terms is comprised of a list of words, t = w1, w2, …, wn, and the union of all the words for all the terms is a word set W. The method further comprises using the set of terms T and the set of types to determine a set of pattern-to-type mappings p → τ; and using the set of pattern-to-type mappings to recognize terms in the specified corpus and, for each of the recognized terms in the specified corpus, to recognize one or more of the types τ for said each recognized term.
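A minimal sketch of the two phases, under the assumption that a "pattern" is simply each word's capitalization shape (the abstract leaves the pattern language open): mappings p → τ are learned from the typed terms and then applied to corpus n-grams.

```python
# Illustrative sketch of learning pattern-to-type mappings from known
# typed terms and applying them to new text. The pattern definition
# (word capitalization shape) is an assumption for illustration.
from collections import defaultdict

def shape(word: str) -> str:
    if word.isupper():
        return "ALLCAPS"
    if word[:1].isupper():
        return "Cap"
    return "low"

def learn_mappings(typed_terms):
    """typed_terms: iterable of (term_words, types). Returns pattern -> types."""
    mappings = defaultdict(set)
    for words, types in typed_terms:
        mappings[tuple(shape(w) for w in words)] |= set(types)
    return mappings

def recognize(tokens, mappings, max_len=3):
    """Yield (term_words, types) for corpus n-grams matching a learned pattern."""
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            types = mappings.get(tuple(shape(w) for w in gram))
            if types:
                yield gram, types

mappings = learn_mappings([(["Acme", "Corp"], {"organization"})])
print(list(recognize("He joined Widget Corp last year".split(), mappings)))
```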
Type:
Application
Filed:
November 2, 2012
Publication date:
May 9, 2013
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Abstract: A system, method and computer-readable medium for practicing a method of emotion detection during a natural language dialog between a human and a computing device are disclosed. The method includes receiving an utterance from a user in a natural language dialog, receiving contextual information regarding the natural language dialog which is related to changes of emotion over time in the dialog, and detecting an emotion of the user based on the received contextual information. Examples of contextual information include, for example, differential statistics, joint statistics and distance statistics.
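A hedged sketch of using one differential statistic: compare the current utterance's features against the speaker's running baseline in the dialog. The single pitch feature and the thresholds are invented for illustration.

```python
# Sketch: detect emotion from how the current turn's acoustics deviate
# from the speaker's baseline over the dialog (a differential statistic).
from statistics import mean

class EmotionTracker:
    def __init__(self):
        self.pitch_history: list[float] = []

    def detect(self, pitch: float) -> str:
        baseline = mean(self.pitch_history) if self.pitch_history else pitch
        self.pitch_history.append(pitch)
        delta = pitch - baseline          # differential statistic
        if delta > 30.0:
            return "frustrated/aroused"
        if delta < -30.0:
            return "subdued"
        return "neutral"

tracker = EmotionTracker()
for turn_pitch in [180.0, 185.0, 178.0, 240.0]:
    print(tracker.detect(turn_pitch))  # last turn deviates sharply
```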
Type:
Application
Filed:
March 21, 2011
Publication date:
July 14, 2011
Applicant:
AT&T Corp.
Inventors:
Dilek Z. Hakkani-Tur, Jackson J. Liscombe, Giuseppe Riccardi
Abstract: Architecture that employs an overall grammar as a set of context-specific grammars for recognition of an input, each responsible for a specific context, such as subtask category, geographic region, etc. The grammars together cover the entire domain. Moreover, multiple recognitions can be run in parallel against the same input, where each recognition uses one or more of the context-specific grammars. The multiple intermediate recognition results from the different recognizer-grammars are reconciled by running re-recognition using a dynamically composed grammar based on the multiple recognition results and potentially other domain knowledge, or selecting the winner using a statistical classifier operating on classification features extracted from the multiple recognition results and other domain knowledge.
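The parallel-recognition-plus-selection flow might look like the sketch below, with keyword matchers standing in for real recognizer grammars and a crude overlap score standing in for the statistical classifier.

```python
# Illustrative sketch: run several context-specific "grammars" over the
# same input in parallel and pick a winner by score. The grammars and
# scoring are stand-ins, not the architecture's actual components.
from concurrent.futures import ThreadPoolExecutor

GRAMMARS = {
    "weather":    {"weather", "forecast", "rain"},
    "directions": {"route", "drive", "directions"},
}

def recognize(context: str, keywords: set[str], utterance: str):
    words = set(utterance.lower().split())
    return context, len(words & keywords)  # crude confidence score

def parallel_recognition(utterance: str):
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(
            lambda item: recognize(item[0], item[1], utterance),
            GRAMMARS.items()))
    return max(results, key=lambda r: r[1])  # reconcile: select the winner

print(parallel_recognition("what is the weather forecast for tomorrow"))
```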
Type:
Application
Filed:
June 4, 2009
Publication date:
December 9, 2010
Applicant:
Microsoft Corporation
Inventors:
Shuangyu Chang, Michael Levit, Bruce Buntschuh
Abstract: Among other things, techniques and systems are disclosed for implementing contextual voice commands. On a device, a data item in a first context is displayed. On the device, a physical input selecting the displayed data item in the first context is received. On the device, a voice input that relates the selected data item to an operation in a second context is received. The operation is performed on the selected data item in the second context.
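A toy sketch of the interaction: a physical selection fixes the data item in the first context, and the voice input names an operation belonging to a second context. The contexts and operations here are invented.

```python
# Sketch of a contextual voice command: item selected in one context,
# operation named by voice and performed in another context.
selected = {"context": "photos", "item": "IMG_0042.jpg"}

OPERATIONS = {  # second-context operations addressable by voice
    "email this": lambda item: f"attached {item} to a new email",
    "set as wallpaper": lambda item: f"set {item} as wallpaper",
}

def handle_voice(command: str) -> str:
    op = OPERATIONS.get(command.lower())
    return op(selected["item"]) if op else "command not recognized"

print(handle_voice("Email this"))  # operation performed in the mail context
```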
Type:
Application
Filed:
June 5, 2009
Publication date:
December 9, 2010
Applicant:
APPLE INC.
Inventors:
Marcel Van Os, Gregory Novick, Scott Herz
Abstract: A method for enhancing a media file to enable speech-recognition of spoken navigation commands can be provided. The method can include receiving a plurality of textual items based on subject matter of the media file and generating a grammar for each textual item, thereby generating a plurality of grammars for use by a speech recognition engine. The method can further include associating a time stamp with each grammar, wherein a time stamp indicates a location in the media file of a textual item corresponding with a grammar. The method can further include associating the plurality of grammars with the media file, such that speech recognized by the speech recognition engine is associated with a corresponding location in the media file.
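The association step amounts to indexing (grammar, time stamp) pairs; a minimal sketch follows, with plain phrase strings standing in for the generated grammars.

```python
# Sketch of the association step: one (grammar, time stamp) entry per
# textual item, so a recognized phrase maps back to a media position.
from dataclasses import dataclass

@dataclass
class GrammarEntry:
    phrase: str       # stand-in for a generated grammar
    timestamp: float  # seconds into the media file

def build_index(items: list[tuple[str, float]]) -> list[GrammarEntry]:
    return [GrammarEntry(text.lower(), ts) for text, ts in items]

def seek_for(recognized: str, index: list[GrammarEntry]) -> float | None:
    for entry in index:
        if entry.phrase == recognized.lower():
            return entry.timestamp
    return None

index = build_index([("opening remarks", 0.0), ("question and answer", 1525.0)])
print(seek_for("Question and Answer", index))  # jump to 1525.0 s
```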
Type:
Application
Filed:
July 28, 2008
Publication date:
January 14, 2010
Applicant:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Abstract: In embodiments of the present invention, improved capabilities are described for a mobile environment speech processing facility. The present invention may provide for the entering of text into a search software application resident on a mobile communication facility, where speech may be recorded using the mobile communications facility's resident capture facility. Transmission of the recording may be provided through a wireless communication facility to a speech recognition facility. Results may be generated utilizing the speech recognition facility that may be independent of structured grammar, and may be based at least in part on the information relating to the recording. The results may then be transmitted to the mobile communications facility, where they may be loaded into the search software application. In embodiments, the user may be allowed to alter the results that are received from the speech recognition facility. In addition, the speech recognition facility may be adapted based on usage.
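A hedged sketch of the round trip this abstract (and the two closely related ones below) describes: capture on the device, recognize remotely without a structured grammar, load the text into the application, and let the user alter the result. Every function body here is a placeholder.

```python
# Sketch of the mobile speech round trip; all functions are placeholders
# for the device capture facility and the remote recognition facility.
def capture_audio() -> bytes:
    return b"...pcm audio..."          # resident capture facility

def remote_recognize(audio: bytes) -> str:
    return "pizza near me"             # unstructured (non-grammar) result

def run_search_flow(user_edit=None) -> str:
    text = remote_recognize(capture_audio())
    if user_edit:                      # user may alter the returned results
        text = user_edit(text)
    return f"searching for: {text}"

print(run_search_flow())
print(run_search_flow(user_edit=lambda t: t.replace("pizza", "sushi")))
```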
Type:
Application
Filed:
October 3, 2007
Publication date:
September 11, 2008
Inventors:
Joseph Cerra, Roman V. Kishchenko, John N. Nguyen, Michael S. Phillips, Han Shu
Abstract: In embodiments of the present invention, improved capabilities are described for a mobile environment speech processing facility. The present invention may provide for the entering of text into a software application resident on a mobile communication facility, where speech may be recorded by the user using the mobile communications facility's resident capture facility. Transmission of the recording may be provided through a wireless communication facility to a speech recognition facility, and may be accompanied by information related to the software application. Results may be generated utilizing the speech recognition facility that may be independent of structured grammar, and may be based at least in part on the information relating to the software application and the recording. The results may then be transmitted to the mobile communications facility, where they may be loaded into the software application.
Type:
Application
Filed:
October 1, 2007
Publication date:
September 11, 2008
Inventors:
Joseph P. Cerra, Roman V. Kishchenko, John N. Nguyen, Michael S. Phillips, Han Shu
Abstract: In embodiments of the present invention, improved capabilities are described for a mobile environment speech processing facility. The present invention may provide for the entering of text into a messaging software application resident on a mobile communication facility, where speech may be recorded using the mobile communications facility's resident capture facility. Transmission of the recording may be provided through a wireless communication facility to a speech recognition facility. Results may be generated utilizing the speech recognition facility that may be independent of structured grammar, and may be based at least in part on the information relating to the recording. The results may then be transmitted to the mobile communications facility, where they may be loaded into the messaging software application. In embodiments, the user may be allowed to alter the results that are received from the speech recognition facility. In addition, the speech recognition facility may be adapted based on usage.
Type:
Application
Filed:
October 3, 2007
Publication date:
September 11, 2008
Inventors:
Joseph P. Cerra, Roman V. Kishchenko, John N. Nguyen, Michael S. Phillips, Han Shu
Abstract: A system, apparatus and method for transmitting and receiving messages over a wireless communications network, the communicating device including a computer platform having storage for one or more programs, a user interface that includes a visual display for displaying at least alphanumeric characters and a microphone for inputting speech of a user of the computerized communicating device, a trackball module for inputting at least alphanumeric characters, a sensor for obtaining biodata from a user of the computerized paging device, and a speech translation program resident and selectively executable on the computer platform, wherein, upon initiating a message for transmission, the speech translation software interprets the words of the user and translates them into a digital text format; the speech translation program may include an electronic dictionary that identifies a word by comparing an electronic signature of the word to a plurality of electronic signatures sto
Abstract: A method and system may provide an interface (e.g., "API"), client side software module or other process that may accept client input defining a playback environment, such as a speech output interface, accept client input selecting preprogrammed functionality for operating the speech playback environment, accept client input tailoring the preprogrammed functionality based on the client input, create the speech playback environment, and create embedded code to embed the speech playback environment within a website for providing speech output. A method and system may provide a website including website code controlling the operation of the website and plug-in code providing preprogrammed functionality for operating an embedded speech playback environment, where the plug-in code is tailored by a client, where the website code is to query the plug-in code for speech requests and requests for preprogrammed functionality in addition to speech functionality.
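A rough sketch of such an interface: the client tailors preprogrammed options, and the call returns embed code for the website. The option names and script URL are invented for illustration.

```python
# Illustrative sketch: assemble client-tailored playback options and emit
# an embed snippet, loosely mirroring the interface the abstract describes.
import json

def create_embed(voice: str = "default", autoplay: bool = False,
                 rate: float = 1.0) -> str:
    """Return HTML that embeds a configured speech playback environment."""
    config = json.dumps({"voice": voice, "autoplay": autoplay, "rate": rate})
    return ('<script src="https://example.com/speech-plugin.js"></script>\n'
            f"<script>SpeechPlayback.init({config});</script>")

print(create_embed(voice="en-GB", rate=0.9))
```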