Using Context Dependencies, E.g., Language Models, Etc. (epo) Patents (Class 704/E15.019)
  • Publication number: 20140012575
    Abstract: In some embodiments, the recognition results produced by a speech processing system (which may include two or more recognition results, including a top recognition result and one or more alternative recognition results) based on an analysis of a speech input, are evaluated for indications of potential significant errors. In some embodiments, the recognition results may be evaluated to determine whether a meaning of any of the alternative recognition results differs from a meaning of the top recognition result in a manner that is significant for a domain, such as the medical domain. In some embodiments, words and/or phrases that may be confused by an ASR system may be determined and associated in sets of words and/or phrases. Words and/or phrases that may be determined include those that change a meaning of a phrase or sentence when included in the phrase/sentence.
    Type: Application
    Filed: July 9, 2012
    Publication date: January 9, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: William F. Ganong, III, Raghu Vemula, Robert Fleming
  • Publication number: 20130262114
    Abstract: Different advantageous embodiments provide a crowdsourcing method for modeling user intent in conversational interfaces. One or more stimuli are presented to a plurality of describers. One or more sets of describer data are captured from the plurality of describers using a data collection mechanism. The one or more sets of describer data are processed to generate one or more models. Each of the one or more models is associated with a specific stimulus from the one or more stimuli.
    Type: Application
    Filed: April 3, 2012
    Publication date: October 3, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Christopher John Brockett, Piali Choudhury, William Brennan Dolan, Yun-Cheng Ju, Patrick Pantel, Noelle Mallory Sophy, Svitlana Volkova
  • Publication number: 20130117024
    Abstract: A method, system and computer program product for recognizing terms in a specified corpus. In one embodiment, the method comprises providing a set of known terms t ? T, each of the known terms t belonging to a set of types ? (t)={?1, . . . }, wherein each of the terms is comprised of a list of words, t=w1, w2, . . . , wn, and the union of all the words for all the terms is a word set W. The method further comprises using the set of terms T and the set of types to determine a set of pattern-to-type mappings p??; and using the set of pattern-to-type mappings to recognize terms in the specified corpus and, for each of the recognized terms in the specified corpus, to recognize one or more of the types ? for said each recognized term.
    Type: Application
    Filed: November 2, 2012
    Publication date: May 9, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: International Business Machines Corporation
  • Publication number: 20110172999
    Abstract: A system, method and computer-readable medium for practicing a method of emotion detection during a natural language dialog between a human and a computing device are disclosed. The method includes receiving an utterance from a user in a natural language dialog, receiving contextual information regarding the natural language dialog which is related to changes of emotion over time in the dialog, and detecting an emotion of the user based on the received contextual information. Examples of contextual information include, for example, differential statistics, joint statistics and distance statistics.
    Type: Application
    Filed: March 21, 2011
    Publication date: July 14, 2011
    Applicant: AT&T Corp.
    Inventors: Dilek Z. Hakkani-Tur, Jackson J. Liscombe, Guiseppe Riccardi
  • Publication number: 20100312546
    Abstract: Architecture that employs an overall grammar as a set of context-specific grammars for recognition of an input, each responsible for a specific context, such as subtask category, geographic region, etc. The grammars together cover the entire domain. Moreover, multiple recognitions can be run in parallel against the same input, where each recognition uses one or more of the context-specific grammars. The multiple intermediate recognition results from the different recognizer-grammars are reconciled by running re-recognition using a dynamically composed grammar based on the multiple recognition results and potentially other domain knowledge, or selecting the winner using a statistical classifier operating on classification features extracted from the multiple recognition results and other domain knowledge.
    Type: Application
    Filed: June 4, 2009
    Publication date: December 9, 2010
    Applicant: Microsoft Corporation
    Inventors: Shuangyu Chang, Michael Levit, Bruce Buntschuh
  • Publication number: 20100312547
    Abstract: Among other things, techniques and systems are disclosed for implementing contextual voice commands. On a device, a data item in a first context is displayed. On the device, a physical input selecting the displayed data item in the first context is received. On the device, a voice input that relates the selected data item to an operation in a second context is received. The operation is performed on the selected data item in the second context.
    Type: Application
    Filed: June 5, 2009
    Publication date: December 9, 2010
    Applicant: APPLE INC.
    Inventors: Marcel Van Os, Gregory Novick, Scott Herz
  • Publication number: 20100010814
    Abstract: A method for enhancing a media file to enable speech-recognition of spoken navigation commands can be provided. The method can include receiving a plurality of textual items based on subject matter of the media file and generating a grammar for each textual item, thereby generating a plurality of grammars for use by a speech recognition engine. The method can further include associating a time stamp with each grammar, wherein a time stamp indicates a location in the media file of a textual item corresponding with a grammar. The method can further include associating the plurality of grammars with the media file, such that speech recognized by the speech recognition engine is associated with a corresponding location in the media file.
    Type: Application
    Filed: July 28, 2008
    Publication date: January 14, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Paritosh D. Patel
  • Publication number: 20080221901
    Abstract: In embodiments of the present invention improved capabilities are described for a mobile environment speech processing facility. The present invention may provide for the entering of text into a search software application resident on a mobile communication facility, where speech may be recorded using the mobile communications facility's resident capture facility. Transmission of the recording may be provided through a wireless communication facility to a speech recognition facility. Results may be generated utilizing the speech recognition facility that may be independent of structured grammar, and may be based at least in part on the information relating to the recording. The results may then be transmitted to the mobile communications facility, where they may be loaded into the search software application. In embodiments, the user may be allowed to alter the results that are received from the speech recognition facility. In addition, the speech recognition facility may be adapted based on usage.
    Type: Application
    Filed: October 3, 2007
    Publication date: September 11, 2008
    Inventors: Joseph Cerra, Roman V. Kishchenko, John N. Nguyen, Michael S. Phillips, Han Shu
  • Publication number: 20080221899
    Abstract: In embodiments of the present invention improved capabilities are described for a mobile environment speech processing facility. The present invention may provide for the entering of text into a messaging software application resident on a mobile communication facility, where speech may be recorded using the mobile communications facility's resident capture facility. Transmission of the recording may be provided through a wireless communication facility to a speech recognition facility. Results may be generated utilizing the speech recognition facility that may be independent of structured grammar, and may be based at least in part on the information relating to the recording. The results may then be transmitted to the mobile communications facility, where they may be loaded into the messaging software application. In embodiments, the user may be allowed to alter the results that are received from the speech recognition facility. In addition, the speech recognition facility may be adapted based on usage.
    Type: Application
    Filed: October 3, 2007
    Publication date: September 11, 2008
    Inventors: Joseph P. Cerra, Roman V. Kishchenko, John N. Nguyen, Michael S. Phillips, Han Shu
  • Publication number: 20080221897
    Abstract: In embodiments of the present invention improved capabilities are described for a mobile environment speech processing facility. The present invention may provide for the entering of text into a software application resident on a mobile communication facility, where recorded speech may be presented by the user using the mobile communications facility's resident capture facility. Transmission of the recording may be provided through a wireless communication facility to a speech recognition facility, and may be accompanied by information related to the software application. Results may be generated utilizing the speech recognition facility that may be independent of structured grammar, and may be based at least in part on the information relating to the software application and the recording. The results may then be transmitted to the mobile communications facility, where they may be loaded into the software application.
    Type: Application
    Filed: October 1, 2007
    Publication date: September 11, 2008
    Inventors: Joseph P. Cerra, Roman V. Kishchenko, John N. Nguyen, Michael S. Phillips, Han Shu
  • Publication number: 20080147409
    Abstract: A system, apparatus and method for transmitting and receiving messages over a wireless communications network, the communicating device including a computer platform having storage for one or more programs, a user interface that includes a visual display for displaying at least alphanumeric characters and a microphone for inputting speech of a user of the computerized communicating device a trackball module, the trackball module for inputting at least alphanumeric characters, a sensor for obtaining biodata from a user of the computerized paging device, and a speech translation program resident and selectively executable on the computer platform, whereupon initiating a message for transmission, the speech translation software interpreting the words of the user and translating them into a digital text format, the speech translation program may include an electronic dictionary, the electronic dictionary identifies a word by comparing an electronic signature of the word to a plurality of electronic signatures sto
    Type: Application
    Filed: December 18, 2006
    Publication date: June 19, 2008
    Inventor: Robert Taormina
  • Publication number: 20080126095
    Abstract: A method and system may provide an interface (e.g., “API”), client side software module or other process that may accept client input defining a playback environment, such as a speech output interface, accept client input selecting preprogrammed functionality for operating the speech playback environment, accept client input tailoring the preprogrammed functionality based on the client input, create the speech playback environment, and create embedded code to embed the speech playback environment within a website for providing speech output. A method and system may provide a website including web-site code controlling the operation of the website and plug-in code providing preprogrammed functionality for operating an embedded speech playback environment, where the plug-in code is tailored by a client, where the web-site code is to query the plug-in code for speech requests and requests for preprogrammed functionality in addition to speech functionality.
    Type: Application
    Filed: October 26, 2007
    Publication date: May 29, 2008
    Inventor: Gil Sideman