Patents by Inventor Yun-Cheng Ju
Yun-Cheng Ju has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20150019216
Abstract: Described herein are various technologies pertaining to performing an operation relative to tabular data based upon voice input. An ASR system includes a language model that is customized based upon content of the tabular data. The ASR system receives a voice signal that is representative of speech of a user. The ASR system creates a transcription of the voice signal based upon the ASR system being customized with the content of the tabular data. The operation relative to the tabular data is performed based upon the transcription of the voice signal.
Type: Application
Filed: May 21, 2014
Publication date: January 15, 2015
Applicant: Microsoft Corporation
Inventors: Prabhdeep Singh, Kris Ganjam, Sumit Gulwani, Mark Marron, Yun-Cheng Ju, Kaushik Chakrabarti
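The idea in this abstract can be illustrated with a minimal sketch: phrases harvested from the table seed the recognizer's vocabulary, and a transcription is mapped back onto the table. The function names (`build_vocabulary`, `interpret`) and the toy table are assumptions for illustration, not the patented implementation.

```python
# Sketch: biasing a recognizer with phrases drawn from tabular data,
# then mapping a transcription onto the table. Illustrative only.

def build_vocabulary(table):
    """Collect headers and cell values as candidate recognition phrases
    that would be used to customize the ASR language model."""
    phrases = set(h.lower() for h in table["headers"])
    for row in table["rows"]:
        phrases.update(str(cell).lower() for cell in row)
    return phrases

def interpret(transcript, table):
    """Perform a simple operation: return the column named in the command."""
    words = transcript.lower().split()
    for i, header in enumerate(table["headers"]):
        if header.lower() in words:
            return [row[i] for row in table["rows"]]
    return None

table = {"headers": ["city", "population"],
         "rows": [["Seattle", 737015], ["Redmond", 73256]]}
print(interpret("show me the population column", table))
```

In a real system the vocabulary would be compiled into the recognizer's grammar before decoding; here it simply demonstrates where the table content enters the pipeline.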
-
Method For Finding Elements In A Webpage Suitable For Use In A Voice User Interface (Disambiguation)
Publication number: 20140350941
Abstract: A disambiguation process for a voice interface for web pages or other documents. The process identifies interactive elements such as links, obtains one or more phrases of each interactive element, such as link text, title text and alternative text for images, and adds the phrases to a grammar which is used for speech recognition. A group of interactive elements are identified as potential best matches to a voice command when there is no single, clear best match. The disambiguation process modifies a display of the document to provide unique labels for each interactive element in the group, and the user is prompted to provide a subsequent spoken command to identify one of the unique labels. The selected unique label is identified and a click event is generated for the corresponding interactive element.
Type: Application
Filed: May 21, 2013
Publication date: November 27, 2014
Applicant: Microsoft Corporation
Inventors: Andrew Stephen Zeigler, Michael H. Kim, Rodger Benson, Raman Sarin, Yun-Cheng Ju
-
Publication number: 20140350928
Abstract: A voice interface for web pages or other documents identifies interactive elements such as links, obtains one or more phrases of each interactive element, such as link text, title text and alternative text for images, and adds the phrases to a grammar which is used for speech recognition. A click event is generated for an interactive element having a phrase which is a best match for the voice command of a user. In one aspect, the phrases of currently-displayed elements of the document are used for speech recognition. In another aspect, phrases which are not displayed, such as title text and alternative text for images, are used in the grammar. In another aspect, updates to the document are detected and the grammar is updated accordingly so that the grammar is synchronized with the current state of the document.
Type: Application
Filed: May 21, 2013
Publication date: November 27, 2014
Applicant: Microsoft Corporation
Inventors: Andrew Stephen Zeigler, Michael H. Kim, Rodger Benson, Raman Sarin, Yun-Cheng Ju
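The grammar-building step described above can be sketched as follows: each interactive element contributes its link text, title text, and image alt text to a phrase-to-element map, and the element whose phrase best matches a voice command receives the click. The element records and the word-overlap scoring are simplified assumptions, not the patented method.

```python
# Sketch: build a speech grammar from a page's interactive elements and
# route a recognized command to the best-matching element. Illustrative.

def build_grammar(elements):
    """Map each phrase (link text, title, alt text) to its element id."""
    grammar = {}
    for elem in elements:
        for phrase in (elem.get("text"), elem.get("title"), elem.get("alt")):
            if phrase:
                grammar[phrase.lower()] = elem["id"]
    return grammar

def click_for_command(command, grammar):
    """Return the id of the element whose phrase best overlaps the command,
    or None when nothing matches (a real system would disambiguate here)."""
    command_words = set(command.lower().split())
    best = max(grammar, key=lambda p: len(set(p.split()) & command_words),
               default=None)
    if best and set(best.split()) & command_words:
        return grammar[best]
    return None

elements = [{"id": "link1", "text": "Sign in", "title": None, "alt": None},
            {"id": "link2", "text": None, "title": "Weather forecast", "alt": None}]
grammar = build_grammar(elements)
print(click_for_command("open the weather forecast", grammar))
```

Re-running `build_grammar` whenever the DOM changes corresponds to the synchronization aspect mentioned in the abstract.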
-
Patent number: 8838449
Abstract: This document describes word-dependent language models, as well as their creation and use. A word-dependent language model can permit a speech-recognition engine to accurately verify that a speech utterance matches a multi-word phrase. This is useful in many contexts, including those where one or more letters of the expected phrase are known to the speaker.
Type: Grant
Filed: December 23, 2010
Date of Patent: September 16, 2014
Assignee: Microsoft Corporation
Inventors: Yun-Cheng Ju, Ivan J. Tashev, Chad R. Heinemann
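A toy verifier in the spirit of this abstract: each position in the expected phrase accepts the expected word or a small per-word set of plausible alternates, which is what makes the model "word-dependent". The alternate sets here are invented for illustration; the patent's actual model construction is not reproduced.

```python
# Sketch: position-by-position verification of an utterance against an
# expected multi-word phrase, with word-dependent alternates. Illustrative.

def verify(utterance, expected, alternates=None):
    """Return True if every spoken word is acceptable at its position."""
    alternates = alternates or {}
    spoken = utterance.lower().split()
    target = expected.lower().split()
    if len(spoken) != len(target):
        return False
    return all(s == t or s in alternates.get(t, ())
               for s, t in zip(spoken, target))

print(verify("pair a new phone", "pair a new phone"))
print(verify("pair the new phone", "pair a new phone",
             alternates={"a": {"the"}}))
print(verify("call a new phone", "pair a new phone"))
```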
-
Publication number: 20140244254
Abstract: A development system is described for facilitating the development of a spoken natural language (SNL) interface. The development system receives seed templates from a developer, each of which provides a command phrasing that can be used to invoke a function, when spoken by an end user. The development system then uses one or more development resources, such as a crowdsourcing system and a paraphrasing system, to provide additional templates. This yields an extended set of templates. A generation system then generates one or more models based on the extended set of templates. A user device may install the model(s) for use in interpreting commands spoken by an end user. When the user device recognizes a command, it may automatically invoke a function associated with that command. Overall, the development system provides an easy-to-use tool for producing an SNL interface.
Type: Application
Filed: February 25, 2013
Publication date: August 28, 2014
Applicant: Microsoft Corporation
Inventors: Yun-Cheng Ju, Matthai Philipose, Seungyeop Han
-
Patent number: 8793130
Abstract: A method of generating a confidence measure generator is provided for use in a voice search system, the voice search system including voice search components comprising a speech recognition system, a dialog manager and a search system. The method includes selecting voice search features, from a plurality of the voice search components, to be considered by the confidence measure generator in generating a voice search confidence measure. The method includes training a model, using a computer processor, to generate the voice search confidence measure based on selected voice search features.
Type: Grant
Filed: March 23, 2012
Date of Patent: July 29, 2014
Assignee: Microsoft Corporation
Inventors: Ye-Yi Wang, Yun-Cheng Ju, Dong Yu
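The combination step can be sketched with a logistic model over heterogeneous features from the recognizer, the search component, and the dialog manager. The feature names and weights below are made-up stand-ins for parameters a real system would learn from labeled voice-search sessions; this is not the patented training procedure.

```python
import math

# Sketch: fuse per-component voice-search features into one confidence
# score with a logistic model. Weights are illustrative, not trained.

def confidence(features, weights, bias):
    """Logistic combination of heterogeneous voice-search features."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

weights = {"asr_score": 2.0, "search_top_gap": 1.5, "dialog_turn_ok": 0.8}
features = {"asr_score": 0.9, "search_top_gap": 0.6, "dialog_turn_ok": 1.0}
score = confidence(features, weights, bias=-1.5)
print(round(score, 3))  # ≈ 0.881
```

A dialog manager could then threshold this score to decide whether to present results, confirm, or re-prompt.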
-
Patent number: 8615388
Abstract: Training data may be provided, the training data including pairs of source phrases and target phrases. The pairs may be used to train an intra-language statistical machine translation model, where the intra-language statistical machine translation model, when given an input phrase of text in the human language, can compute probabilities of semantic equivalence of the input phrase to possible translations of the input phrase in the human language. The statistical machine translation model may be used to translate between queries and listings. The queries may be text strings in the human language submitted to a search engine. The listing strings may be text strings of formal names of real world entities that are to be searched by the search engine to find matches for the query strings.
Type: Grant
Filed: March 28, 2008
Date of Patent: December 24, 2013
Assignee: Microsoft Corporation
Inventors: Xiao Li, Yun-Cheng Ju, Geoffrey Zweig, Alex Acero
-
Patent number: 8589157
Abstract: An automated "Voice Search Message Service" provides a voice-based user interface for generating text messages from an arbitrary speech input. Specifically, the Voice Search Message Service provides a voice-search information retrieval process that evaluates user speech inputs to select one or more probabilistic matches from a database of pre-defined or user-defined text messages. These probabilistic matches are also optionally sorted in terms of relevancy. A single text message from the probabilistic matches is then selected and automatically transmitted to one or more intended recipients. Optionally, one or more of the probabilistic matches are presented to the user for confirmation or selection prior to transmission. Correction or recovery of speech recognition errors is avoided since the probabilistic matches are intended to paraphrase the user speech input rather than exactly reproduce that speech, though exact matches are possible.
Type: Grant
Filed: December 5, 2008
Date of Patent: November 19, 2013
Assignee: Microsoft Corporation
Inventors: Yun-Cheng Ju, Ye-Yi Wang
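The retrieval step can be sketched as ranking a set of canned messages by word overlap with the recognized speech and returning the top candidates for confirmation. The message set and the overlap scoring are illustrative assumptions; the patent describes a probabilistic voice-search process, not this particular metric.

```python
# Sketch: rank pre-defined text messages as paraphrases of a spoken
# input by simple word overlap. Illustrative scoring only.

def rank_messages(utterance, messages, top_n=3):
    """Score each canned message by its overlap with the recognized words."""
    spoken = set(utterance.lower().split())
    scored = []
    for msg in messages:
        words = set(msg.lower().split())
        overlap = len(spoken & words) / max(len(words), 1)
        scored.append((overlap, msg))
    scored.sort(key=lambda x: -x[0])
    return [msg for score, msg in scored[:top_n] if score > 0]

messages = ["Running late, be there soon", "On my way", "Call you later"]
print(rank_messages("I'm running a bit late", messages))
```

Because the goal is a paraphrase rather than a transcript, a recognition error only matters if it changes which canned message ranks first, which is the error-robustness point the abstract makes.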
-
Publication number: 20130262114
Abstract: Different advantageous embodiments provide a crowdsourcing method for modeling user intent in conversational interfaces. One or more stimuli are presented to a plurality of describers. One or more sets of describer data are captured from the plurality of describers using a data collection mechanism. The one or more sets of describer data are processed to generate one or more models. Each of the one or more models is associated with a specific stimulus from the one or more stimuli.
Type: Application
Filed: April 3, 2012
Publication date: October 3, 2013
Applicant: Microsoft Corporation
Inventors: Christopher John Brockett, Piali Choudhury, William Brennan Dolan, Yun-Cheng Ju, Patrick Pantel, Noelle Mallory Sophy, Svitlana Volkova
-
Publication number: 20130159000
Abstract: The subject disclosure is directed towards training a classifier for spoken utterances without relying on human assistance. The spoken utterances may be related to a voice menu program for which a speech comprehension component interprets the spoken utterances into voice menu options. The speech comprehension component provides confirmations to some of the spoken utterances in order to accurately assign a semantic label. For each spoken utterance with a denied confirmation, the speech comprehension component automatically generates a pseudo-semantic label that is consistent with the denied confirmation and selected from a set of potential semantic labels and updates a classification model associated with the classifier using the pseudo-semantic label.
Type: Application
Filed: December 15, 2011
Publication date: June 20, 2013
Applicant: Microsoft Corporation
Inventors: Yun-Cheng Ju, James Garnet Droppo, III
-
Patent number: 8433576
Abstract: A novel system for automatic reading tutoring provides effective error detection and reduced false alarms combined with low processing time burdens and response times short enough to maintain a natural, engaging flow of interaction. According to one illustrative embodiment, an automatic reading tutoring method includes displaying a text output and receiving an acoustic input. The acoustic input is modeled with a domain-specific target language model specific to the text output, and with a general-domain garbage language model, both of which may be efficiently constructed as context-free grammars. The domain-specific target language model may be built dynamically or "on-the-fly" based on the currently displayed text (e.g. the story to be read by the user), while the general-domain garbage language model is shared among all different text outputs. User-perceptible tutoring feedback is provided based on the target language model and the garbage language model.
Type: Grant
Filed: January 19, 2007
Date of Patent: April 30, 2013
Assignee: Microsoft Corporation
Inventors: Xiaolong Li, Yun-Cheng Ju, Li Deng, Alejandro Acero
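The feedback step can be sketched by aligning the recognized words against the displayed story text and flagging words the reader skipped or misread. This greedy alignment is a simplification standing in for the grammar-based target/garbage modeling the abstract describes; the function name and example sentence are illustrative.

```python
# Sketch: flag each displayed story word as read correctly or not by
# greedily aligning the recognizer output to the text. Illustrative.

def reading_feedback(story_text, recognized):
    """Return (word, read_correctly) pairs for the displayed text."""
    target = story_text.lower().split()
    heard = recognized.lower().split()
    feedback, j = [], 0
    for word in target:
        if j < len(heard) and heard[j] == word:
            feedback.append((word, True))
            j += 1
        else:
            feedback.append((word, False))  # skipped or misread word
    return feedback

# The reader skips the word "sat":
print(reading_feedback("the cat sat on the mat", "the cat on the mat"))
```

In the patented design, words rejected by the target grammar but absorbed by the garbage grammar are what distinguish a genuine misreading from recognizer noise, which is how false alarms are reduced.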
-
Publication number: 20130090921
Abstract: Systems and methods are described for adding entries to a custom lexicon used by a speech recognition engine of a speech interface in response to user interaction with the speech interface. In one embodiment, a speech signal is obtained when the user speaks a name of a particular item to be selected from among a finite set of items. If a phonetic description of the speech signal is not recognized by the speech recognition engine, then the user is presented with a means for selecting the particular item from among the finite set of items by providing input in a manner that does not include speaking the name of the item. After the user has selected the particular item via the means for selecting, the phonetic description of the speech signal is stored in association with a text description of the particular item in the custom lexicon.
Type: Application
Filed: October 7, 2011
Publication date: April 11, 2013
Applicant: Microsoft Corporation
Inventors: Wei-Ting Frank Liu, Andrew Lovitt, Stefanie Tomko, Yun-Cheng Ju
-
Patent number: 8364487
Abstract: A language processing system may determine a display form of a spoken word by analyzing the spoken form using a language model that includes dictionary entries for display forms of homonyms. The homonyms may include trade names as well as given names and other phrases. The language processing system may receive spoken language and produce a display form of the language while displaying the proper form of the homonym. Such a system may be used in search systems where audio input is converted to a graphical display of a portion of the spoken input.
Type: Grant
Filed: October 21, 2008
Date of Patent: January 29, 2013
Assignee: Microsoft Corporation
Inventors: Yun-Cheng Ju, Julian J. Odell
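At its simplest, the dictionary-entry idea reduces to a lexicon mapping a normalized spoken form to the preferred written form, so trade names display correctly even when they sound like ordinary words. The entries below are illustrative assumptions, not the patent's lexicon.

```python
# Sketch: map a recognized spoken phrase to its preferred display form
# (e.g. a trade name). The lexicon entries are illustrative.

DISPLAY_FORMS = {
    "crispy cream": "Krispy Kreme",
    "play dough": "Play-Doh",
}

def display_form(spoken_phrase):
    """Return the preferred written form of a spoken phrase, if known."""
    key = spoken_phrase.lower().strip()
    return DISPLAY_FORMS.get(key, spoken_phrase)

print(display_form("Crispy Cream"))  # Krispy Kreme
print(display_form("coffee shop"))   # unchanged: no homonym entry
```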
-
Publication number: 20120323967
Abstract: A multimedia system configured to receive user input in the form of a spelled character sequence is provided. In one implementation, a spell mode is initiated, and a user spells a character sequence. The multimedia system performs spelling recognition and recognizes a sequence of character representations having a possible ambiguity resulting from any user and/or system errors. The sequence of character representations with the possible ambiguity yields multiple search keys. The multimedia system performs a fuzzy pattern search by scoring each target item from a finite dataset of target items based on the multiple search keys. One or more relevant items are ranked and presented to the user for selection, each relevant item being a target item that exceeds a relevancy threshold. The user selects the intended character sequence from the one or more relevant items.
Type: Application
Filed: June 14, 2011
Publication date: December 20, 2012
Applicant: Microsoft Corporation
Inventors: Yun-Cheng Ju, Ivan J. Tashev, Xiao Li, Dax Hawkins, Thomas Soemo, Michael H. Kim
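The fuzzy pattern search can be sketched as follows: the spelling recognizer yields several candidate character sequences (the multiple search keys), and each catalog item is scored by its best edit distance to any key. The distance threshold, key list, and target items are illustrative assumptions.

```python
# Sketch: score catalog items against several ambiguous spelled-input
# keys using edit distance, keeping items under a threshold. Illustrative.

def edit_distance(a, b):
    """Classic Levenshtein distance via a rolling DP row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def best_matches(search_keys, targets, max_distance=1):
    """Rank targets by their closest distance to any recognized key."""
    scored = [(min(edit_distance(key, t) for key in search_keys), t)
              for t in targets]
    return [t for d, t in sorted(scored) if d <= max_distance]

# Neither key is exactly right, but the intended item is still recovered:
keys = ["halp", "hale"]
print(best_matches(keys, ["halo", "fable", "kinect"]))
```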
-
Patent number: 8306822
Abstract: A method of providing automatic reading tutoring is disclosed. The method includes retrieving a textual indication of a story from a data store and creating a language model including constructing a target context free grammar indicative of a first portion of the story. A first acoustic input is received and a speech recognition engine is employed to recognize the first acoustic input. An output of the speech recognition engine is compared to the language model and a signal indicative of whether the output of the speech recognition engine matches at least a portion of the target context free grammar is provided.
Type: Grant
Filed: September 11, 2007
Date of Patent: November 6, 2012
Assignee: Microsoft Corporation
Inventors: Xiaolong Li, Li Deng, Yun-Cheng Ju, Alex Acero
-
Patent number: 8285542
Abstract: A statistical language model is trained for use in a directory assistance system using the data in a directory assistance listing corpus. Calculations are made to determine how important words in the corpus are in distinguishing a listing from other listings, and how likely words are to be omitted or added by a user. The language model is trained using these calculations.
Type: Grant
Filed: February 15, 2011
Date of Patent: October 9, 2012
Assignee: Microsoft Corporation
Inventors: Dong Yu, Alejandro Acero, Yun-Cheng Ju
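The "how important is this word for distinguishing a listing" calculation can be sketched with an IDF-style statistic: a word appearing in few listings gets a high weight, one appearing in many gets a low weight. This stands in for, and does not reproduce, the patented training procedure; the listing corpus is illustrative.

```python
import math

# Sketch: weight listing words by how well they distinguish one listing
# from the rest (an IDF-style statistic). Illustrative corpus.

def distinctiveness(listings):
    """Return log(N / document_frequency) per word: rarer = more distinctive."""
    n = len(listings)
    doc_freq = {}
    for listing in listings:
        for word in set(listing.lower().split()):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return {w: math.log(n / df) for w, df in doc_freq.items()}

listings = ["Joe's Pizza", "Joe's Hardware", "City Pizza", "City Library"]
w = distinctiveness(listings)
print(w["library"] > w["pizza"])  # "library" is the more distinctive word
```

In the patented system such weights, together with estimates of which words callers tend to drop ("Joe's" rather than "Joe's Pizza"), shape the n-gram probabilities of the directory-assistance language model.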
-
Publication number: 20120185252
Abstract: A method of generating a confidence measure generator is provided for use in a voice search system, the voice search system including voice search components comprising a speech recognition system, a dialog manager and a search system. The method includes selecting voice search features, from a plurality of the voice search components, to be considered by the confidence measure generator in generating a voice search confidence measure. The method includes training a model, using a computer processor, to generate the voice search confidence measure based on selected voice search features.
Type: Application
Filed: March 23, 2012
Publication date: July 19, 2012
Applicant: Microsoft Corporation
Inventors: Ye-Yi Wang, Yun-Cheng Ju, Dong Yu
-
Publication number: 20120166196
Abstract: This document describes word-dependent language models, as well as their creation and use. A word-dependent language model can permit a speech-recognition engine to accurately verify that a speech utterance matches a multi-word phrase. This is useful in many contexts, including those where one or more letters of the expected phrase are known to the speaker.
Type: Application
Filed: December 23, 2010
Publication date: June 28, 2012
Applicant: Microsoft Corporation
Inventors: Yun-Cheng Ju, Ivan J. Tashev, Chad R. Heinemann
-
Publication number: 20120109994
Abstract: A data-retrieval method for use on a portable electronic device. The method comprises receiving a query string at a user interface of the device and displaying one or more index strings on the user interface such that the relative prominence of each index string displayed increases with increasing resemblance of that index string to the query string. The method further comprises displaying an index string with greater prominence when a fixed-length substring of the query string occurs anywhere in the index string, regardless of position. In this manner, the relevance of prominently displayed index strings increases as more characters are appended to the query string, even if the query string contains errors.
Type: Application
Filed: October 28, 2010
Publication date: May 3, 2012
Applicant: Microsoft Corporation
Inventors: Yun-Cheng Ju, Frank Liu, Yen-Tsang Lee, Jason Farmer, Ted E. Dinklocker
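The substring-anywhere idea can be sketched by counting how many fixed-length chunks of the query occur at any position in each index string, so a match in the middle of a name counts just as much as a prefix match. The chunk length of 3 and the contact names are illustrative choices, not values from the patent.

```python
# Sketch: rank index strings by how many fixed-length query chunks they
# contain anywhere, regardless of position. Illustrative parameters.

def prominence(query, index_string, chunk_len=3):
    """Count fixed-length query chunks found anywhere in the index string."""
    q, s = query.lower(), index_string.lower()
    chunks = [q[i:i + chunk_len] for i in range(len(q) - chunk_len + 1)]
    return sum(1 for c in chunks if c in s)

contacts = ["Jason Farmer", "Ted Dinklocker", "Yen-Tsang Lee"]
ranked = sorted(contacts, key=lambda c: -prominence("farmer", c))
print(ranked[0])  # Jason Farmer: surname matches mid-string
```

Because scoring is chunk-based, appending characters to the query can only add matching chunks for the intended item, which is why relevance grows as the user types even in the presence of an earlier typo.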
-
Patent number: 8165877
Abstract: A voice search system has a speech recognizer, a search component, and a dialog manager. A confidence measure generator receives speech recognition features from the speech recognizer, search features from the search component, and dialog features from the dialog manager, and calculates an overall confidence measure for voice search results based upon the features received. The invention can be extended to include the generation of additional features, based on those received from the individual components of the voice search system.
Type: Grant
Filed: August 3, 2007
Date of Patent: April 24, 2012
Assignee: Microsoft Corporation
Inventors: Ye-Yi Wang, Yun-Cheng Ju, Dong Yu