Patents by Inventor Yunbo Cao

Yunbo Cao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7707204
    Abstract: A query and a factoid type selection are received from a user. An index of passages, indexed based on factoids, is accessed and passages that are related to the query, and that have the selected factoid type, are retrieved. The retrieved passages are ranked and provided to the user based on a calculated score, in rank order.
    Type: Grant
    Filed: December 13, 2005
    Date of Patent: April 27, 2010
    Assignee: Microsoft Corporation
    Inventors: Hang Li, Jianfeng Gao, Yunbo Cao
  • Publication number: 20100049498
    Abstract: A question search system provides a collection of questions having words for use in evaluating the utility of the questions based on a language model. The question search system calculates n-gram probabilities for words within the questions of the collection. The n-gram probability of a word for a sequence of n−1 words indicates the probability of that word being next after that sequence in the collection of questions. The n-gram probabilities for the words of the collection represent the language model of the collection. The question search system calculates a language model utility score for each question within a collection that indicates the likelihood that a question is repeatedly asked by users. The question search system derives the language model utility score for a question from the n-gram probabilities of the words within that question.
    Type: Application
    Filed: August 25, 2008
    Publication date: February 25, 2010
    Applicant: Microsoft Corporation
    Inventors: Yunbo Cao, Chin-Yew Lin
  • Publication number: 20100030769
    Abstract: A method and system for presenting questions that are relevant to a queried question based on clusters of topics and clusters of focuses of the questions is provided. A question search system provides a collection of questions. Each question of the collection has an associated topic and focus. Upon receiving a queried question, the question search system identifies questions of the collection that may be relevant to the queried question and generates a score or ranking indicating relevance of the identified questions. The question search system clusters the identified questions into topic clusters of questions with similar topics. The question search system may also cluster the questions within each topic cluster into focus clusters of questions with similar focuses.
    Type: Application
    Filed: August 4, 2008
    Publication date: February 4, 2010
    Applicant: Microsoft Corporation
    Inventors: Yunbo Cao, Chin-Yew Lin
  • Publication number: 20100030770
    Abstract: A method and system for determining the relevance of questions to a queried question based on topics and focuses of the questions is provided. A question search system provides a collection of questions with topics and focuses. Upon receiving a queried question, the question search system identifies a queried topic and queried focus of the queried question. The question search system generates a score indicating the relevance of a question of the collection to the queried question based on a language model of the topic of the question and a language model of the focus of the question.
    Type: Application
    Filed: August 4, 2008
    Publication date: February 4, 2010
    Applicant: Microsoft Corporation
    Inventors: Yunbo Cao, Chin-Yew Lin
  • Patent number: 7644074
    Abstract: A method of finding documents. A method of finding documents comprising, ranking documents according to relevance to form a ranked relevance list, ranking documents according to type to form a ranked type list, and interpolating the ranked relevance list and the ranked type list to form a list of documents ranked by relevance and type.
    Type: Grant
    Filed: December 22, 2005
    Date of Patent: January 5, 2010
    Assignee: Microsoft Corporation
    Inventors: Yunbo Cao, Hang Li, Jun Xu
  • Publication number: 20090259642
    Abstract: In a question answering system, the system identifies a type of question input by a user. The system then generates answer summaries that summarize answers to the input question in a format that is determined based on the type of question asked by the user. The answer summaries are output, in the corresponding format, in answer to the input question.
    Type: Application
    Filed: April 15, 2008
    Publication date: October 15, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Yunbo Cao, Chin-Yew Lin
  • Publication number: 20090253112
    Abstract: The present system graphs topic terms in stored cQA questions and also converts a submitted question into a graph of topic terms. Topic terms that correspond to a question topic are delineated from topic terms that correspond to question focus. New questions are recommended to the user based on a comparison between the topics of the new questions and the topic of the submitted question as well as the focus of the new questions and the focus of the submitted question.
    Type: Application
    Filed: April 7, 2008
    Publication date: October 8, 2009
    Applicant: MICROSOFT CORPORATION
    Inventors: Yunbo Cao, Chin-Yew Lin
  • Patent number: 7593934
    Abstract: A method and system for generating a ranking function to rank the relevance of documents to a query is provided. The ranking system learns a ranking function from training data that includes queries, resultant documents, and relevance of each document to its query. The ranking system learns a ranking function using the training data by weighting incorrect rankings of relevant documents more heavily than the incorrect rankings of not relevant documents so that more emphasis is placed on correctly ranking relevant documents. The ranking system may also learn a ranking function using the training data by normalizing the contribution of each query to the ranking function so that it is independent of the number of relevant documents of each query.
    Type: Grant
    Filed: July 28, 2006
    Date of Patent: September 22, 2009
    Assignee: Microsoft Corporation
    Inventors: Hang Li, Jun Xu, Yunbo Cao, Tie-Yan Liu
  • Patent number: 7590608
    Abstract: A cascaded processing approach is used to clean noisy electronic mail or other text messaging data. Non-text filtering is first performed on the noisy data to filter out non-text items in the data. Text normalization is then performed on the filtered data to provide cleaned data. The cleaned data can be used in one or more of a wide variety of other applications or processing systems.
    Type: Grant
    Filed: December 2, 2005
    Date of Patent: September 15, 2009
    Assignee: Microsoft Corporation
    Inventors: Hang Li, Yunbo Cao, ZhaoHui Tang
  • Patent number: 7512582
    Abstract: Collaborative bootstrapping with uncertainty reduction for increased classifier performance. One classifier selects a portion of data that is uncertain with respect to the classifier and a second classifier labels the portion. Uncertainty reduction includes parallel processing where the second classifier also selects an uncertain portion for the first classifier to label. Uncertainty reduction can be incorporated into existing or new co-training or bootstrapping, including bilingual bootstrapping.
    Type: Grant
    Filed: December 10, 2003
    Date of Patent: March 31, 2009
    Assignee: Microsoft Corporation
    Inventors: Yunbo Cao, Hang Li
  • Publication number: 20090083096
    Abstract: A method for handling product reviews can detect a first quality product review from a second quality product review. The first and second quality product reviews can be associated with a product. The first quality product review can be filtered. An opinion segment in the second quality product review can be identified and the polarity can be determined of the opinion segment. An opinion set can be generated with the opinion segment for a product feature. A score (or weight) can be aggregated of segments in the opinion set for the product feature.
    Type: Application
    Filed: September 20, 2007
    Publication date: March 26, 2009
    Applicant: Microsoft Corporation
    Inventors: Yunbo Cao, Chin-Yew Lin, Ming Zhou
  • Patent number: 7469251
    Abstract: An information extraction model is trained on format features identified within labeled training documents. Information from a document is extracted by assigning labels to units based on format features of the units within the document. A begin label and end label are identified and the information is extracted between the begin label and the end label. The extracted information can be used in various document processing tasks such as ranking.
    Type: Grant
    Filed: July 29, 2005
    Date of Patent: December 23, 2008
    Assignee: Microsoft Corporation
    Inventors: Hang Li, Ruihua Song, Yunbo Cao, Dmitriy Meyerzon
  • Patent number: 7461056
    Abstract: A method for extracting key terms and associated key terms for use in text mining is provided. The method includes receiving unstructured text documents, such as emails over a customer service system. Term candidates are extracted based on identifying consecutive word strings satisfying a context independency threshold. Term candidates are weighted using mutual information to generate a list of weighted terms. The weighted terms are then recounted. Terms are associated based on Chi-square values. Associated terms can then be used for information retrieval. A user interface can be personalized with individual user profiles.
    Type: Grant
    Filed: February 9, 2005
    Date of Patent: December 2, 2008
    Assignee: Microsoft Corporation
    Inventors: Yunbo Cao, Hang Li, Olivier Ribet, Benjamin Martin
  • Publication number: 20080249764
    Abstract: A sentiment classifier is described. In one implementation, a system applies both full text and complex feature analyses to sentences of a product review. Each analysis is weighted prior to linear combination into a final sentiment prediction. A full text model and a complex features model can be trained separately offline to support online full text analysis and complex features analysis. Complex features include opinion indicators, negation patterns, sentiment-specific sections of the product review, user ratings, sequence of text chunks, and sentence types and lengths. A Conditional Random Field (CRF) framework provides enhanced sentiment classification for each segment of a complex sentence to enhance sentiment prediction.
    Type: Application
    Filed: December 5, 2007
    Publication date: October 9, 2008
    Applicant: Microsoft Corporation
    Inventors: Shen Huang, Ling Bao, Yunbo Cao, Zheng Chen, Chin-Yew Lin, Christoph R. Ponath, Jian-Tao Sun, Ming Zhou, Jian Wang
  • Publication number: 20080147654
    Abstract: A typed separable mixture model is used to mine associative relationships between sets of objects. Instead of modeling only one type of co-occurrence among the sets of objects, the typed separable mixture model can model multiple different types of co-occurrences among more than two sets of objects, and co-occurrences that exist in different contexts.
    Type: Application
    Filed: April 12, 2007
    Publication date: June 19, 2008
    Applicant: Microsoft Corporation
    Inventors: Yunbo Cao, Hang Li
  • Publication number: 20080027925
    Abstract: A method and system for generating a ranking function to rank the relevance of documents to a query is provided. The ranking system learns a ranking function from training data that includes queries, resultant documents, and relevance of each document to its query. The ranking system learns a ranking function using the training data by weighting incorrect rankings of relevant documents more heavily than the incorrect rankings of not relevant documents so that more emphasis is placed on correctly ranking relevant documents. The ranking system may also learn a ranking function using the training data by normalizing the contribution of each query to the ranking function so that it is independent of the number of relevant documents of each query.
    Type: Application
    Filed: July 28, 2006
    Publication date: January 31, 2008
    Applicant: Microsoft Corporation
    Inventors: Hang Li, Jun Xu, Yunbo Cao, Tie-Yan Liu
  • Patent number: 7299228
    Abstract: The present invention relates to extracting information from an information source. During extraction, strings in the information source are accessed. These strings in the information source are matched with generalized extraction patterns that include words and wildcards. The wildcards denote that at least one word in an individual string can be skipped in order to match the individual string to an individual generalized extraction pattern.
    Type: Grant
    Filed: December 11, 2003
    Date of Patent: November 20, 2007
    Assignee: Microsoft Corporation
    Inventors: Yunbo Cao, Hang Li
  • Patent number: 7284006
    Abstract: A computer-implemented method is provided that includes receiving a document and determining a file type for the document. In addition, the document is segmented into blocks of text as a function of the file type and at least one keyword and a summary is generated for the document.
    Type: Grant
    Filed: November 14, 2003
    Date of Patent: October 16, 2007
    Assignee: Microsoft Corporation
    Inventors: Yunbo Cao, Hang Li
  • Publication number: 20070150473
    Abstract: A method of finding documents. A method of finding documents comprising, ranking documents according to relevance to form a ranked relevance list, ranking documents according to type to form a ranked type list, and combining the ranked relevance list and the ranked type list to form a list of documents ranked by relevance and type.
    Type: Application
    Filed: May 16, 2006
    Publication date: June 28, 2007
    Applicant: Microsoft Corporation
    Inventors: Hang Li, Yunbo Cao, Jun Xu
  • Publication number: 20070150472
    Abstract: A method of finding documents. A method of finding documents comprising, ranking documents according to relevance to form a ranked relevance list, ranking documents according to type to form a ranked type list, and interpolating the ranked relevance list and the ranked type list to form a list of documents ranked by relevance and type.
    Type: Application
    Filed: December 22, 2005
    Publication date: June 28, 2007
    Applicant: Microsoft Corporation
    Inventors: Yunbo Cao, Hang Li, Jun Xu
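
The abstract of publication 20070150472 above describes interpolating a relevance ranking and a type ranking into a single list. The abstract does not give a formula, so the following is only a minimal sketch that assumes simple linear interpolation of per-document scores. The function name `interpolate_rankings`, the weight `lam`, and the score dictionaries are hypothetical and do not come from the patent.

```python
def interpolate_rankings(relevance, type_scores, lam=0.5):
    """Return document ids ordered by an interpolated score.

    Assumption (not from the patent): the combined score is
    lam * relevance + (1 - lam) * type score, with a missing
    score treated as 0.0.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    doc_ids = set(relevance) | set(type_scores)
    combined = {
        d: lam * relevance.get(d, 0.0) + (1.0 - lam) * type_scores.get(d, 0.0)
        for d in doc_ids
    }
    # Highest combined score first; ties are broken by document id.
    return sorted(combined, key=lambda d: (-combined[d], d))


# Hypothetical scores for three documents.
relevance = {"a": 0.9, "b": 0.5, "c": 0.1}
type_scores = {"a": 0.1, "b": 0.8, "c": 0.9}
ranked = interpolate_rankings(relevance, type_scores, lam=0.5)
```

With `lam=1.0` the result follows the relevance ranking alone, and with `lam=0.0` it follows the type ranking alone; values in between trade one off against the other.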