Patents by Inventor Yunbo Cao

Yunbo Cao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Factoid-based searching

Patent number: 7707204

Abstract: A query and a factoid type selection are received from a user. An index of passages, indexed based on factoids, is accessed and passages that are related to the query, and that have the selected factoid type, are retrieved. The retrieved passages are ranked and provided to the user based on a calculated score, in rank order.

Type: Grant

Filed: December 13, 2005

Date of Patent: April 27, 2010

Assignee: Microsoft Corporation

Inventors: Hang Li, Jianfeng Gao, Yunbo Cao
DETERMINING UTILITY OF A QUESTION

Publication number: 20100049498

Abstract: A question search system provides a collection of questions having words for use in evaluating the utility of the questions based on a language model. The question search system calculates n-gram probabilities for words within the questions of the collection. The n-gram probability of a word for a sequence of n-1 words indicates the probability of that word being next after that sequence in the collection of questions. The n-gram probabilities for the words of the collection represent the language model of the collection. The question search system calculates a language model utility score for each question within a collection that indicates the likelihood that a question is repeatedly asked by users. The question search system derives the language model utility score for a question from the n-gram probabilities of the words within that question.

Type: Application

Filed: August 25, 2008

Publication date: February 25, 2010

Applicant: Microsoft Corporation

Inventors: Yunbo Cao, Chin-Yew Lin
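The language model utility score described in the abstract above can be illustrated with a small sketch. This is not the method claimed in the application: it assumes a bigram model (n = 2), add-alpha smoothing, whitespace tokenization, and length normalization, none of which are stated in the listing. The names `train_bigram_model` and `utility_score` are hypothetical.

```python
import math
from collections import Counter


def train_bigram_model(questions):
    """Count bigrams and their one-word contexts over a question collection."""
    context_counts, bigram_counts = Counter(), Counter()
    for question in questions:
        # "<s>" is an assumed start-of-question marker.
        tokens = ["<s>"] + question.lower().split()
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts


def utility_score(question, context_counts, bigram_counts, vocab_size, alpha=1.0):
    """Average log bigram probability of a question (higher = more typical)."""
    tokens = ["<s>"] + question.lower().split()
    log_prob = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Add-alpha smoothing (an assumption) keeps unseen bigrams non-zero.
        p = (bigram_counts[(prev, word)] + alpha) / (
            context_counts[prev] + alpha * vocab_size
        )
        log_prob += math.log(p)
    # Length normalization (an assumption) avoids penalizing long questions.
    return log_prob / max(len(tokens) - 1, 1)


collection = [
    "how do i reset my password",
    "how do i reset my password",
    "how do i change my email",
    "what is the airspeed of a swallow",
]
contexts, bigrams = train_bigram_model(collection)
vocab = {w for q in collection for w in q.lower().split()}
score = utility_score("how do i reset my password", contexts, bigrams, len(vocab))
```

Under these assumptions, a question whose word sequences recur across the collection receives a higher score than one built from rarely seen sequences.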
CLUSTERING QUESTION SEARCH RESULTS BASED ON TOPIC AND FOCUS

Publication number: 20100030769

Abstract: A method and system for presenting questions that are relevant to a queried question based on clusters of topics and clusters of focuses of the questions is provided. A question search system provides a collection of questions. Each question of the collection has an associated topic and focus. Upon receiving a queried question, the question search system identifies questions of the collection that may be relevant to the queried question and generates a score or ranking indicating relevance of the identified questions. The question search system clusters the identified questions into topic clusters of questions with similar topics. The question search system may also cluster the questions within each topic cluster into focus clusters of questions with similar focuses.

Type: Application

Filed: August 4, 2008

Publication date: February 4, 2010

Applicant: Microsoft Corporation

Inventors: Yunbo Cao, Chin-Yew Lin
SEARCHING QUESTIONS BASED ON TOPIC AND FOCUS

Publication number: 20100030770

Abstract: A method and system for determining the relevance of questions to a queried question based on topics and focuses of the questions is provided. A question search system provides a collection of questions with topics and focuses. Upon receiving a queried question, the question search system identifies a queried topic and queried focus of the queried question. The question search system generates a score indicating the relevance of a question of the collection to the queried question based on a language model of the topic of the question and a language model of the focus of the question.

Type: Application

Filed: August 4, 2008

Publication date: February 4, 2010

Applicant: Microsoft Corporation

Inventors: Yunbo Cao, Chin-Yew Lin
Search by document type and relevance

Patent number: 7644074

Abstract: A method of finding documents, comprising: ranking documents according to relevance to form a ranked relevance list, ranking documents according to type to form a ranked type list, and interpolating the ranked relevance list and the ranked type list to form a list of documents ranked by relevance and type.

Type: Grant

Filed: December 22, 2005

Date of Patent: January 5, 2010

Assignee: Microsoft Corporation

Inventors: Yunbo Cao, Hang Li, Jun Xu
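As a rough illustration of interpolating a relevance ranking with a type ranking, the sketch below linearly combines two per-document scores. It assumes each list carries scores in the range 0 to 1 and that a single weight `lam` controls the mix; the patent listing does not specify the interpolation formula, and `interpolate_rankings` is a hypothetical name.

```python
def interpolate_rankings(relevance_scores, type_scores, lam=0.5):
    """Rank documents by lam * relevance + (1 - lam) * type score.

    Both arguments map document ids to scores; a document missing from
    one list is treated as scoring 0.0 there (an assumption).
    """
    doc_ids = set(relevance_scores) | set(type_scores)
    combined = {
        doc: lam * relevance_scores.get(doc, 0.0)
        + (1.0 - lam) * type_scores.get(doc, 0.0)
        for doc in doc_ids
    }
    # Sort by descending combined score, breaking ties by document id.
    return sorted(combined, key=lambda doc: (-combined[doc], doc))


relevance = {"a": 0.9, "b": 0.5, "c": 0.2}   # hypothetical relevance scores
doc_type = {"a": 0.1, "b": 0.8, "c": 0.9}    # hypothetical type scores
ranked = interpolate_rankings(relevance, doc_type, lam=0.5)
```

Setting `lam` to 1.0 reproduces the pure relevance ordering and 0.0 the pure type ordering, so the weight can be tuned between the two extremes.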
QUESTION TYPE-SENSITIVE ANSWER SUMMARIZATION

Publication number: 20090259642

Abstract: In a question answering system, the system identifies a type of question input by a user. The system then generates answer summaries that summarize answers to the input question in a format that is determined based on the type of question asked by the user. The answer summaries are output, in the corresponding format, in answer to the input question.

Type: Application

Filed: April 15, 2008

Publication date: October 15, 2009

Applicant: Microsoft Corporation

Inventors: Yunbo Cao, Chin-Yew Lin
RECOMMENDING QUESTIONS TO USERS OF COMMUNITY QUESTION ANSWERING

Publication number: 20090253112

Abstract: The present system graphs topic terms in stored cQA questions and also converts a submitted question into a graph of topic terms. Topic terms that correspond to a question topic are delineated from topic terms that correspond to question focus. New questions are recommended to the user based on a comparison between the topics of the new questions and the topic of the submitted question as well as the focus of the new questions and the focus of the submitted question.

Type: Application

Filed: April 7, 2008

Publication date: October 8, 2009

Applicant: Microsoft Corporation

Inventors: Yunbo Cao, Chin-Yew Lin
Learning a document ranking using a loss function with a rank pair or a query parameter

Patent number: 7593934

Abstract: A method and system for generating a ranking function to rank the relevance of documents to a query is provided. The ranking system learns a ranking function from training data that includes queries, resultant documents, and relevance of each document to its query. The ranking system learns a ranking function using the training data by weighting incorrect rankings of relevant documents more heavily than the incorrect rankings of not relevant documents so that more emphasis is placed on correctly ranking relevant documents. The ranking system may also learn a ranking function using the training data by normalizing the contribution of each query to the ranking function so that it is independent of the number of relevant documents of each query.

Type: Grant

Filed: July 28, 2006

Date of Patent: September 22, 2009

Assignee: Microsoft Corporation

Inventors: Hang Li, Jun Xu, Yunbo Cao, Tie-Yan Liu
Electronic mail data cleaning

Patent number: 7590608

Abstract: A cascaded processing approach is used to clean noisy electronic mail or other text messaging data. Non-text filtering is first performed on the noisy data to filter out non-text items in the data. Text normalization is then performed on the filtered data to provide cleaned data. The cleaned data can be used in one or more of a wide variety of other applications or processing systems.

Type: Grant

Filed: December 2, 2005

Date of Patent: September 15, 2009

Assignee: Microsoft Corporation

Inventors: Hang Li, Yunbo Cao, ZhaoHui Tang
Uncertainty reduction in collaborative bootstrapping

Patent number: 7512582

Abstract: Collaborative bootstrapping with uncertainty reduction for increased classifier performance. One classifier selects a portion of data that is uncertain with respect to the classifier and a second classifier labels the portion. Uncertainty reduction includes parallel processing where the second classifier also selects an uncertain portion for the first classifier to label. Uncertainty reduction can be incorporated into existing or new co-training or bootstrapping, including bilingual bootstrapping.

Type: Grant

Filed: December 10, 2003

Date of Patent: March 31, 2009

Assignee: Microsoft Corporation

Inventors: Yunbo Cao, Hang Li
Handling product reviews

Publication number: 20090083096

Abstract: A method for handling product reviews can detect a first quality product review from a second quality product review. The first and second quality product reviews can be associated with a product. The first quality product review can be filtered. An opinion segment in the second quality product review can be identified and the polarity of the opinion segment can be determined. An opinion set can be generated with the opinion segment for a product feature. A score (or weight) of segments in the opinion set can be aggregated for the product feature.

Type: Application

Filed: September 20, 2007

Publication date: March 26, 2009

Applicant: Microsoft Corporation

Inventors: Yunbo Cao, Chin-Yew Lin, Ming Zhou
Extraction of information from documents

Patent number: 7469251

Abstract: An information extraction model is trained on format features identified within labeled training documents. Information from a document is extracted by assigning labels to units based on format features of the units within the document. A begin label and end label are identified and the information is extracted between the begin label and the end label. The extracted information can be used in various document processing tasks such as ranking.

Type: Grant

Filed: July 29, 2005

Date of Patent: December 23, 2008

Assignee: Microsoft Corporation

Inventors: Hang Li, Ruihua Song, Yunbo Cao, Dmitriy Meyerzon
Text mining apparatus and associated methods

Patent number: 7461056

Abstract: A method for extracting key terms and associated key terms for use in text mining is provided. The method includes receiving unstructured text documents, such as emails over a customer service system. Term candidates are extracted based on identifying consecutive word strings satisfying a context independency threshold. Term candidates are weighted using mutual information to generate a list of weighted terms. The weighted terms are then recounted. Terms are associated based on Chi-square values. Associated terms can then be used for information retrieval. A user interface can be personalized with individual user profiles.

Type: Grant

Filed: February 9, 2005

Date of Patent: December 2, 2008

Assignee: Microsoft Corporation

Inventors: Yunbo Cao, Hang Li, Olivier Ribet, Benjamin Martin
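The abstract above mentions associating terms based on Chi-square values. A minimal sketch of that idea follows, assuming the standard Pearson chi-square statistic on a 2x2 table of document co-occurrence counts; the listing does not say which variant or threshold the patent uses, and `chi_square` is a hypothetical helper.

```python
def chi_square(n11, n10, n01, n00):
    """Pearson chi-square statistic for a 2x2 co-occurrence table.

    n11: documents containing both terms
    n10: documents containing only the first term
    n01: documents containing only the second term
    n00: documents containing neither term
    """
    n = n11 + n10 + n01 + n00
    denominator = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denominator == 0:
        # A degenerate table (an empty row or column) carries no evidence.
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denominator


# Hypothetical counts: two terms that always appear together.
strong = chi_square(20, 0, 0, 20)
# Hypothetical counts: two terms that co-occur no more than chance predicts.
independent = chi_square(10, 10, 10, 10)
```

A larger value indicates that the two terms co-occur more (or less) often than independence would predict, so pairs above a chosen threshold could be treated as associated.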
Smart Sentiment Classifier for Product Reviews

Publication number: 20080249764

Abstract: A sentiment classifier is described. In one implementation, a system applies both full text and complex feature analyses to sentences of a product review. Each analysis is weighted prior to linear combination into a final sentiment prediction. A full text model and a complex features model can be trained separately offline to support online full text analysis and complex features analysis. Complex features include opinion indicators, negation patterns, sentiment-specific sections of the product review, user ratings, sequence of text chunks, and sentence types and lengths. A Conditional Random Field (CRF) framework provides enhanced sentiment classification for each segment of a complex sentence to enhance sentiment prediction.

Type: Application

Filed: December 5, 2007

Publication date: October 9, 2008

Applicant: Microsoft Corporation

Inventors: Shen Huang, Ling Bao, Yunbo Cao, Zheng Chen, Chin-Yew Lin, Christoph R. Ponath, Jian-Tao Sun, Ming Zhou, Jian Wang
Mining latent associations of objects using a typed mixture model

Publication number: 20080147654

Abstract: A typed separable mixture model is used to mine associative relationships between sets of objects. Instead of modeling only one type of co-occurrence among the sets of objects, the typed separable mixture model can model multiple different types of co-occurrences among more than two sets of objects, and co-occurrences that exist in different contexts.

Type: Application

Filed: April 12, 2007

Publication date: June 19, 2008

Applicant: Microsoft Corporation

Inventors: Yunbo Cao, Hang Li
LEARNING A DOCUMENT RANKING USING A LOSS FUNCTION WITH A RANK PAIR OR A QUERY PARAMETER

Publication number: 20080027925

Abstract: A method and system for generating a ranking function to rank the relevance of documents to a query is provided. The ranking system learns a ranking function from training data that includes queries, resultant documents, and relevance of each document to its query. The ranking system learns a ranking function using the training data by weighting incorrect rankings of relevant documents more heavily than the incorrect rankings of not relevant documents so that more emphasis is placed on correctly ranking relevant documents. The ranking system may also learn a ranking function using the training data by normalizing the contribution of each query to the ranking function so that it is independent of the number of relevant documents of each query.

Type: Application

Filed: July 28, 2006

Publication date: January 31, 2008

Applicant: Microsoft Corporation

Inventors: Hang Li, Jun Xu, Yunbo Cao, Tie-Yan Liu
Learning and using generalized string patterns for information extraction

Patent number: 7299228

Abstract: The present invention relates to extracting information from an information source. During extraction, strings in the information source are accessed. These strings in the information source are matched with generalized extraction patterns that include words and wildcards. The wildcards denote that at least one word in an individual string can be skipped in order to match the individual string to an individual generalized extraction pattern.

Type: Grant

Filed: December 11, 2003

Date of Patent: November 20, 2007

Assignee: Microsoft Corporation

Inventors: Yunbo Cao, Hang Li
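To make the wildcard behavior in the abstract above concrete, the sketch below matches a word sequence against a pattern of literal words and `*` wildcards, where each wildcard skips at least one word. It is an illustration only: the `*` symbol, the whole-string matching, and the absence of an upper bound on skipped words are assumptions, and `matches` is a hypothetical name.

```python
def matches(pattern, words):
    """Return True if the word list matches the pattern exactly.

    Each "*" in the pattern must consume one or more words; every other
    pattern token must equal the corresponding word.
    """
    if not pattern:
        return not words
    head, rest = pattern[0], pattern[1:]
    if head == "*":
        # Try skipping 1, 2, ... words before matching the rest.
        return any(matches(rest, words[k:]) for k in range(1, len(words) + 1))
    return bool(words) and words[0] == head and matches(rest, words[1:])


pattern = "the * of the company".split()
hit = matches(pattern, "the chief executive of the company".split())
miss = matches(pattern, "the of the company".split())
```

Here `hit` is true because the wildcard absorbs "chief executive", while `miss` is false because the wildcard has no word to skip.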
Method and apparatus for browsing document content

Patent number: 7284006

Abstract: A computer-implemented method is provided that includes receiving a document and determining a file type for the document. In addition, the document is segmented into blocks of text as a function of the file type and at least one keyword and a summary is generated for the document.

Type: Grant

Filed: November 14, 2003

Date of Patent: October 16, 2007

Assignee: Microsoft Corporation

Inventors: Yunbo Cao, Hang Li
Search By Document Type And Relevance

Publication number: 20070150473

Abstract: A method of finding documents, comprising: ranking documents according to relevance to form a ranked relevance list, ranking documents according to type to form a ranked type list, and combining the ranked relevance list and the ranked type list to form a list of documents ranked by relevance and type.

Type: Application

Filed: May 16, 2006

Publication date: June 28, 2007

Applicant: Microsoft Corporation

Inventors: Hang Li, Yunbo Cao, Jun Xu
Search By Document Type

Publication number: 20070150472

Abstract: A method of finding documents, comprising: ranking documents according to relevance to form a ranked relevance list, ranking documents according to type to form a ranked type list, and interpolating the ranked relevance list and the ranked type list to form a list of documents ranked by relevance and type.

Type: Application

Filed: December 22, 2005

Publication date: June 28, 2007

Applicant: Microsoft Corporation

Inventors: Yunbo Cao, Hang Li, Jun Xu
