Patents by Inventor Tie-Yan Liu

Tie-Yan Liu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Calculating global importance of documents based on global hitting times

Patent number: 7930303

Abstract: A calculate importance system calculates the global importance of a web page based on a “mean hitting time.” Hitting time of a target web page is a measure of the minimum number of transitions needed to land on the target web page. Mean hitting time of a target web page is an average number of such transitions for all possible starting web pages. The calculate importance system calculates a global importance score for a web page based on the reciprocal of a mean hitting time. A search engine may rank web pages of a search result based on a combination of relevance of the web pages to the search request and global importance of the web pages based on a global hitting time.

Type: Grant

Filed: April 30, 2007

Date of Patent: April 19, 2011

Assignee: Microsoft Corporation

Inventors: Tie-Yan Liu, Hang Li, Lei Qi, Bin Gao
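The hitting-time idea in the abstract of patent 7930303 can be illustrated with a small sketch. This is not the patented method. It assumes a random-surfer model with a row-stochastic transition matrix `P` (a hypothetical input), reads hitting time as the expected number of transitions before a random walk first reaches the target page, averages over all starting pages other than the target, and takes the reciprocal as the importance score. The function names and the fixed number of sweeps are illustrative choices.

```python
def hitting_times(P, target, sweeps=500):
    """Expected number of transitions to first reach `target` from each page."""
    n = len(P)
    h = [0.0] * n
    for _ in range(sweeps):
        # h[i] = 1 + sum over non-target pages j of P[i][j] * h[j]; h[target] = 0
        h = [
            0.0 if i == target
            else 1.0 + sum(P[i][j] * h[j] for j in range(n) if j != target)
            for i in range(n)
        ]
    return h


def importance(P, target):
    """Reciprocal of the mean hitting time over all other starting pages."""
    h = hitting_times(P, target)
    others = [h[i] for i in range(len(P)) if i != target]
    return len(others) / sum(others)


# Toy web graph (assumed): page 0 links to pages 1 and 2, which both link back to 0.
P = [
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
scores = [importance(P, page) for page in range(3)]
print(scores)
```

In this toy graph page 0 is reached in one step from either other page, so it receives the highest score.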
Hierarchy-based propagation of contribution of documents

Patent number: 7890502

Abstract: A method and system for determining the contribution of a document within a hierarchy of documents based on the contribution of descendant documents is provided. The contribution system provides a hierarchy of documents that specifies the ancestor/descendant relations between documents. For each document of a hierarchy, the contribution system determines the contribution of each document factoring in the contribution of descendant documents. The contribution may be the relevance of a document to a topic, a feature of a document, and so on.

Type: Grant

Filed: November 14, 2005

Date of Patent: February 15, 2011

Assignee: Microsoft Corporation

Inventors: Tie-Yan Liu, Wei-Ying Ma, Tao Qin
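As a rough illustration of the propagation described in the abstract of patent 7890502, the sketch below walks a document hierarchy bottom-up and folds descendant contributions into each ancestor. The combination rule (own score plus a weighted average of the children's combined scores) and the `weight` parameter are assumptions made for the example. The abstract does not specify a formula.

```python
def propagate(children, own, root, weight=0.5):
    """Return each document's contribution after factoring in its descendants.

    children: dict mapping a document to its list of child documents
    own:      dict mapping a document to its own contribution (e.g. relevance)
    weight:   hypothetical mixing factor for descendant contributions
    """
    combined = {}

    def visit(doc):
        kids = children.get(doc, [])
        score = own[doc]
        if kids:
            # Post-order: children are resolved before their parent.
            score += weight * sum(visit(k) for k in kids) / len(kids)
        combined[doc] = score
        return score

    visit(root)
    return combined


# Hypothetical site hierarchy and per-document scores.
children = {"site": ["a", "b"], "a": ["a1"]}
own = {"site": 1.0, "a": 2.0, "b": 4.0, "a1": 6.0}
result = propagate(children, own, "site")
print(result)
```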
SUPERVISED RANK AGGREGATION BASED ON RANKINGS

Publication number: 20110029466

Abstract: A method and system for rank aggregation of entities based on supervised learning is provided. A rank aggregation system provides an order-based aggregation of rankings of entities by learning weights within an optimization framework for combining the rankings of the entities using labeled training data and the ordering of the individual rankings. The rank aggregation system is provided with multiple rankings of entities. The rank aggregation system is also provided with training data that indicates the relative ranking of pairs of entities. The rank aggregation system then learns weights for each of the ranking sources by attempting to optimize the difference between the relative rankings of pairs of entities using the weights and the relative rankings of pairs of entities of the training data.

Type: Application

Filed: October 15, 2010

Publication date: February 3, 2011

Applicant: Microsoft Corporation

Inventors: Tie-Yan Liu, Hang Li, Yu-Ting Liu
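The abstract above describes learning one weight per ranking source from labeled pairs of entities. The sketch below shows that general shape only. It swaps in a simple perceptron-style update over Borda-style position scores in place of the optimization framework named in the abstract, and it assumes every source ranks the same set of entities. All function names and parameters are hypothetical.

```python
def borda(ranking):
    """Position-based score: the top entity of n gets n, the last gets 1."""
    n = len(ranking)
    return {entity: n - i for i, entity in enumerate(ranking)}


def learn_weights(rankings, labeled_pairs, epochs=20, lr=0.1):
    """Learn one non-negative weight per source from (better, worse) pairs."""
    tables = [borda(r) for r in rankings]
    w = [1.0] * len(tables)
    for _ in range(epochs):
        for better, worse in labeled_pairs:
            margin = sum(wk * (t[better] - t[worse]) for wk, t in zip(w, tables))
            if margin <= 0:  # aggregated order disagrees with the label
                w = [max(0.0, wk + lr * (t[better] - t[worse]))
                     for wk, t in zip(w, tables)]
    return w


def aggregate(rankings, w):
    """Order entities by the weighted sum of their per-source scores."""
    tables = [borda(r) for r in rankings]
    entities = rankings[0]
    score = {e: sum(wk * t[e] for wk, t in zip(w, tables)) for e in entities}
    return sorted(entities, key=score.get, reverse=True)


# Two hypothetical sources that disagree; the labels side with the first one.
rankings = [["x", "y", "z"], ["z", "y", "x"]]
labeled_pairs = [("x", "y"), ("y", "z")]
weights = learn_weights(rankings, labeled_pairs)
final_order = aggregate(rankings, weights)
print(weights, final_order)
```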
Anti-spam tool for browser

Patent number: 7860971

Abstract: An anti-spam tool works with a web browser to detect spam webpages locally on a client machine. The anti-spam tool can be implemented either as a plug-in module or an integral part of the browser, and manifested as a toolbar. The tool can perform an anti-spam action whenever a webpage is accessed through the browser, and does not require direct involvement of a search engine. A spam detection module installed on the computing device determines whether a webpage being accessed or whether a link contained in the webpage being accessed is spam, by comparing the URL of the webpage or the link with a spam list. The spam list can be downloaded from a remote search engine server, stored locally and updated from time to time. A two-level indexing technique is also introduced to improve the efficiency of the anti-spam tool's use of the spam list.

Type: Grant

Filed: February 21, 2008

Date of Patent: December 28, 2010

Assignee: Microsoft Corporation

Inventors: Bin Gao, Tie-Yan Liu, Hang Li, Lei Yang
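The abstract of patent 7860971 mentions a locally stored spam list and a two-level index but does not describe the index layout. The sketch below assumes one plausible layout: a first level keyed by host name and a second level holding the full URLs for that host, so most lookups stop at the first level. The class name and example URLs are hypothetical.

```python
from urllib.parse import urlsplit


class SpamList:
    """Local spam list with an assumed host -> URL-set two-level index."""

    def __init__(self, urls):
        self.by_host = {}
        for url in urls:
            host = urlsplit(url).hostname
            self.by_host.setdefault(host, set()).add(url)

    def is_spam(self, url):
        # Level 1: does the host appear at all? Level 2: exact URL match.
        bucket = self.by_host.get(urlsplit(url).hostname)
        return bucket is not None and url in bucket


spam = SpamList(["http://spam.example/buy", "http://spam.example/win"])
print(spam.is_spam("http://spam.example/buy"))   # listed URL
print(spam.is_spam("http://news.example/today")) # host not in the list
```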
Feature selection for ranking

Patent number: 7853599

Abstract: This disclosure describes various exemplary methods, computer program products, and systems for selecting features for ranking in information retrieval. This disclosure describes calculating importance scores for features, measuring similarity scores between pairs of features, and selecting features that maximize the total importance score of the features and minimize the total similarity score between the features. The disclosure also includes selecting features for ranking by solving an optimization problem. Thus, this disclosure identifies relevant features by removing noisy and redundant features, which speeds up model training.

Type: Grant

Filed: January 21, 2008

Date of Patent: December 14, 2010

Assignee: Microsoft Corporation

Inventors: Tie-Yan Liu, Geng Xiubo, Tao Qin, Hang Li
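The trade-off in the abstract of patent 7853599 (high total importance, low total similarity among the chosen features) can be sketched with a greedy loop. This is an illustrative approximation under stated assumptions, not the patent's optimization procedure: the `penalty` parameter, the feature names, and all scores below are invented for the example.

```python
def select_features(importance, similarity, k, penalty=1.0):
    """Greedily pick k features, discounting those similar to earlier picks."""
    remaining = dict(importance)
    chosen = []
    while remaining and len(chosen) < k:
        best = max(remaining, key=remaining.get)
        chosen.append(best)
        del remaining[best]
        for feature in remaining:
            # Redundant features lose value once a similar one is selected.
            remaining[feature] -= penalty * similarity[best][feature]
    return chosen


# Hypothetical importance and pairwise similarity scores.
importance = {"bm25": 0.9, "bm25_title": 0.85, "pagerank": 0.6}
similarity = {
    "bm25": {"bm25_title": 0.8, "pagerank": 0.1},
    "bm25_title": {"bm25": 0.8, "pagerank": 0.1},
    "pagerank": {"bm25": 0.1, "bm25_title": 0.1},
}
selected = select_features(importance, similarity, k=2)
print(selected)
```

Here `bm25_title` is nearly as important as `bm25` but highly similar to it, so the second pick goes to `pagerank`.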
Supervised rank aggregation based on rankings

Patent number: 7840522

Abstract: A method and system for rank aggregation of entities based on supervised learning is provided. A rank aggregation system provides an order-based aggregation of rankings of entities by learning weights within an optimization framework for combining the rankings of the entities using labeled training data and the ordering of the individual rankings. The rank aggregation system is provided with multiple rankings of entities. The rank aggregation system is also provided with training data that indicates the relative ranking of pairs of entities. The rank aggregation system then learns weights for each of the ranking sources by attempting to optimize the difference between the relative rankings of pairs of entities using the weights and the relative rankings of pairs of entities of the training data.

Type: Grant

Filed: March 7, 2007

Date of Patent: November 23, 2010

Assignee: Microsoft Corporation

Inventors: Tie-Yan Liu, Hang Li, Yu-Ting Liu
DISTRIBUTED DATA REORGANIZATION FOR PARALLEL EXECUTION ENGINES

Publication number: 20100281078

Abstract: A distributed data reorganization system and method for mapping and reducing raw data containing a plurality of data records. Embodiments of the distributed data reorganization system and method operate in a general-purpose parallel execution environment that uses an arbitrary communication directed acyclic graph. The vertices of the graph accept multiple data inputs and generate multiple data outputs, and may be of different types. Embodiments of the distributed data reorganization system and method include a plurality of distributed mappers that use mapping criteria supplied by a developer to map the plurality of data records to data buckets. The mapped data records and data bucket identifications are input for a plurality of distributed reducers. Each distributed reducer groups together data records having the same data bucket identification and then uses merge logic supplied by the developer to reduce the grouped data records to obtain reorganized data.

Type: Application

Filed: April 30, 2009

Publication date: November 4, 2010

Applicant: Microsoft Corporation

Inventors: Taifeng Wang, Tie-Yan Liu
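The map and reduce steps in the abstract above can be sketched in a single process. The example below only illustrates the data flow (records to buckets, buckets to merged output) and does not model the distributed execution graph. The function name `reorganize` and the sample mapping criteria and merge logic are hypothetical stand-ins for what a developer would supply.

```python
from collections import defaultdict


def reorganize(records, map_to_bucket, merge):
    """Group records by bucket id, then reduce each group with merge logic."""
    buckets = defaultdict(list)
    for record in records:
        # Mapper step: developer-supplied criteria assign a bucket id.
        buckets[map_to_bucket(record)].append(record)
    # Reducer step: developer-supplied merge logic runs once per bucket.
    return {bucket: merge(group) for bucket, group in buckets.items()}


records = ["pear", "apple", "plum", "avocado"]
reorganized = reorganize(
    records,
    map_to_bucket=lambda word: word[0],  # bucket by first letter
    merge=sorted,                        # merge by sorting the group
)
print(reorganized)
```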
Event detection based on evolution of click-through data

Patent number: 7818279

Abstract: A method and system for detecting events based on query-page relationships is provided. The event detection system detects events by analyzing occurrences of query-page pairs generated from a user selecting the page of the pair from a search result for the query of the pair. The event detection system may identify semantic and temporal similarity between query-page pairs. The event detection system then identifies clusters of query-page pairs that are semantically and temporally similar.

Type: Grant

Filed: March 13, 2006

Date of Patent: October 19, 2010

Assignee: Microsoft Corporation

Inventors: Tie-Yan Liu, Wei-Ying Ma
LEARNING TO RANK USING QUERY-DEPENDENT LOSS FUNCTIONS

Publication number: 20100257167

Abstract: Queries describe users' search needs and therefore play a role in learning to rank for information retrieval and Web search. However, most existing approaches for learning to rank do not explicitly take into consideration the fact that queries vary significantly along several dimensions and require different objectives for the ranking models. The technique described herein incorporates query difference into learning to rank by introducing query-dependent loss functions. Specifically, the technique employs query categorization to represent query differences and employs specific query-dependent loss functions based on such query differences. The technique employs two learning methods. One learns ranking functions with pre-defined query differences, while the other learns the ranking functions and the query differences simultaneously.

Type: Application

Filed: April 1, 2009

Publication date: October 7, 2010

Applicant: Microsoft Corporation

Inventor: Tie-Yan Liu
Distributed hierarchical text classification framework

Patent number: 7809723

Abstract: A method and system for distributed training of a hierarchical classifier for classifying documents using a classification hierarchy is provided. A training system provides training data that includes the documents and classifications of the documents within the classification hierarchy. The training system distributes the training of the classifiers of the hierarchical classifier to various agents so that the classifiers can be trained in parallel. For each classifier, the training system identifies an agent that is to train the classifier. Each agent then trains its classifiers.

Type: Grant

Filed: August 15, 2006

Date of Patent: October 5, 2010

Assignee: Microsoft Corporation

Inventors: Tie-Yan Liu, Wei-Ying Ma, Hua-Jun Zeng
Calculating Web Page Importance

Publication number: 20100250555

Abstract: The page ranking technique described herein employs a Markov Skeleton Mirror Process (MSMP), which is a particular case of Markov Skeleton Processes, to model and calculate page importance scores. Given a web graph and its metadata, the technique builds an MSMP model on the web graph. It first estimates the stationary distribution of an embedded Markov chain (EMC) and views it as the transition probability. It next computes the mean staying time using the metadata. Finally, it calculates the product of the transition probability and the mean staying time, which is the stationary distribution of the MSMP. This is regarded as page importance.

Type: Application

Filed: March 27, 2009

Publication date: September 30, 2010

Applicant: Microsoft Corporation

Inventors: Bin Gao, Tie-Yan Liu
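The three steps in the abstract of publication 20100250555 (stationary distribution, mean staying time, product) can be sketched as follows. The sketch assumes the EMC is an embedded Markov chain given as a row-stochastic matrix `P`, takes the mean staying times as already computed from metadata, uses plain power iteration, and normalizes the products so they sum to one. All names and numbers are illustrative.

```python
def stationary_distribution(P, steps=200):
    """Power iteration for the stationary distribution of a row-stochastic P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(steps):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def page_importance(P, mean_staying_time):
    """Product of stationary probability and mean staying time, normalized."""
    pi = stationary_distribution(P)
    raw = [p * t for p, t in zip(pi, mean_staying_time)]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical two-page chain: page 0 is visited five times as often as page 1,
# but visitors stay on page 1 five times as long.
P = [
    [0.9, 0.1],
    [0.5, 0.5],
]
scores = page_importance(P, mean_staying_time=[1.0, 5.0])
print(scores)
```

In this toy case the two effects cancel, so both pages end up with equal importance.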
Learning a document ranking function using fidelity-based error measurements

Patent number: 7805438

Abstract: A method and system for generating a ranking function using a fidelity-based loss between a target probability and a model probability for a pair of documents is provided. A fidelity ranking system generates a fidelity ranking function that ranks the relevance of documents to queries. The fidelity ranking system operates to minimize a fidelity loss between pairs of documents of training data. The fidelity loss may be derived from “fidelity” as used in the field of quantum physics. The fidelity ranking system may use a learning technique in conjunction with a fidelity loss when generating the ranking function. After the fidelity ranking system generates the fidelity ranking function, it uses the fidelity ranking function to rank the relevance of documents to queries.

Type: Grant

Filed: July 31, 2006

Date of Patent: September 28, 2010

Assignee: Microsoft Corporation

Inventors: Tie-Yan Liu, Ming-Feng Tsai, Wei-Ying Ma
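To illustrate the fidelity-based loss mentioned in the abstract of patent 7805438, the sketch below uses the classical fidelity between two binary distributions, sqrt(p*q) + sqrt((1-p)(1-q)), and defines the loss as one minus that value. The logistic form of the model probability is an assumption for the example, and the function names are hypothetical. The patent's exact formulation may differ.

```python
import math


def model_probability(score_i, score_j):
    """Assumed logistic probability that document i ranks above document j."""
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))


def fidelity_loss(target_p, model_p):
    """One minus the fidelity between target and model pair probabilities."""
    fidelity = (math.sqrt(target_p * model_p)
                + math.sqrt((1.0 - target_p) * (1.0 - model_p)))
    return 1.0 - fidelity


# Target says document i should certainly rank above document j.
undecided = fidelity_loss(1.0, model_probability(0.0, 0.0))
confident = fidelity_loss(1.0, model_probability(3.0, 0.0))
print(undecided, confident)
```

The loss is zero when the model probability equals the target probability and stays bounded between 0 and 1.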
Query-Dependent Ranking Using K-Nearest Neighbor

Publication number: 20100169323

Abstract: Described is a technology in which documents associated with a query are ranked by a ranking model that depends on the query. When a query is processed, a ranking model for the query is selected/determined based upon nearest neighbors to the query in query feature space. In one aspect, the ranking model is trained online, based on a training set obtained from a number of nearest neighbors to the query. In an alternative aspect, ranking models are trained offline using training sets; the query is used to find a most similar training set based on nearest neighbors of the query, with the ranking model that corresponds to the most similar training set being selected for ranking. In another alternative aspect, the ranking models are trained offline, with the nearest neighbor to the query determined and used to select its associated ranking model.

Type: Application

Filed: December 29, 2008

Publication date: July 1, 2010

Applicant: Microsoft Corporation

Inventors: Tie-Yan Liu, Xiubo Geng, Hang Li
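The nearest-neighbor step common to the aspects in the abstract above can be sketched as a lookup in query feature space. The sketch assumes Euclidean distance and made-up two-dimensional query features. How the selected neighbors are then used (training a model online or picking a model trained offline) is left out.

```python
import math


def nearest_queries(query_features, training_queries, k):
    """Return the names of the k training queries closest to the new query."""
    ranked = sorted(
        training_queries,
        key=lambda name: math.dist(query_features, training_queries[name]),
    )
    return ranked[:k]


# Hypothetical training queries with invented feature vectors.
training_queries = {
    "weather today": [0.9, 0.1],
    "buy laptop": [0.1, 0.9],
    "cheap phones": [0.2, 0.8],
}
neighbors = nearest_queries([0.0, 1.0], training_queries, k=2)
print(neighbors)
```

With `k=1` the same function covers the aspect in which the single nearest neighbor selects its associated ranking model.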
Co-clustering objects of heterogeneous types

Patent number: 7743058

Abstract: A method and system for high-order co-clustering of objects of heterogeneous types is provided. A clustering system co-clusters objects of heterogeneous types based on joint distributions for objects of non-central types and objects of a central type. The clustering system uses an iterative approach to co-clustering the objects of the various types. The clustering system divides the co-clustering into a sub-problem, for each non-central type (e.g., first type and second type), of co-clustering objects of that non-central type and objects of the central type based on the joint distribution for that non-central type. After the co-clustering is completed, the clustering system clusters objects of the central type based on the clusters of the objects of the non-central types identified during co-clustering. The clustering system repeats the iterations until the clusters of objects of the central type converge on a solution.

Type: Grant

Filed: January 10, 2007

Date of Patent: June 22, 2010

Assignee: Microsoft Corporation

Inventors: Tie-Yan Liu, Bin Gao, Wei-Ying Ma
Listwise ranking

Patent number: 7734633

Abstract: Procedures for learning and ranking items in a listwise manner are discussed. A listwise methodology may consider a ranked list, of individual items, as a specific permutation of the items being ranked. In implementations, a listwise loss function may be used in ranking items. A listwise loss function may be a metric which reflects the departure or disorder from an exemplary ranking for one or more sample listwise rankings used in learning. In this manner, the loss function may approximate the exemplary ranking for the plurality of items being ranked.

Type: Grant

Filed: October 18, 2007

Date of Patent: June 8, 2010

Assignee: Microsoft Corporation

Inventors: Tie-Yan Liu, Hang Li, Tao Qin, Zhe Cao
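The abstract of patent 7734633 does not name a specific listwise loss function. As one well-known example of the idea, the sketch below computes a cross entropy between the top-one probability distributions induced by a reference scoring and a model scoring of the same list. Treat the choice of loss and the function names as assumptions rather than the patented formulation.

```python
import math


def softmax(values):
    """Turn scores into a probability distribution over the list."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(model_scores, reference_scores):
    """Cross entropy between reference and model top-one distributions."""
    p_ref = softmax(reference_scores)
    p_model = softmax(model_scores)
    return -sum(r * math.log(m) for r, m in zip(p_ref, p_model))


reference = [3.0, 2.0, 1.0]  # exemplary ranking, best item first
agrees = listwise_loss([3.0, 2.0, 1.0], reference)
reversed_order = listwise_loss([1.0, 2.0, 3.0], reference)
print(agrees, reversed_order)
```

The loss grows as the model's scores depart from the exemplary ranking, which matches the role the abstract gives to a listwise loss.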
Projecting queries and images into a similarity space

Patent number: 7698332

Abstract: A method and system for projecting queries and images into a similarity space where queries are close to their relevant images is provided. A similarity space projection (“SSP”) system learns a query projection function and an image projection function based on training data. The query projection function projects the relevance of the most relevant words of a query into a similarity space and the image projection function projects the relevance to an image of the most relevant words of a query into the same similarity space so that queries and their relevant images are close in the similarity space. The SSP system can then identify images that are relevant to a target query and queries that are relevant to a target image using the projection functions.

Type: Grant

Filed: March 13, 2006

Date of Patent: April 13, 2010

Assignee: Microsoft Corporation

Inventors: Tie-Yan Liu, Tao Qin, Wei-Ying Ma
OPTIMIZING RANKING OF DOCUMENTS USING CONTINUOUS CONDITIONAL RANDOM FIELDS

Publication number: 20100082613

Abstract: The present invention provides an improved method for ranking documents using a ranking model. One embodiment employs Continuous Conditional Random Fields (Continuous CRF) as the model, which is a conditional probability distribution representing a mapping from retrieved documents to their ranking scores. The model can naturally utilize features of the content information of documents as well as the relation information between documents for global ranking. The present invention also provides a learning algorithm for creating the Continuous CRF. The invention also introduces Pseudo Relevance Feedback and Topic Distillation.

Type: Application

Filed: September 22, 2008

Publication date: April 1, 2010

Applicant: Microsoft Corporation

Inventors: Tie-Yan Liu, Tao Qin, Hang Li
PROCESSING MAXIMUM LIKELIHOOD FOR LISTWISE RANKINGS

Publication number: 20100082639

Abstract: The present invention introduces a new approach to learning systems. More specifically, the present invention provides learning methods for optimizing ranking models. In one aspect of the present invention, an objective function is defined as the likelihood of ground truth based on a Luce model. In another aspect, techniques of the present invention provide a way of representing different kinds of ground truths as a constraint set of permutations. In yet another aspect, techniques of the present invention provide a way of learning the model parameter by maximizing the likelihood of the ground truth.

Type: Application

Filed: September 30, 2008

Publication date: April 1, 2010

Applicant: Microsoft Corporation

Inventors: Hang Li, Tie-Yan Liu
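The likelihood of a ground-truth ordering under a Luce model, as mentioned in the abstract above, can be written out in a few lines. The sketch assumes the common exponential form, in which the item placed at each position is chosen with probability proportional to exp(score) among the items not yet placed. The function name is hypothetical, and the constraint-set representation from the abstract is not covered.

```python
import math


def luce_log_likelihood(scores, permutation):
    """Log-likelihood of `permutation` (item indices, best first) given scores."""
    total = 0.0
    for position, item in enumerate(permutation):
        remaining = permutation[position:]
        log_norm = math.log(sum(math.exp(scores[j]) for j in remaining))
        total += scores[item] - log_norm
    return total


scores = [2.0, 1.0, 0.0]
matching = luce_log_likelihood(scores, [0, 1, 2])
reversed_order = luce_log_likelihood(scores, [2, 1, 0])
print(matching, reversed_order)
```

Maximizing this quantity over the model parameters that produce `scores` corresponds to the learning step described in the abstract.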
DIRECTLY OPTIMIZING EVALUATION MEASURES IN LEARNING TO RANK

Publication number: 20100082606

Abstract: The present invention provides methods for improving a ranking model. In one embodiment, a method includes the step of obtaining queries, documents, and document labels. The process then initializes active sets using the document labels, wherein two active sets are established for each query, a perfect active set and an imperfect active set. Then, the process optimizes an empirical loss function using the first and second active sets, whereby parameters of the ranking model are modified in accordance with the empirical loss function. The method then updates the active sets with additional ranking data, wherein the updates are configured to work in conjunction with the optimized loss function and modified ranking model. The recalculated active sets provide an indication for ranking the documents in a way that is more consistent with the document metadata.

Type: Application

Filed: September 24, 2008

Publication date: April 1, 2010

Applicant: Microsoft Corporation

Inventors: Jun Xu, Tie-Yan Liu, Hang Li
PAIR-WISE RANKING MODEL FOR INFORMATION RETRIEVAL

Publication number: 20100082617

Abstract: The present invention provides techniques for generating data that is used for ranking documents. In one embodiment, a method involves the step of extracting data features from a number of documents to be ranked. The data features extracted from the documents are established in conjunction with a first feature map and a second feature map, wherein the first feature map and the second feature map are capable of keeping the relative ordering between two document instances. In one embodiment, the two feature maps are specifically a divide feature map and a minus feature map. Once the data is mapped, the method involves the step of generating pairwise preferences from the first feature map and the second feature map. Then the pairwise preferences are aggregated into a total order, which can be used to produce one or more relevancy scores.

Type: Application

Filed: September 24, 2008

Publication date: April 1, 2010

Applicant: Microsoft Corporation

Inventors: Tie-Yan Liu, Hang Li
