Patents by Inventor Tie-Yan Liu
Tie-Yan Liu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 7930303
Abstract: A calculate importance system calculates the global importance of a web page based on a “mean hitting time.” Hitting time of a target web page is a measure of the minimum number of transitions needed to land on the target web page. Mean hitting time of a target web page is an average number of such transitions for all possible starting web pages. The calculate importance system calculates a global importance score for a web page based on the reciprocal of a mean hitting time. A search engine may rank web pages of a search result based on a combination of relevance of the web pages to the search request and global importance of the web pages based on a global hitting time.
Type: Grant
Filed: April 30, 2007
Date of Patent: April 19, 2011
Assignee: Microsoft Corporation
Inventors: Tie-Yan Liu, Hang Li, Lei Qi, Bin Gao
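The scoring idea in this abstract can be sketched on a toy link graph. The sketch below is only an illustration, not the patented method: it reads "hitting time" literally as the shortest number of link transitions to the target, averages it over the pages that can reach the target, and takes the reciprocal. The function names, the toy graph, and the decision to ignore pages that cannot reach the target are all assumptions.

```python
from collections import deque

def mean_hitting_time(links, target):
    """Average shortest number of transitions to `target`,
    taken over every other page that can reach it."""
    # Reverse the edges so one breadth-first search from the target
    # yields the distance from every possible starting page.
    reverse = {page: [] for page in links}
    for src, outs in links.items():
        for dst in outs:
            reverse.setdefault(dst, []).append(src)
    dist = {target: 0}
    queue = deque([target])
    while queue:
        page = queue.popleft()
        for prev in reverse.get(page, []):
            if prev not in dist:
                dist[prev] = dist[page] + 1
                queue.append(prev)
    starts = [d for page, d in dist.items() if page != target]
    if not starts:
        return float("inf")  # assumption: an unreachable page never gets hit
    return sum(starts) / len(starts)

def importance(links, target):
    """Global importance as the reciprocal of the mean hitting time."""
    mht = mean_hitting_time(links, target)
    return 0.0 if mht == float("inf") else 1.0 / mht

# Hypothetical four-page web: page -> pages it links to.
toy_web = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["c"]}
```

On this toy graph, page "c" is reached quickly from every other page and so scores higher than page "d", which no page links to.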
-
Patent number: 7890502
Abstract: A method and system for determining the contribution of a document within a hierarchy of documents based on the contribution of descendant documents is provided. The contribution system provides a hierarchy of documents that specifies the ancestor/descendant relations between documents. For each document of a hierarchy, the contribution system determines the contribution of each document factoring in the contribution of descendant documents. The contribution may be the relevance of a document to a topic, a feature of a document, and so on.
Type: Grant
Filed: November 14, 2005
Date of Patent: February 15, 2011
Assignee: Microsoft Corporation
Inventors: Tie-Yan Liu, Wei-Ying Ma, Tao Qin
-
Publication number: 20110029466
Abstract: A method and system for rank aggregation of entities based on supervised learning is provided. A rank aggregation system provides an order-based aggregation of rankings of entities by learning weights within an optimization framework for combining the rankings of the entities using labeled training data and the ordering of the individual rankings. The rank aggregation system is provided with multiple rankings of entities. The rank aggregation system is also provided with training data that indicates the relative ranking of pairs of entities. The rank aggregation system then learns weights for each of the ranking sources by attempting to optimize the difference between the relative rankings of pairs of entities using the weights and the relative rankings of pairs of entities of the training data.
Type: Application
Filed: October 15, 2010
Publication date: February 3, 2011
Applicant: Microsoft Corporation
Inventors: Tie-Yan Liu, Hang Li, Yu-Ting Liu
-
Patent number: 7860971
Abstract: An anti-spam tool works with a web browser to detect spam webpages locally on a client machine. The anti-spam tool can be implemented either as a plug-in module or an integral part of the browser, and manifested as a toolbar. The tool can perform an anti-spam action whenever a webpage is accessed through the browser, and does not require direct involvement of a search engine. A spam detection module installed on the computing device determines whether a webpage being accessed or whether a link contained in the webpage being accessed is spam, by comparing the URL of the webpage or the link with a spam list. The spam list can be downloaded from a remote search engine server, stored locally and updated from time to time. A two-level indexing technique is also introduced to improve the efficiency of the anti-spam tool's use of the spam list.
Type: Grant
Filed: February 21, 2008
Date of Patent: December 28, 2010
Assignee: Microsoft Corporation
Inventors: Bin Gao, Tie-Yan Liu, Hang Li, Lei Yang
-
Patent number: 7853599
Abstract: This disclosure describes various exemplary methods, computer program products, and systems for selecting features for ranking in information retrieval. This disclosure describes calculating importance scores for features, measuring similarity scores between two features, and selecting features that maximize the total importance scores of the features and minimize the total similarity scores between the features. Also, the disclosure includes selecting features for ranking by solving an optimization problem. Thus, this disclosure identifies relevant features by removing noisy and redundant features and speeds up the process of model training.
Type: Grant
Filed: January 21, 2008
Date of Patent: December 14, 2010
Assignee: Microsoft Corporation
Inventors: Tie-Yan Liu, Geng Xiubo, Tao Qin, Hang Li
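The trade-off between importance and redundancy can be illustrated with a simple greedy procedure. This is a sketch under assumptions, not the optimization the disclosure claims: the greedy strategy, the `tradeoff` weight, and the toy scores are all hypothetical. At each step it picks the feature whose importance, minus its summed similarity to the features already chosen, is largest.

```python
def select_features(importance, similarity, k, tradeoff=1.0):
    """Greedily pick up to k feature indices with high importance
    and low similarity to the features already selected."""
    selected = []
    candidates = set(range(len(importance)))
    while candidates and len(selected) < k:
        def gain(f):
            # Importance of f, penalized by its redundancy with chosen features.
            redundancy = sum(similarity[f][s] for s in selected)
            return importance[f] - tradeoff * redundancy
        # Sorting makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical scores: features 0 and 1 are near-duplicates, feature 2 is distinct.
toy_importance = [0.9, 0.85, 0.3]
toy_similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
chosen = select_features(toy_importance, toy_similarity, k=2)
```

Here the second pick skips feature 1, even though it is individually important, because it is almost redundant with feature 0.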
-
Patent number: 7840522
Abstract: A method and system for rank aggregation of entities based on supervised learning is provided. A rank aggregation system provides an order-based aggregation of rankings of entities by learning weights within an optimization framework for combining the rankings of the entities using labeled training data and the ordering of the individual rankings. The rank aggregation system is provided with multiple rankings of entities. The rank aggregation system is also provided with training data that indicates the relative ranking of pairs of entities. The rank aggregation system then learns weights for each of the ranking sources by attempting to optimize the difference between the relative rankings of pairs of entities using the weights and the relative rankings of pairs of entities of the training data.
Type: Grant
Filed: March 7, 2007
Date of Patent: November 23, 2010
Assignee: Microsoft Corporation
Inventors: Tie-Yan Liu, Hang Li, Yu-Ting Liu
-
Publication number: 20100281078
Abstract: A distributed data reorganization system and method for mapping and reducing raw data containing a plurality of data records. Embodiments of the distributed data reorganization system and method operate in a general-purpose parallel execution environment that uses an arbitrary communication directed acyclic graph. The vertices of the graph accept multiple data inputs and generate multiple data outputs, and may be of different types. Embodiments of the distributed data reorganization system and method include a plurality of distributed mappers that use mapping criteria supplied by a developer to map the plurality of data records to data buckets. The mapped data record and data bucket identifications are input for a plurality of distributed reducers. Each distributed reducer groups together data records having the same data bucket identification and then uses a merge logic supplied by the developer to reduce the grouped data records to obtain reorganized data.
Type: Application
Filed: April 30, 2009
Publication date: November 4, 2010
Applicant: Microsoft Corporation
Inventors: Taifeng Wang, Tie-Yan Liu
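The mapper and reducer roles described in this abstract can be sketched in a single process. The code below is an illustrative sketch only: it is not distributed, it does not model the communication graph, and the function names, the `bucket_of` and `merge` callables, and the sample records are assumptions standing in for the developer-supplied mapping criteria and merge logic.

```python
from collections import defaultdict

def map_records(records, bucket_of):
    """Mapper stage: pair each record with the bucket ID that the
    developer-supplied criteria assign to it."""
    return [(bucket_of(record), record) for record in records]

def reduce_buckets(mapped, merge):
    """Reducer stage: group records sharing a bucket ID, then apply the
    developer-supplied merge logic to each group."""
    groups = defaultdict(list)
    for bucket_id, record in mapped:
        groups[bucket_id].append(record)
    return {bucket_id: merge(group) for bucket_id, group in groups.items()}

# Hypothetical raw data: (user, click count) records.
records = [("ann", 3), ("bob", 1), ("ann", 2)]
mapped = map_records(records, bucket_of=lambda record: record[0])
reorganized = reduce_buckets(
    mapped, merge=lambda group: sum(count for _, count in group)
)
```

In this toy run the mapping criteria bucket records by user, and the merge logic sums the click counts within each bucket.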
-
Patent number: 7818279
Abstract: A method and system for detecting events based on query-page relationships is provided. The event detection system detects events by analyzing occurrences of query-page pairs generated from a user selecting the page of the pair from a search result for the query of the pair. The event detection system may identify semantic and temporal similarity between query-page pairs. The event detection system then identifies clusters of query-page pairs that are semantically and temporally similar.
Type: Grant
Filed: March 13, 2006
Date of Patent: October 19, 2010
Assignee: Microsoft Corporation
Inventors: Tie-Yan Liu, Wei-Ying Ma
-
Publication number: 20100257167
Abstract: Queries describe users' search needs and therefore they play a role in the context of learning to rank for information retrieval and Web search. However, most existing approaches for learning to rank do not explicitly take into consideration the fact that queries vary significantly along several dimensions and require different objectives for the ranking models. The technique described herein incorporates query difference into learning to rank by introducing query-dependent loss functions. Specifically, the technique employs query categorization to represent query differences and employs specific query-dependent loss functions based on those query differences. The technique employs two learning methods. One learns ranking functions with pre-defined query difference, while the other one learns both of them simultaneously.
Type: Application
Filed: April 1, 2009
Publication date: October 7, 2010
Applicant: MICROSOFT CORPORATION
Inventor: Tie-Yan Liu
-
Patent number: 7809723
Abstract: A method and system for distributed training of a hierarchical classifier for classifying documents using a classification hierarchy is provided. A training system provides training data that includes the documents and classifications of the documents within the classification hierarchy. The training system distributes the training of the classifiers of the hierarchical classifier to various agents so that the classifiers can be trained in parallel. For each classifier, the training system identifies an agent that is to train the classifier. Each agent then trains its classifiers.
Type: Grant
Filed: August 15, 2006
Date of Patent: October 5, 2010
Assignee: Microsoft Corporation
Inventors: Tie-Yan Liu, Wei-Ying Ma, Hua-Jun Zeng
-
Publication number: 20100250555
Abstract: The page ranking technique described herein employs a Markov Skeleton Mirror Process (MSMP), which is a particular case of Markov Skeleton Processes, to model and calculate page importance scores. Given a web graph and its metadata, the technique builds an MSMP model on the web graph. It first estimates the stationary distribution of an EMC and views it as transition probability. It next computes the mean staying time using the metadata. Finally, it calculates the product of transition probability and mean staying time, which is actually the stationary distribution of MSMP. This is regarded as page importance.
Type: Application
Filed: March 27, 2009
Publication date: September 30, 2010
Applicant: Microsoft Corporation
Inventors: Bin Gao, Tie-Yan Liu
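The final product step in this abstract can be sketched numerically. The code below is a minimal illustration under assumptions: "EMC" is taken here to mean an embedded Markov chain given as a row-stochastic matrix, its stationary distribution is estimated by plain power iteration (which assumes the chain is irreducible and aperiodic), and the result is normalized to sum to one. The function names, the matrix, and the staying times are hypothetical.

```python
def stationary_distribution(P, iterations=200):
    """Estimate the stationary distribution of a row-stochastic
    matrix P by repeatedly applying pi <- pi * P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def page_importance(P, mean_staying_time):
    """Multiply each page's stationary probability by its mean staying
    time, then normalize (the normalization is an assumption)."""
    pi = stationary_distribution(P)
    raw = [p * t for p, t in zip(pi, mean_staying_time)]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical two-page chain in which both pages are visited equally often,
# but visits to page 1 last three times as long as visits to page 0.
toy_P = [[0.5, 0.5], [0.5, 0.5]]
toy_staying = [1.0, 3.0]
scores = page_importance(toy_P, toy_staying)
```

Both pages have the same visit probability here, so the difference in their scores comes entirely from the staying time.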
-
Patent number: 7805438
Abstract: A method and system for generating a ranking function using a fidelity-based loss between a target probability and a model probability for a pair of documents is provided. A fidelity ranking system generates a fidelity ranking function that ranks the relevance of documents to queries. The fidelity ranking system operates to minimize a fidelity loss between pairs of documents of training data. The fidelity loss may be derived from “fidelity” as used in the field of quantum physics. The fidelity ranking system may use a learning technique in conjunction with a fidelity loss when generating the ranking function. After the fidelity ranking system generates the fidelity ranking function, it uses the fidelity ranking function to rank the relevance of documents to queries.
Type: Grant
Filed: July 31, 2006
Date of Patent: September 28, 2010
Assignee: Microsoft Corporation
Inventors: Tie-Yan Liu, Ming-Feng Tsai, Wei-Ying Ma
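A fidelity-style loss for one document pair can be written in a few lines. The sketch below is an assumption about the form of the loss, not a statement of what the patent claims: it uses the classical fidelity between two two-outcome distributions, and it assumes the model probability is a logistic function of the score difference. The function names are hypothetical.

```python
import math

def model_probability(score_i, score_j):
    """Assumed model probability that document i ranks above document j:
    a logistic function of the score difference."""
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))

def fidelity_loss(target_p, model_p):
    """One minus the fidelity between the target and model distributions
    over the two outcomes (i above j, j above i)."""
    fidelity = math.sqrt(target_p * model_p) + math.sqrt(
        (1.0 - target_p) * (1.0 - model_p)
    )
    return 1.0 - fidelity
```

With this form the loss is zero when the model probability equals the target probability and reaches one when the two distributions disagree completely, so it stays bounded for badly mis-ordered pairs.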
-
Publication number: 20100169323
Abstract: Described is a technology in which documents associated with a query are ranked by a ranking model that depends on the query. When a query is processed, a ranking model for the query is selected/determined based upon nearest neighbors to the query in query feature space. In one aspect, the ranking model is trained online, based on a training set obtained from a number of nearest neighbors to the query. In an alternative aspect, ranking models are trained offline using training sets; the query is used to find a most similar training set based on nearest neighbors of the query, with the ranking model that corresponds to the most similar training set being selected for ranking. In another alternative aspect, the ranking models are trained offline, with the nearest neighbor to the query determined and used to select its associated ranking model.
Type: Application
Filed: December 29, 2008
Publication date: July 1, 2010
Applicant: Microsoft Corporation
Inventors: Tie-Yan Liu, Xiubo Geng, Hang Li
-
Patent number: 7743058
Abstract: A method and system for high-order co-clustering of objects of heterogeneous types is provided. A clustering system co-clusters objects of heterogeneous types based on joint distributions for objects of non-central types and objects of a central type. The clustering system uses an iterative approach to co-clustering the objects of the various types. The clustering system divides the co-clustering into a sub-problem, for each non-central type (e.g., first type and second type), of co-clustering objects of that non-central type and objects of the central type based on the joint distribution for that non-central type. After the co-clustering is completed, the clustering system clusters objects of the central type based on the clusters of the objects of the non-central types identified during co-clustering. The clustering system repeats the iterations until the clusters of objects of the central type converge on a solution.
Type: Grant
Filed: January 10, 2007
Date of Patent: June 22, 2010
Assignee: Microsoft Corporation
Inventors: Tie-Yan Liu, Bin Gao, Wei-Ying Ma
-
Patent number: 7734633
Abstract: Procedures for learning and ranking items in a listwise manner are discussed. A listwise methodology may consider a ranked list, of individual items, as a specific permutation of the items being ranked. In implementations, a listwise loss function may be used in ranking items. A listwise loss function may be a metric which reflects the departure or disorder from an exemplary ranking for one or more sample listwise rankings used in learning. In this manner, the loss function may approximate the exemplary ranking for the plurality of items being ranked.
Type: Grant
Filed: October 18, 2007
Date of Patent: June 8, 2010
Assignee: Microsoft Corporation
Inventors: Tie-Yan Liu, Hang Li, Tao Qin, Zhe Cao
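One way to measure departure from an exemplary ranking over a whole list is a cross entropy between score-derived probability distributions. The abstract does not name a specific loss, so the sketch below is an assumption: it turns each score list into "top-one" probabilities with a softmax and compares the exemplary and predicted distributions. The function names and the sample scores in the usage note are hypothetical.

```python
import math

def top_one_probabilities(scores):
    """Softmax over scores: the probability of each item being ranked first.
    Subtracting the maximum keeps the exponentials numerically stable."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def listwise_loss(exemplary_scores, predicted_scores):
    """Cross entropy between the exemplary and predicted top-one
    distributions; smaller means the lists agree more closely."""
    target = top_one_probabilities(exemplary_scores)
    predicted = top_one_probabilities(predicted_scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))
```

For example, `listwise_loss([3, 2, 1], [3, 2, 1])` is smaller than `listwise_loss([3, 2, 1], [1, 2, 3])`, because the second prediction reverses the exemplary order.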
-
Patent number: 7698332
Abstract: A method and system for projecting queries and images into a similarity space where queries are close to their relevant images is provided. A similarity space projection (“SSP”) system learns a query projection function and an image projection function based on training data. The query projection function projects the relevance of the most relevant words of a query into a similarity space and the image projection function projects the relevance to an image of the most relevant words of a query into the same similarity space so that queries and their relevant images are close in the similarity space. The SSP system can then identify images that are relevant to a target query and queries that are relevant to a target image using the projection functions.
Type: Grant
Filed: March 13, 2006
Date of Patent: April 13, 2010
Assignee: Microsoft Corporation
Inventors: Tie-Yan Liu, Tao Qin, Wei-Ying Ma
-
Publication number: 20100082613
Abstract: The present invention provides an improved method for ranking documents using a ranking model. One embodiment employs Continuous Conditional Random Fields (CRF) as a model, which is a conditional probability distribution representing a mapping relationship from retrieved documents to their ranking scores. The model can naturally utilize features of the content information of documents as well as the relation information between documents for global ranking. The present invention also provides a learning algorithm for creating Continuous CRF. The invention also introduces Pseudo Relevance Feedback and Topic Distillation.
Type: Application
Filed: September 22, 2008
Publication date: April 1, 2010
Applicant: Microsoft Corporation
Inventors: Tie-Yan Liu, Tao Qin, Hang Li
-
Publication number: 20100082639
Abstract: The present invention introduces a new approach to learning systems. More specifically, the present invention provides learned methods for optimizing ranking models. In one aspect of the present invention, an objective function is defined as the likelihood of ground truth based on a Luce model. In another aspect, techniques of the present invention provide a way of representing different kinds of ground truths as a constraint set of permutations. In yet another aspect of the present invention, techniques of the present invention provide a way of learning the model parameter by maximizing the likelihood of the ground truth.
Type: Application
Filed: September 30, 2008
Publication date: April 1, 2010
Applicant: Microsoft Corporation
Inventors: Hang Li, Tie-Yan Liu
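The likelihood mentioned in this abstract can be illustrated with the standard Luce (Plackett-Luce) permutation probability. The sketch below is an assumption about how such an objective might look, not the claimed method: each item gets a score, a permutation's probability is built by picking items one at a time in proportion to the exponential of their scores, and a ground truth given as a set of acceptable permutations has the summed probability of that set. The function names are hypothetical.

```python
import math

def luce_log_likelihood(scores, permutation):
    """Log-probability of a permutation (item indices, best first)
    under a Luce model with weights exp(score)."""
    log_lik = 0.0
    remaining = list(permutation)
    while remaining:
        # The head of the list is chosen among all items still unranked.
        denominator = sum(math.exp(scores[i]) for i in remaining)
        log_lik += scores[remaining[0]] - math.log(denominator)
        remaining.pop(0)
    return log_lik

def ground_truth_log_likelihood(scores, permutation_set):
    """Log-likelihood of a ground truth expressed as a set of
    acceptable permutations: the log of their summed probability."""
    return math.log(
        sum(math.exp(luce_log_likelihood(scores, p)) for p in permutation_set)
    )
```

Learning the model parameters would then mean adjusting whatever produces `scores` so that this log-likelihood increases; the optimizer itself is outside the scope of this sketch.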
-
Publication number: 20100082606
Abstract: The present invention provides methods for improving a ranking model. In one embodiment, a method includes the step of obtaining queries, documents, and document labels. The process then initializes active sets using the document labels, wherein two active sets are established for each query, a perfect active set and an imperfect active set. Then, the process optimizes an empirical loss function by the use of the first and second active sets, whereby parameters of the ranking model are modified in accordance with the empirical loss function. The method then updates the active sets with additional ranking data, wherein the updates are configured to work in conjunction with the optimized loss function and modified ranking model. The recalculated active sets provide an indication for ranking the documents in a way that is more consistent with the document metadata.
Type: Application
Filed: September 24, 2008
Publication date: April 1, 2010
Applicant: Microsoft Corporation
Inventors: Jun Xu, Tie-Yan Liu, Hang Li
-
Publication number: 20100082617
Abstract: The present invention provides techniques for generating data that is used for ranking documents. In one embodiment, a method involves the step of extracting data features from a number of documents to be ranked. The data features extracted from the documents are established in conjunction with a first feature map and a second feature map, wherein the first feature map and the second feature map are capable of keeping the relative ordering between two document instances. In one embodiment, the two feature maps are specifically a divide feature map and a minus feature map. Once the data is mapped, the method involves the step of generating pairwise preferences from the first feature map and the second feature map. Then the pairwise preferences are aggregated into a total order, which can be used to produce one or more relevancy scores.
Type: Application
Filed: September 24, 2008
Publication date: April 1, 2010
Applicant: Microsoft Corporation
Inventors: Tie-Yan Liu, Hang Li