Patents by Inventor Tie-Yan Liu

Tie-Yan Liu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 7930303
    Abstract: A calculate importance system calculates the global importance of a web page based on a “mean hitting time.” Hitting time of a target web page is a measure of the minimum number of transitions needed to land on the target web page. Mean hitting time of a target web page is an average number of such transitions for all possible starting web pages. The calculate importance system calculates a global importance score for a web page based on the reciprocal of a mean hitting time. A search engine may rank web pages of a search result based on a combination of relevance of the web pages to the search request and global importance of the web pages based on a global hitting time.
    Type: Grant
    Filed: April 30, 2007
    Date of Patent: April 19, 2011
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Hang Li, Lei Qi, Bin Gao
  • Patent number: 7890502
    Abstract: A method and system for determining the contribution of a document within a hierarchy of documents based on the contribution of descendant documents is provided. The contribution system provides a hierarchy of documents that specifies the ancestor/descendant relations between documents. For each document of a hierarchy, the contribution system determines the contribution of that document, factoring in the contribution of its descendant documents. The contribution may be the relevance of a document to a topic, a feature of a document, and so on.
    Type: Grant
    Filed: November 14, 2005
    Date of Patent: February 15, 2011
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Wei-Ying Ma, Tao Qin
  • Publication number: 20110029466
    Abstract: A method and system for rank aggregation of entities based on supervised learning is provided. A rank aggregation system provides an order-based aggregation of rankings of entities by learning weights within an optimization framework for combining the rankings of the entities using labeled training data and the ordering of the individual rankings. The rank aggregation system is provided with multiple rankings of entities. The rank aggregation system is also provided with training data that indicates the relative ranking of pairs of entities. The rank aggregation system then learns weights for each of the ranking sources by attempting to optimize the difference between the relative rankings of pairs of entities using the weights and the relative rankings of pairs of entities of the training data.
    Type: Application
    Filed: October 15, 2010
    Publication date: February 3, 2011
    Applicant: Microsoft Corporation
    Inventors: Tie-Yan Liu, Hang Li, Yu-Ting Liu
  • Patent number: 7860971
    Abstract: An anti-spam tool works with a web browser to detect spam webpages locally on a client machine. The anti-spam tool can be implemented either as a plug-in module or an integral part of the browser, and manifested as a toolbar. The tool can perform an anti-spam action whenever a webpage is accessed through the browser, and does not require direct involvement of a search engine. A spam detection module installed on the computing device determines whether a webpage being accessed or whether a link contained in the webpage being accessed is spam, by comparing the URL of the webpage or the link with a spam list. The spam list can be downloaded from a remote search engine server, stored locally and updated from time to time. A two-level indexing technique is also introduced to improve the efficiency of the anti-spam tool's use of the spam list.
    Type: Grant
    Filed: February 21, 2008
    Date of Patent: December 28, 2010
    Assignee: Microsoft Corporation
    Inventors: Bin Gao, Tie-Yan Liu, Hang Li, Lei Yang
  • Patent number: 7853599
    Abstract: This disclosure describes various exemplary methods, computer program products, and systems for selecting features for ranking in information retrieval. This disclosure describes calculating importance scores for features, measuring similarity scores between two features, and selecting features that maximize total importance scores of the features and minimize total similarity scores between the features. Also, the disclosure includes selecting features for ranking by solving an optimization problem. Thus, this disclosure identifies relevant features by removing noisy and redundant features and speeds up the process of model training.
    Type: Grant
    Filed: January 21, 2008
    Date of Patent: December 14, 2010
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Geng Xiubo, Tao Qin, Hang Li
  • Patent number: 7840522
    Abstract: A method and system for rank aggregation of entities based on supervised learning is provided. A rank aggregation system provides an order-based aggregation of rankings of entities by learning weights within an optimization framework for combining the rankings of the entities using labeled training data and the ordering of the individual rankings. The rank aggregation system is provided with multiple rankings of entities. The rank aggregation system is also provided with training data that indicates the relative ranking of pairs of entities. The rank aggregation system then learns weights for each of the ranking sources by attempting to optimize the difference between the relative rankings of pairs of entities using the weights and the relative rankings of pairs of entities of the training data.
    Type: Grant
    Filed: March 7, 2007
    Date of Patent: November 23, 2010
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Hang Li, Yu-Ting Liu
  • Publication number: 20100281078
    Abstract: A distributed data reorganization system and method for mapping and reducing raw data containing a plurality of data records. Embodiments of the distributed data reorganization system and method operate in a general-purpose parallel execution environment that uses an arbitrary communication directed acyclic graph. The vertices of the graph accept multiple data inputs and generate multiple data outputs, and may be of different types. Embodiments of the distributed data reorganization system and method include a plurality of distributed mappers that use mapping criteria supplied by a developer to map the plurality of data records to data buckets. The mapped data record and data bucket identifications are input for a plurality of distributed reducers. Each distributed reducer groups together data records having the same data bucket identification and then uses a merge logic supplied by the developer to reduce the grouped data records to obtain reorganized data.
    Type: Application
    Filed: April 30, 2009
    Publication date: November 4, 2010
    Applicant: Microsoft Corporation
    Inventors: Taifeng Wang, Tie-Yan Liu
  • Patent number: 7818279
    Abstract: A method and system for detecting events based on query-page relationships is provided. The event detection system detects events by analyzing occurrences of query-page pairs generated from a user selecting the page of the pair from a search result for the query of the pair. The event detection system may identify semantic and temporal similarity between query-page pairs. The event detection system then identifies clusters of query-page pairs that are semantically and temporally similar.
    Type: Grant
    Filed: March 13, 2006
    Date of Patent: October 19, 2010
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Wei-Ying Ma
  • Publication number: 20100257167
    Abstract: Queries describe users' search needs and therefore they play a role in the context of learning to rank for information retrieval and Web search. However, most existing approaches for learning to rank do not explicitly take into consideration the fact that queries vary significantly along several dimensions and require different objectives for the ranking models. The technique described herein incorporates query difference into learning to rank by introducing query-dependent loss functions. Specifically, the technique employs query categorization to represent query differences and employs specific query-dependent loss functions based on these query differences. The technique employs two learning methods. One learns ranking functions with pre-defined query difference, while the other learns both simultaneously.
    Type: Application
    Filed: April 1, 2009
    Publication date: October 7, 2010
    Applicant: Microsoft Corporation
    Inventor: Tie-Yan Liu
  • Patent number: 7809723
    Abstract: A method and system for distributed training of a hierarchical classifier for classifying documents using a classification hierarchy is provided. A training system provides training data that includes the documents and classifications of the documents within the classification hierarchy. The training system distributes the training of the classifiers of the hierarchical classifier to various agents so that the classifiers can be trained in parallel. For each classifier, the training system identifies an agent that is to train the classifier. Each agent then trains its classifiers.
    Type: Grant
    Filed: August 15, 2006
    Date of Patent: October 5, 2010
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Wei-Ying Ma, Hua-Jun Zeng
  • Publication number: 20100250555
    Abstract: The page ranking technique described herein employs a Markov Skeleton Mirror Process (MSMP), which is a particular case of Markov Skeleton Processes, to model and calculate page importance scores. Given a web graph and its metadata, the technique builds an MSMP model on the web graph. It first estimates the stationary distribution of an embedded Markov chain (EMC) and views it as transition probability. It next computes the mean staying time using the metadata. Finally, it calculates the product of transition probability and mean staying time, which is actually the stationary distribution of MSMP. This is regarded as page importance.
    Type: Application
    Filed: March 27, 2009
    Publication date: September 30, 2010
    Applicant: Microsoft Corporation
    Inventors: Bin Gao, Tie-Yan Liu
  • Patent number: 7805438
    Abstract: A method and system for generating a ranking function using a fidelity-based loss between a target probability and a model probability for a pair of documents is provided. A fidelity ranking system generates a fidelity ranking function that ranks the relevance of documents to queries. The fidelity ranking system operates to minimize a fidelity loss between pairs of documents of training data. The fidelity loss may be derived from “fidelity” as used in the field of quantum physics. The fidelity ranking system may use a learning technique in conjunction with a fidelity loss when generating the ranking function. After the fidelity ranking system generates the fidelity ranking function, it uses the fidelity ranking function to rank the relevance of documents to queries.
    Type: Grant
    Filed: July 31, 2006
    Date of Patent: September 28, 2010
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Ming-Feng Tsai, Wei-Ying Ma
  • Publication number: 20100169323
    Abstract: Described is a technology in which documents associated with a query are ranked by a ranking model that depends on the query. When a query is processed, a ranking model for the query is selected/determined based upon nearest neighbors to the query in query feature space. In one aspect, the ranking model is trained online, based on a training set obtained from a number of nearest neighbors to the query. In an alternative aspect, ranking models are trained offline using training sets; the query is used to find a most similar training set based on nearest neighbors of the query, with the ranking model that corresponds to the most similar training set being selected for ranking. In another alternative aspect, the ranking models are trained offline, with the nearest neighbor to the query determined and used to select its associated ranking model.
    Type: Application
    Filed: December 29, 2008
    Publication date: July 1, 2010
    Applicant: Microsoft Corporation
    Inventors: Tie-Yan Liu, Xiubo Geng, Hang Li
  • Patent number: 7743058
    Abstract: A method and system for high-order co-clustering of objects of heterogeneous types is provided. A clustering system co-clusters objects of heterogeneous types based on joint distributions for objects of non-central types and objects of a central type. The clustering system uses an iterative approach to co-clustering the objects of the various types. The clustering system divides the co-clustering into a sub-problem, for each non-central type (e.g., first type and second type), of co-clustering objects of that non-central type and objects of the central type based on the joint distribution for that non-central type. After the co-clustering is completed, the clustering system clusters objects of the central type based on the clusters of the objects of the non-central types identified during co-clustering. The clustering system repeats the iterations until the clusters of objects of the central type converge on a solution.
    Type: Grant
    Filed: January 10, 2007
    Date of Patent: June 22, 2010
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Bin Gao, Wei-Ying Ma
  • Patent number: 7734633
    Abstract: Procedures for learning and ranking items in a listwise manner are discussed. A listwise methodology may consider a ranked list of individual items as a specific permutation of the items being ranked. In implementations, a listwise loss function may be used in ranking items. A listwise loss function may be a metric which reflects the departure or disorder from an exemplary ranking for one or more sample listwise rankings used in learning. In this manner, the loss function may approximate the exemplary ranking for the plurality of items being ranked.
    Type: Grant
    Filed: October 18, 2007
    Date of Patent: June 8, 2010
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Hang Li, Tao Qin, Zhe Cao
  • Patent number: 7698332
    Abstract: A method and system for projecting queries and images into a similarity space where queries are close to their relevant images is provided. A similarity space projection (“SSP”) system learns a query projection function and an image projection function based on training data. The query projection function projects the relevance of the most relevant words of a query into a similarity space and the image projection function projects the relevance to an image of the most relevant words of a query into the same similarity space so that queries and their relevant images are close in the similarity space. The SSP system can then identify images that are relevant to a target query and queries that are relevant to a target image using the projection functions.
    Type: Grant
    Filed: March 13, 2006
    Date of Patent: April 13, 2010
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Tao Qin, Wei-Ying Ma
  • Publication number: 20100082613
    Abstract: The present invention provides an improved method for ranking documents using a ranking model. One embodiment employs Continuous Conditional Random Fields (CRF) as a model, which is a conditional probability distribution representing a mapping relationship from retrieved documents to their ranking scores. The model can naturally utilize features of the content information of documents as well as the relation information between documents for global ranking. The present invention also provides a learning algorithm for creating Continuous CRF. The invention also introduces Pseudo Relevance Feedback and Topic Distillation.
    Type: Application
    Filed: September 22, 2008
    Publication date: April 1, 2010
    Applicant: Microsoft Corporation
    Inventors: Tie-Yan Liu, Tao Qin, Hang Li
  • Publication number: 20100082639
    Abstract: The present invention introduces a new approach to learning systems. More specifically, the present invention provides learning methods for optimizing ranking models. In one aspect of the present invention, an objective function is defined as the likelihood of ground truth based on a Luce model. In another aspect, techniques of the present invention provide a way of representing different kinds of ground truths as a constraint set of permutations. In yet another aspect, techniques of the present invention provide a way of learning the model parameter by maximizing the likelihood of the ground truth.
    Type: Application
    Filed: September 30, 2008
    Publication date: April 1, 2010
    Applicant: Microsoft Corporation
    Inventors: Hang Li, Tie-Yan Liu
  • Publication number: 20100082606
    Abstract: The present invention provides methods for improving a ranking model. In one embodiment, a method includes the step of obtaining queries, documents, and document labels. The process then initializes active sets using the document labels, wherein two active sets are established for each query, a perfect active set and an imperfect active set. Then, the process optimizes an empirical loss function by the use of the first and second active sets, whereby parameters of the ranking model are modified in accordance with the empirical loss function. The method then updates the active sets with additional ranking data, wherein the updates are configured to work in conjunction with the optimized loss function and modified ranking model. The recalculated active sets provide an indication for ranking the documents in a way that is more consistent with the document metadata.
    Type: Application
    Filed: September 24, 2008
    Publication date: April 1, 2010
    Applicant: Microsoft Corporation
    Inventors: Jun Xu, Tie-Yan Liu, Hang Li
  • Publication number: 20100082617
    Abstract: The present invention provides techniques for generating data that is used for ranking documents. In one embodiment, a method involves the step of extracting data features from a number of documents to be ranked. The data features extracted from the documents are established in conjunction with a first feature map and a second feature map, wherein the first feature map and the second feature map are capable of keeping the relative ordering between two document instances. In one embodiment, the two feature maps are specifically a divide feature map and a minus feature map. Once the data is mapped, the method involves the step of generating pairwise preferences from the first feature map and the second feature map. Then the pairwise preferences are aggregated into a total order, which can be used to produce one or more relevancy scores.
    Type: Application
    Filed: September 24, 2008
    Publication date: April 1, 2010
    Applicant: Microsoft Corporation
    Inventors: Tie-Yan Liu, Hang Li
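
The Luce-model objective mentioned in publication 20100082639 above can be illustrated with a minimal sketch. It assumes the standard Plackett-Luce form with exponentiated scores, in which a ranking is built by repeatedly picking the next item with probability proportional to exp(score) among the items not yet placed. The function names and toy scores are hypothetical and are not taken from the application, and the constraint-set representation of ground truth is reduced here to a single full permutation.

```python
import math


def luce_log_likelihood(scores, ranking):
    """Log-probability of `ranking` under a Plackett-Luce model.

    scores  -- one real-valued model score per item (assumed parameterization)
    ranking -- item indices ordered from best to worst
    """
    remaining = list(ranking)
    log_prob = 0.0
    for item in ranking:
        # Probability of picking `item` next among the items not yet placed.
        log_norm = math.log(sum(math.exp(scores[j]) for j in remaining))
        log_prob += scores[item] - log_norm
        remaining.remove(item)
    return log_prob


def most_likely_ranking(scores):
    """Under this model, the most probable ranking sorts items by descending score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    toy_scores = [2.0, 0.5, 1.0]   # hypothetical scores for three documents
    ground_truth = [0, 2, 1]       # hypothetical ground-truth order, best first
    print(luce_log_likelihood(toy_scores, ground_truth))
```

In terms of this sketch, learning would adjust the scores (through the ranking model's parameters) to raise the log-likelihood of the ground-truth ranking, for example by gradient ascent; that step is omitted here.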