Patents by Inventor Tie-Yan Liu

Tie-Yan Liu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20100082617
    Abstract: The present invention provides techniques for generating data that is used for ranking documents. In one embodiment, a method involves the step of extracting data features from a number of documents to be ranked. The data features extracted from the documents are established in conjunction with a first feature map and a second feature map, wherein the first feature map and the second feature map are capable of keeping the relative ordering between two document instances. In one embodiment, the two feature maps are specifically a divide feature map and a minus feature map. Once the data is mapped, the method involves the step of generating pairwise preferences from the first feature map and the second feature map. Then the pairwise preferences are aggregated into a total order, which can be used to produce one or more relevancy scores.
    Type: Application
    Filed: September 24, 2008
    Publication date: April 1, 2010
    Applicant: Microsoft Corporation
    Inventors: Tie-Yan Liu, Hang Li
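    Example: A minimal sketch of the pipeline this abstract describes (feature map, pairwise preferences, total order). The linear weights, the sample feature vectors, and the win-count aggregation are assumptions made for illustration; the abstract does not say how preferences are scored or aggregated, and only the minus feature map is shown.

```python
from itertools import combinations

def minus_feature_map(x, y):
    """Element-wise difference of two documents' feature vectors."""
    return [a - b for a, b in zip(x, y)]

def pairwise_preferences(docs, weights):
    """Return (winner, loser) pairs from a linear score on the minus map."""
    prefs = []
    for (i, x), (j, y) in combinations(docs.items(), 2):
        score = sum(w * d for w, d in zip(weights, minus_feature_map(x, y)))
        if score > 0:
            prefs.append((i, j))
        elif score < 0:
            prefs.append((j, i))
    return prefs

def aggregate_total_order(doc_ids, prefs):
    """Order documents by number of pairwise wins (ties broken by id)."""
    wins = {d: 0 for d in doc_ids}
    for winner, _ in prefs:
        wins[winner] += 1
    return sorted(doc_ids, key=lambda d: (-wins[d], d))

# Hypothetical feature vectors and weights, for illustration only.
docs = {"d1": [0.2, 1.0], "d2": [0.9, 3.0], "d3": [0.5, 2.0]}
weights = [1.0, 0.5]
prefs = pairwise_preferences(docs, weights)
ranking = aggregate_total_order(list(docs), prefs)
print(ranking)
```

    Counting wins is only one simple way to turn pairwise preferences into a total order; a document's win count could also serve as a crude relevancy score.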
  • Publication number: 20100082613
    Abstract: The present invention provides an improved method for ranking documents using a ranking model. One embodiment employs Continuous Conditional Random Fields (CRF) as a model, which is a conditional probability distribution representing a mapping relationship from retrieved documents to their ranking scores. The model can naturally utilize features of the content information of documents as well as the relation information between documents for global ranking. The present invention also provides a learning algorithm for creating Continuous CRF. The invention also introduces Pseudo Relevance Feedback and Topic Distillation.
    Type: Application
    Filed: September 22, 2008
    Publication date: April 1, 2010
    Applicant: Microsoft Corporation
    Inventors: Tie-Yan Liu, Tao Qin, Hang Li
  • Publication number: 20100082606
    Abstract: The present invention provides methods for improving a ranking model. In one embodiment, a method includes the step of obtaining queries, documents, and document labels. The process then initializes active sets using the document labels, wherein two active sets are established for each query, a perfect active set and an imperfect active set. Then, the process optimizes an empirical loss function by the use of the first and second active sets, whereby parameters of the ranking model are modified in accordance with the empirical loss function. The method then updates the active sets with additional ranking data, wherein the updates are configured to work in conjunction with the optimized loss function and modified ranking model. The recalculated active sets provide an indication for ranking the documents in a way that is more consistent with the document metadata.
    Type: Application
    Filed: September 24, 2008
    Publication date: April 1, 2010
    Applicant: Microsoft Corporation
    Inventors: Jun Xu, Tie-Yan Liu, Hang Li
  • Publication number: 20100073374
    Abstract: Method for creating a graph representing web browsing behavior, including receiving web browsing behavior data from one or more web browsers; adding a node on the graph for each web page listed in the web browsing behavior data; adding a first link connecting two or more nodes on the graph, wherein the first link represents a hyperlink for accessing a webpage; calculating an amount of time in which each web page is being accessed; determining a number of units of time in the calculated amount of time; adding one or more virtual nodes to the graph based on the number of units of time; and adding a second link connecting two or more virtual nodes on the graph, wherein the second link represents a virtual hyperlink for accessing a webpage.
    Type: Application
    Filed: September 24, 2008
    Publication date: March 25, 2010
    Applicant: Microsoft Corporation
    Inventors: Bin Gao, Tie-Yan Liu, Hang Li, Yuting Liu
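    Example: A minimal sketch of the graph construction listed in this abstract. The input record layout, the 10-second time unit, and the choice to chain a page's virtual nodes in sequence are assumptions for illustration; the abstract does not specify them.

```python
from collections import defaultdict

def build_browsing_graph(visits, unit_seconds=10):
    """visits: list of (source_page, target_page_or_None, seconds_on_source)."""
    nodes = set()
    hyperlinks = set()      # "first links": page -> page
    virtual_links = set()   # "second links": between virtual nodes
    dwell = defaultdict(float)

    for source, target, seconds in visits:
        nodes.add(source)
        dwell[source] += seconds
        if target is not None:
            nodes.add(target)
            hyperlinks.add((source, target))

    for page, seconds in dwell.items():
        # Number of whole time units spent on the page.
        units = int(seconds // unit_seconds)
        virtual = [(page, k) for k in range(1, units + 1)]
        nodes.update(virtual)
        # Chain consecutive virtual nodes of the same page.
        virtual_links.update(zip(virtual, virtual[1:]))

    return nodes, hyperlinks, virtual_links

# Hypothetical browsing log: 35 s on a.html, then a click to b.html (8 s).
visits = [("a.html", "b.html", 35), ("b.html", None, 8)]
nodes, hyperlinks, virtual_links = build_browsing_graph(visits)
print(len(nodes), sorted(hyperlinks))
```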
  • Publication number: 20100076910
    Abstract: Method for determining a webpage importance, including receiving web browsing behavior data of one or more users; creating a model of the web browsing behavior data; calculating a stationary probability distribution of the model; and correlating the stationary probability distribution to the webpage importance.
    Type: Application
    Filed: September 25, 2008
    Publication date: March 25, 2010
    Applicant: Microsoft Corporation
    Inventors: Bin Gao, Tie-Yan Liu, Hang Li, Yuting Liu
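    Example: A minimal sketch of the "stationary probability distribution" step, assuming the browsing model is a discrete-time Markov chain given as a row-stochastic matrix. The abstract does not name the model type or the solver; power iteration and the 3-page matrix below are illustrative choices.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Power iteration: repeat pi <- pi * P until pi stops changing."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical transition probabilities between three pages (rows sum to 1).
P = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.5, 0.5],
]
pi = stationary_distribution(P)
print([round(p, 6) for p in pi])
```

    Under these assumptions the page with the largest stationary probability would be treated as the most important one.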
  • Patent number: 7680851
    Abstract: A method and system for introducing spam into a search engine for testing purposes is provided. An active spam testing system receives from a tester a specification of spam that is to be introduced into the search engine for testing purposes. The testing system may then generate auxiliary data structures for storing indications of the spam that is to be introduced. A search engine has original data structures that may include a content index and a link data structure. The testing system stores the indications of the spam in the auxiliary data structures so that use of the search engine for non-testing purposes is not affected. When the search engine is used for testing purposes, the search engine generates search results based on a combination of the original data structures and the auxiliary data structures.
    Type: Grant
    Filed: March 7, 2007
    Date of Patent: March 16, 2010
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Hang Li
  • Patent number: 7676520
    Abstract: A method and system for determining temporal importance of documents having links between documents based on a temporal analysis of the links is provided. A temporal ranking system collects link information or snapshots indicating the links between documents at various snapshot times. The temporal ranking system calculates a current temporal importance of a document by factoring in the current importance of the document derived from the current snapshot (i.e., with the latest snapshot time) and the historical importance of the document derived from the past snapshots. To calculate the current temporal importance of a web page, the temporal ranking system aggregates the importance of the web page for each snapshot.
    Type: Grant
    Filed: April 12, 2007
    Date of Patent: March 9, 2010
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Hang Li, Lei Qi, Bin Gao, Lei Yang
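    Example: A minimal sketch of aggregating per-snapshot importance into a current temporal importance. The abstract says only that the importance values are aggregated; the exponentially decaying weights (by snapshot rank, latest first) and the sample values are assumptions for illustration.

```python
def temporal_importance(snapshots, decay=0.5):
    """snapshots: list of (snapshot_time, importance), in any order."""
    ordered = sorted(snapshots, reverse=True)  # latest snapshot first
    # Weight 1 for the latest snapshot, then decay, decay**2, ...
    weights = [decay ** age for age in range(len(ordered))]
    total = sum(weights)
    return sum(w * imp for w, (_, imp) in zip(weights, ordered)) / total

# Hypothetical importance of one web page in three yearly snapshots.
score = temporal_importance([(2005, 0.2), (2006, 0.4), (2007, 0.8)])
print(score)
```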
  • Patent number: 7650031
    Abstract: Methods and systems for identifying black frames within a sequence of frames are provided. In one embodiment, the detection system detects black frames within a sequence of frames by fully decoding base frames and then partially decoding non-black, non-base frames in a way that ensures the blackness of each frame can be determined. The detection system decodes base frames before decoding dependent frames, which is referred to as processing frames in reverse order of dependency since a frame is processed before the frames that depend on it are processed. In another embodiment, the detection system determines the blackness of frames within a sequence of frames by processing the frames in order of their dependency and following chains of block dependency to decode and determine the blackness of blocks.
    Type: Grant
    Filed: November 23, 2004
    Date of Patent: January 19, 2010
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Bo Feng, Hong-Jiang Zhang
  • Patent number: 7634476
    Abstract: A method and system for determining a ranking of web sites based on an aggregation of rankings of the web pages within the web sites is provided. A ranking system identifies for each web site a stationary distribution of a stochastic complement of the transition probabilities between web pages of the web site. The ranking system then calculates transition probabilities between web sites based on the web page transition probabilities weighted by the stationary distribution of the stochastic complements. The ranking system then calculates the stationary distribution of the transition probabilities of the web sites to represent a ranking of the web sites.
    Type: Grant
    Filed: July 25, 2006
    Date of Patent: December 15, 2009
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Wei-Ying Ma
  • Patent number: 7624081
    Abstract: A community mining system analyzes objects of different types and relationships between the objects of different types to identify communities. The relationships between the objects have an associated time. The community mining system extracts various features related to objects of a designated type from the relationships between objects of different types that represent the evolution of the features over time. The community mining system collects training data that indicates extracted features associated with members of the communities. The community mining system then classifies an object of the designated type as being within the community based on closeness of the features of the object to the features of the training data.
    Type: Grant
    Filed: March 28, 2006
    Date of Patent: November 24, 2009
    Assignee: Microsoft Corporation
    Inventors: Qiankun Zhao, Tie-Yan Liu, Wei-Ying Ma
  • Publication number: 20090282031
    Abstract: A method and system is provided for calculating importance of documents based on transition probabilities from a source document to a target document based on looking ahead to information content of target documents of the source document. A look-ahead importance system generates transition probabilities of transitioning between any pair of source and target documents based on analysis of links to target documents of the source document. The system may calculate the transition probabilities based on the number of links on documents a look-ahead distance away. The system then solves for the stationary probabilities of the transition probabilities. The stationary probabilities represent the importance of the documents.
    Type: Application
    Filed: July 15, 2009
    Publication date: November 12, 2009
    Applicant: Microsoft Corporation
    Inventor: Tie-Yan Liu
  • Publication number: 20090282032
    Abstract: A method and system for generating a search result for a query of hierarchically organized documents based on retrieval of subtrees that are key resources for topic distillation is provided. The retrieval system may identify documents relevant to a query using conventional searching techniques. The retrieval system then calculates a subtree feature for subtrees that have an identified document as their root. After the retrieval system calculates the subtree feature for the subtrees, the retrieval system may generate a subtree relevance score for each subtree based on its subtree feature. The retrieval system may then order the identified documents based on their corresponding subtree relevances.
    Type: Application
    Filed: July 17, 2009
    Publication date: November 12, 2009
    Applicant: Microsoft Corporation
    Inventors: Tie-Yan Liu, Tao Qin, Wei-Ying Ma
  • Patent number: 7617194
    Abstract: A method and system for ranking importance of vertices of a directed graph based on links between the vertices and some prior knowledge of importance of the vertices is provided. A ranking system inputs an indication of the vertices along with an indication of the links between the vertices as the directed graph. The ranking system generates a transition-probability matrix that represents the probability of transitioning from vertex to vertex. The ranking system then generates a ranking of the vertices based on the links between the vertices represented by the stationary distribution of the transition-probability matrix that is minimally perturbed to satisfy the prior knowledge, which may be a partial ranking of the vertices.
    Type: Grant
    Filed: December 29, 2006
    Date of Patent: November 10, 2009
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Tao Qin, Wei-Ying Ma
  • Publication number: 20090249004
    Abstract: Embodiments are provided for caching and accessing Directed Acyclic Graph (DAG) data to and from a computing device of a DAG distributed execution engine during the processing of an iterative algorithm. In accordance with one embodiment, a method includes processing a first subgraph of a plurality of subgraphs from the distributed storage system in the computing device. The first subgraph is processed with associated input values in the computing device to generate first output values in an iteration. The method further includes storing a second subgraph in a cache of the device. The second subgraph is a duplicate of the first subgraph. Moreover, the method also includes processing the second subgraph with the first output values to generate second output values if the device is to process the first subgraph in each of one or more subsequent iterations.
    Type: Application
    Filed: March 26, 2008
    Publication date: October 1, 2009
    Applicant: Microsoft Corporation
    Inventors: Taifeng Wang, Tie-Yan Liu, Minghao Liu, Zhi Chen
  • Patent number: 7593934
    Abstract: A method and system for generating a ranking function to rank the relevance of documents to a query is provided. The ranking system learns a ranking function from training data that includes queries, resultant documents, and relevance of each document to its query. The ranking system learns a ranking function using the training data by weighting incorrect rankings of relevant documents more heavily than the incorrect rankings of not relevant documents so that more emphasis is placed on correctly ranking relevant documents. The ranking system may also learn a ranking function using the training data by normalizing the contribution of each query to the ranking function so that it is independent of the number of relevant documents of each query.
    Type: Grant
    Filed: July 28, 2006
    Date of Patent: September 22, 2009
    Assignee: Microsoft Corporation
    Inventors: Hang Li, Jun Xu, Yunbo Cao, Tie-Yan Liu
  • Publication number: 20090216868
    Abstract: An anti-spam tool works with a web browser to detect spam webpages locally on a client machine. The anti-spam tool can be implemented either as a plug-in module or an integral part of the browser, and manifested as a toolbar. The tool can perform an anti-spam action whenever a webpage is accessed through the browser, and does not require direct involvement of a search engine. A spam detection module installed on the computing device determines whether a webpage being accessed or whether a link contained in the webpage being accessed is spam, by comparing the URL of the webpage or the link with a spam list. The spam list can be downloaded from a remote search engine server, stored locally and updated from time to time. A two-level indexing technique is also introduced to improve the efficiency of the anti-spam tool's use of the spam list.
    Type: Application
    Filed: February 21, 2008
    Publication date: August 27, 2009
    Applicant: Microsoft Corporation
    Inventors: Bin Gao, Tie-Yan Liu, Hang Li, Lei Yang
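    Example: A minimal sketch of a client-side spam-list lookup with a two-level index. The abstract does not describe how the two levels are organized; indexing first by host and then by path, and the `SpamList` class itself, are assumptions for illustration.

```python
from urllib.parse import urlsplit

class SpamList:
    """Hypothetical two-level index: host -> set of spam paths."""

    def __init__(self, spam_urls):
        self._index = {}
        for url in spam_urls:
            host, path = self._key(url)
            self._index.setdefault(host, set()).add(path)

    @staticmethod
    def _key(url):
        parts = urlsplit(url)
        return parts.netloc.lower(), parts.path or "/"

    def is_spam(self, url):
        host, path = self._key(url)
        paths = self._index.get(host)          # first level: host
        return paths is not None and path in paths  # second level: path

    def spam_links(self, links):
        """Filter the links found on a page down to the listed ones."""
        return [link for link in links if self.is_spam(link)]

# Hypothetical locally stored spam list.
spam = SpamList(["http://spam.example/buy", "http://bad.example/"])
print(spam.is_spam("http://SPAM.example/buy"))
print(spam.spam_links(["http://ok.example/", "http://bad.example"]))
```

    With this layout, a URL whose host is not listed is rejected after a single dictionary lookup, without touching any path set.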
  • Patent number: 7580945
    Abstract: A method and system is provided for calculating importance of documents based on transition probabilities from a source document to a target document based on looking ahead to information content of target documents of the source document. A look-ahead importance system generates transition probabilities of transitioning between any pair of source and target documents based on analysis of links to target documents of the source document. The system may calculate the transition probabilities based on the number of links on documents a look-ahead distance away. The system then solves for the stationary probabilities of the transition probabilities. The stationary probabilities represent the importance of the documents.
    Type: Grant
    Filed: March 30, 2007
    Date of Patent: August 25, 2009
    Assignee: Microsoft Corporation
    Inventor: Tie-Yan Liu
  • Patent number: 7580931
    Abstract: A method and system for generating a search result for a query of hierarchically organized documents based on retrieval of subtrees that are key resources for topic distillation is provided. The retrieval system may identify documents relevant to a query using conventional searching techniques. The retrieval system then calculates a subtree feature for subtrees that have an identified document as their root. After the retrieval system calculates the subtree feature for the subtrees, the retrieval system may generate a subtree relevance score for each subtree based on its subtree feature. The retrieval system may then order the identified documents based on their corresponding subtree relevances.
    Type: Grant
    Filed: March 13, 2006
    Date of Patent: August 25, 2009
    Assignee: Microsoft Corporation
    Inventors: Tie-Yan Liu, Tao Qin, Wei-Ying Ma
  • Publication number: 20090198673
    Abstract: An anti-spam technique for protecting search engine ranking is based on mining search engine optimization (SEO) forums. The anti-spam technique collects webpages such as SEO forum posts from a list of suspect spam websites, and extracts suspicious link exchange URLs and corresponding link information from the collected webpages. A search engine ranking penalty is then applied to the suspicious link exchange URLs. The penalty is at least partially determined by the link information associated with the respective suspicious link exchange URL. To detect more suspicious link exchange URLs, the technique may propagate one or more levels from a seed set of suspicious link exchange URLs generated by mining SEO forums.
    Type: Application
    Filed: February 6, 2008
    Publication date: August 6, 2009
    Applicant: Microsoft Corporation
    Inventors: Bin Gao, Tie-Yan Liu, Hang Li, Congkai Sun
  • Publication number: 20090187555
    Abstract: This disclosure describes various exemplary methods, computer program products, and systems for selecting features for ranking in information retrieval. This disclosure describes calculating importance scores for features, measuring similarity scores between two features, and selecting features that maximize total importance scores of the features and minimize total similarity scores between the features. Also, the disclosure includes selecting features for ranking by solving an optimization problem. Thus, this disclosure identifies relevant features by removing noisy and redundant features and speeds up a process of model training.
    Type: Application
    Filed: January 21, 2008
    Publication date: July 23, 2009
    Applicant: Microsoft Corporation
    Inventors: Tie-Yan Liu, Geng Xiubo, Tao Qin, Hang Li
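    Example: A minimal sketch of selecting features with high importance and low mutual similarity. The greedy heuristic, the trade-off parameter `c`, and the sample scores are assumptions for illustration; the abstract states only that selection is posed as an optimization problem.

```python
def select_features(importance, similarity, k, c=1.0):
    """Greedily pick k features: high importance, low similarity to picks."""
    remaining = dict(importance)  # working scores, penalised as we go
    selected = []
    while remaining and len(selected) < k:
        # Highest working score; sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=remaining.get)
        selected.append(best)
        del remaining[best]
        # Penalise features that are similar to the one just selected.
        for f in remaining:
            pair = frozenset((best, f))
            remaining[f] -= c * similarity.get(pair, 0.0)
    return selected

# Hypothetical importance and pairwise similarity scores.
importance = {"bm25": 0.9, "tf_idf": 0.85, "pagerank": 0.6}
similarity = {
    frozenset(("bm25", "tf_idf")): 0.8,
    frozenset(("bm25", "pagerank")): 0.1,
    frozenset(("tf_idf", "pagerank")): 0.1,
}
chosen = select_features(importance, similarity, k=2)
print(chosen)
```

    Setting `c=0.0` in this sketch ignores similarity and reduces the selection to the top-k features by importance.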