Patents by Inventor Jinliang Fan

Jinliang Fan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Distributed non-negative matrix factorization

Patent number: 8356086

Abstract: Architecture that scales up the non-negative matrix factorization (NMF) technique to a distributed NMF (denoted DNMF) to handle large matrices, for example, on a web scale that can include millions and billions of data points. To analyze web-scale data, DNMF is applied through parallelism on distributed computer clusters, for example, with thousands of machines. In order to maximize the parallelism and data locality, matrices are partitioned in the short dimension. The probabilistic DNMF can employ not only Gaussian and Poisson NMF techniques, but also exponential NMF for modeling web dyadic data (e.g., dwell time of a user on browsed web pages).
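
As a rough illustration of the Gaussian case, the sketch below runs the standard multiplicative NMF updates with both factors stored as lists of k-dimensional vectors, so that each vector can be updated on its own once a small k x k matrix has been shared. This is a single-process toy under stated assumptions, not the patented architecture: the function names (gram, update_factor, factorize, squared_error), the random initialization, and the eps guard are all hypothetical, and no cluster or partitioning framework is modeled.

```python
import random


def gram(vectors, k):
    """Return the k x k sum of outer products v v^T over the given k-vectors."""
    g = [[0.0] * k for _ in range(k)]
    for v in vectors:
        for a in range(k):
            for b in range(k):
                g[a][b] += v[a] * v[b]
    return g


def update_factor(data_rows, own, other, k, eps=1e-9):
    """Apply one multiplicative update to every k-vector in `own`.

    data_rows[i] holds the data entries that pair with own[i], and other[j]
    is the k-vector of the opposite factor for index j. Each iteration of
    the loop touches only one vector of `own`, so the loop could in
    principle be spread across workers (assumption: not done here).
    """
    shared = gram(other, k)  # small k x k matrix every worker would need
    updated = []
    for row, vec in zip(data_rows, own):
        num = [sum(row[j] * other[j][a] for j in range(len(other)))
               for a in range(k)]
        den = [sum(vec[b] * shared[b][a] for b in range(k)) for a in range(k)]
        updated.append([vec[a] * num[a] / (den[a] + eps) for a in range(k)])
    return updated


def squared_error(A, W, H):
    """Sum of squared differences between A and the product of the factors."""
    return sum(
        (A[i][j] - sum(w * h for w, h in zip(W[i], H[j]))) ** 2
        for i in range(len(A))
        for j in range(len(A[0]))
    )


def factorize(A, k, iterations=100, seed=0):
    """Approximate non-negative A (m x n) by m row vectors W and n column vectors H."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    A_t = [list(col) for col in zip(*A)]  # columns of A, used for the H update
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    for _ in range(iterations):
        H = update_factor(A_t, H, W, k)
        W = update_factor(A, W, H, k)
    return W, H
```

Calling factorize(A, k) on a small non-negative matrix returns the two factors; squared_error(A, W, H) gives a quick check of how well their product reproduces A.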

Type: Grant

Filed: March 31, 2010

Date of Patent: January 15, 2013

Assignee: Microsoft Corporation

Inventors: Chao Liu, Hung-chih Yang, Jinliang Fan, Li-wei He, Yi-Min Wang

Automatic diagnosis of search relevance failures

Patent number: 8041710

Abstract: Search relevance failures are diagnosed automatically. Users presented with unsatisfactory search results can report their dissatisfaction through various mechanisms. Dissatisfaction reports can trigger automatic investigation into the root cause of such dissatisfaction. Based on the identified root cause, a search engine can be modified to resolve the issue creating dissatisfaction, thereby improving search engine quality.

Type: Grant

Filed: November 13, 2008

Date of Patent: October 18, 2011

Assignee: Microsoft Corporation

Inventors: Li-wei He, Wenzhao Tan, Jinliang Fan, Yi-Min Wang, Xiaoxin Yin

DISTRIBUTED NON-NEGATIVE MATRIX FACTORIZATION

Publication number: 20110246573

Abstract: Architecture that scales up the non-negative matrix factorization (NMF) technique to a distributed NMF (denoted DNMF) to handle large matrices, for example, on a web scale that can include millions and billions of data points. To analyze web-scale data, DNMF is applied through parallelism on distributed computer clusters, for example, with thousands of machines. In order to maximize the parallelism and data locality, matrices are partitioned in the short dimension. The probabilistic DNMF can employ not only Gaussian and Poisson NMF techniques, but also exponential NMF for modeling web dyadic data (e.g., dwell time of a user on browsed web pages).

Type: Application

Filed: March 31, 2010

Publication date: October 6, 2011

Applicant: Microsoft Corporation

Inventors: Chao Liu, Hung-Chih Yang, Jinliang Fan, Li-Wei He, Yi-Min Wang

Data replica selector

Patent number: 7778183

Abstract: A method is provided for selecting a replication node from eligible nodes in a network. A multidimensional model is constructed that defines a multidimensional space and includes the eligible nodes, with each of the dimensions of the multidimensional model being a system characteristic. A data availability value is determined for each of the eligible nodes, and a cost of deploying is determined for each of at least two availability strategies to the eligible nodes. At least one of the eligible nodes is selected for replication of data that is stored on a source node in the network. The selecting step includes selecting the eligible node whose: data availability value is determined to be highest among the eligible nodes whose cost of deploying does not exceed a specified maximum, or cost of deploying is determined to be lowest among the eligible nodes whose data availability value does not exceed a specified minimum.
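
The two selection rules at the end of this abstract can be sketched as a pair of small filters. The Node fields, the score ranges, and both function names are hypothetical, and the second rule is read here as requiring availability of at least the stated minimum, which is an interpretation of the abstract's wording rather than something it states. The multidimensional model and the availability strategies themselves are not modeled.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Node:
    name: str
    availability: float  # hypothetical data availability value, higher is better
    cost: float          # hypothetical cost of deploying to this node


def most_available_within_budget(nodes: Iterable[Node],
                                 max_cost: float) -> Optional[Node]:
    """Highest availability among nodes whose cost does not exceed max_cost."""
    affordable = [n for n in nodes if n.cost <= max_cost]
    return max(affordable, key=lambda n: n.availability, default=None)


def cheapest_meeting_availability(nodes: Iterable[Node],
                                  min_availability: float) -> Optional[Node]:
    """Lowest cost among nodes whose availability is at least min_availability."""
    adequate = [n for n in nodes if n.availability >= min_availability]
    return min(adequate, key=lambda n: n.cost, default=None)
```

Both functions return None when no eligible node passes the filter, so a caller can tell an empty result apart from a genuine selection.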

Type: Grant

Filed: February 20, 2007

Date of Patent: August 17, 2010

Assignee: International Business Machines Corporation

Inventors: Jinliang Fan, Nagui Halim, Zhen Liu, Dimitrios Pendarakis

AUTOMATIC DIAGNOSIS OF SEARCH RELEVANCE FAILURES

Publication number: 20100121841

Abstract: Search relevance failures are diagnosed automatically. Users presented with unsatisfactory search results can report their dissatisfaction through various mechanisms. Dissatisfaction reports can trigger automatic investigation into the root cause of such dissatisfaction. Based on the identified root cause, a search engine can be modified to resolve the issue creating dissatisfaction, thereby improving search engine quality.

Type: Application

Filed: November 13, 2008

Publication date: May 13, 2010

Applicant: MICROSOFT CORPORATION

Inventors: Li-wei He, Wenzhao Tan, Jinliang Fan, Yi-Min Wang, Xiaoxin Yin

Data replica selector

Patent number: 7650529

Abstract: There is provided a method and system for replicating data at another location. The system includes a source node that contains data in a data storage area. The source node is coupled to a network of potential replication nodes. The processor determines at least two eligible nodes in the network of nodes and determines the communication cost associated with each of the eligible nodes. The processor also determines a probability of a concurrent failure of the source node and each of the eligible nodes, and selects at least one of the eligible nodes for replication of the data located on the source node. The selection is based on the determined communication costs and probability of concurrent failure.
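
One simple way to combine the two quantities named here is a weighted score per eligible node. The abstract does not say how they are combined, so the linear formula, the failure_penalty weight, and the pick_replica name below are all assumptions made for illustration.

```python
def pick_replica(nodes, failure_penalty):
    """Return the name of the eligible node with the lowest combined score.

    nodes: iterable of (name, comm_cost, p_concurrent_failure) tuples.
    failure_penalty: hypothetical weight that converts a probability of
    concurrent failure with the source node into the same units as cost.
    """
    return min(nodes, key=lambda n: n[1] + failure_penalty * n[2])[0]
```

A small penalty favors nearby, cheap nodes even if they tend to fail together with the source; a large penalty pushes the choice toward nodes that are costlier to reach but less likely to fail at the same time.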

Type: Grant

Filed: June 25, 2008

Date of Patent: January 19, 2010

Assignee: International Business Machines Corporation

Inventors: Jinliang Fan, Zhen Liu, Dimitrios Pendarakis

Method for replicating data based on probability of concurrent failure

Patent number: 7480817

Abstract: A method is provided for replicating data. All nodes coupled to a source node via a network are surveyed to determine candidate replication nodes, and coordinates for each candidate replication node are acquired. The coordinates are used to determine a geographic location of and a communication cost for each candidate replication node. Each geographic location is rated based on probability of a concurrent failure of the source node and the candidate replication node, and a branch-and-bound algorithm is used to assign values to sets of candidate replication nodes based on the communication costs and the ratings. One set of candidate replication nodes is selected based on the assigned values. The data is replicated on the nodes of the selected set of candidate replication nodes, and all nodes coupled to the source node via the network are at least periodically monitored to determine availability of new nodes.
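
The branch-and-bound step can be sketched on a toy version of the problem: choose the cheapest set of candidate nodes whose joint probability of failing together with the source stays under a threshold. The objective, the threshold, the independence assumption used to multiply probabilities, and the select_replica_set name are all hypothetical; the abstract only states that values are assigned to sets from communication costs and ratings.

```python
def select_replica_set(candidates, max_joint_failure):
    """Pick the cheapest set of nodes meeting a joint-failure threshold.

    candidates: list of (name, comm_cost, p_concurrent_failure) tuples,
    with non-negative costs. Probabilities are multiplied as if failures
    were independent (an assumption of this sketch).
    Returns (names, total_cost), or (None, inf) if no set qualifies.
    """
    order = sorted(candidates, key=lambda c: c[1])  # try cheap nodes first
    best = {"cost": float("inf"), "names": None}

    def branch(index, cost, joint, chosen):
        if cost >= best["cost"]:
            return  # bound: costs are non-negative, so this branch cannot improve
        if chosen and joint <= max_joint_failure:
            best["cost"], best["names"] = cost, tuple(chosen)
            return  # adding further nodes would only raise the cost
        if index == len(order):
            return
        name, node_cost, p_fail = order[index]
        branch(index + 1, cost + node_cost, joint * p_fail, chosen + [name])
        branch(index + 1, cost, joint, chosen)  # skip this node

    branch(0, 0.0, 1.0, [])
    return best["names"], best["cost"]
```

The bound used here is deliberately the simplest valid one, the cost already committed; a tighter bound would prune more of the search tree but is not needed to show the idea.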

Type: Grant

Filed: March 31, 2006

Date of Patent: January 20, 2009

Assignee: International Business Machines Corporation

Inventors: Jinliang Fan, Zhen Liu, Dimitrios Pendarakis

DATA REPLICA SELECTOR

Publication number: 20080270822

Abstract: There is provided a method and system for replicating data at another location. The system includes a source node that contains data in a data storage area. The source node is coupled to a network of potential replication nodes. The processor determines at least two eligible nodes in the network of nodes and determines the communication cost associated with each of the eligible nodes. The processor also determines a probability of a concurrent failure of the source node and each of the eligible nodes, and selects at least one of the eligible nodes for replication of the data located on the source node. The selection is based on the determined communication costs and probability of concurrent failure.

Type: Application

Filed: June 25, 2008

Publication date: October 30, 2008

Applicant: International Business Machines Corp.

Inventors: Jinliang Fan, Zhen Liu, Dimitrios Pendarakis

DATA REPLICA SELECTOR

Publication number: 20080198752

Abstract: A method is provided for selecting a replication node from eligible nodes in a network. A multidimensional model is constructed that defines a multidimensional space and includes the eligible nodes, with each of the dimensions of the multidimensional model being a system characteristic. A data availability value is determined for each of the eligible nodes, and a cost of deploying is determined for each of at least two availability strategies to the eligible nodes. At least one of the eligible nodes is selected for replication of data that is stored on a source node in the network. The selecting step includes selecting the eligible node whose: data availability value is determined to be highest among the eligible nodes whose cost of deploying does not exceed a specified maximum, or cost of deploying is determined to be lowest among the eligible nodes whose data availability value does not exceed a specified minimum.

Type: Application

Filed: February 20, 2007

Publication date: August 21, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jinliang Fan, Nagui Halim, Zhen Liu, Dimitrios Pendarakis

Data replica selector

Publication number: 20070234102

Abstract: There is provided a method and system for replicating data at another location. The system includes a source node that contains data in a data storage area. The source node is coupled to a network of potential replication nodes. The processor determines at least two eligible nodes in the network of nodes and determines the communication cost associated with each of the eligible nodes. The processor also determines a probability of a concurrent failure of the source node and each of the eligible nodes, and selects at least one of the eligible nodes for replication of the data located on the source node. The selection is based on the determined communication costs and probability of concurrent failure.

Type: Application

Filed: March 31, 2006

Publication date: October 4, 2007

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jinliang Fan, Zhen Liu, Dimitrios Pendarakis