Patents by Inventor Jinliang Fan
Jinliang Fan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8356086
Abstract: Architecture that scales up the non-negative matrix factorization (NMF) technique to a distributed NMF (denoted DNMF) to handle large matrices, for example, on a web scale that can include millions and billions of data points. To analyze web-scale data, DNMF is applied through parallelism on distributed computer clusters, for example, with thousands of machines. In order to maximize the parallelism and data locality, matrices are partitioned in the short dimension. The probabilistic DNMF can employ not only Gaussian and Poisson NMF techniques, but also exponential NMF for modeling web dyadic data (e.g., dwell time of a user on browsed web pages).
Type: Grant
Filed: March 31, 2010
Date of Patent: January 15, 2013
Assignee: Microsoft Corporation
Inventors: Chao Liu, Hung-chih Yang, Jinliang Fan, Li-wei He, Yi-Min Wang
-
Patent number: 8041710
Abstract: Search relevance failures are diagnosed automatically. Users presented with unsatisfactory search results can report their dissatisfaction through various mechanisms. Dissatisfaction reports can trigger automatic investigation into the root cause of such dissatisfaction. Based on the identified root cause, a search engine can be modified to resolve the issue creating dissatisfaction, thereby improving search engine quality.
Type: Grant
Filed: November 13, 2008
Date of Patent: October 18, 2011
Assignee: Microsoft Corporation
Inventors: Li-wei He, Wenzhao Tan, Jinliang Fan, Yi-Min Wang, Xiaoxin Yin
-
Publication number: 20110246573
Abstract: Architecture that scales up the non-negative matrix factorization (NMF) technique to a distributed NMF (denoted DNMF) to handle large matrices, for example, on a web scale that can include millions and billions of data points. To analyze web-scale data, DNMF is applied through parallelism on distributed computer clusters, for example, with thousands of machines. In order to maximize the parallelism and data locality, matrices are partitioned in the short dimension. The probabilistic DNMF can employ not only Gaussian and Poisson NMF techniques, but also exponential NMF for modeling web dyadic data (e.g., dwell time of a user on browsed web pages).
Type: Application
Filed: March 31, 2010
Publication date: October 6, 2011
Applicant: Microsoft Corporation
Inventors: Chao Liu, Hung-Chih Yang, Jinliang Fan, Li-Wei He, Yi-Min Wang
-
Patent number: 7778183
Abstract: A method is provided for selecting a replication node from eligible nodes in a network. A multidimensional model is constructed that defines a multidimensional space and includes the eligible nodes, with each of the dimensions of the multidimensional model being a system characteristic. A data availability value is determined for each of the eligible nodes, and a cost of deploying is determined for each of at least two availability strategies to the eligible nodes. At least one of the eligible nodes is selected for replication of data that is stored on a source node in the network. The selecting step includes selecting the eligible node whose: data availability value is determined to be highest among the eligible nodes whose cost of deploying does not exceed a specified maximum, or cost of deploying is determined to be lowest among the eligible nodes whose data availability value does not exceed a specified minimum.
Type: Grant
Filed: February 20, 2007
Date of Patent: August 17, 2010
Assignee: International Business Machines Corporation
Inventors: Jinliang Fan, Nagui Halim, Zhen Liu, Dimitrios Pendarakis
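The first selection rule in the abstract above (highest data availability among eligible nodes whose deployment cost stays within a specified maximum) can be illustrated with a minimal sketch. This is not the patented method: the `Node` type, its field names, and the function `select_replication_node` are hypothetical names introduced here, and the multidimensional model and availability strategies are left out entirely.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Node:
    """Hypothetical eligible node; the field names are assumptions for illustration."""
    name: str
    availability: float  # data availability value (higher is better)
    deploy_cost: float   # cost of deploying to this node


def select_replication_node(nodes: Sequence[Node], max_cost: float) -> Optional[Node]:
    """Return the most available node whose cost does not exceed max_cost.

    Returns None when no eligible node fits within the cost limit.
    """
    affordable = [n for n in nodes if n.deploy_cost <= max_cost]
    if not affordable:
        return None
    return max(affordable, key=lambda n: n.availability)


if __name__ == "__main__":
    eligible = [
        Node("a", availability=0.90, deploy_cost=4.0),
        Node("b", availability=0.99, deploy_cost=9.0),
        Node("c", availability=0.95, deploy_cost=5.0),
    ]
    # With a cost limit of 6.0, node "b" is excluded and "c" is chosen.
    print(select_replication_node(eligible, max_cost=6.0))
```

The sketch covers only the cost-capped branch; the abstract's alternative branch (lowest cost subject to an availability bound) would be a symmetric filter followed by `min`.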
-
Publication number: 20100121841
Abstract: Search relevance failures are diagnosed automatically. Users presented with unsatisfactory search results can report their dissatisfaction through various mechanisms. Dissatisfaction reports can trigger automatic investigation into the root cause of such dissatisfaction. Based on the identified root cause, a search engine can be modified to resolve the issue creating dissatisfaction, thereby improving search engine quality.
Type: Application
Filed: November 13, 2008
Publication date: May 13, 2010
Applicant: MICROSOFT CORPORATION
Inventors: Li-wei He, Wenzhao Tan, Jinliang Fan, Yi-Min Wang, Xiaoxin Yin
-
Patent number: 7650529
Abstract: There is provided a method and system for replicating data at another location. The system includes a source node that contains data in a data storage area. The source node is coupled to a network of potential replication nodes. The processor determines at least two eligible nodes in the network of nodes and determines the communication cost associated with each of the eligible nodes. The processor also determines a probability of a concurrent failure of the source node and each of the eligible nodes, and selects at least one of the eligible nodes for replication of the data located on the source node. The selection is based on the determined communication costs and probability of concurrent failure.
Type: Grant
Filed: June 25, 2008
Date of Patent: January 19, 2010
Assignee: International Business Machines Corporation
Inventors: Jinliang Fan, Zhen Liu, Dimitrios Pendarakis
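The abstract above says the selection depends on communication cost and on the probability of a concurrent failure, but it does not say how the two are combined. As a rough illustration only, the sketch below assumes a weighted sum; the function `pick_replica`, the `failure_penalty` weight, and the sample figures are all hypothetical and do not come from the patent.

```python
from typing import Dict, Tuple


def pick_replica(candidates: Dict[str, Tuple[float, float]],
                 failure_penalty: float) -> str:
    """Pick the eligible node with the lowest combined score.

    candidates maps a node name to a pair
    (communication cost, probability of concurrent failure with the source).
    The score cost + failure_penalty * probability is an assumed combination.
    """
    if len(candidates) < 2:
        # The abstract speaks of at least two eligible nodes.
        raise ValueError("need at least two eligible nodes")

    def score(name: str) -> float:
        cost, p_concurrent_failure = candidates[name]
        return cost + failure_penalty * p_concurrent_failure

    return min(candidates, key=score)


if __name__ == "__main__":
    nodes = {
        "near": (1.0, 0.30),  # cheap to reach, but likely to fail together with the source
        "far": (5.0, 0.01),   # costlier to reach, but fails largely independently
    }
    print(pick_replica(nodes, failure_penalty=100.0))
```

A large `failure_penalty` favors distant, independently failing nodes, while a small one favors nodes that are cheap to reach; that trade-off is the point of weighing both quantities.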
-
Patent number: 7480817
Abstract: A method is provided for replicating data. All nodes coupled to a source node via a network are surveyed to determine candidate replication nodes, and coordinates for each candidate replication node are acquired. The coordinates are used to determine a geographic location of and a communication cost for each candidate replication node. Each geographic location is rated based on probability of a concurrent failure of the source node and the candidate replication node, and a branch-and-bound algorithm is used to assign values to sets of candidate replication nodes based on the communication costs and the ratings. One set of candidate replication nodes is selected based on the assigned values. The data is replicated on the nodes of the selected set of candidate replication nodes, and all nodes coupled to the source node via the network are at least periodically monitored to determine availability of new nodes.
Type: Grant
Filed: March 31, 2006
Date of Patent: January 20, 2009
Assignee: International Business Machines Corporation
Inventors: Jinliang Fan, Zhen Liu, Dimitrios Pendarakis
-
Publication number: 20080270822
Abstract: There is provided a method and system for replicating data at another location. The system includes a source node that contains data in a data storage area. The source node is coupled to a network of potential replication nodes. The processor determines at least two eligible nodes in the network of nodes and determines the communication cost associated with each of the eligible nodes. The processor also determines a probability of a concurrent failure of the source node and each of the eligible nodes, and selects at least one of the eligible nodes for replication of the data located on the source node. The selection is based on the determined communication costs and probability of concurrent failure.
Type: Application
Filed: June 25, 2008
Publication date: October 30, 2008
Applicant: International Business Machines Corp.
Inventors: Jinliang Fan, Zhen Liu, Dimitrios Pendarakis
-
Publication number: 20080198752
Abstract: A method is provided for selecting a replication node from eligible nodes in a network. A multidimensional model is constructed that defines a multidimensional space and includes the eligible nodes, with each of the dimensions of the multidimensional model being a system characteristic. A data availability value is determined for each of the eligible nodes, and a cost of deploying is determined for each of at least two availability strategies to the eligible nodes. At least one of the eligible nodes is selected for replication of data that is stored on a source node in the network. The selecting step includes selecting the eligible node whose: data availability value is determined to be highest among the eligible nodes whose cost of deploying does not exceed a specified maximum, or cost of deploying is determined to be lowest among the eligible nodes whose data availability value does not exceed a specified minimum.
Type: Application
Filed: February 20, 2007
Publication date: August 21, 2008
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Jinliang Fan, Nagui Halim, Zhen Liu, Dimitrios Pendarakis
-
Publication number: 20070234102
Abstract: There is provided a method and system for replicating data at another location. The system includes a source node that contains data in a data storage area. The source node is coupled to a network of potential replication nodes. The processor determines at least two eligible nodes in the network of nodes and determines the communication cost associated with each of the eligible nodes. The processor also determines a probability of a concurrent failure of the source node and each of the eligible nodes, and selects at least one of the eligible nodes for replication of the data located on the source node. The selection is based on the determined communication costs and probability of concurrent failure.
Type: Application
Filed: March 31, 2006
Publication date: October 4, 2007
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Jinliang Fan, Zhen Liu, Dimitrios Pendarakis