Patents by Inventor Lin Hao Xu

Lin Hao Xu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Prefetching RDF triple data

Patent number: 10831767

Abstract: Query requests for RDF triples are obtained, wherein the query request(s) contain(s) at least one triple pattern; for each triple pattern, the corresponding elementary pattern is determined, and each triple pattern is converted to a weighted elementary pattern. The occurrence frequency of each elementary pattern is computed based on the weighted elementary patterns; at least one elementary pattern is chosen at least according to the occurrence frequency; and the RDF triples corresponding to the chosen at least elementary pattern are prefetched into the buffer. The corresponding apparatus is also provided. With the above method and apparatus, the frequently accessed RDF triples can be determined and prefetched into the buffer, which improves the query efficiency.
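
The flow in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the wildcard representation of an elementary pattern, the per-query weighting, and the names `elementary_pattern`, `pattern_frequencies`, and `prefetch` are all assumptions made for illustration.

```python
from collections import Counter

WILDCARD = None  # assumed stand-in for any variable position


def elementary_pattern(triple_pattern):
    """Replace every variable (a term starting with '?') by a wildcard."""
    return tuple(WILDCARD if term.startswith("?") else term
                 for term in triple_pattern)


def pattern_frequencies(queries):
    """Sum the weights of each elementary pattern over all queries.

    Hypothetical weighting: each query spreads a total weight of 1
    evenly over its triple patterns.
    """
    freq = Counter()
    for query in queries:
        if not query:
            continue
        weight = 1.0 / len(query)
        for triple_pattern in query:
            freq[elementary_pattern(triple_pattern)] += weight
    return freq


def matches(pattern, triple):
    return all(p is WILDCARD or p == t for p, t in zip(pattern, triple))


def prefetch(store, queries, top_n=1):
    """Pick the top_n most frequent elementary patterns and copy the
    matching triples from the store into a buffer."""
    freq = pattern_frequencies(queries)
    chosen = [pattern for pattern, _ in freq.most_common(top_n)]
    buffer = [t for t in store if any(matches(p, t) for p in chosen)]
    return chosen, buffer


store = [("alice", "knows", "bob"),
         ("bob", "knows", "carol"),
         ("alice", "age", "30")]
queries = [[("?x", "knows", "?y")],
           [("?x", "knows", "?y"), ("?x", "age", "?a")],
           [("alice", "age", "?a")]]
chosen, buffer = prefetch(store, queries)
```

With this sample data the pattern `(?, knows, ?)` accumulates the highest weight, so the two `knows` triples are the ones placed in the buffer.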

Type: Grant

Filed: November 13, 2016

Date of Patent: November 10, 2020

Assignee: International Business Machines Corporation

Inventors: Yue Pan, Xing Zhi Sun, Qing Fa Wang, Shuo Wu, Lin Hao Xu

Generating job alert

Patent number: 10705935

Abstract: A method and system for generating a job alert. According to embodiments of the present invention, before a target job is processed, a characteristic of input and output of the target job in at least one stage is determined through analyzing a historical job, and a resource overhead associated with the processing of the target job is calculated based on the characteristic of input and output. Then, an alert for the target job is generated in response to the resource overhead exceeding a predetermined threshold. In such manner, an alert for the target job can be proactively generated before the resource overhead problem occurs, so as to enable an administrator or developer to discover a fault in advance and adopt measures actively to avoid loss and damage to the intermediate results or output data when the target job is processed.
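
As a rough illustration of the idea in this abstract, the sketch below derives a per-stage output/input ratio from one historical job and uses it to estimate the overhead of a target job before it runs. The choice of ratio as the "characteristic", the use of peak stage output as the "resource overhead", and all function names are assumptions, not details taken from the patent.

```python
def stage_ratios(historical_stages):
    """historical_stages: list of (input_bytes, output_bytes), one per stage.

    Assumes every historical stage had a nonzero input size.
    """
    return [out / inp for inp, out in historical_stages]


def estimate_overhead(input_bytes, ratios):
    """Largest data volume produced by any stage (hypothetical measure)."""
    size = float(input_bytes)
    peak = 0.0
    for ratio in ratios:
        size *= ratio
        peak = max(peak, size)
    return peak


def job_alert(input_bytes, historical_stages, threshold_bytes):
    """Return an alert message before the job runs, or None."""
    overhead = estimate_overhead(input_bytes, stage_ratios(historical_stages))
    if overhead > threshold_bytes:
        return (f"ALERT: estimated overhead {overhead:.0f} bytes "
                f"exceeds threshold {threshold_bytes} bytes")
    return None


history = [(100, 300), (300, 30)]  # a stage that expands, then one that shrinks
message = job_alert(1000, history, threshold_bytes=2000)
```

Here the first stage triples its input, so a 1000-byte target job is estimated to peak at 3000 bytes and the alert fires against a 2000-byte threshold.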

Type: Grant

Filed: September 24, 2015

Date of Patent: July 7, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Zhao Cao, Peng Li, Jun Ma, Ju Wei Shi, Bing Jiang Sun, Chen Wang, Lin Hao Xu, Chang Hai Yan, Xiao Ning Zhang, Jia Zou

System and method of query processing with schema change in JSON document store

Patent number: 10664471

Abstract: An information processing system, a computer readable storage medium, and a method of managing a query to find a set of JSON documents in a multi-schema JSON document store. A query engine receives a first query to find at least one JSON document in a plurality of sets of JSON documents stored in the JSON document store, each set of JSON documents being organized in a unique JSON schema version related to a unique JSON schema version of each other set of JSON documents by at least one schema change. The first query is organized in a first unique JSON schema version. A query translator translates the first query into a plurality of queries based on the plurality of schema changes. The first and the plurality of queries are executed to provide a collective set of query results.
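
A minimal sketch of the translate-and-collect idea follows. It assumes that every schema change is a simple field rename, that queries are flat equality filters, and that the first query is written against the oldest schema version; none of these restrictions come from the abstract, and the function names are hypothetical.

```python
def translate(query, renames):
    """Rewrite the field names of an equality query using {old: new}."""
    return {renames.get(field, field): value for field, value in query.items()}


def run(query, docs):
    """Return the documents whose fields equal every value in the query."""
    return [d for d in docs if all(d.get(f) == v for f, v in query.items())]


def query_all_versions(first_query, doc_sets, schema_changes):
    """doc_sets[i] holds documents in schema version i;
    schema_changes[i] maps version i field names to version i + 1."""
    results = run(first_query, doc_sets[0])
    query = first_query
    for renames, docs in zip(schema_changes, doc_sets[1:]):
        query = translate(query, renames)  # one translated query per version
        results.extend(run(query, docs))
    return results


doc_sets = [
    [{"name": "Ann", "city": "Oslo"}],       # version 0
    [{"full_name": "Ann", "city": "Rome"}],  # version 1: name -> full_name
    [{"full_name": "Ann", "town": "Oslo"}],  # version 2: city -> town
]
schema_changes = [{"name": "full_name"}, {"city": "town"}]
hits = query_all_versions({"name": "Ann", "city": "Oslo"}, doc_sets, schema_changes)
```

The single query written for version 0 finds the matching documents in versions 0 and 2 and skips the version 1 document, whose city differs.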

Type: Grant

Filed: December 29, 2017

Date of Patent: May 26, 2020

Assignee: International Business Machines Corporation

Inventors: Zhao Cao, Yuan Feng, Tao Li, Lanjun Wang, Lin Hao Xu

Data analytics on distributed databases

Patent number: 10614087

Abstract: Data analytics is performed on a distributed document storage database by receiving a request for initiating a data analytics job; collecting statistics from the database in response to the request; using the statistics to estimate a first cost for merging an incremental data update for the job into a first resilient distributed dataset; using the statistics to estimate a second cost for newly creating a second resilient distributed dataset for the job; when the first cost is less than the second cost, reading data updates from the database and merging the data updates into the first resilient distributed dataset; and when the first cost is not less than the second cost, newly creating the second resilient distributed dataset by reading all documents from the database.
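
The cost comparison described in this abstract can be sketched in plain Python, with dictionaries standing in for the resilient distributed dataset and the document database. The linear cost model and its per-document constants are invented for illustration only.

```python
def choose_plan(total_docs, updated_docs,
                merge_cost_per_doc=3.0, read_cost_per_doc=1.0):
    """Compare the estimated cost of merging updates with a full rebuild.

    Both cost formulas are hypothetical linear models.
    """
    merge_cost = updated_docs * merge_cost_per_doc
    rebuild_cost = total_docs * read_cost_per_doc
    # Ties go to "rebuild", matching "not less than" in the abstract.
    return "merge" if merge_cost < rebuild_cost else "rebuild"


def refresh(dataset, database, updates):
    """dataset: the cached {doc_id: doc}; database: the current full
    contents; updates: documents changed since the dataset was built."""
    plan = choose_plan(len(database), len(updates))
    if plan == "merge":
        merged = dict(dataset)
        merged.update(updates)
        return plan, merged
    return plan, dict(database)  # read every document again


database = {i: {"value": i} for i in range(10)}
stale = dict(database)
stale[0] = {"value": -1}  # the cached copy is out of date for document 0
plan, fresh = refresh(stale, database, updates={0: database[0]})
```

With one changed document out of ten, merging is estimated to be cheaper, so only the update is applied to the cached dataset.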

Type: Grant

Filed: January 17, 2017

Date of Patent: April 7, 2020

Assignee: International Business Machines Corporation

Inventors: Zhao Cao, Jing Wang, Lanjun Wang, Li Wen, Yan Wu, Lin Hao Xu

Processing time series

Patent number: 10423635

Abstract: A method for processing a time series includes dividing, with a processing device, the time series into a plurality of windows by time; extracting at least one group of similar subsequences from a current window among the plurality of windows; and updating a candidate list on the basis of comparison between similar subsequences in each group of the at least one group with k characteristic subsequences in the candidate list; wherein the k characteristic subsequences are k characteristic subsequences with a greatest number of occurrences in at least processed parts of the time series.
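
The windowed top-k procedure in this abstract might look roughly like the sketch below. The similarity rule (two subsequences are similar when their rounded values are identical), the fixed subsequence length, and the pruning of the candidate list to k entries after each window are simplifying assumptions, so the counts are approximate for subsequences that drop out of the list.

```python
from collections import Counter


def windows(series, size):
    """Split the series into consecutive windows of the given size."""
    for start in range(0, len(series), size):
        yield series[start:start + size]


def similar_groups(window, sub_len):
    """Group subsequences of one window by their rounded values."""
    groups = Counter()
    for i in range(len(window) - sub_len + 1):
        key = tuple(round(v) for v in window[i:i + sub_len])
        groups[key] += 1
    return groups


def top_k_characteristic(series, window_size, sub_len, k):
    """Maintain a candidate list of the k most frequent subsequences."""
    candidates = Counter()
    for window in windows(series, window_size):
        candidates.update(similar_groups(window, sub_len))
        # Keep only the k best candidates before the next window.
        candidates = Counter(dict(candidates.most_common(k)))
    return candidates.most_common(k)


series = [1, 2, 1, 2, 1, 2, 5, 6, 1, 2, 1, 2]
result = top_k_characteristic(series, window_size=6, sub_len=2, k=1)
```

For this series the subsequence `(1, 2)` occurs three times in the first window and twice in the second, so it remains the single characteristic subsequence with a count of 5.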

Type: Grant

Filed: May 26, 2015

Date of Patent: September 24, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Xiao Yan Chen, Yao Liang Chen, Sheng Huang, Kai Liu, Wei Lu, Lin Hao Xu, Xiao Min Xu

Processing time series

Patent number: 10366095

Abstract: A method for processing a time series includes dividing, with a processing device, the time series into a plurality of windows by time; extracting at least one group of similar subsequences from a current window among the plurality of windows; and updating a candidate list on the basis of comparison between similar subsequences in each group of the at least one group with k characteristic subsequences in the candidate list; wherein the k characteristic subsequences are k characteristic subsequences with a greatest number of occurrences in at least processed parts of the time series.

Type: Grant

Filed: June 24, 2015

Date of Patent: July 30, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Xiao Yan Chen, Yao Liang Chen, Sheng Huang, Kai Liu, Wei Lu, Lin Hao Xu, Xiao Min Xu

DATA ANALYTICS ON DISTRIBUTED DATABASES

Publication number: 20180203912

Abstract: Data analytics is performed on a distributed document storage database by receiving a request for initiating a data analytics job; collecting statistics from the database in response to the request; using the statistics to estimate a first cost for merging an incremental data update for the job into a first resilient distributed dataset; using the statistics to estimate a second cost for newly creating a second resilient distributed dataset for the job; when the first cost is less than the second cost, reading data updates from the database and merging the data updates into the first resilient distributed dataset; and when the first cost is not less than the second cost, newly creating the second resilient distributed dataset by reading all documents from the database.

Type: Application

Filed: January 17, 2017

Publication date: July 19, 2018

Inventors: Zhao Cao, Jing Wang, Lanjun Wang, Li Wen, Yan Wu, Lin Hao Xu

SYSTEM AND METHOD OF QUERY PROCESSING WITH SCHEMA CHANGE IN JSON DOCUMENT STORE

Publication number: 20180121498

Abstract: An information processing system, a computer readable storage medium, and a method of managing a query to find a set of JSON documents in a multi-schema JSON document store. A query engine receives a first query to find at least one JSON document in a plurality of sets of JSON documents stored in the JSON document store, each set of JSON documents being organized in a unique JSON schema version related to a unique JSON schema version of each other set of JSON documents by at least one schema change. The first query is organized in a first unique JSON schema version. A query translator translates the first query into a plurality of queries based on the plurality of schema changes. The first and the plurality of queries are executed to provide a collective set of query results.

Type: Application

Filed: December 29, 2017

Publication date: May 3, 2018

Applicant: International Business Machines Corporation

Inventors: Zhao Cao, Yuan Feng, Tao Li, Lanjun Wang, Lin Hao Xu

System and method of query processing with schema change in JSON document store

Patent number: 9881054

Abstract: An information processing system, a computer readable storage medium, and a method of managing a query to find a set of JSON documents in a multi-schema JSON document store. A query engine receives a first query to find at least one JSON document in a plurality of sets of JSON documents stored in the JSON document store, each set of JSON documents being organized in a unique JSON schema version related to a unique JSON schema version of each other set of JSON documents by at least one schema change. The first query is organized in a first unique JSON schema version. A query translator translates the first query into a second query based on the at least one schema change. The first and second queries are executed to provide first and second query results which are collectively returned.

Type: Grant

Filed: September 30, 2015

Date of Patent: January 30, 2018

Assignee: International Business Machines Corporation

Inventors: Zhao Cao, Yuan Feng, Tao Li, Lanjun Wang, Lin Hao Xu

SYSTEM AND METHOD OF QUERY PROCESSING WITH SCHEMA CHANGE IN JSON DOCUMENT STORE

Publication number: 20170091265

Abstract: An information processing system, a computer readable storage medium, and a method of managing a query to find a set of JSON documents in a multi-schema JSON document store. A query engine receives a first query to find at least one JSON document in a plurality of sets of JSON documents stored in the JSON document store, each set of JSON documents being organized in a unique JSON schema version related to a unique JSON schema version of each other set of JSON documents by at least one schema change. The first query is organized in a first unique JSON schema version. A query translator translates the first query into a second query based on the at least one schema change. The first and second queries are executed to provide first and second query results which are collectively returned.

Type: Application

Filed: September 30, 2015

Publication date: March 30, 2017

Inventors: Zhao Cao, Yuan Feng, Tao Li, Lanjun Wang, Lin Hao Xu

PREFETCHING RDF TRIPLE DATA

Publication number: 20170060876

Abstract: Query requests for RDF triples are obtained, wherein the query request(s) contain(s) at least one triple pattern; for each triple pattern, the corresponding elementary pattern is determined, and each triple pattern is converted to a weighted elementary pattern. The occurrence frequency of each elementary pattern is computed based on the weighted elementary patterns; at least one elementary pattern is chosen at least according to the occurrence frequency; and the RDF triples corresponding to the chosen at least elementary pattern are prefetched into the buffer. The corresponding apparatus is also provided. With the above method and apparatus, the frequently accessed RDF triples can be determined and prefetched into the buffer, which improves the query efficiency.

Type: Application

Filed: November 13, 2016

Publication date: March 2, 2017

Inventors: Yue Pan, Xing Zhi Sun, Qing Fa Wang, Shuo Wu, Lin Hao Xu

Prefetching RDF triple data

Patent number: 9495423

Abstract: Query requests for RDF triples are obtained, wherein the query request(s) contain(s) at least one triple pattern; for each triple pattern, the corresponding elementary pattern is determined, and each triple pattern is converted to a weighted elementary pattern. The occurrence frequency of each elementary pattern is computed based on the weighted elementary patterns; at least one elementary pattern is chosen at least according to the occurrence frequency; and the RDF triples corresponding to the chosen at least elementary pattern are prefetched into the buffer. The corresponding apparatus is also provided. With the above method and apparatus, the frequently accessed RDF triples can be determined and prefetched into the buffer, which improves the query efficiency.

Type: Grant

Filed: October 9, 2013

Date of Patent: November 15, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Yue Pan, Xing Zhi Sun, Qing Fa Wang, Shuo Wu, Lin Hao Xu

GENERATING JOB ALERT

Publication number: 20160110224

Abstract: A method and system for generating a job alert. According to embodiments of the present invention, before a target job is processed, a characteristic of input and output of the target job in at least one stage is determined through analyzing a historical job, and a resource overhead associated with the processing of the target job is calculated based on the characteristic of input and output. Then, an alert for the target job is generated in response to the resource overhead exceeding a predetermined threshold. In such manner, an alert for the target job can be proactively generated before the resource overhead problem occurs, so as to enable an administrator or developer to discover a fault in advance and adopt measures actively to avoid loss and damage to the intermediate results or output data when the target job is processed.

Type: Application

Filed: September 24, 2015

Publication date: April 21, 2016

Inventors: Zhao Cao, Peng Li, Jun Ma, Ju Wei Shi, Bing Jiang Sun, Chen Wang, Lin Hao Xu, Chang Hai Yan, Xiao Ning Zhang, Jia Zou

PROCESSING TIME SERIES

Publication number: 20150347568

Abstract: A method for processing a time series includes dividing, with a processing device, the time series into a plurality of windows by time; extracting at least one group of similar subsequences from a current window among the plurality of windows; and updating a candidate list on the basis of comparison between similar subsequences in each group of the at least one group with k characteristic subsequences in the candidate list; wherein the k characteristic subsequences are k characteristic subsequences with a greatest number of occurrences in at least processed parts of the time series.

Type: Application

Filed: May 26, 2015

Publication date: December 3, 2015

Inventors: Xiao Yan Chen, Yao Liang Chen, Sheng Huang, Kai Liu, Wei Lu, Lin Hao Xu, Xiao Min Xu

PROCESSING TIME SERIES

Publication number: 20150347537

Abstract: A method for processing a time series includes dividing, with a processing device, the time series into a plurality of windows by time; extracting at least one group of similar subsequences from a current window among the plurality of windows; and updating a candidate list on the basis of comparison between similar subsequences in each group of the at least one group with k characteristic subsequences in the candidate list; wherein the k characteristic subsequences are k characteristic subsequences with a greatest number of occurrences in at least processed parts of the time series.

Type: Application

Filed: June 24, 2015

Publication date: December 3, 2015

Inventors: Xiao Yan Chen, Yao Liang Chen, Sheng Huang, Kai Liu, Wei Lu, Lin Hao Xu, Xiao Min Xu

Hypothesis derived from relationship graph

Patent number: 9043256

Abstract: A method and apparatus for data processing. The method calculates correlations between a plurality of attributes in a dataset. The attributes are factors involved in transaction processing. The method generates a relationship graph by using the plurality of attributes and the correlations between the plurality of attributes; and extracts a sub-graph from the relationship graph to represent a hypothesis. The hypothesis describes the impacts of the factors on the transaction processing. Also provided is an apparatus for implementing the above data processing method.
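
To make the steps in this abstract concrete, the sketch below computes pairwise Pearson correlations, keeps strongly correlated pairs as graph edges, and extracts the neighborhood of one attribute as a candidate hypothesis. The choice of Pearson correlation, the 0.8 threshold, and the neighborhood-based extraction rule are assumptions; the abstract does not specify them. Attributes are assumed to be numeric and non-constant.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


def relationship_graph(dataset, threshold=0.8):
    """dataset: {attribute: [values]}; edges join attributes whose
    absolute correlation reaches the (assumed) threshold."""
    graph = {name: {} for name in dataset}
    for a, b in combinations(dataset, 2):
        r = pearson(dataset[a], dataset[b])
        if abs(r) >= threshold:
            graph[a][b] = r
            graph[b][a] = r
    return graph


def hypothesis(graph, target):
    """Sub-graph made of the target attribute and its direct neighbors."""
    return {target: dict(graph[target])}


dataset = {
    "amount": [1, 2, 3, 4],
    "fee": [2, 4, 6, 8],
    "delay": [4, 3, 2, 1],
    "weekday": [1, 3, 2, 2],
}
graph = relationship_graph(dataset)
candidate = hypothesis(graph, "amount")
```

In this toy dataset `fee` rises with `amount` and `delay` falls with it, so both end up in the extracted sub-graph, while the weakly correlated `weekday` is left out.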

Type: Grant

Filed: November 29, 2012

Date of Patent: May 26, 2015

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Yue Pan, Wei Jia Shen, Xing Zhi Sun, Xiao Fei Teng, Lin Hao Xu, Yi Qin Yu, Yu Chen Zhou

Obtaining hierarchical information of planar data

Patent number: 8996581

Abstract: The invention provides a method and apparatus for obtaining hierarchical information of planar data. The method comprises mapping at least one data item from a same data set in the planar data to at least one node in a tree structure formed by a structured terminology system. The method also comprises obtaining at least one sub tree structure in the tree structure, each of the at least one sub tree structure taking the at least one node as all of its leaf node. The method also comprises selecting a target tree structure from the at least one sub tree structure and obtaining hierarchical information in the target tree structure. An apparatus corresponding to the above method is also provided. With the above method and apparatus, hierarchical information of data items may be obtained from planar organized data to facilitate subsequent and further analysis and management.
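
One way to picture the method in this abstract is the sketch below, which maps flat data items to nodes of a terminology tree and returns the smallest sub-tree that has those nodes as its leaves. Representing the tree as a child-to-parent dictionary, and choosing the sub-tree rooted at the lowest common ancestor as the target, are assumptions. The sketch also assumes that no mapped node is an ancestor of another.

```python
def path_to_root(node, parent):
    """List the node and all its ancestors, walking upward."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def target_subtree(nodes, parent):
    """Return (root, edges) of the smallest sub-tree whose leaves are
    the given nodes; edges are (parent, child) pairs."""
    paths = [path_to_root(n, parent) for n in nodes]
    common = set(paths[0]).intersection(*paths[1:])
    # Lowest common ancestor: first shared node met while walking upward.
    root = next(n for n in paths[0] if n in common)
    edges = set()
    for path in paths:
        for child in path[:path.index(root)]:
            edges.add((parent[child], child))
    return root, edges


# Hypothetical terminology tree, stored as child -> parent.
parent = {
    "flu": "respiratory",
    "asthma": "respiratory",
    "respiratory": "disease",
    "eczema": "skin",
    "skin": "disease",
}
item_to_node = {"Influenza": "flu", "Asthma attack": "asthma"}
root, edges = target_subtree(list(item_to_node.values()), parent)
```

The two flat items map to `flu` and `asthma`, and the recovered hierarchy places both under `respiratory`.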

Type: Grant

Filed: December 20, 2011

Date of Patent: March 31, 2015

Assignee: International Business Machines Corporation

Inventors: Yue Pan, Xing Zhi Sun, Ying Tao, Lin Hao Xu

Method and system for determining node to be materialized

Patent number: 8768953

Abstract: A dependency graph of rule predicates without strongly connected sub-graph is obtained. The dependency graph indicates the dependency among the rule predicates. An update frequency of node in the dependency graph is calculated, and a query frequency of node in the dependency graph is also calculated. Furthermore, a runtime query cost value and a materialization cost value of the node are calculated based on the query frequency and update frequency. Node to be materialized are determined based on the runtime query cost value and the materialization cost value. A rule predicate corresponding to the node to be materialized is the rule predicate to be materialized. In at least some instances, an exemplary technical effect is that the return time of result of runtime query is saved and the affect by the data update is reduced when a query is performed in relation data reasoning system constructed with rule predicates.
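
The decision step in this abstract can be illustrated with a deliberately simplified sketch. It treats each node independently and ignores how frequencies propagate along the dependency graph. The cost formulas (frequency multiplied by a per-operation cost) and the comparison rule are assumptions made for illustration.

```python
def nodes_to_materialize(stats):
    """stats: {predicate: (query_freq, update_freq, query_cost, refresh_cost)}.

    A node is materialized when answering its queries at runtime is
    estimated to cost more than keeping a materialized copy up to date.
    """
    chosen = []
    for node, (query_freq, update_freq, query_cost, refresh_cost) in stats.items():
        runtime_query_cost = query_freq * query_cost
        materialization_cost = update_freq * refresh_cost
        if runtime_query_cost > materialization_cost:
            chosen.append(node)
    return chosen


stats = {
    "ancestor": (100, 2, 5.0, 20.0),  # queried often, rarely updated
    "sibling": (3, 50, 1.0, 4.0),     # rarely queried, updated often
}
selected = nodes_to_materialize(stats)
```

Under these made-up numbers only `ancestor` is selected, since its runtime query cost (500) exceeds its materialization cost (40).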

Type: Grant

Filed: October 27, 2010

Date of Patent: July 1, 2014

Assignee: International Business Machines Corporation

Inventors: Yue Pan, Xing Zhi Sun, Lin Hao Xu

PREFETCHING RDF TRIPLE DATA

Publication number: 20140040283

Abstract: Query requests for RDF triples are obtained, wherein the query request(s) contain(s) at least one triple pattern; for each triple pattern, the corresponding elementary pattern is determined, and each triple pattern is converted to a weighted elementary pattern. The occurrence frequency of each elementary pattern is computed based on the weighted elementary patterns; at least one elementary pattern is chosen at least according to the occurrence frequency; and the RDF triples corresponding to the chosen at least elementary pattern are prefetched into the buffer. The corresponding apparatus is also provided. With the above method and apparatus, the frequently accessed RDF triples can be determined and prefetched into the buffer, which improves the query efficiency.

Type: Application

Filed: October 9, 2013

Publication date: February 6, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Yue Pan, Xing Zhi Sun, Qing Fa Wang, Shuo Wu, Lin Hao Xu

OBTAINING HIERARCHICAL INFORMATION OF PLANAR DATA

Publication number: 20120173585

Abstract: The invention provides a method and apparatus for obtaining hierarchical information of planar data. The method comprises mapping at least one data item from a same data set in the planar data to at least one node in a tree structure formed by a structured terminology system. The method also comprises obtaining at least one sub tree structure in the tree structure, each of the at least one sub tree structure taking the at least one node as all of its leaf node. The method also comprises selecting a target tree structure from the at least one sub tree structure and obtaining hierarchical information in the target tree structure. An apparatus corresponding to the above method is also provided. With the above method and apparatus, hierarchical information of data items may be obtained from planar organized data to facilitate subsequent and further analysis and management.

Type: Application

Filed: December 20, 2011

Publication date: July 5, 2012

Inventors: Yue Pan, Xing Zhi Sun, Ying Tao, Lin Hao Xu
