Patents by Inventor Lin Hao Xu
Lin Hao Xu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10831767
Abstract: Query requests for RDF triples are obtained, wherein the query request(s) contain(s) at least one triple pattern; for each triple pattern, the corresponding elementary pattern is determined, and each triple pattern is converted to a weighted elementary pattern. The occurrence frequency of each elementary pattern is computed based on the weighted elementary patterns; at least one elementary pattern is chosen at least according to the occurrence frequency; and the RDF triples corresponding to the chosen at least one elementary pattern are prefetched into the buffer. The corresponding apparatus is also provided. With the above method and apparatus, the frequently accessed RDF triples can be determined and prefetched into the buffer, which improves the query efficiency.
Type: Grant
Filed: November 13, 2016
Date of Patent: November 10, 2020
Assignee: International Business Machines Corporation
Inventors: Yue Pan, Xing Zhi Sun, Qing Fa Wang, Shuo Wu, Lin Hao Xu
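The frequency-based prefetching idea in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that an elementary pattern is a triple pattern with every variable (a term starting with "?") replaced by a wildcard, and that each occurrence carries a weight of 1.0; the function names and the sample data are hypothetical.

```python
from collections import Counter


def elementary_pattern(triple_pattern):
    """Replace every variable (a term starting with '?') by None (assumed rule)."""
    return tuple(None if term.startswith("?") else term for term in triple_pattern)


def choose_patterns(queries, top_n=1):
    """Sum a weight per elementary pattern and return the top_n most frequent."""
    frequency = Counter()
    for query in queries:
        for triple_pattern in query:
            # Assumed weighting: every occurrence counts 1.0.
            frequency[elementary_pattern(triple_pattern)] += 1.0
    return [pattern for pattern, _ in frequency.most_common(top_n)]


def prefetch(store, patterns):
    """Copy into a buffer every stored triple that matches a chosen pattern."""
    def matches(triple, pattern):
        return all(p is None or p == t for t, p in zip(triple, pattern))

    return [t for t in store if any(matches(t, p) for p in patterns)]


# Hypothetical triple store and query log.
store = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "age", "30"),
]
queries = [
    [("?x", "knows", "?y")],
    [("?a", "knows", "?b"), ("?a", "age", "?n")],
]
chosen = choose_patterns(queries)
buffer = prefetch(store, chosen)
```

In this sample the "knows" pattern occurs twice and the "age" pattern once, so only the two "knows" triples are placed in the buffer.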
-
Patent number: 10705935
Abstract: A method and system for generating a job alert. According to embodiments of the present invention, before a target job is processed, a characteristic of input and output of the target job in at least one stage is determined through analyzing a historical job, and a resource overhead associated with the processing of the target job is calculated based on the characteristic of input and output. Then, an alert for the target job is generated in response to the resource overhead exceeding a predetermined threshold. In such manner, an alert for the target job can be proactively generated before the resource overhead problem occurs, so as to enable an administrator or developer to discover a fault in advance and adopt measures actively to avoid loss and damage to the intermediate results or output data when the target job is processed.
Type: Grant
Filed: September 24, 2015
Date of Patent: July 7, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Zhao Cao, Peng Li, Jun Ma, Ju Wei Shi, Bing Jiang Sun, Chen Wang, Lin Hao Xu, Chang Hai Yan, Xiao Ning Zhang, Jia Zou
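As a rough illustration of the proactive alert described above, the following Python sketch estimates a job's resource overhead before it runs. It rests on several assumptions that are not in the abstract: the input/output characteristic is modeled as an output-to-input size ratio per stage, the overhead is the total number of bytes written across stages, and all names and figures are hypothetical.

```python
def stage_ratios(historical_stages):
    """Output/input size ratio for each stage of a historical job."""
    return [out_bytes / in_bytes for in_bytes, out_bytes in historical_stages]


def estimate_overhead(input_bytes, ratios):
    """Propagate the input size through the stages and total the bytes written."""
    total = 0.0
    size = float(input_bytes)
    for ratio in ratios:
        size *= ratio  # estimated output of this stage
        total += size
    return total


def job_alert(input_bytes, historical_stages, threshold_bytes):
    """Return an alert message if the estimate exceeds the threshold, else None."""
    overhead = estimate_overhead(input_bytes, stage_ratios(historical_stages))
    if overhead > threshold_bytes:
        return f"ALERT: estimated overhead {overhead:.0f} exceeds {threshold_bytes}"
    return None


# Hypothetical historical job: (input bytes, output bytes) per stage.
history = [(100, 400), (400, 200)]
alert = job_alert(1000, history, threshold_bytes=5000)
```

The estimate is computed from the historical ratios alone, so the alert can be raised before the target job is submitted.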
-
Patent number: 10664471
Abstract: An information processing system, a computer readable storage medium, and a method of managing a query to find a set of JSON documents in a multi-schema JSON document store. A query engine receives a first query to find at least one JSON document in a plurality of sets of JSON documents stored in the JSON document store, each set of JSON documents being organized in a unique JSON schema version related to a unique JSON schema version of each other set of JSON documents by at least one schema change. The first query is organized in a first unique JSON schema version. A query translator translates the first query into a plurality of queries based on the plurality of schema changes. The first and the plurality of queries are executed to provide a collective set of query results.
Type: Grant
Filed: December 29, 2017
Date of Patent: May 26, 2020
Assignee: International Business Machines Corporation
Inventors: Zhao Cao, Yuan Feng, Tao Li, Lanjun Wang, Lin Hao Xu
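The query translation described above can be sketched in Python under simplifying assumptions: a schema change is modeled only as a set of field renames, a query is a flat equality filter, and the store keeps one list of documents per schema version. None of these details come from the abstract, and every name below is hypothetical.

```python
def translate(query, renames):
    """Rewrite the field names of an equality query for one schema change."""
    return {renames.get(field, field): value for field, value in query.items()}


def find(documents, query):
    """Return the documents whose fields equal every value in the query."""
    return [d for d in documents if all(d.get(f) == v for f, v in query.items())]


def query_all_versions(store, schema_changes, first_query):
    """Run the first query, then one translated query per older schema version.

    store[0] holds documents in the schema of first_query; schema_changes[i]
    maps field names of version i to version i + 1.
    """
    results = find(store[0], first_query)
    query = first_query
    for documents, renames in zip(store[1:], schema_changes):
        query = translate(query, renames)
        results.extend(find(documents, query))
    return results


# Hypothetical store with three schema versions.
store = [
    [{"name": "Ann", "city": "Oslo"}],
    [{"fullname": "Ann", "city": "Rome"}],
    [{"fullname": "Ann", "town": "Lima"}, {"fullname": "Bob", "town": "Lima"}],
]
schema_changes = [{"name": "fullname"}, {"city": "town"}]
results = query_all_versions(store, schema_changes, {"name": "Ann"})
```

Each translated query is derived from the previous one, so renames accumulate across versions and the caller receives a single combined result list.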
-
Patent number: 10614087
Abstract: Data analytics is performed on a distributed document storage database by receiving a request for initiating a data analytics job; collecting statistics from the database in response to the request; using the statistics to estimate a first cost for merging an incremental data update for the job into a first resilient distributed dataset; using the statistics to estimate a second cost for newly creating a second resilient distributed dataset for the job; when the first cost is less than the second cost, reading data updates from the database and merging the data updates into the first resilient distributed dataset; and when the first cost is not less than the second cost, newly creating the second resilient distributed dataset by reading all documents from the database.
Type: Grant
Filed: January 17, 2017
Date of Patent: April 7, 2020
Assignee: International Business Machines Corporation
Inventors: Zhao Cao, Jing Wang, Lanjun Wang, Li Wen, Yan Wu, Lin Hao Xu
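The merge-or-recreate decision in the abstract above can be sketched as follows. This is plain Python with dictionaries standing in for the database and the resilient distributed dataset; it does not use Spark. The linear cost model and its per-item constants are assumptions made for illustration only.

```python
def choose_plan(stats, merge_cost_per_update=5.0, read_cost_per_doc=1.0):
    """Compare the two estimated costs (hypothetical linear cost model)."""
    merge_cost = stats["updated_docs"] * merge_cost_per_update
    recreate_cost = stats["total_docs"] * read_cost_per_doc
    # Merge only when it is strictly cheaper, as in the abstract.
    return "merge" if merge_cost < recreate_cost else "recreate"


def refresh(cached, database, updates):
    """Return an up-to-date dataset using whichever plan is estimated cheaper."""
    stats = {"total_docs": len(database), "updated_docs": len(updates)}
    if choose_plan(stats) == "merge":
        merged = dict(cached)
        merged.update(updates)  # apply only the incremental changes
        return merged
    return dict(database)  # rebuild by reading every document


# Hypothetical data: ten documents, one of which changed since the last job.
cached = {i: f"doc{i}" for i in range(10)}
updates = {3: "doc3-v2"}
database = {**cached, **updates}
dataset = refresh(cached, database, updates)
```

With these constants, one update out of ten documents favors merging, while a heavily updated database would be reread in full.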
-
Patent number: 10423635
Abstract: A method for processing a time series includes dividing, with a processing device, the time series into a plurality of windows by time; extracting at least one group of similar subsequences from a current window among the plurality of windows; and updating a candidate list on the basis of comparison between similar subsequences in each group of the at least one group with k characteristic subsequences in the candidate list; wherein the k characteristic subsequences are k characteristic subsequences with a greatest number of occurrences in at least processed parts of the time series.
Type: Grant
Filed: May 26, 2015
Date of Patent: September 24, 2019
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Xiao Yan Chen, Yao Liang Chen, Sheng Huang, Kai Liu, Wei Lu, Lin Hao Xu, Xiao Min Xu
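A simplified Python sketch of the windowed top-k idea above follows. It assumes fixed-length subsequences, treats two subsequences as similar when they are equal after rounding each value to the nearest integer, and prunes the candidate list to k entries after every window, which makes the counts approximate. These choices and all names are illustrative assumptions, not the method claimed in the patent.

```python
from collections import Counter


def windows(series, size):
    """Split the series into consecutive windows of at most `size` points."""
    for start in range(0, len(series), size):
        yield series[start:start + size]


def similar_groups(window, length):
    """Count subsequences of one window that are equal after rounding."""
    groups = Counter()
    for i in range(len(window) - length + 1):
        groups[tuple(round(v) for v in window[i:i + length])] += 1
    return groups


def top_k_subsequences(series, window_size, length, k):
    """Maintain a candidate list of the k most frequent subsequences so far."""
    candidates = Counter()
    for window in windows(series, window_size):
        candidates.update(similar_groups(window, length))
        # Prune to the k most frequent candidates after each window.
        candidates = Counter(dict(candidates.most_common(k)))
    return candidates.most_common(k)


# Hypothetical series in which the shape (1, 2) recurs.
series = [1.0, 2.1, 1.1, 1.9, 1.0, 2.0, 5.0, 6.0]
top = top_k_subsequences(series, window_size=4, length=2, k=2)
```

Because only k candidates survive each window, memory stays bounded regardless of the length of the series.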
-
Patent number: 10366095
Abstract: A method for processing a time series includes dividing, with a processing device, the time series into a plurality of windows by time; extracting at least one group of similar subsequences from a current window among the plurality of windows; and updating a candidate list on the basis of comparison between similar subsequences in each group of the at least one group with k characteristic subsequences in the candidate list; wherein the k characteristic subsequences are k characteristic subsequences with a greatest number of occurrences in at least processed parts of the time series.
Type: Grant
Filed: June 24, 2015
Date of Patent: July 30, 2019
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Xiao Yan Chen, Yao Liang Chen, Sheng Huang, Kai Liu, Wei Lu, Lin Hao Xu, Xiao Min Xu
-
Publication number: 20180203912
Abstract: Data analytics is performed on a distributed document storage database by receiving a request for initiating a data analytics job; collecting statistics from the database in response to the request; using the statistics to estimate a first cost for merging an incremental data update for the job into a first resilient distributed dataset; using the statistics to estimate a second cost for newly creating a second resilient distributed dataset for the job; when the first cost is less than the second cost, reading data updates from the database and merging the data updates into the first resilient distributed dataset; and when the first cost is not less than the second cost, newly creating the second resilient distributed dataset by reading all documents from the database.
Type: Application
Filed: January 17, 2017
Publication date: July 19, 2018
Inventors: Zhao Cao, Jing Wang, Lanjun Wang, Li Wen, Yan Wu, Lin Hao Xu
-
Publication number: 20180121498
Abstract: An information processing system, a computer readable storage medium, and a method of managing a query to find a set of JSON documents in a multi-schema JSON document store. A query engine receives a first query to find at least one JSON document in a plurality of sets of JSON documents stored in the JSON document store, each set of JSON documents being organized in a unique JSON schema version related to a unique JSON schema version of each other set of JSON documents by at least one schema change. The first query is organized in a first unique JSON schema version. A query translator translates the first query into a plurality of queries based on the plurality of schema changes. The first and the plurality of queries are executed to provide a collective set of query results.
Type: Application
Filed: December 29, 2017
Publication date: May 3, 2018
Applicant: International Business Machines Corporation
Inventors: Zhao CAO, Yuan FENG, Tao LI, Lanjun WANG, Lin Hao XU
-
Patent number: 9881054
Abstract: An information processing system, a computer readable storage medium, and a method of managing a query to find a set of JSON documents in a multi-schema JSON document store. A query engine receives a first query to find at least one JSON document in a plurality of sets of JSON documents stored in the JSON document store, each set of JSON documents being organized in a unique JSON schema version related to a unique JSON schema version of each other set of JSON documents by at least one schema change. The first query is organized in a first unique JSON schema version. A query translator translates the first query into a second query based on the at least one schema change. The first and second queries are executed to provide first and second query results which are collectively returned.
Type: Grant
Filed: September 30, 2015
Date of Patent: January 30, 2018
Assignee: International Business Machines Corporation
Inventors: Zhao Cao, Yuan Feng, Tao Li, Lanjun Wang, Lin Hao Xu
-
Publication number: 20170091265
Abstract: An information processing system, a computer readable storage medium, and a method of managing a query to find a set of JSON documents in a multi-schema JSON document store. A query engine receives a first query to find at least one JSON document in a plurality of sets of JSON documents stored in the JSON document store, each set of JSON documents being organized in a unique JSON schema version related to a unique JSON schema version of each other set of JSON documents by at least one schema change. The first query is organized in a first unique JSON schema version. A query translator translates the first query into a second query based on the at least one schema change. The first and second queries are executed to provide first and second query results which are collectively returned.
Type: Application
Filed: September 30, 2015
Publication date: March 30, 2017
Inventors: Zhao CAO, Yuan FENG, Tao LI, Lanjun WANG, Lin Hao XU
-
Publication number: 20170060876
Abstract: Query requests for RDF triples are obtained, wherein the query request(s) contain(s) at least one triple pattern; for each triple pattern, the corresponding elementary pattern is determined, and each triple pattern is converted to a weighted elementary pattern. The occurrence frequency of each elementary pattern is computed based on the weighted elementary patterns; at least one elementary pattern is chosen at least according to the occurrence frequency; and the RDF triples corresponding to the chosen at least one elementary pattern are prefetched into the buffer. The corresponding apparatus is also provided. With the above method and apparatus, the frequently accessed RDF triples can be determined and prefetched into the buffer, which improves the query efficiency.
Type: Application
Filed: November 13, 2016
Publication date: March 2, 2017
Inventors: Yue Pan, Xing Zhi Sun, Qing Fa Wang, Shuo Wu, Lin Hao Xu
-
Patent number: 9495423
Abstract: Query requests for RDF triples are obtained, wherein the query request(s) contain(s) at least one triple pattern; for each triple pattern, the corresponding elementary pattern is determined, and each triple pattern is converted to a weighted elementary pattern. The occurrence frequency of each elementary pattern is computed based on the weighted elementary patterns; at least one elementary pattern is chosen at least according to the occurrence frequency; and the RDF triples corresponding to the chosen at least one elementary pattern are prefetched into the buffer. The corresponding apparatus is also provided. With the above method and apparatus, the frequently accessed RDF triples can be determined and prefetched into the buffer, which improves the query efficiency.
Type: Grant
Filed: October 9, 2013
Date of Patent: November 15, 2016
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Yue Pan, Xing Zhi Sun, Qing Fa Wang, Shuo Wu, Lin Hao Xu
-
Publication number: 20160110224
Abstract: A method and system for generating a job alert. According to embodiments of the present invention, before a target job is processed, a characteristic of input and output of the target job in at least one stage is determined through analyzing a historical job, and a resource overhead associated with the processing of the target job is calculated based on the characteristic of input and output. Then, an alert for the target job is generated in response to the resource overhead exceeding a predetermined threshold. In such manner, an alert for the target job can be proactively generated before the resource overhead problem occurs, so as to enable an administrator or developer to discover a fault in advance and adopt measures actively to avoid loss and damage to the intermediate results or output data when the target job is processed.
Type: Application
Filed: September 24, 2015
Publication date: April 21, 2016
Inventors: Zhao Cao, Peng Li, Jun Ma, Ju Wei Shi, Bing Jiang Sun, Chen Wang, Lin Hao Xu, Chang Hai Yan, Xiao Ning Zhang, Jia Zou
-
Publication number: 20150347568
Abstract: A method for processing a time series includes dividing, with a processing device, the time series into a plurality of windows by time; extracting at least one group of similar subsequences from a current window among the plurality of windows; and updating a candidate list on the basis of comparison between similar subsequences in each group of the at least one group with k characteristic subsequences in the candidate list; wherein the k characteristic subsequences are k characteristic subsequences with a greatest number of occurrences in at least processed parts of the time series.
Type: Application
Filed: May 26, 2015
Publication date: December 3, 2015
Inventors: Xiao Yan CHEN, Yao Liang CHEN, Sheng HUANG, Kai LIU, Wei LU, Lin Hao XU, Xiao Min XU
-
Publication number: 20150347537
Abstract: A method for processing a time series includes dividing, with a processing device, the time series into a plurality of windows by time; extracting at least one group of similar subsequences from a current window among the plurality of windows; and updating a candidate list on the basis of comparison between similar subsequences in each group of the at least one group with k characteristic subsequences in the candidate list; wherein the k characteristic subsequences are k characteristic subsequences with a greatest number of occurrences in at least processed parts of the time series.
Type: Application
Filed: June 24, 2015
Publication date: December 3, 2015
Inventors: XIAO YAN CHEN, YAO LIANG CHEN, SHENG HUANG, KAI LIU, WEI LU, LIN HAO XU, XIAO MIN XU
-
Patent number: 9043256
Abstract: A method and apparatus for data processing. The method calculates correlations between a plurality of attributes in a dataset. The attributes are factors involved in transaction processing. The method generates a relationship graph by using the plurality of attributes and the correlations between the plurality of attributes; and extracts a sub-graph from the relationship graph to represent a hypothesis. The hypothesis describes the impacts of the factors on the transaction processing. Also provided is an apparatus for implementing the above data processing method.
Type: Grant
Filed: November 29, 2012
Date of Patent: May 26, 2015
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Yue Pan, Wei Jia Shen, Xing Zhi Sun, Xiao Fei Teng, Lin Hao Xu, Yi Qin Yu, Yu Chen Zhou
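One possible reading of the steps above is sketched below in Python. The abstract does not say which correlation measure or extraction rule is used, so the sketch assumes Pearson correlation, keeps an edge when the absolute correlation reaches a threshold, and takes the edges touching one target attribute as the hypothesis sub-graph. The attribute names and values are invented for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def relationship_graph(dataset, threshold):
    """Edges between attribute pairs whose |correlation| reaches the threshold."""
    graph = {}
    for a, b in combinations(dataset, 2):
        r = pearson(dataset[a], dataset[b])
        if abs(r) >= threshold:
            graph[(a, b)] = r
    return graph


def hypothesis_subgraph(graph, target):
    """Assumed extraction rule: keep the edges that touch the target attribute."""
    return {edge: r for edge, r in graph.items() if target in edge}


# Hypothetical transaction-processing attributes.
dataset = {
    "queue_length": [1, 2, 3, 4, 5],
    "staff": [5, 4, 3, 2, 1],
    "weekday": [1, 3, 2, 3, 1],
    "processing_time": [2, 4, 6, 8, 10],
}
graph = relationship_graph(dataset, threshold=0.8)
hypothesis = hypothesis_subgraph(graph, "processing_time")
```

Here the hypothesis links processing time to queue length and staffing, while the uncorrelated weekday attribute is left out of the graph.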
-
Patent number: 8996581
Abstract: The invention provides a method and apparatus for obtaining hierarchical information of planar data. The method comprises mapping at least one data item from a same data set in the planar data to at least one node in a tree structure formed by a structured terminology system. The method also comprises obtaining at least one sub tree structure in the tree structure, each of the at least one sub tree structure taking the at least one node as all of its leaf nodes. The method also comprises selecting a target tree structure from the at least one sub tree structure and obtaining hierarchical information in the target tree structure. An apparatus corresponding to the above method is also provided. With the above method and apparatus, hierarchical information of data items may be obtained from planar organized data to facilitate subsequent and further analysis and management.
Type: Grant
Filed: December 20, 2011
Date of Patent: March 31, 2015
Assignee: International Business Machines Corporation
Inventors: Yue Pan, Xing Zhi Sun, Ying Tao, Lin Hao Xu
-
Patent number: 8768953
Abstract: A dependency graph of rule predicates without strongly connected sub-graph is obtained. The dependency graph indicates the dependency among the rule predicates. An update frequency of each node in the dependency graph is calculated, and a query frequency of each node in the dependency graph is also calculated. Furthermore, a runtime query cost value and a materialization cost value of the node are calculated based on the query frequency and update frequency. Nodes to be materialized are determined based on the runtime query cost value and the materialization cost value. A rule predicate corresponding to a node to be materialized is the rule predicate to be materialized. In at least some instances, an exemplary technical effect is that the time to return runtime query results is reduced and the effect of data updates is reduced when a query is performed in a relation data reasoning system constructed with rule predicates.
Type: Grant
Filed: October 27, 2010
Date of Patent: July 1, 2014
Assignee: International Business Machines Corporation
Inventors: Yue Pan, Xing Zhi Sun, Lin Hao Xu
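The cost comparison described above might be sketched in Python as follows. The abstract does not give the cost formulas, so this sketch assumes that a node's update frequency is its own update count plus that of everything it depends on, that the runtime query cost is query frequency times an evaluation cost, and that the materialization cost is update frequency times a maintenance cost. The predicate names and numbers are hypothetical.

```python
from functools import lru_cache


def plan_materialization(depends_on, base_updates, queries, eval_cost, maintain_cost):
    """Return the predicates whose materialization cost is below their query cost.

    depends_on maps each predicate to the predicates it is derived from and
    must be acyclic (no strongly connected sub-graph).
    """
    @lru_cache(maxsize=None)
    def update_frequency(node):
        # Assumed rule: a node changes whenever it or any dependency changes.
        own = base_updates.get(node, 0)
        return own + sum(update_frequency(d) for d in depends_on.get(node, ()))

    chosen = []
    for node in depends_on:
        runtime_cost = queries.get(node, 0) * eval_cost[node]
        materialization_cost = update_frequency(node) * maintain_cost[node]
        if materialization_cost < runtime_cost:
            chosen.append(node)
    return chosen


# Hypothetical rule predicates: manager is derived from employee, and so on.
depends_on = {"employee": [], "manager": ["employee"], "executive": ["manager"]}
base_updates = {"employee": 10}
queries = {"employee": 5, "manager": 100, "executive": 1}
eval_cost = {"employee": 1, "manager": 4, "executive": 8}
maintain_cost = {"employee": 1, "manager": 2, "executive": 3}
to_materialize = plan_materialization(
    depends_on, base_updates, queries, eval_cost, maintain_cost
)
```

In this example only the frequently queried "manager" predicate is worth materializing; the rarely queried "executive" predicate would cost more to keep current than to evaluate on demand.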
-
Publication number: 20140040283
Abstract: Query requests for RDF triples are obtained, wherein the query request(s) contain(s) at least one triple pattern; for each triple pattern, the corresponding elementary pattern is determined, and each triple pattern is converted to a weighted elementary pattern. The occurrence frequency of each elementary pattern is computed based on the weighted elementary patterns; at least one elementary pattern is chosen at least according to the occurrence frequency; and the RDF triples corresponding to the chosen at least one elementary pattern are prefetched into the buffer. The corresponding apparatus is also provided. With the above method and apparatus, the frequently accessed RDF triples can be determined and prefetched into the buffer, which improves the query efficiency.
Type: Application
Filed: October 9, 2013
Publication date: February 6, 2014
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Yue Pan, Xing Zhi Sun, Qing Fa Wang, Shuo Wu, Lin Hao Xu
-
Publication number: 20120173585
Abstract: The invention provides a method and apparatus for obtaining hierarchical information of planar data. The method comprises mapping at least one data item from a same data set in the planar data to at least one node in a tree structure formed by a structured terminology system. The method also comprises obtaining at least one sub tree structure in the tree structure, each of the at least one sub tree structure taking the at least one node as all of its leaf nodes. The method also comprises selecting a target tree structure from the at least one sub tree structure and obtaining hierarchical information in the target tree structure. An apparatus corresponding to the above method is also provided. With the above method and apparatus, hierarchical information of data items may be obtained from planar organized data to facilitate subsequent and further analysis and management.
Type: Application
Filed: December 20, 2011
Publication date: July 5, 2012
Inventors: Yue Pan, Xing Zhi Sun, Ying Tao, Lin Hao Xu