Patents by Inventor Sheng Yan Sun

Sheng Yan Sun has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11853697
    Abstract: An approach is provided in which a method, system, and program product build a time series prediction model based on one or more relationships between a first set of keywords in a set of first news articles and a second set of keywords in a set of second news articles. The time series prediction model includes a time-based interest level adjustment corresponding to a publication time between the set of first news articles and the set of second news articles. The method, system, and program product use the time series prediction model to compute an inherited initial interest level of a third news article that includes a set of new keywords based on the set of new keywords and the time-based interest level adjustment. The method, system, and program product assign the inherited initial interest level to the third news article.
    Type: Grant
    Filed: April 23, 2021
    Date of Patent: December 26, 2023
    Assignee: International Business Machines Corporation
    Inventors: Shuo Li, June-Ray Lin, Sheng Yan Sun, Xiaobo Wang
  • Publication number: 20230409602
    Abstract: Embodiments of the present disclosure relate to a method, system, and computer program product for data management. According to the method, one or more processors divide data into a plurality of partitions. The one or more processors store the plurality of partitions in a plurality of nodes of a mixed distributed database system, wherein a first node of the mixed distributed database system comprises a plurality of databases, and wherein at least a part of the plurality of partitions are shared by the plurality of databases of the first node and are not shared by other nodes of the plurality of nodes.
    Type: Application
    Filed: June 21, 2022
    Publication date: December 21, 2023
    Inventors: Hong Mei Zhang, Sheng Yan Sun, Meng Wan, Peng Hui Jiang
  • Publication number: 20230409593
    Abstract: An embodiment for analyzing and tracking data flow to determine proper schemas for unstructured data. The embodiment may automatically use a sidecar to collect schema discovery rules during conversion of raw data to unstructured data. The embodiment may automatically generate multiple schemas for different tenants using the collected schema discovery rules. The embodiment may automatically use ETL to export unstructured data to SQL databases with the generated multiple schemas for the different tenants. The embodiment may automatically monitor usage data of the SQL databases and collect the usage data. The embodiment may automatically optimize schema discovery using the collected usage data. The embodiment may automatically discover schemas with hot usage and apply the discovered schemas with hot usage to other tenants for consumption and further monitoring.
    Type: Application
    Filed: June 21, 2022
    Publication date: December 21, 2023
    Inventors: Peng Hui Jiang, Jun Su, Sheng Yan Sun, Hong Mei Zhang, Meng Wan
  • Publication number: 20230409575
    Abstract: Embodiments of the present disclosure describe an approach for database query processing with database clients. According to the approach, a first set of queries are obtained from a plurality of clients in communication with a database server. A second set of queries are generated by normalizing the first set of queries. A set of access paths corresponding to the second set of queries are determined for retrieving data from at least one of the plurality of clients and the database server. Data is retrieved from at least one of the plurality of clients and the database server based on the set of access paths.
    Type: Application
    Filed: June 16, 2022
    Publication date: December 21, 2023
    Inventors: Shuo Li, Xiaobo Wang, Sheng Yan Sun, Ping Wang
  • Patent number: 11847063
    Abstract: Systems and methods for high availability distributed data storage are provided. In embodiments, a method includes: receiving, by a remote direct memory access (RDMA) switch operatively coupled to a computing device, a request to access a page of a database; determining, by the RDMA switch, a validation state of the page; determining, by the RDMA switch, a status of the page; updating, by the RDMA switch, the status of the page based on the validation state and the request; and reporting, by the RDMA switch, the validation state.
    Type: Grant
    Filed: January 11, 2022
    Date of Patent: December 19, 2023
    Assignee: International Business Machines Corporation
    Inventors: Shuo Li, Xiaobo Wang, Sheng Yan Sun, Hong Mei Zhang
  • Patent number: 11847120
    Abstract: A method, computer program product, and computer system for improving performance of a SQL execution sequence of SQL statements. The SQL execution sequence is recorded in an event log. Original results of executing the SQL statements and an original CPU cost of executing the SQL statements in accordance with the original access path are recorded in a logical log. A new access path is generated from analysis of the event log and the logical log. The SQL statements are executed in accordance with the new access path resulting in new results of executing the SQL statements including a new CPU cost of executing the SQL statements in accordance with the new access path. In response to a determination that the new results replicate the original results and that the new CPU cost is less than the original CPU cost, the original access path is replaced with the new access path.
    Type: Grant
    Filed: December 3, 2021
    Date of Patent: December 19, 2023
    Assignee: International Business Machines Corporation
    Inventors: Shuo Li, Xiaobo Wang, Min Li, Sheng Yan Sun
  • Publication number: 20230401204
    Abstract: This disclosure provides a computer-implemented method, a computer system and a computer program product for database compression oriented to combinations of fields of a database record. One or more combinations of fields of a record of a database are determined that satisfy a frequency criterion indicating that access frequencies of the one or more combinations of fields are higher than an access frequency threshold. The record is reorganized based on the one or more combinations of fields to store fields of each combination of the one or more combinations of fields in a respective contiguous storage space. The reorganized record is compressed by applying a compression scheme to the one or more combinations of fields.
    Type: Application
    Filed: June 10, 2022
    Publication date: December 14, 2023
    Inventors: Ying Zhang, Xiaobo Wang, Shuo Li, Sheng Yan Sun
  • Patent number: 11841857
    Abstract: A computer-implemented method to transform and execute queries by merging sparsely populated columns. The method includes receiving, from a host, a first query configured to perform a command on one or more target columns in a database. The method further includes analyzing a set of statistics for the database. The method also includes determining, based on the analyzing, a first column of the one or more target columns is included in a set of sparse columns. The method includes generating a plurality of access plans for the first query, including a first access plan that merges the first column with a second column. The method further includes transforming, based on the first access plan, the first query to merge the first column with the second column. The method also includes executing, in response to the transforming of the first query, the first query.
    Type: Grant
    Filed: February 22, 2022
    Date of Patent: December 12, 2023
    Assignee: International Business Machines Corporation
    Inventors: Shuo Li, Xiaobo Wang, Leilei Li, Sheng Yan Sun
  • Patent number: 11822528
    Abstract: In an approach for database self-diagnosis and self-healing, a processor receives a problem description related to a database. A processor classifies the problem description into a natural language description portion and a database-know-who content portion. A processor processes the natural language description portion using natural language processing techniques. A processor evaluates the database-know-who content portion. A processor combines a result of processing the natural language description portion and evaluating the database-know-who content portion. A processor identifies a solution based on the problem description and the combined result. A processor solves a problem using the identified solution.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: November 21, 2023
    Assignee: International Business Machines Corporation
    Inventors: Sheng Yan Sun, Min Li, Shuo Li, Xiaobo Wang, Jian Xu
  • Publication number: 20230370086
    Abstract: A system collects statistical data for a data page, divides the data page into parts, analyzes the data page and the statistical data, based on compression efficiency of one or more compression methods for each part of each page, to determine a compression method for each part of page, and compresses, based on the analyzing, the parts of the data page.
    Type: Application
    Filed: July 26, 2023
    Publication date: November 16, 2023
    Inventors: Shuo Li, Xiaobo Wang, Leilei Li, Sheng Yan Sun
  • Patent number: 11797522
    Abstract: Database log writing is based on log pipeline contention. A determination is made as to whether contention in writing data to a log pipeline, which is used in writing data from memory to storage, is at a prespecified level. Based on determining that the contention in writing the data to the log pipeline is at the prespecified level, a split operation is automatically performed to create a new log pipeline.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: October 24, 2023
    Assignee: International Business Machines Corporation
    Inventors: Shuo Li, Xiaobo Wang, Jia Tian Zhong, Sheng Yan Sun
  • Publication number: 20230325256
    Abstract: A processor receives an optimization target for a first cluster in a distributed computing environment. A processor trains a neural network based on the optimization target. A processor generates a decision tree based on the neural network. A processor selects a workload executing in the first cluster of the distributed computing environment. A processor identifies a second cluster to relocate the workload. A processor determines a current optimization of the distributed computing environment based on the workload being retained in the first cluster. A processor determines a migration optimization of the distributed computing environment based on the workload being migrated to the second cluster. A processor, in response to the migration optimization improving the current optimization of the distributed computing environment, migrates the workload to the second cluster.
    Type: Application
    Filed: March 23, 2022
    Publication date: October 12, 2023
    Inventors: Sheng Yan Sun, Meng Wan, Peng Hui Jiang, Hong Mei Zhang
  • Publication number: 20230325471
    Abstract: A supervised similarity measure machine learning method, system, and computer program product that includes generating embeddings by training a supervised deep neural network (DNN) on a feature data to determine which nodes correspond to which clustered learning group of clustered learning groups, performing half-distributed learning by distributing data in a time-series database to the clustered learning groups, and evaluating, based on the embeddings, new tenant data in the clustered learning groups with an upward bow pose.
    Type: Application
    Filed: April 7, 2022
    Publication date: October 12, 2023
    Inventors: Meng Wan, Sheng Yan Sun, Peng Hui Jiang, Hong Mei Zhang
  • Patent number: 11782918
    Abstract: A computer-implemented method selects an access path for high cost and/or complex queries. The method includes building a classification model configured to identify a lowest cost access path. The method further includes receiving a query, where the query is configured to retrieve a set of data from a database. The method also includes generating an access map for the query, where the access map includes one or more potential access paths to execute the query. The method includes collecting, for the query, a set of data for each potential access path. The method further includes classifying, by the classification model, the query. The method also includes selecting a first access path of the one or more potential access paths and executing the query.
    Type: Grant
    Filed: December 11, 2020
    Date of Patent: October 10, 2023
    Assignee: International Business Machines Corporation
    Inventors: Sheng Yan Sun, Shuo Li, Xiaobo Wang, Hong Mei Zhang
  • Publication number: 20230315725
    Abstract: An embodiment includes generating a partition schema for a distributed database based on historical usage data indicative of usage of the distributed database, where the generating of the partition schema comprises determining a partition range of a partition of the partition schema. The embodiment also includes generating a node identifier for the partition using a hash function and a first weight value assigned to the partition. The embodiment also includes monitoring performance data indicative of a performance of the distributed database, the monitoring comprising detecting a failure of the performance to satisfy a performance threshold. The embodiment also includes initiating, responsive to detecting the failure, a redistribution procedure by changing the node identifier of the partition by replacing the first weight value with a second weight value.
    Type: Application
    Filed: March 31, 2022
    Publication date: October 5, 2023
    Applicant: International Business Machines Corporation
    Inventors: Hong Mei Zhang, Sheng Yan Sun, Meng Wan, Peng Hui Jiang
  • Publication number: 20230315710
    Abstract: A computer-implemented method includes: collecting, by a computing device, database activities and database structure information of a database; identifying, by the computing device, related columns in the database; determining, by the computing device, one or more data types for column transference of the identified related columns; generating, by the computing device, a super union column based on the column transference and the identified related columns; and updating, by the computing device, the database with the super union column.
    Type: Application
    Filed: March 30, 2022
    Publication date: October 5, 2023
    Inventors: Sheng Yan Sun, Hong Mei Zhang, Peng Hui Jiang, Meng Wan
  • Patent number: 11777519
    Abstract: A system collects statistical data for a data page, divides the data page into parts, analyzes the data page and the statistical data, based on compression efficiency of one or more compression methods for each part of each page, to determine a compression method for each part of page, and compresses, based on the analyzing, the parts of the data page.
    Type: Grant
    Filed: February 10, 2022
    Date of Patent: October 3, 2023
    Assignee: International Business Machines Corporation
    Inventors: Shuo Li, Xiaobo Wang, Leilei Li, Sheng Yan Sun
  • Publication number: 20230306012
    Abstract: Disclosed are techniques for relational database locks based on columns. Database transactions may be targeted to specific columns of one or more records, instead of the entire row for those records, using primary keys. Column locks on specific keys are stored separately from column locks on ranges of keys, and both are checked when requesting a new column lock for either a single key or a range of keys. When a threshold number of columns for a given record, or range of records/keys, have been locked, the column locks for that record, or range of records, can be combined into a single row level lock to reduce resource costs for maintaining multiple concurrent locks.
    Type: Application
    Filed: March 23, 2022
    Publication date: September 28, 2023
    Inventors: Shuo Li, Xiaobo Wang, Hong Mei Zhang, Sheng Yan Sun
  • Patent number: 11762578
    Abstract: A computer-implemented method that includes managing a buffer pool of pages into a ring sub-chain comprising pages linked in a ring, and a linear sub-chain comprising pages linked in a line from a header, and moving a page between the linear sub-chain and the ring sub-chain based on a moving schema evaluating a chain management characteristic.
    Type: Grant
    Filed: September 29, 2020
    Date of Patent: September 19, 2023
    Assignee: International Business Machines Corporation
    Inventors: Shuo Li, Xiaobo Wang, Sheng Yan Sun, Hong Mei Zhang
  • Patent number: 11762859
    Abstract: Embodiments of the present disclosure relate to an approach for database query. According to the approach, a query for a group of data records is received. At least one index is created on at least one field of the data records and comprises index entries for storing and sorting respective values of the at least one field of the data records. It is determined if the query satisfies a predetermined condition. In response to the query satisfying the predetermined condition, a result of the query is determined by skipping at least a part of operations required by the query based on the at least one index.
    Type: Grant
    Filed: September 28, 2020
    Date of Patent: September 19, 2023
    Assignee: International Business Machines Corporation
    Inventors: Sheng Yan Sun, Shuo Li, Xiaobo Wang, Peng Hui Jiang