Patents by Inventor Ting Yu Cliff Leung

Ting Yu Cliff Leung has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11966381
    Abstract: Embodiments maintain a data pool that includes heterogeneous data sets, and receive a first data batch of a data set from a data source into the data pool. Embodiments determine a current state of the data set based on a data set state diagram including a plurality of data set states, and identify a condition of the first data batch. Embodiments further set a data batch state for the first data batch, based on a data batch state diagram, and update the data batch state of a prior data batch received before the first data batch, based on the condition of the first data batch. Embodiments additionally transition the data set state diagram, based on the condition of the first data batch, to an updated data set state. Embodiments maintain a data state repository storing the data set state for each of the plurality of heterogeneous data sets.
    Type: Grant
    Filed: November 9, 2021
    Date of Patent: April 23, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Liangzhao Zeng, Ting Yu Cliff Leung, Yat On Lau, Jimmy Hong, Chuang Yao, Yen-Ting Liu, Ting-Kuan Wu
  • Patent number: 11907213
    Abstract: A query processing method including decomposing an SQL into logical plans based on data source feature information, to obtain a logical plan set, where the data source feature information is stored in an internal storage space of the query engine; generating physical plans for the logical plan set based on the data source feature information, to obtain a physical plan set; determining query costs of the physical plan set based on the data source feature information, to obtain a physical plan with a highest priority; and executing the physical plan with the highest priority, to obtain a query result queried by a user.
    Type: Grant
    Filed: June 6, 2022
    Date of Patent: February 20, 2024
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Bing Zhou, Wenwei Xue, Ting Yu Cliff Leung, Tao Li
  • Patent number: 11860853
    Abstract: Embodiments of the system include a memory that stores a metamodel including a plurality of predefined characteristics for data sets. A data repository stores a plurality of heterogeneous data sets, each of the plurality of data sets comprising a plurality of data batches received over time. An interface receives a new data set for storage into the data repository, and a data health reasoner retrieves the stored metamodel from the memory, the stored metamodel including a plurality of predefined characteristics. The data health reasoner determines measured values of a subset of the plurality of predefined characteristics identified based on the stored metamodel, and determines a set of data health metrics for the data set based on the measured values of the subset of the set of the predefined characteristics. The data health reasoner formulates a plurality of data validation assertions for the data set and applies the plurality of data validation assertions to each instance of the data set.
    Type: Grant
    Filed: November 9, 2021
    Date of Patent: January 2, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Liangzhao Zeng, Ting Yu Cliff Leung, Jimmy Hong, Yat On Lau
  • Publication number: 20230376469
    Abstract: Embodiments of the system include data set analytics to identify one or more data sets utilized by a workflow. The data set analytics identifies upstream data sets referenced by the data sets utilized by the workflow. All data sets relevant to the workflow are considered applicable data sets, and are analyzed. The data set analytics determines a usage pattern of each of the applicable data sets by the workflow, and identifies one or more data quality assertions for each of the applicable data sets based on the usage pattern. The data set analytics further performs a quality evaluation of the applicable data sets by applying data quality assertions to the applicable data sets used by the workflow.
    Type: Application
    Filed: May 23, 2022
    Publication date: November 23, 2023
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Liangzhao ZENG, Ting Yu Cliff LEUNG, Xiaoyang GAO
  • Publication number: 20230147939
    Abstract: Embodiments maintain a data pool that includes heterogeneous data sets, and receive a first data batch of a data set from a data source into the data pool. Embodiments determine a current state of the data set based on a data set state diagram including a plurality of data set states, and identify a condition of the first data batch. Embodiments further set a data batch state for the first data batch, based on a data batch state diagram, and update the data batch state of a prior data batch received before the first data batch, based on the condition of the first data batch. Embodiments additionally transition the data set state diagram, based on the condition of the first data batch, to an updated data set state. Embodiments maintain a data state repository storing the data set state for each of the plurality of heterogeneous data sets.
    Type: Application
    Filed: November 9, 2021
    Publication date: May 11, 2023
    Inventors: Liangzhao ZENG, Ting Yu Cliff LEUNG, Yat On LAU, Jimmy HONG, Chuang YAO, Yen-Ting LIU, Ting-Kuan WU
  • Publication number: 20230145069
    Abstract: Embodiments of the system include a memory that stores a metamodel including a plurality of predefined characteristics for data sets. A data repository stores a plurality of heterogeneous data sets, each of the plurality of data sets comprising a plurality of data batches received over time. An interface receives a new data set for storage into the data repository, and a data health reasoner retrieves the stored metamodel from the memory, the stored metamodel including a plurality of predefined characteristics. The data health reasoner determines measured values of a subset of the plurality of predefined characteristics identified based on the stored metamodel, and determines a set of data health metrics for the data set based on the measured values of the subset of the set of the predefined characteristics. The data health reasoner formulates a plurality of data validation assertions for the data set and applies the plurality of data validation assertions to each instance of the data set.
    Type: Application
    Filed: November 9, 2021
    Publication date: May 11, 2023
    Inventors: Liangzhao ZENG, Ting Yu Cliff LEUNG, Jimmy HONG, Yat On LAU
  • Publication number: 20220300506
    Abstract: A query processing method including decomposing an SQL into logical plans based on data source feature information, to obtain a logical plan set, where the data source feature information is stored in an internal storage space of the query engine; generating physical plans for the logical plan set based on the data source feature information, to obtain a physical plan set; determining query costs of the physical plan set based on the data source feature information, to obtain a physical plan with a highest priority; and executing the physical plan with the highest priority, to obtain a query result queried by a user. The embodiments of this application further disclose a data source registration method and a query engine.
    Type: Application
    Filed: June 6, 2022
    Publication date: September 22, 2022
    Inventors: Bing ZHOU, Wenwei XUE, Ting Yu Cliff LEUNG, Tao LI
  • Patent number: 11366808
    Abstract: A query processing method includes decomposing an SQL into logical plans based on data source feature information, to obtain a logical plan set, where the data source feature information is stored in an internal data source feature library of a query engine, and the internal data source feature library is stored in cache space of the query engine; generating physical plans for the logical plan set based on the data source feature information, to obtain a physical plan set; determining query costs of the physical plan set based on the data source feature information, to obtain a physical plan with a highest priority; and executing the physical plan with the highest priority, to obtain a query result queried by a user. A data source registration method and a query engine are further disclosed.
    Type: Grant
    Filed: October 24, 2019
    Date of Patent: June 21, 2022
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Bing Zhou, Wenwei Xue, Ting Yu Cliff Leung, Tao Li
  • Publication number: 20200073867
    Abstract: A query processing method including decomposing an SQL into logical plans based on data source feature information, to obtain a logical plan set, where the data source feature information is stored in an internal data source feature library of a query engine, and the internal data source feature library is stored in cache space of the query engine; generating physical plans for the logical plan set based on the data source feature information, to obtain a physical plan set; determining query costs of the physical plan set based on the data source feature information, to obtain a physical plan with a highest priority; and executing the physical plan with the highest priority, to obtain a query result queried by a user. The embodiments of this application further disclose a data source registration method and a query engine.
    Type: Application
    Filed: October 24, 2019
    Publication date: March 5, 2020
    Inventors: Bing ZHOU, Wenwei XUE, Ting Yu Cliff LEUNG, Tao LI
  • Publication number: 20180293272
    Abstract: A method for cloning data samples in a data set based on statistic information of the data samples. The method does not use any of the data samples to perform the cloning. The statistic information includes a first set of statistic parameters obtained from a data matrix formed by data entries of the data samples based on Eckart-Young theorem, and a second set of statistic parameters indicating statistical properties of the data entries of the data samples. The data samples are reconstructed using the first and the second sets of statistic parameters based on Eckart-Young theorem.
    Type: Application
    Filed: April 5, 2017
    Publication date: October 11, 2018
    Inventors: Jiangsheng Yu, Shijun Ma, Qingqing Zhou, Ting Yu Cliff Leung
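
The last abstract above refers to the Eckart-Young theorem, which states that truncating the singular value decomposition of a matrix to its k largest singular values gives the best rank-k approximation in the Frobenius norm. The sketch below illustrates only that theorem with NumPy. It is not the patented cloning method; the function name `best_rank_k` and the random test matrix are assumptions made for illustration.

```python
import numpy as np


def best_rank_k(matrix, k):
    """Return the rank-k truncated-SVD approximation of `matrix`.

    `best_rank_k` is a hypothetical helper name, not taken from the patent.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep the k largest singular values and their singular vectors.
    return (u[:, :k] * s[:k]) @ vt[:k, :]


# Illustrative data only: a random 6x4 matrix standing in for a data matrix.
rng = np.random.default_rng(0)
data = rng.normal(size=(6, 4))

k = 2
approx = best_rank_k(data, k)

# By the Eckart-Young theorem, the Frobenius error of the rank-k truncation
# equals the root of the sum of squares of the discarded singular values.
singular_values = np.linalg.svd(data, compute_uv=False)
error = np.linalg.norm(data - approx)
expected_error = np.sqrt(np.sum(singular_values[k:] ** 2))
```

Here `error` and `expected_error` agree up to floating-point rounding, which is the property the theorem guarantees for the truncated decomposition.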