Patents by Inventor Ting Yu Cliff Leung
Ting Yu Cliff Leung has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11966381
Abstract: Embodiments maintain a data pool that includes heterogeneous data sets, and receive a first data batch of a data set from a data source into the data pool. Embodiments determine a current state of the data set based on a data set state diagram including a plurality of data set states, and identify a condition of the first data batch. Embodiments further set a data batch state for the first data batch, based on a data batch state diagram, and update the data batch state of a prior data batch received before the first data batch, based on the condition of the first data batch. Embodiments additionally transition the data set state diagram, based on the condition of the first data batch, to an updated data set state. Embodiments maintain a data state repository storing the data set state for each of the plurality of heterogeneous data sets.
Type: Grant
Filed: November 9, 2021
Date of Patent: April 23, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Liangzhao Zeng, Ting Yu Cliff Leung, Yat On Lau, Jimmy Hong, Chuang Yao, Yen-Ting Liu, Ting-Kuan Wu
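The state handling described in this abstract can be illustrated with a minimal sketch. The abstract does not enumerate the states, conditions, or transitions, so every name below (`EMPTY`, `HEALTHY`, `DEGRADED`, `ACTIVE`, `REJECTED`, `SUPERSEDED`, the `valid`/`invalid` conditions, and both classes) is a hypothetical placeholder chosen for illustration, not the patented design.

```python
# Hypothetical data set state diagram: (current state, batch condition) -> next state.
# The abstract does not list actual states or conditions; these are assumptions.
DATASET_TRANSITIONS = {
    ("EMPTY", "valid"): "HEALTHY",
    ("EMPTY", "invalid"): "DEGRADED",
    ("HEALTHY", "valid"): "HEALTHY",
    ("HEALTHY", "invalid"): "DEGRADED",
    ("DEGRADED", "valid"): "HEALTHY",
    ("DEGRADED", "invalid"): "DEGRADED",
}


class DataSetTracker:
    """Tracks one data set's state and the state of each batch it has received."""

    def __init__(self):
        self.state = "EMPTY"
        self.batch_states = []  # one entry per batch, oldest first

    def receive(self, condition):
        # Update prior batches based on the new batch's condition:
        # in this sketch, a valid batch supersedes any earlier active batch.
        if condition == "valid":
            self.batch_states = [
                "SUPERSEDED" if s == "ACTIVE" else s for s in self.batch_states
            ]
        # Set the batch state for the newly received batch.
        self.batch_states.append("ACTIVE" if condition == "valid" else "REJECTED")
        # Transition the data set state based on the same condition.
        self.state = DATASET_TRANSITIONS[(self.state, condition)]


class DataStateRepository:
    """Stores the current data set state for every data set in the pool."""

    def __init__(self):
        self.trackers = {}

    def receive_batch(self, data_set, condition):
        tracker = self.trackers.setdefault(data_set, DataSetTracker())
        tracker.receive(condition)
        return tracker.state
```

In this sketch a single batch condition drives three updates at once: the new batch's state, the prior batch's state, and the data set's state, which mirrors the order of steps in the abstract.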
-
Patent number: 11907213
Abstract: A query processing method including decomposing an SQL into logical plans based on data source feature information, to obtain a logical plan set, where the data source feature information is stored in an internal storage space of the query engine; generating physical plans for the logical plan set based on the data source feature information, to obtain a physical plan set; determining query costs of the physical plan set based on the data source feature information, to obtain a physical plan with a highest priority; and executing the physical plan with the highest priority, to obtain a query result queried by a user.
Type: Grant
Filed: June 6, 2022
Date of Patent: February 20, 2024
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Bing Zhou, Wenwei Xue, Ting Yu Cliff Leung, Tao Li
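The cost-based plan selection outlined in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: the feature fields (`supports_filter_pushdown`, `cost_per_row`), the two candidate plans, the cost formula, and the reading of "highest priority" as "lowest estimated cost" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceFeatures:
    """Hypothetical feature record for a registered data source."""
    name: str
    supports_filter_pushdown: bool
    cost_per_row: float  # assumed relative cost of reading one row


@dataclass(frozen=True)
class PhysicalPlan:
    description: str
    rows_read: int


def generate_physical_plans(table_rows, selectivity, features):
    """Enumerate candidate physical plans for one filtered-scan logical plan."""
    plans = [PhysicalPlan("scan all rows, filter in engine", table_rows)]
    if features.supports_filter_pushdown:
        # The source can apply the filter itself, so fewer rows are transferred.
        plans.append(
            PhysicalPlan("push filter to source", round(table_rows * selectivity))
        )
    return plans


def plan_cost(plan, features):
    """Assumed cost model: rows read times the source's per-row cost."""
    return plan.rows_read * features.cost_per_row


def choose_plan(plans, features):
    """Treat the cheapest plan as the one with the highest priority (assumption)."""
    return min(plans, key=lambda p: plan_cost(p, features))


features = SourceFeatures("orders_db", supports_filter_pushdown=True, cost_per_row=0.5)
candidates = generate_physical_plans(table_rows=1000, selectivity=0.1, features=features)
best = choose_plan(candidates, features)
```

Here the data source features influence both which plans are generated and how each is costed, which is the role the abstract assigns to the data source feature information.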
-
Patent number: 11860853
Abstract: Embodiments of the system include a memory that stores a metamodel including a plurality of predefined characteristics for data sets. A data repository stores a plurality of heterogeneous data sets, each of the plurality of data sets comprising a plurality of data batches received over time. An interface receives a new data set for storage into the data repository, and a data health reasoner retrieves the stored metamodel from the memory, the stored metamodel including a plurality of predefined characteristics. The data health reasoner determines measured values of a subset of the plurality of predefined characteristics identified based on the stored metamodel, and determines a set of data health metrics for the data set based on the measured values of the subset of the set of the predefined characteristics. The data health reasoner formulates a plurality of data validation assertions for the data set and applies the plurality of data validation assertions to each instance of the data set.
Type: Grant
Filed: November 9, 2021
Date of Patent: January 2, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Liangzhao Zeng, Ting Yu Cliff Leung, Jimmy Hong, Yat On Lau
-
Publication number: 20230376469
Abstract: Embodiments of the system include data set analytics to identify one or more data sets utilized by a workflow. The data set analytics identifies upstream data sets referenced by the data sets utilized by the workflow. The set of all data sets relevant to the workflow are considered applicable data sets, and are analyzed. The data set analytics determines a usage pattern of each of the applicable data sets by the workflow, and identifies one or more data quality assertions for each of the applicable data sets based on the usage pattern. The data set analytics further performs a quality evaluation of the applicable data sets by applying data quality assertions to the applicable data sets used by the workflow.
Type: Application
Filed: May 23, 2022
Publication date: November 23, 2023
Applicant: Microsoft Technology Licensing, LLC
Inventors: Liangzhao ZENG, Ting Yu Cliff LEUNG, Xiaoyang GAO
-
Publication number: 20230147939
Abstract: Embodiments maintain a data pool that includes heterogeneous data sets, and receive a first data batch of a data set from a data source into the data pool. Embodiments determine a current state of the data set based on a data set state diagram including a plurality of data set states, and identify a condition of the first data batch. Embodiments further set a data batch state for the first data batch, based on a data batch state diagram, and update the data batch state of a prior data batch received before the first data batch, based on the condition of the first data batch. Embodiments additionally transition the data set state diagram, based on the condition of the first data batch, to an updated data set state. Embodiments maintain a data state repository storing the data set state for each of the plurality of heterogeneous data sets.
Type: Application
Filed: November 9, 2021
Publication date: May 11, 2023
Inventors: Liangzhao ZENG, Ting Yu Cliff LEUNG, Yat On LAU, Jimmy HONG, Chuang YAO, Yen-Ting LIU, Ting-Kuan WU
-
Publication number: 20230145069
Abstract: Embodiments of the system include a memory that stores a metamodel including a plurality of predefined characteristics for data sets. A data repository stores a plurality of heterogeneous data sets, each of the plurality of data sets comprising a plurality of data batches received over time. An interface receives a new data set for storage into the data repository, and a data health reasoner retrieves the stored metamodel from the memory, the stored metamodel including a plurality of predefined characteristics. The data health reasoner determines measured values of a subset of the plurality of predefined characteristics identified based on the stored metamodel, and determines a set of data health metrics for the data set based on the measured values of the subset of the set of the predefined characteristics. The data health reasoner formulates a plurality of data validation assertions for the data set and applies the plurality of data validation assertions to each instance of the data set.
Type: Application
Filed: November 9, 2021
Publication date: May 11, 2023
Inventors: Liangzhao ZENG, Ting Yu Cliff LEUNG, Jimmy HONG, Yat On LAU
-
Publication number: 20220300506
Abstract: A query processing method including decomposing an SQL into logical plans based on data source feature information, to obtain a logical plan set, where the data source feature information is stored in an internal storage space of the query engine; generating physical plans for the logical plan set based on the data source feature information, to obtain a physical plan set; determining query costs of the physical plan set based on the data source feature information, to obtain a physical plan with a highest priority; and executing the physical plan with the highest priority, to obtain a query result queried by a user. The embodiments of this application further disclose a data source registration method and a query engine.
Type: Application
Filed: June 6, 2022
Publication date: September 22, 2022
Inventors: Bing ZHOU, Wenwei XUE, Ting Yu Cliff LEUNG, Tao LI
-
Patent number: 11366808
Abstract: A query processing method includes decomposing an SQL into logical plans based on data source feature information, to obtain a logical plan set, where the data source feature information is stored in an internal data source feature library of a query engine, and the internal data source feature library is stored in cache space of the query engine; generating physical plans for the logical plan set based on the data source feature information, to obtain a physical plan set; determining query costs of the physical plan set based on the data source feature information, to obtain a physical plan with a highest priority; and executing the physical plan with the highest priority, to obtain a query result queried by a user. A data source registration method and a query engine are further disclosed.
Type: Grant
Filed: October 24, 2019
Date of Patent: June 21, 2022
Assignee: HUAWEI TECHNOLOGIES CO., LTD.
Inventors: Bing Zhou, Wenwei Xue, Ting Yu Cliff Leung, Tao Li
-
Publication number: 20200073867
Abstract: A query processing method including decomposing an SQL into logical plans based on data source feature information, to obtain a logical plan set, where the data source feature information is stored in an internal data source feature library of a query engine, and the internal data source feature library is stored in cache space of the query engine; generating physical plans for the logical plan set based on the data source feature information, to obtain a physical plan set; determining query costs of the physical plan set based on the data source feature information, to obtain a physical plan with a highest priority; and executing the physical plan with the highest priority, to obtain a query result queried by a user. The embodiments of this application further disclose a data source registration method and a query engine.
Type: Application
Filed: October 24, 2019
Publication date: March 5, 2020
Inventors: Bing ZHOU, Wenwei XUE, Ting Yu Cliff LEUNG, Tao LI
-
Publication number: 20180293272
Abstract: A method for cloning data samples in a data set based on statistic information of the data samples. The method does not use any of the data samples to perform the cloning. The statistic information includes a first set of statistic parameters obtained from a data matrix formed by data entries of the data samples based on Eckart-Young theorem, and a second set of statistic parameters indicating statistical properties of the data entries of the data samples. The data samples are reconstructed using the first and the second sets of statistic parameters based on Eckart-Young theorem.
Type: Application
Filed: April 5, 2017
Publication date: October 11, 2018
Inventors: Jiangsheng Yu, Shijun Ma, Qingqing Zhou, Ting Yu Cliff Leung
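For reference, the Eckart-Young theorem cited in this abstract can be stated as below. The symbols ($A$ for the data matrix, $\sigma_i$, $u_i$, $v_i$ for its singular values and vectors, $k$ for the target rank) are notation chosen here, and the abstract does not say which quantities the method keeps as its statistic parameters.

```latex
% Singular value decomposition of an m-by-n data matrix A of rank r
A = U \Sigma V^{\top} = \sum_{i=1}^{r} \sigma_i \, u_i v_i^{\top},
\qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0 .

% Eckart-Young: the truncated SVD is a best rank-k approximation in Frobenius norm
A_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{\top},
\qquad
\min_{\operatorname{rank}(B) \le k} \lVert A - B \rVert_F
  = \lVert A - A_k \rVert_F
  = \Bigl( \sum_{i=k+1}^{r} \sigma_i^{2} \Bigr)^{1/2} .
```

In other words, keeping only the $k$ largest singular values and their vectors gives the closest rank-$k$ reconstruction of the data matrix, with an error determined by the discarded singular values.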