Patents by Inventor Wangchao Le

Wangchao Le has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250139091
    Abstract: A click-to-script service enables developers of big-data job scripts to quickly see the underlying script operations from optimized execution plans. Once a big-data job is received, the disclosed examples compile it and generate tokens that are associated with each operation of the big-data job. These tokens may include the file name of the job, the line number of the operation, and/or an Abstract Syntax Tree (AST) node for the given operations. An original execution plan is optimized into an optimized execution plan, and the tokens for the original operations of the job script are assigned to the optimized operations of the optimized execution plan. The optimized execution plan is graphically displayed in an interactive manner such that users may view the optimized execution plan and click on its optimized operations to find the original operations of the job script.
    Type: Application
    Filed: November 4, 2024
    Publication date: May 1, 2025
    Inventors: Xiangnan Li, Marc Todd Friedman, Wangchao Le, Evgueni Zabokritski
  • Patent number: 12164516
    Abstract: A click-to-script service enables developers of big-data job scripts to quickly see the underlying script operations from optimized execution plans. Once a big-data job is received, the disclosed examples compile it and generate tokens that are associated with each operation of the big-data job. These tokens may include the file name of the job, the line number of the operation, and/or an Abstract Syntax Tree (AST) node for the given operations. An original execution plan is optimized into an optimized execution plan, and the tokens for the original operations of the job script are assigned to the optimized operations of the optimized execution plan. The optimized execution plan is graphically displayed in an interactive manner such that users may view the optimized execution plan and click on its optimized operations to find the original operations of the job script.
    Type: Grant
    Filed: June 25, 2021
    Date of Patent: December 10, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Xiangnan Li, Marc Todd Friedman, Wangchao Le, Evgueni Zabokritski
  • Patent number: 12067014
    Abstract: Example aspects include techniques for clustering delete targets for vectorized deletion including retrieving, from a set of delete targets in a distributed database system, a file to be deleted, scanning existing clusters of files marked for deletion to identify at least one existing cluster of files having constraints corresponding to the file, based on identifying the at least one existing cluster of files, adding the file to the at least one existing cluster of files to create a new cluster of files, based on failing to identify the at least one existing cluster of files, creating the new cluster of files including the file, and generating, for each file in the new cluster of files and based on a deletion signal, a delta array including multiple bits representing data items in each file and indicating, based on bit value, target data items to be deleted from each file.
    Type: Grant
    Filed: June 14, 2023
    Date of Patent: August 20, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Wangchao Le, Marc Todd Friedman, Hiren Patel
  • Publication number: 20240126754
    Abstract: A click-to-script service enables developers of big-data job scripts to quickly see the underlying script operations from optimized execution plans. Once a big-data job is received, the disclosed examples compile it and generate tokens that are associated with each operation of the big-data job. These tokens may include the file name of the job, the line number of the operation, and/or an Abstract Syntax Tree (AST) node for the given operations. An original execution plan is optimized into an optimized execution plan, and the tokens for the original operations of the job script are assigned to the optimized operations of the optimized execution plan. The optimized execution plan is graphically displayed in an interactive manner such that users may view the optimized execution plan and click on its optimized operations to find the original operations of the job script.
    Type: Application
    Filed: June 25, 2021
    Publication date: April 18, 2024
    Inventors: Xiangnan Li, Marc Todd Friedman, Wangchao Le, Evgueni Zabokritski
  • Publication number: 20230325390
    Abstract: Example aspects include techniques for clustering delete targets for vectorized deletion including retrieving, from a set of delete targets in a distributed database system, a file to be deleted, scanning existing clusters of files marked for deletion to identify at least one existing cluster of files having constraints corresponding to the file, based on identifying the at least one existing cluster of files, adding the file to the at least one existing cluster of files to create a new cluster of files, based on failing to identify the at least one existing cluster of files, creating the new cluster of files including the file, and generating, for each file in the new cluster of files and based on a deletion signal, a delta array including multiple bits representing data items in each file and indicating, based on bit value, target data items to be deleted from each file.
    Type: Application
    Filed: June 14, 2023
    Publication date: October 12, 2023
    Inventors: Wangchao Le, Marc Todd Friedman, Hiren Patel
  • Patent number: 11734282
    Abstract: Example aspects include techniques for performing vectorized delete operations in a distributed database system including clustering multiple files stored in the distributed database system, and generating, for each of the multiple files and based on a deletion signal, a delta array including multiple bits representing the data items in the file and indicating, based on bit value, the target data items to be deleted from the file. Generating, for each of the multiple files, the delta array can include reading at least one second file shard of multiple second file shards before performing a join operation on at least one first file shard of multiple first file shards is completed.
    Type: Grant
    Filed: March 30, 2022
    Date of Patent: August 22, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Wangchao Le, Marc Todd Friedman, Hiren Patel
  • Patent number: 10726007
    Abstract: Constructing a heavy hitter summary for query optimization. The heavy hitter summary is constructed by sampling each of multiple partitions of a dataset using a uniform sampling rate. For each partition, performing a two-stage heavy hitter estimation process to determine whether an estimated frequency of a key of the sampled data units may be included in a partition-level heavy hitter summary. Constructing a partition-level heavy hitter summary for each partition of the dataset based on the keys determined via the two-stage process, and constructing a dataset-level heavy hitter summary based on the partition-level heavy hitter summary. The dataset-level heavy hitter summary may be used to optimize query trees.
    Type: Grant
    Filed: September 26, 2017
    Date of Patent: July 28, 2020
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Wangchao Le, Yongchul Kwon, Marc Todd Friedman
  • Patent number: 10726006
    Abstract: Query optimization of a query that is compiled into a query tree. The optimization is efficiently performed by using a distinct value estimation data structure (e.g., a KMV synopsis) that represents within an interval distinctness of values that are generated based on data within an interval, even if the resultant data from a subinterval is considered. Various candidate query trees are evaluated, with distinct value generation data structures being propagated for parent nodes based on the distinct value generation data structures of their child node(s). Propagation operations correlate to the operation represented by the parent node in the query tree. The optimizer uses the propagated distinct value estimation structure in order to evaluate the number of distinct values of data that would result from execution of the candidate query tree at least at the corresponding operations (and not just based on the distinct values of the input data).
    Type: Grant
    Filed: June 30, 2017
    Date of Patent: July 28, 2020
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Wangchao Le, Yongchul Kwon, Marc Todd Friedman
  • Publication number: 20190095487
    Abstract: Constructing a heavy hitter summary for query optimization. The heavy hitter summary is constructed by sampling each of multiple partitions of a dataset using a uniform sampling rate. For each partition, performing a two-stage heavy hitter estimation process to determine whether an estimated frequency of a key of the sampled data units may be included in a partition-level heavy hitter summary. Constructing a partition-level heavy hitter summary for each partition of the dataset based on the keys determined via the two-stage process, and constructing a dataset-level heavy hitter summary based on the partition-level heavy hitter summary. The dataset-level heavy hitter summary may be used to optimize query trees.
    Type: Application
    Filed: September 26, 2017
    Publication date: March 28, 2019
    Inventors: Wangchao Le, Yongchul Kwon, Marc Todd Friedman
  • Publication number: 20190005092
    Abstract: Query optimization of a query that is compiled into a query tree. The optimization is efficiently performed by using a distinct value estimation data structure (e.g., a KMV synopsis) that represents within an interval distinctness of values that are generated based on data within an interval, even if the resultant data from a subinterval is considered. Various candidate query trees are evaluated, with distinct value generation data structures being propagated for parent nodes based on the distinct value generation data structures of their child node(s). Propagation operations correlate to the operation represented by the parent node in the query tree. The optimizer uses the propagated distinct value estimation structure in order to evaluate the number of distinct values of data that would result from execution of the candidate query tree at least at the corresponding operations (and not just based on the distinct values of the input data).
    Type: Application
    Filed: June 30, 2017
    Publication date: January 3, 2019
    Inventors: Wangchao Le, Yongchul Kwon, Marc Todd Friedman
  • Patent number: 10095742
    Abstract: Multiquery optimization is performed in the context of RDF/SPARQL. Heuristic algorithms partition an input batch of queries into groups such that each group of queries can be optimized together. The optimization incorporates an efficient algorithm to discover the common sub-structures of multiple SPARQL queries and an effective cost model to compare candidate execution plans. No assumptions are made about the underlying SPARQL query engine. This provides portability across different RDF stores.
    Type: Grant
    Filed: November 28, 2016
    Date of Patent: October 9, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Feifei Li
  • Publication number: 20170083577
    Abstract: Multiquery optimization is performed in the context of RDF/SPARQL. Heuristic algorithms partition an input batch of queries into groups such that each group of queries can be optimized together. The optimization incorporates an efficient algorithm to discover the common substructures of multiple SPARQL queries and an effective cost model to compare candidate execution plans. No assumptions are made about the underlying SPARQL query engine. This provides portability across different RDF stores.
    Type: Application
    Filed: November 28, 2016
    Publication date: March 23, 2017
    Inventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Feifei Li
  • Patent number: 9542444
    Abstract: Multiquery optimization is performed in the context of RDF/SPARQL. Heuristic algorithms partition an input batch of queries into groups such that each group of queries can be optimized together. The optimization incorporates an efficient algorithm to discover the common sub-structures of multiple SPARQL queries and an effective cost model to compare candidate execution plans. No assumptions are made about the underlying SPARQL query engine. This provides portability across different RDF stores.
    Type: Grant
    Filed: January 28, 2016
    Date of Patent: January 10, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Feifei Li
  • Publication number: 20160162549
    Abstract: Multiquery optimization is performed in the context of RDF/SPARQL. Heuristic algorithms partition an input batch of queries into groups such that each group of queries can be optimized together. The optimization incorporates an efficient algorithm to discover the common sub-structures of multiple SPARQL queries and an effective cost model to compare candidate execution plans. No assumptions are made about the underlying SPARQL query engine. This provides portability across different RDF stores.
    Type: Application
    Filed: January 28, 2016
    Publication date: June 9, 2016
    Applicant: International Business Machines Corporation
    Inventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Feifei Li
  • Patent number: 9280583
    Abstract: Multiquery optimization is performed in the context of RDF/SPARQL. Heuristic algorithms partition an input batch of queries into groups such that each group of queries can be optimized together. The optimization incorporates an efficient algorithm to discover the common sub-structures of multiple SPARQL queries and an effective cost model to compare candidate execution plans. No assumptions are made about the underlying SPARQL query engine. This provides portability across different RDF stores.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: March 8, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Feifei Li
  • Patent number: 8983990
    Abstract: A method of performing a graph query issued by a user is provided. The method, performed on a processor, includes receiving a user graph query, rewriting the user graph query as a new query based on a query policy expressed in a graph query language, and performing the new query on graph data to obtain a result.
    Type: Grant
    Filed: August 17, 2010
    Date of Patent: March 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Min Wang
  • Patent number: 8984019
    Abstract: Keyword searching is used to explore and search large Resource Description Framework datasets having unknown or constantly changing structures. A succinct and effective summarization is built from the underlying resource description framework data. Given a keyword query, the summarization lends significant pruning powers to exploratory keyword searches and leads to much better efficiency compared to previous work. The summarization returns exact results and can be updated incrementally and efficiently.
    Type: Grant
    Filed: November 20, 2012
    Date of Patent: March 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Songyun Duan, Achille Belly Fokoue-Nkoutche, Anastasios Kementsietsidis, Wangchao Le, Feifei Li, Kavitha Srinivas
  • Patent number: 8977650
    Abstract: Keyword searching is used to explore and search large Resource Description Framework datasets having unknown or constantly changing structures. A succinct and effective summarization is built from the underlying resource description framework data. Given a keyword query, the summarization lends significant pruning powers to exploratory keyword searches and leads to much better efficiency compared to previous work. The summarization returns exact results and can be updated incrementally and efficiently.
    Type: Grant
    Filed: November 21, 2012
    Date of Patent: March 10, 2015
    Assignee: International Business Machines Corporation
    Inventors: Songyun Duan, Achille Belly Fokoue-Nkoutche, Anastasios Kementsietsidis, Wangchao Le, Feifei Li, Kavitha Srinivas
  • Publication number: 20140156633
    Abstract: Multiquery optimization is performed in the context of RDF/SPARQL. Heuristic algorithms partition an input batch of queries into groups such that each group of queries can be optimized together. The optimization incorporates an efficient algorithm to discover the common sub-structures of multiple SPARQL queries and an effective cost model to compare candidate execution plans. No assumptions are made about the underlying SPARQL query engine. This provides portability across different RDF stores.
    Type: Application
    Filed: November 30, 2012
    Publication date: June 5, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Feifei Li
  • Publication number: 20140143281
    Abstract: Keyword searching is used to explore and search large Resource Description Framework datasets having unknown or constantly changing structures. A succinct and effective summarization is built from the underlying resource description framework data. Given a keyword query, the summarization lends significant pruning powers to exploratory keyword searches and leads to much better efficiency compared to previous work. The summarization returns exact results and can be updated incrementally and efficiently.
    Type: Application
    Filed: November 21, 2012
    Publication date: May 22, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Songyun Duan, Achille Belly Fokoue-Nkoutche, Anastasios Kementsietsidis, Wangchao Le, Feifei Li, Kavitha Srinivas
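
The abstract of patent 10726006 above names a KMV synopsis as one example of a distinct value estimation data structure that can be propagated from child nodes to parent nodes of a query tree. The following is a minimal sketch of the general, textbook KMV (k minimum values) idea only, not the patented method: every name in it (`KMVSynopsis`, `from_values`, `estimate`, `union`, the choice of SHA-256 and of k = 256) is an assumption made for illustration. It shows how a small synopsis can estimate a distinct count and how two child synopses could be combined for a union-like parent operation.

```python
import hashlib
import heapq


def _unit_hash(value):
    """Map a value to a pseudo-random float in [0, 1) (hypothetical helper)."""
    digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


class KMVSynopsis:
    """Illustrative KMV synopsis: keeps the k smallest distinct hash values."""

    def __init__(self, k=256, hashes=None):
        self.k = k
        # Sorted ascending, deduplicated, truncated to the k smallest hashes.
        self.hashes = sorted(set(hashes or []))[:k]

    @classmethod
    def from_values(cls, values, k=256):
        # Hash every value, deduplicate, and keep only the k smallest hashes.
        return cls(k, heapq.nsmallest(k, {_unit_hash(v) for v in values}))

    def estimate(self):
        # Fewer than k hashes retained: every distinct value was seen exactly.
        if len(self.hashes) < self.k:
            return float(len(self.hashes))
        # Standard KMV estimator: (k - 1) divided by the k-th smallest hash.
        return (self.k - 1) / self.hashes[-1]

    def union(self, other):
        # Synopsis for the union of two inputs: merge and keep the k smallest.
        k = min(self.k, other.k)
        return KMVSynopsis(k, set(self.hashes) | set(other.hashes))


left = KMVSynopsis.from_values(range(0, 60))
right = KMVSynopsis.from_values(range(40, 100))
parent = left.union(right)  # estimates distinct values of the combined input
large = KMVSynopsis.from_values(range(10_000))
```

Here `parent.estimate()` is exact because fewer than k distinct hashes are retained, while `large.estimate()` is an approximation whose relative error shrinks as k grows. How a real optimizer propagates such structures through joins, filters, or aggregations depends on the operation at each parent node and is not covered by this sketch.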