Patents by Inventor Yeye He

Yeye He has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12386849
    Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a transformation function is executed using an example input value to obtain an initial output value. Thereafter, a plurality of supplemental transformation tools is applied to the initial output value to generate a plurality of intermediary output values. Based on a comparison of each of the intermediary output values to an example output value, the supplemental transformation tool that generated an intermediary output value having a greatest extent of similarity to the example output values is identified. The identified supplemental transformation tool and the transformation function are used to generate a transformation program that transforms the example input values to the desired form in which to transform data.
    Type: Grant
    Filed: September 9, 2020
    Date of Patent: August 12, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yeye He, Kris Ganjam, Vivek Ravindranath Narasayya, Surajit Chaudhuri
  • Patent number: 12380082
    Abstract: Systems and methods relate to auto-tagging of data in a data lake or a data storage. Generating a statistical summary of the data lake and interactively receiving data in a selected column of an exemplar data addresses an issue of efficiently and accurately auto-tagging data in a data lake. The present disclosure automatically generates a statistical summary of the data lake using a lightweight off-line processing. A graphical user interface interactively receives an exemplar data file with a selection of a column in the exemplar data file. A list of candidate data-tagging patterns is generated based on the statistical summary, and the list is updated by removing candidate data-tagging patterns that under-generalize the data. The present disclosure determines a data-tagging pattern by selecting a candidate data-tagging profile from the list based on having the least number of matching columns in the data lake.
    Type: Grant
    Filed: June 23, 2022
    Date of Patent: August 5, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yeye He, Jie Song, Yue Wang, Surajit Chaudhuri, Vishal Kumar Seshagirirao Anil, Yaron Y. Goland, Gaurav Malhotra, Blake Lassiter
  • Patent number: 12373415
    Abstract: This document relates to relational databases and corresponding data tables. Non-conforming data tables can be automatically transformed into conforming relational data tables. One example can obtain conforming relational data tables and can generate training data without human labelling by identifying a transformational operator that will transform an individual conforming relational data table to a non-conforming data table and an inverse transformational operator that will transform the non-conforming data table back to the individual conforming relational data table. The example can train a model with the training data. The trained model can synthesize programs to transform other non-conforming data tables to conforming relational data tables.
    Type: Grant
    Filed: June 7, 2023
    Date of Patent: July 29, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yeye He, Cong Yan, Yue Wang, Surajit Chaudhuri, Peng Li
  • Patent number: 12298996
    Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a set of example values is received, including example input values that indicate data values to be transformed and example output values that indicate a desired form in which to transform data. Based on the set of example values, a data transformation function that is relevant to the set of example values is identified. The data transformation function is used to generate a transformation program to transform the example input values to the desired form in which to transform data. A suggestion of the transformation program can be provided to a user device, wherein selection of the transformation program suggestion results in a data transformation.
    Type: Grant
    Filed: September 28, 2023
    Date of Patent: May 13, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yeye He, Kris K. Ganjam, Vivek Ravindranath Narasayya, Surajit Chaudhuri
  • Patent number: 12292866
    Abstract: Solutions for data unification include: receiving a data record, the data record comprising a plurality of data fields; selecting, from among the plurality of data fields, a subset of the data fields, the subset of the data fields being fewer in number than the plurality of data fields, wherein selecting the subset of the data fields comprises: applying a first rule to select at least a first one of the data fields within the data record for inclusion in the subset of the data fields; using content of the subset of the data fields, generating a stable identifier (stableID) for the data record; and inserting the stableID into a primary key data field of the data record.
    Type: Grant
    Filed: June 7, 2023
    Date of Patent: May 6, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Meiyalagan Balasubramanian, Lengning Liu, Aditya Kuppa, Kirk Hartmann Freiheit, Kalen Wong, Paula Budig Greve, Patrick Clinton Little, Lucas Pritz, Yue Wang, Vivek Ravindranath Narasayya, Katchaguy Areekijseree, Yeye He, Surajit Chaudhuri, Gaurav Ghosh
  • Publication number: 20240411740
    Abstract: This document relates to relational databases and corresponding data tables. Non-conforming data tables can be automatically transformed into conforming relational data tables. One example can obtain conforming relational data tables and can generate training data without human labelling by identifying a transformational operator that will transform an individual conforming relational data table to a non-conforming data table and an inverse transformational operator that will transform the non-conforming data table back to the individual conforming relational data table. The example can train a model with the training data. The trained model can synthesize programs to transform other non-conforming data tables to conforming relational data tables.
    Type: Application
    Filed: June 7, 2023
    Publication date: December 12, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Yeye He, Cong Yan, Yue Wang, Surajit Chaudhuri, Peng Li
  • Publication number: 20240346427
    Abstract: The present disclosure relates to methods and systems that automatically predict a business intelligence model for tables of data provided as input. The methods and systems automatically generate a graph representing the business intelligence model and provide the graph as output. The graph provides a visual representation of the business intelligence model with nodes of the graph representing each input table and edges of the graph representing weighted edges joining pairs of tables together.
    Type: Application
    Filed: April 14, 2023
    Publication date: October 17, 2024
    Inventors: Yeye He, Yiming Stefania Lin, Surajit Chaudhuri
  • Publication number: 20240184798
    Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a set of example values are received. A repository of transformation tools is searched to identify a new transformation tool as relevant to a data transformation associated with the received set of example values. The repository includes annotations associated with the new transformation tool. The new transformation tool is used to generate a transformation program that produces transformed output values. Additional annotations are generated for the new transformation tool based on the transformed output values.
    Type: Application
    Filed: December 5, 2022
    Publication date: June 6, 2024
    Inventors: Kris K. Ganjam, Yeye He, Vivek Ravindranath Narasayya, Surajit Chaudhuri
  • Patent number: 11928564
    Abstract: Systems are provided for facilitating the building and use of models used to make data preparation recommendations. The systems identify ground truth from a plurality of notebooks and utilize the ground truth to generate the corresponding data preparation recommendation models. The data preparation recommendation models are used to predict accurate (e.g., useful and relevant) data preparation steps based on user input and user notebook data. The data preparation computing system generates a recommendation prompt based on output from the data preparation recommendation model that can be viewed and/or selected by the user to be applied to the user's notebook data.
    Type: Grant
    Filed: October 19, 2022
    Date of Patent: March 12, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yeye He, Cong Yan
  • Patent number: 11886457
    Abstract: A transform-by-pattern (TBP) system is configured to proactively suggest relevant TBP programs based on an inputted source dataset and target dataset, without requiring users to type in examples. The TBP system has access to multiple TBP programs, each of which includes a combination of a source pattern, a target pattern, and a transformation program that is configured to transform data that fits into the source pattern into data that fits into the target pattern. When a source dataset and a target dataset are received from a user, the TBP system identifies a subset of the source dataset and a subset of the target dataset as related data. The TBP system then identifies one or more applicable TBP programs amongst the multiple TBP programs, and suggests or applies at least one of the one or more applicable TBP programs.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: January 30, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yeye He, Surajit Chaudhuri, Zhongjun Jin
  • Publication number: 20240028607
    Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a set of example values is received, including example input values that indicate data values to be transformed and example output values that indicate a desired form in which to transform data. Based on the set of example values, a data transformation function that is relevant to the set of example values is identified. The data transformation function is used to generate a transformation program to transform the example input values to the desired form in which to transform data. A suggestion of the transformation program can be provided to a user device, wherein selection of the transformation program suggestion results in a data transformation.
    Type: Application
    Filed: September 28, 2023
    Publication date: January 25, 2024
    Inventors: Yeye He, Kris K. Ganjam, Vivek Ravindranath Narasayya, Surajit Chaudhuri
  • Patent number: 11880344
    Abstract: Methods and systems for generating multi-operator data transformation pipelines. An example method includes accessing raw data for transformation; receiving a selection of a target table or target visualization, wherein the target table or target visualization is for data other than the raw data; extracting table properties and target constraints; and based on the extracted table properties and target constraints, synthesizing one or more multi-operator data transformation pipelines for transforming the raw data to a generated table or generated visualization.
    Type: Grant
    Filed: May 14, 2021
    Date of Patent: January 23, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yeye He, Surajit Chaudhuri, Junwen Yang
  • Publication number: 20230368068
    Abstract: The present disclosure relates to systems, methods, and computer-readable media for training and implementing pipeline error detection models to facilitate automated detection of data quality (DQ) issues within recurring data pipelines. For example, systems described herein involve training a pipeline error detection model by first constructing a plurality of DQ constraints for a recurring data pipeline based on ranges of values observed over a history of pipeline executions. The systems may further train the model to predict DQ issues by synthetically applying data variants to historical executions of the recurring data pipeline or to data pipelines having similar characteristics thereto. Once trained, the pipeline error detection model(s) can be applied to new executions of the data pipeline as they become available to quickly and efficiently predict whether a given execution includes a predicted DQ issue therein.
    Type: Application
    Filed: May 12, 2022
    Publication date: November 16, 2023
    Inventors: Yeye He, Weiwei Cui, Song Ge, Haidong Zhang, Shi Han, Dongmei Zhang, Surajit Chaudhuri
  • Patent number: 11809442
    Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a set of example values is received, including example input values that indicate data values to be transformed and example output values that indicate a desired form in which to transform data. Based on the set of example values, a data transformation function that is relevant to the set of example values is identified. The data transformation function is used to generate a transformation program to transform the example input values to the desired form in which to transform data. A suggestion of the transformation program can be provided to a user device, wherein selection of the transformation program suggestion results in a data transformation.
    Type: Grant
    Filed: April 13, 2020
    Date of Patent: November 7, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yeye He, Kris Ganjam, Vivek Ravindranath Narasayya, Surajit Chaudhuri
  • Patent number: 11809223
    Abstract: Methods, computer systems, computer-storage media, and graphical user interfaces are provided for facilitating data transformations, according to embodiments of the present invention. In one embodiment, a plurality of remote sources is searched to identify candidate transformation tools relevant for performing data transformations. The candidate transformation tools are analyzed to identify tool examples corresponding with each of the candidate transformation tools. For each of the candidate transformation tools, the tool examples are stored in association with the corresponding candidate transformation tool. Based on a comparison of tool examples with example values, a transformation tool is identified as relevant to facilitate transforming example input values to the desired form in which to transform data.
    Type: Grant
    Filed: November 8, 2021
    Date of Patent: November 7, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yeye He, Kris Ganjam, Vivek Ravindranath Narasayya, Surajit Chaudhuri, Xu Chu
  • Patent number: 11714790
    Abstract: Solutions for data unification include: receiving a data record, the data record comprising a plurality of data fields; selecting, from among the plurality of data fields, a subset of the data fields, the subset of the data fields being fewer in number than the plurality of data fields, wherein selecting the subset of the data fields comprises: applying a first rule to select at least a first one of the data fields within the data record for inclusion in the subset of the data fields; using content of the subset of the data fields, generating a stable identifier (stableID) for the data record; and inserting the stableID into a primary key data field of the data record.
    Type: Grant
    Filed: September 30, 2021
    Date of Patent: August 1, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Meiyalagan Balasubramanian, Lengning Liu, Aditya Kuppa, Kirk Hartmann Freiheit, Kalen Wong, Paula Budig Greve, Patrick Clinton Little, Lucas Pritz, Yue Wang, Vivek Ravindranath Narasayya, Katchaguy Areekijseree, Yeye He, Surajit Chaudhuri, Gaurav Ghosh
  • Patent number: 11698892
    Abstract: The present disclosure relates to systems, methods, and computer-readable media for using a variety of hypothesis tests to identify errors within tables and other structured datasets. For example, systems disclosed herein can generate a modified table from an input table by removing one or more entries from the input table. The systems disclosed herein can further leverage a collection of training tables to determine probabilities associated with whether the input table and modified table are drawn from the collection of training tables. The systems disclosed herein can additionally compare the probabilities to accurately determine whether the one or more entries include errors therein. The systems disclosed herein may apply to a variety of different sizes and types of tables to identify different types of common errors within input tables.
    Type: Grant
    Filed: October 25, 2021
    Date of Patent: July 11, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yeye He, Pei Wang
  • Publication number: 20230098926
    Abstract: Solutions for data unification include: receiving a data record, the data record comprising a plurality of data fields; selecting, from among the plurality of data fields, a subset of the data fields, the subset of the data fields being fewer in number than the plurality of data fields, wherein selecting the subset of the data fields comprises: applying a first rule to select at least a first one of the data fields within the data record for inclusion in the subset of the data fields; using content of the subset of the data fields, generating a stable identifier (stableID) for the data record; and inserting the stableID into a primary key data field of the data record.
    Type: Application
    Filed: September 30, 2021
    Publication date: March 30, 2023
    Inventors: Meiyalagan Balasubramanian, Lengning Liu, Aditya Kuppa, Kirk Hartmann Freiheit, Kalen Wong, Paula Budig Greve, Patrick Clinton Little, Lucas Pritz, Yue Wang, Vivek Ravindranath Narasayya, Katchaguy Areekijseree, Yeye He, Surajit Chaudhuri
  • Patent number: 11586838
    Abstract: Systems and techniques for end-to-end fuzzy entity matching are described herein. A first input and a second input may be received. The first input and the second input may be evaluated to identify common attribute types. A set of attribute entity matching models may be selected that correspond to the attribute types. The first input and the second input may be evaluated using the set of attribute entity matching models to determine a set of weighted scores for attribute pairs in the first input and the second input. The set of weighted scores may be evaluated using a table-level entity matching model to identify a common entity included in the first input and the second input. A linking dataset may be generated that includes a cross-linking facility indicating a relationship between a first entity descriptor in the first input and a second entity descriptor in the second input.
    Type: Grant
    Filed: July 2, 2019
    Date of Patent: February 21, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yeye He, Chen Zhao
  • Publication number: 20230043015
    Abstract: Systems are provided for facilitating the building and use of models used to make data preparation recommendations. The systems identify ground truth from a plurality of notebooks and utilize the ground truth to generate the corresponding data preparation recommendation models. The data preparation recommendation models are used to predict accurate (e.g., useful and relevant) data preparation steps based on user input and user notebook data. The data preparation computing system generates a recommendation prompt based on output from the data preparation recommendation model that can be viewed and/or selected by the user to be applied to the user's notebook data.
    Type: Application
    Filed: October 19, 2022
    Publication date: February 9, 2023
    Inventors: Yeye He, Cong Yan
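
Illustrative sketch: the data unification abstracts listed above (patent numbers 12292866 and 11714790) describe selecting a subset of a record's fields by rule, generating a stable identifier (stableID) from the content of that subset, and inserting it into the record's primary key field. The Python below is one possible reading of those steps, not the patented implementation. The SHA-256 hash, the field separator, the `id` key name, and all function names are assumptions made for illustration; the abstracts do not specify how the identifier is computed.

```python
import hashlib


def select_fields(record, rule):
    """Apply a rule (hypothetical callable) to pick the fields used for the ID."""
    return {name: value for name, value in record.items() if rule(name, value)}


def stable_id(subset):
    """Derive an identifier from the selected fields' content.

    Assumption: fields are serialized in sorted name order and hashed with
    SHA-256, so the same content always yields the same identifier.
    """
    canonical = "\x1f".join(f"{name}={subset[name]}" for name in sorted(subset))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def assign_stable_id(record, rule, key_field="id"):
    """Return a copy of the record with the stable ID in its primary key field."""
    subset = select_fields(record, rule)
    # The abstracts require a non-empty subset with fewer fields than the record.
    if not subset or len(subset) >= len(record):
        raise ValueError("rule must select a non-empty proper subset of fields")
    updated = dict(record)
    updated[key_field] = stable_id(subset)
    return updated


if __name__ == "__main__":
    # Hypothetical rule: identify a customer by name and email only.
    rule = lambda name, value: name in {"name", "email"}
    first = assign_stable_id(
        {"name": "Ada", "email": "ada@example.com", "visits": 3}, rule
    )
    later = assign_stable_id(
        {"name": "Ada", "email": "ada@example.com", "visits": 9}, rule
    )
    # Fields outside the selected subset do not affect the identifier.
    print(first["id"] == later["id"])
```

Because only the selected fields feed the hash, a record keeps the same identifier when unselected fields such as `visits` change, which is the property the word "stable" suggests.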