Patents by Inventor Mingge Deng

Mingge Deng has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11948159
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for scalable matrix factorization. A method includes obtaining a Structured Query Language (SQL) query to create a matrix factorization model based on a set of training data, generating SQL sub-queries that don't include non-scalable functions, obtaining the set of training data, and generating a matrix factorization model based on the set of training data and the SQL sub-queries that don't include non-scalable functions.
    Type: Grant
    Filed: April 8, 2020
    Date of Patent: April 2, 2024
    Assignee: Google LLC
    Inventors: Amir H. Hormati, Lisa Yin, Umar Ali Syed, Mingge Deng
  • Patent number: 11928017
    Abstract: A method includes receiving a point data anomaly detection query from a user. The query requests the data processing hardware to determine a quantity of anomalous point data values in a set of point data values. The method includes training a model using the set of point data values. For at least one respective point data value in the set of point data values, the method includes determining, using the trained model, a variance value for the respective point data value and determining that the variance value satisfies a threshold value. Based on the variance value satisfying the threshold value, the method includes determining that the respective point data value is an anomalous point data value. The method includes reporting the determined anomalous point data value to the user.
    Type: Grant
    Filed: May 21, 2022
    Date of Patent: March 12, 2024
    Assignee: Google LLC
    Inventors: Zichuan Ye, Jiashang Liu, Forest Elliott, Amir Hormati, Xi Cheng, Mingge Deng
  • Publication number: 20240045845
    Abstract: A method for unstructured data analytics in data warehouses includes receiving an unstructured data query from a user, the unstructured data query requesting the data processing hardware determine one or more unstructured data files stored at a data repository that match query parameters. The method includes determining, using an object table, a set of unstructured data files stored at the data repository that matches the query parameters. The object table includes a plurality of rows, each row of the plurality of rows associated with a respective unstructured data file stored at the data repository, and a plurality of columns, each column of the plurality of columns comprising metadata associated with the respective unstructured data file of each row of the plurality of rows. The method includes returning, to the user, a structured data table including the determined set of unstructured data files.
    Type: Application
    Filed: August 6, 2022
    Publication date: February 8, 2024
    Applicant: Google LLC
    Inventors: Thibaud Baptiste Hottelier, Yuri Volobuev, Mingge Deng, Justin Levandoski, Gaurav Saxena, Deepak Choudhary Nettem, Anoop Kochummen Johnson
  • Patent number: 11842291
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, that create a machine learning model with k-means clustering. In some implementations, an instruction to create a model is obtained. A data set including geographic data and non-geographic data is received. The data set includes multiple data entries. Geographic centroids are determined from the geographic data. The data set is analyzed to obtain statistics of the data set. Transformed data is generated from the data set, the statistics, and the geographic centroids. A model is generated with the transformed data, the model indicating multiple data groupings.
    Type: Grant
    Filed: December 6, 2022
    Date of Patent: December 12, 2023
    Assignee: Google LLC
    Inventors: Mingge Deng, Amir H. Hormati, Xi Cheng
  • Publication number: 20230274180
    Abstract: A method for forecasting time-series data, when executed by data processing hardware, causes the data processing hardware to perform operations including receiving a time series forecasting query from a user requesting a time series forecast forecasting future data based on a set of current time-series data. The operations include obtaining, from the set of current time-series data, a set of training data. The operations include training, using a first portion of the set of training data, a first sub-model of a forecasting model and training, using a second portion of the set of training data, a second sub-model of the forecasting model. The second portion is different than the first portion. The operations include forecasting, using the forecasting model, the future data based on the set of current time-series data and returning, to the user, the forecasted future data for the time series forecast.
    Type: Application
    Filed: February 28, 2022
    Publication date: August 31, 2023
    Applicant: Google LLC
    Inventors: Xi Cheng, Jiashang Liu, Lisa Yin, Amir Hossein Hormati, Mingge Deng, Weijie Shen, Kashif Yousuf
  • Publication number: 20230153311
    Abstract: A method for anomaly detection includes receiving an anomaly detection query from a user. The anomaly detection query requests data processing hardware determine one or more anomalies in a dataset including a plurality of examples. Each example in the plurality of examples is associated with one or more features. The method includes training a model using the dataset. The trained model is configured to use a local outlier factor (LOF) algorithm. For each respective example of the plurality of examples in the dataset, the method includes determining, using the trained model, a respective local deviation score based on the one or more features. The method includes determining that the respective local deviation score satisfies a deviation score threshold and, based on the local deviation score satisfying the threshold, determining that the respective example is anomalous. The method includes reporting the respective anomalous example to the user.
    Type: Application
    Filed: November 8, 2022
    Publication date: May 18, 2023
    Applicant: Google LLC
    Inventors: Xi Cheng, Zichuan Ye, Peng Lin, Jiashang Liu, Amir Hormati, Mingge Deng
  • Publication number: 20230094479
    Abstract: A method includes receiving a model analysis request from a user. The model analysis request requests the data processing hardware to provide one or more statistics of a model trained on a dataset. The method also includes obtaining the trained model. The trained model includes a plurality of weights. Each weight is assigned to a feature of the trained model. The method also includes determining, using the dataset and the plurality of weights, the one or more statistics of the trained model based on a linear regression of the trained model. The method includes reporting the one or more statistics of the trained model to the user.
    Type: Application
    Filed: September 30, 2021
    Publication date: March 30, 2023
    Applicant: Google LLC
    Inventors: Xi Cheng, Lisa Yin, Mingge Deng, Amir Hormati, Umar Ali Syed, Jiashang Liu
  • Publication number: 20230094005
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, that create a machine learning model with k-means clustering. In some implementations, an instruction to create a model is obtained. A data set including geographic data and non-geographic data is received. The data set includes multiple data entries. Geographic centroids are determined from the geographic data. The data set is analyzed to obtain statistics of the data set. Transformed data is generated from the data set, the statistics, and the geographic centroids. A model is generated with the transformed data, the model indicating multiple data groupings.
    Type: Application
    Filed: December 6, 2022
    Publication date: March 30, 2023
    Applicant: Google LLC
    Inventors: Mingge Deng, Amir H. Hormati, Xi Cheng
  • Publication number: 20230045139
    Abstract: A method for principal component analysis includes receiving a principal component analysis (PCA) request from a user requesting data processing hardware to perform PCA on a dataset, the dataset including a plurality of input features. The method further includes training a PCA model on the plurality of input features of the dataset. The method includes determining, using the trained PCA model, one or more principal components of the dataset. The method also includes generating, based on the plurality of input features and the one or more principal components, one or more embedded features of the dataset. The method includes returning the one or more embedded features to the user.
    Type: Application
    Filed: July 29, 2022
    Publication date: February 9, 2023
    Applicant: Google LLC
    Inventors: Xi Cheng, Mingge Deng, Amir Hossein Hormati
  • Patent number: 11544596
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, that create a machine learning model with k-means clustering. In some implementations, an instruction to create a model is obtained. A data set including geographic data and non-geographic data is received. The data set includes multiple data entries. Geographic centroids are determined from the geographic data. The data set is analyzed to obtain statistics of the data set. Transformed data is generated from the data set, the statistics, and the geographic centroids. A model is generated with the transformed data, the model indicating multiple data groupings.
    Type: Grant
    Filed: April 8, 2020
    Date of Patent: January 3, 2023
    Assignee: Google LLC
    Inventors: Mingge Deng, Amir H. Hormati, Xi Cheng
  • Publication number: 20220405623
    Abstract: The disclosure is directed to a query-driven machine learning platform for generating feature attributions and other data for interpreting the relationship between inputs and outputs of a machine learning model. The platform can receive query statements for selecting data, training a machine learning model, and generating model explanation data for the model. The platform can distribute processing for generating the model explanation data to scale in response to requests to process selected data, including multiple records with a variety of different feature values. The interface between a user device and the machine learning platform can streamline deployment of different model explainability approaches across a variety of different machine learning models.
    Type: Application
    Filed: June 22, 2021
    Publication date: December 22, 2022
    Inventors: Xi Cheng, Lisa Yin, Jiashang Liu, Amir H. Hormati, Mingge Deng, Christopher Avery Meyers
  • Publication number: 20220382622
    Abstract: A method includes receiving a point data anomaly detection query from a user. The query requests the data processing hardware to determine a quantity of anomalous point data values in a set of point data values. The method includes training a model using the set of point data values. For at least one respective point data value in the set of point data values, the method includes determining, using the trained model, a variance value for the respective point data value and determining that the variance value satisfies a threshold value. Based on the variance value satisfying the threshold value, the method includes determining that the respective point data value is an anomalous point data value. The method includes reporting the determined anomalous point data value to the user.
    Type: Application
    Filed: May 21, 2022
    Publication date: December 1, 2022
    Applicant: Google LLC
    Inventors: Zichuan Ye, Jiashang Liu, Forest Elliott, Amir Hormati, Xi Cheng, Mingge Deng
  • Publication number: 20220366318
    Abstract: A method, when executed by data processing hardware, causes the data processing hardware to perform operations including receiving, from a user device, a hyperparameter optimization request requesting optimization of one or more hyperparameters of a machine learning model. The operations include obtaining training data for training the machine learning model and determining a set of hyperparameter permutations of the one or more hyperparameters. For each respective hyperparameter permutation in the set of hyperparameter permutations, the operations include training a unique machine learning model using the training data and the respective hyperparameter permutation and determining a performance of the trained model. The operations include selecting, based on the performance of each of the trained unique machine learning models of the user device, one of the trained unique machine learning models.
    Type: Application
    Filed: May 15, 2022
    Publication date: November 17, 2022
    Applicant: Google LLC
    Inventors: Jiaxun Wu, Zichuan Ye, Mingge Deng, Amir Hormati
  • Publication number: 20200320072
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for scalable matrix factorization. A method includes obtaining a Structured Query Language (SQL) query to create a matrix factorization model based on a set of training data, generating SQL sub-queries that don't include non-scalable functions, obtaining the set of training data, and generating a matrix factorization model based on the set of training data and the SQL sub-queries that don't include non-scalable functions.
    Type: Application
    Filed: April 8, 2020
    Publication date: October 8, 2020
    Inventors: Amir H. Hormati, Lisa Yin, Umar Ali Syed, Mingge Deng
  • Publication number: 20200320413
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, that create a machine learning model with k-means clustering. In some implementations, an instruction to create a model is obtained. A data set including geographic data and non-geographic data is received. The data set includes multiple data entries. Geographic centroids are determined from the geographic data. The data set is analyzed to obtain statistics of the data set. Transformed data is generated from the data set, the statistics, and the geographic centroids. A model is generated with the transformed data, the model indicating multiple data groupings.
    Type: Application
    Filed: April 8, 2020
    Publication date: October 8, 2020
    Inventors: Mingge Deng, Amir H. Hormati, Xi Cheng
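
Illustrative note: the point data anomaly detection abstracts listed above (patent 11928017 and publication 20220382622) describe training a model on a set of point values, computing a variance value for each point, and flagging points whose variance value satisfies a threshold. The abstracts do not say which model or score is used, so the following Python sketch is only a rough illustration under stated assumptions, not the patented method. It assumes the "model" is just the sample mean and standard deviation, the "variance value" is a squared z-score, and the threshold is 4.0. The function name find_anomalies and the sample readings are hypothetical.

```python
from statistics import fmean, pstdev


def find_anomalies(values, threshold=4.0):
    """Return the values whose squared z-score exceeds `threshold`.

    Assumption: the "trained model" is only the mean and population
    standard deviation of `values`; the abstracts do not specify one.
    """
    mean = fmean(values)   # "training": fit the mean ...
    std = pstdev(values)   # ... and the population standard deviation
    if std == 0:
        # All values are identical, so nothing can deviate from the mean.
        return []

    anomalies = []
    for v in values:
        # Hypothetical "variance value": squared distance from the mean,
        # measured in standard deviations.
        score = ((v - mean) / std) ** 2
        if score > threshold:
            anomalies.append(v)
    return anomalies


# Hypothetical sample data: six readings near 10 and one far away.
readings = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 25.0]
print(find_anomalies(readings))
```

A squared z-score keeps the sketch free of third-party dependencies. It would not be a suitable stand-in for the local outlier factor approach described in publication 20230153311, which scores each example against its neighbours rather than against a global mean.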