Patents by Inventor Tomas Jon Pfister

Tomas Jon Pfister has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12271822
    Abstract: A method for active learning includes obtaining a set of unlabeled training samples and for each unlabeled training sample, perturbing the unlabeled training sample to generate an augmented training sample. The method includes generating, using a machine learning model, a predicted label for both samples and determining an inconsistency value for the unlabeled training sample that represents variance between the predicted labels for the unlabeled and augmented training samples. The method includes sorting the unlabeled training samples based on the inconsistency values and obtaining, for a threshold number of samples selected from the sorted unlabeled training samples, a ground truth label. The method includes selecting a current set of labeled training samples including each selected unlabeled training sample paired with the corresponding ground truth label. The method includes training, using the current set and a proper subset of unlabeled training samples, the machine learning model.
    Type: Grant
    Filed: August 21, 2020
    Date of Patent: April 8, 2025
    Assignee: Google LLC
    Inventors: Zizhao Zhang, Tomas Jon Pfister, Sercan Omer Arik, Mingfei Gao
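The inconsistency-ranking loop described in the abstract above can be sketched in a few lines. This is a minimal illustrative sketch, not the patented implementation: the model, perturbation, and data below are hypothetical stand-ins.

```python
def select_for_labeling(samples, predict, perturb, budget):
    """Rank unlabeled samples by prediction inconsistency under perturbation."""
    scored = []
    for x in samples:
        p_orig = predict(x)           # predicted distribution for the sample
        p_aug = predict(perturb(x))   # prediction for its augmented version
        # Inconsistency: how much the prediction varies under perturbation.
        inconsistency = sum((a - b) ** 2 for a, b in zip(p_orig, p_aug))
        scored.append((inconsistency, x))
    scored.sort(key=lambda t: t[0], reverse=True)  # most inconsistent first
    return [x for _, x in scored[:budget]]

# Toy "model": unstable on inputs whose perturbation crosses an integer.
predict = lambda x: [x % 1.0, 1.0 - x % 1.0]
perturb = lambda x: x + 0.3
picked = select_for_labeling([0.1, 0.45, 0.8], predict, perturb, budget=1)
```

The selected samples would then be sent for ground-truth labeling and paired with their labels to form the current labeled training set.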
  • Patent number: 12125265
    Abstract: A method for training a locally interpretable model includes obtaining a set of training samples and training a black-box model using the set of training samples. The method also includes generating, using the trained black-box model and the set of training samples, a set of auxiliary training samples and training a baseline interpretable model using the set of auxiliary training samples. The method also includes training, using the set of auxiliary training samples and baseline interpretable model, an instance-wise weight estimator model. For each auxiliary training sample in the set of auxiliary training samples, the method also includes determining, using the trained instance-wise weight estimator model, a selection probability for the auxiliary training sample. The method also includes selecting, based on the selection probabilities, a subset of auxiliary training samples and training the locally interpretable model using the subset of auxiliary training samples.
    Type: Grant
    Filed: June 29, 2022
    Date of Patent: October 22, 2024
    Assignee: Google LLC
    Inventors: Sercan Omer Arik, Jinsung Yoon, Tomas Jon Pfister
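The pipeline above can be sketched end to end with toy stand-ins. Everything below is hypothetical: the "black box" is a hand-written function, and the instance-wise weight estimator is replaced by a deterministic locality rule rather than a trained model.

```python
# Stand-in black-box model: locally linear, with a jump away from the instance.
black_box = lambda x: 2.0 * x if x < 1.0 else 2.0 * x + 5.0

# Auxiliary training samples: inputs relabeled with black-box predictions.
xs = [i / 10 for i in range(20)]
aux = [(x, black_box(x)) for x in xs]

# Stand-in weight estimator: high selection probability near the instance
# whose prediction we want to explain locally.
instance = 0.3
select_prob = lambda x: 1.0 if abs(x - instance) < 0.5 else 0.1
subset = [(x, y) for x, y in aux if select_prob(x) > 0.5]

# Fit the locally interpretable model (here, ordinary least squares) on the
# selected subset; near x = 0.3 the black box is exactly y = 2x.
n = len(subset)
mx = sum(x for x, _ in subset) / n
my = sum(y for _, y in subset) / n
slope = sum((x - mx) * (y - my) for x, y in subset) / sum((x - mx) ** 2 for x, _ in subset)
```

Because the weight estimator concentrates the subset near the explained instance, the simple linear fit recovers the black box's local slope.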
  • Patent number: 12039443
    Abstract: A method includes receiving a training data set including a plurality of training data subsets. From two or more training data subsets in the training data set, the method includes selecting a support set of training examples and a query set of training examples. The method includes determining, using the classification model, a centroid value for each respective class. For each training example in the query set of training examples, the method includes generating, using the classification model, a query encoding, determining a class distance measure, determining a ground-truth distance, and updating parameters of the classification model. For each training example in the query set of training examples identified as being misclassified, the method further includes generating a standard deviation value, sampling a new query, and updating parameters of the confidence model based on the new query encoding.
    Type: Grant
    Filed: October 11, 2022
    Date of Patent: July 16, 2024
    Assignee: Google LLC
    Inventors: Sercan Omer Arik, Chen Xing, Zizhao Zhang, Tomas Jon Pfister
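The centroid and classification steps above can be illustrated with a toy nearest-centroid sketch; the encodings and class names are hypothetical, and the confidence-model update on misclassified queries is omitted.

```python
def centroid(vectors):
    """Mean encoding of a class's support examples."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def classify(query_encoding, centroids):
    """Assign the class whose centroid is nearest to the query encoding."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda c: dist(query_encoding, centroids[c]))

# Toy support set: two classes, two encoded examples each.
support = {"cat": [[0.0, 1.0], [0.2, 0.8]],
           "dog": [[1.0, 0.0], [0.8, 0.2]]}
centroids = {c: centroid(vs) for c, vs in support.items()}
pred = classify([0.1, 0.9], centroids)
```

In the full method, the class distance measures for misclassified queries drive a separate confidence model, which is updated from newly sampled query encodings.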
  • Patent number: 12026614
    Abstract: A method of interpreting tabular data includes receiving, at a deep tabular data learning network (TabNet) executing on data processing hardware, a set of features. For each of multiple sequential processing steps, the method also includes: selecting, using a sparse mask of the TabNet, a subset of relevant features of the set of features; processing, using a feature transformer of the TabNet, the subset of relevant features to generate a decision step output and information for a next processing step in the multiple sequential processing steps; and providing the information to the next processing step. The method also includes determining a final decision output by aggregating the decision step outputs generated for the multiple sequential processing steps.
    Type: Grant
    Filed: August 2, 2020
    Date of Patent: July 2, 2024
    Assignee: Google LLC
    Inventors: Sercan Omer Arik, Tomas Jon Pfister
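The sequential mask-select-transform-aggregate loop above can be sketched as follows. The masks and the feature transformer here are fixed toy stand-ins; in TabNet both are learned.

```python
def tabnet_forward(features, masks, transform):
    """Run sequential decision steps and aggregate their decision outputs."""
    final, info = 0.0, [0.0] * len(features)
    for mask in masks:                    # one sparse mask per decision step
        selected = [f * m for f, m in zip(features, mask)]  # relevant subset
        step_out, info = transform(selected, info)  # output + carried-over info
        final += step_out                 # aggregate across processing steps
    return final

# Toy feature transformer: the decision output sums the selected features and
# the information carried over; the selection itself is passed forward.
transform = lambda selected, info: (sum(selected) + sum(info), selected)
out = tabnet_forward([1.0, 2.0, 4.0], [[1, 0, 0], [0, 1, 1]], transform)
```

The sparse masks also make the model interpretable: each step's mask records exactly which features contributed to that step's decision output.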
  • Publication number: 20240160937
    Abstract: A method includes obtaining a source training dataset that includes a plurality of source training images and obtaining a target training dataset that includes a plurality of target training images. For each source training image, the method includes translating, using the forward generator neural network G, the source training image to a respective translated target image according to current values of forward generator parameters. For each target training image, the method includes translating, using a backward generator neural network F, the target training image to a respective translated source image according to current values of backward generator parameters. The method also includes training the forward generator neural network G jointly with the backward generator neural network F by adjusting the current values of the forward generator parameters and the backward generator parameters to optimize an objective function.
    Type: Application
    Filed: January 19, 2024
    Publication date: May 16, 2024
    Applicant: Google LLC
    Inventors: Rui Zhang, Jia Li, Tomas Jon Pfister
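The joint training of the forward and backward generators above rests on a cycle-consistency term, sketched here with toy scalar "generators" (hypothetical stand-ins for the neural networks G and F):

```python
def cycle_loss(G, F, sources, targets):
    """Cycle-consistency term: F(G(x)) should recover x, and G(F(y)) recover y."""
    fwd = sum(abs(F(G(x)) - x) for x in sources) / len(sources)
    bwd = sum(abs(G(F(y)) - y) for y in targets) / len(targets)
    return fwd + bwd

# Toy generators that form a perfect cycle, so this term is exactly zero;
# training adjusts both generators' parameters to minimize it (alongside the
# other terms of the full objective).
G = lambda x: 2.0 * x      # "forward generator": source -> target
F = lambda y: y / 2.0      # "backward generator": target -> source
loss = cycle_loss(G, F, [1.0, 2.0], [4.0, 6.0])
```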
  • Publication number: 20240153297
    Abstract: A method for extracting entities comprises obtaining a document that includes a series of textual fields that includes a plurality of entities. Each entity represents information associated with a predefined category. The method includes generating, using the document, a series of tokens representing the series of textual fields. The method includes generating an entity prompt that includes the series of tokens and one of the plurality of entities and generating a schema prompt that includes a schema associated with the document. The method includes generating a model query that includes the entity prompt and the schema prompt and determining, using an entity extraction model and the model query, a location of the one of the plurality of entities among the series of tokens. The method includes extracting, from the document, the one of the plurality of entities using the location of the one of the plurality of entities.
    Type: Application
    Filed: November 3, 2023
    Publication date: May 9, 2024
    Applicant: Google LLC
    Inventors: Zizhao Zhang, Zifeng Wang, Vincent Perot, Jacob Devlin, Chen-Yu Lee, Guolong Su, Hao Zhang, Tomas Jon Pfister
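The prompt construction and localization steps above can be sketched as follows. The prompt markers, schema string, and "extraction model" are hypothetical illustrations, not the patented format.

```python
def build_query(tokens, entity, schema):
    """Concatenate an entity prompt and a schema prompt into one model query."""
    entity_prompt = " ".join(tokens) + " [ENTITY] " + entity
    schema_prompt = "[SCHEMA] " + schema
    return entity_prompt + " " + schema_prompt

tokens = ["Invoice", "Date:", "2024-05-09", "Total:", "$42"]
query = build_query(tokens, "invoice_date", "invoice{date, total}")

# Stand-in extraction model: returns the index of the first date-like token.
extraction_model = lambda q, toks: next(
    i for i, t in enumerate(toks) if t.count("-") == 2)
location = extraction_model(query, tokens)
extracted = tokens[location]
```

The key point is that the model returns a location among the tokens, and the entity's value is then extracted from the document at that location.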
  • Publication number: 20240144005
    Abstract: A method of interpreting tabular data includes receiving, at a deep tabular data learning network (TabNet) executing on data processing hardware, a set of features. For each of multiple sequential processing steps, the method also includes: selecting, using a sparse mask of the TabNet, a subset of relevant features of the set of features; processing, using a feature transformer of the TabNet, the subset of relevant features to generate a decision step output and information for a next processing step in the multiple sequential processing steps; and providing the information to the next processing step. The method also includes determining a final decision output by aggregating the decision step outputs generated for the multiple sequential processing steps.
    Type: Application
    Filed: January 4, 2024
    Publication date: May 2, 2024
    Applicant: Google LLC
    Inventors: Sercan Omer Arik, Tomas Jon Pfister
  • Patent number: 11941084
    Abstract: A method for training a machine learning model includes obtaining a set of training samples. For each training sample in the set of training samples, during each of one or more training iterations, the method includes cropping the training sample to generate a first cropped image, cropping the training sample to generate a second cropped image that is different than the first cropped image, and duplicating a first portion of the second cropped image. The method also includes overlaying the duplicated first portion of the second cropped image on a second portion of the second cropped image to form an augmented second cropped image. The first portion is different than the second portion. The method also includes training the machine learning model with the first cropped image and the augmented second cropped image.
    Type: Grant
    Filed: November 11, 2021
    Date of Patent: March 26, 2024
    Assignee: Google LLC
    Inventors: Kihyuk Sohn, Chun-Liang Li, Jinsung Yoon, Tomas Jon Pfister
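The crop-duplicate-overlay augmentation above can be illustrated on a toy 1-D "image" (real inputs are 2-D images; the crop coordinates here are arbitrary):

```python
def augment(sample):
    first_crop = sample[:6]           # first cropped "image"
    second_crop = sample[2:8]         # a different crop of the same sample
    patch = second_crop[0:2]          # duplicate a first portion...
    augmented = list(second_crop)
    augmented[3:5] = patch            # ...and overlay it on a second portion
    return first_crop, augmented

sample = list(range(10))              # toy 1-D "image"
first, aug = augment(sample)
# The model would then be trained on both `first` and `aug`.
```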
  • Patent number: 11941531
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing an input data element to generate a prediction output that characterizes the input data element. In one aspect, a method comprises: determining a respective attention weight between an input data element and each of a plurality of reference data elements; processing each of the reference data elements using the encoder neural network to generate a respective value embedding of each reference data element; determining a combined value embedding of the reference data elements based on (i) the respective value embedding of each reference data element, and (ii) the respective attention weight between the input data element and each reference data element; and processing the combined value embedding of the reference data elements using a prediction neural network to generate the prediction output that characterizes the input data element.
    Type: Grant
    Filed: February 7, 2020
    Date of Patent: March 26, 2024
    Assignee: Google LLC
    Inventors: Sercan Omer Arik, Tomas Jon Pfister
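The attention-weighted combination of reference value embeddings above can be sketched directly. The encoder, value embedding, and prediction head below are toy stand-ins for the neural networks in the claims.

```python
import math

def attentive_predict(query, references, encode, value_embed, head):
    # Attention weight between the input element and each reference element.
    q = encode(query)
    scores = [sum(a * b for a, b in zip(q, encode(r))) for r in references]
    total = sum(math.exp(s) for s in scores)
    weights = [math.exp(s) / total for s in scores]
    # Combined value embedding of the references, weighted by attention.
    values = [value_embed(r) for r in references]
    combined = [sum(w * v[i] for w, v in zip(weights, values))
                for i in range(len(values[0]))]
    return head(combined)

# Toy usage: identity encoder, value embedding is the reference's first entry.
refs = [[1.0, 0.0], [0.0, 1.0]]
out = attentive_predict([1.0, 0.0], refs, encode=lambda x: x,
                        value_embed=lambda r: [r[0]], head=lambda c: c[0])
```

References similar to the input receive larger attention weights, so their value embeddings dominate the combined embedding that the prediction head consumes.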
  • Patent number: 11907850
    Abstract: A method includes obtaining a source training dataset that includes a plurality of source training images and obtaining a target training dataset that includes a plurality of target training images. For each source training image, the method includes translating, using the forward generator neural network G, the source training image to a respective translated target image according to current values of forward generator parameters. For each target training image, the method includes translating, using a backward generator neural network F, the target training image to a respective translated source image according to current values of backward generator parameters. The method also includes training the forward generator neural network G jointly with the backward generator neural network F by adjusting the current values of the forward generator parameters and the backward generator parameters to optimize an objective function.
    Type: Grant
    Filed: November 11, 2021
    Date of Patent: February 20, 2024
    Assignee: Google LLC
    Inventors: Rui Zhang, Jia Li, Tomas Jon Pfister
  • Publication number: 20240054345
    Abstract: A method includes receiving a source data set and a target data set and identifying a loss function for a deep learning model based on the source data set and the target data set. The loss function includes encoder weights, source classifier layer weights, target classifier layer weights, coefficients, and a policy weight. During a first phase of each of a plurality of learning iterations for a learning to transfer learn (L2TL) architecture, the method also includes: applying gradient descent-based optimization to learn the encoder weights, the source classifier layer weights, and the target classifier layer weights that minimize the loss function; and determining the coefficients by sampling actions of a policy model. During a second phase of each of the plurality of learning iterations, determining the policy weight that maximizes an evaluation metric.
    Type: Application
    Filed: August 24, 2023
    Publication date: February 15, 2024
    Applicant: Google LLC
    Inventors: Sercan Omer Arik, Tomas Jon Pfister, Linchao Zhu
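The alternating two-phase loop above can be sketched in a greatly simplified form. Here enumeration over candidate coefficients stands in for the policy model's sampled actions, and a closed-form minimizer stands in for gradient descent; the losses are hypothetical.

```python
def l2tl(candidate_coeffs, fit, evaluate):
    """Alternate between fitting weights under a coefficient setting and
    scoring the result on the target evaluation metric."""
    best = None
    for coeff in candidate_coeffs:       # stand-in for sampling policy actions
        weights = fit(coeff)             # phase 1: minimize the weighted loss
        score = evaluate(weights)        # phase 2: target evaluation metric
        if best is None or score > best[0]:
            best = (score, coeff, weights)
    return best

# Toy setup: the coefficient trades off a source loss (optimum at w = 0)
# against a target loss (optimum at w = 1); the metric rewards weights near
# the target optimum.
fit = lambda c: c * 1.0
evaluate = lambda w: -abs(w - 1.0)
score, coeff, w = l2tl([0.0, 0.5, 1.0], fit, evaluate)
```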
  • Patent number: 11823058
    Abstract: A method includes obtaining a set of training samples. During each of a plurality of training iterations, the method also includes sampling a batch of training samples from the set of training samples. The method includes, for each training sample in the batch of training samples, determining, using a data value estimator, a selection probability. The selection probability for the training sample is based on estimator parameter values of the data value estimator. The method also includes selecting, based on the selection probabilities of each training sample, a subset of training samples from the batch of training samples, and determining, using a predictor model with the subset of training samples, performance measurements. The method also includes adjusting model parameter values of the predictor model based on the performance measurements, and updating the estimator parameter values of the data value estimator based on the performance measurements.
    Type: Grant
    Filed: September 18, 2020
    Date of Patent: November 21, 2023
    Assignee: Google LLC
    Inventors: Sercan Omer Arik, Jinsung Yoon, Tomas Jon Pfister
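One training iteration of the data-valuation loop above can be sketched with deterministic stand-ins (a thresholded selection instead of probabilistic sampling, and no actual parameter updates; the estimator and predictor below are hypothetical):

```python
def value_and_train(batch, estimator, train_predictor, evaluate):
    probs = [estimator(x, y) for x, y in batch]            # selection probabilities
    subset = [s for s, p in zip(batch, probs) if p > 0.5]  # thresholded "sampling"
    model = train_predictor(subset)
    performance = evaluate(model)
    # A full implementation would now adjust the predictor's parameters and
    # update the estimator's parameters from this performance measurement.
    return subset, performance

batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 99.0)]   # last label is noisy
estimator = lambda x, y: 0.9 if abs(y - 2.0 * x) < 1.0 else 0.1
train_predictor = lambda data: sum(y / x for x, y in data) / len(data)  # mean slope
evaluate = lambda slope: -abs(slope - 2.0)
subset, perf = value_and_train(batch, estimator, train_predictor, evaluate)
```

A well-trained estimator assigns the noisy sample a low selection probability, so the predictor is fit on the clean subset and performance improves.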
  • Publication number: 20230351192
    Abstract: A method for training a model comprises obtaining a set of labeled training samples each associated with a given label. For each labeled training sample, the method includes generating a pseudo label and estimating a weight of the labeled training sample indicative of an accuracy of the given label. The method also includes determining whether the weight of the labeled training sample satisfies a weight threshold. When the weight of the labeled training sample satisfies the weight threshold, the method includes adding the labeled training sample to a set of cleanly labeled training samples. Otherwise, the method includes adding the labeled training sample to a set of mislabeled training samples. The method includes training the model with the set of cleanly labeled training samples using corresponding given labels and the set of mislabeled training samples using corresponding pseudo labels.
    Type: Application
    Filed: July 7, 2023
    Publication date: November 2, 2023
    Applicant: Google LLC
    Inventors: Zizhao Zhang, Sercan Omer Arik, Tomas Jon Pfister, Han Zhang
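The weight-threshold split above can be sketched as follows; the weight estimator and pseudo-labeler here are hypothetical stand-ins for the learned components.

```python
def split_by_weight(samples, weight, threshold):
    """Split (input, given_label) pairs by estimated label accuracy."""
    clean, mislabeled = [], []
    for x, given in samples:
        (clean if weight(x, given) >= threshold else mislabeled).append((x, given))
    return clean, mislabeled

samples = [("a", 1), ("b", 0), ("c", 1)]
weight = lambda x, y: {"a": 0.9, "b": 0.2, "c": 0.8}[x]  # stand-in estimator
pseudo_label = lambda x: 1                                # stand-in pseudo-labeler

clean, mislabeled = split_by_weight(samples, weight, threshold=0.5)
# Train on clean samples with given labels, mislabeled ones with pseudo labels.
train_set = clean + [(x, pseudo_label(x)) for x, _ in mislabeled]
```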
  • Publication number: 20230325676
    Abstract: A method includes obtaining a set of unlabeled training samples. For each training sample in the set of unlabeled training samples, the method includes generating, using a machine learning model and the training sample, a corresponding first prediction; generating, using the machine learning model and a modified unlabeled training sample, a second prediction, the modified unlabeled training sample being based on the training sample; and determining a difference between the first prediction and the second prediction. The method includes selecting, based on the differences, a subset of the set of unlabeled training samples. For each training sample in the subset of the set of unlabeled training samples, the method includes obtaining a ground truth label for the training sample, and generating a corresponding labeled training sample based on the training sample paired with the ground truth label. The method includes training the machine learning model using the corresponding labeled training samples.
    Type: Application
    Filed: June 13, 2023
    Publication date: October 12, 2023
    Applicant: Google LLC
    Inventors: Zizhao Zhang, Tomas Jon Pfister, Sercan Omer Arik, Mingfei Gao
  • Publication number: 20230153980
    Abstract: A computer-implemented method includes receiving an anomaly clustering request that requests data processing hardware to assign each image of a plurality of images into one of a plurality of groups. The method also includes obtaining a plurality of images. For each respective image, the method includes extracting a respective set of patch embeddings from the respective image, determining a distance between the respective set of patch embeddings and each other set of patch embeddings, and assigning the respective image into one of the plurality of groups using the distances between the respective set of patch embeddings and each other set of patch embeddings.
    Type: Application
    Filed: November 10, 2022
    Publication date: May 18, 2023
    Applicant: Google LLC
    Inventors: Kihyuk Sohn, Jinsung Yoon, Chun-Liang Li, Tomas Jon Pfister, Chen-Yu Lee
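The set-to-set distance and grouping steps above can be sketched with toy patch embeddings. The average nearest-patch distance and the greedy seeding below are illustrative choices, not the patented distance or clustering procedure.

```python
def set_distance(a, b):
    """Average nearest-patch distance from embedding set a to set b."""
    d = lambda p, q: sum((x - y) ** 2 for x, y in zip(p, q))
    return sum(min(d(p, q) for q in b) for p in a) / len(a)

def assign_groups(embeddings):
    # Greedy stand-in for the clustering step: image 0 seeds group 0, the
    # image farthest from it seeds group 1, and every image joins the group
    # whose seed set is nearest.
    far = max(range(1, len(embeddings)),
              key=lambda i: set_distance(embeddings[i], embeddings[0]))
    seeds = [embeddings[0], embeddings[far]]
    return [min((0, 1), key=lambda g: set_distance(e, seeds[g]))
            for e in embeddings]

patch_embeddings = [
    [[0.0, 0.0], [0.1, 0.0]],   # image 0: "normal" patches
    [[0.0, 0.1], [0.1, 0.1]],   # image 1: close to image 0
    [[5.0, 5.0], [5.1, 5.0]],   # image 2: anomalous patches
]
groups = assign_groups(patch_embeddings)
```

Comparing whole sets of patch embeddings (rather than one global embedding per image) lets localized anomalies dominate the distance even when most patches look normal.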
  • Publication number: 20230120894
    Abstract: A method includes receiving a training data set including a plurality of training data subsets. From two or more training data subsets in the training data set, the method includes selecting a support set of training examples and a query set of training examples. The method includes determining, using the classification model, a centroid value for each respective class. For each training example in the query set of training examples, the method includes generating, using the classification model, a query encoding, determining a class distance measure, determining a ground-truth distance, and updating parameters of the classification model. For each training example in the query set of training examples identified as being misclassified, the method further includes generating a standard deviation value, sampling a new query, and updating parameters of the confidence model based on the new query encoding.
    Type: Application
    Filed: October 11, 2022
    Publication date: April 20, 2023
    Applicant: Google LLC
    Inventors: Sercan Omer Arik, Chen Xing, Zizhao Zhang, Tomas Jon Pfister
  • Patent number: 11487970
    Abstract: A method for jointly training a classification model and a confidence model. The method includes receiving a training data set including a plurality of training data subsets. From two or more training data subsets in the training data set, the method includes selecting a support set of training examples and a query set of training examples. The method includes determining, using the classification model, a centroid value for each respective class. For each training example in the query set of training examples, the method includes generating, using the classification model, a query encoding, determining a class distance measure, determining a ground-truth distance, and updating parameters of the classification model. For each training example in the query set of training examples identified as being misclassified, the method further includes generating a standard deviation value, sampling a new query, and updating parameters of the confidence model based on the new query encoding.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: November 1, 2022
    Assignee: Google LLC
    Inventors: Sercan Omer Arik, Chen Xing, Zizhao Zhang, Tomas Jon Pfister
  • Patent number: 11403490
    Abstract: A method for training a locally interpretable model includes obtaining a set of training samples and training a black-box model using the set of training samples. The method also includes generating, using the trained black-box model and the set of training samples, a set of auxiliary training samples and training a baseline interpretable model using the set of auxiliary training samples. The method also includes training, using the set of auxiliary training samples and baseline interpretable model, an instance-wise weight estimator model. For each auxiliary training sample in the set of auxiliary training samples, the method also includes determining, using the trained instance-wise weight estimator model, a selection probability for the auxiliary training sample. The method also includes selecting, based on the selection probabilities, a subset of auxiliary training samples and training the locally interpretable model using the subset of auxiliary training samples.
    Type: Grant
    Filed: September 23, 2020
    Date of Patent: August 2, 2022
    Assignee: Google LLC
    Inventors: Sercan Omer Arik, Jinsung Yoon, Tomas Jon Pfister
  • Publication number: 20220156521
    Abstract: A method for training a machine learning model includes obtaining a set of training samples. For each training sample in the set of training samples, during each of one or more training iterations, the method includes cropping the training sample to generate a first cropped image, cropping the training sample to generate a second cropped image that is different than the first cropped image, and duplicating a first portion of the second cropped image. The method also includes overlaying the duplicated first portion of the second cropped image on a second portion of the second cropped image to form an augmented second cropped image. The first portion is different than the second portion. The method also includes training the machine learning model with the first cropped image and the augmented second cropped image.
    Type: Application
    Filed: November 11, 2021
    Publication date: May 19, 2022
    Applicant: Google LLC
    Inventors: Kihyuk Sohn, Chun-Liang Li, Jinsung Yoon, Tomas Jon Pfister
  • Publication number: 20220067441
    Abstract: A method includes obtaining a source training dataset that includes a plurality of source training images and obtaining a target training dataset that includes a plurality of target training images. For each source training image, the method includes translating, using the forward generator neural network G, the source training image to a respective translated target image according to current values of forward generator parameters. For each target training image, the method includes translating, using a backward generator neural network F, the target training image to a respective translated source image according to current values of backward generator parameters. The method also includes training the forward generator neural network G jointly with the backward generator neural network F by adjusting the current values of the forward generator parameters and the backward generator parameters to optimize an objective function.
    Type: Application
    Filed: November 11, 2021
    Publication date: March 3, 2022
    Applicant: Google LLC
    Inventors: Rui Zhang, Jia Li, Tomas Jon Pfister