Patents by Inventor Yasha Pushak

Yasha Pushak has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250139474
    Abstract: A computer obtains multipliers of a sensitive feature. From an input that contains a value of the feature, a probability of a class is inferred. Based on the value of the feature in the input, one of the multipliers of the feature is selected. The multiplier is specific to both of the feature and the value of the feature. The input is classified based on a multiplicative product of the probability of the class and the multiplier that is specific to both of the feature and the value of the feature. In an embodiment, a black-box tri-objective optimizer generates multipliers on a three-way Pareto frontier from which a user may interactively select a combination of multipliers that provides a best three-way tradeoff between fairness and accuracy. The optimizer has three objectives to respectively optimize three distinct validation metrics that may, for example, be accuracy, fairness, and favorable outcome rate decrease.
    Type: Application
    Filed: December 19, 2023
    Publication date: May 1, 2025
    Inventors: Yasha Pushak, Ehsan Soltan Aghai, Hesam Fathi Moghadam, Sungpack Hong, Hassan Chafi
  • Publication number: 20250094862
    Abstract: In an embodiment, a computer generates a respective original inference from each of many records. Permuted values are selected for a feature from original values of the feature. Based on the permuted values for the feature, a permuted inference is generated from each record. Fairness and accuracy of the original and permuted inferences are measured. For each of many features, the computer measures a respective impact on fairness of a machine learning model, and a respective impact on accuracy of the machine learning model. A global explanation of the machine learning model is generated and presented based on, for multiple features, the impacts on fairness and accuracy. Based on the global explanation, an interactive indication to exclude or include a particular feature is received. The machine learning model is (re-)trained based on the interactive indication to exclude or include the particular feature, which may increase the fairness of the model.
    Type: Application
    Filed: December 5, 2023
    Publication date: March 20, 2025
    Inventors: Yasha Pushak, Mathieu Godbout, Hesam Fathi Moghadam, Sungpack Hong, Hassan Chafi
  • Publication number: 20250077876
    Abstract: Techniques for selecting machine-learned (ML) models using diversity criteria are provided. In one technique, for each ML model of multiple ML models, output data is generated based on input data to the ML model. Multiple pairs of ML models are identified, where each ML model in the multiple pairs is from the multiple ML models. For each pair of ML models in the multiple pairs of ML models: (1) first output data that was previously generated by a first ML model in the pair is identified; (2) second output data that was previously generated by a second ML model in the pair is identified; (3) a diversity value that is based on the first and second output data is generated; and (4) the diversity value is added to a set of diversity values. A subset of the multiple ML models is selected based on the set of diversity values.
    Type: Application
    Filed: August 29, 2023
    Publication date: March 6, 2025
    Inventors: Moein Owhadi Kareshk, Giulia Carocari, Yasha Pushak, Hesam Fathi Moghadam, Sungpack Hong, Hassan Chafi
  • Patent number: 12217136
    Abstract: Techniques are described that extend supervised machine-learning algorithms for use with semi-supervised training. Random labels are assigned to unlabeled training data, and the data is split into k partitions. During a label-training iteration, each of these k partitions is combined with the labeled training data, and the combination is used to train a single instance of the machine-learning model. Each of these trained models is then used to predict labels for data points in the k−1 partitions of previously-unlabeled training data that were not used to train the model. Thus, every data point in the previously-unlabeled training data obtains k−1 predicted labels. For each data point, these labels are aggregated to obtain a composite label prediction for the data point. After the labels are determined via one or more label-training iterations, a machine-learning model is trained on data with the resulting composite label predictions and on the labeled data set.
    Type: Grant
    Filed: July 22, 2020
    Date of Patent: February 4, 2025
    Assignee: Oracle International Corporation
    Inventors: Felix Schmidt, Yasha Pushak, Stuart Wray
  • Publication number: 20240403674
    Abstract: In an embodiment, a computer infers, from an input (e.g. that represents a person) that contains a value of a sensitive feature that has a plurality of multipliers, a probability of a majority class (i.e. an outcome). Based on the value of the sensitive feature in the input, from the multipliers of the sensitive feature, a multiplier is selected that is specific to both of the sensitive feature and the value of the sensitive feature. The input is classified based on a multiplicative product of the probability of the majority class and the multiplier that is specific to both of the sensitive feature and the value of the sensitive feature. In an embodiment, a black-box bi-objective optimizer generates multipliers on a Pareto frontier from which a user may interactively select a combination of multipliers that provide a best tradeoff between fairness and accuracy.
    Type: Application
    Filed: December 5, 2023
    Publication date: December 5, 2024
    Inventors: Mathieu Godbout, Yasha Pushak, Hesam Fathi Moghadam, Sungpack Hong, Hassan Chafi
  • Publication number: 20240394557
    Abstract: In an embodiment, a computer combines first original hyperparameters and second original hyperparameters into combined hyperparameters. In each iteration of a binary search that selects hyperparameters, these are selected: a) important hyperparameters from the combined hyperparameters and b) based on an estimated complexity decrease by including only important hyperparameters as compared to the combined hyperparameters, which only one boundary of the binary search to adjust. For the important hyperparameters of a last iteration of the binary search that selects hyperparameters, a pruned value range of a particular hyperparameter is generated based on a first original value range of the particular hyperparameter for the first original hyperparameters and a second original value range of the same particular hyperparameter for the second original hyperparameters.
    Type: Application
    Filed: May 26, 2023
    Publication date: November 28, 2024
    Inventors: Yasha Pushak, Mobina Mahdavi, Ali Asgari Khoshouyeh, Ali Seyfi, Zahra Zohrevand, Ritesh Ahuja, Moein Owhadi Kareshk, Hesam Fathi Moghadam, Sungpack Hong, Hassan Chafi
  • Publication number: 20240303541
    Abstract: In an embodiment, a computer generates, from an input, an inference that contains multiple probabilities respectively for multiple mutually exclusive classes that contain a first class and a second class. The probabilities contain (e.g. due to overfitting) a higher probability for the first class that is higher than a lower probability for the second class. In response to a threshold exceeding the higher probability, the input is automatically and more accurately classified as the second class. One, some, or almost all classes may have a respective distinct threshold that can be concurrently applied for acceleration. Data parallelism may simultaneously apply a threshold to a batch of multiple inputs for acceleration.
    Type: Application
    Filed: November 1, 2023
    Publication date: September 12, 2024
    Inventors: Yasha Pushak, Ali Seyfi, Hesam Fathi Moghadam, Sungpack Hong, Hassan Chafi
  • Publication number: 20240303515
    Abstract: A computer stores a reference corpus that consists of many reference points that each has a respective class. Later, an expected class and a subject point (i.e. instance to explain) that does not have the expected class are received. Multiple reference points that have the expected class are selected as starting points. Based on the subject point and the starting points, multiple discrete interpolated points are generated that have the expected class. Based on the subject point and the discrete interpolated points, multiple continuous interpolated points are generated that have the expected class. A counterfactual explanation of why the subject point does not have the expected class is directly generated based on continuous interpolated point(s) and, thus, indirectly generated based on the discrete interpolated points. For acceleration, neither way of interpolation (i.e. counterfactual generation) is iterative.
    Type: Application
    Filed: November 17, 2023
    Publication date: September 12, 2024
    Inventors: Zahra Zohrevand, Ehsan Soltan Aghai, Yasha Pushak, Hesam Fathi Moghadam, Sungpack Hong, Hassan Chafi
  • Patent number: 11966275
    Abstract: The present invention relates to machine learning (ML) explainability (MLX). Herein are local explanation techniques for black box ML models based on coalitions of features in a dataset. In an embodiment, a computer receives a request to generate a local explanation of which coalitions of features caused an anomaly detector to detect an anomaly. During unsupervised generation of a new coalition, a first feature is randomly selected from features in a dataset. Which additional features in the dataset can join the coalition, because they have mutual information with the first feature that exceeds a threshold, is detected. For each feature that is not in the coalition, values of the feature are permuted in imperfect copies of original tuples in the dataset. An average anomaly score of the imperfect copies is measured. Based on the average anomaly score of the imperfect copies, a local explanation is generated that references (e.g. defines) the coalition.
    Type: Grant
    Filed: November 22, 2022
    Date of Patent: April 23, 2024
    Assignee: Oracle International Corporation
    Inventors: Ali Seyfi, Yasha Pushak, Hesam Fathi Moghadam, Sungpack Hong, Hassan Chafi
  • Publication number: 20240095231
    Abstract: In a computer, each of multiple anomaly detectors infers an anomaly score for each of many tuples. For each tuple, a synthetic label is generated that indicates for each anomaly detector: the anomaly detector, the anomaly score inferred by the anomaly detector for the tuple and, for each of multiple contamination factors, the contamination factor and, based on the contamination factor, a binary class of the anomaly score. For each particular anomaly detector excluding a best anomaly detector, a similarity score is measured for each contamination factor. The similarity score indicates how similar, between the particular anomaly detector and the best anomaly detector, are the binary classes of labels with that contamination factor. For each contamination factor, a combined similarity score is calculated based on the similarity scores for the contamination factor.
    Type: Application
    Filed: December 6, 2022
    Publication date: March 21, 2024
    Inventors: Yasha Pushak, Constantin Le Clei, Fatjon Zogaj, Hesam Fathi Moghadam, Sungpack Hong, Hassan Chafi
  • Publication number: 20240095580
    Abstract: Herein is a universal anomaly threshold based on several labeled datasets and transformation of anomaly scores from one or more anomaly detectors. In an embodiment, a computer meta-learns from each anomaly detection algorithm and each labeled dataset as follows. A respective anomaly detector based on the anomaly detection algorithm is trained based on the dataset. The anomaly detector infers respective anomaly scores for tuples in the dataset. The following are ensured in the anomaly scores from the anomaly detector: i) regularity that an anomaly score of zero cannot indicate an anomaly and ii) normality that an inclusive range of zero to one contains the anomaly scores from the anomaly detector. A respective anomaly threshold is calculated for the anomaly scores from the anomaly detector. After all meta-learning, a universal anomaly threshold is calculated as an average of the anomaly thresholds. An anomaly is detected based on the universal anomaly threshold.
    Type: Application
    Filed: November 28, 2022
    Publication date: March 21, 2024
    Inventors: Yasha Pushak, Hesam Fathi Moghadam, Anatoly Yakovlev, Robert David Hopkins, II
  • Publication number: 20240095604
    Abstract: A computer sorts empirical validation scores of validated training scenarios of an anomaly detector. Each training scenario has a dataset to train an instance of the anomaly detector that is configured with values for hyperparameters. Each dataset has values for metafeatures. For each predefined ranking percentage, a subset of best training scenarios is selected that consists of the ranking percentage of validated training scenarios having the highest empirical validation scores. Linear optimizers train to infer a value for a hyperparameter. Into many distinct unvalidated training scenarios, a scenario is generated that has metafeatures values and hyperparameters values that contains the value inferred for that hyperparameter by a linear optimizer. For each unvalidated training scenario, a validation score is inferred. A best linear optimizer is selected having a highest combined inferred validation score. For a new dataset, the best linear optimizer infers a value of that hyperparameter.
    Type: Application
    Filed: December 6, 2022
    Publication date: March 21, 2024
    Inventors: Fatjon Zogaj, Yasha Pushak, Hesam Fathi Moghadam, Sungpack Hong, Hassan Chafi
  • Publication number: 20240070471
    Abstract: Principal component analysis (PCA) accelerates and increases accuracy of genetic algorithms. In an embodiment, a computer generates many original chromosomes. Each original chromosome contains a sequence of original values. Each position in the sequences in the original chromosomes corresponds to only one respective distinct parameter in a set of parameters to be optimized. Based on the original chromosomes, many virtual chromosomes are generated. Each virtual chromosome contains a sequence of numeric values. Positions in the sequences in the virtual chromosomes do not correspond to only one respective distinct parameter in the set of parameters to be optimized. Based on the virtual chromosomes, many new chromosomes are generated. Each new chromosome contains a sequence of values. Each position in the sequences in the new chromosomes corresponds to only one respective distinct parameter in the set of parameters to be optimized. The computer may be configured based on a best new chromosome.
    Type: Application
    Filed: August 31, 2022
    Publication date: February 29, 2024
    Inventors: Yasha Pushak, Moein Owhadi Kareshk, Hesam Fathi Moghadam, Sungpack Hong, Hassan Chafi
  • Publication number: 20230376366
    Abstract: The present invention relates to machine learning (ML) explainability (MLX). Herein are local explanation techniques for black box ML models based on coalitions of features in a dataset. In an embodiment, a computer receives a request to generate a local explanation of which coalitions of features caused an anomaly detector to detect an anomaly. During unsupervised generation of a new coalition, a first feature is randomly selected from features in a dataset. Which additional features in the dataset can join the coalition, because they have mutual information with the first feature that exceeds a threshold, is detected. For each feature that is not in the coalition, values of the feature are permuted in imperfect copies of original tuples in the dataset. An average anomaly score of the imperfect copies is measured. Based on the average anomaly score of the imperfect copies, a local explanation is generated that references (e.g. defines) the coalition.
    Type: Application
    Filed: November 22, 2022
    Publication date: November 23, 2023
    Inventors: Ali Seyfi, Yasha Pushak, Sungpack Hong, Hesam Fathi Moghadam, Hassan Chafi
  • Publication number: 20230334364
    Abstract: In an embodiment in a computer, each of several anomaly detectors infers a respective anomaly inference for each of many test tuples. For each available anomaly detector that is not the candidate anomaly detector, a respective fitness score is measured for the candidate anomaly detector that indicates how similar are anomaly inferences of the candidate anomaly detector to anomaly inferences of the available anomaly detector. Fitness scores of the candidate anomaly detector are combined into a combined fitness score for the candidate anomaly detector. The best anomaly detector that has a highest combined fitness score is selected for further operation such as inferring an anomaly inference for a new tuple while retraining or in production.
    Type: Application
    Filed: December 6, 2022
    Publication date: October 19, 2023
    Inventors: Yasha Pushak, Robert Wayne Harlow, Constantin Le Clei, Hesam Fathi Moghadam, Sungpack Hong, Hassan Chafi
  • Patent number: 11720751
    Abstract: A model-agnostic global explainer for textual data processing (NLP) machine learning (ML) models, “NLP-MLX”, is described herein. NLP-MLX explains global behavior of arbitrary NLP ML models by identifying globally-important tokens within a textual dataset containing text data. NLP-MLX accommodates any arbitrary combination of training dataset pre-processing operations used by the NLP ML model. NLP-MLX includes four main stages. A Text Analysis stage converts text in documents of a target dataset into tokens. A Token Extraction stage uses pre-processing techniques to efficiently pre-filter the complete list of tokens into a smaller set of candidate important tokens. A Perturbation Generation stage perturbs tokens within documents of the dataset to help evaluate the effect of different tokens, and combinations of tokens, on the model's predictions.
    Type: Grant
    Filed: January 11, 2021
    Date of Patent: August 8, 2023
    Assignee: Oracle International Corporation
    Inventors: Zahra Zohrevand, Tayler Hetherington, Karoon Rashedi Nia, Yasha Pushak, Sanjay Jinturkar, Nipun Agarwal
  • Patent number: 11687540
    Abstract: Techniques are described for fast approximate conditional sampling by randomly sampling a dataset and then performing a nearest neighbor search on the pre-sampled dataset to reduce the data over which the nearest neighbor search must be performed and, according to an embodiment, to effectively reduce the number of nearest neighbors that are to be found within the random sample. Furthermore, KD-Tree-based stratified sampling is used to generate a representative sample of a dataset. KD-Tree-based stratified sampling may be used to identify the random sample for fast approximate conditional sampling, which reduces variance in the resulting data sample. As such, using KD-Tree-based stratified sampling to generate the random sample for fast approximate conditional sampling ensures that any nearest neighbor selected, for a target data instance, from the random sample is likely to be among the nearest neighbors of the target data instance within the unsampled dataset.
    Type: Grant
    Filed: February 18, 2021
    Date of Patent: June 27, 2023
    Assignee: Oracle International Corporation
    Inventors: Yasha Pushak, Tayler Hetherington, Karoon Rashedi Nia, Zahra Zohrevand, Sanjay Jinturkar, Nipun Agarwal
  • Publication number: 20230139718
    Abstract: Herein are acceleration and increased reliability based on classification and scoring techniques for machine learning that compare two similar datasets of different ages to detect data drift without a predefined drift threshold. Various subsets are randomly sampled from the datasets. The subsets are combined in various ways to generate subsets of various age mixtures. In an embodiment, ages are permuted and drift is detected based on whether or not fitness scores indicate that an age binary classifier is confused. In an embodiment, an anomaly detector measures outlier scores of two subsets of different age mixtures. Drift is detected when the outlier scores diverge. In a two-arm bandit embodiment, iterations randomly alternate between both datasets based on respective probabilities that are adjusted by a bandit reward based on outlier scores from an anomaly detector. Drift is detected based on the probability of the younger dataset.
    Type: Application
    Filed: October 28, 2021
    Publication date: May 4, 2023
    Inventors: Mojtaba Valipour, Yasha Pushak, Robert Harlow, Hesam Fathi Moghadam, Sungpack Hong, Hassan Chafi
  • Publication number: 20220366297
    Abstract: In an embodiment, a computer hosts a machine learning (ML) model that infers a particular inference for a particular tuple that is based on many features. For each feature, and for each of many original tuples, the computer: a) randomly selects many perturbed values from original values of the feature in the original tuples, b) generates perturbed tuples that are based on the original tuple and a respective perturbed value, c) causes the ML model to infer a respective perturbed inference for each perturbed tuple, and d) measures a respective difference between each perturbed inference of the perturbed tuples and the particular inference. For each feature, a respective importance of the feature is calculated based on the differences measured for the feature. Feature importances may be used to rank features by influence and/or generate a local ML explainability (MLX) explanation.
    Type: Application
    Filed: May 13, 2021
    Publication date: November 17, 2022
    Inventors: Yasha Pushak, Zahra Zohrevand, Tayler Hetherington, Karoon Rashedi Nia, Sanjay Jinturkar, Nipun Agarwal
  • Publication number: 20220335255
    Abstract: In an embodiment, a computer assigns a respective probability distribution to each of many features that include a first feature and a second feature that are assigned different probability distributions. For each original tuple that is based on the features, a machine learning (ML) model infers a respective original inference. For each feature, and for each original tuple, the computer: a) generates perturbed values based on the probability distribution of the feature, b) generates perturbed tuples that are based on the original tuple and a respective perturbed value, c) causes the ML model to infer a respective perturbed inference for each perturbed tuple, and d) measures a respective difference between each perturbed inference and the original inference. A respective importance of each feature is calculated based on the differences measured for the feature. Feature importances may be used to rank features by influence and/or generate a global or local ML explainability (MLX) explanation.
    Type: Application
    Filed: April 16, 2021
    Publication date: October 20, 2022
    Inventors: Zahra Zohrevand, Yasha Pushak, Tayler Hetherington, Karoon Rashedi Nia, Sanjay Jinturkar, Nipun Agarwal
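
The perturbation-based feature importance described in the abstract of publication 20220335255 can be illustrated with a minimal sketch. This is not the patented implementation. The function name `perturbation_importance`, the toy model, the per-feature samplers, and the use of the mean absolute difference as the importance score are all assumptions made for illustration only.

```python
import random
from statistics import mean


def perturbation_importance(model, tuples, samplers, n_perturb=20, seed=0):
    """Return one importance score per feature (hypothetical helper).

    model    : callable mapping a list of feature values to a float score
    tuples   : list of original tuples (lists of feature values)
    samplers : one callable per feature; sampler(rng) draws a perturbed
               value from that feature's assigned distribution
    """
    rng = random.Random(seed)
    # Original inference for every original tuple.
    originals = [model(t) for t in tuples]
    importances = []
    for j, sampler in enumerate(samplers):
        diffs = []
        for t, base in zip(tuples, originals):
            for _ in range(n_perturb):
                perturbed = list(t)
                perturbed[j] = sampler(rng)  # perturb only feature j
                # Difference between perturbed and original inference.
                diffs.append(abs(model(perturbed) - base))
        # Assumption: aggregate the differences with a plain mean.
        importances.append(mean(diffs))
    return importances


# Hypothetical toy model: strong use of feature 0, weak use of
# feature 1, and no use of feature 2.
def toy_model(x):
    return 3.0 * x[0] + 0.5 * x[1]


# Each feature is assigned its own distribution (assumed for the demo).
samplers = [
    lambda rng: rng.gauss(0.0, 1.0),     # feature 0: normal
    lambda rng: rng.uniform(-1.0, 1.0),  # feature 1: uniform
    lambda rng: rng.choice([0, 1]),      # feature 2: binary
]

data_rng = random.Random(1)
tuples = [
    [data_rng.gauss(0, 1), data_rng.uniform(-1, 1), data_rng.choice([0, 1])]
    for _ in range(50)
]

scores = perturbation_importance(toy_model, tuples, samplers)
# Rank features from most to least influential.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

In this toy setup the unused feature receives an importance of zero, and the ranking orders the features by how strongly the model depends on them. Restricting `tuples` to a single instance would give a local rather than a global score.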