Patents by Inventor Venkatanathan Varadarajan

Venkatanathan Varadarajan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11989657
    Abstract: Herein, a computer generates and evaluates many preprocessor configurations for a window preprocessor that transforms a training timeseries dataset for an ML model. With each preprocessor configuration, the window preprocessor is configured. The window preprocessor then converts the training timeseries dataset into a configuration-specific point-based dataset that is based on the preprocessor configuration. The ML model is trained based on the configuration-specific point-based dataset to calculate a score for the preprocessor configuration. Based on the scores of the many preprocessor configurations, an optimal preprocessor configuration is selected for finally configuring the window preprocessor, after which, the window preprocessor can optimally transform a new timeseries dataset such as in an offline or online production environment such as for real-time processing of a live streaming timeseries.
    Type: Grant
    Filed: October 15, 2020
    Date of Patent: May 21, 2024
    Assignee: Oracle International Corporation
    Inventors: Nikan Chavoshi, Anatoly Yakovlev, Hesam Fathi Moghadam, Venkatanathan Varadarajan, Sandeep Agrawal, Ali Moharrer, Jingxiao Cai, Sanjay Jinturkar, Nipun Agarwal
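The configuration search described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented method: the only configuration parameter considered is the window length, the "ML model" is a stand-in one-feature linear fit, and all function names (`to_point_dataset`, `fit_and_score`, `select_window`) are hypothetical.

```python
from statistics import mean

def to_point_dataset(series, window):
    """Convert a timeseries into (window_values, next_value) rows."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]

def fit_and_score(rows):
    """Fit a one-feature linear model (window mean -> next value) on the first
    half of the rows; return its mean absolute error on the second half."""
    split = len(rows) // 2
    train, valid = rows[:split], rows[split:]
    xs = [mean(features) for features, _ in train]
    ys = [target for _, target in train]
    x_bar, y_bar = mean(xs), mean(ys)
    variance = sum((x - x_bar) ** 2 for x in xs)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = covariance / variance if variance else 0.0
    intercept = y_bar - slope * x_bar
    errors = [abs(intercept + slope * mean(features) - target)
              for features, target in valid]
    return mean(errors)

def select_window(series, candidate_windows):
    """Score every candidate configuration and keep the lowest-error one."""
    scores = {w: fit_and_score(to_point_dataset(series, w))
              for w in candidate_windows}
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic series (an assumption for the demo): weekly pattern plus a trend.
series = [(i % 7) + 0.1 * i for i in range(120)]
best_window, scores = select_window(series, [1, 3, 7, 14])
```

After selection, `to_point_dataset(new_series, best_window)` would play the role of the finally configured window preprocessor applied to new data.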
  • Patent number: 11868854
    Abstract: Herein are techniques that train regressor(s) to predict how effective would a machine learning model (MLM) be if trained with new hyperparameters and/or dataset. In an embodiment, for each training dataset, a computer derives, from the dataset, values for dataset metafeatures. The computer performs, for each hyperparameters configuration (HC) of a MLM, including landmark HCs: configuring the MLM based on the HC, training the MLM based on the dataset, and obtaining an empirical quality score that indicates how effective was said training the MLM when configured with the HC. A performance tuple is generated that contains: the HC, the values for the dataset metafeatures, the empirical quality score and, for each landmark configuration, the empirical quality score of the landmark configuration and/or the landmark configuration itself. Based on the performance tuples, a regressor is trained to predict an estimated quality score based on a given dataset and a given HC.
    Type: Grant
    Filed: May 30, 2019
    Date of Patent: January 9, 2024
    Assignee: Oracle International Corporation
    Inventors: Ali Moharrer, Venkatanathan Varadarajan, Sam Idicula, Sandeep Agrawal, Nipun Agarwal
  • Patent number: 11790242
    Abstract: Techniques are described for generating and applying mini-machine learning variants of machine learning algorithms to save computational resources in tuning and selection of machine learning algorithms. In an embodiment, at least one of the hyper-parameter values for a reference variant is modified to a new hyper-parameter value thereby generating a new variant of machine learning algorithm from the reference variant of machine learning algorithm. A performance score is determined for the new variant of machine learning algorithm using a training dataset, the performance score representing the accuracy of the new machine learning model for the training dataset. By performing training of the new variant of machine learning algorithm with the training data set, a cost metric of the new variant of machine learning algorithm is measured by measuring usage of the computing resources used for the training.
    Type: Grant
    Filed: October 19, 2018
    Date of Patent: October 17, 2023
    Assignee: Oracle International Corporation
    Inventors: Sandeep Agrawal, Venkatanathan Varadarajan, Sam Idicula, Nipun Agarwal
  • Patent number: 11720822
    Abstract: Herein, horizontally scalable techniques efficiently configure machine learning algorithms for optimal accuracy and without informed inputs. In an embodiment, for each particular hyperparameter, and for each epoch, a computer processes the particular hyperparameter. An epoch explores one hyperparameter based on hyperparameter tuples. A respective score is calculated from each tuple. The tuple contains a distinct combination of values, each of which is contained in a value range of a distinct hyperparameter. All values of a tuple that belong to the particular hyperparameter are distinct. All values of a tuple that belong to other hyperparameters are held constant. The value range of the particular hyperparameter is narrowed based on an intersection point of a first line based on the scores and a second line based on the scores. A machine learning algorithm is optimally configured from repeatedly narrowed value ranges of hyperparameters. The configured algorithm is invoked to obtain a result.
    Type: Grant
    Filed: October 13, 2021
    Date of Patent: August 8, 2023
    Assignee: Oracle International Corporation
    Inventors: Venkatanathan Varadarajan, Sam Idicula, Sandeep Agrawal, Nipun Agarwal
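One possible reading of the range-narrowing step in the abstract above is sketched below. The abstract does not say how the two lines are chosen, so this sketch assumes the first line passes through the two lowest sampled values and the second through the two highest. The function names and the quadratic `example_score` are hypothetical, and the heuristic is not guaranteed to keep the optimum in range for arbitrary score functions.

```python
def line_through(p, q):
    """Return (slope, intercept) of the line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1

def intersect_x(line_a, line_b):
    """x-coordinate where two lines cross, or None if they are parallel."""
    (m1, b1), (m2, b2) = line_a, line_b
    if m1 == m2:
        return None
    return (b2 - b1) / (m1 - m2)

def narrow_range(score, low, high, fixed, name, samples=5):
    """One epoch: vary hyperparameter `name`, hold the others in `fixed` constant."""
    step = (high - low) / (samples - 1)
    points = []
    for i in range(samples):
        value = low + i * step
        points.append((value, score({**fixed, name: value})))
    best = max(range(samples), key=lambda i: points[i][1])
    # Keep the neighbourhood of the best-scoring sample.
    new_low = points[max(best - 1, 0)][0]
    new_high = points[min(best + 1, samples - 1)][0]
    # Assumed choice of lines: leftmost pair and rightmost pair of samples.
    peak = intersect_x(line_through(points[0], points[1]),
                       line_through(points[-2], points[-1]))
    if peak is not None and new_low < peak < new_high:
        half = (new_high - new_low) / 4
        new_low, new_high = max(new_low, peak - half), min(new_high, peak + half)
    return new_low, new_high

def tune_one(score, low, high, fixed, name, epochs=3):
    """Repeatedly narrow the value range of a single hyperparameter."""
    for _ in range(epochs):
        low, high = narrow_range(score, low, high, fixed, name)
    return low, high

def example_score(config):
    # Hypothetical score surface with its maximum at alpha=3, beta=2.
    return -(config["alpha"] - 3.0) ** 2 - (config["beta"] - 2.0) ** 2

low, high = tune_one(example_score, 0.0, 10.0, {"beta": 2.0}, "alpha")
```

In a fuller version, the same loop would run once per hyperparameter, and the per-tuple scoring calls could be dispatched in parallel, which is presumably where the horizontal scaling comes from.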
  • Publication number: 20230153394
    Abstract: Herein are timeseries preprocessing, model selection, and hyperparameter tuning techniques for forecasting development based on temporal statistics of a timeseries and a single feed-forward pass through a machine learning (ML) pipeline. In an embodiment, a computer hosts and operates the ML pipeline that automatically measures temporal statistic(s) of a timeseries. ML algorithm selection, cross validation, and hyperparameters tuning is based on the temporal statistics of the timeseries. The result from the ML pipeline is a rigorously trained and production ready ML model that is validated to have increased accuracy for multiple prediction horizons. Based on the temporal statistics, efficiency is achieved by asymmetry of investment of computer resources in the tuning and training of the most promising ML algorithm(s). Compared to other approaches, this ML pipeline produces a more accurate ML model for a given amount of computer resources and consumes fewer computer resources to achieve a given accuracy.
    Type: Application
    Filed: November 17, 2021
    Publication date: May 18, 2023
    Inventors: Ritesh Ahuja, Anatoly Yakovlev, Venkatanathan Varadarajan, Sandeep R. Agrawal, Hesam Fathi Moghadam, Sanjay Jinturkar, Nipun Agarwal
  • Patent number: 11620568
    Abstract: Techniques are provided for selection of machine learning algorithms based on performance predictions by using hyperparameter predictors. In an embodiment, for each mini-machine learning model (MML model), a respective hyperparameter predictor set that predicts a respective set of hyperparameter settings for a data set is trained. Each MML model represents a respective reference machine learning model (RML model). Data set samples are generated from the data set. Meta-feature sets are generated, each meta-feature set describing a respective data set sample. A respective target set of hyperparameter settings are generated for said each MML model using a hypertuning algorithm. The meta-feature sets and the respective target set of hyperparameter settings are used to train the respective hyperparameter predictor set. Each hyperparameter predictor set is used during training and inference to improve the accuracy of automatically selecting a RML model per data set.
    Type: Grant
    Filed: April 18, 2019
    Date of Patent: April 4, 2023
    Assignee: Oracle International Corporation
    Inventors: Hesam Fathi Moghadam, Sandeep Agrawal, Venkatanathan Varadarajan, Anatoly Yakovlev, Sam Idicula, Nipun Agarwal
  • Patent number: 11562178
    Abstract: According to an embodiment, a method includes generating a first dataset sample from a dataset, calculating a first validation score for the first dataset sample and a machine learning model, and determining whether a difference in validation score between the first validation score and a second validation score satisfies a first criteria. If the difference in validation score does not satisfy the first criteria, the method includes generating a second dataset sample from the dataset. If the difference in validation score does satisfy the first criteria, the method includes updating a convergence value and determining whether the updated convergence value satisfies a second criteria. If the updated convergence value satisfies the second criteria, the method includes returning the first dataset sample. If the updated convergence value does not satisfy the second criteria, the method includes generating the second dataset sample from the dataset.
    Type: Grant
    Filed: December 17, 2019
    Date of Patent: January 24, 2023
    Assignee: Oracle International Corporation
    Inventors: Jingxiao Cai, Sandeep Agrawal, Sam Idicula, Venkatanathan Varadarajan, Anatoly Yakovlev, Nipun Agarwal
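The two-criteria loop in the abstract above might look like the following sketch. The specific criteria are assumptions: the first is "score changed by at most `tolerance`" and the second is "that has happened `patience` times in a row". Resetting the counter when the first criterion fails is also an assumption, as are the names `converged_sample` and `toy_validate`.

```python
import random

def converged_sample(dataset, validate, sizes, tolerance=0.01, patience=2, seed=0):
    """Draw progressively larger samples until the validation score stabilises."""
    rng = random.Random(seed)
    previous_score = None
    convergence = 0
    sample = []
    for size in sizes:
        sample = rng.sample(dataset, min(size, len(dataset)))
        score = validate(sample)
        if previous_score is not None and abs(score - previous_score) <= tolerance:
            convergence += 1              # first criterion met: update convergence value
            if convergence >= patience:   # second criterion met: return this sample
                return sample
        else:
            convergence = 0               # assumption: a large change resets the count
        previous_score = score
    return sample                         # sizes exhausted without converging

def toy_validate(sample):
    # Stand-in for training and validating a model; improves with sample size.
    return 1.0 - 1.0 / len(sample)

dataset = list(range(1000))
chosen = converged_sample(dataset, toy_validate, [10, 20, 40, 80, 160, 320])
```

With this toy validator the score differences fall below the tolerance at sizes 160 and 320, so the 320-row sample is returned.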
  • Patent number: 11544494
    Abstract: Techniques are provided for selection of machine learning algorithms based on performance predictions by trained algorithm-specific regressors. In an embodiment, a computer derives meta-feature values from an inference dataset by, for each meta-feature, deriving a respective meta-feature value from the inference dataset. For each trainable algorithm and each regression meta-model that is respectively associated with the algorithm, a respective score is calculated by invoking the meta-model based on at least one of: a respective subset of meta-feature values, and/or hyperparameter values of a respective subset of hyperparameters of the algorithm. The algorithm(s) are selected based on the respective scores. Based on the inference dataset, the selected algorithm(s) may be invoked to obtain a result. In an embodiment, the trained regressors are distinctly configured artificial neural networks. In an embodiment, the trained regressors are contained within algorithm-specific ensembles.
    Type: Grant
    Filed: January 30, 2018
    Date of Patent: January 3, 2023
    Assignee: Oracle International Corporation
    Inventors: Sandeep Agrawal, Sam Idicula, Venkatanathan Varadarajan, Nipun Agarwal
  • Patent number: 11451670
    Abstract: Herein are machine learning (ML) techniques for unsupervised training with a corpus of signaling system 7 (SS7) messages having a diversity of called and calling parties, operation codes (opcodes) and transaction types, numbering plans and nature of address indicators, and mobile country codes and network codes. In an embodiment, a computer stores SS7 messages that are not labeled as anomalous or non-anomalous. Each SS7 message contains an opcode and other fields. For each SS7 message, the opcode of the SS7 message is stored into a respective feature vector (FV) of many FVs that are based on respective unlabeled SS7 messages. The FVs contain many distinct opcodes. Based on the FVs that contain many distinct opcodes and that are based on respective unlabeled SS7 messages, an ML model such as a reconstructive model such as an autoencoder is unsupervised trained to detect an anomalous SS7 message.
    Type: Grant
    Filed: December 16, 2020
    Date of Patent: September 20, 2022
    Assignee: Oracle International Corporation
    Inventors: Hamed Ahmadi, Ali Moharrer, Venkatanathan Varadarajan, Vaseem Akram, Nishesh Rai, Reema Hingorani, Sanjay Jinturkar, Nipun Agarwal
  • Patent number: 11429895
    Abstract: Herein are techniques for exploring hyperparameters of a machine learning model (MLM) and to train a regressor to predict a time needed to train the MLM based on a hyperparameter configuration and a dataset. In an embodiment that is deployed in production inferencing mode, for each landmark configuration, each containing values for hyperparameters of a MLM, a computer configures the MLM based on the landmark configuration and measures time spent training the MLM on a dataset. An already trained regressor predicts time needed to train the MLM based on a proposed configuration of the MLM, dataset meta-feature values, and training durations and hyperparameter values of landmark configurations of the MLM. When instead in training mode, a regressor in training ingests a training corpus of MLM performance history to learn, by reinforcement, to predict a training time for the MLM for new datasets and/or new hyperparameter configurations.
    Type: Grant
    Filed: April 15, 2019
    Date of Patent: August 30, 2022
    Assignee: Oracle International Corporation
    Inventors: Anatoly Yakovlev, Venkatanathan Varadarajan, Sandeep Agrawal, Hesam Fathi Moghadam, Sam Idicula, Nipun Agarwal
  • Publication number: 20220191332
    Abstract: Herein are machine learning (ML) techniques for unsupervised training with a corpus of signaling system 7 (SS7) messages having a diversity of called and calling parties, operation codes (opcodes) and transaction types, numbering plans and nature of address indicators, and mobile country codes and network codes. In an embodiment, a computer stores SS7 messages that are not labeled as anomalous or non-anomalous. Each SS7 message contains an opcode and other fields. For each SS7 message, the opcode of the SS7 message is stored into a respective feature vector (FV) of many FVs that are based on respective unlabeled SS7 messages. The FVs contain many distinct opcodes. Based on the FVs that contain many distinct opcodes and that are based on respective unlabeled SS7 messages, an ML model such as a reconstructive model such as an autoencoder is unsupervised trained to detect an anomalous SS7 message.
    Type: Application
    Filed: December 16, 2020
    Publication date: June 16, 2022
    Inventors: Hamed Ahmadi, Ali Moharrer, Venkatanathan Varadarajan, Vaseem Akram, Nishesh Rai, Reema Hingorani, Sanjay Jinturkar, Nipun Agarwal
  • Publication number: 20220138504
    Abstract: In an embodiment based on computer(s), an ML model is trained to detect outliers. The ML model calculates anomaly scores that include a respective anomaly score for each item in a validation dataset. The anomaly scores are automatically organized by sorting and/or clustering. Based on the organized anomaly scores, a separation is measured that indicates fitness of the ML model. In an embodiment, a computer performs two-clustering of anomaly scores into a first organization that consists of a first normal cluster of anomaly scores and a first anomaly cluster of anomaly scores. The computer performs three-clustering of the same anomaly scores into a second organization that consists of a second normal cluster of anomaly scores, a second anomaly cluster of anomaly scores, and a middle cluster of anomaly scores. A distribution difference between the first organization and the second organization is measured. An ML model is processed based on the distribution difference.
    Type: Application
    Filed: October 29, 2020
    Publication date: May 5, 2022
    Inventors: Hesam Fathi Moghadam, Anatoly Yakovlev, Sandeep Agrawal, Venkatanathan Varadarajan, Robert Hopkins, Matteo Casserini, Milos Vasic, Sanjay Jinturkar, Nipun Agarwal
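As a rough illustration of measuring separation from sorted anomaly scores, the sketch below uses the largest gap between adjacent sorted scores, normalised by the score range. This particular measure is an assumption chosen for brevity; it does not implement the two-clustering versus three-clustering comparison that the abstract also describes, and the name `separation` is hypothetical.

```python
def separation(scores):
    """Return (relative size of the largest gap, midpoint of that gap)."""
    ordered = sorted(scores)
    if len(ordered) < 2 or ordered[-1] == ordered[0]:
        return 0.0, None
    spread = ordered[-1] - ordered[0]
    gaps = [(b - a, (a + b) / 2) for a, b in zip(ordered, ordered[1:])]
    gap, threshold = max(gaps)
    return gap / spread, threshold

# Hypothetical scores from two candidate models on the same validation set.
clear_model = [0.10, 0.12, 0.11, 0.90, 0.95]   # normal and anomalous scores far apart
vague_model = [0.10, 0.30, 0.50, 0.70, 0.90]   # no obvious boundary

clear_sep, clear_threshold = separation(clear_model)
vague_sep, _ = separation(vague_model)
```

Under this measure a model whose scores split cleanly into a low group and a high group gets a value near 1, which could serve as a label-free fitness signal when comparing models.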
  • Publication number: 20220121955
    Abstract: Herein, a computer generates and evaluates many preprocessor configurations for a window preprocessor that transforms a training timeseries dataset for an ML model. With each preprocessor configuration, the window preprocessor is configured. The window preprocessor then converts the training timeseries dataset into a configuration-specific point-based dataset that is based on the preprocessor configuration. The ML model is trained based on the configuration-specific point-based dataset to calculate a score for the preprocessor configuration. Based on the scores of the many preprocessor configurations, an optimal preprocessor configuration is selected for finally configuring the window preprocessor, after which, the window preprocessor can optimally transform a new timeseries dataset such as in an offline or online production environment such as for real-time processing of a live streaming timeseries.
    Type: Application
    Filed: October 15, 2020
    Publication date: April 21, 2022
    Inventors: Nikan Chavoshi, Anatoly Yakovlev, Hesam Fathi Moghadam, Venkatanathan Varadarajan, Sandeep Agrawal, Ali Moharrer, Jingxiao Cai, Sanjay Jinturkar, Nipun Agarwal
  • Publication number: 20220043681
    Abstract: Herein, a computer receives a new training dataset for a target ML model. Proven or unproven respective values of hyperparameters of the target ML model are selected. An already-trained ML metamodel predicts an amount of memory that the target ML model will need, when configured with the respective values of the hyperparameters, to train with the new training dataset. In an embodiment, supervised training of the ML metamodel is as follows. The ML metamodel receives feature vectors that each contains distinct details of a respective past training of the target ML model of many and varied trainings of the target ML model. Those distinct details of each past training includes: respective values of the hyperparameters, and respective values of metafeatures of a respective training dataset of many training datasets. Each feature vector is labeled with a respective amount of memory that the target ML model needed during the respective past training.
    Type: Application
    Filed: August 4, 2020
    Publication date: February 10, 2022
    Inventors: Ali Moharrer, Sandeep R. Agrawal, Venkatanathan Varadarajan, Sanjay Jinturkar, Nipun Agarwal
  • Publication number: 20220027746
    Abstract: Herein, horizontally scalable techniques efficiently configure machine learning algorithms for optimal accuracy and without informed inputs. In an embodiment, for each particular hyperparameter, and for each epoch, a computer processes the particular hyperparameter. An epoch explores one hyperparameter based on hyperparameter tuples. A respective score is calculated from each tuple. The tuple contains a distinct combination of values, each of which is contained in a value range of a distinct hyperparameter. All values of a tuple that belong to the particular hyperparameter are distinct. All values of a tuple that belong to other hyperparameters are held constant. The value range of the particular hyperparameter is narrowed based on an intersection point of a first line based on the scores and a second line based on the scores. A machine learning algorithm is optimally configured from repeatedly narrowed value ranges of hyperparameters. The configured algorithm is invoked to obtain a result.
    Type: Application
    Filed: October 13, 2021
    Publication date: January 27, 2022
    Inventors: Venkatanathan Varadarajan, Sam Idicula, Sandeep Agrawal, Nipun Agarwal
  • Publication number: 20210390466
    Abstract: A proxy-based automatic non-iterative machine learning (PANI-ML) pipeline is described, which predicts machine learning model configuration performance and outputs an automatically-configured machine learning model for a target training dataset. Techniques described herein use one or more proxy models—which implement a variety of machine learning algorithms and are pre-configured with tuned hyperparameters—to estimate relative performance of machine learning model configuration parameters at various stages of the PANI-ML pipeline. The PANI-ML pipeline implements a radically new approach of rapidly narrowing the search space for machine learning model configuration parameters by performing algorithm selection followed by algorithm-specific adaptive data reduction (i.e., row- and/or feature-wise dataset sampling), and then hyperparameter tuning.
    Type: Application
    Filed: October 30, 2020
    Publication date: December 16, 2021
    Inventors: Venkatanathan Varadarajan, Sandeep R. Agrawal, Hesam Fathi Moghadam, Anatoly Yakovlev, Ali Moharrer, Jingxiao Cai, Sanjay Jinturkar, Nipun Agarwal, Sam Idicula, Nikan Chavoshi
  • Patent number: 11176487
    Abstract: Herein, horizontally scalable techniques efficiently configure machine learning algorithms for optimal accuracy and without informed inputs. In an embodiment, for each particular hyperparameter, and for each epoch, a computer processes the particular hyperparameter. An epoch explores one hyperparameter based on hyperparameter tuples. A respective score is calculated from each tuple. The tuple contains a distinct combination of values, each of which is contained in a value range of a distinct hyperparameter. All values of a tuple that belong to the particular hyperparameter are distinct. All values of a tuple that belong to other hyperparameters are held constant. The value range of the particular hyperparameter is narrowed based on an intersection point of a first line based on the scores and a second line based on the scores. A machine learning algorithm is optimally configured from repeatedly narrowed value ranges of hyperparameters. The configured algorithm is invoked to obtain a result.
    Type: Grant
    Filed: January 31, 2018
    Date of Patent: November 16, 2021
    Assignee: Oracle International Corporation
    Inventors: Venkatanathan Varadarajan, Sam Idicula, Sandeep Agrawal, Nipun Agarwal
  • Patent number: 11120368
    Abstract: Herein are techniques for automatic tuning of hyperparameters of machine learning algorithms. System throughput is maximized by horizontally scaling and asynchronously dispatching the configuration, training, and testing of an algorithm. In an embodiment, a computer stores a best cost achieved by executing a target model based on best values of the target algorithm's hyperparameters. The best values and their cost are updated by epochs that asynchronously execute. Each epoch has asynchronous costing tasks that explore a distinct hyperparameter. Each costing task has a sample of exploratory values that differs from the best values along the distinct hyperparameter. The asynchronous costing tasks of a same epoch have different values for the distinct hyperparameter, which accomplishes an exploration. In an embodiment, an excessive update of best values or best cost creates a major epoch for exploration in a subspace that is more or less unrelated to other epochs, thereby avoiding local optima.
    Type: Grant
    Filed: September 21, 2018
    Date of Patent: September 14, 2021
    Assignee: Oracle International Corporation
    Inventors: Venkatanathan Varadarajan, Sam Idicula, Sandeep Agrawal, Nipun Agarwal
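The asynchronous dispatch described in the abstract above can be approximated with a thread pool, as in this sketch. It is a simplification under stated assumptions: each hyperparameter gets one epoch per round, candidates differ from the current best values along exactly one hyperparameter, and the "major epoch" mechanism for escaping local optima is not modelled. The names `tune` and `example_cost` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def tune(cost, start, ranges, rounds=3, workers=4):
    """Coordinate-wise search whose costing tasks run concurrently."""
    best = dict(start)
    best_cost = cost(best)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(rounds):
            futures = {}
            # One epoch per hyperparameter; each costing task changes only that one.
            for name, values in ranges.items():
                for value in values:
                    candidate = {**best, name: value}
                    futures[pool.submit(cost, candidate)] = candidate
            # Update the best values and best cost as results arrive, in any order.
            for future in as_completed(futures):
                candidate_cost = future.result()
                if candidate_cost < best_cost:
                    best_cost, best = candidate_cost, futures[future]
    return best, best_cost

def example_cost(config):
    # Stand-in for configuring, training, and testing a model; lower is better.
    return (config["x"] - 2) ** 2 + (config["y"] + 1) ** 2

search_space = {"x": list(range(-4, 5)), "y": list(range(-4, 5))}
best, best_cost = tune(example_cost, {"x": 0, "y": 0}, search_space)
```

For CPU-bound training, a process pool or a cluster scheduler would be the more realistic executor; threads are used here only to keep the example self-contained.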
  • Publication number: 20200380378
    Abstract: Herein are techniques that train regressor(s) to predict how effective would a machine learning model (MLM) be if trained with new hyperparameters and/or dataset. In an embodiment, for each training dataset, a computer derives, from the dataset, values for dataset metafeatures. The computer performs, for each hyperparameters configuration (HC) of a MLM, including landmark HCs: configuring the MLM based on the HC, training the MLM based on the dataset, and obtaining an empirical quality score that indicates how effective was said training the MLM when configured with the HC. A performance tuple is generated that contains: the HC, the values for the dataset metafeatures, the empirical quality score and, for each landmark configuration, the empirical quality score of the landmark configuration and/or the landmark configuration itself. Based on the performance tuples, a regressor is trained to predict an estimated quality score based on a given dataset and a given HC.
    Type: Application
    Filed: May 30, 2019
    Publication date: December 3, 2020
    Inventors: Ali Moharrer, Venkatanathan Varadarajan, Sam Idicula, Sandeep Agrawal, Nipun Agarwal
  • Publication number: 20200342265
    Abstract: According to an embodiment, a method includes generating a first dataset sample from a dataset, calculating a first validation score for the first dataset sample and a machine learning model, and determining whether a difference in validation score between the first validation score and a second validation score satisfies a first criteria. If the difference in validation score does not satisfy the first criteria, the method includes generating a second dataset sample from the dataset. If the difference in validation score does satisfy the first criteria, the method includes updating a convergence value and determining whether the updated convergence value satisfies a second criteria. If the updated convergence value satisfies the second criteria, the method includes returning the first dataset sample. If the updated convergence value does not satisfy the second criteria, the method includes generating the second dataset sample from the dataset.
    Type: Application
    Filed: December 17, 2019
    Publication date: October 29, 2020
    Inventors: Jingxiao Cai, Sandeep Agrawal, Sam Idicula, Venkatanathan Varadarajan, Anatoly Yakovlev, Nipun Agarwal