Patents by Inventor Sam Idicula
Sam Idicula has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250315545
Abstract: The technology generally relates to securing end user access to application databases. An application receives a natural language query from an application end user requesting particular data from a database. The application identifies one or more tokens associated with the application end user to identify the application-specific data that is accessible to the application end user. The application provides the one or more tokens to one or more parameterized secure view elements. The parameterized secure view elements use the one or more tokens to create a virtual database that contains only the application-specific data that is accessible to the application end user. The application uses one or more machine learning models to translate the natural language query into a database query that the application uses to access the virtual database and retrieve the particular data to respond to the natural language query.
Type: Application
Filed: April 7, 2025
Publication date: October 9, 2025
Inventors: Yannis Papakonstantinou, Sam Idicula, Pranav Nambiar, Saileshwar Krishnamurthy, Shreedhar Hardikar, Noah Jeffrey Misch, Per Jacobsson, Haoyu Huang, Andrew Lee Brook
-
Publication number: 20240311356
Abstract: A method for workload-driven index selections includes receiving a request for a recommended index configuration. The method includes obtaining a plurality of queries executed at the database. The method also includes selecting a set of candidate indexes from the plurality of indexes. The method includes for each respective candidate index of the set of candidate indexes, determining, based on the plurality of queries, a respective workload cost for the respective candidate index. The method also includes selecting, based on the respective workload cost, a first candidate index from the set of candidate indexes for the recommended index configuration. The method includes selecting one or more additional candidate indexes from the set of candidate indexes for the recommended index configuration. The method includes determining that a size of the selected candidate indexes satisfies a size threshold and transmitting the recommended index configuration.
Type: Application
Filed: March 14, 2023
Publication date: September 19, 2024
Applicant: Google LLC
Inventors: Haoyu Huang, Vincent Zhuang, Sam Idicula, Gaurav Jain
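The selection loop described in this abstract can be illustrated with a small greedy sketch. This is not the claimed method: the `CandidateIndex` type, the toy `workload_cost` function, the cost constants, and the stopping rules are all assumptions made for illustration. It only shows the general shape of picking indexes one at a time by workload cost while staying under a size limit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateIndex:
    # Hypothetical record: which queries an index helps, and how big it is.
    name: str
    size_mb: int
    covered_queries: frozenset


def workload_cost(selected, queries, scan_cost=10.0, index_cost=1.0):
    """Toy cost model (an assumption): a query is cheap if any selected index covers it."""
    covered = set()
    for ix in selected:
        covered |= ix.covered_queries
    return sum(index_cost if q in covered else scan_cost for q in queries)


def recommend(candidates, queries, size_budget_mb):
    """Greedily add the index that lowers workload cost most, within the size budget."""
    selected, used = [], 0
    remaining = list(candidates)
    current = workload_cost(selected, queries)
    while remaining:
        fitting = [ix for ix in remaining if used + ix.size_mb <= size_budget_mb]
        if not fitting:
            break  # nothing else fits under the size threshold
        best = min(fitting, key=lambda ix: workload_cost(selected + [ix], queries))
        best_cost = workload_cost(selected + [best], queries)
        if best_cost >= current:
            break  # no remaining candidate improves the workload
        selected.append(best)
        remaining.remove(best)
        used += best.size_mb
        current = best_cost
    return selected


queries = ["q1", "q2", "q3"]
candidates = [
    CandidateIndex("a", 40, frozenset({"q1", "q2"})),
    CandidateIndex("b", 30, frozenset({"q3"})),
    CandidateIndex("c", 50, frozenset({"q1"})),
]
chosen = recommend(candidates, queries, size_budget_mb=80)
print([ix.name for ix in chosen])
```

A real advisor would presumably obtain costs from the database optimizer rather than from a coverage set; the coverage set is used here only to keep the sketch self-contained.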
-
Patent number: 12072953
Abstract: Techniques are described herein for performing efficient matrix multiplication in architectures with scratchpad memories or associative caches using asymmetric allocation of space for the different matrices. The system receives a left matrix and a right matrix. In an embodiment, the system allocates, in a scratchpad memory, asymmetric memory space for tiles for each of the two matrices as well as a dot product matrix. The system proceeds with then performing dot product matrix multiplication involving the tiles of the left and the right matrices, storing resulting dot product values in corresponding allocated dot product matrix tiles. The system then proceeds to write the stored dot product values from the scratchpad memory into main memory.
Type: Grant
Filed: June 16, 2021
Date of Patent: August 27, 2024
Assignee: ORACLE INTERNATIONAL CORPORATION
Inventors: Gaurav Chadha, Sam Idicula, Sandeep Agrawal, Nipun Agarwal
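As a rough illustration of the tiling idea in this abstract, the sketch below multiplies two matrices tile by tile, with differently sized tiles for the left and right operands and a separate accumulator tile for the product. The tile sizes and the names `left_tile_rows`, `right_tile_cols`, and `k_tile` are assumptions; plain Python lists stand in for scratchpad memory, so this shows only the loop structure, not the hardware behavior or the patented allocation policy.

```python
def tiled_matmul(A, B, left_tile_rows=4, right_tile_cols=2, k_tile=2):
    """Multiply A (n x k) by B (k x m) using asymmetric tiles.

    The left tile (left_tile_rows x k_tile) is deliberately larger than the
    right tile (k_tile x right_tile_cols); the sizes are illustrative only.
    """
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, left_tile_rows):
        i1 = min(i0 + left_tile_rows, n)
        for j0 in range(0, m, right_tile_cols):
            j1 = min(j0 + right_tile_cols, m)
            # Accumulator tile for the product, standing in for scratchpad space.
            acc = [[0.0] * (j1 - j0) for _ in range(i1 - i0)]
            for k0 in range(0, k, k_tile):
                k1 = min(k0 + k_tile, k)
                a_tile = [row[k0:k1] for row in A[i0:i1]]  # copy of left tile
                b_tile = [row[j0:j1] for row in B[k0:k1]]  # copy of right tile
                for i, a_row in enumerate(a_tile):
                    for kk, a_val in enumerate(a_row):
                        for j, b_val in enumerate(b_tile[kk]):
                            acc[i][j] += a_val * b_val
            # Write the finished product tile back to "main memory".
            for i in range(i1 - i0):
                C[i0 + i][j0:j1] = acc[i]
    return C


A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]
print(tiled_matmul(A, B))
```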
-
Patent number: 11868854
Abstract: Herein are techniques that train regressor(s) to predict how effective would a machine learning model (MLM) be if trained with new hyperparameters and/or dataset. In an embodiment, for each training dataset, a computer derives, from the dataset, values for dataset metafeatures. The computer performs, for each hyperparameters configuration (HC) of a MLM, including landmark HCs: configuring the MLM based on the HC, training the MLM based on the dataset, and obtaining an empirical quality score that indicates how effective was said training the MLM when configured with the HC. A performance tuple is generated that contains: the HC, the values for the dataset metafeatures, the empirical quality score and, for each landmark configuration, the empirical quality score of the landmark configuration and/or the landmark configuration itself. Based on the performance tuples, a regressor is trained to predict an estimated quality score based on a given dataset and a given HC.
Type: Grant
Filed: May 30, 2019
Date of Patent: January 9, 2024
Assignee: Oracle International Corporation
Inventors: Ali Moharrer, Venkatanathan Varadarajan, Sam Idicula, Sandeep Agrawal, Nipun Agarwal
-
Patent number: 11790242
Abstract: Techniques are described for generating and applying mini-machine learning variants of machine learning algorithms to save computational resources in tuning and selection of machine learning algorithms. In an embodiment, at least one of the hyper-parameter values for a reference variant is modified to a new hyper-parameter value thereby generating a new variant of machine learning algorithm from the reference variant of machine learning algorithm. A performance score is determined for the new variant of machine learning algorithm using a training dataset, the performance score representing the accuracy of the new machine learning model for the training dataset. By performing training of the new variant of machine learning algorithm with the training data set, a cost metric of the new variant of machine learning algorithm is measured by measuring usage the used computing resources for the training.
Type: Grant
Filed: October 19, 2018
Date of Patent: October 17, 2023
Assignee: Oracle International Corporation
Inventors: Sandeep Agrawal, Venkatanathan Varadarajan, Sam Idicula, Nipun Agarwal
-
Patent number: 11782926
Abstract: Embodiments utilize trained query performance machine learning (QP-ML) models to predict an optimal compute node cluster size for a given in-memory workload. The QP-ML models include models that predict query task runtimes at various compute node cardinalities, and models that predict network communication time between nodes of the cluster. Embodiments also utilize an analytical model to predict overlap between predicted task runtimes and predicted network communication times. Based on this data, an optimal cluster size is selected for the workload. Embodiments further utilize trained data capacity machine learning (DC-ML) models to predict a minimum number of compute nodes needed to run a workload. The DC-ML models include models that predict the size of the workload dataset in a target data encoding, models that predict the amount of memory needed to run the queries in the workload, and models that predict the memory needed to accommodate changes to the dataset.
Type: Grant
Filed: January 12, 2022
Date of Patent: October 10, 2023
Assignee: Oracle International Corporation
Inventors: Sam Idicula, Tomas Karnagel, Jian Wen, Seema Sundara, Nipun Agarwal, Mayur Bency
-
Patent number: 11720822
Abstract: Herein, horizontally scalable techniques efficiently configure machine learning algorithms for optimal accuracy and without informed inputs. In an embodiment, for each particular hyperparameter, and for each epoch, a computer processes the particular hyperparameter. An epoch explores one hyperparameter based on hyperparameter tuples. A respective score is calculated from each tuple. The tuple contains a distinct combination of values, each of which is contained in a value range of a distinct hyperparameter. All values of a tuple that belong to the particular hyperparameter are distinct. All values of a tuple that belong to other hyperparameters are held constant. The value range of the particular hyperparameter is narrowed based on an intersection point of a first line based on the scores and a second line based on the scores. A machine learning algorithm is optimally configured from repeatedly narrowed value ranges of hyperparameters. The configured algorithm is invoked to obtain a result.
Type: Grant
Filed: October 13, 2021
Date of Patent: August 8, 2023
Assignee: Oracle International Corporation
Inventors: Venkatanathan Varadarajan, Sam Idicula, Sandeep Agrawal, Nipun Agarwal
-
Patent number: 11620568
Abstract: Techniques are provided for selection of machine learning algorithms based on performance predictions by using hyperparameter predictors. In an embodiment, for each mini-machine learning model (MML model), a respective hyperparameter predictor set that predicts a respective set of hyperparameter settings for a data set is trained. Each MML model represents a respective reference machine learning model (RML model). Data set samples are generated from the data set. Meta-feature sets are generated, each meta-feature set describing a respective data set sample. A respective target set of hyperparameter settings are generated for said each MML model using a hypertuning algorithm. The meta-feature sets and the respective target set of hyperparameter settings are used to train the respective hyperparameter predictor set. Each hyperparameter predictor set is used during training and inference to improve the accuracy of automatically selecting a RML model per data set.
Type: Grant
Filed: April 18, 2019
Date of Patent: April 4, 2023
Assignee: Oracle International Corporation
Inventors: Hesam Fathi Moghadam, Sandeep Agrawal, Venkatanathan Varadarajan, Anatoly Yakovlev, Sam Idicula, Nipun Agarwal
-
Patent number: 11615265
Abstract: The present invention relates to dimensionality reduction for machine learning (ML) models. Herein are techniques that individually rank features and combine features based on their rank to achieve an optimal combination of features that may accelerate training and/or inferencing, prevent overfitting, and/or provide insights into somewhat mysterious datasets. In an embodiment, a computer ranks features of datasets of a training corpus. For each dataset and for each landmark percentage, a target ML model is configured to receive only a highest ranking landmark percentage of features, and a landmark accuracy achieved by training the ML model with the dataset is measured. Based on the landmark accuracies and meta-features values of the dataset, a respective training tuple is generated for each dataset. Based on all of the training tuples, a regressor is trained to predict an optimal amount of features for training the target ML model.
Type: Grant
Filed: August 21, 2019
Date of Patent: March 28, 2023
Assignee: Oracle International Corporation
Inventors: Tomas Karnagel, Sam Idicula, Hesam Fathi Moghadam, Nipun Agarwal
-
Patent number: 11579951
Abstract: Techniques are described herein for predicting disk drive failure using a machine learning model. The framework involves receiving disk drive sensor attributes as training data, preprocessing the training data to select a set of enhanced feature sequences, and using the enhanced feature sequences to train a machine learning model to predict disk drive failures from disk drive sensor monitoring data. Prior to the training phase, the RNN LSTM model is tuned using a set of predefined hyper-parameters. The preprocessing, which is performed during the training and evaluation phase as well as later during the prediction phase, involves using predefined values for a set of parameters to generate the set of enhanced sequences from raw sensor reading. The enhanced feature sequences are generated to maintain a desired healthy/failed disk ratio, and only use samples leading up to a last-valid-time sample in order to honor a pre-specified heads-up-period alert requirement.
Type: Grant
Filed: September 27, 2018
Date of Patent: February 14, 2023
Assignee: Oracle International Corporation
Inventors: Onur Kocberber, Felix Schmidt, Arun Raghavan, Nipun Agarwal, Sam Idicula, Guang-Tong Zhou, Nitin Kunal
-
Patent number: 11567937
Abstract: Embodiments implement a prediction-driven, rather than a trial-driven, approach to automate database configuration parameter tuning for a database workload. This approach uses machine learning (ML) models to test performance metrics resulting from application of particular database parameters to a database workload, and does not require live trials on the DBMS managing the workload. Specifically, automatic configuration (AC) ML models are trained, using a training corpus that includes information from workloads being run by DBMSs, to predict performance metrics based on workload features and configuration parameter values. The trained AC-ML models predict performance metrics resulting from applying particular configuration parameter values to a given database workload being automatically tuned. Based on correlating changes to configuration parameter values with changes in predicted performance metrics, an optimization algorithm is used to converge to an optimal set of configuration parameters.
Type: Grant
Filed: May 12, 2021
Date of Patent: January 31, 2023
Assignee: Oracle International Corporation
Inventors: Sam Idicula, Tomas Karnagel, Jian Wen, Seema Sundara, Nipun Agarwal, Mayur Bency
-
Patent number: 11562178
Abstract: According to an embodiment, a method includes generating a first dataset sample from a dataset, calculating a first validation score for the first dataset sample and a machine learning model, and determining whether a difference in validation score between the first validation score and a second validation score satisfies a first criteria. If the difference in validation score does not satisfy the first criteria, the method includes generating a second dataset sample from the dataset. If the difference in validation score does satisfy the first criteria, the method includes updating a convergence value and determining whether the updated convergence value satisfies a second criteria. If the updated convergence value satisfies the second criteria, the method includes returning the first dataset sample. If the updated convergence value does not satisfy the second criteria, the method includes generating the second dataset sample from the dataset.
Type: Grant
Filed: December 17, 2019
Date of Patent: January 24, 2023
Assignee: Oracle International Corporation
Inventors: Jingxiao Cai, Sandeep Agrawal, Sam Idicula, Venkatanathan Varadarajan, Anatoly Yakovlev, Nipun Agarwal
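The control flow in this abstract can be read as a convergence loop over dataset samples. The sketch below is one possible interpretation, not the patented procedure: it assumes that each new sample is larger than the last, that the "first criteria" is a score difference within a tolerance `tol`, that the "convergence value" is a counter of consecutive stable scores compared against `patience`, and that the counter resets when the difference is too large. `score_fn` is a hypothetical callable that trains and validates a model on a sample.

```python
import random


def select_sample(dataset, score_fn, start=100, growth=2, tol=0.01, patience=2, seed=0):
    """Return a sample once validation scores stop changing between samples.

    All parameters are illustrative assumptions; score_fn(sample) -> float
    is supplied by the caller.
    """
    rng = random.Random(seed)
    size = min(start, len(dataset))
    prev_score = None
    stable_rounds = 0  # plays the role of the "convergence value"
    while True:
        sample = rng.sample(dataset, size)
        score = score_fn(sample)
        if prev_score is not None and abs(score - prev_score) <= tol:
            stable_rounds += 1
            if stable_rounds >= patience:
                return sample  # second criterion met: scores have converged
        else:
            stable_rounds = 0
        if size == len(dataset):
            return sample  # nothing larger to try
        prev_score = score
        size = min(size * growth, len(dataset))


data = list(range(1000))
converged = select_sample(data, score_fn=lambda s: 0.9)
print(len(converged))
```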
-
Patent number: 11544494
Abstract: Techniques are provided for selection of machine learning algorithms based on performance predictions by trained algorithm-specific regressors. In an embodiment, a computer derives meta-feature values from an inference dataset by, for each meta-feature, deriving a respective meta-feature value from the inference dataset. For each trainable algorithm and each regression meta-model that is respectively associated with the algorithm, a respective score is calculated by invoking the meta-model based on at least one of: a respective subset of meta-feature values, and/or hyperparameter values of a respective subset of hyperparameters of the algorithm. The algorithm(s) are selected based on the respective scores. Based on the inference dataset, the selected algorithm(s) may be invoked to obtain a result. In an embodiment, the trained regressors are distinctly configured artificial neural networks. In an embodiment, the trained regressors are contained within algorithm-specific ensembles.
Type: Grant
Filed: January 30, 2018
Date of Patent: January 3, 2023
Assignee: Oracle International Corporation
Inventors: Sandeep Agrawal, Sam Idicula, Venkatanathan Varadarajan, Nipun Agarwal
-
Patent number: 11544630
Abstract: The present invention relates to dimensionality reduction for machine learning (ML) models. Herein are techniques that individually rank features and combine features based on their rank to achieve an optimal combination of features that may accelerate training and/or inferencing, prevent overfitting, and/or provide insights into somewhat mysterious datasets. In an embodiment, a computer calculates, for each feature of a training dataset, a relevance score based on: a relevance scoring function, and statistics of values, of the feature, that occur in the training dataset. A rank based on relevance scores of the features is calculated for each feature. A sequence of distinct subsets of the features, based on the ranks of the features, is generated. For each distinct subset of the sequence of distinct feature subsets, a fitness score is generated based on training a machine learning (ML) model that is configured for the distinct subset.
Type: Grant
Filed: May 20, 2019
Date of Patent: January 3, 2023
Assignee: Oracle International Corporation
Inventors: Tomas Karnagel, Sam Idicula, Nipun Agarwal
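A minimal sketch of the rank-then-subset idea in this abstract follows. It makes several assumptions that the abstract does not state: population variance is used as a stand-in relevance scoring function, the sequence of subsets is the nested top-1, top-2, ... features by rank, and `fitness_fn` is a hypothetical callable that would train a model on a feature subset and return its score.

```python
from statistics import pvariance


def rank_features(columns):
    """Rank feature names by a toy relevance score (variance of their values)."""
    scores = {name: pvariance(values) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


def best_subset(columns, fitness_fn):
    """Score nested top-k subsets of the ranked features and keep the fittest."""
    ranked = rank_features(columns)
    subsets = [ranked[:k] for k in range(1, len(ranked) + 1)]
    return max(subsets, key=fitness_fn)


columns = {
    "a": [1, 1, 1, 1],      # constant, so least relevant under this score
    "b": [0, 10, 0, 10],
    "c": [1, 2, 1, 2],
}


def toy_fitness(subset):
    # Stand-in for "train a model and measure it": rewards b and c,
    # with a small penalty per feature used.
    return len(set(subset) & {"b", "c"}) - 0.1 * len(subset)


print(rank_features(columns))
print(best_subset(columns, toy_fitness))
```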
-
Method for generating rulesets using tree-based models for black-box machine learning explainability
Patent number: 11531915
Abstract: Herein are techniques to generate candidate rulesets for machine learning (ML) explainability (MLX) for black-box ML models. In an embodiment, an ML model generates classifications that each associates a distinct example with a label. A decision tree that, based on the classifications, contains tree nodes is received or generated. Each node contains label(s), a condition that identifies a feature of examples, and a split value for the feature. When a node has child nodes, the feature and the split value that are identified by the condition of the node are set to maximize information gain of the child nodes. Candidate rules are generated by traversing the tree. Each rule is built from a combination of nodes in a tree traversal path. Each rule contains a condition of at least one node and is assigned to a rule level. Candidate rules are subsequently optimized into an optimal ruleset for actual use.
Type: Grant
Filed: March 20, 2019
Date of Patent: December 20, 2022
Assignee: Oracle International Corporation
Inventors: Tayler Hetherington, Zahra Zohrevand, Onur Kocberber, Karoon Rashedi Nia, Sam Idicula, Nipun Agarwal
-
Patent number: 11429895
Abstract: Herein are techniques for exploring hyperparameters of a machine learning model (MLM) and to train a regressor to predict a time needed to train the MLM based on a hyperparameter configuration and a dataset. In an embodiment that is deployed in production inferencing mode, for each landmark configuration, each containing values for hyperparameters of a MLM, a computer configures the MLM based on the landmark configuration and measures time spent training the MLM on a dataset. An already trained regressor predicts time needed to train the MLM based on a proposed configuration of the MLM, dataset meta-feature values, and training durations and hyperparameter values of landmark configurations of the MLM. When instead in training mode, a regressor in training ingests a training corpus of MLM performance history to learn, by reinforcement, to predict a training time for the MLM for new datasets and/or new hyperparameter configurations.
Type: Grant
Filed: April 15, 2019
Date of Patent: August 30, 2022
Assignee: Oracle International Corporation
Inventors: Anatoly Yakovlev, Venkatanathan Varadarajan, Sandeep Agrawal, Hesam Fathi Moghadam, Sam Idicula, Nipun Agarwal
-
Patent number: 11423022
Abstract: Techniques are described herein for building a framework for declarative query compilation using both rule-based and cost-based approaches for database management. The framework involves constructing and using: a set of rule-based properties tables that contain optimization parameters for both logical and physical optimization, a recursive algorithm to form candidate physical query plans that is based on the rule based tables, and a cost model for estimating the cost of a generated physical query plan that is used with the rule based properties tables to prune inferior query plans.
Type: Grant
Filed: June 25, 2018
Date of Patent: August 23, 2022
Assignee: Oracle International Corporation
Inventors: Jian Wen, Sam Idicula, Nitin Kunal, Farhan Tauheed, Seema Sundara, Nipun Agarwal, Indu Bhagat
-
Publication number: 20220138199
Abstract: Embodiments utilize trained query performance machine learning (QP-ML) models to predict an optimal compute node cluster size for a given in-memory workload. The QP-ML models include models that predict query task runtimes at various compute node cardinalities, and models that predict network communication time between nodes of the cluster. Embodiments also utilize an analytical model to predict overlap between predicted task runtimes and predicted network communication times. Based on this data, an optimal cluster size is selected for the workload. Embodiments further utilize trained data capacity machine learning (DC-ML) models to predict a minimum number of compute nodes needed to run a workload. The DC-ML models include models that predict the size of the workload dataset in a target data encoding, models that predict the amount of memory needed to run the queries in the workload, and models that predict the memory needed to accommodate changes to the dataset.
Type: Application
Filed: January 12, 2022
Publication date: May 5, 2022
Inventors: Sam Idicula, Tomas Karnagel, Jian Wen, Seema Sundara, Nipun Agarwal, Mayur Bency
-
Patent number: 11256698
Abstract: Embodiments utilize trained query performance machine learning (QP-ML) models to predict an optimal compute node cluster size for a given in-memory workload. The QP-ML models include models that predict query task runtimes at various compute node cardinalities, and models that predict network communication time between nodes of the cluster. Embodiments also utilize an analytical model to predict overlap between predicted task runtimes and predicted network communication times. Based on this data, an optimal cluster size is selected for the workload. Embodiments further utilize trained data capacity machine learning (DC-ML) models to predict a minimum number of compute nodes needed to run a workload. The DC-ML models include models that predict the size of the workload dataset in a target data encoding, models that predict the amount of memory needed to run the queries in the workload, and models that predict the memory needed to accommodate changes to the dataset.
Type: Grant
Filed: April 11, 2019
Date of Patent: February 22, 2022
Assignee: Oracle International Corporation
Inventors: Sam Idicula, Tomas Karnagel, Jian Wen, Seema Sundara, Nipun Agarwal, Mayur Bency
-
Publication number: 20220027746
Abstract: Herein, horizontally scalable techniques efficiently configure machine learning algorithms for optimal accuracy and without informed inputs. In an embodiment, for each particular hyperparameter, and for each epoch, a computer processes the particular hyperparameter. An epoch explores one hyperparameter based on hyperparameter tuples. A respective score is calculated from each tuple. The tuple contains a distinct combination of values, each of which is contained in a value range of a distinct hyperparameter. All values of a tuple that belong to the particular hyperparameter are distinct. All values of a tuple that belong to other hyperparameters are held constant. The value range of the particular hyperparameter is narrowed based on an intersection point of a first line based on the scores and a second line based on the scores. A machine learning algorithm is optimally configured from repeatedly narrowed value ranges of hyperparameters. The configured algorithm is invoked to obtain a result.
Type: Application
Filed: October 13, 2021
Publication date: January 27, 2022
Inventors: Venkatanathan Varadarajan, Sam Idicula, Sandeep Agrawal, Nipun Agarwal