Patents by Inventor Onur Kocberber
Onur Kocberber has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12014286
Abstract: Herein are approaches for self-optimization of a database management system (DBMS) such as in real time. Adaptive just-in-time sampling techniques herein estimate database content statistics that a machine learning (ML) model may use to predict configuration settings that conserve computer resources such as execution time and storage space. In an embodiment, a computer repeatedly samples database content until a dynamic convergence criterion is satisfied. In each iteration of a series of sampling iterations, a subset of rows of a database table are sampled, and estimates of content statistics of the database table are adjusted based on the sampled subset of rows. Immediately or eventually after detecting dynamic convergence, a machine learning (ML) model predicts, based on the content statistic estimates, an optimal value for a configuration setting of the DBMS.
Type: Grant
Filed: June 29, 2020
Date of Patent: June 18, 2024
Assignee: Oracle International Corporation
Inventors: Farhan Tauheed, Onur Kocberber, Tomas Karnagel, Nipun Agarwal
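As a rough illustration of the sampling loop described in the abstract of patent 12014286, the Python sketch below repeatedly samples rows, updates a running estimate of one content statistic, and stops once successive estimates agree within a tolerance. The function name, the choice of statistic (a column mean), the fixed batch size, and the relative-tolerance stopping rule are all assumptions made for illustration. The abstract does not specify the patent's dynamic convergence criterion, and the ML prediction step is omitted.

```python
import random

def sample_until_converged(rows, batch_size=100, rel_tol=0.01, max_iters=1000, seed=0):
    """Hypothetical sketch: estimate the mean of `rows` by iterative sampling.

    Returns (estimate, iterations_used). The stopping rule is an assumption,
    not the criterion claimed in the patent.
    """
    rng = random.Random(seed)
    total, count = 0.0, 0
    previous = None
    estimate = None
    for iteration in range(1, max_iters + 1):
        # Sample a subset of rows (with replacement, for simplicity).
        batch = [rng.choice(rows) for _ in range(batch_size)]
        total += sum(batch)
        count += len(batch)
        # Adjust the running estimate based on the newly sampled rows.
        estimate = total / count
        # Declare convergence when the estimate has stopped moving.
        if previous is not None and abs(estimate - previous) <= rel_tol * max(abs(previous), 1e-12):
            return estimate, iteration
        previous = estimate
    return estimate, max_iters
```

In a fuller pipeline the converged estimates would be passed as features to a trained model that predicts the configuration setting. That step depends on details the abstract does not give.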
-
Patent number: 11907250
Abstract: Techniques are described for executing machine learning models trained for specific operators with feature values that are based on the actual execution of a workload set. The machine learning models generate an estimate of benefit gain/cost for executing operations on data portions in the alternative encoding format. Such data portions may be sorted based on the estimated benefit, in an embodiment. Using cost estimation machine learning models for memory space, the data portions with the most benefits that comply with the existing memory space constraints are recommended and/or are automatically encoded into the alternative encoding format.
Type: Grant
Filed: July 22, 2022
Date of Patent: February 20, 2024
Assignee: Oracle International Corporation
Inventors: Urvashi Oswal, Marc Jolles, Onur Kocberber, Seema Sundara, Nipun Agarwal
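The selection step in the abstract of patent 11907250 (sort data portions by estimated benefit, then recommend those that fit the memory constraint) can be sketched as a simple greedy pass. This is a minimal sketch under stated assumptions: the function name and tuple layout are hypothetical, the benefit and memory-cost numbers stand in for the outputs of the ML models, and the greedy strategy is an illustrative choice that the abstract does not confirm.

```python
def recommend_portions(portions, memory_budget):
    """Hypothetical sketch: pick data portions to re-encode.

    `portions` is a list of (name, estimated_benefit, estimated_memory_cost)
    tuples. In the patent these estimates come from ML models; here they are
    plain numbers supplied by the caller.
    """
    chosen, used = [], 0
    # Consider the most beneficial portions first.
    for name, benefit, cost in sorted(portions, key=lambda p: p[1], reverse=True):
        # Skip portions with no expected gain or that would exceed the budget.
        if benefit > 0 and used + cost <= memory_budget:
            chosen.append(name)
            used += cost
    return chosen, used
```

For example, with portions `[("a", 5.0, 40), ("b", 9.0, 70), ("c", 3.0, 20), ("d", -1.0, 5)]` and a budget of 100, the sketch selects `b` and `c`, using 90 units of memory.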
-
Publication number: 20240028605
Abstract: Techniques are described for executing machine learning models trained for specific operators with feature values that are based on the actual execution of a workload set. The machine learning models generate an estimate of benefit gain/cost for executing operations on data portions in the alternative encoding format. Such data portions may be sorted based on the estimated benefit, in an embodiment. Using cost estimation machine learning models for memory space, the data portions with the most benefits that comply with the existing memory space constraints are recommended and/or are automatically encoded into the alternative encoding format.
Type: Application
Filed: July 22, 2022
Publication date: January 25, 2024
Inventors: Urvashi Oswal, Marc Jolles, Onur Kocberber, Seema Sundara, Nipun Agarwal
-
Patent number: 11868261
Abstract: Techniques are described herein for prediction of a buffer pool size (BPS). Before performing BPS prediction, gathered data are used to determine whether a target workload is in a steady state. Historical utilization data gathered while the workload is in a steady state are used to predict object-specific BPS components for database objects, accessed by the target workload, that are identified for BPS analysis based on shares of the total disk I/O requests, for the workload, that are attributed to the respective objects. Preference of analysis is given to objects that are associated with larger shares of disk I/O activity. An object-specific BPS component is determined based on a coverage function that returns a percentage of the database object size (on disk) that should be available in the buffer pool for that database object. The percentage is determined using either a heuristic-based or a machine learning-based approach.
Type: Grant
Filed: July 20, 2021
Date of Patent: January 9, 2024
Assignee: Oracle International Corporation
Inventors: Peyman Faizian, Mayur Bency, Onur Kocberber, Seema Sundara, Nipun Agarwal
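To make the arithmetic in the abstract of patent 11868261 concrete, the sketch below ranks database objects by their share of disk I/O, applies a coverage function to each selected object, and sums the object-specific components. Everything named here is an assumption for illustration: the dictionary keys, the `io_share_threshold` cutoff used to decide which objects are analyzed, and the caller-supplied `coverage` callable (the patent determines coverage heuristically or with an ML model). Steady-state detection is omitted.

```python
def predict_buffer_pool_size(objects, coverage, io_share_threshold=0.9):
    """Hypothetical sketch: sum object-specific buffer pool components.

    `objects` is a list of dicts with assumed keys "io_requests" and
    "size_bytes". `coverage(obj)` returns the fraction (0.0 to 1.0) of the
    object's on-disk size that should be resident in the buffer pool.
    """
    total_io = sum(o["io_requests"] for o in objects)
    if total_io == 0:
        return 0.0
    # Give preference to objects with larger shares of disk I/O.
    ranked = sorted(objects, key=lambda o: o["io_requests"], reverse=True)
    bps, covered_share = 0.0, 0.0
    for obj in ranked:
        # Stop once the analyzed objects account for enough of the I/O.
        if covered_share >= io_share_threshold:
            break
        bps += coverage(obj) * obj["size_bytes"]
        covered_share += obj["io_requests"] / total_io
    return bps
```

With a constant coverage of 0.5, an object of 1000 bytes carrying 70% of the I/O and one of 400 bytes carrying 25% together give a predicted size of 700 bytes. A large but rarely read object carrying the remaining 5% is left out of the analysis.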
-
Publication number: 20230297573
Abstract: Embodiments implement a prediction-driven, rather than a trial-driven, approach to automatic data placement recommendations for partitioning data across multiple nodes in a database system. The system is configured to extract workload-specific features of a database workload running at a database system and dataset-specific features of a database running on the database system. The workload-specific features characterize utilization of the database workload. The dataset-specific features characterize how data is organized within the database. The system identifies a plurality of candidate keys for determining how to partition data stored in the database across nodes. Based at least in part on the workload-specific features, the dataset-specific features, and the plurality of candidate keys, a set of candidate key combinations for partitioning data is generated.
Type: Application
Filed: March 21, 2022
Publication date: September 21, 2023
Inventors: Urvashi Oswal, Jian Wen, Farhan Tauheed, Onur Kocberber, Seema Sundara, Nipun Agarwal
-
Patent number: 11657256
Abstract: Embodiments use a hierarchy of machine learning models to predict datacenter behavior at multiple hardware levels of a datacenter without accessing operating system generated hardware utilization information. The accuracy of higher-level models in the hierarchy of models is increased by including, as input to the higher-level models, hardware utilization predictions from lower-level models. The hierarchy of models includes: server utilization models and workload/OS prediction models that produce predictions at a server device-level of a datacenter; and also top-of-rack switch models and backbone switch models that produce predictions at higher levels of the datacenter. These models receive, as input, hardware utilization information from non-OS sources. Based on datacenter-level network utilization predictions from the hierarchy of models, the datacenter automatically configures its hardware to avoid any predicted over-utilization of hardware in the datacenter.
Type: Grant
Filed: July 18, 2022
Date of Patent: May 23, 2023
Assignee: Oracle International Corporation
Inventors: Pravin Shinde, Felix Schmidt, Onur Kocberber
-
Patent number: 11620547
Abstract: Techniques for estimating the number of distinct values in a data set using machine learning are provided. In one technique, a sample of a data set is retrieved where the sample is a strict subset of the data set. The sample is analyzed to identify feature values of multiple features of the sample. The feature values are inserted into a machine-learned model that computes a prediction regarding a number of distinct values in the data set. An estimated number of distinct values that is based on the prediction is stored in association with the data set.
Type: Grant
Filed: May 19, 2020
Date of Patent: April 4, 2023
Assignee: Oracle International Corporation
Inventors: Tomas Karnagel, Onur Kocberber, Farhan Tauheed, Nipun Agarwal
-
Patent number: 11579951
Abstract: Techniques are described herein for predicting disk drive failure using a machine learning model. The framework involves receiving disk drive sensor attributes as training data, preprocessing the training data to select a set of enhanced feature sequences, and using the enhanced feature sequences to train a machine learning model to predict disk drive failures from disk drive sensor monitoring data. Prior to the training phase, the RNN LSTM model is tuned using a set of predefined hyper-parameters. The preprocessing, which is performed during the training and evaluation phase as well as later during the prediction phase, involves using predefined values for a set of parameters to generate the set of enhanced sequences from raw sensor readings. The enhanced feature sequences are generated to maintain a desired healthy/failed disk ratio, and only use samples leading up to a last-valid-time sample in order to honor a pre-specified heads-up-period alert requirement.
Type: Grant
Filed: September 27, 2018
Date of Patent: February 14, 2023
Assignee: Oracle International Corporation
Inventors: Onur Kocberber, Felix Schmidt, Arun Raghavan, Nipun Agarwal, Sam Idicula, Guang-Tong Zhou, Nitin Kunal
-
Publication number: 20230022884
Abstract: Techniques are described herein for prediction of a buffer pool size (BPS). Before performing BPS prediction, gathered data are used to determine whether a target workload is in a steady state. Historical utilization data gathered while the workload is in a steady state are used to predict object-specific BPS components for database objects, accessed by the target workload, that are identified for BPS analysis based on shares of the total disk I/O requests, for the workload, that are attributed to the respective objects. Preference of analysis is given to objects that are associated with larger shares of disk I/O activity. An object-specific BPS component is determined based on a coverage function that returns a percentage of the database object size (on disk) that should be available in the buffer pool for that database object. The percentage is determined using either a heuristic-based or a machine learning-based approach.
Type: Application
Filed: July 20, 2021
Publication date: January 26, 2023
Inventors: Peyman Faizian, Mayur Bency, Onur Kocberber, Seema Sundara, Nipun Agarwal
-
Method for generating rulesets using tree-based models for black-box machine learning explainability
Patent number: 11531915
Abstract: Herein are techniques to generate candidate rulesets for machine learning (ML) explainability (MLX) for black-box ML models. In an embodiment, an ML model generates classifications that each associates a distinct example with a label. A decision tree that, based on the classifications, contains tree nodes is received or generated. Each node contains label(s), a condition that identifies a feature of examples, and a split value for the feature. When a node has child nodes, the feature and the split value that are identified by the condition of the node are set to maximize information gain of the child nodes. Candidate rules are generated by traversing the tree. Each rule is built from a combination of nodes in a tree traversal path. Each rule contains a condition of at least one node and is assigned to a rule level. Candidate rules are subsequently optimized into an optimal ruleset for actual use.
Type: Grant
Filed: March 20, 2019
Date of Patent: December 20, 2022
Assignee: Oracle International Corporation
Inventors: Tayler Hetherington, Zahra Zohrevand, Onur Kocberber, Karoon Rashedi Nia, Sam Idicula, Nipun Agarwal
-
Patent number: 11520834
Abstract: Techniques are described for generating an approximate frequency histogram using a series of Bloom filters (BF). For example, to estimate the f1 and f2 cardinalities in a dataset, an ordered chain of three BFs is established (“BF1”, “BF2”, and “BF3”). An insertion operation is performed for each datum in the dataset, whereby the BFs are tested in order (starting at BF1) for the datum. If the datum is represented in a currently-tested BF, the subsequent BF in the chain is tested for the datum. If the datum is not represented in the currently-tested BF, the datum is added to the BF, a counter for the BF is incremented, and the insertion operation for the current datum ends. To estimate the cardinality of f1-values in the dataset, the BF2-counter is subtracted from the BF1-counter. Similarly, to estimate the cardinality of f2-values in the dataset, the BF3-counter is subtracted from the BF2-counter.
Type: Grant
Filed: July 28, 2021
Date of Patent: December 6, 2022
Assignee: Oracle International Corporation
Inventors: Tomas Karnagel, Suratna Budalakoti, Onur Kocberber, Nipun Agarwal, Alan Wood
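The insertion procedure in the abstract of patent 11520834 can be sketched in a few lines of Python. The chain logic and the counter subtraction follow the abstract. The `BloomFilter` class itself (bit-array size, number of hash functions, SHA-256-derived positions) is a hypothetical minimal implementation chosen for illustration, not the one used in the patent. Here f1 and f2 are taken to mean the number of values that occur exactly once and exactly twice.

```python
import hashlib

class BloomFilter:
    """Minimal illustrative Bloom filter; sizing parameters are assumptions."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)
        self.counter = 0  # how many data were added to this filter

    def _positions(self, item):
        # Derive several bit positions by salting a SHA-256 hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)
        self.counter += 1

def estimate_f1_f2(data):
    """Estimate how many values occur exactly once (f1) and exactly twice (f2)."""
    chain = [BloomFilter() for _ in range(3)]  # BF1, BF2, BF3
    for datum in data:
        for bf in chain:
            if datum not in bf:
                # First filter that has not seen the datum: record it and stop.
                bf.add(datum)
                break
            # Otherwise the datum is (probably) present; test the next filter.
    c1, c2, c3 = (bf.counter for bf in chain)
    return c1 - c2, c2 - c3
```

For the input `["a", "b", "b", "c", "c", "c", "d"]`, BF1 records four values, BF2 two, and BF3 one, so the estimate is f1 = 2 and f2 = 1. Because Bloom filters admit false positives, results on large inputs are approximate.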
-
Publication number: 20220351023
Abstract: Embodiments use a hierarchy of machine learning models to predict datacenter behavior at multiple hardware levels of a datacenter without accessing operating system generated hardware utilization information. The accuracy of higher-level models in the hierarchy of models is increased by including, as input to the higher-level models, hardware utilization predictions from lower-level models. The hierarchy of models includes: server utilization models and workload/OS prediction models that produce predictions at a server device-level of a datacenter; and also top-of-rack switch models and backbone switch models that produce predictions at higher levels of the datacenter. These models receive, as input, hardware utilization information from non-OS sources. Based on datacenter-level network utilization predictions from the hierarchy of models, the datacenter automatically configures its hardware to avoid any predicted over-utilization of hardware in the datacenter.
Type: Application
Filed: July 18, 2022
Publication date: November 3, 2022
Inventors: Pravin Shinde, Felix Schmidt, Onur Kocberber
-
Patent number: 11443166
Abstract: Embodiments use a hierarchy of machine learning models to predict datacenter behavior at multiple hardware levels of a datacenter without accessing operating system generated hardware utilization information. The accuracy of higher-level models in the hierarchy of models is increased by including, as input to the higher-level models, hardware utilization predictions from lower-level models. The hierarchy of models includes: server utilization models and workload/OS prediction models that produce predictions at a server device-level of a datacenter; and also top-of-rack switch models and backbone switch models that produce predictions at higher levels of the datacenter. These models receive, as input, hardware utilization information from non-OS sources. Based on datacenter-level network utilization predictions from the hierarchy of models, the datacenter automatically configures its hardware to avoid any predicted over-utilization of hardware in the datacenter.
Type: Grant
Filed: October 29, 2018
Date of Patent: September 13, 2022
Assignee: Oracle International Corporation
Inventors: Pravin Shinde, Felix Schmidt, Onur Kocberber
-
Patent number: 11423327
Abstract: Techniques are described herein for estimating CPU, memory, and I/O utilization for a workload via out-of-band sensor readings using a machine learning model. The framework involves receiving sensor data associated with executing benchmark applications, obtaining ground truth utilization values for the benchmarks, preprocessing the training data to select a set of enhanced sequences, and using the enhanced sequences to train a random forest model to estimate CPU, memory, and I/O utilization given sensor monitoring data. Prior to the training phase, a machine learning model is trained using a set of predefined hyper-parameters. The trained models are used to generate estimations for CPU, memory, and I/O utilization values. The utilization values are used with workload context information to assess the deployment and generate one or more recommendations for machine types that will best serve the workload in terms of system utilization.
Type: Grant
Filed: October 10, 2018
Date of Patent: August 23, 2022
Assignee: Oracle International Corporation
Inventors: Onur Kocberber, Felix Schmidt, Craig Schelp, Andrew Brownsword, Nipun Agarwal
-
Patent number: 11379456
Abstract: Systems and methods for adjusting parameters for a spin-lock implementation of concurrency control are described herein. In an embodiment, a system continuously retrieves, from a resource management system, one or more state values defining a state of the resource management system. Based on the one or more state values, the system determines that the resource management system has reached a steady state and, in response, adjusts a plurality of parameters for spin-locking performed by said resource management system to identify optimal values for the plurality of parameters. After adjusting the plurality of parameters, the system detects, based on one or more current state values, a workload change in the resource management system and, in response, readjusts the plurality of parameters for spin-locking performed by said resource management system to identify new optimal values for the parameters.
Type: Grant
Filed: October 1, 2020
Date of Patent: July 5, 2022
Assignee: Oracle International Corporation
Inventors: Onur Kocberber, Mayur Bency, Marc Jolles, Seema Sundara, Nipun Agarwal
-
Publication number: 20220107933
Abstract: Systems and methods for adjusting parameters for a spin-lock implementation of concurrency control are described herein. In an embodiment, a system continuously retrieves, from a resource management system, one or more state values defining a state of the resource management system. Based on the one or more state values, the system determines that the resource management system has reached a steady state and, in response, adjusts a plurality of parameters for spin-locking performed by said resource management system to identify optimal values for the plurality of parameters. After adjusting the plurality of parameters, the system detects, based on one or more current state values, a workload change in the resource management system and, in response, readjusts the plurality of parameters for spin-locking performed by said resource management system to identify new optimal values for the parameters.
Type: Application
Filed: October 1, 2020
Publication date: April 7, 2022
Inventors: Onur Kocberber, Mayur Bency, Marc Jolles, Seema Sundara, Nipun Agarwal
-
Publication number: 20210406717
Abstract: Herein are approaches for self-optimization of a database management system (DBMS) such as in real time. Adaptive just-in-time sampling techniques herein estimate database content statistics that a machine learning (ML) model may use to predict configuration settings that conserve computer resources such as execution time and storage space. In an embodiment, a computer repeatedly samples database content until a dynamic convergence criterion is satisfied. In each iteration of a series of sampling iterations, a subset of rows of a database table are sampled, and estimates of content statistics of the database table are adjusted based on the sampled subset of rows. Immediately or eventually after detecting dynamic convergence, a machine learning (ML) model predicts, based on the content statistic estimates, an optimal value for a configuration setting of the DBMS.
Type: Application
Filed: June 29, 2020
Publication date: December 30, 2021
Inventors: Farhan Tauheed, Onur Kocberber, Tomas Karnagel, Nipun Agarwal
-
Publication number: 20210365805
Abstract: Techniques for estimating the number of distinct values in a data set using machine learning are provided. In one technique, a sample of a data set is retrieved where the sample is a strict subset of the data set. The sample is analyzed to identify feature values of multiple features of the sample. The feature values are inserted into a machine-learned model that computes a prediction regarding a number of distinct values in the data set. An estimated number of distinct values that is based on the prediction is stored in association with the data set.
Type: Application
Filed: May 19, 2020
Publication date: November 25, 2021
Inventors: Tomas Karnagel, Onur Kocberber, Farhan Tauheed, Nipun Agarwal
-
Patent number: 11082438
Abstract: Techniques are provided herein for contextual embedding of features of operational logs or network traffic for anomaly detection based on sequence prediction. In an embodiment, a computer has a predictive recurrent neural network (RNN) that detects an anomalous network flow. In an embodiment, an RNN contextually transcodes sparse feature vectors that represent log messages into dense feature vectors that may be predictive or used to generate predictive vectors. In an embodiment, graph embedding improves feature embedding of log traces. In an embodiment, a computer detects and feature-encodes independent traces from related log messages. These techniques may detect malicious activity by anomaly analysis of context-aware feature embeddings of network packet flows, log messages, and/or log traces.
Type: Grant
Filed: September 5, 2018
Date of Patent: August 3, 2021
Assignee: Oracle International Corporation
Inventors: Juan Fernandez Peinador, Manel Fernandez Gomez, Guang-Tong Zhou, Hossein Hajimirsadeghi, Andrew Brownsword, Onur Kocberber, Felix Schmidt, Craig Schelp
-
Patent number: 10892961
Abstract: Herein are computerized techniques for autonomous and artificially intelligent administration of a computer cloud health monitoring system. In an embodiment, an orchestration computer automatically detects a current state of network elements of a computer network by processing: a) a network plan that defines a topology of the computer network, and b) performance statistics of the network elements. The network elements include computers that each hosts virtual execution environment(s). Each virtual execution environment hosts analysis logic that transforms raw performance data of a network element into a portion of the performance statistics. For each computer, a configuration specification for each virtual execution environment of the computer is automatically generated based on the network plan and the current state of the computer network. At least one virtual execution environment is automatically tuned and/or re-provisioned based on a generated configuration specification.
Type: Grant
Filed: February 8, 2019
Date of Patent: January 12, 2021
Assignee: Oracle International Corporation
Inventors: Onur Kocberber, Felix Schmidt, Craig Schelp, Pravin Shinde