Patents by Inventor Nipun Agarwal

Nipun Agarwal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ANOMALY DETECTION ON SEQUENTIAL LOG DATA USING A RESIDUAL NEURAL NETWORK

Publication number: 20220108181

Abstract: A multilayer perceptron herein contains an already-trained combined sequence of residual blocks that contains a semantic sequence of residual blocks and a contextual sequence of residual blocks. The semantic sequence of residual blocks contains a semantic sequence of layers of an autoencoder. The contextual sequence of residual blocks contains a contextual sequence of layers of a recurrent neural network. Each residual block of the combined sequence of residual blocks is used based on a respective survival probability. By the autoencoder and based on the using each residual block of the semantic sequence, a previous entry of a log is semantically encoded. By the recurrent neural network and based on the using each residual block of the contextual sequence, a next entry of the log is predicted. In an embodiment during training, survival probabilities are hyperparameters that are learned and used to probabilistically skip residual blocks such that the multilayer perceptron has stochastic depth.
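The skipping behavior described in this abstract can be illustrated with a minimal sketch of stochastic depth. It follows the commonly used formulation (skip a block during training with probability 1 minus its survival probability, and scale the block's output by that probability at inference), which is an assumption here and not necessarily the scheme claimed in the application. The names `residual_block` and `forward`, and the use of plain Python lists as vectors, are hypothetical.

```python
import random

def residual_block(x, layer, survival_prob, rng, training):
    """Apply one residual block with stochastic depth (illustrative only)."""
    if training:
        # During training the block is skipped with probability 1 - survival_prob.
        if rng.random() >= survival_prob:
            return list(x)
        return [xi + fi for xi, fi in zip(x, layer(x))]
    # At inference every block is kept, but its output is scaled by survival_prob.
    return [xi + survival_prob * fi for xi, fi in zip(x, layer(x))]

def forward(x, blocks, rng, training=True):
    """Run x through a sequence of (layer, survival_prob) pairs."""
    for layer, survival_prob in blocks:
        x = residual_block(x, layer, survival_prob, rng, training)
    return x

# Hypothetical blocks: the first always survives, the second is always skipped.
blocks = [
    (lambda v: [2.0 * a for a in v], 1.0),
    (lambda v: [1.0 for _ in v], 0.0),
]
trained_output = forward([1.0, 2.0], blocks, random.Random(0), training=True)
inference_output = forward([1.0, 2.0], blocks, random.Random(0), training=False)
```

With survival probabilities of exactly 1.0 and 0.0 the result is deterministic, which makes the skip path easy to check by hand; intermediate values give a network whose effective depth varies from one training step to the next.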

Type: Application

Filed: October 7, 2020

Publication date: April 7, 2022

Inventors: HAMED AHMADI, SAEID ALLAHDADIAN, MATTEO CASSERINI, MILOS VASIC, AMIN SUZANI, FELIX SCHMIDT, ANDREW BROWNSWORD, NIPUN AGARWAL

Automated provisioning for database performance

Patent number: 11256698

Abstract: Embodiments utilize trained query performance machine learning (QP-ML) models to predict an optimal compute node cluster size for a given in-memory workload. The QP-ML models include models that predict query task runtimes at various compute node cardinalities, and models that predict network communication time between nodes of the cluster. Embodiments also utilize an analytical model to predict overlap between predicted task runtimes and predicted network communication times. Based on this data, an optimal cluster size is selected for the workload. Embodiments further utilize trained data capacity machine learning (DC-ML) models to predict a minimum number of compute nodes needed to run a workload. The DC-ML models include models that predict the size of the workload dataset in a target data encoding, models that predict the amount of memory needed to run the queries in the workload, and models that predict the memory needed to accommodate changes to the dataset.

Type: Grant

Filed: April 11, 2019

Date of Patent: February 22, 2022

Assignee: Oracle International Corporation

Inventors: Sam Idicula, Tomas Karnagel, Jian Wen, Seema Sundara, Nipun Agarwal, Mayur Bency

MEMORY USAGE PREDICTION FOR MACHINE LEARNING AND DEEP LEARNING MODELS

Publication number: 20220043681

Abstract: Herein, a computer receives a new training dataset for a target ML model. Proven or unproven respective values of hyperparameters of the target ML model are selected. An already-trained ML metamodel predicts an amount of memory that the target ML model will need, when configured with the respective values of the hyperparameters, to train with the new training dataset. In an embodiment, supervised training of the ML metamodel is as follows. The ML metamodel receives feature vectors that each contain distinct details of a respective past training of the target ML model of many and varied trainings of the target ML model. Those distinct details of each past training include: respective values of the hyperparameters, and respective values of metafeatures of a respective training dataset of many training datasets. Each feature vector is labeled with a respective amount of memory that the target ML model needed during the respective past training.

Type: Application

Filed: August 4, 2020

Publication date: February 10, 2022

Inventors: Ali Moharrer, Sandeep R. Agrawal, Venkatanathan Varadarajan, Sanjay Jinturkar, Nipun Agarwal

Personal information indexing for columnar data storage format

Patent number: 11238035

Abstract: Techniques are described herein for indexing personal information in columnar data storage format based files. In an embodiment, row groups of rows that comprise a plurality of columns are stored in a set of files. Each column of a row group is stored in a chunk of column pages in the set of files. A regular expression index that indexes a particular column in the set of files is stored for each row group. The regular expression index identifies column pages in the chunk of the particular column that include a particular column value that satisfies a regular expression specified in a query. The regular expression specified in the query is evaluated against the particular column using the regular expression index.

Type: Grant

Filed: March 10, 2020

Date of Patent: February 1, 2022

Assignee: ORACLE INTERNATIONAL CORPORATION

Inventors: Hamed Ahmadi, Jian Wen, Shrikumar Hariharasubrahmanian, Sanjay Jinturkar, Nipun Agarwal

GRADIENT-BASED AUTO-TUNING FOR MACHINE LEARNING AND DEEP LEARNING MODELS

Publication number: 20220027746

Abstract: Herein, horizontally scalable techniques efficiently configure machine learning algorithms for optimal accuracy and without informed inputs. In an embodiment, for each particular hyperparameter, and for each epoch, a computer processes the particular hyperparameter. An epoch explores one hyperparameter based on hyperparameter tuples. A respective score is calculated from each tuple. The tuple contains a distinct combination of values, each of which is contained in a value range of a distinct hyperparameter. All values of a tuple that belong to the particular hyperparameter are distinct. All values of a tuple that belong to other hyperparameters are held constant. The value range of the particular hyperparameter is narrowed based on an intersection point of a first line based on the scores and a second line based on the scores. A machine learning algorithm is optimally configured from repeatedly narrowed value ranges of hyperparameters. The configured algorithm is invoked to obtain a result.

Type: Application

Filed: October 13, 2021

Publication date: January 27, 2022

Inventors: Venkatanathan Varadarajan, Sam Idicula, Sandeep Agrawal, Nipun Agarwal

PROBABILISTIC TEXT INDEX FOR SEMI-STRUCTURED DATA IN COLUMNAR ANALYTICS STORAGE FORMATS

Publication number: 20220019784

Abstract: Herein is a probabilistic indexing technique for searching semi-structured text documents in columnar storage formats such as Parquet, using columnar input/output (I/O) avoidance, and needing minimal storage overhead. In an embodiment, a computer associates columns with text strings that occur in semi-structured documents. Text words that occur in the text strings are detected. Respectively for each text word, a bitmap, of a plurality of bitmaps, that contains a respective bit for each column is generated. Based on at least one of the bitmaps, some of the columns or some of the semi-structured documents are accessed.
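The per-word bitmap described in this abstract can be sketched as follows. This is an illustrative, exact (not probabilistic) version: each word maps to an integer whose bit i is set when the word occurs in column i, so a lookup reveals which columns are worth reading. The function names, the tokenization rule, and the sample data are assumptions made for the example and do not come from the application.

```python
import re
from collections import defaultdict

def build_word_bitmaps(column_texts):
    """Build one bitmap per word, with one bit per column.

    column_texts maps a column name to the list of text strings stored in it.
    Returns (columns, bitmaps); bit i of bitmaps[word] is set when the word
    occurs in columns[i].
    """
    columns = sorted(column_texts)
    bitmaps = defaultdict(int)
    for i, column in enumerate(columns):
        for text in column_texts[column]:
            # Assumed tokenization: lowercase runs of word characters.
            for word in re.findall(r"\w+", text.lower()):
                bitmaps[word] |= 1 << i
    return columns, dict(bitmaps)

def columns_to_read(word, columns, bitmaps):
    """Return only the columns whose bit is set for the given word."""
    bits = bitmaps.get(word.lower(), 0)
    return [column for i, column in enumerate(columns) if (bits >> i) & 1]

# Hypothetical semi-structured documents flattened into two text columns.
columns, bitmaps = build_word_bitmaps({
    "title": ["Disk failure"],
    "body": ["disk replaced", "fan ok"],
})
fan_columns = columns_to_read("fan", columns, bitmaps)
disk_columns = columns_to_read("disk", columns, bitmaps)
```

A search for a word whose bitmap is zero can skip every column, which is the I/O avoidance the abstract refers to.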

Type: Application

Filed: July 15, 2020

Publication date: January 20, 2022

Inventors: Jian Wen, Hamed Ahmadi, Sanjay Jinturkar, Nipun Agarwal, Lijian Wan, Shrikumar Hariharasubrahmanian

Context-aware feature embedding and anomaly detection of sequential log data using deep recurrent neural networks

Patent number: 11218498

Abstract: Techniques are provided herein for contextual embedding of features of operational logs or network traffic for anomaly detection based on sequence prediction. In an embodiment, a computer has a predictive recurrent neural network (RNN) that detects an anomalous network flow. In an embodiment, an RNN contextually transcodes sparse feature vectors that represent log messages into dense feature vectors that may be predictive or used to generate predictive vectors. In an embodiment, graph embedding improves feature embedding of log traces. In an embodiment, a computer detects and feature-encodes independent traces from related log messages. These techniques may detect malicious activity by anomaly analysis of context-aware feature embeddings of network packet flows, log messages, and/or log traces.

Type: Grant

Filed: September 5, 2018

Date of Patent: January 4, 2022

Assignee: Oracle International Corporation

Inventors: Hossein Hajimirsadeghi, Guang-Tong Zhou, Andrew Brownsword, Nipun Agarwal, Pavan Chandrashekar, Karoon Rashedi Nia

ENABLING EFFICIENT MACHINE LEARNING MODEL INFERENCE USING ADAPTIVE SAMPLING FOR AUTONOMOUS DATABASE SERVICES

Publication number: 20210406717

Abstract: Herein are approaches for self-optimization of a database management system (DBMS) such as in real time. Adaptive just-in-time sampling techniques herein estimate database content statistics that a machine learning (ML) model may use to predict configuration settings that conserve computer resources such as execution time and storage space. In an embodiment, a computer repeatedly samples database content until a dynamic convergence criterion is satisfied. In each iteration of a series of sampling iterations, a subset of rows of a database table are sampled, and estimates of content statistics of the database table are adjusted based on the sampled subset of rows. Immediately or eventually after detecting dynamic convergence, a machine learning (ML) model predicts, based on the content statistic estimates, an optimal value for a configuration setting of the DBMS.
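The sampling loop in this abstract (sample a subset of rows, update the statistic estimates, stop once they converge) can be sketched as below. The statistic (a column mean), the fixed relative tolerance used as the stopping rule, and all names and parameters are assumptions chosen for illustration; the application describes a dynamic convergence criterion that this sketch does not reproduce.

```python
import random

def estimate_mean_adaptively(rows, batch_size=100, tolerance=0.01,
                             max_iters=1000, seed=0):
    """Sample batches of rows until the running mean stops changing.

    Returns (estimate, rows_sampled). Convergence here means the relative
    change between consecutive estimates is at most `tolerance`.
    """
    rng = random.Random(seed)
    total, count, estimate = 0.0, 0, None
    for _ in range(max_iters):
        # Sample a subset of rows (with replacement, for simplicity).
        batch = [rng.choice(rows) for _ in range(batch_size)]
        total += sum(batch)
        count += len(batch)
        new_estimate = total / count
        if estimate is not None:
            change = abs(new_estimate - estimate)
            if change <= tolerance * max(abs(estimate), 1e-12):
                return new_estimate, count
        estimate = new_estimate
    return estimate, count

# A constant column converges on the second batch.
constant_estimate, constant_rows = estimate_mean_adaptively([5.0] * 1000)
```

The returned estimate would then be passed as a feature to whatever model predicts the configuration setting; that model is outside the scope of this sketch.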

Type: Application

Filed: June 29, 2020

Publication date: December 30, 2021

Inventors: Farhan Tauheed, Onur Kocberber, Tomas Karnagel, Nipun Agarwal

FAST, PREDICTIVE, AND ITERATION-FREE AUTOMATED MACHINE LEARNING PIPELINE

Publication number: 20210390466

Abstract: A proxy-based automatic non-iterative machine learning (PANI-ML) pipeline is described, which predicts machine learning model configuration performance and outputs an automatically-configured machine learning model for a target training dataset. Techniques described herein use one or more proxy models—which implement a variety of machine learning algorithms and are pre-configured with tuned hyperparameters—to estimate relative performance of machine learning model configuration parameters at various stages of the PANI-ML pipeline. The PANI-ML pipeline implements a radically new approach of rapidly narrowing the search space for machine learning model configuration parameters by performing algorithm selection followed by algorithm-specific adaptive data reduction (i.e., row- and/or feature-wise dataset sampling), and then hyperparameter tuning.

Type: Application

Filed: October 30, 2020

Publication date: December 16, 2021

Inventors: Venkatanathan Varadarajan, Sandeep R. Agrawal, Hesam Fathi Moghadam, Anatoly Yakovlev, Ali Moharrer, Jingxiao Cai, Sanjay Jinturkar, Nipun Agarwal, Sam Idicula, Nikan Chavoshi

CODE DICTIONARY GENERATION BASED ON NON-BLOCKING OPERATIONS

Publication number: 20210390089

Abstract: Techniques related to code dictionary generation based on non-blocking operations are disclosed. In some embodiments, a column of tokens includes a first token and a second token that are stored in separate rows. The column of tokens is correlated with a set of row identifiers including a first row identifier and a second row identifier that is different from the first row identifier. Correlating the column of tokens with the set of row identifiers involves: storing a correlation between the first token and the first row identifier, storing a correlation between the second token and the second row identifier if the first token and the second token have different values, and storing a correlation between the second token and the first row identifier if the first token and the second token have identical values. After correlating the column of tokens with the set of row identifiers, duplicate correlations are removed.
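The correlation rule in this abstract can be illustrated with a short sequential sketch: each token is correlated with the row identifier of the first row holding the same value, and duplicate correlations are then removed, leaving one (token, row identifier) pair per distinct token. The function name and the use of a Python dictionary are assumptions for illustration; this single-threaded version does not model the non-blocking database operations the application is concerned with.

```python
def build_code_dictionary(tokens):
    """Correlate tokens with row identifiers, then drop duplicate correlations.

    tokens is a column of token values, one per row; the row identifier is
    the position in the list. Returns a list of (token, row_id) pairs.
    """
    first_row = {}
    correlations = []
    for row_id, token in enumerate(tokens):
        # A repeated token is correlated with the row identifier of its
        # first occurrence; a new token gets its own row identifier.
        representative = first_row.setdefault(token, row_id)
        correlations.append((token, representative))
    # Remove duplicate correlations while preserving first-seen order.
    return list(dict.fromkeys(correlations))

dictionary = build_code_dictionary(["a", "b", "a", "c", "b"])
```

The surviving row identifiers can serve as codes for the distinct tokens, which is what makes the result usable as a code dictionary.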

Type: Application

Filed: August 27, 2021

Publication date: December 16, 2021

Inventors: Pit Fender, Felix Schmidt, Benjamin Schlegel, Matthias Brantner, Nipun Agarwal

Method and apparatus for providing promotion recommendations

Patent number: 11200599

Abstract: The present disclosure relates to methods, systems, and apparatuses for providing promotion recommendations using a promotion and marketing service. Some aspects may provide a method for providing a promotion recommendation framework. The method includes receiving, via a network interface, a promotion recommendation inquiry from a component of a promotion and marketing service, the promotion recommendation inquiry including electronic identification data identifying at least one of a consumer or a consumer characteristic. The method also includes identifying, via processing circuitry, promotion transaction information associated with the electronic identification data. The promotion transaction information includes electronic data identifying at least one transaction performed using the promotion and marketing service.

Type: Grant

Filed: August 19, 2020

Date of Patent: December 14, 2021

Assignee: Groupon, Inc.

Inventors: Nipun Agarwal, Rajesh Girish Parekh, Ying Chen

Method and apparatus for providing promotion recommendations

Patent number: 11188943

Abstract: The present disclosure relates to methods, systems, and apparatuses for providing promotion recommendations using a promotion and marketing service. Some aspects may provide a method for providing a promotion recommendation framework. The method includes receiving, via a network interface, a promotion recommendation inquiry from a component of a promotion and marketing service, the promotion recommendation inquiry including electronic identification data identifying at least one of a consumer or a consumer characteristic. The method also includes identifying, via processing circuitry, promotion transaction information associated with the electronic identification data. The promotion transaction information includes electronic data identifying at least one transaction performed using the promotion and marketing service.

Type: Grant

Filed: May 19, 2020

Date of Patent: November 30, 2021

Assignee: Groupon, Inc.

Inventors: Nipun Agarwal, Rajesh Girish Parekh, Ying Chen

ESTIMATING NUMBER OF DISTINCT VALUES IN A DATA SET USING MACHINE LEARNING

Publication number: 20210365805

Abstract: Techniques for estimating the number of distinct values in a data set using machine learning are provided. In one technique, a sample of a data set is retrieved where the sample is a strict subset of the data set. The sample is analyzed to identify feature values of multiple features of the sample. The feature values are inserted into a machine-learned model that computes a prediction regarding a number of distinct values in the data set. An estimated number of distinct values that is based on the prediction is stored in association with the data set.
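The flow in this abstract (take a sample, derive feature values, feed them to a model, store the estimate) can be sketched as follows. The particular features (sampling fraction, distinct values in the sample, and counts of values seen once or twice) are assumptions drawn from common practice in distinct-value estimation, not from the application, and `model` is a hypothetical callable standing in for the trained model.

```python
from collections import Counter

def sample_features(sample, population_size):
    """Summarize a sample into features a model could consume."""
    counts = Counter(sample)
    frequency_of_frequencies = Counter(counts.values())
    return {
        "sample_fraction": len(sample) / population_size,
        "sample_distinct": len(counts),
        "singletons": frequency_of_frequencies.get(1, 0),
        "doubletons": frequency_of_frequencies.get(2, 0),
    }

def estimate_distinct_values(sample, population_size, model):
    """Turn the model's prediction into a bounded estimate."""
    features = sample_features(sample, population_size)
    prediction = model(features)
    # The true count is at least the distinct values already seen in the
    # sample and at most the number of rows in the data set.
    lower, upper = features["sample_distinct"], population_size
    return max(lower, min(upper, round(prediction)))

features = sample_features(["a", "a", "b", "c", "c", "c"], population_size=60)
too_high = estimate_distinct_values(["a", "a", "b"], 60, lambda f: 1000)
too_low = estimate_distinct_values(["a", "a", "b"], 60, lambda f: 0)
```

Clamping the prediction is a design choice of this sketch: it keeps an imperfect model from returning an estimate that the sample itself already contradicts.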

Type: Application

Filed: May 19, 2020

Publication date: November 25, 2021

Inventors: Tomas Karnagel, Onur Kocberber, Farhan Tauheed, Nipun Agarwal

Gradient-based auto-tuning for machine learning and deep learning models

Patent number: 11176487

Abstract: Herein, horizontally scalable techniques efficiently configure machine learning algorithms for optimal accuracy and without informed inputs. In an embodiment, for each particular hyperparameter, and for each epoch, a computer processes the particular hyperparameter. An epoch explores one hyperparameter based on hyperparameter tuples. A respective score is calculated from each tuple. The tuple contains a distinct combination of values, each of which is contained in a value range of a distinct hyperparameter. All values of a tuple that belong to the particular hyperparameter are distinct. All values of a tuple that belong to other hyperparameters are held constant. The value range of the particular hyperparameter is narrowed based on an intersection point of a first line based on the scores and a second line based on the scores. A machine learning algorithm is optimally configured from repeatedly narrowed value ranges of hyperparameters. The configured algorithm is invoked to obtain a result.

Type: Grant

Filed: January 31, 2018

Date of Patent: November 16, 2021

Assignee: Oracle International Corporation

Inventors: Venkatanathan Varadarajan, Sam Idicula, Sandeep Agrawal, Nipun Agarwal

Relational dictionaries

Patent number: 11169995

Abstract: Techniques related to relational dictionaries are disclosed. In some embodiments, one or more non-transitory storage media store a sequence of instructions which, when executed by one or more computing devices, cause performance of a method. The method involves storing a code dictionary comprising a set of tuples. The code dictionary is a database table defined by a database dictionary and comprises columns that are each defined by the database dictionary. The set of tuples maps a set of codes to a set of tokens. The set of tokens are stored in a column of unencoded database data. The method further involves generating encoded database data based on joining the unencoded database data with the set of tuples. Furthermore, the method involves generating decoded database data based on joining the encoded database data with the set of tuples.
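The idea of a code dictionary stored as an ordinary table, with encoding and decoding expressed as joins, can be sketched with Python's built-in `sqlite3` module. The table and column names (`raw`, `code_dict`, `encoded`) and the choice to assign codes in sorted token order are assumptions made for this example; the patent does not prescribe them.

```python
import sqlite3

def encode_and_decode(tokens):
    """Encode a token column by joining with a dictionary table, then decode."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE raw (row_id INTEGER PRIMARY KEY, token TEXT)")
    con.execute("CREATE TABLE code_dict (code INTEGER PRIMARY KEY, token TEXT UNIQUE)")
    con.executemany("INSERT INTO raw (row_id, token) VALUES (?, ?)",
                    list(enumerate(tokens)))
    # The dictionary holds one tuple per distinct token, mapping code to token.
    con.executemany("INSERT INTO code_dict (code, token) VALUES (?, ?)",
                    list(enumerate(sorted(set(tokens)))))
    # Encoding: join the unencoded data with the dictionary tuples.
    con.execute("""CREATE TABLE encoded AS
                   SELECT r.row_id, d.code
                   FROM raw r JOIN code_dict d ON r.token = d.token""")
    encoded = [code for (code,) in
               con.execute("SELECT code FROM encoded ORDER BY row_id")]
    # Decoding: join the encoded data with the same dictionary tuples.
    decoded = [token for (token,) in con.execute(
        """SELECT d.token
           FROM encoded e JOIN code_dict d ON e.code = d.code
           ORDER BY e.row_id""")]
    con.close()
    return encoded, decoded

encoded, decoded = encode_and_decode(["red", "blue", "red", "green"])
```

Because the dictionary is just another table, the same join machinery the database already optimizes handles both directions of the mapping.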

Type: Grant

Filed: November 21, 2017

Date of Patent: November 9, 2021

Assignee: Oracle International Corporation

Inventors: Pit Fender, Seema Sundara, Benjamin Schlegel, Nipun Agarwal

Efficient partitioning of relational data

Patent number: 11163800

Abstract: Techniques are described for non-power-of-two partitioning of a data set as well as generation and selection of partition schemes for the data set. In an embodiment, one or more iterations of a partition scheme are for a non-power-of-two number of partitions. Extended hash partitioning may be used to partition a data set into a non-power-of-two number of partitions by determining the partition identifier of each tuple of the data set using the extended hash partitioning algorithm. In an embodiment, multiple partition schemes are generated for multiple data sets, based on properties of the data sets and/or availability of computing resources for the partition operation or the subsequent operation to the partition operation. The generated partition schemes may use non-power-of-two partitioning for one or more iterations of a generated partition scheme. The optimal partition scheme may be selected from the generated partition schemes based on optimization policies.

Type: Grant

Filed: August 15, 2019

Date of Patent: November 2, 2021

Assignee: Oracle International Corporation

Inventors: Negar Koochakzadeh, Nitin Kunal, Sam Idicula, Cagri Balkesen, Nipun Agarwal

ASYMMETRIC ALLOCATION OF SRAM AND DATA LAYOUT FOR EFFICIENT MATRIX-MATRIX MULTIPLICATION

Publication number: 20210312014

Abstract: Techniques are described herein for performing efficient matrix multiplication in architectures with scratchpad memories or associative caches using asymmetric allocation of space for the different matrices. The system receives a left matrix and a right matrix. In an embodiment, the system allocates, in a scratchpad memory, asymmetric memory space for tiles for each of the two matrices as well as a dot product matrix. The system then performs dot product matrix multiplication involving the tiles of the left and the right matrices, storing resulting dot product values in corresponding allocated dot product matrix tiles. The system then proceeds to write the stored dot product values from the scratchpad memory into main memory.
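A tiled multiplication with differently sized tiles for the left matrix, the right matrix, and the result can be sketched in plain Python. The tile sizes, the function name, and the use of a small list (`acc`) as a stand-in for the scratchpad-resident result tile are all assumptions for illustration; the sketch shows only the loop structure, not the memory allocation strategy the application describes.

```python
def tiled_matmul(a, b, tile_m=4, tile_k=2, tile_n=8):
    """Multiply a (m x k) by b (k x n) using unequal tile sizes.

    Left tiles are tile_m x tile_k, right tiles are tile_k x tile_n, and
    result tiles are tile_m x tile_n.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, tile_m):
        rows = min(i0 + tile_m, m) - i0
        for j0 in range(0, n, tile_n):
            cols = min(j0 + tile_n, n) - j0
            # The accumulator tile stands in for scratchpad memory.
            acc = [[0.0] * cols for _ in range(rows)]
            for k0 in range(0, k, tile_k):
                k_end = min(k0 + tile_k, k)
                for i in range(rows):
                    for j in range(cols):
                        partial = acc[i][j]
                        for kk in range(k0, k_end):
                            partial += a[i0 + i][kk] * b[kk][j0 + j]
                        acc[i][j] = partial
            # Write the finished tile back to the result in "main memory".
            for i in range(rows):
                c[i0 + i][j0:j0 + cols] = acc[i]
    return c

product = tiled_matmul([[1, 2, 3], [4, 5, 6]],
                       [[7, 8], [9, 10], [11, 12]],
                       tile_m=1, tile_k=2, tile_n=1)
```

Each result tile is written back only once, after all of its partial dot products have been accumulated, which mirrors the final write from scratchpad to main memory mentioned in the abstract.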

Type: Application

Filed: June 16, 2021

Publication date: October 7, 2021

Inventors: Gaurav Chadha, Sam Idicula, Sandeep Agrawal, Nipun Agarwal

Assymetric allocation of SRAM and data layout for efficient matrix multiplication

Patent number: 11138291

Abstract: Techniques are described herein for performing efficient matrix multiplication in architectures with scratchpad memories or associative caches using asymmetric allocation of space for the different matrices. The system receives a left matrix and a right matrix. In an embodiment, the system allocates, in a scratchpad memory, asymmetric memory space for tiles for each of the two matrices as well as a dot product matrix. The system then performs dot product matrix multiplication involving the tiles of the left and the right matrices, storing resulting dot product values in corresponding allocated dot product matrix tiles. The system then proceeds to write the stored dot product values from the scratchpad memory into main memory.

Type: Grant

Filed: September 26, 2017

Date of Patent: October 5, 2021

Assignee: Oracle International Corporation

Inventors: Gaurav Chadha, Sam Idicula, Sandeep Agrawal, Nipun Agarwal

Massively parallel and in-memory execution of grouping and aggregation in a heterogeneous system

Patent number: 11126626

Abstract: A system and method for processing a group and aggregate query on a relation are disclosed. A database system determines whether assistance of a heterogeneous system (HS) of compute nodes is beneficial in performing the query. Assuming that the relation has been partitioned and loaded into the HS, the database system determines, in a compile phase, whether the HS has the functional capabilities to assist, and whether the cost and benefit favor performing the operation with the assistance of the HS. If the cost and benefit favor using the assistance of the HS, then the system enters the execution phase. The database system starts, in the execution phase, an optimal number of parallel processes to produce and consume the results from the compute nodes of the HS. After any needed transaction consistency checks, the results of the query are returned by the database system.

Type: Grant

Filed: February 11, 2019

Date of Patent: September 21, 2021

Assignee: Oracle International Corporation

Inventors: Sabina Petride, Sam Idicula, Nipun Agarwal

Code dictionary generation based on non-blocking operations

Patent number: 11126611

Abstract: Techniques related to code dictionary generation based on non-blocking operations are disclosed. In some embodiments, a column of tokens includes a first token and a second token that are stored in separate rows. The column of tokens is correlated with a set of row identifiers including a first row identifier and a second row identifier that is different from the first row identifier. Correlating the column of tokens with the set of row identifiers involves: storing a correlation between the first token and the first row identifier, storing a correlation between the second token and the second row identifier if the first token and the second token have different values, and storing a correlation between the second token and the first row identifier if the first token and the second token have identical values. After correlating the column of tokens with the set of row identifiers, duplicate correlations are removed.

Type: Grant

Filed: February 15, 2018

Date of Patent: September 21, 2021

Assignee: Oracle International Corporation

Inventors: Pit Fender, Felix Schmidt, Benjamin Schlegel, Matthias Brantner, Nipun Agarwal
