Patents by Inventor NITIN KUNAL

NITIN KUNAL has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11579951
    Abstract: Techniques are described herein for predicting disk drive failure using a machine learning model. The framework involves receiving disk drive sensor attributes as training data, preprocessing the training data to select a set of enhanced feature sequences, and using the enhanced feature sequences to train a machine learning model to predict disk drive failures from disk drive sensor monitoring data. Prior to the training phase, the RNN LSTM model is tuned using a set of predefined hyper-parameters. The preprocessing, which is performed during the training and evaluation phase as well as later during the prediction phase, involves using predefined values for a set of parameters to generate the set of enhanced sequences from raw sensor readings. The enhanced feature sequences are generated to maintain a desired healthy/failed disk ratio, and only use samples leading up to a last-valid-time sample in order to honor a pre-specified heads-up-period alert requirement. (A simplified code sketch of the preprocessing step follows this entry.)
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: February 14, 2023
    Assignee: Oracle International Corporation
    Inventors: Onur Kocberber, Felix Schmidt, Arun Raghavan, Nipun Agarwal, Sam Idicula, Guang-Tong Zhou, Nitin Kunal
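    Illustrative sketch: the sequence-preprocessing step described in the abstract above can be pictured roughly as follows. This is a minimal Python sketch under assumed names (build_sequences, seq_len, heads_up, healthy_to_failed_ratio); it is not the patented implementation, and the RNN LSTM model itself is omitted.
      import random
      from typing import Dict, List, Tuple

      def build_sequences(
          readings: Dict[str, List[List[float]]],  # disk_id -> time-ordered sensor vectors
          failed: Dict[str, bool],                 # disk_id -> whether the disk eventually failed
          seq_len: int = 14,                       # samples per feature sequence
          heads_up: int = 7,                       # samples reserved before failure (alert lead time)
          healthy_to_failed_ratio: float = 3.0,    # desired healthy/failed balance in the output
      ) -> List[Tuple[List[List[float]], int]]:
          failed_seqs, healthy_seqs = [], []
          for disk_id, series in readings.items():
              if failed[disk_id]:
                  # Use only samples leading up to the last valid sample, i.e. stop
                  # heads_up samples before the failure point.
                  cutoff = len(series) - heads_up
                  window = series[max(0, cutoff - seq_len):cutoff]
                  if len(window) == seq_len:
                      failed_seqs.append((window, 1))
              else:
                  window = series[-seq_len:]
                  if len(window) == seq_len:
                      healthy_seqs.append((window, 0))
          # Down-sample healthy disks to maintain the desired healthy/failed ratio.
          keep = min(len(healthy_seqs), int(len(failed_seqs) * healthy_to_failed_ratio))
          return failed_seqs + random.sample(healthy_seqs, keep)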
  • Patent number: 11423022
    Abstract: Techniques are described herein for building a framework for declarative query compilation using both rule-based and cost-based approaches for database management. The framework involves constructing and using: a set of rule-based properties tables that contain optimization parameters for both logical and physical optimization, a recursive algorithm to form candidate physical query plans that is based on the rule-based tables, and a cost model for estimating the cost of a generated physical query plan that is used with the rule-based properties tables to prune inferior query plans. (A simplified code sketch follows this entry.)
    Type: Grant
    Filed: June 25, 2018
    Date of Patent: August 23, 2022
    Assignee: Oracle International Corporation
    Inventors: Jian Wen, Sam Idicula, Nitin Kunal, Farhan Tauheed, Seema Sundara, Nipun Agarwal, Indu Bhagat
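    Illustrative sketch: the interplay of a rule-based properties table, recursive candidate-plan generation, and cost-based pruning can be pictured as below. The RULES table, the toy cost model, and all names are assumptions chosen for illustration, not the patented framework.
      from dataclasses import dataclass, field
      from typing import Dict, List, Tuple

      # Rule-based "properties table": logical operator -> candidate physical
      # implementations, each with a rough per-row cost factor.
      RULES: Dict[str, List[Tuple[str, float]]] = {
          "scan": [("seq_scan", 1.0), ("index_scan", 0.3)],
          "join": [("hash_join", 1.5), ("nested_loop_join", 4.0)],
      }

      @dataclass
      class LogicalNode:
          op: str
          rows: float
          children: List["LogicalNode"] = field(default_factory=list)

      def best_physical_plan(node: LogicalNode) -> Tuple[List[str], float]:
          """Recursively form candidate physical plans and keep the cheapest one."""
          child_plans = [best_physical_plan(c) for c in node.children]
          child_ops = [op for ops, _ in child_plans for op in ops]
          child_cost = sum(cost for _, cost in child_plans)
          best = None
          for phys_op, factor in RULES[node.op]:
              cost = child_cost + factor * node.rows      # toy cost estimate
              if best is None or cost < best[1]:          # prune inferior candidates
                  best = (child_ops + [phys_op], cost)
          return best

      plan = LogicalNode("join", rows=1e4,
                         children=[LogicalNode("scan", 1e4), LogicalNode("scan", 1e3)])
      print(best_physical_plan(plan))  # (['index_scan', 'index_scan', 'hash_join'], 18300.0)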
  • Patent number: 11163800
    Abstract: Techniques are described for non-power-of-two partitioning of a data set, as well as for generation and selection of partition schemes for the data set. In an embodiment, one or more iterations of a partition scheme are for a non-power-of-two number of partitions. Extended hash partitioning may be used to partition a data set into a non-power-of-two number of partitions by determining the partition identifier of each tuple of the data set using the extended hash partitioning algorithm. In an embodiment, multiple partition schemes are generated for multiple data sets, based on properties of the data sets and/or availability of computing resources for the partition operation or the operation that follows it. The generated partition schemes may use non-power-of-two partitioning for one or more iterations of a generated partition scheme. The optimal partition scheme may be selected from the generated partition schemes based on optimization policies. (A simplified code sketch follows this entry.)
    Type: Grant
    Filed: August 15, 2019
    Date of Patent: November 2, 2021
    Assignee: Oracle International Corporation
    Inventors: Negar Koochakzadeh, Nitin Kunal, Sam Idicula, Cagri Balkesen, Nipun Agarwal
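    Illustrative sketch: assigning each tuple a partition identifier for a non-power-of-two number of partitions. The multiply-shift range reduction below is an assumption chosen for illustration; it is not necessarily the patented "extended hash partitioning" algorithm, and all names are made up.
      import hashlib
      from collections import defaultdict
      from typing import Dict, Iterable, List, Tuple

      def partition_id(key: str, num_partitions: int) -> int:
          # 32-bit hash of the key, then map the hash range onto [0, num_partitions)
          # without requiring num_partitions to be a power of two.
          h = int.from_bytes(hashlib.blake2b(key.encode(), digest_size=4).digest(), "big")
          return (h * num_partitions) >> 32

      def partition(tuples: Iterable[Tuple[str, object]],
                    num_partitions: int = 3) -> Dict[int, List[Tuple[str, object]]]:
          parts: Dict[int, List[Tuple[str, object]]] = defaultdict(list)
          for key, payload in tuples:
              parts[partition_id(key, num_partitions)].append((key, payload))
          return parts

      counts = {p: len(rows) for p, rows in
                partition(((f"k{i}", i) for i in range(1000)), num_partitions=3).items()}
      print(counts)  # roughly even tuple counts across partitions 0, 1, 2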
  • Patent number: 10956417
    Abstract: Techniques are provided for scheduling data operations for a given query based upon a query-cost model that analyzes each operation's cost and the type of resources the operation needs. In an embodiment, a database server receives a set of data operations for a query. The database server determines a set of leaf operation nodes from the set of data operations, where the set of leaf operation nodes includes operation nodes that do not depend on the execution of other nodes within the set of data operations. The database server compares operation costs between the leaf operation nodes to determine which leaf operation node to insert into a scheduled order set. The database server inserts the leaf operation node into the scheduled order set. Then the database server iteratively determines new leaf operation nodes and performs cost analysis on remaining leaf operation nodes to generate a set of scheduled data operations. (A simplified code sketch follows this entry.)
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: March 23, 2021
    Assignee: Oracle International Corporation
    Inventors: Jarod Wen, Sam Idicula, Nitin Kunal, Thomas Chang, Gong Zhang, Nipun Agarwal, Farhan Tauheed
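    Illustrative sketch: the leaf-selection loop described in the abstract above, as a minimal Python sketch. The cost dictionary, the dependency map, and the choice of the cheapest leaf first are assumptions for illustration, not the patented scheduler.
      from typing import Dict, List, Set

      def schedule(op_cost: Dict[str, float], deps: Dict[str, Set[str]]) -> List[str]:
          """deps[op] is the set of operations that op depends on."""
          scheduled: List[str] = []
          remaining = set(op_cost)
          while remaining:
              # Leaf operation nodes: every dependency is already scheduled.
              leaves = [op for op in remaining if deps.get(op, set()) <= set(scheduled)]
              # Compare operation costs among the leaves and insert the cheapest next.
              nxt = min(leaves, key=lambda op: op_cost[op])
              scheduled.append(nxt)
              remaining.remove(nxt)
          return scheduled

      costs = {"scan_a": 5.0, "scan_b": 2.0, "join_ab": 8.0}
      print(schedule(costs, {"join_ab": {"scan_a", "scan_b"}}))  # ['scan_b', 'scan_a', 'join_ab']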
  • Patent number: 10810207
    Abstract: A processor receives a payload array and generates a hash table in a cache that includes a hash bucket array. Each hash bucket element contains an identifier that defines a location of a build key array element in the payload array. For a particular build key array element, the processor determines a hash bucket element that corresponds to the payload array. The processor copies the identifier for the particular build key array element into the hash bucket element. If the cache is unable to insert additional build key array elements into the hash table in the cache, then the processor generates a second hash table for the remaining build key array elements in local volatile memory. When probing, the processor probes both hash tables in the cache and local volatile memory for identifiers in hash bucket elements that are used to locate matching build key array elements. (A simplified code sketch follows this entry.)
    Type: Grant
    Filed: April 3, 2018
    Date of Patent: October 20, 2020
    Assignee: Oracle International Corporation
    Inventors: Cagri Balkesen, Nitin Kunal, Nipun Agarwal
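    Illustrative sketch: the build-and-probe flow over a cache-sized table plus an overflow table, as a minimal Python sketch. A plain dict with a capacity bound stands in for the cache-resident hash table; the class and parameter names are assumptions, not the patented design.
      from typing import Dict, List, Optional

      class TwoLevelHashTable:
          def __init__(self, cache_capacity: int):
              self.cache_capacity = cache_capacity
              self.cache_table: Dict[int, int] = {}     # build key -> payload row identifier
              self.overflow_table: Dict[int, int] = {}  # spill-over once the "cache" is full

          def build(self, build_keys: List[int]) -> None:
              for row_id, key in enumerate(build_keys):
                  # Store only the identifier; the payload stays in the payload array.
                  if len(self.cache_table) < self.cache_capacity:
                      self.cache_table[key] = row_id
                  else:
                      self.overflow_table[key] = row_id

          def probe(self, key: int) -> Optional[int]:
              # Probe both tables; a hit returns the build-side row identifier.
              if key in self.cache_table:
                  return self.cache_table[key]
              return self.overflow_table.get(key)

      payload = ["alpha", "beta", "gamma", "delta"]       # payload array on the build side
      ht = TwoLevelHashTable(cache_capacity=2)
      ht.build([10, 20, 30, 40])                          # build key array
      row = ht.probe(30)
      print(payload[row] if row is not None else None)    # gamma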
  • Patent number: 10810195
    Abstract: Techniques related to distributed relational dictionaries are disclosed. In some embodiments, one or more non-transitory storage media store a sequence of instructions which, when executed by one or more computing devices, cause performance of a method. The method involves generating, by a query optimizer at a distributed database system (DDS), a query execution plan (QEP) for generating a code dictionary and a column of encoded database data. The QEP specifies a sequence of operations for generating the code dictionary. The code dictionary is a database table. The method further involves receiving, at the DDS, a column of unencoded database data from a data source that is external to the DDS. The DDS generates the code dictionary according to the QEP. Furthermore, based on joining the column of unencoded database data with the code dictionary, the DDS generates the column of encoded database data according to the QEP. (A simplified code sketch follows this entry.)
    Type: Grant
    Filed: January 3, 2018
    Date of Patent: October 20, 2020
    Assignee: Oracle International Corporation
    Inventors: Anantha Kiran Kandukuri, Seema Sundara, Sam Idicula, Pit Fender, Nitin Kunal, Sabina Petride, Georgios Giannikis, Nipun Agarwal
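    Illustrative sketch: dictionary encoding expressed as "build a dictionary table, then join the unencoded column against it", in a minimal in-memory Python sketch. The distributed query-execution-plan machinery is omitted, and all names are assumptions.
      from typing import Dict, List, Tuple

      def build_code_dictionary(column: List[str]) -> List[Tuple[int, str]]:
          """The code dictionary is itself a table of (code, value) rows."""
          return list(enumerate(sorted(set(column))))

      def encode_column(column: List[str],
                        dictionary: List[Tuple[int, str]]) -> List[int]:
          """Encode by joining the unencoded column with the dictionary on value."""
          value_to_code: Dict[str, int] = {value: code for code, value in dictionary}
          return [value_to_code[value] for value in column]

      raw = ["red", "blue", "red", "green", "blue"]   # unencoded column from an external source
      dictionary = build_code_dictionary(raw)
      print(dictionary)                               # [(0, 'blue'), (1, 'green'), (2, 'red')]
      print(encode_column(raw, dictionary))           # [2, 0, 2, 1, 0]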
  • Patent number: 10706055
    Abstract: Techniques are described for executing an analytical query with a top-N clause. In an embodiment, a stream of tuples is received by each of the processing units from a data source identified in the query. The processing unit uses a portion of a received tuple to identify the partition that the tuple is assigned to. For each partition, the processing unit maintains a top-N data store that stores an N number of received tuples that match the criteria of top N tuples according to the query. The received tuple is compared to the N number of tuples to determine whether to store the received tuple and discard an already stored tuple, or to discard the received tuple. After all the tuples have been similarly processed by the processing units, all the top-N data stores for each partition are merged, yielding the top N number of tuples for each partition to return as a result of the query. (A simplified code sketch follows this entry.)
    Type: Grant
    Filed: April 6, 2016
    Date of Patent: July 7, 2020
    Assignee: Oracle International Corporation
    Inventors: Gong Zhang, Sam Idicula, Michael Duller, Nitin Kunal
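    Illustrative sketch: per-partition top-N maintenance and the final merge, as a minimal Python sketch using heaps. The partition key, the "larger value wins" ordering, and all names are assumptions for illustration, not the patented execution strategy.
      import heapq
      from collections import defaultdict
      from typing import Dict, Iterable, List, Tuple

      def top_n_per_partition(tuples: Iterable[Tuple[str, float]],
                              n: int) -> Dict[str, List[Tuple[float, str]]]:
          stores: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
          for part_key, value in tuples:
              store = stores[part_key]                 # one bounded top-N store per partition
              if len(store) < n:
                  heapq.heappush(store, (value, part_key))
              elif value > store[0][0]:
                  heapq.heapreplace(store, (value, part_key))  # discard the current minimum
          return stores

      def merge(unit_stores: List[Dict[str, List[Tuple[float, str]]]],
                n: int) -> Dict[str, List[float]]:
          # Merge the stores of all processing units, keeping the top N per partition.
          merged: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
          for stores in unit_stores:
              for part, store in stores.items():
                  merged[part] = heapq.nlargest(n, merged[part] + store)
          return {part: [v for v, _ in store] for part, store in merged.items()}

      unit1 = top_n_per_partition([("a", 1.0), ("a", 9.0), ("b", 4.0)], n=2)
      unit2 = top_n_per_partition([("a", 7.0), ("b", 2.0), ("b", 8.0)], n=2)
      print(merge([unit1, unit2], n=2))  # {'a': [9.0, 7.0], 'b': [8.0, 4.0]}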
  • Publication number: 20200104200
    Abstract: Techniques are described herein for predicting disk drive failure using a machine learning model. The framework involves receiving disk drive sensor attributes as training data, preprocessing the training data to select a set of enhanced feature sequences, and using the enhanced feature sequences to train a machine learning model to predict disk drive failures from disk drive sensor monitoring data. Prior to the training phase, the RNN LSTM model is tuned using a set of predefined hyper-parameters. The preprocessing, which is performed during the training and evaluation phase as well as later during the prediction phase, involves using predefined values for a set of parameters to generate the set of enhanced sequences from raw sensor readings. The enhanced feature sequences are generated to maintain a desired healthy/failed disk ratio, and only use samples leading up to a last-valid-time sample in order to honor a pre-specified heads-up-period alert requirement.
    Type: Application
    Filed: September 27, 2018
    Publication date: April 2, 2020
    Inventors: Onur Kocberber, Felix Schmidt, Arun Raghavan, Nipun Agarwal, Sam Idicula, Guang-Tong Zhou, Nitin Kunal
  • Patent number: 10592531
    Abstract: Techniques are described for non-power-of-two partitioning of a data set, as well as for generation and selection of partition schemes for the data set. In an embodiment, one or more iterations of a partition scheme are for a non-power-of-two number of partitions. Extended hash partitioning may be used to partition a data set into a non-power-of-two number of partitions by determining the partition identifier of each tuple of the data set using the extended hash partitioning algorithm. In an embodiment, multiple partition schemes are generated for multiple data sets, based on properties of the data sets and/or availability of computing resources for the partition operation or the operation that follows it. The generated partition schemes may use non-power-of-two partitioning for one or more iterations of a generated partition scheme. The optimal partition scheme may be selected from the generated partition schemes based on optimization policies.
    Type: Grant
    Filed: February 21, 2017
    Date of Patent: March 17, 2020
    Assignee: Oracle International Corporation
    Inventors: Negar Koochakzadeh, Nitin Kunal, Sam Idicula, Cagri Balkesen, Nipun Agarwal
  • Publication number: 20190392068
    Abstract: Techniques are described herein for building a framework for declarative query compilation using both rule-based and cost-based approaches for database management. The framework involves constructing and using: a set of rule-based properties tables that contain optimization parameters for both logical and physical optimization, a recursive algorithm to form candidate physical query plans that is based on the rule-based tables, and a cost model for estimating the cost of a generated physical query plan that is used with the rule-based properties tables to prune inferior query plans.
    Type: Application
    Filed: June 25, 2018
    Publication date: December 26, 2019
    Inventors: Jian Wen, Sam Idicula, Nitin Kunal, Farhan Tauheed, Seema Sundara, Nipun Agarwal, Indu Bhagat
  • Publication number: 20190370268
    Abstract: Techniques are described for non-power-of-two partitioning of a data set, as well as for generation and selection of partition schemes for the data set. In an embodiment, one or more iterations of a partition scheme are for a non-power-of-two number of partitions. Extended hash partitioning may be used to partition a data set into a non-power-of-two number of partitions by determining the partition identifier of each tuple of the data set using the extended hash partitioning algorithm. In an embodiment, multiple partition schemes are generated for multiple data sets, based on properties of the data sets and/or availability of computing resources for the partition operation or the operation that follows it. The generated partition schemes may use non-power-of-two partitioning for one or more iterations of a generated partition scheme. The optimal partition scheme may be selected from the generated partition schemes based on optimization policies.
    Type: Application
    Filed: August 15, 2019
    Publication date: December 5, 2019
    Inventors: Negar Koochakzadeh, Nitin Kunal, Sam Idicula, Cagri Balkesen, Nipun Agarwal
  • Publication number: 20190303482
    Abstract: Techniques are described for building and probing a hash table where the size of an input partition is larger than the cache size of a receiving processor. A processor receives a payload array and generates a hash table in cache that includes a hash bucket array. Each hash bucket element contains an identifier that defines a location of a build key array element in the payload array. For a particular build key array element, the processor determines a hash bucket element that corresponds to the payload array. The processor copies the identifier for the particular build key array element into the hash bucket element. If the cache is unable to insert additional build key array elements into the hash table in the cache, then the processor generates a second hash table for the remaining build key array elements in local volatile memory.
    Type: Application
    Filed: April 3, 2018
    Publication date: October 3, 2019
    Inventors: Cagri Balkesen, Nitin Kunal, Nipun Agarwal
  • Patent number: 10366124
    Abstract: Techniques are described herein for grouping of operations in local memory of a processing unit. The techniques involve adding a first operation for a first leaf operator of a query execution plan to a first pipelined group. The query execution plan includes a set of leaf operators and a set of non-leaf operators. Each leaf operator of the set of one or more leaf operators has a respective parent non-leaf operator, and each non-leaf operator has one or more child operators from among the set of leaf operators or others of the set of non-leaf operators. The techniques further involve determining a memory requirement of executing the first operation for the first leaf operator and executing a second operation for the respective parent non-leaf operator of the first leaf operator. The output of the first operation is input to the second operation. The techniques further involve determining whether the memory requirement is satisfied by an amount of local memory. (A simplified code sketch follows this entry.)
    Type: Grant
    Filed: June 7, 2017
    Date of Patent: July 30, 2019
    Assignee: Oracle International Corporation
    Inventors: Jian Wen, Sam Idicula, Nitin Kunal, Negar Koochakzadeh, Seema Sundara, Thomas Chang, Aarti Basant, Nipun Agarwal, Farhan Tauheed
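    Illustrative sketch: fusing a leaf operation with its parent operation into one pipelined group while their combined memory requirement fits in local memory. The memory budget, operator sizes, and names are assumptions for illustration, not the patented grouping algorithm.
      from dataclasses import dataclass
      from typing import List, Optional

      LOCAL_MEMORY_BYTES = 64 * 1024  # assumed local-memory budget of a processing unit

      @dataclass
      class Operator:
          name: str
          mem_bytes: int                      # estimated working set of this operation
          parent: Optional["Operator"] = None

      def pipeline_groups(leaves: List[Operator]) -> List[List[str]]:
          groups: List[List[str]] = []
          for leaf in leaves:
              group = [leaf.name]
              need = leaf.mem_bytes
              op = leaf.parent
              # Keep adding the parent operation to the group while the combined
              # memory requirement is still satisfied by local memory.
              while op is not None and need + op.mem_bytes <= LOCAL_MEMORY_BYTES:
                  need += op.mem_bytes
                  group.append(op.name)
                  op = op.parent
              groups.append(group)
          return groups

      agg = Operator("aggregate", 48 * 1024)
      scan = Operator("scan", 8 * 1024, parent=agg)
      print(pipeline_groups([scan]))  # [['scan', 'aggregate']]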
  • Publication number: 20190205446
    Abstract: Techniques related to distributed relational dictionaries are disclosed. In some embodiments, one or more non-transitory storage media store a sequence of instructions which, when executed by one or more computing devices, cause performance of a method. The method involves generating, by a query optimizer at a distributed database system (DDS), a query execution plan (QEP) for generating a code dictionary and a column of encoded database data. The QEP specifies a sequence of operations for generating the code dictionary. The code dictionary is a database table. The method further involves receiving, at the DDS, a column of unencoded database data from a data source that is external to the DDS. The DDS generates the code dictionary according to the QEP. Furthermore, based on joining the column of unencoded database data with the code dictionary, the DDS generates the column of encoded database data according to the QEP.
    Type: Application
    Filed: January 3, 2018
    Publication date: July 4, 2019
    Inventors: Anantha Kiran Kandukuri, Seema Sundara, Sam Idicula, Pit Fender, Nitin Kunal, Sabina Petride, Georgios Giannikis, Nipun Agarwal
  • Publication number: 20180357331
    Abstract: Techniques are described herein for grouping of operations in local memory of a processing unit. The techniques involve adding a first operation for a first leaf operator of a query execution plan to a first pipelined group. The query execution plan includes a set of leaf operators and a set of non-leaf operators. Each leaf operator of the set of one or more leaf operators has a respective parent non-leaf operator and each non-leaf operator has one or more child operators from among the set of leaf operators or others of the set of non-leaf operators. The techniques further involve determining a memory requirement of executing the first operation for the first leaf operator and executing a second operation for the respective parent non-leaf operator of the first leaf operator. The output of the first operation is input to the second operation. The techniques further involve determining whether the memory requirement is satisfied by an amount of local memory.
    Type: Application
    Filed: June 7, 2017
    Publication date: December 13, 2018
    Inventors: Jian Wen, Sam Idicula, Nitin Kunal, Negar Koochakzadeh, Seema Sundara, Thomas Chang, Aarti Basant, Nipun Agarwal, Farhan Tauheed
  • Publication number: 20180314733
    Abstract: Techniques are provided for scheduling data operations for a given query based upon a query-cost model that analyzes each operation's cost and the type of resources the operation needs. In an embodiment, a database server receives a set of data operations for a query. The database server determines a set of leaf operation nodes from the set of data operations, where the set of leaf operation nodes includes operation nodes that do not depend on the execution of other nodes within the set of data operations. The database server compares operation costs between the leaf operation nodes to determine which leaf operation node to insert into a scheduled order set. The database server inserts the leaf operation node into the scheduled order set. Then the database server iteratively determines new leaf operation nodes and performs cost analysis on remaining leaf operation nodes to generate a set of scheduled data operations.
    Type: Application
    Filed: April 28, 2017
    Publication date: November 1, 2018
    Inventors: Jarod Wen, Sam Idicula, Nitin Kunal, Thomas Chang, Gong Zhang, Nipun Agarwal, Farhan Tauheed
  • Publication number: 20180239808
    Abstract: Techniques are described for non-power-of-two partitioning of a data set, as well as for generation and selection of partition schemes for the data set. In an embodiment, one or more iterations of a partition scheme are for a non-power-of-two number of partitions. Extended hash partitioning may be used to partition a data set into a non-power-of-two number of partitions by determining the partition identifier of each tuple of the data set using the extended hash partitioning algorithm. In an embodiment, multiple partition schemes are generated for multiple data sets, based on properties of the data sets and/or availability of computing resources for the partition operation or the operation that follows it. The generated partition schemes may use non-power-of-two partitioning for one or more iterations of a generated partition scheme. The optimal partition scheme may be selected from the generated partition schemes based on optimization policies.
    Type: Application
    Filed: February 21, 2017
    Publication date: August 23, 2018
    Inventors: Negar Koochakzadeh, Nitin Kunal, Sam Idicula, Cagri Balkesen, Nipun Agarwal
  • Publication number: 20170293658
    Abstract: Techniques are described for executing an analytical query with a top-N clause. In an embodiment, a stream of tuples is received by each of the processing units from a data source identified in the query. The processing unit uses a portion of a received tuple to identify the partition that the tuple is assigned to. For each partition, the processing unit maintains a top-N data store that stores an N number of received tuples that match the criteria of top N tuples according to the query. The received tuple is compared to the N number of tuples to determine whether to store the received tuple and discard an already stored tuple, or to discard the received tuple. After all the tuples have been similarly processed by the processing units, all the top-N data stores for each partition are merged, yielding the top N number of tuples for each partition to return as a result of the query.
    Type: Application
    Filed: April 6, 2016
    Publication date: October 12, 2017
    Inventors: Gong Zhang, Sam Idicula, Michael Duller, Nitin Kunal