Patents by Inventor Sam Idicula

Sam Idicula has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20190303369
    Abstract: Techniques are described for executing a query with a top-N clause to select a first N-number of rows in a data source arranged at least according to a first key and a second key of the data source using a first sort order respectively specified for the first key and a second sort order respectively specified for the second key by the query. The data source may include one or more tiles that include at least a portion of the first key and the second key. To execute the query, in an embodiment, a DBMS determines, in a first vector of first key values that are in a first tile, row identifiers identifying entries of the first vector that contain values equal to a tail value that follows a particular top number of the first key values. The DBMS may select, from a second vector of values of the second key in the first tile, second key values identified based on the determined row identifiers of the first vector.
    Type: Application
    Filed: June 20, 2019
    Publication date: October 3, 2019
    Inventors: Gong Zhang, Venkatraman Govindaraju, Sam Idicula
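A minimal Python sketch of the tile-at-a-time top-N idea described in the abstract above. It assumes a tile is simply a dict of equal-length column vectors with illustrative names (key1, key2); this is a software analogue of ordering by the first key, locating the tail value, and using the resulting row identifiers to pull second-key values for tie-breaking, not the patented DBMS implementation.

```python
# Sketch only: evaluate a two-key top-N over one "tile" of columnar data.
def top_n_two_keys(tile, n, ascending_k1=True, ascending_k2=True):
    k1 = tile["key1"]
    k2 = tile["key2"]

    # Order the tile's row identifiers by the first key only.
    rids = sorted(range(len(k1)), key=lambda r: k1[r], reverse=not ascending_k1)

    top = rids[:n]
    if len(rids) > n:
        # The "tail value" is the first-key value at the top-n boundary; rows
        # holding that value may still qualify once the second key is applied.
        tail_value = k1[rids[n - 1]]
        tied = [r for r in rids if k1[r] == tail_value]
        # Select second-key values via the row identifiers found above and
        # break ties among the tail rows using the second key's vector.
        tied.sort(key=lambda r: k2[r], reverse=not ascending_k2)
        top = [r for r in rids[:n] if k1[r] != tail_value] + tied
        top = top[:n]
    return [(k1[r], k2[r]) for r in top]


tile = {"key1": [3, 1, 3, 2, 3], "key2": [9, 5, 1, 7, 4]}
print(top_n_two_keys(tile, n=3))   # [(1, 5), (2, 7), (3, 1)]
```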
  • Patent number: 10402425
    Abstract: Techniques provide for hardware accelerated data movement between main memory and an on-chip data movement system that comprises multiple core processors that operate on the tabular data. The tabular data is moved to or from the scratch pad memories of the core processors. While the data is in-flight, the data may be manipulated by data manipulation operations. The data movement system includes multiple data movement engines, each dedicated to moving and transforming tabular data from main memory to a subset of the core processors. Each data movement engine is coupled to an internal memory that stores data (e.g. a bit vector) that dictates how data manipulation operations are performed on tabular data moved from a main memory to the memories of a core processor, or to and from other memories. The internal memory of each data movement engine is private to the data movement engine.
    Type: Grant
    Filed: July 24, 2018
    Date of Patent: September 3, 2019
    Assignee: Oracle International Corporation
    Inventors: David A. Brown, Rishabh Jain, Michael Duller, Sam Idicula, Erik Schlanger, David Joseph Hawkins, Christopher Joseph Daniels
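A software-only analogue of the in-flight data manipulation described in the abstract above. The DataMovementEngine class, the bit vector, and the scratchpad list are illustrative stand-ins for the patented hardware, shown only to make the filtering-while-moving idea concrete.

```python
# Sketch only: copy a column from "main memory" into a core's scratchpad,
# filtering rows with a bit vector held in the engine's private internal memory.
class DataMovementEngine:
    def __init__(self, bit_vector):
        # Private internal memory: one bit per row, 1 = keep the row.
        self._bit_vector = bit_vector

    def move_column(self, main_memory_column, scratchpad):
        # Apply the filter while the data is "in flight".
        for value, keep in zip(main_memory_column, self._bit_vector):
            if keep:
                scratchpad.append(value)
        return scratchpad


column = [10, 20, 30, 40, 50]          # column in main memory
engine = DataMovementEngine([1, 0, 1, 1, 0])
print(engine.move_column(column, []))  # [10, 30, 40]
```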
  • Patent number: 10397317
    Abstract: Embodiments comprise a distributed join processing technique that reduces the data exchanged over the network. Embodiments first evaluate the join using a partitioned parallel join based on join tuples that represent the rows that are to be joined to produce join result tuples that represent matches between rows for the join result. Embodiments fetch, over the network, projected columns from the appropriate partitions of the tables among the nodes of the system using the record identifiers from the join result tuples. To further conserve network bandwidth, embodiments perform an additional record-identifier shuffling phase based on the respective sizes of the projected columns from the relations involved in the join operation. Specifically, the result tuples are shuffled such that transmitting projected columns from the join relation with the larger payload is avoided and the system need only exchange, over the network, projected columns from the join relation with the smaller payload.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: August 27, 2019
    Assignee: Oracle International Corporation
    Inventors: Cagri Balkesen, Sam Idicula, Nipun Agarwal
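A minimal sketch of the payload-aware record-identifier shuffle described in the abstract above. The data structures, the byte-counting cost model, and the single-process "network" are illustrative assumptions; the point is only that the side with the larger projected payload is never shipped.

```python
# Sketch only: after a partitioned join yields (rid_left, rid_right) tuples,
# fetch over the "network" only the projected columns of the smaller relation.
def payload_bytes(projected_columns):
    # Rough payload size of one relation's projected columns (illustrative).
    return sum(len(str(v)) for col in projected_columns.values() for v in col)

def finish_join(join_rids, left_cols, right_cols):
    # Decide which side's columns to ship based on their payload sizes.
    ship_left = payload_bytes(left_cols) <= payload_bytes(right_cols)
    rows = []
    for rid_l, rid_r in join_rids:
        shipped = ({c: v[rid_l] for c, v in left_cols.items()} if ship_left
                   else {c: v[rid_r] for c, v in right_cols.items()})
        local = ({c: v[rid_r] for c, v in right_cols.items()} if ship_left
                 else {c: v[rid_l] for c, v in left_cols.items()})
        rows.append({**shipped, **local})
    return rows

left = {"name": ["ann", "bob"]}                       # small payload: shipped
right = {"bio": ["a very long text...", "another long text..."]}
print(finish_join([(0, 1), (1, 0)], left, right))
```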
  • Patent number: 10394811
    Abstract: Techniques are described for executing a query with a top-N clause to select a first N-number of rows in a data source arranged at least according to a first key and a second key of the data source using a first sort order respectively specified for the first key and a second sort order respectively specified for the second key by the query. The data source may include one or more tiles that include at least a portion of the first key and the second key. To execute the query, in an embodiment, a DBMS determines, in a first vector of first key values that are in a first tile, row identifiers identifying entries of the first vector that contain values equal to a tail value that follows a particular top number of the first key values. The DBMS may select, from a second vector of values of the second key in the first tile, second key values identified based on the determined row identifiers of the first vector.
    Type: Grant
    Filed: May 30, 2017
    Date of Patent: August 27, 2019
    Assignee: Oracle International Corporation
    Inventors: Gong Zhang, Venkatraman Govindaraju, Sam Idicula
  • Publication number: 20190244139
    Abstract: Techniques are provided herein for optimal initialization of value ranges of machine learning algorithm hyperparameters and other predictions based on dataset meta-features. In an embodiment for each particular hyperparameter of a machine learning algorithm, a computer invokes, based on an inference dataset, a distinct trained metamodel for the particular hyperparameter to detect an improved subrange of possible values for the particular hyperparameter. The machine learning algorithm is configured based on the improved subranges of possible values for the hyperparameters. The machine learning algorithm is invoked to obtain a result. In an embodiment, a gradient-based search space reduction (GSSR) finds an optimal value within the improved subrange of values for the particular hyperparameter. In an embodiment, the metamodel is trained based on performance data from exploratory sampling of configuration hyperspace, such as by GSSR.
    Type: Application
    Filed: March 7, 2018
    Publication date: August 8, 2019
    Inventors: Venkatanathan Varadarajan, Sandeep Agrawal, Sam Idicula, Nipun Agarwal
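A minimal sketch of the idea in the abstract above, with stand-in names: one trained metamodel per hyperparameter maps dataset meta-features to an improved value subrange, which then configures the target algorithm. The meta-features, lambda "metamodels", and hyperparameter ranges are illustrative assumptions, not the patented system.

```python
# Sketch only: predict improved hyperparameter subranges from dataset meta-features.
def meta_features(dataset):
    rows = len(dataset)
    cols = len(dataset[0]) if dataset else 0
    return [rows, cols, rows * cols]

def predict_subranges(dataset, metamodels, full_ranges):
    feats = meta_features(dataset)
    subranges = {}
    for hp, (lo, hi) in full_ranges.items():
        # Each metamodel is assumed to predict (low, high) within [lo, hi].
        pred_lo, pred_hi = metamodels[hp](feats)
        subranges[hp] = (max(lo, pred_lo), min(hi, pred_hi))
    return subranges

# Stand-in metamodels (real ones would be trained regressors).
metamodels = {
    "learning_rate": lambda f: (0.01, 0.1),
    "max_depth":     lambda f: (4, 8),
}
full = {"learning_rate": (1e-4, 1.0), "max_depth": (1, 32)}
print(predict_subranges([[0] * 10] * 100, metamodels, full))
```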
  • Publication number: 20190235985
    Abstract: Techniques are provided for redundant execution by a better processor for intensive dynamic profiling after initial execution by a constrained processor. In an embodiment, a system of computer(s) receives a request to profile particular runtime aspects of an original binary executable. Based on the particular runtime aspects and without accessing source logic, the system statically rewrites the original binary executable into a rewritten binary executable that invokes telemetry instrumentation that makes observations of the particular runtime aspects and emits traces of those observations. A first processing core having low power (capacity) performs a first execution of the rewritten binary executable to make first observations and emit first traces of the first observations. Afterwards, a second processing core performs a second (redundant) execution of the original binary executable based on the first traces.
    Type: Application
    Filed: January 29, 2018
    Publication date: August 1, 2019
    Inventors: Sam Idicula, Kirtikar Kashyap, Arun Raghavan, Evangelos Vlachos, Venkatraman Govindaraju
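A software-level analogue, sketched under heavy simplification, of the two-phase profiling described above: the "rewritten" binary is mimicked by wrapping a Python function with telemetry that records observations, and a second, redundant run of the original function then consumes those traces. Binary rewriting and core/affinity handling are not modeled; all names are illustrative.

```python
# Sketch only: phase 1 emits traces from an instrumented run; phase 2 replays
# the original function on the traced inputs for heavier profiling.
import time

def rewrite_with_telemetry(fn, traces):
    # Phase 1: the constrained core runs the instrumented version.
    def instrumented(*args):
        start = time.perf_counter()
        result = fn(*args)
        traces.append({"args": args, "elapsed": time.perf_counter() - start})
        return result
    return instrumented

def redundant_profile(fn, traces):
    # Phase 2: a bigger core re-executes the ORIGINAL function per the traces.
    report = []
    for t in traces:
        result = fn(*t["args"])
        report.append({"args": t["args"], "result": result,
                       "first_run_elapsed": t["elapsed"]})
    return report

def work(n):
    return sum(i * i for i in range(n))

traces = []
instrumented = rewrite_with_telemetry(work, traces)
instrumented(10_000)
print(redundant_profile(work, traces))
```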
  • Patent number: 10366124
    Abstract: Techniques are described herein for grouping of operations in local memory of a processing unit. The techniques involve adding a first operation for a first leaf operator of a query execution plan to a first pipelined group. The query execution plan includes a set of leaf operators and a set of non-leaf operators. Each leaf operator of the set of one or more leaf operators has a respective parent non-leaf operator and each non-leaf operator has one or more child operators from among the set of leaf operators or others of the set of non-leaf operators. The techniques further involve determining a memory requirement of executing the first operation for the first leaf operator and executing a second operation for the respective parent non-leaf operator of the first leaf operator. The output of the first operation is input to the second operation. The techniques further involve determining whether the memory requirement is satisfied by an amount of local memory.
    Type: Grant
    Filed: June 7, 2017
    Date of Patent: July 30, 2019
    Assignee: Oracle International Corporation
    Inventors: Jian Wen, Sam Idicula, Nitin Kunal, Negar Koochakzadeh, Seema Sundara, Thomas Chang, Aarti Basant, Nipun Agarwal, Farhan Tauheed
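A minimal sketch of the grouping decision described in the abstract above, using illustrative data structures rather than the patented plan representation: a leaf operator and its parent are pipelined into one group only while their combined memory estimate fits the core's local memory.

```python
# Sketch only: fuse a leaf operator with ancestors while local memory suffices.
class Operator:
    def __init__(self, name, mem_bytes, parent=None):
        self.name = name
        self.mem_bytes = mem_bytes      # estimated local-memory requirement
        self.parent = parent            # parent non-leaf operator, if any

def build_pipelined_groups(leaf_operators, local_memory_bytes):
    groups = []
    for leaf in leaf_operators:
        group = [leaf.name]
        need = leaf.mem_bytes
        parent = leaf.parent
        # Keep fusing ancestors while the memory requirement is satisfied.
        while parent and need + parent.mem_bytes <= local_memory_bytes:
            need += parent.mem_bytes
            group.append(parent.name)
            parent = parent.parent
        groups.append(group)
    return groups

agg  = Operator("aggregate", 48_000)
filt = Operator("filter", 16_000, parent=agg)
scan = Operator("scan", 32_000, parent=filt)
print(build_pipelined_groups([scan], local_memory_bytes=64_000))
# [['scan', 'filter']]  -- the aggregate does not fit and would start a new group
```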
  • Publication number: 20190205446
    Abstract: Techniques related to distributed relational dictionaries are disclosed. In some embodiments, one or more non-transitory storage media store a sequence of instructions which, when executed by one or more computing devices, cause performance of a method. The method involves generating, by a query optimizer at a distributed database system (DDS), a query execution plan (QEP) for generating a code dictionary and a column of encoded database data. The QEP specifies a sequence of operations for generating the code dictionary. The code dictionary is a database table. The method further involves receiving, at the DDS, a column of unencoded database data from a data source that is external to the DDS. The DDS generates the code dictionary according to the QEP. Furthermore, based on joining the column of unencoded database data with the code dictionary, the DDS generates the column of encoded database data according to the QEP.
    Type: Application
    Filed: January 3, 2018
    Publication date: July 4, 2019
    Inventors: Anantha Kiran Kandukuri, Seema Sundara, Sam Idicula, Pit Fender, Nitin Kunal, Sabina Petride, Georgios Giannikis, Nipun Agarwal
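A minimal sketch of dictionary encoding by join, as in the abstract above: the code dictionary is itself a tiny table, and the encoded column is produced by joining the unencoded column against it. The distributed query-plan machinery is omitted; this is only the local encoding step.

```python
# Sketch only: build a code dictionary, then encode a column by joining on value.
def build_code_dictionary(unencoded_column):
    # One dictionary row per distinct value: value -> code.
    return {value: code for code, value in enumerate(sorted(set(unencoded_column)))}

def encode_by_join(unencoded_column, dictionary):
    # Equivalent to joining the unencoded column with the dictionary table.
    return [dictionary[value] for value in unencoded_column]

column = ["red", "blue", "red", "green", "blue"]
codes = build_code_dictionary(column)     # {'blue': 0, 'green': 1, 'red': 2}
print(encode_by_join(column, codes))      # [2, 0, 2, 1, 0]
```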
  • Publication number: 20190188205
    Abstract: A system and method for processing a group and aggregate query on a relation are disclosed. A database system determines whether assistance of a heterogeneous system (HS) of compute nodes is beneficial in performing the query. Assuming that the relation has been partitioned and loaded into the HS, the database system determines, in a compile phase, whether the HS has the functional capabilities to assist, and whether the cost and benefit favor performing the operation with the assistance of the HS. If the cost and benefit favor using the assistance of the HS, then the system enters the execution phase. The database system starts, in the execution phase, an optimal number of parallel processes to produce and consume the results from the compute nodes of the HS. After any needed transaction consistency checks, the results of the query are returned by the database system.
    Type: Application
    Filed: February 11, 2019
    Publication date: June 20, 2019
    Inventors: Sabina Petride, Sam Idicula, Nipun Agarwal
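A minimal sketch of the compile-phase decision described in the abstract above, with an illustrative cost model (not Oracle's): offload the group-and-aggregate to the heterogeneous system only if it has the needed functional capabilities and the estimated cost with its assistance beats the database-only cost.

```python
# Sketch only: the capability check plus cost/benefit test from the compile phase.
def should_offload(query_ops, hs_capabilities, db_cost, hs_cost, transfer_cost):
    functionally_capable = all(op in hs_capabilities for op in query_ops)
    beneficial = (hs_cost + transfer_cost) < db_cost
    return functionally_capable and beneficial

ops = {"group_by", "sum"}
hs  = {"group_by", "sum", "count", "min", "max"}
print(should_offload(ops, hs, db_cost=100.0, hs_cost=35.0, transfer_cost=20.0))
# True -> the execution phase would start the parallel consumer processes
```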
  • Publication number: 20190129984
    Abstract: Techniques herein map between key spaces to generate a balanced adaptive resolution histogram for dataset partitioning. In embodiments, a computer (C) creates a mapping that associates sparse keys (SKs) with distinct dense keys. C constructs a trie by processing each item of a dataset as follows. Based on the item, C obtains an SK. C navigates from a root NT (node of the trie) to a particular NT based on a sequence of dense digits (SDD). Each dense digit of the SDD is based on the mapping. Each NT identifies a dense prefix comprising dense digits. C assigns the item to a target node based on a threshold and count of items assigned to a subtree rooted at the particular node. C determines a range of SKs for each partition of the dataset, based on: an item count for a node or subtree, dense prefixes of NTs, and the mapping.
    Type: Application
    Filed: October 31, 2017
    Publication date: May 2, 2019
    Inventors: Anantha Kiran Kandukuri, Sam Idicula
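A minimal sketch of the core ideas in the abstract above: sparse keys are mapped digit-by-digit to dense digits, items are inserted into a trie, and a node spills into a deeper subtree once its item count crosses a threshold, yielding an adaptive-resolution histogram. Partition ranges would then be derived from subtree counts plus the mapping; the threshold, mapping, and keys here are illustrative assumptions.

```python
# Sketch only: dense-key trie with a per-node split threshold.
THRESHOLD = 2

class Node:
    def __init__(self, prefix=""):
        self.prefix = prefix          # dense-digit prefix identifying this node
        self.count = 0                # items assigned to the subtree rooted here
        self.items = []
        self.children = {}

def insert(node, dense_key, item):
    node.count += 1
    if not node.children and node.count <= THRESHOLD:
        node.items.append((dense_key, item))
        return
    # Node is full: push buffered items (and the new one) one dense digit deeper.
    depth = len(node.prefix)
    pending, node.items = node.items + [(dense_key, item)], []
    for key, it in pending:
        child = node.children.setdefault(key[depth], Node(node.prefix + key[depth]))
        insert(child, key, it)

def densify(sparse_key, mapping):
    # Map each sparse digit of the key to its distinct dense digit.
    return "".join(mapping[ch] for ch in sparse_key)

mapping = {"a": "0", "m": "1", "z": "2"}
root = Node()
for sk in ["am", "az", "ma", "mz", "za"]:
    insert(root, densify(sk, mapping), sk)
print(sorted(root.children), root.count)   # ['0', '1', '2'] 5
```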
  • Publication number: 20190121893
    Abstract: Techniques are described herein for introducing transcode operators into a generated operator tree during query processing. Setting up the transcode operators with correct encoding type at runtime is performed by inferring correct encoding type information during compile time. The inference of the correct encoding type information occurs in three phases during compile time: the first phase involves collecting, consolidating, and propagating the encoding-type information of input columns up the expression tree. The second phase involves pushing the encoding-type information down the tree for nodes in the expression tree that do not yet have any encoding-type assigned. The third phase involves determining which inputs to the current relational operator need to be pre-processed by a transcode operator.
    Type: Application
    Filed: October 24, 2017
    Publication date: April 25, 2019
    Inventors: Pit Fender, Sam Idicula, Nipun Agarwal, Benjamin Schlegel
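A simplified sketch of the three-phase encoding-type inference described in the abstract above, collapsed to one pass per phase over a toy expression tree. The node classes, encoding names, and the root's required "plain" encoding are illustrative assumptions.

```python
# Sketch only: propagate encodings up, push them down, then insert transcodes.
class Expr:
    def __init__(self, name, children=(), encoding=None):
        self.name, self.children, self.encoding = name, list(children), encoding

def propagate_up(node):
    # Phase 1: consolidate child encodings and propagate them upward.
    for child in node.children:
        propagate_up(child)
    child_encodings = {c.encoding for c in node.children if c.encoding}
    if node.encoding is None and len(child_encodings) == 1:
        node.encoding = child_encodings.pop()

def push_down(node, inherited):
    # Phase 2: give still-unresolved nodes the encoding expected from above.
    if node.encoding is None:
        node.encoding = inherited
    for child in node.children:
        push_down(child, node.encoding)

def insert_transcodes(node):
    # Phase 3: wrap any input whose encoding differs from its operator's.
    for i, child in enumerate(node.children):
        insert_transcodes(child)
        if child.encoding != node.encoding:
            node.children[i] = Expr(f"transcode->{node.encoding}",
                                    [child], node.encoding)

# col_a arrives dictionary-encoded, col_b arrives as plain bytes.
plus = Expr("+", [Expr("col_a", encoding="dict"), Expr("col_b", encoding="plain")])
root = Expr("filter", [plus])
propagate_up(root)
push_down(root, "plain")        # assume the consumer expects "plain" output
insert_transcodes(root)
print([c.name for c in plus.children])   # ['transcode->plain', 'col_b']
```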
  • Patent number: 10263893
    Abstract: Techniques are provided for using decentralized lock synchronization to increase network throughput. In an embodiment, a first computer sends, to a second computer comprising a lock, a request to acquire the lock. In response to receiving the lock acquisition request, the second computer detects whether the lock is available. If the lock is unavailable, then the second computer replies by sending a denial to the first computer. Otherwise, the second computer sends an exclusive grant of the lock to the first computer. While the first computer has acquired the lock, the first computer sends data to the second computer. Afterwards, the first computer sends a request to release the lock to the second computer. This completes one duty cycle of the lock, and the lock is again available for acquisition.
    Type: Grant
    Filed: December 7, 2016
    Date of Patent: April 16, 2019
    Assignee: Oracle International Corporation
    Inventors: Vikas Aggarwal, Ankur Arora, Sam Idicula, Nipun Agarwal
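A minimal sketch of one duty cycle of the lock protocol described in the abstract above. Message passing between the two computers is simulated with direct method calls; no real network is used, and the class and node names are illustrative.

```python
# Sketch only: acquire -> grant/deny, send data while holding, then release.
class LockOwner:
    """Plays the role of the second computer, which hosts the lock."""
    def __init__(self):
        self.holder = None
        self.received = []

    def acquire(self, requester):
        if self.holder is not None:
            return "DENIED"           # lock unavailable -> reply with a denial
        self.holder = requester
        return "GRANTED"              # exclusive grant of the lock

    def send_data(self, requester, data):
        assert self.holder == requester, "must hold the lock to send data"
        self.received.append(data)

    def release(self, requester):
        if self.holder == requester:
            self.holder = None        # duty cycle complete, lock reusable

owner = LockOwner()
if owner.acquire("node-1") == "GRANTED":
    owner.send_data("node-1", {"rows": 42})
    owner.release("node-1")
print(owner.acquire("node-2"))        # GRANTED -- the lock is available again
```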
  • Publication number: 20190104175
    Abstract: Embodiments comprise a distributed join processing technique that reduces the data exchanged over the network. Embodiments first evaluate the join using a partitioned parallel join based on join tuples that represent the rows that are to be joined to produce join result tuples that represent matches between rows for the join result. Embodiments fetch, over the network, projected columns from the appropriate partitions of the tables among the nodes of the system using the record identifiers from the join result tuples. To further conserve network bandwidth, embodiments perform an additional record-identifier shuffling phase based on the respective sizes of the projected columns from the relations involved in the join operation. Specifically, the result tuples are shuffled such that transmitting projected columns from the join relation with the larger payload is avoided and the system need only exchange, over the network, projected columns from the join relation with the smaller payload.
    Type: Application
    Filed: September 29, 2017
    Publication date: April 4, 2019
    Inventors: Cagri Balkesen, Sam Idicula, Nipun Agarwal
  • Publication number: 20190095818
    Abstract: Herein, horizontally scalable techniques efficiently configure machine learning algorithms for optimal accuracy and without informed inputs. In an embodiment, for each particular hyperparameter, and for each epoch, a computer processes the particular hyperparameter. An epoch explores one hyperparameter based on hyperparameter tuples. A respective score is calculated from each tuple. The tuple contains a distinct combination of values, each of which is contained in a value range of a distinct hyperparameter. All values of a tuple that belong to the particular hyperparameter are distinct. All values of a tuple that belong to other hyperparameters are held constant. The value range of the particular hyperparameter is narrowed based on an intersection point of a first line based on the scores and a second line based on the scores. A machine learning algorithm is optimally configured from repeatedly narrowed value ranges of hyperparameters. The configured algorithm is invoked to obtain a result.
    Type: Application
    Filed: January 31, 2018
    Publication date: March 28, 2019
    Inventors: Venkatanathan Varadarajan, Sam Idicula, Sandeep Agrawal, Nipun Agarwal
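A minimal sketch of the range-narrowing step described in the abstract above: sample one hyperparameter with the others held constant, score each sample, fit one line through the first pair of (value, score) points and one through the last pair, and narrow the value range around their intersection. The two-point line fits, the shrink factor, and the sample data are illustrative assumptions.

```python
# Sketch only: narrow a hyperparameter's value range around a line intersection.
def line_through(p, q):
    slope = (q[1] - p[1]) / (q[0] - p[0])
    return slope, p[1] - slope * p[0]          # y = slope * x + intercept

def narrow_range(values, scores, shrink=0.5):
    pts = sorted(zip(values, scores))
    m1, b1 = line_through(pts[0], pts[1])      # rising edge of the score curve
    m2, b2 = line_through(pts[-2], pts[-1])    # falling edge of the score curve
    x_star = (b2 - b1) / (m1 - m2)             # intersection point of both lines
    half_width = (pts[-1][0] - pts[0][0]) * shrink / 2
    return max(pts[0][0], x_star - half_width), min(pts[-1][0], x_star + half_width)

values = [0.001, 0.01, 0.1, 1.0]               # candidate values of one hyperparameter
scores = [0.62, 0.81, 0.84, 0.55]              # score per candidate value
print(narrow_range(values, scores))            # narrowed (low, high) range
```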
  • Publication number: 20190095399
    Abstract: Techniques are described herein for performing efficient matrix multiplication in architectures with scratchpad memories or associative caches using asymmetric allocation of space for the different matrices. The system receives a left matrix and a right matrix. In an embodiment, the system allocates, in a scratchpad memory, asymmetric memory space for tiles for each of the two matrices as well as a dot product matrix. The system proceeds with then performing dot product matrix multiplication involving the tiles of the left and the right matrices, storing resulting dot product values in corresponding allocated dot product matrix tiles. The system then proceeds to write the stored dot product values from the scratchpad memory into main memory.
    Type: Application
    Filed: September 26, 2017
    Publication date: March 28, 2019
    Inventors: Gaurav Chadha, Sam Idicula, Sandeep Agrawal, Nipun Agarwal
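A minimal sketch of the asymmetric tiling described in the abstract above: a tall tile of the left matrix, a wide tile of the right matrix, and a small dot-product tile are held in the "scratchpad", and results are written back to "main memory" per tile. The tile shapes are illustrative and deliberately asymmetric; this is plain Python rather than the hardware-oriented implementation.

```python
# Sketch only: tiled matrix multiply with asymmetric left/right tile shapes.
def tiled_matmul(A, B, tile_m=4, tile_n=2, tile_k=8):
    m, k = len(A), len(A[0])
    n = len(B[0])
    C = [[0.0] * n for _ in range(m)]                  # result in main memory
    for i0 in range(0, m, tile_m):
        for j0 in range(0, n, tile_n):
            # Dot-product tile held in the scratchpad.
            acc = [[0.0] * tile_n for _ in range(tile_m)]
            for k0 in range(0, k, tile_k):
                for i in range(i0, min(i0 + tile_m, m)):
                    for j in range(j0, min(j0 + tile_n, n)):
                        for kk in range(k0, min(k0 + tile_k, k)):
                            acc[i - i0][j - j0] += A[i][kk] * B[kk][j]
            for i in range(i0, min(i0 + tile_m, m)):   # write-back phase
                for j in range(j0, min(j0 + tile_n, n)):
                    C[i][j] = acc[i - i0][j - j0]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(tiled_matmul(A, B))   # [[19.0, 22.0], [43.0, 50.0]]
```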
  • Publication number: 20190095819
    Abstract: Herein are techniques for automatic tuning of hyperparameters of machine learning algorithms. System throughput is maximized by horizontally scaling and asynchronously dispatching the configuration, training, and testing of an algorithm. In an embodiment, a computer stores a best cost achieved by executing a target model based on best values of the target algorithm's hyperparameters. The best values and their cost are updated by epochs that asynchronously execute. Each epoch has asynchronous costing tasks that explore a distinct hyperparameter. Each costing task has a sample of exploratory values that differs from the best values along the distinct hyperparameter. The asynchronous costing tasks of a same epoch have different values for the distinct hyperparameter, which accomplishes an exploration. In an embodiment, an excessive update of best values or best cost creates a major epoch for exploration in a subspace that is more or less unrelated to other epochs, thereby avoiding local optima.
    Type: Application
    Filed: September 21, 2018
    Publication date: March 28, 2019
    Inventors: Venkatanathan Varadarajan, Sam Idicula, Sandeep Agrawal, Nipun Agarwal
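A minimal sketch of the asynchronous search described in the abstract above, using a thread pool as the stand-in for horizontal scaling. The cost function, hyperparameters, and sample values are illustrative stand-ins for configuring, training, and testing the real target model; major-epoch handling is omitted.

```python
# Sketch only: per-hyperparameter epochs of asynchronous costing tasks,
# each varying only that epoch's hyperparameter around the current best.
from concurrent.futures import ThreadPoolExecutor, as_completed

def cost(config):                       # lower is better (e.g., test loss)
    return (config["lr"] - 0.1) ** 2 + (config["depth"] - 6) ** 2 / 100.0

best = {"lr": 0.5, "depth": 2}
best_cost = cost(best)
samples = {"lr": [0.05, 0.1, 0.3], "depth": [4, 6, 8]}

with ThreadPoolExecutor(max_workers=4) as pool:
    for hp, values in samples.items():  # one epoch per hyperparameter
        # Each costing task differs from the best values along this one hyperparameter.
        futures = {pool.submit(cost, {**best, hp: v}): v for v in values}
        for fut in as_completed(futures):
            c = fut.result()
            if c < best_cost:           # asynchronous update of best values/cost
                best_cost, best = c, {**best, hp: futures[fut]}

print(best, round(best_cost, 4))
```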
  • Publication number: 20190095756
    Abstract: Techniques are provided for selection of machine learning algorithms based on performance predictions by trained algorithm-specific regressors. In an embodiment, a computer derives meta-feature values from an inference dataset by, for each meta-feature, deriving a respective meta-feature value from the inference dataset. For each trainable algorithm and each regression meta-model that is respectively associated with the algorithm, a respective score is calculated by invoking the meta-model based on at least one of: a respective subset of meta-feature values, and/or hyperparameter values of a respective subset of hyperparameters of the algorithm. The algorithm(s) are selected based on the respective scores. Based on the inference dataset, the selected algorithm(s) may be invoked to obtain a result. In an embodiment, the trained regressors are distinctly configured artificial neural networks. In an embodiment, the trained regressors are contained within algorithm-specific ensembles.
    Type: Application
    Filed: January 30, 2018
    Publication date: March 28, 2019
    Inventors: Sandeep Agrawal, Sam Idicula, Venkatanathan Varadarajan, Nipun Agarwal
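A minimal sketch of the selection step described in the abstract above, with stand-in regressors: each candidate algorithm has a meta-model that predicts its score from the dataset's meta-feature values, and the highest-scoring algorithm is selected. The meta-features, lambda "meta-models", and scores are illustrative assumptions, not trained models.

```python
# Sketch only: score candidate algorithms via per-algorithm meta-models.
def meta_feature_values(dataset):
    rows = len(dataset)
    cols = len(dataset[0]) if dataset else 0
    return {"n_rows": rows, "n_cols": cols, "cells": rows * cols}

# Stand-ins for trained regression meta-models (one per algorithm).
meta_models = {
    "random_forest":   lambda f: 0.7 + 0.02 * (f["n_cols"] > 5),
    "linear_model":    lambda f: 0.6 + 0.10 * (f["n_rows"] < 1000),
    "gradient_boost":  lambda f: 0.75,
}

def select_algorithm(dataset):
    feats = meta_feature_values(dataset)
    scores = {algo: model(feats) for algo, model in meta_models.items()}
    return max(scores, key=scores.get), scores

inference_dataset = [[0] * 8] * 200
print(select_algorithm(inference_dataset))
```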
  • Patent number: 10204140
    Abstract: A system and method for processing a group and aggregate query on a relation are disclosed. A database system determines whether assistance of a heterogeneous system (HS) of compute nodes is beneficial in performing the query. Assuming that the relation has been partitioned and loaded into the HS, the database system determines, in a compile phase, whether the HS has the functional capabilities to assist, and whether the cost and benefit favor performing the operation with the assistance of the HS. If the cost and benefit favor using the assistance of the HS, then the system enters the execution phase. The database system starts, in the execution phase, an optimal number of parallel processes to produce and consume the results from the compute nodes of the HS. After any needed transaction consistency checks, the results of the query are returned by the database system.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: February 12, 2019
    Assignee: Oracle International Corporation
    Inventors: Sabina Petride, Sam Idicula, Nipun Agarwal
  • Patent number: 10191656
    Abstract: A method is provided for storing XML documents in a hybrid navigation/streaming format that allows efficient storage and processing of queries on the XML data, providing the benefits of both navigation and streaming while ameliorating the disadvantages of each. Each XML document to be stored is independently analyzed to determine a combination of navigable and streamable storage formats that optimizes the processing of the data for anticipated access patterns.
    Type: Grant
    Filed: October 17, 2015
    Date of Patent: January 29, 2019
    Assignee: Oracle International Corporation
    Inventors: Sam Idicula, Balasubramanyam Sthanikam, Nipun Agarwal
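A minimal sketch of the per-document decision described in the abstract above, using a purely illustrative heuristic: portions of an XML document that anticipated queries navigate into are marked for navigable (tree) storage, while the rest is kept in a streamable form. The path-based "hot paths" model and the section granularity are assumptions.

```python
# Sketch only: choose navigable vs streamable storage per top-level section.
import xml.etree.ElementTree as ET

def choose_storage(xml_text, hot_paths):
    root = ET.fromstring(xml_text)
    plan = {}
    for child in root:
        tag_path = f"/{root.tag}/{child.tag}"
        # Sections on anticipated navigation paths get the navigable format.
        plan[tag_path] = "navigable" if tag_path in hot_paths else "streamable"
    return plan

doc = "<order><items><i/><i/></items><audit><e/></audit></order>"
print(choose_storage(doc, hot_paths={"/order/items"}))
# {'/order/items': 'navigable', '/order/audit': 'streamable'}
```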
  • Patent number: 10176114
    Abstract: Techniques provide for hardware accelerated data movement between main memory and an on-chip data movement system that comprises multiple core processors that operate on the tabular data. The tabular data is moved to or from the scratch pad memories of the core processors. While the data is in-flight, the data may be manipulated by data manipulation operations. The data movement system includes multiple data movement engines, each dedicated to moving and transforming tabular data from main memory to a subset of the core processors. Each data movement engine is coupled to an internal memory that stores data (e.g. a bit vector) that dictates how data manipulation operations are performed on tabular data moved from a main memory to the memories of a core processor, or to and from other memories. The internal memory of each data movement engine is private to the data movement engine.
    Type: Grant
    Filed: November 28, 2016
    Date of Patent: January 8, 2019
    Assignee: Oracle International Corporation
    Inventors: David A. Brown, Sam Idicula, Erik Schlanger, Rishabh Jain, Michael Duller