Patents by Inventor Dina Thomas

Dina Thomas has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Combined row and columnar storage for in-memory databases for OLTP and analytics workloads

Patent number: 11860830

Abstract: Columns of a table are stored in either row-major format or column-major format in an in-memory DBMS. For a given table, one set of columns is stored in column-major format; another set of columns for a table are stored in row-major format. This way of storing columns of a table is referred to herein as dual-major format. In addition, a row in a dual-major table is updated “in-place”, that is, updates are made directly to column-major columns without creating an interim row-major form of the column-major columns of the row. Users may submit database definition language (“DDL”) commands that declare the row-major columns and column-major columns of a table.

Type: Grant

Filed: February 27, 2019

Date of Patent: January 2, 2024

Assignee: Oracle International Corporation

Inventors: Tirthankar Lahiri, Martin A. Reames, Kirk Edson, Neelam Goyal, Kao Makino, Anindya Patthak, Dina Thomas, Subhradyuti Sarkar, Chi-Kim Hoang, Qingchun Jiang
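
The dual-major layout described in the abstract of patent 11860830 can be illustrated with a minimal sketch. Everything named here (`DualMajorTable`, `insert`, `update`, `get`) is hypothetical and not taken from the patent; the constructor arguments merely stand in for the DDL declaration of row-major and column-major columns, and Python lists stand in for in-memory storage.

```python
class DualMajorTable:
    """Toy table: some columns stored row-major, the rest column-major."""

    def __init__(self, row_major, column_major):
        self.row_major = list(row_major)        # names of row-major columns
        self.column_major = list(column_major)  # names of column-major columns
        self.rows = []                          # one list per row (row-major part)
        self.columns = {name: [] for name in self.column_major}

    def insert(self, record):
        """Split a record across both layouts and return its row id."""
        self.rows.append([record[name] for name in self.row_major])
        for name in self.column_major:
            self.columns[name].append(record[name])
        return len(self.rows) - 1

    def update(self, row_id, changes):
        """Update in place; column-major values are overwritten directly,
        without first rebuilding a row-major copy of the row."""
        for name, value in changes.items():
            if name in self.columns:
                self.columns[name][row_id] = value
            else:
                self.rows[row_id][self.row_major.index(name)] = value

    def get(self, row_id):
        """Reassemble one logical row from both layouts."""
        record = dict(zip(self.row_major, self.rows[row_id]))
        for name in self.column_major:
            record[name] = self.columns[name][row_id]
        return record


table = DualMajorTable(row_major=["id", "name"], column_major=["amount"])
first = table.insert({"id": 1, "name": "ann", "amount": 10})
table.insert({"id": 2, "name": "bob", "amount": 20})
table.update(first, {"amount": 15})
```

An analytic scan in this sketch reads `table.columns["amount"]` as one contiguous list, while a point lookup uses `table.get(row_id)`.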
Evaluating SQL expressions on dictionary encoded vectors

Patent number: 11294816

Abstract: Techniques are described herein for reducing the number of redundant evaluations that occur when an expression is evaluated against an encoded column vector by caching results of expression evaluations. When executing a query that includes an expression that references columns for which dictionary-encoded column vectors exist, the database server performs a cost-based analysis to determine which expressions (or sub-expressions) would benefit from caching the expression's evaluation result. For each such expression, the database server performs the necessary computations and caches the results for each of the possible distinct input values. When evaluating an expression for a row with a particular set of input codes, a look-up is performed based on the input code combination to retrieve the pre-computed results of that evaluation from the cache.

Type: Grant

Filed: January 3, 2017

Date of Patent: April 5, 2022

Assignee: Oracle International Corporation

Inventors: Shasank K. Chavan, Dina Thomas, Ajit Mylavarapu, Prashant Gaharwar, Dennis Lui, Sheldon A. K. Lewis
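
The caching idea in the abstract of patent 11294816 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function names are hypothetical, the cost-based analysis is omitted, and the cache is simply a Python dict keyed by the combination of input codes.

```python
from itertools import product


def dictionary_encode(values):
    """Return (sorted dictionary, code vector) for one column."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in values]


def evaluate_with_cache(expr, columns):
    """Evaluate expr once per distinct code combination, then look up per row."""
    encoded = [dictionary_encode(column) for column in columns]
    dictionaries = [dictionary for dictionary, _ in encoded]
    code_vectors = [codes for _, codes in encoded]

    # Precompute the result for every possible combination of input codes.
    cache = {}
    for codes in product(*(range(len(d)) for d in dictionaries)):
        args = [d[code] for d, code in zip(dictionaries, codes)]
        cache[codes] = expr(*args)

    # Per-row evaluation is now a cache look-up on the code combination.
    results = [cache[codes] for codes in zip(*code_vectors)]
    return results, len(cache)


price = [10, 20, 10, 20, 10]
quantity = [1, 1, 2, 2, 1]
totals, evaluations = evaluate_with_cache(lambda p, q: p * q, [price, quantity])
```

Here five rows need only four expression evaluations, because each column has two distinct values. The saving grows with the ratio of rows to distinct code combinations.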
Column domain dictionary compression

Patent number: 10756759

Abstract: In column domain dictionary compression, column values in one or more columns are tokenized by a single dictionary. The domain of the dictionary is the entire set of columns. A dictionary may not only map a token to a tokenized value, but also to a count (“token count”) of the number of occurrences of the token and corresponding tokenized value in the dictionary's domain. Such information may be used to compute queries on the base table.

Type: Grant

Filed: September 2, 2011

Date of Patent: August 25, 2020

Assignee: Oracle International Corporation

Inventors: Tirthankar Lahiri, Chi-Kim Hoang, Dina Thomas, Kirk Meredith Edson, Subhradyuti Sarkar, Mark McAuliffe, Marie-Anne Neimat, Chih-Ping Wang
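
A single dictionary shared by several columns, with a per-token occurrence count, might look like the sketch below. The class and function names are assumptions made for illustration; the abstract of patent 10756759 does not specify any data layout.

```python
from collections import Counter


class DomainDictionary:
    """One dictionary whose domain spans every column encoded with it."""

    def __init__(self):
        self.token_of = {}      # value -> token
        self.value_of = []      # token -> value
        self.count = Counter()  # token -> occurrences across the whole domain

    def tokenize(self, value):
        """Return the token for value, creating it if needed, and count it."""
        token = self.token_of.get(value)
        if token is None:
            token = len(self.value_of)
            self.token_of[value] = token
            self.value_of.append(value)
        self.count[token] += 1
        return token


def encode_columns(dictionary, columns):
    """Tokenize every column in the domain with the same dictionary."""
    return {
        name: [dictionary.tokenize(value) for value in values]
        for name, values in columns.items()
    }


cities = DomainDictionary()
encoded = encode_columns(
    cities,
    {"ship_city": ["Oslo", "Rome", "Oslo"], "bill_city": ["Rome", "Rome", "Paris"]},
)
rome_occurrences = cities.count[cities.token_of["Rome"]]
```

A query such as "how many times does Rome appear in the domain" is answered from `cities.count` alone, without scanning the encoded columns.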
In-memory column-level multi-versioned global dictionary for in-memory databases

Patent number: 10726016

Abstract: Techniques are described herein for sharing a dictionary across multiple in-memory compression units (IMCUs). After a dictionary is used to encode a first column vector in a first IMCU, the same dictionary is used to encode a second column vector in a second IMCU. The entries in the dictionary are in sort order to facilitate binary searching when performing value-to-code look-ups. If, during the encoding of the second column vector, values are encountered for which the dictionary does not already have codes, then a “sort-order-boundary” is established after the last entry in the dictionary, and entries for the newly encountered values are added to the dictionary, after the sort-order-boundary. To facilitate value-to-code look-ups, the new entries are also sorted relative to each other, creating a second “sort order set”. A new version of the dictionary may be created when the number of sort order sets in the first version of the dictionary reaches a configurable threshold.

Type: Grant

Filed: October 14, 2016

Date of Patent: July 28, 2020

Assignee: Oracle International Corporation

Inventors: Shasank K. Chavan, Prashant Gaharwar, Ajit Mylavarapu, Dina Thomas, Dennis Lui, Sheldon A. K. Lewis, Roger D. Macnicol
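
The sort-order-set mechanism in the abstract of patent 10726016 can be sketched as an append-only list with recorded boundaries. The class name, method names, and the `max_sets` parameter are hypothetical; the sketch only reports when the threshold is reached and does not build the new dictionary version.

```python
from bisect import bisect_left


class SortOrderSetDictionary:
    """Append-only dictionary made of independently sorted runs of entries."""

    def __init__(self, values, max_sets=2):
        self.entries = sorted(set(values))  # first sort order set
        self.boundaries = [0]               # start index of each sort order set
        self.max_sets = max_sets            # assumed configurable threshold

    def lookup(self, value):
        """Binary-search each sort order set; return the code or None."""
        ends = self.boundaries[1:] + [len(self.entries)]
        for start, end in zip(self.boundaries, ends):
            i = bisect_left(self.entries, value, start, end)
            if i < end and self.entries[i] == value:
                return i
        return None

    def extend(self, values):
        """Add unseen values after a new sort-order-boundary, sorted among themselves."""
        new_values = sorted({v for v in values if self.lookup(v) is None})
        if new_values:
            self.boundaries.append(len(self.entries))
            self.entries.extend(new_values)

    def encode(self, values):
        """Encode a column vector, extending the dictionary if necessary."""
        self.extend(values)
        return [self.lookup(v) for v in values]

    def needs_new_version(self):
        return len(self.boundaries) >= self.max_sets


shared = SortOrderSetDictionary(["cat", "ant", "bee"], max_sets=2)
first_imcu = shared.encode(["bee", "ant"])
before = shared.needs_new_version()
second_imcu = shared.encode(["dog", "ant", "asp"])
after = shared.needs_new_version()
```

Existing codes never change when entries are appended, so the first column vector stays valid after the second one extends the dictionary. The cost is one binary search per sort order set on each look-up, which is why capping the number of sets matters.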
Using shared dictionaries on join columns to improve performance of joins in relational databases

Patent number: 10678791

Abstract: Techniques are described for encoding join columns that belong to the same domain with a common dictionary. The tables are encoded with dictionary indexes that make the comparison operation of a join query a quick equality check of two integers and there is no need to compute any hashes during execution. Additionally, the techniques described herein minimize the bloom filter creation and evaluation cost as well because the dictionary indexes serve as hash values into the bloom filter. If the bloom filter is as large as the range of dictionary indexes, then the filter is no longer a probabilistic structure and can be used to filter rows in the probe phase with full certainty without any significant overhead.

Type: Grant

Filed: May 22, 2017

Date of Patent: June 9, 2020

Assignee: Oracle International Corporation

Inventors: Shasank K. Chavan, Dina Thomas, Ajit Mylavarapu, Prashant Gaharwar, Tirthankar Lahiri, Jesse Kamp
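
The following sketch illustrates the abstract of patent 10678791 under simplifying assumptions. Both join columns are encoded with one shared dictionary, and the filter is a bitmap with one slot per dictionary index, so membership is exact rather than probabilistic. All function names are hypothetical.

```python
def build_shared_dictionary(*columns):
    """Map every distinct value across the join columns to one integer code."""
    values = sorted(set().union(*columns))
    return {value: code for code, value in enumerate(values)}


def encode(dictionary, column):
    return [dictionary[value] for value in column]


def join_with_exact_filter(build_codes, probe_codes, dictionary_size):
    """Equi-join on codes, using arrays indexed directly by dictionary code."""
    bitmap = bytearray(dictionary_size)                  # filter: one slot per code
    rows_by_code = [[] for _ in range(dictionary_size)]  # build side, indexed by code
    for row, code in enumerate(build_codes):
        bitmap[code] = 1
        rows_by_code[code].append(row)

    matches = []
    for probe_row, code in enumerate(probe_codes):
        if bitmap[code]:  # exact test: no false positives
            for build_row in rows_by_code[code]:
                matches.append((build_row, probe_row))
    return matches


dim_country = ["US", "DE", "FR"]
fact_country = ["DE", "JP", "US", "DE"]
shared = build_shared_dictionary(dim_country, fact_country)
pairs = join_with_exact_filter(
    encode(shared, dim_country), encode(shared, fact_country), len(shared)
)
```

Because the code itself is the index into both arrays, no hash function is computed during the build or probe phase in this sketch.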
Combined Row And Columnar Storage For In-Memory Databases For OLTP And Analytics Workloads

Publication number: 20190197026

Abstract: Columns of a table are stored in either row-major format or column-major format in an in-memory DBMS. For a given table, one set of columns is stored in column-major format; another set of columns for a table are stored in row-major format. This way of storing columns of a table is referred to herein as dual-major format. In addition, a row in a dual-major table is updated “in-place”, that is, updates are made directly to column-major columns without creating an interim row-major form of the column-major columns of the row. Users may submit database definition language (“DDL”) commands that declare the row-major columns and column-major columns of a table.

Type: Application

Filed: February 27, 2019

Publication date: June 27, 2019

Inventors: Tirthankar Lahiri, Martin A. Reames, Kirk Edson, Neelam Goyal, Kao Makino, Anindya Patthak, Dina Thomas, Subhradyuti Sarkar, Chi-Kim Hoang, Qingchun Jiang
Combined row and columnar storage for in-memory databases for OLTP and analytics workloads

Patent number: 10311154

Abstract: Columns of a table are stored in either row-major format or column-major format in an in-memory DBMS. For a given table, one set of columns is stored in column-major format; another set of columns for a table are stored in row-major format. This way of storing columns of a table is referred to herein as dual-major format. In addition, a row in a dual-major table is updated “in-place”, that is, updates are made directly to column-major columns without creating an interim row-major form of the column-major columns of the row. Users may submit database definition language (“DDL”) commands that declare the row-major columns and column-major columns of a table.

Type: Grant

Filed: December 5, 2013

Date of Patent: June 4, 2019

Assignee: Oracle International Corporation

Inventors: Tirthankar Lahiri, Martin A. Reames, Kirk Edson, Neelam Goyal, Kao Makino, Anindya Patthak, Dina Thomas, Subhradyuti Sarkar, Chi-Kim Hoang, Qingchun Jiang
Using shared dictionaries on join columns to improve performance of joins in relational databases

Publication number: 20170255675

Abstract: Techniques are described for encoding join columns that belong to the same domain with a common dictionary. The tables are encoded with dictionary indexes that make the comparison operation of a join query a quick equality check of two integers and there is no need to compute any hashes during execution. Additionally, the techniques described herein minimize the bloom filter creation and evaluation cost as well because the dictionary indexes serve as hash values into the bloom filter. If the bloom filter is as large as the range of dictionary indexes, then the filter is no longer a probabilistic structure and can be used to filter rows in the probe phase with full certainty without any significant overhead.

Type: Application

Filed: May 22, 2017

Publication date: September 7, 2017

Inventors: Shasank K. Chavan, Dina Thomas, Ajit Mylavarapu, Prashant Gaharwar, Tirthankar Lahiri, Jesse Kamp
Evaluating SQL expressions on dictionary encoded vectors

Publication number: 20170116242

Abstract: Techniques are described herein for reducing the number of redundant evaluations that occur when an expression is evaluated against an encoded column vector by caching results of expression evaluations. When executing a query that includes an expression that references columns for which dictionary-encoded column vectors exist, the database server performs a cost-based analysis to determine which expressions (or sub-expressions) would benefit from caching the expression's evaluation result. For each such expression, the database server performs the necessary computations and caches the results for each of the possible distinct input values. When evaluating an expression for a row with a particular set of input codes, a look-up is performed based on the input code combination to retrieve the pre-computed results of that evaluation from the cache.

Type: Application

Filed: January 3, 2017

Publication date: April 27, 2017

Inventors: Shasank K. Chavan, Dina Thomas, Ajit Mylavarapu, Prashant Gaharwar, Dennis Lui, Sheldon A. K. Lewis
In-memory column-level multi-versioned global dictionary for in-memory databases

Publication number: 20170109406

Abstract: Techniques are described herein for sharing a dictionary across multiple in-memory compression units (IMCUs). After a dictionary is used to encode a first column vector in a first IMCU, the same dictionary is used to encode a second column vector in a second IMCU. The entries in the dictionary are in sort order to facilitate binary searching when performing value-to-code look-ups. If, during the encoding of the second column vector, values are encountered for which the dictionary does not already have codes, then a “sort-order-boundary” is established after the last entry in the dictionary, and entries for the newly encountered values are added to the dictionary, after the sort-order-boundary. To facilitate value-to-code look-ups, the new entries are also sorted relative to each other, creating a second “sort order set”. A new version of the dictionary may be created when the number of sort order sets in the first version of the dictionary reaches a configurable threshold.

Type: Application

Filed: October 14, 2016

Publication date: April 20, 2017

Inventors: Shasank K. Chavan, Prashant Gaharwar, Ajit Mylavarapu, Dina Thomas, Dennis Lui, Sheldon A. K. Lewis, Roger D. Macnicol
Combined Row and Columnar Storage for In-Memory Databases for OLTP and Analytics Workloads

Publication number: 20150088813

Abstract: Columns of a table are stored in either row-major format or column-major format in an in-memory DBMS. For a given table, one set of columns is stored in column-major format; another set of columns for a table are stored in row-major format. This way of storing columns of a table is referred to herein as dual-major format. In addition, a row in a dual-major table is updated “in-place”, that is, updates are made directly to column-major columns without creating an interim row-major form of the column-major columns of the row. Users may submit database definition language (“DDL”) commands that declare the row-major columns and column-major columns of a table.

Type: Application

Filed: December 5, 2013

Publication date: March 26, 2015

Applicant: Oracle International Corporation

Inventors: Tirthankar Lahiri, Martin A. Reames, Kirk Edson, Neelam Goyal, Kao Makino, Anindya Patthak, Dina Thomas, Subhradyuti Sarkar, Chi-Kim Hoang, Qingchun Jiang
User-defined parallelization in transactional replication of in-memory database

Patent number: 8738568

Abstract: A replication track is a designated group of transactions that are to be replicated at a destination database in a way that, with respect to any other transaction in the replication track, preserves transactional dependency. Further, transactions in a replication track can be replicated at the destination database without regard to transactional dependency of other transactions in another track. This facilitates concurrent parallel replication of transactions of different tracks. Replicating data in this manner is referred to herein as track replication. An application may request execution of transactions and designate different tracks for transactions.

Type: Grant

Filed: May 5, 2011

Date of Patent: May 27, 2014

Assignee: Oracle International Corporation

Inventors: Sourav Ghosh, Rohan Aranha, Tirthankar Lahiri, Mark McAuliffe, Chih-Ping Wang, Paul Tuck, Nagender Bandi, John E. Miller, Dina Thomas, Marie-Anne Neimat
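
Track replication as described in the abstract of patent 8738568 can be approximated by grouping a transaction log by track and replaying each track in order on its own worker. This is a sketch under assumptions: the log format (pairs of track id and transaction), the `apply` callback, and the use of a thread pool are all illustrative choices, not details from the patent.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_track(transactions):
    """Split the log into tracks, keeping commit order within each track."""
    tracks = defaultdict(list)
    for track_id, txn in transactions:
        tracks[track_id].append(txn)
    return tracks


def replicate(transactions, apply):
    """Replay tracks concurrently; order is preserved only inside a track."""
    tracks = group_by_track(transactions)

    def replay(track_id):
        # Transactions of one track are applied strictly in sequence.
        return track_id, [apply(txn) for txn in tracks[track_id]]

    with ThreadPoolExecutor(max_workers=max(1, len(tracks))) as pool:
        return dict(pool.map(replay, tracks))


log = [(1, "a1"), (2, "b1"), (1, "a2"), (2, "b2")]
applied = replicate(log, str.upper)
```

Transactions that depend on each other must be placed in the same track by the application, since nothing in this sketch orders work across tracks.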
Column Domain Dictionary Compression

Publication number: 20130060780

Abstract: In column domain dictionary compression, column values in one or more columns are tokenized by a single dictionary. The domain of the dictionary is the entire set of columns. A dictionary may not only map a token to a tokenized value, but also to a count (“token count”) of the number of occurrences of the token and corresponding tokenized value in the dictionary's domain. Such information may be used to compute queries on the base table.

Type: Application

Filed: September 2, 2011

Publication date: March 7, 2013

Applicant: Oracle International Corporation

Inventors: Tirthankar Lahiri, Chi-Kim Hoang, Dina Thomas, Kirk Meredith Edson, Subhradyuti Sarkar, Mark McAuliffe, Marie-Anne Neimat, Chih-Ping Wang
System and method for analyzing streams and counting stream items on multi-core processors

Patent number: 8321579

Abstract: Systems and methods for parallel stream item counting are disclosed. A data stream is partitioned into portions and the portions are assigned to a plurality of processing cores. A sequential kernel is executed at each processing core to compute a local count for items in an assigned portion of the data stream for that processing core. The counts are aggregated for all the processing cores to determine a final count for the items in the data stream. A frequency-aware counting method (FCM) for data streams includes dynamically capturing relative frequency phases of items from a data stream and placing the items in a sketch structure using a plurality of hash functions where a number of hash functions is based on the frequency phase of the item. A zero-frequency table is provided to reduce errors due to absent items.

Type: Grant

Filed: July 26, 2007

Date of Patent: November 27, 2012

Assignee: International Business Machines Corporation

Inventors: Charu Chandra Aggarwal, Rajesh Bordawekar, Dina Thomas, Philip Shilung Yu
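
The partition, local count, and aggregate steps in the abstract of patent 8321579 can be sketched as below. The sketch uses exact counters and a thread pool as stand-ins; it does not implement the frequency-aware counting method (FCM), the sketch structure with multiple hash functions, or the zero-frequency table. The function names and the round-robin partitioning are assumptions.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(stream, parts):
    """Split the stream into `parts` portions (round-robin, for illustration)."""
    return [stream[i::parts] for i in range(parts)]


def count_stream(stream, workers=4):
    """Run a sequential counting kernel per portion, then aggregate the counts."""
    portions = partition(list(stream), workers)
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Counter acts as the sequential kernel producing a local count.
        for local_count in pool.map(Counter, portions):
            total.update(local_count)
    return total


counts = count_stream("abracadabra", workers=3)
```

Because addition of counts is associative and commutative, the final result does not depend on how items are assigned to portions.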
User-Defined Parallelization in Transactional Replication of In-Memory Database

Publication number: 20120284228

Abstract: A replication track is a designated group of transactions that are to be replicated at a destination database in a way that, with respect to any other transaction in the replication track, preserves transactional dependency. Further, transactions in a replication track can be replicated at the destination database without regard to transactional dependency of other transactions in another track. This facilitates concurrent parallel replication of transactions of different tracks. Replicating data in this manner is referred to herein as track replication. An application may request execution of transactions and designate different tracks for transactions.

Type: Application

Filed: May 5, 2011

Publication date: November 8, 2012

Inventors: Sourav Ghosh, Rohan Aranha, Tirthankar Lahiri, Mark McAuliffe, Chih-Ping Wang, Paul Tuck, Nagender Bandi, John E. Miller, Dina Thomas, Marie-Anne Neimat
System and method for analyzing streams and counting stream items on multi-core processors

Publication number: 20090031175

Abstract: Systems and methods for parallel stream item counting are disclosed. A data stream is partitioned into portions and the portions are assigned to a plurality of processing cores. A sequential kernel is executed at each processing core to compute a local count for items in an assigned portion of the data stream for that processing core. The counts are aggregated for all the processing cores to determine a final count for the items in the data stream. A frequency-aware counting method (FCM) for data streams includes dynamically capturing relative frequency phases of items from a data stream and placing the items in a sketch structure using a plurality of hash functions where a number of hash functions is based on the frequency phase of the item. A zero-frequency table is provided to reduce errors due to absent items.

Type: Application

Filed: July 26, 2007

Publication date: January 29, 2009

Inventors: Charu Chandra Aggarwal, Rajesh Bordawekar, Dina Thomas, Philip Shilung Yu