Patents by Inventor Dina Thomas
Dina Thomas has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11860830Abstract: Columns of a table are stored in either row-major format or column-major format in an in-memory DBMS. For a given table, one set of columns is stored in column-major format; another set of columns for a table are stored in row-major format. This way of storing columns of a table is referred to herein as dual-major format. In addition, a row in a dual-major table is updated “in-place”, that is, updates are made directly to column-major columns without creating an interim row-major form of the column-major columns of the row. Users may submit database definition language (“DDL”) commands that declare the row-major columns and column-major columns of a table.Type: GrantFiled: February 27, 2019Date of Patent: January 2, 2024Assignee: Oracle International CorporationInventors: Tirthankar Lahiri, Martin A. Reames, Kirk Edson, Neelam Goyal, Kao Makino, Anindya Patthak, Dina Thomas, Subhradyuti Sarkar, Chi-Kim Hoang, Qingchun Jiang
-
Patent number: 11294816Abstract: Techniques are described herein for reducing the number of redundant evaluations that occur when an expression is evaluated against an encoded column vector by caching results of expression evaluations. When executing a query that includes an expression that references columns for which dictionary-encoded column vectors exist, the database server performs a cost-based analysis to determine which expressions (or sub-expressions) would benefit from caching the expression's evaluation result. For each such expression, the database server performs the necessary computations and caches the results for each of the possible distinct input values. When evaluating an expression for a row with a particular set of input codes, a look-up is performed based on the input code combination to retrieve the pre-computed results of that evaluation from the cache.Type: GrantFiled: January 3, 2017Date of Patent: April 5, 2022Assignee: Oracle International CorporationInventors: Shasank K. Chavan, Dina Thomas, Ajit Mylavarapu, Prashant Gaharwar, Dennis Lui, Sheldon A. K. Lewis
-
Patent number: 10756759Abstract: In column domain dictionary compression, column values in one or more columns are tokenized by a single dictionary. The domain of the dictionary is the entire set of columns. A dictionary may not only map a token to a tokenized value, but also to a count (“token count”) of the number of occurrences of the token and corresponding tokenized value in the dictionary's domain. Such information may be used to compute queries on the base table.Type: GrantFiled: September 2, 2011Date of Patent: August 25, 2020Assignee: Oracle International CorporationInventors: Tirthankar Lahiri, Chi-Kim Hoang, Dina Thomas, Kirk Meredith Edson, Subhradyuti Sarkar, Mark McAuliffe, Marie-Anne Neimat, Chih-Ping Wang
-
Patent number: 10726016Abstract: Techniques are described herein for sharing a dictionary across multiple in-memory compression units (IMCUs). After a dictionary is used to encode a first column vector in a first IMCU, the same dictionary is used to encode a second column vector in a second IMCU. The entries in the dictionary are in sort order to facilitate binary searching when performing value-to-code look-ups. If, during the encoding of the second column vector, values are encountered for which the dictionary does not already have codes, then a “sort-order-boundary” is established after the last entry in the dictionary, and entries for the newly encountered values are added to the dictionary, after the sort-order-boundary. To facilitate value-to-code look-ups, the new entries are also sorted relative to each other, creating a second “sort order set”. A new version of the dictionary may be created when the number of sort order sets in the first version of the dictionary reaches a configurable threshold.Type: GrantFiled: October 14, 2016Date of Patent: July 28, 2020Assignee: Oracle International CorporationInventors: Shasank K. Chavan, Prashant Gaharwar, Ajit Mylavarapu, Dina Thomas, Dennis Lui, Sheldon A. K. Lewis, Roger D. Macnicol
-
Patent number: 10678791Abstract: Techniques are described for encoding join columns that belong to the same domain with a common dictionary. The tables are encoded with dictionary indexes that make the comparison operation of a join query a quick equality check of two integers and there is no need to compute any hashes during execution. Additionally, the techniques described herein minimize the bloom filter creation and evaluation cost as well because the dictionary indexes serve as hash values into the bloom filter. If the bloom filter is as large as the range of dictionary indexes, then the filter is no longer a probabilistic structure and can be used to filter rows in the probe phase with full certainty without any significant overhead.Type: GrantFiled: May 22, 2017Date of Patent: June 9, 2020Assignee: Oracle International CorporationInventors: Shasank K. Chavan, Dina Thomas, Ajit Mylavarapu, Prashant Gaharwar, Tirthankar Lahiri, Jesse Kamp
-
Publication number: 20190197026Abstract: Columns of a table are stored in either row-major format or column-major format in an in-memory DBMS. For a given table, one set of columns is stored in column-major format; another set of columns for a table are stored in row-major format. This way of storing columns of a table is referred to herein as dual-major format. In addition, a row in a dual-major table is updated “in-place”, that is, updates are made directly to column-major columns without creating an interim row-major form of the column-major columns of the row. Users may submit database definition language (“DDL”) commands that declare the row-major columns and column-major columns of a table.Type: ApplicationFiled: February 27, 2019Publication date: June 27, 2019Inventors: TIRTHANKAR LAHIRI, MARTIN A. REAMES, KIRK EDSON, NEELAM GOYAL, KAO MAKINO, ANINDYA PATTHAK, DINA THOMAS, SUBHRADYUTI SARKAR, CHI-KIM HOANG, QINGCHUN JIANG
-
Patent number: 10311154Abstract: Columns of a table are stored in either row-major format or column-major format in an in-memory DBMS. For a given table, one set of columns is stored in column-major format; another set of columns for a table are stored in row-major format. This way of storing columns of a table is referred to herein as dual-major format. In addition, a row in a dual-major table is updated “in-place”, that is, updates are made directly to column-major columns without creating an interim row-major form of the column-major columns of the row. Users may submit database definition language (“DDL”) commands that declare the row-major columns and column-major columns of a table.Type: GrantFiled: December 5, 2013Date of Patent: June 4, 2019Assignee: Oracle International CorporationInventors: Tirthankar Lahiri, Martin A. Reames, Kirk Edson, Neelam Goyal, Kao Makino, Anindya Patthak, Dina Thomas, Subhradyuti Sarkar, Chi-Kim Hoang, Qingchun Jiang
-
Publication number: 20170255675Abstract: Techniques are described for encoding join columns that belong to the same domain with a common dictionary. The tables are encoded with dictionary indexes that make the comparison operation of a join query a quick equality check of two integers and there is no need to compute any hashes during execution. Additionally, the techniques described herein minimize the bloom filter creation and evaluation cost as well because the dictionary indexes serve as hash values into the bloom filter. If the bloom filter is as large as the range of dictionary indexes, then the filter is no longer a probabilistic structure and can be used to filter rows in the probe phase with full certainty without any significant overhead.Type: ApplicationFiled: May 22, 2017Publication date: September 7, 2017Inventors: Shasank K. Chavan, Dina Thomas, Ajit Mylavarapu, Prashant Gaharwar, Tirthankar Lahiri, Jesse Kamp
-
Publication number: 20170116242Abstract: Techniques are described herein for reducing the number of redundant evaluations that occur when an expression is evaluated against an encoded column vector by caching results of expression evaluations. When executing a query that includes an expression that references columns for which dictionary-encoded column vectors exist, the database server performs a cost-based analysis to determine which expressions (or sub-expressions) would benefit from caching the expression's evaluation result. For each such expression, the database server performs the necessary computations and caches the results for each of the possible distinct input values. When evaluating an expression for a row with a particular set of input codes, a look-up is performed based on the input code combination to retrieve the pre-computed results of that evaluation from the cache.Type: ApplicationFiled: January 3, 2017Publication date: April 27, 2017Inventors: Shasank K. Chavan, Dina Thomas, Ajit Mylavarapu, Prashant Gaharwar, Dennis Lui, Sheldon A.K. Lewis
-
Publication number: 20170109406Abstract: Techniques are described herein for sharing a dictionary across multiple in-memory compression units (IMCUs). After a dictionary is used to encode a first column vector in a first IMCU, the same dictionary is used to encode a second column vector in a second IMCU. The entries in the dictionary are in sort order to facilitate binary searching when performing value-to-code look-ups. If, during the encoding of the second column vector, values are encountered for which the dictionary does not already have codes, then a “sort-order-boundary” is established after the last entry in the dictionary, and entries for the newly encountered values are added to the dictionary, after the sort-order-boundary. To facilitate value-to-code look-ups, the new entries are also sorted relative to each other, creating a second “sort order set”. A new version of the dictionary may be created when the number of sort order sets in the first version of the dictionary reaches a configurable threshold.Type: ApplicationFiled: October 14, 2016Publication date: April 20, 2017Inventors: Shasank K. Chavan, Prashant Gaharwar, Ajit Mylavarapu, Dina Thomas, Dennis Lui, Sheldon A.K. Lewis, Roger D. Macnicol
-
Publication number: 20150088813Abstract: Columns of a table are stored in either row-major format or column-major format in an in-memory DBMS. For a given table, one set of columns is stored in column-major format; another set of columns for a table are stored in row-major format. This way of storing columns of a table is referred to herein as dual-major format. In addition, a row in a dual-major table is updated “in-place”, that is, updates are made directly to column-major columns without creating an interim row-major form of the column-major columns of the row. Users may submit database definition language (“DDL”) commands that declare the row-major columns and column-major columns of a table.Type: ApplicationFiled: December 5, 2013Publication date: March 26, 2015Applicant: Oracle International CorporationInventors: Tirthankar Lahiri, Martin A. Reames, Kirk Edson, Neelam Goyal, Kao Makino, Anindya Patthak, Dina Thomas, Subhradyuti Sarkar, Chi-Kim Hoang, Qingchun Jiang
-
Patent number: 8738568Abstract: A replication track is a designated group of transactions that are to be replicated at a destination database in a way that, with respect to any other transaction in the replication track, preserves transactional dependency. Further, transactions in a replication track can be replicated at the destination database without regard to transactional dependency of other transactions in another track. This facilitates concurrent parallel replication of transactions of different tracks. Replicating data in this manner is referred to herein as track replication. An application may request execution of transactions and designate different tracks for transactions.Type: GrantFiled: May 5, 2011Date of Patent: May 27, 2014Assignee: Oracle International CorporationInventors: Sourav Ghosh, Rohan Aranha, Tirthankar Lahiri, Mark McAuliffe, Chih-Ping Wang, Paul Tuck, Nagender Bandi, John E. Miller, Dina Thomas, Marie-Anne Neimat
-
Publication number: 20130060780Abstract: In column domain dictionary compression, column values in one or more columns are tokenized by a single dictionary. The domain of the dictionary is the entire set of columns. A dictionary may not only map a token to a tokenized value, but also to a count (“token count”) of the number of occurrences of the token and corresponding tokenized value in the dictionary's domain. Such information may be used to compute queries on the base table.Type: ApplicationFiled: September 2, 2011Publication date: March 7, 2013Applicant: ORACLE INTERNATIONAL CORPORATIONInventors: Tirthankar Lahiri, Chi-Kim Hoang, Dina Thomas, Kirk Meredith Edson, Subhradyuti Sarkar, Mark McAuliffe, Marie-Anne Neimat, Chih-Ping Wang
-
Patent number: 8321579Abstract: Systems and methods for parallel stream item counting are disclosed. A data stream is partitioned into portions and the portions are assigned to a plurality of processing cores. A sequential kernel is executed at each processing core to compute a local count for items in an assigned portion of the data stream for that processing core. The counts are aggregated for all the processing cores to determine a final count for the items in the data stream. A frequency-aware counting method (FCM) for data streams includes dynamically capturing relative frequency phases of items from a data stream and placing the items in a sketch structure using a plurality of hash functions where a number of hash functions is based on the frequency phase of the item. A zero-frequency table is provided to reduce errors due to absent items.Type: GrantFiled: July 26, 2007Date of Patent: November 27, 2012Assignee: International Business Machines CorporationInventors: Charu Chandra Aggarwal, Rajesh Bordawekar, Dina Thomas, Philip Shilung Yu
-
Publication number: 20120284228Abstract: A replication track is a designated group of transactions that are to be replicated at a destination database in a way that, with respect to any other transaction in the replication track, preserves transactional dependency. Further, transactions in a replication track can be replicated at the destination database without regard to transactional dependency of other transactions in another track. This facilitates concurrent parallel replication of transactions of different tracks. Replicating data in this manner is referred to herein as track replication. An application may request execution of transactions and designate different tracks for transactions.Type: ApplicationFiled: May 5, 2011Publication date: November 8, 2012Inventors: Sourav Ghosh, Rohan Aranha, Tirthankar Lahiri, Mark McAuliffe, Chih-Ping Wang, Paul Tuck, Nagender Bandi, John E. Miller, Dina Thomas, Marie-Anne Neimat
-
Publication number: 20090031175Abstract: Systems and methods for parallel stream item counting are disclosed. A data stream is partitioned into portions and the portions are assigned to a plurality of processing cores. A sequential kernel is executed at each processing core to compute a local count for items in an assigned portion of the data stream for that processing core. The counts are aggregated for all the processing cores to determine a final count for the items in the data stream. A frequency-aware counting method (FCM) for data streams includes dynamically capturing relative frequency phases of items from a data stream and placing the items in a sketch structure using a plurality of hash functions where a number of hash functions is based on the frequency phase of the item. A zero-frequency table is provided to reduce errors due to absent items.Type: ApplicationFiled: July 26, 2007Publication date: January 29, 2009Inventors: CHARU CHANDRA AGGARWAL, RAJESH BORDAWEKAR, DINA THOMAS, PHILIP SHILUNG YU