Patents by Inventor Shasank K Chavan
Shasank K Chavan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10726016Abstract: Techniques are described herein for sharing a dictionary across multiple in-memory compression units (IMCUs). After a dictionary is used to encode a first column vector in a first IMCU, the same dictionary is used to encode a second column vector in a second IMCU. The entries in the dictionary are in sort order to facilitate binary searching when performing value-to-code look-ups. If, during the encoding of the second column vector, values are encountered for which the dictionary does not already have codes, then a “sort-order-boundary” is established after the last entry in the dictionary, and entries for the newly encountered values are added to the dictionary, after the sort-order-boundary. To facilitate value-to-code look-ups, the new entries are also sorted relative to each other, creating a second “sort order set”. A new version of the dictionary may be created when the number of sort order sets in the first version of the dictionary reaches a configurable threshold.Type: GrantFiled: October 14, 2016Date of Patent: July 28, 2020Assignee: Oracle International CorporationInventors: Shasank K. Chavan, Prashant Gaharwar, Ajit Mylavarapu, Dina Thomas, Dennis Lui, Sheldon A. K. Lewis, Roger D. Macnicol
-
Patent number: 10678788Abstract: Techniques are provided for storing in in-memory unit (IMU) in a lower-storage tier and copying the IMU to DRAM when needed for query processing. Techniques are also provided for copying IMUs to lower tiers of storage when evicted from the cache of higher tiers of storage. Techniques are provided for implementing functionality of IMUs within a storage system, to enable database servers to push tasks, such as filtering, to the storage system where the storage system may access IMUs within its own memory to perform the tasks. Metadata associated with a set of data may be used to indicate whether an IMU for the data should be created by the database server machine or within the storage system.Type: GrantFiled: October 21, 2016Date of Patent: June 9, 2020Assignee: Oracle International CorporationInventors: Roger D. Macnicol, Viral Shah, Xia Hua, Jesse Kamp, Shasank K. Chavan, Maria Colgan, Tirthankar Lahiri, Adrian Tsz Him Ng, Krishnan Meiyyappan, Amit Ganesh, Juan R. Loaiza, Kothanda Umamageswaran, Yiran Qin
-
Patent number: 10678791Abstract: Techniques are described for encoding join columns that belong to the same domain with a common dictionary. The tables are encoded with dictionary indexes that make the comparison operation of a join query a quick equality check of two integers and there is no need to compute any hashes during execution. Additionally, the techniques described herein minimize the bloom filter creation and evaluation cost as well because the dictionary indexes serve as hash values into the bloom filter. If the bloom filter is as large as the range of dictionary indexes, then the filter is no longer a probabilistic structure and can be used to filter rows in the probe phase with full certainty without any significant overhead.Type: GrantFiled: May 22, 2017Date of Patent: June 9, 2020Assignee: Oracle International CorporationInventors: Shasank K. Chavan, Dina Thomas, Ajit Mylavarapu, Prashant Gaharwar, Tirthankar Lahiri, Jesse Kamp
-
Publication number: 20200004736Abstract: A “hybrid derived cache” stores semi-structured data or unstructured text data in an in-memory mirrored form and columns in another form, such as column-major format. The hybrid derived cache may cache scalar type columns in column-major format. The structure of the in-memory mirrored form of semi-structured data or unstructured text data enables and/or enhances access to perform path-based and/or text based query operations. A hybrid derived cache improves cache containment for executing query operations. The in-memory mirrored form is used to compute queries in a transactionally consistent manner through the use of an invalid vector that used to determine when to retrieve the transactionally consistent persistent form of semi-structured data or unstructured text data in lieu of the in-memory form.Type: ApplicationFiled: June 28, 2018Publication date: January 2, 2020Inventors: ZHEN HUA LIU, AUROSISH MISHRA, SHASANK K. CHAVAN, DOUGLAS J. MCMAHON, VIKAS ARORA, HUI JOE CHANG, SHUBHRO JYOTI ROY
-
Patent number: 10372706Abstract: Techniques are described for maintaining an expression statistics store that stores and updates metadata values for query expressions based on the occurrence of those query expressions within queries. In an embodiment, a database server instance receives a database query. In response, the database server instance identifies expressions within the database queries. The database server instance then determines whether an expression statistics store includes an entry for the particular expression. Responsive to determining that the expression statistics store includes an entry for the particular expression, the database server instance updates at least one metadata value in the entry based on the occurrence of the particular expression. Responsive to determining that the expression statistics store does not include an entry for the particular expression, the database server instance adds an entry for the particular expression.Type: GrantFiled: May 4, 2016Date of Patent: August 6, 2019Assignee: Oracle International CorporationInventors: Shasank K. Chavan, Aurosish Mishra, Mohamed Zait, Sunil P. Chakkappen, Can Tuzla, Jiaqi Yan
-
Patent number: 10366083Abstract: Techniques are described for materializing computations in memory. In an embodiment, responsive to a database server instance receiving a query, the database server instance identifies a set of computations for evaluation during execution of the query. Responsive to identifying the set of computations, the database server instance evaluates at least one computation in the set of computations to obtain a first set of computation results for a first computation in the set of computations. After evaluating the at least one computation, the database server instance stores, within an in-memory unit, the first set of computation results. The database server also stores mapping data that maps a set of metadata values associated with the first computation to the first set of computation results.Type: GrantFiled: May 4, 2016Date of Patent: July 30, 2019Assignee: Oracle International CorporationInventors: Shasank K. Chavan, Aurosish Mishra, Amit Ganesh
-
Publication number: 20190220461Abstract: Techniques are described for materializing computations in memory. In an embodiment, responsive to a database server instance receiving a query, the database server instance identifies a set of computations for evaluation during execution of the query. Responsive to identifying the set of computations, the database server instance evaluates at least one computation in the set of computations to obtain a first set of computation results for a first computation in the set of computations. After evaluating the at least one computation, the database server instance stores, within an in-memory unit, the first set of computation results. The database server also stores mapping data that maps a set of metadata values associated with the first computation to the first set of computation results.Type: ApplicationFiled: March 26, 2019Publication date: July 18, 2019Inventors: Shasank K. Chavan, Aurosish Mishra, Amit Ganesh
-
Publication number: 20190102391Abstract: Techniques related to cache storage formats are disclosed. In some embodiments, a set of values is stored in a cache as a set of first representations and a set of second representations. For example, the set of first representations may be a set of hardware-level representations, and the set of second representations may be a set of non-hardware-level representations. Responsive to receiving a query to be executed over the set of values, a determination is made as to whether or not it would be more efficient to execute the query over the set of first representations than to execute the query over the set of second representations. If the determination indicates that it would be more efficient to execute the query over the set of first representations than to execute the query over the set of second representations, the query is executed over the set of first representations.Type: ApplicationFiled: April 2, 2018Publication date: April 4, 2019Inventors: Aurosish Mishra, Shasank K. Chavan, Vinita Subramanian, Ekrem S.C. Soylemez, Adam Kociubes, Eugene Karichkin, Garret F. Swart
-
Publication number: 20190102429Abstract: Techniques related to prefix compression are disclosed. Each value in a set of values has a prefix component and a suffix component. The set of values is divided into at least a first plurality of values and a second plurality of values. Compression of each plurality of values involves determining a prefix component that is shared by each value of the plurality of values. Compression further involves determining a plurality of suffix components belonging to the plurality of values. The prefix component is stored separately from the plurality of suffix components but contiguously with a suffix component of the plurality of suffix components. This enables the prefix component and the suffix component to be read as a value of the plurality of values.Type: ApplicationFiled: June 28, 2018Publication date: April 4, 2019Inventors: Shasank K. Chavan, Xia Hua, Cole Jenkins, Sangho Lee
-
Publication number: 20190095346Abstract: Techniques related to automatic cache management are disclosed. In some embodiments, one or more non-transitory storage media store instructions which, when executed by one or more computing devices, cause performance of an automatic cache management method when a determination is made to store a first set of data in a cache. The method involves determining whether an amount of available space in the cache is less than a predetermined threshold. When the amount of available space in the cache is less than the predetermined threshold, a determination is made as to whether a second set of data has a lower ranking than the first set of data by at least a predetermined amount. When the second set of data has a lower ranking than the first set of data by at least the predetermined amount, the second set of data is evicted. Thereafter, the first set of data is cached.Type: ApplicationFiled: September 27, 2018Publication date: March 28, 2019Inventors: Hariharan Lakshmanan, Dhruvil Shah, Prashant Gaharwar, Shasank K. Chavan, Tirthankar Lahiri, Saraswathy Narayan
-
Patent number: 10229089Abstract: A method and apparatus for efficiently processing data in various formats in a single instruction multiple data (“SIMD”) architecture is presented. Specifically, a method to unpack a fixed-width bit values in a bit stream to a fixed width byte stream in a SIMD architecture is presented. A method to unpack variable-length byte packed values in a byte stream in a SIMD architecture is presented. A method to decompress a run length encoded compressed bit-vector in a SIMD architecture is presented. A method to return the offset of each bit set to one in a bit-vector in a SIMD architecture is presented. A method to fetch bits from a bit-vector at specified offsets relative to a base in a SIMD architecture is presented. A method to compare values stored in two SIMD registers is presented.Type: GrantFiled: June 30, 2017Date of Patent: March 12, 2019Assignee: Oracle International CorporationInventors: Amit Ganesh, Shasank K. Chavan, Vineet Marwah, Jesse Kamp, Anindya C. Patthak, Michael J. Gleeson, Allison L. Holloway, Roger Macnicol
-
Patent number: 10204135Abstract: Techniques are described for materializing pre-computed results of expressions. In an embodiment, a set of one or more column units are stored in volatile or non-volatile memory. Each column unit corresponds to a column that belongs to an on-disk table within a database managed by a database server instance and includes data items from the corresponding column. A set of one or more virtual column units, and data that associates the set of one or more column units with the set of one or more virtual column units, are also stored in memory. The set of one or more virtual column units includes a particular virtual column unit storing results that are derived by evaluating an expression on at least one column of the on-disk table.Type: GrantFiled: May 4, 2016Date of Patent: February 12, 2019Assignee: Oracle International CorporationInventors: Aurosish Mishra, Shasank K. Chavan, Allison L. Holloway, Jesse Kamp, Ramesh Kumar, Zhen Hua Liu, Niloy Mukherjee, Amit Ganesh, Tirthankar Lahiri, Vineet Marwah
-
Patent number: 10120895Abstract: Techniques are provided for maintaining data persistently in one format, but making that data available to a database server in more than one format. For example, one of the formats in which the data is made available for query processing is based on the on-disk format, while another of the formats in which the data is made available for query processing is independent of the on-disk format. Data that is in the format that is independent of the disk format may be maintained exclusively in volatile memory to reduce the overhead associated with keeping the data in sync with the on-disk format copies of the data.Type: GrantFiled: January 22, 2018Date of Patent: November 6, 2018Assignee: ORACLE INTERNATIONAL CORPORATIONInventors: Jesse Kamp, Amit Ganesh, Vineet Marwah, Vivekanandhan Raja, Tirthankar Lahiri, Allison L. Holloway, Sanket Hase, Shasank K. Chavan, Niloy Mukherjee, Teck Hua Lee, Michael J. Gleeson, Krishna Kunchithapadam
-
Publication number: 20180144023Abstract: Techniques are provided for maintaining data persistently in one format, but making that data available to a database server in more than one format. For example, one of the formats in which the data is made available for query processing is based on the on-disk format, while another of the formats in which the data is made available for query processing is independent of the on-disk format. Data that is in the format that is independent of the disk format may be maintained exclusively in volatile memory to reduce the overhead associated with keeping the data in sync with the on-disk format copies of the data.Type: ApplicationFiled: January 22, 2018Publication date: May 24, 2018Inventors: Jesse Kamp, Amit Ganesh, Vineet Marwah, Vivekanandhan Raja, Tirthankar Lahiri, Allison L. Holloway, Sanket Hase, Shasank K. Chavan, Niloy Mukherjee, Teck Hua Lee, Michael J. Gleeson, Krishna Kunchithapadam
-
Patent number: 9965501Abstract: Techniques are provided for more efficiently using the bandwidth of the I/O path between a CPU and volatile memory during the performance of database operation. Relational data from a relational table is stored in volatile memory as column vectors, where each column vector contains values for a particular column of the table. A binary-comparable format may be used to represent each value within a column vector, regardless of the data type associated with the column. The column vectors may be compressed and/or encoded while in volatile memory, and decompressed/decoded on-the-fly within the CPU. Alternatively, the CPU may be designed to perform operations directly on the compressed and/or encoded column vector data. In addition, techniques are described that enable the CPU to perform vector processing operations on the column vector values.Type: GrantFiled: December 1, 2015Date of Patent: May 8, 2018Assignee: ORACLE INTERNATIONAL CORPORATIONInventors: Lawrence J. Ellison, Amit Ganesh, Vineet Marwah, Jesse Kamp, Anindya C. Patthak, Shasank K. Chavan, Michael J. Gleeson, Allison L. Holloway, Manosiz Bhattacharyya
-
Publication number: 20180075105Abstract: Techniques related to efficient evaluation of queries with multiple predicate expressions are disclosed. A first predicate expression (PE) is evaluated against a plurality of rows in a first column vector (CV) to determine that a subset of rows does not satisfy the first PE. The subset comprises less than all of the plurality of rows. When a query specifies the first PE in conjunction with a second PE, a selectivity of the first PE is determined. If the selectivity meets a threshold, the second PE is evaluated against all of the plurality of rows in a second CV. If the selectivity does not meet the threshold, the second PE is evaluated against only the subset of rows in the second CV. When a query specifies the first PE in disjunction with the second PE, the second PE may be evaluated against only the subset of rows in the second CV.Type: ApplicationFiled: September 12, 2017Publication date: March 15, 2018Inventors: Shasank K. Chavan, Dennis Lui, Allison L Holloway, Sheldon A.K. Lewis
-
Publication number: 20180075096Abstract: Techniques related to efficient evaluation of query expressions including grouping clauses are disclosed. Computing device(s) perform a method for aggregating a measure column vector (MCV) according to a plurality of grouping column vectors (GCVs). The MCV and each of the plurality of GCVs may be encoded. The method includes determining a plurality of actual grouping keys (AGKs) and generating a dense identifier (DI) mapping that maps the plurality of AGKs to a plurality of DIs. Each AGK occurs in the plurality of GCVs. Each DI corresponds to a respective workspace. Aggregating the MCV involves aggregating, in each workspace, one or more codes of the MCV that correspond to an AGK mapped to a DI corresponding to the workspace. For a first row of the MCV and the plurality of GCVs, aggregating the one or more codes includes generating a particular grouping key based on codes in the plurality of GCVs.Type: ApplicationFiled: September 12, 2017Publication date: March 15, 2018Inventors: Shasank K. Chavan, Dennis Lui, Allison L. Holloway, Sheldon A.K. Lewis
-
Publication number: 20180075113Abstract: Techniques related to efficient evaluation of aggregate functions are disclosed. Computing device(s) may perform a method for aggregating results of performing a multiplication on a first column and a second column of a database table. A first vector stores a subset of values of the first column. A second vector stores a corresponding subset of values of the second column. When it is determined that the first vector has a lower cardinality than the second vector, a third vector stores at least a first distinct value and a second distinct value of the first vector. A first set of one or more values of the second vector is determined, wherein each value of the first set of one or more values corresponds to the first distinct value in the first vector. A first multiplicand is generated based on performing a summation over the first set of one or more values.Type: ApplicationFiled: September 12, 2017Publication date: March 15, 2018Inventors: Shasank K. Chavan, Dennis Lui, Allison L Holloway, Sheldon A.K. Lewis
-
Patent number: 9886459Abstract: Methods and apparatuses for determining set-membership using Single Instruction Multiple Data (“SIMD”) architecture are presented herein. Specifically, methods and apparatuses are discussed for determining, in parallel, whether multiple values in a first set of values are members of a second set of values. Many of the methods and systems discussed herein are applied to determining whether one or more rows in a dictionary-encoded column of a database table satisfy one or more conditions based on the dictionary-encoded column. However, the methods and systems discussed herein may apply to many applications executed on a SIMD processor using set-membership tests.Type: GrantFiled: July 22, 2014Date of Patent: February 6, 2018Assignee: Oracle International CorporationInventors: Shasank K. Chavan, Phumpong Watanaprakornkul
-
Patent number: 9881048Abstract: Techniques are provided for maintaining data persistently in one format, but making that data available to a database server in more than one format. For example, one of the formats in which the data is made available for query processing is based on the on-disk format, while another of the formats in which the data is made available for query processing is independent of the on-disk format. Data that is in the format that is independent of the disk format may be maintained exclusively in volatile memory to reduce the overhead associated with keeping the data in sync with the on-disk format copies of the data.Type: GrantFiled: October 13, 2015Date of Patent: January 30, 2018Assignee: Oracle International CorporationInventors: Jesse Kamp, Amit Ganesh, Vineet Marwah, Vivekanandhan Raja, Tirthankar Lahiri, Allison L. Holloway, Sanket Hase, Shasank K. Chavan, Niloy Mukherjee, Teck Hua Lee, Michael J. Gleeson, Krishna Kunchithapadam