Patents by Inventor Prateek Gaur

Prateek Gaur has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Unified structured and semi-structured data types in database systems

Patent number: 12321361

Abstract: The subject technology receives a query, the query referencing a unified representation for structured type data and semi-structured type data, the unified representation being provided in storage and in memory during query processing, the unified representation comprising a set of structured type fields that include a set of semi-structured typed fields that enables type safety and enforcement for the set of structured type fields, and flexibility for the set of semi-structured typed fields in a same column, the unified representation in storage including type information for the semi-structured type data as part of the semi-structured type data, the unified representation being utilized for structured type data and semi-structured type data. The subject technology processes the query using the unified representation stored in the memory, the unified representation providing performance parity between structured type data and semi-structured type data.

Type: Grant

Filed: October 30, 2023

Date of Patent: June 3, 2025

Assignee: Snowflake Inc.

Inventors: Xinzhu Cai, Bowei Chen, Prateek Gaur, Dmitry A. Lychagin, Muthunagappan Muthuraman, Zhuo Peng, Mengran Wang, Jiaqi Yan
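As a rough illustration of the idea in the abstract above, the following sketch keeps typed (structured) fields and one free-form (semi-structured) field in the same row, and stores the type of the semi-structured value alongside the value. The names `VARIANT`, `SCHEMA`, and `encode_row`, and the JSON encoding, are assumptions made for this example and are not taken from the patent.

```python
import json

VARIANT = object()  # hypothetical marker for a semi-structured field

# Hypothetical schema: two structured fields and one semi-structured field.
SCHEMA = {"id": int, "name": str, "attrs": VARIANT}


def encode_row(row, schema=SCHEMA):
    """Validate structured fields and tag semi-structured values with their type."""
    if set(row) != set(schema):
        raise TypeError("row fields do not match schema")
    encoded = {}
    for field, expected in schema.items():
        value = row[field]
        if expected is VARIANT:
            # Type information is stored as part of the semi-structured value.
            encoded[field] = {"type": type(value).__name__, "value": value}
        elif isinstance(value, expected):
            # Structured fields are type-checked before they are stored.
            encoded[field] = value
        else:
            raise TypeError(f"{field}: expected {expected.__name__}")
    return json.dumps(encoded)
```

In this toy version, a row with a non-integer `id` is rejected, while `attrs` accepts any JSON-serializable value.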
Query Execution On Compressed In-Memory Data

Publication number: 20250005079

Abstract: Query execution on compressed in-memory data includes receiving, at a processor of an instance of a distributed in-memory database, a query for data from a table stored in the distributed in-memory database as compressed table data, obtaining results data responsive to the query from the table, and outputting the results data for presentation to a user. Obtaining results data includes allocating memory to identify allocated memory for decompressing the compressed table data, obtaining uncompressed table data by decompressing the compressed table data into the allocated memory, and obtaining the results data from the uncompressed table data. The allocated memory is deallocated in response to obtaining the results data. Compressing a table to form compressed table data is also described.

Type: Application

Filed: September 11, 2024

Publication date: January 2, 2025

Inventors: Satyam Shekhar, Prateek Gaur, Amit Prakash, Abhishek Rai
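The allocate, decompress, query, deallocate sequence in the abstract above can be sketched as follows. This is a minimal single-process illustration, assuming `zlib` compression and a comma-separated row format; the function names `compress_table` and `query_compressed` are hypothetical and the patent does not specify a codec or layout.

```python
import zlib


def compress_table(rows):
    """Serialize rows (lists of strings) and compress them into one blob."""
    raw = "\n".join(",".join(row) for row in rows).encode("utf-8")
    return zlib.compress(raw), len(raw)


def query_compressed(blob, raw_size, predicate):
    """Decompress into a scratch buffer, filter rows, then release the buffer."""
    # Allocate memory sized for the uncompressed table data.
    scratch = bytearray(raw_size)
    try:
        # Decompress the compressed table data into the allocated memory.
        scratch[:] = zlib.decompress(blob)
        rows = [line.split(",") for line in scratch.decode("utf-8").split("\n")]
        # Obtain the results data from the uncompressed table data.
        return [row for row in rows if predicate(row)]
    finally:
        # Drop the reference to the buffer once the results have been obtained.
        del scratch
```

The toy row format does not escape commas or newlines inside values, so it is only suitable for illustration.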
Unified Structured And Semi-Structured Data Types In Database Systems

Publication number: 20240427790

Abstract: The subject technology receives a query, the query referencing a unified representation for structured type data and semi-structured type data, the unified representation being provided in storage and in memory during query processing, the unified representation comprising a set of structured type fields that include a set of semi-structured typed fields that enables type safety and enforcement for the set of structured type fields, and flexibility for the set of semi-structured typed fields in a same column, the unified representation in storage including type information for the semi-structured type data as part of the semi-structured type data, the unified representation being utilized for structured type data and semi-structured type data. The subject technology processes the query using the unified representation stored in the memory, the unified representation providing performance parity between structured type data and semi-structured type data.

Type: Application

Filed: October 30, 2023

Publication date: December 26, 2024

Inventors: Xinzhu Cai, Bowei Chen, Prateek Gaur, Dmitry A. Lychagin, Muthunagappan Muthuraman, Zhuo Peng, Mengran Wang, Jiaqi Yan

Compacted table data files validation

Patent number: 12174819

Abstract: A first replay log is replayed to generate a first replay result. Replaying the first replay log includes replacing, in the first replay result, a first value of a first field included in a first command in the first replay log with a first hash value responsive to a determination that the first field is not utilized as a condition in at least one command included in the first replay log. A second replay log is replayed to generate a second replay result. The first replay result and the second replay result are compared to verify that the first replay log and the second replay log are equivalent.

Type: Grant

Filed: April 17, 2023

Date of Patent: December 24, 2024

Assignee: ThoughtSpot, Inc.

Inventors: Sandeep Gottimukkala, Nitin Motiani, Prateek Gaur
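One way to picture the comparison described in the abstract above is the sketch below. It assumes a toy log format with only `insert` and `delete` commands, treats the column named in a `delete` as a condition column, and uses SHA-256 as the hash; all of these, along with the function names, are assumptions for illustration only.

```python
import hashlib


def condition_columns(log):
    """Columns that any command in the log uses as a condition."""
    return {cmd[1] for cmd in log if cmd[0] == "delete"}


def digest(value):
    """Fixed-length stand-in for a value that is never used in a condition."""
    return hashlib.sha256(repr(value).encode("utf-8")).hexdigest()


def replay(log, keep_plain):
    """Replay insert/delete commands, hashing values outside keep_plain columns."""
    rows = []
    for cmd in log:
        if cmd[0] == "insert":
            row = cmd[1]
            rows.append({c: (v if c in keep_plain else digest(v)) for c, v in row.items()})
        elif cmd[0] == "delete":
            _, column, value = cmd
            rows = [r for r in rows if r[column] != value]
    # Sort so the comparison does not depend on insertion order.
    return sorted(rows, key=lambda r: sorted(r.items()))


def logs_equivalent(original_log, compacted_log):
    """Replay both logs and compare the resulting rows."""
    keep_plain = condition_columns(original_log) | condition_columns(compacted_log)
    return replay(original_log, keep_plain) == replay(compacted_log, keep_plain)
```

Values in condition columns stay in plain form because later commands need to match against them; every other value only has to compare equal, which the digest preserves.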
Aggregation Operations In A Distributed Database

Publication number: 20240354303

Abstract: A distributed database that includes multiple database instances receives a data-query that includes an aggregation clause on a first column of a table. The table is partitioned into shards according to a sharding criterion based on the first column such that all rows having the same value for the first column are included in the same shard. The shards are distributed to the multiple database instances. Respective intermediate results are received from at least some of the database instances. Each intermediate result received from a respective database instance that includes a respective shard aggregates values of the first column in the respective shard. The respective intermediate results are combined to obtain a final result of the data-query. The final result is then output.

Type: Application

Filed: June 26, 2024

Publication date: October 24, 2024

Inventors: Ashok Anand, Ambareesh Sreekumaran Nair Jayakumari, Prateek Gaur, Donko Donjerkovic
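The sharding criterion in the abstract above, where all rows with the same value of the aggregated column land in the same shard, can be sketched as follows. The example assumes a distinct-count aggregation and a CRC32-based shard assignment; both choices and all function names are illustrative assumptions rather than details from the patent.

```python
import zlib


def shard_of(value, num_shards):
    """Deterministic shard assignment: equal values always map to the same shard."""
    return zlib.crc32(repr(value).encode("utf-8")) % num_shards


def partition(rows, column, num_shards):
    """Split rows into shards based on the value of the given column."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[shard_of(row[column], num_shards)].append(row)
    return shards


def local_distinct_count(shard, column):
    """Intermediate result computed by one database instance."""
    return len({row[column] for row in shard})


def distinct_count(rows, column, num_shards=3):
    """Combine the per-shard intermediate results into the final result."""
    shards = partition(rows, column, num_shards)
    intermediates = [local_distinct_count(s, column) for s in shards]
    # No value appears in two shards, so the counts can simply be added.
    return sum(intermediates)
```

Because the per-shard results never overlap, the coordinator only needs one small number from each instance instead of the full set of distinct values.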
Query execution on compressed in-memory data

Patent number: 12118041

Abstract: Query execution on compressed in-memory data includes receiving, at a processor of an instance of a distributed in-memory database, a query for data from a table stored in the distributed in-memory database as compressed table data, obtaining results data responsive to the query from the table, and outputting the results data for presentation to a user. Obtaining results data includes allocating memory to identify allocated memory for decompressing the compressed table data, obtaining uncompressed table data by decompressing the compressed table data into the allocated memory, and obtaining the results data from the uncompressed table data. The allocated memory is deallocated in response to obtaining the results data. Compressing a table to form compressed table data is also described.

Type: Grant

Filed: October 13, 2020

Date of Patent: October 15, 2024

Assignee: ThoughtSpot, Inc.

Inventors: Satyam Shekhar, Prateek Gaur, Amit Prakash, Abhishek Rai

Aggregation operations in a distributed database

Patent number: 12038929

Abstract: Query planning in a distributed database that includes a table partitioned into shards according to a sharding criterion and distributed to database instances includes receiving a data-query. The data-query includes a “distinct count” clause on a first column and a “group by” clause on at least a second column. A query plan is formulated to include respective instructions for converting, at at least some of the database instances, distinct values of the first column grouped by values of the second column into a count of the distinct values grouped by the values of the second column to obtain respective intermediate results; instructions for receiving the respective intermediate results from at least a subset of the at least some of the database instances; and instructions for concatenating the respective intermediate results using a summing operation to obtain the first “distinct count” of the first column grouped by the second column.

Type: Grant

Filed: June 13, 2023

Date of Patent: July 16, 2024

Assignee: ThoughtSpot, Inc.

Inventors: Ashok Anand, Ambareesh Sreekumaran Nair Jayakumari, Prateek Gaur, Donko Donjerkovic
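The query plan in the abstract above has a per-instance step and a combining step, which the sketch below mirrors. It assumes the table is sharded so that each value of the distinct-count column lives in exactly one shard; without that assumption, summing per-shard counts would overcount. The function names and the dictionary-based rows are hypothetical.

```python
from collections import defaultdict


def local_plan(shard, distinct_col, group_col):
    """Per-instance step: turn distinct values per group into a count per group."""
    seen = defaultdict(set)
    for row in shard:
        seen[row[group_col]].add(row[distinct_col])
    return {group: len(values) for group, values in seen.items()}


def combine(intermediates):
    """Coordinator step: merge per-instance counts by summing within each group."""
    totals = defaultdict(int)
    for partial in intermediates:
        for group, count in partial.items():
            totals[group] += count
    return dict(totals)
```

For example, with users `u1` and `u2` in one shard and `u3` in another:

```python
shard_a = [
    {"user": "u1", "region": "east"},
    {"user": "u1", "region": "west"},
    {"user": "u2", "region": "east"},
]
shard_b = [
    {"user": "u3", "region": "east"},
    {"user": "u3", "region": "east"},
]
partials = [local_plan(s, "user", "region") for s in (shard_a, shard_b)]
print(combine(partials))  # {'east': 3, 'west': 1}
```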
Distributed pseudo-random subset generation

Patent number: 11836136

Abstract: Distributed pseudo-random subset generation includes obtaining a data-query indicating a first table having a first column including unique values, a second table having a second column including unique values, a join clause joining the first table and the second table on the first column and the second column, and a limit value, pseudo-random filtering the first table to obtain left intermediate data and left filtering criteria, pseudo-random filtering the second table to obtain right intermediate data and right filtering criteria, obtaining intermediate results data by full outer joining the left intermediate data and the right intermediate data, obtaining results data by filtering the intermediate results data using most-restrictive filtering criteria among the left filtering criteria and the right filtering criteria, and outputting the results data, wherein outputting the results data includes limiting the cardinality of rows of the results data to be at most the limit value.

Type: Grant

Filed: December 6, 2022

Date of Patent: December 5, 2023

Assignee: ThoughtSpot, Inc.

Inventors: Donko Donjerkovic, Prateek Gaur, Eric Musser
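The abstract above does not say how the pseudo-random filtering is performed. The sketch below is one possible reading, assuming each table is a dictionary keyed by its unique join column, that rows are ranked by a hash of the key, and that the "filtering criteria" is the largest hash kept on each side. All names, the choice of SHA-256, and the requirement that `limit` be at least 1 are assumptions for this example.

```python
import hashlib


def key_hash(key):
    """Map a key to a repeatable pseudo-random integer."""
    return int.from_bytes(hashlib.sha256(repr(key).encode("utf-8")).digest()[:8], "big")


def pseudo_random_filter(table, limit):
    """Keep the `limit` rows with the smallest key hashes (limit >= 1).

    Returns the kept rows and the filtering criterion (largest hash kept),
    or None as the criterion when the whole table was kept.
    """
    ranked = sorted(table.items(), key=lambda kv: key_hash(kv[0]))
    kept = dict(ranked[:limit])
    threshold = key_hash(ranked[limit - 1][0]) if len(ranked) > limit else None
    return kept, threshold


def sample_full_outer_join(left, right, limit):
    """Return at most `limit` rows of the full outer join of left and right."""
    left_rows, left_max = pseudo_random_filter(left, limit)
    right_rows, right_max = pseudo_random_filter(right, limit)
    # Full outer join of the two intermediate results on the unique key.
    keys = left_rows.keys() | right_rows.keys()
    joined = {k: (left_rows.get(k), right_rows.get(k)) for k in keys}
    # Most restrictive criterion: the smaller of the two thresholds.
    thresholds = [t for t in (left_max, right_max) if t is not None]
    if thresholds:
        cutoff = min(thresholds)
        joined = {k: v for k, v in joined.items() if key_hash(k) <= cutoff}
    # Enforce the limit on the number of output rows.
    ordered = sorted(joined.items(), key=lambda kv: key_hash(kv[0]))
    return ordered[:limit]
```

Under these assumptions, filtering by the smaller threshold is what keeps the output consistent with a join over the full tables: any key at or below that threshold was retained on every side where it exists, so no joined row is missing a match that the full table would have supplied.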
Aggregation Operations In A Distributed Database

Publication number: 20230325388

Abstract: Query planning in a distributed database that includes a table partitioned into shards according to a sharding criterion and distributed to database instances includes receiving a data-query. The data-query includes a “distinct count” clause on a first column and a “group by” clause on at least a second column. A query plan is formulated to include respective instructions for converting, at at least some of the database instances, distinct values of the first column grouped by values of the second column into a count of the distinct values grouped by the values of the second column to obtain respective intermediate results; instructions for receiving the respective intermediate results from at least a subset of the at least some of the database instances; and instructions for concatenating the respective intermediate results using a summing operation to obtain the first “distinct count” of the first column grouped by the second column.

Type: Application

Filed: June 13, 2023

Publication date: October 12, 2023

Inventors: Ashok Anand, Ambareesh Sreekumaran Nair Jayakumari, Prateek Gaur, Donko Donjerkovic

Compacted Table Data Files Validation

Publication number: 20230252016

Abstract: A first replay log is replayed to generate a first replay result. Replaying the first replay log includes replacing, in the first replay result, a first value of a first field included in a first command in the first replay log with a first hash value responsive to a determination that the first field is not utilized as a condition in at least one command included in the first replay log. A second replay log is replayed to generate a second replay result. The first replay result and the second replay result are compared to verify that the first replay log and the second replay log are equivalent.

Type: Application

Filed: April 17, 2023

Publication date: August 10, 2023

Inventors: Sandeep Gottimukkala, Nitin Motiani, Prateek Gaur

Aggregation operations in a distributed database

Patent number: 11720570

Abstract: Querying a distributed database including a table sharded into shards distributed to database instances includes receiving a data-query that includes an aggregation clause on a first column and a grouping clause on a second column; obtaining and outputting results data. Obtaining the results data includes receiving, by a query coordinator, intermediate results data; and combining, by the query coordinator, the intermediate results to obtain the results data.

Type: Grant

Filed: March 26, 2021

Date of Patent: August 8, 2023

Assignee: ThoughtSpot, Inc.

Inventors: Ashok Anand, Ambareesh Sreekumaran Nair Jayakumari, Prateek Gaur, Donko Donjerkovic

Compacted table data files validation

Patent number: 11657032

Abstract: Database replay log compaction verification includes identifying at least one replay log of a table that includes first database manipulation commands; obtaining a compacted replay log that includes second database manipulation commands that are insert commands, where an insert command includes a column and a corresponding value; replaying, to obtain a first replay result, the first database manipulation commands; replaying, to obtain a second replay result, the second database manipulation commands; and, responsive to one row of the first replay result not matching a corresponding row of the second replay result, sending a notification including a non-match. Replaying the first database manipulation commands includes identifying condition columns of the table; responsive to the condition columns not including the column, obtaining a row corresponding to the insert command, where the row includes a modified value of the corresponding value of the column; and adding the row to the first replay result.

Type: Grant

Filed: July 30, 2021

Date of Patent: May 23, 2023

Assignee: ThoughtSpot, Inc.

Inventors: Sandeep Gottimukkala, Nitin Motiani, Prateek Gaur

Distributed Pseudo-Random Subset Generation

Publication number: 20230117794

Abstract: Distributed pseudo-random subset generation includes obtaining a data-query indicating a first table having a first column including unique values, a second table having a second column including unique values, a join clause joining the first table and the second table on the first column and the second column, and a limit value, pseudo-random filtering the first table to obtain left intermediate data and left filtering criteria, pseudo-random filtering the second table to obtain right intermediate data and right filtering criteria, obtaining intermediate results data by full outer joining the left intermediate data and the right intermediate data, obtaining results data by filtering the intermediate results data using most-restrictive filtering criteria among the left filtering criteria and the right filtering criteria, and outputting the results data, wherein outputting the results data includes limiting the cardinality of rows of the results data to be at most the limit value.

Type: Application

Filed: December 6, 2022

Publication date: April 20, 2023

Inventors: Donko Donjerkovic, Prateek Gaur, Eric Musser

State-Sequence Pathing

Publication number: 20230083123

Abstract: State-sequence pathing in a low-latency data access and analysis system includes obtaining, by the low-latency data access and analysis system, predicate data responsive to a request for data expressed in previously obtained data expressing usage intent, obtaining, by the low-latency data access and analysis system, state-sequence pathing criteria identified with respect to the predicate data, obtaining, by the low-latency data access and analysis system, state-sequence path data in accordance with the predicate data and the state-sequence pathing criteria, wherein the state-sequence path data aggregates data representing multiple state-sequence paths, wherein a respective state-sequence path represents an ordered sequence of states of a system, wherein the states are represented individually by the predicate data, generating, by the low-latency data access and analysis system, state-sequence path visualization data for presenting a visualization of the state-sequence path data, and outputting, by the low-lat

Type: Application

Filed: September 6, 2022

Publication date: March 16, 2023

Inventors: Ashok Anand, Tushar Marda, Bhanu Prakash, Sreenivas Kandhade, Sandeep Gottimukkala, Jibin Thomas, Prateek Gaur, Amit Prakash

Distributed pseudo-random subset generation

Patent number: 11580111

Abstract: Distributed pseudo-random subset generation includes obtaining a data-query indicating a first table having a first column including unique values, a second table having a second column including unique values, a join clause joining the first table and the second table on the first column and the second column, and a limit value, pseudo-random filtering the first table to obtain left intermediate data and left filtering criteria, pseudo-random filtering the second table to obtain right intermediate data and right filtering criteria, obtaining intermediate results data by full outer joining the left intermediate data and the right intermediate data, obtaining results data by filtering the intermediate results data using most-restrictive filtering criteria among the left filtering criteria and the right filtering criteria, and outputting the results data, wherein outputting the results data includes limiting the cardinality of rows of the results data to be at most the limit value.

Type: Grant

Filed: April 6, 2021

Date of Patent: February 14, 2023

Assignee: ThoughtSpot, Inc.

Inventors: Donko Donjerkovic, Prateek Gaur, Eric Musser

Compacted Table Data Files Validation

Publication number: 20230035166

Abstract: Database replay log compaction verification includes identifying at least one replay log of a table that includes first database manipulation commands; obtaining a compacted replay log that includes second database manipulation commands that are insert commands, where an insert command includes a column and a corresponding value; replaying, to obtain a first replay result, the first database manipulation commands; replaying, to obtain a second replay result, the second database manipulation commands; and, responsive to one row of the first replay result not matching a corresponding row of the second replay result, sending a notification including a non-match. Replaying the first database manipulation commands includes identifying condition columns of the table; responsive to the condition columns not including the column, obtaining a row corresponding to the insert command, where the row includes a modified value of the corresponding value of the column; and adding the row to the first replay result.

Type: Application

Filed: July 30, 2021

Publication date: February 2, 2023

Inventors: Sandeep Gottimukkala, Nitin Motiani, Prateek Gaur

Distributed Pseudo-Random Subset Generation

Publication number: 20220318243

Abstract: Distributed pseudo-random subset generation includes obtaining a data-query indicating a first table having a first column including unique values, a second table having a second column including unique values, a join clause joining the first table and the second table on the first column and the second column, and a limit value, pseudo-random filtering the first table to obtain left intermediate data and left filtering criteria, pseudo-random filtering the second table to obtain right intermediate data and right filtering criteria, obtaining intermediate results data by full outer joining the left intermediate data and the right intermediate data, obtaining results data by filtering the intermediate results data using most-restrictive filtering criteria among the left filtering criteria and the right filtering criteria, and outputting the results data, wherein outputting the results data includes limiting the cardinality of rows of the results data to be at most the limit value.

Type: Application

Filed: April 6, 2021

Publication date: October 6, 2022

Inventors: Donko Donjerkovic, Prateek Gaur, Eric Musser

Aggregation Operations In A Distributed Database

Publication number: 20220309067

Abstract: Querying a distributed database including a table sharded into shards distributed to database instances includes receiving a data-query that includes an aggregation clause on a first column and a grouping clause on a second column; obtaining and outputting results data. Obtaining the results data includes receiving, by a query coordinator, intermediate results data; and combining, by the query coordinator, the intermediate results to obtain the results data.

Type: Application

Filed: March 26, 2021

Publication date: September 29, 2022

Inventors: Ashok Anand, Ambareesh Sreekumaran Nair Jayakumari, Prateek Gaur, Donko Donjerkovic

Machine language query management for low-latency database analysis system

Patent number: 11429607

Abstract: Data-query execution with distributed machine-language query management in a low-latency database analysis system may include obtaining, at a distributed in-memory database, a data-query expressing a request for data in a defined structured query language associated with the distributed in-memory database, automatically generating a high-level language query representing at least a portion of the data-query, obtaining a machine language query corresponding to the high-level language query, executing the machine language query to obtain results data, and outputting the results data. Obtaining the machine language query may include determining whether the machine language query is cached, and in response to a determination that the machine language query is unavailable, sending a request for the machine language query to a distributed machine-language-query management instance.

Type: Grant

Filed: September 18, 2020

Date of Patent: August 30, 2022

Assignee: ThoughtSpot, Inc.

Inventors: Ashok Anand, Satyam Shekhar, Prateek Gaur, Amit Prakash
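The cache-then-request flow in the abstract above can be pictured with the sketch below. It is only an analogy: Python source text stands in for the generated high-level language query, Python bytecode produced by the built-in `compile` stands in for the machine language query, and a local function stands in for the distributed machine-language-query management instance. The class and function names are hypothetical.

```python
import hashlib


class QueryCodeCache:
    """Local cache of compiled queries, keyed by a hash of the generated source."""

    def __init__(self, management_instance):
        self._compiled = {}
        self._management_instance = management_instance
        self.requests_sent = 0

    def get(self, high_level_source):
        key = hashlib.sha256(high_level_source.encode("utf-8")).hexdigest()
        if key not in self._compiled:
            # Cache miss: send a request to the (stand-in) management instance.
            self.requests_sent += 1
            self._compiled[key] = self._management_instance(high_level_source)
        return self._compiled[key]


def fake_management_instance(source):
    """Stand-in compiler: Python bytecode plays the role of machine code."""
    return compile(source, "<generated-query>", "eval")


def run_query(cache, high_level_source, rows):
    """Obtain the compiled query (cached or freshly requested) and execute it."""
    code = cache.get(high_level_source)
    return eval(code, {"rows": rows})
```

Running the same generated query twice sends only one request to the stand-in management instance; the second execution is served from the cache.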
Query Execution On Compressed In-Memory Data

Publication number: 20210109974

Abstract: Query execution on compressed in-memory data includes receiving, at a processor of an instance of a distributed in-memory database, a query for data from a table stored in the distributed in-memory database as compressed table data, obtaining results data responsive to the query from the table, and outputting the results data for presentation to a user. Obtaining results data includes allocating memory to identify allocated memory for decompressing the compressed table data, obtaining uncompressed table data by decompressing the compressed table data into the allocated memory, and obtaining the results data from the uncompressed table data. The allocated memory is deallocated in response to obtaining the results data. Compressing a table to form compressed table data is also described.

Type: Application

Filed: October 13, 2020

Publication date: April 15, 2021

Inventors: Satyam Shekhar, Prateek Gaur, Amit Prakash, Abhishek Rai
