Patents by Inventor Marcin Zukowski

Marcin Zukowski has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210200769
    Abstract: Systems, methods, and devices, for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.
    Type: Application
    Filed: March 12, 2021
    Publication date: July 1, 2021
    Inventors: Florian Andreas Funke, Thierry Cruanes, Benoit Dageville, Marcin Zukowski
  • Patent number: 11048721
    Abstract: A method, apparatus, and system for join operations of a plurality of relations that are distributed over a plurality of storage locations over a network of computing components.
    Type: Grant
    Filed: October 26, 2020
    Date of Patent: June 29, 2021
    Assignee: SNOWFLAKE INC.
    Inventors: Benoit Dageville, Thierry Cruanes, Marcin Zukowski, Allison Waingold Lee, Philipp Thomas Unterbrunner
  • Publication number: 20210182241
    Abstract: A query directed to database data stored across a set of files is received. The query includes predicates and each file from the set of files is associated with metadata stored in a metadata store that is separate from a storage platform that stores the set of files. One or more files are removed from the set of files whose metadata does not satisfy a predicate of the plurality of predicates to generate a pruned set of files. One or more predicates are removed that are satisfied by the metadata of the pruned set of files to generate a modified query.
    Type: Application
    Filed: February 26, 2021
    Publication date: June 17, 2021
    Inventors: Benoit Dageville, Thierry Cruanes, Marcin Zukowski, Allison Waingold Lee, Philipp Thomas Unterbrunner
  • Patent number: 11036758
    Abstract: A method, apparatus, and system for join operations of a plurality of relations that are distributed over a plurality of storage locations over a network of computing components.
    Type: Grant
    Filed: October 30, 2020
    Date of Patent: June 15, 2021
    Assignee: Snowflake Inc.
    Inventors: Benoit Dageville, Thierry Cruanes, Marcin Zukowski, Allison Waingold Lee, Philipp Thomas Unterbrunner
  • Publication number: 20210157820
    Abstract: A method, apparatus, and system for join operations of a plurality of relations that are distributed over a plurality of storage locations over a network of computing components.
    Type: Application
    Filed: January 5, 2021
    Publication date: May 27, 2021
    Inventors: Benoit Dageville, Thierry Cruanes, Marcin Zukowski, Allison Waingold Lee, Philipp Thomas Unterbrunner
  • Patent number: 11010407
    Abstract: A method and apparatus managing a set of processors for a set of queries is described. In an exemplary embodiment, a device receives a set of queries for a data warehouse, the set of queries including one or more queries to be processed by the data warehouse. The device further provisions a set of processors from a first plurality of processors, where the set of processors to process the set of queries, and a set of storage resources to store data for the set of queries. In addition, the device monitors a utilization of the set of processors as the set of processors processes the set of queries. The device additionally updates a number of the processors in the set of processors provisioned based on the utilization/Furthermore, the device processes the set of queries using the updated set of processors.
    Type: Grant
    Filed: June 18, 2020
    Date of Patent: May 18, 2021
    Assignee: Snowflake Inc.
    Inventors: Thierry Cruanes, Benoit Dageville, Marcin Zukowski
  • Patent number: 10997215
    Abstract: The subject technology creates partitions based on changes to a table, at least one of the one or more partitions overlapping with respect to values of one or more attributes with at least one of another partition and a previous partition. The subject technology maintains states for the partitions, each state from the plurality of states representing a particular degree of clustering of the table. The subject technology determines a number of overlapping partitions and a depth of the overlapping partitions, and determines a clustering ratio based at least in part on the number of overlapping partitions and the depth. The subject technology reclusters partitions of the table to increase the clustering ratio, the clustering ratio determined by at least a proportion of rows in a layout of the table that satisfy an ordering criteria based at least in part a particular attribute of the one or more attributes.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: May 4, 2021
    Assignee: Snowflake Inc.
    Inventors: Thierry Cruanes, Marcin Zukowski, Benoit Dageville, Jiaqi Yan
  • Patent number: 10997201
    Abstract: A method, apparatus, and system for join operations of a plurality of relations that are distributed over a plurality of storage locations over a network of computing components.
    Type: Grant
    Filed: July 19, 2018
    Date of Patent: May 4, 2021
    Assignee: SNOWFLAKE INC.
    Inventors: Benoit Dageville, Thierry Cruanes, Marcin Zukowski, Allison Waingold Lee, Philipp Thomas Unterbrunner
  • Publication number: 20210124761
    Abstract: A method and apparatus managing a set of processors for a set of queries is described. In an exemplary embodiment, a device receives a set of queries for a data warehouse, the set of queries including one or more queries to be processed by the data warehouse. The device further provisions a set of processors from a first plurality of processors, where the set of processors to process the set of queries, and a set of storage resources to store data for the set of queries. In addition, the device monitors a utilization of the set of processors as the set of processors processes the set of queries. The device additionally updates a number of the processors in the set of processors provisioned based on the utilization/ Furthermore, the device processes the set of queries using the updated set of processors.
    Type: Application
    Filed: January 4, 2021
    Publication date: April 29, 2021
    Inventors: Thierry CRUANES, Benoit DAGEVILLE, Marcin ZUKOWSKI
  • Publication number: 20210124717
    Abstract: A system and method for pruning data based on metadata. The method may include receiving a query comprising a plurality of predicates and identifying one or more applicable files comprising database data satisfying at least one of the plurality of predicates. The identifying the one or more applicable files including reading metadata stored in a metadata store that is separate from the database data. The method further includes pruning inapplicable files comprising database data that does not satisfy at least one of the plurality of predicates to create a reduced set of files and reading the reduced set of files to execute the query.
    Type: Application
    Filed: January 4, 2021
    Publication date: April 29, 2021
    Inventors: Thierry CRUANES, Benoit DAGEVILLE, Ashish MOTIVALA, Marcin ZUKOWSKI
  • Publication number: 20210103601
    Abstract: Caching systems and methods are described. In one implementation, a method identifies multiple files used to process a query and distributes each of the multiple files to a particular execution node to execute the query. Each execution node determines whether the distributed file is stored in the execution node's cache. If the execution node determines that the file is stored in the cache, it processes the query using the cached file. If the file is not stored in the cache, the execution node retrieves the file from a remote storage device, stores the file in the execution node's cache, and processes the query using the file.
    Type: Application
    Filed: December 17, 2020
    Publication date: April 8, 2021
    Inventors: Benoit Dageville, Thierry Cruanes, Marcin Zukowski
  • Publication number: 20210103600
    Abstract: Caching systems and methods are described. In one implementation, a method identifies multiple files used to process a query and distributes each of the multiple files to a particular execution node to execute the query. Each execution node determines whether the distributed file is stored in the execution node's cache. If the execution node determines that the file is stored in the cache, it processes the query using the cached file. If the file is not stored in the cache, the execution node retrieves the file from a remote storage device, stores the file in the execution node's cache, and processes the query using the file.
    Type: Application
    Filed: December 16, 2020
    Publication date: April 8, 2021
    Inventors: Benoit Dageville, Thierry Cruanes, Marcin Zukowski
  • Patent number: 10970282
    Abstract: Systems, methods, and devices, for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.
    Type: Grant
    Filed: June 11, 2018
    Date of Patent: April 6, 2021
    Assignee: Snowflake Inc.
    Inventors: Florian Andreas Funke, Thierry Cruanes, Benoit Dageville, Marcin Zukowski
  • Patent number: 10970283
    Abstract: Systems, methods, and devices, for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.
    Type: Grant
    Filed: December 17, 2019
    Date of Patent: April 6, 2021
    Assignee: Snowflake Inc.
    Inventors: Florian Andreas Funke, Thierry Cruanes, Benoit Dageville, Marcin Zukowski
  • Patent number: 10963428
    Abstract: A system, apparatus, and method for processing queries wherein the query includes a request to access or delete data and accessing metadata associated with the set of data, the metadata defining data characteristics of the set of data and identifying at least sets of data that need or not need to be accessed or deleted based on the metadata without accessing the actual data in the set of data; also methods to optimize processing of some operations based on the collected metadata on data.
    Type: Grant
    Filed: January 22, 2020
    Date of Patent: March 30, 2021
    Assignee: Snowflake Inc.
    Inventors: Benoit Dageville, Thierry Cruanes, Marcin Zukowski, Allison Waingold Lee, Philipp Thomas Unterbrunner
  • Publication number: 20210089554
    Abstract: Example resource management systems and methods are described. In one implementation, a resource manager is configured to manage data processing tasks associated with multiple data elements. An execution platform is coupled to the resource manager and includes multiple execution nodes configured to store data retrieved from multiple remote storage devices. Each execution node includes a cache and a processor, where the cache and processor are independent of the remote storage devices. A metadata manager is configured to access metadata associated with at least a portion of the multiple data elements.
    Type: Application
    Filed: December 4, 2020
    Publication date: March 25, 2021
    Inventors: Benoit Dageville, Thierry Cruanes, Marcin Zukowski
  • Patent number: 10949446
    Abstract: Example resource provisioning systems and methods are described. In one implementation, multiple processing resources are provided within a data warehouse. The processing resources include at least one processor and at least one storage device. At least one query to process database data is received. At least some of the processing resources may process the database data. When a processing capacity of the processing resources has reached a threshold processing capacity, the processing capacity is automatically scaled by adding at least one additional processor to the data warehouse.
    Type: Grant
    Filed: April 29, 2020
    Date of Patent: March 16, 2021
    Assignee: Snowflake Inc.
    Inventors: Benoit Dageville, Thierry Cruanes, Marcin Zukowski
  • Publication number: 20210056091
    Abstract: A computer system hosting a column-store database engine is responsive to database requests for the update and retrieval of data from within a stable data table and providing for the storage of database tuples within a column-store organized database structure. A positional delta tree data structure is implemented in the memory space of the database engine and is operatively coupled in an update data transfer path between a database engine interface and the stable data table. The positional delta tree data structure includes a differential data storage layer operative to store differential update data values in positionally defined relative reference to database tuples stored by the stable data table.
    Type: Application
    Filed: November 6, 2020
    Publication date: February 25, 2021
    Inventors: Sandor ABC HEMAN, Peter A BONCZ, Marcin ZUKOWSKI, Nicolaas J NES
  • Publication number: 20210049187
    Abstract: A method, apparatus, and system for join operations of a plurality of relations that are distributed over a plurality of storage locations over a network of computing components.
    Type: Application
    Filed: October 30, 2020
    Publication date: February 18, 2021
    Inventors: Benoit Dageville, Thierry Cruanes, Marcin Zukowski, Allison Waingold Lee, Philipp Thomas Unterbrunner
  • Publication number: 20210042326
    Abstract: A method, apparatus, and system for join operations of a plurality of relations that are distributed over a plurality of storage locations over a network of computing components.
    Type: Application
    Filed: October 26, 2020
    Publication date: February 11, 2021
    Inventors: Benoit Dageville, Thierry Cruanes, Marcin Zukowski, Allison Waingold Lee, Philipp Thomas Unterbrunner