Patents by Inventor Xinzhu Cai

Xinzhu Cai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240427790
    Abstract: The subject technology receives a query, the query referencing a unified representation for structured type data and semi-structured type data, the unified representation being provided in storage and in memory during query processing, the unified representation comprising a set of structured type fields that include a set of semi-structured typed fields that enables type safety and enforcement for the set of structured type fields, and flexibility for the set of semi-structured typed fields in a same column, the unified representation in storage including type information for the semi-structured type data as part of the semi-structured type data, the unified representation being utilized for structured type data and semi-structured type data. The subject technology processes the query using the unified representation stored in the memory, the unified representation providing performance parity between structured type data and semi-structured type data.
    Type: Application
    Filed: October 30, 2023
    Publication date: December 26, 2024
    Inventors: Xinzhu Cai, Bowei Chen, Prateek Gaur, Dmitry A. Lychagin, Muthunagappan Muthuraman, Zhuo Peng, Mengran Wang, Jiaqi Yan
  • Publication number: 20240419663
    Abstract: Provided herein are systems, methods, and computer-storage media for managing data skew in hash join operations. A skew manager partitions build-side row data into multiple sets corresponding to hash-join-build (HJB) instances based on hash values. The skew manager detects skew in a build-side row set associated with a first HJB instance by analyzing the number of rows. Upon detecting skew, the skew manager redirects data rows to at least a second HJB instance. The method involves configuring skew caches, generating histograms, and detecting frequent hash values to identify skew. It also includes communicating skew notifications, broadcasting probe-side row data, and adjusting partitioning of probe-side data. The disclosed techniques further include buffering build-side row sets in streams and performing join operations based on these streams, enhancing efficiency in distributed computing environments.
    Type: Application
    Filed: August 29, 2024
    Publication date: December 19, 2024
    Inventors: Xinzhu Cai, Bowei Chen, Bjoern Daase, Moritz Eyssen, Florian Andreas Funke
  • Publication number: 20240273096
    Abstract: A method includes generating, by at least one hardware processor of a first computing node, a plurality of hash values using build-side row data. A frequent hash value of the plurality of hash values is detected based on row size associated with a plurality of build-side row sets including the build-side row data. A plurality of hash partitions of the build-side row data is generated using a build-side row set of the plurality of build-side row sets that includes the frequent hash value. The plurality of hash partitions of the build-side row data is distributed to a corresponding plurality of hash-join-build (HJB) instances associated with a plurality of join operations.
    Type: Application
    Filed: April 24, 2024
    Publication date: August 15, 2024
    Inventors: Xinzhu Cai, Florian Andreas Funke
  • Publication number: 20240232189
    Abstract: Provided herein are systems and methods for handling build-side skew. For example, a method includes computing a plurality of hash values for a join operation. The join operation uses a corresponding plurality of row sets. The plurality of hash values are sampled to detect a frequent hash value. A build-side row set is partitioned using the frequent hash value to generate a partitioned build-side row set. The build-side row set is selected from the plurality of row sets. The partitioned build-side row set is distributed to a plurality of hash-join-build (HJB) instances executing at a corresponding plurality of servers.
    Type: Application
    Filed: October 19, 2022
    Publication date: July 11, 2024
    Inventors: Xinzhu Cai, Florian Andreas Funke
  • Patent number: 12001428
    Abstract: Provided herein are systems and methods for handling build-side skew. For example, a method includes computing a plurality of hash values for a join operation. The join operation uses a corresponding plurality of row sets. The plurality of hash values are sampled to detect a frequent hash value. A build-side row set is partitioned using the frequent hash value to generate a partitioned build-side row set. The build-side row set is selected from the plurality of row sets. The partitioned build-side row set is distributed to a plurality of hash-join-build (HJB) instances executing at a corresponding plurality of servers.
    Type: Grant
    Filed: October 19, 2022
    Date of Patent: June 4, 2024
    Assignee: Snowflake Inc.
    Inventors: Xinzhu Cai, Florian Andreas Funke
  • Publication number: 20240134851
    Abstract: Provided herein are systems and methods for handling build-side skew. For example, a method includes computing a plurality of hash values for a join operation. The join operation uses a corresponding plurality of row sets. The plurality of hash values are sampled to detect a frequent hash value. A build-side row set is partitioned using the frequent hash value to generate a partitioned build-side row set. The build-side row set is selected from the plurality of row sets. The partitioned build-side row set is distributed to a plurality of hash-join-build (HJB) instances executing at a corresponding plurality of servers.
    Type: Application
    Filed: October 18, 2022
    Publication date: April 25, 2024
    Inventors: Xinzhu Cai, Florian Andreas Funke
  • Patent number: 11403294
    Abstract: In one aspect, a computer-implemented method includes detecting, by a server including one or more processors, a request to perform a hash join operation on a data structure stored in a data storage device, forming a hash lookup dictionary based on lookup results in a hash table, storing the hash lookup dictionary in a cache, and probing, during a probing phase of the hash join operation, the cache.
    Type: Grant
    Filed: August 30, 2021
    Date of Patent: August 2, 2022
    Assignee: Snowflake Inc.
    Inventors: Selcuk Aya, Xinzhu Cai, Florian Andreas Funke
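
Several of the abstracts above (for example, patent 12001428) describe sampling hash values to detect a frequent hash value and then partitioning the build-side rows across multiple hash-join-build (HJB) instances. The following Python sketch is only an illustration of that general idea and is not the patented method: the function names, the `sample_every` and `threshold` parameters, and the round-robin spreading rule are all assumptions made for this example.

```python
from collections import Counter
from itertools import count


def detect_frequent_hashes(hashes, sample_every=4, threshold=0.5):
    """Sample every `sample_every`-th hash value and return the values whose
    share of the sample is at least `threshold` (both knobs are hypothetical)."""
    sample = hashes[::sample_every]
    if not sample:
        return set()
    counts = Counter(sample)
    return {h for h, c in counts.items() if c / len(sample) >= threshold}


def partition_build_side(rows, key, num_instances, frequent):
    """Assign build-side rows to HJB instances.

    Rows whose hash is in `frequent` are spread round-robin over all
    instances (an assumed policy); all other rows go to hash % num_instances.
    """
    partitions = [[] for _ in range(num_instances)]
    round_robin = count()
    for row in rows:
        h = hash(row[key])
        if h in frequent:
            target = next(round_robin) % num_instances
        else:
            target = h % num_instances
        partitions[target].append(row)
    return partitions


# Eight build-side rows share join key 7; two rows have other keys.
rows = [{"k": 7, "v": i} for i in range(8)] + [{"k": 1, "v": 8}, {"k": 2, "v": 9}]
hashes = [hash(r["k"]) for r in rows]

frequent = detect_frequent_hashes(hashes, sample_every=2)
skew_aware = partition_build_side(rows, "k", 4, frequent)
naive = partition_build_side(rows, "k", 4, set())

print([len(p) for p in skew_aware])  # [2, 3, 3, 2]
print([len(p) for p in naive])       # [0, 1, 1, 8]
```

Once rows for a frequent key are spread over several instances, probe-side rows with that key can no longer be routed to a single instance. The abstract for publication 20240419663 mentions broadcasting probe-side row data and adjusting probe-side partitioning, which this sketch does not model.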