Query Execution Plan Patents (Class 707/718)
  • Patent number: 11520780
    Abstract: Systems and techniques are described for efficient, general-purpose, and potentially decentralized databases, distributed storage systems, version control systems, and/or other types of data repositories. Data is represented in a database system in such a way that any value is represented by a unique identifier which is derived from the value itself. Any database peer in the system will derive an identical identifier from the same logical value. The identifier for a value may be derived using a variety of mechanisms, including, without limitation, a hash function known to all peers in the system. The values may be organized hierarchically as a tree of nodes. Any two peers storing the same logical value will deterministically represent that value with a graph, such as the described “Prolly” tree, having the same topology and hash value, irrespective of possibly differing sequences of mutations which caused each to arrive at the same final value.
    Type: Grant
    Filed: May 10, 2021
    Date of Patent: December 6, 2022
    Assignee: Salesforce, Inc.
    Inventors: Aaron Boodman, Rafael Weinstein, Erik Arvidsson, Chris Masone, Dan Willhite, Benjamin Kalman
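The content-addressing idea in patent 11520780 above can be illustrated with a minimal sketch. This is not the patent's "Prolly" tree: it is a simplified two-level structure, and the names `value_id`, `is_boundary`, and `root_id`, the JSON encoding, and the low-bits boundary rule are all assumptions made for illustration. It only shows how identifiers derived from values can make two peers agree regardless of mutation order.

```python
import hashlib
import json


def value_id(value):
    """Derive an identifier from the value itself (SHA-256 of a canonical JSON encoding)."""
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_boundary(entry_id, bits=2):
    """Hypothetical chunking rule: an entry closes a chunk when its hash's low bits are zero."""
    return int(entry_id, 16) % (1 << bits) == 0


def root_id(mapping):
    """Chunk the sorted entries at content-defined boundaries, then hash the list of chunk ids."""
    chunks, current = [], []
    for key in sorted(mapping):
        entry = [key, mapping[key]]
        current.append(entry)
        if is_boundary(value_id(entry)):
            chunks.append(value_id(current))
            current = []
    if current:
        chunks.append(value_id(current))
    return value_id(chunks)


# Two peers arrive at the same logical value through different mutation sequences.
peer_a = {}
for k in ["a", "b", "c", "d"]:
    peer_a[k] = k.upper()

peer_b = {"d": "D", "zzz": "temp"}
peer_b["b"] = "B"
peer_b["a"] = "A"
del peer_b["zzz"]
peer_b["c"] = "C"

same_root = root_id(peer_a) == root_id(peer_b)
```

Because chunk boundaries depend only on entry content, not on insertion history, both peers compute the same chunks and therefore the same root identifier.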
  • Patent number: 11507590
    Abstract: Techniques are introduced herein for maintaining geometry-type data on persistent storage and in memory. Specifically, a DBMS that maintains a database table, which includes at least one column storing spatial data objects (SDOs), also maintains metadata for the database table that includes definition data for one or more virtual columns of the table. According to an embodiment, the definition data includes one or more expressions that calculate minimum bounding box values for SDOs stored in the geometry-type column in the table. The one or more expressions in the metadata maintained for the table are used to create one or more in-memory columns that materialize the bounding box data for the represented SDOs. When a query that uses spatial-type operators to perform spatial filtering over data in the geometry-type column is received, the DBMS replaces the spatial-type operators with operators that operate over the scalar bounding box information materialized in memory.
    Type: Grant
    Filed: June 17, 2020
    Date of Patent: November 22, 2022
    Assignee: Oracle International Corporation
    Inventors: Siva Ravada, Ying Hu, Zhen Hua Liu, Shasank Kisan Chavan, Aurosish Mishra, Vikas Arora
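As a rough illustration of the bounding-box approach in patent 11507590 above, the sketch below materializes a bounding box per geometry and answers a spatial filter with scalar comparisons only. The `Polygon` type, the `bbox_column` dictionary, and the function names are hypothetical, and the sketch covers only the scalar filtering step over in-memory bounding boxes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Polygon:
    points: tuple  # sequence of (x, y) vertices


def bounding_box(poly):
    """Return (min_x, min_y, max_x, max_y) for a polygon."""
    xs = [x for x, _ in poly.points]
    ys = [y for _, y in poly.points]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_overlap(a, b):
    """Scalar comparisons only: true when two axis-aligned boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Hypothetical table: row id -> geometry stored in the spatial column.
rows = {
    1: Polygon(((0, 0), (2, 0), (1, 2))),
    2: Polygon(((10, 10), (12, 10), (12, 12), (10, 12))),
    3: Polygon(((1, 1), (5, 1), (5, 5), (1, 5))),
}

# "Virtual column" materialized in memory: one bounding box per row.
bbox_column = {row_id: bounding_box(p) for row_id, p in rows.items()}


def candidate_rows(query_box):
    """Filter on the materialized boxes instead of the geometries themselves."""
    return sorted(rid for rid, box in bbox_column.items() if boxes_overlap(box, query_box))
```

For example, `candidate_rows((0, 0, 3, 3))` returns `[1, 3]`, since only those rows have boxes that intersect the query box.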
  • Patent number: 11507577
    Abstract: Methods, systems, apparatuses, and computer program products are provided for determining a query plan. A query is received that comprises a request for a data result for each of a plurality of original time windows. The plurality of original time windows included in the query are identified. An initial window representation is generated that identifies a set of connections between windows in a window set that includes at least the original time windows. A revised window representation is generated that includes an alternative set of connections between windows in the window set based at least on an execution cost for at least one window. The revised window representation is selected to obtain the data result for each of the plurality of original time windows. A revised query plan based on the revised window representation is provided to obtain the data result for each of the plurality of original time windows.
    Type: Grant
    Filed: May 28, 2020
    Date of Patent: November 22, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Alexander Raizman, Wentao Wu, Philip A. Bernstein
  • Patent number: 11500871
    Abstract: A computer-implemented method is disclosed that includes operations of receiving a query to be executed, the query including an indication of a data source at which input data is to be obtained, wherein the query is to be executed on the input data, determining a schema of the input data, determining fields of the input data that are required for execution of the query by analyzing a sequence of operators forming the query, determining one or more alterations to the query to improve efficiency of the execution of the query based on the fields of input data required for the execution, and generating an altered query by altering the query in accordance with the one or more alterations. The method may further include converting the query to a directed acyclic graph (DAG) and providing the DAG to a distributed processing engine configured to execute the DAG.
    Type: Grant
    Filed: October 19, 2020
    Date of Patent: November 15, 2022
    Assignee: SPLUNK Inc.
    Inventors: Chinmay Madhav Kulkarni, Lin Ma, Amir Malekpour, Mohan Rajagopalan, John C. Reed, Ram Sriharsha
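The field analysis described in patent 11500871 above can be sketched as a backward walk over the operator sequence. The operator vocabulary (`filter`, `eval`, `stats`, `project`) and the dictionary layout are assumptions for illustration, not the patented format, and the sketch assumes the pipeline contains a `stats` operator that defines the output.

```python
def required_fields(pipeline):
    """Walk the operators backwards, tracking which source fields are still needed."""
    needed = set()
    for op in reversed(pipeline):
        kind = op["op"]
        if kind == "stats":
            # Aggregation output replaces everything upstream; only its inputs matter.
            needed = set(op["fields"]) | set(op["by"])
        elif kind == "filter":
            needed |= set(op["fields"])
        elif kind == "eval":
            # A computed field is only worth its inputs if something downstream uses it.
            if op["out"] in needed:
                needed.discard(op["out"])
                needed |= set(op["inputs"])
        else:
            raise ValueError(f"unknown operator: {kind}")
    return needed


def alter(pipeline):
    """One possible alteration: project the required fields right at the source."""
    projection = {"op": "project", "fields": sorted(required_fields(pipeline))}
    return [projection] + pipeline


pipeline = [
    {"op": "filter", "fields": ["status"]},
    {"op": "eval", "out": "kb", "inputs": ["bytes"]},
    {"op": "eval", "out": "unused", "inputs": ["user_agent"]},
    {"op": "stats", "fields": ["kb"], "by": ["host"]},
]

altered = alter(pipeline)
```

Here `user_agent` never reaches the aggregation, so the injected projection lists only `bytes`, `host`, and `status`.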
  • Patent number: 11494379
    Abstract: Disclosed herein are systems and methods for pre-filter deduplication for multidimensional two-sided interval joins. In an embodiment, a data platform receives query instructions for a two-sided N dimensional interval join, where N is an integer greater than 1. The two-sided N dimensional interval join has an interval-join predicate that compares intervals determined from the input relations in each of N dimensions. The data platform implements the two-sided N dimensional interval join as a query-plan section that includes an N dimensional band join that is followed by a deduplication operator that is followed by a filter that applies the interval-join predicate. The N dimensional band join includes a hash join keyed to N dimensional domain cells overlapped at least in part by intervals determined from the input relations in each of the N dimensions. The deduplication operator removes duplicate rows from a potential-duplicates subset of the output of the N dimensional band join.
    Type: Grant
    Filed: April 23, 2021
    Date of Patent: November 8, 2022
    Assignee: Snowflake Inc.
    Inventors: Matthias Carl Adams, Spyridon Triantafyllis, Lars Volker, Kevin Wang
  • Patent number: 11487795
    Abstract: Disclosed is a template-based automatic question and answer method for software bugs. An entity relationship triple is extracted from a bug corpus and a natural language pattern is acquired; an entity relationship in the triple is determined; a query template corresponding to the natural language pattern is acquired; an entity in a question q proposed by a user is replaced with an entity type to acquire a question q′; then, the entity type in q′ and an entity type in the natural language pattern are compared and searched for and a similarity is calculated; then, a SPARQL query pattern of the question q is acquired according to the similarity and the entity in the question q; and finally, the SPARQL query pattern of the question q is executed so as to acquire an answer to the question q.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: November 1, 2022
    Inventors: Xiaobing Sun, Jinting Lu, Bin Li
  • Patent number: 11487708
    Abstract: Techniques for visual data preparation are described. An interactive visual data preparation service provides a user with a graphical user interface that presents values from a sample taken of a dataset along with statistical information associated with those values. A user uses the graphical user interface to test out various transformations to the sample dataset by applying transformations and viewing near-immediate results of those transformations as applied to the sample. The desired set of transformations is represented as a recipe object, which can be used to perform data preparation against the overall dataset or other datasets on behalf of the user or other users.
    Type: Grant
    Filed: November 11, 2020
    Date of Patent: November 1, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Surbhi Dangi, Gopinath Duddi, Amit Gul Phagwani, Romi Boimer, Ronald Stephen Kyker
  • Patent number: 11487772
    Abstract: The present disclosure provides a multi-party data joint query method, a device, a server and a storage medium. The multi-party data joint query method executed by a manager includes: analyzing a multi-party joint query sentence to obtain a logical execution plan; processing the logical execution plan according to providers of respective nodes in the logical execution plan to obtain a physical execution plan of each provider; and generating a query instruction of each provider according to the physical execution plan of each provider, and sending the query instruction to respective provider. The query instruction is configured to instruct the providers to perform a query cooperatively.
    Type: Grant
    Filed: December 26, 2019
    Date of Patent: November 1, 2022
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Zhi Feng, Yu Zhang, Sen Zhang
  • Patent number: 11481398
    Abstract: A system for spilling comprises an interface and a processor. The interface is configured to receive an indication to perform a GROUP BY operation, wherein the indication comprises an input table and a grouping column. The processor is configured to: for each input table entry of the input table, determine a key, wherein the key is based at least in part on the input table entry and the grouping column; add the key to a grouping hash table, wherein adding the key to the grouping hash table comprises last-in, first-out (LIFO) spilling when necessary; create an output table based at least in part on the grouping hash table; and provide the output table.
    Type: Grant
    Filed: December 9, 2020
    Date of Patent: October 25, 2022
    Assignee: Databricks Inc.
    Inventors: Alexander Behm, Ankur Dave, Ryan Deng, Shoumik Palkar
  • Patent number: 11468065
    Abstract: An information processing apparatus according to the present application includes an acquiring unit and a selecting unit. The acquiring unit acquires a plurality of pieces of second triple information hierarchized based on a conceptual system in a plurality of pieces of first triple information indicating a relationship about three types of elements and statistical information indicating the number of pieces of the first triple information associated with each of the plurality of pieces of the second triple information. The selecting unit selects, based on the statistical information acquired by the acquiring unit and based on a predetermined standard related to the statistical information, from among the plurality of pieces of the second triple information, a plurality of pieces of target triple information to be used for a clustering process.
    Type: Grant
    Filed: February 21, 2019
    Date of Patent: October 11, 2022
    Assignee: YAHOO JAPAN CORPORATION
    Inventors: Kiyoshi Nitta, Iztok Savnik
  • Patent number: 11468073
    Abstract: Techniques are provided for gathering statistics in a database system. The techniques involve gathering some statistics using an “on-the-fly” technique, some statistics through a “high-frequency” technique, and yet other statistics using a “prediction” technique. The technique used to gather each statistic is based, at least in part, on the overhead required to gather the statistic. For example, low-overhead statistics may be gathered “on-the-fly” using the same process that is performing the operation that affects the statistic, while statistics whose gathering incurs greater overhead may be gathered in the background, while the database is live, using the high-frequency technique. The prediction technique may be used for relatively-high overhead statistics that can be predicted based on historical data and the current value of predictor statistics.
    Type: Grant
    Filed: August 6, 2019
    Date of Patent: October 11, 2022
    Assignee: Oracle International Corporation
    Inventors: Mohamed Zait, Yuying Zhang, Hong Su, Jiakun Li
  • Patent number: 11461304
    Abstract: Signature-based cache optimization for data preparation includes: performing a first set of sequenced data preparation operations on one or more sets of data to generate a plurality of transformation results; caching one or more of the plurality of transformation results and one or more corresponding operation signatures, a cached operation signature being derived based at least in part on a subset of sequenced operations that generated a corresponding result; receiving a specification of a second set of sequenced operations; determining an operation signature associated with the second set of sequenced operations; identifying a cached result among the cached results based at least in part on the determined operation signature; and outputting the cached result.
    Type: Grant
    Filed: March 10, 2020
    Date of Patent: October 4, 2022
    Assignee: DataRobot, Inc.
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso
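A minimal sketch of the signature-based caching in patent 11461304 above follows. Hashing the source name together with a prefix of the operation sequence is an assumption about how a signature might be derived, and `PrepCache`, `OPS`, and the three toy operations are hypothetical.

```python
import hashlib
import json


def signature(operations):
    """Derive a signature from a sequence of operation names (assumed scheme: SHA-256 of JSON)."""
    encoded = json.dumps(operations).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Toy data preparation operations, keyed by name.
OPS = {
    "drop_nulls": lambda rows: [r for r in rows if r is not None],
    "double": lambda rows: [r * 2 for r in rows],
    "sort": sorted,
}


class PrepCache:
    def __init__(self):
        self.results = {}   # signature -> cached transformation result
        self.executed = 0   # number of operations actually run

    def run(self, source_name, data, operations):
        # Look for the longest prefix of the requested sequence that is already cached.
        start, rows = 0, list(data)
        for n in range(len(operations), 0, -1):
            key = signature([source_name] + operations[:n])
            if key in self.results:
                start, rows = n, self.results[key]
                break
        # Execute only the remaining operations, caching each intermediate result.
        for i in range(start, len(operations)):
            rows = OPS[operations[i]](rows)
            self.executed += 1
            self.results[signature([source_name] + operations[: i + 1])] = rows
        return rows


cache = PrepCache()
data = [3, None, 1]
first = cache.run("t", data, ["drop_nulls", "double"])
second = cache.run("t", data, ["drop_nulls", "double", "sort"])
third = cache.run("t", data, ["drop_nulls", "double"])
```

The second call reuses the cached two-step result and runs only `sort`; the third call is served entirely from the cache.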
  • Patent number: 11461195
    Abstract: A method for processing a query fault, where a database server receives a query statement and generates a corresponding query plan tree including multiple layers of operators in a pipeline relationship, and each layer includes operation symbols having a logical relationship with each other. The server executes the query statement according to the query plan tree, extracts intermediate status information of a faulty operator when a fault occurs in a process of executing the query statement, updates operation symbols of the faulty operator and a logical relationship among the operation symbols according to the query plan tree and the intermediate status information to obtain a reconstructed query plan tree, and continues to execute the query statement according to the reconstructed query plan tree after the fault is recovered.
    Type: Grant
    Filed: November 23, 2020
    Date of Patent: October 4, 2022
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Jinwei Zhu, Qingqing Zhou, Pinggao Zhou
  • Patent number: 11461327
    Abstract: The subject technology receives a query, the query including a set of statements for performing the query. The subject technology populates a compilation context based at least in part on the query. The subject technology provides the compilation context to a compiler. The subject technology invokes the compiler to perform a compilation process based on the compilation context, the compilation process comprising performing a lookup operation on a stored plan cache for an exact match based on information from the compilation context, the stored plan cache including a set of stored query plans, and determining whether the exact match of a particular query plan is found in the stored plan cache to avoid compiling the query using the compilation context.
    Type: Grant
    Filed: April 8, 2022
    Date of Patent: October 4, 2022
    Assignee: Snowflake Inc.
    Inventors: Thierry Cruanes, Xuelai Cui, Sangyong Hwang, Allison Waingold Lee, Boyung Lee, Nicola Dan Onose, William Waddington, Jiaqi Yan, Li Yan, Yongsik Yoon
  • Patent number: 11455307
    Abstract: A system includes determination of a plurality of queries of a workload, determination of a data source comprising a plurality of data rows, and determination of a sample data source based on a cardinality of each of the plurality of queries with respect to the data source and an estimated cardinality of each of the plurality of queries with respect to the data source, wherein the estimated cardinality of a query with respect to the data source is determined based on the sample data source.
    Type: Grant
    Filed: February 21, 2020
    Date of Patent: September 27, 2022
    Assignee: SAP SE
    Inventors: Axel Hertzschuch, Norman May, Lars Fricke, Florian Wolf, Guido Moerkotte, Wolfgang Lehner
  • Patent number: 11455306
    Abstract: Techniques are described herein for leveraging recurrent neural networks for query processing. In some embodiments, a query analytic system determines a sequence of tokens for at least a portion of a query and determines a vector representation for each token. The query analytic system further generates, using a neural network based on the sequence of tokens, a performance prediction associated with executing at least the portion of the query, wherein the neural network assigns at least a first weight for at least a first token in the sequence of tokens based at least in part on at least a second token that preceded the token in the sequence. The query analytic system further triggers a responsive action, such as triggering an alert and/or tuning the query, based at least in part on the performance prediction.
    Type: Grant
    Filed: January 21, 2020
    Date of Patent: September 27, 2022
    Assignee: Oracle International Corporation
    Inventors: Arvind Kumar Maheshwari, Vamshidhar Reddy Pasham, Shantanu Mahajan, Debottam Kundu
  • Patent number: 11449481
    Abstract: A data storage and query method and device are disclosed, which facilitate quick acquisition of query results through index queries at subsequent stages by establishing indexes for columns of a table. Furthermore, scanning data in the table to obtain statistical information of data in the columns facilitates using that statistical information to perform cost estimation in subsequent queries, in an attempt to obtain a data query mode that has the least cost and the best performance, thus improving query efficiency.
    Type: Grant
    Filed: June 5, 2020
    Date of Patent: September 20, 2022
    Assignee: Alibaba Group Holding Limited
    Inventors: Jiye Tu, Chuangxian Wei, Chaoqun Zhan
  • Patent number: 11442933
    Abstract: An approach for implementing function semantic based partition-wise SQL execution and partition pruning in a data processing system is provided. The system receives a query directed to a range-partitioned table and determines if operation key(s) of the query include(s) function(s) over the table partitioning key(s). If so, the system obtains a set of values corresponding to each partition by evaluating the function(s) on a low bound and/or a high bound table partitioning key value corresponding to the partition. The system may then compare the sets of values corresponding to different partitions and determine whether to aggregate results obtained by executing the query over the partitions based on the comparison. The system may also determine whether to prune any partitions from processing based on a set of correlations between the set of values for each partition and predicate(s) of the query including function(s) over the table partitioning key(s).
    Type: Grant
    Filed: September 21, 2017
    Date of Patent: September 13, 2022
    Assignee: Oracle International Corporation
    Inventors: Srikanth Bellamkonda, Andrew Witkowski, Manish Pratap Singh, Madhuri Kandepi
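The pruning step in patent 11442933 above can be sketched by evaluating the function on each partition's low and high bounds. The sketch assumes a monotonically non-decreasing function (here the year of a date) and inclusive bounds; the partition names and helper names are hypothetical.

```python
from datetime import date

# Hypothetical range partitions on a date column: name -> (low bound, high bound), inclusive.
partitions = {
    "p2020h2": (date(2020, 7, 1), date(2020, 12, 31)),
    "p2021h1": (date(2021, 1, 1), date(2021, 6, 30)),
    "p2021h2": (date(2021, 7, 1), date(2021, 12, 31)),
    "p2022h1": (date(2022, 1, 1), date(2022, 6, 30)),
}


def year_of(d):
    """Function applied over the partitioning key, as in a predicate like YEAR(d) = 2021."""
    return d.year


def function_range(func, bounds):
    """For a monotonic function, its values inside a partition lie between f(low) and f(high)."""
    low, high = bounds
    return func(low), func(high)


def prune(parts, func, wanted):
    """Keep only partitions whose function range can contain the wanted value."""
    keep = []
    for name, bounds in parts.items():
        lo, hi = function_range(func, bounds)
        if lo <= wanted <= hi:
            keep.append(name)
    return keep


survivors = prune(partitions, year_of, 2021)
```

In this sketch the two surviving partitions share the value 2021, so a grouping on the year would still need their results combined; comparing the per-partition ranges is how such overlap could be detected.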
  • Patent number: 11443588
    Abstract: Embodiment method and associated apparatus relate to altering the expected value of a system modeled by a random process simplified to produce a binary outcome. Various embodiments modify a genetic algorithm to optimize such settings as population size, number of iterations to convergence, mutation chance, and sample space. Some embodiment ARON implementations correctly predict game outcome relative to the spread, based on transforming unrelated raw data, applying the transformed raw data to a modified genetic algorithm, generating multiple expected outcomes determined by the modified genetic algorithm as a function of the transformed raw data, and filtering the outcomes as a function of predefined metrics to produce a single end result that can be utilized effectively by an evolutionary-style algorithm.
    Type: Grant
    Filed: January 6, 2022
    Date of Patent: September 13, 2022
    Assignee: Ladris Technologies, Inc.
    Inventors: Leo Zlimen, Bowen Kyle
  • Patent number: 11429610
    Abstract: A method, a system, and a computer program product for generating a query executable plan. A query requiring access to data stored in a database system is received. Based on the received query, a query execution plan having a plurality of query execution pipelines is generated. Each query execution pipeline in the plurality of query execution pipelines is configured to execute a plurality of operations in a predetermined order associated with each query execution pipeline. The generated query execution plan is fragmented into a plurality of fragments. Each fragment has one or more query execution pipelines in the plurality of query execution pipelines. The received query is executed by executing each fragment in the plurality of fragments.
    Type: Grant
    Filed: April 1, 2020
    Date of Patent: August 30, 2022
    Assignee: SAP SE
    Inventors: Xun Cheng, Zhen Tian, Yuncong Qiao, Faming Qu, Paul Willems, Hongyong Lu, Yanxin Luo, Nitesh Maheshwari
  • Patent number: 11429630
    Abstract: Tiered storage may be implemented for processing data. Data processors may maintain some of a data set, including user data and metadata describing the user data, locally. The data set is also maintained at a data store remote to the data processor. When processing requests are received, a determination is made as to whether the local portions of the data set can execute the processing request or one or more additional portions of the data set are needed from the remote data store. If additional portions of the data set are needed, then a request may be sent to the data store for the additional portions. Once received, the data processor may execute the processing request utilizing the additional portions. Portions of the data set maintained locally at the data processor may be selected and flushed from local storage to the remote data store.
    Type: Grant
    Filed: May 8, 2020
    Date of Patent: August 30, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Anurag Windlass Gupta, Andrew Edward Caldwell
  • Patent number: 11423047
    Abstract: The present disclosure relates to computer-implemented methods, software, and systems for managing data replication between different source sections and target sections in response to received copy instructions associated with copy profiles. In response to evaluating statistical metadata identifying whether data records in relation to at least one client are included for a table from a first set of tables, a first subset of tables from the first set of tables is determined. In response to evaluating update metadata defining latest updates of tables from the first subset of tables, a second subset of tables from the first subset of tables is determined that defines tables that include updated data records relevant for copying. The second subset of tables is iteratively evaluated to define corresponding operations to be performed for tables at the target section and at the source section in the database in relation to the requested copy operation.
    Type: Grant
    Filed: May 11, 2020
    Date of Patent: August 23, 2022
    Assignee: SAP SE
    Inventors: Dominik Ofenloch, Thomas Vogt
  • Patent number: 11403299
    Abstract: Embodiments of the present disclosure are directed to techniques for monitoring and orchestrating the use and generation of collaborative data in a trustee environment subject to configurable constraints. A user interface can be provided to enable tenants to specify desired computations and constraints on the use and access to their data. A constraint manager can communicate with various components in the trustee environment to implement the constraints. For example, requests to execute an executable unit of logic such as a command or function call may be issued to the constraint manager, which can grant or deny permission. Permission may be granted subject to one or more conditions that implement the constraints, such as requiring the replacement of a particular executable unit of logic with a constrained executable unit of logic. As constraints are applied, any combination of schema, constraints, and/or attribution metadata can be associated with the data.
    Type: Grant
    Filed: April 18, 2019
    Date of Patent: August 2, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Yisroel Gershon Taber, Tomer Turgeman, Lev Rozenbaum
  • Patent number: 11392607
    Abstract: Embodiments for intelligent automated feature engineering for relational data in a computing environment by a processor. Indices may be automatically selected and built from one or more columns of one or more tables in a relational database using one or more automated feature engineering models that include a set of queries. One or more features may be determined using a set of queries of the automated feature engineering models to execute for a scoring operation.
    Type: Grant
    Filed: January 30, 2020
    Date of Patent: July 19, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Thanh Lam Hoang, Hong Min
  • Patent number: 11386087
    Abstract: In some aspects, there is provided a method including receiving an execution plan file, the execution plan file utilizing at least one operator of interest and further utilizing other actions separate from the at least one operator of interest. The method further includes forming an execution plan object by modifying the execution plan file by isolating the at least one operator of interest from the other actions. The method further includes performing a series of tests executing an extended execution plan object. The series of tests can include receiving the input data identified by the one or more pointers in the extended execution plan object, executing the extended execution plan object using the received input data, measuring, based on the execution of the extended execution plan object, at least one cost metric representative of execution of the at least one operator of interest, and outputting the measured cost metric.
    Type: Grant
    Filed: April 3, 2019
    Date of Patent: July 12, 2022
    Assignee: SAP SE
    Inventors: Marius Eich, Dennis Felsing
  • Patent number: 11386086
    Abstract: A DBMS query-optimization module receives a corpus of training data that contains data-access requests, such as SQL queries. Each request specifies data to be accessed but does not specify a query plan that the database should use to generate the requested data. The module identifies, in each received request, parameters, such as join methods and access methods, that can vary among query plans; and variables that cannot be assigned values until the query is actually processed. The system generates a set of queries, each of which implements a distinct query plan, that specify every viable permutation of values of the parameters and of the variables. The generated queries are added to the received corpus, which is forwarded to a machine-learning module in order to train the query-optimizer to select query plans that incur the lowest resource costs when servicing a particular type of query.
    Type: Grant
    Filed: August 30, 2018
    Date of Patent: July 12, 2022
    Assignee: International Business Machines Corporation
    Inventors: Terence P. Purcell, Thomas A. Beavin, Martin Dinh, Brian L. Baggett
  • Patent number: 11379480
    Abstract: Sub-plans are executed in parallel using a plurality of execution nodes, which can be part of a data platform. In particular, the system identifies, in a query plan, sub-plans (e.g., fragments or portions of one or more child operators) of a root operator that are candidates for execution on a single execution node, determines a cost estimate for causing the candidate sub-plans to be executed in parallel using multiple execution nodes, and causes the candidate sub-plans to be executed in parallel based on the cost estimate.
    Type: Grant
    Filed: January 11, 2022
    Date of Patent: July 5, 2022
    Assignee: Snowflake Inc.
    Inventors: Sebastian Breß, Moritz Eyssen, Max Heimel
  • Patent number: 11354373
    Abstract: A system and method for displaying data using temporal granularities. The method includes determining at least one first dataset of a plurality of datasets based on at least one temporal data requirement, wherein the plurality of datasets is generated based on a data model, wherein each of the plurality of datasets is generated based further on a distinct temporal granularity of a plurality of temporal granularities, wherein the distinct temporal granularity of each of the at least one first dataset meets at least one of the at least one temporal data requirement; and querying the determined at least one first dataset in order to obtain at least one query result.
    Type: Grant
    Filed: December 9, 2019
    Date of Patent: June 7, 2022
    Assignee: Sisense Ltd.
    Inventors: Guy Boyangu, Leon Gendler
  • Patent number: 11354312
    Abstract: A federated database-management system receives an SQL query or other type of data-access request. The federated system's host DBMS parses, rewrites, and optimizes the request into an optimal data-access plan, then determines which portions of the plan require access to data stored on the federated system's remote databases. The federated host partitions the plan into subplans that each represent instructions of the original data-access request that were directed to a corresponding remote database of the federated DBMS. Each subplan is then transmitted to its corresponding remote database, which directly executes the subplan and returns results to the host. If necessary, a subplan is translated from an original generic access-plan format into a database-specific format required by its corresponding remote database.
    Type: Grant
    Filed: August 29, 2019
    Date of Patent: June 7, 2022
    Assignee: International Business Machines Corporation
    Inventors: Chang Sheng Liu, Yan Li Xu, Hui Guo, Yao M. Wang, Hai Jun Shen, Ping Liu
  • Patent number: 11354290
    Abstract: A query processing system generates and employs an inverted index of predicates for predicate statement evaluation. The inverted index maps values for variables to predicates that evaluate to true for the corresponding values. When querying input data, the query processing system identifies a value for each variable in the input data. For each value and variable pair, the query processing system identifies predicates mapped to the value for the variable in the inverted index. The query processing system evaluates the predicate statements by treating each predicate identified from the inverted index as true. In some configurations, the query processing system represents each predicate statement using a bit string and evaluates the predicate statements for the input data by setting bits to one for predicates identified from the inverted index and determining predicate statements that evaluate to true based on the bit strings.
    Type: Grant
    Filed: January 30, 2020
    Date of Patent: June 7, 2022
    Assignee: ADOBE INC.
    Inventor: Sandeep Nawathe
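The inverted-index evaluation in patent 11354290 above can be sketched with integers used as bit strings. The sketch assumes equality predicates and predicate statements that are plain conjunctions; the variable names, values, and statement names are hypothetical.

```python
# Predicate i means "variable == value" (equality-only predicates are an assumption).
predicates = [
    ("country", "US"),
    ("country", "CA"),
    ("plan", "pro"),
    ("plan", "free"),
]

# Each statement is a conjunction of predicate indices (also an assumption).
statements = {
    "us_pro": [0, 2],
    "ca_any": [1],
    "free_us": [3, 0],
}

# Inverted index: (variable, value) -> predicates that are true for that value.
inverted = {}
for index, (variable, value) in enumerate(predicates):
    inverted.setdefault((variable, value), []).append(index)


def evaluate(record):
    """Set one bit per predicate found via the index, then test each statement's mask."""
    bits = 0
    for variable, value in record.items():
        for index in inverted.get((variable, value), []):
            bits |= 1 << index
    matched = []
    for name, needed in statements.items():
        mask = sum(1 << i for i in needed)
        if bits & mask == mask:
            matched.append(name)
    return matched
```

With this setup, `evaluate({"country": "US", "plan": "pro"})` returns `["us_pro"]` without evaluating any predicate directly against the record.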
  • Patent number: 11347761
    Abstract: Techniques for a system capable of performing low-latency database query processing are disclosed herein. The system includes a gateway server and a plurality of worker nodes. The gateway server is configured to divide a database query, for a database containing data stored in a distributed storage cluster having a plurality of data nodes, into a plurality of partial queries and construct a query result based on a plurality of intermediate results. Each worker node of the plurality of worker nodes is configured to process a respective partial query of the plurality of partial queries by scanning data related to the respective partial query that is stored on at least one data node of the distributed storage cluster and generate an intermediate result of the plurality of intermediate results that is stored in a memory of that worker node.
    Type: Grant
    Filed: May 12, 2020
    Date of Patent: May 31, 2022
    Assignee: Meta Platforms, Inc.
    Inventors: Raghotham Sathyanarayana Murthy, Ragat Goel
  • Patent number: 11347735
    Abstract: Embodiments of the present disclosure may provide a dynamic query execution model. This query execution model may provide acceleration by scaling out parallel parts of a query (also referred to as a fragment) to additional computing resources, for example computing resources leased from a pool of computing resources. Execution of the parts of the query may be coordinated by a parent query coordinator, where the query originated, and a fragment query coordinator.
    Type: Grant
    Filed: June 1, 2020
    Date of Patent: May 31, 2022
    Assignee: Snowflake Inc.
    Inventors: Thierry Cruanes, Igor Demura, Varun Ganesh, Prasanna Rajaperumal, Libo Wang, Jiaqi Yan
  • Patent number: 11341132
    Abstract: An original query execution plan of a database query is received. The original query execution plan represents a tree of operators. Source code for the original query execution plan is generated by a single traversal of the tree of operators. The generated source code is compiled into native machine code. The native machine code represents a simplified native access plan (SNAP).
    Type: Grant
    Filed: September 1, 2015
    Date of Patent: May 24, 2022
    Assignee: SYBASE, INC.
    Inventors: Xiaobin Ma, Xun Cheng, Prabhas Kumar Samanta
  • Patent number: 11341135
    Abstract: An approach is provided for optimizing data fetching. A query employing a method to fetch data from a JSON document is received. An amount of time required to execute the query and a number of nested layers in a traversal of the JSON document required to fetch the data are determined. Based on the amount of time and the number of nested layers, a cost associated with an execution of the query is calculated. The cost is determined to exceed a threshold value. Responsive to the determination that the cost exceeds the threshold value and using historical query patterns and historical query execution times, a schema of the JSON document is re-designed. The data is fetched from the JSON document using the re-designed schema.
    Type: Grant
    Filed: February 28, 2020
    Date of Patent: May 24, 2022
    Assignee: International Business Machines Corporation
    Inventors: Ravi Chandra Chamarthy, Kishore Patel
  • Patent number: 11341090
    Abstract: A system for data migration is disclosed. The system may receive a migration request comprising a source file path and a target file location. The system may capture source file metadata based on the source file path and the migration request. The system may transfer a source file from a first data environment to an intermediate data environment via a first transfer process. The system may transfer the source file from the intermediate data environment to a second data environment via a second transfer process.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: May 24, 2022
    Assignee: AMERICAN EXPRESS TRAVEL RELATED SERVICES COMPANY, INC.
    Inventors: Arindam Chatterjee, Pratyush Kotturu, Pratap Singh Rathore, Brian C. Rosenfield, Nitish Sharma, Swatee Singh, Mohammad Torkzahrani
  • Patent number: 11327967
    Abstract: In some embodiments, a system is provided, comprising: memory storing instructions that, when executed, cause a processor to: submit a first database query; receive a runtime to execute the first database query using a plan selected by a query optimizer; receive runtimes to execute the first database query using a plurality of test plans; determine, based on the runtimes, a metric indicative of the effectiveness of the query optimizer; and cause the metric indicative of the effectiveness of the query optimizer to be presented to a user.
    Type: Grant
    Filed: June 1, 2018
    Date of Patent: May 10, 2022
    Assignee: Brandeis University
    Inventors: Olga Papaemmanouil, Mitch Cherniack, Zhan Li
  • Patent number: 11327969
    Abstract: Techniques for managing database workloads using similarity measures based on queries executed are described. Classical techniques from information retrieval are applied to the domain of database workload management. Specifically, the technique of using document term vectors to compute similarity measures is applied using the conceptual mapping of SQL workloads as “documents” composed of SQL queries as “terms.” The techniques include generating two or more sets of workloads with each workload representing a set of queries executed on at least one database. Based on the sets of workloads, workload term vectors are calculated that represent the set of queries executed on the database. Then, based on the calculated workload vectors, a similarity score is generated between the two or more sets of workloads.
    Type: Grant
    Filed: July 15, 2020
    Date of Patent: May 10, 2022
    Assignee: Oracle International Corporation
    Inventor: John Mark Beresniewicz
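    The "workloads as documents, queries as terms" mapping in this abstract can be illustrated with a short sketch. The abstract does not say which weighting or similarity measure is used; raw execution counts and cosine similarity are assumptions made here for illustration, and workload_vector and cosine_similarity are hypothetical names.

```python
import math
from collections import Counter


def workload_vector(queries):
    """Build a term vector: each distinct SQL text is a term, its count the weight."""
    return Counter(queries)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors (assumed measure)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty workload is treated as similar to nothing
    return dot / (norm_a * norm_b)


monday = workload_vector(["SELECT a FROM t", "SELECT a FROM t", "SELECT b FROM u"])
tuesday = workload_vector(["SELECT a FROM t", "SELECT c FROM v"])
print(round(cosine_similarity(monday, tuesday), 3))
```

    In practice the query text would likely be normalized first (for example by stripping literals) so that executions differing only in bind values map to the same term; that step is omitted here.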
  • Patent number: 11314739
    Abstract: The present disclosure relates to a method of managing requests to a key-value database. A non-limiting example of the method includes receiving a request that includes a number of keys. The number of keys can be compared with a first threshold number and second threshold number. If the number of keys exceeds the first threshold number, the request can be split. If the number of keys is smaller than the second threshold number, the request can be merged with at least one previous or subsequent request. Requests resulting from the splitting and merging steps can be submitted to the key-value database for further processing of the submitted requests.
    Type: Grant
    Filed: April 9, 2018
    Date of Patent: April 26, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Robert Birke, Navaneeth Rameshan, Yiyu Chen, Martin Schmatz
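    The split-and-merge policy in this abstract can be sketched as below. This is a simplified illustration under stated assumptions: a request is modelled as a plain list of keys, oversized requests are cut into chunks of at most split_above keys, and undersized requests are buffered until the buffer reaches merge_below keys. The function name rebalance and both parameter names are hypothetical.

```python
def rebalance(requests, split_above, merge_below):
    """Split large key requests and merge small ones before submission.

    requests:    list of requests, each a list of keys
    split_above: first threshold; requests with more keys are split
    merge_below: second threshold; requests with fewer keys are merged
    """
    out = []
    pending = []  # keys from small requests waiting to be merged
    for keys in requests:
        if len(keys) > split_above:
            # Cut the request into chunks of at most split_above keys.
            for start in range(0, len(keys), split_above):
                out.append(keys[start:start + split_above])
        elif len(keys) < merge_below:
            # Fold the small request into the merge buffer.
            pending.extend(keys)
            if len(pending) >= merge_below:
                out.append(pending)
                pending = []
        else:
            out.append(keys)
    if pending:
        out.append(pending)  # flush whatever is left at the end
    return out


print(rebalance([[1, 2, 3, 4, 5], [6], [7], [8, 9]], split_above=2, merge_below=2))
```

    Note that this sketch does not re-merge the short tail chunk left by a split, and merged requests are emitted when the buffer fills rather than in strict arrival order; a real system would need to decide both points.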
  • Patent number: 11314775
    Abstract: A novel distributed graph database is provided that is designed for efficient graph data storage and processing on modern computing architectures. In particular, a single-node graph database and a runtime and communication layer allow for composing a distributed graph database from multiple single-node instances.
    Type: Grant
    Filed: August 27, 2019
    Date of Patent: April 26, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Chun-Fu Chen, Jason L. Crawford, Ching-Yung Lin, Jie Lu, Mark R. Nutter, Toyotaro Suzumura, Ilie G. Tanase, Danny L. Yeh
  • Patent number: 11308047
    Abstract: System, method, and various embodiments for providing a data access and recommendation system are described herein. An embodiment operates by identifying a column access of one or more data values of a first column of a plurality of columns of a table of a database during a sampling period. A count of how many of the one or more data values are accessed during the column access is recorded. A first counter, corresponding to the first column and stored in a distributed hash table, is incremented by the count. The sampling period is determined to have expired. A load recommendation on how to load data values into the first column based on the first counter is computed. The load recommendation for implementation into the database for one or more subsequent column accesses is provided.
    Type: Grant
    Filed: March 12, 2020
    Date of Patent: April 19, 2022
    Assignee: SAP SE
    Inventors: Panfeng Zhou, Vivek Kandiyanallur, Colin Florendo, Robert Schulze, Zheng-Wei She, Yanhong Wang, Amarnadh Sai Eluri
  • Patent number: 11301517
    Abstract: Software is increasingly being developed as a collection of loosely coupled applications. Loosely coupled applications exchange data by publishing data to and retrieving data from a data store, such as a database, a file located on a storage cluster, etc. Data produced by one application and consumed by another is referred to as a data dependency. In some embodiments, an application's data dependencies are identified by analyzing cached query plans associated with the application. Query plans include a hierarchical representation of a query, where non-leaf nodes represent commands and leaf nodes identify data dependencies. An application's data dependencies are identified by traversing the hierarchical representation of the query. Data dependencies consumed by the application are identified by finding leaf nodes that descend from a read command, while data dependencies produced by the application are identified by finding leaf nodes that descend from a write command.
    Type: Grant
    Filed: May 7, 2020
    Date of Patent: April 12, 2022
    Assignee: eBay Inc.
    Inventors: Sizhong Liu, Zou Qingnan, Yi Liu, Ian Chi-Yee Ma, Haowen Zhu
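    The plan traversal described in this abstract can be sketched as follows. The node structure and command names are assumptions, not taken from the patent: PlanNode, READ_COMMANDS, WRITE_COMMANDS, and data_dependencies are hypothetical, and each leaf is attributed to its nearest enclosing read or write command, which is one possible reading of "descend from".

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    label: str                                   # command name, or data source for a leaf
    children: list = field(default_factory=list)


READ_COMMANDS = {"SELECT", "SCAN"}    # hypothetical read commands
WRITE_COMMANDS = {"INSERT", "UPDATE"}  # hypothetical write commands


def data_dependencies(root):
    """Return (consumed, produced) data sources found in a query plan tree."""
    consumed, produced = set(), set()

    def visit(node, mode):
        if not node.children:
            # Leaf: a data dependency, classified by the nearest command above it.
            if mode == "read":
                consumed.add(node.label)
            elif mode == "write":
                produced.add(node.label)
            return
        if node.label in READ_COMMANDS:
            mode = "read"
        elif node.label in WRITE_COMMANDS:
            mode = "write"
        for child in node.children:
            visit(child, mode)

    visit(root, None)
    return consumed, produced


# Plan for something like: INSERT INTO orders_summary SELECT ... FROM orders JOIN customers
plan = PlanNode("INSERT", [
    PlanNode("orders_summary"),
    PlanNode("SELECT", [
        PlanNode("JOIN", [PlanNode("orders"), PlanNode("customers")]),
    ]),
])
consumed, produced = data_dependencies(plan)
print(sorted(consumed), sorted(produced))
```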
  • Patent number: 11288447
    Abstract: Using a step editor for data preparation includes: receiving an indication of a user input with respect to at least some of a set of sequenced data preparation operations on a set of data; generating, using one or more processors, a signature based at least in part on the set of sequenced data preparation operations, references to the set of data, and the user input; using the generated signature to determine whether there exists a cached result associated with the set of sequenced data preparation operations, the references to the set of data, and the user input; based at least in part on the determination, obtaining a data traversal program representing a result associated with the set of sequenced operations, the references to the set of data, and the user input; and providing output based at least in part on the result represented by the obtained data traversal program.
    Type: Grant
    Filed: March 10, 2020
    Date of Patent: March 29, 2022
    Assignee: DR HoldCo 2, Inc.
    Inventors: Nenshad Dinshaw Bardoliwalla, Michael Matthews, Ian Timourian, Jing Chen, Lilia Gutnik, Whitman Kwok, Dave Brewster, Victor Tze-Yeuan Tso
  • Patent number: 11288179
    Abstract: Systems and methods for computer memory management by a memory coordinator and a plurality of memory consumers. An urgency and memory quota of each memory consumer is initialized by the memory coordinator, which then adjusts the memory quota of each memory consumer such that the sum of the memory quota of each memory consumer does not exceed a finite amount of computer memory. Each memory consumer adjusts its memory usage in response to the quota input and urgency input from the memory coordinator.
    Type: Grant
    Filed: July 27, 2020
    Date of Patent: March 29, 2022
    Assignee: Kinaxis Inc.
    Inventors: Angela Lin, Robert Walker, Marin Creanga, Dylan Ellicott, Alex Fitzpatrick
  • Patent number: 11281668
    Abstract: A database engine receives a query batch of database queries from a client. The database engine identifies one or more object model queries from the query batch. Each object model query includes an outer-most outer-join that joins a respective dimension subquery and respective aggregated measure subqueries. The database engine forms a plurality of candidate subqueries by peeling off the respective outer-most outer-join for each of the object model queries. The database engine then fuses at least some of the plurality of candidate subqueries to form a set of optimized subqueries. The set of optimized subqueries has fewer subqueries than the plurality of candidate queries. The database engine also forms an optimized execution plan based on the set of one or more optimized subqueries. The database engine subsequently obtains a result set from the database based on the optimized execution plan, and returns the result set to the client.
    Type: Grant
    Filed: June 18, 2020
    Date of Patent: March 22, 2022
    Assignee: TABLEAU SOFTWARE, LLC
    Inventors: Nicolas Ratigan Borden, Justin Talbot, Christian Gabriel Eubank
  • Patent number: 11269878
    Abstract: The embodiments of this application provide an uncorrelated subquery optimization method and apparatus, and a storage medium. The method includes determining whether there is an uncorrelated subquery statement in a target clause in a database query statement. In response to the determination that there is the uncorrelated subquery statement in the target clause in the database query statement, the method includes obtaining an estimated number of rows of an execution result set corresponding to the target clause; and determining whether the estimated number of rows is less than a preset threshold. In response to the determination that the estimated number of rows is less than a preset threshold, the method includes executing the uncorrelated subquery statement, and rewriting the target clause according to an execution result set of the uncorrelated subquery statement, to eliminate the uncorrelated subquery statement.
    Type: Grant
    Filed: July 25, 2019
    Date of Patent: March 8, 2022
    Assignee: Tencent Technology (Shenzhen) Company Limited
    Inventor: Haixiang Li
  • Patent number: 11256480
    Abstract: A data-instantiator method handle is configured to create a target object based on a stream object. One type of data-instantiator method handle is a Stream Object Processor method handle (SOP_mh). A SOP_mh is a runtime-computed constant in a runtime constant pool. A runtime environment resolves the SOP_mh lazily responsive to a request to access the SOP_mh. The runtime environment invokes the SOP_mh to create a target object based on a stream object. By virtue of being a constant in the runtime constant pool, the SOP_mh is a candidate for optimization by a dynamic compiler in the runtime environment. The dynamic compiler may elect to constant fold the value of the SOP_mh and inline any code or executable logic that the SOP_mh refers to.
    Type: Grant
    Filed: February 9, 2021
    Date of Patent: February 22, 2022
    Assignee: Oracle International Corporation
    Inventors: Chris Hegarty, Alexander R. Buckley, Julia Katharina Boes
  • Patent number: 11256698
    Abstract: Embodiments utilize trained query performance machine learning (QP-ML) models to predict an optimal compute node cluster size for a given in-memory workload. The QP-ML models include models that predict query task runtimes at various compute node cardinalities, and models that predict network communication time between nodes of the cluster. Embodiments also utilize an analytical model to predict overlap between predicted task runtimes and predicted network communication times. Based on this data, an optimal cluster size is selected for the workload. Embodiments further utilize trained data capacity machine learning (DC-ML) models to predict a minimum number of compute nodes needed to run a workload. The DC-ML models include models that predict the size of the workload dataset in a target data encoding, models that predict the amount of memory needed to run the queries in the workload, and models that predict the memory needed to accommodate changes to the dataset.
    Type: Grant
    Filed: April 11, 2019
    Date of Patent: February 22, 2022
    Assignee: Oracle International Corporation
    Inventors: Sam Idicula, Tomas Karnagel, Jian Wen, Seema Sundara, Nipun Agarwal, Mayur Bency
  • Patent number: 11249998
    Abstract: A data input sub-system of a large scale application specific computing system receives a data set that includes a plurality of records, each with a plurality of data fields, and divides the data set into a plurality of data segments. The data input sub-system further restructures records of data segments based on a key field of the plurality of data fields to produce restructured data segments and generates storage instructions for storing the restructured data segments. A data storage and processing sub-system of the computing system interprets the storage instructions to determine resources to engage and stores the restructured data segments using engaged resources. A query and results sub-system of the computing system generates an initial query plan based on a data processing request, optimizes the initial query plan to produce an optimized query plan, and sends the optimized query plan to the data storage and processing sub-system for execution.
    Type: Grant
    Filed: February 4, 2019
    Date of Patent: February 15, 2022
    Assignee: Ocient Holdings LLC
    Inventors: George Kondiles, Jason Arnold
  • Patent number: 11249995
    Abstract: Predictive execution of query flows in an application aware database environment. A repository of previously received and registered database queries along with at least corresponding metadata having information about database query flows generating the database queries is maintained. Application metadata corresponding to a subsequent database query is received. The repository is checked to determine if the application metadata matches one of the previously received and registered database query flows. One or more queries corresponding to the query flow from the repository is/are retrieved if a match is determined. Execution of the retrieved one or more database queries is started prior to receiving the query from outside the repository.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: February 15, 2022
    Assignee: salesforce.com, Inc.
    Inventors: Arjun Kumar Sirohi, Vikas Taneja, Kim Lichong, Michael Allan Friedman, Vidushi Sharma
  • Patent number: 11243963
    Abstract: Systems and methods are disclosed for executing a query that includes an indication to process data managed by an external data system. The system identifies the external data system that manages the data to be processed, and generates a subquery for the external data system indicating that the results of the subquery are to be sent to multiple worker nodes. The system also generates instructions for multiple worker nodes to receive and process results of the subquery from the external data system.
    Type: Grant
    Filed: July 31, 2018
    Date of Patent: February 8, 2022
    Assignee: Splunk Inc.
    Inventors: Sourav Pal, Arindam Bhattacharjee