Query Execution Plan Patents (Class 707/718)
  • Patent number: 11645283
    Abstract: Methods, computer program products, and systems are presented. The methods, computer program products, and systems can include, for instance: receiving an incoming query statement, wherein the incoming query statement comprises a query statement expression that includes an input variable; predicting an input variable value associated to the input variable; selecting an access path for runtime execution of the query statement in dependence on the predicted input variable value; and performing runtime execution of the query statement using the selected access path.
    Type: Grant
    Filed: April 26, 2021
    Date of Patent: May 9, 2023
    Assignee: International Business Machines Corporation
    Inventors: Li Cao, Shuo Li, Xiaobo Wang, Xin Peng Liu, Sheng Yan Sun
  • Patent number: 11645280
    Abstract: A function reference for a function is identified in a query. A plurality of processing environments that can provide the function is identified. Function costs for the function to process in the processing environments are obtained. Input data transfer costs are acquired for providing input data identified in the query to each of the functions. A specific one of the functions from a specific processing environment is selected based on the function costs and the input data transfer costs. A query execution plan for executing the query with the specific function is generated. The query execution plan is provided to a database engine for execution.
    Type: Grant
    Filed: December 20, 2017
    Date of Patent: May 9, 2023
    Assignee: Teradata US, Inc.
    Inventors: John Mark Morris, Bhashyam Ramesh
  • Patent number: 11645281
    Abstract: The subject technology receives a query, the query including a set of statements for performing the query. The subject technology populates a compilation context based at least in part on the query. The subject technology invokes a compiler to perform a compilation process based on the compilation context. The subject technology performs a lookup operation on a stored plan cache for an exact match based on information from the compilation context. The subject technology, in response to determining an exact match, determines whether the particular query plan requires re-compilation based on a data dependent optimization. The subject technology determines whether a plan cache entry corresponding to the particular query plan includes a data property constraint. The subject technology determines whether the data property constraint still holds based on a set of data properties.
    Type: Grant
    Filed: August 30, 2022
    Date of Patent: May 9, 2023
    Assignee: Snowflake Inc.
    Inventors: Thierry Cruanes, Xuelai Cui, Sangyong Hwang, Allison Waingold Lee, Boyung Lee, Nicola Dan Onose, William Waddington, Jiaqi Yan, Li Yan, Yongsik Yoon
  • Patent number: 11645305
    Abstract: Example resource management systems and methods are described. In one implementation, a resource manager is configured to manage data processing tasks associated with multiple data elements. An execution platform is coupled to the resource manager and includes multiple execution nodes configured to store data retrieved from multiple remote storage devices. Each execution node includes a cache and a processor, where the cache and processor are independent of the remote storage devices. A metadata manager is configured to access metadata associated with at least a portion of the multiple data elements.
    Type: Grant
    Filed: May 16, 2022
    Date of Patent: May 9, 2023
    Assignee: Snowflake Inc.
    Inventors: Thierry Cruanes, Benoit Dageville, Marcin Zukowski
  • Patent number: 11640610
    Abstract: Provided are a system, method, and computer program product for generating synthetic data. The method includes receiving a plurality of data types associated with an environment to be evaluated and receiving a plurality of correlations of one data type to another data type. The method also includes generating a correlation graph of the plurality of data types based on the plurality of correlations and generating a directed acyclic graph of the plurality of data types based on the correlation graph. The method further includes generating a hierarchical graph of the plurality of data types by applying a path traversal technique to the directed acyclic graph and generating a synthetic dataset by repeatedly traversing the hierarchical graph to generate a plurality of records of data.
    Type: Grant
    Filed: December 29, 2020
    Date of Patent: May 2, 2023
    Assignee: Visa International Service Association
    Inventors: Xiao Tian, Claudia Carolina Barcenas Cardenas, Shi Cao, Chiranjeet Chetia, Jianhua Huang, Marc Corbalan Vila
  • Patent number: 11640617
    Abstract: Metric forecasting techniques and systems in a digital medium environment are described that leverage similarity of elements, one to another, in order to generate a forecast value for a metric for a particular element. In one example, training data is received that describes a time series of values of the metric for a plurality of elements. The model is trained to generate the forecast value of the metric, the training using machine learning of a neural network based on the training data. The training includes generating dimensional-transformation data configured to transform the training data into a simplified representation to determine similarity of the plurality of elements, one to another, with respect to the metric over the time series. The training also includes generating model parameters of the neural network based on the simplified representation to generate the forecast value of the metric.
    Type: Grant
    Filed: March 21, 2017
    Date of Patent: May 2, 2023
    Assignee: Adobe Inc.
    Inventors: Chunyuan Li, Hung Hai Bui, Mohammad Ghavamzadeh, Georgios Theocharous
  • Patent number: 11636108
    Abstract: A method builds a regression model for predicting processing times for federated queries using a variety of data sources. The method includes obtaining federated queries (e.g., from benchmarks), and generating a plurality of federated query plans for each federated query. Each federated query plan corresponds to executing a respective federated query using a respective data source as the federation engine. The method includes forming feature vectors for each federated query plan based on cost estimations for executing the respective federated query plan and cost estimations for data transfer. The method further includes training a regression model, using the feature vectors for the plurality of federated query plans, to predict runtimes for executing federated queries using the variety of data sources as a federation engine. Some implementations use the trained regression model to determine a suitable federation engine for a given federated query.
    Type: Grant
    Filed: September 17, 2019
    Date of Patent: April 25, 2023
    Assignee: TABLEAU SOFTWARE, LLC
    Inventors: Liqi Xu, Richard L. Cole, Daniel Ting
  • Patent number: 11630830
    Abstract: A format conversion engine for Apache Hadoop that converts data from its original format to a database-like format at certain time points for use by a low latency (LL) query engine. The format conversion engine comprises a daemon that is installed on each data node in a Hadoop cluster. The daemon comprises a scheduler and a converter. The scheduler determines when to perform the format conversion and notifies the converter when the time comes. The converter converts data on the data node from its original format to a database-like format for use by the low latency (LL) query engine.
    Type: Grant
    Filed: July 6, 2020
    Date of Patent: April 18, 2023
    Assignee: Cloudera, Inc.
    Inventors: Marcel Kornacker, Justin Erickson, Nong Li, Lenni Kuff, Henry Noel Robinson, Alan Choi, Alex Behm
  • Patent number: 11625398
    Abstract: A cardinality of a query is estimated by creating a join plan for the query. The join plan is converted to a graph representation. A subtree graph kernel matrix is generated for the graph representation of the join plan. The subtree graph kernel matrix is submitted to a trained model for cardinality prediction which produces a predicted cardinality of the query.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: April 11, 2023
    Assignee: Teradata US, Inc.
    Inventor: Dhiren Kumar Bhuyan
  • Patent number: 11620288
    Abstract: Systems and methods are disclosed for mapping search nodes to a search head in a data intake and query system based on a tenant identifier in order to execute a query received by the data intake and query system. The mapping may allow same or similar search nodes to be used to execute queries that are associated with a particular tenant identifier, in order to take advantage of caching and local data stored with those search nodes. In some cases, search nodes can be mapped based on the tenant identifier using a hashing algorithm, such as a consistent hashing algorithm.
    Type: Grant
    Filed: February 25, 2022
    Date of Patent: April 4, 2023
    Assignee: Splunk Inc.
    Inventors: Alexandros Batsakis, Scott Calvert, Alexander Douglas James, Bei Li, Ashish Mathew, James Monschke, Sogol Moshtaghi, Christopher Madden Pride, Xiaowei Wang
  • Patent number: 11615086
    Abstract: Joining data using a disjunctive operator is described. An example computer-implemented method can include generating a query plan for a query, wherein there is a join operator expression for each of a plurality of disjunctive predicates and each join operator expression includes at least a conjunctive predicate and a disjunctive operator. The method may also include generating a Bloom filter for each of the plurality of disjunctive operators. The method may further include evaluating each of the plurality of join operator expressions using a corresponding one of the plurality of disjunctive operators and Bloom filter for each of the plurality of disjunctive predicates to generate a result set.
    Type: Grant
    Filed: August 2, 2022
    Date of Patent: March 28, 2023
    Assignee: Snowflake Inc.
    Inventors: Thierry Cruanes, Florian Andreas Funke, Guangyan Hu, Jiaqi Yan
  • Patent number: 11611570
    Abstract: A computer implemented method to generate a signature of a network attack for a network-connected computing system, the signature including rules for identifying the network attack, the method including generating, at a trusted secure computing device, a copy of data distributed across a network; the computing device identifying information about the network attack stored in the copy of the data; and the computing device generating the signature for the network attack based on the information about the network attack so as to subsequently identify the network attack occurring on a computer network.
    Type: Grant
    Filed: December 19, 2017
    Date of Patent: March 21, 2023
    Assignee: British Telecommunications Public Limited Company
    Inventor: Fadi El-Moussa
  • Patent number: 11604795
    Abstract: Systems and methods are disclosed for executing a query that includes an indication to process data managed by an external data system. The system identifies the external data system that manages the data to be processed and generates a subquery for the external data system indicating that the results of the subquery are to be sent to one worker node of multiple worker nodes. The system instructs the one worker node to distribute the results received from the external data system to multiple worker nodes for processing.
    Type: Grant
    Filed: July 31, 2018
    Date of Patent: March 14, 2023
    Assignee: Splunk Inc.
    Inventors: Sourav Pal, Arindam Bhattacharjee
  • Patent number: 11593334
    Abstract: An apparatus, method and computer program product for physical database design and tuning in relational database management systems. A relational database management system executes in a computer system, wherein the relational database management system manages a relational database comprised of one or more tables storing data. A Deep Reinforcement Learning based feedback loop process also executes in the computer system for recommending one or more tuning actions for the physical database design and tuning of the relational database management system, wherein the Deep Reinforcement Learning based feedback loop process uses a neural network framework to select the tuning actions based on one or more query workloads performed by the relational database management system.
    Type: Grant
    Filed: December 27, 2019
    Date of Patent: February 28, 2023
    Assignee: Teradata US, Inc.
    Inventors: Louis Martin Burger, Emiran Curtmola, Sanjay Nair, Frank Roderic Vandervort, Douglas P. Brown
  • Patent number: 11593382
    Abstract: A computer-implemented method, a computer program product, and a computer system for detecting an inappropriate data type of a column in a database and correcting an encoding for the column. The computer system detects in a table a candidate column that has a mismatching type definition, using database usage statistics. The computer system determines whether conversion of the candidate column is possible. In response to determining that the conversion of the candidate column is possible, the computer system converts values in the candidate column with a first data type to values in a new column with a second data type. The computer system appends the new column in the table. The computer system registers the new column and the second data type in a metadata catalog. The computer system generates a query plan operator for processing a query for the new column.
    Type: Grant
    Filed: March 22, 2021
    Date of Patent: February 28, 2023
    Assignee: International Business Machines Corporation
    Inventors: Felix Beier, Knut Stolze, Reinhold Geiselhart, Luis Eduardo Oliveira Lizardo
  • Patent number: 11593371
    Abstract: A relational database management system (RDBMS) accepts a workload comprised of one or more queries against a relational database. The RDBMS evolves a default cost profile into a plurality of cost profiles using fixed or dynamic evolution, wherein each of the cost profiles captures one or more cost parameters for the workload. The cost profiles are represented by a multi-dimensional matrix that has one or more dimensions, and each of the dimensions represents one of the cost parameters. The RDBMS dynamically determines which of the cost profiles is an optimal cost profile for the workload by mapping the cost profiles to the workload using a random walk scoring algorithm or a biased walk scoring algorithm that searches the multi-dimensional matrix to identify the optimal cost profile. The RDBMS selects and performs one or more query execution plans for the workload based on the optimal cost profile for the workload.
    Type: Grant
    Filed: August 18, 2020
    Date of Patent: February 28, 2023
    Assignee: Teradata US, Inc.
    Inventors: Wellington Marcos Cabrera Arevalo, Kassem Awada, Mahbub Hasan, Allen N. Diaz, Mohammed Al-Kateb, Awny Kayed Al-Omari
  • Patent number: 11586612
    Abstract: Disclosed herein is a system and method for enabling creating and managing structured tables on a blockchain thereby facilitating retrieval of data records with standard database operational commands and in a structured manner. Creating structured tables as per a defined schema provides for data storage in a manner which enables easy integration with existing business applications. Further, the system and method provides for storing unstructured data records in a structured manner in the blockchain.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: February 21, 2023
    Assignee: Innoplexus AG
    Inventor: Abhijit Keskar
  • Patent number: 11586604
    Abstract: Embodiments of the present disclosure relate to a method, system, and computer program product for generating one or more in-memory data structures for data access. According to the method, target data associated with a database is identified. Further, the method determines at least one data structure for the target data based on at least one access pattern of the target data in a plurality of historical queries against the database, wherein the target data is accessed in execution of the plurality of historical queries. The method further implements the at least one data structure in a memory to store the target data. The at least one data structure is used for further access to the target data in execution of a further query against the database.
    Type: Grant
    Filed: July 2, 2020
    Date of Patent: February 21, 2023
    Assignee: International Business Machines Corporation
    Inventors: Xiaobo Wang, Shuo Li, Sheng Yan Sun, Peng Hui Jiang
  • Patent number: 11573960
    Abstract: A computer-implemented method provides application-based query transformations. The method includes determining an application is initiated. The method includes identifying a set of execution units included in the application. The execution units are based on a set of queries in the application and a set of actions in the application. The method also includes building a query dependency graph (QDG) comprising a plurality of nodes, wherein each node of the plurality of nodes is correlated to an execution unit, and each node is linked to at least one additional node, the link indicating a relative execution order and a common attribute of each node and the additional node. The method includes merging, based on a performance architecture, two or more of the set of execution units into a section. The method includes processing the application according to the QDG.
    Type: Grant
    Filed: March 18, 2021
    Date of Patent: February 7, 2023
    Assignee: International Business Machines Corporation
    Inventors: Shuo Li, Xiaobo Wang, Hong Mei Zhang, Sheng Yan Sun
  • Patent number: 11567937
    Abstract: Embodiments implement a prediction-driven, rather than a trial-driven, approach to automate database configuration parameter tuning for a database workload. This approach uses machine learning (ML) models to test performance metrics resulting from application of particular database parameters to a database workload, and does not require live trials on the DBMS managing the workload. Specifically, automatic configuration (AC) ML models are trained, using a training corpus that includes information from workloads being run by DBMSs, to predict performance metrics based on workload features and configuration parameter values. The trained AC-ML models predict performance metrics resulting from applying particular configuration parameter values to a given database workload being automatically tuned. Based on correlating changes to configuration parameter values with changes in predicted performance metrics, an optimization algorithm is used to converge to an optimal set of configuration parameters.
    Type: Grant
    Filed: May 12, 2021
    Date of Patent: January 31, 2023
    Assignee: Oracle International Corporation
    Inventors: Sam Idicula, Tomas Karnagel, Jian Wen, Seema Sundara, Nipun Agarwal, Mayur Bency
  • Patent number: 11567956
    Abstract: A format conversion engine for Apache Hadoop that converts data from its original format to a database-like format at certain time points for use by a low latency (LL) query engine. The format conversion engine comprises a daemon that is installed on each data node in a Hadoop cluster. The daemon comprises a scheduler and a converter. The scheduler determines when to perform the format conversion and notifies the converter when the time comes. The converter converts data on the data node from its original format to a database-like format for use by the low latency (LL) query engine.
    Type: Grant
    Filed: July 6, 2020
    Date of Patent: January 31, 2023
    Assignee: Cloudera, Inc.
    Inventors: Marcel Kornacker, Justin Erickson, Nong Li, Lenni Kuff, Henry Noel Robinson, Alan Choi, Alex Behm
  • Patent number: 11561977
    Abstract: According to some embodiments, a system to manage a query plan cache for a Database Management System (“DBMS”) includes a DBMS query plan cache data store. The DBMS query plan cache data store may contain, for example, electronic records representing a plurality of query plans each associated with a set of instructions created in response to a query previously submitted by a user. A DBMS query plan cache management platform may then calculate a utility score for each query plan in the DBMS query plan cache data store. At least one query plan may be evicted from the DBMS query plan cache data store based on the calculated utility score, wherein the evicting is not based on a size of the DBMS query plan cache.
    Type: Grant
    Filed: May 17, 2021
    Date of Patent: January 24, 2023
    Assignee: SAP SE
    Inventors: Sung Gun Lee, Sanghee Lee, Hyung Jo Yoon, Boyeong Jeon
  • Patent number: 11556710
    Abstract: A computer system processes a group of inputs. A group of entities that is input for processing is intercepted. The intercepted group is expanded into individual entities. Each of the individual entities is processed to produce results for each individual entity. The results for each individual entity are intercepted and merged to produce results for the group of entities. Embodiments of the present invention further include a method and program product for processing a group of inputs in substantially the same manner described above.
    Type: Grant
    Filed: May 11, 2018
    Date of Patent: January 17, 2023
    Assignee: International Business Machines Corporation
    Inventors: Brian S. Dreher, Sheng Hua Bao, Xiaoyang Gao, Yanyan Han
  • Patent number: 11556538
    Abstract: Methods, systems, and computer-readable storage media for receiving, by a current database system, a query plan file representative of a captured query plan from a source database system, receiving, by the current database system, a set of definitions including one or more definitions, each definition in the set of definitions corresponding to an object that is implicated by the query plan, the object being included in a set of objects, and determining, by the current database system, that each definition in the set of definitions is identical to a respective definition of a corresponding object within the current database system, and in response: executing the captured query plan in the current database system to provide a query result.
    Type: Grant
    Filed: May 15, 2020
    Date of Patent: January 17, 2023
    Assignee: SAP SE
    Inventors: Youngbin Bok, Jaehyok Chong, Won Jun Chang, Sungguk Lim
  • Patent number: 11550833
    Abstract: An architecture for semantic search over encrypted data that improves upon existing encrypted data search techniques by providing a solution that is space-efficient on both the cloud and client sides, considers the semantic meaning of the user's query, and returns a list of documents accurately ranked by their similarity to the query. Different search schemes are presented based on S3C architecture (namely, FKSS, SKSS, and KSWF) that are fine-tuned for different types of datasets. The system requires only a single plaintext query to be entered and is easily portable to thin-clients, making it simple and quick for users to use. The system is also shown to be secure and resistant to attacks.
    Type: Grant
    Filed: October 24, 2018
    Date of Patent: January 10, 2023
    Assignee: University of Louisiana at Lafayette
    Inventors: Jason Woodworth, Mohsen Amini Salehi
  • Patent number: 11544290
    Abstract: Embodiments are provided for intelligent data replication and distribution in a computing environment. Data access patterns of one or more queries issued to a plurality of data partitions may be forecasted. Data may be dynamically distributed and replicated to one or more existing data partitions or additional ones of the plurality of data partitions according to the forecasting.
    Type: Grant
    Filed: January 13, 2020
    Date of Patent: January 3, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Stefano Braghin, Srikumar Venugopal
  • Patent number: 11544261
    Abstract: A system for optimizing data requests in an electronic data storage environment may be configured to receive and identify data requests to perform operations on data stored in a data storage environment. The system may further implement tuning algorithms on the data requests upon identifying that the data requests are causing the data storage environment to perform below optimal performance. The present invention may be implemented as a system, a computer program product, or a computer-implemented method.
    Type: Grant
    Filed: October 1, 2020
    Date of Patent: January 3, 2023
    Assignee: BANK OF AMERICA CORPORATION
    Inventors: Krishna Rangarao Mamadapur, Jigesh Rajendra Safary
  • Patent number: 11537596
    Abstract: A method includes registering a type of data file. Registering the type of data file includes storing metadata describing the type of data file, the metadata including a file storage service and a parser for the type of data file. The method includes receiving a first data file of the type from a first domain, the first data file having raw data, storing the first data file, storing one or more access rules and a lineage of the first data file, parsing the first data file using the parser to generate a content from the raw data, storing the content separately from the raw data, providing the first data file and the content to a search service, and automatically updating one or more second data files from one or more other domains based on the content of the first data file using the search service and the lineage.
    Type: Grant
    Filed: June 22, 2020
    Date of Patent: December 27, 2022
    Assignee: Schlumberger Technology Corporation
    Inventors: Hrvoje Markovic, Hans Eric Klumpen, RajKumar Kannan
  • Patent number: 11537604
    Abstract: A system and method for processing of queries including receiving a query including a set operation and a sort operation, wherein the set operation includes a first data structure and a second data structure and the sort operation requests a result set that is sorted based on a column or attribute of the first data structure and a column or attribute of the second data structure; generating a query plan in which a sort operation occurs prior to the set operation; determining a first, partial set of one or more resultant rows responsive to the query; sending the first, partial set of one or more resultant rows responsive to the query to a client; determining a second, partial set of one or more resultant rows responsive to the query; and sending the second, partial set of one or more resultant rows to the client.
    Type: Grant
    Filed: November 25, 2020
    Date of Patent: December 27, 2022
    Assignee: PROGRESS SOFTWARE CORPORATION
    Inventors: Mohammed Sayeed Akthar, Sunil Jardosh
  • Patent number: 11531704
    Abstract: What is disclosed is an improved approach to perform automatic partitioning, without requiring any expertise on the part of the user. A three stage processing pipeline is provided to generate candidate partition schemes, to evaluate the candidates using real table structures that are empty, and to then implement a selected scheme with production data for evaluation. In addition, an improved approach is described to perform automatic interval partitioning, where the inventive concept implements interval partitioning that does not impose these implicit constraints on the partition key column.
    Type: Grant
    Filed: September 11, 2020
    Date of Patent: December 20, 2022
    Inventors: George Eadon, Ramesh Kumar, Ananth Raghavan
  • Patent number: 11520780
    Abstract: Systems and techniques are described for efficient, general-purpose, and potentially decentralized databases, distributed storage systems, version control systems, and/or other types of data repositories. Data is represented in a database system in such a way that any value is represented by a unique identifier which is derived from the value itself. Any database peer in the system will derive an identical identifier from the same logical value. The identifier for a value may be derived using a variety of mechanisms, including, without limitation, a hash function known to all peers in the system. The values may be organized hierarchically as a tree of nodes. Any two peers storing the same logical value will deterministically represent that value with a graph, such as the described “Prolly” tree, having the same topology and hash value, irrespective of possibly differing sequences of mutations which caused each to arrive at the same final value.
    Type: Grant
    Filed: May 10, 2021
    Date of Patent: December 6, 2022
    Assignee: Salesforce, Inc.
    Inventors: Aaron Boodman, Rafael Weinstein, Erik Arvidsson, Chris Masone, Dan Willhite, Benjamin Kalman
  • Patent number: 11507577
    Abstract: Methods, systems, apparatuses, and computer program products are provided for determining a query plan. A query is received that comprises a request for a data result for each of a plurality of original time windows. The plurality of original time windows included in the query are identified. An initial window representation is generated that identifies a set of connections between windows in a window set that includes at least the original time windows. A revised window representation is generated that includes an alternative set of connections between windows in the window set based at least on an execution cost for at least one window. The revised window representation is selected to obtain the data result for each of the plurality of original time windows. A revised query plan based on the revised window representation is provided to obtain the data result for each of the plurality of original time windows.
    Type: Grant
    Filed: May 28, 2020
    Date of Patent: November 22, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Alexander Raizman, Wentao Wu, Philip A. Bernstein
  • Patent number: 11507590
    Abstract: Techniques are introduced herein for maintaining geometry-type data on persistent storage and in memory. Specifically, a DBMS that maintains a database table, which includes at least one column storing spatial data objects (SDOs), also maintains metadata for the database table that includes definition data for one or more virtual columns of the table. According to an embodiment, the definition data includes one or more expressions that calculate minimum bounding box values for SDOs stored in the geometry-type column in the table. The one or more expressions in the metadata maintained for the table are used to create one or more in-memory columns that materialize the bounding box data for the represented SDOs. When a query that uses spatial-type operators to perform spatial filtering over data in the geometry-type column is received, the DBMS replaces the spatial-type operators with operators that operate over the scalar bounding box information materialized in memory.
    Type: Grant
    Filed: June 17, 2020
    Date of Patent: November 22, 2022
    Assignee: Oracle International Corporation
    Inventors: Siva Ravada, Ying Hu, Zhen Hua Liu, Shasank Kisan Chavan, Aurosish Mishra, Vikas Arora
  • Patent number: 11500871
    Abstract: A computer-implemented method is disclosed that includes operations of receiving a query to be executed, the query including an indication of a data source at which input data is to be obtained, wherein the query is to be executed on the input data, determining a schema of the input data, determining fields of the input data that are required for execution of the query by analyzing a sequence of operators forming the query, determining one or more alterations to the query to improve efficiency of the execution of the query based on the fields of input data required for the execution, and generating an altered query by altering the query in accordance with the one or more alterations. The method may further include converting the query to a directed acyclic graph (DAG) and providing the DAG to a distributed processing engine configured to execute the DAG.
    Type: Grant
    Filed: October 19, 2020
    Date of Patent: November 15, 2022
    Assignee: SPLUNK Inc.
    Inventors: Chinmay Madhav Kulkarni, Lin Ma, Amir Malekpour, Mohan Rajagopalan, John C. Reed, Ram Sriharsha
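    Illustrative sketch (not taken from the patent): one plausible alteration of the kind the abstract describes is to read only the fields the operators actually use. The Python below assumes a query is a list of operator dicts, each listing the fields it uses. The operator layout and the names `required_fields` and `prune` are hypothetical.

```python
def required_fields(operators):
    """Collect every field referenced by any operator in the query."""
    needed = set()
    for op in operators:
        needed.update(op.get("uses", ()))
    return needed


def prune(schema, operators):
    """Return an altered query that projects away unused fields first.

    If every schema field is needed, the query is returned unchanged.
    """
    needed = required_fields(operators)
    keep = [field for field in schema if field in needed]
    if len(keep) == len(schema):
        return operators
    projection = {"op": "project", "fields": keep, "uses": keep}
    return [projection] + operators


# Hypothetical input schema and operator sequence.
schema = ["host", "status", "bytes", "agent"]
query = [
    {"op": "filter", "uses": ["status"]},
    {"op": "sum", "uses": ["bytes"]},
]
altered = prune(schema, query)
```

    In this example the altered query starts with a projection onto `status` and `bytes`, so `host` and `agent` never need to be read.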
  • Patent number: 11494379
    Abstract: Disclosed herein are systems and methods for pre-filter deduplication for multidimensional two-sided interval joins. In an embodiment, a data platform receives query instructions for a two-sided N dimensional interval join, where N is an integer greater than 1. The two-sided N dimensional interval join has an interval-join predicate that compares intervals determined from the input relations in each of N dimensions. The data platform implements the two-sided N dimensional interval join as a query-plan section that includes an N dimensional band join that is followed by a deduplication operator that is followed by a filter that applies the interval-join predicate. The N dimensional band join includes a hash join keyed to N dimensional domain cells overlapped at least in part by intervals determined from the input relations in each of the N dimensions. The deduplication operator removes duplicate rows from a potential-duplicates subset of the output of the N dimensional band join.
    Type: Grant
    Filed: April 23, 2021
    Date of Patent: November 8, 2022
    Assignee: Snowflake Inc.
    Inventors: Matthias Carl Adams, Spyridon Triantafyllis, Lars Volker, Kevin Wang
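    Illustrative sketch (not taken from the patent): the abstract describes a plan shape of band join on domain cells, then deduplication, then the interval predicate as a filter. The Python below mimics that shape for closed integer intervals in N dimensions. The cell width `CELL` and all function names are assumptions, and for simplicity it deduplicates all candidate pairs rather than only a potential-duplicates subset.

```python
from collections import defaultdict
from itertools import product

CELL = 10  # hypothetical width of a domain cell in every dimension


def cells(box):
    """Yield every N-dimensional cell overlapped by a box.

    A box is a tuple of closed integer intervals, one (lo, hi) per dimension.
    """
    ranges = [range(lo // CELL, hi // CELL + 1) for lo, hi in box]
    return product(*ranges)


def overlaps(a, b):
    """The interval-join predicate: intervals overlap in every dimension."""
    return all(alo <= bhi and blo <= ahi
               for (alo, ahi), (blo, bhi) in zip(a, b))


def interval_join(left, right):
    # Build side of the hash join, keyed by cell.
    table = defaultdict(list)
    for rid, box in right.items():
        for cell in cells(box):
            table[cell].append(rid)
    # Probe side: a pair sharing several cells is emitted several times.
    candidates = []
    for lid, box in left.items():
        for cell in cells(box):
            for rid in table.get(cell, ()):
                candidates.append((lid, rid))
    # Deduplicate before the filter so the predicate runs once per pair.
    unique = set(candidates)
    matches = sorted(p for p in unique if overlaps(left[p[0]], right[p[1]]))
    return candidates, matches


left = {"L1": ((0, 25), (0, 5))}
right = {"R1": ((5, 22), (3, 8)), "R2": ((27, 29), (0, 5))}
candidates, matches = interval_join(left, right)
```

    Here the pair (L1, R1) shares three cells and so appears three times among the candidates, while (L1, R2) shares a cell but fails the actual predicate.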
  • Patent number: 11487772
    Abstract: The present disclosure provides a multi-party data joint query method, a device, a server and a storage medium. The multi-party data joint query method executed by a manager includes: analyzing a multi-party joint query sentence to obtain a logical execution plan; processing the logical execution plan according to providers of respective nodes in the logical execution plan to obtain a physical execution plan of each provider; and generating a query instruction of each provider according to the physical execution plan of each provider, and sending the query instruction to the respective provider. The query instruction is configured to instruct the providers to perform a query cooperatively.
    Type: Grant
    Filed: December 26, 2019
    Date of Patent: November 1, 2022
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Zhi Feng, Yu Zhang, Sen Zhang
  • Patent number: 11487795
    Abstract: Disclosed is a template-based automatic question and answer method for software bugs. An entity relationship triple is extracted from a bug corpus and a natural language pattern is acquired; an entity relationship in the triple is determined; a query template corresponding to the natural language pattern is acquired; an entity in a question q proposed by a user is replaced with an entity type to acquire a question q′; then, the entity type in q′ and an entity type in the natural language pattern are compared and searched for and a similarity is calculated; then, a SPARQL query pattern of the question q is acquired according to the similarity and the entity in the question q; and finally, the SPARQL query pattern of the question q is executed so as to acquire an answer to the question q.
    Type: Grant
    Filed: August 28, 2019
    Date of Patent: November 1, 2022
    Inventors: Xiaobing Sun, Jinting Lu, Bin Li
  • Patent number: 11487708
    Abstract: Techniques for visual data preparation are described. An interactive visual data preparation service provides a user with a graphical user interface that presents values from a sample taken of a dataset along with statistical information associated with those values. A user uses the graphical user interface to test out various transformations to the sample dataset by applying transformations and viewing near-immediate results of those transformations as applied to the sample. The desired set of transformations is represented as a recipe object, which can be used to perform data preparation against the overall dataset or other datasets on behalf of the user or other users.
    Type: Grant
    Filed: November 11, 2020
    Date of Patent: November 1, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Surbhi Dangi, Gopinath Duddi, Amit Gul Phagwani, Romi Boimer, Ronald Stephen Kyker
  • Patent number: 11481398
    Abstract: A system for spilling comprises an interface and a processor. The interface is configured to receive an indication to perform a GROUP BY operation, wherein the indication comprises an input table and a grouping column. The processor is configured to: for each input table entry of the input table, determine a key, wherein the key is based at least in part on the input table entry and the grouping column; add the key to a grouping hash table, wherein adding the key to the grouping hash table comprises last-in, first-out (LIFO) spilling when necessary; create an output table based at least in part on the grouping hash table; and provide the output table.
    Type: Grant
    Filed: December 9, 2020
    Date of Patent: October 25, 2022
    Assignee: Databricks Inc.
    Inventors: Alexander Behm, Ankur Dave, Ryan Deng, Shoumik Palkar
  • Patent number: 11468073
    Abstract: Techniques are provided for gathering statistics in a database system. The techniques involve gathering some statistics using an “on-the-fly” technique, some statistics through a “high-frequency” technique, and yet other statistics using a “prediction” technique. The technique used to gather each statistic is based, at least in part, on the overhead required to gather the statistic. For example, low-overhead statistics may be gathered “on-the-fly” using the same process that is performing the operation that affects the statistic, while statistics whose gathering incurs greater overhead may be gathered in the background, while the database is live, using the high-frequency technique. The prediction technique may be used for relatively-high overhead statistics that can be predicted based on historical data and the current value of predictor statistics.
    Type: Grant
    Filed: August 6, 2019
    Date of Patent: October 11, 2022
    Assignee: Oracle International Corporation
    Inventors: Mohamed Zait, Yuying Zhang, Hong Su, Jiakun Li
  • Patent number: 11468065
    Abstract: An information processing apparatus according to the present application includes an acquiring unit and a selecting unit. The acquiring unit acquires a plurality of pieces of second triple information hierarchized based on a conceptual system in a plurality of pieces of first triple information indicating a relationship about three types of elements and statistical information indicating the number of pieces of the first triple information associated with each of the plurality of pieces of the second triple information. The selecting unit selects, based on the statistical information acquired by the acquiring unit and based on a predetermined standard related to the statistical information, from among the plurality of pieces of the second triple information, a plurality of pieces of target triple information to be used for a clustering process.
    Type: Grant
    Filed: February 21, 2019
    Date of Patent: October 11, 2022
    Assignee: YAHOO JAPAN CORPORATION
    Inventors: Kiyoshi Nitta, Iztok Savnik
  • Patent number: 11461327
    Abstract: The subject technology receives a query, the query including a set of statements for performing the query. The subject technology populates a compilation context based at least in part on the query. The subject technology provides the compilation context to a compiler. The subject technology invokes the compiler to perform a compilation process based on the compilation context, the compilation process comprising performing a lookup operation on a stored plan cache for an exact match based on information from the compilation context, the stored plan cache including a set of stored query plans, and determining whether the exact match of a particular query plan is found in the stored plan cache to avoid compiling the query using the compilation context.
    Type: Grant
    Filed: April 8, 2022
    Date of Patent: October 4, 2022
    Assignee: Snowflake Inc.
    Inventors: Thierry Cruanes, Xuelai Cui, Sangyong Hwang, Allison Waingold Lee, Boyung Lee, Nicola Dan Onose, William Waddington, Jiaqi Yan, Li Yan, Yongsik Yoon
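    Illustrative sketch (not taken from the patent): an exact-match plan cache can be pictured as a dictionary keyed by information from the compilation context. The Python below assumes the context is a dict holding the query text and session settings. The class `PlanCache`, the key layout, and `toy_compile` are hypothetical stand-ins.

```python
class PlanCache:
    """Exact-match cache of compiled plans, keyed by compilation context."""

    def __init__(self, compile_fn):
        self._compile = compile_fn
        self._plans = {}
        self.compilations = 0  # counts how often the compiler really ran

    @staticmethod
    def _key(context):
        # Assumed key: query text plus settings in a stable order.
        return (context["sql"], tuple(sorted(context["settings"].items())))

    def plan_for(self, context):
        key = self._key(context)
        plan = self._plans.get(key)
        if plan is None:
            # No exact match: compile and remember the plan.
            plan = self._compile(context)
            self.compilations += 1
            self._plans[key] = plan
        return plan


def toy_compile(context):
    """Stand-in for a real compiler; returns a string 'plan'."""
    return "PLAN<" + context["sql"] + ">"


cache = PlanCache(toy_compile)
ctx = {"sql": "SELECT 1", "settings": {"timezone": "UTC"}}
first = cache.plan_for(ctx)
second = cache.plan_for(dict(ctx))  # equal context, so the plan is reused
```

    The second lookup returns the stored plan without invoking the compiler. A context that differs in any keyed setting misses the cache and compiles again.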
  • Patent number: 11461195
    Abstract: A method for processing a query fault, where a database server receives a query statement and generates a corresponding query plan tree including multiple layers of operators in a pipeline relationship, and each layer includes operation symbols having a logical relationship with each other. The server executes the query statement according to the query plan tree, extracts intermediate status information of a faulty operator when a fault occurs in a process of executing the query statement, updates operation symbols of the faulty operator and a logical relationship among the operation symbols according to the query plan tree and the intermediate status information to obtain a reconstructed query plan tree, and continues to execute the query statement according to the reconstructed query plan tree after the fault is recovered.
    Type: Grant
    Filed: November 23, 2020
    Date of Patent: October 4, 2022
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Jinwei Zhu, Qingqing Zhou, Pinggao Zhou
  • Patent number: 11461304
    Abstract: Signature-based cache optimization for data preparation includes: performing a first set of sequenced data preparation operations on one or more sets of data to generate a plurality of transformation results; caching one or more of the plurality of transformation results and one or more corresponding operation signatures, a cached operation signature being derived based at least in part on a subset of sequenced operations that generated a corresponding result; receiving a specification of a second set of sequenced operations; determining an operation signature associated with the second set of sequenced operations; identifying a cached result among the cached results based at least in part on the determined operation signature; and outputting the cached result.
    Type: Grant
    Filed: March 10, 2020
    Date of Patent: October 4, 2022
    Assignee: DataRobot, Inc.
    Inventors: Dave Brewster, Victor Tze-Yeuan Tso
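    Illustrative sketch (not taken from the patent): the abstract describes caching transformation results under signatures derived from the sequence of operations that produced them. The Python below assumes a signature is a SHA-256 hash of the dataset id plus an operation prefix, and reuses the longest cached prefix. `PrepCache`, `signature`, and the toy `APPLY` operations are hypothetical.

```python
import hashlib
import json


def signature(dataset_id, operations):
    """Derive a signature from the dataset and an ordered operation list."""
    payload = json.dumps([dataset_id, list(operations)])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Toy data preparation operations over a list of strings.
APPLY = {
    "upper": lambda rows: [r.upper() for r in rows],
    "dedupe": lambda rows: list(dict.fromkeys(rows)),
    "sort": lambda rows: sorted(rows),
}


class PrepCache:
    def __init__(self):
        self.results = {}        # signature -> cached transformation result
        self.steps_executed = 0  # how many operations actually ran

    def run(self, dataset_id, rows, operations):
        # Look for the longest prefix whose result is already cached.
        start, current = 0, rows
        for n in range(len(operations), 0, -1):
            sig = signature(dataset_id, operations[:n])
            if sig in self.results:
                start, current = n, self.results[sig]
                break
        # Execute only the remaining steps, caching each new result.
        for n in range(start, len(operations)):
            current = APPLY[operations[n]](current)
            self.steps_executed += 1
            self.results[signature(dataset_id, operations[:n + 1])] = current
        return current


cache = PrepCache()
data = ["b", "a", "b"]
first = cache.run("d1", data, ["upper", "dedupe"])
second = cache.run("d1", data, ["upper", "dedupe", "sort"])
```

    The second run finds the cached result for the first two operations and executes only `sort`, so three steps run in total rather than five.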
  • Patent number: 11455306
    Abstract: Techniques are described herein for leveraging recurrent neural networks for query processing. In some embodiments, a query analytic system determines a sequence of tokens for at least a portion of a query and determines a vector representation for each token. The query analytic system further generates, using a neural network based on the sequence of tokens, a performance prediction associated with executing at least the portion of the query, wherein the neural network assigns at least a first weight for at least a first token in the sequence of tokens based at least in part on at least a second token that preceded the token in the sequence. The query analytic system further triggers a responsive action, such as triggering an alert and/or tuning the query, based at least in part on the performance prediction.
    Type: Grant
    Filed: January 21, 2020
    Date of Patent: September 27, 2022
    Assignee: Oracle International Corporation
    Inventors: Arvind Kumar Maheshwari, Vamshidhar Reddy Pasham, Shantanu Mahajan, Debottam Kundu
  • Patent number: 11455307
    Abstract: A system includes determination of a plurality of queries of a workload, determination of a data source comprising a plurality of data rows, and determination of a sample data source based on a cardinality of each of the plurality of queries with respect to the data source and an estimated cardinality of each of the plurality of queries with respect to the data source, wherein the estimated cardinality of a query with respect to the data source is determined based on the sample data source.
    Type: Grant
    Filed: February 21, 2020
    Date of Patent: September 27, 2022
    Assignee: SAP SE
    Inventors: Axel Hertzschuch, Norman May, Lars Fricke, Florian Wolf, Guido Moerkotte, Wolfgang Lehner
  • Patent number: 11449481
    Abstract: A data storage and query method and device are disclosed, which facilitate a quick acquisition of query results through index queries at subsequent stages by establishing indexes for columns of a table. Furthermore, by scanning data in the table to obtain statistical information of data in the columns, this facilitates using the statistical information of the data in the columns to perform cost estimation in subsequent queries, in an attempt to obtain a data query mode that has the least cost and the best performance, thus improving query efficiency.
    Type: Grant
    Filed: June 5, 2020
    Date of Patent: September 20, 2022
    Assignee: Alibaba Group Holding Limited
    Inventors: Jiye Tu, Chuangxian Wei, Chaoqun Zhan
  • Patent number: 11442933
    Abstract: An approach for implementing function semantic based partition-wise SQL execution and partition pruning in a data processing system is provided. The system receives a query directed to a range-partitioned table and determines if operation key(s) of the query include(s) function(s) over the table partitioning key(s). If so, the system obtains a set of values corresponding to each partition by evaluating the function(s) on a low bound and/or a high bound table partitioning key value corresponding to the partition. The system may then compare the sets of values corresponding to different partitions and determine whether to aggregate results obtained by executing the query over the partitions based on the comparison. The system may also determine whether to prune any partitions from processing based on a set of correlations between the set of values for each partition and predicate(s) of the query including function(s) over the table partitioning key(s).
    Type: Grant
    Filed: September 21, 2017
    Date of Patent: September 13, 2022
    Assignee: Oracle International Corporation
    Inventors: Srikanth Bellamkonda, Andrew Witkowski, Manish Pratap Singh, Madhuri Kandepi
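    Illustrative sketch (not taken from the patent): the pruning idea in the abstract can be pictured with a table range-partitioned on a date key and a predicate on the year of that key. The Python below evaluates the function at each partition's low and high bound and prunes partitions whose value range cannot satisfy the predicate. It assumes a monotonic function and half-open partition bounds. The partition names and helper functions are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical range partitions: name -> [low bound, high bound).
partitions = {
    "p2020h2": (date(2020, 7, 1), date(2021, 1, 1)),
    "p2021h1": (date(2021, 1, 1), date(2021, 7, 1)),
    "p2021h2": (date(2021, 7, 1), date(2022, 1, 1)),
    "p2022h1": (date(2022, 1, 1), date(2022, 7, 1)),
}


def year_range(low, high_exclusive):
    """Evaluate year(key) on the bounds of one partition.

    Because year() is monotonic in the date, every row in the partition
    has a year between these two values.
    """
    last_day = high_exclusive - timedelta(days=1)
    return low.year, last_day.year


def prune_for_year(parts, wanted_year):
    """Keep only partitions that can contain rows with year(key) == wanted."""
    kept = []
    for name, (low, high) in parts.items():
        lo_year, hi_year = year_range(low, high)
        if lo_year <= wanted_year <= hi_year:
            kept.append(name)
    return kept


kept = prune_for_year(partitions, 2021)
```

    For the predicate year(key) = 2021, only the two 2021 partitions remain and the other two are never scanned.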
  • Patent number: 11443588
    Abstract: Embodiment method and associated apparatus relate to altering the expected value of a system modeled by a random process simplified to produce a binary outcome. Various embodiments modify a genetic algorithm to optimize such settings as population size, number of iterations to convergence, mutation chance, and sample space. Some embodiment ARON implementations correctly predict game outcome relative to the spread, based on transforming unrelated raw data, applying the transformed raw data to a modified genetic algorithm, generating multiple expected outcomes determined by the modified genetic algorithm as a function of the transformed raw data, and filtering the outcomes as a function of predefined metrics to produce a single end result that can be utilized effectively by an evolutionary-style algorithm.
    Type: Grant
    Filed: January 6, 2022
    Date of Patent: September 13, 2022
    Assignee: Ladris Technologies, Inc.
    Inventors: Leo Zlimen, Bowen Kyle
  • Patent number: 11429610
    Abstract: A method, a system, and a computer program product for generating a query executable plan. A query requiring access to data stored in a database system is received. Based on the received query, a query execution plan having a plurality of query execution pipelines is generated. Each query execution pipeline in the plurality of query execution pipelines is configured to execute a plurality of operations in a predetermined order associated with each query execution pipeline. The generated query execution plan is fragmented into a plurality of fragments. Each fragment has one or more query execution pipelines in the plurality of query execution pipelines. The received query is executed by executing each fragment in the plurality of fragments.
    Type: Grant
    Filed: April 1, 2020
    Date of Patent: August 30, 2022
    Assignee: SAP SE
    Inventors: Xun Cheng, Zhen Tian, Yuncong Qiao, Faming Qu, Paul Willems, Hongyong Lu, Yanxin Luo, Nitesh Maheshwari