Query Cost Estimation Patents (Class 707/719)
  • Patent number: 11966395
    Abstract: Systems and methods for query generation based on merger of subqueries are described. For example, methods may include accessing a first join graph representing tables in a database, wherein the first join graph has vertices corresponding to respective tables in the database and directed edges corresponding to join relationships; receiving a first query specification that references data in two or more of the tables of the database to specify multiple subqueries in a set of subqueries; checking that two or more subqueries from the set of subqueries have the same join graph; checking that the two or more subqueries have the same set of grouping columns; responsive, at least in part, to the two or more subqueries having the same join graph and the same set of grouping columns, merging the two or more subqueries to obtain a consolidated query.
    Type: Grant
    Filed: July 11, 2022
    Date of Patent: April 23, 2024
    Assignee: ThoughtSpot, Inc.
    Inventors: Naman Shah, Rakesh Kothari, Archit Bansal
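The merge condition in this abstract (same join graph, same grouping columns) can be illustrated with a minimal sketch. The `Subquery` class, its fields, and the sample table names are hypothetical and chosen only for illustration; the patent's actual data structures are not described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Subquery:
    join_graph: frozenset  # directed edges as (from_table, to_table) pairs
    group_by: frozenset    # grouping column names
    measures: tuple        # aggregate expressions, kept as plain strings


def merge_subqueries(subqueries):
    """Merge subqueries that share both join graph and grouping columns."""
    buckets = defaultdict(list)
    for sq in subqueries:
        buckets[(sq.join_graph, sq.group_by)].append(sq)
    merged = []
    for (graph, cols), group in buckets.items():
        # One consolidated query carries the measures of every member.
        measures = tuple(m for sq in group for m in sq.measures)
        merged.append(Subquery(graph, cols, measures))
    return merged


graph = frozenset({("orders", "customers")})
by_region = frozenset({"customers.region"})
queries = [
    Subquery(graph, by_region, ("SUM(orders.amount)",)),
    Subquery(graph, by_region, ("COUNT(orders.id)",)),
    Subquery(graph, frozenset({"customers.segment"}), ("SUM(orders.amount)",)),
]
consolidated = merge_subqueries(queries)
```

Here the first two subqueries collapse into one consolidated query, while the third keeps its own because its grouping columns differ.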
  • Patent number: 11914603
    Abstract: A data layout model generation system generates, with reinforcement learning, a node configuration and a data layout key in a distributed parallel database. This system includes a sample acquisition processor that acquires, on the basis of a predetermined acquisition method, sample data from data stored in the distributed parallel database, a data layout estimator having, as states in the reinforcement learning, the node configuration and the data layout key including information regarding an order of sorting columns that constitute the data and information regarding a method for distribution between nodes, the data layout estimator estimating layout of the data on the basis of the state and the sample data, a reward calculator that calculates a reward in the reinforcement learning on the basis of a result obtained by estimating the layout of the data, the node configuration, and a processing cost of a query executed on the distributed parallel database.
    Type: Grant
    Filed: July 8, 2022
    Date of Patent: February 27, 2024
    Assignee: Hitachi, Ltd.
    Inventors: Ken Sugimoto, Norifumi Nishikawa
  • Patent number: 11907765
    Abstract: Fog computing systems are provided comprising edge-nodes and middle-nodes between edge-nodes and cloud-node. These nodes form a hierarchical structure with each cloud, middle node having children nodes, and each middle, edge node having a parent-node. Each edge-node receives data from sensors, assigns reception-timestamp to each data indicating when data has been received to produce series of timestamp-ordered data, trains local model through machine-learning based on said series of timestamp-ordered data, and sends said series to parent-node of the edge-node. Each middle-node collects series of timestamp-ordered data from children nodes of the middle-node, trains supra-local model through machine-learning based on said collected series of timestamp-ordered data, and sends said collected series to parent-node of the middle-node. Parent-children structures, edge-nodes and middle-nodes for such fog computing systems are also provided.
    Type: Grant
    Filed: July 6, 2018
    Date of Patent: February 20, 2024
    Assignees: BARCELONA SUPERCOMPUTING CENTER—CENTRO NACIONAL DE SUPERCOMPUTACIÓN, UNIVERSITAT POLITÈCNICA DE CATALUNYA
    Inventors: Juan Luís Pérez Rico, Alberto Gutiérrez Torre, Josep Lluís Berral García, David Carrera Perez
  • Patent number: 11880370
    Abstract: A method, system and computer program product for join graph generation based upon a log of previously executed database queries includes a method for generating a join graph for relational database queries. The method includes loading into memory of a computer, a log of a set of database queries previously executed against data in a database and sequentially parsing each of the queries in the log to identify different semantically characterizable components of each of the queries. The method further includes generating a join graph for each of the queries from corresponding ones of the components. Finally, the method includes selectively adding each of the generated join graphs to a set of join graphs in a data model for the data in the database.
    Type: Grant
    Filed: February 17, 2022
    Date of Patent: January 23, 2024
    Assignee: Google LLC
    Inventors: Julian Hyde, Jonathan Swenson
  • Patent number: 11880721
    Abstract: A method, including receiving, from a client, a unified query, and extracting, from the unified query, an endpoint query for a first data source on a first server and an endpoint query for a second data source on a second server. The extracted endpoint query for the first data source is forwarded to the first server. Upon receiving a response to the endpoint query forwarded to the first server, one or more parameters are extracted from the response. The endpoint query for the second data source is updated so as to include the extracted one or more parameters, and the updated endpoint query for the second data source is forwarded to the second server. Upon receiving, from the second server, a response to the forwarded endpoint query, a result for the received unified query is generated based on the received responses, and the generated result is conveyed to the client.
    Type: Grant
    Filed: May 25, 2022
    Date of Patent: January 23, 2024
    Assignee: R SOFTWARE INC.
    Inventors: Iddo Gino, Andrey Bukati, Srivatsan Srinivasan
  • Patent number: 11809425
    Abstract: A data platform that implements memoizable functions for database objects. The data platform detects a first execution of a memoizable function and generates a first key based on metadata of one or more database objects operated on by the memoizable function and generates a first result for the memoizable function based on the one or more database objects. The data platform detects a second execution of the memoizable function and generates a second key based on the metadata of the one or more database objects operated on by the memoizable function. When the first key and the second key are equal, the data platform reuses the first result of the memoizable function. When the first key and second key do not match, the data platform generates a second result for the second execution of the memoizable function.
    Type: Grant
    Filed: August 15, 2022
    Date of Patent: November 7, 2023
    Assignee: Snowflake Inc.
    Inventors: Raja Suresh Krishna Balakrishnan, Thierry Cruanes, Yujie Li, Subramanian Muralidhar, David Schultz, Jiaqi Yan
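The key comparison in this abstract can be sketched as follows. This is an illustrative sketch only: `MemoCache`, the `name`/`version` metadata fields, and `total_rows` are assumptions, not Snowflake's implementation or API.

```python
class MemoCache:
    """Caches function results keyed on metadata of the input objects."""

    def __init__(self):
        self._results = {}

    def call(self, func, tables):
        # Key on (name, version) metadata only, never on the table contents.
        key = (func.__name__,
               tuple(sorted((t["name"], t["version"]) for t in tables)))
        if key in self._results:
            return self._results[key], True   # keys are equal: reuse result
        result = func(tables)                 # keys differ: compute again
        self._results[key] = result
        return result, False


def total_rows(tables):
    return sum(len(t["rows"]) for t in tables)


cache = MemoCache()
orders = {"name": "orders", "version": 1, "rows": [1, 2, 3]}
first = cache.call(total_rows, [orders])
second = cache.call(total_rows, [orders])
orders["rows"].append(4)
orders["version"] = 2  # a write changes the metadata, so the key changes
third = cache.call(total_rows, [orders])
```

The second call returns the cached value because the metadata is unchanged; the third recomputes because the version, and therefore the key, has changed.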
  • Patent number: 11768953
    Abstract: Systems, methods, and devices for implementing secure views for zero-copy data sharing in a multi-tenant database system are disclosed. A method includes generating a share object in a first account comprising a share role. The method includes associating view privileges for the share object such that an underlying detail of the share object comprises a secure view definition. The method includes granting, to a second account, cross-account access rights to the share role or share object in the first account. The method includes receiving a request from the second account to access data or services of the first account and providing a response to the second account based on the data or services of the first account. The method is such that the underlying detail of the share object that comprises the secure view definition is hidden from the second account and visible to the first account.
    Type: Grant
    Filed: July 23, 2020
    Date of Patent: September 26, 2023
    Assignee: Snowflake Inc.
    Inventors: Allison Waingold Lee, Peter Povinec, Martin Hentschel, Robert Muglia
  • Patent number: 11755318
    Abstract: Even when one refactoring operation cannot establish a target software structure, an appropriate refactoring operation establishes the target software structure. An improvement proposing device includes: a structure comparator to output, as an improvement object, a difference between a first software structure and a second software structure different in software structure from the first software structure; and an improvement plan examining unit to examine an improvement plan for each improvement portion in the improvement object, the improvement plan being a method for bringing the first software structure closer to the second software structure.
    Type: Grant
    Filed: July 2, 2020
    Date of Patent: September 12, 2023
    Assignee: MITSUBISHI ELECTRIC CORPORATION
    Inventors: Daiki Shima, Toshiki Kitajima, Toshihiro Kobayashi, Yuki Hikawa, Taishi Azuma
  • Patent number: 11727057
    Abstract: Provided is a method of indexing in a network key value indexing system. The method includes retrieving a first key name from a storage device of the network key value indexing system, the first key name identifying a first prefix, a first bucket, and a first key, the first prefix indicating the first bucket, parsing the first key name into the first prefix, the first bucket, and the first key, determining the first prefix, the first bucket, and the first key based on a first delimiter, and generating a hash table in a memory cache of the network key value indexing system to associate the first prefix with the first key.
    Type: Grant
    Filed: March 16, 2020
    Date of Patent: August 15, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Somnath Roy, Ronald Lee
  • Patent number: 11625399
    Abstract: A method for dynamic filter pushdown for massive parallel processing databases on the cloud, including acquiring one or more filters corresponding to a query, acquiring statistics information of one or more database tables, determining a selectivity of the one or more database tables based on the statistics information, determining whether the selectivity satisfies a threshold condition, and pushing down the one or more filters to the one or more database tables based on the determination of whether the selectivity satisfies a threshold condition.
    Type: Grant
    Filed: May 16, 2019
    Date of Patent: April 11, 2023
    Assignee: Alibaba Group Holding Limited
    Inventors: Huaizhi Li, Congnan Luo, Ruiping Li, Xiaowei Zhu
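A minimal sketch of the threshold decision described in this abstract follows. The statistics layout, the uniform-distribution selectivity estimate, and the 0.1 threshold are all assumptions made for illustration; the abstract does not specify how selectivity is computed or which direction the threshold condition takes.

```python
def estimate_selectivity(table_stats, column):
    """Estimate the fraction of rows matching an equality filter.

    Assumes values are uniformly distributed, so selectivity is
    1 / (number of distinct values). Unknown columns fall back to 1.0.
    """
    distinct = table_stats["distinct_values"].get(column, 0)
    return 1.0 / distinct if distinct > 0 else 1.0


def should_push_down(table_stats, column, threshold=0.1):
    """Push the filter down only when it is expected to discard most rows."""
    return estimate_selectivity(table_stats, column) <= threshold


stats = {
    "row_count": 1_000_000,
    "distinct_values": {"customer_id": 50_000, "status": 4},
}
push_customer = should_push_down(stats, "customer_id")
push_status = should_push_down(stats, "status")
```

Under these assumptions a filter on `customer_id` is pushed down (it keeps roughly 1 in 50,000 rows), while a filter on `status` is not (it keeps roughly 1 in 4).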
  • Patent number: 11550696
    Abstract: One example method includes evaluating code of a quantum circuit, estimating one or more runtime statistics concerning the code, generating a recommendation based on the one or more runtime statistics, wherein the recommendation identifies one or more resources recommended to be used to execute the quantum circuit, checking availability of the resources for executing the quantum circuit, allocating resources, when available, sufficient to execute the quantum circuit, and using the allocated resources to execute the quantum circuit.
    Type: Grant
    Filed: June 30, 2020
    Date of Patent: January 10, 2023
    Assignee: EMC IP Holding Company LLC
    Inventors: Kenneth Durazzo, Seth Jacob Rothschild, Victor Fong
  • Patent number: 11537909
    Abstract: Methods and systems are presented for monitoring database processes to generate machine learning predictions. A plurality of database processes executed on database implementations can be monitored, wherein the monitoring includes determining a start time, an end time, and a number of rows impacted by portions of the database processes, and the monitored database processes generate instances of machine learning data including at least the number of rows impacted and an associated duration of time. Using a machine learning component and the machine learning data, a duration of time can be predicted for a candidate database process for execution on a database implementation.
    Type: Grant
    Filed: December 30, 2019
    Date of Patent: December 27, 2022
    Assignee: Oracle International Corporation
    Inventors: Sudhir Arthanat, Prashant Prakash
  • Patent number: 11461319
    Abstract: Examples of dynamic database query efficiency improvement are provided herein. Query portions of a received database query can be identified as candidates for replacement. The candidates for replacement can be query portions that reduce the efficiency of the query. Alternative queries can be determined that include substitute query portion(s) in place of candidate(s) for replacement. An expected performance of the alternative queries can be determined. Based at least in part on the expected performance of the alternative queries, one or more alternative queries can be selected as replacement database queries for the received database query.
    Type: Grant
    Filed: October 6, 2014
    Date of Patent: October 4, 2022
    Assignee: Business Objects Software, Ltd.
    Inventor: Alan McShane
  • Patent number: 11429893
    Abstract: Techniques for massively-parallel real-time database-integrated machine learning (ML) inference are described. An ML model is deployed as one or more model serving units behind an endpoint. The ML model can be associated with a virtual table or function, and a query that is received that references the virtual table or function can be processed by issuing inference requests to the endpoint by the query execution engine(s).
    Type: Grant
    Filed: November 13, 2018
    Date of Patent: August 30, 2022
    Assignee: Amazon Technologies, Inc.
    Inventor: Dylan Tong
  • Patent number: 11416550
    Abstract: Mechanisms for resolving a database query are provided. In some instances, these mechanisms include identifying a connected component in a query graph corresponding to the database query. In some instances, these mechanisms further include determining a longest path length for the connected component. In some instances, these mechanisms further include selecting a path having the longest path length. In some instances, these mechanisms still further include building an algebraic expression for the path. In some instances, these mechanisms still further include solving the algebraic expression using matrix-matrix multiplication to provide a solution. And, in some instances, these mechanisms still further include responding to the query based on the solution.
    Type: Grant
    Filed: April 10, 2020
    Date of Patent: August 16, 2022
    Assignee: Redis, LTD
    Inventor: Roi Lipman
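The idea of answering a path query with matrix-matrix multiplication can be shown with a small sketch. It assumes a single relationship type stored as a Boolean adjacency matrix and a two-hop path; the function name and the sample graph are hypothetical, and the patent's actual algebraic expressions are not reproduced here.

```python
def bool_matmul(a, b):
    """Boolean matrix product: result[i][j] is True when some k links i to j."""
    inner = len(b)
    cols = len(b[0])
    return [
        [any(row[k] and b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


# knows[i][j] is True when person i knows person j.
knows = [
    [False, True, False],
    [False, False, True],
    [False, False, False],
]

# The path (a)-[knows]->(b)-[knows]->(c) becomes the product knows x knows.
two_hops = bool_matmul(knows, knows)
pairs = [(i, j) for i, row in enumerate(two_hops)
         for j, hit in enumerate(row) if hit]
```

In this example the only two-hop connection runs from person 0 to person 2 by way of person 1.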
  • Patent number: 11372858
    Abstract: Operations include estimating, in real time, a runtime of a query. The query optimization system receives a set of query definition elements for defining a target query. The system uses the set of query definition elements to determine an estimated runtime for the target query. If the estimated runtime exceeds some acceptable threshold value, then the system determines a modification to the set of query definition elements. The system uses the modification to generate a modified query, corresponding to a lower estimated runtime.
    Type: Grant
    Filed: May 18, 2017
    Date of Patent: June 28, 2022
    Assignee: Oracle International Corporation
    Inventors: Oleksiy Ignatyev, Ondrej Bohaciak
  • Patent number: 11334848
    Abstract: The invention relates to a commitment process for efficiently generating a jointly supported decision.
    Type: Grant
    Filed: December 20, 2018
    Date of Patent: May 17, 2022
    Inventor: Richard Graf
  • Patent number: 11327970
    Abstract: Context dependent execution time prediction may be applied to redirect queries to additional query processing resources. A query to a database may be received at a first query engine. A prediction model for executing queries at the first query engine may be applied to determine predicted query execution time for the first query engine. A prediction model for executing queries at a second query engine may also be applied to determine predicted query execution time for the second query engine. One of the query engines may be selected to perform the query based on a comparison of the predicted query execution times.
    Type: Grant
    Filed: March 25, 2019
    Date of Patent: May 10, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Mingda Li, Gaurav Saxena, Naresh Chainani
  • Patent number: 11327995
    Abstract: A complex data type is encoded over columns of a table of a columnar database by mapping fields of the complex data type to the columns. An optimized query can be generated for a query specifying the complex data type. The optimized query specifies the columns to which the fields specified within the query are mapped, instead of specifying the fields. The optimized query can be processed against the database in a late materialization manner to fulfill the query.
    Type: Grant
    Filed: September 11, 2019
    Date of Patent: May 10, 2022
    Assignee: MICRO FOCUS LLC
    Inventors: Deepak Majeti, Natalya Aksman, James Clampffer, Stephen Gregory Walkauskas
  • Patent number: 11294916
    Abstract: A method for execution by a query processing system includes receiving a query request that indicates a query for execution by a database system. A plurality of query execution mode options for execution of the query via the database system can be determined. A plurality of execution success conditions corresponding to the plurality of query execution mode options can be determined. A plurality of resultant correctness guarantee data corresponding to the plurality of query execution mode options based on the plurality of execution success conditions can be generated. Query execution mode selection data can be generated by selecting a query execution mode from the plurality of query execution mode options based on having resultant correctness guarantee data that compares favorably to determined resultant correctness requirement data. A resultant for the query can be generated by facilitating execution of the query in accordance with the selected execution mode.
    Type: Grant
    Filed: May 20, 2020
    Date of Patent: April 5, 2022
    Assignee: Ocient Holdings LLC
    Inventors: George Kondiles, Jason Arnold, S. Christopher Gladwin, Joseph Jablonski, Daniel Coombs, Andrew D. Baptist
  • Patent number: 11240119
    Abstract: A method of operating a communications network is disclosed. Modern communications networks produce vast amounts of network operational data which have the potential to provide a useful summary of the operational state of the network. Whilst processes such as clustering are known for arranging the vast amount of data into groups, the clusters themselves do not provide data which might be easily interpreted by network elements or administrators. Network operational data often comprises a plurality of data items, each of which gives a value for each of a set of attributes. By processing a cluster to identify attributes in the cluster whose values vary less in the cluster than they vary outside of the cluster, and then generating a cluster description which is based on a measure of the central tendency of the values of those attributes in the cluster, an easily interpretable general description of the data items in the cluster is provided.
    Type: Grant
    Filed: July 28, 2016
    Date of Patent: February 1, 2022
    Assignee: BRITISH TELECOMMUNICATIONS public limited company
    Inventors: Alexander Healing, Michael Turner
  • Patent number: 11194649
    Abstract: A method, system and computer program product for providing early diagnosis of hardware, software or configuration problems in a data warehouse system. A received query is parsed to determine the properties of the query. The query may then be joined to existing groups of queries if those groups have shared properties of the query. After executing the query according to an execution plan, results from the execution of the query are received, which may include problem(s) that occurred during execution of the query. For those problems that reach a pre-defined threshold of becoming a “group problem” in those groups joined by the query, the problem is reported to the end user concerning those groups where the problem exceeds the pre-defined threshold. In this manner, an early diagnosis of the problems in the data warehouse system that can cause delay and failure of the processing of queries is able to occur.
    Type: Grant
    Filed: July 10, 2019
    Date of Patent: December 7, 2021
    Assignee: International Business Machines Corporation
    Inventors: Lukasz Gaza, Artur M. Gruszecki, Tomasz Kazalski, Bartlomiej T. Malecki, Konrad K. Skibski, Tomasz Stradomski
  • Patent number: 11182386
    Abstract: Methods and systems for generating database statistics. Table statistics in a metadata catalog of a source database system are observed, statistics generation costs utilizing a target database system are estimated, and source statistics generation costs utilizing a source database system are estimated. The statistics generation costs are compared and statistics generation queries by the target database system are triggered in response to the statistics generation costs utilizing the target database system having a predefined relationship with the source statistics generation costs utilizing the source database system. The statistics generation queries are performed by the target database system in response to the triggering by the source database system. The generated statistics are sent from the target database system to the source database system, the table statistics in a metadata catalog are updated based on the generated statistics, and the updated table statistics are used to optimize a query plan.
    Type: Grant
    Filed: March 24, 2020
    Date of Patent: November 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Dennis Butterstein, Oliver Benke, Tobias Ulrich Bergmann, Felix Beier, Terence P. Purcell
  • Patent number: 11163755
    Abstract: Various embodiments relate generally to data science and data analysis, computer software and systems, and wired and wireless network communications to provide an interface between repositories of disparate datasets and computing machine-based entities that seek access to the datasets, and, more specifically, to a computing and data storage platform that facilitates consolidation of one or more datasets, whereby a collaborative data layer and associated logic facilitate, for example, efficient access to, and implementation of, collaborative datasets. In some examples, a method may include receiving data representing a query of a consolidated dataset that may include datasets formatted as atomized datasets, analyzing the query to classify portions of the query to form classified query portions, partitioning the query into sub-queries as a function of a classification type for each of the classified query portions, and retrieving data representing a query result from distributed data repositories.
    Type: Grant
    Filed: April 25, 2019
    Date of Patent: November 2, 2021
    Assignee: data.world, Inc.
    Inventors: Bryon Kristen Jacob, David Lee Griffith, Triet Minh Le, Jon Loyens, Brett A. Hurt, Arthur Albert Keen
  • Patent number: 11132343
    Abstract: In general, embodiments of the present invention provide systems, methods and computer readable media for automatic cleaning of entity resolution (ER) data persistently stored in a data repository.
    Type: Grant
    Filed: March 18, 2016
    Date of Patent: September 28, 2021
    Assignee: Groupon, Inc.
    Inventors: Taylor Raack, David Alan Johnston
  • Patent number: 11093490
    Abstract: In accordance with one aspect of the present disclosure, a request to provide recommendations of data enrichments for a database is received at a recommendation engine. The recommendation engine may perform static and dynamic analysis of data associated with the database and may further refine recommendations based on policies. The recommendation engine may then provide the recommendations, if any, of data enrichments to allow a software developer, for example, to indicate whether the data enrichments are to be used.
    Type: Grant
    Filed: October 11, 2019
    Date of Patent: August 17, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Anthony Nino Bice, David Michael Robinson, Hariharan Sivaramakrishnan
  • Patent number: 11068483
    Abstract: In accordance with embodiments, there are provided mechanisms and methods for facilitating dynamic selection and application of rules for query processing for large datasets in an on-demand services environment according to one embodiment. In one embodiment and by way of example, a method comprises determining processing patterns of a query based on historical performances associated with the query placed on behalf of a tenant in a multi-tenant environment, and dynamically applying one or more rules to the query for processing of the query within a predictable amount of time, where the one or more rules are dynamically selected from sets of rules based on the processing patterns. The method may further include executing the query based on the one or more rules by scanning one or more portions of a database having contents pertinent to the query and generating results by processing the query based on the contents and within the predictable amount of time.
    Type: Grant
    Filed: September 18, 2018
    Date of Patent: July 20, 2021
    Assignee: salesforce.com, inc.
    Inventors: Cody Marcel, Sahil Ramrakhyani, Saikiran Perumala, Brian Esserlieu, Seshank Kalvala
  • Patent number: 11068460
    Abstract: Automated Index Management entails automated monitoring of query workload in a DBMS to determine a set of higher load queries to use to evaluate new potential indexes. Without the need of user approval or action, the potential indexes are automatically created, evaluated and tested, and then made available for system wide use for executing queries issued by end users. Indexes created by Automated Index Management are referred to herein as auto indexes.
    Type: Grant
    Filed: January 15, 2019
    Date of Patent: July 20, 2021
    Assignee: Oracle International Corporation
    Inventors: Mohamed Zait, Sunil Chakkappen, Christoforus Widodo, Zhan Li
  • Patent number: 11055352
    Abstract: Optimized query plans may be generated independent of the query engine that performs the optimized query plan. A request to generate an optimized query plan for a query may be received and a type of engine for performing the query may be identified. An initial plan may be generated in an engine-specific format for the type of engine that is translated into an optimization plan format. An analysis of the initial plan in the optimization plan format may be performed to generate an optimized query plan. The optimized query plan may be translated into the engine-specific format and sent in response to the request for the optimized query plan.
    Type: Grant
    Filed: June 8, 2017
    Date of Patent: July 6, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Marc Howard Beitchman, Andrew Edward Caldwell, Rahul Sharma Pathak
  • Patent number: 11010378
    Abstract: Joining data using a disjunctive operator using a lookup table is described. An example computer-implemented method can include receiving a query with a set of conjunctive predicates and a set of disjunctive predicates. The method may also include generating a lookup table for each predicate in the sets of conjunctive predicates and disjunctive predicates. The method, for each row in a probe-side table, may also further include looking up a value associated with that row in each of the lookup tables and adding the row to a results set when there is a match. Additionally, the method may also include returning the results set.
    Type: Grant
    Filed: March 13, 2020
    Date of Patent: May 18, 2021
    Assignee: Snowflake Inc.
    Inventors: Thierry Cruanes, Florian Andreas Funke, Guangyan Hu, Jiaqi Yan
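The per-predicate lookup tables in this abstract can be sketched for the disjunctive case. This is a simplified illustration under stated assumptions: only equality predicates joined by OR are handled, the conjunctive predicates are omitted, the result is a semi-join (probe rows only), and the function and sample data are hypothetical rather than Snowflake's implementation.

```python
def disjunctive_semi_join(probe_rows, build_rows, column_pairs):
    """Return probe rows that match the build side on ANY column pair.

    column_pairs is a list of (probe_column, build_column) equality
    predicates combined with OR.
    """
    # One lookup table (a set of build-side values) per predicate.
    lookups = [{b[build_col] for b in build_rows}
               for _, build_col in column_pairs]
    results = []
    for row in probe_rows:
        # Probe every lookup table; a hit in any one satisfies the OR.
        if any(row[probe_col] in table
               for (probe_col, _), table in zip(column_pairs, lookups)):
            results.append(row)
    return results


contacts = [
    {"email": "a@x.org", "phone": "111"},
    {"email": "b@x.org", "phone": "222"},
]
leads = [
    {"email": "a@x.org", "phone": "999"},  # matches on email
    {"email": "z@x.org", "phone": "222"},  # matches on phone
    {"email": "q@x.org", "phone": "000"},  # matches on neither
]
matched = disjunctive_semi_join(
    leads, contacts, [("email", "email"), ("phone", "phone")]
)
```

Each probe row costs one set lookup per predicate, which avoids comparing it against every build-side row.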
  • Patent number: 11003649
    Abstract: Embodiments of the present invention provide an index establishment method and device. The method can include determining, according to index status information of a column in a database within a preset time threshold, whether an index is to be established for the column; determining an index type according to the data information of the column; and establishing the index for the column according to the index type based on the determination that the index is to be established for the column.
    Type: Grant
    Filed: June 1, 2018
    Date of Patent: May 11, 2021
    Assignee: ALIBABA GROUP HOLDING LIMITED
    Inventors: Bowen Zheng, Yue Pan, Chuangxian Wei
  • Patent number: 10956400
    Abstract: Querying a data set formed from a version of primary data and secondary data is facilitated. First and second versions of primary data are stored in a primary data version store. Secondary data is received. The secondary data is stored in a secondary data store. A query language statement is received. The query language statement is executed by selecting query results from a data set that includes the secondary data and elements of the first version of primary data not inconsistent with the secondary data.
    Type: Grant
    Filed: July 15, 2016
    Date of Patent: March 23, 2021
    Assignee: SAP SE
    Inventors: Frank Feiks, Thomas Gross-Boelting, Michael Mueller, Armin Weidenschlager, Anton Forstreuter, Xiaomeng Wang, Florian Roeger, Jordan Tchorbadjiyski, Ruadhan MacFadden
  • Patent number: 10846287
    Abstract: Systems and methods are provided for loading data in connection with testing an application, based on a hierarchical framework. An exemplary method includes receiving a data query from a requestor, identifying a dependency of the data query on a data structure, and determining whether said data structure is annotated with a poor quality indicator. The exemplary method further includes providing a warning to the requestor when the data structure is annotated with the poor quality indicator, thereby informing the requestor of a potential quality issue with a report relying on the data structure query.
    Type: Grant
    Filed: December 21, 2017
    Date of Patent: November 24, 2020
    Assignee: MASTERCARD INTERNATIONAL INCORPORATED
    Inventors: Chinmay Sharad Sagade, Sanchit Kaushik, Srinivas Kosaraju
  • Patent number: 10846284
    Abstract: A data analysis platform may be based on database views. A build module may receive, from a source code repository, information about a modified definition of a view. The build module may identify schema objects on which the view depends and form instructions for creating the view and the schema objects. The instructions may be executed to form the updated version of the view in a schema space separate from a production schema space. A deployment pipeline may coordinate replacing the production version of the view with the new version in response to validating the new version of the view.
    Type: Grant
    Filed: March 30, 2015
    Date of Patent: November 24, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Yangbae Park, Aaron John Seldon Steers, Jason Scott Flittner, Stuart William Barnes, Siddharth Sridhar
  • Patent number: 10831764
    Abstract: An example operation may include one or more of identifying a query from a requesting entity, where the query requests access to one or more blockchains, converting the query to an expression tree, creating one or more expression tree variations based on the expression tree, the one or more expression tree variations provide one or more different expressions than the expression tree and a same result as the expression tree, determining access conformity between one or more expression tree variations and the expression tree, selecting an expression tree variation with a greatest conformity rating, performing the query using the expression tree variation with the greatest conformity rating, and providing query results to a requesting entity.
    Type: Grant
    Filed: December 2, 2017
    Date of Patent: November 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Vijay Kumar Ananthapur Bache, Jhilam Bera, Vijay Ekambaram, Padmanabha Venkatagiri Seshadri
  • Patent number: 10762539
    Abstract: Disclosed are systems and methods for managing queries on on-line advertisement data. The system includes a query engine device for receiving queries from and outputting query results to query client devices and a training engine for generating and adjusting a model for predicting an estimation of resource usage for execution of each received query based on each query's corresponding feature vector having values pertaining to the query and a system status. The query engine device is further configured to provide the estimation of resource usage for each query to the corresponding query client device and, in response, receive input from such corresponding query client device and specifying whether to proceed with the corresponding query. A database system receives input from each query's corresponding query client device as to whether to proceed with the query and, in response, initiates or inhibits execution of such query with respect to a database storage system.
    Type: Grant
    Filed: January 27, 2016
    Date of Patent: September 1, 2020
    Assignee: AMOBEE, INC.
    Inventors: Mummoorthy Murugesan, Jianqiang Shen, Yan Qi
  • Patent number: 10754856
    Abstract: A large, highly parallel database management system includes thousands of nodes storing a huge volume of data. The database management system includes a query optimizer for optimizing data queries. The optimizer estimates the column cardinality of a set of rows based on estimated column cardinalities of disjoint subsets of the set of rows. For a particular column, the actual column cardinality of the set of rows is the sum of the actual column cardinalities of the two subsets of rows. The optimizer creates two respective Bloom filters from the two subsets, and then combines them to create a combined Bloom filter using logical OR operations. The actual column cardinality of the set of rows is estimated using a computation from the combined Bloom filter.
    Type: Grant
    Filed: May 29, 2018
    Date of Patent: August 25, 2020
    Assignee: OCIENT INC.
    Inventors: Jason Arnold, George Kondiles
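
The combine-then-estimate step in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the filter size `M`, the hash count `K`, the salted SHA-256 hashing, and the use of the standard set-bit estimator n ≈ -(M/K)·ln(1 - X/M) are all assumptions chosen for illustration.

```python
import hashlib
import math

M = 4096  # bits per Bloom filter (assumed for this sketch)
K = 3     # hash functions per value (assumed for this sketch)


def _positions(value):
    """Yield K bit positions for a value, derived from salted SHA-256 digests."""
    for i in range(K):
        digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
        yield int.from_bytes(digest[:8], "big") % M


def build_filter(values):
    """Build a Bloom filter, stored as a Python int used as a bit set."""
    bits = 0
    for value in values:
        for pos in _positions(value):
            bits |= 1 << pos
    return bits


def estimate_cardinality(bits):
    """Estimate the number of distinct values from the count of set bits."""
    set_bits = bin(bits).count("1")
    if set_bits >= M:
        return float("inf")  # filter is saturated; no finite estimate
    return -(M / K) * math.log(1 - set_bits / M)


# Two subsets of rows whose column values overlap (500 distinct values in total).
subset_a = [f"user{i}" for i in range(0, 300)]
subset_b = [f"user{i}" for i in range(200, 500)]

# Logical OR of the per-subset filters gives the filter of the whole set.
combined = build_filter(subset_a) | build_filter(subset_b)
estimate = estimate_cardinality(combined)
print(round(estimate))
```

Because the OR of two filters equals the filter built over the union of their inputs, values shared by both subsets are counted once, which simply adding the two per-subset estimates would not achieve.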
  • Patent number: 10740304
    Abstract: Various embodiments virtualize data across heterogeneous formats. In one embodiment, a plurality of heterogeneous data sources is received as input. A local schema graph including a set of attribute nodes and a set of type nodes is generated for each of the plurality of heterogeneous data sources. A global schema graph is generated based on each local schema graph that has been generated. The global schema graph comprises each of the local schema graphs and at least one edge between at least one of two or more attributes nodes and two or more type nodes from different local schema graphs. The edge indicates a relationship between the data sources represented by the different local schema graphs comprising the two or more attributes nodes based on a computed similarity between at least one value associated with each of the two or more attributes nodes.
    Type: Grant
    Filed: August 25, 2014
    Date of Patent: August 11, 2020
    Assignee: International Business Machines Corporation
    Inventors: Achille Belly Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward
  • Patent number: 10691646
    Abstract: Embodiments of the present invention relate to elimination of blocks such as splits in distributed processing systems such as MapReduce systems using the Hadoop Distributed Filing System (HDFS). In one embodiment, a method of and computer program product for optimizing queries in distributed processing systems are provided. A query is received. The query includes at least one predicate. The query refers to data. The data includes a plurality of records. Each record comprises a plurality of values in a plurality of attributes. Each record is located in at least one of a plurality of blocks of a distributed file system. Each block has a unique identifier. For each block of the distributed file system, at least one value cluster is determined for an attribute of the plurality of attributes. Each value cluster has a range. The predicate of the query is compared with the at least one value cluster of each block.
    Type: Grant
    Filed: March 5, 2018
    Date of Patent: June 23, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Mohamed Eltabakh, Peter J. Haas, Fatma Ozcan, Mir Hamid Pirahesh, John (Yannis) Sismanis, Jan Vondrak
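
As a rough illustration of the block elimination described above, the sketch below summarizes each block's attribute values as a few value ranges and skips blocks whose ranges cannot satisfy a range predicate. The gap-based clustering rule, the `max_gap` parameter, and the block names are hypothetical; the abstract does not say how the value clusters are computed.

```python
def value_clusters(values, max_gap):
    """Group a non-empty list of values into (low, high) ranges.

    A new range starts whenever the gap to the previous value exceeds max_gap.
    """
    ordered = sorted(values)
    clusters = [[ordered[0], ordered[0]]]
    for v in ordered[1:]:
        if v - clusters[-1][1] > max_gap:
            clusters.append([v, v])
        else:
            clusters[-1][1] = v
    return [tuple(c) for c in clusters]


def blocks_to_scan(block_clusters, low, high):
    """Return ids of blocks with at least one cluster overlapping [low, high]."""
    return [
        block_id
        for block_id, clusters in block_clusters.items()
        if any(c_low <= high and low <= c_high for c_low, c_high in clusters)
    ]


# Hypothetical blocks, each holding the values of one attribute.
blocks = {
    "blk-1": [3, 4, 5, 90, 92],
    "blk-2": [40, 41, 45],
    "blk-3": [7, 8, 60, 61],
}
block_clusters = {bid: value_clusters(vals, max_gap=5) for bid, vals in blocks.items()}

# Predicate: attribute BETWEEN 10 AND 50.
selected = blocks_to_scan(block_clusters, low=10, high=50)
print(selected)
```

Keeping several ranges per block matters here: a single min/max summary for `blk-1` would span 3 to 92 and could not rule the block out, while its two clusters both fall outside the predicate.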
  • Patent number: 10635671
    Abstract: Techniques herein optimize sort-merge join method for a band join. In an embodiment, for a query comprising a query block specifying a join between a first table and a second table, a band join condition is detected between the first table and the second table. Once the band join condition is detected, an execution plan is generated and executed. The execution of the execution plan includes: for a first row of at least a subset of first sorted rows, scanning second rows from a set of second sorted rows, joining each of said second rows with said first row, and ceasing to scan when encountering a row from the second sorted rows that falls outside a bound of said band join condition. Techniques also include parallelizing a workload by overlapping the distribution of rows to the same slave process and computing cost and cardinality estimation for enhanced band join.
    Type: Grant
    Filed: October 5, 2017
    Date of Patent: April 28, 2020
    Assignee: Oracle International Corporation
    Inventors: Lei Sheng, Rafi Ahmed, Andrew Witkowski, Sankar Subramanian
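
The early-termination scan described in the abstract above can be sketched on plain lists of numeric join keys. This is a simplified illustration under assumed names (`band_join`, `below`, `above`), not the patented execution plan: it covers only the serial scan, not the parallel distribution or the cost estimation.

```python
def band_join(left, right, below, above):
    """Return pairs (l, r) with l - below <= r <= l + above, via sort-merge."""
    left_sorted = sorted(left)
    right_sorted = sorted(right)
    result = []
    start = 0  # first right row that can still match the current or a later left row
    for l in left_sorted:
        # Rows below the lower bound cannot match this or any later left row.
        while start < len(right_sorted) and right_sorted[start] < l - below:
            start += 1
        # Scan forward and stop at the first row past the upper bound.
        j = start
        while j < len(right_sorted) and right_sorted[j] <= l + above:
            result.append((l, right_sorted[j]))
            j += 1
    return result


pairs = band_join([10, 20, 30], [5, 9, 12, 19, 21, 40], below=2, above=2)
print(pairs)
```

Because both inputs are sorted, the `start` pointer only moves forward, so each left row scans just the right rows inside its band plus one row that ends the scan.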
  • Patent number: 10628416
    Abstract: A method includes receiving, from a client device, an enhanced database query for a union operation of a first query and at least a second query, parsing the enhanced database query to identify two or more parameters controlling handling of duplicate rows in the first query and the second query, evaluating the enhanced database query utilizing the parameters to generate a result table, and providing the result table to the client device.
    Type: Grant
    Filed: May 31, 2016
    Date of Patent: April 21, 2020
    Assignee: International Business Machines Corporation
    Inventor: Chitwan Humad
  • Patent number: 10621235
    Abstract: Mechanisms are provided for resolving database queries. These mechanisms identify a connected component in a query graph corresponding to a database query. They then determine a longest path length for the connected component. Next, the mechanisms select a path having the longest path length and build an algebraic expression for the path. Finally, the mechanisms solve the algebraic expression using matrix-matrix multiplication to provide a solution, and then respond to the query based on the solution.
    Type: Grant
    Filed: June 27, 2019
    Date of Patent: April 14, 2020
    Assignee: Redis Labs Ltd.
    Inventor: Roi Lipman
  • Patent number: 10528599
    Abstract: Data processing engines implement tiered data processing for distributed data in local and remote data stores. Requests to access distributed data including a data object in a remote data store are received at a data processing engine. A query plan is generated to service the access request. Different operations in the query plan are identified and assigned to one or more remote query processing engines that may access the remote data object. Requests to perform the different operations are sent to the one or more remote query processing engines. A final result is generated for the request based on the results received for the different operations from the remote query processing engine and results from operations performed with respect to locally stored data.
    Type: Grant
    Filed: December 16, 2016
    Date of Patent: January 7, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Ippokratis Pandis, Mengchu Cai, Martin Grund, Anurag Windlass Gupta
  • Patent number: 10459987
    Abstract: Systems and methods for enhancing workflows with data virtualization. An example method may comprise: executing, by a processing device, a workflow comprising a conditional statement; performing a query in view of the conditional statement, the query employing virtualized data access to a plurality of heterogeneous data sources conforming to different data source schemas; transforming, by the processing device, data items returned by the query into a resulting data set conforming to a certain schema, wherein the data items correspond to the different data source schemas; and evaluating the conditional statement in view of the resulting data set.
    Type: Grant
    Filed: February 6, 2015
    Date of Patent: October 29, 2019
    Assignee: Red Hat, Inc.
    Inventors: Kimberly Palko, Kenneth W. Peeples, Prakash Aradhya
  • Patent number: 10453104
    Abstract: A method, system, and computer program product for pricing data according to contribution in a query are provided in the illustrative embodiments. A set of data cubes is identified, wherein a data cube in the set of data cubes comprises a quantum of data configured for trading in exchange for a payment, the set of data cubes being usable for answering a query. A first portion of a price for performing the query is computed, wherein the first portion corresponds to a first number of records used from a first data cube by the query, the first data cube being included in the set of data cubes. The first portion and the first number of records are presented in a pricing preview of the query.
    Type: Grant
    Filed: January 13, 2014
    Date of Patent: October 22, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Tamer E. Abuelsaad, Gregory J. Boss, John M. Ganci, Jr., Craig M. Trim
  • Patent number: 10423479
    Abstract: A method, system and computer program product for providing early diagnosis of hardware, software or configuration problems in a data warehouse system. A received query is parsed to determine the properties of the query. The query may then be joined to existing groups of queries if those groups have shared properties of the query. After executing the query according to an execution plan, results from the execution of the query are received, which may include problem(s) that occurred during execution of the query. For those problems that reach a pre-defined threshold of becoming a “group problem” in those groups joined by the query, the problem is reported to the end user concerning those groups where the problem exceeds the pre-defined threshold. In this manner, an early diagnosis of the problems in the data warehouse system that can cause delay and failure of the processing of queries is able to occur.
    Type: Grant
    Filed: June 8, 2017
    Date of Patent: September 24, 2019
    Assignee: International Business Machines Corporation
    Inventors: Lukasz Gaza, Artur M. Gruszecki, Tomasz Kazalski, Bartlomiej T. Malecki, Konrad K. Skibski, Tomasz Stradomski
  • Patent number: 10423620
    Abstract: A central relational database management system (RDBMS) is operatively interconnected to one or more back-end database systems. A set of different query criteria specified for each of different types of queries for a mixed query workload is evaluated. At least one remote derived source of data requested by at least one of the different types of queries is dynamically created using at least one of the one or more back-end database systems that supports remote processing of the at least one of the different types of queries.
    Type: Grant
    Filed: April 22, 2017
    Date of Patent: September 24, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gary W. Crupi, Shantan Kethireddy, Ruiping Li, David R. Trotter
  • Patent number: 10394874
    Abstract: A computing device includes a storage machine holding instructions executable by a logic machine to generate multi-string clusters, each containing alphanumeric strings of a dataset. Further multi-string clusters are generated via iterative performance of a combination operation in which a hierarchically-superior cluster is generated from a set of multi-string clusters. The combination operation includes, for candidate pairs of multi-string clusters, generating syntactic profiles describing an alphanumeric string from each multi-string cluster of the candidate pair. For each of the candidate pairs, a cost factor is determined for at least one of its syntactic profiles. Based on the cost factors determined for the syntactic profiles, one of the candidate pairs is selected. The multi-string clusters from the selected candidate pair are combined to generate the hierarchically-superior cluster including all of the alphanumeric strings from the selected candidate pair of multi-string clusters.
    Type: Grant
    Filed: July 28, 2017
    Date of Patent: August 27, 2019
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Sumit Gulwani, Prateek Jain, Daniel Adam Perelman, Saswat Padhi, Oleksandr Polozov
  • Patent number: 10346396
    Abstract: The sequence of one or more searches can be altered to cause multiple searches to load and evaluate segments together. When a first search query is executed, a query processor can commence loading segments of an indexed store to thereby evaluate the first search query against the segments. Then, if a second search query is received while the first search query is executing, the query processor can cause the second search query to commence on the same segment that the first search query is currently being evaluated against. The first and second search queries can then continue execution together until the first search query has been evaluated against all segments. The query processor can then continue executing the second search query against the remaining segments until it reaches the segment on which its execution commenced.
    Type: Grant
    Filed: August 2, 2016
    Date of Patent: July 9, 2019
    Assignee: QUEST SOFTWARE INC.
    Inventors: Artem Nikolaevich Goussev, Vadim Alexandrovich Senchukov
  • Patent number: 10339136
    Abstract: The present invention extends to methods, systems, and computing system program products for incrementally calculating skewness for Big Data or streamed data in real time by incrementally calculating one or more components of skewness. Embodiments of the invention include incrementally calculating one or more components of skewness in a modified computation subset based on the one or more components of the skewness calculated for a previous computation subset and then calculating the skewness based on the incrementally calculated components. Incrementally calculating skewness avoids visiting all data elements in the modified computation subset and performing redundant computations thereby increasing calculation efficiency, saving computing resources and reducing computing system's power consumption.
    Type: Grant
    Filed: December 9, 2015
    Date of Patent: July 2, 2019
    Assignee: CLOUD & STREAM GEARS LLC
    Inventor: Jizhu Lu
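
One way to picture the incremental calculation in the abstract above is to keep running power sums as the components and update them when an element enters or leaves the computation subset. The choice of components (count and sums of x, x², x³) and the class name are assumptions for this sketch; the abstract does not specify which components are maintained, and raw power sums can lose precision on data with a large mean.

```python
class IncrementalSkewness:
    """Population skewness maintained from running power sums (illustrative)."""

    def __init__(self):
        self.n = 0
        self.s1 = 0.0  # sum of x
        self.s2 = 0.0  # sum of x**2
        self.s3 = 0.0  # sum of x**3

    def add(self, x):
        self.n += 1
        self.s1 += x
        self.s2 += x * x
        self.s3 += x * x * x

    def remove(self, x):
        self.n -= 1
        self.s1 -= x
        self.s2 -= x * x
        self.s3 -= x * x * x

    def skewness(self):
        mean = self.s1 / self.n
        m2 = self.s2 / self.n - mean ** 2  # second central moment
        m3 = self.s3 / self.n - 3 * mean * self.s2 / self.n + 2 * mean ** 3  # third central moment
        return m3 / m2 ** 1.5


data = [2, 4, 4, 5, 9, 12]
window = IncrementalSkewness()
for x in data[:4]:
    window.add(x)

# Slide the window by one element: constant work, no pass over the subset.
window.remove(data[0])
window.add(data[4])
print(window.skewness())
```

Each slide touches only the element removed and the element added, so the cost per update does not depend on the window size.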