Based On Joins Patents (Class 707/714)

System, method, and computer-readable medium for optimization of multiple parallel join operations on skewed data

Patent number: 8195644

Abstract: A system, method, and computer-readable medium that facilitate management of data skew during a parallel multiple join operation are provided. Portions of tables involved in the join operation are distributed among a plurality of processing modules, and each of the processing modules is provided with a list of skewed values of a join column of a larger table involved in the join operation. Each of the processing modules scans the rows of first and second tables distributed to the processing modules and compares values of the join columns of both tables with the list of skewed values. Rows of a larger table having non-skewed values in the join column are redistributed, and rows of the larger table having skewed values in the join column are maintained locally at the processing modules. Rows of the smaller table that have non-skewed values in the join column are redistributed, and rows of the smaller table that have skewed values in the join column are duplicated among the processing modules.

Type: Grant

Filed: October 6, 2008

Date of Patent: June 5, 2012

Assignee: Teradata US, Inc.

Inventor: Yu Xu
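The routing rule described in the abstract above can be illustrated with a minimal sketch. It assumes rows are dictionaries, that non-skewed rows are redistributed by hashing the join value, and that each processing module knows its own id; the function name `route_rows` and all parameter names are hypothetical and do not come from the patent.

```python
from collections import defaultdict

def route_rows(rows, key, skewed, n_modules, local_module, is_larger):
    """Decide where each locally held row goes; returns {module_id: [rows]}.

    Assumed layout: rows are dicts, `skewed` is the set of skewed join values.
    """
    out = defaultdict(list)
    for row in rows:
        value = row[key]
        if value not in skewed:
            # Non-skewed value: hash-redistribute (same rule for both tables).
            out[hash(value) % n_modules].append(row)
        elif is_larger:
            # Skewed value in the larger table: keep the row local.
            out[local_module].append(row)
        else:
            # Skewed value in the smaller table: duplicate to every module.
            for module in range(n_modules):
                out[module].append(row)
    return dict(out)
```

Because skewed rows of the larger table never move and their smaller-table partners are present on every module, each skewed pair can still be joined locally, which is the point of the scheme as the abstract describes it.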
Method for assembly of personalized enterprise information integrators over conjunctive queries

Patent number: 8190596

Abstract: A plurality of sources are registered. A plurality of schemas are constructed, based on the plurality of sources. A desired output is obtained as a conjunctive query. A list of potential connections between at least selected ones of the sources is provided. A plurality of join plans are developed, based on the connections.

Type: Grant

Filed: November 28, 2007

Date of Patent: May 29, 2012

Assignee: International Business Machines Corporation

Inventors: Ullas B. Nambiar, Biplav Srivastava
Method, system and program product for rewriting structured query language (SQL) statements

Patent number: 8185518

Abstract: Under the present invention, a SQL statement having search criteria is received. Upon receipt, a table that lists all possible combinations of the search criteria is created. From the table, a set of patterns among the possible combinations is identified. Based on these patterns, the table is then sorted. Once sorted, the table is divided into a set of temporary tables based on the set of patterns/sorting operation. The set of temporary tables are then individually joined with the SQL statement and separate searches are conducted. A new set of temporary tables is then generated and populated with results of the separate searches. These result tables are then unioned/combined into a single result table.

Type: Grant

Filed: November 12, 2004

Date of Patent: May 22, 2012

Assignee: International Business Machines Corporation

Inventors: Howard S. Bloom, Roy Froehlich, Thomas A. Jobson, Jr., Edith A. Kanyock, Charles F. Matula, Arnold M. Rosenberg
Techniques for exact cardinality query optimization

Patent number: 8185519

Abstract: An exact cardinality query optimization system and method for optimizing a query having a plurality of expressions to obtain a cardinality-optimal query execution plan for the query. Embodiments of the system and method use various techniques to shorten the time necessary to obtain the cardinality-optimal query execution plan, which contains the query execution plan when all cardinalities are exact. Embodiments of the system and method include a covering queries technique that leverages query execution feedback to obtain an unordered subset of relevant expressions for the query, an early termination technique that bounds the cardinality to determine whether the processing can be terminated before each of the expressions is executed, and an expressions ordering technique that finds an ordering of expressions that yields the greatest reduction in time to obtain the cardinality-optimal query execution plan.

Type: Grant

Filed: March 14, 2009

Date of Patent: May 22, 2012

Assignee: Microsoft Corporation

Inventors: Surajit Chaudhuri, Vivek Narasayya, Ravishankar Ramamurthy
Data Skew Insensitive Parallel Join Scheme

Publication number: 20120117055

Abstract: A method for creating a joined data set from a join input data set is disclosed. The method starts by categorizing the join input data set into a high-skew data set and a low-skew data set. The low-skew data set is distributed to the plurality of CPUs using a first distribution method. The high-skew data set is distributed to the plurality of CPUs using a second distribution method. The plurality of CPUs process the high-skew data set and the low-skew data set to create the joined data set.

Type: Application

Filed: January 23, 2012

Publication date: May 10, 2012

Inventors: Awny K. Al-Omari, QiFan Chen, Gregory S. Battas, Kashif A. Siddiqui, Michael J. Hanlon
SYSTEM AND METHOD FOR OUTER JOINS ON A PARALLEL DATABASE MANAGEMENT SYSTEM

Publication number: 20120117056

Abstract: There is provided a computer-executable method of executing an outer join on a parallel database management system. An exemplary method comprises receiving an outer skewed values list (SVL). The outer SVL may comprise values that are indicated to be skewed. The exemplary method further comprises receiving an inner SVL. The inner SVL may comprise values that are indicated to be skewed. Additionally, the exemplary method comprises partitioning the outer table and the inner table across a plurality of join instances, based on the outer SVL and the inner SVL. A missing skewed value is identified. The missing skewed value may be a value of the inner SVL that is not found in the inner table. The outer join is performed using the plurality of join instances, based on the missing skewed value.

Type: Application

Filed: March 30, 2010

Publication date: May 10, 2012

Inventors: Awny K. Al-Omari, QiFan Chen
Join paths across multiple databases

Patent number: 8176036

Abstract: Methods, systems and computer instructions on computer readable media are disclosed for optimizing a query, including a first join path, a second join path, and an optimizer, to efficiently provide high quality information from large, multiple databases. The methods and systems include evaluating a schema graph identifying the join paths between a field X and a field Y, and a value X=x, to identify the top-few values of Y=y that are reachable from a specified X=x value when using the join paths. Each data path that instantiates the schema join paths can be scored and evaluated as to the quality of the data with respect to specified integrity constraints to alleviate data quality problems. Agglomerative scoring methodologies can be implemented to compute high quality information in the form of a top-few answers to a specified problem as requested by the query.

Type: Grant

Filed: October 23, 2009

Date of Patent: May 8, 2012

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Divesh Srivastava, Ioannis Kotidis
Query generator

Patent number: 8166020

Abstract: A query generator for generating a query that returns a result set comprising data retrieved from a database and data returned by an analytic function that operates on at least a portion of the retrieved data is disclosed.

Type: Grant

Filed: December 22, 2005

Date of Patent: April 24, 2012

Assignee: Oracle International Corporation

Inventors: Joel Turkel, Raghuram Venkatasubramanian
QUERY OPTIMIZATION ON VPD PROTECTED COLUMNS

Publication number: 20120095988

Abstract: A method and apparatus for preserving optimization hints in a transformed query is provided. In one embodiment, the methodology is implemented by query optimization logic. Upon receiving a first query to access values in a column of a table protected by an access control policy, the query optimization logic creates a second query that is equivalent to the first query as subject to the access control policy. Furthermore, the second query contains a new predicate that conjunctively joins a clone of a first expression in a predicate of the first query with a second expression that is derived, based on the access control policy, from the first expression. In one embodiment, the query optimization logic submits the second query for execution.

Type: Application

Filed: October 20, 2011

Publication date: April 19, 2012

Inventor: Chon Hei Lei
PERFORMING DATABASE JOINS

Publication number: 20120089594

Abstract: A method of performing a database join is provided herein. The method includes receiving a query. The query may specify a join of a first table and a second table. The method further includes determining a new predicate based on a mapping between a first column of the first table and a second column of the second table for a plurality of tuples of the join. Further, the method includes modifying the query such that the query comprises the new predicate.

Type: Application

Filed: October 11, 2010

Publication date: April 12, 2012

Inventors: Murali Krishna, Harumi Kuno, Vijay M. Sarathy, Subrata Naskar
Hybrid Query Execution Plan

Publication number: 20120089595

Abstract: A procedural pattern in a received query execution plan can be matched to a stored pattern for which an equivalent declarative operator has been pre-defined. The query execution plan can describe a query for accessing data. A hybrid execution plan can be generated by replacing the procedural pattern with the equivalent declarative operator. A hybrid execution plan processing cost can be assigned to execution of the hybrid execution plan and a query execution plan processing cost can be assigned to execution of the query execution plan. The assigning can include evaluating a cost model for the hybrid execution plan and the query execution plan. The query can be executed using the hybrid execution plan if the hybrid execution plan processing cost is less than the query execution plan processing cost or the query execution plan if the hybrid execution plan processing cost is greater than the query execution plan processing cost. Related systems, methods, and articles of manufacture are disclosed.

Type: Application

Filed: December 17, 2010

Publication date: April 12, 2012

Inventor: Bernhard Jaecksch
Rescheduling of modification operations for loading data into a database system

Patent number: 8156110

Abstract: A method or apparatus for use with a database system that stores a join view associated with plural base relations includes receiving modification operations to modify at least two of the base relations of the join view, and re-ordering the received modification operations to avoid concurrent execution of modification operations of more than one of at least two base relations.

Type: Grant

Filed: January 29, 2004

Date of Patent: April 10, 2012

Assignee: Teradata US, Inc.

Inventors: Gang Luo, Michael W. Watzke, Curt J. Ellmann, Jeffrey F. Naughton
System, method and computer program product for storing a formula having first and second object fields

Patent number: 8150833

Abstract: In accordance with embodiments, there are provided mechanisms and methods for storing a formula having first and second object fields. These mechanisms and methods for storing a formula having first and second object fields can allow access to data from related object types other than the object type being currently accessed. The ability of embodiments to provide such access may allow access to additional contents of a database for performing validations, calculations, etc.

Type: Grant

Filed: May 6, 2009

Date of Patent: April 3, 2012

Assignee: salesforce.com, inc.

Inventors: Mary Scotton, Walter Macklem, Eric Bezar, Jesse Collins
System, method, and computer-readable medium for reducing row redistribution costs for parallel join operations

Patent number: 8150836

Abstract: A system, method, and computer-readable medium for optimizing execution of a join operation in a parallel processing system are provided. A plurality of processing nodes that have at least one row of one or more tables involved in a join operation are identified. For each of the processing nodes, respective counts of rows that would be redistributed to each of the processing nodes based on join attributes of the rows are determined. A redistribution matrix is calculated from the counts of rows of each of the processing nodes. An optimized redistribution matrix is generated from the redistribution matrix, wherein the optimized redistribution matrix provides a minimization of rows to be redistributed among the nodes to execute the join operation.

Type: Grant

Filed: August 19, 2008

Date of Patent: April 3, 2012

Assignee: Teradata US, Inc.

Inventors: Yu Xu, Olli Pekka Kostamaa, Xin Zhou
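One way to picture the redistribution matrix from the abstract above is as a table of row counts per node and destination, from which a placement that moves the fewest rows is chosen. The sketch below is an illustration under assumptions, not the patented optimization: it treats `counts[i][j]` as the number of rows on node `i` that hash to bucket `j`, and it searches all bucket-to-node placements by brute force, which is only practical for a handful of nodes. The name `best_assignment` is hypothetical.

```python
from itertools import permutations

def best_assignment(counts):
    """counts[i][j] = rows held by node i whose join attribute maps to bucket j.

    Returns (assignment, moved), where assignment[j] is the node chosen to
    host bucket j and moved is the number of rows that must change nodes.
    Brute-force search is an assumption made for clarity.
    """
    n = len(counts)
    total = sum(sum(row) for row in counts)
    best, best_moved = None, None
    for perm in permutations(range(n)):
        # Rows already on the node that will host their bucket stay put.
        stay = sum(counts[perm[j]][j] for j in range(n))
        moved = total - stay
        if best_moved is None or moved < best_moved:
            best, best_moved = perm, moved
    return list(best), best_moved
```

For `counts = [[1, 9], [8, 2]]`, hosting bucket 0 on node 1 and bucket 1 on node 0 moves 3 rows instead of 17.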
Method and system for data processing using multidimensional filtering

Patent number: 8145626

Abstract: In one embodiment, the present invention includes a method comprising receiving a data filter for filtering a collection of data, wherein the collection of data is configured as a star schema including a fact table and dimension tables. The data filter is applied against the dimension tables to generate modified dimension tables. The modified dimension tables are applied against the fact table to produce a modified fact table. The data filter is then applied against the modified fact table to generate a second modified fact table, which is the output of the process.

Type: Grant

Filed: December 31, 2008

Date of Patent: March 27, 2012

Assignee: SAP AG

Inventors: Peter John, Thomas Zurek
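The three filtering passes in the abstract above can be sketched as follows. The data layout is an assumption made for illustration: each dimension table is a dictionary from key to row, each fact row stores its dimension key under a column named after the dimension, and the filter is split into per-dimension predicates plus one fact predicate. The name `filter_star` is hypothetical.

```python
def filter_star(fact_rows, dim_tables, dim_filters, fact_filter):
    """Apply a filter to a star schema in three passes (assumed layout:
    dim_tables = {dim_name: {key: row}}, fact rows hold keys under dim_name)."""
    # Pass 1: filter each dimension table, keeping only the surviving keys.
    surviving = {
        name: {k for k, row in rows.items() if dim_filters[name](row)}
        for name, rows in dim_tables.items()
        if name in dim_filters
    }
    # Pass 2: apply the modified dimension tables against the fact table.
    modified_fact = [
        f for f in fact_rows
        if all(f[name] in keys for name, keys in surviving.items())
    ]
    # Pass 3: apply the filter to the modified fact table itself.
    return [f for f in modified_fact if fact_filter(f)]
```

Filtering the small dimension tables first means the fact predicate only runs over fact rows that still have a matching dimension row.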
System for finding queries aiming at tail URLs

Patent number: 8145622

Abstract: Systems and methodologies for improved query classification and processing are provided herein. As described herein, a query prediction model can be constructed from a set of training data (e.g., diagnostic data obtained from an automatic diagnostic system and/or other suitable data) using a machine learning-based technique. Subsequently upon receiving a query, a set of features corresponding to the query, such as the length and/or frequency of the query, unigram probabilities of respective words and/or groups of words in the query, presence of pre-designated words or phrases in the query, or the like, can be generated. The generated features can then be analyzed in combination with the query prediction model to classify the query by predicting whether the query is aimed at a head Uniform Resource Locator (URL) or a tail URL. Based on this prediction, an appropriate index or combination of indexes can be assigned to answer the query.

Type: Grant

Filed: January 9, 2009

Date of Patent: March 27, 2012

Assignee: Microsoft Corporation

Inventors: Xiaoxin Yin, Vijay Ravindran Nair, Ryan Frederick Stewart, Fang Liu, Junhua Wang, Tiffany Kumi Dohzen, Yi-Min Wang
Method and apparatus for associating metadata with data

Patent number: 8145624

Abstract: Method and apparatus for associating at least one query expression to an original database table is described. In one example, a metadata table is added to a database, wherein at least one portion of the metadata table comprises the at least one query expression. Afterwards, the at least one query expression is associated to at least one value from at least one tuple belonging to a data table of the database.

Type: Grant

Filed: October 31, 2006

Date of Patent: March 27, 2012

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Divesh Srivastava, Ioannis Velegrakis
EVALUATING EXECUTION PLAN CHANGES AFTER A WAKEUP THRESHOLD TIME

Publication number: 20120072412

Abstract: In an embodiment an execution plan for a query is created. A wakeup threshold is set proportional to an amount of time taken by the creation of the execution plan. In various embodiments, the wakeup threshold is increased by a percentage equal to one minus a percentage of free resources at a computer system, is increased inversely proportional to an amount of execution time of a previous execution of the execution plan, or is decreased proportional to a number of times the execution plan was executed. A portion of the execution plan is executed to produce a portion of rows in a result set until the wakeup threshold expires. After the wakeup threshold expires, changes to the execution plan are evaluated.

Type: Application

Filed: September 20, 2010

Publication date: March 22, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Robert J. Bestgen, Robert V. Downer, Brian R. Muras
DATA COMBINATION SYSTEM AND DATA COMBINATION METHOD

Publication number: 20120066207

Abstract: A data join system of the present invention includes a table determination unit 11 selecting a record b as a join target if a value of a key item included in a record a acquired by a data write unit from a table A falls within a first predetermined range set based on a value of a key item included in the record b stored in a table B, a data join unit joining the selected record b with the record a to generate a record c, and a data write unit storing the record c into a table C. Thus, the success rate of data join can be improved while the accuracy of join of the record b and the record a to be joined is improved.

Type: Application

Filed: May 10, 2010

Publication date: March 15, 2012

Applicant: NTT DOCOMO, INC.

Inventors: Daisuke Ochi, Ichiro Okajima, Hiroshi Kawakami, Toshihiro Suzuki, Manhee Jo, Tomohiro Nagata, Motonari Kobayashi, Yuki Oyabu
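The range-based matching in the abstract above can be sketched as a join that pairs record a with record b when a's key lies within a window around b's key. The symmetric window `[b - tolerance, b + tolerance]` is an assumption; the abstract only says the range is set based on the key value in record b. The names `range_join` and `tolerance` are hypothetical.

```python
def range_join(table_a, table_b, key, tolerance):
    """Join rows whose key values are close rather than exactly equal.

    Assumption: the 'first predetermined range' is a symmetric window of
    width `tolerance` around the key of each record in table_b.
    """
    table_c = []
    for a in table_a:
        for b in table_b:
            if b[key] - tolerance <= a[key] <= b[key] + tolerance:
                # Record c keeps both source records side by side.
                table_c.append({"a": a, "b": b})
    return table_c
```

A wider tolerance raises the join success rate, while a narrower one keeps matches more accurate, which is the trade-off the abstract refers to.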
System, method, and computer-readable medium for partial redistribution, partial duplication of rows of parallel join operation on skewed data

Patent number: 8131711

Abstract: A system, method, and computer-readable medium that facilitate management of data skew during a parallel join operation are provided. Portions of tables involved in the join operation are distributed among a plurality of processing modules, and each of the processing modules is provided with a list of skewed values of a join column of a larger table involved in the join operation. Each of the processing modules scans the rows of the tables distributed to the processing modules and compares values of the join columns of both tables with the list of skewed values. Rows of the larger table having non-skewed values in the join column are redistributed, and rows of the larger table having skewed values in the join column are maintained locally at the processing modules. Rows of the smaller table that have non-skewed values in the join column are redistributed, and rows of the smaller table that have skewed values in the join column are duplicated among the processing modules.

Type: Grant

Filed: May 22, 2008

Date of Patent: March 6, 2012

Assignee: Teradata Corporation

Inventors: Yu Xu, Pekka Kostamaa
GEOSPATIAL DATABASE INTEGRATION USING BUSINESS MODELS

Publication number: 20120054174

Abstract: In certain examples, a mechanism is provided for automatically performing join operations. Source data is received and a metadata model is received. The metadata model includes a hierarchical structure. The source data is aligned to the hierarchical structure in the metadata model to form a source data hierarchy. Based on the source data hierarchy, the source data is joined to geocoded information.

Type: Application

Filed: July 18, 2011

Publication date: March 1, 2012

Applicant: International Business Machines Corporation

Inventors: Ronald L. Gagnier, Michael A. Iles, Steven R. McDougall, David J. Ridgeway, Craig A. Statchuk
TRANSFORMING RELATIONAL QUERIES INTO STREAM PROCESSING

Publication number: 20120054173

Abstract: A method of transforming relational queries of a database into stream processing on a data processing system includes receiving a series of relational queries, transforming first parts of the queries into a continuous query embodied as a streaming application, sending parameters in second parts of the queries in the series to the streaming application as a data stream, and executing the continuous query based on the received data stream to generate query results for the series of relational queries. Each query in the series includes a first part and a second part. The first parts are a pattern common to all the queries in the series and the second parts each have one or more parameters that are not common to all of the queries in the series.

Type: Application

Filed: August 25, 2010

Publication date: March 1, 2012

Applicant: International Business Machines Corporation

Inventors: Henrique Andrade, Bugra Gedik, Martin J. Hirzel, Robert Soule, Hua Yong Wang, Kun-Lung Wu, Qiong Zou
METHODS AND SYSTEMS FOR HARDWARE ACCELERATION OF STREAMED DATABASE OPERATIONS AND QUERIES BASED ON MULTIPLE HARDWARE ACCELERATORS

Publication number: 20120047126

Abstract: Embodiments of the present invention provide a hardware accelerator that assists a host database system in processing its queries. The hardware accelerator comprises special purpose processing elements that are capable of receiving database query/operation tasks in the form of machine code database instructions, execute them in hardware without software, and return the query/operation result back to the host system.

Type: Application

Filed: June 29, 2011

Publication date: February 23, 2012

Applicant: TERADATA US, INC.

Inventors: Jeremy L. Branscome, Michael Paul Corwin, Joseph Irawan Chamdani, Rajasekhar Cherabuddi
Incremental Maintenance of Immediate Materialized Views with Outerjoins

Publication number: 20120047117

Abstract: Methods and systems for using algorithms in relational database management systems (RDBMSs) for incremental maintenance of materialized views with outerjoins are disclosed. The algorithms achieve the following goals with respect to a class of materialized outerjoin views and the performance of update operations: relax the requirement for the existence of the primary key attributes in a select list of the view to only some of the relations (i.e., the relations referenced as a preserved side in an outerjoin); relax null-intolerant property requirements for some predicates used in the view definition (i.e., predicates referencing relations which can be null-supplied by more than one outerjoin); and implement maintenance of outerjoin views by using one update statement (e.g., MERGE, UPDATE, INSERT, or DELETE) per view for each relation referenced in the view. The algorithms allow design and implementation of the incremental maintenance of materialized views with outerjoins to be integrated into an RDBMS.

Type: Application

Filed: December 20, 2010

Publication date: February 23, 2012

Applicant: iAnywhere Solutions, Inc.

Inventor: Anisoara NICA
DATABASE QUERY OPTIMIZATIONS

Publication number: 20120047124

Abstract: A method of processing a query is provided. The method includes performing on a processor: receiving a database query that includes a plurality of predicates that associate a subject with an object, where one or more of the predicates is a variable predicate; generating at least one new query by selectively replacing the at least one variable predicate in the database query with a non-variable predicate; and performing the at least one new database query on a database to obtain a query result.

Type: Application

Filed: August 17, 2010

Publication date: February 23, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Min Wang
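The rewriting step in the abstract above can be sketched for queries made of (subject, predicate, object) patterns. This is a sketch under assumptions: variables are strings starting with `?`, the candidate predicates are supplied by the caller, and every combination is enumerated, whereas the abstract says replacement is selective. The name `expand_variable_predicates` is hypothetical.

```python
from itertools import product

def expand_variable_predicates(query, known_predicates):
    """Yield new queries in which each variable predicate is replaced by a
    concrete one. `query` is a list of (subject, predicate, object) tuples;
    names starting with '?' are treated as variables (an assumed convention).
    """
    variables = sorted({p for _, p, _ in query if p.startswith("?")})
    for combo in product(known_predicates, repeat=len(variables)):
        binding = dict(zip(variables, combo))
        # Predicates that are not variables pass through unchanged.
        yield [(s, binding.get(p, p), o) for s, p, o in query]
```

Each generated query contains only fixed predicates, so it can be run against the database like any ordinary query.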
EXECUTING A QUERY PLAN WITH DISPLAY OF INTERMEDIATE RESULTS

Publication number: 20120047125

Abstract: In an embodiment, a FIRSTIO execution plan is selected that has a lowest estimated execution time for finding a number of records that satisfy the query and are simultaneously viewable. An ALLIO execution plan is selected that has a lowest estimated execution time for finding all records that satisfy the query. The FIRSTIO execution plan is executed for a first time period to create a FIRSTIO result set. The ALLIO execution plan is executed for a second time period to create an ALLIO result set. The FIRSTIO result set is displayed if the FIRSTIO result set comprises more records than the ALLIO result set. The ALLIO result set is displayed if the ALLIO result set comprises more records than the FIRSTIO result set. In an embodiment, the first and second time periods expire in response to the expiration of a maximum time specified by the query.

Type: Application

Filed: August 17, 2010

Publication date: February 23, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Paul R. Day, Randy L. Egan, Roger A. Mittelstadt, Brian R. Muras
Joining tables in multiple heterogeneous distributed databases

Patent number: 8122008

Abstract: A method for joining tables in multiple heterogeneous distributed databases implemented by at least two data sources accessible to a federated database server over a network includes: transmitting from the federated database server a sub-command to a first of the data sources responsive to the federated database server receiving a data query; retrieving, with the federated database server, block data from the first data source related to the data query using block fetching according to the sub-command; transmitting, with the federated database server, at least a portion of the block data to a second of the data sources together with an instruction for the second data source to perform a join operation on the portion of the block data and a data table stored by the second data source related to the query; and retrieving a result of the join operation with the federated database server.

Type: Grant

Filed: September 23, 2009

Date of Patent: February 21, 2012

Assignee: International Business Machines Corporation

Inventors: Ming Li, Hai Feng Li, Yun Feng Sun, Sheng Zhao
Method and apparatus for fast similarity-based query, self-join, and join for massive, high-dimension datasets

Patent number: 8117213

Abstract: A method and apparatus for fast similarity-based query, self-join, and join for massive, high-dimension datasets have been disclosed.

Type: Grant

Filed: October 30, 2009

Date of Patent: February 14, 2012

Assignee: Nahava Inc.

Inventors: Russell Toshio Nakano, Stanley Cheng
Database processing apparatus, information processing method, and computer program product

Patent number: 8117186

Abstract: A database processing apparatus generates a first processing instruction for acquiring an element included in the processing-target structured document, a second processing instruction for performing a natural join by using result data including the acquired element, a third processing instruction for performing a cross join by using the result data, and a fourth processing instruction for updating a correspondence relation between a result of the execution of the natural join and a result of the execution of the cross join by using the results of these executions, and joins these processing instructions to generate the process plan. At this time, the database processing apparatus converts the first processing instruction into a fifth processing instruction for transmitting an acquisition request for the element to the database servers, and receiving the result data including the acquired element from the plural database servers.

Type: Grant

Filed: January 27, 2009

Date of Patent: February 14, 2012

Assignee: Kabushiki Kaisha Toshiba

Inventor: Masakazu Hattori
SQL adapter business service

Patent number: 8117184

Abstract: A Structured Query Language (SQL) adapter business service that converts data from a data set to a common representation format used for all data sets with which the SQL adapter business service interacts. Hence the SQL adapter business service can communicate with various internal and external systems independently of the native format in which those systems maintain and store data. The SQL adapter business service optimizes operations to update data in the data sets by combining operations when possible and by using result sets from executing previous SQL statements to construct subsequent SQL statements. The SQL adapter business service takes advantage of parent/child relationships between tables to construct SQL statements in an order such that the SQL statements process only a minimum amount of data, thereby making retrieval of data as efficient as possible.

Type: Grant

Filed: January 2, 2004

Date of Patent: February 14, 2012

Assignee: Siebel Systems, Inc.

Inventors: Arjun Chandrasekar Iyer, Chandrakant Ramkrishna Bhavsar
VIRTUAL COLUMNS

Publication number: 20120036111

Abstract: Techniques are described herein for performing column functions on virtual columns in database tables. A virtual column is defined by the database to contain results of a defining expression. Statistics are collected and maintained for virtual columns. Indexing is performed on virtual columns. Referential integrity is maintained between two tables using virtual columns as keys. Join predicate push-down operations are also performed using virtual columns.

Type: Application

Filed: October 20, 2011

Publication date: February 9, 2012

Inventors: Subhransu Basu, Harmeek Singh Bedi
Dynamically Joined Fast Search Views for Business Objects

Publication number: 20120030189

Abstract: According to some embodiments, an anchor transactional view may be defined for at least one business object data structure. The anchor transactional view may have a plurality of anchor fields, each anchor field representing a data source and being associated with a field of an anchor search view. An indication of at least one extension field to the anchor search view may be received. The anchor search view and at least one extension field may represent, for example, a virtual fast search infrastructure view. Responsive to the received indication of the at least one extension field, an additional view may be dynamically joined at runtime to the anchor transactional view. The additional view may have at least one additional field, and each additional field may be associated with one of the extension fields.

Type: Application

Filed: July 30, 2010

Publication date: February 2, 2012

Inventors: Oliver Vossen, Martin Müller, Maic Wintel
Index backbone join

Patent number: 8103658

Abstract: Techniques described herein perform an index backbone join of data that is contained within two or more tables. Significantly, key data are selected from the indices constructed on the tables, and such data are filtered by the query-indicated criteria, before any data is selected from the tables themselves. Row identifiers of the rows remaining after the index filtering has been performed are then used to select the qualifying rows (only) from the tables. Data selected from the tables is joined to produce query results. Because all of the filtering is performed based on index entries prior to any table access, and because index access is typically much faster than table access, queries whose results require very large quantities of data from multiple tables can be performed much more quickly.

Type: Grant

Filed: November 13, 2009

Date of Patent: January 24, 2012

Assignee: Oracle International Corporation

Inventors: Lothar Flatz, Bjorn Kisbye Engsig
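The ordering of work in the abstract above (filter on index entries first, touch the tables last) can be sketched as follows. The layout is assumed for illustration: each index maps a join key to a list of row identifiers, each table maps a row identifier to a row, and the query criteria are reduced to a single predicate on the key. The name `index_backbone_join` is hypothetical.

```python
def index_backbone_join(index_a, index_b, table_a, table_b, key_filter):
    """Join two tables by working on their indices first.

    Assumed layout: index_x = {join_key: [row_id, ...]},
    table_x = {row_id: row}.
    """
    # Step 1: filter and match using index entries only; no table access yet.
    matching_keys = [k for k in index_a if k in index_b and key_filter(k)]
    # Step 2: fetch only the qualifying rows by row id and join them.
    result = []
    for k in matching_keys:
        for rid_a in index_a[k]:
            for rid_b in index_b[k]:
                result.append((table_a[rid_a], table_b[rid_b]))
    return result
```

Rows whose keys fail the filter or have no partner in the other index are never read from the tables.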
System, method, and computer-readable medium for duplication optimization for parallel join operations on similarly large skewed tables

Patent number: 8099409

Abstract: A system, method, and computer-readable medium for optimizing join operations in a parallel processing system are provided. A respective set of rows of a first table and a second table involved in a join operation are distributed to each of a plurality of processing modules. The join operation comprises a join on a first column of the first table and a second column of the second table. Each of the plurality of processing modules redistributes at least a portion of the rows of the first table distributed thereto substantially equally among the other processing modules and duplicates at least a portion of the rows of the second table distributed thereto among the plurality of processing modules. The disclosed optimization mechanisms provide for reduced spool space requirements for execution of the parallel join operation.

Type: Grant

Filed: September 3, 2008

Date of Patent: January 17, 2012

Assignee: Teradata US, Inc.

Inventors: Xin Zhou, Olli Pekka Kostamaa
HASH-JOIN IN PARALLEL COMPUTATION ENVIRONMENTS

Publication number: 20120011108

Abstract: According to some embodiments, a system and method for a parallel join of relational data tables may be provided by calculating, by a plurality of concurrently executing execution threads, hash values for join columns of a first input table and a second input table; storing the calculated hash values in a set of disjoint thread-local hash maps for each of the first input table and the second input table; merging the set of thread-local hash maps of the first input table, by a second plurality of execution threads operating concurrently, to produce a set of merged hash maps; comparing each entry of the merged hash maps to each entry of the set of thread-local hash maps for the second input table to determine whether there is a match, according to a join type; and generating an output table including matches as determined by the comparing.
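
A rough Python sketch of the three phases named in this abstract (thread-local hash maps, a parallel merge for the first table, then a probe with the second table's maps) might look as follows. It assumes an inner equi-join on a single column; the function names and the partition-by-hash merge strategy are illustrative assumptions, not the claimed method. Threads are used here only to show the structure of the phases.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def build_local_map(rows, key):
    """One worker: build a private hash map from join key to rows."""
    local = defaultdict(list)
    for row in rows:
        local[row[key]].append(row)
    return local

def chunks(rows, n):
    """Split rows into n slices, one per worker."""
    return [rows[i::n] for i in range(n)]

def merge_partition(local_maps, part, n_parts):
    """One worker: merge all keys that hash to partition `part`."""
    merged = defaultdict(list)
    for local in local_maps:
        for k, rows in local.items():
            if hash(k) % n_parts == part:
                merged[k].extend(rows)
    return merged

def parallel_hash_join(left, right, key, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Phase 1: disjoint thread-local hash maps for each input table.
        left_maps = list(pool.map(lambda c: build_local_map(c, key),
                                  chunks(left, workers)))
        right_maps = list(pool.map(lambda c: build_local_map(c, key),
                                   chunks(right, workers)))
        # Phase 2: merge the first table's local maps, one partition per worker.
        merged = list(pool.map(lambda p: merge_partition(left_maps, p, workers),
                               range(workers)))
    # Phase 3: probe the merged maps with the second table's local maps.
    out = []
    for rmap in right_maps:
        for k, rrows in rmap.items():
            lrows = merged[hash(k) % workers].get(k, [])
            out.extend({**l, **r} for l in lrows for r in rrows)
    return out

left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
right = [{"id": 1, "qty": 5}, {"id": 1, "qty": 7}, {"id": 3, "qty": 9}, {"id": 4, "qty": 1}]
joined = parallel_hash_join(left, right, "id")
```

Because each merged map owns a disjoint slice of the key space, the merge workers never write to the same map, which is the property the thread-local design is meant to provide.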

Type: Application

Filed: December 23, 2010

Publication date: January 12, 2012

Inventors: Christian Bensberg, Christian Mathis, Frederik Transier, Nico Bohnsack, Kai Stammerjohann
Automated Joining of Disparate Data for Database Queries

Publication number: 20110320433

Abstract: Described is associating metadata with different sources of data (e.g., database tables) that allows a single view of data from the sources to be created. An administrator creates baseviews corresponding to database tables and associates metadata with the baseviews, including primary key metadata for the baseviews and meta-tags for one or more of the columns of each baseview. A user selects fields (corresponding to table columns) from a starting baseview, along with fields from any other baseview that has metadata that matches the starting baseview's metadata. A join mechanism automatically creates the view if a metadata match is detected.

Type: Application

Filed: June 25, 2010

Publication date: December 29, 2011

Applicant: MICROSOFT CORPORATION

Inventors: Imran Mohiuddin, Mahmood Gulam Qadir, Yi Miao, Bryan Jason Dove, Jonathan Alan Handler, Craig F. Feied, Mehul Y. Shah
Query optimizer with schema conversion

Patent number: 8086598

Abstract: Methods, program products and systems for determining, for a database query that does not represent a snowflake schema, a graph comprising vertices each representing a table joined in the query, a directed edge between each pair of vertices of which a first vertex represents a first table and a second vertex represents a second table that is joined in the query with the first table, each of the edges representing one of an outer join and an inner join. Further determining, for the graph, a directed spanning tree that represents an ordering of joins in the query and includes all outer join edges in the graph.

Type: Grant

Filed: February 6, 2009

Date of Patent: December 27, 2011

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Andrew Lamb, Mitch Cherniack, Shilpa Lawande, Nga Tran
Between matching

Patent number: 8086597

Abstract: A query of at least one mark-up language document has a path expression comprising a conjunction, a first filter and a second filter. The first filter has a first probe. The second filter has a second probe. The first and second filters form a between filter having start and stop values specified by the first and second probes. A plan to process the query is generated based on, at least in part, a range defined by the start and stop values. An index of mark-up language documents is defined by another path expression; the index comprises values of mark-up language documents that satisfy the other path expression; the values are key values of the index. The plan is to perform a single scan of the key values from the start value to the stop value to identify at least one key value that satisfies the between filter.
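
The core idea of treating two filters as one between filter, so that the index is scanned once from the start value to the stop value, can be sketched in Python with a sorted list standing in for the index keys. The `between_scan` name is hypothetical, and the inclusive bounds are an assumption; the abstract does not say whether the range is inclusive.

```python
from bisect import bisect_left, bisect_right

def between_scan(sorted_keys, start, stop):
    """Return keys k with start <= k <= stop using one contiguous scan."""
    lo = bisect_left(sorted_keys, start)    # first position with key >= start
    hi = bisect_right(sorted_keys, stop)    # first position with key > stop
    return sorted_keys[lo:hi]

# Key values of a hypothetical index, kept in sorted order.
prices = [3, 8, 8, 12, 19, 25, 40]
hits = between_scan(prices, 8, 19)
```

Evaluating the two filters separately would scan from 8 to the end of the index and from the beginning to 19, then intersect the results; the single range scan visits only the keys between the two probes.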

Type: Grant

Filed: June 28, 2007

Date of Patent: December 27, 2011

Assignee: International Business Machines Corporation

Inventors: Andrey Balmin, Sauraj Goswami
Accelerating Database Management System Operations

Publication number: 20110307471

Abstract: Techniques for accelerating an operation in a database management system are provided. The techniques include reading data pertaining to a database management system operation from a storage unit, sending the database management system operation data to an accelerator unit, and processing the database management system operation data via the accelerator unit, wherein processing the data via the accelerator unit comprises using a multithreaded execution unit and compression hardware to perform the database management system operation with reduced execution time.

Type: Application

Filed: June 9, 2010

Publication date: December 15, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Vadim Sheinin
Optimization technique for dealing with data skew on foreign key joins

Patent number: 8078610

Abstract: A method for determining when a database system query optimizer should employ join skew avoidance steps. The method includes dynamically calculating the worst-case anticipated frequency distribution for a particular relation along a particular set of join column(s) at query execution time. The calculated frequency distribution value is compared to a skew threshold, the skew threshold representing the number of rows on the same distinct value that would lead to avoidable processing inefficiencies. It is then determined that the database system query optimizer should employ join skew avoidance steps if the calculated frequency distribution value exceeds the skew threshold.
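
The decision rule in this abstract reduces to comparing a worst-case frequency against a threshold. A minimal Python sketch follows; it assumes the worst case is simply the row count of the most frequent join value, computed exactly from the values, whereas the abstract does not say how the anticipated distribution is calculated. The function name is hypothetical.

```python
from collections import Counter

def should_avoid_skew(join_values, skew_threshold):
    """Return True when the most frequent join value exceeds the threshold."""
    # Assumption: worst-case frequency = rows sharing the most common value.
    worst_case = max(Counter(join_values).values(), default=0)
    return worst_case > skew_threshold

skewed = should_avoid_skew([7, 7, 7, 7, 2, 3], skew_threshold=3)
uniform = should_avoid_skew([1, 2, 3, 4, 5, 6], skew_threshold=3)
```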

Type: Grant

Filed: March 26, 2008

Date of Patent: December 13, 2011

Assignee: Teradata US, Inc.

Inventor: Stephen Molini
Efficient SQL access to point data and relational data

Patent number: 8078598

Abstract: Some embodiments include reception of a structured query language query, determination of at least one point data query and at least one relational data query based on the structured query language query, transmission of the at least one point data query to at least one point data server, transmission of the at least one relational data query to at least one relational data server, reception of point data and relational data in response to the point data query and the relational data query, and joining of the received point data and the received relational data into a result rowset.

Type: Grant

Filed: January 8, 2007

Date of Patent: December 13, 2011

Assignee: Siemens Aktiengesellschaft

Inventors: Trevor Bell, Christopher Patrick Milam
Query Execution Systems and Methods

Publication number: 20110302151

Abstract: System, method, and computer program product for processing data are disclosed. The method includes receiving a query for processing of data, wherein the data is stored in a table in a plurality of tables, wherein the table is stored on at least one node within the database system, determining an attribute of the table and another table in the plurality of tables, partitioning one of the table and the another table in the plurality of tables using the determined attribute into a plurality of partitions, and performing a join of at least two partitions of the table and the another table using the determined attribute. The join is performed on a single node in the database system.
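
One way to picture the partition-then-join step is the Python sketch below. It assumes both tables are hash-partitioned on the shared attribute so that matching rows land in the same partition and each pair of partitions can be joined without looking at any other partition; the abstract itself only requires partitioning one of the tables. The function names and sample data are illustrative.

```python
from collections import defaultdict

def partition(rows, attr, n):
    """Hash-partition rows on attr into n buckets."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(row[attr]) % n].append(row)
    return parts

def local_join(left_part, right_part, attr):
    """Join one pair of co-located partitions, as a single node would."""
    lookup = defaultdict(list)
    for l in left_part:
        lookup[l[attr]].append(l)
    return [{**l, **r} for r in right_part for l in lookup.get(r[attr], [])]

def partitioned_join(left, right, attr, n=3):
    out = []
    # Partition i of one table only ever meets partition i of the other.
    for lp, rp in zip(partition(left, attr, n), partition(right, attr, n)):
        out.extend(local_join(lp, rp, attr))
    return out

departments = [{"dept": 10, "title": "Sales"}, {"dept": 20, "title": "Ops"}]
employees = [{"dept": 10, "emp": "Ann"}, {"dept": 20, "emp": "Bo"},
             {"dept": 10, "emp": "Cy"}, {"dept": 30, "emp": "Di"}]
rows = partitioned_join(departments, employees, "dept")
```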

Type: Application

Filed: February 22, 2011

Publication date: December 8, 2011

Applicant: YALE UNIVERSITY

Inventors: Daniel Abadi, Kamil Bajda-Pawlikowski
Querying joined data within a search engine index

Patent number: 8073840

Abstract: Techniques and systems for indexing and retrieving data and documents stored in a record-based database management system (RDBMS) utilize a search engine interface. Search-engine indices are created from tables in the RDBMS and data from the tables is used to create “documents” for each record. Queries that require data from multiple tables may be parsed into a primary query and a set of one or more secondary queries. Join mappings and documents are created for the necessary tables. Documents matching the query string are retrieved using the search-engine indices and join mappings.

Type: Grant

Filed: June 16, 2009

Date of Patent: December 6, 2011

Assignee: Attivio, Inc.

Inventors: Tim Smith, William Kimble Johnson, III, Rik Tamm-Daniels, Sid Probstein
SYSTEMS AND METHODS FOR PROVIDING MULTILINGUAL SUPPORT FOR DATA USED WITH A BUSINESS INTELLIGENCE SERVER

Publication number: 20110295837

Abstract: A business intelligence (BI) server is described that supports data and schemas stored in multiple languages. The BI server implements a lookup table and lookup function that allows users to work with queries in different languages. When the user logs in, a session object is created for the user, which maintains the state information. A session variable specifies the language currently being used by the user. The BI server can inspect this session variable to determine the language of the user and perform the lookup translations as necessary. For example, if the language used by the session is different from the language of the base table storing the necessary information, the BI server can perform a translation by invoking a lookup function. The execution of the lookup can include performing a join operation of the base table with the lookup table to yield a translated value requested by the query.
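
The lookup-by-join step can be illustrated with Python's built-in `sqlite3` module. The table names, the `product_names` function, and the fallback to the base-table value when no translation exists are all assumptions made for this sketch; the abstract does not specify a schema or a fallback policy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product_lookup (id INTEGER, lang TEXT, name TEXT);
INSERT INTO product VALUES (1, 'Chair'), (2, 'Table');
INSERT INTO product_lookup VALUES
    (1, 'de', 'Stuhl'), (2, 'de', 'Tisch'), (1, 'fr', 'Chaise');
""")

BASE_LANGUAGE = "en"  # language of the base table (assumed)

def product_names(session_language):
    """Return (id, name) rows in the language held by the session variable."""
    if session_language == BASE_LANGUAGE:
        # Same language as the base table: no lookup needed.
        sql, params = "SELECT id, name FROM product ORDER BY id", ()
    else:
        # Different language: join the base table with the lookup table.
        sql = """SELECT p.id, COALESCE(l.name, p.name)
                 FROM product p
                 LEFT JOIN product_lookup l
                        ON l.id = p.id AND l.lang = ?
                 ORDER BY p.id"""
        params = (session_language,)
    return conn.execute(sql, params).fetchall()

german = product_names("de")
french = product_names("fr")
```

The `LEFT JOIN` with `COALESCE` keeps untranslated rows visible in the base language, which is one possible design choice rather than something the abstract prescribes.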

Type: Application

Filed: March 1, 2011

Publication date: December 1, 2011

Applicant: ORACLE INTERNATIONAL CORPORATION

Inventors: Roger Bolsius, Raghuram Venkatasubramanian, Ling Ni, Donko Donjerkovic, Saugata Chowdhury
SYSTEMS AND METHODS FOR PROVIDING VALUE HIERARCHIES, RAGGED HIERARCHIES AND SKIP-LEVEL HIERARCHIES IN A BUSINESS INTELLIGENCE SERVER

Publication number: 20110295836

Abstract: A business intelligence (BI) server and repository are described which support a set of hierarchical relationships among the data. The BI server receives user input specifying a set of parent-child or other ancestral relationships among a set of data in a data source. The BI server generates a set of SQL queries and executes the queries to pre-populate a set of tables which specify the parent-child relationships among the data in the data source. One such table is a parent-child relationship closure table that defines the inter-member relationships among the data members. Once the tables are populated, the BI server uses the closure tables to answer queries that require knowledge of the ancestral relationships among data.
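
A closure table of the kind mentioned in this abstract stores one row per ancestor and descendant pair, so ancestry questions become simple lookups. The Python sketch below pre-populates such a table from parent-child pairs. It assumes each member has at most one parent and that the hierarchy has no cycles; the `build_closure` name, the `(ancestor, descendant, distance)` row layout, and the sample hierarchy are illustrative.

```python
def build_closure(parent_child_pairs):
    """Return sorted rows (ancestor, descendant, distance), with self rows at 0."""
    parent_of = {child: parent for parent, child in parent_child_pairs}
    members = set(parent_of) | set(parent_of.values())
    closure = []
    for member in members:
        node, distance = member, 0
        # Walk up from the member to the root, emitting one row per ancestor.
        while node is not None:
            closure.append((node, member, distance))
            node = parent_of.get(node)
            distance += 1
    return sorted(closure)

pairs = [("CEO", "VP"), ("VP", "Manager"), ("CEO", "CFO")]
closure = build_closure(pairs)

# All descendants of "CEO" at any depth, answered without recursion.
ceo_reports = sorted(d for a, d, dist in closure if a == "CEO" and dist > 0)
```

Storing the distance alongside each pair is one way to handle branches of unequal depth, since a query can filter on a level without assuming every branch has the same number of levels.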

Type: Application

Filed: March 1, 2011

Publication date: December 1, 2011

Applicant: ORACLE INTERNATIONAL CORPORATION

Inventors: Roger Bolsius, Raghuram Venkatasubramanian, Ling Ni, Donko Donjerkovic, Saugata Chowdhury
JOIN TUPLE ASSEMBLY BY PARTIAL SPECIALIZATIONS

Publication number: 20110289069

Abstract: Various embodiments of systems and methods for join tuple assembly by partial specializations are described herein. The join tuple assembly by partial specializations is a phase of the method for join query evaluation by semi-join reduction. By using partial specializations of the non-join part of the WHERE clause of a join query and matching sets, the join tuple assembly is organized in a manner that all computations are necessary, none are repeated, and failure to complete a partial join tuple to a full tuple is detected as early as possible. The method can be applied to inner and outer joins, and to arbitrary join graphs and non-join conditions in the WHERE clause. It can also be used outside the context of semi-join reductions.

Type: Application

Filed: May 18, 2010

Publication date: November 24, 2011

Inventor: Gerhard Hill
Query optimization on VPD protected columns

Patent number: 8065329

Abstract: A method and apparatus for preserving optimization hints in a transformed query is provided. In one embodiment, the methodology is implemented by query optimization logic. Upon receiving a first query to access values in a column of a table protected by an access control policy, the query optimization logic creates a second query that is equivalent to the first query as subject to the access control policy. Furthermore, the second query contains a new predicate that conjunctively joins a clone of a first expression in a predicate of the first query with a second expression that is derived, based on the access control policy, from the first expression. In one embodiment, the query optimization logic submits the second query for execution.

Type: Grant

Filed: June 18, 2007

Date of Patent: November 22, 2011

Assignee: Oracle International Corporation

Inventor: Chon Hei Lei
Distribution of join operations on a multi-node computer system

Patent number: 8055651

Abstract: A method and apparatus distribute database query joins on a multi-node computing system. In the illustrated examples, a join execution unit utilizes various factors to determine where to best perform the query join. The factors include user controls in a hints record set up by a system user and properties of the system such as database configuration and system resources. The user controls in the hints record include a location flag and a determinicity flag. The properties of the system include the free space on the node and the size of the join, the data traffic on the networks and the data traffic generated by the join, the time to execute the join, and nodes that already have code optimization. The join execution unit also determines whether to use collector nodes to optimize the query join.

Type: Grant

Filed: February 10, 2009

Date of Patent: November 8, 2011

Assignee: International Business Machines Corporation

Inventors: Eric Lawrence Barsness, Amanda Peters, John Matthew Santosuosso
METHODS AND SYSTEMS FOR PERFORMING CROSS STORE JOINS IN A MULTI-TENANT STORE

Publication number: 20110258178

Abstract: Methods and systems for performing cross store joins in a multi-tenant store are described. In one embodiment, such a method includes retrieving data from a multi-tenant database system having a relational data store and a non-relational data store, receiving a request specifying data to be retrieved from the multi-tenant database system, retrieving, based on the request, one or more locations of the data to be retrieved, generating a database query based on the request, in which the database query specifies a plurality of data elements to be retrieved, the plurality of data elements including one or more data elements residing within the non-relational data store and one or more other data elements residing within the relational data store, and executing the database query against the multi-tenant database system to retrieve the data.

Type: Application

Filed: December 20, 2010

Publication date: October 20, 2011

Applicant: Salesforce.com

Inventors: Bill C. Eidson, Craig Weissman, Kevin Oliver, James Taylor, Simon Z. Fell, Donovan A. Schneider
METHODS AND SYSTEMS FOR OPTIMIZING QUERIES IN A MULTI-TENANT STORE

Publication number: 20110258179

Abstract: Methods and systems for optimizing queries in a multi-tenant store are described. In one embodiment, such a method includes retrieving data from a multi-tenant database system having a relational data store and a non-relational data store, receiving a request specifying data to be retrieved, retrieving one or more locations of the data to be retrieved, generating a database query based on the request, in which the database query specifies a plurality of data elements to be retrieved, the plurality of data elements including one or more data elements residing within the non-relational data store and one or more other data elements residing within the relational data store, generating an optimized database query having an optimized query syntax that is distinct from a query syntax of the database query, and executing the optimized database query against the multi-tenant database system to retrieve the data.

Type: Application

Filed: December 20, 2010

Publication date: October 20, 2011

Applicant: Salesforce.com

Inventors: Craig Weissman, James Taylor
