Abstract: Techniques for optimizing outer joins in database operations are provided. In an embodiment, a query having an outer join with a GROUP BY clause is rewritten and expanded to expand a first level partition of that GROUP BY clause to produce a modified query. In another situation, rows associated with processing a query are each split and then hashed based at least in part on attributes of the outer join predicate. A left outer join is performed and a global aggregation processed to produce a spooled table to process the query.
Abstract: Embodiments of the present invention provide hardware-friendly indexing of databases. In particular, forward and reverse indexing are utilized to allow for easy traversal of primary key to foreign key relationships. A novel structure known as a hit list also allows for easy scanning of various indexes in hardware. Group indexing is provided for flexible support of complex group key definitions, such as for date range indexing and text indexing. A Replicated Reordered Column (RRC) may also be added to the group index to convert a random I/O pattern into sequential I/O of only the needed column elements.
Type:
Grant
Filed:
April 7, 2008
Date of Patent:
October 14, 2014
Assignee:
Teradata US, Inc.
Inventors:
Krishnan Meiyyappan, Liuxi Yang, Jeremy Branscome, Michael Corwin, Ravi Krishnamurthy, Kapil Surlaker, James Shau, Joseph I. Chamdani
Abstract: Optimizing the execution of a query in a multi-database system includes identifying a region within a table, the table being referenced in the query. The region is stored on data-storage devices on first and second system databases in the multi-database system. A first access plan for the query is developed, the first access plan comprising accessing the version of the region stored on the first system database. A second access plan for the query is developed, the second access plan comprising accessing the version of the region stored on the second system database. A selection is made between the first access plan and the second access plan to execute the query. The query is executed using the selected access plan to produce a result.
Abstract: Techniques for organizing single or multi-column temporal data into R-tree spatial indexes are provided. Temporal data for single or multiple column data, within a database system, is converted into one or more line segments. The resulting line segments are transformed into a minimum bounding rectangle (MBR). Finally, the MBR is inserted into an R-tree spatial index.
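The conversion steps in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented method: it assumes a single period column mapped to a horizontal segment on a time axis (x is the day ordinal, y is fixed at 0), and the helper names `period_to_segment`, `segment_to_mbr`, and `mbrs_intersect` are hypothetical. Insertion into an actual R-tree is left out; the overlap test only shows the kind of check such an index performs on MBRs.

```python
from datetime import date

def period_to_segment(start, end):
    """Map a temporal period to a line segment (assumption: y is fixed at 0)."""
    return ((start.toordinal(), 0), (end.toordinal(), 0))

def segment_to_mbr(segment):
    """Return the minimum bounding rectangle as (min_x, min_y, max_x, max_y)."""
    (x1, y1), (x2, y2) = segment
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))

def mbrs_intersect(a, b):
    """True when two MBRs overlap on both axes (boundaries inclusive)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

january = segment_to_mbr(period_to_segment(date(2020, 1, 1), date(2020, 1, 31)))
mid_jan = segment_to_mbr(period_to_segment(date(2020, 1, 15), date(2020, 2, 15)))
february = segment_to_mbr(period_to_segment(date(2020, 2, 1), date(2020, 2, 15)))
```

Under these assumptions, `january` overlaps `mid_jan` but not `february`, so a period-overlap query reduces to a rectangle-intersection test.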
Abstract: An event tap associated with a server, such as a Web server, at a machine can transform a server event into a tuple, select a database node for the tuple, and place the tuple in a queue for that database node, and then flush the queue periodically directly into database nodes. The use of an event tap can thus reduce the computational burden on the database while keeping the server event data in the database relatively fresh.
Type:
Grant
Filed:
December 19, 2006
Date of Patent:
September 30, 2014
Assignee:
Teradata US, Inc.
Inventors:
George Candea, Anastasios Argyros, Mayank Bawa
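The event-tap flow described in the abstract above (transform, select a node, queue, flush) might look roughly like the sketch below. Everything named here is an assumption for illustration: the `EventTap` class, the event fields `ts`, `url`, and `status`, the CRC32-based node choice, and the `writer` callable that stands in for a direct write to a database node. A timer or background thread would call `flush` periodically; that scheduling is omitted.

```python
import zlib
from collections import defaultdict

class EventTap:
    def __init__(self, node_names, writer):
        self.node_names = list(node_names)
        self.writer = writer              # hypothetical callable(node, rows)
        self.queues = defaultdict(list)   # one queue per database node

    def to_tuple(self, event):
        """Transform a server event (a dict here) into a tuple."""
        return (event["ts"], event["url"], event["status"])

    def select_node(self, row):
        """Pick a node by hashing the URL field (an assumed distribution key)."""
        key = str(row[1]).encode("utf-8")
        return self.node_names[zlib.crc32(key) % len(self.node_names)]

    def record(self, event):
        """Queue the tuple for its node without touching the database."""
        row = self.to_tuple(event)
        self.queues[self.select_node(row)].append(row)

    def flush(self):
        """Send each non-empty queue to its node in one batch, then empty it."""
        for node, rows in self.queues.items():
            if rows:
                self.writer(node, list(rows))
                rows.clear()
```

Batching in `flush` is what keeps per-event work off the database in this sketch: the database sees one write per node per flush interval rather than one per event.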
Abstract: Techniques for improving complex database queries are provided. A determination is made whether to adopt a static or dynamic query execution plan based on metrics. When the dynamic query execution plan is used, a request fragment of the request is planned and the corresponding plan fragment is executed. The processed fragment provides feedback related to its processing to the remaining request and the process is repeated on the remaining request until the request is completed.
Abstract: Techniques for data integration are provided. Source attributes for source data are interactively mapped to target attributes for target data. Rules define how records from the source data are merged and selected and how duplicates are detected. The mappings and rules are recorded as a profile for the source data and processed against the source data to transform the source attributes to the target attributes.
Type:
Application
Filed:
September 30, 2013
Publication date:
September 18, 2014
Applicant:
Teradata US, Inc.
Inventors:
Thomas Kevin Ryan, Achal Patel, Neelesh Bansode, Arvind Kumar, Anand Louis
Abstract: Data cleansing and standardization techniques are provided. A user interactively defines rules for cleansing and standardizing data of a source dataset. The rules are applied to the data and varying degrees of results and metrics associated with applying the rules are presented to the user for inspection and analysis.
Abstract: Probabilistic record linking methods and a system are provided. Selections are acquired; the selections identify two data sources, column identifiers from each of the two data sources, pairs of column identifiers from each of the two data sources, and a confidence value for matching each record associated with each pair. The selections are used to compare data housed in the two data sources. Based on the comparison, matched records and non-matched records are identified from the two data sources.
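One way to picture the comparison step in the abstract above is the sketch below. It is an assumption-laden illustration, not the claimed method: the function `link_records`, the normalized weighted-agreement score, and the `threshold` parameter are all hypothetical, and only non-matched records from the first source are reported.

```python
def link_records(source_a, source_b, column_pairs, threshold):
    """Link records using (col_a, col_b, confidence) column pairs.

    The score is the sum of confidences for agreeing pairs divided by the
    total confidence, so it always falls between 0.0 and 1.0.
    """
    total = sum(conf for _, _, conf in column_pairs)
    matched, unmatched = [], []
    for rec_a in source_a:
        best, best_score = None, 0.0
        for rec_b in source_b:
            agree = sum(
                conf
                for col_a, col_b, conf in column_pairs
                if rec_a.get(col_a) is not None and rec_a.get(col_a) == rec_b.get(col_b)
            )
            score = agree / total
            if score > best_score:
                best, best_score = rec_b, score
        if best is not None and best_score >= threshold:
            matched.append((rec_a, best, best_score))
        else:
            unmatched.append(rec_a)
    return matched, unmatched

customers = [{"name": "Ann", "zip": "1"}, {"name": "Bob", "zip": "2"}]
accounts = [{"full_name": "Ann", "postcode": "1"}, {"full_name": "Zed", "postcode": "2"}]
pairs = [("name", "full_name", 0.75), ("zip", "postcode", 0.25)]
matched, unmatched = link_records(customers, accounts, pairs, threshold=0.5)
```

Here the "Ann" records agree on both column pairs and link, while "Bob" agrees with a candidate only on the low-confidence postcode pair and stays unmatched.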
Abstract: Techniques for data modeling are provided. Enterprise data is organized into reference data for entities that an enterprise wants to track and monitor. Relationship data is created that establishes relationships among the various entities within the enterprise data. The reference data and the relationship data are published within an enterprise data warehouse for accessing the enterprise data.
Abstract: Techniques for mapping a virtual R-Tree to an extensible-hash based file system for databases are provided. Spatial data is identified within an existing file system, which stores data for a database. Rows of the spatial data are organized into collections; each collection represents a virtual block. The virtual blocks are used to form an R-Tree spatial index that overlays an existing index for the database on the existing file system. Each row within its particular virtual block includes a pointer to its native storage location within the existing file system.
Abstract: A system, method, and computer-readable medium that facilitate dynamic skew avoidance are provided. The disclosed mechanisms advantageously do not require any statistical information regarding which values are skewed in a column on which a query is applied. Query selectivity is evaluated at a check point and thereby facilitates accurate detection of an overloaded processing module. The successful detection of an overloaded processing module causes other processing modules to stop sending more skewed rows to the overloaded processing module. Detection of an overloaded processing module is made when the overloaded processing module has received more rows than a target number of rows. Further, skewed rows that are maintained locally rather than redistributed to a detected processing module may result in more processing modules becoming overloaded. Advantageously, the disclosed mechanisms provide for a final redistribution adjustment to provide for even distribution of rows among all processing modules.
Abstract: A system, method, and computer-readable medium that facilitate counting the number of distinct values in several columns of a table utilizing parallel aggregation mechanisms.
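The abstract above does not spell out its aggregation mechanism, so the following is only a generic two-phase sketch of the idea: each partition computes local distinct sets for every requested column in one pass, and a global step merges them. The names `local_distinct` and `count_distinct` are hypothetical, and a thread pool stands in for parallel processing modules.

```python
from concurrent.futures import ThreadPoolExecutor

def local_distinct(rows, columns):
    """Local phase: collect the distinct values of each column in one partition."""
    return {c: {row[c] for row in rows} for c in columns}

def count_distinct(partitions, columns):
    """Global phase: union the per-partition sets and count each column."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda part: local_distinct(part, columns), partitions))
    merged = {c: set() for c in columns}
    for partial in partials:
        for c in columns:
            merged[c] |= partial[c]
    return {c: len(values) for c, values in merged.items()}

partitions = [
    [{"a": 1, "b": "x"}, {"a": 2, "b": "x"}],
    [{"a": 2, "b": "y"}],
]
counts = count_distinct(partitions, ["a", "b"])
```

In this example both columns have two distinct values across the partitions, and each partition is scanned once for all columns rather than once per column.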
Abstract: A database system includes an optimizer to generate resource estimates regarding execution of a request in the database system, and a regulator to monitor execution of a request and to adjust a priority level of the request based on the monitored execution and based on the resource estimates provided by the optimizer. The regulator is executable to further feed back statistics regarding execution of the request to the optimizer to improve accuracy of resource estimates provided by the optimizer.
Type:
Grant
Filed:
June 11, 2009
Date of Patent:
August 26, 2014
Assignee:
Teradata US, Inc.
Inventors:
Douglas P. Brown, Anita Richards, Louis M. Burger, Stephen A. Brobst
Abstract: Techniques for accessing a parallel database system via an external program using vertical and/or horizontal partitioning are provided. An external program to a database management system (DBMS) configures external mappers to process a specific portion of query results on specific access module processors (AMPs) of the DBMS that are to house query results. The query is submitted by the external program to the DBMS and the DBMS is directed to organize the query results in a vertical or horizontal manner. Each external mapper accesses its portion of the query results for processing in parallel on its designated AMP or set of AMPs to process the query results.
Abstract: Techniques for data assignment from an external distributed file system (DFS) to a database management system (DBMS) are provided. Data blocks from the DFS are represented as first nodes and access module processors of the DBMS are represented as second nodes. A graph is produced with the first and second nodes. Assignments are made for the first nodes to the second nodes based on evaluation of the graph to integrate the DFS with the DBMS.
Type:
Application
Filed:
March 18, 2014
Publication date:
August 7, 2014
Applicant:
Teradata US, Inc.
Inventors:
Yan Qi, Yu Xu, Olli Pekka Kostamaa, Jian Wen
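The graph-based assignment in the abstract above can be illustrated with a simple greedy heuristic. This is a sketch under stated assumptions, not the evaluation the application describes: the function `assign_blocks` is hypothetical, an edge is assumed to mean "this AMP is a candidate for this block", and the goal is assumed to be an even load across AMPs. A matching or flow algorithm could replace the greedy step.

```python
def assign_blocks(edges, amps):
    """Assign each DFS block (first node) to one AMP (second node).

    edges maps a block to the list of AMPs it is connected to in the graph.
    """
    load = {amp: 0 for amp in amps}
    assignment = {}
    # Place the most constrained blocks first so they keep their few options.
    for block in sorted(edges, key=lambda b: (len(edges[b]), b)):
        candidates = edges[block] or amps  # fall back to any AMP if no edge exists
        target = min(candidates, key=lambda amp: (load[amp], amp))
        assignment[block] = target
        load[target] += 1
    return assignment

edges = {
    "b1": ["amp0"],
    "b2": ["amp0", "amp1"],
    "b3": ["amp0", "amp1"],
}
assignment = assign_blocks(edges, ["amp0", "amp1"])
```

With these inputs, `b1` can only go to `amp0`, so `b2` is steered to the idle `amp1`, and `b3` breaks the resulting tie by name.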
Abstract: There is provided a method, a system and a machine readable medium to optimize storage allocation in a database management system. The method comprises receiving a processing step at a step processing module of an access module processor from a dispatcher module. The method further comprises determining whether a fast access storage flag is set in the processing step, the fast access storage flag indicating use of an intermediate file in fast access storage to store one or more redistributed data rows of a table of a database that is distributed across one or more storage devices of the database management system. Yet further, the method comprises selectively allocating a free fast access storage data block to the intermediate file from a fast access storage pool based on the determination that the fast access storage flag is set. Lastly, the method comprises writing a redistributed data row from the one or more redistributed data rows to the allocated fast access storage data block.
Abstract: Techniques for data store list generation and management are provided. A user supplies criteria for a list via a graphical user interface tool. The criteria are used to generate a query, and the query when executed against a data store produces results representing the list. The list may then be used for a variety of purposes.
Type:
Grant
Filed:
December 28, 2006
Date of Patent:
July 29, 2014
Assignee:
Teradata US, Inc.
Inventors:
Paul H. Phibbs, Jr., Thomas Kevin Ryan, Linette Draper
Abstract: Several methods and a system for workload management of a concurrently accessed database server are disclosed. In one embodiment, a method includes applying a weight to a service class. The method also includes generating a priority of the service class. In addition, the method includes selecting a group based on the weight of the service class. The method further includes determining a priority level based on the priority of the service class. The method also includes generating a characteristic of a shadow process through the weight and the priority of the service class. In addition, the method includes executing a query.
Type:
Grant
Filed:
April 5, 2011
Date of Patent:
July 22, 2014
Assignee:
Teradata US, Inc.
Inventors:
Daniel Braga De Faria, Mohit Aron, Hariharan Kolam Govindarajan
Abstract: Apparatus, systems, and methods may operate to receive a designation of multiple rows to supply data to a single user defined function, which is made available in a structured query language SELECT statement. Further activities may include retrieving the data from at least one storage medium, packing each of the multiple rows having a common key into a single row, and transforming the data from a first state into a second state by applying the single function to the data using a single access module processor. Other apparatus, systems, and methods are disclosed.
Type:
Grant
Filed:
October 26, 2010
Date of Patent:
July 15, 2014
Assignee:
Teradata US, Inc.
Inventors:
Lorenzo Danesi, Zhenrong Li, Blazimir Radovic
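The packing step in the abstract above (rows sharing a common key are packed into a single row, then one function is applied) can be sketched as follows. The helpers `pack_rows` and `apply_packed` are hypothetical names, rows are modeled as dictionaries, and the user-defined function is an ordinary Python callable rather than a SQL UDF.

```python
from collections import defaultdict

def pack_rows(rows, key):
    """Pack all rows that share the same key value into one list per key."""
    packed = defaultdict(list)
    for row in rows:
        packed[row[key]].append(row)
    return dict(packed)

def apply_packed(rows, key, func):
    """Apply a single function once per packed row."""
    return {k: func(group) for k, group in pack_rows(rows, key).items()}

rows = [{"id": 1, "v": 2}, {"id": 1, "v": 3}, {"id": 2, "v": 5}]
totals = apply_packed(rows, "id", lambda group: sum(r["v"] for r in group))
```

Because each key's rows arrive together, the function sees all of its input in one call, which is what lets a single processor handle a key without coordinating with others in this sketch.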