Abstract: A system, method, and computer-readable medium for dynamic detection and management of data skew in parallel join operations are provided. Receipt of an excessive number of redistributed rows by a processing module is detected thereby identifying the processing module as a hot processing module. Other processing modules then terminate redistribution of rows to the hot processing module and maintain rows of a skewed table of the join operation that would be redistributed to the hot processing module in a local spool. Rows of a smaller table that would be redistributed to the hot processing module are duplicated to each processing module involved in the join operation. Rows of tables that are to be redistributed by a processing module to any processing module excluding the hot processing module are redistributed accordingly and maintained locally by the processing module. The join operation is completed by merging results of local join data sets of each processing module.
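The skew handling described in the abstract above can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the function name `skew_aware_join`, the `threshold` parameter, and the row layout are hypothetical, and detection is simplified to a counting pass done before any row moves, whereas the abstract describes detection while redistribution is in progress.

```python
from collections import Counter, defaultdict

def skew_aware_join(big_parts, small_parts, threshold):
    """Join two tables that are spread over n simulated processing modules.

    big_parts / small_parts hold one list of (key, payload) rows per module.
    Keys are assumed to be ints so that hash(key) is stable across runs.
    """
    n = len(big_parts)

    def dest(key):
        # Hash redistribution target for a join key.
        return hash(key) % n

    # Simplified detection: a module is "hot" when it would receive more
    # than `threshold` rows of the big (skewed) table.
    incoming = Counter(dest(k) for part in big_parts for k, _ in part)
    hot = {m for m, count in incoming.items() if count > threshold}

    big_redis = defaultdict(list)    # big rows redistributed to a module
    big_local = defaultdict(list)    # big rows kept in the sender's local spool
    small_redis = defaultdict(list)  # small rows redistributed to a module
    small_dup = []                   # small rows duplicated to every module

    for m, part in enumerate(big_parts):
        for row in part:
            d = dest(row[0])
            (big_local[m] if d in hot else big_redis[d]).append(row)
    for part in small_parts:
        for row in part:
            d = dest(row[0])
            if d in hot:
                small_dup.append(row)
            else:
                small_redis[d].append(row)

    def local_join(big_rows, small_rows):
        index = defaultdict(list)
        for k, v in small_rows:
            index[k].append(v)
        return [(k, b, s) for k, b in big_rows for s in index.get(k, ())]

    # Each module joins its two local data sets; the results are merged.
    result = []
    for m in range(n):
        result += local_join(big_redis[m], small_redis[m])
        result += local_join(big_local[m], small_dup)
    return result, hot

big_parts = [
    [(0, "a1"), (0, "a2"), (1, "a3")],
    [(0, "b1"), (0, "b2"), (2, "b3")],
    [(0, "c1"), (4, "c2")],
]
small_parts = [[(0, "s0")], [(1, "s1")], [(2, "s2"), (4, "s4")]]
joined, hot = skew_aware_join(big_parts, small_parts, threshold=3)
```

In this example key 0 dominates the big table, so module 0 is flagged as hot, the rows with key 0 never leave their source modules, and the single matching small-table row is copied everywhere instead.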
Abstract: Techniques for scoring and comparing query execution plans are provided. Predefined parameter types are identified in query execution plans and predefined weighted values are assigned to any identified parameters within the query execution plans. The weights are summed on a per processing step basis and the sum of the processing steps represents a total score for a particular query execution plan. The total scores or individual step scores from different query execution plans can then be compared or evaluated against one another for optimization and problem detection analysis.
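A minimal sketch of the per-step weighted scoring described above. The parameter names and weight values in `WEIGHTS` are hypothetical; the abstract does not list concrete parameter types or weights.

```python
from typing import Dict, List

# Hypothetical parameter types and weights, chosen only for illustration.
WEIGHTS: Dict[str, float] = {
    "full_table_scan": 5.0,
    "redistribution": 3.0,
    "product_join": 8.0,
    "index_access": 1.0,
}

def score_step(step_params: List[str], weights: Dict[str, float] = WEIGHTS) -> float:
    """Sum the weights of the recognized parameters found in one plan step."""
    return sum(weights[p] for p in step_params if p in weights)

def score_plan(plan: List[List[str]], weights: Dict[str, float] = WEIGHTS) -> float:
    """Total plan score: the sum of its per-step scores."""
    return sum(score_step(step, weights) for step in plan)

# Two hypothetical plans for the same query, each a list of steps.
plan_a = [["full_table_scan", "redistribution"], ["product_join"]]
plan_b = [["index_access"], ["redistribution", "index_access"]]

step_scores_a = [score_step(s) for s in plan_a]
total_a = score_plan(plan_a)
total_b = score_plan(plan_b)
```

Keeping the step scores as well as the totals allows either whole plans or individual steps to be compared, as the abstract suggests.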
Abstract: Embodiments of the present invention provide for batch and incremental loading of data into a database. In the present invention, the loader infrastructure utilizes machine code database instructions and hardware acceleration to parallelize the load operations with the I/O operations. A large, hardware accelerator memory is used as staging cache for the load process. The load process also comprises an index profiling phase that enables balanced partitioning of the created indexes to allow for pipelined load. The online incremental loading process may also be performed while serving queries.
Type:
Application
Filed:
January 4, 2011
Publication date:
April 28, 2011
Applicant:
TERADATA US, INC.
Inventors:
James Shau, Krishnan Meiyyappan, Hung Tran, Ravi Krishnamurthy, Kapil Surlaker, Jeremy Branscome, Joseph I. Chamdani
Abstract: A database system includes a storage to store a view containing results of a cube-based operation on at least one base table, with the view containing a first result set for a group-by on a first grouping set, and a second result set for a group-by on a second grouping set. In response to a change to the at least one base table, a controller updates the first result set by computing a change to the first result set based on a change in the at least one base table, and updates the second result set by computing a change to the second result set based on the change to the first result set.
Type:
Grant
Filed:
November 12, 2003
Date of Patent:
April 26, 2011
Assignee:
Teradata US, Inc.
Inventors:
Hong Gui, Ambuj Shatdal, Curt J. Ellmann
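The cascading update in the cube-view abstract above can be sketched as follows, assuming a SUM aggregate (which can be maintained from inserts and deletes alone). The column names, the function names, and the use of negative amounts to represent deleted rows are all assumptions made for this example.

```python
from collections import defaultdict

def delta_fine(changes):
    """Change to the (region, product) result set, from base-table changes.

    Each change is (region, product, amount); a deleted row is represented
    here by a negative amount.
    """
    delta = defaultdict(int)
    for region, product, amount in changes:
        delta[(region, product)] += amount
    return dict(delta)

def delta_coarse(fine_delta):
    """Change to the (region) result set, derived only from the finer delta."""
    delta = defaultdict(int)
    for (region, _product), amount in fine_delta.items():
        delta[region] += amount
    return dict(delta)

def apply_delta(result_set, delta):
    """Fold a delta into a stored result set in place."""
    for key, amount in delta.items():
        result_set[key] = result_set.get(key, 0) + amount

# Stored view: one result set per grouping set.
view_region_product = {("east", "pen"): 10, ("east", "ink"): 5, ("west", "pen"): 7}
view_region = {"east": 15, "west": 7}

base_changes = [("east", "pen", 3), ("west", "pen", -2), ("west", "ink", 4)]

d1 = delta_fine(base_changes)   # computed from the base-table change
d2 = delta_coarse(d1)           # computed from the first delta, not the base table
apply_delta(view_region_product, d1)
apply_delta(view_region, d2)
```

The second delta never touches the base table, which is the point of the approach: the coarser grouping set is refreshed from the already aggregated change to the finer one.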
Abstract: A computer-implemented method for estimation of order-based statistics on slowly changing distributions of data stored on a computer. An initial set of data is converted to an initial histogram based representation of the data set's distribution. New or removed data is converted into a new histogram separate from the initial histogram. The new histogram is combined with the initial histogram to build a combined histogram. Percentiles and order-based statistics are estimated from the combined histogram to provide analysis of a combination of the initial set of data combined with the new or removed data.
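One way to picture the histogram combination described above is the sketch below. It assumes fixed, shared bucket edges and linear interpolation within a bucket; the abstract does not specify either choice, and all function names are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence

def build_histogram(values: Sequence[float], edges: Sequence[float]) -> List[int]:
    """Count values per bucket [edges[i], edges[i+1]); the last bucket is closed."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            raise ValueError(f"value {v} is outside the histogram range")
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts

def combine(initial: Sequence[int], delta: Sequence[int], sign: int = 1) -> List[int]:
    """Add (sign=1) or subtract (sign=-1) a delta histogram bucket by bucket."""
    return [a + sign * b for a, b in zip(initial, delta)]

def estimate_percentile(counts: Sequence[int], edges: Sequence[float], q: float) -> float:
    """Estimate the q-th percentile (0 to 100) by interpolating inside a bucket."""
    total = sum(counts)
    if total == 0:
        raise ValueError("empty histogram")
    target = q / 100.0 * total
    cumulative = 0
    for i, c in enumerate(counts):
        if c > 0 and cumulative + c >= target:
            frac = (target - cumulative) / c
            return edges[i] + frac * (edges[i + 1] - edges[i])
        cumulative += c
    return edges[-1]

edges = [0, 10, 20, 30]
initial_hist = build_histogram([1, 2, 3, 12, 15], edges)   # initial data set
new_hist = build_histogram([21, 25, 28, 29, 11], edges)    # newly arrived data
combined = combine(initial_hist, new_hist)
median = estimate_percentile(combined, edges, 50)
```

Removed data can be handled the same way by building a histogram of the removed values and calling `combine` with `sign=-1`.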
Abstract: A method, apparatus, and article of manufacture provide the ability to define a view of data in a computer system. A relational database management system (RDBMS) executes and stores the information in the computer system. As part of a process and framework, a series of business rules and process workflows are maintained to manage hierarchical data that resides in RDBMS tables. User input is accepted that defines a hierarchy that is projected onto the data. The hierarchy may be parent-child relationships with no level consistency. Alternatively, the hierarchy may have branches and levels, with each of the levels having a consistent meaning but inconsistent depths due to one level of a branch being unpopulated. The hierarchy is stored as metadata in the RDBMS and utilized to graphically visualize, manage, and manipulate the data.
Type:
Application
Filed:
June 24, 2010
Publication date:
March 31, 2011
Applicant:
TERADATA US, INC.
Inventors:
Thomas K. Ryan, Carl L. Christofferson, Neelesh V. Bansode, Vivek Shandilya, Latesh Pant, Madhavi Chandrashekhar
Abstract: A method, system, apparatus, and article of manufacture provide the ability to visualize master data management (MDM) data as part of an MDM workflow user interface (UI) in a computer system. MDM data resides in one or more tables of a relational database management system. An MDM system maintains, as part of a process and framework, a first process workflow to manage relationship data. The relationship data is data required to manage an association of one piece of MDM data to another piece of MDM data. A first process workflow provides a UI node that contains a link to a file that describes UI components to display when the first process workflow is executed. A first component of the UI component identifies an Adobe™ Flex™ based UI component. The Adobe™ Flex™ based UI component enables the representation and viewing of the MDM data in a hierarchy.
Type:
Application
Filed:
May 25, 2010
Publication date:
March 31, 2011
Applicant:
TERADATA US, INC.
Inventors:
Thomas K. Ryan, Neelesh V. Bansode, Carl L. Christofferson
Abstract: A multi-dimensional table having plural dimensions is stored in a database system, where plural grouping combinations of the plural dimensions define corresponding subsets of the multi-dimensional table. An aggregate measure for each of at least some of the plural subsets is computed, where the aggregate measure for a particular subset represents a relationship of the particular subset to one or more parents of the particular subset. Less than all of the at least some subsets are selected to materialize based on the aggregate measures.
Abstract: A method of graphically displaying the path of a customer traversing a web site and related business data is described. The method includes receiving a user request for a visualization. The user request may include data filters and exclusions. Responsive to the user request, traffic data is selected for analysis. The selected traffic data is analyzed and displayed to the user. The display may be in the form of a visualization including a graph and related business data. The graph may be of an overview, referral, path, page-to-page path, and animation type. A system for visualizing traffic patterns and the path of a customer at a site is described in conjunction with the above method. The system includes a logical data model, a dimensional data model, a report specification, a graphical interface, metadata database and an analysis report. The graphical interface is used for viewing visualization information.
Type:
Grant
Filed:
March 5, 2003
Date of Patent:
March 29, 2011
Assignee:
Teradata US, Inc.
Inventors:
Paul Cereghini, Kavitha Devarakonda, Giai Do, Eric Dunsker, Ahsan U. Haque, Karen Papierniak, Sreedhar Srikant, Ellen Boerger
Abstract: A system and method include obtaining a query and identifying an aggregate join index (AJI) at a high level of aggregation. The dimension table may be rolled-up with the grouping key being the union of the grouping key in the AJI and the grouping key of the query. The identified AJI is joined with the rolled-up dimension table to obtain columns in the query that are not in the identified AJI. The joined AJI and rolled-up dimension table are then rolled up to answer the query.
Abstract: Embodiments of the present invention provide a hardware accelerator that assists a host database system in processing its queries. The hardware accelerator comprises special purpose processing elements that are capable of receiving database query/operation tasks in the form of machine code database instructions, executing them in hardware without software, and returning the query/operation result back to the host system. For example, table and column descriptors are embedded in the machine code database instructions. For ease of installation, the hardware accelerators employ a standard interconnect, such as a PCIe or HT interconnect. The processing elements implement a novel dataflow design and Inter Macro-Op Communication (IMC) data structures to execute the machine code database instructions. The hardware accelerator may also comprise a relatively large memory to enhance the hardware execution of the query/operation tasks requested.
Type:
Grant
Filed:
August 27, 2007
Date of Patent:
March 15, 2011
Assignee:
Teradata US, Inc.
Inventors:
Jeremy Branscome, Michael Corwin, Liuxi Yang, Joseph I. Chamdani
Abstract: Methods, data structures, and systems for generating customer segmentation models are provided. Basket transactions are analyzed and classified into a first segment type, a second segment type, a third segment type, or a fourth segment type. A number of the basket transactions within a number of the segment types are separately analyzed to determine sub classifications or sub segments within a particular segment type. Each basket transaction is augmented with a segment type that identifies the segment type classification, and a number of the basket transactions include a segment identifier that identifies the sub segment within a segment type that a basket transaction is associated with. The augmented basket transactions represent a customer segmentation model. In one embodiment, daily transactions are monitored by a script and used to dynamically adjust the customer segmentation model.
Abstract: A database system includes a plurality of access modules and corresponding persistent storage devices each having a pool of storage elements that can be allocated to store permanent files and temporary files. Each access module is associated with a non-persistent file management context and each storage device contains a persistent file management context. The persistent file management context indicates allocation of permanent files, while the non-persistent file management context indicates the allocation of both permanent and temporary files.
Type:
Grant
Filed:
December 8, 2000
Date of Patent:
March 8, 2011
Assignee:
Teradata US, Inc.
Inventors:
Gregory H. Milby, Steven C. Grolemund, Susan E. Choo
Abstract: Embodiments of the present invention provide for batch and incremental loading of data into a database. In the present invention, the loader infrastructure utilizes machine code database instructions and hardware acceleration to parallelize the load operations with the I/O operations. A large, hardware accelerator memory is used as staging cache for the load process. The load process also comprises an index profiling phase that enables balanced partitioning of the created indexes to allow for pipelined load. The online incremental loading process may also be performed while serving queries.
Type:
Grant
Filed:
June 23, 2008
Date of Patent:
February 22, 2011
Assignee:
Teradata US, Inc.
Inventors:
James Shau, Krishnan Meiyyappan, Hung Tran, Ravi Krishnamurthy, Kapil Surlaker, Jeremy Branscome, Joseph I. Chamdani
Abstract: Apparatus, systems, and methods may operate to receive a designation of multiple rows to supply data to a single user defined function, which is made available in a structured query language SELECT statement. Further activities may include retrieving the data from at least one storage medium, packing each of the multiple rows having a common key into a single row, and transforming the data from a first state into a second state by applying the single function to the data using a single access module processor. Other apparatus, systems, and methods are disclosed.
Type:
Application
Filed:
October 26, 2010
Publication date:
February 17, 2011
Applicant:
Teradata US, Inc.
Inventors:
Lorenzo Danesi, Zhenrong Li, Blazimir Radovic
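The row packing described in the abstract of the entry above (multiple rows with a common key packed into a single row, then passed to one function) might look like the sketch below. The function names and sample data are hypothetical, and the single access module processor that the abstract mentions is not modeled here.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

def pack_rows(rows: Iterable[Tuple[Hashable, float]]) -> Dict[Hashable, List[float]]:
    """Pack all rows that share a key into one row holding a list of values."""
    packed = defaultdict(list)
    for key, value in rows:
        packed[key].append(value)
    return dict(packed)

def apply_udf(packed: Dict[Hashable, List[float]],
              udf: Callable[[List[float]], float]) -> Dict[Hashable, float]:
    """Apply a single user-defined function once per packed row."""
    return {key: udf(values) for key, values in packed.items()}

rows = [("acct1", 10), ("acct2", 4), ("acct1", 5), ("acct2", 6), ("acct3", 1)]
packed = pack_rows(rows)

# Hypothetical user-defined function: the spread of values for each key.
spread = apply_udf(packed, lambda values: max(values) - min(values))
```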
Abstract: Techniques for using metadata as comments to assist with search problem determination and analysis are provided. Before an action is taken on a search, contextual information is gathered as metadata about the action and actor requesting the action. The metadata is embedded in the search as comments and the comments are subsequently logged when the action is performed on the search. The comments combine with other comments previously recorded to permit subsequent analysis on searches.
Abstract: Techniques for discovering database connectivity leaks are presented. Each connection made by an application to a database is monitored. When the application is shut down, if information regarding a particular connection remains in memory, then that connection is reported as a potential database connectivity leak.
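The monitoring idea above can be sketched with a small tracker. The class name `ConnectionTracker` and its methods are hypothetical; a real tool would hook the database driver's connect and close calls rather than rely on explicit notifications.

```python
import traceback
from typing import Dict, List

class ConnectionTracker:
    """Record every opened connection and forget it when it is closed."""

    def __init__(self) -> None:
        # Connection id mapped to the call stack captured when it was opened.
        self._open: Dict[int, str] = {}

    def opened(self, conn_id: int) -> None:
        self._open[conn_id] = "".join(traceback.format_stack(limit=3))

    def closed(self, conn_id: int) -> None:
        self._open.pop(conn_id, None)

    def report_on_shutdown(self) -> List[int]:
        """Connections still tracked at shutdown are potential leaks."""
        return sorted(self._open)

tracker = ConnectionTracker()
for conn_id in (1, 2, 3):
    tracker.opened(conn_id)
tracker.closed(2)                      # only connection 2 is released

potential_leaks = tracker.report_on_shutdown()
```

Storing the opening call stack is a design choice made for this sketch: it lets the shutdown report point at the code that acquired the connection, not just at the connection itself.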
Abstract: A data-warehousing system allows various areas of an enterprise to view data at varying levels of data freshness. The system acquires data that represents an event in the life of a business enterprise, such as a transaction between the enterprise and one of its customers, and loads this data into a database table. The system then makes the data available for retrieval from the table and stores information indicating when the data was made available for retrieval. In some embodiments, the system also acquires data that is related to and more current than the data representing the event and stores the more current data in the database. The system then stores information indicating when the more current data was stored in the database. Such a data warehouse allows decision-makers in the business to see some information (e.g., customer transaction or account data) up-to-the-moment and other information as it stood at some specific point-in-time, such as at the end of the previous month.
Abstract: A SQL query that includes an IN-List is optimized by (1) performing an evaluation to determine whether access to a table can be performed as a join operation, (2) converting the IN-List to an IN-List relation, and (3) joining the IN-List relation with the table to access the data in the table.
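Steps (2) and (3) above can be illustrated by hand with Python's built-in `sqlite3` module. This is a manual rewrite for illustration, not the optimizer's internal mechanism; the table `orders`, the temporary table `in_list_rel`, and the sample data are hypothetical, and the evaluation in step (1) is not modeled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "cy"), (4, "dee")],
)

in_list = [2, 4, 7]

# Original form: a predicate with an IN-List.
original = conn.execute(
    "SELECT id, customer FROM orders WHERE id IN (2, 4, 7) ORDER BY id"
).fetchall()

# Step (2): convert the IN-List to a relation. Values are de-duplicated so
# the join cannot return a row more often than the IN predicate would.
conn.execute("CREATE TEMP TABLE in_list_rel (val INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO in_list_rel VALUES (?)", [(v,) for v in sorted(set(in_list))]
)

# Step (3): join the IN-List relation with the table.
rewritten = conn.execute(
    "SELECT o.id, o.customer FROM orders AS o "
    "JOIN in_list_rel AS r ON o.id = r.val ORDER BY o.id"
).fetchall()
conn.close()
```

Both forms return the same rows; the join form gives an optimizer the option of driving access to the table from the small relation.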
Abstract: A method, computer program, and database system are disclosed for querying tables stored on multiple processing modules. The method includes specifying module group characteristics. A plurality of modules corresponding to the module group characteristics are then identified. The identified modules are sampled for statistics concerning at least one table specified in a query. An execution plan for the query is optimized based at least in part on the sampled statistics.
Type:
Grant
Filed:
May 24, 2004
Date of Patent:
January 25, 2011
Assignee:
Teradata US, Inc.
Inventors:
Arthur Vargas Lopes, Jerry Lynn Klindt, Kuorong Chiang, Donald Raymond Pederson, Pradeep Sathyanarayan