Abstract: Query results and statistics regarding them are generated using a novel representation of an n-attribute relation as an order n relational tensor. Orders of the relational tensor respectively correspond to each of the attributes, and each coordinate along an order relates to a key value of the corresponding attribute. Numeric values are stored in the relational tensor, each numeric value representing a count of tuples having the attribute key values that correspond to the coordinate of the numeric value along the orders of the relational tensor. This storage representation is useful in a variety of contexts for enhancing the performance of an RDBMS. Specifically, a data-representing relational tensor can be used to produce results for join operations such as the SQL operations JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN and FULL OUTER JOIN.
Type:
Grant
Filed:
April 14, 2000
Date of Patent:
May 13, 2003
Assignee:
International Business Machines Corporation
Inventors:
Lance Christopher Amundsen, Robert Joseph Bestgen, Robert Douglas Driesch, Jr., Abdo Esmail Abdo, Daniel Virgil Toft
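The relational-tensor representation described in the abstract above can be illustrated with a small sketch. This is not the patented method, only one reading of the abstract: a sparse order-n tensor is modeled as a mapping from a coordinate (one key value per attribute) to a tuple count, and an inner join is computed by multiplying counts wherever the join keys match. The function names `relational_tensor` and `inner_join_counts` and the sample data are hypothetical.

```python
from collections import Counter

def relational_tensor(rows):
    """Sparse order-n tensor: maps each coordinate (a tuple of attribute
    key values) to the number of tuples having exactly those values."""
    return Counter(tuple(row) for row in rows)

def inner_join_counts(left, right, left_axis, right_axis):
    """Inner join of two relational tensors on one attribute each.
    The count at a joined coordinate is the product of the two counts."""
    # Group the right tensor's coordinates by their join-key value.
    by_key = {}
    for coord, count in right.items():
        by_key.setdefault(coord[right_axis], []).append((coord, count))

    joined = Counter()
    for lcoord, lcount in left.items():
        for rcoord, rcount in by_key.get(lcoord[left_axis], []):
            joined[lcoord + rcoord] += lcount * rcount
    return joined

# Hypothetical sample relations: orders(customer, item), prices(item, price).
orders = relational_tensor([("alice", "book"), ("alice", "book"), ("bob", "pen")])
prices = relational_tensor([("book", 12), ("pen", 2), ("ink", 5)])
result = inner_join_counts(orders, prices, left_axis=1, right_axis=0)
```

Because duplicate tuples collapse into a single count, the join touches each distinct coordinate once instead of once per row. An outer join could be sketched the same way by also emitting the unmatched coordinates of one or both sides.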
Abstract: A method and computer program product for analyzing data retrieval using index scanning in a database management system. The method involves scanning an index associated with a table in the database management system and selecting pages in the index. For each of the selected pages, the index entries are extracted and ranked. A distance parameter is then determined for each of the ranked index entries. The number of page transfers is estimated based on the distance parameters and the number of consecutive index entries which can be stored in the buffer pool.
Type:
Grant
Filed:
July 5, 2000
Date of Patent:
April 15, 2003
Assignee:
International Business Machines Corporation
Abstract: A method and system are provided for visualization of information stored in database records. Data visualizations offer a view of the underlying database records and are created and stored within the database along with the database records. A data visualization includes data visualization points that are associated with underlying database records. Each data visualization point provides a direct drill-down capability, allowing for the full record display of the underlying database record associated with the visualization point. Specific user interface mechanisms are provided with the data visualizations to allow a user to navigate among the database records as well as to manipulate the data stored in the underlying database records. Data manipulation may occur as a result of direct user interaction with the visualization points or through a specific user interface designed to allow data modification.
Type:
Grant
Filed:
January 14, 2000
Date of Patent:
March 4, 2003
Assignee:
International Business Machines Corporation
Inventors:
John F. Patterson, Steven L. Rohall, Arjuna Wijeyekoon
Abstract: Emerging patterns (EPs) are itemsets having supports that change significantly from one dataset to another. A classifier, CAEP, is disclosed using the following main ideas based on EPs: (i) Each EP can sharply differentiate the class membership of a (possibly small) fraction of instances containing the EP, due to the big difference between the EP's supports in the opposing classes; the differentiating power of the EP is defined in terms of the EP's supports and ratio, on instances containing the EP. (ii) For each instance t, by aggregating (124) the differentiating power of a fixed, automatically selected set of EPs, a score is obtained for each class (126). The scores for all classes are normalized (144) and the largest score determines t's class (146). CAEP is suitable for many applications, even those with large volumes of high dimensional data. CAEP does not depend on dimension reduction on data and is usually equally accurate on all classes even if their populations are unbalanced.
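The aggregate-then-normalize scoring described in the abstract above can be sketched as follows. The abstract says only that an EP's differentiating power is defined in terms of its supports and ratio, so the specific weight used here, growth / (growth + 1) multiplied by the EP's support in the class, is an assumption, as is normalizing each class score by a per-class base score. All names and numbers are hypothetical.

```python
def caep_scores(instance, eps_by_class):
    """Aggregate a raw score per class for one instance.
    eps_by_class maps class -> list of (itemset, support_in_class, growth_rate).
    Assumed weight per EP: growth / (growth + 1) * support_in_class."""
    items = set(instance)
    scores = {}
    for cls, eps in eps_by_class.items():
        total = 0.0
        for itemset, support, growth in eps:
            if set(itemset) <= items:  # the instance contains this EP
                # An infinite growth rate (EP absent from the other class)
                # is treated as a weight of 1.
                weight = 1.0 if growth == float("inf") else growth / (growth + 1.0)
                total += weight * support
        scores[cls] = total
    return scores

def classify(instance, eps_by_class, base_scores):
    """Normalize each raw score by an assumed per-class base score,
    then pick the class with the largest normalized score."""
    raw = caep_scores(instance, eps_by_class)
    normalized = {cls: raw[cls] / base_scores[cls] for cls in raw}
    return max(normalized, key=normalized.get), normalized

# Hypothetical EPs and base scores for two classes.
eps = {
    "edible": [(("smooth",), 0.6, 3.0), (("white", "smooth"), 0.3, float("inf"))],
    "poisonous": [(("odor",), 0.5, 9.0)],
}
base = {"edible": 0.5, "poisonous": 0.4}
label, norm = classify({"white", "smooth"}, eps, base)
```

Normalizing by a per-class base score is one way to keep a class with many EPs from dominating a class with few, which matches the abstract's claim that accuracy holds up when class populations are unbalanced.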