Patents by Inventor Pekka Kostamaa

Pekka Kostamaa has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20180181621
    Abstract: A system and method for random sampling of distributed data, including distributed data streams. The system and method use a multi-level reservoir sampling technique that leverages the conventional reservoir sampling algorithm for distributed data or distributed data streams. The method establishes an intermediate reservoir for each distributed data source or data stream and populates the intermediate reservoirs with a sample of data elements received from each distributed data source or data stream. A final reservoir is established and data elements are randomly selected from each one of the intermediate reservoirs to populate the final reservoir.
    Type: Application
    Filed: December 22, 2016
    Publication date: June 28, 2018
    Applicant: Teradata US, Inc.
    Inventors: Mohammed Hussein Al-Kateb, Olli Pekka Kostamaa
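The two-level scheme in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the method claimed in the application: the function names `reservoir_sample` and `merge_reservoirs` are hypothetical, and drawing from each intermediate reservoir in proportion to how many elements its source has produced is one common way to merge reservoirs that the abstract itself does not spell out.

```python
import random


def reservoir_sample(stream, k, rng):
    """Conventional reservoir sampling (Algorithm R) over one source.

    Returns the intermediate reservoir and the number of elements seen.
    """
    reservoir = []
    seen = 0
    for item in stream:
        seen += 1
        if len(reservoir) < k:
            reservoir.append(item)
        else:
            j = rng.randrange(seen)  # uniform in [0, seen)
            if j < k:
                reservoir[j] = item
    return reservoir, seen


def merge_reservoirs(intermediates, k, rng):
    """Populate a final reservoir of up to k elements.

    Assumption (not from the abstract): each draw picks a source with
    probability proportional to its remaining element count, then takes
    a random element from that source's intermediate reservoir.
    """
    pools = [(list(r), n) for r, n in intermediates if r]
    final = []
    while len(final) < k and pools:
        weights = [n for _, n in pools]
        idx = rng.choices(range(len(pools)), weights=weights)[0]
        pool, n = pools[idx]
        final.append(pool.pop(rng.randrange(len(pool))))
        if pool and n > 1:
            pools[idx] = (pool, n - 1)
        else:
            pools.pop(idx)
    return final


if __name__ == "__main__":
    rng = random.Random(42)
    sources = [range(0, 100), range(100, 150), range(150, 153)]
    intermediates = [reservoir_sample(s, 5, rng) for s in sources]
    print(merge_reservoirs(intermediates, 5, rng))
```

Each source is scanned once, and only the small intermediate reservoirs and their counts need to travel to wherever the final reservoir is built.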
  • Patent number: 9424260
    Abstract: Techniques for data assignment from an external distributed file system (DFS) to a database management system (DBMS) are provided. Data blocks from the DFS are represented as first nodes and access module processors of the DBMS are represented as second nodes. A graph is produced with the first and second nodes. Assignments are made for the first nodes to the second nodes based on evaluation of the graph to integrate the DFS with the DBMS.
    Type: Grant
    Filed: March 18, 2014
    Date of Patent: August 23, 2016
    Assignee: Teradata US, Inc.
    Inventors: Yan Qi, Yu Xu, Olli Pekka Kostamaa, Jian Wen
  • Patent number: 9336270
    Abstract: Techniques for accessing a parallel database system via an external program using vertical and/or horizontal partitioning are provided. An external program to a database management system (DBMS) configures external mappers to process a specific portion of query results on specific access module processors (AMPs) of the DBMS that are to house query results. The query is submitted by the external program to the DBMS and the DBMS is directed to organize the query results in a vertical or horizontal manner. Each external mapper accesses its portion of the query results for processing in parallel on its designated AMP or set of AMPs to process the query results.
    Type: Grant
    Filed: March 18, 2014
    Date of Patent: May 10, 2016
    Assignee: Teradata US, Inc.
    Inventors: Yu Xu, Olli Pekka Kostamaa
  • Patent number: 9235590
    Abstract: A database system may implement compression management of tables in the database system. The compression management may include determination of a pattern of usage of various database tables in the database system. Based on this pattern of usage, the database tables may be selected as candidates for compression or decompression at the appropriate time. In one example, the pattern of usage may be based on the contents of a query log of the database system. The compression management may also include evaluation of various compression strategies to apply to a candidate database table. Each compression strategy may be evaluated to determine if application to a database table or a portion of the database table would be beneficial based on various conditions. The compression management may also include consideration of each available compression strategy to be applied solely or in combination with one another.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: January 12, 2016
    Assignee: Teradata US, Inc.
    Inventors: Guilian Wang, Olli Pekka Kostamaa, Gary Allen Roberts, Steven Cohen, John R. Catozzi
  • Patent number: 8938444
    Abstract: Techniques for external application-directed data partitioning in data exported from a parallel database management system (DBMS) are provided. An external application sends a query, a total number of requested access module processors (AMPs), and an application-defined data partitioning expression to the DBMS. The DBMS executes the query with the results vertically partitioned on the identified number of AMPs. Individual external mappers access their assigned AMPs asking for specific partitions that they are assigned to process the query results.
    Type: Grant
    Filed: December 29, 2011
    Date of Patent: January 20, 2015
    Assignee: Teradata US, Inc.
    Inventors: Yu Xu, Olli Pekka Kostamaa
  • Patent number: 8832074
    Abstract: A system, method, and computer-readable medium that facilitate dynamic skew avoidance are provided. The disclosed mechanisms advantageously do not require any statistic information regarding which values are skewed in a column on which a query is applied. Query selectivity is evaluated at a check point and thereby facilitates accurate detection of an overloaded processing module. The successful detection of an overloaded processing module causes other processing modules to stop sending more skewed rows to the overloaded processing module. Detection of an overloaded processing module is made when the overloaded processing module has received more rows than a target number of rows. Further, skewed rows that are maintained locally rather than redistributed to a detected processing module may result in more processing modules becoming overloaded. Advantageously, the disclosed mechanisms provide for a final redistribution adjustment to provide for even distribution of rows among all processing modules.
    Type: Grant
    Filed: October 21, 2009
    Date of Patent: September 9, 2014
    Assignee: Teradata US, Inc.
    Inventors: Xin Zhou, Olli Pekka Kostamaa
  • Publication number: 20140222871
    Abstract: Techniques for data assignment from an external distributed file system (DFS) to a database management system (DBMS) are provided. Data blocks from the DFS are represented as first nodes and access module processors of the DBMS are represented as second nodes. A graph is produced with the first and second nodes. Assignments are made for the first nodes to the second nodes based on evaluation of the graph to integrate the DFS with the DBMS.
    Type: Application
    Filed: March 18, 2014
    Publication date: August 7, 2014
    Applicant: Teradata US, Inc.
    Inventors: Yan Qi, Yu Xu, Olli Pekka Kostamaa, Jian Wen
  • Publication number: 20140222787
    Abstract: Techniques for accessing a parallel database system via an external program using vertical and/or horizontal partitioning are provided. An external program to a database management system (DBMS) configures external mappers to process a specific portion of query results on specific access module processors (AMPs) of the DBMS that are to house query results. The query is submitted by the external program to the DBMS and the DBMS is directed to organize the query results in a vertical or horizontal manner. Each external mapper accesses its portion of the query results for processing in parallel on its designated AMP or set of AMPs to process the query results.
    Type: Application
    Filed: March 18, 2014
    Publication date: August 7, 2014
    Applicant: Teradata US, Inc.
    Inventors: Yu Xu, Olli Pekka Kostamaa
  • Patent number: 8713057
    Abstract: Techniques for data assignment from an external distributed file system (DFS) to a database management system (DBMS) are provided. Data blocks from the DFS are represented as first nodes and access module processors of the DBMS are represented as second nodes. A graph is produced with the first and second nodes. Assignments are made for the first nodes to the second nodes based on evaluation of the graph to integrate the DFS with the DBMS.
    Type: Grant
    Filed: December 29, 2011
    Date of Patent: April 29, 2014
    Assignee: Teradata US, Inc.
    Inventors: Yan Qi, Yu Xu, Olli Pekka Kostamaa, Jian Wen
  • Patent number: 8712994
    Abstract: Techniques for accessing a parallel database system via an external program using vertical and/or horizontal partitioning are provided. An external program to a database management system (DBMS) configures external mappers to process a specific portion of query results on specific access module processors (AMPs) of the DBMS that are to house query results. The query is submitted by the external program to the DBMS and the DBMS is directed to organize the query results in a vertical or horizontal manner. Each external mapper accesses its portion of the query results for processing in parallel on its designated AMP or set of AMPs to process the query results.
    Type: Grant
    Filed: December 29, 2011
    Date of Patent: April 29, 2014
    Assignee: Teradata US, Inc.
    Inventors: Yu Xu, Olli Pekka Kostamaa
  • Patent number: 8688722
    Abstract: To process a sequence of outer joins in a database system, the database system performs a first outer join of the sequence of outer joins. A result of the first outer join is stored in a result table stored across plural storage modules of the database system. At least a subset of records of the result table is redistributed across the storage modules according to a first join attribute of the result table, where any record of the result table that has a null value for the first join attribute is not redistributed. A second outer join of the sequence is performed using the redistributed result table and another table, where the second outer join is based on the first join attribute of the result table.
    Type: Grant
    Filed: December 16, 2009
    Date of Patent: April 1, 2014
    Assignee: Teradata US, Inc.
    Inventors: O. Pekka Kostamaa, Yu Xu
  • Patent number: 8600994
    Abstract: A small table S is outer joined to a large table L on a join condition on a database system with a plurality B of parallel units (PUs). S and L are partitioned across the PUs. Each row in S has a unique row-id. Each row of S is duplicated on all PUs to form Sdup. On each PU, dangling rows in S that do not have a match in L under the join condition are identified and the row-ids of the dangling rows are saved in Tredis. Tredis is partitioned across the PUs. P is formed from dangling rows of S whose corresponding entries in Tredis appear in all PUs. A result is produced by unioning P and I. I is formed by inner joining non-dangling rows of S with L. The result is saved.
    Type: Grant
    Filed: September 2, 2010
    Date of Patent: December 3, 2013
    Assignee: Teradata US, Inc.
    Inventors: Yu Xu, Olli Pekka Kostamaa
  • Patent number: 8543596
    Abstract: In general, a technique or mechanism is provided to efficiently transfer data of a distributed file system to a parallel database management system using an algorithm that avoids or reduces sending of blocks of files across computer nodes on which the parallel database management system is implemented.
    Type: Grant
    Filed: December 17, 2009
    Date of Patent: September 24, 2013
    Assignee: Teradata US, Inc.
    Inventors: O. Pekka Kostamaa, Keliang Zhao, Yu Xu
  • Patent number: 8510280
    Abstract: A system, method, and computer-readable medium for dynamic detection and management of data skew in parallel join operations are provided. Rows allocated to processing modules involved in a join operation are redistributed among the processing modules by a hash redistribution of the join attributes. Receipt by a processing module of an excessive number of redistributed rows having a skewed value on the join attribute is detected by a processing module which notifies other processing modules of the skewed value. Processing modules then terminate redistribution of rows having a join attribute value matching the skewed value and either store such rows locally or duplicate the rows. The processing module that has received an excessive number of redistributed rows removes rows having a skewed value of the join attribute from a redistribution spool allocated thereto and duplicates the rows to each of the processing modules.
    Type: Grant
    Filed: June 30, 2009
    Date of Patent: August 13, 2013
    Assignee: Teradata US, Inc.
    Inventors: Yu Xu, Olli Pekka Kostamaa, Xin Zhou
  • Publication number: 20130173594
    Abstract: Techniques for accessing a parallel database system via an external program using vertical and/or horizontal partitioning are provided. An external program to a database management system (DBMS) configures external mappers to process a specific portion of query results on specific access module processors (AMPs) of the DBMS that are to house query results. The query is submitted by the external program to the DBMS and the DBMS is directed to organize the query results in a vertical or horizontal manner. Each external mapper accesses its portion of the query results for processing in parallel on its designated AMP or set of AMPs to process the query results.
    Type: Application
    Filed: December 29, 2011
    Publication date: July 4, 2013
    Inventors: Yu Xu, Olli Pekka Kostamaa
  • Publication number: 20130173595
    Abstract: Techniques for external application-directed data partitioning in data exported from a parallel database management system (DBMS) are provided. An external application sends a query, a total number of requested access module processors (AMPs), and an application-defined data partitioning expression to the DBMS. The DBMS executes the query with the results vertically partitioned on the identified number of AMPs. Individual external mappers access their assigned AMPs asking for specific partitions that they are assigned to process the query results.
    Type: Application
    Filed: December 29, 2011
    Publication date: July 4, 2013
    Inventors: Yu Xu, Olli Pekka Kostamaa
  • Publication number: 20130173666
    Abstract: Techniques for data assignment from an external distributed file system (DFS) to a database management system (DBMS) are provided. Data blocks from the DFS are represented as first nodes and access module processors of the DBMS are represented as second nodes. A graph is produced with the first and second nodes. Assignments are made for the first nodes to the second nodes based on evaluation of the graph to integrate the DFS with the DBMS.
    Type: Application
    Filed: December 29, 2011
    Publication date: July 4, 2013
    Inventors: Yan Qi, Yu Xu, Olli Pekka Kostamaa, Jian Wen
  • Patent number: 8234292
    Abstract: A system, method, and computer-readable medium for optimized processing of queries that feature maximum or minimum equality conditions are provided. A table on which the query is applied is scanned a single time. Rows of the table distributed to respective processing modules are scanned by the processing modules. Each processing module maintains identification of any rows distributed to the respective processing module that have attribute values that equal the maximum or minimum attribute value locally identified by the processing module. Subsequently, a global aggregation mechanism is invoked to compute the query result without requiring an additional rescan of the table. Further, the disclosed mechanisms may be extended to compute top N queries featuring maximum or minimum equality conditions.
    Type: Grant
    Filed: December 11, 2008
    Date of Patent: July 31, 2012
    Assignee: Teradata US, Inc.
    Inventors: Yu Xu, Olli Pekka Kostamaa
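The single-scan idea in the abstract above can be illustrated with a small sketch. It assumes rows are plain dictionaries already distributed across processing modules as lists; the names `local_max_rows` and `global_max_rows` are hypothetical and do not come from the patent. It handles the maximum case only; the minimum case is symmetric.

```python
def local_max_rows(rows, key):
    """One pass over a module's rows: track the local maximum of `key`
    and every row whose value equals it."""
    best = None
    matches = []
    for row in rows:
        value = row[key]
        if best is None or value > best:
            best, matches = value, [row]
        elif value == best:
            matches.append(row)
    return best, matches


def global_max_rows(partitions, key):
    """Global aggregation step: combine the per-module results without
    scanning any partition a second time."""
    candidates = [local_max_rows(rows, key) for rows in partitions]
    candidates = [(b, m) for b, m in candidates if b is not None]
    if not candidates:
        return []
    top = max(b for b, _ in candidates)
    return [row for b, m in candidates if b == top for row in m]


if __name__ == "__main__":
    partitions = [
        [{"id": 1, "v": 3}, {"id": 2, "v": 7}],
        [{"id": 3, "v": 7}, {"id": 4, "v": 5}],
        [],
    ]
    # Equivalent in spirit to: SELECT * FROM t WHERE v = (SELECT MAX(v) FROM t)
    print(global_max_rows(partitions, "v"))
```

Only the per-module maxima and their matching rows reach the global step, so the table itself is read once.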
  • Patent number: 8234268
    Abstract: A system, method, and computer-readable medium for optimization of query processing in a parallel processing system are provided. Skewed values and non-skewed values are treated differently to improve upon conventional DISTINCT and aggregation query processing. Skewed attribute values on which a DISTINCT selection or group by aggregation is applied are allocated entries in a hash table. In this manner, a processing module may consult the hash table to determine if a skewed attribute value has been encountered during the query processing in a manner that precludes repetitive redistribution of rows with highly skewed attribute values on which a DISTINCT selection or group by aggregation is applied.
    Type: Grant
    Filed: November 25, 2008
    Date of Patent: July 31, 2012
    Assignee: Teradata US, Inc.
    Inventors: Yu Xu, Olli Pekka Kostamaa
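A rough sketch of the hash-table idea in the abstract above, for the DISTINCT case only. It assumes the set of skewed values is already known (how they are detected is not covered here), and the names `values_to_redistribute` and `parallel_distinct` are hypothetical. Python's built-in `hash` stands in for whatever hash function a real system would use.

```python
def values_to_redistribute(local_values, skewed_values):
    """Per-module filter: a skewed value is forwarded only the first time
    it is encountered; non-skewed values are always forwarded."""
    seen_skewed = set()  # plays the role of the hash table of skewed values
    out = []
    for v in local_values:
        if v in skewed_values:
            if v in seen_skewed:
                continue  # already sent once from this module
            seen_skewed.add(v)
        out.append(v)
    return out


def parallel_distinct(partitions, skewed_values, n_modules):
    """Simulate hash redistribution followed by local de-duplication.

    Returns the distinct values and the number of values sent.
    """
    received = [set() for _ in range(n_modules)]
    sent = 0
    for local in partitions:
        for v in values_to_redistribute(local, skewed_values):
            received[hash(v) % n_modules].add(v)
            sent += 1
    return set().union(*received), sent


if __name__ == "__main__":
    partitions = [[1, 1, 1, 2], [1, 3, 1]]
    print(parallel_distinct(partitions, skewed_values={1}, n_modules=2))
```

In this example seven values are present but only four are sent, because each module forwards the skewed value `1` once.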
  • Patent number: 8150836
    Abstract: A system, method, and computer-readable medium for optimizing execution of a join operation in a parallel processing system are provided. A plurality of processing nodes that have at least one row of one or more tables involved in a join operation are identified. For each of the processing nodes, respective counts of rows that would be redistributed to each of the processing nodes based on join attributes of the rows are determined. A redistribution matrix is calculated from the counts of rows of each of the processing nodes. An optimized redistribution matrix is generated from the redistribution matrix, wherein the optimized redistribution matrix provides a minimization of rows to be redistributed among the nodes to execute the join operation.
    Type: Grant
    Filed: August 19, 2008
    Date of Patent: April 3, 2012
    Assignee: Teradata US, Inc.
    Inventors: Yu Xu, Olli Pekka Kostamaa, Xin Zhou
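The redistribution matrix described in the abstract above can be illustrated with a toy example. This sketch rests on an assumption the abstract does not confirm: it treats "optimizing" the matrix as choosing which node hosts each hash bucket so that as many rows as possible are already in place. The names `build_matrix` and `best_bucket_placement` are hypothetical, and the brute-force search over permutations is only practical for a handful of nodes.

```python
from itertools import permutations


def build_matrix(node_rows, n_nodes, bucket_of):
    """matrix[i][b] = number of rows currently on node i whose join
    attribute hashes to bucket b."""
    matrix = [[0] * n_nodes for _ in range(n_nodes)]
    for node, rows in enumerate(node_rows):
        for value in rows:
            matrix[node][bucket_of(value)] += 1
    return matrix


def best_bucket_placement(matrix):
    """Try every bucket-to-node placement and keep the one that leaves
    the most rows where they already are.

    Returns (placement, rows_moved), where placement[b] is the node
    chosen to host bucket b.
    """
    n = len(matrix)
    total = sum(map(sum, matrix))
    best_perm, best_kept = None, -1
    for perm in permutations(range(n)):
        kept = sum(matrix[perm[b]][b] for b in range(n))
        if kept > best_kept:
            best_perm, best_kept = perm, kept
    return best_perm, total - best_kept


if __name__ == "__main__":
    node_rows = [[1, 1, 1, 0], [0, 0, 2], [2, 2, 2, 1]]
    matrix = build_matrix(node_rows, 3, lambda v: v % 3)
    print(matrix)
    print(best_bucket_placement(matrix))
```

For larger node counts, an assignment solver would replace the permutation search; the counting step stays the same.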