Patents by Inventor David Everett Simmen
David Everett Simmen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11023443
Abstract: A system and method for determining optimal query plans within distributed database system employing table operators for performing analytic operations for storing and processing multi-structured data. The optimization of a query plan proceeds through a collaborative exchange between a database system optimizer, or planner, and a table operator, wherein multiple communications between said optimizer and said table operator are conducted to exchange input and output information relevant to optimizing execution of the query and table operator.
Type: Grant
Filed: February 16, 2016
Date of Patent: June 1, 2021
Assignee: Teradata US, Inc.
Inventors: Derrick Poo-Ray Kondo, Tongxin Bai, Anjali Betawadkar-Norwood, Aditi Subodh Pandit, David Everett Simmen
-
Publication number: 20160239544
Abstract: A system and method for determining optimal query plans within distributed database system employing table operators for performing analytic operations for storing and processing multi-structured data. The optimization of a query plan proceeds through a collaborative exchange between a database system optimizer, or planner, and a table operator, wherein multiple communications between said optimizer and said table operator are conducted to exchange input and output information relevant to optimizing execution of the query and table operator.
Type: Application
Filed: February 16, 2016
Publication date: August 18, 2016
Applicant: Teradata US, Inc.
Inventors: Derrick Poo-Ray Kondo, Tongxin Bai, Anjali Betawadkar-Norwood, Aditi Subodh Pandit, David Everett Simmen
-
Patent number: 9418069
Abstract: A data mashup system having information extraction capabilities for receiving multiple streams of textual data, at least one of which contains unstructured textual data. A repository stores annotators that describe how to analyze the streams of textual data for specified unstructured data components. The annotators are applied to the data streams to identify and extract the specified data components according to the annotators. The extracted data components are tagged to generate structured data components and the specified unstructured data components in the input data streams are replaced with the tagged data components. The system then combines the tagged data from the multiple streams to form a mashup output data stream.
Type: Grant
Filed: May 26, 2010
Date of Patent: August 16, 2016
Assignee: International Business Machines Corporation
Inventors: Yunyao Li, Frederick Ralph Reiss, David Everett Simmen, Suresh Thalamati
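The annotate-tag-replace-combine flow in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that an annotator is a tag name paired with a regular expression; the `ANNOTATORS` table and the `annotate` and `mashup` functions are hypothetical names and are not taken from the patent.

```python
import re

# Hypothetical annotator repository: tag name -> pattern describing what to extract.
ANNOTATORS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\b\d{3}-\d{4}\b"),
}


def annotate(text):
    """Replace each matched unstructured span with a tagged version of itself."""
    for tag, pattern in ANNOTATORS.items():
        text = pattern.sub(lambda m, t=tag: f"<{t}>{m.group(0)}</{t}>", text)
    return text


def mashup(streams):
    """Annotate every record of every input stream and combine them into one output."""
    return [annotate(record) for stream in streams for record in stream]
```

Calling `mashup` with two small streams returns one list in which the recognized spans carry tags and the rest of each record is left untouched.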
-
Patent number: 8805834
Abstract: A data mashup system having information extraction capabilities for receiving multiple streams of textual data, at least one of which contains unstructured textual data. A repository stores annotators that describe how to analyze the streams of textual data for specified unstructured data components. The annotators are applied to the data streams to identify and extract the specified data components according to the annotators. The extracted data components are tagged to generate structured data components and the specified unstructured data components in the input data streams are replaced with the tagged data components. The system then combines the tagged data from the multiple streams to form a mashup output data stream.
Type: Grant
Filed: March 7, 2012
Date of Patent: August 12, 2014
Assignee: International Business Machines Corporation
Inventors: Yunyao Li, Frederick Ralph Reiss, David Everett Simmen, Suresh Thalamati
-
Patent number: 8538985
Abstract: Methods and apparatus, including computer program products, implementing and using techniques for processing a federated query in a federated database system. A federated query is received at a federated database server. A federated query execution plan is generated based on the received federated query. The federated query execution plan defines one or more source servers of the federated database and a unique subquery to be executed on each of the source servers. The subqueries are distributed to the source servers in accordance with the federated query execution plan. The respective subqueries are executed asynchronously at the source servers. The subquery results are passed to a first designated source server defined in the federated query execution plan. The subquery results are joined and aggregated at the first designated source server into a final query result. The final query result is returned to the federated database server.
Type: Grant
Filed: March 11, 2008
Date of Patent: September 17, 2013
Assignee: International Business Machines Corporation
Inventors: Anjali Betawadkar-Norwood, Hamid Pirahesh, David Everett Simmen
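As a rough illustration of the flow in the abstract above (distribute subqueries, run them concurrently, then join and aggregate the results), here is a minimal sketch. It models source servers as in-memory lists and uses a thread pool to stand in for asynchronous execution. The names `SOURCES`, `run_subquery`, and `federated_query` are hypothetical, and the join runs locally rather than at a designated source server.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical source servers, each modeled as an in-memory list of rows.
SOURCES = {
    "orders_server": [
        {"cust": 1, "amount": 30},
        {"cust": 2, "amount": 5},
        {"cust": 1, "amount": 12},
    ],
    "customers_server": [
        {"cust": 1, "name": "Ann"},
        {"cust": 2, "name": "Bo"},
    ],
}


def run_subquery(server, predicate):
    """Execute one subquery (a row filter) against one source server."""
    return [row for row in SOURCES[server] if predicate(row)]


def federated_query(plan):
    """Run each server's subquery concurrently, then join and aggregate."""
    with ThreadPoolExecutor() as pool:
        futures = {s: pool.submit(run_subquery, s, p) for s, p in plan.items()}
        results = {s: f.result() for s, f in futures.items()}
    # Join orders to customers on "cust" and sum the amounts per customer name.
    names = {row["cust"]: row["name"] for row in results["customers_server"]}
    totals = {}
    for row in results["orders_server"]:
        if row["cust"] in names:
            name = names[row["cust"]]
            totals[name] = totals.get(name, 0) + row["amount"]
    return totals
```

Here the plan is just a mapping from server name to a filter function; a real plan would carry a subquery per source and name the server that performs the final join.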
-
Patent number: 8140596
Abstract: Methods and systems for improving a data transformation operation that converts a source data instance containing repeating elements into a target data instance having a user-specified structure, based solely on a user's specification of a target template. The methods and systems derive and calculate sub-iteration contexts by applying a selected heuristic to the source data instance and the target template, and use these sub-iteration contexts to create a target data instance having a repeating structure that agrees with the user-specified target template. The methods and systems can be customized by the selection of heuristic, and by the specification of explicit sub-iteration contexts that may override the derived contexts.
Type: Grant
Filed: October 15, 2009
Date of Patent: March 20, 2012
Assignee: International Business Machines Corporation
Inventors: Armageddon Rhabdizo Brown, David Everett Simmen
-
Patent number: 7958113
Abstract: A method and system for automatically and adaptively determining query execution plans for parametric queries. A first classifier trained by an initial set of training points is generated. A query workload and/or database statistics are dynamically updated. A new set of training points is collected off-line. Using the new set of training points, the first classifier is modified into a second classifier. A database query is received at a runtime subsequent to the off-line phase. The query includes predicates having parameter markers bound to actual values. The predicates are associated with selectivities. A mapping of the selectivities into a plan determines the query execution plan. The determined query execution plan is included in an augmented set of training points, where the augmented set includes the initial set and the new set.
Type: Grant
Filed: May 22, 2008
Date of Patent: June 7, 2011
Assignee: International Business Machines Corporation
Inventors: Wei Fan, Guy Maring Lohman, Volker Gerhard Markl, Nimrod Megiddo, Jun Rao, David Everett Simmen, Julia Stoyanovich
-
Publication number: 20110093514
Abstract: Methods and systems for improving a data transformation operation that converts a source data instance containing repeating elements into a target data instance having a user-specified structure, based solely on a user's specification of a target template. The methods and systems derive and calculate sub-iteration contexts by applying a selected heuristic to the source data instance and the target template, and use these sub-iteration contexts to create a target data instance having a repeating structure that agrees with the user-specified target template. The methods and systems can be customized by the selection of heuristic, and by the specification of explicit sub-iteration contexts that may override the derived contexts.
Type: Application
Filed: October 15, 2009
Publication date: April 21, 2011
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Armageddon Rhabdizo Brown, David Everett Simmen
-
Patent number: 7716215
Abstract: A method, system, and computer program product to make query processing more robust in the face of optimization errors. The invention validates the statistics and assumptions used for compiling a query as the query is executed and, when necessary, progressively re-optimizes the query in mid-execution based on the knowledge learned during its partial execution. The invention selectively places a number of CHECK operators in a query execution plan to validate the optimizer's cardinality estimates against actual cardinalities. Errors beyond a threshold trigger re-optimization, and the optimizer decides whether the old plan is still optimal and whether to re-use previously computed results. The invention addresses arbitrary SQL queries whose plans can contain sub-queries, updates, trigger checking, and view maintenance operations.
Type: Grant
Filed: November 14, 2007
Date of Patent: May 11, 2010
Assignee: International Business Machines Corporation
Inventors: Guy Maring Lohman, Marki Volker, Mir Hamid Pirahesh, Vijayshankar Raman, David Everett Simmen
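The CHECK idea in the abstract above (compare an estimated cardinality with the actual one and re-optimize when the error exceeds a threshold) can be sketched as follows. This is only an illustration under simplifying assumptions: the intermediate result is fully materialized, the threshold is a fixed ratio, and `Reoptimize`, `check`, `run`, and `choose_plan` are hypothetical names rather than anything defined by the patent.

```python
class Reoptimize(Exception):
    """Raised when the actual cardinality strays too far from the estimate."""

    def __init__(self, estimated, rows):
        super().__init__(f"estimated {estimated} rows, saw {len(rows)}")
        self.estimated = estimated
        self.rows = rows  # partial result kept so it can be reused


def check(rows, estimated, max_ratio=2.0):
    """Materialize rows and validate their count against the estimate."""
    rows = list(rows)
    low, high = estimated / max_ratio, estimated * max_ratio
    if not (low <= len(rows) <= high):
        raise Reoptimize(estimated, rows)
    return rows


def run(rows, estimated, choose_plan):
    """Pick a plan from the estimate; re-plan with the actual count on a failed check."""
    plan = choose_plan(estimated)
    try:
        checked = check(rows, estimated)
    except Reoptimize as signal:
        checked = signal.rows                 # reuse the work already done
        plan = choose_plan(len(checked))      # re-optimize with observed cardinality
    return plan(checked)
```

In this sketch `choose_plan` is any function that maps a cardinality to a callable plan, so a badly underestimated input switches plans without recomputing the rows that were already produced.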
-
Publication number: 20090234799
Abstract: Methods and apparatus, including computer program products, implementing and using techniques for processing a federated query in a federated database system. A federated query is received at a federated database server. A federated query execution plan is generated based on the received federated query. The federated query execution plan defines one or more source servers of the federated database and a unique subquery to be executed on each of the source servers. The subqueries are distributed to the source servers in accordance with the federated query execution plan. The respective subqueries are executed asynchronously at the source servers. The subquery results are passed to a first designated source server defined in the federated query execution plan. The subquery results are joined and aggregated at the first designated source server into a final query result. The final query result is returned to the federated database server.
Type: Application
Filed: March 11, 2008
Publication date: September 17, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Anjali Betawadkar-Norwood, Hamid Pirahesh, David Everett Simmen
-
Publication number: 20080222093
Abstract: A method and system for automatically and adaptively determining query execution plans for parametric queries. A first classifier trained by an initial set of training points is generated. A query workload and/or database statistics are dynamically updated. A new set of training points is collected off-line. Using the new set of training points, the first classifier is modified into a second classifier. A database query is received at a runtime subsequent to the off-line phase. The query includes predicates having parameter markers bound to actual values. The predicates are associated with selectivities. A mapping of the selectivities into a plan determines the query execution plan. The determined query execution plan is included in an augmented set of training points, where the augmented set includes the initial set and the new set.
Type: Application
Filed: May 22, 2008
Publication date: September 11, 2008
Inventors: Wei Fan, Guy Maring Lohman, Volker Gerhard Markl, Nimrod Megiddo, Jun Rao, David Everett Simmen, Julia Stoyanovich
-
Publication number: 20080195577
Abstract: A method for automatically and adaptively determining query execution plans for parametric queries. A first classifier trained by an initial set of training points is generated using a set of random decision trees (RDTs). A query workload and/or database statistics are dynamically updated. A new set of training points collected off-line is used to modify the first classifier into a second classifier. A database query is received at a runtime subsequent to the off line phase. The query includes predicates having parameter markers bound to actual values. The predicates are associated with selectivities. The query execution plan is determined by identifying an optimal average of posterior probabilities obtained across a set of RDTs and mapping the selectivities to a plan. The determined query execution plan is included in an augmented set of training points that includes the initial set and the new set.
Type: Application
Filed: February 9, 2007
Publication date: August 14, 2008
Inventors: Wei Fan, Guy Maring Lohman, Volker Gerhard Markl, Nimrod Megiddo, Jun Rao, David Everett Simmen, Julia Stoyanovich
-
Publication number: 20080177722
Abstract: A method, system, and computer program product to make query processing more robust in the face of optimization errors. The invention validates the statistics and assumptions used for compiling a query as the query is executed and, when necessary, progressively re-optimizes the query in mid-execution based on the knowledge learned during its partial execution. The invention selectively places a number of CHECK operators in a query execution plan to validate the optimizer's cardinality estimates against actual cardinalities. Errors beyond a threshold trigger re-optimization, and the optimizer decides whether the old plan is still optimal and whether to re-use previously computed results. The invention addresses arbitrary SQL queries whose plans can contain sub-queries, updates, trigger checking, and view maintenance operations.
Type: Application
Filed: November 14, 2007
Publication date: July 24, 2008
Applicant: International Business Machines Corp.
Inventors: Guy Maring Lohman, Marki Volker, Mir Hamid Pirahesh, Vijayshankar Raman, David Everett Simmen
-
Patent number: 7383246
Abstract: A method, system, and computer program product to make query processing more robust in the face of optimization errors. The invention validates the statistics and assumptions used for compiling a query as the query is executed and, when necessary, progressively re-optimizes the query in mid-execution based on the knowledge learned during its partial execution. The invention selectively places a number of CHECK operators in a query execution plan to validate the optimizer's cardinality estimates against actual cardinalities. Errors beyond a threshold trigger re-optimization, and the optimizer decides whether the old plan is still optimal and whether to re-use previously computed results. The invention addresses arbitrary SQL queries whose plans can contain sub-queries, updates, trigger checking, and view maintenance operations.
Type: Grant
Filed: October 31, 2003
Date of Patent: June 3, 2008
Assignee: International Business Machines Corporation
Inventors: Guy Maring Lohman, Marki Volker, Mir Hamid Pirahesh, Vijayshankar Raman, David Everett Simmen
-
Patent number: 7185004
Abstract: A reverse routing system optimizes execution of a query that accesses data stored in one or more materialized query tables in a database of a computer system. The system receives a query directly referencing the materialized query table. The system identifies the referenced materialized query tables in a catalogue of materialized query tables and a defining query associated with the referenced materialized query table. The system substitutes the defining query for the referenced materialized query table in the received query. The system adds the referenced materialized query table to the set of eligible materialized query tables that are selected using query matching algorithms so that they can be considered for routing by the query optimizer.
Type: Grant
Filed: December 9, 2005
Date of Patent: February 27, 2007
Assignee: International Business Machines Corporation
Inventors: David Everett Simmen, Mir Hamid Pirahesh
-
Patent number: 6993516
Abstract: A system, method and computer readable medium for sampling data from a relational database are disclosed, where an information processing system chooses rows from a table in a relational database for sampling, wherein data values are arranged into rows, rows are arranged into pages, and pages are arranged into tables. Pages are chosen for sampling according to a probability P and rows in a selected page are chosen for sampling according to a probability R, so that the overall probability of choosing a row for sampling is Q=PR. The probabilities P and R are based on the desired precision of estimates computed from a sample, as well as processing speed. The probabilities P and R are further based on either catalog statistics of the relational database or a pilot sample of rows from the relational database.
Type: Grant
Filed: December 26, 2002
Date of Patent: January 31, 2006
Assignee: International Business Machines Corporation
Inventors: Peter Jay Haas, Guy Maring Lohman, Mir Hamid Pirahesh, David Everett Simmen, Ashutosh Vir Vikram Singh, Michael Jeffrey Winer, Markos Zaharioudakis
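The two-level scheme in the abstract above, where a page is kept with probability P and each row of a kept page with probability R so that a row is sampled with overall probability Q = PR, can be sketched in a few lines. This is a minimal illustration that assumes a table is a list of pages and a page is a list of rows; the function name `bilevel_sample` is hypothetical, and the sketch does not cover how P and R would be chosen from catalog statistics or a pilot sample.

```python
import random


def bilevel_sample(pages, p, r, rng=None):
    """Keep each page with probability p, then each of its rows with probability r."""
    if rng is None:
        rng = random.Random()
    sample = []
    for page in pages:
        if rng.random() < p:  # page-level choice: unselected pages are never read
            sample.extend(row for row in page if rng.random() < r)  # row-level choice
    return sample
```

With P = 1 this reduces to plain row-level sampling, and with R = 1 to plain page-level sampling; intermediate settings trade the number of pages read against the precision of estimates computed from the sample.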
-
Publication number: 20040128290
Abstract: A system, method and computer readable medium for sampling data from a relational database are disclosed, where an information processing system chooses rows from a table in a relational database for sampling, wherein data values are arranged into rows, rows are arranged into pages, and pages are arranged into tables. Pages are chosen for sampling according to a probability P and rows in a selected page are chosen for sampling according to a probability R, so that the overall probability of choosing a row for sampling is Q=PR. The probabilities P and R are based on the desired precision of estimates computed from a sample, as well as processing speed. The probabilities P and R are further based on either catalog statistics of the relational database or a pilot sample of rows from the relational database.
Type: Application
Filed: December 26, 2002
Publication date: July 1, 2004
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Peter Jay Haas, Guy Maring Lohman, Mir Hamid Pirahesh, David Everett Simmen, Ashutosh Vir Vikram Singh, Michael Jeffrey Winer, Markos Zaharioudakis
-
Patent number: 6105020
Abstract: A system and method for a relational database system for identifying star joins in a query and for breaking the query down for bitmap ANDing. The fact table of the star join is located, and cycles between and within dimension tables are broken. Then, the minimal set of tables necessary to execute the star join is identified, and the dimension tables that should appear in the bitmap ANDing plan are also identified. A bitmap ANDing plan is then generated, or, if the query does not qualify for bitmap ANDing, a conventional execution plan is generated.
Type: Grant
Filed: October 11, 1999
Date of Patent: August 15, 2000
Assignee: International Business Machines Corporation
Inventors: Bruce Gilbert Lindsay, Eugene Jon Shekita, David Everett Simmen, Kaarel Truuvert
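For readers unfamiliar with bitmap ANDing, the general technique named in the abstract above can be sketched as follows: build one bitmap per dimension predicate, marking the fact rows whose foreign key qualifies, then AND the bitmaps so that only rows passing every predicate remain. The sketch uses Python integers as bitmaps and scans the fact rows directly, whereas a real system would typically derive the bitmaps from indexes; `bitmap_for` and `star_join` are hypothetical names, and the star-join detection and cycle-breaking steps of the patent are not modeled.

```python
def bitmap_for(fact_rows, column, allowed_keys):
    """Set bit i when fact row i has a foreign key accepted by the dimension filter."""
    bitmap = 0
    for i, row in enumerate(fact_rows):
        if row[column] in allowed_keys:
            bitmap |= 1 << i
    return bitmap


def star_join(fact_rows, dimension_filters):
    """AND the per-dimension bitmaps and fetch only the surviving fact rows."""
    result = (1 << len(fact_rows)) - 1  # start with every row qualifying
    for column, keys in dimension_filters.items():
        result &= bitmap_for(fact_rows, column, keys)
    return [row for i, row in enumerate(fact_rows) if (result >> i) & 1]
```

Each entry of `dimension_filters` maps a fact-table foreign key column to the set of dimension keys that satisfied that dimension's predicate.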
-
Patent number: 6081801
Abstract: An automated methodology, and an apparatus for practicing the methodology, which enables the power and flexibility inherent in shared nothing parallel database systems (MPP) to be utilized on complex queries which have, heretofore, contained query elements requiring local computation or local coordination of data computation performed across the nodes of the distributed system. The present invention provides these features and advantages by identifying and marking the subgraphs containing these types of query elements as "no TQ zones" in the preparation phase prior to optimization. When the optimizer sees the markings, it builds a plan that will force the computation of the marked subgraphs to be in the same section. This preparation phase also provides the partitioning information for all inputs to the "no TQ zones". This allows the bottom-up optimizer to correctly plan the partitioning for the "no TQ zones".
Type: Grant
Filed: June 30, 1997
Date of Patent: June 27, 2000
Assignee: International Business Machines Corporation
Inventors: Roberta Jo Cochrane, George Lapis, Mir Hamid Pirahesh, Richard Sefton Sidle, David Everett Simmen, Tuong Chanh Truong, Monica Sachiye Urata
-
Patent number: 5960428
Abstract: Unwieldy star/join queries are performed more efficiently using a filtered fact table. Suitable queries include star/join queries with a large fact table joined with multiple subsidiary dimension tables, where indices exist over fact table join columns. The query is analyzed to prepare a query plan for the dimension table accesses. This plan is supplemented by adding nested loop join operations, where the inner table is a dimension table plan and the outer table is an index scan performed over a fact table index of the join column with the dimension table. The plan is also supplemented by filtering records resulting from the nested loop joins using a sequence of dynamic bit vectors, ultimately yielding a list of probable fact table records. The plan is further supplemented by fetching these records to construct a distilled fact which is used, instead of the large original table, to execute the query in considerably less time.
Type: Grant
Filed: August 28, 1997
Date of Patent: September 28, 1999
Assignee: International Business Machines Corporation
Inventors: Bruce Gilbert Lindsay, Guy Maring Lohman, Mir Hamid Pirahesh, Eugene Jon Shekita, David Everett Simmen, Monica Sachiye Urata