Patents by Inventor Mir Hamid Pirahesh
Mir Hamid Pirahesh has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20150220529
Abstract: Embodiments of the present invention relate to elimination of blocks such as splits in distributed processing systems such as MapReduce systems using the Hadoop Distributed File System (HDFS). In one embodiment, a method of and computer program product for optimizing queries in distributed processing systems are provided. A query is received. The query includes at least one predicate. The query refers to data. The data includes a plurality of records. Each record comprises a plurality of values in a plurality of attributes. Each record is located in at least one of a plurality of blocks of a distributed file system. Each block has a unique identifier. For each block of the distributed file system, at least one value cluster is determined for an attribute of the plurality of attributes. Each value cluster has a range. The predicate of the query is compared with the at least one value cluster of each block.
Type: Application
Filed: February 6, 2014
Publication date: August 6, 2015
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Mohamed Eltabakh, Peter J. Haas, Fatma Ozcan, Mir Hamid Pirahesh, John (Yannis) Sismanis, Jan Vondrak
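The block-elimination idea in this abstract can be illustrated with a small sketch. It is not the patented method: the gap-based clustering rule, the `max_gap` parameter, and the names `Block`, `value_clusters`, and `blocks_to_scan` are assumptions made for illustration. The sketch only shows how comparing a range predicate with per-block value clusters can skip a block that a single min/max range would not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    block_id: str   # unique identifier of the block
    records: tuple  # records stored in the block, each a dict of attribute -> value


def value_clusters(block, attribute, max_gap):
    """Group the sorted values of one attribute into (low, high) ranges.

    Assumed rule: a new cluster starts when the gap to the previous
    value is larger than max_gap.
    """
    clusters = []
    for value in sorted(r[attribute] for r in block.records):
        if clusters and value - clusters[-1][1] <= max_gap:
            clusters[-1] = (clusters[-1][0], value)
        else:
            clusters.append((value, value))
    return clusters


def blocks_to_scan(blocks, attribute, low, high, max_gap=10):
    """Return ids of blocks with a cluster overlapping low <= attribute <= high."""
    keep = []
    for block in blocks:
        clusters = value_clusters(block, attribute, max_gap)
        if any(c_low <= high and low <= c_high for c_low, c_high in clusters):
            keep.append(block.block_id)
    return keep


blocks = [
    Block("b1", ({"price": 5}, {"price": 8}, {"price": 100})),
    Block("b2", ({"price": 40}, {"price": 45})),
    Block("b3", ({"price": 200}, {"price": 210})),
]
# Block b1 spans 5..100 overall, but neither of its clusters overlaps 30..50.
selected = blocks_to_scan(blocks, "price", 30, 50)
```

With a single min/max range per block, b1 would have to be read for the predicate 30 <= price <= 50; with clusters it is skipped.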
-
Publication number: 20140214796
Abstract: Embodiments of the invention relate to processing queries that utilize fact and/or dimension tables. In one aspect, a pre-join filtering phase precedes a star join. The necessary conditions for the pre-join filtering are considered for a given SQL query, including an estimated size of the hash table exceeding a threshold and presence of a local predicate either on the fact table or on one or more dimension tables that are not large dimension tables. Once the necessary conditions are satisfied, the execution of the query exploits the pre-join filtering to build a pre-join output filter from columns of a reduced fact table that joins with each large dimension table. Thereafter, all the dimension tables and the fact table are joined in a star join while exploiting each pre-join filter. Accordingly, the order in which joins occur is changed in order to reduce the size of the fact table and to work from the fact table to reduce the size of large dimension tables.
Type: Application
Filed: January 31, 2013
Publication date: July 31, 2014
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Ronald J. Barber, Naresh K. Chainani, Guy M. Lohman, Mir Hamid Pirahesh, Vijayshankar Raman, Richard S. Sidle, Sandeep Tata
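A minimal sketch of the pre-join filtering order described in this abstract follows. It assumes one small dimension with a local predicate and one large dimension, and it uses a plain Python set as the pre-join filter; the column names (`id`, `small_id`, `large_id`) and the function name are hypothetical, and the abstract does not say which filter structure is used.

```python
def star_join_with_prefilter(fact, small_dim, large_dim, small_pred):
    """Join fact with both dimensions, shrinking the large dimension first."""
    # 1. Apply the local predicate on the small dimension and reduce the fact table.
    small_hash = {d["id"]: d for d in small_dim if small_pred(d)}
    reduced_fact = [f for f in fact if f["small_id"] in small_hash]

    # 2. Build the pre-join filter from the reduced fact table's join column.
    prefilter = {f["large_id"] for f in reduced_fact}

    # 3. Build the hash table only over large-dimension rows that pass the filter.
    large_hash = {d["id"]: d for d in large_dim if d["id"] in prefilter}

    # 4. Star join: probe both hash tables for each remaining fact row.
    joined = []
    for f in reduced_fact:
        large_row = large_hash.get(f["large_id"])
        if large_row is not None:
            joined.append((f, small_hash[f["small_id"]], large_row))
    return joined, len(large_hash)


fact = [
    {"small_id": 1, "large_id": 100, "amount": 7},
    {"small_id": 2, "large_id": 101, "amount": 3},
    {"small_id": 1, "large_id": 102, "amount": 9},
]
small_dim = [{"id": 1, "region": "west"}, {"id": 2, "region": "east"}]
large_dim = [{"id": i, "name": f"customer{i}"} for i in range(100, 1100)]

rows, hash_size = star_join_with_prefilter(
    fact, small_dim, large_dim, lambda d: d["region"] == "west"
)
```

In this example the hash table for the large dimension holds 2 rows instead of 1000, which is the effect the abstract attributes to working from the reduced fact table toward the large dimensions.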
-
Patent number: 8661019
Abstract: According to one embodiment of the present invention, a method for processing join predicates in full-text indexes is provided. The method includes evaluating local predicates of an outer full text index to generate a first posting list of documents. For each document in the first posting list, the value of a join attribute is determined and an inner full text index is probed to obtain a second posting list of documents containing one of the join attributes determined for each document. Local predicates of an inner full text index are evaluated to generate a third posting list of documents, and the second posting list is merged with the third posting list to generate a merge list of documents. Documents in the first posting list may be paired up with documents in the merge list.
Type: Grant
Filed: January 28, 2010
Date of Patent: February 25, 2014
Assignee: International Business Machines Corporation
Inventors: Latha Sankar Colby, Quanzhong Li, Fatma Ozcan, Mir Hamid Pirahesh, Eugene J. Shekita, Zografoula Vagena
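The posting-list steps in this abstract can be sketched as follows. This is an illustration under assumptions, not the patented implementation: the indexes are in-memory dictionaries, the "merge" of the second and third posting lists is interpreted as an intersection, and the names `build_index` and `index_join` are hypothetical.

```python
from collections import defaultdict


def build_index(docs, key):
    """Inverted index: each value from key(doc) maps to a sorted posting list of doc ids."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for value in key(docs[doc_id]):
            index[value].append(doc_id)
    return index


def index_join(outer, inner, outer_term, inner_term, attr):
    outer_terms = build_index(outer, lambda d: d["terms"])
    inner_terms = build_index(inner, lambda d: d["terms"])
    inner_attr = build_index(inner, lambda d: [d[attr]])

    # First posting list: outer documents matching the outer local predicate.
    first = outer_terms.get(outer_term, [])
    # Second posting list: inner documents holding one of the join values found.
    join_values = {outer[d][attr] for d in first}
    second = {d for v in join_values for d in inner_attr.get(v, [])}
    # Third posting list: inner documents matching the inner local predicate.
    third = inner_terms.get(inner_term, [])
    # Merge list (assumed here to be the intersection of second and third).
    merged = sorted(second & set(third))
    # Pair outer and inner documents that share the join value.
    return [(o, i) for o in first for i in merged if outer[o][attr] == inner[i][attr]]


outer = {
    1: {"terms": {"query"}, "author": "ann"},
    2: {"terms": {"query"}, "author": "bob"},
    3: {"terms": {"graph"}, "author": "cy"},
}
inner = {
    10: {"terms": {"ibm"}, "author": "ann"},
    11: {"terms": {"acme"}, "author": "bob"},
    12: {"terms": {"ibm"}, "author": "cy"},
}
pairs = index_join(outer, inner, "query", "ibm", "author")
```

Here only outer document 1 and inner document 10 satisfy both local predicates and share a join value.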
-
Patent number: 8326847
Abstract: A system, method and computer program product for executing a query on linked data sources. Embodiments of the invention generate an instance graph expressing relationships between objects in the linked data sources and receive a query including at least first and second search terms. The first search term is then executed on the instance graph and a summary graph is generated using the results of the executing step. A second search term is then executed on the summary graph.
Type: Grant
Filed: March 22, 2008
Date of Patent: December 4, 2012
Assignee: International Business Machines Corporation
Inventors: Andrey Balmin, Heasoo Hwang, Mir Hamid Pirahesh, Berthold Reinwald
-
Publication number: 20110184933
Abstract: According to one embodiment of the present invention, a method for processing join predicates in full-text indexes is provided. The method includes evaluating local predicates of an outer full text index to generate a first posting list of documents. For each document in the first posting list, the value of a join attribute is determined and an inner full text index is probed to obtain a second posting list of documents containing one of the join attributes determined for each document. Local predicates of an inner full text index are evaluated to generate a third posting list of documents, and the second posting list is merged with the third posting list to generate a merge list of documents. Documents in the first posting list may be paired up with documents in the merge list.
Type: Application
Filed: January 28, 2010
Publication date: July 28, 2011
Applicant: International Business Machines Corporation
Inventors: Latha Sankar Colby, Quanzhong Li, Fatma Ozcan, Mir Hamid Pirahesh, Eugene J. Shekita, Zografoula Vagena
-
Patent number: 7953694
Abstract: Provided is a system, method, and program for specifying multidimensional calculations. Selection of a subset of a cube model metadata object that is generated from a facts metadata object and one or more dimension metadata objects is received. The facts metadata object references one or more measure metadata objects. A statement is generated for retrieving multidimensional information using metadata in the cube model metadata object and the measure metadata objects, wherein each of the measure metadata objects specifies one or more aggregations.
Type: Grant
Filed: January 13, 2003
Date of Patent: May 31, 2011
Assignee: International Business Machines Corporation
Inventors: Nathan Gevaerd Colossi, William Earl Malloy, Mir Hamid Pirahesh, Craig Reginald Tomlyn
-
Patent number: 7945557
Abstract: A set of algebraic rules applicable to a query are identified, wherein each of the algebraic rules represents a relationship between two columns in a relational database table. A source column is identified by searching the query for a source predicate, wherein the source predicate is a range predicate. One or more candidate target columns are identified by searching the set of algebraic rules, wherein each of the candidate target columns occurs on one side of a binding expression and the source column occurs on the other side of the binding expression. For each of the one or more candidate target columns, a bounds subquery that provides a lower bound and an upper bound for a new range predicate is derived and the new range predicate is introduced into the query, wherein the query is executed to retrieve data from one or more data stores.
Type: Grant
Filed: May 25, 2007
Date of Patent: May 17, 2011
Assignee: International Business Machines Corporation
Inventors: Qi Cheng, Mir Hamid Pirahesh, Yang Sun, Calisto Paul Zuzarte
-
Patent number: 7945577
Abstract: A local database cache enabling persistent, adaptive caching of either full or partial content of a remote database is provided. Content of tables comprising a local cache database is defined on a per-table basis. A table is either defined declaratively and populated in advance of query execution, or determined dynamically and asynchronously populated on demand during query execution. Based on a user input query originally issued against a remote DBMS and referential cache constraints between tables in a local database cache, a Janus query plan, comprising local, remote, and probe query portions, is determined. A probe query portion of a Janus query plan is executed to determine whether up-to-date results can be delivered by the execution of a local query portion against a local database cache, or whether it is necessary to retrieve results from a remote database by executing a remote query portion of the Janus query plan.
Type: Grant
Filed: May 19, 2008
Date of Patent: May 17, 2011
Assignee: International Business Machines Corporation
Inventors: Mehmet Altinel, Christof Bornhoevd, Chandrasekaran Mohan, Mir Hamid Pirahesh, Berthold Reinwald, Saileshwar Krishnamurthy
-
Patent number: 7716215
Abstract: A method, system, and computer program product to make query processing more robust in the face of optimization errors. The invention validates the statistics and assumptions used for compiling a query as the query is executed and, when necessary, progressively re-optimizes the query in mid-execution based on the knowledge learned during its partial execution. The invention selectively places a number of CHECK operators in a query execution plan to validate the optimizer's cardinality estimates against actual cardinalities. Errors beyond a threshold trigger re-optimization, and the optimizer decides whether the old plan is still optimal and whether to re-use previously computed results. The invention addresses arbitrary SQL queries whose plans can contain sub-queries, updates, trigger checking, and view maintenance operations.
Type: Grant
Filed: November 14, 2007
Date of Patent: May 11, 2010
Assignee: International Business Machines Corporation
Inventors: Guy Maring Lohman, Marki Volker, Mir Hamid Pirahesh, Vijayshankar Raman, David Everett Simmen
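The role of a CHECK operator can be sketched in a few lines. The sketch below is a toy under stated assumptions, not the patented design: the check materializes its input, the validity range is simply 0 to twice the estimate, the only alternative plan is a switch from a nested-loop join to a hash join, and all names (`Reoptimize`, `check`, `run`) are hypothetical.

```python
class Reoptimize(Exception):
    """Raised when an actual cardinality falls outside the validity range."""

    def __init__(self, actual, rows):
        super().__init__(f"actual cardinality {actual} outside validity range")
        self.actual = actual
        self.rows = rows  # partial result kept so the new plan can reuse it


def check(rows, low, high):
    """Materialize the input and validate its cardinality against [low, high]."""
    materialized = list(rows)
    if not low <= len(materialized) <= high:
        raise Reoptimize(len(materialized), materialized)
    return materialized


def nested_loop_join(left, right):
    return [(l, r) for l in left for r in right if l == r]


def hash_join(left, right):
    table = set(right)  # assumes the right input has no duplicate values
    return [(l, l) for l in left if l in table]


def run(left, right, estimated_rows):
    """Start with the plan chosen for the estimate; switch plans if CHECK fails."""
    try:
        filtered = check((x for x in left if x % 2 == 0), 0, 2 * estimated_rows)
        return "nested_loop", nested_loop_join(filtered, right)
    except Reoptimize as signal:
        # Re-optimize mid-execution and reuse the already computed rows.
        return "hash", hash_join(signal.rows, right)


good_estimate = run(range(10), [2, 4, 6], estimated_rows=5)
bad_estimate = run(range(100), [2, 4, 6], estimated_rows=5)
```

Both calls return the same join result; only the second exceeds the validity range and therefore finishes with the alternative plan.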
-
Publication number: 20090240682
Abstract: A system, method and computer program product for executing a query on linked data sources. Embodiments of the invention generate an instance graph expressing relationships between objects in the linked data sources and receive a query including at least first and second search terms. The first search term is then executed on the instance graph and a summary graph is generated using the results of the executing step. A second search term is then executed on the summary graph.
Type: Application
Filed: March 22, 2008
Publication date: September 24, 2009
Applicant: International Business Machines Corporation
Inventors: Andrey Balmin, Heasoo Hwang, Mir Hamid Pirahesh, Berthold Reinwald
-
Patent number: 7539667
Abstract: Disclosed is a data processing system implemented method, a data processing system and an article of manufacture for executing a query having a union operator. A data processing system implemented method directs the data processing system to execute a query against a database having data objects. The query has sub-queries and has a union operator. The union operator is operable on sub-queries associated with the query. The database is operatively coupled to the data processing system.
Type: Grant
Filed: November 5, 2004
Date of Patent: May 26, 2009
Assignee: International Business Machines Corporation
Inventors: Bruce Gilbert Lindsay, Linqi Liu, Robert Paul Neugebauer, Mir Hamid Pirahesh, David C. Sharpe, Nattavut Sutyanyong, Calisto Paul Zuzarte
-
Patent number: 7478080
Abstract: A system, apparatus, and program storage device implementing a method of optimizing queries used for searching a computerized database, wherein the method comprises providing a query comprising a sequence of inner joins and outer joins; and rewriting the query by producing a sequence of outer Cartesian products for the query; producing a sequence of nullification operations on the query; and producing a sequence of best match operations on the query. The method further comprises optimizing the query using a query execution plan for processing the rewritten query, wherein the query execution plan expands a search space in the database for which the rewritten query may be run.
Type: Grant
Filed: September 30, 2004
Date of Patent: January 13, 2009
Assignee: International Business Machines Corporation
Inventors: Mir Hamid Pirahesh, Jun Rao, Calisto Zuzarte
-
Publication number: 20080215580
Abstract: A local database cache enabling persistent, adaptive caching of either full or partial content of a remote database is provided. Content of tables comprising a local cache database is defined on a per-table basis. A table is either defined declaratively and populated in advance of query execution, or determined dynamically and asynchronously populated on demand during query execution. Based on a user input query originally issued against a remote DBMS and referential cache constraints between tables in a local database cache, a Janus query plan, comprising local, remote, and probe query portions, is determined. A probe query portion of a Janus query plan is executed to determine whether up-to-date results can be delivered by the execution of a local query portion against a local database cache, or whether it is necessary to retrieve results from a remote database by executing a remote query portion of the Janus query plan.
Type: Application
Filed: May 19, 2008
Publication date: September 4, 2008
Applicant: International Business Machines Corporation
Inventors: Mehmet Altinel, Christof Bornhoevd, Chandrasekaran Mohan, Mir Hamid Pirahesh, Berthold Reinwald, Saileshwar Krishnamurthy
-
Patent number: 7409385
Abstract: Disclosed is a data processing system implemented method, a data processing system and an article of manufacture for executing a query having a union operator. The data processing system implemented method directs the data processing system to process a query against data objects. The data objects are operatively coupled to the data processing system. The query includes a parent operator. The parent operator references a union operator. The union operator references sub-queries. The sub-queries reference the data objects. The data processing system implemented method includes noting a set of partitionings for the union operator, the noted set of partitionings being based on the sub-queries and being based on the data objects referenced by the sub-queries, and executing the query having the union operator, the execution of the query being based on the noted set of partitionings and the parent operator.
Type: Grant
Filed: November 5, 2004
Date of Patent: August 5, 2008
Assignee: International Business Machines Corporation
Inventors: Bruce Gilbert Lindsay, Linqi Liu, Robert Paul Neugebauer, Mir Hamid Pirahesh, David C. Sharpe, Nattavut Sutyanyong, Calisto Paul Zuzarte
-
Publication number: 20080177722
Abstract: A method, system, and computer program product to make query processing more robust in the face of optimization errors. The invention validates the statistics and assumptions used for compiling a query as the query is executed and, when necessary, progressively re-optimizes the query in mid-execution based on the knowledge learned during its partial execution. The invention selectively places a number of CHECK operators in a query execution plan to validate the optimizer's cardinality estimates against actual cardinalities. Errors beyond a threshold trigger re-optimization, and the optimizer decides whether the old plan is still optimal and whether to re-use previously computed results. The invention addresses arbitrary SQL queries whose plans can contain sub-queries, updates, trigger checking, and view maintenance operations.
Type: Application
Filed: November 14, 2007
Publication date: July 24, 2008
Applicant: International Business Machines Corp.
Inventors: Guy Maring Lohman, Marki Volker, Mir Hamid Pirahesh, Vijayshankar Raman, David Everett Simmen
-
Patent number: 7395258
Abstract: A local database cache enabling persistent, adaptive caching of either full or partial content of a remote database is provided. Content of tables comprising a local cache database is defined on a per-table basis. A table is either defined declaratively and populated in advance of query execution, or determined dynamically and asynchronously populated on demand during query execution. Based on a user input query originally issued against a remote DBMS and referential cache constraints between tables in a local database cache, a Janus query plan, comprising local, remote, and probe query portions, is determined. A probe query portion of a Janus query plan is executed to determine whether up-to-date results can be delivered by the execution of a local query portion against a local database cache, or whether it is necessary to retrieve results from a remote database by executing a remote query portion of the Janus query plan.
Type: Grant
Filed: July 30, 2004
Date of Patent: July 1, 2008
Assignee: International Business Machines Corporation
Inventors: Mehmet Altinel, Christof Bornhoevd, Chandrasekaran Mohan, Mir Hamid Pirahesh, Berthold Reinwald, Saileshwar Krishnamurthy
-
Patent number: 7383246
Abstract: A method, system, and computer program product to make query processing more robust in the face of optimization errors. The invention validates the statistics and assumptions used for compiling a query as the query is executed and, when necessary, progressively re-optimizes the query in mid-execution based on the knowledge learned during its partial execution. The invention selectively places a number of CHECK operators in a query execution plan to validate the optimizer's cardinality estimates against actual cardinalities. Errors beyond a threshold trigger re-optimization, and the optimizer decides whether the old plan is still optimal and whether to re-use previously computed results. The invention addresses arbitrary SQL queries whose plans can contain sub-queries, updates, trigger checking, and view maintenance operations.
Type: Grant
Filed: October 31, 2003
Date of Patent: June 3, 2008
Assignee: International Business Machines Corporation
Inventors: Guy Maring Lohman, Marki Volker, Mir Hamid Pirahesh, Vijayshankar Raman, David Everett Simmen
-
Patent number: 7315852
Abstract: A method for using pre-computed information stored in auxiliary structures to speed up processing of expensive queries on hierarchical documents such as XML documents being queried using XPath. The invention defines a taxonomy of such structures such as indexes and materialized views for storing pre-computed XPath results (PXRs), determines what portion of the query can be evaluated by the structures, and computes the compensation for the results generated by the structures. The invention detects all structures applicable to the query and rewrites the query to use such structures, speeding up the performance of the queries. The invention identifies the matching structures by detecting containment mappings between XPath expressions in the query and the structure. The invention also includes a new representation for XPath expressions that is rich enough to express all features of XPath.
Type: Grant
Filed: October 31, 2003
Date of Patent: January 1, 2008
Assignee: International Business Machines Corporation
Inventors: Audrey L. Balmin, Kevin S. Beyer, Roberta Jo Cochrane, Fatma Ozcan, Mir Hamid Pirahesh
-
Patent number: 7275056
Abstract: A system and method transform queries with subqueries, using window aggregation. An optimizer in a relational database management system transforms queries to optimize their efficiency and speed. The method transforms queries that have a subquery, replacing the subquery with a window aggregation function. In the case of a correlated subquery, the window aggregation function is partitioned by a correlated column of a correlated table. All data in the main select clause, or outer block, of the query that was obtained through references to the correlated table is instead obtained through the new window aggregation subquery. By using window aggregation, the aggregation is performed at the same time as the selection of relevant data from the correlated table, thereby compiling all needed data in a single pass through the table or view. Reducing the number of times that tables or views are accessed reduces the computational demands of a query.
Type: Grant
Filed: April 29, 2003
Date of Patent: September 25, 2007
Assignee: International Business Machines Corporation
Inventors: Qi Cheng, Linqi Liu, Wenbin Ma, Mir Hamid Pirahesh, Calisto P. Zuzarte
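The effect of replacing a correlated subquery with a partitioned aggregate can be illustrated outside SQL. The sketch below is an analogy in plain Python rather than the optimizer rewrite itself: the example query ("employees paid more than their department's average"), the column names, and both function names are assumptions. The first function rescans the table for every row, as a naively executed correlated subquery would; the second accumulates per-partition totals in one scan and then filters the rows already read.

```python
from collections import defaultdict


def above_dept_average_correlated(rows):
    """Correlated-subquery style: recompute the department average per row."""
    result = []
    for row in rows:
        same_dept = [r["salary"] for r in rows if r["dept"] == row["dept"]]  # rescan
        if row["salary"] > sum(same_dept) / len(same_dept):
            result.append(row["name"])
    return result


def above_dept_average_windowed(rows):
    """Window-aggregation style: one scan builds totals partitioned by dept."""
    totals = defaultdict(lambda: [0, 0])  # dept -> [sum of salaries, row count]
    for row in rows:
        totals[row["dept"]][0] += row["salary"]
        totals[row["dept"]][1] += 1
    # Filter the buffered rows against their partition's average.
    return [
        row["name"]
        for row in rows
        if row["salary"] > totals[row["dept"]][0] / totals[row["dept"]][1]
    ]


employees = [
    {"name": "a", "dept": "x", "salary": 100},
    {"name": "b", "dept": "x", "salary": 200},
    {"name": "c", "dept": "y", "salary": 50},
    {"name": "d", "dept": "y", "salary": 50},
]
correlated_result = above_dept_average_correlated(employees)
windowed_result = above_dept_average_windowed(employees)
```

Both functions return the same names; the difference is that the first touches the table once per row while the second touches it a fixed number of times.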
-
Patent number: 7240078
Abstract: A query is matched to an outlier materialized query table that stores exception data. The query is searched for a source predicate. An outlier predicate in the outlier materialized query table that corresponds to the source predicate is searched for a target column that corresponds to a source column in the source predicate. A new range predicate is derived based on the target column and introduced into the query, wherein the query is executed to retrieve data from one or more data stores.
Type: Grant
Filed: November 25, 2003
Date of Patent: July 3, 2007
Assignee: International Business Machines Corporation
Inventors: Qi Cheng, Mir Hamid Pirahesh, Yang Sun, Calisto Paul Zuzarte