Patents by Inventor Mir Hamid Pirahesh

Mir Hamid Pirahesh has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20150220529
    Abstract: Embodiments of the present invention relate to elimination of blocks such as splits in distributed processing systems such as MapReduce systems using the Hadoop Distributed File System (HDFS). In one embodiment, a method of and computer program product for optimizing queries in distributed processing systems are provided. A query is received. The query includes at least one predicate. The query refers to data. The data includes a plurality of records. Each record comprises a plurality of values in a plurality of attributes. Each record is located in at least one of a plurality of blocks of a distributed file system. Each block has a unique identifier. For each block of the distributed file system, at least one value cluster is determined for an attribute of the plurality of attributes. Each value cluster has a range. The predicate of the query is compared with the at least one value cluster of each block.
    Type: Application
    Filed: February 6, 2014
    Publication date: August 6, 2015
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Mohamed Eltabakh, Peter J. Haas, Fatma Ozcan, Mir Hamid Pirahesh, John (Yannis) Sismanis, Jan Vondrak
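
The comparison of a query predicate with per-block value clusters, as described in the abstract above, can be illustrated with a minimal Python sketch. This is not the patented method; the `Block` class, the `blocks_to_scan` function, the block identifiers, and the choice of integer ranges for a single attribute are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical per-block metadata for one attribute."""
    block_id: str
    # Each value cluster is an inclusive (low, high) range.
    clusters: tuple


def blocks_to_scan(blocks, lo, hi):
    """Return ids of blocks with a cluster overlapping the predicate range [lo, hi]."""
    return [
        b.block_id
        for b in blocks
        if any(c_lo <= hi and lo <= c_hi for c_lo, c_hi in b.clusters)
    ]


blocks = [
    Block("blk-0", ((1, 10), (50, 60))),
    Block("blk-1", ((100, 200),)),
    Block("blk-2", ((15, 40),)),
]

# Predicate: attribute BETWEEN 20 AND 55 (assumed example query).
selected = blocks_to_scan(blocks, 20, 55)
print(selected)
```

Blocks whose clusters do not overlap the predicate range cannot contain matching records, so under these assumptions they can be skipped without being read.
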
  • Publication number: 20140214796
    Abstract: Embodiments of the invention relate to processing queries that utilize fact and/or dimension tables. In one aspect, a pre-join filtering phase precedes a star join. The necessary conditions for the pre-join filtering are considered for a given SQL query, including an estimated size of the hash table exceeding a threshold and presence of a local predicate either on the fact table or on one or more dimension tables that are not large dimension tables. Once the necessary conditions are satisfied, the execution of the query exploits the pre-join filtering to build a pre-join output filter from columns of a reduced fact table that joins with each large dimension table. Thereafter, all the dimension tables and the fact table are joined in a star join while exploiting each pre-join filter. Accordingly, the order in which joins occur is changed in order to reduce the size of the fact table and to work from the fact table to reduce the size of large dimension tables.
    Type: Application
    Filed: January 31, 2013
    Publication date: July 31, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ronald J. Barber, Naresh K. Chainani, Guy M. Lohman, Mir Hamid Pirahesh, Vijayshankar Raman, Richard S. Sidle, Sandeep Tata
  • Patent number: 8661019
    Abstract: According to one embodiment of the present invention, a method for processing join predicates in full-text indexes is provided. The method includes evaluating local predicates of an outer full text index to generate a first posting list of documents. For each document in the first posting list, the value of a join attribute is determined and an inner full text index is probed to obtain a second posting list of documents containing one of the join attributes determined for each document. Local predicates of an inner full text index are evaluated to generate a third posting list of documents, and the second posting list is merged with the third posting list to generate a merge list of documents. Documents in the first posting list may be paired up with documents in the merge list.
    Type: Grant
    Filed: January 28, 2010
    Date of Patent: February 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: Latha Sankar Colby, Quanzhong Li, Fatma Ozcan, Mir Hamid Pirahesh, Eugene J. Shekita, Zografoula Vagena
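
The posting-list steps in the abstract above can be sketched in Python with toy in-memory indexes. This is a simplified illustration, not the patented method: the documents, the `author_id` join attribute, the `build_index` helper, and the per-document probing of the inner index (rather than one combined second posting list) are assumptions.

```python
from collections import defaultdict

# Hypothetical toy collections: document id -> fields.
outer_docs = {
    1: {"text": "database join", "author_id": "a1"},
    2: {"text": "graph search", "author_id": "a2"},
    3: {"text": "database cache", "author_id": "a3"},
}
inner_docs = {
    10: {"text": "ibm almaden", "author_id": "a1"},
    11: {"text": "ibm toronto", "author_id": "a3"},
    12: {"text": "ibm almaden", "author_id": "a2"},
}


def build_index(docs):
    """Map (field, term) to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, fields in docs.items():
        for term in fields["text"].split():
            index[("text", term)].add(doc_id)
        index[("author_id", fields["author_id"])].add(doc_id)
    return index


outer_index = build_index(outer_docs)
inner_index = build_index(inner_docs)

# First posting list: local predicate on the outer index.
first = sorted(outer_index[("text", "database")])
# Third posting list: local predicate on the inner index.
third = inner_index[("text", "almaden")]

pairs = []
for doc_id in first:
    join_value = outer_docs[doc_id]["author_id"]
    # Second posting list: inner documents sharing the join value.
    second = inner_index[("author_id", join_value)]
    # Merge (intersect) the second and third lists, then pair up.
    for match in sorted(second & third):
        pairs.append((doc_id, match))

print(pairs)
```
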
  • Patent number: 8326847
    Abstract: A system, method and computer program product for executing a query on linked data sources. Embodiments of the invention generate an instance graph expressing relationships between objects in the linked data sources and receive a query including at least first and second search terms. The first search term is then executed on the instance graph and a summary graph is generated using the results of the executing step. A second search term is then executed on the summary graph.
    Type: Grant
    Filed: March 22, 2008
    Date of Patent: December 4, 2012
    Assignee: International Business Machines Corporation
    Inventors: Andrey Balmin, Heasoo Hwang, Mir Hamid Pirahesh, Berthold Reinwald
  • Publication number: 20110184933
    Abstract: According to one embodiment of the present invention, a method for processing join predicates in full-text indexes is provided. The method includes evaluating local predicates of an outer full text index to generate a first posting list of documents. For each document in the first posting list, the value of a join attribute is determined and an inner full text index is probed to obtain a second posting list of documents containing one of the join attributes determined for each document. Local predicates of an inner full text index are evaluated to generate a third posting list of documents, and the second posting list is merged with the third posting list to generate a merge list of documents. Documents in the first posting list may be paired up with documents in the merge list.
    Type: Application
    Filed: January 28, 2010
    Publication date: July 28, 2011
    Applicant: International Business Machines Corporation
    Inventors: Latha Sankar Colby, Quanzhong Li, Fatma Ozcan, Mir Hamid Pirahesh, Eugene J. Shekita, Zografoula Vagena
  • Patent number: 7953694
    Abstract: Provided is a system, method, and program for specifying multidimensional calculations. Selection of a subset of a cube model metadata object that is generated from a facts metadata object and one or more dimension metadata objects is received. The facts metadata object references one or more measure metadata objects. A statement is generated for retrieving multidimensional information using metadata in the cube model metadata object and the measure metadata objects, wherein each of the measure metadata objects specifies one or more aggregations.
    Type: Grant
    Filed: January 13, 2003
    Date of Patent: May 31, 2011
    Assignee: International Business Machines Corporation
    Inventors: Nathan Gevaerd Colossi, William Earl Malloy, Mir Hamid Pirahesh, Craig Reginald Tomlyn
  • Patent number: 7945557
    Abstract: A set of algebraic rules applicable to a query are identified, wherein each of the algebraic rules represents a relationship between two columns in a relational database table. A source column is identified by searching the query for a source predicate, wherein the source predicate is a range predicate. One or more candidate target columns are identified by searching the set of algebraic rules, wherein each of the candidate target columns occurs on one side of a binding expression and the source column occurs on the other side of the binding expression. For each of the one or more candidate target columns, a bounds subquery that provides a lower bound and an upper bound for a new range predicate is derived and the new range predicate is introduced into the query, wherein the query is executed to retrieve data from one or more data stores.
    Type: Grant
    Filed: May 25, 2007
    Date of Patent: May 17, 2011
    Assignee: International Business Machines Corporation
    Inventors: Qi Cheng, Mir Hamid Pirahesh, Yang Sun, Calisto Paul Zuzarte
  • Patent number: 7945577
    Abstract: A local database cache enabling persistent, adaptive caching of either full or partial content of a remote database is provided. Content of tables comprising a local cache database is defined on a per-table basis. A table is either defined declaratively and populated in advance of query execution, or determined dynamically and asynchronously populated on demand during query execution. Based on a user input query originally issued against a remote DBMS and referential cache constraints between tables in a local database cache, a Janus query plan, comprising local, remote, and probe query portions, is determined. A probe query portion of a Janus query plan is executed to determine whether up-to-date results can be delivered by the execution of a local query portion against a local database cache, or whether it is necessary to retrieve results from a remote database by executing a remote query portion of the Janus query plan.
    Type: Grant
    Filed: May 19, 2008
    Date of Patent: May 17, 2011
    Assignee: International Business Machines Corporation
    Inventors: Mehmet Altinel, Christof Bornhoevd, Chandrasekaran Mohan, Mir Hamid Pirahesh, Berthold Reinwald, Saileshwar Krishnamurthy
  • Patent number: 7716215
    Abstract: A method, system, and computer program product to make query processing more robust in the face of optimization errors. The invention validates the statistics and assumptions used for compiling a query as the query is executed and, when necessary, progressively re-optimizes the query in mid-execution based on the knowledge learned during its partial execution. The invention selectively places a number of CHECK operators in a query execution plan to validate the optimizer's cardinality estimates against actual cardinalities. Errors beyond a threshold trigger re-optimization, and the optimizer decides whether the old plan is still optimal and whether to re-use previously computed results. The invention addresses arbitrary SQL queries whose plans can contain sub-queries, updates, trigger checking, and view maintenance operations.
    Type: Grant
    Filed: November 14, 2007
    Date of Patent: May 11, 2010
    Assignee: International Business Machines Corporation
    Inventors: Guy Maring Lohman, Marki Volker, Mir Hamid Pirahesh, Vijayshankar Raman, David Everett Simmen
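
The idea of validating a cardinality estimate during execution, as described in the abstract above, can be sketched in Python. This is a toy illustration under stated assumptions, not the patented CHECK operator: the `check` and `run` functions, the `ReoptimizationNeeded` exception, and the factor-of-two tolerance are hypothetical, and the sketch materializes the input rather than checking it in a streaming fashion.

```python
class ReoptimizationNeeded(Exception):
    """Raised when the actual row count strays too far from the estimate."""

    def __init__(self, estimated, actual):
        super().__init__(f"estimated {estimated} rows, saw {actual}")
        self.estimated = estimated
        self.actual = actual


def check(rows, estimated, tolerance=2.0):
    """Materialize rows and compare the actual count with the estimate."""
    materialized = list(rows)
    actual = len(materialized)
    ratio = max(actual, 1) / max(estimated, 1)
    if ratio > tolerance or ratio < 1 / tolerance:
        raise ReoptimizationNeeded(estimated, actual)
    return materialized


def run(rows, estimated):
    """Keep the original plan unless the check fails; then re-plan with the observed count."""
    try:
        checked = check(rows, estimated)
        return "original plan", len(checked)
    except ReoptimizationNeeded as err:
        return "re-optimized plan", err.actual


print(run(range(100), 90))
print(run(range(1000), 10))
```

In a real optimizer the re-planning step would reuse the observed cardinality (and possibly the rows already computed) when choosing a new plan; here it is reduced to a label.
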
  • Publication number: 20090240682
    Abstract: A system, method and computer program product for executing a query on linked data sources. Embodiments of the invention generate an instance graph expressing relationships between objects in the linked data sources and receive a query including at least first and second search terms. The first search term is then executed on the instance graph and a summary graph is generated using the results of the executing step. A second search term is then executed on the summary graph.
    Type: Application
    Filed: March 22, 2008
    Publication date: September 24, 2009
    Applicant: International Business Machines Corporation
    Inventors: Andrey Balmin, Heasoo Hwang, Mir Hamid Pirahesh, Berthold Reinwald
  • Patent number: 7539667
    Abstract: Disclosed is a data processing system implemented method, a data processing system and an article of manufacture for executing a query having a union operator. The data processing system implemented method directs the data processing system to execute a query against a database having data objects. The query has sub-queries and a union operator. The union operator is operable on sub-queries associated with the query. The database is operatively coupled to the data processing system.
    Type: Grant
    Filed: November 5, 2004
    Date of Patent: May 26, 2009
    Assignee: International Business Machines Corporation
    Inventors: Bruce Gilbert Lindsay, Linqi Liu, Robert Paul Neugebauer, Mir Hamid Pirahesh, David C. Sharpe, Nattavut Sutyanyong, Calisto Paul Zuzarte
  • Patent number: 7478080
    Abstract: A system, apparatus, and program storage device implementing a method of optimizing queries used for searching a computerized database, wherein the method comprises providing a query comprising a sequence of inner joins and outer joins; and rewriting the query by producing a sequence of outer Cartesian products for the query; producing a sequence of nullification operations on the query; and producing a sequence of best match operations on the query. The method further comprises optimizing the query using a query execution plan for processing the rewritten query, wherein the query execution plan expands a search space in the database for which the rewritten query may be run.
    Type: Grant
    Filed: September 30, 2004
    Date of Patent: January 13, 2009
    Assignee: International Business Machines Corporation
    Inventors: Mir Hamid Pirahesh, Jun Rao, Calisto Zuzarte
  • Publication number: 20080215580
    Abstract: A local database cache enabling persistent, adaptive caching of either full or partial content of a remote database is provided. Content of tables comprising a local cache database is defined on a per-table basis. A table is either defined declaratively and populated in advance of query execution, or determined dynamically and asynchronously populated on demand during query execution. Based on a user input query originally issued against a remote DBMS and referential cache constraints between tables in a local database cache, a Janus query plan, comprising local, remote, and probe query portions, is determined. A probe query portion of a Janus query plan is executed to determine whether up-to-date results can be delivered by the execution of a local query portion against a local database cache, or whether it is necessary to retrieve results from a remote database by executing a remote query portion of the Janus query plan.
    Type: Application
    Filed: May 19, 2008
    Publication date: September 4, 2008
    Applicant: International Business Machines Corporation
    Inventors: Mehmet Altinel, Christof Bornhoevd, Chandrasekaran Mohan, Mir Hamid Pirahesh, Berthold Reinwald, Saileshwar Krishnamurthy
  • Patent number: 7409385
    Abstract: Disclosed is a data processing system implemented method, a data processing system and an article of manufacture for executing a query having a union operator. The data processing system implemented method directs the data processing system to process a query against data objects. The data objects are operatively coupled to the data processing system. The query includes a parent operator. The parent operator references a union operator. The union operator references sub-queries. The sub-queries reference the data objects. The data processing system implemented method includes noting a set of partitionings for the union operator, the noted set of partitionings being based on the sub-queries and being based on the data objects referenced by the sub-queries, and executing the query having the union operator, the execution of the query being based on the noted set of partitionings and the parent operator.
    Type: Grant
    Filed: November 5, 2004
    Date of Patent: August 5, 2008
    Assignee: International Business Machines Corporation
    Inventors: Bruce Gilbert Lindsay, Linqi Liu, Robert Paul Neugebauer, Mir Hamid Pirahesh, David C. Sharpe, Nattavut Sutyanyong, Calisto Paul Zuzarte
  • Publication number: 20080177722
    Abstract: A method, system, and computer program product to make query processing more robust in the face of optimization errors. The invention validates the statistics and assumptions used for compiling a query as the query is executed and, when necessary, progressively re-optimizes the query in mid-execution based on the knowledge learned during its partial execution. The invention selectively places a number of CHECK operators in a query execution plan to validate the optimizer's cardinality estimates against actual cardinalities. Errors beyond a threshold trigger re-optimization, and the optimizer decides whether the old plan is still optimal and whether to re-use previously computed results. The invention addresses arbitrary SQL queries whose plans can contain sub-queries, updates, trigger checking, and view maintenance operations.
    Type: Application
    Filed: November 14, 2007
    Publication date: July 24, 2008
    Applicant: International Business Machines Corp.
    Inventors: Guy Maring Lohman, Marki Volker, Mir Hamid Pirahesh, Vijayshankar Raman, David Everett Simmen
  • Patent number: 7395258
    Abstract: A local database cache enabling persistent, adaptive caching of either full or partial content of a remote database is provided. Content of tables comprising a local cache database is defined on a per-table basis. A table is either defined declaratively and populated in advance of query execution, or determined dynamically and asynchronously populated on demand during query execution. Based on a user input query originally issued against a remote DBMS and referential cache constraints between tables in a local database cache, a Janus query plan, comprising local, remote, and probe query portions, is determined. A probe query portion of a Janus query plan is executed to determine whether up-to-date results can be delivered by the execution of a local query portion against a local database cache, or whether it is necessary to retrieve results from a remote database by executing a remote query portion of the Janus query plan.
    Type: Grant
    Filed: July 30, 2004
    Date of Patent: July 1, 2008
    Assignee: International Business Machines Corporation
    Inventors: Mehmet Altinel, Christof Bornhoevd, Chandrasekaran Mohan, Mir Hamid Pirahesh, Berthold Reinwald, Saileshwar Krishnamurthy
  • Patent number: 7383246
    Abstract: A method, system, and computer program product to make query processing more robust in the face of optimization errors. The invention validates the statistics and assumptions used for compiling a query as the query is executed and, when necessary, progressively re-optimizes the query in mid-execution based on the knowledge learned during its partial execution. The invention selectively places a number of CHECK operators in a query execution plan to validate the optimizer's cardinality estimates against actual cardinalities. Errors beyond a threshold trigger re-optimization, and the optimizer decides whether the old plan is still optimal and whether to re-use previously computed results. The invention addresses arbitrary SQL queries whose plans can contain sub-queries, updates, trigger checking, and view maintenance operations.
    Type: Grant
    Filed: October 31, 2003
    Date of Patent: June 3, 2008
    Assignee: International Business Machines Corporation
    Inventors: Guy Maring Lohman, Marki Volker, Mir Hamid Pirahesh, Vijayshankar Raman, David Everett Simmen
  • Patent number: 7315852
    Abstract: A method for using pre-computed information stored in auxiliary structures to speed up processing of expensive queries on hierarchical documents such as XML documents being queried using XPath. The invention defines a taxonomy of such structures such as indexes and materialized views for storing pre-computed XPath results (PXRs), determines what portion of the query can be evaluated by the structures, and computes the compensation for the results generated by the structures. The invention detects all structures applicable to the query and rewrites the query to use such structures, speeding up the performance of the queries. The invention identifies the matching structures by detecting containment mappings between XPath expressions in the query and the structure. The invention also includes a new representation for XPath expressions that is rich enough to express all features of XPath.
    Type: Grant
    Filed: October 31, 2003
    Date of Patent: January 1, 2008
    Assignee: International Business Machines Corporation
    Inventors: Audrey L. Balmin, Kevin S. Beyer, Roberta Jo Cochrane, Fatma Ozcan, Mir Hamid Pirahesh
  • Patent number: 7275056
    Abstract: A system and method transform queries with subqueries, using window aggregation. An optimizer in a relational database management system transforms queries to optimize their efficiency and speed. The method transforms queries that have a subquery, replacing the subquery with a window aggregation function. In the case of a correlated subquery, the window aggregation function is partitioned by a correlated column of a correlated table. All data in the main select clause, or outer block, of the query that was obtained through references to the correlated table is instead obtained through the new window aggregation subquery. By using window aggregation, the aggregation is performed at the same time as the selection of relevant data from the correlated table, thereby compiling all needed data in a single pass through the table or view. Reducing the number of times that tables or views are accessed reduces the computational demands of a query.
    Type: Grant
    Filed: April 29, 2003
    Date of Patent: September 25, 2007
    Assignee: International Business Machines Corporation
    Inventors: Qi Cheng, Linqi Liu, Wenbin Ma, Mir Hamid Pirahesh, Calisto P. Zuzarte
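
The rewrite described in the abstract above, replacing a correlated aggregate subquery with a window aggregation partitioned by the correlated column, can be illustrated with Python's built-in `sqlite3` module. This is a hand-written example of the general transformation, not the optimizer logic from the patent; the `emp` table and its rows are made up, and the sketch assumes the bundled SQLite is version 3.25 or later, which is required for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO emp VALUES
        ('ann', 'eng', 120), ('bob', 'eng', 100),
        ('cat', 'ops', 80),  ('dan', 'ops', 90), ('eve', 'ops', 70);
""")

# Original form: the subquery is correlated on dept and re-reads emp.
correlated = """
    SELECT e.name FROM emp e
    WHERE e.salary > (SELECT AVG(s.salary) FROM emp s WHERE s.dept = e.dept)
    ORDER BY e.name
"""

# Rewritten form: one pass over emp, average computed per dept partition.
windowed = """
    SELECT name FROM (
        SELECT name, salary,
               AVG(salary) OVER (PARTITION BY dept) AS dept_avg
        FROM emp
    )
    WHERE salary > dept_avg
    ORDER BY name
"""

rows_correlated = conn.execute(correlated).fetchall()
rows_windowed = conn.execute(windowed).fetchall()
print(rows_correlated, rows_windowed)
```

Both queries return the employees paid above their department average; the windowed form references the table once instead of once in the outer block and again in the subquery.
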
  • Patent number: 7240078
    Abstract: A query is matched to an outlier materialized query table that stores exception data. The query is searched for a source predicate. An outlier predicate in the outlier materialized query table that corresponds to the source predicate is searched for a target column that corresponds to a source column in the source predicate. A new range predicate is derived based on the target column and introduced into the query, wherein the query is executed to retrieve data from one or more data stores.
    Type: Grant
    Filed: November 25, 2003
    Date of Patent: July 3, 2007
    Assignee: International Business Machines Corporation
    Inventors: Qi Cheng, Mir Hamid Pirahesh, Yang Sun, Calisto Paul Zuzarte