Patents by Inventor Nimrod Megiddo

Nimrod Megiddo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20080222093
    Abstract: A method and system for automatically and adaptively determining query execution plans for parametric queries. A first classifier trained by an initial set of training points is generated. A query workload and/or database statistics are dynamically updated. A new set of training points is collected off-line. Using the new set of training points, the first classifier is modified into a second classifier. A database query is received at a runtime subsequent to the off-line phase. The query includes predicates having parameter markers bound to actual values. The predicates are associated with selectivities. A mapping of the selectivities into a plan determines the query execution plan. The determined query execution plan is included in an augmented set of training points, where the augmented set includes the initial set and the new set.
    Type: Application
    Filed: May 22, 2008
    Publication date: September 11, 2008
    Inventors: Wei Fan, Guy Maring Lohman, Volker Gerhard Markl, Nimrod Megiddo, Jun Rao, David Everett Simmen, Julia Stoyanovich
  • Publication number: 20080195577
    Abstract: A method for automatically and adaptively determining query execution plans for parametric queries. A first classifier trained by an initial set of training points is generated using a set of random decision trees (RDTs). A query workload and/or database statistics are dynamically updated. A new set of training points collected off-line is used to modify the first classifier into a second classifier. A database query is received at a runtime subsequent to the off-line phase. The query includes predicates having parameter markers bound to actual values. The predicates are associated with selectivities. The query execution plan is determined by identifying an optimal average of posterior probabilities obtained across a set of RDTs and mapping the selectivities to a plan. The determined query execution plan is included in an augmented set of training points that includes the initial set and the new set.
    Type: Application
    Filed: February 9, 2007
    Publication date: August 14, 2008
    Inventors: Wei Fan, Guy Maring Lohman, Volker Gerhard Markl, Nimrod Megiddo, Jun Rao, David Everett Simmen, Julia Stoyanovich
  • Publication number: 20080154846
    Abstract: A method for consistent selectivity estimation based on the principle of maximum entropy (ME) is provided. The method efficiently exploits all available information and avoids the bias problem. In the absence of detailed knowledge, the ME approach reduces to standard uniformity and independence assumptions. The disclosed method, based on the principle of ME, is used to improve the optimizer's cardinality estimates by orders of magnitude, resulting in better plan quality and significantly reduced query execution times.
    Type: Application
    Filed: March 4, 2008
    Publication date: June 26, 2008
    Applicant: International Business Machines Corporation
    Inventors: Marcel Kutsch, Volker Gerhard Markl, Nimrod Megiddo, Tam Minh Dai Tran
  • Patent number: 7392250
    Abstract: Exemplary embodiments of the present invention relate to enhanced faceted search support for OLAP queries over unstructured text as well as structured dimensions by the dynamic and automatic discovery of dimensions that are determined to be most “interesting” to a user based upon the data. Within the exemplary embodiments, “interestingness” is defined as how surprising a summary along some dimensions is relative to a user's expectation. Further, multi-attribute facets are determined and a user is optionally permitted to specify the distribution of values that she expects, and/or the distance metric by which actual and expected distributions are to be compared.
    Type: Grant
    Filed: October 22, 2007
    Date of Patent: June 24, 2008
    Assignee: International Business Machines Corporation
    Inventors: Debabrata Dash, Guy M. Lohman, Nimrod Megiddo, Jun Rao
  • Patent number: 7376639
    Abstract: A method for consistent selectivity estimation based on the principle of maximum entropy (ME) is provided. The method efficiently exploits all available information and avoids the bias problem. In the absence of detailed knowledge, the ME approach reduces to standard uniformity and independence assumptions. The disclosed method, based on the principle of ME, is used to improve the optimizer's cardinality estimates by orders of magnitude, resulting in better plan quality and significantly reduced query execution times.
    Type: Grant
    Filed: July 28, 2005
    Date of Patent: May 20, 2008
    Assignee: International Business Machines Corporation
    Inventors: Marcel Kutsch, Volker Gerhard Markl, Nimrod Megiddo, Tam Minh Dai Tran
  • Publication number: 20080109811
    Abstract: Methods are provided for maximizing the throughput of a computer system in the presence of one or more power constraints. Throughput is maximized by repeatedly or continuously optimizing task scheduling and assignment for each of a plurality of components of a computer system. The components include a plurality of central processing units (CPUs) each operating at a corresponding operating frequency. The components also include a plurality of disk drives. The corresponding operating frequencies of one or more CPUs of the plurality of CPUs are adjusted to maximize computer system throughput under one or more power constraints. Optimizing task scheduling and assignment, as well as adjusting the corresponding operating frequencies of one or more CPUs, are performed by solving a mathematical optimization problem using a first methodology over a first time interval and a second methodology over a second time interval longer than the first time interval.
    Type: Application
    Filed: November 8, 2006
    Publication date: May 8, 2008
    Applicant: International Business Machines Corporation
    Inventors: Robert Krauthgamer, Nimrod Megiddo
  • Publication number: 20080016097
    Abstract: A method of selectivity estimation is disclosed in which preprocessing steps improve the feasibility and efficiency of the estimation. The preprocessing steps are: partitioning (to make iterative scaling estimation terminate in a reasonable time for even large sets of predicates); forced partitioning (to enable partitioning in case there are no “natural” partitions, by finding the subsets of predicates to create partitions that least impact the overall solution); inconsistency resolution (in order to ensure that there always is a correct and feasible solution); and implied zero elimination (to ensure convergence of the iterative scaling computation under all circumstances). All of these preprocessing steps make a maximum entropy method of selectivity estimation produce a correct cardinality model, for any kind of query with conjuncts of predicates. In addition, the preprocessing steps can also be used in conjunction with prior art methods for building a cardinality model.
    Type: Application
    Filed: July 13, 2006
    Publication date: January 17, 2008
    Applicant: International Business Machines Corporation
    Inventors: Peter Jay Haas, Marcel Kutsch, Volker Gerhard Markl, Nimrod Megiddo
  • Patent number: 7209913
    Abstract: A system (100) for searching and retrieving documents includes a database (106), a memory device (108), a user interface device (102) and a controller (104). The database (106) stores documents. The memory device (108) stores software, tokens and an index. The software performs methods according to a background routine (118) and a foreground routine (116). Each token (e.g., speed) has related expressions (e.g., miles per hour, mph, kilometers per hour, and kph) assigned thereto that define the token. The index has documents, having an occurrence of one of the related expressions for one of the tokens, assigned to the one of the tokens. The user interface device (102) accepts and sends search queries having a token and receives information related to the documents, having an occurrence of the related expressions for the token, responsive to a user interface process (120). The controller (104) is electrically coupled to the memory device (108), the user interface device (102) and the database (106).
    Type: Grant
    Filed: December 28, 2001
    Date of Patent: April 24, 2007
    Assignee: International Business Machines Corporation
    Inventors: Nimrod Megiddo, Andrew S. Tomkins, Shivakumar Vaithyanathan
  • Publication number: 20070078808
    Abstract: A novel method is employed for collecting optimizer statistics for optimizing database queries by gathering feedback from the query execution engine about the observed cardinality of predicates and constructing and maintaining multidimensional histograms. This makes use of the correlation between data columns without employing an inefficient data scan. The maximum entropy principle is used to approximate the true data distribution by a histogram distribution that is as “simple” as possible while being consistent with the observed predicate cardinalities. Changes in the underlying data are readily adapted to, automatically detecting and eliminating inconsistent feedback information in an efficient manner. The size of the histogram is controlled by retaining only the most “important” feedback.
    Type: Application
    Filed: September 30, 2005
    Publication date: April 5, 2007
    Inventors: Peter Haas, Volker Markl, Nimrod Megiddo, Utkarsh Srivastava
  • Publication number: 20070027837
    Abstract: A method for consistent selectivity estimation based on the principle of maximum entropy (ME) is provided. The method efficiently exploits all available information and avoids the bias problem. In the absence of detailed knowledge, the ME approach reduces to standard uniformity and independence assumptions. The disclosed method, based on the principle of ME, is used to improve the optimizer's cardinality estimates by orders of magnitude, resulting in better plan quality and significantly reduced query execution times.
    Type: Application
    Filed: July 28, 2005
    Publication date: February 1, 2007
    Inventors: Marcel Kutsch, Volker Markl, Nimrod Megiddo, Tam Minh Tran
  • Patent number: 7167953
    Abstract: An adaptive replacement cache policy dynamically maintains two lists of pages, a recency list and a frequency list, in addition to a cache directory. The policy keeps these two lists to roughly the same size, the cache size c. Together, the two lists remember twice the number of pages that would fit in the cache. At any time, the policy selects a variable number of the most recent pages to exclude from the two lists. The policy adaptively decides in response to an evolving workload how many top pages from each list to maintain in the cache at any given time. It achieves such online, on-the-fly adaptation by using a learning rule that allows the policy to track a workload quickly and effectively.
    Type: Grant
    Filed: June 13, 2005
    Date of Patent: January 23, 2007
    Assignee: International Business Machines Corporation
    Inventors: Nimrod Megiddo, Dharmendra Shantilal Modha
  • Patent number: 7117130
    Abstract: Stochastic control problems of linear systems in high dimensions are solved by modeling a structured Markov Decision Process (MDP). A state space for the MDP is a polyhedron in a Euclidean space and one or more actions that are feasible in a state of the state space are linearly constrained with respect to the state. One or more approximations are built from above and from below to a value function for the state using representations that facilitate the computation of approximately optimal actions at any given state by linear programming.
    Type: Grant
    Filed: June 28, 2000
    Date of Patent: October 3, 2006
    Assignee: International Business Machines Corporation
    Inventor: Nimrod Megiddo
  • Patent number: 7013344
    Abstract: A distributed processing system, program product and method of executing a computer program distributed across a plurality of computers. First, interested participants register and provide a commitment for available excess computer capacity. Participants may enter a number of available hours and machine characteristics. A normalized capacity may be derived from the machine characteristics and a normalized excess capacity may be derived from the number of hours committed for the participant. New registrants may be assigned benchmark tasks to indicate likely performance. Parties may purchase capacity for executing large computer programs and searches. The computer program is partitioned into multiple independent tasks of approximately equal size and the tasks are distributed to participants according to available excess capacity. A determination is made whether each distributed task will execute within a selected range of other distributed tasks and, if not, tasks may be reassigned.
    Type: Grant
    Filed: January 9, 2002
    Date of Patent: March 14, 2006
    Assignee: International Business Machines Corporation
    Inventor: Nimrod Megiddo
  • Publication number: 20060041837
    Abstract: A system, program storage device, and method of buffering an electronic document received from a host computer, wherein the method comprises determining whether an original source code of the electronic document includes executable coding which when executed by a client computer, causes the client computer to perform undesired operations, and producing an alternate source code of the electronic document, which eliminates the coding, wherein the undesired operations are characterized as undesirable based on predetermined settings established by the client computer. The electronic document comprises any of a web page, electronic mail message, an electronic mail attachment, a note in a hypertext format, a text document, a text file, and an application-specific electronic document. Each of the original source code and the alternate source code comprises a hypertext transfer protocol (HTTP) source code.
    Type: Application
    Filed: June 7, 2004
    Publication date: February 23, 2006
    Inventors: Arnon Amir, Nimrod Megiddo
  • Patent number: 6996676
    Abstract: An adaptive replacement cache policy dynamically maintains two lists of pages, a recency list and a frequency list, in addition to a cache directory. The policy keeps these two lists to roughly the same size, the cache size c. Together, the two lists remember twice the number of pages that would fit in the cache. At any time, the policy selects a variable number of the most recent pages to exclude from the two lists. The policy adaptively decides in response to an evolving workload how many top pages from each list to maintain in the cache at any given time. It achieves such online, on-the-fly adaptation by using a learning rule that allows the policy to track a workload quickly and effectively. This allows the policy to balance between recency and frequency in an online and self-tuning fashion, in response to evolving and possibly changing access patterns. The policy is also scan-resistant.
    Type: Grant
    Filed: November 14, 2002
    Date of Patent: February 7, 2006
    Assignee: International Business Machines Corporation
    Inventors: Nimrod Megiddo, Dharmendra Shantilal Modha
  • Patent number: 6996268
    Abstract: A system, method and search engine for searching images for data contained therein. Training images are provided and image attributes are extracted from the training images. Attributes extracted from training images include image features characteristic of a particular numerically generated image type, such as horizontal lines, vertical lines, percentage white area, circular arcs and text. Then, the training images are classified according to extracted attributes and a particular classifier is selected for each group of training images. Classifiers can include classification trees, discriminant functions, regression trees, support vector machines, neural nets and hidden Markov models. Available images are collected from remotely connected computers, e.g., over the Internet. Collected images are indexed and provided for interrogation by users. As a user enters queries, indexed images are identified and returned to the user. The user may provide additional data as supplemental data to the extracted image data.
    Type: Grant
    Filed: December 28, 2001
    Date of Patent: February 7, 2006
    Assignee: International Business Machines Corporation
    Inventors: Nimrod Megiddo, Shivakumar Vaithyanathan
  • Patent number: 6985868
    Abstract: A system, method and program product for commerce management, especially for managing contingency agreements or contracts. An agreement is entered into the system, logging conditions for the agreement and identifying potential responses to satisfy each condition. A location may also be identified for each identified potential response, e.g., an HTML link to an Internet web site. Milestones are set to determine when to check whether conditions have been satisfied. As each milestone is encountered, information is retrieved from the locations or provided manually. The retrieved information is checked to determine whether the agreement is determinate, i.e., all of the conditions have been satisfied, or the agreement has failed because one condition will not be satisfied. If more conditions remain unsatisfied and are identified with subsequent milestones, the most recent milestone is recorded. The contracting parties are notified regarding status of the agreement and of passing any milestone.
    Type: Grant
    Filed: March 22, 2000
    Date of Patent: January 10, 2006
    Assignee: International Business Machines Corporation
    Inventor: Nimrod Megiddo
  • Publication number: 20050289448
    Abstract: A system, method and search engine for searching images for data contained therein. Training images are provided and image attributes are extracted from the training images. Attributes extracted from training images include image features characteristic of a particular numerically generated image type, such as horizontal lines, vertical lines, percentage white area, circular arcs and text. Then, the training images are classified according to extracted attributes and a particular classifier is selected for each group of training images. Classifiers can include classification trees, discriminant functions, regression trees, support vector machines, neural nets and hidden Markov models. Available images are collected from remotely connected computers, e.g., over the Internet. Collected images are indexed and provided for interrogation by users. As a user enters queries, indexed images are identified and returned to the user. The user may provide additional data as supplemental data to the extracted image data.
    Type: Application
    Filed: July 18, 2005
    Publication date: December 29, 2005
    Applicant: International Business Machines Corporation
    Inventors: Nimrod Megiddo, Shivakumar Vaithyanathan
  • Publication number: 20050235114
    Abstract: An adaptive replacement cache policy dynamically maintains two lists of pages, a recency list and a frequency list, in addition to a cache directory. The policy keeps these two lists to roughly the same size, the cache size c. Together, the two lists remember twice the number of pages that would fit in the cache. At any time, the policy selects a variable number of the most recent pages to exclude from the two lists. The policy adaptively decides in response to an evolving workload how many top pages from each list to maintain in the cache at any given time. It achieves such online, on-the-fly adaptation by using a learning rule that allows the policy to track a workload quickly and effectively.
    Type: Application
    Filed: June 13, 2005
    Publication date: October 20, 2005
    Inventors: Nimrod Megiddo, Dharmendra Modha
  • Patent number: 6957224
    Abstract: A system, method and computer program product for providing links to remotely located information in a network of remotely connected computers. The system may or may not include a server providing an interface between shorthand codes and corresponding original files. If the server is included, a uniform resource locator (URL) is registered with the server. A shorthand link is associated with the registered URL. The associated shorthand link and URL are logged in a registry database. When a request is received for a shorthand link, the registry database is searched for an associated URL. If the shorthand link is found to be associated with a URL, the URL is fetched; otherwise an error message is returned. If the server is not included, all URLs located at a root page may be listed and associated with shorthand keys or links. Associated files and keys are indexed in an index file. The shorthand codes or keys are combined with the root page to form shorthand URLs.
    Type: Grant
    Filed: September 11, 2000
    Date of Patent: October 18, 2005
    Assignee: International Business Machines Corporation
    Inventors: Nimrod Megiddo, Kevin S. McCurley
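
The server-based flow described in the abstract of patent 6957224 (register a URL, associate a shorthand link with it, log the pair in a registry, and answer later requests with either the URL or an error) can be sketched roughly as follows. This is only an illustration under assumptions: the class name `ShorthandRegistry`, the function `handle_request`, the in-memory dictionary, and the counter-based code scheme are hypothetical and do not come from the patent, which does not specify how codes are generated or stored.

```python
from typing import Dict, Optional


class ShorthandRegistry:
    """Hypothetical in-memory stand-in for the registry database."""

    def __init__(self) -> None:
        self._url_by_code: Dict[str, str] = {}
        self._next_id = 1  # assumed code scheme: a simple running counter

    def register(self, url: str) -> str:
        """Register a URL and return the shorthand code associated with it."""
        code = f"s{self._next_id}"
        self._next_id += 1
        self._url_by_code[code] = url  # log the (code, URL) pair
        return code

    def resolve(self, code: str) -> Optional[str]:
        """Search the registry; return the URL, or None if the code is unknown."""
        return self._url_by_code.get(code)


def handle_request(registry: ShorthandRegistry, code: str) -> str:
    """Return the registered URL for a shorthand code, or an error message.

    A real server would go on to fetch the URL; this sketch only returns it.
    """
    url = registry.resolve(code)
    if url is None:
        return "error: unknown shorthand link"
    return url


if __name__ == "__main__":
    registry = ShorthandRegistry()
    code = registry.register("https://example.com/a/very/long/path")
    print(code, handle_request(registry, code))
    print(handle_request(registry, "missing"))
```

Keeping the lookup separate from the request handler mirrors the abstract's split between searching the registry and deciding whether to fetch the URL or report an error.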