Patents by Inventor Eugene J. Shekita

Eugene J. Shekita has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20090228528
    Abstract: A system, method, and computer program product for updating a partitioned index of a dataset. A document is indexed by separating it into indexable sections, such that different ones of the indexable sections may be contained in different partitions of the partitioned index. The partitioned index is updated using an updated version of the document by updating only those sections of the index corresponding to sections of the document that have been updated in the updated version.
    Type: Application
    Filed: March 6, 2008
    Publication date: September 10, 2009
    Applicant: International Business Machines Corporation
    Inventors: Vuk Ercegovac, Vanja Josifovski, Ning Li, Mauricio Mediano, Eugene J. Shekita
  • Patent number: 7496568
    Abstract: A method for querying multifaceted information. An inverted index is constructed to include unique indexed tokens associated with posting lists of one or more documents. An indexed token is either a facet token included in a document as an annotation or a path prefix of the facet token. The annotation indicates a path within a tree structure representing a facet that includes the document. The tree structure includes nodes representing categories of documents. Constructing the inverted index includes generating a full path token and an associated full path token posting list. A query is received that includes constraints on documents. The constraints are associated with indexed tokens and corresponding posting lists. An execution of the query includes identifying the corresponding posting lists by utilizing the constraints and the inverted index and intersecting the posting lists to obtain a query result.
    Type: Grant
    Filed: November 30, 2006
    Date of Patent: February 24, 2009
    Assignee: International Business Machines Corporation
    Inventors: Andrei Z. Broder, Nadav Eiron, Felipe Marcus Fontoura, Ronny Lempel, Ning Li, John Ai McPherson, Jr., Andreas Neumann, Shila Ofek-Koifman, Runping Qi, Eugene J. Shekita
  • Publication number: 20080222117
    Abstract: A method and system for querying multifaceted information. An inverted index is constructed to include unique indexed tokens associated with posting lists of one or more documents. An indexed token is either a facet token included in a document as an annotation or a path prefix of the facet token. The annotation indicates a path within a tree structure representing a facet that includes the document. The tree structure includes nodes representing categories of documents. A query is received that includes constraints on documents. The constraints are associated with indexed tokens and corresponding posting lists. An execution of the query includes identifying the corresponding posting lists by utilizing the constraints and the inverted index and intersecting the posting lists to obtain a query result.
    Type: Application
    Filed: May 21, 2008
    Publication date: September 11, 2008
    Inventors: Andrei Z. Broder, Nadav Eiron, Felipe Marcus Fontoura, Ronny Lempel, Ning Li, John Ai McPherson, Andreas Neumann, Shila Ofek-Koifman, Runping Qi, Eugene J. Shekita
  • Patent number: 7424467
    Abstract: Disclosed is a technique for indexing data. A token is received. It is determined whether a data field associated with the token is a fixed width. When the data field is a fixed width, the token is designated as one for which a fixed width sort is to be performed. When the data field is a variable length, the token is designated as one for which a variable width sort is to be performed.
    Type: Grant
    Filed: January 26, 2004
    Date of Patent: September 9, 2008
    Assignee: International Business Machines Corporation
    Inventors: Marcus F. Fontoura, Andreas Neumann, Sridhar Rajagopalan, Eugene J. Shekita, Jason Yeong Zien
  • Publication number: 20080133473
    Abstract: A method for querying multifaceted information. An inverted index is constructed to include unique indexed tokens associated with posting lists of one or more documents. An indexed token is either a facet token included in a document as an annotation or a path prefix of the facet token. The annotation indicates a path within a tree structure representing a facet that includes the document. The tree structure includes nodes representing categories of documents. Constructing the inverted index includes generating a full path token and an associated full path token posting list. A query is received that includes constraints on documents. The constraints are associated with indexed tokens and corresponding posting lists. An execution of the query includes identifying the corresponding posting lists by utilizing the constraints and the inverted index and intersecting the posting lists to obtain a query result.
    Type: Application
    Filed: November 30, 2006
    Publication date: June 5, 2008
    Inventors: Andrei Z. Broder, Nadav Eiron, Felipe Marcus Fontoura, Ronny Lempel, Ning Li, John Ai McPherson, Andreas Neumann, Shila Ofek-Koifman, Runping Qi, Eugene J. Shekita
  • Publication number: 20080010302
    Abstract: A holistic twig join method with optimal cursor movement is disclosed. The method in one aspect minimizes the number of cursor moves by looking more globally at the query's state to determine which cursor to move next and making virtual moves where a physical move is not needed. The method in another aspect reduces the number of cursor moves by skipping over nodes that do not need to be output.
    Type: Application
    Filed: June 27, 2006
    Publication date: January 10, 2008
    Applicant: International Business Machines Corporation
    Inventors: Marcus F. Fontoura, Vanja Josifovski, Eugene J. Shekita, Beverly Yang
  • Patent number: 7293005
    Abstract: Disclosed is a technique for building an index in which global analysis computations and index creation are pipelined, wherein the global analysis computations share intermediate results.
    Type: Grant
    Filed: January 26, 2004
    Date of Patent: November 6, 2007
    Assignee: International Business Machines Corporation
    Inventors: Marcus F. Fontoura, Reiner Kraft, Tony K. Leung, John Ai McPherson, Jr., Andreas Neumann, Runping Qi, Sridhar Rajagopalan, Eugene J. Shekita, Jason Yeong Zien
  • Publication number: 20040243632
    Abstract: Disclosed is an evaluation technique for text search with black-box scoring functions, where it is unnecessary for the evaluation engine to maintain details of the scoring function. Included is a description of a system for dealing with black-box searching, proofs of correctness, as well as experimental evidence showing that the performance of the technique is comparable in efficiency to those techniques used in custom-built engines.
    Type: Application
    Filed: December 19, 2003
    Publication date: December 2, 2004
    Applicant: International Business Machines Corporation
    Inventors: Kevin Scott Beyer, Robert W. Lyle, Sridhar Rajagopalan, Eugene J. Shekita
  • Patent number: 5619692
    Abstract: A procedure for detecting a reordering requirement in a directed record stream during query execution in a relational database processing system. The query compiler component of a relational database processing system includes procedures for building query execution plans (QEPs) for evaluation preparatory to selecting an optimal plan for execution. These plans are constructed from the bottom up using an internal graphical representation for the user query that has a number of relation nodes interconnected by directed record streams (data flows). A relational operation within each node imposes an "order requirement" on the outflow stream represented by an order requirement vector O_R. The records within each directed record stream have an "order property" represented by an order property vector O_P. Order detection occurs when these two vectors are compared to determine whether the order property satisfies the order requirement.
    Type: Grant
    Filed: February 17, 1995
    Date of Patent: April 8, 1997
    Assignee: International Business Machines Corporation
    Inventors: Timothy R. Malkemus, Eugene J. Shekita, David E. Simmen
  • Patent number: 5548754
    Abstract: A method and apparatus for optimizing SQL queries in a relational database management system uses early-out join transformations. An early-out join comprises a many-to-one existential join, wherein the join scans an inner table for a match for each row of the outer table and terminates the scan for each row of the outer table when a single match is found in the inner table. To transform a many-to-many join to an early-out join, the query must include a requirement for distinctiveness, either explicitly or implicitly, in one or more result columns for the join operation. Distinctiveness can be specified using the DISTINCT keyword in the SELECT clause or can be implied from the predicates present in the query. The early-out join transformation also requires that no columns of the inner table be referenced after the join, or if an inner table column is referenced after the join, that each referenced column be "bound".
    Type: Grant
    Filed: February 7, 1995
    Date of Patent: August 20, 1996
    Assignee: International Business Machines Corporation
    Inventors: Mir H. Pirahesh, Ting Y. Leung, Guy M. Lohman, Eugene J. Shekita, David E. Simmen
  • Patent number: 5548758
    Abstract: A method and apparatus for optimizing SQL queries in a relational database management system uses early-out join transformations. An early-out join comprises a many-to-one existential join, wherein the join scans an inner table for a match for each row of the outer table and terminates the scan for each row of the outer table when a single match is found in the inner table. To transform a many-to-many join to an early-out join, the query must include a requirement for distinctiveness, either explicitly or implicitly, in one or more result columns for the join operation. Distinctiveness can be specified using the DISTINCT keyword in the SELECT clause or can be implied from the predicates present in the query. The early-out join transformation also requires that no columns of the inner table be referenced after the join, or if an inner table column is referenced after the join, that each referenced column be "bound".
    Type: Grant
    Filed: June 5, 1995
    Date of Patent: August 20, 1996
    Assignee: International Business Machines Corporation
    Inventors: Mir H. Pirahesh, Ting Y. Leung, Guy M. Lohman, Eugene J. Shekita, David E. Simmen
  • Patent number: 5542089
    Abstract: A data base management system estimates the number of occurrences of values of query search keys in a data set by defining at least two independent hashing functions that map the values of the data set to buckets of respective hashing tables and maintaining a bucket count as each value from the data set is mapped to the hashing tables. A bucket is defined to be a "popular" bucket if the bucket count of the value exceeds a predetermined threshold. If all of the buckets to which a value is mapped are designated popular buckets, that value is designated an "active" value. Once a value is designated active, statistical data related to the value is collected. Estimates of the most frequently occurring values in the data set are generated from the collected statistical data. In this way, a data base management system can more effectively produce a search plan that provides an efficient response to user queries.
    Type: Grant
    Filed: July 26, 1994
    Date of Patent: July 30, 1996
    Assignee: International Business Machines Corporation
    Inventors: Bruce G. Lindsay, Eugene J. Shekita
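
The abstract of patent 5542089 above describes the frequent-value estimation scheme only in outline. The following is a minimal Python sketch of that outline, written for illustration and not taken from the patent itself. The class name FrequentValueEstimator, the method names, the use of salted BLAKE2b digests as the two independent hashing functions, the default table size and threshold, and the choice to start a value's statistics at 1 on the occurrence that makes it active are all assumptions.

```python
import hashlib
from collections import Counter


def _bucket(value, salt, num_buckets):
    """Map a value to a bucket index with a salted hash (hypothetical hash choice)."""
    digest = hashlib.blake2b(
        repr(value).encode("utf-8"), salt=salt, digest_size=8
    ).digest()
    return int.from_bytes(digest, "big") % num_buckets


class FrequentValueEstimator:
    """Illustrative sketch: two hash tables of bucket counts plus per-value
    statistics that are collected only for values designated "active"."""

    def __init__(self, num_buckets=64, threshold=3, salts=(b"table-1", b"table-2")):
        self.num_buckets = num_buckets
        self.threshold = threshold          # a bucket is "popular" above this count
        self.salts = salts                  # one salt per hashing function
        self.tables = [[0] * num_buckets for _ in salts]
        self.active_counts = Counter()      # statistics for active values only

    def add(self, value):
        # Map the value into one bucket per table and bump each bucket count.
        buckets = [_bucket(value, s, self.num_buckets) for s in self.salts]
        for table, b in zip(self.tables, buckets):
            table[b] += 1

        if value in self.active_counts:
            # Already active: keep collecting statistics for it.
            self.active_counts[value] += 1
        elif all(t[b] > self.threshold for t, b in zip(self.tables, buckets)):
            # Every bucket the value maps to is popular: designate it active.
            # Assumption: counting starts with this occurrence.
            self.active_counts[value] = 1

    def top(self, k):
        """Return the k active values with the highest collected counts."""
        return self.active_counts.most_common(k)


est = FrequentValueEstimator(num_buckets=64, threshold=3)
for v in ["a"] * 10 + ["b", "c", "d"]:
    est.add(v)
print(est.top(1))  # [('a', 7)]: 'a' becomes active on its 4th occurrence
```

Because statistics are kept only for active values, the collected counts understate true frequencies by roughly the threshold. In this sketch that is acceptable, since the goal is to rank the most frequent values, not to count them exactly.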