Patents by Inventor Eugene J. Shekita

Eugene J. Shekita has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

EFFICIENT PARTITIONED JOINS IN A DATABASE WITH COLUMN-MAJOR LAYOUT

Publication number: 20140006380

Abstract: Embodiments of the present invention provide a database processing system for efficient partitioning of a database table with column-major layout for executing one or more join operations. One embodiment comprises a method for partitioning a database table with column-major layout, partitioning only the join-columns by limiting the partitions by size and number, executing one or more join operations for joining the partitioned columns, and optionally de-partitioning the join result to the original order by sequentially writing and randomly reading table values using P cursors.

Type: Application

Filed: August 24, 2012

Publication date: January 2, 2014

Applicant: International Business Machines Corporation

Inventors: Stefan Arndt, Gopi K. Attaluri, Ronald J. Barber, Guy M. Lohman, Lin Qiao, Vijayshankar Raman, Eugene J. Shekita, Richard S. Sidle
EFFICIENT PARTITIONED JOINS IN A DATABASE WITH COLUMN-MAJOR LAYOUT

Publication number: 20140006379

Abstract: Embodiments of the present invention provide a database processing system for efficient partitioning of a database table with column-major layout for executing one or more join operations. One embodiment comprises a method for partitioning a database table with column-major layout, partitioning only the join-columns by limiting the partitions by size and number, executing one or more join operations for joining the partitioned columns, and optionally de-partitioning the join result to the original order by sequentially writing and randomly reading table values using P cursors.

Type: Application

Filed: June 29, 2012

Publication date: January 2, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Stefan Arndt, Gopi K. Attaluri, Ronald J. Barber, Guy M. Lohman, Lin Qiao, Vijayshankar Raman, Eugene J. Shekita, Richard S. Sidle
INTRA-BLOCK PARTITIONING FOR DATABASE MANAGEMENT

Publication number: 20130325901

Abstract: A method for storing database information, including: storing a table having data values in a column major order, wherein the data values are stored in a list of blocks, assigning a tuple sequence number (TSN) to each data value in each column of the table according to a sequence order in the table, wherein data values that correspond to each other across a plurality of columns of the table have equivalent TSNs; assigning each data value to a partition based on a representation of the data value; and assigning a tuple map value to each data value, wherein the tuple map value identifies the partition in which each data value is located.

Type: Application

Filed: August 30, 2012

Publication date: December 5, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ronald J. Barber, Min-Soo Kim, Sam S. Lightstone, Guy M. Lohman, Lin Qiao, Vijayshankar Raman, Eugene J. Shekita, Richard S. Sidle
INTRA-BLOCK PARTITIONING FOR DATABASE MANAGEMENT

Publication number: 20130325900

Abstract: A method for storing database information includes storing a table having data values in a column major order. The data values are stored in a list of blocks. The method also includes assigning a tuple sequence number (TSN) to each data value in each column of the table according to a sequence order in the table. The data values that correspond to each other across a plurality of columns of the table have equivalent TSNs. The method also includes assigning each data value to a partition based on a representation of the data value. The method also includes assigning a tuple map value to each data value. The tuple map value identifies the partition in which each data value is located.

Type: Application

Filed: May 31, 2012

Publication date: December 5, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ronald J. Barber, Min-Soo Kim, Sam S. Lightstone, Guy M. Lohman, Lin Qiao, Vijayshankar Raman, Eugene J. Shekita, Richard S. Sidle
SYSTEMS AND METHODS FOR HIGHLY PARALLEL PROCESSING OF PARAMETERIZED SIMULATIONS

Publication number: 20120323551

Abstract: Systems and associated methods for highly parallel processing of parameterized simulations are described. Embodiments permit processing of stochastic data-intensive simulations in a highly parallel fashion in order to distribute the intensive workload. Embodiments utilize methods of seeding records in a database with a source of pseudo-random numbers, such as a compressed seed for a pseudo-random number generator, such that seeded records may be processed independently in a highly parallel fashion. Thus, embodiments provide systems and associated methods facilitating quicker data-intensive simulation by enabling highly parallel asynchronous simulations.

Type: Application

Filed: August 27, 2012

Publication date: December 20, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Kevin S. Beyer, Vuk Ercegovac, Peter Haas, Eugene J. Shekita, Fei Xu
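
The seeding idea in the abstract above can be illustrated with a minimal Python sketch. It is not the claimed method: the record fields (`id`, `mean`, `stddev`), the helper names, and the use of a plain 64-bit seed (rather than the compressed seed the abstract mentions) are all assumptions made for illustration. The point shown is only that records carrying their own seed can be simulated independently, in any order, with identical results.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def seed_records(records, master_seed):
    """Attach an independent pseudo-random seed to every record (hypothetical helper)."""
    master = random.Random(master_seed)
    return [dict(rec, seed=master.getrandbits(64)) for rec in records]


def simulate(record, draws=1000):
    """Run one stochastic simulation using only the record's own seed."""
    rng = random.Random(record["seed"])  # private generator, no shared state
    total = sum(rng.gauss(record["mean"], record["stddev"]) for _ in range(draws))
    return record["id"], total / draws


def run_parallel(records, workers=4):
    """Process seeded records concurrently; results do not depend on scheduling."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(simulate, records))


if __name__ == "__main__":
    data = [{"id": i, "mean": float(i), "stddev": 1.0} for i in range(8)]
    seeded = seed_records(data, master_seed=42)
    sequential = dict(simulate(r) for r in seeded)
    # Parallel execution reproduces the sequential results exactly.
    print(run_parallel(seeded) == sequential)
```

Because each record owns its generator state, the work can be split across workers without coordinating a shared random stream.
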
DISTRIBUTED REVERSE SEMANTIC INDEX

Publication number: 20120323919

Abstract: Embodiments of the invention relate to building a distributed reverse semantic index. In one general embodiment a plurality of documents are received with each document having at least one defined rule and/or semantic. The documents are distributed among a plurality of nodes of a system. The documents are processed in a generally parallel fashion. Processing the documents includes processing text data of each of the documents and breaking each document into fields to index the text data to create index data by deferring how to categorize the text data based upon the defined rule and/or semantics. The indexed data is combined back together to create an indexer-agnostic semantic index including a plurality of the semantic index shards and to semantically classify the documents based on the index shards into groups based on document type to create the distributed reverse semantic index.

Type: Application

Filed: August 27, 2012

Publication date: December 20, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Eugene J. Shekita, Asim V. Singh, Yuanyuan Tian, Kevin B. Wang
SCALABLE ROW-STORE WITH CONSENSUS-BASED REPLICATION

Publication number: 20120271795

Abstract: A method for updating a scalable row-store, including: receiving an update to a key within a range of keys in a database table, wherein the database table is distributed across nodes in a cluster of computing devices; and replicating the update over a group of the nodes using a consensus-based replication algorithm, wherein the replication algorithm includes completing the update in response to receiving acknowledgement messages from a majority of the nodes in the group indicating that the majority has received notification of the update.

Type: Application

Filed: April 21, 2011

Publication date: October 25, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jun Rao, Eugene J. Shekita, Sandeep Tata
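
The majority-acknowledgement rule in the abstract above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the patented replication algorithm: replicas are modelled as plain callables, `make_replica` is a hypothetical in-memory stand-in for a node, and leader election, logging, and recovery are left out entirely.

```python
from typing import Callable, Iterable


def replicate_with_majority(update: dict, replicas: Iterable[Callable[[dict], bool]]) -> bool:
    """Send `update` to the replicas; succeed once a strict majority acknowledge it."""
    replicas = list(replicas)
    needed = len(replicas) // 2 + 1  # strict majority of the group
    acks = 0
    for send in replicas:
        try:
            if send(update):  # True means the replica acknowledged receipt
                acks += 1
        except ConnectionError:
            continue  # an unreachable replica simply contributes no ack
        if acks >= needed:
            return True  # majority reached; the update can be completed
    return False


def make_replica(log: list, healthy: bool = True) -> Callable[[dict], bool]:
    """Hypothetical in-memory replica that appends each update to `log`."""
    def send(update: dict) -> bool:
        if not healthy:
            raise ConnectionError("replica unreachable")
        log.append(update)
        return True
    return send


if __name__ == "__main__":
    logs = [[], [], []]
    group = [make_replica(logs[0]), make_replica(logs[1], healthy=False), make_replica(logs[2])]
    print(replicate_with_majority({"key": "k1", "value": 1}, group))
```

With three replicas, one failure is tolerated because two acknowledgements still form a majority. The sketch stops contacting replicas once the majority is reached; a real system would keep delivering the update to the remaining nodes.
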
Method, system, and program for handling redirects in a search engine

Patent number: 8296304

Abstract: Disclosed is a method, system, and program for handling redirects in documents. At least one equivalence class that includes documents that are connected through a redirect. Cycles for each equivalence class are detected, wherein documents in a cycle are marked so that they are not indexed. Incomplete chains for each equivalence class are detected, wherein documents in an incomplete chain are marked so that they are not indexed. A representative for each equivalence class is selected.

Type: Grant

Filed: January 26, 2004

Date of Patent: October 23, 2012

Assignee: International Business Machines Corporation

Inventors: Marcus F. Fontoura, Andreas Neumann, Runping Qi, Eugene J. Shekita
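
A minimal Python sketch of the steps named in the abstract above (following redirects, detecting cycles and incomplete chains, picking a representative) might look as follows. It is an illustration only: the function name, the `redirects` and `crawled` inputs, and the choice of the final redirect target as the representative are assumptions, since the abstract does not say how the representative is selected.

```python
def classify_redirects(redirects: dict, crawled: set):
    """Follow each document's redirect chain and classify the outcome.

    redirects: maps a source document to the document it redirects to.
    crawled:   the documents that are actually available for indexing.
    Returns (representative, skipped): the first maps a document to the
    representative of its class, the second maps a document that must not
    be indexed to the reason ("cycle" or "incomplete").
    """
    representative = {}
    skipped = {}
    for start in crawled:
        seen = set()
        current = start
        while True:
            if current in seen:  # the chain loops back on itself
                reason = "cycle"
                break
            if current not in crawled:  # the chain leads to an unknown document
                reason = "incomplete"
                break
            seen.add(current)
            if current not in redirects:  # reached a document with no redirect
                reason = None
                break
            current = redirects[current]
        if reason is None:
            representative[start] = current  # assumption: final target represents the class
        else:
            skipped[start] = reason
    return representative, skipped


if __name__ == "__main__":
    redirects = {"a": "b", "b": "c", "x": "y", "y": "x", "p": "q"}
    crawled = {"a", "b", "c", "x", "y", "p"}
    reps, skipped = classify_redirects(redirects, crawled)
    print(reps)
    print(skipped)
```

Documents that share a representative form one equivalence class; documents in `skipped` would be marked so that they are not indexed.
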
DISTRIBUTED REVERSE SEMANTIC INDEX

Publication number: 20120254089

Abstract: Embodiments of the invention relate to building a distributed reverse semantic index. In one general embodiment a plurality of documents are received with each document having at least one defined rule and/or semantic. The documents are distributed among a plurality of nodes of a system. The documents are processed in a generally parallel fashion. Processing the documents includes processing text data of each of the documents and breaking each document into fields to index the text data to create index data by deferring how to categorize the text data based upon the defined rule and/or semantics. The indexed data is combined back together to create an indexer-agnostic semantic index including a plurality of the semantic index shards and to semantically classify the documents based on the index shards into groups based on document type to create the distributed reverse semantic index.

Type: Application

Filed: March 31, 2011

Publication date: October 4, 2012

Applicant: International Business Machines Corporation

Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Eugene J. Shekita, Asim V. Singh, Yuanyuan Tian, Kevin B. Wang
Indexing and searching JSON objects

Patent number: 8260784

Abstract: Disclosed is a method of encoding JavaScript Object Notation (JSON) documents in an inverted index, wherein a tree representation of a JSON document is first generated, and, next, the JSON document is shredded into a list of <value, path, type, jdewey> tuples for each atom node, n, in the tree, where value is a label associated with n, path is a concatenation of node labels associated with ancestors of n, type is a description of a type of value, and jdewey of n is a partial Dewey code of its closest ancestor array node, if one exists, or empty, otherwise. Lastly, an inverted index is built using <path, type, value> as index term, and jdewey as payload. A method is also described to search the inverted index.

Type: Grant

Filed: February 13, 2009

Date of Patent: September 4, 2012

Assignee: International Business Machines Corporation

Inventors: Kevin Scott Beyer, Jun Rao, Eugene J. Shekita
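
The shredding step described in the abstract above can be illustrated with a simplified Python sketch. It is not the patented encoding: the jdewey payload is omitted entirely, paths are built from object keys joined with dots, array elements are assumed to share the path of their array, and the function names are hypothetical. Only the idea of indexing each atom under a <path, type, value> term is shown.

```python
from collections import defaultdict


def shred(node, path=""):
    """Yield a (path, type, value) term for every atom in a parsed JSON document."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from shred(child, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for child in node:
            yield from shred(child, path)  # assumption: elements share the array's path
    else:
        yield (path, type(node).__name__, node)


def build_json_index(docs):
    """Build an inverted index from (path, type, value) terms to document ids."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for term in shred(doc):
            index[term].add(doc_id)
    return index


if __name__ == "__main__":
    docs = {
        1: {"name": "ann", "tags": ["x", "y"]},
        2: {"name": "bob", "tags": ["y"], "address": {"zip": 95120}},
    }
    index = build_json_index(docs)
    print(sorted(index[("tags", "str", "y")]))
```

Searching then reduces to looking up one or more terms and combining their posting sets.
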
INDEX PARTITION MAINTENANCE OVER MONOTONICALLY ADDRESSED DOCUMENT SEQUENCES

Publication number: 20120059823

Abstract: Provided are techniques for partitioning a physical index into one or more physical partitions; assigning each of the one or more physical partitions to a node in a cluster of nodes; for each received document, assigning an assigned-doc-ID comprising an integer document identifier; and, in response to assigning the assigned-doc-ID to a document, determining a cut-off of assignment of new documents to a current virtual-index-epoch comprising a first set of physical partitions and placing the new documents into a new virtual-index-epoch comprising a second set of physical partitions by inserting each new document to a specific one of the physical partitions in the second set using one or more functions that direct the placement based on one of the assigned-doc-id, a field value derived from a set of fields obtained from the document, and a combination of the assigned-doc-id and the field value.

Type: Application

Filed: September 3, 2010

Publication date: March 8, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ronald J. Barber, Harish Deshmukh, Ning Li, Bruce G. Lindsay, Sridhar Rajagopalan, Roger C. Raphael, Eugene J. Shekita, Paul S. Taylor
SYSTEMS AND METHODS FOR HIGHLY PARALLEL PROCESSING OF PARAMETERIZED SIMULATIONS

Publication number: 20110320184

Abstract: Systems and associated methods for highly parallel processing of parameterized simulations are described. Embodiments permit processing of stochastic data-intensive simulations in a highly parallel fashion in order to distribute the intensive workload. Embodiments utilize methods of seeding records in a database with a source of pseudo-random numbers, such as a compressed seed for a pseudo-random number generator, such that seeded records may be processed independently in a highly parallel fashion. Thus, embodiments provide systems and associated methods facilitating quicker data-intensive simulation by enabling highly parallel asynchronous simulations.

Type: Application

Filed: June 29, 2010

Publication date: December 29, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Kevin S. Beyer, Vuk Ercegovac, Peter Haas, Eugene J. Shekita, Fei Xu
Efficient multifaceted search in information retrieval systems

Patent number: 8032532

Abstract: A method and system for querying multifaceted information. An inverted index is constructed to include unique indexed tokens associated with posting lists of one or more documents. An indexed token is either a facet token included in a document as an annotation or a path prefix of the facet token. The annotation indicates a path within a tree structure representing a facet that includes the document. The tree structure includes nodes representing categories of documents. A query is received that includes constraints on documents. The constraints are associated with indexed tokens and corresponding posting lists. An execution of the query includes identifying the corresponding posting lists by utilizing the constraints and the inverted index and intersecting the posting lists to obtain a query result.

Type: Grant

Filed: May 21, 2008

Date of Patent: October 4, 2011

Assignee: International Business Machines Corporation

Inventors: Andrei Z. Broder, Nadav Eiron, Marcus Felipe Fontoura, Ronny Lempel, Ning Li, John A. McPherson, Jr., Andreas Neumann, Shila Ofek-Koifman, Runping Qi, Eugene J. Shekita
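
The indexing scheme in the abstract above (facet tokens plus their path prefixes, with queries answered by intersecting posting lists) can be sketched in Python as follows. This is an illustrative simplification, not the patented system: the `/`-separated path syntax, the function names, and the in-memory dictionaries are assumptions.

```python
from collections import defaultdict


def facet_tokens(path):
    """Return a facet path together with all of its path prefixes."""
    parts = path.strip("/").split("/")
    return ["/".join(parts[: i + 1]) for i in range(len(parts))]


def build_facet_index(docs):
    """Map every facet token and prefix to a sorted posting list of document ids."""
    postings = defaultdict(set)
    for doc_id, paths in docs.items():
        for path in paths:
            for token in facet_tokens(path):
                postings[token].add(doc_id)
    return {token: sorted(ids) for token, ids in postings.items()}


def query(index, constraints):
    """Intersect the posting lists of all constraint tokens."""
    lists = [set(index.get(token, ())) for token in constraints]
    if not lists:
        return []
    return sorted(set.intersection(*lists))


if __name__ == "__main__":
    docs = {
        1: ["color/red", "brand/acme"],
        2: ["color/red/dark", "brand/other"],
        3: ["color/blue", "brand/acme"],
    }
    index = build_facet_index(docs)
    print(query(index, ["color/red", "brand/acme"]))
```

Indexing the prefixes means a query on a category such as `color/red` also matches documents annotated with a deeper path such as `color/red/dark`, without walking the tree at query time.
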
Adaptive evaluation of text search queries with blackbox scoring functions

Patent number: 7991771

Abstract: Disclosed is an evaluation technique for text search with black-box scoring functions, where it is unnecessary for the evaluation engine to maintain details of the scoring function. Included is a description of a system for dealing with blackbox searching, proofs of correctness, as well as experimental evidence showing that the performance of the technique is comparable in efficiency to those techniques used in custom-built engines.

Type: Grant

Filed: November 21, 2006

Date of Patent: August 2, 2011

Assignee: International Business Machines Corporation

Inventors: Kevin Scott Beyer, Robert W. Lyle, Sridhar Rajagopalan, Eugene J. Shekita
JOIN ALGORITHMS OVER FULL TEXT INDEXES

Publication number: 20110184933

Abstract: According to one embodiment of the present invention, a method for processing join predicates in full-text indexes is provided. The method includes evaluating local predicates of an outer full text index to generate a first posting list of documents. For each document in the first posting list, the value of a join attribute is determined and an inner full text index is probed to obtain a second posting list of documents containing one of the join attributes determined for each document. Local predicates of an inner full text index are evaluated to generate a third posting list of documents, and the second posting list is merged with the third posting list to generate a merge list of documents. Documents in the first posting list may be paired up with documents in the merge list.

Type: Application

Filed: January 28, 2010

Publication date: July 28, 2011

Applicant: International Business Machines Corporation

Inventors: Latha Sankar Colby, Quanzhong Li, Fatma Ozcan, Mir Hamid Pirahesh, Eugene J. Shekita, Zografoula Vagena
SYNCHRONIZING AN AUXILIARY DATA SYSTEM WITH A PRIMARY DATA SYSTEM

Publication number: 20110113010

Abstract: Systems, methods and articles of manufacture are disclosed for synchronizing a primary data system with an auxiliary data system that processes data for the primary data system. In one embodiment, how current the primary data system and the auxiliary data system are may be determined. Requests sent from the primary data system that were not processed by the auxiliary data system may be determined. The requests may be resent to the auxiliary data system for processing.

Type: Application

Filed: November 11, 2009

Publication date: May 12, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ronald J. Barber, Harish Deshmukh, Ning Li, Bruce G. Lindsay, Sridhar Rajagopalan, Roger C. Raphael, Eugene J. Shekita
Pipelined architecture for global analysis and index building

Patent number: 7783626

Abstract: Provided is a technique for building an index. A new index_{i+1} is built and an anchor text table_{i+1} and a duplicates table_{i+1} are output using a store_i, a delta store, and previously generated global analysis computations_i, wherein the previously generated global analysis computations_i include an anchor text table_i, a rank table_i, and a duplicates table_i. New global analysis computations_{i+1} are generated using the anchor text table_{i+1}, the duplicates table_{i+1}, and the previously generated global analysis computations_i.

Type: Grant

Filed: August 17, 2007

Date of Patent: August 24, 2010

Assignee: International Business Machines Corporation

Inventors: Marcus Felipe Fontoura, Reiner Kraft, Tony Kai-Chi Leung, John A. McPherson, Jr., Andreas Neumann, Runping Qi, Sridhar Rajagopalan, Eugene J. Shekita, Jason Yeong Zien
INDEXING AND SEARCHING JSON OBJECTS

Publication number: 20100211572

Abstract: Disclosed is a method of encoding JavaScript Object Notation (JSON) documents in an inverted index, wherein a tree representation of a JSON document is first generated, and, next, the JSON document is shredded into a list of <value, path, type, jdewey> tuples for each atom node, n, in the tree, where value is a label associated with n, path is a concatenation of node labels associated with ancestors of n, type is a description of a type of value, and jdewey of n is a partial Dewey code of its closest ancestor array node, if one exists, or empty, otherwise. Lastly, an inverted index is built using <path, type, value> as index term, and jdewey as payload. A method is also described to search the inverted index.

Type: Application

Filed: February 13, 2009

Publication date: August 19, 2010

Applicant: International Business Machines Corporation

Inventors: Kevin Scott Beyer, Jun Rao, Eugene J. Shekita
Architecture for an indexer

Patent number: 7743060

Abstract: Disclosed is a technique for indexing data. For each token in a set of documents, a sort key is generated that includes a document identifier that indicates whether a section of a document associated with the sort key is an anchor text section or a context section, wherein the anchor text section and the context section have a same document identifier; it is determined whether a data field associated with the token is a fixed width; when the data field is a fixed width, the token is designated as one for which a fixed width sort is to be performed; and, when the data field is a variable length, the token is designated as one for which a variable width sort is to be performed. The fixed width sort and the variable width sort are performed. For each document, the sort keys are used to bring together the anchor text section and the context section of that document.

Type: Grant

Filed: August 6, 2007

Date of Patent: June 22, 2010

Assignee: International Business Machines Corporation

Inventors: Marcus Felipe Fontoura, Andreas Neumann, Sridhar Rajagopalan, Eugene J. Shekita, Jason Yeong Zien
Virtual cursors for XML joins

Patent number: 7685138

Abstract: A system, method, and computer program product to improve XML query processing efficiency with virtual cursors. Structural joins are a fundamental operation in XML query processing, and substantial work exists on index-based algorithms for executing them. Two well-known index features—path indices and ancestor information—are combined in a novel way to replace at least some of the physical index cursors in a structural join with virtual cursors. The position of a virtual cursor is derived from the path and ancestor information of a physical cursor. Virtual cursors can be easily incorporated into existing structural join algorithms. By eliminating index I/O and the processing cost of handling physical inverted lists, virtual cursors can improve the performance of holistic path queries by an order of magnitude or more.

Type: Grant

Filed: November 8, 2005

Date of Patent: March 23, 2010

Assignee: International Business Machines Corporation

Inventors: Kevin S. Beyer, Marcus Felipe Fontoura, Sridhar Rajagopalan, Eugene J. Shekita, Beverly Yang
