Patents by Inventor Vuk Ercegovac

Vuk Ercegovac has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20150363466
    Abstract: In one embodiment, a computer-implemented method includes selecting one or more sub-expressions of a query during compile time. One or more pilot runs are performed by one or more computer processors. The one or more pilot runs include a pilot run associated with each of one or more of the selected sub-expressions, and each pilot run includes at least partial execution of the associated selected sub-expression. The pilot runs are performed during execution time. Statistics are collected on the one or more pilot runs during performance of the one or more pilot runs. The query is optimized based at least in part on the statistics collected during the one or more pilot runs, where the optimization includes basing cardinality and cost estimates on the statistics collected during the pilot runs.
    Type: Application
    Filed: June 11, 2014
    Publication date: December 17, 2015
    Inventors: Andrey Balmin, Vuk Ercegovac, Jesse E. Jackson, Konstantinos Karanasos, Marcel Kutsch, Fatma Ozcan, Chunyang Xia
  • Patent number: 9146983
    Abstract: A method for creating a semantically aggregated index in an indexer-agnostic index building system includes: extracting documents from a data source, each document including a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.
    Type: Grant
    Filed: August 24, 2012
    Date of Patent: September 29, 2015
    Assignee: International Business Machines Corporation
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Asim V. Singh, Kevin B. Wang
  • Patent number: 9104749
    Abstract: A computer program product for an indexer-agnostic index building system includes a computer readable storage medium to store a computer readable program, wherein the computer readable program, when executed on a computer, causes the computer to perform operations for creating a semantically aggregated index. The operations include: extracting documents from a data source, wherein each document includes a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.
    Type: Grant
    Filed: January 12, 2011
    Date of Patent: August 11, 2015
    Assignee: International Business Machines Corporation
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Asim V. Singh, Kevin B. Wang
  • Publication number: 20150186493
    Abstract: Stratified sampling of a plurality of records is performed. A plurality of records are partitioned into a plurality of splits, wherein each split includes at least a portion of the plurality of records. At least one split of the plurality of splits is provided to a mapper. The mapper assigns at least a portion of the records of the at least one split to a group based on a stratum of the assigned records, and filters the records of the group based on a comparison of the weights of the records to a local threshold of the mapper. The mapper updates the local threshold of the mapper by communicating with a coordinator. The mapper shuffles the group to a reducer, where the reducer filters the records of the group based on the weights of the records. The reducer provides a stratified sampling of the plurality of records based on the group.
    Type: Application
    Filed: December 27, 2013
    Publication date: July 2, 2015
    Applicant: International Business Machines Corporation
    Inventors: Andrey Balmin, Vuk Ercegovac, Peter J. Haas, Liping Peng, John Sismanis
  • Patent number: 8954967
    Abstract: Described herein are methods, systems, apparatuses and products for adaptive parallel data processing. An aspect provides providing a map phase in which at least one map function is applied in parallel on different partitions of input data at different mappers in a parallel data processing system; providing a communication channel between mappers using a distributed meta-data store, wherein said map phase comprises mapper data processing adapted responsive to communication with said distributed meta-data store; and providing data accessible by at least one reduce phase node in which at least one reduce function is applied. Other embodiments are disclosed.
    Type: Grant
    Filed: May 31, 2011
    Date of Patent: February 10, 2015
    Assignee: International Business Machines Corporation
    Inventors: Andrey Balmin, Kevin Scott Beyer, Vuk Ercegovac, Rares Vernica
  • Publication number: 20140281746
    Abstract: Embodiments relate to a method and computer program product for error handling. The method includes performing at least one query operation. The processing of the query operation also includes generating error information data based at least on an error encountered during performance of the query operation and generating a data result relating to any portion of the query operation successfully completed. The data result is processed together with the error information data based on encountering any errors. The data result and error information are provided together in a package but separated by an indicator to distinguish between them.
    Type: Application
    Filed: March 15, 2013
    Publication date: September 18, 2014
    Inventors: Vuk Ercegovac, Carl-Christian Kanne
  • Publication number: 20140281748
    Abstract: An aspect of error handling includes a parsing block for pre-processing a document indexing application, a filtering block for discarding irrelevant documents, a transformation block to clean up and annotate input data by identifying a document language, and a processor configured for grouping inputs to collect documents for a same entity in a single spot. The processor processes a query operation. An aspect of error handling also includes a data package including a data result component that includes data generated based on successful completion of at least a portion of the query operation. The data package also includes an error information data component based on one or more errors encountered during processing of the query operation. An indicator separates the error information data from the data result. The system also includes a memory associated with a distributed file system for storing a final write output relating to the query operation.
    Type: Application
    Filed: September 13, 2013
    Publication date: September 18, 2014
    Applicant: International Business Machines Corporation
    Inventors: Vuk Ercegovac, Carl-Christian Kanne
  • Publication number: 20120323920
    Abstract: A method for creating a semantically aggregated index in an indexer-agnostic index building system includes: extracting documents from a data source, each document including a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.
    Type: Application
    Filed: August 24, 2012
    Publication date: December 20, 2012
    Applicant: International Business Machines Corporation
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Asim V. Singh, Kevin B. Wang
  • Publication number: 20120323551
    Abstract: Systems and associated methods for highly parallel processing of parameterized simulations are described. Embodiments permit processing of stochastic data-intensive simulations in a highly parallel fashion in order to distribute the intensive workload. Embodiments utilize methods of seeding records in a database with a source of pseudo-random numbers, such as a compressed seed for a pseudo-random number generator, such that seeded records may be processed independently in a highly parallel fashion. Thus, embodiments provide systems and associated methods facilitating quicker data-intensive simulation by enabling highly parallel asynchronous simulations.
    Type: Application
    Filed: August 27, 2012
    Publication date: December 20, 2012
    Applicant: International Business Machines Corporation
    Inventors: Kevin S. Beyer, Vuk Ercegovac, Peter Haas, Eugene J. Shekita, Fei Xu
  • Publication number: 20120323919
    Abstract: Embodiments of the invention relate to building a distributed reverse semantic index. In one general embodiment a plurality of documents are received with each document having at least one defined rule and/or semantic. The documents are distributed among a plurality of nodes of a system. The documents are processed in a generally parallel fashion. Processing the documents includes processing text data of each of the documents and breaking each document into fields to index the text data to create index data by deferring how to categorize the text data based upon the defined rule and/or semantics. The indexed data is combined back together to create an indexer-agnostic semantic index including a plurality of the semantic index shards and to semantically classify the documents based on the index shards into groups based on document type to create the distributed reverse semantic index.
    Type: Application
    Filed: August 27, 2012
    Publication date: December 20, 2012
    Applicant: International Business Machines Corporation
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Eugene J. Shekita, Asim V. Singh, Yuanyuan Tian, Kevin B. Wang
  • Publication number: 20120311581
    Abstract: Described herein are methods, systems, apparatuses and products for adaptive parallel data processing. An aspect provides providing a map phase in which at least one map function is applied in parallel on different partitions of input data at different mappers in a parallel data processing system; providing a communication channel between mappers using a distributed meta-data store, wherein said map phase comprises mapper data processing adapted responsive to communication with said distributed meta-data store; and providing data accessible by at least one reduce phase node in which at least one reduce function is applied. Other embodiments are disclosed.
    Type: Application
    Filed: May 31, 2011
    Publication date: December 6, 2012
    Applicant: International Business Machines Corporation
    Inventors: Andrey Balmin, Kevin Scott Beyer, Vuk Ercegovac, Rares Vernica
  • Publication number: 20120254089
    Abstract: Embodiments of the invention relate to building a distributed reverse semantic index. In one general embodiment a plurality of documents are received with each document having at least one defined rule and/or semantic. The documents are distributed among a plurality of nodes of a system. The documents are processed in a generally parallel fashion. Processing the documents includes processing text data of each of the documents and breaking each document into fields to index the text data to create index data by deferring how to categorize the text data based upon the defined rule and/or semantics. The indexed data is combined back together to create an indexer-agnostic semantic index including a plurality of the semantic index shards and to semantically classify the documents based on the index shards into groups based on document type to create the distributed reverse semantic index.
    Type: Application
    Filed: March 31, 2011
    Publication date: October 4, 2012
    Applicant: International Business Machines Corporation
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Eugene J. Shekita, Asim V. Singh, Yuanyuan Tian, Kevin B. Wang
  • Publication number: 20120179684
    Abstract: A computer program product for an indexer-agnostic index building system includes a computer readable storage medium to store a computer readable program, wherein the computer readable program, when executed on a computer, causes the computer to perform operations for creating a semantically aggregated index. The operations include: extracting documents from a data source, wherein each document includes a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.
    Type: Application
    Filed: January 12, 2011
    Publication date: July 12, 2012
    Applicant: International Business Machines Corporation
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Asim V. Singh, Kevin B. Wang
  • Publication number: 20110320184
    Abstract: Systems and associated methods for highly parallel processing of parameterized simulations are described. Embodiments permit processing of stochastic data-intensive simulations in a highly parallel fashion in order to distribute the intensive workload. Embodiments utilize methods of seeding records in a database with a source of pseudo-random numbers, such as a compressed seed for a pseudo-random number generator, such that seeded records may be processed independently in a highly parallel fashion. Thus, embodiments provide systems and associated methods facilitating quicker data-intensive simulation by enabling highly parallel asynchronous simulations.
    Type: Application
    Filed: June 29, 2010
    Publication date: December 29, 2011
    Applicant: International Business Machines Corporation
    Inventors: Kevin S. Beyer, Vuk Ercegovac, Peter Haas, Eugene J. Shekita, Fei Xu
  • Publication number: 20090228528
    Abstract: A system, method, and computer program product for updating a partitioned index of a dataset. A document is indexed by separating it into indexable sections, such that different ones of the indexable sections may be contained in different partitions of the partitioned index. The partitioned index is updated using an updated version of the document by updating only those sections of the index corresponding to sections of the document that have been updated in the updated version.
    Type: Application
    Filed: March 6, 2008
    Publication date: September 10, 2009
    Applicant: International Business Machines Corporation
    Inventors: Vuk Ercegovac, Vanja Josifovski, Ning Li, Mauricio Mediano, Eugene J. Shekita
  • Patent number: 6681222
    Abstract: A unified database/text retrieval system converts exact database type queries into text inclusion type queries suitable for text retrieval systems through the use of pseudo keywords. Boolean combination of the text inclusion type query elements may be readily manipulated for optimization and applied to a unified index for rapid search results. Absolute relevance values and relevance multiplier values may be added to the query elements to provide a relevance-based sorting not only of text but also of exact match type search results. Relevance values may be deduced automatically from a variety of sources.
    Type: Grant
    Filed: July 16, 2001
    Date of Patent: January 20, 2004
    Assignee: Quip Incorporated
    Inventors: Navin Kabra, Raghu Ramakrishnan, Uri Shaft, Vuk Ercegovac
  • Publication number: 20030014396
    Abstract: A unified database/text retrieval system converts exact database type queries into text inclusion type queries suitable for text retrieval systems through the use of pseudo keywords. Boolean combination of the text inclusion type query elements may be readily manipulated for optimization and applied to a unified index for rapid search results. Absolute relevance values and relevance multiplier values may be added to the query elements to provide a relevance-based sorting not only of text but also of exact match type search results. Relevance values may be deduced automatically from a variety of sources.
    Type: Application
    Filed: July 16, 2001
    Publication date: January 16, 2003
    Inventors: Navin Kabra, Raghu Ramakrishnan, Uri Shaft, Vuk Ercegovac
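
Illustrative sketch for the unified database/text retrieval abstract above (patent 6681222 and publication 20030014396). The abstract describes converting exact database-type queries into text inclusion queries through pseudo keywords, applying them to a unified index, and attaching relevance values to query elements. The Python below is only a rough approximation of that idea and is not taken from the patent: the token format `__field=value`, the function names `pseudo_keyword`, `build_index`, and `search`, and the sample listings are all assumptions made for illustration.

```python
from collections import defaultdict


def pseudo_keyword(field, value):
    """Encode an exact field/value match as one token (hypothetical format)."""
    return f"__{field}={str(value).lower()}"


def build_index(docs):
    """Build a single inverted index over text words and pseudo keywords."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for word in doc["text"].lower().split():
            index[word].add(doc_id)
        for field, value in doc["fields"].items():
            index[pseudo_keyword(field, value)].add(doc_id)
    return index


def search(index, required, optional=None):
    """Return ids of documents containing every required term.

    `optional` maps a term to an assumed relevance value; results are
    ordered by the summed relevance of the optional terms they contain,
    with ties broken by document id.
    """
    optional = optional or {}
    postings = [index.get(term, set()) for term in required]
    hits = set.intersection(*postings) if postings else set()
    scores = {
        doc_id: sum(
            weight
            for term, weight in optional.items()
            if doc_id in index.get(term, set())
        )
        for doc_id in hits
    }
    return sorted(hits, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up sample data: free text plus structured fields per document.
docs = {
    1: {"text": "Used mountain bike", "fields": {"category": "bikes", "city": "Madison"}},
    2: {"text": "Mountain bike helmet", "fields": {"category": "accessories", "city": "Madison"}},
    3: {"text": "Road bike like new", "fields": {"category": "bikes", "city": "Chicago"}},
}
index = build_index(docs)

# Exact query "category = 'bikes'" becomes a text-inclusion term, combined
# with the ordinary keyword "bike"; the optional terms only affect ordering.
ranked = search(
    index,
    required=[pseudo_keyword("category", "bikes"), "bike"],
    optional={pseudo_keyword("city", "Madison"): 2.0, "mountain": 1.0},
)
print(ranked)
```

In this sketch the exact-match condition and the free-text condition are answered by the same set intersection over one index, which is the point of treating field values as keywords. How the patented system actually encodes pseudo keywords, combines Boolean elements, or derives relevance values is not specified in the abstract.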