Patents by Inventor Alexandre Evfimievski

Alexandre Evfimievski has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Extracting structure and semantics from tabular data

Patent number: 11650970

Abstract: Methods, systems, and computer program products for extracting structure and semantics from tabular data are provided herein. A computer-implemented method includes processing tabular data comprising data cells and header cells, wherein the processing includes: identifying one or more regions within the tabular data, wherein each of the regions comprises one or more of the data cells; matching some of the regions to one or more of the header cells, wherein the matched header cells are semantically related to the data cells inside the matched region; and generating, based on the matching, an output describing semantic relationships between the data cells and the header cells. The method also includes creating, for each data cell, a tuple comprising semantic information contained within one or more of the header cells that pertains to the data cell.

Type: Grant

Filed: March 9, 2018

Date of Patent: May 16, 2023

Assignee: International Business Machines Corporation

Inventors: Xilun Chen, Laura Chiticariu, Alexandre Evfimievski, Marina Danilevsky Hailpern, Prithviraj Sen
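
As a rough illustration of the per-cell tuple described in the abstract above, the following Python sketch assumes the simplest possible case: one rectangular region whose header cells have already been matched, with one row header per row and one column header per column. The function name `cell_tuples` and the sample data are hypothetical and are not taken from the patent; region identification and header matching are not shown.

```python
from typing import Dict, List, Tuple


def cell_tuples(
    column_headers: List[str],
    row_headers: List[str],
    data: List[List[str]],
) -> Dict[Tuple[int, int], Tuple[str, str, str]]:
    """Map each data cell position (row, col) to (row header, column header, value)."""
    tuples: Dict[Tuple[int, int], Tuple[str, str, str]] = {}
    for r, row in enumerate(data):
        for c, value in enumerate(row):
            # Assumption: the header at the same index is the one that pertains to the cell.
            tuples[(r, c)] = (row_headers[r], column_headers[c], value)
    return tuples


# Hypothetical sample table with two header rows/columns' worth of labels.
table = cell_tuples(
    column_headers=["2017", "2018"],
    row_headers=["Revenue", "Cost"],
    data=[["10", "12"], ["7", "8"]],
)
```

Each entry of `table` then carries the header text that applies to one data cell, which is the kind of output the abstract calls a tuple of semantic information.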

USING META-LEARNING TO OPTIMIZE AUTOMATIC SELECTION OF MACHINE LEARNING PIPELINES

Publication number: 20220051049

Abstract: A computer automatically selects a machine learning model pipeline using a meta-learning machine learning model. The computer receives ground truth data and pipeline preference metadata. The computer determines a group of pipelines appropriate for the ground truth data, and each of the pipelines includes an algorithm. The pipelines may include data preprocessing routines. The computer generates hyperparameter sets for the pipelines. The computer applies preprocessing routines to ground truth data to generate a group of preprocessed sets of said ground truth data and ranks hyperparameter set performance for each pipeline to establish a preferred set of hyperparameters for each pipeline. The computer selects favored data features and applies each of the pipelines, with associated sets of preferred hyperparameters, to score the favored data features of the preprocessed ground truth data. The computer ranks pipeline performance and selects a candidate pipeline according to the ranking.

Type: Application

Filed: August 11, 2020

Publication date: February 17, 2022

Inventors: Dakuo Wang, Chuang Gan, Gregory Bramble, Lisa Amini, Horst Cornelius Samulowitz, Kiran A. Kate, Bei Chen, Martin Wistuba, Alexandre Evfimievski, Ioannis Katsis, Yunyao Li, Adelmo Cristiano Innocenza Malossi, Andrea Bartezzaghi, Ban Kawas, Sairam Gurajada, Lucian Popa, Tejaswini Pedapati, Alexander Gray
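
The two ranking steps in the abstract above (first hyperparameter sets within each pipeline, then the pipelines themselves) can be sketched in Python as follows. This is a simplified illustration only: the meta-learning model, the preprocessing routines, and the feature selection are left out, and `select_pipeline`, `toy_score`, the pipeline names, and the `strength` parameter are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]
Scorer = Callable[[str, Params], float]


def select_pipeline(
    candidates: Dict[str, List[Params]], score: Scorer
) -> Tuple[str, Params, List[str]]:
    """Pick preferred hyperparameters per pipeline, then rank the pipelines."""
    # Step 1: within each pipeline, keep the best-scoring hyperparameter set.
    preferred = {
        name: max(param_sets, key=lambda p: score(name, p))
        for name, param_sets in candidates.items()
    }
    # Step 2: rank pipelines by the score of their preferred hyperparameters.
    ranking = sorted(
        preferred, key=lambda name: score(name, preferred[name]), reverse=True
    )
    best = ranking[0]
    return best, preferred[best], ranking


def toy_score(name: str, params: Params) -> float:
    """Hypothetical stand-in for evaluating a pipeline on preprocessed data."""
    base = {"tree": 0.80, "linear": 0.75}[name]
    return base - 0.1 * abs(params["strength"] - 1.0)


candidates = {
    "tree": [{"strength": 0.5}, {"strength": 1.0}],
    "linear": [{"strength": 1.0}, {"strength": 2.0}],
}
best_name, best_params, ranking = select_pipeline(candidates, toy_score)
```

In a real system the scoring function would train and validate each pipeline; here it is a fixed formula so that the selection logic can be followed by hand.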

Table recognition in portable document format documents

Patent number: 11200413

Abstract: Methods, systems, and computer program products for table recognition in PDF documents are provided herein. A computer-implemented method includes discretizing one or more contiguous areas of a PDF document; identifying one or more white-space separator lines within the one or more discretized contiguous areas of the PDF document; detecting one or more candidate table regions within the one or more discretized contiguous areas of the PDF document by clustering the one or more white-space separator lines into one or more grids; and outputting at least one of the candidate table regions as a finalized table in accordance with scores assigned to each of the one or more candidate table regions based on (i) border information and (ii) cell structure information.

Type: Grant

Filed: July 31, 2018

Date of Patent: December 14, 2021

Assignee: International Business Machines Corporation

Inventors: Douglas Ronald Burdick, Wei Cheng, Alexandre Evfimievski, Marina Danilevsky Hailpern, Rajasekar Krishnamurthy, Shajith Ikbal Mohamed, Prithviraj Sen, Shivakumar Vaithyanathan
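
To make the idea of white-space separator lines in a discretized area more concrete, the Python sketch below treats a page region as a grid of characters and reports the fully blank rows and columns. It is an assumption-laden simplification: a character grid stands in for the discretized PDF area, the names `blank_runs` and `separator_lines` are hypothetical, and the clustering of separators into grids and the scoring of candidate table regions are not shown.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (start, end) with end exclusive


def blank_runs(occupied: List[bool]) -> List[Run]:
    """Return the maximal runs of unoccupied positions."""
    runs: List[Run] = []
    start = None
    for i, filled in enumerate(occupied):
        if not filled and start is None:
            start = i
        elif filled and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(occupied)))
    return runs


def separator_lines(grid: List[str]) -> Tuple[List[Run], List[Run]]:
    """Find blank row runs and blank column runs in a character grid."""
    width = max(len(line) for line in grid)
    rows = [any(ch != " " for ch in line) for line in grid]
    cols = [
        any(c < len(line) and line[c] != " " for line in grid)
        for c in range(width)
    ]
    return blank_runs(rows), blank_runs(cols)


# Hypothetical 4-line region, each line 10 characters wide.
page = [
    "Name" + " " * 3 + "Qty",
    " " * 10,
    "Bolt" + " " * 4 + "12",
    "Nut" + " " * 6 + "7",
]
row_separators, column_separators = separator_lines(page)
```

Here the blank second line and the blank band between the two text columns are the separator candidates that a later step could cluster into a grid.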

Table Recognition in Portable Document Format Documents

Publication number: 20200042785

Abstract: Methods, systems, and computer program products for table recognition in PDF documents are provided herein. A computer-implemented method includes discretizing one or more contiguous areas of a PDF document; identifying one or more white-space separator lines within the one or more discretized contiguous areas of the PDF document; detecting one or more candidate table regions within the one or more discretized contiguous areas of the PDF document by clustering the one or more white-space separator lines into one or more grids; and outputting at least one of the candidate table regions as a finalized table in accordance with scores assigned to each of the one or more candidate table regions based on (i) border information and (ii) cell structure information.

Type: Application

Filed: July 31, 2018

Publication date: February 6, 2020

Inventors: Douglas Ronald Burdick, Wei Cheng, Alexandre Evfimievski, Marina Danilevsky Hailpern, Rajasekar Krishnamurthy, Shajith Ikbal Mohamed, Prithviraj Sen, Shivakumar Vaithyanathan

Extracting Structure and Semantics from Tabular Data

Publication number: 20190278853

Abstract: Methods, systems, and computer program products for extracting structure and semantics from tabular data are provided herein. A computer-implemented method includes processing tabular data comprising data cells and header cells, wherein the processing includes: identifying one or more regions within the tabular data, wherein each of the regions comprises one or more of the data cells; matching some of the regions to one or more of the header cells, wherein the matched header cells are semantically related to the data cells inside the matched region; and generating, based on the matching, an output describing semantic relationships between the data cells and the header cells. The method also includes creating, for each data cell, a tuple comprising semantic information contained within one or more of the header cells that pertains to the data cell.

Type: Application

Filed: March 9, 2018

Publication date: September 12, 2019

Inventors: Xilun Chen, Laura Chiticariu, Alexandre Evfimievski, Marina Danilevsky Hailpern, Prithviraj Sen

Pipelined approach to fused kernels for optimization of machine learning workloads on graphical processing units

Patent number: 10223762

Abstract: A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing is performed for the identified computation including maintaining partial output vector results in shared memory of the GPU. Hierarchical aggregation for vectors is performed including performing intra-block aggregation for multiple thread blocks of a partial output vector results on GPU global memory.

Type: Grant

Filed: March 16, 2018

Date of Patent: March 5, 2019

Assignee: International Business Machines Corporation

Inventors: Arash Ashari, Matthias Boehm, Keith W. Campbell, Alexandre Evfimievski, John D. Keenleyside, Berthold Reinwald, Shirish Tatikonda
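
The two-level aggregation described in the abstract above can be imitated on the CPU to show the data flow. The Python sketch below is not GPU code and is not the patented method: it assumes, purely for illustration, that the computation is X^T (X w), and it uses a per-block list to stand in for a partial output vector in shared memory and a single result list to stand in for the output vector in global memory. The name `fused_xt_xw` and the `rows_per_block` parameter are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def fused_xt_xw(X: Matrix, w: List[float], rows_per_block: int) -> List[float]:
    """Compute X^T (X w) in one pass over X with two-level aggregation."""
    n_cols = len(w)
    result = [0.0] * n_cols  # stands in for the output vector in global memory
    for start in range(0, len(X), rows_per_block):
        partial = [0.0] * n_cols  # stands in for one block's shared-memory buffer
        for row in X[start:start + rows_per_block]:
            # The row's dot product with w is reused immediately, while the row
            # is still at hand, instead of being written out and read back.
            p = sum(x * wi for x, wi in zip(row, w))
            for j, x in enumerate(row):
                partial[j] += p * x
        # One accumulation into the global result per block, not per row.
        for j in range(n_cols):
            result[j] += partial[j]
    return result


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
w = [1.0, 1.0]
y = fused_xt_xw(X, w, rows_per_block=2)
```

The block size changes how often the global result is updated but not the value computed, which is the point of keeping partial results local for as long as possible.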

PIPELINED APPROACH TO FUSED KERNELS FOR OPTIMIZATION OF MACHINE LEARNING WORKLOADS ON GRAPHICAL PROCESSING UNITS

Publication number: 20180211357

Abstract: A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing is performed for the identified computation including maintaining partial output vector results in shared memory of the GPU. Hierarchical aggregation for vectors is performed including performing intra-block aggregation for multiple thread blocks of a partial output vector results on GPU global memory.

Type: Application

Filed: March 16, 2018

Publication date: July 26, 2018

Inventors: Arash Ashari, Matthias Boehm, Keith W. Campbell, Alexandre Evfimievski, John D. Keenleyside, Berthold Reinwald, Shirish Tatikonda

Pipelined approach to fused kernels for optimization of machine learning workloads on graphical processing units

Patent number: 9972063

Abstract: A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. An optimized fused GPU kernel is employed to exploit temporal locality for inherent data-flow dependencies in the identified computation. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing for the identified computation is performed. GPU kernel launch parameters are estimated following an analytical model that maximizes thread occupancy and minimizes atomic writes to GPU global memory.

Type: Grant

Filed: July 30, 2015

Date of Patent: May 15, 2018

Assignee: International Business Machines Corporation

Inventors: Arash Ashari, Matthias Boehm, Keith W. Campbell, Alexandre Evfimievski, John D. Keenleyside, Berthold Reinwald, Shirish Tatikonda

PIPELINED APPROACH TO FUSED KERNELS FOR OPTIMIZATION OF MACHINE LEARNING WORKLOADS ON GRAPHICAL PROCESSING UNITS

Publication number: 20170032487

Abstract: A method for optimization of machine learning (ML) workloads on a graphics processor unit (GPU). The method includes identifying a computation having a generic pattern commonly observed in ML processes. An optimized fused GPU kernel is employed to exploit temporal locality for inherent data-flow dependencies in the identified computation. Hierarchical aggregation spanning a memory hierarchy of the GPU for processing for the identified computation is performed. GPU kernel launch parameters are estimated following an analytical model that maximizes thread occupancy and minimizes atomic writes to GPU global memory.

Type: Application

Filed: July 30, 2015

Publication date: February 2, 2017

Inventors: Arash Ashari, Matthias Boehm, Keith W. Campbell, Alexandre Evfimievski, John D. Keenleyside, Berthold Reinwald, Shirish Tatikonda

INFORMATION INTEGRATION ACROSS AUTONOMOUS ENTERPRISES

Publication number: 20080065910

Abstract: A system, method, and computer program product for processing a query spanning separate databases while revealing only minimal information beyond a query answer, by executing only specific information-limiting protocols according to query type.

Type: Application

Filed: October 25, 2007

Publication date: March 13, 2008

Applicant: International Business Machines Corporation

Inventors: Rakesh Agrawal, Alexandre Evfimievski, Ramakrishnan Srikant

Mining association rules over privacy preserving data

Publication number: 20050021488

Abstract: The following discloses a method of mining association rules from the databases while maintaining privacy of individual transactions within the databases through randomization. The invention randomly drops true items from transactions within a database and randomly inserts false items into the transactions. The invention mines the database for association rules after the dropping and inserting processes, and estimates the support of association rules in the original dataset based on their support in the randomized dataset. The dropping of the true items and the inserting of the false items is carried out to an extent such that the chance of finding a false itemset is sufficiently high relative to the chance of finding a true itemset in the database.

Type: Application

Filed: July 21, 2003

Publication date: January 27, 2005

Inventors: Rakesh Agrawal, Alexandre Evfimievski, Ramakrishnan Srikant
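
The drop-and-insert randomization and the support estimate mentioned in the abstract above can be sketched in Python for the simplest case, the support of a single item. The sketch assumes that every true item is kept independently with probability `keep_prob` and every absent item is inserted independently with probability `insert_prob`; these parameters, the function names, and the sample data are hypothetical, and itemsets with more than one item need a more involved estimator that is not shown.

```python
import random
from typing import FrozenSet, List, Sequence


def randomize(
    transaction: FrozenSet[str],
    universe: Sequence[str],
    keep_prob: float,
    insert_prob: float,
    rng: random.Random,
) -> FrozenSet[str]:
    """Randomly drop true items and randomly insert false items."""
    out = set()
    for item in universe:
        prob = keep_prob if item in transaction else insert_prob
        if rng.random() < prob:
            out.add(item)
    return frozenset(out)


def estimate_item_support(
    randomized: List[FrozenSet[str]],
    item: str,
    keep_prob: float,
    insert_prob: float,
) -> float:
    """Estimate an item's original support from its randomized support."""
    observed = sum(item in t for t in randomized) / len(randomized)
    # Expected observed support = s * keep_prob + (1 - s) * insert_prob,
    # so the original support s is recovered by inverting that relation.
    return (observed - insert_prob) / (keep_prob - insert_prob)


universe = ["a", "b", "c", "d"]
rng = random.Random(0)
# Hypothetical data: item "a" appears in 30% of 10,000 transactions.
original = [frozenset({"a"}) if i % 10 < 3 else frozenset({"b"}) for i in range(10_000)]
randomized = [randomize(t, universe, 0.8, 0.1, rng) for t in original]
estimate = estimate_item_support(randomized, "a", 0.8, 0.1)
```

Only the randomized transactions and the two probabilities are needed for the estimate, so the original transactions never have to be shared with whoever does the mining.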