Patents by Inventor Namit Kabra

Namit Kabra has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

AUTOMATED DATA DUPLICATE IDENTIFICATION

Publication number: 20160162507

Abstract: In an approach to identifying duplicates in data, one or more computer processors receive a request from a user to identify duplicates in a data set. The one or more computer processors retrieve the data set utilizing data discovery. The one or more computer processors perform data profiling on the data set. The one or more computer processors determine one or more domain types of the data set, based, at least in part, on the performed data profiling. The one or more computer processors perform data standardization on the data set, based, at least in part, on the one or more determined domain types. Responsive to performing data standardization, the one or more computer processors perform probabilistic matching on the data set. The one or more computer processors identify two or more duplicates in the data set, based, at least in part, on the probabilistic matching.

Type: Application

Filed: December 5, 2014

Publication date: June 9, 2016

Inventors: Ritesh K. Gupta, Namit Kabra, Manish Kumar, Srinivas K. Mittapalli
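
As a rough illustration of the flow in the abstract for publication 20160162507 (profiling, domain-type determination, standardization, then probabilistic matching), the following is a minimal Python sketch. It is not the claimed implementation. The abstract does not name a matching model, so the sketch assumes a Fellegi-Sunter-style log-likelihood weight; the function names, the m/u probabilities in DOMAIN_PROBS, the threshold, and the sample records are all hypothetical. The data discovery step is omitted and the records are supplied in memory.

```python
import math
import re
from itertools import combinations

# Hypothetical m/u probabilities per domain type (illustrative values only):
# m = P(values agree | same entity), u = P(values agree | different entities).
DOMAIN_PROBS = {"name": (0.95, 0.05), "email": (0.90, 0.01)}


def infer_domain(values):
    """Crude profiling step: guess a column's domain type from its values."""
    if all("@" in v for v in values):
        return "email"
    return "name"


def standardize(value, domain):
    """Normalize a value according to its inferred domain type."""
    value = value.strip().lower()
    if domain == "name":
        value = re.sub(r"[^a-z ]", "", value)  # drop punctuation and digits
        value = " ".join(value.split())        # collapse repeated whitespace
    return value


def match_weight(rec_a, rec_b, domains):
    """Sum log2 agreement/disagreement weights over all columns."""
    total = 0.0
    for column, domain in domains.items():
        m, u = DOMAIN_PROBS[domain]
        if rec_a[column] == rec_b[column]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def find_duplicates(records, threshold=3.0):
    """Return index pairs whose match weight meets the (assumed) threshold."""
    columns = list(records[0].keys())
    domains = {c: infer_domain([r[c] for r in records]) for c in columns}
    clean = [{c: standardize(r[c], domains[c]) for c in columns} for r in records]
    return [
        (i, j)
        for i, j in combinations(range(len(clean)), 2)
        if match_weight(clean[i], clean[j], domains) >= threshold
    ]


records = [
    {"name": "John  Smith", "email": "JSmith@example.com"},
    {"name": "john smith.", "email": "jsmith@example.com "},
    {"name": "Jane Doe", "email": "jane@example.com"},
]
duplicates = find_duplicates(records)
print(duplicates)
```

In this sketch the first two records differ only in case, spacing, and punctuation, so they agree on both columns after standardization and are the only pair reported.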

DATA DE-DUPLICATION

Publication number: 20160092479

Abstract: A method, executed by a computer, for de-duplicating data includes receiving a dataset, pivoting the dataset along a set of columns that have a common domain to provide a pivoted dataset, de-duplicating the pivoted dataset to provide a de-duplicated dataset, and using the de-duplicated dataset. De-duplicating the pivoted dataset may include computing similarity scores for records that have different primary keys and merging records that have a similarity score that exceeds a selected threshold value. The method may include determining the set of columns having a common domain by referencing a business catalog and/or conducting a data classification operation on some or all of the columns of the dataset. The method may also include pivoting the dataset along another set of columns that have a different common domain. A computer system and computer program product corresponding to the method are also disclosed herein.

Type: Application

Filed: May 20, 2015

Publication date: March 31, 2016

Inventors: Namit Kabra, Yannick Saillet
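
The pivot-then-de-duplicate idea in the abstract for publication 20160092479 can be sketched in Python as follows. This is only an illustration under stated assumptions: the column names, the sample rows, the 0.9 threshold, and the use of `difflib.SequenceMatcher` as the similarity score are all hypothetical choices, and "merging" is represented simply as grouping primary keys rather than combining record contents.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pivot(rows, key, columns):
    """Turn each non-empty value in `columns` into its own (key, value) record."""
    return [(row[key], row[col]) for row in rows for col in columns if row[col]]


def merge_duplicates(pivoted, threshold=0.9):
    """Group primary keys whose pivoted values score above the threshold."""
    parent = {pk: pk for pk, _ in pivoted}

    def find(pk):
        # Follow parent links to the representative key of the group.
        while parent[pk] != pk:
            pk = parent[pk]
        return pk

    for (pk_a, val_a), (pk_b, val_b) in combinations(pivoted, 2):
        if pk_a == pk_b:
            continue  # only compare records with different primary keys
        score = SequenceMatcher(None, val_a, val_b).ratio()
        if score > threshold:
            parent[find(pk_b)] = find(pk_a)

    groups = {}
    for pk in parent:
        groups.setdefault(find(pk), set()).add(pk)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical dataset: both phone columns share a common "phone" domain.
rows = [
    {"id": 1, "home_phone": "555-0100", "work_phone": "555-0199"},
    {"id": 2, "home_phone": "555-0199", "work_phone": ""},
    {"id": 3, "home_phone": "555-7777", "work_phone": ""},
]
pivoted = pivot(rows, "id", ["home_phone", "work_phone"])
merged = merge_duplicates(pivoted)
print(merged)
```

Here records 1 and 2 share a number that sits in different columns (work_phone versus home_phone), so a column-by-column comparison would miss it; after pivoting, the two values are compared directly and the keys are grouped together.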

DATA DE-DUPLICATION

Publication number: 20160092494

Abstract: A method, executed by a computer, for de-duplicating data includes receiving a dataset, pivoting the dataset along a set of columns that have a common domain to provide a pivoted dataset, de-duplicating the pivoted dataset to provide a de-duplicated dataset, and using the de-duplicated dataset. De-duplicating the pivoted dataset may include computing similarity scores for records that have different primary keys and merging records that have a similarity score that exceeds a selected threshold value. The method may include determining the set of columns having a common domain by referencing a business catalog and/or conducting a data classification operation on some or all of the columns of the dataset. The method may also include pivoting the dataset along another set of columns that have a different common domain. A computer system and computer program product corresponding to the method are also disclosed herein.

Type: Application

Filed: September 30, 2014

Publication date: March 31, 2016

Inventors: Namit Kabra, Yannick Saillet
