Patents by Inventor Devbrat Sharma

Devbrat Sharma has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Tuning a trained data record matching model using customer data and representation learning

Patent number: 12608652

Abstract: A method, system, and computer program product are configured to create a tuned data record matching model by adjusting values of one or more parameters in a data record matching model based on a second training data set labeled at a data record level, wherein the data record matching model is initially trained using a first training data set labeled at an attribute level.

Type: Grant

Filed: March 10, 2023

Date of Patent: April 21, 2026

Assignee: International Business Machines Corporation

Inventors: Abhishek Seth, Devbrat Sharma, Mahendra Singh Kanyal, Soma Shekar Naganna

Dynamic threshold-based records linking

Patent number: 12547663

Abstract: Records linking is provided. Two records are selected from a plurality of records corresponding to a customer for pair-wise record comparison. It is determined whether the two records are included in different entities. A local auto-link-threshold value of the different entities is identified in response to determining that the two records are included in different entities. An attribute comparison is performed between the two records. A comparison score is generated based on the attribute comparison between the two records. It is determined whether the comparison score is greater than the local auto-link-threshold value of the different entities. The two records are linked in response to determining that the comparison score is greater than the local auto-link-threshold value of the different entities.

Type: Grant

Filed: June 24, 2022

Date of Patent: February 10, 2026

Assignee: International Business Machines Corporation

Inventors: Abhishek Seth, Soma Shekar Naganna, Devbrat Sharma, Mahendra Singh Kanyal
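
The linking flow in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Entity` class, the `auto_link_threshold` field, the exact-match `comparison_score` function, and the choice to apply the stricter of the two entities' thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    entity_id: str
    auto_link_threshold: float  # hypothetical per-entity ("local") threshold


def comparison_score(rec_a: dict, rec_b: dict) -> float:
    """Stand-in attribute comparison: share of common attributes with equal values."""
    shared = rec_a.keys() & rec_b.keys()
    if not shared:
        return 0.0
    return sum(rec_a[k] == rec_b[k] for k in shared) / len(shared)


def should_link(rec_a: dict, rec_b: dict, ent_a: Entity, ent_b: Entity) -> bool:
    """Link two records from different entities if the score beats the local threshold."""
    if ent_a.entity_id == ent_b.entity_id:
        return False  # the records already share an entity; nothing to link
    # Assumption: the stricter of the two entities' thresholds applies.
    local_threshold = max(ent_a.auto_link_threshold, ent_b.auto_link_threshold)
    return comparison_score(rec_a, rec_b) > local_threshold


a = {"name": "Ann Lee", "city": "Pune", "phone": "555-0101"}
b = {"name": "Ann Lee", "city": "Pune", "phone": "555-0199"}
print(should_link(a, b, Entity("E1", 0.6), Entity("E2", 0.5)))  # True: 2/3 > 0.6
print(should_link(a, b, Entity("E1", 0.7), Entity("E2", 0.5)))  # False: 2/3 < 0.7
```

The same pair of records links or stays apart depending only on the thresholds of the entities involved, which is the point of a local threshold rather than a single global one.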

Records processing based on record attribute embeddings

Patent number: 12380135

Abstract: One or more trained embedding generation artificial intelligence models are executed to generate a plurality of record attribute embeddings. The plurality of record attribute embeddings represents a plurality of attributes of data of a plurality of records. Grouping of the plurality of record attribute embeddings is performed. The grouping of a record attribute embedding includes grouping attribute values of the record attribute embedding into one or more groups of attribute values. The performing grouping provides a plurality of groups of attribute values for the plurality of record attribute embeddings. Selected records are compared to provide a set of matched records. The comparing, based on a group of attribute values, includes comparing records that include one or more attribute values grouped in the group of attribute values providing a subset of matched records of the set of matched records. The set of matched records is stored in an accessible computer location.

Type: Grant

Filed: June 19, 2023

Date of Patent: August 5, 2025

Assignee: International Business Machines Corporation

Inventors: Devbrat Sharma, Soma Shekar Naganna, Abhishek Seth, Neeraj Ramkrishna Singh, Muhammed Abdul Majeed Ameen
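
The idea in the abstract above is to group attribute values by their embeddings and then compare only records whose values share a group. A rough sketch follows, under stated assumptions: a character-bigram count vector stands in for the trained embedding model, grouping is a simple greedy pass with an assumed cosine threshold of 0.6, and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def embed(value: str) -> Counter:
    """Toy stand-in for a trained embedding model: character-bigram counts."""
    s = value.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def cosine(u: Counter, v: Counter) -> float:
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def group_values(values, threshold=0.6):
    """Greedy grouping: a value joins the first group whose representative is similar enough."""
    groups = []  # list of (representative_embedding, member_values)
    for value in values:
        emb = embed(value)
        for rep, members in groups:
            if cosine(rep, emb) >= threshold:
                members.append(value)
                break
        else:
            groups.append((emb, [value]))
    return [members for _, members in groups]


def candidate_pairs(records, attribute, threshold=0.6):
    """Pairs of record ids whose values for `attribute` fall in the same group."""
    distinct = sorted({rec[attribute] for rec in records.values()})
    value_to_group = {}
    for gid, members in enumerate(group_values(distinct, threshold)):
        for value in members:
            value_to_group[value] = gid
    by_group = defaultdict(list)
    for rid, rec in records.items():
        by_group[value_to_group[rec[attribute]]].append(rid)
    pairs = set()
    for ids in by_group.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


records = {
    1: {"name": "Jonathan Smith"},
    2: {"name": "Jonathan Smyth"},
    3: {"name": "Maria Garcia"},
}
print(candidate_pairs(records, "name"))  # {(1, 2)}
```

Only the pair that shares a group is passed on to the (more expensive) record comparison, so record 3 is never compared against records 1 or 2.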

Disintegrating an entity of records into smaller entities

Patent number: 12271356

Abstract: Described are techniques for disintegrating an entity into smaller entities. A graph (“first graph”) for the entity of records to be disintegrated is constructed, where each vertex of the first graph represents a record in the entity of records to be disintegrated. The edges in the first graph connecting records in the entity of records represent matching links between the records, where each edge is associated with a weight corresponding to a similarity score. Furthermore, two or more additional graphs representing two or more sub-entities of the entity of records to be disintegrated are constructed. Such graphs are constructed based on selecting edges with a maximum weight out of the edges connected between each pair of records in the first graph or based on the number of connections each record has with other records in the first graph exceeding a threshold value.

Type: Grant

Filed: July 17, 2023

Date of Patent: April 8, 2025

Assignee: International Business Machines Corporation

Inventors: Abhishek Seth, Soma Shekar Naganna, Mahendra Singh Kanyal, Devbrat Sharma
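
One possible reading of the maximum-weight edge selection in the abstract above is sketched below in Python. It is an illustration only: here each record keeps just its strongest link, and the connected components of what remains become the sub-entities. The function name, the edge-list format, and this interpretation are assumptions. The abstract's alternative criterion, based on the number of connections per record, is not shown.

```python
def split_entity(records, edges):
    """Split an entity into sub-entities.

    records: iterable of record ids (vertices of the first graph)
    edges:   list of (record_a, record_b, similarity_weight) matching links
    """
    # For every record, remember its single highest-weight link.
    best = {}  # record -> (weight, neighbour)
    for a, b, weight in edges:
        for u, v in ((a, b), (b, a)):
            if u not in best or weight > best[u][0]:
                best[u] = (weight, v)

    # Union-find over the retained links to get connected components.
    parent = {r: r for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, (_, v) in best.items():
        parent[find(u)] = find(v)

    components = {}
    for r in records:
        components.setdefault(find(r), []).append(r)
    return sorted(sorted(members) for members in components.values())


edges = [("A", "B", 0.95), ("B", "C", 0.40), ("C", "D", 0.90)]
print(split_entity(["A", "B", "C", "D"], edges))  # [['A', 'B'], ['C', 'D']]
```

The weak B-C link (0.40) is nobody's strongest link, so it is dropped and the original entity separates into two sub-entities.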

Disintegrating an entity of records into smaller entities

Publication number: 20250028691

Abstract: Described are techniques for disintegrating an entity into smaller entities. A graph (“first graph”) for the entity of records to be disintegrated is constructed, where each vertex of the first graph represents a record in the entity of records to be disintegrated. The edges in the first graph connecting records in the entity of records represent matching links between the records, where each edge is associated with a weight corresponding to a similarity score. Furthermore, two or more additional graphs representing two or more sub-entities of the entity of records to be disintegrated are constructed. Such graphs are constructed based on selecting edges with a maximum weight out of the edges connected between each pair of records in the first graph or based on the number of connections each record has with other records in the first graph exceeding a threshold value.

Type: Application

Filed: July 17, 2023

Publication date: January 23, 2025

Inventors: Abhishek Seth, Soma Shekar Naganna, Mahendra Singh Kanyal, Devbrat Sharma

Records processing based on record attribute embeddings

Publication number: 20240419690

Abstract: One or more trained embedding generation artificial intelligence models are executed to generate a plurality of record attribute embeddings. The plurality of record attribute embeddings represents a plurality of attributes of data of a plurality of records. Grouping of the plurality of record attribute embeddings is performed. The grouping of a record attribute embedding includes grouping attribute values of the record attribute embedding into one or more groups of attribute values. The performing grouping provides a plurality of groups of attribute values for the plurality of record attribute embeddings. Selected records are compared to provide a set of matched records. The comparing, based on a group of attribute values, includes comparing records that include one or more attribute values grouped in the group of attribute values providing a subset of matched records of the set of matched records. The set of matched records is stored in an accessible computer location.

Type: Application

Filed: June 19, 2023

Publication date: December 19, 2024

Inventors: Devbrat Sharma, Soma Shekar Naganna, Abhishek Seth, Neeraj Ramkrishna Singh, Muhammed Abdul Majeed Ameen

Tuning a trained data record matching model using customer data and representation learning

Publication number: 20240303533

Abstract: A method, system, and computer program product are configured to create a tuned data record matching model by adjusting values of one or more parameters in a data record matching model based on a second training data set labeled at a data record level, wherein the data record matching model is initially trained using a first training data set labeled at an attribute level.

Type: Application

Filed: March 10, 2023

Publication date: September 12, 2024

Inventors: Abhishek Seth, Devbrat Sharma, Mahendra Singh Kanyal, Soma Shekar Naganna

Entity explanation in data management

Patent number: 12045291

Abstract: Records can be matched by a graph neural network model performing entity resolution on the records, and representing each record as a respective node in a graph. Record matching explanations can be generated, each record matching explanation indicating a first set of attributes, and a first set of corresponding values, used for the matching at least two of the records. Nodes can be clustered into a plurality of clusters by aggregating the record matching explanations and, based on the record matching explanations, determining which of the records have high importance values, in the first set of values, that match. At least one cluster explanation can be generated, the cluster explanation indicating a second set of attributes, and a second set of values corresponding to the second set of attributes, used for the clustering the nodes. The record matching explanation and the cluster explanation can be output.

Type: Grant

Filed: November 3, 2022

Date of Patent: July 23, 2024

Assignee: International Business Machines Corporation

Inventors: Muhammed Abdul Majeed Ameen, Balaji Ganesan, Avirup Saha, Abhishek Seth, Devbrat Sharma, Arvind Agarwal, Soma Shekar Naganna, Sameep Mehta

Entity explanation in data management

Publication number: 20240152557

Abstract: Records can be matched by a graph neural network model performing entity resolution on the records, and representing each record as a respective node in a graph. Record matching explanations can be generated, each record matching explanation indicating a first set of attributes, and a first set of corresponding values, used for the matching at least two of the records. Nodes can be clustered into a plurality of clusters by aggregating the record matching explanations and, based on the record matching explanations, determining which of the records have high importance values, in the first set of values, that match. At least one cluster explanation can be generated, the cluster explanation indicating a second set of attributes, and a second set of values corresponding to the second set of attributes, used for the clustering the nodes. The record matching explanation and the cluster explanation can be output.

Type: Application

Filed: November 3, 2022

Publication date: May 9, 2024

Inventors: Muhammed Abdul Majeed Ameen, Balaji Ganesan, Avirup Saha, Abhishek Seth, Devbrat Sharma, Arvind Agarwal, Soma Shekar Naganna, Sameep Mehta

Dynamic threshold-based records linking

Publication number: 20230418877

Abstract: Records linking is provided. Two records are selected from a plurality of records corresponding to a customer for pair-wise record comparison. It is determined whether the two records are included in different entities. A local auto-link-threshold value of the different entities is identified in response to determining that the two records are included in different entities. An attribute comparison is performed between the two records. A comparison score is generated based on the attribute comparison between the two records. It is determined whether the comparison score is greater than the local auto-link-threshold value of the different entities. The two records are linked in response to determining that the comparison score is greater than the local auto-link-threshold value of the different entities.

Type: Application

Filed: June 24, 2022

Publication date: December 28, 2023

Inventors: Abhishek Seth, Soma Shekar Naganna, Devbrat Sharma, Mahendra Singh Kanyal

Searching in multilevel clustered vector-based data

Patent number: 11449704

Abstract: A multilevel clustered data set for multidimensional vectors is created by defining a plurality of clusters based on each of the signed dimensions of the vectors, each dimension functioning as an axis. Vectors are assigned to each cluster by measuring cosine similarity between a vector and each axis. Sub-clusters are defined as ranges of cosine similarity values within a cluster, and each vector is assigned into the appropriate range based on their cosine similarity value with the axis of the cluster. Searching for a matching vector to a new vector is efficiently achieved in near-constant time by measuring cosine similarity for the new vector with each axis to identify the closest cluster, reusing the cosine similarity of the new vector and axis to determine which sub-cluster corresponds to the appropriate range of values, and then comparing each vector within the sub-cluster until a match is found or ruled out.

Type: Grant

Filed: January 16, 2020

Date of Patent: September 20, 2022

Assignee: International Business Machines Corporation

Inventors: Abhishek Seth, Devbrat Sharma, Mahendra Singh Kanyal, Muhammed Abdul Majeed Ameen, Soma Shekar Naganna
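
The two-level lookup in the abstract above can be sketched in Python as follows. This is a simplified illustration, not the patented method. The cosine similarity between a vector and a signed axis is its component along that axis divided by its norm, so the closest signed axis is the one with the largest absolute component. The bin count `N_BINS`, the equal-width ranges, exact equality as the match test, and all function names are assumptions.

```python
import math
from collections import defaultdict

N_BINS = 10  # assumed number of cosine-similarity ranges (sub-clusters) per cluster


def cluster_key(vec):
    """Return (axis, sign, sub_cluster) for a vector."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("zero vector has no direction")
    # Closest signed axis: the dimension with the largest absolute component.
    axis = max(range(len(vec)), key=lambda i: abs(vec[i]))
    sign = 1 if vec[axis] >= 0 else -1
    cos = abs(vec[axis]) / norm  # cosine similarity with that signed axis
    sub_cluster = min(int(cos * N_BINS), N_BINS - 1)  # reuse cos to pick the range
    return axis, sign, sub_cluster


def build_index(vectors):
    index = defaultdict(list)
    for vec in vectors:
        index[cluster_key(vec)].append(vec)
    return index


def find_match(index, query):
    """Compare the query only against vectors in its own sub-cluster."""
    for candidate in index.get(cluster_key(query), []):
        if candidate == query:
            return candidate
    return None


index = build_index([(1.0, 0.1, 0.0), (0.0, -2.0, 0.5), (0.9, 0.8, 0.0)])
print(cluster_key((0.9, 0.8, 0.0)))          # (0, 1, 7)
print(find_match(index, (0.0, -2.0, 0.5)))   # (0.0, -2.0, 0.5)
print(find_match(index, (0.0, 2.0, 0.5)))    # None
```

Computing the key costs one pass over the query's dimensions, after which only a single sub-cluster is scanned. The exact-equality test keeps the sketch short. A similarity-based match would also need to consider neighbouring ranges, since near-identical vectors can fall on either side of a range boundary.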

Multilevel clustering of vector-based data

Publication number: 20210224583

Abstract: A multilevel clustered data set for multidimensional vectors is created by defining a plurality of clusters based on each of the signed dimensions of the vectors, each dimension functioning as an axis. Vectors are assigned to each cluster by measuring cosine similarity between a vector and each axis. Sub-clusters are defined as ranges of cosine similarity values within a cluster, and each vector is assigned into the appropriate range based on their cosine similarity value with the axis of the cluster. Searching for a matching vector to a new vector is efficiently achieved in near-constant time by measuring cosine similarity for the new vector with each axis to identify the closest cluster, reusing the cosine similarity of the new vector and axis to determine which sub-cluster corresponds to the appropriate range of values, and then comparing each vector within the sub-cluster until a match is found or ruled out.

Type: Application

Filed: January 16, 2020

Publication date: July 22, 2021

Inventors: Abhishek Seth, Devbrat Sharma, Mahendra Singh Kanyal, Muhammed Abdul Majeed Ameen, Soma Shekar Naganna