Patents by Inventor Tanveer A Faruquie

Tanveer A Faruquie has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Computer-based systems configured for entity resolution and indexing of entity activity

Patent number: 11966859

Abstract: In order to facilitate the entity resolution and entity activity tracking and indexing, systems and methods include receiving first source records from a first database and second source records from a record database. A candidate set of second source records is determined by a heuristic search in the set of second source records. A candidate pair feature vector associated with each candidate pair of first and second source records is generated. An entity matching machine learning model predicts matching first source records for each candidate second source record based on the respective candidate pair feature vector. An aggregate quantity associated with the matching first source records is aggregated from a quantity associated with each first source record, and a quantity index for each candidate second source record is determined based on the aggregate quantities. Each quantity index is displayed to a user.

Type: Grant

Filed: April 28, 2023

Date of Patent: April 23, 2024

Assignee: Capital One Services, LLC

Inventors: Tanveer Faruquie, Aman Jain, Jihan Wei, Amir Reza Rahmani, Christopher Johnson
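
The pipeline outlined in the abstract above (heuristic candidate search, a feature vector per candidate pair, a match prediction, then aggregation into a quantity index) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the record fields (`name`, `city`, `amount`), the shared-token heuristic, the two features, and the fixed linear score that stands in for a trained entity matching model.

```python
from collections import defaultdict


def tokens(name):
    """Lowercased word set used by both the heuristic search and the features."""
    return set(name.lower().split())


def candidate_pairs(first_records, second_records):
    """Heuristic search (assumed): pair records sharing at least one name token."""
    for f in first_records:
        for s in second_records:
            if tokens(f["name"]) & tokens(s["name"]):
                yield f, s


def pair_features(f, s):
    """Candidate pair feature vector (assumed): Jaccard name overlap, same-city flag."""
    a, b = tokens(f["name"]), tokens(s["name"])
    jaccard = len(a & b) / len(a | b)
    same_city = 1.0 if f["city"] == s["city"] else 0.0
    return [jaccard, same_city]


def predict_match(features, threshold=0.5):
    """Stand-in for a trained model: a fixed linear score with a cutoff."""
    jaccard, same_city = features
    return 0.7 * jaccard + 0.3 * same_city >= threshold


def quantity_index(first_records, second_records):
    """Sum the quantity of each matched first record per second record id."""
    totals = defaultdict(float)
    for f, s in candidate_pairs(first_records, second_records):
        if predict_match(pair_features(f, s)):
            totals[s["id"]] += f["amount"]
    return dict(totals)


# Hypothetical sample data.
first = [
    {"name": "Acme Coffee", "city": "Austin", "amount": 12.5},
    {"name": "ACME coffee shop", "city": "Austin", "amount": 7.5},
    {"name": "Blue Bottle", "city": "Oakland", "amount": 4.0},
]
second = [
    {"id": "e1", "name": "Acme Coffee", "city": "Austin"},
    {"id": "e2", "name": "Blue Sky Cafe", "city": "Denver"},
]
index = quantity_index(first, second)
print(index)
```

In this sketch the cheap token filter keeps the number of scored pairs small, and only pairs that pass the match prediction contribute to the aggregate for a second source record.
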
COMPUTER-BASED SYSTEMS CONFIGURED FOR ENTITY RESOLUTION AND INDEXING OF ENTITY ACTIVITY

Publication number: 20230267348

Abstract: In order to facilitate the entity resolution and entity activity tracking and indexing, systems and methods include receiving first source records from a first database and second source records from a record database. A candidate set of second source records is determined by a heuristic search in the set of second source records. A candidate pair feature vector associated with each candidate pair of first and second source records is generated. An entity matching machine learning model predicts matching first source records for each candidate second source record based on the respective candidate pair feature vector. An aggregate quantity associated with the matching first source records is aggregated from a quantity associated with each first source record, and a quantity index for each candidate second source record is determined based on the aggregate quantities. Each quantity index is displayed to a user.

Type: Application

Filed: April 28, 2023

Publication date: August 24, 2023

Inventors: Tanveer Faruquie, Aman Jain, Jihan Wei, Amir Reza Rahmani, Christopher Johnson
Computer-based systems configured for entity resolution and indexing of entity activity

Patent number: 11640545

Abstract: In order to facilitate the entity resolution and entity activity tracking and indexing, systems and methods include receiving first source records from a first database and second source records from a record database. A candidate set of second source records is determined by a heuristic search in the set of second source records. A candidate pair feature vector associated with each candidate pair of first and second source records is generated. An entity matching machine learning model predicts matching first source records for each candidate second source record based on the respective candidate pair feature vector. An aggregate quantity associated with the matching first source records is aggregated from a quantity associated with each first source record, and a quantity index for each candidate second source record is determined based on the aggregate quantities. Each quantity index is displayed to a user.

Type: Grant

Filed: November 15, 2021

Date of Patent: May 2, 2023

Assignee: Capital One Services, LLC

Inventors: Tanveer Faruquie, Aman Jain, Jihan Wei, Amir Reza Rahmani, Christopher Johnson
COMPUTER-BASED SYSTEMS CONFIGURED FOR ENTITY RESOLUTION AND INDEXING OF ENTITY ACTIVITY

Publication number: 20220076149

Abstract: In order to facilitate the entity resolution and entity activity tracking and indexing, systems and methods include receiving first source records from a first database and second source records from a record database. A candidate set of second source records is determined by a heuristic search in the set of second source records. A candidate pair feature vector associated with each candidate pair of first and second source records is generated. An entity matching machine learning model predicts matching first source records for each candidate second source record based on the respective candidate pair feature vector. An aggregate quantity associated with the matching first source records is aggregated from a quantity associated with each first source record, and a quantity index for each candidate second source record is determined based on the aggregate quantities. Each quantity index is displayed to a user.

Type: Application

Filed: November 15, 2021

Publication date: March 10, 2022

Inventors: Tanveer Faruquie, Aman Jain, Jihan Wei, Amir Reza Rahmani, Christopher Johnson
Computer-based systems configured for entity resolution and indexing of entity activity

Patent number: 11176468

Abstract: In order to facilitate the entity resolution and entity activity tracking and indexing, systems and methods include receiving first source records from a first database and second source records from a record database. A candidate set of second source records is determined by a heuristic search in the set of second source records. A candidate pair feature vector associated with each candidate pair of first and second source records is generated. An entity matching machine learning model predicts matching first source records for each candidate second source record based on the respective candidate pair feature vector. An aggregate quantity associated with the matching first source records is aggregated from a quantity associated with each first source record, and a quantity index for each candidate second source record is determined based on the aggregate quantities. Each quantity index is displayed to a user.

Type: Grant

Filed: May 29, 2020

Date of Patent: November 16, 2021

Assignee: Capital One Services, LLC

Inventors: Tanveer Faruquie, Aman Jain, Jihan Wei, Amir Reza Rahmani, Christopher Johnson
COMPUTER-BASED SYSTEMS CONFIGURED FOR ENTITY RESOLUTION AND INDEXING OF ENTITY ACTIVITY

Publication number: 20210142191

Abstract: In order to facilitate the entity resolution and entity activity tracking and indexing, systems and methods include receiving first source records from a first database and second source records from a record database. A candidate set of second source records is determined by a heuristic search in the set of second source records. A candidate pair feature vector associated with each candidate pair of first and second source records is generated. An entity matching machine learning model predicts matching first source records for each candidate second source record based on the respective candidate pair feature vector. An aggregate quantity associated with the matching first source records is aggregated from a quantity associated with each first source record, and a quantity index for each candidate second source record is determined based on the aggregate quantities. Each quantity index is displayed to a user.

Type: Application

Filed: May 29, 2020

Publication date: May 13, 2021

Inventors: Tanveer Faruquie, Aman Jain, Jihan Wei, Amir Reza Rahmani, Christopher Johnson
Computer-based systems configured for entity resolution and indexing of entity activity

Patent number: 10713577

Abstract: In order to facilitate the entity resolution and entity activity tracking and indexing, systems and methods include receiving first source records from a first database and second source records from a record database. A candidate set of second source records is determined by a heuristic search in the set of second source records. A candidate pair feature vector associated with each candidate pair of first and second source records is generated. An entity matching machine learning model predicts matching first source records for each candidate second source record based on the respective candidate pair feature vector. An aggregate quantity associated with the matching first source records is aggregated from a quantity associated with each first source record, and a quantity index for each candidate second source record is determined based on the aggregate quantities. Each quantity index is displayed to a user.

Type: Grant

Filed: November 8, 2019

Date of Patent: July 14, 2020

Assignee: Capital One Services, LLC

Inventors: Tanveer Faruquie, Aman Jain, Jihan Wei, Amir Reza Rahmani, Christopher Johnson
Multi-level colocation and processing of spatial data on MapReduce

Patent number: 10339107

Abstract: Methods, systems, and computer program products for multi-level colocation and analytical processing of spatial data on MapReduce are provided herein. A method includes correlating multiple items of spatial data and multiple items of attribute data within a file system to generate multiple blocks of correlated data; colocating each of the multiple blocks of correlated data on a given node within the file system based on a data block placement policy; and clustering multiple replicas generated for each of the multiple data blocks at multiple levels of spatial granularity within the file system.

Type: Grant

Filed: June 8, 2015

Date of Patent: July 2, 2019

Assignee: International Business Machines Corporation

Inventors: Tanveer A. Faruquie, Himanshu Gupta, Sriram Lakshminarasimhan, Sameep Mehta, Stuart A. Siegel
Automatically mining patterns for rule based data standardization systems

Patent number: 10163063

Abstract: Computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.

Type: Grant

Filed: March 7, 2012

Date of Patent: December 25, 2018

Assignee: International Business Machines Corporation

Inventors: Snigdha Chaturvedi, Tanveer A Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
In-querying data cleansing with semantic standardization

Patent number: 10120916

Abstract: The present invention relates to data cleansing, and in particular performing the semantic standardization process within a database before the transform portion of the extract-transform-load (ETL) process. Provided are a method, system and computer program product for standardizing data within a database engine, configuring the standardization function to determine at least one standardized value for at least one data value by applying the standardization table in a context of at least one data value, receiving a database query identifying the standardization function, at least one database value and the context of the data, and invoking the standardization function.

Type: Grant

Filed: June 11, 2012

Date of Patent: November 6, 2018

Assignee: International Business Machines Corporation

Inventors: Tanveer A. Faruquie, Mukesh K. Mohania, L. Venkata Subramaniam, Charles D. Wolfson
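
To make the idea in the abstract above more concrete (a standardization function that lives inside the database engine and is invoked from a query with a value and its context), here is a small Python sketch using the standard library `sqlite3` module. SQLite, the `standardize` function name, the `addr` table, and the contents of the lookup table are all assumptions chosen for illustration; the patent does not name a particular engine or schema.

```python
import sqlite3

# Hypothetical standardization table: (context, raw value) -> standardized value.
STANDARD = {
    ("street", "st"): "Street",
    ("title", "st"): "Saint",
    ("street", "rd"): "Road",
}


def standardize(value, context):
    """Return the standardized form of value in the given context, else value unchanged."""
    key = (context, value.strip().lower().rstrip("."))
    return STANDARD.get(key, value)


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL queries can call it by name with two arguments.
conn.create_function("standardize", 2, standardize)

conn.execute("CREATE TABLE addr (suffix TEXT)")
conn.executemany("INSERT INTO addr VALUES (?)", [("St.",), ("rd",), ("Ave",)])

# The query names the function, the database value, and the context.
rows = [
    row[0]
    for row in conn.execute(
        "SELECT standardize(suffix, 'street') FROM addr ORDER BY rowid"
    )
]
print(rows)
```

Passing the context as a query argument lets the same raw value ("St") resolve differently depending on where it appears, which is the point of standardizing in context rather than with a flat lookup.
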
Automatically mining patterns for rule based data standardization systems

Patent number: 10095780

Abstract: Computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.

Type: Grant

Filed: February 7, 2017

Date of Patent: October 9, 2018

Assignee: International Business Machines Corporation

Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
AUTOMATICALLY MINING PATTERNS FOR RULE BASED DATA STANDARDIZATION SYSTEMS

Publication number: 20170147688

Abstract: Computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.

Type: Application

Filed: February 7, 2017

Publication date: May 25, 2017

Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
Multi-Level Colocation and Processing of Spatial Data on Mapreduce

Publication number: 20160357775

Abstract: Methods, systems, and computer program products for multi-level colocation and analytical processing of spatial data on MapReduce are provided herein. A method includes correlating multiple items of spatial data and multiple items of attribute data within a file system to generate multiple blocks of correlated data; colocating each of the multiple blocks of correlated data on a given node within the file system based on a data block placement policy; and clustering multiple replicas generated for each of the multiple data blocks at multiple levels of spatial granularity within the file system.

Type: Application

Filed: June 8, 2015

Publication date: December 8, 2016

Inventors: Tanveer A. Faruquie, Himanshu Gupta, Sriram Lakshminarasimhan, Sameep Mehta, Stuart A. Siegel
Determining related data points from multi-modal inputs

Patent number: 9396433

Abstract: Techniques, systems, and articles of manufacture for determining related data points from multi-modal inputs. A method includes collecting multiple items of multi-modal data comprising at least one dimension from multiple data sources, wherein said at least one dimension comprises a geographic dimension, a temporal dimension and/or an event-related dimension, determining a window of relevance for each of the multiple items of multi-modal data with respect to the at least one dimension, and identifying two or more of the multiple items of multi-modal data as related, by determining an overlap of the window of relevance corresponding to each of the two or more items of multi-modal data with respect to the at least one dimension.

Type: Grant

Filed: May 29, 2013

Date of Patent: July 19, 2016

Assignee: International Business Machines Corporation

Inventors: L. Venkata Subramaniam, Sameep Mehta, Raghuram Krishnapuram, Tanveer A. Faruquie
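
The overlap test at the heart of the abstract above can be sketched in a few lines of Python. This is an illustrative reading only, restricted to a single temporal dimension: the `Item` type, the numeric hour values, and the choice to treat touching windows as overlapping are assumptions, and how a window of relevance is actually determined is not specified here.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Item:
    source: str   # where the data item came from (hypothetical labels)
    start: float  # start of the window of relevance, in hours
    end: float    # end of the window of relevance, in hours


def overlaps(a, b):
    """Two closed intervals overlap when each starts before the other ends."""
    return a.start <= b.end and b.start <= a.end


def related_pairs(items):
    """Return the source names of every pair whose windows of relevance overlap."""
    return [(a.source, b.source) for a, b in combinations(items, 2) if overlaps(a, b)]


items = [
    Item("tweet", 9.0, 11.0),
    Item("sensor", 10.5, 12.0),
    Item("news", 13.0, 15.0),
]
pairs = related_pairs(items)
print(pairs)
```

A geographic or event-related dimension could be handled the same way by swapping the interval test for a bounding-region or shared-event test.
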
Processing spatial joins using a mapreduce framework

Patent number: 9311380

Abstract: Techniques, systems, and articles of manufacture for processing spatial joins using a MapReduce framework. A method includes partitioning a spatial data domain based on a distribution of spatial data objects across multiple nodes of a cluster of machines, defining at least one operation to be performed on the partitioned spatial data domain based on one or more predicates of a query, and executing the at least one defined operation on the partitioned spatial data domain to determine a response to the query.

Type: Grant

Filed: March 29, 2013

Date of Patent: April 12, 2016

Assignee: International Business Machines Corporation

Inventors: Bhupesh S. Chawda, Himanshu Gupta, Tanveer A Faruquie, L. Venkata Subramaniam
Resources management in distributed computing environment

Patent number: 9213574

Abstract: A method, system and a computer program product for determining resources allocation in a distributed computing environment. An embodiment may include identifying resources in a distributed computing environment, computing provisioning parameters, computing configuration parameters and quantifying service parameters in response to a set of service level agreements (SLA). The embodiment may further include iteratively computing a completion time required for completion of the assigned task and a cost. Embodiments may further include computing an optimal resources configuration and computing at least one of an optimal completion time and an optimal cost corresponding to the optimal resources configuration. Embodiments may further include dynamically modifying the optimal resources configuration in response to at least one change in at least one of provisioning parameters, computing parameters and quantifying service parameters.

Type: Grant

Filed: January 30, 2010

Date of Patent: December 15, 2015

Assignee: International Business Machines Corporation

Inventors: Tanveer A Faruquie, Hima P Karanam, Mukesh K Mohania, L Venkata Subramaniam, Girish Venkatachaliah
Cleansing a database system to improve data quality

Patent number: 9104709

Abstract: According to one embodiment of the present invention, a system controls cleansing of data within a database system, and comprises a computer system including at least one processor. The system receives a data set from the database system, and one or more features of the data set are selected for determining values for one or more characteristics of the selected features. The determined values are applied to a data quality estimation model to determine data quality estimates for the data set. Problematic data within the data set are identified based on the data quality estimates, where the cleansing is adjusted to accommodate the identified problematic data. Embodiments of the present invention further include a method and computer program product for controlling cleansing of data within a database system in substantially the same manner described above.

Type: Grant

Filed: March 16, 2012

Date of Patent: August 11, 2015

Assignee: International Business Machines Corporation

Inventors: Snigdha Chaturvedi, Tanveer A Faruquie, Hima P Karanam, Mukesh K Mohania, L Venkata Subramaniam
Automatically mining patterns for rule based data standardization systems

Patent number: 8996524

Abstract: Methods, computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.

Type: Grant

Filed: March 8, 2012

Date of Patent: March 31, 2015

Assignee: International Business Machines Corporation

Inventors: Snigdha Chaturvedi, Tanveer A Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
Determining Related Data Points from Multi-Modal Inputs

Publication number: 20140358843

Abstract: Techniques, systems, and articles of manufacture for determining related data points from multi-modal inputs. A method includes collecting multiple items of multi-modal data comprising at least one dimension from multiple data sources, wherein said at least one dimension comprises a geographic dimension, a temporal dimension and/or an event-related dimension, determining a window of relevance for each of the multiple items of multi-modal data with respect to the at least one dimension, and identifying two or more of the multiple items of multi-modal data as related, by determining an overlap of the window of relevance corresponding to each of the two or more items of multi-modal data with respect to the at least one dimension.

Type: Application

Filed: May 29, 2013

Publication date: December 4, 2014

Inventors: L. Venkata Subramaniam, Sameep Mehta, Raghuram Krishnapuram, Tanveer A. Faruquie
Rule set management

Patent number: 8700542

Abstract: Systems, methods, and computer products for optimally managing large rule sets are disclosed. Rule dependencies of rules within a set of rules may be determined as a function of rules execution frequency data generated from applying the rules over a data set. The rules within the set of rules may be clustered into rules clusters based on the determined rule dependencies, in which the rules clusters comprise disjoint subsets of the rules within the set of rules. Cluster frequency data for the rules clusters may be used to arrive at an optimal ordering. Each rule within the set of rules may be assigned a unique identification that may capture an execution order of the rules within the set of rules.

Type: Grant

Filed: December 15, 2010

Date of Patent: April 15, 2014

Assignee: International Business Machines Corporation

Inventors: Mohan N. Dani, Tanveer A. Faruquie, Hima P. Karanam, L. Venkata Subramaniam, Girish Venkatachaliah
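
One possible reading of the rule set management abstract above is sketched below in Python. The specifics are assumptions for illustration, not the patented method: dependencies are approximated by how often two rules fire on the same record (`min_cofire` is a hypothetical threshold), clusters are the connected components of that dependency graph, clusters are ordered by total firing frequency, and each rule's identifier is a `(cluster rank, rank within cluster)` pair.

```python
from collections import Counter
from itertools import combinations


def cluster_rules(executions, min_cofire=2):
    """Cluster rules by co-firing counts and assign order-capturing ids."""
    freq = Counter()    # how often each rule fired
    cofire = Counter()  # how often each pair of rules fired on the same record
    for fired in executions:
        freq.update(fired)
        cofire.update(combinations(sorted(fired), 2))

    # Union-find over rules: frequently co-firing rules are treated as dependent.
    parent = {rule: rule for rule in freq}

    def find(rule):
        while parent[rule] != rule:
            parent[rule] = parent[parent[rule]]  # path halving
            rule = parent[rule]
        return rule

    for (a, b), count in cofire.items():
        if count >= min_cofire:
            parent[find(a)] = find(b)

    # Disjoint clusters are the connected components.
    clusters = {}
    for rule in freq:
        clusters.setdefault(find(rule), []).append(rule)

    # Order clusters by total firing frequency, most frequent first.
    ordered = sorted(clusters.values(), key=lambda c: -sum(freq[r] for r in c))

    # Identifier = (cluster rank, rank inside cluster); ties broken by rule name.
    ids = {}
    for cluster_rank, cluster in enumerate(ordered):
        ranked = sorted(cluster, key=lambda r: (-freq[r], r))
        for rule_rank, rule in enumerate(ranked):
            ids[rule] = (cluster_rank, rule_rank)
    return ordered, ids


# Hypothetical execution log: the rules that fired on each record.
executions = [
    ("r1", "r2"),
    ("r1", "r2"),
    ("r3",),
    ("r1", "r2", "r4"),
    ("r3", "r4"),
    ("r3",),
]
ordered, ids = cluster_rules(executions)
print(ids)
```
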
