Patents by Inventor Tanveer A Faruquie
Tanveer A Faruquie has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11966859
Abstract: In order to facilitate the entity resolution and entity activity tracking and indexing, systems and methods include receiving first source records from a first database and second source records from a record database. A candidate set of second source records is determined by a heuristic search in the set of second source records. A candidate pair feature vector associated with each candidate pair of first and second source records is generated. An entity matching machine learning model predicts matching first source records for each candidate second source record based on the respective candidate pair feature vector. An aggregate quantity associated with the matching first source records is aggregated from a quantity associated with each first source record, and a quantity index for each candidate second source record is determined based on the aggregate quantities. Each quantity index is displayed to a user.
Type: Grant
Filed: April 28, 2023
Date of Patent: April 23, 2024
Assignee: Capital One Services, LLC
Inventors: Tanveer Faruquie, Aman Jain, Jihan Wei, Amir Reza Rahmani, Christopher Johnson
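The pipeline this abstract describes (heuristic candidate search, pair feature vectors, a matching model, then aggregation into an index) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the record fields (`name`, `city`, `amount`, `id`), the shared-token heuristic, the fixed weighted threshold standing in for the trained entity matching model, and the definition of the quantity index as each entity's share of the total matched quantity.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def candidates(first, second_records):
    """Heuristic search (assumed): keep second-source records sharing a name token."""
    tokens = set(first["name"].lower().split())
    return [s for s in second_records if tokens & set(s["name"].lower().split())]


def features(first, second):
    """Candidate pair feature vector: [name similarity, same-city flag]."""
    name_sim = SequenceMatcher(
        None, first["name"].lower(), second["name"].lower()
    ).ratio()
    same_city = 1.0 if first["city"] == second["city"] else 0.0
    return [name_sim, same_city]


def is_match(vector, threshold=0.75):
    """Stand-in for a trained matching model: a fixed weighted score (assumed)."""
    name_sim, same_city = vector
    return 0.7 * name_sim + 0.3 * same_city >= threshold


def quantity_index(first_records, second_records):
    """Sum matched quantities per second-source record, then normalize to shares."""
    totals = defaultdict(float)
    for f in first_records:
        for s in candidates(f, second_records):
            if is_match(features(f, s)):
                totals[s["id"]] += f["amount"]
    grand_total = sum(totals.values())
    if not grand_total:
        return {}
    return {sid: total / grand_total for sid, total in totals.items()}


transactions = [
    {"name": "Acme Corp", "city": "NYC", "amount": 100.0},
    {"name": "ACME Corp.", "city": "NYC", "amount": 50.0},
    {"name": "Bolt Ltd", "city": "LA", "amount": 50.0},
]
entities = [
    {"id": "A", "name": "Acme Corp", "city": "NYC"},
    {"id": "B", "name": "Bolt Ltd", "city": "LA"},
]
index = quantity_index(transactions, entities)
```

In a real system the threshold function would be replaced by a model trained on labeled pairs; the surrounding structure (block, featurize, classify, aggregate) would stay the same.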
-
Publication number: 20230267348
Abstract: In order to facilitate the entity resolution and entity activity tracking and indexing, systems and methods include receiving first source records from a first database and second source records from a record database. A candidate set of second source records is determined by a heuristic search in the set of second source records. A candidate pair feature vector associated with each candidate pair of first and second source records is generated. An entity matching machine learning model predicts matching first source records for each candidate second source record based on the respective candidate pair feature vector. An aggregate quantity associated with the matching first source records is aggregated from a quantity associated with each first source record, and a quantity index for each candidate second source record is determined based on the aggregate quantities. Each quantity index is displayed to a user.
Type: Application
Filed: April 28, 2023
Publication date: August 24, 2023
Inventors: Tanveer Faruquie, Aman Jain, Jihan Wei, Amir Reza Rahmani, Christopher Johnson
-
Patent number: 11640545
Abstract: In order to facilitate the entity resolution and entity activity tracking and indexing, systems and methods include receiving first source records from a first database and second source records from a record database. A candidate set of second source records is determined by a heuristic search in the set of second source records. A candidate pair feature vector associated with each candidate pair of first and second source records is generated. An entity matching machine learning model predicts matching first source records for each candidate second source record based on the respective candidate pair feature vector. An aggregate quantity associated with the matching first source records is aggregated from a quantity associated with each first source record, and a quantity index for each candidate second source record is determined based on the aggregate quantities. Each quantity index is displayed to a user.
Type: Grant
Filed: November 15, 2021
Date of Patent: May 2, 2023
Assignee: Capital One Services, LLC
Inventors: Tanveer Faruquie, Aman Jain, Jihan Wei, Amir Reza Rahmani, Christopher Johnson
-
Publication number: 20220076149
Abstract: In order to facilitate the entity resolution and entity activity tracking and indexing, systems and methods include receiving first source records from a first database and second source records from a record database. A candidate set of second source records is determined by a heuristic search in the set of second source records. A candidate pair feature vector associated with each candidate pair of first and second source records is generated. An entity matching machine learning model predicts matching first source records for each candidate second source record based on the respective candidate pair feature vector. An aggregate quantity associated with the matching first source records is aggregated from a quantity associated with each first source record, and a quantity index for each candidate second source record is determined based on the aggregate quantities. Each quantity index is displayed to a user.
Type: Application
Filed: November 15, 2021
Publication date: March 10, 2022
Inventors: Tanveer Faruquie, Aman Jain, Jihan Wei, Amir Reza Rahmani, Christopher Johnson
-
Patent number: 11176468
Abstract: In order to facilitate the entity resolution and entity activity tracking and indexing, systems and methods include receiving first source records from a first database and second source records from a record database. A candidate set of second source records is determined by a heuristic search in the set of second source records. A candidate pair feature vector associated with each candidate pair of first and second source records is generated. An entity matching machine learning model predicts matching first source records for each candidate second source record based on the respective candidate pair feature vector. An aggregate quantity associated with the matching first source records is aggregated from a quantity associated with each first source record, and a quantity index for each candidate second source record is determined based on the aggregate quantities. Each quantity index is displayed to a user.
Type: Grant
Filed: May 29, 2020
Date of Patent: November 16, 2021
Assignee: Capital One Services, LLC
Inventors: Tanveer Faruquie, Aman Jain, Jihan Wei, Amir Reza Rahmani, Christopher Johnson
-
Publication number: 20210142191
Abstract: In order to facilitate the entity resolution and entity activity tracking and indexing, systems and methods include receiving first source records from a first database and second source records from a record database. A candidate set of second source records is determined by a heuristic search in the set of second source records. A candidate pair feature vector associated with each candidate pair of first and second source records is generated. An entity matching machine learning model predicts matching first source records for each candidate second source record based on the respective candidate pair feature vector. An aggregate quantity associated with the matching first source records is aggregated from a quantity associated with each first source record, and a quantity index for each candidate second source record is determined based on the aggregate quantities. Each quantity index is displayed to a user.
Type: Application
Filed: May 29, 2020
Publication date: May 13, 2021
Inventors: Tanveer Faruquie, Aman Jain, Jihan Wei, Amir Reza Rahmani, Christopher Johnson
-
Patent number: 10713577
Abstract: In order to facilitate the entity resolution and entity activity tracking and indexing, systems and methods include receiving first source records from a first database and second source records from a record database. A candidate set of second source records is determined by a heuristic search in the set of second source records. A candidate pair feature vector associated with each candidate pair of first and second source records is generated. An entity matching machine learning model predicts matching first source records for each candidate second source record based on the respective candidate pair feature vector. An aggregate quantity associated with the matching first source records is aggregated from a quantity associated with each first source record, and a quantity index for each candidate second source record is determined based on the aggregate quantities. Each quantity index is displayed to a user.
Type: Grant
Filed: November 8, 2019
Date of Patent: July 14, 2020
Assignee: Capital One Services, LLC
Inventors: Tanveer Faruquie, Aman Jain, Jihan Wei, Amir Reza Rahmani, Christopher Johnson
-
Patent number: 10339107
Abstract: Methods, systems, and computer program products for multi-level colocation and analytical processing of spatial data on MapReduce are provided herein. A method includes correlating multiple items of spatial data and multiple items of attribute data within a file system to generate multiple blocks of correlated data; colocating each of the multiple blocks of correlated data on a given node within the file system based on a data block placement policy; and clustering multiple replicas generated for each of the multiple data blocks at multiple levels of spatial granularity within the file system.
Type: Grant
Filed: June 8, 2015
Date of Patent: July 2, 2019
Assignee: International Business Machines Corporation
Inventors: Tanveer A. Faruquie, Himanshu Gupta, Sriram Lakshminarasimhan, Sameep Mehta, Stuart A. Siegel
-
Patent number: 10163063
Abstract: Computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.
Type: Grant
Filed: March 7, 2012
Date of Patent: December 25, 2018
Assignee: International Business Machines Corporation
Inventors: Snigdha Chaturvedi, Tanveer A Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
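As a rough illustration of the two steps in this abstract (find the N most frequent sub-patterns, then group them so that every member of a group is within distance D of every other member), here is a small Python sketch. The choices below are assumptions and not the patented method: sub-patterns are taken to be word bigrams, the distance is 1 minus `difflib.SequenceMatcher.ratio()`, and groups are built greedily, so the number of groups K falls out of D instead of being fixed in advance.

```python
from collections import Counter
from difflib import SequenceMatcher


def frequent_subpatterns(lines, n_top, size=2):
    """Return the n_top most frequent word n-grams (assumed sub-pattern form)."""
    counts = Counter()
    for line in lines:
        words = line.lower().split()
        for i in range(len(words) - size + 1):
            counts[" ".join(words[i:i + size])] += 1
    return [pattern for pattern, _ in counts.most_common(n_top)]


def distance(a, b):
    """Dissimilarity in [0, 1]; 0 means the strings are identical."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def cluster(patterns, max_distance):
    """Greedy grouping: a pattern joins the first group in which it is
    within max_distance of every existing member, else starts a new group."""
    groups = []
    for pattern in patterns:
        for group in groups:
            if all(distance(pattern, member) <= max_distance for member in group):
                group.append(pattern)
                break
        else:
            groups.append([pattern])
    return groups


addresses = ["12 main street", "44 main street", "9 main st"]
top = frequent_subpatterns(addresses, n_top=1)
groups = cluster(["main street", "main st", "po box"], max_distance=0.4)
```

The every-member check mirrors the abstract's wording that D relates a sub-pattern to every other sub-pattern in its group; a centroid-based method would be a different design.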
-
Patent number: 10120916
Abstract: The present invention relates to data cleansing, and in particular performing the semantic standardization process within a database before the transform portion of the extract-transform-load (ETL) process. Provided are a method, system and computer program product for standardizing data within a database engine, configuring the standardization function to determine at least one standardized value for at least one data value by applying the standardization table in a context of at least one data value, receiving a database query identifying the standardization function, at least one database value and the context of the data, and invoking the standardization function.
Type: Grant
Filed: June 11, 2012
Date of Patent: November 6, 2018
Assignee: International Business Machines Corporation
Inventors: Tanveer A. Faruquie, Mukesh K. Mohania, L. Venkata Subramaniam, Charles D. Wolfson
-
Patent number: 10095780
Abstract: Computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.
Type: Grant
Filed: February 7, 2017
Date of Patent: October 9, 2018
Assignee: International Business Machines Corporation
Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
-
Publication number: 20170147688
Abstract: Computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.
Type: Application
Filed: February 7, 2017
Publication date: May 25, 2017
Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
-
Publication number: 20160357775
Abstract: Methods, systems, and computer program products for multi-level colocation and analytical processing of spatial data on MapReduce are provided herein. A method includes correlating multiple items of spatial data and multiple items of attribute data within a file system to generate multiple blocks of correlated data; colocating each of the multiple blocks of correlated data on a given node within the file system based on a data block placement policy; and clustering multiple replicas generated for each of the multiple data blocks at multiple levels of spatial granularity within the file system.
Type: Application
Filed: June 8, 2015
Publication date: December 8, 2016
Inventors: Tanveer A. Faruquie, Himanshu Gupta, Sriram Lakshminarasimhan, Sameep Mehta, Stuart A. Siegel
-
Patent number: 9396433
Abstract: Techniques, systems, and articles of manufacture for determining related data points from multi-modal inputs. A method includes collecting multiple items of multi-modal data comprising at least one dimension from multiple data sources, wherein said at least one dimension comprises a geographic dimension, a temporal dimension and/or an event-related dimension, determining a window of relevance for each of the multiple items of multi-modal data with respect to the at least one dimension, and identifying two or more of the multiple items of multi-modal data as related, by determining an overlap of the window of relevance corresponding to each of the two or more items of multi-modal data with respect to the at least one dimension.
Type: Grant
Filed: May 29, 2013
Date of Patent: July 19, 2016
Assignee: International Business Machines Corporation
Inventors: L. Venkata Subramaniam, Sameep Mehta, Raghuram Krishnapuram, Tanveer A. Faruquie
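The window-of-relevance idea in this abstract reduces, for a single dimension, to an interval overlap test. The Python sketch below shows that test for the temporal dimension only. The `Item` class, its field names, and the sample values are hypothetical, and how a window is actually derived for each item is not specified in the abstract, so the windows are simply given as inputs.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Item:
    """One multi-modal data item with an assumed temporal window of relevance."""
    name: str
    start: float  # window start, e.g. minutes since a reference time
    end: float    # window end (inclusive)


def overlaps(a, b):
    """Two closed intervals overlap when each starts before the other ends."""
    return a.start <= b.end and b.start <= a.end


def related_pairs(items):
    """Return the names of every pair of items whose windows overlap."""
    return [(a.name, b.name) for a, b in combinations(items, 2) if overlaps(a, b)]


items = [
    Item("tweet", 0, 10),
    Item("photo", 5, 15),
    Item("report", 20, 30),
]
pairs = related_pairs(items)
```

Extending this to a geographic dimension would mean replacing the interval test with a region intersection test and requiring overlap in each dimension considered.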
-
Patent number: 9311380
Abstract: Techniques, systems, and articles of manufacture for processing spatial joins using a MapReduce framework. A method includes partitioning a spatial data domain based on a distribution of spatial data objects across multiple nodes of a cluster of machines, defining at least one operation to be performed on the partitioned spatial data domain based on one or more predicates of a query, and executing the at least one defined operation on the partitioned spatial data domain to determine a response to the query.
Type: Grant
Filed: March 29, 2013
Date of Patent: April 12, 2016
Assignee: International Business Machines Corporation
Inventors: Bhupesh S. Chawda, Himanshu Gupta, Tanveer A Faruquie, L. Venkata Subramaniam
-
Patent number: 9213574
Abstract: A method, system and a computer program product for determining resources allocation in a distributed computing environment. An embodiment may include identifying resources in a distributed computing environment, computing provisioning parameters, computing configuration parameters and quantifying service parameters in response to a set of service level agreements (SLA). The embodiment may further include iteratively computing a completion time required for completion of the assigned task and a cost. Embodiments may further include computing an optimal resources configuration and computing at least one of an optimal completion time and an optimal cost corresponding to the optimal resources configuration. Embodiments may further include dynamically modifying the optimal resources configuration in response to at least one change in at least one of provisioning parameters, computing parameters and quantifying service parameters.
Type: Grant
Filed: January 30, 2010
Date of Patent: December 15, 2015
Assignee: International Business Machines Corporation
Inventors: Tanveer A Faruquie, Hima P Karanam, Mukesh K Mohania, L Venkata Subramaniam, Girish Venkatachaliah
-
Patent number: 9104709
Abstract: According to one embodiment of the present invention, a system controls cleansing of data within a database system, and comprises a computer system including at least one processor. The system receives a data set from the database system, and one or more features of the data set are selected for determining values for one or more characteristics of the selected features. The determined values are applied to a data quality estimation model to determine data quality estimates for the data set. Problematic data within the data set are identified based on the data quality estimates, where the cleansing is adjusted to accommodate the identified problematic data. Embodiments of the present invention further include a method and computer program product for controlling cleansing of data within a database system in substantially the same manner described above.
Type: Grant
Filed: March 16, 2012
Date of Patent: August 11, 2015
Assignee: International Business Machines Corporation
Inventors: Snigdha Chaturvedi, Tanveer A Faruquie, Hima P Karanam, Mukesh K Mohania, L Venkata Subramaniam
-
Patent number: 8996524
Abstract: Methods, computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.
Type: Grant
Filed: March 8, 2012
Date of Patent: March 31, 2015
Assignee: International Business Machines Corporation
Inventors: Snigdha Chaturvedi, Tanveer A Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
-
Publication number: 20140358843
Abstract: Techniques, systems, and articles of manufacture for determining related data points from multi-modal inputs. A method includes collecting multiple items of multi-modal data comprising at least one dimension from multiple data sources, wherein said at least one dimension comprises a geographic dimension, a temporal dimension and/or an event-related dimension, determining a window of relevance for each of the multiple items of multi-modal data with respect to the at least one dimension, and identifying two or more of the multiple items of multi-modal data as related, by determining an overlap of the window of relevance corresponding to each of the two or more items of multi-modal data with respect to the at least one dimension.
Type: Application
Filed: May 29, 2013
Publication date: December 4, 2014
Inventors: L. Venkata Subramaniam, Sameep Mehta, Raghuram Krishnapuram, Tanveer A. Faruquie
-
Patent number: 8700542
Abstract: Systems, methods, and computer products for optimally managing large rule sets are disclosed. Rule dependencies of rules within a set of rules may be determined as a function of rules execution frequency data generated from applying the rules over a data set. The rules within the set of rules may be clustered into rules clusters based on the determined rule dependencies, in which the rules clusters comprise disjoint subsets of the rules within the set of rules. Cluster frequency data for the rules clusters may be used to arrive at an optimal ordering. Each rule within the set of rules may be assigned a unique identification that may capture an execution order of the rules within the set of rules.
Type: Grant
Filed: December 15, 2010
Date of Patent: April 15, 2014
Assignee: International Business Machines Corporation
Inventors: Mohan N. Dani, Tanveer A. Faruquie, Hima P. Karanam, L. Venkata Subramaniam, Girish Venkatachaliah
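One way to picture the steps in this abstract is the Python sketch below, which clusters rules into disjoint subsets, orders the clusters by frequency, and assigns each rule an identifier that encodes its position. It rests on assumptions that the abstract does not confirm: a "dependency" is taken to mean that two rules fired together on at least `min_cofire` records, clusters are the connected components of that relation (via union-find), and the identifier format `cluster.rank` is invented for illustration.

```python
from collections import Counter
from itertools import combinations


def cluster_and_order(firings, min_cofire=2):
    """firings: list of sets, each holding the rule names that fired on one record.
    Returns a dict mapping each rule to an ID of the form 'cluster.rank'."""
    freq = Counter(rule for fired in firings for rule in fired)
    cofire = Counter(
        pair for fired in firings for pair in combinations(sorted(fired), 2)
    )

    # Union-find over rules; frequently co-firing rules end up in one cluster.
    parent = {rule: rule for rule in freq}

    def find(rule):
        while parent[rule] != rule:
            parent[rule] = parent[parent[rule]]  # path halving
            rule = parent[rule]
        return rule

    for (a, b), count in cofire.items():
        if count >= min_cofire:
            parent[find(a)] = find(b)

    clusters = {}
    for rule in freq:
        clusters.setdefault(find(rule), []).append(rule)

    # Most frequently firing clusters first; names break ties deterministically.
    ordered = sorted(
        clusters.values(),
        key=lambda c: (-sum(freq[r] for r in c), sorted(c)),
    )

    ids = {}
    for cluster_rank, members in enumerate(ordered):
        ranked = sorted(members, key=lambda r: (-freq[r], r))
        for rule_rank, rule in enumerate(ranked):
            ids[rule] = f"{cluster_rank}.{rule_rank}"
    return ids


log = [{"r1", "r2"}, {"r1", "r2"}, {"r3"}, {"r1"}]
rule_ids = cluster_and_order(log)
```

Sorting the identifiers then reproduces the execution order: clusters by total firing count, and rules within a cluster by their own firing count.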