Patents by Inventor Tanveer A Faruquie

Tanveer A Faruquie has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8682898
    Abstract: A clustering-based approach to data standardization is provided. Certain embodiments take as input a plurality of addresses, identify one or more features of the addresses, cluster the addresses based on the one or more features, utilize the cluster(s) to provide a data-based context useful in identifying one or more synonyms for elements contained in the address(es), and standardize the address(es) to an acceptable format, with one or more synonyms and/or other elements being added to or taken away from the input address(es) as part of the standardization process.
    Type: Grant
    Filed: April 30, 2010
    Date of Patent: March 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: Sachindra Joshi, Tanveer Faruquie, Hima Prasad Karanam, Marvin Mendelssohn, Mukesh Kumar Mohania, Angel Marie Smith, L Venkata Subramaniam, Girish Venkatachaliah
  • Publication number: 20130332408
    Abstract: The present invention relates to data cleansing, and in particular performing the semantic standardization process within a database before the transform portion of the extract-transform-load (ETL) process. Provided are a method, system and computer program product for standardizing data within a database engine, configuring the standardization function to determine at least one standardized value for at least one data value by applying the standardization table in a context of at least one data value, receiving a database query identifying the standardization function, at least one database value and the context of the data, and invoking the standardization function.
    Type: Application
    Filed: July 31, 2013
    Publication date: December 12, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Tanveer A. Faruquie, Mukesh K. Mohania, L. V. Subramaniam, Charles D. Wolfson
  • Publication number: 20130332407
    Abstract: The present invention relates to data cleansing, and in particular performing the semantic standardization process within a database before the transform portion of the extract-transform-load (ETL) process. Provided are a method, system and computer program product for standardizing data within a database engine, configuring the standardization function to determine at least one standardized value for at least one data value by applying the standardization table in a context of at least one data value, receiving a database query identifying the standardization function, at least one database value and the context of the data, and invoking the standardization function.
    Type: Application
    Filed: June 11, 2012
    Publication date: December 12, 2013
    Applicant: International Business Machines Corporation
    Inventors: Tanveer A. Faruquie, Mukesh K. Mohania, L. V. Subramaniam, Charles D. Wolfson
  • Patent number: 8560506
    Abstract: A method of blocking column selection can include determining a first parameter for each column set of a plurality of column sets, wherein the first parameter indicates distribution of blocks in the column set, and determining a second parameter for each column set. The second parameter can indicate block size for the column set. For each column set, a measure of blockability that is dependent upon at least the first parameter and the second parameter can be calculated using a processor. The plurality of column sets can be ranked according to the measures of blockability.
    Type: Grant
    Filed: April 16, 2012
    Date of Patent: October 15, 2013
    Assignee: International Business Machines Corporation
    Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
  • Patent number: 8560505
    Abstract: Blocking column selection can include determining a first parameter for each column set of a plurality of column sets, wherein the first parameter indicates distribution of blocks in the column set, and determining a second parameter for each column set. The second parameter can indicate block size for the column set. For each column set, a measure of blockability that is dependent upon at least the first parameter and the second parameter can be calculated using a processor. The plurality of column sets can be ranked according to the measures of blockability.
    Type: Grant
    Filed: December 7, 2011
    Date of Patent: October 15, 2013
    Assignee: International Business Machines Corporation
    Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
  • Publication number: 20130238610
    Abstract: Computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.
    Type: Application
    Filed: March 7, 2012
    Publication date: September 12, 2013
    Applicant: International Business Machines Corporation
    Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
  • Publication number: 20130238611
    Abstract: Methods, computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.
    Type: Application
    Filed: March 8, 2012
    Publication date: September 12, 2013
    Applicant: International Business Machines Corporation
    Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
  • Publication number: 20130151487
    Abstract: Blocking column selection can include determining a first parameter for each column set of a plurality of column sets, wherein the first parameter indicates distribution of blocks in the column set, and determining a second parameter for each column set. The second parameter can indicate block size for the column set. For each column set, a measure of blockability that is dependent upon at least the first parameter and the second parameter can be calculated using a processor. The plurality of column sets can be ranked according to the measures of blockability.
    Type: Application
    Filed: December 7, 2011
    Publication date: June 13, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: SNIGDHA CHATURVEDI, TANVEER A. FARUQUIE, HIMA P. KARANAM, MARVIN MENDELSSOHN, MUKESH K. MOHANIA, L. VENKATA SUBRAMANIAM
  • Publication number: 20130151490
    Abstract: A method of blocking column selection can include determining a first parameter for each column set of a plurality of column sets, wherein the first parameter indicates distribution of blocks in the column set, and determining a second parameter for each column set. The second parameter can indicate block size for the column set. For each column set, a measure of blockability that is dependent upon at least the first parameter and the second parameter can be calculated using a processor. The plurality of column sets can be ranked according to the measures of blockability.
    Type: Application
    Filed: April 16, 2012
    Publication date: June 13, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: SNIGDHA CHATURVEDI, TANVEER A. FARUQUIE, HIMA P. KARANAM, MARVIN MENDELSSOHN, MUKESH K. MOHANIA, L. VENKATA SUBRAMANIAM
  • Publication number: 20120179658
    Abstract: According to one embodiment of the present invention, a system controls cleansing of data within a database system, and comprises a computer system including at least one processor. The system receives a data set from the database system, and one or more features of the data set are selected for determining values for one or more characteristics of the selected features. The determined values are applied to a data quality estimation model to determine data quality estimates for the data set. Problematic data within the data set are identified based on the data quality estimates, where the cleansing is adjusted to accommodate the identified problematic data. Embodiments of the present invention further include a method and computer program product for controlling cleansing of data within a database system in substantially the same manner described above.
    Type: Application
    Filed: March 16, 2012
    Publication date: July 12, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Mukesh K. Mohania, L. Venkata Subramaniam
  • Publication number: 20120158619
    Abstract: Systems, methods, and computer products for optimally managing large rule sets are disclosed. Rule dependencies of rules within a set of rules may be determined as a function of rules execution frequency data generated from applying the rules over a data set. The rules within the set of rules may be clustered into rules clusters based on the determined rule dependencies, in which the rules clusters comprise disjoint subsets of the rules within the set of rules. Cluster frequency data for the rules clusters may be used to arrive at an optimal ordering.
    Type: Application
    Filed: December 15, 2010
    Publication date: June 21, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: MOHAN N. DANI, TANVEER A. FARUQUIE, HIMA P. KARANAM, L.V. SUBRAMANIAM, GIRISH VENKATACHALIAH
  • Publication number: 20120150825
    Abstract: According to one embodiment of the present invention, a system controls cleansing of data within a database system, and comprises a computer system including at least one processor. The system receives a data set from the database system, and one or more features of the data set are selected for determining values for one or more characteristics of the selected features. The determined values are applied to a data quality estimation model to determine data quality estimates for the data set. Problematic data within the data set are identified based on the data quality estimates, where the cleansing is adjusted to accommodate the identified problematic data. Embodiments of the present invention further include a method and computer program product for controlling cleansing of data within a database system in substantially the same manner described above.
    Type: Application
    Filed: December 13, 2010
    Publication date: June 14, 2012
    Applicant: International Business Machines Corporation
    Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Mukesh K. Mohania, L. Venkata Subramaniam
  • Publication number: 20110320442
    Abstract: Systems and associated methods providing a document corpus navigation interface including domain independent facets are described. Embodiments provide a list of domain independent facets, extract facet values from the document corpus, learn the facet values from the corpus, map each document to one of the values of each of the facets, and automatically determine a weight of the relationship.
    Type: Application
    Filed: June 25, 2010
    Publication date: December 29, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Tanveer A. Faruquie, Mukesh K. Mohania, Ullas B. Nambiar
  • Publication number: 20110270808
    Abstract: A clustering-based approach to data standardization is provided. Certain embodiments take as input a plurality of addresses, identify one or more features of the addresses, cluster the addresses based on the one or more features, utilize the cluster(s) to provide a data-based context useful in identifying one or more synonyms for elements contained in the address(es), and standardize the address(es) to an acceptable format, with one or more synonyms and/or other elements being added to or taken away from the input address(es) as part of the standardization process.
    Type: Application
    Filed: April 30, 2010
    Publication date: November 3, 2011
    Applicant: International Business Machines Corporation
    Inventors: Tanveer A. Faruquie, Sachindra Joshi, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, Angel Smith, L. V. Subramaniam, Girish Venkatachaliah
  • Patent number: 8010595
    Abstract: A system (30) and method are provided for single-pass execution of dynamic pages across multiple request-response cycles. The system (30) comprises a client (32) and server (34) in communication with one another. A container (35) resides on the server and handles requests made for the result of a dynamic page (36). The container controls the processing of the dynamic page. If the dynamic page requires additional information to continue processing, an intermediate request (44) is transmitted to the client, which responds with an intermediate response (46) containing the additional information. A notifier servlet (38) receives the intermediate response and passes the information to the dynamic page so that execution can resume without interruption.
    Type: Grant
    Filed: November 29, 2005
    Date of Patent: August 30, 2011
    Assignee: International Business Machines Corporation
    Inventors: Tanveer A Faruquie, Sandeep Jindal, Abhishek Verma
  • Publication number: 20110191781
    Abstract: A method, system and a computer program product for determining resources allocation in a distributed computing environment. An embodiment may include identifying resources in a distributed computing environment, computing provisioning parameters, computing configuration parameters and quantifying service parameters in response to a set of service level agreements (SLA). The embodiment may further include iteratively computing a completion time required for completion of the assigned task and a cost. Embodiments may further include computing an optimal resources configuration and computing at least one of an optimal completion time and an optimal cost corresponding to the optimal resources configuration. Embodiments may further include dynamically modifying the optimal resources configuration in response to at least one change in at least one of provisioning parameters, computing parameters and quantifying service parameters.
    Type: Application
    Filed: January 30, 2010
    Publication date: August 4, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Hima P. Karanam, Tanveer A. Faruquie, L. Venkata Subramaniam, Mukesh K. Mohania, Girish Venkatachaliah
  • Patent number: 7978853
    Abstract: Techniques for protecting information in an audio file are provided. The techniques include obtaining an audio file, detecting information beating one or more segments in a speech signal, wherein the information comprises information sought for protection, encrypting the information sought for protection by scrambling the one or more segments using a scrambling filter, and selectively decrypting an amount of the encrypted information, wherein the amount of the encrypted information to be decrypted depends on user access privilege, and wherein selectively decrypting the amount of the encrypted information protects said amount of the encrypted information. Techniques are also provided for protecting information in an audio file.
    Type: Grant
    Filed: January 31, 2008
    Date of Patent: July 12, 2011
    Assignee: International Business Machines Corporation
    Inventors: Raghuram Krishnapuram, Nandita J. Mahajan, L. Venkat Subramaniam, Vivek Tyagi, Tanveer A. Faruquie
  • Patent number: 7974411
    Abstract: Techniques for protecting information in an audio file are provided. The techniques include obtaining an audio file, detecting information bearing one or more segments in a speech signal, wherein the information comprises information sought for protection, encrypting the information sought for protection by scrambling the one or more segments using a scrambling filter, and selectively decrypting an amount of the encrypted information, wherein the amount of the encrypted information to be decrypted depends on user access privilege, and wherein selectively decrypting the amount of the encrypted information protects said amount of the encrypted information. Techniques are also provided for protecting information in an audio file.
    Type: Grant
    Filed: January 31, 2008
    Date of Patent: July 5, 2011
    Assignee: International Business Machines Corporation
    Inventors: Raghuram Krishnapuram, Nandita J. Mahajan, L. Venkat Subramaniam, Vivek Tyagi, Tanveer A. Faruquie
  • Patent number: 7904399
    Abstract: A method for determining a decision point in real-time for a data stream from a conversation includes receiving streaming conversational data; and determining when to classify the streaming conversational data, using a measure of certainty, by performing certainty calculations at a plurality of time instances during the conversation and by selecting a decision point in response to the certainty calculations, the decision point not being based on a fixed window of conversational data but being based on accumulated conversational data available at different ones of the plurality of time instances. Systems and computer program products are also provided.
    Type: Grant
    Filed: November 15, 2007
    Date of Patent: March 8, 2011
    Assignee: International Business Machines Corporation
    Inventors: L. Venkata Subramaniam, Ganesh Ramakrishnan, Tanveer A Faruquie
  • Publication number: 20100332424
    Abstract: Techniques for identifying one or more inconsistencies between an unstructured document and a back-end fact-base are provided. The techniques include automatically parsing a query document and comparing the document with a back-end fact-base comprising facts relevant to the document, identifying one or more inconsistencies between information mentioned in the document and the facts stored in the back-end fact-base, and providing a response to the query document, wherein the response additionally includes the one or more identified inconsistencies.
    Type: Application
    Filed: June 30, 2009
    Publication date: December 30, 2010
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Indrajit Bhattacharya, Tanveer A. Faruquie, Shantanu Godbole, Mukesh K. Mohania, Ullas B. Nambiar