Patents by Inventor L. Venkata Subramaniam

L. Venkata Subramaniam has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20130238611
    Abstract: Methods, computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.
    Type: Application
    Filed: March 8, 2012
    Publication date: September 12, 2013
    Applicant: International Business Machines Corporation
    Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
  • Publication number: 20130238610
    Abstract: Computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.
    Type: Application
    Filed: March 7, 2012
    Publication date: September 12, 2013
    Applicant: International Business Machines Corporation
    Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
  • Publication number: 20130151487
    Abstract: Blocking column selection can include determining a first parameter for each column set of a plurality of column sets, wherein the first parameter indicates distribution of blocks in the column set, and determining a second parameter for each column set. The second parameter can indicate block size for the column set. For each column set, a measure of blockability that is dependent upon at least the first parameter and the second parameter can be calculated using a processor. The plurality of column sets can be ranked according to the measures of blockability.
    Type: Application
    Filed: December 7, 2011
    Publication date: June 13, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: SNIGDHA CHATURVEDI, TANVEER A. FARUQUIE, HIMA P. KARANAM, MARVIN MENDELSSOHN, MUKESH K. MOHANIA, L. VENKATA SUBRAMANIAM
  • Publication number: 20130151490
    Abstract: A method of blocking column selection can include determining a first parameter for each column set of a plurality of column sets, wherein the first parameter indicates distribution of blocks in the column set, and determining a second parameter for each column set. The second parameter can indicate block size for the column set. For each column set, a measure of blockability that is dependent upon at least the first parameter and the second parameter can be calculated using a processor. The plurality of column sets can be ranked according to the measures of blockability.
    Type: Application
    Filed: April 16, 2012
    Publication date: June 13, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: SNIGDHA CHATURVEDI, TANVEER A. FARUQUIE, HIMA P. KARANAM, MARVIN MENDELSSOHN, MUKESH K. MOHANIA, L. VENKATA SUBRAMANIAM
  • Patent number: 8364485
    Abstract: Sentence boundaries in noisy conversational transcription data are automatically identified. Noise and transcription symbols are removed, and a training set is formed with sentence boundaries marked based on long silences or on manual markings in the transcribed data. Frequencies of head and tail n-grams that occur at the beginning and ending of sentences are determined from the training set. N-grams that occur a significant number of times in the middle of sentences in relation to their occurrences at the beginning or ending of sentences are filtered out. A boundary is marked before every head n-gram and after every tail n-gram occurring in the conversational data and remaining after filtering. Turns are identified. A boundary is marked after each turn, unless the turn ends with an impermissible tail word or is an incomplete turn. The marked boundaries in the conversational data identify sentence boundaries.
    Type: Grant
    Filed: August 27, 2007
    Date of Patent: January 29, 2013
    Assignee: International Business Machines Corporation
    Inventors: Tetsuya Nasukawa, Diwakar Punjani, Shourya Roy, L. Venkata Subramaniam, Hironori Takeuchi
  • Publication number: 20120323866
    Abstract: Described herein are methods, systems, apparatuses and products for efficient development of a rule-based system. An aspect provides a method including accessing data records; converting said data records to an intermediate form; utilizing intermediate forms to compute similarity scores for said data records; and selecting as an example to be provided for rule making at least one record of said data records having a maximum dissimilarity score indicative of dissimilarity to already considered examples.
    Type: Application
    Filed: August 29, 2012
    Publication date: December 20, 2012
    Applicant: INTERNATIONAL MACHINES CORPORATION
    Inventors: Snigdha Chaturvedi, Tanveer Afzal Faruquie, L. Venkata Subramaniam
  • Publication number: 20120221508
    Abstract: Described herein are methods, systems, apparatuses and products for efficient development of a rule-based system. An aspect provides a method including accessing data records; converting said data records to an intermediate form; utilizing intermediate forms to compute similarity scores for said data records; and selecting as an example to be provided for rule making at least one record of said data records having a maximum dissimilarity score indicative of dissimilarity to already considered examples.
    Type: Application
    Filed: February 28, 2011
    Publication date: August 30, 2012
    Applicant: INTERNATIONAL MACHINES CORPORATION
    Inventors: Snigdha Chaturvedi, Tanveer Afzal Faruquie, L. Venkata Subramaniam
  • Publication number: 20120179658
    Abstract: According to one embodiment of the present invention, a system controls cleansing of data within a database system, and comprises a computer system including at least one processor. The system receives a data set from the database system, and one or more features of the data set are selected for determining values for one or more characteristics of the selected features. The determined values are applied to a data quality estimation model to determine data quality estimates for the data set. Problematic data within the data set are identified based on the data quality estimates, where the cleansing is adjusted to accommodate the identified problematic data. Embodiments of the present invention further include a method and computer program product for controlling cleansing of data within a database system in substantially the same manner described above.
    Type: Application
    Filed: March 16, 2012
    Publication date: July 12, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Mukesh K. Mohania, L. Venkata Subramaniam
  • Publication number: 20120150825
    Abstract: According to one embodiment of the present invention, a system controls cleansing of data within a database system, and comprises a computer system including at least one processor. The system receives a data set from the database system, and one or more features of the data set are selected for determining values for one or more characteristics of the selected features. The determined values are applied to a data quality estimation model to determine data quality estimates for the data set. Problematic data within the data set are identified based on the data quality estimates, where the cleansing is adjusted to accommodate the identified problematic data. Embodiments of the present invention further include a method and computer program product for controlling cleansing of data within a database system in substantially the same manner described above.
    Type: Application
    Filed: December 13, 2010
    Publication date: June 14, 2012
    Applicant: International Business Machines Corporation
    Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Mukesh K. Mohania, L. Venkata Subramaniam
  • Publication number: 20120047179
    Abstract: Systems and associated methods for address standardization and applications related thereto are described. Embodiments exploit a common context in a taxonomy and a given address to detect and correct deviations in the address. Embodiments establish a possible path from a root of the taxonomy to a leaf in the taxonomy that can possibly generate a given address. Given a new address, embodiments use complete addresses, and/or segments or elements thereof, to compute the representations of the elements and find a closest matching leaf in the taxonomy. Embodiments then traverse the path to a root node to detect the agreement and disagreement between the path and the address entry. Taxonomical structured is thus used to detect, segregate and standardize the expected fields.
    Type: Application
    Filed: August 19, 2010
    Publication date: February 23, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Tanveer Afzal Faruquie, Sachindra Joshi, Hima Prasad Karanam, Mukesh Kumar Mohania, Sriram K. Padmanabhan, L. Venkata Subramaniam
  • Publication number: 20120039460
    Abstract: A method, a system and a computer program product for analyzing customer service quality is disclosed. A plurality of customer call service quality parameters is identified using historical data. The plurality of customer call service quality parameters is quantified and correlated. The customer service quality is analyzed using the plurality of customer call service quality parameters. A repository is generated using the historical data of a plurality of customer calls and a set of pre-defined customer call flow templates. A subset of service quality queries is identified using contextual information of the customer call from the repository of service quality queries. The subset of service quality queries is then interspersed in the customer call. The customer service quality is analyzed using responses to the subset of service quality queries.
    Type: Application
    Filed: August 13, 2010
    Publication date: February 16, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Raghuram Krishnapuram, L. Venkata Subramaniam
  • Patent number: 8005829
    Abstract: A keyword search system including a text input unit for inputting subtexts obtained by dividing each text into parts, while associating the subtexts with an event through a process recorded in the text; a prediction device adjuster for adjusting a corresponding event prediction device to maximize the percentage of text in which the inputted event is identical to a prediction result in a first text group selected from the subtexts; a prediction processor for generating a prediction result for each section, by inputting each text in a second text group selected from the corresponding subtexts in the adjusted event prediction device; and a search unit for calculating the prediction precision for the second text group of the event prediction device using a comparison between the inputted event and the prediction result for each subtext, and searching for keywords in sections with a certain degree of prediction precision.
    Type: Grant
    Filed: March 7, 2008
    Date of Patent: August 23, 2011
    Assignee: International Business Machines Corporation
    Inventors: Tetsuya Nasukawa, Shourya Roy, L. Venkata Subramaniam, Hironori Takeuchi
  • Publication number: 20110191781
    Abstract: A method, system and a computer program product for determining resources allocation in a distributed computing environment. An embodiment may include identifying resources in a distributed computing environment, computing provisioning parameters, computing configuration parameters and quantifying service parameters in response to a set of service level agreements (SLA). The embodiment may further include iteratively computing a completion time required for completion of the assigned task and a cost. Embodiments may further include computing an optimal resources configuration and computing at least one of an optimal completion time and an optimal cost corresponding to the optimal resources configuration. Embodiments may further include dynamically modifying the optimal resources configuration in response to at least one change in at least one of provisioning parameters, computing parameters and quantifying service parameters.
    Type: Application
    Filed: January 30, 2010
    Publication date: August 4, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Hima P. Karanam, Tanveer A. Faruquie, L. Venkata Subramaniam, Mukesh K. Mohania, Girish Venkatachaliah
  • Publication number: 20110078158
    Abstract: Techniques for enriching a taxonomy using one or more additional taxonomies are provided. The techniques include receiving two or more taxonomies, wherein the two or more taxonomies comprise a destination taxonomy and one or more additional taxonomies, determining one or more relevant portions of the two or more taxonomies by identifying one or more common terms between the two or more taxonomies, importing one or more relevant portions from the one or more additional taxonomies into the destination taxonomy, and using the one or more imported taxonomy portions to enrich the destination taxonomy.
    Type: Application
    Filed: September 29, 2009
    Publication date: March 31, 2011
    Applicant: International Business Machines Corporation
    Inventors: Sougata Mukherjea, Amit A. Nanavati, L. Venkata Subramaniam
  • Patent number: 7912714
    Abstract: A method is provided for forming discrete segment clusters of one or more sequential sentences from a corpus of communication transcripts of transactional communications that comprises dividing the communication transcripts of the corpus into a first set of sentences spoken by a caller and a second set of sentences spoken by a responder; generating a set of sentence clusters by grouping the first and second sets of sentences according to a measure of lexical similarity using an unsupervised partitional clustering method; generating a collection of sequences of sentence types by assigning a distinct sentence type to each sentence cluster and representing each sentence of each communication transcript of the corpus with the sentence type assigned to the sentence cluster into which the sentence is grouped; and generating a specified number of discrete segment clusters by successively merging sentence clusters according to a proximity-based measure between the sentence types assigned to the sentence clusters with
    Type: Grant
    Filed: April 1, 2008
    Date of Patent: March 22, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Krishna Kummamuru, Deepak S. Padmanaban, Shourya Roy, L. Venkata Subramaniam
  • Patent number: 7904399
    Abstract: A method for determining a decision point in real-time for a data stream from a conversation includes receiving streaming conversational data; and determining when to classify the streaming conversational data, using a measure of certainty, by performing certainty calculations at a plurality of time instances during the conversation and by selecting a decision point in response to the certainty calculations, the decision point not being based on a fixed window of conversational data but being based on accumulated conversational data available at different ones of the plurality of time instances. Systems and computer program products are also provided.
    Type: Grant
    Filed: November 15, 2007
    Date of Patent: March 8, 2011
    Assignee: International Business Machines Corporation
    Inventors: L. Venkata Subramaniam, Ganesh Ramakrishnan, Tanveer A Faruquie
  • Publication number: 20090132442
    Abstract: A method for determining a decision point in real-time for a data stream from a conversation includes receiving streaming conversational data; and determining when to classify the streaming conversational data, using a measure of certainty, by performing certainty calculations at a plurality of time instances during the conversation and by selecting a decision point in response to the certainty calculations, the decision point not being based on a fixed window of conversational data but being based on accumulated conversational data available at different ones of the plurality of time instances. Systems and computer program products are also provided.
    Type: Application
    Filed: November 15, 2007
    Publication date: May 21, 2009
    Inventors: L. Venkata Subramaniam, Ganesh Ramakrishnan, Tanveer A. Faruquie
  • Publication number: 20090112571
    Abstract: A method is provided for forming discrete segment clusters of one or more sequential sentences from a corpus of communication transcripts of transactional communications that comprises dividing the communication transcripts of the corpus into a first set of sentences spoken by a caller and a second set of sentences spoken by a responder; generating a set of sentence clusters by grouping the first and second sets of sentences according to a measure of lexical similarity using an unsupervised partitional clustering method; generating a collection of sequences of sentence types by assigning a distinct sentence type to each sentence cluster and representing each sentence of each communication transcript of the corpus with the sentence type assigned to the sentence cluster into which the sentence is grouped; and generating a specified number of discrete segment clusters by successively merging sentence clusters according to a proximity-based measure between the sentence types assigned to the sentence clusters with
    Type: Application
    Filed: April 1, 2008
    Publication date: April 30, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Krishna Kummamuru, Deepak S. Padmanabhan, Shourya Roy, L. Venkata Subramaniam
  • Publication number: 20090112588
    Abstract: A method is provided for forming discrete segment clusters of one or more sequential sentences from a corpus of communication transcripts of transactional communications that comprises dividing the communication transcripts of the corpus into a first set of sentences spoken by a caller and a second set of sentences spoken by a responder; generating a specified number of sentence clusters by grouping the first and second sets of sentences according to a measure of lexical similarity using an unsupervised partitional clustering method; generating a collection of sequences of sentence types by assigning a distinct sentence type to each sentence cluster and representing each sentence of each communication transcript of the corpus with the sentence type assigned to the sentence cluster into which the sentence is grouped; and generating a specified number of discrete segment clusters by successively merging sentence clusters according to a proximity-based measure between the sentence types assigned to the sentence
    Type: Application
    Filed: October 31, 2007
    Publication date: April 30, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Krishna Kummamuru, Deepak S. Padmanabhan, Shourya Roy, L. Venkata Subramaniam
  • Publication number: 20090063150
    Abstract: Sentence boundaries in noisy conversational transcription data are automatically identified. Noise and transcription symbols are removed, and a training set is formed with sentence boundaries marked based on long silences or on manual markings in the transcribed data. Frequencies of head and tail n-grams that occur at the beginning and ending of sentences are determined from the training set. N-grams that occur a significant number of times in the middle of sentences in relation to their occurrences at the beginning or ending of sentences are filtered out. A boundary is marked before every head n-gram and after every tail n-gram occurring in the conversational data and remaining after filtering. Turns are identified. A boundary is marked after each turn, unless the turn ends with an impermissible tail word or is an incomplete turn. The marked boundaries in the conversational data identify sentence boundaries.
    Type: Application
    Filed: August 27, 2007
    Publication date: March 5, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Tetsuya Nasukawa, Diwakar Punjani, Shourya Roy, L. Venkata Subramaniam, Hironori Takeuchi