Patents by Inventor L. Venkata Subramaniam

L. Venkata Subramaniam has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Automatically Mining Patterns for Rule Based Data Standardization Systems

Publication number: 20130238611

Abstract: Methods, computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.

Type: Application

Filed: March 8, 2012

Publication date: September 12, 2013

Applicant: International Business Machines Corporation

Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
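
Illustration (not taken from the application): the abstract above describes finding N frequent sub-patterns and grouping them by a distance value D. A minimal Python sketch of that idea follows. The function names, the use of token n-grams as sub-patterns, the positional-mismatch distance, and the greedy grouping are all assumptions made for this example; here the number of groups K falls out of the threshold rather than being supplied.

```python
from collections import Counter


def top_subpatterns(records, n, width=2):
    """Return the n most frequent token n-grams of the given width (assumed sub-pattern form)."""
    counts = Counter()
    for record in records:
        tokens = record.split()
        for i in range(len(tokens) - width + 1):
            counts[tuple(tokens[i:i + width])] += 1
    return [pattern for pattern, _ in counts.most_common(n)]


def distance(a, b):
    """Fraction of positions at which two equal-width sub-patterns differ (hypothetical D)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster(patterns, d):
    """Greedy grouping: a pattern joins the first group whose every member is within distance d."""
    groups = []
    for pattern in patterns:
        for group in groups:
            if all(distance(pattern, member) <= d for member in group):
                group.append(pattern)
                break
        else:
            groups.append([pattern])
    return groups
```

For example, `cluster([("MAIN", "ST"), ("MAIN", "RD"), ("OAK", "AVE")], 0.5)` places the two "MAIN" patterns in one group and the "OAK" pattern in another.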
Automatically Mining Patterns For Rule Based Data Standardization Systems

Publication number: 20130238610

Abstract: Computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.

Type: Application

Filed: March 7, 2012

Publication date: September 12, 2013

Applicant: International Business Machines Corporation

Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, L. Venkata Subramaniam
AUTOMATIC SELECTION OF BLOCKING COLUMN FOR DE-DUPLICATION

Publication number: 20130151487

Abstract: Blocking column selection can include determining a first parameter for each column set of a plurality of column sets, wherein the first parameter indicates distribution of blocks in the column set, and determining a second parameter for each column set. The second parameter can indicate block size for the column set. For each column set, a measure of blockability that is dependent upon at least the first parameter and the second parameter can be calculated using a processor. The plurality of column sets can be ranked according to the measures of blockability.

Type: Application

Filed: December 7, 2011

Publication date: June 13, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: SNIGDHA CHATURVEDI, TANVEER A. FARUQUIE, HIMA P. KARANAM, MARVIN MENDELSSOHN, MUKESH K. MOHANIA, L. VENKATA SUBRAMANIAM
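
Illustration (not taken from the application): the abstract above ranks column sets by a blockability measure built from a block-distribution parameter and a block-size parameter. The Python sketch below shows one way such a ranking could look. The specific choices are assumptions for this example only: normalized entropy stands in for the distribution parameter, the largest block's share of rows stands in for the size parameter, and their combination is a made-up score, not the formula in the filing.

```python
import math
from collections import Counter


def blockability(rows, columns):
    """Hypothetical score: evenly spread blocks with no dominant block score higher."""
    blocks = Counter(tuple(row[c] for c in columns) for row in rows)
    n = len(rows)
    sizes = list(blocks.values())
    # First parameter (assumed): normalized entropy of the block distribution, in [0, 1].
    if len(sizes) > 1:
        entropy = -sum((s / n) * math.log(s / n) for s in sizes) / math.log(len(sizes))
    else:
        entropy = 0.0
    # Second parameter (assumed): share of rows that fall in the largest block.
    largest = max(sizes) / n
    return entropy * (1.0 - largest)


def rank_column_sets(rows, column_sets):
    """Order candidate column sets from most to least blockable under the score above."""
    return sorted(column_sets, key=lambda cs: blockability(rows, cs), reverse=True)
```

This toy score rewards many small blocks, so a practical measure would also need to penalize column sets whose values are nearly unique, since those would separate true duplicates.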
AUTOMATIC SELECTION OF BLOCKING COLUMN FOR DE-DUPLICATION

Publication number: 20130151490

Abstract: A method of blocking column selection can include determining a first parameter for each column set of a plurality of column sets, wherein the first parameter indicates distribution of blocks in the column set, and determining a second parameter for each column set. The second parameter can indicate block size for the column set. For each column set, a measure of blockability that is dependent upon at least the first parameter and the second parameter can be calculated using a processor. The plurality of column sets can be ranked according to the measures of blockability.

Type: Application

Filed: April 16, 2012

Publication date: June 13, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: SNIGDHA CHATURVEDI, TANVEER A. FARUQUIE, HIMA P. KARANAM, MARVIN MENDELSSOHN, MUKESH K. MOHANIA, L. VENKATA SUBRAMANIAM
Method for automatically identifying sentence boundaries in noisy conversational data

Patent number: 8364485

Abstract: Sentence boundaries in noisy conversational transcription data are automatically identified. Noise and transcription symbols are removed, and a training set is formed with sentence boundaries marked based on long silences or on manual markings in the transcribed data. Frequencies of head and tail n-grams that occur at the beginning and ending of sentences are determined from the training set. N-grams that occur a significant number of times in the middle of sentences in relation to their occurrences at the beginning or ending of sentences are filtered out. A boundary is marked before every head n-gram and after every tail n-gram occurring in the conversational data and remaining after filtering. Turns are identified. A boundary is marked after each turn, unless the turn ends with an impermissible tail word or is an incomplete turn. The marked boundaries in the conversational data identify sentence boundaries.

Type: Grant

Filed: August 27, 2007

Date of Patent: January 29, 2013

Assignee: International Business Machines Corporation

Inventors: Tetsuya Nasukawa, Diwakar Punjani, Shourya Roy, L. Venkata Subramaniam, Hironori Takeuchi
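
Illustration (not taken from the patent): the abstract above counts head and tail n-grams in a training set, filters out those that are also common mid-sentence, and marks boundaries around the rest. The Python sketch below reduces this to single words (n = 1) and omits noise removal and turn handling. The function names and the `max_middle_ratio` filter threshold are assumptions for this example.

```python
from collections import Counter


def learn_boundary_words(training_sentences, max_middle_ratio=1.0):
    """Collect head/tail words, dropping those seen mid-sentence too often (assumed filter)."""
    head, tail, middle = Counter(), Counter(), Counter()
    for sentence in training_sentences:
        words = sentence.lower().split()
        if not words:
            continue
        head[words[0]] += 1
        tail[words[-1]] += 1
        middle.update(words[1:-1])
    heads = {w for w, c in head.items() if middle[w] <= max_middle_ratio * c}
    tails = {w for w, c in tail.items() if middle[w] <= max_middle_ratio * c}
    return heads, tails


def mark_boundaries(words, heads, tails):
    """Split a word stream: break before each head word and after each tail word."""
    sentences, current = [], []
    for word in words:
        if word in heads and current:
            sentences.append(current)
            current = []
        current.append(word)
        if word in tails:
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences
```

With a small training set in which "okay" and "thank" open sentences and "please" and "calling" close them, the stream "okay hold on please thank you for calling" splits into two sentences.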
EFFICIENT DEVELOPMENT OF A RULE-BASED SYSTEM USING CROWD-SOURCING

Publication number: 20120323866

Abstract: Described herein are methods, systems, apparatuses and products for efficient development of a rule-based system. An aspect provides a method including accessing data records; converting said data records to an intermediate form; utilizing intermediate forms to compute similarity scores for said data records; and selecting as an example to be provided for rule making at least one record of said data records having a maximum dissimilarity score indicative of dissimilarity to already considered examples.

Type: Application

Filed: August 29, 2012

Publication date: December 20, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Snigdha Chaturvedi, Tanveer Afzal Faruquie, L. Venkata Subramaniam
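
Illustration (not taken from the application): the abstract above converts records to an intermediate form, scores similarity, and picks as the next rule-making example the record most dissimilar to those already considered. A minimal Python sketch of that selection step follows. The token-set intermediate form, the Jaccard similarity, and all function names are assumptions for this example.

```python
def to_intermediate(record):
    """Assumed intermediate form: the set of lowercase tokens in the record."""
    return frozenset(record.lower().split())


def similarity(a, b):
    """Jaccard similarity between two token sets (a stand-in for the similarity score)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def next_example(records, considered):
    """Return the record whose nearest already-considered example is farthest away."""
    seen = [to_intermediate(r) for r in considered]

    def dissimilarity(record):
        form = to_intermediate(record)
        return min((1.0 - similarity(form, s) for s in seen), default=1.0)

    candidates = [r for r in records if r not in considered]
    return max(candidates, key=dissimilarity, default=None)
```

Given the records "12 main st", "14 main st" and "po box 7" with "12 main st" already considered, the sketch selects "po box 7", which shares no tokens with the considered example.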
SYSTEMS AND METHODS FOR EFFICIENT DEVELOPMENT OF A RULE-BASED SYSTEM USING CROWD-SOURCING

Publication number: 20120221508

Abstract: Described herein are methods, systems, apparatuses and products for efficient development of a rule-based system. An aspect provides a method including accessing data records; converting said data records to an intermediate form; utilizing intermediate forms to compute similarity scores for said data records; and selecting as an example to be provided for rule making at least one record of said data records having a maximum dissimilarity score indicative of dissimilarity to already considered examples.

Type: Application

Filed: February 28, 2011

Publication date: August 30, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Snigdha Chaturvedi, Tanveer Afzal Faruquie, L. Venkata Subramaniam
Cleansing a Database System to Improve Data Quality

Publication number: 20120179658

Abstract: According to one embodiment of the present invention, a system controls cleansing of data within a database system, and comprises a computer system including at least one processor. The system receives a data set from the database system, and one or more features of the data set are selected for determining values for one or more characteristics of the selected features. The determined values are applied to a data quality estimation model to determine data quality estimates for the data set. Problematic data within the data set are identified based on the data quality estimates, where the cleansing is adjusted to accommodate the identified problematic data. Embodiments of the present invention further include a method and computer program product for controlling cleansing of data within a database system in substantially the same manner described above.

Type: Application

Filed: March 16, 2012

Publication date: July 12, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Mukesh K. Mohania, L. Venkata Subramaniam
Cleansing a Database System to Improve Data Quality

Publication number: 20120150825

Abstract: According to one embodiment of the present invention, a system controls cleansing of data within a database system, and comprises a computer system including at least one processor. The system receives a data set from the database system, and one or more features of the data set are selected for determining values for one or more characteristics of the selected features. The determined values are applied to a data quality estimation model to determine data quality estimates for the data set. Problematic data within the data set are identified based on the data quality estimates, where the cleansing is adjusted to accommodate the identified problematic data. Embodiments of the present invention further include a method and computer program product for controlling cleansing of data within a database system in substantially the same manner described above.

Type: Application

Filed: December 13, 2010

Publication date: June 14, 2012

Applicant: International Business Machines Corporation

Inventors: Snigdha Chaturvedi, Tanveer A. Faruquie, Hima P. Karanam, Mukesh K. Mohania, L. Venkata Subramaniam
SYSTEMS AND METHODS FOR STANDARDIZATION AND DE-DUPLICATION OF ADDRESSES USING TAXONOMY

Publication number: 20120047179

Abstract: Systems and associated methods for address standardization and applications related thereto are described. Embodiments exploit a common context in a taxonomy and a given address to detect and correct deviations in the address. Embodiments establish a possible path from a root of the taxonomy to a leaf in the taxonomy that can possibly generate a given address. Given a new address, embodiments use complete addresses, and/or segments or elements thereof, to compute the representations of the elements and find a closest matching leaf in the taxonomy. Embodiments then traverse the path to a root node to detect the agreement and disagreement between the path and the address entry. Taxonomical structure is thus used to detect, segregate and standardize the expected fields.

Type: Application

Filed: August 19, 2010

Publication date: February 23, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Tanveer Afzal Faruquie, Sachindra Joshi, Hima Prasad Karanam, Mukesh Kumar Mohania, Sriram K. Padmanabhan, L. Venkata Subramaniam
CUSTOMER SERVICE ANALYSIS

Publication number: 20120039460

Abstract: A method, a system and a computer program product for analyzing customer service quality are disclosed. A plurality of customer call service quality parameters is identified using historical data. The plurality of customer call service quality parameters is quantified and correlated. The customer service quality is analyzed using the plurality of customer call service quality parameters. A repository is generated using the historical data of a plurality of customer calls and a set of pre-defined customer call flow templates. A subset of service quality queries is identified using contextual information of the customer call from the repository of service quality queries. The subset of service quality queries is then interspersed in the customer call. The customer service quality is analyzed using responses to the subset of service quality queries.

Type: Application

Filed: August 13, 2010

Publication date: February 16, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Raghuram Krishnapuram, L. Venkata Subramaniam
Technique for searching for keywords determining event occurrence

Patent number: 8005829

Abstract: A keyword search system including a text input unit for inputting subtexts obtained by dividing each text into parts, while associating the subtexts with an event through a process recorded in the text; a prediction device adjuster for adjusting a corresponding event prediction device to maximize the percentage of text in which the inputted event is identical to a prediction result in a first text group selected from the subtexts; a prediction processor for generating a prediction result for each section, by inputting each text in a second text group selected from the corresponding subtexts in the adjusted event prediction device; and a search unit for calculating the prediction precision for the second text group of the event prediction device using a comparison between the inputted event and the prediction result for each subtext, and searching for keywords in sections with a certain degree of prediction precision.

Type: Grant

Filed: March 7, 2008

Date of Patent: August 23, 2011

Assignee: International Business Machines Corporation

Inventors: Tetsuya Nasukawa, Shourya Roy, L. Venkata Subramaniam, Hironori Takeuchi
RESOURCES MANAGEMENT IN DISTRIBUTED COMPUTING ENVIRONMENT

Publication number: 20110191781

Abstract: A method, system and a computer program product for determining resources allocation in a distributed computing environment. An embodiment may include identifying resources in a distributed computing environment, computing provisioning parameters, computing configuration parameters and quantifying service parameters in response to a set of service level agreements (SLA). The embodiment may further include iteratively computing a completion time required for completion of the assigned task and a cost. Embodiments may further include computing an optimal resources configuration and computing at least one of an optimal completion time and an optimal cost corresponding to the optimal resources configuration. Embodiments may further include dynamically modifying the optimal resources configuration in response to at least one change in at least one of provisioning parameters, computing parameters and quantifying service parameters.

Type: Application

Filed: January 30, 2010

Publication date: August 4, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Hima P. Karanam, Tanveer A. Faruquie, L. Venkata Subramaniam, Mukesh K. Mohania, Girish Venkatachaliah
Automatic Taxonomy Enrichment

Publication number: 20110078158

Abstract: Techniques for enriching a taxonomy using one or more additional taxonomies are provided. The techniques include receiving two or more taxonomies, wherein the two or more taxonomies comprise a destination taxonomy and one or more additional taxonomies, determining one or more relevant portions of the two or more taxonomies by identifying one or more common terms between the two or more taxonomies, importing one or more relevant portions from the one or more additional taxonomies into the destination taxonomy, and using the one or more imported taxonomy portions to enrich the destination taxonomy.

Type: Application

Filed: September 29, 2009

Publication date: March 31, 2011

Applicant: International Business Machines Corporation

Inventors: Sougata Mukherjea, Amit A. Nanavati, L. Venkata Subramaniam
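
Illustration (not taken from the application): the abstract above identifies terms common to a destination taxonomy and additional taxonomies, then imports the relevant portions into the destination. The Python sketch below shows the idea on taxonomies represented as nested dictionaries, which is an assumption of this example, as is the function name. Terms are treated as common only when they match exactly at the same level.

```python
import copy


def enrich(destination, additional):
    """Import into destination (in place) the subtrees that additional holds under shared terms."""
    for term, subtree in additional.items():
        if term not in destination:
            continue  # not a common term at this level, so nothing is imported
        for child, grandchildren in subtree.items():
            if child in destination[term]:
                # The child is itself a common term: descend and merge below it.
                enrich(destination[term], {child: grandchildren})
            else:
                destination[term][child] = copy.deepcopy(grandchildren)
    return destination
```

For a destination `{"fruit": {"apple": {}}, "tool": {}}` and an additional taxonomy `{"fruit": {"apple": {"fuji": {}}, "pear": {}}, "car": {"sedan": {}}}`, the sketch adds "fuji" under "apple" and "pear" under "fruit", and leaves "car" out because the destination has no matching term.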
Method for segmenting communication transcripts using unsupervised and semi-supervised techniques

Patent number: 7912714

Abstract: A method is provided for forming discrete segment clusters of one or more sequential sentences from a corpus of communication transcripts of transactional communications that comprises dividing the communication transcripts of the corpus into a first set of sentences spoken by a caller and a second set of sentences spoken by a responder; generating a set of sentence clusters by grouping the first and second sets of sentences according to a measure of lexical similarity using an unsupervised partitional clustering method; generating a collection of sequences of sentence types by assigning a distinct sentence type to each sentence cluster and representing each sentence of each communication transcript of the corpus with the sentence type assigned to the sentence cluster into which the sentence is grouped; and generating a specified number of discrete segment clusters by successively merging sentence clusters according to a proximity-based measure between the sentence types assigned to the sentence clusters with

Type: Grant

Filed: April 1, 2008

Date of Patent: March 22, 2011

Assignee: Nuance Communications, Inc.

Inventors: Krishna Kummamuru, Deepak S. Padmanabhan, Shourya Roy, L. Venkata Subramaniam
Method and apparatus for determining decision points for streaming conversational data

Patent number: 7904399

Abstract: A method for determining a decision point in real-time for a data stream from a conversation includes receiving streaming conversational data; and determining when to classify the streaming conversational data, using a measure of certainty, by performing certainty calculations at a plurality of time instances during the conversation and by selecting a decision point in response to the certainty calculations, the decision point not being based on a fixed window of conversational data but being based on accumulated conversational data available at different ones of the plurality of time instances. Systems and computer program products are also provided.

Type: Grant

Filed: November 15, 2007

Date of Patent: March 8, 2011

Assignee: International Business Machines Corporation

Inventors: L. Venkata Subramaniam, Ganesh Ramakrishnan, Tanveer A. Faruquie
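
Illustration (not taken from the patent): the abstract above performs certainty calculations at successive time instances over the accumulated conversation and selects a decision point from them, rather than using a fixed window. The Python sketch below shows one simple reading of that idea. The keyword-to-class lookup, the smoothed margin used as the certainty measure, the threshold value, and the function names are all assumptions for this example.

```python
def certainty(scores):
    """Margin between the two best class scores, normalized by the total (assumed measure)."""
    ranked = sorted(scores.values(), reverse=True)
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    return (ranked[0] - runner_up) / sum(ranked)


def decision_point(stream, keyword_classes, threshold=0.4):
    """Return (time index, class) at the first instance where certainty reaches the threshold."""
    # Each class starts at 1.0 as a smoothing prior so one keyword alone is not decisive.
    scores = {label: 1.0 for label in sorted(set(keyword_classes.values()))}
    for t, word in enumerate(stream):
        label = keyword_classes.get(word)
        if label is not None:
            scores[label] += 1.0  # evidence accumulates over the whole conversation so far
        if certainty(scores) >= threshold:
            return t, max(scores, key=scores.get)
    return None  # the stream ended before the classifier became certain enough
```

The decision time varies with the content of the stream: a conversation with mixed evidence is classified later than one whose early words all point to the same class.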
Method and Apparatus for Determining Decision Points for Streaming Conversational Data

Publication number: 20090132442

Abstract: A method for determining a decision point in real-time for a data stream from a conversation includes receiving streaming conversational data; and determining when to classify the streaming conversational data, using a measure of certainty, by performing certainty calculations at a plurality of time instances during the conversation and by selecting a decision point in response to the certainty calculations, the decision point not being based on a fixed window of conversational data but being based on accumulated conversational data available at different ones of the plurality of time instances. Systems and computer program products are also provided.

Type: Application

Filed: November 15, 2007

Publication date: May 21, 2009

Inventors: L. Venkata Subramaniam, Ganesh Ramakrishnan, Tanveer A. Faruquie
METHOD FOR SEGMENTING COMMUNICATION TRANSCRIPTS USING UNSUPERVISED AND SEMI-SUPERVISED TECHNIQUES

Publication number: 20090112571

Abstract: A method is provided for forming discrete segment clusters of one or more sequential sentences from a corpus of communication transcripts of transactional communications that comprises dividing the communication transcripts of the corpus into a first set of sentences spoken by a caller and a second set of sentences spoken by a responder; generating a set of sentence clusters by grouping the first and second sets of sentences according to a measure of lexical similarity using an unsupervised partitional clustering method; generating a collection of sequences of sentence types by assigning a distinct sentence type to each sentence cluster and representing each sentence of each communication transcript of the corpus with the sentence type assigned to the sentence cluster into which the sentence is grouped; and generating a specified number of discrete segment clusters by successively merging sentence clusters according to a proximity-based measure between the sentence types assigned to the sentence clusters with

Type: Application

Filed: April 1, 2008

Publication date: April 30, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Krishna Kummamuru, Deepak S. Padmanabhan, Shourya Roy, L. Venkata Subramaniam
METHOD FOR SEGMENTING COMMUNICATION TRANSCRIPTS USING UNSUPERVISED AND SEMI-SUPERVISED TECHNIQUES

Publication number: 20090112588

Abstract: A method is provided for forming discrete segment clusters of one or more sequential sentences from a corpus of communication transcripts of transactional communications that comprises dividing the communication transcripts of the corpus into a first set of sentences spoken by a caller and a second set of sentences spoken by a responder; generating a specified number of sentence clusters by grouping the first and second sets of sentences according to a measure of lexical similarity using an unsupervised partitional clustering method; generating a collection of sequences of sentence types by assigning a distinct sentence type to each sentence cluster and representing each sentence of each communication transcript of the corpus with the sentence type assigned to the sentence cluster into which the sentence is grouped; and generating a specified number of discrete segment clusters by successively merging sentence clusters according to a proximity-based measure between the sentence types assigned to the sentence

Type: Application

Filed: October 31, 2007

Publication date: April 30, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Krishna Kummamuru, Deepak S. Padmanabhan, Shourya Roy, L. Venkata Subramaniam
METHOD FOR AUTOMATICALLY IDENTIFYING SENTENCE BOUNDARIES IN NOISY CONVERSATIONAL DATA

Publication number: 20090063150

Abstract: Sentence boundaries in noisy conversational transcription data are automatically identified. Noise and transcription symbols are removed, and a training set is formed with sentence boundaries marked based on long silences or on manual markings in the transcribed data. Frequencies of head and tail n-grams that occur at the beginning and ending of sentences are determined from the training set. N-grams that occur a significant number of times in the middle of sentences in relation to their occurrences at the beginning or ending of sentences are filtered out. A boundary is marked before every head n-gram and after every tail n-gram occurring in the conversational data and remaining after filtering. Turns are identified. A boundary is marked after each turn, unless the turn ends with an impermissible tail word or is an incomplete turn. The marked boundaries in the conversational data identify sentence boundaries.

Type: Application

Filed: August 27, 2007

Publication date: March 5, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Tetsuya Nasukawa, Diwakar Punjani, Shourya Roy, L. Venkata Subramaniam, Hironori Takeuchi
