Patents by Inventor Saloni Potdar

Saloni Potdar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210049502
    Abstract: A method includes determining, based on an input data sample, a set of probabilities. Each probability of the set of probabilities is associated with a respective label of a set of labels. A particular probability associated with a particular label indicates an estimated likelihood that the input data sample is associated with the particular label. The method includes modifying the set of probabilities based on a set of adjustment factors to generate a modified set of probabilities. The set of adjustment factors is based on a first relative frequency distribution and a second relative frequency distribution. The first relative frequency distribution indicates, for each label of the set of labels, a frequency of occurrence of the label among training data. The second relative frequency distribution indicates, for each label of the set of labels, a frequency of occurrence of the label among post-training data provided to a trained classifier.
    Type: Application
    Filed: August 16, 2019
    Publication date: February 18, 2021
    Inventors: Haoyu Wang, Ming Tan, Dakuo Wang, Chuang Gan, Saloni Potdar
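The adjustment described in this abstract is a form of label-shift (prior) correction. A minimal sketch, assuming the adjustment factor for each label is simply the ratio of its post-training frequency to its training frequency and that the result is renormalized (the abstract does not fix a specific formula):

```python
def adjust_probabilities(probs, train_freq, deploy_freq):
    # Scale each probability by how much more (or less) often its label
    # appears in post-training traffic than it did in the training data.
    adjusted = [p * (d / t) for p, d, t in zip(probs, deploy_freq, train_freq)]
    total = sum(adjusted)
    return [a / total for a in adjusted]  # renormalize to a distribution

# Label 0 is rarer after deployment than in training, label 1 more common,
# so label 1's probability rises and label 0's falls:
print(adjust_probabilities([0.6, 0.4], [0.5, 0.5], [0.2, 0.8]))
```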
  • Publication number: 20210034965
    Abstract: A computer-implemented method includes using an embedding network to generate prototypical vectors. Each prototypical vector is based on a corresponding label associated with a first domain. The computer-implemented method also includes using the embedding network to generate an in-domain test vector based on at least one data sample from a particular label associated with the first domain and using the embedding network to generate an out-of-domain test vector based on at least one other data sample associated with a different domain. The computer-implemented method also includes comparing the prototypical vectors to the in-domain test vector to generate in-domain comparison values and comparing the prototypical vectors to the out-of-domain test vector to generate out-of-domain comparison values. The computer-implemented method also includes modifying, based on the in-domain comparison values and the out-of-domain comparison values, one or more parameters of the embedding network.
    Type: Application
    Filed: August 2, 2019
    Publication date: February 4, 2021
    Inventors: Ming Tan, Dakuo Wang, Mo Yu, Haoyu Wang, Yang Yu, Shiyu Chang, Saloni Potdar
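The training signal in this abstract can be illustrated with a toy prototypical-network setup. This sketch assumes mean-pooled prototypes, cosine similarity as the comparison, and a hinge margin as the quantity whose gradient would update the embedding network; those specifics are illustrative, not taken from the claims:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def prototype(vectors):
    # A prototypical vector is the mean of the embeddings for one label.
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

# Toy embeddings (in practice these come from the embedding network).
proto_a = prototype([[1.0, 0.1], [0.9, 0.0]])

in_domain_vec = [0.95, 0.05]   # sample from label A's domain
out_domain_vec = [0.0, 1.0]    # sample from a different domain

in_score = cosine(proto_a, in_domain_vec)    # in-domain comparison value
out_score = cosine(proto_a, out_domain_vec)  # out-of-domain comparison value

# Hinge-style term pushing in-domain similarity above out-of-domain
# similarity by a margin; minimizing it would modify the network.
margin = 0.5
loss = max(0.0, margin - (in_score - out_score))
print(in_score, out_score, loss)
```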
  • Publication number: 20200364300
    Abstract: A computing device receives a natural language input from a user. The computing device routes the natural language input from an active domain node of multiple domain nodes of a multi-domain context-based hierarchy to a leaf node of the domain nodes by selecting a parent domain node in the hierarchy until an off-topic classifier labels the natural language input as in-domain, and then selecting a subdomain node in the hierarchy until an in-domain classifier labels the natural language input with a classification label, each of the plurality of domain nodes comprising a respective off-topic classifier and a respective in-domain classifier trained for a respective domain node. The computing device outputs the classification label determined by the leaf node.
    Type: Application
    Filed: May 13, 2019
    Publication date: November 19, 2020
    Inventors: Ming Tan, Ladislav Kunc, Yang Yu, Haoyu Wang, Saloni Potdar
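The climb-then-descend routing over the domain hierarchy can be sketched as follows; the `DomainNode` class, the keyword-based classifiers, and the tree shape are illustrative stand-ins for the trained per-node classifiers:

```python
class DomainNode:
    def __init__(self, name, off_topic, in_domain, parent=None, children=None):
        self.name = name
        self.off_topic = off_topic    # returns True if text is in-domain here
        self.in_domain = in_domain    # returns a child name or a final label
        self.parent = parent
        self.children = children or {}

def route(node, text):
    # Climb toward the root until some node's off-topic classifier
    # accepts the input as in-domain ...
    while not node.off_topic(text) and node.parent is not None:
        node = node.parent
    # ... then descend: each in-domain classifier selects a subdomain
    # until a leaf produces the final classification label.
    while node.children:
        node = node.children[node.in_domain(text)]
    return node.in_domain(text)

# Toy two-level hierarchy with trivial keyword "classifiers":
leaf = DomainNode(
    "payments",
    off_topic=lambda t: "pay" in t,
    in_domain=lambda t: "make_payment" if "make" in t else "payment_status")
root = DomainNode("root", off_topic=lambda t: True,
                  in_domain=lambda t: "payments")
root.children = {"payments": leaf}
leaf.parent = root

# Start at the currently active node (the leaf) with an in-domain query:
print(route(leaf, "make a payment"))
```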
  • Publication number: 20200327445
    Abstract: A method of text classification includes generating a text embedding vector representing a text sample and applying weights of a regression layer to the text embedding vector to generate a first data model output vector. The method also includes generating a plurality of prototype embedding vectors, each associated with a respective classification label, and comparing the plurality of prototype embedding vectors to the text embedding vector to generate a second data model output vector.
    Type: Application
    Filed: April 9, 2019
    Publication date: October 15, 2020
    Inventors: Yang Yu, Ming Tan, Ravi Nair, Haoyu Wang, Saloni Potdar
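The two output vectors might be produced and combined as below; the dot-product comparison and the simple averaging of the two score vectors are assumptions for illustration, since the abstract does not specify how the two vectors are used together:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def classify(text_vec, weights, prototypes):
    # First data model output vector: a linear (regression) layer
    # applied to the text embedding vector.
    regression_out = [dot(row, text_vec) for row in weights]
    # Second data model output vector: comparison of the text embedding
    # against each label's prototype embedding vector.
    proto_out = [dot(p, text_vec) for p in prototypes]
    # One simple way to use both: average the two score vectors.
    return [(r + p) / 2 for r, p in zip(regression_out, proto_out)]

scores = classify([1.0, 0.0],
                  weights=[[0.9, 0.1], [0.1, 0.9]],
                  prototypes=[[1.0, 0.0], [0.0, 1.0]])
print(scores)  # highest score indicates the predicted label
```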
  • Publication number: 20200327194
    Abstract: A test controller submits testing phrases to a text classifier and receives, from the text classifier, classification labels each comprising one or more respective heatmap values each associated with a separate word. The test controller aligns each of the classification labels corresponding with a respective testing phrase. The test controller identifies one or more anomalies of a selection of one or more classification labels that are different from an expected classification label for the respective testing phrase. The test controller outputs a graphical representation in a user interface of the selection of one or more classification labels and one or more respective testing phrases with visual indicators based on one or more respective heatmap values.
    Type: Application
    Filed: June 27, 2019
    Publication date: October 15, 2020
    Inventors: Ming Tan, Saloni Potdar, Lakshminarayanan Krishnamurthy
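The anomaly-identification step reduces to comparing each aligned prediction with its expected classification label; a minimal sketch (heatmap values and the graphical output omitted):

```python
def find_anomalies(results, expected):
    # results: [(testing_phrase, predicted_label)] aligned, in order,
    # with the expected classification labels. An anomaly is any phrase
    # whose predicted label differs from the expected label.
    return [(phrase, predicted, exp)
            for (phrase, predicted), exp in zip(results, expected)
            if predicted != exp]

results = [("where is my box", "shipping"), ("pay the bill", "shipping")]
expected = ["shipping", "billing"]
print(find_anomalies(results, expected))
```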
  • Publication number: 20200327193
    Abstract: A test controller submits testing phrases to a text classifier and receives, from the text classifier, classification labels each comprising one or more respective heatmap values each associated with a separate word. The test controller aligns each of the classification labels corresponding with a respective testing phrase. The test controller identifies one or more anomalies of a selection of one or more classification labels that are different from an expected classification label for the respective testing phrase. The test controller outputs a graphical representation in a user interface of the selection of one or more classification labels and one or more respective testing phrases with visual indicators based on one or more respective heatmap values.
    Type: Application
    Filed: April 10, 2019
    Publication date: October 15, 2020
    Inventors: Ming Tan, Saloni Potdar, Lakshminarayanan Krishnamurthy
  • Publication number: 20200327381
    Abstract: In response to running at least one testing phrase on a previously trained text classifier and identifying a separate predicted classification label based on a score calculated for each respective at least one testing phrase, a text classifier decomposes extracted features summed in the score into word-level scores for each word in the at least one testing phrase. The text classifier assigns a separate heatmap value to each of the word-level scores, each respective separate heatmap value reflecting a weight of each word-level score. The text classifier outputs the separate predicted classification label and each separate heatmap value reflecting the weight of each word-level score for defining a heatmap identifying the contribution of each word in the at least one testing phrase to the separate predicted classification label for facilitating client evaluation of text classification anomalies.
    Type: Application
    Filed: April 10, 2019
    Publication date: October 15, 2020
    Inventors: Ming Tan, Saloni Potdar, Lakshminarayanan Krishnamurthy
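For a linear model, the decomposition into word-level scores is exact, since the phrase score is a sum of per-word weights. A sketch under that assumption, with the normalization into heatmap values chosen for illustration:

```python
def word_level_scores(phrase, label_weights):
    # In a linear bag-of-words model the phrase score is the sum of
    # per-word weights, so it decomposes exactly into word-level scores.
    return {w: label_weights.get(w, 0.0) for w in phrase.split()}

def heatmap(word_scores):
    # Heatmap value = each word's share of the total absolute score,
    # reflecting the weight of its contribution to the predicted label.
    total = sum(abs(s) for s in word_scores.values()) or 1.0
    return {w: abs(s) / total for w, s in word_scores.items()}

weights = {"refund": 2.0, "my": 0.1, "order": 0.9}
scores = word_level_scores("refund my order", weights)
print(heatmap(scores))  # "refund" dominates the predicted label
```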
  • Publication number: 20200285702
    Abstract: A computer-implemented method includes obtaining a training data set including text data indicating one or more phrases or sentences. The computer-implemented method includes training a classifier using supervised machine learning based on the training data set and additional text data indicating one or more out-of-domain phrases or sentences. The computer-implemented method includes training an autoencoder using unsupervised machine learning based on the training data set. The computer-implemented method further includes combining the classifier and the autoencoder to generate the out-of-domain sentence detector configured to generate an output indicating a classification of whether input text data corresponds to an out-of-domain sentence. The output is based on a combination of a first output of the classifier and a second output of the autoencoder.
    Type: Application
    Filed: March 6, 2019
    Publication date: September 10, 2020
    Inventors: Inkit Padhi, Ruijian Wang, Haoyu Wang, Saloni Potdar
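One plausible reading of combining the two outputs is a joint vote: the classifier flags text its known labels cannot explain, and the autoencoder flags text it reconstructs poorly. The thresholds and the AND combination below are assumptions, not the patented combination rule:

```python
def combine_ood(classifier_conf, reconstruction_error,
                conf_threshold=0.5, error_threshold=0.4):
    # Classifier output: votes out-of-domain when its best in-domain
    # confidence is low. Autoencoder output: votes out-of-domain when
    # the input reconstructs poorly (high error). The detector's final
    # classification combines both votes.
    classifier_ood = classifier_conf < conf_threshold
    autoencoder_ood = reconstruction_error > error_threshold
    return classifier_ood and autoencoder_ood

print(combine_ood(0.9, 0.1))  # confident + well reconstructed -> in-domain
print(combine_ood(0.2, 0.7))  # unconfident + poorly reconstructed -> OOD
```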
  • Publication number: 20200251100
    Abstract: A method includes providing input text to a plurality of multi-task learning (MTL) models corresponding to a plurality of domains. Each MTL model is trained to generate an embedding vector based on the input text. The method further includes providing the input text to a domain identifier that is trained to generate a weight vector based on the input text. The weight vector indicates a classification weight for each domain of the plurality of domains. The method further includes scaling each embedding vector based on a corresponding classification weight of the weight vector to generate a plurality of scaled embedding vectors, generating a feature vector based on the plurality of scaled embedding vectors, and providing the feature vector to an intent classifier that is trained to generate, based on the feature vector, an intent classification result associated with the input text.
    Type: Application
    Filed: February 1, 2019
    Publication date: August 6, 2020
    Inventors: Ming Tan, Haoyu Wang, Ladislav Kunc, Yang Yu, Saloni Potdar
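The scaling-and-concatenation pipeline can be sketched directly from the abstract; the toy lambdas below stand in for trained MTL models and the trained domain identifier:

```python
def intent_features(input_text, mtl_models, domain_identifier):
    # Each per-domain MTL model generates an embedding vector for the
    # input text; the domain identifier generates a weight vector with
    # one classification weight per domain.
    embeddings = [model(input_text) for model in mtl_models]
    weights = domain_identifier(input_text)
    # Scale each embedding by its domain weight, then concatenate the
    # scaled embeddings into the feature vector handed to the intent
    # classifier.
    feature_vector = []
    for w, emb in zip(weights, embeddings):
        feature_vector.extend(w * x for x in emb)
    return feature_vector

# Toy stand-ins for two trained domain models and a domain identifier:
mtl_models = [lambda t: [1.0, 2.0], lambda t: [3.0, 4.0]]
domain_identifier = lambda t: [0.25, 0.75]
print(intent_features("book a flight", mtl_models, domain_identifier))
```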
  • Publication number: 20200250274
    Abstract: An online version of a sentence representation generation module is updated by training a first sentence representation generation module using first labeled data of a first corpus. After training the first sentence representation generation module using the first labeled data, a second corpus of second labeled data is obtained. The second corpus is distinct from the first corpus. A subset of the first labeled data is identified based on similarities between the first corpus and the second corpus. A second sentence representation generation module is trained using the second labeled data of the second corpus and the subset of the first labeled data.
    Type: Application
    Filed: February 5, 2019
    Publication date: August 6, 2020
    Inventors: Ming Tan, Ladislav Kunc, Yang Yu, Haoyu Wang, Saloni Potdar
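The similarity-based selection of reusable first-corpus examples might look like this sketch, which uses simple vocabulary overlap as the similarity measure (the abstract does not commit to a particular measure):

```python
def select_transfer_subset(first_corpus, second_corpus, min_overlap=0.2):
    # Keep a first-corpus example when its word overlap with the new
    # (second) corpus is high enough; the kept subset is reused when
    # training the second sentence-representation module.
    second_vocab = set()
    for sentence, _label in second_corpus:
        second_vocab.update(sentence.lower().split())
    subset = []
    for sentence, label in first_corpus:
        words = set(sentence.lower().split())
        overlap = len(words & second_vocab) / len(words)
        if overlap >= min_overlap:
            subset.append((sentence, label))
    return subset

first_corpus = [("reset my password", "account"),
                ("weather in paris", "weather")]
second_corpus = [("change my password", "account")]
print(select_transfer_subset(first_corpus, second_corpus))
```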
  • Publication number: 20200250270
    Abstract: A computer-implemented method includes obtaining a training data set including a plurality of training examples. The method includes generating, for each training example, multiple feature vectors corresponding, respectively, to multiple feature types. The method includes applying weighting factors to feature vectors corresponding to a subset of the feature types. The weighting factors are determined based on one or more of: a number of training examples, a number of classes associated with the training data set, an average number of training examples per class, a language of the training data set, a vocabulary size of the training data set, or a commonality of the vocabulary with a public corpus. The method includes concatenating the feature vectors of a particular training example to form an input vector and providing the input vector as training data to a machine-learning intent classification model to train the model to determine intent based on text input.
    Type: Application
    Filed: February 1, 2019
    Publication date: August 6, 2020
    Inventors: Yang Yu, Ladislav Kunc, Haoyu Wang, Ming Tan, Saloni Potdar
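A sketch of the per-feature-type weighting and concatenation; the feature-type names and the single shared weighting factor are illustrative (the abstract derives the factors from data-set statistics such as size, class count, language, and vocabulary):

```python
def build_input_vector(feature_vectors, weighted_types, weight):
    # feature_vectors: {feature_type: vector} for one training example.
    # Vectors whose type is in `weighted_types` are scaled by the
    # weighting factor; all vectors are then concatenated into the
    # input vector fed to the intent classification model.
    input_vector = []
    for ftype, vec in feature_vectors.items():
        w = weight if ftype in weighted_types else 1.0
        input_vector.extend(w * x for x in vec)
    return input_vector

features = {"char_ngrams": [1.0, 0.5], "word_embeddings": [0.2, 0.8]}
print(build_input_vector(features, {"char_ngrams"}, weight=0.3))
```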
  • Publication number: 20200226212
    Abstract: An intelligent computer platform introduces adversarial training to natural language processing (NLP). An initial training set is modified with synthetic training data to create an adversarial training set. The modification includes use of natural language understanding (NLU) to parse the initial training set into components and identify component categories. One or more paraphrase terms are identified with respect to the components and component categories, and function as replacement terms. The synthetic training data is effectively a merging of the initial training set with the replacement terms. As input is presented, a classifier leverages the adversarial training set to identify the intent of the input and to output a classification label to generate accurate and reflective response data.
    Type: Application
    Filed: January 15, 2019
    Publication date: July 16, 2020
    Applicant: International Business Machines Corporation
    Inventors: Ming Tan, Ruijian Wang, Inkit Padhi, Saloni Potdar
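The synthetic-data construction in this abstract (and the companion publication below) can be sketched as paraphrase substitution; the hand-written paraphrase table and the naive `str.replace` substitution are simplifications of the NLU-driven parsing the abstract describes:

```python
def synthesize_adversarial(training_set, paraphrases):
    # For each (sentence, intent) pair, substitute any component that
    # has a known paraphrase (replacement) term to create synthetic
    # variants, then merge them with the initial training set.
    synthetic = []
    for sentence, intent in training_set:
        for word in sentence.split():
            for alt in paraphrases.get(word, []):
                synthetic.append((sentence.replace(word, alt), intent))
    return training_set + synthetic

train = [("cancel my order", "cancel_order")]
paraphrases = {"cancel": ["void", "stop"], "order": ["purchase"]}
print(synthesize_adversarial(train, paraphrases))
```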
  • Publication number: 20200227030
    Abstract: An intelligent computer platform introduces adversarial training to natural language processing (NLP). An initial training set is modified with synthetic training data to create an adversarial training set. The modification includes use of natural language understanding (NLU) to parse the initial training set into components and identify component categories. As input is presented, a classifier evaluates the input and leverages the adversarial training set to identify the intent of the input. An identified classification model generates accurate and reflective response data based on the received input.
    Type: Application
    Filed: January 15, 2019
    Publication date: July 16, 2020
    Applicant: International Business Machines Corporation
    Inventors: Ming Tan, Ruijian Wang, Inkit Padhi, Saloni Potdar
  • Publication number: 20200089773
    Abstract: A method, system and computer program product are provided for implementing dynamic confidence rescaling for modularity in automatic user intent detection systems. User intents are identified using separately trained models with corresponding training data. Natural language processing (NLP) and statistical analysis are applied on the training data to classify the training data into groups and modules. A confidence rescaling algorithm is used for combining the modules. The dynamic confidence rescaling uses statistical information computed about each module being combined to identify user intents with enhanced accuracies in comparison to baseline models without confidence rescaling.
    Type: Application
    Filed: September 14, 2018
    Publication date: March 19, 2020
    Inventors: Yang Yu, Ladislav Kunc, Saloni Potdar
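A common way to realize confidence rescaling across separately trained modules is z-score standardization using each module's own statistics; the per-module mean/standard deviation and the max-selection below are illustrative, not necessarily the patented algorithm:

```python
def rescale_confidences(module_scores, module_stats):
    # Each module's raw confidence is standardized using statistics
    # (mean, standard deviation) computed over that module's own data,
    # making scores from separately trained models comparable.
    rescaled = {}
    for module, score in module_scores.items():
        mean, std = module_stats[module]
        rescaled[module] = (score - mean) / std
    return max(rescaled, key=rescaled.get), rescaled

# "shipping" has the higher raw confidence, but relative to each
# module's own statistics, "billing" is the far stronger detection:
scores = {"billing": 0.70, "shipping": 0.75}
stats = {"billing": (0.40, 0.10), "shipping": (0.70, 0.10)}
print(rescale_confidences(scores, stats))
```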