Patents by Inventor Saloni Potdar

Saloni Potdar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Intent classification using non-correlated features

Patent number: 11966699

Abstract: A system for classifying a language sample intent by receiving a language sample including a set of features, identifying language sample features, determining a tokenization score for the language sample according to the language sample features, eliminating duplicate features according to the tokenization score, determining a term frequency (tf) according to the identified features and the tokenization score, determining an inverse document frequency (idf) according to the identified features and the tokenization score, and generating a term frequency-inverse document frequency (tf-idf) matrix for the identified features.

Type: Grant

Filed: June 17, 2021

Date of Patent: April 23, 2024

Assignee: International Business Machines Corporation

Inventors: Abhishek Shah, Ladislav Kunc, Haode Qi, Lin Pan, Saloni Potdar
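
The tf-idf step named in the abstract of patent 11966699 can be illustrated with a minimal sketch. This is not the patented method: the tokenization score is not modeled, and the lowercase whitespace tokenization, the smoothed idf formula, and the function name `tfidf_matrix` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_matrix(samples):
    """Return (vocabulary, matrix) for a list of text samples.

    Illustrative assumptions: lowercase whitespace tokenization,
    tf = count / sample length, idf = ln((1 + N) / (1 + df)) + 1.
    """
    tokenized = [s.lower().split() for s in samples]
    # Deduplicate features: each distinct token becomes exactly one column.
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n = len(tokenized)
    # Document frequency: number of samples that contain each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1 for tok in vocab}
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero on empty samples
        matrix.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, matrix
```

Called on `["reset my password", "reset my router"]`, the sketch gives the rarer token `password` a larger weight than the shared token `my` in the first row.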

DETECTING OUT-OF-DOMAIN TEXT DATA IN DIALOG SYSTEMS USING ARTIFICIAL INTELLIGENCE

Publication number: 20240070401

Abstract: Methods, systems, and computer program products for detecting out-of-domain text data in dialog systems using artificial intelligence techniques are provided herein. A computer-implemented method includes updating artificial intelligence techniques related to out-of-domain text data detection, the updating based on encoding training data and generating regularized representations of at least a portion of the encoded training data by combining the at least a portion of the encoded training data and at least one intent centroid associated with the updated artificial intelligence techniques; encoding input text data; computing out-of-domain scores, in connection with the at least one dialog system, for at least a portion of the encoded input text data by processing the at least a portion of encoded input data using at least a portion of the one or more updated artificial intelligence techniques; and performing one or more automated actions based on the computed out-of-domain scores.

Type: Application

Filed: August 29, 2022

Publication date: February 29, 2024

Inventors: Cheng Qian, Haode Qi, Saloni Potdar, Ladislav Kunc

OUT OF DOMAIN SENTENCE DETECTION

Publication number: 20240037331

Abstract: A method, a structure, and a computer system for OOD sentence detection in dialogue systems. The exemplary embodiments may include receiving, for a domain corresponding to a particular topic, one or more on-topic text inputs and one or more off-topic text inputs. The exemplary embodiments may further include encoding the one or more on-topic text inputs and the one or more off-topic text inputs into a latent space, as well as decoding the one or more on-topic text inputs and the one or more off-topic text inputs from the latent space. The exemplary embodiments may additionally include minimizing a reconstruction error between the encoded one or more on-topic text inputs and the decoded one or more on-topic text inputs, and maximizing a reconstruction error between the encoded one or more off-topic text inputs and the decoded one or more off-topic text inputs.

Type: Application

Filed: July 28, 2022

Publication date: February 1, 2024

Inventors: Haode Qi, Cheng Qian, Ladislav Kunc, Saloni Potdar, Eric Donald Wayne

Conversational AI with multi-lingual human chatlogs

Patent number: 11853712

Abstract: A method, computer system, and computer program product for multi-lingual chatlog training are provided. The embodiment may include receiving, by a processor, a plurality of data related to conversational data in multiple languages. The embodiment may also include assigning an intent label to each conversational data. The embodiment may further include assigning a language label to each conversational data. The embodiment may also include pairing the plurality of the data related to the conversational data according to the intent label and the language label. The embodiment may further include training a machine learning model using a multi-lingual and multi-intent conversational data pairing. The embodiment may also include training the machine learning model using a single language and multi-intent conversational data pairing.

Type: Grant

Filed: June 7, 2021

Date of Patent: December 26, 2023

Assignee: International Business Machines Corporation

Inventors: Haode Qi, Lin Pan, Abhishek Shah, Ladislav Kunc, Saloni Potdar

Out-of-domain encoder training

Patent number: 11645514

Abstract: A computer-implemented method includes using an embedding network to generate prototypical vectors. Each prototypical vector is based on a corresponding label associated with a first domain. The computer-implemented method also includes using the embedding network to generate an in-domain test vector based on at least one data sample from a particular label associated with the first domain and using the embedding network to generate an out-of-domain test vector based on at least one other data sample associated with a different domain. The computer-implemented method also includes comparing the prototypical vectors to the in-domain test vector to generate in-domain comparison values and comparing the prototypical vectors to the out-of-domain test vector to generate out-of-domain comparison values. The computer-implemented method also includes modifying, based on the in-domain comparison values and the out-of-domain comparison values, one or more parameters of the embedding network.

Type: Grant

Filed: August 2, 2019

Date of Patent: May 9, 2023

Assignee: International Business Machines Corporation

Inventors: Ming Tan, Dakuo Wang, Mo Yu, Haoyu Wang, Yang Yu, Shiyu Chang, Saloni Potdar

Domain specific model compression

Patent number: 11620435

Abstract: Domain specific model compression by providing a weighting parameter for a candidate operation of a neural network, applying the weighting parameter to an output vector of the candidate operation, performing a regularization of the weighting parameter output vector combination, compressing the neural network model according to the results of the regularization, and providing the neural network model after compression.

Type: Grant

Filed: October 10, 2019

Date of Patent: April 4, 2023

Assignee: International Business Machines Corporation

Inventors: Haoyu Wang, Yang Yu, Ming Tan, Saloni Potdar

Evaluating text classification anomalies predicted by a text classification model

Patent number: 11537821

Abstract: In response to running at least one testing phrase on a previously trained text classifier and identifying a separate predicted classification label based on a score calculated for each respective at least one testing phrase, a text classifier decomposes extracted features summed in the score into word-level scores for each word in the at least one testing phrase. The text classifier assigns a separate heatmap value to each of the word-level scores, each respective separate heatmap value reflecting a weight of each word-level score. The text classifier outputs the separate predicted classification label and each separate heatmap value reflecting the weight of each word-level score for defining a heatmap identifying the contribution of each word in the at least one testing phrase to the separate predicted classification label for facilitating client evaluation of text classification anomalies.

Type: Grant

Filed: April 10, 2019

Date of Patent: December 27, 2022

Assignee: International Business Machines Corporation

Inventors: Ming Tan, Saloni Potdar, Lakshminarayanan Krishnamurthy
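
The word-level decomposition described in the abstract of patent 11537821 can be sketched for the simplest case, a linear bag-of-words score. The toy weight table, the scaling by the largest absolute contribution, and the name `word_heatmap` are assumptions for illustration, not details taken from the patent.

```python
def word_heatmap(phrase, label_weights):
    """Split a linear classifier's score into per-word contributions.

    `label_weights` maps a word to its weight for the predicted label
    (a hypothetical toy model; unknown words contribute 0.0).
    Returns a list of (word, word_score, heat_value) tuples.
    """
    words = phrase.lower().split()
    scores = [label_weights.get(w, 0.0) for w in words]
    # Scale by the largest absolute contribution so heat values fall in [-1, 1].
    peak = max((abs(s) for s in scores), default=0.0) or 1.0
    heat = [s / peak for s in scores]
    return list(zip(words, scores, heat))
```

Because the score is a plain sum, the per-word scores add back up to the phrase-level score, which is what lets each heat value be read as a word's share of the prediction.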

INTENT CLASSIFICATION USING NON-CORRELATED FEATURES

Publication number: 20220405472

Abstract: A system for classifying a language sample intent by receiving a language sample including a set of features, identifying language sample features, determining a tokenization score for the language sample according to the language sample features, eliminating duplicate features according to the tokenization score, determining a term frequency (tf) according to the identified features and the tokenization score, determining an inverse document frequency (idf) according to the identified features and the tokenization score, and generating a term frequency-inverse document frequency (tf-idf) matrix for the identified features.

Type: Application

Filed: June 17, 2021

Publication date: December 22, 2022

Inventors: Abhishek Shah, Ladislav Kunc, Haode Qi, Lin Pan, Saloni Potdar

CONVERSATIONAL AI WITH MULTI-LINGUAL HUMAN CHATLOGS

Publication number: 20220391600

Abstract: A method, computer system, and computer program product for multi-lingual chatlog training are provided. The embodiment may include receiving, by a processor, a plurality of data related to conversational data in multiple languages. The embodiment may also include assigning an intent label to each conversational data. The embodiment may further include assigning a language label to each conversational data. The embodiment may also include pairing the plurality of the data related to the conversational data according to the intent label and the language label. The embodiment may further include training a machine learning model using a multi-lingual and multi-intent conversational data pairing. The embodiment may also include training the machine learning model using a single language and multi-intent conversational data pairing.

Type: Application

Filed: June 7, 2021

Publication date: December 8, 2022

Inventors: Haode Qi, Lin Pan, Abhishek Shah, Ladislav Kunc, Saloni Potdar

GENERATING QUESTION ANSWER PAIRS

Publication number: 20220358851

Abstract: In an approach to generating question answer pairs, one or more computer processors receive a corpus of text. One or more computer processors extract one or more key concepts from the corpus of text. Based on the one or more key concepts, one or more computer processors generate one or more questions associated with the key concepts, where the one or more key concepts are answers to the one or more generated questions. One or more computer processors display the one or more generated questions and the answers to the one or more generated questions.

Type: Application

Filed: May 6, 2021

Publication date: November 10, 2022

Inventors: Dakuo Wang, Mo Yu, Chuang Gan, Saloni Potdar

Contextual question answering using human chat logs

Patent number: 11443117

Abstract: A system includes a memory having instructions therein and at least one processor configured to execute the instructions to: receive a natural language question; determine, from a chat log comprising a plurality of chat session logs, a set of chat session logs most relevant to the natural language question; determine a respective plurality of non-overlapping text spans most relevant to the natural language question within each of a respective plurality of conceptual pseudo-documents; determine a conceptual pseudo-document most relevant to the natural language question; extract a question-answer pair most relevant to the natural language question from the most relevant pseudo-document; and convey the most relevant question-answer pair to a user. Each one of the conceptual pseudo-documents corresponds to a respective one of the most relevant chat session logs.

Type: Grant

Filed: November 19, 2019

Date of Patent: September 13, 2022

Assignee: International Business Machines Corporation

Inventors: Yang Yu, Ming Tan, Shasha Lin, Saloni Potdar

Intent classification distribution calibration

Patent number: 11436528

Abstract: A method includes determining, based on an input data sample, a set of probabilities. Each probability of the set of probabilities is associated with a respective label of a set of labels. A particular probability associated with a particular label indicates an estimated likelihood that the input data sample is associated with the particular label. The method includes modifying the set of probabilities based on a set of adjustment factors to generate a modified set of probabilities. The set of adjustment factors is based on a first relative frequency distribution and a second relative frequency distribution. The first relative frequency distribution indicates for each label of the set of labels, a frequency of occurrence of the label among training data. The second relative frequency distribution indicates for each label of the set of labels, a frequency of occurrence of the label among post-training data provided to the trained classifier.

Type: Grant

Filed: August 16, 2019

Date of Patent: September 6, 2022

Assignee: International Business Machines Corporation

Inventors: Haoyu Wang, Ming Tan, Dakuo Wang, Chuang Gan, Saloni Potdar
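
The adjustment described in the abstract of patent 11436528 can be illustrated with a short sketch. The abstract does not state how the adjustment factors are computed, so the ratio of post-training to training frequency used here, the renormalization step, and the name `calibrate` are assumptions.

```python
def calibrate(probs, train_freq, post_freq):
    """Reweight label probabilities by an assumed adjustment factor.

    probs:      label -> probability predicted for one input sample
    train_freq: label -> relative frequency of the label in training data
    post_freq:  label -> relative frequency among post-training data
    Assumes every label has a nonzero training frequency.
    """
    adjusted = {}
    for label, p in probs.items():
        # Assumed adjustment factor: how much more (or less) common the
        # label is after deployment than it was during training.
        factor = post_freq[label] / train_freq[label]
        adjusted[label] = p * factor
    total = sum(adjusted.values())
    # Renormalize so the modified probabilities sum to 1.
    return {label: v / total for label, v in adjusted.items()}
```

For example, a prediction that is evenly split between two labels that were equally frequent in training shifts toward whichever label turns out to be more frequent in post-training traffic.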

Mechanisms for continuous improvement of automated machine learning

Patent number: 11423333

Abstract: Mechanisms are provided for optimizing an automated machine learning (AutoML) operation to configure parameters of a machine learning model. AutoML logic is configured based on an initial default value and initial range for sampling of a parameter of the machine learning (ML) model and an initial AutoML process is executed on the ML model based on a plurality of datasets comprising a plurality of domains of data elements, utilizing the initially configured AutoML logic. For each domain, a cross-dataset default value and cross-dataset value range are derived from results of the execution of the initial AutoML process. For each domain, an entry is stored in a data structure, the entry storing the derived cross-dataset default value and cross-dataset value range for the domain. The AutoML logic performs a subsequent AutoML process on a new dataset based on one or more entries of the data structure.

Type: Grant

Filed: March 25, 2020

Date of Patent: August 23, 2022

Assignee: International Business Machines Corporation

Inventors: Haode Qi, Ming Tan, Ladislav Kunc, Saloni Potdar

Weak supervised abnormal entity detection

Patent number: 11423227

Abstract: A mechanism is provided to implement an abnormal entity detection mechanism that facilitates detecting abnormal entities in real-time response systems through weak supervision. For each first intent from an entity labeled workspace that matches a second intent in labeled chat logs, when the entity score associated with each first entity or second entity is above a predefined significance level the first entity or the second entity is recorded. For each first intent from the entity labeled workspace that matches the second intent in the labeled chat logs: responsive to the first entity being recorded and the second entity failing to be recorded, that first entity is removed from the training data as being mistakenly included; or, responsive to the second entity being recorded and the first entity failing to be recorded, that second entity is added as a potential business case to the training data.

Type: Grant

Filed: February 13, 2020

Date of Patent: August 23, 2022

Assignee: International Business Machines Corporation

Inventors: Haode Qi, Ming Tan, Yang Yu, Navneet N. Rao, Ladislav Kunc, Saloni Potdar

Suggestion of new entity types with discriminative term importance analysis

Patent number: 11379666

Abstract: A mechanism is provided to implement suggestion of new entity types with discriminative importance analysis. The mechanism obtains a list of predefined intents from a chatbot designer. The mechanism receives an input sentence having a target intent within the list of predefined intents. The mechanism performs intent-specific importance analysis on the input sentence to generate an importance score for each token in the input sentence. The mechanism ranks the tokens in the input sentence by importance score and outputs a token with a highest importance score as a candidate entity type.

Type: Grant

Filed: April 8, 2020

Date of Patent: July 5, 2022

Assignee: International Business Machines Corporation

Inventors: Haode Qi, Ming Tan, Yang Yu, Navneet N. Rao, Saloni Potdar, Haoyu Wang
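
The token ranking described in the abstract of patent 11379666 can be sketched with a leave-one-out importance measure. The abstract does not say how importance is computed, so leave-one-out scoring is an assumption here, and `TOY_WEIGHTS`, `toy_intent_score`, and `rank_tokens` are hypothetical names standing in for a real intent classifier.

```python
TOY_WEIGHTS = {"refund": 2.0, "order": 0.5}  # hypothetical per-token weights


def toy_intent_score(tokens):
    """Stand-in for a classifier's score for the target intent."""
    return sum(TOY_WEIGHTS.get(t.lower(), 0.0) for t in tokens)


def rank_tokens(sentence, intent_score):
    """Rank tokens by how much the intent score drops when each is removed."""
    tokens = sentence.split()
    base = intent_score(tokens)
    scored = []
    for i, tok in enumerate(tokens):
        without = tokens[:i] + tokens[i + 1:]
        # Importance = score lost by deleting this one token.
        scored.append((base - intent_score(without), tok))
    # Highest importance first; the sort is stable, so ties keep sentence order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

The first element of the returned list is the token that would be offered as the candidate entity type.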

Hybrid model for short text classification with imbalanced data

Patent number: 11328221

Abstract: A method of text classification includes generating a text embedding vector representing a text sample and applying weights of a regression layer to the text embedding vector to generate a first data model output vector. The method also includes generating a plurality of prototype embedding vectors associated with respective classification labels and comparing the plurality of prototype embedding vectors to the text embedding vector to generate a second data model output vector. The method further includes assigning a particular classification label to the text sample based on the first data model output vector, the second data model output vector, and one or more weighting values.

Type: Grant

Filed: April 9, 2019

Date of Patent: May 10, 2022

Assignee: International Business Machines Corporation

Inventors: Yang Yu, Ming Tan, Ravi Nair, Haoyu Wang, Saloni Potdar
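
The two-branch combination described in the abstract of patent 11328221 can be sketched as follows. The softmax normalization, the squared-distance comparison against prototypes, the single mixing weight `alpha`, and the name `hybrid_predict` are assumptions for illustration; the abstract only says that the two output vectors and one or more weighting values determine the label.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def hybrid_predict(embedding, regression_weights, prototypes, labels, alpha=0.5):
    """Blend a regression-layer output with a prototype-comparison output."""
    # First output vector: one regression-layer row per label.
    first = softmax([dot(row, embedding) for row in regression_weights])
    # Second output vector: closer prototypes get higher scores.
    second = softmax([
        -sum((e - p) ** 2 for e, p in zip(embedding, proto))
        for proto in prototypes
    ])
    # Weighted blend of the two vectors (alpha is the assumed weighting value).
    combined = [alpha * f + (1 - alpha) * s for f, s in zip(first, second)]
    best = max(range(len(labels)), key=combined.__getitem__)
    return labels[best], combined
```

Setting `alpha=1.0` reduces the sketch to the regression branch alone, and `alpha=0.0` to the prototype branch alone, which makes it easy to see how the blend changes a borderline prediction.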

Intent boundary segmentation for multi-intent utterances

Patent number: 11308944

Abstract: A mechanism is provided for implementing an intent segmentation mechanism that segments intent boundaries for multi-intent utterances in a conversational agent. For each term of a set of terms in the utterance from a real-time chat session, a set of adversarial utterances is generated for the utterance. An influence of changing each term is determined so as to identify a term importance value. Utilizing the term importance value, one or more of a change in ranking of the intent of the utterance or a change in confidence with regard to the intent of the utterance is identified. An entropy-based segmentation of the utterance into a plurality of candidate partitions is performed. An associated intent and entropy value are then assigned. Based on a segment with minimum entropy, a call associated with the real-time chat session is directed to an operation associated with an intent of the segment with minimum entropy.

Type: Grant

Filed: March 12, 2020

Date of Patent: April 19, 2022

Assignee: International Business Machines Corporation

Inventors: Ming Tan, Haoyu Wang, Saloni Potdar, Yang Yu, Navneet N. Rao, Haode Qi
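
The entropy-based selection step in the abstract of patent 11308944 can be sketched in a reduced form. This sketch considers only single cut points (two segments per candidate partition) and omits the adversarial-utterance and term-importance steps; the keyword classifier `toy_classify` and the names `KEYWORDS`, `INTENTS`, and `min_entropy_segment` are hypothetical.

```python
import math

# Hypothetical keyword table standing in for a trained intent classifier.
KEYWORDS = {
    "balance": "check_balance",
    "transfer": "transfer_money",
    "money": "transfer_money",
}
INTENTS = ("check_balance", "transfer_money")


def toy_classify(tokens):
    """Return an intent -> probability mapping using add-one keyword counts."""
    weights = {intent: 1.0 for intent in INTENTS}
    for tok in tokens:
        intent = KEYWORDS.get(tok.lower())
        if intent:
            weights[intent] += 1.0
    total = sum(weights.values())
    return {intent: w / total for intent, w in weights.items()}


def entropy(probs):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def min_entropy_segment(tokens, classify):
    """Try every single cut point; return (entropy, intent, segment) of the
    segment whose intent distribution has the lowest entropy."""
    best = None
    for cut in range(1, len(tokens)):
        for segment in (tokens[:cut], tokens[cut:]):
            dist = classify(segment)
            intent = max(dist, key=dist.get)
            h = entropy(dist.values())
            if best is None or h < best[0]:
                best = (h, intent, segment)
    return best
```

A lower entropy means the classifier is more certain about the segment's intent, so the returned segment is the one a router would act on first.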

Artificial intelligence based context dependent spellchecking

Patent number: 11301626

Abstract: Provided is a method, system, and computer program product for context-dependent spellchecking. The method comprises receiving context data to be used in spell checking. The method further comprises receiving a user input. The method further comprises identifying an out-of-vocabulary (OOV) word in the user input. An initial suggestion pool of candidate words is identified based, at least in part, on the context data. The method then comprises using a noisy channel approach to evaluate a probability that one or more of the candidate words of the initial suggestion pool is an intended word and should be used as a candidate for replacement of the OOV word. The method further comprises selecting one or more candidate words for replacement of the OOV word. The method further comprises outputting the one or more candidates.

Type: Grant

Filed: November 11, 2019

Date of Patent: April 12, 2022

Assignee: International Business Machines Corporation

Inventors: Panos Karagiannis, Ladislav Kunc, Saloni Potdar, Haoyu Wang, Navneet N. Rao
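
The noisy channel scoring mentioned in the abstract of patent 11301626 can be sketched as prior times channel probability. The context-derived word counts, the channel model `error_rate ** edit_distance`, and the names `edit_distance` and `suggest` are assumptions for illustration; the abstract does not specify either model.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def suggest(word, context_counts, error_rate=0.1, top_k=3):
    """Rank candidate replacements for an out-of-vocabulary word.

    context_counts: candidate word -> count in the context data
    (this plays the role of the initial suggestion pool).
    """
    total = sum(context_counts.values())
    scored = []
    for cand, count in context_counts.items():
        prior = count / total  # P(candidate), estimated from context data
        # Assumed channel model: each edit costs a factor of error_rate.
        channel = error_rate ** edit_distance(word, cand)
        scored.append((prior * channel, cand))
    scored.sort(reverse=True)
    return [cand for _, cand in scored[:top_k]]
```

With context counts such as `{"password": 4, "passport": 4, "pass": 2}`, the misspelling `passwrd` ranks `password` first because it is a single edit away.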

Unintended bias detection in conversational agent platforms with machine learning model

Patent number: 11270080

Abstract: A mechanism is provided for implementing a bias detection mechanism that mitigates unintended bias in a conversational agent by leveraging conversational agent definitions, conversational agent chat logs, and user satisfaction statistics. One or more protected attributes are identified within an utterance from the conversational agent chat logs. Using the identified protected attributes, a replacement utterance with a replacement term is generated for at least one of the identified protected attributes in the utterance. A score is generated for the utterance and the replacement utterance using utterance level relative term importance for protected attributes and regular terms in the utterance and the replacement utterance. Utilizing the scoring, a determination is made as to whether unintended bias exists within the utterance. Responsive to unintended bias being detected, an action is implemented that causes a change to a machine learning model used by the conversational agent.

Type: Grant

Filed: January 15, 2020

Date of Patent: March 8, 2022

Assignee: International Business Machines Corporation

Inventors: Navneet N. Rao, Ming Tan, Haode Qi, Yang Yu, Panos Karagiannis, Saloni Potdar

Routing text classifications within a cross-domain conversational service

Patent number: 11270077

Abstract: A computing device receives a natural language input from a user. The computing device routes the natural language input from an active domain node of multiple domain nodes of a multi-domain context-based hierarchy to a leaf node of the domain nodes by selecting a parent domain node in the hierarchy until an off-topic classifier labels the natural language input as in-domain and then selecting a subdomain node in the hierarchy until an in-domain classifier labels the natural language input with a classification label, each of the plurality of domain nodes comprising a respective off-topic classifier and a respective in-domain classifier trained for a respective domain node. The computing device outputs the classification label determined by the leaf node.

Type: Grant

Filed: May 13, 2019

Date of Patent: March 8, 2022

Assignee: International Business Machines Corporation

Inventors: Ming Tan, Ladislav Kunc, Yang Yu, Haoyu Wang, Saloni Potdar
