Patents by Inventor Vishal Vishnoi

Vishal Vishnoi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240013780
    Abstract: Techniques for data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes generating a list of values to cover for an entity, selecting utterances from a set of data that have context for the entity, converting the utterances into templates, where each template of the templates comprises a slot that maps to the list of values for the entity, selecting a template from the templates, selecting a value from the list of values based on the mapping between the slot within the selected template and the list of values for the entity, and creating an artificial utterance based on the selected template and the selected value, where the creating the artificial utterance comprises inserting the selected value into the slot of the selected template that maps to the list of values for the entity.
    Type: Application
    Filed: September 21, 2023
    Publication date: January 11, 2024
    Applicant: Oracle International Corporation
    Inventors: Srinivasa Phani Kumar Gadde, Yuanxu Wu, Aashna Devang Kanuga, Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson
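    Illustrative sketch (not part of the filing): a minimal Python rendering of the template-and-slot augmentation described in publication 20240013780 above. The entity, slot name, value list, seed utterances, and helper names (make_templates, augment) are all invented for the example; the actual implementation is not public.

      # Minimal sketch: convert utterances to templates with a slot, then fill
      # the slot with values from the entity's value list (all data invented).
      import random

      def make_templates(utterances, entity_values, slot="<PIZZA_SIZE>"):
          """Replace a known entity value in each utterance with a slot."""
          templates = []
          for utt in utterances:
              for value in entity_values:
                  if value in utt:
                      templates.append(utt.replace(value, slot))
                      break
          return templates

      def augment(templates, entity_values, slot="<PIZZA_SIZE>", n=5, seed=0):
          """Create artificial utterances by inserting selected values into the slot."""
          rng = random.Random(seed)
          artificial = []
          for _ in range(n):
              template = rng.choice(templates)      # select a template
              value = rng.choice(entity_values)     # select a value for the slot
              artificial.append(template.replace(slot, value))
          return artificial

      if __name__ == "__main__":
          values = ["small", "medium", "large"]     # list of values to cover
          seed_utterances = ["I want a large pepperoni pizza",
                             "Please order a small veggie pizza"]
          print(augment(make_templates(seed_utterances, values), values))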
  • Patent number: 11868727
    Abstract: Techniques are provided for using context tags in named-entity recognition (NER) models. In one particular aspect, a method is provided that includes receiving an utterance, generating embeddings for words of the utterance, generating a regular expression and gazetteer feature vector for the utterance, generating a context tag distribution feature vector for the utterance, concatenating or interpolating the embeddings with the regular expression and gazetteer feature vector and the context tag distribution feature vector to generate a set of feature vectors, generating an encoded form of the utterance based on the set of feature vectors, generating log-probabilities based on the encoded form of the utterance, and identifying one or more constraints for the utterance.
    Type: Grant
    Filed: January 19, 2022
    Date of Patent: January 9, 2024
    Assignee: Oracle International Corporation
    Inventors: Duy Vu, Tuyen Quang Pham, Cong Duy Vu Hoang, Srinivasa Phani Kumar Gadde, Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi
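    Illustrative sketch (not part of the filing): a toy construction of the concatenated feature vectors described in patent 11868727 above, using numpy. The gazetteer, regular expression, tag set, and dimensions are invented, and the random "embeddings" and uniform context-tag distribution stand in for learned components.

      # Toy sketch: concatenate word embeddings with a regex/gazetteer feature
      # vector and a context-tag distribution vector (all components invented).
      import re
      import numpy as np

      GAZETTEER = {"sydney", "melbourne"}              # toy gazetteer of city names
      DATE_RE = re.compile(r"\d{1,2}/\d{1,2}/\d{4}")   # toy regex for dates
      CONTEXT_TAGS = ["O", "B-CITY", "B-DATE"]         # toy context-tag inventory

      def features_for(word, rng):
          embedding = rng.standard_normal(8)                        # stand-in word embedding
          regex_gaz = np.array([float(bool(DATE_RE.match(word))),   # regex feature
                                float(word.lower() in GAZETTEER)])  # gazetteer feature
          tag_dist = np.full(len(CONTEXT_TAGS), 1.0 / len(CONTEXT_TAGS))  # context-tag distribution
          return np.concatenate([embedding, regex_gaz, tag_dist])

      if __name__ == "__main__":
          rng = np.random.default_rng(0)
          utterance = "book a flight to Sydney on 12/05/2024".split()
          feature_vectors = np.stack([features_for(w, rng) for w in utterance])
          print(feature_vectors.shape)   # (num_words, 8 + 2 + 3), ready for the encoder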
  • Publication number: 20230419040
    Abstract: Novel techniques are described for data augmentation using a two-stage entity-aware augmentation to improve model robustness to entity value changes for intent prediction.
    Type: Application
    Filed: February 1, 2023
    Publication date: December 28, 2023
    Applicant: Oracle International Corporation
    Inventors: Ahmed Ataallah Ataallah Abobakr, Shivashankar Subramanian, Ying Xu, Vladislav Blinov, Umanga Bista, Tuyen Quang Pham, Thanh Long Duong, Mark Edward Johnson, Elias Luqman Jalaluddin, Vanshika Sridharan, Xin Xu, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
  • Publication number: 20230419052
    Abstract: Novel techniques are described for positive entity-aware augmentation using a two-stage augmentation to improve the stability of the model to entity value changes for intent prediction. In one particular aspect, a method is provided that includes accessing a first set of training data for an intent prediction model, the first set of training data comprising utterances and intent labels; applying one or more positive data augmentation techniques to the first set of training data, depending on the tuning requirements for hyper-parameters, to result in a second set of training data, where the positive data augmentation techniques comprise an Entity-Aware (“EA”) technique and a two-stage augmentation technique; combining the first set of training data and the second set of training data to generate expanded training data; and training the intent prediction model using the expanded training data.
    Type: Application
    Filed: February 1, 2023
    Publication date: December 28, 2023
    Applicant: Oracle International Corporation
    Inventors: Ahmed Ataallah Ataallah Abobakr, Shivashankar Subramanian, Ying Xu, Vladislav Blinov, Umanga Bista, Tuyen Quang Pham, Thanh Long Duong, Mark Edward Johnson, Elias Luqman Jalaluddin, Vanshika Sridharan, Xin Xu, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
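    Illustrative sketch (not part of the filing): one plausible reading of the positive Entity-Aware ("EA") stage from publication 20230419052 above, in Python. The entity list, training examples, and the function name entity_aware_augment are invented, and the second augmentation stage is only noted in a comment.

      # Sketch of positive EA augmentation: swap entity values for other values
      # of the same entity while keeping the intent label (all data invented).
      import random

      ENTITY_VALUES = {"CITY": ["Sydney", "Melbourne", "Brisbane"]}

      def entity_aware_augment(examples, n_per_example=2, seed=0):
          """Stage 1: swap entity values; a second stage could further recombine
          the augmented utterances with new contexts."""
          rng = random.Random(seed)
          augmented = []
          for text, intent, entity, value in examples:
              alternatives = [v for v in ENTITY_VALUES[entity] if v != value]
              for new_value in rng.sample(alternatives,
                                          min(n_per_example, len(alternatives))):
                  augmented.append((text.replace(value, new_value), intent,
                                    entity, new_value))
          return augmented

      if __name__ == "__main__":
          train = [("book a flight to Sydney", "BookFlight", "CITY", "Sydney")]
          expanded = train + entity_aware_augment(train)   # combine original and augmented data
          for row in expanded:
              print(row)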
  • Publication number: 20230419127
    Abstract: Novel techniques are described for negative entity-aware augmentation using a two-stage augmentation to improve the stability of the model to entity value changes for intent prediction. In some embodiments, a method comprises accessing a first set of training data for an intent prediction model, the first set of training data comprising utterances and intent labels; applying one or more negative entity-aware data augmentation techniques to the first set of training data, depending on the tuning requirements for hyper-parameters, to result in a second set of training data, where the one or more negative entity-aware data augmentation techniques comprise a Keyword Augmentation Technique (“KAT”) plus entity without context technique and a KAT plus entity in random context as out-of-domain (“OOD”) technique; combining the first set of training data and the second set of training data to generate expanded training data; and training the intent prediction model using the expanded training data.
    Type: Application
    Filed: February 1, 2023
    Publication date: December 28, 2023
    Applicant: Oracle International Corporation
    Inventors: Ahmed Ataallah Ataallah Abobakr, Shivashankar Subramanian, Ying Xu, Vladislav Blinov, Umanga Bista, Tuyen Quang Pham, Thanh Long Duong, Mark Edward Johnson, Elias Luqman Jalaluddin, Vanshika Sridharan, Xin Xu, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
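    Illustrative sketch (not part of the filing): one plausible reading of the negative augmentations named in publication 20230419127 above ("entity without context" and "entity in random context as OOD"). The entity values, contexts, and function name are invented; the KAT procedure itself is not reproduced.

      # Sketch of negative entity-aware augmentation: emit bare entity values and
      # entities dropped into unrelated contexts, labeled out-of-domain ("OOD").
      import random

      def negative_entity_augment(entity_values, random_contexts, n=1, seed=0):
          rng = random.Random(seed)
          ood_examples = []
          for value in entity_values:
              ood_examples.append((value, "OOD"))              # entity without context
              for _ in range(n):                               # entity in random context
                  context = rng.choice(random_contexts)
                  ood_examples.append((context.replace("<SLOT>", value), "OOD"))
          return ood_examples

      if __name__ == "__main__":
          values = ["Sydney", "Melbourne"]
          contexts = ["the weather report mentioned <SLOT> twice",
                      "<SLOT> appeared in the quarterly results"]
          for example in negative_entity_augment(values, contexts):
              print(example)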
  • Publication number: 20230376700
    Abstract: Techniques are provided for generating training data to facilitate fine-tuning embedding models. Training data including anchor utterances is obtained. Positive utterances and negative utterances are generated from the anchor utterances. Tuples including the anchor utterances, the positive utterances, and the negative utterances are formed. Embeddings for the tuples are generated and a pre-trained embedding model is fine-tuned based on the embeddings. The fine-tuned model can be deployed to a system.
    Type: Application
    Filed: May 9, 2023
    Publication date: November 23, 2023
    Applicant: Oracle International Corporation
    Inventors: Umanga Bista, Vladislav Blinov, Mark Edward Johnson, Ahmed Ataallah Ataallah Abobakr, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Elias Luqman Jalaluddin, Xin Xu, Shivashankar Subramanian
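    Illustrative sketch (not part of the filing): a minimal PyTorch triplet fine-tuning loop in the spirit of publication 20230376700 above. The tiny encoder, hashing featurizer, margin, and the single hard-coded (anchor, positive, negative) tuple are all stand-ins; the filing's embedding model and utterance-generation steps are not reproduced.

      # Sketch: embed (anchor, positive, negative) tuples and fine-tune an
      # encoder with a triplet objective (model and data are placeholders).
      import torch
      import torch.nn as nn

      def featurize(text, dim=64):
          """Hash words into a bag-of-words vector (stand-in for real tokenization)."""
          v = torch.zeros(dim)
          for w in text.lower().split():
              v[hash(w) % dim] += 1.0
          return v

      encoder = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 16))
      loss_fn = nn.TripletMarginLoss(margin=0.5)
      optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)

      # One tuple; in the described pipeline, positives and negatives are
      # generated from the anchor utterances.
      tuples = [("reset my password", "i forgot my password", "order a pizza")]

      for anchor, positive, negative in tuples:
          a, p, n = (encoder(featurize(t)) for t in (anchor, positive, negative))
          loss = loss_fn(a.unsqueeze(0), p.unsqueeze(0), n.unsqueeze(0))
          optimizer.zero_grad()
          loss.backward()
          optimizer.step()
          print(f"triplet loss: {loss.item():.4f}")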
  • Publication number: 20230376696
    Abstract: The present disclosure relates to techniques for identifying out-of-domain utterances.
    Type: Application
    Filed: August 2, 2023
    Publication date: November 23, 2023
    Applicant: Oracle International Corporation
    Inventors: Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi, Crystal C. Pan, Vladislav Blinov, Cong Duy Vu Hoang, Elias Luqman Jalaluddin, Duy Vu, Balakota Srinivas Vinnakota
  • Patent number: 11804219
    Abstract: Techniques for data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes generating a list of values to cover for an entity, selecting utterances from a set of data that have context for the entity, converting the utterances into templates, where each template of the templates comprises a slot that maps to the list of values for the entity, selecting a template from the templates, selecting a value from the list of values based on the mapping between the slot within the selected template and the list of values for the entity, and creating an artificial utterance based on the selected template and the selected value, where the creating the artificial utterance comprises inserting the selected value into the slot of the selected template that maps to the list of values for the entity.
    Type: Grant
    Filed: June 11, 2021
    Date of Patent: October 31, 2023
    Assignee: Oracle International Corporation
    Inventors: Srinivasa Phani Kumar Gadde, Yuanxu Wu, Aashna Devang Kanuga, Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson
  • Patent number: 11763092
    Abstract: The present disclosure relates to techniques for identifying out-of-domain utterances.
    Type: Grant
    Filed: March 30, 2021
    Date of Patent: September 19, 2023
    Assignee: Oracle International Corporation
    Inventors: Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi, Crystal C. Pan, Vladislav Blinov, Cong Duy Vu Hoang, Elias Luqman Jalaluddin, Duy Vu, Balakota Srinivas Vinnakota
  • Publication number: 20230252975
    Abstract: Techniques are described for invoking and switching between chatbots of a chatbot system. In some embodiments, the chatbot system is capable of routing an utterance received while a user is already interacting with a first chatbot in the chatbot system. For instance, the chatbot system may identify a second chatbot based on determining that (i) such an utterance is an invalid input to the first chatbot or (ii) the first chatbot is attempting to route the utterance to a destination associated with the first chatbot. Identifying the second chatbot can involve computing, using a predictive model, separate confidence scores for the first chatbot and the second chatbot, and then determining that a confidence score for the second chatbot satisfies one or more confidence score thresholds. The utterance is then routed to the second chatbot based on the identifying of the second chatbot.
    Type: Application
    Filed: April 19, 2023
    Publication date: August 10, 2023
    Applicant: Oracle International Corporation
    Inventors: Vishal Vishnoi, Xin Xu, Srinivasa Phani Kumar Gadde, Fen Wang, Muruganantham Chinnananchi, Manish Parekh, Stephen Andrew McRitchie, Jae Min John, Crystal C. Pan, Gautam Singaraju, Saba Amsalu Teserra
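    Illustrative sketch (not part of the filing): the confidence-threshold routing decision from publication 20230252975 above, reduced to a few lines of Python. The threshold values, the score_chatbots stand-in for the predictive model, and the chatbot names are invented.

      # Sketch: score candidate chatbots and switch only when the best score
      # clears the configured confidence thresholds (scores are hard-coded).
      ROUTING_THRESHOLD = 0.6        # minimum confidence to switch chatbots
      WIN_MARGIN = 0.1               # margin the winner must have over the runner-up

      def score_chatbots(utterance):
          """Stand-in for the predictive model's per-chatbot confidence scores."""
          return {"pizza_bot": 0.15, "banking_bot": 0.82}

      def route(utterance, current_bot):
          ranked = sorted(score_chatbots(utterance).items(),
                          key=lambda kv: kv[1], reverse=True)
          (best_bot, best), (_, second) = ranked[0], ranked[1]
          if (best_bot != current_bot and best >= ROUTING_THRESHOLD
                  and best - second >= WIN_MARGIN):
              return best_bot        # thresholds satisfied: route to the second chatbot
          return current_bot         # otherwise keep the current chatbot

      if __name__ == "__main__":
          print(route("what is my account balance?", current_bot="pizza_bot"))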
  • Publication number: 20230186025
    Abstract: Techniques for preprocessing data assets to be used in a natural language to logical form model based on scalable search and content-based schema linking. In one particular aspect, a method includes accessing an utterance, classifying named entities within the utterance into predefined classes, searching value lists within the database schema using tokens from the utterance to identify and output value matches including: (i) any value within the value lists that matches a token from the utterance and (ii) any attribute associated with a matching value, generating a data structure by organizing and storing: (i) each of the named entities and an assigned class for each of the named entities, (ii) each of the value matches and the token matching each of the value matches, and (iii) the utterance, in a predefined format for the data structure, and outputting the data structure.
    Type: Application
    Filed: December 13, 2022
    Publication date: June 15, 2023
    Applicant: Oracle International Corporation
    Inventors: Jae Min John, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota, Shivashankar Subramanian, Cong Duy Vu Hoang, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Nitika Mathur, Aashna Devang Kanuga, Philip Arthur, Gioacchino Tangari, Steve Wai-Chun Siu
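    Illustrative sketch (not part of the filing): the value-list search and output data structure from publication 20230186025 above, shrunk to plain Python. The schema, value lists, entity classes, and the classify_named_entities stub are invented; the scalable search described in the filing is replaced by a direct token comparison.

      # Sketch: match utterance tokens against schema value lists, classify named
      # entities, and pack everything into one data structure (schema invented).
      SCHEMA_VALUE_LISTS = {                     # attribute -> known values
          "country": ["australia", "india", "france"],
          "department": ["sales", "finance"],
      }

      def classify_named_entities(tokens):
          """Stand-in NER step mapping tokens to predefined classes."""
          classes = {"2023": "YEAR"}
          return {t: classes[t] for t in tokens if t in classes}

      def link_schema(utterance):
          tokens = utterance.lower().split()
          value_matches = [
              {"attribute": attr, "value": value, "token": tok}
              for tok in tokens
              for attr, values in SCHEMA_VALUE_LISTS.items()
              for value in values
              if tok == value                    # token matches a value in a value list
          ]
          return {
              "utterance": utterance,
              "named_entities": classify_named_entities(tokens),
              "value_matches": value_matches,
          }

      if __name__ == "__main__":
          print(link_schema("total sales in Australia in 2023"))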
  • Publication number: 20230186161
    Abstract: Techniques are disclosed herein for synthesizing synthetic training data to facilitate training a natural language to logical form model. In one aspect, training data can be synthesized from original training data under a framework based on templates and a synchronous context-free grammar. In one aspect, training data can be synthesized under a framework based on a probabilistic context-free grammar and a translator. In one aspect, training data can be synthesized under a framework based on tree-to-string translation. In one aspect, the synthetic training data can be combined with original training data in order to train a machine learning model to translate an utterance to a logical form.
    Type: Application
    Filed: December 13, 2022
    Publication date: June 15, 2023
    Applicant: Oracle International Corporation
    Inventors: Philip Arthur, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota, Cong Duy Vu Hoang, Steve Wai-Chun Siu, Nitika Mathur, Gioacchino Tangari, Aashna Devang Kanuga
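    Illustrative sketch (not part of the filing): a toy synchronous-template generator in the spirit of the first framework named in publication 20230186161 above. The grammar, the template, and the pseudo-SQL logical forms are invented; the probabilistic-grammar and tree-to-string frameworks are not shown.

      # Sketch: expand natural-language and logical-form sides of a template in
      # lockstep to synthesize paired training examples (grammar invented).
      import itertools

      # Non-terminal -> (utterance fragment, logical-form fragment) pairs.
      SYNC_RULES = {
          "<METRIC>": [("revenue", "sum(revenue)"), ("head count", "count(employee)")],
          "<DEPT>": [("sales", "dept = 'sales'"), ("finance", "dept = 'finance'")],
      }
      TEMPLATE = ("show me the <METRIC> for the <DEPT> team",
                  "SELECT <METRIC> WHERE <DEPT>")

      def synthesize():
          examples = []
          for metric, dept in itertools.product(SYNC_RULES["<METRIC>"],
                                                SYNC_RULES["<DEPT>"]):
              utt, lf = TEMPLATE
              for slot, (nl, logical) in (("<METRIC>", metric), ("<DEPT>", dept)):
                  utt, lf = utt.replace(slot, nl), lf.replace(slot, logical)
              examples.append((utt, lf))
          return examples

      if __name__ == "__main__":
          for pair in synthesize():   # synthetic pairs to mix with original training data
              print(pair)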
  • Publication number: 20230186026
    Abstract: Techniques are disclosed herein for synthesizing synthetic training data to facilitate training a natural language to logical form model. In one aspect, training data can be synthesized from original training data under a framework based on templates and a synchronous context-free grammar. In one aspect, training data can be synthesized under a framework based on a probabilistic context-free grammar and a translator. In one aspect, training data can be synthesized under a framework based on tree-to-string translation. In one aspect, the synthetic training data can be combined with original training data in order to train a machine learning model to translate an utterance to a logical form.
    Type: Application
    Filed: December 13, 2022
    Publication date: June 15, 2023
    Applicant: Oracle International Corporation
    Inventors: Philip Arthur, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota, Cong Duy Vu Hoang, Steve Wai-Chun Siu, Nitika Mathur, Gioacchino Tangari, Aashna Devang Kanuga
  • Publication number: 20230185834
    Abstract: Techniques are disclosed herein for synthesizing synthetic training data to facilitate training a natural language to logical form model. In one aspect, training data can be synthesized from original training data under a framework based on templates and a synchronous context-free grammar. In one aspect, training data can be synthesized under a framework based on a probabilistic context-free grammar and a translator. In one aspect, training data can be synthesized under a framework based on tree-to-string translation. In one aspect, the synthetic training data can be combined with original training data in order to train a machine learning model to translate an utterance to a logical form.
    Type: Application
    Filed: December 13, 2022
    Publication date: June 15, 2023
    Applicant: Oracle International Corporation
    Inventors: Philip Arthur, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota, Cong Duy Vu Hoang, Steve Wai-Chun Siu, Nitika Mathur, Gioacchino Tangari, Aashna Devang Kanuga
  • Publication number: 20230185799
    Abstract: Techniques are disclosed for training a model, using multi-task learning, to transform natural language to a logical form. In one particular aspect, a method includes accessing a first set of utterances that have non-follow-up utterances and a second set of utterances that have initial utterances and associated one or more follow-up utterances and training a model for translating an utterance to a logical form. The training is a joint training process that includes calculating a first loss for a first semantic parsing task based on one or more non-follow-up utterances from the first set of utterances, calculating a second loss for a second semantic parsing task based on one or more initial utterances and associated one or more follow-up utterances from the second set of utterances, combining the first and second losses to obtain a final loss, and updating model parameters of the model based on the final loss.
    Type: Application
    Filed: December 13, 2022
    Publication date: June 15, 2023
    Applicant: Oracle International Corporation
    Inventors: Cong Duy Vu Hoang, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Balakota Srinivas Vinnakota
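    Illustrative sketch (not part of the filing): the joint-loss computation from publication 20230185799 above in a few lines of PyTorch. The linear "parser", random batches, label space, and the 0.5/0.5 weighting are placeholders; only the pattern of combining two task losses into one update is shown.

      # Sketch: compute a loss per semantic parsing task, combine them, and
      # update the shared model (model and batches are stand-ins).
      import torch
      import torch.nn as nn

      model = nn.Linear(16, 8)                      # stand-in for the shared parser
      ce = nn.CrossEntropyLoss()
      optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

      # Fake featurized batches: task 1 = non-follow-up utterances,
      # task 2 = initial utterances with their follow-ups.
      x1, y1 = torch.randn(4, 16), torch.randint(0, 8, (4,))
      x2, y2 = torch.randn(4, 16), torch.randint(0, 8, (4,))

      loss_task1 = ce(model(x1), y1)                      # first semantic parsing task
      loss_task2 = ce(model(x2), y2)                      # second (follow-up) parsing task
      final_loss = 0.5 * loss_task1 + 0.5 * loss_task2    # combine into the final loss

      optimizer.zero_grad()
      final_loss.backward()
      optimizer.step()
      print(f"joint loss: {final_loss.item():.4f}")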
  • Publication number: 20230169955
    Abstract: Techniques for noise data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training an intent classifier to identify one or more intents for one or more utterances; augmenting the training set of utterances with noise text to generate an augmented training set of utterances; and training the intent classifier using the augmented training set of utterances. The augmenting includes: obtaining the noise text from a list of words, a text corpus, a publication, a dictionary, or any combination thereof that is unrelated to the original text within the utterances of the training set of utterances, and incorporating the noise text within the utterances relative to the original text in the utterances of the training set of utterances at a predefined augmentation ratio to generate augmented utterances.
    Type: Application
    Filed: November 23, 2022
    Publication date: June 1, 2023
    Applicant: Oracle International Corporation
    Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Yu-Heng Hong, Balakota Srinivas Vinnakota
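    Illustrative sketch (not part of the filing): the noise-insertion step from publication 20230169955 above. The noise word list, the 0.3 augmentation ratio, and the training example are invented; any unrelated word list, corpus, or dictionary could supply the noise text.

      # Sketch: insert unrelated noise words around the original utterance text
      # at a fixed augmentation ratio (word list and ratio invented).
      import random

      NOISE_WORDS = ["umbrella", "quarterly", "volcano", "saxophone"]  # unrelated source
      AUGMENTATION_RATIO = 0.3      # noise words inserted per original word

      def add_noise(utterance, rng):
          words = utterance.split()
          n_noise = max(1, int(len(words) * AUGMENTATION_RATIO))
          for _ in range(n_noise):
              position = rng.randint(0, len(words))      # position relative to original text
              words.insert(position, rng.choice(NOISE_WORDS))
          return " ".join(words)

      if __name__ == "__main__":
          rng = random.Random(0)
          training_set = [("reset my password", "ResetPassword")]
          augmented = [(add_noise(u, rng), intent) for u, intent in training_set]
          print(training_set + augmented)   # the intent classifier trains on both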
  • Patent number: 11657797
    Abstract: Techniques are described for invoking and switching between chatbots of a chatbot system. In some embodiments, the chatbot system is capable of routing an utterance received while a user is already interacting with a first chatbot in the chatbot system. For instance, the chatbot system may identify a second chatbot based on determining that (i) such an utterance is an invalid input to the first chatbot or (ii) the first chatbot is attempting to route the utterance to a destination associated with the first chatbot. Identifying the second chatbot can involve computing, using a predictive model, separate confidence scores for the first chatbot and the second chatbot, and then determining that a confidence score for the second chatbot satisfies one or more confidence score thresholds. The utterance is then routed to the second chatbot based on the identifying of the second chatbot.
    Type: Grant
    Filed: April 23, 2020
    Date of Patent: May 23, 2023
    Assignee: Oracle International Corporation
    Inventors: Vishal Vishnoi, Xin Xu, Srinivasa Phani Kumar Gadde, Fen Wang, Muruganantham Chinnananchi, Manish Parekh, Stephen Andrew McRitchie, Jae Min John, Crystal C. Pan, Gautam Singaraju, Saba Amsalu Teserra
  • Patent number: 11651768
    Abstract: Techniques for stop word data augmentation for training chatbot systems in natural language processing. In one particular aspect, a computer-implemented method includes receiving a training set of utterances for training an intent classifier to identify one or more intents for one or more utterances; augmenting the training set of utterances with stop words to generate an augmented training set of out-of-domain utterances for an unresolved intent category corresponding to an unresolved intent; and training the intent classifier using the training set of utterances and the augmented training set of out-of-domain utterances. The augmenting includes: selecting one or more utterances from the training set of utterances, and for each selected utterance, preserving existing stop words within the utterance and replacing at least one non-stop word within the utterance with a stop word or stop word phrase selected from a list of stop words to generate an out-of-domain utterance.
    Type: Grant
    Filed: September 9, 2020
    Date of Patent: May 16, 2023
    Assignee: Oracle International Corporation
    Inventors: Vishal Vishnoi, Mark Edward Johnson, Elias Luqman Jalaluddin, Balakota Srinivas Vinnakota, Thanh Long Duong, Gautam Singaraju
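    Illustrative sketch (not part of the filing): the stop-word replacement from patent 11651768 above. The stop-word list is a small invented one, replace_prob is an added knob, and the literal label string "unresolvedIntent" is illustrative for the unresolved intent category.

      # Sketch: preserve stop words and replace non-stop words with stop words to
      # produce out-of-domain utterances (stop-word list and label invented).
      import random

      STOP_WORDS = {"the", "a", "to", "of", "is", "it", "and", "that", "on", "in"}

      def to_out_of_domain(utterance, rng, replace_prob=1.0):
          out = []
          for word in utterance.lower().split():
              if word in STOP_WORDS:
                  out.append(word)                               # preserve existing stop words
              elif rng.random() < replace_prob:
                  out.append(rng.choice(sorted(STOP_WORDS)))     # replace a non-stop word
              else:
                  out.append(word)
          return " ".join(out)

      if __name__ == "__main__":
          rng = random.Random(0)
          training_utterances = ["i want to check the balance of my account"]
          ood = [(to_out_of_domain(u, rng), "unresolvedIntent") for u in training_utterances]
          print(ood)   # train the classifier on the original set plus these OOD utterances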
  • Publication number: 20230141853
    Abstract: Techniques disclosed herein relate generally to language detection. In one particular aspect, a method is provided that includes obtaining a sequence of n-grams of a textual unit; using an embedding layer to obtain an ordered plurality of embedding vectors for the sequence of n-grams; using a deep network to obtain an encoded vector that is based on the ordered plurality of embedding vectors; and using a classifier to obtain a language prediction for the textual unit that is based on the encoded vector. The deep network includes an attention mechanism, and using the embedding layer to obtain the ordered plurality of embedding vectors comprises, for each n-gram in the sequence of n-grams: obtaining hash values for the n-gram; based on the hash values, selecting component vectors from among the plurality of component vectors; and obtaining an embedding vector for the n-gram that is based on the component vectors.
    Type: Application
    Filed: November 4, 2022
    Publication date: May 11, 2023
    Applicant: Oracle International Corporation
    Inventors: Thanh Tien Vu, Poorya Zaremoodi, Duy Vu, Mark Edward Johnson, Thanh Long Duong, Xu Zhong, Vladislav Blinov, Cong Duy Vu Hoang, Yu-Heng Hong, Vinamr Goel, Philip Victor Ogren, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
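    Illustrative sketch (not part of the filing): the hashed n-gram embedding step from publication 20230141853 above. The hash function, table size, number of hashes, language set, and the untrained mean-pool "encoder" and classifier are all invented stand-ins for the attention-based deep network and trained classifier in the filing.

      # Sketch: hash each character n-gram, select component vectors by hash
      # value, and combine them into the n-gram embedding (all sizes invented).
      import hashlib
      import numpy as np

      NUM_COMPONENTS, DIM, NUM_HASHES = 1024, 16, 2
      rng = np.random.default_rng(0)
      component_vectors = rng.standard_normal((NUM_COMPONENTS, DIM))
      LANGUAGES = ["en", "fr", "vi"]
      classifier_weights = rng.standard_normal((DIM, len(LANGUAGES)))   # untrained dummy

      def char_ngrams(text, n=3):
          padded = f"^{text.lower()}$"
          return [padded[i:i + n] for i in range(len(padded) - n + 1)]

      def embed_ngram(ngram):
          digest = hashlib.md5(ngram.encode("utf-8")).digest()
          hash_values = [int.from_bytes(digest[i:i + 4], "little") % NUM_COMPONENTS
                         for i in range(0, 4 * NUM_HASHES, 4)]
          return component_vectors[hash_values].sum(axis=0)   # combine selected component vectors

      def predict_language(text):
          embeddings = np.stack([embed_ngram(g) for g in char_ngrams(text)])
          encoded = embeddings.mean(axis=0)            # stand-in for the attention encoder
          return LANGUAGES[int(np.argmax(encoded @ classifier_weights))]

      if __name__ == "__main__":
          print(predict_language("bonjour tout le monde"))   # untrained, so output is arbitrary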
  • Publication number: 20230136965
    Abstract: In some aspects, a computer obtains a trained conditional random field (CRF) model comprising a set of model parameters learned from training data and stored in a transition matrix. Pairs of tags whose sequences are inconsistent with the tag sequence logic are identified within the transition matrix, and for each such pair the cost associated with transitioning between the pair of tags is set, within the transition matrix, to a predefined hyperparameter value that penalizes transitioning between the inconsistent pair of tags. The CRF model receives a string of text comprising one or more named entities, the string of text is input into the CRF model having the cost associated with transitioning between the pair of tags set equal to the predefined hyperparameter value, and the CRF model classifies the words within the string of text into different classes, which might include the one or more named entities.
    Type: Application
    Filed: October 31, 2022
    Publication date: May 4, 2023
    Applicant: Oracle International Corporation
    Inventors: Thanh Tien Vu, Tuyen Quang Pham, Mark Edward Johnson, Thanh Long Duong, Aashna Devang Kanuga, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
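    Illustrative sketch (not part of the filing): the transition-matrix constraint from publication 20230136965 above, applied to BIO-style tags as one example of tag sequence logic. The tag set, the "learned" random matrix, and the -1e4 penalty value are invented; decoding with the constrained matrix is not shown.

      # Sketch: find tag pairs that violate BIO sequence logic and overwrite their
      # transition cost with a penalizing hyperparameter (values invented).
      import numpy as np

      TAGS = ["O", "B-CITY", "I-CITY", "B-DATE", "I-DATE"]
      PENALTY = -1e4                      # predefined hyperparameter value

      def is_inconsistent(prev_tag, next_tag):
          """An I- tag may only follow a B- or I- tag of the same entity type."""
          if not next_tag.startswith("I-"):
              return False
          entity = next_tag[2:]
          return prev_tag not in (f"B-{entity}", f"I-{entity}")

      def constrain(transition_matrix):
          for i, prev_tag in enumerate(TAGS):
              for j, next_tag in enumerate(TAGS):
                  if is_inconsistent(prev_tag, next_tag):
                      transition_matrix[i, j] = PENALTY   # penalize the inconsistent pair
          return transition_matrix

      if __name__ == "__main__":
          rng = np.random.default_rng(0)
          learned = rng.standard_normal((len(TAGS), len(TAGS)))  # stand-in learned parameters
          print(constrain(learned).round(1))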