Patents by Inventor Thanh Long Duong

Thanh Long Duong has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

USING BACKPROPAGATION TO TRAIN A DIALOG SYSTEM

Publication number: 20230043528

Abstract: Techniques described herein use backpropagation to train one or more machine learning (ML) models of a dialog system. For instance, a method includes accessing seed data that includes training tuples, where each training tuple comprises a respective logical form. The method includes converting the logical form of a training tuple to a converted logical form by applying to the logical form a text-to-speech (TTS) subsystem, an automatic speech recognition (ASR) subsystem, and a semantic parser of a dialog system. The method includes determining a training signal by using an objective function to compare the converted logical form to the logical form. The method further includes training the TTS subsystem, the ASR subsystem, and the semantic parser via backpropagation based on the training signal. As a result of the training by backpropagation, the machine learning models are tuned to work effectively together within a pipeline of the dialog system.

Type: Application

Filed: October 26, 2022

Publication date: February 9, 2023

Applicant: Oracle International Corporation

Inventors: Thanh Long Duong, Mark Edward Johnson
Task-oriented dialog suitable for a standalone device

Patent number: 11574636

Abstract: Described herein are dialog systems, and techniques for providing such dialog systems, that are suitable for use on standalone computing devices. In some embodiments, a dialog system includes a dialog manager, which takes as input an input logical form, which may be a representation of user input. The dialog manager may include a dialog state tracker, an execution subsystem, a dialog policy subsystem, and a context stack. The dialog state tracker may generate an intermediate logical form from the input logical form combined with a context from the context stack. The context stack may maintain a history of a current dialog, and thus, the intermediate logical form may include contextual information potentially missing from the input logical form. The execution subsystem may execute the intermediate logical form to produce an execution result, and the dialog policy subsystem may generate an output logical form based on the execution result.

Type: Grant

Filed: August 28, 2020

Date of Patent: February 7, 2023

Assignee: Oracle International Corporation

Inventors: Thanh Long Duong, Mark Edward Johnson, Vu Cong Duy Hoang, Tuyen Quang Pham, Yu-Heng Hong, Vladislavs Dovgalecs, Guy Bashkansky, Jason Eric Black, Andrew David Bleeker, Serge Le Huitouze
Noise data augmentation for natural language processing

Patent number: 11538457

Abstract: Techniques for noise data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training an intent classifier to identify one or more intents for one or more utterances; augmenting the training set of utterances with noise text to generate an augmented training set of utterances; and training the intent classifier using the augmented training set of utterances. The augmenting includes: obtaining the noise text from a list of words, a text corpus, a publication, a dictionary, or any combination thereof, where the noise text is irrelevant to the original text within the utterances of the training set of utterances, and incorporating the noise text within the utterances relative to the original text in the utterances of the training set of utterances at a predefined augmentation ratio to generate augmented utterances.

Type: Grant

Filed: September 9, 2020

Date of Patent: December 27, 2022

Assignee: Oracle International Corporation

Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Yu-Heng Hong, Balakota Srinivas Vinnakota
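
The augmentation step described in the abstract of patent 11538457 above can be illustrated with a minimal Python sketch. It is not the patented method: the function name `augment_with_noise`, the `noise_len` parameter, the random insertion position, and the reading of the "predefined augmentation ratio" as the fraction of utterances that receive a noisy copy are all assumptions made for illustration.

```python
import random


def augment_with_noise(utterances, noise_words, ratio=0.2, noise_len=3, seed=0):
    """Return the original utterances followed by noisy copies of some of them.

    ratio     -- assumed meaning: fraction of utterances that get a noisy copy
    noise_len -- hypothetical parameter: number of noise words inserted per copy
    """
    rng = random.Random(seed)
    n_aug = int(round(len(utterances) * ratio))
    chosen = rng.sample(list(utterances), n_aug)

    augmented = []
    for text in chosen:
        tokens = text.split()
        # Noise is drawn from a word list that is unrelated to the utterance.
        noise = [rng.choice(noise_words) for _ in range(noise_len)]
        # Insert the noise as one contiguous span at a random token boundary.
        pos = rng.randint(0, len(tokens))
        augmented.append(" ".join(tokens[:pos] + noise + tokens[pos:]))

    return list(utterances) + augmented
```

An intent classifier would then be trained on the returned list, with each noisy copy keeping the intent label of the utterance it was derived from.
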
Using backpropagation to train a dialog system

Patent number: 11508359

Abstract: Techniques described herein use backpropagation to train one or more machine learning (ML) models of a dialog system. For instance, a method includes accessing seed data that includes training tuples, where each training tuple comprises a respective logical form. The method includes converting the logical form of a training tuple to a converted logical form by applying to the logical form a text-to-speech (TTS) subsystem, an automatic speech recognition (ASR) subsystem, and a semantic parser of a dialog system. The method includes determining a training signal by using an objective function to compare the converted logical form to the logical form. The method further includes training the TTS subsystem, the ASR subsystem, and the semantic parser via backpropagation based on the training signal. As a result of the training by backpropagation, the machine learning models are tuned to work effectively together within a pipeline of the dialog system.

Type: Grant

Filed: August 25, 2020

Date of Patent: November 22, 2022

Assignee: Oracle International Corporation

Inventors: Thanh Long Duong, Mark Edward Johnson
Semantic parser including a coarse semantic parser and a fine semantic parser

Patent number: 11501065

Abstract: Techniques for improving a semantic parser of a dialog system, by breaking the semantic parser into a coarse semantic parser and a fine semantic parser, are described. A method described herein includes accessing an utterance received in a dialog system. The utterance is a text-based natural language expression. The method further includes applying a coarse semantic parser to the utterance to determine an intermediate logical form for the utterance. The intermediate logical form indicates one or more intents in the utterance. The method further includes applying a fine semantic parser to the intermediate logical form to determine a logical form for the utterance. The logical form is a syntactic expression of the utterance according to an established grammar, and the logical form includes one or more parameters of the one or more intents. The logical form can be used to conduct a dialog with a user of the dialog system.

Type: Grant

Filed: August 13, 2020

Date of Patent: November 15, 2022

Assignee: Oracle International Corporation

Inventors: Thanh Long Duong, Mark Edward Johnson
Implementing a correction model to reduce propagation of automatic speech recognition errors

Patent number: 11462208

Abstract: Some techniques described herein determine a correction model for a dialog system, such that the correction model corrects output from an automatic speech recognition (ASR) subsystem in the dialog system. A method described herein includes accessing training data. A first tuple of the training data includes an utterance, where the utterance is a textual representation of speech. The method further includes using an ASR subsystem of a dialog system to convert the utterance to an output utterance. The method further includes storing the output utterance in corrective training data that is based on the training data. The method further includes training a correction model based on the corrective training data, such that the correction model is configured to correct output from the ASR subsystem during operation of the dialog system.

Type: Grant

Filed: August 13, 2020

Date of Patent: October 4, 2022

Assignee: Oracle International Corporation

Inventors: Thanh Long Duong, Mark Edward Johnson
MULTI-FEATURE BALANCING FOR NATURAL LANGUAGE PROCESSORS

Publication number: 20220229991

Abstract: Techniques are disclosed for multi-feature balancing for natural language processors. In an embodiment, a method includes receiving a natural language query to be processed by a machine learning model, the machine learning model utilizing a dataset of natural language phrases for processing natural language queries, determining, based on the machine learning model and the natural language query, a feature dropout value, generating, based on the natural language query, one or more contextual features and one or more expressional features that may be input to the machine learning model, modifying at least one of the one or more contextual features and the one or more expressional features based on the feature dropout value to generate a set of input features for the machine learning model, and processing the set of input features to cause generating an output dataset corresponding to the natural language query.

Type: Application

Filed: January 20, 2022

Publication date: July 21, 2022

Applicant: Oracle International Corporation

Inventors: Thanh Long Duong, Vishal Vishnoi, Mark Edward Johnson, Elias Luqman Jalaluddin, Tuyen Quang Pham, Cong Duy Vu Hoang, Poorya Zaremoodi, Srinivasa Phani Kumar Gadde, Aashna Devang Kanuga, Zikai Li, Yuanxu Wu
MULTI-FACTOR MODELLING FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220230000

Abstract: Techniques are disclosed for multi-factor modelling for training and utilizing chatbot systems for natural language processing. In an embodiment, a method includes receiving a set of utterance data corresponding to a natural language-based query, determining one or more intents for the chatbot, each of which corresponds to a possible context for the natural language-based query and is associated with a skill for the chatbot, generating one or more intent classification datasets, each intent classification dataset associated with a probability that the natural language-based query corresponds to an intent of the one or more intents, generating one or more transformed datasets each corresponding to a skill of one or more skills, determining a first skill of the one or more skills based on the one or more transformed datasets, and processing, based on the determined first skill, the set of utterance data to resolve the natural language-based query.

Type: Application

Filed: January 18, 2022

Publication date: July 21, 2022

Applicant: Oracle International Corporation

Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Ying Xu
CONTEXT TAG INTEGRATION WITH NAMED ENTITY RECOGNITION MODELS

Publication number: 20220229993

Abstract: Techniques are provided for using context tags in named-entity recognition (NER) models. In one particular aspect, a method is provided that includes receiving an utterance, generating embeddings for words of the utterance, generating a regular expression and gazetteer feature vector for the utterance, generating a context tag distribution feature vector for the utterance, concatenating or interpolating the embeddings with the regular expression and gazetteer feature vector and the context tag distribution feature vector to generate a set of feature vectors, generating an encoded form of the utterance based on the set of feature vectors, generating log-probabilities based on the encoded form of the utterance, and identifying one or more constraints for the utterance.

Type: Application

Filed: January 19, 2022

Publication date: July 21, 2022

Applicant: Oracle International Corporation

Inventors: Duy Vu, Tuyen Quang Pham, Cong Duy Vu Hoang, Srinivasa Phani Kumar Gadde, Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi
CONTENT TARGETING USING CONTENT CONTEXT AND USER PROPENSITY

Publication number: 20220207284

Abstract: Disclosed herein are techniques for machine-learning systems and methods for generating content objects using AI models. A method described herein includes predicting a propensity metric using a machine-learning propensity model describing a propensity of a user to interact with a tag. The method includes generating, using a content-tagging machine-learning model, a set of features characterizing the content object. The method includes determining, for each user in a set of users, a score that predicts a propensity of the user interacting with a particular content object. The method includes selecting a subset of users of the set of users based on the scores determined for the set of users. The method also includes facilitating output of the particular content object to each of the subset of users.

Type: Application

Filed: December 31, 2020

Publication date: June 30, 2022

Applicant: Oracle International Corporation

Inventors: Venkata Chandrashekar Duvvuri, Srinivasa Golla, Thanh Long Duong
METHOD AND SYSTEM FOR OVER-PREDICTION IN NEURAL NETWORKS

Publication number: 20220172021

Abstract: Disclosed herein are techniques for addressing an overconfidence problem associated with machine learning models in chatbot systems. For each layer of a plurality of layers of a machine learning model, a distribution of confidence scores is generated for a plurality of predictions with respect to an input utterance. A prediction is determined for each layer of the machine learning model based on the distribution of confidence scores generated for the layer. Based on the predictions, an overall prediction of the machine learning model is determined. A subset of the plurality of layers is iteratively processed to identify a layer whose assigned prediction satisfies a criterion. A confidence score associated with the assigned prediction of the layer of the machine learning model is assigned as an overall confidence score to be associated with the overall prediction of the machine learning model.

Type: Application

Filed: November 16, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Cong Duy Vu Hoang, Thanh Tien Vu, Poorya Zaremoodi, Ying Xu, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson
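
The layer-wise procedure in the abstract of publication 20220172021 above leaves several choices open. The Python sketch below fills them with labeled assumptions: the overall prediction is taken as a majority vote over the per-layer predictions, and the criterion is "the first layer, scanning in order, whose prediction agrees with the overall prediction". The function name and the dictionary input format are hypothetical.

```python
from collections import Counter


def overall_prediction_with_confidence(layer_scores):
    """layer_scores: one dict per layer, mapping label -> confidence score."""
    if not layer_scores:
        raise ValueError("at least one layer is required")

    # Per-layer prediction: the label with the highest confidence in that layer.
    layer_preds = [max(scores, key=scores.get) for scores in layer_scores]

    # Assumption: the overall prediction is the majority vote across layers.
    overall = Counter(layer_preds).most_common(1)[0][0]

    # Assumption: the criterion is agreement with the overall prediction;
    # the first agreeing layer supplies the overall confidence score.
    for scores, pred in zip(layer_scores, layer_preds):
        if pred == overall:
            return overall, scores[pred]
```

Taking the confidence from an earlier agreeing layer rather than from the final layer is one way to temper an overconfident final score; whether that matches the claimed criterion is not stated in the abstract.
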
KEYWORD DATA AUGMENTATION TOOL FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220171930

Abstract: Techniques for keyword data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training a machine-learning model to identify one or more intents for one or more utterances, augmenting the training set of utterances with out-of-domain (OOD) examples. The augmenting includes: identifying keywords within utterances of the training set of utterances, generating a set of OOD examples with the identified keywords, filtering out OOD examples from the set of OOD examples that have a context substantially similar to context of the utterances of the training set of utterances, and incorporating the set of OOD examples without the filtered OOD examples into the training set of utterances to generate an augmented training set of utterances. Thereafter, the machine-learning model is trained using the augmented training set of utterances.

Type: Application

Filed: October 28, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Thanh Long Duong, Mark Edward Johnson, Poorya Zaremoodi, Gautam Singaraju, Ying Xu, Vladislav Blinov
DISTANCE-BASED LOGIT VALUE FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220171947

Abstract: Techniques for using logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with a particular class.

Type: Application

Filed: November 30, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson
ENHANCED LOGITS FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220171946

Abstract: Techniques for using enhanced logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system and inputting the utterance into a machine-learning model including a series of network layers. A final network layer of the series of network layers can include a logit function. The machine-learning model can map a first probability for a resolvable class to a first logit value using the logit function. The machine-learning model can map a second probability for an unresolvable class to an enhanced logit value. The method can also include the chatbot system classifying the utterance as the resolvable class or the unresolvable class based on the first logit value and the enhanced logit value.

Type: Application

Filed: November 29, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson
OUT-OF-DOMAIN DATA AUGMENTATION FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220171938

Abstract: Techniques for out-of-domain data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training a machine-learning model to identify one or more intents for one or more utterances, and augmenting the training set of utterances with out-of-domain (OOD) examples. The augmenting includes: generating a data set of OOD examples, filtering out OOD examples from the data set of OOD examples, determining a difficulty value for each OOD example remaining within the filtered data set of OOD examples, and generating augmented batches of utterances comprising utterances from the training set of utterances and utterances from the filtered data set of OOD examples based on the difficulty value for each OOD example. Thereafter, the machine-learning model is trained using the augmented batches of utterances in accordance with a curriculum training protocol.

Type: Application

Filed: October 28, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Thanh Long Duong, Mark Edward Johnson, Poorya Zaremoodi, Gautam Singaraju, Ying Xu, Vladislav Blinov, Yu-Heng Hong
AUTOMATIC OUT OF SCOPE TRANSITION FOR CHATBOT

Publication number: 20220100961

Abstract: Techniques for automatically switching between chatbot skills in the same domain. In one particular aspect, a method is provided that includes receiving an utterance from a user within a chatbot session, where a current skill context is a first skill and a current group context is a first group, inputting the utterance into a candidate skills model for the first group, obtaining, using the candidate skills model, a ranking of skills within the first group, determining, based on the ranking of skills, a second skill is a highest ranked skill, changing the current skill context of the chatbot session to the second skill, inputting the utterance into a candidate flows model for the second skill, obtaining, using the candidate flows model, a ranking of intents within the second skill that match the utterance, and determining, based on the ranking of intents, an intent that is a highest ranked intent.

Type: Application

Filed: September 30, 2021

Publication date: March 31, 2022

Applicant: Oracle International Corporation

Inventors: Vishal Vishnoi, Xin Xu, Elias Luqman Jalaluddin, Srinivasa Phani Kumar Gadde, Crystal C. Pan, Mark Edward Johnson, Thanh Long Duong, Balakota Srinivas Vinnakota, Manish Parekh
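
The two-stage routing in the abstract of publication 20220100961 above (rank skills within the group, switch to the top skill, then rank that skill's intents) can be sketched in Python as follows. The candidate skills model and candidate flows model are replaced here by a simple keyword-overlap score, which is purely a stand-in; the function name and the nested-dictionary layout are hypothetical.

```python
def route_utterance(utterance, skills):
    """skills: dict of skill name -> dict of intent name -> list of keywords."""
    tokens = set(utterance.lower().split())

    def overlap(keywords):
        # Stand-in scorer: number of keywords that appear in the utterance.
        return len(tokens & set(keywords))

    # Stage 1 (stand-in for the candidate skills model): rank skills in the
    # group by their best-matching intent and take the highest ranked skill.
    skill_ranking = sorted(
        skills,
        key=lambda s: max(overlap(kw) for kw in skills[s].values()),
        reverse=True,
    )
    top_skill = skill_ranking[0]

    # Stage 2 (stand-in for the candidate flows model): rank the intents of
    # the selected skill and take the highest ranked intent.
    intent_ranking = sorted(
        skills[top_skill],
        key=lambda i: overlap(skills[top_skill][i]),
        reverse=True,
    )
    return top_skill, intent_ranking[0]


group = {
    "banking": {"check_balance": ["balance", "account"], "transfer": ["transfer", "send"]},
    "pizza": {"order": ["pizza", "order"]},
}
```

With this toy group, an utterance received while the session context is the banking skill can still be routed to the pizza skill if that skill ranks highest.
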
TECHNIQUES FOR OUT-OF-DOMAIN (OOD) DETECTION

Publication number: 20210303798

Abstract: The present disclosure relates to techniques for identifying out-of-domain utterances.

Type: Application

Filed: March 30, 2021

Publication date: September 30, 2021

Applicant: Oracle International Corporation

Inventors: Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi, Crystal C. Pan, Vladislav Blinov, Cong Duy Vu Hoang, Elias Luqman Jalaluddin, Duy Vu, Balakota Srinivas Vinnakota
METHOD AND SYSTEM FOR CONSTRAINT BASED HYPERPARAMETER TUNING

Publication number: 20210304003

Abstract: Techniques are disclosed for tuning hyperparameters of a model. Datasets are obtained for training the model and metrics are selected for evaluating performance of the model. Each metric is assigned a weight specifying an importance to the performance of the model. A function is created that measures performance based on the weighted metrics. Hyperparameters are tuned to optimize the model performance. Tuning the hyperparameters includes: (i) training the model that is configured based on current values for the hyperparameters; (ii) evaluating a performance of the model using the function; (iii) determining whether the model is optimized for the metrics; (iv) in response to the model not being optimized, searching for new values for the hyperparameters, reconfiguring the model with the new values, and repeating steps (i)-(iii) using the reconfigured model; and (v) in response to the model being optimized for the metrics, providing a trained model.

Type: Application

Filed: March 29, 2021

Publication date: September 30, 2021

Applicant: Oracle International Corporation

Inventors: Mark Edward Johnson, Thanh Long Duong, Vishal Vishnoi, Balakota Srinivas Vinnakota, Tuyen Quang Pham, Cong Duy Vu Hoang
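
Steps (i) through (v) in the abstract of publication 20210304003 above map onto a simple search loop. The Python sketch below is one possible reading, under these assumptions: the performance function is a weighted sum of metric values, "optimized" means the weighted score reaches a chosen threshold, and new hyperparameter values are found by random search. All names, including the toy `toy_train_and_evaluate` stand-in for real training, are hypothetical.

```python
import random


def weighted_score(metrics, weights):
    """Performance function: weighted sum of the selected metrics."""
    return sum(weights[name] * metrics[name] for name in weights)


def tune(train_and_evaluate, search_space, weights, target, max_trials=200, seed=0):
    rng = random.Random(seed)

    def sample():
        return {name: rng.choice(values) for name, values in search_space.items()}

    best_params, best_score = None, float("-inf")
    params = sample()
    for _ in range(max_trials):
        metrics = train_and_evaluate(params)        # (i) train with current values
        score = weighted_score(metrics, weights)    # (ii) evaluate with the function
        if score > best_score:
            best_params, best_score = params, score
        if score >= target:                         # (iii) optimized? (assumed test)
            break                                   # (v) provide the trained model
        params = sample()                           # (iv) search for new values
    return best_params, best_score


def toy_train_and_evaluate(params):
    """Hypothetical stand-in for training: returns metric values in [0, 1]."""
    return {
        "accuracy": 1.0 - abs(params["lr"] - 0.1),
        "speed": 1.0 / params["layers"],
    }
```

A call such as `tune(toy_train_and_evaluate, {"lr": [0.01, 0.1, 0.5], "layers": [1, 2, 4]}, {"accuracy": 0.8, "speed": 0.2}, target=0.99)` returns the best configuration seen and its weighted score.
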
METHOD AND SYSTEM FOR TARGET BASED HYPER-PARAMETER TUNING

Publication number: 20210304074

Abstract: Techniques are disclosed for tuning hyperparameters of a machine-learning model. A plurality of metrics are selected for which hyperparameters of the machine-learning model are to be tuned. Each metric is associated with a plurality of specification parameters including a target score, a penalty factor, and a bonus factor. The plurality of specification parameters are configured for each metric in accordance with a first criterion. The machine-learning model is evaluated using one or more validation datasets to obtain a metric score. A weighted loss function is formulated based on a difference between the metric score and the target score of each metric, the penalty factor or the bonus factor. The hyperparameters associated with the machine-learning model are tuned in order to optimize the weighted loss function. In response to the weighted loss function being optimized, the machine-learning model is provided as a validated machine-learning model.

Type: Application

Filed: March 29, 2021

Publication date: September 30, 2021

Applicant: Oracle International Corporation

Inventors: Poorya Zaremoodi, Ying Xu, Thanh Tien Vu, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson, Xin Xu, Cong Duy Vu Hoang
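
The weighted loss in the abstract of publication 20210304074 above combines, for each metric, the gap between the metric score and its target score with either a penalty factor or a bonus factor. The exact formula is not given in the abstract, so the Python sketch below assumes a piecewise-linear form: a shortfall below the target adds `penalty * shortfall` to the loss, and a surplus above the target subtracts `bonus * surplus`. The function name and the tuple layout of `specs` are hypothetical.

```python
def target_based_loss(scores, specs):
    """scores: metric name -> measured score on a validation dataset.
    specs:  metric name -> (target_score, penalty_factor, bonus_factor).
    """
    loss = 0.0
    for name, (target, penalty, bonus) in specs.items():
        diff = scores[name] - target
        if diff < 0:
            # Below target: penalize the shortfall.
            loss += penalty * (-diff)
        else:
            # At or above target: reward the surplus.
            loss -= bonus * diff
    return loss
```

Choosing a penalty factor larger than the bonus factor makes the tuner prioritize bringing every metric up to its target before trading extra gains on one metric against another.
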
BATCHING TECHNIQUES FOR HANDLING UNBALANCED TRAINING DATA FOR A CHATBOT

Publication number: 20210304075

Abstract: The present disclosure relates to chatbot systems, and more particularly, to batching techniques for handling unbalanced training data when training a model such that bias is removed from the trained machine learning model when performing inference. In an embodiment, a plurality of raw utterances is obtained. A bias-reducing distribution is determined and a subset of the plurality of raw utterances is batched according to the bias-reducing distribution. The resulting unbiased training data may be input into a prediction model for training the prediction model. The trained prediction model may be obtained and utilized to predict unbiased results from new inputs received by the trained prediction model.

Type: Application

Filed: March 30, 2021

Publication date: September 30, 2021

Applicant: Oracle International Corporation

Inventors: Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi, Balakota Srinivas Vinnakota, Yu-Heng Hong, Elias Luqman Jalaluddin
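
The abstract of publication 20210304075 above does not say what the bias-reducing distribution is. As a minimal Python sketch, the code below assumes one simple choice: every intent label is equally likely to be drawn for each batch slot, regardless of how many raw utterances that label has. The function name and the `(text, label)` input format are hypothetical.

```python
import random
from collections import defaultdict


def balanced_batches(examples, batch_size, num_batches, seed=0):
    """examples: list of (text, label) pairs, possibly heavily unbalanced."""
    rng = random.Random(seed)

    # Group the raw utterances by label.
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append(text)
    labels = sorted(by_label)

    batches = []
    for _ in range(num_batches):
        batch = []
        for _ in range(batch_size):
            # Assumed bias-reducing distribution: uniform over labels,
            # then uniform over that label's utterances.
            label = rng.choice(labels)
            batch.append((rng.choice(by_label[label]), label))
        batches.append(batch)
    return batches
```

With 90 utterances for one intent and 10 for another, batches drawn this way contain the two intents in roughly equal proportion, at the cost of repeating the rarer utterances more often.
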
