Patents by Inventor Thanh Long Duong

Thanh Long Duong has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240232187
    Abstract: The present disclosure is related to techniques for converting a natural language utterance to a logical form query and deriving a natural language interpretation of the logical form query. The techniques include accessing a Meaning Resource Language (MRL) query and converting the MRL query into a MRL structure including logical form statements. The converting includes extracting operations and associated attributes from the MRL query and generating the logical form statements from the operations and associated attributes. The techniques further include translating each of the logical form statements into a natural language expression based on a grammar data structure that includes a set of rules for translating logical form statements into corresponding natural language expressions, combining the natural language expressions into a single natural language expression, and providing the single natural language expression as an interpretation of the natural language utterance.
    Type: Application
    Filed: May 22, 2023
    Publication date: July 11, 2024
    Applicant: Oracle International Corporation
    Inventors: Chang Xu, Poorya Zaremoodi, Cong Duy Vu Hoang, Nitika Mathur, Philip Arthur, Steve Wai-Chun Siu, Aashna Devang Kanuga, Gioacchino Tangari, Mark Edward Johnson, Thanh Long Duong, Vishal Vishnoi, Stephen Andrew McRitchie, Christopher Mark Broadbent
  • Publication number: 20240232541
    Abstract: Techniques for using enhanced logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system and inputting the utterance into a machine-learning model including a series of network layers. A final network layer of the series of network layers can include a logit function. The machine-learning model can map a first probability for a resolvable class to a first logit value using the logit function. The machine-learning model can map a second probability for a unresolvable class to an enhanced logit value. The method can also include the chatbot system classifying the utterance as the resolvable class or the unresolvable class based on the first logit value and the enhanced logit value.
    Type: Application
    Filed: March 20, 2024
    Publication date: July 11, 2024
    Applicant: Oracle International Corporation
    Inventors: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson
  • Patent number: 12026468
    Abstract: Techniques for out-of-domain data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training a machine-learning model to identify one or more intents for one or more utterances, and augmenting the training set of utterances with out-of-domain (OOD) examples. The augmenting includes: generating a data set of OOD examples, filtering out OOD examples from the data set of OOD examples, determining a difficulty value for each OOD example remaining within the filtered data set of the OOD examples, and generating augmented batches of utterances comprising utterances from the training set of utterances and utterances from the filtered data set of the OOD based on the difficulty value for each OOD. Thereafter, the machine-learning model is trained using the augmented batches of utterances in accordance with a curriculum training protocol.
    Type: Grant
    Filed: October 28, 2021
    Date of Patent: July 2, 2024
    Assignee: Oracle International Corporation
    Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Thanh Long Duong, Mark Edward Johnson, Poorya Zaremoodi, Gautam Singaraju, Ying Xu, Vladislav Blinov, Yu-Heng Hong
  • Patent number: 12019994
    Abstract: Techniques for using logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with the particular class.
    Type: Grant
    Filed: November 30, 2021
    Date of Patent: June 25, 2024
    Assignee: Oracle International Corporation
    Inventors: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson
  • Patent number: 12014146
    Abstract: The present disclosure relates to techniques for identifying out-of-domain utterances.
    Type: Grant
    Filed: August 2, 2023
    Date of Patent: June 18, 2024
    Assignee: Oracle International Corporation
    Inventors: Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi, Crystal C. Pan, Vladislav Blinov, Cong Duy Vu Hoang, Elias Luqman Jalaluddin, Duy Vu, Balakota Srinivas Vinnakota
  • Patent number: 12001775
    Abstract: A data corpus is partitioned into text strings for header classification. A group characteristic is computed for a text string, and whether the group characteristic satisfies a group characteristic criterion is determined. The text string may be disqualified from header classification if the group characteristic criterion is not satisfied, or one or more font characteristics may be determined for the text string if the group characteristic criterion is satisfied. A font characteristic that meets one or more prevalence criteria may be identified and evaluated to determine whether the font characteristic meets at least one font characteristic criterion. The text string may be disqualified from header classification if the font characteristic criterion is not satisfied, or if the font characteristic meets the font characteristic criterion, the text string is classified as a header, and tagged content is generated by applying a header tag to the text string.
    Type: Grant
    Filed: June 13, 2023
    Date of Patent: June 4, 2024
    Assignee: Oracle International Corporation
    Inventors: Sagar Gollamudi, Vishank Bhatia, Xu Zhong, Thanh Long Duong, Mark Johnson, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
  • Publication number: 20240169155
    Abstract: Techniques for automatically switching between chatbot skills in the same domain. In one particular aspect, a method is provided that includes receiving an utterance from a user within a chatbot session, where a current skill context is a first skill and a current group context is a first group, inputting the utterance into a candidate skills model for the first group, obtaining, using the candidate skills model, a ranking of skills within the first group, determining, based on the ranking of skills, a second skill is a highest ranked skill, changing the current skill context of the chatbot session to the second skill, inputting the utterance into a candidate flows model for the second skill, obtaining, using the candidate flows model, a ranking of intents within the second skill that match the utterance, and determining, based on the ranking of intents, an intent that is a highest ranked intent.
    Type: Application
    Filed: January 26, 2024
    Publication date: May 23, 2024
    Applicant: Oracle International Corporation
    Inventors: Vishal Vishnoi, Xin Xu, Elias Luqman Jalaluddin, Srinivasa Phani Kumar Gadde, Crystal C. Pan, Mark Edward Johnson, Thanh Long Duong, Balakota Srinivas Vinnakota, Manish Parekh
  • Publication number: 20240169161
    Abstract: Obtaining collections of sentences in different languages that are usable for training models in various applications of artificial intelligence is provided. A method is provided that obtains, from text corpus, webpages in a plurality of languages, each of the webpages corresponding to an URL; obtains annotations for each of the webpages based on its URL, to obtain annotated data entries corresponding to the webpages, each of the annotated data entries including a classification label corresponding to a sub-topic of one of a plurality of topics, where each of the plurality of topics includes a corresponding plurality of sub-topics; filters the annotated data entries to obtain topic-specific content in a target language based on the classification labels, the topic-specific content corresponding to one or more sub-topics; performs post-processing on the topic-specific content to obtain result data; and outputs the result data for the topic.
    Type: Application
    Filed: August 21, 2023
    Publication date: May 23, 2024
    Applicant: Oracle International Corporation
    Inventors: Paria Jamshid Lou, Gioacchino Tangari, Jason Black, Bhagya Gayathri Hettige, Xu Zhong, Poorya Zaremoodi, Thanh Long Duong, Mark Edward Johnson
  • Publication number: 20240144923
    Abstract: Disclosed herein are techniques for using a generative adversarial network (GAN) to train a semantic parser of a dialog system. A method described herein involves accessing seed data that includes seed tuples. Each seed tuple includes a respective seed utterance and a respective seed logical form corresponding to the respective seed utterance. The method further includes training a semantic parser and a discriminator in a GAN. The semantic parser learns to map utterances to logical forms based on output from the discriminator, and the discriminator learns to recognize authentic logical forms based on output from the semantic parser. The semantic parser may then be integrated into a dialog system.
    Type: Application
    Filed: January 11, 2024
    Publication date: May 2, 2024
    Applicant: Oracle International Corporation
    Inventors: Thanh Long Duong, Mark Edward Johnson
  • Publication number: 20240143934
    Abstract: A method includes accessing document including sentences, document being associated with configuration flag indicating whether ABSA, SLSA, or both are to be performed; inputting the document into language model that generates chunks of token embeddings for the document; and, based on the configuration flag, performing at least one from among the ABSA and the SLSA by inputting the chunks of token embeddings into a multi-task model. When performing the SLSA, a part of token embeddings in each of the chunks is masked, and the masked token embeddings do not belong to a particular sentence on which the SLSA is performed.
    Type: Application
    Filed: October 12, 2023
    Publication date: May 2, 2024
    Applicant: Oracle International Corporation
    Inventors: Poorya Zaremoodi, Duy Vu, Nagaraj N. Bhat, Srijon Sarkar, Varsha Kuppur Rajendra, Thanh Long Duong, Mark Edward Johnson, Pramir Sarkar, Shahid Reza
  • Patent number: 11972755
    Abstract: Techniques for noise data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training an intent classifier to identify one or more intents for one or more utterances; augmenting the training set of utterances with noise text to generate an augmented training set of utterances; and training the intent classifier using the augmented training set of utterances. The augmenting includes: obtaining the noise text from a list of words, a text corpus, a publication, a dictionary, or any combination thereof irrelevant of original text within the utterances of the training set of utterances, and incorporating the noise text within the utterances relative to the original text in the utterances of the training set of utterances at a predefined augmentation ratio to generate augmented utterances.
    Type: Grant
    Filed: November 23, 2022
    Date of Patent: April 30, 2024
    Assignee: Oracle International Corporation
    Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Yu-Heng Hong, Balakota Srinivas Vinnakota
  • Patent number: 11972220
    Abstract: Techniques for using enhanced logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system and inputting the utterance into a machine-learning model including a series of network layers. A final network layer of the series of network layers can include a logit function. The machine-learning model can map a first probability for a resolvable class to a first logit value using the logit function. The machine-learning model can map a second probability for a unresolvable class to an enhanced logit value. The method can also include the chatbot system classifying the utterance as the resolvable class or the unresolvable class based on the first logit value and the enhanced logit value.
    Type: Grant
    Filed: November 29, 2021
    Date of Patent: April 30, 2024
    Assignee: Oracle International Corporation
    Inventors: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson
  • Publication number: 20240135116
    Abstract: A computer-implemented method includes: accessing a plurality of datasets, where each dataset of the plurality of datasets includes training examples; selecting datasets that include the training examples in a source language and a target language; and sampling, based on a sampling weight that is determined for each of the selected datasets, the training examples from the selected datasets to generate the training batches; training an ML model for performing at least a first task using the training examples of the training batches, by interleavingly inputting the training batches to the ML model; and outputting the trained ML model configured to perform the at least the first task on input utterances provided in at least one among the source language and the target language. The sampling weight is determined for each of the selected datasets based on one or more attributes common to the training examples of the selected dataset.
    Type: Application
    Filed: October 12, 2023
    Publication date: April 25, 2024
    Applicant: Oracle International Corporation
    Inventors: Duy Vu, Poorya Zaremoodi, Nagaraj N. Bhat, Srijon Sarkar, Varsha Kuppur Rajendra, Thanh Long Duong, Mark Edward Johnson, Pramir Sarkar, Shahid Reza
  • Publication number: 20240134850
    Abstract: The present disclosure is related to techniques for converting a natural language utterance to a logical form query and deriving a natural language interpretation of the logical form query. The techniques include accessing a Meaning Resource Language (MRL) query and converting the MRL query into a MRL structure including logical form statements. The converting includes extracting operations and associated attributes from the MRL query and generating the logical form statements from the operations and associated attributes. The techniques further include translating each of the logical form statements into a natural language expression based on a grammar data structure that includes a set of rules for translating logical form statements into corresponding natural language expressions, combining the natural language expressions into a single natural language expression, and providing the single natural language expression as an interpretation of the natural language utterance.
    Type: Application
    Filed: May 21, 2023
    Publication date: April 25, 2024
    Applicant: Oracle International Corporation
    Inventors: Chang Xu, Poorya Zaremoodi, Cong Duy Vu Hoang, Nitika Mathur, Philip Arthur, Steve Wai-Chun Siu, Aashna Devang Kanuga, Gioacchino Tangari, Mark Edward Johnson, Thanh Long Duong, Vishal Vishnoi, Stephen Andrew McRitchie, Christopher Mark Broadbent
  • Publication number: 20240126999
    Abstract: Techniques for using logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with the particular class.
    Type: Application
    Filed: December 19, 2023
    Publication date: April 18, 2024
    Applicant: Oracle International Corporation
    Inventors: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson
  • Publication number: 20240126795
    Abstract: Techniques are disclosed herein for integrating document question answering in an artificial intelligence-based platform, such as a chatbot system. The techniques include receiving a query from a user, rewriting the query to include one or more specific descriptors, computing an embedding vector for the rewritten query, retrieving one or more textual passages from a document store utilizing the embedding vector for the rewritten query, determining one or more answers to the rewritten query within the one or more textual passages, and returning the one or more answers.
    Type: Application
    Filed: October 13, 2023
    Publication date: April 18, 2024
    Applicant: Oracle International Corporation
    Inventors: Xu Zhong, Thanh Long Duong, Mark Edward Johnson, Charles Woodrow Dickstein, King-Hwa Lee, Xin Xu, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Christopher Kennewick, Balakota Srinivas Vinnakota, Raefer Christopher Gabriel
  • Publication number: 20240126800
    Abstract: Techniques for maintaining list-type text formatting when converting content from a source content format to a destination content format are disclosed. A system generates text content by applying text formatting tags to segments of characters obtained from a source electronic document. The system parses a static-display type source electronic document to obtain character data of the characters in the source document. The system analyzes the parsed data to identify text arranged in a list-type text format in the source document. The system generates text content in a destination content format different from the source format by applying tags to segments of the text content designating the segments items in a list.
    Type: Application
    Filed: May 31, 2023
    Publication date: April 18, 2024
    Applicant: Oracle International Corporation
    Inventors: Vishank Bhatia, Xu Zhong, Thanh Long Duong, Mark Johnson, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
  • Publication number: 20240095584
    Abstract: Techniques are disclosed herein for objective function optimization in target based hyperparameter tuning. In one aspect, a computer-implemented method is provided that includes initializing a machine learning algorithm with a set of hyperparameter values and obtaining a hyperparameter objective function that comprises a domain score for each domain that is calculated based on a number of instances within an evaluation dataset that are correctly or incorrectly predicted by the machine learning algorithm during a given trial. For each trial of a hyperparameter tuning process: training the machine learning algorithm to generate a machine learning model, running the machine learning model in different domains using the set of hyperparameter values, evaluating the machine learning model for each domain, and once the machine learning model has reached convergence, outputting at least one machine learning model.
    Type: Application
    Filed: May 15, 2023
    Publication date: March 21, 2024
    Applicant: Oracle International Corporation
    Inventors: Ying Xu, Vladislav Blinov, Ahmed Ataallah Ataallah Abobakr, Thanh Long Duong, Mark Edward Johnson, Elias Luqman Jalaluddin, Xin Xu, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Poorya Zaremoodi, Umanga Bista
  • Publication number: 20240095454
    Abstract: Techniques are provided for using context tags in named-entity recognition (NER) models. In one particular aspect, a method is provided that includes receiving an utterance, generating embeddings for words of the utterance, generating a regular expression and gazetteer feature vector for the utterance, generating a context tag distribution feature vector for the utterance, concatenating or interpolating the embeddings with the regular expression and gazetteer feature vector and the context tag distribution feature vector to generate a set of feature vectors, generating an encoded form of the utterance based on the set of feature vectors, generating log-probabilities based on the encoded form of the utterance, and identifying one or more constraints for the utterance.
    Type: Application
    Filed: November 28, 2023
    Publication date: March 21, 2024
    Applicant: Oracle International Corporation
    Inventors: Duy Vu, Tuyen Quang Pham, Cong Duy Vu Hoang, Srinivasa Phani Kumar Gadde, Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi
  • Publication number: 20240086767
    Abstract: Techniques are disclosed herein for continuous hyperparameter tuning with automatic domain weight adjustment based on periodic performance checkpoints. In one aspect, a method is provided that includes initializing a machine learning algorithm with a set of hyperparameter values and obtaining a hyperparameter objective function that is defined at least in part on a plurality of domains of a search space that is associated with the machine learning algorithm. For each trial of a hyperparameter tuning process: running the machine learning algorithm in different domains using the set of hyperparameter values, periodically checking a performance of the machine learning algorithm in the different domains based on the hyperparameter objective function; and continuing hyperparameter tuning with a new set of hyperparameter values after automatically adjusting the domain weights according to a regression status of the different domains.
    Type: Application
    Filed: April 3, 2023
    Publication date: March 14, 2024
    Applicant: Oracle International Corporation
    Inventors: Ying Xu, Vladislav Blinov, Ahmed Ataallah Ataallah Abobakr, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Xin Xu, Elias Luqman Jalaluddin, Umanga Bista