Patents by Inventor Thanh Long Duong

Thanh Long Duong has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240144923
    Abstract: Disclosed herein are techniques for using a generative adversarial network (GAN) to train a semantic parser of a dialog system. A method described herein involves accessing seed data that includes seed tuples. Each seed tuple includes a respective seed utterance and a respective seed logical form corresponding to the respective seed utterance. The method further includes training a semantic parser and a discriminator in a GAN. The semantic parser learns to map utterances to logical forms based on output from the discriminator, and the discriminator learns to recognize authentic logical forms based on output from the semantic parser. The semantic parser may then be integrated into a dialog system.
    Type: Application
    Filed: January 11, 2024
    Publication date: May 2, 2024
    Applicant: Oracle International Corporation
    Inventors: Thanh Long Duong, Mark Edward Johnson
  • Publication number: 20240143934
    Abstract: A method includes accessing a document including sentences, the document being associated with a configuration flag indicating whether aspect-based sentiment analysis (ABSA), sentence-level sentiment analysis (SLSA), or both are to be performed; inputting the document into a language model that generates chunks of token embeddings for the document; and, based on the configuration flag, performing at least one of the ABSA and the SLSA by inputting the chunks of token embeddings into a multi-task model. When performing the SLSA, a part of the token embeddings in each of the chunks is masked, and the masked token embeddings do not belong to the particular sentence on which the SLSA is performed.
    Type: Application
    Filed: October 12, 2023
    Publication date: May 2, 2024
    Applicant: Oracle International Corporation
    Inventors: Poorya Zaremoodi, Duy Vu, Nagaraj N. Bhat, Srijon Sarkar, Varsha Kuppur Rajendra, Thanh Long Duong, Mark Edward Johnson, Pramir Sarkar, Shahid Reza
  • Patent number: 11972220
    Abstract: Techniques for using enhanced logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system and inputting the utterance into a machine-learning model including a series of network layers. A final network layer of the series of network layers can include a logit function. The machine-learning model can map a first probability for a resolvable class to a first logit value using the logit function. The machine-learning model can map a second probability for an unresolvable class to an enhanced logit value. The method can also include the chatbot system classifying the utterance as the resolvable class or the unresolvable class based on the first logit value and the enhanced logit value.
    Type: Grant
    Filed: November 29, 2021
    Date of Patent: April 30, 2024
    Assignee: Oracle International Corporation
    Inventors: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson
  • Patent number: 11972755
    Abstract: Techniques for noise data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training an intent classifier to identify one or more intents for one or more utterances; augmenting the training set of utterances with noise text to generate an augmented training set of utterances; and training the intent classifier using the augmented training set of utterances. The augmenting includes: obtaining the noise text from a list of words, a text corpus, a publication, a dictionary, or any combination thereof that is irrelevant to the original text within the utterances of the training set of utterances, and incorporating the noise text within the utterances relative to the original text in the utterances of the training set of utterances at a predefined augmentation ratio to generate augmented utterances.
    Type: Grant
    Filed: November 23, 2022
    Date of Patent: April 30, 2024
    Assignee: Oracle International Corporation
    Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Yu-Heng Hong, Balakota Srinivas Vinnakota
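The augmentation step in the abstract above can be illustrated with a minimal sketch. It assumes the augmentation ratio means the fraction of training utterances that receive one noise-augmented copy, and that a single noise word is inserted at a random position; the abstract specifies neither detail. The function name `augment_with_noise` and its parameters are hypothetical.

```python
import random

def augment_with_noise(utterances, noise_words, ratio, seed=0):
    """Return the original utterances plus noise-augmented copies.

    Assumption: `ratio` is the fraction of utterances that receive one
    augmented copy. The abstract does not define the ratio precisely.
    """
    rng = random.Random(seed)
    n_augmented = min(len(utterances), round(len(utterances) * ratio))
    augmented = []
    for utterance in rng.sample(utterances, n_augmented):
        words = utterance.split()
        # Insert one noise word at a random position relative to the original text.
        position = rng.randint(0, len(words))
        words.insert(position, rng.choice(noise_words))
        augmented.append(" ".join(words))
    return utterances + augmented

train = ["book a flight", "cancel my order", "check my balance", "reset password"]
augmented_train = augment_with_noise(train, ["lorem", "ipsum"], ratio=0.5)
```

The noise words come from a source unrelated to the utterances, so an intent classifier trained on `augmented_train` sees the same intents with irrelevant tokens mixed in.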
  • Publication number: 20240135116
    Abstract: A computer-implemented method includes: accessing a plurality of datasets, where each dataset of the plurality of datasets includes training examples; selecting datasets that include the training examples in a source language and a target language; sampling, based on a sampling weight that is determined for each of the selected datasets, the training examples from the selected datasets to generate training batches; training an ML model for performing at least a first task using the training examples of the training batches, by interleaving the training batches as input to the ML model; and outputting the trained ML model configured to perform at least the first task on input utterances provided in at least one of the source language and the target language. The sampling weight is determined for each of the selected datasets based on one or more attributes common to the training examples of the selected dataset.
    Type: Application
    Filed: October 12, 2023
    Publication date: April 25, 2024
    Applicant: Oracle International Corporation
    Inventors: Duy Vu, Poorya Zaremoodi, Nagaraj N. Bhat, Srijon Sarkar, Varsha Kuppur Rajendra, Thanh Long Duong, Mark Edward Johnson, Pramir Sarkar, Shahid Reza
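A minimal sketch of weight-based sampling into training batches follows. The abstract says only that each dataset's weight depends on attributes common to its examples; using dataset size raised to a temperature is an assumption made here for illustration, and the names `size_based_weights` and `make_batches` are hypothetical.

```python
import random

def size_based_weights(datasets, temperature=0.5):
    """Hypothetical weighting: dataset size raised to a temperature, normalized."""
    raw = {name: len(examples) ** temperature for name, examples in datasets.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}

def make_batches(datasets, weights, batch_size, num_batches, seed=0):
    """Build batches by picking a dataset by weight, then an example from it."""
    rng = random.Random(seed)
    names = list(datasets)
    name_weights = [weights[name] for name in names]
    batches = []
    for _ in range(num_batches):
        batch = []
        for _ in range(batch_size):
            name = rng.choices(names, weights=name_weights, k=1)[0]
            batch.append(rng.choice(datasets[name]))
        batches.append(batch)
    return batches

# Toy parallel data: (source utterance, target utterance) pairs.
datasets = {
    "en-fr": [("hello", "bonjour"), ("thanks", "merci"), ("yes", "oui"), ("no", "non")],
    "en-es": [("hello", "hola")],
}
weights = size_based_weights(datasets)
batches = make_batches(datasets, weights, batch_size=4, num_batches=3)
```

A temperature below 1 flattens the distribution, so the smaller dataset is sampled more often than its raw share of the examples would suggest.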
  • Publication number: 20240134850
    Abstract: The present disclosure is related to techniques for converting a natural language utterance to a logical form query and deriving a natural language interpretation of the logical form query. The techniques include accessing a Meaning Representation Language (MRL) query and converting the MRL query into an MRL structure including logical form statements. The converting includes extracting operations and associated attributes from the MRL query and generating the logical form statements from the operations and associated attributes. The techniques further include translating each of the logical form statements into a natural language expression based on a grammar data structure that includes a set of rules for translating logical form statements into corresponding natural language expressions, combining the natural language expressions into a single natural language expression, and providing the single natural language expression as an interpretation of the natural language utterance.
    Type: Application
    Filed: May 21, 2023
    Publication date: April 25, 2024
    Applicant: Oracle International Corporation
    Inventors: Chang Xu, Poorya Zaremoodi, Cong Duy Vu Hoang, Nitika Mathur, Philip Arthur, Steve Wai-Chun Siu, Aashna Devang Kanuga, Gioacchino Tangari, Mark Edward Johnson, Thanh Long Duong, Vishal Vishnoi, Stephen Andrew McRitchie, Christopher Mark Broadbent
  • Publication number: 20240126795
    Abstract: Techniques are disclosed herein for integrating document question answering in an artificial intelligence-based platform, such as a chatbot system. The techniques include receiving a query from a user, rewriting the query to include one or more specific descriptors, computing an embedding vector for the rewritten query, retrieving one or more textual passages from a document store utilizing the embedding vector for the rewritten query, determining one or more answers to the rewritten query within the one or more textual passages, and returning the one or more answers.
    Type: Application
    Filed: October 13, 2023
    Publication date: April 18, 2024
    Applicant: Oracle International Corporation
    Inventors: Xu Zhong, Thanh Long Duong, Mark Edward Johnson, Charles Woodrow Dickstein, King-Hwa Lee, Xin Xu, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Christopher Kennewick, Balakota Srinivas Vinnakota, Raefer Christopher Gabriel
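The rewrite, embed, and retrieve steps in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed system: a bag-of-words count vector stands in for a learned embedding model, the rewrite step simply appends descriptors, and answer extraction within the retrieved passages is omitted. All function names are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a learned model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rewrite(query, descriptors):
    """Hypothetical rewrite step: append specific descriptors to the query."""
    return " ".join([query, *descriptors])

def retrieve(query, passages, top_k=1):
    """Rank passages by similarity to the query embedding and keep the top_k."""
    query_vector = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(query_vector, embed(p)), reverse=True)
    return ranked[:top_k]

passages = [
    "the refund policy allows returns within 30 days",
    "shipping takes five business days",
    "support is open on weekdays",
]
rewritten = rewrite("how long", ["refund", "returns"])
best = retrieve(rewritten, passages)
```

Without the rewrite, the query "how long" shares no terms with any passage; the added descriptors are what let the retrieval step find the refund passage.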
  • Publication number: 20240126800
    Abstract: Techniques for maintaining list-type text formatting when converting content from a source content format to a destination content format are disclosed. A system generates text content by applying text formatting tags to segments of characters obtained from a source electronic document. The system parses a static-display type source electronic document to obtain character data of the characters in the source document. The system analyzes the parsed data to identify text arranged in a list-type text format in the source document. The system generates text content in a destination content format different from the source format by applying tags to segments of the text content designating the segments as items in a list.
    Type: Application
    Filed: May 31, 2023
    Publication date: April 18, 2024
    Applicant: Oracle International Corporation
    Inventors: Vishank Bhatia, Xu Zhong, Thanh Long Duong, Mark Johnson, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
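As a rough illustration of tagging list items in a destination format, the sketch below works on already-extracted text lines and emits HTML-style tags. Both choices are assumptions: the abstract describes parsing character data from a static-display document and does not name the destination format. The pattern `LIST_ITEM` and the function `tag_list_items` are hypothetical.

```python
import re

# Matches a bullet marker (-, *, or a bullet character) or a numbered marker (1. or 1)).
LIST_ITEM = re.compile(r"^\s*(?:[-*\u2022]|\d+[.)])\s+(.*)$")

def tag_list_items(lines):
    """Wrap consecutive list-like lines in <ul>/<li> tags; pass other lines through."""
    out = []
    in_list = False
    for line in lines:
        match = LIST_ITEM.match(line)
        if match:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{match.group(1)}</li>")
        else:
            if in_list:
                out.append("</ul>")
                in_list = False
            out.append(line)
    if in_list:
        out.append("</ul>")
    return out

tagged = tag_list_items(["Steps:", "\u2022 first", "2. second", "Done"])
```

A system working from a PDF would likely rely on character positions and glyph metadata rather than a text pattern, so this sketch covers only the final tagging step.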
  • Publication number: 20240126999
    Abstract: Techniques for using logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with a particular class.
    Type: Application
    Filed: December 19, 2023
    Publication date: April 18, 2024
    Applicant: Oracle International Corporation
    Inventors: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson
  • Publication number: 20240095584
    Abstract: Techniques are disclosed herein for objective function optimization in target based hyperparameter tuning. In one aspect, a computer-implemented method is provided that includes initializing a machine learning algorithm with a set of hyperparameter values and obtaining a hyperparameter objective function that comprises a domain score for each domain that is calculated based on a number of instances within an evaluation dataset that are correctly or incorrectly predicted by the machine learning algorithm during a given trial. The method further includes, for each trial of a hyperparameter tuning process: training the machine learning algorithm to generate a machine learning model, running the machine learning model in different domains using the set of hyperparameter values, evaluating the machine learning model for each domain, and, once the machine learning model has reached convergence, outputting at least one machine learning model.
    Type: Application
    Filed: May 15, 2023
    Publication date: March 21, 2024
    Applicant: Oracle International Corporation
    Inventors: Ying Xu, Vladislav Blinov, Ahmed Ataallah Ataallah Abobakr, Thanh Long Duong, Mark Edward Johnson, Elias Luqman Jalaluddin, Xin Xu, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Poorya Zaremoodi, Umanga Bista
  • Publication number: 20240095454
    Abstract: Techniques are provided for using context tags in named-entity recognition (NER) models. In one particular aspect, a method is provided that includes receiving an utterance, generating embeddings for words of the utterance, generating a regular expression and gazetteer feature vector for the utterance, generating a context tag distribution feature vector for the utterance, concatenating or interpolating the embeddings with the regular expression and gazetteer feature vector and the context tag distribution feature vector to generate a set of feature vectors, generating an encoded form of the utterance based on the set of feature vectors, generating log-probabilities based on the encoded form of the utterance, and identifying one or more constraints for the utterance.
    Type: Application
    Filed: November 28, 2023
    Publication date: March 21, 2024
    Applicant: Oracle International Corporation
    Inventors: Duy Vu, Tuyen Quang Pham, Cong Duy Vu Hoang, Srinivasa Phani Kumar Gadde, Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi
  • Publication number: 20240086767
    Abstract: Techniques are disclosed herein for continuous hyperparameter tuning with automatic domain weight adjustment based on periodic performance checkpoints. In one aspect, a method is provided that includes initializing a machine learning algorithm with a set of hyperparameter values and obtaining a hyperparameter objective function that is defined at least in part on a plurality of domains of a search space that is associated with the machine learning algorithm. The method further includes, for each trial of a hyperparameter tuning process: running the machine learning algorithm in different domains using the set of hyperparameter values; periodically checking a performance of the machine learning algorithm in the different domains based on the hyperparameter objective function; and continuing hyperparameter tuning with a new set of hyperparameter values after automatically adjusting domain weights according to a regression status of the different domains.
    Type: Application
    Filed: April 3, 2023
    Publication date: March 14, 2024
    Applicant: Oracle International Corporation
    Inventors: Ying Xu, Vladislav Blinov, Ahmed Ataallah Ataallah Abobakr, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Xin Xu, Elias Luqman Jalaluddin, Umanga Bista
  • Patent number: 11922123
    Abstract: Techniques for automatically switching between chatbot skills in the same domain. In one particular aspect, a method is provided that includes receiving an utterance from a user within a chatbot session, where a current skill context is a first skill and a current group context is a first group, inputting the utterance into a candidate skills model for the first group, obtaining, using the candidate skills model, a ranking of skills within the first group, determining, based on the ranking of skills, a second skill is a highest ranked skill, changing the current skill context of the chatbot session to the second skill, inputting the utterance into a candidate flows model for the second skill, obtaining, using the candidate flows model, a ranking of intents within the second skill that match the utterance, and determining, based on the ranking of intents, an intent that is a highest ranked intent.
    Type: Grant
    Filed: September 30, 2021
    Date of Patent: March 5, 2024
    Assignee: Oracle International Corporation
    Inventors: Vishal Vishnoi, Xin Xu, Elias Luqman Jalaluddin, Srinivasa Phani Kumar Gadde, Crystal C. Pan, Mark Edward Johnson, Thanh Long Duong, Balakota Srinivas Vinnakota, Manish Parekh
  • Patent number: 11914943
    Abstract: Techniques for generating text content arranged in a consistent read order from a source document including text corresponding to different read orders are disclosed. A system parses a binary file representing an electronic document to identify characters and metadata associated with the characters. The system pre-sorts a character order of characters in each line of the electronic document to generate an ordered list of characters arranged according to the right-to-left reading order. The system performs a layout-mirroring operation to change a position of characters within the modified document relative to a right edge of the document and a left edge of the document. Subsequent to performing layout-mirroring, the system identifies native left-to-right reading-order text in-line with the native right-to-left reading-order text.
    Type: Grant
    Filed: February 15, 2023
    Date of Patent: February 27, 2024
    Assignee: Oracle International Corporation
    Inventors: Xu Zhong, Vishank Bhatia, Thanh Long Duong, Mark Johnson, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
  • Publication number: 20240061833
    Abstract: Techniques are disclosed for augmenting training data for training a machine learning model to generate database queries. Training data comprising a first training example comprising a first natural language utterance, a logical form for the first natural language utterance, and associated first metadata is obtained. From the first training example, a template utterance is generated. A second natural language utterance is generated by filling slots in the template utterance based on a database schema and database values. Updated metadata is produced based on the first metadata and the second natural language utterance. A second training example is generated, comprising the second natural language utterance, the logical form for the first natural language utterance, and the updated metadata. The training data is augmented by adding the second training example. A machine learning model is trained to generate a database query comprising a database operation using the augmented training data.
    Type: Application
    Filed: July 5, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Gioacchino Tangari, Nitika Mathur, Philip Arthur, Cong Duy Vu Hoang, Aashna Devang Kanuga, Steve Wai-Chun Siu, Syed Najam Abbas Zaidi, Poorya Zaremoodi, Thanh Long Duong, Mark Edward Johnson
  • Publication number: 20240061992
    Abstract: Techniques for generating formatting tags for textual content obtained from a source electronic document are disclosed. A system parses a digital file to obtain information about characters in an electronic document. The system applies tags to text generated based on the textual content of the electronic document by creating segments of textually-consecutive characters and applying corresponding text formatting style tags to the segments. The system further identifies segments of text overlapping bounding boxes in the electronic document. The system generates textual content including a segment of text and a corresponding hyperlink associated with the segment of text. The system further generates textual content by selectively applying line breaks from the source electronic document in the textual content.
    Type: Application
    Filed: January 6, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Vishank Bhatia, Xu Zhong, Thanh Long Duong, Mark Johnson, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, King-Hwa Lee, Christopher Kennewick
  • Publication number: 20240061832
    Abstract: Techniques are disclosed herein for converting a natural language utterance to an intermediate database query representation. An input string is generated by concatenating a natural language utterance with a database schema representation for a database. Based on the input string, a first encoder generates one or more embeddings of the natural language utterance and the database schema representation. A second encoder encodes relations between elements in the database schema representation and words in the natural language utterance based on the one or more embeddings. A grammar-based decoder generates an intermediate database query representation based on the encoded relations and the one or more embeddings. Based on the intermediate database query representation and an interface specification, a database query is generated in a database query language.
    Type: Application
    Filed: June 14, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Cong Duy Vu Hoang, Stephen Andrew McRitchie, Mark Edward Johnson, Shivashankar Subramanian, Aashna Devang Kanuga, Nitika Mathur, Gioacchino Tangari, Steve Wai-Chun Siu, Poorya Zaremoodi, Vasisht Raghavendra, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Christopher Mark Broadbent, Philip Arthur, Syed Najam Abbas Zaidi
  • Publication number: 20240062011
    Abstract: Techniques are disclosed herein for using named entity recognition to resolve entity expression while transforming natural language to a meaning representation language. In one aspect, a method includes accessing natural language text, predicting, by a first machine learning model, a class label for a token in the natural language text, predicting, by a second machine-learning model, operators for a meaning representation language and a value or value span for each attribute of the operators, in response to determining that the value or value span for a particular attribute matches the class label, converting a portion of the natural language text for the value or value span into a resolved format, and outputting syntax for the meaning representation language. The syntax comprises the operators with the portion of the natural language text for the value or value span in the resolved format.
    Type: Application
    Filed: July 13, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Aashna Devang Kanuga, Cong Duy Vu Hoang, Mark Edward Johnson, Vasisht Raghavendra, Yuanxu Wu, Steve Wai-Chun Siu, Nitika Mathur, Gioacchino Tangari, Shubham Pawankumar Shah, Vanshika Sridharan, Zikai Li, Diego Andres Cornejo Barra, Stephen Andrew McRitchie, Christopher Mark Broadbent, Vishal Vishnoi, Srinivasa Phani Kumar Gadde, Poorya Zaremoodi, Thanh Long Duong, Bhagya Gayathri Hettige, Tuyen Quang Pham, Arash Shamaei, Thanh Tien Vu, Yakupitiyage Don Thanuja Samodhye Dharmasiri
  • Publication number: 20240061834
    Abstract: Systems and methods identify whether an input utterance is suitable for providing to a machine learning model configured to generate a query for a database. Techniques include generating an input string by concatenating a natural language utterance with a database schema representation for a database; providing the input string to a first machine learning model; based on the input string, generating, by the first machine learning model, a score indicating whether the natural language utterance is translatable to a database query for the database and should be routed to a second machine learning model, the second machine learning model configured to generate a query for the database based on the natural language utterance; comparing the score to a threshold value; and responsive to determining that the score exceeds the threshold value, providing the natural language utterance or the input string to the second machine learning model.
    Type: Application
    Filed: August 21, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Gioacchino Tangari, Cong Duy Vu Hoang, Poorya Zaremoodi, Philip Arthur, Nitika Mathur, Mark Edward Johnson, Thanh Long Duong
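The score-and-threshold routing in the abstract above can be sketched as below. The first machine learning model is replaced here by a hypothetical keyword-overlap scorer, and the `" | "` separator used to concatenate the utterance with the schema representation is likewise an assumption; the abstract specifies neither.

```python
import re

def build_input(utterance, schema):
    """Concatenate the utterance with a schema representation (separator is assumed)."""
    return f"{utterance} | {schema}"

def keyword_scorer(input_string):
    """Stand-in for the first model: fraction of utterance words found in the schema."""
    utterance, schema = input_string.split(" | ", 1)
    words = re.findall(r"\w+", utterance.lower())
    schema_terms = set(re.findall(r"\w+", schema.lower()))
    return sum(word in schema_terms for word in words) / len(words) if words else 0.0

def route(utterance, schema, scorer=keyword_scorer, threshold=0.5):
    """Forward the utterance to the query generator only if the score exceeds the threshold."""
    score = scorer(build_input(utterance, schema))
    destination = "query_generator" if score > threshold else "rejected"
    return destination, score

schema = "employees: name, salary"
in_scope = route("list employees salary", schema)
out_of_scope = route("tell me a joke", schema)
```

Filtering before query generation keeps utterances that cannot be answered from the database away from the second model, which would otherwise produce a query for them anyway.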
  • Publication number: 20240062021
    Abstract: Techniques are disclosed herein for calibrating confidence scores of a machine learning model trained to translate natural language to a meaning representation language. The techniques include obtaining one or more raw beam scores generated from one or more beam levels of a decoder of a machine learning model trained to translate natural language to a logical form, where each of the one or more raw beam scores is a conditional probability of a sub-tree determined by a heuristic search algorithm of the decoder at one of the one or more beam levels, classifying, by a calibration model, a logical form output by the machine learning model as correct or incorrect based on the one or more raw beam scores, and providing the logical form with a confidence score that is determined based on the classifying of the logical form.
    Type: Application
    Filed: February 9, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Gioacchino Tangari, Cong Duy Vu Hoang, Mark Edward Johnson, Poorya Zaremoodi, Nitika Mathur, Aashna Devang Kanuga, Thanh Long Duong