Patents by Inventor Mark Edward Johnson

Mark Edward Johnson has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240095584
    Abstract: Techniques are disclosed herein for objective function optimization in target based hyperparameter tuning. In one aspect, a computer-implemented method is provided that includes initializing a machine learning algorithm with a set of hyperparameter values and obtaining a hyperparameter objective function that comprises a domain score for each domain that is calculated based on a number of instances within an evaluation dataset that are correctly or incorrectly predicted by the machine learning algorithm during a given trial. For each trial of a hyperparameter tuning process: training the machine learning algorithm to generate a machine learning model, running the machine learning model in different domains using the set of hyperparameter values, evaluating the machine learning model for each domain, and once the machine learning model has reached convergence, outputting at least one machine learning model.
    Type: Application
    Filed: May 15, 2023
    Publication date: March 21, 2024
    Applicant: Oracle International Corporation
    Inventors: Ying Xu, Vladislav Blinov, Ahmed Ataallah Ataallah Abobakr, Thanh Long Duong, Mark Edward Johnson, Elias Luqman Jalaluddin, Xin Xu, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Poorya Zaremoodi, Umanga Bista
  • Publication number: 20240095454
    Abstract: Techniques are provided for using context tags in named-entity recognition (NER) models. In one particular aspect, a method is provided that includes receiving an utterance, generating embeddings for words of the utterance, generating a regular expression and gazetteer feature vector for the utterance, generating a context tag distribution feature vector for the utterance, concatenating or interpolating the embeddings with the regular expression and gazetteer feature vector and the context tag distribution feature vector to generate a set of feature vectors, generating an encoded form of the utterance based on the set of feature vectors, generating log-probabilities based on the encoded form of the utterance, and identifying one or more constraints for the utterance.
    Type: Application
    Filed: November 28, 2023
    Publication date: March 21, 2024
    Applicant: Oracle International Corporation
    Inventors: Duy Vu, Tuyen Quang Pham, Cong Duy Vu Hoang, Srinivasa Phani Kumar Gadde, Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi
  • Publication number: 20240086767
    Abstract: Techniques are disclosed herein for continuous hyperparameter tuning with automatic domain weight adjustment based on periodic performance checkpoints. In one aspect, a method is provided that includes initializing a machine learning algorithm with a set of hyperparameter values and obtaining a hyperparameter objective function that is defined at least in part on a plurality of domains of a search space that is associated with the machine learning algorithm. For each trial of a hyperparameter tuning process: running the machine learning algorithm in different domains using the set of hyperparameter values, periodically checking a performance of the machine learning algorithm in the different domains based on the hyperparameter objective function; and continuing hyperparameter tuning with a new set of hyperparameter values after automatically adjusting the domain weights according to a regression status of the different domains.
    Type: Application
    Filed: April 3, 2023
    Publication date: March 14, 2024
    Applicant: Oracle International Corporation
    Inventors: Ying Xu, Vladislav Blinov, Ahmed Ataallah Ataallah Abobakr, Mark Edward Johnson, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Xin Xu, Elias Luqman Jalaluddin, Umanga Bista
  • Patent number: 11922123
    Abstract: Techniques are provided for automatically switching between chatbot skills in the same domain. In one particular aspect, a method is provided that includes receiving an utterance from a user within a chatbot session, where a current skill context is a first skill and a current group context is a first group, inputting the utterance into a candidate skills model for the first group, obtaining, using the candidate skills model, a ranking of skills within the first group, determining, based on the ranking of skills, that a second skill is a highest ranked skill, changing the current skill context of the chatbot session to the second skill, inputting the utterance into a candidate flows model for the second skill, obtaining, using the candidate flows model, a ranking of intents within the second skill that match the utterance, and determining, based on the ranking of intents, an intent that is a highest ranked intent.
    Type: Grant
    Filed: September 30, 2021
    Date of Patent: March 5, 2024
    Assignee: Oracle International Corporation
    Inventors: Vishal Vishnoi, Xin Xu, Elias Luqman Jalaluddin, Srinivasa Phani Kumar Gadde, Crystal C. Pan, Mark Edward Johnson, Thanh Long Duong, Balakota Srinivas Vinnakota, Manish Parekh
  • Patent number: 11914962
    Abstract: The present disclosure relates generally to determining intent based upon speech input using a dialog system. More particularly, techniques are described using matching-based machine learning techniques to identify an intent corresponding to speech input in a dialog system. These procedures do not require training when intents are added or removed from the set of possible intents.
    Type: Grant
    Filed: July 29, 2020
    Date of Patent: February 27, 2024
    Assignee: Oracle International Corporation
    Inventor: Mark Edward Johnson
  • Publication number: 20240061833
    Abstract: Techniques are disclosed for augmenting training data for training a machine learning model to generate database queries. Training data comprising a first training example comprising a first natural language utterance, a logical form for the first natural language utterance, and associated first metadata is obtained. From the first training example, a template utterance is generated. A second natural language utterance is generated by filling slots in the template utterance based on a database schema and database values. Updated metadata is produced based on the first metadata and the second natural language utterance. A second training example is generated, comprising the second natural language utterance, the logical form for the first natural language utterance, and the updated metadata. The training data is augmented by adding the second training example. A machine learning model is trained to generate a database query comprising the database operation using the augmented training data set.
    Type: Application
    Filed: July 5, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Gioacchino Tangari, Nitika Mathur, Philip Arthur, Cong Duy Vu Hoang, Aashna Devang Kanuga, Steve Wai-Chun Siu, Syed Najam Abbas Zaidi, Poorya Zaremoodi, Thanh Long Duong, Mark Edward Johnson
  • Publication number: 20240061832
    Abstract: Techniques are disclosed herein for converting a natural language utterance to an intermediate database query representation. An input string is generated by concatenating a natural language utterance with a database schema representation for a database. Based on the input string, a first encoder generates one or more embeddings of the natural language utterance and the database schema representation. A second encoder encodes relations between elements in the database schema representation and words in the natural language utterance based on the one or more embeddings. A grammar-based decoder generates an intermediate database query representation based on the encoded relations and the one or more embeddings. Based on the intermediate database query representation and an interface specification, a database query is generated in a database query language.
    Type: Application
    Filed: June 14, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Cong Duy Vu Hoang, Stephen Andrew McRitchie, Mark Edward Johnson, Shivashankar Subramanian, Aashna Devang Kanuga, Nitika Mathur, Gioacchino Tangari, Steve Wai-Chun Siu, Poorya Zaremoodi, Vasisht Raghavendra, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Christopher Mark Broadbent, Philip Arthur, Syed Najam Abbas Zaidi
  • Publication number: 20240061835
    Abstract: Systems and methods fine-tune a pretrained machine learning model. For a model having multiple layers, an initial set of configurations is identified, each configuration establishing layers to be frozen and layers to be fine-tuned. A configuration that is optimized with respect to one or more parameters is selected, establishing a set of fine-tuning layers and a set of frozen layers. An input for the model is provided to a remote system. An output of the set of frozen layers of the model, given the provided input, is received back and locally stored. The set of fine-tuning layers of the model is loaded from the remote system. The model is fine-tuned by retrieving the locally stored output of the set of frozen layers, and updating weights associated with the set of fine-tuning layers of the machine learning model.
    Type: Application
    Filed: August 21, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Shivashankar Subramanian, Gioacchino Tangari, Thanh Tien Vu, Cong Duy Vu Hoang, Poorya Zaremoodi, Dalu Guo, Mark Edward Johnson, Thanh Long Duong
  • Publication number: 20240062011
    Abstract: Techniques are disclosed herein for using named entity recognition to resolve entity expression while transforming natural language to a meaning representation language. In one aspect, a method includes accessing natural language text, predicting, by a first machine learning model, a class label for a token in the natural language text, predicting, by a second machine-learning model, operators for a meaning representation language and a value or value span for each attribute of the operators, in response to determining that the value or value span for a particular attribute matches the class label, converting a portion of the natural language text for the value or value span into a resolved format, and outputting syntax for the meaning representation language. The syntax comprises the operators with the portion of the natural language text for the value or value span in the resolved format.
    Type: Application
    Filed: July 13, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Aashna Devang Kanuga, Cong Duy Vu Hoang, Mark Edward Johnson, Vasisht Raghavendra, Yuanxu Wu, Steve Wai-Chun Siu, Nitika Mathur, Gioacchino Tangari, Shubham Pawankumar Shah, Vanshika Sridharan, Zikai Li, Diego Andres Cornejo Barra, Stephen Andrew McRitchie, Christopher Mark Broadbent, Vishal Vishnoi, Srinivasa Phani Kumar Gadde, Poorya Zaremoodi, Thanh Long Duong, Bhagya Gayathri Hettige, Tuyen Quang Pham, Arash Shamaei, Thanh Tien Vu, Yakupitiyage Don Thanuja Samodhve Dharmasiri
  • Publication number: 20240062044
    Abstract: Techniques are disclosed herein for addressing catastrophic forgetting and over-generalization while training a model to transform natural language to a logical form such as a meaning representation language. The techniques include accessing training data comprising natural language examples, augmenting the training data to generate expanded training data, training a machine learning model on the expanded training data, and providing the trained machine learning model. The augmenting includes (i) generating contrastive examples by revising natural language of examples identified to have caused regression during training of a machine learning model with the training data, (ii) generating alternative examples by modifying operators of examples identified within the training data that belong to a concept that exhibits bias, or (iii) a combination of (i) and (ii).
    Type: Application
    Filed: August 18, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Shivashankar Subramanian, Dalu Guo, Gioacchino Tangari, Nitika Mathur, Cong Duy Vu Hoang, Mark Edward Johnson, Thanh Long Duong
  • Publication number: 20240061834
    Abstract: Systems and methods identify whether an input utterance is suitable for providing to a machine learning model configured to generate a query for a database. Techniques include generating an input string by concatenating a natural language utterance with a database schema representation for a database; providing the input string to a first machine learning model; based on the input string, generating, by the first machine learning model, a score indicating whether the natural language utterance is translatable to a database query for the database and should be routed to a second machine learning model, the second machine learning model configured to generate a query for the database based on the natural language utterance; comparing the score to a threshold value; and responsive to determining that the score exceeds the threshold value, providing the natural language utterance or the input string to the second machine learning model.
    Type: Application
    Filed: August 21, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Gioacchino Tangari, Cong Duy Vu Hoang, Poorya Zaremoodi, Philip Arthur, Nitika Mathur, Mark Edward Johnson, Thanh Long Duong
  • Publication number: 20240062021
    Abstract: Techniques are disclosed herein for calibrating confidence scores of a machine learning model trained to translate natural language to a meaning representation language. The techniques include obtaining one or more raw beam scores generated from one or more beam levels of a decoder of a machine learning model trained to translate natural language to a logical form, where each of the one or more raw beam scores is a conditional probability of a sub-tree determined by a heuristic search algorithm of the decoder at one of the one or more beam levels, classifying, by a calibration model, a logical form output by the machine learning model as correct or incorrect based on the one or more raw beam scores, and providing the logical form with a confidence score that is determined based on the classifying of the logical form.
    Type: Application
    Filed: February 9, 2023
    Publication date: February 22, 2024
    Applicant: Oracle International Corporation
    Inventors: Gioacchino Tangari, Cong Duy Vu Hoang, Mark Edward Johnson, Poorya Zaremoodi, Nitika Mathur, Aashna Devang Kanuga, Thanh Long Duong
  • Patent number: 11908460
    Abstract: Disclosed herein are techniques for using a generative adversarial network (GAN) to train a semantic parser of a dialog system. A method described herein involves accessing seed data that includes seed tuples. Each seed tuple includes a respective seed utterance and a respective seed logical form corresponding to the respective seed utterance. The method further includes training a semantic parser and a discriminator in a GAN. The semantic parser learns to map utterances to logical forms based on output from the discriminator, and the discriminator learns to recognize authentic logical forms based on output from the semantic parser. The semantic parser may then be integrated into a dialog system.
    Type: Grant
    Filed: August 13, 2020
    Date of Patent: February 20, 2024
    Assignee: Oracle International Corporation
    Inventors: Thanh Long Duong, Mark Edward Johnson
  • Publication number: 20240028963
    Abstract: An augmentation and feature caching subsystem is described for training AI/ML models. In one particular aspect, a method is provided that includes receiving data comprising training examples, one or more augmentation configuration hyperparameters and one or more feature extraction configuration hyperparameters; generating a first key based on one of the training examples and the one or more augmentation configuration hyperparameters; searching a first key-value storage based on the first key; obtaining one or more augmentations based on the search of the first key-value storage; applying the obtained one or more augmentations to the training examples to result in augmented training examples; generating a second key based on one of the augmented training examples and the one or more feature extraction configuration hyperparameters; searching a second key-value storage based on the second key; obtaining one or more features based on the search of the second key-value storage.
    Type: Application
    Filed: July 11, 2023
    Publication date: January 25, 2024
    Applicant: Oracle International Corporation
    Inventors: Vladislav Blinov, Vishal Vishnoi, Thanh Long Duong, Mark Edward Johnson, Xin Xu, Elias Luqman Jalaluddin, Ying Xu, Ahmed Ataallah Ataallah Abobakr, Umanga Bista, Thanh Tien Vu
  • Publication number: 20240013780
    Abstract: Techniques are provided for data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes generating a list of values to cover for an entity, selecting utterances from a set of data that have context for the entity, converting the utterances into templates, where each template of the templates comprises a slot that maps to the list of values for the entity, selecting a template from the templates, selecting a value from the list of values based on the mapping between the slot within the selected template and the list of values for the entity, and creating an artificial utterance based on the selected template and the selected value, where the creating the artificial utterance comprises inserting the selected value into the slot of the selected template that maps to the list of values for the entity.
    Type: Application
    Filed: September 21, 2023
    Publication date: January 11, 2024
    Applicant: Oracle International Corporation
    Inventors: Srinivasa Phani Kumar Gadde, Yuanxu Wu, Aashna Devang Kanuga, Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson
  • Patent number: 11868727
    Abstract: Techniques are provided for using context tags in named-entity recognition (NER) models. In one particular aspect, a method is provided that includes receiving an utterance, generating embeddings for words of the utterance, generating a regular expression and gazetteer feature vector for the utterance, generating a context tag distribution feature vector for the utterance, concatenating or interpolating the embeddings with the regular expression and gazetteer feature vector and the context tag distribution feature vector to generate a set of feature vectors, generating an encoded form of the utterance based on the set of feature vectors, generating log-probabilities based on the encoded form of the utterance, and identifying one or more constraints for the utterance.
    Type: Grant
    Filed: January 19, 2022
    Date of Patent: January 9, 2024
    Assignee: Oracle International Corporation
    Inventors: Duy Vu, Tuyen Quang Pham, Cong Duy Vu Hoang, Srinivasa Phani Kumar Gadde, Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi
  • Publication number: 20230419040
    Abstract: Novel techniques are described for data augmentation using a two-stage entity-aware augmentation to improve model robustness to entity value changes for intent prediction.
    Type: Application
    Filed: February 1, 2023
    Publication date: December 28, 2023
    Applicant: Oracle International Corporation
    Inventors: Ahmed Ataallah Ataallah Abobakr, Shivashankar Subramanian, Ying Xu, Vladislav Blinov, Umanga Bista, Tuyen Quang Pham, Thanh Long Duong, Mark Edward Johnson, Elias Luqman Jalaluddin, Vanshika Sridharan, Xin Xu, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
  • Publication number: 20230419127
    Abstract: Novel techniques are described for negative entity-aware augmentation using a two-stage augmentation to improve the stability of the model to entity value changes for intent prediction. In some embodiments, a method comprises accessing a first set of training data for an intent prediction model, the first set of training data comprising utterances and intent labels; applying one or more negative entity-aware data augmentation techniques to the first set of training data, depending on the tuning requirements for hyper-parameters, to result in a second set of training data, where the one or more negative entity-aware data augmentation techniques comprise Keyword Augmentation Technique (“KAT”) plus entity without context technique and KAT plus entity in random context as OOD technique; combining the first set of training data and the second set of training data to generate expanded training data; and training the intent prediction model using the expanded training data.
    Type: Application
    Filed: February 1, 2023
    Publication date: December 28, 2023
    Applicant: Oracle International Corporation
    Inventors: Ahmed Ataallah Ataallah Abobakr, Shivashankar Subramanian, Ying Xu, Vladislav Blinov, Umanga Bista, Tuyen Quang Pham, Thanh Long Duong, Mark Edward Johnson, Elias Luqman Jalaluddin, Vanshika Sridharan, Xin Xu, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
  • Publication number: 20230419052
    Abstract: Novel techniques are described for positive entity-aware augmentation using a two-stage augmentation to improve the stability of the model to entity value changes for intent prediction. In one particular aspect, a method is provided that includes accessing a first set of training data for an intent prediction model, the first set of training data comprising utterances and intent labels; applying one or more positive data augmentation techniques to the first set of training data, depending on the tuning requirements for hyper-parameters, to result in a second set of training data, where the positive data augmentation techniques comprise Entity-Aware (“EA”) technique and a two-stage augmentation technique; combining the first set of training data and the second set of training data to generate expanded training data; and training the intent prediction model using the expanded training data.
    Type: Application
    Filed: February 1, 2023
    Publication date: December 28, 2023
    Applicant: Oracle International Corporation
    Inventors: Ahmed Ataallah Ataallah Abobakr, Shivashankar Subramanian, Ying Xu, Vladislav Blinov, Umanga Bista, Tuyen Quang Pham, Thanh Long Duong, Mark Edward Johnson, Elias Luqman Jalaluddin, Vanshika Sridharan, Xin Xu, Srinivasa Phani Kumar Gadde, Vishal Vishnoi
  • Publication number: 20230376700
    Abstract: Techniques are provided for generating training data to facilitate fine-tuning embedding models. Training data including anchor utterances is obtained. Positive utterances and negative utterances are generated from the anchor utterances. Tuples including the anchor utterances, the positive utterances, and the negative utterances are formed. Embeddings for the tuples are generated and a pre-trained embedding model is fine-tuned based on the embeddings. The fine-tuned model can be deployed to a system.
    Type: Application
    Filed: May 9, 2023
    Publication date: November 23, 2023
    Applicant: Oracle International Corporation
    Inventors: Umanga Bista, Vladislav Blinov, Mark Edward Johnson, Ahmed Ataallah Ataallah Abobakr, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Elias Luqman Jalaluddin, Xin Xu, Shivashankar Subramanian
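
The abstract of the last entry above (publication 20230376700) describes forming tuples of anchor, positive, and negative utterances and fine-tuning an embedding model on them. The following is only a rough sketch of that general idea and is not taken from the application. It assumes that positives are other utterances with the same intent label and negatives are utterances with a different intent, that a normalised bag-of-words vector stands in for the pre-trained embedding model, and that a triplet margin loss is the fine-tuning signal. The names embed, distance, make_triplets, and triplet_loss are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def embed(utterance):
    """Toy stand-in (assumption) for a pre-trained embedding model:
    an L2-normalised bag-of-words vector stored as a dict."""
    counts = Counter(utterance.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def distance(u, v):
    """Euclidean distance between two sparse vectors."""
    keys = set(u) | set(v)
    return math.sqrt(sum((u.get(k, 0.0) - v.get(k, 0.0)) ** 2 for k in keys))


def make_triplets(labelled):
    """Build (anchor, positive, negative) tuples from (utterance, intent) pairs.

    Assumption: a positive shares the anchor's intent label and a negative
    has a different one. The abstract does not say how they are generated.
    """
    triplets = []
    for anchor, anchor_intent in labelled:
        positives = [u for u, i in labelled if i == anchor_intent and u != anchor]
        negatives = [u for u, i in labelled if i != anchor_intent]
        triplets.extend((anchor, p, n) for p, n in product(positives, negatives))
    return triplets


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Triplet margin loss: zero once the negative is at least `margin`
    farther from the anchor than the positive is."""
    a, p, n = embed(anchor), embed(positive), embed(negative)
    return max(0.0, distance(a, p) - distance(a, n) + margin)


data = [
    ("check my balance", "balance"),
    ("show my balance", "balance"),
    ("order a pizza", "order"),
]
triplets = make_triplets(data)
for anchor, positive, negative in triplets:
    print(anchor, "|", positive, "|", negative,
          "| loss:", triplet_loss(anchor, positive, negative))
```

In an actual fine-tuning loop, the loss would be back-propagated through a trainable embedding model; here it is only evaluated, since the toy embed function has no parameters to update.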