Patents by Inventor Mark Edward Johnson

Mark Edward Johnson has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Streamlining dialog processing using integrated shared resources

Patent number: 11403462

Abstract: Techniques for reducing memory and processing resources used by a dialog system by sharing resources between pipelined processes of the dialog system. An integrated shared dictionary is constructed for concurrent use by automated speech recognition (ASR) and natural language understanding (NLU) subsystems of the dialog system. The integrated shared dictionary comprises multiple entries, with each entry comprising first information that is used by the ASR subsystem, second information used by the NLU subsystem, and information correlating the first information and the second information. The ASR subsystem uses the integrated shared dictionary to identify a dictionary entry containing a set of words corresponding to speech input. The dictionary entry information is communicated to the NLU subsystem, which uses the entry to generate a meaning representation for the speech input.

Type: Grant

Filed: July 13, 2020

Date of Patent: August 2, 2022

Assignee: Oracle International Corporation

Inventor: Mark Edward Johnson
MULTI-FACTOR MODELLING FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220230000

Abstract: Techniques are disclosed for systems including techniques for multi-factor modelling for training and utilizing chatbot systems for natural language processing. In an embodiment, a method includes receiving a set of utterance data corresponding to a natural language-based query, determining one or more intents for the chatbot corresponds to a possible context for the natural language-based query and associated with a skill for the chatbot, generating one or more intent classification datasets, each intent classification dataset associated with a probability that the natural language query corresponds to an intent of the one or more intents, generating one or more transformed datasets each corresponding to a skill of one or more skills, determining a first skill of the one or more skills based on the one or more transformed datasets and processing, based on the determined first skill, the set of utterance data to resolve the natural language-based query.

Type: Application

Filed: January 18, 2022

Publication date: July 21, 2022

Applicant: Oracle International Corporation

Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Ying Xu
MULTI-FEATURE BALANCING FOR NATURAL LANGUAGE PROCESSORS

Publication number: 20220229991

Abstract: Techniques are disclosed for systems including techniques for multi-feature balancing for natural langue processors. In an embodiment, a method includes receiving a natural language query to be processed by a machine learning model, the machine learning model utilizing a dataset of natural language phrases for processing natural language queries, determining, based on the machine learning model and the natural language query, a feature dropout value, generating, and based on the natural language query, one or more contextual features and one or more expressional features that may be input to the machine learning model, modifying at least one or the one or more contextual features and the one or more expressional features based on the feature dropout value to generate a set of input features for the machine learning model, and processing the set of input features to cause generating an output dataset for corresponding to the natural language query.

Type: Application

Filed: January 20, 2022

Publication date: July 21, 2022

Applicant: Oracle International Corporation

Inventors: Thanh Long Duong, Vishal Vishnoi, Mark Edward Johnson, Elias Luqman Jalaluddin, Tuyen Quang Pham, Cong Duy Vu Hoang, Poorya Zaremoodi, Srinivasa Phani Kumar Gadde, Aashna Devang Kanuga, Zikai Li, Yuanxu Wu
CONTEXT TAG INTEGRATION WITH NAMED ENTITY RECOGNITION MODELS

Publication number: 20220229993

Abstract: Techniques are provided for using context tags in named-entity recognition (NER) models. In one particular aspect, a method is provided that includes receiving an utterance, generating embeddings for words of the utterance, generating a regular expression and gazetteer feature vector for the utterance, generating a context tag distribution feature vector for the utterance, concatenating or interpolating the embeddings with the regular expression and gazetteer feature vector and the context tag distribution feature vector to generate a set of feature vectors, generating an encoded form of the utterance based on the set of feature vectors, generating log-probabilities based on the encoded form of the utterance, and identifying one or more constraints for the utterance.

Type: Application

Filed: January 19, 2022

Publication date: July 21, 2022

Applicant: Oracle International Corporation

Inventors: Duy Vu, Tuyen Quang Pham, Cong Duy Vu Hoang, Srinivasa Phani Kumar Gadde, Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi
KEYWORD DATA AUGMENTATION TOOL FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220171930

Abstract: Techniques for keyword data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training a machine-learning model to identify one or more intents for one or more utterances, augmenting the training set of utterances with out-of-domain (OOD) examples. The augmenting includes: identifying keywords within utterances of the training set of utterances, generating a set of OOD examples with the identified keywords, filtering out OOD examples from the set of OOD examples that have a context substantially similar to context of the utterances of the training set of utterances, and incorporating the set of OOD examples without the filtered OOD examples into the training set of utterances to generate an augmented training set of utterances. Thereafter, the machine-learning model is trained using the augmented training set of utterances.

Type: Application

Filed: October 28, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Thanh Long Duong, Mark Edward Johnson, Poorya Zaremoodi, Gautam Singaraju, Ying Xu, Vladislav Blinov
METHOD AND SYSTEM FOR OVER-PREDICTION IN NEURAL NETWORKS

Publication number: 20220172021

Abstract: Disclosed herein are techniques for addressing an overconfidence problem associated with machine learning models in chatbot systems. For each layer of a plurality of layers of a machine learning model, a distribution of confidence scores is generated for a plurality of predictions with respect to an input utterance. A prediction is determined for each layer of the machine learning model based on the distribution of confidence scores generated for the layer. Based on the predictions, an overall prediction of the machine learning model is determined. A subset of the plurality of layers are iteratively processed to identify a layer whose assigned prediction satisfies a criterion. A confidence score associated with the assigned prediction of the layer of the machine learning model is assigned as an overall confidence score to be associated with the overall prediction of the machine learning model.

Type: Application

Filed: November 16, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Cong Duy Vu Hoang, Thanh Tien Vu, Poorya Zaremoodi, Ying Xu, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson
OUT-OF-DOMAIN DATA AUGMENTATION FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220171938

Abstract: Techniques for out-of-domain data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training a machine-learning model to identify one or more intents for one or more utterances, and augmenting the training set of utterances with out-of-domain (OOD) examples. The augmenting includes: generating a data set of OOD examples, filtering out OOD examples from the data set of OOD examples, determining a difficulty value for each OOD example remaining within the filtered data set of the OOD examples, and generating augmented batches of utterances comprising utterances from the training set of utterances and utterances from the filtered data set of the OOD based on the difficulty value for each OOD. Thereafter, the machine-learning model is trained using the augmented batches of utterances in accordance with a curriculum training protocol.

Type: Application

Filed: October 28, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Thanh Long Duong, Mark Edward Johnson, Poorya Zaremoodi, Gautam Singaraju, Ying Xu, Vladislav Blinov, Yu-Heng Hong
ENHANCED LOGITS FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220171946

Abstract: Techniques for using enhanced logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system and inputting the utterance into a machine-learning model including a series of network layers. A final network layer of the series of network layers can include a logit function. The machine-learning model can map a first probability for a resolvable class to a first logit value using the logit function. The machine-learning model can map a second probability for a unresolvable class to an enhanced logit value. The method can also include the chatbot system classifying the utterance as the resolvable class or the unresolvable class based on the first logit value and the enhanced logit value.

Type: Application

Filed: November 29, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson
DISTANCE-BASED LOGIT VALUE FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220171947

Abstract: Techniques for using logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with the particular class.

Type: Application

Filed: November 30, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson
AUTOMATIC OUT OF SCOPE TRANSITION FOR CHATBOT

Publication number: 20220100961

Abstract: Techniques for automatically switching between chatbot skills in the same domain. In one particular aspect, a method is provided that includes receiving an utterance from a user within a chatbot session, where a current skill context is a first skill and a current group context is a first group, inputting the utterance into a candidate skills model for the first group, obtaining, using the candidate skills model, a ranking of skills within the first group, determining, based on the ranking of skills, a second skill is a highest ranked skill, changing the current skill context of the chatbot session to the second skill, inputting the utterance into a candidate flows model for the second skill, obtaining, using the candidate flows model, a ranking of intents within the second skill that match the utterance, and determining, based on the ranking of intents, an intent that is a highest ranked intent.

Type: Application

Filed: September 30, 2021

Publication date: March 31, 2022

Applicant: Oracle International Corporation

Inventors: Vishal Vishnoi, Xin Xu, Elias Luqman Jalaluddin, Srinivasa Phani Kumar Gadde, Crystal C. Pan, Mark Edward Johnson, Thanh Long Duong, Balakota Srinivas Vinnakota, Manish Parekh
ENTITY LEVEL DATA AUGMENTATION IN CHATBOTS FOR ROBUST NAMED ENTITY RECOGNITION

Publication number: 20210390951

Abstract: Techniques for data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes generating a list of values to cover for an entity, selecting utterances from a set of data that have context for the entity, converting the utterances into templates, where each template of the templates comprises a slot that maps to the list of values for the entity, selecting a template from the templates, selecting a value from the list of values based on the mapping between the slot within the selected template and the list of values for the entity; and creating an artificial utterance based on the selected template and the selected value, where the creating the artificial utterance comprises inserting the selected value into the slot of the selected template that maps to the list of values for the entity.

Type: Application

Filed: June 11, 2021

Publication date: December 16, 2021

Applicant: Oracle International Corporation

Inventors: Srinivasa Phani Kumar Gadde, Yuanxu Wu, Aashna Devang Kanuga, Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson
METHOD AND SYSTEM FOR CONSTRAINT BASED HYPERPARAMETER TUNING

Publication number: 20210304003

Abstract: Techniques are disclosed for tuning hyperparameters of a model. Datasets are obtained for training the model and metrics are selected for evaluating performance of the model. Each metric is assigned a weight specifying an importance to the performance of the model. A function is created that measures performance based on the weighted metrics. Hyperparameters are tuned to optimize the model performance. Tuning the hyperparameters includes: (i) training the model that is configured based on a current values for the hyperparameters; (ii) evaluating a performance of the model using the function; (iii) determining whether the model is optimized for the metrics; (iv) in response to the model not being optimized, searching for a new values for the hyperparameters, reconfiguring the model with the new values, and repeating steps (i)-(iii) using the reconfigured model; and (v) in response to the model being optimized for the metrics, providing a trained model.

Type: Application

Filed: March 29, 2021

Publication date: September 30, 2021

Applicant: Oracle International Corporation

Inventors: Mark Edward Johnson, Thanh Long Duong, Vishal Vishnoi, Balakota Srinivas Vinnakota, Tuyen Quang Pham, Cong Duy Vu Hoang
METHOD AND SYSTEM FOR TARGET BASED HYPER-PARAMETER TUNING

Publication number: 20210304074

Abstract: Techniques are disclosed for tuning hyperparameters of a machine-learning model. A plurality of metrics are selected for which hyperparameters of the machine-learning model are to be tuned. Each metric is associated with a plurality of specification parameters including a target score, a penalty factor, and a bonus factor. The plurality of specification parameters are configured for each metric in accordance with a first criterion. The machine-learning model is evaluated using one or more validation datasets to obtain a metric score. A weighted loss function is formulated based on a difference between the metric score and the target score of each metric, the penalty factor or the bonus factor. The hyperparameters associated with the machine-learning model are tuned in order to optimize the weighted loss function. In response to the weighted loss function being optimized, the machine-learning model is provided as a validated machine-learning model.

Type: Application

Filed: March 29, 2021

Publication date: September 30, 2021

Applicant: Oracle International Corporation

Inventors: Poorya Zaremoodi, Ying Xu, Thanh Tien Vu, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson, Xin Xu, Cong Duy Vu Hoang
TECHNIQUES FOR OUT-OF-DOMAIN (OOD) DETECTION

Publication number: 20210303798

Abstract: The present disclosure relates to techniques for identifying out-of-domain utterances.

Type: Application

Filed: March 30, 2021

Publication date: September 30, 2021

Applicant: Oracle International Corporation

Inventors: Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi, Crystal C. Pan, Vladislav Blinov, Cong Duy Vu Hoang, Elias Luqman Jalaluddin, Duy Vu, Balakota Srinivas Vinnakota
BATCHING TECHNIQUES FOR HANDLING UNBALANCED TRAINING DATA FOR A CHATBOT

Publication number: 20210304075

Abstract: The present disclosure relates to chatbot systems, and more particularly, to batching techniques for handling unbalanced training data when training a model such that bias is removed from the trained machine learning model when performing inference. In an embodiment, a plurality of raw utterances is obtained. A bias eliminating distribution is determined and a subset of the plurality of raw utterances is batched according to the bias-reducing distribution. The resulting unbiased training data may be input into a prediction model for training the prediction model. The trained prediction model may be obtained and utilized to predict unbiased results from new inputs received by the trained prediction model.

Type: Application

Filed: March 30, 2021

Publication date: September 30, 2021

Applicant: Oracle International Corporation

Inventors: Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi, Balakota Srinivas Vinnakota, Yu-Heng Hong, Elias Luqman Jalaluddin
NOISE DATA AUGMENTATION FOR NATURAL LANGUAGE PROCESSING

Publication number: 20210304733

Abstract: Techniques for noise data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training an intent classifier to identify one or more intents for one or more utterances; augmenting the training set of utterances with noise text to generate an augmented training set of utterances; and training the intent classifier using the augmented training set of utterances. The augmenting includes: obtaining the noise text from a list of words, a text corpus, a publication, a dictionary, or any combination thereof irrelevant of original text within the utterances of the training set of utterances, and incorporating the noise text within the utterances relative to the original text in the utterances of the training set of utterances at a predefined augmentation ratio to generate augmented utterances.

Type: Application

Filed: September 9, 2020

Publication date: September 30, 2021

Applicant: Oracle International Corporation

Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Yu-Heng Hong, Balakota Srinivas Vinnakota
STREAMLINING DIALOG PROCESSING USING INTEGRATED SHARED RESOURCES

Publication number: 20210081609

Abstract: Techniques for reducing memory and processing resources used by a dialog system by sharing resources between pipelined processes of the dialog system. An integrated shared dictionary is constructed for concurrent use by automated speech recognition (ASR) and natural language understanding (NLU) subsystems of the dialog system. The integrated shared dictionary comprises multiple entries, with each entry comprising first information that is used by the ASR subsystem, second information used by the NLU subsystem, and information correlating the first information and the second information. The ASR subsystem uses the integrated shared dictionary to identify a dictionary entry containing a set of words corresponding to speech input. The dictionary entry information is communicated to the NLU subsystem, which uses the entry to generate a meaning representation for the speech input.

Type: Application

Filed: July 13, 2020

Publication date: March 18, 2021

Applicant: Oracle International Corporation

Inventor: Mark Edward Johnson
TECHNIQUES FOR DIALOG PROCESSING USING CONTEXTUAL DATA

Publication number: 20210082414

Abstract: Techniques are described for using data stored for a user in association with context levels to improve the efficiency and accuracy of dialog processing tasks. A dialog system stores historical dialog data in association with a plurality of configured context levels. The dialog system receives an utterance and identifies a term for disambiguation from the utterance. Based on a determined context level, the dialog system identifies relevant historical data stored to a database. The historical data may be used to perform tasks such as resolving an ambiguity based on user preferences, disambiguating named entities based on a prior dialog, and identifying previously generated answers to queries. Based on the context level, the dialog system can efficiently identify the relevant information and use the identified information to provide a response.

Type: Application

Filed: August 26, 2020

Publication date: March 18, 2021

Applicant: Oracle International Corporation

Inventor: Mark Edward Johnson
STOP WORD DATA AUGMENTATION FOR NATURAL LANGUAGE PROCESSING

Publication number: 20210082400

Abstract: Techniques for stop word data augmentation for training chatbot systems in natural language processing. In one particular aspect, a computer-implemented method includes receiving a training set of utterances for training an intent classifier to identify one or more intents for one or more utterances; augmenting the training set of utterances with stop words to generate an augmented training set of out-of-domain utterances for an unresolved intent category corresponding to an unresolved intent; and training the intent classifier using the training set of utterances and the augmented training set of out-of-domain utterances. The augmenting includes: selecting one or more utterances from the training set of utterances, and for each selected utterance, preserving existing stop words within the utterance and replacing at least one non-stop word within the utterance with a stop word or stop word phrase selected from a list of stop words to generate an out-of-domain utterance.

Type: Application

Filed: September 9, 2020

Publication date: March 18, 2021

Applicant: Oracle International Corporation

Inventors: Vishal Vishnoi, Mark Edward Johnson, Elias Luqman Jalaluddin, Balakota Srinivas Vinnakota, Thanh Long Duong, Gautam Singaraju
COMPRESSING NEURAL NETWORKS FOR NATURAL LANGUAGE UNDERSTANDING

Publication number: 20210081799

Abstract: A model for a natural language understanding task is generated based on labeled data generated by a labeling model. The model for the natural language understanding task is smaller than the labeling model (i.e., with lower computational and memory requirements than the combined model), but with substantially the same performance as the labeling model. In some cases, the labeling model may be generated based on a large pre-trained model.

Type: Application

Filed: July 24, 2020

Publication date: March 18, 2021

Applicant: Oracle International Corporation

Inventor: Mark Edward Johnson

prev 1 2 3 4 5 next