Patents by Inventor Mark Edward Johnson

Mark Edward Johnson has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Techniques for dialog processing using contextual data

Patent number: 11551676

Abstract: Techniques are described for using data stored for a user in association with context levels to improve the efficiency and accuracy of dialog processing tasks. A dialog system stores historical dialog data in association with a plurality of configured context levels. The dialog system receives an utterance and identifies a term for disambiguation from the utterance. Based on a determined context level, the dialog system identifies relevant historical data stored to a database. The historical data may be used to perform tasks such as resolving an ambiguity based on user preferences, disambiguating named entities based on a prior dialog, and identifying previously generated answers to queries. Based on the context level, the dialog system can efficiently identify the relevant information and use the identified information to provide a response.

Type: Grant

Filed: August 26, 2020

Date of Patent: January 10, 2023

Assignee: Oracle International Corporation

Inventor: Mark Edward Johnson

Noise data augmentation for natural language processing

Patent number: 11538457

Abstract: Techniques for noise data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training an intent classifier to identify one or more intents for one or more utterances; augmenting the training set of utterances with noise text to generate an augmented training set of utterances; and training the intent classifier using the augmented training set of utterances. The augmenting includes: obtaining the noise text from a list of words, a text corpus, a publication, a dictionary, or any combination thereof that is irrelevant to the original text within the utterances of the training set of utterances, and incorporating the noise text within the utterances relative to the original text in the utterances of the training set of utterances at a predefined augmentation ratio to generate augmented utterances.

Type: Grant

Filed: September 9, 2020

Date of Patent: December 27, 2022

Assignee: ORACLE INTERNATIONAL CORPORATION

Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Yu-Heng Hong, Balakota Srinivas Vinnakota
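
The augmentation steps in the abstract of patent 11538457 can be illustrated with a minimal sketch. It is not the patented implementation. The function name `augment_with_noise`, the parameters `ratio` and `noise_len`, the choice to place noise before or after the utterance, and the sample data are all assumptions made for illustration; in particular, the abstract does not define how the augmentation ratio is computed.

```python
import random


def augment_with_noise(examples, noise_words, ratio=0.5, noise_len=3, seed=0):
    """Return the original (text, intent) pairs plus noisy copies of a sample of them.

    `ratio` is read here as augmented examples per original example
    (an assumption; the abstract does not define the augmentation ratio).
    """
    rng = random.Random(seed)
    n_aug = min(len(examples), int(len(examples) * ratio))
    augmented = []
    for text, intent in rng.sample(examples, n_aug):
        # Noise is drawn without looking at the utterance, so it is unrelated to it.
        noise = " ".join(rng.choice(noise_words) for _ in range(noise_len))
        # Place the noise before or after the original text; the intent label is unchanged.
        noisy = f"{noise} {text}" if rng.random() < 0.5 else f"{text} {noise}"
        augmented.append((noisy, intent))
    return list(examples) + augmented


# Hypothetical training data and noise vocabulary, for illustration only.
train = [
    ("reset my password", "account.reset"),
    ("what is my balance", "account.balance"),
    ("cancel my order", "order.cancel"),
    ("track my package", "order.track"),
]
noise_vocab = ["lorem", "ipsum", "quartz", "violet", "seven"]
augmented_train = augment_with_noise(train, noise_vocab, ratio=0.5)
```

The augmented set would then be passed to whatever intent classifier is being trained; the original examples are kept so the classifier still sees clean utterances.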

Reduced training for dialog systems using a database

Patent number: 11514911

Abstract: Techniques are described for training and executing a machine learning model using data derived from a database. A dialog system uses data from the database to generate related training data for natural language understanding applications. The generated training data is then used to train a machine learning model. This enables the dialog system to leverage a large amount of available data to speed up the training process as compared to conventional labeling techniques. The dialog system uses the trained machine learning model to identify a named entity from a received spoken utterance and generate and output a speech response based upon the identified named entity.

Type: Grant

Filed: August 3, 2020

Date of Patent: November 29, 2022

Assignee: Oracle International Corporation

Inventors: Mark Edward Johnson, Michael Rye Kennewick

Using backpropagation to train a dialog system

Patent number: 11508359

Abstract: Techniques described herein use backpropagation to train one or more machine learning (ML) models of a dialog system. For instance, a method includes accessing seed data that includes training tuples, where each training tuple comprises a respective logical form. The method includes converting the logical form of a training tuple to a converted logical form, by applying to the logical form a text-to-speech (TTS) subsystem, an automatic speech recognition (ASR) subsystem, and a semantic parser of a dialog system. The method includes determining a training signal by using an objective function to compare the converted logical form to the logical form. The method further includes training the TTS subsystem, the ASR subsystem, and the semantic parser via backpropagation based on the training signal. As a result of the training by backpropagation, the machine learning models are tuned to work effectively together within a pipeline of the dialog system.

Type: Grant

Filed: August 25, 2020

Date of Patent: November 22, 2022

Assignee: ORACLE INTERNATIONAL CORPORATION

Inventors: Thanh Long Duong, Mark Edward Johnson

Semantic parser including a coarse semantic parser and a fine semantic parser

Patent number: 11501065

Abstract: Techniques for improving a semantic parser of a dialog system, by breaking the semantic parser into a coarse semantic parser and a fine semantic parser, are described. A method described herein includes accessing an utterance received in a dialog system. The utterance is a text-based natural language expression. The method further includes applying a coarse semantic parser to the utterance to determine an intermediate logical form for the utterance. The intermediate logical form indicates one or more intents in the utterance. The method further includes applying a fine semantic parser to the intermediate logical form to determine a logical form for the utterance. The logical form is a syntactic expression of the utterance according to an established grammar, and the logical form includes one or more parameters of the one or more intents. The logical form can be used to conduct a dialog with a user of the dialog system.

Type: Grant

Filed: August 13, 2020

Date of Patent: November 15, 2022

Assignee: ORACLE INTERNATIONAL CORPORATION

Inventors: Thanh Long Duong, Mark Edward Johnson

Implementing a correction model to reduce propagation of automatic speech recognition errors

Patent number: 11462208

Abstract: Some techniques described herein determine a correction model for a dialog system, such that the correction model corrects output from an automatic speech recognition (ASR) subsystem in the dialog system. A method described herein includes accessing training data. A first tuple of the training data includes an utterance, where the utterance is a textual representation of speech. The method further includes using an ASR subsystem of a dialog system to convert the utterance to an output utterance. The method further includes storing the output utterance in corrective training data that is based on the training data. The method further includes training a correction model based on the corrective training data, such that the correction model is configured to correct output from the ASR subsystem during operation of the dialog system.

Type: Grant

Filed: August 13, 2020

Date of Patent: October 4, 2022

Assignee: ORACLE INTERNATIONAL CORPORATION

Inventors: Thanh Long Duong, Mark Edward Johnson

Streamlining dialog processing using integrated shared resources

Patent number: 11403462

Abstract: Techniques for reducing memory and processing resources used by a dialog system by sharing resources between pipelined processes of the dialog system. An integrated shared dictionary is constructed for concurrent use by automated speech recognition (ASR) and natural language understanding (NLU) subsystems of the dialog system. The integrated shared dictionary comprises multiple entries, with each entry comprising first information that is used by the ASR subsystem, second information used by the NLU subsystem, and information correlating the first information and the second information. The ASR subsystem uses the integrated shared dictionary to identify a dictionary entry containing a set of words corresponding to speech input. The dictionary entry information is communicated to the NLU subsystem, which uses the entry to generate a meaning representation for the speech input.

Type: Grant

Filed: July 13, 2020

Date of Patent: August 2, 2022

Assignee: Oracle International Corporation

Inventor: Mark Edward Johnson

MULTI-FACTOR MODELLING FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220230000

Abstract: Techniques are disclosed for multi-factor modelling for training and utilizing chatbot systems for natural language processing. In an embodiment, a method includes receiving a set of utterance data corresponding to a natural language-based query, determining one or more intents for the chatbot, each intent corresponding to a possible context for the natural language-based query and associated with a skill for the chatbot, generating one or more intent classification datasets, each intent classification dataset associated with a probability that the natural language-based query corresponds to an intent of the one or more intents, generating one or more transformed datasets each corresponding to a skill of one or more skills, determining a first skill of the one or more skills based on the one or more transformed datasets, and processing, based on the determined first skill, the set of utterance data to resolve the natural language-based query.

Type: Application

Filed: January 18, 2022

Publication date: July 21, 2022

Applicant: Oracle International Corporation

Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson, Thanh Long Duong, Ying Xu

MULTI-FEATURE BALANCING FOR NATURAL LANGUAGE PROCESSORS

Publication number: 20220229991

Abstract: Techniques are disclosed for multi-feature balancing for natural language processors. In an embodiment, a method includes receiving a natural language query to be processed by a machine learning model, the machine learning model utilizing a dataset of natural language phrases for processing natural language queries, determining, based on the machine learning model and the natural language query, a feature dropout value, generating, based on the natural language query, one or more contextual features and one or more expressional features that may be input to the machine learning model, modifying at least one of the one or more contextual features and the one or more expressional features based on the feature dropout value to generate a set of input features for the machine learning model, and processing the set of input features to generate an output dataset corresponding to the natural language query.

Type: Application

Filed: January 20, 2022

Publication date: July 21, 2022

Applicant: Oracle International Corporation

Inventors: Thanh Long Duong, Vishal Vishnoi, Mark Edward Johnson, Elias Luqman Jalaluddin, Tuyen Quang Pham, Cong Duy Vu Hoang, Poorya Zaremoodi, Srinivasa Phani Kumar Gadde, Aashna Devang Kanuga, Zikai Li, Yuanxu Wu

CONTEXT TAG INTEGRATION WITH NAMED ENTITY RECOGNITION MODELS

Publication number: 20220229993

Abstract: Techniques are provided for using context tags in named-entity recognition (NER) models. In one particular aspect, a method is provided that includes receiving an utterance, generating embeddings for words of the utterance, generating a regular expression and gazetteer feature vector for the utterance, generating a context tag distribution feature vector for the utterance, concatenating or interpolating the embeddings with the regular expression and gazetteer feature vector and the context tag distribution feature vector to generate a set of feature vectors, generating an encoded form of the utterance based on the set of feature vectors, generating log-probabilities based on the encoded form of the utterance, and identifying one or more constraints for the utterance.

Type: Application

Filed: January 19, 2022

Publication date: July 21, 2022

Applicant: Oracle International Corporation

Inventors: Duy Vu, Tuyen Quang Pham, Cong Duy Vu Hoang, Srinivasa Phani Kumar Gadde, Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi

KEYWORD DATA AUGMENTATION TOOL FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220171930

Abstract: Techniques for keyword data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training a machine-learning model to identify one or more intents for one or more utterances, and augmenting the training set of utterances with out-of-domain (OOD) examples. The augmenting includes: identifying keywords within utterances of the training set of utterances, generating a set of OOD examples with the identified keywords, filtering out OOD examples from the set of OOD examples that have a context substantially similar to the context of the utterances of the training set of utterances, and incorporating the set of OOD examples without the filtered OOD examples into the training set of utterances to generate an augmented training set of utterances. Thereafter, the machine-learning model is trained using the augmented training set of utterances.

Type: Application

Filed: October 28, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Thanh Long Duong, Mark Edward Johnson, Poorya Zaremoodi, Gautam Singaraju, Ying Xu, Vladislav Blinov

METHOD AND SYSTEM FOR OVER-PREDICTION IN NEURAL NETWORKS

Publication number: 20220172021

Abstract: Disclosed herein are techniques for addressing an overconfidence problem associated with machine learning models in chatbot systems. For each layer of a plurality of layers of a machine learning model, a distribution of confidence scores is generated for a plurality of predictions with respect to an input utterance. A prediction is determined for each layer of the machine learning model based on the distribution of confidence scores generated for the layer. Based on the predictions, an overall prediction of the machine learning model is determined. A subset of the plurality of layers is iteratively processed to identify a layer whose assigned prediction satisfies a criterion. A confidence score associated with the assigned prediction of the layer of the machine learning model is assigned as an overall confidence score to be associated with the overall prediction of the machine learning model.

Type: Application

Filed: November 16, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Cong Duy Vu Hoang, Thanh Tien Vu, Poorya Zaremoodi, Ying Xu, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson

OUT-OF-DOMAIN DATA AUGMENTATION FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220171938

Abstract: Techniques for out-of-domain data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training a machine-learning model to identify one or more intents for one or more utterances, and augmenting the training set of utterances with out-of-domain (OOD) examples. The augmenting includes: generating a data set of OOD examples, filtering out OOD examples from the data set of OOD examples, determining a difficulty value for each OOD example remaining within the filtered data set of the OOD examples, and generating augmented batches of utterances comprising utterances from the training set of utterances and utterances from the filtered data set of the OOD examples based on the difficulty value for each OOD example. Thereafter, the machine-learning model is trained using the augmented batches of utterances in accordance with a curriculum training protocol.

Type: Application

Filed: October 28, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Thanh Long Duong, Mark Edward Johnson, Poorya Zaremoodi, Gautam Singaraju, Ying Xu, Vladislav Blinov, Yu-Heng Hong

ENHANCED LOGITS FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220171946

Abstract: Techniques for using enhanced logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system and inputting the utterance into a machine-learning model including a series of network layers. A final network layer of the series of network layers can include a logit function. The machine-learning model can map a first probability for a resolvable class to a first logit value using the logit function. The machine-learning model can map a second probability for an unresolvable class to an enhanced logit value. The method can also include the chatbot system classifying the utterance as the resolvable class or the unresolvable class based on the first logit value and the enhanced logit value.

Type: Application

Filed: November 29, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson

DISTANCE-BASED LOGIT VALUE FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220171947

Abstract: Techniques for using logit values for classifying utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with a particular class.

Type: Application

Filed: November 30, 2021

Publication date: June 2, 2022

Applicant: Oracle International Corporation

Inventors: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson

AUTOMATIC OUT OF SCOPE TRANSITION FOR CHATBOT

Publication number: 20220100961

Abstract: Techniques for automatically switching between chatbot skills in the same domain. In one particular aspect, a method is provided that includes receiving an utterance from a user within a chatbot session, where a current skill context is a first skill and a current group context is a first group, inputting the utterance into a candidate skills model for the first group, obtaining, using the candidate skills model, a ranking of skills within the first group, determining, based on the ranking of skills, that a second skill is a highest ranked skill, changing the current skill context of the chatbot session to the second skill, inputting the utterance into a candidate flows model for the second skill, obtaining, using the candidate flows model, a ranking of intents within the second skill that match the utterance, and determining, based on the ranking of intents, an intent that is a highest ranked intent.

Type: Application

Filed: September 30, 2021

Publication date: March 31, 2022

Applicant: Oracle International Corporation

Inventors: Vishal Vishnoi, Xin Xu, Elias Luqman Jalaluddin, Srinivasa Phani Kumar Gadde, Crystal C. Pan, Mark Edward Johnson, Thanh Long Duong, Balakota Srinivas Vinnakota, Manish Parekh

ENTITY LEVEL DATA AUGMENTATION IN CHATBOTS FOR ROBUST NAMED ENTITY RECOGNITION

Publication number: 20210390951

Abstract: Techniques for data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes generating a list of values to cover for an entity, selecting utterances from a set of data that have context for the entity, converting the utterances into templates, where each template of the templates comprises a slot that maps to the list of values for the entity, selecting a template from the templates, selecting a value from the list of values based on the mapping between the slot within the selected template and the list of values for the entity; and creating an artificial utterance based on the selected template and the selected value, where the creating the artificial utterance comprises inserting the selected value into the slot of the selected template that maps to the list of values for the entity.

Type: Application

Filed: June 11, 2021

Publication date: December 16, 2021

Applicant: Oracle International Corporation

Inventors: Srinivasa Phani Kumar Gadde, Yuanxu Wu, Aashna Devang Kanuga, Elias Luqman Jalaluddin, Vishal Vishnoi, Mark Edward Johnson
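
The template-and-slot procedure in the abstract of publication 20210390951 can be sketched as follows. This is an illustrative reading, not the applicants' implementation. The slot name `{CITY}`, the helper names `to_template` and `generate_utterances`, the regular-expression matching used to find entity mentions, and the sample utterances are assumptions.

```python
import random
import re

SLOT = "{CITY}"  # hypothetical slot name that maps to the entity's value list


def to_template(utterance, values):
    """Replace the first mention of a known entity value with the slot, or return None."""
    # Try longer values first so "New York" is not shadowed by a shorter value.
    for value in sorted(values, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(value) + r"\b", re.IGNORECASE)
        if pattern.search(utterance):
            return pattern.sub(SLOT, utterance, count=1)
    return None  # the utterance has no context for this entity


def generate_utterances(utterances, values, n, seed=0):
    """Create n artificial utterances by filling template slots with entity values."""
    rng = random.Random(seed)
    # Keep only utterances that mention the entity, converted into templates.
    templates = [t for t in (to_template(u, values) for u in utterances) if t]
    if not templates:
        return []
    # Select a template, select a value, and insert the value into the slot.
    return [rng.choice(templates).replace(SLOT, rng.choice(values)) for _ in range(n)]


# Hypothetical entity values and seed utterances, for illustration only.
city_values = ["Paris", "Tokyo", "New York"]
seed_utterances = [
    "book a flight to Paris",
    "what is the weather",
    "hotels near tokyo please",
]
artificial = generate_utterances(seed_utterances, city_values, 5)
```

Each generated utterance carries a known entity span, so it can serve as labeled training data for a named entity recognizer without manual annotation.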

METHOD AND SYSTEM FOR CONSTRAINT BASED HYPERPARAMETER TUNING

Publication number: 20210304003

Abstract: Techniques are disclosed for tuning hyperparameters of a model. Datasets are obtained for training the model and metrics are selected for evaluating performance of the model. Each metric is assigned a weight specifying an importance to the performance of the model. A function is created that measures performance based on the weighted metrics. Hyperparameters are tuned to optimize the model performance. Tuning the hyperparameters includes: (i) training the model that is configured based on current values for the hyperparameters; (ii) evaluating a performance of the model using the function; (iii) determining whether the model is optimized for the metrics; (iv) in response to the model not being optimized, searching for new values for the hyperparameters, reconfiguring the model with the new values, and repeating steps (i)-(iii) using the reconfigured model; and (v) in response to the model being optimized for the metrics, providing a trained model.

Type: Application

Filed: March 29, 2021

Publication date: September 30, 2021

Applicant: Oracle International Corporation

Inventors: Mark Edward Johnson, Thanh Long Duong, Vishal Vishnoi, Balakota Srinivas Vinnakota, Tuyen Quang Pham, Cong Duy Vu Hoang
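
Steps (i) through (v) in the abstract of publication 20210304003 can be sketched as a simple loop. The abstract does not say how new hyperparameter values are searched for or how "optimized" is decided, so this sketch substitutes random search and a fixed score threshold; both are assumptions. The names `weighted_score`, `tune`, and `toy_train_and_evaluate`, and the toy formulas standing in for real training, are hypothetical.

```python
import random


def weighted_score(metrics, weights):
    """Combine metric values into one performance number using per-metric weights."""
    return sum(weights[name] * metrics[name] for name in weights)


def tune(train_and_evaluate, search_space, weights, threshold, max_trials=50, seed=0):
    """Random-search stand-in for the tuning loop; returns (best_score, best_params)."""
    rng = random.Random(seed)
    best = None
    for _ in range(max_trials):
        # (iv) pick new hyperparameter values (random search is an assumption)
        params = {name: rng.choice(options) for name, options in search_space.items()}
        # (i) train with the current values, (ii) evaluate with the weighted function
        score = weighted_score(train_and_evaluate(params), weights)
        if best is None or score > best[0]:
            best = (score, params)
        # (iii) treat "score reaches the threshold" as the optimized condition (assumption)
        if score >= threshold:
            break
    # (v) the caller would keep the model trained with the best parameters
    return best


def toy_train_and_evaluate(params):
    """Toy stand-in for training: made-up formulas, not measurements from any model."""
    accuracy = 1.0 - abs(params["learning_rate"] - 0.01) * 10
    speed = 1.0 / params["layers"]
    return {"accuracy": accuracy, "speed": speed}


space = {"learning_rate": [0.001, 0.01, 0.1], "layers": [1, 2, 4]}
metric_weights = {"accuracy": 0.8, "speed": 0.2}
best_score, best_params = tune(toy_train_and_evaluate, space, metric_weights, threshold=0.95)
```

Raising a metric's weight makes the search favor configurations that do well on that metric, which is the role the abstract assigns to the per-metric weights.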

METHOD AND SYSTEM FOR TARGET BASED HYPER-PARAMETER TUNING

Publication number: 20210304074

Abstract: Techniques are disclosed for tuning hyperparameters of a machine-learning model. A plurality of metrics are selected for which hyperparameters of the machine-learning model are to be tuned. Each metric is associated with a plurality of specification parameters including a target score, a penalty factor, and a bonus factor. The plurality of specification parameters are configured for each metric in accordance with a first criterion. The machine-learning model is evaluated using one or more validation datasets to obtain a metric score. A weighted loss function is formulated based on a difference between the metric score and the target score of each metric, and on the penalty factor or the bonus factor. The hyperparameters associated with the machine-learning model are tuned in order to optimize the weighted loss function. In response to the weighted loss function being optimized, the machine-learning model is provided as a validated machine-learning model.

Type: Application

Filed: March 29, 2021

Publication date: September 30, 2021

Applicant: Oracle International Corporation

Inventors: Poorya Zaremoodi, Ying Xu, Thanh Tien Vu, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson, Xin Xu, Cong Duy Vu Hoang
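
One possible form of the weighted loss described in the abstract of publication 20210304074 is sketched below. The abstract names the ingredients (target score, penalty factor, bonus factor, metric score) but not the formula, so the rule used here is an assumption: a shortfall below the target adds penalty times the gap, and a surplus above the target subtracts bonus times the surplus. The function name and the sample numbers are hypothetical.

```python
def weighted_loss(specs, scores):
    """Sum per-metric terms: penalize shortfalls below target, reward surpluses above it."""
    loss = 0.0
    for name, spec in specs.items():
        gap = spec["target"] - scores[name]
        if gap > 0:
            # Metric score is below its target: add a penalty proportional to the shortfall.
            loss += spec["penalty"] * gap
        else:
            # Metric score meets or exceeds its target: subtract a bonus for the surplus.
            loss -= spec["bonus"] * (-gap)
    return loss


# Hypothetical specification parameters and validation scores, for illustration only.
metric_specs = {
    "accuracy": {"target": 0.9, "penalty": 2.0, "bonus": 0.5},
    "recall": {"target": 0.8, "penalty": 1.0, "bonus": 0.1},
}
validation_scores = {"accuracy": 0.85, "recall": 0.9}
loss = weighted_loss(metric_specs, validation_scores)
```

A tuner would then choose the hyperparameters that minimize this loss. Setting the penalty factor much larger than the bonus factor makes missing a target cost more than exceeding another target can recover.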

TECHNIQUES FOR OUT-OF-DOMAIN (OOD) DETECTION

Publication number: 20210303798

Abstract: The present disclosure relates to techniques for identifying out-of-domain utterances.

Type: Application

Filed: March 30, 2021

Publication date: September 30, 2021

Applicant: Oracle International Corporation

Inventors: Thanh Long Duong, Mark Edward Johnson, Vishal Vishnoi, Crystal C. Pan, Vladislav Blinov, Cong Duy Vu Hoang, Elias Luqman Jalaluddin, Duy Vu, Balakota Srinivas Vinnakota
