Patents by Inventor Jonathan K. Kummerfeld

Jonathan K. Kummerfeld has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systems and methods for slot relation extraction for machine learning task-oriented dialogue systems

Patent number: 11734519

Abstract: A system and method for implementing slot-relation extraction for a task-oriented dialogue system that includes implementing dialogue intent classification machine learning models that predict a category of dialogue of a single utterance based on an input of utterance data relating to the single utterance, wherein the category of dialogue informs a selection of slot-filling machine learning models; implementing the slot-filling machine learning models that predict slot classification labels for each of a plurality of slots within the utterance based on the input of the utterance data; implementing a slot relation extraction machine learning model that predicts semantic relationship classifications between two or more distinct slots of tokens of the utterance; and generating a response to the single utterance or performing actions in response to the single utterance based on the semantic relationship classifications between the distinct pairings of the two or more distinct slots of the single utterance.

Type: Grant

Filed: February 10, 2021

Date of Patent: August 22, 2023

Assignee: Clinc, Inc.

Inventors: Andrew Lee, Zhenguo Chen, Jonathan K. Kummerfeld
SYSTEMS AND METHODS FOR SLOT RELATION EXTRACTION FOR MACHINE LEARNING TASK-ORIENTED DIALOGUE SYSTEMS

Publication number: 20210192146

Abstract: A system and method for implementing slot-relation extraction for a task-oriented dialogue system that includes implementing dialogue intent classification machine learning models that predict a category of dialogue of a single utterance based on an input of utterance data relating to the single utterance, wherein the category of dialogue informs a selection of slot-filling machine learning models; implementing the slot-filling machine learning models that predict slot classification labels for each of a plurality of slots within the utterance based on the input of the utterance data; implementing a slot relation extraction machine learning model that predicts semantic relationship classifications between two or more distinct slots of tokens of the utterance; and generating a response to the single utterance or performing actions in response to the single utterance based on the semantic relationship classifications between the distinct pairings of the two or more distinct slots of the single utterance.

Type: Application

Filed: February 10, 2021

Publication date: June 24, 2021

Inventors: Andrew Lee, Zhenguo Chen, Jonathan K. Kummerfeld
Systems and methods for mixed setting training for slot filling machine learning tasks in a machine learning task-oriented dialogue system

Patent number: 11043208

Abstract: Systems and methods for intelligently training a subject machine learning model includes identifying new observations comprising a plurality of distinct samples unseen by a target model during a prior training; creating an incremental training corpus based on randomly sampling a collection of training data samples that includes a plurality of new observations and a plurality of historical training data samples used in the prior training of the target model; implementing a first training mode that includes an incremental training of the target model using samples from the incremental training corpus as model training input; computing performance metrics of the target model based on the incremental training; evaluating the performance metrics of the target model against training mode thresholds; and selectively choosing based on the evaluation one of maintaining the first training mode and automatically switching to a second training mode that includes a full retraining of the target model.

Type: Grant

Filed: February 19, 2021

Date of Patent: June 22, 2021

Assignee: Clinc, Inc.

Inventors: Daniel C. Michelin, Jonathan K. Kummerfeld, Kevin Leach, Stefan Larson, Joseph J. Peper, Yunqi Zhang
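Illustrative sketch (not taken from the patent): the abstract of patent 11043208 above describes sampling an incremental corpus from new and historical data, then either staying in incremental training or switching to a full retrain after comparing performance metrics to thresholds. A minimal Python sketch of those two steps might look like the following. The function names, the `new_fraction` parameter, the fixed seed, and the "every metric must meet its minimum" rule are all assumptions made for illustration.

```python
import random


def build_incremental_corpus(new_obs, historical, size, new_fraction=0.5, seed=0):
    """Randomly sample a corpus that mixes new observations with historical samples.

    new_fraction is a hypothetical knob; the abstract does not state a mixing ratio.
    """
    rng = random.Random(seed)
    n_new = min(len(new_obs), max(1, round(size * new_fraction)))
    n_hist = min(len(historical), size - n_new)
    corpus = rng.sample(list(new_obs), n_new) + rng.sample(list(historical), n_hist)
    rng.shuffle(corpus)
    return corpus


def choose_training_mode(metrics, thresholds):
    """Keep incremental training if every metric meets its minimum, else retrain fully.

    The all-metrics-must-pass rule is an assumption, not the patented criterion.
    """
    passed = all(metrics.get(name, 0.0) >= minimum for name, minimum in thresholds.items())
    return "incremental" if passed else "full_retrain"
```

For example, `choose_training_mode({"accuracy": 0.85}, {"accuracy": 0.90})` selects a full retrain under these assumptions, because the measured accuracy falls below the configured minimum.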
SYSTEMS AND METHODS FOR AUTOMATICALLY DETECTING AND REPAIRING SLOT ERRORS IN MACHINE LEARNING TRAINING DATA FOR A MACHINE LEARNING-BASED DIALOGUE SYSTEM

Publication number: 20210166138

Abstract: Systems and methods for automatically detecting annotation discrepancies in annotated training data samples and repairing the annotated training data samples for a machine learning-based automated dialogue system include evaluating a corpus of a plurality of distinct training data samples; identifying one or more of a slot span defect and a slot label defect of a target annotated slot span of a target training data sample of the corpus based on the evaluation; and automatically correcting one or more annotations of the target annotated slot span based on the identified one or more of the slot span defect and the slot label defect.

Type: Application

Filed: January 15, 2021

Publication date: June 3, 2021

Inventors: Stefan Larson, Anish Mahendran, Parker Hill, Jonathan K. Kummerfeld, Michael A. Laurenzano, Lingjia Tang, Jason Mars
SYSTEMS AND METHODS FOR SLOT RELATION EXTRACTION FOR MACHINE LEARNING TASK-ORIENTED DIALOGUE SYSTEMS

Publication number: 20210117629

Abstract: A system and method for implementing slot-relation extraction for a task-oriented dialogue system that includes implementing dialogue intent classification machine learning models that predict a category of dialogue of a single utterance based on an input of utterance data relating to the single utterance, wherein the category of dialogue informs a selection of slot-filling machine learning models; implementing the slot-filling machine learning models that predict slot classification labels for each of a plurality of slots within the utterance based on the input of the utterance data; implementing a slot relation extraction machine learning model that predicts semantic relationship classifications between two or more distinct slots of tokens of the utterance; and generating a response to the single utterance or performing actions in response to the single utterance based on the semantic relationship classifications between the distinct pairings of the two or more distinct slots of the single utterance.

Type: Application

Filed: September 8, 2020

Publication date: April 22, 2021

Inventors: Andrew Lee, Zhenguo Chen, Jonathan K. Kummerfeld
Systems and methods for slot relation extraction for machine learning task-oriented dialogue systems

Patent number: 10970493

Abstract: A system and method for implementing slot-relation extraction for a task-oriented dialogue system that includes implementing dialogue intent classification machine learning models that predict a category of dialogue of a single utterance based on an input of utterance data relating to the single utterance, wherein the category of dialogue informs a selection of slot-filling machine learning models; implementing the slot-filling machine learning models that predict slot classification labels for each of a plurality of slots within the utterance based on the input of the utterance data; implementing a slot relation extraction machine learning model that predicts semantic relationship classifications between two or more distinct slots of tokens of the utterance; and generating a response to the single utterance or performing actions in response to the single utterance based on the semantic relationship classifications between the distinct pairings of the two or more distinct slots of the single utterance.

Type: Grant

Filed: September 8, 2020

Date of Patent: April 6, 2021

Assignee: Clinc, Inc.

Inventors: Andrew Lee, Zhenguo Chen, Jonathan K. Kummerfeld
Systems and methods for automatically detecting and repairing slot errors in machine learning training data for a machine learning-based dialogue system

Patent number: 10929761

Abstract: Systems and methods for automatically detecting annotation discrepancies in annotated training data samples and repairing the annotated training data samples for a machine learning-based automated dialogue system include evaluating a corpus of a plurality of distinct training data samples; identifying one or more of a slot span defect and a slot label defect of a target annotated slot span of a target training data sample of the corpus based on the evaluation; and automatically correcting one or more annotations of the target annotated slot span based on the identified one or more of the slot span defect and the slot label defect.

Type: Grant

Filed: June 8, 2020

Date of Patent: February 23, 2021

Assignee: Clinc, Inc.

Inventors: Stefan Larson, Anish Mahendran, Parker Hill, Jonathan K. Kummerfeld, Michael A. Laurenzano, Lingjia Tang, Jason Mars
SYSTEMS AND METHODS FOR CONSTRUCTING AN ARTIFICIALLY DIVERSE CORPUS OF TRAINING DATA SAMPLES FOR TRAINING A CONTEXTUALLY-BIASED MODEL FOR A MACHINE LEARNING-BASED DIALOGUE SYSTEM

Publication number: 20210004539

Abstract: Systems and methods for constructing an artificially diverse corpus of training data includes evaluating a corpus of utterance-based training data samples, identifying a slot replacement candidate; deriving distinct skeleton utterances that include the slot replacement candidate, wherein deriving the distinct skeleton utterances includes replacing slots of each of the plurality of distinct utterance training samples with one of a special token and proper slot classification labels; selecting a subset of the distinct skeleton utterances; converting each of the distinct skeleton utterances of the subset back to distinct utterance training samples while still maintaining the special token at a position of the slot replacement candidate; altering a percentage of the distinct utterance training samples with a distinct randomly-generated slot token value at the position of the slot replacement candidate; and constructing the artificially diverse corpus of training samples based on a collection of the percentage of

Type: Application

Filed: September 1, 2020

Publication date: January 7, 2021

Inventors: Andrew Lee, Stefan Larson, Christopher Clarke, Kevin Leach, Jonathan K. Kummerfeld, Parker Hill, Johann Hauswald, Michael A. Laurenzano, Lingjia Tang, Jason Mars
SYSTEMS AND METHODS FOR AUTOMATICALLY DETECTING AND REPAIRING SLOT ERRORS IN MACHINE LEARNING TRAINING DATA FOR A MACHINE LEARNING-BASED DIALOGUE SYSTEM

Publication number: 20200401914

Abstract: Systems and methods for automatically detecting annotation discrepancies in annotated training data samples and repairing the annotated training data samples for a machine learning-based automated dialogue system include evaluating a corpus of a plurality of distinct training data samples; identifying one or more of a slot span defect and a slot label defect of a target annotated slot span of a target training data sample of the corpus based on the evaluation; and automatically correcting one or more annotations of the target annotated slot span based on the identified one or more of the slot span defect and the slot label defect.

Type: Application

Filed: June 8, 2020

Publication date: December 24, 2020

Inventors: Stefan Larson, Anish Mahendran, Parker Hill, Jonathan K. Kummerfeld, Michael A. Laurenzano, Lingjia Tang, Jason Mars
Systems and methods for machine learning-based multi-intent segmentation and classification

Patent number: 10824818

Abstract: Systems and methods for synthesizing training data for multi-intent utterance segmentation include identifying a first corpus of utterances comprising a plurality of distinct single-intent in-domain utterances; identifying a second corpus of utterances comprising a plurality of distinct single-intent out-of-domain utterances; identifying a third corpus comprising a plurality of distinct conjunction terms; forming a multi-intent training corpus comprising synthetic multi-intent utterances, wherein forming each distinct multi-intent utterance includes: selecting a first distinct in-domain utterance from the first corpus of utterances; probabilistically selecting one of a first out-of-domain utterance from the second corpus and a second in-domain utterance from the first corpus; probabilistically selecting or not selecting a distinct conjunction term from the third corpus; and forming a synthetic multi-intent utterance including appending the first in-domain utterance with one of the first out-of-domain utteranc

Type: Grant

Filed: April 21, 2020

Date of Patent: November 3, 2020

Assignee: Clinc, Inc.

Inventors: Joseph Peper, Parker Hill, Kevin Leach, Sean Stapleton, Jonathan K. Kummerfeld, Johann Hauswald, Michael Laurenzano, Lingjia Tang, Jason Mars
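Illustrative sketch (not taken from the patent): the abstract of patent 10824818 above describes forming synthetic multi-intent utterances from an in-domain corpus, an out-of-domain corpus, and a corpus of conjunction terms. The Python below is one way those selection steps could be arranged. The function name, the probability parameters, and the placement of the conjunction between the two utterances are assumptions, since the listed abstract is cut off before it finishes describing the final step.

```python
import random


def synthesize_multi_intent(in_domain, out_of_domain, conjunctions,
                            p_out_of_domain=0.3, p_conjunction=0.7, rng=None):
    """Build one synthetic multi-intent utterance from single-intent utterances.

    p_out_of_domain and p_conjunction are hypothetical probabilities.
    """
    rng = rng or random.Random()
    # Step 1: always start from an in-domain utterance.
    first = rng.choice(in_domain)
    # Step 2: probabilistically pick an out-of-domain or a second in-domain utterance.
    if rng.random() < p_out_of_domain:
        second = rng.choice(out_of_domain)
    else:
        second = rng.choice([u for u in in_domain if u != first] or in_domain)
    # Step 3: probabilistically include a conjunction term (assumed to sit between them).
    parts = [first]
    if rng.random() < p_conjunction:
        parts.append(rng.choice(conjunctions))
    parts.append(second)
    return " ".join(parts)
```

Calling the function repeatedly with different random states would yield a training corpus in which some utterances are joined by a conjunction and others are simply concatenated.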
Systems and methods for constructing an artificially diverse corpus of training data samples for training a contextually-biased model for a machine learning-based dialogue system

Patent number: 10796104

Abstract: Systems and methods for constructing an artificially diverse corpus of training data includes evaluating a corpus of utterance-based training data samples, identifying a slot replacement candidate; deriving distinct skeleton utterances that include the slot replacement candidate, wherein deriving the distinct skeleton utterances includes replacing slots of each of the plurality of distinct utterance training samples with one of a special token and proper slot classification labels; selecting a subset of the distinct skeleton utterances; converting each of the distinct skeleton utterances of the subset back to distinct utterance training samples while still maintaining the special token at a position of the slot replacement candidate; altering a percentage of the distinct utterance training samples with a distinct randomly-generated slot token value at the position of the slot replacement candidate; and constructing the artificially diverse corpus of training samples based on a collection of the percentage of

Type: Grant

Filed: June 22, 2020

Date of Patent: October 6, 2020

Assignee: Clinc, Inc.

Inventors: Andrew Lee, Stefan Larson, Christopher Clarke, Kevin Leach, Jonathan K. Kummerfeld, Parker Hill, Johann Hauswald, Michael A. Laurenzano, Lingjia Tang, Jason Mars
SYSTEMS AND METHODS FOR INTELLIGENTLY CURATING MACHINE LEARNING TRAINING DATA AND IMPROVING MACHINE LEARNING MODEL PERFORMANCE

Publication number: 20200272855

Abstract: Systems and methods of intelligent formation and acquisition of machine learning training data for implementing an artificially intelligent dialogue system includes constructing a corpora of machine learning test corpus that comprise a plurality of historical queries and commands sampled from production logs of a deployed dialogue system; configuring training data sourcing parameters to source a corpora of raw machine learning training data from remote sources of machine learning training data; calculating efficacy metrics of the corpora of raw machine learning training data, wherein calculating the efficacy metrics includes calculating one or more of a coverage metric value and a diversity metric value of the corpora of raw machine learning training data; using the corpora of raw machine learning training data to train the at least one machine learning classifier if the calculated coverage metric value of the corpora of machine learning training data satisfies a minimum coverage metric threshold.

Type: Application

Filed: April 30, 2020

Publication date: August 27, 2020

Inventors: Yiping Kang, Yunqi Zhang, Jonathan K. Kummerfeld, Parker Hill, Johann Hauswald, Michael A. Laurenzano, Lingjia Tang, Jason Mars
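Illustrative sketch (not taken from the application): the abstract of publication 20200272855 above gates training on a coverage metric computed for the raw training data, but the listing does not say how coverage is calculated. The Python below assumes one simple definition, namely the fraction of test queries that have at least one sufficiently similar training sample under token-set Jaccard similarity. The function names and both thresholds are hypothetical.

```python
def jaccard(a, b):
    """Token-set Jaccard similarity between two utterances."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def coverage(test_queries, training_samples, min_similarity=0.5):
    """Fraction of test queries matched by at least one training sample (assumed definition)."""
    if not test_queries:
        return 0.0
    covered = sum(
        1 for q in test_queries
        if any(jaccard(q, t) >= min_similarity for t in training_samples)
    )
    return covered / len(test_queries)


def should_train(test_queries, training_samples, min_coverage=0.8):
    """Use the raw training data only if its coverage meets the minimum threshold."""
    return coverage(test_queries, training_samples) >= min_coverage
```

Under this assumed definition, a raw corpus that matches only half of the sampled production queries would be rejected at a minimum coverage of 0.8 and more data would need to be sourced.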
SYSTEMS AND METHODS FOR AUTOMATICALLY CONFIGURING TRAINING DATA FOR TRAINING MACHINE LEARNING MODELS OF A MACHINE LEARNING-BASED DIALOGUE SYSTEM

Publication number: 20200258007

Abstract: A system and method for improving a machine learning-based dialogue system includes: sourcing a corpus of raw machine learning training data from sources of training data based on a plurality of seed training samples, wherein the corpus of raw machine learning training data comprises a plurality of distinct instances of training data; generating a vector representation for each distinct instance of training data; identifying statistical characteristics of the corpus of raw machine learning training data based on a mapping of the vector representation for each distinct instance of training data; identifying anomalous instances of the plurality of distinct instances of training data of the corpus of raw machine learning training data based on the identified statistical characteristics of the corpus; and curating the corpus of raw machine learning training data based on each of the instances of training data identified as anomalous instances.

Type: Application

Filed: April 30, 2020

Publication date: August 13, 2020

Inventors: Stefan Larson, Anish Mahendran, Andrew Lee, Jonathan K. Kummerfeld, Parker Hill, Michael A. Laurenzano, Johann Hauswald, Lingjia Tang, Jason Mars
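Illustrative sketch (not taken from the application): the abstract of publication 20200258007 above describes vectorizing each training instance, deriving statistical characteristics of the corpus, flagging anomalous instances, and curating the corpus accordingly. The Python below shows one possible reading using only the standard library. The bag-of-words vectors, the cosine distance to the corpus centroid, and the z-score cutoff are stand-ins chosen for illustration; the listing does not specify the representation or the statistics used.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def vectorize(text):
    """Bag-of-words count vector (a stand-in for whatever representation is used)."""
    return Counter(text.lower().split())


def centroid(vectors):
    """Average count per token across all vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {token: count / len(vectors) for token, count in total.items()}


def cosine_distance(a, b):
    """1 minus cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return 1.0 - dot / (na * nb) if na and nb else 1.0


def curate(samples, z_cutoff=1.5):
    """Split samples into (kept, anomalous) by z-score of distance to the centroid."""
    vectors = [vectorize(s) for s in samples]
    center = centroid(vectors)
    dists = [cosine_distance(v, center) for v in vectors]
    mu, sigma = mean(dists), pstdev(dists)
    kept, anomalous = [], []
    for sample, d in zip(samples, dists):
        is_outlier = sigma > 0 and (d - mu) / sigma > z_cutoff
        (anomalous if is_outlier else kept).append(sample)
    return kept, anomalous
```

In this sketch curation simply drops the flagged instances; a real system could instead route them to a human reviewer.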
SYSTEMS AND METHODS FOR MACHINE LEARNING-BASED MULTI-INTENT SEGMENTATION AND CLASSIFICATION

Publication number: 20200257857

Abstract: Systems and methods for synthesizing training data for multi-intent utterance segmentation include identifying a first corpus of utterances comprising a plurality of distinct single-intent in-domain utterances; identifying a second corpus of utterances comprising a plurality of distinct single-intent out-of-domain utterances; identifying a third corpus comprising a plurality of distinct conjunction terms; forming a multi-intent training corpus comprising synthetic multi-intent utterances, wherein forming each distinct multi-intent utterance includes: selecting a first distinct in-domain utterance from the first corpus of utterances; probabilistically selecting one of a first out-of-domain utterance from the second corpus and a second in-domain utterance from the first corpus; probabilistically selecting or not selecting a distinct conjunction term from the third corpus; and forming a synthetic multi-intent utterance including appending the first in-domain utterance with one of the first out-of-domain utteranc

Type: Application

Filed: April 21, 2020

Publication date: August 13, 2020

Inventors: Joseph Peper, Parker Hill, Kevin Leach, Sean Stapleton, Jonathan K. Kummerfeld, Johann Hauswald, Michael Laurenzano, Lingjia Tang, Jason Mars
SYSTEMS AND METHODS FOR MACHINE LEARNING BASED MULTI INTENT SEGMENTATION AND CLASSIFICATION

Publication number: 20200257856

Abstract: Systems and methods for synthesizing training data for multi-intent utterance segmentation include identifying a first corpus of utterances comprising a plurality of distinct single-intent in-domain utterances; identifying a second corpus of utterances comprising a plurality of distinct single-intent out-of-domain utterances; identifying a third corpus comprising a plurality of distinct conjunction terms; forming a multi-intent training corpus comprising synthetic multi-intent utterances, wherein forming each distinct multi-intent utterance includes: selecting a first distinct in-domain utterance from the first corpus of utterances; probabilistically selecting one of a first out-of-domain utterance from the second corpus and a second in-domain utterance from the first corpus; probabilistically selecting or not selecting a distinct conjunction term from the third corpus; and forming a synthetic multi-intent utterance including appending the first in-domain utterance with one of the first out-of-domain utteranc

Type: Application

Filed: February 6, 2020

Publication date: August 13, 2020

Inventors: Joseph Peper, Parker Hill, Kevin Leach, Sean Stapleton, Jonathan K. Kummerfeld, Johann Hauswald, Michael A. Laurenzano, Lingjia Tang, Jason Mars
SYSTEMS AND METHODS FOR AUTOMATICALLY CONFIGURING TRAINING DATA FOR TRAINING MACHINE LEARNING MODELS OF A MACHINE LEARNING-BASED DIALOGUE SYSTEM INCLUDING SEEDING TRAINING SAMPLES OR CURATING A CORPUS OF TRAINING DATA BASED ON INSTANCES OF TRAINING DATA IDENTIFIED AS ANOMALOUS

Publication number: 20200193331

Abstract: A system and method for improving a machine learning-based dialogue system includes: sourcing a corpus of raw machine learning training data from sources of training data based on a plurality of seed training samples, wherein the corpus of raw machine learning training data comprises a plurality of distinct instances of training data; generating a vector representation for each distinct instance of training data; identifying statistical characteristics of the corpus of raw machine learning training data based on a mapping of the vector representation for each distinct instance of training data; identifying anomalous instances of the plurality of distinct instances of training data of the corpus of raw machine learning training data based on the identified statistical characteristics of the corpus; and curating the corpus of raw machine learning training data based on each of the instances of training data identified as anomalous instances.

Type: Application

Filed: November 20, 2019

Publication date: June 18, 2020

Inventors: Stefan Larson, Anish Mahendran, Andrew Lee, Jonathan K. Kummerfeld, Parker Hill, Michael A. Laurenzano, Johann Hauswald, Lingjia Tang, Jason Mars
Systems and methods for automatically configuring training data for training machine learning models of a machine learning-based dialogue system including seeding training samples or curating a corpus of training data based on instances of training data identified as anomalous

Patent number: 10679150

Abstract: A system and method for improving a machine learning-based dialogue system includes: sourcing a corpus of raw machine learning training data from sources of training data based on a plurality of seed training samples, wherein the corpus of raw machine learning training data comprises a plurality of distinct instances of training data; generating a vector representation for each distinct instance of training data; identifying statistical characteristics of the corpus of raw machine learning training data based on a mapping of the vector representation for each distinct instance of training data; identifying anomalous instances of the plurality of distinct instances of training data of the corpus of raw machine learning training data based on the identified statistical characteristics of the corpus; and curating the corpus of raw machine learning training data based on each of the instances of training data identified as anomalous instances.

Type: Grant

Filed: November 20, 2019

Date of Patent: June 9, 2020

Assignee: Clinc, Inc.

Inventors: Stefan Larson, Anish Mahendran, Andrew Lee, Jonathan K. Kummerfeld, Parker Hill, Michael A. Laurenzano, Johann Hauswald, Lingjia Tang, Jason Mars
Systems and methods for intelligently curating machine learning training data and improving machine learning model performance

Patent number: 10679100

Abstract: Systems and methods of intelligent formation and acquisition of machine learning training data for implementing an artificially intelligent dialogue system includes constructing a corpora of machine learning test corpus that comprise a plurality of historical queries and commands sampled from production logs of a deployed dialogue system; configuring training data sourcing parameters to source a corpora of raw machine learning training data from remote sources of machine learning training data; calculating efficacy metrics of the corpora of raw machine learning training data, wherein calculating the efficacy metrics includes calculating one or more of a coverage metric value and a diversity metric value of the corpora of raw machine learning training data; using the corpora of raw machine learning training data to train the at least one machine learning classifier if the calculated coverage metric value of the corpora of machine learning training data satisfies a minimum coverage metric threshold.

Type: Grant

Filed: April 10, 2019

Date of Patent: June 9, 2020

Assignee: Clinc, Inc.

Inventors: Yiping Kang, Yunqi Zhang, Jonathan K. Kummerfeld, Parker Hill, Johann Hauswald, Michael A. Laurenzano, Lingjia Tang, Jason Mars
SYSTEMS AND METHODS FOR INTELLIGENTLY CURATING MACHINE LEARNING TRAINING DATA AND IMPROVING MACHINE LEARNING MODEL PERFORMANCE

Publication number: 20190294925

Abstract: Systems and methods of intelligent formation and acquisition of machine learning training data for implementing an artificially intelligent dialogue system includes constructing a corpora of machine learning test corpus that comprise a plurality of historical queries and commands sampled from production logs of a deployed dialogue system; configuring training data sourcing parameters to source a corpora of raw machine learning training data from remote sources of machine learning training data; calculating efficacy metrics of the corpora of raw machine learning training data, wherein calculating the efficacy metrics includes calculating one or more of a coverage metric value and a diversity metric value of the corpora of raw machine learning training data; using the corpora of raw machine learning training data to train the at least one machine learning classifier if the calculated coverage metric value of the corpora of machine learning training data satisfies a minimum coverage metric threshold.

Type: Application

Filed: April 10, 2019

Publication date: September 26, 2019

Inventors: Yiping Kang, Yunqi Zhang, Jonathan K. Kummerfeld, Parker Hill, Johann Hauswald, Michael A. Laurenzano, Lingjia Tang, Jason Mars
Systems and methods for intelligently curating machine learning training data and improving machine learning model performance

Patent number: 10303978

Abstract: Systems and methods of intelligent formation and acquisition of machine learning training data for implementing an artificially intelligent dialogue system includes constructing a corpora of machine learning test corpus that comprise a plurality of historical queries and commands sampled from production logs of a deployed dialogue system; configuring training data sourcing parameters to source a corpora of raw machine learning training data from remote sources of machine learning training data; calculating efficacy metrics of the corpora of raw machine learning training data, wherein calculating the efficacy metrics includes calculating one or more of a coverage metric value and a diversity metric value of the corpora of raw machine learning training data; using the corpora of raw machine learning training data to train the at least one machine learning classifier if the calculated coverage metric value of the corpora of machine learning training data satisfies a minimum coverage metric threshold.

Type: Grant

Filed: September 27, 2018

Date of Patent: May 28, 2019

Assignee: Clinc, Inc.

Inventors: Yiping Kang, Yunqi Zhang, Jonathan K. Kummerfeld, Parker Hill, Johann Hauswald, Michael A. Laurenzano, Lingjia Tang, Jason Mars