Patents by Inventor Michael Langford
Michael Langford has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12380122
Abstract: Methods and systems are described herein for facilitating generation of synthetic datasets having a change point. The system may receive a command to generate a synthetic time series dataset. The system may generate data points for components of the synthetic dataset, the components including a seasonality function, a trend function, and a noise function. The system may modify the trend function to a different trend function by modifying a level or a slope of the trend function. The system may generate a change point by replacing a subset of consecutive data points generated using the trend function with consecutive data points generated using the different trend function. The system may then generate the synthetic time series dataset having a change point by combining the seasonality data points, the trend data points, and the noise data points into corresponding time slots of the synthetic time series dataset.
Type: Grant
Filed: November 22, 2023
Date of Patent: August 5, 2025
Assignee: Capital One Services, LLC
Inventors: Justin Essert, Zhengqing Liu, Vannia Gonzalez Macias, Pratik Gandhi, Michael Langford
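A minimal illustrative sketch of the kind of generation the abstract describes, not the claimed implementation; the function name, sine-based seasonality, linear trends, and parameter values below are assumptions:

```python
import numpy as np

def make_change_point_series(n=365, change_at=200, seed=0):
    """Combine seasonality, trend, and noise components; switch to a
    different trend (new level and slope) after the change point."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)

    seasonality = 5.0 * np.sin(2 * np.pi * t / 30)   # seasonality function
    noise = rng.normal(0.0, 1.0, size=n)             # noise function

    trend = 0.05 * t + 10.0                          # original trend (slope, level)
    new_trend = 0.20 * t + 25.0                      # modified slope and level

    # Replace the trend data points after the change point with points
    # generated by the different trend function.
    trend = np.where(t < change_at, trend, new_trend)

    # Combine the components into corresponding time slots.
    return seasonality + trend + noise

series = make_change_point_series()
print(series[:5], series[-5:])
```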
-
Publication number: 20250165485
Abstract: Methods and systems are described herein for facilitating generation of synthetic datasets having a change point. The system may receive a command to generate a synthetic time series dataset. The system may generate data points for components of the synthetic dataset, the components including a seasonality function, a trend function, and a noise function. The system may modify the trend function to a different trend function by modifying a level or a slope of the trend function. The system may generate a change point by replacing a subset of consecutive data points generated using the trend function with consecutive data points generated using the different trend function. The system may then generate the synthetic time series dataset having a change point by combining the seasonality data points, the trend data points, and the noise data points into corresponding time slots of the synthetic time series dataset.
Type: Application
Filed: November 22, 2023
Publication date: May 22, 2025
Applicant: Capital One Services, LLC
Inventors: Justin ESSERT, Zhengqing LIU, Vannia GONZALEZ MACIAS, Pratik GANDHI, Michael LANGFORD
-
Publication number: 20250156730
Abstract: Systems and methods for minimizing dimensionality of a high-dimensionality dataset during feature engineering. The system achieves this by using a tabular neural network to extract non-linear transformations of features without dramatically increasing the dimensionality of the original dataset. The system receives an original dataset for classification and a defined number of final features (e.g., dimensionality) that result from the synthetic feature creation and the neural network embedding process. Once an architecture of a model is determined, a model is fit on a synthetic feature set (e.g., a second dataset comprising synthetic features) with a given classification as a target.
Type: Application
Filed: November 15, 2023
Publication date: May 15, 2025
Applicant: Capital One Services, LLC
Inventor: Michael LANGFORD
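A hedged sketch of one way a tabular network can expose a fixed number of final features, using PyTorch; the layer sizes, bottleneck width, and training loop below are illustrative assumptions rather than the published architecture:

```python
import torch
import torch.nn as nn

class TabularEmbedder(nn.Module):
    """Small tabular network whose bottleneck layer has the desired final
    dimensionality; its activations become the reduced, non-linear features."""
    def __init__(self, n_inputs, n_final_features, n_classes):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_inputs, 64), nn.ReLU(),
            nn.Linear(64, n_final_features), nn.ReLU(),   # bottleneck = final features
        )
        self.head = nn.Linear(n_final_features, n_classes)

    def forward(self, x):
        z = self.encoder(x)
        return self.head(z), z

# Fit on a (synthetic) feature set with the classification label as target.
X = torch.randn(512, 40)                       # high-dimensional input features
y = torch.randint(0, 2, (512,))
model = TabularEmbedder(n_inputs=40, n_final_features=8, n_classes=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for _ in range(50):
    opt.zero_grad()
    logits, _ = model(X)
    loss_fn(logits, y).backward()
    opt.step()

_, reduced = model(X)                           # low-dimensional embedding of the dataset
print(reduced.shape)                            # (512, 8)
```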
-
Publication number: 20250156770
Abstract: Systems and methods for novel uses and/or improvements to artificial intelligence applications, particularly in the context of practical applications featuring less complex model architectures. As one example, systems and methods described herein may achieve the technical benefits of a more complex model architecture through an ensemble of less complex models while reducing the overall training burden (e.g., in terms of computing resources, training time, and/or technical feasibility).
Type: Application
Filed: November 15, 2023
Publication date: May 15, 2025
Applicant: Capital One Services, LLC
Inventor: Michael LANGFORD
-
Publication number: 20250139456
Abstract: Methods and systems are described herein for minimizing development time in artificial intelligence models by automating model selection based on dataset fittings of time-series data prior to hyperparameter optimization. The system may select a statistical profile type to identify in a first dataset. The system may retrieve a statistical model corresponding to the statistical profile type. The system may select, based on a first statistical profile, a first untrained model from a first plurality of untrained models for training, wherein the first plurality of untrained models comprises respective algorithms for time-series forecasting and wherein each of the first plurality of untrained models comprises default hyperparameter tuning. The system may, based on selecting the first untrained model, tune a first hyperparameter of the first untrained model using the first dataset.
Type: Application
Filed: October 31, 2023
Publication date: May 1, 2025
Applicant: Capital One Services, LLC
Inventor: Michael LANGFORD
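A minimal sketch of profiling a time series and picking an untrained model family before any tuning, as this abstract outlines; the specific statistics, thresholds, and model-family names are assumptions made for illustration:

```python
import numpy as np

def statistical_profile(series, season_lag=12):
    """Compute a small statistical profile of a time series."""
    x = np.asarray(series, dtype=float)
    trend_strength = abs(np.corrcoef(np.arange(len(x)), x)[0, 1])
    centered = x - x.mean()
    acf = np.correlate(centered, centered, mode="full")[len(x) - 1:]
    acf = acf / acf[0]                                   # normalized autocorrelation
    seasonal_strength = abs(acf[season_lag]) if len(acf) > season_lag else 0.0
    volatility = np.std(np.diff(x)) / (abs(x.mean()) + 1e-9)
    return {"trend": trend_strength, "seasonal": seasonal_strength, "volatility": volatility}

def select_untrained_model(profile):
    """Map the profile to an untrained model family (with default
    hyperparameters) before any hyperparameter optimization is run."""
    if profile["seasonal"] > 0.5:
        return "seasonal_forecaster_defaults"
    if profile["trend"] > 0.7:
        return "trend_forecaster_defaults"
    return "baseline_forecaster_defaults"

t = np.arange(120)
series = 0.3 * t + 4 * np.sin(2 * np.pi * t / 12) + np.random.default_rng(1).normal(0, 1, 120)
profile = statistical_profile(series)
print(profile, "->", select_untrained_model(profile))
```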
-
Publication number: 20250139455
Abstract: Systems and methods for minimizing development time in artificial intelligence models by automating model selection based on dataset fittings of time-series data prior to hyperparameter optimization. The systems and methods use a scoring policy based on a plurality of labeled datasets that score one or more results contained within the aggregate statistical profile. The system may dynamically identify particular criteria in statistical data that indicate the effectiveness of a given model on a given dataset. These criteria (e.g., the scoring policy) may then be updated over time as new datasets, statistical analyses, and/or aggregated statistical profiles are developed, without affecting the underlying models and/or datasets.
Type: Application
Filed: October 31, 2023
Publication date: May 1, 2025
Applicant: Capital One Services, LLC
Inventor: Michael LANGFORD
-
Publication number: 20250139440
Abstract: Methods and systems are described herein for minimizing development time in artificial intelligence models by automating model selection based on dataset fittings of time-series data prior to hyperparameter optimization. For example, the system may apply a profiling model using a time-series embedding of the dataset combined with the aggregate statistical profile. In either case, the profiling model may be trained on the scoring policy and/or a time-series embedding of the dataset combined with the aggregate statistical profile to determine a likelihood of the effectiveness of a given model on the given dataset and/or likely hyperparameters for the given model.
Type: Application
Filed: October 31, 2023
Publication date: May 1, 2025
Applicant: Capital One Services, LLC
Inventor: Michael LANGFORD
-
Publication number: 20250139503
Abstract: Methods and systems are described herein for minimizing development time in artificial intelligence models by automating model selection based on dataset fittings of time-series data prior to hyperparameter optimization. The systems and methods described herein aim to reduce the redundancies and improve the efficiencies of model selection, model training, and/or hyperparameter selection. The systems and methods achieve this by using information about the attributes of the time-series dataset that may be used to determine a model that may be most effective at fitting a given dataset. If a model is selected prior to hyperparameter optimization, the time and resources spent training, fitting, and/or tuning models that are not selected can be avoided.
Type: Application
Filed: October 31, 2023
Publication date: May 1, 2025
Applicant: Capital One Services, LLC
Inventor: Michael LANGFORD
-
Publication number: 20250139502
Abstract: Methods and systems are described herein for minimizing development time in artificial intelligence models by automating model selection based on dataset fittings of time-series data prior to hyperparameter optimization.
Type: Application
Filed: October 31, 2023
Publication date: May 1, 2025
Applicant: Capital One Services, LLC
Inventors: Michael LANGFORD, Abhisek JANA, Rajesh Kanna DURAIRAJ
-
Publication number: 20250077503
Abstract: Disclosed embodiments may include a system for providing a nearest neighbors classification pipeline with automated dimensionality reduction. The system may receive a dataset. The system may determine whether the dataset has a first dimensionality that exceeds a predetermined threshold. When the dataset has a dimensionality that exceeds the predetermined threshold, the system may prompt a user to input an explained variance threshold ratio. The system may receive the explained variance threshold ratio. The system may iteratively perform a binary search on the dataset to determine a reduced dimensionality having a total explained variance ratio closest to but not less than the explained variance threshold ratio. The system may reduce the dataset to the reduced dimensionality to generate a reduced dataset. The system may train a machine learning model using the reduced dataset.
Type: Application
Filed: March 7, 2024
Publication date: March 6, 2025
Inventor: Michael LANGFORD
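A small sketch of the binary-search step this abstract describes, using scikit-learn PCA as the dimensionality-reduction stage; the function name and the PCA choice are assumptions, not the claimed pipeline:

```python
import numpy as np
from sklearn.decomposition import PCA

def reduce_by_explained_variance(X, variance_threshold=0.95):
    """Binary-search the number of PCA components whose total explained
    variance ratio is closest to, but not less than, the threshold."""
    pca = PCA().fit(X)
    cumulative = np.cumsum(pca.explained_variance_ratio_)

    lo, hi = 1, X.shape[1]
    best = hi
    while lo <= hi:
        mid = (lo + hi) // 2
        if cumulative[mid - 1] >= variance_threshold:
            best = mid          # feasible; try fewer components
            hi = mid - 1
        else:
            lo = mid + 1        # not enough variance; need more components
    return PCA(n_components=best).fit_transform(X), best

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50)) @ rng.normal(size=(50, 50))   # correlated features
X_reduced, k = reduce_by_explained_variance(X, 0.95)
print(X_reduced.shape, k)
```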
-
Publication number: 20240185116
Abstract: Disclosed embodiments may include a method for bagging ensemble classifiers for imbalanced big data. The system may receive user input comprising a number of machine learning base models to generate. The system may generate the machine learning base models based on the user input. Iteratively, for each machine learning base model until all machine learning base models are trained, the system may: determine a chunk for the machine learning base model, wherein the chunk comprises all minority cases from training data and a plurality of majority cases from the training data, and train the machine learning base model with the chunk.
Type: Application
Filed: December 1, 2022
Publication date: June 6, 2024
Inventor: Michael Langford
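A hedged sketch of the chunking idea above with scikit-learn decision trees; the base-model type, the majority-sample size (equal to the minority count), and the majority-vote combiner are illustrative assumptions:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_imbalanced_bagging(X, y, n_models=5, seed=0):
    """Train each base model on a chunk holding ALL minority cases plus a
    random (with-replacement) sample of majority cases."""
    rng = np.random.default_rng(seed)
    minority_idx = np.where(y == 1)[0]
    majority_idx = np.where(y == 0)[0]

    models = []
    for _ in range(n_models):
        sampled_majority = rng.choice(majority_idx, size=len(minority_idx), replace=True)
        chunk = np.concatenate([minority_idx, sampled_majority])
        models.append(DecisionTreeClassifier(max_depth=5).fit(X[chunk], y[chunk]))
    return models

def predict_majority_vote(models, X):
    votes = np.stack([m.predict(X) for m in models])
    return (votes.mean(axis=0) >= 0.5).astype(int)

rng = np.random.default_rng(1)
X = rng.normal(size=(10_000, 8))
y = (rng.random(10_000) < 0.02).astype(int)       # ~2% minority class
ensemble = train_imbalanced_bagging(X, y)
print(predict_majority_vote(ensemble, X[:10]))
```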
-
Publication number: 20240104421
Abstract: A method includes obtaining a first dataset comprising a first set of features and generating a second set of features based on the first set of features by providing the first dataset to feature primitive stacks that respectively correspond to features of the second set of features. The method further includes determining a reduced feature set based on the second set of features and a count of correlation values between features of the second set of features, wherein the correlation values satisfy a correlation threshold. The method further includes storing the reduced feature set in a database in association with the first set of features based on a determination that a second dataset comprising the reduced feature set satisfies a set of criteria.
Type: Application
Filed: September 26, 2022
Publication date: March 28, 2024
Applicant: Capital One Services, LLC
Inventor: Michael LANGFORD
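A simplified pandas sketch of building a second feature set from primitives and pruning correlated columns; the primitive functions are hypothetical, and a greedy pairwise drop stands in for the count-based criterion the abstract describes:

```python
import numpy as np
import pandas as pd

# Hypothetical feature primitive stacks, one applied per original column.
PRIMITIVES = {
    "zscore": lambda s: (s - s.mean()) / (s.std() + 1e-9),
    "double": lambda s: 2.0 * s,                 # deliberately redundant with zscore
    "square": lambda s: s ** 2,
}

def generate_second_feature_set(df):
    """Apply each primitive to each original feature to build the second set."""
    out = {f"{col}__{name}": fn(df[col])
           for col in df.columns for name, fn in PRIMITIVES.items()}
    return pd.DataFrame(out)

def reduce_correlated_features(features, threshold=0.95):
    """Greedily drop the later member of any feature pair whose absolute
    correlation meets the threshold."""
    corr = features.corr().abs()
    dropped = set()
    cols = list(features.columns)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            if a not in dropped and b not in dropped and corr.loc[a, b] >= threshold:
                dropped.add(b)
    return features.drop(columns=sorted(dropped))

df = pd.DataFrame(np.random.default_rng(0).normal(size=(200, 4)), columns=list("abcd"))
second = generate_second_feature_set(df)
reduced = reduce_correlated_features(second)
print(second.shape, "->", reduced.shape)      # redundant columns are removed
```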
-
Publication number: 20240104436
Abstract: A method includes obtaining a first dataset including a first feature set, generating a first set of feature values by providing the first dataset to a set of feature primitive stacks, and determining a reduced set of feature values based on the first set of feature values by dimensionally reducing features of the first set of feature values. The method further includes generating an intermediate set of feature values by providing a value of the first dataset and a value of the reduced set of feature values to at least one feature primitive of the set of feature primitive stacks. The method further includes updating the reduced set of feature values by dimensionally reducing features of the intermediate set of feature values and storing a second dataset including features of the intermediate set of feature values in association with the first feature set.
Type: Application
Filed: September 26, 2022
Publication date: March 28, 2024
Applicant: Capital One Services, LLC
Inventor: Michael LANGFORD
-
Publication number: 20240095551
Abstract: Systems and methods for successively imputing missing feature values, using machine learning to sequentially fill in missing values in partially-filled datasets by drawing on the information in the dataset's populated records. The systems and methods disclosed herein may be useful in many machine learning contexts and applications where datasets are missing values.
Type: Application
Filed: September 15, 2022
Publication date: March 21, 2024
Inventor: Michael Langford
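A minimal sketch of sequential, model-based imputation in the spirit of this abstract; the column ordering, the mean pre-fill for predictors, and the random-forest regressor are assumptions made for illustration (and are similar in spirit to, but not the same as, scikit-learn's IterativeImputer):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

def sequential_impute(df):
    """Fill missing values one column at a time, training a model on the
    populated records and predicting the missing ones."""
    filled = df.copy()
    # Impute the least-missing columns first so later models see more signal.
    order = filled.isna().sum().sort_values().index
    for col in order:
        missing = filled[col].isna()
        if not missing.any():
            continue
        predictors = filled.drop(columns=[col]).fillna(filled.mean(numeric_only=True))
        model = RandomForestRegressor(n_estimators=50, random_state=0)
        model.fit(predictors[~missing], filled.loc[~missing, col])
        filled.loc[missing, col] = model.predict(predictors[missing])
    return filled

rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(200, 4)), columns=list("abcd"))
data = data.mask(rng.random(data.shape) < 0.1)        # knock out ~10% of values
print(sequential_impute(data).isna().sum().sum())     # 0 missing values remain
```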
-
Patent number: 11934384
Abstract: Disclosed embodiments may include a system for providing a nearest neighbors classification pipeline with automated dimensionality reduction. The system may receive a dataset. The system may determine whether the dataset has a first dimensionality that exceeds a predetermined threshold. When the dataset has a dimensionality that exceeds the predetermined threshold, the system may prompt a user to input an explained variance threshold ratio. The system may receive the explained variance threshold ratio. The system may iteratively perform a binary search on the dataset to determine a reduced dimensionality having a total explained variance ratio closest to but not less than the explained variance threshold ratio. The system may reduce the dataset to the reduced dimensionality to generate a reduced dataset. The system may train a machine learning model using the reduced dataset.
Type: Grant
Filed: December 1, 2022
Date of Patent: March 19, 2024
Assignee: CAPITAL ONE SERVICES, LLC
Inventor: Michael Langford
-
Publication number: 20240078415
Abstract: A method may be provided for selecting embedding dimension, which can include receiving a trained machine learning (ML) model and a graph neural network (GNN) and extracting, from the received ML model, a count of the number of neurons in the penultimate layer and node embeddings for each input graph node from the GNN neurons in the penultimate layer. An importance threshold input for filtering the node embeddings can be received, and a tree-based model may be used to return feature importance values. The extracted node embeddings may be input into the tree-based model and an importance metric of each of the node embedding dimensions may be determined from the penultimate layer neurons. The penultimate layer neuron count of the ML model may be restricted to correspond to a number of the highest importance node embedding dimensions and the ML model may be trained using the restricted penultimate layer.
Type: Application
Filed: September 7, 2022
Publication date: March 7, 2024
Inventor: Michael Langford
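A hedged sketch of the importance-ranking and truncation step only; here a random matrix stands in for the GNN's penultimate-layer node embeddings, and the threshold value, helper name, and random-forest choice are assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def select_embedding_dimensions(node_embeddings, labels, importance_threshold=0.02):
    """Rank embedding dimensions with a tree-based model's feature
    importances and keep only those above the importance threshold."""
    tree_model = RandomForestClassifier(n_estimators=200, random_state=0)
    tree_model.fit(node_embeddings, labels)
    importances = tree_model.feature_importances_

    keep = np.where(importances >= importance_threshold)[0]
    order = keep[np.argsort(importances[keep])[::-1]]     # highest importance first
    return order, node_embeddings[:, order]

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(500, 32))                   # stand-in node embeddings
labels = (embeddings[:, 0] + 0.5 * embeddings[:, 3] > 0).astype(int)
kept_dims, reduced = select_embedding_dimensions(embeddings, labels)
print(kept_dims[:5], reduced.shape)
```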
-
Publication number: 20240070528
Abstract: Systems and methods are provided for evaluating and selecting an ensemble of machine learning models using extremely randomized bootstrap aggregation (e.g., bagging) with replacement. The method may include the use of a plurality of base models to produce a combined (or aggregated) output. Original data may be randomly sampled with replacement to create N subsets of bootstrapped data for which each of the N selected base models may produce a prediction based on their subset of data. The individual predictions may be combined and evaluated, and an ensemble having the highest performance may be selected and trained for production. Certain implementations of the disclosed technology can eliminate the need for a priori knowledge about which model (or models) will provide accurate predictions.
Type: Application
Filed: August 31, 2022
Publication date: February 29, 2024
Inventor: Michael Langford
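A simplified scikit-learn sketch of bootstrapping a subset per base model, combining predictions, and keeping the best-scoring combination; the particular base-model list, majority-vote combiner, and accuracy metric are assumptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

BASE_MODELS = [
    lambda: LogisticRegression(max_iter=500),
    lambda: RandomForestClassifier(n_estimators=100, random_state=0),
    lambda: ExtraTreesClassifier(n_estimators=100, random_state=0),
]

def fit_bagged_ensemble(X, y, n_members=6, seed=0):
    """Fit each member on its own bootstrap sample (drawn with replacement)."""
    rng = np.random.default_rng(seed)
    members = []
    for i in range(n_members):
        idx = rng.choice(len(X), size=len(X), replace=True)     # bootstrap subset
        model = BASE_MODELS[i % len(BASE_MODELS)]()
        members.append(model.fit(X[idx], y[idx]))
    return members

def evaluate_combinations(members, X_val, y_val):
    """Score the combined (majority-vote) prediction of each candidate
    ensemble size and return the best-performing combination."""
    preds = np.stack([m.predict(X_val) for m in members])
    best = None
    for k in range(1, len(members) + 1):
        vote = (preds[:k].mean(axis=0) >= 0.5).astype(int)
        score = accuracy_score(y_val, vote)
        if best is None or score > best[0]:
            best = (score, k)
    return best

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)
members = fit_bagged_ensemble(X_tr, y_tr)
print(evaluate_combinations(members, X_val, y_val))   # (score, ensemble size)
```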
-
Publication number: 20240013089
Abstract: Systems and methods, as described herein, relate to sequential synthesis and selection for feature engineering. A dataset may be associated with a label defining a machine-learning target attribute and a received operation that can be applied to at least one of the existing features of the dataset. One or more potential features may be generated by applying the operation to one or more existing features. For each of the one or more potential features, a feature importance algorithm may be applied to the respective feature along with the one or more existing features, generating a respective feature importance value. Respective feature importance values may be generated for each of the one or more existing features based on applying the feature importance algorithm and used to sort the potential features. A level of correlation to each of the one or more existing features may be determined and checked against a threshold level, to avoid adding new features that are heavily correlated with existing ones.
Type: Application
Filed: July 7, 2022
Publication date: January 11, 2024
Inventor: Michael Langford
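A small sketch of synthesizing one candidate feature, checking its correlation against existing features, and ranking it by importance; the operation, the random-forest importance scores, and the correlation threshold are illustrative assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def evaluate_candidate_feature(df, target, operation, columns, corr_threshold=0.9):
    """Synthesize one candidate feature, reject it if it is too correlated
    with an existing feature, otherwise rank it by feature importance."""
    candidate = operation(df[columns])
    max_corr = df.corrwith(candidate).abs().max()
    if max_corr >= corr_threshold:
        return None, max_corr                      # too similar to an existing feature

    augmented = df.assign(candidate=candidate)
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(augmented, target)
    ranking = pd.Series(model.feature_importances_, index=augmented.columns)
    return ranking.sort_values(ascending=False), max_corr

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(300, 3)), columns=["a", "b", "c"])
target = (df["a"] * df["b"] > 0).astype(int)       # target depends on an interaction
ranking, corr = evaluate_candidate_feature(df, target, lambda d: d["a"] * d["b"], ["a", "b"])
print(corr)
print(ranking)
```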
-
Publication number: 20230419189
Abstract: The exemplary embodiments may provide a stacked machine learning model ensemble pipeline architecture selector that selects a well-suited stacked machine learning model ensemble pipeline architecture for a specified configuration input and a target data set. The stacked machine learning model ensemble pipeline architecture selector may generate and score possible stacked machine learning model ensemble pipeline architectures to locate one that is well-suited for the target data set and that conforms with the configuration input. The stacked machine learning model ensemble pipeline architecture selector may use genetic programming to generate successive generations of possible stacked ensemble pipeline architectures and to score those architectures to determine how well-suited they are. In this manner, the stacked machine learning model ensemble pipeline architecture selector may converge on an architecture that is well-suited, for example, one that meets one or more scores, evaluation metrics, and/or the like.
Type: Application
Filed: June 24, 2022
Publication date: December 28, 2023
Applicant: Capital One Services, LLC
Inventors: Michael LANGFORD, Jakub KRZEPTOWSKI-MUCHA, Krishna BALAM
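A toy generational-search sketch in the spirit of this abstract, using scikit-learn stacking; the genome encoding (a list of base-model names with a fixed final estimator), the mutation rule, and the cross-validation score are assumptions rather than the described selector:

```python
import random
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

BASE_CHOICES = {
    "logreg": lambda: LogisticRegression(max_iter=500),
    "tree": lambda: DecisionTreeClassifier(max_depth=5),
    "forest": lambda: RandomForestClassifier(n_estimators=50),
}

def build_pipeline(genome):
    """Genome = list of base-model names; the final estimator is fixed here."""
    estimators = [(f"{name}_{i}", BASE_CHOICES[name]()) for i, name in enumerate(genome)]
    return StackingClassifier(estimators=estimators,
                              final_estimator=LogisticRegression(max_iter=500))

def mutate(genome):
    child = list(genome)
    child[random.randrange(len(child))] = random.choice(list(BASE_CHOICES))
    return child

def evolve(X, y, population=4, generations=2, seed=0):
    """Score each generation's candidate architectures and breed the fittest."""
    random.seed(seed)
    genomes = [[random.choice(list(BASE_CHOICES)) for _ in range(2)] for _ in range(population)]
    best = None
    for _ in range(generations):
        scored = [(cross_val_score(build_pipeline(g), X, y, cv=3).mean(), g) for g in genomes]
        scored.sort(key=lambda s: s[0], reverse=True)
        if best is None or scored[0][0] > best[0]:
            best = scored[0]
        survivors = [g for _, g in scored[: population // 2]]   # keep the fittest half
        genomes = survivors + [mutate(random.choice(survivors))
                               for _ in range(population - len(survivors))]
    return best

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
print(evolve(X, y))            # (best CV score, best architecture genome)
```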
-
Publication number: 20230196125
Abstract: Various embodiments are generally directed to techniques for optimizing hyperparameters, such as optimizing different combinations of hyperparameters, for instance. Some embodiments are particularly directed to using a genetic or Bayesian algorithm to identify and optimize different combinations of hyperparameters for a machine learning (ML) model. Many embodiments construct a search using a genetic algorithm that prioritizes the hyperparameters most important in influencing model performance.
Type: Application
Filed: December 16, 2021
Publication date: June 22, 2023
Applicant: Capital One Services, LLC
Inventor: Michael LANGFORD
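A toy genetic-search sketch over hyperparameter combinations; the search space, the importance weights that make influential hyperparameters mutate more often, and the random-forest target model are assumptions made for illustration:

```python
import random
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

SEARCH_SPACE = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, 10, None],
    "min_samples_leaf": [1, 2, 5],
}
# Assumed prior importance: mutate influential hyperparameters more often.
MUTATION_WEIGHTS = {"n_estimators": 1.0, "max_depth": 3.0, "min_samples_leaf": 1.5}

def random_params():
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def mutate(params):
    child = dict(params)
    names = list(SEARCH_SPACE)
    name = random.choices(names, weights=[MUTATION_WEIGHTS[n] for n in names], k=1)[0]
    child[name] = random.choice(SEARCH_SPACE[name])
    return child

def genetic_search(X, y, population=6, generations=3, seed=0):
    """Evolve hyperparameter combinations, keeping the fittest half each generation."""
    random.seed(seed)
    candidates = [random_params() for _ in range(population)]
    best = None
    for _ in range(generations):
        scored = [(cross_val_score(RandomForestClassifier(**p, random_state=0),
                                   X, y, cv=3).mean(), p) for p in candidates]
        scored.sort(key=lambda s: s[0], reverse=True)
        if best is None or scored[0][0] > best[0]:
            best = scored[0]
        parents = [p for _, p in scored[: population // 2]]
        candidates = parents + [mutate(random.choice(parents))
                                for _ in range(population - len(parents))]
    return best

X, y = make_classification(n_samples=500, n_features=15, random_state=0)
print(genetic_search(X, y))    # (best CV score, best hyperparameter combination)
```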