Patents by Inventor Alan Donald Czeszynski

Alan Donald Czeszynski has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12229274
    Abstract: An algorithm is trained on a dataset to facilitate dynamic data exfiltration protection in a zero-trust environment. An inversion threat model built from the original training dataset (a ‘gold standard’ inversion model) may also be generated. This inversion model can be characterized to determine how accurately it identifies whether a given input was part of the original training dataset (a data exfiltration event). By using the inversion model to generate targeted noise (as opposed to Gaussian noise), the risk of data exfiltration can be reduced to a desired level without unduly impacting the algorithm's performance. Noise added to the original training dataset causes the inversion model to perform worse (meaning the data steward's data is more secure) but has a corresponding impact on the algorithm's accuracy and performance.
    Type: Grant
    Filed: February 16, 2023
    Date of Patent: February 18, 2025
    Assignee: BeeKeeperAI, Inc.
    Inventors: Mary Elizabeth Chalk, Robert Derward Rogers, Alan Donald Czeszynski
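The targeted-noise idea above can be sketched in a few lines. This is a minimal illustration, not the patented method: the per-feature `sensitivity` scores (assumed to come from characterizing a gold-standard inversion model) and the function name are hypothetical.

```python
import numpy as np

def add_targeted_noise(X, sensitivity, budget, seed=0):
    """Concentrate a fixed noise budget on the features the inversion
    model relies on most, rather than spreading Gaussian noise evenly."""
    rng = np.random.default_rng(seed)
    # allocate the noise budget proportionally to inversion-model sensitivity
    scale = budget * sensitivity / sensitivity.sum()
    return X + rng.normal(0.0, scale, size=X.shape)
```

Features with higher sensitivity receive proportionally larger noise, degrading the inversion model where it matters most while perturbing low-risk features only slightly.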
  • Publication number: 20250053687
    Abstract: Systems and methods for the generation and usage of an identifier determiner model are provided. The identifier determiner model is generated in a sequestered computing node by receiving an untrained foundational model and a data set. The data set is bifurcated into a raw set and a de-identified set. The untrained foundational model is then tuned using the de-identified set to generate a sanitized model, and using the raw set to generate a raw model. Queries are presented to the raw model and the sanitized model to generate outputs. The identifier determiner machine learning model is generated by using these outputs to classify information as either sensitive or non-sensitive. The system may then receive a new foundational model. The identifier determiner machine learning model may be applied to outputs of this new foundational model to filter out sensitive information, either by redacting it or by preventing the offending queries from being asked.
    Type: Application
    Filed: August 8, 2024
    Publication date: February 13, 2025
    Inventors: Michael Scott Blum, Mary Elizabeth Chalk, Robert Derward Rogers, Alan Donald Czeszynski, Sudish Mogli
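The raw-vs-sanitized comparison above can be illustrated with a toy token-level version. This is only a sketch under simplifying assumptions (model outputs are plain strings; token set difference stands in for a trained classifier; all names are hypothetical):

```python
def sensitive_tokens(raw_outputs, sanitized_outputs):
    """Tokens the raw model emits that the de-identified model never does
    are treated as candidate sensitive identifiers."""
    raw = set(" ".join(raw_outputs).split())
    clean = set(" ".join(sanitized_outputs).split())
    return raw - clean

def redact(text, sensitive):
    """Filter a new model's output by redacting flagged tokens."""
    return " ".join("[REDACTED]" if t in sensitive else t for t in text.split())
```

The real identifier determiner is a trained model rather than a set difference, but the principle is the same: whatever only the raw model produces is likely tied to identifying information.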
  • Publication number: 20240273232
    Abstract: An algorithm is trained on a known dataset to facilitate dynamic data exfiltration protection in a zero-trust environment. The classifications generated by the trained algorithm on a very large set of inputs may be used by a bad actor to train an inversion threat model in an attempt to exfiltrate data from the data steward. Because the system operates within the enclave/secure computing node, where the original training dataset is known, it can build a very accurate inversion threat model (a ‘gold standard’ inversion model). This inversion model can be characterized to determine how accurately it identifies whether a given input was part of the original training dataset (a data exfiltration event). This very accurate inversion model will be superior at data exfiltration compared to any inversion attack model a bad actor could generate using only the algorithm's classification outputs.
    Type: Application
    Filed: February 15, 2023
    Publication date: August 15, 2024
    Inventors: Mary Elizabeth Chalk, Robert Derward Rogers, Alan Donald Czeszynski
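Characterizing a gold-standard inversion model can be sketched as a membership-inference sweep: because the enclave knows the true membership labels, it can measure exactly how well any confidence threshold separates training members from non-members. A minimal illustration (function names and the confidence-threshold attack are assumptions, not the patented mechanism):

```python
import numpy as np

def attack_accuracy(conf, is_member, thresh):
    """Fraction of inputs correctly labeled member/non-member at a threshold."""
    return float(np.mean((conf >= thresh) == is_member))

def best_threshold(conf, is_member):
    """Sweep thresholds against known ground-truth membership to find the
    strongest possible attack, i.e. the gold-standard characterization."""
    candidates = np.unique(conf)
    accs = [attack_accuracy(conf, is_member, t) for t in candidates]
    i = int(np.argmax(accs))
    return float(candidates[i]), accs[i]
```

An outside attacker must guess a threshold from outputs alone; the enclave can compute the optimum, which is why the in-enclave model upper-bounds any external attack.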
  • Publication number: 20240273213
    Abstract: Data from different data stewards is aggregated, and an algorithm is trained on this aggregated dataset. The larger dataset may be subdivided into subsets to generate a series of weak trained models. Aggregating these weak models into a strong model reduces overfitting as compared with training the algorithm on the single aggregated training set. The subsets may be selected so as to minimize the likelihood of data exfiltration, measured using an inversion model (as detailed above). The aggregated dataset may be subdivided many times to generate a very large set of weak models, which should further reduce overfitting.
    Type: Application
    Filed: February 16, 2023
    Publication date: August 15, 2024
    Inventors: Mary Elizabeth Chalk, Robert Derward Rogers, Alan Donald Czeszynski
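The weak-model aggregation can be sketched as subset bagging. This is a simplified stand-in, not the patented training scheme: the "weak learner" here is a hypothetical class-mean-difference linear scorer, chosen only to keep the example self-contained.

```python
import numpy as np

def train_weak(X, y):
    """Hypothetical weak learner: score along the class-mean difference."""
    return X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)

def bagged_predict(models, X):
    """Strong model = average of the weak models' scores."""
    scores = np.mean([X @ w for w in models], axis=0)
    return (scores > 0).astype(int)
```

Each weak model sees only a subset, so no single model memorizes the whole aggregated dataset; averaging their scores smooths out subset-specific quirks, which is the overfitting-reduction effect the abstract describes.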
  • Publication number: 20240273233
    Abstract: An algorithm is trained on a dataset to facilitate dynamic data exfiltration protection in a zero-trust environment. An inversion threat model built from the original training dataset (a ‘gold standard’ inversion model) may also be generated. This inversion model can be characterized to determine how accurately it identifies whether a given input was part of the original training dataset (a data exfiltration event). By using the inversion model to generate targeted noise (as opposed to Gaussian noise), the risk of data exfiltration can be reduced to a desired level without unduly impacting the algorithm's performance. Noise added to the original training dataset causes the inversion model to perform worse (meaning the data steward's data is more secure) but has a corresponding impact on the algorithm's accuracy and performance.
    Type: Application
    Filed: February 16, 2023
    Publication date: August 15, 2024
    Inventors: Mary Elizabeth Chalk, Robert Derward Rogers, Alan Donald Czeszynski
  • Publication number: 20240211588
    Abstract: Systems and methods for validating algorithms across different parties' systems by generating synthetic data for operation on the algorithms are provided. The synthetic data may include real data that has been de-identified, data that has been altered by pseudo-random deviations (range- and distribution-bound), or data generated by an ML algorithm trained on real datasets. The synthetic data is shared between the various parties and run on their individual instantiations of the algorithm. The resulting outputs should be identical, thereby validating the algorithm; if the outputs differ, it can be determined that the algorithm is behaving in an unexpected manner. Annotation validation can also be performed. This may include salting annotations with known elements or ML trend identification, and collecting the results from the annotators. By redundant comparison between annotators' activity, the accuracy and consistency of annotations can be ascertained.
    Type: Application
    Filed: December 28, 2022
    Publication date: June 27, 2024
    Inventors: Mary Elizabeth Chalk, Robert Derward Rogers, Alan Donald Czeszynski
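The output-comparison step above can be sketched with a digest check: each party hashes its outputs on the shared synthetic dataset, and matching digests imply identical behavior without either party sharing real data. A minimal sketch (function names and the JSON/SHA-256 encoding are assumptions):

```python
import hashlib
import json

def fingerprint(outputs):
    """Canonical digest of a party's outputs on the shared synthetic data."""
    blob = json.dumps(outputs, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

def same_behavior(outputs_a, outputs_b):
    """True when two instantiations of the algorithm agree exactly."""
    return fingerprint(outputs_a) == fingerprint(outputs_b)
```

Using `sort_keys=True` makes the digest insensitive to dictionary key order, so only genuine output differences cause a validation failure.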
  • Publication number: 20240143794
    Abstract: Systems and methods for data exfiltration prevention are provided. In some embodiments, exfiltration detection includes receiving an algorithm and a data set within a secure computing node. The algorithm is trained on the data set to generate a set of weights. A determination is made as to whether the algorithm originated from a trusted source. When the source is trusted, an unintentional data exfiltration analysis is performed; when the source is not known to be trusted, an intentional data exfiltration analysis is performed. Unintentional data exfiltration analysis is considerably more computationally intensive, so making this determination can save significantly on computational resources. If an exfiltration event is identified, the system prevents export of the set of weights; otherwise the set of weights can be provided back to the algorithm developer.
    Type: Application
    Filed: October 23, 2023
    Publication date: May 2, 2024
    Inventors: Mary Elizabeth Chalk, Robert Derward Rogers, Alan Donald Czeszynski
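The trust-based routing decision reduces to a small dispatch. This is only a structural sketch: the two analysis callables are hypothetical stand-ins for the unintentional and intentional exfiltration checks, which the abstract does not specify.

```python
def release_weights(weights, trusted, unintentional_check, intentional_check):
    """Route to the appropriate exfiltration analysis, then gate export.
    Each check returns True when an exfiltration event is identified."""
    check = unintentional_check if trusted else intentional_check
    # block export of the trained weights when an event is identified
    return weights if not check(weights) else None
```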
  • Publication number: 20240037272
    Abstract: Systems and methods for providing algorithm performance feedback to an algorithm developer are provided. In some embodiments, an algorithm and a data set are received within a secure computing node. The data set is processed using the algorithm to generate an algorithm output. A raw performance model is generated by regression modeling the algorithm output. The raw performance model is then smoothed to generate a final performance model, which is then encrypted and routed to the algorithm developer for further analysis. The performance model models at least one of the algorithm's accuracy, F1 score, precision, recall, dice score, ROC (receiver operating characteristic) curve/area, log loss, Jaccard index, error, or R2, or some combination thereof. The regression modeling includes linear least squares, logistic regression, deep learning, or some combination thereof.
    Type: Application
    Filed: July 15, 2023
    Publication date: February 1, 2024
    Inventors: Mary Elizabeth Chalk, Robert Derward Rogers, Alan Donald Czeszynski
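The regress-then-smooth pipeline can be sketched with the linear-least-squares case the abstract names. The scalar covariate, the moving-average smoother, and the function names are assumptions for illustration; the patent covers a broader family of regressions and smoothing.

```python
import numpy as np

def raw_performance_model(x, y):
    """Linear least-squares fit of a performance metric y against covariate x."""
    A = np.vstack([x, np.ones_like(x)]).T
    slope, intercept = np.linalg.lstsq(A, y, rcond=None)[0]
    return slope, intercept

def smooth(y, k=3):
    """Moving-average smoothing of the raw metric curve before release."""
    return np.convolve(y, np.ones(k) / k, mode="valid")
```

Smoothing matters for the security posture as well as readability: the released performance model reveals trends to the developer without exposing per-sample detail.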
  • Publication number: 20240037299
    Abstract: Systems and methods for providing algorithm performance feedback to an algorithm developer are provided. In some embodiments, an algorithm and a data set are received within a secure computing node. The data set is processed using the algorithm to generate an algorithm output. A raw performance model is generated by regression modeling the algorithm output. The raw performance model is then smoothed to generate a final performance model, which is then encrypted and routed to the algorithm developer for further analysis. The performance model models at least one of the algorithm's accuracy, F1 score, precision, recall, dice score, ROC (receiver operating characteristic) curve/area, log loss, Jaccard index, error, or R2, or some combination thereof. The regression modeling includes linear least squares, logistic regression, deep learning, or some combination thereof.
    Type: Application
    Filed: July 14, 2023
    Publication date: February 1, 2024
    Inventors: Mary Elizabeth Chalk, Robert Derward Rogers, Alan Donald Czeszynski
  • Publication number: 20240020417
    Abstract: Systems and methods for federated localized feedback and performance tracking of an algorithm are provided. An encrypted algorithm and data are provided to a sequestered computing node. The algorithm is decrypted and processes the protected information to generate inferences from dataframes, which are provided to an inference interaction server that performs feedback processing on the inference/dataframe pairs. Further, a computerized method of secure model generation in a sequestered computing node is provided, using automated multi-model training, leaderboard generation, and then optimization. The top model is selected, and security processing on the selected model may be performed. Also, systems and methods are provided for mapping data input features to a data profile to prevent data exfiltration. Consumed data is broken out by features, and the features are mapped to either sensitive or non-sensitive classifications.
    Type: Application
    Filed: April 26, 2023
    Publication date: January 18, 2024
    Inventors: Mary Elizabeth Chalk, Robert Derward Rogers, Alan Donald Czeszynski
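The leaderboard-then-select step can be sketched in a few lines. The tuple layout and names below are hypothetical; the abstract does not specify how candidates are scored.

```python
def build_leaderboard(results):
    """results: list of (model_name, validation_score) pairs,
    ranked best-first as in the multi-model training step."""
    return sorted(results, key=lambda r: r[1], reverse=True)

def select_top_model(results):
    """Pick the leaderboard leader for subsequent security processing."""
    return build_leaderboard(results)[0][0]
```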
  • Publication number: 20230274026
    Abstract: Confirmation of data selection in a zero-trust environment is provided. In some embodiments, a synthetic data steward and/or a traditional data steward can receive the dataset(s). Additionally, a script is received from the algorithm developer. The dataset(s) and script(s) reside within a secure computing node and are therefore inaccessible by any party. The script(s) are executed, resulting in at least one confirmation about the data within the dataset(s). The script(s) may confirm any of: a format for data in the at least one dataset, the expected class values for data within the at least one dataset, an overall characterization and completeness of the at least one dataset, and/or an expected class membership for different data attributes within the at least one dataset.
    Type: Application
    Filed: February 17, 2023
    Publication date: August 31, 2023
    Inventors: Mary Elizabeth Chalk, Robert Derward Rogers, Alan Donald Czeszynski
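A confirmation script of the kind described might look like the sketch below. The dataset representation (list of dicts), field names, and return format are all assumptions; only the three confirmations (format, class values, completeness) come from the abstract.

```python
def confirm_dataset(rows, schema, allowed_classes):
    """Run inside the secure node; returns confirmations, never the data."""
    return {
        # every record carries exactly the expected fields
        "format": all(set(r) == set(schema) for r in rows),
        # every label is one of the expected class values
        "classes": all(r["label"] in allowed_classes for r in rows),
        # no missing values anywhere in the dataset
        "complete": all(v is not None for r in rows for v in r.values()),
    }
```

Only the boolean confirmations leave the node, which is what keeps the dataset inaccessible to the algorithm developer while still letting them verify it is fit for purpose.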
  • Publication number: 20230205917
    Abstract: Systems and methods for the validation and transform of data for processing by an algorithm are provided. In some embodiments, input data is cleaned, and then the domain of the data is determined. The domain of the data refers to the data type. A validation of the data then occurs, comparing the ranges and distribution the data should have, according to the domain, against the actual data ranges and distribution. Data that fails the validation undergoes a transform step and is then re-validated. This process iterates until the data set passes validation. Transforms are first selected based upon the previously determined data domain; transforms that fit a range requirement or a distribution type may be selected. In alternate embodiments, machine learning (ML) may be employed to train models, exclusive to a given domain, to identify needed transforms.
    Type: Application
    Filed: December 20, 2022
    Publication date: June 29, 2023
    Inventors: Mary Elizabeth Chalk, Robert Derward Rogers, Alan Donald Czeszynski
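The iterative validate-then-transform loop in the last entry can be sketched as follows. This is a minimal illustration under simplifying assumptions: validation is a plain range check, and the candidate transforms are an ordered list already selected for the data's domain (all names hypothetical).

```python
import math

def validate_range(values, lo, hi):
    """Domain-derived range validation: every value must fall in [lo, hi]."""
    return all(lo <= v <= hi for v in values)

def iterative_clean(values, lo, hi, transforms, max_iter=5):
    """Apply domain-selected transforms one at a time, re-validating after
    each, until the data passes or the candidates are exhausted."""
    for transform in transforms[:max_iter]:
        if validate_range(values, lo, hi):
            break
        values = [transform(v) for v in values]
    return values
```

For example, heavily right-skewed data that fails a [0, 2] range check can pass after a log10 transform, which is the kind of range/distribution-fitting transform the abstract describes.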