Patents by Inventor ANDREW BROWNSWORD

ANDREW BROWNSWORD has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SEMI-SUPERVISED FRAMEWORK FOR PURPOSE-ORIENTED ANOMALY DETECTION

Publication number: 20230362180

Abstract: Techniques for implementing a semi-supervised framework for purpose-oriented anomaly detection are provided. In one technique, a data item is inputted into an unsupervised anomaly detection model, which generates first output. Based on the first output, it is determined whether the data item represents an anomaly. In response to determining that the data item represents an anomaly, the data item is inputted into a supervised classification model, which generates second output that indicates whether the data item is unknown. In response to determining that the data item is unknown, a training instance is generated based on the data item. The supervised classification model is updated based on the training instance.

Type: Application

Filed: May 9, 2022

Publication date: November 9, 2023

Inventors: Milos Vasic, Saeid Allahdadian, Matteo Casserini, Felix Schmidt, Andrew Brownsword
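
The two-stage flow in the abstract above (unsupervised detector first, supervised classifier only for anomalies, unknown items collected as training instances) can be illustrated with a minimal Python sketch. The class name, the "unknown" label, the score threshold, and the callable stand-ins for both models are assumptions made for illustration; they are not taken from the patent application.

```python
from dataclasses import dataclass, field
from typing import Callable, List

UNKNOWN = "unknown"  # hypothetical label the classifier emits for unseen anomaly kinds


@dataclass
class PurposeOrientedDetector:
    # Stand-ins for the two models; any callables with these shapes will do.
    anomaly_score: Callable[[dict], float]  # unsupervised model: higher means more anomalous
    classify: Callable[[dict], str]         # supervised model: returns a label or UNKNOWN
    threshold: float = 0.5                  # assumed cut-off on the anomaly score
    training_instances: List[dict] = field(default_factory=list)

    def process(self, item: dict) -> str:
        # Stage 1: decide from the unsupervised output whether the item is an anomaly.
        if self.anomaly_score(item) <= self.threshold:
            return "normal"
        # Stage 2: only anomalies reach the supervised classifier.
        label = self.classify(item)
        if label == UNKNOWN:
            # Unknown anomalies become training instances for a later classifier update.
            self.training_instances.append(item)
        return label


# Toy models: the score is read from the item, and very high scores are "unknown".
detector = PurposeOrientedDetector(
    anomaly_score=lambda d: d["x"],
    classify=lambda d: UNKNOWN if d["x"] > 0.9 else "port_scan",
)
```

In this sketch the classifier update itself is left to the caller, for example retraining once the collected instances have been labeled.
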
Multi-stage feature extraction for effective ML-based anomaly detection on structured log data

Patent number: 11704386

Abstract: Herein are feature extraction mechanisms that receive parsed log messages as inputs and transform them into numerical feature vectors for machine learning models (MLMs). In an embodiment, a computer extracts fields from a log message. Each field specifies a name, a text value, and a type. For each field, a field transformer for the field is dynamically selected based on the field's name and/or the field's type. The field transformer converts the field's text value into a value of the field's type. A feature encoder for the value of the field's type is dynamically selected based on the field's type and/or a range of the field's values that occur in a training corpus of an MLM. From the feature encoder, an encoding of the value of the field's type is stored into a feature vector. Based on the MLM and the feature vector, the log message is detected as anomalous.

Type: Grant

Filed: March 12, 2021

Date of Patent: July 18, 2023

Assignee: Oracle International Corporation

Inventors: Amin Suzani, Saeid Allahdadian, Milos Vasic, Matteo Casserini, Hamed Ahmadi, Felix Schmidt, Andrew Brownsword, Nipun Agarwal
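
The two selection stages in the abstract above (a field transformer chosen by type, then a feature encoder chosen by type and by the values seen in a training corpus) can be sketched in Python as follows. The type names, the min-max scaling, and the one-hot encoding are assumptions chosen for illustration; the patent does not prescribe these particular encoders.

```python
from typing import Callable, Dict, List, Tuple

Field = Tuple[str, str, str]  # (name, text value, type), as described in the abstract

# Stage 1: field transformers, selected here by field type only (an assumption).
TRANSFORMERS: Dict[str, Callable[[str], object]] = {
    "int": int,
    "float": float,
    "str": str,
}


def select_encoder(ftype: str, training_values: List[object]) -> Callable[[object], List[float]]:
    """Stage 2: pick an encoder from the field type and the training-corpus values."""
    if ftype in ("int", "float"):
        lo, hi = min(training_values), max(training_values)
        span = (hi - lo) or 1.0  # avoid division by zero for constant fields
        return lambda v: [(v - lo) / span]  # min-max scaling over the observed range
    vocab = sorted(set(training_values))
    return lambda v: [1.0 if v == w else 0.0 for w in vocab]  # one-hot over seen values


def extract(fields: List[Field], corpus: Dict[str, List[object]]) -> List[float]:
    """Turn the parsed fields of one log message into a flat numeric feature vector."""
    vector: List[float] = []
    for name, text, ftype in fields:
        value = TRANSFORMERS[ftype](text)            # text value -> typed value
        encoder = select_encoder(ftype, corpus[name])
        vector.extend(encoder(value))                # encoding stored into the vector
    return vector


corpus = {"status": [200, 404, 500], "method": ["GET", "POST"]}
vector = extract([("status", "404", "int"), ("method", "GET", "str")], corpus)
```

The resulting vector would then be passed to whichever model performs the anomaly detection; that step is outside this sketch.
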
BALANCING FEATURE DISTRIBUTIONS USING AN IMPORTANCE FACTOR

Publication number: 20230024884

Abstract: Herein are machine learning techniques that adjust reconstruction loss of a reconstructive model such as an autoencoder based on importances of values of features. In an embodiment, before, during, or after training the reconstructive model, which more or less accurately reconstructs its input, a computer measures, for each distinct value of each feature, a respective importance that is not based on the reconstructive model. For example, importance may be based solely on a training corpus. For each feature during or after training, a respective original loss from the reconstructive model measures a difference between a value of the feature in an input and a reconstructed value of the feature generated by the reconstructive model. For each feature, the respective importance of the input value of the feature is applied to the respective original loss to generate a respective weighted loss. The weighted losses of the features of the input are collectively detected as anomalous or non-anomalous.

Type: Application

Filed: July 20, 2021

Publication date: January 26, 2023

Inventors: MATTEO CASSERINI, SAEID ALLAHDADIAN, FELIX SCHMIDT, ANDREW BROWNSWORD
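
The importance weighting in the abstract above can be illustrated with a short Python sketch. The abstract only says that importance is measured per distinct feature value and may be based solely on the training corpus; the specific formula used here (one minus relative frequency), the 0/1 per-feature loss, and the summed-loss threshold are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List


def value_importances(corpus: List[dict]) -> Dict[str, Dict[object, float]]:
    """Importance of each distinct value of each feature, from the corpus alone.

    Assumed formula: 1 - relative frequency of the value.
    """
    n = len(corpus)
    importances: Dict[str, Dict[object, float]] = {}
    for feature in corpus[0]:
        counts = Counter(row[feature] for row in corpus)
        importances[feature] = {v: 1.0 - c / n for v, c in counts.items()}
    return importances


def weighted_losses(x: dict, reconstructed: dict, importances, default: float = 1.0) -> Dict[str, float]:
    """Apply the importance of each input value to that feature's original loss."""
    losses = {}
    for feature, value in x.items():
        original = 0.0 if reconstructed[feature] == value else 1.0  # toy 0/1 loss
        losses[feature] = importances[feature].get(value, default) * original
    return losses


def is_anomalous(losses: Dict[str, float], threshold: float) -> bool:
    """Judge the weighted losses collectively (here: by their sum, an assumption)."""
    return sum(losses.values()) > threshold


corpus = [{"proto": "tcp"}] * 3 + [{"proto": "udp"}]
imps = value_importances(corpus)
```
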
SPARSE ENSEMBLING OF UNSUPERVISED MODELS

Publication number: 20220318684

Abstract: Techniques are provided for sparse ensembling of unsupervised machine learning models. In an embodiment, the proposed architecture is composed of multiple unsupervised machine learning models that each produce a score as output and a gating network that analyzes the inputs and outputs of the unsupervised machine learning models to select an optimal ensemble of unsupervised machine learning models. The gating network is trained to choose a minimal number of the multiple unsupervised machine learning models whose scores are combined to create a final score that matches or closely resembles a final score that is computed using all the scores of the multiple unsupervised machine learning models.

Type: Application

Filed: April 2, 2021

Publication date: October 6, 2022

Inventors: SAEID ALLAHDADIAN, AMIN SUZANI, MILOS VASIC, MATTEO CASSERINI, ANDREW BROWNSWORD, FELIX SCHMIDT, NIPUN AGARWAL
Malicious activity detection by cross-trace analysis and deep learning

Patent number: 11451565

Abstract: Techniques are provided herein for contextual embedding of features of operational logs or network traffic for anomaly detection based on sequence prediction. In an embodiment, a computer has a predictive recurrent neural network (RNN) that detects an anomalous network flow. In an embodiment, an RNN contextually transcodes sparse feature vectors that represent log messages into dense feature vectors that may be predictive or used to generate predictive vectors. In an embodiment, graph embedding improves feature embedding of log traces. In an embodiment, a computer detects and feature-encodes independent traces from related log messages. These techniques may detect malicious activity by anomaly analysis of context-aware feature embeddings of network packet flows, log messages, and/or log traces.

Type: Grant

Filed: September 5, 2018

Date of Patent: September 20, 2022

Assignee: Oracle International Corporation

Inventors: Guang-Tong Zhou, Hossein Hajimirsadeghi, Andrew Brownsword, Stuart Wray, Craig Schelp, Rod Reddekopp, Felix Schmidt
MULTI-STAGE FEATURE EXTRACTION FOR EFFECTIVE ML-BASED ANOMALY DETECTION ON STRUCTURED LOG DATA

Publication number: 20220292304

Abstract: Herein are feature extraction mechanisms that receive parsed log messages as inputs and transform them into numerical feature vectors for machine learning models (MLMs). In an embodiment, a computer extracts fields from a log message. Each field specifies a name, a text value, and a type. For each field, a field transformer for the field is dynamically selected based on the field's name and/or the field's type. The field transformer converts the field's text value into a value of the field's type. A feature encoder for the value of the field's type is dynamically selected based on the field's type and/or a range of the field's values that occur in a training corpus of an MLM. From the feature encoder, an encoding of the value of the field's type is stored into a feature vector. Based on the MLM and the feature vector, the log message is detected as anomalous or not.

Type: Application

Filed: March 12, 2021

Publication date: September 15, 2022

Inventors: AMIN SUZANI, SAEID ALLAHDADIAN, MILOS VASIC, MATTEO CASSERINI, HAMED AHMADI, FELIX SCHMIDT, ANDREW BROWNSWORD, NIPUN AGARWAL
Out of band server utilization estimation and server workload characterization for datacenter resource optimization and forecasting

Patent number: 11423327

Abstract: Techniques are described herein for estimating CPU, memory, and I/O utilization for a workload via out-of-band sensor readings using a machine learning model. The framework involves receiving sensor data associated with executing benchmark applications, obtaining ground truth utilization values for the benchmarks, preprocessing the training data to select a set of enhanced sequences, and using the enhanced sequences to train a random forest model to estimate CPU, memory, and I/O utilization given sensor monitoring data. Prior to the training phase, a machine learning model is trained using a set of predefined hyper-parameters. The trained models are used to generate estimations for CPU, memory, and I/O utilization values. The utilization values are used with workload context information to assess the deployment and generate one or more recommendations for machine types that will best serve the workload in terms of system utilization.

Type: Grant

Filed: October 10, 2018

Date of Patent: August 23, 2022

Assignee: Oracle International Corporation

Inventors: Onur Kocberber, Felix Schmidt, Craig Schelp, Andrew Brownsword, Nipun Agarwal
Parsing of unstructured log data into structured data and creation of schema

Patent number: 11372868

Abstract: Herein are techniques for training a parser by categorizing and generalizing messages and abstracting message templates for parsing after training. In an embodiment, a computer generates a message signature based on a message sequence of tokens that were extracted from a training message. The message signature is matched to a cluster signature that represents messages of one of many clusters that have distinct signatures. The training message is added to the cluster. Based on a data type of the cluster signature, a value is extracted from a second message, such as a live message after training. Fuzzy signatures may be probabilistically matched to select a best matching cluster for a message. The value range of a token may be broadened or narrowed by adding or removing candidate data types, by adding or removing literals to a data type, and/or by promoting a narrow data type to a broader data type.

Type: Grant

Filed: January 14, 2019

Date of Patent: June 28, 2022

Assignee: Oracle International Corporation

Inventors: Rod Reddekopp, Andrew Brownsword, Manel Fernandez Gomez, Juan Fernandez Peinador
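
The signature-and-cluster idea in the abstract above can be sketched in Python as follows. This is a deliberately simplified illustration: signatures are matched exactly rather than fuzzily, only two data types are recognized, and the placeholder names and function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Signature = Tuple[str, ...]


def token_type(token: str) -> str:
    """Generalize a token to a data-type placeholder, or keep it as a literal."""
    if re.fullmatch(r"-?\d+", token):
        return "<int>"
    if re.fullmatch(r"-?\d+\.\d+", token):
        return "<float>"
    return token


def signature(message: str) -> Signature:
    """Message signature derived from the message's sequence of tokens."""
    return tuple(token_type(t) for t in message.split())


def train(messages: List[str]) -> Dict[Signature, List[str]]:
    """Add each training message to the cluster whose signature it matches."""
    clusters: Dict[Signature, List[str]] = defaultdict(list)
    for message in messages:
        clusters[signature(message)].append(message)
    return clusters


def parse(message: str, clusters: Dict[Signature, List[str]]) -> Optional[List[object]]:
    """Extract typed values from a later message using its cluster signature."""
    sig = signature(message)
    if sig not in clusters:
        return None  # no matching cluster; a fuzzy matcher could pick the closest one
    casts = {"<int>": int, "<float>": float}
    return [casts[s](t) for s, t in zip(sig, message.split()) if s in casts]


clusters = train(["user 12 logged in", "user 7 logged in", "disk usage 91.5 percent"])
```
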
COPING WITH FEATURE ERROR SUPPRESSION: A MECHANISM TO HANDLE THE CONCEPT DRIFT

Publication number: 20220188410

Abstract: Approaches herein relate to reconstructive models such as an autoencoder for anomaly detection. Herein are machine learning techniques that detect and suppress any feature that causes model decay by concept drift. In an embodiment in a production environment, a computer initializes an unsuppressed subset of features with a plurality of features that an already-trained reconstructive model can process. A respective reconstruction error of each feature of the unsuppressed subset of features is calculated. The computer detects that a respective moving average based on the reconstruction error of a particular feature of the unsuppressed subset of features exceeds a respective feature suppression threshold of the particular feature, which causes removal of the particular feature from the unsuppressed subset of features.

Type: Application

Filed: December 15, 2020

Publication date: June 16, 2022

Inventors: SAEID ALLAHDADIAN, ANDREW BROWNSWORD, MILOS VASIC, MATTEO CASSERINI, AMIN SUZANI, HAMED AHMADI, FELIX SCHMIDT, NIPUN AGARWAL
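
The suppression loop in the abstract above can be sketched in Python. The exponential form of the moving average, the smoothing factor, and the class and method names are assumptions for illustration; the abstract only specifies a per-feature moving average compared against a per-feature threshold.

```python
from typing import Dict, Set


class FeatureSuppressor:
    def __init__(self, thresholds: Dict[str, float], alpha: float = 0.5):
        self.thresholds = dict(thresholds)   # per-feature suppression thresholds
        self.alpha = alpha                   # assumed smoothing factor of the moving average
        self.unsuppressed: Set[str] = set(thresholds)  # starts with every feature
        self.moving_avg: Dict[str, float] = {}

    def observe(self, errors: Dict[str, float]) -> Set[str]:
        """Update moving averages with one input's reconstruction errors."""
        for feature in list(self.unsuppressed):
            err = errors[feature]
            prev = self.moving_avg.get(feature, err)  # first sample seeds the average
            avg = self.alpha * err + (1 - self.alpha) * prev
            self.moving_avg[feature] = avg
            if avg > self.thresholds[feature]:
                # The feature appears to have drifted: remove it from the subset.
                self.unsuppressed.discard(feature)
        return set(self.unsuppressed)


suppressor = FeatureSuppressor({"a": 1.0, "b": 1.0}, alpha=0.5)
first = suppressor.observe({"a": 0.2, "b": 0.4})
second = suppressor.observe({"a": 0.2, "b": 3.0})
```

After the second call, feature "b" has a moving average of 1.7, which exceeds its threshold of 1.0, so only "a" remains unsuppressed.
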
AUTOMATICALLY CHANGE ANOMALY DETECTION THRESHOLD BASED ON PROBABILISTIC DISTRIBUTION OF ANOMALY SCORES

Publication number: 20220188694

Abstract: Approaches herein relate to model decay of an anomaly detector due to concept drift. Herein are machine learning techniques for dynamically self-tuning an anomaly score threshold. In an embodiment in a production environment, a computer receives an item in a stream of items. A machine learning (ML) model hosted by the computer infers by calculation an anomaly score for the item. Whether the item is anomalous or not is decided based on the anomaly score and an adaptive anomaly threshold that dynamically fluctuates. A moving standard deviation of anomaly scores is adjusted based on a moving average of anomaly scores. The moving average of anomaly scores is then adjusted based on the anomaly score. The adaptive anomaly threshold is then adjusted based on the moving average of anomaly scores and the moving standard deviation of anomaly scores.

Type: Application

Filed: December 15, 2020

Publication date: June 16, 2022

Inventors: Amin Suzani, Matteo Casserini, Milos Vasic, Saeid Allahdadian, Andrew Brownsword, Hamed Ahmadi, Felix Schmidt, Nipun Agarwal
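
The update order in the abstract above (moving standard deviation first, then moving average, then threshold) can be sketched in Python. The exponentially weighted formulas, the multiplier k, and the initial values are assumptions for illustration; the abstract does not give the exact update equations.

```python
import math


class AdaptiveThreshold:
    def __init__(self, alpha: float = 0.1, k: float = 3.0, mean: float = 0.0, var: float = 1.0):
        self.alpha = alpha  # assumed smoothing factor
        self.k = k          # assumed number of standard deviations above the mean
        self.mean = mean    # moving average of anomaly scores
        self.var = var      # moving variance (its square root is the moving std)
        self.threshold = mean + k * math.sqrt(var)

    def update(self, score: float) -> bool:
        """Classify one score, then adapt the threshold in the order the abstract lists."""
        anomalous = score > self.threshold  # decision uses the current threshold
        delta = score - self.mean
        # 1. Moving standard deviation, adjusted using the not-yet-updated moving average.
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)
        # 2. Moving average, adjusted using the new score.
        self.mean += self.alpha * delta
        # 3. Threshold, adjusted from the moving average and the moving std.
        self.threshold = self.mean + self.k * math.sqrt(self.var)
        return anomalous


tracker = AdaptiveThreshold(alpha=0.5, k=2.0, mean=0.0, var=1.0)
first_is_anomalous = tracker.update(1.0)
```
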
STATISTICAL CONFIDENCE METRIC FOR RECONSTRUCTIVE ANOMALY DETECTION MODELS

Publication number: 20220156578

Abstract: Approaches herein relate to reconstructive models such as an autoencoder for anomaly detection. Herein are machine learning techniques that measure inference confidence based on reconstruction error trends. In an embodiment, a computer hosts a reconstructive model that encodes and decodes features. Based on that decoding, the following are automatically calculated: a respective reconstruction error of each feature, a respective moving average of reconstruction errors of each feature, an average of the moving averages of the reconstruction errors of all features, a standard deviation of the moving averages of the reconstruction errors of all features, and a confidence of decoding the features that is based on a ratio of the average of the moving averages of the reconstruction errors to the standard deviation of the moving averages of the reconstruction errors. The computer detects and indicates that a threshold exceeds the confidence of decoding, which may cause important automatic reactions herein.

Type: Application

Filed: November 16, 2020

Publication date: May 19, 2022

Inventors: SAEID ALLAHDADIAN, MATTEO CASSERINI, ANDREW BROWNSWORD, AMIN SUZANI, MILOS VASIC, FELIX SCHMIDT, NIPUN AGARWAL
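
The confidence ratio in the abstract above (average of the per-feature moving averages divided by their standard deviation) can be sketched in Python. The use of a population standard deviation, the handling of a zero deviation, and the function names are assumptions for illustration.

```python
import math
import statistics
from typing import Dict


def decoding_confidence(moving_averages: Dict[str, float]) -> float:
    """Ratio of the mean to the std of the per-feature moving reconstruction errors."""
    values = list(moving_averages.values())
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std: an assumption
    return math.inf if std == 0 else mean / std


def low_confidence(confidence: float, threshold: float) -> bool:
    """True when the threshold exceeds the confidence, as in the abstract."""
    return threshold > confidence


# Hypothetical moving averages of reconstruction error for two features.
confidence = decoding_confidence({"a": 1.0, "b": 3.0})
```

Here the mean is 2.0 and the population standard deviation is 1.0, so the ratio is 2.0.
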
ANOMALY DETECTION ON SEQUENTIAL LOG DATA USING A RESIDUAL NEURAL NETWORK

Publication number: 20220108181

Abstract: A multilayer perceptron herein contains an already-trained combined sequence of residual blocks that contains a semantic sequence of residual blocks and a contextual sequence of residual blocks. The semantic sequence of residual blocks contains a semantic sequence of layers of an autoencoder. The contextual sequence of residual blocks contains a contextual sequence of layers of a recurrent neural network. Each residual block of the combined sequence of residual blocks is used based on a respective survival probability. By the autoencoder and based on using each residual block of the semantic sequence, a previous entry of a log is semantically encoded. By the recurrent neural network and based on using each residual block of the contextual sequence, a next entry of the log is predicted. In an embodiment during training, survival probabilities are hyperparameters that are learned and used to probabilistically skip residual blocks such that the multilayer perceptron has stochastic depth.

Type: Application

Filed: October 7, 2020

Publication date: April 7, 2022

Inventors: HAMED AHMADI, SAEID ALLAHDADIAN, MATTEO CASSERINI, MILOS VASIC, AMIN SUZANI, FELIX SCHMIDT, ANDREW BROWNSWORD, NIPUN AGARWAL
Context-aware feature embedding and anomaly detection of sequential log data using deep recurrent neural networks

Patent number: 11218498

Abstract: Techniques are provided herein for contextual embedding of features of operational logs or network traffic for anomaly detection based on sequence prediction. In an embodiment, a computer has a predictive recurrent neural network (RNN) that detects an anomalous network flow. In an embodiment, an RNN contextually transcodes sparse feature vectors that represent log messages into dense feature vectors that may be predictive or used to generate predictive vectors. In an embodiment, graph embedding improves feature embedding of log traces. In an embodiment, a computer detects and feature-encodes independent traces from related log messages. These techniques may detect malicious activity by anomaly analysis of context-aware feature embeddings of network packet flows, log messages, and/or log traces.

Type: Grant

Filed: September 5, 2018

Date of Patent: January 4, 2022

Assignee: Oracle International Corporation

Inventors: Hossein Hajimirsadeghi, Guang-Tong Zhou, Andrew Brownsword, Nipun Agarwal, Pavan Chandrashekar, Karoon Rashedi Nia
Malicious activity detection by cross-trace analysis and deep learning

Patent number: 11082438

Abstract: Techniques are provided herein for contextual embedding of features of operational logs or network traffic for anomaly detection based on sequence prediction. In an embodiment, a computer has a predictive recurrent neural network (RNN) that detects an anomalous network flow. In an embodiment, an RNN contextually transcodes sparse feature vectors that represent log messages into dense feature vectors that may be predictive or used to generate predictive vectors. In an embodiment, graph embedding improves feature embedding of log traces. In an embodiment, a computer detects and feature-encodes independent traces from related log messages. These techniques may detect malicious activity by anomaly analysis of context-aware feature embeddings of network packet flows, log messages, and/or log traces.

Type: Grant

Filed: September 5, 2018

Date of Patent: August 3, 2021

Assignee: Oracle International Corporation

Inventors: Juan Fernandez Peinador, Manel Fernandez Gomez, Guang-Tong Zhou, Hossein Hajimirsadeghi, Andrew Brownsword, Onur Kocberber, Felix Schmidt, Craig Schelp
MODULAR FEATURE EXTRACTION FROM PARSED LOG DATA

Publication number: 20200364585

Abstract: Herein are techniques for efficient and modular transcoding of message fields into features for inclusion within a feature vector. In an embodiment, a computer receives message signatures. Each signature has fields. Each field has a name and type. A feature map is generated that associates a field name and field type with transcoder(s). A message is received from a parser as field tuples. Each tuple has a type, name, and value of a field. Each tuple is processed as follows. The field name and field type of the tuple are used as a lookup key into the feature map to retrieve respective transcoder(s) that each generate a respective encoded feature from the field value of the tuple. An encoded feature from at least one relevant transcoder is written into a respective distinct location within a feature vector to encode the message. An inference is made based on the feature vector.

Type: Application

Filed: May 17, 2019

Publication date: November 19, 2020

Inventors: PAVAN CHANDRASHEKAR, ANDREW BROWNSWORD, MANEL FERNANDEZ GOMEZ, JUAN FERNANDEZ PEINADOR, ROD REDDEKOPP
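
The feature-map lookup in the abstract above can be sketched in Python. The slot-assignment scheme, the type names, and the example transcoders are assumptions for illustration; only the (field name, field type) lookup key and the one-location-per-encoded-feature layout come from the abstract.

```python
from typing import Callable, Dict, List, Tuple

Transcoder = Callable[[str], float]
Key = Tuple[str, str]  # (field name, field type)


def build_feature_map(signatures, transcoders_by_type: Dict[str, List[Transcoder]]):
    """Associate each (name, type) with its transcoders and a distinct vector slot each."""
    feature_map: Dict[Key, List[Tuple[int, Transcoder]]] = {}
    next_slot = 0
    for fields in signatures:
        for name, ftype in fields:
            key = (name, ftype)
            if key in feature_map:
                continue  # the same field may appear in several signatures
            entries = []
            for transcoder in transcoders_by_type.get(ftype, []):
                entries.append((next_slot, transcoder))
                next_slot += 1
            feature_map[key] = entries
    return feature_map, next_slot  # next_slot is the feature vector width


def encode(message, feature_map, width: int) -> List[float]:
    """Encode one parsed message, given as (type, name, value) tuples."""
    vector = [0.0] * width
    for ftype, name, value in message:
        for slot, transcoder in feature_map.get((name, ftype), []):
            vector[slot] = transcoder(value)
    return vector


# Hypothetical transcoders: numeric value and text length for ints, length for strings.
transcoders = {"int": [float, lambda v: float(len(v))], "str": [lambda v: float(len(v))]}
fmap, width = build_feature_map([[("status", "int"), ("host", "str")]], transcoders)
encoded = encode([("int", "status", "404"), ("str", "host", "db1")], fmap, width)
```
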
Engine for reactive execution of massively concurrent heterogeneous accelerated scripted streaming analyses

Patent number: 10768982

Abstract: Herein are techniques for analysis of data streams. In an embodiment, a computer associates each software actor with data streams. Each software actor has its own backlog queue of data to analyze. In response to receiving some stream content and based on the received stream content, data is distributed to some software actors. In response to determining that the data satisfies completeness criteria of a particular software actor, an indication of the data is appended onto the backlog queue of the particular software actor. The particular software actor is reset to an initial state by loading an execution snapshot of a previous initial execution of an embedded virtual machine. Based on the particular software actor, execution of the execution snapshot of the previous initial execution is resumed to dequeue and process the indication of the data from the backlog queue of the particular software actor to generate a result.

Type: Grant

Filed: September 19, 2018

Date of Patent: September 8, 2020

Assignee: Oracle International Corporation

Inventors: Andrew Brownsword, Tayler Hetherington, Pavan Chandrashekar, Akhilesh Singhania, Stuart Wray, Pravin Shinde, Felix Schmidt, Craig Schelp, Onur Kocberber, Juan Fernandez Peinador, Rod Reddekopp, Manel Fernandez Gomez, Nipun Agarwal
PARSING OF UNSTRUCTURED LOG DATA INTO STRUCTURED DATA AND CREATION OF SCHEMA

Publication number: 20200226214

Abstract: Herein are techniques for training a parser by categorizing and generalizing messages and abstracting message templates for parsing after training. In an embodiment, a computer generates a message signature based on a message sequence of tokens that were extracted from a training message. The message signature is matched to a cluster signature that represents messages of one of many clusters that have distinct signatures. The training message is added to the cluster. Based on a data type of the cluster signature, a value is extracted from a second message, such as a live message after training. Fuzzy signatures may be probabilistically matched to select a best matching cluster for a message. The value range of a token may be broadened or narrowed by adding or removing candidate data types, by adding or removing literals to a data type, and/or by promoting a narrow data type to a broader data type.

Type: Application

Filed: January 14, 2019

Publication date: July 16, 2020

Inventors: ROD REDDEKOPP, ANDREW BROWNSWORD, MANEL FERNANDEZ GOMEZ, JUAN FERNANDEZ PEINADOR
OUT OF BAND SERVER UTILIZATION ESTIMATION AND SERVER WORKLOAD CHARACTERIZATION FOR DATACENTER RESOURCE OPTIMIZATION AND FORECASTING

Publication number: 20200118039

Abstract: Techniques are described herein for estimating CPU, memory, and I/O utilization for a workload via out-of-band sensor readings using a machine learning model. The framework involves receiving sensor data associated with executing benchmark applications, obtaining ground truth utilization values for the benchmarks, preprocessing the training data to select a set of enhanced sequences, and using the enhanced sequences to train a random forest model to estimate CPU, memory, and I/O utilization given sensor monitoring data. Prior to the training phase, a machine learning model is trained using a set of predefined hyper-parameters. The trained models are used to generate estimations for CPU, memory, and I/O utilization values. The utilization values are used with workload context information to assess the deployment and generate one or more recommendations for machine types that will best serve the workload in terms of system utilization.

Type: Application

Filed: October 10, 2018

Publication date: April 16, 2020

Inventors: Onur Kocberber, Felix Schmidt, Craig Schelp, Andrew Brownsword, Nipun Agarwal
ENGINE FOR REACTIVE EXECUTION OF MASSIVELY CONCURRENT HETEROGENEOUS ACCELERATED SCRIPTED STREAMING ANALYSES

Publication number: 20200089529

Abstract: Herein are techniques for analysis of data streams. In an embodiment, a computer associates each software actor with data streams. Each software actor has its own backlog queue of data to analyze. In response to receiving some stream content and based on the received stream content, data is distributed to some software actors. In response to determining that the data satisfies completeness criteria of a particular software actor, an indication of the data is appended onto the backlog queue of the particular software actor. The particular software actor is reset to an initial state by loading an execution snapshot of a previous initial execution of an embedded virtual machine. Based on the particular software actor, execution of the execution snapshot of the previous initial execution is resumed to dequeue and process the indication of the data from the backlog queue of the particular software actor to generate a result.

Type: Application

Filed: September 19, 2018

Publication date: March 19, 2020

Inventors: ANDREW BROWNSWORD, TAYLER HETHERINGTON, PAVAN CHANDRASHEKAR, AKHILESH SINGHANIA, STUART WRAY, PRAVIN SHINDE, FELIX SCHMIDT, CRAIG SCHELP, ONUR KOCBERBER, JUAN FERNANDEZ PEINADOR, ROD REDDEKOPP, MANEL FERNANDEZ GOMEZ, NIPUN AGARWAL
CONTEXT-AWARE FEATURE EMBEDDING AND ANOMALY DETECTION OF SEQUENTIAL LOG DATA USING DEEP RECURRENT NEURAL NETWORKS

Publication number: 20200076841

Abstract: Techniques are provided herein for contextual embedding of features of operational logs or network traffic for anomaly detection based on sequence prediction. In an embodiment, a computer has a predictive recurrent neural network (RNN) that detects an anomalous network flow. In an embodiment, an RNN contextually transcodes sparse feature vectors that represent log messages into dense feature vectors that may be predictive or used to generate predictive vectors. In an embodiment, graph embedding improves feature embedding of log traces. In an embodiment, a computer detects and feature-encodes independent traces from related log messages. These techniques may detect malicious activity by anomaly analysis of context-aware feature embeddings of network packet flows, log messages, and/or log traces.

Type: Application

Filed: September 5, 2018

Publication date: March 5, 2020

Inventors: HOSSEIN HAJIMIRSADEGHI, GUANG-TONG ZHOU, ANDREW BROWNSWORD, NIPUN AGARWAL, PAVAN CHANDRASHEKAR, KAROON RASHEDI NIA
