Patents by Inventor Arno Schneuwly
Arno Schneuwly has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240143993
Abstract: A computer trains, based on many timeseries, many anomaly detectors. Each anomaly detector is configured with a respective distinct contamination factor. Each timeseries is a temporal sequence of datapoints that characterize a device. Each datapoint in the many timeseries has a respective label that indicates whether the device failed when the datapoint occurred. Each anomaly detector detects: a set of anomalous datapoints, the size of which is proportional to the contamination factor of the anomaly detector, a healthy count of anomalous datapoints in timeseries of devices that did not fail, and an unhealthy count of anomalous datapoints in timeseries of failed devices. For a particular anomaly detector, the computer detects that the magnitude of the difference between the respective healthy count and the respective unhealthy count is less than a predefined threshold. Based on the contamination factor of the particular anomaly detector, anomalous datapoints are oversampled.
Type: Application
Filed: October 28, 2022
Publication date: May 2, 2024
Inventors: Arno Schneuwly, Suwen Yang
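The detector-selection step above can be sketched in a few lines of Python. This is an illustrative toy, not the patented method: the score-based flagging rule, the labels, and the threshold are all hypothetical stand-ins for the trained anomaly detectors described in the abstract.

```python
def flag_anomalies(scores, contamination):
    """Flag the top `contamination` fraction of datapoints as anomalous."""
    k = max(1, int(len(scores) * contamination))
    cutoff = sorted(scores, reverse=True)[k - 1]
    return [s >= cutoff for s in scores]

def count_split(flags, labels):
    """Count flagged datapoints in healthy (label 0) vs failed (label 1) timeseries."""
    healthy = sum(f for f, l in zip(flags, labels) if l == 0)
    unhealthy = sum(f for f, l in zip(flags, labels) if l == 1)
    return healthy, unhealthy

def select_detector(scores, labels, factors, threshold):
    """Return the first contamination factor whose healthy and unhealthy
    anomaly counts differ in magnitude by less than `threshold`."""
    for c in factors:
        h, u = count_split(flag_anomalies(scores, c), labels)
        if abs(h - u) < threshold:
            return c
    return None
```

The selected factor would then drive oversampling of the anomalous datapoints, e.g. when rebalancing a training set.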
-
Publication number: 20240126798
Abstract: In an embodiment, a computer stores, in memory or storage, many explanation profiles, many log entries, and definitions of many features that log entries contain. Some features may contain a logic statement such as a database query, and these are specially aggregated based on similarity. Based on the entity specified by an explanation profile, statistics are materialized for some or all features. Statistics calculation may be based on scheduled batches of log entries or a stream of live log entries. At runtime, an inference that is based on a new log entry is received. Based on an entity specified in the new log entry, a particular explanation profile is dynamically selected. Based on the new log entry and statistics of features for the selected explanation profile, a local explanation of the inference is generated. In an embodiment, an explanation text template is used to generate the local explanation.
Type: Application
Filed: May 30, 2023
Publication date: April 18, 2024
Inventors: Arno Schneuwly, Desislava Wagenknecht-Dimitrova, Felix Schmidt, Marija Nikolic, Matteo Casserini, Milos Vasic, Renata Khasanova
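A minimal sketch of the profile-selection and template-rendering flow, assuming per-feature (mean, standard deviation) statistics and a z-score deviation measure; the entities, feature names, and statistic choice are hypothetical, not from the patent:

```python
def local_explanation(entry, profiles, templates):
    """Select the explanation profile for the entry's entity, compare each
    feature value to that profile's materialized statistics, and fill a
    text template for the most deviating feature."""
    stats = profiles[entry["entity"]]

    def z(feature):
        # Deviation of this entry's value from the profile baseline.
        mean, std = stats[feature]
        return abs(entry[feature] - mean) / std

    feature = max(stats, key=z)
    return templates[feature].format(value=entry[feature])
```

At runtime each new log entry would select its profile dynamically via the `entity` field, as the abstract describes.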
-
Publication number: 20240070156
Abstract: Techniques for propagating scores in subgraphs are provided. In one technique, multiple path scores are stored, each path score associated with a path (or subgraph), of multiple paths, in a graph of nodes. The path scores may be generated by a machine-learned model. For each path score, the path that is associated with that path score is identified, along with the nodes of that path. For each identified node, a node score is computed based on the corresponding path score and stored in association with that node. Subsequently, for each node in a subset of the graph, the multiple node scores that are associated with that node are identified and aggregated to generate a propagated score for that node. In a related technique, the propagated score of a node is used to compute a score for each leaf node of the node.
Type: Application
Filed: August 23, 2022
Publication date: February 29, 2024
Inventors: Kenyu Kobayashi, Arno Schneuwly, Renata Khasanova, Matteo Casserini, Felix Schmidt
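The core propagation step can be sketched as follows. This is a toy illustration: the abstract does not fix the aggregation function, so the mean used here is an arbitrary stand-in, and paths are represented simply as node tuples.

```python
from collections import defaultdict

def propagate(path_scores, paths):
    """For each node, collect the scores of every path it appears in,
    then aggregate (here: mean) to get the node's propagated score."""
    per_node = defaultdict(list)
    for path, score in zip(paths, path_scores):
        for node in path:
            per_node[node].append(score)
    return {n: sum(v) / len(v) for n, v in per_node.items()}
```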
-
Publication number: 20240061997
Abstract: Herein is a machine learning (ML) explainability (MLX) approach in which a natural language explanation is generated based on analysis of a parse tree such as for a suspicious database query or web browser JavaScript. In an embodiment, a computer selects, based on a respective relevance score for each non-leaf node in a parse tree of a statement, a relevant subset of non-leaf nodes. The non-leaf nodes are grouped in the parse tree into groups that represent respective portions of the statement. Based on a relevant subset of the groups that contain at least one non-leaf node in the relevant subset of non-leaf nodes, a natural language explanation of why the statement is anomalous is generated.
Type: Application
Filed: August 19, 2022
Publication date: February 22, 2024
Inventors: Kenyu Kobayashi, Arno Schneuwly, Renata Khasanova, Matteo Casserini, Felix Schmidt
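A compact sketch of the selection-then-grouping idea, assuming a fixed top-k cutoff for "relevant" nodes and one canned sentence per group; the node names, group labels, and templates are invented for illustration:

```python
def explain(node_scores, groups, top_k, templates):
    """Pick the top_k most relevant non-leaf nodes, keep only groups
    (portions of the statement) containing at least one of them, and
    render a natural-language sentence per kept group."""
    relevant = set(sorted(node_scores, key=node_scores.get, reverse=True)[:top_k])
    kept = [g for g, nodes in groups.items() if relevant & set(nodes)]
    return " ".join(templates[g] for g in kept)
```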
-
Publication number: 20240037383
Abstract: Herein are machine learning (ML) explainability (MLX) techniques for calculating and using a novel fidelity metric for assessing and comparing explainers that are based on feature attribution. In an embodiment, a computer generates many anomalous tuples from many non-anomalous tuples. Each anomalous tuple contains a perturbed value of a respective perturbed feature. For each anomalous tuple, a respective explanation is generated that identifies a respective identified feature as a cause of the anomalous tuple being anomalous. A fidelity metric is calculated by counting correct explanations for the anomalous tuples, i.e. those whose identified feature is the perturbed feature. Tuples may represent entries in an activity log, such as structured query language (SQL) statements in a console output log of a database server. This approach may gauge the quality of a set of MLX explanations for why log entries or network packets are characterized as anomalous by an intrusion detector or other anomaly detector.
Type: Application
Filed: July 26, 2022
Publication date: February 1, 2024
Inventors: Kenyu Kobayashi, Arno Schneuwly, Renata Khasanova, Matteo Casserini, Felix Schmidt
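Since ground truth is known by construction (each anomaly was created by perturbing one known feature), the fidelity metric reduces to a simple fraction. A sketch, assuming explanations have already been collected as one identified feature per anomalous tuple:

```python
def fidelity(perturbed_features, identified_features):
    """Fraction of anomalous tuples whose explanation names the feature
    that was actually perturbed to make the tuple anomalous."""
    correct = sum(p == i for p, i in zip(perturbed_features, identified_features))
    return correct / len(perturbed_features)
```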
-
Publication number: 20240037372
Abstract: The present invention relates to machine learning (ML) explainability (MLX). Herein are techniques for a novel relevance propagation rule in layer-wise relevance propagation (LRP) for feature attribution-based explanation (ABX) for a reconstructive autoencoder. In an embodiment, a reconstruction layer of a reconstructive neural network in a computer generates a reconstructed tuple that is based on an original tuple that contains many features. A reconstruction residual cost function calculates a reconstruction error that measures a difference between the original tuple and the reconstructed tuple. Applied to the reconstruction error is a novel reconstruction relevance propagation rule that assigns a respective reconstruction relevance to each reconstruction neuron in the reconstruction layer. Based on the reconstruction relevance of the reconstruction neurons, a respective feature relevance of each feature is determined, from which an ABX explanation may be automatically generated.
Type: Application
Filed: July 26, 2022
Publication date: February 1, 2024
Inventors: Kenyu Kobayashi, Arno Schneuwly, Renata Khasanova, Matteo Casserini, Felix Schmidt
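As a simplified stand-in for the patented propagation rule (whose exact form is not given in the abstract), one can assign each reconstruction neuron a relevance equal to its share of the squared reconstruction error:

```python
def reconstruction_relevance(original, reconstructed):
    """Toy relevance rule: each reconstruction neuron's relevance is its
    fraction of the total squared reconstruction error. Feature relevances
    then follow directly, one neuron per feature in this toy setup."""
    errors = [(o - r) ** 2 for o, r in zip(original, reconstructed)]
    total = sum(errors) or 1.0  # avoid division by zero on a perfect reconstruction
    return [e / total for e in errors]
```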
-
Publication number: 20230419169
Abstract: Herein are machine learning (ML) explainability (MLX) techniques that perturb a non-anomalous tuple to generate an anomalous tuple as adversarial input to any explainer that is based on feature attribution. In an embodiment, a computer generates, from a non-anomalous tuple, an anomalous tuple that contains a perturbed value of a perturbed feature. In the anomalous tuple, the perturbed value of the perturbed feature is modified to cause a change in reconstruction error for the anomalous tuple. The change in reconstruction error includes a decrease in reconstruction error of the perturbed feature and/or an increase in a sum of reconstruction error of all features that are not the perturbed feature. After modifying the perturbed value, an attribution-based explainer automatically generates an explanation that identifies an identified feature as a cause of the anomalous tuple being anomalous. Whether the identified feature of the explanation is or is not the perturbed feature is detected.
Type: Application
Filed: June 28, 2022
Publication date: December 28, 2023
Inventors: Kenyu Kobayashi, Arno Schneuwly, Renata Khasanova, Matteo Casserini, Felix Schmidt
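The adversarial criterion can be sketched over per-feature reconstruction errors measured before and after the perturbation. The argmax explainer below is a deliberately simple stand-in for the attribution-based explainer under attack:

```python
def argmax_explainer(errors):
    """Toy attribution-based explainer: blame the feature with the
    largest reconstruction error."""
    return max(errors, key=errors.get)

def evades(errors_before, errors_after, perturbed):
    """True if the perturbation shifted reconstruction error away from the
    perturbed feature (its own error down, and/or the other features' total
    error up) and the explainer consequently blames some other feature."""
    down = errors_after[perturbed] < errors_before[perturbed]
    others_up = (sum(v for k, v in errors_after.items() if k != perturbed)
                 > sum(v for k, v in errors_before.items() if k != perturbed))
    return (down or others_up) and argmax_explainer(errors_after) != perturbed
```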
-
Publication number: 20230376743
Abstract: The present invention avoids overfitting in deep neural network (DNN) training by using multitask learning (MTL) and self-supervised learning (SSL) techniques when training a multi-branch DNN to encode a sequence. In an embodiment, a computer first trains the DNN to perform a first task. The DNN contains: a first encoder in a first branch, a second encoder in a second branch, and an interpreter layer that combines data from the first branch and the second branch. The computer then trains the DNN to perform a second task. After the first and second trainings, production encoding and inferencing occur. The first encoder encodes a sparse feature vector into a dense feature vector from which an inference is inferred. In an embodiment, a sequence of log messages is encoded into an encoded trace. An anomaly detector infers whether the sequence is anomalous. In an embodiment, the log messages are database commands.
Type: Application
Filed: May 19, 2022
Publication date: November 23, 2023
Inventors: Marija Nikolic, Nikola Milojkovic, Arno Schneuwly, Matteo Casserini, Milos Vasic, Renata Khasanova, Felix Schmidt
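Structurally, the forward pass of the two-branch network can be sketched as below; the encoders and interpreter here are placeholder callables, not trained networks, so this shows only the wiring described in the abstract:

```python
def forward(sparse_vector, encoder_a, encoder_b, interpreter):
    """Two-branch forward pass: each branch encodes the same input, and
    the interpreter layer combines the two branch outputs."""
    return interpreter(encoder_a(sparse_vector), encoder_b(sparse_vector))
```

In production only the first encoder's dense output would feed the downstream anomaly detector; the second branch and interpreter exist to regularize training across the two tasks.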
-
Publication number: 20230368054
Abstract: The present invention relates to threshold estimation and calibration for anomaly detection. Herein are machine learning (ML) and extreme value theory (EVT) techniques for normalizing and thresholding anomaly scores without presuming a values distribution. In an embodiment, a computer receives many unnormalized anomaly scores and, according to peak over threshold (POT), selects a highest subset of the unnormalized anomaly scores that exceed a tail threshold. Based on the highest subset of the unnormalized anomaly scores, parameters of a probability density function are trained according to EVT. After training and in a production environment, a normalized anomaly score is generated based on an unnormalized anomaly score and the trained parameters of the probability density function. Anomaly detection compares the normalized anomaly score to an optimized anomaly threshold.
Type: Application
Filed: May 16, 2022
Publication date: November 16, 2023
Inventors: Marija Nikolic, Matteo Casserini, Arno Schneuwly, Nikola Milojkovic, Milos Vasic, Renata Khasanova, Felix Schmidt
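A self-contained sketch of the POT pipeline: excesses over a tail threshold are fit to a generalized Pareto distribution, whose CDF then maps raw scores to [0, 1). The method-of-moments fit below is a simple stand-in for whatever EVT estimator the patent actually uses:

```python
import math
import statistics

def fit_gpd(scores, tail_threshold):
    """Fit a generalized Pareto distribution to the excesses over the tail
    threshold via the method of moments (illustrative EVT fit)."""
    excesses = [s - tail_threshold for s in scores if s > tail_threshold]
    m, v = statistics.mean(excesses), statistics.variance(excesses)
    shape = 0.5 * (1 - m * m / v)
    scale = 0.5 * m * (m * m / v + 1)
    return shape, scale

def normalize(score, tail_threshold, shape, scale):
    """Map an unnormalized anomaly score to [0, 1) via the GPD's CDF."""
    x = max(0.0, score - tail_threshold)
    if abs(shape) < 1e-9:
        return 1 - math.exp(-x / scale)  # exponential limit of the GPD
    return 1 - (1 + shape * x / scale) ** (-1 / shape)
```

At detection time the normalized score would be compared against a calibrated threshold, as in the abstract.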
-
Patent number: 11620118
Abstract: Herein are machine learning (ML) feature processing and analytic techniques to detect anomalies in parse trees of logic statements, database queries, logic scripts, compilation units of general-purpose programming languages, extensible markup language (XML), JavaScript object notation (JSON), and document object models (DOM). In an embodiment, a computer identifies an operational trace that contains multiple parse trees. Values of explicit features are generated from a single respective parse tree of the multiple parse trees of the operational trace. Values of implicit features are generated from more than one respective parse tree of the multiple parse trees of the operational trace. The explicit and implicit features are stored into a same feature vector. With the feature vector as input, an ML model detects whether or not the operational trace is anomalous, based on the explicit features of each parse tree of the operational trace and the implicit features of multiple parse trees of the operational trace.
Type: Grant
Filed: February 12, 2021
Date of Patent: April 4, 2023
Assignee: Oracle International Corporation
Inventors: Arno Schneuwly, Nikola Milojkovic, Felix Schmidt, Nipun Agarwal
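The explicit/implicit split can be sketched as a vector assembly step. The two feature extractors here are hypothetical placeholders; the abstract only specifies that per-tree and cross-tree features land in one shared vector:

```python
def build_feature_vector(trees, explicit_fn, implicit_fn):
    """Concatenate per-tree (explicit) features with cross-tree (implicit)
    features into a single feature vector for the anomaly model."""
    vector = []
    for tree in trees:
        vector.extend(explicit_fn(tree))   # features of a single parse tree
    vector.extend(implicit_fn(trees))      # features spanning the whole trace
    return vector
```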
-
Patent number: 11449517
Abstract: Approaches herein relate to machine learning for detection of anomalous logic syntax. Herein is acceleration for comparison of parse trees such as suspicious database queries. In an embodiment, a computer identifies subtrees in each of many trees. A respective subset of participating subtrees is selected in each tree. A respective root node of each participating subtree should directly have a child node that is a leaf and/or should have a degree that exceeds a branching threshold such as one. For each pairing of a respective first tree with a respective second tree, based on a count of subtree matches between the participating subset of subtrees in the first tree and the participating subset of subtrees in the second tree, a respective tree similarity score is calculated. A machine learning model makes inferences based on the tree similarity scores of the many trees. In an embodiment, each tree similarity score is a convolution kernel.
Type: Grant
Filed: December 22, 2020
Date of Patent: September 20, 2022
Assignee: Oracle International Corporation
Inventors: Arno Schneuwly, Nikola Milojkovic, Felix Schmidt, Nipun Agarwal
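The participation filter and match count can be sketched with trees as nested tuples `(label, child, ...)`. This is a toy rendering of the stated selection rule (leaf child and/or degree above a branching threshold of one), not the patented kernel:

```python
from collections import Counter

def subtrees(tree):
    """Yield every subtree of a tree encoded as (label, child, ...)."""
    yield tree
    for child in tree[1:]:
        yield from subtrees(child)

def participating(tree, branching=1):
    """Keep subtrees whose root directly has a leaf child and/or whose
    degree exceeds the branching threshold."""
    def is_leaf(t):
        return len(t) == 1
    return [t for t in subtrees(tree)
            if not is_leaf(t) and (any(is_leaf(c) for c in t[1:])
                                   or len(t) - 1 > branching)]

def similarity(t1, t2):
    """Tree similarity score: count of exact matches between the two
    participating subtree multisets."""
    c1, c2 = Counter(participating(t1)), Counter(participating(t2))
    return sum(min(c1[k], c2[k]) for k in c1)
```

Restricting the comparison to participating subtrees is what provides the acceleration: most trivial subtrees never enter the pairwise match count.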
-
Publication number: 20220261228
Abstract: Herein are machine learning (ML) feature processing and analytic techniques to detect anomalies in parse trees of logic statements, database queries, logic scripts, compilation units of general-purpose programming languages, extensible markup language (XML), JavaScript object notation (JSON), and document object models (DOM). In an embodiment, a computer identifies an operational trace that contains multiple parse trees. Values of explicit features are generated from a single respective parse tree of the multiple parse trees of the operational trace. Values of implicit features are generated from more than one respective parse tree of the multiple parse trees of the operational trace. The explicit and implicit features are stored into a same feature vector. With the feature vector as input, an ML model detects whether or not the operational trace is anomalous, based on the explicit features of each parse tree of the operational trace and the implicit features of multiple parse trees of the operational trace.
Type: Application
Filed: February 12, 2021
Publication date: August 18, 2022
Inventors: Arno Schneuwly, Nikola Milojkovic, Felix Schmidt, Nipun Agarwal
-
Publication number: 20220198294
Abstract: Herein is resource-constrained feature enrichment for analysis of parse trees such as suspicious database queries. In an embodiment, a computer receives a parse tree that contains many tree nodes. Each tree node is associated with a respective production rule that was used to generate the tree node. Extracted from the parse tree are many sequences of production rules having respective sequence lengths that satisfy a length constraint that accepts at least one fixed length that is greater than two. Each extracted sequence of production rules consists of respective production rules of a sequence of tree nodes in a respective directed tree path of the parse tree having a path length that satisfies that same length constraint. Based on the extracted sequences of production rules, a machine learning model generates an inference. In a bag of rules data structure, the extracted sequences of production rules are aggregated by distinct sequence and duplicates are counted.
Type: Application
Filed: December 23, 2020
Publication date: June 23, 2022
Inventors: Arno Schneuwly, Nikola Milojkovic, Felix Schmidt, Nipun Agarwal
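The extraction and bag-of-rules aggregation can be sketched as below, with nodes encoded as `(rule, children)` pairs and a single fixed sequence length standing in for the patent's more general length constraint:

```python
from collections import Counter

def rule_sequences(node, length, prefix=()):
    """Yield every sequence of production rules of exactly `length` along a
    downward path in the parse tree; each node is (rule, children)."""
    rule, children = node
    prefix = (prefix + (rule,))[-length:]   # sliding window along the path
    if len(prefix) == length:
        yield prefix
    for child in children:
        yield from rule_sequences(child, length, prefix)

def bag_of_rules(root, length):
    """Aggregate extracted sequences by distinct sequence, counting duplicates."""
    return Counter(rule_sequences(root, length))
```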
-
Publication number: 20220197917
Abstract: Approaches herein relate to machine learning for detection of anomalous logic syntax. Herein is acceleration for comparison of parse trees such as suspicious database queries. In an embodiment, a computer identifies subtrees in each of many trees. A respective subset of participating subtrees is selected in each tree. A respective root node of each participating subtree should directly have a child node that is a leaf and/or should have a degree that exceeds a branching threshold such as one. For each pairing of a respective first tree with a respective second tree, based on a count of subtree matches between the participating subset of subtrees in the first tree and the participating subset of subtrees in the second tree, a respective tree similarity score is calculated. A machine learning model makes inferences based on the tree similarity scores of the many trees. In an embodiment, each tree similarity score is a convolution kernel.
Type: Application
Filed: December 22, 2020
Publication date: June 23, 2022
Inventors: Arno Schneuwly, Nikola Milojkovic, Felix Schmidt, Nipun Agarwal
-
Publication number: 20200328858
Abstract: Techniques are disclosed for adaptive coding and scheduling of packets in wireless networks. The adaptive coding and scheduling can be achieved by utilizing a discrete water filling (DWF) scheme. In an example, a computer-implemented method to adaptively code and schedule packets in a wireless network may include determining the number of paths between a sender and a receiver in a multipath (MP) network, determining the erasure rate of each path between the sender and the receiver, and determining a multipath rate. The method may also include determining a coding bucket size based on the multipath rate and determining a multipath delay for the coding bucket size and the erasure rates. In another example, the adaptive coding and scheduling techniques can be applied to a multihop multipath (MM) network.
Type: Application
Filed: July 31, 2019
Publication date: October 15, 2020
Inventors: Muriel Medard, Derya Malak, Arno Schneuwly, Emre Telatar
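Two of the quantities named above can be sketched with standard erasure-coding arithmetic. This is background math for intuition only, with unit path capacities assumed; the patent's DWF scheme for choosing the bucket size and schedule is not reproduced here:

```python
import math

def multipath_rate(erasure_rates, capacities=None):
    """Aggregate rate of the paths: each path's capacity (unit capacity by
    default) thinned by its erasure rate, summed over all paths."""
    capacities = capacities or [1.0] * len(erasure_rates)
    return sum(c * (1 - e) for c, e in zip(capacities, erasure_rates))

def coded_packets_needed(bucket_size, erasure_rate):
    """Expected number of coded packets to deliver one coding bucket of
    `bucket_size` source packets over a path with the given erasure rate."""
    return math.ceil(bucket_size / (1 - erasure_rate))
```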