Patents by Inventor Felix Schmidt

Felix Schmidt has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11620118
    Abstract: Herein are machine learning (ML) feature processing and analytic techniques to detect anomalies in parse trees of logic statements, database queries, logic scripts, compilation units of general-purpose programming language, extensible markup language (XML), JavaScript object notation (JSON), and document object models (DOM). In an embodiment, a computer identifies an operational trace that contains multiple parse trees. Values of explicit features are generated from a single respective parse tree of the multiple parse trees of the operational trace. Values of implicit features are generated from more than one respective parse tree of the multiple parse trees of the operational trace. The explicit and implicit features are stored into a same feature vector. With the feature vector as input, an ML model detects whether or not the operational trace is anomalous, based on the explicit features of each parse tree of the operational trace and the implicit features of multiple parse trees of the operational trace.
    Type: Grant
    Filed: February 12, 2021
    Date of Patent: April 4, 2023
    Assignee: Oracle International Corporation
    Inventors: Arno Schneuwly, Nikola Milojkovic, Felix Schmidt, Nipun Agarwal
  • Publication number: 20230067910
    Abstract: The present invention relates to compounds represented by general formula (Ia), (Ib), (IIa) or (IIb). These compounds are suitable for imaging alpha-synuclein and for diagnosing diseases which are associated with the aggregation of alpha-synuclein. The compounds are also useful for treating and preventing diseases which are associated with the aggregation of alpha-synuclein.
    Type: Application
    Filed: November 19, 2020
    Publication date: March 2, 2023
    Applicants: MODAG GMBH, MAX-PLANCK-GESELLSCHAFT ZUR FÖRDERUNG DER WISSENSCHAFTEN E.V.
    Inventors: Armin GIESE, Felix SCHMIDT, Daniel WECKBECKER, Andrei LEONOV, Sergey RYAZANOV, Christian GRIESINGER, Bernd PICHLER, Kristina HERFERT, Andreas MAURER, Laura KÜBLER, Sabrina BUSS
  • Patent number: 11579951
    Abstract: Techniques are described herein for predicting disk drive failure using a machine learning model. The framework involves receiving disk drive sensor attributes as training data, preprocessing the training data to select a set of enhanced feature sequences, and using the enhanced feature sequences to train a machine learning model to predict disk drive failures from disk drive sensor monitoring data. Prior to the training phase, the RNN LSTM model is tuned using a set of predefined hyper-parameters. The preprocessing, which is performed during the training and evaluation phase as well as later during the prediction phase, involves using predefined values for a set of parameters to generate the set of enhanced sequences from raw sensor readings. The enhanced feature sequences are generated to maintain a desired healthy/failed disk ratio, and only use samples leading up to a last-valid-time sample in order to honor a pre-specified heads-up-period alert requirement.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: February 14, 2023
    Assignee: Oracle International Corporation
    Inventors: Onur Kocberber, Felix Schmidt, Arun Raghavan, Nipun Agarwal, Sam Idicula, Guang-Tong Zhou, Nitin Kunal
  • Publication number: 20230043993
    Abstract: Herein are machine learning techniques that adjust reconstruction loss of a reconstructive model, such as a principal component analysis (PCA), based on importances of features. In an embodiment having a reconstructive model that more or less accurately reconstructs its input, a computer measures, for each feature, a respective importance that is based on the reconstructive model. For example, importance may be based on grading samples that the reconstructive model correctly or incorrectly inferenced. For each feature during production inferencing, a respective original loss from the reconstructive model measures a difference between a value of the feature in an input and a reconstructed value of the feature generated by the reconstructive model. For each feature, the respective importance of the feature is applied to the respective original loss to generate a respective weighted loss, which compensates for concept drift.
    Type: Application
    Filed: August 4, 2021
    Publication date: February 9, 2023
    Inventors: SAEID ALLAHDADIAN, YUTING SUN, NAVANEETH JAMADAGNI, FELIX SCHMIDT, MARIA VLACHOPOULOU
  • Publication number: 20230024884
    Abstract: Herein are machine learning techniques that adjust reconstruction loss of a reconstructive model such as an autoencoder based on importances of values of features. In an embodiment, before, during, or after training the reconstructive model that more or less accurately reconstructs its input, a computer measures, for each distinct value of each feature, a respective importance that is not based on the reconstructive model. For example, importance may be based solely on a training corpus. For each feature during or after training, a respective original loss from the reconstructive model measures a difference between a value of the feature in an input and a reconstructed value of the feature generated by the reconstructive model. For each feature, the respective importance of the input value of the feature is applied to the respective original loss to generate a respective weighted loss. The weighted losses of the features of the input are collectively detected as anomalous or non-anomalous.
    Type: Application
    Filed: July 20, 2021
    Publication date: January 26, 2023
    Inventors: MATTEO CASSERINI, SAEID ALLAHDADIAN, FELIX SCHMIDT, ANDREW BROWNSWORD
  • Publication number: 20220351023
    Abstract: Embodiments use a hierarchy of machine learning models to predict datacenter behavior at multiple hardware levels of a datacenter without accessing operating system generated hardware utilization information. The accuracy of higher-level models in the hierarchy of models is increased by including, as input to the higher-level models, hardware utilization predictions from lower-level models. The hierarchy of models includes: server utilization models and workload/OS prediction models that produce predictions at a server device-level of a datacenter; and also top-of-rack switch models and backbone switch models that produce predictions at higher levels of the datacenter. These models receive, as input, hardware utilization information from non-OS sources. Based on datacenter-level network utilization predictions from the hierarchy of models, the datacenter automatically configures its hardware to avoid any predicted over-utilization of hardware in the datacenter.
    Type: Application
    Filed: July 18, 2022
    Publication date: November 3, 2022
    Inventors: Pravin Shinde, Felix Schmidt, Onur Kocberber
  • Publication number: 20220318684
    Abstract: Techniques are provided for sparse ensembling of unsupervised machine learning models. In an embodiment, the proposed architecture is composed of multiple unsupervised machine learning models that each produce a score as output and a gating network that analyzes the inputs and outputs of the unsupervised machine learning models to select an optimal ensemble of unsupervised machine learning models. The gating network is trained to choose a minimal number of the multiple unsupervised machine learning models whose scores are combined to create a final score that matches or closely resembles a final score that is computed using all the scores of the multiple unsupervised machine learning models.
    Type: Application
    Filed: April 2, 2021
    Publication date: October 6, 2022
    Inventors: SAEID ALLAHDADIAN, AMIN SUZANI, MILOS VASIC, MATTEO CASSERINI, ANDREW BROWNSWORD, FELIX SCHMIDT, NIPUN AGARWAL
  • Patent number: 11451565
    Abstract: Techniques are provided herein for contextual embedding of features of operational logs or network traffic for anomaly detection based on sequence prediction. In an embodiment, a computer has a predictive recurrent neural network (RNN) that detects an anomalous network flow. In an embodiment, an RNN contextually transcodes sparse feature vectors that represent log messages into dense feature vectors that may be predictive or used to generate predictive vectors. In an embodiment, graph embedding improves feature embedding of log traces. In an embodiment, a computer detects and feature-encodes independent traces from related log messages. These techniques may detect malicious activity by anomaly analysis of context-aware feature embeddings of network packet flows, log messages, and/or log traces.
    Type: Grant
    Filed: September 5, 2018
    Date of Patent: September 20, 2022
    Assignee: Oracle International Corporation
    Inventors: Guang-Tong Zhou, Hossein Hajimirsadeghi, Andrew Brownsword, Stuart Wray, Craig Schelp, Rod Reddekopp, Felix Schmidt
  • Patent number: 11449517
    Abstract: Approaches herein relate to machine learning for detection of anomalous logic syntax. Herein is acceleration for comparison of parse trees such as suspicious database queries. In an embodiment, a computer identifies subtrees in each of many trees. A respective subset of participating subtrees is selected in each tree. A respective root node of each participating subtree should directly have a child node that is a leaf and/or should have a degree that exceeds a branching threshold such as one. For each pairing of a respective first tree with a respective second tree, based on a count of subtree matches between the participating subset of subtrees in the first tree and the participating subset of subtrees in the second tree, a respective tree similarity score is calculated. A machine learning model inferences based on the tree similarity scores of the many trees. In an embodiment, each tree similarity score is a convolution kernel.
    Type: Grant
    Filed: December 22, 2020
    Date of Patent: September 20, 2022
    Assignee: Oracle International Corporation
    Inventors: Arno Schneuwly, Nikola Milojkovic, Felix Schmidt, Nipun Agarwal
  • Publication number: 20220294757
    Abstract: Techniques are described herein for using machine learning to learn vector representations of DNS requests such that the resulting embeddings represent the semantics of the DNS requests as a whole. Techniques described herein perform pre-processing of tokenized DNS request strings in which hashes, which are long and relatively random strings of characters, are detected in DNS request strings and each detected hash token is replaced with a placeholder token. A vectorizing ML model is trained using the pre-processed training dataset in which hash tokens have been replaced. Embeddings for the DNS tokens are derived from an intermediate layer of the vectorizing ML model. The encoding application creates final vector representations for each DNS request string by generating a weighted summation of the embeddings of all of the tokens in the DNS request string. Because of hash replacement, the resulting DNS request embeddings reflect semantics of the hashes as a group.
    Type: Application
    Filed: March 10, 2021
    Publication date: September 15, 2022
    Inventors: Renata Khasanova, Felix Schmidt, Stuart Wray, Craig Schelp, Nipun Agarwal, Matteo Casserini
  • Publication number: 20220292304
    Abstract: Herein are feature extraction mechanisms that receive parsed log messages as inputs and transform them into numerical feature vectors for machine learning models (MLMs). In an embodiment, a computer extracts fields from a log message. Each field specifies a name, a text value, and a type. For each field, a field transformer for the field is dynamically selected based on the field's name and/or the field's type. The field transformer converts the field's text value into a value of the field's type. A feature encoder for the value of the field's type is dynamically selected based on the field's type and/or a range of the field's values that occur in a training corpus of an MLM. From the feature encoder, an encoding of the value of the field's type is stored into a feature vector. Based on the MLM and the feature vector, the log message is detected as anomalous or not.
    Type: Application
    Filed: March 12, 2021
    Publication date: September 15, 2022
    Inventors: AMIN SUZANI, SAEID ALLAHDADIAN, MILOS VASIC, MATTEO CASSERINI, HAMED AHMADI, FELIX SCHMIDT, ANDREW BROWNSWORD, NIPUN AGARWAL
  • Patent number: 11443166
    Abstract: Embodiments use a hierarchy of machine learning models to predict datacenter behavior at multiple hardware levels of a datacenter without accessing operating system generated hardware utilization information. The accuracy of higher-level models in the hierarchy of models is increased by including, as input to the higher-level models, hardware utilization predictions from lower-level models. The hierarchy of models includes: server utilization models and workload/OS prediction models that produce predictions at a server device-level of a datacenter; and also top-of-rack switch models and backbone switch models that produce predictions at higher levels of the datacenter. These models receive, as input, hardware utilization information from non-OS sources. Based on datacenter-level network utilization predictions from the hierarchy of models, the datacenter automatically configures its hardware to avoid any predicted over-utilization of hardware in the datacenter.
    Type: Grant
    Filed: October 29, 2018
    Date of Patent: September 13, 2022
    Assignee: Oracle International Corporation
    Inventors: Pravin Shinde, Felix Schmidt, Onur Kocberber
  • Publication number: 20220284005
    Abstract: Unsorted sparse dictionary encodings are transformed into unsorted-dense or sorted-dense dictionary encodings. Sparse domain codes have large gaps between codes that are adjacent in order. Unlike sparse codes, dense codes have smaller gaps between adjacent codes; consecutive codes are dense codes that have no gaps between adjacent codes. The techniques described herein are relational approaches that may be used to generate sparse composite codes and sorted codes.
    Type: Application
    Filed: May 24, 2022
    Publication date: September 8, 2022
    Inventors: Pit Fender, Felix Schmidt, Benjamin Schlegel
  • Patent number: 11423327
    Abstract: Techniques are described herein for estimating CPU, memory, and I/O utilization for a workload via out-of-band sensor readings using a machine learning model. The framework involves receiving sensor data associated with executing benchmark applications, obtaining ground truth utilization values for the benchmarks, preprocessing the training data to select a set of enhanced sequences, and using the enhanced sequences to train a random forest model to estimate CPU, memory, and I/O utilization given sensor monitoring data. Prior to the training phase, a machine learning model is trained using a set of predefined hyper-parameters. The trained models are used to generate estimations for CPU, memory, and I/O utilization values. The utilization values are used with workload context information to assess the deployment and generate one or more recommendations for machine types that will best serve the workload in terms of system utilization.
    Type: Grant
    Filed: October 10, 2018
    Date of Patent: August 23, 2022
    Assignee: Oracle International Corporation
    Inventors: Onur Kocberber, Felix Schmidt, Craig Schelp, Andrew Brownsword, Nipun Agarwal
  • Publication number: 20220261228
    Abstract: Herein are machine learning (ML) feature processing and analytic techniques to detect anomalies in parse trees of logic statements, database queries, logic scripts, compilation units of general-purpose programming language, extensible markup language (XML), JavaScript object notation (JSON), and document object models (DOM). In an embodiment, a computer identifies an operational trace that contains multiple parse trees. Values of explicit features are generated from a single respective parse tree of the multiple parse trees of the operational trace. Values of implicit features are generated from more than one respective parse tree of the multiple parse trees of the operational trace. The explicit and implicit features are stored into a same feature vector. With the feature vector as input, an ML model detects whether or not the operational trace is anomalous, based on the explicit features of each parse tree of the operational trace and the implicit features of multiple parse trees of the operational trace.
    Type: Application
    Filed: February 12, 2021
    Publication date: August 18, 2022
    Inventors: Arno Schneuwly, Nikola Milojkovic, Felix Schmidt, Nipun Agarwal
  • Patent number: 11379450
    Abstract: Unsorted sparse dictionary encodings are transformed into unsorted-dense or sorted-dense dictionary encodings. Sparse domain codes have large gaps between codes that are adjacent in order. Unlike sparse codes, dense codes have smaller gaps between adjacent codes; consecutive codes are dense codes that have no gaps between adjacent codes. The techniques described herein are relational approaches that may be used to generate sparse composite codes and sorted codes.
    Type: Grant
    Filed: October 9, 2018
    Date of Patent: July 5, 2022
    Assignee: Oracle International Corporation
    Inventors: Pit Fender, Felix Schmidt, Benjamin Schlegel
  • Publication number: 20220198294
    Abstract: Herein is resource-constrained feature enrichment for analysis of parse trees such as suspicious database queries. In an embodiment, a computer receives a parse tree that contains many tree nodes. Each tree node is associated with a respective production rule that was used to generate the tree node. Extracted from the parse tree are many sequences of production rules having respective sequence lengths that satisfy a length constraint that accepts at least one fixed length that is greater than two. Each extracted sequence of production rules consists of respective production rules of a sequence of tree nodes in a respective directed tree path of the parse tree having a path length that satisfies that same length constraint. Based on the extracted sequences of production rules, a machine learning model generates an inference. In a bag of rules data structure, the extracted sequences of production rules are aggregated by distinct sequence and duplicates are counted.
    Type: Application
    Filed: December 23, 2020
    Publication date: June 23, 2022
    Inventors: ARNO SCHNEUWLY, NIKOLA MILOJKOVIC, FELIX SCHMIDT, NIPUN AGARWAL
  • Publication number: 20220197917
    Abstract: Approaches herein relate to machine learning for detection of anomalous logic syntax. Herein is acceleration for comparison of parse trees such as suspicious database queries. In an embodiment, a computer identifies subtrees in each of many trees. A respective subset of participating subtrees is selected in each tree. A respective root node of each participating subtree should directly have a child node that is a leaf and/or should have a degree that exceeds a branching threshold such as one. For each pairing of a respective first tree with a respective second tree, based on a count of subtree matches between the participating subset of subtrees in the first tree and the participating subset of subtrees in the second tree, a respective tree similarity score is calculated. A machine learning model inferences based on the tree similarity scores of the many trees. In an embodiment, each tree similarity score is a convolution kernel.
    Type: Application
    Filed: December 22, 2020
    Publication date: June 23, 2022
    Inventors: ARNO SCHNEUWLY, NIKOLA MILOJKOVIC, FELIX SCHMIDT, NIPUN AGARWAL
  • Publication number: 20220188694
    Abstract: Approaches herein relate to model decay of an anomaly detector due to concept drift. Herein are machine learning techniques for dynamically self-tuning an anomaly score threshold. In an embodiment in a production environment, a computer receives an item in a stream of items. A machine learning (ML) model hosted by the computer infers by calculation an anomaly score for the item. Whether the item is anomalous or not is decided based on the anomaly score and an adaptive anomaly threshold that dynamically fluctuates. A moving standard deviation of anomaly scores is adjusted based on a moving average of anomaly scores. The moving average of anomaly scores is then adjusted based on the anomaly score. The adaptive anomaly threshold is then adjusted based on the moving average of anomaly scores and the moving standard deviation of anomaly scores.
    Type: Application
    Filed: December 15, 2020
    Publication date: June 16, 2022
    Inventors: Amin Suzani, Matteo Casserini, Milos Vasic, Saeid Allahdadian, Andrew Brownsword, Hamed Ahmadi, Felix Schmidt, Nipun Agarwal
  • Publication number: 20220188410
    Abstract: Approaches herein relate to reconstructive models such as an autoencoder for anomaly detection. Herein are machine learning techniques that detect and suppress any feature that causes model decay by concept drift. In an embodiment in a production environment, a computer initializes an unsuppressed subset of features with a plurality of features that an already-trained reconstructive model can process. A respective reconstruction error of each feature of the unsuppressed subset of features is calculated. The computer detects that a respective moving average based on the reconstruction error of a particular feature of the unsuppressed subset of features exceeds a respective feature suppression threshold of the particular feature, which causes removal of the particular feature from the unsuppressed subset of features.
    Type: Application
    Filed: December 15, 2020
    Publication date: June 16, 2022
    Inventors: SAEID ALLAHDADIAN, ANDREW BROWNSWORD, MILOS VASIC, MATTEO CASSERINI, AMIN SUZANI, HAMED AHMADI, FELIX SCHMIDT, NIPUN AGARWAL
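
As an illustration of the last abstract above (publication 20220188410), the following is a minimal sketch of feature suppression driven by a moving average of per-feature reconstruction error. It is not the claimed implementation. The class name `FeatureSuppressor`, the squared-error measure, the exponential form of the moving average, the smoothing factor `alpha`, and the threshold values are all assumptions chosen for the example; the abstract only states that a feature is removed from the unsuppressed subset once a moving average based on its reconstruction error exceeds that feature's suppression threshold.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class FeatureSuppressor:
    """Hypothetical helper: tracks which features remain unsuppressed."""

    # Per-feature suppression thresholds (illustrative values, not from the source).
    thresholds: Dict[str, float]
    # Smoothing factor of the exponential moving average (an assumption).
    alpha: float = 0.2
    moving_avg: Dict[str, float] = field(default_factory=dict)
    unsuppressed: Set[str] = field(default_factory=set, init=False)

    def __post_init__(self) -> None:
        # Start with every feature the trained model can process.
        self.unsuppressed = set(self.thresholds)

    def update(self, original: Dict[str, float],
               reconstructed: Dict[str, float]) -> Set[str]:
        """Consume one input and its reconstruction; return surviving features."""
        for name in list(self.unsuppressed):
            # Squared difference stands in for the reconstruction error.
            error = (original[name] - reconstructed[name]) ** 2
            # The first observation seeds the moving average.
            previous = self.moving_avg.get(name, error)
            average = self.alpha * error + (1 - self.alpha) * previous
            self.moving_avg[name] = average
            if average > self.thresholds[name]:
                # Suppress the feature: it no longer contributes downstream.
                self.unsuppressed.discard(name)
        return set(self.unsuppressed)


# Usage: "mem" is reconstructed poorly and is suppressed on the first item.
suppressor = FeatureSuppressor(thresholds={"cpu": 1.0, "mem": 1.0}, alpha=0.5)
remaining = suppressor.update({"cpu": 1.0, "mem": 0.0}, {"cpu": 1.0, "mem": 3.0})
```

In this sketch a suppressed feature is never reinstated, which matches the removal step in the abstract; whether and how a feature could be restored is not stated there and is left out.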