Patents by Inventor CRAIG SCHELP

CRAIG SCHELP has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230421528
    Abstract: Techniques are described herein for using machine learning to learn vector representations of DNS requests such that the resulting embeddings represent the semantics of the DNS requests as a whole. Techniques described herein perform pre-processing of tokenized DNS request strings in which hashes, which are long and relatively random strings of characters, are detected in DNS request strings and each detected hash token is replaced with a placeholder token. A vectorizing ML model is trained using the pre-processed training dataset in which hash tokens have been replaced. Embeddings for the DNS tokens are derived from an intermediate layer of the vectorizing ML model. The encoding application creates final vector representations for each DNS request string by generating a weighted summation of the embeddings of all of the tokens in the DNS request string. Because of hash replacement, the resulting DNS request embeddings reflect semantics of the hashes as a group.
    Type: Application
    Filed: August 24, 2023
    Publication date: December 28, 2023
    Inventors: Renata Khasanova, Felix Schmidt, Stuart Wray, Craig Schelp, Nipun Agarwal, Matteo Casserini
  • Patent number: 11784964
    Abstract: Techniques are described herein for using machine learning to learn vector representations of DNS requests such that the resulting embeddings represent the semantics of the DNS requests as a whole. Techniques described herein perform pre-processing of tokenized DNS request strings in which hashes, which are long and relatively random strings of characters, are detected in DNS request strings and each detected hash token is replaced with a placeholder token. A vectorizing ML model is trained using the pre-processed training dataset in which hash tokens have been replaced. Embeddings for the DNS tokens are derived from an intermediate layer of the vectorizing ML model. The encoding application creates final vector representations for each DNS request string by generating a weighted summation of the embeddings of all of the tokens in the DNS request string. Because of hash replacement, the resulting DNS request embeddings reflect semantics of the hashes as a group.
    Type: Grant
    Filed: March 10, 2021
    Date of Patent: October 10, 2023
    Assignee: Oracle International Corporation
    Inventors: Renata Khasanova, Felix Schmidt, Stuart Wray, Craig Schelp, Nipun Agarwal, Matteo Casserini
  • Patent number: 11451565
    Abstract: Techniques are provided herein for contextual embedding of features of operational logs or network traffic for anomaly detection based on sequence prediction. In an embodiment, a computer has a predictive recurrent neural network (RNN) that detects an anomalous network flow. In an embodiment, an RNN contextually transcodes sparse feature vectors that represent log messages into dense feature vectors that may be predictive or used to generate predictive vectors. In an embodiment, graph embedding improves feature embedding of log traces. In an embodiment, a computer detects and feature-encodes independent traces from related log messages. These techniques may detect malicious activity by anomaly analysis of context-aware feature embeddings of network packet flows, log messages, and/or log traces.
    Type: Grant
    Filed: September 5, 2018
    Date of Patent: September 20, 2022
    Assignee: Oracle International Corporation
    Inventors: Guang-Tong Zhou, Hossein Hajimirsadeghi, Andrew Brownsword, Stuart Wray, Craig Schelp, Rod Reddekopp, Felix Schmidt
  • Publication number: 20220294757
    Abstract: Techniques are described herein for using machine learning to learn vector representations of DNS requests such that the resulting embeddings represent the semantics of the DNS requests as a whole. Techniques described herein perform pre-processing of tokenized DNS request strings in which hashes, which are long and relatively random strings of characters, are detected in DNS request strings and each detected hash token is replaced with a placeholder token. A vectorizing ML model is trained using the pre-processed training dataset in which hash tokens have been replaced. Embeddings for the DNS tokens are derived from an intermediate layer of the vectorizing ML model. The encoding application creates final vector representations for each DNS request string by generating a weighted summation of the embeddings of all of the tokens in the DNS request string. Because of hash replacement, the resulting DNS request embeddings reflect semantics of the hashes as a group.
    Type: Application
    Filed: March 10, 2021
    Publication date: September 15, 2022
    Inventors: Renata Khasanova, Felix Schmidt, Stuart Wray, Craig Schelp, Nipun Agarwal, Matteo Casserini
  • Patent number: 11423327
    Abstract: Techniques are described herein for estimating CPU, memory, and I/O utilization for a workload via out-of-band sensor readings using a machine learning model. The framework involves receiving sensor data associated with executing benchmark applications, obtaining ground truth utilization values for the benchmarks, preprocessing the training data to select a set of enhanced sequences, and using the enhanced sequences to train a random forest model to estimate CPU, memory, and I/O utilization given sensor monitoring data. Prior to the training phase, a machine learning model is trained using a set of predefined hyper-parameters. The trained models are used to generate estimations for CPU, memory, and I/O utilizations values. The utilization values are used with workload context information to assess the deployment and generate one or more recommendations for machine types that will best serve the workload in terms of system utilization.
    Type: Grant
    Filed: October 10, 2018
    Date of Patent: August 23, 2022
    Assignee: Oracle International Corporation
    Inventors: Onur Kocberber, Felix Schmidt, Craig Schelp, Andrew Brownsword, Nipun Agarwal
  • Patent number: 11416324
    Abstract: Techniques are described herein for accurately measuring the reliability of storage systems. Rather than relying on a series of approximations, which may produce highly optimistic estimates, the techniques described herein use a failure distribution derived from a disk failure data set to derive reliability metrics such as mean time to data loss (MTTDL) and annual durability. A new framework for modeling storage system dynamics is described herein. The framework facilitates theoretical analysis of the reliability. The model described herein captures the complex structure of storage systems considering their configuration, dynamics, and operation. Given this model, a simulation-free analytical solution to the commonly used reliability metrics is derived. The model may also be used to analyze the long-term reliability behavior of storage systems.
    Type: Grant
    Filed: May 13, 2020
    Date of Patent: August 16, 2022
    Assignee: Oracle International Corporation
    Inventors: Paria Rashidinejad, Navaneeth Jamadagni, Arun Raghavan, Craig Schelp, Charles Gordon
  • Patent number: 11082438
    Abstract: Techniques are provided herein for contextual embedding of features of operational logs or network traffic for anomaly detection based on sequence prediction. In an embodiment, a computer has a predictive recurrent neural network (RNN) that detects an anomalous network flow. In an embodiment, an RNN contextually transcodes sparse feature vectors that represent log messages into dense feature vectors that may be predictive or used to generate predictive vectors. In an embodiment, graph embedding improves feature embedding of log traces. In an embodiment, a computer detects and feature-encodes independent traces from related log messages. These techniques may detect malicious activity by anomaly analysis of context-aware feature embeddings of network packet flows, log messages, and/or log traces.
    Type: Grant
    Filed: September 5, 2018
    Date of Patent: August 3, 2021
    Assignee: Oracle International Corporation
    Inventors: Juan Fernandez Peinador, Manel Fernandez Gomez, Guang-Tong Zhou, Hossein Hajimirsadeghi, Andrew Brownsword, Onur Kocberber, Felix Schmidt, Craig Schelp
  • Patent number: 10917203
    Abstract: Embodiments use Bayesian techniques to efficiently estimate the bit error rates (BERs) of cables in a computer network at a customizable level of confidence. Specifically, a plurality of probability records are maintained for a given cable in a computer system, where each probability record is associated with a hypothetical BER for the cable, and reflects a probability that the cable has the associated hypothetical BER. At configurable time intervals, the probability records are updated using statistics gathered from a switch port connected to the cable. In order to estimate the BER of the cable at a given confidence level, embodiments determine which probability record is associated with a probability mass that indicates the confidence level. The estimate for the cable BER is the hypothetical BER that is associated with the indicated probability mass. Embodiments store the estimate in memory and utilize the estimate to aid in maintaining the computer system.
    Type: Grant
    Filed: May 17, 2019
    Date of Patent: February 9, 2021
    Assignee: Oracle International Corporation
    Inventors: Stuart Wray, Felix Schmidt, Craig Schelp, Pravin Shinde, Akhilesh Singhania, Nipun Agarwal
  • Patent number: 10892961
    Abstract: Herein are computerized techniques for autonomous and artificially intelligent administration of a computer cloud health monitoring system. In an embodiment, an orchestration computer automatically detects a current state of network elements of a computer network by processing: a) a network plan that defines a topology of the computer network, and b) performance statistics of the network elements. The network elements include computers that each hosts virtual execution environment(s). Each virtual execution environment hosts analysis logic that transforms raw performance data of a network element into a portion of the performance statistics. For each computer, a configuration specification for each virtual execution environment of the computer is automatically generated based on the network plan and the current state of the computer network. At least one virtual execution environment is automatically tuned and/or re-provisioned based on a generated configuration specification.
    Type: Grant
    Filed: February 8, 2019
    Date of Patent: January 12, 2021
    Assignee: Oracle International Corporation
    Inventors: Onur Kocberber, Felix Schmidt, Craig Schelp, Pravin Shinde
  • Publication number: 20200371855
    Abstract: Techniques are described herein for accurately measuring the reliability of storage systems. Rather than relying on a series of approximations, which may produce highly optimistic estimates, the techniques described herein use a failure distribution derived from a disk failure data set to derive reliability metrics such as mean time to data loss (MTTDL) and annual durability. A new framework for modeling storage system dynamics is described herein. The framework facilitates theoretical analysis of the reliability. The model described herein captures the complex structure of storage systems considering their configuration, dynamics, and operation. Given this model, a simulation-free analytical solution to the commonly used reliability metrics is derived. The model may also be used to analyze the long-term reliability behavior of storage systems.
    Type: Application
    Filed: May 13, 2020
    Publication date: November 26, 2020
    Inventors: Paria Rashidinejad, Navaneeth Jamadagni, Arun Raghavan, Craig Schelp, Charles Gordon
  • Publication number: 20200366428
    Abstract: Embodiments use Bayesian techniques to efficiently estimate the bit error rates (BERs) of cables in a computer network at a customizable level of confidence. Specifically, a plurality of probability records are maintained for a given cable in a computer system, where each probability record is associated with a hypothetical BER for the cable, and reflects a probability that the cable has the associated hypothetical BER. At configurable time intervals, the probability records are updated using statistics gathered from a switch port connected to the cable. In order to estimate the BER of the cable at a given confidence level, embodiments determine which probability record is associated with a probability mass that indicates the confidence level. The estimate for the cable BER is the hypothetical BER that is associated with the indicated probability mass. Embodiments store the estimate in memory and utilize the estimate to aid in maintaining the computer system.
    Type: Application
    Filed: May 17, 2019
    Publication date: November 19, 2020
    Inventors: STUART WRAY, FELIX SCHMIDT, CRAIG SCHELP, PRAVIN SHINDE, AKHILESH SINGHANIA, NIPUN AGARWAL
  • Patent number: 10795690
    Abstract: Herein are computerized techniques for generation, costing/scoring, optimal selection, and reporting of intermediate configurations for a datacenter change plan. In an embodiment, a computer receives a current configuration of a datacenter and a target configuration. New configurations are generated based on the current configuration. A cost function is applied to calculate a cost of each new configuration based on measuring a logical difference between the new configuration and the target configuration. A particular new configuration is selected that has a least cost. When the particular configuration satisfies the target configuration, the datacenter is reconfigured based on the particular configuration. Otherwise, this process is (e.g. iteratively) repeated with the particular configuration instead used as the current configuration. In embodiments, new configurations are randomly, greedily, and/or manually generated.
    Type: Grant
    Filed: October 30, 2018
    Date of Patent: October 6, 2020
    Assignee: Oracle International Corporation
    Inventors: Pravin Shinde, Felix Schmidt, Craig Schelp
  • Patent number: 10768982
    Abstract: Herein are techniques for analysis of data streams. In an embodiment, a computer associates each software actor with data streams. Each software actor has its own backlog queue of data to analyze. In response to receiving some stream content and based on the received stream content, data is distributed to some software actors. In response to determining that the data satisfies completeness criteria of a particular software actor, an indication of the data is appended onto the backlog queue of the particular software actor. The particular software actor is reset to an initial state by loading an execution snapshot of a previous initial execution of an embedded virtual machine. Based on the particular software actor, execution of the execution snapshot of the previous initial execution is resumed to dequeue and process the indication of the data from the backlog queue of the particular software actor to generate a result.
    Type: Grant
    Filed: September 19, 2018
    Date of Patent: September 8, 2020
    Assignee: Oracle International Corporation
    Inventors: Andrew Brownsword, Tayler Hetherington, Pavan Chandrashekar, Akhilesh Singhania, Stuart Wray, Pravin Shinde, Felix Schmidt, Craig Schelp, Onur Kocberber, Juan Fernandez Peinador, Rod Reddekopp, Manel Fernandez Gomez, Nipun Agarwal
  • Publication number: 20200259722
    Abstract: Herein are computerized techniques for autonomous and artificially intelligent administration of a computer cloud health monitoring system. In an embodiment, an orchestration computer automatically detects a current state of network elements of a computer network by processing: a) a network plan that defines a topology of the computer network, and b) performance statistics of the network elements. The network elements include computers that each hosts virtual execution environment(s). Each virtual execution environment hosts analysis logic that transforms raw performance data of a network element into a portion of the performance statistics. For each computer, a configuration specification for each virtual execution environment of the computer is automatically generated based on the network plan and the current state of the computer network. At least one virtual execution environment is automatically tuned and/or re-provisioned based on a generated configuration specification.
    Type: Application
    Filed: February 8, 2019
    Publication date: August 13, 2020
    Inventors: Onur Kocberber, Felix Schmidt, Craig Schelp, Pravin Shinde
  • Publication number: 20200133688
    Abstract: Herein are computerized techniques for generation, costing/scoring, optimal selection, and reporting of intermediate configurations for a datacenter change plan. In an embodiment, a computer receives a current configuration of a datacenter and a target configuration. New configurations are generated based on the current configuration. A cost function is applied to calculate a cost of each new configuration based on measuring a logical difference between the new configuration and the target configuration. A particular new configuration is selected that has a least cost. When the particular configuration satisfies the target configuration, the datacenter is reconfigured based on the particular configuration. Otherwise, this process is (e.g. iteratively) repeated with the particular configuration instead used as the current configuration. In embodiments, new configurations are randomly, greedily, and/or manually generated.
    Type: Application
    Filed: October 30, 2018
    Publication date: April 30, 2020
    Inventors: PRAVIN SHINDE, FELIX SCHMIDT, CRAIG SCHELP
  • Publication number: 20200118039
    Abstract: Techniques are described herein for estimating CPU, memory, and I/O utilization for a workload via out-of-band sensor readings using a machine learning model. The framework involves receiving sensor data associated with executing benchmark applications, obtaining ground truth utilization values for the benchmarks, preprocessing the training data to select a set of enhanced sequences, and using the enhanced sequences to train a random forest model to estimate CPU, memory, and I/O utilization given sensor monitoring data. Prior to the training phase, a machine learning model is trained using a set of predefined hyper-parameters. The trained models are used to generate estimations for CPU, memory, and I/O utilizations values. The utilization values are used with workload context information to assess the deployment and generate one or more recommendations for machine types that will best serve the workload in terms of system utilization.
    Type: Application
    Filed: October 10, 2018
    Publication date: April 16, 2020
    Inventors: Onur Kocberber, Felix Schmidt, Craig Schelp, Andrew Brownsword, Nipun Agarwal
  • Publication number: 20200089529
    Abstract: Herein are techniques for analysis of data streams. In an embodiment, a computer associates each software actor with data streams. Each software actor has its own backlog queue of data to analyze. In response to receiving some stream content and based on the received stream content, data is distributed to some software actors. In response to determining that the data satisfies completeness criteria of a particular software actor, an indication of the data is appended onto the backlog queue of the particular software actor. The particular software actor is reset to an initial state by loading an execution snapshot of a previous initial execution of an embedded virtual machine. Based on the particular software actor, execution of the execution snapshot of the previous initial execution is resumed to dequeue and process the indication of the data from the backlog queue of the particular software actor to generate a result.
    Type: Application
    Filed: September 19, 2018
    Publication date: March 19, 2020
    Inventors: ANDREW BROWNSWORD, TAYLER HETHERINGTON, PAVAN CHANDRASHEKAR, AKHILESH SINGHANIA, STUART WRAY, PRAVIN SHINDE, FELIX SCHMIDT, CRAIG SCHELP, ONUR KOCBERBER, JUAN FERNANDEZ PEINADOR, ROD REDDEKOPP, MANEL FERNANDEZ GOMEZ, NIPUN AGARWAL
  • Publication number: 20200076840
    Abstract: Techniques are provided herein for contextual embedding of features of operational logs or network traffic for anomaly detection based on sequence prediction. In an embodiment, a computer has a predictive recurrent neural network (RNN) that detects an anomalous network flow. In an embodiment, an RNN contextually transcodes sparse feature vectors that represent log messages into dense feature vectors that may be predictive or used to generate predictive vectors. In an embodiment, graph embedding improves feature embedding of log traces. In an embodiment, a computer detects and feature-encodes independent traces from related log messages. These techniques may detect malicious activity by anomaly analysis of context-aware feature embeddings of network packet flows, log messages, and/or log traces.
    Type: Application
    Filed: September 5, 2018
    Publication date: March 5, 2020
    Inventors: JUAN FERNANDEZ PEINADOR, MANEL FERNANDEZ GOMEZ, GUANG-TONG ZHOU, HOSSEIN HAJIMIRSADEGHI, ANDREW BROWNSWORD, ONUR KOCBERBER, FELIX SCHMIDT, CRAIG SCHELP
  • Publication number: 20200076842
    Abstract: Techniques are provided herein for contextual embedding of features of operational logs or network traffic for anomaly detection based on sequence prediction. In an embodiment, a computer has a predictive recurrent neural network (RNN) that detects an anomalous network flow. In an embodiment, an RNN contextually transcodes sparse feature vectors that represent log messages into dense feature vectors that may be predictive or used to generate predictive vectors. In an embodiment, graph embedding improves feature embedding of log traces. In an embodiment, a computer detects and feature-encodes independent traces from related log messages. These techniques may detect malicious activity by anomaly analysis of context-aware feature embeddings of network packet flows, log messages, and/or log traces.
    Type: Application
    Filed: September 5, 2018
    Publication date: March 5, 2020
    Inventors: GUANG-TONG ZHOU, HOSSEIN HAJIMIRSADEGHI, ANDREW BROWNSWORD, STUART WRAY, CRAIG SCHELP, ROD REDDEKOPP, FELIX SCHMIDT