Patents by Inventor Marian Radu

Marian Radu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

ANOMALOUS COMMAND LINE ENTRY DETECTION

Publication number: 20240095346

Abstract: A command line anomaly detection system can generate anomaly scores associated with command line entries, such that command line entries associated with the highest anomaly scores can be identified. The command line anomaly detection system can include a transformer model trained, via unsupervised machine learning, to determine meanings of components of individual command line entries. The command line anomaly detection system can also include an anomaly detection model trained, via unsupervised machine learning, to determine anomaly scores based on the meanings of components of individual command line entries determined by the transformer model.

Type: Application

Filed: September 15, 2022

Publication date: March 21, 2024

Inventors: Stefan-Bogdan Cocea, Mihaela Petruta Gaman, Cristian Viorel Popa, Marian Radu

FILE FORMAT IDENTIFICATION SYSTEM

Publication number: 20230376526

Abstract: A file format identification system can predict file formats associated with binary data. The file format identification system can extract n-grams, such as byte 4-grams, from the binary data. A trained neural network with at least one embedding layer can generate embedding arrays that correspond to the extracted n-grams. A trained file format classifier can compare values in the embedding arrays with patterns of values associated with known file formats. The trained file format classifier can accordingly determine which of the known file formats are most likely to be associated with the binary data.

Type: Application

Filed: May 19, 2022

Publication date: November 23, 2023

Inventor: Marian Radu
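
The n-gram extraction step mentioned in the abstract above can be illustrated with a minimal sketch. It covers only the extraction of byte 4-grams; the embedding network and the file format classifier are not shown. The function names `byte_ngrams` and `top_ngrams` are hypothetical and do not come from the application.

```python
from collections import Counter
from typing import List, Tuple


def byte_ngrams(data: bytes, n: int = 4) -> List[bytes]:
    """Return every overlapping n-byte sequence in data (byte 4-grams by default)."""
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def top_ngrams(data: bytes, n: int = 4, k: int = 3) -> List[Tuple[bytes, int]]:
    """Return the k most common n-grams with their counts (illustrative helper)."""
    return Counter(byte_ngrams(data, n)).most_common(k)


if __name__ == "__main__":
    # The first bytes of a PNG file, used here only as sample input.
    header = b"\x89PNG\r\n\x1a\n"
    for gram in byte_ngrams(header):
        print(gram.hex())
```

In a system like the one described, sequences such as these would be fed to a trained embedding layer; that stage is outside the scope of this sketch.
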

ENTROPY EXCLUSION OF TRAINING DATA FOR AN EMBEDDING NETWORK

Publication number: 20230367849

Abstract: Methods and systems are provided for entropy exclusion of labeled training data by extracting windows therefrom, for training an embedding learning model to output a feature space for a feature space based learning model. Based on feature embedding by machine learning, a machine learning model is trained to embed feature vectors in a feature space which magnifies distances between features of a labeled dataset. Before training, however, sub-sequences of bytes are extracted from each sample of the labeled subset, based on a window size hyperparameter and a window distance hyperparameter. Information entropy is computed for each among a set of extracted windows, and extracted windows having highest information entropy, as well as extracted windows having lowest information entropy, are excluded therefrom. Extracted windows of the subset are stored in a data stream and accessed sequentially to derive feature vectors.

Type: Application

Filed: May 16, 2022

Publication date: November 16, 2023

Inventors: Marian Radu, Daniel Radu
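
The windowing and entropy filtering described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the method claimed in the application: the window distance hyperparameter is interpreted here as the stride between window start offsets, and the number of windows dropped at each extreme (`drop_each_end`) is a hypothetical parameter. All function names are likewise hypothetical.

```python
import math
from collections import Counter
from typing import List


def extract_windows(sample: bytes, window_size: int, window_distance: int) -> List[bytes]:
    """Extract fixed-size byte windows; window_distance is assumed to be the stride."""
    last_start = len(sample) - window_size
    return [sample[i:i + window_size] for i in range(0, last_start + 1, window_distance)]


def shannon_entropy(window: bytes) -> float:
    """Shannon entropy of the byte distribution in a window, in bits per byte."""
    total = len(window)
    counts = Counter(window)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def exclude_entropy_extremes(windows: List[bytes], drop_each_end: int = 1) -> List[bytes]:
    """Drop the lowest-entropy and highest-entropy windows (assumed count per end)."""
    ranked = sorted(windows, key=shannon_entropy)
    if len(ranked) <= 2 * drop_each_end:
        return []
    return ranked[drop_each_end:len(ranked) - drop_each_end]


if __name__ == "__main__":
    sample = bytes(8) + b"\x00\x00\x01\x01" + bytes(range(16))
    windows = extract_windows(sample, window_size=4, window_distance=4)
    for w in exclude_entropy_extremes(windows):
        print(w.hex(), shannon_entropy(w))
```

The kept windows would then be stored and read sequentially to derive feature vectors; that stage is not shown.
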

APPLICATIONS OF MACHINE LEARNING MODELS TO A BINARY SEARCH ENGINE BASED ON AN INVERTED INDEX OF BYTE SEQUENCES

Publication number: 20230359601

Abstract: Techniques for searching an inverted index associating byte sequences of a fixed length and files that contain those byte sequences are described herein. Byte sequences comprising a search query are determined and searched in the inverted index. In some examples, training data for training machine learning (ML) model(s) may be created using pre-featured data from the inverted index. In various examples, training data may be used to retrain the ML model until the ML model meets a criterion. In some examples, the trained ML model may be used to perform searches on the inverted index and classify files.

Type: Application

Filed: June 30, 2023

Publication date: November 9, 2023

Inventors: Horea Razvan Coroiu, Daniel Radu, Marian Radu
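
As a rough illustration of the inverted index described in the abstract above, the sketch below maps fixed-length byte sequences to the names of the files that contain them and answers a query by intersecting the posting sets. The sequence length of 4, the in-memory dictionary, and the names `build_index` and `search` are assumptions made for the example; the machine learning stages are not shown.

```python
from collections import defaultdict
from typing import Dict, Set


def build_index(files: Dict[str, bytes], n: int = 4) -> Dict[bytes, Set[str]]:
    """Map every n-byte sequence to the set of file names that contain it."""
    index: Dict[bytes, Set[str]] = defaultdict(set)
    for name, data in files.items():
        for i in range(len(data) - n + 1):
            index[data[i:i + n]].add(name)
    return index


def search(index: Dict[bytes, Set[str]], query: bytes, n: int = 4) -> Set[str]:
    """Return files that contain every n-byte sequence of the query."""
    grams = [query[i:i + n] for i in range(len(query) - n + 1)]
    if not grams:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(grams[0], set()))
    for gram in grams[1:]:
        result &= index.get(gram, set())
    return result


if __name__ == "__main__":
    corpus = {"a.bin": b"hello world", "b.bin": b"hello there", "c.bin": b"goodbye"}
    idx = build_index(corpus)
    print(sorted(search(idx, b"hello")))
```

Intersecting posting sets yields candidate files only: a file can contain every sequence of the query without containing the query contiguously, so a real engine would likely verify candidates afterwards.
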

DERIVING STATISTICALLY PROBABLE AND STATISTICALLY RELEVANT INDICATOR OF COMPROMISE SIGNATURE FOR MATCHING ENGINES

Publication number: 20230351016

Abstract: Methods and systems are provided for a histogram model configuring a computing system to derive an indicator of compromise signature based on a sliding window index of identified malware samples, and a matching rule constructor configuring a computing system to generate matching signatures by selecting statistically relevant n-grams of an unidentified file sample. A matching rule constructor configures the computing system to construct a matching rule including, as a signature, 32 n-grams found in the unidentified file sample which occur most frequently, and another 32 n-grams found in the unidentified file sample which occur least frequently amongst records of the threat database across 32 discrete file size ranges.

Type: Application

Filed: April 29, 2022

Publication date: November 2, 2023

Inventors: Marian Radu, Daniel Radu
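
The selection of most frequent and least frequent n-grams described in the abstract above can be sketched in simplified form. The sketch assumes a single precomputed frequency table (`db_counts`) standing in for the threat database and ignores the 32 file size ranges; the n-gram length, the tie-breaking rule, and the name `build_signature` are also assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List


def build_signature(sample: bytes, db_counts: Counter, n: int = 4, k: int = 32) -> Dict[str, List[bytes]]:
    """Pick the sample's k most and k least frequent n-grams per db_counts.

    db_counts is a hypothetical stand-in for per-n-gram frequencies in a
    threat database. Ties are broken by byte order to keep results stable.
    """
    grams = {sample[i:i + n] for i in range(len(sample) - n + 1)}
    ranked = sorted(grams, key=lambda g: (db_counts.get(g, 0), g))
    least = ranked[:k]
    # Take the most frequent n-grams from what remains so the lists never overlap.
    most = ranked[len(least):][-k:]
    return {"most_frequent": most, "least_frequent": least}


if __name__ == "__main__":
    counts = Counter({b"ab": 10, b"bc": 1, b"cd": 5})
    print(build_signature(b"abcd", counts, n=2, k=1))
```

With `k=32`, the two lists together would form a 64-entry signature, in line with the counts given in the abstract.
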

BYTE N-GRAM EMBEDDING MODEL

Publication number: 20230334154

Abstract: Training and use of a byte n-gram embedding model is described herein. A neural network is trained to determine a probability of occurrence associated with a byte n-gram. The neural network includes one or more embedding model layers, at least one of which is configured to output an embedding array of values. The byte n-gram embedding model may be used to generate a hash of received data, to classify the received data with no knowledge of a data structure associated with the received data, to compare the received data to files having a known classification, and/or to generate a signature for the received data.

Type: Application

Filed: June 22, 2023

Publication date: October 19, 2023

Inventors: Radu Cazan, Daniel Radu, Marian Radu

Byte n-gram embedding model

Patent number: 11727112

Abstract: Training and use of a byte n-gram embedding model is described herein. A neural network is trained to determine a probability of occurrence associated with a byte n-gram. The neural network includes one or more embedding model layers, at least one of which is configured to output an embedding array of values. The byte n-gram embedding model may be used to generate a hash of received data, to classify the received data with no knowledge of a data structure associated with the received data, to compare the received data to files having a known classification, and/or to generate a signature for the received data.

Type: Grant

Filed: December 31, 2018

Date of Patent: August 15, 2023

Assignee: CrowdStrike, Inc.

Inventors: Radu Cazan, Daniel Radu, Marian Radu

Applications of machine learning models to a binary search engine based on an inverted index of byte sequences

Patent number: 11709811

Abstract: Techniques for searching an inverted index associating byte sequences of a fixed length and files that contain those byte sequences are described herein. Byte sequences comprising a search query are determined and searched in the inverted index. In some examples, training data for training machine learning model(s) may be created using pre-featured data from the inverted index. In various examples, training data may be used to retrain a ML model until the ML model meets a criterion. In some examples, the trained ML model may be used to perform searches on the inverted index and classify files.

Type: Grant

Filed: May 14, 2019

Date of Patent: July 25, 2023

Assignee: CrowdStrike, Inc.

Inventors: Horea Coroiu, Daniel Radu, Marian Radu

BYTE N-GRAM EMBEDDING MODEL

Publication number: 20200005082

Abstract: Training and use of a byte n-gram embedding model is described herein. A neural network is trained to determine a probability of occurrence associated with a byte n-gram. The neural network includes one or more embedding model layers, at least one of which is configured to output an embedding array of values. The byte n-gram embedding model may be used to generate a hash of received data, to classify the received data with no knowledge of a data structure associated with the received data, to compare the received data to files having a known classification, and/or to generate a signature for the received data.

Type: Application

Filed: December 31, 2018

Publication date: January 2, 2020

Inventors: Radu Cazan, Daniel Radu, Marian Radu

APPLICATIONS OF MACHINE LEARNING MODELS TO A BINARY SEARCH ENGINE BASED ON AN INVERTED INDEX OF BYTE SEQUENCES

Publication number: 20190266141

Abstract: Techniques for searching an inverted index associating byte sequences of a fixed length and files that contain those byte sequences are described herein. Byte sequences comprising a search query are determined and searched in the inverted index. In some examples, training data for training machine learning model(s) may be created using pre-featured data from the inverted index. In various examples, training data may be used to retrain a ML model until the ML model meets a criterion. In some examples, the trained ML model may be used to perform searches on the inverted index and classify files.

Type: Application

Filed: May 14, 2019

Publication date: August 29, 2019

Applicant: CrowdStrike, Inc.

Inventors: Horea Coroiu, Daniel Radu, Marian Radu

REPRESENTING AND COMPARING FILES BASED ON SEGMENTED SIMILARITY

Publication number: 20170193230

Abstract: Disclosed herein is a system and method for determining whether two files are similar or an unknown file contains malware or other malicious activity. The system takes a suspect file and generates a hash for the file. The hash represents segments of a file that may be compared with segments of other hashes. This hash is then compared with the hash of another file. The comparison measures the distance between the two hashes, and if the two hashes are close enough to each other, then the two files are considered similar to each other.

Type: Application

Filed: May 3, 2015

Publication date: July 6, 2017

Inventors: Roy Jevnisek, Tomer Brand, Patrick Estavillo, Marian Radu
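
The idea of a segment-based hash and a distance between two such hashes, as described in the abstract above, can be illustrated with a minimal sketch. The abstract does not specify the hash or the distance, so everything here is an assumption: fixed 64-byte segments, a truncated SHA-256 digest per segment, and a distance equal to the fraction of segment positions that differ. The names `segment_hash` and `hash_distance` are hypothetical.

```python
import hashlib
from itertools import zip_longest
from typing import List


def segment_hash(data: bytes, segment_size: int = 64) -> List[str]:
    """Hash each fixed-size segment and keep a short digest per segment."""
    return [
        hashlib.sha256(data[i:i + segment_size]).hexdigest()[:8]
        for i in range(0, len(data), segment_size)
    ]


def hash_distance(h1: List[str], h2: List[str]) -> float:
    """Fraction of segment positions whose digests differ (0.0 means identical)."""
    if not h1 and not h2:
        return 0.0
    # zip_longest pads the shorter hash with None, which counts as a mismatch.
    pairs = list(zip_longest(h1, h2))
    differing = sum(1 for a, b in pairs if a != b)
    return differing / len(pairs)


if __name__ == "__main__":
    original = bytes(range(256))
    modified = original[:192] + bytes(64)  # Only the last segment changes.
    d = hash_distance(segment_hash(original), segment_hash(modified))
    print(d, "similar" if d <= 0.3 else "different")
```

The 0.3 threshold in the usage example is arbitrary. A positional comparison like this one is also sensitive to inserted or removed bytes, which shift every later segment; a production scheme would need to address that.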