Patents by Inventor Tomas Komarek

Tomas Komarek has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230376836
    Abstract: Techniques and architecture are described for converting tree structured data such as, for example, JavaScript Object Notation (JSON) data, into multiple feature vectors to train multiple instance learning (MIL) models for providing cybersecurity in networks. In particular, a data set is provided, wherein the data set comprises a sample configured as a hierarchical tree. The sample is converted into a set of path and value pairs, e.g., flattened into a set of path and value pairs, where the path is a sequence of field names and array indices encoding a position of a value. Each path and value pair of the set of path and value pairs is converted into a respective feature vector to form a set of feature vectors. The set of feature vectors is used to train a multiple instance learning (MIL) model, wherein each feature vector has the same fixed length.
    Type: Application
    Filed: May 20, 2022
    Publication date: November 23, 2023
    Inventors: Tomas Komarek, Stepan Dvorak, Jan Brabec
  • Patent number: 11799904
    Abstract: Inverse imbalance subspace searching techniques are used to detect potential malware among samples of network communication data. A large number of samples of network communication data, such as proxy log data and/or network flows, are received and analyzed by a malware detection system. A number of the samples are associated with known malware, while other unlabeled samples are either benign or may be associated with unknown malware. An inverse imbalance subspace search may be performed, in which the sample sets are divided into subsets based on random feature thresholds, and each subset is evaluated based on the ratio of known malware samples to unlabeled samples. Unlabeled samples within subsets having high malware sample ratios may be identified, aggregated, and processed as potential malware.
    Type: Grant
    Filed: December 10, 2020
    Date of Patent: October 24, 2023
    Assignee: Cisco Technology, Inc.
    Inventors: Tomas Komarek, Jan Brabec, Cenek Skarda
  • Patent number: 11374944
    Abstract: In one embodiment, a network security service forms, for each of a plurality of malware classes, a feature vector descriptor for the malware class. The service uses the feature vector descriptors for the malware classes and a symmetric mapping function to generate a training dataset having both positively and negatively labeled feature vectors. The service trains, using the training dataset, an instant threat detector to determine whether telemetry data for a particular traffic flow is within a threshold of similarity to a feature vector descriptor for a new malware class that was not part of the plurality of malware classes.
    Type: Grant
    Filed: December 19, 2018
    Date of Patent: June 28, 2022
    Assignee: Cisco Technology, Inc.
    Inventors: Tomas Komarek, Petr Somol
  • Publication number: 20220191244
    Abstract: Inverse imbalance subspace searching techniques are used to detect potential malware among samples of network communication data. A large number of samples of network communication data, such as proxy log data and/or network flows, are received and analyzed by a malware detection system. A number of the samples are associated with known malware, while other unlabeled samples are either benign or may be associated with unknown malware. An inverse imbalance subspace search may be performed, in which the sample sets are divided into subsets based on random feature thresholds, and each subset is evaluated based on the ratio of known malware samples to unlabeled samples. Unlabeled samples within subsets having high malware sample ratios may be identified, aggregated, and processed as potential malware.
    Type: Application
    Filed: December 10, 2020
    Publication date: June 16, 2022
    Inventors: Tomas Komarek, Jan Brabec, Cenek Skarda
  • Patent number: 11271833
    Abstract: In one embodiment, a device groups feature vectors representing network traffic flows into bags. The device forms a bag representation of a particular one of the bags by aggregating the feature vectors in the particular bag. The device extends one or more feature vectors in the particular bag with the bag representation. The extended one or more feature vectors are positive examples of a classification label for the network traffic. The device trains a network traffic classifier using training data that comprises the one or more feature vectors extended with the bag representation.
    Type: Grant
    Filed: October 23, 2017
    Date of Patent: March 8, 2022
    Assignee: Cisco Technology, Inc.
    Inventors: Tomas Komarek, Martin Vejman, Petr Somol
  • Patent number: 10867036
    Abstract: In one embodiment, a device divides groups of tuples of traffic characteristics of encrypted network traffic into different pairs of the characteristics. Each of the pairs has a corresponding two-dimensional (2-D) feature subspace. The device discretizes the 2-D feature subspaces to form a plurality of bins in each feature subspace. The device assigns the pairs of the traffic characteristics in a particular group of tuples to the bins in the discretized 2-D feature subspaces. The device forms, for each group of tuples, a vector representation of the group of tuples based on the bins in the discretized 2-D feature subspaces to which the pairs of the traffic characteristics from the group are assigned. The vector representations of the groups of tuples are of a fixed dimension. The device uses the vector representations of the groups of tuples to train a machine learning-based traffic classifier.
    Type: Grant
    Filed: October 12, 2017
    Date of Patent: December 15, 2020
    Assignee: Cisco Technology, Inc.
    Inventors: Tomas Komarek, Petr Somol
  • Publication number: 20200204569
    Abstract: In one embodiment, a network security service forms, for each of a plurality of malware classes, a feature vector descriptor for the malware class. The service uses the feature vector descriptors for the malware classes and a symmetric mapping function to generate a training dataset having both positively and negatively labeled feature vectors. The service trains, using the training dataset, an instant threat detector to determine whether telemetry data for a particular traffic flow is within a threshold of similarity to a feature vector descriptor for a new malware class that was not part of the plurality of malware classes.
    Type: Application
    Filed: December 19, 2018
    Publication date: June 25, 2020
    Inventors: Tomas Komarek, Petr Somol
  • Publication number: 20190123982
    Abstract: In one embodiment, a device groups feature vectors representing network traffic flows into bags. The device forms a bag representation of a particular one of the bags by aggregating the feature vectors in the particular bag. The device extends one or more feature vectors in the particular bag with the bag representation. The extended one or more feature vectors are positive examples of a classification label for the network traffic. The device trains a network traffic classifier using training data that comprises the one or more feature vectors extended with the bag representation.
    Type: Application
    Filed: October 23, 2017
    Publication date: April 25, 2019
    Inventors: Tomas Komarek, Martin Vejman, Petr Somol
  • Publication number: 20190114416
    Abstract: In one embodiment, a device divides groups of tuples of traffic characteristics of encrypted network traffic into different pairs of the characteristics. Each of the pairs has a corresponding two-dimensional (2-D) feature subspace. The device discretizes the 2-D feature subspaces to form a plurality of bins in each feature subspace. The device assigns the pairs of the traffic characteristics in a particular group of tuples to the bins in the discretized 2-D feature subspaces. The device forms, for each group of tuples, a vector representation of the group of tuples based on the bins in the discretized 2-D feature subspaces to which the pairs of the traffic characteristics from the group are assigned. The vector representations of the groups of tuples are of a fixed dimension. The device uses the vector representations of the groups of tuples to train a machine learning-based traffic classifier.
    Type: Application
    Filed: October 12, 2017
    Publication date: April 18, 2019
    Inventors: Tomas Komarek, Petr Somol
  • Publication number: 20170230395
    Abstract: Actual traffic logs of network traffic to and from host devices in a network are collected over time. Artificial traffic logs for each of multiple artificial network address translation (NAT) devices are generated from the actual traffic logs. The actual traffic logs and the artificial traffic logs are labeled as being indicative of non-NAT devices and NAT devices, respectively, to produce labeled traffic logs. From the labeled traffic logs for each artificial NAT device and each non-NAT device, respective, correspondingly labeled, network traffic features indicative of whether the device behaves like a NAT device or a non-NAT device are extracted. A classifier device is trained using the network traffic features extracted for each artificial NAT device and each non-NAT device to classify between an actual NAT device and an actual non-NAT device based on further actual traffic logs.
    Type: Application
    Filed: April 25, 2017
    Publication date: August 10, 2017
    Inventors: Tomás Komárek, Martin Grill, Tomás Pevný
  • Publication number: 20160315952
    Abstract: Network traffic logs of network traffic to and from host devices connected to a network that were collected over time are accessed. For each host device identified in the logs, a set of network traffic features indicative of whether the host device behaves like a Network Address Translation (NAT) device or an end host device is extracted from the logs for the host device. Each feature has values that vary over time based on the logs. A trained host device behavior classifier classifies the host device as either a NAT device or an end host device based on one or more of the feature values.
    Type: Application
    Filed: April 27, 2015
    Publication date: October 27, 2016
    Inventors: Tomás Komárek, Martin Grill, Tomás Pevný
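
The abstract of publication 20230376836 above describes flattening a tree-structured sample into path and value pairs and turning each pair into a feature vector of fixed length. The following Python sketch illustrates that general idea only; it is not taken from the application. The function names (flatten, bucket, encode_pair), the hashing-based encoding, the vector length of 16, and the sample record are all assumptions chosen for illustration, and the abstract does not say how the conversion to vectors is actually performed.

```python
import hashlib
import json


def flatten(node, path=()):
    """Yield (path, value) pairs for every leaf of a JSON-like tree.

    The path is a tuple of field names and array indices, as in the abstract.
    Empty objects and empty arrays produce no pairs in this sketch.
    """
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from flatten(child, path + (index,))
    else:
        yield path, node


def bucket(text, dim):
    """Map a string to a stable bucket index in [0, dim) using SHA-256."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % dim


def encode_pair(path, value, dim=16):
    """Encode one (path, value) pair as a vector of length dim.

    Assumption: a simple hashing trick stands in for whatever encoding
    the application uses; it only guarantees the fixed length.
    """
    vec = [0.0] * dim
    path_text = "/".join(str(part) for part in path)
    vec[bucket("path:" + path_text, dim)] += 1.0
    vec[bucket("value:" + json.dumps(value), dim)] += 1.0
    return vec


# Hypothetical sample record, invented for this example.
sample = json.loads(
    '{"host": "example.com", "ports": [80, 443], "tls": {"version": "1.3"}}'
)

pairs = list(flatten(sample))
bag = [encode_pair(path, value) for path, value in pairs]

for path, value in pairs:
    print(path, value)
```

Each sample becomes a variable-size collection ("bag") of equal-length vectors, which is the input shape a multiple instance learning model expects. The hashing trick was picked here because it needs no vocabulary and always yields the same length; a learned embedding or a type-specific encoding would be equally plausible choices.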