Patents by Inventor Sameer T. Khanna
Sameer T. Khanna has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11928593
Abstract: Among a great deal of other disclosure and scope, systems and methods are enclosed that enable for highly efficient labeling of data. For example, in some of many cases, a novel methodology for ranking vectors most useful to label next is disclosed. In such an example, a neural network is trained to predict this ranking methodology upon being given a set of heuristics from which to assess the given problem space. A user can continue the cycle of identifying a set of candidate vectors to label, compiling relevant heuristics from said vectors, ranking vectors via the trained neural network, selecting a subset of the ranked vectors, inquiring an oracle regarding the true labels of the vectors, and then appending the subset of newly labelled vectors to the labelled set of vectors until satisfaction.
Type: Grant
Filed: June 15, 2021
Date of Patent: March 12, 2024
Assignee: Fortinet, Inc.
Inventor: Sameer T. Khanna
-
Patent number: 11921820
Abstract: Systems and methods are described for training a machine learning model using intelligently selected multiclass vectors. According to an embodiment, a set of un-labeled feature vectors are received. The set of feature vectors are grouped into clusters within a vector space having fewer dimensions than the first set of feature vectors by applying a homomorphic dimensionality reduction algorithm to the set of feature vectors and performing centroid-based clustering. An optimal set of clusters among the clusters is identified by performing a convex optimization process on the clusters. Vector labeling is minimized by selecting ground truth representative vectors including a representative vector from each cluster of the optimal set of clusters. A set of labeled feature vectors is created based on labels received from an oracle for each of the representative vectors. A machine-learning model is trained for multiclass classification based on the set of labeled feature vectors.
Type: Grant
Filed: September 11, 2020
Date of Patent: March 5, 2024
Assignee: Fortinet, Inc.
Inventor: Sameer T. Khanna
-
Patent number: 11870693
Abstract: Systems and methods for efficient kernel space packet processing and IoT device classification are provided. According to an embodiment, a computer system receives a packet in kernel space, ascertains whether the packet is destined for the computer system, when the ascertaining is affirmative the packet is forwarded to user space; otherwise, it is determined whether the packet is associated with a protocol used by IoT devices. When the determination is affirmative, header information is extracted from the packet, and subsequent IoT device detection processing is facilitated by sending the header information to the user space. The same or a separate computer system may perform the IoT device detection processing based on the header information by for each identified TCP or UDP flow: creating a variable-length feature set; and inferring whether the TCP or UDP flow represents an IoT device or a non-IoT device communication by applying an ML model.
Type: Grant
Filed: December 31, 2020
Date of Patent: January 9, 2024
Assignee: Fortinet, Inc.
Inventors: Sameer T. Khanna, Xiaoguang Liu, Jianwen Zhang
-
Publication number: 20230058569
Abstract: Systems, devices, and methods are discussed for identifying possible improper file accesses by an endpoint device. In some cases an agent is placed on each system to be surveilled that records the absolute paths for each file accessed for each user. This information may be accumulated and sent to a central server or computer for analysis of all such file accesses on a user basis. In some cases, a file access tree is created, and in some implementations may be pruned of branches and leaves if deemed to be duplicates or very similar to other branches and leaves via a Levenshtein distance threshold. The resulting tree's edges may be scaled in particular implementations based on the deviation of a user's file accesses from their sphere of permissions. A variance metric may be computed from the final tree's form to capture the user's access patterns.
Type: Application
Filed: September 1, 2021
Publication date: February 23, 2023
Applicant: Fortinet, Inc.
Inventor: Sameer T. Khanna
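As a rough illustration of the pruning step mentioned in the abstract above, the Python sketch below drops file paths that fall within a Levenshtein distance threshold of a path already kept. The function names, the flat list used in place of a tree, and the threshold value are assumptions made for illustration; none of them come from the application itself.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def prune_similar_paths(paths, threshold):
    """Keep a path only if it is more than `threshold` edits from every kept path."""
    kept = []
    for path in paths:
        if all(levenshtein(path, k) > threshold for k in kept):
            kept.append(path)
    return kept


# Hypothetical access log for one user.
accessed = [
    "/home/alice/reports/q1.txt",
    "/home/alice/reports/q2.txt",  # one edit away from q1.txt
    "/etc/passwd",
]
pruned = prune_similar_paths(accessed, threshold=1)
```

With a threshold of 1, the second path is treated as a near-duplicate of the first and is dropped, while the unrelated path survives.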
-
Publication number: 20220398453
Abstract: Among a great deal of other disclosure and scope, systems and methods are enclosed that enable efficient assessment of the currently known manifolds within a problem space. A set of labeled vectors is identified as well as a set of unlabeled vectors. An angular based comparison is made between each unlabeled vector and each labeled vector. If the smallest angle between a given unlabeled vector and any of the labeled vectors is deemed satisfactory, such as when the angle is small and acute, the vector is deemed not crucial to obtain information regarding. However, if the smallest angle between a given unlabeled vector and any of the labeled vectors is deemed large, such as when the angle is orthogonal to the labeled set, then the given vector possesses vital information pivotal to learning our problem space. All such vectors are ranked, with the unlabeled vectors with the largest angles to our labeled set sent to our oracle first in order to improve our labeled set of vectors.
Type: Application
Filed: July 16, 2021
Publication date: December 15, 2022
Applicant: Fortinet, Inc.
Inventor: Sameer T. Khanna
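The angular ranking described in the abstract above can be sketched in a few lines of Python. This is a minimal illustration, assuming plain Euclidean vectors and an angle computed from the cosine of the dot product; the function names and sample vectors are hypothetical and not drawn from the application.

```python
import math


def angle_between(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def rank_by_smallest_angle(unlabeled, labeled):
    """Order unlabeled vectors so those angularly farthest from the labeled set come first."""
    scored = [(min(angle_between(u, l) for l in labeled), u) for u in unlabeled]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [u for _, u in scored]


labeled = [(1.0, 0.0)]
unlabeled = [(1.0, 0.1), (0.0, 1.0), (1.0, 1.0)]
ranking = rank_by_smallest_angle(unlabeled, labeled)
```

Here the vector orthogonal to the labeled set is ranked first, so it would be the first one sent to the oracle.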
-
Publication number: 20220398494
Abstract: Among a great deal of other disclosure and scope, systems and methods are disclosed in relation to a dual network entity designed for classification in problem spaces where the target can be one of multiple possibilities with as few labeled training examples as possible. In one of many possible implementations, a network is first used to identify vectors considered to possess immense amounts of information regarding the problem space. An oracle is then tasked with labeling such vectors. The secondary network uses the new insights gleaned about the problem space to identify unlabeled vectors that our target model has correctly identified. These vectors are pseudolabeled, providing further information about the problem space to our target model. The cycle continues until the operator is satisfied with the performance of the target model.
Type: Application
Filed: October 1, 2021
Publication date: December 15, 2022
Applicant: Fortinet, Inc.
Inventor: Sameer T. Khanna
-
Publication number: 20220398491
Abstract: Among a great deal of other disclosure and scope, systems and methods are enclosed that enable automated labelling of a subset of vectors in a given problem space. For example, in some of many cases, a first machine learning model pre-trained on a given problem space makes predictions regarding fresh, unseen data. In addition to this prediction, the model can output a confidence metric indicating its confidence regarding the prediction made. A subset of these vectors with the highest confidence may be selected. Relevant heuristics assessing each vector in the subset may be computed. These heuristics can be fed through a second machine learning model, which identifies if the given prediction made by the first model is correct. If so, the vector is automatically annotated with the correct predicted label, the vector is appended to the labeled set of data, and the first model is retrained with the new labeled set of data.
Type: Application
Filed: June 15, 2021
Publication date: December 15, 2022
Applicant: Fortinet, Inc.
Inventor: Sameer T. Khanna
-
Publication number: 20220398493
Abstract: Among a great deal of other disclosure and scope, systems and methods are disclosed in relation to training regression machine learning models. In one of many possible implementations, a region of particular interest is identified where it is important for the target model to be very accurate within the region even at the expense of accuracy outside the region. The operator then tunes the loss function hyperparameters in order to correctly fit the region of interest and importance dropoff desired for the problem space. The loss function generated is easily differentiable and scales the importance of the training example based on its distance from the region of interest. The custom loss function is plugged into one of multiple training algorithms such as the gradient descent algorithm Adam and can be used to train our target model as before.
Type: Application
Filed: October 1, 2021
Publication date: December 15, 2022
Applicant: Fortinet, Inc.
Inventor: Sameer T. Khanna
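To make the idea of a distance-scaled loss concrete, here is a minimal Python sketch. The abstract above does not give the functional form, so everything specific here is an assumption: a Gaussian falloff around a single `center` value of the target, with hypothetical hyperparameters `width` (how fast importance drops off) and `floor` (the minimum weight far from the region).

```python
import math


def importance(y_true, center, width, floor=0.1):
    """Smooth weight: 1.0 at the center of the region, decaying toward `floor`."""
    falloff = math.exp(-((y_true - center) ** 2) / (2.0 * width ** 2))
    return floor + (1.0 - floor) * falloff


def weighted_mse(y_true, y_pred, center, width, floor=0.1):
    """Mean squared error with each example scaled by its importance weight."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        total += importance(t, center, width, floor) * (t - p) ** 2
    return total / len(y_true)


# Same absolute error (1.0) inside and far outside the assumed region of interest.
loss_inside = weighted_mse([5.0], [6.0], center=5.0, width=1.0)
loss_outside = weighted_mse([50.0], [51.0], center=5.0, width=1.0)
```

An identical prediction error costs roughly ten times more inside the region than far outside it, which is the trade-off the abstract describes. Because the weight is a smooth function, a loss of this shape could in principle be handed to a gradient-based optimizer.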
-
Publication number: 20220398436
Abstract: Among a great deal of other disclosure and scope, systems and methods are enclosed that adapt adversarial learning principles to an active learning regime. Given a problem space of note, a set of labeled vectors, a machine learning model trained on the set of labeled vectors, and a set of unlabeled vectors, we identify the unlabeled vectors our model is most unsure of. Each of our unlabeled vectors in our set of unlabeled vectors is initially classified by our model, and the prediction probabilities are taken note of. Then, each of our unlabeled vectors in our set of unlabeled vectors is perturbed by adding some random noise. The perturbed vectors are reclassified by our model, with the prediction probabilities taken note of once again.
Type: Application
Filed: July 16, 2021
Publication date: December 15, 2022
Applicant: Fortinet, Inc.
Inventor: Sameer T. Khanna
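The classify, perturb, and reclassify steps in the abstract above can be sketched as follows. The abstract stops before saying how the two sets of probabilities are compared, so the comparison used here (the largest change in any class probability) is an assumption, as are the toy logistic model, the Gaussian noise, and all names.

```python
import math
import random


def predict_proba(vector):
    """Hypothetical two-class model: a logistic curve on the first coordinate."""
    p = 1.0 / (1.0 + math.exp(-10.0 * vector[0]))
    return [1.0 - p, p]


def probability_shift(vector, model, noise_scale, rng):
    """Largest change in any class probability after adding Gaussian noise."""
    before = model(vector)
    perturbed = [x + rng.gauss(0.0, noise_scale) for x in vector]
    after = model(perturbed)
    return max(abs(a - b) for a, b in zip(before, after))


rng = random.Random(0)  # seeded so the sketch is repeatable
shift_near_boundary = probability_shift([0.0, 1.0], predict_proba, 0.05, rng)
shift_far_from_boundary = probability_shift([5.0, 1.0], predict_proba, 0.05, rng)
```

Under this assumed comparison, a vector sitting on the decision boundary shows a much larger shift than one far from it, so it would be a natural candidate for the "most unsure" set.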
-
Publication number: 20220398449
Abstract: Among a great deal of other disclosure and scope, systems and methods are enclosed that enable for highly efficient labeling of data. For example, in some of many cases, a novel methodology for ranking vectors most useful to label next is disclosed. In such an example, a neural network is trained to predict this ranking methodology upon being given a set of heuristics from which to assess the given problem space. A user can continue the cycle of identifying a set of candidate vectors to label, compiling relevant heuristics from said vectors, ranking vectors via the trained neural network, selecting a subset of the ranked vectors, inquiring an oracle regarding the true labels of the vectors, and then appending the subset of newly labelled vectors to the labelled set of vectors until satisfaction.
Type: Application
Filed: June 15, 2021
Publication date: December 15, 2022
Applicant: Fortinet, Inc.
Inventor: Sameer T. Khanna
-
Patent number: 11496394
Abstract: Systems and methods for efficient kernel space packet processing and IoT device classification are provided. According to one embodiment, a computer system performs IoT device detection processing. Packet header information is received for multiple packets. Based on the packet header information, multiple Transmission Control Protocol (TCP) or User Datagram Protocol (UDP) flows between a given source device of multiple devices and a given destination device of the multiple devices are identified. For each TCP or UDP flow: a variable-length feature set is created having a size limited by a predetermined or configurable aggregate number of packets sent and received for the TCP or UDP flow; and it is inferred whether the TCP or UDP flow represents an IoT device communication or a non-IoT device communication by applying a machine-learning model to the variable length feature set.
Type: Grant
Filed: December 31, 2020
Date of Patent: November 8, 2022
Assignee: Fortinet, Inc.
Inventors: Sameer T. Khanna, Xiaoguang Liu, Jianwen Zhang
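The flow-grouping and feature-set steps in the abstract above might look roughly like the Python sketch below. It is illustrative only: the header fields (`proto`, `src`, `dst`, `length`), the use of signed packet lengths as features, and the function name are assumptions, and the final machine-learning inference step is left out.

```python
def build_flow_features(headers, max_packets):
    """Group headers into bidirectional flows and keep at most `max_packets` features each."""
    flows = {}
    for h in headers:
        # Both directions between the same two endpoints share one flow key.
        key = (h["proto"], tuple(sorted((h["src"], h["dst"]))))
        if key not in flows:
            # Remember who sent the first packet so direction can be encoded.
            flows[key] = {"initiator": h["src"], "features": []}
        flow = flows[key]
        if len(flow["features"]) < max_packets:
            sign = 1 if h["src"] == flow["initiator"] else -1
            flow["features"].append(sign * h["length"])
    return {key: flow["features"] for key, flow in flows.items()}


# Hypothetical header records as they might arrive from a capture stage.
headers = [
    {"proto": "TCP", "src": "10.0.0.5", "dst": "10.0.0.9", "length": 60},
    {"proto": "TCP", "src": "10.0.0.9", "dst": "10.0.0.5", "length": 1500},
    {"proto": "UDP", "src": "10.0.0.5", "dst": "10.0.0.9", "length": 120},
    {"proto": "TCP", "src": "10.0.0.5", "dst": "10.0.0.9", "length": 52},
    {"proto": "TCP", "src": "10.0.0.5", "dst": "10.0.0.9", "length": 40},
]
features = build_flow_features(headers, max_packets=3)
```

The feature list for each flow has a variable length, capped by the aggregate count of packets seen in both directions, so the fourth TCP packet is ignored when the cap is 3.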
-
Publication number: 20220210065
Abstract: Systems and methods for efficient kernel space packet processing and IoT device classification are provided. According to an embodiment, a computer system receives a packet in kernel space, ascertains whether the packet is destined for the computer system, when the ascertaining is affirmative the packet is forwarded to user space; otherwise, it is determined whether the packet is associated with a protocol used by IoT devices. When the determination is affirmative, header information is extracted from the packet, and subsequent IoT device detection processing is facilitated by sending the header information to the user space. The same or a separate computer system may perform the IoT device detection processing based on the header information by for each identified TCP or UDP flow: creating a variable-length feature set; and inferring whether the TCP or UDP flow represents an IoT device or a non-IoT device communication by applying an ML model.
Type: Application
Filed: December 31, 2020
Publication date: June 30, 2022
Applicant: Fortinet, Inc.
Inventors: Sameer T. Khanna, Xiaoguang Liu, Jianwen Zhang
-
Publication number: 20220210066
Abstract: Systems and methods for efficient kernel space packet processing and IoT device classification are provided. According to one embodiment, a computer system performs IoT device detection processing. Packet header information is received for multiple packets. Based on the packet header information, multiple Transmission Control Protocol (TCP) or User Datagram Protocol (UDP) flows between a given source device of multiple devices and a given destination device of the multiple devices are identified. For each TCP or UDP flow: a variable-length feature set is created having a size limited by a predetermined or configurable aggregate number of packets sent and received for the TCP or UDP flow; and it is inferred whether the TCP or UDP flow represents an IoT device communication or a non-IoT device communication by applying a machine-learning model to the variable length feature set.
Type: Application
Filed: December 31, 2020
Publication date: June 30, 2022
Applicant: Fortinet, Inc.
Inventors: Sameer T. Khanna, Xiaoguang Liu, Jianwen Zhang
-
Publication number: 20220083810
Abstract: Systems and methods are described for training a machine learning model using intelligently selected multiclass vectors. According to an embodiment, a processing resource of a computing system receives a first set of un-labeled feature vectors. The first set of feature vectors are homomorphically translated using a T-Distributed Stochastic Neighbor Embedding (t-SNE) algorithm to obtain a second set of feature vectors with reduced dimensionality. The second set of feature vectors are clustered to obtain an initial set of clusters using centroid-based clustering. An optimal set of clusters is identified among the initial set of clusters by performing a convex optimization process on the initial set of clusters. For each cluster of the optimal set of clusters, a representative vector from the cluster is selected for labeling.
Type: Application
Filed: September 11, 2020
Publication date: March 17, 2022
Applicant: Fortinet, Inc.
Inventor: Sameer T. Khanna
-
Publication number: 20220083815
Abstract: Systems and methods are described for training a machine learning model using intelligently selected multiclass vectors. According to an embodiment, a set of un-labeled feature vectors are received. The set of feature vectors are grouped into clusters within a vector space having fewer dimensions than the first set of feature vectors by applying a homomorphic dimensionality reduction algorithm to the set of feature vectors and performing centroid-based clustering. An optimal set of clusters among the clusters is identified by performing a convex optimization process on the clusters. Vector labeling is minimized by selecting ground truth representative vectors including a representative vector from each cluster of the optimal set of clusters. A set of labeled feature vectors is created based on labels received from an oracle for each of the representative vectors. A machine-learning model is trained for multiclass classification based on the set of labeled feature vectors.
Type: Application
Filed: September 11, 2020
Publication date: March 17, 2022
Applicant: Fortinet, Inc.
Inventor: Sameer T. Khanna
-
Publication number: 20220083901
Abstract: Systems and methods are described for training a machine learning model using intelligently selected multiclass vectors. According to an embodiment, an un-labeled feature vector is selected from a set of feature vectors. A model classified cluster and a confidence score are obtained by classifying an un-labeled feature vector using a machine-learning model. A determination is made regarding whether the confidence score is greater than a threshold. When the determination is affirmative: (i) for each labeled feature vector, determining a distance metric for the un-labeled feature vector with respect to the labeled feature; (ii) determining a statistically matching cluster of labeled feature vectors to which the un-labeled feature vector is closest; and (iii) when the model classified cluster and the statistically matching cluster are one and the same: (a) labeling the un-labeled feature vector; and (b) model fitting the machine learning model based on the labeling.
Type: Application
Filed: September 11, 2020
Publication date: March 17, 2022
Applicant: Fortinet, Inc.
Inventor: Sameer T. Khanna
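The agreement check in the abstract above (confidence threshold, then a match between the model's cluster and the closest labeled cluster) can be sketched as below. This is a simplified illustration: the nearest labeled vector under Euclidean distance stands in for the "statistically matching cluster", the refitting step is omitted, and `try_pseudo_label` and `toy_model` are hypothetical names.

```python
import math


def try_pseudo_label(vector, model, labeled, threshold):
    """Return a label for `vector` if model and nearest labeled neighbour agree, else None."""
    predicted, confidence = model(vector)
    if confidence <= threshold:
        return None
    # Nearest labeled vector stands in for the "statistically matching cluster".
    nearest_label = min(labeled, key=lambda item: math.dist(vector, item[0]))[1]
    return predicted if predicted == nearest_label else None


def toy_model(vector):
    """Hypothetical classifier: class by sign of the first coordinate, fixed confidence."""
    return ("a" if vector[0] < 0 else "b"), 0.95


labeled = [((-1.0, 0.0), "a"), ((1.0, 0.0), "b")]
accepted = try_pseudo_label((0.8, 0.1), toy_model, labeled, threshold=0.9)
rejected = try_pseudo_label((0.8, 0.1), toy_model, labeled, threshold=0.99)
```

The vector is labeled only when the model is confident enough and the distance-based check independently lands on the same class; otherwise it stays unlabeled.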
-
Publication number: 20220083900
Abstract: Systems and methods are described for training a machine learning model using intelligently selected multiclass vectors. According to an embodiment, a processing resource of a computer system receives a set of feature vectors. For each feature vector of the set of feature vectors: (i) the feature vector is classified as one of multiple classes using a machine-learning model trained for multiclass classification; and (ii) a prediction skepticism metric, representing a degree of prediction skepticism relating to classification of the feature vector by the machine-learning model, is calculated for the feature vector using a heuristic function. A boundary condition vector is selected from the set of feature vectors for labeling having a highest degree of prediction skepticism.
Type: Application
Filed: September 11, 2020
Publication date: March 17, 2022
Applicant: Fortinet, Inc.
Inventor: Sameer T. Khanna
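The selection of a boundary condition vector can be illustrated with a short Python sketch. The abstract above does not specify the heuristic function, so the one used here (one minus the margin between the two most probable classes) is an assumption, as are the function names and the sample probabilities.

```python
def skepticism(probabilities):
    """Assumed heuristic: 1 minus the margin between the two most likely classes."""
    top, runner_up = sorted(probabilities, reverse=True)[:2]
    return 1.0 - (top - runner_up)


def most_skeptical(vectors, predict_proba):
    """Return the vector whose prediction has the highest skepticism score."""
    return max(vectors, key=lambda v: skepticism(predict_proba(v)))


# Hypothetical class probabilities for three vectors, keyed by vector id.
fake_probabilities = {
    "v1": [0.90, 0.05, 0.05],
    "v2": [0.40, 0.35, 0.25],
    "v3": [0.60, 0.30, 0.10],
}
choice = most_skeptical(list(fake_probabilities), fake_probabilities.get)
```

The vector whose top two classes are nearly tied scores highest and would be the one sent for labeling.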