Patents by Inventor Asankhaya Sharma

Asankhaya Sharma has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11899800
    Abstract: A system to create a stacked classifier model combination or classifier ensemble has been designed for identification of undisclosed flaws in software components on a large-scale. This classifier ensemble is capable of at least a 54.55% improvement in precision. The system uses a K-folding cross validation algorithm to partition a sample dataset and then train and test a set of N classifiers with the dataset folds. At each test iteration, trained models of the set of classifiers generate probabilities that a sample has a flaw, resulting in a set of N probabilities or predictions for each sample in the test data. With a sample size of S, the system passes the S sets of N predictions to a logistic regressor along with “ground truth” for the sample dataset to train a logistic regression model. The trained classifiers and the logistic regression model are stored as the classifier ensemble.
    Type: Grant
    Filed: June 28, 2022
    Date of Patent: February 13, 2024
    Assignee: Veracode, Inc.
    Inventors: Asankhaya Sharma, Yaqin Zhou
  • Publication number: 20230409464
    Abstract: With invocations of a software development pipeline, organization specific remediations/fixes for a software project can be learned from scanning results of code submissions (e.g., commits or merges) across an organization for a software project(s). Fixes of detected program code flaws can be detected and/or specified across scans and associated with flaw identifiers and used for training machine learning models to identify candidate fixes for detected flaws. This ongoing learning during development propagates fixes created or chosen by experts (e.g., software engineers working on the software project) relevant to the software project. The experts can choose from suggestions mined from the learned fixes of the organization and suggestions generated from a pipeline created with the trained machine learning models. The selections are then used for further training of the machine learning models that form the pipeline.
    Type: Application
    Filed: October 29, 2020
    Publication date: December 21, 2023
    Inventors: Asankhaya Sharma, Hao Xiao, Hendy Heng Lee Chua, Darius Tsien Wei Foo
  • Publication number: 20230153459
    Abstract: To preserve privacy when leveraging organization-specific remediation knowledge for flaw remediation across organizations, program code is deidentified to remove code which potentially identifies its source/origin. Deidentification operates based on structure of flaws and fixes at the level of source code constructs based on an abstract syntax tree (AST) or other structural context representation of a fix and corresponding flaw. Potentially identifying portions of a fix indicated in its AST are determined and modified (e.g., removed or obfuscated) without impacting AST structure. Deidentified remediation knowledge originating from different organizations is used to train a fix suggestion model(s) which learns structural context of fixes and corresponding flaws and, once trained, generates predictions indicating suggested fixes to flaws based on structural contexts of the flaws.
    Type: Application
    Filed: November 10, 2020
    Publication date: May 18, 2023
    Inventors: Asankhaya Sharma, Hao Xiao, Hendy Heng Lee Chua, Darius Tsien Wei Foo
  • Publication number: 20220327220
    Abstract: A system to create a stacked classifier model combination or classifier ensemble has been designed for identification of undisclosed flaws in software components on a large-scale. This classifier ensemble is capable of at least a 54.55% improvement in precision. The system uses a K-folding cross validation algorithm to partition a sample dataset and then train and test a set of N classifiers with the dataset folds. At each test iteration, trained models of the set of classifiers generate probabilities that a sample has a flaw, resulting in a set of N probabilities or predictions for each sample in the test data. With a sample size of S, the system passes the S sets of N predictions to a logistic regressor along with “ground truth” for the sample dataset to train a logistic regression model. The trained classifiers and the logistic regression model are stored as the classifier ensemble.
    Type: Application
    Filed: June 28, 2022
    Publication date: October 13, 2022
    Inventors: Asankhaya Sharma, Yaqin Zhou
  • Patent number: 11416622
    Abstract: A system to create a stacked classifier model combination or classifier ensemble has been designed for identification of undisclosed flaws in software components on a large-scale. This classifier ensemble is capable of at least a 54.55% improvement in precision. The system uses a K-folding cross validation algorithm to partition a sample dataset and then train and test a set of N classifiers with the dataset folds. At each test iteration, trained models of the set of classifiers generate probabilities that a sample has a flaw, resulting in a set of N probabilities or predictions for each sample in the test data. With a sample size of S, the system passes the S sets of N predictions to a logistic regressor along with “ground truth” for the sample dataset to train a logistic regression model. The trained classifiers and the logistic regression model are stored as the classifier ensemble.
    Type: Grant
    Filed: August 20, 2018
    Date of Patent: August 16, 2022
    Assignee: VERACODE, INC.
    Inventors: Asankhaya Sharma, Yaqin Zhou
  • Patent number: 10803061
    Abstract: To analyze open-source code at a large scale, a security domain graph language (“SGL”) has been created that functions as a vulnerability description language and facilitates program analysis queries. The SGL facilitates building and maintaining a graph database to catalogue vulnerabilities found in open-source components. This graphical database can be accessed via a database interface directly or accessed by an agent that interacts with the database interface. To build the graph database, a database interface processes an open-source component and creates graph structures which represent relationships present in the open-source component. The database interface transforms a vulnerability description into a canonical form based on a schema for the graph database and updates the database based on a determination of whether the vulnerability is a duplicate. This ensures quality and consistency of the vulnerability dataset maintained in the graph database.
    Type: Grant
    Filed: July 31, 2018
    Date of Patent: October 13, 2020
    Assignee: Veracode, Inc.
    Inventors: Darius Tsien Wei Foo, Ming Yi Ang, Jie Shun Yeo, Asankhaya Sharma
  • Publication number: 20200057858
    Abstract: A system to create a stacked classifier model combination or classifier ensemble has been designed for identification of undisclosed flaws in software components on a large-scale. This classifier ensemble is capable of at least a 54.55% improvement in precision. The system uses a K-folding cross validation algorithm to partition a sample dataset and then train and test a set of N classifiers with the dataset folds. At each test iteration, trained models of the set of classifiers generate probabilities that a sample has a flaw, resulting in a set of N probabilities or predictions for each sample in the test data. With a sample size of S, the system passes the S sets of N predictions to a logistic regressor along with “ground truth” for the sample dataset to train a logistic regression model. The trained classifiers and the logistic regression model are stored as the classifier ensemble.
    Type: Application
    Filed: August 20, 2018
    Publication date: February 20, 2020
    Inventors: Asankhaya Sharma, Yaqin Zhou
  • Publication number: 20200042712
    Abstract: To analyze open-source code at a large scale, a security domain graph language (“SGL”) has been created that functions as a vulnerability description language and facilitates program analysis queries. The SGL facilitates building and maintaining a graph database to catalogue vulnerabilities found in open-source components. This vulnerability database generated with SGL is used for analysis of software projects which use open source components. An agent which interacts with the vulnerability database can perform a scan of a software project to identify open-source components used in the project and submit queries to the vulnerability database to identify vulnerabilities which may affect the open-source components in the project. Results of the scan are presented to a user in the form of a vulnerability report which indicates vulnerabilities that have been discovered and which open-source components the vulnerabilities affect.
    Type: Application
    Filed: July 31, 2018
    Publication date: February 6, 2020
    Inventors: Darius Tsien Wei Foo, Ming Yi Ang, Jie Shun Yeo, Asankhaya Sharma
  • Publication number: 20200042628
    Abstract: To analyze open-source code at a large scale, a security domain graph language (“SGL”) has been created that functions as a vulnerability description language and facilitates program analysis queries. The SGL facilitates building and maintaining a graph database to catalogue vulnerabilities found in open-source components. This graphical database can be accessed via a database interface directly or accessed by an agent that interacts with the database interface. To build the graph database, a database interface processes an open-source component and creates graph structures which represent relationships present in the open-source component. The database interface transforms a vulnerability description into a canonical form based on a schema for the graph database and updates the database based on a determination of whether the vulnerability is a duplicate. This ensures quality and consistency of the vulnerability dataset maintained in the graph database.
    Type: Application
    Filed: July 31, 2018
    Publication date: February 6, 2020
    Inventors: Darius Tsien Wei Foo, Ming Yi Ang, Jie Shun Yeo, Asankhaya Sharma
  • Publication number: 20160098563
    Abstract: A facility for analyzing a pair of code files is described. From each of the code files, the facility extracts a hierarchy of textual names. The facility then determines the score reflecting a level of similarity between the extracted hierarchies of textual names for attribution to the pair of code files.
    Type: Application
    Filed: October 3, 2014
    Publication date: April 7, 2016
    Inventor: Asankhaya Sharma
  • Publication number: 20110126113
    Abstract: Aspects of the subject matter described herein relate to displaying content on multiple pages. In aspects, a request for content is received from a browsing component. The content is divided into pages suitable for displaying on a display associated with the browsing component. Navigation elements may be embedded in the pages to allow a user using the browsing component to navigate between pages corresponding to the content. The actions of dividing the content into multiple pages may occur on a content server, an entity intermediate to the content server and a client hosting the browsing component, or a component of the client.
    Type: Application
    Filed: November 23, 2009
    Publication date: May 26, 2011
    Applicant: c/o Microsoft Corporation
    Inventors: Asankhaya Sharma, Prashant Kumar Dhingra