Patents by Inventor Sebastiaan Johannes van Schaik
Sebastiaan Johannes van Schaik has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11449335
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing weights for source code alerts. One of the methods includes generating a respective sample of alerts for each feature of a plurality of features. One or more feature values are computed for alerts having a same respective attribute value for each feature of a plurality of features. An importance distribution that maps each feature value to a respective measure of importance for an alert having the feature value is used to compute a respective feature score for the feature using one or more feature values computed for the alert. A respective weight is computed for each alert by combining the plurality of feature scores computed for the alert.
Type: Grant
Filed: September 23, 2019
Date of Patent: September 20, 2022
Inventors: Sebastiaan Johannes van Schaik, Man Yue Mo, Jean Helie
-
Publication number: 20220156388
Abstract: The computer-performed automatic estimation of data leaks from private stores into public stores. The owner of the data in the private store can then be alerted to the estimation so the cause of such leaks can be remedied. The estimation is based on comparisons between similarity mapping results for data within the private store with similarity mapping results for data within the public store. As an example, the one-way similarity mapping could be a fuzzy hashing or a provenance signature.
Type: Application
Filed: November 16, 2020
Publication date: May 19, 2022
Inventors: Maya KACZOROWSKI, Pavel AVGUSTINOV, Oege DE MOOR, Sebastiaan Johannes VAN SCHAIK, Justin Allen HUTCHINGS, Derek S. JEDAMSKI, Adam Philip BALDWIN
-
Patent number: 11099843
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating similarity groupings of software projects. One of the methods includes computing respective values for a plurality of analysis metrics associated with each software development project of a plurality of software development projects, wherein the analysis metrics include snapshot metrics that represent respective properties of the commit history of snapshots in the software development project, functionality metrics that represent respective properties of software elements in the software development project, or both. A similarity grouping is computed for the primary software development project based on the respective computed values for the plurality of analysis metrics for the plurality of software development projects, wherein the similarity grouping for the primary software development project comprises fewer than all of the plurality of software development projects.
Type: Grant
Filed: December 21, 2018
Date of Patent: August 24, 2021
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventor: Sebastiaan Johannes van Schaik
-
Patent number: 10929125
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining the provenance of source code. One of the methods includes receiving a portion of a file occurring in a source code project. For each of a plurality of windows of characters in the portion of the file, a respective provenance signature is computed. An index that maps each provenance signature to occurrences of the provenance signature in one or more files of a plurality of projects is searched to identify one or more matching files that are each associated with at least one provenance signature computed for the portion of the file. Data identifying the one or more matching files is provided in response to receiving the portion of the file occurring in the source code project.
Type: Grant
Filed: December 21, 2018
Date of Patent: February 23, 2021
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventor: Sebastiaan Johannes van Schaik
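The window-and-index lookup described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how a provenance signature is computed, so the SHA-1 hash of each sliding window, the 16-character window size, and the names `signatures`, `build_index`, and `find_matches` are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict

WINDOW = 16  # hypothetical window size in characters (not stated in the abstract)


def signatures(text, window=WINDOW):
    """Return the hashes of every `window`-character sliding window in `text`."""
    if len(text) < window:
        return set()
    return {
        hashlib.sha1(text[i:i + window].encode("utf-8")).hexdigest()
        for i in range(len(text) - window + 1)
    }


def build_index(files):
    """Map each signature to the (project, path) locations where it occurs."""
    index = defaultdict(set)
    for location, text in files.items():
        for sig in signatures(text):
            index[sig].add(location)
    return index


def find_matches(index, portion):
    """Return every location sharing at least one signature with `portion`."""
    matches = set()
    for sig in signatures(portion):
        matches |= index.get(sig, set())
    return matches


# Hypothetical two-project corpus.
corpus = {
    ("proj-a", "util.py"): "def add(a, b):\n    return a + b\n",
    ("proj-b", "math.py"): "def mul(a, b):\n    return a * b\n",
}
index = build_index(corpus)
found = find_matches(index, "    return a + b\n")
```

Here `found` contains only the `proj-a` file, because no 16-character window of the queried portion also occurs in `proj-b`.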
-
Patent number: 10810007
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying system-generated code. One of the methods includes generating data representing a state of source code files of the snapshot before performing a build process for the snapshot. An instrumented build process is performed for the snapshot, including intercepting each compiler call of a plurality of compiler calls by the build process for the snapshot, and designating one or more respective source code files of each compiler call as source code files compiled during the build process for the snapshot. One or more source code files that are new or were modified after the build process was initiated are classified as source code files having system-generated source code.
Type: Grant
Filed: December 21, 2018
Date of Patent: October 20, 2020
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Arthur Baars, Sebastiaan Johannes van Schaik
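As a rough sketch of the classification step in this abstract, assume the pre-build state is a mapping from file path to content digest and that the instrumented build yields the same kind of mapping for every file passed to a compiler. Both representations, and the name `classify_generated`, are hypothetical; the abstract does not specify them.

```python
def classify_generated(before, compiled):
    """Return compiled files that are new or changed relative to the pre-build state.

    before:   path -> content digest recorded before the build started
    compiled: path -> content digest observed when the compiler call was intercepted
    """
    return {
        path
        for path, digest in compiled.items()
        if before.get(path) != digest  # a missing path (new file) also counts
    }


# Hypothetical snapshot: parser.c is produced from parser.y during the build.
before = {"src/main.c": "aaa", "src/parser.y": "bbb"}
compiled = {"src/main.c": "aaa", "build/parser.c": "ccc"}
generated = classify_generated(before, compiled)
```

In this example only `build/parser.c` is classified as system-generated, since `src/main.c` was compiled unchanged.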
-
Patent number: 10810009
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for presenting static analysis user interface presentations. One of the methods includes receiving, from a user, a request for a user interface presentation representing multiple properties of source code snapshots committed to a project versus time. A plurality of snapshots are obtained for the project, wherein each snapshot comprises a representation of source code for the project at a respective time period. Multiple snapshot metrics are computed for each snapshot, including a net violation count and a count of lines of code added or removed. A graphical user interface presentation is generated that correlates periodic lines of code metrics with overall violation metrics.
Type: Grant
Filed: July 16, 2018
Date of Patent: October 20, 2020
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventor: Sebastiaan Johannes van Schaik
-
Publication number: 20200225943
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for presenting static analysis user interface presentations. One of the methods includes receiving, from a user, a request for a user interface presentation representing multiple properties of source code snapshots committed to a project versus time. A plurality of snapshots are obtained for the project, wherein each snapshot comprises a representation of source code for the project at a respective time period. Multiple snapshot metrics are computed for each snapshot, including a net violation count and a count of lines of code added or removed. A graphical user interface presentation is generated that correlates periodic lines of code metrics with overall violation metrics.
Type: Application
Filed: July 16, 2018
Publication date: July 16, 2020
Inventor: Sebastiaan Johannes van Schaik
-
Publication number: 20200150952
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing weights for source code alerts. One of the methods includes generating a respective sample of alerts for each feature of a plurality of features. One or more feature values are computed for alerts having a same respective attribute value for each feature of a plurality of features. An importance distribution that maps each feature value to a respective measure of importance for an alert having the feature value is used to compute a respective feature score for the feature using one or more feature values computed for the alert. A respective weight is computed for each alert by combining the plurality of feature scores computed for the alert.
Type: Application
Filed: September 23, 2019
Publication date: May 14, 2020
Inventors: Sebastiaan Johannes van Schaik, Man Yue Mo, Jean Helie
-
Patent number: 10423409
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing weights for source code alerts. One of the methods includes generating a respective sample of alerts for each feature of a plurality of features. One or more feature values are computed for alerts having a same respective attribute value for each feature of a plurality of features. An importance distribution that maps each feature value to a respective measure of importance for an alert having the feature value is used to compute a respective feature score for the feature using one or more feature values computed for the alert. A respective weight is computed for each alert by combining the plurality of feature scores computed for the alert.
Type: Grant
Filed: April 23, 2018
Date of Patent: September 24, 2019
Assignee: Semmle Limited
Inventors: Sebastiaan Johannes van Schaik, Man Yue Mo, Jean Helie
-
Patent number: 10346294
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for comparing software projects having been analyzed using different criteria. One of the methods includes receiving, for each of a plurality of software projects, source code evaluation criteria that had been used to analyze source code of the respective software project. An overlapping set of source code evaluation criteria is determined. For each of the software projects, source code analysis results which resulted from the overlapping set of source code evaluation criteria are determined, and a respective value of a characteristic metric for the source code analysis results is computed. The respective values of the characteristic metric for each of the software projects are compared, and for at least one of the software projects, an assessment of the software project is output.
Type: Grant
Filed: April 11, 2017
Date of Patent: July 9, 2019
Assignee: Semmle Limited
Inventor: Sebastiaan Johannes van Schaik
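The comparison in this abstract can be sketched as a set intersection followed by a filtered count. The abstract leaves the characteristic metric open, so the sketch assumes it is simply the number of results produced by criteria in the overlapping set; the function name and the sample rule names are hypothetical.

```python
def compare_on_overlap(criteria_by_project, results_by_project):
    """Restrict each project's results to the criteria that all projects share.

    criteria_by_project: project -> set of criteria used to analyze it
    results_by_project:  project -> list of criteria names, one per result
    Requires at least one project.
    """
    overlap = set.intersection(*(set(c) for c in criteria_by_project.values()))
    counts = {
        project: sum(1 for rule in rules if rule in overlap)
        for project, rules in results_by_project.items()
    }
    return overlap, counts


# Hypothetical projects analyzed with partly different rule sets.
criteria = {
    "alpha": {"null-deref", "unused-var", "sql-injection"},
    "beta": {"null-deref", "sql-injection", "dead-code"},
}
results = {
    "alpha": ["null-deref", "unused-var", "unused-var", "sql-injection"],
    "beta": ["dead-code", "null-deref", "null-deref", "null-deref"],
}
overlap, counts = compare_on_overlap(criteria, results)
```

Only results from the two shared rules are counted, so the projects are compared on the same footing even though each was analyzed with a rule the other lacked.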
-
Publication number: 20190205128
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating similarity groupings of software projects. One of the methods includes computing respective values for a plurality of analysis metrics associated with each software development project of a plurality of software development projects, wherein the analysis metrics include snapshot metrics that represent respective properties of the commit history of snapshots in the software development project, functionality metrics that represent respective properties of software elements in the software development project, or both. A similarity grouping is computed for the primary software development project based on the respective computed values for the plurality of analysis metrics for the plurality of software development projects, wherein the similarity grouping for the primary software development project comprises fewer than all of the plurality of software development projects.
Type: Application
Filed: December 21, 2018
Publication date: July 4, 2019
Inventor: Sebastiaan Johannes van Schaik
-
Publication number: 20190205125
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining the provenance of source code. One of the methods includes receiving a portion of a file occurring in a source code project. For each of a plurality of windows of characters in the portion of the file, a respective provenance signature is computed. An index that maps each provenance signature to occurrences of the provenance signature in one or more files of a plurality of projects is searched to identify one or more matching files that are each associated with at least one provenance signature computed for the portion of the file. Data identifying the one or more matching files is provided in response to receiving the portion of the file occurring in the source code project.
Type: Application
Filed: December 21, 2018
Publication date: July 4, 2019
Inventor: Sebastiaan Johannes van Schaik
-
Publication number: 20190205122
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying system-generated code. One of the methods includes generating data representing a state of source code files of the snapshot before performing a build process for the snapshot. An instrumented build process is performed for the snapshot, including intercepting each compiler call of a plurality of compiler calls by the build process for the snapshot, and designating one or more respective source code files of each compiler call as source code files compiled during the build process for the snapshot. One or more source code files that are new or were modified after the build process was initiated are classified as source code files having system-generated source code.
Type: Application
Filed: December 21, 2018
Publication date: July 4, 2019
Inventors: Arthur Baars, Sebastiaan Johannes van Schaik
-
Publication number: 20180373527
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing weights for source code alerts. One of the methods includes generating a respective sample of alerts for each feature of a plurality of features. One or more feature values are computed for alerts having a same respective attribute value for each feature of a plurality of features. An importance distribution that maps each feature value to a respective measure of importance for an alert having the feature value is used to compute a respective feature score for the feature using one or more feature values computed for the alert. A respective weight is computed for each alert by combining the plurality of feature scores computed for the alert.
Type: Application
Filed: April 23, 2018
Publication date: December 27, 2018
Inventors: Sebastiaan Johannes van Schaik, Man Yue Mo, Jean Helie
-
Publication number: 20180293160
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for comparing software projects having been analyzed using different criteria. One of the methods includes receiving, for each of a plurality of software projects, source code evaluation criteria that had been used to analyze source code of the respective software project. An overlapping set of source code evaluation criteria is determined. For each of the software projects, source code analysis results which resulted from the overlapping set of source code evaluation criteria are determined, and a respective value of a characteristic metric for the source code analysis results is computed. The respective values of the characteristic metric for each of the software projects are compared, and for at least one of the software projects, an assessment of the software project is output.
Type: Application
Filed: April 11, 2017
Publication date: October 11, 2018
Inventor: Sebastiaan Johannes van Schaik
-
Patent number: 9753845
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for assigning levels of priority to selected source code functions. One of the methods includes for each selected function, a respective associated first set of functions reachable from the selected function by at most N steps, and a respective associated second set of functions that are each reachable from the selected function by more than N steps and less than M steps are computed. A first partition having all selected functions whose respective associated first set of functions has at least one of the subject functions is computed. A second partition having selected functions not in the first partition and whose respective associated second set of functions has at least one of the subject functions is computed. Selected functions belonging to the first partition are assigned a higher priority than selected functions belonging to the second partition.
Type: Grant
Filed: February 10, 2017
Date of Patent: September 5, 2017
Assignee: Semmle Limited
Inventor: Sebastiaan Johannes van Schaik
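The two partitions described in this abstract can be sketched with a bounded breadth-first search over a call graph. The sketch makes several assumptions not stated in the abstract: "steps" means shortest call-path length, the selected function itself is excluded from both sets, and the call graph is a plain dictionary. The names `distances` and `partition` are hypothetical.

```python
from collections import deque


def distances(call_graph, start, max_steps):
    """Shortest step counts from `start` to functions within `max_steps` calls."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_steps:
            continue  # do not expand beyond the step limit
        for callee in call_graph.get(node, ()):
            if callee not in dist:
                dist[callee] = dist[node] + 1
                queue.append(callee)
    del dist[start]  # assumption: a function is not "reachable" from itself
    return dist


def partition(call_graph, selected, subjects, n, m):
    """Split `selected` into (first, second) partitions as in the abstract."""
    first, second = set(), set()
    for func in selected:
        dist = distances(call_graph, func, m - 1)
        near = {f for f, d in dist.items() if d <= n}      # at most N steps
        far = {f for f, d in dist.items() if n < d < m}    # more than N, less than M
        if near & subjects:
            first.add(func)   # higher priority
        elif far & subjects:
            second.add(func)  # lower priority
    return first, second


# Hypothetical call graph; "vuln" is the single subject function.
graph = {"a": ["b"], "b": ["c"], "c": ["vuln"], "d": ["vuln"], "e": ["f"]}
first, second = partition(graph, {"a", "d", "e"}, {"vuln"}, n=1, m=4)
```

With `n=1` and `m=4`, `d` reaches the subject function in one step and lands in the first partition, `a` reaches it in three steps and lands in the second, and `e` never reaches it and is in neither.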
-
Patent number: 9645817
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing a contextual ranking for a developer. One of the methods includes computing a predicted violation value for each developer in a context group. An actual violation value is computed for each developer in the context group. A score for each developer in the context group is computed, wherein the score represents a distance between the actual violation value for the developer and the predicted violation value for the developer. A contextual ranking is generated of the plurality of developers in the context group based on the score for each developer in the context group.
Type: Grant
Filed: September 27, 2016
Date of Patent: May 9, 2017
Assignee: Semmle Limited
Inventor: Sebastiaan Johannes van Schaik
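A minimal sketch of the scoring and ranking step follows. The abstract does not say how predicted values are obtained or how the distance is defined, so the sketch takes predictions as given and assumes the score is the signed difference `actual - predicted`, with developers who introduce fewer violations than predicted ranked first. The function name and sample values are hypothetical.

```python
def contextual_ranking(predicted, actual):
    """Rank developers by how far actual violations fall below the prediction.

    predicted, actual: developer -> violation value, with identical keys.
    Ties are broken alphabetically so the ordering is deterministic.
    """
    scores = {dev: actual[dev] - predicted[dev] for dev in predicted}
    ranking = sorted(scores, key=lambda dev: (scores[dev], dev))
    return scores, ranking


# Hypothetical context group of three developers.
predicted = {"ann": 10, "bob": 4, "cy": 7}
actual = {"ann": 6, "bob": 5, "cy": 7}
scores, ranking = contextual_ranking(predicted, actual)
```

Ranking on the difference rather than on the raw count means a developer with more violations in absolute terms can still rank above one with fewer, if the context predicted an even higher number.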
-
Patent number: 9639352
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating widened types for computing measures of rework normalized churn. One of the methods includes determining a plurality of commit chains for a software developer. Respective measures of rework churn occurring in each commit chain are calculated. An overall rework factor is computed for the developer using the respective measures of rework churn for each commit chain in the plurality of commit chains for the developer. A measure of rework normalized churn is computed for the developer including adjusting the initial measure of churn by the overall rework factor. Productivity of the developer is quantified relative to one or more other developers using the measure of rework normalized churn for the developer.
Type: Grant
Filed: October 12, 2016
Date of Patent: May 2, 2017
Assignee: Semmle Limited
Inventors: Sebastiaan Johannes van Schaik, Stephen Philip Buckley, Yorck Huenke
-
Patent number: 9569341
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for assigning levels of priority to selected source code functions. One of the methods includes for each selected function, a respective associated first set of functions reachable from the selected function by at most N steps, and a respective associated second set of functions that are each reachable from the selected function by more than N steps and less than M steps are computed. A first partition having all selected functions whose respective associated first set of functions has at least one of the subject functions is computed. A second partition having selected functions not in the first partition and whose respective associated second set of functions has at least one of the subject functions is computed. Selected functions belonging to the first partition are assigned a higher priority than selected functions belonging to the second partition.
Type: Grant
Filed: May 25, 2016
Date of Patent: February 14, 2017
Assignee: Semmle Limited
Inventor: Sebastiaan Johannes van Schaik
-
Patent number: 9411706
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating aggregated dependencies between software elements in a code base. One of the methods includes determining that a cycle exists in the aggregated dependency graph, determining which of the links in the cycle has a lowest weight, and adding a first link in the cycle having the lowest weight to a set of candidate removable links. The links in the set of candidate removable links are classified as candidate removable links, and a user interface presentation is provided that presents the aggregated dependency graph and which visually distinguishes removable links from other links in the aggregated dependency graph.
Type: Grant
Filed: September 30, 2015
Date of Patent: August 9, 2016
Assignee: Semmle Limited
Inventor: Sebastiaan Johannes van Schaik
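The cycle step in this abstract can be sketched as a depth-first search that finds one cycle, picks its lowest-weight link, and records that link as a candidate. Repeating until no cycle remains is an assumption of this sketch, as are the dictionary representation of the weighted graph and the names `find_cycle` and `candidate_removable_links`.

```python
def find_cycle(links):
    """Return the (src, dst) links of one cycle among `links`, or None if acyclic."""
    graph = {}
    for src, dst in links:
        graph.setdefault(src, []).append(dst)
    WHITE, GREY, BLACK = 0, 1, 2
    color, stack = {}, []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:  # back edge: nxt is on the current DFS path
                nodes = stack[stack.index(nxt):] + [nxt]
                return list(zip(nodes, nodes[1:]))
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


def candidate_removable_links(weights):
    """Collect the lowest-weight link of each cycle until the graph is acyclic."""
    remaining = dict(weights)  # (src, dst) -> weight
    candidates = set()
    while True:
        cycle = find_cycle(remaining)
        if cycle is None:
            return candidates
        lowest = min(cycle, key=lambda link: remaining[link])
        candidates.add(lowest)
        del remaining[lowest]


# Hypothetical aggregated dependency graph with one cycle: ui -> core -> db -> ui.
weights = {("ui", "core"): 5, ("core", "db"): 4, ("db", "ui"): 1, ("core", "util"): 2}
candidates = candidate_removable_links(weights)
```

In this example the weakest link in the cycle, `db -> ui`, is the only candidate; a user interface could then draw it differently from the other links.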