Patents by Inventor Colin Bruce Clement

Colin Bruce Clement has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11972232
Abstract: A code completion tool uses a neural transformer model with attention to generate candidate sequences to complete a method body for a method signature. The neural transformer model is trained with source code programs and natural language text. The neural transformer model learns the meaning of a method name and its corresponding method parameters and types from a large unsupervised dataset of source code methods and a supervised dataset of source code constructs paired with natural language docstrings, and uses this to infer a candidate sequence of subtokens that represents a method body for a particular method signature.
    Type: Grant
    Filed: June 10, 2020
    Date of Patent: April 30, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Colin Bruce Clement, Dawn Drain, Neelakantan Sundaresan, Alexey Svyatkovskiy
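
    The abstract describes a sequence-to-sequence style generation loop. As a minimal, illustrative sketch (not the patented system), the toy decoder below completes a method body token by token from a method signature; the vocabulary, the untrained TinyDecoder module, and the complete() helper are assumptions made for the example.

      # Hypothetical sketch: greedy decoding of a method body from a method
      # signature. The model is untrained and uses no causal mask; it only
      # demonstrates the shape of the decoding loop.
      import torch
      import torch.nn as nn

      VOCAB = ["<s>", "</s>", "def", "add", "(", "a", ",", "b", ")", ":", "return", "a+b"]
      TOK = {t: i for i, t in enumerate(VOCAB)}

      class TinyDecoder(nn.Module):
          def __init__(self, vocab_size: int, d_model: int = 32):
              super().__init__()
              self.emb = nn.Embedding(vocab_size, d_model)
              layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
              self.body = nn.TransformerEncoder(layer, num_layers=1)
              self.out = nn.Linear(d_model, vocab_size)

          def forward(self, ids):
              return self.out(self.body(self.emb(ids)))

      def complete(signature, model, max_len=8):
          ids = torch.tensor([[TOK[t] for t in signature]])
          for _ in range(max_len):
              next_id = model(ids)[:, -1, :].argmax(dim=-1, keepdim=True)
              ids = torch.cat([ids, next_id], dim=1)
              if next_id.item() == TOK["</s>"]:
                  break
          return [VOCAB[i] for i in ids[0].tolist()]

      model = TinyDecoder(len(VOCAB))  # a real model is trained on code and docstrings
      print(complete(["def", "add", "(", "a", ",", "b", ")", ":"], model))
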
  • Patent number: 11954429
Abstract: Generally discussed herein are devices, systems, and methods for generating an automatic interactive digital notebook completion model. A method can include receiving notebook content of an interactive digital notebook, the notebook content including a markdown cell followed by a code cell. The method can include generating input/output examples by, for each input/output example, masking either (i) the content of the markdown cell or (ii) the content of the code cell, resulting in a masked cell; identifying the masked cell and the content of the cell that is not masked as the input for the input/output example; and identifying the content of the masked cell as the output for the input/output example. The method can include training, based on the input/output examples, a natural language processing model that generates a prediction of the content of a second masked cell as an output.
    Type: Grant
    Filed: December 8, 2021
    Date of Patent: April 9, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Colin Bruce Clement, Shubham Chandel, Guillermo Serrato Castilla, Neelakantan Sundaresan
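
    As a rough sketch of the training-pair construction the abstract describes, the helper below masks either side of a markdown/code cell pair; the (kind, text) cell encoding and the <MASK> token are assumptions made for the example.

      # Build input/output examples from consecutive (markdown, code) cells
      # by masking one cell and using its content as the training target.
      from typing import List, Tuple

      def make_examples(cells: List[Tuple[str, str]]):
          examples = []
          for (kind_a, text_a), (kind_b, text_b) in zip(cells, cells[1:]):
              if kind_a != "markdown" or kind_b != "code":
                  continue
              # Mask the code cell: input is markdown + <MASK>, output is the code.
              examples.append((f"{text_a}\n<MASK>", text_b))
              # Mask the markdown cell: input is <MASK> + code, output is the markdown.
              examples.append((f"<MASK>\n{text_b}", text_a))
          return examples

      nb = [("markdown", "# Load the data"), ("code", "df = pd.read_csv('data.csv')")]
      for inp, out in make_examples(nb):
          print(repr(inp), "->", repr(out))
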
  • Patent number: 11947935
    Abstract: Custom source code generation models are generated by tuning a pre-trained deep learning model by freezing the model parameters and optimizing a prefix. The tuning process is distributed across a user space and a model space where the embedding and output layers are performed in the user space and the execution of the model is performed in a model space that is isolated from the user space. The tuning process updates the embeddings of the prefix across the separate execution spaces in a manner that preserves the privacy of the data used in the tuning process.
    Type: Grant
    Filed: November 24, 2021
    Date of Patent: April 2, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Colin Bruce Clement, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano, Andrei Zlotchevski
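
    A minimal sketch of the prefix-tuning idea in the abstract, assuming a toy stand-in for the pretrained model: every pretrained weight is frozen and only a short prefix of "virtual token" embeddings receives gradient updates. The user-space/model-space split is not reproduced here.

      import torch
      import torch.nn as nn

      d_model, vocab, prefix_len = 32, 100, 5
      model = nn.Sequential(nn.Linear(d_model, d_model), nn.ReLU(), nn.Linear(d_model, vocab))
      for p in model.parameters():
          p.requires_grad = False                  # pretrained weights stay frozen

      prefix = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)  # only trainable tensor
      emb = nn.Embedding(vocab, d_model)           # user-space embedding layer
      optim = torch.optim.Adam([prefix], lr=1e-3)

      ids = torch.randint(0, vocab, (8,))          # toy token ids
      target = torch.randint(0, vocab, (prefix_len + 8,))
      for _ in range(3):
          x = torch.cat([prefix, emb(ids).detach()], dim=0)  # prepend the prefix
          loss = nn.functional.cross_entropy(model(x), target)
          optim.zero_grad()
          loss.backward()
          optim.step()
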
  • Publication number: 20240104001
Abstract: A debugging tool identifies the smallest subset of an input sequence, or rationale, that influenced a neural language model to generate an output sequence. The debugging tool uses the rationales to understand why the model made its predictions and, in particular, which input tokens had the most impact on the output sequence. In the case of erroneous output, the rationales are used to alter the input sequence to avoid the error or to tailor a new training dataset for retraining the model to improve its performance.
    Type: Application
    Filed: December 15, 2022
    Publication date: March 28, 2024
    Inventors: Colin Bruce Clement, David Alberto Nader Palacio, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano
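
    The search the abstract describes can be approximated with a simple greedy grow-then-shrink pass, sketched below with a stub predict() in place of the neural language model; the stub and tokens are assumptions for the example.

      # Find a small subset of input tokens (a rationale) that is enough
      # to reproduce the model's original output.
      def predict(tokens):
          # Toy stand-in "model": echoes 'buggy' if the word appears.
          return "buggy" if "buggy" in tokens else "ok"

      def find_rationale(tokens):
          full_output = predict(tokens)
          rationale = []
          for tok in tokens:                       # grow until output matches
              if predict(rationale) == full_output:
                  break
              rationale.append(tok)
          for tok in list(rationale):              # shrink redundant tokens
              trial = [t for t in rationale if t != tok]
              if predict(trial) == full_output:
                  rationale = trial
          return rationale

      print(find_rationale(["fix", "the", "buggy", "loop"]))  # -> ['buggy']
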
  • Patent number: 11900261
    Abstract: A transfer learning system is used for the development of neural transformer models pertaining to software engineering tasks. The transfer learning system trains source code domain neural transformer models with attention in various configurations on a large corpus of unsupervised training dataset of source code programs and/or source code-related natural language text. A web service provides the trained models for use in developing a model that may be fine-tuned on a supervised training dataset associated with a software engineering task thereby generating a tool to perform the software engineering task.
    Type: Grant
    Filed: November 6, 2022
    Date of Patent: February 13, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Colin Bruce Clement, Dawn Drain, Neelakantan Sundaresan, Alexey Svyatkovskiy
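
    The transfer-learning pattern the abstract outlines reduces, in the simplest case, to reusing a pretrained encoder and fine-tuning it together with a small task head on supervised data. The tiny modules and random batch below are placeholders, not the patented service.

      import torch
      import torch.nn as nn

      pretrained_encoder = nn.Linear(16, 16)   # stand-in for a code-domain transformer
      task_head = nn.Linear(16, 2)             # e.g. a bug / no-bug classifier

      # Fine-tune the pretrained weights and the new head together.
      optim = torch.optim.Adam(
          list(pretrained_encoder.parameters()) + list(task_head.parameters()), lr=1e-4
      )
      x, y = torch.randn(32, 16), torch.randint(0, 2, (32,))  # toy supervised batch
      for _ in range(3):
          loss = nn.functional.cross_entropy(task_head(pretrained_encoder(x)), y)
          optim.zero_grad()
          loss.backward()
          optim.step()
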
  • Publication number: 20240028740
Abstract: A neural classifier model is used to detect cybersecurity vulnerabilities in source code predicted by a deep learning code generation model that was trained on source code possibly containing security bugs. When the classifier model classifies a given source code snippet as likely containing a cybersecurity vulnerability, a proposed repair is predicted by a neural decoder transformer model that was trained on non-vulnerable source code. The neural decoder transformer model predicts source code that repairs the cybersecurity vulnerability, given the source code classified as vulnerable.
    Type: Application
    Filed: September 21, 2022
    Publication date: January 25, 2024
    Inventors: Aaron Yue-Chiu Chan, Colin Bruce Clement, Yevhen Mohylevskyy, Neelakantan Sundaresan, Roshanak Zilouchian Moghaddam
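
    The two-model pipeline the abstract describes can be sketched as classify-then-repair; the stub functions below stand in for the neural classifier and the neural decoder transformer, and the strcpy example is an assumption for illustration.

      def classify_vulnerable(snippet: str) -> bool:
          # Stand-in for the neural classifier; flags an unsafe C call.
          return "strcpy(" in snippet

      def propose_repair(snippet: str) -> str:
          # Stand-in for the decoder trained on non-vulnerable code.
          return snippet.replace("strcpy(", "strncpy(").replace(")", ", sizeof(dst))", 1)

      generated = "strcpy(dst, src)"           # output of a code generation model
      if classify_vulnerable(generated):
          print("repair candidate:", propose_repair(generated))
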
  • Publication number: 20230359443
Abstract: A neural transformer model with attention is trained to predict candidates to complete a line of source code with a zero-inference capability. The model is trained on an unsupervised training dataset that includes features from source code written in multiple programming languages. The features include a file-level context and a local context, where the file-level context includes a global context, a class context, a function context, and/or a method context for each class, function and/or method of the source code programs used in the training dataset. The local context includes method bodies, function bodies, and/or stand-alone code of main method routines. From these features, the model learns to predict an ordered sequence of code elements that completes a line of source code in programming languages both seen and unseen during training.
    Type: Application
    Filed: May 24, 2023
    Publication date: November 9, 2023
    Inventors: Colin Bruce Clement, Shuai Lu, Neelakantan Sundaresan, Alexey Svyatkovskiy, Duyu Tang
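
    A rough sketch of the feature split the abstract describes, using Python's ast module on a toy file: signatures and imports form the file-level context, and bodies form the local context. The exact feature definitions of the patent are not reproduced.

      # Requires Python 3.9+ for ast.unparse.
      import ast
      import textwrap

      src = textwrap.dedent("""
          import math
          class Circle:
              def area(self, r):
                  return math.pi * r * r
      """)

      file_ctx, local_ctx = [], []
      for node in ast.walk(ast.parse(src)):
          if isinstance(node, ast.Import):
              file_ctx.append(f"import {node.names[0].name}")
          elif isinstance(node, ast.ClassDef):
              file_ctx.append(f"class {node.name}:")
          elif isinstance(node, ast.FunctionDef):
              file_ctx.append(f"def {node.name}(...):")        # signature only
              local_ctx.append(ast.unparse(node.body[0]))      # the method body

      print("file-level context:", file_ctx)
      print("local context:", local_ctx)
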
  • Patent number: 11809302
Abstract: An automated program repair system uses a neural transformer model with attention to predict a bug-free version of a method having a source code bug identified in an associated stack trace. The neural transformer model is pre-trained with English language text and the source code of a target programming language. The pre-trained neural transformer model is then trained to create synthetic bugs in bug-free methods. The bug-free methods with the synthetic bugs are executed with a test case to obtain a stack trace of the source code bug. The method with the synthetic bug, the same method without the bug, and the stack trace are then used to train the neural transformer model to predict repairs for buggy methods.
    Type: Grant
    Filed: February 16, 2023
    Date of Patent: November 7, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Colin Bruce Clement, Dawn Drain, Guillermo Serrato Castilla, Neelakantan Sundaresan
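
    The data construction in the abstract pairs a synthetic bug with the stack trace a test produces. A minimal sketch, with a hand-made mutation in place of the learned bug-creation model:

      import traceback

      fixed = "def mean(xs):\n    return sum(xs) / len(xs)\n"
      buggy = fixed.replace("len(xs)", "len(xs) - len(xs)")  # synthetic divide-by-zero

      ns = {}
      exec(buggy, ns)
      try:
          ns["mean"]([1, 2, 3])                # run the "test case"
      except Exception:
          trace = traceback.format_exc()       # the captured stack trace

      example = {"buggy": buggy, "stack_trace": trace, "target": fixed}
      print(example["stack_trace"].splitlines()[-1])  # ZeroDivisionError: ...
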
  • Publication number: 20230342287
    Abstract: A test-driven development system utilizes a neural transformer model with attention to generate method bodies for a focal method given its associated test cases, and optionally a method signature and a docstring of the focal method. The candidate method bodies are validated for syntactic correctness, tested using the given test cases, and tested with a donor class in a target system. Those candidate method bodies passing the validation and testing are then ranked based on a PLUM score that analyzes the candidate method bodies against various quality and performance metrics.
    Type: Application
    Filed: June 19, 2023
    Publication date: October 26, 2023
    Inventors: Colin Bruce Clement, Shao Kun Deng, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano
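
    The generate / validate / test / rank loop in the abstract can be sketched as below; the candidate list, the test case, and the plum_score() heuristic are toy assumptions, and the real PLUM scoring is not reproduced.

      candidates = [
          "def double(x):\n    return x + x\n",
          "def double(x):\n    return x * 3\n",   # wrong: fails the test
          "def double(x)\n    return 2 * x\n",    # syntactically invalid
      ]

      def passes_test(src: str) -> bool:
          ns = {}
          exec(src, ns)
          return ns["double"](4) == 8             # the given test case

      def plum_score(src: str) -> float:
          return -len(src)                        # placeholder quality metric

      valid = []
      for src in candidates:
          try:
              compile(src, "<candidate>", "exec") # syntactic validation
          except SyntaxError:
              continue
          if passes_test(src):
              valid.append(src)

      print(max(valid, key=plum_score))
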
  • Patent number: 11797426
    Abstract: A test-driven development system utilizes a neural transformer model with attention to generate method bodies for a focal method given its associated test cases, and optionally a method signature and a docstring of the focal method. The candidate method bodies are validated for syntactic correctness, tested using the given test cases, and tested with a donor class in a target system. Those candidate method bodies passing the validation and testing are then ranked based on a PLUM score that analyzes the candidate method bodies against various quality and performance metrics.
    Type: Grant
    Filed: October 22, 2021
    Date of Patent: October 24, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Colin Bruce Clement, Shao Kun Deng, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano
  • Publication number: 20230281318
    Abstract: A constrained decoding technique incorporates token constraints into a beam search at each time step of a decoding process in order to generate viable candidate sequences that are syntactically and semantically correct. The token constraints identify source code tokens or sequences of tokens that should appear in a candidate sequence. The token constraints are generated from checking whether a token predicted at each decoding step is feasible for a partial solution based on the production rules of the grammar of the programming language, the syntactic correctness of a partial sequence, and/or static type correctness.
    Type: Application
    Filed: March 7, 2022
    Publication date: September 7, 2023
    Inventors: Colin Bruce Clement, Shao Kun Deng, Xiaoyu Liu, Neelakantan Sundaresan, Alexey Svyatkovskiy
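
    As an illustration of constrained decoding, the toy beam search below prunes any token that would make the partial sequence infeasible, here using balanced parentheses as the stand-in constraint and a fixed score table in place of model log-probabilities.

      import heapq

      SCORES = {"(": -0.1, ")": -0.2, "x": -0.3}  # fake per-token log-probs

      def feasible(seq):
          depth = 0
          for t in seq:
              depth += {"(": 1, ")": -1}.get(t, 0)
              if depth < 0:
                  return False                    # ')' with nothing open
          return True

      def beam_search(width=2, length=4):
          beams = [(0.0, [])]
          for _ in range(length):
              extended = []
              for score, seq in beams:
                  for tok, lp in SCORES.items():
                      cand = seq + [tok]
                      if feasible(cand):          # apply the token constraint
                          extended.append((score + lp, cand))
              beams = heapq.nlargest(width, extended, key=lambda b: b[0])
          return beams

      for score, seq in beam_search():
          print(round(score, 2), "".join(seq))
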
  • Publication number: 20230281317
    Abstract: A false positive vulnerability system detects whether a software vulnerability identified by a static code vulnerability analyzer is a true vulnerability or a false positive. The system utilizes deep learning models to predict whether an identified vulnerability is accurate given the source code context of the identified vulnerability. A neural encoder transformer model is trained to classify a false positive given the method body including the identified vulnerability. A neural decoder transformer model is trained to predict a candidate line-of-code to complete a prompt inserted into the context of the identified vulnerability. The candidate line-of-code that successfully completes the prompt is used as a signal to identify that the identified vulnerability is a false positive.
    Type: Application
    Filed: March 4, 2022
    Publication date: September 7, 2023
    Inventors: Colin Bruce Clement, Matthew Glenn Jin, Anant Girish Kharkar, Xiaoyu Liu, Xin Shi, Neelakantan Sundaresan, Roshanak Zilouchian Moghaddam
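
    The second signal in the abstract, completion at the flagged location, might look like the sketch below; complete_line() is a stub for the neural decoder transformer, and the flagged method and marker are assumptions for the example.

      def complete_line(prompt: str) -> str:
          # Stand-in for the decoder; a real model predicts the next line.
          return "if buf is None: return None"

      flagged = "def read(buf):\n    <FLAGGED: possible None dereference>\n    return buf[0]\n"
      prompt = flagged.split("<FLAGGED")[0]       # code up to the flagged location
      completion = complete_line(prompt)

      # If the model's natural completion already guards the flagged issue,
      # treat that as a signal that the warning is a false positive.
      print("false positive signal:", "is None" in completion)
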
  • Publication number: 20230251831
    Abstract: The syntax elements of a source code program used to represent the context of a focal method are selected based on a priority order. The selected syntax elements are input into a fixed-size context window that is used to train a neural transformer with attention model to learn to generate source code and used by the neural transformer model to generate source code. The context window contains prioritized sequences of tokens that extend beyond the target focus in order to provide a longer visibility back into the source code program for the model to learn predictive patterns. This gives the model a file-level context of the source code program without increasing the size of the context window.
    Type: Application
    Filed: April 17, 2023
    Publication date: August 10, 2023
    Inventors: Colin Bruce Clement, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano
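
    The priority-ordered packing the abstract describes amounts to filling a fixed token budget from a ranked list of syntax elements; the priorities and token counts below are assumptions made for the example.

      elements = [  # (priority, element, token cost) -- lower = higher priority
          (0, "focal method body", 60),
          (1, "class signature and fields", 20),
          (2, "sibling method signatures", 30),
          (3, "file imports", 10),
          (4, "unrelated method bodies", 80),
      ]

      BUDGET = 120                       # fixed-size context window
      window, used = [], 0
      for _, name, cost in sorted(elements):
          if used + cost <= BUDGET:
              window.append(name)
              used += cost

      print(window, f"({used}/{BUDGET} tokens)")
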
  • Publication number: 20230222334
Abstract: A deep learning model is quantized during its training to perform a target software engineering task. During training, a portion of the full-precision floating-point weights is quantized into INT4 or INT8 data types through scalar quantization or product quantization to make the model more resilient to quantization and to reduce the noise between the quantized and full-precision model outputs. In scalar quantization, each sub-block consists of a single weight that is mapped to a codeword of a codebook. In product quantization, an identity matrix and a codebook of centroids are used to map a quantized weight back to its original value.
    Type: Application
    Filed: January 10, 2022
    Publication date: July 13, 2023
    Inventors: Colin Bruce Clement, Shao Kun Deng, Neelakantan Sundaresan, Alexey Svyatkovskiy
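
    A minimal sketch of quantization-aware training in the INT8 scalar case: weights are rounded to 8-bit codes and dequantized in the forward pass so the model learns to tolerate the rounding noise. The codebook construction of the patented method is not reproduced.

      import torch

      def fake_quant_int8(w: torch.Tensor) -> torch.Tensor:
          scale = w.abs().max() / 127.0
          q = torch.clamp((w / scale).round(), -128, 127)  # INT8 codes
          return q * scale                                 # dequantized weights

      w = torch.randn(4, 4)
      w_q = fake_quant_int8(w)
      print("max quantization error:", (w - w_q).abs().max().item())
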
  • Patent number: 11693630
Abstract: A neural transformer model with attention is trained to predict candidates to complete a line of source code with a zero-inference capability. The model is trained on an unsupervised training dataset that includes features from source code written in multiple programming languages. The features include a file-level context and a local context, where the file-level context includes a global context, a class context, a function context, and/or a method context for each class, function and/or method of the source code programs used in the training dataset. The local context includes method bodies, function bodies, and/or stand-alone code of main method routines. From these features, the model learns to predict an ordered sequence of code elements that completes a line of source code in programming languages both seen and unseen during training.
    Type: Grant
    Filed: November 1, 2022
    Date of Patent: July 4, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Colin Bruce Clement, Shuai Lu, Neelakantan Sundaresan, Alexey Svyatkovskiy, Duyu Tang
  • Publication number: 20230195600
Abstract: An automated program repair system uses a neural transformer model with attention to predict a bug-free version of a method having a source code bug identified in an associated stack trace. The neural transformer model is pre-trained with English language text and the source code of a target programming language. The pre-trained neural transformer model is then trained to create synthetic bugs in bug-free methods. The bug-free methods with the synthetic bugs are executed with a test case to obtain a stack trace of the source code bug. The method with the synthetic bug, the same method without the bug, and the stack trace are then used to train the neural transformer model to predict repairs for buggy methods.
    Type: Application
    Filed: February 16, 2023
    Publication date: June 22, 2023
    Inventors: Colin Bruce Clement, Dawn Drain, Guillermo Serrato Castilla, Neelakantan Sundaresan
  • Publication number: 20230177261
Abstract: Generally discussed herein are devices, systems, and methods for generating an automatic interactive digital notebook completion model. A method can include receiving notebook content of an interactive digital notebook, the notebook content including a markdown cell followed by a code cell. The method can include generating input/output examples by, for each input/output example, masking either (i) the content of the markdown cell or (ii) the content of the code cell, resulting in a masked cell; identifying the masked cell and the content of the cell that is not masked as the input for the input/output example; and identifying the content of the masked cell as the output for the input/output example. The method can include training, based on the input/output examples, a natural language processing model that generates a prediction of the content of a second masked cell as an output.
    Type: Application
    Filed: December 8, 2021
    Publication date: June 8, 2023
    Inventors: Colin Bruce Clement, Shubham Chandel, Guillermo Serrato Castilla, Neelakantan Sundaresan
  • Patent number: 11656851
    Abstract: The syntax elements of a source code program used to represent the context of a focal method are selected based on a priority order. The selected syntax elements are input into a fixed-size context window that is used to train a neural transformer with attention model to learn to generate source code and used by the neural transformer model to generate source code. The context window contains prioritized sequences of tokens that extend beyond the target focus in order to provide a longer visibility back into the source code program for the model to learn predictive patterns. This gives the model a file-level context of the source code program without increasing the size of the context window.
    Type: Grant
    Filed: October 22, 2021
    Date of Patent: May 23, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Colin Bruce Clement, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano
  • Publication number: 20230128008
    Abstract: A test-driven development system utilizes a neural transformer model with attention to generate method bodies for a focal method given its associated test cases, and optionally a method signature and a docstring of the focal method. The candidate method bodies are validated for syntactic correctness, tested using the given test cases, and tested with a donor class in a target system. Those candidate method bodies passing the validation and testing are then ranked based on a PLUM score that analyzes the candidate method bodies against various quality and performance metrics.
    Type: Application
    Filed: October 22, 2021
    Publication date: April 27, 2023
    Inventors: Colin Bruce Clement, Shao Kun Deng, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano
  • Publication number: 20230128200
    Abstract: The syntax elements of a source code program used to represent the context of a focal method are selected based on a priority order. The selected syntax elements are input into a fixed-size context window that is used to train a neural transformer with attention model to learn to generate source code and used by the neural transformer model to generate source code. The context window contains prioritized sequences of tokens that extend beyond the target focus in order to provide a longer visibility back into the source code program for the model to learn predictive patterns. This gives the model a file-level context of the source code program without increasing the size of the context window.
    Type: Application
    Filed: October 22, 2021
    Publication date: April 27, 2023
    Inventors: Colin Bruce Clement, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano