Patents by Inventor ALEXEY SVYATKOVSKIY

ALEXEY SVYATKOVSKIY has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SOURCE CODE PATCH GENERATION WITH RETRIEVAL-AUGMENTED TRANSFORMER

Publication number: 20240134614

Abstract: A source code patch generation system uses the context of a buggy source code snippet of a source code program and a hint to predict a source code segment that repairs the buggy source code snippet. The hint is a source code segment that is semantically-similar to the buggy source code snippet where the similarity is based on a context of the buggy source code snippet. An autoregressive deep learning model uses the context of the buggy source code snippet and the hint to predict the most likely source code segment to repair the buggy source code snippet.

Type: Application

Filed: October 14, 2022

Publication date: April 25, 2024

Inventors: AMANDEEP SINGH BAKSHI, XIN SHI, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY

Custom models for source code generation via prefix-tuning

Patent number: 11947935

Abstract: Custom source code generation models are generated by tuning a pre-trained deep learning model by freezing the model parameters and optimizing a prefix. The tuning process is distributed across a user space and a model space where the embedding and output layers are performed in the user space and the execution of the model is performed in a model space that is isolated from the user space. The tuning process updates the embeddings of the prefix across the separate execution spaces in a manner that preserves the privacy of the data used in the tuning process.

Type: Grant

Filed: November 24, 2021

Date of Patent: April 2, 2024

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.

Inventors: Colin Bruce Clement, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano, Andrei Zlotchevski

DEBUGGING TOOL FOR CODE GENERATION NEURAL LANGUAGE MODELS

Publication number: 20240104001

Abstract: A debugging tool identifies the smallest subset of an input sequence or rationales that influenced a neural language model to generate an output sequence. The debugging tool uses the rationales to understand why the model made its predictions and in particular, the particular input tokens that had the most impact on the output sequence. In the case of erroneous output, the rationales are used to alter the input sequence to avoid the error or to tailor a new training dataset to retrain the model to improve its performance.

Type: Application

Filed: December 15, 2022

Publication date: March 28, 2024

Inventors: COLIN BRUCE CLEMENT, DAVID ALBERTO NADER PALACIO, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, MICHELE TUFANO

Code generation through reinforcement learning using code-quality rewards

Patent number: 11941373

Abstract: A deep learning model trained to learn to predict source code is tuned for a target source code generation task through reinforcement learning using a reward score that considers the quality of the source code predicted during the tuning process. The reward score is adjusted to consider code-quality factors and source code metrics. The code-quality factors account for the predicted source code having syntactic correctness, successful compilation, successful execution, successful invocation, readability, functional correctness, and coverage. The source code metrics generate a score based on how close the predicted source code is to a ground truth code.

Type: Grant

Filed: December 17, 2021

Date of Patent: March 26, 2024

Assignee: Microsoft Technology Licensing, LLC.

Inventors: Shao Kun Deng, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano

AUTOMATIC GENERATION OF ASSERT STATEMENTS FOR UNIT TEST CASES

Publication number: 20240070053

Abstract: An assert statement generator employs a neural transformer model with attention to generate candidate assert statements for a unit test method that tests a focal method. The neural transformer model is pre-trained with source code programs and natural language text and fine-tuned with test-assert triplets. A test-assert triplet includes a source code snippet that includes: (1) a unit test method with an assert placeholder; (2) the focal method; and (3) a corresponding assert statement. In this manner, the neural transformer model is trained to learn the semantics and statistical properties of a natural language, the syntax of a programming language, and the relationships between the code elements of the programming language and the syntax of an assert statement.

Type: Application

Filed: October 23, 2023

Publication date: February 29, 2024

Inventors: DAWN DRAIN, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, MICHELE TUFANO

CODE GENERATION WITH REINFORCEMENT LEARNING

Publication number: 20240061655

Abstract: A code generation system uses a non-terminal expansion model and a non-terminal selector model to generate a code sketch to complete a partially-formed source code snippet. The non-terminal expansion model is a neural transformer model trained on a supervised dataset through reinforcement learning to learn to predict the production rule to expand for a given non-terminal symbol. The non-terminal selector model is trained through reinforcement learning to predict the non-terminal symbol to expand given a partial-code state. The models are used in a two-step beam search to generate the top candidate code sketches, where a candidate code sketch may contain a hole that represents an unexpanded non-terminal symbol.

Type: Application

Filed: November 3, 2023

Publication date: February 22, 2024

Inventors: MILTIADIS ALLAMANIS, DAYA GUO, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY

Transfer learning system for automated software engineering tasks

Patent number: 11900261

Abstract: A transfer learning system is used for the development of neural transformer models pertaining to software engineering tasks. The transfer learning system trains source code domain neural transformer models with attention in various configurations on a large corpus of unsupervised training dataset of source code programs and/or source code-related natural language text. A web service provides the trained models for use in developing a model that may be fine-tuned on a supervised training dataset associated with a software engineering task thereby generating a tool to perform the software engineering task.

Type: Grant

Filed: November 6, 2022

Date of Patent: February 13, 2024

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.

Inventors: Colin Bruce Clement, Dawn Drain, Neelakantan Sundaresan, Alexey Svyatkovskiy

Unit test case generation with transformers

Patent number: 11893363

Abstract: A unit test generation system employs a neural transformer model with attention to generate candidate unit test sequences given a focal method of a programming language. The neural transformer model is pre-trained with source code programs and natural language text and fine-tuned with mapped test case pairs. A mapped test case pair includes a focal method and a unit test case for the focal method. In this manner, the neural transformer model is trained to learn the semantics and statistical properties of a natural language, the syntax of a programming language and the relationships between the code elements of the programming language and the syntax of a unit test case.

Type: Grant

Filed: October 27, 2020

Date of Patent: February 6, 2024

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.

Inventors: Dawn Drain, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano

MULTI-LINGUAL LINE-OF-CODE COMPLETION SYSTEM

Publication number: 20240028306

Abstract: A code completion tool uses a neural transformer model to generate candidate sequences to complete a line of source code. The neural transformer model is trained using a conditional language modeling objective on a large unsupervised dataset that includes source code programs written in several different programming languages. The neural transformer model is used within a beam search that predicts the most likely candidate sequences for a code snippet under development.

Type: Application

Filed: August 9, 2023

Publication date: January 25, 2024

Inventors: ALEXEY SVYATKOVSKIY, SHENGYU FU, NEELAKANTAN SUNDARESAN, SHAO KUN DENG

SYNTAX SUBTREE CODE STRENGTHENING

Publication number: 20240004623

Abstract: During software development, embodiments find various kinds of weak spots in source code and automatically suggest fixes to strengthen the code, without requiring developers to expressly select weakness finder mechanisms or fixer mechanisms by navigating a development tool's menu system. Weakness finders may analyze code using items such as hole detection, diagnostic errors, test results, changed code matches, prospective code discrepancies, generated code confidence scores, generated suggestion competition, and artificial intelligence. Weak spots and their context are submitted to weak spot fixers, which may generate fix suggestions using functionalities such as code synthesis, refactoring, autocompletion, retesting, and artificial intelligence. Fix candidate sets may be evaluated for consistency, diagnostic errors, and discrepancies. Snippets may be dynamically filled for presentation to a user.

Type: Application

Filed: July 1, 2022

Publication date: January 4, 2024

Inventors: Peter GROENEWEGEN, Jui HANAMSHET, German David OBANDO CHACON, Mark Alistair WILSON-THOMAS, Alexey SVYATKOVSKIY, David Ellis PUGH

CODE INSERTION COMPLETION

Publication number: 20230409299

Abstract: A code insertion engine predicts one or more statements of a programming language to be inserted at an insertion point in between existing source code statements of a source code program being edited. The code insertion engine extracts the surrounding context of the insertion point which includes the source code immediately preceding and the source code immediately following the insertion point. The code insertion engine uses a neural expansion model and a neural selector model to predict the one or more statements most likely to be inserted into the insertion point that are syntactically and semantically consistent with the surrounding context of the existing program.

Type: Application

Filed: June 16, 2022

Publication date: December 21, 2023

Inventors: NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY

Code generation with reinforcement learning

Patent number: 11836467

Abstract: A code generation system uses a non-terminal expansion model and a non-terminal selector model to generate a code sketch to complete a partially-formed source code snippet. The non-terminal expansion model is a neural transformer model trained on a supervised dataset through reinforcement learning to learn to predict the production rule to expand for a given non-terminal symbol. The non-terminal selector model is trained through reinforcement learning to predict the non-terminal symbol to expand given a partial-code state. The models are used in a two-step beam search to generate the top candidate code sketches, where a candidate code sketch may contain a hole that represents an unexpanded non-terminal symbol.

Type: Grant

Filed: August 16, 2021

Date of Patent: December 5, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.

Inventors: Miltiadis Allamanis, Daya Guo, Neelakantan Sundaresan, Alexey Svyatkovskiy

Automatic generation of assert statements for unit test cases

Patent number: 11829282

Abstract: An assert statement generator employs a neural transformer model with attention to generate candidate assert statements for a unit test method that tests a focal method. The neural transformer model is pre-trained with source code programs and natural language text and fine-tuned with test-assert triplets. A test-assert triplet includes a source code snippet that includes: (1) a unit test method with an assert placeholder; (2) the focal method; and (3) a corresponding assert statement. In this manner, the neural transformer model is trained to learn the semantics and statistical properties of a natural language, the syntax of a programming language, and the relationships between the code elements of the programming language and the syntax of an assert statement.

Type: Grant

Filed: October 27, 2020

Date of Patent: November 28, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.

Inventors: Dawn Drain, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano

RETRIEVAL-AUGMENTED CODE COMPLETION

Publication number: 20230359441

Abstract: A retrieval-augmented code completion system uses the context of a partially-formed source code snippet of a source code program and a hint to predict the source code tokens needed to complete the partially-formed source code snippet. The hint is a source code segment that completes a semantically-similar source code segment of the partially-formed source code snippet. The hint is found in a retrieval source code database using a hybrid retrieval technique. A deep learning decoder model uses the context of the partially-formed source code snippet and the hint to predict the most likely candidate sequence of source code tokens to complete the partially-formed source code snippet.

Type: Application

Filed: May 9, 2022

Publication date: November 9, 2023

Inventors: NAN DUAN, SHUAI LU, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY

MULTI-LINGUAL CODE GENERATION WITH ZERO-SHOT INFERENCE

Publication number: 20230359443

Abstract: A neural transformer model with attention is trained to predict candidates to complete a line of source code with a zero-shot inference capability. The model is trained on an unsupervised training dataset that includes features from source code written in multiple programming languages. The features include a file-level context and a local context, where the file-level context includes a global context, a class context, a function context, and/or a method context for each class, function and/or method of the source code programs used in the training dataset. The local context includes method bodies, function bodies, and/or stand-alone code of main method routines. From these features, the model is able to learn to predict an ordered sequence of code elements that complete a line of source code in a programming language seen and not seen during training.

Type: Application

Filed: May 24, 2023

Publication date: November 9, 2023

Inventors: COLIN BRUCE CLEMENT, SHUAI LU, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, DUYU TANG

Multi-lingual line-of-code completion system

Patent number: 11809842

Abstract: A code completion tool uses a neural transformer model to generate candidate sequences to complete a line of source code. The neural transformer model is trained using a conditional language modeling objective on a large unsupervised dataset that includes source code programs written in several different programming languages. The neural transformer model is used within a beam search that predicts the most likely candidate sequences for a code snippet under development.

Type: Grant

Filed: January 20, 2022

Date of Patent: November 7, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.

Inventors: Alexey Svyatkovskiy, Shengyu Fu, Neelakantan Sundaresan, Shao Kun Deng

AUTOMATING TEST-DRIVEN DEVELOPMENT WITH TRANSFORMERS

Publication number: 20230342287

Abstract: A test-driven development system utilizes a neural transformer model with attention to generate method bodies for a focal method given its associated test cases, and optionally a method signature and a docstring of the focal method. The candidate method bodies are validated for syntactic correctness, tested using the given test cases, and tested with a donor class in a target system. Those candidate method bodies passing the validation and testing are then ranked based on a PLUM score that analyzes the candidate method bodies against various quality and performance metrics.

Type: Application

Filed: June 19, 2023

Publication date: October 26, 2023

Inventors: COLIN BRUCE CLEMENT, SHAO KUN DENG, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, MICHELE TUFANO

TREE-BASED MERGE CONFLICT RESOLUTION WITH MULTI-TASK NEURAL TRANSFORMER

Publication number: 20230342123

Abstract: An automated system for resolving program merges uses a multi-task neural transformer with attention. Each component of a merge conflict tuple (A, B, O) is represented as an AST and transformed into aligned AST-node sequences and aligned editing sequences. The multi-task neural transformer model predicts the tree editing steps needed to resolve the merge conflict and applies them to the AST representation of the code base. The tree editing steps include the edit actions that need to be applied to the AST of the code base and the edit labels that are inserted or updated with the edit actions.

Type: Application

Filed: June 14, 2023

Publication date: October 26, 2023

Inventors: NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, NEGAR GHORBANI

Automating test-driven development with transformers

Patent number: 11797426

Abstract: A test-driven development system utilizes a neural transformer model with attention to generate method bodies for a focal method given its associated test cases, and optionally a method signature and a docstring of the focal method. The candidate method bodies are validated for syntactic correctness, tested using the given test cases, and tested with a donor class in a target system. Those candidate method bodies passing the validation and testing are then ranked based on a PLUM score that analyzes the candidate method bodies against various quality and performance metrics.

Type: Grant

Filed: October 22, 2021

Date of Patent: October 24, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC.

Inventors: Colin Bruce Clement, Shao Kun Deng, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano

CODE ADAPTATION THROUGH DEEP LEARNING

Publication number: 20230305824

Abstract: A code adaptation mechanism automatically integrates the variable names of a pasted source code snippet into variable names defined in a pre-existing partial source code program. The variable names from the pasted source code snippet are replaced with anonymized values. A deep learning model predicts the most likely variable name from the pre-existing partial source code program to replace each anonymized value. The deep learning model is trained on numerous variable usage patterns from various source code programs to learn to predict the most likely mapping of an undefined variable name from the pasted source code snippet to a variable name in the pre-existing partial source code program thereby generating a syntactically and semantically correct program.

Type: Application

Filed: March 24, 2022

Publication date: September 28, 2023

Inventors: MILTIADIS ALLAMANIS, SHENGYU FU, XIAOYU LIU, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY
