Patents by Inventor ALEXEY SVYATKOVSKIY

ALEXEY SVYATKOVSKIY has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230342287
    Abstract: A test-driven development system utilizes a neural transformer model with attention to generate method bodies for a focal method given its associated test cases, and optionally a method signature and a docstring of the focal method. The candidate method bodies are validated for syntactic correctness, tested using the given test cases, and tested with a donor class in a target system. Those candidate method bodies passing the validation and testing are then ranked based on a PLUM score that analyzes the candidate method bodies against various quality and performance metrics.
    Type: Application
    Filed: June 19, 2023
    Publication date: October 26, 2023
    Inventors: COLIN BRUCE CLEMENT, SHAO KUN DENG, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, MICHELE TUFANO
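    The generate-validate-test-rank pipeline described in the abstract above can be pictured with a minimal Python sketch; the syntax check, test harness, and PLUM-style score below are simplified stand-ins with invented names, not the patented implementation:

        import ast

        def syntactically_valid(code: str) -> bool:
            """Stand-in validation step: does the candidate method body parse?"""
            try:
                ast.parse(code)
                return True
            except SyntaxError:
                return False

        def passes_tests(code: str, test_cases) -> bool:
            """Hypothetical harness: run the given test cases against the candidate."""
            namespace = {}
            try:
                exec(code, namespace)
                return all(t(namespace) for t in test_cases)
            except Exception:
                return False

        def plum_like_score(code: str) -> float:
            """Toy stand-in for the PLUM score: shorter passing bodies rank higher."""
            return 1.0 / (1 + len(code.splitlines()))

        def rank_candidates(candidates, test_cases):
            viable = [c for c in candidates
                      if syntactically_valid(c) and passes_tests(c, test_cases)]
            return sorted(viable, key=plum_like_score, reverse=True)

        candidates = [
            "def add(a, b):\n    return a + b",
            "def add(a, b):\n    return a - b",   # fails the tests
            "def add(a, b) return a + b",         # fails validation
        ]
        tests = [lambda ns: ns["add"](2, 3) == 5]
        print(rank_candidates(candidates, tests)[0])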
  • Publication number: 20230342123
    Abstract: An automated system for resolving program merges uses a multi-task neural transformer with attention. Each component of a merge conflict tuple (A, B, O) is represented as an AST and transformed into aligned AST-node sequences and aligned editing sequences. The multi-task neural transformer model predicts the tree editing steps needed to resolve the merge conflict and applies them to the AST representation of the code base. The tree editing steps include the edit actions that need to be applied to the AST of the code base and the edit labels that are inserted or updated with the edit actions.
    Type: Application
    Filed: June 14, 2023
    Publication date: October 26, 2023
    Inventors: NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, NEGAR GHORBANI
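    A toy illustration of applying predicted tree-edit steps to a base AST; the Node type, the (action, path, label) encoding, and the hand-written edit sequence below are hypothetical stand-ins for the model's output:

        from dataclasses import dataclass, field

        @dataclass
        class Node:
            label: str
            children: list = field(default_factory=list)

        def apply_edits(root: Node, edits) -> Node:
            """Apply (action, path, label) steps: UPDATE rewrites a node's label,
            INSERT adds a new child at the given position."""
            for action, path, label in edits:
                node = root
                for idx in path[:-1]:
                    node = node.children[idx]
                if action == "UPDATE":
                    node.children[path[-1]].label = label
                elif action == "INSERT":
                    node.children.insert(path[-1], Node(label))
            return root

        # Base version O of "return x + 1"; A changed the constant, B renamed x.
        base = Node("return", [Node("x"), Node("1")])
        predicted = [("UPDATE", [1], "2"), ("UPDATE", [0], "y")]  # resolution keeps both edits
        print(apply_edits(base, predicted))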
  • Patent number: 11797426
    Abstract: A test-driven development system utilizes a neural transformer model with attention to generate method bodies for a focal method given its associated test cases, and optionally a method signature and a docstring of the focal method. The candidate method bodies are validated for syntactic correctness, tested using the given test cases, and tested with a donor class in a target system. Those candidate method bodies passing the validation and testing are then ranked based on a PLUM score that analyzes the candidate method bodies against various quality and performance metrics.
    Type: Grant
    Filed: October 22, 2021
    Date of Patent: October 24, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Colin Bruce Clement, Shao Kun Deng, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano
  • Publication number: 20230305824
    Abstract: A code adaptation mechanism automatically integrates the variable names of a pasted source code snippet into variable names defined in a pre-existing partial source code program. The variable names from the pasted source code snippet are replaced with anonymized values. A deep learning model predicts the most likely variable name from the pre-existing partial source code program to replace each anonymized value. The deep learning model is trained on numerous variable usage patterns from various source code programs to learn to predict the most likely mapping of an undefined variable name from the pasted source code snippet to a variable name in the pre-existing partial source code program, thereby generating a syntactically and semantically correct program.
    Type: Application
    Filed: March 24, 2022
    Publication date: September 28, 2023
    Inventors: MILTIADIS ALLAMANIS, SHENGYU FU, XIAOYU LIU, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY
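    A rough sketch of the anonymize-then-rename flow, using frequency over the surrounding file as a toy stand-in for the deep learning model (all helper names invented):

        import ast, re
        from collections import Counter

        def anonymize(snippet: str) -> str:
            """Replace each variable in the pasted snippet with a VAR_i placeholder."""
            names = sorted({n.id for n in ast.walk(ast.parse(snippet))
                            if isinstance(n, ast.Name)})
            out = snippet
            for i, name in enumerate(names):
                out = re.sub(rf"\b{name}\b", f"VAR_{i}", out)
            return out

        def predict_names(anonymized: str, context: str):
            """Toy 'model': map placeholders to the most frequent context variables."""
            counts = Counter(n.id for n in ast.walk(ast.parse(context))
                             if isinstance(n, ast.Name))
            ranked = [name for name, _ in counts.most_common()]
            holes = sorted(set(re.findall(r"VAR_\d+", anonymized)))
            return dict(zip(holes, ranked))

        context = "total = 0\nfor item in items:\n    total = total + item\n"
        pasted = "acc = acc + val"
        anon = anonymize(pasted)                      # "VAR_0 = VAR_0 + VAR_1"
        for hole, name in predict_names(anon, context).items():
            anon = re.sub(rf"\b{hole}\b", name, anon)
        print(anon)                                   # "total = total + item"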
  • Publication number: 20230281318
    Abstract: A constrained decoding technique incorporates token constraints into a beam search at each time step of a decoding process in order to generate viable candidate sequences that are syntactically and semantically correct. The token constraints identify source code tokens or sequences of tokens that should appear in a candidate sequence. The token constraints are generated from checking whether a token predicted at each decoding step is feasible for a partial solution based on the production rules of the grammar of the programming language, the syntactic correctness of a partial sequence, and/or static type correctness.
    Type: Application
    Filed: March 7, 2022
    Publication date: September 7, 2023
    Inventors: COLIN BRUCE CLEMENT, SHAO KUN DENG, XIAOYU LIU, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY
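    A minimal sketch of constrained decoding: a beam search that prunes infeasible tokens before ranking, here with a toy balanced-parentheses constraint standing in for the real grammar, syntax, and type checks:

        import heapq

        VOCAB = ["(", ")", "x", "+", "1"]

        def is_feasible(prefix, token):
            """Token constraint: never close more parentheses than are open."""
            depth = prefix.count("(") - prefix.count(")")
            return depth > 0 if token == ")" else True

        def score(token):
            """Stand-in for the model's log-probability of each token."""
            return {"x": -0.1, "1": -0.2, ")": -0.3, "+": -0.5, "(": -0.7}[token]

        def constrained_beam_search(steps=4, width=3):
            beams = [(0.0, [])]
            for _ in range(steps):
                expanded = [(lp + score(t), seq + [t])
                            for lp, seq in beams
                            for t in VOCAB
                            if is_feasible(seq, t)]   # prune before ranking
                beams = heapq.nlargest(width, expanded, key=lambda b: b[0])
            return beams

        for lp, seq in constrained_beam_search():
            print(f"{lp:6.2f}  {' '.join(seq)}")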
  • Publication number: 20230251831
    Abstract: The syntax elements of a source code program used to represent the context of a focal method are selected based on a priority order. The selected syntax elements are input into a fixed-size context window that is used to train a neural transformer model with attention to learn to generate source code and used by the neural transformer model to generate source code. The context window contains prioritized sequences of tokens that extend beyond the target focus in order to provide longer visibility back into the source code program for the model to learn predictive patterns. This gives the model a file-level context of the source code program without increasing the size of the context window.
    Type: Application
    Filed: April 17, 2023
    Publication date: August 10, 2023
    Inventors: COLIN BRUCE CLEMENT, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, MICHELE TUFANO
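    One way to picture the priority-ordered packing, as a sketch with an invented priority scheme and whitespace tokenization in place of the model's own tokenizer:

        def pack_context(elements, budget):
            """Fill a fixed-size window by priority (lower number = higher priority).
            Token counts are approximated by whitespace splitting."""
            window, used = [], 0
            for _, text in sorted(elements, key=lambda e: e[0]):
                cost = len(text.split())
                if used + cost <= budget:
                    window.append(text)
                    used += cost
            return "\n".join(window)

        elements = [
            (0, "def focal(a, b):"),             # target focus: highest priority
            (1, "class Calculator:"),            # enclosing class context
            (2, "import math"),                  # file-level context
            (3, "# distant, unrelated helper"),  # dropped once the budget is spent
        ]
        print(pack_context(elements, budget=8))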
  • Patent number: 11714617
    Abstract: An automated system for resolving program merges uses a multi-task neural transformer with attention. Each component of a merge conflict tuple (A, B, O) is represented as an AST and transformed into aligned AST-node sequences and aligned editing sequences. The multi-task neural transformer model predicts the tree editing steps needed to resolve the merge conflict and applies them to the AST representation of the code base. The tree editing steps include the edit actions that need to be applied to the AST of the code base and the edit labels that are inserted or updated with the edit actions.
    Type: Grant
    Filed: January 26, 2022
    Date of Patent: August 1, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Neelakantan Sundaresan, Alexey Svyatkovskiy, Negar Ghorbani
  • Publication number: 20230236811
    Abstract: An automated system for resolving program merges uses a multi-task neural transformer with attention. Each component of a merge conflict tuple (A, B, O) is represented as an AST and transformed into aligned AST-node sequences and aligned editing sequences. The multi-task neural transformer model predicts the tree editing steps needed to resolve the merge conflict and applies them to the AST representation of the code base. The tree editing steps include the edit actions that need to be applied to the AST of the code base and the edit labels that are inserted or updated with the edit actions.
    Type: Application
    Filed: January 26, 2022
    Publication date: July 27, 2023
    Inventors: NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, NEGAR GHORBANI
  • Publication number: 20230222334
    Abstract: A deep learning model is quantized during its training to perform a target software engineering task. During training, a portion of the full-precision floating-point weights is quantized into INT4 or INT8 data types through scalar quantization or product quantization to make the model more resilient to quantization and to reduce the noise between the quantized and full-precision model outputs. In scalar quantization, each sub-block consists of a single weight that is mapped into a codeword of a codebook. In product quantization, an identity matrix and a codebook of centroids are used to map a quantized weight into its original value.
    Type: Application
    Filed: January 10, 2022
    Publication date: July 13, 2023
    Inventors: COLIN BRUCE CLEMENT, SHAO KUN DENG, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY
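    A minimal sketch of the scalar-quantization step for a single weight block, assuming symmetric uniform INT8 quantization; in quantization-aware training the forward pass would use the dequantized weights while updates flow to the full-precision copy:

        def quantize_int8(weights):
            """Symmetric uniform INT8: the 'codebook' is a scaled integer grid."""
            scale = max(abs(w) for w in weights) / 127 or 1.0
            return [round(w / scale) for w in weights], scale

        def dequantize(codes, scale):
            return [c * scale for c in codes]

        w = [0.42, -1.3, 0.07, 0.9]
        codes, scale = quantize_int8(w)
        print(codes)                        # int8-range codes
        print(dequantize(codes, scale))     # reconstruction with quantization noise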
  • Patent number: 11693630
    Abstract: A neural transformer model with attention is trained to predict candidates to complete a line of source code with a zero-shot capability. The model is trained on an unsupervised training dataset that includes features from source code written in multiple programming languages. The features include a file-level context and a local context, where the file-level context includes a global context, a class context, a function context, and/or a method context for each class, function and/or method of the source code programs used in the training dataset. The local context includes method bodies, function bodies, and/or stand-alone code of main method routines. From these features, the model is able to learn to predict an ordered sequence of code elements that complete a line of source code in programming languages seen and not seen during training.
    Type: Grant
    Filed: November 1, 2022
    Date of Patent: July 4, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Colin Bruce Clement, Shuai Lu, Neelakantan Sundaresan, Alexey Svyatkovskiy, Duyu Tang
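    A sketch of separating file-level context (class and function signatures) from local context (bodies), using Python's ast module (3.9+ for ast.unparse) as a stand-in for the patent's feature extraction:

        import ast

        def extract_contexts(source: str):
            """Split a file into file-level context (class/function signatures)
            and local context (bodies)."""
            file_level, local = [], []
            for node in ast.walk(ast.parse(source)):
                if isinstance(node, ast.ClassDef):
                    file_level.append(f"class {node.name}:")
                elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    args = ", ".join(a.arg for a in node.args.args)
                    file_level.append(f"def {node.name}({args}):")
                    local.append(ast.unparse(ast.Module(body=node.body, type_ignores=[])))
            return file_level, local

        src = "class Greeter:\n    def greet(self, name):\n        return 'hi ' + name\n"
        print(extract_contexts(src))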
  • Publication number: 20230195428
    Abstract: A deep learning model trained to predict source code is tuned for a target source code generation task through reinforcement learning using a reward score that considers the quality of the source code predicted during the tuning process. The reward score is adjusted to consider code-quality factors and source code metrics. The code-quality factors account for the predicted source code having syntactic correctness, successful compilation, successful execution, successful invocation, readability, functional correctness, and coverage. The source code metrics generate a score based on how close the predicted source code is to a ground truth code.
    Type: Application
    Filed: December 17, 2021
    Publication date: June 22, 2023
    Inventors: SHAO KUN DENG, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, MICHELE TUFANO
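    A toy composite reward of the kind the abstract describes, gating on syntactic correctness and compilation and blending in similarity to the ground truth; the weights and checks are illustrative, not the patented formula:

        import ast, difflib

        def reward(candidate: str, reference: str) -> float:
            """Zero reward if the candidate does not parse/compile; otherwise a
            base quality score plus similarity to the ground-truth code."""
            try:
                ast.parse(candidate)                    # syntactic correctness
                compile(candidate, "<cand>", "exec")    # successful compilation
            except SyntaxError:
                return 0.0
            similarity = difflib.SequenceMatcher(None, candidate, reference).ratio()
            return 0.5 + 0.5 * similarity

        ref = "def double(x):\n    return 2 * x"
        print(reward("def double(x):\n    return x * 2", ref))  # high reward
        print(reward("def double(x) return 2 * x", ref))        # zero reward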
  • Patent number: 11656851
    Abstract: The syntax elements of a source code program used to represent the context of a focal method are selected based on a priority order. The selected syntax elements are input into a fixed-size context window that is used to train a neural transformer model with attention to learn to generate source code and used by the neural transformer model to generate source code. The context window contains prioritized sequences of tokens that extend beyond the target focus in order to provide longer visibility back into the source code program for the model to learn predictive patterns. This gives the model a file-level context of the source code program without increasing the size of the context window.
    Type: Grant
    Filed: October 22, 2021
    Date of Patent: May 23, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Colin Bruce Clement, Neelakantan Sundaresan, Alexey Svyatkovskiy, Michele Tufano
  • Publication number: 20230128008
    Abstract: A test-driven development system utilizes a neural transformer model with attention to generate method bodies for a focal method given its associated test cases, and optionally a method signature and a docstring of the focal method. The candidate method bodies are validated for syntactic correctness, tested using the given test cases, and tested with a donor class in a target system. Those candidate method bodies passing the validation and testing are then ranked based on a PLUM score that analyzes the candidate method bodies against various quality and performance metrics.
    Type: Application
    Filed: October 22, 2021
    Publication date: April 27, 2023
    Inventors: COLIN BRUCE CLEMENT, SHAO KUN DENG, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, MICHELE TUFANO
  • Publication number: 20230128200
    Abstract: The syntax elements of a source code program used to represent the context of a focal method are selected based on a priority order. The selected syntax elements are input into a fixed-size context window that is used to train a neural transformer model with attention to learn to generate source code and used by the neural transformer model to generate source code. The context window contains prioritized sequences of tokens that extend beyond the target focus in order to provide longer visibility back into the source code program for the model to learn predictive patterns. This gives the model a file-level context of the source code program without increasing the size of the context window.
    Type: Application
    Filed: October 22, 2021
    Publication date: April 27, 2023
    Inventors: COLIN BRUCE CLEMENT, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, MICHELE TUFANO
  • Publication number: 20230114423
    Abstract: An automated program repair tool utilizes a neural transformer model with attention to predict the contents of a bug repair in the context of source code having a bug of an identified bug type. The neural transformer model is trained on a large unsupervised corpus of source code using a span-masking denoising optimization objective, and fine-tuned on a large supervised dataset of triplets containing a bug-type annotation, software bug, and repair. The bug-type annotation is derived from an interprocedural static code analyzer. A bug type edit centroid is computed for each bug type and used in the inference decoding phase to generate the bug repair.
    Type: Application
    Filed: November 25, 2022
    Publication date: April 13, 2023
    Inventors: SHAO KUN DENG, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, MICHELE TUFANO
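    The span-masking denoising objective named in the abstract can be sketched as follows; the sentinel token and span length are illustrative choices:

        import random

        def span_mask(tokens, span_len=3, seed=0):
            """Hide a contiguous span behind a sentinel; the model's pre-training
            target is to reconstruct the hidden span."""
            start = random.Random(seed).randrange(len(tokens) - span_len)
            masked = tokens[:start] + ["<MASK_0>"] + tokens[start + span_len:]
            target = ["<MASK_0>"] + tokens[start:start + span_len]
            return masked, target

        # Fine-tuning would then pair inputs with (bug-type annotation, bug, repair)
        # triplets, per the abstract.
        code = "if x == None : return x".split()
        print(span_mask(code))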
  • Publication number: 20230073052
    Abstract: A code completion system for a command-line interface (CLI) utilizes neural transformer models with attention to generate candidates to complete a line of CLI code. The code completion system uses a first deep learning model to predict at most k candidate command names to follow n immediately preceding lines of CLI code which are presented to a developer. Upon the developer accepting one of the candidate command names, the code completion system uses a second deep learning model to predict at most k parameter strings to complete the line of CLI code.
    Type: Application
    Filed: September 1, 2021
    Publication date: March 9, 2023
    Inventors: YEVHEN MOHYLEVSKYY, ALEXEY SVYATKOVSKIY, ROSHANAK ZILOUCHIAN MOGHADDAM
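    A toy two-stage completion flow: frequency counts over an invented command history stand in for the two neural models described above:

        from collections import Counter

        # Invented example history; not from the patent.
        HISTORY = [
            "az vm create --name web1 --image Ubuntu",
            "az vm create --name web2 --image Ubuntu",
            "az vm list --output table",
            "az storage account list --output table",
        ]

        def top_k_commands(k=2):
            """Stage one: rank candidate command names."""
            counts = Counter(" ".join(line.split()[:3]) for line in HISTORY)
            return [cmd for cmd, _ in counts.most_common(k)]

        def top_k_params(command, k=2):
            """Stage two: rank parameter strings for the accepted command."""
            counts = Counter(tok for line in HISTORY if line.startswith(command)
                             for tok in line.split() if tok.startswith("--"))
            return [p for p, _ in counts.most_common(k)]

        cmds = top_k_commands()
        print(cmds)
        print(top_k_params(cmds[0]))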
  • Publication number: 20230067364
    Abstract: A transfer learning system is used for the development of neural transformer models pertaining to software engineering tasks. The transfer learning system trains source code domain neural transformer models with attention in various configurations on a large unsupervised training dataset of source code programs and/or source code-related natural language text. A web service provides the trained models for use in developing a model that may be fine-tuned on a supervised training dataset associated with a software engineering task, thereby generating a tool to perform the software engineering task.
    Type: Application
    Filed: November 6, 2022
    Publication date: March 2, 2023
    Inventors: COLIN BRUCE CLEMENT, DAWN DRAIN, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY
  • Publication number: 20230048186
    Abstract: A neural transformer model with attention is trained to predict candidates to complete a line of source code with a zero-shot capability. The model is trained on an unsupervised training dataset that includes features from source code written in multiple programming languages. The features include a file-level context and a local context, where the file-level context includes a global context, a class context, a function context, and/or a method context for each class, function and/or method of the source code programs used in the training dataset. The local context includes method bodies, function bodies, and/or stand-alone code of main method routines. From these features, the model is able to learn to predict an ordered sequence of code elements that complete a line of source code in programming languages seen and not seen during training.
    Type: Application
    Filed: November 1, 2022
    Publication date: February 16, 2023
    Inventors: COLIN BRUCE CLEMENT, SHUAI LU, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, DUYU TANG
  • Publication number: 20220398462
    Abstract: A cloud platform includes several web services that facilitate the automated tuning and deployment of pre-trained deep learning models configured for software engineering tasks. The automated tuning and deployment allow a developer to fine-tune a pre-existing model without having access to the parameters of either the pre-existing or the fine-tuned model, and without requiring user management input. The cloud platform provides a set of files for each pre-trained model that are used to automatically build a fine-tuning infrastructure to fine-tune the model and a deployment infrastructure that deploys the fine-tuned model without requiring user input.
    Type: Application
    Filed: June 14, 2021
    Publication date: December 15, 2022
    Inventors: COLIN BRUCE CLEMENT, SHAO KUN DENG, DAWN DRAIN, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY, YIDING TIAN, MICHELE TUFANO, PAUL AN-CHIEH WANG, CHEN WU, DONGJIANG YOU
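    A hypothetical sketch of driving fine-tuning and deployment from a declarative set of files; the spec format, field names, and stage names below are invented, since the abstract does not publish a file format:

        FINE_TUNE_SPEC = {                    # invented file format
            "base_model": "code-transformer-small",
            "task": "code-completion",
            "dataset": "https://example.com/supervised-pairs",  # placeholder
            "epochs": 3,
        }

        def run_pipeline(spec):
            """Stub: each stage would be a managed web service on the platform."""
            print("provision fine-tuning infrastructure for", spec["base_model"])
            print("fine-tune on", spec["dataset"], "for", spec["epochs"], "epochs")
            print("deploy the fine-tuned", spec["task"], "model to an endpoint")

        run_pipeline(FINE_TUNE_SPEC)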
  • Publication number: 20220398071
    Abstract: A code generation system uses a non-terminal expansion model and a non-terminal selector model to generate a code sketch to complete a partially-formed source code snippet. The non-terminal expansion model is a neural transformer model trained on a supervised dataset through reinforcement learning to learn to predict the production rule to expand for a given non-terminal symbol. The non-terminal selector model is trained through reinforcement learning to predict the non-terminal symbol to expand given a partial-code state. The models are used in a two-step beam search to generate the top candidate code sketches, where a candidate code sketch may contain a hole that represents an unexpanded non-terminal symbol.
    Type: Application
    Filed: August 16, 2021
    Publication date: December 15, 2022
    Inventors: MILTIADIS ALLAMANIS, DAYA GUO, NEELAKANTAN SUNDARESAN, ALEXEY SVYATKOVSKIY
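    A toy grammar-guided generator in the spirit of the abstract: simple heuristics stand in for the selector and expansion models, and an unexpanded non-terminal is left as a hole:

        GRAMMAR = {
            "EXPR": [["TERM", "+", "TERM"], ["TERM"]],
            "TERM": [["x"], ["1"], ["(", "EXPR", ")"]],
        }

        def generate_sketch(max_expansions=2):
            seq = ["EXPR"]
            for _ in range(max_expansions):
                spots = [i for i, s in enumerate(seq) if s in GRAMMAR]
                if not spots:
                    break
                i = spots[0]                  # selector 'model': leftmost non-terminal
                rule = GRAMMAR[seq[i]][0]     # expansion 'model': top-ranked rule
                seq = seq[:i] + rule + seq[i + 1:]
            return " ".join(f"<{s}>" if s in GRAMMAR else s for s in seq)

        print(generate_sketch())              # "x + <TERM>": the hole is unexpanded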