Patents by Inventor Liujia Shao

Liujia Shao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Transaction composition graph node embedding

Patent number: 12050971

Abstract: A computer-implemented process for transaction composition graph node embedding comprising traversing a data flow of transactions to convert a full graph to multiple directed acyclic subgraphs/paths in spanning trees, taking one-by-one nodes as input to a predetermined neural network, generating a set of one-hot vectors for all nodes, computing an embedding vector from a corresponding one-hot vector, computing a probability that an output node is nearby, and embedding the node to a latent feature vector.

Type: Grant

Filed: December 3, 2020

Date of Patent: July 30, 2024

Assignee: International Business Machines Corporation

Inventors: Yan Luo, Liujia Shao, Yan Xu
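
The one-hot, embedding, and nearby-node probability steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented process: the node names, the 3-dimensional embedding size, the random untrained weights, and all function names are assumptions, and no training loop is shown.

```python
import math
import random


def one_hot(index, size):
    """Return a one-hot vector with 1.0 at position `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def embed(one_hot_vec, embedding_matrix):
    """Multiply a one-hot row vector by the embedding matrix (a row lookup)."""
    dim = len(embedding_matrix[0])
    return [
        sum(one_hot_vec[i] * embedding_matrix[i][d] for i in range(len(one_hot_vec)))
        for d in range(dim)
    ]


def nearby_probabilities(embedding, output_matrix):
    """Softmax over dot products of the embedding with each output-node vector."""
    scores = [sum(e * w for e, w in zip(embedding, row)) for row in output_matrix]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


# Hypothetical path extracted from a transaction graph (assumed names).
path = ["login", "query_balance", "transfer", "logout"]
node_index = {name: i for i, name in enumerate(path)}

# Assumed: small random weights stand in for a trained network.
rng = random.Random(0)
dim = 3
embedding_matrix = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in path]
output_matrix = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in path]

vector = embed(one_hot(node_index["transfer"], len(path)), embedding_matrix)
probs = nearby_probabilities(vector, output_matrix)
```

Here `vector` plays the role of the latent feature vector for the `transfer` node, and `probs` holds one probability per candidate output node.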

Code refactor renaming recommender

Patent number: 11604640

Abstract: An approach to code refactor renaming may be provided. Source code with a naming convention for functions and classes can be presented to a machine learning model. The model may identify the names for functions and classes. The identified names may be tokenized. Docstrings associated with functions and classes may be identified. Code for the identified functions and classes and associated docstrings may be input into a feature vector generation mechanism. A model may be trained mapping the generated feature vectors to tokenized identified names, via regression. The model can be utilized to analyze input code with the same naming convention to predict names for functions and classes, allowing for the recommendation of function and class names in accordance with the programming code naming convention.

Type: Grant

Filed: December 11, 2020

Date of Patent: March 14, 2023

Assignee: International Business Machines Corporation

Inventors: Liujia Shao, Yan Luo, Yan Xu
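
The name identification, tokenization, and docstring steps in the abstract can be illustrated with a short sketch. It swaps in Python's standard `ast` module for the machine learning model the abstract describes, and it stops before feature vector generation and regression. The function names and the sample source are assumptions made for illustration.

```python
import ast
import re


def tokenize_identifier(name):
    """Split a snake_case, camelCase, or PascalCase identifier into lowercase tokens."""
    tokens = []
    for part in name.split("_"):
        # Each match is an acronym run, a capitalized or lowercase word, or digits.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [t.lower() for t in tokens]


def names_and_docstrings(source):
    """Collect (name, tokens, docstring) for every function and class in `source`."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            found.append(
                (node.name, tokenize_identifier(node.name), ast.get_docstring(node))
            )
    return found


# Hypothetical input code used only to exercise the helpers.
sample = '''
class OrderBook:
    """Track open orders."""

    def add_order(self, order):
        """Insert an order."""
        return order
'''

records = names_and_docstrings(sample)
```

Each record pairs a tokenized name with its docstring, which is the kind of target/input pairing a name-prediction model could be trained on.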

Context-based word embedding for programming artifacts

Patent number: 11422798

Abstract: Techniques for context-based word embedding for programming artifacts are described herein. An aspect includes determining a plurality of keywords based on a corpus of programming artifacts, the corpus of programming artifacts including source code corresponding to a software project. Another aspect includes determining a plurality of context/keyword pair sets based on the plurality of keywords and the corpus of programming artifacts, wherein each context/keyword pair set of the plurality of context/keyword pair sets includes a first keyword, a second keyword, and a context type corresponding to a co-occurrence of the first keyword and the second keyword in the corpus of programming artifacts. Another aspect includes constructing a word embedding matrix based on the plurality of context/keyword pair sets.

Type: Grant

Filed: February 26, 2020

Date of Patent: August 23, 2022

Assignee: International Business Machines Corporation

Inventors: Yan Luo, Liujia Shao, Yan Xu, Sibin Fan
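
As a rough illustration of the context/keyword pair sets in the abstract, the sketch below emits a (first keyword, second keyword, context type) triple for each co-occurrence and then counts the triples into a symmetric matrix. The context type labels, the sample artifacts, and the use of raw co-occurrence counts as a stand-in for the word embedding matrix are all assumptions; the abstract does not specify how the matrix is constructed.

```python
from itertools import combinations


def context_keyword_pairs(artifacts, keywords):
    """Return (first keyword, second keyword, context type) for co-occurring keywords."""
    triples = []
    for context_type, text in artifacts:
        words = set(text.split())
        present = sorted(k for k in keywords if k in words)
        for first, second in combinations(present, 2):
            triples.append((first, second, context_type))
    return triples


def cooccurrence_matrix(triples, keywords):
    """Count co-occurrences into a symmetric keyword-by-keyword matrix."""
    order = sorted(keywords)
    index = {k: i for i, k in enumerate(order)}
    matrix = [[0] * len(order) for _ in order]
    for first, second, _context_type in triples:
        matrix[index[first]][index[second]] += 1
        matrix[index[second]][index[first]] += 1
    return order, matrix


# Hypothetical corpus: each artifact is (context type, text).
artifacts = [
    ("function_body", "open socket then send packet"),
    ("comment", "close socket after send"),
    ("function_body", "send packet retry"),
]
keywords = {"socket", "send", "packet"}

triples = context_keyword_pairs(artifacts, keywords)
order, matrix = cooccurrence_matrix(triples, keywords)
```

Keeping the context type in each triple would let a fuller version weight co-occurrences differently depending on where they appear, for example in a comment versus a function body.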

Code refactor renaming recommender

Publication number: 20220188102

Abstract: An approach to code refactor renaming may be provided. Source code with a naming convention for functions and classes can be presented to a machine learning model. The model may identify the names for functions and classes. The identified names may be tokenized. Docstrings associated with functions and classes may be identified. Code for the identified functions and classes and associated docstrings may be input into a feature vector generation mechanism. A model may be trained mapping the generated feature vectors to tokenized identified names, via regression. The model can be utilized to analyze input code with the same naming convention to predict names for functions and classes, allowing for the recommendation of function and class names in accordance with the programming code naming convention.

Type: Application

Filed: December 11, 2020

Publication date: June 16, 2022

Inventors: Liujia Shao, Yan Luo, Yan Xu

Transaction composition graph node embedding

Publication number: 20220180240

Abstract: A computer-implemented process for transaction composition graph node embedding comprising traversing a data flow of transactions to convert a full graph to multiple directed acyclic subgraphs/paths in spanning trees, taking one-by-one nodes as input to a predetermined neural network, generating a set of one-hot vectors for all nodes, computing an embedding vector from a corresponding one-hot vector, computing a probability that an output node is nearby, and embedding the node to a latent feature vector.

Type: Application

Filed: December 3, 2020

Publication date: June 9, 2022

Inventors: Yan Luo, Liujia Shao, Yan Xu

Pretraining utilizing software dependencies

Patent number: 11262985

Abstract: In an approach to creating code snippet auto-commenting models utilizing a pre-training model leveraging dependency data, one or more computer processors create a generalized pre-training model trained with one or more dependencies and one or more associated dependency embeddings, wherein dependencies include frameworks, imported libraries, header files, and application programming interfaces associated with a software project. The one or more computer processors create a subsequent model with a model architecture identical to the created pre-training model. The one or more computer processors computationally reduce a training of the created subsequent model utilizing one or more trained parameters, activations, memory cells, and context vectors contained in the created pre-training model. The one or more computer processors deploy the subsequent model to one or more production environments.

Type: Grant

Filed: March 10, 2020

Date of Patent: March 1, 2022

Assignee: International Business Machines Corporation

Inventors: Yan Luo, Liujia Shao, Yan Xu, Sibin Fan

Automated breakpoint creation

Patent number: 11176019

Abstract: Methods, systems, and computer program products for automated breakpoint creation using machine learning are provided. Aspects include obtaining a bug report for a software and source code for the software and analyzing the bug report to determine a bug type for the bug report, where analyzing the bug report includes using a bug type labeling model. Aspects also include analyzing the source code to identify a code snippet in the source code based on the bug type, where analyzing the source code includes using a source code detection model. Aspects further include inserting a breakpoint in the source code at the code snippet.

Type: Grant

Filed: April 1, 2020

Date of Patent: November 16, 2021

Assignee: International Business Machines Corporation

Inventors: Liujia Shao, Yan Luo, Yan Xu, Sibin Fan

Automated breakpoint creation

Publication number: 20210311853

Abstract: Methods, systems, and computer program products for automated breakpoint creation using machine learning are provided. Aspects include obtaining a bug report for a software and source code for the software and analyzing the bug report to determine a bug type for the bug report, where analyzing the bug report includes using a bug type labeling model. Aspects also include analyzing the source code to identify a code snippet in the source code based on the bug type, where analyzing the source code includes using a source code detection model. Aspects further include inserting a breakpoint in the source code at the code snippet.

Type: Application

Filed: April 1, 2020

Publication date: October 7, 2021

Inventors: Liujia Shao, Yan Luo, Yan Xu, Sibin Fan

Pretraining utilizing software dependencies

Publication number: 20210286598

Abstract: In an approach to creating code snippet auto-commenting models utilizing a pre-training model leveraging dependency data, one or more computer processors create a generalized pre-training model trained with one or more dependencies and one or more associated dependency embeddings, wherein dependencies include frameworks, imported libraries, header files, and application programming interfaces associated with a software project. The one or more computer processors create a subsequent model with a model architecture identical to the created pre-training model. The one or more computer processors computationally reduce a training of the created subsequent model utilizing one or more trained parameters, activations, memory cells, and context vectors contained in the created pre-training model. The one or more computer processors deploy the subsequent model to one or more production environments.

Type: Application

Filed: March 10, 2020

Publication date: September 16, 2021

Inventors: Yan Luo, Liujia Shao, Yan Xu, Sibin Fan

Automatic code coverage file recommendation

Patent number: 11119898

Abstract: Techniques for automatic code coverage file recommendation are described herein. An aspect includes receiving historical code coverage data. Another aspect includes clustering the historical code coverage data. Another aspect includes performing content filtering based on the clustered historical code coverage data to determine a content filtering preferred file list. Another aspect includes performing collaborative filtering based on the clustered historical code coverage data to determine a collaborative filtering preferred file list. Another aspect includes combining the content filtering preferred file list and the collaborative filtering preferred file list to determine a code coverage file recommendation list. Another aspect includes providing the code coverage file recommendation list to a user.

Type: Grant

Filed: May 7, 2020

Date of Patent: September 14, 2021

Assignee: International Business Machines Corporation

Inventors: Liujia Shao, Yan Luo, Yan Xu, Sibin Fan
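
The final combining step in the abstract could look something like the sketch below. The abstract does not say how the two preferred file lists are merged, so the reciprocal-rank scoring used here is an assumed stand-in, and the function name, file names, and `top_n` parameter are hypothetical.

```python
def combine_preferred_lists(content_list, collaborative_list, top_n=5):
    """Merge two ranked file lists by summing reciprocal ranks (assumed scheme)."""
    scores = {}
    for ranked in (content_list, collaborative_list):
        for rank, path in enumerate(ranked, start=1):
            scores[path] = scores.get(path, 0.0) + 1.0 / rank
    # Highest score first; ties are broken alphabetically for a stable result.
    ordered = sorted(scores, key=lambda p: (-scores[p], p))
    return ordered[:top_n]


# Hypothetical outputs of the content filtering and collaborative filtering steps.
content_preferred = ["auth.py", "db.py", "cache.py"]
collaborative_preferred = ["db.py", "api.py", "auth.py"]

recommendation = combine_preferred_lists(content_preferred, collaborative_preferred)
```

Files that rank well in both lists rise to the top, while files seen by only one filter still appear lower in the recommendation list.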

Context-based word embedding for programming artifacts

Publication number: 20210263732

Abstract: Techniques for context-based word embedding for programming artifacts are described herein. An aspect includes determining a plurality of keywords based on a corpus of programming artifacts, the corpus of programming artifacts including source code corresponding to a software project. Another aspect includes determining a plurality of context/keyword pair sets based on the plurality of keywords and the corpus of programming artifacts, wherein each context/keyword pair set of the plurality of context/keyword pair sets includes a first keyword, a second keyword, and a context type corresponding to a co-occurrence of the first keyword and the second keyword in the corpus of programming artifacts. Another aspect includes constructing a word embedding matrix based on the plurality of context/keyword pair sets.

Type: Application

Filed: February 26, 2020

Publication date: August 26, 2021

Inventors: Yan Luo, Liujia Shao, Yan Xu, Sibin Fan

Weighted code coverage

Publication number: 20210149793

Abstract: Provided is a method, a system, and a computer program product for determining a cognitive code coverage weight for code snippets located in a portion of code. The method includes generating a set of samples from code snippets included in a portion of code. Each sample in the set of samples includes features derived from the code snippets. The method further includes generating corresponding labels for the set of samples and creating a training dataset by applying the labels to the set of samples. The training dataset includes the set of samples with each sample including a label from the labels generated. The method further includes training a machine learning model using the labeled dataset to output a code coverage weight for code snippets.

Type: Application

Filed: November 19, 2019

Publication date: May 20, 2021

Inventors: Liujia Shao, Yan Luo, Yan Xu, Anton Karputkin