Patents by Inventor Liujia Shao
Liujia Shao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12050971
Abstract: A computer-implemented process for transaction composition graph node embedding comprising traversing a data flow of transactions to convert a full graph to multiple directed acyclic subgraphs/paths in spanning trees, taking one-by-one nodes as input to a predetermined neural network, generating a set of one-hot vectors for all nodes, computing an embedding vector from a corresponding one-hot vector, computing a probability that an output node is nearby, and embedding the node to a latent feature vector.
Type: Grant
Filed: December 3, 2020
Date of Patent: July 30, 2024
Assignee: International Business Machines Corporation
Inventors: Yan Luo, Liujia Shao, Yan Xu
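The one-hot, embedding, and nearby-probability steps named in this abstract can be illustrated with a small sketch. It is not the patented method: the node names, the dimension, the random weights, and the softmax output layer are all assumptions chosen for illustration, in the style of a skip-gram model.

```python
import math
import random

def one_hot(index, size):
    """Return a one-hot vector with a 1.0 at the given index."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def embed(one_hot_vec, weights):
    """Multiply a one-hot vector by a weight matrix (selects one row)."""
    dim = len(weights[0])
    return [
        sum(one_hot_vec[i] * weights[i][d] for i in range(len(weights)))
        for d in range(dim)
    ]

def nearby_probabilities(embedding, output_weights):
    """Softmax over dot products: one probability per candidate output node."""
    scores = [sum(e * w for e, w in zip(embedding, row)) for row in output_weights]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

# Hypothetical graph nodes and untrained random weights (assumptions).
rng = random.Random(0)
nodes = ["A", "B", "C", "D"]
dim = 3
input_weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in nodes]
output_weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in nodes]

vec = one_hot(nodes.index("B"), len(nodes))
latent = embed(vec, input_weights)            # latent feature vector for node B
probs = nearby_probabilities(latent, output_weights)
```

In a trained model the weights would be fitted so that nodes appearing close together on the traversed paths receive high probabilities; here they are random, so only the shapes and the flow of data are meaningful.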
-
Patent number: 11604640
Abstract: An approach to code refactor renaming may be provided. Source code with a naming convention for functions and classes can be presented to a machine learning model. The model may identify the names for functions and classes. The identified names may be tokenized. Docstrings associated with functions and classes may be identified. Code for the identified functions and classes and associated docstrings may be input into a feature vector generation mechanism. A model may be trained mapping the generated feature vectors to tokenized identified names, via regression. The model can be utilized to analyze input code with the same naming convention to predict names for functions and classes, allowing for the recommendation of function and class names in accordance with the programming code naming convention.
Type: Grant
Filed: December 11, 2020
Date of Patent: March 14, 2023
Assignee: International Business Machines Corporation
Inventors: Liujia Shao, Yan Luo, Yan Xu
-
Patent number: 11422798
Abstract: Techniques for context-based word embedding for programming artifacts are described herein. An aspect includes determining a plurality of keywords based on a corpus of programming artifacts, the corpus of programming artifacts including source code corresponding to a software project. Another aspect includes determining a plurality of context/keyword pair sets based on the plurality of keywords and the corpus of programming artifacts, wherein each context/keyword pair set of the plurality of context/keyword pair sets includes a first keyword, a second keyword, and a context type corresponding to a co-occurrence of the first keyword and the second keyword in the corpus of programming artifacts. Another aspect includes constructing a word embedding matrix based on the plurality of context/keyword pair sets.
Type: Grant
Filed: February 26, 2020
Date of Patent: August 23, 2022
Assignee: International Business Machines Corporation
Inventors: Yan Luo, Liujia Shao, Yan Xu, Sibin Fan
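As a rough illustration of how context/keyword pair sets could feed a matrix, the sketch below accumulates a weighted, symmetric co-occurrence matrix from (first keyword, second keyword, context type) triples. The abstract does not say how the embedding matrix is constructed, so the keywords, the context type names, and the per-context weights are hypothetical, and a co-occurrence matrix is only one plausible starting point.

```python
def cooccurrence_matrix(pairs, keywords, context_weights):
    """Build a symmetric keyword-by-keyword matrix from weighted pair sets."""
    index = {keyword: i for i, keyword in enumerate(keywords)}
    matrix = [[0.0] * len(keywords) for _ in keywords]
    for first, second, context in pairs:
        weight = context_weights.get(context, 1.0)  # unknown contexts count once
        i, j = index[first], index[second]
        matrix[i][j] += weight
        matrix[j][i] += weight
    return matrix

# Hypothetical keywords, context types, and weights (all assumptions).
keywords = ["parse", "token", "buffer"]
pairs = [
    ("parse", "token", "same_function"),
    ("parse", "token", "same_file"),
    ("token", "buffer", "same_file"),
]
context_weights = {"same_function": 2.0, "same_file": 1.0}

matrix = cooccurrence_matrix(pairs, keywords, context_weights)
```

Each row of such a matrix could serve as a crude keyword vector, or be reduced further by a factorization step; neither choice is stated in the abstract.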
-
Publication number: 20220188102
Abstract: An approach to code refactor renaming may be provided. Source code with a naming convention for functions and classes can be presented to a machine learning model. The model may identify the names for functions and classes. The identified names may be tokenized. Docstrings associated with functions and classes may be identified. Code for the identified functions and classes and associated docstrings may be input into a feature vector generation mechanism. A model may be trained mapping the generated feature vectors to tokenized identified names, via regression. The model can be utilized to analyze input code with the same naming convention to predict names for functions and classes, allowing for the recommendation of function and class names in accordance with the programming code naming convention.
Type: Application
Filed: December 11, 2020
Publication date: June 16, 2022
Inventors: Liujia Shao, Yan Luo, Yan Xu
-
Publication number: 20220180240
Abstract: A computer-implemented process for transaction composition graph node embedding comprising traversing a data flow of transactions to convert a full graph to multiple directed acyclic subgraphs/paths in spanning trees, taking one-by-one nodes as input to a predetermined neural network, generating a set of one-hot vectors for all nodes, computing an embedding vector from a corresponding one-hot vector, computing a probability that an output node is nearby, and embedding the node to a latent feature vector.
Type: Application
Filed: December 3, 2020
Publication date: June 9, 2022
Inventors: Yan Luo, Liujia Shao, Yan Xu
-
Patent number: 11262985
Abstract: In an approach to creating code snippet auto-commenting models utilizing a pre-training model leveraging dependency data, one or more computer processors create a generalized pre-training model trained with one or more dependencies and one or more associated dependency embeddings, wherein dependencies include frameworks, imported libraries, header files, and application programming interfaces associated with a software project. The one or more computer processors create a subsequent model with a model architecture identical to the created pre-training model. The one or more computer processors computationally reduce a training of the created subsequent model utilizing one or more trained parameters, activations, memory cells, and context vectors contained in the created pre-training model. The one or more computer processors deploy the subsequent model to one or more production environments.
Type: Grant
Filed: March 10, 2020
Date of Patent: March 1, 2022
Assignee: International Business Machines Corporation
Inventors: Yan Luo, Liujia Shao, Yan Xu, Sibin Fan
-
Patent number: 11176019
Abstract: Methods, systems, and computer program products for automated breakpoint creation using machine learning are provided. Aspects include obtaining a bug report for a software and source code for the software and analyzing the bug report to determine a bug type for the bug report, where analyzing the bug report includes using a bug type labeling model. Aspects also include analyzing the source code to identify a code snippet in the source code based on the bug type, where analyzing the source code includes using a source code detection model. Aspects further include inserting a breakpoint in the source code at the code snippet.
Type: Grant
Filed: April 1, 2020
Date of Patent: November 16, 2021
Assignee: International Business Machines Corporation
Inventors: Liujia Shao, Yan Luo, Yan Xu, Sibin Fan
-
Publication number: 20210311853
Abstract: Methods, systems, and computer program products for automated breakpoint creation using machine learning are provided. Aspects include obtaining a bug report for a software and source code for the software and analyzing the bug report to determine a bug type for the bug report, where analyzing the bug report includes using a bug type labeling model. Aspects also include analyzing the source code to identify a code snippet in the source code based on the bug type, where analyzing the source code includes using a source code detection model. Aspects further include inserting a breakpoint in the source code at the code snippet.
Type: Application
Filed: April 1, 2020
Publication date: October 7, 2021
Inventors: Liujia Shao, Yan Luo, Yan Xu, Sibin Fan
-
Publication number: 20210286598
Abstract: In an approach to creating code snippet auto-commenting models utilizing a pre-training model leveraging dependency data, one or more computer processors create a generalized pre-training model trained with one or more dependencies and one or more associated dependency embeddings, wherein dependencies include frameworks, imported libraries, header files, and application programming interfaces associated with a software project. The one or more computer processors create a subsequent model with a model architecture identical to the created pre-training model. The one or more computer processors computationally reduce a training of the created subsequent model utilizing one or more trained parameters, activations, memory cells, and context vectors contained in the created pre-training model. The one or more computer processors deploy the subsequent model to one or more production environments.
Type: Application
Filed: March 10, 2020
Publication date: September 16, 2021
Inventors: Yan Luo, Liujia Shao, Yan Xu, Sibin Fan
-
Patent number: 11119898
Abstract: Techniques for automatic code coverage file recommendation are described herein. An aspect includes receiving historical code coverage data. Another aspect includes clustering the historical code coverage data. Another aspect includes performing content filtering based on the clustered historical code coverage data to determine a content filtering preferred file list. Another aspect includes performing collaborative filtering based on the clustered historical code coverage data to determine a collaborative filtering preferred file list. Another aspect includes combining the content filtering preferred file list and the collaborative filtering preferred file list to determine a code coverage file recommendation list. Another aspect includes providing the code coverage file recommendation list to a user.
Type: Grant
Filed: May 7, 2020
Date of Patent: September 14, 2021
Assignee: International Business Machines Corporation
Inventors: Liujia Shao, Yan Luo, Yan Xu, Sibin Fan
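The final combining step in this abstract can be pictured with a minimal sketch that merges two ranked file lists into one recommendation list. The abstract does not specify the combination rule, so the reciprocal-rank scoring used here, the function name, and the file names are assumptions for illustration only.

```python
def combine_preferred_lists(content_list, collaborative_list):
    """Merge two ranked lists; each list adds 1 / (rank + 1) to a file's score."""
    scores = {}
    for ranked in (content_list, collaborative_list):
        for rank, name in enumerate(ranked):
            scores[name] = scores.get(name, 0.0) + 1.0 / (rank + 1)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda name: (-scores[name], name))

# Hypothetical outputs of the two filtering steps (assumptions).
content_preferred = ["a.c", "b.c", "c.c"]
collaborative_preferred = ["b.c", "d.c", "a.c"]

recommendation = combine_preferred_lists(content_preferred, collaborative_preferred)
print(recommendation)  # ['b.c', 'a.c', 'd.c', 'c.c']
```

Files that rank well in both lists rise to the top, while files that appear in only one list are still kept, which is one simple way to let the two filters complement each other.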
-
Publication number: 20210263732
Abstract: Techniques for context-based word embedding for programming artifacts are described herein. An aspect includes determining a plurality of keywords based on a corpus of programming artifacts, the corpus of programming artifacts including source code corresponding to a software project. Another aspect includes determining a plurality of context/keyword pair sets based on the plurality of keywords and the corpus of programming artifacts, wherein each context/keyword pair set of the plurality of context/keyword pair sets includes a first keyword, a second keyword, and a context type corresponding to a co-occurrence of the first keyword and the second keyword in the corpus of programming artifacts. Another aspect includes constructing a word embedding matrix based on the plurality of context/keyword pair sets.
Type: Application
Filed: February 26, 2020
Publication date: August 26, 2021
Inventors: Yan Luo, Liujia Shao, Yan Xu, Sibin Fan
-
Publication number: 20210149793
Abstract: Provided is a method, a system, and a computer program product for determining a cognitive code coverage weight for code snippets located in a portion of code. The method includes generating a set of samples from code snippets included in a portion of code. Each sample in the set of samples includes features derived from the code snippets. The method further includes generating corresponding labels for the set of samples and creating a training dataset by applying the labels to the set of samples. The training dataset includes the set of samples with each sample including a label from the labels generated. The method further includes training a machine learning model using the labeled dataset to output a code coverage weight for code snippets.
Type: Application
Filed: November 19, 2019
Publication date: May 20, 2021
Inventors: Liujia Shao, Yan Luo, Yan Xu, Anton Karputkin