Patents by Inventor Wei-Peng Chen

Wei-Peng Chen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

TABULAR GRAPH LANGUAGE MODEL WITH MULTI-MODAL LEARNING

Publication number: 20260170406

Abstract: According to an aspect of an embodiment, a method may include obtaining a tabular dataset. The tabular dataset may be converted into a first dataset having a first data type using a first converter. The tabular dataset may be converted into a second dataset having a second data type using a second converter. The method may further include generating, using a first encoder, a first set of embeddings in a first dimensional space based on the first dataset and generating, using a second encoder, a second set of embeddings in a second dimensional space based on the second dataset. The method may further include training one or both of the first encoder and the second encoder based on the first set of embeddings and the second set of embeddings.

Type: Application

Filed: December 18, 2024

Publication date: June 18, 2026

Applicant: Fujitsu Limited

Inventors: Anay MAJEE, Maria XENOCHRISTOU, Wei-Peng CHEN

Machine learning pipeline with visualizations

Patent number: 12619910

Abstract: A method may include obtaining a machine learning (ML) pipeline including a plurality of functional blocks within the ML pipeline. The method may also include using the ML pipeline as an input to a visualization predictor, where the visualization predictor may be trained to output one or more visualization commands based on relationships between the visualization commands and the functional blocks within the pipeline. The method may additionally include invoking the visualization commands to instantiate the ML pipeline with visualizations generated by the one or more visualization commands.

Type: Grant

Filed: March 29, 2022

Date of Patent: May 5, 2026

Assignee: FUJITSU LIMITED

Inventors: Lei Liu, Wei-Peng Chen

Domain-specific text labelling using natural language inference model

Patent number: 12524624

Abstract: In an embodiment, a set of texts associated with a domain is received. A set of hypothesis statements associated with the domain is received. A pre-trained natural language inference (NLI) model is applied on each of the received set of texts and on each of the received set of hypothesis statements. A second text corpus associated with the domain is generated. The generated second text corpus corresponds to a set of labels associated with the domain. A few-shot learning model is applied on the generated second text corpus to generate a third text corpus associated with the domain. The generated third text corpus is configured to fine-tune the applied pre-trained NLI model, and the fine-tuned NLI model is configured to label an input text associated with the domain. A display of the labelled input text on a display device is controlled.

Type: Grant

Filed: November 16, 2022

Date of Patent: January 13, 2026

Assignee: Fujitsu Limited

Inventors: Wei-Peng Chen, Mehdi Bahrami, Lei Liu

Deep parameter learning for code synthesis

Patent number: 12475318

Abstract: According to an aspect of an embodiment, operations for deep parameter learning for code synthesis are provided. The operations may include receiving a source code file and generating an abstract syntax tree (AST). The operations may further include determining a set of classes, and functions/procedures from the computer-executable code and extracting metadata associated to each component. The operations may further include selecting a subset of functions for which descriptions in the extracted metadata satisfy filtering criteria and updating the computer-executable code by filtering lines of code (LoCs) corresponding to the subset of functions/procedures. The operations may further include generating a dataset of code features and respective metadata features that includes a deep connection between parameters and its usage based on the updated computer-executable code and the metadata generation task.

Type: Grant

Filed: July 24, 2022

Date of Patent: November 18, 2025

Assignee: Fujitsu Limited

Inventors: Mehdi Bahrami, Wei-Peng Chen
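As an illustration only (not code from the patent), the first steps named in this abstract can be sketched with Python's standard `ast` module: parsing source code into an AST, collecting classes and functions with their metadata, and selecting functions whose descriptions pass a filter. The sample `SOURCE`, the helper names `extract_components` and `select_documented`, and the minimum-word-count filter are all assumptions made for the example.

```python
import ast

# Hypothetical input file contents, used only for this sketch.
SOURCE = '''
class Loader:
    """Load records from disk."""

    def read(self, path, limit=10):
        """Read up to `limit` records from `path`."""
        return []

def helper(x):
    return x
'''


def extract_components(source: str) -> list:
    """Collect classes and functions with their docstrings and parameter names."""
    tree = ast.parse(source)
    components = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            params = []
            if not isinstance(node, ast.ClassDef):
                params = [a.arg for a in node.args.args]
            components.append({
                "kind": type(node).__name__,
                "name": node.name,
                "params": params,
                "doc": ast.get_docstring(node) or "",
            })
    return components


def select_documented(components, min_words=3):
    """Keep functions whose docstring has at least `min_words` words (assumed criterion)."""
    return [
        c for c in components
        if c["kind"] != "ClassDef" and len(c["doc"].split()) >= min_words
    ]


components = extract_components(SOURCE)
selected = select_documented(components)
for c in selected:
    print(c["name"], c["params"])
```

The sketch stops at the selection step. Filtering lines of code and building the dataset of code and metadata features, as the abstract goes on to describe, are not shown.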
DATASET ENCODING USING GENERATIVE ARTIFICIAL INTELLIGENCE

Publication number: 20250272553

Abstract: Operations may include identifying features corresponding to a dataset. An embedding for each feature may be obtained using a pretrained generative artificial intelligence model. Pair comparisons of the embeddings may be generated. An encoded dataset may be generated by applying, to the pair comparisons, weights computed using the pretrained generative artificial intelligence model. The weights may indicate correlation between features in the pair comparisons.

Type: Application

Filed: February 27, 2024

Publication date: August 28, 2025

Applicant: Fujitsu Limited

Inventors: Mehdi BAHRAMI, Wei-Peng CHEN
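A minimal sketch of the pair-comparison idea in this abstract, under stated assumptions: the embeddings are toy vectors standing in for the output of a pretrained model, and cosine similarity is swapped in as the pairwise weight. The abstract does not say how the weights are computed, so `FEATURE_EMBEDDINGS`, `cosine`, and `pair_weights` are hypothetical names and choices.

```python
import math
from itertools import combinations

# Toy embeddings standing in for a pretrained model's output (hypothetical values).
FEATURE_EMBEDDINGS = {
    "age": [1.0, 0.0, 0.0],
    "income": [0.0, 1.0, 0.0],
    "salary": [0.0, 0.8, 0.6],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pair_weights(embeddings):
    """Compare every pair of features and return one weight per pair."""
    return {
        (a, b): cosine(embeddings[a], embeddings[b])
        for a, b in combinations(embeddings, 2)
    }


weights = pair_weights(FEATURE_EMBEDDINGS)
for pair, weight in weights.items():
    print(pair, round(weight, 3))
```

In this toy data, `income` and `salary` receive a high weight, while `age` and `income` receive zero. Applying the weights to produce an encoded dataset is left out of the sketch.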
MACHINE LEARNING FOR TABULAR DATA

Publication number: 20250245561

Abstract: According to an aspect of at least one embodiment, one or more operations may include accessing a dataset including multiple data subsets. Each of the data subsets may include multiple tabular data values. A set of images may be generated from a data subset of the multiple data subsets. Each image of the set of images may be generated using a different configuration of an image generation process. A composite image may be formed using the set of images. The composite image may be input to a machine learning model to obtain a prediction for a value in the data subset. The machine learning model may be trained based on the prediction.

Type: Application

Filed: January 31, 2024

Publication date: July 31, 2025

Applicant: Fujitsu Limited

Inventors: Maria XENOCHRISTOU, Wei-Peng CHEN

DATA ADJUSTMENT USING LARGE LANGUAGE MODEL

Publication number: 20250209282

Abstract: A method may include accessing a dataset including multiple data subsets, each of the data subsets corresponding to a feature of the dataset. Data in the data subsets may be analyzed to determine a characteristic of the data. In addition, a prompt template may be selected from prompt templates for the one of the data subsets based on the determined characteristic of the data. Prompts may be generated using the prompt template and the data from the one of the data subsets. The prompts may be provided to an LLM. The prompts may command the LLM to perform one or more operations with respect to the data of the one of the data subsets. One or more additional data subsets may be created for the dataset based on response of the LLM. Each of the one or more additional data subsets may correspond to a new feature of the dataset.

Type: Application

Filed: December 21, 2023

Publication date: June 26, 2025

Applicant: Fujitsu Limited

Inventors: Lei LIU, Sou HASEGAWA, Wei-Peng CHEN
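The template-selection step in this abstract can be sketched roughly as follows. This is an assumption-laden illustration, not the claimed method: the two characteristics (`numeric`, `text`), the template wording, and the helper names `characterize` and `build_prompts` are invented for the example, and no LLM is actually called.

```python
# Hypothetical prompt templates keyed by a data characteristic.
PROMPT_TEMPLATES = {
    "numeric": "The value of '{feature}' is {value}. Suggest a bucket label for it.",
    "text": "The entry for '{feature}' is \"{value}\". Summarize it in one word.",
}


def characterize(values):
    """Return 'numeric' if every value parses as a float, otherwise 'text'."""
    try:
        for v in values:
            float(v)
    except (TypeError, ValueError):
        return "text"
    return "numeric"


def build_prompts(feature, values):
    """Pick a template from the data's characteristic and fill it once per value."""
    template = PROMPT_TEMPLATES[characterize(values)]
    return [template.format(feature=feature, value=v) for v in values]


prompts = build_prompts("age", [34, 51])
for p in prompts:
    print(p)
```

In a fuller pipeline, each prompt would be sent to an LLM and the responses collected into a new column, corresponding to the "additional data subsets" the abstract mentions.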
DATASET FEATURE TYPE INFERENCE

Publication number: 20250209373

Abstract: According to an aspect of an embodiment, one or more operations may include accessing a dataset including multiple data subsets. Feature type candidates corresponding to the data subsets may be identified. The one or more operations may further include building first machine learning models using different sets of feature type candidates. Each of the different sets of feature type candidates may be scored based on respective accuracies, relative to the dataset, of each first machine learning model that respectively corresponds to each different set of feature type candidates. A final set of feature types may be selected from the different sets of feature type candidates based on the scores of the different sets of feature types. The operations may further include training a second machine learning model using a labeled dataset that is generated by applying the final set of feature types to the dataset.

Type: Application

Filed: December 21, 2023

Publication date: June 26, 2025

Applicant: Fujitsu Limited

Inventors: Sou HASEGAWA, Lei LIU, Wei-Peng CHEN

EXPLORATORY OFFLINE GENERATIVE ONLINE MACHINE LEARNING

Publication number: 20240394564

Abstract: A method may include obtaining a set of preliminary tabular datasets and tasks to be performed by preliminary machine-learning (ML) pipelines. The method may further include training a meta-model that predicts performance of ML pipelines in performing the tasks using the preliminary ML pipelines, the preliminary ML pipelines synthesized as different approaches for performing the tasks. The method may also include obtaining a candidate tabular dataset and predicting, using the meta-model, performance of a plurality of candidate ML pipelines for performing the tasks on the candidate tabular dataset. The method may also include selecting a threshold number of top-performing candidates of the plurality of candidate ML pipelines as predicted by the meta-model for training to perform the tasks. In addition, the method may include identifying a top-performing ML pipeline based on performance of the trained top-performing candidates.

Type: Application

Filed: May 25, 2023

Publication date: November 28, 2024

Applicant: Fujitsu Limited

Inventors: Wei-Peng CHEN, Sou HASEGAWA, Mehdi BAHRAMI, Lei LIU
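The two-stage selection described in this abstract (shortlist by predicted performance, then pick the best after training) can be sketched as below. The lookup tables `PREDICTED` and `MEASURED` are hypothetical stand-ins for a meta-model and for real training runs, and the pipeline names and scores are made up for illustration.

```python
import heapq

# Hypothetical scores: what a meta-model predicts versus what training later measures.
PREDICTED = {"random_forest": 0.81, "boosting": 0.85, "knn": 0.60, "svm": 0.78}
MEASURED = {"random_forest": 0.84, "boosting": 0.82, "knn": 0.58, "svm": 0.79}


def shortlist(candidates, predict, k):
    """Keep the k candidates with the highest predicted performance."""
    return heapq.nlargest(k, candidates, key=predict)


def best_after_training(candidates, evaluate):
    """Evaluate each shortlisted candidate and return the best performer."""
    return max(candidates, key=evaluate)


top = shortlist(PREDICTED, PREDICTED.get, k=2)
winner = best_after_training(top, MEASURED.get)
print(top, winner)
```

The point of the design is cost: only the shortlisted candidates are trained, so the predicted ranking does not need to be exactly right, only good enough to keep the eventual winner in the top `k`.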
TRAINING MACHINE LEARNING SYSTEMS USING CUSTOM FEATURE ENGINEERING

Publication number: 20240346244

Abstract: A system may include one or more processors configured to perform one or more operations including obtaining a dataset. The operations may additionally include training a language model to determine relationships between data in data subsets in the obtained dataset. Further operations may include extracting a value and a title from data subsets in the dataset, and determining a question based on the titles, the values, and a target variable. The operations may additionally include sending the question to the language model to obtain a vector. Further, the operations may include determining, based on the vector, an operation that may be performed using the data. The operations may additionally include synthesizing data related to the target variable. In some embodiments, the operations may additionally include adding the synthesized data to one or more data subsets in the dataset and modifying a machine learning pipeline using the dataset.

Type: Application

Filed: April 12, 2023

Publication date: October 17, 2024

Applicant: Fujitsu Limited

Inventors: Lei LIU, Wei-Peng CHEN

SYNTHESIZING ML PIPELINES FOR AUTOMATED PIPELINE RECOMMENDATIONS

Publication number: 20240329949

Abstract: According to an aspect of an embodiment, operations include receiving data comprising tabular datasets and code files. The operations further include generating a task specification corresponding to each dataset and determining data type information for features of each dataset. The operations further include extracting a plurality of API methods from the code files and generating an ML pipeline based on the data type information and the task specification. The operations further include obtaining variations of the ML pipeline based on options associated with at least one ML component and generating a database of pipelines based on the ML pipeline and the variations. The operations further include selecting candidate ML pipelines from the database based on an optimization approach and executing the candidate ML pipelines to evaluate a performance of each candidate pipeline on test data. The operations further include obtaining a training corpus of ML pipelines for pipeline recommendation.

Type: Application

Filed: March 31, 2023

Publication date: October 3, 2024

Applicant: Fujitsu Limited

Inventors: Wei-Peng CHEN, Sou HASEGAWA, Lei LIU, Mehdi BAHRAMI

GENERATING ML PIPELINES USING EXPLORATORY AND GENERATIVE CODE GENERATION TOOLS

Publication number: 20240330753

Abstract: Operations include receiving an input dataset associated with a machine learning (ML) task and generating a first ML pipeline associated with the ML task by executing a code generation tool. The operations further include executing one or more exploratory code generation tools and selecting a pipeline component from the set of pipeline components. Also included are modification of the first ML pipeline based on the selection to generate a second ML pipeline and determination of a first performance metric by executing the first ML pipeline on the input dataset. The operations further include determining a second performance metric by executing the second ML pipeline on the input dataset and controlling an electronic device to render an ML pipeline recommendation as one of the first ML pipeline or the second ML pipeline, based on a comparison of the first performance metric with the second performance metric.

Type: Application

Filed: March 31, 2023

Publication date: October 3, 2024

Applicant: Fujitsu Limited

Inventors: Wei-Peng CHEN, Lei LIU, Mehdi BAHRAMI

Code enrichment through metadata for code synthesis

Patent number: 12093654

Abstract: According to an aspect of an embodiment, operations for code enrichment through metadata for code synthesis are provided. The operations include acquiring package data that include source code files and package metadata. The operations further include extracting additional metadata associated with software package and preparing metadata features based on the package metadata and the additional metadata. The operations further include identifying a set of target portions of a source code included in the source code files and updating one or more source code files using the metadata features. Such files are updated by performing at least one of a revision of existing code comments, and an addition of new code comments for the target portions. The operations further include generating a dataset of natural language (NL) text features and respective code features and training a language model on a sequence-to-sequence generation task.

Type: Grant

Filed: July 24, 2022

Date of Patent: September 17, 2024

Assignee: FUJITSU LIMITED

Inventors: Mehdi Bahrami, Wei-Peng Chen

AUTOMATED EXPLORATORY DATA ANALYSIS (EDA)

Publication number: 20240289420

Abstract: In an embodiment, a statistical analysis tool is applied on a first set of datapoints related to a first variable associated with a dataset. Based on the application of the statistical analysis tool, statistical information related to the first variable is determined. A set of patterns associated with the first set of datapoints is determined, based on the determined statistical information. Thereafter, a first set of predefined templates associated with the determined set of patterns is determined. Further, a natural language model is applied on the retrieved first set of predefined templates and on the determined statistical information. A first textual explanation of the determined set of patterns is determined, based on the application of the natural language model on the retrieved first set of predefined templates and on the determined statistical information. Further, the determined first textual explanation is rendered on a display device.

Type: Application

Filed: February 23, 2023

Publication date: August 29, 2024

Applicant: Fujitsu Limited

Inventors: Mehdi BAHRAMI, Wei-Peng CHEN, Mukul PRASAD
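The flow in this abstract (statistics, then patterns, then templates, then a textual explanation) can be approximated with the standard `statistics` module. This sketch makes several assumptions: the only pattern detected is skew, judged by the gap between mean and median; the template wording is invented; and plain `str.format` stands in for the natural language model the abstract refers to.

```python
import statistics

# Hypothetical predefined templates, one per detectable pattern.
TEMPLATES = {
    "right_skew": "{name} is right-skewed: the mean ({mean:.1f}) exceeds the median ({median:.1f}).",
    "left_skew": "{name} is left-skewed: the mean ({mean:.1f}) is below the median ({median:.1f}).",
    "symmetric": "{name} is roughly symmetric around {mean:.1f}.",
}


def summarize(values):
    """Compute the statistical information for one variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def detect_pattern(stats, tolerance=0.1):
    """Classify skew by comparing the mean-median gap to a fraction of the stdev."""
    gap = stats["mean"] - stats["median"]
    if abs(gap) <= tolerance * stats["stdev"]:
        return "symmetric"
    return "right_skew" if gap > 0 else "left_skew"


def explain(name, values):
    """Pick the template for the detected pattern and fill in the statistics."""
    stats = summarize(values)
    return TEMPLATES[detect_pattern(stats)].format(name=name, **stats)


text = explain("income", [20, 22, 25, 27, 90])
print(text)
```

Here the single outlier (90) pulls the mean well above the median, so the right-skew template is chosen. The `tolerance` threshold is arbitrary and would need tuning for real data.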
Code enrichment for training language models relating to computer programming

Patent number: 12019992

Abstract: According to an aspect of an embodiment, operations for code enrichment for training language models on tasks related to computer programming are provided. The operations include receiving source code data including a computer-executable code and a natural language (NL) text. The operations further include determining blocks of code from the computer-executable code. The operations further include extracting a set of features related to components of the source code data from the blocks of code. The extraction is performed by parsing the blocks of code using Abstract Syntax Tree (AST) data of the blocks of code. The operations further include revising the AST data. The operations further include updating the source code data based on the revised AST data and generating a dataset of NL and abstracted code features as training data based on the updated source code data and further training a language model on a sequence-to-sequence generation task.

Type: Grant

Filed: March 31, 2022

Date of Patent: June 25, 2024

Assignee: FUJITSU LIMITED

Inventors: Mehdi Bahrami, Wei-Peng Chen
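One way to picture "revising the AST data" and regenerating source from it is identifier abstraction, sketched below with Python's `ast.NodeTransformer`. This is an assumed example of such a revision, not the specific one the patent claims. The class name `AbstractNames` and the `VAR_n` naming scheme are hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast

# Hypothetical block of code to abstract.
CODE = """
def area(width, height):
    result = width * height
    return result
"""


class AbstractNames(ast.NodeTransformer):
    """Rename function arguments and assigned variables to VAR_0, VAR_1, ..."""

    def __init__(self):
        self.mapping = {}

    def _alias(self, name):
        # Reuse the existing alias, or mint the next VAR_n for a new name.
        return self.mapping.setdefault(name, f"VAR_{len(self.mapping)}")

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        # Rename known names and newly assigned ones; leave builtins untouched.
        if node.id in self.mapping or isinstance(node.ctx, ast.Store):
            node.id = self._alias(node.id)
        return node


transformer = AbstractNames()
revised_tree = transformer.visit(ast.parse(CODE))
abstracted = ast.unparse(revised_tree)
print(abstracted)
```

Pairing `abstracted` with the function's natural-language description would give one training example of the kind the abstract describes. The function name itself is left unchanged in this sketch.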
DOMAIN-SPECIFIC TEXT LABELLING USING NATURAL LANGUAGE INFERENCE MODEL

Publication number: 20240160852

Abstract: In an embodiment, a set of texts associated with a domain is received. A set of hypothesis statements associated with the domain is received. A pre-trained natural language inference (NLI) model is applied on each of the received set of texts and on each of the received set of hypothesis statements. A second text corpus associated with the domain is generated. The generated second text corpus corresponds to a set of labels associated with the domain. A few-shot learning model is applied on the generated second text corpus to generate a third text corpus associated with the domain. The generated third text corpus is configured to fine-tune the applied pre-trained NLI model, and the fine-tuned NLI model is configured to label an input text associated with the domain. A display of the labelled input text on a display device is controlled.

Type: Application

Filed: November 16, 2022

Publication date: May 16, 2024

Applicant: Fujitsu Limited

Inventors: Wei-Peng CHEN, Mehdi BAHRAMI, Lei LIU

MACHINE LEARNING ALGORITHM SELECTION

Publication number: 20240143702

Abstract: A method of machine learning algorithm selection may include obtaining a dataset that includes multiple data entries. In some embodiments, each of the data entries may include multiple features and one of the multiple features may be designated as a target variable. The method may further include selecting a subset of the data entries. In some embodiments, selecting the subset of the data entries may include binning the data entries into multiple data bins based on values in the target variable and selecting a subset of the binned data entries from each of the multiple data bins as the subset of the data entries. The method may further include constructing multiple machine learning models using the subset of the data entries and selecting one of the multiple machine learning models based on an evaluation of the multiple machine learning models.

Type: Application

Filed: October 31, 2022

Publication date: May 2, 2024

Applicant: Fujitsu Limited

Inventors: Mehdi BAHRAMI, Wei-Peng CHEN, Mukul PRASAD
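The subset-selection step in this abstract (bin entries by target value, then sample from every bin) can be sketched in plain Python. Equal-width bins, a fixed per-bin sample size, and the names `bin_entries` and `sample_from_bins` are assumptions for the example, since the abstract does not specify the binning scheme.

```python
import random
from collections import defaultdict


def bin_entries(entries, target, n_bins):
    """Group entries into equal-width bins over the target variable's range."""
    values = [e[target] for e in entries]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width when all values are equal
    bins = defaultdict(list)
    for e in entries:
        # Clamp so the maximum value lands in the last bin.
        index = min(int((e[target] - lo) / width), n_bins - 1)
        bins[index].append(e)
    return bins


def sample_from_bins(bins, per_bin, seed=0):
    """Draw up to `per_bin` entries from each bin so every target range is represented."""
    rng = random.Random(seed)
    subset = []
    for index in sorted(bins):
        members = bins[index]
        subset.extend(rng.sample(members, min(per_bin, len(members))))
    return subset


# Hypothetical dataset: 100 entries whose target variable is "price".
entries = [{"x": i, "price": float(i)} for i in range(100)]
bins = bin_entries(entries, "price", n_bins=4)
subset = sample_from_bins(bins, per_bin=5)
print(len(subset))
```

The candidate models would then be built on `subset` rather than on the full dataset, and the best one kept after evaluation. That part is omitted here.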
RECOMMENDING VERSION UPDATES FOR SOFTWARE PACKAGES

Publication number: 20240111512

Abstract: According to an aspect of an embodiment, operations for recommending version updates for software packages are provided. The operations may include receiving an input which indicates a usage of a first version of a first software package inside a source code of a software and determining a second version of the first software package. The operations may further include selecting one or more constraints from a set of constraints and executing a set of checks based on the selected constraints to determine a suitability of the second version as an update for the first version. The set of constraints may include a security constraint, a backward compatibility constraint, an interoperability constraint, and a performance constraint. The operations may further include controlling an electronic device to render user-assistive information that includes a recommendation to update the first version to the second version.

Type: Application

Filed: September 30, 2022

Publication date: April 4, 2024

Applicant: Fujitsu Limited

Inventors: Lei LIU, Wei-Peng CHEN

Code retrieval based on multi-class classification

Patent number: 11868731

Abstract: According to an aspect of an embodiment, operations include receiving a set of NL descriptors and a corresponding set of PL codes. The operations further include determining a first vector associated with each NL descriptor and a second vector associated with each PL code, using language models. The operations further include determining a number of a set of semantic code classes to cluster the set of PL codes into the set of semantic code classes, based on the number, the first vector, and the second vector. The operations further include training a multi-class classifier model to predict a semantic code class, from the set of semantic code classes, corresponding to an input NL descriptor. The operations further include selecting an intra-class predictor model based on the predicted semantic code class. The operations further include training the intra-class predictor model to predict a PL code corresponding to the input NL descriptor.

Type: Grant

Filed: March 31, 2022

Date of Patent: January 9, 2024

Assignee: FUJITSU LIMITED

Inventors: Mehdi Bahrami, Wei-Peng Chen

MACHINE LEARNING PIPELINE WITH VISUALIZATIONS

Publication number: 20230316123

Abstract: A method may include obtaining a machine learning (ML) pipeline including a plurality of functional blocks within the ML pipeline. The method may also include using the ML pipeline as an input to a visualization predictor, where the visualization predictor may be trained to output one or more visualization commands based on relationships between the visualization commands and the functional blocks within the pipeline. The method may additionally include invoking the visualization commands to instantiate the ML pipeline with visualizations generated by the one or more visualization commands.

Type: Application

Filed: March 29, 2022

Publication date: October 5, 2023

Applicant: FUJITSU LIMITED

Inventors: Lei LIU, Wei-Peng CHEN
