Patents by Inventor Long Vu
Long Vu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240428130
Abstract: According to a present invention embodiment, a system identifies a plurality of configurations for machine learning models. Each configuration indicates a machine learning model and a corresponding technique to determine parameters for the machine learning model. The plurality of configurations are evaluated by training the machine learning model of the plurality of configurations according to the parameters determined by the corresponding technique. Performance of the machine learning models of the plurality of configurations is monitored, and resources used for evaluating at least one configuration are adjusted based on the performance of the machine learning model for the at least one configuration relative to the performance of the machine learning models of others of the plurality of configurations. Embodiments of the present invention further include a method and computer program product for training machine learning models in substantially the same manner described above.
Type: Application
Filed: June 26, 2023
Publication date: December 26, 2024
Inventors: Long VU, Peter Daniel Kirchner, Radu Marinescu, Dharmashankar Subramanian, Nhan Huu Pham
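As an illustration only, the idea of shifting resources toward better-performing configurations can be sketched with a successive-halving style loop. The abstract does not name this technique; the halving rule, the budget doubling, the `quality` field, and the `evaluate` stand-in for real training are all assumptions made for the example.

```python
def evaluate(config, budget):
    """Hypothetical stand-in for training: the score rises with the budget.

    A real system would train config["model"] with parameters chosen by
    config["tuner"] and report a validation metric instead.
    """
    return config["quality"] * budget / (budget + 1)


def adjust_resources(configs, rounds=3, start_budget=1):
    """Monitor relative performance and move budget toward the leaders."""
    budgets = {c["name"]: start_budget for c in configs}
    active = list(configs)
    for _ in range(rounds):
        # Evaluate every configuration that is still active at its budget.
        scores = {c["name"]: evaluate(c, budgets[c["name"]]) for c in active}
        ranked = sorted(active, key=lambda c: scores[c["name"]], reverse=True)
        # Assumed rule: keep the better half and double its budget.
        active = ranked[: max(1, len(ranked) // 2)]
        for c in active:
            budgets[c["name"]] *= 2
    return active[0]["name"], budgets


# Each configuration pairs a model with a parameter-determination technique.
configs = [
    {"name": "a", "model": "forest", "tuner": "random search", "quality": 0.9},
    {"name": "b", "model": "linear", "tuner": "grid search", "quality": 0.6},
    {"name": "c", "model": "boosting", "tuner": "bayesian", "quality": 0.8},
    {"name": "d", "model": "knn", "tuner": "grid search", "quality": 0.5},
]
best, budgets = adjust_resources(configs)
```

Configurations that fall behind keep their last budget and stop consuming resources, while the leaders receive more in each round.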
-
Publication number: 20240428124
Abstract: Embodiments of the invention are directed to a computer system including a memory communicatively coupled to a processor system. The processor system is operable to perform processor system operations that include using a first machine learning (ML) algorithm to convert to-be-classified-data (TBC-data) from a TBC-data format to a second data format; and extract features from the TBC-data in the second data format. A second ML algorithm is used to perform a task that includes determining, based at least in part on the features of the TBC-data in the second data format, that the TBC-data having the second data format is an outlier.
Type: Application
Filed: June 21, 2023
Publication date: December 26, 2024
Inventors: Long Vu, Peter Daniel Kirchner, Horst Cornelius Samulowitz, Charu C. Aggarwal
-
Publication number: 20240403726
Abstract: Disclosed embodiments may include a system for identifying Markov Decision Process (MDP) solutions. The system may receive input data including one or more first states and one or more first actions. The system may identify, via a machine learning model (MLM), a subset of the input data. The system may formulate, via the MLM, a search space based on the subset of the input data, the search space including one or more second states and one or more second actions. The system may conduct, via the MLM, hyperparameter tuning of the search space. The system may generate, via the MLM, an MDP instance based on the hyperparameter tuning. The system may determine, via the MLM, whether the generated MDP instance includes a first MDP solution.
Type: Application
Filed: June 1, 2023
Publication date: December 5, 2024
Inventors: Long Vu, Alexander Zadorojniy, Dharmashankar Subramanian
-
Patent number: 12066813
Abstract: A relationship between an input, a set-point of a plurality of processes and an output of a corresponding process is learned using machine learning. A regression function is derived for each process based upon historical data. An autoencoder is trained for each process based upon the historical data to form a regularizer and the regression functions and regularizers are merged together into a unified optimization problem. System level optimization is performed using the regression functions and regularizers and a set of optimal set-points of a global optimal solution for operating the processes is determined. An industrial system is operated based on the set of optimal set-points.
Type: Grant
Filed: March 16, 2022
Date of Patent: August 20, 2024
Assignee: International Business Machines Corporation
Inventors: Dzung Tien Phan, Long Vu, Dharmashankar Subramanian
-
Patent number: 11966340
Abstract: To automate time series forecasting machine learning pipeline generation, a data allocation size of time series data may be determined based on one or more characteristics of a time series data set. The time series data may be allocated for use by candidate machine learning pipelines based on the data allocation size. Features for the time series data may be determined and cached by the candidate machine learning pipelines. Predictions of each of the candidate machine learning pipelines using at least the one or more features may be evaluated. A ranked list of machine learning pipelines may be automatically generated from the candidate machine learning pipelines for time series forecasting based upon evaluating predictions of each of the one or more candidate machine learning pipelines.
Type: Grant
Filed: March 15, 2022
Date of Patent: April 23, 2024
Assignee: International Business Machines Corporation
Inventors: Long Vu, Bei Chen, Xuan-Hong Dang, Peter Daniel Kirchner, Syed Yousaf Shah, Dhavalkumar C. Patel, Si Er Han, Ji Hui Yang, Jun Wang, Jing James Xu, Dakuo Wang, Gregory Bramble, Horst Cornelius Samulowitz, Saket K. Sathe, Wesley M. Gifford, Petros Zerfos
-
Patent number: 11868230
Abstract: Computer hardware and/or software that performs the following operations: (i) assessing a performance of a plurality of unsupervised machine learning pipelines against a plurality of data sets; (ii) associating the performance with meta-features corresponding to respective pipeline/data set combinations; (iii) training a supervised meta-learning model using the associated performance and meta-features as training data; and (iv) utilizing the trained model to identify one or more pipelines for processing an input data set.
Type: Grant
Filed: March 11, 2022
Date of Patent: January 9, 2024
Assignee: International Business Machines Corporation
Inventors: Saket K. Sathe, Long Vu, Peter Daniel Kirchner, Horst Cornelius Samulowitz
-
Patent number: 11829799
Abstract: A method, a structure, and a computer system for predicting pipeline training requirements. The exemplary embodiments may include receiving one or more worker node features from one or more worker nodes, extracting one or more pipeline features from one or more pipelines to be trained, and extracting one or more dataset features from one or more datasets used to train the one or more pipelines. The exemplary embodiments may further include predicting an amount of one or more resources required for each of the one or more worker nodes to train the one or more pipelines using the one or more datasets based on one or more models that correlate the one or more worker node features, one or more pipeline features, and one or more dataset features with the one or more resources. Lastly, the exemplary embodiments may include identifying a worker node requiring a least amount of the one or more resources of the one or more worker nodes for training the one or more pipelines.
Type: Grant
Filed: October 13, 2020
Date of Patent: November 28, 2023
Assignee: International Business Machines Corporation
Inventors: Saket Sathe, Gregory Bramble, Long Vu, Theodoros Salonidis
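A minimal sketch of the selection step follows, assuming a single resource (training time) and a hand-written cost formula in place of the learned models the abstract refers to. The `Worker` fields, `predict_train_seconds`, and its constants are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical worker node features."""
    name: str
    cpu_cores: int
    rows_per_core_per_second: float


def predict_train_seconds(worker, pipeline_complexity, dataset_rows):
    """Assumed stand-in for a learned model that maps worker, pipeline,
    and dataset features to a predicted resource amount (seconds here)."""
    work = pipeline_complexity * dataset_rows
    return work / (worker.cpu_cores * worker.rows_per_core_per_second)


def pick_worker(workers, pipeline_complexity, dataset_rows):
    """Return the worker with the smallest predicted resource requirement."""
    return min(
        workers,
        key=lambda w: predict_train_seconds(w, pipeline_complexity, dataset_rows),
    )


workers = [
    Worker("small", cpu_cores=2, rows_per_core_per_second=1000.0),
    Worker("large", cpu_cores=8, rows_per_core_per_second=1000.0),
]
chosen = pick_worker(workers, pipeline_complexity=3.0, dataset_rows=40_000)
```

In practice the prediction function would be fitted on records of past training runs; only the final argmin over worker nodes is meant to mirror the abstract.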
-
Publication number: 20230342627
Abstract: Predefined pipelines may be created with predefined meta-features. Time series data may be segmented using lookback window parameters. Meta-features may be determined for windowed data. Those of the predefined pipelines having a maximum amount of matching predefined meta-features may be determined. Those of the lookback window parameters that result in the windowed data having the meta-features most similar to the meta-features of one or more of the plurality of predefined pipelines may be identified.
Type: Application
Filed: April 22, 2022
Publication date: October 26, 2023
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Long VU, Saket K SATHE, Peter Daniel KIRCHNER, Gregory BRAMBLE
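The matching of lookback windows to predefined pipelines might look roughly like the sketch below. The two meta-features (average within-window standard deviation and average within-window range) and the Euclidean distance are assumptions picked for the example; the abstract does not say which meta-features or similarity measure are used.

```python
from statistics import mean, pstdev


def windows(series, lookback):
    """Segment the series into overlapping windows of length `lookback`."""
    return [series[i:i + lookback] for i in range(len(series) - lookback + 1)]


def meta_features(series, lookback):
    """Assumed meta-features of the windowed data: mean spread and mean range."""
    ws = windows(series, lookback)
    return (mean(pstdev(w) for w in ws), mean(max(w) - min(w) for w in ws))


def distance(a, b):
    """Euclidean distance between two meta-feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def best_lookback(series, candidates, pipeline_meta):
    """Pick the lookback whose windowed meta-features lie closest to the
    meta-features of any predefined pipeline."""
    def gap(lookback):
        mf = meta_features(series, lookback)
        return min(distance(mf, p) for p in pipeline_meta.values())
    return min(candidates, key=gap)


series = list(range(10))
# Hypothetical predefined pipeline with its predefined meta-features.
pipeline_meta = {"pipeline_1": (1.1, 3.0)}
chosen = best_lookback(series, [2, 4], pipeline_meta)
```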
-
Publication number: 20230316150
Abstract: A method includes training, by one or more processing devices, a plurality of machine learning predictive models, thereby generating a plurality of trained machine learning predictive models. The method further includes generating, by the one or more processing devices, a solved machine learning optimization model, based at least in part on the plurality of trained machine learning predictive models. The method further includes outputting, by the one or more processing devices, one or more control input and predicted outputs based at least in part on the solved machine learning optimization model.
Type: Application
Filed: March 30, 2022
Publication date: October 5, 2023
Inventors: Dzung Tien Phan, Long Vu, Lam Minh Nguyen, Dharmashankar Subramanian
-
Publication number: 20230297073
Abstract: A relationship between an input, a set-point of a plurality of processes and an output of a corresponding process is learned using machine learning. A regression function is derived for each process based upon historical data. An autoencoder is trained for each process based upon the historical data to form a regularizer and the regression functions and regularizers are merged together into a unified optimization problem. System level optimization is performed using the regression functions and regularizers and a set of optimal set-points of a global optimal solution for operating the processes is determined. An industrial system is operated based on the set of optimal set-points.
Type: Application
Filed: March 16, 2022
Publication date: September 21, 2023
Inventors: Dzung Tien Phan, Long VU, Dharmashankar Subramanian
-
Publication number: 20230289277
Abstract: Computer hardware and/or software that performs the following operations: (i) assessing a performance of a plurality of unsupervised machine learning pipelines against a plurality of data sets; (ii) associating the performance with meta-features corresponding to respective pipeline/data set combinations; (iii) training a supervised meta-learning model using the associated performance and meta-features as training data; and (iv) utilizing the trained model to identify one or more pipelines for processing an input data set.
Type: Application
Filed: March 11, 2022
Publication date: September 14, 2023
Inventors: Saket K. Sathe, Long VU, Peter Daniel Kirchner, Horst Cornelius Samulowitz
-
Publication number: 20230237385
Abstract: A computer-implemented method for configuring a plurality of machine learning pipelines into a machine learning pipeline ensemble is disclosed. The computer-implemented method includes determining, by a reinforcement learning agent coupled to a machine learning pipeline, performance information of the machine learning pipeline. The computer-implemented method further includes receiving, by the reinforcement learning agent, configuration parameter values of uncoupled machine learning pipelines of the plurality of machine learning pipelines. The computer-implemented method further includes adjusting, by the reinforcement learning agent, configuration parameter values of the machine learning pipeline based on the performance information of the machine learning pipeline and the configuration parameter values of the uncoupled machine learning pipelines.
Type: Application
Filed: January 25, 2022
Publication date: July 27, 2023
Inventors: Lan Ngoc Hoang, Long Vu
-
Patent number: 11688111
Abstract: Systems, computer-implemented methods, and computer program products to facilitate visualization of a model selection process are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise an interaction backend handler component that obtains one or more assessment metrics of a model pipeline candidate. The computer executable components can further comprise a visualization render component that renders a progress visualization of the model pipeline candidate based on the one or more assessment metrics.
Type: Grant
Filed: July 29, 2020
Date of Patent: June 27, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Dakuo Wang, Bei Chen, Ji Hui Yang, Abel Valente, Arunima Chaudhary, Chuang Gan, John Dillon Eversman, Voranouth Supadulya, Daniel Karl I. Weidele, Jun Wang, Jing James Xu, Dhavalkumar C. Patel, Long Vu, Syed Yousaf Shah, Si Er Han
-
Publication number: 20230177387
Abstract: A method, system, and computer program product for a metalearner for automated machine learning are provided. The method receives a labeled data set. A set of data subsets is generated from the labeled data set. A set of unsupervised machine learning pipelines is generated. A training set is generated from the set of data subsets and the set of unsupervised machine learning pipelines. The method trains a metalearner for unsupervised tasks based on the training set.
Type: Application
Filed: December 8, 2021
Publication date: June 8, 2023
Inventors: Saket Sathe, Long Vu, Peter Daniel Kirchner, Charu C. Aggarwal
-
Patent number: 11620582
Abstract: Techniques regarding one or more automated machine learning processes that analyze time series data are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a time series analysis component that selects a machine learning pipeline for meta transfer learning on time series data by sequentially allocating subsets of training data from the time series data amongst a plurality of machine learning pipeline candidates.
Type: Grant
Filed: July 29, 2020
Date of Patent: April 4, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Bei Chen, Long Vu, Syed Yousaf Shah, Xuan-Hong Dang, Peter Daniel Kirchner, Si Er Han, Ji Hui Yang, Jun Wang, Jing James Xu, Dakuo Wang, Dhavalkumar C. Patel, Gregory Bramble, Horst Cornelius Samulowitz, Saket Sathe, Chuang Gan
-
Publication number: 20220358388
Abstract: Methods and systems for generating an environment include training transformer models from tabular data and relationship information about the training data. A directed acyclic graph is generated, that includes the transformer models as nodes. The directed acyclic graph is traversed to identify a subset of transformers that are combined in order. An environment is generated using the subset of transformers.
Type: Application
Filed: May 10, 2021
Publication date: November 10, 2022
Inventors: Long Vu, Dharmashankar Subramanian, Peter Daniel Kirchner, Eliezer Segev Wasserkrug, Lan Ngoc Hoang, Alexander Zadorojniy
-
Publication number: 20220343207
Abstract: In a method for ranking machine learning (ML) pipelines for a dataset, a processor receives first performance curves predicted by a meta learner model for a plurality of ML pipelines. A processor allocates a first subset of data points from the dataset to each of the plurality of ML pipelines. A processor receives first performance scores for each of the ML pipelines for the first subset of data points. A processor updates the meta learner model using the first performance scores. A processor receives second performance curves from the meta learner model updated with the first performance scores. A processor ranks the plurality of ML pipelines based on the second performance curves.
Type: Application
Filed: April 22, 2021
Publication date: October 27, 2022
Inventors: Long Vu, Saket Sathe, Bei Chen, Peter Daniel Kirchner
-
Publication number: 20220327058
Abstract: To automate time series forecasting machine learning pipeline generation, a data allocation size of time series data may be determined based on one or more characteristics of a time series data set. The time series data may be allocated for use by candidate machine learning pipelines based on the data allocation size. Features for the time series data may be determined and cached by the candidate machine learning pipelines. Predictions of each of the candidate machine learning pipelines using at least the one or more features may be evaluated. A ranked list of machine learning pipelines may be automatically generated from the candidate machine learning pipelines for time series forecasting based upon evaluating predictions of each of the one or more candidate machine learning pipelines.
Type: Application
Filed: March 15, 2022
Publication date: October 13, 2022
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Long VU, Bei CHEN, Xuan-Hong DANG, Peter Daniel KIRCHNER, Syed Yousaf SHAH, Dhavalkumar C. PATEL, Si Er HAN, Ji Hui YANG, Jun WANG, Jing James XU, Dakuo WANG, Gregory BRAMBLE, Horst Cornelius SAMULOWITZ, Saket K. SATHE, Wesley M. GIFFORD, Petros ZERFOS
-
Publication number: 20220261598
Abstract: To rank time series forecasting in machine learning pipelines, time series data may be incrementally allocated from a time series data set for testing by candidate machine learning pipelines based on seasonality or a degree of temporal dependence of the time series data. Intermediate evaluation scores may be provided by each of the candidate machine learning pipelines following each time series data allocation. One or more machine learning pipelines may be automatically selected from a ranked list of the one or more candidate machine learning pipelines based on a projected learning curve generated from the intermediate evaluation scores.
Type: Application
Filed: October 26, 2021
Publication date: August 18, 2022
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Bei CHEN, Long VU, Dhavalkumar C. PATEL, Syed Yousaf SHAH, Gregory BRAMBLE, Peter Daniel KIRCHNER, Horst Cornelius SAMULOWITZ, Xuan-Hong DANG, Petros ZERFOS
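One way to picture ranking by a projected learning curve is the sketch below. It assumes the allocation sizes are already given (the seasonality-based allocation is not modelled) and uses a straight-line least-squares fit as the projection, which is a simplification chosen for the example rather than the method of the application. The pipeline names and scores are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project_score(sizes, scores, full_size):
    """Extrapolate the intermediate scores to the full data size."""
    slope, intercept = fit_line(sizes, scores)
    return slope * full_size + intercept


def rank_pipelines(intermediate, sizes, full_size):
    """Rank pipelines by projected score, best first."""
    projected = {
        name: project_score(sizes, scores, full_size)
        for name, scores in intermediate.items()
    }
    return sorted(projected, key=projected.get, reverse=True)


# Hypothetical intermediate scores after allocating 100, 200, and 300 points.
sizes = [100, 200, 300]
intermediate = {
    "fast_start": [0.70, 0.71, 0.72],
    "slow_start": [0.60, 0.66, 0.72],
}
ranking = rank_pipelines(intermediate, sizes, full_size=1000)
```

A linear projection can overshoot the valid score range, so a saturating curve would be a more realistic choice; the point here is only that a pipeline which is still improving quickly can outrank one that currently scores the same.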
-
Patent number: 11416469
Abstract: In an approach to unsupervised feature learning for relational data, a computer trains one or more entity aware autoencoders on one or more tables in a relational database, where each of the one or more entity aware autoencoders corresponds to one of the one or more tables in the relational database, and where each of the one or more entity aware autoencoders are comprised of an encoder and a decoder. A computer transforms each of the one or more tables in the relational database with the encoder of the corresponding trained entity aware autoencoder. A computer joins a first transformed table of the one or more tables in the relational database with each remaining one or more transformed tables in the relational database to form one or more joined tables. A computer aggregates the one or more joined tables. A computer outputs one or more feature representations.
Type: Grant
Filed: November 24, 2020
Date of Patent: August 16, 2022
Assignee: International Business Machines Corporation
Inventors: Thanh Lam Hoang, Long Vu, Theodoros Salonidis, Gregory Bramble