Patents by Inventor Long Vu

Long Vu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

DISTANCE-BASED LOGIT VALUES FOR NATURAL LANGUAGE PROCESSING

Publication number: 20250117591

Abstract: Techniques are disclosed for using logit values to classify utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with a particular class.

Type: Application

Filed: December 19, 2024

Publication date: April 10, 2025

Applicant: Oracle International Corporation

Inventors: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson
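
Illustration (not taken from the patent): the abstract above does not define the modified logit function or the enhanced activation function, so the sketch below only shows the general shape of the idea under loud assumptions. It assumes each binary classifier owns a class centroid, that the logit shrinks as the utterance embedding moves away from that centroid, and that a plain sigmoid stands in for the activation. The names `distance_logit`, `classify`, the intent labels, and the `bias` and `threshold` values are all hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple


def distance_logit(embedding: Sequence[float], centroid: Sequence[float],
                   scale: float = 1.0, bias: float = 2.0) -> float:
    """Hypothetical distance-based logit: larger when the embedding is near the centroid."""
    return bias - scale * math.dist(embedding, centroid)


def sigmoid(z: float) -> float:
    """Plain sigmoid, used here as a stand-in for the unspecified activation."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(embedding: Sequence[float],
             centroids: Dict[str, Sequence[float]],
             threshold: float = 0.5) -> Tuple[str, Dict[str, float]]:
    """Score the embedding with one binary classifier per class and pick the best class.

    Returns "unresolved" when no classifier is confident enough (an assumption,
    not something the abstract states).
    """
    scores = {label: sigmoid(distance_logit(embedding, c)) for label, c in centroids.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else "unresolved"), scores


# Toy 2-D "embeddings"; real systems would use a sentence encoder.
CENTROIDS = {"order_pizza": (1.0, 0.0), "check_balance": (0.0, 1.0)}
near_label, _ = classify((0.9, 0.1), CENTROIDS)
far_label, _ = classify((5.0, 5.0), CENTROIDS)
```

One consequence of scoring by distance is that an input far from every centroid receives a low score from every classifier, which gives a natural way to flag out-of-scope utterances.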

SYSTEM AND TECHNIQUES FOR HANDLING LONG TEXT FOR PRE-TRAINED LANGUAGE MODELS

Publication number: 20250117585

Abstract: In some aspects, a computing device may receive, at a data processing system, a set of utterances for training or inferencing with a named entity recognizer to assign a label to each token piece from the set of utterances. The computing device may determine a length of each utterance in the set and when the length of the utterance exceeds a pre-determined threshold of token pieces: dividing the utterance into a plurality of overlapping chunks of token pieces; assigning a label together with a confidence score for each token piece in a chunk; determining a final label and an associated confidence score for each chunk of token pieces by merging two confidence scores; determining a final annotated label for the utterance based at least on the merging of the two confidence scores; and storing the final annotated label in a memory.

Type: Application

Filed: December 19, 2024

Publication date: April 10, 2025

Applicant: Oracle International Corporation

Inventors: Thanh Tien Vu, Tuyen Quang Pham, Mark Edward Johnson, Thanh Long Duong, Ying Xu, Poorya Zaremoodi, Omid Mohamad Nezami, Budhaditya Saha, Cong Duy Vu Hoang
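
Illustration (not taken from the patent): the chunk-and-merge idea in the abstract above can be sketched as follows. The abstract does not say how the two confidence scores are merged, so this sketch assumes the simplest rule: where two chunks overlap, keep the label with the higher confidence. The function names, the window size, and the overlap size are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Prediction = Tuple[str, float]  # (label, confidence) for one token piece


def split_into_chunks(tokens: Sequence[str], max_len: int,
                      overlap: int) -> List[Tuple[int, Sequence[str]]]:
    """Split tokens into overlapping windows; returns (start_index, chunk) pairs."""
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be non-negative and smaller than max_len")
    if len(tokens) <= max_len:
        return [(0, tokens)]  # short utterances are left whole
    step = max_len - overlap
    chunks = []
    start = 0
    while True:
        chunks.append((start, tokens[start:start + max_len]))
        if start + max_len >= len(tokens):
            break
        start += step
    return chunks


def merge_chunk_labels(n_tokens: int,
                       chunk_preds: List[Tuple[int, List[Prediction]]]
                       ) -> List[Optional[Prediction]]:
    """Merge per-chunk predictions; in overlaps, the higher confidence wins (assumed rule)."""
    merged: List[Optional[Prediction]] = [None] * n_tokens
    for start, preds in chunk_preds:
        for offset, (label, conf) in enumerate(preds):
            i = start + offset
            if merged[i] is None or conf > merged[i][1]:
                merged[i] = (label, conf)
    return merged


tokens = "a b c d e f g".split()
chunks = split_into_chunks(tokens, max_len=4, overlap=2)
```

Each chunk would be passed to the named entity recognizer separately, and `merge_chunk_labels` would then reconcile the per-chunk outputs into one label sequence for the full utterance.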

MANAGING AMBIGUOUS DATE MENTIONS IN TRANSFORMING NATURAL LANGUAGE TO A LOGICAL FORM

Publication number: 20250095635

Abstract: Techniques are disclosed herein for managing ambiguous date mentions in natural language utterances in transforming natural language utterances to logical forms by encoding the uncertainties of the ambiguous date mentions and including the encoded uncertainties in the logical forms. In a training phase, training examples including natural language utterances, logical forms, and database schema information are automatically augmented and used to train a machine learning model to convert natural language utterances to logical form. In an inference phase, input database schema information is augmented and used by the trained machine learning model to convert an input natural language utterance to logical form.

Type: Application

Filed: May 6, 2024

Publication date: March 20, 2025

Applicant: Oracle International Corporation

Inventors: Gioacchino Tangari, Cong Duy Vu Hoang, Stephen Andrew McRitchie, Steve Wai-Chun Siu, Dalu Guo, Christopher Mark Broadbent, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Kenneth Khiaw Hong Eng, Chandan Basavaraju

MANAGING DATE-TIME INTERVALS IN TRANSFORMING NATURAL LANGUAGE TO A LOGICAL FORM

Publication number: 20250094737

Abstract: Techniques are disclosed herein for managing date-time intervals in transforming natural language utterances to logical forms by providing an enhanced grammar, a natural language utterance comprising a date-time interval, and database schema information to a machine learning model that has been trained to convert natural language utterances to logical forms; and using the machine learning model to convert the natural language utterance to an output logical form, wherein the output logical form comprises at least one of the date-time interval and an extraction function for extracting date-time information corresponding to the date-time interval from at least one date-time attribute of the database schema information.

Type: Application

Filed: August 5, 2024

Publication date: March 20, 2025

Applicant: Oracle International Corporation

Inventors: Gioacchino Tangari, Cong Duy Vu Hoang, Dalu Guo, Steve Wai-Chun Siu, Stephen Andrew McRitchie, Christopher Mark Broadbent, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Chandan Basavaraju, Kenneth Khiaw Hong Eng

MODEL SEARCH AND OPTIMIZATION

Publication number: 20250068902

Abstract: Methods and systems for tuning a model include generating pipelines. The pipelines have elements that include at least an agent, a foundation model, and a tuning type. Hyperparameters of elements of the pipelines are set in accordance with an input task. Elements of the pipelines are tuned in accordance with the input task. The input task is performed using a highest-performance pipeline.

Type: Application

Filed: August 22, 2023

Publication date: February 27, 2025

Inventors: Long Vu, Dharmashankar Subramanian, Radu Marinescu

TECHNIQUES FOR TRANSFORMING NATURAL LANGUAGE CONVERSATION INTO A VISUALIZATION REPRESENTATION

Publication number: 20250068627

Abstract: Techniques are disclosed herein for transforming natural language conversations into a visual output. In one aspect, a computer-implemented method includes generating an input string by concatenating a natural language utterance with a schema representation comprising a set of entities for visualization actions, generating, by a first encoder of a machine learning model, one or more embeddings of the input string, encoding, by a second encoder of the machine learning model, relations between elements in the schema representation and words in the natural language utterance based on the one or more embeddings, generating, by a grammar-based decoder of the machine learning model and based on the encoded relations and the one or more embeddings, an intermediate logical form that represents at least the query, the one or more visualization actions, or the combination thereof, and generating, based on the intermediate logical form, a command for a computing system.

Type: Application

Filed: March 26, 2024

Publication date: February 27, 2025

Applicant: Oracle International Corporation

Inventors: Cong Duy Vu Hoang, Gioacchino Tangari, Stephen Andrew McRitchie, Nitika Mathur, Aashna Devang Kanuga, Steve Wai-Chun Siu, Dalu Guo, Chang Xu, Mark Edward Johnson, Christopher Mark Broadbent, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Chandan Basavaraju, Kenneth Khiaw Hong Eng

TECHNIQUES FOR MANUFACTURING TRAINING DATA TO TRANSFORM NATURAL LANGUAGE INTO A VISUALIZATION REPRESENTATION

Publication number: 20250068626

Abstract: The present disclosure relates to manufacturing training data by leveraging an automated pipeline that manufactures visualization training datasets to train a machine learning model to convert a natural language utterance into meaning representation language logical form that includes one or more visualization actions. Aspects are directed towards accessing an original training dataset, a visualization query dataset, an incremental visualization dataset, a manipulation visualization dataset, or any combination thereof. One or more visualization training datasets are generated by: (i) modifying examples in the original training dataset, the visualization query dataset, or both to include visualization actions, (ii) generating examples, using the incremental visualization dataset, the manipulation visualization dataset, or both, that include visualization actions, or (iii) both (i) and (ii).

Type: Application

Filed: March 1, 2024

Publication date: February 27, 2025

Applicant: Oracle International Corporation

Inventors: Gioacchino Tangari, Steve Wai-Chun Siu, Dalu Guo, Cong Duy Vu Hoang, Berk Sarioz, Chang Xu, Stephen Andrew McRitchie, Mark Edward Johnson, Christopher Mark Broadbent, Thanh Long Duong, Srinivasa Phani Kumar Gadde, Vishal Vishnoi, Chandan Basavaraju, Kenneth Khiaw Hong Eng

AUTOMATIC DECOMPOSITION METHOD FOR MDP

Publication number: 20250045608

Abstract: A method for Markov Decision Process (“MDP”) decomposition includes receiving data elements for a problem that include finite state data for a set of state variables and a finite set of actions. A portion of the state data corresponding to state variables represents states. The method includes creating two or more sub-MDPs. Each sub-MDP includes a portion of the set of state variables, the set of actions and a same reward function. The method includes executing each sub-MDP. Results include a policy and an expected reward from the reward function. The policy of the sub-MDP maps states of the sub-MDP to actions. The method includes aggregating, based on the expected rewards of the results, the actions of the policies of the sub-MDPs to create a resultant policy with resultant actions and generating, using state entries for the set of state variables, results to the problem based on the resultant policy.

Type: Application

Filed: August 3, 2023

Publication date: February 6, 2025

Inventors: Alexander Zadorojniy, Long Vu, Dharmashankar Subramanian

Distance-based logit value for natural language processing

Patent number: 12210842

Abstract: Techniques are disclosed for using logit values to classify utterances and messages input to chatbot systems in natural language processing. A method can include a chatbot system receiving an utterance generated by a user interacting with the chatbot system. The chatbot system can input the utterance into a machine-learning model including a set of binary classifiers. Each binary classifier of the set of binary classifiers can be associated with a modified logit function. The method can also include the machine-learning model using the modified logit function to generate a set of distance-based logit values for the utterance. The method can also include the machine-learning model applying an enhanced activation function to the set of distance-based logit values to generate a predicted output. The method can also include the chatbot system classifying, based on the predicted output, the utterance as being associated with a particular class.

Type: Grant

Filed: December 19, 2023

Date of Patent: January 28, 2025

Assignee: Oracle International Corporation

Inventors: Ying Xu, Poorya Zaremoodi, Thanh Tien Vu, Cong Duy Vu Hoang, Vladislav Blinov, Yu-Heng Hong, Yakupitiyage Don Thanuja Samodhye Dharmasiri, Vishal Vishnoi, Elias Luqman Jalaluddin, Manish Parekh, Thanh Long Duong, Mark Edward Johnson

System and techniques for handling long text for pre-trained language models

Patent number: 12210830

Abstract: In some aspects, a computing device may receive, at a data processing system, a set of utterances for training or inferencing with a named entity recognizer to assign a label to each token piece from the set of utterances. The computing device may determine a length of each utterance in the set and when the length of the utterance exceeds a pre-determined threshold of token pieces: dividing the utterance into a plurality of overlapping chunks of token pieces; assigning a label together with a confidence score for each token piece in a chunk; determining a final label and an associated confidence score for each chunk of token pieces by merging two confidence scores; determining a final annotated label for the utterance based at least on the merging of the two confidence scores; and storing the final annotated label in a memory.

Type: Grant

Filed: May 20, 2022

Date of Patent: January 28, 2025

Assignee: Oracle International Corporation

Inventors: Thanh Tien Vu, Tuyen Quang Pham, Mark Edward Johnson, Thanh Long Duong, Ying Xu, Poorya Zaremoodi, Omid Mohamad Nezami, Budhaditya Saha, Cong Duy Vu Hoang

REINFORCEMENT MACHINE LEARNING WITH MULTI-LEVEL AGENT SEARCH AND HYPERPARAMETER OPTIMIZATION

Publication number: 20240428130

Abstract: According to a present invention embodiment, a system identifies a plurality of configurations for machine learning models. Each configuration indicates a machine learning model and a corresponding technique to determine parameters for the machine learning model. The plurality of configurations are evaluated by training the machine learning model of the plurality of configurations according to the parameters determined by the corresponding technique. Performance of the machine learning models of the plurality of configurations is monitored, and resources used for evaluating at least one configuration are adjusted based on the performance of the machine learning model for the at least one configuration relative to the performance of the machine learning models of others of the plurality of configurations. Embodiments of the present invention further include a method and computer program product for training machine learning models in substantially the same manner described above.

Type: Application

Filed: June 26, 2023

Publication date: December 26, 2024

Inventors: Long Vu, Peter Daniel Kirchner, Radu Marinescu, Dharmashankar Subramanian, Nhan Huu Pham

OUTLIER DETECTION WITH TRANSFER LEARNING

Publication number: 20240428124

Abstract: Embodiments of the invention are directed to a computer system including a memory communicatively coupled to a processor system. The processor system is operable to perform processor system operations that include using a first machine learning (ML) algorithm to convert to-be-classified-data (TBC-data) from a TBC-data format to a second data format; and extract features from the TBC-data in the second data format. A second ML algorithm is used to perform a task that includes determining, based at least in part on the features of the TBC-data in the second data format, that the TBC-data having the second data format is an outlier.

Type: Application

Filed: June 21, 2023

Publication date: December 26, 2024

Inventors: Long Vu, Peter Daniel Kirchner, Horst Cornelius Samulowitz, Charu C. Aggarwal

SYSTEMS AND METHODS FOR IDENTIFYING MARKOV DECISION PROCESS SOLUTIONS

Publication number: 20240403726

Abstract: Disclosed embodiments may include a system for identifying Markov Decision Process (MDP) solutions. The system may receive input data including one or more first states and one or more first actions. The system may identify, via a machine learning model (MLM), a subset of the input data. The system may formulate, via the MLM, a search space based on the subset of the input data, the search space including one or more second states and one or more second actions. The system may conduct, via the MLM, hyperparameter tuning of the search space. The system may generate, via the MLM, an MDP instance based on the hyperparameter tuning. The system may determine, via the MLM, whether the generated MDP instance includes a first MDP solution.

Type: Application

Filed: June 1, 2023

Publication date: December 5, 2024

Inventors: Long Vu, Alexander Zadorojniy, Dharmashankar Subramanian

Prediction and operational efficiency for system-wide optimization of an industrial processing system

Patent number: 12066813

Abstract: A relationship between an input, a set-point of a plurality of processes and an output of a corresponding process is learned using machine learning. A regression function is derived for each process based upon historical data. An autoencoder is trained for each process based upon the historical data to form a regularizer and the regression functions and regularizers are merged together into a unified optimization problem. System level optimization is performed using the regression functions and regularizers and a set of optimal set-points of a global optimal solution for operating the processes is determined. An industrial system is operated based on the set of optimal set-points.

Type: Grant

Filed: March 16, 2022

Date of Patent: August 20, 2024

Assignee: International Business Machines Corporation

Inventors: Dzung Tien Phan, Long Vu, Dharmashankar Subramanian

Automated time series forecasting pipeline generation

Patent number: 11966340

Abstract: To automate time series forecasting machine learning pipeline generation, a data allocation size of time series data may be determined based on one or more characteristics of a time series data set. The time series data may be allocated for use by candidate machine learning pipelines based on the data allocation size. Features for the time series data may be determined and cached by the candidate machine learning pipelines. Predictions of each of the candidate machine learning pipelines using at least the one or more features may be evaluated. A ranked list of machine learning pipelines may be automatically generated from the candidate machine learning pipelines for time series forecasting based upon evaluating predictions of each of the one or more candidate machine learning pipelines.

Type: Grant

Filed: March 15, 2022

Date of Patent: April 23, 2024

Assignee: International Business Machines Corporation

Inventors: Long Vu, Bei Chen, Xuan-Hong Dang, Peter Daniel Kirchner, Syed Yousaf Shah, Dhavalkumar C. Patel, Si Er Han, Ji Hui Yang, Jun Wang, Jing James Xu, Dakuo Wang, Gregory Bramble, Horst Cornelius Samulowitz, Saket K. Sathe, Wesley M. Gifford, Petros Zerfos
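
Illustration (not taken from the patent): the final step of the abstract above, producing a ranked list of candidate pipelines from their evaluated predictions, can be sketched with three toy forecasters. The forecasters, the holdout split, and the use of mean absolute error as the ranking metric are assumptions made for this sketch; the abstract does not name a metric or specific pipelines, and the data-allocation and feature-caching steps are not modeled here.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Forecaster = Callable[[Sequence[float], int], List[float]]


def last_value(history: Sequence[float], horizon: int) -> List[float]:
    """Repeat the most recent observation."""
    return [history[-1]] * horizon


def mean_value(history: Sequence[float], horizon: int) -> List[float]:
    """Repeat the historical mean."""
    return [mean(history)] * horizon


def drift(history: Sequence[float], horizon: int) -> List[float]:
    """Extend the straight line through the first and last points (needs >= 2 points)."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]


def rank_pipelines(series: Sequence[float], pipelines: Dict[str, Forecaster],
                   holdout: int) -> List[Tuple[str, float]]:
    """Score each candidate on the last `holdout` points and sort by error, best first."""
    train, test = series[:-holdout], series[-holdout:]
    scores = []
    for name, forecast in pipelines.items():
        preds = forecast(train, holdout)
        mae = mean(abs(p - t) for p, t in zip(preds, test))
        scores.append((name, mae))
    return sorted(scores, key=lambda item: item[1])


CANDIDATES = {"last_value": last_value, "mean_value": mean_value, "drift": drift}
ranking = rank_pipelines([1, 2, 3, 4, 5, 6, 7, 8], CANDIDATES, holdout=2)
```

On a perfectly linear series such as the one above, the drift forecaster ranks first because it extrapolates the trend exactly.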

Automated unsupervised machine learning utilizing meta-learning

Patent number: 11868230

Abstract: Computer hardware and/or software that performs the following operations: (i) assessing a performance of a plurality of unsupervised machine learning pipelines against a plurality of data sets; (ii) associating the performance with meta-features corresponding to respective pipeline/data set combinations; (iii) training a supervised meta-learning model using the associated performance and meta-features as training data; and (iv) utilizing the trained model to identify one or more pipelines for processing an input data set.

Type: Grant

Filed: March 11, 2022

Date of Patent: January 9, 2024

Assignee: International Business Machines Corporation

Inventors: Saket K. Sathe, Long Vu, Peter Daniel Kirchner, Horst Cornelius Samulowitz

Distributed resource-aware training of machine learning pipelines

Patent number: 11829799

Abstract: A method, a structure, and a computer system for predicting pipeline training requirements. The exemplary embodiments may include receiving one or more worker node features from one or more worker nodes, extracting one or more pipeline features from one or more pipelines to be trained, and extracting one or more dataset features from one or more datasets used to train the one or more pipelines. The exemplary embodiments may further include predicting an amount of one or more resources required for each of the one or more worker nodes to train the one or more pipelines using the one or more datasets based on one or more models that correlate the one or more worker node features, one or more pipeline features, and one or more dataset features with the one or more resources. Lastly, the exemplary embodiments may include identifying a worker node requiring a least amount of the one or more resources of the one or more worker nodes for training the one or more pipelines.

Type: Grant

Filed: October 13, 2020

Date of Patent: November 28, 2023

Assignee: International Business Machines Corporation

Inventors: Saket Sathe, Gregory Bramble, Long Vu, Theodoros Salonidis

AUTOMATED LOOKBACK WINDOW SEARCHING

Publication number: 20230342627

Abstract: Predefined pipelines may be created with predefined meta-features. Time series data may be segmented using lookback window parameters. Meta-features may be determined for windowed data. Those of the predefined pipelines having a maximum amount of matching predefined meta-features may be determined. Those of the lookback window parameters that result in the windowed data having the meta-features most similar to the meta-features of one or more of the plurality of predefined pipelines may be identified.

Type: Application

Filed: April 22, 2022

Publication date: October 26, 2023

Applicant: International Business Machines Corporation

Inventors: Long Vu, Saket K. Sathe, Peter Daniel Kirchner, Gregory Bramble
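
Illustration (not taken from the patent): the lookback-window search in the abstract above can be sketched as a small loop over candidate window lengths. The two meta-features (average within-window standard deviation and average within-window range), the Euclidean distance used as the similarity measure, and the pipeline profile names are assumptions made for this sketch; the abstract does not specify any of them.

```python
import math
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple


def make_windows(series: Sequence[float], lookback: int) -> List[Sequence[float]]:
    """Segment the series into sliding windows of length `lookback`."""
    return [series[i:i + lookback] for i in range(len(series) - lookback + 1)]


def meta_features(windows: List[Sequence[float]]) -> Tuple[float, float]:
    """Two toy meta-features: mean within-window std and mean within-window range."""
    return (mean(pstdev(w) for w in windows),
            mean(max(w) - min(w) for w in windows))


def best_lookback(series: Sequence[float], candidates: Sequence[int],
                  pipeline_profiles: Dict[str, Tuple[float, float]]) -> Tuple[int, str]:
    """Return the (lookback, pipeline) pair whose meta-features are closest."""
    best = None
    for lookback in candidates:
        feats = meta_features(make_windows(series, lookback))
        for name, profile in pipeline_profiles.items():
            distance = math.dist(feats, profile)
            if best is None or distance < best[0]:
                best = (distance, lookback, name)
    return best[1], best[2]


# Hypothetical predefined pipelines, each described by the meta-features it suits.
PROFILES = {"short_memory": (0.5, 1.0), "long_memory": (2.0, 6.0)}
choice = best_lookback([1, 2, 3, 4, 5, 6, 7, 8], [2, 4], PROFILES)
```

Here a lookback of 2 yields windows whose meta-features match the hypothetical `short_memory` profile exactly, so that pair is selected.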

INTEGRATED MACHINE LEARNING PREDICTION AND OPTIMIZATION FOR DECISION-MAKING

Publication number: 20230316150

Abstract: A method includes training, by one or more processing devices, a plurality of machine learning predictive models, thereby generating a plurality of trained machine learning predictive models. The method further includes generating, by the one or more processing devices, a solved machine learning optimization model, based at least in part on the plurality of trained machine learning predictive models. The method further includes outputting, by the one or more processing devices, one or more control input and predicted outputs based at least in part on the solved machine learning optimization model.

Type: Application

Filed: March 30, 2022

Publication date: October 5, 2023

Inventors: Dzung Tien Phan, Long Vu, Lam Minh Nguyen, Dharmashankar Subramanian

PREDICTION AND OPERATIONAL EFFICIENCY FOR SYSTEM-WIDE OPTIMIZATION OF AN INDUSTRIAL PROCESSING SYSTEM

Publication number: 20230297073

Abstract: A relationship between an input, a set-point of a plurality of processes and an output of a corresponding process is learned using machine learning. A regression function is derived for each process based upon historical data. An autoencoder is trained for each process based upon the historical data to form a regularizer and the regression functions and regularizers are merged together into a unified optimization problem. System level optimization is performed using the regression functions and regularizers and a set of optimal set-points of a global optimal solution for operating the processes is determined. An industrial system is operated based on the set of optimal set-points.

Type: Application

Filed: March 16, 2022

Publication date: September 21, 2023

Inventors: Dzung Tien Phan, Long Vu, Dharmashankar Subramanian
