Patents by Inventor Chu Hong Hoi

Chu Hong Hoi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230367271
    Abstract: Embodiments described herein provide a measure of distance between time-series data sequences referred to as optimal transport warping (OTW). Measuring the OTW distance between unbalanced sequences (sequences whose values have different sums) may be accomplished by including an unbalanced-mass cost. The OTW computation may be performed using cumulative sums over local windows. Further, embodiments herein describe methods for handling time-series data with negative values: sequences may be split into positive and negative components before determining the OTW distance. A smoothing function may also be applied to the OTW measurement, allowing a gradient to be calculated. The OTW distance may be used in machine learning tasks such as clustering and classification, and an OTW measurement may also be used as an input layer to a neural network.
    Type: Application
    Filed: September 19, 2022
    Publication date: November 16, 2023
    Inventors: Fabian Ricardo Latorre Gomez, ChengHao Liu, Doyen Sahoo, Chu Hong Hoi
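The cumulative-sum computation and the positive/negative split described in the abstract can be sketched in plain Python. This is a minimal illustration, not the patented formulation: the function names and the exact form of the unbalanced-mass penalty are assumptions.

```python
def _ot_1d(a, b, mass_cost):
    # 1-D optimal transport between non-negative sequences via cumulative sums,
    # plus a penalty proportional to the unmatched mass (unbalanced case).
    ca = cb = dist = 0.0
    for x, y in zip(a, b):
        ca += x
        cb += y
        dist += abs(ca - cb)          # transport cost of the cumulative mass gap
    return dist + mass_cost * abs(ca - cb)  # charge for the leftover mass difference

def otw_distance(a, b, mass_cost=1.0):
    # Handle negative values by splitting each sequence into its positive
    # and negative components, as the abstract describes.
    pos = _ot_1d([max(x, 0.0) for x in a], [max(x, 0.0) for x in b], mass_cost)
    neg = _ot_1d([max(-x, 0.0) for x in a], [max(-x, 0.0) for x in b], mass_cost)
    return pos + neg
```

A differentiable variant, needed for use as a neural-network layer, would replace `abs` with a smooth approximation, per the abstract's smoothing function.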
  • Publication number: 20230359900
    Abstract: Embodiments described herein provide masked self-training (MaST), an unsupervised learning approach leveraging two complementary sources of supervision: pseudo-labels and raw image pixels. Specifically, MaST jointly optimizes three objectives to finetune a pre-trained classification model on unlabeled images: (1) a self-training objective to learn global, task-specific class predictions; (2) a masked image modeling objective to learn local pixel-level information; and (3) a global-local feature alignment objective to bridge the knowledge learned from the two sources of supervision.
    Type: Application
    Filed: May 27, 2022
    Publication date: November 9, 2023
    Inventors: Junnan Li, Chu Hong Hoi
  • Publication number: 20230342559
    Abstract: Embodiments described herein provide a soft prompt tuning technique referred to as the Vector quantized Input-contextualized Prompt (VIP). The VIP technique has two integral properties: i) instead of learning a fixed set of prompt tokens irrespective of the input, it generates a contextualized version of the soft prompts, conditional on the input text; ii) it further passes the input-contextualized prompt tokens through a quantization network, inspired by Vector Quantized Transformers. The quantization network uses nearest neighbor search over a learnable codebook to train a discrete latent variable model over the prompt space, thus generating a quantized version of the contextual prompt tokens. These quantized contextual prompt tokens are finally fed into the frozen language model along with the original input text.
    Type: Application
    Filed: August 16, 2022
    Publication date: October 26, 2023
    Inventors: Rishabh Bhardwaj, Amrita Saha, Chu Hong Hoi
  • Publication number: 20230342552
    Abstract: Embodiments described herein provide a soft prompt tuning technique referred to as the Vector quantized Input-contextualized Prompt (VIP). The VIP technique has two integral properties: i) instead of learning a fixed set of prompt tokens irrespective of the input, it generates a contextualized version of the soft prompts, conditional on the input text; ii) it further passes the input-contextualized prompt tokens through a quantization network, inspired by Vector Quantized Transformers. The quantization network uses nearest neighbor search over a learnable codebook to train a discrete latent variable model over the prompt space, thus generating a quantized version of the contextual prompt tokens. These quantized contextual prompt tokens are finally fed into the frozen language model along with the original input text.
    Type: Application
    Filed: August 16, 2022
    Publication date: October 26, 2023
    Inventors: Rishabh Bhardwaj, Amrita Saha, Chu Hong Hoi
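The nearest-neighbor quantization step in the two VIP abstracts above can be sketched as a lookup over a codebook. This is an illustrative sketch only; in the described system the codebook is learnable and the vectors are produced by the prompt network, assumptions not reflected in this toy version.

```python
def quantize_prompts(prompt_vectors, codebook):
    # Nearest-neighbor search over the codebook: each contextual prompt token
    # is replaced by its closest codebook entry (squared Euclidean distance).
    quantized = []
    for vec in prompt_vectors:
        best = min(codebook,
                   key=lambda c: sum((v - ci) ** 2 for v, ci in zip(vec, c)))
        quantized.append(best)
    return quantized
```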
  • Patent number: 11798534
    Abstract: Embodiments described herein provide an Adapt-and-Adjust (A2) mechanism for multilingual speech recognition models that combines adaptation and adjustment methods in an integrated end-to-end training to improve the models' generalization and mitigate the long-tail issue. Specifically, a multilingual language model, mBERT, is utilized and converted into an autoregressive transformer decoder. In addition, a cross-attention module is added to the encoder on top of mBERT's self-attention layer in order to explore the acoustic space in addition to the text space. The joint training of the encoder and the mBERT decoder can bridge the semantic gap between speech and text.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: October 24, 2023
    Assignee: salesforce.com, inc.
    Inventors: Guangsen Wang, Chu Hong Hoi, Genta Indra Winata
  • Patent number: 11782686
    Abstract: Embodiments described herein provide a code generation and understanding model that builds on a Transformer-based encoder-decoder framework. The model is configured to derive generic representations for programming language (PL) and natural language (NL) in the code domain via pre-training on an unlabeled code corpus, and then to benefit many code-related downstream tasks with fine-tuning. Apart from the denoising sequence-to-sequence objectives widely adopted for pre-training on natural language, an identifier tagging and prediction pre-training objective is adopted to enable the model to better leverage the crucial token type information from PL, specifically the identifiers assigned by developers.
    Type: Grant
    Filed: August 27, 2021
    Date of Patent: October 10, 2023
    Assignee: SALESFORCE.COM, INC.
    Inventors: Yue Wang, Weishi Wang, Shafiq Rayhan Joty, Chu Hong Hoi
  • Patent number: 11776236
    Abstract: The system and method are directed to prototypical contrastive learning (PCL). PCL explicitly encodes the hierarchical semantic structure of the dataset into the learned embedding space and prevents the network from exploiting low-level cues to solve the unsupervised learning task. PCL uses prototypes as latent variables to help find the maximum-likelihood estimate of the network parameters in an expectation-maximization framework. PCL iteratively performs an E-step that finds prototypes by clustering and an M-step that optimizes the network on a contrastive loss.
    Type: Grant
    Filed: February 2, 2022
    Date of Patent: October 3, 2023
    Assignee: Salesforce.com, Inc.
    Inventors: Junnan Li, Chu Hong Hoi
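The E-step/M-step alternation in the PCL abstract can be sketched with a toy clustering loop. Note this is a stand-in: the real M-step optimizes the network with a contrastive loss, which is replaced here by a simple prototype-mean update for illustration.

```python
def e_step(embeddings, prototypes):
    # E-step: assign each embedding to its nearest prototype (clustering).
    def nearest(vec):
        return min(range(len(prototypes)),
                   key=lambda k: sum((v - p) ** 2
                                     for v, p in zip(vec, prototypes[k])))
    return [nearest(vec) for vec in embeddings]

def m_step(embeddings, assignments, num_prototypes):
    # M-step (simplified): recompute each prototype as the mean of its members.
    prototypes = []
    dim = len(embeddings[0])
    for k in range(num_prototypes):
        members = [e for e, a in zip(embeddings, assignments) if a == k]
        prototypes.append(tuple(sum(m[d] for m in members) / len(members)
                                for d in range(dim)))
    return prototypes
```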
  • Patent number: 11749264
    Abstract: Embodiments described herein provide methods and systems for training task-oriented dialogue (TOD) language models. In some embodiments, a TOD language model may receive a TOD dataset including a plurality of dialogues and a model input sequence may be generated from the dialogues using a first token prefixed to each user utterance and a second token prefixed to each system response of the dialogues. In some embodiments, the first token or the second token may be randomly replaced with a mask token to generate a masked training sequence and a masked language modeling (MLM) loss may be computed using the masked training sequence. In some embodiments, the TOD language model may be updated based on the MLM loss.
    Type: Grant
    Filed: November 3, 2020
    Date of Patent: September 5, 2023
    Assignee: Salesforce, Inc.
    Inventors: Chien-Sheng Wu, Chu Hong Hoi, Richard Socher, Caiming Xiong
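The sequence construction and token masking described in the abstract above can be sketched as follows. The token strings `[USR]`, `[SYS]`, `[MASK]` and the masking probability are illustrative assumptions, not values taken from the patent.

```python
import random

USR, SYS, MASK = "[USR]", "[SYS]", "[MASK]"

def build_sequence(dialogue):
    # Prefix each user utterance with [USR] and each system response with [SYS].
    seq = []
    for speaker, text in dialogue:
        seq.append(USR if speaker == "user" else SYS)
        seq.extend(text.split())
    return seq

def mask_speaker_tokens(seq, rng, prob=0.5):
    # Randomly replace speaker tokens with [MASK] to form the MLM training input.
    return [MASK if tok in (USR, SYS) and rng.random() < prob else tok
            for tok in seq]
```

The MLM loss would then be computed on the masked positions, and the model updated from that loss, as the abstract describes.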
  • Publication number: 20230244925
    Abstract: Embodiments described herein provide a system and method for unsupervised anomaly detection. The system receives, via a communication interface, a dataset of instances that include anomalies. The system determines, via an inlier model, a set of noisy labels. The system trains a causality-based label-noise model based at least in part on the set of noisy labels and a set of high-confidence instances. The system determines an estimated proportion of anomalies in the dataset of instances and retrains the inlier model based on the estimated inlier samples. The system iteratively retrains the inlier model and the trained causality-based label-noise model until the outputs of the corresponding retrained models converge within a convergence threshold. The system extracts the anomaly detection model from the iteratively trained causality-based label-noise model.
    Type: Application
    Filed: January 31, 2022
    Publication date: August 3, 2023
    Inventors: Wenzhuo Yang, Chu Hong Hoi, Kun Zhang
  • Publication number: 20230244943
    Abstract: Embodiments provide a framework combining fast and slow learning networks (referred to as “FSNet”) to train deep neural forecasters on the fly for online time-series forecasting. FSNet is built on a deep neural network backbone (slow learner) with two complementary components to facilitate fast adaptation to both new and recurrent concepts. To this end, FSNet employs a per-layer adapter to monitor each layer's contribution to the forecasting loss via its partial derivative. The adapter transforms each layer's weight and feature at each step based on its recent gradient, allowing fine-grained per-layer fast adaptation to optimize the current loss. In addition, FSNet employs a second, complementary associative memory component to store important, recurring patterns observed during training. The adapter interacts with the memory to store, update, and retrieve the previous transformations, facilitating fast learning of such patterns.
    Type: Application
    Filed: July 22, 2022
    Publication date: August 3, 2023
    Inventors: Hong-Quang Pham, Chenghao Liu, Doyen Sahoo, Chu Hong Hoi
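The adapter's use of recent gradients can be sketched as an exponential moving average (EMA) that transforms a layer's weights. The EMA form, the decay constant, and the update rule here are illustrative assumptions rather than the framework's actual transformation.

```python
def adapt_layer(weights, grads, ema_grads, gamma=0.9, lr=0.01):
    # Track the layer's recent gradient with an EMA, then transform the
    # weights using that EMA (a sketch of the per-layer fast adaptation).
    new_ema = [gamma * e + (1.0 - gamma) * g for e, g in zip(ema_grads, grads)]
    adapted = [w - lr * e for w, e in zip(weights, new_ema)]
    return adapted, new_ema
```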
  • Publication number: 20230244947
    Abstract: Embodiments described herein provide a method of forecasting time series data at future timestamps in a dynamic system. The method includes receiving, via a data interface, a time series dataset. The method also includes determining, via a frequency attention layer, a seasonal representation based on a frequency domain analysis of the time series data. The method also includes determining, via an exponential attention layer, a growth representation based on the seasonal representation. The method also includes generating, via a decoder, a time series forecast based on the seasonal representation and the growth representation.
    Type: Application
    Filed: June 17, 2022
    Publication date: August 3, 2023
    Inventors: Gerald Woo, Chenghao Liu, Doyen Sahoo, Chu Hong Hoi
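The idea of separating a series into seasonal and trend-like parts, which the abstract realizes with frequency and exponential attention layers, can be illustrated with a much simpler moving-average decomposition. This substitute technique is named plainly: it is a centered moving average, not the patented attention mechanism.

```python
def decompose(series, window=3):
    # Split a series into a trend part (centered moving average, truncated at
    # the edges) and a seasonal part (the residual after removing the trend).
    half = window // 2
    trend = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        trend.append(sum(series[lo:hi]) / (hi - lo))
    seasonal = [x - t for x, t in zip(series, trend)]
    return trend, seasonal
```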
  • Publication number: 20230237773
    Abstract: Embodiments described herein provide bootstrapping language-image pre-training for unified vision-language understanding and generation (BLIP), a unified VLP framework which transfers flexibly to both vision-language understanding and generation tasks. BLIP enables a wider range of downstream tasks, improving on the shortcomings of existing models.
    Type: Application
    Filed: May 16, 2022
    Publication date: July 27, 2023
    Inventors: Junnan Li, Chu Hong Hoi
  • Publication number: 20230237275
    Abstract: Embodiments provide a software framework for evaluating and troubleshooting real-world task-oriented bot systems. Specifically, the evaluation framework includes a generator that infers dialog acts and entities from bot definitions and generates test cases for the system via model-based paraphrasing. The framework may also include a simulator for task-oriented dialog user simulation that supports both regression testing and end-to-end evaluation. The framework may also include a remediator to analyze and visualize the simulation results, remedy some of the identified issues, and provide actionable suggestions for improving the task-oriented dialog system.
    Type: Application
    Filed: June 2, 2022
    Publication date: July 27, 2023
    Inventors: Guangsen Wang, Samson Min Rong Tan, Shafiq Rayhan Joty, Gang Wu, Chu Hong Hoi, Ka Chun Au
  • Publication number: 20230237772
    Abstract: Embodiments described herein provide bootstrapping language-image pre-training for unified vision-language understanding and generation (BLIP), a unified VLP framework which transfers flexibly to both vision-language understanding and generation tasks. BLIP enables a wider range of downstream tasks, improving on the shortcomings of existing models.
    Type: Application
    Filed: May 16, 2022
    Publication date: July 27, 2023
    Inventors: Junnan Li, Chu Hong Hoi
  • Publication number: 20230154188
    Abstract: Embodiments described herein provide a method of video-text pre-training to effectively learn cross-modal representations from sparse video frames and text. Specifically, an align-and-prompt framework provides a video-and-language pre-training framework that encodes the frames and text independently using a transformer-based video encoder and a text encoder. A multi-modal encoder is then employed to capture cross-modal interaction between a plurality of video frames and a plurality of texts. The pre-training includes prompting entity modeling, which enables the model to capture fine-grained region-entity alignment.
    Type: Application
    Filed: December 30, 2021
    Publication date: May 18, 2023
    Inventors: Dongxu Li, Junnan Li, Chu Hong Hoi
  • Publication number: 20230154146
    Abstract: Embodiments described herein provide a method of video-text pre-training to effectively learn cross-modal representations from sparse video frames and text. Specifically, an align-and-prompt framework provides a video-and-language pre-training framework that encodes the frames and text independently using a transformer-based video encoder and a text encoder. A multi-modal encoder is then employed to capture cross-modal interaction between a plurality of video frames and a plurality of texts. The pre-training includes prompting entity modeling, which enables the model to capture fine-grained region-entity alignment.
    Type: Application
    Filed: December 30, 2021
    Publication date: May 18, 2023
    Inventors: Dongxu Li, Junnan Li, Chu Hong Hoi
  • Patent number: 11651158
    Abstract: A system performs conversations with users using chatbots customized for performing a set of tasks. The system may be a multi-tenant system that allows customization of the chatbots for each tenant. The system receives a task configuration that maps tasks to entity types and an entity configuration that specifies methods for determining entities of a particular entity type. The system receives a user utterance and determines the intent of the user using an intent detection model, for example, a neural network. The intent represents a task that the user is requesting. The system determines one or more entities corresponding to the task. The system performs tasks based on the determined intent and the entities and performs conversations with users based on the tasks.
    Type: Grant
    Filed: August 13, 2020
    Date of Patent: May 16, 2023
    Assignee: Salesforce, Inc.
    Inventors: Xinyi Yang, Tian Xie, Caiming Xiong, Wenhao Liu, Huan Wang, Jin Qu, Soujanya Lanka, Chu Hong Hoi, Xugang Ye, Feihong Wu
  • Patent number: 11640505
    Abstract: Embodiments described herein provide systems and methods for an Explicit Memory Tracker (EMT) that tracks each rule sentence to perform decision making and to generate follow-up clarifying questions. Specifically, the EMT first segments the regulation text into several rule sentences and allocates the segmented rule sentences into memory modules, and then feeds information regarding the user scenario and dialogue history into the EMT sequentially to update each memory module separately. At each dialogue turn, the EMT decides, based on the current memory status of the memory modules, whether further clarification is needed to arrive at an answer to a user question. The EMT determines that further clarification is needed by identifying an underspecified rule sentence span, modulating token-level span distributions with sentence-level selection scores. The EMT extracts the underspecified rule sentence span and rephrases it to generate a follow-up question.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: May 2, 2023
    Assignee: Salesforce.com, Inc.
    Inventors: Yifan Gao, Chu Hong Hoi, Shafiq Rayhan Joty, Chien-Sheng Wu
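The per-turn decision logic in the EMT abstract can be sketched as a check over tracked rule statuses. The status vocabulary and the choice of which underspecified rule to ask about are illustrative assumptions.

```python
def emt_decide(memory):
    # memory maps each rule sentence to its tracked status:
    # "satisfied", "violated", or "unknown" (underspecified).
    unresolved = [rule for rule, status in memory.items() if status == "unknown"]
    if unresolved:
        # Ask a follow-up clarifying question about an underspecified rule.
        return ("ask", unresolved[0])
    answer = "yes" if all(s == "satisfied" for s in memory.values()) else "no"
    return ("answer", answer)
```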
  • Publication number: 20230109681
    Abstract: Embodiments are directed to translating a natural language query into a code snippet in a programming language that semantically represents the query. The embodiments include a cascading neural network comprising an encoder network and a classifier network, where the encoder network is faster but less accurate than the classifier network. The encoder network is trained using a contrastive learning framework to identify code candidates from a large set of code snippets. The classifier network is trained using a binary classifier to identify, from the code candidates, the code snippet that semantically represents the query.
    Type: Application
    Filed: January 28, 2022
    Publication date: April 13, 2023
    Inventors: Akhilesh Deepak Gotmare, Junnan Li, Chu Hong Hoi
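The cascade described in the abstract above (a fast retriever narrowing candidates, then a slower, more accurate scorer picking the final snippet) can be sketched as follows. The dot-product similarity and the `slow_scorer` callback are illustrative assumptions standing in for the trained encoder and classifier networks.

```python
def fast_retrieve(query_vec, code_vecs, k=2):
    # Fast stage: rank all snippets by dot-product similarity, keep the top k.
    def score(i):
        return sum(q * c for q, c in zip(query_vec, code_vecs[i]))
    return sorted(range(len(code_vecs)), key=score, reverse=True)[:k]

def cascade(query_vec, code_vecs, slow_scorer, k=2):
    # Slow stage: run only the k candidates through the accurate scorer.
    candidates = fast_retrieve(query_vec, code_vecs, k)
    return max(candidates, key=lambda i: slow_scorer(query_vec, code_vecs[i]))
```

The design point is that the expensive scorer sees only `k` candidates rather than the whole corpus, which is what makes the cascade practical over a large set of snippets.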
  • Publication number: 20230105970
    Abstract: A method includes receiving, via a data interface, a training dataset of time-series data samples; and generating, by an encoder of a representation training model, intermediate representations of a training data sample from the training dataset. One or more trend feature representations are generated based on the intermediate representations. One or more seasonal feature representations are generated based on the intermediate representations. The representation training model is trained, using the one or more trend feature representations and one or more seasonal feature representations, to generate a trained representation training model.
    Type: Application
    Filed: January 28, 2022
    Publication date: April 6, 2023
    Inventors: Gerald Woo, Chenghao Liu, Doyen Sahoo, Chu Hong Hoi