Patents by Inventor Chu Hong Hoi

Chu Hong Hoi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12354013
    Abstract: Embodiments described herein provide masked self-training (MaST), an unsupervised learning approach that leverages two complementary sources of supervision: pseudo-labels and raw image pixels. Specifically, MaST jointly optimizes three objectives to fine-tune a pre-trained classification model on unlabeled images: (1) a self-training objective to learn global, task-specific class predictions; (2) a masked image modeling objective to learn local pixel-level information; and (3) a global-local feature alignment objective to bridge the knowledge learned from the two sources of supervision.
    Type: Grant
    Filed: May 27, 2022
    Date of Patent: July 8, 2025
    Assignee: Salesforce, Inc.
    Inventors: Junnan Li, Chu Hong Hoi
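    Illustrative sketch: a minimal PyTorch rendering of the three joint objectives in the entry above. The tensor shapes, hard pseudo-labels, and equal loss weighting are assumptions for illustration, not the patented implementation.

      import torch.nn.functional as F

      def mast_loss(logits, pseudo_labels, pred_pixels, target_pixels, global_feat, local_feats):
          # (1) Self-training: cross-entropy of class predictions against pseudo-labels.
          loss_st = F.cross_entropy(logits, pseudo_labels)
          # (2) Masked image modeling: reconstruct the raw pixels of masked patches.
          loss_mim = F.mse_loss(pred_pixels, target_pixels)
          # (3) Global-local alignment: pull pooled local (patch) features toward
          #     the global image feature.
          loss_align = 1 - F.cosine_similarity(global_feat, local_feats.mean(dim=1)).mean()
          return loss_st + loss_mim + loss_align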
  • Patent number: 12314861
    Abstract: Embodiments described herein provide an approach (referred to as the “Co-training” mechanism throughout this disclosure) that jointly learns two representations of the training data: class probabilities and low-dimensional embeddings. Specifically, two representations of each image sample are generated: a class probability produced by the classification head and a low-dimensional embedding produced by the projection head. The classification head is trained using memory-smoothed pseudo-labels, where pseudo-labels are smoothed by aggregating information from nearby samples in the embedding space. The projection head is trained using contrastive learning on a pseudo-label graph, where samples with similar pseudo-labels are encouraged to have similar embeddings.
    Type: Grant
    Filed: January 28, 2021
    Date of Patent: May 27, 2025
    Assignee: Salesforce, Inc.
    Inventors: Junnan Li, Chu Hong Hoi
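    Illustrative sketch: one way the two heads' outputs could interact, in PyTorch. The memory-bank layout, 0.5 mixing weight, and graph threshold are assumptions for illustration.

      import torch
      import torch.nn.functional as F

      def cotraining_losses(probs, embeds, mem_probs, mem_embeds, tau=0.2, thresh=0.8):
          # probs: (B, C) class probabilities; embeds: (B, D) normalized embeddings;
          # mem_*: the same quantities for samples held in a memory bank.
          # Memory-smoothed pseudo-labels: aggregate predictions of nearby samples
          # in the embedding space.
          affinity = torch.softmax(embeds @ mem_embeds.t() / tau, dim=1)   # (B, M)
          pseudo = 0.5 * probs + 0.5 * affinity @ mem_probs                # (B, C)
          # Pseudo-label graph: connect samples whose pseudo-labels are similar.
          graph = (pseudo @ pseudo.t() > thresh).float()
          graph.fill_diagonal_(1.0)
          # Contrastive loss: connected samples are pushed toward similar embeddings.
          sim = torch.softmax(embeds @ embeds.t() / tau, dim=1)
          loss_ctr = -((graph * torch.log(sim + 1e-8)).sum(1) / graph.sum(1)).mean()
          return pseudo, loss_ctr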
  • Patent number: 12298982
    Abstract: Embodiments are directed to a machine learning recommendation system. The system receives a user query for generating a recommendation for one or more items, together with an explanation associated with recommending those items. The system obtains first features of at least one user and second features of a set of items. The system provides the first features and the second features to a first machine learning network for determining a predicted score for an item. The system provides a portion of the first features and a portion of the second features to second machine learning networks for determining explainability scores for an item and generating corresponding explanation narratives. The system provides the recommendation for one or more items and the corresponding explanation narratives based on ranking the predicted scores and explainability scores for the items.
    Type: Grant
    Filed: November 10, 2020
    Date of Patent: May 13, 2025
    Assignee: Salesforce, Inc.
    Inventors: Wenzhuo Yang, Jia Li, Chenxi Li, Latrice Barnett, Markus Anderle, Simo Arajarvi, Harshavardhan Utharavalli, Caiming Xiong, Richard Socher, Chu Hong Hoi
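    Illustrative sketch: a minimal PyTorch structure with one scoring network and several per-aspect explainer networks, echoing the entry above. The feature dimensions, aspect names, and head architectures are assumptions for illustration.

      import torch
      import torch.nn as nn

      class ExplainableRecommender(nn.Module):
          def __init__(self, user_dim, item_dim, aspects=("price", "popularity")):
              super().__init__()
              # First network: predicted relevance score from full user + item features.
              self.scorer = nn.Sequential(
                  nn.Linear(user_dim + item_dim, 64), nn.ReLU(), nn.Linear(64, 1))
              # Second networks: one explainability score per aspect; each would see
              # only a portion of the features (full concatenation here for brevity).
              self.explainers = nn.ModuleDict(
                  {a: nn.Linear(user_dim + item_dim, 1) for a in aspects})

          def forward(self, user_feats, item_feats):
              x = torch.cat([user_feats, item_feats], dim=-1)
              score = self.scorer(x).squeeze(-1)
              expl = {a: head(x).squeeze(-1) for a, head in self.explainers.items()}
              return score, expl   # items are ranked by combining score and expl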
  • Patent number: 12299961
    Abstract: Embodiments described herein provide systems, methods, and devices for pre-training a multimodal encoder-decoder (MED) model for vision-language tasks. A method may include encoding, by an image encoder of the MED, an image into an image representation; encoding, by a text encoder of the MED, a text into a text representation; generating, by an image-grounded text encoder of the MED, a multimodal representation based on the image representation and the text; generating, by an image-grounded text decoder of the MED, a predicted text based on the image representation and the text; generating, through an image-text matching (ITM) head, a binary classification indicating whether the image and the text are a match; computing a first loss, a second loss (the ITM loss), and a third loss based on the image representation, the text representation, the binary classification, the predicted text, and the text; and jointly updating the MED based on the first loss, the second loss, and the third loss.
    Type: Grant
    Filed: May 16, 2022
    Date of Patent: May 13, 2025
    Assignee: Salesforce, Inc.
    Inventors: Junnan Li, Chu Hong Hoi
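    Illustrative sketch: the three losses in PyTorch. The abstract names only the ITM loss explicitly; treating the first loss as image-text contrastive and the third as language modeling over the predicted text is an assumption.

      import torch
      import torch.nn.functional as F

      def med_losses(img_emb, txt_emb, itm_logits, itm_labels, lm_logits, lm_targets, tau=0.07):
          # First loss (assumed contrastive): align unimodal image and text embeddings.
          sim = img_emb @ txt_emb.t() / tau
          t = torch.arange(sim.size(0), device=sim.device)
          loss_1 = (F.cross_entropy(sim, t) + F.cross_entropy(sim.t(), t)) / 2
          # Second loss: binary image-text matching from the ITM head.
          loss_itm = F.cross_entropy(itm_logits, itm_labels)
          # Third loss (assumed language modeling): predicted text vs. ground-truth text.
          loss_3 = F.cross_entropy(lm_logits.view(-1, lm_logits.size(-1)), lm_targets.view(-1))
          return loss_1 + loss_itm + loss_3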
  • Patent number: 12288380
    Abstract: Embodiments described herein provide systems, methods, and devices for generating enhanced vision-language training data. A method may include: receiving, from a communication interface, a first training dataset of image-text pairs and a second training dataset of annotated image-text pairs; fine-tuning an image-grounded text decoder and an image-grounded text encoder using the second training dataset of annotated image-text pairs; generating, by the fine-tuned image-grounded text decoder, a predicted text based on a training image from the first training dataset; generating, by the fine-tuned image-grounded text encoder, a filtering decision based on the training image and the predicted text; adding the training image and the predicted text to form a third training dataset of image-text pairs depending on the filtering decision; and training a vision-language model using the third training dataset of image-text pairs.
    Type: Grant
    Filed: May 16, 2022
    Date of Patent: April 29, 2025
    Assignee: Salesforce, Inc.
    Inventors: Junnan Li, Chu Hong Hoi
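    Illustrative sketch: the caption-and-filter bootstrapping loop from the entry above, in Python. The captioner and filter stand in for the fine-tuned image-grounded text decoder and encoder; their call signatures are assumptions.

      def bootstrap_pairs(web_pairs, captioner, filter_fn):
          # Build the third training dataset from the first (noisy web) dataset.
          kept = []
          for image, _noisy_text in web_pairs:
              predicted_text = captioner(image)      # fine-tuned decoder: synthetic caption
              if filter_fn(image, predicted_text):   # fine-tuned encoder: filtering decision
                  kept.append((image, predicted_text))
          return kept                                # used to train the vision-language model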
  • Patent number: 12271792
    Abstract: Embodiments described herein provide visual-and-language (V+L) systems and methods for learning vision and language representations. Specifically, a method may comprise receiving a training dataset comprising a plurality of image samples and a plurality of text samples; encoding the plurality of image samples into a plurality of encoded image samples and the plurality of text samples into a plurality of encoded text samples; computing a first loss objective based on the plurality of encoded image samples and the plurality of encoded text samples; encoding a first subset of the plurality of encoded image samples and a second subset of the plurality of encoded text samples into a plurality of encoded image-text samples; computing a second loss objective based on the plurality of encoded image-text samples; and updating the V+L model based at least in part on the first loss objective and the second loss objective.
    Type: Grant
    Filed: July 8, 2021
    Date of Patent: April 8, 2025
    Assignee: Salesforce, Inc.
    Inventors: Junnan Li, Chu Hong Hoi
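    Illustrative sketch: the two-stage flow of the entry above, with unimodal encoding and a contrastive first loss followed by joint encoding and a second loss. The encoder interfaces, the match-only second loss, and the use of full batches rather than subsets are simplifying assumptions.

      import torch
      import torch.nn.functional as F

      def vl_step(image_enc, text_enc, joint_enc, images, texts, tau=0.07):
          # Encode each modality separately ...
          v = F.normalize(image_enc(images), dim=-1)     # (B, D)
          t = F.normalize(text_enc(texts), dim=-1)       # (B, D)
          # ... and compute the first loss objective on the unimodal embeddings.
          sim = v @ t.t() / tau
          labels = torch.arange(sim.size(0), device=sim.device)
          loss_1 = (F.cross_entropy(sim, labels) + F.cross_entropy(sim.t(), labels)) / 2
          # Encode image-text pairs jointly and compute the second loss objective
          # (here every pair is a positive match; real training also uses negatives).
          match_logits = joint_enc(v, t)                 # (B, 2), assumed interface
          loss_2 = F.cross_entropy(match_logits, torch.ones_like(labels))
          return loss_1 + loss_2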
  • Patent number: 12217033
    Abstract: Embodiments described herein provide a code generation and understanding model that builds on a Transformer-based encoder-decoder framework. The code generation and understanding model is configured to derive generic representations for programming language (PL) and natural language (NL) in the code domain via pre-training on an unlabeled code corpus, and then to benefit many code-related downstream tasks with fine-tuning. Apart from the denoising sequence-to-sequence objectives widely adopted for pre-training on natural language, an identifier tagging and prediction pre-training objective is adopted to enable the model to better leverage the crucial token-type information from PL, specifically the identifiers assigned by developers.
    Type: Grant
    Filed: September 26, 2023
    Date of Patent: February 4, 2025
    Assignee: Salesforce, Inc.
    Inventors: Yue Wang, Weishi Wang, Shafiq Rayhan Joty, Chu Hong Hoi
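    Illustrative sketch: the identifier-tagging objective as a per-token binary classification in PyTorch. The encoder interface and hidden size are assumptions; the actual model also predicts masked identifier names.

      import torch.nn as nn
      import torch.nn.functional as F

      class IdentifierTagging(nn.Module):
          def __init__(self, encoder, hidden_dim):
              super().__init__()
              self.encoder = encoder                  # Transformer encoder over code tokens
              self.head = nn.Linear(hidden_dim, 2)    # identifier vs. non-identifier

          def forward(self, code_tokens, is_identifier):
              hidden = self.encoder(code_tokens)      # (B, T, H), one state per token
              logits = self.head(hidden)              # (B, T, 2)
              # Supervision comes for free from the code itself: which tokens are
              # developer-assigned identifiers is recoverable from the parse tree.
              return F.cross_entropy(logits.view(-1, 2), is_identifier.view(-1))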
  • Patent number: 12210976
    Abstract: Embodiments described herein provide systems and methods for learning representations from unlabeled videos. Specifically, a method may comprise generating a set of strongly-augmented samples and a set of weakly-augmented samples from the unlabeled video samples; generating a set of predictive logits by inputting the set of strongly-augmented samples into a student model and a first teacher model; generating a set of artificial labels by inputting the set of weakly-augmented samples to a second teacher model that operates in parallel to the first teacher model, wherein the second teacher model shares one or more model parameters with the first teacher model; computing a loss objective based on the set of predictive logits and the set of artificial labels; updating student model parameters based on the loss objective via backpropagation; and updating the shared parameters for the first teacher model and the second teacher model based on the updated student model parameters.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: January 28, 2025
    Assignee: Salesforce, Inc.
    Inventors: Hualin Liu, Chu Hong Hoi, Junnan Li
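    Illustrative sketch: one training step in PyTorch. Averaging the student and first-teacher logits, and updating the shared teacher parameters as an exponential moving average of the student, are assumptions; the abstract only states that the teachers share parameters and follow the updated student.

      import torch
      import torch.nn.functional as F

      def video_ssl_step(student, teacher_1, teacher_2, strong_clips, weak_clips, ema=0.999):
          with torch.no_grad():
              t_logits = teacher_1(strong_clips)           # first teacher, strong views
              labels = teacher_2(weak_clips).argmax(dim=-1)  # artificial labels, weak views
          # Predictive logits from the strongly-augmented clips.
          logits = (student(strong_clips) + t_logits) / 2
          loss = F.cross_entropy(logits, labels)
          loss.backward()                                  # update the student via backprop
          # The teachers share parameters; move them toward the updated student.
          with torch.no_grad():
              for p_t, p_s in zip(teacher_1.parameters(), student.parameters()):
                  p_t.mul_(ema).add_(p_s, alpha=1 - ema)
          return loss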
  • Patent number: 12204857
    Abstract: Embodiments described herein provide methods for training a prompt generator for text classification. A first training dataset associated with a first plurality of class labels is received for a first training process. For a first instance of the first training dataset, a set of labels of interest is generated by sampling from a set of possible class labels that includes the first plurality of class labels. The prompt generator generates a first prompt based on the set of labels of interest. A pretrained language model generates a task output in response to an input of the first instance prepended with the first prompt. A loss objective is computed based on the task output and the set of labels of interest. Parameters of the prompt generator are updated based on the computed loss objective via backpropagation while the pretrained language model is frozen.
    Type: Grant
    Filed: November 28, 2022
    Date of Patent: January 21, 2025
    Assignee: Salesforce, Inc.
    Inventors: Hailin Chen, Amrita Saha, Shafiq Rayhan Joty, Chu Hong Hoi
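    Illustrative sketch: one optimization step in PyTorch. The generator and frozen-LM interfaces are assumptions; the key point is that only the generator's parameters sit in the optimizer.

      import torch
      import torch.nn.functional as F

      def prompt_tuning_step(prompt_gen, frozen_lm, optimizer, inputs, targets, label_space, k=5):
          # Sample a set of labels of interest from the possible class labels.
          picked = [label_space[i] for i in torch.randperm(len(label_space))[:k].tolist()]
          prompt = prompt_gen(picked)          # soft prompt conditioned on the label set
          logits = frozen_lm(prompt, inputs)   # LM sees the instance prepended with the prompt
          loss = F.cross_entropy(logits, targets)
          optimizer.zero_grad()
          loss.backward()                      # gradients flow only into prompt_gen;
          optimizer.step()                     # the pretrained LM stays frozen
          return loss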
  • Patent number: 12198432
    Abstract: Embodiments described herein provide a method of video-text pre-training that effectively learns cross-modal representations from sparse video frames and text. Specifically, an align-and-prompt framework provides video-and-language pre-training that encodes the frames and text independently using a transformer-based video encoder and a text encoder. A multi-modal encoder is then employed to capture cross-modal interaction between a plurality of video frames and a plurality of texts. The pre-training includes prompting entity modeling, which enables the model to capture fine-grained region-entity alignment.
    Type: Grant
    Filed: December 30, 2021
    Date of Patent: January 14, 2025
    Assignee: Salesforce, Inc.
    Inventors: Dongxu Li, Junnan Li, Chu Hong Hoi
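    Illustrative sketch: independent encoding followed by cross-modal fusion, mirroring the entry above. The encoder interfaces and the concatenation scheme are assumptions.

      import torch

      def align_and_fuse(video_enc, text_enc, multimodal_enc, frames, text_ids):
          frame_feats = video_enc(frames)   # (B, num_frames, D): sparse frames, encoded alone
          text_feats = text_enc(text_ids)   # (B, num_tokens, D): text, encoded alone
          # The multimodal encoder captures cross-modal interaction between the two.
          return multimodal_enc(torch.cat([frame_feats, text_feats], dim=1))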
  • Patent number: 12147765
    Abstract: Embodiments described herein provide a soft prompt tuning technique referred to as the Vector quantized Input-contextualized Prompt (VIP). The VIP technique has two integral properties: (i) instead of learning a fixed set of prompt tokens irrespective of the input, it generates a contextualized version of the soft prompts, conditioned on the input text; and (ii) it further passes the input-contextualized prompt tokens through a quantization network, inspired by Vector Quantized Transformers. The quantization network uses nearest-neighbor search over a learnable codebook to train a discrete latent variable model over the prompt space, thus generating a quantized version of the contextual prompt tokens. These quantized contextual prompt tokens are finally fed into the frozen language model along with the original input text.
    Type: Grant
    Filed: August 16, 2022
    Date of Patent: November 19, 2024
    Assignee: Salesforce, Inc.
    Inventors: Rishabh Bhardwaj, Amrita Saha, Chu Hong Hoi
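    Illustrative sketch: the codebook quantization step in PyTorch, with a straight-through estimator so gradients can still reach the prompt contextualizer. The codebook size and dimensions are assumptions.

      import torch
      import torch.nn as nn

      class PromptQuantizer(nn.Module):
          def __init__(self, codebook_size=512, dim=768):
              super().__init__()
              self.codebook = nn.Embedding(codebook_size, dim)   # learnable codebook

          def forward(self, prompt_tokens):                      # (B, P, D) contextual prompts
              flat = prompt_tokens.reshape(-1, prompt_tokens.size(-1))
              # Nearest-neighbor search over the codebook entries.
              codes = torch.cdist(flat, self.codebook.weight).argmin(dim=-1)
              quantized = self.codebook(codes).view_as(prompt_tokens)
              # Straight-through: the forward pass uses the quantized tokens while
              # gradients bypass the non-differentiable argmin.
              return prompt_tokens + (quantized - prompt_tokens).detach()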
  • Patent number: 12112138
    Abstract: Embodiments provide a software framework for evaluating and troubleshooting real-world task-oriented bot systems. Specifically, the evaluation framework includes a generator that infers dialog acts and entities from bot definitions and generates test cases for the system via model-based paraphrasing. The framework may also include a simulator for task-oriented dialog user simulation that supports both regression testing and end-to-end evaluation. The framework may also include a remediator to analyze and visualize the simulation results, remedy some of the identified issues, and provide actionable suggestions for improving the task-oriented dialog system.
    Type: Grant
    Filed: June 2, 2022
    Date of Patent: October 8, 2024
    Assignee: Salesforce, Inc.
    Inventors: Guangsen Wang, Samson Min Rong Tan, Shafiq Rayhan Joty, Gang Wu, Chu Hong Hoi, Ka Chun Au
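    Illustrative sketch: the generator, simulator, and remediator stages as plain Python. The bot-definition format and the `paraphrase` and `bot` callables are assumptions, not the framework's actual API.

      from dataclasses import dataclass

      @dataclass
      class TestCase:
          dialog_act: str
          utterance: str

      def generate_cases(bot_definition, paraphrase):
          # Generator: infer dialog acts from the bot definition, then expand each
          # canonical utterance via model-based paraphrasing.
          return [TestCase(act, variant)
                  for act, utterance in bot_definition.items()
                  for variant in paraphrase(utterance)]

      def simulate(cases, bot):
          # Simulator: replay each case against the bot and record pass/fail.
          return [(case, bot(case.utterance) == case.dialog_act) for case in cases]

      def remediate(results):
          # Remediator: turn failures into actionable suggestions.
          return [f"Dialog act '{c.dialog_act}' misrouted on: {c.utterance!r}"
                  for c, passed in results if not passed]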
  • Publication number: 20240330409
    Abstract: Embodiments are directed to pre-training a transformer model using more parameters for sophisticated patterns (PSP++). The transformer model is divided into a held-out model and a main model. A forward pass and a backward pass are performed on the held-out model, where the forward pass determines the self-attention hidden states of the held-out model and the backward pass determines the loss of the held-out model. A forward pass on the main model is performed to determine the self-attention hidden states of the main model. The self-attention hidden states of the main model are concatenated with the self-attention hidden states of the held-out model. A backward pass is performed on the main model to determine the loss of the main model. The parameters of the held-out model are updated to reflect the loss of the held-out model, and the parameters of the main model are updated to reflect the loss of the main model.
    Type: Application
    Filed: June 10, 2024
    Publication date: October 3, 2024
    Inventors: Chen Xing, Wenhao Liu, Chu Hong Hoi, Nitish Shirish Keskar, Caiming Xiong
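    Illustrative sketch: the ordering of the four passes in PyTorch. The model interfaces (returning hidden states plus a task output) and the head over the concatenated states are assumptions.

      import torch

      def psp_step(held_out, main, batch, targets, loss_fn):
          # Forward and backward passes on the held-out model.
          h_states, h_out = held_out(batch)       # self-attention hidden states, output
          h_loss = loss_fn(h_out, targets)
          h_loss.backward()                       # gradients for the held-out model
          # Forward pass on the main model, then concatenate the two hidden states.
          m_states, _ = main(batch)
          combined = torch.cat([m_states, h_states.detach()], dim=-1)
          m_loss = loss_fn(main.head(combined), targets)
          m_loss.backward()                       # backward pass on the main model
          return h_loss, m_loss                   # each model is then updated separately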
  • Publication number: 20240289606
    Abstract: Embodiments described herein provide a mixture of encoder-decoder Transformer framework for multi-task pretraining and flexible finetuning for both code understanding and generation tasks. Specifically, the framework is built on multimodal encoder and decoder modules. During pre-training, the encoder-decoder framework is trained with multiple learning objectives, including a diverse set of self-supervised tasks over two major stages of pretraining on unimodal and bimodal data.
    Type: Application
    Filed: February 24, 2023
    Publication date: August 29, 2024
    Inventors: Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Junnan Li, Chu Hong Hoi
  • Patent number: 12072955
    Abstract: Embodiments are directed to pre-training a transformer model using more parameters for sophisticated patterns (PSP++). The transformer model is divided into a held-out model and a main model. A forward pass and a backward pass are performed on the held-out model, where the forward pass determines the self-attention hidden states of the held-out model and the backward pass determines the loss of the held-out model. A forward pass on the main model is performed to determine the self-attention hidden states of the main model. The self-attention hidden states of the main model are concatenated with the self-attention hidden states of the held-out model. A backward pass is performed on the main model to determine the loss of the main model. The parameters of the held-out model are updated to reflect the loss of the held-out model, and the parameters of the main model are updated to reflect the loss of the main model.
    Type: Grant
    Filed: November 22, 2021
    Date of Patent: August 27, 2024
    Assignee: Salesforce, Inc.
    Inventors: Chen Xing, Wenhao Liu, Chu Hong Hoi, Nitish Shirish Keskar, Caiming Xiong
  • Patent number: 12056610
    Abstract: A learning mechanism is provided that trains with partially-labeled web images while correcting the noisy labels during learning. Specifically, the mechanism employs a momentum prototype that represents the common characteristics of a specific class. One training objective is to minimize the difference between the normalized embedding of a training image sample and the momentum prototype of the corresponding class. Meanwhile, during the training process, the momentum prototype is used to generate a pseudo-label for the training image sample, which can then be used to identify and remove out-of-distribution (OOD) samples and to correct the noisy labels from the original partially-labeled training images. The momentum prototype for each class is in turn constantly updated based on the embeddings of new training samples and their pseudo-labels.
    Type: Grant
    Filed: August 28, 2020
    Date of Patent: August 6, 2024
    Assignee: Salesforce, Inc.
    Inventors: Junnan Li, Chu Hong Hoi
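    Illustrative sketch: pseudo-labeling and prototype maintenance for a single sample in PyTorch. The OOD threshold, momentum value, and cosine-style loss are assumptions for illustration.

      import torch
      import torch.nn.functional as F

      def mopro_step(embed, prototypes, momentum=0.999, ood_thresh=0.3):
          # embed: embedding of one training image; prototypes: (C, D), one momentum
          # prototype of common class characteristics per class.
          z = F.normalize(embed, dim=-1)
          sims = prototypes @ z
          pseudo = sims.argmax()              # pseudo-label: the nearest prototype's class
          if sims.max() < ood_thresh:         # far from every prototype: treat as OOD
              return None, prototypes         # drop the sample, keep prototypes as-is
          loss = 1 - sims[pseudo]             # pull the embedding toward its prototype
          with torch.no_grad():               # momentum update of the class prototype
              prototypes[pseudo] = F.normalize(
                  momentum * prototypes[pseudo] + (1 - momentum) * z, dim=-1)
          return loss, prototypes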
  • Publication number: 20240249077
    Abstract: Embodiments described herein provide a data-driven framework that (i) translates demonstration examples into a fixed-length soft prompt, i.e., a sequence of soft tokens; and (ii) learns a global soft prompt that is not generated from demonstrations. The framework then combines the global prompt, the translated prompts, and the original context to create an augmented context, which is given as the final input for the backbone LM to use.
    Type: Application
    Filed: May 12, 2023
    Publication date: July 25, 2024
    Inventors: Hailin Chen, Shafiq Rayhan Joty, Amrita Saha, Chu Hong Hoi
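    Illustrative sketch: assembling the augmented context in PyTorch. The mean-pool translator, prompt lengths, and embedding dimension are assumptions; the disclosed translation network is more elaborate.

      import torch
      import torch.nn as nn

      class AugmentedContext(nn.Module):
          def __init__(self, embed_dim=768, prompt_len=8, global_len=8):
              super().__init__()
              # Global soft prompt: learned directly, not generated from demonstrations.
              self.global_prompt = nn.Parameter(torch.randn(global_len, embed_dim))
              self.translator = nn.Linear(embed_dim, embed_dim)  # demo-to-prompt translator
              self.prompt_len = prompt_len

          def forward(self, demo_embeds, context_embeds):
              # Translate each demonstration into a fixed-length soft prompt
              # (mean-pooling here keeps the sketch short).
              translated = [self.translator(d).mean(0, keepdim=True)
                                .expand(self.prompt_len, -1) for d in demo_embeds]
              translated = torch.cat(translated, dim=0)
              # Augmented context = global prompt + translated prompts + original
              # context, fed as the final input to the backbone LM.
              return torch.cat([self.global_prompt, translated, context_embeds], dim=0)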
  • Patent number: 12001546
    Abstract: Embodiments described herein provide a causality-based anomaly detection mechanism that formulates anomalies in multivariate time series as instances that do not follow the regular causal mechanism. Specifically, the causality-based anomaly detection mechanism leverages the causal structure discovered from data so that the joint distribution of the multivariate time series is factorized into simpler modules, where each module corresponds to a local causal mechanism reflected by the corresponding conditional distribution. Those local mechanisms are modular or autonomous and can be handled separately. In light of this modularity property, the anomaly detection problem naturally decomposes into a series of low-dimensional anomaly detection problems, each concerned with a local mechanism.
    Type: Grant
    Filed: October 29, 2021
    Date of Patent: June 4, 2024
    Assignee: Salesforce, Inc.
    Inventors: Wenzhuo Yang, Chu Hong Hoi, Kun Zhang
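    Illustrative sketch: scoring each variable only against its own local causal mechanism, per the factorization in the entry above. The per-module `nll` scorer and the graph representation are assumptions.

      import numpy as np

      def causal_anomaly_scores(x_t, parents, local_models):
          # x_t: current observation vector; parents[i]: causal parents of variable i;
          # local_models[i]: a model of the conditional P(x_i | parents(x_i)).
          scores = {}
          for i, model in local_models.items():
              parent_values = np.array([x_t[j] for j in parents[i]])
              # A high negative log-likelihood under the local conditional marks the
              # variable as not following its regular causal mechanism.
              scores[i] = model.nll(x_t[i], parent_values)
          return scores   # one low-dimensional detection problem per local mechanism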
  • Patent number: 11989941
    Abstract: Embodiments described herein provide a method of video-text pre-training that effectively learns cross-modal representations from sparse video frames and text. Specifically, an align-and-prompt framework provides video-and-language pre-training that encodes the frames and text independently using a transformer-based video encoder and a text encoder. A multi-modal encoder is then employed to capture cross-modal interaction between a plurality of video frames and a plurality of texts. The pre-training includes prompting entity modeling, which enables the model to capture fine-grained region-entity alignment.
    Type: Grant
    Filed: December 30, 2021
    Date of Patent: May 21, 2024
    Assignee: Salesforce, Inc.
    Inventors: Dongxu Li, Junnan Li, Chu Hong Hoi
  • Publication number: 20240161520
    Abstract: Embodiments described herein provide a multimodal vision-language model. The multimodal vision-language model contains a Generalist Multimodal Transformer capable of completing multiple tasks using the same set of parameters learned during pre-training. The Generalist Multimodal Transformer allows alignment between frozen unimodal encoders, such as image encoders, and large language models, eliminating the need to fine-tune the image encoders and large language models.
    Type: Application
    Filed: January 27, 2023
    Publication date: May 16, 2024
    Inventors: Junnan Li, Chu Hong Hoi
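    Illustrative sketch: a small trainable bridge between a frozen image encoder and a frozen language model, in PyTorch. The query-attention design, dimensions, and model interfaces are assumptions standing in for the Generalist Multimodal Transformer.

      import torch
      import torch.nn as nn

      class FrozenBridge(nn.Module):
          def __init__(self, image_encoder, language_model, img_dim, lm_dim, n_query=32):
              super().__init__()
              self.image_encoder, self.language_model = image_encoder, language_model
              for p in self.image_encoder.parameters():
                  p.requires_grad = False   # the unimodal image encoder stays frozen
              for p in self.language_model.parameters():
                  p.requires_grad = False   # the large language model stays frozen
              self.queries = nn.Parameter(torch.randn(n_query, img_dim))
              self.attn = nn.MultiheadAttention(img_dim, num_heads=8, batch_first=True)
              self.to_lm = nn.Linear(img_dim, lm_dim)

          def forward(self, images, text_embeds):
              with torch.no_grad():
                  feats = self.image_encoder(images)            # (B, N, img_dim)
              q = self.queries.unsqueeze(0).expand(feats.size(0), -1, -1)
              fused, _ = self.attn(q, feats, feats)             # queries attend to image features
              # Project the aligned visual tokens into the LM's space and prepend them.
              return self.language_model(torch.cat([self.to_lm(fused), text_embeds], dim=1))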