Patents by Inventor Chuhui Xue

Chuhui Xue has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

VALIDATION OF UNSUPERVISED ADAPTIVE MODELS

Publication number: 20240177460

Abstract: Embodiments of the present disclosure relate to validation of unsupervised adaptive models. According to example embodiments of the present disclosure, unlike methods that validate with the seen target data, the present disclosure synthesizes new samples by mixing the target samples and pseudo labels. The accuracy of the model predictions on the mixed samples, measured against the mixed labels, is used for model selection, and the accuracy score may be called PseudoMix. PseudoMix enjoys the combined inductive bias of previous methods. Experiments demonstrate that PseudoMix can keep state-of-the-art performance across different validation settings.
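
The scoring idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the mixup-style convex combination, the fixed mixing weight `lam`, the random pairing, the use of the dominant sample's pseudo label as the mixed label, and the name `pseudo_mix_score` are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence

Vector = Sequence[float]


def pseudo_mix_score(
    predict: Callable[[Vector], int],
    targets: List[Vector],
    lam: float = 0.6,
    seed: int = 0,
) -> float:
    """Hypothetical mixup-style validation score in [0, 1].

    `predict` maps one sample to a class index. No ground-truth labels
    are used; only unlabeled target samples are required.
    """
    if not targets:
        raise ValueError("targets must be non-empty")
    if not 0.5 < lam <= 1.0:
        raise ValueError("lam must be in (0.5, 1] so that one sample dominates")

    # Pseudo labels: the model's own predictions on the target samples.
    pseudo_labels = [predict(x) for x in targets]

    # Assumption: each sample is paired with a partner from a random permutation.
    rng = random.Random(seed)
    partners = list(range(len(targets)))
    rng.shuffle(partners)

    hits = 0
    for i, j in enumerate(partners):
        # Synthesize a new sample as a convex combination of the pair.
        mixed = [lam * a + (1.0 - lam) * b for a, b in zip(targets[i], targets[j])]
        # Assumption: the mixed label is the pseudo label of the dominant sample.
        mixed_label = pseudo_labels[i]
        hits += int(predict(mixed) == mixed_label)

    # Accuracy of predictions on mixed samples against the mixed labels.
    return hits / len(targets)
```

Under these assumptions, model selection would amount to computing the score for each candidate checkpoint on the same target samples and keeping the checkpoint with the highest value.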

Type: Application

Filed: November 28, 2022

Publication date: May 30, 2024

Inventors: Song BAI, Dapeng HU, Jun Hao LIEW, Chuhui XUE

METHOD, APPARATUS, DEVICE AND MEDIUM FOR IMAGE PROCESSING

Publication number: 20240144656

Abstract: A method, apparatus, device, and medium for image processing are provided. The method includes generating, using an image generation process, a first set of synthetic images based on a first set of codes associated with a first image class in a codebook and based on a first class feature associated with the first image class; generating, using a feature extraction process, a first set of reference features based on the first set of synthetic images and generating a first set of target features based on a plurality of sets of training images belonging to the first image class in a training image set; and updating the image generation process and the codebook according to at least a first training objective to reduce a difference between each reference feature in the first set of reference features and a corresponding target feature in the first set of target features.

Type: Application

Filed: December 22, 2023

Publication date: May 2, 2024

Inventors: Song Bai, Junhao Zhang, Heng Wang, Rui Yan, Chuhui Xue, Wenqing Zhang

MULTIMODAL DATA PROCESSING

Publication number: 20240144664

Abstract: Embodiments of the present disclosure provide a solution for multimodal data processing. A method comprises: obtaining image data and text data; and extracting a target visual feature of the image data and a target textual feature of the text data using a feature extraction model. The feature extraction model comprises alternately deployed cross-modal encoding parts and visual encoding parts. The extracting comprises: performing, using a first cross-modal encoding part of the feature extraction model, cross-modal feature encoding on a first intermediate visual feature of the image data and a first intermediate textual feature of the text data, to obtain a second intermediate visual feature and a second intermediate textual feature; and performing, using a first visual encoding part of the feature extraction model, visual modal feature encoding on the second intermediate visual feature, to obtain a third intermediate visual feature.

Type: Application

Filed: December 21, 2023

Publication date: May 2, 2024

Inventors: Song Bai, Rui Yan, Heng Wang, Junhao Zhang, Chuhui Xue, Wenqing Zhang

PRE-TRAINING FOR SCENE TEXT DETECTION

Publication number: 20240119743

Abstract: Embodiments of the present disclosure relate to a method, device and computer readable storage medium for scene text detection. In the method, a first visual representation of a first image is generated with an image encoding process. A first textual representation of a first text unit in the first image is generated with a text encoding process based on a first plurality of symbols obtained by masking a first symbol of a plurality of symbols in the first text unit. A first prediction of the masked first symbol is determined with a decoding process based on the first visual and textual representations. At least the image encoding process is updated according to at least a first training objective to increase at least the similarity of the first prediction and the masked first symbol.
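
The masking step described in this abstract can be sketched as follows. This covers only the preparation of the masked symbol sequence and its prediction target, not the encoders or the decoder. Treating each character as a symbol, the `[MASK]` placeholder, and the name `mask_one_symbol` are assumptions made for illustration.

```python
import random
from typing import List, Optional, Tuple

MASK = "[MASK]"  # hypothetical placeholder token, not taken from the patent


def mask_one_symbol(
    text_unit: str, rng: Optional[random.Random] = None
) -> Tuple[List[str], int, str]:
    """Mask one randomly chosen symbol of a text unit.

    Returns the masked symbol sequence, the masked position, and the
    original symbol that a decoder would be trained to predict.
    """
    if not text_unit:
        raise ValueError("text_unit must be non-empty")
    rng = rng or random.Random()

    # Assumption: every character of the text unit counts as one symbol.
    symbols = list(text_unit)
    position = rng.randrange(len(symbols))
    target = symbols[position]
    symbols[position] = MASK
    return symbols, position, target
```

For a text unit such as "EXIT", the function returns a four-element sequence with one element replaced by the placeholder, together with the position and the original character at that position.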

Type: Application

Filed: September 28, 2022

Publication date: April 11, 2024

Inventors: Chuhui XUE, Wenqing ZHANG, Yu HAO, Song BAI

MODEL TRAINING BASED ON SYNTHETIC DATA

Publication number: 20230334834

Abstract: Embodiments of the present disclosure relate to model training based on synthetic data. According to example embodiments of the present disclosure, synthetic images are generated by providing respective text prompts to a text-to-image generation model. Respective training labels associated with the synthetic images are also generated based on the text prompts used. A target model, which is configured to perform an image classification task, is trained based at least in part on the synthetic images and the associated training labels. Through this solution, synthetic images can be obtained automatically at large scale and used to train a model for image classification, improving model performance in data-scarce settings or in model pre-training, where the amount of training data matters.
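
The way labels can follow from the prompts may be sketched as below. The prompt template, the `generate` callable standing in for a text-to-image model, and the name `build_synthetic_dataset` are hypothetical; the abstract does not specify how prompts are worded or which generation model is used.

```python
from typing import Any, Callable, List, Sequence, Tuple


def build_synthetic_dataset(
    class_names: Sequence[str],
    generate: Callable[[str], Any],
    images_per_class: int = 2,
    template: str = "a photo of a {}",
) -> List[Tuple[Any, int]]:
    """Pair each generated image with the label implied by its prompt.

    `generate` is a stand-in for any text-to-image model: it takes a
    prompt string and returns an image object of any type.
    """
    dataset: List[Tuple[Any, int]] = []
    for label, name in enumerate(class_names):
        # Assumption: the prompt is built from the class name, so the
        # class index can serve directly as the training label.
        prompt = template.format(name)
        for _ in range(images_per_class):
            dataset.append((generate(prompt), label))
    return dataset
```

The resulting list of (image, label) pairs could then be fed to an ordinary supervised training loop for the image classifier, alone or combined with real labeled images.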

Type: Application

Filed: June 20, 2023

Publication date: October 19, 2023

Inventors: Song BAI, Ruifei HE, Shuyang SUN, Xin YU, Chuhui XUE, Wenqing ZHANG, Xiaojuan QI

OPEN VOCABULARY 3D SCENE PROCESSING

Publication number: 20230290051

Abstract: A method is proposed for detecting an object in a 3D scene, including obtaining a detecting model that describes an association relationship between a plurality of base classes of a plurality of objects and 3D data of the plurality of objects. A plurality of open classes of a plurality of candidate objects to be detected in a 3D scene is received; the plurality of open classes comprises the plurality of base classes and at least one novel class not in the plurality of base classes. A 3D portion is detected in 3D data of the 3D scene based on the detecting model and the plurality of open classes; the 3D portion corresponds to a target candidate object in the plurality of candidate objects. With this method, objects that belong to a novel class, not annotated in the training data of the detecting model, may be detected from the 3D data.

Type: Application

Filed: March 14, 2023

Publication date: September 14, 2023

Inventors: Song BAI, Runyu Ding, Jihan Yang, Chuhui Xue, Wenqing Zhang, Xiaojuan Qi
