Patents by Inventor Yehao LI

Yehao LI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

MULTI-MODAL PRE-TRAINING METHOD AND MULTI-MODAL PRE-TRAINING APPARATUS

Publication number: 20240378865

Abstract: The present disclosure provides a multi-modal pre-training method and apparatus. The method includes: sampling a video in a video-text pair to obtain a first video frame sequence; performing word segmentation processing on a text in the video-text pair to obtain a first word segmentation sequence; masking the first video frame sequence to obtain a second video frame sequence; masking the first word segmentation sequence to obtain a second word segmentation sequence; encoding the first video frame sequence to obtain a first video feature, and encoding the first word segmentation sequence to obtain a first word segmentation feature; encoding the second video frame sequence to obtain a second video feature, and encoding the second word segmentation sequence to obtain a second word segmentation feature; and performing multi-modal pre-training by using the first video feature, the first word segmentation feature, the second video feature and the second word segmentation feature.
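The data-preparation steps in this abstract (sampling, word segmentation, and the two masking steps) can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not state the sampling strategy, the segmentation method, or the masking ratio, so uniform sampling, whitespace splitting, a 25% ratio, the `[MASK]` placeholder, and all function names below are assumptions made for illustration. The encoding and pre-training steps are left out.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder; the abstract does not name one


def sample_frames(frames, num_samples):
    """Uniformly sample num_samples frames, preserving order (assumed strategy)."""
    if num_samples >= len(frames):
        return list(frames)
    step = len(frames) / num_samples
    return [frames[int(i * step)] for i in range(num_samples)]


def segment_words(text):
    """Whitespace word segmentation (an assumption; real systems use a tokenizer)."""
    return text.lower().split()


def mask_sequence(seq, ratio, rng):
    """Replace a random subset of positions with MASK (ratio is an assumption)."""
    n_mask = max(1, round(len(seq) * ratio)) if seq else 0
    positions = set(rng.sample(range(len(seq)), n_mask))
    return [MASK if i in positions else item for i, item in enumerate(seq)]


def prepare_pair(frames, text, num_samples=4, ratio=0.25, seed=0):
    """Build the first (unmasked) and second (masked) sequences for one pair."""
    rng = random.Random(seed)
    first_frames = sample_frames(frames, num_samples)
    first_words = segment_words(text)
    second_frames = mask_sequence(first_frames, ratio, rng)
    second_words = mask_sequence(first_words, ratio, rng)
    return first_frames, first_words, second_frames, second_words
```

In this sketch the first sequences stay intact and the second sequences are their masked counterparts, so both versions of each modality would be available to the encoders the abstract describes.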

Type: Application

Filed: May 13, 2022

Publication date: November 14, 2024

Inventors: Yehao LI, Yingwei PAN, Ting YAO, Tao MEI

Image description generation method, apparatus and system, and medium and electronic device

Patent number: 12073639

Abstract: The present disclosure relates to the technical field of image processing, and in particular to an image description generation method, apparatus and system, and a medium and an electronic device.

Type: Grant

Filed: March 2, 2021

Date of Patent: August 27, 2024

Assignees: BEIJING JINGDONG SHANGKE INFORMATION TECHNOLOGY CO., LTD., BEIJING JINGDONG CENTURY TRADING CO., LTD.

Inventors: Yingwei Pan, Yehao Li, Ting Yao, Tao Mei

Method and Apparatus for Generating Captioning Device, and Method and Apparatus for Outputting Caption

Publication number: 20240177506

Abstract: A method and apparatus for generating a captioning device, and a method and apparatus for outputting a caption. The method for generating a captioning device comprises: acquiring a sample image set; inputting the sample image set into an image encoder of a sentence generator, so as to output an object set; grouping the object set into a first object set and a second object set, wherein the first object set is an object set that is included within a preset object set, and the second object set is an object set that is excluded from the preset object set; inputting, into a sentence decoder of the sentence generator, the object set output by the image encoder, and performing a beam search in a decoding step by taking the first object set and the second object set as constraint conditions, so as to generate a pseudo-image sentence pair set; and training the sentence generator by taking the pseudo-image sentence pair set as a sample set, so as to obtain a captioning device.
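The grouping step in this abstract, which splits the detected objects by membership in a preset object set, can be sketched as follows. This is only an illustration under assumptions: the function name and the example object labels are hypothetical. The abstract does not say how the two groups constrain the beam search, so that part is not modelled here.

```python
def group_objects(detected, preset):
    """Split detected objects into those inside and outside the preset set.

    Returns (first_set, second_set): objects included in the preset object
    set, and objects excluded from it, each in detection order.
    """
    preset = set(preset)
    first_set = [obj for obj in detected if obj in preset]
    second_set = [obj for obj in detected if obj not in preset]
    return first_set, second_set


# Hypothetical labels, chosen only for illustration.
first, second = group_objects(["dog", "frisbee", "zebra"], {"dog", "zebra"})
```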

Type: Application

Filed: January 6, 2022

Publication date: May 30, 2024

Inventors: Yingwei PAN, Yehao LI, Ting YAO, Tao MEI

IMAGE DESCRIPTION GENERATION METHOD, APPARATUS AND SYSTEM, AND MEDIUM AND ELECTRONIC DEVICE

Publication number: 20230014105

Abstract: The present disclosure relates to the technical field of image processing, and in particular to an image description generation method, apparatus and system, and a medium and an electronic device.

Type: Application

Filed: March 2, 2021

Publication date: January 19, 2023

Inventors: Yingwei PAN, Yehao LI, Ting YAO, Tao MEI
