Abstract: A bidirectional image-text retrieval method based on a multi-view joint embedding space performs retrieval using semantic associations at both a global level and a local level. The associations are obtained in two views: the frame-sentence view yields semantic association information in a global-level subspace of frames and sentences, and the region-phrase view yields semantic association information in a local-level subspace of regions and phrases. In each view, a dual-branch neural network processes the data to obtain isomorphic features and embeds them in a common space, with a constraint condition applied during training to preserve the original semantic relationships of the data. The two semantic associations are then merged by multi-view merging and sorting to obtain a more accurate semantic similarity between data.
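The merging-and-sorting step can be illustrated with a minimal sketch. The abstract does not state the merging rule, so the weighted sum of cosine similarities used here is an assumption, as are the names `cosine`, `fused_ranking`, and `weight`. Each candidate is also reduced to a single global vector and a single local vector for brevity, whereas the method described above works with multiple regions and phrases per item.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def fused_ranking(query_global, query_local, candidates, weight=0.5):
    """Rank candidates by a weighted sum of global-view and local-view similarity.

    candidates: list of (candidate_id, global_embedding, local_embedding).
    weight: hypothetical mixing coefficient for the global (frame-sentence) view;
            the local (region-phrase) view receives 1 - weight.
    """
    scored = []
    for cand_id, cand_global, cand_local in candidates:
        score = (weight * cosine(query_global, cand_global)
                 + (1 - weight) * cosine(query_local, cand_local))
        scored.append((cand_id, score))
    # Highest fused similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


# Toy example: the query could be an image and the candidates sentences, or vice versa.
candidates = [
    ("a", [1.0, 0.0], [1.0, 0.0]),  # matches the query in the global view only
    ("b", [1.0, 0.0], [0.0, 1.0]),  # matches the query in both views
    ("c", [0.0, 1.0], [1.0, 0.0]),  # matches the query in neither view
]
ranking = fused_ranking([1.0, 0.0], [0.0, 1.0], candidates)
print([cand_id for cand_id, _ in ranking])
```

In this toy case a candidate that agrees with the query in both views outranks one that agrees only in the global view, which is the intuition behind combining the two levels of association.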
Type:
Grant
Filed:
January 29, 2018
Date of Patent:
August 31, 2021
Assignee:
Peking University Shenzhen Graduate School
Inventors:
Wenmin Wang, Lu Ran, Ronggang Wang, Ge Li, Shengfu Dong, Zhenyu Wang, Ying Li, Hui Zhao, Wen Gao