Abstract: A method for detecting an occluded image. The method includes: after an image is captured by a camera, obtaining the image as an image to be detected; inputting the image to be detected into a trained occluded-image detection model, the occluded-image detection model is trained based on original occluded images and non-occluded images by using a trained data feature augmentation network; determining whether the image to be detected is an occluded image based on the occluded-image detection model; and outputting an image detection result.
Abstract: The present invention belongs to the technical field of 3D reconstruction in the field of computer vision, and provides a dataset generation method for self-supervised learning scene point cloud completion based on panoramas. Pairs of incomplete point cloud and target point cloud with RGB information and normal information can be generated by taking RGB panoramas, depth panoramas and normal panoramas in the same view as input for constructing a self-supervised learning dataset for training of the scene point cloud completion network. The key points of the present invention are occlusion prediction and equirectangular projection based on view conversion, and processing of the stripe problem and point-to-point occlusion problem during conversion. The method of the present invention includes simplification of the collection mode of the point cloud data in a real scene; occlusion prediction idea of view conversion; and design of view selection strategy.
Abstract: The present disclosure provides a method for visual question answering, which relates to fields of computer vision and natural language processing. The method includes: acquiring an input image and an input question; detecting visual information and position information of each of at least one text region in the input image; determining semantic information and attribute information of each of the at least one text region based on the visual information and the position information; determining a global feature of the input image based on the visual information, the position information, the semantic information, and the attribute information; determining a question feature based on the input question; and generating a predicted answer for the input image and the input question based on the global feature and the question feature. The present disclosure further provides a device for visual question answering, a computer device and a medium.