Patents by Inventor Dongliang He

Dongliang He has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220319141
    Abstract: A method for processing an image, a device, and a storage medium are provided. The method may include: inputting a target image into a pre-trained image segmentation model, the target image including at least one sub-image; extracting high-level semantic features and low-level features of the target image through the image segmentation model, and determining target location information of the sub-image in the target image based on the high-level semantic features and the low-level features; and performing a preset processing operation on the sub-image, based on the target location information of the sub-image.
    Type: Application
    Filed: June 21, 2022
    Publication date: October 6, 2022
    Inventors: Fanglong LIU, Xin LI, Dongliang HE
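The two-level feature fusion this abstract describes can be illustrated with a minimal NumPy sketch (not the patented model): a coarse high-level semantic probability map is upsampled, gated by a fine low-level map, and the activated region yields the sub-image's bounding box. `locate_subimage` and its inputs are hypothetical stand-ins for the model's internals.

```python
import numpy as np

def locate_subimage(semantic_mask: np.ndarray, low_level_map: np.ndarray, thresh: float = 0.5):
    """Combine a coarse semantic probability map with a fine low-level map and
    return the bounding box (top, left, bottom, right) of the detected sub-image."""
    # Upsample the coarse semantic mask to full resolution by nearest-neighbor repeat.
    scale = low_level_map.shape[0] // semantic_mask.shape[0]
    upsampled = np.kron(semantic_mask, np.ones((scale, scale)))
    # Fuse: high-level semantics gate the low-level detail.
    fused = upsampled * low_level_map
    ys, xs = np.where(fused > thresh)
    if ys.size == 0:
        return None  # no sub-image found
    return int(ys.min()), int(xs.min()), int(ys.max()), int(xs.max())
```

In a real segmentation network the upsampling and gating would be learned layers; the sketch only shows how coarse semantics and fine detail jointly localize a region.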
  • Patent number: 11430265
    Abstract: The present application discloses a video-based human behavior recognition method, apparatus, device and storage medium, and relates to the technical field of human recognition. The specific implementation scheme lies in: acquiring a human rectangle of each video frame of the video to be recognized, where each human rectangle includes a plurality of human key points, and each of the human key points has a key point feature; constructing a feature matrix according to the human rectangle of each video frame; convolving the feature matrix with respect to a video frame quantity dimension to obtain a first convolution result and convolving the feature matrix with respect to a key point quantity dimension to obtain a second convolution result; and inputting the first convolution result and the second convolution result into a preset classification model to obtain a human behavior category of the video to be recognized.
    Type: Grant
    Filed: September 16, 2020
    Date of Patent: August 30, 2022
    Inventors: Zhizhen Chi, Fu Li, Hao Sun, Dongliang He, Xiang Long, Zhichao Zhou, Ping Wang, Shilei Wen, Errui Ding
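The dual-axis convolution in this abstract — one pass over the frame dimension, one over the key-point dimension — can be sketched with NumPy. This is a toy illustration under assumed shapes (`T` frames, `K` key points, `C` features per key point), not the patented network:

```python
import numpy as np

def dual_axis_conv(features: np.ndarray, kernel: np.ndarray):
    """features: (T, K, C) key-point features over T frames.
    Convolve separately along the frame axis and the key-point axis,
    producing the 'first' (temporal) and 'second' (spatial) convolution results."""
    # Temporal convolution: slide the kernel over the frame-quantity dimension.
    temporal = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, features)
    # Key-point convolution: slide the kernel over the key-point-quantity dimension.
    spatial = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, features)
    return temporal, spatial
```

In the described scheme both results would then feed a classification model; here the point is simply that the same feature matrix is convolved along two different dimensions.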
  • Patent number: 11410422
    Abstract: A method and an apparatus for grounding a target video clip in a video are provided. The method includes: determining a current video clip in the video based on a current position; acquiring descriptive information indicative of a pre-generated target video clip descriptive feature, and executing a target video clip determining step which includes: determining current state information of the current video clip, wherein the current state information includes information indicative of a feature of the current video clip; generating a current action policy based on the descriptive information and the current state information, the current action policy being indicative of a position change of the current video clip in the video; the method further comprises: in response to reaching a preset condition, using a video clip resulting from executing the current action policy on the current video clip as the target video clip.
    Type: Grant
    Filed: June 18, 2020
    Date of Patent: August 9, 2022
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Dongliang He, Xiang Zhao, Jizhou Huang, Fu Li, Xiao Liu, Shilei Wen
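The iterative grounding loop described above — repeatedly applying an action policy that shifts the current clip until a preset condition is reached — can be caricatured with a greedy toy policy over per-frame relevance scores. The real method learns its policy from descriptive and state features; everything below (`ground_clip`, the hill-climbing rule) is an assumption for illustration only:

```python
import numpy as np

def ground_clip(frame_scores, start, length, max_steps=50):
    """Iteratively move a fixed-length clip window over per-frame relevance scores.
    Toy 'action policy': shift one frame left or right toward higher mean relevance;
    the preset stopping condition is that no neighboring position improves the match."""
    n = len(frame_scores)
    def window_mean(s):
        return np.mean(frame_scores[s:s + length])
    for _ in range(max_steps):
        candidates = {0: window_mean(start)}
        if start > 0:
            candidates[-1] = window_mean(start - 1)
        if start + length < n:
            candidates[1] = window_mean(start + 1)
        best = max(candidates, key=candidates.get)
        if best == 0:
            break  # condition reached: current position is locally optimal
        start += best
    return start, start + length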
  • Publication number: 20220207299
    Abstract: A method for building an image enhancement model includes obtaining training data; building a neural network model consisting of a feature extraction module, at least one channel dilated convolution module and a spatial upsampling module, where each channel dilated convolution module includes a spatial downsampling submodule, a channel dilation submodule and a spatial upsampling submodule; training the neural network model by using the video frames and the standard images corresponding to the video frames until the neural network model converges, to obtain an image enhancement model. In addition, a method for image enhancement includes obtaining a video frame to be processed; taking the video frame to be processed as an input of an image enhancement model, and taking an output result of the image enhancement model as an image enhancement result of the video frame to be processed.
    Type: Application
    Filed: August 30, 2021
    Publication date: June 30, 2022
    Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Chao LI, Dongliang HE, Wenling GAO, Fu LI, Hao SUN
  • Patent number: 11367313
    Abstract: Embodiments of the present disclosure disclose a method and apparatus for recognizing a body movement. A specific embodiment of the method includes: sampling an input to-be-recognized video to obtain a sampled image frame sequence of the to-be-recognized video; performing key point detection on the sampled image frame sequence by using a trained body key point detection model, to obtain a body key point position heat map of each sampled image frame in the sampled image frame sequence, the body key point position heat map being used to represent a probability feature of a position of a preset body key point; and inputting body key point position heat maps of the sampled image frame sequence into a trained movement classification model to perform classification, to obtain a body movement recognition result corresponding to the to-be-recognized video.
    Type: Grant
    Filed: July 11, 2019
    Date of Patent: June 21, 2022
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Hui Shen, Yuan Gao, Dongliang He, Xiao Liu, Xubin Li, Hao Sun, Shilei Wen, Errui Ding
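The "body key point position heat map" in this abstract is conventionally a per-key-point Gaussian centered on the detected joint, encoding a probability feature of its position. A minimal sketch of rendering one such map (the function name and sigma are illustrative, not from the patent):

```python
import numpy as np

def keypoint_heatmap(h: int, w: int, cy: int, cx: int, sigma: float = 1.5) -> np.ndarray:
    """Render one body key point's position as an (h, w) Gaussian heat map,
    peaking at 1.0 on the key point (cy, cx) and decaying with distance."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))
```

In the described method, one such map per key point per sampled frame is stacked and passed to the movement classification model.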
  • Patent number: 11363271
    Abstract: A method for video frame interpolation, a related electronic device and a storage medium are disclosed. A video is obtained. An (i−1)th frame and an ith frame of the video are obtained. Visual semantic feature maps and depth maps of the (i−1)th frame and the ith frame are obtained. Frame interpolation information is obtained based on the visual semantic feature maps and the depth maps. An interpolated frame between the (i−1)th frame and the ith frame is generated based on the frame interpolation information and the (i−1)th frame and is inserted between the (i−1)th frame and the ith frame.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: June 14, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Chao Li, Yukang Ding, Dongliang He, Fu Li, Hao Sun, Shilei Wen, Hongwu Zhang, Errui Ding
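One role depth maps can play in frame interpolation is occlusion reasoning: pixels closer to the camera should dominate the blended result. A heavily simplified sketch of that idea, with no claim to the patented pipeline (which also uses visual semantic features), assuming aligned frames and depth maps of equal shape:

```python
import numpy as np

def interpolate_frame(prev_frame, next_frame, prev_depth, next_depth):
    """Blend the (i-1)th and ith frames into an intermediate frame,
    letting the pixel that is closer to the camera (smaller depth) contribute more."""
    # Inverse-depth weights: smaller depth -> larger weight.
    w_prev = 1.0 / (prev_depth + 1e-6)
    w_next = 1.0 / (next_depth + 1e-6)
    total = w_prev + w_next
    return (prev_frame * w_prev + next_frame * w_next) / total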
  • Patent number: 11259029
    Abstract: A method, a device and an apparatus for predicting a video coding complexity, and a computer-readable storage medium, are provided. The method includes: acquiring an attribute feature of a target video; extracting a plurality of first target image frames from the target video; performing a frame difference calculation on the plurality of the first target image frames, to acquire a plurality of first frame difference images; determining a histogram feature for frame difference images of the target video according to a statistical histogram of each first frame difference image; and inputting a plurality of features of the target video into a coding complexity prediction model to acquire a coding complexity prediction value of the target video. Through the above method, the BPP prediction value can be acquired intelligently.
    Type: Grant
    Filed: February 21, 2020
    Date of Patent: February 22, 2022
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Zhichao Zhou, Dongliang He, Fu Li, Xiang Zhao, Xin Li, Zhizhen Chi, Xiang Long, Hao Sun
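The frame-difference histogram feature from this abstract is easy to sketch: difference consecutive frames, then histogram the absolute pixel differences as a rough proxy for temporal complexity. The helper below is illustrative only; the patent feeds such features, along with others, into a learned prediction model:

```python
import numpy as np

def frame_diff_histogram(frames, bins: int = 8) -> np.ndarray:
    """Compute frame-difference images for a list of equal-shaped uint8 frames and
    summarize them with a normalized histogram of absolute pixel differences."""
    diffs = [np.abs(frames[i + 1].astype(int) - frames[i].astype(int))
             for i in range(len(frames) - 1)]
    hist, _ = np.histogram(np.concatenate([d.ravel() for d in diffs]),
                           bins=bins, range=(0, 256))
    return hist / hist.sum()
```

A static clip concentrates mass in the lowest bin; rapid motion pushes mass toward higher bins, which correlates with higher bits-per-pixel at a fixed quality.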
  • Patent number: 11256920
    Abstract: A method and an apparatus for classifying a video are provided. The method may include: acquiring a to-be-classified video; extracting a set of multimodal features of the to-be-classified video; inputting the set of multimodal features into a post-fusion model corresponding to each modality respectively, to obtain multimodal category information of the to-be-classified video; and fusing the multimodal category information of the to-be-classified video, to obtain category information of the to-be-classified video. This embodiment improves the accuracy of video classification.
    Type: Grant
    Filed: March 26, 2020
    Date of Patent: February 22, 2022
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Xiang Long, Dongliang He, Fu Li, Zhizhen Chi, Zhichao Zhou, Xiang Zhao, Ping Wang, Hao Sun, Shilei Wen, Errui Ding
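The post-fusion (late fusion) step in this abstract combines per-modality category predictions into one result. A minimal sketch using simple averaging — one of several possible fusion rules, and an assumption here, since the abstract does not specify the operator:

```python
import numpy as np

def late_fusion(modality_probs):
    """Fuse per-modality class probability vectors (e.g. from visual, audio and
    text models) by averaging, then return the fused vector and winning class."""
    fused = np.mean(np.stack(modality_probs), axis=0)
    return fused, int(np.argmax(fused))
```

Weighted averaging or a small learned fusion network are common alternatives when some modalities are more reliable than others.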
  • Publication number: 20210406579
    Abstract: The present disclosure provides a model training method, an identification method, a device, a storage medium and a program product, relating to computer vision technology and deep learning technology. In the provided solution, the unlabeled first training image itself is deformed; a first unsupervised identification result is obtained by using the first model to identify the image before deformation, a second unsupervised identification result is obtained by using the second model to identify the image after deformation, and the first unsupervised identification result is deformed in the same way. A consistency loss function can thus be constructed from the second unsupervised identification result and the deformed identification result. In this way, the constraint effect of the consistency loss function is enhanced without destroying the scene semantic information of the images used for training.
    Type: Application
    Filed: September 8, 2021
    Publication date: December 30, 2021
    Inventors: Tianwei Lin, Dongliang He, Fu Li
  • Publication number: 20210383120
    Abstract: The present disclosure provides a method and apparatus for detecting a region of interest in a video, a device and a storage medium. The method may include: acquiring a current to-be-processed frame from a picture frame sequence of a video; detecting a region of interest (ROI) in the current to-be-processed frame, in response to determining that the current to-be-processed frame is a detection picture frame, to determine at least one ROI in the current to-be-processed frame; and updating a to-be-tracked ROI, based on the ROI in the current to-be-processed frame and a tracking result determined by a pre-order tracking picture frame; and tracking the current to-be-processed frame based on the existing to-be-tracked ROI, in response to determining that the current to-be-processed frame is a tracking picture frame, to determine at least one tracking result as the ROI of the current to-be-processed frame.
    Type: Application
    Filed: December 9, 2020
    Publication date: December 9, 2021
    Inventors: Zhichao ZHOU, Dongliang HE, Fu LI, Hao SUN
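The detection/tracking alternation in this abstract — run a full detector on designated "detection picture frames" and a lighter tracker on the frames in between — can be sketched as a scheduling loop. The callbacks and the fixed `detect_every` schedule are hypothetical; the patent determines frame roles by its own criteria and updates the to-be-tracked ROI set from both sources:

```python
def process_frames(frames, detect_fn, track_fn, detect_every: int = 5):
    """Alternate detection and tracking: run the (expensive) detector on every
    detect_every-th frame and the (cheap) tracker in between, carrying the
    to-be-tracked ROIs forward between frames."""
    rois, results = [], []
    for i, frame in enumerate(frames):
        if i % detect_every == 0:
            rois = detect_fn(frame)       # refresh the to-be-tracked ROIs
        else:
            rois = track_fn(frame, rois)  # propagate ROIs to this frame
        results.append(list(rois))
    return results
```

This pattern trades a small tracking drift between detection frames for a large reduction in per-frame cost.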
  • Publication number: 20210374415
    Abstract: Embodiments of the present disclosure disclose a method and apparatus for identifying a video. A specific embodiment of the method includes: acquiring a predetermined number of video frames from a video to be identified to obtain a video frame sequence; performing the following processing step: importing the video frame sequence into a pre-trained video identification model to obtain a classification tag probability corresponding to the video frame sequence, wherein the classification tag probability is used to characterize a probability of identifying a corresponding tag category of the video to be identified; and setting, in response to the classification tag probability being greater than or equal to a preset identification accuracy threshold, a video tag for the video to be identified according to the classification tag probability, or else increasing the number of video frames in the video frame sequence and continuing to perform the above processing step.
    Type: Application
    Filed: March 5, 2021
    Publication date: December 2, 2021
    Inventors: Dongliang HE, Xiao TAN, Shilei WEN, Hao SUN
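The adaptive sampling loop in this abstract — classify from a small frame sample and retry with more frames when confidence falls below a threshold — can be sketched directly. `score_fn` is a hypothetical stand-in for "run the identification model on n sampled frames", and doubling is an assumed growth schedule (the abstract only says the number is increased):

```python
def classify_with_growing_window(score_fn, n_frames: int = 8,
                                 max_frames: int = 64, threshold: float = 0.9):
    """Classify a video from a small frame sample; while the classification-tag
    probability is below the threshold, enlarge the sample and retry."""
    while True:
        prob, label = score_fn(n_frames)
        if prob >= threshold or n_frames >= max_frames:
            return label, prob, n_frames
        n_frames *= 2  # assumed growth schedule
```

Easy videos are settled cheaply from a few frames; only ambiguous ones pay for a denser sample.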
  • Publication number: 20210360252
    Abstract: A method for video frame interpolation, a related electronic device and a storage medium are disclosed. A video is obtained. An (i−1)th frame and an ith frame of the video are obtained. Visual semantic feature maps and depth maps of the (i−1)th frame and the ith frame are obtained. Frame interpolation information is obtained based on the visual semantic feature maps and the depth maps. An interpolated frame between the (i−1)th frame and the ith frame is generated based on the frame interpolation information and the (i−1)th frame and is inserted between the (i−1)th frame and the ith frame.
    Type: Application
    Filed: December 17, 2020
    Publication date: November 18, 2021
    Inventors: Chao LI, Yukang DING, Dongliang HE, Fu LI, Hao SUN, Shilei WEN, Hongwu ZHANG, Errui DING
  • Publication number: 20210334950
    Abstract: Embodiments of the present disclosure provide a method and apparatus for processing an image, and relates to the field of computer vision technology. The method may include: acquiring a value to be processed, where the value to be processed is associated with an image to be processed; and processing the value to be processed by using a quality scoring model to generate a score of the image to be processed in a target scoring domain, where the score of the image to be processed in the target scoring domain is related to an image quality of the image to be processed.
    Type: Application
    Filed: February 11, 2021
    Publication date: October 28, 2021
    Inventors: Xiang LONG, Ping WANG, Zhichao ZHOU, Fu LI, Dongliang HE, Hao SUN
  • Publication number: 20210334579
    Abstract: A method and apparatus for processing a video frame are provided. The method may include: converting, using an optical flow generated based on a previous frame and a next frame of adjacent frames in a video, a feature map of the previous frame to obtain a converted feature map; determining, based on an error of the optical flow, a weight of the converted feature map, and obtaining a fused feature map based on a weighted result of a feature of the converted feature map and a feature of a feature map of the next frame; and updating the feature map of the next frame as the fused feature map.
    Type: Application
    Filed: February 24, 2021
    Publication date: October 28, 2021
    Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Tianwei LIN, Xin Li, Fu Li, Dongliang He, Hao Sun, Henan Zhang
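The warp-and-fuse step this abstract describes can be sketched with integer optical flow on a single-channel feature map: warp the previous frame's features along the flow, then blend with the next frame's features, trusting the warped features less where the flow error is high. The integer flow and the `exp(-error)` weighting are simplifying assumptions for illustration:

```python
import numpy as np

def fuse_warped_features(prev_feat, next_feat, flow, flow_error):
    """Warp prev_feat (H, W) with per-pixel integer flow (H, W, 2) = (dy, dx),
    then fuse with next_feat, down-weighting pixels with high flow error."""
    h, w = prev_feat.shape
    warped = np.zeros_like(prev_feat)
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y, x]
            sy, sx = y - dy, x - dx          # backward warping: sample the source pixel
            if 0 <= sy < h and 0 <= sx < w:
                warped[y, x] = prev_feat[sy, sx]
    weight = np.exp(-flow_error)              # confident flow -> weight near 1
    return weight * warped + (1 - weight) * next_feat
```

Where the flow is reliable the fused map reuses the previous frame's (temporally consistent) features; where it is not, the fusion falls back to the current frame.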
  • Publication number: 20210319062
    Abstract: Embodiments of the present disclosure disclose a method and apparatus for searching a video segment, a device and a medium, and relate to the field of video data search. The method includes: sampling video frames from a target video and videos to be searched in a video library, and extracting features from the sampled frames; matching the target video and the videos to be searched according to the extracted features to determine a candidate video to be searched that matches the target video; determining at least one candidate video segment from the determined candidate video, and calculating a degree of matching between the target video and each candidate video segment based on the extracted features of each sampled frame; and determining a video segment matching the target video in the videos to be searched according to the calculated degree of matching between the target video and each candidate video segment.
    Type: Application
    Filed: February 23, 2021
    Publication date: October 14, 2021
    Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Xiang Long, Ping Wang, Fu Li, Dongliang He, Hao Sun, Shilei Wen
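The matching step in this abstract — score candidate segments of a searched video against the target video's sampled-frame features — can be sketched as a sliding window of cosine similarities. The average-of-per-frame-similarities score is an assumption; the patent does not pin down the matching-degree formula:

```python
import numpy as np

def best_matching_segment(target_feats, candidate_feats):
    """Slide the target clip's frame features over a candidate video's frame
    features; return (start index, score) of the best-matching segment."""
    t = len(target_feats)
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    best_start, best_score = 0, -1.0
    for s in range(len(candidate_feats) - t + 1):
        # Matching degree: mean frame-wise cosine similarity over the window.
        score = np.mean([cos(target_feats[i], candidate_feats[s + i]) for i in range(t)])
        if score > best_score:
            best_start, best_score = s, score
    return best_start, best_score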
  • Publication number: 20210304413
    Abstract: An image processing method, an image processing device and an electronic device, all relate to computer vision and deep learning. The image processing method includes: acquiring a first image and a second image; performing semantic region segmentation on the first image and the second image to acquire a first segmentation image and a second segmentation image respectively; determining an association matrix between the first segmentation image and the second segmentation image; and processing the first image in accordance with the association matrix to acquire a target image.
    Type: Application
    Filed: June 10, 2021
    Publication date: September 30, 2021
    Inventors: Hao SUN, Fu LI, Tianwei LIN, Dongliang HE
  • Publication number: 20210295546
    Abstract: The satellite image processing method includes: acquiring a first target satellite image; defogging the first target satellite image through a first neural network to acquire a first satellite image; and adjusting an image quality parameter of the first satellite image through a second neural network to acquire a second satellite image.
    Type: Application
    Filed: June 1, 2021
    Publication date: September 23, 2021
    Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Dongliang He, Henan Zhang, Hao Sun
  • Publication number: 20210227302
    Abstract: Embodiments of the present disclosure relate to a method and apparatus for selecting a video clip, a server and a medium. The method may include: determining at least two video clips from a video; for each video clip, performing the following excitement determination steps: inputting a feature sequence of a video frame in the video clip and title information of the video into a pre-established prediction model to obtain a relevance between the inputted video frame and a title of the video, and determining an excitement of the video clip based on the relevance between the video frame in the video clip and the title; and determining a target video clip from the video clips, based on the excitement of each of the video clips.
    Type: Application
    Filed: September 21, 2020
    Publication date: July 22, 2021
    Inventors: Fu LI, Dongliang HE, Hao SUN
  • Publication number: 20210216783
    Abstract: A method includes screening, by a video-clip screening module in a video description model, a plurality of video proposal clips acquired from a video to be analyzed, to acquire a plurality of video clips suitable for description. The screened video clips are then each described by a video-clip describing module. This avoids describing all the video proposal clips: only the screened clips, which correlate strongly with the video and are suitable for description, are described, removing the interference of unsuitable clips from the video's description, guaranteeing the accuracy of the final descriptions of the video clips, and improving their quality.
    Type: Application
    Filed: January 8, 2021
    Publication date: July 15, 2021
    Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Xiang LONG, Dongliang HE, Fu LI, Xiang ZHAO, Tianwei LIN, Hao SUN, Shilei WEN, Errui DING
  • Publication number: 20210216782
    Abstract: A method and apparatus for detecting a temporal action of a video, an electronic device and a storage medium are disclosed, which relates to the field of video processing technologies. An implementation includes: acquiring an initial temporal feature sequence of a video to be detected; acquiring, by a pre-trained video-temporal-action detecting module, implicit features and explicit features of a plurality of configured temporal anchor boxes based on the initial temporal feature sequence; and acquiring, by the video-temporal-action detecting module, the starting position and the ending position of a video clip containing a specified action, the category of the specified action and the probability that the specified action belongs to the category from the plural temporal anchor boxes according to the explicit features and the implicit features of the plural temporal anchor boxes.
    Type: Application
    Filed: January 8, 2021
    Publication date: July 15, 2021
    Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Tianwei LIN, Xin LI, Dongliang HE, Fu LI, Hao SUN, Shilei WEN, Errui DING