Patents by Inventor Dongliang He

Dongliang He has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210201448
    Abstract: An image filling method and apparatus, a device and a storage medium are disclosed. The image filling method includes: performing multilevel encoding processing on features of an image to be filled to generate multilevel encoded feature layers, sizes of the multilevel encoded feature layers being reduced layer by layer; performing layer-by-layer decoding processing on the multilevel encoded feature layers to obtain multilevel decoded feature layers and a first image, there being no missing region in the first image, wherein the layer-by-layer decoding processing includes a concatenation operation on a decoded feature layer and an encoded feature layer with a same size; and performing up-sampling processing on the first image to obtain multilevel up-sampled feature layers and a second image optimized by the up-sampling processing, the up-sampling processing including a concatenation operation on an up-sampled feature layer and a decoded feature layer with a same size.
    Type: Application
    Filed: March 16, 2021
    Publication date: July 1, 2021
    Inventors: Chao Li, Dongliang He, Fu Li, Hao Sun
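    Illustrative sketch: the coarse encode–decode stage described in the abstract above can be outlined as a small U-Net-style network in which encoded feature layers shrink layer by layer and each decoded layer is concatenated with the encoded layer of the same size. This is a minimal, hypothetical PyTorch sketch, not code from the filing; the channel counts, two-level depth, mask input and the omission of the up-sampling refinement stage are all assumptions.

      import torch
      import torch.nn as nn

      class CoarseFill(nn.Module):
          def __init__(self, ch=32):
              super().__init__()
              # Multilevel encoding: feature maps shrink layer by layer.
              self.enc1 = nn.Sequential(nn.Conv2d(4, ch, 3, stride=2, padding=1), nn.ReLU())
              self.enc2 = nn.Sequential(nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU())
              # Layer-by-layer decoding: each decoded layer is concatenated with the
              # encoded layer of the same spatial size before the next upsampling step.
              self.dec2 = nn.Sequential(nn.ConvTranspose2d(ch * 2, ch, 4, stride=2, padding=1), nn.ReLU())
              self.dec1 = nn.ConvTranspose2d(ch * 2, 3, 4, stride=2, padding=1)

          def forward(self, image, mask):
              x = torch.cat([image, mask], dim=1)      # image with missing region plus its mask
              e1 = self.enc1(x)                        # H/2
              e2 = self.enc2(e1)                       # H/4
              d2 = self.dec2(e2)                       # H/2, same size as e1
              d1 = self.dec1(torch.cat([d2, e1], 1))   # H: the "first image" with no missing region
              return torch.tanh(d1)

      coarse = CoarseFill()
      first_image = coarse(torch.randn(1, 3, 64, 64), torch.ones(1, 1, 64, 64))
      print(first_image.shape)  # torch.Size([1, 3, 64, 64]); an up-sampling stage would refine this further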
  • Publication number: 20210192194
    Abstract: The present application discloses a video-based human behavior recognition method, apparatus, device and storage medium, and relates to the technical field of human recognition. The specific implementation scheme lies in: acquiring a human rectangle of each video frame of the video to be recognized, where each human rectangle includes a plurality of human key points, and each of the human key points has a key point feature; constructing a feature matrix according to the human rectangle of each video frame; convolving the feature matrix with respect to a video frame quantity dimension to obtain a first convolution result and convolving the feature matrix with respect to a key point quantity dimension to obtain a second convolution result; and inputting the first convolution result and the second convolution result into a preset classification model to obtain a human behavior category of the video to be recognized.
    Type: Application
    Filed: September 16, 2020
    Publication date: June 24, 2021
    Inventors: Zhizhen Chi, Fu Li, Hao Sun, Dongliang He, Xiang Long, Zhichao Zhou, Ping Wang, Shilei Wen, Errui Ding
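    Illustrative sketch: the two separate convolutions described in the abstract above (one over the frame-quantity dimension, one over the key-point-quantity dimension) can be written as two 2D convolutions with (3, 1) and (1, 3) kernels over a (channels, frames, key points) feature matrix. This is a hypothetical PyTorch sketch; the tensor layout, kernel sizes, pooling and classifier head are assumptions, not taken from the filing.

      import torch
      import torch.nn as nn

      T, K, C = 16, 17, 4                # frames, key points per person, features per key point

      class TwoBranchClassifier(nn.Module):
          def __init__(self, num_classes=10):
              super().__init__()
              # Branch 1: convolve over the frame-quantity dimension only.
              self.temporal = nn.Conv2d(C, 32, kernel_size=(3, 1), padding=(1, 0))
              # Branch 2: convolve over the key-point-quantity dimension only.
              self.spatial = nn.Conv2d(C, 32, kernel_size=(1, 3), padding=(0, 1))
              self.head = nn.Linear(64, num_classes)   # stands in for the preset classification model

          def forward(self, feats):                        # feats: (N, C, T, K)
              a = self.temporal(feats).mean(dim=(2, 3))    # first convolution result, globally pooled
              b = self.spatial(feats).mean(dim=(2, 3))     # second convolution result, globally pooled
              return self.head(torch.cat([a, b], dim=1))

      feature_matrix = torch.randn(2, C, T, K)             # e.g. (x, y, confidence, ...) per key point
      print(TwoBranchClassifier()(feature_matrix).shape)   # torch.Size([2, 10])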
  • Publication number: 20210019531
    Abstract: A method and an apparatus for classifying a video are provided. The method may include: acquiring a to-be-classified video; extracting a set of multimodal features of the to-be-classified video; inputting the set of multimodal features into a post-fusion model corresponding to each modality respectively, to obtain multimodal category information of the to-be-classified video; and fusing the multimodal category information of the to-be-classified video, to obtain category information of the to-be-classified video. This embodiment improves the accuracy of video classification.
    Type: Application
    Filed: March 26, 2020
    Publication date: January 21, 2021
    Inventors: Xiang Long, Dongliang He, Fu Li, Zhizhen Chi, Zhichao Zhou, Xiang Zhao, Ping Wang, Hao Sun, Shilei Wen, Errui Ding
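    Illustrative sketch: the "post-fusion" scheme in the abstract above is a late-fusion design in which each modality's features go through a model of their own, and the resulting per-modality category information is fused into a single prediction. This hypothetical PyTorch sketch uses per-modality linear heads and a mean over logits as the fusion step; the feature sizes, head architecture and averaging fusion are assumptions, not taken from the filing.

      import torch
      import torch.nn as nn

      class LateFusionClassifier(nn.Module):
          def __init__(self, modality_dims, num_classes):
              super().__init__()
              # One lightweight classifier per modality (e.g. RGB, audio, text features).
              self.heads = nn.ModuleList([nn.Linear(d, num_classes) for d in modality_dims])

          def forward(self, features):                     # list of per-modality feature tensors
              per_modality = [head(f) for head, f in zip(self.heads, features)]
              # Fuse the per-modality category information (here simply the mean of the logits).
              return torch.stack(per_modality, dim=0).mean(dim=0)

      model = LateFusionClassifier(modality_dims=[2048, 128, 300], num_classes=400)
      feats = [torch.randn(4, 2048), torch.randn(4, 128), torch.randn(4, 300)]
      print(model(feats).shape)                            # torch.Size([4, 400])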
  • Patent number: 10861133
    Abstract: A super-resolution video reconstruction method, device, apparatus and a computer-readable storage medium are provided. The method includes: extracting a hypergraph from consecutive frames of an original video; inputting a hypergraph vector of the hypergraph into a residual convolutional neural network to obtain an output result of the residual convolutional neural network; and inputting the output result of the residual convolutional neural network into a spatial upsampling network to obtain a super-resolution frame, wherein a super-resolution video of the original video is formed by multiple super-resolution frames.
    Type: Grant
    Filed: March 6, 2020
    Date of Patent: December 8, 2020
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Chao Li, Dongliang He, Xiao Liu, Yukang Ding, Shilei Wen, Errui Ding, Henan Zhang, Hao Sun
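    Illustrative sketch: one plausible reading of the pipeline above is that consecutive low-resolution frames are stacked, passed through residual convolution blocks, and then spatially up-sampled with sub-pixel convolution. This hypothetical PyTorch sketch follows that reading; treating the "hypergraph" as a channel-wise frame stack, the block count, channel widths and the use of PixelShuffle are all assumptions, not taken from the filing.

      import torch
      import torch.nn as nn

      class ResidualBlock(nn.Module):
          def __init__(self, ch):
              super().__init__()
              self.body = nn.Sequential(
                  nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(ch, ch, 3, padding=1))

          def forward(self, x):
              return x + self.body(x)                      # residual connection

      class SuperResolver(nn.Module):
          def __init__(self, frames=5, scale=4, ch=64):
              super().__init__()
              self.entry = nn.Conv2d(frames * 3, ch, 3, padding=1)
              self.resnet = nn.Sequential(*[ResidualBlock(ch) for _ in range(4)])
              # Spatial upsampling network: sub-pixel convolution to the target scale.
              self.upsample = nn.Sequential(
                  nn.Conv2d(ch, 3 * scale * scale, 3, padding=1),
                  nn.PixelShuffle(scale))

          def forward(self, frame_stack):                  # (N, frames*3, H, W)
              return self.upsample(self.resnet(self.entry(frame_stack)))

      low_res = torch.randn(1, 5 * 3, 64, 64)              # five consecutive RGB frames, stacked
      print(SuperResolver()(low_res).shape)                # torch.Size([1, 3, 256, 256]), one super-resolution frame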
  • Publication number: 20200374526
    Abstract: A method, device, apparatus for predicting a video coding complexity and a computer-readable storage medium are provided. The method includes: acquiring an attribute feature of a target video; extracting a plurality of first target image frames from the target video; performing a frame difference calculation on the plurality of the first target image frames, to acquire a plurality of first frame difference images; determining a histogram feature for frame difference images of the target video according to a statistical histogram of each first frame difference image; and inputting a plurality of features of the target video into a coding complexity prediction model to acquire a coding complexity prediction value of the target video. Through the above method, the BPP (bits per pixel) prediction value can be acquired intelligently.
    Type: Application
    Filed: February 21, 2020
    Publication date: November 26, 2020
    Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Zhichao Zhou, Dongliang He, Fu Li, Xiang Zhao, Xin Li, Zhizhen Chi, Xiang Long, Hao Sun
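    Illustrative sketch: the frame-difference and histogram features described above can be computed with plain NumPy before being handed to whatever prediction model is used. This is a hypothetical sketch; the bin count, the pooling of per-pair histograms, and the example attribute features are assumptions, not taken from the filing.

      import numpy as np

      def frame_difference_histograms(frames, bins=32):
          """frames: list of grayscale frames as uint8 arrays of identical shape."""
          histograms = []
          for prev, cur in zip(frames[:-1], frames[1:]):
              diff = np.abs(cur.astype(np.int16) - prev.astype(np.int16)).astype(np.uint8)
              hist, _ = np.histogram(diff, bins=bins, range=(0, 256), density=True)
              histograms.append(hist)
          # Pool the per-pair histograms into one histogram feature for the video.
          return np.mean(histograms, axis=0)

      # Synthetic example: eight random 64x64 frames stand in for the sampled target frames.
      rng = np.random.default_rng(0)
      frames = [rng.integers(0, 256, size=(64, 64), dtype=np.uint8) for _ in range(8)]
      hist_feature = frame_difference_histograms(frames)
      attribute_feature = np.array([64 * 64, 8])           # e.g. resolution and frame count
      feature_vector = np.concatenate([attribute_feature, hist_feature])
      print(feature_vector.shape)                          # (34,), the input to the coding-complexity predictor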
  • Publication number: 20200372609
    Abstract: A super-resolution video reconstruction method, device, apparatus and a computer-readable storage medium are provided. The method includes: extracting a hypergraph from consecutive frames of an original video; inputting a hypergraph vector of the hypergraph into a residual convolutional neural network to obtain an output result of the residual convolutional neural network; and inputting the output result of the residual convolutional neural network into a spatial upsampling network to obtain a super-resolution frame, wherein a super-resolution video of the original video is formed by multiple super-resolution frames.
    Type: Application
    Filed: March 6, 2020
    Publication date: November 26, 2020
    Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Chao Li, Dongliang He, Xiao Liu, Yukang Ding, Shilei Wen, Errui Ding, Henan Zhang, Hao Sun
  • Publication number: 20200320303
    Abstract: A method and an apparatus for grounding a target video clip in a video are provided. The method includes: determining a current video clip in the video based on a current position; acquiring descriptive information indicative of a pre-generated target video clip descriptive feature; and executing a target video clip determining step which includes: determining current state information of the current video clip, wherein the current state information includes information indicative of a feature of the current video clip; and generating a current action policy based on the descriptive information and the current state information, the current action policy being indicative of a position change of the current video clip in the video. The method further comprises: in response to reaching a preset condition, using a video clip resulting from executing the current action policy on the current video clip as the target video clip.
    Type: Application
    Filed: June 18, 2020
    Publication date: October 8, 2020
    Inventors: Dongliang He, Xiang Zhao, Jizhou Huang, Fu Li, Xiao Liu, Shilei Wen
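    Illustrative sketch: the abstract above describes an iterative, policy-driven search in which the policy looks at the description feature and the current clip's feature and emits a position change until a stopping condition is reached. This hypothetical PyTorch sketch illustrates that loop with a tiny action set and a mean-pooled clip feature; the actions, feature sizes and stopping rule are assumptions, not taken from the filing.

      import torch
      import torch.nn as nn

      ACTIONS = [(-8, 0), (8, 0), (0, -8), (0, 8), (0, 0)]     # (start shift, end shift); (0, 0) means stop

      policy = nn.Sequential(nn.Linear(128 + 128, 64), nn.ReLU(), nn.Linear(64, len(ACTIONS)))

      def clip_feature(video_feats, start, end):
          """Mean-pool per-frame features over the current clip (a stand-in feature extractor)."""
          return video_feats[start:end].mean(dim=0)

      video_feats = torch.randn(300, 128)                      # per-frame features of the video
      description = torch.randn(128)                           # descriptive feature of the target clip
      start, end = 0, 64                                       # current position / current video clip

      for _ in range(20):                                      # preset condition: at most 20 steps
          state = torch.cat([description, clip_feature(video_feats, start, end)])
          action = ACTIONS[int(policy(state).argmax())]
          if action == (0, 0):
              break                                            # the policy chose to stop
          start = max(0, start + action[0])
          end = min(len(video_feats), max(start + 8, end + action[1]))

      print("grounded clip:", (start, end))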
  • Publication number: 20200042776
    Abstract: Embodiments of the present disclosure disclose a method and apparatus for recognizing a body movement. A specific embodiment of the method includes: sampling an input to-be-recognized video to obtain a sampled image frame sequence of the to-be-recognized video; performing key point detection on the sampled image frame sequence by using a trained body key point detection model, to obtain a body key point position heat map of each sampled image frame in the sampled image frame sequence, the body key point position heat map being used to represent a probability feature of a position of a preset body key point; and inputting body key point position heat maps of the sampled image frame sequence into a trained movement classification model to perform classification, to obtain a body movement recognition result corresponding to the to-be-recognized video.
    Type: Application
    Filed: July 11, 2019
    Publication date: February 6, 2020
    Inventors: Hui Shen, Yuan Gao, Dongliang He, Xiao Liu, Xubin Li, Hao Sun, Shilei Wen, Errui Ding
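    Illustrative sketch: the second half of the pipeline above takes per-frame key-point position heat maps and classifies the stacked sequence. This hypothetical PyTorch sketch renders Gaussian heat maps from detected key-point coordinates and feeds them to a small 3D-convolution classifier; the heat-map resolution, the Gaussian rendering and the classifier architecture are assumptions, not taken from the filing.

      import torch
      import torch.nn as nn

      def keypoint_heatmap(xy, size=56, sigma=2.0):
          """Render one Gaussian heat map per key point; xy has shape (K, 2) with values in [0, 1]."""
          ys, xs = torch.meshgrid(torch.arange(size), torch.arange(size), indexing="ij")
          grid = torch.stack([xs, ys], dim=-1).float()          # (size, size, 2)
          centers = xy * (size - 1)                             # (K, 2) in pixel units
          d2 = ((grid[None] - centers[:, None, None]) ** 2).sum(-1)
          return torch.exp(-d2 / (2 * sigma ** 2))              # (K, size, size)

      K, T = 17, 16                                             # key points per person, sampled frames
      frames_xy = torch.rand(T, K, 2)                           # detected key-point positions per sampled frame
      heatmaps = torch.stack([keypoint_heatmap(f) for f in frames_xy])   # (T, K, 56, 56)
      clip = heatmaps.permute(1, 0, 2, 3).unsqueeze(0)          # (1, K, T, 56, 56)

      classifier = nn.Sequential(                               # stands in for the movement classification model
          nn.Conv3d(K, 32, kernel_size=3, padding=1), nn.ReLU(),
          nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(32, 8))
      print(classifier(clip).shape)                             # torch.Size([1, 8]) movement-class scores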
  • Patent number: 10469824
    Abstract: In this disclosure, a hybrid digital-analog video coding scheme is described. For a video including at least two associated sequences of frames, such as a stereo video, some of the frames are digitally encoded, while the others are encoded in an analog way with reference to the digital frames. For an analog frame, a previous frame in the same sequence/view and a temporally consistent frame in another sequence/view will be encoded as digital frames. These two digital frames are used to provide side information in encoding the analog frame. At the decoding side, inverse operations are performed. Such “zigzag” hybrid coding significantly improves the coding efficiency while providing robustness and good visual quality.
    Type: Grant
    Filed: October 24, 2016
    Date of Patent: November 5, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Chong Luo, Dongliang He, Wenjun Zeng
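    Illustrative sketch: the "zigzag" structure described above can be pictured as an assignment of digital (D) and analog (A) frames across the two views such that every analog frame has a digital previous frame in its own view and a digital temporally consistent frame in the other view. This tiny Python sketch prints one such assignment; the exact pattern is an illustrative assumption, not taken from the filing.

      def zigzag_assignment(num_frames):
          """Alternate which view carries the analog frame; the first frame of each view is digital."""
          views = {0: ["D"], 1: ["D"]}
          for t in range(1, num_frames):
              analog_view = (t + 1) % 2                # alternate the analog view over time
              for v in (0, 1):
                  views[v].append("A" if v == analog_view else "D")
          return views

      for v, frames in zigzag_assignment(8).items():
          print(f"view {v}: {' '.join(frames)}")
      # view 0: D A D A D A D A
      # view 1: D D A D A D A D
      # Every "A" frame has a "D" frame directly before it in its own view and a "D" frame
      # at the same time index in the other view to provide side information.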
  • Publication number: 20180270471
    Abstract: In this disclosure, a hybrid digital-analog video coding scheme is described. For a video including at least two associated sequences of frames, such as a stereo video, some of the frames are digitally encoded, while the others are encoded in an analog way with reference to the digital frames. For an analog frame, a previous frame in the same sequence/view and a temporally consistent frame in another sequence/view will be encoded as digital frames. These two digital frames are used to provide side information in encoding the analog frame. At the decoding side, inverse operations are performed. Such “zigzag” hybrid coding significantly improves the coding efficiency while providing robustness and good visual quality.
    Type: Application
    Filed: October 24, 2016
    Publication date: September 20, 2018
    Inventors: Chong Luo, Dongliang He, Wenjun Zeng