Patents by Inventor Dongliang He

Dongliang He has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11983849
    Abstract: An image filling method and apparatus, a device and a storage medium are disclosed. The image filling method includes: performing multilevel encoding processing on features of an image to be filled to generate multilevel encoded feature layers, sizes of the multilevel encoded feature layers being reduced layer by layer; performing layer-by-layer decoding processing on the multilevel encoded feature layers to obtain multilevel decoded feature layers and a first image, there being no missing region in the first image, wherein the layer-by-layer decoding processing includes a concatenation operation on a decoded feature layer and an encoded feature layer with a same size; and performing up-sampling processing on the first image to obtain multilevel up-sampled feature layers and a second image optimized by the up-sampling processing, the up-sampling processing including a concatenation operation on an up-sampled feature layer and a decoded feature layer with a same size.
    Type: Grant
    Filed: March 16, 2021
    Date of Patent: May 14, 2024
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Chao Li, Dongliang He, Fu Li, Hao Sun
  • Publication number: 20240013558
    Abstract: There is provided cross-modal feature extraction, retrieval, and model training methods and apparatuses, and a medium, which relates to the field of artificial intelligence (AI) technologies, and specifically to fields of deep learning, image processing, and computer vision technologies. A specific implementation solution involves: acquiring to-be-processed data, the to-be-processed data corresponding to at least two types of first modalities; determining first data of a second modality in the to-be-processed data, the second modality being any of the types of the first modalities; performing semantic entity extraction on the first data to obtain semantic entities; and acquiring semantic coding features of the first data based on the first data and the semantic entities and by using a pre-trained cross-modal feature extraction model.
    Type: Application
    Filed: February 23, 2023
    Publication date: January 11, 2024
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Haoran WANG, Dongliang HE, Fu LI, Errui DING
  • Patent number: 11810310
    Abstract: A satellite image processing method includes: acquiring a first target satellite image; defogging the first target satellite image through a first neural network to acquire a first satellite image; and adjusting an image quality parameter of the first satellite image through a second neural network to acquire a second satellite image.
    Type: Grant
    Filed: June 1, 2021
    Date of Patent: November 7, 2023
    Assignee: Beijing Baidu Netcom Science Technology Co., Ltd.
    Inventors: Dongliang He, Henan Zhang, Hao Sun
  • Patent number: 11748895
    Abstract: A method and apparatus for processing a video frame are provided. The method may include: converting, using an optical flow generated based on a previous frame and a next frame of adjacent frames in a video, a feature map of the previous frame to obtain a converted feature map; determining, based on an error of the optical flow, a weight of the converted feature map, and obtaining a fused feature map based on a weighted result of a feature of the converted feature map and a feature of a feature map of the next frame; and updating the feature map of the next frame as the fused feature map.
    Type: Grant
    Filed: February 24, 2021
    Date of Patent: September 5, 2023
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Tianwei Lin, Xin Li, Fu Li, Dongliang He, Hao Sun, Henan Zhang
  • Patent number: 11734809
    Abstract: Embodiments of the present disclosure provide a method and apparatus for processing an image, and relates to the field of computer vision technology. The method may include: acquiring a value to be processed, where the value to be processed is associated with an image to be processed; and processing the value to be processed by using a quality scoring model to generate a score of the image to be processed in a target scoring domain, where the score of the image to be processed in the target scoring domain is related to an image quality of the image to be processed.
    Type: Grant
    Filed: February 11, 2021
    Date of Patent: August 22, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Xiang Long, Ping Wang, Zhichao Zhou, Fu Li, Dongliang He, Hao Sun
  • Publication number: 20230230205
    Abstract: Provided are an image enhancement method and apparatus, an electronic device, and a storage medium. The image enhancement method includes: acquiring an original image, and configuring the original image as a current image; selecting a renderer from a plurality of pre-trained renderers as a current renderer in response to the current image satisfying a preset enhancement condition; and inputting the current image to the current renderer, and outputting, through the current renderer, an enhanced image of the current image in a dimension corresponding to the current renderer; and repeating the preceding operation by configuring the enhanced image of the current image in the dimension corresponding to the current renderer as the current image until the current image does not satisfy the enhancement condition.
    Type: Application
    Filed: January 18, 2023
    Publication date: July 20, 2023
    Inventors: Xin Li, Dongliang He, Qi Zhang
  • Publication number: 20230232116
    Abstract: Provided are a video conversion method, an electronic device and a non-transitory computer readable storage medium. The implementation scheme is as follows: a to-be-converted SDR video is acquired; one frame is extracted from the to-be-converted SDR video to serve as a current SDR image, the current SDR image is input into a parameter predictor and a generator, and an adjustment parameter corresponding to the current SDR image is output from the parameter predictor; the adjustment parameter corresponding to the current SDR image is input into the generator, and an HDR image corresponding to the current SDR image is output from the generator; and the operation described above is repeatedly performed until frames are converted into HDR images each of which corresponds to a respective frame of the frames; and a corresponding HDR video is generated based on the HDR images corresponding to the frames.
    Type: Application
    Filed: January 18, 2023
    Publication date: July 20, 2023
    Inventors: Qi Zhang, Dongliang He, Xin Li
  • Publication number: 20230215136
    Abstract: The present disclosure provides a method and apparatus for training a multi-modal data matching degree calculation model, a method and apparatus for calculating a multi-modal data matching degree, an electronic device, a computer readable storage medium and a computer program product, and relates to the field of artificial intelligence technology such as deep learning, image processing and computer vision. The method comprises: acquiring first sample data and second sample data that are different in modalities; constructing a contrastive learning loss function comprising a semantic perplexity parameter, the semantic perplexity parameter being determined based on a semantic feature distance between the first sample data and the second sample data; and training, by using the contrastive learning loss function, an initial multi-modal data matching degree calculation model through a contrastive learning approach, to obtain a target multi-modal data matching degree calculation model.
    Type: Application
    Filed: February 24, 2023
    Publication date: July 6, 2023
    Inventors: Haoran WANG, Dongliang HE, Fu LI, Errui DING
  • Patent number: 11657612
    Abstract: Embodiments of the present disclosure disclose a method and apparatus for identifying a video. A specific embodiment of the method includes: acquiring a predetermined number of video frames from a video to be identified to obtain a video frame sequence; performing the following processing step: importing the video frame sequence into a pre-trained video identification model to obtain a classification tag probability corresponding to the video frame sequence, wherein the classification tag probability is used to characterize a probability of identifying a corresponding tag category of the video to be identified; and setting, in response to the classification tag probability being greater than or equal to a preset identification accuracy threshold, a video tag for the video to be identified according to the classification tag probability, or else increasing the number of video frames in the video frame sequence and continuing to perform the above processing step.
    Type: Grant
    Filed: March 5, 2021
    Date of Patent: May 23, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Dongliang He, Xiao Tan, Shilei Wen, Hao Sun
  • Publication number: 20230147550
    Abstract: A method for pre-training a semantic representation model includes: for each video-text pair in pre-training data, determining a mask image sequence, a mask character sequence, and a mask image-character sequence of the video-text pair; determining a plurality of feature sequences and mask position prediction results respectively corresponding to the plurality of feature sequences by inputting the mask image sequence, the mask character sequence, and the mask image-character sequence into an initial semantic representation model; and building a loss function based on the plurality of feature sequences, the mask position prediction results respectively corresponding to the plurality of feature sequences and true mask position results, and adjusting coefficients of the semantic representation model to realize training.
    Type: Application
    Filed: November 1, 2022
    Publication date: May 11, 2023
    Inventors: Dongliang HE, Errui DING
  • Publication number: 20230130006
    Abstract: The present application provides a method of processing a video, a method of querying a video, and a method of training a video processing model. A specific implementation solution of the method of processing the video includes: extracting, for a video to be processed, a plurality of video features under a plurality of receptive fields; extracting a local feature of the video to be processed according to a video feature under a target receptive field in the plurality of receptive fields; obtaining a global feature of the video to be processed according to a video feature under a largest receptive field in the plurality of receptive fields; and merging the local feature and the global feature to obtain a target feature of the video to be processed.
    Type: Application
    Filed: December 22, 2022
    Publication date: April 27, 2023
    Inventors: Dongliang HE, Errui DING, Haifeng WANG
  • Patent number: 11625433
    Abstract: Embodiments of the present disclosure disclose a method and apparatus for searching a video segment, a device and a medium, and relate to the field of video data search. The method includes: sampling video frames from a target video and videos to be searched in a video library, and extracting features from the sampled frames; matching the target video and the videos to be searched according to the extracted features to determine a candidate video to be searched that matches the target video; determining at least one candidate video segment from the determined candidate video, and calculating a degree of matching between the target video and each candidate video segment based on the extracted features of each sampled frame; and determining a video segment matching the target video in the videos to be searched according to the calculated degree of matching between the target video and each candidate video segment.
    Type: Grant
    Filed: February 23, 2021
    Date of Patent: April 11, 2023
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Xiang Long, Ping Wang, Fu Li, Dongliang He, Hao Sun, Shilei Wen
  • Patent number: 11615140
    Abstract: A method includes screening, by a video-clip screening module in a video description model, a plurality of video proposal clips acquired from a video to be analyzed, to acquire a plurality of video clips suitable for description. The plural video proposal clips acquired from the video to be analyzed may be screened by the video-clip screening module to acquire the plural video clips suitable for description; and then, each video clip is described by a video-clip describing module, thus avoiding description of all the video proposal clips, only describing the screened video clips which have strong correlation with the video and are suitable for description, removing the interference of the description of the video clips which are not suitable for description in the description of the video, guaranteeing the accuracy of the final descriptions of the video clips, and improving the quality of the descriptions of the video clips.
    Type: Grant
    Filed: January 8, 2021
    Date of Patent: March 28, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Xiang Long, Dongliang He, Fu Li, Xiang Zhao, Tianwei Lin, Hao Sun, Shilei Wen, Errui Ding
  • Patent number: 11600069
    Abstract: A method and apparatus for detecting a temporal action of a video, an electronic device and a storage medium are disclosed, which relates to the field of video processing technologies. An implementation includes: acquiring an initial temporal feature sequence of a video to be detected; acquiring, by a pre-trained video-temporal-action detecting module, implicit features and explicit features of a plurality of configured temporal anchor boxes based on the initial temporal feature sequence; and acquiring, by the video-temporal-action detecting module, the starting position and the ending position of a video clip containing a specified action, the category of the specified action and the probability that the specified action belongs to the category from the plural temporal anchor boxes according to the explicit features and the implicit features of the plural temporal anchor boxes.
    Type: Grant
    Filed: January 8, 2021
    Date of Patent: March 7, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Tianwei Lin, Xin Li, Dongliang He, Fu Li, Hao Sun, Shilei Wen, Errui Ding
  • Publication number: 20230036338
    Abstract: A method and apparatus for generating an image restoration model, a medium and a program product are provided. The method includes: obtaining a first image and a second image, wherein the second image is an image obtained by restoring the first image; synthesizing images corresponding to feature points of the first image and the first image to obtain a synthesized image; and performing training by using the second image and the synthesized image to obtain an image restoration model.
    Type: Application
    Filed: October 11, 2022
    Publication date: February 2, 2023
    Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.
    Inventors: Fanglong Liu, Xin Li, Dongliang He
  • Publication number: 20230008473
    Abstract: A video repairing method, apparatus, device, medium, and product are provided. The method includes: acquiring a to-be-repaired video frame sequence; determining a target category corresponding to each pixel in the to-be-repaired video frame sequence based on the to-be-repaired video frame sequence and a preset category detection model; determining, from the to-be-repaired video frame sequence, to-be-repaired pixels each with a target category being a to-be-repaired category; and performing repairing on to-be-repaired areas corresponding to the to-be-repaired pixels to obtain a target video frame sequence.
    Type: Application
    Filed: September 14, 2022
    Publication date: January 12, 2023
    Inventors: Xin LI, He ZHENG, Fanglong LIU, Dongliang HE
  • Patent number: 11514676
    Abstract: The present disclosure provides a method and apparatus for detecting a region of interest in a video, a device and a storage medium. The method may include: acquiring a current to-be-processed frame from a picture frame sequence of a video; detecting a region of interest (ROI) in the current to-be-processed frame, in response to determining that the current to-be-processed frame is a detection picture frame, to determine at least one ROI in the current to-be-processed frame; and updating a to-be-tracked ROI, based on the ROI in the current to-be-processed frame and a tracking result determined by a pre-order tracking picture frame; and tracking the current to-be-processed frame based on the existing to-be-tracked ROI, in response to determining that the current to-be-processed frame is a tracking picture frame, to determine at least one tracking result as the ROI of the current to-be-processed frame.
    Type: Grant
    Filed: December 9, 2020
    Date of Patent: November 29, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Zhichao Zhou, Dongliang He, Fu Li, Hao Sun
  • Patent number: 11490168
    Abstract: Embodiments of the present disclosure relate to a method and apparatus for selecting a video clip, a server and a medium. The method may include: determining at least two video clips from a video; for each video clip, performing the following excitement determination steps: inputting a feature sequence of a video frame in the video clip and title information of the video into a pre-established prediction model to obtain a relevance between the inputted video frame and a title of the video; and determining an excitement of the video clip, based on the relevance between the video frame in the video clip and the title; and determining a target video clip from the video clips, based on the excitement of each of the video clips.
    Type: Grant
    Filed: September 21, 2020
    Date of Patent: November 1, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Fu Li, Dongliang He, Hao Sun
  • Publication number: 20220319141
    Abstract: A method for processing an image, a device, and a storage medium are provided. The method may include: inputting a target image into a pre-trained image segmentation model, the target image including at least one sub-image; extracting high-level semantic features and low-level features of the target image through the image segmentation model, and determining target location information of the sub-image in the target image based on the high-level semantic features and the low-level features; and performing a preset processing operation on the sub-image, based on the target location information of the sub-image.
    Type: Application
    Filed: June 21, 2022
    Publication date: October 6, 2022
    Inventors: Fanglong LIU, Xin LI, Dongliang HE
  • Patent number: 11430265
    Abstract: The present application discloses a video-based human behavior recognition method, apparatus, device and storage medium, and relates to the technical field of human recognition. The specific implementation scheme lies in: acquiring a human rectangle of each video frame of the video to be recognized, where each human rectangle includes a plurality of human key points, and each of the human key points has a key point feature; constructing a feature matrix according to the human rectangle of each video frame; convolving the feature matrix with respect to a video frame quantity dimension to obtain a first convolution result and convolving the feature matrix with respect to a key point quantity dimension to obtain a second convolution result; and inputting the first convolution result and the second convolution result into a preset classification model to obtain a human behavior category of the video to be recognized.
    Type: Grant
    Filed: September 16, 2020
    Date of Patent: August 30, 2022
    Inventors: Zhizhen Chi, Fu Li, Hao Sun, Dongliang He, Xiang Long, Zhichao Zhou, Ping Wang, Shilei Wen, Errui Ding
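
The two convolutions named in the abstract of patent 11430265 (one along the video frame dimension, one along the key point dimension) might be sketched as follows. This is only an illustration under stated assumptions: the array layout (frames, key points, feature size), the moving-average kernel, and the function names `conv_along_axis` and `two_axis_convolutions` are hypothetical and are not taken from the patent, and the preset classification model that would consume both results is omitted.

```python
import numpy as np


def conv_along_axis(features, kernel, axis):
    """Apply a 1D 'valid' convolution to every vector along `axis`."""
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), axis, features
    )


def two_axis_convolutions(feature_matrix, kernel):
    """Convolve an assumed (frames, key points, feature size) array twice.

    Returns one result convolved over the frame axis and one result
    convolved over the key point axis.
    """
    first = conv_along_axis(feature_matrix, kernel, axis=0)   # over frames
    second = conv_along_axis(feature_matrix, kernel, axis=1)  # over key points
    return first, second


# Hypothetical input: 8 frames, 5 key points, 3 features per key point.
feature_matrix = np.arange(8 * 5 * 3, dtype=float).reshape(8, 5, 3)
kernel = np.ones(3) / 3.0  # illustrative moving-average kernel

first, second = two_axis_convolutions(feature_matrix, kernel)
print(first.shape, second.shape)
```

With a length-3 kernel and "valid" mode, each convolution shortens only the axis it runs over, so the two results keep the other dimensions unchanged and could both be passed on to a downstream classifier.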