Patents by Inventor Errui DING

Errui DING has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11915484
    Abstract: A method, an apparatus, a device, and a storage medium for generating a target re-recognition model are provided. The method may include: acquiring a set of labeled samples, a set of unlabeled samples, and an initialization model obtained through supervised training; performing feature extraction on each sample in the set of unlabeled samples by using the initialization model; clustering the features extracted from the set of unlabeled samples by using a clustering algorithm; assigning, for each sample in the set of unlabeled samples, a pseudo label to the sample according to the cluster corresponding to the sample in the feature space; and mixing the set of pseudo-labeled samples with the set of labeled samples as a set of training samples, and performing supervised training on the initialization model to obtain the target re-recognition model.
    Type: Grant
    Filed: June 17, 2021
    Date of Patent: February 27, 2024
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Zhigang Wang, Jian Wang, Errui Ding, Hao Sun
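The pseudo-labeling loop in the abstract above can be sketched in a few lines. This is a toy illustration only, not the patented implementation: a one-dimensional gap-based grouping and an identity feature extractor stand in for the real clustering algorithm and initialization model.

```python
# Toy sketch of semi-supervised re-recognition training data preparation:
# extract features for unlabeled samples, cluster them, assign pseudo
# labels from cluster membership, then mix with the labeled set.

def extract_features(model, samples):
    # The initialization model maps each sample into feature space.
    return [model(s) for s in samples]

def cluster_1d(features, threshold=1.0):
    # Stand-in clustering: walk samples in feature order and start a new
    # cluster whenever the gap to the previous feature exceeds `threshold`.
    order = sorted(range(len(features)), key=lambda i: features[i])
    labels = [0] * len(features)
    cluster = 0
    for prev, cur in zip(order, order[1:]):
        if features[cur] - features[prev] > threshold:
            cluster += 1
        labels[cur] = cluster
    return labels

def build_training_set(labeled, unlabeled, model):
    feats = extract_features(model, unlabeled)
    pseudo = cluster_1d(feats)
    # Mix pseudo-labeled samples with the labeled set for supervised training.
    return labeled + list(zip(unlabeled, pseudo))

labeled = [(0.1, 0), (5.2, 1)]        # (sample, label)
unlabeled = [0.2, 0.3, 5.0, 5.1]
training_set = build_training_set(labeled, unlabeled, model=lambda s: s)
```

With the values above, the four unlabeled samples fall into two clusters and the mixed training set has six entries.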
  • Patent number: 11908219
    Abstract: The disclosure provides a method and a device for processing information, an electronic device, and a storage medium, belonging to the field of artificial intelligence, including computer vision, deep learning, and natural language processing. In the method, a computing device recognizes multiple text items in an image. The computing device classifies the text items into a first set of name text items and a second set of content text items based on the semantics of the text items. The computing device performs a matching operation between the first set and the second set based on the layout of the text items in the image, and determines matched name-content text items. The matched name-content text items include a name text item in the first set and a content text item in the second set that matches the name text item. The computing device outputs the matched name-content text items.
    Type: Grant
    Filed: April 29, 2021
    Date of Patent: February 20, 2024
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Zihan Ni, Yipeng Sun, Kun Yao, Junyu Han, Errui Ding, Jingtuo Liu, Haifeng Wang
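The classify-then-match flow described above can be illustrated with a minimal layout-based matcher. Everything here is an assumption for illustration: the trailing-colon "semantic" classifier, the box format, and the same-row tolerance are made up, not taken from the patent.

```python
# Hypothetical sketch: pair each "name" text item with the nearest
# "content" item lying to its right on roughly the same row.

def is_name(text):
    return text.endswith(":")  # toy stand-in for semantic classification

def match_name_content(items, row_tol=5):
    # items: list of (text, x, y), with y as the row coordinate
    names = [it for it in items if is_name(it[0])]
    contents = [it for it in items if not is_name(it[0])]
    pairs = []
    for name, nx, ny in names:
        same_row = [(t, x, y) for t, x, y in contents
                    if abs(y - ny) <= row_tol and x > nx]
        if same_row:
            text, _, _ = min(same_row, key=lambda it: it[1])  # leftmost match
            pairs.append((name, text))
    return pairs

items = [("Name:", 0, 10), ("Alice", 40, 11),
         ("Date:", 0, 30), ("2024-02-20", 40, 29)]
matched = match_name_content(items)
```

Here "Name:" pairs with "Alice" and "Date:" pairs with the date string, because each content item sits on the same row as, and to the right of, its name item.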
  • Patent number: 11893708
    Abstract: Provided are an image processing method and apparatus, a device, and a storage medium, relating to the technical field of image processing, and in particular to artificial intelligence fields such as computer vision and deep learning. The specific implementation scheme is as follows: inputting a to-be-processed image into an encoding network to obtain a basic image feature, wherein the encoding network includes at least two cascaded overlapping encoding sub-networks, each performing encoding and fusion processing on input data at two or more resolutions; and inputting the basic image feature into a decoding network to obtain a target image feature for pixel-point classification, wherein the decoding network includes at least one cascaded overlapping decoding sub-network that performs decoding and fusion processing on input data at two or more resolutions.
    Type: Grant
    Filed: October 20, 2021
    Date of Patent: February 6, 2024
    Assignee: Beijing Baidu Netcom Science Technology Co., Ltd.
    Inventors: Jian Wang, Xiang Long, Hao Sun, Zhiyong Jin, Errui Ding
  • Patent number: 11881044
    Abstract: A method and apparatus for processing an image, a device and a storage medium are provided. An implementation of the method includes: acquiring a template image, the template image including at least one region of interest; determining a first feature map corresponding to each region of interest in the template image; acquiring a target image; determining a second feature map of the target image; and determining at least one region of interest in the target image according to the first feature map and the second feature map.
    Type: Grant
    Filed: June 21, 2021
    Date of Patent: January 23, 2024
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Chengquan Zhang, Mengyi En, Ju Huang, Qunyi Xie, Xiameng Qin, Kun Yao, Junyu Han, Jingtuo Liu, Errui Ding
  • Publication number: 20240013558
    Abstract: Provided are cross-modal feature extraction, retrieval, and model training methods and apparatuses, and a medium, relating to the field of artificial intelligence (AI) technologies, and specifically to the fields of deep learning, image processing, and computer vision. A specific implementation solution involves: acquiring to-be-processed data, the to-be-processed data corresponding to at least two types of first modalities; determining first data of a second modality in the to-be-processed data, the second modality being any one of the first modalities; performing semantic entity extraction on the first data to obtain semantic entities; and acquiring semantic coding features of the first data based on the first data and the semantic entities, by using a pre-trained cross-modal feature extraction model.
    Type: Application
    Filed: February 23, 2023
    Publication date: January 11, 2024
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Haoran WANG, Dongliang HE, Fu LI, Errui DING
  • Publication number: 20230419592
    Abstract: A method for training a three-dimensional face reconstruction model includes inputting an acquired sample face image into a three-dimensional face reconstruction model to obtain a coordinate transformation parameter and a face parameter of the sample face image; determining the three-dimensional stylized face image of the sample face image according to the face parameter of the sample face image and the acquired stylized face map of the sample face image; transforming the three-dimensional stylized face image of the sample face image into a camera coordinate system based on the coordinate transformation parameter, and rendering the transformed three-dimensional stylized face image to obtain a rendered map; and training the three-dimensional face reconstruction model according to the rendered map and the stylized face map of the sample face image.
    Type: Application
    Filed: January 20, 2023
    Publication date: December 28, 2023
    Inventors: Di WANG, Ruizhi CHEN, Chen ZHAO, Jingtuo LIU, Errui DING, Tian WU, Haifeng WANG
  • Publication number: 20230419610
    Abstract: An image rendering method includes the steps below. A model of an environmental object is rendered to obtain an image of the environmental object in a target perspective. An image of a target object in the target perspective and a model of the target object are determined according to a neural radiance field of the target object. The image of the target object is fused and rendered into the image of the environmental object according to the model of the target object.
    Type: Application
    Filed: March 16, 2023
    Publication date: December 28, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Xing LIU, Ruizhi CHEN, Yan ZHANG, Chen ZHAO, Hao SUN, Jingtuo LIU, Errui DING, Tian WU, Haifeng WANG
  • Patent number: 11854237
    Abstract: A human body identification method, an electronic device and a storage medium, related to the technical field of artificial intelligence such as computer vision and deep learning, are provided. The method includes: inputting an image to be identified into a human body detection model, to obtain a plurality of preselected detection boxes; identifying a plurality of key points from each of the preselected detection boxes respectively according to a human body key point detection model, and obtaining a key point score of each of the key points; determining a target detection box from each of the preselected detection boxes, according to a number of the key points whose key point scores meet a key point threshold; and inputting the target detection box into a human body key point classification model, to obtain a human body identification result for the image to be identified.
    Type: Grant
    Filed: June 21, 2021
    Date of Patent: December 26, 2023
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Zipeng Lu, Jian Wang, Yuchen Yuan, Hao Sun, Errui Ding
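The box-filtering step described above (keeping a preselected detection box only when enough of its keypoints score above a threshold) is simple to sketch. The thresholds and box format below are illustrative assumptions, not values from the patent.

```python
# Illustrative sketch: select target detection boxes by counting
# keypoints whose scores meet a keypoint threshold.

def select_target_boxes(boxes, score_thresh=0.5, min_keypoints=3):
    # boxes: list of (box_id, [keypoint_score, ...])
    targets = []
    for box_id, scores in boxes:
        confident = sum(1 for s in scores if s >= score_thresh)
        if confident >= min_keypoints:
            targets.append(box_id)
    return targets

boxes = [("person_a", [0.9, 0.8, 0.7, 0.2]),
         ("person_b", [0.3, 0.1, 0.6, 0.4])]
selected = select_target_boxes(boxes)
```

Only "person_a" survives: it has three keypoints at or above 0.5, while "person_b" has one.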
  • Publication number: 20230386168
    Abstract: A pre-training method for a Vision and Scene Text Aggregation model includes: acquiring a sample image-text pair; extracting a sample scene text from a sample image; inputting a sample text into a text encoding network to obtain a sample text feature; inputting the sample image and an initial sample aggregation feature into a visual encoding subnetwork and inputting the initial sample aggregation feature and the sample scene text into a scene encoding subnetwork to obtain a global image feature of the sample image and a learned sample aggregation feature; and pre-training the Vision and Scene Text Aggregation model according to the sample text feature, the global image feature of the sample image, and the learned sample aggregation feature.
    Type: Application
    Filed: March 29, 2023
    Publication date: November 30, 2023
    Inventors: Yipeng SUN, Mengjun CHENG, Longchao WANG, Xiongwei ZHU, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING, Jingdong WANG, Haifeng WANG
  • Patent number: 11776155
    Abstract: Embodiments of the present disclosure provide a method and apparatus for detecting a target object in an image. The method includes performing the following prediction operations using a pre-trained neural network: detecting a target object in a two-dimensional image to determine a two-dimensional bounding box of the target object; and determining a relative position constraint relationship between the two-dimensional bounding box of the target object and a three-dimensional projection bounding box obtained by projecting a three-dimensional bounding box of the target object into the two-dimensional image. The method further includes: determining the three-dimensional projection bounding box of the target object based on the two-dimensional bounding box of the target object and the relative position constraint relationship between the two bounding boxes.
    Type: Grant
    Filed: June 5, 2020
    Date of Patent: October 3, 2023
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Xiaoqing Ye, Xiao Tan, Wei Zhang, Hao Sun, Errui Ding
  • Patent number: 11763552
    Abstract: A method for detecting a surface defect, a method for training a model, an apparatus, a device, and a medium are provided. The method includes: inputting a surface image of an article for detection into a defect detection model to perform defect detection, and acquiring a defect detection result output by the defect detection model; inputting the surface image of an article determined to be defective based on the defect detection result into an image discrimination model to determine whether the surface image of the defective article is defective, wherein the image discrimination model is a trained generative adversarial network model obtained by training on surface images of defect-free articles; and adjusting the defect detection result for the surface image of the defective article according to the determination result of the image discrimination model.
    Type: Grant
    Filed: December 9, 2020
    Date of Patent: September 19, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Shufei Lin, Jianfeng Zhu, Pengcheng Yuan, Bin Zhang, Shumin Han, Yingbo Xu, Yuan Feng, Ying Xin, Xiaodi Wang, Jingwei Liu, Shilei Wen, Hongwu Zhang, Errui Ding
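The adjustment step above (a second model double-checking the detector's "defective" verdicts) can be sketched as follows. The toy discriminator and the overturn-on-disagreement policy are assumptions for illustration; the patented system uses a trained generative adversarial network, not this pixel rule.

```python
# Illustrative sketch: let a discriminator re-check each image the
# detector flagged as defective, and overturn the verdict when the
# discriminator sees no defect.

def adjust_results(detections, discriminator):
    # detections: list of (image, detector_says_defective)
    adjusted = []
    for image, defective in detections:
        if defective and not discriminator(image):
            defective = False  # discriminator disagrees: overturn the verdict
        adjusted.append((image, defective))
    return adjusted

# Toy discriminator stand-in: any pixel above 0.9 looks defective.
disc = lambda img: any(p > 0.9 for p in img)
dets = [([0.1, 0.95], True), ([0.1, 0.2], True), ([0.3, 0.4], False)]
adjusted = adjust_results(dets, disc)
```

The second verdict is overturned (the discriminator finds nothing suspicious), while the first is confirmed and the third is untouched.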
  • Publication number: 20230290126
    Abstract: Provided are a method for training a region of interest (ROI) detection model, a method for detecting an ROI, a device, and a medium. The specific implementation includes: performing feature extraction on a sample image to obtain sample feature data; performing non-linear mapping on the sample feature data to obtain first feature data and second feature data; determining inter-region difference data according to the second feature data and third feature data, the third feature data being the first feature data in a region associated with a label ROI; and adjusting at least one of a to-be-trained feature extraction parameter and a to-be-trained feature enhancement parameter of the ROI detection model according to the inter-region difference data and the region associated with the label ROI.
    Type: Application
    Filed: February 28, 2023
    Publication date: September 14, 2023
    Inventors: Pengyuan LV, Sen FAN, Chengquan ZHANG, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING, Jingdong WANG
  • Publication number: 20230289402
    Abstract: Provided are a joint perception model training method, a joint perception method, a device, and a storage medium. The joint perception model training method includes: acquiring sample images and perception tags of the sample images; acquiring a preset joint perception model, where the joint perception model includes a feature extraction network and a joint perception network; performing feature extraction on the sample images through the feature extraction network to obtain target sample features; performing joint perception through the joint perception network according to the target sample features to obtain perception prediction results; and training the preset joint perception model according to the perception prediction results and the perception tags, where the joint perception includes executing at least two perception tasks.
    Type: Application
    Filed: November 14, 2022
    Publication date: September 14, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Jian WANG, Xiangbo SU, Qiman WU, Zhigang WANG, Hao SUN, Errui DING, Jingdong WANG, Tian WU, Haifeng WANG
  • Publication number: 20230215203
    Abstract: The present disclosure provides a character recognition model training method and apparatus, a character recognition method and apparatus, a device, and a medium, relating to the technical field of artificial intelligence, and specifically to the technical fields of deep learning, image processing, and computer vision, which can be applied to scenarios such as character detection and recognition. The specific implementation solution is: partitioning an untagged training sample into at least two sub-sample images; dividing the at least two sub-sample images into a first training set and a second training set, where the first training set includes a first sub-sample image with a visible attribute and the second training set includes a second sub-sample image with an invisible attribute; and performing self-supervised training on a to-be-trained encoder by taking the second training set as a tag of the first training set, to obtain a target encoder.
    Type: Application
    Filed: February 14, 2023
    Publication date: July 6, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Pengyuan LV, Chengquan ZHANG, Shanshan LIU, Meina QIAO, Yangliu XU, Liang WU, Xiaoyan WANG, Kun YAO, Junyu HAN, Errui DING, Jingdong WANG, Tian WU, Haifeng WANG
  • Publication number: 20230213388
    Abstract: A method and an apparatus for measuring temperature, and a computer-readable storage medium, are provided. The method includes: detecting a target position of an object in an input image; determining key points of the target position and weight information of each key point based on a detection result of the target position, in which the weight information indicates the probability of each key point being covered; acquiring temperature information of each key point; and determining a temperature of the target position based at least on the temperature information and the weight information of each key point.
    Type: Application
    Filed: October 14, 2020
    Publication date: July 6, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Haocheng Feng, Haixiao Yue, Keyao Wang, Gang Zhang, Yanwen Fan, Xiyu Yu, Junyu Han, Jingtuo Liu, Errui Ding, Haifeng Wang
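The final aggregation step above (combining per-keypoint readings while discounting keypoints that are probably covered) can be sketched as a weighted average. The specific weighting, one minus the cover probability, is an assumption for illustration, not the patented formula.

```python
# Minimal sketch: estimate a temperature from per-keypoint readings,
# down-weighting keypoints that are likely covered.

def estimate_temperature(readings):
    # readings: list of (temperature, cover_probability) per keypoint
    weights = [1.0 - p for _, p in readings]  # covered points count less
    total = sum(weights)
    if total == 0:
        return None  # every keypoint fully covered: no reliable estimate
    return sum(t * w for (t, _), w in zip(readings, weights)) / total

readings = [(36.5, 0.0), (36.7, 0.0), (39.0, 1.0)]  # last point fully covered
estimate_temperature(readings)  # ~36.6; the covered 39.0 reading is ignored
```

Because the third keypoint has cover probability 1.0, its (spurious) 39.0 reading contributes nothing, and the estimate is the mean of the two uncovered readings.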
  • Publication number: 20230215136
    Abstract: The present disclosure provides a method and apparatus for training a multi-modal data matching degree calculation model, a method and apparatus for calculating a multi-modal data matching degree, an electronic device, a computer readable storage medium and a computer program product, and relates to the field of artificial intelligence technology such as deep learning, image processing and computer vision. The method comprises: acquiring first sample data and second sample data that are different in modalities; constructing a contrastive learning loss function comprising a semantic perplexity parameter, the semantic perplexity parameter being determined based on a semantic feature distance between the first sample data and the second sample data; and training, by using the contrastive learning loss function, an initial multi-modal data matching degree calculation model through a contrastive learning approach, to obtain a target multi-modal data matching degree calculation model.
    Type: Application
    Filed: February 24, 2023
    Publication date: July 6, 2023
    Inventors: Haoran WANG, Dongliang HE, Fu LI, Errui DING
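The contrastive objective described above can be sketched with a distance-weighted softmax-style loss. The exact form of the semantic perplexity parameter is not given in the abstract, so a simple weight that grows with the feature distance between the matched pair stands in for it here; the distance metric and temperature are also assumptions.

```python
# Hedged sketch: a contrastive loss whose per-pair weight (a stand-in
# for the "semantic perplexity" parameter) grows with the semantic
# feature distance between the two modalities of a matched pair.
import math

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def weighted_contrastive_loss(pairs, negatives, temperature=1.0):
    # pairs: matched (image_feat, text_feat); negatives: mismatched text feats
    loss = 0.0
    for img, txt in pairs:
        perplexity = 1.0 + distance(img, txt)          # stand-in weight
        pos = math.exp(-distance(img, txt) / temperature)
        neg = sum(math.exp(-distance(img, n) / temperature) for n in negatives)
        loss += -perplexity * math.log(pos / (pos + neg))
    return loss / len(pairs)

pairs = [([1.0, 0.0], [0.9, 0.1])]   # a well-matched image/text pair
negatives = [[0.0, 1.0]]             # a distant, mismatched text feature
weighted_contrastive_loss(pairs, negatives)  # small positive loss
```

A close matched pair with a distant negative yields a small loss; harder pairs are penalized more, both through the softmax term and through the distance-based weight.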
  • Patent number: 11694436
    Abstract: The present application discloses a vehicle re-identification method and apparatus, a device and a storage medium, which relates to the field of computer vision, intelligent search, deep learning and intelligent transportation. The specific implementation scheme is: receiving a re-identification request from a terminal device, the re-identification request including a first image of a first vehicle shot by a first camera and information of the first camera; acquiring a first feature of the first vehicle and a first head orientation of the first vehicle according to the first image; determining a second image of the first vehicle from images of multiple vehicles according to the first feature, multiple second features extracted based on the images of the multiple vehicles in an image database, the first head orientation of the first vehicle, and the information of the first camera; and transmitting the second image to the terminal device.
    Type: Grant
    Filed: February 1, 2021
    Date of Patent: July 4, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Minyue Jiang, Xiao Tan, Hao Sun, Hongwu Zhang, Shilei Wen, Errui Ding
  • Publication number: 20230186486
    Abstract: A method for tracking vehicles includes: extracting a target image at a current moment from a video stream obtained during the traveling of vehicles; performing instance segmentation on the target image to obtain detection boxes corresponding to individual vehicles in the target image; extracting, from the detection box of each vehicle, a set of pixel points corresponding to that vehicle; processing the image features of each pixel point in the set to determine the features of each vehicle in the target image; and determining a movement trajectory of each vehicle in the target image according to the features of each vehicle in the target image and their degree of matching with the features of each vehicle in past images, wherein the past images are the n images adjacent to and before the target image in the video stream, and n is a positive integer.
    Type: Application
    Filed: October 30, 2020
    Publication date: June 15, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Wei Zhang, Xiao Tan, Hao Sun, Shilei Wen, Hongwu Zhang, Errui Ding
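The trajectory-extension step above (matching current detections to known vehicles by feature similarity) can be sketched as a greedy association. Cosine similarity, the match threshold, and the greedy assignment order are illustrative assumptions, not the patented matching procedure.

```python
# Illustrative sketch: extend each vehicle trajectory with the current
# detection whose feature matches its last known feature best.

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return num / den if den else 0.0

def associate(trajectories, detections, min_match=0.8):
    # trajectories: {vehicle_id: last_feature}; detections: list of features
    assignments = {}
    for vid, feat in trajectories.items():
        scored = [(cosine(feat, d), i) for i, d in enumerate(detections)]
        best_score, best_i = max(scored)
        if best_score >= min_match and best_i not in assignments.values():
            assignments[vid] = best_i  # greedy one-to-one assignment
    return assignments

trajectories = {"car_1": [1.0, 0.0], "car_2": [0.0, 1.0]}
detections = [[0.1, 0.9], [0.9, 0.1]]
associate(trajectories, detections)  # car_1 -> detection 1, car_2 -> detection 0
```

Each vehicle claims the detection most similar to its stored feature, which is how a trajectory is carried forward from the past n frames into the current one.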
  • Publication number: 20230147550
    Abstract: A method for pre-training a semantic representation model includes: for each video-text pair in pre-training data, determining a mask image sequence, a mask character sequence, and a mask image-character sequence of the video-text pair; determining a plurality of feature sequences and mask position prediction results respectively corresponding to the plurality of feature sequences by inputting the mask image sequence, the mask character sequence, and the mask image-character sequence into an initial semantic representation model; and building a loss function based on the plurality of feature sequences, the mask position prediction results respectively corresponding to the plurality of feature sequences and true mask position results, and adjusting coefficients of the semantic representation model to realize training.
    Type: Application
    Filed: November 1, 2022
    Publication date: May 11, 2023
    Inventors: Dongliang HE, Errui DING
  • Publication number: 20230130006
    Abstract: The present application provides a method of processing a video, a method of querying a video, and a method of training a video processing model. A specific implementation solution of the method of processing the video includes: extracting, for a video to be processed, a plurality of video features under a plurality of receptive fields; extracting a local feature of the video to be processed according to a video feature under a target receptive field in the plurality of receptive fields; obtaining a global feature of the video to be processed according to a video feature under a largest receptive field in the plurality of receptive fields; and merging the local feature and the global feature to obtain a target feature of the video to be processed.
    Type: Application
    Filed: December 22, 2022
    Publication date: April 27, 2023
    Inventors: Dongliang HE, Errui DING, Haifeng WANG
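The local/global merge described above can be sketched with simple temporal pooling. The pooling windows and the concatenation merge are made-up choices for illustration; the patent does not specify them in the abstract.

```python
# Minimal sketch: a short-window average plays the role of the feature
# under a target (small) receptive field, a whole-clip average plays the
# largest receptive field, and the two are merged into a target feature.

def local_feature(frame_feats, window=2):
    # Average over the first `window` frames: a small receptive field.
    return [sum(f[i] for f in frame_feats[:window]) / window
            for i in range(len(frame_feats[0]))]

def global_feature(frame_feats):
    # Average over the whole clip: the largest receptive field.
    return [sum(f[i] for f in frame_feats) / len(frame_feats)
            for i in range(len(frame_feats[0]))]

def merge(local, global_):
    # Concatenation is one simple merge strategy.
    return local + global_

frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]  # per-frame features
target = merge(local_feature(frames), global_feature(frames))
```

The resulting target feature carries both the short-range detail and the clip-level context, which is the point of merging the two receptive fields.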