Patents by Inventor Junyu Han

Junyu Han has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11908219
    Abstract: The disclosure provides a method and a device for processing information, an electronic device, and a storage medium, belonging to a field of artificial intelligence including computer vision, deep learning, and natural language processing. In the method, the computing device recognizes multiple text items in the image. The computing device classifies multiple text items into a first set of name text items and a second set of content text items based on semantics of the text items. The computing device performs a matching operation between the first set and the second set based on a layout of the text items in the image, and determines matched name-content text items. The matched name-content text items include a name text item in the first set and a content text item matching the name text item and in the second set. The computing device outputs the matched name-content text items.
    Type: Grant
    Filed: April 29, 2021
    Date of Patent: February 20, 2024
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Zihan Ni, Yipeng Sun, Kun Yao, Junyu Han, Errui Ding, Jingtuo Liu, Haifeng Wang
  • Patent number: 11881044
    Abstract: A method and apparatus for processing an image, a device and a storage medium are provided. An implementation of the method includes: acquiring a template image, the template image including at least one region of interest; determining a first feature map corresponding to each region of interest in the template image; acquiring a target image; determining a second feature map of the target image; and determining at least one region of interest in the target image according to the first feature map and the second feature map.
    Type: Grant
    Filed: June 21, 2021
    Date of Patent: January 23, 2024
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Chengquan Zhang, Mengyi En, Ju Huang, Qunyi Xie, Xiameng Qin, Kun Yao, Junyu Han, Jingtuo Liu, Errui Ding
  • Patent number: 11861919
    Abstract: A text recognition method includes: acquiring an image including text information, the text information including M characters, M being a positive integer greater than 1; performing text recognition on the image to acquire character information about the M characters; recognizing reading direction information about each character in accordance with the character information about the M characters, the reading direction information being used to indicate a next character corresponding to a current character in a semantic reading order; and ranking the M characters in accordance with the reading direction information about the M characters to acquire a text recognition result of the text information.
    Type: Grant
    Filed: June 21, 2021
    Date of Patent: January 2, 2024
    Assignee: Beijing Baidu Netcom Science Technology Co., Ltd.
    Inventors: Chengquan Zhang, Pengyuan Lv, Kun Yao, Junyu Han, Jingtuo Liu
  • Patent number: 11854246
    Abstract: A method, apparatus, device and storage medium for recognizing a bill image may include: performing text detection on a bill image, and determining an attribute information set and a relationship information set of each text box of at least two text boxes in the bill image; determining a type of the text box and an associated text box that has a structural relationship with the text box based on the attribute information set and the relationship information set of the text box; and extracting structured bill data of the bill image, based on the type of the text box and the associated text box that has the structural relationship with the text box.
    Type: Grant
    Filed: March 15, 2021
    Date of Patent: December 26, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Yulin Li, Ju Huang, Xiameng Qin, Junyu Han
  • Publication number: 20230386168
    Abstract: A pre-training method for a Vision and Scene Text Aggregation model includes: acquiring a sample image-text pair; extracting a sample scene text from a sample image; inputting a sample text into a text encoding network to obtain a sample text feature; inputting the sample image and an initial sample aggregation feature into a visual encoding subnetwork and inputting the initial sample aggregation feature and the sample scene text into a scene encoding subnetwork to obtain a global image feature of the sample image and a learned sample aggregation feature; and pre-training the Vision and Scene Text Aggregation model according to the sample text feature, the global image feature of the sample image, and the learned sample aggregation feature.
    Type: Application
    Filed: March 29, 2023
    Publication date: November 30, 2023
    Inventors: Yipeng SUN, Mengjun CHENG, Longchao WANG, Xiongwei ZHU, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING, Jingdong WANG, Haifeng Wang
  • Patent number: 11783565
    Abstract: An image processing method, an electronic device and a readable storage medium, which relate to the technical field of computer vision, are disclosed. In an embodiment, a face recognition module and a service processing module maintains respectively an image buffer queue; the face recognition module maintains a face image buffer queue of face images, and the service processing module maintains a background image buffer queue of background images, i.e., first images; since the face recognition module only maintains the face image buffer queue, only a determined optimal face image, i.e., a face image to be matched, is transmitted to the service processing module, and then, the service processing module determines a background image matched with the optimal face image transmitted by the face recognition module from the maintained background image buffer queue, thus performing image recognition and image matching on a face appearing in a video source.
    Type: Grant
    Filed: April 27, 2021
    Date of Patent: October 10, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Yuexiang Hu, Renyi Zhou, Junyu Han
  • Patent number: 11775574
    Abstract: A method for visual question answering, a computer device implementing the method and a medium for storing instructions on performing the method are provided. The method includes: acquiring an input image and an input question; constructing a visual graph based on the input image, wherein the visual graph comprises a first node feature and a first edge feature; constructing a question graph based on the input question, wherein the question graph comprises a second node feature and a second edge feature; performing a multimodal fusion on the visual graph and the question graph to obtain an updated visual graph and an updated question graph; determining a question feature based on the input question; determining a fusion feature based on the updated visual graph, the updated question graph and the question feature; and generating a predicted answer for the input image and the input question.
    Type: Grant
    Filed: February 23, 2021
    Date of Patent: October 3, 2023
    Assignee: Beijing Baidu Netcom Science Technology Co., Ltd.
    Inventors: Yulin Li, Xiameng Qin, Ju Huang, Qunyi Xie, Junyu Han
  • Patent number: 11768876
    Abstract: The present disclosure provides a method for visual question answering, which relates to a field of computer vision and natural language processing. The method includes: acquiring an input image and an input question; constructing a Visual Graph based on the input image, wherein the Visual Graph comprises a Node Feature and an Edge Feature; updating the Node Feature by using the Node Feature and the Edge Feature to obtain an updated Visual Graph; determining a question feature based on the input question; fusing the updated Visual Graph and the question feature to obtain a fused feature; and generating a predicted answer for the input image and the input question based on the fused feature. The present disclosure further provides an apparatus for visual question answering, a computer device and a non-transitory computer-readable storage medium.
    Type: Grant
    Filed: January 28, 2021
    Date of Patent: September 26, 2023
    Assignee: Beijing Baidu Netcom Science Technology Co., Ltd.
    Inventors: Xiameng Qin, Yulin Li, Qunyi Xie, Ju Huang, Junyu Han
  • Publication number: 20230290126
    Abstract: Provided are a method for training a region of interest (ROI) detection model, a method for detecting an ROI, a device, and a medium. The specific implementation includes: performing feature extraction on a sample image to obtain a sample feature data; performing non-linear mapping on the sample feature data to obtain a first feature data and a second feature data; determining an inter-region difference data according to the second feature data and a third feature data of the first feature data in a region associated with a label ROI; and adjusting at least one of a to-be-trained feature extraction parameter and a to-be-trained feature enhancement parameter of the ROI detection model according to the inter-region difference data and the region associated with the label ROI.
    Type: Application
    Filed: February 28, 2023
    Publication date: September 14, 2023
    Inventors: Pengyuan LV, Sen FAN, Chengquan ZHANG, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING, Jingdong WANG
  • Patent number: 11756170
    Abstract: Embodiments of the present disclosure provide a method and apparatus for correcting a distorted document image, where the method for correcting a distorted document image includes: obtaining a distorted document image; and inputting the distorted document image into a correction model, and obtaining a corrected image corresponding to the distorted document image; where the correction model is a model obtained by training with a set of image samples as inputs and a corrected image corresponding to each image sample in the set of image samples as an output, and the image samples are distorted. By inputting the distorted document image to be corrected into the correction model, the corrected image corresponding to the distorted document image can be obtained through the correction model, which realizes document image correction end-to-end, improves accuracy of the document image correction, and extends application scenarios of the document image correction.
    Type: Grant
    Filed: January 19, 2021
    Date of Patent: September 12, 2023
    Inventors: Qunyi Xie, Xiameng Qin, Yulin Li, Junyu Han, Shengxian Zhu
  • Patent number: 11756332
    Abstract: The present application discloses an image recognition method, apparatus, device, and a computer storage medium, which is related to a technical field of artificial intelligence, and in particular, to a technical field of image processing. The method includes: performing organ recognition on a human face image and marking positions of the human facial five sense organs in the human face image, obtaining a marked human face image; inputting the marked human face image into a backbone network model and performing feature extraction, obtaining defect features of the marked human face image outputted by different convolutional neural network levels of the backbone network model; and fusing the defect features of different levels that are located in a same area of the human face image, obtaining a defect recognition result of the human face image.
    Type: Grant
    Filed: March 22, 2021
    Date of Patent: September 12, 2023
    Inventors: Zhizhi Guo, Yipeng Sun, Jingtuo Liu, Junyu Han
  • Publication number: 20230213388
    Abstract: A method and an apparatus for measuring temperature, and a computer-readable storage medium includes detecting a target position of an object in an input image; determining key points of the target position and weight information of each key point based on a detection result of the target position, in which the weight information is configured to indicate a probability of each key point being covered; acquiring temperature information of each key point; and determining a temperature of the target position at least based on the temperature information and the weight information of each key point.
    Type: Application
    Filed: October 14, 2020
    Publication date: July 6, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Haocheng Feng, Haixiao Yue, Keyao Wang, Gang Zhang, Yanwen Fan, Xiyu Yu, Junyu Han, Jingtuo Liu, Errui Ding, Haifeng Wang
  • Publication number: 20230215203
    Abstract: The present disclosure provides a character recognition model training method and apparatus, a character recognition method and apparatus, a device and a medium, relating to the technical field of artificial intelligence, and specifically to the technical fields of deep learning, image processing and computer vision, which can be applied to scenarios such as character detection and recognition technology. The specific implementing solution is: partitioning an untagged training sample into at least two sub-sample images; dividing the at least two sub-sample images into a first training set and a second training set; where the first training set includes a first sub-sample image with a visible attribute, and the second training set includes a second sub-sample image with an invisible attribute; performing self-supervised training on a to-be-trained encoder by taking the second training set as a tag of the first training set, to obtain a target encoder.
    Type: Application
    Filed: February 14, 2023
    Publication date: July 6, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Pengyuan LV, Chengquan ZHANG, Shanshan LIU, Meina QIAO, Yangliu XU, Liang WU, Xiaoyan WANG, Kun YAO, Junyu Han, Errui DING, Jingdong WANG, Tian WU, Haifeng WANG
  • Publication number: 20230206667
    Abstract: A method for recognizing text includes: obtaining a first feature map of an image; for each target feature unit, performing a feature enhancement process on a plurality of feature values of the target feature unit respectively based on the plurality of feature values of the target feature unit, in which the target feature unit is a feature unit in the first feature map along a feature enhancement direction; and performing a text recognition process on the image based on the first feature map after the feature enhancement process.
    Type: Application
    Filed: December 29, 2022
    Publication date: June 29, 2023
    Inventors: Pengyuan LV, Liang WU, Shanshan LIU, Meina QIAO, Chengquan ZHANG, Kun YAO, Junyu HAN
  • Patent number: 11687704
    Abstract: Disclosed are a method, apparatus and electronic device for annotating information of a structured document. A specific implementation is: obtaining a template image of a structured document and at least one piece of annotation information of a field to be filled in the template image, where the annotation information includes attribute value and historical content of the field to be filled, and historical position of the field to be filled in the template image; generating, according to the attribute value of the field to be filled, the historical content of the field to be filled and the historical position of the field to be filled in the template image, target filling information of the field to be filled; obtaining, according to the target filling information of the field to be filled, an image of an annotated structured document.
    Type: Grant
    Filed: March 19, 2021
    Date of Patent: June 27, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Qiaoyi Li, Xiangkai Huang, Yulin Li, Ju Huang, Xiameng Qin, Duohao Qin, Minghao Liu, Junyu Han
  • Patent number: 11687779
    Abstract: An image recognition method is provided, which is related to a technical field of artificial intelligence, and in particular, to a technical field of image processing. An implementation includes: performing five-sense-organ recognition on a preprocessed human face image and marking positions of the human facial five sense organs in the human face image, to obtain the marked human face image; determining human face images at multiple scales of the marked human face image, inputting the human face images of multiple scales into a backbone network model, and performing feature extraction, to obtain a wrinkle feature of the human face image at each of the multiple scales; and fusing the wrinkle feature at each scale that is located in a same area of the human face image, to obtain a wrinkle recognition result of the human face image.
    Type: Grant
    Filed: March 22, 2021
    Date of Patent: June 27, 2023
    Inventors: Zhizhi Guo, Yipeng Sun, Jingtuo Liu, Junyu Han
  • Publication number: 20230145443
    Abstract: Provided are a video stitching method and an apparatus, an electronic device, and a storage medium. In the video stitching method, an intermediate frame is inserted between a last image frame of a first video and a first image frame of a second video. L image frames are sequentially selected in order from back to front from the first video and L image frames are sequentially selected in order from front to back from the second video separately, and L is a natural number greater than 1. The first video and the second video are stitched together to form a target video according to the intermediate frame, the L image frames in the first video, and the L image frames in the second video.
    Type: Application
    Filed: October 4, 2022
    Publication date: May 11, 2023
    Inventors: Tianshu HU, Hanqi GUO, Junyu HAN, Zhibin HONG
  • Publication number: 20230120985
    Abstract: A method for training a face recognition model includes: acquiring a plurality of first training images being uncovered face images, and acquiring a plurality of covering object images; generating a plurality of second training images by separately fusing the plurality of covering object images with the uncovered face images; and training the face recognition model by inputting the plurality of first training images and the plurality of second training images into the face recognition model.
    Type: Application
    Filed: December 16, 2022
    Publication date: April 20, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Yanwen Fan, Xiyu Yu, Gang Zhang, Jingtuo Liu, Haifeng Wang, Errui Ding, Junyu Han
  • Publication number: 20230124389
    Abstract: A model determination method and electronic device is provided, and relates to the technical field of artificial intelligence and, in particular, to the field of computer visions and deep learning, and can be applied to image processing, image identification and other scenarios. A specific implementation solution includes an image sample and a text sample are acquired, wherein text data in the text sample is used for performing text description to target image data in the image sample; at least one image feature in the image sample is stored to a first queue, and at least text feature in the text sample is stored to a second queue; the first queue and the second queue are trained to obtain a first target model; and the first target model is determined as an initialization model for a second target model.
    Type: Application
    Filed: August 15, 2022
    Publication date: April 20, 2023
    Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.
    Inventors: Longchao WANG, Yipeng SUN, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING
  • Publication number: 20230123327
    Abstract: A method for recognizing text includes: obtaining an image sequence feature of an image to be recognized; obtaining a full text string of the image to be recognized by decoding the image sequence feature; obtaining a text sequence feature by performing a semantic enhancement process on the full text string, in which the image sequence feature, the full text string and the text sequence feature are of the same length; and determining text content of the image to be recognized based on the full text string and the text sequence feature.
    Type: Application
    Filed: December 19, 2022
    Publication date: April 20, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Chengquan Zhang, Pengyuan Lv, Kun Yao, Junyu Han, Jingtuo Liu