Patents by Inventor Kun Yao

Kun Yao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12217388
    Abstract: An image processing method and apparatus, an electronic device, and a storage medium, relating to the technical field of image processing. The method includes: acquiring a first image and a second image, wherein a resolution of the second image is greater than a resolution of the first image; determining difference information between a target pixel in the second image and a reference pixel corresponding to the target pixel in the first image; and acquiring a target image with the same resolution as the second image by applying an image differencing process to a predetermined image with the same resolution as the first image based on the difference information. This method can obtain the target image based on the difference information, and improve image quality.
    Type: Grant
    Filed: May 9, 2022
    Date of Patent: February 4, 2025
    Assignee: REALME CHONGQING MOBILE TELECOMMUNICATIONS CORP, LTD.
    Inventor: Kun Yao
  • Publication number: 20240304015
    Abstract: The present disclosure provides a method of training a deep learning model for text detection and a text detection method, which relate to the technical field of artificial intelligence, in particular to the technical fields of computer vision and deep learning, and can be used in optical character recognition (OCR) scenarios. A method of training a deep learning model for text detection is provided, in which a single character segmentation sub-network outputs a single character segmentation prediction result and a text line segmentation sub-network outputs a text line segmentation prediction result. The trained deep learning model can be used for detecting a text area and can achieve single character segmentation and text line segmentation at the same time; it is thus capable of performing text detection by combining two ways of text segmentation, which further improves the accuracy of text area detection.
    Type: Application
    Filed: April 21, 2022
    Publication date: September 12, 2024
    Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.
    Inventors: Sen FAN, Xiaoyan WANG, Pengyuan LV, Chengquan ZHANG, Kun YAO
  • Publication number: 20240281609
    Abstract: The present application provides a method of training a text recognition model. The method includes: inputting a first sample image into the visual feature extraction sub-model to obtain a first visual feature and a first predicted text, wherein the first sample image contains a text and a tag indicating a first actual text; obtaining, by using the semantic feature extraction sub-model, a first semantic feature based on the first predicted text; obtaining, by using the sequence sub-model, a second predicted text based on the first visual feature and the first semantic feature; and training the text recognition model based on the first predicted text, the second predicted text and the first actual text. The present disclosure further provides a method of recognizing a text, an electronic device, and a storage medium.
    Type: Application
    Filed: May 16, 2022
    Publication date: August 22, 2024
    Inventors: Pengyuan LV, Jingquan LI, Chengquan ZHANG, Kun YAO, Jingtuo LIU, Junyu HAN
  • Publication number: 20240282024
    Abstract: A method of training a text erasure model, a method of displaying a translation, an electronic device, and a storage medium. The training method includes: processing a set of original text block images by using a generator of a generative adversarial network model to obtain a set of simulated text block-erased images; alternately training the generator and a discriminator of the generative adversarial network model by using a set of real text block-erased images and the set of simulated text block-erased images, so as to obtain a trained generator and a trained discriminator; and determining the trained generator as the text erasure model, wherein a pixel value of a text-erased region in a real text block-erased image contained in the set of real text block-erased images is determined based on a pixel value of another region in the real text block-erased image other than the text-erased region.
    Type: Application
    Filed: April 22, 2022
    Publication date: August 22, 2024
    Inventors: Liang WU, Shanshan LIU, Chengquan ZHANG, Kun YAO
  • Publication number: 20240265718
    Abstract: A method of training a text detection model and a method of detecting a text. The training method includes: inputting a sample image into a text feature extraction sub-model of a text detection model to obtain a text feature of a text in the sample image, the sample image having a label indicating actual position information and an actual category; inputting a predetermined text vector into a text encoding sub-model of the text detection model to obtain a text reference feature; inputting the text feature and the text reference feature into a decoding sub-model of the text detection model to obtain a text sequence vector; inputting the text sequence vector into an output sub-model of the text detection model to obtain predicted position information and a predicted category; and training the text detection model based on the predicted and actual categories and the predicted and actual position information.
    Type: Application
    Filed: April 22, 2022
    Publication date: August 8, 2024
    Inventors: Xiaoqiang ZHANG, Xiameng QIN, Chengquan ZHANG, Kun YAO
  • Publication number: 20240259416
    Abstract: Some embodiments operationally connect a risk score with cybersecurity protection mechanisms and user interactions data in a feedback loop. The risk score guides protection activities by the protection mechanisms, thereby prompting or preventing various user interactions. The protection activities and the user interactions are recorded in audit logs, and curated data based on the audit logs is fed to a risk scoring model as input. The risk scoring model then updates the risk score, and the loop repeats as the protection mechanisms alter their protection activities based on the updated risk score, thereby providing adaptive protection. Security tools for insider risk management, data leak prevention, and conditional access are enhanced to provide adaptive protection, by recording protection activities and user interactions for use as risk model input, and by checking regularly for risk score updates and modifying their protection activities accordingly.
    Type: Application
    Filed: February 21, 2023
    Publication date: August 1, 2024
    Inventors: Erin K. MIYAKE, Bhavanesh RENGARAJAN, Tanay BALDUA, Talhah Munawar MIR, Rudra MITRA, Apsara Karen SELVANAYAGAM, John ALPHONSE, Ethan Max STAKOFF, Jiajun ZHANG, Sizhe ZHENG, Ruben Eugenio CANTU VOTA, Kun YAO, Xin LIU, Maithili DANDIGE, Hubert DUSHIME
  • Patent number: 11908219
    Abstract: The disclosure provides a method and a device for processing information, an electronic device, and a storage medium, belonging to a field of artificial intelligence including computer vision, deep learning, and natural language processing. In the method, a computing device recognizes multiple text items in an image. The computing device classifies the multiple text items into a first set of name text items and a second set of content text items based on semantics of the text items. The computing device performs a matching operation between the first set and the second set based on a layout of the text items in the image, and determines matched name-content text items. The matched name-content text items include a name text item in the first set and a content text item in the second set that matches the name text item. The computing device outputs the matched name-content text items.
    Type: Grant
    Filed: April 29, 2021
    Date of Patent: February 20, 2024
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Zihan Ni, Yipeng Sun, Kun Yao, Junyu Han, Errui Ding, Jingtuo Liu, Haifeng Wang
  • Patent number: 11881044
    Abstract: A method and apparatus for processing an image, a device and a storage medium are provided. An implementation of the method includes: acquiring a template image, the template image including at least one region of interest; determining a first feature map corresponding to each region of interest in the template image; acquiring a target image; determining a second feature map of the target image; and determining at least one region of interest in the target image according to the first feature map and the second feature map.
    Type: Grant
    Filed: June 21, 2021
    Date of Patent: January 23, 2024
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Chengquan Zhang, Mengyi En, Ju Huang, Qunyi Xie, Xiameng Qin, Kun Yao, Junyu Han, Jingtuo Liu, Errui Ding
  • Publication number: 20240021000
    Abstract: There is provided an image-based information extraction model, method, and apparatus, a device, and a storage medium, which relate to the field of artificial intelligence (AI) technologies, specifically to the fields of deep learning, image processing, and computer vision technologies, and are applicable to optical character recognition (OCR) and other scenarios. A specific implementation solution involves: acquiring a to-be-extracted first image and a category of to-be-extracted information; and inputting the first image and the category into a pre-trained information extraction model to perform information extraction on the first image to obtain text information corresponding to the category.
    Type: Application
    Filed: February 23, 2023
    Publication date: January 18, 2024
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Xiameng QIN, Yulin LI, Xiaoqiang ZHANG, Ju HUANG, Qunyi XIE, Kun YAO
  • Patent number: 11861919
    Abstract: A text recognition method includes: acquiring an image including text information, the text information including M characters, M being a positive integer greater than 1; performing text recognition on the image to acquire character information about the M characters; recognizing reading direction information about each character in accordance with the character information about the M characters, the reading direction information being used to indicate a next character corresponding to a current character in a semantic reading order; and ranking the M characters in accordance with the reading direction information about the M characters to acquire a text recognition result of the text information.
    Type: Grant
    Filed: June 21, 2021
    Date of Patent: January 2, 2024
    Assignee: Beijing Baidu Netcom Science Technology Co., Ltd.
    Inventors: Chengquan Zhang, Pengyuan Lv, Kun Yao, Junyu Han, Jingtuo Liu
  • Publication number: 20230401828
    Abstract: A method for training an image recognition model includes: obtaining a training data set, in which the training data set includes first text images of each vertical category in a non-target scene and second text images of each vertical category in a target scene, and a type of text content involved in the first text images is the same as a type of text content involved in the second text images; training an initial recognition model by using the first text images, to obtain a basic recognition model; and modifying the basic recognition model by using the second text images, to obtain an image recognition model corresponding to the target scene.
    Type: Application
    Filed: April 8, 2022
    Publication date: December 14, 2023
    Inventors: Meina QIAO, Shanshan LIU, Xiameng QIN, Chengquan ZHANG, Kun YAO
  • Publication number: 20230386168
    Abstract: A pre-training method for a Vision and Scene Text Aggregation model includes: acquiring a sample image-text pair; extracting a sample scene text from a sample image; inputting a sample text into a text encoding network to obtain a sample text feature; inputting the sample image and an initial sample aggregation feature into a visual encoding subnetwork and inputting the initial sample aggregation feature and the sample scene text into a scene encoding subnetwork to obtain a global image feature of the sample image and a learned sample aggregation feature; and pre-training the Vision and Scene Text Aggregation model according to the sample text feature, the global image feature of the sample image, and the learned sample aggregation feature.
    Type: Application
    Filed: March 29, 2023
    Publication date: November 30, 2023
    Inventors: Yipeng SUN, Mengjun CHENG, Longchao WANG, Xiongwei ZHU, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING, Jingdong WANG, Haifeng WANG
  • Patent number: 11830288
    Abstract: Embodiments of the present disclosure provide a method for training a face fusion model and an electronic device. The method includes: performing a first face changing process on a user image and a template image to generate a reference template image; adjusting poses of facial features of the template image based on the reference template image to generate a first input image; performing a second face changing process on the template image to generate a second input image; inputting the first input image and the second input image into a generator of an initial face fusion model to generate a fused face area image; and inputting the fused face area image and the template image into a discriminator of the initial face fusion model to obtain a result, and performing backpropagation correction on the initial face fusion model based on the result to generate a face fusion model.
    Type: Grant
    Filed: March 24, 2021
    Date of Patent: November 28, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Kun Yao, Zhibin Hong, Jieting Xue
  • Publication number: 20230290126
    Abstract: Provided are a method for training a region of interest (ROI) detection model, a method for detecting an ROI, a device, and a medium. The specific implementation includes: performing feature extraction on a sample image to obtain sample feature data; performing non-linear mapping on the sample feature data to obtain first feature data and second feature data; determining inter-region difference data according to the second feature data and third feature data of the first feature data in a region associated with a label ROI; and adjusting at least one of a to-be-trained feature extraction parameter and a to-be-trained feature enhancement parameter of the ROI detection model according to the inter-region difference data and the region associated with the label ROI.
    Type: Application
    Filed: February 28, 2023
    Publication date: September 14, 2023
    Inventors: Pengyuan LV, Sen FAN, Chengquan ZHANG, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING, Jingdong WANG
  • Publication number: 20230260306
    Abstract: A method and an apparatus for recognizing a document image, a storage medium and an electronic device are provided, which relate to the technical field of artificial intelligence recognition, and particularly to the technical fields of deep learning and computer vision. The method includes that a document image to be recognized is transformed into an image feature map, where the document image includes at least one text box and text information including multiple characters; a first recognition content of the document image to be recognized is predicted based on the image feature map, the multiple characters and the text box; the document image to be recognized is recognized based on an optical character recognition algorithm to obtain a second recognition content; and the first recognition content is matched with the second recognition content to obtain a target recognition content.
    Type: Application
    Filed: August 9, 2022
    Publication date: August 17, 2023
    Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.
    Inventors: Yuechen YU, Chengquan ZHANG, Kun YAO
  • Publication number: 20230215203
    Abstract: The present disclosure provides a character recognition model training method and apparatus, a character recognition method and apparatus, a device and a medium, relating to the technical field of artificial intelligence, and specifically to the technical fields of deep learning, image processing and computer vision, which can be applied to scenarios such as character detection and recognition technology. The specific implementing solution is: partitioning an untagged training sample into at least two sub-sample images; dividing the at least two sub-sample images into a first training set and a second training set; where the first training set includes a first sub-sample image with a visible attribute, and the second training set includes a second sub-sample image with an invisible attribute; performing self-supervised training on a to-be-trained encoder by taking the second training set as a tag of the first training set, to obtain a target encoder.
    Type: Application
    Filed: February 14, 2023
    Publication date: July 6, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Pengyuan LV, Chengquan ZHANG, Shanshan LIU, Meina QIAO, Yangliu XU, Liang WU, Xiaoyan WANG, Kun YAO, Junyu HAN, Errui DING, Jingdong WANG, Tian WU, Haifeng WANG
  • Publication number: 20230206667
    Abstract: A method for recognizing text includes: obtaining a first feature map of an image; for each target feature unit, performing a feature enhancement process on a plurality of feature values of the target feature unit respectively based on the plurality of feature values of the target feature unit, in which the target feature unit is a feature unit in the first feature map along a feature enhancement direction; and performing a text recognition process on the image based on the first feature map after the feature enhancement process.
    Type: Application
    Filed: December 29, 2022
    Publication date: June 29, 2023
    Inventors: Pengyuan LV, Liang WU, Shanshan LIU, Meina QIAO, Chengquan ZHANG, Kun YAO, Junyu HAN
  • Publication number: 20230196805
    Abstract: The present disclosure provides a character detection method and apparatus, a model training method and apparatus, a device and a storage medium. The specific implementation is: acquiring a training sample, where the training sample includes a sample image and a marked image, and the marked image is an image obtained by marking a text instance in the sample image; inputting the sample image into a character detection model, to obtain segmented images and image types of the segmented images output by the character detection model, where the image type indicates that the segmented image includes a text instance, or the segmented image does not include a text instance; and adjusting a parameter of the character detection model according to the segmented images, the image types of the segmented images and the marked image.
    Type: Application
    Filed: February 13, 2023
    Publication date: June 22, 2023
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Ju HUANG, Xiaoqiang ZHANG, Xiameng QIN, Chengquan ZHANG, Kun YAO
  • Publication number: 20230186664
    Abstract: A method for text recognition is disclosed. The method includes obtaining a whole-image scenario for an image to be processed and a text image in the image to be processed. The method further includes determining a first text recognition model corresponding to the whole-image scenario. The method further includes performing text recognition on the text image according to the first text recognition model to obtain text information.
    Type: Application
    Filed: February 14, 2023
    Publication date: June 15, 2023
    Inventors: Shanshan LIU, Meina QIAO, Liang WU, Pengyuan LV, Sen FAN, Chengquan ZHANG, Kun YAO
  • Publication number: 20230134615
    Abstract: A method of processing a task, an electronic device, and a storage medium are provided, which relate to a field of artificial intelligence, in particular to fields of deep learning and computer vision, and may be applied to optical character recognition (OCR) and other scenarios. The method includes: parsing labeled data to be processed according to a task type identification, to obtain task labeled data, wherein tag information of the task labeled data matches the task type identification, and the task labeled data includes first task labeled data and second task labeled data; training a model using the first task labeled data, to obtain a plurality of candidate models, wherein the model is determined according to the task type identification; and determining a target model from the candidate models according to a performance evaluation result obtained by performing performance evaluation on the plurality of candidate models using the second task labeled data.
    Type: Application
    Filed: December 27, 2022
    Publication date: May 4, 2023
    Inventors: Qunyi XIE, Dongdong ZHANG, Xiameng QIN, Mengyi EN, Yangliu XU, Yi CHEN, Ju HUANG, Kun YAO