Patents by Inventor Chengquan Zhang

Chengquan Zhang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230010031
    Abstract: A method for recognizing a text, an electronic device and a storage medium. An implementation of the method comprises: obtaining a multi-dimensional first feature map of a to-be-recognized image; performing, based on feature values in the first feature map, feature enhancement processing on each feature value in the first feature map; and performing a text recognition on the to-be-recognized image based on the first feature map after the enhancement processing.
    Type: Application
    Filed: September 16, 2022
    Publication date: January 12, 2023
    Inventors: Pengyuan LYU, Sen FAN, Xiaoyan WANG, Yuechen YU, Chengquan ZHANG, Kun YAO, Junyu HAN
  • Publication number: 20220415071
    Abstract: The present disclosure provides a training method of a text recognition model, a text recognition method, and an apparatus, relating to the technical field of artificial intelligence, and specifically, to the technical field of deep learning and computer vision, which can be applied in scenarios such as optical character recognition, etc. The specific implementation solution is: performing mask prediction on visual features of an acquired sample image, to obtain a predicted visual feature; performing mask prediction on semantic features of acquired sample text, to obtain a predicted semantic feature, where the sample image includes text; determining a first loss value of the text of the sample image according to the predicted visual feature; determining a second loss value of the sample text according to the predicted semantic feature; training, according to the first loss value and the second loss value, to obtain the text recognition model.
    Type: Application
    Filed: August 31, 2022
    Publication date: December 29, 2022
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Chengquan ZHANG, Pengyuan LV, Shanshan LIU, Meina QIAO, Yangliu XU, Liang WU, Jingtuo LIU, Junyu HAN, Errui DING, Jingdong WANG
  • Publication number: 20220392243
    Abstract: A method for training a text classification model and an electronic device are provided. The method may include: acquiring a set of to-be-trained images, the set of to-be-trained images including at least one sample image; determining predicted position information and predicted attribute information of each text line in each sample image based on each sample image; and training to obtain the text classification model, based on the annotation position information and the annotation attribute information of each text line in each sample image, and the predicted position information and the predicted attribute information of each text line in each sample image, and the text classification model is used to detect attribute information of each text line in a to-be-recognized image.
    Type: Application
    Filed: August 18, 2022
    Publication date: December 8, 2022
    Inventors: Shanshan LIU, Meina QIAO, Liang WU, Pengyuan LYU, Sen FAN, Chengquan ZHANG, Kun YAO
  • Patent number: 11482023
    Abstract: A method and apparatus for detecting text regions in an image, a device, and a medium are provided. The method may include: detecting, based on feature representation of an image, a first text region in the image, where the first text region covers a text in the image, a region occupied by the text being of a certain shape; determining, based on a feature block of the first text region, text geometry information associated with the text, where the text geometry information includes a text centerline of the text and distance information of the centerline from the upper and lower borders of the text; and adjusting, based on the text geometry information associated with the text, the first text region to a second text region, where the second text region also covers the text and is smaller than the first text region.
    Type: Grant
    Filed: December 11, 2019
    Date of Patent: October 25, 2022
    Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Chengquan Zhang, Zuming Huang, Mengyi En, Junyu Han, Errui Ding
  • Publication number: 20220301334
    Abstract: The present disclosure provides a table generating method and apparatus, an electronic device, a storage medium and a product. A specific implementation is: recognizing at least one table object in a to-be-recognized image and obtaining a table property respectively corresponding to the at least one table object, where the table property of any table object includes a cell property or a non-cell property; determining at least one target object with the cell property in the at least one table object; determining a cell region respectively corresponding to the at least one target object to obtain cell position information respectively corresponding to the at least one target object; generating a spreadsheet corresponding to the to-be-recognized image according to the cell position information respectively corresponding to the at least one target object.
    Type: Application
    Filed: June 6, 2022
    Publication date: September 22, 2022
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Yuechen YU, Yulin LI, Chengquan ZHANG, Kun YAO
  • Publication number: 20220253631
    Abstract: The present disclosure discloses an image processing method, an electronic device and a storage medium, and relates to the field of artificial intelligence technologies, and particularly to the fields of computer vision technologies, deep learning technologies, or the like. The image processing method includes: acquiring a multi-modal feature of each of at least one text region in an image, the multi-modal feature including features in plural dimensions; performing a global attention processing operation on the multi-modal feature of each text region to obtain a global attention feature of each text region; determining a category of each text region based on the global attention feature of each text region; and constructing structured information based on text content and the category of each text region.
    Type: Application
    Filed: October 14, 2021
    Publication date: August 11, 2022
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Yulin LI, Ju HUANG, Qunyi XIE, Xiameng QIN, Chengquan ZHANG, Jingtuo LIU
  • Publication number: 20220148324
    Abstract: Provided are a method and apparatus for extracting information about a negotiable instrument, an electronic device and a storage medium. The method includes inputting a to-be-recognized negotiable instrument into a pretrained deep learning network and obtaining a visual image corresponding to the to-be-recognized negotiable instrument through the deep learning network; matching the visual image corresponding to the to-be-recognized negotiable instrument with a visual image corresponding to each negotiable-instrument template in a preconstructed base template library; and in response to the visual image corresponding to the to-be-recognized negotiable instrument successfully matching a visual image corresponding to one negotiable-instrument template in the base template library, extracting structured information of the to-be-recognized negotiable instrument by using the negotiable-instrument template.
    Type: Application
    Filed: January 21, 2022
    Publication date: May 12, 2022
    Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.
    Inventors: Xiameng QIN, Yulin Li, Ju Huang, Qunyi Xie, Chengquan Zhang, Kun Yao, Jingtuo Liu, Junyu Han
  • Publication number: 20220139096
    Abstract: A character recognition method, a model training method, a related apparatus and an electronic device are provided. The specific solution is: obtaining a target picture; performing feature encoding on the target picture to obtain a visual feature of the target picture; performing feature mapping on the visual feature to obtain a first target feature of the target picture, where the first target feature is a feature that has a matching space with a feature of character semantic information of the target picture; inputting the first target feature into a character recognition model for character recognition to obtain a first character recognition result of the target picture.
    Type: Application
    Filed: January 19, 2022
    Publication date: May 5, 2022
    Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.
    Inventors: Pengyuan Lv, Chengquan Zhang, Kun Yao, Junyu Han
  • Publication number: 20220101642
    Abstract: The disclosure discloses a method for character recognition, an electronic device, and a storage medium. The technical solution includes: obtaining a test sample image and a test sample character both corresponding to a test task; performing fine-tuning on a trained meta-learning model based on the test sample image and the test sample character to obtain a test task model; obtaining a test image corresponding to the test task; and generating a test character corresponding to the test image by inputting the test image into the test task model.
    Type: Application
    Filed: December 8, 2021
    Publication date: March 31, 2022
    Inventors: Qunyi XIE, Yangliu XU, Xiameng QIN, Chengquan ZHANG
  • Publication number: 20220092353
    Abstract: A computer-implemented method includes: acquiring training data, the training data includes training images for a preset vertical type, and the training images include a first training image containing real data of the preset vertical type and a second training image containing virtual data of the preset vertical type; building a basic model, the basic model includes a deep learning network, and the deep learning network is configured to recognize the training images to extract text data in the training image; and training the basic model by using the training data to obtain the image recognition model.
    Type: Application
    Filed: December 1, 2021
    Publication date: March 24, 2022
    Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
    Inventors: Ruixue Liu, Xiameng Qin, Mengyi En, Kun Yao, Chengquan Zhang, Shengxian Zhu, Yunhao Li, Junyu Han, Hao Sun
  • Publication number: 20220027611
    Abstract: Provided are an image classification method and apparatus, an electronic device and a storage medium, relating to the field of artificial intelligence and, in particular, to computer vision and deep learning. The method includes inputting a to-be-classified document image into a pretrained neural network and obtaining a feature submap of each text box of the to-be-classified document image by use of the neural network; inputting the feature submap of each text box, a semantic feature corresponding to preobtained text information of each text box and a position feature corresponding to preobtained position information of each text box into a pretrained multimodal feature fusion model and fusing, by use of the multimodal feature fusion model, the three into a multimodal feature corresponding to each text box; and classifying the to-be-classified document image based on the multimodal feature corresponding to each text box.
    Type: Application
    Filed: October 11, 2021
    Publication date: January 27, 2022
    Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.
    Inventors: Yuechen YU, Chengquan ZHANG, Yulin LI, Xiaoqiang ZHANG, Ju HUANG, Xiameng QIN, Kun YAO, Jingtuo LIU, Junyu HAN, Errui DING
  • Publication number: 20210406619
    Abstract: The present disclosure provides a method for visual question answering, which relates to fields of computer vision and natural language processing. The method includes: acquiring an input image and an input question; detecting visual information and position information of each of at least one text region in the input image; determining semantic information and attribute information of each of the at least one text region based on the visual information and the position information; determining a global feature of the input image based on the visual information, the position information, the semantic information, and the attribute information; determining a question feature based on the input question; and generating a predicted answer for the input image and the input question based on the global feature and the question feature. The present disclosure further provides a device for visual question answering, a computer device and a medium.
    Type: Application
    Filed: February 5, 2021
    Publication date: December 30, 2021
    Inventors: Pengyuan LV, Xiaoqiang ZHANG, Shanshan LIU, Chengquan ZHANG, Qiming PENG, Sijin WU, Hua LU, Yongfeng CHEN
  • Publication number: 20210406548
    Abstract: A method, an apparatus, a device and a storage medium for processing an image are provided. The method includes: acquiring a target video including a target image frame and at least one image frame of a labeled target object; based on the labeled target object in the at least one image frame, determining a search area for the target object in the target image frame; based on the search area, determining center position information of the target object; based on a labeled area in which the target object is located and the center position information, determining a target object area; and based on the target object area, segmenting the target image frame.
    Type: Application
    Filed: March 10, 2021
    Publication date: December 30, 2021
    Inventors: Chengquan ZHANG, Bin HE
  • Patent number: 11210546
    Abstract: The present disclosure proposes an end-to-end text recognition method and apparatus, computer device and readable medium. The method comprises: obtaining a to-be-recognized picture containing a text region; recognizing a position of the text region in the to-be-recognized picture and text content included in the text region with a pre-trained end-to-end text recognition model; the end-to-end text recognition model comprising a region of interest perspective transformation processing module for performing perspective transformation processing for the text region. The technical solution of the present disclosure does not need to serially arrange a plurality of steps, and may avoid introducing the accumulated errors and may effectively improve the accuracy of the text recognition.
    Type: Grant
    Filed: March 18, 2020
    Date of Patent: December 28, 2021
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Yipeng Sun, Chengquan Zhang, Zuming Huang, Jiaming Liu, Junyu Han, Errui Ding
  • Publication number: 20210390296
    Abstract: The present application discloses a method and an apparatus for optical character recognition, an electronic device and a storage medium, and relates to the fields of artificial intelligence and deep learning. The method may include: determining, for a to-be-recognized image, a text bounding box of a text area therein, and extracting a text area image from the to-be-recognized image according to the text bounding box; determining a bounding box of text lines in the text area image, and extracting a text-line image from the text area image according to the bounding box; and performing text sequence recognition on the text-line image, and obtaining a recognition result. The application of the solution in the present application can improve a recognition speed and the like.
    Type: Application
    Filed: March 11, 2021
    Publication date: December 16, 2021
    Inventors: Mengyi En, Shanshan Liu, Xuan Li, Chengquan Zhang, Hailun Xu, Xiaoqiang Zhang
  • Publication number: 20210357710
    Abstract: A text recognition method includes: acquiring an image including text information, the text information including M characters, M being a positive integer greater than 1; performing text recognition on the image to acquire character information about the M characters; recognizing reading direction information about each character in accordance with the character information about the M characters, the reading direction information being used to indicate a next character corresponding to a current character in a semantic reading order; and ranking the M characters in accordance with the reading direction information about the M characters to acquire a text recognition result of the text information.
    Type: Application
    Filed: June 21, 2021
    Publication date: November 18, 2021
    Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.
    Inventors: Chengquan Zhang, Pengyuan Lv, Kun Yao, Junyu Han, Jingtuo Liu
  • Publication number: 20210342621
    Abstract: The disclosure provides a method and an apparatus for character recognition and processing. A character region is labelled for each character contained in each sample image of a sample image set. A character category and a character position code corresponding to each character region are labelled. A preset neural network model for character recognition is trained based on the sample image set having labelled character regions, character categories and character position codes corresponding to the character regions.
    Type: Application
    Filed: July 12, 2021
    Publication date: November 4, 2021
    Inventors: Pengyuan LV, Chengquan Zhang
  • Publication number: 20210334602
    Abstract: The present application discloses a method and an apparatus for recognizing text content, and an electronic device, and relates to a text recognition technique in the field of computer technology. The specific implementation is as follows: acquiring a dial picture; detecting at least one text centerline and a bounding box corresponding to each text centerline in the dial picture; and recognizing text content in each line of text in the dial picture based on the at least one text centerline and the bounding box corresponding to each text centerline.
    Type: Application
    Filed: February 9, 2021
    Publication date: October 28, 2021
    Inventors: Shanshan Liu, Chengquan Zhang, Xuan Li, Mengyi En, Hailun Xu, Xiaoqiang Zhang
  • Publication number: 20210319241
    Abstract: A method, an apparatus, a device and a storage medium for processing an image are provided. The method may include: acquiring a target image; determining at least one stamp image included in the target image; determining position information of a character in the at least one stamp image; and determining a text in the at least one stamp image based on the position information.
    Type: Application
    Filed: June 22, 2021
    Publication date: October 14, 2021
    Inventors: Pengyuan LYU, Chengquan ZHANG
  • Publication number: 20210312174
    Abstract: A method and apparatus for processing an image, a device and a storage medium are provided. An implementation of the method includes: acquiring a template image, the template image including at least one region of interest; determining a first feature map corresponding to each region of interest in the template image; acquiring a target image; determining a second feature map of the target image; and determining at least one region of interest in the target image according to the first feature map and the second feature map.
    Type: Application
    Filed: June 21, 2021
    Publication date: October 7, 2021
    Inventors: Chengquan ZHANG, Mengyi EN, Ju HUANG, Qunyi XIE, Xiameng QIN, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING
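
Illustrative note on publication 20210357710 (text recognition with reading direction information): the abstract describes ranking M recognized characters by following, for each character, a pointer to the next character in semantic reading order. The sketch below is only one possible reading of that ranking step and is not taken from the application itself. The function name `order_by_reading_direction` and the `next_index` mapping (character index to successor index, with `None` marking the last character) are hypothetical names introduced here for illustration.

```python
from typing import Dict, List, Optional


def order_by_reading_direction(
    chars: List[str], next_index: Dict[int, Optional[int]]
) -> str:
    """Chain characters by following each character's predicted successor.

    chars      -- recognized characters in arbitrary (detection) order
    next_index -- hypothetical stand-in for the "reading direction
                  information": maps a character's index to the index of
                  the next character, or None for the final character
    """
    if not chars:
        return ""

    # The first character in reading order is the one no other character
    # points to as its successor.
    successors = {j for j in next_index.values() if j is not None}
    starts = [i for i in range(len(chars)) if i not in successors]
    if len(starts) != 1:
        raise ValueError("expected exactly one starting character")

    ordered: List[str] = []
    seen = set()
    current: Optional[int] = starts[0]
    while current is not None:
        if current in seen:
            raise ValueError("cycle in reading direction links")
        seen.add(current)
        ordered.append(chars[current])
        current = next_index.get(current)
    return "".join(ordered)


# Characters detected out of order; the links restore the reading order.
detected = ["X", "T", "T", "E"]
links = {1: 3, 3: 0, 0: 2, 2: None}
print(order_by_reading_direction(detected, links))  # TEXT
```

Following successor links rather than sorting by coordinates is what would let such a scheme handle curved or vertical text, where left-to-right position alone does not give the reading order.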