Patents by Inventor Junyu Han
Junyu Han has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230145443
Abstract: Provided are a video stitching method and an apparatus, an electronic device, and a storage medium. In the video stitching method, an intermediate frame is inserted between a last image frame of a first video and a first image frame of a second video. L image frames are sequentially selected in order from back to front from the first video and L image frames are sequentially selected in order from front to back from the second video separately, and L is a natural number greater than 1. The first video and the second video are stitched together to form a target video according to the intermediate frame, the L image frames in the first video, and the L image frames in the second video.
Type: Application
Filed: October 4, 2022
Publication date: May 11, 2023
Inventors: Tianshu HU, Hanqi GUO, Junyu HAN, Zhibin HONG
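The frame selection in this abstract can be sketched as follows. This is a minimal illustration, assuming each video is a Python list of frames; the function names `select_boundary_frames` and `stitch` are hypothetical, and the abstract does not say how the intermediate frame is generated or how the selected frames are adjusted, so they are left unchanged here.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def select_boundary_frames(
    first_video: Sequence[Frame],
    second_video: Sequence[Frame],
    num_frames: int,
) -> Tuple[List[Frame], List[Frame]]:
    """Pick L frames from the end of the first video and the start of the second."""
    if num_frames <= 1:
        raise ValueError("L must be a natural number greater than 1")
    if num_frames > len(first_video) or num_frames > len(second_video):
        raise ValueError("each video needs at least L frames")
    # Back to front: last frame first, then the one before it, and so on.
    tail = [first_video[len(first_video) - 1 - i] for i in range(num_frames)]
    # Front to back: first frame, second frame, and so on.
    head = [second_video[i] for i in range(num_frames)]
    return tail, head


def stitch(
    first_video: Sequence[Frame],
    second_video: Sequence[Frame],
    intermediate_frame: Frame,
    num_frames: int,
) -> List[Frame]:
    """Assemble a target video around the intermediate frame.

    Assumption: the selected frames are passed through unchanged. A real
    method would presumably modify them to smooth the transition.
    """
    tail, head = select_boundary_frames(first_video, second_video, num_frames)
    untouched_start = list(first_video[: len(first_video) - num_frames])
    untouched_end = list(second_video[num_frames:])
    # tail was collected back to front, so reverse it to restore playback order.
    return untouched_start + tail[::-1] + [intermediate_frame] + head + untouched_end
```

Keeping the selection separate from the assembly makes it easy to swap in a real blending step for the 2L boundary frames later.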
-
Publication number: 20230123327
Abstract: A method for recognizing text includes: obtaining an image sequence feature of an image to be recognized; obtaining a full text string of the image to be recognized by decoding the image sequence feature; obtaining a text sequence feature by performing a semantic enhancement process on the full text string, in which the image sequence feature, the full text string and the text sequence feature are of the same length; and determining text content of the image to be recognized based on the full text string and the text sequence feature.
Type: Application
Filed: December 19, 2022
Publication date: April 20, 2023
Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Chengquan Zhang, Pengyuan Lv, Kun Yao, Junyu Han, Jingtuo Liu
-
Publication number: 20230124389
Abstract: A model determination method and an electronic device are provided, which relate to the technical field of artificial intelligence and, in particular, to the fields of computer vision and deep learning, and can be applied to image processing, image identification and other scenarios. A specific implementation solution includes: an image sample and a text sample are acquired, wherein text data in the text sample is used for performing text description to target image data in the image sample; at least one image feature in the image sample is stored to a first queue, and at least one text feature in the text sample is stored to a second queue; the first queue and the second queue are trained to obtain a first target model; and the first target model is determined as an initialization model for a second target model.
Type: Application
Filed: August 15, 2022
Publication date: April 20, 2023
Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.
Inventors: Longchao WANG, Yipeng SUN, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING
-
Publication number: 20230120985
Abstract: A method for training a face recognition model includes: acquiring a plurality of first training images being uncovered face images, and acquiring a plurality of covering object images; generating a plurality of second training images by separately fusing the plurality of covering object images with the uncovered face images; and training the face recognition model by inputting the plurality of first training images and the plurality of second training images into the face recognition model.
Type: Application
Filed: December 16, 2022
Publication date: April 20, 2023
Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Yanwen Fan, Xiyu Yu, Gang Zhang, Jingtuo Liu, Haifeng Wang, Errui Ding, Junyu Han
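The data-generation step in this abstract can be sketched as below. This is only an illustration under stated assumptions: images are small grayscale grids stored as nested lists, fusion is a per-pixel alpha blend driven by a mask, and the names `fuse` and `build_training_set` are hypothetical. The abstract does not specify how the covering object is fused with the face.

```python
from typing import List

Image = List[List[float]]


def fuse(face: Image, cover: Image, mask: Image) -> Image:
    """Blend a covering-object image onto a face image.

    Assumption: mask values lie in [0, 1]; 1 means the covering object fully
    hides the face pixel, 0 means the face pixel is kept.
    """
    return [
        [m * c + (1.0 - m) * f for f, c, m in zip(face_row, cover_row, mask_row)]
        for face_row, cover_row, mask_row in zip(face, cover, mask)
    ]


def build_training_set(
    faces: List[Image], covers: List[Image], masks: List[Image]
) -> List[Image]:
    """Return the uncovered faces plus one fused image per (face, cover) pair."""
    covered = [
        fuse(face, cover, mask)
        for face in faces
        for cover, mask in zip(covers, masks)
    ]
    # Both the first (uncovered) and second (covered) training images are kept,
    # matching the abstract's description of training on both sets.
    return list(faces) + covered
```

Training on the combined list exposes the model to the same identity with and without a covering object, which is the point of generating the second set.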
-
Publication number: 20230106873
Abstract: A text extraction method and a text extraction model training method are provided. The present disclosure relates to the technical field of artificial intelligence, in particular to the technical field of computer vision. An implementation of the method comprises: obtaining a visual encoding feature of a to-be-detected image; extracting a plurality of sets of multimodal features from the to-be-detected image, wherein each set of multimodal features includes position information of one detection frame extracted from the to-be-detected image, a detection feature in the detection frame and first text information in the detection frame; and obtaining second text information matched with a to-be-extracted attribute based on the visual encoding feature, the to-be-extracted attribute and the plurality of sets of multimodal features, wherein the to-be-extracted attribute is an attribute of text information needing to be extracted.
Type: Application
Filed: November 28, 2022
Publication date: April 6, 2023
Inventors: Xiameng QIN, Xiaoqiang ZHANG, Ju HUANG, Yulin LI, Qunyi XIE, Kun YAO, Junyu HAN
-
Publication number: 20230065675
Abstract: A method of processing an image, a method of training a model, an electronic device and a medium, which relate to a field of artificial intelligence technology, in particular to deep learning, computer vision and other technical fields. A solution includes: generating a first face image, wherein a definition difference and an authenticity difference between the first face image and a reference face image are within a set range; adjusting, according to a target voice used to drive the first face image, facial action information related to pronunciation in the first face image to generate a second face image with a facial tissue position conforming to a pronunciation rule of the target voice; and determining the second face image as a face image driven by the target voice.
Type: Application
Filed: November 8, 2022
Publication date: March 2, 2023
Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Tianshu Hu, Shengyi He, Junyu Han, Zhibin Hong
-
Publication number: 20230045715
Abstract: The present disclosure provides a text detection method, a text recognition method and an apparatus, which relate to the field of artificial intelligence technology, in particular to the field of deep learning and computer vision technologies, and can be applied to scenarios such as optical character recognition. The text detection method is: acquiring an image feature of a text strip in a to-be-recognized image; performing visual enhancement processing on the to-be-recognized image to obtain an enhanced feature map of the to-be-recognized image; comparing the image feature of the text strip with the enhanced feature map for similarity to obtain a target bounding box of the text strip on the enhanced feature map.
Type: Application
Filed: October 14, 2022
Publication date: February 9, 2023
Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Chengquan ZHANG, Pengyuan LV, Sen FAN, Kun YAO, Junyu HAN, Jingtuo LIU
-
Publication number: 20230010031
Abstract: A method for recognizing a text, an electronic device and a storage medium. An implementation of the method comprises: obtaining a multi-dimensional first feature map of a to-be-recognized image; performing, based on feature values in the first feature map, feature enhancement processing on each feature value in the first feature map; and performing a text recognition on the to-be-recognized image based on the first feature map after the enhancement processing.
Type: Application
Filed: September 16, 2022
Publication date: January 12, 2023
Inventors: Pengyuan LYU, Sen FAN, Xiaoyan WANG, Yuechen YU, Chengquan ZHANG, Kun YAO, Junyu HAN
-
Publication number: 20220415071
Abstract: The present disclosure provides a training method of a text recognition model, a text recognition method, and an apparatus, relating to the technical field of artificial intelligence, and specifically, to the technical field of deep learning and computer vision, which can be applied in scenarios such as optical character recognition. The specific implementation solution is: performing mask prediction on visual features of an acquired sample image, to obtain a predicted visual feature; performing mask prediction on semantic features of acquired sample text, to obtain a predicted semantic feature, where the sample image includes text; determining a first loss value of the text of the sample image according to the predicted visual feature; determining a second loss value of the sample text according to the predicted semantic feature; training, according to the first loss value and the second loss value, to obtain the text recognition model.
Type: Application
Filed: August 31, 2022
Publication date: December 29, 2022
Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Chengquan ZHANG, Pengyuan LV, Shanshan LIU, Meina QIAO, Yangliu XU, Liang WU, Jingtuo LIU, Junyu HAN, Errui DING, Jingdong WANG
-
Publication number: 20220392205
Abstract: Embodiments of the present disclosure provide a method and apparatus for training an image recognition model based on a semantic enhancement, a method and apparatus for recognizing an image, an electronic device, and a computer readable storage medium. The method for training an image recognition model based on a semantic enhancement comprises: extracting, from an inputted first image being unannotated and having no textual description, a first feature representation of the first image; calculating a first loss function based on the first feature representation; extracting, from an inputted second image being unannotated and having an original textual description, a second feature representation of the second image; calculating a second loss function based on the second feature representation, and training an image recognition model based on a fusion of the first loss function and the second loss function.
Type: Application
Filed: August 22, 2022
Publication date: December 8, 2022
Inventors: Yipeng SUN, Rongqiao AN, Xiang WEI, Longchao WANG, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING
-
Publication number: 20220383626
Abstract: An image processing method includes: obtaining a first categorical feature and M first image features corresponding to M first images respectively, each first image being associated with a task index, task indices associated with different first images being different from each other, M being a positive integer; fusing the M first image features with the first categorical feature respectively so as to obtain M first target features; performing feature extraction on the M first target features so as to obtain M second categorical features; selecting a second categorical feature corresponding to each task index from the M second categorical features, and performing regularization corresponding to the task index on the second categorical feature, to obtain a third categorical feature corresponding to the task index; and performing image processing in accordance with M third categorical features so as to obtain M first image processing results of the M first images.
Type: Application
Filed: August 8, 2022
Publication date: December 1, 2022
Inventors: Jian WANG, Junyu HAN, Jinwen CHEN, Lufei LIU
-
Patent number: 11482023
Abstract: A method and apparatus for detecting text regions in an image, a device, and a medium are provided. The method may include: detecting, based on feature representation of an image, a first text region in the image, where the first text region covers a text in the image, a region occupied by the text being of a certain shape; determining, based on a feature block of the first text region, text geometry information associated with the text, where the text geometry information includes a text centerline of the text and distance information of the centerline from the upper and lower borders of the text; and adjusting, based on the text geometry information associated with the text, the first text region to a second text region, where the second text region also covers the text and is smaller than the first text region.
Type: Grant
Filed: December 11, 2019
Date of Patent: October 25, 2022
Assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
Inventors: Chengquan Zhang, Zuming Huang, Mengyi En, Junyu Han, Errui Ding
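One way to picture the geometry step in this abstract is the sketch below, which turns a centerline plus border distances into a tighter polygon. It is an illustration under assumptions, not the patented method: the function name `region_from_centerline` is hypothetical, the borders are offset straight up and down from each centerline point (image y axis pointing down), and a real implementation would more likely offset along the local normal of the centerline.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def region_from_centerline(
    centerline: Sequence[Point],
    upper_distances: Sequence[float],
    lower_distances: Sequence[float],
) -> List[Point]:
    """Build a closed polygon around a text centerline.

    Each centerline point is paired with its distance to the upper border and
    its distance to the lower border of the text.
    """
    if not (len(centerline) == len(upper_distances) == len(lower_distances)):
        raise ValueError("centerline and distance lists must have equal length")
    # Upper border: move each point up (smaller y in image coordinates).
    top = [(x, y - up) for (x, y), up in zip(centerline, upper_distances)]
    # Lower border: move each point down (larger y in image coordinates).
    bottom = [(x, y + down) for (x, y), down in zip(centerline, lower_distances)]
    # Walk the top border left to right, then the bottom border right to left,
    # so the vertices trace the outline in one consistent direction.
    return top + bottom[::-1]
```

Because the polygon follows the centerline, it can hug curved or slanted text more closely than the first, coarser text region.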
-
Publication number: 20220292131
Abstract: A method, apparatus and system for retrieving an image are provided. The method comprises: detecting, in response to receiving a query request comprising a target image, a target subject from the target image; extracting a subject feature from the target subject if a confidence level of a detection box of the detected target subject is greater than a first threshold, the subject feature comprising an identical feature, a similar feature and a category; performing matching on the subject feature of the target image and a subject feature of a candidate image pre-stored in a database, to obtain a similarity score and an identicalness score of the candidate image; and selecting, according to the similarity score and the identicalness score, a predetermined number of candidate images as a search result for output.
Type: Application
Filed: May 27, 2022
Publication date: September 15, 2022
Inventors: Ruibin BAI, Xiang WEI, Yipeng SUN, Kun YAO, Jingtuo LIU, Junyu HAN
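The matching and selection steps in this abstract can be sketched as follows. Several choices here are assumptions made for illustration and are not stated in the abstract: cosine similarity is used for both scores, the two scores are combined as a weighted sum, an empty result is returned when the detection confidence is too low, and the category part of the subject feature is omitted. The names `Candidate`, `cosine` and `retrieve` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    image_id: str
    similar_feature: Sequence[float]
    identical_feature: Sequence[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / norm


def retrieve(
    query_similar: Sequence[float],
    query_identical: Sequence[float],
    detection_confidence: float,
    candidates: List[Candidate],
    confidence_threshold: float = 0.5,
    top_k: int = 3,
    identical_weight: float = 0.5,
) -> List[str]:
    """Return the ids of the top_k candidates ranked by a combined score."""
    # Features are only used when the detection box is confident enough.
    if detection_confidence <= confidence_threshold:
        return []
    scored = []
    for candidate in candidates:
        similarity = cosine(query_similar, candidate.similar_feature)
        identicalness = cosine(query_identical, candidate.identical_feature)
        # Assumed combination rule: a simple weighted sum of the two scores.
        combined = identical_weight * identicalness + (1.0 - identical_weight) * similarity
        scored.append((combined, candidate.image_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:top_k]]
```

Keeping the two scores separate until the final ranking lets a caller favor exact matches or lookalikes by changing `identical_weight`.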
-
Publication number: 20220148324
Abstract: Provided are a method and apparatus for extracting information about a negotiable instrument, an electronic device and a storage medium. The method includes inputting a to-be-recognized negotiable instrument into a pretrained deep learning network and obtaining a visual image corresponding to the to-be-recognized negotiable instrument through the deep learning network; matching the visual image corresponding to the to-be-recognized negotiable instrument with a visual image corresponding to each negotiable-instrument template in a preconstructed base template library; and in response to the visual image corresponding to the to-be-recognized negotiable instrument successfully matching a visual image corresponding to one negotiable-instrument template in the base template library, extracting structured information of the to-be-recognized negotiable instrument by using the negotiable-instrument template.
Type: Application
Filed: January 21, 2022
Publication date: May 12, 2022
Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.
Inventors: Xiameng QIN, Yulin Li, Ju Huang, Qunyi Xie, Chengquan Zhang, Kun Yao, Jingtuo Liu, Junyu Han
-
Publication number: 20220139096
Abstract: A character recognition method, a model training method, a related apparatus and an electronic device are provided. The specific solution is: obtaining a target picture; performing feature encoding on the target picture to obtain a visual feature of the target picture; performing feature mapping on the visual feature to obtain a first target feature of the target picture, where the first target feature is a feature that has a matching space with a feature of character semantic information of the target picture; inputting the first target feature into a character recognition model for character recognition to obtain a first character recognition result of the target picture.
Type: Application
Filed: January 19, 2022
Publication date: May 5, 2022
Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.
Inventors: Pengyuan Lv, Chengquan Zhang, Kun Yao, Junyu Han
-
Publication number: 20220092353
Abstract: A computer-implemented method includes: acquiring training data, the training data includes training images for a preset vertical type, and the training images include a first training image containing real data of the preset vertical type and a second training image containing virtual data of the preset vertical type; building a basic model, the basic model includes a deep learning network, and the deep learning network is configured to recognize the training images to extract text data in the training image; and training the basic model by using the training data to obtain the image recognition model.
Type: Application
Filed: December 1, 2021
Publication date: March 24, 2022
Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Ruixue Liu, Xiameng Qin, Mengyi En, Kun Yao, Chengquan Zhang, Shengxian Zhu, Yunhao Li, Junyu Han, Hao Sun
-
Publication number: 20220027611
Abstract: Provided are an image classification method and apparatus, an electronic device and a storage medium, relating to the field of artificial intelligence and, in particular, to computer vision and deep learning. The method includes inputting a to-be-classified document image into a pretrained neural network and obtaining a feature submap of each text box of the to-be-classified document image by use of the neural network; inputting the feature submap of each text box, a semantic feature corresponding to preobtained text information of each text box and a position feature corresponding to preobtained position information of each text box into a pretrained multimodal feature fusion model and fusing, by use of the multimodal feature fusion model, the three into a multimodal feature corresponding to each text box; and classifying the to-be-classified document image based on the multimodal feature corresponding to each text box.
Type: Application
Filed: October 11, 2021
Publication date: January 27, 2022
Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.
Inventors: Yuechen YU, Chengquan ZHANG, Yulin LI, Xiaoqiang ZHANG, Ju HUANG, Xiameng QIN, Kun YAO, Jingtuo LIU, Junyu HAN, Errui DING
-
Publication number: 20220005244
Abstract: The present disclosure relates to a field of artificial intelligence technology, in particular to a field of computer vision and deep learning technology, and more particularly, a method and an apparatus for changing a hairstyle of a character, a device, and a storage medium are provided. The method includes: determining an original feature vector of an original image containing the character, wherein the character in the original image has an original hairstyle; acquiring a boundary vector associated with the original hairstyle and a target hairstyle based on a hairstyle classification model; determining a target feature vector corresponding to the target hairstyle based on the original feature vector and the boundary vector; and generating a target image containing the character based on the target feature vector, wherein the character in the target image has the target hairstyle.
Type: Application
Filed: September 20, 2021
Publication date: January 6, 2022
Inventors: Zhizhi GUO, Borong LIANG, Zhibin HONG, Junyu HAN
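The step that derives the target feature vector from the original feature vector and the boundary vector can be illustrated with the sketch below. It assumes a linear shift along the boundary vector, a technique commonly used for editing attributes in a latent feature space; the abstract does not confirm that this is the rule used, and the name `shift_along_boundary` and the `strength` parameter are hypothetical.

```python
from typing import List, Sequence


def shift_along_boundary(
    original: Sequence[float],
    boundary: Sequence[float],
    strength: float = 1.0,
) -> List[float]:
    """Move a feature vector across a classification boundary.

    Assumption: `boundary` points from the original hairstyle class toward the
    target hairstyle class, and `strength` controls how far to move.
    """
    if len(original) != len(boundary):
        raise ValueError("feature vector and boundary vector must match in length")
    return [o + strength * b for o, b in zip(original, boundary)]
```

Under this assumption, a larger `strength` would push the feature vector further toward the target hairstyle before the target image is generated from it.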
-
Publication number: 20210406592
Abstract: The present disclosure provides a method for visual question answering. The method includes: acquiring an input image and an input question; constructing a visual graph based on the input image, wherein the visual graph comprises a first node feature and a first edge feature; constructing a question graph based on the input question, wherein the question graph comprises a second node feature and a second edge feature; performing a multimodal fusion on the visual graph and the question graph to obtain an updated visual graph and an updated question graph; determining a question feature based on the input question; determining a fusion feature based on the updated visual graph, the updated question graph and the question feature; and generating a predicted answer for the input image and the input question. The present disclosure further provides an apparatus for visual question answering, a computer device and a medium.
Type: Application
Filed: February 23, 2021
Publication date: December 30, 2021
Inventors: Yulin LI, Xiameng QIN, Ju HUANG, Qunyi XIE, Junyu HAN
-
Publication number: 20210406468
Abstract: The present disclosure provides a method for visual question answering, which relates to a field of computer vision and natural language processing. The method includes: acquiring an input image and an input question; constructing a Visual Graph based on the input image, wherein the Visual Graph comprises a Node Feature and an Edge Feature; updating the Node Feature by using the Node Feature and the Edge Feature to obtain an updated Visual Graph; determining a question feature based on the input question; fusing the updated Visual Graph and the question feature to obtain a fused feature; and generating a predicted answer for the input image and the input question based on the fused feature. The present disclosure further provides an apparatus for visual question answering, a computer device and a non-transitory computer-readable storage medium.
Type: Application
Filed: January 28, 2021
Publication date: December 30, 2021
Inventors: Xiameng QIN, Yulin LI, Qunyi XIE, Ju HUANG, Junyu HAN