Patents by Inventor Kun Yao
Kun Yao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230123327
Abstract: A method for recognizing text includes: obtaining an image sequence feature of an image to be recognized; obtaining a full text string of the image to be recognized by decoding the image sequence feature; obtaining a text sequence feature by performing a semantic enhancement process on the full text string, in which the image sequence feature, the full text string and the text sequence feature are of the same length; and determining text content of the image to be recognized based on the full text string and the text sequence feature.
Type: Application
Filed: December 19, 2022
Publication date: April 20, 2023
Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Chengquan Zhang, Pengyuan Lv, Kun Yao, Junyu Han, Jingtuo Liu
-
Publication number: 20230124389
Abstract: A model determination method and an electronic device are provided, which relate to the technical field of artificial intelligence and, in particular, to the fields of computer vision and deep learning, and can be applied to image processing, image identification and other scenarios. A specific implementation solution includes: an image sample and a text sample are acquired, wherein text data in the text sample is used for performing text description of target image data in the image sample; at least one image feature in the image sample is stored to a first queue, and at least one text feature in the text sample is stored to a second queue; the first queue and the second queue are trained to obtain a first target model; and the first target model is determined as an initialization model for a second target model.
Type: Application
Filed: August 15, 2022
Publication date: April 20, 2023
Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.
Inventors: Longchao WANG, Yipeng SUN, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING
-
Publication number: 20230106873
Abstract: A text extraction method and a text extraction model training method are provided. The present disclosure relates to the technical field of artificial intelligence, in particular to the technical field of computer vision. An implementation of the method comprises: obtaining a visual encoding feature of a to-be-detected image; extracting a plurality of sets of multimodal features from the to-be-detected image, wherein each set of multimodal features includes position information of one detection frame extracted from the to-be-detected image, a detection feature in the detection frame and first text information in the detection frame; and obtaining second text information matched with a to-be-extracted attribute based on the visual encoding feature, the to-be-extracted attribute and the plurality of sets of multimodal features, wherein the to-be-extracted attribute is an attribute of text information needing to be extracted.
Type: Application
Filed: November 28, 2022
Publication date: April 6, 2023
Inventors: Xiameng QIN, Xiaoqiang ZHANG, Ju HUANG, Yulin LI, Qunyi XIE, Kun YAO, Junyu HAN
-
Publication number: 20230048495
Abstract: A method and a platform of generating a document, an electronic device, and a storage medium are provided, which relate to a field of an artificial intelligence technology, in particular to fields of computer vision and deep learning technologies, and may be applied to a text recognition scenario and other scenarios. The method includes: performing a category recognition on a document picture to obtain a target category result; determining a target structured model matched with the target category result; and performing, by using the target structured model, a structure recognition on the document picture to obtain a structure recognition result, so as to generate an electronic document based on the structure recognition result, wherein the structure recognition result includes a field attribute recognition result and a field position recognition result.
Type: Application
Filed: October 26, 2022
Publication date: February 16, 2023
Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Qunyi XIE, Xiameng QIN, Mengyi EN, Dongdong ZHANG, Ju HUANG, Yangliu XU, Yi CHEN, Kun YAO
-
Publication number: 20230050079
Abstract: Provided are a text recognition method, an electronic device, and a non-transitory computer-readable storage medium, which are applicable in an OCR scenario. In the particular solution, a text image to be recognized is acquired. Feature extraction is performed on the text image, to obtain an image feature corresponding to the text image, where a height-wise feature and a width-wise feature of the image feature each have a dimension greater than 1. According to the image feature, sampling features corresponding to multiple sampling points in the text image are determined. According to the sampling features corresponding to the multiple sampling points, a character recognition result corresponding to the text image is determined.
Type: Application
Filed: October 27, 2022
Publication date: February 16, 2023
Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Pengyuan LV, Xiaoyan WANG, Liang WU, Shanshan LIU, Yuechen YU, Meina QIAO, Jie LU, Chengquan ZHANG, Kun YAO
-
Publication number: 20230042234
Abstract: A method for training a model includes: obtaining a scene image, second actual characters in the scene image and a second construct image; obtaining first features and first recognition characters of characters obtained by performing character recognition on the scene image using the model to be trained; obtaining second features of characters obtained by performing character recognition on the second construct image using the training auxiliary model; and obtaining a character recognition model by adjusting model parameters of the model to be trained based on the first recognition characters, the second actual characters, the first features and the second features.
Type: Application
Filed: October 24, 2022
Publication date: February 9, 2023
Inventors: Yangliu XU, Qunyi Xie, Yi Chen, Xiameng Qin, Chengquan Zhang, Kun Yao
-
Publication number: 20230045715
Abstract: The present disclosure provides a text detection method, a text recognition method and an apparatus, which relate to the field of artificial intelligence technology, in particular to the field of deep learning and computer vision technologies, and can be applied to scenarios such as optical character recognition. The text detection method is: acquiring an image feature of a text strip in a to-be-recognized image; performing visual enhancement processing on the to-be-recognized image to obtain an enhanced feature map of the to-be-recognized image; comparing the image feature of the text strip with the enhanced feature map for similarity to obtain a target bounding box of the text strip on the enhanced feature map.
Type: Application
Filed: October 14, 2022
Publication date: February 9, 2023
Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Chengquan ZHANG, Pengyuan LV, Sen FAN, Kun YAO, Junyu HAN, Jingtuo LIU
-
Publication number: 20230020022
Abstract: A method of recognizing a text, which relates to a field of an artificial intelligence technology, in particular to a field of computer vision and deep learning technology, and may be applied to optical character recognition or other applications. The method includes: acquiring a plurality of image sequences by continuously scanning a document; performing an image stitching, so as to obtain a plurality of successive frames of stitched images corresponding to the plurality of image sequences respectively, an overlapping region exists between each two successive frames of stitched images; performing a text recognition based on the plurality of successive frames of stitched images, so as to obtain a plurality of corresponding recognition results; and performing a de-duplication on the plurality of recognition results based on the overlapping region between each two successive frames of stitched images, so as to obtain a text recognition result for the document.
Type: Application
Filed: August 11, 2022
Publication date: January 19, 2023
Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Shanshan LIU, Meina QIAO, Liang WU, Chengquan ZHANG, Kun YAO
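
The de-duplication step described in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the abstract de-duplicates using the overlapping image region between successive stitched frames, whereas this sketch assumes the overlap shows up as a shared suffix/prefix in the recognized strings. The function name `merge_overlapping` is hypothetical.

```python
def merge_overlapping(results):
    """Merge per-frame recognition strings, dropping text repeated in overlaps.

    Assumption (not from the abstract): the overlap between two successive
    frames appears as the longest suffix of the merged text that is also a
    prefix of the next frame's text.
    """
    merged = ""
    for text in results:
        max_len = min(len(merged), len(text))
        overlap = 0
        # Try the longest possible overlap first, then shrink.
        for k in range(max_len, 0, -1):
            if merged.endswith(text[:k]):
                overlap = k
                break
        merged += text[overlap:]
    return merged


if __name__ == "__main__":
    frames = ["The quick br", "ick brown f", "wn fox"]
    print(merge_overlapping(frames))
```

A string-level heuristic like this can over-merge when text legitimately repeats, which is one reason a real system would rely on the known geometry of the overlapping region instead.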
-
Publication number: 20230010031
Abstract: A method for recognizing a text, an electronic device and a storage medium. An implementation of the method comprises: obtaining a multi-dimensional first feature map of a to-be-recognized image; performing, based on feature values in the first feature map, feature enhancement processing on each feature value in the first feature map; and performing a text recognition on the to-be-recognized image based on the first feature map after the enhancement processing.
Type: Application
Filed: September 16, 2022
Publication date: January 12, 2023
Inventors: Pengyuan LYU, Sen FAN, Xiaoyan WANG, Yuechen YU, Chengquan ZHANG, Kun YAO, Junyu HAN
-
Publication number: 20220392205
Abstract: Embodiments of the present disclosure provide a method and apparatus for training an image recognition model based on a semantic enhancement, a method and apparatus for recognizing an image, an electronic device, and a computer readable storage medium. The method for training an image recognition model based on a semantic enhancement comprises: extracting, from an inputted first image being unannotated and having no textual description, a first feature representation of the first image; calculating a first loss function based on the first feature representation; extracting, from an inputted second image being unannotated and having an original textual description, a second feature representation of the second image; calculating a second loss function based on the second feature representation, and training an image recognition model based on a fusion of the first loss function and the second loss function.
Type: Application
Filed: August 22, 2022
Publication date: December 8, 2022
Inventors: Yipeng SUN, Rongqiao AN, Xiang WEI, Longchao WANG, Kun YAO, Junyu HAN, Jingtuo LIU, Errui DING
-
Publication number: 20220392242
Abstract: A method for training a text positioning model includes: obtaining a sample image, where the sample image contains a sample text to be positioned and a text marking box for the sample text; inputting the sample image into a text positioning model to be trained to position the sample text, and outputting a prediction text box for the sample image; obtaining a sample prior anchor box corresponding to the sample image; and adjusting model parameters of the text positioning model based on the sample prior anchor box, the text marking box and the prediction text box, and continuing training the adjusted text positioning model based on a next sample image until model training is completed, to generate a target text positioning model.
Type: Application
Filed: August 15, 2022
Publication date: December 8, 2022
Inventors: Ju HUANG, Yulin LI, Peng WANG, Qunyi XIE, Xiameng QIN, Kun YAO
-
Publication number: 20220392243
Abstract: A method for training a text classification model and an electronic device are provided. The method may include: acquiring a set of to-be-trained images, the set of to-be-trained images including at least one sample image; determining predicted position information and predicted attribute information of each text line in each sample image based on each sample image; and training to obtain the text classification model, based on the annotation position information and the annotation attribute information of each text line in each sample image, and the predicted position information and the predicted attribute information of each text line in each sample image, where the text classification model is used to detect attribute information of each text line in a to-be-recognized image.
Type: Application
Filed: August 18, 2022
Publication date: December 8, 2022
Inventors: Shanshan LIU, Meina QIAO, Liang WU, Pengyuan LYU, Sen FAN, Chengquan ZHANG, Kun YAO
-
Patent number: 11467868
Abstract: An orchestration service enables simplified establishment of relationships between services. Attributes and other information associated with a service are defined in a service definition. The information from the service definition is utilized by the orchestration service during execution of one or more workflows to establish a relationship between services. The workflow includes a set of operations that establishes the relationship based at least in part on the service definition.
Type: Grant
Filed: May 3, 2017
Date of Patent: October 11, 2022
Assignee: Amazon Technologies, Inc.
Inventors: William Voorhees, Jason Brewster, Venumadhav Yalla, Vilcya Wirantana, Gunnar Onarheim, Peter Reidy, Xiao Kun Yao
-
Publication number: 20220301334
Abstract: The present disclosure provides a table generating method and apparatus, an electronic device, a storage medium and a product. A specific implementation is: recognizing at least one table object in a to-be-recognized image and obtaining a table property respectively corresponding to the at least one table object, where the table property of any table object includes a cell property or a non-cell property; determining at least one target object with the cell property in the at least one table object; determining a cell region respectively corresponding to the at least one target object to obtain cell position information respectively corresponding to the at least one target object; generating a spreadsheet corresponding to the to-be-recognized image according to the cell position information respectively corresponding to the at least one target object.
Type: Application
Filed: June 6, 2022
Publication date: September 22, 2022
Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventors: Yuechen YU, Yulin LI, Chengquan ZHANG, Kun YAO
-
Publication number: 20220301286
Abstract: A method and apparatus for identifying a display scene, a method and apparatus for training a model, a device, a storage medium and a computer program product are provided. An implementation of the method may comprise: acquiring a feature vector of a to-be-identified image and acquiring a base library feature vector set; ascertaining, from the base library feature vector set, at least two candidate feature vectors based on a similarity coefficient between the feature vector of the to-be-identified image and each feature vector in the base library feature vector set; performing threshold comparisons on similarity coefficients of the at least two candidate feature vectors, to obtain a target feature vector; and determining a display scene of the to-be-identified image based on a display scene tag corresponding to the target feature vector.
Type: Application
Filed: June 6, 2022
Publication date: September 22, 2022
Inventors: Kehua Chen, Zihan Ni, Rongqiao An, Yipeng Sun, Yuerong Chen, Kun Yao
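
The retrieval flow in the abstract above (score against a base library, keep at least two candidates, apply a threshold, return the tag) can be sketched roughly as follows. Several details are assumptions not stated in the abstract: cosine similarity as the similarity coefficient, a single threshold applied to the candidates in descending score order, and the hypothetical names `identify_scene`, `top_k` and `threshold`.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def identify_scene(query, base_library, top_k=2, threshold=0.8):
    """Return the scene tag of the best candidate that passes the threshold.

    base_library is a list of (feature_vector, scene_tag) pairs.
    Assumed rule: take the top_k most similar entries, then accept the first
    one whose similarity is at least `threshold`; return None otherwise.
    """
    scored = sorted(
        ((cosine(query, vec), tag) for vec, tag in base_library),
        key=lambda pair: pair[0],
        reverse=True,
    )
    for score, tag in scored[:top_k]:
        if score >= threshold:
            return tag
    return None


if __name__ == "__main__":
    library = [([1.0, 0.0], "shelf"), ([0.0, 1.0], "fridge"), ([1.0, 1.0], "endcap")]
    print(identify_scene([1.0, 0.1], library))
```

The abstract speaks of "threshold comparisons" in the plural, so the actual method may compare candidates against more than one threshold or against each other; the single-threshold rule here is only the simplest reading.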
-
Publication number: 20220292131
Abstract: A method, apparatus and system for retrieving an image is provided, the method comprises: detecting, in response to receiving a query request comprising a target image, a target subject from the target image; extracting a subject feature from the target subject if a confidence level of a detection box of the detected target subject is greater than a first threshold, the subject feature comprising an identical feature, a similar feature and a category; performing matching on the subject feature of the target image and a subject feature of a candidate image pre-stored in a database, to obtain a similarity score and an identicalness score of the candidate image; and selecting, according to the similarity score and the identicalness score, a predetermined number of candidate images as a search result for output.
Type: Application
Filed: May 27, 2022
Publication date: September 15, 2022
Inventors: Ruibin BAI, Xiang WEI, Yipeng SUN, Kun YAO, Jingtuo LIU, Junyu HAN
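
The final selection step in the abstract above can be illustrated with a small sketch. The abstract does not say how the two scores are combined or what happens when the detection confidence fails the first threshold, so both choices below are assumptions: candidates are ranked by identicalness score first and similarity score second, and a low-confidence detection yields an empty result. `Candidate` and `select_results` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    image_id: str
    similarity: float      # similarity score from feature matching
    identicalness: float   # identicalness score from feature matching


def select_results(candidates, detection_confidence, first_threshold=0.5, limit=3):
    """Return up to `limit` image ids, best first.

    Assumptions: no subject feature is extracted (so nothing is returned)
    unless the detection confidence exceeds `first_threshold`, and ranking
    prefers identical matches over merely similar ones.
    """
    if detection_confidence <= first_threshold:
        return []
    ranked = sorted(
        candidates,
        key=lambda c: (c.identicalness, c.similarity),
        reverse=True,
    )
    return [c.image_id for c in ranked[:limit]]


if __name__ == "__main__":
    pool = [
        Candidate("a", 0.9, 0.1),
        Candidate("b", 0.5, 0.95),
        Candidate("c", 0.7, 0.95),
    ]
    print(select_results(pool, detection_confidence=0.8, limit=2))
```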
-
Patent number: 11426993
Abstract: A three-dimensional printing system includes a build device and an optical projection engine. The build device includes a curing tank, a photocurable material and a build platform, and the photocurable material and the build platform are disposed in the curing tank. The optical projection engine has a zoom lens for projecting image beams with at least a first pixel size and a second pixel size on the build platform to cure the photocurable material, and the first pixel size is different from the second pixel size.
Type: Grant
Filed: August 29, 2016
Date of Patent: August 30, 2022
Assignee: YOUNG OPTICS INC.
Inventors: Chao-Shun Chen, Kun-Yao Chen, Chien-Hsing Tsai
-
Publication number: 20220261961
Abstract: An image processing method and apparatus, an electronic device, and a storage medium, relating to the technical field of image processing. The method includes: acquiring a first image and a second image, wherein a resolution of the second image is greater than a resolution of the first image; determining difference information between a target pixel in the second image and a reference pixel corresponding to the target pixel in the first image; and acquiring a target image with the same resolution as the second image by applying an image differencing process to a predetermined image with the same resolution as the first image based on the difference information. This method can obtain the target image based on the difference information, and improve image quality.
Type: Application
Filed: May 9, 2022
Publication date: August 18, 2022
Applicant: REALME CHONGQING MOBILE TELECOMMUNICATIONS CORP, LTD.
Inventor: Kun Yao
-
Patent number: 11354875
Abstract: The present disclosure provides a video blending method, apparatus, electronic device and readable storage medium, and relates to computer vision technologies.
Type: Grant
Filed: September 15, 2020
Date of Patent: June 7, 2022
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Kun Yao, Zhibin Hong, Hanqi Guo, Xusheng Zeng
-
Publication number: 20220148324
Abstract: Provided are a method and apparatus for extracting information about a negotiable instrument, an electronic device and a storage medium. The method includes inputting a to-be-recognized negotiable instrument into a pretrained deep learning network and obtaining a visual image corresponding to the to-be-recognized negotiable instrument through the deep learning network; matching the visual image corresponding to the to-be-recognized negotiable instrument with a visual image corresponding to each negotiable-instrument template in a preconstructed base template library; and in response to the visual image corresponding to the to-be-recognized negotiable instrument successfully matching a visual image corresponding to one negotiable-instrument template in the base template library, extracting structured information of the to-be-recognized negotiable instrument by using the negotiable-instrument template.
Type: Application
Filed: January 21, 2022
Publication date: May 12, 2022
Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.
Inventors: Xiameng QIN, Yulin Li, Ju Huang, Qunyi Xie, Chengquan Zhang, Kun Yao, Jingtuo Liu, Junyu Han