Patents by Inventor Canhui XU
Canhui XU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Logic processing apparatus and logic processing method for composite graphs in fixed layout document
Patent number: 9727536
Abstract: A logic process apparatus for composite graphs in a fixed layout document is provided in this invention. The apparatus includes a composite graph block extraction unit, for extracting composite graph blocks from the fixed layout document, a document parsing unit, for parsing the fixed layout document to obtain text primitives contained therein, a legend primitive extraction unit, for extracting legend primitives from the text primitives, a correlation detection unit, for detecting correlations between the composite graph blocks and the legend primitives, and a correlation storage unit, for storing the detected correlations. A logic process method for composite graphs in a fixed layout document is also provided.
Type: Grant
Filed: December 12, 2013
Date of Patent: August 8, 2017
Assignees: Peking University Founder Group Co., Ltd., Founder APABI Technology Limited, Peking University
Inventors: Canhui Xu, Zhi Tang, Xin Tao, Cao Shi
-
Patent number: 9569407
Abstract: The present invention provides an apparatus for logically processing a composite graph in a formatted document, the apparatus comprising: a composite graph block extraction unit, used to extract a composite graph block in the formatted document; a document parsing unit, used to parse the formatted document to obtain a text element contained therein; a cutline element extraction unit, used to extract a cutline element from the text element; a correlativity detection unit, used to detect correlativity between the composite graph block and the cutline element; and a correlativity storage unit, used to store the detected correlativity. The present invention also provides a method for logically processing a composite graph in a formatted document. According to the technical scheme disclosed in the present invention, it is easy to achieve layout understanding of the composite graph in a graph-text mixed layout of the formatted document, so as to avoid a logical error.
Type: Grant
Filed: December 3, 2013
Date of Patent: February 14, 2017
Assignees: Peking University Founder Group Co., Ltd., Founder Apabi Technology Limited, Peking University
Inventors: Canhui Xu, Zhi Tang, Xin Tao, Cao Shi
-
Patent number: 9542362
Abstract: The present invention provides an apparatus for logically processing a composite graph in a formatted document, the apparatus comprising: a composite graph block extraction unit, used to extract a composite graph block in the formatted document; a document parsing unit, used to parse the formatted document to obtain a text element contained therein; a cutline element extraction unit, used to extract a cutline element from the text element; a correlativity detection unit, used to detect correlativity between the composite graph block and the cutline element; and a correlativity storage unit, used to store the detected correlativity. The present invention also provides a method for logically processing a composite graph in a formatted document. According to the technical scheme disclosed in the present invention, it is easy to achieve layout understanding of the composite graph in a graph-text mixed layout of the formatted document, so as to avoid a logical error.
Type: Grant
Filed: December 3, 2013
Date of Patent: January 10, 2017
Assignees: Peking University Founder Group Co., Ltd., Founder Apabi Technology Limited, Peking University
Inventors: Canhui Xu, Zhi Tang, Xin Tao, Cao Shi
-
Patent number: 9268999
Abstract: Provided is a table recognizing method, comprising: parsing and analyzing metadata information in an original fixed-layout document, and extracting basic elements on a page of the document; segmenting the basic elements, extracting segmented text lines on the page, and acquiring fragments; constructing an undirected graph with respect to each of the fragments; extracting an image on the page, detecting intersection points of horizontal lines and vertical lines, detecting an external bounding box of the intersection points, and taking whether the segmented text lines fall within the external bounding box as local relationship features; training a learning model according to the local relationship features, local features of the fragments, and neighborhood relationship features among the fragments, acquiring model parameters, and establishing a table recognizing model; and invoking the table recognizing model to perform table recognizing for the document, and acquiring a recognizing result.
Type: Grant
Filed: December 4, 2013
Date of Patent: February 23, 2016
Assignees: Peking University Founder Group Co., Ltd., Founder Apabi Technology Limited
Inventors: Canhui Xu, Zhi Tang, Jianbo Xu, Xin Tao
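Illustrative note: the bounding-box step named in this abstract can be sketched roughly as below. This is a minimal sketch, not the patented implementation. It assumes the intersection points and text-line boxes have already been detected, and it assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max). The function names `external_bounding_box`, `falls_within`, and `local_relationship_features` are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def external_bounding_box(points: List[Point]) -> Box:
    """Return the smallest axis-aligned box containing all intersection points."""
    if not points:
        raise ValueError("at least one intersection point is required")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def falls_within(line_box: Box, outer: Box) -> bool:
    """True if the text-line box lies entirely inside the outer box."""
    lx0, ly0, lx1, ly1 = line_box
    ox0, oy0, ox1, oy1 = outer
    return ox0 <= lx0 and oy0 <= ly0 and lx1 <= ox1 and ly1 <= oy1


def local_relationship_features(text_lines: List[Box],
                                intersections: List[Point]) -> List[int]:
    """One binary feature per text line: 1 if inside the external box, else 0."""
    outer = external_bounding_box(intersections)
    return [1 if falls_within(box, outer) else 0 for box in text_lines]


# Hypothetical data: four ruling-line intersections and two text lines.
intersections = [(10, 10), (110, 10), (10, 60), (110, 60)]
text_lines = [(20, 20, 100, 30), (20, 80, 100, 90)]
features = local_relationship_features(text_lines, intersections)
```

In this sketch the first text line lies inside the box spanned by the intersections and the second lies below it, so the two lines receive different feature values. How such features are combined with fragment features in the learning model is not specified here.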
-
Publication number: 20150093021
Abstract: Provided is a table recognizing method, comprising: parsing and analyzing metadata information in an original fixed-layout document, and extracting basic elements on a page of the document; segmenting the basic elements, extracting segmented text lines on the page, and acquiring fragments; constructing an undirected graph with respect to each of the fragments; extracting an image on the page, detecting intersection points of horizontal lines and vertical lines, detecting an external bounding box of the intersection points, and taking whether the segmented text lines fall within the external bounding box as local relationship features; training a learning model according to the local relationship features, local features of the fragments, and neighborhood relationship features among the fragments, acquiring model parameters, and establishing a table recognizing model; and invoking the table recognizing model to perform table recognizing for the document, and acquiring a recognizing result.
Type: Application
Filed: December 4, 2013
Publication date: April 2, 2015
Applicants: Founder Apabi Technology Limited, Peking University Founder Group Co., Ltd.
Inventors: Canhui XU, Zhi TANG, Jianbo XU, Xin TAO
-
Publication number: 20150095022
Abstract: A list recognizing method and system, comprising: parsing and analyzing metadata information within an original fixed-layout document, and extracting basic elements within a page; segmenting the basic elements, and extracting segmented text lines within the page to obtain fragments; building an undirected graph with respect to the fragments; detecting indent features of a bullet according to features of the basic elements; training a learning model according to the indent features, local features of the fragments, and neighborhood relation features among the fragments, obtaining model parameters, and establishing a list recognizing model; and invoking the list recognizing model to perform list recognizing on the required document, so as to obtain a recognition result.
Type: Application
Filed: December 4, 2013
Publication date: April 2, 2015
Applicants: Founder Apabi Technology Limited, Peking University Founder Group Co., Ltd.
Inventors: Canhui XU, Zhi TANG, Jianbo XU, Xin TAO
-
Publication number: 20150046784
Abstract: An extraction device for a composite graph in a fixed layout document, comprising: a document parsing unit, for parsing the fixed layout document and determining the primitives of the fixed layout document and their types; a layer generation unit, for extracting text primitives to form a text layer and using the remaining non-text primitives to form a non-text layer; a page analysis unit, for performing page analysis on the text layer and the non-text layer separately; a block generation unit, for generating a text block in the text layer and a graph block in the non-text layer; a correlation block determination unit, for determining the text blocks correlated with each graph block and merging those correlated text blocks and graph blocks into a composite graph block; and an identifier storage unit, for storing the identifiers of all the primitives contained in the composite graph block.
Type: Application
Filed: December 12, 2013
Publication date: February 12, 2015
Applicants: PEKING UNIVERSITY FOUNDER GROUP CO., LTD., PEKING UNIVERSITY, FOUNDER APABI TECHNOLOGY LIMITED
Inventors: Canhui XU, Zhi Tang, Xin Tao, Cao Shi
-
Publication number: 20140337719
Abstract: The present invention provides an apparatus for logically processing a composite graph in a formatted document, the apparatus comprising: a composite graph block extraction unit, used to extract a composite graph block in the formatted document; a document parsing unit, used to parse the formatted document to obtain a text element contained therein; a cutline element extraction unit, used to extract a cutline element from the text element; a correlativity detection unit, used to detect correlativity between the composite graph block and the cutline element; and a correlativity storage unit, used to store the detected correlativity. The present invention also provides a method for logically processing a composite graph in a formatted document. According to the technical scheme disclosed in the present invention, it is easy to achieve layout understanding of the composite graph in a graph-text mixed layout of the formatted document, so as to avoid a logical error.
Type: Application
Filed: December 3, 2013
Publication date: November 13, 2014
Applicants: Peking University Founder Group Co., Ltd., Peking University, Founder Apabi Technology Limited
Inventors: Canhui Xu, Zhi Tang, Xin Tao, Cao Shi
-
LOGIC PROCESSING APPARATUS AND LOGIC PROCESSING METHOD FOR COMPOSITE GRAPHS IN FIXED LAYOUT DOCUMENT
Publication number: 20140337717
Abstract: A logic process apparatus for composite graphs in a fixed layout document is provided in this invention, comprising: a composite graph block extraction unit, for extracting composite graph blocks from the fixed layout document; a document parsing unit, for parsing the fixed layout document to obtain text primitives contained therein; a legend primitive extraction unit, for extracting legend primitives from the text primitives; a correlation detection unit, for detecting correlations between the composite graph blocks and the legend primitives; and a correlation storage unit, for storing the detected correlations. A logic process method for composite graphs in a fixed layout document is also provided.
Type: Application
Filed: December 12, 2013
Publication date: November 13, 2014
Applicants: PEKING UNIVERSITY FOUNDER GROUP CO., LTD., PEKING UNIVERSITY, FOUNDER APABI TECHNOLOGY LIMITED
Inventors: Canhui XU, Zhi TANG, Xin TAO, Cao SHI