Patents by Inventor Canhui XU

Canhui XU has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9727536
    Abstract: A logic process apparatus for composite graphs in a fixed layout document is provided in this invention. The apparatus includes a composite graph block extraction unit, for extracting composite graph blocks from the fixed layout document, a document parsing unit, for parsing the fixed layout document to obtain text primitives contained therein, a legend primitive extraction unit, for extracting legend primitives from the text primitives, a correlation detection unit, for detecting correlations between the composite graph blocks and the legend primitives, and a correlation storage unit, for storing the detected correlations. A logic process method for composite graphs in a fixed layout document is also provided.
    Type: Grant
    Filed: December 12, 2013
    Date of Patent: August 8, 2017
    Assignees: Peking University Founder Group Co., Ltd., Founder APABI Technology Limited, Peking University
    Inventors: Canhui Xu, Zhi Tang, Xin Tao, Cao Shi
  • Patent number: 9569407
    Abstract: The present invention provides an apparatus for logically processing a composite graph in a formatted document, the apparatus comprising: a composite graph block extraction unit, used to extract a composite graph block in the formatted document; a document parsing unit, used to parse the formatted document to obtain a text element contained therein; a cutline element extraction unit, used to extract a cutline element from the text element; a correlativity detection unit, used to detect correlativity between the composite graph block and the cutline element; a correlativity storage unit, used to store the detected correlativity. The present invention also provides a method for logically processing a composite graph in a formatted document. According to the technical scheme disclosed in the present invention, it is easy to achieve layout understanding of the composite graph in a graph-text mixed layout of the formatted document, so as to avoid a logical error.
    Type: Grant
    Filed: December 3, 2013
    Date of Patent: February 14, 2017
    Assignees: Peking University Founder Group Co., Ltd., Founder Apabi Technology Limited, Peking University
    Inventors: Canhui Xu, Zhi Tang, Xin Tao, Cao Shi
  • Patent number: 9542362
    Abstract: The present invention provides an apparatus for logically processing a composite graph in a formatted document, the apparatus comprising: a composite graph block extraction unit, used to extract a composite graph block in the formatted document; a document parsing unit, used to parse the formatted document to obtain a text element contained therein; a cutline element extraction unit, used to extract a cutline element from the text element; a correlativity detection unit, used to detect correlativity between the composite graph block and the cutline element; a correlativity storage unit, used to store the detected correlativity. The present invention also provides a method for logically processing a composite graph in a formatted document. According to the technical scheme disclosed in the present invention, it is easy to achieve layout understanding of the composite graph in a graph-text mixed layout of the formatted document, so as to avoid a logical error.
    Type: Grant
    Filed: December 3, 2013
    Date of Patent: January 10, 2017
    Assignees: Peking University Founder Group Co., Ltd., Founder Apabi Technology Limited, Peking University
    Inventors: Canhui Xu, Zhi Tang, Xin Tao, Cao Shi
  • Patent number: 9268999
    Abstract: Provided is a table recognizing method, comprising: parsing and analyzing metadata information in an original fixed-layout document, and extracting basic elements on a page of the document; segmenting the basic elements, extracting segmented text lines on the page, and acquiring fragments; constructing an undirected graph with respect to each of the fragments; extracting an image on the page, detecting intersection points of horizontal lines and vertical lines, detecting an external bounding box of the intersection points, and taking whether the segmented text lines fall within the external bounding box as local relationship features; training a learning model according to the local relationship features, local features of the fragments, and neighborhood relationship features among the fragments, acquiring model parameters, and establishing a table recognizing model; and invoking the table recognizing model to perform table recognizing for the document, and acquiring a recognizing result.
    Type: Grant
    Filed: December 4, 2013
    Date of Patent: February 23, 2016
    Assignees: Peking University Founder Group Co., Ltd., Founder Apabi Technology Limited
    Inventors: Canhui Xu, Zhi Tang, Jianbo Xu, Xin Tao
  • Publication number: 20150093021
    Abstract: Provided is a table recognizing method, comprising: parsing and analyzing metadata information in an original fixed-layout document, and extracting basic elements on a page of the document; segmenting the basic elements, extracting segmented text lines on the page, and acquiring fragments; constructing an undirected graph with respect to each of the fragments; extracting an image on the page, detecting intersection points of horizontal lines and vertical lines, detecting an external bounding box of the intersection points, and taking whether the segmented text lines fall within the external bounding box as local relationship features; training a learning model according to the local relationship features, local features of the fragments, and neighborhood relationship features among the fragments, acquiring model parameters, and establishing a table recognizing model; and invoking the table recognizing model to perform table recognizing for the document, and acquiring a recognizing result.
    Type: Application
    Filed: December 4, 2013
    Publication date: April 2, 2015
    Applicants: Founder Apabi Technology Limited, Peking University Founder Group Co., Ltd.
    Inventors: Canhui XU, Zhi TANG, Jianbo XU, Xin TAO
  • Publication number: 20150095022
    Abstract: A list recognizing method and system, comprising: parsing and analyzing metadata information within an original fixed-layout document, and extracting basic elements within a page; segmenting the basic elements, extracting segmented text lines within the page to obtain fragments; building an undirected graph with respect to the fragments; detecting indent features of a bullet according to features of the basic elements; training a learning model according to the indent features, local features of the fragments and neighborhood relation features among the fragments, obtaining model parameters, and establishing a list recognizing model; and invoking the list recognizing model to perform list recognizing on the required document, so as to obtain a recognition result.
    Type: Application
    Filed: December 4, 2013
    Publication date: April 2, 2015
    Applicants: Founder Apabi Technology Limited, Peking University Founder Group Co., Ltd.
    Inventors: Canhui XU, Zhi TANG, Jianbo XU, Xin TAO
  • Publication number: 20150046784
    Abstract: An extraction device for a composite graph in a fixed layout document, comprising: a document parsing unit, for parsing the fixed layout document, and determining the primitives of the fixed layout document and their types; a layer generation unit, for extracting text primitives so as to form a text layer, and using the remaining non-text primitives to form a non-text layer; a page analysis unit, for processing the text layer and the non-text layer with page analyses respectively; a block generation unit, for generating a text block in the text layer and a graph block in the non-text layer; a correlation block determination unit, for determining text blocks correlating to every graph block and merging those correlated text blocks and graph blocks into a composite graph block; and an identifier storage unit, for storing the identifiers of all the primitives contained in the composite graph block.
    Type: Application
    Filed: December 12, 2013
    Publication date: February 12, 2015
    Applicants: PEKING UNIVERSITY FOUNDER GROUP CO., LTD., PEKING UNIVERSITY, FOUNDER APABI TECHNOLOGY LIMITED
    Inventors: Canhui Xu, Zhi Tang, Xin Tao, Cao Shi
  • Publication number: 20140337719
    Abstract: The present invention provides an apparatus for logically processing a composite graph in a formatted document, the apparatus comprising: a composite graph block extraction unit, used to extract a composite graph block in the formatted document; a document parsing unit, used to parse the formatted document to obtain a text element contained therein; a cutline element extraction unit, used to extract a cutline element from the text element; a correlativity detection unit, used to detect correlativity between the composite graph block and the cutline element; a correlativity storage unit, used to store the detected correlativity. The present invention also provides a method for logically processing a composite graph in a formatted document. According to the technical scheme disclosed in the present invention, it is easy to achieve layout understanding of the composite graph in a graph-text mixed layout of the formatted document, so as to avoid a logical error.
    Type: Application
    Filed: December 3, 2013
    Publication date: November 13, 2014
    Applicants: Peking University Founder Group Co., Ltd., Peking University, Founder Apabi Technology Limited
    Inventors: Canhui Xu, Zhi Tang, Xin Tao, Cao Shi
  • Publication number: 20140337717
    Abstract: A logic process apparatus for composite graphs in a fixed layout document is provided in this invention, comprising: a composite graph block extraction unit, for extracting composite graph blocks from the fixed layout document; a document parsing unit, for parsing the fixed layout document to obtain text primitives contained therein; a legend primitive extraction unit, for extracting legend primitives from the text primitives; a correlation detection unit, for detecting correlations between the composite graph blocks and the legend primitives; a correlation storage unit, for storing the detected correlations. A logic process method for composite graphs in a fixed layout document is also provided.
    Type: Application
    Filed: December 12, 2013
    Publication date: November 13, 2014
    Applicants: PEKING UNIVERSITY FOUNDER GROUP CO., LTD., PEKING UNIVERSITY, FOUNDER APABI TECHNOLOGY LIMITED
    Inventors: Canhui XU, Zhi TANG, Xin TAO, Cao SHI
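
The table recognizing abstracts listed above (patent 9268999 and publication 20150093021) mention one local relationship feature: whether a segmented text line falls within the external bounding box of the detected line intersection points. The following is only a minimal sketch of that single step, not the patented method. The box representation `(x_min, y_min, x_max, y_max)`, the function names, and the 0/1 feature encoding are all assumptions made for illustration; the abstracts do not specify them.

```python
from typing import Iterable, List, Tuple

# Hypothetical representations, assumed for this sketch only.
Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def bounding_box(points: Iterable[Point]) -> Box:
    """Return the smallest axis-aligned box enclosing all points."""
    pts = list(points)
    if not pts:
        raise ValueError("need at least one intersection point")
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))


def falls_within(line_box: Box, outer: Box) -> bool:
    """True if line_box lies entirely inside outer (edges inclusive)."""
    lx0, ly0, lx1, ly1 = line_box
    ox0, oy0, ox1, oy1 = outer
    return ox0 <= lx0 and oy0 <= ly0 and lx1 <= ox1 and ly1 <= oy1


def local_relationship_features(
    text_lines: Iterable[Box], intersections: Iterable[Point]
) -> List[int]:
    """Encode, per text line, whether it falls inside the box (1) or not (0)."""
    outer = bounding_box(intersections)
    return [1 if falls_within(box, outer) else 0 for box in text_lines]


if __name__ == "__main__":
    # Four ruling-line intersections forming a rectangle (made-up coordinates).
    corners = [(10, 10), (10, 50), (90, 10), (90, 50)]
    # One text line inside the rectangle, one below it.
    lines = [(20, 20, 80, 30), (20, 60, 80, 70)]
    print(local_relationship_features(lines, corners))
```

In the abstracts, a feature of this kind is combined with local features of the fragments and neighborhood relationship features to train a learning model; that training step is not shown here.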