Abstract: Systems and methods are provided for creating tables using auto-generated templates. Reports including lines of text to be extracted into tables are received. An auto define input is received to auto-generate the tables corresponding to the reports. Groups of lines are identified from among the lines of text in the reports. A detail group and relevant groups are selected and identified from among the groups of lines. A final detail group is created by merging the detail group with at least a portion of the relevant groups. Append groups are identified from among the groups of lines not included in the final detail group. Templates corresponding to the final detail group and the append groups are generated. Text is extracted from the reports based on the templates. Tables are generated using the text extracted from the reports, by assigning the text from the text fragments to entries in the tables.
Abstract: Systems and methods are provided for generating tables from print-ready digital source documents. A document is received and one or more text fragments are identified on a rendered page of the document. A wrapping region collection is generated, comprising one or more wrapping regions. A tabular score, a narrative score, and a label score are generated for each wrapping region. A block type is assigned to each wrapping region based on the scores. A wrapping region group and a block set are generated. One or more tables are generated based on text fragments corresponding to one of the one or more blocks. The text fragments are organized into corresponding fields of the one or more tables.
Type:
Grant
Filed:
June 2, 2017
Date of Patent:
May 14, 2019
Assignee:
Datawatch Corporation
Inventors:
Mark Stephen Kyre, Jeffrey Lucas Eldridge, Austin Alexander Spears, Samuel Allen Hudock
Abstract: Systems and methods are provided for generating tables from print-ready digital source documents. A document is received and one or more text fragments are identified on a rendered page of the document. A wrapping region collection is generated, comprising one or more wrapping regions. A tabular score, a narrative score, and a label score are generated for each wrapping region. A block type is assigned to each wrapping region based on the scores. A wrapping region group and a block set are generated. One or more tables are generated based on text fragments corresponding to one of the one or more blocks. The text fragments are organized into corresponding fields of the one or more tables.
Type:
Grant
Filed:
January 12, 2016
Date of Patent:
July 11, 2017
Assignee:
Datawatch Corporation
Inventors:
Mark Stephen Kyre, Jeffrey Lucas Eldridge, Austin Alexander Spears, Samuel Allen Hudock