Abstract: Techniques for extracting data from electronic documents, including determining vertical positions for text elements encoded in an electronic document based on an intended visual appearance of the text elements; generating text rows for subsets of the text elements based on the vertical positions of the text elements; generating text cells, each associated with one of the text rows and including characters from one or more of the text elements used for the associated text row; obtaining a first set of rules selecting a row group type as a function of an indicated text row; obtaining a second set of rules selecting a row subgroup type as a function of an indicated text row; and creating a record in an electronic database, the record including a field value based on characters included in a text cell associated with a text row selected based on the first and second sets of rules.
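The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the applicants' implementation: the `TextElement` type, the `tolerance` value, the example rules, the `course` table schema, and the sample data are all hypothetical and chosen only to show one way the pieces could fit together. Each text element is assumed to already carry a vertical position derived from the document's rendered layout, and each cell here holds the characters of a single element for simplicity.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class TextElement:
    text: str
    x: float  # assumed horizontal position on the rendered page
    y: float  # assumed vertical position on the rendered page


def build_rows(elements, tolerance=2.0):
    """Group elements into text rows when their vertical positions are close."""
    rows = []
    for el in sorted(elements, key=lambda e: e.y):
        if rows and abs(el.y - rows[-1][0].y) <= tolerance:
            rows[-1].append(el)
        else:
            rows.append([el])
    return rows


def build_cells(row):
    """Produce the row's text cells, ordered left to right."""
    return [el.text for el in sorted(row, key=lambda e: e.x)]


def is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False


# Hypothetical rule sets: each rule is a (predicate, type) pair; first match wins.
GROUP_RULES = [
    (lambda cells: cells[0].startswith("Term"), "term"),
    (lambda cells: len(cells) >= 3, "course"),
]
SUBGROUP_RULES = [
    (lambda cells: cells[0].startswith("Term"), "term_header"),
    (lambda cells: is_number(cells[-1]), "graded_course"),
]


def classify(cells, rules, default="other"):
    """Select a type for a row by applying one rule set to its cells."""
    for predicate, label in rules:
        if predicate(cells):
            return label
    return default


def extract_records(elements, conn):
    """Create one database record per row selected by both rule sets."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS course (code TEXT, title TEXT, credits REAL)"
    )
    for row in build_rows(elements):
        cells = build_cells(row)
        group = classify(cells, GROUP_RULES)
        subgroup = classify(cells, SUBGROUP_RULES)
        if group == "course" and subgroup == "graded_course":
            conn.execute(
                "INSERT INTO course VALUES (?, ?, ?)",
                (cells[0], cells[1], float(cells[-1])),
            )
    conn.commit()


# Made-up sample input: three elements share a baseline within the tolerance.
elements = [
    TextElement("Term: Fall", 10, 100.0),
    TextElement("MATH101", 10, 120.4),
    TextElement("Calculus I", 80, 120.0),
    TextElement("4.0", 200, 119.8),
    TextElement("Notes", 10, 140.0),
]

conn = sqlite3.connect(":memory:")
extract_records(elements, conn)
records = conn.execute("SELECT code, title, credits FROM course").fetchall()
```

In this sketch the two rule sets are evaluated independently on the same row, and a record is created only when both select the expected types. A fuller version would likely merge adjacent elements into a single cell and use the group type to scope which subgroup rules apply.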
Type: Application
Filed: October 24, 2017
Publication date: April 25, 2019
Applicant: Education & Career Compass
Inventors: Sunil BALA, Kristopher Philip BARTH, Rahul BHATNAGAR