Abstract: Various embodiments of the invention provide systems and methods for classifying physical documents that have been converted to digital documents. Specifically, some embodiments are configured to classify digital documents that belong to a document classification whose representative members lack structure or have varying structure, either of which makes automatic classification of such documents using conventional methods difficult. For example, certain systems and methods according to the invention can be used to classify physical real estate documents that have been converted to digital real estate documents, especially those that lack a discernible document structure.
Abstract: Various embodiments of the invention provide systems and methods for extracting information from digital documents, including physical documents that have been converted to digital documents. For example, some embodiments are configured to extract information from a field in a digital document by identifying a block of tokens before (i.e., a prior block) and a block of tokens after (i.e., a post block) the field from which the information is to be extracted, where both the prior block and post block are known to be associated with the field type of the field (e.g., name, address, phone number, etc.).
Type:
Grant
Filed:
May 10, 2011
Date of Patent:
December 31, 2013
Assignee:
First American Data Tree LLC
Inventors:
Christopher Lawrence Rubio, Vladimir Sevastyanov
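The prior-block and post-block idea described in the extraction abstract above can be illustrated with a minimal sketch. This is not the patented method: the function names `find_block` and `extract_field`, the whitespace tokenization, the exact-match rule, and the sample sentence are all assumptions made for illustration. The sketch assumes the prior and post blocks for a field type are already known, and simply returns the tokens that sit between them.

```python
from typing import List, Optional, Sequence


def find_block(tokens: Sequence[str], block: Sequence[str], start: int = 0) -> int:
    """Return the index where `block` first occurs in `tokens` at or after
    `start`, or -1 if it does not occur. (Hypothetical helper.)"""
    n = len(block)
    for i in range(start, len(tokens) - n + 1):
        if list(tokens[i:i + n]) == list(block):
            return i
    return -1


def extract_field(
    tokens: Sequence[str],
    prior_block: Sequence[str],
    post_block: Sequence[str],
) -> Optional[List[str]]:
    """Return the tokens between the prior block and the post block,
    or None if either block is missing. (Hypothetical helper; exact
    matching is an assumption, not something the abstract specifies.)"""
    prior_at = find_block(tokens, prior_block)
    if prior_at < 0:
        return None
    field_start = prior_at + len(prior_block)
    # The post block must appear after the end of the prior block.
    post_at = find_block(tokens, post_block, field_start)
    if post_at < 0:
        return None
    return list(tokens[field_start:post_at])


# Made-up sample text, split on whitespace as a stand-in for real tokenization.
text = "This deed is made by the grantor JOHN Q SMITH whose address is 12 Elm St"
tokens = text.split()
name = extract_field(tokens, ["the", "grantor"], ["whose", "address"])
print(name)  # ['JOHN', 'Q', 'SMITH']
```

A practical system would presumably need tolerant matching (for example, to cope with OCR errors in converted physical documents) and a way to learn which blocks are associated with each field type; the sketch leaves both out.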