Abstract: An automated method is described for generating training samples used to classify electronic images of commercial documents such as invoices, bills of lading, and explanations of benefits. An image of a document page is generated to serve as a representative of similar pages from the same origin, regardless of whether the page contains both permanent and variable information or only permanent information.
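
The abstract does not say how the representative image is computed. As an illustration only, the sketch below assumes one plausible approach: pages from the same origin are already binarized and aligned to the same size, and a pixel is kept in the representative image only if it is inked on most of the pages. Under that assumption, permanent content (logos, form lines, field labels) survives while variable content (amounts, dates, names) drops out. The function name `representative_page`, the 0/1 pixel encoding, and the `min_fraction` threshold are hypothetical and are not taken from the described method.

```python
from typing import List

# Hypothetical encoding: an image is a list of rows, each pixel 0 (blank) or 1 (ink).
Image = List[List[int]]


def representative_page(pages: List[Image], min_fraction: float = 0.8) -> Image:
    """Keep a pixel only if it is inked on at least min_fraction of the pages.

    Assumes the pages are already binarized, aligned, and equal in size.
    """
    if not pages or not pages[0] or not pages[0][0]:
        raise ValueError("at least one non-empty page is required")
    height, width = len(pages[0]), len(pages[0][0])
    for page in pages:
        if len(page) != height or any(len(row) != width for row in page):
            raise ValueError("all pages must have the same dimensions")

    # Number of pages on which a pixel must be inked to count as permanent.
    needed = min_fraction * len(pages)
    return [
        [1 if sum(page[y][x] for page in pages) >= needed else 0
         for x in range(width)]
        for y in range(height)
    ]


# Three tiny pages: the first row is shared (permanent), the second row varies.
page_a = [[1, 1, 0, 0], [1, 0, 1, 0]]
page_b = [[1, 1, 0, 0], [0, 1, 0, 0]]
page_c = [[1, 1, 0, 0], [0, 0, 0, 1]]
template = representative_page([page_a, page_b, page_c])
print(template)
```

With a single input page, every inked pixel meets the threshold, so the page is returned unchanged. That matches the case in which a page carries only permanent information.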