Abstract: An improved automatic data extractor discovers values that its vocabulary does not recognize and adds them both to the record being formed and to the vocabulary, so that the vocabulary grows through use. The extractor deduces new values from the structure of the text data and learns them by adding them to its vocabulary. It determines the structure of the data in much the same way as prior art data extractors; a discovery process then identifies a series of field lists, preferably using at least one field parser and a field grader. The grader's results are returned to an attribute mapper, which identifies the position in the field list of each attribute.
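
The discovery process described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names (`parse_fields`, `grade_fields`, `map_attributes`, `extract_and_learn`), the comma delimiter, the hit-count grading rule, and the sample vocabulary are all assumptions made for illustration. The sketch assumes a field parser that splits each line into a field list, a field grader that counts recognized values per field position, and an attribute mapper that assigns each attribute to the position where it was recognized most often. A value found at a mapped position is then added to the record and to the vocabulary, even if it was not previously recognized.

```python
from collections import Counter


def parse_fields(line, delimiter=","):
    """Hypothetical field parser: split one line of text into a field list."""
    return [field.strip() for field in line.split(delimiter)]


def grade_fields(field_lists, vocabulary):
    """Hypothetical field grader: for each field position, count how many
    values are already recognized as belonging to each attribute."""
    grades = {}  # position -> Counter mapping attribute -> number of hits
    for fields in field_lists:
        for pos, value in enumerate(fields):
            for attr, known_values in vocabulary.items():
                if value.lower() in known_values:
                    grades.setdefault(pos, Counter())[attr] += 1
    return grades


def map_attributes(grades):
    """Hypothetical attribute mapper: assign each attribute to the field
    position at which the grader recognized it most often."""
    mapping = {}  # attribute -> position
    for pos, counts in grades.items():
        for attr, hits in counts.items():
            if attr not in mapping or hits > grades[mapping[attr]][attr]:
                mapping[attr] = pos
    return mapping


def extract_and_learn(lines, vocabulary, delimiter=","):
    """Form one record per line and add every value found at a mapped
    position to the vocabulary, including previously unknown values."""
    field_lists = [parse_fields(line, delimiter) for line in lines]
    mapping = map_attributes(grade_fields(field_lists, vocabulary))
    records = []
    for fields in field_lists:
        record = {}
        for attr, pos in mapping.items():
            if pos < len(fields) and fields[pos]:
                record[attr] = fields[pos]
                vocabulary[attr].add(fields[pos].lower())  # learn the value
        records.append(record)
    return records


# Illustrative data only: "teal" and "medium" are not in the vocabulary.
vocabulary = {"color": {"red", "blue"}, "size": {"small", "large"}}
lines = ["red, small", "blue, large", "teal, medium"]
records = extract_and_learn(lines, vocabulary)
print(records[2])
print(sorted(vocabulary["color"]))
```

In this sketch the first two lines establish that colors occupy the first field position and sizes the second, so the unrecognized values in the third line are attributed by position alone and then added to the vocabulary. A fuller version would presumably try several parsers and let the grader choose among the resulting field lists, which this sketch does not attempt.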