Abstract: A computer program is disclosed including but not limited to instructions to input an initial description of a data format and a batch of data comprising data in a new data format not covered by the initial description, instructions to use the first description to parse the records in the data source, instructions to discard records in the input data that parse successfully, instructions to collect records that fail to parse, instructions to accumulate a quantity M of records that fail to parse, instructions to return a modified description that extends the initial description to cover the new data, instructions to transform the first description, D, into a second description, D′, to accommodate differences between the input data format and the first description D by introducing options where a piece of data was missing in the input data and introducing unions where a new type of data was found in the input data; and instructions to use a non-incremental format inference system such as LEARNPADS to infer descri
Type: Grant
Filed: November 27, 2009
Date of Patent: December 25, 2012
Assignee: AT&T Intellectual Property I, LP
Inventors: Kathleen Fisher, David Walker, Kenny Zhu
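
The loop described in the abstract (parse with the current description, discard successes, collect failures, and refine once M failures have accumulated) can be illustrated with a minimal Python sketch. This is not the patented method and does not use PADS or LEARNPADS. The names `Field`, `classify`, `parses`, `refine`, and `incremental_update` are hypothetical, and the sketch assumes comma-separated records with a fixed number of columns, three toy base types, and an empty token standing for a missing piece of data.

```python
from dataclasses import dataclass


def classify(token: str) -> str:
    """Return a toy base type name for a token (assumption: only three types)."""
    try:
        int(token)
        return "int"
    except ValueError:
        pass
    try:
        float(token)
        return "float"
    except ValueError:
        return "string"


@dataclass
class Field:
    types: set              # union of base types accepted in this column
    optional: bool = False  # True if the column may be empty (an "option")


def parses(description, record, sep=","):
    """Check whether a record matches the description, column by column."""
    tokens = record.split(sep)
    if len(tokens) != len(description):
        return False
    for fld, tok in zip(description, tokens):
        if tok == "":
            if not fld.optional:
                return False
        elif classify(tok) not in fld.types:
            return False
    return True


def refine(description, failures, sep=","):
    """Build D' from D: add options for missing data, unions for new types."""
    new = [Field(set(f.types), f.optional) for f in description]
    for record in failures:
        tokens = record.split(sep)
        if len(tokens) != len(new):
            continue  # column-count changes are out of scope for this sketch
        for fld, tok in zip(new, tokens):
            if tok == "":
                fld.optional = True           # missing data: introduce an option
            else:
                fld.types.add(classify(tok))  # new type: widen the union
    return new


def incremental_update(description, records, m):
    """Discard records that parse; refine after every m collected failures."""
    failures = []
    for record in records:
        if parses(description, record):
            continue
        failures.append(record)
        if len(failures) >= m:
            description = refine(description, failures)
            failures = []
    if failures:
        # Assumption of this sketch: leftover failures are also folded in.
        description = refine(description, failures)
    return description
```

For example, starting from `[Field({"int"}), Field({"string"})]` and feeding the records `"1,alice"`, `"2,"`, `"3.5,bob"`, `"4,carol"` with `m=2`, the first column widens to a union of `int` and `float` and the second column becomes optional, so all four records parse under the returned description.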