Abstract: A method and apparatus for aggregating data and for building a virtual data model 4 of an organisation's data, which will typically be held in a plurality of different data sources 1. The method and apparatus first standardise and split 2 the data into different types, then perform a cleaning operation 3 on the standardised and split data, and from this build a virtual data model 4 which includes the cleaned data as well as an audit trail. The method and apparatus then perform matching and de-duplication operations on the cleaned data. This allows the output of a data set which has been improved and standardised and is of known quality.
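
The sequence of operations described above can be illustrated with a minimal sketch. It is an illustration under stated assumptions only, not the claimed implementation: the names `VirtualDataModel`, `standardise_and_split`, `clean`, `build_model` and `deduplicate`, the sample records, and the exact-match de-duplication rule are all hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualDataModel:
    """Hypothetical container: cleaned records plus an audit trail."""
    records: list = field(default_factory=list)
    audit_trail: list = field(default_factory=list)


def standardise_and_split(raw):
    """Map a raw record onto common lower-case field names, one entry per field."""
    # Assumption: sources label the same field differently (e.g. "Name" vs "name").
    return {key.strip().lower(): str(value) for key, value in raw.items()}


def clean(record, audit_trail):
    """Normalise whitespace and case, logging every change in the audit trail."""
    cleaned = {}
    for key, value in record.items():
        new_value = " ".join(value.split()).lower()
        if new_value != value:
            # Each audit entry records (field, original value, cleaned value).
            audit_trail.append((key, value, new_value))
        cleaned[key] = new_value
    return cleaned


def build_model(sources):
    """Aggregate every data source into one model of cleaned records."""
    model = VirtualDataModel()
    for source in sources:
        for raw in source:
            record = standardise_and_split(raw)
            model.records.append(clean(record, model.audit_trail))
    return model


def deduplicate(records, key_fields):
    """Match records on key_fields and keep the first record of each group."""
    seen = set()
    unique = []
    for record in records:
        match_key = tuple(record.get(name, "") for name in key_fields)
        if match_key not in seen:
            seen.add(match_key)
            unique.append(record)
    return unique


# Two hypothetical data sources holding the same person in different formats.
crm = [{"Name": "Ann  Lee", "Email": "ANN@EXAMPLE.COM"}]
billing = [{"name": "ann lee", "email": "ann@example.com"}]

model = build_model([crm, billing])
output = deduplicate(model.records, key_fields=("name", "email"))
```

In this sketch the two source records become identical only after standardising and cleaning, so the matching step can collapse them into one, while the audit trail retains the original values that were altered. A practical system would likely use fuzzy rather than exact matching; exact matching is used here only to keep the example short.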