In the match step, researchers align the attributes/parameters in the dataset’s schema with those of the mediated schema/ontology. To do so, the researcher must often consult the data descriptions of each parameter, which are either listed with the dataset in the source repository or described in the methods section of the accompanying paper.
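The match step described above can be sketched as a lookup against a correspondence table that the researcher compiles by hand from the data descriptions. This is a minimal illustration, not a prescribed method: the column names, the mediated-schema terms, and the helper `match_schema` are all hypothetical.

```python
# Hypothetical correspondence table: source attribute -> mediated-schema term.
# In practice the researcher fills this in after reading the data descriptions.
CORRESPONDENCES = {
    "temp_c": "water_temperature",
    "Depth (m)": "depth",
    "sal": "salinity",
}


def match_schema(source_columns, correspondences):
    """Split source columns into matched ones and ones needing manual review."""
    matched = {}
    unmatched = []
    for column in source_columns:
        if column in correspondences:
            matched[column] = correspondences[column]
        else:
            unmatched.append(column)
    return matched, unmatched


# "chl_a" has no entry in the table, so it is reported as unmatched.
matched, unmatched = match_schema(["temp_c", "Depth (m)", "chl_a"], CORRESPONDENCES)
```

Returning the unmatched attributes separately keeps them visible, so the researcher can go back to the source repository or the accompanying paper before deciding how to treat them.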
In some cases, the semantics of the data in one source differ slightly from those of the mediated schema/ontology. In such cases, a mapping phase is needed, in which conversion functions are generated to facilitate data integration according to the correspondences found in the matching step. More mundane, but equally crucial, is the need to map from the source format to that of the central repository used to collect the data from the different datasets.
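As an illustration of the mapping phase, the sketch below attaches a conversion function to each correspondence. It assumes, purely for the example, a source that reports temperature in degrees Fahrenheit and depth in centimetres while the mediated schema expects degrees Celsius and metres; all field names and the helper `map_record` are hypothetical.

```python
def fahrenheit_to_celsius(value):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (value - 32.0) * 5.0 / 9.0


def centimetres_to_metres(value):
    """Convert a length from centimetres to metres."""
    return value / 100.0


# Hypothetical mapping: source field -> (mediated-schema field, conversion).
CONVERSIONS = {
    "temp_f": ("water_temperature_c", fahrenheit_to_celsius),
    "depth_cm": ("depth_m", centimetres_to_metres),
}


def map_record(record, conversions):
    """Rename and convert the fields of one source record.

    Fields without a known correspondence are dropped from the result.
    """
    mapped = {}
    for source_name, value in record.items():
        if source_name in conversions:
            target_name, convert = conversions[source_name]
            mapped[target_name] = convert(value)
    return mapped


mapped = map_record({"temp_f": 50.0, "depth_cm": 250.0, "note": "cast 3"}, CONVERSIONS)
```

Keeping the conversions in a single table makes the assumptions about each source's units explicit and easy to audit when a new dataset is added.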
In this step, researchers need to mitigate problems that arise from differences in spatiotemporal resolution between the datasets. For example, one dataset may include measurements to a depth of 50 m in increments of 1 m, while another uses increments of 10 cm. Decisions must be made on whether to aggregate upwards to a lower resolution, omit incompatible resolutions, interpolate the data to align the resolutions, or fill in missing data in some areas.
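Two of the options above, aggregating to a lower resolution and interpolating to a higher one, can be sketched as follows. The example assumes depth profiles stored as (depth in centimetres, value) pairs, uses the bin mean as the aggregate, and uses linear interpolation; these are illustrative choices, and the function names and sample values are hypothetical.

```python
from collections import defaultdict


def aggregate_to_coarser(samples, bin_cm=100):
    """Average fine-resolution samples into bins of width bin_cm.

    samples: iterable of (depth_cm, value) pairs with integer depths.
    Returns {bin_start_cm: mean value of the samples in that bin}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for depth_cm, value in samples:
        bin_start = (depth_cm // bin_cm) * bin_cm
        sums[bin_start] += value
        counts[bin_start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


def interpolate_linear(coarse, depth_cm):
    """Linearly interpolate a value at depth_cm from sorted (depth_cm, value) pairs.

    Returns None outside the measured range, leaving the gap explicit
    instead of extrapolating.
    """
    for (d0, v0), (d1, v1) in zip(coarse, coarse[1:]):
        if d0 <= depth_cm <= d1:
            return v0 + (v1 - v0) * (depth_cm - d0) / (d1 - d0)
    return None


# Synthetic 10 cm profile covering the top 2 m (values invented for the example).
fine = [(d, 20.0 - 0.01 * d) for d in range(0, 200, 10)]
coarse_means = aggregate_to_coarser(fine)  # one mean per 1 m bin

# Synthetic 1 m profile, interpolated at an intermediate depth.
midpoint = interpolate_linear([(0, 20.0), (100, 19.0)], 50)
```

Aggregation discards fine-scale structure but introduces no invented values, whereas interpolation preserves the finer grid at the cost of assuming smooth variation between measurements. Which trade-off is acceptable depends on the analysis the integrated data is meant to support.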