Climate Data Homogenization
Why is it important?
Climate data are records of observed climate conditions taken at specific sites and times, with particular instruments, under a set of standard procedures. A climate dataset therefore contains not only climate information at the observation sites but also the imprint of non-climatic factors, such as the environment of the observing station and the instruments and observation procedures under which the records were taken. When the data are used in climate analysis, the station records are assumed to be representative of climate conditions over a region. Unfortunately, this is not always the case.
For example, if an observing station is moved from a hilltop to a valley floor 300 meters lower in elevation, analysis of its temperature data will likely show an abrupt warming at the time of the relocation. This artificial jump would not be representative of temperature change in the region.
Also, consider a station located for 50 years in the garden of a competent and conscientious observer, and suppose a tree was planted west of the garden when the station was established. The instruments are maintained in good condition and the observer accurately records the temperature in the garden. The tree slowly grows and comes to shade the observing site during the late afternoon, when the daily maximum temperature is observed. As a result, the recorded daily maximum temperature gradually becomes lower than that over the surrounding area not shaded by the tree, and the station gradually becomes less representative of its surroundings.
As shown in a real example, where a Canadian station was moved about 100 m from its original site, trend analysis conducted on inhomogeneous data can be very unreliable. It is therefore important to remove the non-climatic factors from the data as far as possible before the data are used for climate change studies.
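The effect of the relocation example above on a trend estimate can be sketched numerically. The following is a minimal Python illustration under stated assumptions, not an analysis of real data: the series is synthetic and trend-free, the relocation is placed at year 25 of 50, and the 1.8 °C jump is an assumed value (300 m times a lapse rate of roughly 6 °C per km). All names are hypothetical.

```python
import random


def ols_slope(values):
    """Least-squares slope of values against their index (units per step)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return sxy / sxx


rng = random.Random(0)   # fixed seed so the sketch is repeatable
n_years = 50
move_year = 25           # hypothetical index of the relocation year
step = 1.8               # ASSUMED jump in deg C (300 m x ~6 deg C per km)

# Trend-free "true" regional climate: constant mean plus year-to-year noise.
true_series = [10.0 + rng.gauss(0.0, 0.5) for _ in range(n_years)]

# What the relocated station records: same climate plus a jump after the move.
recorded = [t + (step if i >= move_year else 0.0)
            for i, t in enumerate(true_series)]

# Removing the step (known here by construction) restores the original series.
adjusted = [r - (step if i >= move_year else 0.0)
            for i, r in enumerate(recorded)]

slope_true = ols_slope(true_series)
slope_recorded = ols_slope(recorded)
slope_adjusted = ols_slope(adjusted)

print(f"trend, true climate   : {10 * slope_true:+.2f} deg C/decade")
print(f"trend, recorded data  : {10 * slope_recorded:+.2f} deg C/decade")
print(f"trend, adjusted data  : {10 * slope_adjusted:+.2f} deg C/decade")
```

In this setup the step alone adds about 0.54 °C per decade to the fitted trend, whatever the noise, because the least-squares slope is linear in the data. In practice the size and date of the step are not known in advance and must be estimated.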
Purpose of Data Homogenization
The aim of climate data homogenization is to adjust observations, where necessary, so that the temporal variations in the adjusted data are caused only by climate processes. This is not an easy task. A great deal of effort has gone into developing methods to identify and remove non-climatic inhomogeneities. An authoritative review can be found in Peterson et al. (1998). The CCl (the WMO Commission for Climatology) has also developed a set of practical guidelines on how to deal with inhomogeneity problems, depending on the circumstances under which the inhomogeneity occurs. Most techniques developed so far are suited to data at monthly or longer time scales. A few methods have also been developed for use with daily data (e.g. Vincent et al. 2002). RHTest, an R-based toolkit that uses a two-phase regression technique (Wang 2003) to detect and adjust for inhomogeneities, is available from this site.
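To give a feel for the two-phase regression idea mentioned above, here is a minimal Python sketch. It is not the RHTest code. It assumes one common form of the model, in which the series has a single shared linear trend and the intercept may shift at an unknown date, and it picks the candidate date that maximizes an F-type statistic against a single straight line. The synthetic series, the function names, and the `min_seg` parameter are all hypothetical. No significance test is attempted, because the maximum of the statistic over many candidate dates does not follow the ordinary F distribution.

```python
import math


def _sums(xs, ys):
    """Return (mean_x, mean_y, Sxx, Sxy) for one segment."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, sxy


def sse_single_line(xs, ys):
    """Residual sum of squares for one straight line through all the data."""
    mx, my, sxx, sxy = _sums(xs, ys)
    slope = sxy / sxx
    return sum((y - my - slope * (x - mx)) ** 2 for x, y in zip(xs, ys))


def fit_common_trend_step(xs, ys, c):
    """Fit two intercepts and one shared slope; the shift starts at index c.

    Returns (sse, step), where step = intercept after - intercept before.
    """
    m1x, m1y, sxx1, sxy1 = _sums(xs[:c], ys[:c])
    m2x, m2y, sxx2, sxy2 = _sums(xs[c:], ys[c:])
    slope = (sxy1 + sxy2) / (sxx1 + sxx2)   # pooled within-segment slope
    a1 = m1y - slope * m1x
    a2 = m2y - slope * m2x
    sse = sum((y - (a1 if i < c else a2) - slope * x) ** 2
              for i, (x, y) in enumerate(zip(xs, ys)))
    return sse, a2 - a1


def detect_step(ys, min_seg=5):
    """Scan candidate dates; return (f_stat, index, step) of the best one."""
    n = len(ys)
    xs = list(range(n))
    sse0 = sse_single_line(xs, ys)
    best = None
    for c in range(min_seg, n - min_seg + 1):
        sse1, step = fit_common_trend_step(xs, ys, c)
        f_stat = (sse0 - sse1) / (sse1 / (n - 3))
        if best is None or f_stat > best[0]:
            best = (f_stat, c, step)
    return best


# Synthetic test series: deterministic pseudo-noise plus a 1.8 deg C shift
# starting at index 25 (both values are assumptions for illustration).
series = [10.0 + 0.3 * math.sin(2.4 * i) + (1.8 if i >= 25 else 0.0)
          for i in range(50)]

f_stat, c, step = detect_step(series)
print(f"most likely shift at index {c}, estimated size {step:+.2f} deg C")
```

Subtracting the estimated step from the values at and after the detected index would give an adjusted series. A real application would also need a proper significance test and, ideally, station metadata to confirm the date of the change.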
References
- Aguilar, E., I. Auer, M. Brunet, T.C. Peterson, and J. Wieringa, ??? Guidance on metadata and homogenization
- Peterson, T.C., Easterling, D.R., Karl, T.R., Groisman, P., Nicholls, N., Plummer, N., Torok, S., Auer, I., Böhm, R., Gullett, D., Vincent, L., Heino, R., Tuomenvirta, H., Mestre, O., Szentimrey, T., Salinger, J., Førland, E.J., Hanssen-Bauer, I., Alexandersson, H., Jones, P. and Parker, D., 1998: Homogeneity adjustments of in situ atmospheric climate data: a review. International Journal of Climatology, 18, 1493-1517.
- Vincent, L.A., X. Zhang, B.R. Bonsal, and W.D. Hogg, 2002: Homogenization of daily temperatures over Canada. Journal of Climate, 15, 1322-1334.
- Wang, X.L., 2003: Comments on "Detection of undocumented changepoints: A revision of the two-phase regression model". Journal of Climate, 16, 3383-3385.