Abstract:
A dissimilarity coefficient for estimating the dissimilarity between two bird atlas distributions is developed. The coefficient is based on the concept of Euclidean distance, and the atlas distributions are compared over all quarter-degree grid cells. Existing coefficients are not suitable for comparing distributions with different total areas or species with different mean reporting rates. In each grid cell the reliability of the reporting rate depends on the number of checklists collected for that cell; to account for this, each grid cell is weighted by a function of its number of checklists. To solve the problem of different levels of abundance and conspicuousness among species, the reporting rates are sorted into percentiles, using five or ten categories for the strictly positive reporting rates. The coefficient is scaled by the maximum possible sum of the differences, which would occur if there were no overlap between the two distributions, so that the dissimilarity coefficient lies between zero (a perfect match) and one (no overlap). Several variants of this coefficient are investigated and compared.
The continuity of observed reporting rates in a spatial cellular map indicates the presence of spatial autocorrelation, especially between observations in close proximity. We are particularly interested in measuring and comparing the continuity of the reporting rates in the bird distributions from The Atlas of Southern African Birds. The variogram, developed in geostatistics, estimates this spatial autocorrelation. The classical variogram estimator, however, depends on the scale of measurement and assumes that the data are intrinsically stationary. The bird atlas distribution maps contain trend, and the variance of each observation (reporting rate) is a function of the number of checklists collected for the grid cell and the underlying probability of encountering the species in the grid cell.
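For reference, the classical (Matheron) variogram estimator mentioned above averages the squared differences between all pairs of observations a given lag apart. The following is a minimal sketch only. It assumes the reporting rates are held in a two-dimensional NumPy array, that NaN marks grid cells without data, and that only pairs along rows and columns are used. The function name `classical_variogram` and these conventions are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def classical_variogram(z, h=1):
    """Classical estimate of 2*gamma(h) on a regular grid (illustrative sketch).

    z : 2D array of reporting rates, NaN where a cell has no data (assumed convention).
    h : lag in grid cells, applied along rows and along columns only.
    """
    z = np.asarray(z, dtype=float)
    # Differences between cells h steps apart, vertically and horizontally.
    d_rows = z[h:, :] - z[:-h, :]
    d_cols = z[:, h:] - z[:, :-h]
    sq = np.concatenate([d_rows.ravel() ** 2, d_cols.ravel() ** 2])
    # Drop pairs in which either cell is missing.
    sq = sq[~np.isnan(sq)]
    if sq.size == 0:
        return float("nan")
    # Unweighted mean of squared differences: every pair counts equally.
    return float(sq.mean())
```

Because the estimate is a mean of squared differences, multiplying every reporting rate by a constant c multiplies the result by c squared, which illustrates the dependence on the scale of measurement noted above. Every pair also counts equally, whatever the number of checklists behind each reporting rate.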
The approach developed by McNeill (1991) for removing binomial measurement error from the variogram is investigated but not found satisfactory. A weighted variogram, in which each squared difference is weighted by a function of the smaller of the two numbers of checklists, is developed. To make the variogram values comparable between species, the variogram is scaled by a function of the mean reporting rate. We are particularly interested in the first variogram value of each species distribution, 2γ(1).
The bird distribution maps in The Atlas of Southern African Birds show the raw observed reporting rates. Each reporting rate is a random variable subject to sampling error arising from binomial variation, which depends on the number of checklists collected for the grid cell and on the underlying probability of encountering the species. The distribution maps therefore display this measurement error. A smoothed version of the maps is expected to convey more clearly, to some extent, the pattern that the observed distributions are intended to show. Single-step regression methods are investigated as a fast approach to this problem, but they cause problems because observed reporting rates of zero are frequent and because they smooth the maps too heavily. Generalized Linear Models are then investigated: this iterative procedure is applied to model the reporting rates with a binomial distribution on square blocks of nine grid cells, with a value for the central cell 'predicted' in each regression. This approach is especially suited to accommodating the characteristics of the binomial distribution and is found to smooth the bird atlas distributions well. Because only a local window is used for each regression, the spatial autocorrelation is adequately captured by the spatial explanatory variables.
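The local-window regression described above can be sketched as follows. This is a hedged illustration, not the thesis's implementation. It assumes a logit link, assumes the explanatory variables are simply the row and column offsets within the three-by-three block, and assumes the central cell's own observation is included in the fit. The model is fitted by iteratively reweighted least squares, the standard iterative procedure for Generalized Linear Models. The function name `smooth_centre` is hypothetical.

```python
import numpy as np

def smooth_centre(successes, trials, max_iter=50, tol=1e-8):
    """Fitted reporting rate for the central cell of a 3x3 block (sketch).

    successes : 3x3 counts of checklists recording the species.
    trials    : 3x3 counts of checklists collected per grid cell.
    """
    y = np.asarray(successes, dtype=float).ravel()
    n = np.asarray(trials, dtype=float).ravel()
    # Assumed design: intercept plus row and column offsets from the centre.
    off = np.array([-1.0, 0.0, 1.0])
    r, c = np.meshgrid(off, off, indexing="ij")
    X = np.column_stack([np.ones(9), r.ravel(), c.ravel()])
    keep = n > 0  # cells without checklists carry no information
    X, y, n = X[keep], y[keep], n[keep]
    if n.size == 0:
        return float("nan")
    # Degenerate blocks have no finite maximum likelihood estimate.
    if y.sum() == 0:
        return 0.0
    if y.sum() == n.sum():
        return 1.0
    p0 = y.sum() / n.sum()
    beta = np.array([np.log(p0 / (1.0 - p0)), 0.0, 0.0])
    for _ in range(max_iter):
        eta = np.clip(X @ beta, -30.0, 30.0)  # guard against overflow
        mu = 1.0 / (1.0 + np.exp(-eta))
        var = mu * (1.0 - mu)
        z = eta + (y / n - mu) / var          # working response
        sw = np.sqrt(n * var)                 # square root of the IRLS weights
        new_beta = np.linalg.lstsq(X * sw[:, None], z * sw, rcond=None)[0]
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    # At the centre both offsets are zero, so only the intercept matters.
    b0 = float(np.clip(beta[0], -30.0, 30.0))
    return float(1.0 / (1.0 + np.exp(-b0)))
```

Applying such a function to every interior grid cell in turn would give a smoothed map. Cells with more checklists receive larger weights in each local fit, which is how the binomial character of the reporting rates enters the smoothing.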
Includes bibliographical references.