Browsing by Author "Zucchini, Walter"
- Item, Open Access: A comparative study of stochastic models in biology (1997)
  Brandão, Anabela de Gusmão; Zucchini, Walter; Underhill, Les
  In many instances, problems that arise in biology do not fall under any category for which standard statistical techniques are available. In these situations, specific methods have to be developed to answer the questions put forward by biologists. In this thesis, four different problems occurring in biology are investigated. A stochastic model is built in each case which describes the problem at hand. These models are not only effective as a descriptive tool but also afford strategies, consistent with conventional model selection processes, to deal with the standard statistical hypothesis testing situations. The abstracts of the papers resulting from these problems are presented below.
- Item, Open Access: The analysis of some bivariate astronomical time series (1993)
  Koen, Marthinus Christoffel; Zucchini, Walter
  In the first part of the thesis, a linear time domain transfer function is fitted to satellite observations of a variable galaxy, NGC5548. The transfer function relates an input series (ultraviolet continuum flux) to an output series (emission line flux). The methodology for fitting transfer functions is briefly described. The autocorrelation structure of the observations of NGC5548 in different electromagnetic spectral bands is investigated, and appropriate univariate autoregressive moving average models are given. The results of extensive transfer function fitting, using the λ1337 and λ1350 continuum variations respectively as input series, are presented. There is little evidence for a dead time in the response of the emission line variations, which are presumed to be driven by the continuum. Part 2 of the thesis is devoted to the estimation of the lag between two irregularly spaced astronomical time series. Lag estimation methods which have been used in the astronomy literature are reviewed. Some problems are pointed out, particularly the influence of autocorrelation and non-stationarity of the series. If the two series can be modelled as random walks, both these problems can be dealt with efficiently. Maximum likelihood estimation of the random walk and measurement error variances, as well as the lag between the two series, is discussed. Large-sample properties of the estimators are derived. An efficient computational procedure for the likelihood, which exploits the sparseness of the covariance matrix, is briefly described. Results are derived for two example data sets: the variations in the two gravitationally lensed images of a quasar, and brightness changes of the active galaxy NGC3783 in two different wavelengths. The thesis is concluded with a brief consideration of other analysis methods which appear interesting.
- Item, Open Access: Discriminant analysis: a review of its application to the classification of grape cultivars (1989)
  Blignaut, Rennette Julia; Zucchini, Walter; Stewart, Theodor J
  The aim of this study was to calculate a classification function for discriminating between five grape cultivars with a view to determining the cultivar of an unknown grape juice. In order to discriminate between the five grape cultivars, various multivariate statistical techniques, such as principal component analysis, cluster analysis, correspondence analysis and discriminant analysis, were applied. Discriminant analysis proved to be the most appropriate technique for the problem at hand, and therefore an in-depth study of this technique was undertaken. Discriminant analysis was the most appropriate technique for classifying these grape samples into distinct cultivars because this technique utilized prior information of population membership. This thesis is divided into two main sections. The first section (chapters 1 to 5) is a review of discriminant analysis, describing various aspects of this technique and matters related thereto. In the second section (chapter 6) the theories discussed in the first section are applied to the problem at hand. The results obtained when discriminating between the different grape cultivars are given. Chapter 1 gives a general introduction to the subject of discriminant analysis, including certain basic derivations used in this study. Two approaches to discriminant analysis are discussed in Chapter 2, namely the parametric and non-parametric approaches. In this review the emphasis is placed on the classical approach to discriminant analysis. Non-parametric approaches such as the K-nearest neighbour technique, the kernel method and ranking are briefly discussed. Chapter 3 deals with estimating the probability of misclassification. In Chapter 4 variable selection techniques are discussed. Chapter 5 briefly deals with sequential and logistic discrimination techniques. The estimation of missing values is also discussed in this chapter. A final summary and conclusion is given in Chapter 7. Appendices A to D illustrate some of the results obtained from the practical analyses.
- Item, Open Access: The estimation of missing values in hydrological records using the EM algorithm and regression methods (1988)
  Makhuvha, Tondani; Zucchini, Walter; Sparks, Ross S
  The objective of this thesis is to review existing methods for estimating missing values in rainfall records and to propose a number of new procedures. Two classes of methods are considered. The first is based on the theory of variable selection in regression. Here the emphasis is on finding efficient methods to identify the set of control stations which are likely to yield the best regression estimates of the missing values in the target station. The second class of methods is based on the EM algorithm, proposed by Dempster, Laird and Rubin (1977). The emphasis here is to estimate the missing values directly without first making a detailed selection of control stations. All "relevant" stations are included. This method has not previously been applied in the context of estimating missing rainfall values.
- Item, Open Access: Models for ocean waves (1988)
  Button, Peter; Zucchini, Walter
  Ocean waves represent an important design factor in many coastal engineering applications. Although extreme wave height is usually considered the single most important of these factors, there are other important aspects that require consideration. These include the probability distribution of wave heights, the seasonal variation and the persistence, or duration, of calm and storm periods. If one is primarily interested in extreme wave height, then it is possible to restrict one's attention to events which are sufficiently separated in time to be effectively independently (and possibly even identically) distributed. However, the independence assumption is not tenable for the description of many other aspects of wave height behaviour, such as the persistence of calm periods. For this one has to take account of the serial correlation structure of observed wave heights, the seasonal behaviour of the important statistics, such as mean and standard deviation, and in fact the entire seasonal probability distribution of wave heights. In other words, the observations have to be regarded as a time series.
- Item, Open Access: Orthogonal models for cross-classified observations (1987)
  Bust, Reg; Zucchini, Walter
  This thesis describes methods of constructing models for cross-classified categorical data. In particular we discuss the construction of a class of approximating models and the selection of the most suitable model in the class. Examples of application are used to illustrate the methodology. The main purpose of the thesis is to demonstrate that it is both possible and advantageous to construct models which are specifically designed for the particular application under investigation. We believe that the methods described here allow the statistician to make good use of any expert knowledge which the client (typically a non-statistician) might possess on the subject to which the data relate.
- Item, Open Access: A stochastic model for daily climate (1986)
  Brandão, Anabela de Gusmão; Zucchini, Walter
  This thesis describes the results of a study to establish whether climate variables could be usefully modelled on a daily basis. Three stochastic models are considered for the description of daily climate sequences, which can then be used to generate artificial sequences. The climate variables under consideration are rainfall, maximum and minimum temperature, evaporation, sunshine duration, windrun and maximum and minimum humidity. A simple Markov chain-Weibull model is proposed to model rainfall. Three multivariate models (one proposed by Richardson (1981), two new) are suggested for modelling the remaining climate variables. The model parameters are allowed to vary seasonally, while the error term is assumed to follow an autoregressive process. The models were validated and their general performance was found to be satisfactory. Some weaknesses were identified and are discussed. The main conclusion of this study is that daily climate sequences can indeed be usefully described by means of stochastic models.
- Item, Open Access: Topics in interpolation and smoothing of spatial data (1994)
  McNeill, Lindsay; Zucchini, Walter
  This thesis addresses a number of special topics in spatial interpolation and smoothing. The motivation for the thesis comes from two projects, one being to extend the availability of a daily rainfall model for southern Africa to sites at which little or no rainfall data is available, using data from nearby sites, and the other arising from a need to improve the species abundance estimates used to produce maps for the Southern African Bird Atlas Project (SABAP) in areas where the original presence/absence data is sparse. Although problems of spatial interpolation and smoothing have been the subject of much research in recent years, leading to the development of the specialised discipline of geostatistics, these two problems have features which render the available methodology inappropriate in certain respects. The semi-variogram plays a central role in geostatistical work. In both of the applications considered here, the raw semi-variogram is 'contaminated' by error, but the error variance varies widely between data points, so that the spatial autocorrelation structure of the underlying error-free variable is blurred. An adjusted semi-variogram, which removes the effect of the measurement error, is defined and incorporated into the kriging equations. A number of measures have been proposed for kriging in the presence of trend, ranging from explicit modelling of a deterministic trend function to 'moving window' kriging, which assumes local stationarity as an approximation. The former approach is often inappropriate over large non-homogeneous regions, while the latter approach tends to underestimate the kriging variance. As an alternative strategy it is proposed here that the trend function be considered as another random variable, with a long-range spatial autocorrelation. This approach is simple to implement, and can also be used as a basis for filtering the data to separate trend from local or high-frequency variation. The daily rainfall model is based on a Fourier series representation giving rise to amplitude and phase parameters; the latter are circular in nature, and not amenable to analysis by standard techniques. This thesis describes a method of interpolation and smoothing, analogous to kriging, which is appropriate for unit vector data available at a number of spatial locations. The cumulated values of species counts in the SABAP are essentially binomially distributed, and thus again specialised techniques are required for interpolation. New geostatistical methods which cater for both binomial and Poisson data are presented. Another problem arises from the need to improve interpolated values of the rainfall model parameters by incorporating information on altitude. Although a number of approaches are possible, for example using co-kriging or kriging with external drift, difficulties are caused by the complexity of the relationship between the rainfall at a point and the surrounding topography. This problem is overcome by the use of orthogonal functions of altitude to model the patterns of topography.
- Item, Open Access: Variable selection in logistic regression, with special application to medical data (1994)
  Joubert, Georgina; Zucchini, Walter
  In this thesis, the various methods of variable selection which have been proposed in the statistical, epidemiological and medical literature for prediction and estimation problems in logistic regression will be described. The procedures will be applied to medical data sets. On the basis of the literature review as well as the applications to examples, strengths and weaknesses of the approaches will be identified. The procedures will be compared on the basis of the results obtained, their appropriateness for the specific aim of the analysis, and demands they place on the analyst and researcher, intellectually and computationally. In particular, certain selection procedures using bootstrap samples, which have not been used before, will be investigated, and the partial Gauss discrepancy will be extended to the case of logistic regression. Recommendations will be made as to which approaches are the most suitable or most practical in different situations. Most statistical texts deal with issues regarding prediction, whereas the epidemiological literature focuses on estimation. It is therefore hoped that the thesis will be a useful reference for those, statistically or epidemiologically trained, who have to deal with issues regarding variable selection in logistic regression. When fitting models in general, and logistic regression models in particular, it is standard practice to determine the goodness of fit of models, and to ascertain whether outliers or influential observations are present in a data set. These aspects will not be discussed in this thesis, although they were considered when fitting the models.