Browsing by Author "Moodley, Deshen"
- Automated Machine Learning for Predicting Trends in Time Series Data (2021). Kouassi, Kouame; Moodley, Deshen. Open Access.
Recently, a hybrid Deep Neural Network (DNN) algorithm, TreNet, was proposed for predicting trends in time series data. While TreNet was shown to have superior performance to a number of alternative approaches, the validation method used did not take into account the sequential nature of time series data. It also did not deal with model update and model stability, which are important for real-world applications. Furthermore, in the TreNet paper and previous trend prediction research, Algorithm Selection and Hyperparameter Optimisation (ASHO) is performed manually. However, manual ASHO is expensive and often results in a sub-optimal or mediocre model because it needs extensive experimentation as well as domain-specific and Machine Learning (ML) expert knowledge. This dissertation replicates the TreNet experiments on the same datasets using a walk-forward validation method, which includes model update. The model is tested over multiple independent runs to evaluate model stability. TreNet, which takes in both raw point data and trend line features, is compared to vanilla DNNs and traditional ML algorithms that take in raw point data features. A recent Automated Machine Learning (AutoML) framework, namely the hybrid Bayesian optimisation and Hyperband (BOHB) framework, is implemented and evaluated for ASHO. The AutoML models are then compared to the manually tuned models. The results show that in general TreNet still performs better than the vanilla DNN, but not on all datasets as reported in the original TreNet paper. On non-stationary datasets, traditional ML models outperform DNN models. Across four datasets, the AutoML experiments found configurations whose models surpass or compare well against the average performance and stability levels of the configurations found through manual tuning for ASHO.
This work highlights the importance of using an appropriate validation method and evaluating model stability while developing and testing ML models for time series applications. It also demonstrates that AutoML techniques such as BOHB are effective at automatically finding well-performing models for predicting trends in time series data, thus making ML model development more systematic and less error-prone.
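The walk-forward validation with model update mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the window scheme or fold sizes, so the expanding training window, the function name `walk_forward_splits`, and its parameters are all assumptions made for illustration, not the dissertation's actual setup.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int, initial_train: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs in time order.

    Assumption: an expanding window, where each fold retrains on all
    observations seen so far (the "model update") and tests on the
    next `test_size` observations.
    """
    start = initial_train
    while start + test_size <= n_samples:
        train_idx = list(range(0, start))
        test_idx = list(range(start, start + test_size))
        yield train_idx, test_idx
        start += test_size  # absorb the tested block into the next training set


# Hypothetical example: 10 time steps, 4 used for the first training set.
splits = list(walk_forward_splits(n_samples=10, initial_train=4, test_size=2))
for train_idx, test_idx in splits:
    print(train_idx, "->", test_idx)
```

Unlike a shuffled split, every training index precedes every test index in each fold, so the model is never evaluated on data older than what it was trained on.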
- Evaluation of clustering techniques for generating household energy consumption patterns in a developing country (2019). Toussaint, Wiebke; Moodley, Deshen; Meyer, Thomas. Open Access.
This work compares and evaluates clustering techniques for generating representative daily load profiles that are characteristic of residential energy consumers in South Africa. The input data captures two decades of metered household consumption, covering 14 945 household years and 3 295 848 daily load patterns of a population with high variability across temporal, geographic, social and economic dimensions. Different algorithms, normalisation and pre-binning techniques are evaluated to determine the best clustering structure. The study shows that normalisation is essential for producing good clusters. Specifically, unit norm produces more usable and more expressive clusters than the zero-one scaler, which is the most common method of normalisation used in the domain. While pre-binning improves clustering results for the dataset, the choice of pre-binning method does not significantly impact the quality of clusters produced. Data representation, and especially the inclusion or removal of zero-valued profiles, is an important consideration in relation to the pre-binning approach selected. As in several previous studies, the k-means algorithm produces the best results. Introducing a qualitative evaluation framework facilitated the evaluation process and helped identify a top clustering structure that is significantly more usable than those that would have been selected based on quantitative metrics alone. The approach demonstrates how explicitly defined qualitative evaluation measures can aid in selecting a clustering structure that is more likely to have real-world application. To our knowledge this is the first work that uses cluster analysis to generate customer archetypes from representative daily load profiles in a highly variable, developing country context.
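The two normalisation approaches contrasted in this abstract can be sketched on a single load profile. The abstract does not say which norm "unit norm" refers to or how constant profiles are handled, so the use of the Euclidean (L2) norm, the zero-handling, and the function names below are assumptions for illustration only.

```python
import math
from typing import List


def unit_norm(profile: List[float]) -> List[float]:
    """Scale a profile to unit length (assumption: Euclidean / L2 norm)."""
    norm = math.sqrt(sum(v * v for v in profile))
    if norm == 0:
        return list(profile)  # assumption: all-zero profiles are left unchanged
    return [v / norm for v in profile]


def zero_one(profile: List[float]) -> List[float]:
    """Min-max scale a profile so its values span [0, 1]."""
    lo, hi = min(profile), max(profile)
    if hi == lo:
        return [0.0 for _ in profile]  # assumption: constant profiles map to zeros
    return [(v - lo) / (hi - lo) for v in profile]


# Hypothetical two-reading profile, chosen so the arithmetic is easy to check.
profile = [3.0, 4.0]
print(unit_norm(profile))
print(zero_one(profile))
```

The zero-one scaler stretches every profile to the full range regardless of shape, whereas unit-norm scaling preserves the relative proportions between readings.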
- Predicting anomalous weather events using supervised machine learning (2022). Williams, Edwina; Moodley, Deshen. Open Access.
The complexity and variability of atmospheric processes make it difficult to predict weather anomalies. Early detection of weather anomalies is critical to ensure that the necessary precautions are taken to limit the impact on people and economic activities. There is a growing interest in the use of machine learning techniques as an alternative to traditional weather forecasting methods. In this study, the use of machine learning techniques to predict daily maximum temperatures and detect temperature anomalies is investigated. Machine learning techniques were trained to predict weather anomalies for three stations in the Gauteng and Northern Cape provinces of South Africa. Three machine learning techniques were selected based on their use and performance in the relevant literature. The techniques include the Support Vector Machine, Artificial Neural Network and Huber Regressor. Both regression and classification-based techniques were evaluated and compared to determine which provides optimal performance for predicting temperatures and detecting anomalies. The regression-based techniques were trained to predict the daily maximum temperatures (for the next day) based on the previous three days' conditions. The predictions were evaluated based on the next-day prediction error and the anomaly detection rate in the predictions. Techniques based on classification were trained to classify whether an anomaly would occur the next day based on the previous three days' conditions. The results showed that the machine learning techniques performed well at predicting the next day's maximum temperatures. However, the techniques had a low success rate in detecting anomalies.
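The "previous three days" framing in this abstract corresponds to a sliding-window construction of training examples, sketched below. The study's actual input variables are not listed in the abstract, so using only past maximum temperatures as features, along with the function name `make_lagged_pairs`, is an assumption for illustration.

```python
from typing import List, Tuple


def make_lagged_pairs(
    series: List[float], n_lags: int = 3
) -> List[Tuple[List[float], float]]:
    """Build (features, target) pairs from a daily series.

    Each target is the value on day t; its features are the values on
    the `n_lags` days immediately before it.
    """
    pairs = []
    for t in range(n_lags, len(series)):
        pairs.append((series[t - n_lags:t], series[t]))
    return pairs


# Hypothetical daily maximum temperatures in degrees Celsius.
temps = [20.0, 21.0, 22.0, 23.0, 24.0]
for features, target in make_lagged_pairs(temps):
    print(features, "->", target)
```

The resulting pairs could then be passed to any regressor, or relabelled with an anomaly flag for the classification setting; how anomalies are defined is not given in the abstract, so that step is omitted here.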