Classification of multiwavelength transients with machine learning

dc.contributor.advisor: Lochner, Michelle
dc.contributor.advisor: Bassett, Bruce
dc.contributor.author: Sooknunan, Kimeel
dc.date.accessioned: 2020-02-24T12:38:09Z
dc.date.available: 2020-02-24T12:38:09Z
dc.date.issued: 2019
dc.date.updated: 2020-02-24T08:37:16Z
dc.description.abstract: With the advent of powerful telescopes such as the Square Kilometre Array (SKA), its precursor MeerKAT, and the Large Synoptic Survey Telescope (LSST), we are entering a golden era of multiwavelength transient astronomy. The large MeerKAT science project ThunderKAT may dramatically increase the number of detected radio transients. Radio transient datasets are currently still very small, which allows spectroscopic classification of all objects of interest. As the event rate increases, follow-up resources must be prioritised using early classification of the radio data. Machine learning algorithms have proven invaluable in optical astronomy, but they have yet to be applied to radio transients. In the burgeoning era of multimessenger astronomy, incorporating data from different telescopes such as MeerLICHT, Fermi, LSST and the gravitational wave observatory LIGO could significantly improve the classification of events. Here we present MALT (Machine Learning for Transients): a general machine learning pipeline for multiwavelength transient classification. To make use of most machine learning algorithms, "features" must be extracted from complex and often high-dimensional datasets. In our approach, we first interpolate the data onto a uniform grid using Gaussian processes, then perform a wavelet decomposition, and finally reduce the dimensionality using principal component analysis. We then classify the light curves with the popular random forest algorithm. For the first time, we apply machine learning to the classification of radio transients. Unfortunately, publicly available radio transient data are scarce: our dataset consists of just 87 light curves, and several classes contain only a single example. However, machine learning is often applied to such small datasets by making use of data augmentation.
We develop a novel data augmentation technique based on Gaussian processes that generates new data statistically consistent with the original. Because the dataset is currently small, three studies were carried out on the effect of the training set. First, the classifier was trained on a non-representative training set, achieving an overall accuracy of 77.8% over all 11 classes using the 87 known light curves with just eight hours of observations. Second, the expected increase in performance as more training data are acquired was shown by training the classifier on a simulated representative training set, achieving an average accuracy of 95.8% across all 11 classes. Finally, the effectiveness of including multiwavelength data for general transient classification was demonstrated. The classifier was first trained on wavelet features and a contextual feature, achieving an average accuracy of 72.9%. It was then trained on the same wavelet and contextual features together with a single optical flux feature, which improved the overall accuracy to 94.7%. This work provides a general approach for multiwavelength transient classification and shows that machine learning can be highly effective at classifying the influx of radio transients anticipated with MeerKAT and other radio telescopes.
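The feature-extraction steps named in the abstract (Gaussian process interpolation onto a uniform grid, wavelet decomposition, principal component analysis) can be illustrated with a minimal NumPy sketch. This is not the MALT code: the squared-exponential kernel with fixed hyperparameters, the zero-mean prior, the Haar wavelet, the grid size, and all function names are assumptions chosen for illustration, and the random forest classification step is omitted. The `augment` function shows one plausible reading of Gaussian-process-based augmentation, drawing new curves from the GP posterior; the thesis may do this differently.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between 1-D time arrays a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(t, y, yerr, t_grid, length_scale=1.0, variance=1.0):
    """Zero-mean GP posterior mean and covariance on t_grid given noisy fluxes."""
    K = rbf_kernel(t, t, length_scale, variance) + np.diag(yerr ** 2)
    Ks = rbf_kernel(t_grid, t, length_scale, variance)
    Kss = rbf_kernel(t_grid, t_grid, length_scale, variance)
    mean = Ks @ np.linalg.solve(K, y)
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, cov

def haar_dwt(x):
    """Full multilevel orthonormal Haar decomposition; len(x) must be a power of two."""
    coeffs = []
    approx = np.asarray(x, dtype=float)
    while approx.size > 1:
        even, odd = approx[0::2], approx[1::2]
        coeffs.append((even - odd) / np.sqrt(2.0))  # detail coefficients at this level
        approx = (even + odd) / np.sqrt(2.0)        # coarser approximation
    coeffs.append(approx)
    return np.concatenate(coeffs[::-1])

def pca_reduce(X, n_components):
    """Project the rows of X onto the leading principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def extract_features(curves, n_grid=32, n_components=3):
    """curves: list of (t, flux, flux_err). Returns an (n_curves, n_components) array."""
    rows = []
    for t, y, yerr in curves:
        t_grid = np.linspace(t.min(), t.max(), n_grid)
        mean, _ = gp_posterior(t, y, yerr, t_grid)
        rows.append(haar_dwt(mean))
    return pca_reduce(np.vstack(rows), n_components)

def augment(t, y, yerr, t_grid, n_draws, rng):
    """Draw synthetic light curves from the GP posterior (illustrative augmentation)."""
    mean, cov = gp_posterior(t, y, yerr, t_grid)
    cov = cov + 1e-8 * np.eye(t_grid.size)  # small jitter for numerical stability
    return rng.multivariate_normal(mean, cov, size=n_draws, check_valid="ignore")

# Toy usage on simulated, irregularly sampled light curves (not real radio data).
rng = np.random.default_rng(0)
curves = []
for k in range(6):
    t = np.sort(rng.uniform(0.0, 10.0, 15))
    flux = np.exp(-0.5 * ((t - 3.0 - k) / 1.5) ** 2) + 0.05 * rng.normal(size=t.size)
    curves.append((t, flux, np.full(t.size, 0.05)))

features = extract_features(curves)
t0, y0, e0 = curves[0]
synthetic = augment(t0, y0, e0, np.linspace(0.0, 10.0, 32), n_draws=4, rng=rng)
```

In a full pipeline the rows of `features` would be passed, with class labels, to a random forest classifier, and the kernel hyperparameters would normally be fitted per light curve rather than fixed.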
dc.identifier.apacitation: Sooknunan, K. (2019). Classification of multiwavelength transients with machine learning. Faculty of Science, Department of Astronomy. Retrieved from http://hdl.handle.net/11427/31271 [en_ZA]
dc.identifier.chicagocitation: Sooknunan, Kimeel. "Classification of multiwavelength transients with machine learning." Faculty of Science, Department of Astronomy, 2019. http://hdl.handle.net/11427/31271 [en_ZA]
dc.identifier.citation: Sooknunan, K. 2019. Classification of multiwavelength transients with machine learning. [en_ZA]
dc.identifier.ris [en_ZA]:
TY - Thesis / Dissertation
AU - Sooknunan, Kimeel
DA - 2019
DB - OpenUCT
DP - University of Cape Town
KW - Astronomy
LK - https://open.uct.ac.za
PY - 2019
T1 - Classification of multiwavelength transients with machine learning
TI - Classification of multiwavelength transients with machine learning
UR - http://hdl.handle.net/11427/31271
ER -
dc.identifier.uri: http://hdl.handle.net/11427/31271
dc.identifier.vancouvercitation: Sooknunan K. Classification of multiwavelength transients with machine learning. Faculty of Science, Department of Astronomy, 2019 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/31271 [en_ZA]
dc.language.rfc3066: eng
dc.publisher.department: Department of Astronomy
dc.publisher.faculty: Faculty of Science
dc.subject: Astronomy
dc.title: Classification of multiwavelength transients with machine learning
dc.type: Master Thesis
dc.type.qualificationlevel: Masters
dc.type.qualificationname: MSc
Files
Original bundle
Name: thesis_sci_2019_sooknunan_kimeel.pdf
Size: 7.31 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 0 B
Format: Item-specific license agreed upon to submission