Aspects of Bayesian inference, classification and anomaly detection

dc.contributor.advisor: Bassett, Bruce
dc.contributor.author: Roberts, Ethan
dc.date.accessioned: 2022-03-11T10:43:27Z
dc.date.available: 2022-03-11T10:43:27Z
dc.date.issued: 2021
dc.date.updated: 2022-03-11T10:42:52Z
dc.description.abstract: The primary objective of this thesis is to develop rigorous Bayesian tools for common statistical challenges arising in modern science, where there is a heightened demand for precise inference in the presence of large, known uncertainties. This thesis explores in detail two arenas where this manifests. The first is the development and testing of a unified Bayesian anomaly detection and classification framework (BADAC), which allows principled anomaly detection in the presence of measurement uncertainties, which are rarely incorporated into machine learning algorithms. BADAC deals with uncertainties by marginalising over the unknown, true value of the data. Using simulated data with Gaussian noise as an example, BADAC is shown to be superior to standard algorithms in both classification and anomaly detection performance in the presence of uncertainties. Additionally, BADAC provides well-calibrated classification probabilities, valuable for use in scientific pipelines. BADAC is therefore ideal where computational cost is not a limiting factor and statistical rigour is important. We discuss approximations to speed up BADAC, such as the use of Gaussian processes, and finally introduce a new metric, the Rank-Weighted Score (RWS), that is particularly suited to evaluating an algorithm's ability to detect anomalies. The second major exploration in this thesis presents methods for rigorous statistical inference in the presence of classification uncertainties and errors. Although this is explored specifically through supernova cosmology, the context is general. Supernova cosmology without spectra will be an important component of next-generation surveys, such as those from the Vera C. Rubin Observatory, because of their massive increases in data volume. This lack of supernova spectra results in uncertainty in both the redshifts and the types of the supernovae, which, if ignored, leads to significantly biased estimates of cosmological parameters. We present a hierarchical Bayesian formalism, zBEAMS, which addresses this problem by marginalising over the unknown or uncertain supernova redshifts and types to produce unbiased cosmological estimates that are competitive with those from supernova data with fully spectroscopically confirmed redshifts. zBEAMS thus provides a unified treatment of photometric redshifts, classification uncertainty and host galaxy misidentification, effectively correcting the inevitable contamination in the Hubble diagram with little or no loss of statistical power.
dc.identifier.apacitation: Roberts, E. (2021). Aspects of Bayesian inference, classification and anomaly detection. Faculty of Science, Department of Mathematics and Applied Mathematics. Retrieved from http://hdl.handle.net/11427/36053 en_ZA
dc.identifier.chicagocitation: Roberts, Ethan. "Aspects of Bayesian inference, classification and anomaly detection." Faculty of Science, Department of Mathematics and Applied Mathematics, 2021. http://hdl.handle.net/11427/36053 en_ZA
dc.identifier.citation: Roberts, E. 2021. Aspects of Bayesian inference, classification and anomaly detection. Faculty of Science, Department of Mathematics and Applied Mathematics. http://hdl.handle.net/11427/36053 en_ZA
dc.identifier.ris: TY - Doctoral Thesis AU - Roberts, Ethan DA - 2021 DB - OpenUCT DP - University of Cape Town KW - Applied Mathematics LK - https://open.uct.ac.za PY - 2021 T1 - Aspects of Bayesian inference, classification and anomaly detection TI - Aspects of Bayesian inference, classification and anomaly detection UR - http://hdl.handle.net/11427/36053 ER - en_ZA
dc.identifier.uri: http://hdl.handle.net/11427/36053
dc.identifier.vancouvercitation: Roberts E. Aspects of Bayesian inference, classification and anomaly detection. Faculty of Science, Department of Mathematics and Applied Mathematics, 2021 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/36053 en_ZA
dc.language.rfc3066: eng
dc.publisher.department: Department of Mathematics and Applied Mathematics
dc.publisher.faculty: Faculty of Science
dc.subject: Applied Mathematics
dc.title: Aspects of Bayesian inference, classification and anomaly detection
dc.type: Doctoral Thesis
dc.type.qualificationlevel: Doctoral
dc.type.qualificationlevel: PhD
Files
Original bundle
Name: thesis_sci_2021_roberts ethan.pdf
Size: 6.28 MB
Format: Adobe Portable Document Format