Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost
dc.contributor.advisor | Berman, Sonia | |
dc.contributor.author | Myburg, Marius Errol | |
dc.date.accessioned | 2023-07-12T10:20:29Z | |
dc.date.available | 2023-07-12T10:20:29Z | |
dc.date.issued | 2023 | |
dc.date.updated | 2023-07-12T10:16:39Z | |
dc.description.abstract | CRM) will continue to gain prominence in the coming years. A commonly used CRM metric called Customer Lifetime Value (CLV) is the value a customer will contribute while they are an active customer. This study investigated the ability of supervised machine learning models constructed with XGBoost to predict future CLV, as well as the likelihood that a customer will drop to a lower CLV in the future. One approach to determining CLV, called the RFM method, isolates recency (R), frequency (F) and monetary (M) values. The models used these RFM variables, and the study also assessed whether including temporal, product, and other customer transaction information helped the XGBoost classifier make better predictions. The classification models were constructed by extracting each customer's RFM values and transaction information from a Fast-Moving Consumer Goods dataset. Different variations of CLV were calculated through one- and two-dimensional K-means clustering of the M (Monetary), F and M (Profitability), F and R (Loyalty), and R and M (Burgeoning) variables. Two additional CLV variations were determined by isolating the M tercile segments and by applying a commonly used weighted-RFM approach. To test the effectiveness of XGBoost in predicting future timeframes, the dataset was divided into three consecutive periods, where the first period formed the features used to predict the target CLV variables in the second and third periods. Models that predicted whether CLV dropped to a lower value from the first to the second period and from the first to the third period were also constructed. It was found that the XGBoost models were moderately to highly effective in classifying future CLV in both the second and third periods. The models also effectively predicted whether CLV would drop to a lower value in both future periods.
The ability to predict future CLV and CLV drop in the second period was only slightly better than the ability to predict the future CLV in the third period. Models constructed by adding temporal, product, and customer transaction information to the RFM values did not improve on those that used only the RFM values. These findings illustrate the effectiveness of XGBoost as a predictor of future CLV and CLV drop, and affirm the efficacy of utilising RFM values to determine future CLV. | |
dc.identifier.apacitation | Myburg, M. E. (2023). <i>Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost</i>. Faculty of Science, Department of Computer Science. Retrieved from http://hdl.handle.net/11427/38088 | en_ZA |
dc.identifier.chicagocitation | Myburg, Marius Errol. <i>"Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost."</i> Faculty of Science, Department of Computer Science, 2023. http://hdl.handle.net/11427/38088 | en_ZA |
dc.identifier.citation | Myburg, M.E. 2023. Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost. Faculty of Science, Department of Computer Science. http://hdl.handle.net/11427/38088 | en_ZA |
dc.identifier.ris | TY - Master Thesis AU - Myburg, Marius Errol AB - CRM) will continue to gain prominence in the coming years. A commonly used CRM metric called Customer Lifetime Value (CLV) is the value a customer will contribute while they are an active customer. This study investigated the ability of supervised machine learning models constructed with XGBoost to predict future CLV, as well as the likelihood that a customer will drop to a lower CLV in the future. One approach to determining CLV, called the RFM method, isolates recency (R), frequency (F) and monetary (M) values. The models used these RFM variables, and the study also assessed whether including temporal, product, and other customer transaction information helped the XGBoost classifier make better predictions. The classification models were constructed by extracting each customer's RFM values and transaction information from a Fast-Moving Consumer Goods dataset. Different variations of CLV were calculated through one- and two-dimensional K-means clustering of the M (Monetary), F and M (Profitability), F and R (Loyalty), and R and M (Burgeoning) variables. Two additional CLV variations were determined by isolating the M tercile segments and by applying a commonly used weighted-RFM approach. To test the effectiveness of XGBoost in predicting future timeframes, the dataset was divided into three consecutive periods, where the first period formed the features used to predict the target CLV variables in the second and third periods. Models that predicted whether CLV dropped to a lower value from the first to the second period and from the first to the third period were also constructed. It was found that the XGBoost models were moderately to highly effective in classifying future CLV in both the second and third periods. The models also effectively predicted whether CLV would drop to a lower value in both future periods.
The ability to predict future CLV and CLV drop in the second period was only slightly better than the ability to predict the future CLV in the third period. Models constructed by adding temporal, product, and customer transaction information to the RFM values did not improve on those that used only the RFM values. These findings illustrate the effectiveness of XGBoost as a predictor of future CLV and CLV drop, and affirm the efficacy of utilising RFM values to determine future CLV. DA - 2023 DB - OpenUCT DP - University of Cape Town KW - computer science LK - https://open.uct.ac.za PY - 2023 T1 - Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost TI - Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost UR - http://hdl.handle.net/11427/38088 ER - | en_ZA |
dc.identifier.uri | http://hdl.handle.net/11427/38088 | |
dc.identifier.vancouvercitation | Myburg ME. Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost. Faculty of Science, Department of Computer Science, 2023 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/38088 | en_ZA |
dc.language.rfc3066 | eng | |
dc.publisher.department | Department of Computer Science | |
dc.publisher.faculty | Faculty of Science | |
dc.subject | computer science | |
dc.title | Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost | |
dc.type | Master Thesis | |
dc.type.qualificationlevel | Masters | |
dc.type.qualificationlevel | MSc | |
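
The abstract in this record describes extracting each customer's recency, frequency and monetary values from transaction data and segmenting customers by M tercile. The following is a minimal sketch of that idea in Python, not the thesis's own code. The function names, the `(customer_id, purchase_date, amount)` record layout, the sample transactions, and the rank-based tercile rule are all assumptions made for illustration; the thesis may define recency, the period boundary, or the tercile cut-offs differently.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, period_end):
    """Aggregate (customer_id, purchase_date, amount) records into RFM values.

    Recency is taken here as days between the last purchase and period_end
    (an assumed definition), frequency as the number of transactions, and
    monetary as the total amount spent.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in transactions:
        previous = last_purchase.get(customer_id)
        if previous is None or purchase_date > previous:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        cid: {
            "recency": (period_end - last_purchase[cid]).days,
            "frequency": frequency[cid],
            "monetary": monetary[cid],
        }
        for cid in last_purchase
    }


def monetary_terciles(rfm):
    """Label each customer 0 (lowest third of M), 1, or 2 (highest third).

    Customers are ranked by monetary value and split into three groups by
    rank; ties are broken by sort order (an assumed, simplified rule).
    """
    ranked = sorted(rfm, key=lambda cid: rfm[cid]["monetary"])
    n = len(ranked)
    return {cid: min(2, (3 * rank) // n) for rank, cid in enumerate(ranked)}


# Hypothetical transactions for a single period ending 2023-01-31.
sample = [
    ("A", date(2023, 1, 5), 10.0),
    ("A", date(2023, 1, 20), 15.0),
    ("B", date(2023, 1, 10), 100.0),
    ("C", date(2023, 1, 30), 5.0),
]
rfm = rfm_table(sample, period_end=date(2023, 1, 31))
segments = monetary_terciles(rfm)
```

In a setup like the one the abstract outlines, values such as `rfm` computed on the first period would serve as features, and a segment label computed the same way on a later period would serve as the classification target.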