Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost

dc.contributor.advisor: Berman, Sonia
dc.contributor.author: Myburg, Marius Errol
dc.date.accessioned: 2023-07-12T10:20:29Z
dc.date.available: 2023-07-12T10:20:29Z
dc.date.issued: 2023
dc.date.updated: 2023-07-12T10:16:39Z
dc.description.abstract: CRM) will continue to gain prominence in the coming years. A commonly used CRM metric called Customer Lifetime Value (CLV) is the value a customer will contribute while they remain an active customer. This study investigated the ability of supervised machine learning models constructed with XGBoost to predict future CLV, as well as the likelihood that a customer will drop to a lower CLV in the future. One approach to determining CLV, the RFM method, isolates recency (R), frequency (F) and monetary (M) values. The models used these RFM variables, and the study also assessed whether including temporal, product, and other customer transaction information helped the XGBoost classifier make better predictions. The classification models were constructed by extracting each customer's RFM values and transaction information from a Fast-Moving Consumer Goods dataset. Different variations of CLV were calculated through one- and two-dimensional K-means clustering of the M (Monetary), F and M (Profitability), F and R (Loyalty), and R and M (Burgeoning) variables. Two additional CLV variations were determined by isolating the M tercile segments and by applying a commonly used weighted-RFM approach. To test the effectiveness of XGBoost in predicting future timeframes, the dataset was divided into three consecutive periods, where the first period supplied the features used to predict the target CLV variables in the second and third periods. Models that predicted whether CLV dropped to a lower value from the first to the second period and from the first to the third period were also constructed. The XGBoost models were found to be moderately to highly effective in classifying future CLV in both the second and third periods. The models also effectively predicted whether CLV would drop to a lower value in both future periods. The ability to predict future CLV and CLV drop in the second period was only slightly better than the ability to predict future CLV in the third period. Models that added temporal, product, and customer transaction information to the RFM values did not improve on those that used only the RFM values. These findings illustrate the effectiveness of XGBoost as a predictor of future CLV and CLV drop, and affirm the efficacy of using RFM values to determine future CLV.
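As a rough illustration of the RFM extraction and M tercile segmentation described in the abstract, the following is a minimal sketch using only the Python standard library. It is not the thesis's implementation: the function names, the `(customer_id, purchase_date, amount)` tuple layout, and the toy transactions are all hypothetical, and the tercile split here is a simple rank-based cut.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, period_end):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions is an iterable of (customer_id, purchase_date, amount)
    tuples; this layout is an assumption made for the sketch.
    """
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, day, amount in transactions:
        # Track the most recent purchase date per customer.
        if customer_id not in last_seen or day > last_seen[customer_id]:
            last_seen[customer_id] = day
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        c: ((period_end - last_seen[c]).days, frequency[c], monetary[c])
        for c in last_seen
    }


def monetary_terciles(rfm):
    """Label each customer 0 (low), 1 (mid) or 2 (high) by rank of M."""
    ranked = sorted(rfm, key=lambda c: rfm[c][2])
    n = len(ranked)
    return {c: min(2, (3 * i) // n) for i, c in enumerate(ranked)}


# Hypothetical toy data for one period.
transactions = [
    ("a", date(2023, 1, 5), 20.0),
    ("a", date(2023, 1, 20), 35.0),
    ("b", date(2023, 1, 10), 5.0),
    ("c", date(2023, 1, 28), 120.0),
]
rfm = rfm_table(transactions, period_end=date(2023, 1, 31))
segments = monetary_terciles(rfm)
```

Under the setup the abstract describes, RFM values computed on the first period would serve as features, and a segment label computed on a later period would serve as the classification target.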
dc.identifier.apacitation: Myburg, M. E. (2023). Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost. Faculty of Science, Department of Computer Science. Retrieved from http://hdl.handle.net/11427/38088
dc.identifier.chicagocitation: Myburg, Marius Errol. "Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost." Faculty of Science, Department of Computer Science, 2023. http://hdl.handle.net/11427/38088
dc.identifier.citation: Myburg, M.E. 2023. Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost. Faculty of Science, Department of Computer Science. http://hdl.handle.net/11427/38088
dc.identifier.ris:
TY - Master Thesis
AU - Myburg, Marius Errol
DA - 2023
DB - OpenUCT
DP - University of Cape Town
KW - computer science
LK - https://open.uct.ac.za
PY - 2023
T1 - Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost
TI - Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost
UR - http://hdl.handle.net/11427/38088
ER -
dc.identifier.uri: http://hdl.handle.net/11427/38088
dc.identifier.vancouvercitation: Myburg ME. Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost. Faculty of Science, Department of Computer Science, 2023 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/38088
dc.language.rfc3066: eng
dc.publisher.department: Department of Computer Science
dc.publisher.faculty: Faculty of Science
dc.subject: computer science
dc.title: Using recency, frequency and monetary variables to predict customer lifetime value with XGBoost
dc.type: Master Thesis
dc.type.qualificationlevel: Masters
dc.type.qualificationlevel: MSc
Files
Original bundle
Name: thesis_sci_2023_myburg marius errol.pdf
Size: 3.47 MB
Format: Adobe Portable Document Format