This project aims to predict and classify customer lifetime value (CLV) from the historical transaction data of an e-business recorded between December 2015 and December 2017. The work begins with exploratory data analysis, followed by RFM analysis [2], to build a foundational understanding of customer behavior and to group customers by their purchase patterns. CLV is then approximated through two-step modeling: the BG/NBD model estimates how many purchases each customer is expected to make, and the Gamma-Gamma model estimates each customer's average transaction value. Using the output of the two models, the project analyzes customer value in depth, identifies high-value customers, and classifies customers into four groups. These insights can inform market targeting strategies that support customer retention and revenue growth. The project thereby illustrates the potential of predictive analytics to improve decision-making and business performance.
Introduction
Understanding Customer Lifetime Value (CLV) is essential for firms aiming for sustainable growth by identifying the most valuable customers and predicting their future behavior. CLV helps businesses allocate resources efficiently, target high-value customers, and enhance profitability, especially in non-contractual industries where customers can leave at any time.
This research applies probabilistic models—primarily the BG/NBD model for forecasting customer transaction frequency and the Gamma-Gamma model for estimating average transaction value—to predict CLV. These models require minimal data features and are well-suited for non-contractual settings. The study also incorporates RFM (Recency, Frequency, Monetary) analysis and K-Means clustering to segment customers and tailor marketing strategies accordingly.
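To illustrate the minimal data features these models rely on, the following is a sketch of how a transaction log could be reduced to one summary row per customer. It assumes a common convention for BG/NBD and Gamma-Gamma inputs (frequency counts repeat purchase days only, recency is the time between first and last purchase, T is the customer's age at the end of the observation window); the function name `rfm_summary`, the tuple layout, and the sample data are hypothetical and not taken from the project. Note that recency in this convention differs from the "days since last purchase" recency used in RFM scoring.

```python
from collections import defaultdict
from datetime import date


def rfm_summary(transactions, observation_end):
    """Reduce (customer_id, purchase_date, amount) rows to model inputs.

    Assumed convention (not confirmed by the source):
      frequency      - number of repeat purchase days (first day excluded)
      recency        - days between first and last purchase
      T              - days between first purchase and observation_end
      monetary_value - mean daily spend over repeat purchase days
    """
    # Sum spend per customer per day so same-day orders count as one purchase.
    daily = defaultdict(lambda: defaultdict(float))
    for customer_id, day, amount in transactions:
        daily[customer_id][day] += amount

    summary = {}
    for customer_id, by_day in daily.items():
        days = sorted(by_day)
        first, last = days[0], days[-1]
        repeat_days = days[1:]
        frequency = len(repeat_days)
        if frequency:
            monetary = sum(by_day[d] for d in repeat_days) / frequency
        else:
            monetary = 0.0  # no repeat purchases observed
        summary[customer_id] = {
            "frequency": frequency,
            "recency": (last - first).days,
            "T": (observation_end - first).days,
            "monetary_value": monetary,
        }
    return summary


# Hypothetical sample data for illustration only.
sample = [
    ("A", date(2016, 1, 1), 20.0),
    ("A", date(2016, 1, 11), 30.0),
    ("A", date(2016, 1, 31), 50.0),
    ("B", date(2016, 1, 15), 10.0),
]
sample_summary = rfm_summary(sample, observation_end=date(2016, 3, 1))
```

A summary of this shape is what a BG/NBD fit (frequency, recency, T) and a Gamma-Gamma fit (frequency, monetary value) would typically consume; customers with no repeat purchases are usually excluded from the Gamma-Gamma step.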
The literature review covers various CLV estimation methods, including tree-based methods (such as XGBoost), neural networks, and probabilistic models, highlighting their strengths and limitations. BG/NBD is favored for its simplicity and accuracy, especially compared to more data-heavy machine learning approaches.
Methodologically, the project involves thorough data cleaning, preprocessing, RFM scoring, application of BG/NBD and Gamma-Gamma models for CLV prediction, and customer segmentation based on predicted values. Model performance is validated using metrics such as RMSE and tested against real-world data patterns to ensure practical applicability.
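As a sketch of the RMSE validation step, the function below compares predicted values with observed values, for example predicted versus actual purchase counts in a holdout period. The holdout setup and the sample numbers are assumptions for illustration; the source does not specify which quantities were compared.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical per-customer purchase counts in a holdout period.
predicted_purchases = [2.0, 0.5, 3.0]
observed_purchases = [1, 0, 5]
holdout_rmse = rmse(predicted_purchases, observed_purchases)
```

A lower value indicates that predictions track observed behavior more closely; because RMSE is in the units of the compared quantity, it is most useful when read against the typical size of that quantity.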
In summary, combining these analytical techniques enables businesses to develop targeted retention and acquisition strategies, driving long-term profitability and competitive advantage.
Conclusion
This study applies a hybrid approach, combining data analysis with predictive modeling, to the task of CLV forecasting. The project began with a thorough analysis of the transactional data, in which data quality issues were addressed, buying patterns were identified, and RFM analysis was conducted. This groundwork made it possible to construct the predictive models [6],[9].
We applied the BG/NBD model alongside the Gamma-Gamma model to estimate CLV. The BG/NBD model estimated each customer's purchase frequency and the likelihood that the customer would stop making purchases, while the Gamma-Gamma model estimated the average revenue per transaction generated by each customer. By combining these models, we were able to accurately predict the future value of each customer and categorize customers into value segments [6],[7].
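The combination step can be sketched as follows. This is a simplified illustration, not the project's implementation: it assumes CLV is the product of expected purchases (from BG/NBD), expected average transaction value (from Gamma-Gamma), and an optional profit margin, with no discounting, and it assumes the four segments are formed by ranking customers into quartiles. The function names, labels, and sample values are hypothetical.

```python
def combine_clv(expected_purchases, expected_avg_value, margin=1.0):
    """Undiscounted CLV: expected purchases x expected value x margin."""
    return expected_purchases * expected_avg_value * margin


def segment_into_four(clv_by_customer, labels=("D", "C", "B", "A")):
    """Assign quartile labels by ascending CLV rank; 'A' is the top quartile."""
    ranked = sorted(clv_by_customer, key=clv_by_customer.get)
    n = len(ranked)
    segments = {}
    for rank, customer_id in enumerate(ranked):
        # rank * 4 // n maps ranks 0..n-1 onto quartile indices 0..3.
        segments[customer_id] = labels[rank * 4 // n]
    return segments


# Hypothetical model outputs: (expected purchases, expected average value).
model_outputs = {
    "c1": (2.0, 50.0),
    "c2": (0.5, 20.0),
    "c3": (4.0, 30.0),
    "c4": (1.0, 15.0),
}
clv = {cid: combine_clv(n, v) for cid, (n, v) in model_outputs.items()}
segments = segment_into_four(clv)
```

Quartile ranking is only one way to obtain four groups; fixed value thresholds or a clustering method would produce different boundaries.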
Accurate CLV estimates and customer segmentation have concrete business implications. The company can identify its most valuable customers to target and channel its remaining resources into more precise marketing. It can offer loyalty programs and customized promotions to its best customers while formulating retention strategies for the customers most likely to leave. This data-driven approach improves marketing efficiency while supporting long-term profitability and growth [8],[10].
References
[1] L. Vrana, L. Sperkova, M. Kobulsky, P. Jasek, and Z. Smutny, “A Comparative Study of Probabilistic Models for Customer Lifetime Value in Online Retail,” Journal of Business Economics and Management, vol. 20, no. 3, pp. 403–413, 2018.
[2] S. Chen, “Data Mining: Estimating Customer Lifetime Value with Machine Learning Methods,” pp. 26–32, Apr. 2018.
[3] I. Hanif, “Enhancing Customer Churn Prediction Through the XGBoost Classifier,” Media & Digital Department of Telkom, Indonesia, Aug. 2019.
[4] C. J. Cheng, C. B. Cheng, J. Y. Wu, and S. W. Chiu, “Predicting Customer Lifetime Value Using a Markov Chain-Based Data Mining Model: Insights from an Auto Repair and Maintenance Company in Taiwan,” Scientia Iranica, pp. 850–855, Nov. 2011.
[5] D. Garcia, J. Desirena, G. Desirena, and I. Moreno, “Maximizing Customer Lifetime Value Through Stacked Neural Networks: A Case Study in the Insurance Sector,” pp. 3–8, Jul. 2019.
[6] P. Fader, B. Hardie, and K. Lee, “Simplified Methods for Estimating Customer Lifetime Value: An Alternative to the Pareto/NBD Model,” Marketing Science, vol. 24, no. 2, pp. 275–284, 2005.
[7] D. C. Schmittlein, D. G. Morrison, and R. Colombo, “Who Are Your Customers and What Will They Do Next? Insights from Counting Customer Behavior,” Management Science, vol. 33, no. 1, pp. 1–24, 1987.
[8] H. Casteran, L. Meyer-Waarden, and W. Reinartz, “Customer Lifetime Value, Retention, and Churn Modeling,” in Handbook of Market Research, C. Homburg, M. Klarmann, and A. Vomberg, Eds. Springer, Cham, pp. 14–41, Apr. 2017.
[9] M. Mzoughia and M. Limam, “Advancing the BG/NBD Model for Purchasing Behavior Analysis Using the Com-Poisson Distribution,” International Journal of Modeling and Optimization, vol. 4, no. 2, pp. 141–145, 2014.
[10] M. Khajvand, K. Zolfaghar, S. Ashoori, and S. Alizadeh, “RFM Analysis of Customer Purchase Behavior: Estimating Customer Lifetime Value - A Case Study,” Procedia Computer Science, vol. 3, pp. 57–63, 2011, doi: 10.1016/j.procs.2010.12.011.