Enhancing Customer Subscription Prediction in Bank Telemarketing Using Deep Learning and Ensemble Model

Authors: Dinesh Kumar Katakam

DOI Link: https://doi.org/10.22214/ijraset.2025.73563

Abstract

Predicting customer subscriptions is a crucial task in bank telemarketing campaigns that aim to improve customer acquisition, reduce operating expenses, and optimize marketing strategies. Ensemble machine learning methods such as bagging, boosting, and stacking are widely used for this classification problem, with stacking reaching a reported accuracy of 91.88%. Although these ensemble methods perform well, they often lack interpretability and struggle to capture the temporal dependencies and nonlinear interactions inherent in customer interaction data. To address these limitations, this study evaluates two deep learning models, the Multilayer Perceptron (MLP) and the Recurrent Neural Network (RNN), for predicting customer subscription outcomes. In a direct comparison, the RNN outperforms the MLP on every metric considered, with 94.55% accuracy, 89.54% precision, 98.95% recall, and a 94.01% F1-score, while the MLP scores slightly lower across the board. The RNN's advantage can be attributed to its ability to capture sequential patterns and complex dependencies within the customer interaction data. These findings highlight the potential of RNN-based architectures to improve the predictive capability of telemarketing systems, offering a more robust and scalable solution for customer targeting and campaign optimization.

1. Introduction

Banks increasingly use telemarketing to promote financial products, but its success depends on accurately identifying likely subscribers.
Traditional ML models (e.g., logistic regression, decision trees, SVMs) are interpretable but lack the ability to learn complex or sequential patterns in customer behavior.
Ensemble methods like bagging, boosting, and stacking have improved prediction accuracy but still struggle with sequential data and temporal patterns.
Deep learning, particularly Recurrent Neural Networks (RNNs), provides a promising alternative by capturing feature interactions and temporal dependencies that are key in understanding customer behavior over time.

2. Related Work

ML and ensemble models such as stacking, often combined with SMOTE to handle class imbalance, have been effective but are limited by:
- Lack of sequential learning
- Low interpretability
MLPs (Multilayer Perceptrons) improve prediction on structured data but do not model feature order or time-based interactions.
RNNs, traditionally used for sequential data like text and audio, are now being adapted to tabular banking data by simulating pseudo-sequences.

3. Proposed Method

Dataset: bank-additional.csv (real-world telemarketing data).
Preprocessing:
- Label encoding for categoricals, standardization, and binary target transformation.
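The preprocessing steps above can be sketched with NumPy as follows. This is a minimal illustration, not the paper's implementation: the toy rows, the column names, and the helper functions `label_encode` and `standardize` are assumptions made for the example, and the rows are not taken from bank-additional.csv.

```python
import numpy as np

# Hypothetical toy rows standing in for bank-additional.csv (not real data).
rows = [
    {"age": 30, "job": "admin.", "duration": 120, "y": "no"},
    {"age": 45, "job": "technician", "duration": 300, "y": "yes"},
    {"age": 52, "job": "admin.", "duration": 60, "y": "no"},
    {"age": 38, "job": "services", "duration": 480, "y": "yes"},
]

def label_encode(values):
    """Map each distinct category to an integer code (sorted order)."""
    classes = sorted(set(values))
    lookup = {c: i for i, c in enumerate(classes)}
    return np.array([lookup[v] for v in values], dtype=float), classes

def standardize(column):
    """Rescale a numeric column to zero mean and unit variance."""
    std = column.std()
    return (column - column.mean()) / (std if std > 0 else 1.0)

# Label-encode the categorical column, then standardize every column.
job_codes, job_classes = label_encode([r["job"] for r in rows])
features = np.column_stack([
    standardize(np.array([r["age"] for r in rows], dtype=float)),
    standardize(job_codes),
    standardize(np.array([r["duration"] for r in rows], dtype=float)),
])

# Binary target transformation: "yes" -> 1, "no" -> 0.
target = np.array([1 if r["y"] == "yes" else 0 for r in rows])
```

In practice the encoder classes and the mean and standard deviation would be fitted on the training split only and reused on the test split, so that no test information leaks into preprocessing.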
RNN Architecture:
- Inputs treated as pseudo-sequences (each feature = one time step).
- Uses Simple RNN layers, LeakyReLU, dropout (0.3), sigmoid output.
- Compiled with binary cross-entropy loss and Adam optimizer.
- EarlyStopping and ReduceLROnPlateau used for training stability.
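To make the pseudo-sequence idea concrete, the sketch below runs a forward pass of a simple recurrent network in NumPy, feeding each feature as one time step. It is an illustration under stated assumptions rather than the author's model: the weights are random, the layer sizes (8 recurrent units, 4 dense units), the placement of LeakyReLU after a dense layer, and the tanh recurrence are all assumed, and the training setup (Adam, EarlyStopping, ReduceLROnPlateau) is not shown.

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rnn_forward(x, params):
    """x has shape (batch, n_features); each feature becomes one time step."""
    seq = x[:, :, None]                              # (batch, steps, 1)
    batch, steps, _ = seq.shape
    h = np.zeros((batch, params["W_h"].shape[0]))    # initial hidden state
    for t in range(steps):
        # Simple recurrence: h_t = tanh(x_t W_x + h_{t-1} W_h + b)
        h = np.tanh(seq[:, t, :] @ params["W_x"] + h @ params["W_h"] + params["b"])
    z = leaky_relu(h @ params["W_d"] + params["b_d"])  # dense layer + LeakyReLU
    # Dropout (0.3) is active only during training, so it is skipped here.
    return sigmoid(z @ params["W_o"] + params["b_o"]).ravel()

def binary_cross_entropy(y_true, p, eps=1e-7):
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

rng = np.random.default_rng(0)
hidden, dense = 8, 4  # assumed layer sizes, not taken from the paper
params = {
    "W_x": rng.normal(scale=0.5, size=(1, hidden)),
    "W_h": rng.normal(scale=0.5, size=(hidden, hidden)),
    "b": np.zeros(hidden),
    "W_d": rng.normal(scale=0.5, size=(hidden, dense)),
    "b_d": np.zeros(dense),
    "W_o": rng.normal(scale=0.5, size=(dense, 1)),
    "b_o": np.zeros(1),
}

x = rng.normal(size=(5, 20))       # 5 customers, 20 standardized features
y = np.array([0, 1, 0, 0, 1])      # made-up labels
probs = rnn_forward(x, params)     # one subscription probability per customer
loss = binary_cross_entropy(y, probs)
```

Because the hidden state is updated feature by feature, the order of the columns in `x` changes the output, which is why the feature ordering matters for this kind of model.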
Feature ordering strategies include:
- Mutual information ranking
- Semantic grouping
- Ensemble of RNNs with different feature orders to enhance robustness.
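One way to picture two of these strategies is the sketch below, which ranks features by mutual information with the target and averages the probabilities of several models that each see its own column order. The functions `mutual_information`, `order_by_mutual_information`, and `ensemble_predict` are hypothetical helpers, the data is a toy example with discrete features, and the "models" are stand-in callables rather than trained RNNs. Continuous features would need binning or a different estimator.

```python
import numpy as np

def mutual_information(feature, target):
    """Mutual information (in nats) between two discrete arrays."""
    mi = 0.0
    for f in np.unique(feature):
        for t in np.unique(target):
            p_ft = np.mean((feature == f) & (target == t))
            if p_ft > 0:
                p_f = np.mean(feature == f)
                p_t = np.mean(target == t)
                mi += p_ft * np.log(p_ft / (p_f * p_t))
    return mi

def order_by_mutual_information(X, y):
    """Column indices sorted from most to least informative about y."""
    scores = np.array([mutual_information(X[:, j], y) for j in range(X.shape[1])])
    return np.argsort(-scores, kind="stable"), scores

def ensemble_predict(models, orders, X):
    """Average the probabilities of models that each use their own column order."""
    probs = [model(X[:, order]) for model, order in zip(models, orders)]
    return np.mean(probs, axis=0)

# Toy data: column 0 copies the target, column 1 is constant,
# column 2 agrees with the target on 6 of 8 rows.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
X = np.column_stack([
    y,
    np.zeros(8, dtype=int),
    np.array([0, 0, 0, 1, 0, 1, 1, 1]),
])
order, scores = order_by_mutual_information(X, y)  # order is [0, 2, 1]

# Stand-in "models": any callable mapping a feature matrix to probabilities.
model_a = lambda Z: 0.2 + 0.6 * Z[:, 0]      # trained on the MI ordering
model_b = lambda Z: np.full(len(Z), 0.5)     # trained on the original ordering
avg = ensemble_predict([model_a, model_b], [order, np.arange(3)], X)
```

Averaging over several orderings reduces the dependence of the final prediction on any single, somewhat arbitrary, arrangement of the features.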

4. Experimental Results

Models compared:

Deep Learning: RNN, MLP
Ensemble ML: Random Forest, Bagging, Gradient Boosting, AdaBoost, Stacking

Model	Accuracy	Precision	Recall	F1 Score
RNN	94.55%	89.54%	98.95%	94.01%
MLP	93.64%	88.47%	98.07%	93.02%
Random Forest	90.65%	63.15%	39.13%	48.32%
Bagging	91.01%	61.84%	51.09%	55.95%
Gradient Boosting	90.17%	57.75%	44.56%	53.03%
AdaBoost	89.08%	56.90%	35.86%	44.00%
Stacking	90.53%	60.60%	43.48%	50.63%
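For reference, the four reported metrics follow from the confusion-matrix counts as sketched below. The counts passed to `classification_metrics` are made up for illustration; only the precision and recall passed to `f1_from` come from the RNN row of the table.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = f1_from(precision, recall)
    return accuracy, precision, recall, f1

def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Hypothetical counts: 40 true positives, 10 false positives,
# 20 false negatives, 130 true negatives.
acc, prec, rec, f1 = classification_metrics(tp=40, fp=10, fn=20, tn=130)
# acc = 0.85, prec = 0.8, rec = 40/60, f1 = 80/110

# Harmonic mean of the RNN row's precision and recall (in percent).
rnn_f1 = f1_from(89.54, 98.95)   # about 94.01
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, which is why the ensemble rows show high accuracy alongside modest F1 scores.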

Key Insights:

RNN outperformed all models, especially in recall (true positive rate) and F1-score, critical for identifying actual subscribers in imbalanced datasets.
MLPs were effective but lacked sequential modeling capacity.
Ensemble models performed well in accuracy but struggled in recall and minority class detection.
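The gap between accuracy and recall on imbalanced data can be illustrated with a small worked example. The class split below (110 subscribers among 1,000 customers) and both sets of predictions are hypothetical and are not drawn from the study's dataset.

```python
def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall on the positive class)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    positives = sum(y_true)
    recall = true_pos / positives if positives else 0.0
    return correct / len(y_true), recall

# Hypothetical imbalanced set: 110 subscribers (1) and 890 non-subscribers (0).
y_true = [1] * 110 + [0] * 890

# Baseline that predicts "no" for everyone.
always_no = [0] * 1000
baseline_acc, baseline_rec = accuracy_and_recall(y_true, always_no)
# baseline_acc = 0.89, baseline_rec = 0.0

# A model that finds 40 of the 110 subscribers and raises 20 false alarms.
partial = [1] * 40 + [0] * 70 + [1] * 20 + [0] * 870
partial_acc, partial_rec = accuracy_and_recall(y_true, partial)
# partial_acc = 0.91, partial_rec = 40/110 (about 0.36)
```

Both classifiers score around 90% accuracy, yet one finds no subscribers and the other misses most of them, so recall and F1 are the more informative metrics for this task.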

5. Conclusion

This study compared two deep learning architectures, the Multilayer Perceptron (MLP) and the Recurrent Neural Network (RNN), with classical ensemble methods (Random Forest, Bagging, Gradient Boosting, AdaBoost, and Stacking) for predicting customer subscription in bank telemarketing campaigns. Each model was evaluated on a real-world dataset using accuracy, precision, recall, and F1-score, with the results given in Table I. The ensemble models achieve competitive accuracy but often fail to identify subscribing customers, as their lower recall and F1-scores show. The RNN performed best on every metric, with the largest gains in recall and overall classification balance. Its sequential modeling capability enabled it to capture complex and latent patterns in customer behavior even though the input data was originally tabular. The MLP also outperformed all ensemble methods, but its inability to model sequential dependencies limited its performance relative to the RNN.

This comparison supports integrating deep learning, especially sequence-aware architectures such as RNNs, into predictive marketing systems. By exploiting temporal or contextual relationships between features, these models can give more accurate and balanced predictions, which are crucial for optimizing telemarketing efforts and customer engagement strategies.

Future work could extend the study in several directions. First, more advanced recurrent models such as Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU) could be explored for further gains. Second, explainability techniques such as SHAP and LIME could be applied to interpret model predictions and provide transparency to marketing teams. Third, additional temporal data, such as historical customer call logs, campaign durations, or transaction sequences, could improve the network's ability to model sequential dependencies. Finally, real-time deployment of the RNN model in active marketing systems would provide feedback on its operational effectiveness and adaptability in dynamic environments.

Copyright

Copyright © 2025 Dinesh Kumar Katakam. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET73563

Publish Date : 2025-08-05

ISSN : 2321-9653

Publisher Name : IJRASET
