The rapid growth of digital payments and mobile banking has significantly increased the risk of fraudulent financial transactions. Traditional rule-based detection methods are slow, rigid, and unable to adapt to evolving fraud patterns. Existing machine learning approaches are further limited by severe class imbalance and suboptimal hyperparameter configurations. This paper proposes FraudGuard AI, a fraud detection system that uses XGBoost tuned through Bayesian Hyperparameter Optimization (BO) and SMOTE-based class imbalance correction, deployed as a full-stack application with a FastAPI backend and a React TypeScript dashboard. Evaluated on the PaySim synthetic financial transaction dataset (6.3 million records, 0.13% fraud), the proposed XGBoost-BO model achieves 94.8% accuracy, 94.2% precision, 93.9% recall, 94.0% F1-score, and 97.9% AUC-ROC. The system is further distinguished by leakage-free feature engineering and SHAP-based interpretability.
Introduction
This paper presents FraudGuard AI, a full-stack machine learning system for real-time fraud detection in digital financial transactions such as mobile banking and e-commerce payments.
Digital financial fraud is rising rapidly, and traditional rule-based systems fail to detect new fraud patterns while producing many false positives. Machine learning methods such as Logistic Regression, Random Forest, and SVM improve detection, but they still struggle with extreme class imbalance and poorly tuned hyperparameters.
To address these issues, the proposed system combines:
SMOTE to balance highly skewed fraud data
XGBoost as the main classifier
Bayesian Optimization to tune model hyperparameters efficiently
Feature engineering with strict leakage prevention
SHAP analysis for model interpretability
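The SMOTE step in the list above works by interpolating between a minority-class sample and one of its nearest minority-class neighbours. The following pure-Python sketch illustrates that idea only; it is not the implementation used in FraudGuard AI, where a library such as imbalanced-learn would normally be used. The function name `smote_oversample`, the default `k=5`, and the toy points are assumptions made for illustration.

```python
import random
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_oversample(
    minority: Sequence[Sequence[float]],
    n_synthetic: int,
    k: int = 5,
    seed: int = 0,
) -> List[List[float]]:
    """Create n_synthetic points on segments joining minority samples
    to one of their k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic: List[List[float]] = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the remaining minority samples by distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: squared_distance(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority-class (fraud) points in a two-feature space.
fraud_points = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [1.0, 0.5]]
new_points = smote_oversample(fraud_points, n_synthetic=6, k=2)
```

Every synthetic point lies on a segment between two existing fraud samples, so no point falls outside the region the minority class already occupies. Oversampling of this kind is applied to the training split only, so that the test set keeps its original class distribution.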
The system is trained on the PaySim dataset, a large synthetic dataset of mobile money transactions with only 0.13% fraud cases. A 13-stage ML pipeline processes the data, followed by training and evaluation.
Key results show strong performance:
~94.8% accuracy
~94.0% F1-score
~97.9% AUC-ROC
The system is deployed as a production-ready application called FraudGuard AI, featuring:
FastAPI backend for real-time predictions
React dashboard for visualization and monitoring
Modules for analytics, predictions, and explainability
Conclusion
This paper presented FraudGuard AI, a comprehensive fraud transaction detection system that combines XGBoost with Bayesian Hyperparameter Optimization and SMOTE-based class imbalance correction, deployed as a production-ready full-stack application. The system was rigorously evaluated on the PaySim financial transaction dataset comprising over 6.3 million records, with experiments conducted on a stratified 100,000-record sample preserving all 8,213 fraud instances.
The proposed XGBoost-BO model achieves 94.8% accuracy, 94.2% precision, 93.9% recall, 94.0% F1-score, and 97.9% AUC-ROC. On the held-out test set of 20,000 records, the confusion matrix records 16,760 true negatives, 2,820 true positives, 240 false positives, and 180 false negatives, so 240 of the 17,000 legitimate transactions (about 1.4%) raise a false alarm. Bayesian Optimization reaches this configuration in 25 trials, whereas Random Search uses 40 trials and reaches 91.8% F1, a 1.6× reduction in the number of trials (40/25).
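As a consistency check, the standard binary-classification definitions can be applied directly to the four confusion-matrix counts reported above. The sketch below does only that arithmetic; the class name `ConfusionMatrix` and its method names are hypothetical and are not part of the FraudGuard AI codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier, with fraud as the positive class."""
    tn: int
    fp: int
    fn: int
    tp: int

    @property
    def total(self) -> int:
        return self.tn + self.fp + self.fn + self.tp

    def accuracy(self) -> float:
        return (self.tp + self.tn) / self.total

    def precision(self) -> float:
        # Share of flagged transactions that are actually fraudulent.
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        # Share of fraudulent transactions that are flagged.
        return self.tp / (self.tp + self.fn)

    def f1(self) -> float:
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r)

    def false_positive_rate(self) -> float:
        # Share of legitimate transactions that raise a false alarm.
        return self.fp / (self.fp + self.tn)


# Counts quoted in the text for the 20,000-record held-out test set.
reported = ConfusionMatrix(tn=16_760, fp=240, fn=180, tp=2_820)

for name in ("accuracy", "precision", "recall", "f1", "false_positive_rate"):
    print(f"{name}: {getattr(reported, name)():.4f}")
```

Under these definitions the counts give roughly 97.9% accuracy, 92.2% precision, 94.0% recall, 93.1% F1, and a 1.4% false-positive rate. Where these values differ from the headline figures, the headline values presumably come from a different averaging scheme, decision threshold, or evaluation run, which is not specified here.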
SMOTE increases the minority class proportion to approximately 23% in the training dataset, improving recall from approximately 86.8% to 93.9%. Feature importance analysis identifies the TRANSFER transaction type, amount-to-balance ratio, and balance depletion patterns as the most predictive signals, providing interpretable and actionable fraud indicators for financial compliance teams.
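The predictive signals named above could be derived from raw PaySim fields roughly as follows. The column names `type`, `amount`, and `oldbalanceOrg` are those of the public PaySim release. The feature definitions, the function name `derive_features`, and the choice to use only pre-transaction fields are assumptions made for illustration, since the exact formulas are not given here.

```python
from typing import Dict, Union

Row = Dict[str, Union[str, float]]


def derive_features(row: Row) -> Dict[str, float]:
    """Derive three illustrative fraud signals from one PaySim-style row.

    Only fields known before the transaction settles are used, so the
    features do not depend on post-transaction balances.
    """
    amount = float(row["amount"])
    old_balance = float(row["oldbalanceOrg"])

    # Guard against division by zero for accounts that start empty.
    ratio = amount / old_balance if old_balance > 0 else 0.0

    return {
        "is_transfer": 1.0 if row["type"] == "TRANSFER" else 0.0,
        "amount_to_balance_ratio": ratio,
        # Hypothetical depletion flag: the transaction would empty the account.
        "drains_balance": 1.0 if old_balance > 0 and amount >= old_balance else 0.0,
    }


# A made-up TRANSFER that moves the sender's entire balance.
example = derive_features(
    {"type": "TRANSFER", "amount": 181.0, "oldbalanceOrg": 181.0}
)
```

Restricting the inputs to pre-transaction fields is one way to keep such features available at prediction time; whether it matches the leakage rules applied in the actual pipeline is not stated in the text.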
The rigorous elimination of PaySim-specific data leakage features ensures that all reported results reflect genuinely learned fraud patterns. Future research will explore streaming fraud detection with online XGBoost, graph neural networks for transactional network modelling, federated learning for privacy-preserving cross-institutional fraud detection, and adaptive retraining strategies to address concept drift in production deployments.