The rapid growth of electronic transactions has been accompanied by a comparable rise in credit card fraud (CCF), which poses serious financial and security risks for individuals and banks alike. Rule-based anti-fraud solutions often fail to recognize sophisticated fraud patterns, which motivates the adoption of advanced machine learning (ML) algorithms. This review covers a broad spectrum of supervised and unsupervised learning methods, including AdaBoost and other ensemble methods, and assesses their capacity to identify fraudulent transactions. Key evaluation metrics, including F1-score and AUC, are discussed in detail using benchmark datasets such as the European Credit Card Dataset. The paper concentrates on the difficulties of fraud detection in real-world use cases, such as overlapping multi-modal class distributions, extremely imbalanced class data, model interpretability, and real-time processing requirements. Ensemble and stacking models were found to achieve the highest accuracy and reliability, while real-time distributed processing and explainable AI remain central to wider adoption. Deep learning, federated learning, and hybrid blockchain technologies are expected to improve security, transparency, and scalability. In conclusion, this review discusses the prospects of ML for building stable, intelligent, and adaptive CCF detection systems.
1. Introduction
The rise of internet-based transactions has improved payment convenience but significantly increased fraud risks.
Machine learning (ML) and data analytics are now central to identifying fraudulent behavior through pattern recognition and anomaly detection.
Models like Random Forest, Logistic Regression, XGBoost, and CatBoost have demonstrated strong performance in fraud detection, particularly when enhanced with feature selection and ensemble techniques.
2. Challenges in Fraud Detection
Class imbalance is a major issue: fraud cases are extremely rare (below 0.2% of transactions in common benchmark datasets), which biases model training toward the legitimate class.
Models like Credit Card Outlier Detection (CCOD) and Cluster-Based Local Outlier Factor (CBLOF), as well as Isolation Forest, have been effective in detecting such rare anomalies.
Resampling methods such as SMOTE and ADASYN (oversampling) and SMOTE-ENN (oversampling combined with nearest-neighbour cleaning) help balance the datasets.
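To make the oversampling idea concrete, the following is a minimal sketch of SMOTE-style interpolation using only the Python standard library. It is a simplified illustration rather than the implementation used in any of the reviewed studies: the function name `smote_sketch`, its parameters, and the sample values are assumptions introduced here.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Simplified illustration of the SMOTE idea; not a substitute for a
    tested library implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature fraud samples (values invented for illustration).
fraud = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (1.1, 2.1)]
new_points = smote_sketch(fraud, n_new=6)
print(len(new_points), "synthetic fraud samples generated")
```

Because every synthetic point lies on a segment between two existing fraud samples, the method adds no genuinely new information; it only densifies the region the minority class already occupies.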
3. ML Models & Techniques
Ensemble learning (e.g., XGBoost, CatBoost, AdaBoost, Extra Trees) remains the most effective approach due to high precision, recall, and F1-scores.
CatBoost often outperforms others in handling categorical data and shows strong calibration.
Feature selection methods such as Pearson Correlation, Information Gain, and Random Forest Importance improve classification accuracy.
Neural Networks, Autoencoders, and Extreme Learning Machines (ELM) are increasingly used for complex, non-linear fraud patterns.
Hybrid and stacking ensembles combining SMOTE-ENN with models like SVM, KNN, and TOPSIS achieve near-perfect performance in some studies.
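Of the feature selection methods mentioned in this section, Pearson correlation is the simplest to illustrate. The sketch below ranks features by the absolute correlation between each feature column and the fraud label. The feature names and values are hypothetical, and the helper names `pearson` and `rank_features` are assumptions made for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_features(columns, labels):
    """Sort feature names by absolute correlation with the label, highest first."""
    scores = {name: abs(pearson(vals, labels)) for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: six transactions, label 1 = fraud.
columns = {
    "amount": [12.0, 15.0, 900.0, 14.0, 1100.0, 11.0],
    "hour": [9.0, 14.0, 3.0, 16.0, 2.0, 11.0],
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
}
labels = [0, 0, 1, 0, 1, 0]

ranking = rank_features(columns, labels)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A filter of this kind only captures linear relationships, which is one reason the reviewed studies combine it with Information Gain or tree-based importance scores.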
4. Explainability and Interpretability
Explainable AI (XAI) techniques like SHAP are used to provide transparency into model decisions.
Features such as transaction value and merchant category were identified as key fraud indicators.
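SHAP attributions are based on Shapley values. As a rough sketch of the underlying idea, the code below computes exact Shapley values by enumerating feature subsets for a small, hypothetical scoring function. It does not use the SHAP library, and practical SHAP explainers rely on background data and approximations rather than this brute-force enumeration. The function `fraud_score`, its coefficients, and the feature names `amount`, `merchant_risk`, and `night` are assumptions made purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values; features absent from a subset take baseline values."""
    names = list(instance)
    n = len(names)
    values = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_feature = {
                    f: instance[f] if f in subset else baseline[f] for f in names
                }
                with_feature = dict(without_feature)
                with_feature[name] = instance[name]
                total += weight * (predict(with_feature) - predict(without_feature))
        values[name] = total
    return values


def fraud_score(t):
    """Hypothetical score: linear terms plus an amount-by-night interaction."""
    return 0.001 * t["amount"] + 0.5 * t["merchant_risk"] + 0.0005 * t["amount"] * t["night"]


# Hypothetical transaction and reference ("typical") transaction.
instance = {"amount": 900.0, "merchant_risk": 0.8, "night": 1.0}
baseline = {"amount": 50.0, "merchant_risk": 0.1, "night": 0.0}

contributions = shapley_values(fraud_score, instance, baseline)
for name, value in contributions.items():
    print(f"{name}: {value:+.4f}")
```

The contributions sum to the difference between the score of the explained transaction and the score of the baseline, which is the property that makes such attributions suitable for explaining individual decisions.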
5. Architecture and Datasets
The standard ML pipeline includes:
Data Acquisition
Preprocessing (normalization, class balancing)
Model Training and Testing
Evaluation (metrics like F1-score, ROC-AUC, MCC)
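The evaluation step can be sketched from confusion-matrix counts alone. The example below computes precision, recall, F1-score, accuracy, and MCC for a hypothetical set of predictions; ROC-AUC is omitted because it requires predicted scores rather than hard labels. The function names and the toy labels are assumptions introduced for this illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def evaluate(y_true, y_pred):
    """Compute common fraud-detection metrics from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(y_true)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical test set: 95 legitimate and 5 fraudulent transactions.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 94 + [1] + [1, 1, 1, 0, 0]

results = evaluate(y_true, y_pred)
for metric, value in results.items():
    print(f"{metric}: {value:.3f}")
```

In this toy case accuracy stays high even though two of the five fraud cases are missed, which is why F1-score and MCC are usually preferred on imbalanced data.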
The European Credit Card Dataset (2013) is a benchmark dataset with 284,807 transactions and only 492 fraud cases (0.17%).
XGBoost and Extra Trees, especially with oversampling (e.g., SMOTE), show excellent results on such datasets.
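A short calculation using the dataset counts quoted above shows why accuracy alone is misleading on this benchmark. The "always predict legitimate" classifier is a hypothetical baseline introduced here for comparison, not a model from the reviewed studies.

```python
# Counts for the European Credit Card Dataset as quoted in the text.
total_transactions = 284_807
fraud_cases = 492

fraud_rate = fraud_cases / total_transactions

# Hypothetical baseline that labels every transaction as legitimate.
baseline_accuracy = (total_transactions - fraud_cases) / total_transactions
baseline_recall = 0 / fraud_cases  # it never flags a single fraud case

print(f"fraud rate:        {fraud_rate:.4%}")
print(f"baseline accuracy: {baseline_accuracy:.4%}")
print(f"baseline recall:   {baseline_recall:.0%}")
```

A model that detects no fraud at all would still score above 99.8% accuracy here, so reported accuracies of 0.99 need to be read alongside recall, F1-score, or MCC.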
6. Results Overview
| Model | Precision | Accuracy | F1-Score | Recall | AUC |
|---|---|---|---|---|---|
| XGBoost | 0.99 | 0.87 | 0.93 | 0.88 | – |
| Extra Trees | 0.93 | 0.99 | 0.89 | 0.86 | 0.98 |
| CBLOF | 0.97 | 0.97 | 0.97 | 0.97 | 0.489 |
| Logistic Regression | 0.96 | 0.96 | 0.96 | 0.96 | – |
| Neural Network (MLP) | 0.99 | 0.99 | 0.99 | 0.98 | – |
| Random Forest | – | 0.99 | 0.99 | – | – |
| Stacking Ensemble | 0.99 | 0.99 | 0.99 | 0.99 | 1.00 |
| SMOTE + ENN Ensemble | 0.99 | 0.99 | 1.00 | 0.99 | – |
7. Technological Advancements
IoT-based banking systems and cloud infrastructure are increasingly integrated into fraud detection platforms to support real-time processing.
Future directions include:
Use of CNNs, RNNs, and Transformer-based models
GANs and Autoencoders for detecting unseen fraud
Privacy-preserving collaboration between institutions
Blockchain for transparency and secure identity verification
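Privacy-preserving collaboration between institutions is commonly discussed in terms of federated learning, in which banks share model parameters rather than raw transactions. The sketch below shows the weighted parameter averaging step used in federated averaging. It is an assumption about how such collaboration could be realized, not a method taken from the reviewed studies, and the bank weights and sample counts are invented for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Average model parameters across clients, weighted by local sample count."""
    if len(client_weights) != len(client_sizes):
        raise ValueError("one sample count is required per client")
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * size for w, size in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical parameter vectors trained locally at three banks.
bank_weights = [
    [0.20, -1.10, 0.05],
    [0.30, -0.90, 0.10],
    [0.10, -1.30, 0.00],
]
bank_sizes = [50_000, 120_000, 30_000]  # local training set sizes (invented)

global_weights = federated_average(bank_weights, bank_sizes)
print(global_weights)
```

Only the parameter vectors leave each institution in this scheme; the transaction records themselves stay local.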
8. Challenges and Limitations
Overfitting remains a risk, especially with oversampling methods or small datasets.
Interpretability is a concern with complex models like CatBoost and XGBoost, especially in regulated sectors.
Some models lack generalization to new, unseen fraud patterns.
Balancing methods like SMOTE can introduce noise or redundant samples if not carefully tuned.
Conclusion
ML algorithms and ensemble models have brought CCF detection to high accuracy and recall. Overfitting, class imbalance, and lack of interpretability remain the primary concerns. Deep learning, big-data frameworks such as PySpark, and Explainable AI require further research, particularly with respect to explainability. Federated Learning and blockchain-based authentication can improve security, while applying these solutions to areas such as insurance and health can facilitate effective cross-domain fraud prevention.