Credit card fraud remains a major challenge for the financial industry and causes substantial losses worldwide. Traditional machine learning techniques usually depend on centralized data aggregation, which raises concerns about inter-institutional data sharing, user privacy, and regulatory compliance. To overcome these constraints, this paper presents a privacy-aware framework that uses Federated Learning (FL) to identify fraudulent transactions. In this setup, several simulated financial organizations (clients) train models locally on their own private datasets without disclosing raw data. To address the class imbalance common in fraud detection, each client employs a Random Forest classifier in combination with the Synthetic Minority Over-sampling Technique (SMOTE). The federated system is built with the Flower framework and evaluated over multiple communication rounds. According to the results, the FL-based strategy preserves data privacy while achieving accuracy on par with centralized methods that use models such as Random Forest, Decision Tree, and Logistic Regression. The study highlights the feasibility of federated learning for secure and scalable fraud detection in distributed settings.
Introduction
Credit card fraud is a growing threat due to the surge in digital financial transactions. Traditional fraud detection systems are increasingly inadequate because of class imbalance (fraud is rare) and data siloing among financial institutions. Additionally, privacy regulations like the GDPR complicate centralized data sharing.
To address these issues, the study proposes a Federated Learning (FL)-based system, which enables multiple institutions to collaboratively train machine learning models without sharing raw data, thereby preserving privacy and complying with legal requirements.
Key Contributions
1. Federated Learning Architecture
Involves multiple clients (e.g., banks) training local models on private data.
Clients share only encrypted model updates (weights/gradients) with a central aggregator.
Reduces privacy risks, maintains data locality, and supports scalable deployment in sensitive financial environments.
2. Algorithm and Framework
Random Forest is used as the base model due to its robustness in handling high-dimensional and noisy data.
SMOTE (Synthetic Minority Over-sampling Technique) is applied locally to each client’s data to address the class imbalance problem and improve detection of fraudulent transactions.
The implementation uses the Flower framework, which is suitable for real-world distributed systems.
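To make the local oversampling step concrete, the following is a minimal sketch of the idea behind SMOTE: a synthetic minority sample is placed at a random point on the line segment between a real minority sample and one of its nearest minority neighbours. It is a simplified illustration using only the Python standard library, not the implementation used in the study; the function name `smote_sketch`, the toy data, and the parameter values are assumptions made for this example.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by interpolating between
    a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical local data for one client: 3 fraud rows, 9 legitimate rows.
fraud = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
legit_count = 9

# Oversample until the two classes are the same size on this client.
new_fraud = smote_sketch(fraud, n_new=legit_count - len(fraud), k=2)
balanced_fraud = fraud + new_fraud
print(len(balanced_fraud), "fraud rows vs", legit_count, "legitimate rows")
```

Because each client runs this step on its own data, no synthetic or real transactions need to leave the institution. In practice a maintained library implementation would normally be preferred over a hand-written one.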
Results
The FL-based approach achieves accuracy comparable to traditional centralized models while offering stronger privacy and data security.
Evaluated using key metrics: accuracy, precision, recall, and F1-score.
Demonstrates practical viability for privacy-sensitive environments like digital banking.
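For reference, the four metrics named above can all be derived from confusion-matrix counts. The sketch below shows the standard definitions; the counts are hypothetical and are not results from the study, and the function name `classification_metrics` is an assumption made for this example.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, not results from the study:
# 80 frauds caught, 20 missed, 10 false alarms, 9,890 legitimate passed.
m = classification_metrics(tp=80, fp=10, fn=20, tn=9890)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative counts, accuracy is 0.9970 even though one in five frauds is missed (recall 0.8000), which is why precision, recall, and F1-score matter more than accuracy on imbalanced fraud data.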
Related Work
Various machine learning methods like Random Forest, XGBoost, SVM, KNN, and Naive Bayes have been used for fraud detection.
Ensemble and hybrid models (e.g., AdaBoost, MLPs, stacked models) often perform better on imbalanced datasets.
Oversampling techniques like SMOTE, ADASYN, and more advanced methods (e.g., GANs, RMDD) improve sensitivity to rare fraud cases.
Federated Learning has been recognized as a key innovation in privacy-preserving AI, with developments like FedAvg, FedGAT, and hybrid supervised-unsupervised models gaining traction.
Interpretability remains a challenge, especially in regulated industries.
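As a point of reference for the FedAvg algorithm mentioned above, the following sketch shows its core aggregation step: the server averages client parameter vectors, weighting each client by its number of training samples. This is a generic illustration for numeric parameters only. The source does not state how Random Forest models are aggregated, so this should not be read as the study's aggregation method; the function name `fedavg` and all values are assumptions made for this example.

```python
def fedavg(client_params, client_sizes):
    """Weighted average of client parameter vectors, weighted by sample count."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(p[i] * n for p, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical parameter vectors reported by three clients after local training.
params = [[0.2, 1.0], [0.4, 0.0], [0.6, 2.0]]
# Hypothetical number of local training samples per client.
sizes = [100, 300, 600]

global_params = fedavg(params, sizes)
print(global_params)
```

Here the third client holds 60% of the samples, so the aggregated vector (approximately [0.5, 1.3]) sits closer to its parameters than a plain unweighted mean would.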
Dataset
Uses the creditcard.csv dataset (Kaggle standard), with 284,807 transactions (492 fraudulent).
Features include Time, Amount, and 28 anonymized PCA components (V1–V28).
Highly imbalanced dataset (~0.17% fraud), making it ideal for evaluating fraud detection techniques.
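The degree of imbalance can be checked directly from the counts above. The short calculation below also shows why accuracy alone is misleading here: a trivial baseline that labels every transaction as legitimate scores very high accuracy while catching no fraud. The baseline is a thought experiment for illustration, not a model from the study.

```python
total = 284_807   # transactions in the dataset
fraud = 492       # fraudulent transactions
legit = total - fraud

fraud_rate = fraud / total
print(f"fraud rate: {fraud_rate:.4%}")

# Trivial baseline: predict "legitimate" for every transaction.
baseline_accuracy = legit / total   # every legitimate row is correct
baseline_recall = 0 / fraud         # no fraudulent row is ever flagged
print(f"baseline accuracy: {baseline_accuracy:.4%}, recall: {baseline_recall:.0%}")
```

The fraud rate comes out at about 0.17%, and the do-nothing baseline reaches about 99.83% accuracy with 0% recall, so any useful detector has to be judged on how many of the 492 fraud cases it actually finds.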
Conclusion
Using the Random Forest algorithm, this study compares centralized and federated learning strategies for detecting credit card fraud. Because centralized learning had access to the complete dataset, it could capture a wider range of transaction behavior, which contributed to its somewhat higher accuracy. Federated learning, on the other hand, maintained data privacy by keeping transaction data local to each client. The federated approach yielded results comparable to the centralized model even though it worked with decentralized data that was not independent and identically distributed (non-IID). By correcting class imbalance and raising the sensitivity of fraud detection, SMOTE substantially improved the effectiveness of both strategies.
Federated learning shows considerable promise for real-world fraud detection. To better model intricate transaction patterns, future studies can investigate the integration of more sophisticated neural network designs such as CNNs, DNNs, and transformers. Federated strategies such as FedProx and FedAvgM can be used to handle variability in client data. Furthermore, security can be improved by implementing strong privacy-preserving technologies such as secure multi-party computation (SMPC), homomorphic encryption, and differential privacy. Real-time data handling and cooperation between financial institutions may also aid the creation of a standardized, scalable, and privacy-conscious framework for fraud detection.