Federated Learning (FL) is a recent distributed machine learning paradigm that allows multiple clients to collaboratively train a global model without sharing their raw local data. By decentralizing the training process, FL reduces privacy risks and eases regulatory constraints, which makes large-scale learning practical across mobile, IoT, healthcare, and financial systems. This paper reviews the foundations of FL, its architectures, optimization methods, security mechanisms, and application domains, along with its key challenges, in particular non-IID data, communication cost, and system heterogeneity. Recent advances and open research directions are also outlined.
Introduction
Federated Learning (FL) is a distributed machine learning approach that enables multiple clients to collaboratively train a shared global model without sharing their private data. Unlike traditional centralized learning, FL keeps data on local devices and only exchanges model updates, making it highly suitable for privacy-sensitive and resource-constrained environments such as mobile devices, healthcare systems, IoT networks, and financial applications.
The FL process operates in repeated communication rounds between a central server and participating clients. The server initializes a global model, selects clients, and sends the model for local training. Clients train the model on their private data and send back updates, which are aggregated (commonly using methods like FedAvg) to refine the global model. This iterative process continues until convergence.
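The round structure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the toy linear model, the function names (`local_train`, `fedavg`, `run_round`), the learning rate, and the synthetic client data are all assumptions made for the example. Only model weights move between the server and the clients; the `(x, y)` pairs never leave the client lists.

```python
import random

def local_train(weights, data, lr=0.1, epochs=5):
    """Client side: a few epochs of gradient descent on private (x, y) pairs.

    The model is a toy linear regressor y = w * x + b (an assumption for brevity).
    """
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x / len(data)
            grad_b += 2 * err / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]

def fedavg(client_weights, client_sizes):
    """Server side: average client models, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(cw[i] * n for cw, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

def run_round(global_weights, clients, fraction=1.0, rng=random):
    """One communication round: select clients, train locally, aggregate."""
    k = max(1, int(fraction * len(clients)))
    selected = rng.sample(clients, k)
    updates = [local_train(list(global_weights), data) for data in selected]
    sizes = [len(data) for data in selected]
    return fedavg(updates, sizes)

# Hypothetical setup: three clients hold private samples of y = 2x + 1.
rng = random.Random(0)
clients = [
    [(x, 2 * x + 1) for x in (rng.uniform(-1, 1) for _ in range(n))]
    for n in (20, 50, 30)
]
weights = [0.0, 0.0]  # server initializes the global model
for _ in range(100):  # repeat rounds until (approximate) convergence
    weights = run_round(weights, clients, fraction=1.0, rng=rng)
print(weights)
```

Weighting each update by its sample count keeps a client with 50 samples from being outvoted by one with 20; setting `fraction` below 1.0 would model partial client participation per round.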
FL can be categorized into horizontal FL (same feature space, different data samples), vertical FL (different features but shared users), and federated transfer learning (different feature and sample spaces). It is also deployed in cross-device settings (e.g., smartphones and IoT devices) and cross-silo settings (e.g., hospitals and institutions).
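The difference between horizontal and vertical partitioning is easiest to see on a small table. The sketch below is illustrative only: the records, the column names, and the helpers `horizontal_split` and `vertical_split` are hypothetical. In federated transfer learning, the parties would overlap only slightly in both rows and columns, a case not shown here.

```python
# Toy table: one row per user, one column per feature (made-up values).
records = [
    {"user_id": 1, "age": 34, "income": 52000, "purchases": 7},
    {"user_id": 2, "age": 51, "income": 61000, "purchases": 2},
    {"user_id": 3, "age": 27, "income": 38000, "purchases": 9},
    {"user_id": 4, "age": 45, "income": 75000, "purchases": 4},
]

def horizontal_split(rows, n_parties):
    """Horizontal FL: each party holds all columns but a disjoint set of rows."""
    return [rows[i::n_parties] for i in range(n_parties)]

def vertical_split(rows, feature_groups, key="user_id"):
    """Vertical FL: each party holds all rows (aligned by key) but different columns."""
    return [
        [{key: r[key], **{f: r[f] for f in group}} for r in rows]
        for group in feature_groups
    ]

# Two parties with the same schema and different users.
for party in horizontal_split(records, 2):
    print([r["user_id"] for r in party], sorted(party[0]))

# Two parties with the same users and different features.
for party in vertical_split(records, [["age"], ["income", "purchases"]]):
    print([r["user_id"] for r in party], sorted(party[0]))
```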
Despite its advantages, FL faces several key challenges. Data heterogeneity (non-IID data) can reduce model accuracy and slow convergence, because local data distributions differ from client to client. System heterogeneity across devices causes delays, because a synchronous round must wait for its slowest clients. Communication overhead is another major issue, since large sets of model parameters are transmitted frequently. FL also remains vulnerable to privacy and security threats such as model inversion, poisoning attacks, and membership inference, even though raw data is never shared. Finally, imbalance in data quality among clients can degrade global model performance.
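To make the non-IID problem concrete, the sketch below simulates label skew with a sort-and-shard partition, one commonly used scheme in FL experiments. It is an assumption chosen for illustration, not a method prescribed by this review. The function name `shard_partition`, the shard count, and the synthetic labels are hypothetical.

```python
import random
from collections import Counter

def shard_partition(labels, n_clients, shards_per_client=2, seed=0):
    """Assign sample indices to clients so that each client sees only a few labels.

    Indices are sorted by label, cut into equal shards, and the shards are
    dealt out at random. Samples left over after the cut are dropped.
    """
    rng = random.Random(seed)
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    n_shards = n_clients * shards_per_client
    shard_size = len(order) // n_shards
    shards = [order[s * shard_size:(s + 1) * shard_size] for s in range(n_shards)]
    rng.shuffle(shards)
    return [
        [i for shard in shards[c * shards_per_client:(c + 1) * shards_per_client]
         for i in shard]
        for c in range(n_clients)
    ]

# Synthetic dataset: 1000 samples, 10 balanced classes.
labels = [i % 10 for i in range(1000)]
parts = shard_partition(labels, n_clients=5)
for client_id, idx in enumerate(parts):
    # Per-client label histogram: each client ends up with very few classes.
    print(client_id, sorted(Counter(labels[i] for i in idx).items()))
```

Each client's local optimum is then pulled toward the few classes it holds, which is the kind of distribution mismatch that can slow or destabilize aggregation.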
To address these issues, several optimization techniques and algorithmic improvements have been developed. FedAvg serves as the baseline method, while extensions such as FedProx, SCAFFOLD, FedNova, and FedDyn improve convergence under heterogeneous conditions. Other strategies include momentum-based training, adaptive client selection, and variance reduction techniques, all aimed at improving stability and efficiency in non-IID environments.
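As one example of how these extensions modify the baseline, FedProx adds a proximal term to each client's objective, so that a client minimizes F_k(w) + (mu / 2) * ||w - w_global||^2 instead of F_k(w) alone. The sketch below shows that change on a toy linear model. The function name `fedprox_local_train`, the model, and the values of `mu` and `lr` are assumptions made for illustration.

```python
def fedprox_local_train(global_weights, data, mu=0.1, lr=0.1, epochs=5):
    """Local gradient descent on F_k(w) + (mu / 2) * ||w - w_global||^2.

    F_k is the mean squared error of a toy model y = w * x + b (an assumption).
    With mu = 0 this reduces to plain local training as used in FedAvg.
    """
    gw, gb = global_weights  # fixed anchor: the model received from the server
    w, b = gw, gb
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x / len(data)
            grad_b += 2 * err / len(data)
        # Gradient of the proximal term pulls the local model back to the anchor.
        grad_w += mu * (w - gw)
        grad_b += mu * (b - gb)
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]

# Hypothetical client whose local data follows y = 2x + 1.
data = [(1.0, 3.0), (2.0, 5.0)]
print(fedprox_local_train([0.0, 0.0], data, mu=0.0))  # unconstrained local drift
print(fedprox_local_train([0.0, 0.0], data, mu=1.0))  # drift limited by the proximal term
```

A larger `mu` keeps local models closer to the global one, which limits client drift on heterogeneous data at the cost of slower local progress.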
Conclusion
Federated Learning provides a compelling framework for privacy-preserving distributed computation across diverse and decentralized data sources. Although optimization, privacy, personalization, and communication reduction have all seen progress, challenges remain in handling non-IID data, system heterogeneity, fairness, and real-world scalability. Continued research and systems innovation should make FL more reliable and more widely adopted in industries such as healthcare, finance, communications, and smart cities.