Abstract
Federated learning (FL) has emerged as a promising approach to decentralized machine learning, enabling model training across many clients without sharing raw data. Nevertheless, privacy threats remain a significant challenge because of potential weaknesses in data aggregation, adversarial attacks, and communication schemes [1]. This article critically compares privacy-preserving methods applied in FL, including differential privacy, secure multi-party computation, homomorphic encryption, clustered sampling, and robust aggregation. By examining their efficacy, computational overhead, and the trade-off between model utility and privacy, this work identifies the key advantages and shortcomings of each approach. The paper also discusses open challenges and outlines future research directions for improving privacy in FL, especially in edge computing and 6G-enabled IoT settings.
Introduction
Federated Learning (FL) enables decentralized training of machine learning models without centralizing data, enhancing privacy but introducing risks such as privacy leakage, adversarial attacks, and exposure of model updates. To address these risks, several privacy-preserving techniques have been developed (a brief illustrative sketch of each follows the list):
Differential Privacy (DP): Adds noise to model updates to protect individual data, providing strong privacy but potentially reducing model accuracy.
Secure Multi-Party Computation (SMPC): Allows multiple parties to compute functions collaboratively without revealing inputs, ensuring confidentiality but at high computational and communication costs.
Homomorphic Encryption (HE): Enables computations on encrypted data without decryption, maintaining confidentiality but with significant computational overhead, limiting real-time use.
Clustered Sampling: Groups clients with similar data distributions into clusters and samples participants from each cluster during training, balancing privacy and accuracy; especially useful for non-IID data.
Robust Aggregation: Protects against poisoning and privacy attacks during model aggregation, enhancing security and data integrity.
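For concreteness, the following minimal sketch shows the core of a Gaussian-mechanism DP step as used in differentially private federated averaging: each client clips its update to bound sensitivity and adds calibrated noise before transmission. The clipping bound and noise multiplier are illustrative values, not parameters taken from the cited works.

```python
import numpy as np

def dp_sanitize_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip a client update to a fixed L2 norm, then add Gaussian noise.

    Clipping bounds each client's influence (the sensitivity of the
    aggregate); the noise standard deviation is scaled accordingly.
    """
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    noise = rng.normal(0.0, clip_norm * noise_multiplier, size=update.shape)
    return clipped + noise

# Example: sanitize one client's flattened gradient before upload.
raw_update = np.random.default_rng(0).normal(size=10)
private_update = dp_sanitize_update(raw_update)
```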
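The intuition behind SMPC-based secure aggregation can be illustrated with additive secret sharing: each client splits its fixed-point-encoded update into random shares that individually reveal nothing, and only the sum of all shares reconstructs the aggregate. The field modulus and scaling factor below are illustrative choices.

```python
import numpy as np

PRIME = 2**61 - 1   # field modulus (illustrative choice)
SCALE = 10**6       # fixed-point scaling for real-valued updates

def share(update, n_parties, rng):
    """Split a fixed-point-encoded update into n additive shares mod PRIME."""
    encoded = np.round(update * SCALE).astype(np.int64).astype(object) % PRIME
    shares = [rng.integers(0, PRIME, size=update.shape).astype(object)
              for _ in range(n_parties - 1)]
    shares.append((encoded - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Recombine shares; residues above PRIME // 2 encode negative values."""
    total = sum(shares) % PRIME
    signed = np.where(total > PRIME // 2, total - PRIME, total)
    return signed.astype(np.float64) / SCALE

rng = np.random.default_rng(0)
u1, u2 = np.array([0.5, -1.2]), np.array([0.1, 0.3])
s1, s2 = share(u1, 3, rng), share(u2, 3, rng)
# Each of the three servers adds the shares it holds; no server sees an input.
agg = [(a + b) % PRIME for a, b in zip(s1, s2)]
print(reconstruct(agg))   # approximately [0.6, -0.9], the summed updates
```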
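The additively homomorphic property that HE-based FL schemes exploit can be demonstrated with the Paillier cryptosystem. The sketch below assumes the third-party python-paillier package (phe), an illustrative choice rather than a library used in the cited works, and uses scalar updates for brevity.

```python
from phe import paillier  # third-party python-paillier package

# The server (or a trusted key authority) generates a keypair;
# clients receive only the public key.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Each client encrypts its model update (scalars here for brevity).
enc_u1 = public_key.encrypt(0.25)
enc_u2 = public_key.encrypt(-0.10)

# The aggregator adds ciphertexts without decrypting anything:
# Paillier is additively homomorphic, so enc_sum encrypts 0.25 + (-0.10).
enc_sum = enc_u1 + enc_u2

# Only the private-key holder recovers the aggregated update.
print(private_key.decrypt(enc_sum))   # approximately 0.15
```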
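One simple way to realize clustered sampling is to cluster clients by a proxy for their data distribution, such as per-client label histograms, and then draw participants for each round from every cluster. The sketch below assumes scikit-learn for clustering; the feature choice and cluster count are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Illustrative non-IID setting: 20 clients, 10 classes, skewed label
# histograms drawn from a Dirichlet distribution.
label_hist = rng.dirichlet(np.full(10, 0.3), size=20)

# Group clients whose data distributions look similar.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(label_hist)

def sample_round(cluster_labels, per_cluster=1):
    """Select clients for one FL round, drawing from every cluster so the
    cohort covers all data regimes instead of a biased subset."""
    chosen = []
    for c in np.unique(cluster_labels):
        members = np.flatnonzero(cluster_labels == c)
        chosen.extend(rng.choice(members, size=per_cluster, replace=False))
    return chosen

print(sample_round(kmeans.labels_))   # e.g. one client id per cluster
```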
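Robust aggregation can be illustrated with a coordinate-wise trimmed mean, a common robust replacement for plain federated averaging; the trim fraction below is an illustrative parameter.

```python
import numpy as np

def trimmed_mean_aggregate(updates, trim_frac=0.2):
    """Aggregate client updates with a coordinate-wise trimmed mean.

    For each coordinate, the largest and smallest trim_frac of client
    values are discarded before averaging, which bounds the influence
    any single (possibly poisoned) update has on the global model.
    """
    stacked = np.stack(updates)            # shape: (n_clients, dim)
    n = stacked.shape[0]
    k = int(trim_frac * n)
    ordered = np.sort(stacked, axis=0)     # sort each coordinate separately
    return ordered[k:n - k].mean(axis=0)   # drop k extremes on each side

# Nine honest updates near zero plus one poisoned outlier.
updates = [np.random.default_rng(i).normal(0, 0.1, size=5) for i in range(9)]
updates.append(np.full(5, 100.0))          # malicious update
print(trimmed_mean_aggregate(updates))     # the outlier is trimmed away
```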
Applications of privacy-preserving FL include healthcare, finance, IoT/edge computing, smart cities, cybersecurity, and retail, enabling collaborative model training while safeguarding sensitive data.
A comparative analysis highlights the trade-offs: DP offers moderate to high privacy at low computational cost but reduces accuracy; SMPC and HE provide high privacy without accuracy loss but are computationally expensive; clustered sampling and robust aggregation balance privacy, efficiency, and scalability in different ways.
Conclusion
Based on the comparative analysis, no single privacy-preserving method is superior in all cases, since each involves trade-offs:
1) For strong privacy: Homomorphic encryption and SMPC offer the strongest guarantees, but at the cost of high computational requirements and lower scalability.
2) For scalability and efficiency: Clustered sampling and differential privacy scale better while offering moderate privacy, but DP compromises accuracy.
3) For a compromise between security and performance: Robust aggregation mechanisms provide a good balance.
A hybrid solution blending multiple methods, such as differential privacy combined with robust aggregation (a sketch follows below), can be more effective depending on the application context. Future work should aim to lower computational cost while sustaining strong privacy guarantees, especially for real-time FL applications in edge computing and 6G-enabled IoT systems.
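As a rough illustration of such a hybrid, the sketch below combines DP-style clipping and noising on the client side with a robust coordinate-wise median on the server side; all parameter values are illustrative.

```python
import numpy as np

def client_step(update, clip_norm=1.0, sigma=0.5, rng=None):
    """Client side: clip the update and add Gaussian noise (differential privacy)."""
    rng = rng or np.random.default_rng()
    update = update * min(1.0, clip_norm / (np.linalg.norm(update) + 1e-12))
    return update + rng.normal(0.0, clip_norm * sigma, size=update.shape)

def server_step(updates):
    """Server side: robust coordinate-wise median instead of a plain mean."""
    return np.median(np.stack(updates), axis=0)

rng = np.random.default_rng(0)
noisy_updates = [client_step(rng.normal(size=5), rng=rng) for _ in range(10)]
global_update = server_step(noisy_updates)
```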
References
[1] S. Wu, J. Yin, J. Zhang, and Z. Wei, "Analyzing Federated Learning with Enhanced Privacy Preservation," IEEE Transactions on Artificial Intelligence, vol. 4, no. 5, pp. 890-902, May 2022.
[2] L. Yang, J. Yin, Y. Liu, J. Zhang, and Q. Yang, "Privacy-Preserving Federated Learning through Clustered Sampling on Fine-Tuning Distributed non-iid Large Language Models," IEEE Transactions on Neural Networks and Learning Systems, vol. 34, no. 3, pp. 1547-1559, March 2023.
[3] J. Liu, Y. Zhang, J. Yin, and Q. Yang, "Personalized Privacy-Preserving Federated Learning: Optimized Trade-off Between Utility and Privacy," IEEE Access, vol. 11, pp. 12043-12055, June 2023.
[4] S. Lin et al., "Federated Learning Security and Privacy-Preserving Algorithm and Application," IEEE Transactions on Information Forensics and Security, vol. 18, pp. 55-70, January 2023.
[5] M. Bennis, M. Pandhya, and P. M. Ruiz, "Enabling Privacy-Preserving Edge AI: Federated Learning and Beyond," IEEE Communications Surveys & Tutorials, vol. 26, no. 2, pp. 1433-1447, March 2024.
[6] M. Zhang, Z. Zhang, and Z. Xiong, "Privacy-Preserving Federated Learning via System Immersion and Homomorphic Encryption," IEEE Transactions on Signal Processing, vol. 71, pp. 1234-1247, April 2023.
[7] Y. Wu, Y. Lu, S. Li, and P. Yang, "Privacy-Preserving AI Framework for 6G-Enabled Consumer Internet of Things," IEEE Transactions on Wireless Communications, vol. 23, no. 6, pp. 1123-1136, June 2024.
[8] X. Fan, T. Zhang, and F. Yuan, "Privacy Preservation for Federated Learning with Robust Aggregation in Edge Computing," IEEE Transactions on Cloud Computing, vol. 11, no. 7, pp. 2900-2912, July 2023.