The early detection of mental health disorders, particularly depression and anxiety, is essential for timely intervention and improved patient outcomes. Recent advances in Artificial Intelligence (AI) have enabled the development of systems capable of identifying psychological distress through multimodal data, including text, audio, and facial expressions. However, conventional AI models typically rely on centralized data collection, which poses significant risks to user privacy and data security, especially in healthcare applications involving sensitive personal information.
This paper proposes a privacy-preserving framework for mental health prediction based on Federated Learning (FL). This decentralized training approach eliminates the need to transmit raw data to central servers. The proposed architecture integrates additional privacy-enhancing techniques such as differential privacy and secure aggregation to further safeguard user information. By enabling collaborative learning across distributed devices while maintaining data confidentiality, this framework offers a scalable and ethical solution for building AI systems in mental health care. The paper presents the system architecture, implementation strategy, and potential for real-world deployment.
1. Introduction
Mental health disorders like depression and anxiety are on the rise, but early detection is hindered by stigma, limited access, and privacy concerns.
AI models can detect mental health signals from text, speech, and facial expressions, but traditional centralized systems risk data privacy breaches.
This paper proposes a Federated Learning (FL) approach in which models are trained on local user devices, avoiding raw data transfer and enhancing privacy.
2. Applications
The proposed framework can be applied to:
Mental health chatbots
Telepsychiatry platforms
mHealth apps for mood tracking
Workplace wellness tools
Ethical clinical research using anonymized, federated data
3. Contribution
Introduces a federated, multimodal AI system for mental health prediction.
Combines text, audio, and video data for improved accuracy.
Ensures privacy via differential privacy, encryption, and on-device training (see the sketch after this list).
Offers scalability and compliance with data protection laws (e.g., GDPR, DPDP).
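To make the differential-privacy safeguard above concrete, the following is a minimal sketch of how a client could clip and noise its model update before transmission. The function name, the NumPy representation of the update, and the clip_norm and noise_multiplier values are illustrative assumptions rather than the paper's exact settings; in the full framework the noisy update would additionally pass through secure aggregation.

import numpy as np

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1):
    # Bound the client's contribution by clipping the update to an L2 norm of clip_norm,
    # then mask it with Gaussian noise calibrated to that bound (DP-SGD-style).
    flat = np.concatenate([w.ravel() for w in update])
    scale = min(1.0, clip_norm / (np.linalg.norm(flat) + 1e-12))
    sigma = noise_multiplier * clip_norm
    return [w * scale + np.random.normal(0.0, sigma, w.shape) for w in update]

Only the returned noisy update would leave the device; the raw data and the un-noised weights never do.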
4. Research Questions
How can FL be used for mental health prediction while preserving privacy?
What are the benefits of decentralized AI in sensitive domains?
Can multimodal inputs be used securely and effectively?
What privacy safeguards are essential?
How does FL compare to centralized models in performance?
5. Literature Review
AI in Mental Health: Successful use of ML on social media, voice, and video to detect symptoms.
Multimodal Learning: Combines data types for higher accuracy but raises privacy concerns.
Federated Learning (FL): Promising for privacy but underused in mental health.
Research Gap: Lack of comprehensive systems combining FL, multimodal data, and robust privacy protections.
6. Methodology
Dataset: DAIC-WOZ (includes audio, transcripts, facial expressions, and depression labels).
Preprocessing: Text cleaned and lemmatized; emotional features extracted from the audio and video streams (see the preprocessing sketch after this list).
Model Architecture:
Text: LSTM or transformer
Audio/Video: CNN/RNN
Fusion layer for the final prediction (see the model sketch after this list)
FL Setup:
Local training on each client, followed by global model aggregation using FedAvg (see the FedAvg sketch after this list)
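A minimal sketch of the preprocessing step above, assuming NLTK for text cleaning and lemmatization and librosa for audio features; mean MFCC vectors stand in here for the emotional audio features, and the facial-expression features are assumed to be available as precomputed vectors rather than extracted from raw video.

import re
import numpy as np
import librosa                                  # assumed audio-feature library
from nltk.stem import WordNetLemmatizer         # requires the 'wordnet' corpus
from nltk.tokenize import word_tokenize         # requires the 'punkt' tokenizer

lemmatizer = WordNetLemmatizer()

def preprocess_text(transcript):
    # Lowercase, drop non-alphabetic characters, tokenize, and lemmatize.
    cleaned = re.sub(r"[^a-z\s]", " ", transcript.lower())
    return [lemmatizer.lemmatize(tok) for tok in word_tokenize(cleaned)]

def extract_audio_features(wav_path, n_mfcc=40):
    # Summarize a clip as mean MFCCs, a simple stand-in for emotional audio features.
    signal, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)                    # fixed-length (n_mfcc,) vector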
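A minimal PyTorch sketch of the multimodal architecture above. The class name, layer sizes, and feature dimensions are illustrative assumptions; for brevity the audio and video branches are small feed-forward encoders over pre-extracted feature vectors (the design above calls for CNN/RNN encoders), and the text branch uses an LSTM rather than a transformer.

import torch
import torch.nn as nn

class MultimodalDepressionNet(nn.Module):
    # Per-modality encoders feeding a shared fusion layer, as outlined above.
    def __init__(self, vocab_size=10000, emb_dim=128, audio_dim=40, video_dim=35, hidden=64):
        super().__init__()
        # Text branch: embedding + LSTM over the transcript tokens.
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.text_rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
        # Audio/video branches: encoders over pre-extracted feature vectors.
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, hidden), nn.ReLU())
        self.video_enc = nn.Sequential(nn.Linear(video_dim, hidden), nn.ReLU())
        # Fusion layer: concatenate the modality encodings and predict depression risk.
        self.fusion = nn.Sequential(nn.Linear(3 * hidden, hidden), nn.ReLU(),
                                    nn.Linear(hidden, 1))

    def forward(self, token_ids, audio_feats, video_feats):
        _, (h_text, _) = self.text_rnn(self.embed(token_ids))
        fused = torch.cat([h_text[-1], self.audio_enc(audio_feats),
                           self.video_enc(video_feats)], dim=1)
        return torch.sigmoid(self.fusion(fused)).squeeze(1)   # probability of depression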
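A minimal sketch of the FedAvg setup above, simulating one federated round in a single process. Names such as fedavg_round and the optimizer settings are illustrative assumptions; in deployment each client would run its part of this loop on-device, and only the (optionally privatized) weights would reach the aggregator.

import copy
import torch

def fedavg_round(global_model, client_loaders, lr=0.01, local_epochs=1):
    # One federated round: every client fine-tunes a copy of the global model on its
    # own data, then the server averages the resulting weights (FedAvg).
    client_states, client_sizes = [], []
    loss_fn = torch.nn.BCELoss()
    for loader in client_loaders:
        local = copy.deepcopy(global_model)
        opt = torch.optim.SGD(local.parameters(), lr=lr)
        local.train()
        for _ in range(local_epochs):
            for tokens, audio, video, labels in loader:
                opt.zero_grad()
                loss = loss_fn(local(tokens, audio, video), labels.float())
                loss.backward()
                opt.step()
        client_states.append(local.state_dict())
        client_sizes.append(len(loader.dataset))
    # Weight each client's parameters by its share of the total training data.
    total = float(sum(client_sizes))
    avg_state = {key: sum(state[key] * (n / total)
                          for state, n in zip(client_states, client_sizes))
                 for key in client_states[0]}
    global_model.load_state_dict(avg_state)
    return global_model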
7. Results
Accuracy & Privacy: The federated model achieved high predictive performance while preserving strong privacy protection.
Comparison:
Centralized AI: Higher accuracy but weak privacy
Encrypted ML: Moderate accuracy and moderate privacy
Proposed FL Model: High accuracy and high privacy
Trade-off: Slightly lower accuracy than centralized AI, but significantly improved user trust and ethical alignment.
8. Model Testing
Environment: Simulation of 20 clients over 50 FL rounds on standard laptops (see the harness sketch after this list).
Functional Testing: Confirmed successful local training and global aggregation.
Performance: Roughly 3.5 s of local training per client, with accuracy stabilizing by round 40.
Security Testing: No raw data left the client devices; model updates were encrypted in transit; the system remained resilient to the attacks tested.
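For reference, a minimal sketch of a simulation harness matching the test environment above (20 clients, 50 rounds). run_simulation, eval_fn, and the reuse of fedavg_round from the methodology sketch are assumptions for illustration; because the clients run sequentially in simulation, the reported per-client time is only an approximation of on-device cost.

import time

NUM_CLIENTS, NUM_ROUNDS = 20, 50                 # matches the simulated environment

def run_simulation(global_model, client_loaders, eval_fn):
    # Drive the FL rounds, logging per-client training time and global accuracy.
    history = []
    for rnd in range(1, NUM_ROUNDS + 1):
        start = time.perf_counter()
        global_model = fedavg_round(global_model, client_loaders)
        per_client = (time.perf_counter() - start) / NUM_CLIENTS
        accuracy = eval_fn(global_model)         # evaluation on a held-out set
        history.append((rnd, per_client, accuracy))
        print(f"round {rnd:02d}: {per_client:.2f} s/client, accuracy {accuracy:.3f}")
    return history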
Conclusion
In this study, we introduced a privacy-preserving system for predicting mental health conditions such as depression. The system relies on federated learning, which trains the model directly on users' own devices, so raw personal data never needs to be sent to a central server.
The framework handles multiple types of data, including text, voice, and facial expressions. Each device processes its own data and trains the model locally; only the resulting model updates, not the data itself, are transmitted securely to a central aggregator that builds the improved global model.
In our tests, the system detected mental health issues with high accuracy. Compared with conventional approaches that collect all data in one place, it achieved nearly the same performance while providing far stronger privacy.
In summary, our approach shows that AI can be used to detect mental health conditions without compromising personal privacy, offering a safe and practical way to support mental health care with modern technology.