Abstract
Stress and mental health issues are increasingly common among office workers and students, and they often worsen because emotional changes go unrecognized and untracked. To address this, this project proposes a Facial Expression Recognition System for the early detection of stress and mental health disorders, delivering intelligent, real-time emotional analysis using modern deep learning and web technologies. The proposed system accepts live webcam streams, single image uploads, and video files. A pre-trained facial emotion recognition model built on PyTorch classifies seven emotions (angry, disgust, fear, happy, sad, surprise, and neutral), and MTCNN face detection is combined with a Haar Cascade fallback for robust performance across varied conditions. An advanced stress analysis algorithm monitors emotion patterns over a 30-second rolling window to compute stress levels and detect trends, while a session management module maintains user profiles, device tracking, and emotion history. The application is implemented with React 18.2 and Vite on the frontend and Flask with Socket.IO on the backend, ensuring a responsive interface and efficient real-time communication; a database stores user profiles, session details, and emotion history. The system also supports multilingual interaction and concurrent multi-device streaming, making it accessible to users from different regions and deployable at scale. By providing accessible emotion detection and continuous stress monitoring, the system supports mental health assessment, user experience research, educational engagement tracking, and customer satisfaction analysis, enabling data-driven interventions and early support for improved well-being.
Introduction
The project develops a Facial Expression Recognition System for Early Detection of Stress and Mental Health Disorders. Stress, anxiety, and burnout are common among students and office workers, but traditional mental health monitoring methods are slow, self-reported, and often fail to detect gradual deterioration. The system uses deep learning and computer vision to monitor emotional states in real time through webcams, image uploads, or video files.
It employs MTCNN with a Haar Cascade fallback for robust face detection and a pre-trained FER model to classify seven emotions: angry, disgust, fear, happy, sad, surprise, and neutral. An advanced stress analysis algorithm tracks emotions over a 30-second rolling window to calculate stress levels and trends. The system includes session management, multi-device support, and a React-Flask web interface for real-time interaction.
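The primary-detector-with-fallback strategy described above can be sketched as a small wrapper that tries one detector and, if it fails or finds nothing, falls back to a second. The function and the stub detectors below are illustrative stand-ins, not the project's actual code; in practice the primary would wrap an MTCNN detector and the fallback an OpenCV Haar Cascade.

```python
from typing import Callable, List, Tuple

# A bounding box as (x, y, width, height); names here are illustrative.
Box = Tuple[int, int, int, int]
Detector = Callable[[object], List[Box]]

def detect_faces_with_fallback(frame: object,
                               primary: Detector,
                               fallback: Detector) -> List[Box]:
    """Try the primary detector (e.g. MTCNN); if it raises an error or
    finds no faces, fall back to the secondary (e.g. Haar Cascade)."""
    try:
        boxes = primary(frame)
    except Exception:
        boxes = []
    if boxes:
        return boxes
    return fallback(frame)

# Toy usage with stub detectors: the primary finds nothing,
# so the fallback's detection is returned.
mtcnn_stub = lambda frame: []
haar_stub = lambda frame: [(10, 20, 64, 64)]
detect_faces_with_fallback(None, mtcnn_stub, haar_stub)  # [(10, 20, 64, 64)]
```

Keeping the fallback behind a single wrapper means the rest of the pipeline never needs to know which detector produced the boxes.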
By combining automated emotion detection, stress trend analysis, and a user-friendly platform, the system aims to enable early intervention, improve mental health awareness, and promote well-being in educational and professional settings.
Conclusion
This project presents the design and implementation of a Facial Expression Recognition System for Early Detection of Stress and Mental Health Disorders, addressing the challenges individuals face due to limited access to continuous mental health monitoring and early stress detection. By integrating deep learning, computer vision, natural language processing, and web technologies, the system provides an intelligent, accessible, and user-friendly platform for real-time emotional state analysis and stress level monitoring. The developed system detects faces using MTCNN with a Haar Cascade fallback, classifies emotions into seven distinct categories using a pre-trained PyTorch-based FER model, and calculates stress levels through a weighted scoring algorithm.
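The weighted scoring over a rolling window can be illustrated with a minimal sketch. The emotion weights and the exact formula below are assumptions chosen for illustration, not the system's actual parameters: each recent emotion sample contributes a fixed weight, and the stress level is the average weight over samples inside the window.

```python
import time
from collections import deque

# Illustrative weights: negative emotions contribute more to stress.
# The weights actually used by the system are not specified here.
STRESS_WEIGHTS = {
    "angry": 0.9, "fear": 0.85, "sad": 0.7, "disgust": 0.6,
    "surprise": 0.3, "neutral": 0.1, "happy": 0.0,
}

class RollingStressMonitor:
    """Keep (timestamp, emotion) samples inside a rolling time window
    and compute a weighted average stress score in [0, 1]."""

    def __init__(self, window_seconds: float = 30.0):
        self.window = window_seconds
        self.samples = deque()  # (timestamp, emotion) pairs, oldest first

    def add(self, emotion: str, now: float = None) -> None:
        now = time.time() if now is None else now
        self.samples.append((now, emotion))
        # Evict samples that have fallen out of the window.
        cutoff = now - self.window
        while self.samples and self.samples[0][0] < cutoff:
            self.samples.popleft()

    def stress_level(self) -> float:
        if not self.samples:
            return 0.0
        total = sum(STRESS_WEIGHTS.get(e, 0.0) for _, e in self.samples)
        return total / len(self.samples)

# Toy usage with explicit timestamps: one happy and one angry sample
# inside the window average to (0.0 + 0.9) / 2.
monitor = RollingStressMonitor(window_seconds=30.0)
monitor.add("happy", now=0.0)
monitor.add("angry", now=10.0)
monitor.stress_level()  # 0.45
```

A trend can then be estimated by comparing the current score against the score from a previous window.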
The use of deep learning techniques enables accurate emotion recognition, while the implementation of a 30-second rolling window for stress pattern analysis provides actionable insights into mental health trends. The integration of WebSocket-based real-time communication ensures low-latency performance, supporting continuous monitoring and immediate feedback. Experimental evaluation demonstrates that the system is capable of delivering accurate and timely emotion detection and stress level analysis for common emotional states, thereby reducing dependency on traditional mental health assessment methods and minimizing delays in identifying emotional distress. The web-based interface built with React and Vite ensures accessibility and ease of use, making the system suitable for users with varying levels of technical expertise.
Although the current implementation shows promising results, its effectiveness is influenced by factors such as lighting conditions, face orientation, and image quality. Overall, the Facial Expression Recognition System highlights the potential of artificial intelligence to transform mental health monitoring and stress detection. The project contributes toward improving emotional well-being awareness, enabling early intervention, reducing stress-related health issues, and promoting proactive mental health management. With further enhancements such as multi-modal emotion detection, advanced 3D face analysis, and expanded deployment platforms, the system can be scaled to support a wider range of applications, including workplace wellness programs, educational institutions, healthcare facilities, and remote mental health services, making it a valuable technological solution for modern mental health support.
The confusion matrix shows where the model confuses certain emotions; for example, fear is sometimes misclassified as sad. As noted above, prediction accuracy depends on many factors, chief among them the quality of the input image or video.
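How such a confusion matrix is tallied can be shown with a short sketch: each (true, predicted) pair is counted, rows correspond to the true emotion and columns to the prediction, so an off-diagonal entry in the "fear" row under the "sad" column is exactly the fear-mistaken-for-sad case. The labels below are toy data for illustration, not the project's evaluation set.

```python
from collections import Counter

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def confusion_matrix(true_labels, predicted_labels):
    """Count (true, predicted) pairs; rows are true emotions,
    columns are predicted emotions, in EMOTIONS order."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts.get((t, p), 0) for p in EMOTIONS] for t in EMOTIONS]

# Toy example: two 'fear' frames, one of them misread as 'sad'.
y_true = ["fear", "fear", "happy"]
y_pred = ["fear", "sad", "happy"]
cm = confusion_matrix(y_true, y_pred)
```

Reading `cm`, the diagonal entries count correct predictions, and `cm[EMOTIONS.index("fear")][EMOTIONS.index("sad")]` counts the fear-to-sad confusions discussed above.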
References
[1] Z. Zhang, M. Lyons, M. Schuster, and S. Akamatsu, "Comparison between geometry-based and Gabor-wavelets-based facial expression recognition using multi-layer perceptron," in Proc. 3rd IEEE International Conference on Automatic Face and Gesture Recognition, Nara, Japan, 1998, pp. 454-459.
[2] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, 2015.
[3] I. J. Goodfellow, D. Erhan, P. L. Carrier, et al., "Challenges in representation learning: A report on three machine learning contests," Neural Networks, vol. 64, pp. 59-63, 2015.
[4] J. Shenk and M. Schwabacher, "FER: Facial Expression Recognition," GitHub repository, https://github.com/justinshenk/fer, 2020.
[5] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, "Joint face detection and alignment using multitask cascaded convolutional networks," IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499-1503, 2016.
[6] B. C. Ko, "A brief review of facial emotion recognition based on visual information," Sensors, vol. 18, no. 2, p. 401, 2018.
[7] M. Pantic and L. J. M. Rothkrantz, "Automatic analysis of facial expressions: The state of the art," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1424-1445, 2000.
[8] R. W. Picard, Affective Computing. Cambridge, MA, USA: MIT Press, 1997.
[9] Z. Zeng, M. Pantic, G. I. Roisman, and T. S. Huang, "A survey of affect recognition methods: Audio, visual, and spontaneous expressions," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 1, pp. 39-58, 2009.
[10] S. Li and W. Deng, "Deep facial expression recognition: A survey," IEEE Transactions on Affective Computing, vol. 12, no. 4, pp. 502-524, Oct.-Dec. 2022, doi: 10.1109/TAFFC.2020.2981446.