In recent years, stress, anxiety, and emotional exhaustion have been rising because of accelerated lifestyles and constant digital exposure, and mental health challenges have intensified as a result. Music is a proven therapeutic aid for managing emotions. In a recent study, 256 participants attended a music therapy session, and a substantial increase in emotional resilience and overall well-being was observed afterwards.
However, conventional music playback systems are not designed around the user’s mood; the user has to select the music manually. To bridge this gap, this paper introduces MoodSync, an AI-driven system that detects the user’s mood from facial expressions captured by a webcam and plays music that matches that mood in the user’s selected language.
This enables adaptive music therapy. The approach is to capture the user’s facial expression through a webcam and then analyse it with a Convolutional Neural Network (CNN). The research combines several current technologies to support emotional health. MoodSync provides a flexible, scalable framework that can advance emotion-responsive technologies in the mental health support and human-computer interaction domains.
Introduction
Unlike traditional music streaming platforms that depend on manual input, MoodSync uses computer vision and deep learning (CNNs) to detect emotions through a webcam and recommend suitable music accordingly. For example, it plays calming or uplifting music when a sad emotion is detected, and it also adapts to the user’s preferred language.
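As a minimal sketch of the mapping described above, the following Python snippet selects candidate tracks from a detected emotion and a preferred language. Only the rule that a sad emotion leads to calming or uplifting music comes from the text; the other category names, the languages, the catalogue contents, and the function names are hypothetical and are not taken from the MoodSync implementation.

```python
import random

# Hypothetical mapping from detected emotion to target music categories.
# Only the "sad" rule comes from the text; the others are illustrative.
MOOD_TO_CATEGORIES = {
    "sad": ["calming", "uplifting"],
    "happy": ["upbeat"],
    "angry": ["calming"],
}

# Hypothetical catalogue keyed by (language, category).
CATALOGUE = {
    ("english", "calming"): ["en_calm_01.mp3", "en_calm_02.mp3"],
    ("english", "uplifting"): ["en_uplift_01.mp3"],
    ("english", "upbeat"): ["en_upbeat_01.mp3"],
    ("hindi", "calming"): ["hi_calm_01.mp3"],
    ("hindi", "uplifting"): ["hi_uplift_01.mp3"],
    ("hindi", "upbeat"): ["hi_upbeat_01.mp3"],
}


def candidate_tracks(emotion, language):
    """Return all tracks matching the emotion's categories in the given language."""
    categories = MOOD_TO_CATEGORIES.get(emotion, [])
    tracks = []
    for category in categories:
        tracks.extend(CATALOGUE.get((language, category), []))
    return tracks


def pick_track(emotion, language, rng=random):
    """Pick one matching track at random, or None if nothing matches."""
    tracks = candidate_tracks(emotion, language)
    return rng.choice(tracks) if tracks else None
```

Keeping the emotion-to-category rules separate from the catalogue means that, under these assumptions, a new language could be supported by adding catalogue entries without touching the mood rules.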
The system is motivated by growing mental health concerns and the need for music therapy-based emotional support, especially for students and working professionals. It aims to improve well-being by combining emotion recognition with personalized music recommendation.
The methodology includes:
Capturing facial images via webcam
Detecting emotions using OpenCV and CNN models
Using a trained dataset of facial expressions (48×48 grayscale images)
Processing and classifying emotions such as happy, sad, and angry
Playing music based on detected mood and language preference
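The preprocessing and classification steps above can be sketched in plain Python as follows. The sketch assumes that a frame arrives as a nested list of (R, G, B) pixel rows and that the trained CNN has already produced one score per class. In a real pipeline, OpenCV would handle capture, face detection, and resizing, and a deep learning framework would run the model. The label set and its order, the helper names, and the nearest-neighbour resize are assumptions made for illustration.

```python
# Assumed label set: only happy, sad, and angry are named in the text;
# the remaining labels and their order are illustrative.
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]
TARGET_SIZE = 48  # the model input is a 48x48 grayscale image


def to_grayscale(frame):
    """Convert rows of (R, G, B) pixels to rows of luminance values (0-255)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in frame]


def resize_nearest(gray, size=TARGET_SIZE):
    """Resize a 2D grid to size x size using nearest-neighbour sampling."""
    height, width = len(gray), len(gray[0])
    return [
        [gray[y * height // size][x * width // size] for x in range(size)]
        for y in range(size)
    ]


def preprocess(frame):
    """Grayscale, resize to 48x48, and scale pixel values to [0, 1]."""
    small = resize_nearest(to_grayscale(frame))
    return [[value / 255.0 for value in row] for row in small]


def decode_emotion(scores):
    """Map per-class scores from the model to the highest-scoring label."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return EMOTIONS[best_index]
```

The label returned by decode_emotion would then be combined with the language preference to choose what to play.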
The literature review shows that while facial emotion recognition using CNNs has achieved high accuracy in controlled settings, real-world performance is more challenging. Existing music systems mostly rely on user history and preferences, and few integrate real-time emotion detection with music playback. MoodSync aims to address this gap.
Key system factors include:
Emotion detection accuracy
Image quality and lighting conditions
Facial alignment and model robustness
Real-time responsiveness
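One way to make the image-quality and lighting factors above concrete is to screen frames before they reach the classifier. The sketch below flags frames that are too dark, too bright, or too flat in contrast, using the mean and standard deviation of the grayscale pixel values. The thresholds and function names are hypothetical, chosen only for illustration, and would need tuning on real webcam data.

```python
import statistics

# Hypothetical thresholds on a 0-255 grayscale scale; tune on real data.
MIN_BRIGHTNESS = 40.0
MAX_BRIGHTNESS = 220.0
MIN_CONTRAST = 15.0


def frame_quality(gray):
    """Return (mean brightness, contrast) for a 2D grid of grayscale values."""
    pixels = [value for row in gray for value in row]
    return statistics.mean(pixels), statistics.pstdev(pixels)


def is_usable(gray):
    """Accept a frame only if it is neither badly lit nor nearly uniform."""
    brightness, contrast = frame_quality(gray)
    return MIN_BRIGHTNESS <= brightness <= MAX_BRIGHTNESS and contrast >= MIN_CONTRAST
```

Frames that fail the check could simply be skipped, which keeps the per-frame cost low and avoids reacting to predictions made on poor input.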
User engagement is improved through real-time mood detection, personalized recommendations, and a simple interface.
Conclusion
In conclusion, this paper contributes to current advances in AI by proposing MoodSync, an emotionally aware music playback system. MoodSync has been explored as a potential supplement in mental health and well-being care, where it could support music therapy for patients. The paper also examined the factors that determine the effectiveness of the proposed system, including the strength of the CNN model, the variety of the input dataset, and the focus on user satisfaction.
The results of the research further indicate that MoodSync was able to identify emotions effectively and play context-relevant music, which highlights its potential to positively influence the user’s mood.
However, the system is still sensitive to factors such as lighting conditions and low-quality or blurred images, which can lead to unexpected results. This highlights the need for further improvements to the proposed system, including more advanced model training and additional optimization.
Despite these challenges, MoodSync can be considered a pragmatic and constructive solution in the field of mental health.