Monitoring student engagement during online lectures is challenging, as traditional subjective methods often fail to accurately capture attention. This study proposes an objective, real-time method using facial action units (AUs)—specific facial muscle movements—to estimate students’ attention by predicting their reaction times (RT) to irrelevant auditory stimuli.
Methodology:
Fifteen participants watched tutorial videos while pressing a key in response to randomly timed silent gaps in background white noise. Their facial expressions were recorded and analyzed with OpenFace to extract 35 facial action units (AUs). Features computed from these AUs within a 3-second window preceding each auditory event were used to train machine learning models (LightGBM) to predict reaction times.
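The study's exact feature set and training configuration are not reproduced here, but the sketch below shows how pre-stimulus AU features could be aggregated and fed to a LightGBM regressor. The 30 fps frame rate, the mean/standard-deviation summary features, and the OpenFace-style column names (e.g. "AU01_r") are assumptions for illustration, not the authors' code.

```python
# Hypothetical sketch: summarize OpenFace AU intensities over the 3-second
# window before each auditory event and fit a LightGBM regressor on reaction time.
import numpy as np
import pandas as pd
from lightgbm import LGBMRegressor
from sklearn.model_selection import cross_val_predict

FPS = 30                      # assumed webcam frame rate
WINDOW = 3 * FPS              # 3-second pre-stimulus window, in frames

def window_features(au_frames: pd.DataFrame) -> np.ndarray:
    """Mean and standard deviation of each AU over the pre-stimulus window."""
    au_cols = [c for c in au_frames.columns if c.startswith("AU")]
    window = au_frames[au_cols].tail(WINDOW)
    return np.concatenate([window.mean().values, window.std().values])

def build_dataset(events):
    """events: list of (AU frames preceding the event, measured reaction time)."""
    X = np.vstack([window_features(frames) for frames, _ in events])
    y = np.array([rt for _, rt in events])
    return X, y

def fit_rt_model(X, y):
    model = LGBMRegressor(n_estimators=300, learning_rate=0.05)
    # Out-of-fold predictions give an unbiased estimate of predictive quality.
    oof_preds = cross_val_predict(model, X, y, cv=5)
    model.fit(X, y)
    return model, oof_preds
```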
Implementation:
A standard webcam and a personal computer processed the facial video in real time. Face detection was performed with OpenCV, and emotions were recognized by a lightweight CNN. Preprocessing included noise filtering and grayscale conversion. Engagement was classified as high, moderate, or low based on the recognized emotions, and data and results were logged for further analysis. The system was built in Python using OpenCV, Keras, and other libraries.
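As a rough illustration, a minimal real-time loop matching this description might look as follows. The model file name, the 48x48 grayscale input size, the emotion label set, and the emotion-to-engagement mapping are all illustrative assumptions rather than details taken from the study.

```python
# Minimal sketch of the described pipeline: grab webcam frames, detect the face
# with an OpenCV Haar cascade, classify emotion with a small Keras CNN, and map
# emotions to engagement levels. Model file and label sets are hypothetical.
import cv2
import numpy as np
from tensorflow.keras.models import load_model

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]
ENGAGEMENT = {"happy": "high", "surprise": "high",
              "neutral": "moderate",
              "sad": "low", "angry": "low", "disgust": "low", "fear": "low"}

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
emotion_model = load_model("emotion_cnn.h5")   # hypothetical trained CNN

cap = cv2.VideoCapture(0)
with open("engagement_log.csv", "a") as log:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # grayscale preprocessing
        for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
            roi = cv2.resize(gray[y:y + h, x:x + w], (48, 48)) / 255.0
            probs = emotion_model.predict(roi[np.newaxis, ..., np.newaxis],
                                          verbose=0)[0]
            emotion = EMOTIONS[int(np.argmax(probs))]
            log.write(f"{emotion},{ENGAGEMENT[emotion]}\n")
        cv2.imshow("engagement", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```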
Results:
The pooled model showed a strong correlation (0.66) between facial features and attention, with facial actions such as nose wrinkling, blinking, and lip-corner depression identified as important predictors. Removing sleepy-looking faces did not affect accuracy, indicating that the facial cues reflect attention beyond general alertness. Individual-specific models, however, performed poorly because of variability in how people express engagement, suggesting that personalized or group-specific models may improve accuracy.
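A hedged sketch of how such a pooled-model evaluation could be carried out is shown below: Pearson correlation between out-of-fold predictions and observed reaction times, plus LightGBM feature importances to rank the AU features. The variable names continue the earlier sketch and are assumptions, not the authors' code.

```python
# Evaluate a pooled (participant-independent) model: prediction-vs-observation
# correlation and the top-ranked AU features by LightGBM importance.
from lightgbm import LGBMRegressor
from scipy.stats import pearsonr
from sklearn.model_selection import cross_val_predict

def evaluate_pooled(X, y, feature_names, top_k=5):
    model = LGBMRegressor(n_estimators=300, learning_rate=0.05)
    preds = cross_val_predict(model, X, y, cv=5)
    r, p = pearsonr(preds, y)            # correlation between predicted and observed RT
    model.fit(X, y)
    ranked = sorted(zip(feature_names, model.feature_importances_),
                    key=lambda t: t[1], reverse=True)[:top_k]
    return r, p, ranked
```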
Conclusion:
Facial expression analysis, paired with machine learning, can objectively estimate student engagement during online lectures. Future work should focus on refining feature extraction methods, handling individual variability, and exploring integration into real-world educational platforms.