The COVID-19 pandemic accelerated the development of intelligent monitoring systems that help enforce compliance with public health guidelines. Among these technologies, face mask detection has emerged as an important application of computer vision, machine learning (ML), and deep learning (DL). Automated mask detection enables real-time monitoring in hospitals, airports, educational institutions, shopping malls, and other public environments. Proposed techniques include classical classifiers such as Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Decision Trees, Random Forests, and Gradient Boosting, as well as deep models such as Convolutional Neural Networks (CNNs), MobileNetV2, ResNet-50, and YOLO, often combined with transfer learning.
Introduction
This review examines face mask detection, a computer vision application used for public health monitoring and safety, especially during infectious disease outbreaks. Manual monitoring is inefficient and error-prone, which has driven the adoption of AI-based systems that detect masks automatically.
A typical system pipeline consists of dataset collection, preprocessing, feature extraction, and classification. Preprocessing standardizes the input through resizing and normalization, and augmentation increases the diversity of the training data. Feature extraction then identifies the facial patterns relevant to the task, and classification is performed with either machine learning or deep learning methods.
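The preprocessing stage described above can be sketched as follows. This is a minimal illustration, not the pipeline of any particular study: it assumes NumPy is available, and the function names, the 64x64 target size, and the random stand-in image are all hypothetical choices made for the example.

```python
import numpy as np

def resize_nearest(image, size):
    """Resize an H x W (x C) image using nearest-neighbour sampling."""
    out_h, out_w = size
    in_h, in_w = image.shape[:2]
    # Map every output row and column back to a source row and column.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return image[rows[:, None], cols]

def normalize(image):
    """Scale 8-bit pixel values to 32-bit floats in [0, 1]."""
    return image.astype(np.float32) / 255.0

def augment_flip(image):
    """Return a horizontally mirrored view, one simple augmentation."""
    return image[:, ::-1]

def preprocess(image, size=(64, 64)):
    """Resize first, then normalize (target size is an assumption)."""
    return normalize(resize_nearest(image, size))

# Hypothetical input: a random 8-bit RGB image standing in for a face crop.
rng = np.random.default_rng(0)
face = rng.integers(0, 256, size=(120, 90, 3), dtype=np.uint8)

x = preprocess(face)
x_flipped = augment_flip(x)
print(x.shape, x.dtype)
```

In practice, augmentation is applied only to training images, and a real system would usually use a library resize with interpolation rather than nearest-neighbour sampling.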
Traditional machine learning approaches such as KNN, SVM, Decision Trees, Random Forests, and Gradient Boosting are widely used for mask detection because they are simple and computationally efficient. However, deep learning models such as CNNs, MobileNetV2, ResNet-50, and YOLO generally achieve higher accuracy because they learn complex image features automatically rather than relying on hand-crafted ones. Transfer learning can further improve performance by adapting models pre-trained on large image collections to smaller mask datasets.
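To illustrate why the classical methods are considered simple, here is a minimal sketch of a KNN classifier using only the Python standard library. The two-dimensional feature vectors and their values are invented for the example; a real system would classify features extracted from face images.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    # Rank training samples by Euclidean distance to the query.
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: math.dist(train_x[i], query),
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical feature vectors (made-up numbers, two features per face).
train_x = [
    (0.82, 0.10), (0.78, 0.15), (0.85, 0.12),  # labelled "mask"
    (0.40, 0.55), (0.35, 0.60), (0.45, 0.50),  # labelled "no_mask"
]
train_y = ["mask"] * 3 + ["no_mask"] * 3

print(knn_predict(train_x, train_y, (0.80, 0.14)))  # prints "mask"
print(knn_predict(train_x, train_y, (0.38, 0.58)))  # prints "no_mask"
```

The classifier has no training step beyond storing the labelled samples, which is part of its appeal, but every prediction requires a distance computation against the whole training set.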
Comparative studies report that machine learning models typically achieve over 95% accuracy, while deep learning models can exceed 98% under controlled conditions. Despite these results, challenges remain, including lighting variations, occlusions, dataset imbalance, and the computational constraints of real-time deployment.
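Dataset imbalance also affects how accuracy figures should be read. The sketch below uses a hypothetical test set of 95 masked and 5 unmasked faces and a degenerate classifier that always predicts "mask". The numbers are invented for illustration and do not come from any study.

```python
def rates(y_true, y_pred, positive):
    """Return (accuracy, recall, specificity) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    pos = sum(t == positive for t in y_true)
    neg = len(y_true) - pos
    accuracy = (tp + tn) / len(y_true)
    recall = tp / pos if pos else 0.0        # share of positives found
    specificity = tn / neg if neg else 0.0   # share of negatives found
    return accuracy, recall, specificity

# Hypothetical imbalanced test set: 95 masked faces, 5 unmasked faces.
y_true = ["mask"] * 95 + ["no_mask"] * 5
# Degenerate classifier that never reports a missing mask.
y_pred = ["mask"] * 100

accuracy, recall, specificity = rates(y_true, y_pred, positive="no_mask")
balanced_accuracy = (recall + specificity) / 2

print(f"accuracy={accuracy:.2f}")                    # accuracy=0.95
print(f"recall(no_mask)={recall:.2f}")               # recall(no_mask)=0.00
print(f"balanced_accuracy={balanced_accuracy:.2f}")  # balanced_accuracy=0.50
```

The classifier reaches 95% accuracy without detecting a single unmasked face, which is why per-class recall or balanced accuracy is worth reporting alongside overall accuracy on imbalanced data.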