Ensuring safety on construction sites remains a major challenge, motivating intelligent, automated monitoring systems. This research presents a smart helmet detection framework based on modern computer vision techniques, designed to enforce safety measures in real time. The paper systematically evaluates several state-of-the-art object detection models, including YOLOv3, YOLOv4, YOLOv5s, YOLO-M, YOLOv8, YOLOv5 integrated with GhostCNN, SSD, RetinaNet, and Faster R-CNN. These models are assessed on detection accuracy, processing speed, and hardware efficiency to determine their viability for safety enforcement tasks. The system primarily aims to safeguard construction personnel while also assisting site supervisors with surveillance and resource management. Experimental results indicate that the YOLOv5-GhostCNN architecture attains a mean Average Precision (mAP) above 97%, underscoring its suitability for safety-critical applications. This work advances the goal of safer construction environments through effective use of AI-based safety monitoring.
Introduction
1. Background & Motivation
Construction sites are high-risk environments where falling objects are a leading cause of injury.
Helmets are essential safety gear, but manual enforcement of helmet usage is inefficient and error-prone.
Deep learning and computer vision offer a transformative approach for automating helmet detection in real-time.
2. Technological Advances
Early helmet detection systems used YOLO (You Only Look Once) models but struggled with precision.
Improvements include:
Use of lightweight models (e.g., YOLO-M, MobileNet-based YOLO) for real-time efficiency.
Enhancements such as Efficient Channel Attention (ECA), BiCAM, and model pruning improve performance (a minimal ECA sketch follows this list).
Reinforcement learning and self-supervised learning are being explored for future adaptability.
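To make the ECA idea above concrete, the following is a minimal sketch of an ECA block in PyTorch. It illustrates the general technique only, not the implementation used in any of the cited works; the kernel size and tensor shapes are assumptions.

    import torch
    import torch.nn as nn

    class ECA(nn.Module):
        """Efficient Channel Attention: a small 1D conv over pooled channel descriptors."""
        def __init__(self, kernel_size: int = 3):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)   # (B, C, H, W) -> (B, C, 1, 1)
            self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)
            self.gate = nn.Sigmoid()

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            w = self.pool(x)                                # per-channel descriptor
            w = self.conv(w.squeeze(-1).transpose(1, 2))    # local cross-channel interaction
            w = self.gate(w.transpose(1, 2).unsqueeze(-1))  # attention weights in (0, 1)
            return x * w                                    # reweight the feature maps

    x = torch.randn(2, 64, 80, 80)   # e.g. a detector neck feature map
    print(ECA()(x).shape)            # torch.Size([2, 64, 80, 80])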
3. Literature Review Insights
Manual enforcement methods are outdated and inefficient.
Multiple studies have refined YOLO architectures (YOLOv3, YOLOv4, YOLOv5, etc.) to balance speed and accuracy.
Lightweight models like YOLO-S and optimized architectures (e.g., YOLOv5-GhostCNN, YOLOv5X6) make helmet detection feasible on low-power devices.
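The GhostCNN idea mentioned above reduces computation by generating part of each layer's output with cheap operations. Below is a minimal PyTorch sketch of a Ghost-style convolution block under that assumption: half of the output channels come from an ordinary convolution and the rest from a cheap depthwise convolution. Channel counts and kernel sizes are illustrative, not taken from the cited papers.

    import torch
    import torch.nn as nn

    class GhostConv(nn.Module):
        def __init__(self, c_in: int, c_out: int, k: int = 1, cheap_k: int = 5):
            super().__init__()
            c_primary = c_out // 2
            # primary features from an ordinary convolution
            self.primary = nn.Sequential(
                nn.Conv2d(c_in, c_primary, k, padding=k // 2, bias=False),
                nn.BatchNorm2d(c_primary), nn.SiLU())
            # "ghost" features from a cheap depthwise convolution
            self.cheap = nn.Sequential(
                nn.Conv2d(c_primary, c_primary, cheap_k, padding=cheap_k // 2,
                          groups=c_primary, bias=False),
                nn.BatchNorm2d(c_primary), nn.SiLU())

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            y = self.primary(x)
            return torch.cat([y, self.cheap(y)], dim=1)   # (B, c_out, H, W)

    print(GhostConv(64, 128)(torch.randn(1, 64, 40, 40)).shape)   # [1, 128, 40, 40]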
4. Methodology
A new model, YOLO-M, was derived from YOLOv5s by replacing the backbone with MobileNetV3 and adding BiCAM, targeting better performance on limited hardware.
Comparison with other models: YOLOv3, YOLOv4, YOLOv5X6, YOLOv8, SSD, RetinaNet, Faster R-CNN, etc.
Implemented via a Flask web application with user login, upload, and detection features.
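As a concrete illustration of this deployment path, below is a minimal sketch of a Flask upload-and-detect endpoint. The route names, the session-based login check, and the torch.hub YOLOv5s weights are assumptions standing in for the paper's actual application code.

    from flask import Flask, request, session, redirect, jsonify
    import torch

    app = Flask(__name__)
    app.secret_key = "change-me"   # required for session-based login state

    # A pretrained YOLOv5s from torch.hub stands in for the paper's trained helmet model.
    model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

    @app.route("/detect", methods=["POST"])
    def detect():
        if "user" not in session:          # enforce the login feature (login route omitted)
            return redirect("/login")
        request.files["image"].save("upload.jpg")   # uploaded site image
        results = model("upload.jpg")               # run detection
        # return boxes, confidences, and class labels as JSON
        return jsonify(results.pandas().xyxy[0].to_dict(orient="records"))

    if __name__ == "__main__":
        app.run(debug=True)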
5. Dataset & Preprocessing
Annotated images from real construction sites were used, with labels created using Labelbox and Roboflow.
Diverse conditions (lighting, angles, weather) were included for robust training.
Preprocessing included image resizing, blob conversion, normalization, and bounding box annotation.
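The resize, blob-conversion, and normalization steps can be illustrated with OpenCV's DNN helper, as sketched below; the 640x640 input size, the 1/255 scale factor, and the file name are assumed defaults, not values stated in the paper.

    import cv2

    img = cv2.imread("site_image.jpg")   # hypothetical input path
    blob = cv2.dnn.blobFromImage(
        img,
        scalefactor=1 / 255.0,   # normalize pixel values to [0, 1]
        size=(640, 640),         # resize to the network input resolution
        swapRB=True,             # BGR (OpenCV) -> RGB (model channel order)
        crop=False)
    print(blob.shape)            # (1, 3, 640, 640): NCHW blob ready for the detector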
6. Data Augmentation
Random flipping, rotations, and affine transformations were applied to simulate real-world variations and enhance model robustness.
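Below is a minimal sketch of these augmentations using the Albumentations library (an assumption; the paper does not name its augmentation tooling). The probabilities and limits are illustrative; bbox_params keeps the helmet bounding boxes aligned with each geometric transform.

    import albumentations as A
    import numpy as np

    image = np.zeros((480, 640, 3), dtype=np.uint8)   # stand-in for a site photo

    transform = A.Compose(
        [
            A.HorizontalFlip(p=0.5),                                   # random flipping
            A.Rotate(limit=15, p=0.5),                                 # random rotations
            A.Affine(translate_percent=0.1, scale=(0.9, 1.1), p=0.5),  # affine transforms
        ],
        bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
    )

    augmented = transform(
        image=image,
        bboxes=[(0.5, 0.5, 0.2, 0.3)],   # one YOLO-format box: (cx, cy, w, h), normalized
        class_labels=["helmet"],
    )
    print(len(augmented["bboxes"]))      # boxes stay consistent with the transformed image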
7. Algorithms Used
YOLOv3: Best performer overall in precision, recall, and mAP.
YOLOv5X6: Strong precision and mAP, though heavier in computation.
YOLO-M: Efficient and lightweight with balanced performance.
Other models like SSD, RetinaNet, and Faster R-CNN showed varying degrees of trade-offs between accuracy and speed.
8. Results & Evaluation
Evaluation metrics: Precision, Recall, and Mean Average Precision (mAP); a computation sketch follows at the end of this section.
YOLOv3 achieved top performance:
Precision: 0.944
Recall: 0.870
mAP: 0.931
YOLOv5X6 also showed strong performance, but required more computational power.
Visual comparisons (Graph 1) and web interface screenshots demonstrate usability and performance.
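To ground these metrics, the following is a minimal sketch of how precision, recall, and average precision are computed from detection counts. The example counts are illustrative, chosen only to be roughly consistent with the YOLOv3 figures above; they are not the paper's raw data, and a 0.5 IoU matching threshold is assumed.

    import numpy as np

    def precision_recall(tp: int, fp: int, fn: int):
        precision = tp / (tp + fp) if tp + fp else 0.0   # fraction of detections that are correct
        recall = tp / (tp + fn) if tp + fn else 0.0      # fraction of ground-truth helmets found
        return precision, recall

    def average_precision(precisions, recalls) -> float:
        # area under the precision-recall curve; mAP averages this over classes
        order = np.argsort(recalls)
        return float(np.trapz(np.asarray(precisions)[order], np.asarray(recalls)[order]))

    # illustrative counts at one confidence threshold
    print(precision_recall(tp=87, fp=5, fn=13))   # ~(0.946, 0.870), cf. the YOLOv3 row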
Conclusion
In conclusion, the development of an automated safety helmet detection system represents a significant advancement in workplace safety within the construction industry. By applying computer vision technologies and algorithms such as the YOLO variants, the customized YOLO-M model, SSD, RetinaNet, and Faster R-CNN [14], the project addresses the critical need for real-time monitoring of safety helmet compliance. The extension to additional algorithms such as YOLOv5X6 [18] and YOLOv8 further improves the system's robustness and accuracy. Integrating Flask with user authentication provides a user-friendly interface for testing and validation, facilitating practical deployment. Ultimately, these outcomes contribute tangibly to the construction industry by automating safety monitoring, helping site managers and workers maintain a safer working environment.
References
[1] Q. Y. Li, J. B. Wang, and H. W. Wang, ‘‘Study on impact resistance of industrial safety helmet,’’ J. Saf. Sci. Technol., vol. 17, no. 3, pp. 182–186, Mar. 2021, doi: 10.11731/j.issn.1673-193x.2021.03.028.
[2] Y. X. Wang, Z. Wang, and B. Wu, ‘‘Research review of safety helmet wearing detection algorithm in intelligent construction site,’’ J. Wuhan Univ. Technol., vol. 43, no. 10, pp. 56–62, Oct. 2021, doi: 10.3963/j.issn.1671-4431.2021.10.00.
[3] L. Jun, W. C. Dang, and P. Lihu, ‘‘Safety helmet detection based on YOLO,’’ Comput. Syst. Appl., vol. 28, no. 9, pp. 174–179, Sep. 2019, doi: 10.15888/j.cnki.csa.007065.
[4] W. Bing, L. Wenjing, and T. Huan, ‘‘Improved YOLOv3 algorithm and its application in helmet detection,’’ Comput. Eng. Appl., vol. 26, no. 9, pp. 33–40, Feb. 2020, doi: 10.3778/j.issn.1002-8331.1912-0267.
[5] F. Ming, S. Tengteng, and S. Zhen, ‘‘Fast helmet-wearing-condition detection based on improved YOLOv2,’’ Opt. Precis. Eng., vol. 27, no. 5, pp. 1196–1205, Mar. 2019, doi: 10.3788/OPE.20192705.1196.
[6] H. C. Zhao, X. X. Tian, and Z. S. Yang, ‘‘YOLO-S: A new lightweight helmet wearing detection model,’’ J. East China Normal Univ. Natural Sci., vol. 47, no. 5, pp. 134–145, Sep. 2021, doi: 10.3969/j.issn.1000-5641.2021.05.012.
[7] T. Ding, X. Y. Chen, Q. Zhou, and H. L. Xiao, ‘‘Real-time detection of helmet wearing based on improved YOLOX,’’ Electron. Meas. Technol., vol. 45, no. 17, pp. 72–78, Sep. 2022, doi: 10.19651/j.cnki.emt.2209425.
[8] X. Ma, K. Ji, B. Xiong, L. Zhang, S. Feng, and G. Kuang, ‘‘Light-YOLOv4: An edge-device oriented target detection method for remote sensing images,’’ IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 10808–10820, 2021, doi: 10.1109/JSTARS.2021.3120009.
[9] Z. Z. Sun, X. G. Leng, and L. Yu, ‘‘BiFA-YOLO: A novel YOLO-based method for arbitrary-oriented ship detection in high-resolution SAR images,’’ Remote Sens., vol. 13, no. 21, pp. 4209–4237, Oct. 2021, doi: 10.3390/rs13214209.
[10] R. Girshick, J. Donahue, T. Darrell, and J. Malik, ‘‘Rich feature hierarchies for accurate object detection and semantic segmentation,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2014, pp. 580–587.
[11] R. Girshick, ‘‘Fast R-CNN,’’ in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Dec. 2015, pp. 1440–1448.
[12] S. Ren, K. He, R. Girshick, and J. Sun, ‘‘Faster R-CNN: Towards real-time object detection with region proposal networks,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1137–1149, Jun. 2017.
[13] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, ‘‘The PASCAL visual object classes (VOC) challenge,’’ Int. J. Comput. Vis., vol. 88, no. 2, pp. 303–338, Jun. 2010.
[14] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, ‘‘You only look once: Unified, real-time object detection,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 779–788.
[15] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, ‘‘SSD: Single shot multibox detector,’’ in Proc. Eur. Conf. Comput. Vis. (ECCV), Oct. 2016, pp. 21–37.
[16] A. Howard, M. Sandler, B. Chen, W. Wang, L.-C. Chen, M. Tan, G. Chu, V. Vasudevan, Y. Zhu, R. Pang, H. Adam, and Q. Le, ‘‘Searching for MobileNetV3,’’ in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Oct. 2019, pp. 1314–1324.
[17] J. Hu, L. Shen, and G. Sun, ‘‘Squeeze-and-excitation networks,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 7132–7141.
[18] Y. H. Shao, D. Zhang, and H. Y. Chu, ‘‘A review of YOLO object detection based on deep learning,’’ J. Electron. Inf. Technol., vol. 44, no. 10, pp. 3697–3708, Oct. 2022, doi: 10.11999/JEIT210790.
[19] H. Rezatofighi, N. Tsoi, J. Gwak, A. Sadeghian, I. Reid, and S. Savarese, ‘‘Generalized intersection over union: A metric and a loss for bounding box regression,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, pp. 658–666.
[20] A. Neubeck and L. Van Gool, ‘‘Efficient non-maximum suppression,’’ in Proc. 18th Int. Conf. Pattern Recognit. (ICPR), Aug. 2006, pp. 850–855.
[21] S. Liu, L. Qi, H. Qin, J. Shi, and J. Jia, ‘‘Path aggregation network for instance segmentation,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 8759–8768.
[22] S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, ‘‘CBAM: Convolutional block attention module,’’ in Proc. Eur. Conf. Comput. Vis. (ECCV), Sep. 2018, pp. 3–19.