Combining deep learning with computer vision has enabled significant advances in real-time object detection. Leveraging the YOLOv8 architecture, the proposed system efficiently detects and tracks objects across multiple input sources, including static images, live webcam feeds, and video streams. By integrating a Flask-based web application with the YOLOv8 model, the system delivers a secure, responsive, and user-friendly interface for real-time detection. The model is trained on a custom dataset enhanced through extensive data augmentation to improve generalization across diverse environments. Accuracy and speed are evaluated using Mean Average Precision (mAP), precision, recall, and frames per second (FPS). The system achieves up to 75% mAP at an IoU threshold of 0.5 and real-time processing speeds of 25–30 FPS on GPU-enabled hardware. Performance degradation under low-light and high-occlusion conditions is mitigated through careful dataset preparation and architectural tuning. The system is scalable and adaptable, making it suitable for real-world applications such as surveillance, industrial monitoring, and accessibility tools.
Introduction
This study presents the development of a real-time object detection system combining the YOLOv8 deep learning model with a Flask-based web application. The system supports multiple input types—webcam feeds, image uploads, and video files—while incorporating secure user authentication to control access. Custom datasets are prepared with augmentation to improve detection robustness, and the YOLOv8 model is fine-tuned for optimal accuracy and speed.
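To make the integration concrete, the following is a minimal sketch of the image-upload path, assuming the ultralytics Python package and OpenCV; the route name, form field, and weights file (best.pt) are illustrative placeholders rather than the study's exact implementation.

import cv2
import numpy as np
from flask import Flask, request, jsonify
from ultralytics import YOLO

app = Flask(__name__)
model = YOLO("best.pt")  # fine-tuned YOLOv8 weights (assumed filename)

@app.route("/detect", methods=["POST"])
def detect():
    # Decode the uploaded image into a BGR array for inference.
    file = request.files["image"]
    img = cv2.imdecode(np.frombuffer(file.read(), np.uint8), cv2.IMREAD_COLOR)
    result = model(img)[0]
    # One record per detected box: class label, confidence, pixel coordinates.
    detections = [
        {
            "label": result.names[int(cls)],
            "confidence": float(conf),
            "box": [float(v) for v in xyxy],
        }
        for xyxy, conf, cls in zip(
            result.boxes.xyxy, result.boxes.conf, result.boxes.cls
        )
    ]
    return jsonify(detections)

In the deployed system, a route of this kind would sit behind the authentication layer described above and return either JSON or an annotated image, depending on the front end's needs.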
The system achieves a mean average precision (mAP) of about 75% at IoU 0.5, running at 25–30 FPS on a GPU (NVIDIA GTX 1660 SUPER) and 5–7 FPS on CPU, demonstrating practical real-time performance. It performs better on still images than on live video streams, with some accuracy challenges in low-light or heavily obstructed scenes. The web interface offers a user-friendly experience with clear visual feedback and optional text-to-speech for accessibility.
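The GPU/CPU throughput gap can be checked with a rough benchmark such as the sketch below, which assumes the ultralytics package and OpenCV; the weights and video paths are placeholders, and the device is selected through the predict call's device argument.

import time

import cv2
from ultralytics import YOLO

def measure_fps(weights, video, device, warmup=5):
    """Run inference frame by frame and return the average FPS after warm-up."""
    model = YOLO(weights)
    cap = cv2.VideoCapture(video)
    frames, start = 0, None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        model(frame, device=device, verbose=False)  # inference only, no drawing
        frames += 1
        if frames == warmup:
            start = time.perf_counter()  # exclude model warm-up from the timing
    cap.release()
    timed = frames - warmup
    return timed / (time.perf_counter() - start) if start and timed > 0 else 0.0

# Hypothetical comparison on the same clip:
# measure_fps("best.pt", "sample.mp4", device=0)      # first CUDA GPU
# measure_fps("best.pt", "sample.mp4", device="cpu")  # CPU only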
Key challenges include managing computational demands, ensuring dataset diversity, and maintaining performance across devices and browsers. Limitations relate to hardware requirements, dataset representativeness, and scalability for multiple users. Despite these constraints, the system provides a robust, secure, and accessible platform for practical real-time object detection, laying the groundwork for further improvements such as greater dataset variety and better low-light detection.
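As one illustration of the multi-user limitation, a single shared model instance can at least be kept stable by serializing inference across requests; the snippet below is a simple sketch of that idea (with assumed names), not the project's actual backend design.

import threading

from ultralytics import YOLO

_model = YOLO("best.pt")        # one shared model instance (assumed weights file)
_infer_lock = threading.Lock()  # guards the model/GPU against concurrent calls

def predict_safely(image):
    # Serialize inference: one request runs at a time, others wait briefly.
    with _infer_lock:
        return _model(image, verbose=False)[0]

Genuine multi-user scalability would still call for request queueing or multiple worker processes, which is why backend optimization remains future work.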
Conclusion
This study successfully designed and evaluated a real-time object detection system using the YOLOv8 model, integrated into a Flask-based web application. The model, trained on a custom dataset, showed promising results, achieving a Mean Average Precision (mAP) of 75% at an IoU of 0.5. It performed best on static images, while slightly lower performance was observed on real-time webcam streams, primarily due to challenges such as motion blur and inconsistent lighting. Despite these challenges, the system maintained smooth performance at 25–30 FPS with GPU support, demonstrating its capability for real-time applications.
In contrast, CPU-only inference at roughly 5–7 FPS highlighted the performance gap, underlining the critical role of hardware acceleration for real-time use.
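The reported figures can, in principle, be reproduced through the ultralytics validation API, as in the hedged sketch below; data.yaml stands in for the custom dataset configuration and best.pt for the fine-tuned weights.

from ultralytics import YOLO

model = YOLO("best.pt")                # assumed fine-tuned YOLOv8 weights
metrics = model.val(data="data.yaml")  # evaluates on the dataset's validation split

print(f"mAP@0.5       : {metrics.box.map50:.3f}")  # ~0.75 reported in this study
print(f"mAP@0.5:0.95  : {metrics.box.map:.3f}")
print(f"mean precision: {metrics.box.mp:.3f}")
print(f"mean recall   : {metrics.box.mr:.3f}")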
The Flask web interface provided an intuitive and secure platform for users to interact with the system across different input types—webcam, image uploads, and video files—demonstrating its flexibility for diverse practical use cases. The outcomes of this study underscore the importance of a robust and diverse dataset, as well as the need to address edge-case challenges like low-light and occlusion scenarios. This project also illustrates the potential of combining modern deep learning models with lightweight web frameworks to build accessible and scalable computer vision solutions. Looking ahead, improvements such as model variant selection (e.g., YOLOv8n for speed or YOLOv8x for higher accuracy), more extensive dataset expansion, and backend optimizations for concurrent users could further enhance system performance and adaptability. Overall, the integration of deep learning with web-based interfaces marks a meaningful step toward making real-time object detection systems more practical, scalable, and accessible for real-world deployment.
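For completeness, the live-webcam path can be sketched as an MJPEG stream served by Flask, as below; the route, camera index, and weights file are assumptions, and a lighter variant such as YOLOv8n can be substituted where frame rate matters more than accuracy.

import cv2
from flask import Flask, Response
from ultralytics import YOLO

app = Flask(__name__)
model = YOLO("yolov8n.pt")  # lighter variant assumed here for higher frame rate

def frame_generator():
    cap = cv2.VideoCapture(0)  # default webcam
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            annotated = model(frame, verbose=False)[0].plot()  # draw boxes and labels
            ok, jpeg = cv2.imencode(".jpg", annotated)
            if not ok:
                continue
            yield (b"--frame\r\n"
                   b"Content-Type: image/jpeg\r\n\r\n" + jpeg.tobytes() + b"\r\n")
    finally:
        cap.release()

@app.route("/webcam")
def webcam():
    return Response(frame_generator(),
                    mimetype="multipart/x-mixed-replace; boundary=frame")

A generator-based route of this kind keeps the browser-facing interface simple while all detection work stays on the server, consistent with the lightweight web-framework approach taken in this study.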