This project is an AI-driven object detection system that employs the YOLO (You Only Look Once) deep learning framework to recognize and track objects in real time from both images and video. It comprises several Python scripts, including object_detection.py, real_time.py, and video_with_distance.py, which handle detection in photographs, detection in live video feeds, and estimation of object distances. Pre-trained YOLO model weights (yolov8m.pt, yolov8n.pt) are provided for rapid and effective object recognition, together with sample videos (33.mp4, 34.mp4) and images (bus.jpg, output_detected.jpg) for evaluating the detection model. A COCO dataset file (coco.txt) is included, indicating that the model has been trained to identify a diverse range of common objects, and additional scripts (conv.py, test4.py) appear to support data conversion, testing, or extensions of the system's capabilities. The project has significant potential for real-time surveillance, autonomous vehicles, intelligent traffic management, security systems, and AI-driven automation, improving the efficiency of object detection and tracking across a variety of practical scenarios.
Introduction
Project Summary
This project focuses on developing a real-time object detection system using the YOLOv8 (You Only Look Once) deep learning model. The goal is to accurately detect and track objects in both images and live video streams, while also estimating object distances, supporting applications like autonomous vehicles, security surveillance, and smart traffic systems.
Dataset
The COCO (Common Objects in Context) dataset is used, known for its rich variety of real-world annotated images. It supports tasks like object detection, segmentation, and captioning. The dataset’s realistic complexity helps train models to perform well in practical environments.
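The repository's coco.txt file (noted above) appears to hold the class labels that the pre-trained weights predict. A minimal sketch of loading these labels, assuming one class name per line in model-index order (the project's exact file format is not shown here):

```python
# Load COCO class labels from the repository's coco.txt.
# Assumption: one class name per line, ordered to match the model's class indices.
def load_class_names(path: str = "coco.txt") -> list[str]:
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

class_names = load_class_names()
print(f"Loaded {len(class_names)} classes, e.g. {class_names[:5]}")
```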
Literature Review
Studies from 2021–2025 highlight ongoing advances and open challenges in autonomous vehicle technology, particularly:
Object detection in adverse conditions (e.g., fog, rain)
Sensor fusion (LiDAR, radar, cameras)
Deep learning for localization
Challenges like real-time processing, cost, adversarial attacks, and ethical concerns
These findings underscore the importance of AI robustness, safety, and scalability.
Methodology
Data Acquisition: Uses pre-trained YOLOv8 models (yolov8m.pt, yolov8n.pt) with image and video inputs.
Preprocessing: Input is resized and normalized; video is processed frame-by-frame using OpenCV.
Model Implementation: Multiple Python scripts enable static-image, real-time, and distance-aware object detection (a minimal pipeline sketch follows this list).
Post-processing: Adds bounding boxes and filters predictions by confidence to improve clarity.
Evaluation & Optimization: Performance is measured using precision, recall, and mAP (mean average precision); model size is balanced against inference speed.
Deployment: Designed for real-time applications on both standard and edge devices (e.g., Raspberry Pi, Jetson), with future plans for web/mobile integration.
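A minimal sketch of the detection pipeline described above, assuming the ultralytics YOLOv8 Python API and OpenCV for frame handling; the project's own scripts (e.g. real_time.py) may differ in detail:

```python
# Illustrative real-time detection loop: load pre-trained YOLOv8 weights, read video
# frame-by-frame with OpenCV, filter predictions by confidence, and draw bounding boxes.
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")        # smaller/faster variant; yolov8m.pt trades speed for accuracy
cap = cv2.VideoCapture("33.mp4")  # sample video from the project, or 0 for a live webcam

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break

    # YOLOv8 resizes and normalizes internally; conf discards low-confidence predictions.
    results = model(frame, conf=0.4, verbose=False)

    for box in results[0].boxes:
        x1, y1, x2, y2 = map(int, box.xyxy[0])
        label = f"{model.names[int(box.cls[0])]} {float(box.conf[0]):.2f}"
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, label, (x1, y1 - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

    cv2.imshow("YOLOv8 detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```

For the evaluation step, the ultralytics package also provides a model.val() utility that reports precision, recall, and mAP against a labelled dataset.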
Results & Discussion
Tests show high accuracy in both image and video scenarios. For instance:
In "bus2.jpg", five people were detected (confidence: 0.39–0.91).
Two buses and a truck were identified, but lower scores for some objects (e.g., 0.27) suggest occasional misclassification or occlusion effects.
Real-time detection was smooth and efficient, with added distance estimation enhancing spatial understanding.
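The report does not detail how video_with_distance.py computes distance; one common approach combines each detection's bounding-box height with a pinhole-camera approximation. The sketch below is purely illustrative, and the focal length and real-world object heights are assumed values rather than project constants:

```python
# Illustrative distance estimation from bounding-box height (pinhole-camera approximation):
#   distance = focal_length_px * real_height_m / box_height_px
# FOCAL_LENGTH_PX and REAL_HEIGHTS_M are assumed demonstration values, not project constants.
FOCAL_LENGTH_PX = 700.0
REAL_HEIGHTS_M = {"person": 1.7, "bus": 3.2, "truck": 3.5}

def estimate_distance_m(class_name: str, box_height_px: float) -> float | None:
    """Return an approximate distance in metres, or None for unknown classes."""
    real_height = REAL_HEIGHTS_M.get(class_name)
    if real_height is None or box_height_px <= 0:
        return None
    return FOCAL_LENGTH_PX * real_height / box_height_px

# Example: a detected person whose bounding box is 220 px tall.
print(estimate_distance_m("person", 220))  # ≈ 5.4 m under these assumptions
```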
Conclusion
The real-time object detection system built on the YOLOv8 model provides a practical solution for recognizing and tracking objects in images and live video while estimating their distances. By combining pre-trained YOLOv8 weights, frame-by-frame processing with OpenCV, and confidence-filtered post-processing, the system delivers fast and reliable detections suitable for real-time use. These capabilities support applications such as autonomous vehicles, security surveillance, intelligent traffic management, and AI-driven automation, where timely and accurate perception of the environment is essential. Future enhancements could include web and mobile integration, deployment on edge devices such as the Raspberry Pi and Jetson platforms, IoT connectivity, and cloud-based data storage to further improve efficiency and scalability. This project underscores the potential of deep learning and computer vision to improve safety and efficiency in practical, real-world scenarios.
References
[1] Appiah, Emmanuel Owusu, and Solomon Mensah. "Object detection in adverse weather condition for autonomous vehicles." Multimedia Tools and Applications 83, no. 9 (2024): 28235-28261.
[2] Yeong, De Jong, Gustavo Velasco-Hernandez, John Barry, and Joseph Walsh. "Sensor and sensor fusion technology in autonomous vehicles: A review." Sensors 21, no. 6 (2021): 2140.
[3] Sarasa-Cabezuelo, Antonio. "Prediction of rainfall in Australia using machine learning." Information 13, no. 4 (2022): 163.
[4] Wason, Ritika, Parul Arora, Vishal Jain, Devansh Arora, and M. N. Hoda. "Exploring the Convergence of Artificial Intelligence and Mechatronics in Autonomous Driving." Computational Intelligent Techniques in Mechatronics (2024): 297-316.
[5] Ghintab, Shahad S., and Mohammed Y. Hassan. "Localization for self-driving vehicles based on deep learning networks and RGB cameras." International Journal of Advanced Technology and Engineering Exploration 10, no. 105 (2023): 1016.
[6] Hasanujjaman, Muhammad, Mostafa Zaman Chowdhury, and Yeong Min Jang. "Sensor fusion in autonomous vehicle with traffic surveillance camera system: detection, localization, and AI networking." Sensors 23, no. 6 (2023): 3335.
[7] Hacohen, Shlomi, Oded Medina, and Shraga Shoval. "Autonomous driving: A survey of technological gaps using Google Scholar and Web of Science trend analysis." IEEE Transactions on Intelligent Transportation Systems 23, no. 11 (2022): 21241-21258.
[8] Mahima, K. T. Yasas, Asanka G. Perera, Sreenatha Anavatti, and Matt Garratt. "Toward robust 3D perception for autonomous vehicles: A review of adversarial attacks and countermeasures." IEEE Transactions on Intelligent Transportation Systems (2024).
[9] Huang, Kun, Yifu Wang, Si'ao Zhang, Zhirui Wang, Zhanpeng Ouyang, Zhenghua Yu, and Laurent Kneip. "OpenGV 2.0: Motion prior-assisted calibration and SLAM with vehicle-mounted surround-view systems." arXiv preprint arXiv:2503.03230 (2025).
[10] Yan, Shen, Maojun Zhang, Yang Peng, Yu Liu, and Hanlin Tan. "AgentI2P: Optimizing image-to-point cloud registration via behaviour cloning and reinforcement learning." Remote Sensing 14, no. 24 (2022): 6301.