Object detection has become a cornerstone of computer vision, with applications spanning from autonomous vehicles to surveillance systems. This paper presents a novel approach using YOLOv8 for real-time object detection in video streams, coupled with a method to calculate the total duration for which detected objects remain visible. We demonstrate the system's effectiveness using a custom dataset, highlighting its potential for real-world applications where object presence duration is crucial. Our results report both object counts and on-screen durations, indicating the viability of this approach for industrial use by brands seeking to measure how often, and for how long, their products appear.
Introduction
The paper presents an object detection and temporal analysis system that combines YOLOv8 with a custom duration-calculation method to measure how long specific objects appear in video streams. While object detection has advanced significantly, existing systems often lack meaningful temporal analysis. This research addresses that gap by automatically detecting objects and calculating their on-screen presence time, with a practical application in influencer marketing analytics.
The study is motivated by a food product manufacturer seeking objective measurement of product exposure in influencer videos, where products often appear briefly. To solve this, the authors created a custom dataset of food products and trained a YOLOv8-based model using transfer learning. The system detects products frame by frame, tracks them across consecutive frames, and computes their visibility duration using spatial tracking and frame-rate-based timing calculations. The final solution is implemented in Python and deployed as a web-based application.
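The frame-rate-based timing step described above can be sketched in a few lines. The function names and the flag-list representation below are illustrative assumptions, not the authors' actual implementation: given one presence flag per video frame and the stream's frame rate, appearance intervals and total visibility duration follow directly.

```python
def presence_intervals(flags, fps):
    """Convert per-frame presence flags into (start_s, end_s) intervals in seconds."""
    intervals, start = [], None
    for i, present in enumerate(flags):
        if present and start is None:
            start = i                       # object just appeared
        elif not present and start is not None:
            intervals.append((start / fps, i / fps))  # object just disappeared
            start = None
    if start is not None:                   # object still visible at end of video
        intervals.append((start / fps, len(flags) / fps))
    return intervals

def total_duration(flags, fps):
    """Total on-screen time in seconds across all appearance intervals."""
    return sum(end - start for start, end in presence_intervals(flags, fps))
```

For example, at 2 fps the flag sequence `[0, 1, 1, 1, 0, 0, 1, 1]` yields two intervals, `(0.5, 2.0)` and `(3.0, 4.0)`, for a total of 2.5 seconds.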
The methodology includes dataset creation, manual annotation, data augmentation, model training on cloud-based GPUs, and real-time video processing using OpenCV. A lightweight YOLOv8n architecture is used to balance detection accuracy and processing speed. Object tracking relies on center-point matching and Euclidean distance thresholds to maintain object identity across frames.
Experimental results demonstrate strong detection performance, achieving 85.7% precision, 83.2% recall, and a mean average precision (mAP) of 76.5%. The system successfully calculates product visibility duration in real-world influencer videos, even under challenging conditions such as partial occlusion and variable lighting. Although the model performs well in real-time scenarios, challenges remain, including dataset diversity, computational cost, product variant recognition, and privacy considerations.
Overall, the study shows that integrating object detection with temporal analysis provides valuable insights into product visibility in video content, offering a scalable and objective tool for marketing analytics and real-time video analysis applications.
Conclusion
This research effectively addresses the problem of recognizing and tracking a company's products in video content by providing a sound method for computing how long those products remain on screen. Using a custom dataset aligned with the company's product line and transfer learning from pre-trained models, the project achieved accurate detection results. The trained model's weights were then integrated into the final implementation, making product identification and duration measurement automatic, easier, and more dependable.
This work demonstrates the potential of combining custom datasets with pre-trained models to address real business challenges in marketing and product-placement analysis. Future versions may improve accuracy across a wider variety of scenarios, speed up processing, and extend the model to handle a broader range of video content formats. The approach provides a solid foundation for further developments in video content analysis and its exploitation within business and beyond.