This review surveys recent methods for vehicle detection, tracking, and speed estimation using monocular video, stereo vision, aerial imagery, LiDAR, and hardware-assisted optical sensing. Detection backbones such as YOLO, SSD, and Faster R-CNN are examined alongside tracking algorithms including SORT, DeepSORT, and ByteTrack. Monocular homography, stereo depth estimation, LiDAR-based tracking, and optical modulation approaches are analyzed in terms of accuracy, robustness, and deployment feasibility. The paper summarizes datasets, evaluation metrics, and research trends, while identifying key challenges such as weather robustness, lack of standardized speed ground truth, and real-time sensor fusion. Recommendations for future intelligent transportation systems are also discussed.
Introduction
Accurate vehicle detection and speed estimation are key components of Intelligent Transportation Systems (ITS), supporting traffic monitoring, congestion management, accident prevention, and smart city development. Vision-based methods using surveillance cameras and aerial platforms are widely adopted due to their scalability and cost-effectiveness compared to traditional sensor-based systems. Recent advances in deep learning detectors (such as YOLOv8, SSD, and Faster R-CNN) and tracking algorithms (SORT, DeepSORT, ByteTrack) have significantly improved detection accuracy and enabled reliable speed estimation from video streams. However, challenges such as perspective distortion, lighting variation, occlusion, and adverse weather persist.
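As a concrete illustration of the detection-plus-tracking stage, the sketch below associates per-frame detections over time with a greedy IoU matcher. It is a deliberately simplified stand-in for SORT, DeepSORT, or ByteTrack (no Kalman prediction, appearance embedding, or low-score recovery), and the detection boxes are assumed to come from any of the detectors named above.

```python
# Minimal greedy IoU tracker: a simplified stand-in for SORT-style association.
# Detections are assumed to be (x1, y1, x2, y2) boxes from any detector.
from itertools import count

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

class GreedyIoUTracker:
    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}        # track_id -> most recent box
        self._ids = count()

    def update(self, detections):
        """Match new detections to existing tracks; return {track_id: box}."""
        assigned = {}
        unmatched = list(detections)
        for tid, prev_box in list(self.tracks.items()):
            if not unmatched:
                break
            best = max(unmatched, key=lambda d: iou(prev_box, d))
            if iou(prev_box, best) >= self.iou_threshold:
                assigned[tid] = best
                unmatched.remove(best)
        for det in unmatched:   # unmatched detections start new tracks
            assigned[next(self._ids)] = det
        self.tracks = assigned  # tracks with no match are dropped immediately
        return assigned
```

A production tracker adds motion prediction and Hungarian assignment, but the association step above is the core idea shared by the SORT family.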
This review categorizes existing approaches into five groups:
Monocular video-based methods, which combine object detection and tracking with camera calibration or homography; they are computationally efficient but sensitive to calibration errors (a homography-based sketch follows this list).
Stereo vision systems, which estimate depth using dual cameras and provide higher accuracy, though they require precise alignment and calibration (a disparity-based sketch also follows this list).
LiDAR-based systems, which offer highly accurate 3D tracking and robust performance under varying lighting, but are expensive and less scalable.
Optical and hardware-assisted techniques, which directly encode speed information into image patterns and improve robustness but need specialized hardware.
Aerial and UAV-based systems, which enable wide-area traffic analysis but are affected by altitude changes and motion blur.
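To make the monocular, homography-based category concrete, the sketch below maps tracked image points onto the road plane with a four-point homography and converts the per-frame displacement into speed. The point correspondences, frame rate, and track positions are illustrative assumptions, not values from any cited work.

```python
# Homography-based speed estimation sketch (monocular camera, flat road plane).
# The point correspondences and the track below are illustrative placeholders.
import numpy as np
import cv2

# Four image points (pixels) and their road-plane coordinates (metres),
# e.g. measured lane markings; these values are assumptions for the sketch.
image_pts = np.float32([[420, 710], [870, 705], [760, 420], [530, 425]])
world_pts = np.float32([[0.0, 0.0], [3.5, 0.0], [3.5, 30.0], [0.0, 30.0]])
H = cv2.getPerspectiveTransform(image_pts, world_pts)

def to_road_plane(points_px):
    """Project pixel coordinates onto the road plane (metres)."""
    pts = np.float32(points_px).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(pts, H).reshape(-1, 2)

def estimate_speed_kmh(track_px, fps):
    """Average speed of one track given its per-frame pixel positions."""
    road = to_road_plane(track_px)
    dist_m = np.sum(np.linalg.norm(np.diff(road, axis=0), axis=1))
    elapsed_s = (len(track_px) - 1) / fps
    return 3.6 * dist_m / elapsed_s

# Example: bottom-centre of a vehicle's bounding box over 5 frames at 25 fps.
track = [(640, 700), (642, 670), (645, 640), (648, 612), (651, 585)]
print(f"estimated speed: {estimate_speed_kmh(track, fps=25):.1f} km/h")
```

Calibration error enters through H, which is why monocular accuracy depends so strongly on how well the reference points are surveyed.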
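For the stereo category, depth follows from disparity as Z = f·B/d, and speed follows from the change in recovered 3D position between frames. The focal length, baseline, and pixel measurements below are again placeholder assumptions for illustration.

```python
# Stereo depth-from-disparity sketch: Z = f * B / d, speed from 3D displacement.
# Focal length, baseline, and the two measurements are placeholder values.
import numpy as np

FOCAL_PX = 1400.0       # focal length in pixels (assumed)
BASELINE_M = 0.30       # distance between the two cameras in metres (assumed)
CX, CY = 960.0, 540.0   # principal point (assumed image centre)

def backproject(u, v, disparity_px):
    """Recover a 3D point (metres) from pixel position and disparity."""
    z = FOCAL_PX * BASELINE_M / disparity_px
    x = (u - CX) * z / FOCAL_PX
    y = (v - CY) * z / FOCAL_PX
    return np.array([x, y, z])

# The same vehicle observed in two frames, 0.2 s apart (placeholder values).
p1 = backproject(u=980.0, v=560.0, disparity_px=21.0)
p2 = backproject(u=985.0, v=558.0, disparity_px=18.5)
speed_kmh = 3.6 * np.linalg.norm(p2 - p1) / 0.2
print(f"estimated speed: {speed_kmh:.1f} km/h")
```

The dependence of Z on 1/d is why stereo systems need precise alignment: small disparity errors translate into large depth (and hence speed) errors at long range.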
Conclusion
This review highlights the evolution of vehicle detection and speed estimation from traditional vision-based pipelines to modern deep learning and multimodal systems. While LiDAR and stereo vision deliver superior accuracy, monocular and aerial methods remain attractive for scalable deployment. Future research should focus on unified detection–tracking–speed estimation pipelines, weather-adaptive models, and real-time multimodal sensor fusion for smart transportation systems.