This study introduces an advanced assistive technology that enables precise distance calculation and real-time object identification for individuals with vision impairments. Using a pretrained MobileNetV3 model, the system can recognize up to 91 distinct object classes simultaneously, providing thorough awareness of the user's surroundings. In addition to identifying particular items, the system estimates the distances between objects in the field of view and between the user and those objects. It applies computer vision algorithms for reliable object detection and depth estimation, mitigating the difficulties encountered by people with visual impairments. An audio feedback function improves the user experience by conveying distance information through a carefully designed aural interface. By providing accurate and timely spatial information, the technology seeks to enhance the ability of visually impaired people to navigate and to move independently. To ensure successful integration into the everyday lives of visually impaired users, the research also addresses the difficulties in designing such systems, including the need for real-time processing, environmental adaptation, and user-friendly auditory feedback.
Introduction
This paper describes the development of an assistive technology system for visually impaired individuals using deep learning and computer vision. The system employs a pretrained MobileNetV3 model to perform real-time multi-object detection (recognizing up to 91 object classes) and distance estimation. It aims to enhance spatial awareness by identifying objects, calculating their distances, and conveying this information through an easy-to-understand auditory feedback system, thereby supporting independent navigation.
The research builds on prior advances in object detection, distance measurement, and assistive devices, addressing limitations in accuracy, real-time performance, and simultaneous multi-object recognition. The dataset covers diverse indoor and outdoor objects relevant to visually impaired users, with particular attention to person-centric items for better contextual awareness.
The MobileNetV3 model was selected for its balance of speed, accuracy, and efficiency, making it well suited to real-time processing on mobile devices. Distance measurement relies on the camera's focal length and an object's apparent size in the image, and objects are additionally tracked across frames using centroid distances. The system continuously processes live video input, providing users with spoken information about nearby objects and their spatial relationships.
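The focal-length-based distance measurement and centroid tracking described above can be sketched as follows. This is a minimal illustration of the standard pinhole-camera relations, not the paper's exact implementation; the function names and the calibration values (object width, reference distance, pixel widths) are assumptions for the example.

```python
# Monocular distance estimation via a calibrated focal length, plus the
# centroid distance used to associate detections across frames.

def focal_length(calib_distance_cm: float, known_width_cm: float,
                 pixel_width: float) -> float:
    """Calibrate from one reference image: F = (P * D) / W, where P is the
    object's width in pixels, D its known distance, W its physical width."""
    return (pixel_width * calib_distance_cm) / known_width_cm

def distance_to_camera(known_width_cm: float, focal: float,
                       pixel_width: float) -> float:
    """Invert the pinhole relation: D = (W * F) / P."""
    return (known_width_cm * focal) / pixel_width

def centroid_distance(a: tuple, b: tuple) -> float:
    """Euclidean distance between two bounding-box centroids (cx, cy),
    usable both for frame-to-frame tracking and for reporting the
    spacing between detected objects."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

# Example calibration: an object 20 cm wide, photographed at 100 cm,
# appears 200 px wide, giving F = 1000 px.
F = focal_length(100.0, 20.0, 200.0)
# The same object later measured at 100 px wide is then 200 cm away.
d = distance_to_camera(20.0, F, 100.0)
```

Once `F` is computed for the camera, only the detected bounding-box width is needed per frame, which keeps the per-frame cost negligible next to the detector itself.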
Ultimately, the system aims to empower visually impaired users with greater environmental understanding and mobility through an accessible, real-time, audio-guided interface.
Conclusion
In summary, this project represents a noteworthy development in assistive technology, designed to improve the navigational skills and spatial awareness of people who are visually impaired. The system gives users extensive information about their surroundings by integrating real-time object identification, distance calculation based on a computed focal length, and a dynamic distance-level display. Text-to-speech technology adds an essential layer of accessibility, allowing the system to provide audio feedback that dynamically communicates recognized objects and their distances. Real-time voice responses, object tracking, and continuous monitoring support a user-centric design that ensures visually impaired people receive timely and relevant information. The project's flexibility, demonstrated by the use of the pyttsx3 library and the MobileNetV3 model, reflects a commitment to tackling the real-world challenges that visually impaired people face.
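The audio feedback layer can be sketched as below. The message wording and the helper names are assumptions for illustration; the `pyttsx3` calls (`init`, `say`, `runAndWait`) follow that library's documented API. The fallback to `print` is a defensive choice for machines without a speech driver.

```python
# Minimal sketch of the text-to-speech feedback step: format detections
# into one sentence, then voice it with pyttsx3 if available.

def compose_announcement(detections: list) -> str:
    """Turn (label, distance_cm) pairs into one spoken sentence."""
    if not detections:
        return "No objects detected."
    parts = [f"{label} at {distance_cm / 100:.1f} meters"
             for label, distance_cm in detections]
    return "Detected " + ", ".join(parts) + "."

def speak(text: str) -> None:
    """Voice the message with pyttsx3; fall back to printing when no
    speech driver is available (e.g. on a headless machine)."""
    try:
        import pyttsx3
        engine = pyttsx3.init()
        engine.say(text)
        engine.runAndWait()
    except Exception:
        print(text)

msg = compose_announcement([("person", 150.0), ("chair", 80.0)])
# In the live loop this would be followed by: speak(msg)
```

Batching all detections of a frame into a single sentence, rather than speaking each object separately, keeps the audio channel from flooding the user during continuous monitoring.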
All things considered, this project not only demonstrates the technological capabilities of deep learning and computer vision, but also highlights their potential to improve the lives of people who are visually impaired by giving them more inclusive and accessible ways to perceive and interact with their surroundings.