Reading printed text is a persistent challenge for visually impaired individuals who want to work independently. This project addresses that challenge with a system that captures printed English text and converts it into speech. When the user places a document under the camera, the system captures an image, processes it using Optical Character Recognition (OCR), and reads the content aloud through a Bluetooth speaker, so that blind users can access printed material without human assistance. The system is built on a Raspberry Pi and is affordable, portable, and user-friendly. By combining OCR with speech synthesis, it makes printed information more accessible. The system uses image pre-processing with OpenCV, text extraction with Pytesseract, and speech synthesis with gTTS. Image pre-processing and text extraction run locally on the Raspberry Pi without an internet connection; gTTS, however, requests its audio from Google's online text-to-speech service, so the speech stage needs network access unless an offline engine is substituted. Prototype tests show OCR accuracy above 95% in normal indoor lighting and a text-to-speech response time under 3 seconds, indicating that the system is practical for real-world assistive applications.
Introduction
To address the difficulty that visually impaired readers face with printed text, the paper proposes a portable assistive device built using a Raspberry Pi 3 B+, a USB camera, Optical Character Recognition (OCR), and text-to-speech (TTS). The system captures printed text, processes it locally using Tesseract OCR, converts it into speech using gTTS, and plays it through a Bluetooth speaker. Because image capture and OCR are performed on-device, the system avoids cloud OCR services, which improves privacy and reduces cost, and a simple push-button interface increases accessibility.
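The capture, OCR, and speech steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the file names `page.jpg` and `speech.mp3`, the camera index, and the `tidy_text` clean-up step are all hypothetical. The third-party libraries (OpenCV, pytesseract with Pillow, and gTTS) are imported inside the functions that use them, so the text clean-up helper can be tried on its own.

```python
import re


def capture_image(path="page.jpg", camera_index=0):
    """Grab one frame from the USB camera and save it (assumed file name)."""
    import cv2  # OpenCV, assumed to be installed on the Pi

    camera = cv2.VideoCapture(camera_index)
    try:
        ok, frame = camera.read()
    finally:
        camera.release()
    if not ok:
        raise RuntimeError("camera did not return a frame")
    cv2.imwrite(path, frame)
    return path


def recognize_text(image_path, lang="eng"):
    """Run Tesseract on the saved image through the pytesseract wrapper."""
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(image_path), lang=lang)


def tidy_text(raw):
    """Hypothetical clean-up before speech: re-join words split by a
    line-end hyphen and collapse whitespace. Genuinely hyphenated words
    that happen to break at a line end are joined too, which is a known
    limitation of this simple rule."""
    joined = re.sub(r"-\s*\n\s*", "", raw)
    return re.sub(r"\s+", " ", joined).strip()


def synthesize_speech(text, out_path="speech.mp3", lang="en"):
    """Convert text to an MP3 file with gTTS.
    Note: gTTS fetches audio from Google's online service, so this call
    needs a network connection."""
    from gtts import gTTS

    gTTS(text=text, lang=lang).save(out_path)
    return out_path
```

Playing the resulting file through the Bluetooth speaker depends on the audio player installed on the device and is left out of the sketch.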
The literature review highlights prior research in OCR-based assistive systems, noting progress in embedded and cloud-based solutions but also limitations such as high cost, internet dependence, or reduced accuracy under poor conditions. Commercial tools like OrCam and Seeing AI offer advanced features but are financially inaccessible for many users. Open-source tools like Tesseract are widely used for affordable implementations.
The system architecture is structured as a pipeline with three main modules: image capture (SPECS), OCR processing, and text-to-speech conversion. The process begins when the user presses the push button, which triggers the camera to capture a document image; the image is then processed to extract text, and the text is converted into audio output.
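One way to picture the three-module pipeline is as three interchangeable stages that run in sequence each time the button is pressed. The sketch below is an assumption-laden illustration: `ReadingPipeline`, `run_once`, `attach_button`, the spoken fallback message, and GPIO pin 17 are hypothetical names and values, not taken from the paper. The stages are passed in as plain callables so that the control flow can be exercised without a camera or speaker attached.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReadingPipeline:
    capture: Callable[[], str]        # returns the path of the captured image
    recognize: Callable[[str], str]   # image path -> extracted text
    speak: Callable[[str], None]      # text -> audio output
    # Assumed behaviour: tell the user when no text was found.
    empty_message: str = "No text detected. Please reposition the page."

    def run_once(self) -> str:
        """Run capture, OCR, and speech once; return the extracted text."""
        image_path = self.capture()
        text = self.recognize(image_path).strip()
        self.speak(text if text else self.empty_message)
        return text


def attach_button(pipeline: ReadingPipeline, pin: int = 17):
    """Run the pipeline on every press of a push button.
    Assumes the gpiozero library and a button wired between the given
    GPIO pin and ground; the pin number is a placeholder."""
    from gpiozero import Button

    button = Button(pin)
    button.when_pressed = pipeline.run_once
    return button  # keep a reference so the callback stays registered
```

On the device, the script would keep running after `attach_button` (for example by calling `signal.pause()`), so that each press triggers one capture-and-read cycle.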
Conclusion
This paper introduces a Raspberry Pi-based reading assistant that helps visually impaired individuals by converting printed text into speech through Optical Character Recognition (OCR) and Text-to-Speech (TTS) synthesis. The prototype, built from readily available components for under INR 5,000, achieved an OCR accuracy of over 95% in typical indoor lighting, with text-to-speech response times of less than 3 seconds, which demonstrates the practical effectiveness of the system. With on-device OCR, a simple push-button interface, and Bluetooth audio output, the solution is an accessible and affordable assistive technology for the visually impaired community in India and other developing regions.
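The text does not state how the accuracy and response-time figures were measured. A common convention, assumed here purely for illustration, is character-level accuracy: one minus the edit distance between the OCR output and a reference transcript, divided by the length of the reference. Under that convention, a 100-character reference recognized with 4 character errors scores 1 - 4/100 = 0.96. The function names below are hypothetical.

```python
import time


def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed metric: 1 - edit_distance / len(reference), floored at 0."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    errors = edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))


def timed(stage, *args):
    """Run one pipeline stage and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = stage(*args)
    return result, time.perf_counter() - start
```

Wrapping the speech stage in `timed` and averaging over several test pages would give a response-time figure comparable to the one reported.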
Several extensions are possible for this project: (i) Language Expansion: incorporating regional Indian languages such as Telugu, Hindi, and Tamil through multi-language Tesseract models and gTTS language settings. (ii) Enhanced OCR Accuracy: using deep learning-based OCR models such as EasyOCR and PaddleOCR to improve recognition in difficult lighting and with handwritten text. (iii) Voice Customization: offering natural, adaptive speech output with neural TTS engines such as Coqui TTS for a more human-like voice. (iv) Additional Assistive Features: implementing navigation aids using GPS, object recognition with a camera and YOLO, and context awareness through scene-description AI. (v) Large-Scale Deployment: establishing accessible reading stations for permanent use in schools, public libraries, government offices, and healthcare centers.
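For the language expansion in item (i), Tesseract and gTTS use different language codes, so a small lookup table is one way to keep the two stages in step. The sketch below is a hypothetical helper, not part of the described prototype; the codes shown are the standard Tesseract trained-data names and gTTS language tags for the languages named above, and each Tesseract language model must be installed separately.

```python
# Language name -> (Tesseract trained-data code, gTTS language tag)
LANGUAGE_CODES = {
    "english": ("eng", "en"),
    "hindi": ("hin", "hi"),
    "tamil": ("tam", "ta"),
    "telugu": ("tel", "te"),
}


def language_settings(name: str) -> tuple:
    """Return the (Tesseract, gTTS) code pair for a language name."""
    key = name.strip().lower()
    if key not in LANGUAGE_CODES:
        raise ValueError(f"unsupported language: {name!r}")
    return LANGUAGE_CODES[key]


def tesseract_lang_argument(names) -> str:
    """Tesseract accepts several languages joined by '+', e.g. 'eng+hin'."""
    return "+".join(language_settings(name)[0] for name in names)
```

The first element of the pair would be passed as the `lang` argument of the OCR call and the second as the `lang` argument of the speech call; a page that mixes scripts can use the combined form, such as `tesseract_lang_argument(["English", "Telugu"])`.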
References
[1] World Health Organization (WHO), World Report on Vision, WHO Press, Geneva, Switzerland, 2019. [Online]. Available: https://www.who.int/publications/i/item/9789241516570
[2] OrCam Technologies Ltd., "OrCam MyEye 2: Assistive Device for the Visually Impaired," Jerusalem, Israel, 2023. [Online]. Available: https://www.orcam.com
[3] eSight Corporation, "eSight 4 Electronic Glasses for the Visually Impaired," Ottawa, Canada, 2023. [Online]. Available: https://esighteyewear.com
[4] Raspberry Pi Foundation, Raspberry Pi 3 Model B+ Official Documentation, Cambridge, UK, 2023. [Online]. Available: https://www.raspberrypi.com/documentation/
[5] Google LLC, "gTTS: Google Text-to-Speech Python Library," Version 2.x, 2023. [Online]. Available: https://pypi.org/project/gTTS/
[6] UNESCO, Assistive Technology for Inclusive Education: Principles, Policies and Practices, UNESCO, Paris, France, 2020.
[7] A. Praveen Kumar, R. Suresh, and M. Lakshmi, "Assistive technology for visually impaired using OCR and speech synthesis," in Proc. IEEE Int. Conf. Electron., Comput. Commun. Technol. (CONECCT), Bangalore, India, 2022, pp. 1–6.
[8] V. Ramesh and K. Sundaravadivelu, "Smart reading device for visually impaired using Raspberry Pi and cloud OCR," Int. J. Eng. Res. Technol. (IJERT), vol. 10, no. 3, pp. 245–250, 2021.
[9] P. Jayaraman, G. Suresh, and T. Kumar, "Low-cost reading aid for blind people using OCR and TTS on embedded Linux platforms," in Proc. Int. Conf. Comput. Commun. Informatics (ICCCI), Coimbatore, India, 2020, pp. 1–5. Springer, Singapore.
[10] UNESCO, "Assistive Technology for Inclusive Education," Education Sector, UNESCO, Paris, 2020. [Online]. Available: https://unesdoc.unesco.org
[11] N. Katzir, "OrCam MyEye: A comprehensive assistive device for the visually impaired," IEEE Pervasive Comput., vol. 18, no. 1, pp. 68–71, Jan.–Mar. 2019.
[12] Microsoft Corporation, "Seeing AI: Talking Camera App for the Blind Community," Microsoft, Redmond, WA, 2023. [Online]. Available: https://www.microsoft.com/en-us/ai/seeing-ai
[13] NV Access, "NVDA (NonVisual Desktop Access) Screen Reader: Open Source," Version 2023.x, 2023. [Online]. Available: https://www.nvaccess.org
[14] Envision Technologies B.V., "Envision AI: Reading and Navigation for the Blind," Delft, Netherlands, 2023. [Online]. Available: https://www.letsenvision.com
[15] R. Smith, "An overview of the Tesseract OCR engine," in Proc. 9th Int. Conf. Document Anal. Recognition (ICDAR), Curitiba, Brazil, 2007, pp. 629–633. [Online]. Available: https://github.com/tesseract-ocr/tesseract