Abstract
Communication is an essential requirement for human interaction, yet people with speech and hearing disabilities find it very difficult to convey their thoughts to those who do not know sign language. To address this problem, the project AI-Based Real-Time Sign Language Recognition was conceived with the objective of bridging this communication gap by developing a system that translates sign language gestures into speech and text, enabling effective and seamless interaction between deaf and mute individuals and the general population. The system is deployed on a Raspberry Pi 4 and takes advantage of the efficiency of TensorFlow Lite for AI-based gesture recognition and OpenCV for real-time image processing. Hand gestures are captured by a camera and processed by a trained AI model that detects and classifies individual signs. Each recognized sign is translated into text, which is then converted into natural spoken output through Google Text-to-Speech (TTS). In addition to speech, the recognized text is rendered on an external display, making the translated message readily available in a broad range of settings, including noisy environments where audio alone might prove inadequate. By combining affordability, portability, and real-time performance, the project offers a practical and effective solution that can significantly increase accessibility for the deaf and mute community, ultimately fostering inclusiveness in education, the workplace, and day-to-day social interactions.
Introduction
Communication barriers faced by people with hearing and speech impairments persist because sign language is not widely understood. Advances in artificial intelligence, especially deep learning and computer vision, now enable real-time translation of sign language into text and speech. Using convolutional neural networks (CNNs), recurrent neural networks (RNNs), and frameworks such as Mediapipe and TensorFlow Lite, modern systems can accurately detect hand shapes, movements, and facial cues, making AI-based sign language recognition a practical and inclusive solution for education, healthcare, workplaces, and public service environments.
A review of recent works shows major progress but also clear limitations: most systems rely on predefined datasets, static gestures, controlled environments, or specialized hardware like gloves. Accuracy is often high, but scalability, robustness, and real-world usability remain limited. Existing solutions struggle with dynamic gestures, lighting variation, hardware constraints, and generalization.
This paper identifies the need for a portable, affordable, real-time Indian Sign Language (ISL) recognition system that works without wearables, large GPUs, or controlled environments. To address this need, a Raspberry Pi–based AI system is proposed, combining Mediapipe for hand landmark extraction with a TensorFlow Lite CNN–LSTM model for gesture classification. The system captures gestures through a webcam, extracts 21 hand landmarks per frame, processes them with an optimized deep learning model, and outputs both on-screen text and speech using a TTS engine.
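As a rough illustration of the landmark-extraction stage, the sketch below captures webcam frames with OpenCV and uses Mediapipe's Hands solution to obtain the 21 (x, y, z) landmarks per frame, flattening them into a 63-value feature vector of the kind a downstream classifier would consume. Parameter values and variable names here are illustrative assumptions, not the authors' exact implementation.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

cap = cv2.VideoCapture(0)  # default webcam
with mp_hands.Hands(max_num_hands=1,
                    min_detection_confidence=0.5,
                    min_tracking_confidence=0.5) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # Mediapipe expects RGB input; OpenCV delivers BGR frames
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            hand = results.multi_hand_landmarks[0]
            # Flatten 21 landmarks x (x, y, z) into a 63-value feature vector
            features = [c for lm in hand.landmark for c in (lm.x, lm.y, lm.z)]
            # In the full system this vector would be buffered frame by frame
            # and passed to the gesture classifier.
        cv2.imshow("camera", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```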
Implementation involves four stages: dataset creation, landmark extraction, model training, and real-time prediction. Experimental evaluation on five ISL gestures achieved 96.44% accuracy, demonstrating reliable performance with low latency and smooth real-time recognition. The results confirm that the system is cost-effective, portable, and practical for real-world use, although it remains sensitive to lighting and occlusion and is currently limited to static gestures.
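A minimal sketch of what the CNN–LSTM classifier and its TensorFlow Lite export might look like is given below. The sequence length, layer widths, and training settings are assumptions for illustration, not the configuration reported in the paper.

```python
import tensorflow as tf

SEQ_LEN = 30      # assumed number of frames per gesture sample
N_FEATURES = 63   # 21 landmarks x (x, y, z) coordinates
N_CLASSES = 5     # five ISL gestures evaluated

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(SEQ_LEN, N_FEATURES)),
    # 1D convolution over the time axis captures short-term motion patterns
    tf.keras.layers.Conv1D(64, kernel_size=3, activation="relu"),
    tf.keras.layers.MaxPooling1D(pool_size=2),
    # LSTM models the longer-range temporal structure of the gesture
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=30, validation_split=0.2)

# Convert the trained model to TensorFlow Lite for deployment on the Raspberry Pi
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open("isl_gesture.tflite", "wb") as f:
    f.write(tflite_model)
```

On the Pi, the exported model would typically be loaded with tf.lite.Interpreter (or the lighter tflite_runtime package), and the predicted label would be shown on screen and passed to a TTS engine such as gTTS for spoken output.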
Overall, the proposed solution contributes an accessible AI-driven platform that reduces communication barriers for the hearing- and speech-impaired population. While the system is effective, future work is needed on dataset size, environmental robustness, and support for continuous sign language to enable more natural and expressive communication.
Conclusion
The present contribution is a low-cost, portable, AI-driven real-time sign language recognition system implemented on a Raspberry Pi. Computer vision and deep learning work together to let the system efficiently recognize predefined ISL gestures and map them to text and speech output. With dual-mode feedback, the system ensures accessibility across diverse environments, including noisy ones. Experimental results validate that the system performs well in real time, helping to close the communication gap between sign language users and non-signers. Overall, the project demonstrates a practical solution for inclusivity and accessibility for the hearing- and speech-impaired community.
References
[1] S. Srivastava, A. Gangwar, R. Mishra, and S. Singh, “Sign Language Recognition System using TensorFlow Object Detection API,” International Conference on Advanced Network Technologies and Intelligent Computing (ANTIC-2021), Springer, 2021.
[2] R. Kumar, S. K. Singh, A. Bajpai, and A. Sinha, “Mediapipe and CNNs for Real-Time ASL Gesture Recognition,” arXiv preprint arXiv:2305.05296v3, 2023.
[3] M. A. R. ElTokhy and S. S. Abas, “User-Driven Sign Language Recognition Using AI: AI Tool for Enhanced Communication in Closed Communities,” Nanotechnology Perceptions, vol. 20, no. S14, pp. 3644–3652, 2024.
[4] S. Kumar, R. Rani, and U. Chaudhari, “Real-time sign language detection: Empowering the disabled community,” MethodsX, vol. 13, 2024.
[5] A. M. A., D. L. P., S. K. S., and S. V. N., “A Comprehensive Deep Learning-Based System for Real-Time Sign Language Recognition and Translation Using Raspberry Pi,” International Journal of Computer Trends and Technology (IJCTT), vol. 72, no. 12, pp. 8–16, Dec. 2024.
[6] R. A. Ch., N. S. Ramya, K. Sumanjali, K. V. Lakshmi, and K. Gayatri, “Sign Language Recognition and Speech Conversion Using Raspberry Pi,” International Journal of Creative Research Thoughts (IJCRT), vol. 8, no. 5, pp. 2103–2104, May 2020.