\"Sign Language Detection with Learning\" is a project designed to create the possibility of real-time conversation for both hearing-impaired signers and the general audience by detecting ASL hand gestures with the ability to convert them into text and spoken words. The web app\'s deep-learning models (EfficientNetV2, MobileNetV2) are trained in a Kaggle-provided ASL dataset that has been optimized from TensorFlow to TensorFlow Lite so it can be deployed as quickly as and be as lightweight as possible. The web app is designed to use live webcam video as input and perform frame-by-frame processing by OpenCV to detect what the user did and to display both the same alphabet or word and the confidence value that OpenCV comes up with. In addition to real-time detection, the project features an interactive learning module which allows an English-to-ASL converter with backend performance visualization, scoring by character, and suggestions for better clarity in making the gestures. We have also included a quiz module to allow students to re-enforce what they are learning, which makes the application a communications and education platform. The application itself is programed in Python using Flask as the backend, HTML, CSS, and JavaScript for the frontend, and the Web Speech API to turn the recognized text into speech with different voices and languages to choose from. Users can also modify detection threshold, consecutive acceptance rate, and speech speed settings as per their requirement to personalize their experience. Overall, the project integrates machine learning, accessibility, and education on one platform, providing a user-friendly solution to enable inclusive learning and communication
Introduction
Communication is vital for everyday life, but for hearing or speech-impaired individuals, sign language—especially American Sign Language (ASL)—is the primary medium. However, communication barriers arise when signers interact with non-signers, causing alienation and limited access. Advances in machine learning, AI, and computer vision enable real-time recognition and translation of sign language gestures into text and speech, bridging this gap.
The project proposes a web-based system using lightweight deep learning models (EfficientNetV2 and MobileNetV2) trained on large ASL datasets to detect static ASL alphabets through a webcam. The system translates signs into spoken and written language using technologies like TensorFlow, Flask, OpenCV, and Web Speech API. Users can customize settings for detection sensitivity and voice output. Beyond communication, the platform offers interactive learning tools, including an English-to-ASL translator, performance tracking, sign clarity feedback, and quizzes, making it useful for education and awareness.
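As a rough illustration of the transfer-learning setup described above, the sketch below builds a MobileNetV2 backbone pretrained on ImageNet with a small classification head. The 29-class count, input size, and hyperparameters are assumptions based on the common Kaggle ASL Alphabet layout, not the project's exact configuration.

import tensorflow as tf

NUM_CLASSES = 29          # assumed: 26 letters plus space, delete, and nothing
IMG_SIZE = (224, 224)     # MobileNetV2's default input resolution

def build_asl_classifier():
    # ImageNet-pretrained MobileNetV2 backbone, frozen for the first training stage.
    base = tf.keras.applications.MobileNetV2(
        input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet")
    base.trainable = False

    inputs = tf.keras.Input(shape=IMG_SIZE + (3,))
    x = tf.keras.applications.mobilenet_v2.preprocess_input(inputs)
    x = base(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

The same head can be placed on an EfficientNetV2 backbone by swapping the base model and its matching preprocessing function.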
Problem: Few people know sign language, creating a communication divide. Existing tools focus on detection but lack interactive learning components. This system fills that gap by combining real-time detection with engaging learning features.
Literature Review: Earlier methods relied on image processing and traditional machine learning, which struggled with real-world variability. Deep learning, especially CNNs, improved accuracy and scalability. Efficient, lightweight models like MobileNet and EfficientNet enable real-time applications on web and mobile platforms. Interactive systems with feedback and quizzes have been suggested to promote learning, alongside multilingual text-to-speech for wider accessibility.
Technical Requirements: The system runs on Windows, Linux, or macOS and is developed in Python with a Flask backend. Key libraries and APIs include TensorFlow, OpenCV, and the Web Speech API. Frontend technologies are HTML, CSS, JavaScript, and opencv.js for webcam access.
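A minimal Flask backend skeleton under this stack might look like the following; the /predict route, JSON field names, and classify_frame helper are hypothetical and only indicate how the frontend and backend could exchange webcam frames.

import base64

import cv2
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    # Decode the base64-encoded JPEG frame sent by the frontend (e.g. via opencv.js).
    frame_bytes = base64.b64decode(request.json["frame"])
    frame = cv2.imdecode(np.frombuffer(frame_bytes, np.uint8), cv2.IMREAD_COLOR)

    letter, confidence = classify_frame(frame)
    return jsonify({"letter": letter, "confidence": float(confidence)})

def classify_frame(frame):
    # Placeholder for the TensorFlow Lite inference step sketched in the
    # implementation section; a fixed dummy result stands in here.
    return "A", 0.0

if __name__ == "__main__":
    app.run(debug=True)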
Proposed System: The web app detects ASL signs via webcam, predicts alphabets with confidence scores, and converts them into speech/text in multiple languages. Features include English-to-ASL translation, customizable settings, real-time feedback, and a learning quiz module. Privacy is maintained by not storing user data.
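The customizable settings mentioned above can be realized with a small smoothing step over the per-frame predictions: a letter is committed to the output only after its confidence stays above the user's threshold for a configurable number of consecutive frames. The sketch below is one plausible way to do this; the parameter names and defaults are assumptions rather than the project's actual values.

class PredictionSmoother:
    """Commit a letter only after it stays confident for several consecutive frames."""

    def __init__(self, threshold=0.8, consecutive_frames=10):
        self.threshold = threshold                    # user-adjustable detection threshold
        self.consecutive_frames = consecutive_frames  # user-adjustable acceptance count
        self._candidate = None
        self._count = 0

    def update(self, letter, confidence):
        # Feed one per-frame prediction; return the letter once it is stable, else None.
        if confidence < self.threshold or letter != self._candidate:
            self._candidate = letter if confidence >= self.threshold else None
            self._count = 1 if self._candidate else 0
            return None
        self._count += 1
        if self._count >= self.consecutive_frames:
            self._candidate, self._count = None, 0
            return letter
        return None

A committed letter would then be appended to the on-screen text and passed to the Web Speech API on the frontend for voice output.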
Implementation & Results: Models pretrained on ImageNet and fine-tuned on ASL datasets were converted to TensorFlow Lite for efficiency. Real-time webcam input is processed with OpenCV. The MobileNetV2 model achieved 91.7% accuracy, and EfficientNetV2 reached 95%, demonstrating strong performance. The platform supports both assistive communication and sign language education.
Conclusion
This project was fueled by a clear mission to connect the hearing-impaired with the rest of society through communication. Based on the MobileNetV2 deep learning model, we created a real-time ASL alphabet detection system that is both effective and user-friendly. Beyond detection, we added an interactive learning module featuring tutorials and quizzes to facilitate ongoing learning.
From data gathering to deployment, each step was designed with accuracy and accessibility in mind. With validation accuracy above 91%, the system is both reliable and practical for everyday use. Future work includes extending the system to dynamic gesture recognition, full-sentence translation, and multilingual sign systems, opening the door to more accessible communication technologies.