Communication plays a vital role in everyone’s life, but people who rely on sign language often struggle to interact with those who don’t understand it. To address this gap, this paper introduces Handify, a real-time system designed to make communication easier between deaf individuals and hearing people. Handify converts sign language gestures into readable text, allowing smoother and more natural interaction. The system uses computer vision techniques with OpenCV and applies a Random Forest machine learning model to accurately classify hand gestures. Using a custom dataset created from American Sign Language (ASL), the model achieved an accuracy of 91.7%. This paper also explains the design of the system, the challenges faced during development, the experimental results, and how the tool can improve accessibility for people with hearing or speech disabilities. Overall, Handify contributes to inclusive technology by helping create seamless, barrier-free communication between signers and non-signers.
Introduction
Communication is fundamental to everyday life, yet people with hearing disabilities, many of whom rely on sign language, face persistent barriers: most hearing people do not understand sign language, and interpreters can be costly or unavailable. The Handify project addresses this gap by using computer vision and machine learning to translate sign language gestures into text in real time, improving accessibility.
Prior work on sign language recognition has evolved from early sensor-based methods (such as instrumented gloves) to more practical vision-based approaches using ordinary cameras. Advanced techniques such as Hidden Markov Models and deep learning (CNNs and RNNs) have improved accuracy, but they often demand high computational power, large datasets, or expensive setups. The proposed Handify system offers a simpler and more accessible alternative with competitive accuracy (~91.7%).
The methodology involves three main stages:
Gesture Detection using a camera, OpenCV, and MediaPipe for hand tracking.
Gesture Processing and Classification by extracting hand landmarks and using a Random Forest classifier trained on over 10,000 samples.
Text Conversion and Output, where recognized gestures are converted into readable text with real-time feedback and improved accuracy using buffering and language modeling.
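The buffering step described above can be sketched as a sliding-window majority vote over per-frame predictions, so that a letter is emitted only once it has been recognized consistently. This is a minimal illustration, not the paper's implementation; the class name, window size, and vote threshold are assumptions chosen for clarity.

```python
from collections import Counter, deque


class GestureBuffer:
    """Smooth noisy per-frame predictions with a sliding-window majority vote."""

    def __init__(self, window: int = 10, min_votes: int = 7):
        self.window = deque(maxlen=window)  # last N frame-level labels
        self.min_votes = min_votes          # votes needed to emit a label

    def update(self, label: str):
        """Add one frame's prediction; return a stable label, or None if undecided."""
        self.window.append(label)
        top, votes = Counter(self.window).most_common(1)[0]
        return top if votes >= self.min_votes else None


buf = GestureBuffer(window=10, min_votes=7)
result = None
# A noisy stream: mostly "A" with a few misclassified frames.
for frame_label in ["A", "A", "B", "A", "A", "A", "C", "A", "A", "A"]:
    result = buf.update(frame_label)
# After the full window, "A" holds 8 of 10 votes, so result is "A".
```

A real system would feed `update` with the classifier's output each frame and append the returned label to the transcript only when it changes.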
The system is designed to run on standard hardware using tools like Python, OpenCV, MediaPipe, and Scikit-learn. Its modular architecture includes input, detection, recognition, and user interface components, enabling efficient real-time performance (15–20 FPS).
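The recognition component described above can be sketched with Scikit-learn: MediaPipe Hands yields 21 landmarks per hand, and using the (x, y) coordinate of each landmark gives a 42-dimensional feature vector for the Random Forest. The snippet below trains on synthetic Gaussian clusters standing in for the paper's real dataset of 10,000+ landmark samples; the class labels, cluster centers, and hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# 21 hand landmarks x (x, y) coordinates = 42 features per frame.
N_FEATURES = 21 * 2

# Synthetic stand-in for the ASL landmark dataset: three gesture
# classes, each a tight Gaussian cluster in landmark space.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.05, size=(300, N_FEATURES))
               for c in (0.2, 0.5, 0.8)])
y = np.repeat(["A", "B", "C"], 300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
acc = clf.score(X_test, y_test)
```

In the deployed pipeline, the rows of `X` would instead come from MediaPipe's per-frame landmark output, and `clf.predict` would run on each captured frame before the buffering stage.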
Overall, Handify provides an efficient, cost-effective, and user-friendly solution for real-time sign language translation, helping bridge communication gaps between sign language users and non-signers.
Conclusion
Handify represents a significant step toward bridging the communication gap between sign language users and others. By leveraging computer vision and machine learning, it can recognize signs in real time with 91.7% accuracy for static gestures. While dynamic gesture recognition remains a challenge, the system provides a solid foundation for future improvements. As technology progresses, tools like Handify have the potential to greatly enhance accessibility and inclusion for the deaf and hard-of-hearing community.