Sign language recognition systems are crucial for improving communication between the hearing and deaf communities. This paper presents a real-time sign language detection system that combines computer vision techniques with machine learning: it employs MediaPipe, a computer vision framework, to extract hand landmarks, and a Random Forest Classifier to classify the gestures. The system recognizes ten distinct signs in real time. The paper provides a detailed description of the design, development, and implementation of the system, as well as its evaluation. The findings demonstrate that the system detects signs with 82.5% accuracy and offers potential for further development in sign language accessibility applications.
Introduction
Summary:
Sign language is essential for communication within the deaf community, with over 466 million people worldwide affected by disabling hearing loss. This paper presents a real-time sign language detection system that recognizes 10 hand gestures using MediaPipe for hand landmark detection and a Random Forest Classifier for gesture classification.
Methodology:
Data Collection: Images of 10 common sign language gestures (e.g., “Hello,” “Thank You,” “No”) were captured with a webcam, with 200 images collected per class.
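A minimal collection script along these lines could look like the sketch below; the 200-images-per-class count comes from the paper, while the directory layout, key binding, and class names are illustrative assumptions.

```python
# Hypothetical data-collection sketch: capture 200 webcam images per gesture
# class into data/<label>/. Only three of the ten classes are listed here.
import os
import cv2

CLASSES = ["hello", "thank_you", "no"]  # illustrative subset of the 10 signs
IMAGES_PER_CLASS = 200                  # per-class count from the paper
DATA_DIR = "data"

cap = cv2.VideoCapture(0)  # default webcam
for label in CLASSES:
    os.makedirs(os.path.join(DATA_DIR, label), exist_ok=True)
    print(f"Show the '{label}' sign, then press 'q' to start recording")
    while cv2.waitKey(25) & 0xFF != ord("q"):
        ok, frame = cap.read()
        if ok:
            cv2.imshow("collect", frame)
    for i in range(IMAGES_PER_CLASS):
        ok, frame = cap.read()
        if ok:
            cv2.imwrite(os.path.join(DATA_DIR, label, f"{i}.jpg"), frame)
cap.release()
cv2.destroyAllWindows()
```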
Preprocessing: MediaPipe detected 21 hand landmarks per image, and their coordinates were normalized for consistency.
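The paper does not specify the normalization scheme; a common choice (assumed here) is to translate each landmark's (x, y) coordinates so the hand's bounding-box origin sits at zero, removing dependence on where the hand appears in the frame. A sketch using MediaPipe's Hands solution:

```python
# Landmark-extraction sketch using MediaPipe Hands (21 landmarks per hand).
# The min-subtraction normalization is an assumption; the paper only states
# that coordinates are normalized for consistency.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1,
                                 min_detection_confidence=0.5)

def extract_features(bgr_image):
    """Return a 42-value vector (normalized x, y per landmark), or None."""
    results = hands.process(cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB))
    if not results.multi_hand_landmarks:
        return None  # no hand detected in this image
    lm = results.multi_hand_landmarks[0].landmark  # the 21 landmarks
    xs = [p.x for p in lm]
    ys = [p.y for p in lm]
    # Translate so min x and min y become 0: position-invariant features.
    return [v for x, y in zip(xs, ys) for v in (x - min(xs), y - min(ys))]
```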
Model Training: A Random Forest Classifier was trained on the normalized landmark data, with an 80-20 train-test split.
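Training with scikit-learn might then look like the following sketch; the Random Forest and the 80-20 split are from the paper, while the default hyperparameters, the stratified split, and the reuse of extract_features from the preprocessing sketch above are assumptions.

```python
# Training sketch: build feature vectors from the collected images, then fit
# a Random Forest with an 80-20 train-test split as described in the paper.
import os
import cv2
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

DATA_DIR = "data"  # layout from the collection sketch above
X, y = [], []
for label in os.listdir(DATA_DIR):
    for fname in os.listdir(os.path.join(DATA_DIR, label)):
        feats = extract_features(cv2.imread(os.path.join(DATA_DIR, label, fname)))
        if feats is not None:  # skip images where no hand was detected
            X.append(feats)
            y.append(label)

X_train, X_test, y_train, y_test = train_test_split(
    np.asarray(X), np.asarray(y), test_size=0.2,  # the paper's 80-20 split
    stratify=y, random_state=42)
clf = RandomForestClassifier(random_state=42)  # paper gives no hyperparameters
clf.fit(X_train, y_train)
```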
Inference: The system performs real-time gesture prediction from webcam input, displaying signs only when prediction confidence exceeds 0.75.
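Put together, the real-time loop could resemble the sketch below, reusing extract_features and the trained clf from the previous sketches; the 0.75 threshold is the paper's, interpreted here (an assumption) as the Random Forest's top class probability from predict_proba.

```python
# Real-time inference sketch: classify each webcam frame and overlay the
# predicted sign only when the top class probability exceeds 0.75.
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    feats = extract_features(frame)
    if feats is not None:
        probs = clf.predict_proba([feats])[0]
        best = int(np.argmax(probs))
        if probs[best] > 0.75:  # confidence threshold from the paper
            cv2.putText(frame, str(clf.classes_[best]), (30, 50),
                        cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 255, 0), 2)
    cv2.imshow("sign detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```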
System Architecture:
The system consists of modules for input capture, hand detection, feature preprocessing, gesture classification with the Random Forest Classifier, and output display. It is implemented in Python using OpenCV, MediaPipe, and scikit-learn.
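The paper does not say how the trained model is handed from the training module to the real-time inference module; a common approach (an assumption here) is to serialize the fitted classifier with pickle:

```python
# Hypothetical glue between the training and inference modules: persist the
# fitted classifier so the real-time module can load it at startup.
import pickle

with open("model.pkl", "wb") as f:  # at the end of the training module
    pickle.dump(clf, f)

with open("model.pkl", "rb") as f:  # at startup of the inference module
    clf = pickle.load(f)
```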
Results:
The model achieved 82.5% accuracy on the test set, successfully recognizing gestures such as “Hello,” “Thank You,” and “No” in real time with low latency.
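A standard way to produce this figure, along with per-gesture breakdowns the paper does not report, is scikit-learn's metrics, sketched here on the held-out split from the training sketch:

```python
# Evaluation sketch: the paper reports 82.5% accuracy on the test set; the
# classification report (an assumption, not in the paper) adds per-class
# precision and recall.
from sklearn.metrics import accuracy_score, classification_report

y_pred = clf.predict(X_test)
print(f"accuracy: {accuracy_score(y_test, y_pred):.3f}")  # 0.825 in the paper
print(classification_report(y_test, y_pred))
```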
Future Work:
Plans include expanding the gesture set, enabling detection of gestures from both hands, integrating with messaging or virtual assistant apps, and improving accuracy with deep learning models such as convolutional neural networks (CNNs).
Conclusion
This project highlights the feasibility of using machine learning and computer vision techniques to build a sign language detection system. By leveraging MediaPipe for hand landmark extraction and a Random Forest Classifier for gesture classification, the system demonstrated promising real-time performance. While challenges remain, such as sensitivity to lighting conditions and camera quality, the system shows potential for real-world applications in improving communication for individuals with hearing impairments.