Communication plays an essential role in human interaction, allowing individuals to express ideas and emotions. While spoken languages are widely used, individuals with hearing and speech impairments rely on sign language. However, the lack of widespread understanding of sign language creates communication barriers between them and the hearing community. This study presents a real-time Indian Sign Language (ISL) recognition system using the MediaPipe framework and Long Short-Term Memory (LSTM) networks. The approach trains an LSTM model to distinguish between different signs on a dataset generated with the pre-trained Holistic model from MediaPipe, which serves as the feature extractor.
Introduction
Effective communication is essential for social interaction but remains difficult for deaf and mute individuals because sign language is not widely understood. Sign language recognition (SLR) technology helps bridge this gap, enabling hearing-impaired people to communicate more easily and participate fully in society. This study focuses on using Long Short-Term Memory (LSTM) networks to accurately detect, classify, and translate Indian Sign Language (ISL) gestures, which are complex and differ significantly from other sign languages such as American Sign Language (ASL).
A literature survey reveals that previous ISL recognition methods have mostly addressed static gestures with limited success on dynamic, continuous signs. Deep learning models, especially transformers and hybrid CNN-RNN architectures, show promise in handling spatial-temporal patterns for improved real-time and signer-independent recognition.
The study implements an SLR system that captures dynamic ISL gestures from video frames, extracting features with the MediaPipe Holistic model (hands, face, and pose landmarks). An LSTM-based neural network is trained on these features to recognize gestures. The model achieves over 95% accuracy in validation and real-time tests, though it struggles slightly with gesture transitions, indicating an area for future improvement.
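To make the pipeline concrete, the following is a minimal sketch of how such a system is commonly assembled: MediaPipe Holistic supplies per-frame keypoints for pose, face, and both hands, which are stacked into fixed-length sequences and fed to a stacked-LSTM classifier. The choice of Keras/TensorFlow, the 30-frame window, the layer sizes, and the 10-sign vocabulary are assumptions for illustration, not the authors' exact configuration.

import numpy as np
import cv2
import mediapipe as mp
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

mp_holistic = mp.solutions.holistic

def extract_keypoints(results):
    """Flatten pose, face, and hand landmarks into one vector per frame,
    zero-filling any part that MediaPipe did not detect."""
    pose = (np.array([[lm.x, lm.y, lm.z, lm.visibility]
                      for lm in results.pose_landmarks.landmark]).flatten()
            if results.pose_landmarks else np.zeros(33 * 4))
    face = (np.array([[lm.x, lm.y, lm.z]
                      for lm in results.face_landmarks.landmark]).flatten()
            if results.face_landmarks else np.zeros(468 * 3))
    left = (np.array([[lm.x, lm.y, lm.z]
                      for lm in results.left_hand_landmarks.landmark]).flatten()
            if results.left_hand_landmarks else np.zeros(21 * 3))
    right = (np.array([[lm.x, lm.y, lm.z]
                       for lm in results.right_hand_landmarks.landmark]).flatten()
             if results.right_hand_landmarks else np.zeros(21 * 3))
    return np.concatenate([pose, face, left, right])  # 1662 values per frame

def keypoints_from_frame(frame_bgr, holistic):
    """Run MediaPipe Holistic on one BGR (OpenCV) frame."""
    results = holistic.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    return extract_keypoints(results)

SEQUENCE_LENGTH = 30   # frames per gesture clip (assumed)
NUM_FEATURES = 1662    # keypoint values per frame, from the layout above
NUM_CLASSES = 10       # size of the ISL sign vocabulary (illustrative)

# Stacked LSTM over the keypoint sequence, ending in a softmax over signs.
model = Sequential([
    LSTM(64, return_sequences=True, activation='relu',
         input_shape=(SEQUENCE_LENGTH, NUM_FEATURES)),
    LSTM(128, return_sequences=True, activation='relu'),
    LSTM(64, return_sequences=False, activation='relu'),
    Dense(64, activation='relu'),
    Dense(NUM_CLASSES, activation='softmax'),
])
model.compile(optimizer='Adam', loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])

Feeding keypoint vectors rather than raw pixels to the LSTM keeps the input dimensionality fixed (1662 values per frame here) and makes the classifier largely invariant to background and lighting, which is what enables real-time performance on commodity hardware.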
Conclusion
In conclusion, the study successfully developed and evaluated an SLR system for dynamic gestures in ISL recognition. Further research can explore different architectures and hyper-parameters to improve accuracy further.
Additionally, applying SLR systems to other sign languages and expanding the dataset could yield valuable insights and advancements in the field of gesture recognition. Furthermore, it would be worthwhile to examine the impact of incorporating richer temporal modeling into the SLR system for dynamic gestures. This could involve exploring recurrent neural network architectures or attention mechanisms to capture the sequential nature of sign language, as sketched below.
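As one hypothetical illustration of that direction, attention pooling can replace the final LSTM state with a learned weighting over all per-frame states, so that informative frames contribute more to the classification. The sketch below reuses the assumed shapes from the earlier sketch and is not part of the evaluated system.

import tensorflow as tf
from tensorflow.keras import layers, Model

SEQUENCE_LENGTH, NUM_FEATURES, NUM_CLASSES = 30, 1662, 10  # assumed shapes

inputs = layers.Input(shape=(SEQUENCE_LENGTH, NUM_FEATURES))
hidden = layers.LSTM(64, return_sequences=True)(inputs)  # one state per frame
scores = layers.Dense(1)(hidden)                         # relevance score per frame
weights = layers.Softmax(axis=1)(scores)                 # normalise over time
# Weighted sum of per-frame states: frames the model deems informative
# (e.g. the core of a sign rather than a transition) contribute more.
context = layers.Lambda(
    lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([weights, hidden])
outputs = layers.Dense(NUM_CLASSES, activation='softmax')(context)

model = Model(inputs, outputs)
model.compile(optimizer='Adam', loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])

Such a mechanism could, in principle, also mitigate the gesture-transition weakness observed in the real-time tests by down-weighting transitional frames, though this would need empirical validation.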
Moreover, conducting user studies to evaluate the usability and effectiveness of the SLR system in real-world scenarios would provide valuable feedback for improving its practical applications.
Overall, the findings from this study lay the foundation for further work in ISL recognition and pave the way for the creation of more robust and accurate gesture recognition systems.