Sign language detection is crucial for bridging the communication gap between the general public and people who are hard of hearing. This study presents a novel approach to sign language recognition that combines action detection with deep learning models built on a Python-based LSTM architecture. The proposed system aims to accurately recognize and decode sign language movements in real time, using deep learning techniques, namely the LSTM architecture, to capture the temporal dependencies of sign language motions. A large dataset of varied sign language gestures was gathered and examined in order to train the LSTM model and evaluate its accuracy. The sign language detection pipeline comprises video acquisition, preprocessing, feature extraction, and model training. During preprocessing, the acquired video data is separated into individual frames, and image processing techniques are applied to enhance quality and remove noise. The preprocessed frames then undergo robust feature extraction using techniques such as optical flow or machine learning-based feature extraction. The LSTM model is trained on the extracted features to learn the time-dependent properties of sign language gestures. Transfer learning from models pre-trained on large action recognition datasets is also explored. The trained model is assessed using a variety of measures, including accuracy, precision, recall, and F1-score. The LSTM-based deep learning model's high accuracy shows that it can handle the temporal component of sign language, and its real-time performance makes it suitable for a wide range of applications, such as live interpretation tools or assistive devices for people with hearing impairments.
Introduction
The project focuses on developing an accurate, real-time sign language recognition system using deep learning, specifically LSTM (Long Short-Term Memory) networks. Sign language is essential for communication with the deaf and hard-of-hearing, but interpreting it can be challenging for non-users. Gesture recognition systems using LSTM can capture the temporal and sequential nature of sign language more effectively than traditional methods.
The literature review covers various hand gesture recognition techniques, including appearance-based and 3D hand model approaches, with a focus on image feature extraction, classification algorithms (like SVM, AdaBoost, HMM), and recent advances in deep learning (CNN, RNN, transformers) for improved accuracy and robustness.
The methodology involves acquiring a diverse dataset, preprocessing video frames, extracting features using optical flow and CNNs, training an LSTM model, evaluating it with accuracy, precision, recall, and F1-score metrics, and deploying it for real-time sign language detection.
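To make the preprocessing and feature-extraction stages concrete, the following Python sketch splits a video into frames, applies simple enhancement, and computes dense optical flow between consecutive frames with OpenCV. The file path, frame size, and Farneback parameters are illustrative assumptions, not the project's exact settings.

import cv2
import numpy as np

def extract_flow_features(video_path, size=(64, 64)):
    # Returns a (T-1, H*W*2) array of dense optical-flow vectors per frame pair.
    cap = cv2.VideoCapture(video_path)
    features, prev_gray = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Preprocessing: resize, grayscale conversion, light denoising.
        gray = cv2.cvtColor(cv2.resize(frame, size), cv2.COLOR_BGR2GRAY)
        gray = cv2.GaussianBlur(gray, (3, 3), 0)
        if prev_gray is not None:
            # Farneback dense optical flow between consecutive frames.
            flow = cv2.calcOpticalFlowFarneback(
                prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
            features.append(flow.reshape(-1))
        prev_gray = gray
    cap.release()
    return np.stack(features) if features else np.empty((0, size[0] * size[1] * 2))

# Example usage (hypothetical file path); the first 30 frame pairs would form
# one fixed-length input sequence for the LSTM.
# sequence = extract_flow_features("data/hello_001.mp4")[:30]

A CNN-based feature extractor could replace the optical-flow step here without changing the rest of the pipeline.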
The proposed LSTM architecture consists of multiple LSTM layers, dropout layers to prevent overfitting, fully connected layers, and activation functions (ReLU, Softmax) to output gesture classifications.
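A minimal Keras sketch of such an architecture is shown below. The layer widths, sequence length, feature dimension, and number of gesture classes are assumed values for illustration rather than the project's reported configuration.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

SEQ_LEN, N_FEATURES, N_CLASSES = 30, 128, 10  # assumed shapes

model = Sequential([
    # Stacked LSTM layers capture temporal dependencies across frames.
    LSTM(64, return_sequences=True, activation='relu',
         input_shape=(SEQ_LEN, N_FEATURES)),
    Dropout(0.2),                                  # reduces overfitting
    LSTM(128, return_sequences=True, activation='relu'),
    Dropout(0.2),
    LSTM(64, return_sequences=False, activation='relu'),
    # Fully connected layers map the sequence summary to gesture classes.
    Dense(64, activation='relu'),
    Dense(N_CLASSES, activation='softmax'),        # class probabilities
])

model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

The final Softmax layer yields a probability distribution over the gesture vocabulary, and categorical cross-entropy is a standard loss choice for this multi-class setup.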
Compared to traditional methods, the LSTM-based approach models complex temporal relationships in dynamic gestures more effectively, reduces the need for manual feature engineering, and scales better to larger gesture vocabularies.
Results show the model achieves high accuracy and low loss, demonstrating efficient learning and strong performance in recognizing sign language gestures in real time. The system has practical applications for improving communication for sign language users, with future work suggested to expand vocabulary, enhance deployment on edge devices, and support multiple languages.
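As an illustration of the evaluation step, the snippet below computes accuracy, precision, recall, and F1-score with scikit-learn; the label and prediction arrays are placeholders, not the project's reported results.

import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = np.array([0, 2, 1, 1, 0, 2, 2, 1])   # ground-truth gesture classes
y_pred = np.array([0, 2, 1, 0, 0, 2, 1, 1])   # argmax of the model's softmax output

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average='macro', zero_division=0)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")

Macro averaging weights each gesture class equally, which is useful when some signs appear less often in the test set.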
Conclusion
In conclusion, by effectively combining deep learning and action recognition approaches, the Sign Language Recognition using Action Recognition with Python project correctly recognizes and interprets sign language gestures. The project's primary components, namely data collection, preprocessing, feature extraction, LSTM model training, and real-time detection, have all been implemented and evaluated. Collecting varied sign language video data and applying preprocessing techniques increased the diversity of the input frames. By capturing both motion-based and high-level features and modeling their temporal correlations, the LSTM model effectively represented the sequential nature of sign language gestures. The trained model demonstrated strong performance throughout the evaluation, as evidenced by metrics such as accuracy, recall, and F1-score. Its ability to recognize and interpret sign language gestures was further validated by successful real-time deployment on a video stream, providing a practical and efficient tool for fostering communication between the general population and those with hearing impairments. Overall, the project advances sign language recognition systems by utilizing deep neural networks and action recognition methods.
The experimental results show the potential of LSTM algorithms to identify temporal correlations in sign language gestures. Further research and optimization, such as exploring transfer learning techniques or fine-tuning on specific sign language datasets, can enhance the model's performance.
The work can be extended to accommodate a greater variety of sign language gestures and to employ more advanced deep learning techniques for even greater accuracy and robustness. Overall, the Sign Language Recognition using Action Recognition with Python project shows how deep learning can address real-world problems while providing a helpful tool for promoting inclusivity and effective communication for people with hearing impairments.