Abstract
Sign language serves as a vital communication medium for individuals with hearing and speech impairments. Recent advances in computer vision and deep learning, especially Convolutional Neural Networks (CNNs), have paved the way for more accurate and real-time gesture recognition systems. This paper presents a CNN-based model for recognizing hand gestures corresponding to American Sign Language (ASL). The proposed system utilizes image preprocessing techniques, a custom CNN architecture, and publicly available datasets to train and validate the model. Experimental results show that our system achieves an accuracy of over 95%, demonstrating its effectiveness in translating sign language into text.
Introduction
Communication for deaf or hard-of-hearing individuals relies heavily on sign language, which uses hand gestures and facial expressions. However, the limited understanding of sign language among the general population creates communication barriers. Advances in gesture recognition technology, especially camera-based systems powered by deep learning and Convolutional Neural Networks (CNNs), offer promising solutions for real-time sign language recognition.
This project developed a CNN-based system to recognize static American Sign Language (ASL) alphabet signs from images and convert them into audio output, facilitating communication between hearing-impaired individuals and others. Trained on a large ASL image dataset, the model achieved high accuracy (approximately 95-96%) and ran in real time with low latency.
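As a concrete illustration of the kind of classifier described above, the sketch below builds a small convolutional network for 26-class static ASL alphabet recognition in Keras. The layer counts, filter sizes, 64x64 grayscale input, and training settings are assumptions made for this example; the paper's exact architecture and hyperparameters are not specified in this excerpt.

```python
# Illustrative sketch of a small CNN for static ASL alphabet classification.
# Layer sizes, the 64x64 grayscale input, and the 26-class softmax output are
# assumptions for this example, not details taken from the paper.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_asl_cnn(input_shape=(64, 64, 1), num_classes=26):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(128, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),  # regularization to curb overfitting on a static-image dataset
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_asl_cnn()
model.summary()
```

A stack of three convolution/pooling blocks followed by a dense classifier is a common baseline for single-hand, single-letter images of this size; deeper or pretrained backbones could be substituted without changing the overall pipeline.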
The system captures hand gestures via a camera, preprocesses images, classifies gestures using a CNN model, and then converts recognized signs into speech using text-to-speech technology. Despite its success with static gestures, challenges remain in handling similar-looking signs and dynamic signing. Future work could focus on recognizing full sentences and improving robustness under varying conditions.
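The capture, preprocess, classify, and speak loop described above could be sketched as follows, assuming an OpenCV webcam feed, a fixed hand region of interest, a saved Keras model (the filename asl_cnn.h5 is hypothetical), and the pyttsx3 text-to-speech engine. These are illustrative choices under stated assumptions, not details confirmed by the paper.

```python
# Minimal sketch of the capture -> preprocess -> classify -> speak loop.
# The region-of-interest coordinates, 64x64 grayscale preprocessing, model
# filename, and pyttsx3 engine are assumptions for illustration only.
import cv2
import numpy as np
import pyttsx3
import tensorflow as tf

LABELS = [chr(c) for c in range(ord("A"), ord("Z") + 1)]  # 26 static ASL letters

model = tf.keras.models.load_model("asl_cnn.h5")  # hypothetical trained model file
tts = pyttsx3.init()
cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    roi = frame[100:300, 100:300]                 # assumed hand region in the frame
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    resized = cv2.resize(gray, (64, 64)) / 255.0  # match the assumed training input size
    probs = model.predict(resized[None, ..., None], verbose=0)[0]
    letter = LABELS[int(np.argmax(probs))]
    cv2.putText(frame, letter, (100, 90), cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 255, 0), 3)
    cv2.imshow("ASL recognition", frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord("s"):   # speak the current prediction on demand
        tts.say(letter)
        tts.runAndWait()
    elif key == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```

Triggering speech on a key press rather than every frame avoids repeating the same letter continuously; a production system would instead debounce predictions over several consecutive frames before emitting audio.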
Conclusion
This research demonstrates that CNN-based sign language recognition systems can enhance accessibility and inclusivity by enabling real-time translation of sign language into spoken words.