Human–Computer Interaction (HCI) traditionally relies on physical input devices such as keyboards, mice, and touchscreens. While these devices provide efficient interaction mechanisms, they require direct physical contact and may not be suitable in environments where hygiene, accessibility, or hands-free operation is required. Gesture recognition offers a natural and intuitive alternative that enables users to interact with computing systems using hand movements.
This paper presents a real-time hand gesture–based touchless control system using computer vision techniques. The proposed system utilizes MediaPipe for detecting 21 hand landmarks and OpenCV for real-time image processing. Hand gestures are recognized by analyzing spatial relationships between detected landmarks and are mapped to system commands such as cursor navigation, clicking, scrolling, slide navigation, and multimedia control.
A gesture dataset consisting of 500 samples was collected from multiple users to evaluate system performance. Experimental results demonstrate that the proposed system achieves an average recognition accuracy of approximately 92% while maintaining real-time performance at around 25 frames per second.
The proposed approach provides a low-cost, scalable, and efficient solution for touchless interaction that can be applied in smart classrooms, healthcare environments, assistive technologies, and public interactive systems.
Introduction
Human–Computer Interaction (HCI) traditionally depends on physical input devices like keyboards and mice, which are not suitable for touchless or hygiene-sensitive environments. To address this, the paper proposes a gesture-based touchless control system that allows users to interact with computers using natural hand movements.
The system uses computer vision technologies, specifically OpenCV for image processing and MediaPipe for detecting 21 hand landmarks, to recognize gestures in real time through a webcam. These gestures are interpreted using geometric relationships (distances and angles between fingers) and mapped to computer actions such as cursor movement, clicking, scrolling, slide navigation, and volume control.
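As an illustration of the geometric approach, the sketch below detects a "pinch" gesture from the normalized distance between two of MediaPipe's 21 hand landmarks (index 4 is the thumb tip, index 8 is the index-finger tip in MediaPipe's landmark numbering). The function names and the 0.05 threshold are illustrative choices, not the paper's exact parameters.

```python
import math

# MediaPipe hand-landmark indices: 4 = thumb tip, 8 = index finger tip.
# MediaPipe reports landmark coordinates normalized to [0, 1].
THUMB_TIP, INDEX_TIP = 4, 8

def euclidean(p, q):
    """2-D Euclidean distance between two (x, y) landmark points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def is_pinch(landmarks, threshold=0.05):
    """A pinch is detected when the thumb tip and index-finger tip are
    closer than `threshold` in normalized image coordinates.
    The threshold value here is illustrative."""
    return euclidean(landmarks[THUMB_TIP], landmarks[INDEX_TIP]) < threshold

# Example: a list of 21 landmarks; only indices 4 and 8 matter for a pinch.
pts = [(0.0, 0.0)] * 21
pts[THUMB_TIP] = (0.50, 0.50)
pts[INDEX_TIP] = (0.52, 0.51)
print(is_pinch(pts))  # the tips are ~0.022 apart, below the threshold: True
```

In a full system, the same distance (or an angle between finger vectors) would be recomputed on every frame from the live landmark output and compared against per-gesture thresholds.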
A modular architecture is implemented, consisting of stages like frame capture, preprocessing, hand landmark detection, gesture recognition, and command execution. The system avoids heavy deep learning models, making it efficient and capable of running on standard computers.
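The staged architecture described above can be sketched as a chain of small functions. The stage names mirror the paper; the stubbed bodies, the gesture-to-command table, and the dictionary frame format are placeholders for illustration (a real implementation would read frames via `cv2.VideoCapture` and detect landmarks with MediaPipe Hands at the marked points).

```python
# Illustrative mapping from recognized gestures to system commands;
# not the paper's exact command table.
GESTURE_TO_COMMAND = {
    "pinch": "left_click",
    "open_palm": "play_pause",
    "swipe_right": "next_slide",
}

def capture_frame(source):
    """Frame capture: a real system would read from cv2.VideoCapture here."""
    return next(source, None)

def preprocess(frame):
    """Preprocessing: e.g. mirror flip and BGR->RGB conversion in practice."""
    return frame

def detect_landmarks(frame):
    """Landmark detection: stands in for MediaPipe Hands (21 points/hand)."""
    return frame.get("landmarks")

def recognize_gesture(landmarks):
    """Gesture recognition from landmark geometry; stubbed for illustration."""
    return landmarks and landmarks.get("gesture")

def execute_command(gesture):
    """Command execution: would invoke an OS automation library in practice."""
    return GESTURE_TO_COMMAND.get(gesture)

def run_pipeline(source):
    """Run the five stages over every frame; return the commands issued."""
    commands = []
    while (frame := capture_frame(source)) is not None:
        gesture = recognize_gesture(detect_landmarks(preprocess(frame)))
        cmd = execute_command(gesture)
        if cmd:
            commands.append(cmd)
    return commands

frames = iter([{"landmarks": {"gesture": "pinch"}},
               {"landmarks": None},                     # no hand in frame
               {"landmarks": {"gesture": "swipe_right"}}])
print(run_pipeline(frames))  # ['left_click', 'next_slide']
```

Keeping each stage as an independent function is what makes the pipeline modular: any stage (for instance, the landmark detector) can be swapped out without touching the others.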
A dataset of 500 gesture samples from multiple users was created to evaluate performance. The system achieved high accuracy (around 92–93%), low latency (~70 ms), and strong precision and recall, enabling smooth real-time interaction.
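For clarity on how the reported precision and recall figures are derived, the snippet below computes both from per-class true-positive, false-positive, and false-negative counts. The counts used here are made-up illustrative numbers, not the paper's measured data.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical counts for one gesture class (purely illustrative):
tp, fp, fn = 46, 3, 4
p, r = precision_recall(tp, fp, fn)
print(round(p, 3), round(r, 3))  # 0.939 0.92
```

Averaging these per-class values across all gesture classes yields the overall figures of the kind reported above.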
Conclusion
This paper presented a gesture-based touchless control system designed for real-time human–computer interaction. The proposed system utilizes MediaPipe hand tracking and OpenCV image processing to detect hand landmarks and interpret user gestures.
The system enables users to control cursor movement, perform clicking operations, scroll documents, navigate presentation slides, and control multimedia functions using natural hand gestures captured by a webcam.
Experimental results demonstrate that the proposed approach achieves high gesture recognition accuracy with an average performance of approximately 92% while maintaining a real-time processing speed of 25 frames per second.
The system provides a cost-effective and accessible solution for touchless interaction without requiring specialized hardware. This makes it suitable for applications in education, healthcare, assistive technologies, and public interactive systems.
Overall, the proposed gesture recognition framework contributes to the development of intuitive and hygienic interaction mechanisms for next-generation computing systems.