This study presents a real-time, hand-gesture-based virtual mouse system built on computer vision. Using a webcam together with OpenCV, Mediapipe, PyAutoGUI, and Pynput, the system provides touchless cursor movement, left- and right-clicks, double-clicking, and screenshot capture. The goal is to improve human-computer interaction, especially in hygiene-sensitive or accessibility-focused environments, by replacing the traditional physical mouse with intuitive gestures that enhance ease of use and inclusivity.
Introduction
Overview
This project introduces a gesture-based virtual mouse system that enables touchless interaction using hand tracking via a webcam. It is particularly useful in fields like healthcare, AR/VR, gaming, and assistive technologies. The system simulates standard mouse functions (move, click, screenshot) by interpreting hand gestures detected in real time.
Key Features
No wearable device required – operates with a standard webcam.
Uses Mediapipe to detect 21 hand landmarks.
PyAutoGUI and Pynput simulate mouse actions based on hand gestures.
Cursor movement is controlled by the index finger position.
Clicking and screenshot gestures are determined by angle and distance calculations between finger joints, as sketched below.
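A minimal sketch of how such measurements can be computed from Mediapipe's normalized landmarks follows; the helper names and the use of 2D (x, y) coordinates are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def distance(p1, p2):
    # Euclidean distance between two Mediapipe landmarks
    # (Mediapipe normalizes coordinates to [0, 1]).
    return float(np.hypot(p1.x - p2.x, p1.y - p2.y))

def joint_angle(a, b, c):
    # Angle at joint b (in degrees) formed by landmarks a-b-c, e.g. the
    # index-finger PIP joint, to classify a finger as bent or extended.
    ba = np.array([a.x - b.x, a.y - b.y])
    bc = np.array([c.x - b.x, c.y - b.y])
    cos_t = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc) + 1e-9)
    return float(np.degrees(np.arccos(np.clip(cos_t, -1.0, 1.0))))
```

A click could then fire when the joint angle falls below some threshold (e.g. 50 degrees); any such threshold has to be tuned empirically and is not specified by the paper.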
Methodology
Webcam captures live video.
OpenCV processes the video frames.
Mediapipe detects hand landmarks.
Angles/distances between landmarks are computed.
Gestures are mapped to mouse actions:
Move cursor when index finger is extended.
Click based on joint angles.
Screenshot triggered by a unique hand pose.
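These steps can be combined into a short capture-and-track loop. The sketch below moves the cursor with the index fingertip (Mediapipe landmark 8); the confidence threshold, single-hand setting, and window handling are illustrative assumptions rather than the authors' exact configuration.

```python
import cv2
import mediapipe as mp
import pyautogui

screen_w, screen_h = pyautogui.size()

# Illustrative settings: track one hand with fairly strict confidence.
hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.7)
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.flip(frame, 1)                    # mirror for natural motion
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # Mediapipe expects RGB input
    result = hands.process(rgb)
    if result.multi_hand_landmarks:
        tip = result.multi_hand_landmarks[0].landmark[8]  # index fingertip
        # Normalized [0, 1] coordinates scaled to screen pixels.
        # Note: PyAutoGUI's fail-safe aborts if the cursor hits a corner.
        pyautogui.moveTo(int(tip.x * screen_w), int(tip.y * screen_h))
    cv2.imshow("Virtual Mouse", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```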
Implementation
Language: Python
Libraries Used:
OpenCV – frame capture and display
Mediapipe – real-time hand landmark tracking
PyAutoGUI – simulate cursor actions
Pynput – control mouse buttons
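As an illustration of how the two libraries divide the work, a recognized gesture can be dispatched through Pynput for button events while PyAutoGUI handles screenshots. The helper below and its gesture labels are hypothetical, not code from the paper.

```python
import pyautogui
from pynput.mouse import Button, Controller

mouse = Controller()

def perform_action(gesture):
    # Hypothetical mapping from a recognized gesture label to a mouse action.
    if gesture == "left_click":
        mouse.click(Button.left, 1)
    elif gesture == "right_click":
        mouse.click(Button.right, 1)
    elif gesture == "double_click":
        mouse.click(Button.left, 2)
    elif gesture == "screenshot":
        pyautogui.screenshot("screenshot.png")  # assumed output filename
```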
Coordinates from hand landmarks are scaled to match screen dimensions for smooth, natural cursor motion.
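One common way to do this scaling adds exponential smoothing to suppress frame-to-frame jitter; the sketch below assumes a smoothing factor of 0.25, which is an illustrative value, not one reported by the paper.

```python
import pyautogui

screen_w, screen_h = pyautogui.size()

class CursorMapper:
    """Scales normalized Mediapipe coordinates to screen pixels and applies
    exponential smoothing so the cursor does not jitter frame to frame."""

    def __init__(self, alpha=0.25):  # alpha is an assumed smoothing factor
        self.alpha = alpha
        self.x, self.y = screen_w / 2, screen_h / 2

    def update(self, norm_x, norm_y):
        self.x += (norm_x * screen_w - self.x) * self.alpha
        self.y += (norm_y * screen_h - self.y) * self.alpha
        return int(self.x), int(self.y)

# Usage inside the tracking loop:
#   mapper = CursorMapper()
#   pyautogui.moveTo(*mapper.update(tip.x, tip.y))
```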
Results & Limitations
Reliable performance under good lighting and plain backgrounds.
Cursor control and gesture recognition showed high accuracy.
Click and screenshot gestures functioned with minimal latency.
Limitations:
Sensitive to hand occlusions and lighting variations.
Reduced accuracy with complex or cluttered backgrounds.
Future Improvements
Add adaptive thresholding or similar illumination normalization for dynamic lighting conditions (see the sketch after this list).
Allow gesture customization and training for user-specific preferences.
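For the lighting improvement, one commonly used alternative to plain adaptive thresholding is CLAHE-based illumination normalization applied to each frame before landmark detection. This is a hedged sketch of that idea, not part of the paper's implementation; the clip limit and tile size are assumed values.

```python
import cv2

def normalize_lighting(frame_bgr):
    # Equalize illumination with CLAHE (Contrast Limited Adaptive Histogram
    # Equalization) on the L channel of LAB color space, then convert back.
    lab = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2BGR)
```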
Conclusion
This virtual mouse system demonstrates the practical potential of real-time hand tracking for touchless HCI: a standard webcam and lightweight libraries are sufficient for effective mouse control. The system can serve as an assistive tool for users with disabilities or be deployed in environments with strict hygiene requirements. Future enhancements could include multi-hand detection and custom gesture training for expanded control.