Surveillance systems are shifting away from cloud-dependent cameras toward small devices that process video where it is captured. This paper describes one such device: an ESP32-based camera on a dual-axis (pan-tilt) servo mount that can follow a target while it watches. Vision tasks are kept lightweight so that they can run on board, using a compact machine-learning model executed through TensorFlow Lite for Microcontrollers to detect objects. Messages are sent only when needed, over MQTT's lightweight publish/subscribe protocol. Processing at the edge reduces response latency, lowers network traffic, and keeps personal data on the device. The paper reviews earlier designs alongside the decisions made during construction, and shows how hardware and software choices determine what each system can do and where trade-offs among power consumption, accuracy, and cost become unavoidable. In the system described, servos steer the camera to keep moving targets in view in real time.
Introduction
This paper describes the development of a low-cost, intelligent surveillance system that moves beyond traditional CCTV cameras. Conventional systems stream video continuously to remote servers, which leads to high bandwidth usage, delayed responses, privacy concerns, and dependence on internet connectivity. They also lack on-device intelligence, so every decision has to wait for a remote server or a human operator, which makes them poorly suited to real-time decision-making.
To address these issues, the proposed system uses edge computing and TinyML on an ESP32-CAM device. Instead of sending video to the cloud, the camera processes frames locally to detect motion and make decisions in real time. A dual-axis (pan-tilt) servo mechanism lets the camera move automatically and track objects, widening the area it can cover without additional cameras.
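The tracking behaviour described above can be illustrated with a small proportional controller that nudges the pan and tilt angles toward the centre of a detected bounding box. This is a minimal host-side sketch in Python, not the prototype's firmware: the frame size, field of view, gain, deadband, and sign conventions are all assumptions chosen for illustration, and the names PanTilt and track_step are hypothetical.

```python
from dataclasses import dataclass

# All constants are illustrative assumptions, not values from the prototype.
FRAME_W, FRAME_H = 320, 240          # assumed QVGA frame size in pixels
FOV_H_DEG, FOV_V_DEG = 60.0, 45.0    # assumed horizontal/vertical field of view
GAIN = 0.5                           # fraction of the angular error corrected per step
DEADBAND_PX = 10                     # ignore small offsets to avoid servo jitter
ANGLE_MIN, ANGLE_MAX = 0.0, 180.0    # typical hobby-servo travel


@dataclass
class PanTilt:
    pan: float = 90.0    # degrees, 90 = centred
    tilt: float = 90.0   # degrees, 90 = centred


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def track_step(state, box):
    """Return a new PanTilt nudged toward the centre of box.

    box is (x, y, w, h) in pixels with the origin at the top-left corner.
    Sign conventions are an assumption: increasing pan turns the camera
    left and increasing tilt turns it down. Flip the signs if the servos
    are mounted the other way round.
    """
    x, y, w, h = box
    err_x = (x + w / 2) - FRAME_W / 2   # positive: target is right of centre
    err_y = (y + h / 2) - FRAME_H / 2   # positive: target is below centre

    pan, tilt = state.pan, state.tilt
    if abs(err_x) > DEADBAND_PX:
        # Convert the pixel error to degrees, then apply the gain.
        pan -= GAIN * err_x * (FOV_H_DEG / FRAME_W)
    if abs(err_y) > DEADBAND_PX:
        tilt += GAIN * err_y * (FOV_V_DEG / FRAME_H)

    return PanTilt(clamp(pan, ANGLE_MIN, ANGLE_MAX),
                   clamp(tilt, ANGLE_MIN, ANGLE_MAX))
```

Calling track_step once per processed frame moves the camera a fraction of the way toward the target each time. The deadband keeps the servos still when the target is already near the centre, and the clamp keeps commands within the servo's travel.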
The system is built in three phases. Phase 1 (completed) provides live video streaming and pan-tilt control on the ESP32-CAM. Phase 2 (planned) will add on-device person detection and automatic tracking using Edge AI. Phase 3 (planned) will introduce alerts, recording, and mobile notifications, completing a fully autonomous surveillance system.
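As an illustration of how the planned alerts could stay lightweight, the sketch below builds a compact JSON message only when a detection is confident enough and a cooldown period has passed. It is a host-side Python sketch under stated assumptions: the topic name, confidence threshold, cooldown, and payload fields are hypothetical, and the code only prepares the message; publishing it over MQTT would be done by whatever client library the firmware uses.

```python
import json

ALERT_TOPIC = "surveillance/cam1/alert"  # hypothetical topic name
MIN_CONFIDENCE = 0.6                      # assumed detection threshold
COOLDOWN_S = 10.0                         # assumed minimum gap between alerts, in seconds


class AlertGate:
    """Decides whether a detection should produce an alert message."""

    def __init__(self):
        self.last_sent = None  # time of the last alert, or None if none sent yet

    def maybe_alert(self, label, confidence, now_s):
        """Return (topic, payload) if an alert should be sent, else None."""
        if confidence < MIN_CONFIDENCE:
            return None  # detection too weak to report
        if self.last_sent is not None and now_s - self.last_sent < COOLDOWN_S:
            return None  # still inside the cooldown window
        self.last_sent = now_s
        # Compact separators keep the payload small for a constrained link.
        payload = json.dumps(
            {"label": label, "conf": round(confidence, 2), "t": now_s},
            separators=(",", ":"),
        )
        return ALERT_TOPIC, payload
```

A message of this kind is a few dozen bytes, compared with the continuous video stream a conventional system would send, and the cooldown prevents a single event from generating a burst of notifications.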
The literature review shows that existing work typically addresses video streaming, servo control, or edge AI in isolation rather than as a unified system. The proposed project fills this gap by integrating all three into a single low-cost embedded platform.
Conclusion
This paper presented the design, development, and testing of an Edge-AI-enabled smart surveillance system, covering Phase 1 of a modular, multi-phase project. The prototype integrates live video streaming and dual-axis camera movement on the ESP32-CAM. Testing confirmed reliable real-time monitoring and remote access over Wi-Fi, which provides a foundation for adding on-device intelligence in the next phase. The system demonstrates a practical application of IoT, embedded systems, and Edge AI to intelligent surveillance, with relevance for homes, laboratories, campuses, and office environments.
References
[1] Espressif Systems, ESP32-CAM Wi-Fi Camera Module: Datasheet, Hardware Specifications, and Application Notes, Espressif Systems, 2019.
[2] P. Warden and D. Situnayake, TinyML: Machine Learning with TensorFlow Lite for Microcontrollers. Sebastopol, CA, USA: O’Reilly Media, 2019, covering embedded ML deployment on low-power devices.
[3] Z. Shelby et al., “Embedded Machine Learning for Real-Time Image Classification on Microcontrollers,” Edge Impulse Technical Documentation, 2021, explaining edge AI model deployment.
[4] N. Kolban, “ESP32 Camera Web Server and Servo Motor Control using Arduino IDE,” Technical Tutorial Series on ESP32 Applications, 2020.
[5] M. Banzi and D. Cuartielles, “Servo-Based Pan-Tilt Mechanism for Camera Systems,” Arduino Project Resources and Hardware Interfacing Guides, 2018.
[6] MIT Media Lab Researchers, “Embedded Vision and Real-Time Object Tracking on Low-Power Devices,” Research Publications on Edge Vision Systems, 2020.
[7] Random Nerd Tutorials, “ESP32-CAM Video Streaming and Pan-Tilt Control Projects,” Online Technical Guide for IoT and Embedded Vision, 2022.
[8] Espressif Systems, ESP32 Technical Reference Manual, Espressif Systems, 2020, detailing architecture, peripherals, and programming guidelines.
[9] Edge Impulse, “Deploying Person Detection Models on ESP32 Devices,” Edge Impulse Learning Resources for TinyML, 2022.
[10] Various Authors, “Edge-AI Based Surveillance and Real-Time Object Tracking Using Embedded Systems,” Recent Journal and Conference Publications in Embedded AI, 2023.