Authors: Lakshya Jain, Upendra Bhanushali, Rajashree Daryapurkar
DOI Link: https://doi.org/10.22214/ijraset.2022.47142
Certificate: View Certificate
This paper describes the design and implementation of a deployable, multi-modal system executed using Deep Learning with Computer Vision. The system was majorly created using Transfer Learning Algorithms, GoogleNet, and MATLAB, which were then made portable and deployable using a Raspberry Pi with Pi Cam. The neural networks were trained on a dataset consisting of several images and is highly accurate. This system consists of Fire Detection, Face and Intruder Detection, Object Identification, and Animal Identification, and follow me features which are meant to be deployed on a mobile robot such as an autonomous rover, drone, glider, and aquatic robot.
Forest environments are an essential ecosystem for most animal species around the world. The world's forests are central to several major developmental and environmental issues like climate change, sustainability, and biodiversity. In the past few centuries, around forty-five percent of the earth's original forest cover has been lost due to urbanization, deforestation, degradation, and fragmentation of forests. Several malicious activities like hunting and poaching endangered species cause biodiversity problems and hence compromise the health of the forest. It is necessary to ensure that the ecosystem's health, such as forest areas, is adequately maintained. For the sustainability of forests and countries, it is necessary to monitor and track the health of forest areas. Global forest assessments provide information on development, changes, or progress in forests and forestry, which is required for decision-making on a national and international level. Therefore, collecting and compiling information related to our forest areas is essential. Since we have limited tools for monitoring the forest, this is where the need for innovation rises.
The main aim of our project is to design a system equipped with computer vision and a deep neural network capable of identifying fires, objects, and animals and making it such that it can be deployed onto an unmanned/manned vehicle. This system will be highly beneficial for the surveillance and health maintenance of remote forest areas as it will enable us to remotely monitor areas of the forest that we cannot access using traditional methods. It will also relieve humans of the risks and responsibility of going on expeditions into the forest by themselves. It also allows us to keep track of forest fires and animal populations and identify any malicious activities in the forest areas. Forests are an essential asset to our existence, and this makes the preservation of forests extremely important. Surveillance provides crucial forest data, such as human interference and early forest fire alerts, but multiple camera systems can be costly. The advantages of a surveillance rover are that it provides an actual time feed on our ground station(laptop), which can be recorded or post-processed; it is robust and more reliable than a CCTV camera which can leave blind spots. It makes autonomous decisions when an obstacle is encountered, identifies animals, and detects fire that can alert us about forest fires. While visiting a location which is dark or being on an expedition, it is always better to have someone carry your things for you, for example hold a torch or capture a feed while you focus on your activities. That's where an object following function plays its part. Utilizing robots for mundane tasks such as providing assistance makes it more reliable and precise. The basic idea behind the implementation of a computer vision enabled robot is to make it identify an object of a particular pattern by training it on the images captured by its camera and making it react to what it sees using the camera in real time. Robots with capabilities of following particular objects, living things, and humans are being actively researched for their potential to carry out routine tasks like carrying loads, collaborating with humans, and monitoring certain activities. Recent innovations in sensors and computer vision have helped us create more intuitive robots which can live with humans by utilizing their perception for movement estimation and obstacle avoidance. Object/People following robots can be used in purposes such as shopping carts for supermarkets, tour guides, and for carrying loads for industrial automation, hospitals, and other public places. In our case, the following feature has been deployed on a forest surveillance rover which can be used in forest related activities. This is achieved by doing the processing on the received camera feed from the rover on our laptop using MATLAB’s computer vision toolbox and MATLAB support package for Raspberry Pi.
II. LITERATURE REVIEW
Sahu et al.  have discussed how an autonomous vehicle's hardware setup is developed. OpenCV, the popular Computer Vision tool, is a controller of this robot. Images captured by a raspberry pi camera are processed by OpenCV and are sent as the signal in digital form to do the following action.
An algorithm has been developed based on color detection for navigating the vehicle in various directions. High-level algorithms such as 3D scene mapping, object recognition, and object tracking make the decisions by application. The raspberry pi camera captures images of the desired trajectory and symbols for making decisions. For the direction and movement of the vehicle, the images are processed to produce different commands.
Different symbols like stop left turn, and right turn is used as input. They verify the efficacy of the proposed algorithm experimentally.
Wu and Zhang  have worked on three crucial problems of forest fire detection: real-time detection, early fire detection, and false detection. To achieve real-time fire detection, they have implemented several object detection algorithms, such as Faster R-CNN, SSD, and Yolo V3. It is observed that SSD has the highest accuracy of 99.8% for fire and 68.4% for smoke detection. SSD detects the area of the fire by four values: Xmin, Ymin, Xmax, and Ymax.
These values represent the coordinates of the top left corner and the coordinates of the lower right corner; from these values, we calculate the area.
We catch the two interval frames of the fire; when the area grows more prominent, this must be the fire. This paper compares the performances of SSD and YOLO, and the results prove that SSD performs better in terms of detection accuracy and faster frame rate. The limitation is that these methods need a faster GPU, as NVidia 1070 has been used here.
Kumari and Sanjay  have worked on a robot car that can conduct surveillance operations in remote places, which may be harmful and difficult for humans to reach.
The authors have designed an android application that sends instructions to the robot through Bluetooth or Wi-Fi for navigation. The authors have created a mobile application made with MIT application creator. A smartphone is attached to the robot, along with the ESP8266 nodeMCU module, allowing the user to control it and provide a sound and video feed to the ground station. They have used an L298N HBridge motor driver to interface the Wi-Fi module with the motors. Audio and video feed are acquired using the mobile phone attached to the robot via an IP WEBCAM mobile app. They have also used the YOLO Algorithm for object detection, which identifies several objects.
S. Liawatimena et al.  aim at solving the problem of maintaining organized marine and fishery statistics according to the law in Indonesia in this paper.
They have developed a Matlab and transfer learning-based fish classification model on the dataset of fish images to solve the problem. Modified from the pre-trained network AlexNet they have made a network called FishNet. FishNet has a dataset of nearly fifteen thousand images divided into five thousand for each category of fish type. The three categories of fish are Coryphaena Hippurus, Euthynnus Affinis, and Katsuwonus Pelamis.
Dataset is split into 70:30, 70 for training, and 30 for the validation set. MATLAB is used for training purposes. The paper covers the concept of convolution neural networks and how a pre-trained network be used to make our network with high accuracy and less processing. FishNet, as it utilizes Alexnet, can achieve higher accuracy. FishNet achieves a validation accuracy of 99.63%.
In this paper, F. Yuesheng et al.  have worked on an optimized GoogleNet with the same accuracy and how it performs compared to other convolutional neural networks.
First, they use GoogleNet to classify tomatoes, pomegranates, oranges, lemons, apples, and colored peppers. To solve the problem of the low training speed of the GoogleNet network, they adjusted the structure of Inception and reduced the number of convolutional kernels of the model.
Then they increased the model's accuracy by 2% by making some changes. Then they compared the optimized GoogleNet to the other CNN like ResNet, AlexNet, VGG, and DenseNet.
The performance was increased to 98.82% and the speed to 33.68 sheets/sec.
We also explored various papers related to Computer Vision using Pi Cam, Deep Neural Networks, and its various implementations on systems similar to ours.
We then took a deep dive into MATLAB with Deep Learning Toolbox, Raspberry Pi Toolbox, and transfer learning to understand better the methods and concepts we wanted to implement in our project.
A. Block Diagram of our System
The following is a block diagram for our proposed solution for fire, object, and animal identification features on a mobile robot.
B. Raspberry Pi Setup for MATLAB Deep Learning Implementation
To start working with the Raspberry Pi and all the functions and programs involved, we opted to go for a 'headless setup' of the Raspberry Pi to be able to use it with our computer. Since we are working with MATLAB, we had to use a modified version of Raspbian OS with Deep Neural Network capabilities. This OS and the MATLAB Support Package for Raspberry Pi are needed for using MATLAB with Raspberry Pi.
C. Face Detection
The first feature we implemented using the Raspberry Pi-MATLAB system was face detection and recognition using the Computer Vision Toolbox in MATLAB. In our system, we access the Raspberry Pi's camera view over Wi-Fi using the Raspberry Pi Toolbox and Webcam toolbox in MATLAB.
D. Object Detection Algorithm:
???????E. ???????Animal Identification Algorithm:
Using Transfer Learning and a modified version of GoogleNet as our primary neural network, we have trained a deep neural network to identify and classify different 45 species of animals. This new network is called ‘animalnet’.
The Results for the Implementations are as follows:
a. Using image processing and pre-trained Deep Neural Network on MATLAB on
b. our Ground Station along with fire detection feature.
c. We have achieved a validation accuracy of 92.61% for animal identification.
d. We have also deployed the network to take input from the Raspberry Pi Camera
e. and it gave us highly accurate results.
F. ??????????????Fire Detection Algorithm
Using Transfer Learning with GoogleNet as our base network, we also designed a neural network to detect and identify a fire in the image data stream from the PiCam. The new network was named ‘firenet’.
The Observations from the Fire Detection Algorithm Implementation were:
a. Using Transfer Learning with GoogleNet as our base network, we also designed a neural network to detect and identify a fire in the image data stream from the PiCam.
b. The validation accuracy for fire detection is 98%.
C. We have also deployed the network to take input from the Raspberry Pi Camera
and it gave us highly accurate results.
???????G. Follow Me Feature Implementation
An image detection-based Follow-Me feature for a rover can be a very important feature for purposes such as forest expeditions, videography and for tasks such as carrying heavy loads in warehouse robots.
???????H. MATLAB App GUI
1) Thus, we have created a small-scale project wherein we have made a deployable system for face detection, fire detection, animal and object identification. 2) This system illustrates a low-cost, intelligent and portable solution which can be deployed on any rover, drone or aquatic robot easily. 3) We have created a solution involving mobile robots with surveillance cameras that can perform various kinds of vision-related tasks, enabling us to use it in various systems and backgrounds. 4) We have thus created a highly sophisticated system in a simple, usable form with an impressively accurate output. 5) The validation accuracy for ‘animalnet’ is 92.61%, while that of ‘firenet’ is 98%. 6) The implementation of Follow-Me Function for a mobile rover was successful and we are able to track and follow a red object. 7) We have created a GitHub Repository with our entire source code, MATLAB code as well as datasets. https://github.com/flux04/forest_surveillance_robot
 B. K. Sahu, B. Kumar Sahu, J. Choudhury and A. Nag,” Development of Hardware Setup of an Autonomous Robotic Vehicle Based on Computer Vision Using Raspberry Pi,” 2019 Innovations in Power and Advanced Computing Technologies (i-PACT), 2019, pp. 1-5, doi: 10.1109/i-PACT44901.2019.8960011.  Wu and L. Zhang,” Using Popular Object Detection Methods for Real Time Forest Fire Detection,” 2018 11th International Symposium on Computational Intelligence and Design (ISCID), 2018, pp. 280-284, doi: 10.1109/ISCID.2018.00070.  R. Kumari and P. S. Sanjay,” Smart Surveillance Robot using Object Detection,” 2020 International Conference on Communication and Signal Processing (ICCSP), 2020, pp. 0962-0965, doi: 10.1109/ICCSP48568.2020.9182125.  S. Liawatimena et al.,” A Fish Classification on Images using Transfer Learning and Matlab,” 2018 Indonesian Association for Pattern Recognition International Conference (INAPR), 2018, pp. 108-112, doi: 10.1109/INAPR.2018.8627007.  F. Yuesheng et al.,” Circular Fruit and Vegetable Classification Based on Optimized GoogleNet,” in IEEE Access, vol. 9, pp. 113599-113611, 2021, doi:10.1109/ACCESS.2021.3105112.  Budiharto, Widodo.” Intelligent surveillance robot with obstacle avoidance capabilities using neural network.” Computational intelligence and neuroscience 2015(2015).  A. A. Shah, Z. A. Zaidi, B. S. Chowdhry and J. Daudpoto,” Real time face detection/monitor using raspberry pi and MATLAB,” 2016 IEEE 10th International Conference on Application of Information and Communication Technologies (AICT), 2016, pp. 1-4, doi: 10.1109/ICAICT.2016.7991743.  E. Elbasi,” Reliable abnormal event detection from IoT surveillance systems,” 2020 7th International Conference on Internet of Things: Systems, Management and Security (IOTSMS), 2020, pp. 1-5, doi: 10.1109/IOTSMS52051.2020.9340162.  R. Huang, J. Pedoeem and C. Chen,” YOLO-LITE: A Real-Time Object Detection Algorithm Optimized for Non-GPU Computers,” 2018 IEEE International Conference on Big Data (Big Data), 2018, pp. 2503-2510, doi: 10.1109/Big-Data.2018.8621865.   Sakali, Nagaraju, and G. Nagendra.” Design and implementation of Web Surveillance Robot for Video Monitoring and Motion Detection.” International Journal of Engineering Science and Computing (2017): 4298-4301. 
Copyright © 2022 Lakshya Jain, Upendra Bhanushali, Rajashree Daryapurkar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET47142
Publish Date : 2022-10-20
ISSN : 2321-9653
Publisher Name : IJRASET
DOI Link : Click Here