Approximately 58% of India's population is involved in agriculture directly or indirectly, and the sector contributed about 19.9% of India's GDP in the 2020-2021 financial year. According to a report published by ICAR (Indian Council of Agricultural Research), about 30-35% of the annual crop yield is lost to pests and diseases, which affects the income and livelihood of farmers. With advances in deep learning and computer vision, it is now possible to detect plant diseases effectively by observing the disease patterns on plant leaves, which helps farmers identify the disease affecting their plants. In this study, about 12,500 publicly available images of healthy and infected plant leaves were used to train a deep learning model that classifies the corresponding disease.
Manual identification of disease is common practice in agriculture: a farmer tries to match the disease pattern on the leaves against his or her own experience or, in rare cases, consults a scientist. Farmers do not always identify a crop disease correctly, and in many cases the process is one of trial and error. Early identification and prevention of disease also play a significant role in the overall crop yield.
With advances in deep learning and computer vision technologies, automatic disease detection from images of plant leaves is now possible.
Deep learning is a set of machine learning techniques that employ artificial neural networks, which are loosely modeled on the neural networks of the human brain. A convolutional neural network (CNN) is a type of deep neural network most commonly used to analyze images. It automatically identifies patterns in image data by learning rules that are stored as weights and biases and subsequently used for prediction.
In this project, 12,500 publicly available images from various sources were used to train a convolutional neural network, a type of model that performs exceptionally well at identifying patterns in images.
II. CONVOLUTIONAL NEURAL NETWORKS
Deep learning is a subset of machine learning that uses artificial neural networks to learn features from data and infer rules. Today, deep learning is in high demand in almost every field, from fraud detection in banking to mobile phones, AI-proctored examinations, and speech and video analysis.
Convolutional neural networks are a type of neural network that differs from a plain artificial neural network (ANN) in applying the convolution operation to the input data. CNNs are highly effective at learning features from data such as images; the weights and biases learned in this way are later used for prediction.
A CNN is made up of four types of layers: the input layer, the convolution (CONV) layer, the pooling layer, and the fully connected layer.
A. Convolution Layer
This layer extracts features from the input image. A convolution is performed between the input image and a filter of a specific size (say N x N) by sliding the filter across the input image and computing the convolution product for each segment. The output of this layer is known as a feature map, and it contains information about image features such as edges and corners. The feature map is then passed to subsequent layers.
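The sliding-filter operation can be sketched in plain Python. This is a minimal illustration, not the implementation used in this study: it assumes a single-channel image, no padding, and a stride of 1, and, like most deep learning libraries, it does not flip the filter (strictly a cross-correlation). The image and filter values are hypothetical.

```python
def convolve2d(image, kernel):
    """Slide an N x N kernel over the image (no padding, stride 1)."""
    n = len(kernel)
    out_h = len(image) - n + 1
    out_w = len(image[0]) - n + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Sum of element-wise products between the kernel
            # and the image patch whose top-left corner is (i, j).
            total = 0
            for a in range(n):
                for b in range(n):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2x2 filter that responds to left-to-right intensity increases.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The non-zero column in the resulting feature map marks the position of the edge, which is the kind of feature information the layer passes on.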
B. Pooling Layer
A pooling layer is usually added after the CONV layer. Its primary goal is to reduce the size of the convolved feature map. This in turn reduces the number of parameters to be learned in later layers and the time needed to learn them, lowering the computation cost. Pooling also helps reduce overfitting, that is, high training accuracy combined with low test accuracy.
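As an illustration of how pooling shrinks a feature map, the sketch below applies max pooling with a non-overlapping 2x2 window. The pooling type and window size are assumptions made for the example, since they are not specified here, and the feature map values are hypothetical.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window (stride = size)."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            # Keep only the largest value inside the current window.
            window = [
                feature_map[i + a][j + b]
                for a in range(size)
                for b in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map; pooling reduces it to 2x2.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 2], [2, 9]]
```

Each 2x2 block is replaced by a single value, so the map passed to later layers has one quarter as many entries.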
C. ReLU Correction Layer / Activation Layer
A non-linear ReLU (rectified linear unit) activation is applied after every CONV layer. ReLU is defined as:
ReLU(x) = max(0, x)
It replaces all negative values with zeros.
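A direct translation of this definition into Python, applied element-wise to a small hypothetical feature map, might look as follows.

```python
def relu(x):
    """ReLU(x) = max(0, x)."""
    return max(0.0, x)


def relu_map(feature_map):
    """Apply ReLU to every value of a 2D feature map."""
    return [[relu(value) for value in row] for row in feature_map]


# Hypothetical feature map containing negative values.
activated = relu_map([[-2.0, 0.5], [3.0, -0.1]])
print(activated)  # [[0.0, 0.5], [3.0, 0.0]]
```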
D. Fully Connected Layer
The fully connected (FC) layers form a feed-forward neural network that typically makes up the last layers of a CNN. The output of the previous layer is flattened before being fed to the FC layers, which are used to compute the probability of each class (i.e., for classification).
An FC layer takes an input vector and applies a linear combination followed by an activation function to produce a new output vector. The final layer returns an N-dimensional vector, where N is the number of classes in the image classification task.
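The flatten, linear combination, and activation steps can be sketched as below. Softmax is assumed as the final activation because it yields class probabilities; the activation actually used is not stated here. The weights, biases, and the three-class setup are hypothetical values chosen for illustration.

```python
import math


def flatten(feature_map):
    """Turn a 2D feature map into a 1D input vector."""
    return [value for row in feature_map for value in row]


def fully_connected(inputs, weights, biases):
    """One output per class: weighted sum of the inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(class_weights, inputs)) + b
        for class_weights, b in zip(weights, biases)
    ]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical pooled feature map and parameters for N = 3 classes.
features = flatten([[1.0, 2.0], [0.0, 1.0]])
weights = [
    [0.5, -0.2, 0.1, 0.0],
    [0.1, 0.4, 0.0, 0.3],
    [-0.3, 0.1, 0.2, 0.2],
]
biases = [0.0, 0.1, -0.1]

probabilities = softmax(fully_connected(features, weights, biases))
predicted_class = probabilities.index(max(probabilities))
print(predicted_class)  # 1
```

The output vector has one entry per class, and the index of the largest probability is taken as the predicted class.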
A. Dataset Acquisition
The image dataset for training the model was obtained from a variety of open-source sites, including the PlantVillage dataset. Images were downloaded manually and organized into folders according to their classes.
The collected dataset consists of 12,500 photos of plant leaves covering various crops and diseases.
B. Image Preprocessing
Images were resized during preprocessing to match the dimensions expected by the input layer. Color images at a resolution of 128x128 are used in this study.
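To make the resizing step concrete, the sketch below downsamples an image to 128x128 with nearest-neighbor sampling in plain Python. This is only an assumption for illustration: the resizing method used in this study is not stated, and in practice an imaging library would normally do this work. The synthetic 256x256 image is hypothetical.

```python
def resize_nearest(image, new_h, new_w):
    """Resize by picking, for each output pixel, the nearest source pixel."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


# Hypothetical 256x256 color image; each pixel is an (R, G, B) tuple.
image = [[(r % 256, c % 256, 0) for c in range(256)] for r in range(256)]

resized = resize_nearest(image, 128, 128)
print(len(resized), len(resized[0]), len(resized[0][0]))  # 128 128 3
```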
C. Model Building
A typical CNN model was built to train and test on the data. The design of the CNN model plays a vital role in the final accuracy and other results.
The proposed model was trained on the preprocessed dataset for about 50 epochs.
Features are first extracted from the input images using CONV and pooling layers, and classification is then performed by the fully connected layers.
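The way CONV and pooling layers progressively shrink a 128x128 input before the fully connected layers can be traced with simple shape arithmetic. The layer stack below (three blocks of 3x3 convolutions without padding, each followed by 2x2 pooling, with 32, 64, and 128 filters) is a hypothetical example; the exact architecture of the proposed model is not given here.

```python
def conv_output(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output(size, window):
    """Spatial size after non-overlapping pooling."""
    return size // window


def trace_shapes(input_size, channels, conv_filters, kernel=3, window=2):
    """Return the (height, width, channels) shape after each layer."""
    shapes = [(input_size, input_size, channels)]
    size = input_size
    for filters in conv_filters:
        size = conv_output(size, kernel)
        shapes.append((size, size, filters))  # after CONV
        size = pool_output(size, window)
        shapes.append((size, size, filters))  # after pooling
    return shapes


# Hypothetical stack applied to a 128x128 color input.
shapes = trace_shapes(128, 3, [32, 64, 128])
height, width, depth = shapes[-1]
flattened_length = height * width * depth

for shape in shapes:
    print(shape)
print(flattened_length)  # 25088
```

The flattened length is the size of the input vector that the first fully connected layer would receive under these assumptions.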
IV. EXPERIMENTAL SETTINGS
The collection contains roughly 12,500 photos, which include four different types of potato leaf diseases, four different types of tomato leaf diseases, four different types of guava leaf diseases, four different types of wheat leaf diseases, and four different types of rice leaf diseases. The neural network code was written in Python.
The photos, gathered from various web sources, were used for both training and testing. Data augmentation, namely rotating the photos by 20 degrees, flipping them, and shifting them horizontally and vertically, was also used to improve the image dataset. The model was trained with the Adam optimizer and a categorical cross-entropy loss for 50 epochs with a batch size of 32.
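Two of the ingredients named above, flip and shift augmentation and the categorical cross-entropy loss, can be sketched in plain Python. These are simplified illustrations rather than the code used in the experiments: the helper names are hypothetical, shifted-in pixels are assumed to be filled with zeros, rotation is left out because it requires interpolation, and the sample values are made up.

```python
import math


def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def shift_horizontal(image, dx, fill=0):
    """Shift right by dx pixels (left if dx < 0); assumes |dx| < width."""
    width = len(image[0])
    if dx >= 0:
        return [[fill] * dx + row[:width - dx] for row in image]
    return [row[-dx:] + [fill] * (-dx) for row in image]


def categorical_cross_entropy(one_hot_target, predicted_probs, eps=1e-12):
    """Loss for one sample: -sum(t * log(p)) over all classes."""
    return -sum(
        t * math.log(max(p, eps))
        for t, p in zip(one_hot_target, predicted_probs)
    )


# Augmentation on a tiny hypothetical single-channel image.
tiny_image = [[1, 2, 3], [4, 5, 6]]
flipped = flip_horizontal(tiny_image)      # [[3, 2, 1], [6, 5, 4]]
shifted = shift_horizontal(tiny_image, 1)  # [[0, 1, 2], [0, 4, 5]]

# Loss for a sample whose true class is index 1.
target = [0, 1, 0]
loss_confident = categorical_cross_entropy(target, [0.05, 0.90, 0.05])
loss_unsure = categorical_cross_entropy(target, [0.40, 0.20, 0.40])
print(loss_confident < loss_unsure)  # True
```

The loss is small when the model assigns high probability to the correct class and grows as that probability falls, which is the quantity the optimizer drives down during training.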
All experiments were performed on an HP 15-ec1105AX with a hexa-core Ryzen 5 processor, 8 GB of RAM, and a 512 GB SSD.
V. RESULTS AND EVALUATION
The deep learning model was trained for 50 epochs and attained an accuracy of 98%. When tested on random image samples, the model reached a maximum accuracy of 99%. The epoch vs. accuracy plot for the training and validation data is shown in Figure 7(i), and the epoch vs. loss curve is shown in Figure 7(ii).
Using convolutional neural networks, this project was able to detect and recognize 16 different plant types and plant diseases. The trained model has been deployed on Amazon Web Services and can be used to classify diseases in plant leaves.
In future work, additional crops and diseases may be added, along with a feature that suggests prevention techniques or pesticides for the predicted disease.
D. P. Hughes and M. Salathé, "An open access repository of images on plant health to enable the development of mobile disease diagnostics," arXiv:1511.08060, 2015.
S. V. Militante, B. D. Gerardo, and N. V. Dionisio, "Plant Leaf Detection and Disease Recognition using Deep Learning," 2019 IEEE Eurasia Conference on IOT, Communication and Engineering, 2019.
S. V. Militante, B. D. Gerardo, and N. V. Dionisio, "Sugarcane Disease Recognition using Deep Learning," 2019 IEEE Eurasia Conference on IOT, Communication and Engineering, 2019.
O. Kulkarni, "Crop Disease Detection Using Deep Learning," IEEE, 2018.
M. M. Anwar, Z. Tasneem, and M. A. Masum, "An Approach to Develop a Robotic Arm for Identifying Tomato Leaf Diseases using Convolutional Neural Network," 2021 International Conference on Automation, Control and Mechatronics for Industry 4.0 (ACMI), 2021.
L. K. Hema, D. Vijendra Babu, A. Navaneetharajan, K. Vijayakumar, and S. Dhayanithi, "Agriculture Resources for Plant-Leaf Disease Identification using Deep Learning Techniques," Journal of Physics: Conference Series, 2021.