Skin Cancer Detection Using Deep Learning Techniques

Authors: Hrithik Singh, Shambhavi Kaushik, Shruti Talyan, Kartikeya Dwivedi

DOI Link: https://doi.org/10.22214/ijraset.2022.43090

Abstract

Skin cancer is a major health problem across the world. Early detection and diagnosis are very important for its further treatment. Artificial Intelligence has progressed considerably in healthcare and diagnosis, and skin cancer can therefore also be detected using Machine Learning and AI. In this research, we use convolutional neural networks (CNNs) for image processing and recognition. The models implemented are VGG-16, MobileNet and Inception-V3. The paper also reviews different AI-based skin cancer detection models. We use transfer learning to reuse pre-trained models, and a model is also built from scratch using CNN blocks. A web app built with HTML, CSS and Flask is also featured: the user uploads the image to be diagnosed and the app predicts the result. The pre-trained models and the new model built from scratch are compared to find the best-performing model for detecting skin cancer from images, and the web app delivers the result at the user end. The methodology used in this paper, if implemented, is expected to improve early skin cancer detection using deep learning methods.

I. INTRODUCTION

Skin cancer is one of the most dangerous and common diseases affecting mankind. The two main types of skin cancer that affect humans are melanoma and non-melanoma. Melanoma is the most dangerous type: although it accounts for less than 5% of all cases, it has a high mortality rate [1]. The World Health Organization also estimates a high number of melanoma cases across the world. Skin self-examination and clinical skin examination are the conventional methods of skin cancer detection [2].

However, these methods of examination are laborious and troublesome. They require many in-person visits by the patient and are also very expensive. These examinations also require specialized tools such as microspectroscopy and laser-based tools [3], which take effort and a lot of training to operate.

The smartphone revolution has allowed patients to share images by phone for skin cancer diagnosis. However, these images may not be of standard quality, which may lead to inaccurate diagnosis, and privacy may be compromised when images are shared over the internet. With the evolution of AI, interaction between humans and AI has increased day by day, and AI may assist doctors in their decision making.

The use of AI will also reduce the human error involved in diagnosis. Despite the presence of such AI technology, an expert physician is still required. The aim of this research is to use deep learning and CNNs for the early detection of skin cancer. A model is trained on the available skin cancer data, and its accuracy improves as it is trained on more data.

This work does not use a multilayer neural network [4]; instead it uses deep learning with deep multi-layered networks, which involves training on very large data sets to improve the accuracy of the system. These AI methods for detecting skin cancer are cheap, easy to use and accessible. AI-based technology also offers more capabilities and features than the conventional methods of skin cancer detection. The process of skin cancer detection using AI involves feeding the image to the model, segmenting it and processing it to classify the type of skin cancer. Deep learning has brought a revolution in the field of AI. The algorithms used in deep learning are inspired by the human brain. Deep learning [5] has improved on the results of machine learning and AI. This research reviews the literature on different classical deep learning methods used for the diagnosis of skin cancer and their methodology. In this paper a new model is built from scratch using CNN blocks in order to obtain improved results. The established models VGG-16 [6], MobileNet [7] and Inception V3 [8] are also used. Transfer learning is used to reuse a pre-trained model for better accuracy. Thus, this model for skin cancer detection makes diagnosis and detection easier than the conventional methods.

II. LITERATURE REVIEW

With advances in Artificial Intelligence and machine learning, the detection and diagnosis of skin cancer have made great progress over the years. In an effort to improve the accuracy of detection and diagnosis, researchers have employed different algorithms, models and techniques.

Catarina Barata and Jorge S. Marques [9] employed hierarchical architectures using deep learning for the detection of skin cancer. They observed that dermatologists diagnose skin lesions in a hierarchical way, whereas automated systems do not follow this hierarchy and instead consider all types of skin lesions at once. In their research they used a deep learning method to perform hierarchical diagnosis of skin cancer. They reported that it is convenient to analyse and sort images of skin lesions in a structured way, and they provided substantial evaluation criteria for diagnosis using deep learning.

Yoonsik Kim, Insung Hwang and Nam Ik Cho [10] suggested the use of two CNNs and two training methods for the detection and diagnosis of skin cancer. The first network was a VGG-style network with 20 layers and 3x3 filters. The second was a modification of the Inception model that used 20 NiN (Network-in-Network) layers. Both patch-based and image-based training were considered. The first method classified skin on the basis of colour and texture, while the second also took human shape features into account in addition to colour and texture. The study suggested that the CNN-based method gave better accuracy and response than traditional detection methods. The results suggested that whole-image-based training gave better accuracy than patch-based training, and that the NiN structure gave improved results over the VGG structure.

Le Thu Thao and Nguyen Hong Quang [11] also conducted research on the detection of skin cancer using deep learning. The study addressed skin lesion analysis for melanoma detection. It solved two problems in skin cancer detection by applying deep learning to images of skin tumours. The models were trained on the standard datasets provided by the International Skin Imaging Collaboration (ISIC) 2017. The results showed better accuracy and performance than previously suggested deep learning techniques.

In 2018, Jainesh Rathod et al. [12] studied the diagnosis of skin cancer using CNNs. The study noted that diagnosis in medical dermatology using AI techniques is difficult and unpredictable because of its complexity. Dermatologists need to perform different tests on the patient depending on the skin condition and symptoms, and the outcome may vary from expert to expert. A system is therefore needed that is not affected by such constraints. The authors used machine learning classification to recognize cancer images with an automated system. The system uses a computational process to analyse the various features of the image. The images are processed to enhance their quality. Features are extracted using a CNN, and a softmax classifier assigns the image to a class, which is output as the diagnosis result. This technique resulted in better overall performance and accuracy of the system.

III. METHODOLOGY

The purpose of this project is to identify and diagnose skin cancer in human beings from images of lesions or infected skin submitted by users through devices such as cameras and mobile phones. This is a binary classification problem. The input is an image of a skin lesion acquired by a camera, and the task is to classify whether the lesion is cancerous or not. The output is a binary label y ∈ {0, 1}, corresponding to healthy or unhealthy skin. With state-of-the-art technology, such diseases can be recognized without a visit to a doctor: deep learning classifiers can be deployed to distinguish healthy from unhealthy skin.

Two families of techniques can be used to recognize disease in humans: traditional machine learning methods and advanced deep learning-based methods. Traditional methods rely on the extraction of handcrafted features. After preprocessing, features are extracted and machine learning classifiers assign the images to their corresponding classes. Handcrafted features describe shape, texture and colour. Haralick texture features and Local Binary Patterns (LBP) [13] are used to extract texture information, and Hu moments are used to capture shape information. Colour information is taken from the three RGB channels, with the mean and standard deviation calculated separately for each channel, together with techniques such as Gaussian filters. Machine learning algorithms such as Naive Bayes, Logistic Regression, K-Nearest Neighbor (KNN), Support Vector classifiers and Random Forest may then be used to group images into different categories; however, this is a time-consuming and computationally demanding task.

In order to achieve better results, deep learning-based methods are employed [14]. Convolutional Neural Networks (CNNs) [15] are a type of neural network that may be utilized for a variety of such applications.
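
As a rough illustration of the handcrafted colour features described above, the sketch below computes the mean and standard deviation of each RGB channel for a tiny image stored as nested Python lists. The function name `channel_stats` and the toy pixel values are assumptions made for this example; they do not come from the dataset used in this work.

```python
import math

def channel_stats(image):
    """Return [(mean, std), ...] for the R, G and B channels.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    stats = []
    for c in range(3):
        values = [pixel[c] for row in image for pixel in row]
        mean = sum(values) / len(values)
        # Population standard deviation of the channel intensities.
        var = sum((v - mean) ** 2 for v in values) / len(values)
        stats.append((mean, math.sqrt(var)))
    return stats

# Hypothetical 2x2 RGB image (values chosen only for illustration).
toy_image = [
    [(10, 0, 255), (20, 0, 255)],
    [(30, 0, 255), (40, 0, 255)],
]
features = channel_stats(toy_image)
```

The six resulting numbers (one mean and one standard deviation per channel) would form part of the feature vector passed to a classical classifier such as KNN or Random Forest.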

Images of skin lesions are used to recognize the type of disease instead of traditional approaches such as repeated, time-consuming visits to a clinic for examination, which also overload doctors. With such state-of-the-art methods, features such as shape, colour, length and width do not need to be extracted explicitly. Traditional technologies [16] rely on colour and shape extraction, which is a difficult and laborious process; with CNNs this step is learned from the data rather than designed by hand. Learning strategies such as transfer learning can be used to improve identification and obtain better accuracy. Pre-trained models such as VGG 16/19, MobileNet and Inception-v3/5 can be used for feature extraction and classification of skin cancer.

Convolutional Neural Networks (CNNs) [17] are widely used in image classification and recognition. These networks propagate signals both forward and backward. Their neurons are organized in multiple layers that are connected in a way that resembles the visual cortex of vertebrates. They are called convolutional because they make use of a mathematical operation known as convolution (*). A CNN is constructed from several kinds of layers, including padding, pooling, batch-normalization and convolutional layers, connected to each other sequentially. Convolutional layers extract image features. Padding layers keep all inputs in the same shape. Pooling layers reduce the size of the extracted feature maps while preserving their characteristics. Normalization layers normalize the output of the previous layers to achieve better convergence. Flatten layers convert the features into a one-dimensional vector, and fully connected layers perform the classification. Dropout layers and hyper-parameter tuning are employed to prevent such networks from overfitting.
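
To make the roles of the pooling and flatten layers concrete, here is a minimal pure-Python sketch of 2x2 max pooling followed by flattening on a single-channel feature map. The helper names and the toy values are illustrative assumptions; in practice a framework such as Keras would provide these layers.

```python
def max_pool2d(fmap, size=2, stride=2):
    """Max pooling over a 2-D feature map given as a list of rows."""
    out = []
    for i in range(0, len(fmap) - size + 1, stride):
        row = []
        for j in range(0, len(fmap[0]) - size + 1, stride):
            # Collect the values under the pooling window and keep the largest.
            window = [fmap[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window))
        out.append(row)
    return out

def flatten(fmap):
    """Turn a 2-D feature map into a 1-D list, row by row."""
    return [v for row in fmap for v in row]

# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
pooled = max_pool2d(feature_map)   # 4x4 -> 2x2
vector = flatten(pooled)           # 2x2 -> length-4 vector
```

Pooling halves each spatial dimension here while keeping the strongest response in every window, and the flattened vector is what a fully connected layer would receive.
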
An image is first transformed into a tensor of shape (image width, image height, number of channels), which is then fed into the CNN and convolved with a filter (kernel), which may be regarded as a sliding window for feature extraction [18]. The kernel is applied with a given stride, which, together with the padding, determines the width and height of the output volume:

$$O = \frac{W - K + 2P}{S} + 1$$

where O is the output size, W is the input size of the image, K is the kernel size applied, P is the padding applied, and S is the stride applied.
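
The quantities W, K, P and S defined above are conventionally related to the output size by floor((W - K + 2P) / S) + 1. The sketch below, written under that assumption, implements this relation next to a naive sliding-window convolution so that the two can be compared. The function names, the toy image and the 2x2 filter are hypothetical; as in most CNN frameworks, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
def conv_output_size(w, k, p, s):
    """Output width/height of a convolution: floor((W - K + 2P) / S) + 1."""
    return (w - k + 2 * p) // s + 1

def conv2d(image, kernel, padding=0, stride=1):
    """Slide `kernel` over a square single-channel `image` (list of rows)."""
    k = len(kernel)
    # Zero-pad the image on all four sides.
    size = len(image) + 2 * padding
    padded = [[0] * size for _ in range(size)]
    for i, row in enumerate(image):
        for j, v in enumerate(row):
            padded[i + padding][j + padding] = v
    out = []
    for i in range(0, size - k + 1, stride):
        out_row = []
        for j in range(0, size - k + 1, stride):
            # Weighted sum of the window currently under the kernel.
            acc = 0
            for di in range(k):
                for dj in range(k):
                    acc += padded[i + di][j + dj] * kernel[di][dj]
            out_row.append(acc)
        out.append(out_row)
    return out

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0], [0, -1]]  # hypothetical 2x2 filter

result = conv2d(image, kernel, padding=0, stride=2)
expected_size = conv_output_size(w=4, k=2, p=0, s=2)
```

For a 120x120 input, a 3x3 kernel, a padding of 1 and a stride of 1, `conv_output_size(120, 3, 1, 1)` gives 120, which is why padding of this kind keeps the spatial size unchanged.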

The purpose of this study is to employ CNNs to recognize and classify skin cancer. Transfer learning [16] is more successful and consistent in such real settings because the dataset used in this study comprises only 3297 images in total, which is a small dataset that may lead to overfitting when training a deep CNN from scratch. With this strategy a pre-trained CNN can be reused. The pre-trained CNN models were trained on the ImageNet dataset, which contains millions of images from thousands of distinct classes. Depending on the dataset, different kinds of transfer learning [19] can be used. In this work, all layers of the pre-trained base model are first frozen and its original fully connected layers are replaced with new fully connected layers that match the number of classes in the dataset. The training dataset is then used to train the resulting model. Several pre-trained CNNs are compared to obtain the best outcome: VGG-16, MobileNet and InceptionV3 are the pre-trained base models [20]. A new CNN model built from scratch with TensorFlow and Keras is also included. Table I shows the top-1 and top-5 accuracy of the pre-trained models on the ImageNet dataset.

IV. DATASET AND EXPERIMENTS

In this research work, the dataset of skin cancer images was collected from the Kaggle website. The dataset contains a total of 3297 images categorized into two classes, benign and malignant. The benign class indicates no critical issue and is labelled 0, while the malignant class indicates a potential threat and is labelled 1. Because only two classes are present, this is a binary classification task. The dataset is balanced, with almost the same number of images in each class, as shown in Fig. 1. The dataset is split into training and test sets in a ratio of 80:20, giving 2637 training images and 660 test images. Both sets are preprocessed, and each image is resized to image width = 120, image height = 120 and channels = 3, since RGB images are used. Data augmentation in the form of vertical and horizontal flipping is applied to the training dataset. Each pre-trained model is trained on the same training dataset for thirty epochs.
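
The 80:20 split and the flip augmentation can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the helper names, the fixed seed and the use of integer ids as stand-ins for image files are hypothetical, and the actual pipeline presumably relies on Keras utilities instead.

```python
import random

def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle `items` reproducibly and split them into train and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

def horizontal_flip(image):
    """Mirror an image (list of rows) left to right."""
    return [list(reversed(row)) for row in image]

def vertical_flip(image):
    """Mirror an image (list of rows) top to bottom."""
    return [list(row) for row in reversed(image)]

# 3297 placeholder image ids, matching the dataset size reported above.
image_ids = list(range(3297))
train_ids, test_ids = train_test_split(image_ids)
print(len(train_ids), len(test_ids))  # 2637 660

# Flips applied to a tiny 2x3 single-channel example.
tiny = [[1, 2, 3],
        [4, 5, 6]]
flipped_h = horizontal_flip(tiny)  # [[3, 2, 1], [6, 5, 4]]
flipped_v = vertical_flip(tiny)    # [[4, 5, 6], [1, 2, 3]]
```

Truncating 3297 x 0.8 to an integer reproduces the 2637 / 660 split reported in this section.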

A. Experiment 1: VGG-16 Model

VGG-16 is a pre-trained convolutional neural network with 16 layers, built from Conv2D and max-pooling layers. The model was trained on the well-known ImageNet dataset, which is used on a large scale for image classification and recognition tasks. In this work, all fully connected layers of the model are removed and all remaining layers are frozen. Three dense layers containing 256, 64 and 1 neurons respectively are then added.

The last layer, which is the output layer, contains only one neuron because this is a binary classification task. Each hidden dense layer uses the ReLU activation function, while the output layer uses the sigmoid activation function so that its single neuron can predict the two classes. To prevent overfitting, dropout layers with a rate of 0.2 are introduced between the dense layers. Early stopping and learning-rate reduction on plateau are also used for better training and to avoid overfitting. The model is optimized with the Adam optimizer and the loss function is binary cross-entropy. The model is first trained for 30 epochs with a learning rate of 1e-4 and has 21,153,985 total parameters.

ReLU Activation: ReLU is a widely used activation function because of its non-saturating and non-linear nature. It is computationally efficient because its derivative is simple, which makes training and convergence quick.
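
The simplicity of ReLU and its derivative, together with the sigmoid output and binary cross-entropy loss used in this experiment, can be seen in a few lines of plain Python. This is a scalar sketch for illustration only; the function names and the clipping constant `eps` are assumptions, and deep learning frameworks apply the same definitions element-wise to tensors.

```python
import math

def relu(x):
    """ReLU: passes positive inputs through and zeroes out the rest."""
    return max(0.0, x)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise (0 chosen at x = 0)."""
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    """Squash a real number into (0, 1); used for the single output neuron."""
    return 1.0 / (1.0 + math.exp(-x))

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Loss for one sample with label 0 or 1 and predicted probability y_prob."""
    # Clip the probability to avoid log(0).
    p = min(max(y_prob, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

# A malignant sample (label 1) predicted with increasing confidence
# gives a decreasing loss.
losses = [binary_cross_entropy(1, p) for p in (0.1, 0.5, 0.9)]
```

Because the ReLU derivative is either 0 or 1, back-propagation through a ReLU layer needs no exponentials, unlike the sigmoid.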

B. Experiment 2: MobileNet Model

MobileNet is an efficient deep CNN model with an elegant architecture of 92 layers. It is lightweight in comparison to other models because it is based on depth-wise separable convolutional layers. Such networks are used in a wide range of mobile and embedded applications because of their low weight. In this work, all layers of the pre-trained MobileNet model are first frozen and its fully connected layers are replaced by new fully connected layers with the same configuration as in the VGG-16 model above. All other hyper-parameters, the optimizer and the loss function remain the same as in the previous model. After compiling, the model, which has 16,090,689 total parameters, is trained on the same training dataset and tested on the test dataset. Fig. 3 shows the accuracy curve against the epochs for both training and validation data.
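
The parameter totals reported for the VGG-16 and MobileNet experiments can be cross-checked with simple arithmetic. The sketch below rests on several assumptions that are not stated in the text: that a Flatten layer sits between the frozen base and the 256-64-1 dense head, that the convolutional bases have 14,714,688 (VGG-16) and 3,228,864 (MobileNet) parameters as in the Keras applications without their top layers, and that the final feature maps are 7x7x512 and 7x7x1024. Dropout layers add no parameters.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def head_params(n_features, units=(256, 64, 1)):
    """Parameters of a 256-64-1 dense head on a flattened feature vector."""
    total, n_in = 0, n_features
    for n_out in units:
        total += dense_params(n_in, n_out)
        n_in = n_out
    return total

# Assumed sizes of the convolutional bases (not given in the text).
VGG16_BASE = 14_714_688
MOBILENET_BASE = 3_228_864

# Assumed final feature maps: 7x7x512 (VGG-16) and 7x7x1024 (MobileNet).
vgg16_total = VGG16_BASE + head_params(7 * 7 * 512)
mobilenet_total = MOBILENET_BASE + head_params(7 * 7 * 1024)

print(vgg16_total)      # 21153985
print(mobilenet_total)  # 16090689
```

Under these assumptions both totals match the reported figures exactly. A 7x7 final feature map corresponds to a 224x224 input in these architectures, so the reported totals appear to reflect that input size rather than 120x120; this is an observation from the arithmetic, not something the text states.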

C. Experiment 3: Inception V3 Model

Inception V3 is a 317-layer deep CNN. Models of this type are widely used for image recognition tasks, including face recognition. The model is based on the concept of pooling features from parallel branches: dimensionality reduction with 1×1 filters in parallel with 3×3 and 5×5 filters, followed by concatenation, results in rich feature extraction and effectiveness against overfitting. In this experiment, all layers are first frozen and the fully connected layers are removed. Three new fully connected layers are added on top of the model with the same configuration as in the VGG-16 model. The model is trained on the same training dataset with a learning rate of 1e-4 for 30 epochs and has 26,480,133 trainable parameters. The same early stopping and learning-rate reduction on plateau are used to avoid overfitting. Fig. 4 shows the accuracy curves against the epochs for both training and validation data during training.

D. Experiment 4: Custom Model from scratch

In this paper, a custom model is created from scratch in TensorFlow and Keras. The custom model is built from two kinds of blocks, a VGG block and an Inception block. The VGG block consists of a Conv2D layer with (3,3) filters and ReLU activation, followed by a MaxPooling2D layer with a stride of 2; 'same' padding is used in this block. In the Inception block, four Conv2D layers with (1,1), (3,3), (5,5) and (3,3) filters respectively are applied, all with ReLU activation. A max-pooling layer with a stride of 1 and 'same' padding is also added. The custom model contains one Inception block and three VGG blocks. The first VGG block uses 64 filters and is connected to the Inception block, whose Conv layers use 64, 96, 128, 32, 16 and 32 filters respectively. This is followed by two more VGG blocks with 64 and 16 filters respectively. A Flatten layer converts the resulting feature maps into a one-dimensional vector, which is connected to three fully connected dense layers. The first two dense layers contain 512 and 100 neurons, while the last dense layer has 1 output neuron with a sigmoid activation in order to perform binary classification of skin cancer. The architecture of the custom model is shown in Fig. 5. The hyper-parameters, including the number of neurons and the activation function of each layer, are tuned with Keras Tuner to obtain optimal results. The model is optimized with the Adam optimizer and the loss function is binary cross-entropy. The model has 6,846,233 trainable parameters; it is trained on the same training dataset for 20 epochs with a learning rate of 1e-6 and then tested on the same test dataset. To prevent overfitting, two dropout layers with a rate of 0.3 are used. Moreover, early stopping and learning-rate reduction on plateau are implemented for better convergence and to avoid overfitting. Fig. 6 shows the accuracy curves against the epochs for both training and test data during training.
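
The interplay of early stopping and learning-rate reduction on plateau, used in all four experiments, can be illustrated with a simplified replay of per-epoch validation losses. The patience values, the reduction factor and the loss sequence below are assumptions chosen for the example (the text does not report them), and the logic is a simplification rather than the exact behaviour of the Keras callbacks.

```python
def train_with_callbacks(val_losses, lr=1e-4, factor=0.5,
                         lr_patience=2, stop_patience=4, min_lr=1e-7):
    """Replay a list of per-epoch validation losses.

    Returns (epochs_run, final_lr). The learning rate is multiplied by
    `factor` after `lr_patience` epochs without improvement, and training
    stops after `stop_patience` epochs without improvement.
    """
    best = float("inf")
    since_best = 0       # epochs since the best loss (early stopping)
    since_lr_change = 0  # epochs without improvement since the last LR cut
    epochs_run = 0
    for loss in val_losses:
        epochs_run += 1
        if loss < best:
            # Improvement: remember it and reset both counters.
            best = loss
            since_best = 0
            since_lr_change = 0
            continue
        since_best += 1
        since_lr_change += 1
        if since_lr_change >= lr_patience:
            lr = max(lr * factor, min_lr)
            since_lr_change = 0
        if since_best >= stop_patience:
            break
    return epochs_run, lr

# Hypothetical validation losses: improvement stalls after the third epoch.
losses = [0.60, 0.50, 0.45, 0.46, 0.47, 0.46, 0.48, 0.50, 0.49]
epochs_run, final_lr = train_with_callbacks(losses)
```

With this sequence, the learning rate is halved twice and training stops at epoch 7 instead of running all 9 epochs, which is the kind of behaviour these callbacks are meant to provide.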

V. RESULTS

In this research task, several pre-trained CNN models are implemented and compared in order to find the best-performing model. Transfer learning is used for the training of these large CNN models. A custom model built from the first layer up is also compared and gives good results. Each model is trained on the same training dataset and then tested on the same test dataset. Several measures are employed to assess model performance. Because this is a classification task, the evaluation metrics are accuracy, precision, recall and F1-score.

1. Accuracy: The accuracy of a model is the ratio of the number of correctly predicted observations to the total number of observations. It provides information about the model's overall performance. Table II displays the accuracy attained on test data before and after fine tuning in each implemented model.

2. Precision: The precision score is used in skin cancer detection to evaluate model performance; it primarily informs about false positive results in the dataset. The lower the number of false positives, the higher the precision score. Table III displays the precision values achieved on test data before and after fine tuning in each model employed.

3. Recall: The recall score is used in skin cancer detection to evaluate model performance, and it primarily informs about false negative results in the dataset. The lower the number of false negatives, the higher the recall score. Table IV displays the recall values acquired from test data before and after fine tuning in each model employed.

4. F1 score: The F1-score combines precision and recall into a single score. It is the harmonic mean of precision and recall and may be calculated as follows:

$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$

Table III displays the F1-score values acquired on test data before and after fine tuning in each implemented model.
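
The four metrics above can be computed directly from the counts of true and false positives and negatives. The following sketch uses the standard definitions, with label 1 for malignant and 0 for benign; the function name and the toy labels are assumptions for illustration and are not results from this study.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical labels for eight test images.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

In this toy case two malignant lesions are missed, so recall (0.5) is lower than precision (2/3); in a screening setting false negatives of this kind are usually the more costly error.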

Table I shows the accuracy obtained on test data for each executed model, and Table II shows the precision, recall and F1-score values obtained on test data for each implemented model. All the implemented models are compared, and the results show an interesting fact about the use of transfer learning. The accuracy comparison of the pre-trained models is shown in Fig. 7. All three pre-trained base models achieved high accuracy on this real classification task, with an average of approximately 88% accuracy on the test set, and the trade-off between recall and precision is balanced, as shown in the table above. The pre-trained MobileNet model, with train and test accuracy of 88.16% and 86.96% respectively, outperformed all other implemented models. The precision, recall and F1-score (PRF) of this model on test data are 0.87, 0.87 and 0.87 respectively. The comparison of the PRF scores of all implemented models is shown in Fig. 8. The custom model also performed well: at the end of the 20th epoch its training accuracy was 85.22% and its test accuracy was 83.48%, which can be considered satisfactory.

Conclusion

Diagnosing diseases like skin cancer with the help of deep learning techniques is increasingly common. Since doctors are overloaded with intensive diagnostic work, deep learning-based classifiers can act as assistants that make the diagnostic process quick and easy. Recognizing diverse kinds of skin diseases from images of skin lesions is a difficult undertaking and a field that is not yet fully explored. Three distinct pre-trained deep CNN models are used and evaluated in this study to detect skin cancer. The dataset includes 2637 images of benign and malignant skin lesions that were used to train the models and 660 images that were used to evaluate them. Furthermore, a new model is created from scratch and trained on the same training dataset. The experimental results show that each pre-trained model performed well in the classification task, and that fine tuning each model boosts its accuracy. The MobileNet model outperformed the other two pre-trained models. On the test dataset, the accuracy of this model is high compared with the others and is satisfactory. Other criteria, including precision, recall and F1-score, also indicate very good results. The custom model also showed good results despite the small size of the training dataset and the limited number of trainable parameters owing to constrained computational resources. In future research, the outcomes of both custom and pre-trained models can be enhanced by training them on larger datasets. The availability of good computational resources would also help to improve the outcomes, making these procedures more consistent for real-world applications.

References

[1] R. Katta and D. Brown, "Diet and Skin Cancer: The Potential Role of Dietary Antioxidants in Nonmelanoma Skin Cancer Prevention", Journal of Skin Cancer, vol. 2015, pp. 1-10, 2015.
[2] "Cancer", Who.int, 2022. [Online]. Available: https://www.who.int/news-room/fact-sheets/detail/cancer. [Accessed: 15-May-2022]
[3] I. Baianu, S. Korban and D. Costescu, "Single Cancer Cell Detection by Near Infrared Microspectroscopy, Infrared Chemical Imaging and Fluorescence Microspectroscopy", Nature Precedings, 2011.
[4] S. Panghal and M. Kumar, "A multilayer perceptron neural network approach for the solution of hyperbolic telegraph equations", Network: Computation in Neural Systems, vol. 32, no. 2-4, pp. 65-82, 2021.
[5] P. Abhishek and V. Ramesh, "A Survey on Deep Learning Architectures and Its Applications", International Journal of Education and Science, vol. 3, no. 4, 2020.
[6] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition", arXiv.org, 2022. [Online]. Available: https://arxiv.org/abs/1409.1556. [Accessed: 15-May-2022]
[7] A. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto and H. Adam, "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications", arXiv.org, 2022. [Online]. Available: https://arxiv.org/abs/1704.04861. [Accessed: 15-May-2022]
[8] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens and Z. Wojna, "Rethinking the Inception Architecture for Computer Vision", arXiv.org, 2022. [Online]. Available: https://arxiv.org/abs/1512.00567. [Accessed: 15-May-2022]
[9] C. Barata and J. S. Marques, "Deep Learning for Skin Cancer Diagnosis with Hierarchical Architectures", in IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), pp. 841-845, 2019.
[10] Y. Kim, I. Hwang and N. I. Cho, "Convolutional neural networks and training strategies for skin detection", in IEEE International Conference on Image Processing (ICIP), pp. 3919-3923, 2017.
[11] S. Yun, W. Xianfeng, Z. Shanwen and Z. Chuanlei, "Pnn based crop disease recognition with leaf image features and meteorological data", International Journal of Agricultural and Biological Engineering, vol. 8, no. 4, p. 60, 2015.
[12] L. T. Thao and N. H. Quang, "Automatic skin lesion analysis towards melanoma detection", in 21st Asia Pacific Symposium on Intelligent and Evolutionary Systems (IES), pp. 106-111, 2017.
[13] K. Mujib, A. Hidayatno and T. Prakoso, "PENGENALAN WAJAH MENGGUNAKAN LOCAL BINARY PATTERN (LBP) DAN SUPPORT VECTOR MACHINE (SVM)", TRANSIENT, vol. 7, no. 1, p. 123, 2018.
[14] M. K. Hu, "Visual pattern recognition by moment invariants", IRE Transactions on Information Theory, vol. 8, no. 2, pp. 179-187, 1962.
[15] D. Zhou, "Theory of deep convolutional neural networks: Downsampling", Neural Networks, vol. 124, pp. 319-327, 2020.
[16] H. Wu and X. Gu, "Towards dropout training for convolutional neural networks", Neural Networks, vol. 71, pp. 1-10, 2015.
[17] M. Sarıgül, B. Ozyildirim and M. Avci, "Differential convolutional neural network", Neural Networks, vol. 116, pp. 279-287, 2019.
[18] M. A. Elaziz, D. Oliva, A. A. Ewees and S. Xiong, "Multi-level thresholding-based grey scale image segmentation using multi-objective multi-verse optimizer", Expert Systems with Applications, vol. 125, pp. 112-129, 2019.
[19] V. Demirer and I. Sahin, "Effect of blended learning environment on transfer of learning: an experimental study", Journal of Computer Assisted Learning, vol. 29, no. 6, pp. 518-529, 2013.
[20] M. Rezapour and K. Ksaibati, "Convolutional Neural Network for Roadside Barriers Detection: Transfer Learning versus Non-Transfer Learning", Signals, vol. 2, no. 1, pp. 72-86, 2021.

Copyright

Copyright © 2022 Hrithik Singh, Shambhavi Kaushik, Shruti Talyan, Kartikeya Dwivedi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET43090

Publish Date : 2022-05-22

ISSN : 2321-9653

Publisher Name : IJRASET
