Accurate classification of skin lesions is essential for the early detection and treatment of skin cancer. Using the HAM10000 dataset, this paper proposes a deep learning-based system for classifying dermoscopic images as malignant or non-malignant. We employ the DINOv2-B vision transformer to extract robust, discriminative features from skin lesion images, which are then refined for binary classification. High accuracy, precision, recall, and F1-score demonstrate the efficacy of the proposed method in distinguishing benign from malignant lesions. Alongside the classification model, we developed an easily accessible tool for monitoring skin lesions: a web-based application that lets users upload dermoscopic images and receive real-time predictions. Combining a state-of-the-art vision transformer with an interactive platform offers a practical solution for patients and clinicians, supporting early diagnosis and informed decision-making while reducing the need for manual evaluation.
Introduction
Skin cancer is a growing global health concern, and early detection significantly improves treatment outcomes. Dermoscopy enables detailed inspection of skin lesions, but manual analysis is time-consuming and prone to error. Deep learning-based computer-aided diagnosis (CAD) systems, particularly using convolutional neural networks (CNNs) and vision transformers (ViTs), can enhance diagnostic accuracy and efficiency. CNNs are effective at hierarchical feature extraction but struggle with capturing global context, whereas vision transformers like DINOv2-B leverage self-attention and self-supervised learning to model long-range relationships and generate discriminative, generalizable image features.
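The long-range modeling that the paragraph above attributes to self-attention can be illustrated in a few lines of PyTorch. The toy single-head attention below (without the learned query/key/value projections a real transformer block uses) is a sketch of the mechanism only, not DINOv2-B's actual implementation:

```python
import torch
import torch.nn.functional as F

def self_attention(x):
    """Single-head scaled dot-product self-attention.

    x: (batch, num_patches, dim) patch embeddings.
    Every patch attends to every other patch, so the output at each
    position mixes information from the whole image (global context),
    unlike a convolution's local receptive field.
    """
    d = x.size(-1)
    q, k, v = x, x, x  # a real block applies learned projections here
    scores = q @ k.transpose(-2, -1) / d ** 0.5   # (B, N, N) similarities
    weights = F.softmax(scores, dim=-1)           # each row sums to 1
    return weights @ v                            # (B, N, dim)

# 2 images, 16 patches each, 64-dim embeddings
out = self_attention(torch.randn(2, 16, 64))
```

Because the attention matrix is dense over all patch pairs, a lesion border on one side of the image can directly influence the representation of patches on the other side, which is the global-context property contrasted with CNNs above.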
This research fine-tunes DINOv2-B on the HAM10000 dataset of 10,015 dermoscopic images to classify skin lesions as malignant or non-malignant. Preprocessing steps include resizing, normalization, and data augmentation (flipping, rotation, color jitter) to enhance generalization. Images are transformed into tensors for GPU-based training using PyTorch, optimized with AdamW, and evaluated with accuracy, precision, recall, and F1-score. Mix-up regularization and cosine annealing learning rate scheduling further improve performance and convergence.
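The training recipe above (Mix-up, AdamW, cosine annealing) can be sketched as follows. The hyperparameters are illustrative assumptions, and a tiny linear head stands in for the fine-tuned DINOv2-B backbone so the snippet stays self-contained; in the real pipeline the backbone would be loaded from its pretrained weights and the head attached to its output embedding:

```python
import torch
import torch.nn as nn

def mixup(images, labels, alpha=0.2):
    """Mix-up regularization: blend random pairs of training examples.

    Returns mixed images plus both label sets and the mixing weight;
    the loss is computed as the same convex combination of the two targets.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(images.size(0))
    mixed = lam * images + (1 - lam) * images[perm]
    return mixed, labels, labels[perm], lam

# Stand-in for DINOv2-B + binary head (assumption: the real model maps
# the backbone's image embedding to 2 classes).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 2))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
criterion = nn.CrossEntropyLoss()

images = torch.randn(4, 3, 8, 8)     # toy batch of image tensors
labels = torch.randint(0, 2, (4,))   # 0 = non-malignant, 1 = malignant

# One training step with Mix-up
mixed, y_a, y_b, lam = mixup(images, labels)
logits = model(mixed)
loss = lam * criterion(logits, y_a) + (1 - lam) * criterion(logits, y_b)
loss.backward()
optimizer.step()
scheduler.step()  # cosine-decay the learning rate each epoch
```

Mix-up trains the model on interpolated examples and soft targets, which discourages overconfident decision boundaries; cosine annealing smoothly decays the learning rate toward zero, which the paper credits with stabilizing convergence.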
The trained model is deployed via a web application, allowing real-time image upload, malignancy prediction, and visualization of confidence scores. Explainable AI techniques like Grad-CAM provide transparency for clinicians. The modular design supports scalability, continuous model fine-tuning, and integration with additional datasets.
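The Grad-CAM transparency mechanism mentioned above can be sketched as follows. A toy CNN stands in for the deployed classifier (an assumption for self-containment; the web application would apply the same idea to the real model's final feature maps):

```python
import torch
import torch.nn as nn

def grad_cam(model, target_layer, image, class_idx):
    """Grad-CAM: weight the target layer's activation maps by the gradient
    of the chosen class score, then ReLU the weighted sum. Bright regions
    in the resulting map indicate areas that drove the prediction."""
    acts, grads = {}, {}
    h1 = target_layer.register_forward_hook(
        lambda mod, inp, out: acts.update(a=out))
    h2 = target_layer.register_full_backward_hook(
        lambda mod, gin, gout: grads.update(g=gout[0]))
    score = model(image)[0, class_idx]
    model.zero_grad()
    score.backward()
    h1.remove()
    h2.remove()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)  # pooled gradients
    cam = torch.relu((weights * acts["a"]).sum(dim=1))   # (1, H, W) heatmap
    return cam / (cam.max() + 1e-8)                      # scale to [0, 1]

# Toy CNN stand-in for the deployed classifier (illustrative assumption)
conv = nn.Conv2d(3, 8, 3, padding=1)
model = nn.Sequential(conv, nn.ReLU(), nn.AdaptiveAvgPool2d(1),
                      nn.Flatten(), nn.Linear(8, 2))
cam = grad_cam(model, conv, torch.randn(1, 3, 32, 32), class_idx=1)
```

Overlaying the normalized heatmap on the uploaded dermoscopic image lets a clinician verify that the prediction was driven by the lesion region rather than background artifacts, which is the check the Results section reports.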
Results
The DINOv2-B model achieved 93% classification accuracy, outperforming CNNs and other transformer-based architectures on the HAM10000 and external ISIC 2019 datasets. Attention visualizations confirm that the model focuses on lesion regions rather than background artifacts. Its high generalization, interpretability, and robustness demonstrate its potential as a clinically useful tool for early detection, monitoring, and decision support in dermatology.
Conclusion
In summary, the proposed DINOv2-B-based skin lesion classification framework offers a reliable and efficient approach to automated dermatological diagnosis. By leveraging DINOv2-B's self-supervised learning capabilities, the model extracts high-level semantic information from dermoscopic images, enabling precise discrimination between benign and malignant lesions. Combining advanced training techniques such as cosine annealing learning rate scheduling and Mix-up regularization improves generalization, stabilizes convergence, and reduces the risk of overfitting. Experimental results on the HAM10000 and ISIC datasets show that the model outperforms traditional CNN and transformer-based architectures with an overall accuracy of 93%. The model also generalizes well to external datasets, underscoring its usefulness in real clinical settings. Its scalability, interpretability, and strong predictive performance make this system a promising tool for helping dermatologists detect and diagnose skin cancer early. Future research might investigate deployment in real-time clinical decision support systems, attention-based interpretability mechanisms, and integration with larger multi-modal datasets.