IJRASET Journal for Research in Applied Science and Engineering Technology
Authors: Laxmi Yadav, Girish Chandra, Divakar Yadav
DOI Link: https://doi.org/10.22214/ijraset.2025.71641
Vision Transformers (ViTs) and Explainable AI (XAI) are reshaping breast cancer detection and classification. ViTs have been shown to outperform conventional convolutional neural networks (CNNs) on several imaging benchmarks, while XAI techniques enhance the transparency and trustworthiness of these models, making them more acceptable to healthcare professionals and patients. Current research explores self-attention mechanisms within ViTs to generate inherent explanations and uses XAI to identify and mitigate model biases. These advances are being applied to tasks such as identifying breast cancer subtypes and predicting treatment response. To improve efficiency and shorten training, transfer learning, in which models are pre-trained on large datasets and then adapted to specific breast cancer tasks, is also becoming more common. Two obstacles persist despite this progress: the scarcity of large, high-quality datasets and the high computational cost of training ViTs. Future studies will concentrate on more dependable XAI methods, larger and more varied datasets, and more efficient ViT designs. User-friendly XAI tools are needed for clinical workflows, and concerns about bias, fairness, and transparency must be addressed for responsible AI use in healthcare. Data standardization is also needed to ensure consistent results across different sites.
Breast cancer (BC) diagnosis benefits from advanced, accurate imaging techniques, but traditional methods have limitations in sensitivity and specificity. Deep learning, especially convolutional neural networks (CNNs), has improved BC image analysis, yet CNNs struggle to capture global image context. Vision Transformers (ViTs), inspired by Transformer models in natural language processing, use self-attention to capture long-range dependencies in images, offering potential for more accurate and reliable BC diagnosis.
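To make the self-attention idea concrete, the minimal PyTorch sketch below (our illustration, not code from the paper; the TinyPatchAttention class and all hyperparameters are hypothetical) splits an image into non-overlapping patches and lets every patch attend to every other patch in a single step, which is how ViTs capture long-range dependencies that a local convolution cannot.

```python
# Illustrative sketch of ViT-style patch tokenization plus self-attention.
import torch
import torch.nn as nn

class TinyPatchAttention(nn.Module):
    def __init__(self, img_size=224, patch_size=16, dim=64, heads=4):
        super().__init__()
        num_patches = (img_size // patch_size) ** 2
        # Non-overlapping patch embedding via a strided convolution.
        self.to_patches = nn.Conv2d(3, dim, kernel_size=patch_size, stride=patch_size)
        self.pos = nn.Parameter(torch.zeros(1, num_patches, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        tokens = self.to_patches(x).flatten(2).transpose(1, 2)  # (B, N, dim)
        tokens = tokens + self.pos
        out, weights = self.attn(tokens, tokens, tokens, need_weights=True)
        return out, weights  # weights: (B, N, N), attention over all patch pairs

model = TinyPatchAttention()
_, attn = model(torch.randn(1, 3, 224, 224))
print(attn.shape)  # torch.Size([1, 196, 196]): every patch attends to every patch
```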
Recent research explores hybrid CNN architectures and deep transfer learning to enhance classification accuracy on histopathological datasets like BreakHis. Evolutionary algorithms and data augmentation also contribute to optimization and performance improvements.
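As a concrete instance of the transfer-learning recipe mentioned above, the sketch below (an assumption-laden illustration, not the cited studies' code) adapts an ImageNet-pretrained ViT from torchvision to a two-class benign/malignant task; the dummy batch stands in for real BreakHis images.

```python
# Minimal transfer-learning sketch (assumes torchvision >= 0.13; downloads
# pre-trained weights on first use). Only the new head is trained here, a
# common cheap starting point before full fine-tuning.
import torch
import torch.nn as nn
from torchvision import models

weights = models.ViT_B_16_Weights.IMAGENET1K_V1
model = models.vit_b_16(weights=weights)        # ImageNet-pretrained backbone
model.heads = nn.Linear(model.hidden_dim, 2)    # new benign/malignant head

# Freeze the backbone so only the replacement head receives gradients.
for name, p in model.named_parameters():
    p.requires_grad = name.startswith("heads")

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch (stand-in for BreakHis tiles).
x, y = torch.randn(4, 3, 224, 224), torch.randint(0, 2, (4,))
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```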
However, deep learning models are often criticized for their "black box" nature, which limits clinical trust and adoption. Explainable AI (XAI) methods, such as saliency maps, Grad-CAM, LIME, and SHAP, aim to make model decisions transparent by highlighting the image regions that drive a prediction, increasing interpretability and clinician trust. Despite these advances, the application of XAI specifically to mammography remains underexplored, and standardized metrics for evaluating the clinical relevance of explainability methods are still lacking.
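For intuition, here is a minimal Grad-CAM sketch in the standard formulation (our illustration, using a generic ImageNet-pretrained CNN as a stand-in for a diagnostic model and a random tensor as a stand-in image): gradients of the target class score with respect to the last convolutional feature maps weight those maps into a coarse relevance heatmap.

```python
# Hand-rolled Grad-CAM via forward/backward hooks on the last conv stage.
import torch
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1).eval()
feats, grads = {}, {}
layer = model.layer4  # last convolutional stage

layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

x = torch.randn(1, 3, 224, 224)   # stand-in for a mammogram/histology image
score = model(x)[0].max()         # score of the top predicted class
score.backward()

weights = grads["a"].mean(dim=(2, 3), keepdim=True)  # global-average gradients
cam = torch.relu((weights * feats["a"]).sum(dim=1))  # weighted sum of maps
cam = cam / cam.max()                                # normalize to [0, 1]
print(cam.shape)  # (1, 7, 7): upsample to 224x224 to overlay on the input
```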
Integrating Vision Transformers with Explainable AI shows great promise. ViTs provide powerful feature extraction, while XAI reveals model reasoning, enabling clinicians to verify and understand diagnoses. Attention-based visualizations from ViTs, combined with other XAI techniques, can highlight diagnostic regions effectively, though challenges remain in computational demands and resolution of explanations.
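One widely used attention-based visualization is attention rollout; the sketch below is our illustration of that technique, with random tensors standing in for the per-layer attention matrices one would hook out of a trained ViT. It averages attention over heads, mixes in the residual identity, and composes the maps across layers so the CLS-token row scores each image patch's relevance.

```python
# Attention rollout: compose per-layer attention into one patch-relevance map.
import torch

def attention_rollout(attentions):
    """attentions: list of (heads, tokens, tokens) tensors, one per layer."""
    tokens = attentions[0].shape[-1]
    rollout = torch.eye(tokens)
    for attn in attentions:
        a = attn.mean(dim=0)                   # average over heads
        a = 0.5 * a + 0.5 * torch.eye(tokens)  # account for residual connection
        a = a / a.sum(dim=-1, keepdim=True)    # re-normalize rows
        rollout = a @ rollout                  # compose across layers
    return rollout[0, 1:]  # CLS-token attention to each image patch

# Synthetic stand-in: 12 layers, 12 heads, 197 tokens (CLS + 14x14 patches).
layers = [torch.softmax(torch.randn(12, 197, 197), dim=-1) for _ in range(12)]
relevance = attention_rollout(layers)
print(relevance.shape)  # torch.Size([196]): reshape to 14x14 for a heatmap
```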
Ultimately, this integration can lead to more transparent, trustworthy, and clinically useful AI systems for breast cancer detection, potentially improving patient outcomes.
The integration of Vision Transformers (ViTs) and Explainable Artificial Intelligence (XAI) into breast cancer histopathology analysis is an important step toward intelligent, reliable, and clinically meaningful computer-aided diagnosis (CAD) systems. Vision Transformers have proven more effective than conventional convolutional neural networks (CNNs) at capturing global contextual features and long-range relationships, and their application to breast cancer classification has the potential to significantly enhance diagnostic accuracy across varying image magnifications and data complexities.

At the same time, the growing emphasis on XAI reflects a critical shift toward interpretability and trustworthiness in deep learning models. In sensitive domains such as medical imaging, it is essential not only for models to perform well but also for them to give clear and logical justification for their predictions. XAI methods, ranging from saliency maps and Grad-CAM to more advanced tools such as SHAP and LIME, help close the gap between black-box algorithms and clinical applicability by emphasizing pertinent information and offering visual or numeric explanations.

Despite these advancements, several challenges remain. Current XAI research still lacks standardized, domain-specific evaluation metrics and often fails to address the unique requirements of breast imaging. Moreover, many existing studies focus primarily on model performance while giving limited attention to clinical integration and validation.

Nevertheless, ongoing research continues to push the boundaries of what is possible, promising AI-driven systems that are not only accurate but also transparent, interpretable, and ultimately beneficial to patient care and clinical decision-making.
Copyright © 2025 Laxmi Yadav, Girish Chandra, Divakar Yadav. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET71641
Publish Date : 2025-05-26
ISSN : 2321-9653
Publisher Name : IJRASET