Tuberculosis (TB) remains a significant health challenge, particularly in regions with limited access to medical resources. It spreads through the air and mainly affects the lungs, and without early detection it can be deadly. Unfortunately, many areas don’t have enough trained healthcare workers or proper screening tools, which delays diagnosis.
To help with this issue, we’re building an AI-based system that automatically detects TB from chest X-ray images. Our approach is a multi-model deep learning system that combines two architectures: DeiT (Data-efficient Image Transformer) and ResNet-16. DeiT helps the model understand the overall structure of the image through attention mechanisms, while ResNet-16 captures detailed features in specific regions, all while keeping the system lightweight and efficient. We’re using the TBX11K dataset, which contains three types of chest X-rays: healthy, sick but non-TB, and TB-positive. To make the system more transparent, we’re also adding heatmap visualizations based on class activation mapping, so we can see which parts of the image the model focuses on when making a prediction.
Our target is a model that’s not only accurate, ideally exceeding 99% on key metrics, but also fast, with a prediction time under 5 milliseconds. Once complete, we aim to make it suitable for real-time use in hospitals and clinics, especially in areas where medical support is limited.
1. Introduction
Tuberculosis (TB) remains a critical global health threat: in 2022 it caused more deaths than HIV/AIDS and ranked second only to COVID-19 among infectious causes of death.
Despite improvements in treatment, early diagnosis remains a major challenge, especially in low-resource settings, because diagnostic tools and trained professionals are in short supply.
Deep Learning (DL) and Artificial Intelligence (AI), particularly Convolutional Neural Networks (CNNs), have emerged as powerful tools for automated TB detection from chest X-rays.
2. Problem & Challenges
Variability in image quality, lack of large annotated datasets, and overfitting remain common issues.
Existing DL models often struggle to generalize across different machines and populations.
There’s a need to balance accuracy, speed, and interpretability, which is a difficult trade-off in medical AI applications.
3. Proposed Solution
The study introduces a hybrid model combining ResNet-16 and DeiT (Data-efficient Image Transformer):
ResNet-16 handles fine-grained local features (e.g., lesions, cavities).
DeiT captures global contextual features using attention mechanisms.
The architecture is both accurate and lightweight, ideal for clinical use in low-resource environments.
Uses Class Activation Maps (CAMs) for visual explanation, aiding clinicians in understanding model decisions.
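The two-branch design above can be illustrated with a minimal late-fusion sketch. The source does not say how the ResNet-16 and DeiT outputs are merged, so simple concatenation followed by one linear layer is an assumption here. The feature vectors, their sizes, the random weights, and the names `fuse_and_classify` and `CLASSES` are all hypothetical stand-ins, not the study's implementation.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(local_feats, global_feats, weights, bias):
    """Concatenate both feature vectors, apply one linear layer, return class probabilities."""
    fused = list(local_feats) + list(global_feats)
    if any(len(row) != len(fused) for row in weights):
        raise ValueError("each weight row must match the fused feature length")
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


CLASSES = ["healthy", "sick_non_tb", "tb"]

rng = random.Random(0)
# Assumption: stand-in for pooled features from the convolutional (ResNet-16) branch.
local_feats = [rng.random() for _ in range(4)]
# Assumption: stand-in for the embedding produced by the transformer (DeiT) branch.
global_feats = [rng.random() for _ in range(6)]
# Untrained random weights, only to make the sketch executable.
weights = [[rng.uniform(-1, 1) for _ in range(10)] for _ in CLASSES]
bias = [0.0] * len(CLASSES)

probs = fuse_and_classify(local_feats, global_feats, weights, bias)
prediction = CLASSES[probs.index(max(probs))]
```

In a real system both feature vectors would come from trained networks and the linear layer would be learned jointly with them; other fusion strategies (for example, weighted averaging of per-branch predictions) are equally plausible readings of the description.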
4. Dataset & Preprocessing
The TBX11K dataset is used; it consists of three classes: healthy, non-TB illnesses, and TB-positive cases.
Preprocessing includes:
Cleaning artifacts
Data augmentation (flip, zoom, rotate, shift) for robustness
Split: 80% train, 10% validation, 10% test.
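As a rough illustration of the 80/10/10 split, the sketch below partitions sample indices per class so that each subset keeps the same class proportions. Stratifying by class and the fixed seed are assumptions (the source only gives the ratios), and the label counts are made up for the example rather than taken from TBX11K.

```python
import random
from collections import defaultdict


def stratified_split(labels, train=0.8, val=0.1, seed=42):
    """Return (train_idx, val_idx, test_idx), splitting each class separately."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, val_idx, test_idx = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = int(len(indices) * train)
        n_val = int(len(indices) * val)
        train_idx += indices[:n_train]
        val_idx += indices[n_train:n_train + n_val]
        test_idx += indices[n_train + n_val:]  # whatever remains goes to test
    return train_idx, val_idx, test_idx


# Hypothetical label list; the real TBX11K class counts differ.
labels = ["healthy"] * 50 + ["sick_non_tb"] * 30 + ["tb"] * 20
train_idx, val_idx, test_idx = stratified_split(labels)
```

Augmentation (flip, zoom, rotate, shift) would normally be applied only to the training indices after this split, so that validation and test images stay untouched.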
5. Training & Optimization
Optimized with Adam optimizer; learning rate: 0.001
Dropout (0.2) prevents overfitting
Early stopping conserves computation
Feature extraction uses Global Average Pooling (GAP)
CAMs are used for interpretability
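The link between Global Average Pooling and CAMs can be shown with a small worked example. This follows the classic CAM formulation, in which the heatmap for a class is the sum of the final convolutional feature maps weighted by that class's classifier weights; the source does not state which CAM variant is used, so this choice is an assumption. The toy feature maps and the `tb_weights` values are hypothetical.

```python
def global_average_pool(feature_maps):
    """Reduce each H x W feature map to its mean value."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


def class_activation_map(feature_maps, class_weights):
    """Weighted sum of the feature maps, using one class's classifier weights."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def normalize(cam):
    """Rescale values to [0, 1] so the map can be drawn as a heatmap."""
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant map
    return [[(v - lo) / span for v in row] for row in cam]


# Two toy 2x2 feature maps and hypothetical classifier weights for the TB class.
feature_maps = [
    [[0.0, 1.0], [2.0, 3.0]],
    [[4.0, 0.0], [0.0, 0.0]],
]
tb_weights = [1.0, -0.5]

pooled = global_average_pool(feature_maps)             # [1.5, 1.0]
cam = class_activation_map(feature_maps, tb_weights)   # [[-2.0, 1.0], [2.0, 3.0]]
heatmap = normalize(cam)                               # [[0.0, 0.6], [0.8, 1.0]]
```

The mean of `cam` equals the class score computed from the pooled features (1.0 in this example), which is why CAM can reuse the classifier weights directly. In practice the heatmap is upsampled to the X-ray resolution and overlaid on the image.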
6. Performance Metrics
Evaluated using:
Accuracy
Precision
Recall (Sensitivity)
Specificity
Inference time
Parameter count
The model achieves >99% accuracy, with fast inference and a relatively low parameter count, making it suitable for real-time clinical deployment.
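For reference, the classification metrics listed above can be computed from predictions as in the sketch below. Treating TB as the positive class in a one-vs-rest setting is an assumption, since the source does not say how the three-class results are reduced to precision, recall, and specificity; the label lists are invented for the example.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def one_vs_rest_metrics(y_true, y_pred, positive):
    """Precision, recall (sensitivity), and specificity for one class against the rest."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical labels for eight test images.
y_true = ["tb", "tb", "tb", "healthy", "healthy", "sick", "sick", "healthy"]
y_pred = ["tb", "tb", "healthy", "healthy", "tb", "sick", "sick", "healthy"]

accuracy = overall_accuracy(y_true, y_pred)            # 0.75
tb_metrics = one_vs_rest_metrics(y_true, y_pred, "tb")
# precision = 2/3, recall = 2/3, specificity = 0.8
```

For screening, recall on the TB class is usually the figure to watch, because a missed TB case costs more than a false alarm that is later ruled out.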
7. Comparative Evaluation
Compared against VGG16, DenseNet, and ResNet-based models.
Outperforms others in accuracy, precision, and recall, while being more efficient and generalizable.
8. Literature Review Highlights
Various techniques explored in other studies:
Stochastic ANNs, DenseNet with attention (CBAM), segmentation approaches, MobileNet with optimization algorithms (AEO), and stacked CNN ensembles.
Most models are accurate but often computationally intensive, lack explainability, or rely on large labeled datasets, limiting practical deployment.
Many fail to generalize across diverse clinical settings or equipment variability.
9. Discussion & Limitations
The hybrid ResNet–DeiT model effectively captures both local and global features, improving TB detection accuracy and clinical interpretability.
Heatmaps help build trust among healthcare professionals.
Limitations:
Still requires high computing power
Performance may vary across imaging equipment
Relies on access to large, labeled datasets
Future work should focus on:
Lightweight, adaptive models for low-resource settings
Cross-domain testing
Enhanced explainability
Reducing dataset dependency
10. Conclusion
Detecting tuberculosis through deep learning has shown great promise in improving diagnosis, especially in areas where medical resources are limited. The integration of advanced models like CNNs and transformers has led to better accuracy and the ability to visualize decision areas using heatmaps. These improvements not only make the system more effective but also help build trust by making the results easier to interpret. Techniques like data augmentation and feature optimization have further improved the model's ability to handle diverse medical images.
Still, there are some challenges that need attention. The requirement for large labeled datasets, high processing power, and inconsistent image quality across sources can affect model performance. Despite these issues, the findings indicate that AI-powered tools can become valuable aids for doctors in diagnosing TB. Future research should aim to make these systems lighter, more transparent, and easier to apply in real-world clinics. With continuous development and support from healthcare professionals and researchers, deep learning can play a major role in reducing the burden of tuberculosis globally.