Respiratory illnesses including COVID-19, tuberculosis, and pneumonia account for significant morbidity globally, yet timely detection remains challenging in resource-limited settings due to expensive diagnostic tests and delayed results. This paper presents SpectroCough, an AI-driven pre-screening system that analyzes cough audio to enable rapid, contactless, and affordable triage of respiratory diseases. The system combines Mel-Spectrograms and Mel-Frequency Cepstral Coefficients (MFCCs) with statistical acoustic features, employing a hybrid Convolutional Neural Network-Dense (CNN-Dense) architecture optimized for real-time inference. Trained on 300–500 samples per disease class, SpectroCough distinguishes between six respiratory conditions: COVID-19, tuberculosis, pneumonia, bronchitis, asthma, and normal cough, while simultaneously detecting fake coughs.
The system delivers predictions with confidence scores in under 2 seconds, enabling 60% faster triage than traditional diagnostic methods. Its accuracy of 89–95% across disease classes demonstrates the feasibility of acoustic disease screening for clinical decision support and for improving healthcare accessibility in developing regions.
Introduction
Respiratory diseases pose a major global health burden, especially in countries such as India, where conditions such as COVID-19, tuberculosis, and pneumonia contribute to a high number of outpatient visits. Rural areas face significant diagnostic delays because access to tools such as X-rays and RT-PCR tests is limited. Since cough is an early and common symptom of many respiratory conditions, recent research has explored using cough acoustics for automated disease detection. Physiological changes in the respiratory tract alter the acoustic properties of a cough, and machine learning models can use these variations to distinguish between diseases.
SpectroCough is proposed as a rapid, contactless, and cost-effective pre-screening tool using audio signal processing and deep learning. It provides real-time predictions in under two seconds, supports multi-disease classification across six respiratory illnesses, includes fake cough detection, offers confidence scores for clinical decision-making, and can be deployed on smartphones, making it ideal for resource-limited settings.
The literature review highlights strong prior advances in audio-based disease detection, including CNNs, hybrid CNN-LSTM models, transfer learning, and handcrafted-feature approaches based on MFCCs. While many studies report high accuracy, robustness across diverse recording environments remains a challenge. Feature extraction methods such as MFCCs and Mel-spectrograms, together with spectral data augmentation, have been shown to significantly improve model generalization.
The SpectroCough methodology consists of an end-to-end hybrid architecture. Cough audio is collected via mobile or web interfaces and then preprocessed with noise filtering, silence removal, normalization, and segmentation. Dual feature extraction is performed: (1) deep spectral features learned by CNNs from Mel-spectrograms and MFCCs, and (2) handcrafted acoustic features such as Chroma, Spectral Centroid, Spectral Bandwidth, zero-crossing rate (ZCR), and root-mean-square (RMS) energy. These features are fused in a hybrid model combining CNN and dense layers, followed by a softmax classifier for multi-disease prediction. A curated dataset is standardized through resampling, filtering, and segmentation to ensure consistency.
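As an illustration of the handcrafted branch described above, the following minimal sketch computes three of the listed descriptors (RMS energy, ZCR, and spectral centroid) per frame, after peak normalization and an energy-based silence filter. It is not the SpectroCough implementation. The function names, frame length, hop length, and silence threshold are assumptions chosen for the example, a naive DFT stands in for an optimized FFT, and Chroma and Spectral Bandwidth are omitted for brevity. In practice an audio library would normally supply these features.

```python
import cmath
import math


def normalize_peak(signal):
    """Scale samples so the largest absolute value is 1.0; silence is returned unchanged."""
    peak = max((abs(x) for x in signal), default=0.0)
    return [x / peak for x in signal] if peak > 0 else list(signal)


def split_frames(signal, frame_len, hop_len):
    """Cut the signal into overlapping fixed-length frames; an incomplete tail is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop_len)]


def rms_energy(frame):
    """Root-mean-square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as positive)."""
    crossings = sum((a >= 0) != (b >= 0) for a, b in zip(frame, frame[1:]))
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, using a naive O(N^2) DFT."""
    n = len(frame)
    weighted = total = 0.0
    for k in range(n // 2 + 1):
        mag = abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                      for t, x in enumerate(frame)))
        weighted += (k * sample_rate / n) * mag
        total += mag
    return weighted / total if total > 0 else 0.0


def extract_features(signal, sample_rate, frame_len=64, hop_len=32, silence_rms=0.02):
    """Return [rms, zcr, centroid] for every frame whose RMS exceeds the threshold."""
    features = []
    for frame in split_frames(normalize_peak(signal), frame_len, hop_len):
        energy = rms_energy(frame)
        if energy < silence_rms:  # assumed threshold for silence removal
            continue
        features.append([energy,
                         zero_crossing_rate(frame),
                         spectral_centroid(frame, sample_rate)])
    return features


# Synthetic stand-in for a recording: 64 silent samples, then a 1 kHz tone at 8 kHz.
SAMPLE_RATE = 8000
tone = [0.5 * math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE + 0.3) for t in range(256)]
features = extract_features([0.0] * 64 + tone, SAMPLE_RATE)
```

Each row of `features` is one frame-level vector. A fused model of the kind described above would typically summarize such rows (for example by mean and standard deviation) before concatenating them with the CNN embedding, although the exact fusion used by SpectroCough is not specified here.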
The deep learning model is trained using TensorFlow/Keras with the Adam optimizer, categorical cross-entropy loss, and regularization through dropout and an L2 penalty. The dataset is split 70:15:15 into training, validation, and test sets. Model performance is evaluated using accuracy, precision, recall, F1-score, and confusion matrices, and training is monitored through TensorBoard.
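The 70:15:15 split and the evaluation metrics named above can be illustrated without any machine learning framework. The sketch below is an example under stated assumptions, not the authors' pipeline: `stratified_split`, `confusion_matrix`, and `per_class_metrics` are hypothetical helper names, stratifying by class and fixing the random seed are choices made here, and the labels are toy data, not SpectroCough results.

```python
import random
from collections import defaultdict


def stratified_split(labels, ratios=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists that preserve class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(ratios[0] * len(indices))
        n_val = round(ratios[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def overall_accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    total = sum(map(sum, matrix))
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def per_class_metrics(matrix, classes):
    """One-vs-rest precision, recall (sensitivity), specificity, and F1 per class."""
    total = sum(map(sum, matrix))
    report = {}
    for i, name in enumerate(classes):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp
        fp = sum(row[i] for row in matrix) - tp
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[name] = {"precision": precision, "recall": recall,
                        "specificity": specificity, "f1": f1}
    return report


# Toy labels for illustration only.
classes = ["covid", "tb", "normal"]
y_true = ["covid", "covid", "tb", "tb", "normal", "normal"]
y_pred = ["covid", "tb", "tb", "tb", "normal", "covid"]
matrix = confusion_matrix(y_true, y_pred, classes)
report = per_class_metrics(matrix, classes)
train_idx, val_idx, test_idx = stratified_split(["covid"] * 100 + ["normal"] * 100)
```

Stratifying keeps each class at roughly the same proportion in all three subsets, which matters when classes hold only a few hundred samples each. Recall computed one-vs-rest is the per-class sensitivity, so the same confusion matrix yields the sensitivity and specificity figures commonly reported for screening tools.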
Conclusion
SpectroCough demonstrates the feasibility of rapid, accessible respiratory disease screening through acoustic analysis and deep learning. By achieving 91.6% overall accuracy, 89.6% sensitivity, and 92.6% specificity while maintaining sub-2-second inference time, the system meets critical requirements for clinical triage in resource-limited settings. The integration of Mel-Spectrograms, MFCCs, and statistical features with a hybrid CNN–Dense architecture provides robust multi-disease classification while simultaneously detecting fake coughs.
The system’s 60% faster triage compared to traditional diagnostic methods, combined with its non-invasive, contactless nature, positions SpectroCough as a powerful tool for supporting healthcare providers in making rapid clinical decisions. Particularly in developing regions with limited diagnostic infrastructure, SpectroCough offers a scalable, smartphone-deployable solution that can reduce diagnostic delays, lower testing costs, and improve patient outcomes through early detection and timely intervention.
Future work will focus on clinical validation across diverse populations, expansion to additional respiratory conditions, and integration with electronic health record systems to enable seamless workflows in existing healthcare infrastructure.