Diabetic Retinopathy (DR), Glaucoma, and Cataracts are major causes of preventable blindness, highlighting the need for early and accessible screening solutions. This work proposes a Smart Ocular Health Ecosystem that integrates a hierarchical deep learning framework for multi-disease retinal diagnosis with a Retrieval-Augmented Generation (RAG)-based AI medical chatbot. The system employs a two-stage cascade architecture, where a lightweight Generalist CNN classifies fundus images into Normal, Cataract, Glaucoma, and DR, followed by a Specialist Attention-Guided CNN fusion model for fine-grained DR severity grading. To enhance patient understanding, a RAG-based chatbot provides context-aware explanations grounded in verified clinical guidelines. Experimental results demonstrate high diagnostic performance, achieving 95.3% accuracy for DR grading, while enabling efficient multi-disease screening and improved patient interaction.
Introduction
Eye diseases such as Diabetic Retinopathy (DR), Glaucoma, and Cataract are major causes of vision loss worldwide. Traditional diagnosis through manual examination of retinal fundus images is time-consuming and depends heavily on expert knowledge. Recent advances in deep learning, especially Convolutional Neural Networks (CNNs), have enabled automated retinal image analysis, but single models often struggle to detect both small lesions and large structural abnormalities while also lacking explainability.
This research proposes an attention-based deep CNN fusion model for automated detection of multiple ocular diseases. The system follows a two-stage diagnostic approach: a lightweight CNN first performs general screening by classifying images into Normal, Cataract, Glaucoma, or DR categories. If the screening stage flags Diabetic Retinopathy, a specialized model that fuses VGG16, ResNet50, and DenseNet121 features performs detailed DR severity grading. Attention mechanisms help the model focus on important retinal regions, improving feature extraction and classification accuracy.
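The routing logic of the two-stage approach can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `cascade_diagnose`, the class ordering in `SCREENING_CLASSES` (taken from the four screening classes named in the abstract), and the representation of a DR grade as an integer index are all assumptions, and the two models are passed in as plain callables that return one score per class.

```python
from typing import Callable, Dict, Optional, Sequence

# Assumed ordering of the screening classes; the paper does not fix an order.
SCREENING_CLASSES = ("Normal", "Cataract", "Glaucoma", "DR")

Model = Callable[[object], Sequence[float]]


def argmax(scores: Sequence[float]) -> int:
    """Return the index of the highest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def cascade_diagnose(image: object, generalist: Model, specialist: Model) -> Dict[str, Optional[object]]:
    """Run the generalist first and call the specialist only for DR cases."""
    screening_scores = generalist(image)
    label = SCREENING_CLASSES[argmax(screening_scores)]
    result: Dict[str, Optional[object]] = {"screening": label, "dr_grade": None}
    if label == "DR":
        # The specialist is assumed to return one score per severity grade.
        result["dr_grade"] = argmax(specialist(image))
    return result


if __name__ == "__main__":
    # Hypothetical stand-ins for the trained networks.
    fake_generalist = lambda img: [0.1, 0.1, 0.1, 0.7]
    fake_specialist = lambda img: [0.1, 0.2, 0.6, 0.1]
    print(cascade_diagnose(None, fake_generalist, fake_specialist))
```

Because the specialist runs only when the generalist predicts DR, the heavier fusion model is skipped for Normal, Cataract, and Glaucoma images, which is where the computational saving of a cascade comes from.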
To increase clinical trust, the system uses Grad-CAM visualization to highlight the retinal areas influencing predictions, allowing doctors to understand the model’s decisions. Additionally, a RAG-based medical chatbot is integrated to explain diagnoses and provide patient-friendly information using reliable medical knowledge sources.
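For readers unfamiliar with Grad-CAM, the core computation is small enough to show directly. The sketch below assumes the feature maps of the last convolutional layer and the gradients of the predicted class score with respect to those maps have already been extracted from the network; how the authors obtain them is not described here. The function name `grad_cam` and the nested-list representation are illustrative choices, and the final rescaling to [0, 1] is only a display convenience.

```python
from typing import List

Map2D = List[List[float]]


def grad_cam(activations: List[Map2D], gradients: List[Map2D]) -> Map2D:
    """Compute a Grad-CAM heatmap from K feature maps and their gradients.

    activations[k][i][j] is feature map k at position (i, j);
    gradients[k][i][j] is d(class score) / d(activations[k][i][j]).
    """
    height, width = len(activations[0]), len(activations[0][0])

    # Channel importance: spatial average of each channel's gradient.
    alphas = [sum(sum(row) for row in grad) / (height * width) for grad in gradients]

    # Weighted sum of the feature maps.
    cam = [[0.0] * width for _ in range(height)]
    for alpha, act in zip(alphas, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += alpha * act[i][j]

    # ReLU keeps only regions that raise the class score.
    cam = [[max(0.0, value) for value in row] for row in cam]

    # Rescale to [0, 1] so the map can be overlaid on the fundus image.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam
```

In practice the resulting map is upsampled to the input resolution and blended with the fundus image, so that bright regions mark the areas that most supported the prediction.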
The methodology includes retinal image preprocessing (noise reduction, resizing, and normalization), dual-model classification, feature fusion, attention-based learning, and explainability generation. The DR specialist model combines features from multiple CNN architectures using trainable fusion weights, while MobileNet is used for efficient Glaucoma and Cataract detection with lower computational requirements.
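One common way to realize trainable fusion weights is to keep one learnable scalar per backbone and normalize the scalars with a softmax before taking a weighted sum of the feature vectors. The sketch below illustrates that idea only; the paper does not state the exact fusion rule, so the softmax normalization, the function names `softmax` and `fuse_features`, and the assumption that the three backbones' features have already been projected to a common length are all hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn unconstrained scalars into positive weights that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(features: Sequence[Sequence[float]], fusion_logits: Sequence[float]) -> List[float]:
    """Weighted sum of per-backbone feature vectors.

    features[b] is the feature vector from backbone b (all the same length);
    fusion_logits[b] is the trainable scalar for backbone b.
    """
    if len(features) != len(fusion_logits):
        raise ValueError("need exactly one fusion logit per backbone")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must share one length")
    weights = softmax(fusion_logits)
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


if __name__ == "__main__":
    # Toy vectors standing in for VGG16, ResNet50, and DenseNet121 features.
    toy_features = [[3.0, 0.0], [0.0, 3.0], [3.0, 3.0]]
    print(fuse_features(toy_features, [0.0, 0.0, 0.0]))
```

During training, the fusion logits would be updated by backpropagation along with the rest of the network, so the model can learn to rely more on whichever backbone is most informative.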
The study evaluates the system using accuracy, precision, recall, F1-score, and AUC metrics. Results show that the fusion model performs better than individual CNN models by improving feature representation and reducing errors, especially across DR severity stages. The complete proposed model achieved 95.3% accuracy and an F1-score of 0.96, demonstrating strong capability for multi-disease retinal screening.
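As a reminder of how these metrics relate, the sketch below computes accuracy and macro-averaged precision, recall, and F1-score from integer class labels. It is illustrative only: the function name `classification_metrics` is hypothetical, macro averaging is an assumption (the paper does not say which averaging scheme it uses), and AUC is omitted because it requires predicted probabilities rather than hard labels.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1."""
    tp = [0] * num_classes  # true positives per class
    fp = [0] * num_classes  # false positives per class
    fn = [0] * num_classes  # false negatives per class
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    def safe_div(num: float, den: float) -> float:
        return num / den if den else 0.0

    precision = [safe_div(tp[c], tp[c] + fp[c]) for c in range(num_classes)]
    recall = [safe_div(tp[c], tp[c] + fn[c]) for c in range(num_classes)]
    f1 = [safe_div(2 * p * r, p + r) for p, r in zip(precision, recall)]

    return {
        "accuracy": safe_div(sum(tp), len(y_true)),
        "macro_precision": sum(precision) / num_classes,
        "macro_recall": sum(recall) / num_classes,
        "macro_f1": sum(f1) / num_classes,
    }


if __name__ == "__main__":
    # Toy labels, not results from the paper.
    print(classification_metrics([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Macro averaging gives each class equal weight, which matters for DR grading because the severe grades are typically much rarer than the mild ones.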
Key findings indicate that:
CNN fusion with attention mechanisms improves ocular disease detection performance.
Explainable AI methods like Grad-CAM increase medical reliability and transparency.
MobileNet enables faster and lightweight deployment for real-time screening.
The integrated chatbot improves communication between AI systems, doctors, and patients.
The system supports scalable applications such as tele-ophthalmology and automated eye screening programs.
Conclusion
This study introduces a complete Smart Ocular Health Ecosystem that goes beyond detecting a single eye disease. The system uses a two-stage deep learning approach to screen for Cataract and Glaucoma quickly, while still providing detailed and accurate grading for Diabetic Retinopathy. To make the system easier to use, a RAG-based AI medical assistant is included. This assistant explains complex diagnostic results in simple and medically accurate language so patients can better understand their condition. Because of its design, the system can be used in large screening programs and remote eye care services such as tele-ophthalmology.