Diabetes mellitus is a common chronic condition and a significant public health challenge. Early risk prediction supports timely prevention and better self-management. Many studies rely only on clinical data, and fewer consider the lifestyle and behavioural signals that strongly influence disease onset. This work introduces a machine-learning web application that estimates a person’s risk of diabetes from clinical factors (age, BMI, HbA1c, blood pressure, family history) and lifestyle habits (diet, physical activity, stress, sleep quality, and screen time). The system is built on the MERN stack, which supports secure data storage, user authentication, and interactive visualizations. The model classifies users into Low, Moderate, or High risk categories and offers practical, personalized advice on diet, exercise, and stress management. In our experiments, combining lifestyle factors with clinical features improved prediction accuracy over clinical data alone, indicating that lifestyle variables add predictive value. These findings highlight the potential of digital health tools that combine medical and everyday data to help people make informed choices. To safeguard user privacy, the system includes basic protections such as consent, role-based access, and anonymized storage.
Introduction
Overview:
Diabetes mellitus is a growing global health concern, with increasing incidence putting pressure on patients and healthcare systems. Traditional risk assessments focus mainly on clinical factors like blood glucose, BMI, and blood pressure. However, lifestyle elements such as diet, physical activity, stress, and sleep also play crucial roles in diabetes development and progression.
Current Research & Gaps:
A literature survey identified numerous machine learning (ML) approaches used in diabetes prediction. These include:
Lifestyle-only models (e.g., Qin et al., 2022) showed good accuracy but lacked diversity in datasets.
Clinical-only models (e.g., Alzboon et al., 2025) were accurate but limited in scope.
Large-scale datasets (e.g., Li et al., 2023) improved prediction but lacked real-time or behavioral integration.
Explainable models (e.g., Allani, 2023) prioritized transparency but were restricted to binary classification.
Sensor-based and emotion-aware models introduced innovative features but raised practicality and privacy concerns.
Key Research Gaps Identified:
A predominant focus on binary classification (diabetic vs. non-diabetic) that ignores prediabetes.
Lack of real-time tracking or integration with wearable/smartphone data.
Limited dataset diversity and external validation.
Underutilization of deep learning for lifestyle-based prediction.
Trade-offs between accuracy and model interpretability.
Minimal focus on younger populations or early intervention.
Proposed System: A Holistic Web-Based Diabetes Risk Predictor
The proposed solution is a MERN-stack (MongoDB, Express.js, React.js, Node.js) web application integrated with machine learning for multiclass diabetes risk prediction (Low, Moderate, High).
Key Features:
Comprehensive Data Collection:
Users enter clinical (e.g., HbA1c, BMI) and lifestyle (e.g., stress, sleep, diet) data through a user-friendly React interface.
Machine Learning-Based Prediction:
An ensemble ML model (e.g., Random Forest or boosting) predicts diabetes risk.
Explainability tools such as SHAP show users which factors contribute most to a prediction.
Multiclass Output:
Predicts Low, Moderate, or High risk levels (instead of just diabetic/non-diabetic).
Personalized Recommendations:
Suggests tailored diet, exercise, and stress management plans based on the user's risk level.
Health Record Tracking:
MongoDB stores user data securely, enabling longitudinal analysis of health trends over time.
Interactive Visualization:
Charts and graphs help users understand risk factors and lifestyle impacts clearly.
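The ensemble prediction and three-level output described above can be illustrated with a minimal sketch. It assumes soft voting, in which the per-class probabilities from several base models are averaged and the highest average determines the risk level. The text does not specify how the ensemble members are combined, so the combination rule, the function names `soft_vote` and `predict_risk`, and the example probabilities are all hypothetical.

```python
from typing import Dict, List, Sequence

# The three risk classes named in the text, in a fixed order.
RISK_LEVELS = ("Low", "Moderate", "High")


def soft_vote(probabilities: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities across base models (soft voting)."""
    if not probabilities:
        raise ValueError("at least one model output is required")
    n_classes = len(RISK_LEVELS)
    for row in probabilities:
        if len(row) != n_classes:
            raise ValueError("each model must return one probability per risk level")
    n_models = len(probabilities)
    return [sum(row[i] for row in probabilities) / n_models for i in range(n_classes)]


def predict_risk(probabilities: Sequence[Sequence[float]]) -> Dict[str, object]:
    """Return the winning risk level together with the averaged probabilities."""
    averaged = soft_vote(probabilities)
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return {
        "risk_level": RISK_LEVELS[best],
        "probabilities": dict(zip(RISK_LEVELS, averaged)),
    }


# Hypothetical outputs of two base models (for instance a random forest
# and a boosted model) for a single user.
result = predict_risk([[0.2, 0.5, 0.3], [0.1, 0.3, 0.6]])
print(result["risk_level"])
```

Returning the averaged probabilities alongside the label lets the interface show how close a user is to a neighbouring category rather than only the final class.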
System Design & Workflow:
Frontend: React.js + Tailwind CSS for data input and visual feedback.
Backend: Node.js + Express.js, with JWT-based authentication to secure API requests.
Database: MongoDB for storing health history.
ML Model: Python-based (scikit-learn/TensorFlow) for risk prediction.
Recommendation Engine: Provides preventive care tips based on risk score.
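One simple way to realise the recommendation engine is a lookup table keyed by risk level. The sketch below assumes this rule-based design, which the text does not confirm. The `recommend` function and the table contents are hypothetical, and the entries are placeholders rather than actual advice.

```python
from typing import Dict

RISK_LEVELS = ("Low", "Moderate", "High")
AREAS = ("diet", "exercise", "stress_management")

# Placeholder entries only: the text names the advice areas but not the advice itself.
RECOMMENDATIONS: Dict[str, Dict[str, str]] = {
    level: {area: f"<{area} plan for {level} risk>" for area in AREAS}
    for level in RISK_LEVELS
}


def recommend(risk_level: str) -> Dict[str, str]:
    """Return a copy of the advice entries for one risk level."""
    if risk_level not in RECOMMENDATIONS:
        raise ValueError(f"unknown risk level: {risk_level!r}")
    return dict(RECOMMENDATIONS[risk_level])


print(recommend("Moderate")["diet"])
```

Keeping the table separate from the model means the advice text can be reviewed and updated without retraining.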
Workflow Steps:
The user logs in and submits clinical and lifestyle data.
The backend authenticates the user and stores the submitted data.
The data is preprocessed and sent to the ML model.
The ML model predicts the risk level.
Recommendations and visualizations are returned to the user.
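The preprocessing and prediction steps of this workflow can be sketched as follows. The field names, the ordinal encodings, and the `handle_submission` helper are assumptions made for illustration, since the text lists the input factors but not their format. The model is passed in as a plain callable so that the sketch stays independent of any particular library.

```python
from typing import Any, Callable, Dict, List, Mapping

# Assumed field names and category labels (not given in the text).
NUMERIC_FIELDS = ("age", "bmi", "hba1c")
ORDINAL_FIELDS = {
    "physical_activity": ("low", "medium", "high"),
    "sleep_quality": ("poor", "fair", "good"),
}


def preprocess(form: Mapping[str, Any]) -> List[float]:
    """Validate a submitted form and turn it into a fixed-order feature vector."""
    features: List[float] = []
    for name in NUMERIC_FIELDS:
        if name not in form:
            raise ValueError(f"missing field: {name}")
        features.append(float(form[name]))
    for name, levels in ORDINAL_FIELDS.items():
        value = form.get(name)
        if value not in levels:
            raise ValueError(f"invalid value for {name}: {value!r}")
        # Encode each category by its position in the ordered label tuple.
        features.append(float(levels.index(value)))
    return features


def handle_submission(
    form: Mapping[str, Any], model: Callable[[List[float]], str]
) -> Dict[str, Any]:
    """Preprocess the form, call the model, and return the response payload."""
    features = preprocess(form)
    return {"features": features, "risk_level": model(features)}


# A stub stands in for the trained model here.
submission = {
    "age": 45,
    "bmi": 27.5,
    "hba1c": 6.1,
    "physical_activity": "low",
    "sleep_quality": "good",
}
print(handle_submission(submission, lambda features: "Moderate"))
```

Rejecting incomplete or invalid forms before the model is called keeps malformed input from producing a misleading risk level.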
Implementation Summary:
The system has been implemented with full-stack web technologies and ML integration. It focuses not just on predicting diabetes but on empowering users to manage their lifestyle proactively, aiming to reduce risk through early intervention and behavioral change.
Conclusion
The developed diabetes risk prediction system shows how combining medical factors such as BMI, blood pressure, glucose levels, and HbA1c with lifestyle patterns such as diet, sleep, stress, physical activity, and screen time can lead to more accurate and meaningful predictions.
Unlike traditional methods that focus only on clinical data, this approach highlights the everyday behaviors that contribute to diabetes risk, offering a more complete picture of an individual’s health. The system not only predicts risk levels but also provides personalized advice, including diet plans, exercise suggestions, and stress management tips, which shifts the focus from treatment after diagnosis to preventive care. With features like long-term tracking and clear visualizations, users can monitor their progress over time and gain better awareness of how their habits influence their health.
What distinguishes this system is its preventive outlook, explainable machine learning outputs, and user-friendly design. Together these make it more than a diagnostic tool: it becomes a supportive health companion. Looking ahead, the system can be enhanced with real-time data from wearables, larger and more diverse datasets for better accuracy, and a mobile app that provides continuous monitoring and instant feedback. By blending medical insights with lifestyle awareness, this work contributes to the growing field of AI-driven healthcare, promoting early detection, healthier choices, and an overall improvement in quality of life.
References
[1] Qin, Y., et al. (2022). Machine Learning Models for Data-Driven Prediction of Diabetes by Lifestyle Type. International Journal of Environmental Research and Public Health, 19(22), 15027.
[2] Adler, A. (2021). Using Machine Learning Techniques to Identify Key Risk Factors for Diabetes and Undiagnosed Diabetes. BMC Medical Informatics and Decision Making, 21, 123.
[3] Li, X., et al. (2023). Machine Learning for Predicting Diabetes Risk in Western China Adults. Diabetology & Metabolic Syndrome, 15(1), 112.
[4] Mohsen, F., et al. (2023). Artificial Intelligence-Based Methods for Precision Medicine: Diabetes Risk Prediction. arXiv preprint arXiv:2305.16346.
[5] Nguyen, B., & Zhang, Y. (2022). A Comparative Study of Diabetes Prediction Based on Lifestyle Factors Using Machine Learning. University of Washington Research.
[6] Kumar, R., et al. (2022). Deep Learning Model for Predicting Diabetes Using Mobile Sensor Data. Journal of Mobile Health Informatics, 9(3), 215–224.
[7] Sharma, M., & Singh, V. (2023). Lifestyle-Aware Diabetes Prediction with Feature Fusion. International Conference on Data Science and Health Analytics, 102–108.
[8] Das, P., et al. (2024). Emotion-Aware Health Monitoring for Diabetes Prediction Using Facial and Typing Patterns. IEEE Journal of Biomedical and Health Informatics, 28(2), 341–349.
[9] Rahman, M. M., et al. (2021). Risk Prediction of Diabetes Using Machine Learning Models. Scientific Reports (Nature), 11, 24154.
[10] Zhou, Z., et al. (2023). Explainable Artificial Intelligence for Type 2 Diabetes Risk Prediction: A Systematic Review. IEEE Access, 11, 47210–47225.