This paper explores machine learning techniques for assessing the anthropometric measures most commonly linked to type 2 diabetes mellitus (T2DM). According to recent data, T2DM, which is mostly associated with visceral or abdominal obesity and metabolic abnormalities, is more common in patients with metabolic syndrome. Identifying those at high risk for T2DM is vital given the disease's prevalence and serious complications. About one-third of patients with the disease remain undiagnosed, and this number is rising; anthropometric assessment is one of the most straightforward and non-invasive ways to detect individuals at risk. Reported predictors of diabetes include the Waist-to-Height Ratio (WHtR), Body Adiposity Index (BAI), A Body Shape Index (ABSI), Body Mass Index (BMI), and Waist Circumference (WC). According to one study, the best indicators are WC and BMI. Although several studies have addressed the topic, few have compared the values of these anthropometric indices and their relationship with the prevalence of diabetes. Additionally, conventional anthropometric techniques cannot detect visceral and subcutaneous fat. Given the rising prevalence of the disease, the dearth of accurate measurements for its diagnosis, and the controversy surrounding the findings of earlier studies, we chose to design a cohort study to investigate additional and novel anthropometric measures and their association with diabetes. To identify patients at risk of developing the disease, we also employed machine learning techniques, which can efficiently organize a large number of indicators while building strong predictive models.
Introduction
Overview of Diabetes Mellitus
Diabetes mellitus is a chronic metabolic disorder characterized by high blood sugar due to insufficient insulin production or ineffective insulin use.
Left untreated, it can lead to serious complications like heart disease, kidney failure, and nerve damage.
Traditional diagnosis involves invasive blood tests (fasting glucose, HbA1c), which are often inaccessible in low-resource regions.
The Need for Non-Invasive Alternatives
With 1 in 10 adults globally affected, diabetes is both a medical and societal crisis.
Anthropometric parameters—such as BMI, waist circumference, waist-to-hip ratio (WHR), and body adiposity—are non-invasive, low-cost, and effective indicators of diabetes risk.
Manual interpretation of these indicators is difficult due to complex metabolic interactions, but machine learning (ML) can help uncover patterns and predict diabetes efficiently.
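To make the indicators above concrete, the following Python sketch computes several of them from raw body measurements. It assumes the commonly published definitions (BAI as hip circumference in cm divided by height in m raised to 1.5, minus 18; ABSI as waist circumference divided by BMI^(2/3) times the square root of height, with lengths in metres). The function names and the example measurements are hypothetical and are not taken from the study.

```python
import math


def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def whtr(waist_cm: float, height_cm: float) -> float:
    """Waist-to-Height Ratio: both lengths in the same unit."""
    return waist_cm / height_cm


def whr(waist_cm: float, hip_cm: float) -> float:
    """Waist-to-Hip Ratio: both circumferences in the same unit."""
    return waist_cm / hip_cm


def bai(hip_cm: float, height_m: float) -> float:
    """Body Adiposity Index: hip (cm) / height (m)^1.5 - 18."""
    return hip_cm / height_m ** 1.5 - 18


def absi(waist_m: float, weight_kg: float, height_m: float) -> float:
    """A Body Shape Index: waist (m) / (BMI^(2/3) * sqrt(height in m))."""
    return waist_m / (bmi(weight_kg, height_m) ** (2 / 3) * math.sqrt(height_m))


# Hypothetical subject: 70 kg, 1.75 m tall, 85 cm waist, 100 cm hip
subject = {
    "BMI": bmi(70, 1.75),
    "WHtR": whtr(85, 175),
    "WHR": whr(85, 100),
    "BAI": bai(100, 1.75),
    "ABSI": absi(0.85, 70, 1.75),
}
for name, value in subject.items():
    print(f"{name}: {value:.4f}")
```

All of these need only a scale and a tape measure, which is what makes them attractive as inputs to a screening model.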
Machine Learning for Diabetes Prediction
ML models can predict diabetes without lab tests by learning from anthropometric and demographic data.
Commonly used ML models include Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), and K-Nearest Neighbors (KNN).
ML handles the non-linear relationships between features and diabetes risk effectively.
Key Studies and Findings
Saberi-Karimian et al. [1]: Waist circumference (WC), Body Adiposity Index (BAI), and A Body Shape Index (ABSI) are strong T2DM predictors. Decision tree models categorized subjects effectively using these measures.
Wee et al. [8]: Deep learning models (CNN, DNN, MLP) showed high accuracy (~98.1%) in diabetes classification, though dataset standardization remains a challenge.
Wei et al. [2]: In a Chinese cohort, the Chinese Visceral Adiposity Index (CVAI) had the highest predictive power, outperforming BMI and WC in ROC analysis.
Hosseini et al. [10]: KNN outperformed DT and logistic regression. The best predictors differed by gender (e.g., BMI, BAI, and mid-arm circumference (MAC) for females; body roundness index (BRI) and MAC for males).
Lugner et al. [9]: Using UK Biobank data and XGBoost, the authors found HbA1c and BMI to be the top predictors, suggesting that biological factors are more reliable than lifestyle data.
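Several of the studies above compare indices by ROC analysis. As a hedged illustration of what such a comparison computes, the sketch below evaluates the area under the ROC curve (AUC) as the probability that a randomly chosen diabetic subject has a higher index value than a randomly chosen non-diabetic one. The function name `roc_auc` and the toy values for `index_a` and `index_b` are invented for illustration and do not reproduce any cited result.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison: fraction of (positive, negative) pairs
    in which the positive case has the higher score; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = diabetic, 0 = non-diabetic (hypothetical values)
labels = [0, 0, 0, 1, 1, 1]
index_a = [22, 31, 24, 29, 26, 33]     # overlaps between the two groups
index_b = [80, 95, 85, 110, 100, 120]  # separates the two groups fully

for name, values in [("index_a", index_a), ("index_b", index_b)]:
    print(name, round(roc_auc(values, labels), 3))
```

An index with the larger AUC ranks diabetic subjects above non-diabetic ones more consistently, which is the sense in which one index "outperforms" another in this kind of analysis.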
Proposed Methodology
The proposed method is a five-step process that uses both qualitative and quantitative data:
1) Data input and preprocessing (normalize numeric values, encode categorical variables).
2) Feature selection to reduce complexity.
3) Classifier training and testing (Logistic Regression (LR), DT, RF).
4) Prediction of class labels (diabetic/non-diabetic).
5) Validation using performance metrics (accuracy, precision, recall, F1 score, ROC AUC).
Data used: an open-source dataset of 10,000 subjects (4,835 non-diabetic, 5,165 diabetic). Parameters include BMI, waist circumference, blood pressure, and lifestyle factors such as smoking and activity level.
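The five steps above can be sketched with scikit-learn and NumPy. The source does not name its tooling, so the library choice, the synthetic stand-in data, the column layout, and every hyperparameter below are assumptions made for illustration; this is not the study's dataset or its exact method.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption). Columns:
# 0 = BMI, 1 = waist circumference (cm), 2 = systolic BP, 3 = smoking code (0/1/2)
rng = np.random.default_rng(0)
n = 1000
bmi = rng.normal(27, 4, n)
waist = rng.normal(92, 12, n)
sbp = rng.normal(125, 15, n)
smoking = rng.integers(0, 3, n)
X = np.column_stack([bmi, waist, sbp, smoking])
# Labels come from a noisy linear rule so the task is learnable but not trivial
y = (0.25 * (bmi - 27) + 0.08 * (waist - 92) + rng.normal(0, 1.5, n) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)


def build_pipeline(classifier):
    """Steps 1-3: preprocessing, feature selection, classifier."""
    preprocess = ColumnTransformer(
        [("num", MinMaxScaler(), [0, 1, 2]),                       # normalize numeric
         ("cat", OneHotEncoder(handle_unknown="ignore"), [3])],    # encode categorical
        sparse_threshold=0.0)                                      # always dense output
    return Pipeline([("prep", preprocess),
                     ("select", SelectKBest(f_classif, k=4)),
                     ("clf", classifier)])


models = {
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(max_depth=5, random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, clf in models.items():
    pipe = build_pipeline(clf).fit(X_train, y_train)
    pred = pipe.predict(X_test)               # step 4: class labels
    prob = pipe.predict_proba(X_test)[:, 1]   # scores for ROC AUC
    results[name] = {                         # step 5: validation metrics
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, prob),
    }
    print(name, {k: round(v, 3) for k, v in results[name].items()})
```

Keeping the scaler, encoder, and feature selector inside the pipeline means they are fitted on the training split only, so no information from the test split reaches the model before evaluation.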
Model Performance
Logistic Regression:
F1 Score: 0.96, ROC AUC: 0.99
Good performance, highly interpretable.
Decision Tree & Random Forest:
Perfect performance: Accuracy, Precision, Recall, F1 Score = 1.0
Random Forest is more robust but computationally expensive.
Confusion Matrices:
LR misclassified some cases.
DT and RF classified all correctly (100% accuracy).
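The metrics reported above all derive from the four counts of a binary confusion matrix. A minimal sketch of that relationship follows; the source does not list the actual matrices, so the counts used here are hypothetical and the function name `metrics_from_confusion` is invented for illustration.

```python
def metrics_from_confusion(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts.

    tp/fn: diabetic subjects predicted diabetic / non-diabetic
    fp/tn: non-diabetic subjects predicted diabetic / non-diabetic
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical classifier with a few errors
with_errors = metrics_from_confusion(tp=40, fp=10, fn=5, tn=45)
# Hypothetical classifier with no errors: every metric equals 1.0
no_errors = metrics_from_confusion(tp=50, fp=0, fn=0, tn=50)

print(with_errors)
print(no_errors)
```

A score of 1.0 on every metric corresponds to a confusion matrix with zero off-diagonal entries, which is the situation described for DT and RF.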
Conclusion
The future of diabetes detection using anthropometric parameters and machine learning lies in personalized, accessible, and explainable AI solutions that can integrate seamlessly into healthcare systems. By leveraging emerging technologies and expanding datasets, these models can revolutionize early diagnosis and preventive care globally.
This approach presents a cost-effective, non-invasive, and scalable route to early diagnosis and risk assessment. By combining easily measurable body metrics such as BMI, waist circumference, waist-to-hip ratio, and skinfold thickness with advanced ML algorithms, it offers a promising alternative to traditional blood-based tests, particularly in low-resource settings.
1) High Accuracy: Machine learning models (e.g., Random Forest, SVM, Neural Networks) can effectively predict diabetes risk using anthropometric and demographic data.
2) Early Detection: Enables identification of high-risk individuals before clinical symptoms appear, allowing for timely lifestyle interventions.
3) Accessibility: Reduces dependency on invasive tests, making screening feasible in remote areas with limited healthcare infrastructure.
4) Integration Potential: Can be combined with wearable devices, mobile health apps, and electronic health records for continuous monitoring.
However, challenges such as dataset bias, model interpretability, and the need for clinical validation must be addressed to ensure reliability and widespread adoption. Future advancements in explainable AI (XAI), federated learning, and multi-modal data integration (combining anthropometrics with genetic and biochemical markers) will further enhance predictive performance.
In conclusion, machine learning-driven diabetes detection using anthropometric parameters holds significant potential to revolutionize preventive healthcare, reduce the global diabetes burden, and improve patient outcomes through early, affordable, and non-invasive screening. Continued research, real-world validation, and collaboration between AI experts and medical professionals will be crucial for its successful implementation.
References
[1] Saberi-Karimian M, Mansoori A, Mohammadi Bajgiran M, et al. Data mining approaches for type 2 diabetes mellitus prediction using anthropometric measurements. Journal of Clinical Laboratory Analysis. 12 December 2022. https://doi.org/10.1002/jcla.24798
[2] Wei J, Liu X, Xue H, Wang Y, Shi Z. Comparisons of Visceral Adiposity Index, Body Shape Index, Body Mass Index and Waist Circumference and Their Associations with Diabetes Mellitus in Adults. Nutrients. 2019; 11(7):1580. https://doi.org/10.3390/nu11071580
[3] Salpea P, Malanda B, Karuranga S, Unwin N, Colagiuri S, Guariguata L, Motala AA, Ogurtsova K, Shaw JE, Bright D, Williams R (2019) Global and regional diabetes prevalence estimates for 2019 and projections for 2030 and 2045: Results from the International Diabetes Federation Diabetes Atlas, 9th edition. Diabetes Res Clin Pract 157:107843. https://doi.org/10.1016/j.diabres.2019.107843
[4] Chawla R, Madhu S, Makkar B, Ghosh S, Saboo B, Kalra S et al (2020) RSSDI-ESI clinical practice recommendations for the management of type 2 diabetes mellitus 2020. Indian J Endocrinol Metab 24(1):1
[5] Sacks DB, Arnold M, Bakris GL, Bruns DE, Horvath AR, Kirkman MS, Lernmark A, Metzger BE, Nathan DM (2011) Guidelines and Recommendations for Laboratory Analysis in the Diagnosis and Management of Diabetes Mellitus. Diabetes Care 34(6):e61–e99. https://doi.org/10.2337/dc11-9998 https://diabetesjournals.org/care/article-pdf/34/6/e61/609322/e61.pdf
[6] Kazmi NHS, Gillani S, Afzal S, Hussain S (2013) Correlation between glycated haemoglobin levels and random blood glucose. J Ayub Med Coll 25(1–2):86–88
[7] Zaccardi F, Dhalwani NN, Papamargaritis D, Webb DR, Murphy GJ, Davies MJ, Khunti K (2017) Nonlinear association of BMI with all-cause and cardiovascular mortality in type 2 diabetes mellitus: a systematic review and meta-analysis of 414,587 participants in prospective studies. Diabetologia 60(2):240–248.
[8] Wee, B.F., Sivakumar, S., Lim, K.H. et al. Diabetes detection based on machine learning and deep learning approaches. Multimed Tools Appl 83, 24153–24185 (2024). https://doi.org/10.1007/s11042-023-16407-5.
[9] Lugner, M., Rawshani, A., Helleryd, E. et al. Identifying top ten predictors of type 2 diabetes through machine learning analysis of UK Biobank data. Sci Rep 14, 2102 (2024). https://doi.org/10.1038/s41598-024-52023-5
[10] Hosseini N, Tanzadehpanah H, Mansoori A, Sabzekar M, Ferns GA, Esmaily H, Ghayour-Mobarhan M. Using a robust model to detect the association between anthropometric factors and T2DM: machine learning approaches. BMC Med Inform Decis Mak. 2025 Jan 31;25(1):49. doi: 10.1186/s12911-025-02887-y. PMID: 39891090; PMCID: PMC11786328.
[11] Liu G, Li Y, Hu Y, Zong G, Li S, Rimm EB, Hu FB, Manson JE, Rexrode KM, Shin HJ et al (2018) Influence of lifestyle on incident cardiovascular disease and mortality in patients with diabetes mellitus. J Am Coll Cardiol 71(25):2867–2876.