Pregnancy complications remain a major threat to mothers and babies worldwide. According to WHO estimates, about 712 women die every day from complications of pregnancy and childbirth, with more than 90% of these deaths occurring in low- and lower-middle-income countries [1]. Conventional assessment of maternal and fetal risk relies on human interpretation of clinical measurements and cardiotocography (CTG) signals, which is tedious, time-consuming, and prone to significant inter-rater variability [2]. In this work, we propose MaternaInsight, a combined clinical decision support system that uses machine-learning models for both maternal and fetal health risk prediction. The maternal model assigns one of three risk categories (Low, Mid, or High) using a LightGBM classifier tuned with Optuna and trained on data rebalanced with SMOTETomek; it achieves 90.64% test accuracy (macro-F1 = 90.62%). The fetal model is a Stacking Ensemble with a logistic regression meta-classifier, trained on 39 CTG-derived features; it achieves 94.13% test accuracy and a macro-F1 of 88.66%. Both classifiers are trained under a strict Split-Scale-Balance sequence to avoid data leakage, and both use SHAP to explain individual predictions, which supports clinical interpretability. The entire pipeline is packaged as a four-page Streamlit web app that includes a Clinical Reference and a Performance Dashboard.
Introduction
This study presents MaternaInsight, a web-based healthcare system that uses machine learning (ML) to predict both maternal health risks and fetal health status during pregnancy. Maternal and fetal mortality remain significant global health challenges, especially in developing countries. Traditional methods such as Cardiotocography (CTG) for fetal monitoring require expert interpretation and often produce inconsistent results, while manual assessment of maternal risk factors can lead to delayed diagnosis of conditions such as hypertension, preeclampsia, and gestational diabetes.
To address these issues, MaternaInsight integrates two optimized ML models into a single platform and provides explainable predictions using SHAP values, helping healthcare professionals understand the factors influencing each prediction.
Related Work
Previous studies have applied machine learning to either maternal risk prediction or fetal health classification separately. Research on maternal health has achieved high prediction accuracy using physiological parameters such as blood pressure, blood glucose, heart rate, and temperature. Similarly, CTG-based fetal health classification has shown strong performance using models like Random Forest, Gradient Boosting, and ensemble methods. However, most existing studies:
Focus on either maternal or fetal health alone.
Lack explainable AI mechanisms.
Often suffer from data leakage due to improper preprocessing procedures.
MaternaInsight addresses these limitations by combining maternal and fetal assessment in one system, incorporating SHAP-based explanations, and following a leakage-free data preparation process.
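The leakage concern can be made concrete with a minimal sketch of the Split → Scale → Balance order, written in plain Python. It is an illustration under stated assumptions, not the MaternaInsight implementation: the function name `split_scale_balance`, the toy data, and the 25% test fraction are hypothetical, and simple random oversampling stands in for SMOTETomek to keep the example dependency-free. What it shows is that the scaling statistics and the resampling are derived from the training rows only, so the test rows stay untouched.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def split_scale_balance(X, y, test_frac=0.25, seed=0):
    """Hypothetical helper: split first, then scale and balance using training rows only."""
    rng = random.Random(seed)

    # 1. Split: shuffle indices and hold out a test set before anything else.
    idx = list(range(len(X)))
    rng.shuffle(idx)
    n_test = int(len(idx) * test_frac)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]

    # 2. Scale: column means and standard deviations come from the training rows only.
    columns = list(zip(*X_train))
    mu = [mean(col) for col in columns]
    sd = [pstdev(col) or 1.0 for col in columns]  # guard against constant columns

    def scale(row):
        return [(v - m) / s for v, m, s in zip(row, mu, sd)]

    X_train = [scale(row) for row in X_train]
    X_test = [scale(row) for row in X_test]  # reuses the training statistics

    # 3. Balance: resample the training set only.
    #    Random oversampling is a stand-in here, NOT SMOTETomek.
    counts = Counter(y_train)
    target = max(counts.values())
    X_bal, y_bal = list(X_train), list(y_train)
    for label, n in counts.items():
        pool = [row for row, lab in zip(X_train, y_train) if lab == label]
        for _ in range(target - n):
            X_bal.append(rng.choice(pool))
            y_bal.append(label)

    return X_bal, y_bal, X_test, y_test


# Toy, made-up data: two numeric columns and an imbalanced three-class label.
data_rng = random.Random(42)
X = [[data_rng.gauss(120, 15), data_rng.gauss(80, 10)] for _ in range(40)]
y = ["low"] * 28 + ["mid"] * 8 + ["high"] * 4

X_bal, y_bal, X_test, y_test = split_scale_balance(X, y)
print("balanced training classes:", Counter(y_bal))
print("untouched test rows:", len(X_test))
```

Reversing the order, that is, scaling or balancing before the split, would let information about the test rows influence training, which is the kind of leakage this sequence is meant to avoid.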
Methodology
Maternal Health Risk Prediction
Uses the UCI Maternal Health Risk dataset containing 1,014 records.
Inputs include age, systolic and diastolic blood pressure, blood glucose, body temperature, and heart rate.
Twenty additional clinically relevant engineered features were created, such as pulse pressure, mean arterial pressure, shock index, hypertension flags, and blood sugar interactions.
Data preprocessing followed a strict Split → Scale → Balance sequence to prevent data leakage.
Several ML models were evaluated, including Logistic Regression, Random Forest, XGBoost, Voting Ensemble, Stacking Ensemble, and LightGBM.
Hyperparameter optimization was performed using Optuna.
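As an illustration of the engineered features named above, the sketch below derives pulse pressure, mean arterial pressure, shock index, and a hypertension flag from a single record using their standard clinical definitions. The class and function names and the 140/90 mmHg flag threshold are assumptions made for this example; the exact definitions and thresholds used in the study are not stated here, and the blood sugar interaction terms are omitted for that reason.

```python
from dataclasses import dataclass


@dataclass
class Vitals:
    """Hypothetical container for one maternal record."""
    age: int
    systolic_bp: float   # mmHg
    diastolic_bp: float  # mmHg
    blood_sugar: float
    body_temp: float
    heart_rate: float    # beats per minute


def engineer_features(v, sbp_cutoff=140.0, dbp_cutoff=90.0):
    """Derive a few features; the cutoffs are assumed, not taken from the study."""
    pulse_pressure = v.systolic_bp - v.diastolic_bp
    return {
        # Difference between systolic and diastolic pressure.
        "pulse_pressure": pulse_pressure,
        # Common approximation: diastolic plus one third of pulse pressure.
        "mean_arterial_pressure": v.diastolic_bp + pulse_pressure / 3.0,
        # Heart rate divided by systolic pressure.
        "shock_index": v.heart_rate / v.systolic_bp,
        # 1 if either pressure reaches its (assumed) cutoff, else 0.
        "hypertension_flag": int(
            v.systolic_bp >= sbp_cutoff or v.diastolic_bp >= dbp_cutoff
        ),
    }


# Made-up example record, not taken from the dataset.
record = Vitals(age=30, systolic_bp=120, diastolic_bp=80,
                blood_sugar=7.0, body_temp=98.0, heart_rate=72)
features = engineer_features(record)
print(features)
```

In a leakage-free pipeline, row-wise features like these can be computed before or after the split, because each value depends only on its own record.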
Results
The LightGBM model with Optuna optimization achieved the best performance:
Accuracy: 90.64%
Macro F1-score: 90.62%
Stable cross-validation performance, which is consistent with good generalization and with a leakage-free pipeline.
The confusion matrix showed accurate identification of low-, medium-, and high-risk pregnancies, with no low-risk cases incorrectly classified as high-risk, improving clinical safety.
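To show how accuracy and macro-F1 are read off a three-class confusion matrix, the following sketch computes both from a made-up matrix. The counts are hypothetical and are not the study's results; they are chosen only so that the Low row has a zero in the High column, mirroring the property described above. Rows are true classes and columns are predicted classes, in the order Low, Mid, High.

```python
def accuracy(cm):
    """Fraction of all cases on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total


def macro_f1(cm):
    """Unweighted mean of per-class F1; cm[i][j] = true class i predicted as class j."""
    n = len(cm)
    f1_scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, but true class differs
        fn = sum(cm[k]) - tp                       # true k, but predicted otherwise
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / n


# Hypothetical counts for illustration only (order: Low, Mid, High).
cm = [
    [45, 5, 0],   # true Low: none predicted as High
    [4, 40, 6],   # true Mid
    [0, 3, 47],   # true High
]

print(f"accuracy = {accuracy(cm):.4f}")
print(f"macro-F1 = {macro_f1(cm):.4f}")
```

Because macro-F1 weights every class equally, it can fall below accuracy when a less frequent class is harder to classify.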
Key Contributions
Unified platform for both maternal and fetal health assessment.
Explainable AI through SHAP value interpretation.
Leakage-free machine learning pipeline.
High prediction accuracy with optimized models.
Potential to support early diagnosis and better pregnancy care management.
Conclusion
This work proposed MaternaInsight, a machine-learning-based decision support system that integrates maternal health risk stratification and CTG-based fetal health assessment in a single web application. The maternal health prediction module uses LightGBM and attained 90.64% test accuracy (macro-F1 = 90.62%) using a 12-step ablation study with a feature-engineered set of 26 features, Optuna hyperparameter tuning, and SMOTETomek balancing. The fetal health assessment module uses a Stacking Ensemble with 39 clinically extracted CTG features and obtained 94.13% test accuracy (macro-F1 = 88.66%). Both modules offer post-hoc explanations using SHAP. Additionally, a Clinical Reference guide was developed to assist clinical users.
Three improvements are planned for future development. First, SHAP Natural Language Summaries will automatically convert SHAP feature contributions into natural-language descriptions. Second, a PDF export feature for prediction reports will allow predictions and their SHAP explanations to be included in patients' medical records. Third, an exploratory CTG Image Digitisation Pipeline will be investigated to generate numerical CTG parameters automatically from CTG waveform images.
References
[1] World Health Organization, "Mortality Trends and Maternal Health," WHO, Geneva, Switzerland, 2023. [Online]. Available: https://www.who.int/news-room/fact-sheets/detail/maternal-mortality
[2] D. Ayres-de-Campos, C. Y. Spong, and E. Chandraharan, "FIGO consensus guidelines on intrapartum fetal monitoring: Cardiotocography," Int. J. Gynaecol. Obstet., vol. 131, no. 1, pp. 13–24, Oct. 2015.
[3] M. Ahmed, M. A. Kashem, M. Rahman, and S. Khatun, "Review and Analysis of Risk Factor of Maternal Health in Remote Area Using the Internet of Things (IoT)," in Lecture Notes in Electrical Engineering, vol. 632, Springer, Singapore, 2020.
[4] Y. Salini, S. N. Mohanty, J. V. N. Ramesh, M. Yang, and M. M. V. Chalapathi, "Cardiotocography Data Analysis for Fetal Health Classification Using Machine Learning Models," IEEE Access, vol. 12, pp. 26005–26022, 2024.
[5] P. Fergus, M. Selvaraj, and C. Chalmers, "Machine learning ensemble modelling to classify caesarean section and vaginal delivery types using cardiotocography traces," Comput. Biol. Med., vol. 93, pp. 7–16, Feb. 2018.
[6] S. Bertini et al., "Using machine learning to predict complications in pregnancy: A systematic review," Front. Bioeng. Biotechnol., vol. 9, art. no. 780389, Jan. 2022.
[7] A. Mehbodniya, A. J. P. Lazar, J. Webber, and D. K. Sharma, "Fetal health classification from cardiotocographic data using machine learning," Expert Syst., vol. 39, no. 6, p. e12899, Jul. 2022.
[8] A. Khadidos, F. Saleem, S. Selvarajan, and Z. Ullah, "Ensemble machine learning framework for predicting maternal health risk during pregnancy," Sci. Rep., vol. 14, art. no. 21483, Sep. 2024.
[9] S. Venkatesh, H. Jha, F. Kazmi, and S. Zaidi, "Classification of Maternal Health Risks Using Machine Learning Methods," in Advances in Digital Health and Medical Bioengineering, IFMBE Proceedings, vol. 109, Springer, Cham, pp. 1–8, 2024.
[10] I. Rafique, M. Dilawar, A. Umer, and M. A. Hassan, "Classification of cardiotocography data for fetal health using feature selection techniques," in Artificial Intelligence in Intelligent Systems, Lecture Notes in Networks and Systems, vol. 229, Springer, Cham, pp. 34–44, 2021.
[11] A. Kuzu and Y. Santur, "Early diagnosis and classification of fetal health status from a fetal cardiotocography dataset using ensemble learning," Diagnostics, vol. 13, no. 15, p. 2471, Jul. 2023.
[12] A. Singha and V. Venkateswaran, "Cardiotocography fetal health data analysis using machine learning," in Proc. Int. Conf. Frontiers in Computing and Systems (COMSYS 2022), Lecture Notes in Networks and Systems, vol. 690, Springer, Singapore, pp. 449–462, 2023.
[13] S. Das, H. Mukherjee, K. Roy, and C. K. Saha, "Fetal Health Classification from Cardiotocograph for Both Stages of Labor: A Soft-Computing-Based Approach," Diagnostics, vol. 13, no. 5, p. 858, Feb. 2023.
[14] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, "SMOTE: Synthetic minority over-sampling technique," J. Artif. Intell. Res., vol. 16, pp. 321–357, 2002.
[15] T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, "Optuna: A next-generation hyperparameter optimization framework," in Proc. 25th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, Anchorage, AK, pp. 2623–2631, Jul. 2019.
[16] S. M. Lundberg and S.-I. Lee, "A unified approach to interpreting model predictions," in Advances in Neural Information Processing Systems (NeurIPS), vol. 30, pp. 4765–4774, Dec. 2017.
[17] T. O. Togunwa, A. O. Babatunde, and K.-R. Abdullah, "Deep Hybrid Model for Maternal Health Risk Classification in Pregnancy: Synergy of ANN and Random Forest," Front. Artif. Intell., vol. 6, p. 1213436, 2023.
[18] L. Breiman, "Random forests," Mach. Learn., vol. 45, no. 1, pp. 5–32, Oct. 2001.
[19] G. E. A. P. A. Batista, R. C. Prati, and M. C. Monard, "A study of the behavior of several methods for balancing machine learning training data," ACM SIGKDD Explor. Newsl., vol. 6, no. 1, pp. 20–29, 2004.