This study uses real-time data and predictive analytics, built on machine learning techniques, to mitigate stillbirth risk. We used the Cardiotocography (CTG) dataset to classify fetal health state as Normal, Suspect, or Pathologic. Preprocessing comprised data cleaning, correlation-based feature selection, and scaling. Eight machine learning models were trained and evaluated using standard metrics: Logistic Regression, Support Vector Machine (SVM), Decision Tree, Random Forest, Gradient Boosting, AdaBoost, XGBoost, and LightGBM. Bayesian hyperparameter tuning was used to improve performance, and SHAP (SHapley Additive exPlanations) was used to explain feature importance. The boosting ensembles LightGBM and XGBoost achieved the highest performance, with F1-scores of approximately 0.992.
Introduction
Stillbirth, defined as fetal death at or after 28 weeks of gestation, remains a major global health issue, especially in Sub-Saharan Africa and South Asia. Key causes include maternal health problems, fetal complications, and limited healthcare access. Monitoring fetal well-being is therefore crucial. Cardiotocography (CTG) is a common non-invasive method that tracks fetal heart rate and uterine contractions, but manual interpretation of CTG traces can be subjective.
This study applies multiple machine learning (ML) models to automate CTG analysis for early and reliable detection of fetal distress, with the aim of reducing stillbirth risk. Using a public CTG dataset (2,126 records, 36 features after cleaning), we trained eight classifiers and optimized them via Bayesian hyperparameter tuning: Logistic Regression, SVM, Decision Tree, Random Forest, Gradient Boosting, AdaBoost, XGBoost, and LightGBM.
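The preprocessing steps named in this study (correlation-based feature selection and scaling) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's actual procedure: the text does not give the selection rule or threshold, so the sketch assumes a simple filter that drops any feature whose absolute Pearson correlation with an already-kept feature exceeds 0.95, followed by z-score standardization. The data values and the `ASTV_copy` column are made up for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def drop_correlated(columns, threshold=0.95):
    """Keep a feature only if |r| with every already-kept feature is <= threshold.

    The threshold is an assumed value; the source does not state one.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def standardize(values):
    """Scale one feature to zero mean and unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Toy values, not real CTG records. ASTV and MSTV are feature names from the
# text; ASTV_copy is a hypothetical redundant column added for illustration.
columns = {
    "ASTV": [73.0, 17.0, 16.0, 16.0, 26.0, 29.0],
    "MSTV": [0.5, 2.1, 2.1, 2.4, 5.9, 6.3],
    "ASTV_copy": [146.0, 34.0, 32.0, 32.0, 52.0, 58.0],
}

selected = drop_correlated(columns, threshold=0.95)
scaled = {name: standardize(columns[name]) for name in selected}
```

In this toy input, `ASTV_copy` is an exact multiple of `ASTV`, so the filter removes it and keeps the other two columns. In a real pipeline the scaling statistics would be computed on the training split only and then applied to the test split, to avoid leaking test information into training.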
Boosting models (XGBoost and LightGBM) achieved the highest F1-scores (~0.992), indicating strong predictive performance. SHAP (SHapley Additive exPlanations) was employed to improve model transparency, revealing key features like suspicious patterns (SUSP), light decelerations (LD), and variability measures (ASTV, MSTV) as influential predictors, consistent with clinical understanding.
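To make the idea behind SHAP concrete, the sketch below computes exact Shapley values by brute force for a tiny hand-written scoring function. It is an illustration of the underlying attribution principle only: it is not the SHAP library, which uses faster model-specific and approximate algorithms, and it is not the study's model. The function `toy_score`, its weights, the feature values, and the baseline are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline.

    Features absent from a coalition are replaced by their baseline value.
    The cost grows as 2^n, so this is practical only for a few features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(features):
    """Hypothetical risk score over (ASTV, MSTV, LD); weights are made up."""
    astv, mstv, ld = features
    return 0.02 * astv - 0.10 * mstv + 0.30 * ld + 0.01 * astv * ld


x = [70.0, 0.5, 2.0]          # instance to explain (illustrative values)
baseline = [40.0, 1.5, 0.0]   # reference point (illustrative values)

phi = shapley_values(toy_score, x, baseline)
for name, contribution in zip(["ASTV", "MSTV", "LD"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the score at `x` and the score at the baseline, which is the additivity property that gives SHAP its name. The interaction term between ASTV and LD is split between those two features, while MSTV, which enters only linearly, receives exactly its own linear contribution.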
Conclusion
This study combined real-time cardiotocography (CTG) signals with machine learning methods to predict fetal health conditions associated with stillbirth. Among the eight models tested, the boosting ensembles XGBoost and LightGBM performed best, with F1-scores of approximately 0.992. SHAP values provided insight into individual model decisions, which can help build confidence among healthcare professionals. Future work will focus on deploying these models in clinical practice and on incorporating additional features to improve accuracy, fairness, and reliability.