Heart disease is a leading cause of death worldwide and poses a significant challenge to public health, so early and accurate prediction is especially important. Machine learning methods applied to patient records have been used to predict whether a patient has heart disease. This paper presents a comprehensive comparison of learning-based models for predicting cardiac-related conditions, including Decision Tree, Random Forest, XGBoost, LightGBM, and Multilayer Perceptron. First, findings on the accuracy of these models reported in the existing literature are briefly discussed. Then we experimentally evaluate our proposed Extra Trees Classifier using conventional classification metrics. We mainly use the synthetic_heart_disease_dataset, which contains clinical and demographic indicators widely used in cardiovascular risk assessment. Besides presenting the existing approaches, we propose an ensemble-based Extra Trees classifier that aims to increase prediction accuracy by introducing additional randomness in feature selection and training. Compared with conventional models, the proposed model has lower variance and better generalization ability. Experiments show that our Extra Trees Classifier reaches an accuracy of 96.17% and outperforms earlier methods such as Decision Tree, Random Forest, XGBoost, and Multilayer Perceptron. The comparative analysis indicates that ensemble learning methods achieve higher performance in predicting heart disease. The approach presented here could serve as an efficient tool for clinical decision support systems aimed at earlier detection of heart disease.
Introduction
This paper focuses on the early and accurate prediction of heart disease, a leading cause of global mortality, using machine learning techniques. Traditional diagnostic methods rely on clinical tests and physician expertise, which can be time-consuming, costly, and prone to human error. With the rapid growth of healthcare data, machine learning and data mining have emerged as effective tools for extracting meaningful patterns from clinical datasets to support timely cardiac risk assessment.
The study reviews existing machine learning approaches such as Decision Trees, Random Forests, XGBoost, LightGBM, and Neural Networks, highlighting their strengths and limitations in terms of accuracy, computational cost, and interpretability. While ensemble and neural network models generally outperform basic classifiers, challenges remain regarding robustness, complexity, and explainability.
To address these issues, the paper proposes an Extra Trees–based ensemble model for early-stage heart disease prediction. Using a synthetic Kaggle heart disease dataset containing demographic, lifestyle, clinical, and medical history features, the study applies systematic preprocessing, feature analysis, and model evaluation. The proposed model introduces increased randomness to reduce overfitting and improve generalization.
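As an illustration of the workflow described above, the following is a minimal sketch using scikit-learn. It is not the paper's actual pipeline: the data comes from `make_classification` as a stand-in for the heart disease dataset, and the feature counts, split ratio, and `n_estimators` value are assumptions chosen only to make the example runnable.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): a generated binary classification problem,
# NOT the synthetic Kaggle heart disease dataset used in the paper.
X, y = make_classification(
    n_samples=1000, n_features=12, n_informative=6, n_redundant=2, random_state=0
)

# Hold out 20% of the samples for testing, preserving the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Extra Trees: an ensemble of randomized decision trees.
# n_estimators=200 is an illustrative choice, not a value from the paper.
model = ExtraTreesClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Conventional classification metrics on the held-out split.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The scores printed here describe the generated stand-in data only and say nothing about the results reported in this paper.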
Experimental results show that the Extra Trees classifier achieves superior performance, with an accuracy of 96.17%, outperforming previously reported models. The findings suggest that randomized ensemble methods provide an effective balance between accuracy, stability, and computational efficiency, making them suitable for real-time clinical decision support and early cardiac risk detection.
Conclusion
This paper presents a detailed comparison of how different ML-based approaches perform in assessing cardiac risk [4]. Using a synthetic Kaggle heart disease dataset, it explores an ensemble-based Extra Trees Classifier [12].
The developed model achieved an accuracy of 96.17%, outperforming many previously reported methods in the literature. This improvement can be largely attributed to the ensemble structure of the Extra Trees approach, which introduces additional randomness during training and helps reduce overfitting.
By combining multiple randomized decision trees, the Extra Trees technique is able to capture complex patterns in the data while maintaining good generalization performance. Overall, the results demonstrate that ensemble learning methods, particularly those implemented through the Extra Trees framework, are effective and reliable for supporting cardiac risk assessment in clinical decision-making [11], [12].
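The idea of combining randomized trees can be sketched in a few lines of plain Python. This is a deliberately simplified illustration and not the model evaluated in this paper: each "tree" is a depth-1 stump, candidate splits are scored by misclassification count rather than an impurity measure, and the toy data, the helper names (`majority`, `fit_random_stump`, `predict`), and the parameter `k` are all hypothetical.

```python
import random
from collections import Counter


def majority(labels, fallback):
    """Return the most common label, or the fallback if the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else fallback


def fit_random_stump(rows, n_features, k, rng):
    """Fit one randomized stump: for k random features, draw ONE random
    threshold each (no search over thresholds) and keep the best candidate."""
    fallback = majority([y for _, y in rows], None)
    best = None
    for feature in rng.sample(range(n_features), k):
        values = [x[feature] for x, _ in rows]
        # The threshold is drawn uniformly within the observed feature range.
        threshold = rng.uniform(min(values), max(values))
        left = majority([y for x, y in rows if x[feature] <= threshold], fallback)
        right = majority([y for x, y in rows if x[feature] > threshold], fallback)
        errors = sum(
            (left if x[feature] <= threshold else right) != y for x, y in rows
        )
        if best is None or errors < best[0]:
            best = (errors, feature, threshold, left, right)
    return best[1:]  # (feature, threshold, left_label, right_label)


def predict(stumps, x):
    """Majority vote over all stumps in the ensemble."""
    votes = Counter(left if x[f] <= t else right for f, t, left, right in stumps)
    return votes.most_common(1)[0][0]


# Toy data (assumption): the label depends only on the first feature.
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(300)]
rows = [(p, int(p[0] > 0.5)) for p in points]

# Every stump sees the full training set; only the splits are randomized.
stumps = [fit_random_stump(rows, n_features=2, k=2, rng=rng) for _ in range(101)]
accuracy = sum(predict(stumps, x) == y for x, y in rows) / len(rows)
print(f"ensemble training accuracy: {accuracy:.3f}")
```

Any single stump places its threshold somewhat arbitrarily, but the vote across many stumps averages out those individual errors, which is the intuition behind the variance reduction discussed above.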
References
[1] Nicholas, G. Hoendarto, and J. Tjen, "cardiac risk Prediction with Decision Tree," Social Science and Humanities Journal, vol. 9, no. 1, pp. 6451-6457, Jan. 2025, doi: 10.18535/sshj.v9i01.1444.
[2] H. Al Amin, S. Wibisono, E. Lestariningsih, and M. L. M.A, "Optimizing cardiac risk Prediction with Random Forest and Ensemble Methods," COGITO Smart Journal, vol. 11, no. 1, pp. 180-[Page Numbers], June 2025.
[3] Sakyi-Yeboah et al., "cardiac risk Prediction Using Ensemble Tree Algorithms: A Supervised Learning Perspective," Applied Computational Intelligence and Soft Computing, vol. 2025, Art. ID 1989813, 18 pages, 2025, doi: 10.1155/acis/1989813.
[4] Xia, "Influencing Factors and Prediction of Heart Disease," Highlights in Science, Engineering and Technology, vol. 123 (BFSPH 2024), pp. 586-592, 2024.
[5] Jiang, "cardiac risk Prediction Using Machine Learning Algorithms," Master's Thesis, University of California, Los Angeles, 2020.
[6] M. Meti and Dr. Lingraj, "Heart Boost: Clinical Data-Driven cardiac risk Prediction Using XGBoost," International Research Journal on Advanced Engineering Hub (IRJAEH), vol. 3, no. 9, pp. 3517-3525, Sep. 2025, doi: 10.47392/IRJAEH.2025.0517.
[7] A.T L, A. BK, and D. D, "cardiac risk Prediction Using Logistic Regression," Indian Journal of Computer Science and Technology, vol. 4, no. 2, pp. 356-359, May-Aug. 2025, doi: 10.59256/indjcst.20250402048.
[8] F. Y. Ayankoya et al., "cardiac risk prediction using machine learning model," Global Journal of Engineering and Technology Advances, vol. 24, no. 2, pp. 036-049, 2025, doi: 10.30574/gjeta.2025.24.2.0223.
[9] B. Shehzadi et al., "cardiac risk Prediction Statistical Analysis and Classification of cardiac risk Using Clinical Parameters," Social Sciences & Humanity Research Review, Jan.-Mar. 2025, pp. [Page Numbers], ISSN: 3007-3162.
[10] Y. Chen, "Predicting cardiac risk Using Machine Learning: Analysis and New Insights," Dean&Francis [Journal Title Implicit], pp. [Page Numbers], ISSN: 2959-6157.
[11] S. Chaudhari, C. S. Gautam, and A. A. Waoo, "Optimizing cardiac risk Prediction Accuracy using Machine Learning Models," International Journal of All Research Education and Scientific Methods (IJARESM), vol. 12, no. 6, June 2024, pp. [Page Numbers], ISSN: 2455-6211.
[12] V. V. R. Karna et al., "A Comprehensive Review on cardiac risk Prediction using Machine Learning and Deep Learning Algorithms," Archives of Computational Methods in Engineering, pp. [Page Numbers], 2024, doi: 10.1007/s11831-024-10194-4.
[13] Anjali Regala, SD Ravikanti, and RG Franklin, "Design and implementation of cardiac risk prediction using naive Bayesian," International conference on trends in electronics and informatics (ICOEI), pp. 292-297.
[14] VV Ramalingam, A Dasapopath, and MK. Raja, "cardiac risk prediction using machine learning techniques-a survey," International journal of Engineering & Technologies, vol. 7, no. 5.8, pp. 684-7.
[15] E. I. Elsedimy, S. M. M. Abo Hashish, and E. Alzgara, "New cardiovascular disease prediction approach using support vector machine and quantum-behaved particle swarm optimization," Multimedia Tools and Applications, 2023.