Machine Learning-Based Early Detection of Parkinson’s Disease

Authors: Mr. S. Somashakar, Mr. U. Kumar Sai, Mr. S. Vinod, Mr. S. Sai Vamsi, Mrs. P. Leelavathi

DOI Link: https://doi.org/10.22214/ijraset.2025.68833

Abstract

Parkinson’s disease (PD) is a progressive neurological condition that significantly impairs motor function and quality of life. Early diagnosis is essential for better outcomes and effective care. This study proposes a machine learning-based method for the early identification of Parkinson’s disease using clinical and biological speech data. Several classification methods, including Support Vector Machines (SVM), Random Forest, and k-Nearest Neighbors (k-NN), were assessed on their ability to differentiate between healthy people and PD patients. The model’s sensitivity, specificity, and accuracy show promise as a non-invasive, affordable diagnostic tool. The findings indicate that incorporating machine learning methods into clinical procedures for the early diagnosis of Parkinson’s disease is feasible.

Introduction

Parkinson’s disease (PD) is a progressive neurodegenerative disorder that affects movement through the loss of dopamine-producing neurons. Early diagnosis is crucial for better patient care, but traditional methods often rely on subjective clinical evaluations, which may miss early-stage PD. Recent advances in machine learning (ML) offer promising, objective tools for early detection by analyzing complex patterns in biological data, especially through non-invasive voice analysis, since vocal impairments are early indicators of PD.

The study uses various ML algorithms (SVM, Random Forest, XGBoost, AdaBoost) on a dataset of speech recordings from PD patients to classify and diagnose the disease. The data undergo preprocessing, feature extraction, and selection (using methods like Chi-square tests) to improve model accuracy and reduce irrelevant information. Models are trained and tested, with performance evaluated through metrics such as accuracy, precision, recall, F1-score, confusion matrices, and ROC curves.
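The feature-selection and evaluation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy feature matrix, the labels, and the function names `chi2_scores`, `top_k_features`, and `binary_metrics` are all hypothetical, and the chi-square score uses one common formulation for non-negative features (for example, after min-max scaling), which the source does not specify.

```python
from collections import Counter


def chi2_scores(X, y):
    """Chi-square score of each non-negative feature against the class label.

    For each feature, the per-class sum of feature values (observed) is
    compared with the sum expected if the feature were independent of class.
    """
    n_samples = len(y)
    class_counts = Counter(y)
    scores = []
    for j in range(len(X[0])):
        feature_total = sum(row[j] for row in X)
        score = 0.0
        for label, count in class_counts.items():
            observed = sum(row[j] for row, target in zip(X, y) if target == label)
            expected = feature_total * count / n_samples
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores


def top_k_features(scores, k):
    """Indices of the k highest-scoring features, best first."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def binary_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix counts plus accuracy, precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (made up): three scaled voice features, label 1 = PD, 0 = healthy.
X = [
    [0.9, 0.5, 0.1],
    [0.8, 0.4, 0.2],
    [0.7, 0.6, 0.1],
    [0.1, 0.5, 0.2],
    [0.2, 0.4, 0.1],
    [0.0, 0.6, 0.2],
]
y = [1, 1, 1, 0, 0, 0]

scores = chi2_scores(X, y)
selected = top_k_features(scores, k=2)  # keep the two most class-dependent features

# Hypothetical predictions from some trained classifier on a held-out set.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)

print("chi-square scores:", scores)
print("selected feature indices:", selected)
print("metrics:", metrics)
```

In the toy data the first feature separates the two classes strongly, so it receives the highest score, while the second feature is distributed identically in both classes and scores close to zero. ROC curves additionally require predicted probabilities rather than hard labels, so they are omitted here.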

A hybrid ensemble approach combining SVM, Random Forest, and XGBoost via a voting classifier achieved the highest accuracy (~95%), outperforming individual models. The system demonstrated robust ability to differentiate PD patients from healthy controls, with good generalizability across datasets. The research confirms that ML-based voice analysis is an effective, scalable, and non-invasive method for early PD diagnosis, potentially improving clinical outcomes and supporting real-time monitoring.
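To make the voting step concrete, the following sketch shows hard (majority) voting over the predictions of three models. The source does not state whether hard or soft voting was used, so majority voting is an assumption here; the prediction lists and the helper name `majority_vote` are hypothetical and stand in for the outputs of trained SVM, Random Forest, and XGBoost models.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by majority.

    predictions_per_model is a list of equal-length label lists, one per model.
    """
    ensemble = []
    for sample_votes in zip(*predictions_per_model):
        label, _count = Counter(sample_votes).most_common(1)[0]
        ensemble.append(label)
    return ensemble


# Hypothetical predictions for five subjects (1 = PD, 0 = healthy control).
svm_pred = [1, 0, 1, 1, 0]
rf_pred = [1, 1, 1, 0, 0]
xgb_pred = [0, 1, 1, 1, 0]

ensemble_pred = majority_vote([svm_pred, rf_pred, xgb_pred])
print(ensemble_pred)
```

With three models and two classes a tie cannot occur, so each sample receives the label chosen by at least two models. A single model's mistake on a sample is outvoted whenever the other two agree, which is the intuition behind the ensemble outperforming its members. In scikit-learn, the same idea is available as `VotingClassifier` with `voting="hard"`; soft voting instead averages predicted class probabilities.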

In essence, this study highlights how combining multiple ML classifiers and using voice data can significantly enhance early detection of Parkinson’s disease compared to traditional clinical approaches.

Conclusion

In conclusion, applying machine learning models, Random Forest, XGBoost, and SVM in particular, to the identification of Parkinson’s disease shows how hybridization with a voting classifier can improve accuracy and robustness. Each classifier has its own strengths: SVM excels at class separation, Random Forest at ensemble learning, and XGBoost at gradient boosting. By using a voting classifier to enable a collaborative decision-making process, the hybridization technique capitalizes on the strengths of each model. This ensemble method mitigates the drawbacks of any single classifier while improving prediction accuracy.

The four machine learning models evaluated in our study were AdaBoost, Random Forest, Support Vector Machine, and XGBoost, with accuracies of 84.6%, 94.87%, 92.3%, and 92.3%, respectively. We then hybridized the three most accurate models (Random Forest, SVM, and XGBoost), using a voting classifier to aggregate their strengths, which produced the best results.

Copyright

Copyright © 2025 Mr. S. Somashakar, Mr. U. Kumar Sai, Mr. S. Vinod, Mr. S. Sai Vamsi, Mrs. P. Leelavathi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET68833

Publish Date : 2025-04-13

ISSN : 2321-9653

Publisher Name : IJRASET
