Speech Based Parkinson's Disease Detection Using Machine Learning

Authors: J. Spandana, T. Rakesh, K. Pranay, Ch.Vijaya Bhaskar, Sunil Bhutada

DOI Link: https://doi.org/10.22214/ijraset.2023.50056

Abstract

The diagnosis of Parkinson's disease (PD) is usually made after careful observation and evaluation of clinical indicators, such as the characterisation of various motor symptoms. Traditional diagnostic methods, however, may be prone to error because they depend on the subjective judgment of movements that can be difficult for the human eye to categorise. Moreover, the non-motor symptoms of PD in its early stages may be mild and may have many other causes. These symptoms are therefore often disregarded, which makes an early diagnosis of PD difficult. These challenges have prompted the use of machine learning approaches to distinguish PD patients from healthy controls or from patients with comparable clinical presentations (e.g., movement disorders or other Parkinsonian syndromes), as a means of improving the diagnosis and evaluation of PD. PD has been diagnosed using a wide variety of data types and machine learning techniques; one goal of this article is to present a synopsis of these approaches. In this study, we investigate PD recognition from speech using a convolutional neural network (CNN), an artificial neural network (ANN), and XGBoost (XGB). The CNN was fed stacked 2D input maps consisting of spectrograms and other short-term features. The effectiveness of PD detection was analyzed by splitting each voice recording into segments and comparing the per-segment results with those obtained by fusing all of the segments at the decision level.


I. INTRODUCTION

Millions of individuals across the world suffer from Parkinson's disease (PD), a neurological disorder characterized by a broad range of symptoms including tremor, cognitive impairment, hallucinations, dementia, and sleep disturbances. In Bangladesh, 1600 individuals lose their lives annually to PD, and there is currently no cure. Loss of smell, problems with REM sleep, cramped handwriting, difficulty walking, and other mobility issues are all symptoms of Parkinson's disease [1]. The symptoms often begin on one side of the body and progress to the other; however, this is not always the case. Early symptoms are often subtle and easily missed, and they also vary from person to person. The loss of dopamine-producing neurons in the brain lies at the root of Parkinson's disease: a shortage of the neurotransmitter dopamine causes abnormal brain activity, which in turn produces the symptoms of the disease. PD is thought to affect between 7 and 10 million people around the world. Only about 4% of people with PD are diagnosed before the age of 50. While PD cannot be prevented or cured, its symptoms may be managed [2].

Speaking is a demanding task because it requires the coordinated and precise control of a wide variety of processes and systems. The lungs supply the energy for speech production: they force air through the glottis, which causes the vocal folds to vibrate. This vibration turns the pressure wave expelled from the lungs into an excitation (source) signal. The source signal then passes through the vocal tract, which acts as a filter and imposes its spectral envelope to produce the speech signal [3, 4].
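The source-filter description above can be illustrated with a minimal sketch. It is not part of the authors' pipeline: the pitch, the two formant frequencies and bandwidths, and the sample rate are hypothetical values chosen only for illustration, and the vocal tract is reduced to a cascade of two-pole resonators.

```python
import math

def glottal_pulse_train(f0_hz, duration_s, sample_rate):
    """Idealised excitation: one unit impulse per vocal-fold cycle."""
    n_samples = int(duration_s * sample_rate)
    period = int(round(sample_rate / f0_hz))
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def resonator(signal, centre_hz, bandwidth_hz, sample_rate):
    """Two-pole resonator standing in for a single vocal-tract formant."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)   # pole radius (< 1, so stable)
    theta = 2.0 * math.pi * centre_hz / sample_rate       # pole angle
    a1, a2 = -2.0 * r * math.cos(theta), r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesise_vowel(f0_hz=120.0,
                     formants=((700.0, 80.0), (1200.0, 90.0)),  # assumed (centre, bandwidth) pairs
                     duration_s=0.05, sample_rate=8000):
    """Source signal shaped by the filter: excitation passed through each formant in turn."""
    speech = glottal_pulse_train(f0_hz, duration_s, sample_rate)
    for centre_hz, bandwidth_hz in formants:
        speech = resonator(speech, centre_hz, bandwidth_hz, sample_rate)
    return speech

vowel = synthesise_vowel()
```

Changing `f0_hz` alters the source (pitch) while changing `formants` alters the filter (spectral envelope), which is the separation the paragraph describes.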

A. Motivation

Parkinson's disease affects a large number of people and is hard to diagnose, so many patients are identified only after the disease has reached a chronic stage. We chose this topic because of a friend's grandfather, who was affected by this disease. At first the family ignored the symptoms, attributing them to age, but the disease affected him very severely and he lost his voice. Had it been detected early, there would have been ways to control the disease. Inspired by that incident, we took up this project to implement machine learning algorithms for detecting it.

B. Objectives

Although the clinical picture of PD will almost certainly change over the course of long-term dopaminergic medication, close monitoring of the patient is essential for successfully managing the primary clinical signs of PD.
The accelerometers built into most modern mobile phones should make it possible to track a patient's movements and provide a quantitative measure of daily activity (e.g., walking or sitting). Smartphones are now of sufficient quality to support enhanced medical diagnostics and status monitoring.

Identifying speech alterations in Parkinson's patients before debilitating physical symptoms develop could have a considerable effect on healthcare costs, patient longevity, and quality of life. Parkinson's disease (PD) is usually diagnosed through a combination of medical history, physical examination, and the detection of specific motor symptoms. Traditional diagnostic approaches, however, may be prone to subjectivity because they depend on the evaluation of movements that are too subtle for the human eye to characterize reliably. Early non-motor signs of Parkinson's disease, meanwhile, may be mild and can have many other causes. Because of this, early PD diagnosis is challenging [5], and these symptoms are often ignored. Scientists have for some time regarded ML techniques as a possible game-changer in the diagnosis of this illness. Contact-free methods of gait analysis might be widely used at home [6]. Very little effort, however, has focused on incorporating ML approaches into the process so that it is fully self-sufficient and usable even without an active network connection. Early-stage patients may also have speech problems [7] such as dysphonia, echolalia, and hypophonia. The use of the human voice by computers for information retrieval and analysis is a potential future development [8].

Section II surveys the related work that formed the basis of the study. Section III describes the framework and approach employed to reach the desired goal. Section IV presents the implementation analysis, including the materials and techniques used. Section V discusses potential future enhancements, and the Conclusion summarizes the work.

II. RELATED WORK

For efficient learning and classification, ML can rely on the well-established naive Bayes classifier. In accordance with Bayes' theorem, it calculates the probability that a certain event will take place given a set of observed conditions. Here, many measures of variation in the voice signal are included in the data used to calculate the probability of a health problem. Classification is carried out with the simple Gaussian naive Bayes algorithm, which supplies the classifier module [9].
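As a rough illustration of the Gaussian naive Bayes idea described above, the sketch below fits one Gaussian per class and feature and picks the class with the highest posterior score. It is not the implementation from [9]; the two feature columns and all numbers are made-up toy values.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(features, labels):
    """Estimate the log prior and per-feature mean/variance for every class."""
    grouped = defaultdict(list)
    for row, label in zip(features, labels):
        grouped[label].append(row)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (math.log(n / len(labels)), means, variances)
    return model

def predict_gaussian_nb(model, row):
    """Return the class maximising log P(class) + sum of log N(x | mean, var)."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, means, variances) in model.items():
        score = log_prior
        for x, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2.0 * math.pi * var) - (x - m) ** 2 / (2.0 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy data: two hypothetical voice-variation features; 0 = healthy, 1 = PD.
train_x = [[0.20, 0.30], [0.25, 0.35], [0.30, 0.28],
           [0.80, 0.90], [0.85, 0.95], [0.90, 0.88]]
train_y = [0, 0, 0, 1, 1, 1]
nb_model = fit_gaussian_nb(train_x, train_y)
```

The "naive" part is the inner loop of `predict_gaussian_nb`: each feature contributes its log-likelihood independently of the others.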

Particle swarm optimization is used with a new classification method called RMDL to get the best possible outcomes in both areas. The method is applicable to a wide variety of data types and file formats, and the reported findings demonstrated an increase in the reliability and efficiency of the models. This method shows promise for aiding PD diagnosis since no filter configuration is required to produce the structured co-incidence matrix [10].

Deep learning methods have also been used to diagnose PD. PD was detected using a variety of data mining methods, including the Naive Bayes algorithm, support vector machines, multilayer perceptron neural networks, and decision trees. Multi-layer perceptron and logistic regression (MLPR) models were applied to speech input from acoustic devices in order to predict PD. Patients' geographical origins and linguistic characteristics were studied for their ability to predict the development of PD [11].

Digital analysis of Parkinson's disease has used machine learning techniques to categorise the many aspects of deep brain surgery images after cleaning them [12]. Random Multimodal Deep Learning (RMDL), a novel ensemble deep learning technique for classification, accepts data in many forms, including text, video, pictures, and symbolic representations. Across a wide variety of data types and classification problems, it yields consistently better performance than standard techniques. The purpose of that research is to improve the reliability of machine learning approaches for the categorization of deep brain surgery pictures.

An effective diagnostic system based on the fuzzy k-nearest neighbor (FKNN) algorithm has also been proposed. The FKNN model is built on an optimal feature set derived by principal component analysis and is then compared with SVM-based methods; the FKNN-based system was shown to be superior to the SVM-based ones [13]. In another study, a large amount of data was gathered from both healthy people and people with a history of Parkinson's disease in order to train the classification algorithms [14]. XGBoost, Naive Bayes, and Decision Tree were used for the classification, with Decision Tree achieving an accuracy of 87%. Sixty percent of the data was used for training and forty percent for testing.

A wide variety of deep learning and machine learning methods have been investigated for identifying people who have Parkinson's disease (PD). The primary focus of this investigation is the feasibility of using speech signal analysis as diagnostic evidence for PD. Because speech processing is a mature field that can be applied in a wide range of situations, it holds considerable potential for the classification and diagnosis of PD. The purpose of that research is to examine the similarities and differences between several popular classification schemes [15].

III. FRAMEWORK

Gradient boosted trees are widely used, and XGBoost is a popular and effective open-source implementation of the algorithm. Gradient boosting is a supervised learning method that improves the accuracy of its predictions by combining the predictions of many simpler, weaker models.

For regression with gradient boosting, the weak learners take the form of regression trees, with each input data point being mapped to a leaf of the tree that stores a continuous score. XGBoost minimizes a regularized (L1 and L2) objective function that consists of a convex loss function (based on the difference between the predicted and target outputs) plus a penalty term for model complexity (in other words, for the regression tree functions). Training proceeds iteratively: new trees are added to predict the residuals, or errors, of the earlier trees, and their outputs are combined with those of the earlier trees to form the final prediction. To reduce the loss as more models are added, gradient boosting follows a gradient descent approach.
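The residual-fitting loop described above can be sketched in a few lines. This is plain gradient boosting with squared-error loss and one-split regression trees (stumps) on a single input feature; it deliberately omits XGBoost's L1/L2 regularisation and its other refinements, and the data, learning rate, and number of rounds are arbitrary illustrative choices.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimises squared error on the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:          # skip the max so the right leaf is never empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]                                  # (threshold, left leaf score, right leaf score)

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_boosted(xs, ys, n_rounds=20, learning_rate=0.5):
    """Each round fits a new stump to the residuals left by the current ensemble."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the residuals are the negative gradient of the loss.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, learning_rate, stumps

def predict_boosted(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)

def mean_squared_error(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)

# Toy one-dimensional regression data (illustrative only).
toy_x = [1, 2, 3, 4, 5, 6]
toy_y = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
boosted = fit_boosted(toy_x, toy_y)
```

Comparing `mean_squared_error(toy_y, [predict_boosted(boosted, x) for x in toy_x])` with the error of the constant mean prediction shows how the added trees reduce the training loss.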

IV. IMPLEMENTATION ANALYSIS

The Parkinson's disease speech dataset from the UCI Machine Learning Repository serves as training data. In addition, our suggested approach combines spiral-drawing inputs from healthy participants and Parkinson's patients to produce reliable outcomes. We suggest a hybrid approach, both efficient and accurate, that evaluates the patient's speech together with their spiral drawing data. By comparing the two sets of data, the doctor can determine whether the patient is healthy and, if not, what medication to give them depending on the severity of their condition.

Pre-processing: speech signals have to be broken down into their component parts. The silent portions have less energy than the spoken portions because their amplitude is lower, so an energy measure can be used to separate speech from silence. In this study, we isolate the steady part of each speech signal before carrying out the segmentation process. Because these signals tend to be most stable midway through their duration, discarding the beginning and end portions leaves a continuous, steady signal. To eliminate issues at the beginning and end of the phonations, 2 s segments were selected from the intermediate, steady section of the speech signals for the subsequent acoustic analysis.
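The pre-processing step above can be sketched as follows, assuming the recording is available as a list of amplitude samples. The frame length, the energy threshold, and the toy sample rate are hypothetical values, not the settings used in this study.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of consecutive, non-overlapping frames."""
    return [sum(s * s for s in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def voiced_span(samples, frame_len, threshold):
    """Sample indices (start, end) from the first to the last frame above the energy threshold."""
    energies = frame_energies(samples, frame_len)
    active = [i for i, energy in enumerate(energies) if energy > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len

def central_segment(samples, sample_rate, frame_len, threshold, seconds=2.0):
    """Drop leading/trailing silence, then keep `seconds` of signal around the midpoint."""
    span = voiced_span(samples, frame_len, threshold)
    if span is None:
        return []
    start, end = span
    wanted = int(seconds * sample_rate)
    if end - start <= wanted:
        return samples[start:end]
    begin = (start + end) // 2 - wanted // 2
    return samples[begin:begin + wanted]

# Toy recording at 100 samples per second: 1 s silence, 4 s phonation, 1 s silence.
toy_rate = 100
toy_recording = [0.0] * 100 + [0.5] * 400 + [0.0] * 100
steady = central_segment(toy_recording, toy_rate, frame_len=10, threshold=0.01)
```

Taking the segment around the midpoint of the voiced span is one simple way to avoid the unstable onset and offset of a phonation; other selection rules are possible.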

Distinct sound units from healthy participants and Parkinson's patients were analyzed and contrasted. Waveform, spectrogram, intensity, and formant frequency representations are used in the acoustic analysis. For both PD patients and healthy controls, a 20-second excerpt of speech was extracted from a longer audio recording and analyzed. Pitch (shown in blue) and volume (shown in red) are two examples of acoustic parameters that may be extracted from the audio recordings of both PD patients and healthy participants, as shown in the figure below (in yellow). As can be seen by comparing Figure 'a' with Figure 'b', some of the peaks in the acoustic waveform of the sound unit "The North Wind and the Sun were arguing who was the stronger" are flattened in the PD recording relative to the healthy one. The effect resembles the muffled sound of a throat microphone. In PD patients, the spectrogram shows a noticeable break after every few words over the 20-second period, whereas in healthy individuals the sentences flow smoothly together. The darkness of a point on a spectrogram represents the magnitude of the corresponding frequency component at that time. Very dark points have large amplitudes, while very bright points have small amplitudes.
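To make the spectrogram description above concrete, the sketch below computes one with a naive short-time discrete Fourier transform: each row is a time frame and each value is the magnitude of one frequency bin. The frame length, hop size, and the toy 100 Hz tone are assumptions for illustration; a real analysis would normally use an FFT library or a dedicated speech analysis tool.

```python
import cmath
import math

def hann_window(n):
    """Hann window used to taper each frame before the transform."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def dft_magnitudes(frame):
    """Magnitudes of the non-negative frequency bins of a naive DFT."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def spectrogram(samples, frame_len=64, hop=32):
    """Return magnitudes indexed as result[time_frame][frequency_bin]."""
    window = hann_window(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        tapered = [s * w for s, w in zip(samples[start:start + frame_len], window)]
        frames.append(dft_magnitudes(tapered))
    return frames

# Toy signal: a 100 Hz tone sampled at 640 Hz, so it falls in bin 100 * 64 / 640 = 10.
toy_rate = 640
tone = [math.sin(2.0 * math.pi * 100.0 * t / toy_rate) for t in range(256)]
spec = spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

When such a matrix is drawn as an image with larger magnitudes shown darker, pauses between words appear as light vertical gaps, which is the pattern the text attributes to PD speech.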

Table 1: The voice-based Parkinson's disease data set

The intensity of sound units from people with Parkinson's disease differs from that of healthy people, as seen in the figure below. In this case, we observe that PD patients speak words with reduced intensity, i.e., below 58.1 dB, so most of the syllables were not heard clearly, whereas healthy persons say them with considerably greater intensity. The formant frequencies of PD patients and healthy persons are shown in Figure 4. Here, we can see that the clear formants of a healthy person's speech occupy a certain frequency range, whereas the muffled formants of a PD patient's voice indicate a lower frequency range. These observations motivate the use of features such as intensity and formant frequencies. The following section elaborates on the suggested architecture for the system.
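The intensity comparison above relies on a level expressed in decibels. A minimal sketch of that computation is shown below, assuming the signal is a list of amplitude samples; the `reference` value is a placeholder, so the result is a relative level and not a calibrated sound pressure level comparable with the 58.1 dB figure.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a segment."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def intensity_db(samples, reference=1.0):
    """Level in dB relative to `reference` (an assumed, uncalibrated reference amplitude)."""
    level = rms(samples)
    if level == 0.0:
        return -math.inf            # digital silence has no finite level
    return 20.0 * math.log10(level / reference)

# Toy segments: the quieter one has half the amplitude of the louder one.
loud_db = intensity_db([0.1] * 100)
quiet_db = intensity_db([0.05] * 100)
```

Halving the amplitude lowers the level by about 6 dB, which gives a sense of scale for intensity differences between speakers.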

V. FUTURE ENHANCEMENT

Future research might explore other methods for predicting Parkinson's disease from a variety of data sources. In this study, we categorise patients into two groups based on a single binary attribute: those with the illness and those without. In the future, patients with Parkinson's disease could be categorized by disease stage using a wider variety of features.

Conclusion

In this work, we stress the significance of early diagnosis and prognosis of Parkinson's disease so that sufferers may receive therapy and care as soon as possible. This study investigates PD recognition from a speech signal using CNN, ANN, and XGB. The CNN was fed stacked 2D input maps consisting of spectrograms and other short-term features. Decision-level fusion of all segments in a voice recording was compared with the contribution of each individual segment to PD detection performance. Although the deep learning models do outperform the classical machine learning models, it remains difficult to declare them the better method, because our audio dataset was too small to train and assess the deep learning approaches properly. ANN-based recognition outperforms HMM-based recognition when compared to CNN. The results of this research might therefore be seen as a first attempt to use cutting-edge science for the purpose of diagnosing diseases at an early stage. In addition to further work on the speech dataset, additional symptoms of Parkinson's patients may be gathered to enable early detection of the disease. For PD diagnosis, it would be possible to gather and evaluate a dataset of motor and non-motor symptoms. Compared with the other algorithms evaluated for making diagnoses, the XGBoost classifier achieves the best results in this scenario. As our findings show, the Extreme Gradient Boosting (XGBoost) algorithm is the best option for the prediction of Parkinson's disease on this data, with effective accuracy.

References

[1] K. Mamun, "Deep brain stimulation: a light of hope for Parkinson's patients", theindependentbd.com, 2018.
[2] Anitha R. et al., "Early Detection of Parkinson's Disease Using Machine Learning", IJARIIE, ISSN(O) 2395-4396, Vol. 6, Issue 2, 2020.
[3] Hariharan M. et al., 2006, "Speech emotion recognition using stationary wavelet transform and timbral texture features", ARPN Journal of Engineering and Applied Sciences, 9, 1316-22.
[4] Khan T. et al., 2014, "Classification of speech intelligibility in Parkinson's disease", Biocybernetics and Biomedical Engineering, 34, 35-45.
[5] Shikha Singh, Nikita Shingade, Priti Sarote, Deepti Yelale and Nihar Ranjan, "Parkinson's disease detection using machine learning", International Journal of Development Research, 12, (04), 55117-55119.
[6] Lakany, H., "Extracting a diagnostic gait signature", Pattern Recognition, 2008, 41, 1627-1637.
[7] Hazan, H. et al., "Early diagnosis of Parkinson's disease via machine learning on speech data", in Proceedings of the 2012 IEEE 27th Convention of Electrical and Electronics Engineers in Israel, Eilat, Israel, 14-17 November 2012, pp. 1-4.
[8] Frid, A. et al., "Computational diagnosis of Parkinson's Disease directly from natural speech using machine learning techniques", in Proceedings of the 2014 IEEE International Conference on Software Science, Technology and Engineering, Washington, DC, USA, 11-12 June 2014, pp. 50-53.
[9] Bhatia, A.; Sulekh, R., "Predictive Model for Parkinson's disease through Naïve Bayes Classification", Int. J. Comput. Sci. Commun., 2017, 9, 194-202.
[10] V. Kakulapati et al., "RMDL: Classification of Parkinson's disease by nature-inspired algorithm", International Journal of Pharmaceutical Research, Volume 12, Issue 3, July-Sept 2020, 10.31838/ijpr/2020.12.03.001.
[11] V. Kakulapati et al., "RMDL: Classification of Parkinson's disease by nature-inspired algorithm", International Journal of Pharmaceutical Research, Volume 12, Issue 3, July-Sept 2020, 10.31838/ijpr/2020.12.03.001.
[12] Kakulapati, V. et al. (2020), "Metaheuristic Approach of RMDL Classification of Parkinson's Disease", in: Oliva, D., Hinojosa, S. (eds), Applications of Hybrid Metaheuristic Algorithms for Image Processing, Studies in Computational Intelligence, vol 890, Springer, Cham. https://doi.org/10.1007/978-3-030-40977-7_17.
[13] Chen, H. L., Huang, C. C., Yu, X. G., Xu, X., Sun, X., Wang, G., and Wang, S. J. (2013), "An efficient diagnosis system for detection of Parkinson's disease using fuzzy k-nearest neighbor approach", Expert Systems with Applications, 40(1), 263-271.
[14] C K Gomathy, B. Dheeraj Kumar Reddy, B. Varsha and B. Varshini, "The Parkinson's disease detection using machine learning techniques", International Research Journal of Engineering and Technology (IRJET), 8(10), pp. 440-444, 2021.
[15] Senjuti Rahman et al., "Classification of Parkinson's Disease using Speech Signal with Machine Learning and Deep Learning Approaches", EJECE, European Journal of Electrical Engineering and Computer Science, ISSN 2736-5751, Vol. 7, Issue 2, March 2023, DOI: http://dx.doi.org/10.24018/ejece.2023.7.2.488

Copyright

Copyright © 2023 J. Spandana, T. Rakesh, K. Pranay, Ch.Vijaya Bhaskar, Sunil Bhutada. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Paper Id : IJRASET50056

Publish Date : 2023-04-03

ISSN : 2321-9653

Publisher Name : IJRASET
