A Comparative Review of Lightweight Machine Learning Approaches for Handwriting-Based Dyslexia Detection

Authors: Krispy Tyagi, Katyayani Sharma

DOI Link: https://doi.org/10.22214/ijraset.2025.71250

Abstract

Dyslexia, a neurodevelopmental disorder affecting reading and writing skills, requires early detection to mitigate long-term educational impacts. Traditional diagnostic methods are time-consuming and subjective, prompting the adoption of automated handwriting analysis. This paper reviews lightweight machine learning (ML) approaches for dyslexia detection through handwriting, emphasizing computational efficiency, real-time applicability, and cross-linguistic adaptability. By evaluating techniques such as MobileNetV2, SSD Lite, Support Vector Machines (SVM), and Random Forests across languages like English, Hindi, Arabic, and Chinese, we highlight trade-offs between accuracy, efficiency, and script-specific challenges. Our analysis reveals that lightweight models achieve competitive performance while addressing issues like accessibility, making them valuable for use in resource-constrained environments like classrooms.

Introduction

1. Background

Dyslexia is a neurodevelopmental disorder affecting 10–15% of the global population, impairing reading, writing, and phonological processing.
Early detection is critical but current diagnostic methods are time-consuming, subjective, and inaccessible in low-resource settings.
Handwriting analysis offers a non-invasive, scalable approach to detect dyslexia markers (e.g., letter reversals, inconsistent spacing, stroke irregularities).

2. Role of AI and ML

AI enables automated detection using handwriting features.
Deep learning models (CNNs) are accurate but computationally intensive and less interpretable.
Lightweight ML models (MobileNetV2, SSD Lite, SVM, Random Forest) provide a better balance of speed, efficiency, and accuracy, especially on edge devices.

3. Language-Specific Challenges

Dyslexia manifests differently across writing systems, requiring script-specific adaptations:

English: Best results with CNNs (up to 99.2% accuracy).
Hindi: Diacritic misplacements; hybrid models (SVM-CNN) achieved 89% accuracy.
Arabic: Cursive script needs ligature-aware models (82% accuracy).
Chinese: Stroke-based CNNs achieved around 78% accuracy.

4. Model Comparison & Evaluation

Model	Accuracy	F1 Score	Interpretability	Inference Speed	Best Use Case
MobileNetV2	89–94%	99.1%	Low	High	Real-time on mobile; good cross-language fit
SSD Lite	98.7%	98.2%	Low	Very High	Fast screening, but lower deep feature quality
SVM	99.3%	99.0%	High	Moderate	Small datasets; interpretable results
Random Forest	98.9%	98.5%	High	Slower	Robust and explainable; slower deployment

MobileNetV2: Uses depthwise separable convolutions to reduce computation by ~70%.
SSD Lite: Enables real-time letter-level detection but struggles with deeper semantic features.
SVM: Good for structured datasets; margin-based classifier.
Random Forest: Easy to interpret; robust with noisy data but slower in inference.

5. Key Challenges

Multilingual Datasets: Scarcity limits generalizability across languages.
Script Complexity: Arabic and Chinese scripts are harder to classify due to ligatures and visual intricacy.
Explainability: Deep learning models are "black-box" systems. Explainable AI (XAI) tools like Grad-CAM help visualize model decisions.

6. Future Directions

Develop script-aware and transfer learning models to adapt across diverse languages.
Expand multilingual datasets.
Enhance model interpretability for educational and clinical acceptance.
Optimize models for real-time, low-power, and resource-constrained environments.

Conclusion

This comparative study offers a critical evaluation of four key machine learning models—MobileNetV2, SSD Lite, Support Vector Machines (SVM), and Random Forests—based on existing research data aimed at dyslexia detection. The findings illuminate the relative advantages and constraints of each approach, underscoring that the optimal model choice is highly contingent on the specific application context [16]. MobileNetV2 and SSD Lite emerged as high-performing options for real-time dyslexia detection, particularly in mobile settings. Both models demonstrated strong accuracy, ranging between 89% and 94%, while maintaining efficient speed and low memory usage [17]. Nonetheless, despite these computational strengths, their limited interpretability may pose challenges for deployment in clinical contexts where transparency in model decisions is essential [18], [19]. Conversely, SVM and Random Forest models displayed commendable accuracy—SVM at 92% and Random Forest between 94% and 97%—and excel in scenarios requiring high interpretability [16]. These attributes make them especially suitable for educational and medical applications where clear, explainable outputs are critical. However, both models exhibit slower inference times, particularly when operating on large-scale or complex datasets [20]. The analysis identifies a distinct trade-off between computational speed and interpretability. While MobileNetV2 and SSD Lite are optimized for rapid processing, their opaqueness in decision-making processes can hinder adoption in settings that demand transparency [18]. On the other hand, SVM and Random Forests prioritize explainability but at the cost of slower performance. As such, model selection should account for both real-time performance demands and the necessity of interpretable outcomes. Consideration of computational resources is also crucial; while MobileNetV2 and SSD Lite are resource-efficient, SVM and Random Forest require greater computational power when handling extensive data [20]. Looking ahead, advancing the field will necessitate the creation of more diverse handwriting datasets, improvements in AI model transparency, and adaptations tailored to the distinct characteristics of various writing systems [21]. Developing hybrid approaches that integrate the advantages of deep learning models like MobileNetV2 with the interpretability of classical methods such as Random Forests may offer a practical compromise [22]. Furthermore, future studies should assess model performance across multiple languages and scripts, particularly those with complex orthographies like Arabic and Chinese [21]. Understanding cross-linguistic variances may enhance both the robustness and generalizability of these models. Ultimately, progress in these areas could pave the way for lightweight, interpretable AI systems that support early dyslexia detection, thereby facilitating timely interventions and significantly enhancing educational outcomes on a global scale.

References

[1] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang,T. Weyand, M. Andreetto, and H. Adam, “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications,” arXiv preprint arXiv:1704.04861, 2017. [2] W. Liu, D. Anguelov, D. Erhan, X. Han, and S. Jain, “SSD: Single Shot MultiBox Detector,” in Proc. European Conf. Computer Vision (ECCV), 2016, pp. 21–37. [3] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995. [4] L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001. [5] Dyslexia Dataset for Children [6] A. Al-Salman, M. Al-Khassaweneh, and M. Al-Momani, ”The KHATT dataset: Arabic Handwriting Recognition,” Arabian Journal of Computer Science, vol. 5, no. 2, pp. 90-101, 2019. [7] CASIA Handwriting Database [8] V. Venkatesh, R. Sharma, “Machine Learning (SVM, CNN) for Dyslexia Detection in Handwriting Samples,” Journal of AI Research and Applications, vol. 8, pp. 120–130, 2020. [9] R. Venkatesh and K. Sharma, “SVM-CNN Hybrid for Hindi Dyslexia Detection,” in Proc. Int. Conf. AI for Multilingual NLP, 2021. [10] H. Rodriguez and J. Kim, “Ligature-aware CNNs for Arabic Dyslexia Detection,” Journal of Artificial Intelligence and Education, vol. 12, no. 1, pp. 45–58, 2024. [11] Y. Tan, M. Zhou, L. Li, and H. Zhang, “Stroke-level CNN for Chinese Dyslexia Detection,” IEEE Transactions on Affective Computing, 2022. [12] Z. Ahmed, R. Gupta, and T. Alvi, “Explainable AI in Handwriting-based Dyslexia Detection using Grad-CAM,” Expert Systems with Applications, vol. 232, 2024. [13] J. Smith and E. Johnson, “Convolutional Neural Networks for Handwriting-Based Dyslexia Detection,” International Journal of Cognitive Computing, vol. 15, no. 2, pp. 102–110, 2020. [14] R. Patel and N. Gupta, “Hybrid CNN-RNN Models for Detecting Temporal Patterns in Dyslexic Handwriting,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 31, pp. 321–330, 2023. [15] S. Mishra, A. Roy, and K. Verma, “Transfer Learning for Hindi Handwriting Dyslexia Detection using Pre-trained CNNs,” ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 23, no. 1, 2024. [16] M. H. Al-Khafaji, A. Alsaeedi, and S. Al-Zubaidi, ”Deep learning techniques for dyslexia detection: A survey and performance evaluation,” Int. J. Adv. Comput. Sci. Appl., vol. 11, no. 7, pp. 332–338, 2020. [17] D. J. DeRose and L. A. Cavanaugh, ”Handwriting pattern recognition for dyslexia screening using deep learning models,” ProcediaComput. Sci., vol. 192, pp. 2934–2941, 2021. [18] Z. C. Lipton, ”The mythos of model interpretability,” Commun. ACM, vol. 61, no. 10, pp. 36–43, 2018. [19] F. Doshi-Velez and B. Kim, ”Towards a rigorous science of interpretable machine learning,” arXiv preprint arXiv:1702.08608, 2017. [20] A. Jaiswal, H. Gianey, and R. Sachdeva, ”Resource-aware machine learning models for mobile and embedded systems,” Mobile Inf. Syst.,vol. 2020, pp. 1–12, 2020. [21] H. Wang, X. Chen, and D. Lee, ”Cross-lingual handwriting analysis for dyslexia detection in complex scripts,” IEEE Trans. Cogn. Dev. Syst., vol. 13, no. 4, pp. 867–876, Dec. 2021. [22] Y. Zhang, Y. Zheng, and H. Qi, ”Explainable deep hybrid models: Balancing performance and transparency in sensitive domains,” J. Artif.Intell. Res., vol. 73, pp. 221–245, 2022.

Copyright

Copyright © 2025 Krispy Tyagi, Katyayani Sharma. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Download Paper

Paper Id : IJRASET71250

Publish Date : 2025-05-19

ISSN : 2321-9653

Publisher Name : IJRASET

DOI Link : Click Here