This study aims to improve the reliability of academic performance prediction by addressing missing and irregular mobile phone usage data through a context-aware imputation strategy. Methods: A case study was conducted using anonymized mobile phone usage logs collected from undergraduate students over one academic semester. A novel data imputation method, designed to preserve temporal continuity and individual behavioral patterns, was integrated with deep learning models to predict academic performance. The proposed approach was evaluated against commonly used imputation techniques using standard predictive performance metrics. Findings: The results indicate that the proposed imputation method consistently enhances prediction accuracy and model stability compared to traditional imputation approaches, particularly for sequence-based deep learning models. Improved handling of missing behavioral data enabled more meaningful representation of student learning patterns. Novelty: Unlike existing studies that treat imputation as a generic preprocessing step, this research introduces a behavior-aware imputation method tailored specifically to mobile phone usage data in educational contexts. The study demonstrates that imputation quality plays a critical role in deep learning–based academic performance prediction and provides empirical evidence from a real-world educational case study.
Introduction
Academic performance prediction is increasingly important for identifying students who need early academic support. Traditional prediction models rely on static indicators such as demographics and past grades, which fail to capture dynamic learning behaviors.
With widespread smartphone use among students, mobile phone usage data (including screen time, app categories, and activity patterns) provide a continuous and behavior-sensitive perspective on learning habits. However, real-world mobile data often contain missing or irregular entries due to logging gaps, privacy constraints, or device inactivity. Conventional imputation methods (e.g., mean substitution, forward filling) ignore behavioral context and may distort temporal patterns, which in turn degrades the deep learning models trained on the imputed data.
Research Gap
Although prior research has successfully applied LSTM and Bi-LSTM models to sequential behavioral data, most studies treat missing data handling as a routine preprocessing step. Limited attention has been given to how imputation strategies directly influence predictive performance.
This case study addresses that gap by proposing a novel behavior-aware imputation method specifically designed for mobile phone usage data in educational contexts.
Case Study Context
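The study was conducted at a higher education institution using anonymized mobile usage data collected over one academic semester. Features were drawn from the usage signals described in the Introduction: screen time, app categories, and activity patterns.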
Academic performance was categorized into grade groups (High, Medium, Low).
The dataset contained realistic missing values, making it suitable for testing imputation methods.
Proposed Novel Imputation Method
The new imputation strategy preserves both temporal continuity and individual behavioral consistency through three key principles:
1. Temporal Neighborhood Awareness
Missing values are reconstructed using surrounding time intervals, maintaining natural daily rhythms and avoiding abrupt artificial changes.
2. Behavioral Similarity
Imputation is personalized using each student’s historical usage profile, preserving individual habits (e.g., consistent low night usage or high educational app engagement).
3. Adaptive Weighting
Weights assigned to neighboring observations are dynamically adjusted based on data stability and density, improving robustness in sparse or variable regions.
This behavior-aware approach produces more realistic and sequentially coherent data, which is crucial for deep learning models.
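The three principles can be illustrated with a minimal Python sketch. It is not the study's implementation, which is not specified at formula level here; it is one plausible reading under stated assumptions. The function name `behavior_aware_impute`, the parameters `period` and `window`, the inverse-distance neighbor weights, and the density-times-stability blending rule are all hypothetical choices made for illustration.

```python
from statistics import mean, pvariance


def behavior_aware_impute(series, period=24, window=2):
    """Fill None entries in one student's usage series (illustrative sketch).

    Assumptions (not taken from the study): `series` is evenly sampled,
    `period` is the number of samples in one daily cycle, and `window` is
    the number of steps searched on each side of a gap.
    """
    observed = [v for v in series if v is not None]
    if not observed:
        return list(series)  # nothing to learn from, leave gaps as they are
    overall = mean(observed)

    # Behavioral similarity: the student's own typical value per slot of the cycle.
    slots = {}
    for i, v in enumerate(series):
        if v is not None:
            slots.setdefault(i % period, []).append(v)
    profile = {slot: mean(values) for slot, values in slots.items()}

    filled = list(series)
    for t, v in enumerate(series):
        if v is not None:
            continue
        habit = profile.get(t % period, overall)

        # Temporal neighborhood: observed values within `window` steps,
        # weighted by inverse distance so closer observations count more.
        lo, hi = max(0, t - window), min(len(series), t + window + 1)
        near = [(series[j], 1.0 / abs(j - t))
                for j in range(lo, hi)
                if j != t and series[j] is not None]
        if not near:
            filled[t] = habit  # sparse region: fall back to the personal profile
            continue
        local = sum(x * w for x, w in near) / sum(w for _, w in near)

        # Adaptive weighting: trust the neighborhood more when it is
        # dense (few gaps nearby) and stable (low spread among neighbors).
        density = len(near) / (hi - lo - 1)
        spread = pvariance([x for x, _ in near]) ** 0.5
        stability = 1.0 / (1.0 + spread / (abs(overall) + 1e-9))
        alpha = density * stability
        filled[t] = alpha * local + (1.0 - alpha) * habit
    return filled
```

With hourly logs, `period=24` would align the personal profile with the hour of day. Each imputed value lies between the local neighborhood estimate and the student's habitual value for that slot, so the fill cannot introduce jumps larger than those already present in the observed data.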
Deep Learning Implementation
To evaluate effectiveness, LSTM and Bidirectional LSTM models were trained on datasets processed using:
Mean substitution
Forward filling
K-nearest neighbor imputation
The proposed behavior-aware method
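For reference, the three baseline techniques can be sketched in plain Python as follows. This is a simplified illustration, not the code used in the study: the function names are hypothetical, missing values are represented as `None`, and the nearest-neighbor variant uses a root-mean-square distance over jointly observed features, which is an assumption. Library implementations such as scikit-learn's `KNNImputer` differ in detail.

```python
from statistics import mean


def mean_substitution(series):
    """Replace every gap with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    fill = mean(observed) if observed else None
    return [fill if v is None else v for v in series]


def forward_fill(series):
    """Carry the last observed value forward; leading gaps stay None."""
    out, last = [], None
    for v in series:
        if v is not None:
            last = v
        out.append(last)
    return out


def knn_impute(rows, k=2):
    """Fill each gap with the mean of that feature over the k most similar rows.

    Similarity is measured only on features observed in both rows
    (an assumption made for this sketch).
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return float("inf")
        return (sum((x - y) ** 2 for x, y in shared) / len(shared)) ** 0.5

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for c, v in enumerate(row):
            if v is not None:
                continue
            # Candidate donors: other rows that observed feature c.
            donors = sorted(
                (distance(row, other), other[c])
                for j, other in enumerate(rows)
                if j != i and other[c] is not None
            )
            nearest = [val for d, val in donors if d != float("inf")][:k]
            if nearest:
                filled[i][c] = mean(nearest)
    return filled
```

On the series `[2.0, None, 4.0, None]`, mean substitution fills both gaps with 3.0 and forward filling repeats 2.0 and then 4.0. Neither uses the time of day or the student's habits, which is the limitation the proposed method targets.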
Models were evaluated using:
Accuracy
F1-score
AUC (Area Under the Curve)
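The following Python sketch shows how these three metrics can be computed for the three grade groups. The averaging scheme is not stated above, so macro-averaged F1 and one-vs-rest AUC are assumptions made here for illustration, and all function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of students whose grade group was predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def binary_auc(is_positive, scores):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is quadratic in the
    sample size, which is acceptable for a small illustration.
    """
    pos = [s for flag, s in zip(is_positive, scores) if flag]
    neg = [s for flag, s in zip(is_positive, scores) if not flag]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, proba, labels):
    """One-vs-rest AUC averaged over classes (assumed scheme).

    `proba[i][j]` is the predicted probability that sample i belongs
    to `labels[j]`.
    """
    aucs = []
    for j, label in enumerate(labels):
        flags = [t == label for t in y_true]
        aucs.append(binary_auc(flags, [row[j] for row in proba]))
    return sum(aucs) / len(aucs)
```

Reporting F1 and AUC alongside accuracy matters when the High, Medium, and Low groups are unevenly sized, because accuracy alone can look strong for a model that rarely predicts the smallest group.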
Results and Findings
The behavior-aware imputation method consistently outperformed traditional imputation techniques across all evaluation metrics.
Key observations:
Sequence-based models (LSTM, Bi-LSTM) showed the largest performance improvements.
Preserving temporal and behavioral patterns significantly enhanced predictive accuracy.
Data preprocessing decisions had an impact comparable to, or greater than, model architecture selection.
Conclusion
This case study examined academic performance prediction from mobile phone usage data, with a particular focus on the development and evaluation of a novel data imputation method. By addressing missing and irregular data in a behavior-aware manner, the proposed approach preserved temporal and individual usage patterns, which improved the predictive performance of the deep learning models across all evaluation metrics. These findings underscore the critical role of thoughtful data preprocessing in enhancing model reliability, especially when dealing with sequential behavioral data that reflect students’ daily learning activities.
Beyond its methodological contribution, the study offers practical implications for educational institutions seeking to implement learning analytics systems. The behavior-sensitive imputation strategy can be integrated into existing analytics workflows to generate more accurate predictions while maintaining interpretability and respecting student privacy. Looking ahead, future research could extend this approach by validating its effectiveness across multiple institutions, exploring its adaptability to diverse student populations, and investigating its integration with real-time monitoring systems to provide timely interventions and personalized support for at-risk students.