Influenza-Like Illness (ILI) remains a persistent global public health concern because of its rapid transmission and seasonal variability. Traditional surveillance systems rely on weekly clinical reporting, which introduces delays and limits timely decision-making. This research proposes a machine learning–based predictive framework that forecasts weekly ILI cases from historical epidemiological data. Three supervised regression models were evaluated, including a Random Forest Regressor and an XGBoost Regressor. The system is deployed as a Flask application that provides an interactive interface for data upload, feature selection, training, evaluation, and visualization. Results show that Random Forest and XGBoost outperform the baseline models, achieving an R² score of up to 0.96 on test data. These findings support the potential of ensemble learning techniques to improve the timeliness and reliability of ILI surveillance.
Introduction
Influenza remains a major global health challenge due to its high mutation rate and seasonal outbreaks, causing millions of severe cases each year. Traditional surveillance systems for Influenza-Like Illness (ILI) are often delayed and limited in predictive capability, making early outbreak forecasting difficult. To address this, the study proposes a machine learning–based system for predicting ILI trends using historical surveillance data.
The system uses datasets from CDC sources such as ILINet, NREVSS, and Virus Season reports. After preprocessing and cleaning, features such as weekly trends, age-group case distributions, and patient volumes are extracted. The target variable is the total number of weekly ILI cases. Machine learning models, including a Random Forest Regressor and an XGBoost Regressor, are used to capture complex, non-linear patterns in disease spread.
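As a rough illustration of this modelling step, the sketch below fits a Random Forest Regressor to synthetic weekly data and scores it with R² on the final 52 weeks. It is not the study's code: the column names (`week`, `total_patients`, `age_0_4`, `age_5_24`, `ili_total`), the data-generating formula, the hyperparameters, and the chronological split are all assumptions made for the example, and no real CDC data is used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

# Synthetic stand-in for weekly surveillance data (NOT real CDC data).
rng = np.random.default_rng(0)
n_weeks = 260
week = np.arange(n_weeks) % 52 + 1                    # week of year, 1..52
season = 1.0 + np.cos(2 * np.pi * (week - 4) / 52)    # assumed winter peak
total_patients = rng.integers(20_000, 40_000, size=n_weeks)

# Hypothetical age-group case counts that follow the seasonal curve.
age_0_4 = 0.010 * total_patients * season * rng.uniform(0.8, 1.2, n_weeks)
age_5_24 = 0.015 * total_patients * season * rng.uniform(0.8, 1.2, n_weeks)

# Hypothetical target: total weekly ILI cases, driven by the observed
# age groups plus noise standing in for everything that is not observed.
ili_total = 2.5 * (age_0_4 + age_5_24) + rng.normal(0, 20, n_weeks)

X = np.column_stack([week, total_patients, age_0_4, age_5_24])
y = ili_total

# Chronological split: train on the first four years, test on the last one.
X_train, X_test = X[:-52], X[-52:]
y_train, y_test = y[:-52], y[-52:]

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)

r2 = r2_score(y_test, pred)
print(f"Test R^2 on synthetic data: {r2:.3f}")
```

An XGBoost model could be swapped in at the `model = ...` line, since `xgboost.XGBRegressor` exposes the same `fit`/`predict` interface. The score printed here reflects only the synthetic data and says nothing about the results reported in this study.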
The proposed system follows an end-to-end pipeline involving data ingestion, preprocessing, model training, prediction, and visualization. A Flask-based web application is used to automate forecasting and provide interactive dashboards for users.
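The pipeline can be pictured as a single Flask route that takes an uploaded CSV through ingestion, preprocessing, training, and prediction. The following is a minimal sketch under stated assumptions, not the deployed application: the `/train` route, the `file`, `features`, and `target` form fields, the default target name `ili_total`, the 80/20 chronological split, and the ten-row minimum are hypothetical choices, and visualization is omitted.

```python
import csv
import io

from flask import Flask, jsonify, request
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

app = Flask(__name__)


@app.route("/train", methods=["POST"])  # hypothetical route name
def train():
    """Ingest a CSV, train on the chosen columns, report a hold-out R^2."""
    upload = request.files.get("file")
    features = request.form.getlist("features")       # hypothetical field
    target = request.form.get("target", "ili_total")  # hypothetical default
    if upload is None or not features:
        return jsonify(error="need a CSV 'file' and at least one 'features' value"), 400

    # Ingestion: parse the uploaded CSV into a list of row dictionaries.
    text = upload.read().decode("utf-8")
    rows = list(csv.DictReader(io.StringIO(text)))

    # Preprocessing: keep only rows whose selected columns are all numeric.
    X, y = [], []
    for row in rows:
        try:
            x_row = [float(row[name]) for name in features]
            y_val = float(row[target])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values
        X.append(x_row)
        y.append(y_val)
    if len(X) < 10:
        return jsonify(error="not enough usable rows"), 400

    # Training and prediction: rows are assumed to be in week order, so the
    # last 20% act as a chronological hold-out set.
    cut = int(len(X) * 0.8)
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X[:cut], y[:cut])
    preds = model.predict(X[cut:])

    return jsonify(
        rows_used=len(X),
        r2=float(r2_score(y[cut:], preds)),
        predictions=[float(p) for p in preds],
    )
```

Saved as a module, the sketch can be served locally with `flask --app <module> run` and exercised by posting a multipart form containing the CSV file and one `features` value per selected column. A dashboard would then plot the returned `predictions` against the observed values.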
Conclusion
This research demonstrates the effectiveness of machine learning models, particularly ensemble approaches, in forecasting Influenza-Like Illness (ILI) cases from historical surveillance data. The developed Flask-based application successfully integrates preprocessing, model selection, training, evaluation, and visualization into an accessible platform suitable for public health agencies. Future enhancements include integrating real-time data streams, incorporating environmental variables, and experimenting with deep learning models such as LSTM networks to capture long-term temporal dependencies.