Pune’s air quality has declined sharply as a result of the city’s rapid urbanization and growing number of vehicles, endangering public health. Addressing this problem requires more than the monitoring systems currently in place: it requires trustworthy models that forecast pollution levels in advance. This study uses the Facebook Prophet algorithm as a machine learning method to predict Pune’s Air Quality Index (AQI). We chose this model because it can handle the seasonal and complex nature of environmental data more effectively than conventional statistical techniques.
The study examines a historical dataset from 2017 to 2024 that contains common data problems such as sensor errors and missing dates. We prepared the data with a careful cleaning procedure, standardizing the timestamps and filling in missing values with linear interpolation. Creating a continuous, uninterrupted timeline was crucial to prevent the model from being misled by gaps in the data. Unlike basic linear regression, which assumes a straight-line trend, the Prophet model decomposes the data into daily, weekly, and annual patterns in order to identify hidden trends.
Our findings revealed clear cycles in Pune’s air quality, particularly how the monsoon rains naturally purify the air while winter conditions trap pollutants and raise the Air Quality Index. Standard error metrics, namely Mean Absolute Error and Root Mean Squared Error, were used to gauge the accuracy of our predictions. The results show that our model effectively captures these regional patterns and provides a useful instrument for environmental monitoring. This system could assist legislators and city planners in making data-driven decisions, such as controlling traffic flow before pollution levels become hazardous or promptly issuing health warnings.
Introduction
This study focuses on forecasting the Air Quality Index (AQI) in Pune, India, using the Facebook Prophet time-series model, in response to increasing pollution caused by urbanization, industrial growth, and construction activities.
Pune has experienced worsening air quality due to rising vehicle emissions and infrastructure development. Traditional deterministic and statistical models often fail to capture the complex, non-linear nature of pollution influenced by weather conditions. Although deep learning models such as LSTM and classical time-series models such as ARIMA are widely used, they require heavy tuning and struggle with missing or irregular environmental data. To overcome these limitations, this study applies the Facebook Prophet model, which is better suited to handling missing values and decomposes time-series data into trend, seasonality, and event-based components.
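The decomposition mentioned above can be written compactly. The notation below follows Taylor and Letham's "Forecasting at Scale" paper that introduced Prophet; the symbols are taken from that paper rather than from this study.

```latex
y(t) = g(t) + s(t) + h(t) + \varepsilon_t
```

Here $y(t)$ is the observed AQI on day $t$, $g(t)$ is the long-term trend, $s(t)$ is the periodic (seasonal) component, $h(t)$ captures holiday or event effects, and $\varepsilon_t$ is the residual that the model does not explain.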
The study uses daily AQI data from 2017 to 2024, sourced from Kaggle. Preprocessing consists of date formatting, interpolation of missing values, and outlier removal. Prophet is then used to model long-term trends, seasonal variations (weekly and yearly), and holiday effects. The model captures key patterns such as lower pollution during the monsoon months and higher AQI levels in winter, along with a gradual increase in pollution over time, which the study attributes to urbanization.
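The gap-filling step can be illustrated with a minimal sketch. It is not the study's actual pipeline: the function name `fill_daily_gaps` and the sample readings are hypothetical, and the sketch uses only the Python standard library so that it runs anywhere. It covers only interpolation onto a continuous daily timeline, not outlier removal.

```python
from datetime import date, timedelta


def fill_daily_gaps(readings):
    """Return a gap-free daily series from a {date: aqi} mapping.

    Days missing between the first and last observation are filled by
    linear interpolation between the nearest known neighbours.
    """
    # Keep only days that actually have a value, in chronological order.
    known = sorted((d, v) for d, v in readings.items() if v is not None)
    if not known:
        return {}

    filled = {}
    for (d0, v0), (d1, v1) in zip(known, known[1:]):
        span = (d1 - d0).days
        for step in range(span):
            # Straight-line blend between the two surrounding observations;
            # step == 0 reproduces the known value at d0.
            filled[d0 + timedelta(days=step)] = v0 + (v1 - v0) * step / span

    # The loop stops one day short of the final observation, so add it here.
    last_day, last_value = known[-1]
    filled[last_day] = last_value
    return filled


# Hypothetical readings: 3 and 4 January are missing.
raw = {
    date(2017, 1, 1): 100.0,
    date(2017, 1, 2): 110.0,
    date(2017, 1, 5): 140.0,
}
series = fill_daily_gaps(raw)
```

With pandas, the same effect is usually obtained by reindexing the series onto a full daily date range and calling `interpolate(method="linear")`.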
Forecast results show that the model successfully tracks seasonal behavior and overall trends but slightly underestimates extreme pollution spikes caused by short-term local events like construction activity. Performance evaluation using RMSE (23.45) and MAE (18.12) indicates reasonable accuracy for highly variable environmental data.
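For reference, the two error metrics can be computed as in the sketch below. The function names and the sample values are illustrative only and are not taken from the study's data.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: the average magnitude of the forecast errors."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error: like MAE, but large errors weigh more."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical AQI values: the third forecast misses a pollution spike.
actual = [100.0, 150.0, 200.0]
predicted = [110.0, 140.0, 170.0]

print(f"MAE  = {mae(actual, predicted):.2f}")
print(f"RMSE = {rmse(actual, predicted):.2f}")
```

RMSE is never smaller than MAE, and the gap between them widens when a few errors are much larger than the rest. The reported RMSE of 23.45 against an MAE of 18.12 is therefore consistent with the observation that the model underestimates occasional extreme spikes.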
Conclusion
This study demonstrates the efficiency of the Facebook Prophet model in forecasting air quality trends for Pune. Using AQI data spanning 2017 to 2024, the analysis highlights a recurring improvement in air quality during the monsoon season, alongside a broader and concerning upward trend in pollution levels over time. The proposed approach offers a computationally efficient baseline forecasting solution that can be extended into a web-based platform for public awareness and policy support. Future enhancements will explore hybrid modeling techniques that combine Prophet’s trend modeling capabilities with LSTM-based residual learning to further reduce prediction error.
References
[1] F. Cai, “The Prediction of the Air Quality based on the Prophet Algorithm,” Highlights in Science, Engineering and Technology, vol. 39, pp. 1056–1060, 2023.
[2] S. J. Taylor and B. Letham, “Forecasting at Scale,” The American Statistician, vol. 72, no. 1, pp. 37–45, 2018.
[3] A. Dorage, A. Chopkar, T. Kulkarni, and A. M. Deshpande, “Predictive Analysis of Pune City’s Air Quality Index: A ML and Time Series Approach,” in 2024 15th Conference on Computing Communication and Networking Technologies (ICCCNT), 2024.
[4] A. Samad et al., “Facebook’s Prophet vs LSTM for Air Pollution Forecasting in Data-Constrained Northern Nigeria,” arXiv preprint arXiv:2508.16244, 2025.
[5] Centre for Science and Environment, “Air Quality Tracker: Pune Metropolitan Region,” Aug. 2024. [Online]. Available: https://www.cseindia.org/air-quality-trackerpune-metropolitan-region-12320.
[6] Pune Municipal Corporation, “Environmental Status Report 2024-25,” PMC, Pune, India, Tech. Rep., 2025.
[7] B. Alam, A. Hussain, and M. Fayaz, “An Effective Approach for Air Quality Prediction in Bishkek Based on Machine Learning Techniques,” Applied Sciences, vol. 12, no. 15, 2022.
[8] P. Raizada, “Pune Air Quality Index Dataset (2017-2024),” Kaggle, 2024. [Online]. Available: https://www.kaggle.com/datasets/pranavraizada/pune-airquality-index-dataset.