The Air Quality Index (AQI) is a standardized tool that condenses complex air pollutant data into an accessible format, aiding public awareness and policy-making. It incorporates key pollutants such as PM2.5, PM10, NO2, SO2, CO, and ozone, with values ranging from 0 (Good) to 500+ (Hazardous). In recent decades, industrial growth and a rapid increase in vehicular density have raised the concentration of pollutants in the atmosphere. This study explores the use of machine learning (ML) to improve AQI prediction in urban areas such as Jaipur, which face distinct pollution challenges from rapid urbanization, vehicular emissions, industrial activity, and natural phenomena such as dust storms. It identifies gaps in region-specific modeling and in the handling of extreme pollution events, emphasizing the need for solutions tailored to Jaipur’s semi-arid climate. The study aims to fill this gap by developing an ML-based AQI forecasting model, focusing on predictive modeling and real-time alert systems. Daily air pollutant data and meteorological parameters were used to develop a model for forecasting AQI. The model was built on the Temporal Fusion Transformer (TFT), a recent algorithm for time-series forecasting. Results indicate that the TFT performed on par with the best-performing neural network-based models for AQI prediction. It was more accurate than earlier machine learning models applied to Jaipur, and its performance is in line with cutting-edge methods. The developed TFT model can be used by air pollution control and regulatory authorities for air pollution control and management.
Introduction
Air pollution is a major environmental and public health challenge in rapidly urbanizing cities, including Jaipur, where vehicular emissions, industrial activity, construction, biomass burning, and semi-arid climatic conditions contribute to elevated pollutant levels, particularly particulate matter. The Air Quality Index (AQI) integrates key pollutants into a single indicator to communicate health risks, but Jaipur currently lacks robust predictive systems to anticipate pollution episodes and support timely mitigation.
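As a minimal sketch of how an AQI-style index can fold several pollutants into one number, the code below interpolates each concentration onto an index scale and reports the worst sub-index. The breakpoint table and the names `PM10_BREAKPOINTS`, `sub_index`, and `overall_aqi` are illustrative assumptions, not values or code taken from this study; the governing national standard defines the actual breakpoints.

```python
# Illustrative breakpoints: (conc_low, conc_high, index_low, index_high).
# These values are assumptions for this sketch, not taken from the study.
PM10_BREAKPOINTS = [
    (0, 50, 0, 50),
    (50, 100, 50, 100),
    (100, 250, 100, 200),
    (250, 350, 200, 300),
    (350, 430, 300, 400),
]


def sub_index(conc, breakpoints):
    """Linearly interpolate a concentration onto the index scale."""
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    for c_lo, c_hi, i_lo, i_hi in breakpoints:
        if c_lo <= conc <= c_hi:
            return i_lo + (i_hi - i_lo) * (conc - c_lo) / (c_hi - c_lo)
    # Above the last band: cap at the top of this illustrative table.
    return float(breakpoints[-1][3])


def overall_aqi(sub_indices):
    """The reported index is the worst (largest) pollutant sub-index."""
    return max(sub_indices.values())


pm10_index = sub_index(175, PM10_BREAKPOINTS)
print(overall_aqi({"PM10": pm10_index, "NO2": 40.0}))
```

Taking the maximum rather than an average means a single dominant pollutant, often particulate matter in semi-arid cities, determines the reported category.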
This study addresses this gap by developing a machine learning–based AQI forecasting framework tailored to Jaipur’s regional characteristics. Using daily data (2022–2024) from three Continuous Ambient Air Quality Monitoring Stations, the research integrates multiple pollutant concentrations and meteorological parameters to predict next-day AQI. A comprehensive literature review highlights the growing effectiveness of machine learning models—especially deep learning approaches—in air quality prediction and emphasizes the need for location-specific models.
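The next-day framing can be sketched as pairing a window of past daily records with the AQI of the following day. The function name `make_windows`, the `lookback` length, and the record layout are hypothetical choices for illustration; the study does not report its windowing code.

```python
def make_windows(rows, target_key, lookback):
    """Pair each lookback-day history with the target of the following day.

    rows: list of per-day dicts in chronological order (assumed layout).
    Returns a list of (history, target) tuples.
    """
    samples = []
    for end in range(lookback, len(rows)):
        history = rows[end - lookback:end]  # days t-lookback .. t-1
        target = rows[end][target_key]      # day t, the value to forecast
        samples.append((history, target))
    return samples


# Hypothetical daily records, for illustration only.
daily = [{"aqi": v, "wind_speed": 2.0} for v in [110, 125, 140, 132, 150]]
pairs = make_windows(daily, target_key="aqi", lookback=3)
print(len(pairs), pairs[0][1])
```

Because each history ends the day before its target, no information from the forecast day enters the inputs.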
The methodology involves extensive data preprocessing, including station-wise outlier detection, missing-value imputation, data cleaning, normalization, and feature selection to preserve temporal trends and reduce redundancy. Key predictors selected for modeling include PM10, CO, NO2, NOx, SO2, O3, and wind speed, chosen based on correlation analysis, regulatory importance, and physical interpretability. Highly correlated or redundant variables were excluded to avoid data leakage and overfitting.
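Two of these preprocessing steps, normalization and correlation-based removal of redundant features, can be sketched in plain Python. The 0.9 correlation threshold, the min-max scaling choice, and the function names are assumptions for illustration; the study does not state which scaler or cutoff it used.

```python
from math import sqrt


def min_max_scale(values):
    """Rescale a series to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def drop_redundant(features, threshold=0.9):
    """Keep features in order, skipping any that correlates too strongly
    (|r| above the assumed threshold) with a feature already kept."""
    kept = []
    for name, series in features.items():
        if all(abs(pearson(series, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical series, for illustration only.
features = {
    "pm10": [1, 2, 3, 4, 5],
    "pm10_duplicate": [2, 4, 6, 8, 10],
    "wind_speed": [3, 1, 4, 1, 5],
}
print(drop_redundant(features))
```

In practice, correlation alone would not decide the final set; the selection above also weighs regulatory importance and physical interpretability.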
For prediction, the study employs the Temporal Fusion Transformer (TFT), a deep learning time-series model capable of handling multivariate inputs, capturing complex temporal dependencies, and providing interpretability. The model is trained on a unified, normalized dataset and optimized through systematic hyperparameter tuning to enhance forecasting accuracy.
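For reference, the gated residual network (GRN) at the core of the published TFT architecture (Lim et al., 2021) can be written as below. This is the general formulation from that paper, not a detail reported by this study. Here a is the primary input, c is an optional context vector, σ is the sigmoid function, and ⊙ is the element-wise product. The hidden size tuned in this work is assumed to correspond to the width of these layers.

```latex
\begin{aligned}
\mathrm{GRN}(a, c) &= \mathrm{LayerNorm}\bigl(a + \mathrm{GLU}(\eta_1)\bigr), \\
\eta_1 &= W_1 \eta_2 + b_1, \\
\eta_2 &= \mathrm{ELU}(W_2 a + W_3 c + b_2), \\
\mathrm{GLU}(\gamma) &= \sigma(W_4 \gamma + b_4) \odot (W_5 \gamma + b_5).
\end{aligned}
```

When the gate output is close to zero, the block reduces to a normalized copy of its input, which lets the network skip nonlinear processing where the data do not need it.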
Conclusion
This study demonstrates the effectiveness of the Temporal Fusion Transformer (TFT) model for next-day Air Quality Index (AQI) forecasting in Jaipur. The optimized model achieved strong predictive performance, with a Mean Squared Error (MSE) of 430.39 and a coefficient of determination (R²) of 0.8965. These results are comparable to advanced hybrid forecasting architectures, such as Prophet–TFT models, and fall well within the performance range reported for state-of-the-art neural network–based AQI prediction approaches. This confirms the suitability of TFT for complex urban air quality forecasting tasks.
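The two reported metrics follow standard definitions, sketched below. The sample values are hypothetical and are not data from the study. As a point of scale, the reported MSE of 430.39 corresponds to a root mean squared error of about 20.7 AQI units.

```python
from math import sqrt


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observations are constant")
    return 1 - ss_res / ss_tot


# Hypothetical observed and predicted AQI values, for illustration only.
observed = [100, 150, 200]
predicted = [110, 140, 200]
print(mse(observed, predicted), r_squared(observed, predicted))

# RMSE implied by the MSE reported in the text.
print(sqrt(430.39))
```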
Hyperparameter optimization played a critical role in enhancing model performance. Increasing the hidden size from 8 to 32 resulted in an approximate 38% reduction in MSE and a 7.7% improvement in R², significantly improving model convergence and accuracy. Station-wise evaluation revealed spatial variability in performance, with Adarsh Nagar showing the highest accuracy due to relatively stable pollution dynamics, while Shastri Nagar exhibited higher prediction errors attributed to complex industrial emissions and meteorological influences.
The robustness of the proposed framework was strengthened through a station-specific preprocessing strategy that incorporated advanced outlier detection methods (Z-score, IQR, and STL decomposition) and tailored imputation techniques. This approach preserved extreme pollution events and ensured high-quality, consistent input data across monitoring locations. Feature selection based on correlation analysis and domain relevance further enhanced model generalizability. PM10, CO, NO2, NOx, SO2, O3, and wind speed were retained as key predictors, while PM2.5 was intentionally excluded to prevent data leakage, maintaining scientific rigor and predictive validity.
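Two of the named outlier checks, Z-score and IQR, can be sketched as flagging functions. The thresholds and function names are assumptions for illustration, and STL decomposition is omitted. The functions return flags rather than deleting values, so that genuine extreme pollution events can be reviewed and kept.

```python
from statistics import mean, pstdev, quantiles


def zscore_flags(values, threshold=3.0):
    """Flag points whose absolute z-score exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


def iqr_flags(values, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v < lo or v > hi for v in values]


# Hypothetical daily readings for one station, for illustration only.
readings = [50, 52, 48, 51, 49, 53, 47, 50, 300]
print(iqr_flags(readings))
print(zscore_flags(readings, threshold=2.5))
```

On short series a single spike inflates the standard deviation, so its z-score is bounded (here below 3). That is one reason to apply the checks per station and to combine them with IQR instead of relying on a fixed Z-score cutoff.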
The final trained model was stored in PyTorch format (tft_aqi_model.pt), enabling future deployment in real-time AQI forecasting and decision-support systems. Overall, the high accuracy and robustness of the TFT-based model highlight its potential for supporting proactive air quality management, early warning systems, and evidence-based pollution control strategies in Jaipur, aligning closely with contemporary best practices in air quality modeling.