Accurate solar radiation prediction is critical for urban energy planning, renewable energy optimization, and climate modeling, particularly in arid urban environments where weather conditions are highly variable. This study proposes a machine learning framework for solar radiation forecasting in Lima, Peru, that integrates key meteorological variables, including temperature, humidity, wind speed, and atmospheric pressure. Principal Component Analysis (PCA) was applied for dimensionality reduction to improve model efficiency and interpretability. Several machine learning models, including Linear Regression, Random Forest, Gradient Boosting, and a Stacking Regressor, were developed and compared. Performance was assessed with evaluation metrics including the Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and coefficient of determination (R²). The Stacking Regressor achieved the highest predictive accuracy, effectively capturing nonlinear relationships among weather parameters. The results indicate that ensemble models combined with dimensionality reduction can improve solar radiation prediction accuracy, supporting sustainable energy planning and deployment in arid urban regions.
Introduction
Solar radiation is a critical factor for climate studies, agriculture, and renewable energy planning, especially in arid urban environments like Lima, where variability is influenced by microclimates and urban heat effects. Accurate prediction is essential for optimizing photovoltaic (PV) systems, integrating solar energy into grids, and improving urban planning. Traditional statistical methods often struggle with nonlinear relationships, whereas machine learning (ML) can capture complex patterns from historical data.
This study proposes a robust ML framework for solar radiation prediction that combines dimensionality reduction (PCA) with ensemble learning. Historical meteorological data from Lima (temperature, humidity, wind speed, and other variables) are cleaned and normalized. PCA reduces the feature set from 11 input variables to 8 principal components, retaining key information while reducing complexity. Multiple ML models are trained and compared: Linear Regression, Random Forest, Gradient Boosting, Decision Tree, k-nearest neighbors (KNN), and an artificial neural network (ANN). A Stacking Regressor then combines base models to improve predictive accuracy.
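The pipeline described above (normalization, PCA from 11 features to 8 components, then a stacked ensemble) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the study's implementation: the data are synthetic stand-ins for the Lima measurements, the choice of three base learners with a linear meta-learner is assumed, and every hyperparameter is a placeholder.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the meteorological dataset (assumption):
# 11 correlated features generated from 8 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(400, 8))
mixing = rng.normal(size=(8, 11))
X = latent @ mixing + 0.05 * rng.normal(size=(400, 11))

# Hypothetical target with linear and nonlinear dependence on the factors.
y = (
    3.0 * latent[:, 0]
    + 2.0 * np.sin(latent[:, 1])
    + latent[:, 2] * latent[:, 3]
    + rng.normal(scale=0.3, size=400)
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Base learners feed out-of-fold predictions to a linear meta-learner.
stack = StackingRegressor(
    estimators=[
        ("lr", LinearRegression()),
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
        ("gb", GradientBoostingRegressor(n_estimators=50, random_state=0)),
    ],
    final_estimator=LinearRegression(),
    cv=5,
)

# Scale, project onto 8 principal components, then fit the stacked ensemble.
model = make_pipeline(StandardScaler(), PCA(n_components=8), stack)
model.fit(X_train, y_train)

pca = model.named_steps["pca"]
n_components = pca.n_components_
retained_variance = float(pca.explained_variance_ratio_.sum())
r2_test = model.score(X_test, y_test)
print(f"components: {n_components}, retained variance: {retained_variance:.3f}")
print(f"test R^2: {r2_test:.3f}")
```

Placing the scaler and PCA inside the pipeline means both are fitted on the training split only, which keeps information from the test split out of the preprocessing step.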
Performance evaluation using R², RMSE, MAE, the Nash-Sutcliffe Efficiency (NSE), and the Mean Absolute Percentage Error (MAPE) identifies the Stacking Regressor as the best-performing model, achieving R² ≈ 0.92, meaning it explains about 92% of the variance in solar radiation. This framework provides a reliable tool for urban energy planning and solar resource management in arid cities.
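For reference, the metrics named above can be computed as in the following sketch. It uses the common textbook definitions, which the source does not spell out, and the observed and predicted values are made-up numbers for illustration only. Note that when R² is defined as one minus the ratio of residual to total sum of squares, it coincides with NSE; some studies instead report the squared Pearson correlation as R², and the source does not say which definition it uses.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency; same formula as r2() above."""
    return r2(observed, simulated)


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent (undefined if any y_true is 0)."""
    return 100.0 * sum(
        abs((t - p) / t) for t, p in zip(y_true, y_pred)
    ) / len(y_true)


# Illustrative values only (not data from the study).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 380.0]

results = {
    "MAE": mae(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "R2": r2(observed, predicted),
    "NSE": nse(observed, predicted),
    "MAPE": mape(observed, predicted),
}
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

Because MAPE divides by the observed value, it becomes unstable when observed radiation is near zero (for example at night), so such samples are often filtered out before it is computed.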
Conclusion
An optimized ensemble-based approach for solar radiation prediction in arid regions has been developed and evaluated. Integrating multiple machine learning algorithms through stacking provided higher accuracy than any individual learner. The application of PCA simplified the model while maintaining robustness. The study confirms that ensemble techniques are highly effective for renewable energy estimation in data-scarce regions. Future work will involve testing the framework across multi-year datasets and incorporating additional atmospheric variables, such as cloud cover and sunshine duration, to further improve model generalization.