Formula One race strategy optimization has traditionally relied on predefined heuristics and Monte Carlo simulations, which are computationally expensive and lack adaptability to live race conditions. While prior work has explored reinforcement learning (RL) in other motorsport categories, its application to Formula One strategy remains underdeveloped. This research introduces a reinforcement learning framework for dynamically predicting tire compound choices in races following the summer break, addressing the gap in adaptive decision-making for in-race strategic planning. The proposed model is a deep recurrent Q-network (DRQN) trained against a Monte Carlo race simulator. The state space incorporates critical race parameters such as tire degradation, gaps to competitors, and race progress, while the action space covers tire compound selection and pit stop timing. A reward function balancing immediate lap performance against long-term finishing position guides the learning process. The model is further augmented with explainability techniques, including feature importance analysis and decision-tree surrogate models, to improve transparency and trust in automated strategy recommendations.
Introduction
Formula One (F1) is a high-cost, high-technology motorsport in which race strategy, especially tire choice and pit stop timing, is critical because cars and drivers cannot be changed mid-race. Traditional strategy planning relies on Monte Carlo simulations, which are computationally expensive, slow, inflexible to real-time changes, and often lack interpretability.
Advances in AI, especially reinforcement learning (RL), offer potential for dynamic, adaptive, and explainable race strategy optimization based on live data.
Literature Review
Early F1 strategy models were deterministic and expert-based, evolving to simulations incorporating factors like tire wear, fuel, and probabilistic events (e.g., safety cars) via Monte Carlo methods.
Machine learning methods, including deep neural networks, have been applied for predicting driver rankings, lap times, and race outcomes.
Reinforcement learning has been used to optimize decisions like pit stops and refueling dynamically, with models such as Deep-Racing and explainable RL frameworks showing promise.
Game theory models examine competitive interactions between drivers.
Research has also focused on disentangling the impact of driver skill from car performance, and on using genetic algorithms for strategy optimization.
Challenges remain in creating robust, interpretable, real-time adaptable models.
Methodology
The project develops a Deep Recurrent Q-Network (DRQN)-based system to optimize pit stop decisions in F1 races.
The system simulates races using historical lap-by-lap data, including tire conditions, track information, and race state.
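Each simulated lap also yields the learning signal: a reward that balances immediate pace against long-term finishing position. The exact shaping is not specified here, so the sketch below is an assumed, illustrative Python form with hypothetical constants (a 20 s pit-lane loss, a terminal bonus scaled by finishing position):

```python
def lap_reward(lap_time_s: float, ref_time_s: float, pitted: bool,
               finished: bool, position: int = 0) -> float:
    """Illustrative reward shaping (assumed, not taken from the source):
    a per-lap pace term, a pit-stop penalty, and a terminal bonus
    tied to finishing position."""
    reward = ref_time_s - lap_time_s       # faster than reference lap => positive
    if pitted:
        reward -= 20.0                     # approximate pit-lane time loss in seconds
    if finished:
        reward += 100.0 * (21 - position)  # P1 earns the largest terminal bonus
    return reward
```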
The agent (a neural network with an LSTM core) learns from sequences of race states to decide whether to pit on a given lap, balancing lap-time gains against pit-stop penalties.
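A minimal sketch of such a recurrent Q-network, assuming PyTorch and illustrative dimensions (eight state features per lap; four discrete actions: stay out, or pit for soft, medium, or hard tires; the project's actual sizes are not given):

```python
import torch
import torch.nn as nn

class DRQN(nn.Module):
    """Recurrent Q-network: an LSTM over per-lap state vectors,
    followed by a linear head that scores each discrete action."""

    def __init__(self, state_dim: int = 8, hidden_dim: int = 64, n_actions: int = 4):
        super().__init__()
        self.lstm = nn.LSTM(state_dim, hidden_dim, batch_first=True)
        self.q_head = nn.Linear(hidden_dim, n_actions)

    def forward(self, states, hidden=None):
        # states: (batch, seq_len, state_dim) -- a window of recent laps
        out, hidden = self.lstm(states, hidden)
        # Score actions from the last step of the sequence
        return self.q_head(out[:, -1, :]), hidden

# Greedy recommendation from the last 10 laps of (dummy) features
net = DRQN()
laps = torch.randn(1, 10, 8)
q, _ = net(laps)
action = q.argmax(dim=1).item()  # 0 = stay out, 1-3 = pit for soft/medium/hard
```

The LSTM is what distinguishes a DRQN from a plain DQN here: tire degradation trends are only visible across several laps, so the Q-estimate is conditioned on a sequence rather than a single-lap snapshot.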
Experience replay and target networks stabilize training.
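One plausible implementation of these two stabilizers, again assuming PyTorch (nothing here is prescribed by the source): the buffer stores fixed-length lap sequences so the LSTM keeps its temporal context, and the target network only slowly tracks the online one:

```python
import random
from collections import deque

import torch

class ReplayBuffer:
    """Stores fixed-length lap sequences so the recurrent agent trains on
    decorrelated samples rather than strictly consecutive laps."""

    def __init__(self, capacity: int = 50_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, seq, action, reward, next_seq, done):
        # seq / next_seq: tensors of shape (seq_len, state_dim)
        self.buffer.append((seq, action, reward, next_seq, done))

    def sample(self, batch_size: int):
        batch = random.sample(self.buffer, batch_size)
        seqs, actions, rewards, next_seqs, dones = zip(*batch)
        return (torch.stack(seqs),
                torch.tensor(actions),
                torch.tensor(rewards, dtype=torch.float32),
                torch.stack(next_seqs),
                torch.tensor(dones, dtype=torch.float32))

def update_target(online_net, target_net, tau: float = 0.005):
    """Soft update: the target network slowly tracks the online one,
    keeping the bootstrapped Q-targets stable."""
    for p_t, p_o in zip(target_net.parameters(), online_net.parameters()):
        p_t.data.copy_(tau * p_o.data + (1.0 - tau) * p_t.data)
```

A hard periodic copy of the weights is an equally common alternative to the soft update shown here.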
Data preprocessing converts raw race data into structured input.
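As an illustration of this step, with a hypothetical column schema (compound, lap_number, gap_ahead_s, and so on are assumed names, since the actual data source is not described), per-lap records could be mapped to the eight-feature state vectors used in the network sketch above:

```python
import numpy as np
import pandas as pd

def laps_to_features(laps: pd.DataFrame) -> np.ndarray:
    """Convert raw lap records into a normalized per-lap feature matrix.
    Column names are illustrative, not the project's actual schema."""
    df = laps.copy()
    # Encode tire compound ordinally: soft = 0, medium = 1, hard = 2
    df["compound_id"] = df["compound"].map({"SOFT": 0, "MEDIUM": 1, "HARD": 2})
    # Race progress and tire age as fractions of total race length
    df["race_progress"] = df["lap_number"] / df["total_laps"]
    df["tire_age_frac"] = df["tire_age"] / df["total_laps"]
    # Standardize the continuous signals
    for col in ["lap_time_s", "gap_ahead_s", "gap_behind_s", "track_temp"]:
        df[col] = (df[col] - df[col].mean()) / (df[col].std() + 1e-8)
    feature_cols = ["lap_time_s", "gap_ahead_s", "gap_behind_s", "compound_id",
                    "race_progress", "tire_age_frac", "track_temp", "safety_car"]
    return df[feature_cols].to_numpy(dtype=np.float32)
```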
Training involves running the model over multiple seasons’ data, improving generalization.
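A single update step in that loop might look like the following sketch, which reuses the DRQN and ReplayBuffer defined above; the discount factor and Huber loss are standard DQN choices rather than values taken from the source:

```python
import torch
import torch.nn.functional as F

GAMMA = 0.99  # discount: weight of future (finishing-position) reward vs. this lap

def train_step(online_net, target_net, buffer, optimizer, batch_size: int = 32):
    """One DRQN update: TD targets come from the frozen target network;
    the online network is fit with a Huber loss."""
    seqs, actions, rewards, next_seqs, dones = buffer.sample(batch_size)
    q_values, _ = online_net(seqs)                            # (batch, n_actions)
    q_taken = q_values.gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next, _ = target_net(next_seqs)
        targets = rewards + GAMMA * q_next.max(dim=1).values * (1.0 - dones)
    loss = F.smooth_l1_loss(q_taken, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Iterating this over replayed races from several seasons, with periodic target-network updates, is what drives the generalization described above.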
The final trained model is saved for future use.
Evaluation and Results
The DRQN model was tested on unseen race conditions from 2021 to mid-2024.
It recommended tire compounds and pit stop laps, considering weather, track, tire degradation, and safety car incidents.
The model’s strategies were compared with actual F1 team strategies, showing good alignment and potential for improved dynamic decision-making.
Monte Carlo simulations validated the model’s robustness and adaptability.
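As a sketch of how that validation could be scripted (simulate_race is a stand-in for the race simulator described in the methodology, not an actual API from the project):

```python
import numpy as np

def monte_carlo_evaluate(policy, simulate_race, n_runs: int = 1000, seed: int = 0):
    """Roll out the learned policy in many randomized race simulations;
    simulate_race(policy, rng) returns one finishing position per
    stochastic run (random safety cars, degradation noise, etc.)."""
    rng = np.random.default_rng(seed)
    finishes = np.array([simulate_race(policy, rng) for _ in range(n_runs)])
    return {
        "mean_finish": float(finishes.mean()),
        "std_finish": float(finishes.std()),   # spread signals (lack of) robustness
        "p_podium": float((finishes <= 3).mean()),
    }
```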
This research demonstrates how reinforcement learning can enhance F1 race strategy by enabling adaptive, data-driven pit stop decisions that rival those of professional human strategists.
Conclusion
Early deterministic models provided a foundational understanding of race dynamics, yet their inability to adapt to real-world uncertainties necessitated the evolution toward data-driven techniques. Machine learning, particularly deep learning, has enabled the extraction of complex patterns from vast amounts of telemetry and sensor data, thereby enhancing predictive accuracy and performance analysis.
Deep reinforcement learning (DRL) builds on these advancements by introducing an adaptive, interactive element to strategy development. DRL models, such as Deep Q-Networks (DQN) and Deep Recurrent Q-Networks (DRQN), demonstrate significant promise in managing the inherent uncertainty and dynamic conditions of racing. By leveraging trial-and-error learning and incorporating probabilistic effects, these models can anticipate competitor behavior and adapt strategies in real time, offering a distinct competitive edge. Game-theoretic modeling complements these approaches by providing a structured framework for analyzing both cooperative and adversarial behaviors, enabling teams to develop strategies that account for the actions and responses of opponents.
References
[1] D. Thomas, et al., "Explainable Reinforcement Learning for Formula One Race Strategy," arXiv preprint arXiv:2501.04068, 2025.
[2] F. Hojaji, A. J. Toth, J. M. Joyce, and M. J. Campbell, "AI-enabled prediction of sim racing performance using telemetry data," Computers in Human Behavior Reports, vol. 14, p. 100414, 2024.
[3] Z. Zhao, "Deep Neural Network-based lap time forecasting of Formula 1 Racing," Applied and Computational Engineering, vol. 47, pp. 61-66, 2024.
[4] H. Han, Z. Liu, M. Barrios, J. Li, Z. Zeng, N. Sarhan, and E. Awwad, "Time series forecasting model for non-stationary series pattern extraction using deep learning and GARCH modeling," Journal of Cloud Computing, vol. 13, no. 1, pp. 1-22, 2024.
[5] S. S. W. Fatima and J. Johrendt, "Deep-Racing: An Embedded Deep Neural Network (EDNN) Model to Predict the Winning Strategy in Formula One Racing," International Journal of Machine Learning and Computing, vol. 13, no. 3, 2023.
[6] M. Boettinger and D. Klotz, "Mastering Nordschleife: A comprehensive race simulation for AI strategy decision-making in motorsports," arXiv preprint arXiv:2306.16088, 2023.
[7] P. Malik, A. Dangi, A. Singh, A. Pratap, S. Parihar, U. Sharma, and L. Mishra, "An Analysis of Time Series Analysis and Forecasting Techniques," IJARCCE, vol. 9, 2023.
[8] E.-J. van Kesteren and T. Bergkamp, "Bayesian analysis of Formula One race results: disentangling driver skill and constructor advantage," Journal of Quantitative Analysis in Sports, vol. 19, no. 4, pp. 273-293, 2023.
[9] A. Bonomi, E. Turri, and G. Iacca, "Evolutionary F1 Race Strategy," in Proceedings of the Companion Conference on Genetic and Evolutionary Computation (GECCO '23 Companion), pp. 1925-1932, 2023.
[10] O. F. C. Heine and C. Thraves, "On the optimization of pit stop strategies via dynamic programming," Central European Journal of Operations Research, vol. 31, no. 1, pp. 239-268, 2023.
[11] L. Paparusso, M. Riani, F. Ruggeri, and F. Braghin, "Competitors-Aware Stochastic Lap Strategy Optimisation for Race Hybrid Vehicles," IEEE Transactions on Vehicular Technology, vol. 72, no. 3, pp. 3074-3089, March 2023.
[12] A. Heilmeier, M. Graf, J. Betz, and M. Lienkamp, "Application of Monte Carlo Methods to Consider Probabilistic Effects in a Race Simulation for Circuit Motorsport," Applied Sciences, vol. 10, no. 12, p. 4229, 2020.
[13] A. Heilmeier, A. Thomaser, M. Graf, and J. Betz, "Virtual Strategy Engineer: Using Artificial Neural Networks for Making Race Strategy Decisions in Circuit Motorsport," Applied Sciences, vol. 10, no. 21, p. 7805, 2020.
[14] B. Peng, et al., "Rank position forecasting in car racing," in 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 1057-1066, 2021.
[15] F. Dama and C. Sinoquet, "Time Series Analysis and Modeling to Forecast: a Survey," arXiv preprint arXiv:2104.00164, 2021.
[16] D. Piccinotti, et al., "Online Planning for F1 Race Strategy Identification," in International Conference on Automated Planning and Scheduling (ICAPS), 2021.
[17] A. Heilmeier, M. Graf, and M. Lienkamp, "A Race Simulation for Strategy Decisions in Circuit Motorsports," in 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pp. 2986-2993, 2018.
[18] J. Bekker and W. Lotz, "Planning Formula One race strategies using discrete-event simulation," Journal of the Operational Research Society, vol. 60, no. 1, pp. 26-32, 2009.