Developing systems capable of driving autonomously demands not only robust perception of the surrounding environment but also reliable decision-making under real-world uncertainty. Since deploying untested controllers on public roads carries significant risk and cost, simulation platforms such as CARLA have become the standard staging ground for early-stage research. In this work, we trained an end-to-end steering controller inside the CARLA simulator using Deep Q-Network (DQN) reinforcement learning, and subsequently evaluated a Double DQN (DDQN) variant that mitigates the overestimation bias inherent in vanilla DQN.
Both agents process raw camera frames from a forward-facing sensor to produce discrete steering commands, while a separate PID speed controller handles longitudinal velocity. Positive rewards encourage smooth lane-following and target-speed maintenance; collision and lane-departure events trigger penalty signals that guide the agent away from unsafe behaviour. Stabilization mechanisms, namely experience replay, epsilon-greedy exploration decay, and a periodically synchronized target network, were applied throughout training. Our experiments confirm that DQN agents can acquire competent steering policies from pixels alone, and that the DDQN extension yields measurably lower collision rates and more consistent trajectories. The results support the broader applicability of deep reinforcement learning to autonomous steering tasks and lay groundwork for future extensions toward real-world deployment.
Introduction
This paper describes the development of an autonomous driving system using Deep Reinforcement Learning (DRL) within the CARLA simulation environment. Autonomous driving requires the integration of perception, prediction, and control systems, but real-world testing is expensive and risky during early research stages. To overcome this challenge, the open-source CARLA platform provides a realistic urban driving simulator with sensors and a Python API, allowing researchers to safely train and test autonomous driving algorithms.
The study focuses on using Deep Q-Networks (DQN), a DRL algorithm introduced by Volodymyr Mnih and colleagues, to learn steering control directly from camera images. The system processes 128×128 grayscale images from a front-facing camera using a convolutional neural network (CNN) and predicts Q-values for thirteen discrete steering actions. The model is trained using reinforcement learning techniques such as epsilon-greedy exploration, replay buffers, and target networks. An improved version called Double Deep Q-Network (DDQN) is also implemented to reduce Q-value overestimation. Performance is evaluated using collision rate, lane deviation, and average reward over 150 training episodes.
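The difference between the DQN and DDQN bootstrap targets can be illustrated with a minimal sketch. The function names, the plain-list representation of Q-values, and the discount factor used in the example are assumptions made for illustration; they are not taken from the implementation described in this paper.

```python
from typing import Sequence


def dqn_target(reward: float, next_q_target: Sequence[float],
               gamma: float, done: bool) -> float:
    """Vanilla DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward  # no bootstrapping from a terminal state
    return reward + gamma * max(next_q_target)


def ddqn_target(reward: float, next_q_online: Sequence[float],
                next_q_target: Sequence[float], gamma: float, done: bool) -> float:
    """Double DQN: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Illustrative values only (hypothetical Q-values for three actions).
q_online = [1.0, 2.0, 0.5]
q_target = [3.0, 1.5, 0.2]
print(dqn_target(1.0, q_target, gamma=0.5, done=False))             # prints 2.5
print(ddqn_target(1.0, q_online, q_target, gamma=0.5, done=False))  # prints 1.75
```

In this example the target network assigns its highest value to an action the online network does not prefer, so the DDQN target is lower than the DQN target. Decoupling action selection from action evaluation in this way is what reduces overestimation.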
The literature survey reviews major contributions in reinforcement learning and autonomous driving research. Foundational works such as Reinforcement Learning: An Introduction provide the theoretical basis for reinforcement learning algorithms. Research on DQN demonstrated that neural networks can learn control policies directly from raw visual input. Other studies explored reinforcement learning for driving assistance systems, policy fine-tuning, imitation learning, and hybrid approaches combining reinforcement learning with human driving demonstrations. Surveys highlighted ongoing challenges such as sample efficiency, transfer learning, safety constraints, and generalization across environments.
The paper identifies limitations in traditional rule-based autonomous driving systems, which rely heavily on manually programmed lane detectors, obstacle classifiers, and decision trees. These systems struggle with unpredictable urban environments and cannot easily adapt to new situations. In contrast, reinforcement learning allows agents to learn adaptive driving behavior directly from experience and reward feedback rather than predefined rules.
The proposed methodology trains a DQN agent in the CARLA environment to map visual observations to steering commands. The agent uses a single front-facing RGB camera whose images are resized and converted to grayscale to reduce computational complexity. The steering space is discretized into thirteen steering values ranging from −0.75 to +0.75. The reward function encourages lane-following and speed maintenance while penalizing collisions and lane departures.
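The discretized steering space and epsilon-greedy action selection can be sketched as follows. Even spacing between the thirteen steering values is an assumption (the text gives only the count and the range), and the epsilon schedule parameters (`start`, `end`, `decay`) are hypothetical placeholders rather than the values used in training.

```python
import random
from typing import List, Sequence

NUM_ACTIONS = 13   # from the text
MAX_STEER = 0.75   # from the text


def steering_actions(n: int = NUM_ACTIONS, max_steer: float = MAX_STEER) -> List[float]:
    """Evenly spaced steering values from -max_steer to +max_steer (spacing is assumed)."""
    step = 2.0 * max_steer / (n - 1)
    return [-max_steer + i * step for i in range(n)]


def select_action(q_values: Sequence[float], epsilon: float, rng: random.Random) -> int:
    """Epsilon-greedy: a random action with probability epsilon, otherwise the argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decayed_epsilon(episode: int, start: float = 1.0, end: float = 0.05,
                    decay: float = 0.97) -> float:
    """Exponential decay with a floor; all three parameters are hypothetical."""
    return max(end, start * decay ** episode)


actions = steering_actions()
rng = random.Random(0)
q_values = [0.0] * NUM_ACTIONS
q_values[8] = 1.0  # pretend the network prefers a slight right turn
index = select_action(q_values, epsilon=0.0, rng=rng)
print(index, actions[index])  # prints 8 0.25
```

Under the even-spacing assumption the step between adjacent actions is 0.125, and the middle action (index 6) corresponds to driving straight.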
The environment setup includes realistic traffic scenarios with NPC vehicles, pedestrians, and various weather conditions. Multiple sensors such as cameras, lidar, radar, GPS, and IMU are attached to the vehicle. The neural network architecture consists of three convolutional layers with batch normalization followed by fully connected layers that output steering Q-values. The model is trained using the Adam optimizer and Huber loss to improve stability and convergence.
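The stabilizing effect of the Huber loss can be seen in a short sketch. The threshold `delta = 1.0` is an assumed default, since the value used in training is not stated here, and the function names are illustrative.

```python
from typing import Sequence


def huber_loss(td_error: float, delta: float = 1.0) -> float:
    """Quadratic for small errors, linear for large ones (delta is an assumed threshold)."""
    abs_err = abs(td_error)
    if abs_err <= delta:
        return 0.5 * td_error ** 2
    return delta * (abs_err - 0.5 * delta)


def mean_huber_loss(predicted_q: Sequence[float], target_q: Sequence[float],
                    delta: float = 1.0) -> float:
    """Average Huber loss over a batch of (predicted, target) Q-value pairs."""
    losses = [huber_loss(p - t, delta) for p, t in zip(predicted_q, target_q)]
    return sum(losses) / len(losses)


print(huber_loss(0.5))  # prints 0.125 (quadratic region)
print(huber_loss(3.0))  # prints 2.5   (linear region; squared error would give 4.5)
```

Because the loss grows only linearly once the temporal-difference error exceeds `delta`, the gradient magnitude is bounded. This limits the influence of occasional large errors, such as those following a collision penalty, on the network update.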
Conclusion
This project demonstrates that Deep Reinforcement Learning can serve as a viable foundation for autonomous vehicle steering control, even when the agent starts with no prior knowledge of driving and must learn entirely from its own experience inside a simulator. Working within the CARLA environment, we trained a DQN-based controller that progressed from random exploration to stable, lane-keeping behavior over 150 episodes, ultimately achieving Grade A performance with average episode rewards above 5,000. Collision rates fell from approximately 50% to under 20%, episode lengths nearly quintupled, and lateral lane deviation dropped to below one meter, all consistent indicators of a policy that has internalized the basic geometry of road following.
References
[1] V. Mnih et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp. 529–533, Feb. 2015, doi: 10.1038/nature14236.
[2] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, 2nd ed. Cambridge, MA: MIT Press, 2018.
[3] A. Dosovitskiy et al., "CARLA: An open urban driving simulator," in Proc. 1st Annu. Conf. Robot Learn. (CoRL), Mountain View, CA, USA, 2017, pp. 1–16.
[4] A. Kendall et al., "Learning to drive in a day," in Proc. IEEE Int. Conf. Robot. Autom. (ICRA), Montreal, QC, Canada, 2019, pp. 8248–8254, doi: 10.1109/ICRA.2019.8793742.
[5] X. Zhu et al., "Deep reinforcement learning for advanced driver assistance systems," IEEE Trans. Intell. Veh., vol. 5, no. 4, pp. 606–617, Dec. 2020, doi: 10.1109/TIV.2020.2991063.
[6] J. Chen, B. Yuan, and M. Tomizuka, "Interpretable end-to-end urban autonomous driving with latent deep reinforcement learning," IEEE Robot. Autom. Lett., vol. 6, no. 4, pp. 7798–7805, Oct. 2021, doi: 10.1109/LRA.2021.3099469.
[7] L. Anzalone, M. Sorci, and U. Castellani, "Reinforced curriculum learning for autonomous driving in CARLA," in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV) Workshops, Montreal, QC, Canada, Oct. 2021, pp. 3018–3023.
[8] B. R. Kiran et al., "Deep reinforcement learning for autonomous driving: A survey," IEEE Trans. Intell. Transp. Syst., vol. 23, no. 6, pp. 4909–4926, Jun. 2022, doi: 10.1109/TITS.2021.3054625.
[9] D. Li and O. Okhrin, "Modified DDPG car-following model with a real-world human driving experience with CARLA simulator," in Proc. IEEE 25th Int. Conf. Intell. Transp. Syst. (ITSC), Macau, China, 2022, pp. 1–7, doi: 10.1109/ITSC55140.2022.9922107.
[10] C. Gómez-Huélamo et al., "Deep reinforcement learning based control for autonomous vehicles in CARLA," Multimed. Tools Appl., vol. 81, no. 3, pp. 3553–3576, Jan. 2022, doi: 10.1007/s11042-021-11437-3.
[11] J. Hossain and M. A. H. Khan, "Autonomous driving with deep reinforcement learning in CARLA simulation environment," in Proc. IEEE Conf. Adv. Netw. Appl. (AINA), Hasselt, Belgium, 2023, pp. 1–8.
[12] A. J. Aghdasi, E. Kim, and L. Shen, "Autonomous driving using residual sensor fusion and deep reinforcement learning," IEEE Embedded Syst. Lett., vol. 15, no. 3, pp. 93–96, Sept. 2023, doi: 10.1109/LES.2023.3308159.
[13] J. Wu, H. Wu, and Z. J. Wang, "Recent advances in reinforcement learning for autonomous driving," IEEE Trans. Intell. Transp. Syst., vol. 25, no. 2, pp. 1472–1491, Feb. 2024, doi: 10.1109/TITS.2023.3331053.
[14] Z. Peng, H. He, J. Wang, and D. Sun, "Improving agent behaviors with reinforcement learning fine-tuning," in Proc. European Conf. Comput. Vis. (ECCV), Milan, Italy, 2024.
[15] B. T. Uppuluri, A. Jain, and K. Paul, "CuRLA: Curriculum learning based deep reinforcement learning for autonomous driving," arXiv preprint arXiv:2407.12729, 2024.
[16] E. Delavari, P. Kumar, and S. Kim, "A comprehensive review of reinforcement learning for autonomous driving in the CARLA simulator," IEEE Access, vol. 12, pp. 112543–112570, 2024, doi: 10.1109/ACCESS.2024.3401234.
[17] M. Bansal, A. Krizhevsky, and A. Ogale, "ChauffeurNet: Learning to drive by imitating the best and synthesizing the worst," in Proc. Robot.: Sci. Syst. (RSS), Freiburg im Breisgau, Germany, 2019.
[18] F. Codevilla, M. Müller, A. López, V. Koltun, and A. Dosovitskiy, "End-to-end driving via conditional imitation learning," in Proc. IEEE Int. Conf. Robot. Autom. (ICRA), Brisbane, QLD, Australia, 2018, pp. 4693–4700, doi: 10.1109/ICRA.2018.8460487.