Reinforcement Learning for Smart Traffic Signal Control: A Deep Q-Network Approach with Multi-Agent Coordination

Authors: Yogita Dhale

DOI Link: https://doi.org/10.22214/ijraset.2026.81945

Abstract

Urban traffic congestion represents a critical challenge in modern smart city development, causing significant economic losses, environmental degradation, and reduced quality of life. Traditional fixed-time and actuated traffic signal controllers fail to adapt to dynamic traffic conditions, resulting in extended vehicle waiting times and increased emissions. This paper proposes an intelligent traffic signal control system based on Deep Reinforcement Learning (DRL), specifically employing Double Deep Q-Networks (DDQN) with experience replay for adaptive signal timing optimization. The system is designed and evaluated using the Simulation of Urban Mobility (SUMO) platform, incorporating consistent state-reward design principles to ensure stable training convergence. We further extend the framework to multi-agent reinforcement learning (MARL) for coordinated control across multiple intersections. Experimental results demonstrate that the proposed RL-based controller achieves a 34.7% reduction in average waiting time, 28.3% decrease in queue length, and 22.1% improvement in intersection throughput compared to conventional fixed-time controllers. The multi-agent extension shows additional network-wide benefits with 18.5% reduced congestion propagation. These findings establish a foundation for scalable, sustainable, and intelligent traffic management systems aligned with smart city objectives.

Introduction

This paper applies Deep Reinforcement Learning (DRL) to urban traffic signal control to address inefficiencies in traditional traffic management systems. Rapid urbanization and increasing vehicle usage have led to severe congestion, economic losses, pollution, and longer travel times. Conventional fixed-time and actuated controllers, and even adaptive systems such as SCOOT and SCATS, struggle to adapt dynamically to real-time traffic conditions or to coordinate across multiple intersections.

To solve these issues, the paper proposes an AI-based traffic control framework using Reinforcement Learning, particularly a Double Deep Q-Network (DDQN) approach. The system models traffic signal control as a Markov Decision Process (MDP) where the agent observes traffic states (queue length, waiting time, vehicle density, and signal phase), takes actions by selecting signal phases, and receives rewards designed to minimize delay, congestion, and unnecessary signal changes.
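As a rough illustration of what distinguishes Double DQN from a standard DQN, the sketch below computes the bootstrap target for a single transition: the online network selects the next signal phase and the target network evaluates it. The function name `ddqn_target`, the discount factor, and all Q-values are illustrative assumptions, not values taken from the paper.

```python
from typing import Sequence


def ddqn_target(reward: float,
                next_q_online: Sequence[float],
                next_q_target: Sequence[float],
                gamma: float = 0.99,
                done: bool = False) -> float:
    """Double DQN bootstrap target for one transition.

    next_q_online / next_q_target hold one Q-value per signal phase
    for the next state, from the online and target networks.
    """
    if done:
        # Terminal transition: no bootstrapping.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(next_q_online)),
                      key=lambda a: next_q_online[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]


# Made-up Q-values for a hypothetical 4-phase intersection.
y = ddqn_target(reward=-12.0,
                next_q_online=[-30.0, -22.0, -41.0, -25.0],
                next_q_target=[-28.0, -26.0, -40.0, -21.0],
                gamma=0.9)
```

With these numbers the online network prefers phase 1, so the target bootstraps from the target network's estimate of -26.0 for that phase. A standard DQN would instead take the target network's own maximum of -21.0, which is the source of the overestimation that Double DQN is meant to reduce.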

Key contributions include:

Development of a DDQN-based traffic signal controller with stable state-reward design
Use of the SUMO traffic simulation platform for realistic evaluation
Extension toward multi-agent reinforcement learning (MARL) for coordination across multiple intersections
Evaluation using metrics such as waiting time, queue length, throughput, and emissions
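The evaluation metrics listed above are typically reported as percentage changes relative to a baseline controller. The following minimal sketch shows that arithmetic; the `RunMetrics` container, the field names, and the sample numbers are hypothetical and are not the paper's measurements.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RunMetrics:
    """Aggregate metrics from one simulation run (hypothetical container)."""
    avg_waiting_time_s: float
    avg_queue_length_veh: float
    throughput_veh_per_h: float


def percent_change(baseline: float, candidate: float) -> float:
    """Signed percentage change of candidate relative to baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (candidate - baseline) / baseline


def compare(baseline: RunMetrics, candidate: RunMetrics) -> Dict[str, float]:
    """Negative values mean a reduction; positive values mean an increase."""
    return {
        "waiting_time_pct": percent_change(baseline.avg_waiting_time_s,
                                           candidate.avg_waiting_time_s),
        "queue_length_pct": percent_change(baseline.avg_queue_length_veh,
                                           candidate.avg_queue_length_veh),
        "throughput_pct": percent_change(baseline.throughput_veh_per_h,
                                         candidate.throughput_veh_per_h),
    }


# Made-up numbers, for illustration only.
fixed_time = RunMetrics(60.0, 10.0, 1000.0)
rl_controller = RunMetrics(45.0, 8.0, 1100.0)
report = compare(fixed_time, rl_controller)
```

For waiting time and queue length a negative change is an improvement, while for throughput a positive change is, so the sign convention should be stated alongside any reported figure.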

The literature review highlights the evolution from fixed-time and actuated systems to adaptive and AI-driven approaches. It emphasizes that reinforcement learning enables adaptive, data-driven traffic control, while deep learning improves handling of complex, high-dimensional traffic states. However, challenges remain in real-world deployment, standardization, scalability, and multi-agent coordination.

The problem is formulated using MDP concepts, where:

The state space includes traffic conditions like queue length, waiting time, density, and signal phase
The action space includes different traffic signal phase choices
The reward function aims to reduce delay and penalize unnecessary phase switching
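One way to make this formulation concrete is sketched below: a flat state vector built from the four listed quantities, and a reward that penalizes delay, queues, and phase switches. The class `IntersectionObs`, the four-phase action space, and the weights `w_wait`, `w_queue`, and `switch_penalty` are assumptions chosen for illustration; the paper's exact encoding and coefficients are not given here.

```python
from dataclasses import dataclass
from typing import List

NUM_PHASES = 4  # assumed size of the action space (one action per phase)


@dataclass
class IntersectionObs:
    """Per-step observation of one intersection (hypothetical layout)."""
    queue_lengths: List[int]     # halted vehicles per incoming lane
    waiting_times: List[float]   # accumulated waiting time per lane, seconds
    densities: List[float]       # lane occupancy in [0, 1]
    phase: int                   # index of the currently active phase


def encode_state(obs: IntersectionObs,
                 num_phases: int = NUM_PHASES) -> List[float]:
    """Concatenate lane features with a one-hot encoding of the phase."""
    one_hot = [1.0 if i == obs.phase else 0.0 for i in range(num_phases)]
    return ([float(q) for q in obs.queue_lengths]
            + list(obs.waiting_times)
            + list(obs.densities)
            + one_hot)


def reward(obs: IntersectionObs, prev_phase: int,
           w_wait: float = 1.0, w_queue: float = 0.5,
           switch_penalty: float = 2.0) -> float:
    """Negative cost: delay and queues, plus a penalty if the phase changed."""
    delay_cost = (w_wait * sum(obs.waiting_times)
                  + w_queue * sum(obs.queue_lengths))
    switch_cost = switch_penalty if obs.phase != prev_phase else 0.0
    return -(delay_cost + switch_cost)


# Made-up observation for a four-lane intersection.
obs = IntersectionObs(queue_lengths=[3, 0, 5, 1],
                      waiting_times=[12.0, 0.0, 30.0, 4.0],
                      densities=[0.3, 0.0, 0.5, 0.1],
                      phase=2)
state = encode_state(obs)
r_switched = reward(obs, prev_phase=1)
r_held = reward(obs, prev_phase=2)
```

Holding the phase yields a higher reward than switching under identical traffic, which is how the switching penalty discourages unnecessary phase changes.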

Conclusion

This paper presented a comprehensive framework for reinforcement learning-based smart traffic signal control. The proposed DDQN approach with consistent state-reward design achieves significant improvements in traffic efficiency, reducing average waiting times by 34.7% and queue lengths by 28.3% compared to conventional fixed-time controllers. The multi-agent extension enables network-wide coordination, preventing congestion propagation and achieving holistic optimization across multiple intersections. Environmental sustainability benefits include 22.1% reduction in CO2 emissions through decreased vehicle idle time, supporting smart city environmental objectives. The framework establishes a foundation for future extensions including emergency vehicle priority, connected vehicle integration, and real-world pilot deployment. As cities worldwide grapple with increasing traffic challenges, intelligent systems that adapt to dynamic conditions represent essential tools for sustainable urban development. The reinforcement learning paradigm offers a powerful approach to this challenge, with demonstrated benefits that justify continued research investment and eventual deployment in production traffic management systems.


Copyright

Copyright © 2026 Yogita Dhale. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET81945

Publish Date : 2026-05-04

ISSN : 2321-9653

Publisher Name : IJRASET

DOI Link : Click Here