Transit Signal Priority Control with Connected Vehicle Technology: Deep Reinforcement Learning Approach

As urbanization advances, the demand for transportation services continues to increase. In general, there are two ways to meet this demand: one is to construct more transportation infrastructure, and the other is to improve the efficiency of the existing infrastructure. Given limited space and funding, the latter is the more realistic option. As a result, public transportation, which is more efficient than private transportation, is playing an increasingly prominent role in the urban transportation system. However, passengers have to share space and experience longer travel times when using public transportation, which makes it less attractive than private transportation. To this end, extensive research has been conducted on transit priority strategies that can provide higher-quality transit services to the public. Among these strategies, Transit Signal Priority (TSP) is an efficient operational strategy to ensure convenient and reliable public transportation services. TSP generally adjusts the signal plan to ensure priority for transit vehicles at intersections, along arterials, or across networks (Skabardonis, 2000). However, this control strategy generally has adverse effects on other traffic, which limits its widespread adoption. To mitigate these negative effects while still providing priority to transit vehicles, adaptive TSP has been studied for decades (Christofa & Skabardonis, 2011; Ma et al., 2010; Skabardonis & Geroliminis, 2008). Adaptive TSP generally requires real-time traffic data to optimize the traffic signal plan. Traditional traffic sensors, such as loop detectors, cameras, and radars, are installed at fixed positions and can therefore observe traffic only at those locations, which limits the real-time data they can provide. Recently, with the rapid development of connected vehicle (CV) technology, more accurate and more comprehensive real-time traffic data have become readily obtainable.
This advantage can support the advancement of adaptive TSP, and many researchers have integrated CV technology with adaptive TSP (Ghanim & Abu-Lebdeh, 2015; Zeng et al., 2021). The U.S. Department of Transportation (USDOT) has also included TSP with connected vehicles (TSPCV) on its list of High-Priority Applications and Development Approach. In the meantime, optimization algorithms have been evolving rapidly. Among them, mixed-integer nonlinear programming (MINLP) and dynamic programming (DP) are analytical algorithms widely used to optimize traffic signal control (Feng et al., 2015; Li & Ban, 2019; Priemer & Friedrich, 2009). However, these algorithms have to model the traffic environment as comprehensively as possible, which is computationally intensive, time-consuming, and often impractical (Mohamad Alizadeh Shabestary, 2019). Since real-time traffic data can be easily obtained in the CV era, reinforcement learning (RL), which is data-driven and can learn optimal control strategies by interacting with the environment, is a suitable approach to address the aforementioned issues (Aslani et al., 2017; Chow et al., 2021; Li et al., 2016). RL was initially developed for problems with discrete states and actions. When integrated with deep learning, a combination known as deep reinforcement learning (deep RL), it becomes a promising approach for TSP control problems (Genders & Razavi, 2016; Shabestary & Abdulhai, 2022). However, most existing deep RL studies have focused on signal control problems that consider only private vehicle traffic. This study aims to propose a robust adaptive TSP controller in a CV environment that guarantees priority for transit vehicles while minimizing the negative impact on other traffic. A deep RL approach will be applied to solve the signal control optimization problem. Comprehensive simulation experiments based on real-world traffic configurations will be conducted to examine the effectiveness of the proposed control algorithms.
This study contributes to the development of adaptive TSP controllers in a CV environment using advanced learning-based optimization approaches. The main goal of this research is to develop adaptive TSP control algorithms that use CV technology and a deep RL approach to optimize the performance of both transit vehicles and private vehicles. The objectives of this project are to: 1) Conduct a comprehensive literature review on existing CV technologies, TSP control strategies, and deep RL algorithms; 2) Propose adaptive TSP control algorithms that apply the deep RL approach to solve the optimization problems both at an isolated intersection and along a corridor; 3) Build a simulation testbed based on real-world traffic configurations and conduct simulation experiments under different scenarios; 4) Compare the results across scenarios and evaluate the effectiveness of the proposed algorithms.