Safe Reinforcement Learning for Intersection Management in RITI Communities Under Rare Extreme Events

Advances in artificial intelligence (AI) are transforming the transportation sector, one of the most critical infrastructures in modern society. AI-based technologies have been applied to many facets of the transportation system, such as autonomous vehicles, driver injury prediction and prevention, and traffic management. In particular, efficient traffic management can greatly reduce traffic congestion, a problem faced by travelers on a daily basis. It is therefore of paramount importance to develop better traffic management systems, which will in turn boost the efficiency of the overall transportation system. One effective traffic management measure is intersection management, in which the phasing of traffic signals is optimized at each intersection of the transportation network. Recently, reinforcement learning has been applied to adaptive traffic signal control and has demonstrated superior performance. In a nutshell, reinforcement learning is a branch of machine learning that aims to learn to interact optimally with a dynamic environment. In the context of adaptive traffic signal control, given enough training data, reinforcement learning algorithms can learn to set the phasing of traffic signals optimally under time-varying traffic conditions.

In this project, the research team proposes safe reinforcement learning for intelligent traffic signal control in RITI communities under rare extreme events. The key innovation behind safe reinforcement learning under significant rare events is the adjustment of rare-event probabilities during training. More specifically, the research team can artificially increase the probabilities of significant rare events (e.g., extreme weather conditions that paralyze the transportation system) in a simulator, such as the Simulation of Urban MObility (SUMO) platform.
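As background, the idea of learning signal phasing from interaction can be illustrated with a minimal, hypothetical sketch of tabular Q-learning for a single two-phase intersection. The toy environment, queue dynamics, and all parameters below are illustrative assumptions for exposition only, not the project's actual design; a real controller would draw its states and rewards from a traffic simulator such as SUMO.

```python
import random

# Hypothetical toy setting: the state is the pair of (discretized) queue
# lengths on the two approaches, the action is which phase gets green, and
# the reward penalizes the total number of waiting vehicles.
PHASES = [0, 1]            # 0: north-south green, 1: east-west green
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

def step(queues, phase):
    """Serve the chosen phase: its queue shrinks, the other one grows."""
    ns, ew = queues
    if phase == 0:
        ns, ew = max(ns - 3, 0), min(ew + 1, 9)
    else:
        ns, ew = min(ns + 1, 9), max(ew - 3, 0)
    reward = -(ns + ew)    # fewer waiting vehicles is better
    return (ns, ew), reward

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = {}                 # (state, action) -> estimated value
    for _ in range(episodes):
        state = (rng.randint(0, 9), rng.randint(0, 9))
        for _ in range(20):     # fixed-length episode
            if rng.random() < EPS:             # explore
                a = rng.choice(PHASES)
            else:                              # exploit
                a = max(PHASES, key=lambda p: q.get((state, p), 0.0))
            nxt, r = step(state, a)
            best_next = max(q.get((nxt, p), 0.0) for p in PHASES)
            td = r + GAMMA * best_next - q.get((state, a), 0.0)
            q[(state, a)] = q.get((state, a), 0.0) + ALPHA * td
            state = nxt
    return q

q_table = train()
# After training, the greedy action at a state is the phase with the highest
# Q-value; it tends to favor serving the longer queue.
```

The same learning loop carries over to realistic settings by replacing the toy `step` function with simulator transitions and richer state features.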
The team must then make a proper adjustment (e.g., through importance sampling) in the learning process to account for the altered rare-event probabilities in the simulator. Under the correct adjustment, the learned policy remains optimal under the true rare-event probabilities. In addition, the research team will develop techniques, such as hierarchical reinforcement learning and transfer learning, to further improve the convergence speed of the proposed reinforcement learning algorithm.
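The importance-sampling adjustment can be sketched in miniature as follows. All probabilities and costs below are made-up illustrative numbers: episodes are sampled with an artificially boosted rare-event probability, and each sample is re-weighted by the likelihood ratio of the true distribution over the sampling distribution, so the resulting cost estimate is unbiased under the true rare-event probability.

```python
import random

P_TRUE = 0.001   # assumed true per-episode probability of the extreme event
P_BOOST = 0.2    # inflated probability used inside the simulator

def episode_cost(extreme, rng):
    """Toy per-episode delay cost: far larger when the extreme event hits."""
    return rng.uniform(500, 1000) if extreme else rng.uniform(5, 10)

def estimate_cost(n=100_000, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        extreme = rng.random() < P_BOOST     # sample under the BOOSTED prob.
        # Likelihood ratio: true distribution over sampling distribution.
        w = (P_TRUE / P_BOOST) if extreme else ((1 - P_TRUE) / (1 - P_BOOST))
        total += w * episode_cost(extreme, rng)
    return total / n

est = estimate_cost()
# For these toy numbers the true expected cost is
# 0.001 * 750 + 0.999 * 7.5 ≈ 8.24, and the weighted estimate converges
# to it; averaging the raw boosted-simulation costs instead would give a
# heavily biased value near 0.2 * 750 + 0.8 * 7.5 = 156.
```

The same likelihood-ratio weight would multiply the return (or the policy-gradient/temporal-difference update) in a reinforcement learning algorithm trained on the boosted simulator, which is the kind of correction the project refers to.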