Deep-Learning Based Trajectory Forecast for Safety of Intersections with Multimodal Traffic (Phase III)

Detecting human behavior (for example, a driver grabbing the top of the steering wheel before a turn, or looking over the shoulder for a bicyclist) in order to build a model that predicts future vehicle trajectories and helps avoid collisions is very difficult. Such gestures are probably very strong predictors of trajectories over a short time horizon. However, at present there are no practical, scalable traffic safety systems that use human body cues to predict vehicle trajectories. The project builds upon a detector and tracker of objects (vehicles, pedestrians, and other road users) that the research team is currently investigating. The objective is to extend this work by developing new computational methods to forecast the trajectories of road users over a short time horizon. Given the complexity of human behavior and the diversity of the scenes to be monitored, the team opts for a model-free approach such as deep neural networks, which mimic human perception and can detect and classify complex user features and gestures. The trajectory forecast for a user will depend not only on that user's behavior, past trajectory, and body cues, but also on the past trajectories of other users, which provide information on the user's likely future path. As a result, the algorithm is expected to learn patterns such as: a pedestrian standing by a crosswalk is likely to cross it, or a bicyclist moving out of a bike lane is likely to turn left at the intersection. The dependency between gestures/body cues, past trajectories, and future trajectories will also be investigated in this proposal.
Since the detection and classification of gestures is not expected to be perfect, the team intends to use manually labeled data (generated using an online labeling service) to determine which features are strong predictors of changes in trajectories and to evaluate the added value of using gestures for predicting future trajectories. If these features can be detected with sufficient accuracy, and if they are strongly correlated with changes in expected trajectories, then the system can anticipate road-user movements further in advance and provide more timely warnings by detecting dangerous situations before a collision risk exists.
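
One way to picture the evaluation described above is the following minimal sketch. It is not the team's method: it assumes a simple constant-velocity extrapolation as the "expected trajectory", measures how far the observed path ends up from it, and uses a Pearson correlation between that deviation and a manually assigned binary gesture label. All function names, the sample format, and the toy data are hypothetical.

```python
from math import hypot, sqrt


def constant_velocity_forecast(past, horizon):
    """Assumed baseline: repeat the last observed step for `horizon` steps."""
    (x0, y0), (x1, y1) = past[-2], past[-1]
    vx, vy = x1 - x0, y1 - y0
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, horizon + 1)]


def final_displacement_error(forecast, actual):
    """Distance between the last forecast point and the last observed point."""
    (fx, fy), (ax, ay) = forecast[-1], actual[-1]
    return hypot(fx - ax, fy - ay)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def gesture_deviation_correlation(samples):
    """Correlate a binary gesture label with deviation from the baseline.

    Each sample is (past_points, future_points, gesture_flag), a
    hypothetical format chosen only for this illustration.
    """
    labels, errors = [], []
    for past, future, gesture in samples:
        forecast = constant_velocity_forecast(past, len(future))
        errors.append(final_displacement_error(forecast, future))
        labels.append(1.0 if gesture else 0.0)
    return pearson(labels, errors)


# Toy data (fabricated for illustration only): two road users continue
# straight with no gesture, two turn after a labeled gesture.
samples = [
    ([(0, 0), (1, 0)], [(2, 0), (3, 0)], False),
    ([(0, 0), (1, 0)], [(2, 0), (3, 0)], False),
    ([(0, 0), (1, 0)], [(2, 1), (2, 2)], True),
    ([(0, 0), (1, 0)], [(2, 1), (2, 2)], True),
]
print(gesture_deviation_correlation(samples))
```

In this toy data the gesture label perfectly separates the turning tracks from the straight ones, so the correlation is close to 1. On real labeled data a weaker value would be expected, and a high correlation would only suggest, not establish, that the gesture adds predictive value.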