Travel Time Forecasting on a Freeway Corridor: A Dynamic Information Fusion Model Based on the Random Forests Approach

In recent years, traffic prediction has become indispensable because of increasing congestion on the roadway network. Avoiding congestion and increasing the utilization of the entire highway network rely heavily on the ability to predict travel times in a timely manner. Predictions also give drivers aggregated traffic information that may affect their travel plans and, through individual driving decisions, the efficiency of the entire transportation system. Travel time prediction can also reduce the waste of road resources. For example, travelers, especially those with flexible schedules, can shift their trips from peak to off-peak hours or switch between freeways and local streets when they believe that the expected travel delay is too long. Travel time prediction is highly complex because it is affected by a wide variety of factors, which include, but are not limited to, sensor-captured parameters (e.g., traffic volume, speed, vehicle class, and occupancy), event and incident information, segment locations, weather conditions, and signal status. A better understanding of travel time (and travel delay) can greatly help decision makers plan, design, operate, and manage a more efficient highway system. The acquisition and popularization of big data in transportation, technological advances that enable the collection and dissemination of real-time traffic information, and rapidly growing traffic volume and congestion have triggered increasing interest in traffic modeling. Researchers have employed machine learning approaches such as neural networks, ensemble learning, and support vector machines, and the results indicate that these approaches are adaptable and can outperform traditional models.
However, such machine learning methods often face an overfitting problem in practice that is difficult to overcome. In particular, when test conditions differ greatly from the conditions seen in training, the predicted results are often unsatisfactory. The Random Forests method offers a favorable bias-variance tradeoff, which helps it mitigate overfitting, the biggest problem of machine learning models. This research will develop a Random Forests method to predict freeway corridor travel time from probe-vehicle-based traffic data, and will thereby gain a better understanding of how traffic factors affect travel time in the freeway system.
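To make the Random Forests idea concrete, the following is a minimal sketch in standard-library Python. It is not the model developed in this research. It assumes three illustrative features (probe speed, hour of day, and an incident flag), a hypothetical 5-mile corridor, and entirely synthetic data. The function names and parameter values are likewise assumptions chosen for illustration. Each tree is grown on a bootstrap sample and considers only a random subset of features at each split. Averaging the trees is what reduces variance relative to a single tree.

```python
import random
import statistics

CORRIDOR_MILES = 5.0  # hypothetical corridor length, for illustration only


def make_synthetic(n, seed):
    """Synthetic records: [probe speed (mph), hour of day, incident flag] -> minutes."""
    rng = random.Random(seed)
    rows, minutes = [], []
    for _ in range(n):
        speed = rng.uniform(15.0, 65.0)
        hour = float(rng.randrange(24))
        incident = 1.0 if rng.random() < 0.1 else 0.0
        rows.append([speed, hour, incident])
        # Assumed relationship: distance / speed, an incident delay, and noise.
        minutes.append(60.0 * CORRIDOR_MILES / speed + 3.0 * incident + rng.gauss(0.0, 0.5))
    return rows, minutes


def sse(values):
    """Sum of squared errors around the mean."""
    m = statistics.fmean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, targets, rng, depth=3, min_leaf=5, n_try=2):
    """Grow one regression tree, testing only n_try random features per node."""
    if depth == 0 or len(rows) < 2 * min_leaf:
        return statistics.fmean(targets)
    best = None  # (error, feature index, threshold)
    for f in rng.sample(range(len(rows[0])), n_try):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            err = sse(left) + sse(right)
            if best is None or err < best[0]:
                best = (err, f, t)
    if best is None:  # no admissible split, so this node becomes a leaf
        return statistics.fmean(targets)
    _, f, t = best
    lo = [(r, y) for r, y in zip(rows, targets) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, targets) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in lo], [y for _, y in lo], rng, depth - 1, min_leaf, n_try),
            build_tree([r for r, _ in hi], [y for _, y in hi], rng, depth - 1, min_leaf, n_try))


def predict_tree(tree, row):
    """Walk from the root to a leaf; leaves are plain floats."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def fit_forest(rows, targets, n_trees=15, seed=0):
    """Train each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(build_tree([rows[i] for i in idx], [targets[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    """The forest prediction is the average of the individual tree predictions."""
    return statistics.fmean([predict_tree(tree, row) for tree in forest])


train_rows, train_y = make_synthetic(150, seed=1)
test_rows, test_y = make_synthetic(50, seed=2)
forest = fit_forest(train_rows, train_y)

# Compare against a naive baseline that always predicts the training mean.
mean_y = statistics.fmean(train_y)
forest_mae = statistics.fmean([abs(predict_forest(forest, r) - y) for r, y in zip(test_rows, test_y)])
baseline_mae = statistics.fmean([abs(mean_y - y) for y in test_y])
print(f"forest MAE: {forest_mae:.2f} min, baseline MAE: {baseline_mae:.2f} min")
```

In practice a library implementation such as scikit-learn's `RandomForestRegressor` would normally replace this hand-written version. The sketch only exposes the two sources of randomness (bootstrap sampling and per-split feature subsampling) that underlie the method's bias-variance behavior.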