TY  - JOUR
PY  - 2022//
TI  - Modeling highly imbalanced crash severity data by ensemble methods and global sensitivity analysis
JO  - Journal of transportation safety and security
A1  - Jiang, Liming
A1  - Xie, Yuanchang
A1  - Wen, Xiao
A1  - Ren, Tianzhu
SP  - 562
EP  - 584
VL  - 14
IS  - 4
N2  - Crash severity has been extensively studied and numerous methods have been developed for investigating the relationship between crash outcome and explanatory variables. Crash severity data are often characterized by highly imbalanced severity distributions, with most crashes in the Property-Damage-Only (PDO) category and the severe crash category making up only a fraction of the total observations. Many methods perform better on outcome categories with the most observations than other categories. This often leads to a high modeling accuracy for PDO crashes but poor accuracies for other severity categories. This research introduces two ensemble methods to model imbalanced crash severity data: AdaBoost and Gradient Boosting. It also adopts a more reasonable performance metric, F1 score, for model selection. It is found that AdaBoost and Gradient Boosting outperform other benchmark methods and generate more balanced prediction accuracies. Additionally, a global sensitivity analysis is adopted to determine the individual and joint impacts of explanatory factors on crash severity outcome. Vertical curve, seat belt use, accident type, road characteristics, and truck percentage are found to be the most influential factors. Finally, a simulation-based approach is used to further study how the impact of a particular factor may vary with respect to different value ranges.


LA  - en
SN  - 1943-9962
UR  - http://dx.doi.org/10.1080/19439962.2020.1796863
ID  - ref1
ER  - 