SAFETYLIT WEEKLY UPDATE


Journal Article

Citation

Zhang P, Lei D, Liu S, Jiang H. Transp. Res. E Logist. Transp. Rev. 2024; 185: e103485.

Copyright

(Copyright © 2024, Elsevier Publishing)

DOI

10.1016/j.tre.2024.103485

PMID

unavailable

Abstract

Driver-preferred route planning often evaluates the quality of a planned route by how closely the driver follows it. Despite decades of research in this area, non-negligible deviations from planned routes persist. Recently, with the prevalence of GPS data, Inverse Reinforcement Learning (IRL) has attracted much interest due to its ability to learn routing patterns directly from GPS trajectories. However, existing IRL methods are limited in that (1) they rely on numerical approximations to calculate the expected state visitation frequencies (SVFs), which are inaccurate and time-consuming, and (2) they ignore the fact that the coverage of GPS trajectories is skewed toward popular road segments, which makes learning on sparsely covered segments difficult. To overcome these challenges, we propose a recursive logit-based meta-IRL approach in which (1) we use the recursive logit model to capture drivers' route choice behavior so that the expected SVFs can be derived analytically, substantially reducing the computational effort, and (2) we introduce meta-parameters and employ meta-learning techniques so that learning on sparsely covered road segments can benefit from learning on popular ones. When training our IRL model, we update the rewards of road segments with the expected SVFs by solving several systems of linear equations, and we update the meta-parameters through a two-level optimization structure to ensure fast adaptation and versatility. We validate our approach using real GPS data from Chengdu, China.
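The analytic SVF derivation mentioned above follows from the standard recursive logit formulation, in which both the value functions and the flow-conservation conditions reduce to linear systems. Below is a minimal NumPy sketch of that idea; the function name, interface, and variable names are illustrative assumptions on our part, not taken from the paper.

```python
import numpy as np

def expected_svfs(reward, adjacency, dest, origin_flow, mu=1.0):
    """Expected state-visitation frequencies under a recursive logit model.

    Assumptions (ours, for this sketch): rewards are finite everywhere,
    every state can reach `dest`, and `dest` is absorbing (no outgoing
    edges in `adjacency`).

    reward      : (n, n) array, reward[k, a] = reward of moving k -> a
    adjacency   : (n, n) boolean array, True where an edge exists
    dest        : index of the destination state
    origin_flow : (n,) array of trip counts starting in each state
    mu          : logit scale parameter
    """
    n = reward.shape[0]

    # M[k, a] = exp(reward[k, a] / mu) on edges, 0 elsewhere.
    M = np.where(adjacency, np.exp(reward / mu), 0.0)

    # Value-function system z = M z + b with b the indicator of dest;
    # solving (I - M) z = b gives z[k] = exp(V(k) / mu).
    b = np.zeros(n)
    b[dest] = 1.0
    z = np.linalg.solve(np.eye(n) - M, b)

    # Recursive logit choice probabilities: P(a | k) = M[k, a] z[a] / z[k].
    P = M * z[None, :] / z[:, None]

    # Flow conservation F = origin_flow + P^T F, i.e. (I - P^T) F = origin_flow,
    # yields the expected number of visits to each state.
    F = np.linalg.solve(np.eye(n) - P.T, origin_flow)

    # Expected edge visitation frequencies, as used in the IRL gradient.
    return F[:, None] * P
```

The two np.linalg.solve calls replace the repeated rollouts or dynamic-programming sweeps that sampling-based SVF estimators require, which is consistent with the large reduction in training time reported below.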

RESULTS show that our planned routes match actual routes better than those of state-of-the-art methods, including the recursive logit model, Deep-IRL, and Dij-IRL: the F1-score increases by 4.17% with the introduction of the recursive logit model, and the improvement rises to 5.19% once meta-learning is employed. Moreover, training time is reduced by over 95%.
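The "two-level optimization structure" for the meta-parameters is described only at a high level in the abstract. A generic first-order MAML-style sketch of such a structure is shown below; the task abstraction, the loss_grad interface, and all names are our assumptions, and the paper's actual procedure may differ.

```python
import numpy as np

def meta_update(theta, tasks, loss_grad, inner_lr=0.01, outer_lr=0.001,
                inner_steps=1):
    """One two-level (inner/outer) meta-update, first-order MAML style.

    theta     : (d,) shared meta-parameters (e.g., a reward initialization)
    tasks     : iterable of tasks, e.g., one per group of road segments
    loss_grad : callable (params, task) -> gradient of the IRL loss
                (hypothetical interface, not from the paper)
    """
    meta_grad = np.zeros_like(theta)
    for task in tasks:
        # Inner level: adapt the shared parameters to one task with a
        # few gradient steps.
        phi = theta.copy()
        for _ in range(inner_steps):
            phi = phi - inner_lr * loss_grad(phi, task)
        # Outer level (first-order approximation): the post-adaptation
        # gradient contributes to the meta-gradient.
        meta_grad += loss_grad(phi, task)
    # Meta step: move theta so that a few inner steps suffice on any
    # task, including sparsely covered segment groups.
    return theta - outer_lr * meta_grad / len(tasks)
```

Under this reading, densely covered road-segment groups dominate the meta-gradient, and the learned initialization then transfers to sparsely covered groups after only a few inner steps; whether the paper uses a first-order or a full second-order meta-gradient is not stated in the abstract.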

Keywords

Inverse reinforcement learning; Meta-learning; Recursive logit; Route planning
