Foraging decisions as multi-armed bandit problems: applying reinforcement learning algorithms to foraging data