Foraging decisions as multi-armed bandit problems: applying reinforcement learning algorithms to foraging data

Juliano Morimoto

    Research output: Contribution to journal, Article, peer-review

    11 Citations (Scopus)

    Abstract

    Finding resources is crucial for animals to survive and reproduce, but we still lack an understanding of the decision-making that underlies whether to explore new resources or exploit old ones. Theory predicts an ‘exploration-exploitation trade-off’ in which animals must balance their effort between staying to exploit a seemingly good resource and moving to explore the environment. To date, however, it has been challenging to generate flexible yet tractable statistical models that can capture this trade-off, and our understanding of foraging decisions is limited. Here, I suggest that foraging decisions can be seen as multi-armed bandit problems, and apply deterministic (i.e., the Upper-Confidence-Bound or ‘UCB’) and Bayesian algorithms (i.e., Thompson Sampling or ‘TS’) to demonstrate how these algorithms generate testable a priori predictions from simulated data. Next, I use UCB and TS to analyse empirical foraging data from larvae of the tephritid fruit fly Bactrocera tryoni to provide a qualitative and quantitative framework for quantifying animal foraging behaviour. Qualitative analysis revealed that TS displays a shorter exploration period than UCB, although both converged on similar qualitative results. Quantitative analysis demonstrated that, overall, UCB is more accurate than TS in predicting the observed foraging patterns, even though both algorithms failed to quantitatively estimate the empirical foraging patterns in high-density groups (i.e., groups with 50 larvae and, more strikingly, groups with 100 larvae), likely due to the influence of intraspecific competition on animal behaviour. The framework proposed here demonstrates how reinforcement learning algorithms can be used to model animal foraging decisions.
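
The bandit framing in the abstract can be illustrated with a minimal sketch, in which each food patch is treated as an arm with a fixed probability of yielding a reward. The abstract does not say which variants of the algorithms were used, so this sketch assumes the textbook UCB1 rule and Beta-Bernoulli Thompson Sampling. The patch qualities, number of rounds, seeds and function names are hypothetical values chosen for illustration, not taken from the paper.

```python
import math
import random


def ucb1(reward_probs, n_rounds, rng):
    """Textbook UCB1 on a Bernoulli bandit; returns visits per arm."""
    n_arms = len(reward_probs)
    pulls = [0] * n_arms      # number of visits to each patch
    totals = [0.0] * n_arms   # summed reward obtained from each patch
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            # Visit every patch once so each mean estimate is defined.
            arm = t - 1
        else:
            # Choose the patch with the highest mean plus confidence bonus.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / pulls[a]
                + math.sqrt(2.0 * math.log(t) / pulls[a]),
            )
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
    return pulls


def thompson_sampling(reward_probs, n_rounds, rng):
    """Beta-Bernoulli Thompson Sampling; returns visits per arm."""
    n_arms = len(reward_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(n_rounds):
        # Draw one plausible reward rate per patch from its Beta posterior
        # (uniform Beta(1, 1) prior), then visit the patch with the best draw.
        draws = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        arm = max(range(n_arms), key=lambda a: draws[a])
        if rng.random() < reward_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


# Hypothetical patch qualities (probability that a visit is rewarded).
patch_quality = [0.2, 0.5, 0.8]
ucb_pulls = ucb1(patch_quality, 2000, random.Random(1))
ts_pulls = thompson_sampling(patch_quality, 2000, random.Random(1))
print("UCB1 visits per patch:", ucb_pulls)
print("TS visits per patch:  ", ts_pulls)
```

Comparing the visit counts of the two algorithms over time is one simple way to see how exploration gives way to exploitation of the best patch. Empirical patch-visit data could in principle be compared against such simulated allocations, although the paper's actual fitting procedure is not described in the abstract.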

    Original language: English
    Pages (from-to): 48-56
    Number of pages: 9
    Journal: Journal of Theoretical Biology
    Volume: 467
    Publication status: Published - 21 Apr 2019
