Abstract
This paper analyzes how individuals resolve an exploration versus exploitation trade-off in a laboratory experiment. The experiment implements the single-agent exponential bandit model. We analyze how subjects respond to changes in the prior belief, safe action, and discount factor. We find that subjects respond in the predicted direction to these changes. However, we find that subjects under-respond to the prior belief, under-respond to the safe action, and typically explore less than predicted. Our results suggest that neither risk aversion nor the random termination probability is driving under-experimentation. Our results are consistent with subjects having incorrect beliefs about exploration.
Original language | English |
---|---|
Pages (from-to) | 267-286 |
Number of pages | 20 |
Journal | Economic Inquiry |
Volume | 62 |
Issue number | 1 |
Publication status | Published - Jan 2024 |
Externally published | Yes |
Bibliographical note
© 2023 The Authors. Economic Inquiry published by Wiley Periodicals LLC on behalf of Western Economic Association International. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
Keywords
- bandits
- continuous time
- experimentation
- exponential bandit model