Author Cazé, Romain D. ♦ van der Meer, Matthijs A. A.
Source SpringerLink
Content type Text
Publisher Springer Berlin Heidelberg
File Format PDF
Copyright Year ©2013
Language English
Subject Domain (in DDC) Technology ♦ Medicine & health
Subject Keyword Reinforcement learning ♦ Reward prediction error ♦ Decision-making ♦ Meta-learning ♦ Basal ganglia ♦ Neurosciences ♦ Computer Application in Life Sciences ♦ Neurobiology ♦ Bioinformatics ♦ Statistical Physics, Dynamical Systems and Complexity
Abstract The concept of the reward prediction error—the difference between reward obtained and reward predicted—continues to be a focal point for much theoretical and experimental work in psychology, cognitive science, and neuroscience. Models that rely on reward prediction errors typically assume a single learning rate for positive and negative prediction errors. However, behavioral data indicate that better-than-expected and worse-than-expected outcomes often do not have symmetric impacts on learning and decision-making. Furthermore, distinct circuits within cortico-striatal loops appear to support learning from positive and negative prediction errors, respectively. Such differential learning rates would be expected to lead to biased reward predictions and therefore suboptimal choice performance. Contrary to this intuition, we show that on static “bandit” choice tasks, differential learning rates can be adaptive. This occurs because asymmetric learning enables a better separation of learned reward probabilities. We show analytically how the optimal learning rate asymmetry depends on the reward distribution and implement a biologically plausible algorithm that adapts the balance of positive and negative learning rates from experience. These results suggest specific adaptive advantages for separate, differential learning rates in simple reinforcement learning settings and provide a novel, normative perspective on the interpretation of associated neural data.
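The update rule the abstract describes can be illustrated with a minimal sketch. It assumes Bernoulli rewards coded as 0 or 1 and fixed learning rates; the names `update`, `fixed_point`, `simulate`, `alpha_pos`, and `alpha_neg` are hypothetical and are not taken from the article. Under these assumptions, setting the expected update to zero gives a steady-state value of p·α+ / (p·α+ + (1 − p)·α−), which equals the true reward probability p only when the two rates are equal.

```python
import random


def update(q, reward, alpha_pos, alpha_neg):
    """One prediction-error update with separate learning rates (illustrative)."""
    delta = reward - q  # reward prediction error
    alpha = alpha_pos if delta > 0 else alpha_neg
    return q + alpha * delta


def fixed_point(p, alpha_pos, alpha_neg):
    """Value where the expected update is zero, assuming rewards in {0, 1}."""
    return p * alpha_pos / (p * alpha_pos + (1 - p) * alpha_neg)


def simulate(p, alpha_pos, alpha_neg, steps=20000, seed=0):
    """Average learned value over the second half of a simulated run."""
    rng = random.Random(seed)
    q = 0.5
    tail = steps // 2
    total = 0.0
    for t in range(steps):
        reward = 1.0 if rng.random() < p else 0.0
        q = update(q, reward, alpha_pos, alpha_neg)
        if t >= steps - tail:
            total += q
    return total / tail


if __name__ == "__main__":
    # Two low-probability arms: compare the gap between learned values
    # under symmetric and asymmetric (alpha_pos > alpha_neg) learning rates.
    p_low, p_high = 0.1, 0.2
    gap_sym = fixed_point(p_high, 0.1, 0.1) - fixed_point(p_low, 0.1, 0.1)
    gap_asym = fixed_point(p_high, 0.4, 0.1) - fixed_point(p_low, 0.4, 0.1)
    print("symmetric gap:", gap_sym)
    print("asymmetric gap:", gap_asym)
    print("simulated value:", simulate(0.2, 0.04, 0.01))
```

In this sketch the asymmetric rates bias each learned value away from the true probability, yet the gap between the two arms widens. That is one way to read the abstract's statement that asymmetric learning enables a better separation of learned reward probabilities. The specific probabilities and rates are arbitrary choices for illustration.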
ISSN 0340-1200
Age Range 18 to 22 years ♦ above 22 years
Educational Use Research
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2013-10-02
Publisher Place Berlin/Heidelberg
e-ISSN 1432-0770
Journal Biological Cybernetics
Volume Number 107
Issue Number 6
Page Count 9
Starting Page 711
Ending Page 719