In this chapter we ask several questions about how the brain realizes reward-based decision making: 1) What is the right mathematical framework for capturing the actual choice sequences of learning animals? 2) How are the crucial variables for multi-step decision making, such as action values, choice history, strategies for multi-step behavior, and reward prediction errors, processed in the brain? 3) What is the brain's mechanism for deciding whether to pursue a delayed reward or to abandon it? We first show that extensions to the standard reinforcement learning framework are necessary to capture the actual choice sequences of animals, which often include episodes of the win-stay-lose-shift strategy as well as dependence on the history of choices and their values. In a multi-step choice task in monkeys, neurons in the striatum encode action values, and the putamen plays a critical role in history-based action selection. Furthermore, midbrain dopamine neurons represent the sum of the immediate and expected multiple future rewards, and its prediction errors. In a delayed reward task in rats, waiting for a delayed reward was associated with both a higher level of serotonin release and sustained higher firing of serotonin neurons in the dorsal raphe nucleus.
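To make the contrast between standard value learning and the win-stay-lose-shift episodes mentioned above concrete, the following is a minimal sketch (not the chapter's actual model) of the two choice rules for a two-armed bandit: a value update driven by a reward prediction error, and a WSLS rule driven only by the previous trial. Function names and the learning rate `alpha` are illustrative assumptions.

```python
import random

def wsls_choice(prev_action, prev_reward, n_actions=2):
    """Win-stay-lose-shift: repeat the previous action after a reward,
    switch otherwise; choose randomly on the first trial.
    (Illustrative sketch, not the model analyzed in the chapter.)"""
    if prev_action is None:
        return random.randrange(n_actions)
    if prev_reward > 0:
        return prev_action        # win -> stay
    return 1 - prev_action        # lose -> shift (two-action case)

def q_update(q, action, reward, alpha=0.1):
    """Standard action-value update for the single-state (bandit) case,
    driven by the reward prediction error delta = r - Q(a)."""
    delta = reward - q[action]    # reward prediction error
    q[action] += alpha * delta    # incremental value update
    return q, delta
```

Note that the value-based rule integrates rewards gradually across many trials, whereas WSLS depends only on the most recent outcome; capturing both patterns in animal data is one motivation for extending the standard reinforcement learning framework.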