Crudely, subjects choose actions because they think that those actions lead to outcomes that they presently desire. By contrast, habitual instrumental behavior is supposed to have been stamped in by past reinforcement (Thorndike, 1911) and so is divorced from the
current value of an associated outcome. Thus, key characteristics of habitual instrumental control include automaticity, computational efficiency, and inflexibility, while characteristics of goal-directed control include active deliberation, high computational cost, and an adaptive flexibility to changing environmental contingencies (Dayan, 2009). Whether behavior is goal directed is usually assayed in a test session using posttraining manipulations, which involve either reinforcer devaluation or contingency degradation. Consider a test session carried
out in extinction, i.e., without ongoing reinforcement. In this case, there should be less instrumental responding for an outcome that has been devalued (for example, a food reinforcer that has just been rendered unpalatable) than for an outcome that has not. Importantly, this holds only if knowledge of a reinforcer’s current value (i.e., its desirability) exerts a controlling influence on performance; in other words, if task performance is mediated by a representation of the reinforcer (Adams and Dickinson, 1981). Conversely, habitual behavior comprises instrumental responding that continues to be enacted even when the outcome is undesired. Various circumstances promote habitual responding, notably extended training on interval schedules of reinforcement involving single actions and single outcomes (Dickinson and Charnock, 1985, Dickinson and Balleine, 2002 and Dickinson et al., 1983). The requirement for extensive experience is key, and it implies that behavior is initially goal directed but becomes habitual over the course of experience. For completeness, we also mention the contingency criterion wherein goal-directed behavior also involves
an encoding of the causal relationship between actions and their consequences. Consider a subject trained to press a lever to receive an outcome. If the outcome subsequently becomes equally available with and without a lever press, goal-directed control leads to a decrease in pressing (Dickinson and Balleine, 1994 and Dickinson and Charnock, 1985). The behavioral distinction between goal-directed and habitual control has provided the foundation for a wealth of lesion, inactivation, and pharmacological animal experiments investigating their neural bases. Rodent studies repeatedly highlight a dorsomedial striatum circuit that supports goal-directed behavior (Balleine, 2005, Corbit and Balleine, 2005 and Yin et al., 2005). Related studies show that a circuit centered on dorsolateral striatum supports habit-based behavior (Yin et al., 2004, Yin et al.