Computational studies of naturalistic behaviors show that the act of acquiring information—whether it is overt
or remains internal to the brain—may indeed have material value, as it increases the chance of success of a future action (Tatler et al., 2011). However, these studies also show that the processes required to compute information value differ markedly from those considered so far in decision tasks. A salient property of this computation is that information value depends critically on the subjects’ uncertainty and, in the Rescorla-Wagner equation, is more closely related to the right side of the equation, that is, to the act of learning or modifying expectations. As a simple illustration of this distinction, consider again the tea-making task in Figure 2B. To prepare
and consume her tea, the subject must make both arm and leg actions, and in the reinforcement equation both actions would be assigned a high value term (V). The subject’s gaze, however, is allocated very selectively to the targets of the arm actions and not to those of the leg actions. This selectivity cannot be explained in terms of action value alone but reflects the fact that the arm movements have higher uncertainty and thus have more to gain from new information. Thus, the drive that motivates a shift of gaze is not value per se but the need to learn, i.e., to update one’s predictions through new information.

Independent support for a view of attention as a learning mechanism comes from an area of research that has developed largely apart from the oculomotor field (but see Le Pelley, 2010) yet has directly addressed
the cognitive aspects of information selection, namely the question of how subjects learn from and about sensory cues (Pearce and Mackintosh, 2010). A central finding emerging from these studies is that subjects estimate the reliability of a sensory stimulus based on their prior experience with that stimulus and use this knowledge to modulate their future learning based on that cue. In the Rescorla-Wagner equation this process is implemented using an associability parameter, α, which is a stimulus-specific learning rate (Pearce and Mackintosh, 2010):

$$V_t = V_{t-1} + \alpha \, \beta \, \delta \qquad \text{(Equation 2)}$$

Whereas, as we have seen above, the standard learning rate β is applied globally to a context or task, associability is a property of an individual cue and can differentially weight the available cues. As I discuss in detail in the following sections, this apparently simple modification entails a complex, hierarchical learning mechanism: an executive process which, having previously learned the predictive validity of a sensory cue, guides moment-by-moment information selection and has, in effect, learned how to learn.

A final line of evidence for the information-bound nature of eye movement control comes from single-neuron studies of target selection that dissociate shifts of attention from overt shifts of gaze (Gottlieb and Balan, 2010).
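
To make the role of associability in Equation 2 concrete, the following is a minimal sketch rather than a model taken from the studies cited above. It assumes that δ is the usual prediction error (obtained reward minus the summed value of the cues present) and that α is fixed per cue, whereas in the account discussed here associability is itself learned from experience. The function names, cue names, and parameter values are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def rw_update(
    values: Dict[str, float],
    alphas: Dict[str, float],
    present: Iterable[str],
    reward: float,
    beta: float = 0.2,
) -> Tuple[Dict[str, float], float]:
    """Apply one step of V_t = V_{t-1} + alpha * beta * delta to each cue present.

    Assumption (not stated in the text): delta is the prediction error,
    i.e. the obtained reward minus the summed value of the cues present.
    """
    cues = list(present)
    delta = reward - sum(values[c] for c in cues)
    updated = dict(values)
    for c in cues:
        # alpha is cue-specific (associability); beta is the global learning rate.
        updated[c] = values[c] + alphas[c] * beta * delta
    return updated, delta


def simulate(
    alphas: Dict[str, float],
    n_trials: int = 10,
    reward: float = 1.0,
    beta: float = 0.2,
) -> Dict[str, float]:
    """Present each cue alone on every trial, always followed by the same reward."""
    values = {cue: 0.0 for cue in alphas}
    for _ in range(n_trials):
        for cue in alphas:
            values, _delta = rw_update(values, alphas, [cue], reward, beta)
    return values


# Hypothetical cues: one with high associability, one with low.
example_alphas = {"high_associability_cue": 1.0, "low_associability_cue": 0.1}
print(simulate(example_alphas))
```

Because both cues share the same β and are followed by the same reward, any difference in their learned values after a fixed number of trials is due to α alone, which is the sense in which associability differentially weights the available cues.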