We fitted seven RL models to the behavioural PRL data from
Ersche et al. (2011) using hierarchical Bayesian methods, incorporating
parameters that have been studied previously in the RL literature.
For all models, trials were processed in their original sequence across the
PRL task. For each trial, the computational model was informed of the subject's
identity, the subject’s group and drug condition, which stimuli were presented
and where (left or right side of the computer screen), the location (left or
right) of the subject’s response, and whether the trial was rewarded or
unrewarded.
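
To make these per-trial inputs concrete, here is a minimal sketch of a trial record containing exactly the information listed above; the field and function names are hypothetical and not taken from the original analysis code.

```python
# Sketch of the per-trial information supplied to each model.
# Field names are illustrative, not from the original analysis.
from dataclasses import dataclass

@dataclass
class Trial:
    subject_id: str       # the subject's identity
    group: str            # e.g. "HC" (healthy control) vs. patient group
    drug_condition: str   # drug condition for this session
    stim_left: int        # stimulus shown on the left of the screen
    stim_right: int       # stimulus shown on the right of the screen
    chosen_location: str  # "left" or "right" response
    rewarded: bool        # whether the trial was rewarded or unrewarded

def chosen_stimulus(trial: Trial) -> int:
    """The chosen stimulus follows from the chosen location and layout."""
    return trial.stim_left if trial.chosen_location == "left" else trial.stim_right
```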
The top level of the Bayesian hierarchy (Fig. 2) pertained to group and drug: each RL parameter
had a group- and drug-condition-specific distribution. The next level involved
sessions for individual subjects: RL parameters for each subject in a given
(drug) condition were drawn from a normal distribution whose mean was the
group/drug mean (from the level above) and whose variance represented
inter-subject variability for that parameter (implemented as a subject-specific
deviation from the group/drug mean). Through this process, specific RL
parameters were established for a given set of trials and then used to govern
an RL model trained on the sequence of stimuli and reinforcement.
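
As a concrete illustration of this two-level structure, the sketch below forward-simulates a single parameter. The distributional form (subject parameter = group/drug mean plus a normal subject-specific deviation) follows the description above, but the numerical values and names are hypothetical, and the actual analysis performed Bayesian inference over these quantities rather than forward simulation.

```python
# Generative sketch of the two-level hierarchy for one illustrative
# parameter ("reward rate"). All numeric values are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Top level: a group- and drug-condition-specific mean for the parameter.
group_drug_mean = {
    ("HC", "placebo"): 0.30,  # hypothetical value
    ("HC", "drug"): 0.35,     # hypothetical value
}

# Inter-subject variability for this parameter (hypothetical value).
sigma_subject = 0.05

def subject_parameter(group: str, condition: str) -> float:
    """Draw a subject's parameter in a given (drug) condition:
    the group/drug mean plus a subject-specific normal deviation."""
    deviation = rng.normal(0.0, sigma_subject)
    return group_drug_mean[(group, condition)] + deviation

reward_rate = subject_parameter("HC", "placebo")
```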

Fig. 2 Schematic of the Bayesian hierarchy used in our analysis,
illustrated here for a single parameter (reward rate). HC, healthy
controls.

We define $t$ as the trial number, $S_t$ as the stimulus chosen on that trial,
$L_t$ as the location chosen on that trial, and $R_t$ as the reinforcement
delivered on that trial. Each stimulus was assigned an associated
reinforcement-driven value $V$.
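
The update rule for $V$ is specified by the individual models; as one common instance from the RL literature, a Rescorla-Wagner delta rule in this notation might look as follows. The rule itself and the learning-rate name `alpha` are assumptions for illustration, and the paper's seven models may differ in detail.

```python
# Sketch of a reinforcement-driven value update in the notation above.
# The delta rule here is an assumption (standard Rescorla-Wagner),
# not necessarily the rule used by any of the seven fitted models.
import numpy as np

n_stimuli = 2
V = np.zeros(n_stimuli)  # value V for each stimulus
alpha = 0.3              # hypothetical learning rate

def update(S_t: int, R_t: float) -> None:
    """Update the value of the chosen stimulus S_t given reinforcement R_t."""
    V[S_t] += alpha * (R_t - V[S_t])  # prediction-error-driven update
```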