For each of the four species we constructed simulated datasets consisting of 100 replicates at each of the following sample sizes: 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 75 and 100 individuals. Each replicate was a random subset of individuals from the empirical dataset. Replicates were created using an Excel macro that assigned each individual in the empirical dataset a random number (between 1 and 10,000), sorted the dataset by these random numbers, and then copied the first 5 (or 10, 15, etc., depending on the sample size category) individuals to a new worksheet. This procedure was repeated 100 times, producing 100 simulated ‘populations’ at each sample size, each an independent random subsample of the empirical dataset. Sampling was done without replacement within a replicate, so no individual was present more than once in the same replicate (as in a real population genetic dataset); however, because replicates were drawn independently of each other, the same individual could be present in more than one replicate at a given sample size. GenAlEx 6.2 [9] was then used to calculate allele frequencies, expected heterozygosity under Hardy-Weinberg equilibrium (HE) and pairwise FST between the simulated and empirical datasets for each replicate at each sample size. By ‘empirical dataset’ we mean the real dataset of 547 ant, 107 squirrel, 616 albatross or 98 kakī individuals (see below for dataset details). Because the kakī dataset comprised fewer than 100 individuals, the largest sample size assessed for this species was 75 individuals. Throughout this paper, ‘individuals’ means diploid individuals.
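The resampling procedure described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the Excel macro used in the study: the function names (`draw_replicate`, `simulate`, `expected_heterozygosity`), the seed, and the representation of genotypes as pairs of allele labels are all assumptions made for the example. The heterozygosity function uses the standard single-locus formula HE = 1 − Σp_i², which is assumed here to correspond to the value reported by GenAlEx; FST is not reproduced.

```python
import random
from collections import Counter

# Sample sizes and replicate count as stated in the text.
SAMPLE_SIZES = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100]
N_REPLICATES = 100


def draw_replicate(individuals, k, rng):
    """Draw k individuals without replacement by sorting on random keys.

    Mirrors the described macro: give every individual a random number
    between 1 and 10,000, sort by it, and keep the first k. Tied keys keep
    their original order because Python's sort is stable.
    """
    keys = [rng.randint(1, 10_000) for _ in individuals]
    order = sorted(range(len(individuals)), key=keys.__getitem__)
    return [individuals[i] for i in order[:k]]


def simulate(individuals, sizes=SAMPLE_SIZES, n_reps=N_REPLICATES, seed=0):
    """Return {sample size: list of n_reps independent replicates}.

    Sample sizes larger than the dataset are skipped, so a dataset of 98
    individuals is assessed up to a sample size of 75.
    """
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    usable = [k for k in sizes if k <= len(individuals)]
    return {
        k: [draw_replicate(individuals, k, rng) for _ in range(n_reps)]
        for k in usable
    }


def expected_heterozygosity(genotypes):
    """HE = 1 - sum(p_i^2) at one locus.

    `genotypes` is a list of (allele, allele) pairs, one per diploid
    individual (an assumed data layout).
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


if __name__ == "__main__":
    # Hypothetical individual IDs standing in for a 98-individual dataset.
    ids = [f"ind{i:03d}" for i in range(98)]
    replicates = simulate(ids)
    print(sorted(replicates))       # sample sizes actually simulated
    print(len(replicates[5]))       # number of replicates at n = 5
```

Sorting on random keys is kept here only to parallel the described macro; `random.sample(individuals, k)` would be the more idiomatic way to draw without replacement and avoids the small ordering bias introduced by tied keys.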