Sequence data for all animals were first filtered to retain only the 48,455 variants in common between the sequence and the 50 K Illumina BovineSNP50 BeadChip (Illumina Inc., San Diego, USA). This was done to mimic the scenario in which 50 K genotypes are imputed to sequence in two steps, so that clustering analysis can be performed on the 50 K genotypes before imputation. Three clustering methods were explored for imputation from 50 K to 777 K: 1) ADMIXTURE, 2) PLINK using genotypes, and 3) PLINK using reconstructed haplotypes. Each was compared to imputation using all animals as a single population. The method that provided the highest overall accuracy of imputation from 50 K to 777 K was then selected and compared, for imputation from 777 K to sequence, with imputation using all individuals.
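The filtering step can be illustrated with a minimal sketch. It assumes variants are matched between sequence and chip by chromosome and position; the tuple layout, the function name `filter_to_chip`, and the toy data are hypothetical and are not taken from the original analysis, which may have used dedicated tools for this step.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical record layout: (chromosome, position, genotype per animal)
Variant = Tuple[str, int, List[int]]


def filter_to_chip(sequence_variants: Iterable[Variant],
                   chip_sites: Set[Tuple[str, int]]) -> List[Variant]:
    """Keep only sequence variants whose (chromosome, position) is on the chip."""
    return [v for v in sequence_variants if (v[0], v[1]) in chip_sites]


# Toy example: three sequence variants, two of which are also chip sites.
sequence_variants = [
    ("1", 1000, [0, 1, 2]),
    ("1", 1500, [1, 1, 0]),
    ("2", 300, [2, 2, 1]),
]
chip_sites = {("1", 1000), ("2", 300), ("3", 42)}

reduced = filter_to_chip(sequence_variants, chip_sites)
for chrom, pos, genotypes in reduced:
    print(chrom, pos, genotypes)
```

In the study, the equivalent reduced set would hold the 48,455 shared variants, and the clustering methods would then be run on that reduced genotype set.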