Two gold standard datasets are used in this paper. The gold standard dataset used in Table 1 consists of whole exome sequenced (WES) data from 1782 AfAm and 2830 EuAm samples. The WES dataset was sequenced at 80–100× coverage and aligned with the Mercury pipeline [21] in single-sample mode. The second gold standard dataset, which we refer to as cSNP, consists of 3533 samples genotyped with the HumanExome BeadChip v1.0 (Illumina, Inc., San Diego, CA), which queries 247,870 variable sites, using standard protocols suggested by the manufacturer at the University of Texas Health Science Center at Houston [38]. The cSNP dataset contains 1683 EuAm samples and 1850 AfAm samples. All true negative sites with missing genotype data are removed from the gold standard. For the sensitivity calculations, all sites shared with the WGS dataset that have more than 5 % missing genotypes are also removed from further analysis.
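The 5 % missingness filter applied before the sensitivity calculations can be illustrated with a minimal sketch. This is not the pipeline used in the paper. The function name `filter_sites_by_missingness`, the dictionary layout, and the use of `None` to mark a missing genotype are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical representation: site ID -> one genotype call per sample,
# with None standing in for a missing genotype (an assumption of this sketch).
Genotypes = Dict[str, List[Optional[str]]]


def missing_fraction(calls: List[Optional[str]]) -> float:
    """Return the fraction of samples with a missing genotype at one site."""
    if not calls:
        # A site with no calls carries no information; treat it as fully missing.
        return 1.0
    missing = sum(1 for call in calls if call is None)
    return missing / len(calls)


def filter_sites_by_missingness(
    genotypes: Genotypes, max_missing: float = 0.05
) -> Genotypes:
    """Keep only sites whose missing-genotype fraction does not exceed max_missing.

    Sites with strictly more than max_missing (default 5 %) missing calls
    are dropped, matching the "greater than 5 %" wording in the text.
    """
    return {
        site: calls
        for site, calls in genotypes.items()
        if missing_fraction(calls) <= max_missing
    }


# Toy data, invented for this example only.
example: Genotypes = {
    "site_a": ["0/1"] * 20,            # 0 % missing: kept
    "site_b": ["0/0"] * 19 + [None],   # 5 % missing: kept (not greater than 5 %)
    "site_c": ["1/1"] * 9 + [None],    # 10 % missing: removed
}
kept = filter_sites_by_missingness(example)
```

In this toy input, `site_c` is the only site above the threshold. A site at exactly 5 % is retained because the text removes only sites with more than 5 % missing genotypes.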