The UK Biobank samples were genotyped on two arrays: the Affymetrix UK BiLEVE Axiom array (50,000 individuals) and the Affymetrix UK Biobank Axiom array (450,000 individuals) (Bycroft et al., 2018). The two arrays share 733,322 autosomal and 20,214 X chromosome variants. Genotypes were phased using SHAPEIT3 (O’Connell et al., 2016), and imputation was performed with IMPUTE4 (Bycroft et al., 2018) using reference data from the Haplotype Reference Consortium and UK10K (Huang et al., 2015; McCarthy et al., 2016). The pseudoautosomal regions (PARs) and the non-pseudoautosomal region (non-PAR) of the X chromosome were phased and imputed independently. Genotyping, haplotype phasing, and imputation have been described previously (Bycroft et al., 2018).
Individuals with putative sex chromosome aneuploidy, inconsistent sex (reported sex did not match genetic sex), or >3% missing genotype array data were removed. Analysis was restricted to individuals of white-European ancestry (N = 459,267), as determined by principal components analysis. Genotyped variants with call rate >95% and minor allele frequency (MAF) > 0.001 that did not deviate from Hardy-Weinberg equilibrium (HWE) were used for whole-genome ridge regression (N = 442,313, Supplementary Table S1) implemented in REGENIE (Mbatchou et al., 2021). Variants with an HWE p-value < 5.0 × 10⁻⁸ were considered out of HWE; for the non-PAR region, HWE was estimated in females, and variants out of HWE were removed in both females and males. Principal components (PCs) of genetic data to include in association analysis were generated using a subset of genotyped markers (N = 144,905) that had been pruned for linkage disequilibrium (LD) (r² > 0.1) (Supplementary Table S1).
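The variant-level filters described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the class and function names are hypothetical, the HWE cutoff is read as an exclusion threshold (variants with p < 5.0 × 10⁻⁸ are dropped), and a 1-degree-of-freedom chi-square test stands in for whatever HWE test was actually applied (exact tests are more common in practice).

```python
import math
from dataclasses import dataclass


@dataclass
class GenotypeCounts:
    """Genotype counts for one biallelic variant (hypothetical container)."""
    hom_ref: int
    het: int
    hom_alt: int
    missing: int


def call_rate(c: GenotypeCounts) -> float:
    """Fraction of samples with a non-missing genotype call."""
    total = c.hom_ref + c.het + c.hom_alt + c.missing
    return (total - c.missing) / total if total else 0.0


def minor_allele_freq(c: GenotypeCounts) -> float:
    """Minor allele frequency among called genotypes."""
    n = c.hom_ref + c.het + c.hom_alt
    if n == 0:
        return 0.0
    alt_freq = (2 * c.hom_alt + c.het) / (2 * n)
    return min(alt_freq, 1.0 - alt_freq)


def hwe_pvalue(c: GenotypeCounts) -> float:
    """Chi-square (1 df) HWE p-value; an assumed stand-in for an exact test."""
    n = c.hom_ref + c.het + c.hom_alt
    if n == 0:
        return 1.0
    p = (2 * c.hom_ref + c.het) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic variant: the test is uninformative
    observed = (c.hom_ref, c.het, c.hom_alt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_variant_qc(
    c: GenotypeCounts,
    min_call_rate: float = 0.95,
    hwe_threshold: float = 5.0e-8,
    min_maf: float = 0.001,
) -> bool:
    """Apply the three filters: call rate, HWE, and MAF."""
    return (
        call_rate(c) > min_call_rate
        and hwe_pvalue(c) >= hwe_threshold
        and minor_allele_freq(c) > min_maf
    )
```

For non-PAR X chromosome variants, the counts passed to `hwe_pvalue` would come from females only, and a variant failing the test would then be dropped for both sexes, as described above.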
SNVs on the X chromosome obtained from the genotype arrays that did not deviate from HWE and had a MAF > 0.001 (1,207 PAR and 11,653 non-PAR) were analyzed. Imputed X chromosome SNVs and insertions/deletions (InDels) with a MAF > 0.001 and INFO score ≥ 0.3 (45,519 PAR and 2,050,039 non-PAR) were also analyzed. To estimate the contribution of common variants on the X chromosome to the heritability of ARHL, we analyzed variants obtained from the genotype array data with MAF > 0.01 in the non-PAR region (N = 18,773).
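As a rough illustration of the imputed-variant selection, the sketch below applies the MAF and INFO score thresholds and tallies the retained variants by region. The record type, its field names, and the assumption that each variant already carries a "PAR" or "non-PAR" label are hypothetical; in practice these values would be read from the imputation summary files.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ImputedVariant:
    """One imputed X chromosome variant (hypothetical record layout)."""
    variant_id: str
    region: str  # assumed to be pre-labelled as "PAR" or "non-PAR"
    maf: float
    info: float


def keep_imputed(v: ImputedVariant, min_maf: float = 0.001,
                 min_info: float = 0.3) -> bool:
    """Retain variants with MAF > 0.001 and INFO score >= 0.3."""
    return v.maf > min_maf and v.info >= min_info


def count_by_region(variants: Iterable[ImputedVariant]) -> Counter:
    """Count retained variants separately for PAR and non-PAR."""
    return Counter(v.region for v in variants if keep_imputed(v))
```

Calling `count_by_region` on the full set of imputed X chromosome variants would give per-region totals comparable to the PAR and non-PAR counts reported above.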