Whole-genome sequencing (WGS) to a depth of 30X coverage was performed on an Illumina HiSeq X sequencer for N=126 IMPACT and N=120 POISED samples. Samples for WGS were prepared according to the Illumina TruSeq Nano DNA or TruSeq DNA PCR-free library preparation guides. Assembly of each individual genome was performed using the Isaac aligner (9). The DRAGEN Germline Small Variant Caller was used to call both SNVs and small indels and to produce a genomic variant call format (gVCF) file that includes variants along with quality metrics. We performed sample-level QC and dropped two samples from the POISED dataset because of sex inconsistencies. We also filtered out variants with GQX < 30, DP < 7, SNP hard quality < 10.41, low depth (DP <= 1), or a ploidy conflict, as well as variants not meeting thresholds for median base quality of alternate reads and likelihood.
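The numeric variant filters listed above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the record layout (a dictionary with the keys `gqx`, `dp`, `qual`, `is_snv`, and `ploidy_conflict`) and the function names are hypothetical, and the filters on median base quality of alternate reads and on likelihood are omitted because their thresholds are not stated.

```python
from typing import Dict, List

# Thresholds taken from the text; everything else here is an assumption.
GQX_MIN = 30         # variants with GQX < 30 are removed
DP_MIN = 7           # variants with DP < 7 are removed
SNP_QUAL_MIN = 10.41  # SNVs with hard quality < 10.41 are removed


def failed_filters(variant: Dict) -> List[str]:
    """Return the names of the filters that a variant record fails.

    `variant` is a hypothetical record, not a real gVCF parser object.
    """
    reasons = []
    if variant["gqx"] < GQX_MIN:
        reasons.append("low_gqx")
    # In this simplified sketch, which uses a single depth field,
    # the low-depth filter (DP <= 1) is subsumed by DP < 7.
    if variant["dp"] < DP_MIN:
        reasons.append("low_dp")
    # The hard quality threshold is applied to SNVs only.
    if variant["is_snv"] and variant["qual"] < SNP_QUAL_MIN:
        reasons.append("low_snp_qual")
    if variant.get("ploidy_conflict", False):
        reasons.append("ploidy_conflict")
    return reasons


def passes_qc(variant: Dict) -> bool:
    """A variant is kept only if it fails none of the filters."""
    return not failed_filters(variant)
```

For example, a record with `gqx=45`, `dp=28`, `qual=60.0`, and `is_snv=True` passes, whereas the same record with `dp=5` is removed for low depth.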
We used the WGS data to impute HLA alleles with the HISAT-genotype software, which uses the HISAT2 (hierarchical indexing for spliced alignment of transcripts 2) alignment system to align DNA sequences using a graph Ferragina-Manzini index (10). Imputation quality, measured as call rate, was high for the HLA-DQA1*01:02 allele in both IMPACT (98.81%) and POISED (98.75%).
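As an illustration of the call-rate metric, the sketch below computes the percentage of non-missing calls in a list. The text does not define the denominator, so the assumption here is one entry per sample, with `None` marking a sample for which no call was made; the function name and the toy data are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence


def call_rate(calls: Sequence[Optional[str]]) -> float:
    """Percentage of entries with a non-missing call (None = missing)."""
    if not calls:
        raise ValueError("call rate is undefined for an empty list")
    called = sum(1 for call in calls if call is not None)
    return 100.0 * called / len(calls)


# Toy data: four hypothetical samples, one of which has no call.
toy_calls = ["DQA1*01:02", "DQA1*05:01", None, "DQA1*01:02"]
toy_rate = call_rate(toy_calls)  # 3 of 4 called
```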