Paired IRRs originating from expanded STRs may align to other genomic locations, especially if the STR is short in the reference genome at the target location. We refer to the loci where IRRs may misalign as off-target regions. Identifying off-target regions lets us restrict the search for IRRs to a few regions instead of the whole genome. To obtain off-target regions for the C9orf72 repeat, we searched the 182 samples in cohort one that had an expanded repeat according to the original RP-PCR results and identified all GGGGCC IRRs. The search covered the whole genome and selected read pairs with a mapping quality of zero (MAPQ = 0) and a weighted purity score of at least 0.9. The mapping positions of all identified IRRs were merged if they were closer than 500 bp. The 29 resulting loci that were present in five or more samples were designated as off-target regions (Supplemental Fig. 4) and were used to find additional reads from the C9orf72 repeat expansion.
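The merge-and-filter step can be illustrated with a minimal Python sketch. It is not the code used in the study. It assumes the IRR hits have already been filtered for MAPQ = 0 and a weighted purity score of at least 0.9, and that each hit is reduced to a single mapping position. The function name `find_off_target_regions`, the `(sample_id, chrom, pos)` input layout, and the choice to measure distance from the last position in a locus are all hypothetical choices made for illustration.

```python
def find_off_target_regions(irr_hits, max_gap=500, min_samples=5):
    """Merge nearby IRR mapping positions and keep loci seen in enough samples.

    irr_hits: iterable of (sample_id, chrom, pos) tuples (assumed layout),
              already filtered for MAPQ = 0 and purity >= 0.9.
    Returns a list of (chrom, start, end, n_samples) tuples.
    """
    loci = []
    # Sort by chromosome, then position, so neighbours are adjacent.
    for sample, chrom, pos in sorted(irr_hits, key=lambda h: (h[1], h[2])):
        last = loci[-1] if loci else None
        # Extend the current locus if the hit is closer than max_gap bp.
        if last and last["chrom"] == chrom and pos - last["end"] < max_gap:
            last["end"] = pos
            last["samples"].add(sample)
        else:
            loci.append(
                {"chrom": chrom, "start": pos, "end": pos, "samples": {sample}}
            )
    # Keep only loci supported by at least min_samples distinct samples.
    return [
        (l["chrom"], l["start"], l["end"], len(l["samples"]))
        for l in loci
        if len(l["samples"]) >= min_samples
    ]


# Toy input: one locus shared by five samples, plus sparse hits elsewhere.
hits = [
    ("s1", "chr1", 1000), ("s2", "chr1", 1100), ("s3", "chr1", 1200),
    ("s4", "chr1", 1300), ("s5", "chr1", 1400),
    ("s1", "chr1", 2000),                        # 600 bp away, one sample
    ("s1", "chr2", 5000), ("s2", "chr2", 5100),  # only two samples
]
print(find_off_target_regions(hits))
# [('chr1', 1000, 1400, 5)]
```

In this toy example only the chr1 locus at 1000 to 1400 survives, because it is the only merged locus supported by five distinct samples.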