We calculated StrucDiff for every base in every transcript between each pair of individuals: GM12891 and GM12892, GM12891 and GM12878, and GM12892 and GM12878. To identify RiboSNitches, we downloaded SNV annotations from the HapMap project²² and converted them from the hg18 assembly to the hg19 assembly using the UCSC LiftOver executable. We then overlaid the hg19 SNV coordinates with our transcriptome annotation, a non-redundant combination of the RefSeq and Gencode v12 transcriptome assemblies, to identify the transcriptome positions that carry SNVs. For high-confidence detection of structural changes, we required dense sequencing coverage around each SNV, such that (1) the SNV is located on a transcript whose average coverage is greater than 1 (on average, one read per base); and (2) the average coverage in a 5-base window centered on the SNV is greater than 10 (average S1+V1≥5). We excluded bases within 100 nucleotides of the 3′ end of every transcript because of the 100-nucleotide blind tail.
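The coverage criteria above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function name `passes_coverage_filter`, the 0-based position convention, the representation of coverage as a per-base list of combined S1+V1 read counts, and the truncation of the window at transcript boundaries are all assumptions made for illustration. Only the thresholds (transcript mean above 1, 5-base window mean above 10, 100-nucleotide blind tail) come from the text.

```python
from statistics import fmean


def passes_coverage_filter(coverage, snv_pos, min_transcript_cov=1.0,
                           min_window_cov=10.0, window=5, blind_tail=100):
    """Return True if an SNV meets the coverage criteria described in the text.

    coverage : per-base read counts for one transcript (assumed to be S1+V1)
    snv_pos  : 0-based index of the SNV within the transcript (assumption)
    """
    n = len(coverage)
    if not 0 <= snv_pos < n:
        raise IndexError("snv_pos lies outside the transcript")

    # Exclude bases that fall in the blind tail at the 3' end.
    if snv_pos >= n - blind_tail:
        return False

    # Criterion (1): average coverage of the whole transcript above 1.
    if fmean(coverage) <= min_transcript_cov:
        return False

    # Criterion (2): average coverage in a window centered on the SNV above 10.
    # The window is truncated at the transcript start (an assumption).
    half = window // 2
    lo = max(0, snv_pos - half)
    hi = min(n, snv_pos + half + 1)
    return fmean(coverage[lo:hi]) > min_window_cov
```

For example, on a hypothetical 300-base transcript with two reads per base and twenty reads per base around position 150, the SNV at position 150 passes, whereas an SNV at position 250 is rejected because it lies in the blind tail.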
To identify SNVs with statistically significant changes in structure, we estimated a global baseline of structural change by calculating the fold differences between the doping control and SNV cumulative frequencies. We calculated a z-score for each detected SNV: z = (StrucDiff − mean) / (SD of doped-in controls). We used the Tetrahymena ribozyme as the doped-in control. We noticed that a StrucDiff ≥ 1 is equivalent to a z-score ≥ 4.5 and a 100-fold difference between the SNV and doping control cumulative frequencies. To calculate the p-value for the structural change at each detected SNV, we performed 1000 permutations of the absolute values of the non-zero delta PARS scores within each transcript that contains an SNV. This p-value estimates the likelihood that a 5-base average of the permuted PARS structural change is greater than the 5-base average of the structural change at the SNV base. The false discovery rate (FDR) for the significance of the structural change at each SNV site was estimated by multiple-hypothesis testing correction using the p.adjust function in R. An SNV is defined as a RiboSNitch if (1) its StrucDiff is greater than 1 (equivalent to a z-score ≥ 4.5 and a 100-fold cumulative frequency difference); (2) its p-value is less than 0.05 and its FDR is less than 0.1; and (3) its local read coverage is greater than 10 and at least 3 out of 11 bases contain S1 or V1 signals in an 11-base sliding window centered on the SNV site. We also permuted the structural changes among the trio by shuffling the StrucDiff values within every transcript. After the structural PARS scores were permuted, we identified only 16 RiboSNitches using exactly the same methods and thresholds. This number is less than 1% of the original number of RiboSNitches found, indicating that most of the discovered RiboSNitches are not due to random noise.
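The statistical steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the code used in this study. All function names are hypothetical. The z-score assumes that the mean, like the SD, is taken over the doped-in control values. The permutation test assumes that the observed statistic is the mean absolute delta PARS score in the 5-base window around the SNV, compared against the first five values of each shuffle. The FDR step assumes the Benjamini-Hochberg method, since the text does not state which `p.adjust` method was used.

```python
import random
from statistics import fmean, mean, stdev


def z_score(strucdiff, control_values):
    """z = (StrucDiff - mean) / SD, with mean and SD taken over the
    doped-in control values (taking the mean over controls is an assumption)."""
    return (strucdiff - mean(control_values)) / stdev(control_values)


def permutation_p_value(delta_pars, snv_pos, n_perm=1000, window=5, seed=0):
    """Fraction of permutations whose 5-base average exceeds the observed one."""
    abs_delta = [abs(d) for d in delta_pars]
    half = window // 2
    lo, hi = max(0, snv_pos - half), min(len(abs_delta), snv_pos + half + 1)
    observed = fmean(abs_delta[lo:hi])

    # Permute only the non-zero absolute delta PARS scores of the transcript.
    pool = [d for d in abs_delta if d != 0]
    if len(pool) < window:
        raise ValueError("too few non-zero delta PARS scores to permute")

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pool)
        if fmean(pool[:window]) > observed:
            exceed += 1
    return exceed / n_perm


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed p.adjust method)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, i in enumerate(order):
        rank = m - k  # 1-based rank of this p-value in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_ribosnitch(strucdiff, p_value, fdr, local_coverage, bases_with_signal):
    """Apply the three RiboSNitch criteria with the thresholds from the text."""
    return (strucdiff > 1
            and p_value < 0.05 and fdr < 0.1
            and local_coverage > 10 and bases_with_signal >= 3)
```

In this sketch, `bases_with_signal` is the number of bases with S1 or V1 signal in the 11-base window around the SNV, and `bh_adjust` would be applied once to the p-values of all tested SNVs before calling `is_ribosnitch` on each.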