For the extended BRH approach, we downloaded the genome assemblies and gene annotations of the plant species Arabidopsis lyrata, Arabidopsis thaliana, Carica papaya, Oryza sativa and Solanum tuberosum from Phytozome (16), and of the animal species Homo sapiens, Gallus gallus and Mus musculus from Ensembl (17) (cf. Text S3).
Additionally, we downloaded the genome assembly and gene annotation of Nicotiana benthamiana v0.4.4 (18) from ftp://ftp.solgenomics.net/genomes/Nicotiana_benthamiana/assemblies/ for mapping RNA-seq data and assessing GeMoMa predictions.
For all analyses, we discarded gene models from the given annotation that had a missing start or stop codon, premature stop codon(s) or ambiguous nucleotide(s). In addition, we used only one representative gene model if several gene models of a gene had the same CDS, i.e. differed only in their UTRs.
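The filtering criteria above can be illustrated with a minimal Python sketch. It is not the code used in this study. It assumes that the spliced CDS of each gene model has already been extracted as a string on the coding strand, that the standard genetic code applies (ATG as start codon; TAA, TAG and TGA as stop codons), and that a CDS whose length is not a multiple of three is treated as lacking an in-frame stop codon. The names `GeneModel`, `is_valid_cds` and `filter_gene_models` are hypothetical, and keeping the first model encountered as the representative is likewise an assumption.

```python
from typing import Iterable, List, NamedTuple

# Assumption: standard genetic code.
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


class GeneModel(NamedTuple):
    """Hypothetical container for one annotated transcript."""
    gene_id: str
    transcript_id: str
    cds: str  # spliced coding sequence, 5'->3' on the coding strand


def is_valid_cds(cds: str) -> bool:
    """Return True if the CDS passes all four filtering criteria."""
    cds = cds.upper()
    # Ambiguous nucleotides: anything other than A, C, G, T.
    if set(cds) - set("ACGT"):
        return False
    # Assumption: an incomplete final codon counts as a missing stop codon.
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if codons[0] != START_CODON:        # missing start codon
        return False
    if codons[-1] not in STOP_CODONS:   # missing stop codon
        return False
    # Premature stop codon: any in-frame stop before the last codon.
    if any(codon in STOP_CODONS for codon in codons[1:-1]):
        return False
    return True


def filter_gene_models(models: Iterable[GeneModel]) -> List[GeneModel]:
    """Drop invalid models and keep one model per (gene, CDS) pair."""
    seen = set()
    kept = []
    for model in models:
        cds = model.cds.upper()
        if not is_valid_cds(cds):
            continue
        key = (model.gene_id, cds)
        if key in seen:
            # Same gene, identical CDS: the models differ only in their UTRs.
            continue
        seen.add(key)
        kept.append(model)
    return kept
```

In this sketch, two transcripts of the same gene with identical CDS collapse to the first one encountered, whereas identical CDS in different genes are both retained.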