For the extended BRH approach, we downloaded the genome assemblies and gene annotations of the plant species
Arabidopsis lyrata, Arabidopsis thaliana, Carica papaya, Oryza sativa and
Solanum tuberosum from Phytozome (16), and the animal species
Homo sapiens, Gallus gallus and
Mus musculus from Ensembl (17) (cf. Text S3).
Additionally, we downloaded the genome assembly and gene annotation of
Nicotiana benthamiana v0.4.4 (18) from
ftp://ftp.solgenomics.net/genomes/Nicotiana_benthamiana/assemblies/ for mapping RNA-seq data and assessing GeMoMa predictions.
For all analyses, we discarded gene models from the given annotation that lacked a start or stop codon, contained premature stop codon(s), or contained ambiguous nucleotide(s). In addition, we used only one representative gene model if several gene models of a gene had the same CDS, i.e., differed only in their UTRs.
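The filtering criteria above can be illustrated with a minimal Python sketch. It is not the code used in this study. It assumes that each gene model is given as a hypothetical tuple (gene ID, transcript ID, spliced CDS on the coding strand), that ATG is the only accepted start codon, that the stop codons are those of the standard genetic code, and that the first gene model encountered serves as the representative for a given CDS. The function names are illustrative only.

```python
# Assumptions (not stated in the text): canonical ATG start codon and
# the stop codons of the standard genetic code.
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_valid_cds(cds):
    """Return True if the CDS passes the filters described in the text."""
    cds = cds.upper()
    # Discard CDSs containing ambiguous nucleotides (anything but A, C, G, T).
    if set(cds) - set("ACGT"):
        return False
    # Split into in-frame codons; trailing 1-2 nucleotides are ignored here.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    if len(codons) < 2:
        return False
    # Discard CDSs with a missing start or stop codon.
    if codons[0] != START_CODON or codons[-1] not in STOP_CODONS:
        return False
    # Discard CDSs with premature (internal, in-frame) stop codons.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


def filter_gene_models(models):
    """Keep valid gene models, one representative per (gene, CDS) pair.

    `models` is an iterable of (gene_id, transcript_id, cds) tuples.
    The first model seen for a given gene and CDS is kept (an assumption).
    """
    seen = set()
    kept = []
    for gene_id, transcript_id, cds in models:
        if not is_valid_cds(cds):
            continue
        key = (gene_id, cds.upper())
        if key in seen:
            continue  # same CDS as an earlier model, differs only in UTRs
        seen.add(key)
        kept.append((gene_id, transcript_id, cds))
    return kept
```

Keying the duplicate check on the pair of gene ID and CDS means that identical coding sequences from different genes are both retained, which matches the per-gene wording of the criterion.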