We downloaded 139 genome sequences with coding gene annotation files, spanning Coleoptera, Diptera, Hemiptera, Hymenoptera, and Lepidoptera, from the National Center for Biotechnology Information [31], InsectBase [32], VectorBase [33], Fireflybase [34], Ensembl Genomes [35], and GigaDB [36] to allow for more in-depth analysis (Table S1). The corresponding coding genes were identified from each genome sequence and its annotation file. We filtered out species with low-quality genomes using the scaffold N50 statistic, which is positively related to genome quality (larger values are better): species whose genome assemblies had a scaffold N50 < 400 Kb were eliminated. When a protein-coding gene had multiple alternative splicing variants, the longest transcript was chosen. We selected 50 insect species with annotation files, 27 of which were verified by literature references as herbivorous and used as positive samples. The remaining 23 species have been shown in the literature not to feed mainly on plants and were therefore used as examples of non-herbivorous insects (Table S2).
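The two filtering rules described above (the scaffold N50 cutoff and the choice of the longest transcript per gene) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the in-memory input format, and the reading of 400 Kb as 400,000 bp are assumptions made for illustration, and the N50 computation follows the standard definition (the length of the scaffold at which the cumulative sum of scaffold lengths, sorted from longest to shortest, first reaches half of the total assembly length).

```python
from typing import Dict, Iterable, Tuple

# Cutoff stated in the text; reading 400 Kb as 400,000 bp is an assumption.
N50_THRESHOLD = 400_000


def scaffold_n50(lengths: Iterable[int]) -> int:
    """Return the scaffold N50 of an assembly given its scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


def passes_quality_filter(lengths: Iterable[int],
                          threshold: int = N50_THRESHOLD) -> bool:
    """Keep a species only if its scaffold N50 is at least the threshold."""
    return scaffold_n50(lengths) >= threshold


def longest_transcripts(
    records: Iterable[Tuple[str, str, int]]
) -> Dict[str, str]:
    """Map each gene to its longest transcript.

    `records` is a hypothetical input format: (gene_id, transcript_id,
    transcript_length) tuples, e.g. parsed from an annotation file.
    Ties keep the first transcript seen.
    """
    best: Dict[str, Tuple[str, int]] = {}
    for gene_id, transcript_id, length in records:
        if gene_id not in best or length > best[gene_id][1]:
            best[gene_id] = (transcript_id, length)
    return {gene: tx for gene, (tx, _) in best.items()}
```

In practice the scaffold lengths would come from the assembly FASTA and the transcript records from the annotation file; the parsing step is omitted here because the source does not describe the file formats or tools used.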