To test for an association between overall piRNA or KRAB-ZFP pathway activity and genome size, we first compiled male and female gonad RNA-Seq datasets for vertebrates of diverse genome sizes. These included P. ornatum (ornate burrowing frog), Gallus gallus (chicken), D. rerio (zebrafish), Xenopus tropicalis (Western clawed frog), A. carolinensis (green anole), Mus musculus (mouse), Geotrypetes seraphini (Gaboon caecilian), Rhinatrema bivittatum (two-lined caecilian), and Caecilia tentaculata (bearded caecilian), spanning genome sizes from 1.0 to 5.5 Gb, and P. waltl (the Iberian ribbed newt), A. mexicanum (the Mexican axolotl), C. orientalis (the fire-bellied newt), P. annectens, and P. aethiopicus (African and marbled lungfishes), spanning genome sizes from 20 to ∼130 Gb (Supplementary Files S8, S9). We performed de novo assemblies on all obtained datasets using the same pipeline as for R. sibiricus.
We identified transcripts of 21 genes receiving a direct annotation of piRNA processing in vertebrates in the Gene Ontology knowledgebase that were present in the majority of our target species: ASZ1, BTBD18 (BTBDI), DDX4, EXD1, FKBP6, GPAT2, HENMT1 (HENMT), MAEL, MOV10L1 (M10L1), PIWIL1, PIWIL2, PIWIL4, PLD6, TDRD1, TDRD5, TDRD6, TDRD7, TDRD9, TDRD12 (TDR12), TDRD15 (TDR15), and TDRKH. In addition, we identified transcripts of 14 genes encoding proteins that create a transcriptionally repressive chromatin environment in response to recruitment by PIWI proteins or KRAB-ZFP proteins, 12 of which received a direct annotation of NuRD complex in the Gene Ontology knowledgebase and 2 of which were taken from the literature: CBX5, CHD3, CHD4, CSNK2A1 (CSK21), DNMT1, GATAD2A (P66A), MBD3, MTA1, MTA2, RBBP4, RBBP7, SALL1, SETDB1 (SETB1), and ZBTB7A (ZBT7A) (Ecco et al., 2017; Wang et al., 2023). Finally, we identified TRIM28, which bridges this repressive complex to TE-bound KRAB-ZFP proteins in tetrapods, lungfishes, and coelacanths (Ecco et al., 2017). For comparison, we identified transcripts of 14 protein-coding genes receiving a direct annotation of miRNA processing in vertebrates in the Gene Ontology knowledgebase, which we did not predict to differ in expression based on genome size: ADAR (DSRAD), AGO1, AGO2, AGO3, AGO4, DICER1, NUP155 (NU155), PUM1, PUM2, SNIP1, SPOUT1 (CI114), TARBP2 (TRBP2), TRIM71 (LIN41), and ZC3H7B. Expression levels for each transcript in each individual were measured with Salmon (Patro et al., 2017) (Supplementary File S10).
As a proxy for overall piRNA silencing activity, for each individual, we calculated the ratio of total piRNA pathway expression (summed TPM of 21 genes) to total miRNA pathway expression (summed TPM of 14 genes). As a proxy for transcriptional repression driven by both the piRNA pathway and KRAB-ZFP binding activity, we calculated the ratio of total transcriptional repression machinery expression (summed TPM of 14 genes) to total miRNA pathway expression. Finally, we calculated the ratio of TRIM28 expression to total miRNA pathway expression for each individual. We also calculated these ratios with a more conservative dataset allowing for no missing genes; this yielded 15 piRNA pathway genes, 9 KRAB-ZFP genes, and 13 miRNA genes. We plotted these ratios to reveal any relationship between TE silencing pathway expression and genome size.
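The ratio calculation described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the toy TPM values, and the choice to count a gene missing from an individual as zero TPM are all assumptions made for illustration. The helper `genes_in_all` mimics the more conservative dataset by keeping only genes detected in every individual.

```python
from typing import Dict, Iterable, List


def summed_tpm(tpm: Dict[str, float], genes: Iterable[str]) -> float:
    """Sum TPM over the listed genes.

    Assumption: a gene absent from this individual contributes 0.
    """
    return sum(tpm.get(gene, 0.0) for gene in genes)


def pathway_ratio(
    tpm: Dict[str, float],
    pathway_genes: Iterable[str],
    reference_genes: Iterable[str],
) -> float:
    """Ratio of summed pathway TPM to summed reference (e.g. miRNA) TPM."""
    reference_total = summed_tpm(tpm, reference_genes)
    if reference_total <= 0:
        raise ValueError("reference pathway has zero summed TPM")
    return summed_tpm(tpm, pathway_genes) / reference_total


def genes_in_all(samples: List[Dict[str, float]], genes: Iterable[str]) -> List[str]:
    """Keep only genes detected in every individual (no missing genes allowed)."""
    return [gene for gene in genes if all(gene in sample for sample in samples)]


# Toy TPM values for one hypothetical individual (not real data).
individual = {"PIWIL1": 30.0, "PIWIL2": 10.0, "TRIM28": 4.0, "AGO2": 15.0, "DICER1": 5.0}
mirna_genes = ["AGO2", "DICER1"]

pirna_ratio = pathway_ratio(individual, ["PIWIL1", "PIWIL2", "TDRD1"], mirna_genes)
trim28_ratio = pathway_ratio(individual, ["TRIM28"], mirna_genes)
print(pirna_ratio, trim28_ratio)
```

Normalizing by the summed miRNA pathway TPM, rather than comparing raw TPM, expresses each silencing pathway relative to a small-RNA pathway that is not predicted to vary with genome size, which makes values more comparable across species and assemblies.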