RNA was extracted from eight asymptomatic V. pensylvanica individuals collected from managed honey bee apiaries on Big Island, Hawaii in 2012. Bees were sampled from the frames inside the hive, so they were likely mostly nurses, with some foragers and newly emerged bees. Samples W_S23-27 and HB_S11-12 were collected from the north of Big Island, and samples HB_S13, V_S32 and W_S28-30 from the east. Thirty honey bees were pooled for RNA extraction. The Varroa samples were a pool of 10 mites taken from drone brood. cDNA libraries were prepared using oligo(dT) priming, followed by Illumina HiSeq 2 × 100 bp sequencing.
Quality control of the raw read data was performed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). An in-house contamination-screening pipeline, Kontaminant (http://www.tgac.ac.uk/kontaminant/), was used to check the raw reads for contamination. The wasp libraries contained less than 5% host mRNA. Although host contamination was very low, k-mer filtering was performed to remove any host RNA. No viral mapping or filtering was done, so we carried out a metagenomic study to assemble all the non-host RNA.
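The k-mer filtering step can be illustrated with a minimal sketch. This is not Kontaminant's implementation: the function names, the k-mer size and the `max_host_fraction` threshold are hypothetical, and reverse-complement k-mers are ignored for brevity. A read is discarded when too many of its k-mers also occur in the host reference.

```python
def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_kmer_set(reference_seqs, k):
    """Collect all k-mers present in the host reference sequences."""
    return {km for s in reference_seqs for km in kmers(s.upper(), k)}


def host_fraction(read, host_kmers, k):
    """Fraction of the read's k-mers that are also found in the host set."""
    total = len(read) - k + 1
    if total <= 0:
        return 0.0  # read shorter than k: no k-mers to test
    hits = sum(1 for km in kmers(read.upper(), k) if km in host_kmers)
    return hits / total


def filter_reads(reads, host_kmers, k, max_host_fraction=0.0):
    """Keep reads whose host k-mer fraction does not exceed the threshold.

    max_host_fraction is an assumed parameter; the source does not state
    the threshold that was used.
    """
    return [r for r in reads
            if host_fraction(r, host_kmers, k) <= max_host_fraction]
```

With a toy host sequence `"ACGTACGTTAGC"` and `k = 4`, the read `"ACGTACGT"` shares all of its k-mers with the host and is removed, whereas `"GGGGCCCC"` shares none and is kept.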
In total, around 116 million reads (115,842,147 reads before filtering) from the 8 wasp individuals were assembled in a single assembly run using MetaCortex (unpublished; developed by Richard Leggett at TGAC). MetaCortex is a recently developed variant of Cortex (refs 27, 28) based on de Bruijn graphs, which are constructed by dividing reads into smaller, overlapping sequences called k-mers. Contigs were aligned (blastx) against a RefSeq protein database (NCBI) to identify putative viruses.
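To make the de Bruijn graph idea concrete, the following toy sketch builds a graph from the k-mers of a set of reads and walks an unambiguous path to recover a contig. It is an illustration only and does not reflect how MetaCortex or Cortex are implemented; the function names are hypothetical, and error correction, reverse complements and in-degree checks are omitted.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, and every k-mer
    in a read adds an edge from its prefix to its suffix."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def extend_unambiguous(graph, start):
    """Extend a contig from start while the current node has exactly
    one successor, stopping at branches, dead ends or cycles."""
    contig = start
    node = start
    seen = {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:
            break  # avoid looping forever around a cycle
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig
```

For example, the two overlapping reads `"ATGGCG"` and `"GGCGTA"` with `k = 4` share the k-mer `GGCG`, so walking the graph from `"ATG"` joins them into the single contig `"ATGGCGTA"`.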
One contig in particular was translated within Geneious (Biomatters) and aligned with other Iflavirus amino acid sequences obtained from GenBank. The alignment was carried out using the MUSCLE aligner with 8 iterations. The phylogenetic tree was built with the Geneious tree builder using the neighbour-joining method and the Jukes-Cantor genetic distance model, based on the conserved RdRp region of picorna-like viruses (ref. 29). Finally, Geneious was used to map reads against the putative virus contig, and Vicuna (ref. 17) was used to assemble reads from each individual separately, using a pipeline adapted from one used to assemble DWV (ref. 7).
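As a rough illustration of the distance model, the sketch below computes a Jukes-Cantor corrected distance from the proportion of differing sites between two aligned sequences. It uses the textbook generalised form d = -((b - 1)/b) ln(1 - (b/(b - 1)) p) for an alphabet of b states (b = 4 for nucleotides, b = 20 for amino acids). Whether Geneious applies exactly this form to protein alignments is an assumption here, and the function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences,
    ignoring columns where either sequence has a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p, n_states):
    """Jukes-Cantor corrected distance for an alphabet of n_states
    equally frequent, equally exchangeable states (assumed model)."""
    limit = (n_states - 1) / n_states
    if p >= limit:
        return math.inf  # saturated: the correction is undefined
    return -limit * math.log(1 - p / limit)
```

A pairwise matrix of such distances is the input that a neighbour-joining algorithm would then cluster into a tree.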
Individual reads were aligned against the novel Moku virus (MV) genome to create coverage plots for each Illumina sample (Fig. 2A). From these reads, a consensus of the RdRp region was obtained for MV in Varroa and honey bees by keeping bases that matched in at least 90% of the sequences. The Moku virus genome was annotated based on an amino acid alignment with the SBPV and DWV genomes (refs 15, 16). Regions were identified from protease sites based on the DWV and SBPV genomes and from homologous protein domains identified by BLAST. Reads from Varroa and honey bees were extracted, used to build a consensus and aligned with the MV genome from the wasps to confirm that they were indeed MV (Supplementary Figs S1 and S2).
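The coverage and consensus steps can be sketched as follows. This is a simplified illustration rather than the Geneious procedure: the function names are hypothetical, alignment intervals are assumed to be 0-based with an exclusive end and to lie within the genome, and the choice to write `N` at positions that fall below the 90% threshold is an assumption, since the text does not say how such positions were handled.

```python
from collections import Counter


def coverage(intervals, genome_length):
    """Per-position read depth from (start, end) alignment intervals
    (0-based, end exclusive), computed with a difference array."""
    diff = [0] * (genome_length + 1)
    for start, end in intervals:
        diff[start] += 1
        diff[end] -= 1
    depth, running = [], 0
    for d in diff[:genome_length]:
        running += d
        depth.append(running)
    return depth


def consensus(aligned_seqs, threshold=0.9, ambiguous="N"):
    """Column-wise consensus: keep the most common character when it is
    present in at least `threshold` of the sequences, otherwise emit
    `ambiguous` (an assumed placeholder)."""
    if not aligned_seqs:
        raise ValueError("no sequences given")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*aligned_seqs):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count / len(column) >= threshold else ambiguous)
    return "".join(out)
```

With ten aligned sequences, a base shared by nine of them is kept at the 90% threshold, while a base shared by only eight is replaced by the placeholder.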
The insect viromes (Fig. 3B) were created by aligning individual Illumina reads using BLAST against a custom virus database, which included the Moku virus genome, Slow bee paralysis virus and all three variants of DWV (ref. 7). The top hits were counted for each viral species. BLAST hits that did not cover the whole read were excluded from the analysis.
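The counting step might look like the sketch below. The `Hit` record and its field names are hypothetical stand-ins for parsed BLAST output, and two details are assumptions: the top hit is taken to be the one with the highest bit score, and partial hits are discarded before the top hit is chosen.

```python
from collections import Counter, namedtuple

# Hypothetical record for one parsed BLAST hit; qstart and qend are
# 1-based query coordinates, read_len is the full length of the read.
Hit = namedtuple("Hit", "read_id virus qstart qend read_len bitscore")


def virome_counts(hits):
    """Count the top hit per read for each virus, ignoring hits that
    do not span the whole read."""
    best = {}
    for h in hits:
        covers_read = h.qstart == 1 and h.qend == h.read_len
        if not covers_read:
            continue  # partial alignment: excluded from the analysis
        current = best.get(h.read_id)
        if current is None or h.bitscore > current.bitscore:
            best[h.read_id] = h
    return Counter(h.virus for h in best.values())
```

For instance, a read with full-length hits to both MV and DWV is counted once, for whichever virus gives the higher score, and a read whose only hit starts at position 5 is not counted at all.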
Mordecai G.J., Brettell L.E., Pachori P., Villalobos E.M., Martin S.J., Jones I.M., & Schroeder D.C. (2016). Moku virus; a new Iflavirus found in wasps, honey bees and Varroa. Scientific Reports, 6, 34983.