For each sample, a PCR amplicon was created to serve as the template for Illumina sequencing. The steps used to generate the PCR amplicon for each of the seven sample types (fig. 2) are listed below. Once the PCR template was generated, for all samples the PCR amplicon was created using the amplicon PCR program described above in 50 μl reactions consisting of 25 μl of 2× KOD Hot Start Master Mix, 1.5 μl each of 10 μM of 5’-BsmBI-Aichi68-NP and 3’-BsmBI-Aichi68-NP, the indicated template, and ultrapure water. A small amount of each PCR reaction was run on an analytical agarose gel to confirm the desired band. The remainder was then run on its own agarose gel without any ladder (to avoid contamination) after carefully cleaning the gel rig and all related equipment. The amplicons were excised from the gels, purified over ZymoClean columns, and analyzed using a NanoDrop to ensure that the absorbance at 260 nm was at least 1.8 times that at 230 nm and 280 nm. The templates were as follows:
DNA: The templates for these amplicons were 10 ng of the unmutated independent plasmid preps used to create the codon mutant libraries.
mutDNA: The templates for these amplicons were 10 ng of the plasmid mutant libraries.
RNA: This amplicon quantifies the net error rate of transcription and reverse transcription. Because the viral RNA is initially transcribed from the reverse-genetics plasmids by RNA polymerase I, but the bidirectional reverse-genetics plasmids direct transcription of RNA by both RNA polymerases I and II (Hoffmann et al. 2000), the RNA templates for these amplicons were transcribed from plasmids derived from pHH21 (Neumann et al. 1999), which only directs transcription by RNA polymerase I. The unmutated WT and N334H NP genes were cloned into this plasmid to create pHH-Aichi68-NP and pHH-Aichi68-NP-N334H. Independent preparations of these plasmids were transfected into 293T cells, transfecting 2 μg of plasmid into 5 × 10⁵ cells in six-well dishes. After 32 h, total RNA was isolated using Qiagen RNeasy columns and treated with the Ambion TURBO DNA-free kit (Applied Biosystems AM1907) to remove residual plasmid DNA. This RNA was used as a template for reverse transcription with AccuScript (Agilent 200820) using the primers 5’-BsmBI-Aichi68-NP and 3’-BsmBI-Aichi68-NP. The resulting cDNA was quantified by quantitative PCR (qPCR) specific for NP (see below), which showed high levels of NP cDNA in the reverse-transcription reactions but undetectable levels in control reactions lacking the reverse transcriptase, indicating that residual plasmid DNA had been successfully removed. A volume of cDNA that contained at least 2 × 10⁶ NP cDNA molecules (as quantified by qPCR) was used as template for the amplicon PCR reaction. Control PCR reactions using equivalent volumes of template from the no reverse-transcriptase control reactions yielded no product.
virus-p1: This amplicon was derived from virus created from the unmutated plasmid and collected at the end of the first passage. Clarified virus supernatant was ultracentrifuged at 64,000 × g for 1.5 h at 4 °C, and the supernatant was decanted. Total RNA was then isolated from the viral pellet using a Qiagen RNeasy kit. This RNA was used as a template for reverse transcription with AccuScript using the primers 5’-BsmBI-Aichi68-NP and 3’-BsmBI-Aichi68-NP. The resulting cDNA was quantified by qPCR, which showed high levels of NP cDNA in the reverse-transcription reactions but undetectable levels in control reactions lacking the reverse transcriptase. A volume of cDNA that contained at least 10⁷ NP cDNA molecules (as quantified by qPCR) was used as template for the amplicon PCR reaction. Control PCR reactions using equivalent volumes of template from the no reverse-transcriptase control reactions yielded no product.
virus-p2, mutvirus-p1, and mutvirus-p2: These amplicons were created as for the virus-p1 amplicons but used the appropriate virus as the initial template as outlined in figure 2.
An important note: It was found that the use of relatively new RNeasy kits with β-mercaptoethanol (a reducing agent), freshly added per the manufacturer’s instructions, was necessary to avoid what appeared to be oxidative damage to purified RNA.
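The 50 μl amplicon PCR setup and the NanoDrop purity criterion described above reduce to simple arithmetic, which can be sketched as follows. This is a minimal illustration, not part of the original protocol: the function names are hypothetical, and the template volume is left as an input because the text specifies the template by mass or molecule count rather than by volume.

```python
# Hypothetical helpers illustrating the reaction arithmetic; all volumes in μl.
REACTION_VOLUME_UL = 50.0
MASTER_MIX_UL = 25.0       # 2x KOD Hot Start Master Mix
PRIMER_UL = 1.5            # volume of each of the two primers
PRIMER_STOCK_UM = 10.0     # primer stock concentration in μM


def water_volume_ul(template_ul):
    """Ultrapure water needed to bring the reaction to 50 μl."""
    water = REACTION_VOLUME_UL - MASTER_MIX_UL - 2 * PRIMER_UL - template_ul
    if template_ul < 0 or water < 0:
        raise ValueError("template volume does not fit in a 50 μl reaction")
    return water


def primer_final_um():
    """Final concentration of each primer in the reaction, in μM."""
    return PRIMER_STOCK_UM * PRIMER_UL / REACTION_VOLUME_UL


def purity_ok(a260, a280, a230, min_ratio=1.8):
    """True if A260 is at least min_ratio times both A280 and A230."""
    return a260 >= min_ratio * a280 and a260 >= min_ratio * a230
```

With these numbers each primer ends up at 0.3 μM in the final reaction, and any template volume up to 22 μl can be accommodated by reducing the water.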
The overall experiment only makes sense if the sequenced NP genes derive from a large diversity of initial template molecules. Therefore, qPCR was used to quantify the molecules produced by reverse transcription to ensure that a sufficiently large number were used as PCR templates to create the amplicons. The qPCR primers were 5’-Aichi68-NP-for (gcaacagctggtctgactcaca) and 3’-Aichi68-NP-rev (tccatgccggtgcgaacaag). The qPCR reactions were performed using the SYBR Green PCR Master Mix (Applied Biosystems 4309155) following the manufacturer’s instructions. Linear NP PCR-ed from the pHWAichi68-NP plasmid was used as a quantification standard; the use of a linear standard is important, because amplification efficiencies differ for linear and circular templates (Hou et al. 2010). The standard curves were linear with respect to the amount of NP standard over the range from 10² to 10⁹ NP molecules. These standard curves were used to determine the absolute number of NP cDNA molecules after reverse transcription. Note that the use of only 25 thermal cycles in the amplicon PCR program provides a second check that there are a substantial number of template molecules, as this moderate number of thermal cycles will not lead to sufficient product if there are only a few template molecules.
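Absolute quantification from a standard curve of this kind is commonly done by fitting threshold cycle (Ct) against the logarithm of the number of standard molecules and then inverting the fit for each unknown. The sketch below illustrates that calculation under stated assumptions: the Ct values are idealized, hypothetical numbers rather than data from this study, the function names are invented for illustration, and the source does not state which software or fitting procedure was actually used.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate molecules in a qPCR well."""
    return 10 ** ((ct - intercept) / slope)


def min_template_volume_ul(copies_per_ul, required_copies):
    """Smallest cDNA volume (μl) containing at least required_copies."""
    if copies_per_ul <= 0:
        raise ValueError("copies_per_ul must be positive")
    return required_copies / copies_per_ul


# Hypothetical, idealized standards spanning 10^2 to 10^9 molecules.
standard_copies = [10 ** k for k in range(2, 10)]
standard_ct = [40.0 - 3.3 * k for k in range(2, 10)]
slope, intercept = fit_standard_curve(standard_copies, standard_ct)
```

Given a measured concentration of cDNA molecules per μl, `min_template_volume_ul` returns the volume needed to reach the thresholds stated above (at least 2 × 10⁶ molecules for the RNA samples and at least 10⁷ for the virus samples).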
To allow the Illumina sequencing inserts to be read in both directions by paired-end 50 nt reads (supplementary fig. S1, Supplementary Material online), it was necessary to use an Illumina library-prep protocol that created NP inserts that were roughly 50 nt in length. This was done via a modification of the Illumina Nextera protocol. First, concentrations of the PCR amplicons were determined using PicoGreen (Invitrogen P7859). These amplicons were used as input to the Illumina Nextera DNA Sample Preparation kit (Illumina FC-121-1031). The manufacturer’s protocol for the tagmentation step was modified to use 5-fold less input DNA (10 ng rather than 50 ng) and 2-fold more tagmentation enzyme (10 μl rather than 5 μl), and the incubation at 55 °C was doubled from 5 to 10 min. Samples were barcoded using the Nextera Index Kit for 96 indices (Illumina FC-121-1012). For index 1, the barcoding was DNA with N701, RNA with N702, mutDNA with N703, virus-p1 with N704, mutvirus-p1 with N705, virus-p2 with N706, and mutvirus-p2 with N707. After completion of the Nextera PCR, the samples were subjected to a ZymoClean purification rather than the bead cleanup step specified in the Nextera protocol. The size distribution of these purified PCR products was analyzed using an Agilent 200 TapeStation Instrument. If the NP sequencing insert is exactly 50 nt in size, then the product of the Nextera PCR should be 186 nt in length after accounting for the addition of the Nextera adaptors. The actual size distribution peaked close to this value. The ZymoClean-purified PCR products were quantified using PicoGreen and combined in equal amounts into pools: a WT-1 pool of the seven samples for that library, a WT-2 pool of the seven samples for that library, etc. These pools were subjected to further size selection by running them on a 4% agarose gel versus a custom ladder containing 171 and 196 nt bands created by PCR from a GFP template using the forward primer gcacggggccgtcgccg and the reverse primers tggggcacaagctggagtacaac (for the 171 nt band) and gacttcaaggaggacggcaacatcc (for the 196 nt band). The gel slice for the sample pools corresponding to sizes between 171 and 196 nt was excised and purified using a ZymoClean column. A separate clean gel was run for each pool to avoid cross contamination.
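The relationship between Nextera PCR product length and NP insert length follows from the numbers above: a 50 nt insert yields a 186 nt product, which implies that the adaptors contribute 136 nt in total. The short sketch below works out the insert sizes bracketed by the 171 and 196 nt ladder bands. It assumes that the adaptor contribution is the same for every fragment, which the text implies but does not state explicitly; the constant and function names are chosen for illustration only.

```python
# Values taken from the text above.
INSERT_TARGET_NT = 50        # desired NP insert length
EXPECTED_PRODUCT_NT = 186    # Nextera PCR product for a 50 nt insert
LADDER_NT = (171, 196)       # custom ladder bands used for size selection

# Assumption: adaptors add a fixed number of nucleotides to every insert.
ADAPTOR_TOTAL_NT = EXPECTED_PRODUCT_NT - INSERT_TARGET_NT


def insert_length(product_nt):
    """NP insert length implied by a given Nextera PCR product length."""
    return product_nt - ADAPTOR_TOTAL_NT


# Insert lengths corresponding to the excised gel window.
insert_window = tuple(insert_length(band) for band in LADDER_NT)
```

Under this assumption, excising the region between the two ladder bands selects inserts of roughly 35 to 60 nt, bracketing the 50 nt target.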
Library QC and cluster optimization were performed using the Agilent Technologies qPCR NGS Library Quantification Kit (Agilent Technologies, Santa Clara, CA). Libraries were introduced onto the flow cell using an Illumina cBot (Illumina, Inc., San Diego, CA) and a TruSeq Rapid Duo cBot Sample Loading Kit. Cluster generation and deep sequencing were performed on an Illumina HiSeq 2500 using an Illumina TruSeq Rapid PE Cluster Kit and TruSeq Rapid SBS Kit. A paired-end, 50 nt read-length (PE50) sequencing strategy was performed in rapid run mode. Image analysis and base calling were performed using Illumina’s Real Time Analysis v1.17.20.0 software, followed by demultiplexing of indexed reads and generation of FASTQ files, using Illumina’s CASAVA v1.8.2 software (http://www.illumina.com/software.ilmn, last accessed May 31, 2014). These FASTQ files were uploaded to the Sequence Read Archive (SRA) under accession SRP036064 (see http://www.ncbi.nlm.nih.gov/sra/?term=SRP036064, last accessed May 31, 2014).
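A simple sanity check on FASTQ files from a PE50 run is to confirm that both mates of each read pair have the expected 50 nt length. The sketch below is an illustrative check only and is not part of the analysis described here. It assumes the standard four-line-per-record FASTQ layout with the two mates stored in the same order in two separate streams; the function names are hypothetical.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of stream
        seq = handle.readline().rstrip("\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:].strip(), seq, qual


def pairs_with_expected_length(r1_handle, r2_handle, read_length=50):
    """Return (pairs with both mates of read_length, total pairs read)."""
    total = ok = 0
    for (_, s1, _), (_, s2, _) in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        total += 1
        if len(s1) == read_length and len(s2) == read_length:
            ok += 1
    return ok, total
```

The handles can be ordinary text files opened with `open(...)`, or `io.StringIO` objects when experimenting with small in-memory examples.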
Bloom J.D. (2014). An Experimentally Determined Evolutionary Model Dramatically Improves Phylogenetic Fit. Molecular Biology and Evolution, 31(8), 1956-1978.