RNA was extracted from each sample using the miRNeasy Kit (Qiagen, Inc., Germantown, MD, USA), and RNA quality was assessed with an Agilent Bioanalyzer 2100 (Agilent, Santa Clara, CA, USA). Downstream RNA processing was performed as we have previously described [50]. Briefly, RNA sequencing was completed at the SUNY Molecular Analysis Core using the Illumina TruSeq Small RNA Prep protocol and a NextSeq500 instrument (Illumina; San Diego, CA, USA) at a targeted depth of ten million 50-base, paired-end reads per sample. Reads were aligned to the hg38 build of the human genome using Partek Flow (Partek; St. Louis, MO, USA) and the Bowtie2 aligner. Messenger RNAs (mRNAs) were quantified using Ensembl annotation, and mature microRNAs (miRNAs) were quantified using miRBase 22 annotation. Reads that did not align to the human genome were aligned to the NCBI RefSeq genome using Kraken. Bacteria were aligned at the phylum level and viral phages were aligned at the species level. We chose to measure microbial RNA in order to streamline the nucleic acid extraction and analysis steps. The quality of the RNA sequencing results was verified through read quality scores and total read counts. The RNA features with consistent detection (raw read counts ≥ 10 in 100% of samples) in each category (mRNAs, miRNAs, bacteria, viruses) were quantile normalized and scaled prior to statistical analysis.
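The detection filter, quantile normalization, and scaling described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names, the features-by-samples matrix layout, the simple tie handling, and the choice of per-feature z-scoring as the "scaling" step are all assumptions made for illustration.

```python
import numpy as np

def filter_consistent(counts, min_count=10, min_fraction=1.0):
    """Keep features (rows) with raw counts >= min_count in at least
    min_fraction of samples (columns). Layout is an assumption."""
    detected = (counts >= min_count).mean(axis=1)
    keep = detected >= min_fraction
    return counts[keep], keep

def quantile_normalize(x):
    """Give every sample (column) the same distribution: the mean of the
    sorted columns. Ties are broken by position, not averaged."""
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    mean_sorted = np.sort(x, axis=0).mean(axis=1)
    return mean_sorted[ranks]

def scale_features(x):
    """Z-score each feature across samples (assumed meaning of 'scaled')."""
    mean = x.mean(axis=1, keepdims=True)
    sd = x.std(axis=1, ddof=1, keepdims=True)
    sd[sd == 0] = 1.0  # leave constant features at zero instead of dividing by 0
    return (x - mean) / sd

# Hypothetical raw counts: 4 features x 3 samples
raw = np.array([
    [12, 40, 25],
    [9, 30, 50],    # dropped: below 10 reads in the first sample
    [100, 15, 60],
    [20, 80, 35],
], dtype=float)

filtered, kept = filter_consistent(raw)
normalized = quantile_normalize(filtered)
scaled = scale_features(normalized)
```

In this sketch each category (mRNAs, miRNAs, bacteria, viruses) would be passed through the three functions separately, since the text describes normalization within each category.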
Downstream analysis involved 8 transcripts (TMPRSS2, IL1RN, IL6ST, CCL2, TNFAIP3, TLR4, TAB2, NFKBIA), 8 miRNAs (miR-140-3p, miR-22-5p, miR-29c-5p, miR-34c-5p, miR-125a-5p, miR-27b-3p, miR-203a-3p, miR-155-5p), 8 viral phages (Streptococcus phage SpSL1, Pseudomonas phage PPpW-3, Proteus virus PM135, Streptococcus phage phiARI0131-1, Mycobacterium virus Cooper, Bacillus virus Mater, Haemophilus virus HP1, Klebsiella phage vB_KpV477), and 8 microbial phyla (Actinobacteria, Bacteroidetes, Candidatus Saccharibacteria, Firmicutes, Fusobacteria, Proteobacteria, Verrucomicrobia), plus a measure of microbial diversity (Simpson alpha index). These features were selected based on their biologic relevance to viral upper respiratory infections and their abundance in infant saliva [25,26,51,52,53,54]. A maximum of 8 features was permitted in each category, based on an a priori power analysis demonstrating that the sample size (n = 146) provided > 95% power to detect an effect size ≥ 2.0 for 8 predictors, with alpha set at 0.05 (assuming a non-central distribution).
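As an illustration of the diversity measure, the following sketch computes a Simpson index from taxon-level read counts for a single sample. The text does not state which variant of the index was used; the Gini-Simpson form, 1 minus the sum of squared relative abundances, is assumed here, and the function name and input format are hypothetical.

```python
import numpy as np

def simpson_index(counts):
    """Gini-Simpson diversity, 1 - sum(p_i^2), for one sample.

    counts: read counts per taxon. The variant (1 - D rather than D or
    1/D) is an assumption; the source does not specify it.
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    if total <= 0:
        raise ValueError("sample has no microbial reads")
    p = counts / total  # relative abundance of each taxon
    return 1.0 - float(np.sum(p ** 2))

# Hypothetical counts for one sample across four taxa
even_sample = simpson_index([10, 10, 10, 10])    # evenly spread community
dominated_sample = simpson_index([97, 1, 1, 1])  # one taxon dominates
```

Under this variant, values closer to 1 indicate a more even community and values near 0 indicate dominance by a single taxon.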