RSEM-normalized expression data were extracted into Microsoft Excel, and HPV status was manually curated based on published datasets as described.47 Primary patient samples with known HPV status were grouped as HPV+, HPV-, or normal control tissue. This classification agrees completely with work done by others41, with the exception of sample TCGA-BB-7862-01A. That sample had minimal reads aligning to the HPV genome, none of which aligned to the HPV16 E6 or E7 oncogenes, so we classified it as HPV- rather than HPV+. Patient samples with unknown HPV status were omitted from our calculations, as were samples obtained from secondary metastatic lesions. This resulted in 73 HPV+, 442 HPV-, and 43 normal control samples with data available for the HNSCC gene expression analysis. This surgically managed cohort was considered treatment-naïve, as only 9 patients received neoadjuvant radiation or chemotherapy (2 HPV+, 7 HPV-). Boxplot comparisons of gene expression were performed using GraphPad Prism v7.0 (GraphPad Software, Inc., San Diego, California, USA) and assembled into final form using CorelDRAW (Corel, Ottawa, Ontario, Canada). For the boxplots, center lines show the medians, box limits indicate the 25th and 75th percentiles as determined by GraphPad Prism, and whiskers extend 1.5 times the interquartile range from the 25th and 75th percentiles. Statistical significance was calculated using GraphPad Prism. P-values were assigned using a one-tailed non-parametric Mann-Whitney U test. Selected genes were compared in a pairwise fashion, and concordance was calculated using Spearman’s rho. Differences were considered statistically significant for P < 0.05.
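For readers who wish to reproduce comparable statistics outside GraphPad Prism, the calculations described above can be sketched in a few lines of Python. This is an illustrative sketch only and is not the procedure used in this study, in which all statistics were computed in GraphPad Prism. The function names and the example expression values are hypothetical. Two assumptions deserve emphasis: the quartile convention (Python's default "exclusive" method) may not match the percentile convention used by Prism, and the one-tailed Mann-Whitney P-value is obtained from a normal approximation with tie and continuity corrections, which may differ from an exact P-value for small groups. Whiskers are drawn to the most extreme data points lying within 1.5 times the interquartile range of the box limits.

```python
import math
from statistics import NormalDist, median, quantiles


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_one_tailed(x, y):
    """Return (U for x, approximate P) for the alternative 'x tends to exceed y'.

    Assumption: normal approximation with tie and continuity corrections.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_counts = {}
    for r in ranks:
        tie_counts[r] = tie_counts.get(r, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:  # every value identical: no evidence of a difference
        return u1, 1.0
    z = (u1 - n1 * n2 / 2 - 0.5) / sigma
    return u1, 1 - NormalDist().cdf(z)


def spearman_rho(a, b):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((p - ma) * (q - mb) for p, q in zip(ra, rb))
    var_a = sum((p - ma) ** 2 for p in ra)
    var_b = sum((q - mb) ** 2 for q in rb)
    return cov / math.sqrt(var_a * var_b)


def boxplot_summary(values):
    """Median, quartiles, and whisker ends within 1.5 x IQR of the box limits."""
    q1, _, q3 = quantiles(values, n=4)  # assumption: 'exclusive' quartiles
    iqr = q3 - q1
    inside = [v for v in values if q1 - 1.5 * iqr <= v <= q3 + 1.5 * iqr]
    return {"median": median(values), "q1": q1, "q3": q3,
            "whisker_low": min(inside), "whisker_high": max(inside)}


if __name__ == "__main__":
    # Hypothetical normalized expression values, for illustration only.
    hpv_pos = [812.4, 640.1, 955.0, 701.3, 1010.8]
    hpv_neg = [410.2, 388.9, 520.5, 610.0, 455.7, 498.1]
    u, p = mann_whitney_one_tailed(hpv_pos, hpv_neg)
    print("U =", u, "one-tailed P =", p, "significant:", p < 0.05)
    print("boxplot summary:", boxplot_summary(hpv_neg))
```

A one-tailed test requires the direction of the expected difference to be specified in advance; in this sketch the first argument is the group hypothesized to have the higher expression.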