Multivariate Cox regression, the log-rank test, and Kaplan–Meier estimators were implemented using the R package survival. The association between CD8 T-cell abundance and tumor status was evaluated using logistic regression corrected for age and clinical stage, implemented with the R function glm. The same analysis was performed for the association between neutrophil abundance and gender, corrected for age and smoking history. Partial correlations of immune cell abundance with gene expression of chemokines and receptors, somatic mutation counts, CT gene expression, and immunosuppressive molecule expression were calculated using the R package ppcor. Multiple-testing correction was performed using the R package qvalue [39], and FDR thresholds were applied based on the abundance of signals in the data. We applied the Pearson correlation to tumor purity and gene expression because it is reasonable to expect that expression level is linearly associated with tumor purity. For all other correlations, we used the Spearman correlation. We applied partial correlation analysis to remove the influence of tumor purity on the variables involved. All other analyses, including linear regression, Fisher’s exact test, the Wilcoxon rank-sum test, Spearman’s correlation, and hierarchical clustering, were performed using R [40]. Of note, in Figs. 2b and 3b, we used the 20th percentile as a cutoff only to help visualize the association of immune infiltration with outcomes; statistical significance was determined by multivariate Cox regression (Fig. 3a) including all samples. Our results on survival analysis, neoantigen association, tumor recurrence, and the association of checkpoint blockade inhibitory molecules with immune cells are available in Additional file 10: Table S8.