Significantly mutated genes were identified using MutSig2CV, which combines p-values from three tests: high mutational frequency relative to the background mutation rate (pCV), clustering of mutations within the gene (pCL), and enrichment of mutations at evolutionarily conserved sites (pFN) [10]. For the 660 lung adenocarcinomas, we had 100% power to detect genes mutated in 10% of patients and 73% power to detect genes mutated in 5% of patients, assuming a mutation rate of 8.7/Mb [10]. For the 484 lung squamous cell carcinomas, we had 100% power to detect genes mutated in 10% of patients and 41% power to detect genes mutated in 5% of patients, assuming a mutation rate of 9.7/Mb [10]. To reduce the number of hypotheses tested in the MutSig2CV analysis, we excluded genes with low expression across tumors of relatively high purity. For each gene, the median log2 FPKM value was computed across the 185 ADCs or 238 SqCCs that had an ABSOLUTE purity estimate of >50% and available RNA-seq data (Supplementary Fig. 1). For each tumor type, a mixture model of two normal distributions was fit to the per-gene medians in R using the mclust package v4.2. Genes with at least a 95% probability of belonging to the higher-expression cluster were included in the multiple hypothesis correction of the MutSig2CV combined p-values. One gene, TRERF1, was excluded from the final results because closer inspection revealed a recurrent frameshift deletion that was likely a false positive: all of these mutations had low allelic fractions (<1.5%) and no supporting reads in the matching RNA-seq data. A one-sided Fisher’s exact test was used to determine whether the ratio of loss-of-function mutations (including nonsense, frameshift, and de novo start out-of-frame mutations) to other mutations in a given gene was significantly higher than the same ratio across all other genes.
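The expression filter described above can be illustrated with a minimal sketch. The analysis itself used the R package mclust; the Python code below is not that implementation but a plain expectation-maximization fit of two normal distributions with unequal variances (mclust chooses between equal- and unequal-variance models, which is not reproduced here). The function names and the simulated medians are hypothetical and serve only to show how a 95% membership threshold might be applied.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normals(values, n_iter=200):
    """Fit a two-component 1-D normal mixture by EM.

    Returns (weights, means, sds), each a list of length two.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    # Initialise the two means from the lower and upper halves of the data.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    grand_mean = sum(xs) / n
    sd0 = max(math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / n), 1e-6)
    sds = [sd0, sd0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: membership probability of each value in each component.
        resp = []
        for x in xs:
            p0 = weights[0] * normal_pdf(x, means[0], sds[0])
            p1 = weights[1] * normal_pdf(x, means[1], sds[1])
            total = max(p0 + p1, 1e-300)
            resp.append((p0 / total, p1 / total))
        # M-step: re-estimate weight, mean and sd of each component.
        for k in (0, 1):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)
    return weights, means, sds


def prob_high_expression(x, weights, means, sds):
    """Probability that x belongs to the component with the higher mean."""
    hi = 0 if means[0] > means[1] else 1
    lo = 1 - hi
    p_hi = weights[hi] * normal_pdf(x, means[hi], sds[hi])
    p_lo = weights[lo] * normal_pdf(x, means[lo], sds[lo])
    return p_hi / max(p_hi + p_lo, 1e-300)


# Simulated per-gene median log2 FPKM values (hypothetical, not study data).
rng = random.Random(0)
medians = [rng.gauss(-3, 1) for _ in range(300)] + [rng.gauss(3, 1) for _ in range(300)]
params = fit_two_normals(medians)
kept = [x for x in medians if prob_high_expression(x, *params) >= 0.95]
print(len(kept), "of", len(medians), "simulated genes pass the filter")
```

Only genes passing this threshold would then enter the multiple hypothesis correction.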
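To see why restricting the correction to expressed genes helps, consider the sketch below. The text does not name the correction procedure, so the Benjamini-Hochberg false discovery rate method is used here purely as an assumption; the gene labels and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def q_values_for_expressed(pvalue_by_gene, expressed_genes):
    """Correct only the genes that passed the expression filter."""
    genes = [g for g in pvalue_by_gene if g in expressed_genes]
    qs = benjamini_hochberg([pvalue_by_gene[g] for g in genes])
    return dict(zip(genes, qs))


# Hypothetical combined p-values for four genes; only A and B are expressed.
pvals = {"A": 0.001, "B": 0.5, "C": 0.8, "D": 0.9}
unfiltered = dict(zip(pvals, benjamini_hochberg(list(pvals.values()))))
filtered = q_values_for_expressed(pvals, {"A", "B"})
print("q-value for A without filter:", unfiltered["A"])
print("q-value for A with filter:", filtered["A"])
```

With fewer hypotheses in the correction, the same p-value yields a smaller q-value, which is the motivation for the filter.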
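The loss-of-function enrichment test can be sketched as a one-sided Fisher's exact test on a 2×2 table: loss-of-function versus other mutations, in the gene of interest versus all other genes. The code below computes the upper hypergeometric tail directly with the standard library. The function name and the counts in the usage example are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    a: loss-of-function mutations in the gene
    b: other mutations in the gene
    c: loss-of-function mutations in all other genes
    d: other mutations in all other genes

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least this many loss-of-function mutations in the gene.
    """
    row1 = a + b          # all mutations in the gene
    col1 = a + c          # all loss-of-function mutations
    n = a + b + c + d     # all mutations
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    return tail / comb(n, row1)


# Hypothetical counts: 12 of 20 mutations in the gene are loss-of-function,
# versus 3,000 of 50,000 mutations across all other genes.
p = fisher_exact_greater(12, 8, 3000, 47000)
print("one-sided p-value:", p)
```

A small p-value here indicates that loss-of-function mutations make up a larger share of the gene's mutations than expected from the rest of the exome.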