Each tumor is uniquely identified by a pair of DNA barcodes. These barcodes were detected by next-generation sequencing on Illumina HiSeq 2500 or 4000 platforms. Reads were filtered for quality and for congruence with the expected lentiviral sequences flanking the barcodes, then trimmed and tallied by unique sequence. The unique barcode sequences were then annotated as tumors using the Tuba-seq barcode clustering algorithm, which was described previously⁷ and has been publicly released (see URLs). GC-content amplification bias was then removed by subtracting a fourth-order polynomial fit to the residual relationship between barcode GC content and tumor size, where the residual is the LN mean size of all barcodes with a particular GC content after the LN mean effect of each sgRNA in each mouse has been subtracted, as recommended previously⁷. Lastly, the absolute number of neoplastic cells in each tumor was determined by multiplying the number of barcode reads for that tumor by the cell number of the three benchmark controls (500,000 cells, added to each mouse lung before lysis and DNA isolation) divided by their mean barcode tally, as described previously⁷.
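The final normalization step amounts to a single scaling factor per mouse, and a minimal sketch may make the arithmetic concrete. This is not the released Tuba-seq code: the function names `cells_per_read` and `absolute_cell_numbers`, the dictionary layout, and the read counts are hypothetical, and the sketch assumes that 500,000 cells is the input for each benchmark control.

```python
from statistics import mean

# Assumption: 500,000 cells were added per benchmark control.
BENCHMARK_CELLS = 500_000


def cells_per_read(benchmark_reads, benchmark_cells=BENCHMARK_CELLS):
    """Return the cells-per-read scaling factor for one mouse.

    benchmark_reads: read tallies of the benchmark control barcodes
    (three per mouse in the protocol described above).
    """
    if not benchmark_reads:
        raise ValueError("at least one benchmark tally is required")
    mean_tally = mean(benchmark_reads)
    if mean_tally <= 0:
        raise ValueError("mean benchmark tally must be positive")
    return benchmark_cells / mean_tally


def absolute_cell_numbers(tumor_reads, benchmark_reads,
                          benchmark_cells=BENCHMARK_CELLS):
    """Convert per-tumor read tallies to estimated neoplastic cell numbers.

    tumor_reads: mapping of tumor barcode identifier -> read tally.
    """
    scale = cells_per_read(benchmark_reads, benchmark_cells)
    return {barcode: reads * scale for barcode, reads in tumor_reads.items()}


if __name__ == "__main__":
    # Hypothetical tallies, for illustration only.
    benchmarks = [900, 1000, 1100]
    tumors = {"tumor_A": 40, "tumor_B": 2}
    print(absolute_cell_numbers(tumors, benchmarks))
```

With these illustrative tallies, the mean benchmark tally is 1,000 reads, so each read corresponds to 500 cells and a tumor with 40 reads is estimated at 20,000 cells. The scaling factor is computed separately for each mouse because sequencing depth differs between samples.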