To find associations between TF targeting and promoter methylation or copy number variation status, we selected 76 melanoma CCLE cell lines and computed the significance of associations using ANOVA as implemented in the Python package statsmodels v0.13.2 [96]. Since we were mostly interested in finding strong associations and prominent regulatory hallmarks of melanoma, we discretized the input data. For copy number data, we considered a gene to be amplified if it had evidence of more than three copies in the genome and to be deleted if both copies were lost. For promoter methylation data, promoters were defined in CCLE as the 1 kb region downstream of the gene’s transcriptional start site (TSS). We defined hypermethylated promoter sites as those whose methylation status had a z-score greater than three and hypomethylated sites as those whose methylation status had a z-score less than negative three. We only computed associations that had at least three positive instances of the explanatory variable (for example, for a given gene, at least three cell lines had a hypomethylated promoter for that gene) and corrected for multiple testing using the Benjamini-Hochberg procedure [97], retaining associations with a false discovery rate of less than 25%.
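The discretization, filtering, and multiple-testing steps described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function names are hypothetical, the copy number input is assumed to be an absolute copy number estimate (so that "both copies lost" corresponds to zero copies), and the methylation input is assumed to be already z-scored.

```python
import numpy as np

def discretize_copy_number(copies):
    """Return (amplified, deleted) boolean masks.

    Assumption: `copies` holds absolute copy number estimates per cell line.
    """
    copies = np.asarray(copies, dtype=float)
    amplified = copies > 3   # more than three copies
    deleted = copies == 0    # both copies lost
    return amplified, deleted

def discretize_methylation(z_scores):
    """Return (hyper, hypo) boolean masks from z-scored methylation values."""
    z = np.asarray(z_scores, dtype=float)
    return z > 3, z < -3

def has_enough_positives(mask, min_positive=3):
    """Keep a gene only if at least `min_positive` cell lines are positive."""
    return bool(np.sum(mask) >= min_positive)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted

# Associations would then be retained at FDR < 25%:
# significant = benjamini_hochberg(pvalues) < 0.25
```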
For each modality (promoter hypomethylation, promoter hypermethylation, gene amplification, and gene deletion) and for each gene, we built an ANOVA model across all melanoma cell lines, using TF targeting as the response variable and the status of that gene (either promoter methylation or copy number status) as the explanatory variable. For example, in modeling promoter hypermethylation, we chose positive instances to represent hypermethylated promoters and negative instances to represent nonmethylated promoters, along with an additional factor correcting for the cell lineage. Similarly, for copy number variation analysis, we chose positive instances to represent amplified genes and negative instances to represent nonamplified genes, while correcting for cell lineage.
To predict drug response using TF targeting, we fit a linear regression with elastic net regularization [45], as implemented in the Python package scikit-learn (sklearn) v1.1.3, using an equal weight of 0.5 for the L1 and L2 penalties. Regorafenib cell viability measured in melanoma cell lines was the response variable, and the targeting scores of 1,132 TFs (Table S5) were the explanatory variables.
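In scikit-learn, an equal weighting of the two penalties is most naturally expressed as `l1_ratio=0.5` in `ElasticNet`. The sketch below assumes that interpretation. The overall regularization strength `alpha` and the function name are placeholders, since the text does not report them.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def fit_drug_response(targeting, viability, alpha=1.0):
    """Fit an elastic net regression of drug response on TF targeting scores.

    targeting : array of shape (n_cell_lines, n_TFs), e.g. 1,132 TF columns
    viability : array of shape (n_cell_lines,), regorafenib cell viability
    alpha     : overall penalty strength (assumption; not given in the text)
    """
    model = ElasticNet(alpha=alpha, l1_ratio=0.5, max_iter=10000)
    model.fit(np.asarray(targeting, dtype=float),
              np.asarray(viability, dtype=float))
    return model
```

The nonzero entries of `model.coef_` would indicate the TFs whose targeting scores contribute to the predicted response.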
Finally, to model the epithelial-to-mesenchymal transition (EMT) in melanoma, we used MONSTER on two LIONESS networks of melanoma cell lines, one representing a primary tumor (DepMap ID: ACH-000580) as the initial state and the other a metastasis cell line (DepMap ID: ACH-001569) as the end state. Because the original implementation of MONSTER includes its own network reconstruction procedure, we modified it to accept any input network, such as LIONESS networks. MONSTER identifies TFs that are differentially involved in the transition by shuffling the columns of the initial and final state adjacency matrices 1,000 times to build a null distribution, which is then used to compute a standardized differential TF involvement score by scaling the observed scores by those of the null distribution.
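The permutation scheme can be illustrated with the simplified sketch below. It is not MONSTER's implementation. It assumes TF-by-gene adjacency matrices, estimates the transition matrix by ordinary least squares, scores each TF by the unnormalized sum of squared off-diagonal entries in its column, permutes the gene columns of each matrix independently, and standardizes as a z-score against the null. All function names are hypothetical.

```python
import numpy as np

def transition_matrix(initial, final):
    """Least-squares T (TF x TF) such that initial.T @ T approximates final.T."""
    T, *_ = np.linalg.lstsq(initial.T, final.T, rcond=None)
    return T

def off_diagonal_mass(T):
    """Sum of squared off-diagonal entries in each column of T (one per TF)."""
    off = T - np.diag(np.diag(T))
    return np.sum(off ** 2, axis=0)

def standardized_dtfi(initial, final, n_perm=1000, seed=0):
    """Standardized differential TF involvement from a column-shuffling null.

    initial, final : arrays of shape (n_TFs, n_genes) for the two states.
    """
    rng = np.random.default_rng(seed)
    observed = off_diagonal_mass(transition_matrix(initial, final))
    n_genes = initial.shape[1]
    null = np.empty((n_perm, observed.size))
    for i in range(n_perm):
        # Shuffle gene columns of each state independently (assumption).
        a = initial[:, rng.permutation(n_genes)]
        b = final[:, rng.permutation(n_genes)]
        null[i] = off_diagonal_mass(transition_matrix(a, b))
    # Scale the observed scores by the mean and spread of the null scores.
    return (observed - null.mean(axis=0)) / null.std(axis=0, ddof=1)
```

TFs with large positive standardized scores would be those whose targeting changes more between the two states than expected under the shuffled null.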