All analyses of the microarray and RNA-seq datasets were performed in the R 2.15.0 statistical environment. Samples with missing survival data were excluded. Hazard ratios (HR) with 95% confidence intervals (CI) and log-rank p values were calculated. We used the “survival” R package v2.38 (http://CRAN.R-project.org/package=survival/) for Cox regression analysis and the “survplot” R package v0.0.7 (http://www.cbs.dtu.dk/~eklund/survplot/) for generating Kaplan-Meier plots. To divide patients into high and low expression groups, we tested each percentile of miRNA expression between the lower and upper quartiles as a candidate cutoff, as described previously [45]. Cutoffs outside this range were not considered because, given the low sample number, they could yield unreliable results. Cox regression analysis was then performed separately for each candidate cutoff, and the cutoff with the lowest p value was used as the final cutoff in the Kaplan-Meier analysis. A plot of cutoff values versus p values was generated to display the overall performance of the selected miRNAs.
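The cutoff scan described above can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' code: the original analysis used Cox regression from the R “survival” package, whereas this sketch substitutes a two-group log-rank test so that it needs only the standard library. The function names (`logrank_p`, `percentile`, `scan_cutoffs`, `best_cutoff`), the linear-interpolation percentile, and the rule that values strictly above the cutoff count as “high” are all assumptions made for illustration.

```python
import math


def logrank_p(times, events, in_high):
    """Two-group log-rank test. Returns (chi-square statistic, p value).

    times   : follow-up time per patient
    events  : 1/True if the event was observed, 0/False if censored
    in_high : True if the patient is in the high expression group
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for ti in times if ti >= t)  # at risk, both groups
        n1 = sum(1 for ti, g in zip(times, in_high) if ti >= t and g)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, in_high)
                 if ti == t and e and g)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def percentile(sorted_values, q):
    """q-th percentile (0-100) by linear interpolation (an assumed convention)."""
    pos = (len(sorted_values) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def scan_cutoffs(expression, times, events):
    """Test every percentile from the lower to the upper quartile as a cutoff."""
    ordered = sorted(expression)
    results = []
    for q in range(25, 76):
        cutoff = percentile(ordered, q)
        in_high = [x > cutoff for x in expression]  # assumption: strict ">"
        if all(in_high) or not any(in_high):
            continue  # skip cutoffs that leave one group empty
        _, p = logrank_p(times, events, in_high)
        results.append((q, cutoff, p))
    return results


def best_cutoff(results):
    """Return the (percentile, cutoff, p value) tuple with the lowest p value."""
    return min(results, key=lambda r: r[2])


if __name__ == "__main__":
    # Synthetic data for illustration only: higher expression, shorter survival.
    expression = list(range(1, 21))
    times = [21 - x for x in expression]
    events = [1] * 20
    print(best_cutoff(scan_cutoffs(expression, times, events)))
```

The list returned by `scan_cutoffs` holds one (percentile, cutoff, p value) tuple per candidate, which is the information needed for a cutoff versus p value plot. Because the lowest of many p values is selected, that p value is optimistically biased and is best read alongside the full scan rather than in isolation.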
In addition, a multivariate survival analysis was performed for the TCGA dataset, with stage and gender included as clinical covariates alongside the miRNAs. To show the distribution of miRNA expression in the high and low expression groups, we drew one-dimensional scatter plots using the “beeswarm” R package (http://www.cbs.dtu.dk/~eklund/beeswarm/). Correlation between different miRNAs was assessed by computing Spearman rank correlations.
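For readers unfamiliar with the Spearman rank correlation, the following Python sketch shows one way to compute it: rank both variables (tied values share the average rank) and take the Pearson correlation of the ranks. It is an illustration under these assumptions, not the authors' R code, and the function names `average_ranks` and `spearman_rho` are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)
```

Because only ranks enter the calculation, any monotonic relationship between two miRNAs gives a coefficient of 1 or -1 regardless of its shape, which makes the measure robust to the skewed distributions typical of expression data.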