IDAT files were loaded into the R (2.14) environment using the Bioconductor (2.9) minfi package (1.0.0) [25]. Detection P-values for all probes were then calculated using functionality provided in minfi. Probes on the X and Y chromosomes were removed at this stage. Two versions of the data were used in subsequent analyses: the raw data and the SWAN-normalized data. Probes with a detection P-value >0.01 in one or more samples were then excluded. The differential methylation analysis was performed for both datasets on the subset of 18,678 probes that overlapped with the RRBS data, using the minfi function 'dmpFinder'. The 'dmpFinder' function uses an F-test to identify positions that are differentially methylated between two groups. The tests are performed on M-values (log2(Methylated/Unmethylated)), as recommended in Du et al. [35]. Variance shrinkage was used because of the small sample size: in 'dmpFinder', the sample variances are squeezed by computing empirical Bayes posterior means using the limma package [36]. Example R code for performing a differential methylation analysis using minfi can be found in Additional file 2.
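As a rough illustration of two steps described above, the following Python sketch computes an M-value from methylated and unmethylated intensities and drops probes whose detection P-value exceeds 0.01 in one or more samples. It is not the minfi implementation (the analysis itself was done in R); the function names, probe IDs, and numbers are hypothetical, and the sketch assumes strictly positive intensities.

```python
import math


def m_value(methylated, unmethylated):
    """M-value as defined in the text: log2(Methylated / Unmethylated).

    Assumes both intensities are strictly positive (an assumption of this
    sketch, not a statement about how minfi handles zeros).
    """
    return math.log2(methylated / unmethylated)


def passing_probes(detection_p, threshold=0.01):
    """Return probe IDs whose detection P-value is <= threshold in every sample.

    detection_p maps a probe ID to a list of per-sample detection P-values.
    A probe is excluded if any single sample exceeds the threshold.
    """
    return [
        probe
        for probe, pvals in detection_p.items()
        if all(p <= threshold for p in pvals)
    ]


# Hypothetical detection P-values for three probes across three samples.
detection_p = {
    "probe_a": [0.001, 0.0005, 0.002],
    "probe_b": [0.001, 0.05, 0.002],  # exceeds 0.01 in one sample
    "probe_c": [0.0, 0.0, 0.0],
}

kept = passing_probes(detection_p)

# Equal intensities give M = 0; higher methylated signal gives M > 0.
m_equal = m_value(1000.0, 1000.0)
m_high = m_value(4000.0, 1000.0)
```

Here `probe_b` is dropped because a single failing sample is enough to exclude a probe, which matches the "one or more samples" rule in the text.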
True positives were defined to be CpGs that had an absolute difference in β value >0.25 between the kidney and rectum RRBS samples. Additionally, for the ROC analysis, which was performed using the ROCR package [31], true negatives were defined as those CpGs found to have an absolute difference in β value <0.05 between the RRBS samples.
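The labelling rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the β values are hypothetical, and CpGs whose absolute difference falls between the two cutoffs are assumed here to receive no label, since the text assigns them to neither class.

```python
def label_cpg(beta_kidney, beta_rectum, pos_cutoff=0.25, neg_cutoff=0.05):
    """Label a CpG from its RRBS beta values in the two tissues.

    Returns "positive" if |difference| > pos_cutoff, "negative" if
    |difference| < neg_cutoff, and None otherwise (assumed unlabelled).
    """
    diff = abs(beta_kidney - beta_rectum)
    if diff > pos_cutoff:
        return "positive"
    if diff < neg_cutoff:
        return "negative"
    return None


# Hypothetical (kidney, rectum) beta values for three CpGs.
examples = {
    "cpg_1": (0.90, 0.50),  # difference 0.40
    "cpg_2": (0.50, 0.52),  # difference 0.02
    "cpg_3": (0.50, 0.40),  # difference 0.10, between the cutoffs
}

labels = {cpg: label_cpg(k, r) for cpg, (k, r) in examples.items()}
```

Leaving a gap between the two cutoffs means that CpGs with intermediate differences contribute to neither class under this reading of the rule.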