All analysis of Illumina microarray data was performed in R (42). The analysis of the GEO data set is described in more detail in (48). Because the arrays in this data set were generated from a diverse set of tissues and processed using different normalization methods, intensities could not be compared directly between arrays. Instead, the intensities on each array were ranked to capture the relative expression levels of the probes. The average rank of each bead type across the entire data set was then calculated and used to assess differences between the annotation categories.
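The rank-based comparison can be illustrated with a minimal sketch. It is written in Python for illustration only (the original analysis was performed in R), and the function names, the toy intensities and the choice to give tied intensities their average rank are all assumptions rather than details taken from the study.

```python
def rank_within_array(intensities):
    """Return 1-based ranks for one array; tied values share their average rank.

    Assumption: the tie-handling rule is illustrative, not taken from the study.
    """
    order = sorted(range(len(intensities)), key=lambda i: intensities[i])
    ranks = [0.0] * len(intensities)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of values tied with the value at `pos`.
        end = pos
        while (end + 1 < len(order)
               and intensities[order[end + 1]] == intensities[order[pos]]):
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def average_rank_per_probe(arrays):
    """arrays: dict of array name -> intensities, all in the same probe order."""
    per_array_ranks = [rank_within_array(values) for values in arrays.values()]
    n_arrays = len(per_array_ranks)
    n_probes = len(per_array_ranks[0])
    return [sum(r[p] for r in per_array_ranks) / n_arrays
            for p in range(n_probes)]


# Hypothetical arrays on very different intensity scales.
toy = {
    "array_A": [120.0, 8500.0, 95.0, 430.0],
    "array_B": [7.1, 12.9, 7.1, 9.4],
}
mean_ranks = average_rank_per_probe(toy)
```

Because only the within-array ordering is used, the two toy arrays contribute equally to the average even though their raw scales differ by orders of magnitude.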
Two separate analyses were performed on the MAQC data. First, MAQC samples run on different Illumina platforms were compared. Normalized data for the V1 and V2 platforms were taken directly from the published summarized values, whereas the V3 data were processed with BASH (49), summarized using beadarray (50) and then median normalized. This allowed us to examine how particular probes of interest changed across versions of the annotation.
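The text does not specify the exact form of median normalization used. One common variant, sketched here in Python under the assumption that the values are already on a log scale, shifts each array so that its median equals the mean of all array medians. The function name and toy values are hypothetical.

```python
from statistics import median


def median_normalize(arrays):
    """Shift each array so that all arrays share a common median.

    Assumption: values are on a log scale, so an additive shift is appropriate;
    the common target is the mean of the per-array medians.
    """
    medians = {name: median(values) for name, values in arrays.items()}
    target = sum(medians.values()) / len(medians)
    return {
        name: [v - medians[name] + target for v in values]
        for name, values in arrays.items()
    }


# Hypothetical log-scale summarized intensities for two samples.
toy = {"s1": [6.0, 8.0, 10.0], "s2": [7.0, 9.0, 14.0]}
normalized = median_normalize(toy)
```

After the shift, every array has the same median, while differences between probes within an array are unchanged.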
Second, the MAQC data generated using Human WG-6 V1 BeadArrays were analyzed to examine the interaction between filtering, differential expression and the annotation categories. Non-normalized MAQC V1 data were read into R and a series of different filtering approaches (Supplementary Data) were applied to all probes. A differential expression analysis was also performed on non-filtered, quantile-normalized data. The limma (51) package was used to find differentially expressed genes, and the log-odds scores given by empirical Bayes moderation of variances (52) were used to rank probes.
For the Trisomy study, the data were summarized, quantile-normalized and log2-transformed using beadarray (50). Differential expression was again quantified by the log-odds after empirical Bayes moderation of variances.
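As a rough illustration of the quantile normalization and log2 transformation steps, the sketch below uses plain Python rather than beadarray. The function name, the toy data and the order of the two steps (normalize, then transform, following the order of the sentence above) are assumptions, and ties are resolved by sort order rather than averaged.

```python
import math


def quantile_normalize(columns):
    """Force every array (column) to share the same empirical distribution.

    columns: list of equal-length lists, one per array.
    Simplification: ties within a column are resolved by sort order.
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean across arrays at each rank position.
    reference = [sum(col[i] for col in sorted_cols) / len(columns)
                 for i in range(n)]
    result = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        result.append(out)
    return result


# Hypothetical summarized intensities for two arrays and three probes.
raw = [[2.0, 16.0, 4.0], [8.0, 64.0, 32.0]]
qn = quantile_normalize(raw)
log2_qn = [[math.log2(v) for v in col] for col in qn]
```

In this toy case both arrays order their probes identically, so they become identical after normalization; with real data only the distributions, not the individual values, are made to agree.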
For DASL, data were analyzed and summarized using default BeadStudio settings to provide a 5 × 1506 matrix of observations.