Gene expression profiles were combined with pimonidazole data in a two-step procedure described previously (Halle et al, 2012). First, an exploratory, unsupervised analysis was performed. The investigation cohort was split into a pimonidazole-negative group (immunoscore <2, n=21) and a pimonidazole-positive group (immunoscore ⩾2, n=18), and the Linear Models for Microarray Data software was applied to find genes differentially expressed between the groups. A nominal P-value of 0.05 was used as the cutoff, yielding about 1000 genes, a number considered appropriate for further analysis. Biological processes enriched in the pimonidazole-positive group were analysed using the DAVID gene ontology (GO) software (Huang et al, 2009), where a false discovery rate of <10% (q<0.1) was considered significant.
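The grouping and the q<0.1 filter in this first step can be illustrated with a minimal Python sketch. It is not the software used in the study: the function names, sample IDs and P-values are hypothetical, and the Benjamini-Hochberg adjustment is an assumption that may differ from the false discovery rate estimate reported by DAVID.

```python
# Illustrative sketch only; all names and numbers below are hypothetical.

def split_by_immunoscore(scores, threshold=2):
    """Return (negative, positive) sample IDs; positive means score >= threshold."""
    negative = [sample for sample, score in scores.items() if score < threshold]
    positive = [sample for sample, score in scores.items() if score >= threshold]
    return negative, positive


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical immunoscores for four samples.
scores = {"P01": 0, "P02": 3, "P03": 1, "P04": 2}
negative, positive = split_by_immunoscore(scores)

# Hypothetical enrichment P-values for four GO terms.
term_pvalues = [0.01, 0.04, 0.03, 0.20]
qvalues = benjamini_hochberg(term_pvalues)
significant = [i for i, q in enumerate(qvalues) if q < 0.1]
```

The sketch covers only the sample grouping and the multiple-testing filter; the moderated statistics computed by the Linear Models for Microarray Data software are not reproduced here.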
Second, a supervised gene set enrichment analysis was performed on 21 gene sets covering the significant biological processes from the GO analysis, using the Significance Analysis of Microarrays for Gene Sets (SAM-GS) software, which is based on the moderated t-statistics in SAM (Dinu et al, 2007). All gene sets were collected from the Molecular Signatures Database, with five exceptions: a prostate cancer-specific hypoxia gene set constructed in this work; two hypoxia gene sets constructed in head and neck cancer (Toustrup et al, 2011) and cervical cancer (Halle et al, 2012); and two target gene sets, one for hypoxia-inducible factor 1 (HIF1) (Ragnum et al, 2013) and one for the androgen receptor (AR) (Massie et al, 2011). The prostate cancer-specific hypoxia gene set was generated from the expression data of four prostate cancer cell lines and included genes with more than two-fold upregulation under hypoxia in at least two cell lines (Supplementary Table S2).
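The selection rule for the prostate cancer-specific hypoxia gene set can be sketched as follows. This is a minimal illustration, assuming that fold changes are given as linear hypoxia-to-normoxia expression ratios (not log2 values); the function name, gene labels and ratios are hypothetical placeholders, not data from the study.

```python
# Illustrative sketch only; gene labels and ratios are hypothetical.

def select_upregulated_genes(fold_changes, min_fold=2.0, min_lines=2):
    """Return genes with a fold change above min_fold in at least min_lines cell lines."""
    selected = []
    for gene, ratios in fold_changes.items():
        # Count the cell lines in which the gene exceeds the fold-change threshold.
        n_up = sum(1 for ratio in ratios if ratio > min_fold)
        if n_up >= min_lines:
            selected.append(gene)
    return sorted(selected)


# One hypoxia/normoxia ratio per cell line (four cell lines per gene).
fold_changes = {
    "GENE_A": [3.1, 2.4, 1.2, 0.9],  # above 2 in two cell lines
    "GENE_B": [2.5, 1.1, 1.0, 1.3],  # above 2 in only one cell line
    "GENE_C": [2.0, 2.0, 2.0, 2.0],  # exactly 2 is not "more than two-fold"
    "GENE_D": [4.0, 2.2, 2.9, 5.1],  # above 2 in all four cell lines
}

hypoxia_gene_set = select_upregulated_genes(fold_changes)
```

With these placeholder values, only GENE_A and GENE_D satisfy the criterion; a gene at exactly two-fold is excluded because the rule requires more than two-fold upregulation.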