The study was approved by the UCLA Institutional Review Board, and all subjects signed informed consent.
Monozygotic twin pairs discordant for sexual orientation were recruited through the study website, online advertisements and press coverage. Male and female control subjects were recruited using fliers. There were no significant differences in racial composition between the sample sets or age groups. Saliva was collected using Oragene DNA collection kits (Genotek). The majority (up to 74%) of the DNA in saliva collected with this method typically comes from white blood cells, with the remainder coming from buccal epithelial cells [21]. Genomic DNA was prepared according to the manufacturer's protocol. Zygosity was determined using 9 microsatellite markers. Microarray hybridization was performed by the Southern California Genotyping Consortium at UCLA. 500 ng of genomic DNA was bisulfite converted using the EZ-methylation kit (Zymo Research) and processed according to the Illumina Infinium whole genome genotyping protocol. Labeled samples were hybridized to Illumina HumanMethylation27 arrays and scanned (iScan reader, Illumina), and beta (methylation) values were extracted using GenomeStudio software. All array data are MIAME compliant, and the raw data have been deposited in NCBI's GEO, a MIAME compliant database as detailed on the MGED Society website (http://www.mged.org/Workgroups/MIAME/miame.html), under accession number GSE28746.
Analysis: A signed weighted correlation network was constructed as described [11], [22]. Module definition was based on the gene methylation status in saliva and ignored age. As module representative, we used the module eigenlocus (ME), which is defined as the first principal component of the module methylation profiles and can be considered a weighted average. To incorporate age into the network analysis, the Student t-test statistic for correlating age with methylation status was used. Lasso penalized regression was performed using the R package ‘penalized’ [14]. All statistical analyses and data processing were performed using the statistical package R version 2.11.1 [23]. PCR reactions for amplification, MassARRAY and pyrosequencing analysis were performed using Sahara and Bio-X-ACT Long enzymes (Bioline). PCR primers and conditions are listed in Methods S1.
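As an illustration of the two summary quantities described above, the following is a minimal sketch in Python with NumPy, not the R code used in the study. It assumes that each locus is standardized before the principal component is extracted and that the sign of the eigenlocus is oriented to agree with the mean module profile; both choices are assumptions, as are the function names `module_eigenlocus` and `correlation_t_statistic` and the simulated data. The t statistic is the standard Student statistic for testing a Pearson correlation, t = r * sqrt((n - 2) / (1 - r^2)).

```python
import numpy as np


def module_eigenlocus(beta):
    """First principal component of a module's methylation profiles.

    beta: array of shape (n_samples, n_loci) holding beta values.
    Returns a unit-length vector with one entry per sample.
    """
    beta = np.asarray(beta, dtype=float)
    # Standardize each locus (assumption: loci have non-zero variance).
    z = (beta - beta.mean(axis=0)) / beta.std(axis=0, ddof=1)
    # The left singular vectors of the standardized matrix are the
    # per-sample principal component scores, scaled to unit length.
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    me = u[:, 0]
    # The sign of a principal component is arbitrary; orient it so that
    # it correlates positively with the mean profile (assumption).
    if np.corrcoef(me, z.mean(axis=1))[0, 1] < 0:
        me = -me
    return me


def correlation_t_statistic(x, y):
    """Student t statistic for the Pearson correlation of x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    r = np.corrcoef(x, y)[0, 1]
    return r * np.sqrt((n - 2) / (1.0 - r ** 2))


# Simulated example (hypothetical data, not from the study):
# 30 samples, 5 loci whose methylation rises slightly with age.
rng = np.random.default_rng(0)
age = np.linspace(20, 70, 30)
beta = 0.3 + 0.004 * age[:, None] + rng.normal(0, 0.02, size=(30, 5))

me = module_eigenlocus(beta)
t_stat = correlation_t_statistic(age, me)
```

The same `correlation_t_statistic` function can be applied to a single CpG site by passing its column of beta values in place of the eigenlocus.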