The marijuana strains genotyped in this study were provided by author DH (grown by Health Canada authorized producers) and represent germplasm grown and used for breeding in the medical and recreational marijuana industries (S2 Table). Hemp strains were provided by author JV (Health Canada hemp cultivation licensee) and represent modern seed and fibre cultivars grown in Canada as well as diverse European and Asian germplasm (S3 Table). DNA was extracted from hemp leaf tissue using a Qiagen DNeasy Plant Mini Kit, and from marijuana leaves using a Macherey-Nagel NucleoSpin 96 Plant II kit with vacuum manifold processing. Library preparation and sequencing were performed using the genotyping-by-sequencing (GBS) protocol published by Sonah et al. [15]. The raw sequence data have been deposited in the NIH Sequence Read Archive (SRA) under BioProject PRJNA285813. SNPs with a read depth of 10 or more were called using the GBS pipeline developed by Gardner et al. [16], aligning to the canSat3 C. sativa reference genome assembly [3]. Quality filtering of genetic markers was performed in PLINK 1.07 [17] by removing SNPs with (i) greater than 20% missingness by locus, (ii) a minor allele frequency less than 1%, and (iii) excess heterozygosity (a Hardy-Weinberg equilibrium p-value less than 0.0001). After filtering, 14,031 SNPs remained for analysis.
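The three marker filters can be illustrated with a minimal Python sketch. This is not the PLINK implementation used in the study: the function names and the genotype encoding (0, 1, or 2 copies of the alternate allele, `None` for a missing call) are assumptions made for illustration, and the Hardy-Weinberg p-value is approximated here with a one-degree-of-freedom chi-square test, which may differ from the test PLINK applies. Only the thresholds (20% missingness, 1% minor allele frequency, p < 0.0001) come from the text.

```python
import math


def hwe_chisq_pvalue(n_ref, n_het, n_alt):
    """Chi-square (1 df) p-value for deviation from Hardy-Weinberg proportions.

    Illustrative approximation only; not necessarily the test PLINK uses.
    """
    n = n_ref + n_het + n_alt
    p_alt = (2 * n_alt + n_het) / (2 * n)
    p_ref = 1.0 - p_alt
    if p_alt == 0.0 or p_ref == 0.0:
        return 1.0  # monomorphic site: no deviation can be tested
    expected = (n * p_ref ** 2, 2 * n * p_ref * p_alt, n * p_alt ** 2)
    observed = (n_ref, n_het, n_alt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def snp_passes_qc(genotypes, max_missing=0.20, min_maf=0.01, min_hwe_p=1e-4):
    """Return True if one SNP survives the three filters described above.

    genotypes: per-sample alternate-allele counts (0, 1, 2) or None if missing.
    """
    n_total = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    # (i) remove SNPs with more than 20% missing calls
    if (n_total - len(called)) / n_total > max_missing:
        return False
    n_ref, n_het, n_alt = called.count(0), called.count(1), called.count(2)
    # (ii) remove SNPs with minor allele frequency below 1%
    p_alt = (2 * n_alt + n_het) / (2 * len(called))
    if min(p_alt, 1.0 - p_alt) < min_maf:
        return False
    # (iii) remove SNPs whose HWE p-value falls below 0.0001
    return hwe_chisq_pvalue(n_ref, n_het, n_alt) >= min_hwe_p


# Hypothetical genotype vectors, one per SNP, for 100 samples each.
near_hwe = [0] * 50 + [1] * 40 + [2] * 10
all_heterozygous = [1] * 100
print(snp_passes_qc(near_hwe), snp_passes_qc(all_heterozygous))
```

In this sketch a site where every sample is heterozygous is rejected by the third filter, which is the pattern the excess-heterozygosity criterion is meant to catch (for example, reads from paralogous regions collapsing onto one locus).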