The heritability of seed protein and oil concentration was defined as
where is the genetic variance among accessions, and is the environmental variance which results from error and the accession × location interaction.
To obtain the matrix of population structure, a total of 42,368 SNPs were analyzed in the 298 germplasm accessions using the Admixture program v. 1.22 [58 (link)]. The 10-fold cross-validation procedure was performed with 25 random seeding replications for K values from 2 to 30. The minimum mean standard error was when K = 17. The kinship coefficient matrix that explained the most probable identity by state of each allele between individuals was estimated with the TASSEL program [59 (link)]. For a genome-wide association study, we compared the false positive rate using the general linear model (GLM), the mixed linear model (MLM), and the compressed MLM of the TASSEL program [59 (link)]. The MLM was as good as the compressed MLM and greatly reduced the false positive rate versus the GLM. For this study, the value of 0.001 was used as a Type I error significance threshold P value. As a verification of the genome regions identified in this study we compared the genomic locations of previously reported seed protein and oil QTL with the physical positions of the markers showing significant associations in this study.