The IGs in the nine species were identified from the General Feature Format Version 3 (GFF3) annotation files. First, genes with at least one “CDS” line were extracted from each GFF3 file, and redundant sequences representing the same locus were excluded. Genes with only one “exon” line were then extracted from each genome as candidate sequences for further analysis; a single “exon” line was taken to mean that the coding sequence lacks introns, and the gene was designated as intronless. Because mitochondrial and chloroplast DNA do not contain introns, genes labeled “MT” and “PT” were deleted. Genes that were not mapped to chromosomes were also eliminated. To ensure that IGs were accurately identified, all candidate genes were verified using the SMART online tool (http://smart.embl-heidelberg.de). Finally, a non-redundant IG data set for the nine Poaceae species was generated. After the IGs were excluded, the remaining genes were considered potential MEGs. For each of these genes, the longest coding sequence was selected as the representative transcript to generate the MEG data sets used in subsequent analyses. The number of introns in each coding gene was extracted from the GFF3 file using a Python script (https://github.com/irusri/Extract-intron-from-gff3).
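The exon-counting step described above can be illustrated with a minimal sketch. It is not the authors' script or the script linked above. The function names `parse_attributes` and `classify_genes` are hypothetical. The sketch assumes a gene → mRNA → exon/CDS hierarchy linked through `ID` and `Parent` attributes, a user-supplied set of chromosome names, and that a gene counts as a candidate IG only when every CDS-bearing transcript has a single exon. Removal of redundant loci and SMART verification are not covered.

```python
from collections import defaultdict


def parse_attributes(field):
    """Turn a GFF3 column-9 string such as 'ID=t1;Parent=g1' into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key] = value
    return attrs


def classify_genes(gff3_lines, chromosomes, organelles=("MT", "PT")):
    """Split coding genes into candidate intronless and multi-exon sets.

    Assumptions (not taken from the source): features on sequences outside
    `chromosomes` are treated as unplaced and skipped, and the number of
    introns in a transcript is its number of exon lines minus one.
    """
    transcript_gene = {}            # transcript ID -> parent gene ID
    exon_count = defaultdict(int)   # transcript ID -> number of exon lines
    has_cds = set()                 # transcript IDs with at least one CDS line

    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue
        seqid, ftype = cols[0], cols[2]
        # Skip organellar genes and genes not mapped to chromosomes.
        if seqid in organelles or seqid not in chromosomes:
            continue
        attrs = parse_attributes(cols[8])
        parents = [p for p in attrs.get("Parent", "").split(",") if p]
        if ftype in ("mRNA", "transcript") and "ID" in attrs and parents:
            transcript_gene[attrs["ID"]] = parents[0]
        elif ftype == "exon":
            for parent in parents:
                exon_count[parent] += 1
        elif ftype == "CDS":
            has_cds.update(parents)

    # Collect intron counts per gene, using CDS-bearing transcripts only.
    introns_per_gene = defaultdict(list)
    for tx in has_cds:
        gene = transcript_gene.get(tx)
        if gene is not None and exon_count[tx] > 0:
            introns_per_gene[gene].append(exon_count[tx] - 1)

    intronless = {g for g, counts in introns_per_gene.items() if max(counts) == 0}
    multi_exon = set(introns_per_gene) - intronless
    return intronless, multi_exon
```

Calling `classify_genes(open(path), chromosomes={"1", "2", "3"})` on an annotation file returns the two sets of gene IDs. The chromosome names shown are placeholders and must match the first column of the GFF3 file in question.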