Orthologous gene pairs between two species were identified by combining the Best Reciprocal Hits (BRH) and OrthoMCL strategies. Coding sequences were aligned with PAL2NAL [58], guided by protein sequence alignments generated with MAFFT (linsi; version 7.045b) [59], and gaps in the alignments were removed. The gapless coding sequence alignments were used to calculate Ka/Ks ratios with the Bioinformatics Toolbox in MATLAB (MathWorks, Inc.) using a 50-codon sliding window. To identify positively selected sites, coding sequences from Arabidopsis, maize, rice and Agave were aligned with TranslatorX [60] using the standalone script. The HyPhy package was used to identify positively selected sites as described [61], and the FUBAR and REL model tests, as implemented in the Datamonkey webserver, were run with default settings [62]. Because a sliding window was used to study protein regions under positive selection, we calculated the probability of each Ka/Ks-positive region under the null hypothesis that Ka/Ks equals one using a one-sided t-test, as described by Schmid and Yang (2008) [63].
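As an illustration only, the gap removal and 50-codon windowing described above can be sketched in Python; this is not the MATLAB Bioinformatics Toolbox code used in the study. The function names (`strip_gap_codons`, `codon_windows`, `one_sample_t`) are hypothetical, the window step of one codon is an assumption because the step size is not stated, and the Ka/Ks estimator itself is left out (each window would be passed to whatever estimator is in use). The last helper computes only the t statistic for a set of Ka/Ks values against the null value of one; which values enter the test follows Schmid and Yang (2008) and is not reproduced here.

```python
from math import sqrt
from statistics import mean, stdev


def strip_gap_codons(seq_a, seq_b, gap="-"):
    """Drop every codon column in which either aligned sequence has a gap."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")
    kept_a, kept_b = [], []
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if gap in codon_a or gap in codon_b:
            continue  # discard the whole codon column
        kept_a.append(codon_a)
        kept_b.append(codon_b)
    return "".join(kept_a), "".join(kept_b)


def codon_windows(seq_a, seq_b, window=50, step=1):
    """Yield (start_codon, sub_a, sub_b) for every full window.

    step=1 is an assumption; the source only gives the window size.
    """
    n_codons = len(seq_a) // 3
    for start in range(0, n_codons - window + 1, step):
        lo, hi = start * 3, (start + window) * 3
        yield start, seq_a[lo:hi], seq_b[lo:hi]


def one_sample_t(values, mu0=1.0):
    """Return (t statistic, degrees of freedom) for H0: mean(values) == mu0.

    A one-sided p-value for mean > mu0 is the upper tail of Student's t
    distribution at this statistic; that step needs a statistics library.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    sd = stdev(values)
    if sd == 0:
        raise ValueError("zero variance: t statistic is undefined")
    return (mean(values) - mu0) / (sd / sqrt(n)), n - 1
```

In use, each pair of window subsequences from `codon_windows` would be handed to a Ka/Ks estimator, and windows shorter than the chosen size are simply not produced.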