When using ModelFinder, it is important to remember that it optimizes the likelihood of the tree and model, given the data, whenever it searches for the optimal values of the parameters considered. The search algorithms may therefore become trapped in local optima. To reduce the chance of this occurring, we strongly recommend that model selection be repeated many times for each data set, as noted above. Doing so may require much more computing time, especially when long, species-rich alignments are considered or the advanced search option of ModelFinder is used. When the alignment is very long, we therefore recommend the following strategies to reduce the time spent on model selection:

If the computational resources allow multi-threaded computing, invoke the -nt x option to spread the computation over x threads;

If the data are characters encoded by a specific type of genome (e.g., mitochondrial), invoke the -msub source option to limit the search to models appropriate for this type of data;

If the optimal model turns out to include the R10 model of RHAS (k = 10 being the largest value considered by default), we recommend that the analysis be rerun with both the -cmin x and -cmax y options invoked (e.g., -cmin 8 -cmax 20). Doing so will ensure that PDF models with k = 8, 9, …, 20 are considered (i.e., lower values of k are ignored). The program will stop when the optimal value of k has been found, even if this value turns out to be 10.

Use the default search option to find the optimal model of SE. Having identified this model, use the advanced search option with the optimal substitution model selected (e.g., -mset LG) to search for the optimal model of RHAS. While there is no guarantee that this approach will identify the optimal model of SE, our experience suggests that the choice of RHAS model is highly influenced by the topology of the tree while that of the substitution model is not.
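The strategies above can be combined in a two-pass analysis. The following Python sketch only assembles the corresponding command lines and executes nothing. The options -nt, -msub, -cmin, -cmax and -mset are those named above. The executable name iqtree, the alignment option -s, the use of -m MF to request model selection only, the use of -mtree for the advanced search option, the helper name modelfinder_command and the file name example.phy are assumptions made for illustration and should be checked against the installed version of the program.

```python
from typing import List, Optional


def modelfinder_command(
    alignment: str,
    threads: Optional[int] = None,
    source: Optional[str] = None,
    cmin: Optional[int] = None,
    cmax: Optional[int] = None,
    mset: Optional[str] = None,
    advanced: bool = False,
    executable: str = "iqtree",  # assumed executable name
) -> List[str]:
    """Build the argument list for one model-selection run (nothing is run)."""
    if cmin is not None and cmax is not None and cmin > cmax:
        raise ValueError("cmin must not exceed cmax")

    # Assumption: -s names the alignment and -m MF requests model selection only.
    cmd = [executable, "-s", alignment, "-m", "MF"]
    if threads is not None:
        cmd += ["-nt", str(threads)]    # spread the computation over threads
    if source is not None:
        cmd += ["-msub", source]        # restrict models by genome type
    if cmin is not None:
        cmd += ["-cmin", str(cmin)]     # smallest number of rate categories k
    if cmax is not None:
        cmd += ["-cmax", str(cmax)]     # largest number of rate categories k
    if mset is not None:
        cmd += ["-mset", mset]          # restrict the substitution models
    if advanced:
        cmd.append("-mtree")            # assumption: advanced search option
    return cmd


# Pass 1: default search for the optimal model of SE (hypothetical file name).
first_pass = modelfinder_command("example.phy", threads=4)

# Pass 2: advanced search for the RHAS model, with the substitution model
# fixed to the one found in pass 1 (LG here, as in the example above) and
# a wider range of k in case R10 was selected.
second_pass = modelfinder_command(
    "example.phy", threads=4, cmin=8, cmax=20, mset="LG", advanced=True
)

if __name__ == "__main__":
    print(" ".join(first_pass))
    print(" ".join(second_pass))
```

Each list can be passed unchanged to subprocess.run. Keeping the two passes as separate calls makes it easy to repeat either pass several times, as recommended above.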