We chose six species distribution models (SDMs): BIOCLIM, DOMAIN, MAHAL (Mahalanobis distance), RF (random forests), MAXENT (maximum entropy), and SVM (support vector machine). These six SDMs are widely used in academic research and species conservation [16], [17], [33], [34].
The BIOCLIM model uses the environmental data at all known occurrence points of a species to determine the range of climatic conditions suitable for its occurrence. The percentile distribution of each climatic variable across the grid cells in the species' distribution zone is used for multivariate analysis. If the values of all climatic variables in a grid cell fall within the boundaries appropriate for the species, the BIOCLIM model classifies that cell as suitable [8], [17].
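The envelope logic described above can be illustrated with a minimal sketch. It is a standalone Python illustration, not the dismo implementation used in the study; the function names, the 5th/95th percentile bounds, and the two example variables (temperature, precipitation) are all assumptions made for this example.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 100]."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def fit_envelope(occurrences, lower_q=5.0, upper_q=95.0):
    """Per-variable (lower, upper) bounds from the occurrence records.

    The percentile cut-offs are an assumption of this sketch, not values
    taken from the study.
    """
    n_vars = len(occurrences[0])
    envelope = []
    for j in range(n_vars):
        column = sorted(row[j] for row in occurrences)
        envelope.append((percentile(column, lower_q), percentile(column, upper_q)))
    return envelope


def is_suitable(site, envelope):
    """A site is suitable only if every variable lies inside its bounds."""
    return all(lo <= value <= hi for value, (lo, hi) in zip(site, envelope))


# Hypothetical occurrence records: (mean temperature, annual precipitation).
occurrences = [(10.0, 800.0), (12.0, 900.0), (14.0, 1000.0),
               (11.0, 850.0), (13.0, 950.0)]
envelope = fit_envelope(occurrences)
print(is_suitable((12.0, 900.0), envelope))
```

A single variable outside its bounds is enough to reject a site, which is why envelope models of this kind tend to be conservative.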
DOMAIN uses a point-to-point similarity metric based on the Gower distance, a method for building a distance matrix from a set of characteristics (here, the environmental variables at each site). DOMAIN assigns a habitat suitability index to each potential site based on its proximity in environmental space to the most similar recorded occurrence location [38]. A suitability threshold is then chosen to delimit the boundaries of the species' ecological niche.
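As a rough illustration of the nearest-occurrence idea, the sketch below computes a range-standardized Gower distance and turns it into a similarity score. It is a simplified Python example under stated assumptions, not the dismo code used in the study: the function names and the sample values are hypothetical, and variable ranges are taken from the occurrence records themselves.

```python
def variable_ranges(occurrences):
    """Range (max - min) of each variable across the occurrence records."""
    n_vars = len(occurrences[0])
    ranges = []
    for j in range(n_vars):
        column = [row[j] for row in occurrences]
        span = max(column) - min(column)
        ranges.append(span if span > 0 else 1.0)  # guard against division by zero
    return ranges


def gower_distance(a, b, ranges):
    """Mean of the range-standardized absolute differences between two points."""
    return sum(abs(x - y) / r for x, y, r in zip(a, b, ranges)) / len(ranges)


def domain_suitability(site, occurrences, ranges):
    """Similarity (1 - distance) to the closest occurrence in environmental space."""
    nearest = min(gower_distance(site, occ, ranges) for occ in occurrences)
    return 1.0 - nearest


# Hypothetical occurrence records: (mean temperature, annual precipitation).
occurrences = [(10.0, 800.0), (14.0, 1000.0)]
ranges = variable_ranges(occurrences)
score = domain_suitability((12.0, 900.0), occurrences, ranges)
print(score)
```

In this sketch a site identical to an occurrence record scores 1, and sites far outside the sampled conditions can score below 0. A cut-off on the score (its value is a modeling choice) then separates suitable from unsuitable sites.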
The MAHAL model is based on the Mahalanobis distance (MD). MD accounts for the correlations among the variables in the data set and does not depend on the scale of measurement. The method ranks potential sites by their MD to a vector expressing the mean values of all recorded environmental factors. A chosen distance threshold then acts as the boundary of the ecological niche. This algorithm generates an elliptical envelope that explicitly accounts for the possible interrelations among these environmental factors [21].
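The distance described above can be sketched in a few lines. This is a standard-library Python illustration, not the dismo routine used in the study: the helper names are hypothetical, the covariance is the usual sample covariance of the occurrence records, and the linear system is solved by plain Gaussian elimination to avoid external dependencies.

```python
import math


def mean_vector(rows):
    """Mean of each environmental variable across the occurrence records."""
    n = len(rows)
    return [sum(row[j] for row in rows) / n for j in range(len(rows[0]))]


def covariance_matrix(rows, mean):
    """Sample covariance matrix (divides by n - 1)."""
    n, k = len(rows), len(mean)
    cov = [[0.0] * k for _ in range(k)]
    for row in rows:
        d = [row[j] - mean[j] for j in range(k)]
        for i in range(k):
            for j in range(k):
                cov[i][j] += d[i] * d[j]
    return [[c / (n - 1) for c in line] for line in cov]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    k = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * k
    for i in reversed(range(k)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, k))
        x[i] = (aug[i][k] - s) / aug[i][i]
    return x


def mahalanobis(site, mean, cov):
    """sqrt(d^T C^-1 d), where d is the site's deviation from the mean vector."""
    d = [s - m for s, m in zip(site, mean)]
    squared = sum(di * xi for di, xi in zip(d, solve(cov, d)))
    return math.sqrt(max(0.0, squared))


# Hypothetical occurrence records with two environmental variables.
records = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
center = mean_vector(records)
cov = covariance_matrix(records, center)
print(mahalanobis((3.0, 1.0), center, cov))
```

Because the deviation is weighted by the inverse covariance, rescaling a variable (for example, from metres to centimetres) leaves the distance unchanged. Sites at equal distance from the mean lie on an ellipse, which is the elliptical envelope mentioned above.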
RF, an ensemble of classification and regression trees, is a combination of tree predictors in which each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest [39], [40].
MAXENT is a machine-learning algorithm built on the principle of maximum entropy: in the absence of ecological constraints, a species is assumed to spread as far as possible, with a distribution as close as possible to uniform [41].
SVM is a machine-learning method that belongs to a family of generalized linear classifiers. It is grounded in Vapnik-Chervonenkis (VC) dimension theory and the principle of structural risk minimization [42]. The SVM model seeks the most reasonable trade-off between species adaptability and model complexity, yielding the most likely distribution from the limited sample information [43].
Each SDM was run strictly according to its modeling technique and with the same 13 environmental variables. The modeling data, advantages, and disadvantages of each model are listed in Table S2. We used R as the computing platform and the dismo package to simulate species distributions [44], [45].