Primary occurrence records for I. ricinus were obtained from diverse sources. Data were drawn from the Global Biodiversity Information Facility (GBIF; www.gbif.org; ~2110 occurrence points), VectorMap (www.vectormap.org; ~1801 occurrence points), and the scientific literature [20] (~1195 points; S1 File). Sampling was concentrated in Great Britain and Germany thanks to surveillance by the European Vector Map Program of the European Centre for Disease Prevention and Control (ECDC; http://ecdc.europa.eu/en/healthtopics/vector/vector-maps/). The initial set of occurrence records was subjected to several data cleaning steps to reduce possible biases in calibrating ecological niche models (ENMs) [21]. (1) We discarded all records lacking geographic references and removed all duplicate records. (2) The data were further filtered by distance, so that only one record was retained per 10’ cell (~20 km) and all redundant records in the same cell were omitted. (3) Finally, we accounted for marked differences in sampling density across countries by balancing the density of occurrences on a country-by-country basis. We chose Spain as a reasonable intermediate-density reference point (6 occurrence records per 100,000 km²) to overcome problems associated with the oversampling or undersampling observed in some countries. Although this step discarded large numbers of data points, it removes large-scale spatial biases and allows a better estimation of niche characteristics [22].
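The cleaning steps above can be illustrated with a minimal Python sketch. It is not the authors' code: the record fields (`lon`, `lat`, `country`), the function names, the country-area lookup, and the use of random downsampling to reach the target density are all assumptions made for illustration. The sketch only caps oversampled countries at the reference density; how undersampled countries were treated is not stated in the text and is left untouched here.

```python
import math
import random
from collections import defaultdict

CELL_DEG = 10 / 60  # 10 arc-minutes expressed in decimal degrees


def thin_to_cells(records, cell_deg=CELL_DEG):
    """Drop records without coordinates and keep one record per grid cell."""
    seen = set()
    kept = []
    for rec in records:
        lon, lat = rec.get("lon"), rec.get("lat")
        if lon is None or lat is None:
            continue  # step 1: no usable geographic reference
        cell = (math.floor(lon / cell_deg), math.floor(lat / cell_deg))
        if cell not in seen:  # steps 1-2: duplicates and same-cell records
            seen.add(cell)
            kept.append(rec)
    return kept


def balance_by_country(records, area_km2, target_per_100k=6.0, seed=0):
    """Randomly downsample each country to at most the target density.

    area_km2 is a hypothetical {country: area} lookup supplied by the caller.
    """
    rng = random.Random(seed)
    by_country = defaultdict(list)
    for rec in records:
        by_country[rec["country"]].append(rec)
    balanced = []
    for country, recs in by_country.items():
        cap = max(1, round(target_per_100k * area_km2[country] / 100_000))
        if len(recs) > cap:
            recs = rng.sample(recs, cap)  # assumption: random selection
        balanced.extend(recs)
    return balanced
```

Thinning is applied before balancing so that the density cap is computed on spatially independent records rather than on clusters from a single locality.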
The final balanced dataset of I. ricinus included 416 occurrence points, which we split randomly five times into two equal-sized subsets of 208 points each: one subset for model calibration and the other for model evaluation (Fig 1). These five random splits provide replicate views of model results and give a better idea of the variation resulting from the availability of occurrence data.
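The replicate partitioning can be sketched as follows. This is an illustrative example only: the function name, the seed, and the use of integer indices in place of real occurrence records are assumptions, and the text does not say which software performed the split.

```python
import random


def replicate_splits(points, n_replicates=5, seed=42):
    """Return n_replicates independent random 50/50 splits of points.

    Each split is a (calibration, evaluation) pair of equal-sized lists.
    """
    rng = random.Random(seed)
    half = len(points) // 2
    splits = []
    for _ in range(n_replicates):
        shuffled = list(points)
        rng.shuffle(shuffled)
        splits.append((shuffled[:half], shuffled[half:]))
    return splits


# Stand-in for the 416 balanced occurrence points.
occurrences = list(range(416))
splits = replicate_splits(occurrences)
for calibration, evaluation in splits:
    print(len(calibration), len(evaluation))
```

Each replicate reshuffles the full dataset, so a given point may serve for calibration in one replicate and for evaluation in another.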
We obtained data on 19 “bioclimatic” variables from the WorldClim climate data version 1.4 [23], available via www.worldclim.org. These variables were derived from interpolation of average monthly temperature and rainfall data obtained from weather stations during 1950–2000. We removed variables 8–9 and 18–19 because of known spatial artefacts. We used the data layers at 10’ spatial resolution because of the continental extent of our models. We obtained parallel data layers for 17 general circulation models (GCMs; Table 1) for each representative concentration pathway (RCP) and each time period. We chose two representative concentration pathways, RCP 4.5 and RCP 8.5 (corresponding to lower and higher greenhouse gas emissions, respectively), and two time periods, 2050 and 2070, to account for possible climate change influences under both scenarios and at two time horizons. We used the diverse GCMs available from the WorldClim archive to estimate the future distributional potential of I. ricinus under each individual GCM, which was a key element in assessing the uncertainty in predictions that derives from GCM choice.
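The size of the resulting design can be checked with a short sketch. The GCM labels below are hypothetical placeholders (the actual models are listed in Table 1), and the `bio` naming of variables is an assumption used only for readability.

```python
from itertools import product

# Variables 8-9 and 18-19 are excluded because of known spatial artefacts.
EXCLUDED = {8, 9, 18, 19}
retained_vars = [f"bio{i}" for i in range(1, 20) if i not in EXCLUDED]

# Hypothetical placeholder labels standing in for the 17 GCMs of Table 1.
gcms = [f"GCM{i:02d}" for i in range(1, 18)]
rcps = ["RCP4.5", "RCP8.5"]
periods = [2050, 2070]

# One set of future climate layers per GCM x RCP x period combination.
scenarios = list(product(gcms, rcps, periods))

print(len(retained_vars))  # 15 retained bioclimatic variables
print(len(scenarios))      # 68 future scenario combinations
```

With 17 GCMs, 2 RCPs, and 2 periods, the models are transferred to 68 sets of future layers, each built from the 15 retained variables.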