The kangaroo apple analysis was conducted using the phylogeny and sample site information provided by [15]. Map data of Australia were obtained using MapMaker, a companion program to GenGIS that allows custom georeferenced maps to be derived from the digital map data provided by Natural Earth (http://www.naturalearthdata.com/). These data files are provided in the Supplemental Information.
Body site data from [25] were obtained from the DNA Data Bank of Japan (ERA000159). The source data consisted of FASTQ files containing amplicons of variable region 2 (V2) of the 16S ribosomal RNA gene. We used version 7 of the RDP classifier [31] as implemented in mothur 1.16.1 [32] to assign taxonomy to all 16S sequences in this dataset. The resulting taxon counts were used to generate visual summaries for each body site. Bray-Curtis distances were calculated for each pair of samples, and the resulting distance matrix was subjected to Unweighted Pair Group Method with Arithmetic Mean (UPGMA) clustering in mothur. The background image was modified from an original image obtained from Wikimedia Commons (http://commons.wikimedia.org/wiki/File:Human_body_silhouette.svg).
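As an illustration of the distance and clustering steps just described, the following is a minimal Python sketch, not the mothur implementation used in the analysis. It computes Bray-Curtis dissimilarities between per-sample taxon count vectors and then applies average-linkage (UPGMA) clustering. The sample names and counts are hypothetical, and the function names `bray_curtis` and `upgma` are introduced here for illustration only.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two equal-length count vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(a) + sum(b)
    return num / den if den else 0.0


def upgma(names, dist):
    """Average-linkage clustering on a square distance matrix.

    Returns the tree as nested tuples and a list of merges, each recorded as
    (left subtree, right subtree, distance at which they were joined).
    """
    # Each active cluster id maps to (subtree, number of samples it contains).
    clusters = {i: (names[i], 1) for i in range(len(names))}
    d = {frozenset((i, j)): dist[i][j]
         for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    merges = []
    while len(clusters) > 1:
        pair = min(d, key=d.get)          # closest pair of active clusters
        i, j = sorted(pair)
        height = d[pair]
        (ti, ni), (tj, nj) = clusters.pop(i), clusters.pop(j)
        # Distance to the merged cluster is the size-weighted mean.
        for k in clusters:
            dik = d.pop(frozenset((i, k)))
            djk = d.pop(frozenset((j, k)))
            d[frozenset((next_id, k))] = (ni * dik + nj * djk) / (ni + nj)
        del d[pair]
        clusters[next_id] = ((ti, tj), ni + nj)
        merges.append((ti, tj, height))
        next_id += 1
    (tree, _), = clusters.values()
    return tree, merges


# Hypothetical taxon counts (three taxa) for three samples.
samples = {
    "gut_1": [80, 15, 5],
    "gut_2": [75, 20, 5],
    "skin_1": [10, 30, 60],
}
names = list(samples)
matrix = [[bray_curtis(samples[p], samples[q]) for q in names] for p in names]
tree, merges = upgma(names, matrix)
print(tree)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

In this toy input the two gut samples join first because their taxon profiles are nearly identical, and the skin sample joins last at the average of its distances to the two gut samples.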
RCA was based on benthic macroinvertebrate samples collected between 2002 and 2011 in the Atlantic Maritime ecozone. Data employed for the example here are described in detail in [22]. Samples were obtained from reference sites included in the calibration dataset used for model construction (n = 128). Reference sites were distributed throughout New Brunswick, Nova Scotia, and Newfoundland. Test sites (n = 16) used for model testing in the present paper were located in the Upper Mersey area of Nova Scotia. Most macroinvertebrate samples were collected using a standardized traveling kick method, in which the operator zig-zags upstream while disturbing the river substrate to dislodge attached and unattached organisms, which are washed into a triangular net of 400-µm mesh size. Samples were subsequently sorted in the lab and identified to the taxonomic level of family, to allow the identification of sites deviating from expected assemblage composition. Topographical data were obtained from the Shuttle Radar Topography Mission (SRTM) dataset, via the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC: http://daac.ornl.gov/). Overlaid on the topography map were vector data describing rivers in Atlantic Canada, obtained from Geobase (http://www.geobase.ca).