Existing data from underwater visual surveys of fish and benthic assemblages were collated from seven monitoring programs for the main Hawaiian Islands. Each dataset was transformed into a consistent format and checked for errors. Data were only included in the analysis if benthic and fish surveys were co-located at a unique latitude and longitude and were from forereef habitats at depths of 0 to 30 meters. We also tested for the sensitivity of results to including data from different depth zones between 0 and 30 meters (0–20 meters, and 5–20 meters), and found no effect (Supplementary Figure S1). The majority of surveys were between 10 and 20 meters, so these depth ranges were chosen as they allowed for testing sensitivity to inclusion of shallow depths (<5 meters) and deeper depths (>20 m). The majority of the data (98%) were from 2000–2013. A total of 3,345 unique sites, defined as a survey location with a unique latitude and longitude, were used in the analyses. To account for spatial autocorrelation, means were taken for surveys within a defined distance of 300 meters (Supplementary Figures S2 and S3), resulting in an overall sample size of 1,027. The mean was used for sites with data across multiple years. Data sources, survey methods, and other meta-data are provided in the Supplementary Material.
To account for differences in survey method, fish data were standardised using calibration factors to the NOAA Biogeography Program belt transect since that method had the greatest consistency with the majority of other programs34 (link) (Supplementary Table S1). Calibration factors were developed using an automated software program that utilises general linear models and Monte Carlo simulations35 . Calibrations were calculated by species where possible based on the following decision rules: (1) a minimum of 10 paired observations were available within an island, (2) observations were not dominated by zeros – if they were, a delta model was run in which occurrences were modeled separately from non-occurrences, (3) residuals were normally distributed – if not, data were log-transformed and the model was rerun and checked again. If a species did not pass this series of rules, a calibration factor for each combination of family and trophic level (e.g. zooplanktivorous wrasses) was calculated. If a calibration factor could not be calculated at the family and trophic level, then a global calibration for the entire method was used. For all subsequent analyses, density estimates were based on calibrated densities of raw data. Benthic surveys were not calibrated as previous results found no large bias associated with percent cover among the methods used36 (link) (Supplementary Table S1).
The biomass of individual fishes was estimated using the allometric length-weight conversion: W = aTLb, where parameters a and b are species-specific constants, TL is total length (cm), and W is weight (g). Length-weight fitting parameters were obtained from a comprehensive assessment of Hawai‘i specific parameters (Donovan et al., unpublished data) and FishBase37 . Several fish species were removed from fish biomass calculations if aspects of their life history led to inaccurate counts with visual surveys, such as cryptic benthic species, nocturnal species, and pelagic schooling species. Likewise, manta rays were excluded, as their size is difficult to visually estimate and they have high biomass but are encountered infrequently. Additional methodology was developed for dealing with outliers in the fish data, accounting for extreme observations of schooling species. Extreme observations in the database were defined by calculating the upper 99.9% of all individual observations (e.g. one species, size and count on an individual transect), resulting in 26 observations out of over 0.5 million, comprised of 11 species. The distribution of individual counts in the entire database for those 11 species was then used to identify observations that fell above the 99.0% quantile of counts for each species individually. These observations were adjusted to the 99.0% quantile for analysis.
Fish and benthic assemblages were analyzed primarily at the level of functional groups. Benthic assemblages were broken into major functional groups including coral, macroalgae, turf algae, crustose-coralline algae, and other benthic cover (e.g. sponges, sand, basalt rock, recently dead coral). Other benthic cover was not broken down further due to limitations from incorporating data from different methods with different definitions for other benthic taxa. The fish assemblage was characterised into three trophic groups; herbivores, secondary consumers, and predators. Herbivores were further subdivided based on their feeding mode into browsers, grazers and scrapers following Edwards et al.38 (link), which have been suggested as important indicators of resilience on coral reefs (Supplementary Table S2)4 (link),29 (link),39 (link),40 . Browsers were defined as those herbivores that feed on macroalgae and associated epiphytic material and are important for reducing cover of competing macroalgae. Grazers are considered those fishes that feed largely on algal turfs, which can limit the establishment of macroalgae, and scrapers also feed on algal turfs but can remove the reef substratum, opening space for coral recruitment41 ,42 . Secondary consumers included corallivores, omnivores, invertivores, and planktivores. These groups were not further subdivided because they tend to have unstable biomass estimates (e.g., planktivores usually occur in large numbers with patchy distributions) so are not estimated well with transects, and thus may provide spurious results when considered independently. Predators were defined as large piscivorous species, such as sharks, jacks, and barracuda (Supplementary Table S2)43 (link). Additional functional groups that have been shown to relate to reef resilience, such as urchins and Acanthaster spp., could not be included because data were not available.
Before analysis, all data were fourth root transformed and centered to meet the assumptions of linear models and all variables were standardised to the same scale. The fourth root transformation was chosen because it was strong enough to meet assumptions for all variables, such that a common transformation could be used for all 10 variables.
Regimes were identified using model-based clustering with the mclust package in R44 with the 10 fish and benthic functional groups as inputs. The cluster analysis is based on a probability model where each cluster is a mixture of multivariate normal distributions composed of the densities of each component, and each observation is assigned to a cluster based on the probability of membership given the observation. The mclust function uses three strategies for defining clusters: 1) initialization of the model with model-based hierarchical clustering, 2) maximum likelihood estimation with the expectation-maximization algorithm, and 3) model selection and the number of clusters that are approximated with Bayes factors and Bayesian Information Criterion45 (link) (Supplementary Figure S4). Uncertainty in a point’s assignment to a regime was obtained during the clustering process by subtracting the probability of the most likely regime for each observation from one44 .
The 10 multivariate benthic and fish functional groups were visualised with a non-metric multidimensional scaling plot using the metaMDS function in the vegan package in R46 . A Bray-Curtis distance matrix was created with 2-dimensions and a maximum of 50 random starts to search for a stable solution and avoid getting trapped in a local optima47 . Multivariate dispersion was also calculated for each regime and tested with an analysis of multivariate homogeneity of group dispersions with the betadispr function in the vegan package.
Coral species richness was examined across regimes with an Analysis of Variance, and contrasts and confidence intervals were calculated with Tukey’s honest significant differences where coral richness was the response variable and regimes were the explanatory variable. The community composition of corals was also examined by calculating the proportional cover of the four most abundant species within each regime, including Porites lobata, Pocillopora meandrina, Porites compressa, Montipora capitata. Coral species were also classified with a trait-based approach into competitive, stress-tolerant, generalist or weedy species following Darling et al.48 (link), with additional species specific information on bleaching tolerance, and were compared across regimes.
In the Hawaiian Islands, seascape variables such as depth, habitat complexity, and wave exposure have been shown to be important predictors of fish assemblages49 (link)–52 (link). As a result, spatial patterns across regimes were examined by comparing the proportion of each regime at each island, and by comparing the proportion of sites within a regime that were located on north, south, east, and west facing shores across islands. Additionally, depth and habitat complexity, measured as the maximum rate of change in seafloor slope (i.e. slope of slope), were calculated for each point from LiDAR derived bathymetry within a 60 m radius of each survey location53 (link), and were compared across regimes with an Analysis of Variance and post-hoc Tukey multiple comparisons with either depth or complexity as response variables and regimes as explanatory variables.
Temporal transitions between regimes were assessed by predicting the regime for each year at each site individually with the function predict.Mclust in R. Predictions were retained if there was at least a 95% probability of the regime prediction belonging to that regime, and only sites where predictions were available for at least three years between 2000 and 2016 were retained. We used an extended dataset for the transition analysis encompassing a greater number of years compared to the dataset used to characterise the regimes, which ended in 2013. The extended dataset was not used previously in characterizing the regimes in order to avoid confounding effects of coral bleaching events that occurred in 2014 and 2015, as widespread bleaching was unobserved in all previous study years. A total of 80 sites were included in the temporal data set, and patterns of regime transitions were compared by calculating the frequency of a given transition as a proportion of the total number of possible transitions (n = 261). We also tested the sensitivity of analysing data from all 80 sites compared with only analysing those with longer time series (>4 or 6 years) by calculating binomial confidence intervals for each transition in each case. These sites also tended to represent permanent monitoring stations, which allowed for testing for the sensitivity to using observations from locations that may shift spatially from year to year. Binomial confidence intervals for each transition in each case were produced with the binconf function in the Hmisc package in R54 using the Wilson interval.
Finally, we tested the hypothesis that local and global human impacts will result in some transitions among regimes being more likely than others. Each observed transition was treated as a replicate, and the geographical position was used to obtain a value for a local impact represented as human population density within a 15 km radius, and a global impact represented as the degree heating weeks at the height of the 2015 bleaching event (Supplementary Material). The probability of a given transition was estimated with a Bayesian binomial model for each variable where there were at least four occurrences for a given transition as a function of either human population density or degree heating weeks. More details of the Bayesian binomial model are in the Supplementary material.
All analyses were conducted in the R environment for statistical computing version 3.3.055 .
Free full text: Click here