Introduction

Designing spatially explicit models for predicting the number of species and individuals present as juveniles is a critical challenge for reef fish ecology and conservation. Among the factors suggested to contribute to reef resistance and resilience, the number of fish species in a given reef, or set of connected reefs, is often cited (McClanahan et al. 2002; Wilson et al. 2006). Areas that support high numbers of species are usually given high conservation values, particularly when the maintenance and enhancement of biodiversity is the central goal of the management strategy (Myers et al. 2000; Roberts et al. 2002). However, the first factor limiting the numbers of species and individuals in a given fish community, following settlement, is juvenile mortality (see review by Doherty 2002). Although variable in time and species specific, juvenile mortality can reach up to 60% at settlement (Doherty et al. 2004). Since juvenile mortality mostly results from the intensity of competition and predation after settlement (Caselle 1999; Almany and Webster 2006), post-settlement juvenile abundances may depend on habitat characteristics such as refuge availability (Adams et al. 2004; Almany 2004). Identifying the environmental factors that promote high numbers of species and individuals present as juveniles is thus crucial for understanding the processes governing adult population dynamics. Moreover, preventing the degradation of habitats where the numbers of species and individuals of juvenile fish are high can substantially improve their recruitment (Gilliers et al. 2006; Scharf et al. 2006) and consequently increase the numbers of species and individuals in the whole fish community.

Spatially explicit models predicting the number of species and individuals present as juveniles should be based on environmental variables that influence juvenile distribution at different spatial scales. For example, immediately before they settle on the reef, fish larvae may be heterogeneously distributed in shelf waters that differ at a 0.1 to 10-km scale in turbidity, current speed and direction (Williams 1982). Then, larvae settle on habitats that can be characterized at a 0.1 to 10-km scale by their distance to the reef or their exposure to dominant winds (Doherty 1991). At a 10 to 100-m scale, different factors have been found to influence the spatial distribution of juvenile reef fish: average depth (Srinivasan 2003), substratum composition (Depczynski and Bellwood 2004; Sale et al. 2005), characteristics of benthic covers (Adams et al. 2004; Depczynski and Bellwood 2004; Sale et al. 2005) and substratum rugosity (Hixon and Beets 1993; Sale et al. 2005). The temporal patterns in the distribution of juvenile fish may vary with water temperature (Sponaugle et al. 2006) as well as wind speed and direction (Findlay and Allen 2002; Sponaugle et al. 2005). These environmental variables influence juvenile spatial and temporal distribution for a given number of studied species, but the way they interact at the community level and determine the number of species and individuals that will be observed as juveniles at a given location remains largely unknown.

Models aimed at predicting the number of species and individuals within a given spatial domain are by definition spatially explicit and need to be based on field data continuously measured over this spatial domain. Traditionally, descriptive multivariate approaches have been used for relating environmental variables, sometimes measured at different scales, to the number of reef fish species and/or individuals (Ault and Johnson 1998b; Chittaro 2004). These statistical models, and the underlying ecological knowledge they were based on, usually came from point-based field data and are not spatially explicit. In principle, if environmental variables could be represented over a geographic grid, and if they were well correlated with some ecological variables, then relevant ecological predictions could also be spatially represented over the same grid. In this way, the predictive model becomes spatially explicit, providing a continuous field of biological data that could greatly enhance the power of decision support tools and multi-scale ecological analyses.

Fortunately, environmental variables are now increasingly measured continuously since remote sensing provides a cost-effective and time-efficient means to survey and map the landscape of virtually any area on Earth, including the shallow marine and coastal environment (Miller et al. 2005). It is thus logical to use remotely sensed information to transform point-based ecological observations into spatially explicit representations of ecological properties. This is now a common approach in terrestrial studies, but still in its infancy for marine coastal studies (Guisan and Thuiller 2005). Since the early 1990s, remote sensing has been primarily used in coral reef studies to investigate bathymetry (Maritorena 1996), exposure to hydrodynamic energy (Courboulès and Manière 1992), detailed geomorphology (Andréfouët and Guzman 2004) and habitat distribution (Ahmad and Neil 1994). Pilot studies with high-cost sensors have shown further potential for mapping pigmentation of cyanobacterial mats (Andréfouët et al. 2003a), coral cover (Mumby et al. 2004) or bottom rugosity (Brock et al. 2004). Among the most useful remote sensing products are habitat maps. Habitat mapping is now considered a standard practice, with an accuracy that depends on the local habitat complexity (Andréfouët et al. 2003b). Habitat maps potentially provide great interpretative power in terms of coral reef ecological functions such as biogeochemistry and reef production, algal and invertebrate biomass distribution and coral reef beta-diversity (Ahmad and Neil 1994; Andréfouët and Payri 2001; Gilbert et al. 2006; Harborne et al. 2006).

In terrestrial ecosystems, spatially explicit species distribution models are developed in two steps (see review by Guisan and Thuiller 2005). The first step is to define a point-based, predictive species-environment relationship using the most appropriate linear or non-linear modeling techniques (Guisan and Zimmermann 2000). The second step is to map model predictions over a spatial domain where environmental predictors are continuously measured. As a first attempt for juvenile reef fish communities, the ultimate aim of the present study was to provide quantitative and spatially explicit information on patterns of juvenile species richness and abundance at spatial scales relevant to the management process of coral reef ecosystems. This was done by: (1) assessing the amount of variability in juvenile fish species richness and abundance that can be predicted by a combination of environmental factors recorded across different spatial and temporal scales and (2) mapping the predicted ecological information using a remotely sensed, independently created, coral reef habitat map.

Materials and methods

Study area and sampling design

The study took place in New Caledonia (southwest Pacific, 166°E, 22°S) where the lagoon covers an area of ∼19,000 km2 with numerous patch reefs and islets. The lagoon is bounded by a 1,600-km long barrier reef that can be as far as 65 km from the coast, providing a wide shallow shelf.

Fish were sampled in two no-take marine reserves of the southwest lagoon: Canard Islet and Larégnère Islet, respectively, 1.2 km and 13.0 km from the main island (Fig. 1). Both reserves are routinely visited for recreational activities. Around each islet, six stations were equally and randomly distributed in three biotopes including seagrass beds, macroalgal beds and coral patch reefs, and divided between the leeward and windward sides of the islets. At each station, the sample unit area was a 5-m radius circular area of approximately 80 m2. The 12 stations were surveyed during six sampling periods from March 2005 to March 2006.

Fig. 1

Position of the 12 stations around Canard Islet and Larégnère Islet in the southwest lagoon of New Caledonia and corresponding biotope (sg seagrass bed, ma macroalgal bed, co coral patch reef). For each islet, light grey represents the emerged area and dark grey the reef flat. Triangle up indicates Anse Vata water temperature loggers. Triangle down indicates Amédée Island wind speed and direction loggers

Underwater visual census of fish fauna using the stationary point technique (Bohnsack and Bannerot 1986) was performed by two divers. Stations were divided into four 20 m2 quarters. On two quarters of each station, each diver visually identified and counted all fishes (except Blenniidae and Gobiidae) and estimated individual total length (TL) to the nearest centimeter. When fishes were forming a school, the mean TL and the total number of all individuals were estimated.

For biotope descriptions, stations were divided into twelve 6.5-m2 radial sectors. In each sector, average depth (in meters) was recorded and the percent cover of seven substratum categories, four abiotic (sand, rubble, boulders and dead coral) and three biotic (seagrasses, macroalgae and live coral), was visually estimated. Heterogeneity was defined as the total number of abiotic substratum categories, seagrass and macroalgal genera and coral growth forms (encrusting, massive, submassive, digitate, branching, foliose, tabulate or soft) in each sector. Bottom rugosity was visually assessed through a qualitative index ranging from 1 (flat) to 4 (with knobs and cavities).

Predictor variables

For each station, habitat variables at the 10 to 100-m scale, including average depth, percent covers of abiotic and biotic substratum categories, heterogeneity and rugosity, were averaged over the 12 sectors. For each sampling period, temporal variables including mean monthly water surface temperature (°C), mean monthly wind speed (m s−1) and direction (°) were calculated from daily records collected at Anse Vata for water temperature and at Amédée Island for wind (Fig. 1). Mean monthly wind direction was subsequently transformed into a vector with cosine and sine components for further analysis. Discrete modalities were used for each station to describe the habitat variables at the 0.1 to 10-km scale, i.e., its cross-shelf location (coastal for stations around Canard Islet, mid-shelf for those around Larégnère Islet) and its exposure to trade winds (windward or leeward). Principal component analysis and a correlation matrix were used to examine multicollinearity between predictors.
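The cosine and sine transformation of wind direction can be illustrated with a minimal Python sketch. This is not the authors' code (the analyses were run in R); the function names `direction_to_components` and `component_distance` are hypothetical, and measuring the angle in degrees on the unit circle is an assumption made for illustration. The sketch shows why the transformation is useful: directions such as 350° and 10° are numerically far apart but close on the circle.

```python
import math

def direction_to_components(direction_deg):
    """Convert a wind direction in degrees to (cosine, sine) components.

    Hypothetical helper: treats the direction as an angle on the unit circle.
    """
    angle = math.radians(direction_deg)
    return math.cos(angle), math.sin(angle)

def component_distance(dir_a_deg, dir_b_deg):
    """Euclidean distance between two directions in (cos, sin) space."""
    cos_a, sin_a = direction_to_components(dir_a_deg)
    cos_b, sin_b = direction_to_components(dir_b_deg)
    return math.hypot(cos_a - cos_b, sin_a - sin_b)
```

With this encoding, `component_distance(350, 10)` is much smaller than `component_distance(350, 170)`, a relationship that a predictor expressed in raw degrees would not preserve.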

Predicting juvenile species richness and abundance

Only juveniles, defined here as individuals smaller than one-third of the maximum species length (Dorenbosch et al. 2005), were considered. For each station and for each sampling period, species richness, defined as the total number of species present as juveniles (S), and total abundance of juveniles (N) were calculated. Spatial autocorrelation in S and N was examined by plotting variograms as a function of distance between stations and with Moran’s I-test for spatial autocorrelation (Cliff and Ord 1981). Temporal autocorrelation in S and N was examined with Box–Pierce and Ljung–Box tests (Ljung and Box 1978).
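The juvenile criterion and the two response variables can be expressed as a short Python sketch. It assumes that each census record is a `(species, total_length_cm, count)` tuple and that maximum species lengths are supplied in a dictionary; both the record layout and the function name `juvenile_counts` are hypothetical, and the species names and lengths in the usage note are placeholders rather than data from the study.

```python
def juvenile_counts(records, max_length_cm):
    """Return (S, N) for one station and one sampling period.

    records: iterable of (species, total_length_cm, count) tuples (assumed layout).
    max_length_cm: dict mapping species to its maximum length in cm.
    An individual is a juvenile if its length is below one-third of the
    maximum length of its species.
    """
    species_seen = set()
    abundance = 0
    for species, total_length, count in records:
        if total_length < max_length_cm[species] / 3.0:
            species_seen.add(species)
            abundance += count
    return len(species_seen), abundance
```

For example, with the placeholder inputs `[("species_a", 4, 10), ("species_a", 12, 2), ("species_b", 5, 1)]` and maximum lengths of 30 cm and 12 cm, only the school of ten 4-cm individuals falls below the one-third threshold.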

To account for potential non-linear relationships between faunistic and environmental variables, simple non-linear (squared, logistic and exponential) regressions between faunistic and environmental variables were first compared with linear regressions. Regressions were compared based on the P-value, the coefficient of determination R2 (proportion of the variance of the response variable that is explained by the model) and sigma (square root of the estimated variance of the random error). Environmental variables that showed a significant non-linear relationship with the response variables were transformed accordingly for subsequent analyses.
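A simplified version of this screening step is sketched below in Python. It is an illustration under stated assumptions, not the procedure used in the study: it compares candidate transformations by R2 only (the P-value and sigma criteria are omitted), it considers only the untransformed, squared and log(x + 1) forms, and the function names `r_squared` and `best_transform` are hypothetical. The predictor must not be constant, otherwise the ratio is undefined.

```python
import math

def r_squared(x, y):
    """Coefficient of determination of a simple least-squares fit y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy ** 2) / (sxx * syy)

def best_transform(x, y):
    """Return the name of the transformation with the highest R2, and all scores."""
    candidates = {
        "linear": list(x),
        "squared": [xi ** 2 for xi in x],
        "log(x+1)": [math.log(xi + 1.0) for xi in x],
    }
    scores = {name: r_squared(values, y) for name, values in candidates.items()}
    return max(scores, key=scores.get), scores
```

A predictor selected as "squared" or "log(x+1)" by such a comparison would then enter the subsequent models in its transformed form.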

Generalized Linear Models (GLMs) were used to fit a link function between the response variable, i.e., S or N, and the predictors, i.e., the environmental variables. GLMs are particularly suitable for modeling faunistic data that usually follow a Poisson distribution, and yield predictions within the limits of observed values (Guisan and Zimmermann 2000; Guisan and Thuiller 2005). The presence of overdispersion in the distribution of the response variable was assessed by computing the ratio of the variance to the mean of each response variable (Potts and Elith 2006). If the variance was greater than the mean, data were considered overdispersed and the hypothesis H0 of quasi-Poisson distribution of the response variable was tested through Dean’s test (Dean 1992). When H0 was rejected, the assumption of a negative binomial distribution was made and tested through the likelihood ratio test (LRT, Dean 1992).
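The first overdispersion check reduces to a single ratio, sketched here in Python. The use of the sample variance (denominator n − 1) and the function name `variance_to_mean_ratio` are assumptions; the formal tests mentioned above (Dean's test, LRT) are not reproduced.

```python
def variance_to_mean_ratio(counts):
    """Ratio of the sample variance to the mean of a list of counts.

    A ratio close to 1 is what a Poisson distribution would give;
    a ratio above 1 indicates overdispersion.
    """
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean

def is_overdispersed(counts):
    """True when the variance exceeds the mean."""
    return variance_to_mean_ratio(counts) > 1.0
```

Counts dominated by a few large schools, as is common for abundance data, produce a ratio well above 1 and point toward a quasi-Poisson or negative binomial error structure.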

Three separate models were first built for each of S and N. One model included habitat variables at the 0.1 to 10-km scale only, one included habitat variables at the 10 to 100-m scale only and one included temporal variables only. This exploratory step aimed to identify the most appropriate and important scale for building the final models, a central and recurrent problem in species distribution modeling (Guisan and Thuiller 2005). The final models then considered all spatial and temporal predictors in order to get the most accurate predictions of S and N from a selection of predictors. Predictor selection was performed by a backward stepwise selection procedure, which consisted of introducing all predictors and progressively removing the least significant ones until Akaike’s information criterion (AIC, Sakamoto et al. 1986) was minimal. The coefficient of determination R2 was used to express the percentage of variability in the response variable that was explained by each model. Predictors were classified according to their influence on the response variable (positive or negative) and ranked according to the increase in AIC produced after being individually removed from each model. The quality of each GLM was finally assessed by comparing predicted and observed values for each response variable with a Student’s t-test for paired samples, the Pearson correlation coefficient r and the coefficient of determination R2 of a simple regression between predicted and observed values. Diagnostic plots of GLM including the distribution of model residuals and the normal scores of standardized residual deviance (Breslow 1996) were used to check that model residuals were normally distributed. Statistical validation of each model used leave-one-out cross-validation (Davison and Hinkley 1997). This resampling procedure estimates a mean prediction error for observations that are removed, one-by-one, from the calibration data set. Resampling approaches can provide an efficient statistical evaluation of species distribution models when collecting new data is too costly (Guisan and Thuiller 2005; Potts and Elith 2006). Spatial autocorrelation in model residuals was examined with Moran’s I-test.
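The leave-one-out logic can be sketched in a few lines of Python. To keep the example self-contained, a one-predictor least-squares line stands in for the GLM, and the prediction error is taken as the mean absolute difference between held-out and predicted values; both choices, and the function names `fit_line` and `loo_mean_abs_error`, are assumptions rather than details given in the text.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def loo_mean_abs_error(x, y):
    """Mean absolute prediction error under leave-one-out cross-validation.

    x and y are lists. Each observation is removed in turn, the stand-in
    model is refitted on the remaining data, and the removed observation
    is predicted from that refit.
    """
    errors = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        errors.append(abs(y[i] - (intercept + slope * x[i])))
    return sum(errors) / len(errors)
```

Because each prediction is made for an observation the model has not seen, the resulting error is expected to be somewhat larger than the error computed on the full calibration data set.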

Mapping GLM predictions of juvenile species richness and abundance around Larégnère Islet

Five successive steps were required for predicting and mapping S and N from an aerial photograph of Larégnère Islet nearby reefs and lagoon (Fig. 2):

  1. The initial aerial photograph at 1.5 m spatial resolution was classified and ground-truthed to assign the pixels into six biotopes (coral patch reefs, macroalgal bed, dense seagrass bed, sparse seagrass bed and sand) resulting in a 1.2-km2 biotope layer around the islet. Since the scope of this paper is not biotope mapping, this work is not detailed here. Briefly, the final map benefited from several sources of optical and acoustic imagery, and more than 50 ground-truth points were collected around Larégnère reefs and terraces following a method similar to that of Clua et al. (2006).

  2. For each of these six mapped biotopes, between four and six 1 m × 1 m quadrats were randomly chosen to measure predictor variables at the 10 to 100-m spatial scale and to ensure the consistency of biotope labelling between map categories and fish habitat categories. In each quadrat, average depth was recorded; heterogeneity, rugosity and percent cover of sand and rubble, dead and live coral, macroalgae and seagrass were visually estimated.

  3. Bathymetric profiles were obtained across the 1.2 km2 area by recording depth along ten 300 m long transects set perpendicular to the reef flat, approximately 80 m apart. Isobaths were then interpolated between transects. Given the very gentle and regular slope of the lagoonal terrace, this spacing was sufficient. The only significant topographical relief occurred at the edge of the reef flats.

  4. Biotope characteristics and average depth were gridded and mapped at 35 m resolution from the biotope map initially at 1.5 m resolution. This resampling created cells with heterogeneous biotope characteristics different from those of the six “pure” biotopes, and also significantly decreased the computing time needed to run the GLMs. The GLMs were then applied to each cell of the new grid by considering the biotope characteristics of each cell as biotope model predictors, and by taking into account the mean monthly wind speed of the corresponding sampling period. Predictions of S and N were obtained in each cell of the six grids, one for each sampling period. Mean and standard deviations of the predicted values of S and N were then calculated.

  5. Simulated predictive maps using the high resolution biotope maps were obtained by resampling the 35 m × 35 m prediction maps with a triangulated irregular network (TIN).
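The gridding in step 4 can be illustrated with a minimal Python sketch. It assumes that the classified photograph is available as a list of rows of biotope codes and that each coarse cell covers a whole number of fine pixels per side (35 m is not an exact multiple of 1.5 m, so the real resampling must have handled cell edges differently). The function names, the integer `block` parameter and the per-biotope values in the usage note are hypothetical.

```python
from collections import Counter

def aggregate_biotopes(fine_grid, block):
    """Aggregate a fine categorical grid into coarse cells.

    fine_grid: list of rows, each a list of biotope codes (assumed layout).
    block: number of fine pixels along one side of a coarse cell.
    Returns a list of rows of dicts mapping biotope code to cell fraction.
    """
    n_rows = len(fine_grid) // block
    n_cols = len(fine_grid[0]) // block
    total = block * block
    coarse = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            counts = Counter(
                fine_grid[i][j]
                for i in range(r * block, (r + 1) * block)
                for j in range(c * block, (c + 1) * block)
            )
            row.append({code: k / total for code, k in counts.items()})
        coarse.append(row)
    return coarse

def cell_characteristic(fractions, biotope_values):
    """Area-weighted mean of a per-biotope characteristic within one cell."""
    return sum(frac * biotope_values[code] for code, frac in fractions.items())
```

A coarse cell that is three-quarters coral patch reef and one-quarter seagrass thus receives a rugosity, depth or percent cover intermediate between the two "pure" biotopes, which is what allows the GLMs to be applied to mixed cells.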

All statistical analyses were performed with R 2.2.1 software (Ihaka and Gentleman 1996). The mean predicted values were mapped on the aerial photograph using MapInfo Professional 7.5 software.

Fig. 2

Methodological steps for spatial modeling of juvenile species richness (S) and abundance (N). For each step the resolution, or grain size, is indicated in brackets. (1) Interpretation of aerial photograph and assignation of pixels into six biotopes, (2) Definition of biotope characteristics and scaling to model requirements, (3) Scaling of bathymetric profiles to model requirements, (4) Generalized Linear Model (GLM) predictions of S and N for each cell grid and (5) Simulation of outputs for a high-resolution model by triangulated irregular network (TIN) resampling

Results

A total of 1,752 juveniles belonging to 98 taxa of 19 families were observed (Appendix: see Randall (2005) for authorities and Nelson (1984) for family order). At each station, between 0 and 18 species (median = 6), and between 0 and 175 juveniles (median = 14) were recorded (Fig. 3). The most frequently observed and most abundant species were Scarus sp. (Scaridae), Lethrinus genivittatus (Lethrinidae), Siganus fuscescens (Siganidae), Pomacentrus moluccensis (Pomacentridae) and Thalassoma lunare (Labridae). Juvenile species richness and abundance were significantly correlated (P < 0.001, R2 = 0.25). Moran’s I-test showed no significant spatial autocorrelation in S and N (P > 0.05). Box–Pierce and Ljung–Box tests showed no significant temporal autocorrelation in S and N at each station (P > 0.05). The distributions of S and N were overdispersed compared with a Poisson distribution (variance-to-mean ratios of 3.8 and 16.3, respectively). Dean’s test confirmed the assumption of quasi-Poisson distribution in observations of S (P > 0.05) with an estimated dispersion parameter of 0.81. Conversely, the LRT highlighted overdispersion in the distribution of N (P > 0.05). Thus, a negative binomial distribution was used for the GLM predicting N.

Fig. 3

Frequencies (numbers of stations sampled) corresponding to the different values of juvenile species richness (S) and abundance (N) observed on the 12 stations and all sampling periods combined

Principal component analysis and the correlation matrix performed on environmental variables showed that habitat rugosity was the only variable significantly correlated with other habitat variables, including the percent cover of rubble, boulders, dead and live coral, when all stations were considered. However, this correlation was not significant when considering stations in biotopes of seagrass or macroalgal beds separately. Habitat rugosity was thus retained for further analyses.

The preliminary GLMs, built separately from habitat variables at the 0.1 to 10-km scale, habitat variables at the 10 to 100-m scale and temporal variables, showed that descriptions at 10–100 m explained 71 and 49% of the variability of S and N, respectively (Table 1). Habitat variables at the 0.1 to 10-km scale and temporal variables each explained a maximum of 15% of the variability of S or N (Table 1). The best final models based on all spatial and temporal variables for S and N both contained a constant (μ) and ten environmental variables. These included eight variables at the 10 to 100-m spatial scale, only one at the 0.1 to 10-km spatial scale and one temporal variable. Live coral cover was additionally required for N only.

Table 1 Results of successive GLMs built with environmental characteristics of each spatial and temporal scale to predict juvenile species richness (S) and abundance (N). Spatial characteristics at 0.1–10 km scale include cross-shelf location and trade-wind exposure; spatial characteristics at 10–100 m scale include depth, percent coverages of abiotic and biotic substratum categories, heterogeneity and rugosity; temporal characteristics include mean monthly water temperature, mean monthly wind speed and direction

Final equations are as follows:

$$ S = \mu + \text{cross-shelf location} + \text{heterogeneity} + (\text{rugosity})^{2} + \log(\text{depth} + 1) + \text{sand} + \log(\text{rubble} + 1) + \log(\text{dead coral} + 1) + \text{macroalgae} + \log(\text{seagrass} + 1) + \text{wind speed} $$
$$ N = \mu + \text{cross-shelf location} + \text{heterogeneity} + (\text{rugosity})^{2} + \log(\text{depth} + 1) + \text{sand} + \log(\text{rubble} + 1) + \log(\text{dead coral} + 1) + \text{macroalgae} + \log(\text{seagrass} + 1) + \log(\text{live coral} + 1) + \text{wind speed} $$
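The transformations that appear in these equations can be made explicit with a small Python sketch that turns raw station measurements into model terms. Several points are assumptions: the logarithm is taken as the natural logarithm (the base is not stated), cross-shelf location is coded as a 0/1 indicator with the coastal location as reference, and the dictionary keys and the function name `model_terms` are hypothetical. No coefficients are included, so the sketch builds the predictors only and does not reproduce the fitted models.

```python
import math

def model_terms(station, include_live_coral=False):
    """Build the transformed predictors of the final equations.

    station: dict of raw measurements (key names are hypothetical).
    include_live_coral: False for the S model, True for the N model.
    """
    terms = {
        # Indicator coding: coastal is the reference level (assumption).
        "mid_shelf": 1.0 if station["cross_shelf"] == "mid-shelf" else 0.0,
        "heterogeneity": station["heterogeneity"],
        "rugosity_sq": station["rugosity"] ** 2,
        "log_depth": math.log(station["depth"] + 1.0),
        "sand": station["sand"],
        "log_rubble": math.log(station["rubble"] + 1.0),
        "log_dead_coral": math.log(station["dead_coral"] + 1.0),
        "macroalgae": station["macroalgae"],
        "log_seagrass": math.log(station["seagrass"] + 1.0),
        "wind_speed": station["wind_speed"],
    }
    if include_live_coral:
        terms["log_live_coral"] = math.log(station["live_coral"] + 1.0)
    return terms
```

The S model uses the ten terms returned by default, and the N model adds the log-transformed live coral cover as an eleventh term.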

The multiplicative coefficient estimated for each environmental variable (Table 2) discriminated variables with negative or positive influence on S or N (i.e., negative or positive coefficients, respectively). Depth was the most important variable and had a negative influence on the response variable in both GLMs (Table 2). Mid-shelf location had a negative effect on S and N whereas a null coefficient was associated with coastal location. Mean monthly wind speed was the only temporal variable retained and had a positive effect on the response variable in both GLMs.

Table 2 Estimated values for intercept and for coefficient associated with each predictor, standard (std.) error, t value and associated probability (P) for GLMs of species richness (S) and total abundance (N) of juvenile reef fishes. Predictors are ordered according to the increase in Akaike's information criterion produced when they are separately removed from the model (ΔAIC) and associated with the scale at which they are described. Mid-shelf: mid-shelf location

The final GLMs explained 75 and 52% of variability in S and N, respectively (linear regression P < 0.001, r = 0.99 and 0.93 for S and N, respectively, Fig. 4). There was no significant difference between mean observed values and mean predicted values for both response variables (t-test, P > 0.05). The highest S and N were observed and predicted for stations in coral patch reefs around Canard Islet, and the lowest were observed and predicted for stations in seagrass beds around Larégnère Islet (Fig. 4). In seagrass beds, juveniles were occasionally abundant, but the GLMs were not able to accurately predict such events.

Fig. 4

Plots of observed values against values predicted by Generalized Linear Models (GLMs) for juvenile species richness (S) and abundance (N). Triangles correspond to stations in coral patch reefs, circles to stations in macroalgal beds and squares to stations in seagrass beds. Open symbols are for Larégnère Islet and filled symbols are for Canard Islet. The line of slope 1 and intercept 0 is drawn for reference. R2 is the coefficient of determination of the model

Considering the full data set, the mean error was 1.5 species for S and 9.5 individuals for N. The leave-one-out cross-validation estimated a mean prediction error of 2.0 species for S and 12.2 individuals for N. For comparison, the standard deviations of observed S and N were 4.2 species and 28.1 individuals, respectively. Moran’s I-test showed no significant spatial autocorrelation in model residuals for both response variables (P > 0.05).
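The residual autocorrelation check can be illustrated with a minimal Python computation of Moran's I. The spatial weights are taken as inverse distances between stations, which is an assumption (the weighting scheme used in the study is not stated), the significance test is omitted, and the function name `morans_i` is hypothetical. Station coordinates must be distinct and the values must not all be equal.

```python
import math

def morans_i(values, coords):
    """Moran's I with inverse-distance weights (assumed weighting scheme).

    values: list of residuals or observations, one per station.
    coords: list of (x, y) station coordinates in the same order.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weighted_cross = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            distance = math.hypot(coords[i][0] - coords[j][0],
                                  coords[i][1] - coords[j][1])
            weight = 1.0 / distance
            weighted_cross += weight * dev[i] * dev[j]
            weight_sum += weight
    return (n / weight_sum) * (weighted_cross / sum(d * d for d in dev))
```

In the absence of spatial autocorrelation the statistic is expected to lie close to −1/(n − 1); values well above it indicate that nearby stations have similar residuals, and values well below it that neighbouring stations tend to differ.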

Predictive maps around Larégnère Islet showed that the highest S and N were expected on a narrow margin at the edge of the reef flat and the shallow lagoon terrace, covering less than 5% of the total mapped area (Fig. 5). In this narrow zone, depending on depth, the GLMs predicted between 30 and 90 juveniles of 5–15 species per 80 m2. In shallow seagrass beds adjacent to the reef flat, the GLMs predicted between one and three species and between five and ten juveniles per 80 m2. Predicted S and N decreased with increasing distance to the islet and with depth. For a given range of depths, predicted S and N increased from sandy biotopes to seagrass beds and to macroalgal beds, the latter being present in deeper areas only. Temporal variation in mean monthly wind speed induced standard deviations of 0.33 and 0.51 in model predictions of S and N, respectively.

Fig. 5

a Original aerial photograph of Larégnère Islet for which pixels were assigned into biotopes of co coral patch reefs, ma macroalgal beds, sg dense or sparse seagrass beds and sa sand, b Gridded (35 m resolution) predictions of juvenile species richness expected per 80 m2, c Resampled (1.5 m resolution) predictions of juvenile species richness expected per 80 m2 and d Resampled (1.5 m resolution) predictions of juvenile abundance expected per 80 m2

Discussion

The first important result of this study is that a combination of environmental variables recorded across multiple spatial scales explained up to 75 and 52% of the variability in juvenile species richness and abundance, respectively. These levels of predictability compare well with those obtained by models designed to explain adult fish species richness and abundance (Ault and Johnson 1998b; Holbrook et al. 2002; Chittaro 2004; Mellin et al. 2006). However, one notable difference between these models and those designed for adult fish is that a temporal factor (i.e., mean monthly wind speed) had to be included to account for the seasonal variability in juvenile fish assemblages. A comparable temporal variability has already been observed in assemblages of pre-settlement larvae in New Caledonia (Carassou and Ponton 2006) and in assemblages of juveniles at identical latitudes (Robertson and Kaufmann 1998). By contrast, the temporal variability of adult fish assemblages recently reported in New Caledonia was explained by inter-annual variations of environmental conditions due to hurricanes (Wantiez et al. 2006). Such catastrophic events have never been considered in any model designed for predicting adult fish species richness and abundance. This finding confirms that the temporal scale at which fish assemblages must be studied varies with the life stage considered. It also obviously underlines that the cost for collecting field data of juveniles over a large area is much higher than for adult fish, since field surveys of juveniles must be regularly repeated across seasons. This makes the development of spatially explicit models of juvenile fish even more promising and cost-effective than for adult fish.

The present study also emphasises the importance of considering environmental variables at different spatial scales, particularly the finest ones, for predicting the species richness and abundance of juvenile reef fish. In a similar way, Ault and Johnson (1998b) explained up to 82% of the variations in adult fish species richness at Heron Island (Great Barrier Reef) when adding fine-scale habitat characteristics to the type of reef in their multiple regressions. If considering environmental variables at different scales seems mandatory for obtaining robust models, identifying the most important scales is crucial to avoid inadequate decisions and management strategies (Guisan and Thuiller 2005). This study indicates that juvenile species richness and abundance in coral reef habitats are explained by spatial factors at 10 to 100-m spatial scale. This corroborates the conclusion of Chittaro (2004) that the 100 m2 spatial scale was the most appropriate to investigate fish-habitat associations. As a consequence, spatially explicit models of Indo-Pacific juvenile species richness and abundance require spatially dense information about habitat characteristics. Interestingly, although a positive, but non-linear, relationship existed between juvenile species richness and abundance, these two variables were not influenced by the same biotope characteristics. Live coral cover was required for predicting the abundance of juveniles but not their species richness. This is explained by the presence of large monospecific schools of juvenile damselfishes such as P. moluccensis or Chromis viridis that favour coral habitats (Ault and Johnson 1998a; Lecchini et al. 2006).

Cross-shelf location was the only broad-scale habitat variable retained in both models, likely because it correlates with the degree of exposure to trade winds blowing from the southeast within the lagoon. Larégnère and Canard Reefs are indeed exposed differently to trade winds and currents, Canard being much more protected (Fig. 1). Long-term wind speed variability and its relevance in both GLMs may reveal the role of broad-scale circulation in the lagoon. Broad-scale circulation is strongly affected by wind speed regimes (Douillet et al. 2001) and could control juvenile fish distribution, possibly through the dispersal of larvae before they settle in their first essential habitats. The influence of this broad-scale, wind-driven circulation process on reef juvenile patterns warrants further investigation with more comprehensive surveys. The advective dispersal paths of pelagic larvae have been widely investigated according to the hydrodynamic regimes around numerous reefs (Codling et al. 2004; Brown et al. 2005; Francis et al. 2005). However, the success of these investigations has been tempered by difficulties in modeling the behavior of fish larvae, including active swimming. Evaluating how water circulation patterns may enhance the predictability of juvenile species richness and abundance should help sort out the relative importance of passive dispersal along currents vs. active selection of suitable habitat. Since a 3D numerical circulation model and a biological model of water column production exist for the southwest lagoon of New Caledonia (Pinazo et al. 2004), it should eventually be possible to design surveys based on broad-scale water column and residence time regimes (Jouon et al. 2006).

In terrestrial ecosystems, species distribution models are cost-effective complements to surveys for determining priority habitats for conservation (Guisan and Zimmermann 2000; Binzenhöfer et al. 2005; Latimer et al. 2006). However, marine resource managers still often lack relevant ecological information on which to base their decisions (Pittman et al. 2007). Spatial predictions of juvenile fish species richness and abundance may thus provide useful conservation tools that contribute to a more informed process in marine protected area (MPA) selection and, in the present case, MPA management. Mapped model predictions around Larégnère Islet indicated that high juvenile species richness and abundance occurred on a narrow margin overlapping the islet reef flat and including coral patch reefs and adjacent seagrass beds. This area, representing less than 5% of the total area covered by the different biotopes, is highly frequented by visitors and thus potentially exposed to mechanical damage. One possible way to enhance adult fish species richness and abundance would be to protect important juvenile habitats as indicated by model predictions. The holes and cavities (i.e., high rugosity) that characterize coral patch reefs generally provide shelter from predation, thus increasing juvenile species richness and abundance (Adams et al. 2004; Almany 2004). At the same time, predation of juvenile fish is particularly intense in coral patch reefs (Chittaro et al. 2005) and seagrass beds are generally considered as better nursery habitats (Beck et al. 2001; Cocheret de la Morinière et al. 2002; Nagelkerken et al. 2006). Moreover, seagrass beds may intercept more planktonic larvae compared to coral patches (Parrish 1989) as they generally cover larger areas. At the present stage of this study, it is difficult to draw conclusions about the respective roles of coral reefs vs. seagrass beds, and juxtaposition of both is often considered as the most beneficial for fish species richness and abundance (Pittman et al. 2004; Grober-Dunsmore et al. 2007).

Several caveats are necessary before using and generalising models like those developed in this study. Here, juvenile fish were sampled at stations more than 100 m apart, which probably explains the absence of spatial autocorrelation. However, spatial autocorrelation is likely to occur when sampling at higher spatial resolution and future models should account for this effect. Another implicit limitation of this study is that species richness and abundance cannot reveal species-specific patterns, which may sometimes entirely drive ecosystem functioning (Bellwood et al. 2006). Therefore, designing integrated management strategies based only on fish species richness and abundance would be simplistic and dangerous (Van Horne 1983; Pittman et al. 2007). In principle though, a similar modeling methodology can be applied to any reef community, provided that relevant environmental variables and scales are considered in the analysis. The present model should be considered as a first step setting the scene for more specific studies, relating habitat maps to the distribution of functional groups, or species, of juveniles and their species-specific biological attributes such as growth and survival.