Introduction

Small mammals are considered a hyper-diverse group encompassing taxa of different phylogenetic origins (Barnett and Dutton 1995). In general, this group includes secretive (i.e., mostly nocturnal) species exploiting very different niches, and specific sampling methods (e.g., live trapping devices) are required to estimate their presence and abundance. Consequently, estimating small mammal community composition and structure with a single live-trapping method can be unrealistic due to trap-specific differences in trappability among species, sexes, and age classes (Anthony et al. 2005; Lambert et al. 2005; dos Santos-Filho et al. 2006; Burger et al. 2009; Cáceres et al. 2011). Indeed, several authors suggested combining sampling techniques, by placing different trap models at different heights (i.e., ground and trees), to obtain more accurate assessments (Lambert et al. 2005; Cáceres et al. 2011). Yet studies comparing the relative efficiency of different types of commercially manufactured live traps for small mammals are still lacking (Jung 2016). Some live trapping devices nonetheless have a long history, having been developed and manufactured in the middle of the last century (e.g., Sherman traps: http://www.shermantraps.com; Longworth traps: Chitty and Kempson 1949). These are widely used live traps for small mammal community assessments in temperate and tropical areas, and their performance has been tested over the last decades (Morris 1968; Slade et al. 1993; Anthony et al. 2005; dos Santos-Filho et al. 2006; Jung 2016). Both kinds of traps are metal boxes made of light aluminum and have usually been reported to perform similarly in comparative field tests (Morris 1968; Torre et al. 2018; Eleftheriou et al. 2020). The Longworth trap has since been a standard for small mammal studies and monitoring programs in Europe (Flowerdew 2004; Sibbald et al. 2006), used alone or in combination with other trap models (Torre et al. 2016). This trap seems especially suitable for shrews and small semi-fossorial voles and has been used for monitoring populations of shrews (Jaquiéry et al. 2008), epigeal and fossorial voles (Paradis and Guédon 1993; Renwick and Lambin 2011), and mice (Montgomery 1989). Nonetheless, this trap showed some biases regarding the size of the species and individuals sampled, owing to its smaller entrance hole compared to other commercial live traps (Anthony et al. 2005; Jung 2016; Torre et al. 2016). Furthermore, this trap was also prone to failure and damage (caused by either captive or large animals), thus producing some undesired effects such as reduced capture efficiency and vandalism (Jackson and Hutchison 1985; Boekel 2014; Torre et al. 2022). Moreover, gnawing and chewing by small rodents and shrews produced significant damage to the trapping mechanisms (Barnett and Dutton 1995), thus shortening their lifespan (Boekel 2014). Some authors proposed modifications of the trapping protocol and trap mechanism to increase trap durability, avoid missing small shrews, and prevent mortality (Sibbald et al. 2006; Boekel 2013, 2014), and others improved camouflage by simply painting the trap (Jackson and Hutchison 1985). Another important handicap of the Longworth trap is its high price, which can constrain field experiments (Lambin and MacKinnon 1997). Notwithstanding these shortcomings, the Longworth trap is still recommended as the standard trapping method for small mammal monitoring programs in the UK (Flowerdew et al. 2004).

The Heslinga trap (http://www.heslingatraps.eu) looks like a green-colored Longworth trap, but it can be considered an improvement on the latter model (Boekel 2014). This trap (made of an aluminum-magnesium alloy) was created in 2004 in the Netherlands and used by investigators from that country from 2006 onwards, but it was not widely introduced to the scientific community until several years later (2015, Tom Heslinga, pers. comm.). This is why some investigators did not use this trap in their small mammal studies until recently (Westra et al. 2022; Torre and Palau 2023), and none of them has attempted to test its performance against the more traditional and standard live trapping models. Indeed, this trap model was not included as a sampling option for mice in recent practical trapping guides (Sikes and the Animal Care and Use Committee of the American Society of Mammalogists 2016; Machtinger and Williams 2020). However, this trap model offered, a priori, several advantages over the Longworth (Boekel 2014; Torre et al. 2021), being stronger and more durable (less prone to gnawing and chewing), showing better concealment (green painted), and being cheaper (half the price). The question that arises is whether it performs better than, similarly to, or worse than other commercial trap models normally used for small mammal community assessments.

Our goal in this study was to test the efficacy of the Heslinga trap under field conditions by comparing its performance with that of the medium-sized Sherman trap, a widely used live trapping model recommended for small mammal sampling in the study area and elsewhere (Machtinger and Williams 2020; Torre et al. 2021). Because of their similar design, the Heslinga and Longworth traps are expected to perform similarly, and the former could be used as an alternative to the latter, working alone or in combination with other trap models to maximize capture probability (McCarthy et al. 2021; Westra et al. 2022; Torre and Palau 2023). If this were true (i.e., Heslinga = Longworth), we would expect the Heslinga trap to perform similarly to the Sherman trap, as was observed in Longworth/Sherman comparisons (Morris 1968; Torre et al. 2018; Eleftheriou et al. 2020). Therefore, we analyzed several community parameters as if they were estimated by each trap model working independently, such as the total number of species recorded (species richness and diversity), the number of captures and recaptures obtained, the size (i.e., weight) of captures, the trap mortality, and the sampling inaccuracies recorded (trap failure). For the common species, we analyzed whether detectability affected the occupancy estimates, and whether the weight of the individuals captured differed between trap models. Thus, we ascertained whether relevant biases may be present when small mammal communities are sampled and estimated by both traps working together.

Material and methods

Study area

The field work was carried out in the north-east of the Iberian Peninsula (Catalonia, NE Spain, Fig. 1). Twelve of the 15 sampling stations (80%) were located in different Protected Areas (PN Cap de Creus, PN Montgrí, Illes Medes i Baix Ter, PN Capçaleres del Ter i del Freser, PN Zona Volcànica de la Garrotxa, PNIN Massís de l’Albera) of Girona province. Monitoring plots were distributed from the coast of the Mediterranean Sea (87 m.a.s.l.) to the Pyrenean highlands (2063 m.a.s.l.). The elevational gradient encompasses different climates and habitats, providing a wider range of potential species to trap and revealing potential biases in community assessments under each trap model. Differences in elevation and orography are reflected in the climatic data, which range from 561.1 mm to 1339.4 mm of accumulated annual precipitation, and from 16.6 °C (min: 11.7 °C, max: 21.7 °C) to 3.41 °C (min: 0.5 °C, max: 6.4 °C) of annual average temperature (data from the last 10 years of the Catalan Meteorological Services). The main habitat of the plots is woodland (67%; holm oak, pinewoods, beech, and oak forest), followed by open natural habitats (33%; scrublands, grasslands, and scree).

Fig. 1 Map showing the distribution of the monitoring stations (orange dots) in the study area according to the location of the protected areas (green shaded areas)

Sampling methodology and data analysis

Sampling was performed from spring 2019 to fall 2022, following the SEMICE monitoring protocol described elsewhere (Torre et al. 2018, 2021). This is a long-term monitoring program based on standardized abundance estimates of common small mammals. We surveyed 15 plots during eight seasonal sampling sessions conducted in spring (May to early June) and autumn (October to November). Each plot was sampled with a 6 × 6 trapping grid, consisting of 18 Sherman traps (Sherman folding small animal trap; 23 × 7.5 × 9 cm; Sherman Co., USA) and 18 Heslinga traps (http://www.heslingatraps.eu) in alternate positions (dos Santos-Filho et al. 2006) and spaced 15 m apart (Torre and Palau 2023). Traps were provided with hydrophobic cotton for bedding and baited with a piece of apple and a mixture of tuna and flour. Traps were active for three consecutive days and were checked twice a day (at dawn and dusk). The small mammals captured were identified to species, sexed, checked for reproductive status, weighed, marked (rodents with ear tags from National Band Co., USA, and shrews with a fur clip), and released at the place of capture. Live-trapped sympatric Apodemus sylvaticus/A. flavicollis individuals are difficult to identify in the hand due to the lack of marked differences in body size and fur color in the study area (Torre et al. 2015). Therefore, some doubtful individuals were assigned to Apodemus spp. To avoid biases in performance related to the use and wear of traps (Anthony et al. 2005), we bought 40 Heslinga and 40 Sherman traps at the start of the study.

To compare the effectiveness of the two trap models, we compared total counts of individuals (and recaptures), sampling inaccuracies (i.e., open traps with signs, closed traps without capture, and others), trap-induced mortality, and weight of captures (Torre et al. 2019). In the latter case, we also considered the weight of the two most common species, the wood mouse (Apodemus sylvaticus) and the greater white-toothed shrew (Crocidura russula). We used generalized linear mixed models (GLMMs), with either negative binomial (for counts) or Gaussian (for weight) error distributions, using the glmer.nb function (R package lme4) and the lmer function (R package lmerTest; Kuznetsova et al. 2017). Since all count variables showed dispersion irregularities (underdispersion, with scaled deviance < 0.5, or overdispersion, with scaled deviance > 1.5), models built with a negative binomial (instead of Poisson) error distribution showed a better fit to the data (Zeileis et al. 2008). For the selected models, we calculated pseudo-R2 values (Nakagawa and Schielzeth 2013) by means of the R function r.squaredGLMM and the delta method for variance estimation. Trap model was included as a fixed factor, and the monitoring station as a random factor. All these analyses were performed using RStudio software (RStudio Team 2020).
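The dispersion screening mentioned above can be illustrated with a minimal sketch. It assumes an intercept-only Poisson model (no trap or station effects), and the function names and example counts are hypothetical; the models in this study were fitted in R, so this only shows how a scaled deviance outside the 0.5 to 1.5 range would be flagged.

```python
import math

def poisson_scaled_deviance(counts):
    """Residual deviance divided by residual df for an intercept-only Poisson model."""
    n = len(counts)
    mu = sum(counts) / n  # fitted mean under the intercept-only model
    if mu == 0:
        return 0.0  # all-zero counts: the model fits exactly
    deviance = 0.0
    for y in counts:
        # y * ln(y / mu) is taken as 0 when y == 0
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        deviance += 2.0 * (log_term - (y - mu))
    return deviance / (n - 1)

def dispersion_flag(scaled_deviance, low=0.5, high=1.5):
    """Classify dispersion using the thresholds quoted in the text."""
    if scaled_deviance < low:
        return "underdispersed"
    if scaled_deviance > high:
        return "overdispersed"
    return "acceptable"

# Hypothetical captures per station (illustration only, not study data)
example_counts = [0, 0, 0, 12]
print(dispersion_flag(poisson_scaled_deviance(example_counts)))
```

A clumped set of counts such as the one above is flagged as overdispersed, which is the situation in which a negative binomial error distribution is usually preferred over a Poisson one.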

Individual-based species accumulation curves were used to estimate species richness (Gotelli and Colwell 2001) by trap model and for the combined traps. The expected richness functions were calculated with EstimateS v. 9.1.0 (Colwell 2012) after 100 randomizations (default option) of the observed number of species as individuals accumulated. To ascertain the completeness of the inventory obtained with each sampling method, we fitted the Clench equation to the species accumulation curves (Diaz-Francés and Soberón 2005). To fit the Clench equation to the functions provided by EstimateS, we used the procedure outlined by Jiménez-Valverde and Hortal (2003). Asymptotic species richness estimators for abundance-based data were also calculated (Chao 1). Because the number of species detected depends on the number of individuals sampled by each trap model (indeed, individuals carry the taxonomic information; Gotelli and Colwell 2001), comparisons of species richness and diversity were made at the maximum number of individuals sampled by the method showing the lower capture rates, by rarefying the number of species to the same number of individuals.
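As a rough illustration of how inventory completeness follows from the Clench equation, the sketch below uses the usual form S(n) = a·n / (1 + b·n), whose asymptote is a/b. The function names and the parameter values a and b are hypothetical, and the Chao 1 variant shown is the bias-corrected form, which is an assumption about the estimator rather than a statement of the exact one used here.

```python
def clench(n, a, b):
    """Expected number of species after n individuals under the Clench model."""
    return a * n / (1.0 + b * n)

def clench_asymptote(a, b):
    """Asymptotic richness predicted by the Clench model (n -> infinity)."""
    return a / b

def completeness(observed_species, asymptotic_richness):
    """Percentage of the predicted richness that was actually recorded."""
    return 100.0 * observed_species / asymptotic_richness

def chao1_bias_corrected(observed_species, singletons, doubletons):
    """Bias-corrected Chao 1 richness estimate from singleton/doubleton counts."""
    return observed_species + singletons * (singletons - 1) / (2.0 * (doubletons + 1))

# Hypothetical Clench parameters (illustration only)
a, b = 0.5, 0.05
print(round(clench_asymptote(a, b), 2))

# Completeness from the observed and asymptotic richness reported per trap model
print(round(completeness(11, 11.85), 1))  # Heslinga -> 92.8
print(round(completeness(8, 8.27), 1))    # Sherman  -> 96.7
```

Completeness is simply the observed richness divided by the fitted asymptote, so a trap model that records fewer species can still show a higher completeness if its curve levels off earlier.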

Occupancy models (MacKenzie et al. 2018) were used to obtain occupancy estimates of the small mammal species while accounting for changes in detectability with each sampling method. Imperfect detection (i.e., detectability p < 1; MacKenzie et al. 2002) can be a consequence of different responses of individuals to the sampling methods (Gorosito et al. 2018). We tested four sets of occupancy models, one for each combination of common species (wood mouse and greater white-toothed shrew) and trapping device (Heslinga and Sherman). We started with the simplest model, the single-season null model, which assumes that both occupancy and detectability remain constant: ψ(·), p(·). Next, we tested the possibility of seasonal changes in detectability: ψ(·), p(season). Then, we tested more complex models (i.e., simple multi-season models), including seasonal changes in other parameters such as colonization and extinction: ψ(·), γ(·), ε(·), p(·). The models were ranked according to AIC, and the most parsimonious models were selected to estimate occupancy and detectability for each species and trap device. All analyses were performed with Presence software (MacKenzie 2012).
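The logic of the single-season null model ψ(·), p(·) can be sketched for one station as follows. This is an illustrative implementation of the standard likelihood for a detection history, not the Presence code; the function names, the example histories, and the parameter values are hypothetical.

```python
def site_likelihood(history, psi, p):
    """Likelihood of one station's detection history under psi(.), p(.).

    history: sequence of 0/1 detections over repeated surveys.
    psi: probability that the station is occupied.
    p: probability of detecting the species in a survey, given occupancy.
    """
    prob_given_occupied = 1.0
    for detected in history:
        prob_given_occupied *= p if detected else (1.0 - p)
    likelihood = psi * prob_given_occupied
    if not any(history):
        # An all-zero history can also arise from a truly unoccupied station
        likelihood += 1.0 - psi
    return likelihood

def cumulative_detection(p, n_surveys):
    """Probability of at least one detection in n_surveys at an occupied station."""
    return 1.0 - (1.0 - p) ** n_surveys

def aic(log_likelihood, n_parameters):
    """Akaike information criterion used to rank candidate models."""
    return 2.0 * n_parameters - 2.0 * log_likelihood

# Hypothetical values (illustration only)
print(site_likelihood([0, 0, 0], psi=0.5, p=0.5))
print(site_likelihood([1, 0, 1], psi=0.5, p=0.5))
print(cumulative_detection(0.5, 2))
```

The second term for all-zero histories is what separates true absence from non-detection, which is why a device with low p can depress occupancy estimates even when naive occupancy looks similar.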

Results

During the eight sampling sessions conducted at 15 SEMICE stations, we captured 640 small mammals of eleven species, representing a capture success of 3% (640 individuals/21,600 trap-nights × 100). The wood mouse (Apodemus sylvaticus) was dominant (58.3%), and it was present in all the plots. The greater white-toothed shrew (Crocidura russula) was the second most captured species (19.1%), followed by the yellow-necked mouse (9.4%, Apodemus flavicollis), the bank vole (4.5%, Clethrionomys glareolus), the Algerian mouse (1.6%, Mus spretus), and the common vole (1.4%, Microtus arvalis). Five other species accounted for less than 1% each (Table 1).
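The capture success quoted above is a simple rate per 100 trap-nights, as the following short sketch shows. The function name is hypothetical; the two input figures are the totals given in the text.

```python
def capture_rate(captures, trap_nights):
    """Captures per 100 trap-nights."""
    return 100.0 * captures / trap_nights

# Totals reported in the text: 640 individuals over 21,600 trap-nights
overall = capture_rate(640, 21600)
print(round(overall, 2))  # 2.96, i.e. about 3%
```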

Table 1 Absolute and relative (in parentheses) number of captures and recaptures of the 11 small mammal species by trap model in 15 SEMICE stations during 2019–2022

Total estimated species richness was 11 ± 0.69 (SD), and the Clench equation fitted the species accumulation curve very well (r2 = 0.96), with an asymptotic estimate of 11.68 ± 0.25 (SD) species. This represents 94.1% of the species potentially present in the area. More captures (63.6%) and recaptures (59.3%) were obtained with the Heslinga trap (Tables 1 and 2), which outperformed the Sherman trap for almost all the species; this was evident for the common species (wood mouse: 61.4%; white-toothed shrew: 69.7%; yellow-necked mouse: 63.3%; and bank vole: 58.6%), but also for rare species that were only trapped by Heslinga traps. Eight of the 11 species (72.7%) were shared by both sampling methods, but the Heslinga trap recorded three species that were not detected by Sherman traps. Indeed, the Heslinga traps recorded 11 species, whereas the Sherman traps recorded only eight.

Table 2 Trap performance tested with generalized linear mixed models (six dependent variables) and linear mixed models (three dependent variables). The sign of the Z-value indicates the performance of Sherman traps compared to Heslinga traps (i.e., Heslinga as reference level)

Estimates of species richness were significantly higher for the Heslinga traps considering species accumulation curves rescaled to the number of individuals captured by each device (Fig. 2). The Sherman traps recorded 8.0 ± 0.49 (SD) species, and the Heslinga traps recorded 10.75 ± 0.51 (SD) species for the same number of individuals captured (n = 326 captures). The Clench equation indicated the inventory power of both trap models, yielding a significant increase for Heslinga traps (asymptotic richness 11.85, 92.8% of species detected) compared to Sherman traps (asymptotic richness 8.27, 96.7% of species detected). The Shannon diversity was slightly higher for the Heslinga community (H′ = 1.31 ± 0.03 vs. 1.10 ± 0.0). However, the small mammal community structure was similar when using both kinds of traps, as revealed by a log-linear model for contingency tables (interaction species × trap model: Chi2 = 9.64, p = 0.47, df = 10). This means that the frequency of occurrence of each species was similar when estimated by each trap model: A. sylvaticus (H: 56.3%; S: 61.8%), C. russula (H: 20.9%; S: 15.9%), A. flavicollis (H: 9.3%; S: 9.4%), and C. glareolus (H: 4.2%; S: 5.2%). This was further confirmed by comparing the distribution breadth (i.e., the number of stations with detections of a species divided by the total number of stations), which was slightly higher for Heslinga traps for three common species, namely the wood mouse (H: 100%; S: 93.3%), the white-toothed shrew (H: 66.6%; S: 60%), and the yellow-necked mouse (H: 26.6%; S: 20%), but not for the bank vole (H: 13.3%; S: 20%). Differences in distribution breadth between trap models were relatively small (± 6.6/6.7% for each species and trapping device).
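Two of the indices used above are easy to reproduce by hand. The sketch below computes the Shannon index from a vector of counts and the distribution breadth from station counts; the function names and the example counts are hypothetical, and the station numbers (e.g., 14 of 15) are inferred from the reported percentages rather than taken from the tables.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over species with counts > 0."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def distribution_breadth(stations_with_detection, total_stations):
    """Percentage of stations where a species was detected."""
    return 100.0 * stations_with_detection / total_stations

# Hypothetical community with two equally abundant species: H' equals ln(2)
print(shannon_index([5, 5]))

# With 15 stations, one station more or less changes breadth by 100/15 points
print(round(distribution_breadth(14, 15), 1))  # 93.3
print(round(distribution_breadth(9, 15), 1))   # 60.0
```

With only 15 stations, each station is worth about 6.7 percentage points of breadth, so the differences between trap models reported above correspond to a single station per species.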

Fig. 2 Species (± SD) accumulation curves, rescaled for the number of individuals captured, for the Heslinga, Sherman, and both trap models combined. Chao 1 estimator also shown

Regarding the size (i.e., weight) of the individuals sampled, both trap devices showed significant differences (Table 2). The Sherman trap captured larger individuals (mean = 21.40 g ± 9.21 SD, range 5–72 g) than the Heslinga (mean = 18.12 g ± 8.12 SD, range 3–64 g). Weight differences were also observed in wood mice (H: mean = 20.25 g ± 5.95 SD, range 6–38 g; S: mean = 22.61 g ± 5.30 SD, range 9–38 g), but not in white-toothed shrews (H: mean = 8.08 g ± 1.63 SD, range 3–13 g; S: mean = 8.26 g ± 1.28 SD, range 5–10 g).

Trap mortality was higher in Sherman than in Heslinga traps (6.43% vs. 3.00%), but these differences were not statistically significant (Table 2). Shrews showed a slightly higher mortality rate (5.3%) than rodents (4.0%). Sampling inaccuracies were higher in Sherman (3.30 inaccuracies/100 trap-nights) than in Heslinga traps (2.15 inaccuracies/100 trap-nights), and significant differences were observed for closed traps without captures, and for open traps with evidence of small mammals that entered traps but were not captured (Table 2). Despite these differences, the number of sampling inaccuracies was anecdotal, since the traps available for captures were 97.85/100 trap-nights (Heslinga) and 96.70/100 trap-nights (Sherman). Owing to the extremely low trapping success for both trap models (H: 5.24 individuals/100 trap-nights; S: 3.17 individuals/100 trap-nights), the sampling inaccuracies did not represent a significant decline in the traps available for additional captures.

The four most parsimonious occupancy models (considering two species and two trap devices) were multi-season models that included seasonal variation in detectability while occupancy remained constant through time (ψ(·), γ(·), ε(·), p(season)). In only one case (A. sylvaticus with the Heslinga trap) did the selected model also include seasonal changes in colonization: ψ(·), γ(season), ε(·), p(season). Seasonal occupancy estimates for that species were slightly higher using Heslinga than Sherman traps (ψ = 0.66 ± 0.09 vs. ψ = 0.56 ± 0.09), but seasonal detectability was the same with both kinds of traps (p = 0.49 ± 0.09 for both). Indeed, seasonal occupancy estimates were not affected by trap device in wood mice (Wilcoxon’s z = 1.40, p = 0.16, n = 8). Nonetheless, mean seasonal occupancy estimates were affected by trap device in C. russula (ψ = 0.43 ± 0.10 vs. ψ = 0.23 ± 0.07), as were mean detectabilities (p = 0.43 ± 0.11 vs. p = 0.20 ± 0.09), which resulted in higher occupancy estimates for Heslinga traps (Wilcoxon’s z = 2.52, p = 0.01, n = 8). Differences in detectability were most noticeable during the first two sampling sessions (year 2019), and detectabilities were more similar during the last campaigns (Fig. 3).

Fig. 3 Mean seasonal detectability (probability of detection ± SE) for the Heslinga and Sherman trap models across the eight sampling sessions conducted at 15 SEMICE stations for the wood mouse (upper panel) and the greater white-toothed shrew (lower panel)

Discussion

This is, as far as we know, the first attempt to study the efficacy of the Heslinga trap for small mammal community assessments, shedding some light on its usefulness as a plausible substitute for the Longworth trap in small mammal studies. Longworth and Sherman traps are the most widely used trapping devices (Anthony et al. 2005) and are recommended for small mammal investigations in Europe and North America (Flowerdew et al. 2004; Machtinger and Williams 2020). Comparative studies showed that both traps performed similarly under field conditions (Morris 1968; Torre et al. 2018; Eleftheriou et al. 2020). The Heslinga trap has a design similar to that of the Longworth trap and, although specific tests comparing both trap models are needed, a similar performance could be expected. In fact, some authors used this trap as a natural substitute for the Longworth even though no field tests had been performed to confirm its efficacy (Westra et al. 2022). Nonetheless, the Heslinga trap can be considered an improvement on the Longworth (Boekel 2014; Torre et al. 2021), and our results suggested that the former could outperform the latter.

Indeed, the Heslinga trap recorded more species than the Sherman (11 versus 8), and it also recorded more captures (63% versus 37% of the total), which can be an advantage in periods of low abundance, as was our case. Nonetheless, capture rates were extremely low throughout the study period (3 captures per 100 trap-nights), and although the Heslinga obtained higher capture rates than the Sherman (3.77 versus 2.16 captures per 100 trap-nights, respectively), these differences were deemed unimportant. Reduced capture rates (< 10%) were obtained with Sherman traps in tropical and temperate areas (dos Santos-Filho et al. 2006; Umetsu et al. 2006; Cáceres et al. 2011; Stephens and Anderson 2014; Torre et al. 2016), suggesting that the community estimates provided by this live trapping model could be misleading (O’Farrell et al. 1994). Indeed, some authors showed higher species richness and captures when using tunnel-based traps (i.e., Longworth) than box traps (i.e., Sherman), and these differences were due to higher capture rates of shrews and voles (Jung 2016). But others showed lower or similar capture rates for the same trap comparisons (Anthony et al. 2005; Torre et al. 2018), suggesting that the performance of each trap model changed depending on the target species and communities studied. Alternatively, these low capture rates could be related to unfavorable climatic conditions during the study period (Deitloff et al. 2010). It may seem obvious that small species prefer entering traps with narrow entrances, whereas large species are not able to enter these tunnel-based traps (e.g., Arvicola sapidus; Torre et al. 2016). This was further supported by our results, with Heslinga traps capturing individuals that were on average 3.3 g lighter than those captured by Sherman traps, confirming size biases in community assessments (Jung 2016). Therefore, in small mammal communities holding small species, some biases favoring the tunnel-based traps could be evident, but where communities are composed of larger species (Eliomys quercinus, Rattus rattus), a different pattern could emerge. Even the depiction of the age classes, demography, and physical condition of a particular species (Pearson et al. 2003) could be altered when estimated by each trap model independently. Whether these weight differences were caused by species behavior regarding the size and age of individuals attracted to the traps, or by treadle sensitivity, remains unexplored. Hence, the use of different trap models working simultaneously can offer the best assessments in small mammal communities holding species of different sizes and behavior (Anthony et al. 2005; Nicolas and Colyn 2006; Torre et al. 2016).

Despite evident differences in performance between the two trap models, the analyses of the frequencies of occurrence of the most captured species yielded similar values. This means that the representativeness of the species in the communities was the same when estimated by each trap model. The four most common species (A. sylvaticus, C. russula, A. flavicollis, and C. glareolus) were captured in the same order of frequency and represented more than 90% of captures for both trap models. Since the common goal of small mammal community studies is to accurately determine the number of species and their relative abundance (Stephens and Anderson 2014), the use of the Heslinga trap offered similar or even better results than the Sherman trap.

Sampling inaccuracies can alter small mammal community estimates by reducing the number of traps available for capturing individuals (Beauvais and Buskirk 1999). Indeed, Sherman traps are considered more prone to failure than Longworth traps (Torre et al. 2019), because they suffer more attacks by large mammals (Torre et al. 2022). Our results further confirmed this, as the Sherman traps recorded more inaccuracies than the Heslinga traps, but the frequency of trap failure was deemed irrelevant (about 3 inaccuracies per 100 trap-nights), hardly affecting population estimates. Owing to the extremely low trapping success for both trap models, the sampling inaccuracies did not represent a significant decline in the traps available for additional captures. These results were similar to those found in other SEMICE stations of the study area, in which sampling inaccuracies affected less than 5% of traps (Torre et al. 2019). However, trapping inaccuracies related to the age and wear of Sherman traps could be largely reduced by using new traps (Anthony et al. 2005).

Trap mortality was about twice as high in Sherman as in Heslinga traps (6.4% vs. 3.0%), but differences were not statistically significant (see Torre et al. 2016 for similar results). High numbers of deaths in Sherman traps, mostly affecting shrews, were also noticed in other studies in temperate areas (Shonfield et al. 2013; Jung 2016), even though Longworth traps captured more shrews (Jung 2016; Torre et al. 2016). Nonetheless, the increased frequency of trap checks (at dawn and dusk) surely reduced mortality compared to studies with trap checks only at dawn.

Changes in the detectability of a species with a sampling method can affect the occupancy estimates when detection probabilities are low (p < 0.3; MacKenzie et al. 2002). Wood mouse detectability was similar and well above that threshold for each trapping device (p = 0.49), and thus did not affect the seasonal occupancy estimates. These results confirmed that A. sylvaticus can be sampled by both devices without significant biases, a pattern that could be generalized to other areas (Torre et al. 2016, 2018). But this was not the case for the greater white-toothed shrew, which showed changes in detectability between trapping devices that affected the occupancy estimates. Indeed, Sherman traps underestimated seasonal occupancy (ψ = 0.23 versus 0.43 for Heslinga) because the detectability of C. russula was p < 0.3, despite the fact that overall naïve occupancy (i.e., the proportion of sampling stations where the species was present) was rather similar (Heslinga: 66.6%; Sherman: 60%). These results confirmed that, in some areas, shrews can be undersampled by Sherman traps (Jung 2016; Torre et al. 2016). Therefore, neglecting differential detection probabilities between trapping devices can lead to false conclusions about spatial occupancy and habitat selection (Gorosito et al. 2018).

This is the first test, as far as we know, analyzing the performance of the Heslinga trap working side by side with a conventional commercial live trapping model (medium-sized Sherman). We showed that the Heslinga trap outperformed the Sherman trap regarding species diversity estimates, capture success, and sampling inaccuracies. These results are promising and, owing to the several advantages mentioned above (i.e., stronger, more durable, better concealed, and cheaper), the Heslinga trap could be a serious alternative to the more widely used Longworth trap. Furthermore, the frequencies of occurrence of the species recorded by each trapping device were similar, suggesting that both trap models yielded a similar community composition (i.e., the proportion represented by a species over the total). These results suggested that the use of the Heslinga trap alone in sampling stations (Hofmeester et al. 2017) would be enough to obtain accurate community estimates, but the detection of biases regarding the size of individuals sampled and the detectability of some species may lead to the recommendation of combining both trap models to improve overall capturability (Westra et al. 2022; Torre and Palau 2023).