Introduction

Important selection pressures on structures of animals affect the evolution of their life histories. Major examples of such structures in insects are the appendages used for foraging, material processing, reproduction, escaping, and hunting. As a consequence of their use, worn mandibles and ovipositors lead to usage constraints, increased metabolic costs, and decreased lifespan and survival (Bernays 1991; Quicke et al. 1998; Roitberg et al. 2005; Park et al. 2009; Schofield et al. 2011). For example, during foraging activities, Spodoptera moth caterpillars feeding on plants rich in Si+ or calcium oxalate crystals suffer increased abrasion of their mandibles and, consequently, reduced larval growth (Massey and Hartley 2009; Park et al. 2009); in leaf-cutter ants, accumulated mandible wear increases the energy expended on their labors by a factor of two (Schofield et al. 2011). Likewise, for nest-building hymenopterans like megachilid bees (Schaber et al. 1993), carpenter bees (Flores-Prado et al. 2014), ants (Acosta et al. 1983), and social wasps (Sarmiento 2004; Silveira and dos Santos 2011), mandible morphology is under strong selective pressure due to factors such as the material used for construction or the substrate in which they dig their nests. This abrasion of the structure will increase individual senescence independently of age (Giraldo and Traniello 2014; De Verges and Nehring 2016). Since life expectancy may be reduced when workers abrade their mandibles while foraging for material and constructing the nest (O’Donnell and Jeanne 1995; Toth et al. 2016), an optimal task allocation might include an interplay between colony requirements, life expectancy, and worker attrition avoidance. For example, in ant workers, there are records of shifts towards less abrasive labors when mandibles are worn (Schofield et al. 2011; Giraldo and Traniello 2014).

On the other hand, in social species, efficient nest construction is expected, as the nest is expensive and is also the place for a large number of vital functions of the colony. One strategy to improve the efficiency of nest construction is task partitioning, or allocation of individuals into subtasks, such as pulp foraging, water foraging, and structure building (Jeanne 1986a). Specialization reduces work output losses due to adjustment times as a result of labor changes (Leighton et al. 2017) and may also impose lower cognitive demands (Bernays and Funk 1999; Rana et al. 2002; Chittka and Muller 2009). Another factor favoring task specialization is colony size, as it will increase efficiency (Jeanne 1991a, b). For example, in the neotropical social wasp Polybia occidentalis, larger colonies show a higher degree of task partitioning in nest construction than do small colonies (Jeanne 1986a). However, some factors may impose strong selection against rigid task partitioning. One is the uncertainty of environmental challenges posed to the colony, which may favor a certain amount of flexibility in age task partitioning (Seeley 1982). For example, a climatic event can damage the nest and thus require a rapid response of workers to perform building duties regardless of their current setting. In fact, O’Donnell and Jeanne (1992a, b) have suggested that workers may adjust their task distribution depending on the colony context; specifically, if the colony is under a construction pulse, we may expect more individuals performing this labor regardless of their age. A second factor could be the cost of the task: highly demanding or even dangerous labors, such as nest construction or foraging, may generate a rapid or abrupt decline in a worker’s performance, reducing its net contribution to the colony.
A schedule for performing tasks according to their cost may reduce the effects of structural attrition and improve worker output (Jeanne 1986b; O’Donnell and Jeanne 1992a, b; Giraldo and Traniello 2014). The study of this phenomenon acquires interesting characteristics in social wasps and implications for bees, as they, unlike most ants, perform different tasks at different ages and also because worker production is mostly episodic in the colony (Jeanne 1991a, b; Seeley 1982).

Thus, as described above, task partitioning around nest construction will be under opposing forces that may be shaped by the consequences of mandible attrition. Accordingly, we tested two questions. First, whether mandible-demanding tasks, such as pulp management and nest construction, are concentrated into a single age, as expected under age polyethism, or are dispersed across the different ages; to do so, we compared mandible wear between workers of different ages, using wing wear as a proxy of relative age. We conducted these analyses at the colony and subfamily levels using 18 colonies belonging to 13 species of neotropical social wasps. Second, whether task specialization is related to colony size, using mandible wear variation as an indication of specialization. We conducted these analyses at the subfamily level using 13 species of neotropical social wasps.

Materials and methods

Species studied

We studied 18 colonies of 13 species of neotropical Polistinae wasps selected as representatives of major clades of the neotropical Polistinae phylogenetic hypotheses (Wenzel and Carpenter 1994; Piekarski et al. 2018; Menezes et al. 2020). The number of individuals studied per nest was as follows: Mischocyttarus sp. (n = 15), Polistes erythrocephalus (nest A, n = 15; nest B, n = 12; nest C, n = 15), Agelaia pallipes (n = 13), Parachartergus apicalis (n = 15), Parachartergus fraternus (n = 16), Synoeca septentrionalis (n = 14), Metapolybia aztecoides (n = 12), Protopolybia exigua (n = 13), Charterginus fulvus (n = 15), Polybia occidentalis (nest A, n = 14; nest B, n = 15; nest C, n = 15), Polybia velutina (n = 14), Polybia gorytoides (n = 14), and Polybia emaciata (nest A, n = 9; nest B, n = 14). We studied a total of 250 individuals. We avoided within-colony sampling bias by capturing entire nests, with all their individuals, at night using a plastic bag. After collection, individuals were transferred to 96% ethanol.

We dissected mandibles from the individuals using a Leica S8APO stereomicroscope. Mandibles were cleaned using cavitation for 2 min at 26 °C with an FS20D ultrasonic cleaner (Fisher Scientific); they were then dehydrated by soaking in 96% ethanol for 3 min and subsequently transferred to Xylol for 10 min. Gold-coated mandibles were photographed in a mesial position using an FEI Quanta200 scanning electron microscope at 25–30 kV (Fig. 1). Measurements were taken from scaled SEM images using ImageJ version 1.50b (https://imagej.nih.gov/ij/). Every measurement is the mean of the left and right mandibles of each individual. Mandible terminology follows Silveira and dos Santos (2011).

Fig. 1

Mesial views of social wasp mandibles. a Mandible areas studied; b examples of abrasions in teeth one and two of S. septentrionalis; and c example of tooth loss in Mischocyttarus sp. T1 = tooth one, T2 = tooth two, T3 = tooth three, T4 = tooth four, Ad = apical denticle, Md = medial denticle, a-d = location of linear measurements of denticle wear, W = mandible width. Scale bar = 200 μm

We used different approaches to measure teeth and denticle wear. Since mandibles varied in size and their teeth may gradually abrade or lose tips because of usage (Fig. 1b, c), estimation of the original loss of length as a quantitative indication of wear was not possible. Thus, we measured mandible’s tooth wear as follows: Groups of 3–6 observers (mean = 5.9) graded wear from the SEM image of each mandible into one of the following four ranks: (1) no sign of wear; (2) small isolated chips; (3) close groups of chips located in more than one region of the tooth, and large fractures; and (4) large and extensively damaged edges losing the original shape. Observers received randomly selected mandibles to grade. Observers were unaware of the species under evaluation. We used the averages of the rankings provided by all observers of a single mandible.

We always observed the original edges of the apical and medial denticles; thus, we recorded the maximum length of worn surfaces from the edge of the denticle to its inner part (Fig. 1a). We took linear measurements both at the base and near the apex of the apical and medial denticles (Fig. 1a), and then averaged both values per denticle. We did not measure the basal denticle because it was extremely small in several species or because the entire structure was lost. We are aware that this could bias our conclusions regarding the effect of wear on the mandible as a whole, but there were no reliable options to quantify wear in this structure.

To correct for the effect of mandible size differences between species, we used the ratio between every denticle measurement and the mandible width, as measured from the anterior border close to the carinae of the anterior mesial sulcus to the distal border of the mandible (Fig. 1a). This measurement is close to those used in previous studies (Sarmiento 2004; Silveira and dos Santos 2011), but we adjusted it to account for differences in mandible curvature and make the values more comparable. Variables were log-transformed. After teeth and denticle measurements were recorded, we selected denticle wear data for the subsequent analyses based on three criteria: denticle and teeth wear data are positively correlated (Pearson correlation test, R = 0.65, t = 13.6, df = 248, p ≪ 0.05, Appendix S1 in supplementary); denticle data showed overall lower within-age variation (Appendix S2 in supplementary); and denticle size is related to the hardness of materials used for nest construction (Sarmiento 2004).
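The size correction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the example values are hypothetical, the natural logarithm is assumed because the text does not state the base, and the text does not say how a wear value of zero was handled before the log transformation, so the sketch simply rejects it.

```python
import math


def size_corrected_wear(left_wear, right_wear, left_width, right_width):
    """Return log(mean denticle wear / mean mandible width) for one individual.

    Wear and width must share the same unit (e.g. micrometres).
    Assumption: natural log; the source only says "log-transformed".
    """
    # Every measurement is the mean of the left and right mandibles.
    wear = (left_wear + right_wear) / 2
    width = (left_width + right_width) / 2
    if wear <= 0 or width <= 0:
        # The handling of zero wear is not described in the text.
        raise ValueError("wear and width must be positive to take the log")
    return math.log(wear / width)


# Hypothetical individual: 30 um of worn surface on a 600 um wide mandible.
example = size_corrected_wear(28.0, 32.0, 600.0, 600.0)
```

Dividing by mandible width before transforming makes the value dimensionless, so a small and a large species with the same proportional wear receive the same score.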

Mandible wear and age

To test the relationship between mandible wear and worker age, we used wing wear as an indication of individual age. Given cautionary evidence regarding the relationship between wing wear and age (Foster and Cartar 2011), we assumed the conservative approach of using wear at the distal margin of the wings as an indicator of both relative age and shifting from within-colony activities to the foraging activity stage. There is solid evidence that this behavior occurs at the oldest age of the wasps (Jeanne 1986b, 1991a; O’Donnell and Jeanne 1992b). We added a third intermediate category as indicator of the initial stages of foraging activities. Wing wear cannot be taken as a chronological indicator of age but as a relative age proxy as the time at which individuals perform a specific function may be influenced by stochastic colony requirements, such as incidental massive loss of individuals, pulses of specific functions (Jeanne 1991a), and environmental challenges (Hölldobler and Wilson 1990; Gordon 2013, 2019).

In several insects, wing wear, as an indicator of age, can be altered by factors such as courtship, digging, or predatory activities, or by the complexity of the environment while flying (Hayes and Wall 1999; Polidori et al. 2006; Foster and Cartar 2011; Nalepa 2012). However, these factors may have little influence in our case, since vespid colonies are largely composed of worker females, which rarely perform courtship activities, and none of the species included digs its nest in the soil. We believe that the influence of predatory or flying-environment complexity factors would be closer to noise than to bias. Individuals with no wing damage were classified as the youngest (1) (Fig. 2a). As increased flying time indirectly contributes to wing wear (Foster and Cartar 2011), we further divided the remaining individuals into two categories: (2) middle, for individuals with sparse damage and small rips, and (3) old, for individuals with large portions of their wings missing (Fig. 2b–c). Once adults of the entire colony were divided into these three groups, between three and seven individuals per category (SD = 0.9) were randomly selected for studying their mandible wear. In this way, we had no reason to assume biased sampling within each category. Wing wear ranking was based on a relative scale and not on an absolute measurement of wing area loss. Thus, interspecific differences in body size or wing size will not interfere with the results. We used the coefficient of variation (CV) of denticle wear per age within a colony as an indicator of task specialization. CV was calculated as the ratio between the standard deviation and the average values per nest.
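As a minimal sketch of the CV calculation per age category within one colony, the following Python code may help. The function names and the wear values are hypothetical, the sample standard deviation (n − 1 denominator) is an assumption because the text does not specify it, and the example uses untransformed wear ratios because the text does not state on which scale the CV was computed.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return SD / mean. Assumption: sample SD (n - 1 denominator)."""
    if len(values) < 2:
        raise ValueError("at least two individuals are needed")
    average = mean(values)
    if average == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / average


def cv_by_age(wear_by_age):
    """Map each relative age category (1, 2, 3) to the CV of its wear values."""
    return {age: coefficient_of_variation(v) for age, v in wear_by_age.items()}


# Hypothetical colony: wear ratios for young (1), middle (2), and old (3) workers.
colony = {
    1: [0.020, 0.030, 0.040],
    2: [0.030, 0.050, 0.040, 0.060],
    3: [0.050, 0.055, 0.060],
}
cv_per_age = cv_by_age(colony)
```

Under the interpretation used in the text, a high CV within an age category indicates that workers of similar relative age differ strongly in mandible wear.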

Fig. 2

Examples of wing wear used to define relative age categories in the wasps studied. a Young individual with no damage in its wing; b middle age individual with sparse damage and small rips in its wing; c old individual with large portions of its wing missing. Scale bar = 3 mm

Statistical analyses

Mandible wear between the three age categories per colony was analyzed through Spearman regressions and Kruskal–Wallis tests. Ages were coded as numbers but treated as factors. Because we included multiple species in our analyses, we tested for phylogenetic signal using Pagel’s (1999) lambda (Molina-Venegas and Rodríguez 2017) for the CV of mandible wear per colony and for colony size. The lambda index expresses whether character evolution is linked to the phylogenetic relationships of the group (Münkemüller et al. 2012; Molina-Venegas and Rodríguez 2017). It can be argued that mandible wear per se is not a heritable character; however, the amount of wear in a mandible can be a result of mandible shape, characteristics of the mandibular apparatus, and mode of mandible use, all of which are heritable traits that can be reflected in the amount of wear. The phylogenetic signal was not assessed for wing wear, as individuals were ranked into categories one, two, or three according to preset levels of wing area loss as observed within each colony; thus, all individuals with the highest wing area loss were ranked three in every colony regardless of the species.
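The two per-colony tests named above can be illustrated with a small, dependency-free Python sketch. It is not the authors' R workflow: the function names and data are hypothetical, only the omnibus Kruskal–Wallis statistic is computed (no pairwise post hoc tests), and the p-value uses the chi-square approximation, which for three groups (two degrees of freedom) reduces to exp(−H/2).

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sx * sy)


def kruskal_wallis_three_groups(young, middle, old):
    """Return (H, p) with tie correction for exactly three age categories."""
    groups = [young, middle, old]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are tied")
    h /= correction
    # Chi-square survival function with 2 degrees of freedom.
    return h, math.exp(-h / 2)


# Hypothetical colony: wear scores for three workers per age category.
h_stat, p_value = kruskal_wallis_three_groups([1, 2, 3], [4, 5, 6], [7, 8, 9])
rho = spearman_rho([1, 1, 1, 2, 2, 2, 3, 3, 3], [1, 2, 3, 4, 5, 6, 7, 8, 9])
```

With only three to seven individuals per category, the chi-square approximation is rough, so the p-value in this sketch should be read as indicative only.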

To test whether there are differences in both the mandible wear and the coefficient of variation between ages across species, we conducted a phylogenetic ANOVA (Garland et al. 1993) considering the species as repeated measurements as we have both the mandible wear, and the CV for the three age categories from each of the thirteen species studied. For this analysis, we used Menezes et al. (2020) phylogeny as it includes more species used in our study and includes explicit time-calibrated branch lengths. The relationship between colony size and CV of mandible wear was estimated through a phylogenetic generalized least-squares (PGLS) analysis (Grafen 1989) with Pagel’s lambda correlation structure as this showed the highest model fitting (AIC = 7.89) compared to a Brownian model (AIC = 7.08). We used the mean values of all nests for those species where we included more than one colony. Variables were log-transformed. The influence of uncertainty factors, such as species set, sampling effort, and phylogeny, to estimate both phylogenetic signal and the relationship between variables, was studied using Paterno et al. (2018)’s methodology. Sampling effort analysis used 100 replications.
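For reference, the standard form of a PGLS estimate with Pagel's lambda correlation structure can be written as follows. The notation is an assumption introduced here, not the authors' own: y is the vector of log-transformed CV values per species (taken as the response), X is the design matrix holding an intercept and log-transformed colony size, and C is the covariance matrix implied by shared branch lengths under Brownian motion.

```latex
\hat{\beta} = \left( X^{\top} C_{\lambda}^{-1} X \right)^{-1} X^{\top} C_{\lambda}^{-1} y,
\qquad
\left( C_{\lambda} \right)_{ij} =
\begin{cases}
C_{ij} & \text{if } i = j \\
\lambda \, C_{ij} & \text{if } i \neq j
\end{cases}
```

With λ = 1 the structure reduces to the Brownian model, and with λ = 0 all covariances between species vanish, so the species are treated as phylogenetically independent.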

For the analyses, we took into account three phylogenetic hypotheses for the polistines. First, the morphologically based generic phylogeny by Wenzel and Carpenter (1994) was used with a standard branch length of one. Second, we used the anchored-hybrid enrichment phylogeny by Piekarski et al. (2018), and third, the ultraconserved elements phylogeny by Menezes et al. (2020). The latter two trees include actual species as terminals and provide explicit branch lengths; in addition, the last one is time-calibrated. To account for the differences in taxon sampling between the published phylogenies and our study, we preserved intergeneric topology and branch lengths in the pruned trees as follows: (1) we added the internode lengths of the missing genera; (2) if the phylogeny included a single species per genus, we used its length for the congeneric species studied; (3) if more than one species per genus was available in the molecular phylogenies, we calculated the mean of all branch lengths from those terminals to the common node of the congeneric species and used this value for our studied species. Parachartergus species were not included in the analyses where the Piekarski et al. (2018) phylogeny was used. For the genus Polybia, we set a species branch length of 0.0001 for the Wenzel and Carpenter (1994) and Piekarski et al. (2018) phylogenies (Purvis et al. 1994). All statistical analyses described above were conducted in the base R v 3.4 environment (R Core Team 2020) with the packages ape (Paradis and Schliep 2019), geiger (Harmon et al. 2008), ggpubr (Alboukadel 2020), ggplot2 (Wickham 2016), nlme (Pinheiro et al. 2021), mvMORPH (Clavel et al. 2015), phytools (Revell 2012), and sensiPhy (Paterno et al. 2018).

Results

In general, we observed a positive trend in the relationship between mandible wear and age in the 18 colonies compared; however, this relationship was significant in only seven of these colonies (Fig. 3, Appendix S3 in supplementary). Similarly, the Kruskal–Wallis tests for age group per colony did not show clear differentiation between age ranks: in six colonies, there was differentiation between young and old individuals, in two colonies between middle-aged and old individuals, and in three colonies between young and middle-aged individuals (Fig. 3, Appendix S3 in supplementary). The phylogenetic ANOVA with repeated measurements pointed towards no statistical differences in mandible wear between ages when considering species (F = 0.40, p = 0.66).

Fig. 3

Mandible wear across ages per colony. Sample size per age in a colony ranged between three and seven individuals (SD = 0.9), here represented by dots. Significant Spearman regressions are indicated by a continuous slope line while non-significant regressions are indicated by a dashed slope line. Different letters by each age column indicate differences at 0.05 according to the Wilcoxon pairwise post hoc tests of the Kruskal–Wallis test

We detected phylogenetic signal for colony size with the Menezes et al. (2020) phylogeny (λ = 0.999, p = 0.03) but not with the Piekarski et al. (2018) or Wenzel and Carpenter (1994) hypotheses (λ = 0.999, p = 0.44; λ = 0.999, p = 0.24; respectively). However, the combined estimate calculated by the sensitivity analysis indicated phylogenetic signal for this variable (λ = 0.999, p = 0.03), had no influential species, and was not affected by the tree (mean p = 0.24); in addition, the lambda estimate remained within 10% of the variation even when removing up to 15% of the species (Appendix S4 in supplementary). The CV of mandible wear showed no phylogenetic signal with any of the phylogenies (Menezes et al. (2020)’s λ = 0.00007, p = 1.0; Wenzel and Carpenter (1994)’s λ = 0.976, p = 0.09; Piekarski et al. (2018)’s λ = 0.99, p = 0.18). The general sensitivity analysis suggests no phylogenetic signal (λ = 0.00006, p = 1.0), and this result is not affected by the phylogeny (p = 0.42). The two species of Parachartergus appear as influential in the outcome; however, their removal did not change the main conclusion (P. apicalis λ = 1, p = 0.56; P. fraternus λ = 1, p = 0.52). Nonetheless, the analysis indicated that removing 8% of the species placed 19% of the values above 10% of the variation of the estimate, suggesting the importance of a more complete sample for the estimation of this trait (Appendix S5 in supplementary). The analyses suggested that the estimates of phylogenetic signal are strongly influenced by the characteristics of the dataset. It is possible that, despite the sensitivity analyses, the number of species sampled is too low to yield a consistent result.

There were no significant differences in mandible wear variation between wasp worker ages using the phylogenetic ANOVA for repeated measurements (F = 0.36, p = 0.69) (Fig. 4). The coefficient of variation was not statistically related with the colony size using Menezes et al. (2020)’s phylogeny (PGLS: r = 0.2, p = 0.19) (Fig. 5); similar conclusions were obtained using the other phylogenies (Wenzel and Carpenter 1994’s, p = 0.22; Piekarski et al. 2018’s p = 0.14). The sensitivity analysis did not generate a significant relationship between these variables either considering influential species detected (A. pallipes was detected, p = 0.13) or the phylogeny (w = 0.24); and a high sensitivity to species sampling size was observed such that a removal of 8% of the species provided up to 55% estimates above 10% of the general estimated value (Appendix S6 in supplementary).

Fig. 4

Coefficient of variation (CV) of mandible wear per worker ages; CV is used here as an indication of specialization in the colonies. Boxplots depict median, 25–75 percent quartiles, and 1.5 interquartile range for the 13 species of neotropical social wasps studied. Different letters by each age column indicate differences at 0.05 according to the phyloANOVA pairwise comparisons considering species as repeated measurements

Fig. 5

Plot of PGLS analysis between the coefficient of variation (CV) of mandible wear and colony size. The average of these variables was used for species with more than one nest. Menezes et al. (2020) phylogeny was used in this analysis. M = Mischocyttarus sp., Pe = Polistes erythrocephalus, Ap = Agelaia pallipes, Pa = Parachartergus apicalis, Pf = Parachartergus fraternus, Ss = Synoeca septentrionalis, Ma = Metapolybia aztecoides, Pre = Protopolybia exigua, Cf = Charterginus fulvus, Po = Polybia occidentalis, Pv = Polybia velutina, Pg = Polybia gorytoides, and Poe = Polybia emaciata

Discussion

Wear imposes strong selection pressure on mandibles in insects, and individuals might exhibit strategies to deal with it as they age (Schofield et al. 2011, 2016; Giraldo and Traniello 2014); several studies in other insects, such as crickets (Chapman 1957; Köhler et al. 2000; Kuřavová et al. 2014) and ants (Schofield et al. 2011; Giraldo and Traniello 2014; Garrett et al. 2016), concur with this pattern. In fact, the theoretical expectations of age polyethism predict that risky or demanding tasks are delayed to later ages, increasing the individual’s output (O’Donnell and Jeanne 1992a, 1995; Toth et al. 2016). In our case, however, despite a generally positive trend between mandible wear and age, there were several colonies where neither a statistical relationship nor a clear differentiation between age groups was detected, and the comparative analysis did not support this tendency. In addition, a frequent observation was the presence of workers with strong differences in mandible wear within the same age group. We believe that focusing on a single activity at specific ages, as may be expected from a strict interpretation of the model of age polyethism, implies that individuals handling nest material could quickly become impaired due to strong mandible damage. Such a pattern would significantly decrease the net contribution of the worker. Alternatively, if these demanding functions are dispersed throughout different ages, workers will be more productive for the colony in the long run. This strategy may explain why we found no differences in either mandible wear or mandible wear variation between ages. Studies in ants reveal that workers with worn mandibles shift their behavior towards less abrasive tasks, extending their contribution to the colony (Schofield et al. 2011; Giraldo and Traniello 2014), which may support our proposal.

Both empirical and theoretical studies suggest that colony size plays an important role in favoring the appearance of higher degrees of task specialization (Jeanne 1986a, b; Pacala et al. 1996; Gautrais et al. 2002; Jeanson et al. 2007; Holbrook et al. 2011). In the social wasp Polybia occidentalis, the larger the colony, the higher the task partitioning around nest-building activities (Jeanne 1986a). We did not detect a relationship between colony size and the coefficient of variation in mandible wear, and this discrepancy may result from two non-exclusive factors. First, while Jeanne sampled colonies at building bouts and traced individuals displaying building activities, we sampled entire colonies and analyzed a sample taken from the entire nest population, which carries the history of multiple stages of that colony. Thus, a single nest may include cohorts of individuals that have been through very different moments of the colony regarding nest construction. This may blur the specialization phenomena detected when studying single construction events. Second, factors other than colony size may distort the general trend towards task specialization and demand flexibility from the colony, such as the mechanism of task assignment, the task performed, the ecological context, the worker’s individual differences, colony stage, and unpredictable environmental events like predatory attacks (Hölldobler and Wilson 1990; Gordon et al. 2005; Dornhaus et al. 2012; Du et al. 2017). For example, in the social wasp Polybia occidentalis, the age at which workers begin to forage varies from five to 40 days (O’Donnell and Jeanne 1992b). Studies in vespines show that workers may be specialized for a single activity throughout their life, specialized only for short periods, or not specialized at all (Santoro et al. 2019), and there are studies in ants at both colony (Dornhaus et al. 2009) and interspecific levels (Fjerdingstad and Crozier 2006) where no relationship was detected between colony size and specialization. This flexibility will improve the evolutionary advantages of age polyethism (Seeley 1982; O’Donnell and Jeanne 1992b; Naug and Gadagkar 1998; Tofilski 2002), and may explain the lack of relationship between colony size and mandible wear variation, and the lack of differences in mandible wear variation between age groups in our data.

If our findings are upheld by further studies, we suggest that the strong pressure on mandibles due to wear may be resolved in social wasps through two non-exclusive strategies that may increase the individual’s contribution to the productivity of the colony, extending its life expectancy and general performance. The first is a general trend towards the known temporal distribution of tasks according to their effect on mandible performance, with workers initially performing less expensive tasks and later the more costly chores, such as nest material foraging (Jeanne 1991a; Kim et al. 2012). The second is the distribution of tasks according to their attrition costs, with the more demanding labors performed over longer and less intense periods, while inexpensive activities may be temporally clustered. Our data suggest that mandible-abrasive tasks are more dispersed throughout the individual’s life, although circumstantial colony demands can alter these patterns.