Introduction

Allometry has long been a major focus of evolutionary biologists both because body size is arguably the most important organismal trait and because allometry is the simplest case of three sometimes contrasting kinds of constraints: functional, developmental, and genetic. Allometry exemplifies a functional constraint in that proportions must change as size does to maintain functional equivalence as size increases (e.g., Gould, 1966; McMahon, 1975; von Bertalanffy, 1964). Allometry can thus be viewed as an adaptation to functional constraints imposed by physical laws. Allometry also exemplifies a developmental constraint in that growth allometry may bias the probability of evolutionary change along a conserved developmental trajectory (Jablonski, 2020). Allometry, specifically brain:body allometry, is also the paradigm for constraints due to genetic correlations (Lande, 1979). Moreover, allometry is proposed to be the primary determinant of morphological diversity because otherwise unrelated traits are correlated with body size (Pyron & Burbrink, 2009). Allometry has even been viewed as a limit on the range of forms that evolve (Bright et al., 2019). Furthermore, as well as being a constraint, allometry itself may be constrained in that traits subject to a common growth control may be limited in their evolvability (Huxley, 1924; Rensch, 1948; Savageau, 1979; Voje et al., 2014).

For all that allometry exemplifies the simplest of constraints, and can be viewed as the simplest case of morphological integration, allometry itself is not biologically simple (Hallgrimsson et al., 2019). Growth allometry is not simple because it arises from multiple, partially overlapping processes that are related through their common influences on size; thus, growth allometry depends on the relative contributions of those various global and local processes (Hallgrimsson et al., 2019; Larson et al., 2018). Allometry within a single age class (static allometry) is even more complex because that is determined not only by growth allometry, but also by the covariance between the allometric slope and body size (Pelabon et al., 2013). Evolutionary allometry should be even more complex because that depends not only on the developmental origins of growth allometry and the determinants of allometric variation within populations, but also on the evolutionary dynamics of size and shape. It should also depend on the ecological opportunity for size-dependent ecomorphological specialization. Lineages that converge on their size-dependent specializations might exhibit similar evolutionary allometries but otherwise, evolutionary allometries might be so heterogeneous that the net effect of their variously oriented trends would be little predictable change along any one trend.

Countering those expectations for complex evolutionary allometries, brain:body size allometry is highly predictable, as are its facial correlates. Even this one is not invariant (Smaers et al., 2021): six grades of mammalian brain:body allometry have been distinguished, differing in slope as well as intercepts within grades. Nonetheless, there are highly predictable craniofacial scaling relationships: small animals not only have relatively large, globose brains but also relatively large eyes, a pattern known as Haller’s rule (Rensch, 1948) and large animals have relatively flatter crania and longer faces, a pattern once termed Progressive Preoptic Preponderance (Robb, 1935a) and, more recently, the rule of Craniofacial Evolutionary Allometry, CREA (Cardini et al., 2015). One explanation for that scaling relationship is specific to this particular case: the positively allometric face is a logical corollary of the negatively allometric brain (Gould, 1975a; Hemmer, 1966; Weidenreich, 1941), which assumes that the skull scales near-geometrically with body size. If the skull scales near-geometrically and the brain is negatively allometric, the face must be positively allometric. Radinsky (1984) proposed that facial allometry buffers skull size against the negatively allometric brain, but that appears to be costly for both very small and very large animals: small animals, like hummingbirds, must house relatively huge brains in their tiny skulls (Ocampo et al., 2018) and large animals have relatively long faces, and thus less leverage for jaw and neck muscles (e.g., Mitchell et al., 2018; Radinsky, 1981; Slater & Van Valkenburgh, 2009; Wroe & Milne, 2007). Empirical studies do support the hypothesis that the skull scales near-geometrically with body size in mammals (Gould, 1975b; Radinsky, 1984; van der Geer et al., 2018; Wayne, 1986), but they do not explain why that should be the case.

Other explanations for craniofacial scaling of mammals are more general in both their morphological scope and mechanisms. One proposed mechanism is evolutionary bias along the line of least evolutionary resistance (LLER), the major axis of the genetic covariance matrix (Marroig & Cheverud, 2005; Schluter, 1996). The LLER is not necessarily an axis aligned with size (Boell, 2013); static allometry may account for merely 2–7% of the variance in shape (e.g., Bergmann et al., 2021; Bruner et al., 2010; Zelditch et al., 2009). Even when it is aligned with size, it may not be allometric; the LLER may more often be an axis of geometric scaling (Bjorklund, 1999). If so, size and shape should be independently evolvable. Evolutionary allometry along a conserved LLER has sometimes been invoked to explain craniofacial evolution (Bergmann et al., 2021; Boell, 2013; Cardini & Elton, 2008; Marcy et al., 2020; Marroig & Cheverud, 2005; Singleton, 2012) but more often, the hypothesis is ontogenetic scaling, a form of developmental bias in which a conserved ancestral growth allometry is extended or truncated to a larger or smaller adult body size (e.g., Cardini & Polly, 2013; Cardini et al., 2015; Giles, 1957; Krone et al., 2019; Lumer, 1940; Pilbeam & Gould, 1974; Robb, 1935b; Shea, 1983; Wayne, 1986). For example, Pilbeam and Gould (1974) concluded that “The nearly universal trend both of primate ontogeny and of static series of closely related adults is toward negative allometry of the brain and positive allometry of the face” and Robb (1935b) similarly concluded that “with respect to facial preponderance, ontogeny in the horse both repeats and outruns phylogeny, in that living varieties are in some instances larger than any predecessor.” That parallelism might be expected, even in the absence of any developmental bias, if growing animals are subject to the same size-related functional constraints as adults. 
For example, we might expect parallel trends in the relative thickening of limb bones that enlarge weight-supporting cross-sections proportional to body volume (Gould, 1966). But the striking parallelism between growth and evolution of the negatively allometric brain (and positively allometric face) could also be mere coincidence, resulting from two otherwise unrelated brain-driven trends. Coefficients of growth allometry are determined by the body size at which structures begin to grow and the body size at which they must be full grown (Gould, 1966; Kramer, 1959; von Bertalanffy & Pirozynski, 1952); hence, precocial organs like the brain that are relatively large at birth, grow with negative allometry as the face and body catch up.

The primary question addressed in this study is whether the brain:body size and braincase:face evolutionary allometries are representative of size-related changes in form. These allometries might be representative if their explanations are general rather than specific to them, e.g., reflecting a functional relationship between structures rather than the timing of functional maturation of one particular organ. If those allometries are representative, we would expect that other allometric relationships would exhibit the same general patterns, especially a high degree of conservatism, indicated by parallelism between evolutionary allometries of related lineages, as well as parallelism between growth and evolutionary allometry. In the absence of intrinsic constraints on allometry, such as a developmental bias, we would instead expect that evolutionary allometries are not likely to parallel each other unless lineages resemble each other, converging on their size-dependent ecological specialization, especially at the extremes of the size range.

In this study, we analyze mandibular evolutionary allometry of squirrels (Sciuridae). On functional first principles, we might expect that the jaw would scale geometrically rather than allometrically because the functionally relevant parameters are linear dimensions whose functional equivalence is maintained by geometric scaling. However, there is more to jaw shape than those linear dimensions; their geometry also matters (Herring, 2007; Herring et al., 2001; Throckmorton et al., 1980). Even when moment arms scale geometrically, jaws may differ in the orientation of those lever arms, and therefore of the structures on which muscles originate or insert (Swiderski & Zelditch, 2010). As evident from Fig. 1, squirrel jaws are not geometrically scaled variants of each other. Allometry, in this case, would result from predictable size-related deviations from functional equivalence. Absent developmental bias, we would not expect growth allometry to resemble evolutionary allometries and absent constraints on the evolvability of allometric coefficients, we would not expect evolutionary allometries to resemble each other unless lineages converge on size-dependent ecological specialization. Several lineages of squirrels converge on a distinctive ecological specialization of miniatures, bark-gouging, and their mandibles converge, as do mandibular shapes of some durophagous giants (Zelditch et al., 2017). If allometric evolution is constrained by jaw function, these lineages (Protoxerini, Nannosciurinae, Sciurini and the more informal grouping of Ratufa and Sciurillus) are the ones most likely to exhibit similar evolutionary allometries, presuming that miniatures and giants lie along the allometric trend. To determine if that is the case, and given the potential complexities that could be due to the evolutionary dynamics of size and shape, we also dissect evolutionary allometry within two species-rich lineages, Nannosciurinae and Marmotini.

Fig. 1 Photographs of mandibles of selected species. Those within the same row are from the same lineage (from top to bottom row): Marmotini, Nannosciurinae, Protoxerini, Pteromyini, Sciurini. The miniatures are on the left; giants are on the right.

Materials and Methods

Samples of Adults

Our sample of adults comprises 1712 individuals from 191 species that are represented in the most recent and comprehensive phylogeny of Sciuridae (Menendez et al., 2021). The data for adults are the means of species, estimated from sample sizes that range from 1 to 31 (mean N = 8.9, median N = 10). To compare evolutionary allometries between lineages, we divided the sample by tribe because tribes are the smallest lineages that contain enough species for statistically meaningful comparisons. The subfamily Sciurinae contains two tribes of arboreal species, the largely Holarctic tree squirrels (Sciurini) and the largely Paleotropical flying squirrels (Pteromyini). The subfamily Xerinae contains three tribes, the African tree squirrels (Protoxerini), and two ground squirrel lineages (Xerini and Marmotini). Xerini is one of the tribes that is exceptionally species poor; there are only six extant species in the tribe, so even though we have sampled all of them, there are too few for meaningful statistical comparisons. The Asian tree squirrel subfamily Nannosciurinae contains three tribes, but two of the three (Funambulini, Exilisciurini) contain a single genus, and one of those (Exilisciurus) contains only two species. We therefore did not divide Nannosciurinae by tribe. We also did not divide the remaining group, comprising Sciurillus pusillus and Ratufa, which contains only five species (of which only three are in both our data and the phylogeny). We include this pair of genera (Ratufa/Sciurillus) as well as Xerini in the analysis because they contain miniatures and giants of Sciuridae and thus contribute important information about evolutionary allometry of Sciuridae.

Samples of Ontogenetic Series

The ontogenetic sample for this analysis is the one analyzed previously (Zelditch et al., 2016), based on 380 individuals of eight species, from two suborders and three families of rodents. Two are myomorphs, one within Muridae, Mus musculus domesticus, and the other in Cricetidae, the cotton rat, Sigmodon fulviventer, both from laboratory populations. The six ontogenetic series of squirrels include one sciurine (Sciurus carolinensis), one protoxerine (Heliosciurus rufobrachium) and four marmotines (Ammospermophilus leucurus, Callospermophilus lateralis, Otospermophilus beecheyi and O. variegatus). With the exception of two infant O. beecheyi, the samples come from natural populations. Because we found that the ontogenetic trajectories that have the youngest infants (M. musculus and O. beecheyi) are nonlinear early in development, we removed the mice younger than seven days old, and the four smallest O. beecheyi, leaving a sample of 365 specimens (all 380 individuals are in the Dryad file).

Morphometric Data

Mandibles were photographed in lateral view; those of the youngest individuals in the ontogenetic series obtained from museum collections were photographed and measured twice on both sides, which were averaged to minimize measurement error given the difficulty of both photographing and measuring them. Fourteen landmarks were originally digitized on the sample of adults, but some of those could not be located on the younger specimens in the ontogenetic series, so the landmark data for the adults were reduced to ten that could also be located on young prior to complete tooth eruption (Fig. 2). These ten include the mental foramen, openings of the incisor alveoli, tips of the mandibular processes, and sites where those processes join the ramus (or sheets of bone between processes). These few landmarks provide little information about curvature or where jaw deepening occurs, so we also sampled semilandmarks along the curves of the posterior processes, the diastema between the most anterior premolar and the incisor, and the ventral margin of the horizontal ramus (corpus). The original ontogenetic sample had one additional semilandmark so this too was removed to reconcile the two datasets. As well as analyzing these landmarks plus semilandmarks, we also analyzed samples containing only the ten landmarks to ensure that the differences we detect are not exaggerated by differences in the subtle details of curvature. Landmarks and semilandmarks were digitized in tpsDig2 (Rohlf, 2017). The complete set of landmarks for adults is in the Dryad file; the four removed for the present analysis include one at the anterior margin of the masseteric fossa, two at molar landmarks, and one at the posterior end of the condyle (the four in the Dryad file of all adult specimens that were removed are landmarks 4, 5, 6, and 9; the semilandmark in the Dryad file of the ontogenetic sample that was removed was 36).

Fig. 2 Landmarks (black) and semilandmarks (white) shown on the mandible of Sciurus oculatus.

Landmarks for each dataset were first superimposed by Generalized Procrustes analysis (GPA); semilandmarks were slid to minimize bending energy (Bookstein, 1997; Green, 1996; Zelditch et al., 2012) then projected onto the tangent space. Superimposition was done using the gpagen function in geomorph, ver. 4.0.3 (Adams et al., 2020; Baken et al., 2021), run in R (ver. 4.1.2) (R Core Team, 2021). Size is measured by centroid size (CS), the square root of summed squared distances of each landmark to the centroid. Following superimposition, we averaged the shapes and CS for each individual that was measured twice on both sides and also calculated the average shape of the adults for each species. The two data sets were then combined and superimposed again to place them within the same morphometric space. After removing the semilandmarks, the file containing both datasets (juveniles and adults) was superimposed again.
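The definition of centroid size given above can be illustrated with a minimal sketch. It is not the gpagen implementation; the function name centroid_size and the four-point configuration are hypothetical, chosen only so that the arithmetic is easy to check by hand.

```python
import math

def centroid_size(landmarks):
    """Square root of the summed squared distances of each landmark to the centroid."""
    n = len(landmarks)
    dims = len(landmarks[0])
    # Centroid: the mean of the landmark coordinates along each dimension.
    centroid = [sum(p[k] for p in landmarks) / n for k in range(dims)]
    # Sum of squared distances from every landmark to the centroid.
    ss = sum((p[k] - centroid[k]) ** 2 for p in landmarks for k in range(dims))
    return math.sqrt(ss)

# Hypothetical configuration (not real mandible data): corners of a 2 x 2 square.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
cs = centroid_size(square)   # each corner is sqrt(2) from the centroid, so CS = sqrt(8)
lcs = math.log(cs)           # ln-transformed CS (LCS), the size covariate in the allometry models
```

Because CS is a linear measure, doubling every coordinate doubles CS, which is why ln-transformed CS is convenient as a size covariate.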

Analyzing Size and Shape Disparity

Because tribes might differ in the strength of the allometric signal owing to differences in their disparities of size, shape and/or ecology, we examine disparities of size and shape prior to interpreting differences in evolutionary allometry. The allometric signal might be stronger, as well as yield more precise estimates, when the range of size is wide. But if the lineage is ecologically uniform, the allometric signal could still be weak because the mandibles are expected to scale geometrically to maintain functional equivalence. To compare disparities of shape and size across tribes, we fit a model predicting that shape (or size) depends on Tribe, then used the pairwise function in RRPP (Collyer & Adams, 2018) to compare the variance within each group (test = “var”). The test statistic is the absolute difference between the variances, which is assessed for its statistical significance by permutation of the residuals of the reduced model (Adams and Collyer, 2009; Anderson and ter Braak, 2003; Collyer et al., 2015). Pairwise comparisons between allometries were done using the pairwise function in RRPP ver. 1.2.3 (Collyer & Adams, 2018, 2019).

To depict the disparities of the tribes, we show the distribution of size in each tribe as a violin plot; the disparity of shape is more difficult to depict because of its multidimensionality, so we use the partial disparities of each species, which is its squared distance from the mean shape of its tribe, divided by the sample size. High values for this indicate that a species makes a relatively large contribution to its tribe’s disparity. Partial disparities sum to the total disparity so the disparity contributed by selected species can be summed to yield the contribution that those species make to the total. The partial disparities contributed by each individual species can also be depicted by violin plot.
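As a rough illustration of the partial disparities described above, the following sketch computes each species' squared distance from the group mean shape, divided by the number of species, as the text states. The function name partial_disparities and the three two-coordinate "shapes" are hypothetical; real shape data would be Procrustes-aligned coordinates with many more variables.

```python
def partial_disparities(shapes):
    """Partial disparity per species: squared distance to the group mean shape / sample size.

    shapes: list of equal-length coordinate vectors, one per species.
    The divisor (number of species) follows the description in the text.
    """
    n = len(shapes)
    k = len(shapes[0])
    mean = [sum(s[j] for s in shapes) / n for j in range(k)]
    return [sum((s[j] - mean[j]) ** 2 for j in range(k)) / n for s in shapes]

# Hypothetical toy data: three "species", two shape variables each.
shapes = [[0.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
pd = partial_disparities(shapes)
total = sum(pd)                 # partial disparities sum to the total disparity
share_of_third = pd[2] / total  # proportional contribution of the third species
```

In this toy example the third species lies farthest from the mean and contributes half of the total, the kind of disproportionate contribution that the violin plots are meant to reveal.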

Analyzing and Comparing Allometry

Estimating and comparing species’ growth allometries to each other is straightforward, and estimating and comparing evolutionary allometries to each other is only slightly more complex because that requires an evolutionary model for the error covariance matrix. However, comparing growth to evolutionary allometries is not straightforward. These are comparisons between vectors of allometric coefficients, one estimated within a species (by ordinary least squares, OLS), the other estimated within a lineage (by phylogenetic generalized least squares, PGLS). We first present the methods for analyzing growth allometries, then evolutionary allometries and then a procedure for comparing the allometric vectors from the two models to each other.

Growth Allometry

To estimate coefficients of growth allometry, and test for species-specific growth allometries, we used Analysis of Covariance, fitting a model with two main effects, the factor “Species” and the covariate “Size”, measured as ln-transformed CS (LCS), and the interaction between them. A significant interaction indicates that slopes are not all equal, i.e., we can reject the hypothesis of a common slope for all species. The statistical significance of the interaction term is determined by a permutation test of the residuals of the reduced model, i.e., the model lacking that interaction term (Adams and Collyer, 2009; Anderson and ter Braak, 2003; Collyer et al., 2015). Differences in allometries are quantified by the angle between them, calculated as described above, and the statistical significance of the angle is determined by comparing it to the distribution of the random values (Adams and Collyer, 2009; Anderson and ter Braak, 2003; Collyer et al., 2015). The Ancova was done using the procD.lm function in geomorph; pairwise comparisons between allometries were done using the pairwise function in RRPP ver. 1.2.3 (Collyer & Adams, 2018, 2019). These analyses were done using datasets comprising landmarks plus semilandmarks, and for the datasets comprising only landmarks.
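The angle between two vectors of allometric coefficients is the arccosine of their normalized inner product. The sketch below shows only that calculation, not the permutation test in RRPP; the function name angle_degrees and the example vectors are hypothetical.

```python
import math

def angle_degrees(u, v):
    """Angle (in degrees) between two coefficient vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))

# Hypothetical slope vectors (one coefficient per shape variable).
parallel = angle_degrees([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # same direction, different length
orthogonal = angle_degrees([1.0, 0.0], [0.0, 1.0])          # perpendicular directions
```

Because the vectors are normalized, the angle reflects only the direction of shape change with size; two allometries that differ only in the rate of shape change per unit size have an angle of zero.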

The null hypothesis of these analyses is that the vectors of slope coefficients do not differ, which is equivalent to the biological hypothesis of parallelism between vectors. An alternative hypothesis, also of interest, is that the vectors are no more similar than expected by chance. The expectation depends on the dimensionality of the data; this test compares the area of the cap on a hypersphere, which is the portion of the surface of the hypersphere within a certain angle from a fixed vector, to the area of the entire hypersphere (Li, 2011; Watanabe, 2022). Assuming a uniform distribution of the vectors, the probability that a random vector lies within the observed angle (or less) of the fixed vector is obtained by dividing the area of the cap by the area of the hypersphere. Unlike the test of parallelism, this does not take the uncertainty of the estimates of the vectors into account. The functions available for this test presume that the data comprise only landmarks; although the functions could be rewritten for the dimensionality of data containing semilandmarks, the distributional assumption is even more unrealistic for semilandmarks. We therefore limit this analysis to the landmark-only data. This test was done using the angle.test function in the R package Morpho (Schlager, 2017), ver. 2.9; the wrapper TestOfAngle in GeometricMorphometricsMix (Fruciano, 2019) provides the angle in degrees. For our data, the critical angle for the ratio between the area of the cap and area of the hypersphere is 68.36°.
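The ratio of cap area to hypersphere area can be sketched numerically. The code below is not the angle.test implementation in Morpho; it integrates sin(t)^(d−2), which is proportional to the density of the angle between a fixed vector and a uniformly distributed random vector in d dimensions. The function names are hypothetical, and the choice of d = 20 (ten two-dimensional landmarks) with a 5% threshold is our reading of the critical angle, an assumption rather than something stated in the text.

```python
import math

def _integral_sin_power(upper, power, steps=2000):
    """Composite Simpson's rule for the integral of sin(t)**power over [0, upper]."""
    h = upper / steps
    total = math.sin(0.0) ** power + math.sin(upper) ** power
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * math.sin(i * h) ** power
    return total * h / 3.0

def cap_probability(angle_deg, dim):
    """Probability that a uniformly random direction in `dim` dimensions
    lies within `angle_deg` of a fixed vector (cap area / hypersphere area)."""
    theta = math.radians(angle_deg)
    return _integral_sin_power(theta, dim - 2) / _integral_sin_power(math.pi, dim - 2)

# Assumption: 10 landmarks x 2 coordinates = 20 dimensions.
p_critical = cap_probability(68.36, 20)  # close to 0.05 under this assumption
p_half = cap_probability(90.0, 20)       # half of the sphere lies within 90 degrees
```

The calculation shows why dimensionality matters: in high-dimensional spaces random vectors are nearly orthogonal, so an angle that would be unremarkable in two or three dimensions can indicate much greater similarity than expected by chance in twenty.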

Evolutionary Allometry

To estimate coefficients of evolutionary allometry, and to test for tribe-specific evolutionary allometries, we use Phylogenetic Analysis of Covariance (PGLS). The model has two main effects, the factor “Tribe” and the covariate “Size”, again measured as ln-transformed CS (LCS), and the interaction between them. A significant interaction indicates that slopes are not all equal, i.e., we can reject the hypothesis of a common slope for all tribes. This model is unconventional because tribes are clades, hence this factor is entirely confounded with the phylogeny. Nevertheless, that model yields estimates of tribe-specific evolutionary allometries. Given that confounding, we used Type II Sums of Squares for statistical tests of the terms in the model (Adams & Collyer, 2018b). In this analysis, like in the analysis of growth allometry, tests of statistical significance were done by permuting the residuals of the reduced model, and the statistical significance of the angle again is determined by comparing it to the distribution of the random values. The phylogenetic Ancova was fit using the procD.pgls function in geomorph and pairwise comparisons between evolutionary allometries were done using the pairwise function in RRPP. Like the analyses of growth allometry, these analyses were done using datasets comprising landmarks plus semilandmarks, and for the dataset comprising only landmarks. Again, as well as testing the null hypothesis that the angles differ by no more than expected by chance, we tested the hypothesis that they are no more similar than expected by chance.

Like any other analysis using modern phylogenetic comparative methods, this one relies on an evolutionary model to adjust the covariance matrix of the error term for the non-independence of species. The model typically used for that purpose is Brownian Motion but when that model does not fit the data, phylogenetic regression can have poor statistical performance, yielding inaccurate estimates of the intercept and slopes even if the Type 1 error rate is not inflated (Revell, 2010). This is a serious concern when our major objective is to estimate allometric slopes. An alternative to assuming that the model does fit is to transform the covariance matrix, rescaling branch lengths so that they result in the expected constant variances on the transformed scale (Pagel, 1997, 1999). Ideally, the phylogenetic signal of the residuals would be estimated simultaneously with the regression parameters, a method implemented for univariate data (Revell, 2010). That is not yet feasible for high dimensional shape data because, like any likelihood-based method, model-misspecification rates can increase with the number of variables and when the number of variables exceeds the number of species (as it does in our case) the evolutionary covariance matrix is singular (Adams & Collyer, 2018a). If the residuals exhibit high phylogenetic signal, close to the expected value of 1.0 for Blomberg’s K (Blomberg et al., 2003), using the multivariate extension of that metric, Kmult (Adams, 2014), the possibility of poorly estimated parameter values would be of less concern. However, for the residuals of the PGLS in this analysis, Kmult = 0.107. We therefore used a methodological compromise, first reducing the OLS residuals by a Principal Components Analysis (PCA) to the first 12 or 16 PCs, which account for 92.9% and 96% of the variance, respectively, then estimating Pagel’s λ for both multivariate datasets, using the transformPhylo.ML function in the R package motmot ver. 2.1.3 (Thomas & Freckleton, 2012) and finally transforming the phylogeny by the value of λ using the rescale function in the R package geiger (Pennell et al., 2014), ver. 2.07. The vectors of allometric coefficients obtained from the PGLS models using the different sets of adjusted branch lengths, with λ = 0.704 or λ = 0.609, differ by less than 2°. When analyzing the data, we used the value of 0.704. This procedure was done a second time when analyzing the landmark-only data, using λ = 0.77.
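The effect of Pagel’s λ on the error covariance matrix can be shown with a small sketch. Under the standard definition, λ multiplies the off-diagonal elements of the phylogenetic covariance matrix (shared branch lengths) and leaves the diagonal (root-to-tip depths) unchanged. This is an illustration of the transformation only, not of the motmot or geiger functions; the function name lambda_transform and the three-species tree are hypothetical.

```python
def lambda_transform(cov, lam):
    """Scale off-diagonal (shared-history) covariances by lambda; keep the diagonal."""
    n = len(cov)
    return [[cov[i][j] if i == j else lam * cov[i][j] for j in range(n)]
            for i in range(n)]

# Hypothetical ultrametric tree ((A, B), C) of depth 1.0 in which A and B
# share 0.6 of their history; entries are expected covariances under Brownian motion.
C = [[1.0, 0.6, 0.0],
     [0.6, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

C_lam = lambda_transform(C, 0.704)   # lambda value used for the landmark plus semilandmark data
C_star = lambda_transform(C, 0.0)    # lambda = 0: species treated as independent (OLS)
C_bm = lambda_transform(C, 1.0)      # lambda = 1: unmodified Brownian motion
```

Values of λ between 0 and 1 therefore interpolate between OLS and Brownian-motion PGLS, shrinking the covariance expected between relatives without altering the variance expected for each species.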

This procedure cannot be applied to lineages with too few species to estimate the values of lambda. We therefore analyze allometry within only two tribes, Marmotini (N = 79) and Nannosciurinae (N = 43).

Comparing Growth to Evolutionary Allometry

In these analyses, we did pairwise comparisons between growth and evolutionary allometric vectors. These are pairwise comparisons between vectors of allometric coefficients, the growth allometries estimated by OLS, and evolutionary allometries estimated by PGLS. One approach to comparing vectors is to produce those vectors by predicting the shapes at the smallest and largest sizes, then adding the residuals of the models to those endpoints, then conducting a trajectory analysis to compare the directions and rates of change along those trajectories (as in Collyer & Adams, 2007, 2013). That procedure yielded unrealistically low values for upper limits of the confidence intervals so instead we extracted the fitted values from the OLS and PGLS models, combined them, and fit another Ancova, using the procD.lm function in geomorph, regressing the fitted values on size, and then did the pairwise comparisons between vectors (again, using the pairwise function in RRPP). We restricted these comparisons to tribes that have enough species for meaningful statistical comparisons, omitting comparisons to the species-poor Xerini and Ratufa/Sciurillus.

Treatment of Multiple Comparisons

In each analysis, we conduct a large number of pairwise comparisons and do not statistically correct for multiple comparisons. That is because the null hypothesis of the test is equivalent to the biological hypothesis of parallelism, hence a conservative approach to Type I error favors the biological hypothesis. The biological hypothesis might therefore seem to be supported only because the model parameters are estimated with such high uncertainty that divergent trajectories cannot be statistically distinguished from parallel lines. Being unable to reject the null when the observed value for the angle is large should not be taken as evidence for the biological hypothesis. Support for the hypothesis comes from small (and well-estimated) angles.

Depicting Comparative Analyses of Allometry

Diagrams for comparative analyses of allometry resemble the traditional scheme for depicting heterochrony (Alberch et al., 1979) because they contain only one shape axis plus a size axis. These plots are effective at displaying the differences in rates of change along that shape axis relative to size. One diagram, depicting regression scores, has a shape axis that is a normalized vector of allometric coefficients; the scores for individual shapes are obtained by projecting the data onto that vector (Drake & Klingenberg, 2008). Another uses PC1 of the predicted values for shape from each sample’s regression model as the shape axis (Adams & Nistri, 2010). The predicted shape for each individual is obtained from its group-specific regression model, then the predicted shapes for all samples are combined and subjected to a PCA. Scores on PC1 are then plotted against size. In both diagrams, the orientation of the shape axis is arbitrary so scores can either increase or decrease as size increases. What cannot be seen in either plot are the differences among groups in their shape axes. That disparity of allometries is better visualized by plotting the allometric axes in the plane of the first two PCs, paired with a plot of the scores relative to the size axis. Plots of the fitted values were done using the plotAllometry function in geomorph (with method = “PredLine”). To depict the shape change as a deformed grid, we show the change from the smallest to largest predicted shape, using the plotRefToTarget function in geomorph; vectors of relative landmark displacements were then added to the plot.
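The regression score described above can be sketched as a projection onto the normalized vector of allometric coefficients. The sketch is not the plotAllometry implementation in geomorph; the function name regression_scores, the toy data, and the decision to center the shapes on their mean before projecting are assumptions made for illustration.

```python
import math

def regression_scores(shapes, coefficients):
    """Project mean-centered shapes onto the unit-length allometric vector."""
    norm = math.sqrt(sum(b * b for b in coefficients))
    unit = [b / norm for b in coefficients]   # normalized vector of allometric coefficients
    n = len(shapes)
    k = len(shapes[0])
    mean = [sum(s[j] for s in shapes) / n for j in range(k)]
    # Centering (an assumption here) shifts all scores by a constant only.
    return [sum((s[j] - mean[j]) * unit[j] for j in range(k)) for s in shapes]

# Hypothetical toy data: three specimens lying exactly along the direction (3, 4).
shapes = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
scores = regression_scores(shapes, [3.0, 4.0])
```

Reversing the sign of the coefficient vector reverses the sign of every score, which is the arbitrariness of orientation noted above; plotting the scores against LCS then gives the one-dimensional summary of shape change with size.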

Results

Size and Shape Disparity of Tribes

In jaw size (LCS), the disparity of Sciuridae is 0.1232. Within the four tribes with enough species for meaningful statistical comparisons, size disparity ranges from 0.0474 (Sciurini) to 0.16289 (Marmotini) (Table 1A). Marmotini is statistically significantly more disparate than both Nannosciurinae and Sciurini (Table 1B). Although Nannosciurinae has the smallest miniatures aside from the mouse-sized protoxerine, its giants are not exceptionally large (Fig. 3A). Sciurines are relatively large, on average, but this tribe lacks notable extremes at both ends of the size range, as does Pteromyini.

Table 1 Comparing disparities of size and shape

Fig. 3 Distribution of size and the partial Procrustes distances of shape

In jaw shape, the disparity of Sciuridae is 0.007853. Within tribes, shape disparity ranges from 0.00171 (Sciurini) to 0.00558 (Protoxerini) (Table 1A). With semilandmarks included in the analysis, Marmotini, Nannosciurinae, and Protoxerini are statistically significantly more disparate than Sciurini (Table 1B); without semilandmarks, Marmotini and Protoxerini are statistically significantly more disparate than Sciurini, but Nannosciurinae is not (P = 0.064). Despite its relatively high disparity, Marmotini lacks extreme morphologies (Fig. 3B); miniatures (chipmunks, Tamias), giants (Marmota) and some species of intermediate sizes contribute evenly to disparity. In contrast, the relatively high disparities of Protoxerini and Nannosciurinae are largely due to their specialized miniatures, which make the largest contributions to the disparity of those groups, although the termite-eating nannosciurine, Rhinosciurus laticaudatus, contributes 24% of the total disparity of that tribe, equal to the contribution of the four miniatures (Exilisciurus concinnus, E. exilis, Nannosciurus melanotis, Glyphotes simus). In the case of Pteromyini, the disparity is more evenly distributed; species, on average, make equal contributions to disparity. In Sciurini, the specialized miniatures are less extreme than those of Protoxerini and Nannosciurinae, but they, and the one giant, contribute disproportionately to that group’s relatively low shape disparity.

Growth Allometry

Growth allometries of squirrels are typically similar even when statistically significantly different. In the analyses of landmarks plus semilandmarks, no pair of squirrels’ growth allometries differs by more than 30° (Table 2). The squirrels’ allometries are also at least moderately similar to those of the myomorphs; no squirrel’s growth allometry differs from either myomorph’s by more than 42°. Growth allometries of the distantly related myomorphs are also moderately similar, differing by 48.2°. Removing the semilandmarks has little effect on these angles. No pair of squirrels’ growth allometries differs by more than 34°, and their growth allometries are again at least moderately similar to those of the myomorphs, differing by no more than 44.2°; the myomorphs are again also moderately similar, differing by 45.5°.

Table 2 Pairwise comparison of growth allometries
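As a minimal sketch of how an angle between two allometric vectors can be obtained, the arccosine of the normalized dot product of the two coefficient vectors gives the angle in degrees. The function name and the toy vectors are hypothetical; the sign of the cosine is retained so that angles greater than 90° remain possible.

```python
import numpy as np

def angle_between(u, v):
    """Angle in degrees between two vectors of allometric coefficients."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cosine = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip guards against rounding slightly outside [-1, 1]
    return float(np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0))))

# Hypothetical toy vectors
similar = angle_between([1.0, 0.0, 0.0], [1.0, 1.0, 0.0])
orthogonal = angle_between([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

An angle of 0° means identical directions of shape change with size, and 90° means the two trends share no direction of change.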

Evolutionary Allometry of Sciuridae

Size (LCS) accounts for 6.6% of the disparity of mandible shape, although the magnitude of the effect, as measured by the Z-score, is relatively large (Table 3A). As expected, there is a highly significant interaction between Tribe and Size (LCS). Removing the semilandmarks has little effect on these results; size accounts for 6.5% of the disparity of mandible shape and there is a significant interaction between Tribe and Size (Table 3B).

Table 3 Evolutionary allometry of squirrel mandible
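The proportion of shape disparity explained by size can be sketched as the ratio of the model sum of squares to the total sum of squares in a multivariate regression of shape on size. The sketch below is an ordinary, non-phylogenetic least-squares version with a single predictor; it omits the phylogenetic covariance and the Tribe term used in the analyses above, and all names and toy data are hypothetical.

```python
import numpy as np

def shape_on_size(shapes, size):
    """Ordinary least-squares regression of shape on size.

    Returns the vector of allometric coefficients and the fraction of
    total shape variation explained by size (non-phylogenetic).
    """
    Y = np.asarray(shapes, dtype=float)
    x = np.asarray(size, dtype=float)
    X = np.column_stack([np.ones_like(x), x])  # intercept and size
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    fitted = X @ coef
    mean_shape = Y.mean(axis=0)
    ss_total = ((Y - mean_shape) ** 2).sum()
    ss_model = ((fitted - mean_shape) ** 2).sum()
    return coef[1], ss_model / ss_total

# Hypothetical toy data: one coordinate tracks size, one does not
size = np.array([1.0, 2.0, 3.0, 4.0])
shapes = np.column_stack([2.0 * size, [1.0, -1.0, -1.0, 1.0]])
slope, r_squared = shape_on_size(shapes, size)
```

Variation in the second coordinate is unrelated to size, so it lowers the explained fraction without changing the allometric coefficients.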

That interaction is evident in the distribution of jaw shapes along the first two PCs (Fig. 4), which, together, display half of the disparity of this group. Along PC1, the largely arboreal tribes (Sciurini, Pteromyini and Protoxerini) are distinguished from the ground squirrels (Xerini and Marmotini), but the largest species in the family, the giants of Marmota, have the highest scores on PC1, and the smallest marmotines, chipmunks (Tamias), have the lowest scores of the tribe on this axis. PC1, however, is not aligned with the difference between miniature and giant tree squirrels. Rather, on PC2, miniature tree squirrels, excepting flying squirrels, have the most negative scores. Those with scores lower than −0.06 include several nannosciurines (species of Exilisciurus, Nannosciurus, Prosciurillus and the chipmunk-sized Funambulus and Glyphotes simus), as well as two less extreme miniature sciurines (Microsciurus alfari and Syntheosciurus brochus), and the smallest species of the family, the mouse-sized protoxerine, Myosciurus pumilo. The miniature flying squirrels, however, have either the highest scores on PC1 or lowest on PC2. The giants of Marmota differ from species with lower scores in their less rectangular angular processes that extend more posteriorly, plus they have less robust coronoid and condyloid processes, with a deeper notch between them. Along PC2, the jaws with the lowest scores have a distinctive feature of miniatures, a reduced or absent coronoid process. Those with higher scores not only have well-developed coronoid processes, but also a more slender, more elongate condyloid process, and a shallower incisor alveolus.

Fig. 4 Phylomorphospace of sciurid mandibles

Comparing Tribal Evolutionary Allometries

The angles between evolutionary allometries (Table 4) range from a remarkably narrow 25.75°, between the evolutionary allometries of Nannosciurinae and Ratufa/Sciurillus, to an extraordinarily wide 114.09° between two tribes of Xerinae, Marmotini and Xerini. In comparisons between lineages containing bark-gouging miniatures and durophagous giants (Nannosciurinae, Protoxerini, Sciurini and Ratufa/Sciurillus), the angles between vectors of evolutionary allometric coefficients range from 25.75 to 67.11°; only one exceeds 55° and none of the lineages that are diverse enough for meaningful statistical comparisons differ significantly in their evolutionary allometries. In comparisons between one of those lineages and one lacking those two size-dependent specializations, the angles range from 58.60 to 98.85°. Only one is less than 70° and in all comparisons between lineages diverse enough for meaningful statistical comparisons, the allometries differ statistically significantly. The comparisons between lineages that both lack those size-dependent specializations yield angles ranging from 63.83 to 114.09°.

Table 4 Comparing slopes of tribal allometries, measured as angles between them

Removing the semilandmarks has little effect on the results. The angles range from 25.31 to 104.62°. In comparisons between lineages containing bark-gouging miniatures and durophagous giants, the angles between vectors of evolutionary allometric coefficients range from 25.31 to 64.17°, but only one exceeds 55°. In comparisons between one of those lineages and one lacking those two size-dependent specializations, the angles range from 60.13 to 104.62°. Only two are less than 68.36°; hence, only two are any more similar than expected by chance. In all but one comparison between pairs of lineages diverse enough for meaningful statistical comparisons, the allometries differ statistically significantly (the exception is the comparison between Pteromyini and Sciurini, P = 0.053). Comparisons between lineages that both lack size-dependent specializations yield angles ranging from 71.62 to 116.45°.
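The text does not state how the angle expected by chance was obtained. One common approach, sketched here purely as an assumption, is to simulate pairs of random vectors in a space of the same dimensionality and take a lower quantile of their angles as the threshold. The dimensionalities, the number of draws and the 5% quantile below are hypothetical choices.

```python
import numpy as np

def chance_angle_threshold(n_dims, n_draws=20000, quantile=0.05, seed=0):
    """Lower quantile of angles (degrees) between random vector pairs.

    Vectors with independent standard normal elements have uniformly
    distributed directions, so their pairwise angles give a null
    distribution for 'no more similar than random'.
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((n_draws, n_dims))
    v = rng.standard_normal((n_draws, n_dims))
    cosine = (u * v).sum(axis=1) / (
        np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1)
    )
    angles = np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0)))
    return float(np.quantile(angles, quantile))

# Hypothetical dimensionalities, for illustration only
low_dim = chance_angle_threshold(3)
high_dim = chance_angle_threshold(30)
```

The threshold approaches 90° as dimensionality increases, which is why angles that would look large in two or three dimensions can still indicate non-random similarity in a high-dimensional shape space.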

Rates of allometric shape change differ numerically (Table 5A), but, with one exception, not statistically significantly when the semilandmarks are included in the analysis; removing them yields significant differences between the two lineages with very low allometric rates (Marmotini, Pteromyini), and the species-rich lineages with high rates (Nannosciurinae, Protoxerini) (Table 5B).

Table 5 Comparing the rate of shape change per unit change in lnCS, measured by the length of the per-unit size allometric vector
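As a minimal illustration of the measure named in the caption of Table 5, the rate is the Euclidean length of the vector of shape change per unit change in lnCS. The function name and the coefficients below are hypothetical.

```python
import numpy as np

def allometric_rate(coefficients):
    """Length of the vector of shape change per unit change in lnCS."""
    return float(np.linalg.norm(np.asarray(coefficients, dtype=float)))

# Hypothetical coefficient vectors with the same direction
fast = allometric_rate([0.03, 0.04])
slow = allometric_rate([0.006, 0.008])
```

The two toy vectors point in the same direction, so the angle between them is 0°, yet one has five times the rate of the other; direction and rate are separate properties of an allometry.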

The dominant axis of variation among allometric trends is not size; rather, it is a contrast in the directions of allometry (Fig. 5A). Plotting PC1 relative to size (Fig. 5B) shows the component of allometry along PC1, but there is also a component of allometry along PC2 (Fig. 5C).

Fig. 5 A Principal components (PCs) of fitted values of evolutionary allometry; B PC1 of fitted values of evolutionary allometry plotted against the Predictor (LCS); C PC2 of fitted values plotted against the Predictor (LCS)
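A principal components analysis of fitted values can be sketched by stacking the values predicted from each lineage's allometric regression and decomposing the centered matrix. The sketch below uses a singular value decomposition; the function name and the two toy lineages, whose allometries point in orthogonal directions, are hypothetical.

```python
import numpy as np

def pca_of_fitted(fitted):
    """PCA of stacked fitted values via singular value decomposition.

    Returns PC scores and the proportion of variance on each PC.
    """
    F = np.asarray(fitted, dtype=float)
    centered = F - F.mean(axis=0)
    _, sing, vt = np.linalg.svd(centered, full_matrices=False)
    explained = sing ** 2 / (sing ** 2).sum()
    scores = centered @ vt.T
    return scores, explained

# Hypothetical fitted values: two lineages, orthogonal allometries
size = np.array([-1.0, 0.0, 1.0])
lineage_a = np.outer(size, [1.0, 0.0, 0.0])
lineage_b = np.outer(size, [0.0, 0.5, 0.0])
scores, explained = pca_of_fitted(np.vstack([lineage_a, lineage_b]))
```

Because the toy lineages change shape in different directions, each PC captures one lineage's trend, so the leading axes describe a contrast between directions of allometry and not a shared size axis.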

Evolutionary Allometry Within Tribes

In both Marmotini and Nannosciurinae, evolutionary changes in size have a moderate impact on shape (Table 6). That size explains only 12% of the variation in Marmotini might seem surprising in light of the variation along PC1 (Fig. 6A), which accounts for 64.58% of the variance, and looks like a size axis in that the species with the most negative scores are small-bodied chipmunks (Tamias) or antelope squirrels (Ammospermophilus) and those with the most positive scores are the large-bodied prairie dogs (Cynomys) and giant Marmota. In a non-phylogenetic Anova, size explains 42.8% of the variance in shape but rather than co-evolving, size and shape are jointly retained. Size accounts for so much less in the phylogenetic Anova because of those evolutionary dynamics of size and shape. In Nannosciurinae, size explains only 11% of the variance in shape (Table 6), although PC1 (Fig. 6B) is again at least partly related to size. The miniatures, Exilisciurus, Nannosciurus and Glyphotes, have the highest scores on PC1, the larger-bodied Callosciurus, Dremomys and Sundasciurus have the lowest scores, and the intermediate-sized Prosciurillus are between them. But even though the species of Funambulus are within that broad cluster of miniatures, they are not miniatures, and the most extreme positive score on both PC1 and PC2 is the jaw shape of Rhinosciurus laticaudatus, which is no smaller than the species with the most negative scores on this axis. Within that cluster of scores on the left are species that differ by a factor of two in jaw size. Even in a non-phylogenetic Anova, size accounts for only 15% of the disparity of shape. In this group, despite the convergent miniatures, size and shape neither coevolve nor are jointly retained; size evolves even when shape is conserved, and shape evolves when size is conserved.

Table 6 Analyses of tribe-specific allometries for two species-rich tribes, A: Marmotini; B: Nannosciurinae
Fig. 6 Phylomorphospaces of two diverse tribes: A Marmotini; B Nannosciurinae

The regression scores for Marmotini (Fig. 7A) show what was evident for shape in the phylomorphospace, but more clearly show that those scores do not consistently increase with size. The regression scores of the giants (LCS = 5.5) are no higher than those of species of intermediate size (LCS ≈ 5.0). The giants differ along a different direction than the allometric axis because jaws of giants do not evolve by extending a common allometric trend to a larger size. In Nannosciurinae, the allometric trend is the dominant pattern (Fig. 7B), but there is substantial variation around that trend. Two miniatures have the highest scores, and larger-bodied species (aside from R. laticaudatus) have, on average, lower scores, but at every size there is variation in the regression scores. Species of intermediate size (LCS = 4.2–4.4) have scores as high as one of the miniatures, and at all sizes within that range, there are regression scores ≈ −0.002. Species that have similar jaw shapes differ in size and species that have similar sizes differ in jaw shape.

Fig. 7 Scores for the regression of shape on size: A Marmotini; B Nannosciurinae
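Regression scores of this kind can be sketched as the projection of each species' centered shape onto the unit-length vector of regression coefficients. The sketch below assumes that definition and uses ordinary least squares without phylogenetic correction; the function name and the toy data are hypothetical.

```python
import numpy as np

def regression_scores(shapes, size):
    """Project centered shapes onto the unit allometric vector.

    Assumes regression scores are the projection of shape onto the
    normalized vector of coefficients from shape regressed on size.
    """
    Y = np.asarray(shapes, dtype=float)
    x = np.asarray(size, dtype=float)
    Yc = Y - Y.mean(axis=0)
    xc = x - x.mean()
    # Least-squares slope of every shape coordinate on size
    slope = xc @ Yc / (xc @ xc)
    return Yc @ slope / np.linalg.norm(slope)

# Hypothetical toy data: one coordinate tracks size, one does not
size = np.array([1.0, 2.0, 3.0, 4.0])
shapes = np.column_stack([2.0 * size, [1.0, -1.0, -1.0, 1.0]])
scores = regression_scores(shapes, size)
```

The scores summarize only the component of shape aligned with the allometric vector, so species that differ along other directions, as the giants of Marmota are described as doing, can have similar scores despite different shapes.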

Comparing Growth to Tribal Evolutionary Allometries

Growth and evolutionary allometries are strikingly different (Table 7). For the data consisting of both landmarks and semilandmarks, the angles between growth and evolutionary allometries range from 58.81 to 105.34°. The results differ little when the semilandmarks are removed. Without semilandmarks, the angles range from 57.94 to 102.01°. Growth allometries of six species resemble the evolutionary allometry of Marmotini more than expected by chance, and that of the cricetid rodent (Sigmodon hispidus) also resembles the evolutionary allometry of Nannosciurinae. Only the growth allometries of one marmotine (Ammospermophilus leucurus) and the muroid (Mus musculus) do not, nor do they resemble any other evolutionary allometry. The growth allometries of the sciurine (Sciurus carolinensis) and the protoxerine (Heliosciurus rufobrachium) differ from their own tribes’ evolutionary allometries by more than 88°.

Table 7 Comparing growth to evolutionary allometries

Those striking contrasts between growth and evolutionary allometries can be seen in the principal components of the fitted values (Fig. 8A). In this plane, growth allometries range from nearly horizontal to increasing on PC2 as scores increase on PC1, but evolutionary allometries are more vertically oriented and decrease on PC2 as scores increase on PC1. Plotting PC1 scores for these fitted values against size (Fig. 8B), including the tribes that are too species-poor for meaningful statistical comparisons, shows that scores for growth allometries consistently decrease with increasing size, but scores for some tribes’ evolutionary allometries are nearly constant and others’ slightly increase with size. Plotting PC2 scores against size (Fig. 8C) again shows steeply decreasing scores for the growth allometries, but variously constant, decreasing or, for Xerini, increasing scores for the evolutionary allometries with increasing size.

Fig. 8 A Principal components (PCs) of fitted values of mandibular growth and evolutionary allometries. B PC1 of fitted values of evolutionary and growth allometry plotted against the Predictor (LCS); C PC2 of fitted values plotted against the Predictor (LCS)

To the extent that growth and evolutionary allometries have any features in common, it is the highly positive allometry of the angular process relative to jaw depth, which is characteristic of all ontogenetic allometries and the evolutionary allometries of Marmotini and Nannosciurinae (Fig. 9). That resemblance, however, is slight because of the differences in orientation of that deepening.

Fig. 9 Deformation grids depicting allometric shape changes A growth; B evolutionary

Discussion

We anticipated that evolutionary allometry would be complex because it depends not only on the multiplicity of processes generating growth allometry and the complex determinants of allometric variation but also on the evolutionary dynamics of size and shape. We might still expect allometries to be highly predictable in functionally homogeneous lineages because those are more likely to maintain functional equivalence across the range of body size, but in functionally diverse lineages, correlations between size and shape would instead indicate size-related deviations from functional equivalence. Hence, the predictability of allometry should depend on the ecological opportunity for size-dependent ecomorphological specialization. As expected, evolutionary allometries are heterogeneous; consequently, the net effect of the variously oriented trends is little predictable change along any one of them; size accounts for merely 6.6% of the disparity of jaw shape. However, evolutionary allometries are at least moderately predictable from convergent, size-dependent specializations; the lineages that contain both convergent specialized miniatures and durophagous giants, Ratufa/Sciurillus, Nannosciurinae, Sciurini and Protoxerini, are similar in their evolutionary allometries; Ratufa/Sciurillus and Nannosciurinae are remarkably similar, differing by merely 25.31°, and only one pair differs by more than 55°. In contrast, when one lineage lacks these size-dependent specializations, only one comparison yields an angle of less than 70° and when both lack those specializations, the angles range from 63.83 to 114.09°. That evolutionary dynamics of size and shape introduce additional complexity beyond that due to ecological specialization is most evident in the two diverse lineages in which allometry would seem to be the dominant trend in the variation of shape. Despite that apparent trend, in neither lineage do size and shape regularly co-evolve. Within one (Marmotini), size and shape are consistently related, and the giants have a distinctive morphology but one that arose only once; rather than co-evolving and converging on both size and shape, size and shape are jointly retained. Because they do not regularly co-evolve, their evolutionary covariance is weak. Within the other lineage (Nannosciurinae), there are distinctive, convergent miniatures, but shapes similar to theirs are found at other sizes, and size does not predict the most strikingly divergent forms; hence, in this lineage too, the evolutionary covariance between size and shape is weak.

We find no evidence of a developmental bias on evolutionary allometry. There is a weak resemblance between the evolutionary allometry of one lineage, Marmotini, and the growth allometries of the cricetid rodent (Sigmodon hispidus) and all squirrels other than one marmotine (Ammospermophilus leucurus). But while the growth allometries of the sciurine (Sciurus carolinensis) and protoxerine (Heliosciurus rufobrachium) slightly resemble the evolutionary allometry of Marmotini, they differ from their own tribes’ evolutionary allometries by more than 88°. It might still seem plausible that growth allometries bias evolution of jaw form, but there are more obvious constraints that limit the angle between growth and evolutionary allometry. Were growth allometry reversed, infants would be born with well-developed jaw muscles, robust mandibular processes and fully erupted teeth and would gradually lose them as they grew; were evolutionary allometries reversed, giants would be clinging to trees by their claws, gouging or gleaning bark to feed on the exudates and insects. Other influential hypotheses of constraints could not be directly tested, except to the extent that they predict highly conservative allometries. In particular, we could not directly test the hypothesis that evolutionary allometry is constrained by conservative static allometry because a comparison between static and evolutionary allometries requires large samples of adults to estimate subtle allometric effects, given that size accounts for merely 4.7% of the variance of shape in adult fox squirrels (S. niger) (Zelditch et al., 2009). Some studies have concluded that static allometries do not differ among species (Bergmann et al., 2021; Freidline et al., 2015; Marcy et al., 2020), and even that static allometry is aligned with evolutionary allometry (Marcy et al., 2020). But only one of those studies reported the angles between vectors (Bergmann et al., 2021), and as is evident from their results, as well as ours (see Table 4), the upper confidence limit for a difference in slope can exceed the angle expected between randomly related vectors. Being unable to reject the null is not equivalent to documenting that it actually is true.

The general hypotheses that fail to predict mandibular allometries are commonly invoked to explain craniofacial allometries, but they may also fail to predict craniofacial allometries aside from a few dimensions. Their predictability is more likely to depend on the highly predictable relationships between body size, brain and facial size than on size-dependent ecological specializations. An obvious question is whether the allometric relationship between the braincase and face covaries with shifts in brain-body allometry, as suggested by Radinsky’s hypothesis that facial length buffers skull size against the negatively allometric brain. In some lineages (e.g., Carnivora) the slope and/or intercept shift at least three times (Smaers et al., 2021). Far more is known about the evolutionary dynamics of the brain’s allometry than the scaling of the braincase relative to the face. The 30 shifts in allometric slope and/or intercept were inferred from data of at least 1400 species (Smaers et al., 2021), but the evidence of the skull’s near-geometric scaling (a PGLS major axis slope of 0.906) rests on data from only 30 species (van der Geer et al., 2018). Although the renewed interest in the scaling of the braincase to the face has stimulated several recent studies, most focus specifically on that one scaling relationship (Cardini, 2019; Cardini & Polly, 2013; Cardini et al., 2015; Linde-Medina, 2016; Marcy et al., 2020; Tamagnini et al., 2017), not on how it accommodates shifts in brain:body allometry, or whether it maintains the near-geometric scaling of the skull. Although these questions are specific to a few craniofacial dimensions, their scaling can affect others, such as cranial base flexion, which appears to be determined by brain volume, facial size, and shape (Bastir et al., 2010; Ross et al., 2004). Of course, there is more to skull morphology than these dimensions; brains vary in shape as well as size (Swanson et al., 2012; Weisbecker et al., 2021), and so do faces. Despite some highly predictable scaling relationships, craniofacial allometry may be no more constrained than mandibular allometry, and, aside from a few highly predictable traits, it too may be predictable only from size-dependent ecological specializations.

As is evident in the heterogeneous allometries of squirrels, allometry may be the simplest case of constraints and of morphological integration, but allometry itself is far from simple. The mandible may be the simplest possible case because it is geometric rather than allometric scaling that maintains functional equivalence of lever arms, so what we see in the allometric trends of the mandible are consistent, size-dependent deviations from functional equivalence. Evolutionary integration is likely to be even more complex when the covariances are between two or more high-dimensional traits. Yet, we might expect that evolutionary integration would resemble evolutionary allometry except that its predictability would depend on shape- rather than size-dependent ecological specializations.