Introduction

In the early nineteenth century, Étienne Geoffroy Saint-Hilaire proposed the principe des connexions as a methodological rule to study animal form (Saint-Hilaire 1818). Other notable naturalists before Geoffroy, such as Pierre Belon and Johann Wolfgang Goethe, also made use of this principle as a way to recognize similarities, a tradition that goes back to Aristotle. However, Geoffroy was the first to establish connections as an operational criterion to identify morphological similarity among different anatomical parts by means of their structural relations to other parts, rather than by their shape and function. Thus, Geoffroy’s principle of connections formalized the intuitive notion of similarity then in vogue and opened up a new research program in pure morphology at the structural level (Appel 1987; Le Guyader 2003; Ochoa and Barahona 2009; Nuño de la Rosa 2012).

Several conceptual frameworks were later proposed for the use of connectivity relations in anatomical systems: Woodger’s structural correspondence, Rashevsky’s bio-topological principle, and Riedl’s diagrammatic morphotype (Woodger 1945; Rashevsky 1954, 1960; Riedl 1978). However, they were too general to be systematically applied to study practical morphological problems. Another, more quantitative way to address connectivity relations in anatomical systems within a precise operational framework, using Network Theory, was also laid out (Rasskin-Gutman and Buscalioni 2001; Rasskin-Gutman 2003). We have argued elsewhere that patterns of bone sutures in the skull can also be characterized as networks, in which nodes represent bones and links represent suture connections. The analysis of these networks in tetrapod skulls revealed evolutionary patterns in morphological complexity, integration, modularity, and phenotypic stability (Esteve-Altava et al. 2011, 2013a, b).

The tetrapod skull has undergone many different lineage-specific morphological changes during its evolution; for example, enlargement and shortening of the rostrum in humans and porpoises (Lieberman 1998; Galatius et al. 2011), miniaturization in lizards and amphibians (Rieppel 1984; Trueb and Alberch 1985; Laurin 2004), and expansion of the cranial vault in birds (Marugán-Lobón and Buscalioni 2003; Bhullar et al. 2012). In addition to these specific trends, a general pattern has occurred in all major lineages since the origin of the vertebrate skull: the reduction in number of skull bones (Table 1). Williston (1914) first described this trend in his studies on Permian reptile skulls; later, Gregory (1935) generalized it to all tetrapods, suggesting that loss and fusion of bones were the mechanisms underlying the establishment of this evolutionary pattern. Gregory paid homage to Williston by naming this evolutionary trend Williston’s Law.

Table 1 Skull bones commonly absent in tetrapods according to different authors (1, Gaffney 1979; 2, Hildebrand 1988; 3, Benton 1990; 4, Sidor 2001; 5, Benton 2005; 6, Kardong 2005)

The reduction in the number of elements, as it occurs in Williston’s Law, has also been proposed as a general mechanism to retain highly complex and functional biological systems throughout evolution, “complexity by subtraction” (McShea and Hordijk 2013); this notion of complexity uses a standard definition of morphological complexity as number of part types (McShea 1996). Using this metric, Sidor (2001) concluded that Williston’s Law is an evolutionary trend toward skull simplification in synapsids. Our view on morphological complexity also includes number of bones (part types) as a model parameter, but the focus is on measuring complexity as connectivity relations between the bones using a series of complementary network parameters: density of connections, characteristic path length, clustering coefficient, and heterogeneity (explained below). These parameters capture not only the number of part types in the skull, but also their local and overall organization (i.e., their connectivity pattern).

Using these new morphological complexity metrics, we showed in a phylogenetic analysis that this reduction in bone number generates an evolutionary trend toward more complex skulls (Esteve-Altava et al. 2013a). In addition, we concurred with Gregory about the importance of losses and fusions of bones as evolutionary mechanisms producing the diversity of extant and extinct skull forms. Moreover, the use of connectivity patterns to quantify morphological complexity suggested that the selective loss of poorly connected bones, alongside new unpaired bone formation by fusion, is responsible for this evolutionary trend. We concluded that the connectivity pattern among skull bones is a source of structural constraints on the loss and fusion of individual bones. Conversely, both mechanisms imposed new constraints on the modification of the connectivity pattern of the entire skull, for example, by increasing the number of connections of bones originated by fusions. The developmental basis for this structural constraint is the increase in functional and developmental dependencies that arises with the establishment of connections among bones (Esteve-Altava et al. 2013b), an evolutionary concept known as developmental burden (Riedl 1978). Other authors have also suggested similar constraint relationships in more general biological contexts, such as Wimsatt’s generative entrenchment (Wimsatt 1986). Since the number of connections of a given bone (i.e., dependencies) characterizes the amount of burden carried by that bone, we suggested that the higher the burden, the less likely the bone will be lost during evolution (Esteve-Altava et al. 2013a).

Here, we address this hypothesis by analyzing the effect of random and selective losses and fusions of bones. To do so, we have built a computational model of skull evolution that simulates Williston’s Law-like evolutionary patterns, from hypothetical ancestral skulls. We have used Gabriel networks (Gabriel and Sokal 1969; Matula and Sokal 1980) as a null model to analyze growth rules and constraints that might be involved in producing connectivity patterns during evolution. Then, we compared the complexity measures of the ancestral and derived simulated networks with those of empirical skull networks from all major tetrapod groups (see “Methods”). Our aim is to explore selective versus random processes of bone loss and fusion mechanisms as plausible evolutionary scenarios. We evaluate three different processes by which the computational model picks a specific bone to be lost or fused: (1) selection of the least connected (L), (2) selection of the most connected (M), and (3) random selection (R). The combination of these mechanisms produces nine different scenarios to be evaluated: LL, LM, LR, ML, MM, MR, RL, RM, RR, in which the first letter is for loss mechanism and the second for fusion mechanism. We also systematically evaluate a series of initial conditions that constrain the model: (1) spatial boundary of the including space, (2) loss to fusion ratio, and (3) number of unpaired bones.
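The three picking processes and the nine scenario codes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the adjacency-dictionary representation, the function name pick_bone, and the random tie-breaking among equally connected bones are all hypothetical choices that the text does not specify.

```python
import random
from itertools import product

# L = least connected, M = most connected, R = random selection.
PROCESSES = ("L", "M", "R")

# First letter: process used for losses; second letter: process used for fusions.
SCENARIOS = ["".join(pair) for pair in product(PROCESSES, repeat=2)]


def pick_bone(adjacency, process, rng=random):
    """Pick one bone from an adjacency dict {bone: set_of_neighbors}.

    Ties among equally connected bones are broken at random; this is an
    assumption, since the text does not say how ties are resolved.
    """
    bones = sorted(adjacency)
    if process == "R":
        return rng.choice(bones)
    degrees = {bone: len(adjacency[bone]) for bone in bones}
    if process == "L":
        target = min(degrees.values())
    elif process == "M":
        target = max(degrees.values())
    else:
        raise ValueError(f"unknown process: {process!r}")
    return rng.choice([bone for bone in bones if degrees[bone] == target])


# Toy network: bone "a" touches every other bone.
toy = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
print(SCENARIOS)
print(pick_bone(toy, "M"), pick_bone(toy, "L"))
```

In the scenario codes, "RM" therefore reads as random selection for losses combined with selection of the most connected bone for fusions.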

Methods

Computational Model

Our computational model simulates the evolution of the skull by losses and fusions of bones (Fig. 1). The model starts each simulation with the generation of a random position vector that defines the coordinates of each initial bone in a fixed 3D Euclidean spatial boundary. We add an anatomically sound constraint: bones must preserve bilateral symmetry unless they are unpaired. Thus, paired bones are positioned at random locations with bilateral symmetry on both sides of the midline, whereas unpaired bones are positioned on the midline of the left–right axis and randomly along the other axes. Once bones have been positioned (Fig. 2a), the Gabriel rule determines their junctions (Fig. 2b), forming a hypothetical ancestral skull network in which each node represents a bone and each link represents a bone junction (Fig. 2c).
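The positioning step can be illustrated with a minimal Python sketch. Several details are assumptions made for the example and are not taken from the model itself: the left–right axis is taken to be x with the midline at x = 0, coordinates are drawn from a uniform distribution, and the function name place_bones and the bone labels are hypothetical.

```python
import random


def place_bones(n_pairs, n_unpaired, size=(1.0, 1.0, 1.0), seed=None):
    """Return {name: (x, y, z)} for a bilaterally symmetric set of bones.

    Assumptions of this sketch: x is the left-right axis, the midline is the
    plane x = 0, and positions are sampled uniformly inside the boundary.
    """
    rng = random.Random(seed)
    sx, sy, sz = size
    bones = {}
    for i in range(n_pairs):
        x = rng.uniform(0.0, sx / 2)      # distance from the midline
        y = rng.uniform(0.0, sy)
        z = rng.uniform(0.0, sz)
        bones[f"p{i}_L"] = (-x, y, z)     # left member of the pair
        bones[f"p{i}_R"] = (x, y, z)      # mirror image on the right
    for j in range(n_unpaired):
        # Unpaired bones sit on the midline, random along the other two axes.
        bones[f"u{j}"] = (0.0, rng.uniform(0.0, sy), rng.uniform(0.0, sz))
    return bones


ancestor = place_bones(n_pairs=30, n_unpaired=2, seed=1)
print(len(ancestor))  # 62 bones: 30 pairs plus 2 unpaired
```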

Fig. 1

Computational model flowchart

Fig. 2

Simplified 12-bone positioning and Gabriel rule connection establishment. This network will be used as the hypothetical ancestral skull network in the example of bone number reduction shown in Fig. 3. a Positioning bones at random but preserving bilateral symmetry in a 2D boundary space. Note that bones f and g are medially positioned unpaired bones. b Establishing connections among bones by applying the Gabriel rule: two bones connect if, and only if, the sphere whose diameter is the line between both bones does not have any other bone within its volume. In this 2D example, we show only the application of this rule to bone a. Circles have been drawn only for four bones (a′, b, c, and f). Following the Gabriel rule, only aa′ and ab will connect (solid line), whereas ac and af will not (dashed line). c After applying the Gabriel rule to all pairs of bones, a network among all bones is formed
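The Gabriel rule stated in this caption translates directly into code. The brute-force Python sketch below checks every pair of bones against all other bones; it works in any number of dimensions. The function name gabriel_edges is hypothetical, and treating a bone lying exactly on the sphere surface as non-blocking is an assumption of the sketch.

```python
from itertools import combinations


def gabriel_edges(points):
    """Return the Gabriel graph of {name: coordinates} as a set of frozensets.

    Two bones connect if and only if no other bone lies strictly inside the
    sphere whose diameter is the segment joining them.
    """
    def sq_dist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    edges = set()
    for a, b in combinations(sorted(points), 2):
        pa, pb = points[a], points[b]
        center = tuple((ai + bi) / 2 for ai, bi in zip(pa, pb))
        radius_sq = sq_dist(pa, pb) / 4
        blocked = any(
            sq_dist(points[c], center) < radius_sq
            for c in points
            if c not in (a, b)
        )
        if not blocked:
            edges.add(frozenset((a, b)))
    return edges


# 2D toy case: c lies inside the circle whose diameter is the segment a-b,
# so a and b do not connect, while a-c and b-c do.
toy = {"a": (0.0, 0.0), "b": (2.0, 0.0), "c": (1.0, 0.5)}
print(sorted(sorted(edge) for edge in gabriel_edges(toy)))
```

The pairwise check costs cubic time in the number of bones, which is acceptable for networks of a few dozen nodes like the ones considered here.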

Then, the number of bones is reduced iteratively, by deciding between fusion and loss. The difference between these two mechanisms is that, for losses, the space left by the removed bone is locally re-wired using the Gabriel rule, whereas for fusions connections are not lost; instead, the ‘new’ bone inherits them. Reduction in the number of bones continues while the simulated skull network has more than 15 bones; otherwise the simulation stops. The reduction from the initial number of bones (61–67, see below) to 15 bones is a reasonable range that covers the empirical sample, from the skulls with the highest number of bones, 56 (Ichthyostega and Seymouria), to the skull with the fewest, 18 (Anser). Figure 3 shows a 2D toy example of the bone number reduction process starting with only 12 bones and ending with 5 (see also a full 3D animation of an actual simulation starting with 62 bones in Online Resource 1).
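The contrast between the two reduction mechanisms can be made concrete with a short Python sketch. It is a simplified reading of the description above, not the published implementation, and it rests on several assumptions: l:f is treated as the per-step probability of a loss, local re-wiring is interpreted as applying the Gabriel rule to pairs of former neighbors of the lost bone, the fused bone keeps the position of one of its components, and bilateral symmetry (loss or fusion of both members of a pair) is not handled. All function names are hypothetical.

```python
import random
from itertools import combinations


def next_event(lf, rng=random):
    """Choose 'loss' with probability lf, otherwise 'fusion' (assumed reading of l:f)."""
    return "loss" if rng.random() < lf else "fusion"


def gabriel_ok(points, a, b):
    """True if no other bone lies strictly inside the sphere with diameter a-b."""
    pa, pb = points[a], points[b]
    center = [(u + v) / 2 for u, v in zip(pa, pb)]
    radius_sq = sum((u - v) ** 2 for u, v in zip(pa, pb)) / 4
    return all(
        sum((u - m) ** 2 for u, m in zip(points[c], center)) >= radius_sq
        for c in points
        if c not in (a, b)
    )


def lose(points, adjacency, bone):
    """Remove a bone, then re-wire its former neighbors with the Gabriel rule."""
    neighbors = adjacency[bone]
    pts = {k: v for k, v in points.items() if k != bone}
    adj = {k: v - {bone} for k, v in adjacency.items() if k != bone}
    for a, b in combinations(sorted(neighbors), 2):
        if gabriel_ok(pts, a, b):
            adj[a].add(b)
            adj[b].add(a)
    return pts, adj


def fuse(points, adjacency, keep, gone):
    """Fuse `gone` into `keep`; the fused bone inherits both sets of connections."""
    inherited = (adjacency[keep] | adjacency[gone]) - {keep, gone}
    pts = {k: v for k, v in points.items() if k != gone}   # keeps `keep`'s position
    adj = {k: v - {gone} for k, v in adjacency.items() if k != gone}
    adj[keep] = set(inherited)
    for n in inherited:
        adj[n].add(keep)
    return pts, adj
```

With this reading, a loss can create new connections among the neighbors of the removed bone, whereas a fusion concentrates existing connections on a single bone.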

Fig. 3

Simplified 2D example starting with the 12-bone ancestral network from Fig. 2. The simulation reduced the number of bones by applying two loss and two fusion events (l:f was set to 0.5) to the initial network until a 5-bone derived network was reached. Note that bilateral symmetry is always preserved. For a full 3D animation of an actual simulation starting with 62 bones, see Online Resource 1

Comparing Simulated Networks with Real Skull Networks

The evolutionary path of each hypothetical ancestral skull network was traced in the simulation by quantifying four network parameters: density of connections, characteristic path length, clustering coefficient, and heterogeneity. These network parameters have been used in previous studies as complementary estimates of morphological complexity. They quantify how many connections are actually formed and the complexity of their arrangement pattern in the skull (see Esteve-Altava et al. 2011, 2013a for detailed mathematical description and biological significance of these parameters). The density of connections is a straightforward measure of structural complexity as the proportion between the connections realized and the maximum possible, which offers a raw estimate of complexity as number of connections. The characteristic path length quantifies how far apart bones are in the network (computed by counting the minimum number of links needed to connect two bones, directly or indirectly); it is an estimate of complexity as efficiency for biomechanical loads and biochemical signal transfers between bones. The clustering coefficient quantifies the proportion between triangular motifs realized and the maximum possible (i.e., for each bone, how many of its neighbors are also inter-connected); the average clustering coefficient of the skull network is an estimate of complexity because it captures the integration associated with correlated connections between skull bones. Finally, heterogeneity quantifies the overall disparity in individual bone connectivity number; thus, this complexity estimate is related to the irregularity of the skull network.
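The four parameters can be computed from an adjacency dictionary with the Python standard library alone, as in the sketch below. The formulas for density, path length, and clustering follow the verbal definitions given above. The heterogeneity formula (standard deviation of the number of connections divided by its mean, using the population standard deviation) is an assumption of this sketch; the exact definition is given in the cited papers. Function names are hypothetical, and the network is assumed to be connected.

```python
from collections import deque
from itertools import combinations
from statistics import mean, pstdev


def density(adj):
    """Connections realized divided by the maximum possible, n(n - 1) / 2."""
    n = len(adj)
    links = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 2 * links / (n * (n - 1))


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected pairs (breadth-first search)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def clustering_coefficient(adj):
    """Average, over bones, of the fraction of neighbor pairs that are connected."""
    values = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            values.append(0.0)   # no triangle is possible around this bone
            continue
        closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        values.append(2 * closed / (k * (k - 1)))
    return mean(values)


def heterogeneity(adj):
    """Assumed definition: standard deviation of connections over their mean."""
    degrees = [len(nbrs) for nbrs in adj.values()]
    return pstdev(degrees) / mean(degrees)


# Toy network: a triangle a-b-c with a fourth bone d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(density(toy), characteristic_path_length(toy),
      clustering_coefficient(toy), heterogeneity(toy))
```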

Each reduction step during a simulation run generates a new derived network with fewer bones, for which the network parameters explained above are quantified. After 1,000 simulations, we computed the mean and standard deviation (STD) for each network parameter. Results are shown as error bar diagrams representing two STD from the mean value versus number of bones. In order to evaluate the fit of each scenario to the empirical data, we counted the number of real skull networks that fall within the error bar range for all four parameters at the same time. Each skull network that meets this requirement is considered a data match. The number of data matches for the whole empirical sample (44 skull networks, see Table 2) defines how well each combination of scenario and set of initial conditions fits the data. Combinations with 36 or more data matches (more than 80 % of fit) define what we call ‘plausible scenarios’.
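The matching criterion can be written down as a short Python sketch. It assumes that each real skull is compared with the simulated networks that have the same number of bones, and it uses the sample standard deviation; both choices, as well as the dictionary keys and function names, are assumptions made for illustration.

```python
from statistics import mean, stdev

PARAMETERS = ("density", "path_length", "clustering", "heterogeneity")


def error_bars(runs):
    """Return {parameter: (low, high)} as mean +/- 2 STD over simulated networks.

    `runs` is a list of dicts {parameter: value}, one per simulation, all taken
    at the same number of bones.
    """
    bars = {}
    for p in PARAMETERS:
        values = [run[p] for run in runs]
        m, s = mean(values), stdev(values)
        bars[p] = (m - 2 * s, m + 2 * s)
    return bars


def is_data_match(skull, bars):
    """A skull matches only if all four parameters fall within the error bars."""
    return all(bars[p][0] <= skull[p] <= bars[p][1] for p in PARAMETERS)


def is_plausible(n_matches, threshold=36):
    """36 or more matches out of 44 skulls is the plausibility threshold."""
    return n_matches >= threshold


# Toy check: three simulated networks with values 1, 2, 3 give bars of (0, 4).
runs = [{p: v for p in PARAMETERS} for v in (1.0, 2.0, 3.0)]
bars = error_bars(runs)
print(is_data_match({p: 3.5 for p in PARAMETERS}, bars))  # True
```

Requiring all four parameters to fall within range at once makes the criterion stricter than testing each parameter separately.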

Table 2 Empirical sample of skull networks (Esteve-Altava et al. 2013a)

Exploration of the Parameter Space

A full parameter space exploration has been carried out after discretizing the three initial conditions: spatial boundary of the including space, loss to fusion ratio (l:f), and number of unpaired bones (Fig. 4). Four different initial spatial boundaries (i.e., 3D Euclidean space where bones are initially positioned) were used: cubic (1 × 1 × 1); and three different rectangular prisms, long (1 × 1 × 2), flat (2 × 1 × 2), and flat and long (2 × 1 × 4). The l:f ranges from 0 for only fusions to 1 for only losses, and it was sampled in intervals of 0.1. The initial number of bones was 30 paired bones (60 total) plus 1–7 unpaired bones. In total, 2,772 combinations of scenarios and initial conditions were evaluated by running 1,000 simulations for each combination.
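The total of 2,772 combinations follows from 9 scenarios × 4 spatial boundaries × 11 values of l:f × 7 numbers of unpaired bones. The Python sketch below enumerates this grid; the variable names are hypothetical and the enumeration order is arbitrary.

```python
from itertools import product

SCENARIOS = ["".join(pair) for pair in product("LMR", repeat=2)]   # 9 scenarios
BOUNDARIES = {
    "cubic": (1, 1, 1),
    "long": (1, 1, 2),
    "flat": (2, 1, 2),
    "flat and long": (2, 1, 4),
}
LF_RATIOS = [i / 10 for i in range(11)]    # 0.0 (only fusions) to 1.0 (only losses)
UNPAIRED = range(1, 8)                     # 1 to 7 unpaired bones

# Each element is (scenario, boundary name, l:f, number of unpaired bones).
grid = list(product(SCENARIOS, BOUNDARIES, LF_RATIOS, UNPAIRED))
print(len(grid))  # 2772

# Total initial number of bones: 30 pairs plus the unpaired ones.
initial_sizes = sorted({60 + u for u in UNPAIRED})
print(initial_sizes[0], initial_sizes[-1])  # 61 67
```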

Fig. 4

Parameter space definition for the three initial conditions: l:f, number of unpaired bones, and initial spatial boundary. The number of unpaired bones defines the total initial number of skull bones as 30 paired (60 total) plus 1–7 unpaired bones. For each scenario, we ran 1,000 simulations for each possible combination (2,772) in this parameter space

Results

After full exploration of the parameter space, results for each combination of scenario and set of initial conditions range from 0 to 38 data matches. Table 3 shows the number of plausible scenarios obtained for each of the nine scenarios, itemized by the initial spatial boundary condition. Results indicate that all scenarios that include selection of the least connected bones to be lost or fused (LL, LM, LR, ML, and RL) have less than 80 % of fit (i.e., fewer than 36 matches out of 44); that is, no plausible scenarios are generated when these processes are present. In contrast, the greatest number of plausible scenarios occurs when the mechanism for fusion of bones is the selection of the most connected ones: 11 for MM and 17 for RM.

Table 3 Number of plausible scenarios for all scenarios in each initial spatial boundary

Selective Scenarios

For the MM scenarios the best initial spatial boundary is the cubic one, with 7 plausible scenarios. Figure 5a shows how this highly selective scenario varies in number of matches according to l:f and initial number of unpaired bones. Higher numbers of matches occur between l:f = 0.4 (40 % loss, 60 % fusion) and l:f = 0.1 (10 % loss, 90 % fusion), and an initial number of unpaired bones between 4 and 7.

Fig. 5

Number of matches (color bar) in the parameter space for scenarios cubic MM and flat rectangular RM. a The MM scenario shows higher matches for lower values of l:f, except for only fusions (l:f = 0), and higher number of unpaired bones. b The RM scenario shows higher matches for higher values of l:f, except for only losses (l:f = 1), and lower number of unpaired bones. The two scenarios have opposite optimal initial conditions due to differences in the process of picking bones to be lost (selection of most connected vs. random selection) and the shape of the initial spatial boundary (cubic vs. flat rectangular). Color bar and marker size indicate the number of matches (Color figure online)

Mixed Scenarios

For the RM scenarios the best initial spatial boundary is the flat rectangular one, with 7 plausible scenarios. Figure 5b shows how this mixed scenario varies in number of matches according to the l:f and initial number of unpaired bones. Higher numbers of matches occur between l:f = 0.5 (50 % loss, 50 % fusion) and l:f = 0.9 (90 % loss, 10 % fusion), and an initial number of unpaired bones between 1 and 5.

Within the RM scenarios, the best overall plausible scenario occurs for the following conditions: l:f = 0.7 (70 % loss, 30 % fusion), 2 initial unpaired bones, and a cubic spatial boundary, which shows the highest number of matches, 38. Figure 6 plots all empirical skull networks on the average values of each network parameter estimated for 1,000 simulations.

Fig. 6

Data matches (38; 86 %) of the best overall plausible scenario for the four network parameters used to evaluate the fit of the model: density of connections, characteristic path length, clustering coefficient, and heterogeneity. Red line indicates average values of 1,000 iterations and error bars represent 2 STD (Color figure online)

Discussion

We have shown that complexity in connectivity patterns among skull bones (i.e., number of connections and their organization) increases in every evolutionary scenario of bone number reduction by loss and fusion of bones. This increase in morphological complexity varies in a wide range below and above the actual increase that we have measured previously (Esteve-Altava et al. 2013a). Thus, how each scenario fits our empirical sample depends on which processes have been involved, selective or random, as well as the fine-tuning of the initial conditions of the model: spatial boundary of the including space, loss to fusion ratio, and number of unpaired bones. The main finding in this study is that Williston’s Law is a trend guided by a structural constraint: the random loss of poorly connected bones and the selective fusion of the most connected ones. This evolutionary scenario highlights the importance of bone reduction mechanisms to explain morphological complexity (see McShea and Hordijk 2013, for a general discussion of “complexity by subtraction”).

Our results further indicate that neither the selective loss nor the selective fusion of the least connected bones can fully explain the evolution of morphological complexity in Williston’s Law (Esteve-Altava et al. 2013a). In all these scenarios (LL, LM, LR, ML, and RL) new connections appear among bones, over-increasing the complexity of the simulated skull networks; thus, no plausible generated scenarios can account for Williston’s Law under these circumstances. In contrast, two scenarios involving the selective fusion of the most connected bones produce a higher number of plausible scenarios: one with selective loss of the most connected bones (MM) and one with random bone loss (RM). Hereafter, we refer to these two types of plausible scenarios as ‘selective’ and ‘mixed’ scenarios, respectively.

In selective scenarios, loss and fusion of bones have opposite effects. The loss of the most connected bones reduces complexity because, on average, more connections are lost than re-wired among neighboring bones. On the other hand, fusion of the most connected bones increases morphological complexity because the new fused bone ends up being hyper-connected after inheriting the connections of all the bones involved in the fusion event. In these scenarios, both mechanisms are balanced for low values of l:f, that is, when loss is less frequent than fusion (40 % loss or less, 60 % fusion or more). A higher frequency of fusion events buffers the decrease of complexity due to losses, and also produces some plausible scenarios with good fits to empirical data. The prevalence of this selective scenario would suggest that fusions have been more frequent than losses during the evolution of the skull; mixed scenarios, however, suggest a different story.

In mixed scenarios, loss of bones occurs at random. However, a random pick does not mean that highly and poorly connected bones are lost with equal frequency. This is because, as in empirical skull networks (Esteve-Altava et al. 2011), simulated Gabriel networks have right-skewed distributions of connections, such as binomial decay, uniform decay, exponential decay, and power-law. This indicates that most bones have fewer connections than the average, while a few bones have most of the network connections. As a consequence, poorly connected bones are more likely to be picked than highly connected ones, even when this is done at random. In mixed scenarios, loss of bones also increases morphological complexity. Here, the range of l:f that produces the highest number of data matches (shown in Fig. 5b) is biased toward a higher proportion of losses than fusions (50 % loss or more, 50 % fusion or less). Furthermore, the best overall plausible scenario simulated is a mixed scenario with l:f = 0.7 (70 % loss, 30 % fusion). As Table 1 shows, the number of lost bones compiled from mainstream literature is slightly higher than the number of fused bones in tetrapods. However, determining whether a bone has been lost rather than fused in the fossil record is very difficult. Nevertheless, the proportion of bone loss and fusion in the literature seems to better support mixed scenarios than selective ones (i.e., slightly more loss than fusion of bones). It is worth noting that for both selective and mixed scenarios, the most extreme ratios of loss to fusion events (i.e., only loss or only fusion) show a significant decrease in number of data matches; this suggests that, whatever the scenario, both loss and fusion mechanisms are necessary to evolve complex skull networks.
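The argument that a uniform random pick favors poorly connected bones can be checked with simple arithmetic. The Python sketch below uses a made-up, right-skewed list of connection counts; these numbers are purely illustrative and are not taken from any simulated or empirical skull network.

```python
from statistics import mean

# Hypothetical right-skewed connection counts for a 10-bone network:
# most bones have few connections, a few bones have many.
degrees = [1, 1, 2, 2, 2, 3, 3, 4, 6, 10]

average = mean(degrees)
below_average = sum(1 for k in degrees if k < average)

# With a uniform random pick, every bone is equally likely to be chosen,
# so the chance of picking a below-average bone is just their share.
p_below = below_average / len(degrees)
print(average, p_below)  # 3.4 0.7
```

In this toy case, 7 of the 10 bones have fewer connections than the average of 3.4, so a uniform random pick selects a poorly connected bone 70 % of the time.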

The optimal initial spatial boundary is also different for selective and mixed scenarios. A cubic boundary is preferred in selective scenarios, while a long rectangular boundary is preferred in mixed scenarios. However, this result has much to do with the Gabriel rule that we used to build theoretical ancestral skull networks. Gabriel networks capture an important developmental constraint: the impossibility of creating a suture contact between distant bones. This is not due to the physical distance between ossification centers, but rather to the presence of obstacles between them: cavities, openings, organs, as well as other bones. Thus, in spaces in which one or more axes predominate, such as in flat (2 × 1 × 2) and long and flat (2 × 1 × 4) prisms, the Gabriel rule imposes overly strong constraints on connectivity (Matula and Sokal 1980). For instance, positioning bones along a very long axis will prevent most of the connections between them, since many bones will fall within the intersection sphere of others. As a consequence, those spaces that are more uniform in the three body axes, such as the cubic (1 × 1 × 1) and the long rectangular (1 × 1 × 2), are the least restrictive of all spatial boundaries, the latter being optimal in mixed scenarios. Furthermore, the flat rectangular boundary more closely resembles the shape of the skull in basal tetrapods, such as Acanthostega, Ichthyostega or Seymouria.

The initial number of unpaired bones also shows different optimal values for each scenario. In selective scenarios, this number ranges from 4 to 7, which is above the estimated average values for the reconstructed last common ancestor using parsimony optimization (Esteve-Altava et al. 2013a). In mixed scenarios, there is a preference for lower numbers of unpaired bones, from 1 to 5, that is, below the average for the reconstructed last common ancestor. Furthermore, the best overall plausible scenario simulated is a mixed scenario with 2 initial unpaired bones, which is what is found in some basal tetrapods, such as Seymouria baylorensis (Laurin 1996) or in basal bony fishes (Claeson et al. 2007). Thus, the preference for a low number of unpaired bones further reinforces the plausibility of mixed scenarios.

In addition, the plausibility of mixed scenarios is further supported by a series of arguments. The loss of poorly connected bones, rather than of the most connected ones, has a sound biological explanation due to the many developmental and functional roles of suture connections as sites of bone growth (Rice 2008), cranial bone movements (Jaslow 1990), and strain sinks (Rafferty et al. 2003). Thus, bones with a high number of connections carry a higher developmental burden within the skull structure than poorly connected ones do. As a consequence, highly connected bones tend to be preserved during evolution, while the loss of poorly connected ones is less constrained, as is predicted given their lower burden (Riedl 1978; Wimsatt 2007; Schoch 2010; Esteve-Altava et al. 2013a, b). Finally, the higher the number of suture connections, the higher the chance of undergoing a fusion event of bones, explaining the preference for fusions of the most connected ones.

Concluding Remarks

Computational models based on networks, like the one presented here, demonstrate their usefulness in unveiling plausible mechanisms underlying evolutionary trends such as Williston’s Law. These models offer the opportunity to reproduce structural constraints and processes that might have taken place during skull evolution. Here, we have used a computational model to assess the likelihood of some bones, and not others, to be lost or fused according to their number of connections, as well as the initial conditions that facilitated these two mechanisms.

Our findings support a mixed scenario for Williston’s Law: the random loss of poorly connected bones and the selective fusion of the most connected ones. Specifically, the model suggests the following optimal evolutionary conditions: (1) an initial spatial boundary unconstrained and uniform in the three body axes, (2) a low number of initial unpaired bones, and (3) bone losses that are, on average, slightly more frequent than bone fusions. These three conditions seem to be optimal to facilitate the evolution of the tetrapod skull in which the reduction in number of bones promotes an increase in morphological complexity.