A multivariate phylogenetic comparative method incorporating a flexible function between discrete and continuous traits

Haba, Yuki; Kutsukake, Nobuyuki

doi:10.1007/s10682-019-10011-6

A multivariate phylogenetic comparative method incorporating a flexible function between discrete and continuous traits

Methods
Published: 08 October 2019

Volume 33, pages 751–768, (2019)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Evolutionary Ecology Aims and scope Submit manuscript

A multivariate phylogenetic comparative method incorporating a flexible function between discrete and continuous traits

Download PDF

1142 Accesses
5 Citations
9 Altmetric
Explore all metrics

Abstract

One major challenge of using the phylogenetic comparative method (PCM) is the analysis of the evolution of interrelated continuous and discrete traits in a single multivariate statistical framework. In addition, more intricate parameters such as branch-specific directional selection have rarely been integrated into such multivariate PCM frameworks. Here, originally motivated to analyze the complex evolutionary trajectories of group size (continuous variable) and social systems (discrete variable) in African subterranean rodents, we develop a flexible approach using approximate Bayesian computation (ABC). Specifically, our multivariate ABC-PCM method allows the user to flexibly model an underlying latent evolutionary function between continuous and discrete traits. The ABC-PCM also simultaneously incorporates complex evolutionary parameters such as branch-specific selection. This study highlights the flexibility of ABC-PCMs in analyzing the evolution of phenotypic traits interrelated in a complex manner.

Working with the Tree of Life in Comparative Studies: How to Build and Tailor Phylogenies to Interspecific Datasets

Phylogenetic Prediction to Identify “Evolutionary Singularities”

Evolutionary Models and Phylogenetic Signal Assessment via Mantel Test

Article 19 October 2016

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

Phylogenetic comparative methods (PCMs) provide a powerful statistical framework for investigating the patterns and processes of trait evolution (Felsenstein 1985; Harvey and Pagel 1991; Nunn 2011; Garamszegi 2014). The recent development of PCMs permits analyses of biologically interrelated discrete and continuous variables in a single multivariate statistical framework (Table 1). The development of such multivariate PCMs is crucial for two reasons. First, conducting two separate univariate analyses for a continuous trait and a discrete trait is redundant when the two variables are interrelated. Second, and more importantly, separate univariate analyses will miss the opportunity to consider important biological links between these traits.

Table 1 Summary of previous methods and the originality of this study

Full size table

The threshold model (Felsenstein 2005, 2012) was among the first PCMs to fully combine both discrete and continuous traits. The idea of the threshold model was originally developed in quantitative genetics by Wright (1934) to understand how multiple underlying genetic loci contribute to categorical traits such as the number of digits in guinea pigs. The threshold model assumes an unobservable continuous trait called ‘liability’ that underlies a discrete trait of interest. Because the liability is a continuous trait, Brownian motion has been conventionally used to model its evolution. Then, the state of the discrete trait of interest is determined by whether the liability trait value is below or above a particular threshold. This model allows users to incorporate both continuous and discrete traits in a straightforward way, as well as to estimate the covariance between the liability trait and other continuous traits of interest. However, because liability is unobservable, it is impossible to directly infer a latent function between the discrete trait and other observable continuous traits of interest. Moreover, although it is convenient to assume that the discrete trait is determined by a simple threshold (which can be treated as a probit link function in a framework of a phylogenetic generalized linear mixed model, PGLMM; Hadfield 2015; also see Ives and Garland 2014), it is unclear if the assumption is always biologically valid. At the very least, it is desirable for researchers to be able to assume other forms of latent functions (Fig. 1).

In addition to Felsenstein’s threshold model, other methods that can link discrete traits and continuous traits have been proposed (Ives and Garland 2010, 2014; Hadfield and Nakagawa 2009; Hadfield 2015; see Table 1). For example, Ives and Garland (2010) developed a phylogenetic logistic regression to test the effects of observable continuous independent traits on a discrete dependent trait (phylogenetic logistic regression, Ives and Garland 2010). Hadfield and Nakagawa proposed a phylogenetic generalized linear mixed model employing a Bayesian approach (MCMCglmm, Hadfield and Nakagawa 2009; Hadfield 2015; also see Ives and Garland 2014 for other approaches and a comparison of their performance). These approaches successfully extended traditional linear models to enable nonlinear link functions (logit function in a phylogenetic logistic regression; logit or probit functions in MCMCglmm) between discrete and continuous traits. Notably, a probit-GLM can be mathematically equivalent to the threshold model (see Hadfield 2015). However, limitations to these models still exist. For example, the model by Ives and Garland (2010) assumes that the continuous traits are known values measured empirically and does not allow the continuous traits to evolve along the tree (see Felsenstein 2012 and Hadfield 2015 for models that do not have such assumption; but also see Ives and Garland 2014 for analyses of the relatively small effects of phylogenetic signal on continuous traits). Moreover, the link function in the models is predetermined. In most cases, equipped functions (e.g., logit or probit functions) are those that have useful mathematical properties for analyzing a relationship between discrete and continuous traits. Still, cases may exist in which it is biologically valid to establish a more complicated function between discrete and continuous traits. Therefore, it is ideal to prepare a framework that enables researchers to examine a flexible function to test specific hypotheses of interest.

Furthermore, analyses will be even more complex if additional variables of interest are included, such as the presence of branch-specific directional selection (Kutsukake and Innan 2013, 2014). This complication often prevents the description of a likelihood function of the model, which most conventional PCMs require (e.g., the maximum likelihood, Bayesian approach). The aforementioned PCMs (Felsenstein 2005, 2012; Ives and Garland 2010, 2014; Hadfield and Nakagawa 2009; Hadfield 2015) cannot incorporate branch-specific directional selection into their models.

Here, we propose a PCM using approximate Bayesian computation (ABC) to overcome the difficulties discussed above (Fig. 2). ABCs have been shown to facilitate flexible analyses in a comparative framework and therefore have increasingly been applied to PCMs with intricate evolutionary scenarios (Bokma 2010; Slater et al. 2012; Kutsukake and Innan 2013, 2014; Janzen et al. 2015; Harano and Kutsukake 2018). Briefly, an ABC-PCM estimates parameters of interest by simulating phenotypic evolution without a likelihood function (Beaumont 2010; Csillery et al. 2010). The proposed parameters are accepted only when the simulated data and real data are similar, and the accepted data comprise posterior distributions of parameters. Thanks to this flexibility, ABC-PCMs enable researchers to test evolutionary models whose likelihood function is mathematically intractable.

Our initial motivation for extending ABC-PCMs was to analyze a heterogeneous evolutionary pattern between group size and social system in African subterranean mole rats (family Bathyergidae; Table 2). In this clade, species have varied social structures that span solitary, social, and eusocial states. Eusociality has independently evolved twice: in naked mole rats (Heterocephalus glaber) and in Damaraland mole rats (Fukomys damarensis) (Jarvis 1981; Sherman et al. 1991; Bennett and Faulkes 2000; Faulkes and Bennett 2013). We hypothesized that the probabilities of evolutionary transitions among eu/social and solitary states (discrete trait) change depending on group size (continuous trait). To capture all of these complex evolutionary features, we are required to incorporate (1) the evolutionary trajectory of sociality and group size, (2) the parameters for the trait-dependent latent functions, and (3) the presence of branch-specific directional selection on group size in the two eusocial species. Certain aspects of these features can be analyzed using previous methods; however, it is not possible to incorporate all factors in a single model with those methods. Then, based on the example data, we discuss how this ABC-PCM can be used for inferring other similar complex evolutionary processes.

Table 2 Data used in this study

Full size table

Methods

ABC-PCM

Our ABC-PCM extends a previously developed framework (Fig. 2; Kutsukake and Innan 2013, 2014) designed to analyze heterogeneous evolutionary modes whose likelihood is not straightforward to describe. We assume knowledge of the species tree Ψ, which consists of information on tree shape and the lengths of all branches in the topology. We also assume that trait data at the tips of the tree D have been observed.

In bivariate analyses of continuous and discrete traits, several causational patterns are possible. For example, a discrete trait can be determined by a continuous trait that evolves on its own, or vice versa. Alternatively, it is also possible to assume no a priori causality between the continuous or discrete traits (Hadfield and Nakagawa 2009; Ives and Garland 2010). Any of these cases can be modeled by the ABC-PCM framework proposed in this paper. Moreover, this framework can be used regardless of whether a given value is latent (e.g., liability in the threshold model) or measurable.

We hereafter consider a simple case of interrelated evolution in which a continuous trait determines a discrete trait. In our example study of African subterranean mole rats, both traits were measurable.

Briefly, our ABC-PCM process is implemented as follows (see Fig. 2 for a visual schematic).

Let Λ be the parameter set to be estimated based on a hypothesis.

(Step 1) Determine prior distributions for all parameters in Λ. If prior biological knowledge is available, it can be used to set informative (i.e., strong) prior distributions.
(Step 2) Parameters used in the simulation (Λ’) are randomly generated from the prior distributions.
(Step 3) Using Λ′, the trait evolution is simulated on the phylogeny Ψ. In the current model, the trait simulation has two parts corresponding to continuous and discrete trait evolution (Fig. 1).
- (Step 3a) The continuous trait evolves via Brownian motion (Felsenstein 1985; note that other evolutionary models can be used; see Kutsukake and Innan 2013).
- (Step 3b) Then the discrete trait is determined according to the probability of transition between two states (A and B) as a function of the continuous trait. Note again that this setting can be relaxed such that causation between continuous and discrete traits can be flexibly changed according to a hypothesis of interest. Here, p(x) is the probability function that state A changes to B given the continuous trait x. Similarly, q(x) is the probability that state B changes to A given the continuous trait x (Fig. 1).
(Step 4) Calculate the likelihood by comparing simulated data Θ with the real data D and determine whether the parameter set is accepted or rejected. A joint probability (full likelihood) for the comparison of n species can be calculated. In most cases, this probability is difficult to obtain. In such cases, a composite likelihood that is proportional to the full likelihood can be used as an approximate proxy. Intraspecific variation in the trait data can be considered by assuming a certain distribution for the trait when calculating the likelihood.

Then the acceptance or rejection of the parameters can be determined based on the likelihood. Several methods of judgment exist (Marjoram et al. 2003; Marjoram and Tavare 2006).
(Step 5) Repeat Steps 2–4 until a sufficient number of parameters Λ′ is accepted. Then, posterior distributions and credible intervals can be estimated.

Although the fundamental structure of ABC is straightforward, the number/choice of summary statistics and the width of tolerance for judging the acceptance or rejection of simulated data at Step 4 are controversial, and there is no general consensus on the choice of summary statistics and tolerance (Beaumont et al. 2002; Csillery et al. 2010; Leuenberger and Wegmann 2010). In this study, we use a combination of perfect match and joint probability as summary statistics (see “Acceptance and summary statistics” section below for details).

Application: social evolution in African mole rats

One example of complex evolution where continuous and discrete traits are interrelated is social evolution. Sociality in animals can be characterized using a discrete classification based on mating and/or social systems (e.g., Shultz et al. 2011). Average group size of a species, a continuous trait, is also an important variable in characterizing sociality in animals (e.g., comparative analyses: Faulkes et al. 1997; Sheehan et al. 2015). Loss of sociality is correlated with a decrease in group size (secondary loss of sociality; Wcislo and Danforth 1997; Beauchamp 1999; Sheehan et al. 2015). Likewise, cooperative social systems are likely to appear as the group size increases (e.g., limited dispersal by ecological constraints: Emlen 1982; Duffy and Macdonald 2010). Thus, the complex social evolution across African mole rats is an ideal system to test our framework. Here, species-specific sociality and the mean group size for each species (x) are the target traits.

Dataset

We surveyed the literature for field data on the sociality (solitary, social, or eusocial), mean group size, and/or distribution of group sizes in each species (Table 2). When the distribution of group size was available, we calculated its mean and standard deviation for each species. For solitary species, we regarded the mean group size as one. In reality, however, their group sizes can deviate from this value, because females may have dependent, pre-dispersal offspring; thus, the group size of solitary species can also have a distribution. Therefore, we incorporated a realistic value for the variance of solitary species (Table 2). We used the phylogeny presented in Faulkes et al. (2004), who used mitochondrial genes 12S rRNA and cyt b. The mean value of the estimated divergent interval in millions of years was used as the length of each branch.

Parameter set and trait simulation

The trait simulation included six parameters to be estimated: the group size of the most recent common ancestor (MRCA) (θ), the rate of evolution (µ), the parameters of directional selection (k_n and k_d), and the parameters for latent functions (a and b, or c and d, depending on the model used; see next section). Table 3 shows the notation and settings of the prior distributions.

Table 3 Parameters estimated in this study

Full size table

The evolutionary process of the continuous trait was as follows. First, we generated the values of six parameters from the prior distributions. Then, based on the proposed value of x at the root (i.e., the group size of the MRCA, θ), we randomly determined the state (i.e., sociality) of the MRCA by either p(x) or q(x). The number of evolutionary events that change traits (i.e., either an increase or decrease in group size) on each branch was respectively modeled as a random number from a Poisson distribution with a mean equaling the product of the evolutionary rate, µ (trait change per million years), and branch length τ (millions of years). The degree of trait change by one evolutionary event is a random value from an exponential distribution, with its mean being an arbitrary value φ (set to 0.01). φ corresponds to the average effect of one change on the trait value. We used an exponential distribution because a change in trait due to one evolutionary event would have small effects on traits in most cases, and large effects less frequently. Thus,

$$\begin{aligned} \Delta^{ + } & = {\text{Pois}}(\upmu \times\uptau_{\text{i}} ) \times \exp \left( \varphi \right) \ldots {\text{trait}}\,{\text{increase}}\,{\text{in}}\,{\text{a}}\,{\text{branch}} \\ \Delta^{-} & = {\text{ Pois}}(\mu \times \tau_{i} ) \times \exp \left( \varphi \right) \ldots {\text{trait}}\,{\text{decrease}}\,{\text{in}}\,{\text{a}}\,{\text{branch}} \\ \end{aligned}$$

where τ_i is the length of the ith branch. The total trait change on the branch is calculated as Δ⁺ − Δ⁻.

Changes in group size caused by one evolutionary event would depend on the initial group size. For example, an increase in the group size by one individual will have different biological meanings in solitary and eusocial species. Therefore, we transformed the value of group size to a natural logarithmic scale during trait simulation. Because the mean group size x cannot be less than one, we enforced a lower bound of one. That is, when a trait change resulted in a value less than one on a branch, the change was not implemented.

We also tested whether there were selective pressures for increasing group size in the two branches leading to eusocial species. In our ABC-PCM model, the parameter k represents directional selection (Kutsukake and Innan 2013, 2014). The selection parameter k biases the number of positive or negative trait changes such that the number of trait changes Pois(µ × τ_i) × exp(φ) is multiplied or divided by k, respectively (see Kutsukake and Innan 2013 for more details). Thus,

$$\begin{aligned} \Delta^{ + } & = {\text{Pois}}(\mu \times \tau_{i} \times k) \times \exp \left( \varphi \right) \ldots {\text{trait}}\,{\text{increase}}\,{\text{in}}\,{\text{a}}\,{\text{branch}} \\ \Delta^{-} & = {\text{Pois}}(\mu \times \tau_{i} /k) \times \exp \left( \varphi \right) \ldots {\text{trait}}\,{\text{decrease}}\,{\text{in}}\,{\text{a}}\,{\text{branch}} \\ \end{aligned}$$

and again, the total trait change on the branch is calculated as Δ⁺ − Δ⁻.

When k = 1, the evolutionary mode of trait evolution is asymptotically identical to the Brownian motion. On the other hand, a significant departure of k from 1 is used as a signature of directional selection.

In this analysis, we set selection parameters for the branches for both naked mole rats and Damaraland mole rats as k_n and k_d, respectively (Table 3), to test our hypothesis that branch-specific selective pressure has increased group size in eusocial species (Jarvis and Bennett 1993; Young et al. 2015). When the 95% CI of the posterior distribution of k was larger than 1, it was considered a signature of directional selection for larger group size.

Models of latent evolutionary functions

At each evolutionary step, the discrete trait value was determined by the continuous trait value according to the latent functions (see Step 3b of ABC-PCM). Because the shape of this function between sociality and group size was unknown, we took advantage of the flexibility of our ABC-PCM framework and tested two examples of evolutionary models with different transition functions, p(x) and q(x). We note that this framework is not limited to these two models but can use any function, depending on the focal biological traits and hypotheses of interest. For simplicity, we set the latent evolutionary functions such that p(x) + q(x) = 1, but this is not required if one desires otherwise.

Model 1: logistic function

In this model, we used a logistic function as the latent evolutionary function, which was similar to a previous model using a generalized linear model for a binary dependent term (Ives and Garland 2010). The transition functions p(x) and q(x) were defined as follows:

$$\begin{aligned} p\left( {\text{x}} \right) & = 1-1/\left( {1 + \exp \left( {-a \times \left( {x-b} \right)} \right)} \right) \\ q\left( {\text{x}} \right) & = 1/\left( {1 + \exp \left( {-a \times \left( {x-b} \right)} \right)} \right) \\ \end{aligned}$$

The parameter a determined the curvature of p(x) and q(x), i.e., the effects of the group size on sociality (Fig. 1). When a is equal to 0, p(x) and q(x) are flat and the transition between social and solitary is independent of group size (Fig. 1, right). When a is positive, a species with large group size is likely to be social; when a is negative, a species with large group size is likely to be solitary. Importantly, when |a| is large enough, the transition between social and solitary is determined by a certain group size, which virtually behaves like a step function (Fig. 1, left). The parameter b is the x value at which p(x) and q(x) equal 0.5, and determines the group size at which the probability of transitioning from solitary to social becomes larger than the probability of the reverse. Because the curvature and midpoint of p(x) and q(x) (i.e., a and b) were unknown a priori, we set broad prior distributions of a and b (Table 3).

Model 2: exponential function

In the second model, we used an exponential function as the latent evolutionary function. We set the transition functions p(x) and q(x) as follows:

$$\begin{aligned} & When\,{\text{d}} > 0\quad \left\{ {\begin{array}{*{20}l} {p(x) = \left\{ {\begin{array}{*{20}l} {1 - \exp ( - c \times (x - d)),} \hfill & {x > d} \hfill \\ {0,} \hfill & {x \le d} \hfill \\ \end{array} } \right.} \hfill \\ {q(x) = \left\{ {\begin{array}{*{20}l} {\exp ( - c \times (x - b)),} \hfill & {x > d} \hfill \\ {0,} \hfill & {x \le d} \hfill \\ \end{array} } \right.} \hfill \\ \end{array} } \right. \\ & When\,d \le 0\quad \left\{ {\begin{array}{*{20}l} {p(x) = \left\{ {\begin{array}{*{20}l} {1 - \exp (c \times (x - d)),} \hfill & {x > d} \hfill \\ {0,} \hfill & {x \le d} \hfill \\ \end{array} } \right.} \hfill \\ {q(x) = \left\{ {\begin{array}{*{20}l} {\exp (c \times (x - d)),} \hfill & {x > d} \hfill \\ {0,} \hfill & {x \le d} \hfill \\ \end{array} } \right.} \hfill \\ \end{array} } \right. \\ \end{aligned}$$

These functions describe an exponential decrease or diminishing increase in the transition probability, and a given species with a mean group size of one is always a solitary species. Similar to a and b in model 1, c and d determine the curvature of the exponential function and the point at which the exponential decrease/increase begins, respectively. Note that one side of the exponential function used here is asymptotically equal to either 0 or 1 when x is sufficiently large, but that of the other side is not; therefore, we set the probability of state change as 0 or 1 when x is smaller than d, at which the functions reach 0 or 1, respectively.

Acceptance and summary statistics

Among ABC-PCMs, no clear consensus has been reached concerning the number/choice of summary statistics or the width of tolerance for judging the acceptance or rejection of simulated data. In the present analyses, we used two summary statistics to assess the fit of the simulated data to actual data. First, for the discrete variable (i.e., sociality), we only accepted simulations in which the simulated data were a perfect match to the real data. For the continuous variable, we used a conventional method that uses a joint probability (Kutsukake and Innan 2013, 2014); we first calculated the probability that the real trait value is gained under a simulated trait value for all 10 species. When calculating the probability, group size (back-transformed to an arithmetic scale from a log-transformed value) was assumed to be normally distributed with a mean and standard deviation equal to those of the real data. Then we calculated the product of those 10 probabilities and used the joint probability as a summary statistic. Simulated parameter sets were accepted in proportion to the joint probability. For example, if the joint probability was 0.8 for a simulation, the parameter set of the simulation has an 80% chance of being accepted. Because the joint probability can be considered a direct likelihood, this method is superior to the standard ABCs, which depend on the choice of summary statistics and arbitrary tolerance (discussed in Kutsukake and Innan 2013, 2014).

In total, 500 parameter sets were collected for estimating posterior distributions. All results were visualised using R version 3.5.3 (R Core Team 2017). The simulation code was written in C and is available in a public repository (https://github.com/YukiHaba/ABC-PCM).

Results

Model 1: logistic functions

When the latent evolutionary function was logistic, the social system of the MRCA was estimated to be solitary in 61.4% of the cases (Fig. 3a, node A). The accepted functions of the transition probabilities varied widely (Fig. 4a), but showed several consistent patterns. First, the curvature parameter of functions, a, was positive in all cases (Table 3, Fig. 4b; see Supplementary material for a low risk of type I error), indicating that the probability of transition from a solitary to social species increased as group size increased. In addition, p(x) typically increased rapidly around x = 2 to 4, which corresponded to the peak of the posterior distribution of b (Table 3, Fig. 4c).

At each internal node before reaching the common ancestor of social species, the inferred social system was predominantly solitary (Fig. 3a). The common ancestor of solitary species was likely to be solitary (Fig. 3a, node C), and a similar pattern was evident at the node of the common ancestor of B. janetta and B. suillus. By contrast, at the nodes leading to the clade of Fukomis, predominantly social states were inferred (Fig. 3a, node B). The estimated group sizes were consistent with the inferred social systems at internal nodes (Fig. 2b–d); the more social the system was likely to be, the larger the group size.

We detected marginal directional selection for increasing group size in the branch that led to naked mole rats (the proportion of k_n < 1 was 6.4%, Fig. 3f). By contrast, we did not detect significant directional selection in the branch leading to another eusocial species, Damaraland mole rats (the proportion of k_d < 1 was 15.8%, Fig. 3g). Based upon this non-significant result for k_d, we repeated the estimation after excluding k_d (i.e., directional selection was assumed only for the branch leading to naked mole rats), but the results did not change qualitatively (data not shown).

Model 2: exponential functions

With the exponential latent evolutionary functions, the MRCA was inferred to be solitary in 60.4% of the cases (Fig. 5a, node A), similar to model 1. Again, although the accepted functions of the transition probabilities between social and solitary varied (Fig. 6a), the curvature of p(x), c, was always positive (Table 3, Fig. 5b; see Supplementary material for a low risk of type I error).

Group size and sociality at each internal node were almost identical to those in model 1. Within the clade of Fukomis, we inferred predominantly social states (Fig. 5a, node B) and relatively large group sizes (Fig. 5c). Within the clade of Bathyergus and Georychus, in contrast, the social system at each node was consistently solitary (Fig. 5a, node C) and the group size was around 2–4 (Fig. 5d).

We detected significant directional selection for increasing group size in the branch leading to naked mole rats (the proportion of k_n < 1 was 3.4%; Fig. 5f). However, we did not detect significant directional selection in the branch to eusocial Damaraland mole rats (the proportion of k_d < 1 was 16.2%; Fig. 5g). Again, based on this non-significant result for k_d, we repeated the estimation after excluding k_d (i.e., directional selection was assumed only for the branch leading to naked mole rats); however, the result did not change qualitatively (data not shown).

Discussion

This study proposes an extension of a previously described ABC-PCM (Kutsukake and Innan 2013, 2014) to analyze complex evolutionary scenarios in which discrete and continuous traits are biologically interwoven. This is the first ABC-PCM framework that allows simultaneous analyses of the interdependent evolution of discrete and continuous traits. In addition, our model offers at least two advantages over other existing methods.

First, this study incorporated a feature that has rarely been included in a PCM framework: branch-specific directional selection (Kutsukake and Innan 2013, 2014; Harano and Kutsukake 2018). This inclusion was possible thanks to the flexibility of the ABC-PCM, which does not require the mathematical expression or analytic solution of a likelihood function. Second, our multivariate model can incorporate flexible user-defined functions that describe the evolutionary relationship between continuous and categorical traits. Previous methods can incorporate nonlinear functions such as logit and probit functions (Ives and Garland 2010; Hadfield and Nakagawa 2009; Hadfield 2015), but the choice of a latent function is less flexible than in our present method. Another well-established method for simultaneously analyzing both continuous and categorical traits is the threshold model (Felsenstein 2005, 2012). The threshold model is equivalent to a phylogenetically controlled linear model with a probit latent function (see Supplementary Material in Hadfield 2015 for a useful graphical representation). Although the assumption is mathematically convenient for analyses, it may be an oversimplification of the evolutionary relationship between the continuous and categorical traits of interest.

Applying this new approach to data on African subterranean rodents, we estimated the intricate evolutionary trajectory of sociality and group size. Two models with different latent evolutionary functions were tested: logistic and exponential. Although the exponential function has not been used in previous PCMs, we believe that this function is suitable for our study species for two reasons. First, group size should have a minimum value of one, and second, a species whose group size is one must be a solitary species. Although these functions have different mathematical characteristics and are not nested, they have common features such as a monotonic increase and an asymptotical approach from one to zero. Potentially due to these common features, the estimated parameters showed similar posterior distributions. In both models the social state of the MRCA was not decisively solitary or social, and its estimated group size varied (node A in Figs. 3a, 5a).

The latent evolutionary functions p(x) and q(x) had a consistent pattern in both models (Figs. 4, 6). Namely, most of the accepted functions had a steep transition around a group size of 2 to 4. This indicates that this particular window of group size was an evolutionary tipping point of the transition between solitary and social states. This study is the first to quantitatively infer the range of group size that is crucial to the evolution of sociality in this clade (Fig. 6).

We tested differential selective pressure on group size in the branches leading to the two eusocial species in this clade (k_n and k_d); we detected directional selection (k_n) for larger group size in the branch leading to naked mole rats in model 2 (Fig. 5f) but not in model 1 (Fig. 3f). In both models, a selective pressure was not detected in the branch leading to Damaraland mole rats (k_d, Figs. 3g, 5g). Thus, although both species are deemed eusocial, the two species may have undergone qualitatively different evolutionary paths (Burda et al. 2000). Future studies should explore the differences in the ecological and evolutionary causes of the evolution of these two eusocial species.

One critical limitation of this case study was the relatively low sample size (10 species; see Supplementary material). This sample size is caused by three inevitable limitations. First, the family Bathyergidae is a monotypic group composed of less than 20 OTUs, which is an inevitable constraint on sample size. Second, detailed data for group size are not available for all species of this family, which further limits the available data. Finally, and most importantly, expanding the number of study species to include closely related non-subterranean species is questionable, as it is highly likely that the underground ecological niche has affected predation pressure and consequently sociality. More broadly, it remains unclear whether heterogeneity of traits other than the traits of interest would affect biological interpretation in PCM studies based on a large sample size. It is generally believed that larger sample sizes (a larger number of species, or more precisely, a larger number of evolutionary transitions) would produce a more powerful test in PCMs. While this is true, it remains largely undiscussed how the inclusion of species that have fundamentally different ecological or behavioral traits would affect the results. In this sense, in addition to a large-scale comparison, it is important to conduct PCMs that focus on a taxon that shares fundamental traits, even if the sample size is not very large.

One major challenge in ABC-PCMs is that they are computationally intensive. In our study, for example, it took several days to accept one parameter set (MacPro, OS 10.6.7, 2 × 2.93 GHz Quad-Core Intel Xeon; also see Kutsukake and Innan 2013). To compare the performance of our model to an available method, we ran a bivariate model on our dataset and phylogeny using MCMCglmm (Hadfield and Nakagawa 2009; Hadfield 2015) assuming that group size and sociality follow Gaussian and probit (“threshold”) families, respectively. For simplicity, we ran the program without incorporating intraspecific variation, branch-specific selection, or the bounding of group size. The MCMCglmm results qualitatively agreed with our results (e.g., 95% CI of the covariance between group size and sociality > 0), yet the process took only a few minutes to an hour on a standard laptop, depending on the parameters (results not shown). Thus, the scope of the present ABC-PCM algorithm is currently limited to analyses of relatively small to moderate numbers of species (29 species in Harano and Kutsukake 2018), but it is not suitable for a large number of species. Improvements in both efficient simulation algorithms and computational power will enable analyses based on a larger number of species.

In summary, we have developed a flexible multivariate ABC-PCM that has great potential for testing biologically intricate scenarios of trait evolution. Although our analyses considered the evolutionary transition of sociality in a relatively small number of species, our method can be applied to other topics and to larger datasets. Despite the fact that our analyses focused on a simple case in which a continuous trait affects the state of a discrete trait, other causational patterns can be dealt with by a flexible setting of an evolutionary simulation. Furthermore, our ABC-PCM framework can also be extended to model more complex evolutionary trajectories, such as asymmetric transitions between states and/or more than two states of discrete traits with different transition functions. Thus, our method allows evolutionary biologists to explore various hypotheses of interest concerning the evolution of interrelated traits.

References

Beauchamp G (1999) The evolution of communal roosting in birds: origin and secondary losses. Behav Ecol 10:675–687
Article Google Scholar
Beaumont MA (2010) Approximate Bayesian computation in evolution and ecology. Ann Rev Ecol Evol Syst 41:379–406
Article Google Scholar
Beaumont MA, Zhang W, Balding DJ (2002) Approximate Bayesian computation in population genetics. Genetics 162:2025–2035
Article PubMed PubMed Central Google Scholar
Bennett NC, Faulkes CG (2000) African mole-rats: ecology and eusociality. Cambridge University Press, Cambridge
Google Scholar
Bennett NC, Jarvis JUM, Cotterill FPD (1994) The colony structure and reproductive biology of the afrotropical Mashona mole-rat, Cryptomys darlingi. J Zool 234:477–487
Article Google Scholar
Bokma F (2010) Time, species, and separating their effects on trait variance in clades. Syst Biol 59:602–607
Article PubMed Google Scholar
Burda H, Kawalika M (1993) Evolution of eusociality in the Bathyergidae: the case of the giant mole-rat (Cryptomys mechowi). Naturwissenschaften 80:235–237
Article CAS PubMed Google Scholar
Burda H, Honeycutt RL, Begall S, Locker-Grütjen O, Scharff A (2000) Are naked and common mole-rats eusocial and if so, why? Behav Ecol Sociobiol 47:293–303
Article Google Scholar
Csillery K, Blum MG, Gaggiotti OE, François O (2010) Approximate Bayesian computation (ABC) in practice. Trends Ecol Evol 25:410–418
Article PubMed Google Scholar
Duffy JM, Macdonald KS (2010) Kin structure, ecology and the evolution of social organization in shrimp: a comparative analysis. Proc R Soc B 277:575–584
Article PubMed Google Scholar
Emlen ST (1982) The evolution of helping. I. An ecological constraints model. Am Nat 119:29–39
Article Google Scholar
Faulkes CG, Bennett NC (2013) Plasticity and constraints on social evolution in African mole-rats: ultimate and proximate factors. Philos. T. R. Soc. B. 368:20120347
Article Google Scholar
Faulkes CG, Bennett NC, Bruford MW, O’brien HP, Aguilar GH, Jarvis JUM (1997) Ecological constraints drive social evolution in the African mole–rats. Proc R Soc B 264:1619–1627
Article CAS PubMed PubMed Central Google Scholar
Faulkes CG, Verheyen E, Verheyen W, Jarvis JUM, Bennett NC (2004) Phylogeographical patterns of genetic divergence and speciation in African mole-rats (Family: Bathyergidae). Mol Ecol 13:613–629
Article CAS PubMed Google Scholar
Felsenstein J (1985) Phylogenies and the comparative method. Am Nat 125:1–15
Article Google Scholar
Felsenstein J (2005) Using the quantitative genetic threshold model for inferences between and within species. Philos Trans R Soc B 360:1427–1434
Article Google Scholar
Felsenstein J (2012) A comparative method for both discrete and continuous characters using the threshold model. Am Nat 179:145–156
Article PubMed Google Scholar
Garamszegi LZ (2014) Modern phylogenetic comparative methods and their application in evolutionary biology: concepts and practice. Springer, Heidelberg
Book Google Scholar
Hadfield JD (2015) Increasing the efficiency of MCMC for hierarchical phylogenetic models of categorical traits using reduced mixed models. Methods Ecol Evol 6(6):706–714
Article Google Scholar
Hadfield JD, Nakagawa S (2009) General quantitative methods for comparative biology: phylogenies, taxonomies and multi-trait models for continuous and categorical characters. J Evol Biol 23:494–508
Article Google Scholar
Harano T, Kutsukake N (2018) Directional selection in the evolution of elongated upper canines in clouded leopards and sabre-toothed cats. J Evol Biol 31:1268–1283
Article PubMed Google Scholar
Harvey PH, Pagel MD (1991) The comparative method in evolutionary biology. Oxford University Press, Oxford
Google Scholar
Ives AR, Garland T (2010) Phylogenetic logistic regression for binary dependent variables. Syst Biol 59:9–26
Article PubMed Google Scholar
Ives AR, Garland T (2014) Phylogenetic regression for binary dependent variables. In: Garamszegi LZ (ed) Modern phylogenetic comparative methods and their application in evolutionary biology: concepts and practice. Springer, Heidelberg, pp 231–261
Chapter Google Scholar
Janzen T, Hoehna S, Etienne RS (2015) Approximate Bayesian computation of diversification rates from molecular phylogenies: introducing a new efficient summary statistic, the nLTT. Methods Ecol Evol 6:566–575
Article Google Scholar
Jarvis JUM (1981) Eusociality in a mammal: cooperative breeding in naked mole-rat colonies. Science 212:571–573
Article CAS PubMed Google Scholar
Jarvis JUM, Bennett NC (1993) Eusociality has evolved independently in two genera of bathyergid mole-rats—but occurs in no other subterranean mammal. Behav Ecol Sociobiol 33:253–260
Article Google Scholar
Jones K, Bielby J, Cardillo M, Fritz S (2009) PanTHERIA: a species-level database of life history, ecology, and geography of extant and recently extinct mammals. Ecology 90:2648
Article Google Scholar
Kutsukake N, Innan H (2013) Simulation-based likelihood approach for evolutionary models of phenotypic traits on phylogeny. Evolution 67:355–367
Article PubMed Google Scholar
Kutsukake N, Innan H (2014) Detecting phenotypic selection by approximate Bayesian computation (ABC) in phylogenetic comparative methods. In: Garamszegi LZ (ed) Modern phylogenetic comparative methods and their application in evolutionary biology: concepts and practice. Springer, Heidelberg, pp 409–424
Chapter Google Scholar
Leuenberger C, Wegmann D (2010) Bayesian computation and model selection without likelihoods. Genetics 184:243–252
Article PubMed PubMed Central Google Scholar
Lewis PO (2001) A likelihood approach to estimating phylogeny from discrete morphological character data. Syst Biol 50:913–925
Article CAS PubMed Google Scholar
Marjoram P, Tavare S (2006) Modern computational approaches for analysing molecular genetic variation data. Nat Genet Rev 7:759–770
Article CAS Google Scholar
Marjoram P, Molitor J, Plagnol V, Tavare S (2003) Markov chain Monte Carlo without likelihoods. Proc Natl Acad Sci USA 100:15324–15328
Article CAS PubMed PubMed Central Google Scholar
Nunn CL (2011) The comparative approach in evolutionary anthropology and biology. University of Chicago Press, Chicago
Book Google Scholar
Pagel M (1994) Detecting correlated evolution on phylogenies: a general method for the comparative analysis of discrete characters. Proc R Soc B 255:37–45
Article Google Scholar
R Core Team (2017) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
Google Scholar
Sheehan MJ, Botero CA, Hendry TA, Sedio BE, Jandt JM, Weiner S, Toth AL, Tibbetts EA (2015) Different axes of environmental variation explain the presence vs. extent of cooperative nest founding associations in Polistes paper wasps. Ecol Lett 18:1057–1067
Article PubMed PubMed Central Google Scholar
Sherman PW, Jarvis JUM, Alexander RD (1991) The biology of the naked mole-rat. Princeton University Press, Princeton
Google Scholar
Shultz S, Opie C, Atkinson QD (2011) Stepwise evolution of stable sociality in primates. Nature 479:219–222
Article CAS PubMed Google Scholar
Sichilima AM, Faulkes CG, Bennett NC (2008) Field evidence for a seasonality of reproduction and colony size in the Afrotropical giant mole-rat Fukomys mechowii (Rodentia: Bathyergidae). Afr Zool 43:144–149
Article Google Scholar
Sichilima AM, Bennett NC, Faulkes CG, Bronner GN (2011) Field evidence for colony size and aseasonality of breeding and in Ansell’s mole-rat, Fukomys anselli (Rodentia: Bathyergidae). Afr Zool 46:334–339
Google Scholar
Slater GJ, Harmon LJ, Wegmann D, Joyce P, Revell LJ, Alfaro ME (2012) Fitting models of continuous trait evolution to incompletely sampled comparative data using approximate Bayesian computation. Evolution 66:752–762
Article PubMed Google Scholar
Van Daele PA, Blonde P, Stjernstedt R, Adriaens D (2013) A new species of African mole-rat (Fukomys, Bathyergidae, Rodentia) from the Zaire-Zambezi watershed. Zootaxa 3636:171–189
Article PubMed Google Scholar
Wcislo WT, Danforth BN (1997) Secondarily solitary: the evolutionary loss of social behavior. Trends Ecol Evol 12:468–474
Article CAS PubMed Google Scholar
Wright S (1934) An analysis of variablity in number of digits in an inbred strain of guniea pigs. Genetics 19(6):506–536
Article CAS PubMed PubMed Central Google Scholar
Young AJ, Jarvis JUM, Barnaville J, Bennett NC (2015) Workforce effects and the evolution of complex society in wild Damaraland mole rats. Am Nat 186:302–311
Article PubMed Google Scholar

Download references

Acknowledgements

We would like to thank Dustin R. Rubenstein, Rafael Maia, and Margaret E. O’Brien at Columbia University and Jessica Zung at Princeton University for useful comments and advice. Masahito Tsuboi at University of Oslo also provided useful insights on the earlier version of manuscript. Two reviewers gave detailed and helpful comments on the manuscript. This study was financially supported by MEXT (No. 25711025) to K. N.

Author information

Authors and Affiliations

Department of Ecology and Evolutionary Biology, Princeton University, 106A Guyot Hall, Princeton, NJ, 08544, USA
Yuki Haba
Department of Evolutionary Studies of Biosystems, Sokendai, The Graduate University for Advanced Studies, Hayama, Kanagawa, 240-0193, Japan
Nobuyuki Kutsukake

Authors

Yuki Haba
View author publications
You can also search for this author in PubMed Google Scholar
Nobuyuki Kutsukake
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

YH and KN conceived, designed, and performed the analysis. Both discussed the result and wrote the final manuscript.

Corresponding author

Correspondence to Yuki Haba.

Ethics declarations

Conflict of interest

Both authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 458 kb)

Supplementary material 2 (DOCX 15 kb)

Rights and permissions

Reprints and permissions

About this article

Cite this article

Haba, Y., Kutsukake, N. A multivariate phylogenetic comparative method incorporating a flexible function between discrete and continuous traits. Evol Ecol 33, 751–768 (2019). https://doi.org/10.1007/s10682-019-10011-6

Download citation

Received: 02 June 2019
Accepted: 23 September 2019
Published: 08 October 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s10682-019-10011-6

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

A multivariate phylogenetic comparative method incorporating a flexible function between discrete and continuous traits

Abstract

Similar content being viewed by others

Working with the Tree of Life in Comparative Studies: How to Build and Tailor Phylogenies to Interspecific Datasets

Phylogenetic Prediction to Identify “Evolutionary Singularities”

Evolutionary Models and Phylogenetic Signal Assessment via Mantel Test

Introduction