Introduction

Vegetative propagation is widely used for large-scale Eucalyptus industrial plantations as it allows full capture of the additive, dominance, and epistatic effects underlying the traits of interest and deployment of the desired genotypes. The increase in genetic gain when the improvement program produces clonal varieties is well known in the Eucalyptus genus (Zobel 1993; Barbour and Butcher 1995; MacRae and Cotterill 1997; Hossain et al. 2004). This is a common strategy for deployment of tropical or sub-tropical Eucalyptus species such as Eucalyptus urophylla, Eucalyptus grandis, Eucalyptus camaldulensis, and their hybrids such as the widely used E. urophylla × E. grandis. The yearly production of several hundred million cuttings by the Eucalyptus forest industry to supply its planting programs calls for considerable investments in terms of manpower, nurseries, watering, and fertilization. In addition, the cost of the upstream breeding and selection programs providing the selected genotypes to be propagated considerably increases the final cost of the plantlets. One of the challenges of clonal forestry is therefore the suitability of the species for vegetative propagation (Kovacevic et al. 2008), which represents a fundamental selection criterion for clonal varieties. Vegetative propagation-related traits, i.e., stock plant production and survival, cutting survival during the rooting phase, duration of the rooting phase, and final rooting rate, are the major traits to consider for selection, and they condition the feasibility and cost of varietal deployment. Practitioners are well aware of between-species variability in vegetative propagation traits, even if it is poorly reported in the literature. Huge clone-to-clone variation in cutting success is experienced daily in nurseries (Martin and Quillet 1974; Mankessi et al. 2010) and limits the number of genotypes used on an industrial scale. 
Genetic control of vegetative propagation traits has been studied for a few industrial species such as Eucalyptus globulus (Borralho and Wilson 1994; Lemos et al. 1997), Eucalyptus nitens (Tibbits et al. 1997), and Eucalyptus sideroxylon (Burger 1987). Knowledge of the genetic to phenotypic variance ratio is of major importance in predicting future genetic gain and choosing optimum strategies for breeding and selection. Moderate to large estimates of narrow- and broad-sense heritability are needed for accurate selection of parents to be crossed and clones to be deployed on an industrial scale.

Capacity for vegetative propagation is simultaneously conditioned by the physiological age of plant donors (Davis 1988; Hackett 1988; Pierik 1990; Browne et al. 1997; Mankessi et al. 2009), by genetic predisposition to vegetative propagation (Rauter 1983; Radosta 1989; Radosta et al. 1994), and by environmental factors (Zsuffa 1968; Farmer et al. 1989; Radosta 1989; Kovacevic et al. 2004, 2005). The factors affecting rooting ability can be divided into exogenous and endogenous factors, which are liable to interact. The season (i.e., the combined effect of temperature, moisture, light availability…) is considered the most influential exogenous factor affecting rooting ability (Rauter 1983; Monteuuis et al. 1995; Teklehaimanot et al. 2004; Danthu et al. 2008). Endogenous factors include genetic identity (Shepherd et al. 2005) and the physiological state of the plant (Mankessi et al. 2009).

Although vegetative propagation by cuttings of E. urophylla × E. grandis is well documented (Martin and Quillet 1974; Chaperon and Quillet 1977; Saya et al. 2008), the magnitude of additive and non-additive genetic variation and the heritability of vegetative propagation ability are poorly understood in Eucalyptus. Furthermore, the genetic and environmental relationships between rooting ability and other growth and adaptive traits are poorly documented in Eucalyptus compared with other perennial angiosperms (Foster 1985; Paul et al. 1993; Radosta et al. 1994; Goldfarb et al. 1998; Foster et al. 2000; Baltunis et al. 2007). However, a better understanding of these correlations is crucial for a breeding strategy which deploys clones as varieties.

With the aim of improving our knowledge of the genetic and environmental determinants of propagation ability by cutting in Eucalyptus, a large experiment was undertaken using a full-sib progeny test of E. urophylla × E. grandis. Based on 83 elite full-sib families of E. urophylla × E. grandis, this study was undertaken to (i) assess the genetic control of propagation ability in the framework of intensively managed container-grown stock plants, (ii) assess the level of additive and non-additive genetic effects on shoot production and cutting success, and (iii) determine the correlation between the vegetative propagation traits and initial growth in the field. Data analysis received special attention because count and proportion variables such as those used in this study, which are common in genetic trials, need to be handled carefully when variance components and genetic parameters are crucial information. We therefore investigated the performance of the generalized linear mixed model (GLMM) and compared it with that of the classic linear mixed model (LMM).

Material and methods

Material and methodology applied

A 13 (female) × 11 (male) E. urophylla × E. grandis incomplete factorial mating design (Table 1) was used to produce 83 full-sib families by controlled pollination. One to 36 (i.e., nearly 25 on average) young, healthy, and vigorous seedlings per family, according to availability, were transplanted, generating 2,115 stock plants grown on a substrate of black sandy soil and charcoal (4:1 by volume) with 100 g of ammonium nitrate. Each container contained 4 dm³ of substrate. Twelve plants of the same full-sib family were planted in each container. Up to three containers per full-sib family were randomly distributed in the nursery area dedicated to the experiment.

Table 1 Pedigree and number of stock plants (full-sib individuals) for each crossing. The "9-101" code used for the male corresponds to E. grandis (9) and to the parent identity (101). The "14-109" code of the female corresponds to E. urophylla (14) and to the parent identity (109)

Stock plants were managed according to Saya et al. (2008) in order to stimulate the production of numerous actively growing axillary shoots with short internodes. After reaching 5–7 cm long, the stock plants produced shoots displaying chlorophyllian leaves and apex, in accordance with Saya et al. (2008). Approximately 7 to 8 days after their decapitation, the young plantlets started sprouting. The first cuttings were produced by the 24th day after decapitation. Cuttings were harvested every 7 days. Harvesting was spread over 1 year in three discontinuous periods. A preliminary period (day 30 to day 74) was a stock plant formation phase, during which preliminary observations were made and the collected cuttings were not transplanted. Thus, cutting success was not measured for this period. A 1st propagation period (day 120 to day 155, corresponding to July–August) was considered in this analysis. A 2nd propagation period (day 335 to day 356, corresponding to February–March) was also considered. Between the 1st and 2nd propagation periods, the shoots produced were cut at the same 7-day intervals. For the two periods considered, the percentage of cutting success was calculated at the end of each period and not at each harvest.

Two types of traits were taken into account considering absolute and relative variables. The absolute variables were recorded per plant donor and are defined as follows: the absolute stock plant productivity (PROD) (number of collected shoots per plant donor) and the absolute propagation by cutting success (CUT) (number of cuttings alive 3 months after their transplantation). Using these absolute variables, the relative propagation by cutting success was estimated as RCUT = CUT/PROD.
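The definition of RCUT can be sketched in a few lines of Python. This is only an illustration: the function name and the records are hypothetical, and returning a missing value when PROD = 0 is an assumption chosen to match the way CUT is treated as missing when no shoot was collected.

```python
from typing import Optional


def relative_cutting_success(prod: int, cut: int) -> Optional[float]:
    """Return RCUT = CUT / PROD for one stock plant.

    Returns None (missing) when no shoot was collected, because the
    ratio is undefined in that case (assumption of this sketch).
    """
    if prod < 0 or cut < 0 or cut > prod:
        raise ValueError("expected 0 <= CUT <= PROD")
    if prod == 0:
        return None
    return cut / prod


# Hypothetical (PROD, CUT) records for three stock plants
records = [(8, 5), (6, 6), (0, 0)]
rcut = [relative_cutting_success(p, c) for p, c in records]
```

Keeping the missing case explicit avoids silently turning unproductive stock plants into zero cutting success.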

After the nursery experiment, cuttings were transplanted to the field in order to analyze the trend in variance components with age for growth and adaptive traits. The trees were planted in a randomized block design with three replications (corresponding to three ramets per clone). Clones of the same full-sib family were planted in a 5 by 5 plot with a 3 × 4 m spacing. The trees were measured for height and circumference at 1.3 m at 25 months. These growth variables were combined with the propagation variable to perform a multivariable analysis with statistical model (1) described in the “Data analysis” section.

Data analysis

The LMM was used to analyze the data collected at the nursery and in the field. A first model was used to estimate the proportions of male, female, and male by female interaction variance components in the total genetic variance and their interaction with the two propagation periods. The model 1 is described by the following equation:

$$ \mathbf{y}=\mu {\mathbf{1}}_{n}+\mathbf{X}\mathbf{p}+{\mathbf{Z}}_{\mathrm{m}}\mathbf{m}+{\mathbf{Z}}_{\mathrm{f}}\mathbf{f}+{\mathbf{Z}}_{\mathrm{mf}}\mathbf{mf}+{\mathbf{Z}}_{\mathrm{pm}}\mathbf{pm}+{\mathbf{Z}}_{\mathrm{pf}}\mathbf{pf}+{\mathbf{Z}}_{\mathrm{pmf}}\mathbf{pmf}+\boldsymbol{\varepsilon} $$

where

y: is the vector of measurements for each stock plant for PROD and CUT, the individuals resulting from the cross of a male m and a female f

p: is the vector of fixed effects of propagation period

m: is the vector of random effects of males, m ~ N(0, \( \sigma_{\mathrm{m}}^2 \) Id), where 0 is the vector of null values and Id is the identity matrix

f: is the vector of random effects of females, f ~ N(0, \( \sigma_{\mathrm{f}}^2 \) Id)

mf: is the vector of random interaction effects between males and females, mf ~ N(0, \( \sigma_{\mathrm{mf}}^2 \) Id)

pm: is the vector of random interaction effects between the male and propagation period, pm ~ N(0, \( \sigma_{\mathrm{pm}}^2 \) Id)

pf: is the vector of random interaction effects between the female and propagation period, pf ~ N(0, \( \sigma_{\mathrm{pf}}^2 \) Id)

pmf: is the vector of random interaction effects between the family and propagation period, pmf ~ N(0, \( \sigma_{\mathrm{pmf}}^2 \) Id)

ε: is the vector of random residual errors, ε ~ N(0, \( \sigma_{\mathrm{e}}^2 \) Id).

The model (1) was implemented using the following types of variables: (i) response variable used for y is the original variable without any transformation, (ii) response variable used for y is transformed by different functions (log(y) for PROD and logit, log[(1 + y)/(1 − y)], for CUT). For CUT, using original and transformed data, a weighted linear mixed model was used, with the variable PROD as weight.

In order to complete these first statistical approaches, the GLMM was used (McCullagh and Nelder 1989; Bolker et al. 2008). The GLMM is an extension of the LMM to situations with a distribution other than normal, such as binomial and Poisson. It requires the specification of the distribution, together with a link function that connects the response to the explanatory variables of the linear model. For PROD, the distribution was Poisson and the link function was the natural logarithm. For CUT, the distribution was binomial and the link function was the logit. Thus, a realistic hypothesis for the PROD variable is the following:

$$ \begin{array}{l} Y\ \Big|\ \mathrm{random}\ \mathrm{effects}\sim \mathrm{Poisson}\left(\lambda \right)\hfill \\ {} \log \left(\boldsymbol{\lambda} \right)=\mu {\mathbf{1}}_{n}+\mathbf{X}\mathbf{p}+{\mathbf{Z}}_{\mathrm{m}}\mathbf{m}+{\mathbf{Z}}_{\mathrm{f}}\mathbf{f}+{\mathbf{Z}}_{\mathrm{mf}}\mathbf{mf}+{\mathbf{Z}}_{\mathrm{pm}}\mathbf{pm}+{\mathbf{Z}}_{\mathrm{pf}}\mathbf{pf}+{\mathbf{Z}}_{\mathrm{pmf}}\mathbf{pmf}\hfill \end{array} $$

with λ the Poisson parameter.

For the CUT proportion data, a realistic hypothesis is given by:

$$ Y\ \Big|\ \mathrm{random}\ \mathrm{effects}\sim \mathrm{Binomial}\left( n, p\right) $$

with

$$ \log \left(\frac{\boldsymbol{p}}{1-\boldsymbol{p}}\right)=\boldsymbol{\mu} {1}_{\mathrm{n}}+\mathrm{Xp}+{\boldsymbol{Z}}_{\mathrm{m}}\boldsymbol{m}+{\boldsymbol{Z}}_{\mathrm{f}}\boldsymbol{f}+{\boldsymbol{Z}}_{\mathrm{m}\mathrm{f}}\mathrm{mf}+{\boldsymbol{Z}}_{\mathrm{pm}}\mathrm{pm}+{\boldsymbol{Z}}_{\mathrm{pf}}\mathrm{pf}+{\boldsymbol{Z}}_{\mathrm{pm}\mathrm{f}}\mathrm{pm}\mathrm{f} $$

and p and n the parameters of the binomial distribution.
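To make the role of the two link functions concrete, the following minimal Python sketch maps a value of the linear predictor back to the scale of the data. The numerical effects are hypothetical and serve only to illustrate the log and logit links named above.

```python
import math


def inverse_log_link(eta: float) -> float:
    """Poisson mean (lambda) implied by a linear predictor on the log scale."""
    return math.exp(eta)


def inverse_logit_link(eta: float) -> float:
    """Binomial probability (p) implied by a linear predictor on the logit scale."""
    return 1.0 / (1.0 + math.exp(-eta))


def logit(p: float) -> float:
    """Logit link: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


# Hypothetical linear predictors: intercept + period + male + female effects
eta_prod = 1.9 + (-0.2) + 0.15 + 0.05   # log scale, PROD
eta_cut = 0.8 + 0.3 + (-0.1) + 0.05     # logit scale, CUT

expected_shoots = inverse_log_link(eta_prod)        # about 6.7 shoots
success_probability = inverse_logit_link(eta_cut)   # about 0.74
```

Because the random effects are additive on the link scale, they act multiplicatively on the expected number of shoots and non-linearly on the probability of cutting success.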

The use of GLMM needs to consider the dispersion parameter φ which is related to the variance of the distribution. In this study, our estimations were done by fixing φ = 1 (i.e., no overdispersion) after preliminary analyses showing that the GLMM did not lead to overdispersion with our data.
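The text does not state how overdispersion was assessed. One common diagnostic, given here purely as an assumption about what such a preliminary check could look like, is the Pearson chi-square statistic divided by the residual degrees of freedom; values close to one are compatible with φ = 1. The counts, fitted means, and number of parameters below are hypothetical.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square / residual degrees of freedom for a Poisson-type
    model, where the variance is assumed equal to the fitted mean."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    df = len(observed) - n_params
    if df <= 0:
        raise ValueError("not enough observations for the number of parameters")
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / df


# Hypothetical shoot counts and fitted means from a Poisson model
counts = [2, 9, 5, 11, 4, 10]
fitted = [4.0, 6.0, 5.0, 8.0, 7.0, 9.0]
phi_hat = pearson_dispersion(counts, fitted, n_params=2)
```

In a mixed model the fitted means would be conditional on the predicted random effects, which this sketch does not compute.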

A second mixed model, adapted from the animal model (Mrode and Thompson 2005; Piepho et al. 2008) was used to estimate correlations and to compare the selection accuracy.

$$ \mathbf{y}=\mu {\mathbf{1}}_{n}+\mathbf{X}\mathbf{p}+{\mathbf{Z}}_{\mathrm{u}}\mathbf{u}+{\mathbf{Z}}_{\mathrm{fam}}\mathbf{fam}+{\mathbf{Z}}_{\mathrm{pu}}\mathbf{pu}+{\mathbf{Z}}_{\mathrm{pfam}}\mathbf{pfam}+\boldsymbol{\varepsilon} $$

where p is the vector of fixed effects of propagation period; u is the vector of random additive effects, u ~ N(0, \( \sigma_{\mathrm{a}}^2 \) A), with A the relationship matrix among individuals computed from the pedigree; fam is the vector of random family effects not explained by the additive effects, fam ~ N(0, \( \sigma_{\mathrm{f}}^2 \) Id); pu is the vector of random interaction effects between the additive effect u and propagation period, pu ~ N(0, \( \sigma_{\mathrm{pa}}^2 \) Id); pfam is the vector of random interaction effects between the family and the propagation period, pfam ~ N(0, \( \sigma_{\mathrm{pf}}^2 \) Id); and ε is the vector of random residual errors, ε ~ N(0, \( \sigma_{\mathrm{e}}^2 \) Id). Variance components were estimated using ASReml version 3 (Gilmour et al. 2006) for both models 1 and 2.

Variance components, heritability, and correlations

The relation between variance components and the classic model of quantitative genetics was used to calculate the following variances (Gallais 1990):

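\( \sigma_{\mathrm{Am}}^2 = 4 \times \sigma_{\mathrm{m}}^2 \): is the additive variance due to the male effect

\( \sigma_{\mathrm{Af}}^2 = 4 \times \sigma_{\mathrm{f}}^2 \): is the additive variance due to the female effect

\( \sigma_{\mathrm{D}}^2 = 4 \times \sigma_{\mathrm{mf}}^2 \): is the dominance variance of the hybrid population

\( \sigma_{\mathrm{G}}^2 = \frac{1}{2}\left(\sigma_{\mathrm{Am}}^2 + \sigma_{\mathrm{Af}}^2\right) + \sigma_{\mathrm{D}}^2 \): is the total genetic variance of the hybrid population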
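These relations translate directly into code. The sketch below is illustrative only: the function name and input values are hypothetical, and the additive variance of the hybrid population is taken to be the ½(σ²Am + σ²Af) term of the total genetic variance, which is an interpretation of the definitions above rather than a statement from the text.

```python
def genetic_variances(var_m: float, var_f: float, var_mf: float) -> dict:
    """Convert male, female, and male-by-female variance components into
    additive, dominance, and total genetic variances."""
    var_am = 4.0 * var_m            # additive variance due to males
    var_af = 4.0 * var_f            # additive variance due to females
    var_a = 0.5 * (var_am + var_af) # additive variance (assumed reading)
    var_d = 4.0 * var_mf            # dominance variance
    var_g = var_a + var_d           # total genetic variance
    return {"Am": var_am, "Af": var_af, "A": var_a, "D": var_d, "G": var_g}


# Hypothetical variance components
result = genetic_variances(var_m=0.02, var_f=0.01, var_mf=0.015)
```

Under this reading, the male component σ²m = 0.136 reported in the Results for the GLMM, combined with a negligible female component (an assumption), gives an additive variance of about 0.272, the value quoted there.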

For the model 1 using the original and the transformed variable and the LMM, the narrow- (ss) and broad- (sl) sense heritabilities were calculated using the classic formulas.

Narrow-sense heritability was given by:

$$ {h}_{\mathrm{ss}}^2=\frac{\sigma_{\mathrm{A}}^2}{\sigma_{\mathrm{m}}^2+\sigma_{\mathrm{f}}^2+\sigma_{\mathrm{mf}}^2+\sigma_{\mathrm{pm}}^2+\sigma_{\mathrm{pf}}^2+\sigma_{\mathrm{pmf}}^2+\sigma_{\mathrm{e}}^2} $$
(1)

Broad-sense heritability was given by:

$$ {H}_{\mathrm{sl}}^2=\frac{\sigma_{\mathrm{A}}^2+\sigma_{\mathrm{D}}^2}{\sigma_{\mathrm{m}}^2+\sigma_{\mathrm{f}}^2+\sigma_{\mathrm{mf}}^2+\sigma_{\mathrm{pm}}^2+\sigma_{\mathrm{pf}}^2+\sigma_{\mathrm{pmf}}^2+\sigma_{\mathrm{e}}^2} $$
(2)

Using the GLMM approach, the heritability calculation depends on the type of variable. The heritability based on the link function (Nakagawa and Schielzeth 2010) was considered.

For PROD analysed with a GLMM with a Poisson distribution and the log link, narrow- (\( h_{\mathrm{ss}}^2 \)) and broad- (\( H_{\mathrm{sl}}^2 \)) sense heritabilities were then defined using the two following equations:

$$ {h}_{\mathrm{ss},\log}^2=\frac{\sigma_{\mathrm{A}}^2}{\sigma_{\mathrm{m}}^2+\sigma_{\mathrm{f}}^2+\sigma_{\mathrm{mf}}^2+\sigma_{\mathrm{pm}}^2+\sigma_{\mathrm{pf}}^2+\sigma_{\mathrm{pmf}}^2+\varphi \ln \left(\frac{1}{{\overline{y}}_{\mathrm{g}}}+1\right)} $$
(3)
$$ {H}_{\mathrm{sl},\log}^2=\frac{\sigma_{\mathrm{A}}^2+\sigma_{\mathrm{D}}^2}{\sigma_{\mathrm{m}}^2+\sigma_{\mathrm{f}}^2+\sigma_{\mathrm{mf}}^2+\sigma_{\mathrm{pm}}^2+\sigma_{\mathrm{pf}}^2+\sigma_{\mathrm{pmf}}^2+\varphi \ln \left(\frac{1}{{\overline{y}}_{\mathrm{g}}}+1\right)} $$
(4)

with φ being the dispersion parameter and \( {\overline{y}}_{\mathrm{g}} \) the geometric mean of y.

For CUT, a GLMM was considered assuming a binomial distribution and the logit link, and the narrow- and broad-sense heritabilities were calculated as follows:

$$ {h}_{\mathrm{ss},\mathrm{logit}}^2=\frac{\sigma_{\mathrm{A}}^2}{\sigma_{\mathrm{m}}^2+\sigma_{\mathrm{f}}^2+\sigma_{\mathrm{mf}}^2+\sigma_{\mathrm{pm}}^2+\sigma_{\mathrm{pf}}^2+\sigma_{\mathrm{pmf}}^2+\varphi\ {\pi}^2/3} $$
(5)
$$ {H}_{\mathrm{sl},\mathrm{logit}}^2=\frac{\sigma_{\mathrm{A}}^2+\sigma_{\mathrm{D}}^2}{\sigma_{\mathrm{m}}^2+\sigma_{\mathrm{f}}^2+\sigma_{\mathrm{mf}}^2+\sigma_{\mathrm{pm}}^2+\sigma_{\mathrm{pf}}^2+\sigma_{\mathrm{pmf}}^2+\varphi\ {\pi}^2/3} $$
(6)

With φ the dispersion parameter.
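Equations (1) to (6) differ only in the residual term of the denominator: the residual variance for the LMM, φ ln(1/ȳg + 1) for the Poisson model with log link, and φ π²/3 for the binomial model with logit link. The following Python sketch makes this explicit. All names and numerical values are hypothetical and chosen only for illustration.

```python
import math


def residual_lmm(var_e: float) -> float:
    """Residual term for the LMM, Eqs. (1) and (2)."""
    return var_e


def residual_poisson_log(geo_mean_y: float, phi: float = 1.0) -> float:
    """Residual term on the log link scale, Eqs. (3) and (4)."""
    return phi * math.log(1.0 / geo_mean_y + 1.0)


def residual_binomial_logit(phi: float = 1.0) -> float:
    """Residual term on the logit link scale, Eqs. (5) and (6)."""
    return phi * math.pi ** 2 / 3.0


def heritabilities(var_a, var_d, random_components, residual_term):
    """Return (narrow-sense, broad-sense) heritability.

    random_components holds the m, f, mf, pm, pf, and pmf variances.
    """
    denominator = sum(random_components) + residual_term
    return var_a / denominator, (var_a + var_d) / denominator


# Hypothetical variance components (m, f, mf, pm, pf, pmf)
components = [0.02, 0.01, 0.015, 0.005, 0.005, 0.01]
var_a, var_d = 0.06, 0.06

h2_lmm, H2_lmm = heritabilities(var_a, var_d, components, residual_lmm(1.0))
h2_log, H2_log = heritabilities(
    var_a, var_d, components, residual_poisson_log(geo_mean_y=7.0))
h2_logit, H2_logit = heritabilities(
    var_a, var_d, components, residual_binomial_logit())
```

With identical variance components, the three residual terms give very different heritabilities, which shows how strongly the value depends on the scale on which it is expressed.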

The additive (\( \rho_{\mathrm{A}} \)), dominance (\( \rho_{\mathrm{D}} \)), total genetic (\( \rho_{\mathrm{G}} \)), and environmental (\( \rho_{\mathrm{E}} \)) correlations between two traits (x and y) were estimated from a bivariate analysis using the individual model 2. The vectors for the two traits were stacked and a covariance structure between the vectors of random effects was declared, which made it possible to estimate the correlation coefficients.

As for the univariate analysis, \( \sigma_{\mathrm{x}}^2 \) and \( \sigma_{\mathrm{y}}^2 \) represent the variances of traits “x” and “y”, respectively, and cov(x, y) the covariance between traits “x” and “y”. The coefficients of correlation were estimated as follows:

$$ {\rho}_{\mathrm{A}}=\frac{{\mathrm{Cov}}_{\mathrm{A}}\left( x, y\right)}{\sigma_{\mathrm{A}\mathrm{x}}.{\sigma}_{\mathrm{A}\mathrm{y}}} $$
(7)
$$ {\rho}_{\mathrm{D}}=\frac{{\mathrm{Cov}}_{\mathrm{D}}\left( x, y\right)}{\sigma_{\mathrm{D}\mathrm{x}}.{\sigma}_{\mathrm{D}\mathrm{y}}} $$
(8)
$$ {\rho}_{\mathrm{G}}=\frac{{\mathrm{Cov}}_{\mathrm{G}}\left( x, y\right)}{\sigma_{\mathrm{G}\mathrm{x}}.{\sigma}_{\mathrm{G}\mathrm{y}}} $$
(9)
$$ {\rho}_{\mathrm{e}}=\frac{{\mathrm{Cov}}_{\mathrm{e}}\left( x, y\right)}{\sigma_{\mathrm{e}\mathrm{x}}.{\sigma}_{\mathrm{e}\mathrm{y}}} $$
(10)
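Equations (7) to (10) share the same form, so a single helper is enough to illustrate them. The sketch below is hypothetical: in practice the variances and covariances come from the bivariate mixed-model fit, whereas the values used here are invented.

```python
import math


def correlation(cov_xy: float, var_x: float, var_y: float) -> float:
    """Correlation between traits x and y for one effect (additive,
    dominance, total genetic, or residual): cov / (sd_x * sd_y)."""
    if var_x <= 0.0 or var_y <= 0.0:
        raise ValueError("variances must be strictly positive")
    return cov_xy / math.sqrt(var_x * var_y)


# Hypothetical additive covariance and variances for two traits
rho_a = correlation(cov_xy=0.12, var_x=0.30, var_y=0.20)
```

A variance estimated at or near zero makes the ratio unstable, which is one reason such correlations can carry large standard errors.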

All the variances and co-variances associated with random effects were estimated by the restricted maximum likelihood (REML method) using ASReml version 3 (Gilmour et al. 2006). Standard errors of variances, heritabilities, and correlations were calculated with ASReml using a standard Taylor series approximation (Gilmour et al. 2006).

Comparison of selection accuracy

The impact of the different approaches on the selection accuracy was studied by comparison of ranking of the predicted genetic values and the prediction accuracy. As the populations of females and males (13 and 11 individuals, respectively) were too small to compare the ranking, this analysis was conducted with the hybrid population of stock plants (2,115 individuals) using the model 2.

The accuracy of the selection, r, generated by the three data analysis approaches (original variable with LMM, transformed variable with LMM, GLMM with the dispersion parameter fixed at one) and model 2 was assessed by averaging, over all individuals, the accuracy \( r_{\mathrm{i}} \) of the individual predicted breeding values calculated with Eq. (11).

$$ {r}_{\mathrm{i}}=\sqrt{1-\frac{s_{\mathrm{i}}^2}{\left(1+{f}_{\mathrm{i}}\right){\sigma}_{\mathrm{a}}^2}} $$
(11)

where \( s_{\mathrm{i}}^2 \) is the ‘prediction error variance’ of the predicted breeding values (Gilmour et al. 1995), \( f_{\mathrm{i}} \) is the inbreeding coefficient of the ith individual, and \( \sigma_{\mathrm{a}}^2 \) is the additive variance; \( s_{\mathrm{i}}^2 \) and \( f_{\mathrm{i}} \) were estimated with ASReml version 3 (Gilmour et al. 2006).
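Equation (11) and the averaging step can be sketched as follows. The prediction error variances, inbreeding coefficients, and additive variance are hypothetical, and clamping the accuracy at zero when the prediction error variance exceeds (1 + f) times the additive variance is an assumption of this sketch, not something stated in the text.

```python
import math


def individual_accuracy(pev: float, var_a: float, inbreeding: float = 0.0) -> float:
    """Accuracy of one predicted breeding value, Eq. (11)."""
    ratio = pev / ((1.0 + inbreeding) * var_a)
    # Assumption: clamp at zero to avoid the square root of a negative number
    return math.sqrt(max(0.0, 1.0 - ratio))


def mean_accuracy(pevs, var_a, inbreedings):
    """Average of the individual accuracies over all individuals."""
    values = [individual_accuracy(s2, var_a, f) for s2, f in zip(pevs, inbreedings)]
    return sum(values) / len(values)


# Hypothetical prediction error variances for three non-inbred individuals
r = mean_accuracy(pevs=[0.15, 0.20, 0.10], var_a=0.30, inbreedings=[0.0, 0.0, 0.0])
```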

The comparison of ranking was done by estimating the Spearman coefficient of correlation among the different prediction approaches.
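The rank comparison can be illustrated with a small, dependency-free implementation of the Spearman coefficient (Pearson correlation of the ranks, with average ranks for ties). The predicted values below are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation between two sets of predicted values."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical predicted genetic values for five individuals, two approaches
pred_lmm = [0.8, -0.2, 0.5, 1.4, -1.0]
pred_glmm = [0.30, -0.05, 0.35, 0.60, -0.40]
rho = spearman(pred_lmm, pred_glmm)
```

Only two individuals swap ranks between the two hypothetical approaches, so the coefficient stays high; a value of one would mean identical rankings and therefore identical selection decisions.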

Results

Trends in propagation ability with stock plant age

The evolution of cutting production can be characterized by a first phase corresponding to the entry into production, which is not considered in this analysis. This phase was characterized by increased shoot production. After this first phase, we observed near-stabilization of PROD and CUT between the two propagation periods (Table 2), but some variables presented significant differences. The average of the two propagation periods was around PROD = 7 for the number of shoots produced per stock plant and CUT = 6 for the number of cuttings. These absolute values correspond to the mean percentage RCUT = 69 %.

Table 2 Main statistics related to the propagation traits according to the two periods of propagation and to the growth traits of the cuttings planted in the field experimental design

The differences between the two periods for PROD (from 8.71 to 6.03) and for the relative cutting success RCUT (from 64.4 to 74.9 %) were significant at p = 0.05, demonstrating a propagation period effect for these variables. The coefficient of variation (CV) was high for PROD and increased during the propagation process from 55 to 100 %. The same trend was observed for CUT, with an increase from 72 to 95 %. These high CV values are explained by the highly skewed distribution of these variables. For RCUT, the CV was stable at around 30 %. For growth variables, it varied between 34 and 41 %.

Comparison of variance components and their ratio

Phenotypic variance of the different traits was partitioned into male, female, and male-female interaction, their interaction with the propagation phase and the residual variance using the model (1). (All the estimates of the variances and their standard error are given in Tables 3 and 4.)

Table 3 Estimates of variance components and genetic parameters for PROD with the parent model (model 1)
Table 4 Estimates of variance components and genetic parameters for CUT with the parent model (model 1)

As expected, due to the different scales used to define the variables, the genetic variance components, whatever the random effect, varied with the type of model (LMM or GLMM) and the variable transformation (Table 3). For the shoot production PROD, for example, \( \sigma_{\mathrm{m}}^2 \) = 2.93 with LMM, \( \sigma_{\mathrm{m}}^2 \) = 0.078 with LMM and the transformed variable LOGPROD, and \( \sigma_{\mathrm{m}}^2 \) = 0.136 with GLMM and φ = 1. Whatever the type of variable and the model, the residual variance \( \sigma_{\mathrm{e}}^2 \) was preponderant, showing the strong environmental effect in the propagation process (assuming no epistatic effects). Variance due to the female was smaller than variance due to the male or to the male-female interaction. The variances related to the period by genetic effect interactions were preponderant for some genetic effects such as the female and the male-female interaction. As a result, the estimated additive and dominance variances varied with variable transformation and models, for example \( \sigma_{\mathrm{A}}^2 \) = 5.863 with LMM, \( \sigma_{\mathrm{A}}^2 \) = 0.186 with LMM and LOGPROD, and \( \sigma_{\mathrm{A}}^2 \) = 0.272 with GLMM and φ = 1. Although the variance component estimates varied according to the model and variable transformation, the ratio of dominance to additive genetic variance varied to a lesser extent and remained close to one, ranging from \( \sigma_{\mathrm{D}}^2/\sigma_{\mathrm{A}}^2 \) = 1.435 with LMM without transformation to \( \sigma_{\mathrm{D}}^2/\sigma_{\mathrm{A}}^2 \) = 1.068 with GLMM and φ = 1 (Table 3). Narrow- and broad-sense heritabilities varied according to the model. For example, \( h_{\mathrm{ss}}^2 \) = 0.182 and \( H_{\mathrm{sl}}^2 \) = 0.443 with LMM and the non-transformed variables, but \( h_{\mathrm{ss}}^2 \) = 0.431 and \( H_{\mathrm{sl}}^2 \) = 0.839 with GLMM and the log link (Eqs. 3 and 4) (Table 3). As expected, broad-sense heritability values were higher than narrow-sense values, but both showed a similar trend according to the model and the equation used (Table 3).

Similar results were observed with the CUT variable (or RCUT) (Table 4): the variance component estimates varied with the model and the variable transformation, the residual variance was preponderant, the male variance was higher than the female variance, and the dominance to additive variance ratio varied between 0.907 and 1.571. As for PROD, estimated narrow- and broad-sense heritabilities for CUT were smaller with LMM (\( h_{\mathrm{ss}}^2 \) = 0.094 and \( H_{\mathrm{sl}}^2 \) = 0.180) than with GLMM (\( h_{\mathrm{ss}}^2 \) = 0.199 and \( H_{\mathrm{sl}}^2 \) = 0.421). They were smaller than for PROD with both LMM and GLMM.

Genetic and environmental correlations

To avoid bias due to autocorrelation between the number of shoots and the number of cuttings, the correlations were calculated between the number of shoots (PROD) and the percentage of cutting success (RCUT). For original variables (PROD with RCUT) and transformed variables (logPROD and logitRCUT), the genetic correlations were higher than 0.5: \( \rho_{\mathrm{A}} \) = 0.98 and \( \rho_{\mathrm{A}} \) = 0.96, respectively, for additive, \( \rho_{\mathrm{D}} \) = 0.71 and \( \rho_{\mathrm{D}} \) = 0.60 for dominance, and \( \rho_{\mathrm{G}} \) = 0.69 and \( \rho_{\mathrm{G}} \) = 0.57 for total genetic effects, showing the strong genetic relationship between the two traits. The residual correlations \( \rho_{\mathrm{e}} \) were small, ranging from −0.010 to 0.15, showing that environmental effects on the two traits were almost independent.

The magnitude of the correlations between shoot production (PROD) and field growth (height and circumference at 25 months) differed according to the genetic effect, but was generally low to moderate, between −0.5 and 0.5, and a very similar pattern was observed for LOGPROD (Table 5). The environmental correlation (\( \rho_{\mathrm{e}} \)) was low. The additive correlation (\( \rho_{\mathrm{A}} \)) was negative, but its standard error was high, showing that this parameter was very poorly estimated. The dominance correlation (\( \rho_{\mathrm{D}} \)) was positive and the total genetic correlation (\( \rho_{\mathrm{G}} \)) was rather small. For RCUT and LOGITCUT, the correlations were generally smaller than for PROD (Table 5) and a large standard error was observed for the additive effect.

Table 5 Genetic correlation estimates between propagation variables and juvenile growth of the cutting in the field trial. The environmental correlation (ρ E), the additive correlation (ρ A), the dominance correlation (ρ D), and the total genetic correlation (ρ G)

Impact of variable transformation and models on selection accuracy

The impact of the three approaches (LMM with original variable, LMM with transformed variable, and GLMM) on the genetic parameters, using the individual model with pedigree (model 2), is illustrated in Tables 6 and 7. For PROD (Table 6), the results are consistent with those of the family model (model 1). For CUT (Table 7), the additive variance estimates were smaller and the dominance variance estimates were very high (and as a consequence the \( \sigma_{\mathrm{D}}^2/\sigma_{\mathrm{A}}^2 \) ratio was also very high). This may be caused by the very unbalanced design for this variable due to the numerous missing data in the second period (when PROD = 0, CUT is considered as missing).

Table 6 Estimates of variance components and genetic parameters with individual model (2) for the PROD variable
Table 7 Estimates of variance components and genetic parameters with individual model (model 2) for the CUT variable

In our results, a higher accuracy corresponded to a higher narrow-sense heritability, which is expected when using the same LMM (Mrode and Thompson 2005). For example, the PROD variable, which showed a higher accuracy than CUT, also presented a higher narrow-sense heritability (Tables 6 and 7). For PROD, accuracy was higher using LMM with the original and transformed variables (r = 0.69 and r = 0.74, respectively) than with GLMM (r = 0.46), which also showed a smaller narrow-sense heritability (Table 6). For CUT, the differences between the three approaches were small (varying from r = 0.32 to r = 0.35) and the narrow-sense heritabilities (\( h_{\mathrm{ss}}^2 \)) were close (Table 7). These results are illustrated by Fig. 1 and by significant rank correlations between the methods for predicted values, which were higher for CUT (varying from 0.920 to 0.998) than for PROD (varying from 0.752 to 0.944).

Fig. 1 Correlation between the predicted values obtained with the three different approaches and model (2). X axis: LMM, predicted values assessed with linear mixed model and original variable; LMM-LOG and LMM-LOGIT, predicted values assessed with linear mixed model and transformed variables. Y axis: GLMM DISP = 1, predicted values assessed with generalized linear mixed model and dispersion parameter equal to one; LMM-LOG, predicted values assessed with linear mixed model and transformed variables

However, comparisons between LMM and GLMM must be made with caution: accuracy and heritability rest on completely different assumptions about the error structure, and the two models estimate \( \sigma_{\mathrm{A}}^2 \) on different scales because the GLMM relies on an internal transformation (the log or logit link).

Discussion

Trends in propagation ability with stock plant age

Cutting mortality was much higher in the dry season (36 %, first propagation period) than in the rainy season (25 %, second propagation period). Correspondingly, we observed an increase in cutting success from the first to the second period (64 and 75 %, respectively). As noted above, these two periods corresponded to the two main climatic seasons in southern Congo. Environmental conditions (temperature, light, air moisture) are more suitable for propagation in the second period (rainy season) (Mankessi et al. 2011), and this may explain the greater cutting success. It is well known that one of the most marked environmental effects on shoot production and cutting success is season (light, temperature, moisture, fungus attack, etc.). Its effect has been shown for various plants (Rauter 1983; Monteuuis et al. 1995; Teklehaimanot et al. 2004; Bhardwaj and Mishra 2005; Danthu et al. 2008). The improvement of CUT and RCUT during the second propagation period is due in part to the season effect, but season is not the only cause of variation in shoot production and cutting success. Other causes include environmental risks, physiological age, and a potential operator effect.

Before the first propagation period, the stock plants were attacked by Leptocybe invasa (Fisher & La Salle, Hymenoptera: Eulophidae), which had the direct effect of inducing a high number of poorly developed cuttings with weak rooting ability. As the prevalence of L. invasa decreased, the quality of cutting production improved. During the second propagation period, shoots therefore became more suitable and the operator had a broader range of choice, thus introducing a selection factor, which may explain the increased cutting success during this period.

The physiological age effect can have a negative impact on cutting success. Maturation of plant material (aging of the meristem) leads to a reduction in rooting ability (Foster et al. 1981; Bonga 1982; Marino 1982; Wareing 1987; Hackett 1985, 1988; Greenwood and Hutchison 1993; Poupard et al. 1994; Hamann 1995; Ruaud et al. 1999). For example, Marino (1982) has shown with southern pines that rooting remains active during the first 3 years for donor plants grown in the ground. In the case of extensive stock plants of E. urophylla × E. grandis in humid tropical conditions, the length of this active period remains unknown. Cutting success improves with the age of stock plants when shoots are collected during the first 2 years (Mankessi et al. 2011). This was the case in our experiment, which explains why older stock plants combined with a more favorable season (rainy season: better conditions in terms of light and temperature) led to a higher percentage of cutting success.

Impact of variable transformation on variance component, heritability, and selection accuracy

The transformation of non-Gaussian data was used to conform to the LMM, which assumes that both random effects and residuals are normally distributed. We used an a priori transformation for the count variable (LOG) and the proportion variable (LOGIT). Examination of the distributions of the residuals showed that the LOGIT transformation (CUT) improved the Gaussian distribution and independence of residuals, whereas the impact of the logarithmic transformation was not noticeable for PROD (results not shown). These transformations led to different estimates of variance components and ratios (Tables 3, 4, 6, and 7). This result was expected because the parameters were estimated on the transformed scale and not on the original scale. Differences between estimated heritability on the original and transformed scales have been reported before (Browne et al. 2005; Carrasco and Jover 2005). The impact of variable transformation on selection accuracy was relatively limited.

Impact of statistical model on variance component, heritability, and selection accuracy

In this study, we used the variables (CUT and PROD) to understand the genetic basis of propagation ability. Using such variables in breeding programs has required the development of adequate statistical methods for the estimation of parameters and the prediction of breeding values because these variables do not follow a normal distribution (Garcia et al. 2012).

A first possibility consists of transforming the non-Gaussian data and using the LMM. Although this approach might be appropriate, it seems more pertinent to use the raw distribution of the data (Bolker et al. 2008; Wittenburg et al. 2008). GLMMs (Kachman 2007; Isik 2011; Sun 2011; Che and Xu 2012) are an extension of generalized linear models that allow the prediction of random effects. They can be used to estimate the heritabilities on the original and transformed (latent) scales and are based on the real distribution of the data.

Our results showed marked differences in variance components and heritabilities estimated using either LMM or GLMM (see Tables 3, 4, 6, and 7). This result was expected as the estimations are not based on the same scale. These observations highlight the difficulty of choosing a particular heritability expression. The interpretation and choice of these heritabilities are not trivial and are not established; more research is needed to fully exploit the potential of this kind of method (Nakagawa and Schielzeth 2010).
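To illustrate why the scale matters, the sketch below computes a latent-scale heritability for a binomial trait modeled with a logit link, where the residual variance on the latent scale is commonly taken as π²/3, the variance of the standard logistic distribution. This is one of several possible expressions and is not necessarily the one used in this study; the function name, its arguments, and the variance values are hypothetical.

```python
import math

# Variance of the standard logistic distribution, used as the
# link-specific residual variance for a logit-link binomial model.
LOGIT_LINK_VARIANCE = math.pi ** 2 / 3


def latent_scale_h2(var_additive, var_other, link_variance=LOGIT_LINK_VARIANCE):
    """Heritability on the latent scale of a logit-link model (illustrative).

    var_additive: additive genetic variance on the latent scale
    var_other: sum of the remaining random-effect variances on the latent scale
    """
    return var_additive / (var_additive + var_other + link_variance)


# Hypothetical variance components on the latent scale.
h2_latent = latent_scale_h2(var_additive=0.4, var_other=0.6)
```

Because the link-specific term is a constant that does not appear in an LMM fitted to transformed data, the two approaches divide the genetic variance by different totals, and their heritabilities are not directly comparable.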

If we consider the impact of modeling in terms of selection efficiency, our results show that the GLMM is less accurate. This lower performance affected the ranking of predicted breeding values, especially for PROD. The GLMM is better suited to non-Gaussian data from a theoretical point of view, but these practical considerations show that it should be used with caution. Use of the GLMM with restricted maximum likelihood is very sensitive to non-orthogonality of the data. Non-orthogonality was pronounced in our experiments because stock plant mortality was high (36 and 25 % during the first and second periods) and the mating design was not complete (Table 1). Some scientists favor the normal approximation and use of the LMM if there are numerous data points per level of a factor. However, the GLMM can still be applied by using a simpler model (eliminating a genetic effect), by simplifying the experimental design (combining factors…) or by using a more informative response variable (Gezan, personal communication).
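One simple way to quantify how much two models disagree on the ranking of candidates is a Spearman rank correlation between their predicted breeding values. The sketch below is an illustration only: it is not the selection accuracy criterion used in this study, and the breeding values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical predicted breeding values for five clones under two models.
bv_lmm = [0.8, 0.1, -0.3, 0.5, -0.9]
bv_glmm = [0.6, 0.3, -0.1, 0.2, -0.7]
rank_agreement = spearman(bv_lmm, bv_glmm)
```

A value close to 1 means that both models would select nearly the same clones, whereas lower values indicate that the choice of model changes the selection.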

Contribution of additive and non-additive effects

Our results showed that $\sigma^2_D/\sigma^2_A$ was close to 1 and sometimes higher than 1.5 for all the variables with the parent model (model 1). For the individual tree model (model 2), this result was amplified. Although confirmation with supplementary experiments is needed because of the small parent sample size, this result may reveal the importance of the dominance effect for propagation traits in the hybrid population. This is supported by previous findings showing the preponderance of non-additive variance for such traits, for example in Pinus taeda (Foster 1978; Anderson et al. 1999), Tsuga heterophylla (Sorensen and Campbell 1980), and Platanus occidentalis (Cunningham 1986). The importance of dominance effects has already been stressed in the case of this E. urophylla × E. grandis hybrid population regarding growth traits (Bouvet et al. 2009) and could be confirmed by this study on propagation ability.
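The link between the dominance-to-additive ratio and the gap between narrow- and broad-sense heritability can be sketched with a deliberately simplified decomposition in which the phenotypic variance is the sum of additive, dominance, and residual variances only. The models used in this study may include further terms, so the function and the values below are illustrative assumptions rather than the estimates reported in the tables.

```python
def genetic_parameters(var_additive, var_dominance, var_residual):
    """Dominance ratio and heritabilities from three variance components.

    Assumes phenotypic variance = additive + dominance + residual
    (a simplification made for illustration).
    """
    if var_additive <= 0:
        raise ValueError("additive variance must be positive to form the ratio")
    var_phenotypic = var_additive + var_dominance + var_residual
    return {
        "dominance_ratio": var_dominance / var_additive,
        "h2_narrow": var_additive / var_phenotypic,
        "H2_broad": (var_additive + var_dominance) / var_phenotypic,
    }


# Hypothetical components with a dominance-to-additive ratio of 1.5.
params = genetic_parameters(var_additive=1.0, var_dominance=1.5, var_residual=2.5)
```

With a ratio of 1.5, broad-sense heritability is 2.5 times the narrow-sense value in this example, which shows how much of the genetic variance is available to clone selection but not to the selection of parents.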

Heritability

Our results indicated that PROD is more heritable than CUT. CUT is under weak genetic control (Tables 4 and 7), while PROD is under moderate to high genetic control (Tables 3 and 6). Some studies on the genetic control of propagation traits have reported similar results. Ruaud et al. (1999) reported weak and moderate heritability for rooting ability in E. grandis ($h^2 = 0.16$ for cuttings derived from open pollinated families and $h^2 = 0.27$ for cuttings derived from half-diallel crossings). In E. globulus, Borralho and Wilson (1994), England and Borralho (1995), and Lemos et al. (1997) reported moderate genetic control for rooting ability, with heritabilities in the range 0.36–0.41. In theory, the lower narrow- and broad-sense heritability for CUT could be explained by a higher environmental variance due to non-optimal environmental control in the nursery, the number of cutting manipulations during the process, and a higher sensitivity of cuttings to environmental changes.

Genetic correlations

The correlations between PROD, CUT, and growth at 25 months in the field highlighted a weak relationship between gene effects in the nursery and in the field. Only the correlation between PROD and field growth for dominance effects was close to 0.5. Additive correlation estimates were poor due to the very small variance components and cannot be considered consistent. Few published results are available for comparison with our findings. Baltunis et al. (2007) reported a significant but weak genetic correlation between rooting ability and initial growth in P. taeda ($\rho_G = 0.29$). Such a low level of correlation was expected in our experiments based on previous results showing the weak relationship between cutting growth at the nursery stage and clone performances in the field with eucalyptus hybrids in Congolese conditions (Bouvet et al. 2004). This weak correlation between field growth and propagation should be confirmed, but these first results underscore the need to start improvement programs with a large genetic base population in order to select genotypes combining good growth and propagation ability.
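The sensitivity of a genetic correlation to small variance components follows from its definition as a covariance divided by the square root of the product of two variances. The sketch below uses this standard definition; the function name and the numerical values are hypothetical and do not reproduce the estimates of this study.

```python
import math


def genetic_correlation(cov_xy, var_x, var_y):
    """Correlation between two traits for a given genetic effect.

    Returns None when either variance is zero or negative, since the
    ratio is then undefined and no estimate can be reported.
    """
    if var_x <= 0 or var_y <= 0:
        return None
    return cov_xy / math.sqrt(var_x * var_y)


# Hypothetical dominance components for a nursery trait and field growth.
rho_dominance = genetic_correlation(cov_xy=0.5, var_x=1.0, var_y=1.0)

# A variance estimated at zero leaves the correlation undefined.
rho_undefined = genetic_correlation(cov_xy=0.01, var_x=0.0, var_y=1.0)
```

When the variances in the denominator are close to zero, small estimation errors in any component produce large swings in the ratio, which is consistent with the caution expressed above about the additive correlation estimates.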

Conclusion

This study is one of the few that address the genetics of propagation in Eucalyptus. Similar analyses have been conducted in E. globulus (Borralho and Wilson 1994; Lemos et al. 1997), but to our knowledge very few recent studies have investigated the genetic basis for cutting success and its relation to adaptive and biomass traits. Such genetic analyses require adequate statistical models that take into account the types of variables (proportion and count data). We present an alternative to the LMM that can be used according to the quality of the data and the objective (assessment of genetic parameters or BLUP). The GLMM has good potential because of its relevant mathematical properties and the ability to estimate heritability on both scales. However, the study of these different models showed how difficult it is to interpret the genetic parameters (variance, heritability, …) and highlighted the risk involved when the experiment is of complex design and the data are unbalanced. Additional research through simulation is needed to investigate the potential of the GLMM and the interpretation of the estimates, especially in complex genetic designs.

We observed here that cutting success is under weak genetic control, that shoot production is under moderate genetic control, and that the two traits are genetically correlated. The level of heritability, even though small, suggested that genetic gain can be achieved through breeding and clone selection. In addition, the low genetic correlations between the shoot production of stock plants and tree growth in the field, and between cutting success and tree growth in the field, are important to consider in breeding. In terms of multitrait selection, they show that there is a need to start breeding programs with a large genetic base so as to be able to select clones combining good propagation properties and good growth. However, these first results need to be confirmed by additional experiments.