Introduction

Growth and yield models are commonly used as decision-support tools in forest management (Zhang et al. 1993; Trasobares et al. 2004). An important tree variable in these models is crown size, which usually denotes crown length (CL), crown radius (CR; as show in Fig. 1) or crown width (CW). However, attributes of individual branches define crown structure and have significant influences on tree growth as they control the amount and display of leaf area (Vose et al. 1994). In addition, the number and size of branches on a stem have been related to stem growth, wood quality, wildlife habitat, and key physiological processes such as the photosynthesis, respiration and transpiration of a tree (Maguire et al. 1991; Hayes et al. 1997, 2003; Hein et al. 2008). The sizes of branches are quite sensitive to stand conditions, which have generally been predictable from site, tree or branch-level factors (Maguire et al. 1991; Weiskittel et al. 2007). However, despite its numerous benefits, measuring the branches of every sampled tree is prohibitively costly and time consuming (Hein et al. 2008). Consequently, accurate models that analyze branch data from adequate numbers of sample trees are required. Such models allow forest manager to predict branch development precisely.

Fig. 1
figure 1

The sketch map of tree and branch variables. HT tree height; CL the length of crown; HCB the height of crown base; DINC absolute depth of branch into the crown; CR the crown radius; CW the crown width; BA the branch angle; BCL the branch chord length; BAL the branch arc length; BL branch length; RDINC the relative depth of branch into crown, which can be calculated as: RDINC = DINC/CL

In the previous literatures, numerous investigations have studied the individual branch attributes, but the majority have focused on several commercially important plantation species, including: Norway spruce [Picea abies (L.) Karst] (Mäkinen et al. 2003; Kantola and Mäkelä 2004), radiata pine [Pinus radiata D. Don.] (Woollons et al. 2002), jack pine [Pinus banksiana Lamb] (Beaulieu et al. 2011), Douglas-fir [Pseudotsuga menziesii (Mirb.) Franco] (Weiskittel et al. 2007; Hein et al. 2008), Scots pine [Pinus sylvestris L.] (Mäkinen 1996; Mäkelä and Vanninen 2001), white spruce [Picea glauca (Moench) Voss] (Sattler et al. 2014) and Dahurian larch [Larix gmelinii (Rupr.) Kuzen.] (Jiang et al. 2012a, b). However, to our best knowledge, the models of branch attributes of these species that grow in mixed-stands have received less attention. It was explicit that the crown characteristics in mixed-species stands can be largely influenced by differential resource utilization. For example, species classified as shade intolerant tend to have crowns with branch spread in a relatively even horizontal distribution, while shade-tolerant species tending to have multi-layered crowns that can support greater self-shading (Nelson et al. 2014). However, no matter where species grow either in pure or mixed forest, branch characteristics models are frequently obtained from stand and tree variables that are input to mechanistic models (e.g., Roeh and Maguire 1997; Weiskittel et al. 2007; Courbet et al. 2012). These models have been combined with individual tree growth and yield simulators and have been found to be useful for understanding the effects of management on stem growth and wood quality as well as for improving growth predictions across a wide range of stand conditions (e.g., Maguire et al. 1991; Weiskittel et al. 2007; Barbeito et al. 2014). To date, most of these models are simple linear or nonlinear functions of the relative or absolute branch depth into the crown (RDINC or DINC; as show in Fig. 1), estimated using ordinary linear or nonlinear least squares techniques (Kantola and Mäkelä 2004; Beaulieu et al. 2011; Barbeito et al. 2014). Data for branch characteristics studies generally involve multiple measurements of trees growing in different stands or regions. This hierarchical structure (i.e., branches within tree, and trees within plot) results in a lack of independence between observations, since data coming from the same sampling unit (tree and plot) tend to be significantly correlated, which can result in biased estimates for the confidence interval of the parameters by the ordinary least squares techniques (Calama and Montero 2005).

Nonlinear mixed-effects models (NLME), consisting of fixed and random parameters, provide an efficient means of analyzing repeated measurement data and making accurate local prediction (Vonesh and Chinchilli 1997; Pinheiro and Bates 2000; Fang and Bailey 2001). The fixed parameters in NLME models account for the covariate or treatment effects, as in traditional regression, and the random parameters account for heterogeneity and randomness in the data by known and unknown factors. When the random effects are employed for the samples that are not measured and recorded in field work, NLME models can improve predicted accuracy. For details on NLME modeling, see Vonesh and Chinchilli (1997), Pinheiro and Bates (2000) and Fang and Bailey (2001). Because of their versatility, NLME models have been widely applied to forest growth and yield models (Lappi and Bailey 1988; Fang and Bailey 2001; Calama and Montero 2005; Yang and Huang 2011; Corral-Rivas et al. 2014), as well as branch attributes models (e.g., Meredieu et al. 1998; Hein et al. 2008; Dahle and Grabosky 2010). However, most of the NLME models in forestry were often simplified or assumed the within-subject heterogeneous and autocorrelation to be negligible in the presence of random subject effects (e.g., Garber and Maguire 2003; Corral-Rivas et al. 2014). Although the models without considering the characteristics of heterogeneous and autocorrelation can also be acceptable for practical applications, the acceptability is highly dependent on the model form and data involved, and there are situations for which further accounting for the within-subject heterogeneous and autocorrelation may improve the fit and prediction of a model (Yang and Huang 2011). The commonly used variance function (e.g., exponential function and power function) and autocorrelation structures [e.g., first-order autoregressive structure AR(1) and moving average structure MA(1)] in forestry have been confirmed effectively to remove heterogeneous and autocorrelation within a specific subject (e.g., Jordan et al. 2005; Zhao et al. 2005; Yang and Huang 2011). Therefore, similar variance function and autocorrelation structures can also be introduced into branch characteristics models. However, to our best knowledge, few studies have applied the completely multi-level NLME approach to predict the branch characteristics with hierarchical and autocorrelation data.

Dahurian larch is the most important commercial species in northeastern China, occupying more than 7 × 104 km2 in this region (CSFA-FRM 2010). The main uses of natural Dahurian larch forests are for timber, fuel wood, pulp, grazing, landscape and recreational purposes. Natural Dahurian larch forests also play an important role as water conservation and soil protectors, as they grow in poor soils. Since the beginning of the twenty-first century, a research line in the China State Forestry Administration (CSFA) has been devoted to the sustainable management and multiple uses of forests in northeastern China. However, Dahurian larch is known for quick branch growth after crown release even at advanced ages. Branch number and size of Dahurian larch are major determinants of wood quality. Furthermore, Dahurian larch timber is highly appreciated in wood markets when the branch size is small and few branches exist. Therefore, concerning branches, a quantitative assessment of the relationships between branch characteristics and tree variables is needed. The development of these branch-level prediction models will help to understand growth response at the tree-level and allow assessment of potential wood product quality in this region, which should promote more effective and efficient management practices.

As mentioned above, a large-scale investigation that physically understands the growth responses of branches to different stand conditions would be too costly and time-consuming to implement across a broad area. Therefore, our aim was to develop a model which can be calibrated for the given forest stand by collecting small amount of branches. The overall goal of this study was to develop branch diameter and length prediction models for Dahurian larch growing in mixed, uneven-aged stand across northeastern China, which can be integrated into an individual tree growth and yield simulation system. Specific objectives were to: (1) compare different local branch diameter and length equations for the mixed, uneven-aged natural Dahurian larch forests in northeastern China; (2) develop new generalized branch diameter and length models based on the best local model previously fitted and the potential stand (or tree) variables; (3) use the local and generalized equations to study the capacity of nonlinear mixed effects model to explain the variability in the relationship between branch diameter (or branch length) and tree variables; and (4) evaluate the predictive ability of the developed models and the applicability of the multi-level NLME technique with the help of an independent data set.

Materials and methods

Study area and data

The study was conducted at a site on 1.52 × 105 ha state-owned forest in northeastern China (52°41′57.1″N, 123°51′56.5″E), where the total volume is 9.43 × 106 m3, and the forest coverage is as high as 88.9 %. The predominant vegetation in the area is mixed, uneven-aged forests of Dahurian larch and white birch (Betula platyphylla). The altitude above sea level of the study area varies between 200 and 1400 m, and the average elevation is 573 m. The climate in this study area is typical cold temperate continental monsoon climate, with a distinct wet summer and a cool, dry winter. Temperature varies from approximately −52.3 to 40.6 °C, and the annual average temperature is only −2.8 °C. The mean annual precipitation ranges between 400 and 600 mm. Variations in precipitation and temperature for this area are strongly correlated with latitude and proximity to the coast. Soils are considered mostly brown coniferous forest soil, with small amounts of dark brown, meadow and swamp soils.

Stem and branch analysis data were acquired from 18 permanent plots of natural Dahurian larch stands that are used to monitor the growth and production of the forests in the Great Xing’ an Mountains. These plots, which were established in 2010, were selected with the aim of representing different stand conditions (i.e., stand age, density and sites). Stand density ranged from 767 to 3333 stems ha−1, the canopy density ranged from 0.30 to 0.75, the mean diameter at breast height (DBH) ranged from 7.3 to 16.2 cm, and the mean total tree height (HT) ranged from 6.7 to 17.8 m, respectively. The selection of sample trees and branches is detailed in Dong (2013). In summary, the DBH values of all the live trees (DBH ≥ 5 cm) in the plots were measured for the DBH, HT, CW and height to crown base (HCB). Then the DBH distribution was divided into three equally sized classes, and only one sample tree per size class was randomly selected around the plot, to avoid destroying the entire sample plot. After felling, the total stem height and height of the crown base, defined as a whorl with at least one living branch, was noted. Discs were taken from the stem at a height of 1.3 m, and at an interval of 1 m above the stump. Each section at 1-m intervals within the crown is called a “layer”. Every branch in each crown layer was then numbered. Branch measurements included branch diameter (BD), length (BL), azimuth angle (A), insertion angle (BA), chord length (BCL), arc length (BAL) and DINC (Fig. 1). Only one sample branch was selected within each whorl for measuring the attributes of the branch, according to the branch diameter and length (i.e., the sample branch diameter and length should be equal (or close) to the average branch diameter and length in the whorl). The number of sampled primary branches per tree ranged from 8 to 28, and was 28.98 % of the total number of branches. Because the dead branches were a very minor component of all live crowns, occurring predominantly in the lower part of crown, only live branches were used in this study. We first sorted the sample trees by their DBH, and then the sample trees would be classified into two datasets using a systematic sampling method: one for model fitting and the other for model validation. If a sample tree was classified into fitting datasets, then all the branch datasets in this tree were treated as fitting dataset as well. In this sampling method, the model fitting data contained 705 sample branches from 37 sample trees, while the validation data contained 261 sample branches from 13 trees. Data statistics and relevant stand characteristics are summarized in Table 1.

Table 1 Summary statistics for modeling fitting and validation data

Comparison of equations

Graphic analysis of primary branch size showed that the BD and BL were mainly related to DINC for each tree, which both increased continuously with increasing DINC and then finally approached an asymptotic value at the crown base (Fig. 2). Several functions have existed for modeling the relationship between BD (or BL) and DINC of trees. We compared the fitting and predictive performance of 8 standard functions (Table 2).

Fig. 2
figure 2

Branch diameter (BD) and branch length (BL) versus depth of branch into crown (DINC) for the model fitting data

Table 2 Branch diameter and length functions considered in the analysis

For preliminary selection, we used ordinary non-linear least squares (ONLS) to fit each of the equations to the fitting data with the NLS function in the S-Plus software (S-Plus Institute 2008), and then their prediction performances were examined against validation data. We evaluated the goodness of fit of the models by graphical analysis and by considering the following statistics, calculated from the residuals: root mean square error (RMSE), the coefficient of determination (R 2) and bias (Lejeune et al. 2009; Yang and Huang 2011):

$${\text{RMSE}} = \sqrt {{{\sum\limits_{i = 1}^{T} {\sum\limits_{j = 1}^{{T_{i} }} {\sum\limits_{k = 1}^{{T_{ij} }} {(h_{ijk} - \hat{h}_{ijk} )^{2} } } } } \mathord{\left/ {\vphantom {{\sum\limits_{i = 1}^{T} {\sum\limits_{j = 1}^{{T_{i} }} {\sum\limits_{k = 1}^{{T_{ij} }} {(h_{ijk} - \hat{h}_{ijk} )^{2} } } } } {\left( {\sum\limits_{i = 1}^{T} {\sum\limits_{j = 1}^{{T_{i} }} {n_{ij} } } - p} \right)}}} \right. \kern-0pt} {\left( {\sum\limits_{i = 1}^{T} {\sum\limits_{j = 1}^{{T_{i} }} {n_{ij} } } - p} \right)}}}$$
(1)
$$R^{2} = 1 - {{\sum\limits_{i = 1}^{T} {\sum\limits_{j = 1}^{{T_{i} }} {\sum\limits_{k = 1}^{{T_{ij} }} {(h_{ijk} - \hat{h}_{ijk} )^{2} } } } } \mathord{\left/ {\vphantom {{\sum\limits_{i = 1}^{T} {\sum\limits_{j = 1}^{{T_{i} }} {\sum\limits_{k = 1}^{{T_{ij} }} {(h_{ijk} - \hat{h}_{ijk} )^{2} } } } } {\sum\limits_{i = 1}^{T} {\sum\limits_{j = 1}^{{T_{i} }} {\sum\limits_{k = 1}^{{T_{ij} }} {(h_{ijk} - \bar{h})^{2} } } } }}} \right. \kern-0pt} {\sum\limits_{i = 1}^{T} {\sum\limits_{j = 1}^{{T_{i} }} {\sum\limits_{k = 1}^{{T_{ij} }} {(h_{ijk} - \bar{h})^{2} } } } }}$$
(2)
$${\text{Bias}} = {{\sum\limits_{i = 1}^{T} {\sum\limits_{j = 1}^{{T_{i} }} {\sum\limits_{k = 1}^{{T_{ij} }} {(h_{ijk} - \hat{h}_{ijk} )} } } } \mathord{\left/ {\vphantom {{\sum\limits_{i = 1}^{T} {\sum\limits_{j = 1}^{{T_{i} }} {\sum\limits_{k = 1}^{{T_{ij} }} {(h_{ijk} - \hat{h}_{ijk} )} } } } {\sum\limits_{i = 1}^{T} {\sum\limits_{j = 1}^{{T_{i} }} {n_{ij} } } }}} \right. \kern-0pt} {\sum\limits_{i = 1}^{T} {\sum\limits_{j = 1}^{{T_{i} }} {n_{ij} } } }}$$
(3)

where h ijk , \(\hat{h}_{ijk}\) and \(\bar{h}\) are the observed, predicted and mean values of the branch attributes at the measurement point for the branch k = 1,…, T ij in tree j = 1,…, T i within plot i = 1,…, T, respectively; T, T i, T ij are the number of plots, the number of trees in ith plot, and the number of branches in T ijth tree within ith plot, respectively; and p is the number of model parameters in the equations.

Besides DINC, BD and BL are also affected by size and vigor of trees, the site itself and competition (Roeh and Maguire 1997; Weiskittel et al. 2007). Based on the best functions, we studied the relationship between the ten variables and the parameters of the local equations (see Table 2) that best described the BD-DINC (or BL-DINC) relationship, with the aim of improving the accuracy of the equation and developing new generalized functions. Variables denoting the size and vigor of trees and stands are: stand density (SD, trees ha−1), canopy density (CD, %), mean HT of stand (MHT, m), mean DBH of stand (MDBH, cm), DBH (cm), HT (m), CL (m) and CW (m). The site effect variable is site index (SI, m) which are calculated using the Meyer’s site index function (Gu et al. 2001): SI = DH·exp((b/Age) − (b/Age0)), where DH is dominant height (m), Age is stand age (years) at time of DH measurement, and Age0 is the standard age (years) which is assumed as 20 years. Competition variable is denoted by Hegyi competition index (HCI; Anning and McCarthy 2013): HCIi = ∑(DBHj/(DBHi·D ij), where HCIi is the competition index for ith tree, and DBHi and DBHj are DBH (cm) for the ith subject tree and its jth competition, and D ij is the distance (m) between the ith subject tree and its jth competition tree.

We selected variables suited for the procedure, beginning with graphical exploration of the data and examination of the correlation statistics. The process of selecting variables was conducted as: (1) estimated the parameters of local functions (as showed in Table 2) for each tree; (2) detect the linear or nonlinear relationships between the parameters and the stand (or tree) variables, as well as the necessary combination (e.g., DBH2HT) and transformation of variables [e.g., log(DBH)]; (3) integrate the relationships into the best local function. All of these calculations were performed in the S-Plus NLS function (S-Plus Institute 2008). The best functions with selected tree (and stand) variables were then used as a base model from which the mixed-effects BD and BL models were constructed.

Development of mixed models

The BD-DINC (or BL-DINC) observations made in trees and plots may be highly correlated, thus violating the principle of independence of error terms. One procedure used to deal with correlated observations is to fit mixed models, in which the variability between the sampling units can be explained by including random parameters that are estimated at the same time as the fixed parameters (Vonesh and Chinchilli 1997; Fang and Bailey 2001). Basically, following the multi-level nonlinear mixed model technique, the parameter vector of a non-linear mixed model can be defined as follows (Vonesh and Chinchilli 1997; Pinheiro and Bates 2000):

$${\varvec{\Phi}}_{ij} = {\mathbf{A}}_{ij} {\varvec{\uplambda}} + {\mathbf{B}}_{ij} {\mathbf{b}}_{ij}$$
(4)

where Φ ij is the parameter vector r × 1 (where r is the total number of parameters in the model) specified for the jth tree in ith plot, λ is the vector p × 1 of the common fixed parameters for the whole population (p is the number of fixed parameters in the model), b ij is the vector q × 1 of the random parameters associated with the jth tree in i-plot (q is the number of random parameters in the model), and A ij and B ij are matrices of size r × p and r × q for specific and random effects for the jth tree in ith plot, respectively.

The variance–covariance matrices for the random effects vector, b i and b ij , usually account for the variability between plots and trees, respectively. As in Calama and Montero (2005) and Fu et al. (2013), b i and b ij are assumed to be unstructured. A hypothetical 3 × 3 variance covariance matrix can be expressed in general form as:

$$\left( {\begin{array}{*{20}c} {\sigma_{1}^{2} } & {\rho_{12} } & {\rho_{13} } \\ {\rho_{12} } & {\sigma_{2}^{2} } & {\rho_{23} } \\ {\rho_{13} } & {\rho_{23} } & {\sigma_{3}^{2} } \\ \end{array} } \right)$$
(5)

where σ 2 i (i = 1, 2, 3) is the variance in the ith random effect, and ρ ij (i, j = 1, 2, 3, i ≠ j) is the covariance between the ith and the jth random effects, intercepting ρ ij  = ρ ji .

In addition to random effects, a correlation structure and a variance functions can be specified for R ij. Details on correlation structures and variance functions are available in Pinheiro and Bates (2000), Fang and Bailey (2001) and Yang and Huang (2011). In a general context, the covariance matrix R ij can be expressed as (Pinheiro and Bates 2000):

$$R_{ij} = \sigma^{2} G_{ij}^{0.5} \varGamma_{ij} G_{ij}^{0.5}$$
(6)

where σ 2 is a scaling factor of the error dispersion, G ij is a n ij  × n ij diagonal matrix describing both between-tree heterogeneity variance structures, and Γ ij is a n ij  × n ij matrix describing the within-tree correlation structure of error. As mentioned above, the variation of branch diameter and length along with the stem can be considered as typical repeatedly measured datasets. Therefore, commonly used time series correlation structures, such as the first-order autoregressive structure AR(1), first-order moving average structure MA(1) and first-order autoregressive and moving average structure ARMA(1,1) were evaluated in this study (Vonesh and Chinchilli 1997; Pinheiro and Bates 2000). In the literatures, Zhao et al. (2005) and Yang and Huang (2011) have evaluated the abilities of these structures to remove residual autocorrelation in forest growth and yield models. Their work have indicated that these structures were effective in removing autocorrelation for repeatedly measured data. Therefore, we are expanding on these works to further explore these structures to remove the autocorrelation of branch datasets.

Variance heterogeneity was removed by three frequently used variance functions: the exponential function [Eq. (7)], power function [Eq. (8)] and the constant plus power function [Eq. (9)] (Pinheiro and Bates 2000; Yang and Huang 2011; Fu et al. 2013).

$${\text{Var(}}\varepsilon_{ijk} )= \sigma^{2} \exp (2\gamma_{1} {\text{DINC}}_{ijk} )$$
(7)
$${\text{Var(}}\varepsilon_{ijk} )= \sigma^{2} {\text{DINC}}_{ijk}^{{2\gamma_{2} }}$$
(8)
$${\text{Var}}\left( {\varepsilon_{ijk} } \right) = \sigma^{2} (\gamma_{3} + {\text{DINC}}_{ijk}^{{2\gamma_{4} }} )^{2}$$
(9)

where DINCijk is the DINC of the kth branch in the jth tree in ith plot, and γ 1γ 4 are estimated parameters. An appropriate variance function for the model was determined by the Akaike Information Criterion (AIC), Schwarz’s Bayesian Information Criterion (BIC) and logarithm likelihood values (LogLik) as well as the Likelihood Ratio Test (LRT) for the nested models.

We constructed the non-linear mixed effects model by selecting the local and generalized equations that yield the best fits for the BD and BL models by maximum likelihood (ML) using the NLME function in the S-Plus software (S-Plus Institute 2008). The analysis of different components of NLME in our paper was organized as: (1) selection of a base function (i.e., local function); (2) selection of stand and tree level variables (i.e., generalized function); (3) selection of a function to address variance heterogeneity; (4) selection of an autocorrelation function. We compared the fitting statistics for each process to determine which parameters and components should be included in the final branch models.

Model validation

The models were validated through an assessment of their predictive abilities when applied to the validation data set. Mixed-model predictions are based on the fact that the stochastic component of growth variability is a consequence of different factors acting simultaneously (Yang and Huang 2011). As described in Fang and Bailey (2001) and Wang et al. (2007), when evaluating the ability of a calibration approach, two different situations should be considered: fixed effects and mixed effects (Yang and Huang 2011). If no prior measurement data are available, a mean branch diameter and length prediction can be generated for each branch using only the fixed parameter. Thus, an expected value of zero was used for all random parameters and the prediction of the fixed part is the standard BD-DINC (or BL-DINC) curve. However, these functions only represent the pattern of a typical response and define the mean behavior of the branch variation for a given tree. Therefore, we did not produce such BD and BL here. Mixed-effects model predictions require a sub-sample of BD-DINC (or BL-DINC) measurements (and the stand and tree variables are included in the model in the case of generalized models) from each tree for random parameter prediction. The random parameters were predicted using the following expression (Vonesh and Chinchilli 1997; Wang et al. 2007; Yang and Huang 2011):

$$b_{i} \approx \hat{D}\hat{Z}_{i}^{T} (\hat{R}_{i} + \hat{Z}_{i} \hat{D}\hat{Z}_{i}^{T} )^{ - 1} \hat{e}_{i}$$
(10)

where \(\hat{D}\) is the matrix q × q of variance–covariance associated with the random parameters, which is common to all trees in the general model fitting procedure; \(\hat{R}_{i}\) is the m j  × m j estimated matrix of variances–covariance of the error term; \(\hat{e}_{i}\) is the residuals vector m × 1, the components of which are obtained as the difference between the observed BD (or BL) each branch in specific tree and the value predicted using the model with fixed parameters only; and \(\hat{Z}_{i}\) is the matrix m × q of the partial derivatives of the random parameters evaluated in b i  = 0. Details on the prediction of random effects parameters in the forestry context can be found in the study by Fang and Bailey (2001) and Wang et al. (2007). Usually, the greater the number of sub-sample selected for estimation of random effects parameters, the higher the prediction accuracy (Yang and Huang 2011; Corral-Rivas et al. 2014). We tested BD and BL measurements of various numbers of branches (i.e., 1–8 branches for each tree) to estimate random effects parameters in our preliminary analyses. Considering both measurement costs and potential errors (Calama and Montero 2005; Corral-Rivas et al. 2014), we decided to use four randomly selected branches to estimate random effects parameters for prediction of BD and BL.

Prediction accuracies of the models with and without the random effects were compared by examining the goodness-of-fit statistics (RMSE, R 2 and Bias), which are obtained for the equations fitted by NLS and NLME functions in S-Plus software. Model variants in the mixed effects models were compared using the AIC, BIC and Loglik. Unless otherwise, the level of significance is 0.05 (alpha = 5 %) throughout this paper.

Results

Local function selection

The fits statistics of the functions (Table 2) are presented in Table 3. Although evaluation indices were almost identical for all of the functions, function F8 (a Hosfled form) showed a slightly superior predictive ability for both models. In selecting the best local functions, we also examined graphs of the residuals and the significance of the parameters for each function. In this respect, the Hosfled function was the preferred model. As emphasized here, although the Mitscherlich function (F2) did not perform much worse for the average growth response of branch diameter and length and has one less parameter than Hosfled function (F8), it usually failed to reach convergence for a specific tree which indicating that Mitscherlich function was not suitable to include additional stand (or tree variables). Therefore, the Hosfled function was selected as the basic nonlinear model for constructing the BD and BL model.

$${\text{BD}}_{ijk} = b_{0} /(1 + b_{1} \cdot {\text{DINC}}_{ijk}^{{ - b_{2} }} ) + \varepsilon_{ijk}$$
(11)
$${\text{BL}}_{ijk} = b_{0} /(1 + b_{1} \cdot {\text{DINC}}_{ijk}^{{ - b_{2} }} ) + \varepsilon_{ijk}$$
(12)

where BD ijk , BL ijk and ε ijk are branch diameter, length and error estimated by the models for the kth branch in the jth tree in the ith plot, DINC ijk are depth of branch into crown for the kth plot in the jth tree in the ith plot, and b 0 b 2 are function parameters.

Table 3 Performance criteria for BD- and BL-DINC base models

Generalized function

To avoid over-parameterization and collinearity in the models, we used only those variables displaying a significant contribution to branch diameter and length variation. On relating the parameters in Eqs. (11) and (12) to the stand and tree variables, we found that in the both models, parameter b 0 was positively correlated with DBH (r = 0.8851 and 0.7753) and HT (r = 0.8073 and 0.6510), whereas parameter b 1 was negatively correlated with DBH (r = −0.3819 and −0.3432) and HT (r = −0.3568 and −0.2630) and, finally, parameter b 2 was uncorrelated with all the variables in our study. To develop a generalized equation from Eqs. (11) and (12), we tested various combinations of the tree variables to improve their efficacy in the fit, taking into account the previously mentioned correlations, which resulted in Eqs. (13) and (14):

$${\text{BD}}_{ijk} = (b_{0} + b_{3} \cdot {\text{DBH}}_{ij} )/(1 + b_{1} \cdot {\text{HT}}_{ij} \cdot {\text{DINC}}_{ijk}^{ - b \cdot 2} ) + \varepsilon_{ijk}$$
(13)
$${\text{BL}}_{ijk} = (b_{0} + b_{3} \cdot {\text{DBH}}_{ij} )/(1 + b_{1} \cdot {\text{HT}}_{ij} \cdot {\text{DINC}}_{ijk}^{ - b \cdot 2} ) + \varepsilon_{ijk}$$
(14)

where DBH ij and HT ij are the diameter at breast height (DBH, cm) and tree height (HT, m) of the jth tree in i-plot, and b 0b 3 are formal parameters.

Random-effects models

There would be 225 different combinations of random effects parameters for Eqs. (13) and (14) while simultaneously considering both plots effects and the nested effects of plots and trees for the BD and BL models, respectively. When fitting to the data, only 179 and 156 of the BD and BL mixed-effects models reached convergence. The models of Eqs. (15) and (16), incorporating plot effects on b 1 and nested interaction on b 1 and b 3 for the both mixed-effects models, yielded the smallest AIC and BIC, and the largest Loglik.

$${\text{BD}}_{ijk} = (b_{0} + (b_{3} + u_{3ij} ) \cdot {\text{DBH}}_{ij} )/(1 + (b_{1} + u_{1i} + u_{1ij} ) \cdot {\text{HT}}_{ij} \cdot {\text{DINC}}_{ijk}^{{ - b_{2} }} ) + \varepsilon_{ijk}$$
(15)
$${\text{BL}}_{ijk} = (b_{0} + (b_{3} + u_{3ij} ) \cdot {\text{DBH}}_{ij} )/(1 + (b_{1} + u_{1i} + u_{1ij} ) \cdot {\text{HT}}_{ij} \cdot {\text{DINC}}_{ijk}^{{ - b_{2} }} ) + \varepsilon_{ijk}$$
(16)

where b 0b 3 are fixed-effects parameters; u 1i are random-effects parameters generated by plot on b 1; u 1ij and u 3ij are random-effects parameters generated by interaction of plot and tree on b 1 and b 3, respectively.

Within-tree variance–covariance (R) structure

Evaluation indices of three variance functions applied to Eqs. (15) and (16) are listed in Table 4. Each of the three variance functions yielded significantly different results from that of homogenous variance (P < 0.0001). Thus, even with random effects in the parameters, heteroskedasticity persisted in the mixed-effects BD and BL model [Eqs. (15), (16)]. Among the tested variance functions Eqs. (7)–(9), the three evaluation indices and the likelihood ratio test (LRT) showed that the exponential function [Eq. (7)] and the constant plus power function [Eq. (9)] demonstrated the best performance to the BD and BL mixed-effects models, Therefore, the expressions of the BD and BL mixed models finally obtained are:

$${\rm Eq.} (15) + {\rm Eq.} (7)$$
(17)
$${\rm Eq.} (16) + {\rm Eq.} (9)$$
(18)
Table 4 Performances of BD and BL mixed-effects models [Eqs. (15), (16)] with different variance functions

To account for autocorrelation residual variances, correlation structures, including AR(1), MA(1) and ARMA(1,1), were incorporated into the optimal branch diameter and length mixed models. Each of the convergent correlation structure yielded much better results than that of the random effects mixed model. For both models, all the three evaluation indexes and the LRT showed that the AR(1) structure provided much better fits than the others. Thus, the final models are:

$${\text{BD}}_{ijk} = (b_{0} + (b_{3} + u_{3ij} ) \cdot {\text{DBH}}_{ij} )/(1 + (b_{1} + u_{1i} + u_{1ij} ) \cdot {\text{HT}}_{ij} \cdot {\text{DINC}}_{ijk}^{{ - b_{2} }} ) + \varepsilon_{ijk} + {\text{Var(}}\varepsilon_{ijk} )= \sigma^{2} \exp (2\gamma_{1} {\text{DINC}}_{ijk} ) + {\text{AR}}\left( 1\right)$$
(19)
$${\text{BL}}_{ijk} = (b_{0} + (b_{3} + u_{3ij} ) \cdot {\text{DBH}}_{ij} )/(1 + (b_{1} + u_{1i} + u_{1ij} ) \cdot {\text{HT}}_{ij} \cdot {\text{DINC}}_{ijk}^{{ - b_{2} }} ) + \varepsilon_{ijk} + {\text{Var}}\left( {\varepsilon_{ijk} } \right) = \sigma^{2} (\gamma_{3} + {\text{DINC}}_{ijk}^{{2\gamma_{4} }} )^{2} + {\text{AR}}\left( 1\right)$$
(20)

Estimation of parameters

The values of the parameters and goodness-of-fit statistics for the fixed-effects models [Eqs. (11)–(14)] and for the generalized mixed models [Eqs. (15)–(20)] are shown in Table 6. All estimated parameters were highly significant at α = 0.05. We compared the RMSE values obtained with the mixed effects models with those obtained fixed effects models (fitted by ONLS); the values obtained with the generalized mixed models [Eqs. (15), (16)] were 53.08 and 54.51 % lower than those obtained with the local models [Eqs. (11), (12)], and 37.73 and 44.87 % lower than those obtained with the generalized models [Eqs. (13), (14)] for BD and BL models, respectively. However, when the variance functions were added to the generalized mixed models, 8.74 and 0.75 % increases in the RMSE values were observed form the generalized mixed models with variance functions [Eqs. (17), (18)], and the increases were 20.68 and 8.11 % when the correlation structures were continuously modeled [Eqs. (19), (20)]. The results obtained for R 2 were similar to those obtained for RMSE.

On inspecting the values of the AIC, BIC and Loglik for the BD and BL mixed models with variance functions and correlation structures, we found that the Eq. (19) for the BD and Eq. (20) for the BL were the most accurate fitting models (Tables 4, 5). However, the values of the RMSE and R 2 violated the above results, obtaining from the AIC, BIC and Loglik. Meanwhile, we did not find any significant difference for the graphics of the residuals for the BD and BL estimated by the generalized mixed models without and with variance functions and correlation structures, indicating that the variance functions and correlation structures did not exert significant effects on BD and BL (Fig. 3). Therefore, we selected the Eqs. (15) and (16) as the final BD and BL mixed-effects models. Substituting the fixed and random parameters estimated into the Eqs. (15) and (16), the generalized branch diameter and length model for natural Dahurian larch in northeastern China becomes:

$${\text{BD}}_{ijk} = (0.4327 + (0.1643 + u_{3ij} ) \cdot {\text{DBH}}_{ij} )/(1 + (0.1694 + u_{1i} + u_{1ij} ) \cdot {\text{HT}}_{ij} \cdot {\text{DINC}}_{ijk}^{ - 1.2194} ) + \varepsilon_{ijk}$$
(21)

where b i  = [b 1i ] ∼ N{[0], (0.0373)},

$$b_{ij} = \left[ \begin{aligned} b_{1ij} \hfill \\ b_{3ij} \hfill \\ \end{aligned} \right]\sim N\left\{ {\left[ \begin{aligned} 0 \hfill \\ 0 \hfill \\ \end{aligned} \right],\left( {\begin{array}{*{20}c} {0.0691} & {0.5671} \\ {0.5671} & {0.0285} \\ \end{array} } \right)} \right\},$$
$$\varepsilon_{ijk} \sim N(0,R_{ij} = 0.2883I_{ij} )$$
$${\text{BL}}_{ijk} = (90.9044 + (14.3599 + u_{3ij} ) \cdot {\text{DBH}}_{ij} )/(1 + (0.2635 + u_{1i} + u_{1ij} ) \cdot {\text{HT}}_{ij} \cdot {\text{DINC}}_{ijk}^{ - 1.2464} ) + \varepsilon_{ijk}$$
(22)

where b i  = [b 1i ] ∼ N{[0], (0.0003)},

$$b_{ij} = \left[ \begin{aligned} b_{1ij} \hfill \\ b_{3ij} \hfill \\ \end{aligned} \right]\sim N\left\{ {\left[ \begin{aligned} 0 \hfill \\ 0 \hfill \\ \end{aligned} \right],\left( {\begin{array}{*{20}c} {0.1423} & {0.7604} \\ {0.7604} & {3.6909} \\ \end{array} } \right)} \right\},$$
$$\varepsilon_{ijk} \sim N(0,R_{ij} = 17.3771I_{ij} )$$
Table 5 Performances of BD and BL mixed-effects models [Eqs. (17), (18)] with different correlation structures
Fig. 3
figure 3

Distribution of residuals for five equations fitting the branch diameter (BD) and branch length (BL) of Dahurian larch in northeastern China. a Local model [Eqs. (11), (12)], b generalized model [Eqs. (13), (14)], c random-effects model [Eqs. (15), (16)], d random-effects model with variance function [Eqs. (17), (18)], e random-effects model with variance function and correlation structure [Eqs. (19), (20)]

Model validation

Most developed models are intended for making predictions on datasets that were not used in model development. As for the mixed-effects models, it is extremely difficult to calculate the values of random effects parameters, which is the most important step in the process of model prediction. The random effects parameters were calculated with Eq. (10) using the validation datasets. Table 7 presents the three prediction statistics [Eqs. (1)–(3)] obtained for the validation data for branch diameter and length predictions, using the estimated coefficients listed in Table 6 for Eqs. (15) and (16), which indicated that the fitting data and validation data had the same result for all the models considered in our study. The mean prediction biases in all models were not significantly different from zero (α = 0.05). The prediction accuracy of Eqs. (15) and (16) was much higher that of other fixed-effects model and random-effects model for the BD and BL models, respectively. For the BD models, the Eq. (15) reduced RMSE of branch diameter by 64.85 and 49.04 %, when compared to simulation with the local model [Eq. (11)] and the generalized model [Eq. (13)]; however, the simulation of branch diameter led to very slight increases of RMSE (1.78 and 4.48 %) when the variance function and correlation structure were added to the Eq. (15), respectively. Similarly, for the BL models, the reductions were approximately 53.02 and 48.99 % when the Eq. (16) compared those with the local model [Eq. (12)] and the generalized model [Eq. (14)], and the increases of RMSE values were also much smaller (0.91 and 2.00 %) when compared those with the Eq. (16) with the variance functions [Eq. (18)] and the correlation structures [Eq. (22)]. All these conclusions can be confirmed form the changes of R 2 in Table 7.

Table 6 Parameter estimates and evaluation criteria of each model
Table 7 Prediction statistics for different branch diameter and length models based the validation data

Model application

Figure 4 shows the simulation of branch diameter and length for different DBH and HT introduced the fixed parameters into Eqs. (15) and (16). The both models are typical monotonic increasing functions with an asymptotic line. For a tree of given DBH and HT, BD and BL become progressively larger with DINC increasing. Moreover, for the trees with the same DBH and different HT, BD and BL at a given DINC reduced with increasing HT; however, for the trees with the same HT and different DBH, both of them at a given DINC increased with increasing DBH. Based on the above analysis, the DBH and HT can effectively respond to the changes of branch diameter and length for different tree size, which indicated that Eqs. (15) and (16), incorporating nested two-levels effects of plot and tree without variance functions and correlation structures, display sufficiently high predictive power to constitute a final model for predicting the BD and BL of natural Dahurian Larch forest in northeast China.

Fig. 4
figure 4

Simulation of branch diameter (BD) and branch (BL) length for different diameter at breast height (DBH) and tree height (HT)

Discussion

Multi-level linear or nonlinear mixed modeling provides the opportunity to study the branch growth variation at different level of the hierarchy. In the present work, branch diameter and length for natural Dahurian larch trees in northeast China are described as a stochastic process, where a fixed effect explains the mean value for the development, while unexplained residual variability is described and modeled by including random parameters acting at plot and tree levels.

In the fixed part of the models, which explains approximately 80 and 85 % of the variability for branch diameter and length, the primary explanatory variable was the locations of branches, defined as the absolute depth of branch into the crown (DINC) from the stem apex to the base of the crown. DINC is a good indicator of branch development along with the stem, because it includes both past competitive interactions and genotypic differences in response to environment. In the proposed models,they showed that BD and BL increased quickly in the early stage and then approached to a maximum value at the crown base. Together with DINC, the fixed part of the models also included variables commonly used as predictors in forest growth and yield models, such as DBH and HT. Positive values for the parameter b 0 in Eqs. (11) and (12) associated with DBH are related to the fact that larger DBHs are attaining larger branch size, and HT is negatively correlated with b 1 in Eqs. (11) and (12), which means that branches from the higher trees are expected to smaller branch size than those from shorter trees. In comparing the present models with the branch size models for Dahurian larch plantation growing on the same area by Jiang et al. (2012a, b), it is possible to see that the latter also includes variables indicating branch location (DINC) and tree size (DBH). Therefore, the differences of branch growth process for Dahurian larch with different origin were mainly embodied in the effects of HT. The dominant trees grown in natural forest usually can acquire more resources (i.e., photosynthetically active radiation), thus the optimal strategy for these trees may allocate more energy and matter to enhance the growth of diameter and height. However, the priority for inferior trees should allocate more energy and matter to enhance the growth of branch which is useful to acquire more resources. This may be the main reasons why higher trees can lead to smaller branch sizes for natural Dahurian larch. As for the plantation forest, HT has no significant effects on the branch growth process, because of the variation of HT within a specific stand usually was negligible.

The stochastic part of the models, which contribute additional 12 and 10 % of the growth variability for branch diameter and length, include the hierarchical random effects (trees and plots). Compared to the residual variability within a specific tree, the smaller between-plot (or tree) residual variability can be taken into account the stand characteristics (i.e., stand density and age), site properties (i.e., evaluation, slope aspect and position) and management treatments (i.e., cutting intensity). However, as the contributions of these variables were much less than the DBH and HT, they were not included into the final mixed-effects model. Large within-tree residual variability can be due to competition factors. Crown widths were significantly negatively correlated with the parameter b 0 for both models (R 2 = −0.3819 and −0.3432), and the Hegyi competition index denoting the social status of the trees within stand were positively correlated with the parameter b 0 (R 2 = 0.2982 and 0.3622) and b 1 (R 2 = 0.2248 and 0.2662), which indicated that trees from the dominant stratum are expected to smaller branch size than those from other strata. However, considering the tradeoffs between the complexity and prediction ability of the models, as well as the prohibitively costly and time consuming to measure the crown width and competition factors, they were not added to the fixed effects models.

Finally, approximately 7 and 4 % of the growth variability cannot be explained in our study, which suggested that some variables were ignored in the final mixed model [Eqs. (19), (20)], e.g., factors related genetics, microsite or ecological. The larch casebearer (Coleophora dahurica) and caterpillar (Dendrolimus superans) are the main defoliator pests for the pure and mixed Dahurian larch forest in northeast China, which may affect the growth of branches (Jin et al. 1995). In addition, the human error in branch attribute measurement is also one of the major causes for failing to explain them completely, since the Dahurian larch has the phenomenon of false whorls in the stem (Wu 2010), which leads to the branch location in the stem, were difficult to determine.

In general, the random parameters were not able to remove the heterogeneous variance and residual autocorrelation completely, both of these values required further consideration in the presence of random effects. To evaluate the heterogeneous variance, comparing the residual plots between the ordinary regression model and the mixed-effects model was predominantly used in building the mixed-effects model. For our data, the distribution of residuals for the BD and BL random-effects model [Eqs. (15), (16)] showed slight heterogeneity (Fig. 3c). Thus, we tested an array of variance functions to completely alleviate the problems of heterogeneous variance. However, significant heterogeneous variances were still present for both models [Eqs. (17) and (18)] even with the variance functions included (Fig. 3d). The fitting precision of the models were reduced about 1.62 and 0.07 % (for the R 2, same the below), respectively. And the same situation, which the fitting precision decreased by 3.71 and 0.75 %, were still present when the time series correlation structures were added to the Eqs. (17) and (18) (Fig. 3e). Similar findings were reported by others when further modeling of within-tree heterogeneous variance and residual autocorrelation was still necessary (Garber and Maguire 2003; Yang and Huang 2011). In addition to the above-mentioned natural and human reasons, the most important reason for not being able to remove them completely is a slight lack of flexibility in the model form for some trees (Yang and Huang 2011). For instance, it may be possible that, for a tree, the model slightly underestimates the branch size for the first few measurements, and overestimates for the middle part and underestimates for the last few measurements, which could lead to positive–negative–positive residuals. It appears that, although the range of residuals are reduced substantially by including random parameters, the heterogeneous variance and residual autocorrelation may not always be completely removed for some models and data even with the appropriate covariance (Fortin et al. 2008; Yang and Huang 2011).

In predicting the branch size from mixed-effects models, the random-effects parameters should be estimated from a prior information provided by dependent variables (Vonesh and Chinchilli 1997; Fang and Bailey 2001; Yang and Huang 2011; Corral-Rivas et al. 2014). Thus, in addition to prediction variables such as DBH and HT, DINC of a small subsample for a special tree must be measured, and random parameters should be estimated (Vonesh and Chinchilli 1997; Yang and Huang 2011; Corral-Rivas et al. 2014). The number of subsamples for estimating random parameters has been discussed in the literature, most of which showed that 3–5 subsamples are more appropriate for subject-specific prediction (Calama and Montero 2005; Yang and Huang 2011; Fu et al. 2013; Corral-Rivas et al. 2014). The analysis of data using one to eight branches for each revealed the accuracy of prediction analysis with four branches was similar to the results of the analysis with five or more branches. Therefore, as Calama and Montero (2005) and Corral-Rivas et al. (2014) suggested, four randomly selected branches were measured for random parameters estimation in our study. The accuracy of prediction for all the models in Table 7 can lead to the fact that combining the available measurement data and a prior knowledge of the variance–covariance structure of the random parameters is the optimal way to derive subject-special predictions.

One of the important messages presented in our analysis, why more complex models [Eqs. (17)–(20)] at Tables 4 and 5 performed worse than simpler models [Eqs. (15), (16)] when evaluating their fitting and prediction abilities using the traditional model selection criterias (e.g., R 2, RMSE), should cause our attentions. This phenomenon can be called as “law of diminishing returns” in term of predictive performance as more complex models are constructed. Foe example, the percentage of decreased for AIC values between Eqs. (19) and (15) was as high as 48.14 %, however this improvements can not be verified by the traditional criteria. Similar results were also found in the work by Zhao et al. (2005) and Yang and Huang (2011), the percentage of decreased for AIC values for their work were 10.38 and 7.6 %, respectively. There is no doubt that the models are increasingly being developed more complex in the field of forestry, however our study should that a simpler function may be also a good one from the perspective of application.

The models presented here allow for integration into a model system, as implemented in SILVA2.0 (Pretzsch et al. 2002), which can be used for predicting the effects of different stand conditions and management measures on the geometry of branches for natural Dahurian larch forest in northeast China. When applying the models, most of the predictions use variables that can be obtained from traditional mensurational data for tree-level descriptors as provided by sample-plot inventories. The outputs of DBH and HT, as well as other variables, are used as the values of the independent variables in the branch models. During the simulation, information on branch attributes of each sample tree is updated annually together with the other tree and stand variables. Some natural extensions of the work would be to more fully examine the impact of various culture treatments, site properties, stand origins and their nested interaction on the growth responses of branches. In addition, given the widespread attention paid to global climate change over the last decade or so, a potential need has arisen to develop climate-sensitive growth models for trees or branches.

Conclusions

Branch attributes are the key link between management measures, tree growth and wood quality; however, detailed studies of branch response to a wide range of site and stand conditions for natural forest are lacking. A total of 966 sample branches on 50 trees for natural Dahurian larch were monitored in this study, representing a variety of site and stand conditions throughout the northeast China. Branch diameter [Eq. (15)] and length [Eq. (16)] models were developed to estimate branch size of individual Dahurian larch trees in natural mixed uneven-aged stands in northeast China using a multi-level nonlinear mixed-effects model approach. Two tree variables (DBH and HT) were significant predictors of branch size of individual trees, hence were included in the models. By partitioning the residual variation via random-effects parameters modeling at the plot and tree levels, the models [Eqs. (15), (16)] were able to capture the variation successfully. The interaction between plot and tree played a more important role than plot alone. However, the heterogeneity and autocorrelation were not successfully removed by the frequently used variance functions [Eqs. (7)–(9)] and time series correlation structures [i.e., AR(1), MA(1) and ARMA(1,1)] for both models, respectively. The simulation and prediction accuracy of the nested plot and tree models were higher than the local and generalized models, as well as the mixed models with the appropriate variance function and correlation structure. Overall, our results showed that the multi-level nonlinear mixed-effects model performed better than the traditional local models and the generalized models for predicting natural Dahurian larch branch properties.

Author contribution statement

Conceived and designed the experiment: Zhaogang Liu. Analyzed the data: Lingbo Dong, Zhaogang Liu. Wrote the paper: Lingbo Dong, Zhaogang Liu, Pete Bettinger.