Introduction

Korean pine, Pinus koraiensis Siebold & Zucc., is one of the most economically important tree species in Northeastern China (Li and Löfgren 2000). Apart from the excellent wood properties of the species, its nuts have significant nutritional and therapeutic value (Zhang et al. 2015). The use of Korean pine nuts as a nutritional source can be traced back to the Paleolithic era in eastern Asia (Rosengarten 2004). China is a major producer of pine nuts, most of which are exported to Western countries (Destaillats et al. 2010), generating more than US $250 million each year (Man et al. 2012).

Because of its economic value, P. koraiensis is in high demand for reforestation in Northeastern China (Ren et al. 2018). Planting stock with desirable genetic properties is needed to meet this demand, but because large numbers of plants are difficult to produce by cuttage, tissue culture and other asexual methods, seed orchards have been an important step in improving the species (Wang and Hong 2004; Feng et al. 2010). In China, several P. koraiensis seed orchards were established in the early 1960s. To date, some orchards have produced improved varieties based on progeny tests or clonal evaluations (Liang et al. 2019). Because of the demand for timber over the past several years, improved varieties of P. koraiensis were selected based only on their growth traits or wood properties (Wang et al. 2018; Liang et al. 2019). However, since the official decree in 2016 against the cutting of natural forests, more attention has been paid to the seeds and cones of P. koraiensis.

Characteristics of the cone and seeds, nut numbers and shape, are important traits to evaluate for improving seed yield (Davis 1967; Matziris 1998; Roy et al. 2004; Salvatore et al. 2010; Rawat and Bakshi 2011). In this study, 110 clones of 38-year-old P. koraiensis were selected and 14 cone, seed and nut characteristics were investigated and analyzed (Table 1). The primary objectives were to: (1) compare the performance values of different P. koraiensis clones for each trait; (2) estimate the genotypic and variation parameters of different traits; (3) identify the relationship among different traits; and, (4) perform a comprehensive assessment of each clone and predict selection gains.

Table 1 ANOVA, genetic and variation parameters on 110 Pinus koraiensis clones in the Naozhi seed orchard

Materials and methods

Site description

Data were collected from the Naozhi forestry seed orchard located on the western hillside of Changbai Mountain, Linjiang City, Jilin Province, Northeast China (41°05′ N, 126°06′ E). Natural P. koraiensis forest and broad-leaved mixed forest are typical of this region at elevations of 700 to 1100 m. The region has a temperate continental monsoon climate; the average annual temperature, rainfall and frost-free period are 5 °C, 800 mm and 135 days, respectively. The soil is classified as Albi-boric Argosols according to the U.S. soil taxonomy (Xu et al. 2018); it is dominated by dark brown soil about 40 cm thick, with a texture of sand (15.1%), silt (63.3%) and clay (21.6%) (Zhu et al. 2010).

Materials

In 1979, 110 clones (Table 2) were selected on the basis of growth traits from the natural distribution of P. koraiensis in Linjiang City. Cuttings were grafted the following year and plantations were established with 4-year-old plants in the spring of 1984. The experimental design consisted of 10 large blocks, each containing five small blocks. Clones were planted in a randomized complete block design, with one tree per clone in each small block at a spacing of 7.0 m × 7.0 m.

Table 2 Comprehensive evaluation of traits investigated in 110 P. koraiensis clones

Trait measurements

The cones were collected at maturity. Cones from each ramet were counted (cone number, CN) and weighed directly (cone weight, CWei). Because cone number varies from year to year, it was recorded for 5 years (2014 to 2018) and the averages were used in the analysis. The other traits were measured only in the study year (2018). Cone length (CL) and width (CWd, measured diagonally at the widest portion) were measured with a digital caliper. Scales were counted per layer, and the number of layers per cone was recorded (CLN). After these measurements, seeds were removed and the number of seeds per cone per ramet was counted (CSN). To calculate the thousand seed weight (TSWei) of each clone, seeds from different ramets were mixed, and 400 seeds were randomly chosen and divided into four equal portions of 100 seeds, each of which was weighed. The thousand seed weight was obtained by extrapolation. To determine seed characteristics, 50 seeds from each clone were randomly chosen and measured for length (SL), width (SWd) and weight (SWei). Finally, nuts were extracted from the 50 seeds, and length (NL), width (NWd) and weight (NWei) were determined. Coat thickness (CTH) was measured 10 times for each seed using a Vernier caliper, and the average value was taken.
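The extrapolation to thousand seed weight can be sketched as follows. This is a minimal illustration, assuming that the four 100-seed replicate weights are averaged and scaled to 1000 seeds; the function name and the replicate weights are hypothetical and are not measured data.

```python
from statistics import mean

def thousand_seed_weight(replicate_weights_g, seeds_per_replicate=100):
    """Extrapolate thousand seed weight (g) from replicate weighings.

    Assumption: replicates are averaged, then scaled to 1000 seeds.
    """
    mean_weight = mean(replicate_weights_g)
    return mean_weight * 1000 / seeds_per_replicate

# Hypothetical weights (g) of four 100-seed replicates for one clone
example_replicates = [71.0, 73.5, 72.0, 73.5]
tswei = thousand_seed_weight(example_replicates)
print(f"TSWei = {tswei:.1f} g")
```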

Data analyses

One-way analyses of variance (F test) were performed for the 14 traits using R Software (Version 2.5-3). Clonal and environment/random effects were estimated using IBM SPSS, version 20 (Field 2013) and the linear model in Eq. (1) (Li et al. 2017):

$$ Y_{ij} = \mu + a_{i} + \varepsilon_{ij} , $$
(1)

where \( Y_{ij} \) is the performance of individual tree j of clone i, \( \mu \) is the overall mean, \( a_{i} \) is the clone effect, and \( \varepsilon_{ij} \) is random error. The phenotypic coefficient of variance \( \left( {\text{PCV}} \right) \) in Eq. (2) expresses the range of individual tree performances relative to the clonal mean values for a given trait. PCV was calculated as the ratio of the standard deviation \( \left( {\text{SD}} \right) \) to the average value \( \left( {{\bar{\text{X}}}} \right) \) (Brancher et al. 2019).

$$ {\text{PCV}} = \frac{\text{SD}}{{{\bar{\text{X}}}}}, $$
(2)
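As an illustration of Eq. (2), the following sketch computes PCV for one trait. It assumes the sample standard deviation (n − 1 denominator) and reports the ratio as a percentage, as in the Results; the function name and input values are hypothetical.

```python
from statistics import mean, stdev

def pcv(values):
    """Phenotypic coefficient of variance: SD / mean, as a percentage.

    Assumption: SD is the sample standard deviation (n - 1 denominator).
    """
    return stdev(values) / mean(values) * 100

# Hypothetical clonal averages for one trait
example_values = [10.0, 12.0, 14.0]
print(f"PCV = {pcv(example_values):.1f}%")
```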

Repeatability was calculated from the clone averages to estimate the stability of traits across ramets (Xiao et al. 2019). It was obtained by subtracting the reciprocal of the F value from the ANOVA (\( 1/{\text{F}} \)) from unity (Zheng et al. 2015).

$$ {\text{R}} = 1 - \frac{1}{\text{F}} $$
(3)
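The link between the one-way model in Eq. (1) and the repeatability in Eq. (3) can be sketched as below. This is a minimal illustration, assuming F is the ratio of the between-clone to the within-clone mean square; the function names and the toy data are hypothetical.

```python
from statistics import mean

def one_way_f(groups):
    """F statistic of a one-way ANOVA; `groups` maps clone -> ramet values."""
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = mean(all_values)
    k = len(groups)          # number of clones
    n = len(all_values)      # total number of ramets
    ss_between = sum(
        len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values()
    )
    ss_within = sum(
        (v - mean(vals)) ** 2 for vals in groups.values() for v in vals
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

def repeatability(f_value):
    """Eq. (3): R = 1 - 1/F."""
    return 1 - 1 / f_value

# Hypothetical trait values for three clones with three ramets each
toy = {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0], "C": [7.0, 8.0, 9.0]}
f_value = one_way_f(toy)
r_value = repeatability(f_value)
print(f"F = {f_value:.2f}, R = {r_value:.3f}")
```

An F value close to 1 gives R near zero, which is why traits with weak clonal differences show low repeatability.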

The relationships between traits were estimated by Pearson’s correlation coefficient (\( {\text{r}}_{{{\text{A}}\left( {{\text{x}},{\text{y}}} \right)}} \)), calculated as the covariance of the two traits \( {\text{COV}}_{{{\text{P}}\left( {{\text{x}},{\text{y}}} \right)}} \) divided by the product of their standard deviations \( \upsigma_{\text{p}} \left( x \right) \times\upsigma_{\text{p}} \left( y \right) \) (Zhao et al. 2016).

$$ {\text{r}}_{{{\text{A}}\left( {{\text{x}},{\text{y}}} \right)}} = \frac{{{\text{COV}}_{{{\text{P}}\left( {{\text{x}},{\text{y}}} \right)}} }}{{\upsigma_{\text{p}} \left( {\text{x}} \right)\upsigma_{\text{p}} \left( {\text{y}} \right)}}, $$
(4)
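Eq. (4) can be illustrated with a short sketch that computes the coefficient directly from its definition. The function name and the input values are hypothetical; the common n − 1 factor in the covariance and the standard deviations cancels, so sums of products are used.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation: covariance over the product of standard deviations."""
    mx, my = mean(x), mean(y)
    sum_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sum_xx = sum((a - mx) ** 2 for a in x)
    sum_yy = sum((b - my) ** 2 for b in y)
    return sum_xy / sqrt(sum_xx * sum_yy)

# Hypothetical clonal means of two traits
seed_length = [15.0, 16.0, 17.0, 18.0]
nut_length = [12.0, 13.0, 14.0, 15.0]
print(f"r = {pearson_r(seed_length, nut_length):.3f}")
```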

Factor analysis by principal components was performed on the correlation matrix using PAST software (Hammer and Harper 2001) to determine the major traits that explained most of the observed variance. These traits were the most suitable for generating the clone selection index (Aït-Sahalia and Xiu 2019). The highlighted traits were then used to perform a comprehensive evaluation using the Qi value, obtained as the square root of the sum of the ratios \( {\text{a}}_{\text{i}} \) of each clonal average to the maximum value of the corresponding trait (Zhao et al. 2016). A selection rate of 10% was applied to retain elite clones for future seed production.

$$ {\text{Qi}} = \sqrt {\mathop \sum \nolimits_{{\text{j}} = 1}^{{\text{n}}} {\text{a}}_{{\text{i}}} } , $$
(5)
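The Qi evaluation and the 10% selection step can be sketched as follows. This is an illustration only, assuming that each ratio is the clonal average divided by the largest clonal average for that trait; the function names, clone labels and values are hypothetical.

```python
from math import sqrt

def qi_values(clone_means):
    """Qi per clone; `clone_means` maps clone -> {trait: clonal average}.

    Assumption: each ratio is clonal average / maximum clonal average of the trait.
    """
    traits = list(next(iter(clone_means.values())))
    trait_max = {t: max(m[t] for m in clone_means.values()) for t in traits}
    return {
        clone: sqrt(sum(m[t] / trait_max[t] for t in traits))
        for clone, m in clone_means.items()
    }

def select_elite(qi, rate=0.10):
    """Return the clones with the highest Qi at the given selection rate."""
    n_keep = max(1, round(len(qi) * rate))
    ranked = sorted(qi, key=qi.get, reverse=True)
    return ranked[:n_keep]

# Hypothetical clonal averages for two traits
example = {
    "clone_a": {"SL": 16.0, "SWei": 0.8},
    "clone_b": {"SL": 12.0, "SWei": 0.4},
}
qi = qi_values(example)
elite = select_elite(qi, rate=0.10)
print(qi, elite)
```

With 110 clones and a rate of 0.10, `select_elite` keeps 11 clones.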

Selection success was verified by the genetic gain obtained for the selected clones. The genetic gain was calculated as the product of the repeatability \( {\text{R}} \) and the ratio of the selection difference \( {\text{W}} \) to the overall mean \( {\bar{\text{X}}} \) for a given trait (Gonçalves et al. 2019).

$$ \Delta {\text{G}} = {\text{R}} \times {\text{W}}/{\bar{\text{X}}}, $$
(6)
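A minimal sketch of Eq. (6) follows. It assumes that the selection difference W is the mean of the selected clones minus the overall mean, and it reports the gain as a percentage; the function name and the input values are hypothetical.

```python
from statistics import mean

def genetic_gain(all_clone_means, selected_clone_means, repeatability):
    """Eq. (6): gain = R * W / overall mean, returned as a percentage.

    Assumption: W = mean of selected clones - overall mean.
    """
    overall = mean(all_clone_means)
    w = mean(selected_clone_means) - overall
    return repeatability * w / overall * 100

# Hypothetical clonal means for one trait; the best clone is selected
all_means = [10.0, 12.0, 14.0, 16.0, 18.0]
gain = genetic_gain(all_means, [18.0], repeatability=0.9)
print(f"Genetic gain = {gain:.1f}%")
```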

Finally, the expected economic gain of the Naozhi seed orchard was considered as a function of seed yield. It was estimated as the price per kilogram of seeds multiplied by the recorded production and the realized gain in seed weight.

$$ \Delta {\text{Eco}}. = {\text{Price}} \times {\text{unitary}}\;{\text{seed }}\;{\text{weight}} \times {\text{realized}}\;{\text{gain }}\;{\text{in}}\;{\text{seed }}\;{\text{weight}}, $$
(7)

Results

Average values of different traits

Variations in the traits were distinct among clones, with moderate to high variation between clone averages. Overall mean values, minimum and maximum values, standard deviations (SD) and phenotypic coefficients of variance (PCV) are shown in Table 1. With regard to cone characteristics, the overall means of cone number, length and width were 18.1, 132.9 mm and 70.4 mm, with values ranging from 11.0 to 26.0, 11.6 to 201.1 mm and 24.5 to 125.4 mm, respectively. Cone weight varied from 80.0 to 813.0 g, with a mean of 216.7 g. The number of layers per cone and seeds per cone ranged from 5 to 24 layers and from 42 to 268 seeds, with overall averages of 13.5 layers and 137.5 seeds, respectively. Single-seed traits averaged 16.4 mm, 10.8 mm, 0.9 mm and 0.7 g for length, width, coat thickness and weight, respectively. The thousand seed weight was 724.9 g. For nut traits, the average length, width and weight were 13.4 mm, 7.1 mm and 0.3 g, respectively.

ANOVA, PCV and R

ANOVA results show that clone performances were significantly different for the traits at probability levels of 0.001 or 0.01 (Table 1). The phenotypic coefficient of variance (PCV) for all cone, seed and nut traits varied from 9.1 to 34.3% (Table 1). The highest PCV (34.3%) was recorded for cone number, followed by 29.8% for nut weight and 25.7% for cone weight. The lowest PCVs, below 15%, were found for seed length (9.1%), nut length (9.7%), nut width (11.0%) and seed width (11.0%). The repeatability values (R) for all traits ranged from 27.5 to 93.4% (Table 1), with most traits having high repeatability (R > 80%). Cone traits (CL, CWd, CWei, CLN and CSN) were strongly inherited (R ≥ 90%), as were seed traits (SL, SWd, SWei and TSWei; R > 80%). The lowest R value (27.5%) was recorded for cone number.

Correlation analysis

The correlation coefficients between the investigated traits are shown in Table 3. Cone number (CN) was negatively correlated with the other traits, with coefficients ranging from −0.382 (CN with nut length, NL) to −0.005 (CN with seeds/cone/ramet, CSN). Cone length (CL), width (CWd) and weight (CWei) were positively correlated with each other and with the other traits apart from CN (r ranged from 0.196 to 0.799). In addition, the coefficients among CN, seed length (SL), seed width (SWd), seed weight (SWei), nut length (NL), nut width (NWd), nut weight (NWei), seed coat thickness (CTH) and thousand seed weight (TSWei) ranged from −0.382 (CN with NL) to 0.937 (SL with NL). There were no significant correlations between layers/cone (CLN) and the other traits except for CL, CWd, CWei and seeds/cone/ramet (CSN), which showed significant positive correlations (r ranging from 0.311 to 0.647). Excluding SWd, SWei, NWei, CTH and TSWei, CLN was not significantly correlated with the other traits (r ranged from −0.369 to 0.565).

Table 3 Pearson correlation coefficients among the investigated traits in 110 P. koraiensis clones

Principal component analysis

Three principal components with high eigenvalues were obtained from the principal component analysis (PCA), with a cumulative contribution of 75.56% (Table 4). In the first principal component (PC I), seed and nut traits (SL, SWd, SWei, NL, NWd, NWei and TSWei) were the most important, with high component values (0.73 to 0.94). In the second component (PC II), number of layers, seeds per cone, cone length and cone width had high positive values (> 0.51). In PC III, only cone number per clone had a high value.

Table 4 Variance and cumulative variance of different components

Comprehensive evaluation and genetic gain

The number of cones is one of the most important traits of P. koraiensis, but the correlation analysis indicated that cone number was negatively correlated with the other traits; consequently, cone number was evaluated separately. Eleven clones (PK14, PK75, PK24, PK88, PK44, PK4, PK5, PK103, PK82, PK72, PK2) showed the highest performance at a selection rate of 10% based on cone number, and the genetic gain was 7.2% (Table 5). PCA results show that cone, seed and nut traits, including SL, SWd, SWei, NL, NWd, NWei, TSWei, CLN and CSN, could be used together to perform a comprehensive evaluation of P. koraiensis clones. The Qi values of the clones based on the selected traits are shown in Table 2. Eleven clones (PK52, PK60, PK110, PK49, PK84, PK101, PK42, PK106, PK3, PK56 and PK39) were selected as elite clones at a selection rate of 10%. The corresponding genetic gains ranged from 6.2 to 24.3%. The greatest genetic gain was found for seed weight (24.3%), followed by nut weight (22.7%), cone weight (17.9%) and cone width (14.1%) (Table 5).

Table 5 Component values and genetic gain of different traits

Discussion

ANOVA and variation parameters of different traits

Average values and their comparison through variance analysis enable the selection of genetic resources in tree improvement research. Analysis of variance helps to characterize phenotypic variations and variance components among individual trees (Guerra et al. 2016), while average performance values are used to rank improved material (clones or families) for selection on different traits (Hietz et al. 2017). Cones, seeds and nuts are the most important traits considered by scientists and foresters in the breeding of economically important coniferous tree species, but their use and output differ because of the sizes and nutritional requirements of different species. The average values for the cone traits (number, length, width, weight) in this study established a ranking on which selection can be based. These values are higher than those in studies on Pinus halepensis Mill. (Matziris 1998), Pinus roxburghii Sarg. (Roy et al. 2004) and Pinus sylvestris L. (Sivacioglu 2010). Results from previous studies indicate that P. koraiensis seeds are more suitable for consumption, supporting the basis of the current study for a more in-depth characterization of seed traits for a breeding program in Northeast China.

Variation parameters such as the phenotypic coefficient of variation and repeatability are basic elements for selecting material for improvement programs, as they explain the significance of the differences among materials (Pan et al. 2019) and the range of variability and stability of traits in tree populations (Bai et al. 2019). The traits investigated in this study showed significant differences among clones, indicating that the methods of this study were effective. Our results are in accordance with previous studies that estimated the variation parameters in cone and seed traits for Pinus wallichiana A. B. Jacks. (Rawat and Bakshi 2011), P. sylvestris L. (Sivacioglu and Ayan 2008) and P. koraiensis (Miyaki 1987). The repeatability of most of the traits was higher than 80%, indicating that the production and quality of seeds from P. koraiensis are stable. Our repeatability values are similar to those of Yuan et al. (2016) for Pinus tabuliformis Carr. However, they were markedly higher than those reported by Matziris (1998) and Sivacioglu (2010) for P. sylvestris and P. halepensis with 70% and 80% heritability, respectively.

Correlation analysis

Correlation coefficients reveal the degree of correlation between different traits and are thus useful for the selection of multiple characters (Xia et al. 2016). In this study, cone number was negatively correlated with all the other traits. This is in contrast to Jiang et al. (2019), who concluded for P. koraiensis that cone number was significantly and positively correlated with the number of seeds per cone and thousand seed weight. Their results may differ from ours because of differences in investigative methods and because their trees were younger. In this study, cone number was recorded for five consecutive years; our results are therefore more representative and are consistent with the logic of energy conservation, indicating that the evaluation and selection of the materials were more stable and effective. The number of cone layers and seed number are two important traits that can affect seed yield, although both have rarely been investigated. In this study, both traits were significantly correlated with other cone traits but weakly correlated with seed or nut traits, which indicates that they have little effect on seed or nut traits. Seed traits showed a significant positive correlation with nut traits, corroborating similar results reported for P. sylvestris (Sivacioglu and Ayan 2008; Sevik and Topacoglu 2015). Therefore, we suggest that seed size can affect the size of nuts and that both traits can be evaluated and selected together.

Principal component analysis (PCA)

PCA is an important tool because it reduces a large number of variables to a small number of major components that summarize a large portion of the variation in a complex dataset (Aït-Sahalia and Xiu 2019). The method is very helpful for multi-trait selection because it identifies the traits that contribute most to the variation of the selected materials (Nardo et al. 2005). In this study, the first two principal components captured 68.7% of the total variance, slightly higher than the results of Salvatore et al. (2010) and Turna (2004) for the morphological traits of cones and seeds of P. halepensis and Picea orientalis (L.) Link, respectively. This may be caused by the significant correlation between these traits. PC I represents seed and nut traits, whereas PC II and PC III represent cone traits and cone number, respectively. This finding confirms the correlation results in this study, and similar results have been found for growth traits and wood properties (Wang et al. 2018), indicating that principal component analysis is effective.

Comprehensive assessment and genetic gain

Breeding targets determine the research methods used in tree genetics. P. koraiensis is an important economic tree species; therefore, seed yield is a critical trait when evaluating clones (Jiang et al. 2018). Although comprehensive assessment is an important method for selecting elite material when analyzing multiple traits, it may reduce genetic gain when many weakly correlated traits are considered (Pollak et al. 19874). When traits are negatively correlated, they should be separated in the evaluation of elite materials (Yin et al. 2017). In this study, cone number was significantly negatively correlated with the other traits; therefore, cone number should be evaluated alone according to its significance among the investigated traits. Using cone number and the other traits (according to PC I results) as separate selection criteria at a selection rate of 10%, 11 clones were selected on cone number and 11 on Qi values, giving 22 elite clones in total. Based on cone number, the genetic gain was 7.2%, lower than that found by Jiang et al. (2018) in the same seed orchard. This may be due to the use of single-year data and periodic hardening in this species by Jiang et al. (2018). Based on the other traits, genetic gains ranged from 6.2 to 24.3%, which are lower than the results for Pinus pinea L. (Mutke et al. 2005) and higher than those for Pinus gerardiana (Singh 1992). These differences in genetic gain may be due to tree species, selection rates and environments, but in general, the genetic gain was the expected selection effect in the P. koraiensis genetic improvement program.

Economic benefit analysis

The most important purpose of tree breeding is to promote ecological and economic benefits (Dyjakon 2019). The economic gains that can be generated from improved breeding material are of considerable significance in evaluating whether the investment can be recovered (Ivković et al. 2010). For P. koraiensis, aside from ecological benefits, improved wood and seed yields are two important economic benefits. With the ban on cutting natural P. koraiensis stands, seed yields of improved clones have even greater importance. Considering that the genetic gains for cone number (7.2%), cone weight (17.9%), seed weight (24.3%) and thousand seed weight (11.4%) averaged more than 11.0%, a value of 10.0% may be assumed for the estimated genetic gain of the selected clones when calculating economic benefit. Currently, the price of P. koraiensis seeds in Northeastern China is approximately US $14/kg. With an average seed yield of 1.1 kg/tree/year (average cone number × seed number per cone × thousand seed weight/1000), the estimated economic gain is US $1.76/tree. If elite trees are planted at a spacing of 7 m × 7 m on 10,000 hectares, the estimated economic gain after 40 years will be US $3.52 million per year.
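The arithmetic behind these estimates can be retraced with a short sketch of Eq. (7). It uses only the figures quoted above; the function name is hypothetical, and which gain value and planting density entered the published totals is an assumption here. With the quoted price and yield, a 10.0% gain gives about US $1.54 per tree, whereas a gain of about 11.4% gives approximately US $1.76 per tree. A 7 m × 7 m spacing corresponds to about 204 trees per hectare, and a rounded density of 200 trees per hectare gives US $3.52 million on 10,000 hectares.

```python
def economic_gain_per_tree(price_usd_per_kg, seed_yield_kg_per_tree, gain_fraction):
    """Eq. (7): price x seed yield per tree x realized gain (as a fraction)."""
    return price_usd_per_kg * seed_yield_kg_per_tree * gain_fraction

PRICE = 14.0          # US $/kg, as quoted in the text
SEED_YIELD = 1.1      # kg/tree/year, as quoted in the text

per_tree_gain_10 = economic_gain_per_tree(PRICE, SEED_YIELD, 0.10)
per_tree_gain_114 = economic_gain_per_tree(PRICE, SEED_YIELD, 0.114)

# Trees per hectare at 7 m x 7 m spacing (10,000 m2 per hectare)
trees_per_ha = 10_000 / (7.0 * 7.0)

# Assumption: a rounded density of 200 trees/ha and US $1.76/tree
total_gain = 1.76 * 200 * 10_000

print(f"Gain at 10.0%: US ${per_tree_gain_10:.2f}/tree")
print(f"Gain at 11.4%: US ${per_tree_gain_114:.2f}/tree")
print(f"Trees per hectare: {trees_per_ha:.0f}")
print(f"Total on 10,000 ha: US ${total_gain:,.0f}/year")
```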

Conclusions

The selection of elite or plus trees for a breeding program is a critical step to increase genetic quality and thus reach improvement goals. The range of consumptive uses of P. koraiensis seeds constitutes a strong motive for tree improvement. In this study, 14 traits of cones, seeds and nuts were examined in 110 P. koraiensis clones to evaluate clone variability and productivity. The results show significant, moderate to high variation among clones in cone, seed and nut traits, which supports the selection method. Positive and significant correlations were observed among cone, seed and nut dimensions, but cone number was weakly negatively correlated with cone, seed and nut dimensions, suggesting that cone number should be selected independently and that the correlated traits together support an efficient, indirect clone selection. Phenotypic coefficients of variance were slightly higher for cone characteristics than for those of seeds and nuts. High repeatability was observed for cone, seed and nut traits. These repeatability values and phenotypic coefficients of variance suggest substantial genetic variation among clones and indicate that the selection of quality materials was feasible. Cone, seed and nut dimensions and cone number were separately identified as indices in the selection evaluation of clones based on correlation and principal component analysis. Thus, 22 elite clones were selected at a 10% selection rate, corresponding to genetic gains of 6.2 to 24.3% in cone, seed and nut traits. These results provide useful information for selecting high-quality clones, and the elite clones can supply resources for planting.