Introduction

Brazil stands out worldwide in the production of cellulose and paper (IBA 2019). The country has a large territory and climatic variations that challenge plant breeders in selecting the best genotypes, with desirable characteristics, for different locations or regions (Marcatti et al. 2017).

The species of the genus Eucalyptus, the main ones for the production of cellulose and paper in the country, have been indicated mainly for areas of higher temperatures (IBA 2019; Fonseca et al. 2010). However, the discovery of species of the same genus that show frost tolerance and good silvicultural characteristics, such as rapid growth, has enabled the expansion of planting areas to the south of the country (IBA 2019). In this context, the species Eucalyptus benthamii has stood out as having great potential for planting in cold areas (Brondani et al. 2010).

Eucalyptus benthamii is a species that stands out for its good growth and tolerance to frost (Higa 1999). However, the genetic breeding of this species is recent, and the initial focus is on obtaining intraspecific and interspecific hybrids that add relevant characteristics to increase productivity, frost tolerance and, mainly, wood quality, with an emphasis on cellulose pulp production (Estopa et al. 2017).

The study of the potential of a plant population makes selection in a breeding program more efficient, allowing satisfactory genetic gains and identifying, among other things, the potential of the population for advancing generations (Cruz et al. 2011). In many cases, measuring characteristics in all individuals of a population is costly and time-consuming, as is the case for the chemical characteristics of wood (Nunes et al. 2016). One strategy is to take a representative sample and measure only a few individuals. To verify the potential of this sample, we can use the same statistics used to study the potential of a population, saving time and making the work more efficient.

Uni-, bi-, and multivariate statistical analyses can assess the potential of a sample. We can generate information through descriptive analysis, associations between characteristics, and factor analysis. The results of these analyses, individually and jointly, help to understand the characteristics, allowing better visualization and even a reduction in the number of characters because of the biological redundancy between them (Cruz et al. 2011).

Laboratory analysis of wood characteristics, such as chemical and physical characteristics, is laborious and requires several days to determine each attribute according to the standards applied to those processes (TAPPI T-21 OM-2 2002; TAPPI 280 PM-1999; TAPPI 222 OM-2002; TAPPI UM-250 1991; TAPPI T223 CM-1999). It is an expensive process, usually applied to a limited number of individuals, and often demands the total loss of the sampled individual, making it a destructive method (Nunes et al. 2016). Thus, methods that are non-destructive, or that sample a small part of the individuals, and that circumvent these problems without reducing the accuracy of selection are fundamental for the success of a forest improvement program. In addition, studying and associating the characteristics of wood using multivariate techniques can facilitate the interpretation of data structures, reduce losses of information, allow the use of the wood to be recommended, and help breeders understand the relationships between the numerous characteristics when making decisions (Silva et al. 2016).

Despite the importance of these characters and of the species, few studies have examined the potential of sampling in the evaluation of phenotypes in E. benthamii trees using uni-, bi-, and multivariate techniques. In view of the above, the objective of this study was to evaluate the potential of a sample of E. benthamii individuals for suitability of use, based on technological and growth characteristics of the wood, through uni-, bi-, and multivariate techniques.

Materials and methods

Material and characteristic measurements

The genetic material sampled comes from a 4-year-old E. benthamii progeny test carried out on the premises of the company CMPC Celulose Riograndense, located in the municipality of Encruzilhada do Sul, in the state of Rio Grande do Sul. Seventy-five individuals of E. benthamii and four individuals from each of three other species used as controls (E. saligna, E. grandis, and E. dunnii) were sampled, totaling 87 individuals.

The properties that express the wood quality and silvicultural characterization were conventionally determined for the 87 wood samples and constituted the set of dependent variables. The samples were analyzed at CMPC Celulose S.A., at the Santa Fé plant, in the city of Nascimiento, Chile.

Fifteen characteristics were evaluated, four related to growth and 11 to wood technology. The growth characteristics were: DBH (diameter at 1.30 m above the ground), height, volume, and mean annual increment (MAI). The technological characteristics of the wood were: total yield, ash content, extractive content in acetone, extractive content in water, total extractive content, pentosan content, Klason lignin, total lignin, holocellulose, basic density, and kappa number.

The DBH was measured directly on the tree stem at 1.30 m above the ground, and the height was measured with a hypsometer.

To calculate the tree volume (VOL, m3), the Schumacher and Hall equation (1933) was used as described below:

$${\text{VOL}} = \frac{{\pi \times {\text{DBH}}^{2} \times {\text{Height}} \times f}}{40000}$$

where VOL, volume of the tree in m3; DBH, diameter at breast height in cm; Height, total height of trees in m; f, form factor adopted (0.405); and π = ratio between the circumference and diameter of a circle (3.14159).

The mean annual increment (MAI, m3 ha−1 year−1) was calculated from the VOL of each tree in the experiment. With the 3.0 × 2.0 m spacing, each tree occupies 6 m2, so the volume was extrapolated to 1 ha (10000 m2) and divided by age (4 years), which gives the denominator 6 × 4 = 24:

$${\text{MAI}} = \frac{{{\text{VOL}} \times 10000}}{24}$$

where MAI, mean annual increment (m3 ha−1 year−1); VOL, volume of the tree in m3.
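The two formulas above can be sketched in a few lines of Python. This is only an illustration under the stated assumptions (form factor 0.405, 3.0 × 2.0 m spacing, age of 4 years); the function names and the example tree (DBH of 15 cm, height of 20 m) are hypothetical and are not taken from the data set.

```python
import math

FORM_FACTOR = 0.405            # form factor f adopted in the text
AREA_PER_TREE_M2 = 3.0 * 2.0   # 3.0 x 2.0 m spacing
AGE_YEARS = 4                  # age of the progeny test


def tree_volume(dbh_cm: float, height_m: float, f: float = FORM_FACTOR) -> float:
    """Volume (m3) = pi * DBH^2 * Height * f / 40000, with DBH in cm and height in m."""
    return math.pi * dbh_cm ** 2 * height_m * f / 40000.0


def mean_annual_increment(vol_m3: float) -> float:
    """MAI (m3 ha-1 year-1): tree volume scaled to 1 ha and divided by age."""
    trees_per_ha = 10000.0 / AREA_PER_TREE_M2
    return vol_m3 * trees_per_ha / AGE_YEARS


# Hypothetical tree, for illustration only.
vol = tree_volume(15.0, 20.0)
mai = mean_annual_increment(vol)
print(f"VOL = {vol:.4f} m3, MAI = {mai:.2f} m3 ha-1 year-1")
```

The division by 40000 converts DBH from centimetres to metres (a factor of 100 squared) and diameter to radius (a factor of 2 squared), so that the basal area is expressed in m2.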

The basic density was determined according to the SCAN-CM 43:95 standard, which is based on the Archimedes principle.

For chemical analysis, the chips selected from the accepted fraction were ground in a Retsch knife mill, and the material was then sieved according to the TAPPI T 257 standard; the fraction ≥ 40 mesh was collected and used for analysis, according to the standards detailed in Table 1.

Table 1 Standards used for chemical analysis of milled wood

The kappa value was determined according to the ISO 302:2004 standard, using dry pulp after calculating yield.

Evaluation of population potential

Descriptive analysis

This analysis sought to assess, for the various characteristics under study, the phenotypic potential to be explored by selection, based on the variability expressed by the range of variation. The descriptive analysis contains information on the mean, variance, minimum, maximum, coefficient of variation, standard deviation, and amplitude of the growth and quality characteristics.
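As an illustration, the statistics listed above could be computed as in the sketch below. It assumes the sample variance (divisor n − 1), which the text does not specify; the function name and the basic density values are hypothetical and only serve as an example.

```python
import statistics


def describe(values: list[float]) -> dict[str, float]:
    """Summary statistics for one characteristic, as listed in the text."""
    mean = statistics.mean(values)
    variance = statistics.variance(values)  # sample variance (n - 1): an assumption
    sd = variance ** 0.5
    return {
        "mean": mean,
        "variance": variance,
        "sd": sd,
        "cv_percent": 100.0 * sd / mean,     # coefficient of variation
        "min": min(values),
        "max": max(values),
        "amplitude": max(values) - min(values),
    }


# Hypothetical basic density values (kg/m3), for illustration only.
density = [357.0, 400.0, 415.0, 430.0, 491.0]
summary = describe(density)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```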

Association between characteristics

The study of the association between characteristics aimed to evaluate the performance of the set of growth and technological characteristics of wood, to assess whether there was any undesirable change in the relationships between the characteristics, and to identify groups of characteristics. It also aimed to highlight characteristics of economic or silvicultural interest, which are generally complex or more difficult to measure, and auxiliary characteristics for the purpose of indirect or correlated selection. Correlations were obtained together with their significance test, and the correlation network graph was generated.
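A minimal sketch of how a pairwise correlation and its test statistic might be obtained is given below. The text does not state which correlation coefficient or significance test was used, so Pearson's r and the usual t statistic with n − 2 degrees of freedom are assumptions here; the function names and the DBH and volume values are hypothetical.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r: float, n: int) -> float:
    """t = r * sqrt((n - 2) / (1 - r^2)); compared with a t table, n - 2 d.f."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical DBH (cm) and volume (m3) values, for illustration only.
dbh = [10.0, 12.0, 14.0, 16.0, 18.0]
volume = [0.05, 0.08, 0.10, 0.15, 0.19]
r = pearson_r(dbh, volume)
print(f"r = {r:.4f}, t = {t_statistic(r, len(dbh)):.2f}")
```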

The correlation matrix was represented in a correlation network, in which the links between variables were determined by the “adjacency matrix” A = h(R), with the following function: \(h(r_{ij}) = \frac{1}{2}\{ {\text{SGN}}(|r_{ij}| - \rho ) + 1\}\), where SGN is the sign function and \(\rho\) is the parameter that determines the minimum value that a correlation must present to be represented in the correlation network. In this work, as in the study by Silva et al. (2016), its value was set to zero, so that all connections between variables could be traced.

The thickness and color intensity of the edges were controlled by applying a cutoff value equal to 0.3, which means that only correlations with \(\left| {r_{ij} } \right| \ge 0.3\) have their lines highlighted. Finally, the positive correlations were colored in dark green, while the negative ones were depicted in red. Positive correlations indicate the tendency of one variable to increase when the other increases, whereas negative correlations indicate the tendency of one variable to increase while the other decreases.

In the correlation network, the variables are represented by nodes, which are connected by lines, each with a weight indicating the strength of the correlation; thus, the thicker and the longer the line, the greater the correlation, allowing the formation of groups of characters (Epskamp et al. 2012).
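The construction of the network links described above can be sketched as follows. The sketch reads the adjacency function as the hard threshold h(r) = ½[sign(|r| − ρ) + 1], which is an interpretation, and it sets the diagonal to zero to avoid self-links, which is also an assumption. The function names are hypothetical, and all values in the example matrix are invented except the total lignin and holocellulose correlation (−0.7622) reported in this study.

```python
def sign(x: float) -> int:
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def adjacency(r: list[list[float]], rho: float = 0.0) -> list[list[float]]:
    """A = h(R), h(r_ij) = 0.5 * (sign(|r_ij| - rho) + 1); diagonal forced to 0."""
    p = len(r)
    return [
        [0.0 if i == j else 0.5 * (sign(abs(r[i][j]) - rho) + 1) for j in range(p)]
        for i in range(p)
    ]


def network_edges(r, names, rho=0.0, cutoff=0.3):
    """Return (var_i, var_j, r_ij, colour, highlighted) for every linked pair."""
    a = adjacency(r, rho)
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if a[i][j] == 1.0:  # |r_ij| strictly above rho
                colour = "green" if r[i][j] > 0 else "red"
                highlighted = abs(r[i][j]) >= cutoff
                edges.append((names[i], names[j], r[i][j], colour, highlighted))
    return edges


# Illustrative correlation matrix (hypothetical values except Ligt-Hol).
names = ["Vol", "MAI", "Ligt", "Hol"]
R = [
    [1.00, 0.99, 0.10, -0.05],
    [0.99, 1.00, 0.12, -0.04],
    [0.10, 0.12, 1.00, -0.7622],
    [-0.05, -0.04, -0.7622, 1.00],
]
edges = network_edges(R, names)
for edge in edges:
    print(edge)
```

With ρ = 0, every non-zero correlation produces a link, and the 0.3 cutoff only decides which links are drawn with emphasis.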

Factor analysis

This analysis seeks to identify how many and which common factors are responsible for explaining the variation of the variables, to give a biological interpretation to these factors and, finally, to identify the performance of individuals by positioning them in scatter plots, usually built from the scores of pairs of factors representing the biological complexes to be inferred.

Factor analysis is a multivariate statistical technique that, based on the existing dependence structure between the variables of interest, which is in general represented by the correlations or (co)variances between these variables, allows the creation of a smaller set of variables (factors) obtained as a function of the original variables (Teixeira et al. 2015). In addition, it is possible to know how much each factor is associated with each variable (specific factors) and how much the set of factors explains the general variability of the original data, that is, common factors (Ferreira et al. 2010).

The factorial model adopted for an observable variable \(X_{i}\), with mean \(\mu_{i}\), can be represented as follows (Johnson and Wichern 2007; Silva et al. 2014):

$$X_{i } - \mu_{i} = l_{i1} F_{1} + l_{i2} F_{2} + \cdots + l_{im} F_{m} + \varepsilon_{i} ,$$

where \(i = 1, 2, \ldots , p\) and \(m \le p\), p being the number of original observable variables (equal to 15 in the present study); the coefficient \(l_{ij}\) is called the factor loading of the ith variable on the jth common factor, with \(j = 1, 2, \ldots , m\); \(F_{1} ,F_{2} , \ldots , F_{m}\) are the common factors, which are unobservable random variables; and \(\varepsilon_{i}\) are the random errors associated only with the ith variable \(X_{i}\).

The number of factors was defined considering an explanation percentage of 70% of the total variability that, according to Ferreira et al. (2010) and Teixeira et al. (2015), is sufficient to reduce data satisfactorily.
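The 70% criterion can be illustrated with the short sketch below, which assumes that the eigenvalues of the correlation matrix are already available. The function name and the eigenvalues are hypothetical; they were chosen only so that they sum to 15, the number of variables.

```python
def n_factors(eigenvalues: list[float], threshold: float = 0.70) -> int:
    """Smallest number of factors whose cumulative share of variance reaches threshold."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for k, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(eigenvalues)


# Hypothetical eigenvalues of a 15 x 15 correlation matrix, for illustration only.
eigs = [4.8, 3.1, 1.9, 1.2, 0.9, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.2, 0.1, 0.06, 0.04]
print(n_factors(eigs))
```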

The allocation of the variables to each factor was done through the loadings \(l_{ij}\), or factor loadings, which consist of the correlation between each variable and the respective factors. These values, like the simple correlation, vary between −1 and 1, and the greater the factor loading (in absolute value), the more correlated the variable is with the respective factor. Therefore, each variable is assigned to the factor with which it is most correlated (Teixeira et al. 2015).

Communalities were used to assess the proportion of the variance of each variable explained by the common factors, the remainder being attributed to random error. According to Filho and Júnior (2010), such values must be greater than 0.5. Finally, aiming at a better interpretation of the distribution of variables in the respective factors, Varimax rotation was used (Teixeira et al. 2015).
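A minimal sketch of the two rules above is given below: the communality of a variable is computed as the sum of its squared loadings over the retained factors, and each variable is assigned to the factor with the largest absolute loading. The function names and the loading matrix are hypothetical, and the rotation step is not reproduced here.

```python
def communalities(loadings: list[list[float]]) -> list[float]:
    """Communality of each variable: sum of squared loadings over the factors."""
    return [sum(value * value for value in row) for row in loadings]


def assign_to_factor(loadings: list[list[float]]) -> list[int]:
    """Index of the factor with the largest absolute loading for each variable."""
    return [max(range(len(row)), key=lambda j: abs(row[j])) for row in loadings]


# Hypothetical loadings (rows: variables; columns: two factors), illustration only.
names = ["DBH", "Vol", "Ligt", "Ash"]
loadings = [
    [0.95, 0.05],
    [0.97, -0.02],
    [0.10, 0.88],
    [0.30, 0.25],
]
h2 = communalities(loadings)
groups = assign_to_factor(loadings)
for name, value, factor in zip(names, h2, groups):
    flag = "ok" if value > 0.5 else "below 0.5"
    print(f"{name}: communality = {value:.4f} ({flag}), factor {factor + 1}")
```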

Computational aspects

All analyses were performed using the Genes Program (Cruz 2016), version 2019.

Results and discussion

Potential of the eucalyptus population based on silvicultural and technological characteristics of wood

The results of the descriptive analysis of the growth characteristics of E. benthamii are shown in Table 2. All growth characteristics (DBH, height, volume, and MAI) showed high variation and amplitude, which was expected, as this material comes from a progeny test. For DBH, the high variation shows that tree growth is directly influenced by the genetic factors of the species interacting with environmental factors such as climate, soil, topography, and competition (Finger 1992). The height of the trees is also influenced by several factors in addition to those already mentioned, for example, the response to fertilization (Andrade and Angelo 2016). The volume, which depends on DBH and height, is influenced by the same factors. The mean annual increment (MAI), which depends on the volume and is therefore influenced by the same factors, is useful for characterizing the annual growth of the forest, and in this sample averaged 40.733 m3 ha−1 year−1. Gomide et al. (2005), evaluating materials from several national companies, considered that material with an MAI above 40 m3 ha−1 year−1 is classified as high productivity.

Table 2 Potential for wood growth characteristics of the studied eucalyptus population represented by 75 individuals of E. benthamii

In general, for the growth characteristics of the wood, it is noticed that the population has good average potential associated with high variability, with prospects that efforts applied through improvement will be rewarded by genetic gains.

Table 3 describes the values presented by the materials used as controls, namely E. saligna, E. grandis, and E. dunnii. These species have a longer breeding history than E. benthamii. On average, the values of E. benthamii were higher than those of E. saligna and lower than those of the other two species. In terms of amplitude, E. benthamii was superior to the others, showing greater variability and confirming the possibility of genetic gains for these characteristics in this population.

Table 3 Potential expressed by means and variability for wood growth characteristics of the studied eucalyptus population represented by four individuals of E. saligna, four of E. grandis, and four of E. dunnii

The study of the mean and variability of the technological characteristics of the wood of the E. benthamii population is presented in Table 4. Variations in wood quality can significantly affect the industrial manufacturing process, from the production of the digester and the performance of the recovery boiler to the quality of the cellulose pulp (Gomide et al. 2010).

Table 4 Average values for wood quality characteristics for the studied eucalyptus population represented by 75 individuals of E. benthamii

The mean pulp yield was 48.984%, ranging from 42.600 to 52.800%. Normally, yields above 50% are considered high (Gomide et al. 2005). High yields can be explained by low values of basic density, extractives, lignin, uronic acids, and acetyl groups, and by high levels of cellulose and a high syringyl-to-guaiacyl ratio (Magaton et al. 2009).

The ash content showed a mean of 0.555%, and varied from 0.290 to 2.510%, being of low magnitude. For Moreira (2006), the amounts of inorganics in Eucalyptus are low and acceptable in various applications. This material in the wood is harmful to the pulping process, causing corrosion and incrustations in the equipment, reducing the calorific value and decreasing the industrial productivity (Jardim et al. 2017).

The contents of total extracts can be subdivided into extracts in acetone and extracts in water. In general, the problem with this variable is related to its viscosity and adherence to the equipment, which requires the whole process to be stopped for cleaning (Jardim et al. 2017). According to these same authors, extractive values between 1.9 and 4.9% are admissible. In the analyzed material, a variation of 1.380 to 7.550% was observed, which is considered high variation, with an amplitude of 6.520%.

The pentosan content ranged from 12.660 to 22.010%, with an average of 16.719%. The importance of the pentosan content of wood in the pulping process is linked to the relevance of the hemicellulose content (Garcia 1998). Hemicelluloses contribute to the yield and benefit inter-fiber bonding and the strength of the cellulosic pulp (Souza et al. 2016). According to Gomes et al. (2008), most processes for obtaining cellulose pulp seek to remove as little of this material as possible because of its benefits. Thus, materials with high pentosan content are desirable in the pulping process.

The lignin content is important for pulping performance, as the presence of phenolic compounds tends to increase the consumption of chemical reagents during the cooking process and to reduce the yield (Souza 2016). In this study, the levels of total lignin ranged from 28.110 to 34.400%, whereas those of Klason lignin, which has a chemical structure less resistant to degradation and solubilization, varied from 24.600 to 30.710%. Gomes et al. (2008) found values between 27.900 and 31.300% to be acceptable; however, for the production of cellulose, the lowest possible lignin content and a high S/G ratio are desirable.

Holocellulose refers to the combination of cellulose and hemicellulose contents. This characteristic constituted, on average, 64.843% of the weight of the wood, presenting wide variability, from 56.200 to 68.200%, in the evaluated material. For cellulose production, the higher the holocellulose content, the higher the yield and quality of the pulp, whereas for the production of charcoal the opposite occurs (Protásio et al. 2012).

The basic density of the wood represents the sum of several characteristics of the wood, making it difficult to establish perfect correlations involving cellulose production results (Gomide et al. 2005). In the group of individuals under analysis, the mean for this characteristic was 414.707 kg/m3, ranging from 357 to 491 kg/m3. Density values around 500 kg/m3 are considered satisfactory according to Gomide et al. (2005).

The kappa number is a relevant characteristic in wood cooking, being defined as the number of milliliters of 0.1 N potassium permanganate solution consumed per gram of absolutely dry cellulose pulp, under specific conditions, and corrected to a relative permanganate consumption of 50% (D’Almeida 1988). Table 4 shows a variation between 15.700 and 23.600, with a mean of 18.803. Ventorim et al. (2006) studied the variations in the kappa number and concluded that: (a) the reduction of this number during oxygen delignification, and its efficiency, varies between pulps; (b) kappa number reduction efficiencies were different in the five clones evaluated; (c) delignification efficiency was higher for higher kappa numbers; (d) the type of wood and its pulp has a significant effect on bleaching with oxygen and peroxide, but little effect on bleaching with ozone.

As with the growth characteristics, three species were used as controls, each represented by four individuals. Tables 5, 6, and 7 show the results of the descriptive analysis for the species E. saligna, E. grandis, and E. dunnii, respectively.

Table 5 Average values for wood quality characteristic for the studied eucalyptus population represented by four individuals of E. saligna
Table 6 Average values for wood quality characteristic for the studied eucalyptus population represented by four individuals of E. grandis
Table 7 Average values for wood quality characteristic for the studied eucalyptus population represented by four individuals of E. dunnii

In relation to the mean, in Table 5, E. saligna was superior to E. benthamii in the characteristics of yield, pentosans, holocellulose, basic density, and kappa number. As for the amplitude of the data, which is reflected in the variance, E. benthamii was superior to E. saligna.

In relation to the mean, in Table 6, E. grandis was superior to E. benthamii in the characteristics of yield, Klason lignin, total lignin, and holocellulose. As for the amplitude of the data, which is reflected in the variance, E. benthamii was superior in almost all characteristics, except for Klason lignin and total lignin.

In relation to the mean, in Table 7, E. dunnii was superior to E. benthamii in the characteristics of yield, ash content, extractives in water, pentosans, holocellulose, and basic density. As for the amplitude of the data, which is reflected in the variance, E. benthamii was superior to E. dunnii. Thus, E. benthamii has great possibilities of gains because the amplitude and variance of its data are greater than those of the other species.

The studied population has good phenotypic potential to be explored by selection, in view of the favorable average level it has already reached for the various characteristics studied and the variability expressed by the range of variation. However, improvement must be conducted with a view to good performance for a set of characteristics, since changes in one or a few characteristics can cause undesirable changes in others. In addition, in each group of characteristics, some of greater economic or silvicultural interest can be highlighted, generally complex or more difficult to measure, as well as auxiliary characteristics that could be useful for indirect or correlated selection. The most relevant associations between characteristics are therefore discussed below, considering the correlation estimates obtained.

Association between silvicultural and technological characteristics of wood

The estimates of the correlations between the variables under study are shown in Fig. 1, with approximately 34.90% being significant. Among these, we highlight some characteristics that are correlated with most of the others and that are important in the selection process. The greatest correlation found was between volume and MAI, which was expected, since the latter depends on volume. In addition, silvicultural characteristics were highly correlated with each other. Of the significant relationships, it is worth mentioning that the correlation between total lignin and holocellulose was negative (r = −0.7622). The main constituents of wood are cellulose, lignin, and hemicellulose, with holocellulose being the sum of cellulose and hemicellulose, which justifies this negative correlation.

Fig. 1

The upper part of the main diagonal of the figure shows the values of the correlations between the characteristics. Below the main diagonal are the scatter plots and trend lines between the characteristics. On the main diagonal are the 15 characteristics, along with their distributions (Yield, yield of wood; Extace, extractives in acetone; Extwa, extractives in water; Extt, total extracts; Pent, pentosans; Ligkl, Klason lignin; Ligt, total lignin; Hol, holocellulose; Dens, density; Kappa, kappa number; DBH, diameter at 1.30 m above the ground; Vol, volume; MAI, mean annual increment)

The pulp yield was the characteristic correlated with the largest number of others (80.00% of the variables): the extracts in acetone, in water, and total; Klason and total lignin; holocellulose; basic density; DBH; height; volume; and MAI. It is a characteristic often used to define the quality of the wood when the focus is low-cost production, since it combines several effects of the characteristics of the wood in just one parameter (Magaton et al. 2009). On the other hand, according to these same authors, the pulp yield itself should not be considered a characteristic of wood, as it is influenced by the pulping technique and kappa number, among other process factors.

The basic density is the direct quantification of the woody material per unit volume, being related to the properties and technological characteristics of the wood (Alves et al. 2011). This characteristic was correlated with pulp yield, ash, total lignin, holocellulose, and all wood growth characteristics. For the production of cellulose, wood with uniform density is desirable because the speed of impregnation and delignification of the chips is influenced by the specific mass (Alves et al. 2011).

In the pulping process, lignin is the component that is sought to be removed. The lignin contents were negatively correlated with holocellulose (cellulose and hemicellulose) and with yield. The lignin content was expected to be negatively correlated with pulp yield, which was the case (Table 7); however, this is not always true (Magaton et al. 2011; Gomes et al. 2008).

The growth characteristics are highly correlated. This result is due to the fact that the volume is a quadratic function of the DBH and the MAI is a function of the volume, corroborating Nunes et al. (2016). Similar results were found by Verma and Sharma (2011), Luna and Bikram (2009), and Behera et al. (2017). The high positive correlation between the characteristics indicates that the improvement of one characteristic can be accompanied by the improvement of another (Behera et al. 2017). This correlation makes it possible to practice indirect selection aiming at gains in volume and MAI through the character DBH (Nunes et al. 2016). In addition, errors in measuring height caused by various factors are avoided (Couto and Bastos 1988).

The extractive content correlated with several variables, some positively and others negatively. In general, for the pulping process, low values of these characteristics are expected and the correlations with yield are not very significant (Magaton et al. 2011).

The correlations between the 15 variables under study are represented graphically (Fig. 1) in pairs of measured characteristics. At the top, we can see all the correlations between the variables, regardless of whether they are significant or not. In the main diagonal, the characteristic distributions are present. At the bottom, the distributions of the correlated characteristics are observed, so that in some we can see a well-defined trend and in others we do not see the same.

A global view of the associations between the measured characteristics is provided by the biometric technique called the correlation network. Through a network of connections, it is possible to obtain associations between groups of variables of interest that must take into account common genetic factors and that can be effectively used in studies of cause-and-effect systems, allowing the breeder to predict the consequences of direct selection on some variables for others not directly considered in the selection. To allow this type of analysis, Fig. 2 presents the network of correlations between the technological and growth characteristics of the wood. This network allows better visualization of the relationships between the variables, especially when it is possible to separate them beforehand into groups according to knowledge of biology and chemistry. In this study, some groups were highlighted for better interpretation.

Fig. 2

Correlation network between silvicultural (DBH, diameter at 1.30 m above the ground; Vol, volume; MAI, mean annual increment) and wood quality (Yield, yield of wood; Extace, extractives in acetone; Extwa, extractives in water; Extt, total extracts; Pent, pentosans; Ligkl, Klason lignin; Ligt, total lignin; Hol, holocellulose; Dens, density; Kappa, kappa number) characteristics measured in a population of Eucalyptus. The circles represent the characteristics, and the lines indicate the correlations, which can be negative (red line) or positive (green line). The line thickness represents the intensity of the correlation: the greater the thickness, the greater the correlation. The colors of the circles represent the groups to which each characteristic belongs (Extractives, Lignin, Group3, Pentosans, Kappa, Ashes, Density, and Sil_car, the silvicultural characteristics), which were formed according to the correlations between the characteristics. (Color figure online)

Pulp yield is the final product in a pulping process, so much attention is paid to this characteristic, especially with regard to the industrial sector. This is because this parameter is the combination of several wood parameters in a single variable (Magaton et al. 2009). In Fig. 2, we can observe this combination between this characteristic and the others, with positive and negative correlations.

The first group to be highlighted in the network is called extractives, in which the three types of extractives were grouped: total extracts, extracts in water, and extracts in acetone. The correlation network shows a high positive correlation between total extracts and the other two, since, as previously mentioned, total extracts are the sum of the extracts in water and the extracts in acetone. In addition, the correlations between total extractives and the other two types of extractives are significant (supplementary table). Magaton et al. (2009) observed a negative, although not relevant, impact of total extracts on pulp yield. For acetone extracts, the same authors found a significant negative correlation with pulp yield. This can also be observed in the correlation network (Fig. 2).

The second group highlighted in the correlation network was that of lignin, consisting of Klason lignin and total lignin. For pulping, lignin is known to be an undesirable component, since it has a negative influence on yield (Foelkel 2013). This is observed in the correlation network. It is also possible to see that these two characteristics are strongly and significantly correlated (supplementary table). This correlation was expected, as the Klason lignin is part of the total lignin.

Group three is formed by pulp yield and holocellulose. In the supplementary table, a positive and significant correlation was estimated between them. Holocellulose is the sum of cellulose and hemicellulose, which are the major constituents of pulp yield, and are highly correlated (Gomide et al. 2005). In the correlation network, the components of this group are highly correlated with each other and negatively correlated with extracts and lignins.

The fourth group is formed by the union of the four characteristics of wood growth: DBH, height, volume, and MAI. These characteristics are highly correlated, which was clearly demonstrated by the correlation network.

The relationships between the groups can be well defined when we analyze the correlations between them. All the variables that constitute the extractive group are negatively correlated with the variables in group three, and the characteristics of the latter are negatively correlated with the lignin group. In addition, the growth characteristics of the wood were highly correlated with each other, separating them from the other groups. Thus, the network of correlations facilitated the understanding and analysis of correlations within and between the groups of characteristics studied. Da Silva et al. (2016), working with pepper (Capsicum spp.), also observed that networks of correlations were effective in the selection of genotypes, both for individual characteristics and for groups of characteristics.

The other characteristics were not correlated because they are not chemically or biologically linked to the others, as evidenced by the weak correlations. The network greatly helped in understanding the correlations between the characteristics (described in the supplementary material), and the results agree with those in the literature, where they have been reported in a more isolated way. Thus, the 87 individuals represent the relationships between the characteristics well.

Accession performance in a Eucalyptus population

Given the population’s potential for breeding purposes, it is desirable to identify individuals (or accessions) that stand out for one or more characteristics, or for one or more complexes of characteristics that are established statistically but retain a biological interpretation. Structural simplification of the information into biologically interpretable complexes can be achieved through the multivariate technique called factor analysis (Cruz et al. 2014). This analysis seeks to identify how many and which common factors explain the variation of the variables, to give these factors a biological interpretation, and finally to assess the performance of individuals by positioning them in scatter plots, usually built from the scores of pairs of factors representing the biological complexes of interest.

Four factors were sufficient to explain more than 70% of the total variation. The number of factors can be determined by principal component analysis or at the researcher’s discretion (Ferreira et al. 2010; Teixeira et al. 2015; Barbosa et al. 2019). It should be emphasized that, in factor analysis, this value represents the average communality, that is, how much of the variation of the studied characteristics the common factors explain (Cruz et al. 2014). The results are presented in Table 8. Based on the final factor loadings, the factors are interpreted biologically so that some or all of them can be identified with biological complexes for future use. Thus, although Table 8 presents information on four factors, it was considered appropriate to adopt only three factors as representative of the following biological complexes: the silvicultural factor or complex, the lignin factor or complex, and the extractive factor or complex. The variables whose variation was primarily explained by a given common factor were those that best expressed the corresponding biological phenomenon.
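The link between the factor loadings, the communalities, and the percentage of variation explained can be written compactly. The notation below is an assumption introduced for illustration and does not come from the original text: p standardized characteristics x_j, m orthogonal common factors F_k, loadings \ell_{jk}, and specific terms \varepsilon_j.

```latex
x_j = \sum_{k=1}^{m} \ell_{jk} F_k + \varepsilon_j, \qquad
h_j^2 = \sum_{k=1}^{m} \ell_{jk}^2, \qquad
\bar{h}^2 = \frac{1}{p} \sum_{j=1}^{p} h_j^2
```

Here h_j^2 is the communality of characteristic j and \bar{h}^2 is the average communality. Because each standardized characteristic has unit variance, \bar{h}^2 equals the proportion of the total variation accounted for by the m common factors, which in this study corresponds to m = 4 and \bar{h}^2 > 0.70. An orthogonal rotation changes the individual loadings but leaves each h_j^2 unchanged.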

Table 8 Initial (CFI) and final (CFF) factor loadings and communalities obtained by factor analysis of the means of Eucalyptus half-sib families, for silvicultural and wood quality characteristics

In view of the final factor loadings, factor one can be interpreted as wood productivity, involving total yield, DBH, height, volume, and MAI, characteristics that are positively and significantly correlated according to the supplementary table.

Regarding factor two, the influence of lignin, involving Klason lignin and total lignin, is evident. In addition, holocellulose can be interpreted inversely in this factor because the greater the amount of lignin, the lower the amount of holocellulose in the wood. The high correlation between the lignin contents was shown in the correlation studies discussed previously, as was their negative correlation with holocellulose in the correlation network. Because of its phenolic compounds, lignin tends to increase the consumption of chemical reagents during the cooking process and to reduce the yield (Souza 2016).

Factor three can be interpreted as extractives content, comprising total extractives and extractives in water. This complex is important for wood quality. According to Gomide et al. (2005), clones with high extractive content should show low pulping yield, and the removal of extractives can cause a 4% loss in pulp yield. The rapid growth of Eucalyptus trees, together with their low cutting age, leads to a low amount of total extractives (Magaton et al. 2009).

Factor analysis is widely used in agronomy (Garbuglio et al. 2007; Mendonça et al. 2007), mainly for environmental stratification and for recommending genotypes with wide adaptability.

Once the factor loadings have been analyzed and the biological complexes established, scores can be computed for each complex and used in graphical analysis, generally in two dimensions. This allows inferences about the performance of each accession studied and reveals its potential for selection and for use, at least per se, in breeding programs.
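The path from a data matrix to loadings, communalities, and scores can be sketched as follows. This is a minimal sketch assuming principal-component extraction and regression-method scores; it is not necessarily the estimator or software used in the study, and it omits the rotation step that would typically turn initial loadings into final ones. The function name and the simulated data are hypothetical.

```python
import numpy as np


def factor_analysis_pc(X, n_factors):
    """Principal-component extraction of factor loadings (no rotation).

    X: (n_individuals, n_traits) array.
    Returns loadings, communalities, and regression-method factor scores.
    """
    # Standardize each trait to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)

    # Eigen-decomposition of the correlation matrix, largest eigenvalues first.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1][:n_factors]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Loadings: eigenvectors scaled by the square root of their eigenvalues.
    loadings = eigvecs * np.sqrt(eigvals)
    # Communality of each trait: sum of its squared loadings.
    communalities = (loadings ** 2).sum(axis=1)

    # Regression-method scores: F = Z R^{-1} L.
    scores = Z @ np.linalg.solve(R, loadings)
    return loadings, communalities, scores


# Simulated data for illustration only: 40 individuals, 5 traits driven by
# 2 latent factors plus noise. None of these values come from the study.
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 2))
weights = np.array([[1.0, 0.9, 0.8, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0, -0.9]])
X = latent @ weights + 0.5 * rng.normal(size=(40, 5))

loadings, communalities, scores = factor_analysis_pc(X, n_factors=2)
mean_communality = communalities.mean()
```

Plotting one column of `scores` against another gives the kind of two-dimensional scatter plot on which individual accessions are positioned; `mean_communality` is the share of total variation explained by the retained factors.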

A global analysis identifies individuals 7, 9, 10, and 11 as outstanding for the three complexes studied, making them of interest for use per se. In other words, according to the factor analysis, these individuals combine high productivity with low levels of lignin and extractives, which is of great interest to the pulp industry. Whether these individuals are suitable for breeding programs involving recombination will depend on additional studies of genetic diversity, so that the base population to be formed combines good potential with high variability, increasing the expected appearance of transgressive segregants to be exploited by selective techniques (Cruz et al. 2011).
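The joint reading of the three complexes amounts to a simple filter on the factor scores. The sketch below assumes that favorable means an above-average silvicultural score and below-average lignin and extractive scores, with scores centered at zero; the sign orientation depends on the loadings and is an assumption here. The function name, the IDs, and the values are made up for illustration and are not the scores from the study.

```python
def select_outstanding(scores):
    """Return IDs whose scores are favorable on all three complexes.

    `scores` maps an individual ID to a tuple
    (silvicultural, lignin, extractives). Favorable is assumed to mean
    silvicultural > 0, lignin < 0, and extractives < 0.
    """
    return sorted(
        ind for ind, (silv, lig, ext) in scores.items()
        if silv > 0 and lig < 0 and ext < 0
    )


# Made-up standardized scores; IDs and values are illustrative only.
toy_scores = {
    "A": (1.2, -0.8, -0.5),
    "B": (0.9, 0.4, -1.1),
    "C": (-0.3, -1.0, -0.7),
    "D": (0.6, -0.2, -0.9),
}
selected = select_outstanding(toy_scores)
```

In this toy example only individuals "A" and "D" satisfy all three conditions, which corresponds to falling in the favorable quadrant of every pairwise scatter plot.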

Conclusions

The E. benthamii population under study has good potential to be exploited through selective techniques when compared with the controls used, considering the descriptive statistics, namely the favorable mean level reached for the studied characteristics and the variability shown by the range of variation.

Regarding the association between wood characteristics, the E. benthamii correlation network not only facilitated the visualization of the correlations between individual characteristics but also clarified the correlations within and between groups.

Factor analysis made it possible to form four complexes of characteristics, which can reduce the number of variables in future work.

The study of genetic diversity made it possible to understand the variation within the population, to identify the importance of the variables, and to assess the association between the growth and technological characteristics of the wood.