Introduction

Growth medium salts, plant growth regulators, temperature and lighting are all key factors for improving in vitro plant growth. Tissue culture medium optimization studies have traditionally examined only a few factors at a time and relied on simple ANOVA or classical factorial designs. Factorial designs require a large number of treatments, even when only a few factors are included (Compton and Mize 1999; Ibañez et al. 2003; Mize et al. 1999; Nas et al. 2005). Experimental designs and statistical analyses that can evaluate the effects of many factors at various levels, and their interactions, on the mineral nutrition of in vitro plants are required for a better optimization process.

New methodologies for improving in vitro shoot growth by changing mineral nutrients include advanced statistical models such as response surface methodology (RSM) and neuro-fuzzy logic (Alanagh et al. 2014; Gago et al. 2011; Niedz and Evens 2007). RSM is a statistical technique for modeling and analyzing responses affected by several factors, with the main objective of optimizing the response. If the response is a linear function of the factors, a first-order RSM model (multiple regression) is used. If curvature is present in the response surface, a higher-degree polynomial, for example a second-order polynomial, should be used. The parameters of the polynomials are estimated by the method of least squares. Response surface methods are designs and models that allow optimization of continuous factors (Bradley 2007; Montgomery 2005). Therefore, the main drawback with respect to tissue culture medium optimization is that nominal variables such as genotype cannot be included as a factor in the analysis. RSM simultaneously evaluates polynomial relationships between several independent variables and the dependent variable, and provides genotype-specific models showing general areas of optimal plant growth. Computer-generated RSM optimal designs allow testing of multiple nutrient factors at once, reducing treatments, time, labor and explant number compared to traditional factorial designs (Anderson and Whitcomb 2005; Hand and Reed 2014; Reed et al. 2013). Other statistical techniques used to better understand mineral nutrition requirements are artificial neural networks and neuro-fuzzy logic algorithms. The neuro-fuzzy logic algorithm is a hybrid technology that combines artificial neural network and fuzzy logic techniques. The algorithm uses IF (condition) and THEN (observed behavior) rules to model nonlinear relationships between several independent variables and responses (Gago et al. 2011).
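As an illustration of the second-order model and its least-squares estimation, the following minimal sketch fits a full quadratic polynomial (intercept, linear, two-factor interaction and squared terms) by solving the normal equations. It is not the procedure used by Design-Expert. The function names, the two-factor grid and the coefficients used to generate the response are hypothetical and serve only to show that the fit recovers a known surface.

```python
from itertools import combinations

def quadratic_terms(x):
    """Expand factor levels into second-order model terms:
    intercept, linear terms, two-factor interactions, squared terms."""
    terms = [1.0]
    terms += list(x)
    terms += [x[i] * x[j] for i, j in combinations(range(len(x)), 2)]
    terms += [v * v for v in x]
    return terms

def solve(a, b):
    """Solve a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (m[r][n] - s) / m[r][r]
    return beta

def fit_second_order(xs, ys):
    """Least-squares estimate from the normal equations (X'X) beta = X'y."""
    rows = [quadratic_terms(x) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical example: two salt factors on a 3 x 3 grid (x DKW) and a
# made-up response surface, used only to check that the fit recovers it.
levels = [0.5, 1.75, 3.0]
xs = [(a, b) for a in levels for b in levels]
true_beta = [2.0, 0.5, -0.3, 0.1, -0.2, 0.05]
ys = [sum(b * t for b, t in zip(true_beta, quadratic_terms(x))) for x in xs]
beta = fit_second_order(xs, ys)
```

For two factors the term order is intercept, x1, x2, x1·x2, x1², x2². The same expansion applied to six factors yields 28 terms, which is why a second-order model in six salts needs at least that many distinct treatment points.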

Another option for data analysis that has not been used for plant tissue culture optimization is the Chi-Squared Automatic Interaction Detection (CHAID) data mining algorithm. As opposed to regression analysis, CHAID is a non-parametric technique, and therefore does not require assumptions such as normality and linearity. The algorithm can detect interactions between variables, as well as non-linear effects, which are generally missed by traditional statistical techniques (Hébert et al. 2006). The tree-based algorithm applied to the data obtained from an optimal design would help to define the optimum concentrations of salts in tissue culture media more precisely and to better understand any interactions. CHAID allows the responses of several genotypes to be analyzed simultaneously, which could contribute to developing a common tissue culture medium rather than several genotype-specific media.

CHAID constructs a visual (non-binary) decision tree that divides a heterogeneous data set into many homogeneous subsets through multi-way node splitting, by selecting explanatory variables that significantly interact with a response variable (Ali et al. 2015). The algorithm is appropriate for nominal, ordinal, and continuous data. The decision tree building process consists of merging, splitting and stopping steps. Optimal splits are identified by the chi-square statistic when the dependent variable is categorical, and by the F-test when the response is continuous. Continuous variables are split into categories with similar numbers of observations. CHAID splits the explanatory variables based on their smallest Bonferroni-adjusted p-value. The categories of the independent variables are used for calculating p-values to check whether the mean responses are the same for different categories of the predictors. If a p-value is not significant, the corresponding pair of categories is merged into a single group. A group with three or more categories is evaluated to define the most significant binary split. The splitting process continues until the node size is less than the predefined minimum node size. The decision tree components are the root node (containing the dependent variable), parent nodes (the first two or more categories after the data is split), child nodes (explanatory variable categories below the parent categories) and terminal nodes (the last categories). The most significant category for the dependent variable is at the top of the tree and the least important one (terminal node) is at the bottom (Rashidi et al. 2014; Statistics-Solutions 2016).
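The merging step for a continuous response can be sketched as follows. This is a simplified illustration, not the SPSS implementation: it ranks adjacent category pairs by their one-way ANOVA F statistic instead of by p-value (equivalent for ranking only when the pairs have the same degrees of freedom, as in the equal-sized groups below), and it omits the significance threshold and stopping rules. The Bonferroni multiplier shown is the count of ways to merge c ordered categories into r contiguous groups, as used for ordinal predictors in the original CHAID formulation. All function names and data values are hypothetical.

```python
from math import comb

def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of responses."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def least_different_pair(categories):
    """Index i of the adjacent pair (i, i + 1) with the smallest F statistic,
    i.e. the pair whose mean responses differ least (merge candidate)."""
    scores = [f_statistic([categories[i], categories[i + 1]])
              for i in range(len(categories) - 1)]
    return min(range(len(scores)), key=scores.__getitem__)

def bonferroni_multiplier_ordinal(c, r):
    """Number of ways to merge c ordered categories into r contiguous groups."""
    return comb(c - 1, r - 1)

# Hypothetical shoot lengths (mm) in four ordered bins of a salt factor.
ordered_bins = [[40, 42, 38], [39, 41, 40], [30, 28, 29], [25, 27, 26]]
merge_at = least_different_pair(ordered_bins)
```

With these made-up values the first two bins have identical means, so they are the first candidates for merging, while the remaining bins stay separate.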

This study was designed to compare the conclusions of RSM and CHAID analyses of in vitro culture data, and to provide practical approaches for tissue culture medium optimization. Optimal shoot quality, shoot length, multiplication, and callus formation were determined with each statistical technique.

Materials and methods

Plant material and in vitro culture conditions

Shoots of C. avellana L. hazelnuts ‘Dorris’, ‘Wepster’ and ‘Zeta’ were micropropagated on COR-2013 medium (Hand et al. 2014; Hand and Reed 2014) consisting of modified DKW (Driver and Kuniyuki 1984) medium salts [1.5× Ca(NO3)2·4H2O, 1.5× MgSO4·7H2O and 1.5× KH2PO4; 4× H3BO3, 4× Na2MoO4·2H2O, 4× Zn(NO3)2·6H2O, 0.5× MnSO4·H2O, 0.5× CuSO4·5H2O] with 30 g L−1 glucose, 200 mg L−1 sequestrene 138 Fe EDDHA, 2 mg L−1 thiamine, 2 mg L−1 nicotinic acid, 2 mg L−1 glycine, 1 g L−1 myo-inositol, 50 mg N6-benzyladenine (BA), and 0.5 % (w/v) agar (PhytoTechnology Laboratories A1111). Each vessel (Magenta GA7, Magenta, Chicago, IL) held 40 ml medium, and was autoclaved for 20 min at 121 °C. The growth room conditions consisted of 80 µmol m−2 s−1 light intensity with a 16-h photoperiod of half warm-white and half cool-white fluorescent lamps (Ecolux Starcoat, 32 W; GE, Fairfield, CT) at 25 ± 2 °C.

An RSM IV-optimal experimental design was set up with Design-Expert software (Design-Expert 2010) using the standard salt composition of DKW medium (1×) with the modified micronutrients listed above (Hand and Reed 2014). A six-factor design with NH4NO3, Ca(NO3)2·4H2O, CaCl2·2H2O, MgSO4·7H2O, KH2PO4 and K2SO4 was tested over a range of 0.5–3.0× DKW. Forty-two treatments were assigned, with the last two treatments serving as controls of DKW salts with modified micronutrients (Table 1).
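The size of this design can be put in context with simple counting. The sketch below compares the 42 treatment points with a full factorial and with the number of terms in a full second-order model in six factors. The choice of three levels per factor for the factorial is an assumption made only for illustration; the source does not state the levels a factorial version would use.

```python
from math import comb

N_FACTORS = 6       # the six salts tested
N_TREATMENTS = 42   # treatment points in the design, including two controls
LEVELS = 3          # hypothetical number of levels per factor in a full factorial

def full_factorial_runs(levels, factors):
    """Treatment combinations in a full factorial design."""
    return levels ** factors

def second_order_terms(factors):
    """Intercept + linear + two-factor interaction + squared terms."""
    return 1 + factors + comb(factors, 2) + factors

runs = full_factorial_runs(LEVELS, N_FACTORS)
terms = second_order_terms(N_FACTORS)
```

Under this assumption the factorial would need 729 treatments, whereas the 42-point design still exceeds the 28 terms of the full quadratic model.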

Table 1 Six factor design including 42 treatment points

Shoots were cut to 3 cm with the apical meristem removed. During the first transfer the callus was removed and shoots were transferred to new medium. In the second transfer shoots were reduced to 3 cm by removing the base and the apical meristem. For each treatment two boxes with five shoots for each cultivar were used (n = 10). Boxes were randomized on the growth room shelf. Shoots were grown on each treatment medium for 10 weeks, with the first and second transfers lasting 3 weeks each and the last transfer 4 weeks.

Data

Three shoots from each culture vessel (n = 6) were evaluated as follows. Shoot quality, a subjective visual assessment of shoot vigor and form, was rated as 1 = poor, 2 = moderate and 3 = good. Shoots longer than 5 mm were counted. The longest shoots were measured in millimeters. Callus formation was rated as 1 = callus ≥2 mm, 2 = callus <2 mm and 3 = absent (no callus present).

Statistical analysis

The mean response (quality, shoot length, shoot number and callus) of six shoots per treatment for each genotype was assessed by RSM using Design-Expert 8 software (Design-Expert 2010). The factors and their polynomial relationships and interactions that affected plant growth responses were determined at p < 0.05. RSM does not provide an option for defining a variable structure; all factors are treated as continuous. Graphical models of RSM were produced based on the six salt-factor design space. Genotype was not included as a factor. Separate models were built for each genotype. The factors with the highest statistical impact were assigned as axes of the graphs.
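The averaging of six shoots per treatment and genotype can be sketched as below. The record layout and the rating values are hypothetical; only the aggregation step reflects the text.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (genotype, treatment number, quality rating of one shoot).
records = [
    ("Dorris", 1, 2), ("Dorris", 1, 3), ("Dorris", 1, 2),
    ("Dorris", 1, 2), ("Dorris", 1, 3), ("Dorris", 1, 3),
    ("Zeta", 1, 1), ("Zeta", 1, 2), ("Zeta", 1, 2),
    ("Zeta", 1, 1), ("Zeta", 1, 2), ("Zeta", 1, 1),
]

def treatment_means(rows):
    """Average the per-shoot values for each (genotype, treatment) pair."""
    grouped = defaultdict(list)
    for genotype, treatment, value in rows:
        grouped[(genotype, treatment)].append(value)
    return {key: mean(values) for key, values in grouped.items()}

means = treatment_means(records)
```

Averaging turns the three-level ratings into values that can be treated as continuous responses, one per treatment and genotype.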

CHAID analyzed the dependent variables quality, shoot length, shoot number and callus using the six basal salts and the three genotypes (‘Dorris’, ‘Wepster’, and ‘Zeta’) as independent variables. The mean response of six shoots per treatment was calculated for each genotype. Therefore, the dependent variables and the basal salts were treated as continuous variables, and only genotype was selected as a nominal variable within the algorithm. To prune redundant structure from the tree diagram, the minimum plant numbers for parent and child nodes were chosen to give the tree diagram with the highest Pearson correlation and no overlapping of the factors. The minimum plant numbers (parent:child) were set at 28:14 for quality, 6:3 for shoot length, 24:12 for shoot number and 12:6 for callus. Statistical analyses of the tree-based CHAID algorithm were performed with IBM SPSS Statistics for Windows (Version 22.0, Armonk, NY) software (SPSS 2013).
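The Pearson correlation used to judge a tree can be computed as in the sketch below, which assumes that each treatment is predicted by the mean of the terminal node it falls into. The observed and predicted values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical observed treatment means and the terminal-node means
# that a tree would assign to the same treatments.
observed = [1.2, 1.5, 1.9, 2.1, 2.4, 2.0]
predicted = [1.35, 1.35, 2.0, 2.0, 2.2, 2.2]
r = pearson(observed, predicted)
```

A tree with more terminal nodes generally raises this coefficient, which is why the minimum node sizes were balanced against redundancy in the tree.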

Results and discussion

This study was designed to compare statistical methods for improving, and making more practical, the optimization process for mineral nutrients of tissue culture medium using RSM and CHAID data mining methodologies. RSM is a computer-assisted technique that models optimal growth areas using selected treatment points within the range of all possible treatment combinations, and therefore reduces the number of treatments relative to traditional factorial designs (Anderson and Whitcomb 2005). CHAID is a data mining algorithm used for constructing decision trees with homogeneous sub-groups. It is useful for detecting non-linear and interaction effects without requiring linearity and normality assumptions (Hébert et al. 2006).

Quality

Plant quality rating is a subjective evaluation of general plant growth consisting of leaf and shoot health, as well as multiplication (Niedz et al. 2007). The RSM models for quality were significant (p < 0.05) for all of the genotypes tested (Table 2). Compared to DKW medium, increased KH2PO4 and lower K2SO4 concentrations affected the quality of all three cultivars (p < 0.05) and the NH4NO3 requirement was also low (Figs. 1, 2). For ‘Dorris’ there were interactions (p < 0.05) of several factors that impacted the shoot quality (Table 2). ‘Dorris’ required very high KH2PO4 and low K2SO4 concentrations and moderate to high Ca(NO3)2 for higher quality shoots (Fig. 1a). Low NH4NO3 was a significant factor for quality of ‘Wepster’, as were very high KH2PO4 and low K2SO4 concentrations (Fig. 1b). ‘Zeta’ quality models indicated that low to medium KH2PO4, low K2SO4 and low NH4NO3 were all required for the best growth. Improved growth was seen with several of the treatment combinations (Fig. 2).

Table 2 DKW nutrient factors that had significant effects (p < 0.05) on four growth responses for each hazelnut cultivar

Fig. 1 Response surface graphs of mineral nutrient effects on hazelnut shoot quality for a ‘Dorris’, b ‘Wepster’ and c ‘Zeta’. The quality ratings were 1 = poor, 2 = moderate and 3 = good, and are color coded from highest (red-yellow) to lowest quality (green-blue). (Color figure online)

Fig. 2 Shoots of a ‘Dorris’, b ‘Wepster’ and c ‘Zeta’ grown on the control DKW salts and two treatments which produced higher plant quality

The CHAID data mining algorithm indicated that the most significant factor for plant quality was NH4NO3, followed by KH2PO4 and K2SO4 (Fig. 3). The genotype effect in the CHAID algorithm was determined to be an insignificant source of variation for plant quality. The Pearson correlation coefficient between actual and predicted values for quality was 0.661 (p < 0.01), indicating a medium to high predictive ability of the algorithm. All the plants used in the analysis (Node 0) were split into two nodes (Nodes 1 and 2) based on the response to NH4NO3 concentration. The decisive cut-off value of NH4NO3 was 1.701×. The group of shoots grown at NH4NO3 ≤1.701× (Node 1) was statistically different (adj. p < 0.001) from the group grown at NH4NO3 >1.701× (Node 2). The quality of shoots grown on NH4NO3 ≤1.701× was also influenced by KH2PO4, while the quality of shoots on NH4NO3 >1.701× was significantly affected by K2SO4 (adj. p < 0.001). Node 4, the group of plants exposed to NH4NO3 ≤1.701× and KH2PO4 >2.012×, had significantly higher quality (2.037) than the other nodes (Fig. 3). This indicates that the optimal medium for overall shoot quality would have NH4NO3 ≤1.701× and KH2PO4 >2.012× for all three genotypes.

Fig. 3 The CHAID decision tree diagram for plant quality of ‘Dorris’, ‘Wepster’ and ‘Zeta’. Nodes were determined by the significance of the factors. Salt cut-off values are × DKW

The differences in the KH2PO4 and NH4NO3 requirements of plants noted in earlier studies could be related to genotype, the statistical methods applied, and the concentration ranges of the variables used (Hand et al. 2014; Nas and Read 2004; Reed et al. 2013). Hand et al. (2014) linked quality with nitrogen factors, mesos and potassium sulfate, but the response was greatly dependent on genotype. All of the hazelnut genotypes in the current study required very low K2SO4 for better quality according to the RSM graphs. K2SO4 was also significant in CHAID, but only when NH4NO3 >1.701×. Ca(NO3)2 was significant for quality of ‘Dorris’ in RSM, but was not significant in CHAID (Fig. 3). This difference could be due to the fact that the CHAID algorithm simultaneously evaluated all three hazelnut cultivars in the present study. The cultivars were analyzed separately with RSM because genotype is not a continuous variable. Genotype was also evaluated separately in previous research using RSM (Hand et al. 2014; Reed et al. 2013). RSM is generally used to define the relationships between continuous (measurable) independent and dependent (response) variables, whereas CHAID is able to deal with categorical as well as continuous variables. This allows evaluation of genotype as a factor, which could help to develop a general medium for diverse hazelnuts.

Shoot length

The RSM models for shoot length were statistically significant for all the hazelnut cultivars (p < 0.05) (Table 2). Interactions of several factors influenced (p < 0.05) shoot length of ‘Dorris’ in RSM (Table 2). ‘Dorris’ required high KH2PO4 and Ca(NO3)2, but low NH4NO3 for an ideal shoot length of 40 mm (Fig. 4a). ‘Wepster’ had the same requirements of KH2PO4 and NH4NO3 for longer shoots (40 mm), but did not require high Ca(NO3)2 concentrations (Fig. 4b). Shoots of ‘Zeta’ were typically long, and the low-medium NH4NO3 and low-medium K2SO4 concentrations found in DKW were all that was needed for good (40–50 mm) shoot length (Fig. 4c).

Fig. 4 Response surface graphs of mineral nutrient effects on shoot length (mm) of a ‘Dorris’, b ‘Wepster’ and c ‘Zeta’. Shoot lengths (mm) are color coded from longest (red-yellow) to shortest (green-blue). The red dot represents the control, with an average shoot length of 50 mm. (Color figure online)

A CHAID decision tree diagram was constructed to predict shoot length from several significant factors: genotype, NH4NO3, KH2PO4, CaCl2, and K2SO4 (Fig. 5). There was a very high Pearson correlation of 0.853 between actual and predicted shoot length values (p < 0.01). The main factor affecting shoot length was NH4NO3 (adj. p < 0.001), and the second most important factor was genotype (adj. p < 0.001). KH2PO4, CaCl2, and K2SO4 were also influential for shoot length (adj. p < 0.001). All of the plants in Node 0 at the top of the tree diagram were split into two nodes (Nodes 1 and 2) with respect to NH4NO3. Node 1 had significantly longer shoots (39.34 mm) than Node 2 (25.98 mm) (adj. p < 0.0001). Genotype affected shoot length within both Nodes 1 and 2 (adj. p < 0.001). Node 1 (NH4NO3 ≤1.701×) branched into two nodes by genotype. Node 3 (‘Dorris’ and ‘Wepster’) was divided into three nodes (Nodes 7, 8 and 9) based on KH2PO4. The greatest shoot length for these genotypes (39.44 mm) was with NH4NO3 ≤1.701× and KH2PO4 >2.75× (Node 9). Node 4 (‘Zeta’) on treatments with NH4NO3 ≤1.701× had an average shoot length of 50.79 mm, and was divided into three new nodes (Nodes 10, 11 and 12) based on CaCl2. The longest shoots (60.389 mm) for ‘Zeta’ were recorded from Node 12 with NH4NO3 ≤1.701× and CaCl2 >1.8×. A common salt concentration for the tested genotypes with an ideal shoot length of 40 mm is suggested to be NH4NO3 ≤1.701×, KH2PO4 >2.75× and CaCl2 <1.738×.

Fig. 5 The CHAID decision tree diagram for shoot length of ‘Dorris’, ‘Wepster’ and ‘Zeta’ hazelnuts. Nodes were determined by the significance of the factors. Salt cut-off values are × DKW

Shoot length results varied between the two analyses. The CHAID data mining algorithm determined that KH2PO4 >2.75× and NH4NO3 ≤1.701× were critical cut-off values for the longest shoots of ‘Dorris’ and ‘Wepster’. This was similar to the RSM results. However, for ‘Zeta’ RSM indicated that only NH4NO3 was important (Fig. 4; Table 2), while CHAID found CaCl2 also to be significant (Fig. 5). Nas and Read (2004) defined a micropropagation medium for hybrid hazelnuts (C. avellana × C. americana) based on the chemical composition of the hazelnut kernel. Better shoot elongation was observed on lower NH4NO3, Ca(NO3)2 and CaCl2, as well as high MgSO4 and KH2PO4 compared to DKW. The differences in salt requirements could be attributed mainly to genotype (C. avellana versus hybrid hazelnuts), experimental design and the various statistical methods applied. Hand et al. (2014) found shoot length quite variable by genotype; however, increased nitrogen factors and mesos were commonly involved.

Shoot number

RSM models for shoot number were significant (p < 0.01) for all of the hazelnut cultivars (Table 2). NH4NO3 was significant for all of the genotypes (p < 0.01), and low concentrations were required for high shoot number (Fig. 6). ‘Dorris’ produced high shoot multiplication (4.5) at the highest KH2PO4 and the lowest MgSO4 amounts (p < 0.05). Low amounts of K2SO4 and very low Ca(NO3)2 increased shoot numbers for ‘Dorris’, depending on the amounts of NH4NO3 and MgSO4 (p < 0.05) (Fig. 6a). A negative interaction of NH4NO3 × CaCl2 resulted in the highest shoot number (2.6) for ‘Wepster’ (p < 0.001), with the lowest NH4NO3 and the highest CaCl2 and a low amount of K2SO4 (Fig. 6b). ‘Zeta’ showed very low Ca(NO3)2 and NH4NO3 requirements for high shoot number (2.4) (p < 0.0001) (Fig. 6c). Low Ca(NO3)2 and NH4NO3 were both important for increased shoot production in other C. avellana genotypes (Hand et al. 2014). Nas and Read (2004) suggested lower NH4NO3, Ca(NO3)2 and CaCl2, as well as higher MgSO4 and KH2PO4 for better multiplication compared to DKW.

Fig. 6 Response surface graphs of mineral nutrient effects on shoot number of a ‘Dorris’, b ‘Wepster’ and c ‘Zeta’. Shoot numbers are color coded from most (red-yellow) to fewest shoots (green-blue). The red dot represents the control, with an average shoot number of 2.2. (Color figure online)

The CHAID algorithm indicated that the most significant factor for shoot number was NH4NO3 (adj. p < 0.001), followed by Ca(NO3)2 (adj. p < 0.01) (Fig. 7). Genotype was an insignificant source of variation for multiplication. The Pearson correlation coefficient for shoot number was 0.653 (p < 0.01). All the shoots (Node 0) were split into three nodes (Nodes 1, 2 and 3) based on the response to NH4NO3 concentration. Node 1 (NH4NO3 ≤0.5×) shoots were affected by the Ca(NO3)2 concentration (average shoot number 2.197). Node 2 shoots, grown within the range of 0.5× < NH4NO3 ≤2.6× (average shoot number 1.773), were also influenced by Ca(NO3)2 and KH2PO4 (adj. p < 0.01). Shoot numbers of plants exposed to NH4NO3 >2.6× were not affected by any other basal salt factor or genotype (adj. p < 0.001). The highest multiplication (2.46) was obtained in Node 4 with NH4NO3 ≤0.5× and Ca(NO3)2 ≤1.725× (Fig. 7).
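The three-way split on NH4NO3 described above can be expressed as a lookup on ordered cut-offs. The sketch below uses only the cut-offs and node means stated in the text; the function name is hypothetical, and assigning the label Node 3 to the NH4NO3 >2.6× group is inferred from the node numbering rather than stated explicitly.

```python
from bisect import bisect_left

# NH4NO3 cut-offs (x DKW) for the first split of the shoot number tree.
CUTOFFS = [0.5, 2.6]
NODES = ["Node 1", "Node 2", "Node 3"]
# Average shoot numbers reported in the text; none is given for Node 3.
NODE_MEANS = {"Node 1": 2.197, "Node 2": 1.773}

def nh4no3_node(level):
    """Return the first-level node for an NH4NO3 level (x DKW).
    A level equal to a cut-off falls in the lower node (<=)."""
    return NODES[bisect_left(CUTOFFS, level)]
```

For example, the standard DKW level of 1× falls in Node 2, whereas 0.5× falls in Node 1, the branch that contains the highest multiplication.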

Fig. 7 The CHAID decision tree diagram for shoot number of ‘Dorris’, ‘Wepster’ and ‘Zeta’. Nodes were determined by the significance of the factors. Salt cut-off values are × DKW

Significant salt requirements for good multiplication varied depending on genotype in RSM (Fig. 6), whereas clear cut-off values of NH4NO3 ≤0.5× and Ca(NO3)2 ≤1.725× were suggested by CHAID for the best multiplication of all the tested genotypes (Fig. 7). Hand et al. (2014), using RSM, suggested low amounts of NH4NO3 and Ca(NO3)2 for a higher multiplication rate of five Corylus avellana cultivars. The RSM results of the current study showed a higher shoot number for ‘Dorris’ (4.5 versus 2.5) than Hand et al. (2014), which could be attributed to the higher concentration range of KH2PO4 within the experimental design (3× versus 1.5×), as well as to the differences in micronutrient concentrations between the two studies.

Callus

The RSM models for callus formation were significant for ‘Wepster’ and ‘Zeta’ (p < 0.05), and the most influential factor was NH4NO3 (p < 0.001) (Table 2). KH2PO4, K2SO4, NH4NO3 and MgSO4 were significant for callus of ‘Zeta’ (p < 0.05). An interaction of NH4NO3 × K2SO4 affected callus formation of ‘Wepster’ (p < 0.05) (Table 2). The highest concentrations of NH4NO3 and K2SO4 resulted in the least callus (rating of 2.6) for ‘Wepster’ (Supplement 1). ‘Zeta’ required the highest amounts of NH4NO3 and K2SO4 and the lowest concentrations of KH2PO4 and MgSO4 for low callus (rating of 2.8) (Supplement 1b). These requirements contradict those needed for good plant quality (Fig. 1).

The mineral nutrient factors NH4NO3, MgSO4 and K2SO4, along with genotype, were statistically defined as affecting callus formation using a CHAID decision tree (Supplement 2). The Pearson correlation coefficient was 0.70 and strongly significant. At the top of the decision tree diagram, the root node showed an average callus rating of 2.18 for all the analyzed shoots. NH4NO3 had a dominant effect on callus formation for all of the hazelnut genotypes (adj. p < 0.001) and the shoots were classified into three nodes (Supplement 2). Node 1 was the group of hazelnut shoots exposed to NH4NO3 ≤0.5× (mean rating 1.986), Node 2 the group within the range of 0.5× < NH4NO3 ≤1.701× (mean rating 2.136), and Node 3 the group with NH4NO3 >1.701× (mean rating 2.364). Callus formation of shoots within Node 2 (0.5× < NH4NO3 ≤1.701×) was not affected by any other basal salt factor or genotype. Genotype was an important factor for callus formation in Node 3 (NH4NO3 >1.701×) (adj. p < 0.01), which branched into Nodes 7 and 8. Callus of ‘Dorris’ exposed to NH4NO3 >1.701× (Node 7) was not affected by any other salt factor. The cut-off values for callus of ‘Wepster’ and ‘Zeta’ hazelnuts were estimated as 1.701× NH4NO3 and 0.5× K2SO4. Callus production of ‘Wepster’ and ‘Zeta’ within the group NH4NO3 >1.701× (Node 8), which had a mean rating of 2.482, was affected by K2SO4 (adj. p < 0.05). The least callus (2.585) was obtained in Node 10, which was the group of ‘Wepster’ and ‘Zeta’ exposed to NH4NO3 >1.701× and K2SO4 >0.5×. In addition, Node 9 (‘Wepster’ and ‘Zeta’ in the range of NH4NO3 >1.701× and K2SO4 ≤0.5×) produced a mean callus rating of 2.21. Despite resulting in less callus, both of those nodes were poor for overall shoot quality. Terminal node 5 (average callus 2.2), which is within the range of NH4NO3 ≤0.5× and 0.95× < MgSO4 ≤1.762×, is consistent with the requirements for good plant quality.
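Because this tree mixes continuous splits (NH4NO3, K2SO4) with a nominal split (genotype), it reads naturally as nested IF-THEN rules. The sketch below follows only the branches and mean ratings given in the text. The function name is hypothetical, the MgSO4 split below Node 1 is not modeled because only one of its child nodes is described, and no mean is returned for Node 7 because none is reported.

```python
def callus_node(genotype, nh4no3, k2so4):
    """Follow the branches of the callus tree that are described in the text.
    Salt levels are x DKW. Returns (node label, mean callus rating or None)."""
    if nh4no3 <= 0.5:
        return ("Node 1", 1.986)   # further MgSO4 split not modeled here
    if nh4no3 <= 1.701:
        return ("Node 2", 2.136)   # terminal: no further significant factor
    # NH4NO3 > 1.701x (Node 3) splits by genotype
    if genotype == "Dorris":
        return ("Node 7", None)    # mean rating not reported in the text
    # 'Wepster' and 'Zeta' (Node 8) split by K2SO4
    if k2so4 <= 0.5:
        return ("Node 9", 2.21)
    return ("Node 10", 2.585)
```

Since a rating of 3 means no callus, the highest mean rating (Node 10) corresponds to the least callus.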

For callus, RSM was generally consistent with the results of CHAID, but did not provide a significant model for ‘Dorris’ (Table 2). Hand et al. (2014) suggested high amounts of NH4NO3 and Ca(NO3)2 for less callus formation of ‘Dorris’. In the current study, very high concentrations of K2SO4 and very low amounts of KH2PO4 and MgSO4 were required for less callus formation of ‘Zeta’ (Supplement 1b). These differences could be attributed to differences in the statistical background of the RSM and CHAID methodologies. All the cultivars were analyzed simultaneously using the CHAID algorithm, which provided specific information about the classification of plants giving similar or different responses to the combinations defined by the RSM design.

In the RSM analysis, where cultivar was not included as a factor, great variation was apparent between hazelnut cultivars in terms of quality, shoot length, shoot number and callus (Hand et al. 2014; Hand and Reed 2014), whereas the CHAID analysis revealed the similarities and differences between the cultivars. In earlier RSM-based studies, cultivar variation made it extremely difficult to formulate a common hazelnut micropropagation medium (Hand et al. 2014; Hand and Reed 2014). The results of the tree-based CHAID algorithm (Figs. 3, 5, 7; Supplement 2) in the current study closely mirrored the graphical results of RSM (Figs. 1, 4, 6; Supplement 1), but were easier to interpret. Alteration of the concentration ranges of nutrient factors within the RSM design might also improve the optimization process. Previous studies showed that a neuro-fuzzy logic data mining algorithm was useful for characterizing predictive nutrient factors that directly correlated with plant responses to tissue culture medium (Alanagh et al. 2014; Gago et al. 2011), but the exact cut-off amounts of nutrients, and how their significance for the response could change with the factor concentration range, were not determined. Nas and Read (2004) hypothesized that the seed mineral and organic composition could be an ideal starting point for tissue culture medium optimization. However, the seed nitrogen content was found to be toxic for the in vitro hazelnuts.

Overall shoot quality, shoot length and multiplication improved with lower NH4NO3 concentrations. In the CHAID analysis low NH4NO3 provided good shoot quality, shoot length, multiplication and reduced callus (Figs. 3, 5, 7; Supplement 2). In RSM it was more complicated to determine a common formula for high quality, shoot length and multiplication versus a low callus model (Fig. 1). However, in general, analysis using RSM indicated that increased KH2PO4 and lower K2SO4 and NH4NO3 concentrations improved the quality of all three cultivars (Figs. 1, 2). We can more easily conclude from the prediction trees of the CHAID algorithm (Figs. 3, 5, 7; Supplement 2) that NH4NO3 ≤0.5×, KH2PO4 >2.75× and Ca(NO3)2 ≤1.725× are the critical cut-offs for an optimum growth medium for the three hazelnut genotypes evaluated. Based on this analysis the new medium amounts were set at 0.5× NH4NO3, 3× KH2PO4 and 1.5× Ca(NO3)2. The other salt factors that were analyzed could be set at the standard DKW concentrations (1×) and remain within the indicated cut-off ranges.
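How the cut-offs from the separate trees combine into one recipe can be checked with a short sketch. The cut-offs and the new medium amounts are those stated above, with the Ca(NO3)2 bound taken as inclusive, as in the shoot number tree. The dictionary layout and function name are hypothetical.

```python
# Critical cut-offs (x DKW) taken from the CHAID trees.
CUTOFFS = {
    "NH4NO3": lambda x: x <= 0.5,
    "KH2PO4": lambda x: x > 2.75,
    "Ca(NO3)2": lambda x: x <= 1.725,
}
DESIGN_RANGE = (0.5, 3.0)  # tested range of every salt factor (x DKW)

# New medium amounts (x DKW) set from the analysis.
new_medium = {"NH4NO3": 0.5, "KH2PO4": 3.0, "Ca(NO3)2": 1.5}

def violations(medium):
    """List the salts whose level is outside the design range or its cut-off."""
    bad = []
    for salt, level in medium.items():
        in_range = DESIGN_RANGE[0] <= level <= DESIGN_RANGE[1]
        if not (in_range and CUTOFFS[salt](level)):
            bad.append(salt)
    return bad
```

The chosen amounts satisfy all three cut-offs while staying inside the tested range, so the recipe does not extrapolate beyond the design space.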

Conclusions

A computer-generated optimal design, like RSM, is an excellent tool for reducing treatment numbers compared to traditional factorial designs, and analyzing the resulting data with CHAID provides clear direction for developing a single optimal medium or a small number of suitable media for a range of genotypes. These advanced data mining approaches could be implemented to deduce optimum cut-off values of nutrient factors from mineral nutrition studies. CHAID is a novel and promising approach in tissue culture medium optimization. It provides a visual tree and exact cut-offs for the significant nutrients, which makes it easier to define optimal concentrations of the nutrient salts. Evaluating in vitro culture data with the CHAID algorithm would provide a clearer evaluation of the responses of in vitro plants to the tested variables. The ability of CHAID to analyze all genotypes together would contribute to developing one or a few optimal media for many genotypes rather than many cultivar-specific formulations.