Introduction

In wood radio frequency vacuum (RFV) drying (Koumoutsakos et al. 2001a, b, 2003) and other high frequency electric field heating applications such as veneer and finger-joint gluing and Parallam manufacturing (Torgovnikov 1993), knowledge of the fundamental dielectric properties of the material, such as dielectric constant (ε′), loss tangent (tan δ) and loss factor (ε″), is imperative in process design, control, optimization and simulation.

In the past, these properties have been researched as a function of fiber direction and field frequency for a variety of wood species that were conditioned to various moisture contents and temperatures. All studies revealed a strong relationship between the dielectric properties and the wood attributes studied, as well as frequency (James 1975; Torgovnikov 1993; Siau 1995; Skaar 1948). For RFV drying, knowledge of ε″ is imperative since this attribute is directly proportional to power density, i.e., the thermal power transferred to wood. This is because the amount of electric power converted to thermal power is given by PD = 5.56 × 10⁻¹¹ E² f ε″, where PD is the power density in W/m³, E is the field strength in V/m and f is the frequency in Hz (Koumoutsakos et al. 2001a). PD is analogous to dry-bulb temperature in convective drying; for example, as PD increases, so does the rate of heating and consequently the final temperature of the material. Therefore, for timber RFV drying simulation and optimization, knowledge of PD variation in space and time is crucial. It is also apparent from this expression that PD is a strong (quadratic) function of E, and consequently of the voltage of the dielectric field for a fixed frequency. By increasing or decreasing the field voltage, drying can be accelerated or decelerated, and consequently optimal drying conditions can be maintained. Naturally, the voltage cannot be increased without limit because the electric field will break down above a critical E value and arcing will occur. This phenomenon has been extensively reviewed in Biryukov (1961) and Torgovnikov (1993).
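The power density expression can be evaluated directly. The following is a minimal Python sketch, not part of the original study; the function name and the field strength and loss factor values are hypothetical and chosen only for illustration, while 13.56 MHz is the frequency used in the heating studies described later.

```python
def power_density(field_strength_v_per_m, frequency_hz, loss_factor):
    """Power density in W/m^3 from PD = 5.56e-11 * E^2 * f * eps''."""
    return 5.56e-11 * field_strength_v_per_m ** 2 * frequency_hz * loss_factor


# Hypothetical operating point (illustrative values, not measured data).
E = 10_000.0   # field strength, V/m (assumed)
f = 13.56e6    # frequency, Hz
eps = 0.5      # loss factor (assumed)

pd_base = power_density(E, f, eps)
pd_doubled_field = power_density(2.0 * E, f, eps)

print(f"PD at E:  {pd_base:.1f} W/m^3")
print(f"PD at 2E: {pd_doubled_field:.1f} W/m^3")
print(f"ratio:    {pd_doubled_field / pd_base:.1f}")  # E enters squared
```

Doubling the field strength quadruples PD, which is why voltage is such an effective control variable, whereas PD changes only linearly with ε″.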

The power density expression above also demonstrates the strong relationship between ε″ and PD. Because ε″ is a measure of how much electric field energy is converted to heat in the material, its relation to various wood attributes, such as density, moisture content, temperature and direction, is documented in the literature (Torgovnikov 1993). The first three have been shown experimentally to have a positive effect on ε″, whereas no significant effect has been found for direction. These relationships can explain why the voltage has to increase as wood dries (Avramidis et al. 1996; Avramidis and Zwick 1996, 1997): ε″ falls as moisture content falls, so a higher field strength is needed to maintain the same PD.

Although the effect of wood macro-attributes such as those mentioned above on ε″ is well understood, very little is known about the effect of wood chemical composition on the same factor. Norimoto (1976) and Norimoto et al. (1978) investigated the dielectric properties of some wood chemical constituents as a function of frequency and temperature, but no attempt was made to correlate their percent content in the cell wall to the gross wood ε″ values. Knowledge of the effect of wood chemical constituents in combination with the macro-physical properties of wood could eliminate wood species as a variable, and thus allow for a more global modeling of ε″ behavior.

While past research has produced useful ε″ data, these data unfortunately originate from assorted species measured under different thermo-physical conditions and at variable frequencies, and thus it has been difficult to employ them in drying modeling. Furthermore, there is a noticeable absence in the literature of sophisticated models that describe ε″ as a function of the macro-physical attributes and chemical composition of wood. This type of model is necessary in RFV drying heat and mass transfer simulations, as described previously by Koumoutsakos et al. (2003), and in the development of process optimization software.

Since the development of an ε″ model based on thermo-physical and dielectric theories is particularly complicated, this study attempted to develop a more general model based on artificial neural networks (ANN). Neural networks represent a technology that is rooted in many disciplines, such as neuroscience, mathematics, statistics, physics and engineering. ANNs consequently find applications in such diverse fields as modeling, time series analysis, pattern recognition and signal processing, owing to their ability to learn from input data with or without a trainer.

The computing power of ANNs is achieved through their massively parallel distributed structure and their ability to learn and therefore generalize (Haykin 1999), the latter referring to their ability to produce reasonable output for inputs not encountered during the training process. ANNs are networks of interconnected elements inspired by studies of biological nervous systems. In other words, they are an attempt to create systems that work in a similar way to the human brain, which consists of tens of billions of tightly interconnected neurons. The aim of an ANN is to produce an output pattern when presented with an input. Specifically, an ANN is a collection of units that are connected in some pattern to allow communication between them. These units are called neurons, nodes or processing elements, and they are simple processors whose computing ability is typically restricted to a rule for combining input signals and an activation rule that takes the combined input to calculate the output signal (Callan 1999). Output signals may be sent to other units along weighted connections, whose weights excite or inhibit the signal that is being communicated. More information about ANNs can be found in Avramidis and Iliadis (2005).

Materials and methods

The wood ε″ and macro-physical data that were used in developing the ANN model were previously reported by Zhou and Avramidis (1999). In that study, all-sapwood and all-heartwood western hemlock (Tsuga heterophylla (Raf.) Sarg.), and all-heartwood western red cedar (Thuja plicata Donn) specimens were evaluated in the radial direction (thickness) and at various moisture contents and temperature levels and were exposed to two levels of electric field voltage. The ε″ values were calculated indirectly from heating studies at a 13.56 MHz fixed frequency with a laboratory size RFV dryer.

The same wood species/types were also analyzed for chemical composition as follows: air-dried wood samples were ground in a Wiley mill to pass a 40-mesh screen and extracted in a Soxhlet apparatus with acetone for 12 h, and extractives were determined gravimetrically. Lignin content was determined using a modified Klason method derived from TAPPI Standard Method T222 om-98. Briefly, 0.2 g of acetone-extracted ground stem sample was treated with 3 ml of 72% H2SO4 for 2 h at 20°C with mixing every 10 min. This mixture was diluted with 112 ml of de-ionized water to achieve a final acid concentration of 4% H2SO4 and transferred to a serum bottle. The solution was then autoclaved at 121°C for 1 h and filtered through a medium-coarseness sintered glass filter for the gravimetric determination of acid-insoluble lignin. Acid-soluble lignin was quantified by spectrophotometric analysis of the filtrate at 205 nm (TAPPI Useful Method UM-250). Carbohydrate concentrations in the hydrolysate were determined by HPLC (Dionex DX-600, Dionex, CA, USA) equipped with an ion exchange PA1 (Dionex) column, a pulsed amperometric detector with a gold electrode, and a Spectra AS 3500 autoinjector (Spectra-Physics, CA, USA). Prior to injection, samples were filtered through 0.45 μm HV filters (Millipore, MA, USA), and a volume of 20 μl was loaded on the column equilibrated with 250 mM NaOH and eluted with de-ionized water at a flow rate of 1.0 ml/min, followed by a post-column addition of 200 mM NaOH at a flow rate of 0.5 ml/min. Each experiment was run in triplicate.

Once the experimental data (exp) were gathered and tabulated, the selection of an optimal configuration for the ANN commenced. The NeuralWorks Professional II/Plus (NeuralWare Inc, PA) integrated software environment was employed to develop the most suitable ANN. Eleven variables were used as inputs, namely the percentages of glucose (GLU), mannose (MAN), xylose (XYL), galactose (GAL), arabinose (ARA), lignin (LIG) and extractives (EXTR), together with density, moisture content, voltage and temperature (Tables 1, 2, 3, 4). Only one variable was used as an output, namely the dielectric loss factor (ε″), as shown in Fig. 1.

Table 1 Experimental (exp), calculated regression (reg) and ANN data for Hemlock sapwood (HS)
Table 2 Experimental (exp), calculated regression (reg) and ANN data for Hemlock heartwood (HH)
Table 3 Experimental (exp), calculated regression (reg) and ANN data for cedar heartwood (CH)
Table 4 The chemical analysis of the three wood types
Fig. 1 Architecture of the optimal ANN used in this study

Following the design of the ANN structure, the input data were normalized in order to avoid problems such as saturation, which can occur due to the range of the input data and the nature of the learning function employed. When a value coming into a processing element is beyond the useful range of its transfer function, that element is said to be saturated. The hyperbolic tangent function used here maps into the range −1.0 to 1.0, but its output changes appreciably only for inputs between about −3 and +3. Saturation occurs when an element’s summation value (the sum of the inputs times the weights) falls outside this range. Thus, the input data were normalized by dividing them by a suitable constant (1,000) so that they do not exceed the acceptable range.
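The effect of this scaling on the hyperbolic tangent can be illustrated with a short Python sketch. The divisor of 1,000 and the ±3 limit come from the text; the function names and the raw input values are hypothetical and serve only as an illustration.

```python
import math

SCALE = 1000.0          # divisor stated in the text
SATURATION_LIMIT = 3.0  # approximate useful input range of tanh


def normalize(values, scale=SCALE):
    """Divide every raw input by a constant so it falls well inside +/-3."""
    return [v / scale for v in values]


def is_saturated(x, limit=SATURATION_LIMIT):
    """True when tanh(x) is already pinned close to -1 or +1."""
    return abs(x) > limit


# Hypothetical raw inputs of very different magnitudes.
raw_inputs = [450.0, 60.0, 25.0, 0.8]

for raw, scaled in zip(raw_inputs, normalize(raw_inputs)):
    print(f"raw={raw:7.1f} tanh={math.tanh(raw):.4f} saturated={is_saturated(raw)} | "
          f"scaled={scaled:.4f} tanh={math.tanh(scaled):.4f} saturated={is_saturated(scaled)}")
```

Large raw values all map to essentially the same output of 1.0, so the element can no longer distinguish between them, whereas the scaled values stay in the region where the function still responds to changes in its input.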

The equations developed by Zhou and Avramidis (1999) of the form

$$ \varepsilon'' = \alpha + \beta m^{2} + \gamma m + \delta T + \zeta m T $$
(1)

were used to calculate the regression (reg) values of ε″ listed in Tables 1, 2 and 3, where m = 0.01 M, M is the moisture content and T is the temperature. The coefficients of the independent variables were obtained by stepwise regression and can be found in Zhou and Avramidis (1999).
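As a minimal sketch of how Eq. (1) is evaluated, the Python function below takes the five coefficients as arguments. The function name is hypothetical, M is assumed to be expressed in percent, and the coefficient values in the example are placeholders only; the fitted values are reported in Zhou and Avramidis (1999) and are not reproduced here.

```python
def loss_factor_regression(moisture_percent, temperature,
                           alpha, beta, gamma, delta, zeta):
    """Evaluate eps'' = alpha + beta*m^2 + gamma*m + delta*T + zeta*m*T."""
    m = 0.01 * moisture_percent  # m = 0.01 M, with M assumed to be in percent
    return (alpha
            + beta * m ** 2
            + gamma * m
            + delta * temperature
            + zeta * m * temperature)


# Placeholder coefficients for illustration only (NOT the published values).
placeholder = dict(alpha=1.0, beta=2.0, gamma=3.0, delta=4.0, zeta=5.0)

print(loss_factor_regression(50.0, 2.0, **placeholder))
```

The interaction term ζmT means that the effect of temperature on ε″ depends on moisture content, so the two variables cannot be varied independently when the equation is used in a simulation.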

Results and discussion

Learning or training is the process of adapting or modifying the connection weights in response to the stimuli presented at the input buffer and, optionally, at the output buffer. A stimulus presented at the output buffer corresponds to a desired response to a given input. When this desired response is provided by a knowledgeable “teacher”, the learning is called “supervised learning”. Though an ANN consists of units that have a very limited computing capability, when many of these units are connected together, the complete network is capable of performing a very complicated task. Typically, the architecture of an ANN consists of the input layer, the hidden layer and the output layer. The hidden layer processes the data and may consist of one or more sub-layers, depending on the designer’s view (Fig. 1).

The selection of the proper model for the ANN structure always requires several training runs. In the current study, a feed-forward network structure with an input layer, an output layer and a hidden layer of one to three sub-layers was used (Graupe 1997). The input layer consisted of eleven neurons corresponding to the eleven input variables (Fig. 1), whereas the output layer had only one neuron representing ε″. The standard back-propagation algorithm and the hyperbolic tangent function (mapping into the range −1.0 to 1.0) with the extended delta bar delta (Ext DBD) learning rule (Jacobs 1988; Minai and Williams 1990) were employed for the training of the ANN. The Ext DBD is a heuristic technique that has been successful in a number of application areas and that utilizes a momentum term: a term proportional to the previous weight change is added to the standard weight change. In this way good general trends are reinforced and oscillations are damped. The back-propagation algorithm is one of the most popular local algorithms for adjusting the weights of a multi-layer neural network (Rumelhart et al. 1985). The random number seed was kept constant before each training round and the learning coefficient ratio was kept at 1.
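The momentum idea described above can be sketched in a few lines of Python. This is plain gradient descent with a fixed momentum term, not the full Ext DBD rule, which additionally adapts the learning rate and momentum of each weight during training. The function name, the learning rate, the momentum value and the toy error function are all assumptions made for illustration.

```python
def momentum_step(weight, gradient, previous_change,
                  learning_rate=0.1, momentum=0.9):
    """One weight update: standard change plus a share of the previous change."""
    change = -learning_rate * gradient + momentum * previous_change
    return weight + change, change


# Toy problem: minimise the error surface e(w) = w^2, whose gradient is 2w.
w, last_change = 5.0, 0.0
for _ in range(200):
    w, last_change = momentum_step(w, 2.0 * w, last_change)

print(f"weight after 200 updates: {w:.6f}")
```

Because each change carries part of the previous one, consecutive steps in the same direction build up, while steps that keep reversing sign partly cancel, which is the damping effect mentioned in the text.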

In this study, one hundred forty-four (144) sets of experimental data were used, in which ε″ was measured under different temperature and moisture conditions and for various types of chemical composition. In order to achieve a good level of generalization, a sub-set of 72 experimental cases was used in the training phase, whereas the remaining 72 data records were used for testing. The set of experimental values used for ANN training can be seen in Tables 1, 2 and 3.
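A 72/72 partition of this kind could be drawn as in the Python sketch below. The helper name, the use of integer record identifiers and the seed value are assumptions; the source does not describe the mechanics of the selection beyond it being random.

```python
import random


def split_records(records, n_train, seed=0):
    """Shuffle a copy of `records` and split it into training and testing lists."""
    if not 0 <= n_train <= len(records):
        raise ValueError("n_train must lie between 0 and len(records)")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    return shuffled[:n_train], shuffled[n_train:]


record_ids = list(range(144))  # stand-ins for the 144 experimental records
train_ids, test_ids = split_records(record_ids, n_train=72)

print(len(train_ids), len(test_ids))  # 72 72
```

Keeping the two sub-sets disjoint is what allows the testing phase to measure generalization rather than recall of the training records.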

Initially, we trained the ANN using the 72 training data sets and applying one to three sub-layers in order to determine its most appropriate structure. The optimal ANN was found to consist of eleven processing elements in the input layer, three sub-layers in the hidden layer, each consisting of nine processing elements, and an output layer consisting of one processing element. Figure 1 depicts the architecture of the developed optimal ANN.
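The shape of this 11-9-9-9-1 network can be illustrated with a small feed-forward pass in Python. The weights here are random and untrained, and the bias terms, the weight range and the use of the hyperbolic tangent on the output element are assumptions; the sketch shows only how signals flow through the layers, not the trained model.

```python
import math
import random

LAYER_SIZES = [11, 9, 9, 9, 1]  # input, three hidden sub-layers, output


def init_network(sizes, seed=0):
    """Random small weights; the last entry of each neuron is its bias (assumed)."""
    rng = random.Random(seed)
    return [[[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
             for _ in range(n_out)]
            for n_in, n_out in zip(sizes, sizes[1:])]


def forward(network, inputs):
    """Propagate the inputs layer by layer: weighted sum, then tanh."""
    activations = list(inputs)
    for layer in network:
        activations = [
            math.tanh(sum(w * a for w, a in zip(neuron[:-1], activations))
                      + neuron[-1])
            for neuron in layer
        ]
    return activations


network = init_network(LAYER_SIZES)
example_inputs = [0.05] * 11  # hypothetical normalized input vector
output = forward(network, example_inputs)
print(len(network), [len(layer) for layer in network], len(output))
```

Under these assumptions the network holds 298 adjustable parameters, (11 + 1) × 9 + 2 × (9 + 1) × 9 + (9 + 1) × 1, which is large relative to 72 training records and is one reason why testing on unseen data matters.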

The optimal configuration was determined by minimizing the difference between the ANN predicted values and the actual experimental data. The initial prototype ANN performed very well in the training phase, giving R = 0.9989 after 100,000 iterations. Additionally, two ANN instruments, the root mean square error (RMS error) and the confusion matrix, were used to check its validity. The RMS error adds up the squares of the errors for each PE in the output layer, divides the sum by the number of processing elements in the output layer to obtain an average, and lastly takes the square root of that average, hence the name root mean square. The RMS error of the prototype ANN during training had a very low value, specifically 0.0170.
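The RMS error calculation described above can be written as a short Python function. Since the network has a single output element, this sketch averages the squared errors over a list of data records, which is an assumption about how the instrument aggregates them; the function name and the sample values are hypothetical.

```python
import math


def rms_error(desired, predicted):
    """Square the errors, average them, then take the square root."""
    if len(desired) != len(predicted) or not desired:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(d - p) ** 2 for d, p in zip(desired, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical normalized target and network output values.
targets = [0.10, 0.20, 0.30, 0.40]
outputs = [0.12, 0.18, 0.33, 0.39]
print(f"RMS error: {rms_error(targets, outputs):.4f}")
```

Because the errors are squared before averaging, a few large misses raise the RMS error more than many small ones, so a low value indicates that no individual prediction is far off.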

The confusion matrix provides an advanced mechanism for measuring the ANN’s performance during the “learn” and “recall” phases. It allows the actual results of the ANN to be correlated with the desired results in a visual display (Neuralware 2001), giving the user a visual indication of how well the ANN is performing. The network with the optimal configuration must have the bins (the cells in each matrix) on the diagonal from the lower left to the upper right. The fact that the confusion matrix has all of its bins positioned on the main diagonal, as in this case, suggests excellent performance of the ANN. The values of the instruments can be seen in Fig. 2. After determining the final structure of the optimal ANN, the training process was performed.
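For a continuous output such as ε″, a matrix of this kind can be approximated by binning the desired and actual values and counting how many records fall into each pair of bins. The Python sketch below is an illustration under stated assumptions (five equal-width bins over −1 to 1, hypothetical function names and data) and is not a reproduction of the NeuralWare instrument.

```python
def confusion_matrix(desired, actual, n_bins=5, low=-1.0, high=1.0):
    """Count records per (actual bin, desired bin) pair."""
    def bin_index(value):
        clipped = min(max(value, low), high)
        index = int((clipped - low) / (high - low) * n_bins)
        return min(index, n_bins - 1)  # keep value == high in the last bin

    matrix = [[0] * n_bins for _ in range(n_bins)]
    for d, a in zip(desired, actual):
        matrix[bin_index(a)][bin_index(d)] += 1
    return matrix


def all_on_diagonal(matrix):
    """True when every non-zero count sits where actual bin == desired bin."""
    return all(count == 0
               for i, row in enumerate(matrix)
               for j, count in enumerate(row) if i != j)


# Hypothetical normalized desired and actual outputs.
desired = [-0.90, 0.00, 0.90]
actual = [-0.85, 0.05, 0.95]
print(all_on_diagonal(confusion_matrix(desired, actual)))
```

Whether the diagonal runs from lower left to upper right or from upper left to lower right depends only on how the rows are drawn; the criterion in both cases is that the actual and desired bins coincide.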

Fig. 2 Instrument values after training of the ANN

Testing is the process that actually determines the strength of the ANN and its ability to generalize. The performance of an ANN is critically dependent on the training data, which must be representative of the task to be learned (Callan 1999). For this purpose we randomly chose 72 actual cases that were not used in the training phase. In the testing process we applied the same evaluation instruments that we used in training. The testing process proved to be very successful, giving an R² of 0.9945 and an RMS error of 0.0382 after a single pass. Furthermore, the confusion matrix had a very desirable shape, with all of the bins positioned on the main diagonal (Fig. 3).

Fig. 3 Instrument values after testing of the ANN

The input contribution (IC) instrument of the ANN indicated that, among the chemical constituents characterizing the wood type, galactose plays the most important role in the evaluation of ε″, whereas lignin and glucose are a close second and mannose is ranked last (Fig. 4). The IC is a bar chart that displays an analysis of the weights between the input layer and the other layers. The height of each bar shows the relative contribution made by each input layer processing element (PE), with the tallest bar indicating the highest contributor. This provides a useful method of determining the key inputs to an application (Neuralware 2001). Normal usage requires the network to be trained, after which the contribution of each input PE can be gauged by performing a test command. By default the IC graph is not active during Learn and Save Best operations.

Fig. 4 The input contribution instrument showing the importance of various polymers on ε″

Tables 1, 2 and 3 contain the experimental ε″ values, the values calculated from the regression equations (Zhou and Avramidis 1999) and the values estimated by the ANN. The ANN was designed and developed in order to estimate ε″ under different temperature and moisture conditions and for various types of chemical composition. The existing regression analysis (Zhou and Avramidis 1999), on the other hand, was performed without considering the chemical composition of the wood. Thus we performed an integrated linear regression analysis (LRA), using data from all three wood types (HS, HH and CH) and considering their chemical composition as well. In this way the performance of the ANN can be compared with that of a purely statistical approach. The independent variables of the linear regression were T, M, D, V, EXTR, LIG, ARA, GAL, XYL, MAN and GLU, and the only dependent variable was ε″. The LRA yielded the following equation for the 144 data records:

$$ \varepsilon'' = 12.406 + 0.026 \times T + 0.072 \times M - 3.334 \times V + 0.551 \times \text{GAL} - 0.224 \times \text{GLU}. $$
(2)

The linear regression also indicated that galactose plays the most important role in the evaluation of ε″; glucose is also retained in the equation, whereas the other chemical constituents characterizing the wood type were not found to be significant. The R² value of the LRA was 0.8685 and the degrees of freedom were 136. This regression is deemed reasonable, but its performance was clearly much poorer than that of the ANN, which gave much higher R² values, as discussed above. Furthermore, the ANN proved to be adaptable to new data rows, and its performance was confirmed to be of high calibre even when new (previously unseen) data were entered in the testing phase.
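For comparison with the ANN, Eq. (2) can be evaluated with a one-line Python function. The coefficients are those of Eq. (2); the function name and the input values in the example are hypothetical, and the units of each variable are assumed to match those used in Tables 1 to 4.

```python
def loss_factor_lra(T, M, V, GAL, GLU):
    """Linear regression estimate of eps'' from Eq. (2)."""
    return (12.406
            + 0.026 * T
            + 0.072 * M
            - 3.334 * V
            + 0.551 * GAL
            - 0.224 * GLU)


# Hypothetical input values, in the units of the source tables (assumed).
estimate = loss_factor_lra(T=50.0, M=30.0, V=1.0, GAL=2.0, GLU=45.0)
print(f"eps'' estimate: {estimate:.3f}")
```

The equation is linear in every variable, so unlike the ANN it cannot represent interactions such as a temperature effect that changes with moisture content, which may partly explain its lower R².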

Conclusion

The artificial neural network developed in this study performed exceptionally well, which indicates that it is a reliable tool for estimating dielectric loss factor values when wood attributes such as chemical composition are taken into consideration. Its main advantage is that, once trained, it can produce dielectric loss factor estimates at negligible cost in time or money. New data sets at more voltage levels and with a greater variety of wood attributes should be produced and used as input to the ANN in order to improve it further. This project aimed to show the potential of ANNs for easy estimation of the dielectric loss factor, and the results demonstrate their predictive power for the three wood types studied.