Introduction

Increasing demand for bioenergy has led to a greater need to develop technologies for assessing the energy content of available biomass resources. A measure of energy content is gross calorific value (GCV), which is synonymous with higher heating value (HHV) and gross heat of combustion (GHC). Several studies have attempted to relate these values for energy content to specific chemical constituents in various biomass resources. For example, linear regression was used by White (1987) to show that HHV values from several softwoods and hardwoods could be correlated (R² = 0.70) with lignin content. The correlation was slightly improved (R² = 0.76) by including a factor for extractive contents in the equation. Analysis of extractive-free samples resulted in lower HHVs and a higher correlation (R² = 0.97) with lignin content alone. Similarly, Demirbas (2001) determined HHV on an extractive-free basis for a variety of biomass sources with a wide range of lignin values (15.01–55.29 %). In a later study, finding that extractive-free plant parts resulted in lower HHVs than the unextracted plant parts, Demirbas (2003) then also produced an equation relating HHV to extractive contents.

The application of near-IR (NIR) spectroscopy, coupled with multivariate analysis, allows energy content to be predicted. This technique has previously been used as a rapid means to predict physical, mechanical, and chemical properties of woody biomass (So et al. 2004). Regarding energy content, Lestander and Rhen (2005) used this technique to determine the calorific content of Norway spruce samples and suggested its applicability to process monitoring in biofuel plants. The technique was also applied to the determination of calorific value for Populus spp., although the accuracies of the calibration models were limited (Maranan and Laborie 2007). Gillon et al. (1997) found this technique to be half as accurate as the direct determinations. Recently, NIR-based models were successfully used to predict GCV in longleaf pine wood samples and, unlike previous studies, the results demonstrated that the GCV models were driven by the extractives and not the lignin contents (So and Eberhardt 2010).

Compared to NIR spectroscopy, the application of mid-IR spectroscopy provides the added benefit of allowing clearer spectral investigations of the samples to determine those chemical features imparting the greatest effect. Given that most correlations between chemical composition and GCV were based on different plant species and/or plant parts with widely varying lignin contents, it remained to be determined whether variability in the lignin content of wood from a single species could be correlated. Only recently, Zhou et al. (2011) used mid-IR spectroscopic data coupled with multivariate analysis to predict both the calorific value and lignin content for a single species of wood, hybrid poplar, and concluded that calorific value was unrelated to lignin content. Since only extractive-free samples were analyzed, any relationship to extractives could not be assessed.

In the current study, the impact of both extractive and lignin contents on GCV for a single softwood species was investigated. Only the wood was used here, rather than including other plant parts (e.g., leaves and bark), to focus specifically on the impact of chemical composition. Samples were analyzed in an extractive-free condition, but more importantly also analyzed in their native state (before extraction). The application of mid-IR spectroscopy to rapidly determine GCV, extractive, and lignin contents provided the added benefit of allowing spectroscopic investigations into these relationships.

Materials and methods

Materials

Twenty 70-year-old longleaf pine (Pinus palustris) trees were harvested from a spacing, thinning, and pruning study on the Kisatchie National Forest, Louisiana, USA. Disks (5 cm thick) were cut at a height of 1.4 m (i.e., breast height) and at a mid-height along the length of the bole of each tree. The 40 disks were dried under ambient conditions before sectioning to remove a 1-cm-thick slice along the north–south direction and through the center of each disk. Any remaining bark was removed and the wood strips were further reduced in size to allow grinding in a large Wiley mill equipped with a 2-mm-mesh sieve plate.

Extractive and lignin contents

Milled wood samples were exhaustively extracted with acetone using a Soxhlet apparatus. Extracts were concentrated by rotary evaporation, transferred to tared vials, evaporated under a stream of nitrogen, and dried further in vacuo before weighing. Extractive-free milled wood samples were dried under ambient conditions in a fume hood and then further ground in a small Wiley mill equipped with a 40-mesh screen. Lignin content was determined using the acetyl bromide method (Morrison 1972). A standard sample of extractive-free loblolly pine milled wood was used with each set of reactions as a control. The acetyl bromide solution was freshly prepared prior to each determination. Lignin contents were calculated using an absorptivity value of 23.30 g−1 L cm−1 (Johnson et al. 1961). Moisture content determinations were carried out in an oven (102 ± 3 °C) and used to adjust values for extractive and lignin contents to a dry weight basis.
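
As a minimal sketch of the calculation described above, the following Python function converts an absorbance reading into lignin content on a dry weight basis using the Beer-Lambert relationship and the stated absorptivity of 23.30 g−1 L cm−1. The function name, the 1-cm path length, the wet-basis definition of moisture content, and the example sample mass, volume, and absorbance are assumptions introduced for illustration; they are not taken from the procedure used here, which may include blank corrections not shown.

```python
def lignin_percent(absorbance, sample_mass_g, volume_l,
                   absorptivity=23.30, path_cm=1.0, moisture_fraction=0.0):
    """Lignin content (% of oven-dry wood) from an absorbance reading.

    Assumes Beer-Lambert behaviour (A = absorptivity * concentration * path)
    and moisture expressed as a fraction of the weighed (wet) sample mass.
    """
    # Lignin concentration in the measured solution (g/L)
    lignin_conc = absorbance / (absorptivity * path_cm)
    # Oven-dry mass of the weighed sample (g)
    dry_mass = sample_mass_g * (1.0 - moisture_fraction)
    # Concentration of dry sample in the same solution (g/L)
    sample_conc = dry_mass / volume_l
    return 100.0 * lignin_conc / sample_conc


# Hypothetical reading: 5 mg of sample at 5 % moisture, made up to 50 mL
print(round(lignin_percent(0.65, 0.005, 0.050, moisture_fraction=0.05), 1))
```

Correcting the sample mass for moisture before dividing places the result on the same dry weight basis as the extractive contents.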

Calorimetry

Calorimetry was conducted using a Parr oxygen bomb calorimeter 6100 (Parr Instruments, Moline, Illinois, USA). GCV was determined in triplicate following the instructions in the manufacturer’s operating manual (Parr Instruments 2006).

Mid-IR spectroscopy

Spectra of the samples were collected using a Nexus model 670 Fourier transform IR spectrometer (Thermo Nicolet Instruments, Madison, Wisconsin, USA) equipped with a Golden Gate MKII single reflection ATR accessory. Milled wood and extract samples were applied directly to the diamond window. Three spectra were collected for each sample.

Multivariate analysis

Analyses of the data were performed using the Unscrambler (version 8.0) software (CAMO, Woodbridge, New Jersey, USA). The mid-IR data were first averaged to one spectrum per sample. Two data preprocessing techniques were applied to the spectra: multiplicative scatter correction (MSC) and first-derivative transformation (Savitzky-Golay, 5-point averaging). Both were evaluated for removing the systematic variations in the spectral data unrelated to the properties studied. Principal component analysis (PCA) was performed on the spectral data to observe differences and groupings between the sample sets. Partial least squares (PLS) regression was used to predict GCV, extractive content, and lignin content for the samples. Calibration and test sets were randomly created for the analyses with 30 and 10 samples, respectively. Models were generated using cross-validation and assessed using several common measures of calibration performance, including the coefficient of determination (R²) and the standard errors of cross-validation (SECV) and prediction (SEP), which are determined from the residuals of each cross-validation phase and prediction, respectively. A high SECV indicates that the sample excluded in each cross-validation phase was not well predicted by the corresponding calibration. SEP provides a measure of the effectiveness of a calibration in predicting the parameter of interest for an unknown set of samples. The ratios of performance to deviation (RPD) for calibration and prediction are calculated as the ratio of the standard deviation of the reference data to the SECV and SEP, respectively (Williams and Sobering 1993). Because RPD is normalized by the standard deviation of the reference data, it allows comparison between properties with different units and ranges. An RPD value of 2.5 is considered satisfactory for screening, while 1.5 may be acceptable for preliminary use (Williams and Sobering 1993).
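
The preprocessing and performance measures described above can be sketched in a few lines of Python. The block below is an illustration under stated assumptions, not the Unscrambler implementation: MSC is written as a least-squares fit of each spectrum against the mean spectrum, and the standard error is taken as the bias-corrected standard deviation of the residuals, which is one common definition of SECV and SEP. All function names are hypothetical.

```python
from statistics import mean, stdev


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    n_points = len(spectra[0])
    # Reference spectrum: point-by-point mean of all spectra
    reference = [mean(s[i] for s in spectra) for i in range(n_points)]
    ref_mean = mean(reference)
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Least-squares fit: s = offset + slope * reference
        slope = sum((r - ref_mean) * (x - s_mean)
                    for r, x in zip(reference, s)) / ref_ss
        offset = s_mean - slope * ref_mean
        # Remove the additive and multiplicative scatter terms
        corrected.append([(x - offset) / slope for x in s])
    return corrected


def standard_error(measured, predicted):
    """Bias-corrected standard error of the residuals (assumed SECV/SEP form)."""
    residuals = [p - m for m, p in zip(measured, predicted)]
    return stdev(residuals)


def rpd(measured, predicted):
    """Ratio of performance to deviation: SD of reference data / standard error."""
    return stdev(measured) / standard_error(measured, predicted)
```

Passing cross-validated predictions to rpd gives the calibration RPD, while passing test-set predictions gives the prediction RPD. Because the numerator is the SD of the reference data, a sample set with a wider property range yields a higher RPD for the same standard error.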

Results and discussion

Extractives, lignin, and gross calorific value

The variation in extractive contents for the full set of longleaf pine wood samples was large (0.0–20.6 %; SD = 5.0 %; mean = 5.9 %), but the range was not as wide as reported by Via et al. (2007) with values as high as 33 %. The corresponding values for GCV, before any extractions, ranged from 20.2 to 23.6 MJ kg−1 (SD = 0.8 MJ kg−1; mean = 21.2 MJ kg−1). The relationship between GCV and extractive contents was investigated and a strong correlation (R² = 0.91; p value < 0.0001) was found. Demirbas (2003) also reported a similar type of correlation between differential HHVs and extractive contents, but for different plant parts from several tree species. In contrast, Fuwape (1989) stated that the effect of extractives on GHC was not statistically significant for a single species, Gmelina arborea (Roxb). While this was true for the heartwood, bark, branches, and leaves, it was not all encompassing since the data showed a statistically significant effect on the sapwood.
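
For readers wishing to reproduce this type of comparison, a simple least-squares fit and its R² can be computed as sketched below. The function name and the six data pairs are invented for illustration and were chosen only to fall within the ranges reported above; they are not measurements from this study.

```python
def linear_fit_r2(x, y):
    """Ordinary least-squares line y = intercept + slope * x, with its R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a single predictor, R^2 equals the squared correlation coefficient
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2


# Hypothetical values only (extractives in %, GCV in MJ/kg)
extractives = [0.5, 2.0, 5.0, 9.0, 14.0, 20.0]
gcv = [20.3, 20.5, 21.0, 21.6, 22.5, 23.4]
slope, intercept, r2 = linear_fit_r2(extractives, gcv)
print(f"slope={slope:.3f} MJ/kg per %, intercept={intercept:.2f} MJ/kg, R2={r2:.3f}")
```

In a fit of this kind, the intercept estimates the GCV of wood with no extractives and the slope estimates the GCV gained per percentage point of extractives.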

Compared to the wide variation for extractive contents, the variation in lignin content was much smaller (26.6–31.5 %; SD = 1.1 %; mean = 28.2 %); values for lignin content were similar to those found by Via et al. (2007) and Koch (1972). Following extraction, the lower GCVs covered an even narrower range (19.5–20.6 MJ kg−1; SD = 0.3 MJ kg−1; mean = 20.1 MJ kg−1). Therefore, it was not surprising to find that the relationship between lignin content and GCV for the extracted samples was poor (R² = 0.12, p value = 0.0266). Similar lignin and GCV values have previously been observed with extractive-free poplar samples (Zhou et al. 2011). In reporting a poor relationship between lignin and GCV (R² = 0.0973, p value = 0.1344), Zhou et al. (2011) concluded that the lignin range was too narrow to obtain usable calibration models. For comparison, Telmo and Lousada (2011) used a variety of wood species, thereby providing large ranges for both lignin (22.2–41.4 %) and extractive (2.1–14.5 %) contents. A multiple linear regression of the HHV of the unextracted wood with lignin and extractives gave an equation with an R² of 0.84 (r = 0.915). Curiously, lignin content was quoted on the basis of the extractive-free samples while HHV and extractive contents were on the basis of the unextracted samples; the lignin contents should have been normalized to account for the extractives contributing to the HHV of the unextracted samples. Nevertheless, it is readily apparent that the strength of these relationships is dependent on the relatively wide value ranges. With respect to lignin content, the necessary wide ranges have been obtained by using various tissues and/or species.
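
The normalization referred to above can be illustrated with a short Python sketch. It assumes that the extractive content is expressed as a percentage of the dry unextracted wood, so that the extractive-free fraction of the wood is (1 − extractives/100); the function name and the pairing of example values are hypothetical.

```python
def lignin_unextracted_basis(lignin_ef_percent, extractives_percent):
    """Convert lignin content from an extractive-free basis to an
    unextracted-wood basis (both in % of dry mass)."""
    # Fraction of the unextracted wood that remains after extraction
    extractive_free_fraction = 1.0 - extractives_percent / 100.0
    return lignin_ef_percent * extractive_free_fraction


# Illustrative pairing: 28.2 % lignin (extractive-free) in wood with 20.6 % extractives
print(round(lignin_unextracted_basis(28.2, 20.6), 1))
```

The same lignin value measured on extractive-free wood therefore corresponds to a lower lignin percentage in the unextracted wood as extractive content rises, which is why the two bases should not be mixed in one regression.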

Mid-IR spectroscopy

The typical mid-IR spectrum of a milled sample of unextracted longleaf pine wood is shown in Fig. 1a, exhibiting the commonly observed bands at 1,000–1,150 (C–O stretch), 1,700–1,750 (C=O stretch), 2,800–3,000 (C–H stretch), and 3,100–3,600 (O–H stretch) cm−1. Figure 1c shows the spectrum of the corresponding extract, likely consisting mostly of resin and fatty acids. Clearly defined peaks at 1,697, 2,928, and 2,955 cm−1, typical of the pine resin components, were greatly reduced following extraction as shown in the spectrum of the corresponding extracted wood sample in Fig. 1b. A difference spectrum was also calculated between the extracted and unextracted wood spectra (Fig. 1d) and closely resembles the spectrum of the extract. This difference spectrum was also very similar to a difference spectrum between resinous and non-resinous Scots pine heartwood (Nuopponen et al. 2003).

Fig. 1

Mid-IR spectra of unextracted (a) and extracted (b) wood samples and resultant extract (c). Difference spectrum (d) and PCA loadings plot (e) from unextracted and extracted wood samples

PCA of unextracted wood samples

The resultant PCA scores plot of the mid-IR spectra clearly displayed differences between the unextracted and extracted wood samples with two distinct clusters along PC1 (plot not shown). Figure 1e is a plot of the PCA loadings for PC1, explaining 93 % of the variation, resembling the (extracted–unextracted) difference spectrum in Fig. 1d, except for the presence of the prominent C–O peak at 1,016 cm−1. Further analysis did not reveal any clear differences between the samples taken at breast height and mid-height along the bole of each tree.

PLS of unextracted wood samples

Partial least squares regression was then carried out to determine calibrations for extractive contents and GCV. The calibration for extractive contents from the unextracted wood samples provided strong calibration statistics (R² = 0.94; SECV = 1.46 %; RPD = 3.68; 2 factors) with MSC and is listed in Table 1. Meder et al. (1999) built an extractive model based on mid-IR spectra from samples of Pinus radiata wood with a resultant multiple correlation coefficient, r, of 0.87 using four factors. The model based on GCV for the unextracted samples also provided a strong correlation with 2 factors (R² = 0.91; SECV = 0.28 MJ kg−1; RPD = 3.09), but poorer than that based on extractive contents. Analyses were also performed with the first-derivative spectral data, providing results similar to those for MSC with extractives (R² = 0.96; SECV = 1.53 %; RPD = 3.50) and GCV (R² = 0.89; SECV = 0.30 MJ kg−1; RPD = 2.83) using a different number of factors. PLS regression for lignin content was not carried out with the unextracted samples as the calculated lignin values would be a function of the extractive contents.

Table 1 Model statistics for the 30 calibration samples

PLS of extracted wood samples

Models based on GCV were also built with the extracted wood samples in the current study (Table 1); however, these performed poorly with both MSC (R² = 0.58; SECV = 0.23 MJ kg−1; RPD = 1.06; 3 factors) and first derivatives (R² = 0.35; SECV = 0.23 MJ kg−1; RPD = 1.05; 2 factors). The use of a narrower wave number range (650–1,800 cm−1) did not result in much improvement with the GCV models. In contrast, Zhou et al. (2011) obtained strong correlations with GCV from extractive-free hybrid poplar samples using the mid-IR range of 700–2,000 cm−1 and a variety of preprocessing methods. The R² values ranged from 0.86 to 0.90 using models based on raw spectra, automatic wave number restriction with baseline correction, and other methods. The caveat here was that these models required between 10 and 12 factors. The model based on first derivatives provided an R² of 0.873 and root mean squared error of cross-validation (RMSECV) of 92 J g−1 requiring 10 factors.

Similar to the GCV model, the lignin model generated with MSC data performed poorly (R² = 0.61; SECV = 0.89 %; RPD = 1.01) when using 4 factors (Table 1). Zhou et al. (2011) were able to build lignin models with R² values ranging from 0.74 to 0.91. Preprocessing by automatic wavelength selection with vector normalization provided the best correlation (R² = 0.91 and RMSECV = 0.77 %), but required 12 factors. Wavelength restriction (1,380–1,650 cm−1) exhibited the largest lignin differences and provided an R² of 0.82 and RMSECV of 0.86 % while utilizing only 3 factors. Orthogonal signal correction was applied to mid-IR spectra from Sitka spruce, in which an R² of 0.78 was obtained with a root mean squared error of estimation (RMSEE) of 0.8 % using 2 factors (Nuopponen et al. 2006). In the current study, the application of the first-derivative transformation to the spectral data resulted in a model with R² = 0.91 and SECV = 0.76 % using only 4 factors; however, the RPD value, while slightly better, was still very poor (1.18). Furthermore, wavelength restriction (650–1,800 cm−1) did not improve the models for the MSC or first-derivative data. A lignin model with first-derivative data was also built by Zhou et al. (2011), but no improvement (R² = 0.742; RMSECV = 0.978 %; 4 factors) was observed.

PLS of extract samples

The spectra of the extracts were better resolved than those from the unextracted wood samples, and a typical spectrum is shown in Fig. 1c. Spectral differences were observed between the extract samples, and relative peak intensities appeared to be dependent upon the extractive contents of the wood samples (Fig. 2). Thus, a model was built based on the corresponding wood extractive contents from the extract spectra (MSC corrected). This provided a strong correlation (R² = 0.88; SECV = 2.40 %; RPD = 2.23; 3 factors), though not as strong as that obtained from the unextracted wood samples. The models for extractive contents using the unextracted wood samples are based on the amount of total extractives relative to the wood polymeric matrix. That clearly is not the case for the models obtained from the spectra of the actual extracts. Any correlation to the wood extractive contents is instead based on the relative composition of the extractive components comprising the extracts. Thus, by extension, it can be surmised that there is a proportional change in the extractive composition coinciding with the amount of total extractives present in the unextracted wood samples. Bands showing positive trends (increasing peak height) with wood extractives at 2,928, 2,855, and 1,697 cm−1 can be assigned to resin and fatty acids. Thus, relative to other pine extractives, such as the phenolic compounds known to occur in pine extracts (e.g., lignans and stilbenes), the increasing total extractive contents observed here can be attributed to an increasing proportion of resin and fatty acid components. The application of first derivatives to the complex spectrum of the extracts resulted in a poorer model as compared with the MSC treatment. The authors are unaware of any previously reported studies in which calibration models were built from the spectra of the actual extracts.

Fig. 2

Mid-IR spectra of three extracts (obtained from low to high extractive wood samples)

Regression coefficients

The regression coefficients, which show the bands with the greatest influence on the models based on the unextracted wood samples and the extracts, are shown in Fig. 3 for the MSC data. The GCV model for the unextracted wood is clearly influenced by many of the same bands as the extractive model, and the regression coefficients are similar. The C=O stretch band at 1,697 cm−1 plays the most important role, followed by the C–H stretch bands between 2,800 and 3,000 cm−1, all of which give clearly defined peaks. These prominent bands were not so evident in the plot for GCV of the extracted wood (not included in Fig. 3). The regression coefficients plot for extractives from the actual extracts showed many of the same main bands as those from the unextracted wood, and the bands appear better resolved. Similarly, many of these bands correlate with the extract spectra obtained from low to high extractive wood (Fig. 2), with additional bands located at 1,363 and 1,100–1,200 cm−1. Figure 3 appears to support earlier findings of the relationship between GCV and extractives (So and Eberhardt 2010).

Fig. 3

Regression coefficients for GCV and extractives (based on unextracted wood spectra and values). An extractives model was also built using extract spectra and unextracted wood values

Evaluation of prediction samples

The results from the test sets are shown in Table 2. Predictions of extractive contents for the unextracted samples performed best, with the RPD values closely matching those from the calibration models. This was also observed for the extract with MSC data; however, first-derivative data from the complex extract spectra performed more poorly. There was significant deterioration from the calibration results (Table 1) to the test results (Table 2) for GCV with the unextracted samples. In the case of the extracted test samples, the RPD values appeared better than those obtained from the calibration results. However, this may be primarily attributed to the larger variation in lignin content and GCV for the test sets: the SDs for the test sets are larger than those for the calibration sets. The samples in this study were taken at both breast height and mid-height along the bole of each tree. Since the samples were randomly assigned to the calibration and test sets (30 and 10 samples, respectively), this observation is merely a manifestation of variability experienced in sampling, although the property range of the calibration samples should encompass that of the test sets for good model testing.

Table 2 Model statistics for the 10 test samples

Conclusion

A strong relationship was apparent between the variation in GCV and variation in extractive contents. Mid-IR spectroscopy coupled with multivariate analysis provides a useful technique for demonstrating this relationship. Plotting the regression coefficients for GCV and extractive contents showed that the same bands were responsible for these strong models. Although total lignin content does impact total GCV, no correlation was found between the variation in GCV and lignin content for the extractive-free wood samples. Strong models were also built for wood extractive contents using the actual extracts. Thus, there would appear to be a proportional change in extractives composition coinciding with the total amount of extractives present in the unextracted wood samples.