Introduction

One of the greatest challenges in advanced oxidation processes (AOPs) is undoubtedly to evaluate the concentration of the analytes involved in the degradation process. Furthermore, treatment effectiveness is guaranteed only when it is ensured that the products formed are less toxic than the starting pollutant.

Since AOPs depend on several factors, modeling these processes requires a multivariate approach. In the last decade, the successful use of chemometric tools for quantitative analysis of species (Kumar et al. 2014; Yehia and Mohamed 2016), environmental quality monitoring and evaluation (Platikanov et al. 2014; Szymanska et al. 2015), and even prediction of toxicological effects (López-Doval et al. 2016) has increased the interest in data processing techniques to analyze organic pollutants, mainly due to the possibility of quantifying these compounds in highly complex matrices (Mas et al. 2010).

Spectroscopic methods associated with multivariate tools are a real alternative to the usual methods of degradation product monitoring, offering instrumental simplicity, high sensitivity, and low cost. Only a limited number of studies combining multivariate calibration and photocatalysis have been reported (Bosco et al. 2006; Emilio et al. 2007; Abbas et al. 2008; Calza et al. 2008; Tian et al. 2008; Liu et al. 2009). In situ UV–Vis spectroscopy was successfully combined with partial least squares regression (PLSR) to monitor the photodegradation of 2-nitrophenol and 4-nitrophenol over TiO2 nanotube catalysts (Tian et al. 2008). Recoveries calculated from results predicted by the PLSR model were between 93.3 and 113.3 %, demonstrating the simultaneous determination of two analytes without interference from other degradation products during the photoelectrocatalytic reaction. Bosco et al. (2006) monitored phenol photodegradation in the presence of its main degradation products during TiO2 photocatalysis by applying Parallel Factor Analysis (PARAFAC) to fluorescence excitation–emission matrices. The proposed model (four factors and 99.98 % fit) was able to quantify the photocatalytic degradation products of phenol.

Among the pollutants, phenol (PH) is considered extremely harmful to the environment and human health due to its toxicity and carcinogenicity, and it has also been classified as a priority pollutant (Wang et al. 2005; Rodrigues et al. 2010; Grabowska et al. 2012; Ling et al. 2015). The main sources of phenol pollution in aqueous environmental systems are wastewaters discharged from various industrial branches, such as paint, pesticides, paper, and petrochemicals. Moreover, this compound is recalcitrant to conventional biological treatment, which makes the removal of trace contaminants a challenge (Wang et al. 2005; Rodrigues et al. 2010; Ling et al. 2015). Heterogeneous photocatalysis is a powerful technique for promoting phenol degradation (Pimentel et al. 2008; Moussavi et al. 2009; Segura et al. 2012; Minella et al. 2014). Among photocatalysts, titanium dioxide (TiO2) is the most used, because it is inert, has low cost, and has a band gap (3.2 eV) that is activated by UV-A light (Wang et al. 2005; Ling et al. 2015). The phenol degradation mechanism is well reviewed by Grabowska et al. (2012), and the main degradation products are hydroquinone (HQ), pyrocatechol (PC), resorcinol (RS), and p-benzoquinone (pBQ), as proposed by Bosco et al. (2006) and Guo et al. (2006). Hyphenated methods such as LC-MS (Peiró et al. 2001) and GC-MS (Sobczyński et al. 2004; Tao et al. 2013) have been employed for these determinations, although they involve expensive, time-consuming procedures and do not allow real-time analysis. The use of PLSR to determine phenol concentration during the photocatalysis process without interference from the other degradation products has not been reported.

Hence, a multivariate spectrophotometric method was developed to determine quantitatively and simultaneously the concentrations of phenol and four intermediates (hydroquinone, pyrocatechol, resorcinol, and p-benzoquinone) during the heterogeneous photocatalysis process. This work demonstrates the application of UV-PLSR to monitor and quantify these compounds during the photocatalytic process as an alternative to the usual HPLC-DAD method.

Materials and methods

Materials

All reagents were of analytical grade: hydroquinone and resorcinol (<99 %, Synth), p-benzoquinone and pyrocatechol (98 %, Merck), and phenol (99 %, Sigma-Aldrich). In the photocatalytic processes, commercial Degussa P-25 titanium dioxide (TiO2, 75 % anatase/25 % rutile, surface area of 50 m2 g−1, particle size of approximately 30–50 nm) was used.

Working solutions were prepared daily with deionized water (18.2 MΩ, Milli-Q system, Millipore, Bedford, MA, USA). HPLC-grade solvents were purchased from J.T.Baker (Philipsburg, NJ, USA).

Apparatus

The spectrophotometric measurements were performed on a Cary 50 spectrophotometer (Varian, Australia), and spectra were recorded between 200 and 350 nm.

High-performance liquid chromatography (HPLC, Varian 920-LC, Mulgrave, Australia) equipped with an autosampler, quaternary gradient pump, and diode array detector (DAD) was used to validate the proposed method. Separations were carried out on a Microsorb-MV 100-5 C18 column (250 × 4.6 mm, 5 μm) from Agilent Technologies (Santa Clara, USA) with a guard column packed with the same stationary phase.

Multivariate methodology

The calibration and validation sets were built with 44 and 8 synthetic mixtures, respectively. The concentrations in all mixtures were selected according to an experimental design that simulates phenol degradation and the formation of its products (HQ, PC, pBQ, and RS) by the photocatalytic reaction, similar to Bosco et al. (2006). As a control, the sum of all concentrations was 30 mg L−1 in each mixture assayed. The relationship between the phenolic compound concentrations during the degradation process and the linear range of the UV absorbance measurements was taken into account in developing the experimental design shown in Fig. 1. Multivariate calibration models using the PLSR algorithm were built in MATLAB (MathWorks, Natick, USA, version 7.0.1).

Fig. 1 Experimental design for calibration and validation sets: the phenol concentration decreased while the concentration of degradation products increased. After every five samples, one contains all compounds studied

Heterogeneous photocatalysis experiments

Briefly, photocatalytic activity depends on a photon with energy greater than or equal to the semiconductor band gap, which is able to promote an electron from the valence band to the conduction band, leaving a hole able to generate hydroxyl radicals (•OH) (Fujishima et al. 2000; Konstantinou and Albanis 2003). The photocatalytic processes were carried out in a bench-scale photoreactor (200 mL, phenol concentration of 30 mg L−1), using an immersed Philips 125-W high-pressure mercury vapor lamp protected by a glass bulb (UV-A) and 100 mg of semiconductor (TiO2). Aliquots of 3.0 mL were collected, and the photocatalyst particles were removed with a 0.45-μm cellulose nitrate membrane filter (Sartorius, Germany). Photocatalysis experiments were performed at the natural pH of the phenolic solution. The system was cooled by a water jacket to maintain the solution at 25 ± 1 °C during the reaction. All solutions were kept in the dark until spectrophotometric measurement. The samples for HPLC-DAD analysis were diluted tenfold to keep the phenolic compounds within the concentration range of this technique.
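As a worked check of the band-gap condition described above, the longest wavelength able to excite the semiconductor can be estimated from the photon energy relation. The 3.2 eV value is the TiO2 band gap quoted in the Introduction; the constant hc ≈ 1239.84 eV nm is a standard value, and the symbols E_g and λ_max are notation introduced here for illustration only.

```latex
E_{\text{photon}} = \frac{hc}{\lambda} \ge E_g
\quad\Longrightarrow\quad
\lambda \le \lambda_{\max} = \frac{hc}{E_g}
\approx \frac{1239.84\ \text{eV nm}}{3.2\ \text{eV}}
\approx 387\ \text{nm}
```

This threshold falls within the UV-A region (315–400 nm), which is consistent with the choice of a UV-A source.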

Degradation kinetics were evaluated in duplicate for 120 min, and the samples were collected at 15-min intervals during the first hour of the photocatalytic process, and subsequently at 30-min intervals.
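The sampling schedule described above can be sketched as follows. This is only an illustration: the function name and its parameters are hypothetical, and including the 0-min sample is an assumption based on the 0-min chromatogram mentioned for Fig. 2.

```python
def sampling_times(first_phase_end=60, total=120, early_step=15, late_step=30):
    """Return sampling times in minutes: early_step spacing up to
    first_phase_end, then late_step spacing up to total."""
    # Assumption: an initial sample is taken at t = 0 min.
    early = list(range(0, first_phase_end + 1, early_step))
    late = list(range(first_phase_end + late_step, total + 1, late_step))
    return early + late

times = sampling_times()
print(times)  # [0, 15, 30, 45, 60, 90, 120]
```

Under these assumptions, each kinetic run yields seven aliquots.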

HPLC-DAD measurements

The spectrophotometric method developed was compared with HPLC-DAD data obtained under conditions based on the literature (Abbas et al. 2008; Cesarino et al. 2012). The mobile phase was a mixture of ultrapure water (solvent A) and acetonitrile (solvent B) with the following gradient program: 0–10 min, 15 % B (isocratic); 10–16 min, 15–60 % B (linear gradient); 16–22 min, to 100 % B (linear gradient); 22–26 min, 100 % B (isocratic); and 26–27 min, back to 15 % B (linear gradient), which was maintained until 36 min. The flow rate was 0.6 mL min−1, the sample injection volume was 50 μL, and the temperature was kept at 30 °C. DAD monitoring wavelengths were 212 nm for HQ, RS, PC, and PH and 247 nm for pBQ. Individual analytical curves were prepared between 0.005 and 5.00 mg L−1.
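The gradient program above can be written as a piecewise-linear function of time, which makes the mobile-phase composition at any retention time easy to look up. The sketch below is illustrative: the names GRADIENT and percent_b are hypothetical, and it assumes that the 16–22 min ramp starts at 60 % B and that the 26–27 min ramp runs from 100 to 15 % B.

```python
# Breakpoints (time in min, % solvent B) read from the gradient program.
# Assumption: ramps run 60 -> 100 % B (16-22 min) and 100 -> 15 % B (26-27 min).
GRADIENT = [(0, 15), (10, 15), (16, 60), (22, 100), (26, 100), (27, 15), (36, 15)]

def percent_b(t_min):
    """Linearly interpolate % B at time t_min (minutes) between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-36 min program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

# Composition at a few times across the run.
for t in (5, 13, 19, 24, 30):
    print(t, percent_b(t))
```

The remainder of the mobile phase at each time is solvent A (100 minus the returned value).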

Results and discussion

Preliminary studies

In the literature, different chemical structures of phenol degradation products have been described (e.g., Grabowska et al. 2012). Accordingly, a heterogeneous photocatalysis experiment was performed, and HPLC-DAD analysis was used to investigate the intermediates qualitatively, as shown in Fig. 2. The HPLC-DAD data showed that hydroquinone, resorcinol, and p-benzoquinone were the main species formed during degradation; this was confirmed by the individual spectra recorded by the DAD and is in agreement with previous papers. During the heterogeneous photocatalysis, the resorcinol concentration was very close to the limit of quantification (25 μg L−1), as shown in Fig. 2 (black line), so the chromatographic method allowed only a qualitative analysis of this compound. Therefore, these compounds were selected for further molecular spectroscopy studies. In addition, pyrocatechol was included as a by-product even though it was not detected by HPLC-DAD (attributed to the limit of detection under the chromatographic conditions used).

Fig. 2 HPLC-DAD chromatograms obtained at different times of photocatalysis (0, 15, and 60 min) and resorcinol standard solution at 25 μg L−1

Furthermore, because these species are formed during the photocatalytic reaction at low concentrations and with strongly overlapping spectra, multivariate calibration methods were applied to extend the potential of UV spectroscopy, particularly for the analysis of complex environmental matrices.

The absorption spectra of the phenolic compounds in the ultraviolet region at 15 mg L−1 are shown in Fig. 3. A detailed analysis of these spectra revealed only slight differences between the compounds, with two well-defined maxima for most of them (212 and 270 nm), except for p-benzoquinone, which has a single band at 247 nm. The spectral overlap between these compounds is quite evident, mainly for PH, PC, RS, and pBQ, representing a practical limitation for the determination of several analytes in a mixture with conventional calibration methods.
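The limitation imposed by spectral overlap can be made explicit with the additive form of the Beer–Lambert law. The expression below is a textbook relation, not one given in this work, and it assumes that the absorbances of the five compounds are additive within the concentration range studied; the symbols ε_i (absorptivity of compound i), b (optical path length), and c_i (concentration) are introduced here for illustration.

```latex
A(\lambda) = b \sum_{i=1}^{5} \varepsilon_i(\lambda)\, c_i ,
\qquad i \in \{\text{PH},\ \text{HQ},\ \text{RS},\ \text{PC},\ \text{pBQ}\}
```

A reading at a single wavelength therefore gives one equation in five unknown concentrations, which is why univariate calibration fails when the ε_i(λ) overlap and why the full spectrum is used instead.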

Fig. 3 UV absorption spectra of phenol (PH) and its main degradation products (hydroquinone, HQ; resorcinol, RS; pyrocatechol, PC; and p-benzoquinone, pBQ) at 15 mg L−1. The phenolic compound mixture was recorded under the same conditions

The influence of pH on the absorption was not investigated, since effluents are usually discharged at neutral pH. Accordingly, the phenol degradation processes were carried out at the natural pH of the solution (approximately 6.0).

Multivariate method

The multivariate study was carried out by PLSR with 52 synthetic mixtures (1–44 as the training set and 45–52 as the validation set), according to the experimental design shown in Table 1. The number of statistically significant PLS factors in the calibration model was selected by the leave-one-out cross-validation procedure developed by Haaland and Thomas (1988) to avoid overfitting. The lowest root mean square error of cross-validation (RMSECV) was achieved with smoothed spectra (Savitzky–Golay algorithm, filter width 5 and second-order polynomial). Six latent variables (LVs) were necessary for the simultaneous determination of the phenolic compounds, explaining 100 and 99.9 % of the total captured variance for the X and Y blocks, respectively. The optimum number of LVs expected for the PLS model corresponds to one LV for each component in the mixture. The additional LV (sixth factor) was probably necessary to resolve the strong overlap between the phenolic compounds.
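The smoothing step mentioned above can be illustrated with a minimal sketch. It assumes that "filter width 5" means a 5-point window; for that window and a second-order polynomial, the Savitzky–Golay smoothing weights are the well-known set (−3, 12, 17, 12, −3)/35. The function name and the edge handling are choices made here for illustration, not details taken from this work, which used MATLAB.

```python
# Convolution weights for a 5-point, second-order Savitzky-Golay smoother.
SG5_QUADRATIC = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)

def savgol5(spectrum):
    """Smooth a list of absorbances; the two points at each edge are left
    unchanged (an edge-handling choice made here for simplicity)."""
    smoothed = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        window = spectrum[i - 2:i + 3]
        smoothed[i] = sum(w * a for w, a in zip(SG5_QUADRATIC, window))
    return smoothed

# Sanity check: a purely quadratic signal must pass through unchanged.
quadratic = [0.01 * k * k for k in range(10)]
print(all(abs(s - q) < 1e-12 for s, q in zip(savgol5(quadratic), quadratic)))
```

Because the filter fits a local quadratic, it suppresses point-to-point noise while preserving smooth band shapes, which is the property that matters before regression.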

Table 1 Mean relative errors of prediction for the synthetic mixtures by the PLS method

For multivariate prediction modeling, each sample in the calibration set is represented by approximately 240 absorbance values, recorded at the same number of wavelengths. The first step in the multivariate approach demonstrated that almost 100 % of the data variance could be represented by a small number of new variables (LVs), which originate from linear combinations of the original variables. If a small number of LVs is enough to represent the original multidimensional variance in the spectra, development of the calibration model is greatly facilitated.
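The reduction from about 240 absorbance values to a few latent variables can be summarized with the usual bilinear PLS model. The notation below is the standard textbook form and is an assumption of this sketch rather than the notation of the algorithm actually used: X is the N × J matrix of spectra (J ≈ 240 wavelengths), Y is the N × 5 matrix of concentrations, T holds the scores on A latent variables (A = 6 in the optimized model), P and Q are loadings, R is a weight matrix, and E and F are residuals.

```latex
\mathbf{X} = \mathbf{T}\mathbf{P}^{\mathsf{T}} + \mathbf{E},
\qquad
\mathbf{Y} = \mathbf{T}\mathbf{Q}^{\mathsf{T}} + \mathbf{F},
\qquad
\mathbf{T} = \mathbf{X}\mathbf{R}
```

Each column of T is a linear combination of the original absorbances, so the regression is carried out on A variables instead of J.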

The loadings (Fig. 4) showed satisfactory similarity to the individual spectra (Fig. 3), which demonstrates the complementarity of the LVs, as well as the correspondence between the first LV and the spectral signal of the mixture. LV2, LV3, LV4, LV5, and LV6 represent PH, HQ, RS, PC, and pBQ, respectively.

Fig. 4 The six PLSR loading vectors of the optimized model. The loadings depict the correlation between the latent PLS variables and the spectral information

Outliers in the calibration set can be detected using the PLSR model. The typical criterion is a plot of studentized residuals versus leverage values (Fig. 5). By definition, the maximum allowable values for these parameters are ±2.5 and 0.41 (3 LV/N), respectively, where LV is the number of latent variables and N is the number of samples in the calibration set (Ferreira et al. 1999). Accordingly, sample 29 showed a high leverage value. Removing this sample from the model provided better predictive capacity.
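The two cut-offs above can be applied as a simple screening rule. The sketch below reproduces the 0.41 leverage limit from 3 LV/N with six LVs and 44 calibration samples; the function names are hypothetical, and the leverage and residual values in the usage example are invented for illustration, not data from this work.

```python
def leverage_limit(n_lv, n_samples):
    """Leverage cut-off 3*LV/N as stated in the text."""
    return 3 * n_lv / n_samples

def flag_outliers(leverages, residuals, n_lv, n_samples, resid_limit=2.5):
    """Return 1-based indices of samples exceeding either limit."""
    h_max = leverage_limit(n_lv, n_samples)
    return [i for i, (h, r) in enumerate(zip(leverages, residuals), start=1)
            if h > h_max or abs(r) > resid_limit]

print(round(leverage_limit(6, 44), 2))  # 0.41

# Hypothetical values for three samples, for illustration only.
print(flag_outliers([0.10, 0.55, 0.20], [0.3, -1.1, 2.9], 6, 44))  # [2, 3]
```

A sample is flagged if it exceeds either limit; high leverage alone, as reported for sample 29, is enough.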

Fig. 5 Studentized residuals versus leverage for detecting outliers in the optimized PLSR model

Predictive capacity was confirmed against the external validation set, yielding the values shown in Table 1. The relative errors were between 0.66 and 4.6 %, with good repeatability (low standard deviations of the estimated values).
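For reference, a relative prediction error of this kind can be computed as sketched below. The formula is not given in the text, so this assumes the common definition (absolute difference between predicted and nominal concentration, as a percentage of the nominal value); the function names and the concentrations in the example are hypothetical.

```python
def relative_error_percent(predicted, reference):
    """Absolute prediction error as a percentage of the reference value."""
    return 100.0 * abs(predicted - reference) / reference

def mean_relative_error(predicted, reference):
    """Average relative error over paired lists of concentrations."""
    errors = [relative_error_percent(p, r) for p, r in zip(predicted, reference)]
    return sum(errors) / len(errors)

# Hypothetical predicted vs. nominal concentrations (mg/L), illustration only.
predicted = [10.2, 4.9]
nominal = [10.0, 5.0]
print(round(mean_relative_error(predicted, nominal), 2))
```

Under this definition, the same function applies to each analyte in the validation set separately.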

The multivariate model showed all the features required to support an analytical method for the control of photocatalytic degradation processes, including a good fit with a small number of latent variables, good correlation between the individual spectra and the loadings, good prediction of the external validation set, and suitable regression coefficients.

TiO2-mediated photocatalysis of phenol and chemometrics

The selected model was applied to monitor PH, HQ, RS, PC, and pBQ during a TiO2 photocatalysis assay, and the results were compared to those of a well-established method (HPLC-DAD). After the chromatographic conditions were established, individual calibration curves were prepared for the analytes. The linear concentration range was from 7.5 to 5000 μg L−1 for PH, from 75 to 5000 μg L−1 for HQ, and from 25 to 5000 μg L−1 for RS and pBQ (R2 > 0.99). Under these chromatographic conditions, pyrocatechol was not determined. During TiO2 photodegradation, the phenol peak (22.5 min) decreased while peaks of other phenolic compounds appeared: hydroquinone at 7.37 min and resorcinol at 10.4 min in the 212-nm channel, and another by-product at 12.1 min, attributed to p-benzoquinone and monitored at 247 nm (inset of Fig. 6). In addition, the PH absorbance decreased over the monitored time with a shift in the wavelength of maximum absorbance, evidencing the formation of intermediates during the photocatalytic reaction (Fig. 6).

Fig. 6 UV-Vis spectra recorded during TiO2-photocatalysis degradation of phenol (30 mg L−1). Inset: HPLC-DAD chromatograms obtained from kinetic data (detection at 212 nm) and the phenolic degradation products formed

Relatively good agreement between the multivariate and chromatographic methods is shown in Table 2. These results are in accordance with Sobczyński et al. (2004), who monitored the formation of HQ at 14.7 % (about 4.2 mg L−1) via GC-MS for a photocatalysis process under similar conditions. Resorcinol was only detected during the photocatalytic degradation. Among the results, an inconsistency was observed in the pyrocatechol and resorcinol determinations, primarily due to the limitations of the chromatographic routine and secondly because the resorcinol concentrations generated during the process were below the chromatographic limit of quantification. p-Benzoquinone was detected at all times evaluated, which suggests that the degradation process kept occurring until mineralization. This result disagrees with that reported by Abbas et al. (2008), where pBQ was not detected due to its disappearance during the filtration of the mixture before analysis.

Table 2 Predicted results for the simultaneous quantification of PH, HQ, PC, RS, and pBQ during the photocatalytic degradation process by UV-PLS and HPLC methods

The UV-PLSR method developed was suitable for monitoring the evolution of degradation products in one type of AOP. Further application of chemometrics in photochemistry depends on the dissemination of studies such as the one described in this paper.

Conclusions

Multivariate models based on partial least squares regression were developed and successfully applied to the simultaneous quantification of phenol and its degradation products throughout the photocatalytic process in aqueous solution. The operational simplicity, fast analysis, and low cost of molecular electronic spectroscopy associated with multivariate tools make it well suited to the determination of degradation products.

Satisfactory quantitative analysis of the validation set allowed the simultaneous quantification of five analytes that participate in the degradation process with low prediction errors. This methodology provides an alternative to the traditional analytical methods used to monitor pollutant degradation in photodegradation processes. Another advantage is the absence of sample pretreatment.

The developed method provided valuable information for further optimization of degradation processes and supports extending this strategy to natural samples and more complex matrices.