1 Introduction

Nitrous oxide is a powerful greenhouse gas whose increasing emissions due to anthropogenic activities such as agriculture are of high concern to the atmospheric science community. The development of accurate measurement techniques to determine concentrations and fluxes is required to identify sources and sinks of N2O in relation to the nature of the soil or to the extensive use of fertilizers [1–3]. Infrared diode laser spectrometry is an effective tool to reach such an objective as it allows rapid, selective, sensitive, and precise gas monitoring [4–7]. Hence, we have developed a laser spectrometer, called “QCLAS”, based on quantum cascade laser technology and devoted to the in situ monitoring of N2O (respectively CH4) by absorption spectroscopy at 4.5 (respectively 7.9) µm [8]. This laser sensor has been deployed in several field campaigns to determine emission fluxes of N2O and CH4 over agricultural farms, in collaboration with the French institute for agronomical research (INRA). The achieved precision error in the N2O retrieval lies near 0.2 % for a measurement time of 100 milliseconds. To further reduce this precision error, we have used signal processing tools for laser spectrometry, which we discuss in this paper.

There are several techniques to reduce noise in spectrometric signals, among which are wavelet transforms [9–12] and singular value decomposition (SVD) [13–16]. The latter is a very common technique of multivariate analysis. It can be applied to multidimensional data in various fields (acoustics, imaging, remote sensing, etc.). Basically, the SVD method consists of determining a space of dimension as small as possible that characterizes the studied dataset; this space is further decomposed into two complementary subspaces, the first one characterizing the signal and the second one the noise. The SVD technique is highly suitable for infrared spectroscopy as it makes it possible to denoise the molecular spectra, as we discuss in more detail further in this paper.

In the next section, we briefly describe our laser spectroscopy technique. Then, we discuss the SVD method applied to the in situ monitoring of N2O. The first approach is based on simulated spectra using various types of noise to assess the most appropriate SVD parameters for our particular application. The method is then applied to experimental data obtained in the field or in the laboratory with calibrated N2O gas samples. We show that the SVD technique is powerful both for denoising the spectra and for reducing the dispersion in the concentration data.

2 Infrared laser diode spectroscopy

A typical set-up of diode laser spectroscopy is shown in Fig. 1 (see Ref. [17] for more details). The diode laser source has a usual average output power ranging from a few milliwatts up to several tens of milliwatts, and the laser line width is usually less than 10 MHz. Moreover, there are no mode-hops over the tunability range. The continuous tuning range (at constant temperature) is usually within 3 cm−1. The laser is temperature-stabilized by means of a Peltier thermo-element and is driven by a low-noise current supply. A low-frequency triangular ramp is used to scan the laser over the selected absorption lines by sweeping the driving current.

Fig. 1 Schematic of a tunable diode laser spectrometer

The laser beam is usually separated into two parts via a beam splitter. The reflected beam is coupled with a Fabry-Pérot (FP) etalon used for relative frequency calibration. The FP gives good insight into the linearity of the tuning of the laser emission frequency; the signal from the FP further permits detection of mode hops over the laser spectral tunability range. The FP signal is used to perform the frequency scaling, i.e. to attribute a frequency value to each point number in the spectrum. The second beam is passed through an optical multipass cell filled with the gas under study. Both beams are focused on photodetectors adapted to the laser wavelength. The spectra are usually averaged to enhance the signal-to-noise ratio. The spectra are digitized with a 16-bit converter and stored on a computer for further processing. To retrieve the gas concentration, a nonlinear least-squares fit is applied to the molecular transmission using a Voigt profile for the modeling of the line shape. The molecular transmission T(σ) is obtained from a procedure presented at length in [17]. First, the FP signal is used to perform the frequency scaling with a polynomial interpolation on the interference fringes. In a second step, the molecular transmission T(σ) is retrieved from the direct spectrum A using the following relation:

$$ A(\sigma ) = A_{0} (\sigma )T(\sigma ) $$
(1)

\( A_{0} \) is what the laser flux would be in the absence of absorber in the cell. \( A_{0} \) corresponds to the baseline and is usually obtained from A with a polynomial interpolation over the full transmission region. Then the density of absorbing molecules n is related to the molecular transmission through the Beer–Lambert law:

$$ T(\sigma ) = A(\sigma )/A_{0} (\sigma ) = \exp [ - k(\sigma ,T,p)nl] $$
(2)

where l is the optical path length, and k(σ, T, p) the absorption coefficient at temperature T and for a gas pressure p usually modeled using a Voigt profile. The retrieved concentrations are usually smoothed with a moving average. In this paper we will show the improvement achieved using a signal denoising procedure based on singular value decomposition.
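The chain from Eq. (1) to Eq. (2) can be sketched numerically. The example below is only an illustration under stated assumptions: a Lorentzian replaces the Voigt profile, the frequency axis is in arbitrary units, and the values of `K_PEAK` and `N_TRUE` are hypothetical numbers chosen so that the peak absorption is close to 17 % over a 76 m path. It is not the retrieval code of Ref. [17].

```python
import numpy as np

# Illustrative values only (assumptions, not instrument calibration data).
PATH_LENGTH_CM = 7600.0   # l: 76 m multipass cell
N_TRUE = 3.1e11           # n: molecules/cm^3, roughly 320 ppb at 40 mbar and 25 °C
K_PEAK = 7.9e-17          # k at line center in cm^2/molecule, chosen for ~17 % absorption


def lorentzian(x, half_width):
    """Peak-normalized Lorentzian, a simple stand-in for the Voigt profile."""
    return half_width**2 / (x**2 + half_width**2)


def reconstruct_baseline(x, signal, wing_mask, degree=2):
    """Fit a polynomial A0(x) to the (nearly) zero-absorption wings."""
    coeffs = np.polyfit(x[wing_mask], signal[wing_mask], degree)
    return np.polyval(coeffs, x)


x = np.linspace(-1.0, 1.0, 401)                            # relative frequency axis
k = K_PEAK * lorentzian(x, half_width=0.02)                # absorption coefficient
transmission_true = np.exp(-k * N_TRUE * PATH_LENGTH_CM)   # Eq. (2)
baseline_true = 1000.0 * (1.0 + 0.3 * x - 0.1 * x**2)      # A0: laser power ramp
direct_signal = baseline_true * transmission_true          # Eq. (1)

# Recover A0 from the wings, then T = A / A0, then n from the line center.
wings = np.abs(x) > 0.6
baseline_est = reconstruct_baseline(x, direct_signal, wings)
transmission_est = direct_signal / baseline_est
center = np.argmax(k)
n_est = -np.log(transmission_est[center]) / (k[center] * PATH_LENGTH_CM)
print(f"retrieved n = {n_est:.3e} cm^-3 (simulated with {N_TRUE:.3e})")
```

A single-point inversion at the line center is used here only for brevity; the procedure described in the text fits the full line shape.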

3 Overview of the singular value decomposition method

We give in this section an overview of the SVD theory applied to laser spectrometry. A more complete presentation of the theoretical basis can be found, for instance, in Ref. [16].

We call X a dataset of N2O spectra arranged as a matrix of dimension n × m and rank r.

The singular value decomposition of the matrix X is the factorization:

$$ X = U\Upsigma V^{\text{T}} = \sum\limits_{i = 1}^{q} {\lambda_{i} u_{i} v_i^{\text{T}} } $$
(3)

where U and V are orthogonal matrices of respective dimensions n × n and m × m, and their column vectors, \( u_{i} \) and \( v_{i} \) in (3), are called left and right singular vectors, respectively. Moreover, \( u_{k} \) is called the propagation vector and \( v_{k} \) the normalized wavelet. Σ is the diagonal matrix of dimension n × m whose diagonal elements \( (\lambda_{i})_{i \in \{1, \ldots, q\}} \), with \( q \le \min(m,n) \), ranked in descending order, are called the singular values of X.

According to (3), the SVD method writes the matrix X as the sum of r matrices \( u_{i} v_{i}^{\text{T}} \) of size (n, m), where r is the rank of X, each weighted by a singular value \( \lambda_{i} \) of X. Filtering the signal X with SVD consists of assuming that the most significant singular values convey the principal information on the signal, while the low singular values characterize the noise. Assuming the first s singular values are sufficient to characterize the signal X, we can rewrite

$$ X = U\Upsigma V^{\text{T}} = \sum\limits_{i = 1}^{q} {\lambda_{i} u_{i} v_{i}^{\text{T}} } = \sum\limits_{i = 1}^{s} {\lambda_{i} u_{i} v_{i}^{\text{T}} } + \sum\limits_{i = s + 1}^{q} {\lambda_{i} u_{i} v_{i}^{\text{T}} } $$
(4)

The initial dataset X is thereby separated into two complementary subspaces, the signal \( X^{\text{signal}} \), characterized by s singular values, and the noise \( X^{\text{noise}} \), characterized by the r − s remaining singular values, such that:

$$ X = X^{\text{signal}} + X^{\text{noise}} $$
(5)

Furthermore, according to the Eckart–Young theorem [18], \( X^{\text{signal}} \) is the matrix of rank s that minimizes the Frobenius norm of the difference between X and \( X^{\text{signal}} \). Hence, it is the best approximation of X in the least-squares sense by a rank s matrix.

Practically, \( X^{\text{signal}} \) is constructed from relation (3) with a weight thresholding process that consists of setting to zero in the Σ matrix the (q − s) singular values that are deemed insignificant as far as the main signal information is concerned. Since the number of significant singular values of the dataset X depends on the application, we discuss the major point of determining the threshold value in the next section using simulated absorption spectra.
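The weight thresholding of relations (3) to (5) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `svd_denoise` and the toy matrix are hypothetical, and the spectra are assumed to be stored as the rows of X.

```python
import numpy as np


def svd_denoise(X, s):
    """Split X into a rank-s 'signal' part and the complementary 'noise' part."""
    U, singular_values, Vt = np.linalg.svd(X, full_matrices=False)
    kept = singular_values.copy()
    kept[s:] = 0.0                  # set the q - s smallest singular values to zero
    X_signal = (U * kept) @ Vt      # first s terms of the sum in Eq. (3)
    X_noise = X - X_signal          # remaining terms, so that Eq. (5) holds
    return X_signal, X_noise, singular_values


# Toy usage: 50 identical rows (rank 1) plus white Gaussian noise.
rng = np.random.default_rng(0)
clean = np.outer(np.ones(50), np.sin(np.linspace(0.0, np.pi, 80)))
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
X_signal, X_noise, sv = svd_denoise(noisy, s=1)
print(np.linalg.norm(noisy - clean), np.linalg.norm(X_signal - clean))
```

By the Eckart–Young theorem, the truncated matrix is the closest rank-s matrix to the noisy X in the Frobenius norm. Whether it is also close to the noise-free signal depends on the choice of s.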

4 Approach by simulation

In this section, we use simulated N2O spectra to find the most appropriate SVD parameters for our applications. The simulations use pressure, temperature and N2O concentrations close to the experimental conditions reported later. With the QCL experimental setup used [8], the Flicker electronic low-frequency noise [19] and the standard Gaussian white noise [20] are the predominant sources of noise, and both are randomly added to each simulated spectrum. Interfering Fabry–Perot fringes due to scattered light in the multipass optical cell are weak and hidden by electronic noise. We return to the nature of the predominant noise later in this paper, when experimental spectra are reported.

Given the experimental conditions, three cases will be considered to simulate the matrix X of N2O spectra:

Case 1 simulation of a number \( n_{1} \) of spectra, randomly noised, with the same known concentration. This case is useful to show that the SVD preserves stability.

Case 2 simulation of a number \( n_{2} \) of spectra, randomly noised, with different steps of concentration. This case is useful to show that the SVD preserves real variations. In the particular case of two concentration levels, we consider m spectra with the same concentration value followed by \( n_{2} - m \) spectra with another concentration value.

Case 3 simulation of \( n_{3} \) spectra, randomly noised, with increasing concentrations. This case is useful to show that the SVD is suited to flux measurements.

We consider the N2O rotation-vibration transition at 2,235.49 cm−1 of the ν3 vibrational band. The parameters for the simulation are temperature = 25 °C, pressure = 40 mbar and optical path length = 76 m. In each case, we simulate a matrix of rank 2,500 made of 2,500 spectra with 2,500 points each. For both types of noise, spectra are represented in Fig. 2.

Fig. 2 Simulated N2O spectra at 2,235.49 cm−1 with Gaussian and Flicker noises (T = 25 °C, p = 40 mbar, l = 76 m, N2O concentration = 320 ppb)
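The three simulation cases and the two noise types can be mocked up as follows. This is a hedged sketch, not the simulation code used for Fig. 2: the Lorentzian line shape, the scaling `TAU_PER_PPB`, the FFT-based 1/f noise generator and all function names are assumptions made for illustration, and the matrix is much smaller than the 2,500 × 2,500 one described above.

```python
import numpy as np

TAU_PER_PPB = 0.186 / 320.0   # assumed: 320 ppb gives about 17 % peak absorption


def line_shape(x, half_width=0.05):
    """Peak-normalized Lorentzian (assumed stand-in for the Voigt profile)."""
    return half_width**2 / (x**2 + half_width**2)


def concentration_profile(case, n_spectra):
    """Concentrations in ppb for the three simulated cases."""
    if case == 1:   # stable concentration
        return np.full(n_spectra, 320.0)
    if case == 2:   # steps of concentration
        return np.repeat([315.0, 330.0, 360.0], n_spectra // 3 + 1)[:n_spectra]
    if case == 3:   # increasing concentration
        return np.linspace(320.0, 340.0, n_spectra)
    raise ValueError("case must be 1, 2 or 3")


def flicker_noise(rng, n_points, std):
    """Shape white noise so that its power spectrum falls off as 1/f."""
    spectrum = np.fft.rfft(rng.standard_normal(n_points))
    freqs = np.fft.rfftfreq(n_points)
    freqs[0] = freqs[1]                   # avoid dividing by zero at DC
    noise = np.fft.irfft(spectrum / np.sqrt(freqs), n=n_points)
    noise -= noise.mean()
    return std * noise / noise.std()


def simulate_matrix(case, n_spectra, n_points, noise, noise_std, seed=0):
    """Return the axis, concentrations, noise-free and noisy spectra (one per row)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n_points)
    conc = concentration_profile(case, n_spectra)
    clean = np.exp(-np.outer(conc * TAU_PER_PPB, line_shape(x)))
    if noise == "gaussian":
        added = rng.normal(0.0, noise_std, clean.shape)
    elif noise == "flicker":
        added = np.array([flicker_noise(rng, n_points, noise_std)
                          for _ in range(n_spectra)])
    else:
        raise ValueError("noise must be 'gaussian' or 'flicker'")
    return x, conc, clean, clean + added


x, conc, clean, noisy = simulate_matrix(case=2, n_spectra=30, n_points=256,
                                        noise="flicker", noise_std=0.002, seed=1)
```

The spectra here are expressed in transmission units with a flat baseline, so the noise standard deviation is relative to the baseline rather than in detector counts.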

4.1 Significant singular values

In the ideal case of a single concentration value, the matrix of spectra is composed of 2,500 identical spectra and a single axis (singular value) is sufficient for the SVD decomposition. In the case of a variable concentration, one axis is no longer sufficient, because additional axes describing the changes in concentration, which are independent of the first axis, must be taken into account.

Hence, to determine the number of necessary axes, we have plotted in Fig. 3 the singular values, on a logarithmic scale, obtained by applying the SVD decomposition to a simulated matrix of noisy spectra in the case of increasing concentrations between 320 and 340 ppb. We observe in Fig. 3a and b that for both noise cases, the singular values first decrease sharply and then much more slowly, which suggests that it is possible to approximate the original matrix by a matrix of lower rank constructed by keeping only a few significant singular values. In Fig. 3a, b, it is obvious that for both noise types, the first singular value is the most significant one. In Fig. 3c and d, we have expanded the view by plotting the singular values from the second to the 30th for both noise cases. In the Gaussian noise case (Fig. 3c), we can see that the second singular value also differs from the remaining ones. In the Flicker noise case (Fig. 3d), there are two singular values to be taken into account, the second and the third, which are more significant than the remaining ones; compared with the Gaussian noise case, one more axis is needed here, which may be explained by the distortions in the baseline of the spectra caused by the Flicker noise. From these observations, it appears that we need to consider at least three significant axes to handle both noise types.

Fig. 3 Singular values with Gaussian noise (a all the singular values, c zoom between the second and the 30th singular value) and Flicker noise (b all the singular values, d zoom between the second and the 30th singular value) in the case of increasing concentrations
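The behavior described for the Gaussian noise case can be reproduced qualitatively with a small toy matrix. The sketch below is an assumption-laden illustration: it uses a Lorentzian line, a flat baseline, a hypothetical noise level of 5e-4 in transmission units and a 300 × 200 matrix, so the numerical values are not those of Fig. 3.

```python
import numpy as np

rng = np.random.default_rng(0)
n_spectra, n_points = 300, 200

x = np.linspace(-1.0, 1.0, n_points)
line = 0.05**2 / (x**2 + 0.05**2)                 # assumed Lorentzian line shape
conc_ppb = np.linspace(320.0, 340.0, n_spectra)   # increasing concentrations
tau_peak = conc_ppb * (0.186 / 320.0)             # assumed scaling, ~17 % at 320 ppb

# One spectrum per row, with white Gaussian noise added.
X = np.exp(-np.outer(tau_peak, line))
X += rng.normal(0.0, 5e-4, (n_spectra, n_points))

singular_values = np.linalg.svd(X, compute_uv=False)   # sorted in descending order
log_sv = np.log10(singular_values)                     # what a plot like Fig. 3 shows

for i in range(5):
    print(f"lambda_{i + 1} = {singular_values[i]:.4g}")
```

In this toy case the first singular value carries the mean spectrum, the second carries the concentration ramp, and the following ones form a slowly decreasing noise floor. Flicker noise is not included here.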

To go deeper into the analysis, we have plotted in Fig. 4, in the case of the Flicker noise, the first four normalized wavelets (v1 to v4), for stable (Fig. 4a) and increasing (Fig. 4b) concentrations. The choice of keeping at least three significant axes seems coherent, as we observe that the first three normalized wavelets mainly carry the information on the signal peak and background. The fourth normalized wavelet describes noise in the stable concentrations case, while in the increasing concentrations case it carries a little information on the absorption peak, which is due to the small difference between spectra (the absorption peak varies with increasing concentration from one spectrum to the next). Hence, for the case of increasing concentrations, we have decided to keep a fourth singular value when truncating the signal matrix, to take into account the slight signal information that remains in the fourth normalized wavelet.

Fig. 4 First four normalized wavelets for Flicker noise case: stable concentrations in (a) and increasing concentrations in (b)

To conclude, from this approach by simulation, we have decided to keep the three most significant singular values when constant concentrations are considered and the four most significant singular values when increasing concentrations are to be processed. Hence, the cut-off index k (the number s of retained singular values in Eq. 4) is fixed to 3 or 4 accordingly, and this truncation is used to construct the \( X^{\text{signal}} \) matrix.

4.2 Bias and data precision

Figure 5a, b displays simulated spectra with and without the application of an SVD decomposition and truncation keeping the three most significant singular values, for both types of noise. Comparing the spectra, we observe that in these cases the signal-to-noise ratio (SNR) is improved by a factor of 60. In Fig. 5c, d, the residuals, i.e., the differences between the noisy and noiseless spectra, are plotted with and without the SVD decomposition and truncation of the noisy spectrum. They show the same improvement by a factor of 60.

Fig. 5 Simulated N2O spectra, with and without the application of the SVD decomposition and truncation, for both Gaussian (a, c) and Flicker (b, d) noises

The improvement in the SNR of the spectra by applying the SVD decomposition is not sufficient; one has to show that the concentration values are not distorted by the signal processing. To this end, we have constructed large X matrices made of simulated spectra for cases 1, 2 and 3 with various amplitudes for the Gaussian and Flicker noise. The \( X^{\text{signal}} \) matrix was then constructed by applying an SVD decomposition followed by a truncation (with the rules given at the end of the previous section). The concentrations were then retrieved by applying a least-squares fit to the spectra in the X and \( X^{\text{signal}} \) matrices. The inversion technique for a given spectrum, which is based on a non-linear least-squares fit to the full molecular line shape, is fully described in Ref. [17]. In particular, the baseline is reconstructed from a third-order spline interpolation over the zero-absorption region on both sides of the spectrum.

The results are displayed in Table 1. The mean value and the standard deviation of the residuals, i.e., the difference between the concentration values obtained by fitting the spectra in the X (or \( X^{\text{signal}} \)) matrix and the exact concentration value (used for the simulation), are calculated. Two cases are reported but the conclusions can be extended to any combination: a case with steps of concentration (315, 330, and 360 ppb) with Gaussian noise, and a case with a constant concentration (320 ppb) with Flicker noise. In addition, the amplitude of the Gaussian and Flicker noise added to the simulated spectra is varied; we have taken three values, 50, 100 and 200, for the standard deviation of the noise. In addition to Table 1, Fig. 6 displays the retrieved concentrations for both studied cases (steps and stable concentrations, for the case where the standard deviation is equal to 200).

Table 1 Mean values and standard deviations for concentration residuals with and without the SVD transform
Fig. 6 Concentrations and residuals with and without the SVD treatment and for Gaussian (a) and Flicker (b) noises

From Table 1, we conclude that the SVD decomposition does not bias the concentration values, because the means of the residuals obtained with and without the SVD are close whatever the nature and the amplitude of the noise. For instance, for the constant concentration and the Flicker noise, the means of the concentration residuals are 0.900 and 0.898 without and with the SVD treatment for a noise STD. The same conclusion can also be drawn immediately by examining the concentrations plotted in Fig. 6.

The SVD strongly improves the data precision, because the standard deviations of the concentration residuals are much smaller when an SVD treatment and truncation are applied to the spectra. As can be seen in Table 1, the improvement in the standard deviations of the concentration residuals lies between 30 and 60 % whatever the concentration case or the noise amplitude. The improvement seems most notable in the case of Flicker noise. This is not surprising because the Flicker noise distorts the background signal, which impacts the reconstruction of the baseline and in turn strongly impacts the error in the retrieved concentration [17]. The SVD filtering method seems powerful in restoring a correct background signal in the case of Flicker noise, which is also demonstrated in Fig. 5b, d.

To conclude, what we learn from the simulation part of this work is that the SVD decomposition, in conjunction with a truncation that keeps only the three or four most significant singular values, can strongly improve the concentration precision (by 30 to 60 % according to the nature and the amplitude of the noise) without biasing the concentration value.
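The bias and precision test can be imitated on a toy problem. The sketch below is a deliberately simplified stand-in and should not be read as the procedure behind Table 1: it assumes a Lorentzian line, a linear (weak absorption) model, white Gaussian noise only, and a linear least-squares fit of the line depth together with a cubic polynomial baseline instead of the non-linear Voigt fit of Ref. [17]. All names and numerical values are hypothetical.

```python
import numpy as np

PPB_PER_DEPTH = 320.0 / 0.17   # assumed linear scaling: 17 % peak depth <-> 320 ppb


def simulate(conc_ppb, x, noise_std, rng):
    """Noisy spectra (one per row) with a flat unit baseline."""
    line = 0.05**2 / (x**2 + 0.05**2)
    clean = 1.0 - np.outer(conc_ppb / PPB_PER_DEPTH, line)
    return clean + rng.normal(0.0, noise_std, clean.shape), line


def truncate_svd(X, k):
    """Keep only the k most significant singular values of X."""
    U, sv, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :k] * sv[:k]) @ Vt[:k]


def retrieve(X, x, line):
    """Fit each spectrum with a line depth plus a cubic baseline; return ppb."""
    design = np.column_stack([-line, np.ones_like(x), x, x**2, x**3])
    coeffs, *_ = np.linalg.lstsq(design, X.T, rcond=None)
    return coeffs[0] * PPB_PER_DEPTH


rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
true_conc = np.full(400, 320.0)                 # case 1: stable concentration
X, line = simulate(true_conc, x, noise_std=0.002, rng=rng)

res_raw = retrieve(X, x, line) - true_conc                      # without SVD
res_svd = retrieve(truncate_svd(X, k=3), x, line) - true_conc   # with SVD, k = 3

print(f"mean residual: {res_raw.mean():+.3f} (raw)  {res_svd.mean():+.3f} (SVD)")
print(f"std  residual: {res_raw.std():.3f} (raw)  {res_svd.std():.3f} (SVD)")
```

In this toy setting the truncation appears to help because it removes noise that the per-spectrum baseline terms would otherwise absorb. The size of the gain depends on the model and noise assumptions and is not meant to reproduce the 30 to 60 % reported in the text.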

In the next section, we apply these treatments to real measurements from the laboratory or from a field campaign.

5 Experimental results

The SVD treatment as defined previously is now applied to real experimental data. Calibrated N2O samples in the laboratory as well as N2O flux measurements achieved in the field are used to investigate the improvement in the concentration data precision offered by the SVD method.

For the measurement of N2O, a quantum cascade laser emitting around 2,235.49 cm−1 is used [8]. The instrument and gas temperatures are thermally controlled to be maintained around 25 °C. The laser beam is passed through an optical multipass cell developed by Aerodyne Inc. (absorption path length 76 m; volume 0.5 L). The gas pressure is regulated at 40 mbar with a stability of about 0.1 mbar. The gas flow is roughly 5 L/min. At atmospheric pressure, 6 s are needed to renew the gas sample inside the cell; at 40 mbar, it takes only 300 ms. One elementary N2O spectrum is obtained in 25 ms. Then ten successive elementary spectra are co-added to give one N2O measurement.
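The timing figures quoted above can be checked with simple arithmetic. The sketch below assumes that the flow of roughly 5 L/min refers to gas at atmospheric pressure, so that the renewal time scales with the cell pressure; this interpretation, the constant names and the value used for atmospheric pressure are assumptions.

```python
CELL_VOLUME_L = 0.5          # multipass cell volume
FLOW_L_PER_MIN = 5.0         # gas flow, assumed to be quoted at atmospheric pressure
ATMOSPHERE_MBAR = 1013.25    # assumed standard atmospheric pressure
CELL_PRESSURE_MBAR = 40.0    # regulated cell pressure

# Time to flush one cell volume at atmospheric pressure, in seconds.
renewal_atm_s = CELL_VOLUME_L / FLOW_L_PER_MIN * 60.0

# At reduced pressure the cell holds proportionally less gas.
renewal_cell_s = renewal_atm_s * CELL_PRESSURE_MBAR / ATMOSPHERE_MBAR

# Ten elementary spectra of 25 ms each are co-added per measurement.
measurement_ms = 10 * 25

print(f"{renewal_atm_s:.1f} s, {renewal_cell_s:.2f} s, {measurement_ms} ms")
```

Under these assumptions the renewal time at 40 mbar comes out at about 0.24 s, which is of the same order as the 300 ms quoted in the text for a flow of roughly 5 L/min.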

Figure 7a shows an experimental N2O spectrum and the simulated spectrum obtained from the non-linear least-squares fit; the atmospheric transmission shown is obtained by taking the ratio between the spectrum and the reconstructed baseline. To show the noise, we have plotted in Fig. 7b the difference between the experimental and the simulated spectra. The experimental noise observed with our laser set-up is predominantly a combination of Gaussian and Flicker noise. Perturbing Fabry–Perot type fringes have been limited by taking great care in the optical set-up and are hidden by the electronic noise. With this level of absorption depth (~17 %), we are mostly facing electronic noise, which we try to reduce with the SVD approach.

Fig. 7 a An experimental N2O spectrum and its fitted line, b the residuals, i.e., the difference between the spectrum and its fitted line

In this section, we consider three datasets of N2O spectra recorded in conditions similar to the cases treated previously with the simulation approach. They differ in their concentrations and measurement protocol.

The first dataset of measurements was recorded with a calibrated bottle featuring an N2O concentration of 303.52 ± 0.11 ppb (purchased from the National Oceanic and Atmospheric Administration, NOAA).

The second set of measurements was obtained by means of three calibrated bottles with concentrations of 318.00 ± 0.14, 333.00 ± 0.13 and 363.06 ± 0.13 ppb, respectively (purchased from NOAA). The gas was passed through the optical cell in cycles using an automatic control system (solenoid) that successively opens and closes the bottles, as shown in Fig. 8.

Fig. 8 Automatic control system for successive measurements with different calibrated N2O bottles

The last set of N2O measurements was obtained during a field campaign dedicated to N2O flux measurements. It took place in October 2011 in the territory of the Wallis and Futuna islands (14°S) with the field QCL spectrometer described in [8]. The technique used to determine the flux emitted by the soil over farm parcels is rather standard and is explained in Fig. 9. The gas emitted by the soil is collected in an enclosure and driven from the enclosure to the spectrometer optical cell using a pumping system; the air sample is then returned to the enclosure. The emitted gas is trapped in a closed circuit so that any gas emission leads to an increase of the measured concentration, as shown in Fig. 9. The gas flux is obtained by linearly fitting the increase of the concentration with time; it is the slope of the diagram in Fig. 9.

Fig. 9 Principle of flux measurement
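The slope extraction described above amounts to a first-order polynomial fit of concentration against time. The sketch below uses synthetic data with hypothetical values (a rise of 0.03 ppb/s and a scatter of 0.5 ppb); the function name is an assumption, and the conversion of the slope into a surface flux, which requires the enclosure volume and the covered area, is not attempted since those values are not given here.

```python
import numpy as np


def concentration_slope(time_s, conc_ppb):
    """Linear fit of concentration versus time; returns slope, intercept, residuals."""
    slope, intercept = np.polyfit(time_s, conc_ppb, 1)
    residuals = conc_ppb - (slope * time_s + intercept)
    return slope, intercept, residuals


# Synthetic closed-chamber record: linear increase plus measurement scatter.
rng = np.random.default_rng(0)
time_s = np.arange(0.0, 601.0, 1.0)
conc_ppb = 320.0 + 0.03 * time_s + rng.normal(0.0, 0.5, time_s.size)

slope, intercept, residuals = concentration_slope(time_s, conc_ppb)
print(f"slope = {slope:.4f} ppb/s, residual std = {residuals.std():.3f} ppb")
```

Because the fit includes an intercept, the residuals of such a regression average to zero by construction, so only their standard deviation is informative about the precision.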

The first two datasets are used to confirm that the SVD does not bias absolute concentration values and does strongly improve the precision. Indeed, for these two datasets, the real concentration value is known accurately (NOAA calibrated gas samples). The flux measurement data can be considered as a set of N2O concentrations that increase linearly. These three datasets make it possible to consider cases that are very similar to those of the section devoted to the approach by simulation: we can study sets of absorption spectra corresponding to stable concentrations, concentration steps and concentrations that increase linearly. The SVD decomposition and the truncation applied to the X matrix to construct the \( X^{\text{signal}} \) matrix were described in the previous section: the three most significant singular values are kept for the first dataset and the four most significant ones for the second and third datasets.

Figure 10a shows the percentage of signal inertia for all singular values of the third dataset. The inertia quantifies the signal energy carried by the first N axes (directions) and is defined by

$$ {\text{Inertia}}(N) = \frac{{\sum\nolimits_{i = 1}^{N} {\lambda_{i} } }}{{\sum\nolimits_{i = 1}^{r} {\lambda_{i} } }} $$
(6)

where \( \lambda_{i}\) are the singular values, r is the rank of the matrix, and N is the number of selected singular values. The four most significant singular values carry more than 98.6 % of the signal energy.

Fig. 10 Signal inertia (a) and the first five normalized wavelets (b–f) for the third dataset (increasing concentrations)
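Equation (6) translates directly into code. The helper below is a minimal sketch with a hypothetical name; it sums over all singular values returned by the decomposition, which is equivalent to summing up to the rank r because the singular values beyond the rank are zero.

```python
import numpy as np


def inertia(singular_values, N):
    """Fraction of the summed singular values carried by the first N, as in Eq. (6)."""
    sv = np.asarray(singular_values, dtype=float)
    return sv[:N].sum() / sv.sum()


# Toy singular values, for illustration only.
example = [6.0, 3.0, 1.0]
for N in (1, 2, 3):
    print(N, inertia(example, N))
```

With the toy values above, the first axis carries 60 % of the inertia and the first two carry 90 %.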

According to Fig. 10b, c, d, e, and f, which represent the first five normalized wavelets, the choice of four significant singular values is appropriate: indeed, the first three wavelets are mainly characteristic of the peak of the line and of the background signal, while the fourth carries a little information on the signal as well as noise. The fifth wavelet features only noise.

Figures 11, 12 and 13 display the concentrations obtained with and without SVD for the three datasets by fitting the spectra in the X and \( X^{\text{signal}} \) matrices. The mean value as well as the standard deviation has been calculated for the residuals. For the first and second datasets, the residuals are the difference between the retrieved concentrations and the values from the calibrated bottles. For the third dataset, we have performed a linear regression over the set of retrieved concentrations and defined the residuals as the difference between the retrieved concentrations and the fitted concentrations. The results are summarized in Table 2.

Fig. 11 Concentrations and residuals with and without the SVD treatment for the first dataset (stable concentrations): a concentrations, b residuals

Fig. 12 Concentrations and residuals with and without the SVD treatment for the second dataset (steps of concentrations): a concentrations, b residuals

Fig. 13 Concentrations and residuals with and without the SVD treatment for the third dataset (increasing concentrations): a concentrations, b residuals

Table 2 Mean values and standard deviations for concentration residuals with and without the SVD treatment for the three datasets

Figures 11, 12 and 13 show clearly that the SVD significantly reduces the dispersion of the concentration data; the residuals are much smaller after the SVD transform. In Table 2, the improvement in the precision of the concentration measurements lies between 40 and 55 %, which is in good agreement with what was predicted with the approach by simulation.

We further see that the SVD does not bias the concentration values. The retrieved concentration values are in good agreement with the calibrated values and the residuals are centered around zero. In Table 2, the means of the residuals without and with SVD are close to zero for all the datasets. Indeed, they are equal to “0.097 and 0.03”, “0.088 and 0.032”, and “0.000 and 0.000” for the first, second and third datasets, respectively. The zero value observed for the third dataset is due to the fact that the residuals are the difference between the experimental concentrations and the ones fitted with a linear regression.

The examination of the sets of experimental data confirms our conclusions drawn from the approach by simulation. The choice of the cut-off level in the singular values seems adequate. The application of the SVD decomposition offers a dramatic improvement in the concentration data precision and does not bias the concentration value.

6 Conclusion

In conclusion, the singular value decomposition treatment proposed in this paper is very efficient at improving the signal-to-noise ratio in the spectra and the precision of the concentration data (by up to 50 %), and it does not bias the concentration value. Both the simulated data and the experimental ones confirm these conclusions. This data treatment is easy to implement for the processing of large sets of absorption spectra. The SVD decomposition and truncation are applied to the matrix constructed with the experimental spectra; the processed spectra are then fitted in a standard manner to give the concentration. The matrix formalism makes it possible to handle large sets of data. The method was presented using N2O measurements, but it can be extended to the monitoring of any gas by laser spectrometry. We intend to implement this formalism for Relaxed Eddy Accumulation (REA) and eddy correlation measurements, for which large sets of data are to be manipulated and for which the improvement of the precision is very important. Indeed, the improvement of the precision in concentration retrieval helps to better estimate the N2O fluxes and thereby to better characterize these fluxes in relation to the soil and climatic conditions.