1 Introduction

CO2 is the primary greenhouse gas in the atmosphere and accounts for about 60% of the greenhouse effect. With rapid industrial development, air pollution has become an increasingly prominent problem [1, 2]. Rising carbon emissions not only exacerbate the greenhouse effect but also seriously affect daily human life. Accurate measurement of the atmospheric CO2 concentration is therefore important for improving the scientific understanding of CO2 impacts and for guiding the control of carbon emissions.

The methods currently used to determine CO2 concentrations can be divided into chemical methods and spectral measurement methods. Spectral measurement methods are widely used in environmental detection because of their wide detection range, high sensitivity, and capability for real-time online analysis. Absorption spectroscopy in particular is sensitive, covers a wide detection range, and is practical, so it has good application prospects in gas detection. Commonly used infrared spectroscopy techniques include tunable diode laser absorption spectroscopy (TDLAS) [3], Fourier transform infrared spectroscopy (FTIR) [4], and photoacoustic spectroscopy (PAS) [5]. Compared with these three techniques, super-continuum laser detection is a newer absorption spectroscopy technique that can measure not only a single gas component but also multiple gas components in the atmosphere simultaneously [6]. The super-continuum laser has become an ideal light source for gas detection because of its easy collimation, high brightness, wide spectrum, and high stability [7,8,9]. Jihyung et al. [10] measured low concentrations of hydrocarbon gases such as acetylene and ethane with super-continuum laser detection. Their experimental results were consistent with the HITRAN database, which verified that the super-continuum laser can detect broadband absorption spectra.

Wen et al. [11] proposed a multi-model fusion variable selection method for near-infrared spectroscopy data, which improved the prediction ability of the model, and verified its feasibility experimentally. Pan et al. [12] used spectroscopy to detect the total nitrogen concentration in water by fusing single models built on different bands; the resulting fusion model effectively improved the measurement accuracy of the system. To improve the reliability and stability of a mine environmental monitoring system, Wang et al. [13] used weighted data fusion to process the data of multiple sensors and decrease the measurement error of gas concentration. Tobias et al. [14] measured N2O and CO2 in cultivated land, forests, and grassland with a new data fusion model, which helped to control gas emissions. To perform high-level fusion for quantitative analysis, Li et al. [15] proposed two MID methods, RMSEPW and RPDW.

In summary, fusion models improve the measurement accuracy of detection systems, but studies that combine super-continuum laser absorption spectroscopy with a fusion method for gas concentration detection have rarely been reported. To reduce the system measurement error and improve the accuracy of CO2 prediction, this paper proposes a multi-band weighted fusion model for CO2 concentration measurement. Its prediction performance is better than that of the single-band prediction models, it overcomes the shortcomings of single-model prediction, and it is easy to implement and well suited to CO2 measurement. Taking CO2 as an example, this study sets up a super-continuum laser absorption spectrum detection system and measures the absorption spectra of CO2 in different bands. A method for inverting the gas concentration from the integrated area of the absorption peaks is proposed, and measurement models of CO2 concentration are established for the different bands. The fusion model based on RMSE effectively improves on the accuracy of the single models, decreasing the maximum relative error from 3.4% for a single model to 1.2%. This study verifies that the multi-band fusion model is feasible for gas concentration measurement and provides a new method and a new idea for measuring the concentrations of other gases in the future.

2 Theoretical basis

2.1 Multi-band fusion model

To improve on the predictive ability of a single model, weights are assigned to the three measurement models to obtain a multi-band fusion model, thereby improving the system measurement accuracy, which is of great significance for practical environmental monitoring. The fusion model obtained by weighted fusion of the single models is denoted ZG; it can effectively reduce the interference of water vapor and other gases in the air with the experimental results. The construction of the fusion model is shown in Fig. 1. The model can be described as:

$$Z_{G} = \sum_{i = 1}^{n} Y_{i} W_{i} \quad \left( i = 1, 2, \ldots, n \right).$$
(1)

Fig. 1 The construction of the multi-band weighted fusion model. The new fusion models ZA and ZB are obtained, with the weights determined by R2 and RMSE, respectively

In Eq. (1), ZG (G = A, B, …) is the fusion model, G is the label of the fusion model, n is the number of single models, and Yi is the i-th single model. The weight Wi ∈ [0,1] (i = 1, 2, …, n) represents the relative importance of a single model in the fusion model. The most significant step in establishing the fusion model is determining the weight coefficient of each single model.

In this paper, the weights of the three models are determined from the fitting coefficient R2 and the root mean square error (RMSE) of each single model. The fusion model is used to reduce the prediction error and improve the accuracy of the concentration inversion.
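The weighted sum in Eq. (1) can be illustrated with a minimal Python sketch. The function name `fuse_predictions`, the sample predictions, and the weights below are hypothetical and are not taken from the experiment; they only show how three single-model outputs are combined once the weights are known.

```python
def fuse_predictions(predictions, weights):
    """Weighted fusion Z_G = sum_i Y_i * W_i, following Eq. (1)."""
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per single-model prediction")
    if any(w < 0.0 or w > 1.0 for w in weights):
        raise ValueError("each weight must lie in [0, 1]")
    return sum(y * w for y, w in zip(predictions, weights))


# Hypothetical CO2 concentrations (%) predicted by Y1, Y2 and Y3 for one sample.
single_model_predictions = [6.1, 5.9, 6.0]
# Hypothetical weights; in the paper they are derived from R2 or RMSE.
example_weights = [0.3, 0.3, 0.4]

fused_concentration = fuse_predictions(single_model_predictions, example_weights)
print(f"fused CO2 concentration: {fused_concentration:.3f} %")
```

Because the weights are non-negative and, for both schemes in Sect. 2.2, sum to 1, the fused value always lies between the smallest and largest single-model prediction.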

2.2 Determination of the weighting indicators

2.2.1 Fusion model ZA based on R2

R2 describes how well the linear model fits the observed values; the closer R2 is to 1, the better the fit. The weight coefficients of the fusion model ZA are determined by the fitting coefficients R2 of the three models: the weight of each single model is the proportion of its R2 in the sum of the R2 values of the three models. The sum e1 of the fitting coefficients of the single models is obtained from the experimental results. The quantities e1 and Wi are given by:

$$e_{1} = \sum_{i = 1}^{n} R_{i}^{2} \quad \left( i = 1, 2, \ldots, n \right),$$
(2)
$$W_{i} = \frac{R_{i}^{2}}{e_{1}} \quad \left( i = 1, 2, \ldots, n \right).$$
(3)
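As a sketch of Eqs. (2) and (3), the following Python snippet normalizes a list of R2 values into weights. The helper name `r2_weights` and the R2 values are illustrative assumptions, not the fitted values from Table 1.

```python
def r2_weights(r2_values):
    """Weights W_i = R_i^2 / e_1 with e_1 = sum of all R_i^2 (Eqs. 2 and 3)."""
    e1 = sum(r2_values)
    if e1 <= 0.0:
        raise ValueError("the sum of the R2 values must be positive")
    return [r2 / e1 for r2 in r2_values]


# Hypothetical fitting coefficients of the single models Y1, Y2 and Y3.
example_r2 = [0.99, 0.98, 0.97]
weights_a = r2_weights(example_r2)
print([round(w, 4) for w in weights_a])
```

When all single models fit well, their R2 values are all close to 1, so this scheme yields nearly equal weights.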

2.2.2 Fusion model ZB based on RMSE

The root mean square error (RMSE) measures the deviation between the observed value and the true value and is often used to evaluate the prediction results of a model. The weight coefficients of the fusion model ZB are derived from the RMSE of the three single models [16]. Here e2 is the sum of the RMSE values of the single models, which is also obtained from the experimental data, and M is the number of single models (M = n). The quantities e2 and Wi are calculated as:

$$e_{2} = \sum_{i = 1}^{n} \text{RMSE}_{i} \quad \left( i = 1, 2, \ldots, n \right),$$
(4)
$$W_{i} = \frac{e_{2} - \text{RMSE}_{i}}{e_{2}} \cdot \frac{1}{M - 1} \quad \left( i = 1, 2, \ldots, n \right).$$
(5)
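The RMSE-based weighting of Eqs. (4) and (5) can be sketched in the same way. The helper name `rmse_weights` and the RMSE values are hypothetical; the snippet assumes the error measure summed in Eq. (4) is the RMSE of each single model, as stated in the text.

```python
def rmse_weights(rmse_values):
    """Weights W_i = (e_2 - RMSE_i) / e_2 * 1 / (M - 1), with e_2 = sum of RMSE_i."""
    m = len(rmse_values)
    if m < 2:
        raise ValueError("at least two single models are required")
    e2 = sum(rmse_values)
    if e2 <= 0.0:
        raise ValueError("the sum of the RMSE values must be positive")
    return [(e2 - r) / e2 / (m - 1) for r in rmse_values]


# Hypothetical RMSE values of the single models Y1, Y2 and Y3.
example_rmse = [0.1, 0.2, 0.3]
weights_b = rmse_weights(example_rmse)
print([round(w, 4) for w in weights_b])
```

The weights sum to 1 because the numerators (e2 - RMSE_i) add up to (M - 1) e2, and a model with a smaller RMSE receives a larger weight.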

2.3 Selection and analysis of absorption lines

The selected super-continuum laser has a spectral range of 400–2400 nm, and the wavelength range filtered by a Laser Line Tunable Filter (LLTF) is 1000–1700 nm. To improve the sensitivity and signal-to-noise ratio of the spectrometer and to enhance the CO2 absorption, a wavelength band with strong absorption was selected for the experiment. The spectral interference generated by the absorption lines of other atmospheric gases must also be evaluated to ensure the measurement precision. To further avoid the peaks and troughs of the background spectrum, the wavelength range 1280–1700 nm was selected for detection. According to the HITRAN2016 gas spectrum database, the absorption spectrum of CO2 within 1280–1700 nm (5882–7813 cm−1) is shown in Fig. 2.

Fig. 2 Absorption spectrum of CO2 within the range 1280–1700 nm at 1 atm, 295 K. Data obtained from HITRAN2016 database
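Spectral databases such as HITRAN index lines by wavenumber, so the wavelength limits have to be converted before a query. The short sketch below reproduces the conversion of the 1280–1700 nm range quoted above; the function name `nm_to_wavenumber` is a hypothetical helper.

```python
def nm_to_wavenumber(wavelength_nm):
    """Convert a wavelength in nm to a wavenumber in cm^-1 (1 cm = 1e7 nm)."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return 1.0e7 / wavelength_nm


# Limits of the detection range used in this work.
short_edge_nm, long_edge_nm = 1280.0, 1700.0

# The shorter wavelength corresponds to the larger wavenumber.
print(f"{nm_to_wavenumber(long_edge_nm):.1f} to {nm_to_wavenumber(short_edge_nm):.1f} cm^-1")
```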

Six CO2 absorption clusters can be seen between 1425 and 1615 nm. The multi-band measurements of CO2 were therefore performed in three wavelength bands: 1425–1445 nm, 1565–1585 nm, and 1595–1615 nm. The characteristic absorption spectra of gas molecules are determined by their molecular structure. To avoid interference from other gases with the CO2 absorption lines, the absorption line distributions of H2O and the other gases in the air were compared with that of CO2 over 1280–1700 nm at 1 atm, based on the HITRAN2016 database. The absorption of HF, O2 and O3 in this range is relatively weak, and their influence on the CO2 measurement is negligible. SF6, SO2, C2H4, C2H6, HCHO, CH3OH, NO2 and other gases have no absorption spectrum in this range. The possible interference from H2O, CO, H2S, C2H2, NH3, CH4, and other gases with the CO2 absorption lines in this range is shown in Fig. 3. The absorbance of CO2 and of H2O, CO, C2H2, CH4, H2S, and NH3 at different concentrations, obtained from the database for the bands 1425–1445 nm, 1565–1585 nm, and 1595–1615 nm, is shown in Fig. 4.

Fig. 3 The comparison of absorption spectra of H2O, CO, H2S, C2H2, NH3 and CH4 with CO2 in 1280–1700 nm at 1 atm and 296 K based on the HITRAN2016 database

Fig. 4 The comparison of absorbance of H2O, CO, C2H2, CH4, H2S and NH3 with CO2 in 1425–1445 nm, 1565–1585 nm and 1595–1615 nm at 1 atm, 296 K and 26.4 m optical path based on the HITRAN2016 database

3 Experiments

3.1 Experimental system

Our experimental super-continuum laser absorption spectroscopy (SCLAS) system consists of a super-continuum laser, a Laser Line Tunable Filter (LLTF), a diaphragm, a dynamic dilution calibrator, a White cell, a photo-detector, a data acquisition card (DAQ), and a personal computer (PC). The experimental setup is shown in Fig. 5. The super-continuum laser (type SC400-4, Fianium, UK) is a picosecond pulsed source covering 400–2400 nm with a maximum output power of 8 W. Its output is wavelength filtered by the LLTF under program control from the PC, so that the output wavelength is scanned over the ranges 1425–1445 nm, 1565–1585 nm, and 1595–1615 nm. After the diaphragm filters out the stray light around the beam, the laser enters the White cell, where it is reflected multiple times before reaching the photo-detector. The photo-detector converts the optical signal into an electrical signal, which is transmitted to the DAQ and the PC.

Fig. 5 Schematic of the experimental setup. SCL Super-continuum laser, PCF Photonic crystal fiber, LLTF Laser Line Tunable Filter, DDC Dynamic dilution calibrator, PD Photo-detector, DAQ Data acquisition card

The LLTF serves as the wavelength-selecting accessory of the light source in this system. Its spectral bandwidth is 5 nm, and its wavelength tuning resolution is 0.1 nm. Its output spectral range is 1000–2300 nm, and it has a high damage threshold and a long lifetime.

The White cell (model 35-V-H, Infrared Analysis, USA), with a multipass path length of 2.2–35 m and a volume of 8.5 L, was used to improve the sensitivity of the spectrometer. The incident laser is reflected multiple times in the cell to increase the optical path length, so that the gas absorbs the beam repeatedly. The dynamic dilution calibrator uses two high-precision mass flow meters to control the gas flow rates and thus obtain the desired concentration of the test gas. The flow measurement accuracy is ± 1.0%, the flow control repeatability is ± 0.2%, and the flow measurement linearity is ± 0.5%.

The diaphragm is a GCM-5711 M variable square aperture diaphragm, which filters out the surrounding stray light and can adjust the intensity of the passing beam. The photo-detector (PDA50B, THORLABS, USA), with a spectral range of 800–1800 nm, a gain of 0–70 dB, and a response time of 50 ns, detects the characteristic light signal absorbed by the gas with low noise.

The experiment was carried out at room temperature (295 K) and a gas pressure of 1 atm. The light path was aligned with visible light before the experiment to ensure that the laser entered on the left side of the White cell and exited on the right side. The laser was reflected back and forth in the cell 21 times, giving an optical path of 26.4 m. During the experiment, high-purity nitrogen (N2, 99.999%) was first introduced into the absorption cell, and the measured signal was taken as the background signal. Afterwards, CO2 at concentrations of 5.0%, 5.5%, 6.5%, 7.0%, 7.5% and 8.0% was injected into the absorption cell for testing. To ensure the accuracy of the experimental results, the absorption cell was purged with N2 for ten minutes at 1 atm before each measurement.

4 Results and analysis

4.1 Measurement results

The absorption spectra of CO2 at different concentrations in the bands 1425–1445 nm, 1565–1585 nm, and 1595–1615 nm are shown in Fig. 6(a)–(c). Background subtraction is used to eliminate low-frequency background noise such as baseline drift. The data are smoothed by Savitzky–Golay (S–G) filtering, and the sampling points and wavelengths are then converted from the time domain to the frequency domain. The experimental results show a main and a secondary absorption peak in each of the three tested bands, at 1432 nm/1437 nm, 1572 nm/1579 nm, and 1603 nm/1609 nm, respectively, consistent with the database. The absorption signal intensity correlates positively with the gas concentration: as the gas concentration increases, the absorption signal increases.

Fig. 6 The absorption spectrum of CO2 with different concentrations at (a) 1425–1445 nm, (b) 1565–1585 nm, and (c) 1595–1615 nm at 295 K and 1 atm
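The pre-processing described above (referencing to the N2 background, then S–G smoothing) might look roughly like the following sketch. The paper does not state the exact absorbance formula, window length, or polynomial order, so the natural-log intensity ratio and the 5-point quadratic window used here are assumptions, as are the function names and the detector readings.

```python
import math

# 5-point quadratic/cubic Savitzky-Golay smoothing coefficients (normalized by 35).
SG5_COEFFS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG5_NORM = 35.0


def absorbance(background, sample):
    """Absorbance A = -ln(I / I0) from background (I0) and sample (I) intensities."""
    return [-math.log(i / i0) for i0, i in zip(background, sample)]


def savgol5(values):
    """5-point S-G smoothing; the two points at each edge are left unchanged."""
    smoothed = list(values)
    for k in range(2, len(values) - 2):
        window = values[k - 2:k + 3]
        smoothed[k] = sum(c * v for c, v in zip(SG5_COEFFS, window)) / SG5_NORM
    return smoothed


# Hypothetical detector readings (arbitrary units) across one wavelength scan.
background_signal = [1.00, 1.01, 1.02, 1.01, 1.00, 0.99, 1.00]  # cell filled with N2
sample_signal = [0.98, 0.97, 0.90, 0.80, 0.91, 0.96, 0.98]      # cell filled with CO2

smoothed_absorbance = savgol5(absorbance(background_signal, sample_signal))
print([round(a, 4) for a in smoothed_absorbance])
```

A short window keeps narrow absorption peaks from being flattened; a longer window suppresses more noise at the cost of peak height, so the choice should be checked against the measured line widths.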

4.2 Modeling and evaluation

The intensity of a gas absorption peak may deviate from the real value because of drift of the line center, the spectral line shape, and spectral line broadening. This paper therefore proposes deriving the gas concentration from the integrated area of the CO2 absorption peak. The method effectively eliminates the influence of single-wavelength light intensity fluctuations and spectral line wavelength shifts, and reduces interference from other factors such as noise. Three linear models Y1, Y2 and Y3 are established for the absorption bands 1430–1435 nm, 1570–1575 nm, and 1600–1605 nm, respectively; they relate the integrated area of the main absorption peak to the CO2 concentration. The models are depicted in Fig. 7(a)–(c).

Fig. 7 The linear fitting results of the integrated area and concentration at the main absorption peaks of (a) 1430–1435 nm, (b) 1570–1575 nm, and (c) 1600–1605 nm at 295 K and 1 atm
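The integrated-area approach can be sketched as follows: integrate the absorbance over the main peak, fit a straight line of area against concentration, and invert that line for an unknown sample. The trapezoidal rule, the fit direction (area as a function of concentration), all function names, and the synthetic areas are assumptions made for illustration; only the concentration values come from the experiment.

```python
def integrated_area(wavelengths_nm, absorbance, lo_nm, hi_nm):
    """Trapezoidal area under the absorbance curve between lo_nm and hi_nm."""
    points = [(w, a) for w, a in zip(wavelengths_nm, absorbance) if lo_nm <= w <= hi_nm]
    return sum(0.5 * (a0 + a1) * (w1 - w0)
               for (w0, a0), (w1, a1) in zip(points, points[1:]))


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_area(area, slope, intercept):
    """Invert the calibration line area = slope * concentration + intercept."""
    return (area - intercept) / slope


# CO2 concentrations (%) used in the experiment.
concentrations = [5.0, 5.5, 6.5, 7.0, 7.5, 8.0]
# Synthetic peak areas on an exact line, for illustration only.
peak_areas = [0.2 * c + 0.1 for c in concentrations]

slope, intercept = fit_line(concentrations, peak_areas)
estimated = concentration_from_area(1.4, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, estimated={estimated:.2f} %")
```

Integrating over the whole peak makes the result less sensitive to a small shift of the line center than reading the absorbance at a single wavelength.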

The prediction accuracy of the three models is evaluated by the determination coefficient R2 between the actual and predicted values, the root mean square error (RMSE), and the relative analysis error (RPD). A larger R2 and a smaller RMSE indicate a more accurate model. RPD [17, 18] is defined as RPD = SD/RMSE, where SD is the sample standard deviation. When RPD ≥ 2.0, the model is suitable for prediction. When 1.4 < RPD < 2.0, the model is considered moderately reliable, and its prediction accuracy can be improved by other modelling methods. When RPD ≤ 1.4, the model is considered unreliable. The estimation accuracy of the three models is shown in Table 1.

Table 1 Accuracy assessment of model Y1, Y2 and Y3
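The three evaluation indicators can be computed as in the sketch below. The function names and the predicted values are hypothetical, and dividing by the number of samples in the RMSE is an assumption, since the paper does not state the normalization it uses.

```python
import math
import statistics


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Determination coefficient R2 = 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rpd(actual, predicted):
    """RPD = SD / RMSE, with SD the sample standard deviation of the actual values."""
    return statistics.stdev(actual) / rmse(actual, predicted)


def classify_rpd(value):
    """Apply the RPD thresholds quoted in the text."""
    if value >= 2.0:
        return "suitable for prediction"
    if value > 1.4:
        return "moderately reliable"
    return "unreliable"


actual = [5.0, 5.5, 6.5, 7.0, 7.5, 8.0]            # concentrations used in the experiment (%)
predicted = [5.05, 5.45, 6.55, 6.95, 7.55, 7.95]   # hypothetical model output (%)

print(r_squared(actual, predicted), rmse(actual, predicted), classify_rpd(rpd(actual, predicted)))
```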

The experimental results show that there is a good positive correlation between the integrated area of the absorption peak and the gas concentration: the larger the CO2 concentration, the larger the integrated area of the peak. The RPD values of the three concentration measurement models are all greater than 2, indicating that the models have high stability and strong prediction ability in deriving the CO2 concentration. However, strong absorption lines of other gases in the air interfere with the CO2 absorption lines and thus affect the prediction results of the models. Therefore, to obtain the best prediction results, the three models have to be chosen according to the situation.

Model Y1 has a higher fitting coefficient and the highest stability among the models. However, its measurement band is strongly affected by H2O, NH3 and CH4, so it is suited to measuring CO2 when these three gases are absent. The band of model Y2 is subject to interference from the absorption lines of CO, C2H2, NH3, H2S, etc., so it is suited to measuring CO2 when these gases are absent. Model Y3 is strongly affected only by CH4 and H2S and, compared with the first two models, suffers interference from fewer gas species. When the effect of CH4 and H2S in the air is small, model Y3 can be used to detect CO2.

The weights were calculated from the R2 and RMSE of the single models Y1, Y2 and Y3, and the results are shown in Fig. 8.

Fig. 8 The pie chart of the weight of a single model. (a) Weighting indicator based on R2; (b) weighting indicator based on RMSE

The maximum relative errors of the two fusion models obtained through data processing are shown in Table 2. The analysis shows that the least-squares method with weighting effectively improves the inversion ability of the model and that the multi-band fusion model greatly reduces the measurement error, thereby improving the accuracy of the concentration measurement. Compared with the single models, the fusion models based on R2 and on RMSE both reduce the measurement error, with the fusion model ZB giving the larger reduction. In summary, the fusion model ZB overcomes the low prediction accuracy of a single model and can therefore be selected to measure the CO2 concentration.

Table 2 Relative error analysis of concentration inversion for each model
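The maximum relative error used to compare the models can be computed as in this small sketch. The function names and the fused predictions are hypothetical and do not reproduce the values in Table 2.

```python
def relative_errors_percent(actual, predicted):
    """Relative error |predicted - actual| / actual, in percent, for each sample."""
    return [abs(p - a) / a * 100.0 for a, p in zip(actual, predicted)]


def max_relative_error_percent(actual, predicted):
    """Largest relative error over all samples, in percent."""
    return max(relative_errors_percent(actual, predicted))


# Hypothetical true and fused-model CO2 concentrations (%).
true_concentrations = [5.0, 6.5, 8.0]
fused_predictions = [5.05, 6.45, 8.08]

worst_case = max_relative_error_percent(true_concentrations, fused_predictions)
print(f"maximum relative error: {worst_case:.2f} %")
```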

Compared with recently reported CO2 concentration measurements, the maximum concentration measurement error of 1.2% obtained here represents an improvement in prediction accuracy. Yao et al. [19] used direct absorption spectroscopy (DAS) and wavelength modulation spectroscopy (WMS) to measure the CO2 concentration; the maximum relative errors for DAS and WMS were 2.64% and 1.65%, respectively. Deng [20] used a distributed feedback (DFB) diode laser as the local oscillator to measure the atmospheric CO2 column concentration continuously in the near-infrared region; the average measurement precision, obtained from the standard deviation of the CO2 column concentration, was 1.6%. Yang [21] used the Li-7500 analyzer to calibrate and analyze a miniaturized atmospheric CO2 detection system; the absolute value of the relative error of the inverted carbon dioxide volume ratio was less than 2.0%.

5 Conclusion

In this study, an SCLAS system with a multi-band fusion algorithm was proposed and demonstrated. The absorption spectra of CO2 in different near-infrared bands were measured. Three linear models relating the integrated absorption peak area to the CO2 concentration in different bands were established, and the stability of the models was evaluated. Two new fusion models were obtained by assigning weights to the three linear models based on R2 and RMSE, respectively. The fusion model based on RMSE was found to be better: it reduces the relative error of CO2 concentration inversion to 1.2%, which effectively improves the accuracy of the model. The experimental results show that it is feasible to measure the CO2 concentration with a multi-band fusion model, which provides a new idea and a new method for the detection of other gases and multi-component gases.