1 Introduction

Tropospheric ozone is a key constituent for various physical and chemical processes in the atmosphere. It is formed as a secondary photochemical product of the oxidation of carbon monoxide (CO) and hydrocarbons in the presence of nitrogen oxides (NOx = NO + NO2) (Zeng et al. 2008). There are two sources of tropospheric ozone (Stevenson et al. 2006): transport from the stratosphere and in situ chemical production. An extensive review of global tropospheric ozone dynamics is available in Kondratyev and Varotsos (2001). Photochemical ozone production from human-emitted precursors has a significant impact on human health, terrestrial ecosystems and materials degradation. Tropospheric ozone changes may also come about from climate changes such as an increase in stagnation episodes or other altered transport patterns. While air quality concerns are focused near ground level, the climatic and oxidizing impacts of tropospheric ozone are significant through the entire depth of the troposphere (Oltmans et al. 2006). Climate change influences tropospheric ozone and aerosols through effects on emissions, transport and atmospheric chemistry. The potential impacts of climate change on the transport of ozone and aerosols have been established by general circulation model (GCM) studies (Bell 2005). As emissions of hydrocarbons, CO and nitrogen oxides probably take place simultaneously, they are of special importance for tropospheric ozone production (Fishman et al. 1979). The complicated relationship between climate and tropospheric ozone has been discussed in the works of Fishman et al. (1979), Cartalis and Varotsos (1994), Portmann et al. (1997), Shindell et al. (2005, 2006), Liao et al. (2006) and Fiore et al. (2008).
The physical and chemical processes mentioned above include radiative forcing as ozone is a greenhouse gas and an infrared absorber (Oltmans et al. 2006). The principal oxidants in the lower atmosphere are ozone and two by-products of ozone photodissociation, the hydroxyl radical and hydrogen peroxide (Thompson 1992). A number of critical atmospheric chemical problems depend on the oxidizing capacity of the atmosphere. Tropospheric ozone is important in determining the oxidizing capacity of the atmosphere, both through its direct role and through its role as a precursor of the hydroxyl radical OH (Edwards et al. 2003).

Tropospheric ozone is a valuable absorber of solar ultraviolet (UV) radiation (Varotsos et al. 1995). During postmonsoon, the tropospheric column ozone is larger in the Northern Hemisphere (NH) than in the Southern Hemisphere (SH). This may be an important contributor to the larger surface UV amounts recorded in the SH (McKenzie et al. 2003; Alexandris et al. 1999; Katsambas et al. 1997). Concentrations of ozone are larger at midlatitudes in the NH than at the corresponding latitudes in the SH. Fishman and Crutzen (1978) proposed that the observed excess of ozone in the north may reflect significant photochemical production associated with sources of NOx and CO from combustion of fossil fuels. Increases in tropospheric ozone due to photochemical production, mostly from growing industrial and technological NOx emissions in the industrialized NH, can overcompensate for the increased UV-B radiation resulting from ozone depletion by chlorine-catalyzed reactions in the stratosphere (Brühl and Crutzen 1989; Varotsos et al. 1994).

The first assessment of tropospheric column ozone derived from satellite measurements was based on the residual method of Fishman et al. (1990), which subtracted SAGE stratospheric column ozone from TOMS total column ozone to determine tropospheric ozone. The largest ozone amounts occur in the winter and premonsoon months in both hemispheres, with greater abundance in the NH. The most significant amounts occur in mid- to high latitudes. In the SH, in the winter and spring seasons, the horizontal gradients and mean amounts of ozone are weaker than in the winter and premonsoon seasons in the NH.

The tropospheric ozone content, which has increased since 1900, contributes to the enhancement of the greenhouse effect, as tropospheric ozone behaves as a greenhouse gas (Kondratyev and Varotsos 1995a; 1995b; 1996a). In turn, changes to the ozone layer affect climate through radiative processes, especially those that are closely associated with the variability of the solar UV radiation (Kondratyev and Varotsos 1996b; Efstathiou et al. 1998; Bandyopadhyay and Chattopadhyay 2007). Many authors, such as Hingane and Patil (1996), Londhe et al. (2003, 2005) and Sahoo et al. (2005), have discussed ozone variability and its impact on the weather systems over India. Scientists have also stressed the necessity of studying the complexity of the interactive processes associated with geophysical events characterized by chaotic features (e.g., Acharya et al. 2011; Bandyopadhyay and Chattopadhyay 2007; Chattopadhyay 2007).

The present study is focused on the spatial distribution of tropospheric ozone over both the southern and northern hemispheres (50°S–50°N). Instead of taking precursors into consideration, this work is based on a univariate approach. A systematic statistical procedure is adopted to view the spatial behavior of tropospheric ozone. The Mann–Kendall (MK) method of trend analysis (Kahya and Kalayci 2004) has been applied to view its spatial trend on monthly as well as seasonal scales. Subsequently, the autoregressive integrated moving average (ARIMA) model (Sprott 2003) has been examined as a representative model for tropospheric ozone. The detailed methodology and implementation procedure are discussed in the subsequent sections. The remaining part of this paper is organized as follows: in Section 2, the theoretical overviews of the MK trend test, ARIMA and the statistical model assessment procedures are presented. Outcomes of the work are presented in Section 3 and conclusions are presented in Section 4.

2 Methodology

Methodology adopted in the present paper may be subdivided into three parts:

  • Trend analysis by the MK test,

  • Univariate modeling by ARIMA and

  • Model assessment.

The components of the methodology mentioned above are discussed in detail in the following subsections.

2.1 Mann–Kendall (MK) test for trend

The MK test is a very popular tool for identifying the existence of an increasing or decreasing trend within a time series. A detailed description of the MK test is available in Jhajharia et al. (2009). The MK test is a rank-based nonparametric test for assessing the significance of a trend and has been widely used in climatological trend detection studies. Examples include the studies of Hanssen-Bauer and Førland (1998), Shrestha et al. (1999), Yue et al. (2002), Domonkos et al. (2003) and Chattopadhyay et al. (2011). The MK test is usually applied by considering the statistic S, defined as (Modarres and da Silva 2007):

$$ S = \sum\limits_{i = 2}^{n} \sum\limits_{j = 1}^{i - 1} {\text{sign}}\left( x_i - x_j \right) $$

where \( x_j \) are the sequential data values, n is the length of the time series, and \( {\text{sign}}\left( x_i - x_j \right) = -1 \) if \( x_i - x_j < 0 \), \( {\text{sign}}\left( x_i - x_j \right) = 0 \) if \( x_i - x_j = 0 \) and \( {\text{sign}}\left( x_i - x_j \right) = 1 \) if \( x_i - x_j > 0 \). The null hypothesis \( H_0 \) is that a sample of data \( \left\{ X_t : t = 1, 2, \ldots, n \right\} \) is independent and identically distributed. The alternative hypothesis \( H_1 \) is that a monotonic trend exists in \( \left\{ X_t \right\} \). Each pair of observed values \( (x_i, x_j) \) with i > j is inspected to find out whether \( x_i > x_j \) (first type) or \( x_i < x_j \) (second type); S is the number of pairs of the first type minus the number of pairs of the second type. The mean E[S] and variance Var[S] of the statistic S are obtained as:

$$ \begin{gathered} E\left[ S \right] = 0 \\ {\text{Var}}\left[ S \right] = \frac{n\left( n - 1 \right)\left( 2n + 5 \right) - \sum\limits_{p = 1}^{q} t_p\left( t_p - 1 \right)\left( 2t_p + 5 \right)}{18} \end{gathered} $$

where \( t_p \) is the number of data points in the pth group of tied values and q is the number of tied groups.

A standard normal variate Z is now constructed following Yue et al. (2002) and Xu et al. (2004) as:

$$ Z = \begin{cases} \dfrac{S - 1}{\sqrt{{\text{Var}}(S)}}, & S > 0 \\ 0, & S = 0 \\ \dfrac{S + 1}{\sqrt{{\text{Var}}(S)}}, & S < 0 \end{cases} $$

In a two-sided test for the trend, the null hypothesis of no trend is rejected if \( \left| Z \right| > {Z_{{\alpha /2}}} \) where α is the significance level.
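The computation of S, Var[S] and Z described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the implementation used in the study: the function name `mann_kendall` and the default critical value `z_crit=1.96` (the two-sided 5% level of the standard normal distribution) are assumptions introduced here.

```python
import math
from collections import Counter


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def mann_kendall(x, z_crit=1.96):
    """Return (S, Var[S], Z, reject_H0) for the sequence x.

    z_crit is an assumed default: the two-sided 5% critical value.
    """
    n = len(x)
    # S sums sign(x_i - x_j) over all pairs with i > j.
    s = sum(sign(x[i] - x[j]) for i in range(1, n) for j in range(i))
    # Tie correction: t_p is the size of each group of equal values.
    tie_term = sum(t * (t - 1) * (2 * t + 5)
                   for t in Counter(x).values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    # Standard normal variate, shifted by one unit towards zero.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, var_s, z, abs(z) > z_crit
```

Applied to a latitude-ordered ozone series for a given month or season, the last element of the returned tuple indicates whether the null hypothesis of no trend would be rejected at the chosen level.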

Kendall's τ is defined as

$$ \tau = 2\frac{{S\prime }}{{n\left( {n - 1} \right)}} $$

where \( S\prime \) is Kendall's sum, estimated as \( S\prime = L - M \), where L is the number of cases with \( x_i - x_j > 0 \) and M is the number of cases with \( x_i - x_j < 0 \). A thorough discussion on Kendall's τ is available in Xu et al. (2004).
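As a small illustration of the definition above, Kendall's τ can be computed by counting the pairs directly. The function name `kendall_tau` is hypothetical, and the sketch follows the formula as stated here, without any correction for ties.

```python
def kendall_tau(x):
    """Kendall's tau as 2 * (L - M) / (n * (n - 1)), pairs taken with i > j."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    pairs = [(x[i], x[j]) for i in range(1, n) for j in range(i)]
    big_l = sum(1 for xi, xj in pairs if xi > xj)  # cases with x_i - x_j > 0
    big_m = sum(1 for xi, xj in pairs if xi < xj)  # cases with x_i - x_j < 0
    return 2.0 * (big_l - big_m) / (n * (n - 1))
```

A strictly increasing series gives τ = 1 and a strictly decreasing one gives τ = -1, so the sign of τ indicates the direction of the trend along the series.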

2.2 Autoregressive integrated moving average (ARIMA): an overview

Univariate approaches have been adopted by several authors in climatology and other geophysical studies. Autoregressive models for modeling geophysical time series have been explored in Delleur and Kavvas (1978), Zwiers and Storch (1995), Xu et al. (2008) and Chattopadhyay et al. (2011). Prybutok et al. (2000) developed a neural network model for forecasting maximum daily ozone levels in a nonattainment area and compared it with regression and ARIMA models for the Houston metropolitan area. The ARIMA model (1, 0, 0) × (1, 0, 1)\( _{24} \) was shown to satisfactorily predict hourly ozone concentrations in the urban area of Southeastern Spain in the study of Dueñas et al. (2005). Chattopadhyay and Chattopadhyay (2010) compared the prediction performance of ARIMA with an autoregressive neural network for forecasting total ozone concentration. ARIMA forecasting for various ambient air pollutants including ozone over an urban area of India was attempted in the work of Kumar and Jain (2010). A mathematical overview of the ARIMA model is presented below. The set of adjustable parameters \( {\varphi_1},{\varphi_2}, \ldots, {\varphi_p} \) of an autoregressive process of order p, i.e., AR(p) process (Box et al. 2007)

$$ {\tilde{z}_t} = {\varphi_1}{\tilde{z}_{{t - 1}}} + {\varphi_2}{\tilde{z}_{{t - 2}}} + \ldots + {\varphi_p}{\tilde{z}_{{t - p}}} + {a_t} $$

satisfies certain conditions for the process to be stationary. Here, \( {\tilde{z}_t} = {z_t} - \mu \). The parameter \( \varphi_1 \) of an AR(1) process must satisfy the condition \( \left| \varphi_1 \right| < 1 \) for the time series to be stationary. It can be shown that the autocorrelation function satisfies the equation

$$ {\rho_k} = {\varphi_1}{\rho_{{k - 1}}} + {\varphi_2}{\rho_{{k - 2}}} + \ldots + {\varphi_p}{\rho_{{k - p}}}. $$

Substituting \( k = 1, 2, \ldots, p \) in the above equation, and using \( \rho_0 = 1 \) and \( \rho_{-k} = \rho_k \), we get the system of Yule–Walker equations (Box et al. 2007)

$$ {\rho_1} = {\varphi_1} + {\varphi_2}{\rho_1} + \ldots + {\varphi_p}{\rho_{{p - 1}}} $$
$$ {\rho_2} = {\varphi_1}{\rho_1} + {\varphi_2} + \ldots + {\varphi_p}{\rho_{{p - 2}}} $$
$$ {\rho_p} = {\varphi_1}{\rho_{{p - 1}}} + {\varphi_2}{\rho_{{p - 2}}} + \ldots + {\varphi_p}. $$

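The Yule–Walker estimates of the autoregressive parameters \( {\varphi_1},{\varphi_2}, \ldots, {\varphi_p} \) are obtained by replacing the theoretical autocorrelation \( \rho_k \) by the estimated autocorrelation \( r_k \). Thus, in matrix notation, with \( r = (r_1, r_2, \ldots, r_p)\prime \) and R the p × p matrix whose (i, j)th element is \( r_{|i - j|} \) (with \( r_0 = 1 \)), the autoregressive parameters can be written as: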

$$ \Phi = {R^{{ - 1}}}r. $$
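A minimal sketch of the Yule–Walker estimate follows, using only the Python standard library. The helper names (`autocorr`, `solve`, `yule_walker`) are hypothetical, and the particular sample autocorrelation estimator (lagged products divided by the total sum of squares) is an assumption, since the text does not specify one.

```python
def autocorr(z, k):
    """Sample autocorrelation r_k of the sequence z at lag k (assumed estimator)."""
    n = len(z)
    mean = sum(z) / n
    c0 = sum((v - mean) ** 2 for v in z)
    ck = sum((z[t] - mean) * (z[t + k] - mean) for t in range(n - k))
    return ck / c0


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def yule_walker(z, p):
    """Estimate phi_1..phi_p by solving R * Phi = r."""
    r = [autocorr(z, k) for k in range(p + 1)]  # r[0] equals 1
    big_r = [[r[abs(i - j)] for j in range(p)] for i in range(p)]
    return solve(big_r, r[1:])
```

For p = 1 the estimate reduces to \( \varphi_1 = r_1 \), which gives a quick sanity check on any implementation.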

The pth order autoregressive process may be written as

$$ \varphi (B){\tilde{z}_t} = {e_t} $$

where \( e_t \) follows the qth order moving average process

$$ {e_t} = \theta (B){a_t}. $$

Now, an ARMA (p, q) process is presented as

$$ \varphi (B){\tilde{z}_t} = \theta (B){a_t} $$

where ϕ(B) and θ(B) are polynomials of degrees p and q, respectively, and B is the backward shift operator. The ARMA process is stationary if the roots of \( \varphi (B) = 0 \) lie outside the unit circle and it exhibits explosive nonstationary behavior if they lie inside the unit circle. If ϕ(B) is a stationary autoregressive operator, then the ARIMA process is derived as

$$ \varphi (B){\left( {1 - B} \right)^d}{\tilde{z}_t} = \theta (B){a_t} $$

where d denotes the number of times the stationary process is summed. Introducing the backward difference operator \( \nabla = 1 - B \), and noting that \( {\nabla^d}{\tilde{z}_t} = {\nabla^d}{z_t} \) for d ≥ 1 (differencing removes the constant mean μ), the above equation becomes

$$ \varphi (B){\nabla^d}{z_t} = \theta (B){a_t} $$

where, considering the various values of p, d and q, the ARIMA model is written as ARIMA (p, d, q).
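The role of d can be illustrated with a short sketch of the backward difference operator and its inverse (summation). The function names `difference` and `integrate` are hypothetical and the numbers in the usage note are arbitrary illustrative values, not ozone data.

```python
def difference(z, d=1):
    """Apply the backward difference operator (1 - B) to z, d times."""
    out = list(z)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


def integrate(w, initial):
    """Undo one differencing step, given the first value of the original series."""
    out = [initial]
    for v in w:
        out.append(out[-1] + v)
    return out
```

For example, differencing the series 1, 4, 9, 16, 25 once gives 3, 5, 7, 9, and differencing twice gives the constant series 2, 2, 2; summing 3, 5, 7, 9 from the initial value 1 recovers the original series.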

2.3 Model assessment

In the present work, the ARIMA model fitted to the above-described data is assessed by Willmott's index. The superiority of Willmott's index over other conventional measures of goodness of fit is discussed in the work of Chattopadhyay and Chattopadhyay (2008). Willmott (1982) advocated an index to measure the degree of agreement between actual and predicted values. This is given as:

$$ {d^2} = 1 - \left[ {\sum\limits_i {{{\left| {{P_i} - {O_i}} \right|}^{\alpha }}} } \right]{\left[ {\sum\limits_i {{{\left( {\left| {{P_i} - \bar{O}} \right| + \left| {{O_i} - \bar{O}} \right|} \right)}^{\alpha }}} } \right]^{{ - 1}}}. $$

Here, \( P_i \) and \( O_i \) denote the predicted and observed values for the ith data point, and \( \bar{O} \) is the mean of the observed values. For good predictive models, \( d^2 \) is close to 1. For convenience, we shall denote Willmott's index by WI in the subsequent part of the paper.
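The index can be evaluated directly from its definition. The sketch below uses a hypothetical function name, `willmott_index`, and takes α = 2 as the default, corresponding to the index of order two used later in the paper.

```python
def willmott_index(predicted, observed, alpha=2):
    """Willmott's index of agreement of order alpha."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    o_bar = sum(observed) / len(observed)
    # Numerator: summed prediction errors raised to the power alpha.
    num = sum(abs(p - o) ** alpha for p, o in zip(predicted, observed))
    # Denominator: summed potential error about the observed mean.
    den = sum((abs(p - o_bar) + abs(o - o_bar)) ** alpha
              for p, o in zip(predicted, observed))
    return 1.0 - num / den
```

A perfect prediction gives an index of exactly 1, while predicting the observed mean at every point gives 0. The denominator vanishes only in the degenerate case where all observed and predicted values coincide with the mean.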

3 Results and discussion

In this section, we discuss the outcomes of the Mann–Kendall (MK) trend test on the tropospheric ozone. We have investigated the existence of any trend in the spatial distribution of tropospheric ozone between 50°S and 50°N. To begin with, we have considered the ozone on the monthly scale. In order to carry out a trend analysis based on the MK method, we had measurements of the tropospheric ozone at every 5° of latitude along both hemispheres. In this way, there is a data series of tropospheric ozone at equidistant spatial points. For every month, we carried out an MK test separately and computed Kendall's τ value based on the null hypothesis that there is no trend within the spatial tropospheric ozone data series. This null hypothesis has been tested against the alternative hypothesis of the existence of a trend. Among the 12 months considered, we could find trends only in the months of January, March, April, November and December. The results are displayed in Table 1.

Table 1 Results of MK trend test for the spatial distribution of tropospheric ozone in monthly scale

The results for the months of November, December and January reveal that there is a spatial trend within the tropospheric ozone during winter. This monthly data analysis further indicates the existence of a trend during the premonsoon months. To elucidate the findings based on monthly data further, we execute an MK test on the seasonal scale by reconstructing the data series as simple averages over the seasons of premonsoon (March–May), monsoon (June–September), postmonsoon (October–November) and winter (December–February). On the seasonal scale, the MK test establishes the existence of a trend for all seasons except monsoon. For the other three seasons, the null hypothesis of no trend is rejected based on the p value coming out of the MK test. The results of the MK test on the seasonal scale are presented in Table 2.
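The seasonal reconstruction by simple averaging can be sketched as follows. The data layout (a dictionary mapping month number to a latitude-ordered list of ozone values) and the names `SEASONS` and `seasonal_profiles` are assumptions made for illustration; only the season definitions are taken from the text.

```python
# Season definitions as given in the text (month numbers, 1 = January).
SEASONS = {
    "premonsoon": [3, 4, 5],
    "monsoon": [6, 7, 8, 9],
    "postmonsoon": [10, 11],
    "winter": [12, 1, 2],
}


def seasonal_profiles(monthly):
    """Average monthly latitude profiles into one profile per season.

    monthly maps a month number (1-12) to a list of ozone values,
    one per latitude point, ordered from 50S to 50N (assumed layout).
    """
    profiles = {}
    for season, months in SEASONS.items():
        rows = [monthly[m] for m in months]
        # zip(*rows) groups the values of all months at each latitude.
        profiles[season] = [sum(col) / len(col) for col in zip(*rows)]
    return profiles
```

Each resulting seasonal profile is again a latitude-ordered series, so it can be passed to the MK test in the same way as a monthly series.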

Table 2 Results of MK trend test for the spatial distribution of tropospheric ozone in seasonal scale

In the previous paragraph, we have already established the existence of a linear spatial trend within the data series of tropospheric ozone between 50°S and 50°N. At this juncture, we define a random variable \( X_n \) as the measure of tropospheric ozone at latitude n.

Hence, we have a series \( X_n \) that may be used for a linear prediction model in which \( X_n \) is regarded as a linear combination of its previous values, with the tropospheric ozone at 50°S taken as the beginning of the data series. To generate a linear prediction model for this spatial data series, we fit an ARIMA model to the data series. The details of ARIMA (p,d,q) modeling are available in Sprott (2003). For every season (except monsoon), we have executed ARIMA (1,1,1), ARIMA (0,1,1) and ARIMA (0,2,2). This approach has already been adopted for total ozone by Chattopadhyay and Chattopadhyay (2010). The predictions are presented in Figs. 1, 2 and 3.
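For reference, the three candidate models can be written out explicitly from the general ARIMA (p, d, q) form of Section 2.2. The expansion below is a sketch that assumes the Box–Jenkins sign convention \( \theta (B) = 1 - {\theta_1}B - \ldots - {\theta_q}{B^q} \), which is not stated explicitly in the text, and uses \( z_t \) for the ozone value at the tth latitude step counted from 50°S.

ARIMA (0,1,1):

$$ {z_t} = {z_{t - 1}} + {a_t} - {\theta_1}{a_{t - 1}} $$

ARIMA (1,1,1), using \( \left( 1 - {\varphi_1}B \right)\left( 1 - B \right) = 1 - \left( 1 + {\varphi_1} \right)B + {\varphi_1}{B^2} \):

$$ {z_t} = \left( 1 + {\varphi_1} \right){z_{t - 1}} - {\varphi_1}{z_{t - 2}} + {a_t} - {\theta_1}{a_{t - 1}} $$

ARIMA (0,2,2), using \( {\left( 1 - B \right)^2} = 1 - 2B + {B^2} \):

$$ {z_t} = 2{z_{t - 1}} - {z_{t - 2}} + {a_t} - {\theta_1}{a_{t - 1}} - {\theta_2}{a_{t - 2}} $$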

Fig. 1 Schematic showing the prediction of tropospheric ozone through the ARIMA (1,1,1) model in premonsoon, postmonsoon and winter, respectively. The blue line corresponds to the observed values and the red line to the model estimations

Fig. 2 Schematic showing the prediction of tropospheric ozone through the ARIMA (0,1,1) model in premonsoon, postmonsoon and winter, respectively. The blue line corresponds to the observed values and the red line to the model estimations

Fig. 3 Schematic showing the prediction of tropospheric ozone through the ARIMA (0,2,2) model in premonsoon, postmonsoon and winter, respectively. The blue line corresponds to the observed values and the red line to the model estimations

In all seasons under consideration, the ARIMA models appear, by visual inspection, to fit the tropospheric ozone well. To assess the result, we analyze the residuals statistically for each ARIMA model. To assess the skill of ARIMA as a representative of the spatial distribution of tropospheric ozone, we compare three commonly used forms, namely ARIMA (1,1,1), ARIMA (0,1,1) and ARIMA (0,2,2), in terms of their prediction yields at permissible errors of 5% and 10%. At both levels of permissible error, the performance of ARIMA (1,1,1) is similar for postmonsoon and winter, and better than for premonsoon. ARIMA (0,1,1) performs similarly to ARIMA (1,1,1). However, ARIMA (0,2,2) performs best for premonsoon. These results are exhibited in Table 3. In all cases, the WI values are above 0.9. However, the maximum value of WI occurs for premonsoon with ARIMA (0,2,2). The remaining two ARIMA models also attain their best WI in the premonsoon months. These results are displayed in Table 4.
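The text does not define the prediction yield formally. One plausible reading, assumed here purely for illustration, is the fraction of data points whose absolute error relative to the observed value does not exceed the permissible level. The function name `prediction_yield` and the numbers in the usage note are hypothetical.

```python
def prediction_yield(predicted, observed, tolerance):
    """Fraction of points with |P - O| <= tolerance * |O|.

    This definition is an assumption; the paper does not state one.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for p, o in zip(predicted, observed)
               if abs(p - o) <= tolerance * abs(o))
    return hits / len(observed)
```

With observed values of 100 at four points and predictions of 103, 108, 112 and 100, this reading gives a yield of 0.5 at the 5% level and 0.75 at the 10% level.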

Table 3 Results of absolute residuals at 5% and 10% levels of permissible error for the spatial distribution of tropospheric ozone in seasonal scale
Table 4 Results of Willmott's index of order two (WI2) for the spatial distribution of tropospheric ozone in seasonal scale

4 Conclusion

Considering the discussions presented above, we may conclude the following about the behavior of spatial distribution of tropospheric ozone:

  • As we move from the southern to the northern hemisphere, the monthly tropospheric ozone exhibits a linear trend in the months of January, March, April, November and December. On the monthly scale, there is no trend in the remaining months.

  • Considering the seasonal scale, we could identify the presence of a linear trend in all seasons, except monsoon.

  • Assessing the individual estimation performance, ARIMA (1,1,1) and ARIMA (0,1,1) are found to be the best representatives for the tropospheric ozone in postmonsoon and winter, whereas ARIMA (0,2,2) represents best the premonsoon months.

  • Based on the WI of order two, the best overall estimation capacity for premonsoon is found to be exhibited by ARIMA (0,2,2). The maximum values of WI for postmonsoon and winter also occur for ARIMA (0,2,2).

We therefore conclude that the tropospheric ozone, in general, increases from the southern to the northern hemisphere. ARIMA (0,2,2) can be used as a representative of the spatially distributed tropospheric ozone over the southern and northern hemispheres. In the future, this study may be extended to the exploration of the intrinsic complexity of the tropospheric ozone by the Grassberger–Procaccia correlation dimension method (James 1987) and subsequent modeling by soft computing methods (Pal and Mitra 1999).