1 Introduction

Analysis of moment structures (AMOS) implements a general approach to data analysis known as analysis of covariance structures, analysis of linear structural relations, structural equation modeling, or causal modeling. AMOS can analyze data from several populations at once. It can estimate means for exogenous variables and intercepts in regression equations. The program can also compute full information maximum likelihood estimates in the presence of missing data. Any parameter can be fixed at a known value in advance, and any parameter can be constrained to be equal to any other parameter. AMOS offers a choice of the four estimation criteria discussed in [1]: (a) maximum likelihood, (b) unweighted least squares, (c) generalized least squares, and (d) Browne’s asymptotically distribution-free criterion.

AMOS estimates the following quantities: (a) model parameters; (b) standardized regression weights; (c) squared multiple correlation for each endogenous variable in the model, indicating the proportion of the variance of that variable that is accounted for by the remaining variables in the model; (d) total effects; and (e) regression weights for regressing unobserved variables on the observed variables (factor score weights) [4].

Multivariate monitoring methods that consider all available data simultaneously can extract key information about the relationships and combined effects of environmental pollutants. Additionally, when failures occur in environmental quality management systems, univariate monitoring methods are often inadequate to identify causes because the signal-to-noise ratio is very low in each pollutant measurement. But multivariate monitoring can improve the signal-to-noise ratio through averaging, resulting in a more realistic evaluation of the environmental context [13].

In the chemometrics area, multivariate statistical techniques have become one of the most active research areas in modeling and analysis over the last decade [15, 21, 7]. However, to the authors’ knowledge, only limited research on the effectiveness of multivariate models for the assessment and management of air pollution has been conducted thus far [16–18]. Additionally, multivariate statistical techniques such as cluster analysis (CA), factor analyses (FA), principal component analysis (PCA), and discriminant analysis (DA) have been widely used as unbiased methods in the analysis of water quality data for drawing meaningful information [26, 19]. The multivariate treatment of data is widely used to characterize and evaluate surface and freshwater quality, and it is useful for evidencing temporal and spatial variations caused by natural and anthropogenic factors linked to seasonality [11, 25].

The functions of a reservoir watershed fall into four main types: (i) source water for water treatment plants; (ii) water used to maintain the sustainable development of a water body; (iii) recreation and leisure, including direct contact (swimming) and indirect contact (boating); and (iv) agricultural and other industrial uses. At present, due to improper development and use of land, some reservoir watersheds in Taiwan suffer from eutrophication. This has a large impact on water use and flood prevention and shortens the life of reservoirs [12].

This study mainly uses AMOS to confirm the goodness of fit of a previously derived factor analysis model. The basic purpose of AMOS is to perform structural equation modeling (SEM), which confirms hypothesized models or path diagrams through the analysis of covariance structures and causal modeling. Before applying AMOS, this study used factor analysis, a multivariate statistical technique, to identify common factors for six water quality monitoring stations, and selected nine important water quality parameters to examine their correlations: water pH (pH), temperature (Temp), dissolved oxygen (DO), biochemical oxygen demand (BOD), suspended solids (SS), anionic surfactant (MBAS), ammonia nitrogen (NH3-N), total phosphorus (TP), and chlorophyll (Chloro). Factor analysis has produced a variety of results when applied to water quality and time series studies of a single reservoir watershed [22, 23, 25], but AMOS has rarely been used to confirm such results and to establish the goodness of fit of the resulting models; this gap motivates the present study. The study aims to build a standard set of methods that reservoir authorities can apply to substantially improve the statistical analysis of water quality data and the formulation of watershed management strategies.

2 Methodology

2.1 The Application of AMOS Software

AMOS is a software package for SEM. This study uses the English edition of AMOS 22.0, developed by IBM Corporation [2]. AMOS can share the same data files with SPSS. With the widespread use of SPSS and SEM, AMOS has come to be regarded as on a par with Linear Structural Relations (LISREL).

2.2 Analysis Background of AMOS

Before the AMOS analysis, this study followed the parameters proposed by Wu and Kuo [24] and used SPSS to analyze nine water quality parameters from six water quality monitoring stations in the watershed of the Feitsui Reservoir in Northern Taiwan. The factor analyses identified four main factors that influence water quality in the watershed: “organic pollution,” “eutrophication,” “seasonal,” and “sediment pollution,” as shown in Table 1. These factor analyses were conducted with orthogonal rotation. In Table 1, the water quality parameters that each of the four factors mainly explains are highlighted; factor 1, for example, comprises BOD, DO, and NH3-N. The factor loadings of these three parameters are 0.771, 0.667, and 0.636, respectively, which are high and indicate that these parameters are strongly associated with pollution in the watershed around the reservoir. The factor loading of Chloro is 0.413, which is lower than the others and suggests that Chloro does not contribute significantly to this factor, either in behavior or in level. This study also uses oblique rotation to check whether the validity coefficients and factor loadings of the reflective indicators, and hence the goodness of fit, improve relative to orthogonal rotation.

Table 1 Matrix of water quality factor loadings for the watershed

2.3 Study Area

The study area comprises six water quality monitoring stations in the watershed of the Feitsui Reservoir in Northern Taiwan. The watershed has a total area of 303 km2 and covers approximately 30 km of Shungxi, Shiding, and Xindian of New Taipei City and Taipei City. The Feitsui Reservoir is the second largest reservoir in Taiwan and is located in Shi Creek. It is the water divide for the six branches that run to the upstream area in Xindian. The purpose of this reservoir is to provide water for household use in Taipei City. The watershed area is the only protected one in the reservoir. To lessen the burden that recreational use places on water quality, the Feitsui Reservoir Administration banned water recreation activities in the watershed area in 2001. Figure 1 shows the geographical location of the Feitsui Reservoir in Northern Taiwan (w1∼w6 indicate the six water quality monitoring stations).

Fig. 1 Geographical location of the Feitsui Reservoir watershed

2.4 Factor Analyses

FA is a useful tool for extracting latent information, such as relationships between variables that are not directly observable [10]. The original data matrix is decomposed into the product of a matrix of factor loadings and a matrix of factor scores plus a residual matrix. In general, when the eigenvalue-one criterion is applied, the number of extracted factors is less than the number of measured features, so FA reduces the dimensionality of the original data space. After rotation of the factor loading matrix (e.g., by varimax rotation), the factors can often be interpreted as origins or common sources. Factor analyses were performed on a correlation matrix of rearranged data in order to explain the structure of the underlying data set. The correlation coefficient matrix measures how well the variance of each constituent can be explained by its relationship with each of the others [14]. Then, the variance/covariance and correlation coefficients of the variables were computed. The model for the relevant factor analyses is shown below:

$$ \begin{array}{l} x_1=\mu_1+\ell_{11}f_1+\ell_{12}f_2+\dots+\ell_{1q}f_q+\varepsilon_1 \\ x_2=\mu_2+\ell_{21}f_1+\ell_{22}f_2+\dots+\ell_{2q}f_q+\varepsilon_2 \\ \quad\vdots \\ x_i=\mu_i+\ell_{i1}f_1+\ell_{i2}f_2+\dots+\ell_{iq}f_q+\varepsilon_i \\ \quad\vdots \\ x_p=\mu_p+\ell_{p1}f_1+\ell_{p2}f_2+\dots+\ell_{pq}f_q+\varepsilon_p \end{array} $$
(1)

where f_1, …, f_q are the common factors shared by the variables x_i; μ_i is the mean of x_i; ε_i is a specific factor contained only in the ith variable (x_i); and ℓ_ij is the loading of the ith variable on the jth common factor (f_j).
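
In matrix notation, the same model and the covariance structure it implies can be sketched as follows. The symbols Λ (the p × q matrix of loadings ℓ_ij), Φ (the covariance matrix of the common factors), and Ψ (the diagonal covariance matrix of the specific factors) are introduced here for illustration and do not appear in the original text. The sketch assumes that the common and specific factors are uncorrelated.

$$ \mathbf{x}=\boldsymbol{\mu}+\boldsymbol{\Lambda}\mathbf{f}+\boldsymbol{\varepsilon},\qquad \operatorname{Cov}\left(\mathbf{x}\right)=\boldsymbol{\Lambda}\boldsymbol{\Phi}\boldsymbol{\Lambda}^{\top}+\boldsymbol{\Psi} $$

Under an orthogonal solution with standardized factors, Φ is the identity matrix and the covariance reduces to ΛΛ^T + Ψ. An oblique solution allows nonzero off-diagonal elements in Φ, that is, correlated factors.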

2.5 Confirmation of AMOS Software Program

This study applies factor analyses to identify four factors (organic pollution, eutrophication, seasonal, and sediment pollution), and these four factors are then analyzed with the AMOS modeling software. The AMOS output is assessed mainly through the chi-square statistic and its p value. We apply AMOS to analyze the goodness of fit, reliability, and validity of the four factors obtained from the factor analyses; that is, AMOS evaluates the goodness of fit of the factor analysis model from a confirmatory perspective. In this setting, the null hypothesis of the chi-square test is that the model fits the data, so a p value higher than 0.05 is regarded as good [2, 5], in contrast to conventional significance testing, where a p value lower than 0.05 is sought. In this study, the p value and chi-square are used to confirm the goodness of fit, but the model must also be checked against several additional indices to assess its internal quality. The evaluation standards are shown in Table 2 [9].

Table 2 SEM evaluation indices and standard for the goodness of fit of model

Before the goodness of fit is evaluated, the model needs to be checked for offending estimates. In SEM analysis, the “maximum and minimum” standard deviation estimates are often not defined [2, 8, 20].

2.6 AMOS Modification Indices

Modification indices are important clues for detecting specification errors, but the size of a modification index must be considered before the model is modified. As proposed by scholars [3, 8, 27], modification is warranted when a modification index is higher than 3.84, the critical value of the chi-square distribution with one degree of freedom at the 0.05 significance level.
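
The 3.84 cutoff can be reproduced with a few lines of Python. This is a minimal sketch, assuming each modification index is treated as a chi-square statistic with one degree of freedom; the function names are hypothetical and are not part of AMOS.

```python
from math import erfc, sqrt
from statistics import NormalDist


def chi2_1df_critical(alpha: float = 0.05) -> float:
    """Upper-alpha critical value of a chi-square variable with 1 df.

    A 1-df chi-square variable is the square of a standard normal variable,
    so the critical value is the squared two-sided normal quantile.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * z


def chi2_1df_pvalue(statistic: float) -> float:
    """Upper-tail p value of a 1-df chi-square statistic."""
    return erfc(sqrt(statistic / 2.0))


def needs_review(modification_index: float, alpha: float = 0.05) -> bool:
    """Flag a modification index that exceeds the 1-df critical value."""
    return modification_index > chi2_1df_critical(alpha)


print(round(chi2_1df_critical(), 2))  # prints 3.84
```

A flagged index only suggests that freeing the parameter would significantly reduce the chi-square; whether the change is made should still rest on a substantive justification.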

2.7 AMOS Specification Search

Specification search is the last modification tool used in this research. The specification search function of AMOS fits the alternative models formed by including or excluding the selected covariance paths (drawn as curves) in the hypothesized model, in order to identify the combination of covariance paths among the four factors that gives the best goodness of fit, that is, the highest p value [3, 8].

3 Results and Discussion

3.1 Application and Analysis of Factor Analyses

Before AMOS was applied in this study, Wu et al. [25] used varimax (orthogonal) rotation in the factor analyses and interpreted the eigenvalues. Because SPSS 17.0 by default retains only factors with eigenvalues greater than 1.00, researchers conventionally select factors with eigenvalues higher than 1.00 as the latent factors. In practical analysis, a factor with an eigenvalue smaller than but close to 1.00 should also be considered. According to the analysis results, there are four factors with eigenvalues higher than 1.00, as shown in Table 3, and they are selected as the main factors used to explore the influences on water quality in the watershed.

Table 3 Results of factor analyses and the variance explained

In this study, the factors with eigenvalues higher than 1 are retained as the main factors; the parameters belonging to each factor are then selected from the component matrix obtained through orthogonal rotation. Table 1, provided in the “Methodology” section, shows the matrix after orthogonal rotation, which explains the characteristics of the four factors. It can also be used to describe the main factors that influence water quality in the watershed area as well as the variance among the four factors.

3.2 Selection of AMOS

Table 4 shows the kurtosis of the nine water quality parameters. The parameters are not normally distributed because none of the kurtosis values is close to 3, the value expected for a normal distribution, and the kurtosis value for the whole model reaches 191.214, indicating a very steep distribution. Generalized least squares (GLS) is therefore used in place of maximum likelihood as the estimation method.
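
The kurtosis check described above can be illustrated with a short Python sketch. It assumes the moment-based (Pearson) definition, for which a normal distribution gives 3; the sample values are made up for illustration and are not the study's data.

```python
from statistics import fmean


def pearson_kurtosis(values):
    """Moment-based kurtosis m4 / m2**2; a normal distribution gives 3."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / (m2 ** 2)


def excess_kurtosis(values):
    """Kurtosis relative to the normal reference value of 3."""
    return pearson_kurtosis(values) - 3.0


# Hypothetical sample with one extreme value, giving a peaked, heavy-tailed shape.
sample = [0.0] * 9 + [10.0]
print(pearson_kurtosis(sample) > 3.0)  # prints True
```

Some statistical packages report excess kurtosis, for which the normal reference value is 0 rather than 3, so the convention used in a given output table should be checked before values are compared.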

Table 4 Evaluation of normal distribution (factor analyses)

3.3 Orthogonal Rotation Analysis

At the start of the factor analyses, this study assumes that there is no correlation between the four factors, so orthogonal rotation is used. As shown in Fig. 2, covariance paths connect the factors, but this study fixes each of them at 0.00001; a correlation this low can be regarded as zero.

Fig. 2 Results of orthogonal rotation (factor analyses)

According to Fig. 2, the p value is only 0.000 and the chi-square is 93.003, implying a poor initial goodness of fit of the hypothesized model and a failure to explain the estimates.

3.4 Analysis of Oblique Rotation

This study uses oblique rotation to analyze and modify the original orthogonal rotation to identify the correlation of the four factors because of the biological, chemical, and physical relationship of the nine parameters of water quality.

In terms of factor loadings, the three reflective indicators of organic pollution show higher loadings after oblique rotation, supporting the goodness of fit of the hypothesized model for the organic pollution factor. The three reflective indicators of eutrophication improve their factor loadings markedly after oblique rotation; notably, Chloro has a factor loading of 0.05 under orthogonal rotation, but this increases to 0.28 after oblique rotation, indicating that the three indicators of the eutrophication factor are not misspecified. The pH and SS indicators of sediment pollution also remain negatively correlated, supporting the goodness of fit of the sediment pollution factor in the hypothesized model.

The results of the oblique rotation show covariance connections among the four factors. The correlation coefficient between organic pollution and eutrophication is the highest, at 0.52, indicating that organic pollution is correlated with the other three factors and most strongly with eutrophication. A plausible explanation is that eutrophication causes excessive algal growth, and the death of large numbers of algae produces excessive BOD; the amount of ammonia nitrogen is likely to increase as well.

3.5 Application of AMOS Modification Indices

Of the two types of rotation, this study selects the oblique rotation model, which gives the best p value, chi-square, and internal quality, for use with modification indices and specification search. The modification sequence starts with modification indices because it is necessary to examine whether any parameters (covariances) need to be redefined, since these strongly influence the subsequent specification search.

The orthogonal rotation model is not used because its p value is smaller and the iterative estimation procedure still fails to complete after several modifications, so no results can be obtained.

Table 5 lists three modification indices higher than 3.84, which indicate covariance relationships among ammonia nitrogen, DO, and temperature. It also indicates that the relationship between BOD and Chloro needs to be redefined.

Table 5 Covariance of modification indices

According to Fig. 3, the model after the second modification meets the requirement of a p value >0.05, and there is a positive correlation between BOD and Chloro. The explanation is that Chloro is an indicator of algae, and more algae lead to higher BOD.

Fig. 3 Modification indices of oblique rotation

A factor loading above 0.5 is taken as the standard. An increasing number of water quality indicators reach this standard, such as DO (0.559), BOD (0.661), MBAS (0.554), and pH (0.634), and the remaining indicators are also close to 0.5. The higher the factor loading, the higher the reliability coefficient.
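
The link between loading and reliability can be sketched in Python. The sketch assumes that the reported loadings are standardized and that each indicator loads on a single factor, in which case the indicator reliability equals the squared loading; this assumption and the function name are illustrative and are not stated in the source.

```python
# Loadings reported in the text for four indicators.
loadings = {"DO": 0.559, "BOD": 0.661, "MBAS": 0.554, "pH": 0.634}


def indicator_reliability(loading: float) -> float:
    """Share of indicator variance explained by its factor (squared loading)."""
    return loading ** 2


reliability = {
    name: round(indicator_reliability(value), 3)
    for name, value in loadings.items()
}
print(reliability)
# prints {'DO': 0.312, 'BOD': 0.437, 'MBAS': 0.307, 'pH': 0.402}
```

Under these assumptions a loading of 0.5 corresponds to a quarter of the indicator variance being explained, which shows why small gains in loading translate into noticeable gains in reliability.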

3.6 Specification Search of AMOS

This study selects all covariance paths for the first specification search, which yields a total of 256 candidate models to be analyzed. For the results of the first specification search, this study uses the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) to identify the optimal version of the hypothesized model.
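
The size of such a search can be illustrated by enumerating every subset of a set of optional paths. A count of 256 is consistent with eight optional paths (2^8 = 256), but the source does not state how many paths were made optional, so the number eight and the path labels below are assumptions.

```python
from itertools import combinations


def candidate_models(optional_paths):
    """Yield every subset of the optional paths; each subset is one model."""
    for size in range(len(optional_paths) + 1):
        for subset in combinations(optional_paths, size):
            yield subset


# Hypothetical labels: the source does not list the optional paths.
optional = [f"cov_{i}" for i in range(1, 9)]
models = list(candidate_models(optional))
print(len(models))  # prints 256
```

Because the number of candidates doubles with every additional optional path, an exhaustive search of this kind is practical only for a modest number of paths.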

During the specification search, AMOS uses AIC0, BCC0, and BIC0 as the basis for judgment; the model for which the three values are closest to 0 is the optimal hypothesized model [6]. Hence, this study conducts two specification searches and selects Fig. 4 as the best hypothesized model. At the same time, the second modification of the specification search in this study maintains good external quality, as shown in Table 6.
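
The role of the zero-based criteria can be sketched in Python, assuming that AIC0, BCC0, and BIC0 denote the corresponding criteria rescaled so that the best candidate model has a value of 0. The model names and AIC values below are hypothetical and are not results from this study.

```python
def zero_based(values):
    """Rescale criterion values so that the smallest one becomes 0."""
    best = min(values.values())
    return {name: value - best for name, value in values.items()}


# Hypothetical AIC values for three candidate models (not from the study).
aic = {"model_A": 61.2, "model_B": 58.4, "model_C": 59.9}

aic0 = zero_based(aic)
best_model = min(aic0, key=aic0.get)
print(best_model)  # prints model_B
```

After rescaling, each value is the distance from the best model, so candidates with values near 0 are close competitors and those with large values can be set aside.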

Fig. 4 Completion of oblique rotation model

Table 6 Evaluation indices of second specification search (oblique rotation)

4 Conclusion

This study uses multivariate statistical analysis to identify the four factors of organic pollution, eutrophication, seasonal, and sediment pollution that mainly influence water quality in the watershed area.

This study uses AMOS to confirm the factor analyses. The initial hypothesized orthogonal rotation model has a p value of 0.000 and a chi-square of 93.003, indicating a complete lack of fit. After oblique rotation, however, the inclusion of covariance paths confirms that the initial orthogonal rotation is not entirely a misestimate. After several exploratory factor analyses, this study arrives at the optimal model diagram, with a p value of 0.611 and a chi-square of 17.643, which meet the standards and confirm the normal distribution of the reflective indicators of the four factors in the hypothesized model of orthogonal rotation. The good external quality is supported by the biological, chemical, and physical relationships underlying the orthogonal rotation model. This also explains the biological, chemical, and physical influences in nature on the nine water quality parameters.

When multivariate analyses are applied, factor analyses, cluster analyses, and discriminant analyses should be conducted together to obtain optimal statistical results. Because of length limits, this study presents only factor analyses in stage 1 and confirmatory factor analyses in stage 2 to establish the complete set of methods.

In the future, heavy metals that affect water quality should be included to make the water quality analyses and management more complete.

This study aims to establish a set of standard methods for the reference of the reservoir authorities to dramatically improve the application and statistical analyses of water quality monitoring data as well as the enactment of water quality managerial strategies in the watershed areas of reservoirs.