Introduction

One of the most important global challenges of the twenty-first century, especially in developing countries such as Iran, is the optimal use of land to supply the needs of a growing population. Although land resources are limited, they are currently being exploited to the point of exhaustion, with adverse consequences. To meet the nutritional demands of the global population, measures must be taken to ensure the long-term sustainability of land use and the conservation of land while maintaining its high production potential. Therefore, determining soil fertility and locating suitable areas for cultivating different crops could play a major role in land productivity. Among the various agricultural products used throughout the world, maize is of great importance, and locating the optimal areas for its cultivation would therefore benefit agricultural productivity.

Along these lines, perhaps the most important factor in improving corn grain yield is soil fertility. Various methods, such as geographic information systems (GIS), fuzzy membership functions, the analytic hierarchy process (AHP), and hybrid models, have been applied to assess soil fertility status for crops (Cai et al. 2007; Malczewski 1999; Naseriparsa et al. 2014; Zhu et al. 1997, 2001). Hybrid methods appear to be more flexible in reflecting human ingenuity and in making more intelligent decisions.

Several studies have been conducted in the area of soil fertility, including Sun et al. (2018), Bhat et al. (2012), Davatgar et al. (2012), Bijanzadeh and Mokarram (2013, 2017), Mokarram et al. (2017, 2018), and Mokarram and Hojati (2017).

Grain corn is the third most important strategic agricultural product, following wheat and barley. In Iran, corn is commonly used as a major source of nutrition in poultry feed. Considering the high rate of production and cultivation of corn in the agricultural lands of Shiraz, it is necessary to determine optimal areas, in terms of soil fertility, for the cultivation of this product. The most common approaches to preparing soil fertility maps for corn production are the analytic network process (ANP), the analytic hierarchy process (AHP), and feature selection algorithms. Thus, the present study seeks to incorporate these models with the ArcGIS environment to obtain soil fertility maps for corn production. The main contributions of this study are (1) selecting relevant features and eliminating redundancies, which lowers cost and time complexity, and (2) using ANP models to produce soil fertility maps for corn production.

The rest of the paper is organized as follows. Section 2 describes the case study and the particular use of the ANP and AHP techniques and the feature selection algorithm, the structural hierarchy of the decision network, and the proposed model for soil fertility mapping for corn production. Section 3 presents and discusses the results. Finally, Sect. 4 draws conclusions. Figure 1 shows a summarized schema of the methodology employed in this study.

Fig. 1 Flowchart for soil fertility of corn

Materials and methods

Study area

The study area is located in Fars province in the south of Iran, between latitudes 29° 35′ N and 29° 42′ N and longitudes 52° 36′ E and 52° 42′ E, with an area of 302.2 km² (Fig. 2). The elevation is between 1,557 and 2,297 m. The required dataset was extracted from a land classification study conducted by the Fars Soil and Water Research Institute in 2012. The average temperature is 30 °C in July (the warmest month), 5 °C in January (the coldest month), 17 °C in March, and 20 °C in October, with an average annual temperature of 18 °C. Annual precipitation in the study area is 337.8 mm. Input data for the determination of soil fertility were derived from 34 field samples collected through purposive sampling. A set of seven parameters was used to predict soil fertility for the 34 soil samples: organic content (OC), phosphorus (P), potassium (K), iron (Fe), zinc (Zn), manganese (Mn), and copper (Cu) (Table 1). The maximum values of OC, P, K, Fe, Zn, Mn, and Cu were 1.48, 25, 539, 12.3, 1.5, 28, and 1.8, respectively, and the minimum values were 0.37, 5, 167, 1, 0.13, 2.8, and 0.55, respectively (OC in %, all other parameters in mg/kg).

Fig. 2 Location of the case study in Shiraz, Fars province, Iran

Table 1 Summaries of effective parameters for soil fertility of the study area

Method

The IDW method was initially deployed to obtain zoning maps for each parameter, after which ANP was used to allocate weights for the parameters and sub-parameters. Finally, feature selection algorithms were employed to obtain the most relevant features for crop production and ANP models were once again used to assign weights and generate the final map. Details for each of the proposed phases are stated in the following.

Inverse distance weighting (IDW)

The inverse distance weighting (IDW) model was used to obtain interpolation maps for the soil properties and for the climatic factors, including maximum temperature, minimum temperature, rainfall, relative humidity, sunshine hours, and growing degree days (GDD). The interpolation maps could then be used to predict values at locations that were not sampled.

The IDW method works by assigning a weight to each sampled point based on the distance between that point and the unknown point of interest. The weights decrease with distance, so that points located far from the point of interest contribute less to the estimate than adjacent points. It should, however, be noted that the method considers only the distance between the points, regardless of their location and arrangement. Put differently, points located at the same distance from the estimation point are assigned the same weight, whatever their direction. Statistically, it is a weighted moving average (Burrough 1989):

$$ \hat{m}(x_{0} ) = \frac{{\sum\limits_{{{\text{i}} = 1}}^{\text{n}} {m(x_{\text{i}} )d_{\text{ij}}^{ - r} } }}{{\sum\limits_{{{\text{i}} = 1}}^{\text{n}} {d_{\text{ij}}^{ - r} } }}, $$
(1)

where x0 is the estimation point, xi are the data points within a chosen neighborhood, and n is the number of those points. The weight of each data point is related to its distance dij from the estimation point through the exponent r.
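As an illustration of Eq. 1, the weighted moving average can be sketched in a few lines of Python with NumPy. This is a minimal sketch, not the ArcGIS implementation used in the study: the function name `idw_predict`, the exponent value r = 2, and the sample coordinates and values are assumptions introduced for the example, and all samples are treated as the neighborhood.

```python
import numpy as np

def idw_predict(sample_xy, sample_values, query_xy, power=2.0):
    """Estimate values at query points as a distance-weighted average (Eq. 1)."""
    sample_xy = np.asarray(sample_xy, dtype=float)
    sample_values = np.asarray(sample_values, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)

    # Pairwise distances: rows are query points, columns are samples.
    diff = query_xy[:, None, :] - sample_xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    estimates = np.empty(len(query_xy))
    for q, d in enumerate(dist):
        exact = d == 0.0
        if exact.any():
            # A query that coincides with a sample takes that sample's value.
            estimates[q] = sample_values[exact][0]
        else:
            w = d ** (-power)  # weight d^(-r): decreases with distance
            estimates[q] = (w * sample_values).sum() / w.sum()
    return estimates

# Hypothetical coordinates (km) and phosphorus values (mg/kg), for illustration only.
samples = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
phosphorus = [5.0, 25.0, 5.0, 25.0]
idw_estimates = idw_predict(samples, phosphorus, [(1.0, 1.0), (0.0, 0.0)])
```

The first query point is equidistant from all four samples, so every sample receives the same weight and the estimate is their plain average. The second query point coincides with a sample and simply returns that sample's value.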

Fuzzy set theory

Zadeh (1965) utilized membership functions (MFs) to create fuzzy sets of the characteristics of objects. A fuzzy set is defined as follows (McBratney and Odeh 1997):

$$ {\text{A }} = \left\{ {x,\;\mu_{A} \left( x \right)} \right\}\;{\text{for}}\;{\text{each}}\;x \in X, $$
(2)

where μA is the membership function (MF) that defines the degree of membership of x in fuzzy set A.

The development of GIS has greatly facilitated mapping procedures, including soil fertility mapping using both Boolean and fuzzy methods. Because higher soil nutrient levels lead to lower fertilizer demand and vice versa, a trapezoidal membership function was employed. The following function was used for each parameter (Shobha et al. 2013):

$$ \mu_{A}(x) = f(x) = \begin{cases} 0 & x \le B \\ \dfrac{x - B}{C - B} & B < x < C \\ 1 & x \ge C \end{cases}, $$
(3)

where x is the input value and B and C are the lower and upper critical bounds for each parameter (minimums and maximums), as shown in Table 2 (Shobha et al. 2013).

Table 2 Critical level of soil nutrients for corn production

According to Table 2, the critical ranges for OC, P, K, Fe, Zn, Mn, and Cu are < 1 (%), < 18.5, < 260, < 6.5, < 1.4, < 10, and < 1 (mg/kg), respectively. Therefore, values below the critical thresholds indicate a need for compensatory fertilization for corn cultivation.
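The membership function of Eq. 3 can be sketched as a clipped linear ramp. In this minimal example the upper bounds C are the critical levels quoted above, while the lower bounds B are an assumption (the observed sample minima are used as stand-ins, since the Table 2 values are not reproduced here). The name `ramp_membership` is likewise hypothetical.

```python
import numpy as np

def ramp_membership(x, lower, upper):
    """Eq. 3: 0 at or below `lower` (B), 1 at or above `upper` (C), linear in between."""
    x = np.asarray(x, dtype=float)
    return np.clip((x - lower) / (upper - lower), 0.0, 1.0)

# Upper bounds C: critical levels quoted in the text.
# Lower bounds B: ASSUMED here to be the observed sample minima.
bounds = {
    "OC": (0.37, 1.0),    # %
    "P": (5.0, 18.5),     # mg/kg
    "K": (167.0, 260.0),  # mg/kg
}

# Membership of four hypothetical phosphorus readings (mg/kg).
p_membership = ramp_membership([4.0, 11.75, 18.5, 25.0], *bounds["P"])
```

A reading halfway between B and C receives a membership of 0.5, and any reading at or above the critical level is treated as fully suitable.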

Analytic hierarchy process (AHP)

AHP was then applied to assign weights to each layer, conduct pair-wise comparisons, and overlay the layers. The analytic hierarchy process (AHP), first proposed by Saaty (1980), is one of the most efficient tools for multi-criteria decision-making. The model works by performing pair-wise comparisons between the main components and assigning the highest preference to the layers most relevant to the target of the decision-making process. Verbal judgments are used to rank the various criteria by means of pair-wise comparisons between the constituent factors. According to Saaty, the outcome of each comparison is expressed on a scale from 1 to 9 (Saaty and Vargas 1998).

To attain the proper weights for each parameter, elements in each column of the pair-wise matrix are summed together and each element in the matrix is divided by the sum of its corresponding column. The final weight vector is then obtained by calculating the average value of all elements in each row of the normalized matrix (Eq. 4):

$$ \begin{aligned} r_{ij} &= \frac{a_{ij}}{\sum_{i=1}^{m} a_{ij}}, \\ W_{i} &= \frac{\sum_{j=1}^{n} r_{ij}}{n}, \end{aligned} $$
(4)

where m is the number of rows, n is the number of columns (m = n for a pair-wise comparison matrix), aij is the element in the ith row and jth column of the pair-wise matrix, and rij is the corresponding value in the normalized matrix. Wi is the weight assigned to the ith option.
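The column normalization and row averaging of Eq. 4 can be sketched as follows. The 3 × 3 comparison matrix is hypothetical and is not taken from Table 3; the function name `ahp_weights` is also an assumption made for the example.

```python
import numpy as np

def ahp_weights(pairwise):
    """Column-normalise the pair-wise matrix, then average each row (Eq. 4)."""
    a = np.asarray(pairwise, dtype=float)
    normalised = a / a.sum(axis=0)   # r_ij: each element divided by its column sum
    return normalised.mean(axis=1)   # W_i: average of row i of the normalised matrix

# Hypothetical comparison of three parameters on Saaty's 1-9 scale.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)
```

Because this example matrix is perfectly consistent, every normalized column is identical and the weights are 4/7, 2/7, and 1/7, which sum to one.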

The final score for each option is then obtained by summing over the values of each parameter multiplied by its corresponding weight (Eq. 5):

$$ V_{j} = \sum_{k=1}^{n} W_{k}\, g_{kj}, $$
(5)

where Vj is the final score for the jth option, Wk is the weight of the kth parameter, gkj is the value of the kth parameter for the jth option, and n is the number of parameters.
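Applied to raster layers, the weighted sum of Eq. 5 is evaluated cell by cell. The sketch below uses small NumPy arrays as stand-ins for the fuzzy rasters; the layer values, the weights, and the function name `weighted_overlay` are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def weighted_overlay(layers, weights):
    """Cell-by-cell sum of the layers multiplied by their weights (Eq. 5)."""
    stack = np.stack([np.asarray(layer, dtype=float) for layer in layers])
    w = np.asarray(weights, dtype=float)
    # Contract the weight vector with the layer axis of the stack.
    return np.tensordot(w, stack, axes=1)

# Hypothetical 2x2 fuzzy membership rasters for two parameters.
oc_layer = [[1.0, 0.0], [0.5, 0.5]]
p_layer = [[0.0, 1.0], [0.5, 1.0]]
fertility = weighted_overlay([oc_layer, p_layer], [0.6, 0.4])
```

Each output cell is the score of one location, so cells with high membership in the heavily weighted layers receive the highest fertility scores.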

According to Saaty (2005), a consistency ratio (CR) smaller than or equal to 0.1 indicates that the pair-wise comparison matrix is acceptably consistent:

$$ CR = \frac{(\lambda_{\max} - n)/(n - 1)}{RI}, $$
(6)

where λmax is the largest eigenvalue of the pair-wise comparison matrix, n is the number of compared elements (the order of the matrix), and RI is the random index.
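A minimal sketch of the consistency check in Eq. 6 is given below. The random index values are the ones commonly tabulated for Saaty's scale and should be checked against the source used; both example matrices and the function name `consistency_ratio` are hypothetical.

```python
import numpy as np

# Commonly tabulated random index (RI) values for matrices of order 1 to 9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def consistency_ratio(pairwise):
    """Eq. 6: consistency index (lambda_max - n)/(n - 1) divided by RI."""
    a = np.asarray(pairwise, dtype=float)
    n = a.shape[0]
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    lambda_max = np.linalg.eigvals(a).real.max()
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]

# A perfectly consistent matrix and a deliberately inconsistent one.
cr_consistent = consistency_ratio([[1, 2, 4], [0.5, 1, 2], [0.25, 0.5, 1]])
cr_inconsistent = consistency_ratio([[1, 2, 0.25], [0.5, 1, 2], [4, 0.5, 1]])
```

The first matrix yields a ratio of essentially zero, whereas the second contains a circular preference and exceeds the 0.1 threshold, so its judgments would need to be revised.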

Analytic network process (ANP)

The ANP is an extended version of the AHP that is commonly used to solve decision-making problems. The ANP was proposed to resolve a deficiency of the AHP, which considers only hierarchical relations between the main objective, the criteria, and the options. In contrast to the AHP, the ANP incorporates a network structure to model the relations between sub-criteria and their parent criteria and is thereby more practical for real-world decision-making applications.

Network structure and pair-wise comparison

The first step of the ANP model is to create a network structure based on the relations between the criteria, sub-criteria, options, and objectives. The pair-wise comparison matrix is then defined between these relations. Similar to the AHP method, the ANP model performs pair-wise comparisons on a standard scale against which the significance of each option is weighted. According to Saaty, the consistency ratio for the ANP model, like that for the AHP method, must not exceed 0.1 to ensure the consistency of the system.

Super matrix

The scores obtained from the pair-wise comparison matrices are inserted into the super matrix for each corresponding criterion. The super matrix shows the preference of one element over another in relation to a specific criterion. Figure 3 shows the structure of the super matrix.

Fig. 3 Pairwise comparison matrix, factor weights, of the data layers used

Here, Cm is the mth cluster, emn is the nth element in the mth cluster, and Wij is the block of priority weights (derived from the principal eigenvectors of the pair-wise comparison matrices) for the elements compared between the jth and ith clusters. If there is no edge from the jth node to the ith node in the network structure, then Wij is equal to zero. The weighted super matrix, in which the elements of each column sum to 1, is then multiplied by itself until its entries converge. The resulting matrix is known as the limit matrix. The elements of the limit matrix show the final score for each option (Publications et al. 2007).
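The repeated multiplication that leads to the limit matrix can be sketched as follows. This is an illustration under simplifying assumptions: the 3 × 3 weighted super matrix is hypothetical, the function name `limit_matrix` is not taken from any ANP software, and convergence is only expected for well-behaved (for example primitive) column-stochastic matrices.

```python
import numpy as np

def limit_matrix(weighted_supermatrix, tol=1e-10, max_iter=100):
    """Square a column-stochastic matrix repeatedly until its entries stop changing."""
    m = np.asarray(weighted_supermatrix, dtype=float)
    if not np.allclose(m.sum(axis=0), 1.0):
        raise ValueError("each column of the weighted super matrix must sum to 1")
    for _ in range(max_iter):
        squared = m @ m
        if np.abs(squared - m).max() < tol:
            return squared
        m = squared
    raise RuntimeError("powers of the super matrix did not converge")

# Hypothetical network of three mutually dependent elements.
weighted = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
limit = limit_matrix(weighted)
```

In the converged matrix all columns are identical, and any one column can be read as the final priority vector. In this symmetric example each element ends up with a priority of one third.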

Feature selection

As shown in Fig. 4, the feature selection process is carried out in four steps: subset generation, subset evaluation, termination (based on a stopping criterion), and validation (Chen et al. 1988). Weka v.3.8 was deployed in this study as the machine learning tool for selecting the relevant data required for the preparation of corn cultivation maps for the study area. Three search methods were used: Best-First, Greedy-Stepwise, and Ranker. CFS-Subset-Evaluator was utilized with both the Best-First and Greedy-Stepwise approaches, while Information-Gain-Attribute-Evaluation, Gain-Ratio-Attribute-Evaluation, Symmetrical-Uncertainty-Attribute-Evaluation, Relief-Attribute-Evaluation, and Principal-Components were used with the Ranker method.
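To make the ranking idea concrete, the sketch below scores discretized attributes by information gain and sorts them, which is the general principle behind an information-gain ranker. It is not the Weka implementation used in the study; the function names, the discretized soil classes, and the suitability labels are all hypothetical.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(feature, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    feature = np.asarray(feature)
    labels = np.asarray(labels)
    conditional = 0.0
    for value in np.unique(feature):
        mask = feature == value
        conditional += mask.mean() * entropy(labels[mask])
    return entropy(labels) - conditional

def rank_features(table, labels):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain(col, labels) for name, col in table.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)

# Hypothetical discretised attributes for four samples.
table = {
    "Mn": ["low", "high", "low", "high"],
    "OC": ["low", "low", "high", "high"],
}
labels = ["unsuitable", "unsuitable", "suitable", "suitable"]
ranking = rank_features(table, labels)
```

In this toy data set OC separates the two classes perfectly and is ranked first, while Mn carries no information about the label and is ranked last, so it would be the first candidate for removal.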

Fig. 4 Process of feature selection algorithm

Performance evaluation

The average number of misclassified samples (ANMS) was used to evaluate the feature selection algorithms (FSA) (Naseriparsa et al. 2014) and select the best model. ANMS is obtained by summing the number of misclassified points over the models and dividing the result by the total number of data points. ANMS can be calculated according to Eq. 7 (Dash and Liu 2003):

$$ {\text{ANMS}} = \frac{{\sum\limits_{i = 1}^{n} {{\text{MS}}_{i} } }}{N}, $$
(7)

where MSi is the number of misclassified samples for the ith model, n is the number of models, and N is the total number of data points.
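Eq. 7 and the resulting model selection can be sketched as below. The misclassification counts are hypothetical, N is assumed here to be the number of soil samples, and the function name `anms` is introduced only for the example.

```python
def anms(misclassified_counts, total):
    """Eq. 7: summed misclassification counts divided by N."""
    if total <= 0:
        raise ValueError("total must be positive")
    return sum(misclassified_counts) / total

# Hypothetical misclassification counts per model for two evaluators.
counts = {
    "CFS-Subset-Evaluator": [3, 4, 5],
    "Information-Gain-Attribute-Evaluation": [6, 5, 7],
}
N = 34  # ASSUMED here to be the number of soil samples

scores = {name: anms(ms, N) for name, ms in counts.items()}
best_method = min(scores, key=scores.get)  # the lowest ANMS is preferred
```

The evaluator with the smallest ANMS misclassifies the fewest samples on average and would be retained.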

Results and discussion

Inverse distance weighting (IDW)

A total of 45 soil samples were selected randomly from the study region. Raster maps were then prepared with the IDW model for each soil parameter: organic content (OC), phosphorus (P), potassium (K), iron (Fe), zinc (Zn), manganese (Mn), and copper (Cu) (Fig. 5). As observed in Fig. 5a, the OC of the soil ranged from 0.18 to 3.2%, with only certain parts in the north and south of the study area showing OC values above the critical level (Table 2). The P content in the southern and northern sectors of the area was generally higher than the critical value (P > 18.5 mg/kg) (Fig. 5b). Overall, the K value of the soil was fairly appropriate for the entire study area, with the exception of the southern and northern parts (K < 260 mg/kg) (Fig. 5c). The Fe value of the soil ranged between 0.8 and 35.9 mg/kg, with the most suitable values (more than 6.5 mg/kg) located in the eastern parts of the study area (Fig. 5d). The Zn value was between 0.1 and 2.9 mg/kg, with only a few small parts in the east and south of the study area showing suitable Zn content (Fig. 5e). The Mn value of the soil in the northern and southern regions of the study area was higher than the critical value (Mn > 10 mg/kg), indicative of suitable conditions for corn cultivation (Fig. 5f). Finally, the results of the IDW method showed that surface soil in the study area had a Cu value between 0.01 and 0.23 mg/kg, which, being below the critical level in Table 2 (1 mg/kg), was not sufficient for corn production (Fig. 5g).

Fig. 5 Raster maps for each of the parameters using inverse distance weighting (IDW). OC (a), P (b), K (c), Fe (d), Zn (e), Mn (f), and Cu (g)

Fuzzy model

The fuzzy maps for each parameter were prepared in accordance with Eq. 3. The results of the fuzzy model are shown in Fig. 6 for all parameters. As can be observed, the larger share of the region, with the exception of the southern areas and small parts in the north of the study area, had low OC content and, therefore, was in need of fertilization (Fig. 6a). P, Fe, and Zn contents were quite suitable in the entire study area, excluding the southern and northern parts which displayed values close to one (Fig. 6b, d, e). In contrast, most of the study area, apart from the southern and northern sectors, displayed K values close to one according to the fuzzy map (Fig. 6c). Furthermore, the fuzzy maps indicate that most of the study area, except certain parts in the eastern and central regions, had suitable values of Mn (Fig. 6f). Moreover, the northern and southwestern sectors of the study area had Cu values around 1 (Fig. 6g).

Fig. 6 Fuzzy maps for each parameter for determining the soil fertility for corn. OC (a), P (b), K (c), Fe (d), Zn (e), Mn (f), and Cu (g)

Analytic hierarchy process (AHP) and analytic network process (ANP)

The final soil fertility map was prepared in the ArcGIS environment from the fuzzy maps and the weights obtained using the AHP method. As shown in Table 3, the most influential factor for soil fertility was OC content (with a weight of 0.35), whereas the least significant factor was Mn (with a weight of 0.05) for the entire study area. As shown in Fig. 7, the final fuzzy map was generated from the fuzzy maps and the weights (measured using AHP, Table 3) obtained for each parameter.

Table 3 Pairwise comparison matrix for soil fertility of corn production using analytic hierarchy process

Fig. 7 Fuzzy-AHP combination map for soil fertility for corn

The ANP method was also incorporated in this study as a means of preparing a corn cultivation zone map using the weighted maps obtained with the Index Overlay method (Fig. 8). According to Fig. 8, as the number of classes increases, the suitable area grows larger in size. The corresponding information layers were ultimately combined using the raster calculator tool to form the final map in the ANP method (Fig. 9). As is evident from this figure, areas located to the north, northeast, southeast, and east of the study area, shown in brown, are more suitable for corn cultivation.

Fig. 8 Classes of each factor for corn cultivation: OC (a), P (b), K (c), Fe (d), Zn (e), Mn (f), Cu (g)

Fig. 9 ANP map for soil fertility for corn

Feature selection algorithm

Tenfold cross-validation and the entire training set were used for classification. To select the best method, the numbers of correctly and falsely predicted classes (false positives and false negatives) were first determined for each method, from which the corresponding ANMS values were obtained. According to the results, the CFS-Subset-Evaluation method outperformed the others, with the same precision for determining the suitable regions for corn cultivation. As is evident from the data, the greatest impact belongs to OC, K, Fe, and P (Table 4). Thus, the data for these four parameters were used as inputs for preparing soil fertility maps using the ANP and the AHP methods (Fig. 10). As observed in Fig. 10a, the southern regions of the study area (brown color) are more suitable for corn cultivation based on the results of the fuzzy-AHP method, while the results obtained by applying the ANP method to the relevant data identified by the feature selection algorithms showed that the northern and eastern parts of the study area were more desirable for corn cultivation (Fig. 10b).

Table 4 Evaluation of classes using the LMT

Fig. 10 Soil fertility for corn using (a) AHP and feature selection algorithm, and (b) ANP and feature selection algorithm

There appears to be little difference between the soil fertility maps obtained using the entire data set and those obtained using only the relevant data, for both the ANP and the fuzzy-AHP methods. Therefore, removing data redundancies and using only relevant data could still attain acceptable results at a lower cost and time complexity. To compare the two models and select the optimal approach to locating suitable areas for corn cultivation, 20 points were randomly selected and fed to each model. The results were then compared, as shown in Fig. 11.

Fig. 11 Comparison of two methods and feature selection with the temperature values

As can be seen from Fig. 11, increases in crop yield accompany increases in soil fertility as obtained from both methods. In some cases, however, decreases in yield were accompanied by increases in the value of the fuzzy-AHP map. Accordingly, the ANP method applied to the relevant data identified by the feature selection algorithm was judged to have higher accuracy than the AHP method in determining soil fertility for corn cultivation.

Conclusions

Corn, as an economically significant plant, is largely used as feed for poultry and livestock, as well as in the pharmaceutical industry and in the preparation of starch, alcohol, and acids. Considering the importance of this plant, this study sought to put forward a model for locating suitable areas for corn cultivation with respect to soil fertility. The AHP and ANP methods were used to locate suitable areas for corn cultivation. To select relevant data and save time and cost, a feature selection algorithm was incorporated along with the two methods to pinpoint suitable areas for the cultivation of this plant. Based on the results, the ANP method with feature selection achieves higher accuracy in generating soil fertility maps for corn cultivation than the fuzzy-AHP method and, because it uses only relevant data, it is also more cost-effective.