1 Introduction

Investigating land use and cover change (LUCC) is important for promoting insightful management of Earth's land and for curbing environmental damage. Moreover, LUCC is a complex process that involves interactions among environmental, economic and social systems at different temporal and spatial scales. Computational frameworks are the technique most used to simulate LUCC models because of their ability to cope with this complexity.

Agent-based models (ABMs) have been incorporated into LUCC models, and into many other real-world problems, to explicitly simulate the effects of human decisions in complex situations. They are based on the multi-agent system paradigm, which features autonomous entities that interact and communicate in a shared environment. These entities perceive the environment, reason about it and act on it to achieve an internal objective. ABMs can therefore capture emergent phenomena and provide an original description of the modeled system.

The Multi-Agent System for Environmental simulation (MASE) is freeware developed at the University of Brasilia. MASE implements a land use agent-based model and serves as a tool for exploring the potential impacts of land use policies [28]. Considering its purpose and reliance upon external data, MASE may be characterized as a predictor-type agent-based simulation (ABS) model [12]: a data-driven model with the overall goal of performing medium- to long-term predictions. MASE simulations were calibrated to match available GIS data [4]. Simulation results were validated according to a standard methodology for spatially explicit simulations [27] and then compared to similar frameworks [29]. MASE's performance was found to be higher than that of 13 other LUCC modeling applications built with nine different traditional peer-reviewed LUCC models [27]. Despite this, the lack of uncertainty assessment and sound experimentation remains the main reason for criticism and questioning of the real contribution of such frameworks to decision support for LUCC.

According to [3], any ABS has levels of uncertainty and errors associated with it. ABS continues to harbor subjectivity, and hence degrees of freedom, in the structure and intensity of agents' interactions, learning, and adaptation [18]. There is a significant chance of finding results that are the consequence of biases. Furthermore, almost every ABS review has expressed the need for statistical methods to validate models and evaluate results, so as to improve the transparency, replicability and general confidence in results derived from ABS. These problems continue to be underestimated and are often neglected. Some authors [12] likewise argued that validation is one of the most important aspects of model building because it is the only means of providing evidence that a model can be used for a particular purpose. However, at least 65% of the models in their survey were incompletely validated. Of the models validated in some way, surprisingly, fewer than 5% used statistical validation techniques. Traditionally, ABS are difficult to analyze given their non-linear behavior and size [6].

The treatment of uncertainty is particularly important, and usually difficult, in the case of stochastic ABMs. While acknowledging the differences in data sources and the causes of inconsistencies, there is still a need to develop methods to optimally extract information from the data, to document the uncertainties and to address common methodological challenges. Looking away could reinforce inconsistent results and damage the integrity and quality of simulation results.

This work aims to briefly discuss how uncertainty is being portrayed in ABS and to perform an exploratory study that uses statistical methods to estimate uncertainty in a LUCC agent-based prediction simulation tool. The MASE system is the simulator under study, and the Cerrado case study simulations [29] are the basis for the analysis. As a first investigation step, we assessed the uncertainty within the inputs and configuration parameters of the simulation. Our final goal is to document and quantify this uncertainty and to foresee the impacts of its propagation on the results. A particular challenge in performing measurements is coming up with appropriate metrics. Thorough experimentation and repeatability would therefore improve our understanding of the uncertainty and of the relations among the variables that characterize a simulation. The remainder of the paper is structured as follows. In Sect. 2, we present some background on uncertainty and in Sect. 3 some related work. In Sect. 4, we summarize the MASE characteristics and case study, and we also present the methodology for the exploratory study. In Sect. 5 we show results together with discussion. In Sect. 6 we conclude with a summary.

2 Overview of Uncertainty in ABS

The relevance of the treatment of uncertainty depends on the modeling objective. Requirements regarding model uncertainty may be less critical for social learning models, where communication and interaction among stakeholders are of more significance. Conversely, the parameters, measurements, and conditions used for model runs have a much stronger influence on data-based predictions of future states. Projection, forecasting and prediction models are usually strongly affected by deviations of the system output from observed values.

There are also different sources of uncertainty that can influence the prediction of a simulation model. Uncertainty can arise from simulation variability in stochastic simulation models or from structural uncertainty within the assumptions of a model. We will emphasize input uncertainty, which McKay [24] defined as incomplete knowledge of the 'correct' values of model inputs, including model parameters. If the inputs of a model are uncertain, there is an inherent variability associated with the output of that model. It is therefore crucial to communicate this variability effectively to stakeholders and technical audiences when reporting model predictions.

Uncertainty in environmental prediction simulations may limit the reliability of predicted changes. This issue is one of the recurrent conclusions of the Intergovernmental Panel on Climate Change (IPCC). Back in 1995, the IPCC stated that “uncertainties in the simulation of changes in the physical properties have a major impact on confidence in projections of future regional climate change” [13] and that it was necessary to reduce uncertainties to increase future model capabilities and improve climate change estimates. Since 2010, the IPCC has dedicated an integral feature of its reports to communicating the degree of certainty of its assessment findings [23]. In its most recent report, the IPCC assesses a substantially larger knowledge base of scientific, technical and socio-economic literature to reduce uncertainty and uses a larger number of methods and formalizations [7]. Especially for future predictions, validating a model's predictive accuracy is not straightforward, due to a lack of appropriate data and methods for 'validation' [15]. That is another reason why applications, frameworks, and methods of formalization in this research area are relevant and should be promoted.

Regarding the type of modeling, some approaches, such as Bayesian networks, are able to deal explicitly with uncertainty in the interpretation of data, measurements or conditions. In contrast, other approaches, such as ABMs, require the development of comprehensive and compelling analyses of output data and a great deal of resource-intensive attention [18]. The level of testing required to develop this understanding is rarely carried out, mainly due to time and other resource constraints [15].

Indeed, uncertainty assessment in ABM can be a hard task even for relatively small models. Due to their inherent complexity, ABS are often seen as black boxes, where there is no purpose in explaining why the agents acted as they did as long as the modeler presents some form of validation (i.e., shows a good fit). According to Marks [22], ABM simulations can prove existence, but not, in general, necessity. Despite that, there is a research effort to make ABS more transparent and to demonstrate that simulations behave as intended, through standardization of simulation model analysis and result sharing [21]. Beyond verification, uncertainty assessment aims to increase understanding, to improve the reliability of the predicted changes and to inform the degree of certainty of key findings. To support this effort, techniques and methods such as uncertainty and sensitivity analysis should be part of the modeling process.

Uncertainty quantification is defined as the identification, characterization, propagation, analysis and reduction of uncertainties. Sensitivity analysis (SA) is defined as the study of how uncertainty in the output of a model can be apportioned to different sources of uncertainty in the model input [30] and is a method to assess the propagation of uncertainties. SA answers the question of which inputs are responsible for the variability of the outputs. Local SA explores the output changes by varying one parameter at a time while keeping all the others constant. Although it is a useful and straightforward approach, it may be location dependent. Global SA gives a better estimate of uncertainty by varying all parameters at the same time, using probability density functions to express the uncertainty of the model parameters. Uncertainty analysis is a broader uncertainty-propagation practice related to SA; it focuses on quantifying uncertainty in model output, addressing the variability of results. Ideally, uncertainty analysis and SA should be run in tandem.
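
To make the local/global distinction concrete, the sketch below approximates local (OAT) sensitivities by finite differences around a baseline point; the model function is a generic toy placeholder, not any particular LUCC model.

```python
# Local one-at-a-time (OAT) sensitivity sketch: perturb each input in turn around a
# baseline while holding the others fixed. The model() below is a toy placeholder.
import numpy as np

def model(x):
    """Toy response surface standing in for a simulation output."""
    return 2.0 * x[0] + x[1] ** 2 + 0.5 * x[0] * x[2]

baseline = np.array([1.0, 2.0, 3.0])
delta = 0.01  # 1% relative perturbation per input

for i in range(baseline.size):
    perturbed = baseline.copy()
    perturbed[i] *= 1.0 + delta
    # Local sensitivity: change in output per unit change in input i,
    # valid only near the chosen baseline (hence the location dependence noted above).
    sensitivity = (model(perturbed) - model(baseline)) / (perturbed[i] - baseline[i])
    print(f"x[{i}]: local sensitivity ~ {sensitivity:.3f}")
```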

3 Related Work

There is a growing number of attempts to assess uncertainty in ABS. However, there is a lack of specific guidance on the effective presentation and analysis of simulation output data. There is a variety of approaches to quantify or reduce uncertainty. The work of [18] offers an overview of state-of-the-art methods in the social simulation area, in particular examining the issues around variance stability, SA and spatiotemporal analysis. Because of our interest in LUCC simulations, we chose to review how those approaches are being applied and communicated in spatially explicit simulations.

In [1], the authors propose an algorithm as an alternative to traditional goodness-of-fit validation to answer whether the agents in a simulation are behaving as expected. To them, the key to effective interaction in multi-agent applications is to reason explicitly about the behavior of other agents, in the form of a hypothesized behavior. This approach allows an agent to contemplate the correctness of a hypothesis. In the form of a frequentist hypothesis test, the algorithm allows for multiple metrics in the construction of the test statistic and learns its distribution during the interaction process. It is an interesting approach to addressing the uncertainties within the model and the agents' behavior. We believe it would be even more effective if coupled with an uncertainty quantification technique.

The work of [26] assesses the uncertainty that is characteristic of spatially explicit models and simulations. The authors propose a benchmarking scheme for LUCC modeling tools based on various validation techniques and error analysis. They investigate LUCC tools that rely on map comparisons to analyze the accuracy of LUCC models in terms of quantity, pixel-by-pixel correctness and LUCC components such as persistence and change. They also investigated the map outputs of these simulations to test the fidelity of spatial patterns and the congruency of simulation maps from different modeling tools. Although the variability of LUCC models does not allow strict comparisons, there is still room for improvements in methodologies, validation and uncertainty quantification.

The work of [8] assesses model output analysis through a global SA, a commonly used approach for identifying the critical parameters that dominate model behavior. The authors use the Problem Solving environment for Uncertainty Analysis and Design Exploration (PSUADE) software to evaluate the effectiveness and efficiency of widely used qualitative and quantitative SA methods. Each method is tested using a variety of sampling techniques to screen the most relevant parameters from the insensitive ones. The Sacramento Soil Moisture Accounting (SAC-SMA) model, which has thirteen tunable parameters, is used for illustration, with the South Branch Potomac River basin near Springfield, West Virginia, in the U.S. chosen as the study area. The authors show how different sampling methods and SA measures can indicate different sensitive and insensitive parameters, and that a comprehensive SA is paramount to avoid misleading results.

The work of [20] also performed a global SA to show which model parameters are critical to the performance of land surface models. The authors considered 40 adjustable parameters in the Common Land Model and compared different SA methods and sampling techniques, also varying the size of each sample. The sampling techniques and SA measures that were considered optimal differed from those found by [8], meaning that not all LUCC ABS propagate uncertainty in the same way.

Another approach was taken by [17], also in a LUCC model, using the method of independent replications. In their case study, the authors replicated the simulation 12 times for each mechanism and computed the mean values of the impact indicators and their confidence intervals (CI) at a reliability of 95%. They used uncertainty quantification to define a minimum certainty threshold for the simulation outputs.

All these authors used several indicators to measure the variability of model results under changing input parameters. Table 1 gives a brief comparison of those works. MASE's exploratory uncertainty assessment is described in the next sections. A large panel of statistical tools exists to help with the accuracy of predictions, such as Dakota, PSUADE [32], UQ-PyL, MEME Suite and MC2MABS [2]. There are also initiatives to apply the potential of classic Design of Experiments (DOE) to ABS [16, 21]. The ABS field of research would benefit from systematic empirical research with standardized procedures, but the idiosyncrasies of ABS model output make the task even harder. Researchers have so far failed to reach consensus and to determine sound methodological guidelines; hence the studies are still mostly investigative and exploratory.

Table 1. Overview of the general characteristics of each related work

4 MASE Exploratory Study

The objective of the MASE project is to define and implement a multi-agent tool for simulating environmental change. MASE enables the modeling and simulation of LUCC dynamics using a configurable user model. The multi-agent architecture is composed of three hierarchical layers (from top to bottom) [29]: a User Interface (UI) layer, a Pre-processing layer and an Agent layer. In the agent layer, there are cell agents representing land units hosting natural processes, such as crop/forest growth, and transformation agents, representing human agents and their behavior as farmers or cattle ranchers.

The Cerrado-LUCC model of MASE is used as a test problem. The simulations depict the land use and cover changes of the most endangered biome in Brazil. The Cerrado is the second-largest biome in South America and harbors significant endemism and biodiversity. Its landscape has been undergoing severe transformation due to the advance of cattle ranching and soy production. To promote transparency and replicability, the Cerrado-LUCC simulation model was documented and described using the standard ODD protocol (Overview, Design concepts, and Details) [10, 11]. We also applied empirically grounded ABM mechanisms for the characterization of agent behaviors and attributes in socio-ecological systems [31]. In this article, we provide some core information on MASE and the Cerrado-LUCC model, mainly about the parameters and outputs. Readers interested in the details of this model and in the implementation of the MASE multi-agent system should refer to [28, 29], respectively.

The input of the simulation is a pair of raster grid maps of the region's land cover from two different time periods (an initial and a final map). In addition, each simulation carries a set of maps describing the physical characteristics of the environment, such as water courses, water bodies, slope, buildings, highways, environmentally protected areas, and territorial zoning.
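
Purely to make this input set concrete, a single run's inputs might be bundled as in the sketch below; the field names and file names are illustrative assumptions and do not reflect MASE's actual input format.

```python
# Hypothetical input bundle for one simulation run. Field and file names are
# illustrative only; they are not MASE's actual configuration interface.
simulation_inputs = {
    "land_cover_initial": "cerrado_land_cover_t0.tif",  # initial land cover map
    "land_cover_final": "cerrado_land_cover_t1.tif",    # final map used for calibration
    "environment_maps": [                                # physical characteristics
        "water_courses.tif",
        "water_bodies.tif",
        "slope.tif",
        "buildings.tif",
        "highways.tif",
        "protected_areas.tif",
        "territorial_zoning.tif",
    ],
}
```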

The simulations are calibrated from the two time steps and project the land use and cover change for future steps. The result of a MASE simulation is a pair of predicted maps (Fig. 1) with the allocation of change, plus a set of metrics calculated at runtime. The resulting image is submitted to a goodness-of-fit measurement, and the quality and errors of the quantity of change and the allocation of land use change are calculated.

Fig. 1. A land cover predicted map of the Cerrado in Federal District, Brazil

Methodology

The objective is to perform an exploratory analysis, based on classical statistics, to reduce uncertainty and to understand how the model behaves. The MASE LUCC model is placed under input uncertainty investigation to calculate the influence of the inputs on the simulation output. For exploratory purposes, we want insight into the parameters that affect the multi-agent system implementation, so we selected a subset of the Cerrado-LUCC model inputs for this demonstration. The subset of input parameters of the multi-agent system is displayed in Table 2: TA, the number of transformation agents; TG, the number of group transformation agents; IE, the potential of individual exploration; and GE, the potential of group exploration. These parameters characterize the instantiation of MASE agents and should therefore be analyzed with regard to uncertainty.

Table 2. MASE multi-agent input configuration parameters

The number of transformation agents reflects the number of computational agents (in the multi-agent system paradigm) instantiated in a simulation run. In this case study, one agent does not represent one single individual. The Cerrado-LUCC model was formulated based on an empirical characterization of agent behaviors, proposed by [31], with two basic steps: the development of behavioral categories and the scaling to the whole population of agents. TA was derived from the Brazilian Agricultural Census of 2006 and comprises a set of producer legal statuses. The range of 1 to 100 is an abstraction of the 3407 registered producers in the region, who may be active or inactive in a given period. The details of this agent characterization are thoroughly illustrated in [29]. A particular type of agent is TG, which represents not an individual but an organization, cooperative or business. Its range is an abstraction of the 548 group producers, 10 of which have permanent exploration licenses.

The potential of exploration, whether individual or of a group, represents the impact an agent can produce on the natural vegetation cover of a cell during a step. In the Cerrado-LUCC model, considering the deforestation process, the potential of exploration is again an abstraction of the amount of wood (in \(m^3\)) that can be obtained from a particular grid cell, up to a nominal limit that represents resource depletion.

In addition to the final LUCC maps, the simulation generates a set of metrics as results, mainly spatial analysis measurements, which include a pixel-by-pixel comparison, a quantity agreement and an allocation agreement. Those measurements are statistical LUCC indices proposed by [27] to determine the accuracy of the produced maps. They include an objective function called the figure of merit (FoM), the ratio between correctly predicted changes and the sum of observed and predicted changes. To evaluate the response of the model to the different parameters, the experiments considered the outputs described in Table 3 and tried to identify and quantify the influence of the simulation input configurations on the model outputs. The identification (ID) of each output follows the numbering of its generation in the .csv file produced by MASE at the end of each simulation.
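
In the two-category (change/persistence) case, the FoM can be written as follows; this is our restatement of the standard formulation in [27], not a new definition:

\[
FoM = \frac{\mathit{hits}}{\mathit{misses} + \mathit{hits} + \mathit{false\ alarms}},
\]

where hits are observed changes correctly predicted as change, misses are observed changes predicted as persistence, and false alarms are observed persistence predicted as change. The FoM ranges from 0 (no overlap between observed and predicted change) to 1 (perfect overlap).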

Table 3. MASE output parameters

To identify and analyze these uncertainties, we applied an elementary effects (EE) method of global SA to the MASE LUCC model. For this calculation, we used PSUADE, the software package developed by Tong [32], which contains various methods for parameter studies, numerical optimization, uncertainty analysis and SA.

Screening methods are based on a discretization of the inputs into levels, allowing a fast exploration of the system behavior [14]. The aim of this type of method is to identify the non-influential inputs with a small number of model calls. The most used screening method is based on the one-parameter-at-a-time (OAT) design, where each input is varied while the others are kept fixed. Simplicity is one of OAT's advantages, but there are drawbacks when applying it to ABMs: it does not consider parameter interactions and may cover only a small fraction of the input space.

The EE method we chose to apply is the Morris method (MOAT), proposed by [25] and refined by [5], an expansion of the OAT approach that forsakes the strict OAT baseline. This means that a change in one input is maintained when examining a change in the next input, and the procedure is repeated multiple times with randomly selected initial parameter settings. EE is well suited to spatially explicit simulations, which are usually computationally expensive models with large input sets.

MOAT allows the inputs to be classified into three groups: inputs having a negligible effect, inputs having large linear effects without interactions, and inputs having significant non-linear and interaction effects. The overall effect and the interaction effect of each parameter can be approximated by the mean \(\mu \) and standard deviation \(\sigma \) of the gradients of that parameter over the r sampled trajectories.
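
For a model \(y=f(x_1,\ldots ,x_n)\), these quantities follow the standard Morris formalization (our notation, not extracted from the MASE implementation): the elementary effect of input \(x_i\) along one trajectory, and its mean, modified mean and standard deviation over the r trajectories,

\[
EE_i = \frac{f(x_1,\ldots ,x_i+\Delta ,\ldots ,x_n)-f(x_1,\ldots ,x_n)}{\Delta }, \quad
\mu _i = \frac{1}{r}\sum _{j=1}^{r} EE_i^{(j)}, \quad
\mu _i^{*} = \frac{1}{r}\sum _{j=1}^{r} \bigl |EE_i^{(j)}\bigr |, \quad
\sigma _i = \sqrt{\frac{1}{r-1}\sum _{j=1}^{r}\bigl (EE_i^{(j)}-\mu _i\bigr )^{2}}.
\]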

The MOAT sampling technique was designed specifically for the MOAT method. The work of [8] details how MOAT sampling works: the range of each parameter is partitioned into \(p-1\) equal intervals, so that the parameter space becomes an n-dimensional, p-level orthogonal grid in which each parameter can take on values from these p determined levels.

First, r points are randomly generated from the orthogonal grid; then, for each of the r points, further sample points are generated by perturbing one dimension at a time. The sample size is therefore \((n+1)\cdot r\). Regarding sample size, [19] report that at least \(10 \cdot n\) samples are needed to identify the key factors among the parameters.

To avoid sample-size effects, we set a minimum sample size of 800 for the four inputs, well above the \(10\cdot n = 40\) lower bound. For MOAT sampling we used \(r = 160\) replications, resulting in a sample size of \((4+1)\cdot 160 = 800\).
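
For readers who want to reproduce this kind of screening outside PSUADE, the sketch below sets up an equivalent MOAT design with the open-source SALib library; the run_mase wrapper and all parameter bounds except TA's 1-100 range are illustrative assumptions, not MASE's actual interface.

```python
# Morris (MOAT) screening sketch using SALib, an illustrative stand-in for the
# PSUADE workflow described in the text, not the authors' actual setup.
import numpy as np
from SALib.sample import morris as morris_sample
from SALib.analyze import morris as morris_analyze

problem = {
    "num_vars": 4,
    "names": ["TA", "TG", "IE", "GE"],
    # TA's 1-100 range comes from the model description; the other bounds are placeholders.
    "bounds": [[1, 100], [1, 100], [1, 100], [1, 100]],
}

def run_mase(params):
    """Hypothetical wrapper: run one simulation and return an output such as the FoM.
    A toy linear response is used here so the sketch runs end to end."""
    ta, tg, ie, ge = params
    return 0.4 * ge + 0.3 * ta + 0.05 * ie + 0.01 * tg

# r = 160 trajectories on a 4-level grid -> (n + 1) * r = (4 + 1) * 160 = 800 model runs.
X = morris_sample.sample(problem, N=160, num_levels=4)
Y = np.array([run_mase(x) for x in X])

Si = morris_analyze.analyze(problem, X, Y, num_levels=4)
for name, mu_star, sigma in zip(problem["names"], Si["mu_star"], Si["sigma"]):
    print(f"{name}: mu* = {mu_star:.3f}, sigma = {sigma:.3f}")
```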

Moreover, as in other stochastic models, it is not advisable to draw conclusions from a single MASE simulation run. For an initial uncertainty assessment, we applied the method of independent replications proposed by [9]. We ran the model approximately eighty-five thousand times (an arbitrary choice made to explore the whole input parameter space) and randomly clustered the results into five independent replication groups. We computed the mean values of the outputs and their confidence intervals (CI) at a reliability of 95%. Another approach to estimating the uncertainty of the model output is to study the variance of the model outputs by using the coefficient of variation (CV), the ratio of the standard deviation \(\sigma \) of a sample to its mean \(\mu \), which allows the variability of different frequency distributions to be compared.
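
As an illustration of this replication analysis, the sketch below computes the grand mean, a 95% CI and the CV from replication group means; the numbers are placeholders, not the values reported in Table 4.

```python
# Independent-replications analysis sketch (placeholder numbers, not Table 4 values).
import numpy as np
from scipy import stats

# Mean output (e.g., FoM) of each of the b = 5 independent replication groups.
replication_means = np.array([40.0, 47.5, 38.0, 51.5, 42.0])
b = replication_means.size

z_bar = replication_means.mean()             # grand mean, \bar{Z}
var_hat = replication_means.var(ddof=1)      # unbiased variance estimate, \hat{V}_R
half_width = stats.t.ppf(0.975, df=b - 1) * np.sqrt(var_hat / b)
print(f"95% CI for the expected output: [{z_bar - half_width:.2f}, {z_bar + half_width:.2f}]")

# Coefficient of variation (sigma / mu) of a raw output sample, here a synthetic one.
sample = np.random.default_rng(0).normal(loc=z_bar, scale=np.sqrt(var_hat), size=1000)
print(f"CV = {sample.std(ddof=1) / sample.mean():.3f}")
```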

5 Results

In the current work, we analyzed the four input parameters displayed in Table 2, regarding the multi-agent configuration of the MASE LUCC model. First, we present the results of the SA. Figure 2 presents the EE of the Cerrado-LUCC model parameters. Figure 2 (left) illustrates the modified means of the MOAT gradients and their spreads based on bootstrapping. The results show that GE and TA are the most sensitive parameters, in terms of having the largest medians (26.466 and 25.205, respectively). The other two parameters have median sensitivities close to zero, indicating that their impact on the simulation output is minimal.

Fig. 2. Parameter sensitivity rankings of the MOAT method

Figure 2 (right) is a MOAT diagram that shows a consensus view of the mean \(\mu \) and standard deviation \(\sigma \) of the gradients of each parameter over the r sampled trajectories. The more sensitive the parameter, the closer it is to the upper right corner of the graph. These results show a positive correlation between input and output uncertainties. Since GE and TA describe the amount of land transformation in a simulation, high values of these parameters will increase the model output. GE is the most sensitive parameter, followed by TA. Understanding and reducing the uncertainty in these two variables will therefore reduce the uncertainty of the simulation as a whole.

GE represents the amount of land cover that is transformed by a group of human agents in a cell of the map. GE is a sensitive value because it indicates the voracity and speed of current land exploitation, which directly affects the result of the simulation. GE is probably sensitive because the socio-economic groups responsible for large-scale cattle ranching and permanent agriculture are the principal drivers of deforestation in the Cerrado. Their rates of land change are more significant than the number of groups, which explains why TG is insensitive with respect to the output. As for TA, the more agents are instantiated in a simulation, the more land cover is affected and the higher the land use transformation rates become. Conversely, the potential of exploration of a single individual is less determinant than the number of single individuals acting on the land, with the SA indicating TA as a sensitive and IE as an insensitive parameter.

To further investigate the MOAT sensitivity results, we used different numbers of replications r and different numbers of levels p to confirm the relevance of the parameters, as displayed in Fig. 3. It is possible to see that, even within the same method, results may vary. The results for four replications are not very consistent with the other replication results, mainly with respect to the mean. The results with \(r=56\), \(r=108\) and \(r=160\) present only minor variations. We can infer that four replications are not enough to successfully identify parameter sensitivity in the MASE model, and the number of replications should therefore be higher to be effective.

Fig. 3. Sensitivity of parameters for different numbers of replications r

Table 4 summarizes the basic output statistics of the MASE LUCC model. Each replication is indexed by \(i=[1 \ldots 5]\), the coefficient of variation of each replication is denoted by \(CV_{i}\), and the mean over all replications by \(\bar{Z}\). We performed independent replications to verify the variation of the indicators, and for an initial analysis we treat this variation as noise (uncertainty). Conclusions about impacts on predictions can only be drawn if the observed changes are greater than this uncertainty rate. We therefore have a first threshold for deciding whether a result is meaningful, relative to the simulations' behavior.

We also estimated the expected average FoM for the simulations, using the five grouped replication results (\(b=5\)). Considering \(\bar{Z}_{FoM}=43.87\) and the estimated variance \(\hat{V}_{R}=100.99\), we obtain an approximate \(100(1-\alpha )\%\) two-sided CI for \(\theta \), according to the formalization proposed by [9]. For level \(\alpha =0.05\), we have \(t_{0.025, 4}=2.78\), which gives [31.39, 56.34] as a 95% CI for the expected FoM of MASE simulations.
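
Writing the interval out explicitly (standard independent-replications CI; the small discrepancy with the reported bounds comes from rounding the t quantile):

\[
\bar{Z}_{FoM} \pm t_{0.025,\,4}\sqrt{\frac{\hat{V}_{R}}{b}}
= 43.87 \pm 2.78\sqrt{\frac{100.99}{5}}
\approx 43.87 \pm 12.5 .
\]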

Table 4. Coefficient of variation for MASE outputs

6 Conclusions

In this study, we first identified the most sensitive parameters of the MASE LUCC model using MOAT SA. We investigated the sampling design and sample size needed for MOAT to screen the parameters effectively. Although these conclusions are model-specific, they corroborate the possible variation among sampling techniques and SA methods.

This paper is the first exploratory study towards quantifying uncertainty within MASE simulations. Follow-up experiments must be done to bring more standardization to this effort through the application of Design of Experiments. We look forward to investigating the model parameters further, analyzing the remaining inputs beyond the agents' quantities and their impacts.

The presented results allow us to understand the uncertainty involved in defining the parameters of the simulation of the LUCC model under study. Our impression is that the uncertainty is very high, which means that either the model needs to improve dramatically or LUCC policy needs to be re-evaluated. Most simulation tools fail to validate models and to state the uncertainty in simulation results. Consequently, policymakers and the general public develop opinions based on misleading research that fails to give them the appropriate interpretations required to make informed decisions. Efforts to assess ABMs through statistical methods are paramount to corroborating and improving the level of confidence in the research that has been done in LUCC simulation.