1 Introduction

Understanding social phenomena is hard. There is all the complexity found in other fields of enquiry, but with additional difficulties due to our embedding as part of what we are studying. Despite these difficulties, understanding our own nature is naturally important to us, and our social aspects are a large part of that nature. Indeed, some would go as far as saying that our social abilities are the defining features of our species (e.g. Dunbar 1998). The project of understanding human societies is so intricate that we need to deploy any and all means at our disposal. Simulation is but one tool in this vast project, but it has the potential to play an important part.

This chapter considers how and to what extent computer simulation helps us to understand the social complexity we see all around us. It starts by discussing two simulations in order to raise the key issues that this project involves, before moving on to highlight the difficulties of understanding human society in more detail. The core of the chapter is a review of some of the different ways in which simulation can be used, with examples of each. It then briefly discusses a conceptual framework for organising these different uses. It ends with a frank assessment of the actual progress made in understanding human societies using simulation techniques.

1.1 Example 1: The Club of Rome’s “Limits to Growth” (LTG)

In the early 1970s, a simulation study was published on behalf of an international group under the name “The Club of Rome” (Meadows et al. 1972), with the aim of convincing humankind that it faced some serious issues, in terms of a coming population, resource and pollution catastrophe. To do this the authors developed a system-dynamics model of the world. They chose a system-dynamics model because they felt they needed to capture some of the feedback cycles between the key factors – feedbacks that would not come out in simple statistical projections of the available data. They developed this model and ran it, publishing the findings – a number of model-generated future scenarios – for a variety of settings and variations. The book (“Limits to Growth”) considered the world as a single system and postulated some relationships between macro variables, such as population, available resources and pollution. Based on these relations it simulated what might happen when the feedbacks between the various variables were allowed to operate. The results of the simulations were the trajectories this model produced when run forward into future dates. They indicated that a critical point in time was coming and that a lot of suffering would result, even if humankind managed to survive it.
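
To make the style of model concrete, the following is a minimal sketch of a system-dynamics model of this general kind. The three stocks, their coefficients and the update equations are invented for illustration – this is not the World3 model – but it shows how coupled feedback loops between aggregate variables, stepped forward in time, can generate overshoot-and-collapse trajectories of the kind LTG reported.

```python
# A minimal, purely illustrative system-dynamics sketch (NOT World3).
# Three coupled stocks are stepped forward with simple Euler updates;
# all coefficients are invented for this example.

def run_world(years=200, dt=1.0):
    population, resources, pollution = 1.0, 1.0, 0.0
    history = []
    for step in range(int(years / dt)):
        birth_rate = 0.03 * resources          # growth requires resources
        death_rate = 0.01 + 0.05 * pollution   # pollution raises mortality
        consumption = 0.005 * population       # resource-depletion feedback
        emission = 0.004 * population          # pollution feedback
        absorption = 0.01 * pollution          # slow natural recovery (a delay loop)

        population += dt * population * (birth_rate - death_rate)
        resources = max(0.0, resources - dt * consumption)
        pollution = max(0.0, pollution + dt * (emission - absorption))
        history.append((step * dt, population, resources, pollution))
    return history

for t, pop, res, pol in run_world()[::25]:
    print(f"t={t:5.0f}  population={pop:6.3f}  resources={res:5.3f}  pollution={pol:5.3f}")
```

Even in this toy version the qualitative LTG behaviour appears: population grows while resources last, pollution lags behind population, and the interaction of the two feedbacks eventually turns growth into decline.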

The book had a considerable impact, firmly placing on the public agenda the possibility that humankind could not simply continue to grow indefinitely. It also attracted considerable criticism (e.g. Cole et al. 1973), mainly aimed at the plausibility of the model’s assumptions and the sensitivity of its results to those relationships (for example, it assumed that growth would be exponential and that feedback delays were long). The book presented the results of the simulations as predictions – a series of what-if scenarios. Whilst the authors did add caveats and explore various possible versions of their model, depending on what connections there turned out to be in the world-system, the overall message of the book was unmistakeable: that if we did not change what we were doing, by limiting our own economic consumption and population, disaster would result. This was a work firmly in the tradition of Malthus (1798) who, 175 years earlier, had predicted a constant state of near-starvation for much of the world based upon a consideration of the growth processes of population and agriculture.

The authors clearly hoped that by using a simulation (albeit a simplistic one by present standards) they would be able to make the potential feedback loops real to people. Thus this was a use of simulation to illustrate an understanding that the authors of LTG had. However, the model was not presented as such, but as something more scientific in some sense. A science-driven study that predicted such suffering was a definite challenge to those who thought the problem was less severe. By publishing their model and making it easy for others to replicate and analyse it, they offered critics a good opportunity for counter-argumentation.

The model was criticised on many different grounds, but the most effective criticism was that the model was sensitive to the initial settings of some parameters (Vermeulen and de Jongh 1976). This raised the question of whether the model had to be finely tuned in order to produce the behaviour claimed; since the parameters were highly abstract and did not directly correspond to anything measurable, the applicability of the model to the world we live in was thus questioned. Its critics concluded that, since the model did not produce reliable predictions, it could be safely ignored. It also engendered the general perception that predictive simulation models are not credible tools for understanding human socio-economic changes – especially for long-term analyses – and discouraged their use in supporting policy-making for a while.

1.2 Example 2: Modelling 1st Millennium Native American Society

A contrasting example to the Club of Rome model is the use of simulation models to assess and explore explanations of population shifts among the Native American nations in the pre-Columbian era. This has been called “Generative Archaeology” (GA) by Tim Kohler (2009). Here a spatial model of a population was developed in which movement was fitted to a wealth of archaeological and climatological data, in order to find and assess possible explanations of the size, distribution and change of the populations that existed in the first millennium AD in the Southwest US. This case offers a picture of settlement patterns in the context of relatively high-resolution reconstructions of changes in climate and resources relevant to human use of these landscapes.

The available data in this case is relatively rich, allowing many questions to be answered directly. However, many interesting aspects are not directly answerable from a static analysis of the data, for example those about the possible social processes that existed. The problem is that different archaeologists can inspect the same settlement pattern and generate different candidate processes (explanations) for its generation. Here agent-based modelling helps infer the social processes (which cannot be directly observed) from the detailed record over time. This is not a direct or certain inference, since there are still many unknowns involved in that process.

In Kohler et al. (2005, 2008) the use of agent-based modelling (ABM) has mainly been to see what patterns one should expect if households were approximately minimizing their caloric costs for access to adequate amounts of calories, protein, and water. The differences through time in how well this expectation fits the observed record, and the changing directions of departure from those expectations, provide a completely novel source of inference on the archaeological record. Simulations using the hypothesis of local food sharing during periods of mild food shortage may be compared to the fit in a simulation where food sharing does not occur. In this way we can get indirect evidence as to whether food sharing took place.

The ABM has hence allowed the comparison of a possible process with the recorded evidence. This comparison is relative to the assumptions that are built into the model, which tend to be plausible but questionable. However, despite the uncertainties involved, one is able to make a useful assessment of the possible explanations, and the assumptions are explicitly documented. This approach to processes that involve complex interaction would be impossible without a computer simulation. At the very least, such a process reveals important new questions to ask (and hence new evidence to search for) and the points at which the plausible explanations are demonstrably inadequate. However, for any real progress in explaining such cases, a very large amount of data seems to be required.

1.3 Some Issues That the Aforementioned Examples Illustrate

The previous examples raise a few issues, common to much social simulation modelling of human societies. These will now be briefly defined and discussed as an introduction to the problem of understanding social phenomena using simulation.

  1. Abstraction. Abstraction is a crucial step in modelling observed social phenomena, as it involves choices about which aspects are or are not salient in relation to the problem and what level of analysis is appropriate. The LTG example, being a macro-model, assumes that distributive aspects such as geography and local heterogeneity are less important with respect to feedbacks among global growth variables. In this model all the detail of a whole world is reduced to the interaction of a few numeric variables. The GA model was more specific and detailed, including an explicit 2D map of the area and the position of settlements at different times in the past. It is fair to say that the LTG model was driven by the goals of its modellers, i.e. showing that the coming crisis could be sharp due to slow feedback loops, whilst the GA model was more driven by the data that was available, with the model being used to try to answer a number of different questions afterwards.

  2. Replicability. Replicability is the extent to which a published model has been described in a sufficiently comprehensive and transparent way that the simulation experiments can be independently reproduced by another modeller. Replicability may be considerably easier if care is taken to verify the initial model by means of debugging tests, and also if the original source code is effectively available and well commented. Here the LTG model was readily replicable and hence open to inspection and criticism. The GA models are available to download and inspect, but their very complexity makes them hard to independently replicate (although this has been done).

  3. Understanding the model. A modeller’s inferential ability is the extent to which one can understand one’s own model. Evidence suggests that humans can fully track systems of only about two or three variables and five or six states (Klein 1998); for higher levels of complexity, additional tools are required. In many simulations, especially those towards the descriptive end of the spectrum, the agents have many behavioural rules which may interact in complicated and unpredictable ways. This makes simulations very difficult to fully understand and check. Even in the case of a simple model, such as the LTG model, there can be unexpected features (such as the fine tuning that was required). Although the GA was rich in its depiction of the space and environment, the behavioural rules of sub-populations were fairly simple and easy to follow at the micro-level. However, this does not rule out subtle errors and complexities that might result from the interaction of the micro-elements. Indeed this is the point of such a simulation: we cannot work out these complex outcomes ourselves, but require a computer program to do so.

  4. Prediction vs. Understanding. The main lesson to be drawn from the history of formal modelling is that, for most complex systems, it is impossible to model their evolution with accuracy beyond an immediate timeframe. Whilst the broad trends and properties may to some degree be forecast, the particulars – e.g. the timing and scale of changes in the aggregate variables – generally cannot (Moss 1999). The LTG model attempted to forecast the future, not in terms of the precise levels but in terms of the presence of a severe crisis – a peak in population followed by a crash. The GA does not aim to strongly predict anything, but rather seeks to establish plausible explanations for the data that is known. Most simulations of human society restrict themselves to establishing explanations, the simulations providing a chain of causation that shows that the explanation is possible.

  5. Going beyond what is known. In social science there are a multitude of gaps in our knowledge, and social simulation methods may be well placed to address some of them. Given some data, and some plausible assumptions, simulations can be used to perform experiments that are consistent with the data and assumptions, and then inspected to answer other questions. Clearly this depends on the reliability of the assumptions chosen. In the GA case this is very clear: a model with a food-sharing rule and one without can be compared to the data, to see which fits better. The LTG model attempts something harder: making strong assumptions about how the aggregate variables relate, it “predicts” aspects of the future. In general, the more reliable the assumptions and data (hence the less ambitious the attempt at projection), the more credible the result.

A social scientist who wants to capture key aspects of observed social phenomena in a simulation model faces many difficulties. Indeed, the differences between formal systems and complex, multi-faceted and meaning-laden social systems are so fundamental that some criticise any attempt to bridge this gap (e.g. Clifford 1986). Simulators have to face these difficulties, which affect how social simulation is done and how useful (or otherwise) such models may be. We briefly consider seven of these difficulties here.

  • Firstly, there is the sheer difference in nature between the formal models (i.e. computer programs) that modellers use and the social world that we observe. The former are explicit, precise, with a formal grammar, predictable at the micro-level, reproducible, and work in (mostly) the same way regardless of the computational context. The latter is vague, fluid, uncertain, subjective, implicit and imprecise – it often seems to work completely differently in similar situations, and its operation seems to rely on the rich interaction of meaning in a way that is sometimes explicable but usually unpredictable. In particular, the gap between essentially formal symbols with precise but limited meaning and the rich semantic associations of the observed social world (for example as expressed in natural language) is particularly stark. This gap is so wide that some philosophers have declared it unbridgeable (e.g. Lincoln and Guba 1985; Guba and Lincoln 1994).

  • Secondly, there is the sheer variability, complication and complexity of the social world. Social phenomena seem to be at least as complex as biological phenomena but without the central organising principle of evolution as specified in the neo-Darwinian synthesis. If there are any general organising principles (and it is not obvious that this is the case) then there are many of them, each with differing (and sometimes overlapping) domains of application. In that sense, it is clear that a model will always capture only a small part of the phenomenon among many other related aspects, drastically reducing the possibility of prediction with any degree of certainty.

  • Then there is the sheer lack of adequate multifaceted data about social phenomena. Social simulators always seem to have to choose between longitudinal studies, narrative data, cross-sectional surveys or time-series data; having all of these about a single social process or event is, to date, very unlikely. There does not seem to be the emphasis on data collection and measurement in the social sciences that there is in some other sciences, and certainly not the corresponding prestige for those who collect data or invent ways of doing so.

  • There is the more mundane difficulty of building, checking, maintaining, and analysing simulations (Galán et al. 2009). Even the simplest simulations are beyond our complete understanding; indeed, that is often why we need them, because there is no other practical way to find out the complex ramifications of a set of interacting agents. This presence of emergent outcomes in the simulations makes them very difficult to check. Ways to improve confidence that our simulations in fact correspond to our intentions for them include: unit testing, debugging, and the facility for querying the database of a simulation (see Chap. 8 (David 2013) in this handbook); a minimal example of the first of these is sketched after this list. Perhaps the strictest test is the independent replication of simulations – working from the specifications and checking their results to a high degree of accuracy (Axtell et al. 1996). However, such replication is usually very difficult and time-consuming, even in relatively simple cases (Edmonds and Hales 2003).

  • Another difficulty is the inevitability of background assumptions in all we do. There is always a wealth of facts, processes and affordances that give meaning to, and provide the framework for, the foreground actions and causal chains that we observe. Many of these are not immediately apparent to us, since they are part of the contexts we inhabit and so are not perceptually salient. This is the same as in other fields where, as has been argued elsewhere, the concept of causation only makes sense within a context (Edmonds 2007). However, it does seem that context is more critical in the social world than in others, since it can change not only the outcomes of events but their very meaning (and hence the kind of social outcome). Whilst in other fields it might be acceptable to represent extra-contextual interferences as some kind of random distribution or process, this is often manifestly inadequate with social phenomena (Edmonds and Hales 2005).

  • The uncertainty behind the foreground assumptions in social simulation is also problematic. Even when we are aware of all of the assumptions, they are often either too numerous to include in a single model or else we simply lack any evidence as to what they should be. For example, there are many social simulation models which include some version of inference, learning or decision-making within the agents of the model, even when there is no evidence as to whether this actually corresponds to that used by the observed actors. It seems that it is often simply hoped that these details will not happen to matter much in the end – thus becoming a rarely checked, and sometimes wrong, aspect of simulations (Edmonds 2001; Rouchier 2001).

  • Finally, there is a difficulty arising from the nature of simulation itself. A simulation will demonstrate possible processes that might follow from a given situation (relative to the assumptions on which the simulation is built). It does not show all the possibilities, since it could happen that a future simulation will produce the same outcomes from the same set-up in a different way (e.g. using a different cognitive model). Thus simulation differs in its inferential power from analytic models (e.g. equation-based ones), where the simplicity of the model can allow formal proofs of a general formulation of outcomes, which may establish the necessity of conditions as well as their adequacy. This difficulty is the same as the one that plagues many mathematical formulations since, in their raw form, they are often unsolvable, and hence either one has to use numerical simulation of results (in which case one is back to a simulation) or one has to make simplifying assumptions (in which case, depending on the strength of these assumptions, one does not know if the results still apply to the original case).
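
As a small illustration of the first of these verification techniques, the sketch below unit-tests a single micro-rule in isolation from the simulation in which it would be embedded. The rule (a hypothetical ‘share of unlike neighbours’ calculation, of the kind a segregation model might use) is invented for this example; the point is that hand-computable cases can pin down a rule’s intended behaviour, including edge cases, before emergent outcomes make errors hard to trace.

```python
import unittest

# Unit-testing a hypothetical simulation micro-rule in isolation.
# The rule itself is invented for illustration.

def share_unlike(neighbours, own_type):
    """Fraction of neighbours whose type differs from own_type."""
    if not neighbours:
        return 0.0  # an isolated agent has no unlike neighbours
    return sum(1 for n in neighbours if n != own_type) / len(neighbours)

class TestShareUnlike(unittest.TestCase):
    def test_all_like(self):
        self.assertEqual(share_unlike(['A', 'A', 'A'], 'A'), 0.0)

    def test_half_unlike(self):
        self.assertEqual(share_unlike(['A', 'B'], 'A'), 0.5)

    def test_isolated_agent_is_not_an_error(self):
        self.assertEqual(share_unlike([], 'A'), 0.0)

if __name__ == '__main__':
    unittest.main()
```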

These difficulties bring up the question of whether some aspects of societies can be understood by means of modelling at all. This hypothesis – that simulation is a credible method to explore, understand or explain social processes – is implicitly tested in the current volume and is discussed in some detail below. We are not going to take any strong position but will restrict ourselves to considering examples within the context of their use. Agent-based social simulation is not a magic bullet and is not yet a mature technique. It is generally accepted in the social simulation community that the best results will be achieved by combining social simulation with other research methods.

2 Styles of Modelling and Their Impact on Simulation Issues

2.1 Models of Evidence vs. Models of Ideas

One response to the above difficulties is not to model social phenomena directly, but rather to restrict ourselves to modelling ideas about social phenomena. This is a lot easier, since our ideas are necessarily a lot simpler and more abstract than the phenomena themselves (an approach which can be formalised with the notion of pattern modelling (Grimm et al. 2005) rather than strict adequacy to data). Some ideas need modelling, in the sense that the ramifications of the ideas are themselves complex. These kinds of models can be used to improve our understanding of the ideas, and later this understanding can be applied in a rich, flexible and context-sensitive way. This distinction is made clear in (Edmonds 2001).

Of course, to some extent any model is a compact abstraction of the final target of modelling – there will, presumably, be some reason why one conceptualises what one is modelling in terms of evidence or experience by someone, and there will always be some level of theory/assumption that motivates the decision as to what can be safely left out of a model. Thus all models are somewhat about ideas and, hopefully, all models have some relation to the evidence. However, there is still a clear difference between those models that take their primary structure from an idea and those whose primary considerations come from the available evidence. For example, the former tend to be a lot simpler than the latter. The latter will tend to have specific motivations for each feature, whilst the former will tend to be motivated in general terms. These two kinds of simulation have a close parallel with the theoretical and phenomenological models identified by Cartwright (1993).

Unfortunately these kinds of model are often conflated in academic papers. This seems frequently not to be deliberate, but rather due to the strong theoretical spectacles (Kuhn 1962) that simulation models seem to provide. There is nothing like developing and playing with a simulation model for a while to make one see the world in terms of that model – not only is the model one’s own creation and best effort at formulating an aspect of the social world, but one has interacted with it and changed it to include the features one thinks it should have. Nevertheless, whatever the source, it can take some careful “reading between the lines” to determine the exact nature of any model and what it purports to represent.

2.2 Modelling as Representation of Social Phenomena vs. as an Intervention in a Social Process

It must be said that some simulation models are not intended to represent anything, but rather are created for another purpose, such as a tool for demonstrating an approach or an intervention in a decision-making process. This may be deliberate and explicit, or not, for various different reasons. Of course, if a computer model does not represent anything at all, it is not really a simulation but simply a computer program, which may be presented in the style of a simulation. Also, for a simulation to be an effective tool for intervention it will have to have some credibility with the participants.

However, in some research the representation is either not the primary goal or what it seeks to represent is deliberately subjective in character. Thus in some participatory approaches (see Chap. 11, Barreteau et al. 2013) the primary goal may be to raise awareness of an issue, or to intervene in or facilitate a social process such as a negotiation or consensus-building within a group of people. The modeller may focus not so much on whether the model captures an objective reality but rather on how stakeholders understand the issues and processes of concern and how this might influence the outcomes. This does not mean that there will be no elements that are objective and/or representative in character – for example such models might have a well-validated hydrological component – but the parts of the model that are the focus are checked against the opinions of those being modelled, or those with an interest in the outcomes, rather than any independent evidence.

Of course, this is a matter of degree – in a sense most social simulations are a mixture of objective aspects linked to observations and other aspects derived from theories, opinions, hypotheses and assumptions. In the participatory approaches the modeller seeks not to put their own ideas forward, but rather to take the, possibly more democratic, approach of being an expert facilitator expressing the stakeholders’ opinions and knowledge. Whilst some researchers might reject such ideas as too “anecdotal” to be included in a formal model, it is not obvious that the stakeholders’ ideas about the nature of the processes involved (for example, how the key players make decisions) are less reliable than the grander theories of academics. However, researchers do have a professional obligation to be transparent and honest about their opinions, documenting assumptions to make them explicit and, at the very least, not stating things that they think are false. Thus, although participatory approaches are not a world away from more traditional ways of using simulation, they do have some different biases and characteristics.

2.3 Context and Social Simulation

Human knowledge, and particularly human social knowledge, is usually not context-free. That is, there is a set of background assumptions, facts, relationships and meanings that may be necessary and generally known but not made explicit. These background features can all be associated with the context of the knowledge (Edmonds 1999). In a similar way, most social simulation takes a particular context as given; thus, for example, the environment in which racial segregation occurs might be obvious to all concerned. This context is sometimes indicated in papers but is often left implicit. The context of a simulation is associated with the uncountably many background assumptions that can be ignored, either because they are irrelevant or because they are fixed in that context. Social simulation would probably be impossible if one were not able to assume a context whose associated assumptions need not be questioned for a given model (Edmonds 2010). Without such an effective restriction of scope, every social simulation model would have to include all potential aspects of human behaviour and social interaction. Whilst such assumptions concerning the context are common to almost all fields of knowledge, context is particularly powerful in the social sciences because we unavoidably use our folk knowledge of social situations to make sense of the studied social phenomena. This process of (social) context identification is often automatic, so that we correctly identify the appropriate context without expending much conscious thought. For this reason the context is often left implicit, despite the fact that it can impact upon the use of simulation in understanding the phenomena. This leaves the decisions as to what to implement as foreground, deliberate decisions.

Choosing a social context which is relatively identifiable and self-contained is an important factor if one is seeking to represent some evidence in a simulation. Being able to include all the important factors of some social process, and to obtain some evidence for their nature, allows the building of simulations that are not misleading, in the sense of not missing out factors that might critically change the outcomes. Clearly, the more restricted the context, the easier the representational task. However, in this case one does not know whether what one learns from the simulation is applicable in other contexts. Using a simulation developed for one context and purpose in a different context and/or for a different purpose might well lead to misleading conclusions (Edmonds and Hales 2005; Lucas 2010).

Those simulations which are more focused on exploring an idea will often seek to transcend context, in the hope that the models will have some degree of generality – these often deliberately ignore any particular context. Although they may seem general, their weakness can become apparent when their applicability is tested. Here the ideas they represent might give some useful insights, but may be misleading if taken as the defining feature of a specific case study. Clearly, a simulation that is claimed to have general applicability needs to have been validated across the claimed scope before being relied upon by others. To date, no social simulation has been found to be generally applicable beyond theoretical and illustrational purposes (Lucas 2011).

3 A Plethora of Modelling Purposes with Examples

Given the different purposes for which simulation models are used (Epstein 2008), they will be considered in groups with similar goals, since it is only relative to their goals that simulation efforts can be judged. It is now widely acknowledged that authors should state the purpose of their models clearly before describing how the model is constituted (Grimm et al. 2006). Firstly, however, it is worth reviewing two goals that are widely pursued in many other fields but have not been convincingly attained with respect to the simulation of human society.

The first of these goals is that of predicting what will definitely happen in unknown circumstances. In other words, social simulation cannot yet make accurate and precise predictions. The nearest social simulations have come (to our knowledge) is in predicting some outcomes in situations where the choices are very constricted and the data available is comprehensive. The clearest case of this is the use of micro-simulation models to predict the final outcome of elections once about 30% of the results are already known (Curtice and Firth 2008). This is hardly a case of new or unknown circumstances, and it is not immune from surprises, since sometimes the predictions are wrong. The model in question is a micro-simulation model that relies on the balance between parties in each constituency and then translates the general switches between parties (and non-voters) to the undeclared results. Thus, although it is a prediction, its very nature rules out counter-intuitive or surprising predictions, and it falls more into the category of extending known data than of prediction. The gold standard for prediction is making predictions of outcomes that are unexpected but true.
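
A toy sketch of this style of projection is given below. It is not the method of the cited study – the function and data here are invented – but it shows the basic logic: estimate the average swing per party from the declared constituencies and apply it uniformly to the previous results of the undeclared ones.

```python
# A toy uniform-swing projection (illustrative only; NOT the actual
# published election-forecasting method).

def project(declared, undeclared_previous):
    """declared: list of (previous_shares, current_shares) pairs per
    constituency; undeclared_previous: previous shares per undeclared
    constituency. Shares are dicts keyed by party."""
    parties = declared[0][0].keys()
    swing = {p: sum(cur[p] - prev[p] for prev, cur in declared) / len(declared)
             for p in parties}
    return [{p: round(prev[p] + swing[p], 3) for p in parties}
            for prev in undeclared_previous]

declared = [({'A': 0.50, 'B': 0.50}, {'A': 0.45, 'B': 0.55}),
            ({'A': 0.60, 'B': 0.40}, {'A': 0.57, 'B': 0.43})]
print(project(declared, [{'A': 0.52, 'B': 0.48}]))
# -> [{'A': 0.48, 'B': 0.52}]: the average 4-point swing from A to B
#    is applied to the undeclared constituency.
```

Because the projection simply shifts known results by an observed average, it can only ever produce outcomes in line with the declared pattern – which is exactly why such models cannot make surprising predictions.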

The second goal simulations do not achieve is decisively testing sociological hypotheses – in other words, they do not convincingly show that any particular idea about what we observe occurring in human societies can be relied upon or comprehensively ruled out. Here the distinction between modelling what we observe and modelling our ideas is important. A simulation that attempts to model what we observe is a contingent hypothesis that may always be wrong. However, social simulations of evidence are always dependent on a raft of supportive assumptions – that a simulation fails to reproduce the desired outcomes may be due to the failure of any of its assumptions. Of course, if such a model is repeatedly tested against evidence and fails to be proved wrong we may come to rely upon it more (Popper 1963), but this success may be for other reasons (e.g. we simply have not tested it in sufficiently diverse conditions). Hypothesis testing using simulations is always relative to the totality of assumptions in the simulations, and thus the gain in certainty is, at best, incremental and relative. Thus the core assumptions of a field may be preserved by adjusting “auxiliary” aspects (Lakatos and Musgrave 1970).

If a simulation is about ideas then a very restricted kind of test is possible: a counter-example. If it has been assumed that factor A will lead to result B, then one might be able to show in a plausible simulation that this need not be the case. Indeed, the simulation may show that to obtain result B from factor A an additional and implausible assumption C is necessary. This does prove that “it is not necessarily the case that A leads to B”, and it may shift the burden of proof back onto those who have assumed that A will lead to B. This very restricted test is only useful if the context of causation between A and B is appropriately identifiable. The use of a simulation to establish counter-examples is considered below.

A particular case of the search for counter-examples is that of testing for the “existence of a sufficient condition” for some particular result. For example, it may be possible to show that there is no need to add some particular hypothesis for a phenomenon to take place, as in economics where it can be shown that in many cases the assumption of perfect rationality for agents does not need to be made.

One might be disappointed that simulation provides neither predictions nor proofs (in the stronger senses of those terms), but that does not stop it being useful in other ways, as the sections below illustrate.

In the following, we look at how simulations might contribute to the understanding of human societies in a number of different ways, with examples from the literature. Unfortunately, many articles describing social simulation research do not make their goals explicit (as advocated by ODD; see Polhill et al. 2008 and Chap. 7, Grimm et al. 2013), so the categorisation below is that of the chapter’s authors and not necessarily the category that the authors of the papers discussed would choose. It also appears that some researchers have multiple purposes for their simulations, or simply have not thought about their goals clearly.

3.1 Abstract Goals

First we consider simulations that have more abstract goals, i.e. those that tend to be more about ideas and theories than observed evidence (as discussed above in Sect. 2.1).

3.1.1 Illustration of Ideas

Simulations can be good ways of making processes clear, because a simulation demonstrates a process by actually performing it. Thus, if one can unpack a simulation to show how the outcomes result from a set-up or mechanism, then this can demonstrate an idea clearly and dramatically. Of course, if how the outcomes emerge from the set-up in a simulation is opaque and/or difficult to understand, then this is not an effective technique. For this reason this tends to be done using relatively simple simulations that are specifically designed to bring out the idea in focus.

An example is Rouchier and Thoyer (2006), which models voting and lobbying in the EU decision-making process. It does make fairly strong assumptions about how the voting strategies might operate, but it does not pretend to be a descriptive model. Instead it makes clear how the links between public opinion, lobbying groups and elected representatives might operate at the national scale as well as the EU one.

Another example is Gode and Sunder (1993), a fairly simple demonstration that in some cases market institutions are so constraining that agents do not even need to be clever to achieve excellent results. They take the example of the Continuous Double Auction (CDA), a two-sided progressive auction, which is the protocol most used in financial markets. At any moment, buyers can submit bids (offers to buy). Similarly, sellers can submit asks (offers to sell). Both buyers and sellers may propose an offer or accept the offer made by others. The main constraint is an improvement rule, imposed on new offers entering the market, which requires that submitted bids/asks be at a price higher/lower than the standing bids/asks. Each time an offer is satisfactory to one of the participants, he or she announces the acceptance of the trade at the given price, and the transaction is completed. Once a transaction is completed, the agents who have traded leave the market and the bid-offer process starts again following the same rule, from any price. The result of Gode and Sunder’s simulation is that even with completely unintelligent (“zero-intelligence”) agents, who know nothing of the market and only follow two constraints – the bid-offer rule described above, and not selling below or buying above their reservation price – the market converges and enables agents to make excellent profits. This paper shows how institutional constraints might act to ensure a reasonable allocation of goods when agents are very clear about the value of the things they want to sell or buy, and that this does not require any other substantive rationality on the part of the agents. Of course this result cannot necessarily be extended to any observed market, which will most of the time be complex, where the agents do have intelligence, where the value of items might be unclear, and where there might be many other social and institutional mechanisms; but at least this result clarifies an idea about why protocols of this kind might be important.
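
The sketch below gives a minimal zero-intelligence version of this protocol. The valuations, integer price grid and single-unit traders are simplifying assumptions for illustration, not Gode and Sunder's exact setup, but the two constraints from the text are both present: offers are random within the trader's reservation price, and each new offer must improve on the standing quote.

```python
import random

# A minimal zero-intelligence continuous double auction (illustrative;
# valuations and details are invented, not Gode & Sunder's exact code).

def zi_cda(buyer_values, seller_costs, max_price=200, rounds=10000):
    buyers, sellers = list(buyer_values), list(seller_costs)
    best_bid, best_ask = 0, max_price
    bidder = asker = None
    trades = []
    for _ in range(rounds):
        if not buyers or not sellers:
            break
        if random.random() < 0.5:
            i = random.randrange(len(buyers))
            if buyers[i] > best_bid:         # can improve and stay within value
                best_bid = random.randint(best_bid + 1, buyers[i])
                bidder = i
        else:
            i = random.randrange(len(sellers))
            if sellers[i] < best_ask:        # can improve and stay above cost
                best_ask = random.randint(sellers[i], best_ask - 1)
                asker = i
        if bidder is not None and asker is not None and best_bid >= best_ask:
            trades.append(best_ask)          # trade at the standing quote
            buyers.pop(bidder)
            sellers.pop(asker)
            best_bid, best_ask, bidder, asker = 0, max_price, None, None
    return trades

random.seed(1)
print(zi_cda([150, 140, 130, 120, 110], [60, 70, 80, 90, 100]))
```

Even though the agents bid entirely at random, no trader ever buys above their value or sells below their cost, so every completed trade creates surplus – the discipline comes from the auction rules, not from the traders.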

The OpenABM project has made significant progress in developing a community of people using illustrative models to facilitate the communication of ideas. Working with others, this group particularly promotes the educational value of agent-based models.

A particular case of using a simulation to illustrate an idea is that of using a simulation in teaching. Whilst demonstrating an idea to one’s peers might lead one to choose a simulation that emphasises the idea’s generality and power, in teaching one may well choose to simplify and highlight certain features of the idea that will be important later on. This is a matter of degree, but tends to result in simulations of a slightly different kind.

For example, researchers at the Oxford University Department of Computer Science have developed a web application to assist students (particularly non-programmers) in understanding the behaviour of systems of interacting agents (Kahn and Noble 2009). They model, for example, the dynamics of epidemics in schools and workplaces, and the effect of vaccination or school closure/quarantine periods upon the spread of disease in the population (Scherer and McLean 2002). The students can quickly and easily test different policies and other parameter combinations or, for more intensive sessions, can work through a series of guided steps to build models from pre-existing modular components or ‘micro-behaviours’ – a process called ‘composing’. The models can also be run, saved, and shared through a web browser in order to facilitate discussion and collaboration, as well as ownership of the ideas and creative thinking.

3.1.2 Establishing the Possibility of a Process

A simulation can be used to show how a mechanism might result in certain outcomes, and thus establish that a proposed process is possible, by demonstrating how it unfolds from the micro-processes in the simulation. This established plausibility of the process is relative to the plausibility of the assumptions behind the simulation – clearly, if the simulation is one that could not convincingly be related to any observed system, then one would not have established that the process is possible in any encountered system, only that it is a theoretical possibility. This does not require that the simulation be an accurate representation of any observed system, since all that is required is that one could imagine that a version of the process in the simulation could occur in a real system.

A classic example of this is Axelrod’s (1984, 1997) work on the evolution of cooperation. Previous models in evolutionary biology had suggested that cooperative behaviour would not be selected within an evolutionary setting, as any group of co-operators would be vulnerable to a single non-cooperative invader or mutant. Axelrod’s books described simulations in which a population of competing individuals evolved, playing repeated games against each other. Some cooperative strategies, in particular ‘tit-for-tat’ (cooperate unless your partner did not last time), were shown to survive and flourish in many game set-ups. Although the simulations described were highly speculative and abstract, they firmly established that it was possible for cooperative strategies to evolve within an evolutionary setting in which selfish strategies had a short-term advantage.
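
The sketch below illustrates the kind of evolutionary tournament involved, under simplifying assumptions (a round-robin population of just two strategies, with replication proportional to payoff – not Axelrod's exact setup). Tit-for-tat cannot be exploited for more than one round, so in a mixed population it accumulates higher total payoffs and spreads.

```python
import random
from itertools import combinations

# A minimal evolutionary iterated prisoner's dilemma (an illustrative
# simplification of an Axelrod-style tournament).

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(own_history, other_history):
    return other_history[-1] if other_history else 'C'

def always_defect(own_history, other_history):
    return 'D'

def play(s1, s2, rounds=50):
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        score1 += p1; score2 += p2
    return score1, score2

def evolve(population, generations=30):
    for _ in range(generations):
        scores = [0.0] * len(population)
        for i, j in combinations(range(len(population)), 2):
            a, b = play(population[i], population[j])
            scores[i] += a
            scores[j] += b
        # Next generation: strategies replicate in proportion to payoff.
        population = random.choices(population, weights=scores, k=len(population))
    return population

random.seed(0)
final = evolve([tit_for_tat] * 10 + [always_defect] * 10)
print(sum(s is tit_for_tat for s in final), "of", len(final), "are tit-for-tat")
```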

One use for establishing the possibility of a process is as a counter-example to an existing assumption or accepted theory, if the process demonstrated contradicts the assumption. Thus the simulations of Axelrod above can be seen as a counter-example to the assumption that cooperative behaviour cannot survive in an evolutionary setting.

The particular case of the Schelling (1969, 1971) model falls into this category. Through very simple simulations, which Schelling ran by hand at the time, he discovered that segregation could be attained at the group level even though no individual agent had a strong preference for segregation. This work was important because it was one of the first examples of emergent phenomena applied to social issues. But the most important element was the counter-intuitive result obtained with the model. Schelling used a very intuitive (though not necessarily realistic) way of describing the change of location of agents in a city, where they are surrounded by neighbours who can be of two different types: identical to themselves or different. Each agent decides if it is satisfied with its location by judging whether the proportion of neighbours that are different is acceptable to it. If this is not the case, it moves to a new location. Even when each agent accepts up to 65% of agents different from itself in its neighbourhood, high levels of segregation result in the global society of agents. This is a counter-example to the assumption that segregation results from a high level of intolerance to those of different ethnic origins, since one can see from the simulation that the emergence of high levels of segregation in cities could be due to the movement of people at the edges of segregated areas who find themselves in regions dominated by those of different ethnicities. Of course, what this does not show is that this is what actually causes segregation in cities; it merely undermines the assumption that segregation must be due to high levels of intolerance.
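
A minimal version of the model is sketched below; the grid size, density and update details are illustrative choices rather than Schelling's original board, but the 65% tolerance from the text is used directly.

```python
import random

# A minimal Schelling-style segregation model (illustrative parameters).
SIZE, TOLERANCE = 20, 0.65    # agents accept up to 65% unlike neighbours

def neighbours(grid, x, y):
    cells = [grid[(x + dx) % SIZE][(y + dy) % SIZE]
             for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    return [c for c in cells if c is not None]   # ignore empty cells

def unhappy(grid, x, y):
    ns = neighbours(grid, x, y)
    unlike = sum(1 for n in ns if n != grid[x][y])
    return bool(ns) and unlike / len(ns) > TOLERANCE

random.seed(2)
cells = ['A'] * 150 + ['B'] * 150 + [None] * 100   # 400 cells, 25% empty
random.shuffle(cells)
grid = [cells[i * SIZE:(i + 1) * SIZE] for i in range(SIZE)]

for _ in range(100000):
    x, y = random.randrange(SIZE), random.randrange(SIZE)
    if grid[x][y] is not None and unhappy(grid, x, y):
        ex, ey = random.randrange(SIZE), random.randrange(SIZE)
        if grid[ex][ey] is None:                   # move to a random empty cell
            grid[ex][ey], grid[x][y] = grid[x][y], None

like = [1 - sum(n != grid[x][y] for n in neighbours(grid, x, y)) / len(neighbours(grid, x, y))
        for x in range(SIZE) for y in range(SIZE)
        if grid[x][y] is not None and neighbours(grid, x, y)]
print(f"average share of like neighbours: {sum(like) / len(like):.2f}")
```

Starting from a random mixture (where the expected share of like neighbours is roughly a half), the moves of unhappy agents typically drive the average share of like neighbours well above the level that any individual agent demands.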

3.1.3 Understanding the Properties of an Abstract Model

With some analytic mathematical models, and a very few very simple simulation models, one might seek to prove some properties of the model – for example, the parameter values under which a given end condition is reached. If this is not possible (the usual case), then one has two basic options: to simplify the original so as to obtain a model that is analytically tractable, or to simulate it. If the simplifications that are necessary to obtain a tractable model are well understood and plausible, then the simplified model might be trusted to approximate the original model (although it is always wise to check). If, to obtain an analytically tractable model, one has to simplify so much that the relationship between the simplified and the original model is suspect (for example, by adding implausibly strong assumptions), then one cannot say that the simplified model is about the same things as the original model. At best the simplified model might be used as an analogy for what was being modelled – it cannot be relied upon to give correct answers about the original target of modelling. If one wants to actually model the original target, then simulation models are the only option, and one might wish to understand the simulation itself by systematically exploring its properties, for example by doing parameter sweeps. In a sense this is a kind of pseudo-maths: trying to get a grasp of the general model properties when analytic proof is not feasible.

An example of such an exploration is Biggs et al. (2009). This examined regime shifts using a fisheries food-web model, in particular looking at the existence of turning points in a system with two attractors (piscivore- and planktivore-dominated regimes). Anthropogenic drivers were modelled as gradual changes in the amount of angling and shoreline development. Simulations were carried out to investigate the onset of regime shifts in fish populations, the possibility of detecting these changes, and the effectiveness of management responses in averting the shift. In relation to angling it was found that shifts could be averted by reducing harvesting to zero at a relatively late stage (well into the transition to the alternate regime), whereas with development, action had to be taken substantially earlier, i.e. the lag time between taking action and the resultant shift was substantially longer. The behaviour of different indicators for anticipating regime shifts was also examined. This is an example of a mathematical model with stochastic elements that is solved numerically by means of a simulation.
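
The sketch below is not the Biggs et al. food-web model, but a generic stochastic system with the same qualitative structure: two attractors, noise, and a slowly increasing driver that eventually eliminates the basin the system starts in.

```python
import random

# A generic bistable system under a slowly increasing driver
# (illustrative; NOT the Biggs et al. fisheries model).
# Drift is -dV/dx for the tilted double-well V(x) = x^4/4 - x^2/2 + d*x,
# so there are attractors near x = +1 and x = -1 while d is small.

random.seed(3)
x, driver, dt = 1.0, 0.0, 0.01
for step in range(220000):
    driver += 0.000002                      # slow anthropogenic pressure
    drift = -(x**3 - x + driver)
    x += drift * dt + random.gauss(0, 0.05) * dt ** 0.5
    if step % 20000 == 0:
        print(f"driver={driver:.3f}  x={x:+.2f}")
```

While the driver stays small, the state merely fluctuates around the upper attractor; once the driver passes a critical value (here around 0.38, where the upper basin disappears), the state drops rapidly to the lower attractor, and slightly reversing the driver will not bring it back – the signature of a regime shift with hysteresis.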

Such stylised models, although based on well-understood processes, are caricatures of real systems and rest on a number of simplifying assumptions. Nevertheless they may provide insights that are applicable to many types of real-world issue. In contrast, some seek to understand the properties of fairly abstract models, aiming to uncover structures and results that might be quite generally applicable. This is directly analogous to pure mathematics, which seeks to establish general structures, theorems and properties that might later be usefully applied as part of the extensive toolbox that mathematics supplies. In this case the usefulness of the exercise depends ultimately on the applicability of the results in practice. The criteria by which pure mathematics is judged – soundness, generality, and importance – can be seen as distinguishing work that is likely to be useful in the future.

An example where an abstract class of mechanisms has been explored thoroughly to establish its general properties is the area of social influence, in particular the sub-case of opinion dynamics. Such work can be found using physics methodologies (Galam 1997) or more artificial-life approaches (Axelrod 1997). The topic in itself is extremely abstract and cannot be validated against data in any direct manner. The notion of culture or opinion that is studied in these models is considerably abstracted, and so hard for any sociologist to accept (von Randow 2003). In this area the most studied mechanism is the creation of consensus or the convergence of culture, represented by a single real number or a binary string (Galam 1997; Deffuant et al. 2000; Axelrod 1997). Many variations and special cases of these classes of model have been undertaken; for a survey see (Lorenz 2007). Some of these studies have indeed used a combination of parameter sweeps of simulations and analytic approximations to give a comprehensive picture of the model behaviour (Deffuant and Weisbuch 2007). Others merely seem to point out possible variations of the model.
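
For instance, the bounded-confidence mechanism of Deffuant et al. (2000) can be written in a few lines, as in the sketch below (the parameter values here are illustrative): agents hold opinions in [0, 1], and a randomly chosen pair compromise only if their opinions are already closer than a threshold.

```python
import random
from collections import Counter

# A sketch of Deffuant-style bounded-confidence opinion dynamics
# (parameter values chosen for illustration).
N, THRESHOLD, MU, STEPS = 200, 0.2, 0.5, 50000

random.seed(4)
opinions = [random.random() for _ in range(N)]

for _ in range(STEPS):
    i, j = random.sample(range(N), 2)
    if abs(opinions[i] - opinions[j]) < THRESHOLD:
        oi, oj = opinions[i], opinions[j]
        opinions[i] += MU * (oj - oi)    # both agents move towards
        opinions[j] += MU * (oi - oj)    # each other by a fraction MU

print(Counter(round(o, 1) for o in opinions))
```

Parameter sweeps over the threshold are exactly the kind of systematic exploration referred to above: with a large threshold the population converges to a single consensus, while smaller thresholds produce two or more stable, mutually isolated opinion clusters.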

Sometimes the exploration of abstract properties of models can result in surprises, showing behaviour that was contrary to expectations, so this category can overlap with the one discussed in the next section (Sect. 3.1.4).

3.1.4 Exploration of the Safety of Assumptions in Existing Models

This is similar to the previous goal, but instead of trying to establish the behaviour of the model as it is, one might seek to explore what happens if any of the assumptions in the model are changed or weakened. Thus here one is seeking to explore a space of possibilities around the original model. The idea behind this is often that one has a hypothesis about a particular assumption that the model is based upon. For example, one might suspect that one would get very different outcomes if one varied some mechanism in the model in (what might seem) a trivial manner. Another example is when one suspects that a certain assumption is unnecessary to the outcomes and can be safely dropped. Thus for this goal one is essentially comparing the behaviour of the original model with that of an altered or extended model.

For example, Izquierdo and Izquierdo (2006) carried out a systematic analysis of the effect of making slight modifications to structural assumptions in the Prisoner’s Dilemma game: in the population size, the mutation rate, the way that pairings were made, etc. all of which produced large changes in the emergent outcome – the frequency of strategies employed. The authors conclude that “the type of strategies that are likely to emerge and be sustained in evolutionary contexts is strongly dependent on assumptions that traditionally have been thought to be unimportant or secondary” (Izquierdo and Izquierdo 2006:181).

How cooperation emerges in a social setting was first fashioned into a game-theoretical problem by Axelrod (1984). The outcome was long thought to depend on defining features such as which strategies are available, what the pay-off values for each strategy are, the number of repetitions in a match, etc., whereas other structural assumptions, thought to be unimportant, were ignored. On further investigation, however, conclusions based on the early work were shown to be rather less general than would be desired, and were sometimes actually contradicted by later work.

A different case is the exploration of the robustness of the simulation described in Riolo et al. (2001). This showed the emergence of a cooperative group in an evolutionary setting similar to the Axelrod one mentioned above. Here each individual had a characteristic (modelled as a number between 0 and 1) and a tolerance in a similar range. Individuals are randomly paired, and if the difference between the partner’s characteristic and its own is less than or equal to its tolerance then it cooperates, otherwise it does not. As a result, a group of individuals with similar characteristics formed that effectively shared with each other. However, later studies (Roberts and Sherratt 2002; Edmonds and Hales 2003) probed the robustness of the model in a number of ways, but crucially by altering the rule for cooperation from “cooperate if the difference between the partner’s characteristic and one’s own is less than or equal to one’s tolerance” to “…strictly less than one’s tolerance”, i.e. from “≤” to “<”. When this change is made, the crucial result – the emergence of a cooperative group – disappears. It turned out that the Riolo et al. (2001) effect relied on the existence of a group of individuals with exactly the same characteristic who had to cooperate with each other, since the smallest possible tolerance was 0. When the existence of completely selfish individuals was made possible by this change, the cooperation disappeared.
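
The sketch below gives a simplified version of such a tag-based donation game, with the comparison operator made an explicit parameter so that the “≤”/“<” probe can be repeated. The population size, pairing, payoff and mutation details are illustrative simplifications of Riolo et al.'s setup, so the exact numbers will differ, but it shows how small the alteration under test is.

```python
import operator
import random

# A simplified tag-based donation game in the style of Riolo et al.
# (2001), with the cooperation comparison ("<=" vs "<") as a parameter.
# Details are illustrative, not the published model.

def run(compare, N=100, generations=300, pairings=3, benefit=1.0, cost=0.1):
    random.seed(5)
    agents = [{'tag': random.random(), 'tol': random.random() * 0.5}
              for _ in range(N)]
    rate = 0.0
    for _ in range(generations):
        scores = {id(a): 0.0 for a in agents}
        donations = 0
        for a in agents:
            for _ in range(pairings):
                b = random.choice(agents)
                if a is not b and compare(abs(a['tag'] - b['tag']), a['tol']):
                    scores[id(a)] -= cost      # donor pays a cost
                    scores[id(b)] += benefit   # recipient gains a benefit
                    donations += 1
        rate = donations / (N * pairings)
        # Tournament reproduction with mutation of tags and tolerances.
        new = []
        for _ in range(N):
            a, b = random.choice(agents), random.choice(agents)
            parent = a if scores[id(a)] >= scores[id(b)] else b
            child = dict(parent)
            if random.random() < 0.1:
                child['tag'] = random.random()
            if random.random() < 0.1:
                child['tol'] = max(0.0, child['tol'] + random.gauss(0, 0.01))
            new.append(child)
        agents = new
    return rate

print("donation rate with <=:", run(operator.le))
print("donation rate with < :", run(operator.lt))
```

Under the “≤” rule, agents with identical tags must donate to one another even at zero tolerance, which is what sustained the cooperative group; under “<”, a zero-tolerance agent never donates, and the cooperation can collapse.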

3.1.5 Exploring Counter-Factual Possibilities

We only observe a few of the possible configurations of the social phenomena around us. Thus it is natural to wonder what might happen if events or processes were other than as observed or known to be the case. This is the world of artificial societies, where possible worlds loosely related to the one observed are explored. Sometimes an analogy with artificial life is made, where alternative algorithmic versions of life in the broadest sense are specified and experimented with – not life-as-it-is but life-as-it-might-have-been.

An extreme example of this is Jim Doran’s model of a society with knowledge of the future (Doran 1997) – this can be thought of as what a society might be like whose members’ predictions of the future happen to be correct. Clearly this is a case that does not hold in human society.

Such explorations might not contribute much to the understanding of our own society, but they may inform the design of distributed computational systems whose components need to flexibly organise themselves in a way analogous, but not identical, to how humans organise (see Chap. 21, Hales 2013).

3.2 Concrete Goals

Here we consider some of the goals that are more at the concrete and descriptive end of the simulation spectrum. These tend to be more concerned with relating to evidence and also to be much more specific. In the subsections below, the “plausibility” of assumptions, results and simulations is a frequent issue. The simulation of human societies has not yet reached the point where there is enough evidence to obtain much more than simple plausibility. At the current stage of social simulation, getting close enough to be deemed a “plausible” model is difficult enough, and there is almost never enough data to justify a stronger claim. Thus claims of anything stronger should be treated with appropriate scepticism.

3.2.1 Building Towards Realism

One common approach is to start with a fairly simple model that is easier to understand, and then to add aspects and mechanisms that are thought to be significant features of an observed system. That is, one builds in an additional level of realism to make the model more plausible or useful in some way (e.g. as a thought experiment). This is sometimes known as the TAPAS approach, i.e. ‘take a previous model and add something’. It is consistent with the engineering principle of “KISS” – keep it simple, stupid – in which one starts simply and adds more features or aspects, one at a time, if and only if the simple approach turns out to be inadequate for some purpose.

Thus Izquierdo (2008) starts with some standard models of the iterated prisoner’s dilemma game and adds some more “realistic” features, such as case-based learning and reasoning. A key idea in this is to maintain a rigorous understanding of the extended model, but to take a step towards models that might eventually be validated against observed data from human interactions.

Whether one would, in fact, reach useable and valid models by this means is contested, the alternative approach being to start with a complex model that reflects the evidence as well as possible and then seek understanding and simplification of this (Edmonds and Moss 2005).

To investigate the social aspects of socio-environmental systems, highly complicated models often have to be used, coupling the relevant biophysical dynamics with social simulation. Rather than developing all such components of the simulation system “from scratch” (and because the biophysical parts are relatively universal), these systems have a modular architecture designed to be reusable. It may therefore be more accurate to refer to the software as a “toolkit” from which various sub-models can be configured depending on the purpose of a particular study. In the area of land-use simulation, PALM (Matthews 2006) is one such integrative model, and FEARLUS (Polhill et al. 2001, 2008) is part of another longstanding approach to socio-ecological modelling. With each iteration the toolkit obtains further refinement and new features, whilst the level of understanding of its users increases. The social simulator is interested in what additional complexity the human-interaction part brings, and to what extent it adds realism to the model’s behaviour when compared with observed evidence.

3.2.2 Extending Evidence to Extrapolate to Unobserved Cases

Data about social systems is often limited to measurements from a limited number of observed cases, and there may well be many cases within the spectrum of observed ones for which one would like to estimate the outcome. Of course, one could use simple statistical techniques such as linear interpolation to do this, but such techniques depend upon assumptions concerning the regularity of the results with respect to small changes in the set-up, which may be implausible for some social systems. Instead, one might simulate the system using plausible assumptions, validate the simulation against the known observations, and then find the outcomes for set-ups that differ from those observed. For the results of this to be reliable, the simulation needs to be well validated; it must correctly reproduce the observed cases; the extrapolated set-ups must not differ very much from the observed cases (in contrast to the case described in Sect. 3.2.1); and any unvalidated assumptions must be of a mild and uncontroversial nature.

The plausibility of the results from such experiments depends upon the validity of the original measurements as well as the generality of the assumptions (which must be plausible for the unobserved as well as the observed cases).
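As an illustration of the logic (not of any cited study), the following sketch uses an invented toy diffusion model and invented "observed" adoption rates: the model is first checked against the observed set-ups, and only then used to estimate a nearby, unobserved one.

```python
import random

def simulate_adoption(n_agents, p_influence=0.1, steps=50, seed=0):
    """Toy diffusion: an agent adopts if a randomly met agent has adopted."""
    rng = random.Random(seed)
    adopted = [i < max(1, n_agents // 100) for i in range(n_agents)]
    for _ in range(steps):
        for i in range(n_agents):
            j = rng.randrange(n_agents)
            if adopted[j] and rng.random() < p_influence:
                adopted[i] = True
    return sum(adopted) / n_agents

# Invented "observed" final adoption rates for three population sizes.
observed = {100: 0.62, 200: 0.60, 400: 0.63}

# Validation step: the model should roughly reproduce each observed case.
for n, target in observed.items():
    mean = sum(simulate_adoption(n, seed=s) for s in range(10)) / 10
    print(f"n={n}: observed {target:.2f}, simulated {mean:.2f}")

# Only if the above agree do we trust an estimate for an unobserved case
# that lies close to the validated set-ups.
print("estimate for n=300:", round(simulate_adoption(300), 2))
```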

For example, Brown and Harding (2002) use a microsimulation model to extend regional socio-demographic (census) data to cases that are not directly observed (synthetic householder-level records for each spatial district). The extension is attempted with assumptions that are deliberately cautious.

The "SIENA" programme (Snijders et al. 2010) fits a particular class of dynamic network model to "waves" of panel data. Simplifying a little, the modeller specifies some basic assumptions (e.g. symmetry of network links) along with more than one set of panel data concerning the properties of the nodes at certain points in time. The algorithm then finds the dynamic network model that is consistent with the specified constraints and most closely fits the data. This is directly analogous to fitting a line to a set of values by minimising the total squared error (or similar). What one gets out of this are "surprise-free" projections of network and node properties for times other than those given in the waves of panel data. This is not simulation in the same sense as the other simulations mentioned here, since what is simulated is not a kind of process (that is given in the base specification of the family of models this technique uses) but rather a set of structures and values that fit the given data in a statistical sense. When this technique is reliable, and what its particular biases are, has not yet been established.
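The line-fitting analogy can be made concrete with a small sketch; the data points are invented, and a crude grid search stands in for SIENA's actual estimation procedure (which fits stochastic actor-oriented network models, not lines).

```python
# Invented "waves" of observations: (time, observed value).
waves = [(0, 1.0), (1, 2.1), (2, 2.9), (3, 4.2)]

def sse(slope, intercept):
    """Sum of squared errors of a candidate line against the waves."""
    return sum((y - (slope * t + intercept)) ** 2 for t, y in waves)

# Crude grid search over candidate parameters (a stand-in for proper
# statistical estimation).
best = min(
    ((a / 100, b / 100) for a in range(-200, 201) for b in range(-200, 201)),
    key=lambda p: sse(*p),
)
print("best slope, intercept:", best)
# A "surprise-free" projection to a time point not in the panel data:
print("projection at t=4:", round(best[0] * 4 + best[1], 2))
```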

3.2.3 Establishing the Consistency of a Process/Assumption with Evidence

Often a social process is not included in a study because it cannot be established in the way that a physical or biological principle might be. This is particularly true of historical cases, where social processes leave little direct evidence. Going back to our second example of generative archaeology (Sect. 26.1.2), there are few archaeological findings that suggest a particular social structure and set of social processes, hence the frequent need for guesswork and the resulting coexistence of many competing theories. However, as previously discussed, this is an area in which social simulation can make an important contribution.

Perhaps the best-known example is the Artificial Anasazi simulation model (Axtell et al. 2002). The objective was to see if a model could be constructed that was broadly consistent with the available evidence – the number of households settled in part of the U.S. South West over the period 800–1350. The performance of the model was impressive in its convergence upon the actual historical time-series after the calibration of several parameters (a 'fitting' process), which suggested that new social explanations of the apparent land abandonment after 1350 might be possible. Interestingly, a later paper (Janssen 2009) demonstrates that the model's fit is mainly explained by two parameters related only to the model's carrying capacity. The author argues that a more insightful basis might be to generalise the target domain, working initially from less concrete goals, rather than fitting a particular case and focusing on one evident and quantifiable trend (such as population). If the evidence base were broadened to include more ethnographic knowledge, this approach would resemble the pursuit of abstract goals as discussed in Sect. 26.4.1.4.

Data about real-world social networks, introduced at the design or validation stages, can be a valuable way of checking the consistency of a model. For example, Guimera et al. (2005) reconstruct the history of team collaborations in five different scientific and artistic fields and the development of the corresponding collaboration networks. The authors develop and parameterise a probabilistic model of team selection. Using real data on team sizes, along with estimates of the probabilistic parameters controlling the team-assembly mechanism, the characteristics of the resulting networks (the degree distribution and the largest component) are compared with the real ones (independently for each of the five cases). The interest is in the transition of the collaboration network from "isolated schools" to an "invisible college" – the point at which the largest component of the network contains 50 % or more of the nodes (which is the case for all the fields represented). All simulated network measurements are shown to be in close agreement with the real networks, which establishes the plausibility of the proposed team-selection mechanism. However, being a probabilistic model, it does not attribute any particular decision process to this mechanism that might be able to reveal new questions.
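For flavour, here is a much-simplified sketch of a team-assembly mechanism of this general kind: each slot in a new team is filled either by a newcomer or by a randomly chosen incumbent. The parameter values and details are illustrative only, not those of Guimera et al. (2005).

```python
import random

def assemble_teams(n_teams=500, team_size=4, p_incumbent=0.6, seed=0):
    rng = random.Random(seed)
    incumbents = []     # agents who have already appeared in some team
    edges = set()       # collaboration links between agents
    next_id = 0
    for _ in range(n_teams):
        team = []
        for _ in range(team_size):
            if incumbents and rng.random() < p_incumbent:
                team.append(rng.choice(incumbents))   # repeat participant
            else:
                team.append(next_id)                  # newcomer
                next_id += 1
        for member in set(team):
            if member not in incumbents:
                incumbents.append(member)
        # Link all distinct pairs of team members in the network.
        edges.update(tuple(sorted((a, b)))
                     for i, a in enumerate(team)
                     for b in team[i + 1:] if a != b)
    return next_id, edges

n_agents, edges = assemble_teams()
print(f"{n_agents} agents, {len(edges)} collaboration links")
```

Computing the size of the largest connected component of the resulting network as the incumbent probability is varied would then locate the "isolated schools" to "invisible college" transition discussed above.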

Another example is White (1999), which attempts to evaluate some statistical assumptions against data about marriage systems in different cultures using a "controlled simulation".

3.2.4 Analysis of Influence Factors

In any complex system it is very difficult to estimate the importance of different factors for particular outcome measures or results. This is due to the "non-linearity" of many social systems, where a normally insignificant factor can trigger a system-wide change in behaviour. However, given a trusted simulation model of the system, one can perform experiments to determine the importance of each factor across the class of simulation set-ups that are run. Thus one does not have to determine the relative importance of factors on an a priori basis; one can simply run the experiments and measure the outcomes. Clearly this approach depends on having a reliable simulation model.
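In practice such an experiment is a parameter sweep. The sketch below varies one factor at a time around a baseline in an invented toy model (all names and ranges are hypothetical), averaging the outcome over several random seeds; crossing the two loops into a grid would extend it to the pair-wise interaction analysis mentioned below.

```python
import random
import statistics

def toy_model(tax_rate, mobility, seed):
    """Invented stand-in simulation; returns one outcome measure."""
    rng = random.Random(seed)
    wealth = [1.0] * 100
    for _ in range(200):
        i, j = rng.randrange(100), rng.randrange(100)
        transfer = mobility * min(wealth[i], wealth[j]) * (1 - tax_rate)
        wealth[i] -= transfer
        wealth[j] += transfer
    return statistics.pstdev(wealth)   # outcome: wealth inequality

baseline = {"tax_rate": 0.2, "mobility": 0.1}
for factor, values in [("tax_rate", [0.0, 0.2, 0.4]),
                       ("mobility", [0.05, 0.10, 0.20])]:
    for v in values:
        params = dict(baseline, **{factor: v})   # vary one factor at a time
        runs = [toy_model(seed=s, **params) for s in range(10)]
        print(f"{factor}={v}: mean outcome {statistics.mean(runs):.3f}")
```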

Saqalli et al. (2010) investigate a simulation model of the development, over several generations, of a rural agrarian society, in order to weigh the importance of several different model parameters for the simulation results. In the simulation experiments reported, four parameters were assessed in relation to six state variables, with measurements taken at the end of the run. The model was based on a case study of the Nigerien Sahel, typified as a low-data situation where, in particular, little has been published on the social factors governing access to economic activities (including off-farm activities, so often neglected as an important revenue-generating source) or on intra-household dynamics (which the authors recognise as having a complex structure). The objective was to assess the robustness of results against variation in socio-economic and biophysical parameters, to show that the model is "constrained by the different parameters of its structure" (Saqalli et al. 2010: para. 3.6). This step provides the researcher with an improved understanding of the range of outcomes possible with the model, and of what might constitute a significant or meaningful difference when comparing outcomes. It is worth noting, however, that the single-parameter approach neglects any possible parametric interaction that could be identified from a pair-wise analysis of influence factors.

A very different example is Yang et al. (2009), which studies the factors that influenced success in the system of Chinese civil service exams that existed in the Imperial era. The simulation model used historical data from civil service records, plus some assumptions, to assess the importance of factors such as class, wealth and family connections for success at passing this exam (and hence obtaining a coveted civil service post). It is difficult to see how such indications about events otherwise lost in the past could be obtained by other means, although the approach is open to the criticism of being unfalsifiable.

The disadvantages of this approach are that the assessment of influence is only as good as the simulation model, and that it only samples particular sets of initial conditions – it does not rule out the possibility that very special values of the parameters cause totally different outcomes (unless one happens to be lucky enough to sample these).

3.2.5 Assessment of Policy Options

Recently, more and more articles have appeared in the literature featuring ABMs that address policy-making on contemporary issues such as sustainable development and climate change adaptation. For example, Berman et al. (2004) consider eight employment scenarios, defined by different policies for tourism and government spending as well as by different climate futures, for an ABM case study of sustainability in the Arctic community of Old Crow. Scenarios were developed with the input of local residents: tourism being a policy option largely influenced by the autonomous community of Old Crow (stemming from their land rights), and attracting great local interest. In ABM, policy options are often addressed as a certain type of scenario (scenarios are discussed in Sect. 26.3.2.9), embedding the behaviour of actors within a few possible future contexts. The attraction of this approach is that the model could potentially be used as a decision-support tool, in a form that is familiar to many analysts, to provide answers to very specific policy questions. The merit is that it can bring human and social factors, and the associated information, more fully into the reckoning of the issues at stake; the drawback is the multiplication of uncertainties, not least of which is that we do not convincingly know how social actors might adapt (even if the possible policy options are more concrete).

For example, Alam et al. (2007) investigate the outcomes indicated by a complex and detailed model of a particular village in South Africa. This model looks at many aspects of the situation, including: social networks, family structure, the sexual network, HIV/AIDS spread, death, birth, savings clubs, government grants and local employment prospects. It concludes with hypotheses about this particular case. This does not mean that these outcomes will actually occur, but it does provide a focus for future field research and may provide food for thought for policy makers.Footnote 15

3.2.6 Social Engineering: “Designing” Better Systems

Market design is the branch of economic research aiming to provide insights into which market protocol – i.e. which interaction structure and information-circulation rules – is best for obtaining certain characteristics of a market. Agent-based simulation is a good method for testing several such protocols and seeing their influence on economic performance, e.g. efficiency, fairness and the distribution of power (Marks 2007). Each protocol is known for its advantages and disadvantages (e.g. the Dutch auction is fast; the double auction extracts the highest global profit). Since not every good property can be achieved with a single protocol, one has to choose the aim to attain (LeBaron 2006). Then, assuming agents act rationally, it is possible to compare protocols to see what difference each makes to prices or other indicators (e.g. Kirman and Moulet 2008). Many studies have been designed to fit the context of electricity markets (a crucial case, since unpredicted shortages are a problem and prices can vary quickly), and these usually proceed by a comparison of protocols (Nicolaisen et al. 2001). One can also note the use of "evolutionary mechanism design" (Phelps et al. 2002; Marks 2007), where the strategies of three types of actors – sellers, buyers and auctioneers – are all submitted to evolution and selection, so that the actual organisation of the market evolves while the context of production and demand is fixed. In today's economy more and more artificial agents really do interact – whether bidding on consumer websites or in financial markets (Kephart and Greenwald 2002) – so there is some convergence between real markets and designed artificial systems that utilise market mechanisms. For a more detailed discussion of modelling and designing markets see Chap. 23 in this handbook (Rouchier 2013).
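A toy comparison of two protocols illustrates the general procedure: traders with random valuations and costs exchange under (a) an idealised centralised clearing rule and (b) random bilateral matching, and the realised total surplus is compared. This is a crude, hypothetical stand-in for the protocol-comparison studies cited above, not a reconstruction of any of them.

```python
import random

def total_surplus(protocol, n=100, seed=0):
    """Realised surplus for traders with random valuations under a protocol."""
    rng = random.Random(seed)
    buyers = sorted((rng.uniform(0, 1) for _ in range(n)), reverse=True)
    sellers = sorted(rng.uniform(0, 1) for _ in range(n))
    if protocol == "random_matching":
        rng.shuffle(buyers)
        rng.shuffle(sellers)
    # Pair buyers with sellers in order; a pair trades only if profitable.
    return sum(v - c for v, c in zip(buyers, sellers) if v > c)

for protocol in ("central_clearing", "random_matching"):
    print(protocol, round(total_surplus(protocol), 2))
```

The centralised rule pairs high valuations with low costs and so approaches the maximum attainable surplus; random matching realises markedly less, which is the kind of efficiency difference such comparisons aim to quantify.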

3.2.7 Data Integration

A mundane and sometimes overlooked aspect of the scientific process is simple description: recording what has been observed in a suitable form. Traditionally these forms have included the likes of narratives, logs, videos, measurements and pictures. However, simulations can also be used as a sort of description, where the aim is not to express a theory about a mechanism but rather to integrate as much of the relevant evidence as possible about a particular target. Simulation has some advantages in such a process, since it allows the integration of several different kinds and levels of evidence within one framework. To take some examples: aspects of narrative texts can be incorporated within the behavioural rules of an agent; the social network of sub-communities can be compared with those that result from the simulation; observed time-series can be compared with the corresponding time-series derived from measurements of the simulation outcomes; and survey data can be compared with the equivalent answers at instances of the simulation runs. Such integration is far from easy, since some aspects are programmed directly (e.g. agent behaviour) whilst others have to be achieved in terms of the results (e.g. aggregate statistics about the outcomes). Achieving any particular set of outcomes in a social simulation is difficult due to the prevalence of unpredictable interactions and effects (i.e. emergence), so the achievement of a data-integration model is not an easy one.

Such models are not entirely (or solely) a description, since the structure of a simulation sometimes brings into question the consistency of the various parts of the evidence. Thus if it is difficult to square an account of how individuals behave with some of the outcomes, one may be forced to make some choices, possibly including adding in aspects that are not directly observed. This is acceptable as long as these are clearly documented; indeed, they can provide fertile issues for future data-collection efforts. However, such data-integration models do not aim for generality beyond the case study (or studies) focused on, and hence can avoid "high" theory as a motivation for simulation features where this is not supported by the evidence with respect to the target case. It is not that there is no theory in such simulations – any description or abstraction, however mild, will rely on some theory – but the point is that in a descriptive simulation such theory is either well established or relatively mundane.

Examples of simulations that intend to be descriptive in this sense include Christensen and Sasaki (2008), which aims to produce a simulation of the evacuation of a particular building, with a view to a future evaluation of evacuation plans and facilities, in particular with regard to disabled people. It uses many particulars of the building structure, but makes assumptions (albeit of a plausible variety) about how people behave when evacuating. Likewise, Terán et al. (2007) aim to simulate land use and users within a forest reserve with a view to producing a computational representation of it. As in similar simulations, there is a mixture of assumptions that are backed by some evidence and assumptions that are plausible guesses. This simulation is loosely validated against some data and shows results that confirm those found in some other models. The ultimate use of this (and similar models) is not described.

Such simulations can take a long time to construct, involving many iterations of model development, as well as being complicated and slow to run. The advantage of such models is that they are a precise and coherent representation of a set of evidence – in a sense, an encapsulation of a particular case study.Footnote 16 This can be the basis for experiments and inspection leading to further abstraction steps, in which theories of the processes observed within the data-integration model are expressed in simpler models whose properties are easier to establish, but whose outcomes can be checked against targeted experiments on the data-integration model.

3.2.8 Finding New Questions and Areas of Ignorance, Hypothesis Suggestion

Another use of a simulation is as an aid to good observation: that is, suggesting issues and questions that should be pursued in order to gain adequate observational coverage. The simulation is developed as in the data-integration case above, including the different aspects of the observational evidence that are available. It is often only when one tries to simulate a process that some of the gaps in knowledge become clear. Thus, building a simulation as one is observing can help direct the data-gathering research towards completing an adequate computational description. In this sense it plays a similar role to simulation in some cognitive science (Newell 1990; Sun 2005).

For example, Moss (1998) exhibits a simulation built on a mixture of bases: (a) an assumed but plausible cognitive architecture which captures how one might divide a problem into sub-problems until they are doable, (b) some suggestions elicited from a domain expert and (c) plausible guesses for the remainder. This model attempted to examine behaviour in the face of crises (defined as when one unwanted event causes another in an out-of-control chain), in particular how the rotation of crisis-management teams, and the information they pass on to the next team, might impact upon their effectiveness at fighting the crisis. The results were not independently validated, but that is not the point of this simulation. As the author says:

…results obtained with the North West Water model indicate a clear need for an investigation of appropriate organizational structures and procedures to deal with full-blown crises.

In contrast, Younger (2005) presents a very much more abstract model, only loosely built upon evidence, but with the same broad aim of suggesting hypotheses – in this case, hypotheses concerning the occurrence of violence and revenge within egalitarian societies. Clearly the plausibility of the hypotheses or questions suggested by a simulation will be greater when the simulation is more firmly rooted in evidence. However, hypotheses and questions can be worth investigating whatever their source, and at the very least a simulation grounds and defines the question in a precise way, making clear what it might explain and the sorts of other issues and questions that might accompany it.

3.2.9 Creation/Critique of Scenarios

Berman et al. (2004) present an example of scenarios being used to constrain models: the simulations trace the wider consequences of those scenarios (as measured by relevant socio-economic or environmental indicators, or by their possible influence on human institutions), which can then be used to inform discussions with stakeholders and may ultimately produce a better understanding of such changes. Bharwani et al. (2005) use climate change scenarios to investigate adaptive decision making among villagers in the Limpopo district of South Africa, focusing on the use of seasonal forecast information in farming strategies. Data from the Hadley Centre climate model – HadAM3 – showing a 100-year drying trend with increasing potential evapotranspiration (PET), were used as model input (providing PET and precipitation values). The results show that a degree of resilience to these changes is afforded when the forecast is correct 85 % of the time, so that farmers establish increased trust in, and use of, seasonal forecasts. They are then able to choose cropping strategies suited to climate change, though this behavioural shift may only occur over a very long time-frame.

Bharwani et al. (2005) introduce the use of scenarios into the methodology in a further and very interesting way: by postulating them as 'drivers' of actors' decision-making processes. In this ethnographic approach, the authors combine simplified scenarios across different domains (irrigation, forecast and market), asking respondents what they would do under each scenario in a given context. This information was then used to produce the model rules for the agents' decision-making.

In either case, where conventional scenarios used in futures planning can seem rather terse and lacking in specifics – which may limit their subsequent use in policy discussion – simulation outputs that explore scenarios offer a great deal of detailed information "that would be difficult to imagine otherwise" (Berman et al. 2004: 410). Moreover, this can apply at different levels of analysis, from trends in macro variables down to the impacts on different sectors and regions, as well as differentiated impacts for agents fitting any given 'profile' in which the analyst is interested. Perhaps greater care has to be taken, however, in the use of model-generated scenarios, to ensure that these are not taken as 'more accurate predictions' by virtue of being 'computed' stories rather than conventional 'imagined' stories.

Scenarios are often used in policy discussions, e.g. around climate change. However, they are usually somewhat vague and/or described only in qualitative terms. Simulations can be used to produce consistent scenarios, or to produce models that instantiate aspects of given scenarios (Taylor et al. 2009).

3.2.10 Intervention with Stakeholders

Instead of developing a simulation to represent some aspect of society, one can also use a simulation to intervene in society: that is, to change some interaction between stakeholders, for example to facilitate collective decision making or mutual understanding. One well-known example is Companion Modelling, an approach developed over the last decade (see Chap. 10; Barreteau et al. 2013).

An example is demonstrated by Etienne (2003). Here, a model is used in conjunction with a role-playing game to show chosen participants the issues that can arise when several users compete for a pastoral resource. Building the model involved integrating multidisciplinary knowledge acquired about French Mediterranean sylvopastoral systems so as to represent the interactions between ecological dynamics and social behaviours. In order to help foresters and livestock farmers to better integrate these interactions into their planning work, a multi-agent system was designed to simulate different management strategies and to compare their impact on forest quality. This model was coupled with a role-playing game (RPG) initially developed as a didactic support for sylvopastoral training programmes; very soon, it proved useful in the negotiations and interactions between livestock farmers and foresters involved in the management of the same forest. The tool proved flexible enough to be played with actively involved stakeholders such as the current users of the resource (local farmers and foresters), potential regulators of the system (managers or administrators), technical experts (extensionists, technicians) and learners concerned with the topic (students, scientists).

This model is effectively an intervention between the livestock farmers and foresters: a subtle mediating tool that allows the stakeholders to play at decision making, educates them in the possible effects of their decisions and thus encourages debate and introspection. The model has also been used for didactic purposes (Sect. 26.3.1.1), by getting agronomy students to play it.

4 Inputs and Results of Simulation Models

One method of assessing the use, and ultimately the success, of a simulation for understanding aspects of society is to tease out what has gone into making the simulation model (the input) and how the results from the simulation are interpreted and used (the output). Together, the input and the output form the mapping between the computer program, with its calculations, and the target of study. They are crucial parts of what characterises a simulation, even if they tend to be described less formally than the simulation code and behaviour.

4.1 Inputs

What is put "in" to the design of a model tends to be more explicitly distinguished in papers than what comes "out". This might be because the "job" of a simulator is seen as deciding what processes and structures will go into a model, and because the inputs are under the control of the simulator in a way that the results certainly are not, and hence can be displayed and discussed with greater confidence. However, all social simulations are based on a raft of different assumptions, settings and processes. These are somewhat separated out for analysis here.

4.1.1 Evidence-Based Assumptions

If there is some evidence about the nature or extent of the processes being observed, then this can be used to inform the set-up or structure of a simulation. For example, evidence from social psychology might be used to inform the specification of the behavioural rules of a set of agents in a simulation, or the narrative account of a participant used as the basis for programming a particular agent.Footnote 17 Of course, it is rare that such evidence constrains the possible settings and algorithms completely; rather, it partially constrains them, or constrains them in conjunction with additional assumptions from another source. Clearly, the more assumptions can be constrained by evidence (either directly or as the result of previous research) the better. The presence of other assumptions and inputs does not make a simulation useless, especially if they are documented; it is just that simulation results and usefulness are relative to the assumptions, so that if assumptions are included that are completely misguided and critically affect the results, then this would seriously limit their use with respect to the observed world.

4.1.2 Indirectly Inferred Settings

In situations where some parameters are unknown, and where there is a relative abundance of time-series data, one can attempt to infer their values by seeing which parameter values give the model the best fit to a segment of the time-series data. This is a sort of evidence-based setting, but it often seems to be used when the parameters concerned do not have any discernible meaning in terms of the target of modelling. Thus, when there is a tradition of using a certain kind of decision or learning algorithm in an agent, this might be "fitted" to an initial segment of the data (so-called "in-sample" data) even when it is unlikelyFootnote 18 that the algorithm corresponds to how the target agents think. The credibility of this technique is therefore dependent on the reliability of the other assumptions in the model and on the meaning of the parameters being fitted. If the parameter were a scaling parameter, then this might well be a sensible way to proceed.
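Schematically, the procedure looks like the following sketch (series and model both invented): a single growth parameter is fitted by grid search to the in-sample segment, and the fit is then checked against the held-back segment.

```python
# Invented observed series (roughly geometric growth).
data = [1.0, 1.6, 2.5, 4.1, 6.5, 10.4, 16.7]
in_sample, out_sample = data[:4], data[4:]

def model(growth, length):
    """Hypothetical one-parameter model: multiplicative growth from 1.0."""
    series, x = [], 1.0
    for _ in range(length):
        series.append(x)
        x *= growth
    return series

def error(series, target):
    return sum((s - t) ** 2 for s, t in zip(series, target))

# Fit the growth parameter on the in-sample segment only (crude grid search).
best = min((g / 100 for g in range(100, 300)),
           key=lambda g: error(model(g, len(in_sample)), in_sample))
print("fitted growth:", best)
# Check the fitted model against the out-of-sample segment.
print("out-of-sample error:",
      round(error(model(best, len(data))[len(in_sample):], out_sample), 3))
```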

4.1.3 Documented Theoretical Assumptions

Clearly, researchers do not invent all the details and algorithms of their models from the ground up; they do their research with knowledge of certain approaches and algorithms, and within a community of other researchers with established techniques and traditions. Thus many parts of a simulation model will be based on those of other models, or on algorithms from other fields. For example, many models in economics will use a decision algorithm based on constrained comparisons of predicted utility, while other models might import techniques from the fields of artificial intelligence or evolutionary computation. It seems impossible to completely avoid all such theoretical assumptions; however, there are distinctions to be made in terms of the strength of the assumptions, the likely biases behind them and the degree to which they are evidence-based.

"Strong" assumptions are those that are surprising or that specify conditions rarely observed. Thus the assumption that an agent has, in effect, a perfect model of the economy in its head is a very strong one, since even experts find it difficult to understand the economy as a whole. Strong assumptions are often introduced to allow analytically solvable models to be specified and used – for example, the assumption of perfect information in game theory. Whilst analytically tractable models were necessary when there was no other avenue for the precise modelling of many kinds of phenomena, the advent of cheap computing power and accessible simulation platforms means that more appropriate methods are now often available, with analytic models possibly being used to check or understand the reference simulation model rather than being the focus. Clearly, other things being equal, weaker assumptions are preferable to strong ones – the stronger an assumption, the more evidence is needed to justify its use. In any case, all such assumptions should be as fully documented as possible.

4.1.4 Explored Conditions

In much simulation work there will be a focal hypothesis or set of hypotheses under investigation. In these cases it is usual to run the simulation using that hypothesis and then compare the results with those coming from a version of the simulation with a different hypothesis implemented. This provides evidence about the possible effects of that hypothesis on the outcomes, allowing comparison with evidence and possible subsequent inference as to which is more likely to be the case. The clearest case of this is testing the significance of the inclusion of a hypothesis against a "null" modelFootnote 19 to see whether the properties of the results that are deemed significant indeed result from the hypothesis, or from other aspects of the model. Thus a simulation of a stock market might compare the results obtained with intelligent agents, which notice patterns in pricing and try to exploit them, against those obtained with agents that buy and sell at random. Unfortunately, it is sometimes the case that a simulation is presented purporting to show the significance of a hypothesis without indicating what the comparison case is.
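A minimal sketch of such a comparison, with an entirely invented toy market: the same simulation is run with and without the hypothesised mechanism (here, trend-chasing "chartist" demand), and an outcome statistic is compared across many seeds.

```python
import random
import statistics

def run_market(chartists, steps=500, seed=0):
    """Toy price process; 'chartists' switches the hypothesis on or off."""
    rng = random.Random(seed)
    prev, returns = 0.0, []
    for _ in range(steps):
        # Chartist demand chases the previous price change.
        demand = rng.gauss(0, 1) + (8.0 * prev if chartists else 0.0)
        change = 0.1 * demand
        returns.append(change)
        prev = change
    return statistics.pstdev(returns)   # outcome statistic: volatility

null = [run_market(chartists=False, seed=s) for s in range(20)]
hyp = [run_market(chartists=True, seed=s) for s in range(20)]
print("volatility under the null model:", round(statistics.mean(null), 4))
print("volatility with the hypothesis: ", round(statistics.mean(hyp), 4))
```

If the two conditions produce indistinguishable volatility, then the apparent effect of the hypothesised mechanism cannot be attributed to it, which is exactly what the null comparison is designed to catch.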

4.1.5 Randomness and Other Essentially Arbitrary Assumptions

A simulation modeller is often faced with deciding how to design a part of a simulation model for which there is neither evidence nor any tradition of modelling to guide them. In such a case one might simply make that aspect random. For example, where it is unknown how a kind of choice is made in the modelled situation, it might be implemented as a random choice in a simulation model of that situation.Footnote 20 This is usually done in conjunction with a "Monte Carlo" approach, which runs the simulation a number of times and averages over the resulting sets of outcomes. Presumably this is done under the assumption that the introduced randomness will be averaged out, leaving only the effects of the other design settings – however, this assumption is rarely proved and often simply remains a hope. Of course, if it can be shown by a series of simulation experiments (or otherwise) that the value of the particular input does not influence those aspects of the results that are deemed significant, then a random input or process might well be acceptable. However, in this case, a constant value might be simpler and have the same effect.Footnote 21

We suspect that many uses of randomness in simulations are in the nature of a programming “stub” – that is, a stand-in that the programmer intends (or intended) to expand to a more plausible algorithm at a later date. Whilst this is perfectly acceptable during model development and to some extent inevitable given that researchers always have time constraints, such stubs are likely targets for criticism by other researchers. At the very least some exploration of them to assess the extent to which they affect those aspects of the results deemed significant is advisable.
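Such an exploration can be sketched as follows (all details invented): the simulation is run with the arbitrary choice made at random (Monte Carlo over seeds) and again with a fixed constant, and the outcome of interest is compared. If the two disagree, the stub is doing real work and needs justification.

```python
import random
import statistics

def simulate(choice_rule, seed):
    """Toy simulation with one unknown design decision ('choice_rule')."""
    rng = random.Random(seed)
    state = 0.0
    for _ in range(1000):
        step = rng.choice([-1, 1]) if choice_rule == "random" else 1
        state += 0.01 * step * rng.random()   # the rest stays stochastic
    return state

for rule in ("random", "constant"):
    outcomes = [simulate(rule, s) for s in range(30)]
    print(rule, "mean outcome:", round(statistics.mean(outcomes), 3))
```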

Randomness can be considered as a special case of a broader class of assumptions: those that are added into the model simply to get it to run, and for no theoretical or evidence-related reason. Hopefully these are honestly declared rather than “dressed up” under some other category, although often these are excused under the broad umbrella of “simplicity”.Footnote 22

4.1.6 Undocumented Assumptions

It is not feasible to document all of the assumptions in a model. Firstly, this might take too much space in a single paper,Footnote 23 and secondly, many might be previously established and well known to those in a particular field of work. However, it is also likely that researchers are simply not aware of all the assumptions inherent in their simulation models, due to the limitations of human cognition.Footnote 24 Clearly, it is part of the job of other researchers to point out undocumented assumptions where these can be shown to be significant.Footnote 25

4.2 Outputs

A similar set of distinctions can be made about what comes out of a simulation: the results. There is no obligation to describe all the outputs from a simulation; rather, one tends to get a sample of results, typically composed of: sample outcomes, sensitivity analyses, evidence of validation and the outcomes from experiments designed to test a hypothesis. However, not all details of the results are considered equally significant – we now consider the different categories of result feature in order of increasing significance.

  • Firstly, there are those aspects of the results that are considered artefacts of the model – for example, reflections of the randomness that was input to the model.

  • Secondly, there are those features that might be considered to reflect some of the model structure and the processes that result from it. These features may not reflect what is being modelled, but may instead be caused by theoretical or arbitrary assumptions that were put in. Such features of the results may well not be much of a surprise to the modeller.

  • Thirdly, there are those features of the results that are interpreted as indicating something about what is being modelled; for example, they may suggest a hypothesis about those phenomena. That is, they indicate a possibility that may be inherent in what is being modelled, or possibly inherent in the target of modelling. This may well go beyond what can be directly validated in the model but, for example, track counter-factual possibilities concerning what might have occurred.

  • Lastly, there are those features that would be positively expected of the phenomena being modelled. That is, if such a feature were not present, this would be taken as evidence that something was amiss with the model. In other words, it is a necessity of the phenomena. It is against this category of results that models are validated.

It is not easy to distinguish these different categories of significance in the results, since causation within a model can be very complicated, being the result of many model aspects interacting together. It is also usually the case that the modeller has hypotheses (or assumptions) about which aspects of the results are significant, and in which ways. This is crucially useful information to impart to a reader interested in the results; however, it is often left implicit.

One might justifiably criticise many social simulations for the lack of empirical grounding of both inputs and outputs. Many social simulations have only the weakest connection with anything observed: the inputs are largely assumption-based, and indeed often highly artificial; the outputs relate only in the broadest way to any data, and then only in terms of a few aspects of the possible outputs (i.e. only a few selected aspects are deemed significant to what is observed, and then in the loosest, "hand-waving" manner). It may well be that simulating human society is just very, very difficult, and one suspects that it is simply easier to stick to considering abstract ideas.

5 An Assessment of Simulations for Understanding Social Phenomena

As discussed in previous sections, complex social phenomena involving multiple interacting actors can be analysed computationally using ABM. In this connection it is worth noting that evidence-driven modelling approaches tend to guide modellers towards more up-to-date data, and a better understanding of the social phenomena, than theory-driven approaches do. To avoid using highly speculative ("strong") assumptions whilst modelling, it is essential to have very detailed knowledge of the phenomena in question. Such a comprehensive grasp, backed with evidence, helps to identify relevant model parameters, estimate configuration values and evaluate simulation results.

Perhaps the biggest open challenge for modellers is to work out how results from social simulation models can be useful beyond theory and hypothetical illustration. To harness the potential explanatory power of ABM in the social sciences, it seems necessary to ground models in scrutinised findings from large amounts of data about the social phenomenon. Nevertheless, this is currently very difficult, as access to such datasets is often hampered (Lucas 2011) by the data being:

  • Non-existent, thus requiring funding to collect and process all the relevant information;

  • Unavailable because of privacy restrictions, such as non-disclosure agreements;

  • Or, if at hand, incomplete or outdated.

Validating outputs for which there is no comparable evidence is a prominent issue, and this seems resolvable only by comparing simulation results with further data. Lucas (2010, 2011) discusses a survey, carried out in 2009,Footnote 26 of 12 leading academic researchers – each of whom had managed medium- to long-term (3- to 5-year) social simulation projects in Europe and the United States – regarding how their endeavours were modelled and applied, and whether these had been useful beyond theory. The questionnaire consisted of the following questions aimed at eliciting views:

  1. How were fieldwork findings used to guide the simulation development?

  2. What were the contributions of simulation results to stakeholders?

  3. Were simulation results regarded by them as useful as fieldwork findings?

  4. What could have improved the chances of providing these via simulations?

Three businesses offering simulations that take social behaviour into account were also approached, but all refused, citing non-disclosure agreements. All interviewees mentioned that their models targeted the scientific community and could, at best, only provide plausible results regarding scenarios, and that, albeit coherently justified, nothing obtained in simulations could be regarded as directly useful for policy-making purposes. All 12 said that gathering detailed data about the actual phenomena, by interviewing stakeholders and reviewing the existing literature, helped them to better understand the context and served as a good guide during the modelling process. More than half (7) said that stakeholders and policy-makers were not interested in simulations per se, and that only real success cases (even those with only anecdotal evidence) are taken into account in decisions. On the other hand, a few (3) mentioned that simulations, despite their shortcomings, attracted significant interest from stakeholders and policy-makers. Even so, the modelling process was viewed as a time-consuming task, and occasionally even regarded as unproductive. This contrasts with the positive experience of fieldwork, which – despite also demanding a lot of time – a majority (10) confirmed was useful in acquiring new knowledge relevant to practitioners.

Engaging with stakeholders and policy-makers was regarded by all (12) as indispensable for improving their understanding of the actual social phenomena. Yet maintaining efficient interaction, and managing practitioners' interest over many months of engagement, was generally deemed strenuous and difficult. This finding is partially supported by other, larger projects regarding effective collaboration between policy-makers and researchers, such as (Young and Mendizabal 2009). Some (4) suggested that social simulation models could perhaps be integrated into tools for aiding the mediation of group decisions. A lack of confidence in model results was another aspect raised by all (12) interviewees, along with the difficulties of coding or interpreting qualitative data appropriately, and of communicating the model itself and its results intelligibly to a non-technical audience. Some (6) interviewees said that they had no intention of influencing the social phenomenon in question, but only of modelling it plausibly.

Commissioned fieldwork, to date, has a greater chance of being useful in a timely way in this sense, as the resulting reports can provide very specific and up-to-date information that is easily understood by stakeholders. The implications of such new information are usually recognised more quickly than results obtained from simulation models, as the latter usually deal with intricate processes over longer time scales. Claims that social simulations could support, or guide, decision and policy-making seem realisable only with the aid of experienced in-house modellers working in close liaison with stakeholders. That is necessary because meeting social simulation aims and objectives (beyond theory) is still an experimental process of many trials and errors, requiring good technical judgement as to what should be done when simulation results go beyond the existing evidence (what is factually known).

A survey of agent-based modelling practices in the literature (Heath et al. 2009) revealed more applications in the social sciences (24 %) than in public policy (8 %) among the 279 articles published between 1998 and 2008. Of the 68 social science applications, a majority (66 %) used ABM as a hypothesis generator for systems that were assumed to be less well known, and the remaining articles (34 %) used ABM as a mediator, in order to represent a system that was moderately understood and gain insights into its characteristics and behaviour. In contrast, only a small portion of the 23 public policy applications (4 %) used ABM as a hypothesis generator; most (96 %) used ABM as a mediator.

6 Conclusion

Simulation has undoubtedly helped to improve our understanding of human society, although in a number of different and usually indirect ways. It is fair to say that, so far at least, it has served to improve our understanding of some societal processes, and our ideas about society, rather than directly enabling us to strongly predict aspects of society or conclusively test hypotheses about it.

Simulation is not a replacement for other ways of understanding society;Footnote 27 it is simply a flexible means of modelling it precisely, one that can represent some of its dynamic and complex aspects. It can be especially productive in conjunction with other approaches. For example, analytic models can be used to check the outputs and properties of a simulation model and help us understand it; conversely, a simulation can be used to probe and check some of the simplifications and assumptions used in an analytic model. In participatory modelling, social science techniques of engagement and elicitation can be used to inform the construction of agent-based social simulations, while the simulations in turn suggest what might usefully be investigated in the collection of new data.

Clearly social simulation has some way to go in terms of the maturity of its method and the reporting and use of simulation models. There are still a number of areas in which the methodology needs substantial improvement and standardising. There are also significant unresolved issues, such as how to decide what level of detail to include, and to what extent one should rely on prior theory.

We predict that simulation will be even more significant in helping us understand human society in the future, in particular where it is used in close conjunction with other relevant approaches.