1 Introduction

In agent-based social simulation, the process of validation is understood as an evaluation of the representation relationship between the computational model and the target phenomenon (Cioffi-Revilla, 2014; David, 2013; Edmonds, 2000; Gilbert & Troitzsch, 2005; North & Macal, 2007; Rand & Rust, 2011; Squazzoni, 2012). Interestingly, this relationship is usually defined in success terms, i.e. a validated model is one that ‘successfully’, ‘adequately’ or ‘satisfactorily’ represents the target phenomenon. The problem with success definitions is that they are usually underspecified. In the case of validation, it is not entirely clear what it means to ‘successfully’, ‘adequately’ or ‘satisfactorily’ represent the target phenomenon. Criteria are neither made explicit nor included in the definition, ultimately leading practitioners to adopt a procedural approach to the testing of representational adequacy in which methodological and operational aspects of the evaluation process are overemphasised. Often, a model is considered validated if well-known tests, e.g. sensitivity analysis or empirical output validation, are performed to a (usually statistical) degree of success.

The idea that representation is not simply a methodological issue has been previously acknowledged in the agent-based social simulation literature. It has been claimed, for example, that the disciplinary background might have an effect on the way a researcher approaches validation practices (Rossiter et al., 2010), that simulations are shaped by the modeller’s response to contextual constraints (Graebner, 2018) and that the community is needed to define what constitutes adequate representation (Ahrweiler & Gilbert, 2005). This acknowledgement, however, has not encouraged practitioners to robustly characterise the nature and effect of the different criteria they use to validate their models.

This relatively underdeveloped theoretical account of validation in agent-based social simulation heavily contrasts with the one in general simulation studies and the philosophy of simulation, where issues of evaluation have been extensively discussed (e.g. Balci, 2003; Beisbart & Saam, 2019; Jebeile & Barberousse, 2016; Morrison, 2015; Oberkampf & Roy, 2010; Sargent, 2013; Winsberg, 2010). While the existing literature can be used to theoretically inform practices of validation in agent-based social simulation, its usefulness is partially limited, for it centres mostly on equation-based modelling in formalised disciplinary areas. There are, initially, some key methodological differences: in agent-based models, the focus is on exploring dynamics produced by computational entities with autonomous decision-making interacting with each other and with the environment, rather than on finding numerical solutions for systems of equations. Thus, typical sources of uncertainty in equation-based models, e.g. numerical approximation error (Roy, 2019), might not be present or might have a different effect. In turn, variations in the way agent-based models are used to produce knowledge warrant an alternative epistemological approach (Primiero, 2019; Winsberg, 2019), and, therefore, a somewhat distinct evaluation process. There are, as well, additional theoretical limitations imposed by the distinctive nature of social theory and data. For instance, formal approaches to evaluation in social simulation are rare, even in comparison with agent-based simulation in other domains (Bakar & Selamat, 2018), and typical validation techniques that rely on precise quantification, e.g. benchmarking, are unsuitable (Saam, 2019).

The following discussion addresses an aspect of the validation of agent-based social simulations where the prior literature on equation-based modelling in formal disciplines might prove insufficient: social epistemology. Any account of validation in computer simulation should explicitly consider social epistemology, for some criteria to judge adequacy of representation in computational modelling rely on the normalisation of beliefs and principles not related to the application of any given validation technique or the theoretical-methodological framework behind a computational implementation of the phenomenon of interest. In the case of agent-based social simulation, it will be argued, the analysis of social epistemology has the potential, first, to provide unique insights into how practitioners’ beliefs about representational adequacy are impacted by distinctive disciplinary dynamics in the social sciences, particularly in regard to processes of theory-building and testing, and, second, to evidence that standardisation tools have only limited scope in capturing the multiple criteria used to validate agent-based models of social phenomena.

The text is structured as follows: the next section briefly addresses the key tenets of social epistemology and presents some general examples of its influence on the everyday practice of agent-based social simulation. The following two sections use the processes of interpretation and commensuration, respectively, as proxies to elucidate the impact of social epistemology on the validation of agent-based models. The fifth section briefly discusses potential benefits and challenges of incorporating elements of social epistemology in the conceptualisation of validation and representation. Some general conclusions are presented last.

2 Social Epistemology in Agent-Based Social Simulation

Social epistemology emerged in the late twentieth century to challenge the assumption, standard in epistemology, that knowledge justification is a process entirely dependent on an individual’s cognitive faculties. The social nature of justification becomes evident in local instances of interaction in which knowledge is reported/acquired (Lackey, 2011) or contested (Christensen, 2007), as well as in the institutional and normative context in which knowledge justification processes take place. Regarding knowledge acquisition, for example, new entrants to social simulation need not develop practical and theoretical knowledge anew, for they can easily resort to formal and informal learning dynamics to acquire it. During the learning process, new entrants will likely respond more positively to knowledge acquired from individuals or sources they consider authoritative (Kitcher, 2011) or with which they share a sense of group belonging (Boghossian, 2011).

Similarly, practitioners might be pressured to adjust research goals to accommodate institutional or normative demands, such as funding requirements (Ankeny & Leonelli, 2011), or to engage in types or topics of research where the perceived potential reward is higher (Strevens, 2003). The growing popularity of computational modelling of public policy could be seen as a result, on one hand, of an increasing interest (from different stakeholders, including funding sources) in research with real-world impact and, on the other hand, of practitioners exploring novel topics they believe could become relevant in the future.

Due to the interdisciplinarity of agent-based social simulation, there are also some interesting institutional social epistemology dynamics taking place. Knowledge production and diffusion are not uniform across contexts, among other things, due to variations in communication of and interaction with evidence (Bird, 2014). In agent-based social simulation, there are some differences in the type of validation methods employed (Moss, 2008) and in the style and structure of reporting (Angus & Hassani-Mahmooei, 2015) that can be traced back to the customs and conventions of different disciplinary traditions.

By explicitly acknowledging social epistemology, practitioners of agent-based social simulation will be able to better conceptualise how their everyday practices are affected by a distinctive social, cognitive and physical organisation. When it comes specifically to representation, the inquiry into social epistemology adds another dimension to the discussion about pluralism and contextualism in modelling that has so far taken place in the literature on social simulation. To date, differences in the approach to the testing of representation in computer simulation are more often explained through individual cognitive features, such as modelling goals, interests or knowledge. Social epistemology offers means to link these diverse modelling choices to more general knowledge production and transfer dynamics.

2.1 The Social Underpinnings of Validation

Social epistemology influences in several ways the corroboration of hypotheses by experimental results, the area in which verification and validation operate. Nonetheless, in contemporary philosophy of science, this influence is usually framed within the context of the constructivism-realism debate. Agent-based social simulation might be expected to incorporate diverse constructivist principles, given the popularity of constructivism in mainstream social science. That, however, is not the case. Realism is widespread in social simulation. Even qualitative-oriented approaches to social simulation, e.g. participatory or companion modelling, stand closer to the principles of realism. Variations in modelling styles and practices that could eventually affect warrants for belief in the adequacy of a simulation are often addressed in technical terms. Questions about realism in the representation of the target phenomenon, for instance, are dealt with mostly as an issue of empirical calibration. For many practitioners, the more data the model is able to accommodate, both for calibration and validation, the more realistic it is.

The neglect of social epistemology in agent-based social simulation might be linked to two major aspects. First, the foundational philosophical literature has regularly centred on methodology. The method is presented as an alternative or third way between the qualitative and the quantitative, the deductive and the inductive, natural and formal languages, formal theory and experimentation, and the computational and the social sciences (Anzola, 2019a). In this alleged methodological synthesis, epistemic aspects such as representation have been pushed to the background. Second, a relatively formal theory of confirmation has been implemented in social simulation. The notion of empirical confirmation in social simulation is strongly associated with the algorithmic nature of agent-based models. This is probably because the verification-validation scheme was not developed directly by practitioners of agent-based social simulation, but adopted from computer science and software engineering through complex knowledge transfer dynamics (Anzola, 2019b). As a result, during the evaluation process, the experimental and representational features of agent-based models are often subordinate to the algorithmic.

Because of the emphasis on methodology, especially on the formal and technical aspects of computer simulation, issues of social epistemology have usually been tackled through standardisation. The most paradigmatic standardisation tools are protocols and frameworks (e.g. Becker et al., 2005; Ghorbani et al., 2013; Grimm et al., 2006; Grimm et al., 2010; Janssen et al., 2008; Laatabi et al., 2018; Richiardi et al., 2006; Wang & Lehmann, 2007). Although the focus is different, ultimately, both tools seek to structure and formalise research practices. Frameworks are aimed at improving practices through abstraction and generalisation. They are based on the assumption that widespread systematisation and categorisation at a higher level, e.g. meta-models, code reuse/repositories or schemes and rules for defensive programming, can lead to better science. Protocols, in turn, are meant to improve practices by providing detailed descriptions of computational models. They work under the assumption that adding rigour, clarity and transparency to the model description can facilitate evaluation and replication.

While standardisation tools have clearly helped make some tacit knowledge explicit in agent-based social simulation, they cannot moderate the effects of social epistemology on validation. The following two sections do not discuss social epistemology directly, for that would require elaborating on a lengthy and intricate conceptual framework, but use, instead, the processes of interpretation and commensuration as proxies. Interpretation and commensuration are transversal to the practice of science. Their effects are not exclusive to validation, or evaluation processes in general, nor are they entirely dependent on social epistemology. They, nonetheless, provide useful insights into how the physical, social and cognitive organisation of social simulation impacts warrants for belief in the representational adequacy of a computational model.

3 Interpretation

Interpretation can be briefly described as the process of linking empirical evidence (provided by direct or indirect interaction with the phenomenon of interest) with scientific hypotheses and background knowledge (in the form of data and theories) through a second-order analysis. Issues of interpretation in agent-based social simulation can arise from three different sources: the process of modelling (transformation of the conceptual model into the computational model), the process of confirmation (calibration, verification and validation) and the process of communication and socialisation of findings (the narrative put forward in different forms of scientific reporting). Problems of interpretation need not be exclusive to any particular source. Given the scope of this text, the discussion will centre on issues of interpretation associated with processes of confirmation.

3.1 Holist Underdetermination

During the process of confirmation, researchers must evaluate to what extent empirical results constitute evidence for the theory or hypothesis being tested. Perhaps, the most distinctive problem of interpretation during confirmation processes is what is commonly known in philosophy of science as underdetermination of theory by data. There are two basic forms of underdetermination, labelled here as holist and contrastive (Stanford, 2017). The former, often referred to as the Duhem-Quine thesis, is the most popular form of underdetermination. It puts forward the idea that scientific theories and hypotheses are not tested in isolation. Empirical consequences, it is claimed, can only be derived from a conjoined network of hypotheses, working together with principles and beliefs about the functioning of the world and the practice of science. This claim has several major implications. The present discussion, however, will focus solely on the effect of underdetermination on the process of linking the simulation results back to an underlying network of background knowledge.

Agent-based social simulations are not used to produce knowledge in the same way as the equation-based models regularly considered in the philosophy of simulation literature. Hence, holist underdetermination would be expected to manifest differently in each case. Equation-based simulations, it is argued, are downward, motley and autonomous (Winsberg, 2001). Downward is used in the sense of already established theory often serving as a starting point and as a source of epistemic justification or entitlement for the computational model; motley, in the sense of there being extra-theoretical aspects that need to be decided upon to run a simulation; autonomous, in the sense of judgements about the adequacy of the simulation not being entirely dependent on comparison with external data. The categories do not instantiate in the exact same way in every equation-based model and disciplinary domain. For example, not every equation-based model is derived from already established theory. Still, the typology is useful when considering problems of underdetermination, for it addresses a model’s relationship both with theory and data.

In the case of agent-based social simulation, because of the multiparadigmatic and unformalised nature of social theory, models cannot be characterised as downward. Often, models are developed with only a loose connection to theory and rarely draw on it for epistemic support. In turn, practitioners of social simulation rely on a different set of motley methods and techniques; parameterisation or numerical solution methods, for example, are not commonly used. Finally, due to the nature of quantification in social science, additional concerns must be addressed in the comparison of the simulation output with external data. First, lack of correspondence might not be so easily identifiable, for comparison need not be made based on numerical magnitudes. Second, and more important, practitioners of agent-based social simulation operate under a relatively simplistic separation between empirical and artificial data that, among other things, does not sufficiently address the value-ladenness of external data.

Even though the categories of downward, motley, and autonomous do not apply to equation- and agent-based models in different disciplinary areas in the same way, they can still be used to identify sources of holist underdetermination. The extent to which this type of underdetermination impacts validation practices in agent-based social simulation is strongly context-bound. Initially, even though these models are not entirely downward, warrants for belief in the adequacy of a simulation do vary according to a model’s level of formalisation and integration into the background theoretical-methodological framework. In formalised areas of social science, hypotheses and theories are more explicitly connected, and basic philosophical principles and beliefs are more uniform. The validity of payoff matrices in iterated games is not extensively discussed in academic reporting, for the payoff matrix is already a standard interaction structure in social science. Most practitioners, conversely, would expect an interaction structure developed ad hoc for an agent-based model to be justified with data or theory. Even though, similar to mainstream social science, agent-based social simulation evidences low levels of formalisation, the literature on evaluation has yet to explore how these variations moderate validation processes.

Holist underdetermination also has different effects depending on the factors that could be categorised as motley, e.g. purpose of the model, type of data used for implementation, calibration and validation, and dimensions and locus of comparison between the model and the target phenomenon. In agent-based social simulation, the overall importance given to the validation process, as well as the amount of epistemic resources used to validate a model, vary in accordance with the purpose. Some practitioners even consider that simple abstract models do not require validation because they are not directly contrasted against empirical data (Edmonds et al., 2019). Likewise, the systematic use of qualitative data, something distinctive of social simulation, makes it harder to map the relationship between data, models and theories, for, in contrast to quantitative data, qualitative data is linguistically more expressive. Finally, because agent-based models are not implemented with the goal of finding a numerical solution for a system of equations, representational adequacy need not always be tested in the same way. A model might be expected to reproduce specific values or just general tendencies, and be judged based on a comparison of the target phenomenon against the simulation input, output, process or a combination thereof (Rand & Rust, 2011; Tesfatsion, 2017).

Perhaps where agent-based social simulation evidences a more distinct risk of holist underdetermination is in its epistemic autonomy. Equation-based simulations are considered autonomous because they are regularly used in the study of phenomena for which there are issues of accessibility, availability or reliability of data, hence the limited scope of ‘comparison against available observations’ as an evaluation criterion. However, because of the formalised and often downward nature of these simulations, the role of external data in the potential reduction or elimination of underdetermination, while not straightforward, is relatively well understood (Jebeile & Ardourel, 2019; Jebeile & Barberousse, 2016; Lenhard & Winsberg, 2010). That is not the case with agent-based social simulations. The combination of agent-based modelling’s computational expressiveness with social theory’s multiparadigmatic and unformalised nature has fostered the emergence of a multiplicity of decidedly distinct models that are meant to represent the same or similar phenomena (with varying degrees of precision and specificity). Given the inconclusive nature of evidence in social science, external data, regardless of the amount and quality, cannot be used in a confirmatory manner.

Take the case of Schelling’s (1971) model of residential segregation, a canonical example of agent-based social simulation. The model is a simple cellular automaton where agents, divided into two populations, must decide whether to stay in the same place or relocate, based on local preferences for similarity. In part, the model became popular because it showed that even mild local preferences for similarity can lead to clear segregation dynamics at the population level. Due to the potentially major social implications of this result, the model has been continuously replicated and extended, with varying degrees of contrast against empirical data (Bruch & Mare, 2009; Huang et al., 2014). It is, arguably, the agent-based social simulation for which the extended network of hypotheses has been more extensively explored, which, following the orthodoxy, should make it less susceptible to underdetermination and, overall, more validated.
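
To fix ideas, the relocation rule can be sketched in a few lines of code. The following minimal Python sketch is illustrative only: the grid size, the share of empty cells, the preference threshold, the toroidal neighbourhood and the random-relocation scheme are assumptions made for the example, not Schelling’s original parameterisation.

```python
import random

SIZE, EMPTY_SHARE, THRESHOLD = 20, 0.1, 0.3  # illustrative values, not Schelling's own

def neighbours(grid, x, y):
    """Occupants of the (up to) eight surrounding cells, wrapping around the edges."""
    cells = [((x + dx) % SIZE, (y + dy) % SIZE)
             for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    return [grid[i][j] for i, j in cells if grid[i][j] is not None]

def satisfied(grid, x, y):
    """An agent stays put if the share of like neighbours meets its preference threshold."""
    agent, near = grid[x][y], neighbours(grid, x, y)
    return not near or sum(n == agent for n in near) / len(near) >= THRESHOLD

def step(grid):
    """Move every unsatisfied agent to a randomly chosen empty cell."""
    empties = [(x, y) for x in range(SIZE) for y in range(SIZE) if grid[x][y] is None]
    for x in range(SIZE):
        for y in range(SIZE):
            if grid[x][y] is not None and not satisfied(grid, x, y) and empties:
                nx, ny = empties.pop(random.randrange(len(empties)))
                grid[nx][ny], grid[x][y] = grid[x][y], None
                empties.append((x, y))

# Random initialisation: two equally likely populations ('A', 'B') plus some empty cells.
grid = [[None if random.random() < EMPTY_SHARE else random.choice('AB')
         for _ in range(SIZE)] for _ in range(SIZE)]
for _ in range(50):
    step(grid)
```

Even in this stripped-down form, the sketch makes visible how many choices discussed below (neighbourhood definition, threshold value, relocation rule) are left open by the verbal description of the model.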

It is not clear, however, whether these multiple replications and extensions have significantly reduced the risk of underdetermination of Schelling’s model. The flexibility that practitioners experience when it comes to modelling, because of the features of both agent-based modelling and social theory, hinders the identification of the knowledge being tested and the extent to which it is being tested when a simulation is executed and the results are later validated. This, at the same time, makes it difficult to univocally determine whether a replication or extension is reducing the underdetermination of Schelling’s model or is, instead, contributing more generally to the research programme on residential segregation.

Most researchers will argue that Schelling’s model is about how variations in individual preferences for similarity impact the clustering that emerges at the macro level. Since the original model can be modified in aspects such as its spatial structure, population composition, preference function or relocation decision, several questions could be raised about what eventually counts as an implementation of the model. They could refer, initially, to the behaviour of the computational implementation, e.g. whether any kind of clustering constitutes evidence for segregation, or to the model’s capacity to reflect the features of the real-life phenomenon it intends to represent, e.g. whether warrants for belief could be significantly affected if the original cellular automaton is replaced by a GIS. Further inquiry could revolve around the need to make sense of the output at a higher level of abstraction. If the model uses, instead, a population of agents that tend towards integration, does it still count as an implementation of Schelling’s model? Likewise, did Schelling develop a model about residential segregation or about a mechanism of spatial asymmetry?

Many of the questions related to the model’s representational capacity or its implementation have been previously discussed in the literature. Benenson and Hatna (2011), for example, suggest that different types of segregation might emerge if some aspects of the implementation are modified. Similarly, Crooks (2010) claims that more intricate spatial structures affect the overall dynamics of segregation. Zhang (2004), in turn, argues that segregation emerges even if agents prefer to live in integrated neighbourhoods. Finally, Clark and Fossett (2008) argue that analysing the underlying mechanism at a higher level of abstraction might not yield understanding about social dynamics of segregation. Different extensions and replications do not all produce the same or equivalent results. Yet, variations in the simulation output are not used to prove or falsify previous models. Rather, they are approached as evidence of the real-life phenomenon’s complexity. Specific implementations are rarely challenged based on their diverging output: first, because segregation at the population level remains a consistent pattern, so most authors will argue that their model proves the robustness of Schelling’s original conclusions; second, because the different implementations usually remain conceptually plausible and not mutually exclusive; and, third, because there are no standardised criteria to decide whether a model is a replication or extension of Schelling’s model or just another model of residential segregation.

Common validation techniques in agent-based social simulation do not test the adequacy of assumptions regarding, for example, whether a model that changes Schelling’s original preference function for one where agents tend towards integration should count as an extension or whether conflicting results might be taken as negative evidence for Schelling’s model. To understand the effect of the conjoined network of hypotheses on warrants for belief in adequacy of representation, practitioners must inquire into aspects such as how they make sense of evidence both to position their work within general research agendas (Anzola & Rodríguez-Cárdenas, 2018) and to engage in wider collective dynamics of justification (Ylikoski & Aydinonat, 2014).

Having access to external data on segregation dynamics could certainly help to better bridge theory and models in agent-based social simulation and reduce underdetermination, among other things, by limiting the possible changes and the parameter space explored in an extension or by setting specific dimensions or loci of comparison. It is unlikely, however, that empirically calibrating the models will permit completely ruling out problems of holist underdetermination. Initially, a researcher will rarely have access to all the relevant empirical data associated with a complex social phenomenon such as segregation. In real life, for example, decisions to relocate have been shown to be affected by a multiplicity of factors, not necessarily connected, such as race, income, marital status, the housing market or job location (Bruch & Mare, 2006). In social science, there is, as well, always the question about the extent to which any empirical data set is representative of every possible situation. As a result, empirical models are likely to remain underdetermined in at least one key dimension. For example, extensions that use external data about preferences or decision-making processes (e.g. Bruch & Mare, 2006; Clark, 1991; Tsvetkova et al., 2016) remain underdetermined regarding contextual aspects, such as the spatial structure of interaction.

External data might also prove insufficient to rule out underdetermination that is connected not to the way the phenomenon is theorised about, but to beliefs and principles regarding how agent-based modelling, as a method, can be used to acquire knowledge of social phenomena. In agent-based social simulation, modelling practices are grounded on a particular understanding of the type of phenomena that can be modelled, i.e. complex social dynamics, and what is important to model about these phenomena. For example, the temporal evolution of the simulation is interpreted as a correlate for processes of emergence (Anzola, 2021). While there will likely be widespread agreement among practitioners that Schelling’s model evidences an emergent process in which local individual action produces unintended and uncoordinated clustering at the macro level, there might not be a consensus regarding the theory of emergence being tested. Does this theory of emergence assume the existence of levels of reality? If it does, are these levels ontological or epistemological? Is their relationship one of supervenience and realisation? Is there downward causation?

Holist underdetermination poses a challenge for standardisation tools, and, in general, for validation, because there is a significant amount of knowledge that is incorporated into the model interpretation and, yet, is not really tested during the process of validation. This knowledge is diverse in nature. It might be decidedly practical, e.g. about some aspects of the phenomenon for which there is no external data available, but could also have strong theoretical foundations, e.g. what practitioners believe constitutes a good explanation (and the adequate manner to report it). Practitioners rarely reflect upon the wider network of underlying knowledge during validation processes, for in most cases, it remains tacit during the simulation life cycle and knowledge transfer processes. This knowledge is not validated by practitioners in individual instances of modelling; rather, it becomes normalised through social consensuses in relatively complex dynamics of interaction.

Explicitly inquiring into the social epistemology of agent-based social simulation practices is necessary, for currently, this knowledge is unlikely to be revisited unless consensuses break down. The emergence of the KIDS approach to modelling (‘Keep it Descriptive, Stupid’; a preference for empirically calibrated models), for instance, fostered a reconceptualisation of the approach to validation practices in social simulation with its claim that simpler models are not necessarily truer (Edmonds & Moss, 2005). It made evident a growing discontent with the epistemic status of abstract models that, eventually, led to a change in warrants for belief in the adequacy of a simulation (Anzola, 2019a).

The debate regarding the representational capabilities of abstract and empirically calibrated models fostered major changes in knowledge justification processes. It evidenced, among other things, the need for practitioners to clearly state the purpose and scope of the model, so that others could judge the adequacy of the inferences. It also brought to the fore a concern with the technical and methodological aspects of the operation and validation of a simulation, which has resulted in an increasing identification and systematisation of the tools and mechanisms practitioners use to interact with evidence. Finally, it fostered a discussion about how agent-based social simulation is inserted into the scientific methodological and institutional landscape, for example, by stressing the importance of empirically calibrating the models, in order to raise the profile of the method. Standardisation tools were not popular at the time the KIDS approach emerged. Had they been, however, they would likely have remained unaffected by the debate, for, as contemporary practice evidences, they can equally accommodate abstract and empirically calibrated models.

3.2 Contrastive Underdetermination

This second form of underdetermination questions whether empirical results can be construed as evidence for more than one theory or hypothesis. In equation-based modelling in formalised disciplines, contrastive underdetermination is comparatively less significant, for both models and theories are formal and theories regularly have paradigmatic status. While the multiple aspects considered during the implementation of an equation-based model make the relationship between theory and models one-to-many, models are more often contrasted in terms of the adequacy of the numerical approximation they provide (Lenhard & Winsberg, 2010). The discussion of competing explanations proper is still mostly reserved for theories. Conversely, social simulation, because of the multiparadigmatic nature of social theory and the looser theory–model connection, is more likely to experience difficulties with contrastive underdetermination.

The literature in agent-based social simulation shows a longstanding awareness of the potential effects of contrastive underdetermination. This awareness, however, has not led to the formulation of any sort of account of contrastive explanation (i.e. in the form of ‘why p rather than q?’, instead of just ‘why p’). For example, when Epstein formulated the generative motto, ‘If you didn’t grow it, you didn’t explain its emergence’ (1999, p. 43), arguably the most popular explanatory principle in agent-based social simulation, he acknowledged that growing a phenomenon in silico was not sufficient for explanation, since there might be alternative micro foundations leading to the same macro pattern. His argument, though, was geared towards showing that generation was a necessary condition for explanation, so he limited himself to suggesting that candidate explanations should be dealt with in everyday practice, depending on their correspondence with empirical data.

The rigorous testing of candidate explanations, while conceptually straightforward, is rare in everyday practice. In one article, Gilbert (2003) performs a simple exercise in which he replicates the distinctive clustering of Schelling’s model using several alternative micro foundations (e.g. property values, cell history, self-defined populations). The exercise is carried out, in part, to evidence that it is possible, especially when using simple models, to grow a phenomenon in silico using a multiplicity of mechanisms. More interestingly, reflecting upon his experience as the editor of JASSS, agent-based social simulation’s flagship publication, Gilbert pointed out that contrastive explanations were far from standard in the journal and, overall, the social simulation literature.

It is not hard to understand why contrastive explanation might not be a generalised practice in social simulation: it puts a heavy burden on the process of validation. Theory in social science is not thoroughly formalised, so, in many cases, the same theory could, in principle, be used to support a variety of different models (e.g. Muelder & Filatova, 2018; Poile & Safayeni, 2016). In addition, there are several social phenomena for which available data is insufficient or unreliable, and for which proper collection processes might be expensive or even impossible. There are, as well, some instances where additional data might not be enough to decide among competing explanations (Ahrweiler & Gilbert, 2005). Finally, there might be some cases where a model is able to fit theory and data and, yet, fail to provide an illuminating explanation (Conte, 2009) or where the modeller is purposefully inquiring about possible rather than actual explanations (Ylikoski & Aydinonat, 2014).

In an article arguing why sociologists should use agent-based models, Chattoe-Brown (2013) claims that Schelling’s model ‘certainly shows a bi-directional interaction process between individuals and social entities (‘neighbourhoods’)’ (para. 5.4), making a reference to Giddens’ (1984) structuration theory. The alleged bidirectionality and the connection with structuration theory are a matter of interpretation. The plausibility of this reading of Schelling’s model depends on the extent to which Giddens’ ‘duality of structure’, a particular approach to the reproductive character of social life, can be made sense of. Some practitioners, for instance, could find it more plausible to interpret the model in terms of Merton’s (1936) idea of unanticipated consequences of action. Merton’s approach, unlike structuration theory, emphasises the bottom-up character of social phenomena and can dispense with any form of downward causation.

In spite of the differences between the two interpretations, both are equally possible, given the loose connection between the main output of the model, i.e. spatial clustering, and the network of beliefs, principles and theories about segregation as a complex social phenomenon. The methodological features of the cellular automaton do not have a univocal, straightforward interpretation. Potential interpretations are provided by the modeller and contested or agreed upon by the community, depending on shared theoretical and methodological commitments. While some practitioners might consider Chattoe-Brown’s interpretation compelling, others will find Merton’s more convincing. There might even be some practitioners who come up with a synthesis of the two.

In the context of Schelling’s model, all these theoretical views are underdetermined by the data provided by the simulation. Since the model does not attempt to realistically represent the reasons for individuals to move, the interpretation of the resulting macro pattern can equally accommodate accounts of preference formation that incorporate or waive downward causation. Empirically calibrating the model may reduce contrastive underdetermination in agent-based social simulation, but, again, only to a certain extent, for models can hardly be used for confirmatory purposes. Regardless of the amount and quality of data that could be potentially collected and used for the design, calibration and validation, a model alone will lack sufficient explanatory power to rule out alternative theoretical conceptualisations of segregation.

For example, empirical research on segregation has focused on two major issues: the clarification of some aspects of measurement, e.g. the quantification of segregation, and the incorporation and production of suitable data for calibration and validation, e.g. survey data on individual preferences for relocation (Bruch & Mare, 2009; Huang et al., 2014). While empirical research on residential segregation has naturally advanced the overall understanding of segregation dynamics, there are a few issues that could make it difficult to avoid contrastive underdetermination when addressing a highly complex theoretical construct such as segregation. Initially, residential segregation is an easily observable phenomenon, but its quantification is far from straightforward. There are aspatial and spatial operationalisations of segregation that involve neither the same variables nor the same scales and levels of measurement, ultimately affecting both the outcome of the model and the conceptualisation of the phenomenon (Reardon & O’Sullivan, 2004; Bruch & Mare, 2009). In addition, the empirical research on decision-making has centred on the preferences themselves, but not on how they form (Clark & Fossett, 2008; Huang et al., 2014), which is crucial for identifying whether there are any downward effects. Finally, any potentially relevant downward effect might be difficult to conceptualise, for its computational representation might be affected by unrelated issues of implementation. The character of the clustering, for instance, varies depending on the computational agent’s vision (Fossett & Dietrich, 2009). Thus, a model with empirically calibrated decision-making will still not be able to avoid contrastive underdetermination associated with ecological conditions of social interaction.
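
To make the measurement point concrete, one widely used aspatial operationalisation is the index of dissimilarity, which sums, over spatial units, the absolute differences between each unit’s share of the two populations. The sketch below is a minimal illustration; the tract counts are made-up numbers.

```python
def dissimilarity(counts_a, counts_b):
    """Index of dissimilarity: half the summed absolute differences between each unit's
    share of population A and its share of population B (0 = even, 1 = fully segregated)."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    return 0.5 * sum(abs(a / total_a - b / total_b)
                     for a, b in zip(counts_a, counts_b))

# Hypothetical counts of two groups across four spatial units.
print(dissimilarity([90, 10, 80, 20], [10, 90, 20, 80]))  # 0.7
```

The same residential pattern can yield different values depending on how the spatial units are drawn, which is part of the reason why aspatial and spatial operationalisations diverge.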

If competing explanations are not evident and the results of a model cannot be used as evidence to rule out these alternative explanations, it becomes necessary for practitioners to analyse surrounding beliefs and assumptions that affect judgements on adequacy. For example, as mentioned before, agent-based social simulation’s account of explanation strongly relies on the generative principle. This explanatory principle, however, has been questioned because, among other things, it accounts for emergence, but not for downward causation (Conte, 2009). Should downward causation become an important explanatory requirement, a change in beliefs about adequacy of representation, similar to the one produced by the popularisation of the KIDS approach to modelling, will likely occur.

As with holist underdetermination, standardisation practices cannot account for contrastive underdetermination or additional explanatory requirements during validation practices. The contrast of alternative explanations is not something that standardisation tools explicitly contemplate, so practitioners must decide whether to report on it. In agent-based social simulation, however, there are noticeable negative incentives for the identification of alternative explanations: the localised nature of models (i.e. the fact that they offer relatively restricted explanations) and the looser theory–model connection. When narratives put forward in scientific reporting do not directly address alternative models and explanations, additional epistemic resources must be invested to determine whether two models offer competing explanations. For example, Hegselmann (2017) shows the level of detail in the analysis of the implementation that must be reached to identify that Sakoda’s (1971) model, a cellular automaton in which agents, divided into two populations, relocate based on weighted sums of the attitudes (negative, neutral or positive) they hold towards both groups, can be considered a generalised instance of Schelling’s.
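
The kind of implementation detail at stake can be hinted at with a stylised sketch of the two relocation criteria. The attitude values, the satisficing threshold and the reduction of both rules to a single scoring step are simplifying assumptions made for this example, not the original specifications analysed by Hegselmann.

```python
def sakoda_score(my_group, neighbour_groups, attitudes):
    """Sakoda-style valuation: sum of the agent's attitudes (e.g. -1, 0, +1)
    towards the groups of the surrounding agents."""
    return sum(attitudes[my_group][g] for g in neighbour_groups)

def schelling_satisfied(my_group, neighbour_groups, threshold=0.5):
    """Schelling-style rule: stay if the share of like neighbours meets the threshold."""
    if not neighbour_groups:
        return True
    return sum(g == my_group for g in neighbour_groups) / len(neighbour_groups) >= threshold

# A Schelling-like preference can be mimicked with an attitude matrix that values the
# in-group positively and the out-group negatively; seeing the correspondence, however,
# requires working at the level of the implementation rather than the published narrative.
attitudes = {'A': {'A': 1, 'B': -1}, 'B': {'B': 1, 'A': -1}}
print(sakoda_score('A', ['A', 'A', 'B'], attitudes))    # 1
print(schelling_satisfied('A', ['A', 'A', 'B'], 0.5))   # True
```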

Underdetermination is, by no means, the only source of problems of interpretation. It, nonetheless, illustrates how the validation of a computational model is not a self-contained activity that can be approached as a first-order analysis. In agent-based social simulation, the model output and the available external data for validation alone do not provide warrants for belief in the adequacy of a computational model. Some criteria from which these warrants are produced go beyond the implementation and operation of any given simulation and might remain tacit for the entire simulation life cycle. Considering the scope of standardisation tools, these criteria would be left unaccounted for if the impact of social epistemology is not independently analysed.

Making these criteria explicit could further understanding about dynamics of theory building and testing in agent-based social simulation, particularly regarding the interplay between individual models, model clusters and research programmes. Due to the multiparadigmatic and unformalised nature of social theory, validation practices take place in a context where the theory–model relationship is many-to-many and novel data, because of its inconclusive nature, cannot fully rule out issues of interpretation. By clarifying the links between warrants for belief and wider networks of knowledge, social epistemology could not only render validation practices more transparent, but also contribute to relevant contemporary debates in social science that hinge on this diversity of alternative models and theories, such as the connection between model pluralism and adequacy of explanation (e.g. Aydinonat, 2018; Grüne-Yanoff & Marchionni, 2018).

4 Commensuration

Science’s success greatly depends on being able to classify and compare. Dynamics of commensuration can take different forms, depending on the disciplinary tradition. This section will centre on two popular forms of commensuration in agent-based social simulation: docking and replication. The former is the inquiry into whether two models can produce the same outcome, so as to allow for comparison, selection or subsumption (Axtell et al., 1996). The latter is an alternative implementation of an already existing model that differs from the original model in at least one of the following dimensions: time, hardware, language, toolkit, algorithm and authors (Wilensky & Rand, 2007). Two cases will be discussed to exemplify how issues of social epistemology might affect dynamics of commensuration: Axtell et al.’s (1996) docking exercise, and Will and Hegselmann’s replication of Macy and Sato’s trust model (Macy & Sato, 2002; 2008; 2010; Will, 2009; Will & Hegselmann, 2008a; 2008b).

4.1 Docking

Axtell et al.’s (1996) article discusses the docking exercise carried out to commensurate two computational models of cultural transmission, originally developed by Axelrod (1995) and Epstein and Axtell (1996). Axelrod’s model is a cellular automaton where agents are endowed with culture, codified in a five-position vector (each position with 10 possible values). In each iteration, two agents interact, i.e. adopt the same value for one of the five vector positions, with a probability based on cultural similarity. Agents in Epstein and Axtell’s Sugarscape model are, as well, endowed with culture, codified in an 11-bit vector (each bit with 2 possible values). In each iteration, agents also have the chance to interact with a partner and adopt the same value for one bit of the culture vector. The model, however, is much more complex than Axelrod’s, for it is set in a foraging environment, where agents engage in dynamics such as trade, combat or reproduction.
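
A minimal sketch of Axelrod’s interaction rule may help clarify what is being docked. The vector sizes follow the description above; the representation of agents as plain Python lists and the omission of the grid and pairing machinery are illustrative simplifications.

```python
import random

FEATURES, TRAITS = 5, 10  # five cultural features, each taking one of ten possible values

def interact(active, partner):
    """One cultural interaction: with probability equal to the agents' similarity
    (shared features / total features), the active agent copies one differing feature."""
    similarity = sum(a == p for a, p in zip(active, partner)) / FEATURES
    if random.random() < similarity:
        differing = [i for i in range(FEATURES) if active[i] != partner[i]]
        if differing:
            i = random.choice(differing)
            active[i] = partner[i]

# Two hypothetical neighbouring agents, represented as mutable culture vectors.
a = [random.randrange(TRAITS) for _ in range(FEATURES)]
b = [random.randrange(TRAITS) for _ in range(FEATURES)]
interact(a, b)
```

Seeing how such a rule relates to Sugarscape’s bit-flipping mechanism is precisely the kind of alignment work the docking exercise documents.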

The article documents the collaborative effort in which the authors of both models engaged to make the comparison possible. The collaboration started with agreement on a research agenda that included issues such as a plan to modify the implementation of the models, the selection of commensuration criteria and a timetable for the activities required. Extensive collaboration among the authors was needed to dock the models because, while they both address the phenomenon of cultural transmission, each model follows different research interests and logics of research. While Axelrod’s model is designed to experimentally test a single mechanism of cultural convergence and polarisation, Epstein and Axtell’s model is designed to be a fully fledged artificial society (Axtell et al., 1996).

The article is interesting, for it reports on the entire docking exercise, not just its results. Hence, it provides insights into commensuration as a scientific practice. For the present discussion, it is worth noting that the authors encountered obstacles that are linked to matters beyond implementation and the comparison of the simulation output with the phenomenon of interest. For instance, standard reporting practices are criticised. According to the authors, some models are poorly documented in scientific reports, making commensuration exercises impossible without collaboration. In addition, they question the lack of incentives for docking, even though it is a practice that could clearly help both test the robustness of the results and stress the importance of commensuration in scientific inquiry. Journals and funding bodies, they suggest, should develop a normative framework that rewards this type of validation exercise.

4.2 Replication

Inadequate documentation in academic reporting impacts replication practices as well. Unlike Axtell et al.’s (1996) docking attempt, the replication of Macy and Sato’s (2002) trust model is carried out as an independent effort. The original model inquires about possible causal mechanisms that explain cross-societal differences in trust-building processes. It is an iterated prisoner’s dilemma with an option to exit, where trust formation is explored by allowing agents in a network to interact, based on conditional strategies that take into account the agent’s propensity for cooperation, the position of both agents in the network (which makes the exchange local or global) and the perceived trustworthiness of the partner.

The replicators initially tried to replicate the model within a pedagogical context. Yet, given the difficulties encountered, they later decided to reflect on their experience in an academic outlet. The most significant obstacle, the replicators argue, is that the documentation for the initial model was insufficient and, at times, ambiguous and inconsistent (Will & Hegselmann, 2008a). Their effort resulted in two independent models, neither of which was able to reproduce the basic dynamics of the original model. In part, Will and Hegselmann’s exercise was negatively impacted by not being able to properly communicate with the original authors. During the process, some brief questions about the model were answered by email. Yet, the replicators did not have access to the source code or the unconsolidated results of the original model. Following the first reply from the original authors (Macy & Sato, 2008), there is an additional replication that uses the source code (Will, 2009). While the results of the original model are successfully replicated this time, the replicator considers there to be a problematic assumption built into the model. The original authors, however, disagree with this assessment (Macy & Sato, 2010).

The lack of communication between original and replicating authors initially increased the duration of the replication process. For the docking exercise, results were produced after a couple of months and disseminated in a joint publication the following year. Comparatively, for the replication exercise, the original model was published in 2002, the first replication attempt was carried out in 2008 and the last publication on the issue dates from 2010. More importantly, not having direct communication increased the amount of epistemic resources used for commensuration. The narrative provided in the articles documenting the replication process evidences the complex sense-making both the original and the replicating authors went through. These processes were demanding, especially for the latter, given the lack of success in the initial replication. Even though the results of the original model were eventually replicated, the exercise ended, at least in academic publications, without a complete agreement about the practical and theoretical consequences that should be drawn from the trust model. It is not entirely clear whether the model should be considered validated.

4.3 Normalisation of Knowledge

There are two key aspects of commensuration in agent-based social simulation evidenced by these two cases that are worth highlighting. First, it is common to find models that address the same phenomenon but do not readily lend themselves to commensuration, for they are grounded on significantly different representational strategies. Second, exercises on commensuration, even if rigorously and thoroughly performed, might lack the confirmatory power to fully accept or refute the results of a given model. These are problems the authors themselves identify, which motivates the call for communication and collaboration in commensuration exercises that is made in the articles. Over the years, the literature on commensuration in agent-based social simulation has especially centred on the potential positive effects of the former: communication.

Communication in commensuration processes is fundamental because it allows making tacit knowledge explicit. Standardisation tools could, in principle, be used to avoid problems of contingency, e.g. the availability of, or cooperation from, the original author. In a review of the ODD protocol, Grimm et al. (2010) claim that the increasing popularity of the protocol has allowed for more rigorous formulation of computational models. Having to break down the simulation processes into the different categories of ‘Overview, Design concepts, and Details’, they suggest, has made practitioners aware of their theoretical motivations, eventually facilitating practices such as replication.
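
For reference, the revised version of the protocol they review organises a model description under seven elements: purpose; entities, state variables and scales; process overview and scheduling; design concepts; initialisation; input data; and submodels (Grimm et al., 2010).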

The protocol, however, cannot guarantee that the necessary or right criteria are used for commensuration. Standardisation renders the process more transparent. While the success of some commensuration exercises depends on making tacit knowledge explicit, in other cases, like Macy and Sato’s trust model, there might be additional representational and operational aspects for which new forms of consensus must be reached. As such, their resolution hinges more generally on the social epistemology of agent-based social simulation, particularly the criteria and procedures that practitioners have adopted or devised for the resolution of disagreements. Standardisation tools do not entirely solve issues of validation: they might help make tacit knowledge explicit, yet they cannot normalise it.

The issues that require new consensuses and normalisation do not arise from the process of validating an individual model, but from beliefs and principles about the knowledge that can be obtained through agent-based modelling as a method and agent-based social simulation as a scientific practice. In general science, for instance, replication is often taken as a hallmark of the rationality of science. On one hand, successful replication constitutes one of the most robust types of confirmation; on the other hand, it stresses the peer-reviewed nature of scientific discovery (Burman et al., 2010; Collins, 1992; Resnick, 2013). In agent-based social simulation, conversely, there are different approaches to replication due to the unevenly distributed skill and knowledge landscape. Since practitioners are not trained as computational social scientists, new entrants, particularly those with a background in social science, often lack programming skills. These skills are occasionally acquired by replicating previous computational models. The reason is simple: this approach bypasses the nuances associated with formulating and validating the model. It allows new entrants to focus on the technical aspects of computational modelling, while still providing contextually relevant learning practices. This view of replication as a task for new entrants coexists with a view of replication as an activity that demands significant knowledge and command of the method, for it helps identify and assess differences in alternative implementations of a model. While the former view is not meant to impact validation processes, the latter view has validation as its main concern.

Agent-based social simulation’s idiosyncratic approach to replication practices has produced a distinctive set of beliefs and practices when it comes to commensuration. Unlike general science, in agent-based social simulation, replication (and, to a certain extent, docking) is sometimes perceived as an exercise in verification (e.g. Edmonds & Hales, 2003; Gilbert, 2010), rather than validation. This is due, in part, to the role played by the implementation process in research practices, but also to the fact that, so far, several well-known exercises of commensuration have focused on (and found difficulties with) implementation. As a result, the process of replication has been framed within a view of epistemic justification where the replicator has the role of ‘[...] catching and correcting one’s errors’ (Kerr et al., 1996, p. 696). This ‘individualist’ view of the role of replicators in social simulation stands in stark contrast with the ‘collectivist’ view that prevails in general philosophy of science. Replication has a privileged epistemic status in both realist (Popper, 1959) and constructivist (Collins, 1992) epistemologies, for it is seen as an exercise in collective knowledge-building: it allows for an inter-subjective testing and confirmation of knowledge.

While commensuration does indeed help with verification in agent-based social simulation, its role in validation should not be overlooked. Scientific results and their material realisation (in this case, computational models) have a many-to-many relationship. Thus, commensuration is one of the mechanisms to turn simulation results into knowledge claims. To produce and accumulate knowledge, criteria to deal with mismatches in commensuration must be developed and normalised. It is on the basis of these criteria, on one hand, that concepts such as ‘success’, ‘error’ or ‘truth’ are articulated and, on the other hand, that measurement and testing standards are established.

Social simulation faces distinctive challenges for the development of criteria that adequately cover both fronts. The definition and application of measuring and testing standards have so far proven difficult, in part because programming languages are sufficiently flexible, both syntactically and semantically, that the same theory (e.g. Muelder & Filatova, 2018; Poile & Safayeni, 2016) or formal model (e.g. North & Macal, 2002) can easily lead to different computational implementations. While the equation-based modelling literature has previously acknowledged that some issues of commensuration mismatch may be due to the methodological nature of computer simulation (Lenhard & Küster, 2019; Lloyd, 2018; Parker, 2017; Roy, 2019), commensuration practices regularly rely on well-defined numerical magnitudes linked to the variables measured through an underlying formal theory. In agent-based modelling, conversely, points of commensuration, especially for docking exercises, need to be discussed and agreed upon. Axtell et al. (1996) put forward probably the most well-known commensuration typology, according to which commensuration practices in agent-based modelling can be carried out on three separate dimensions: the traditional numerical identity (i.e. correspondence in numerical output), along with distributional equivalence (i.e. statistical equivalence in distributions of results) and relational alignment (i.e. correspondence in internal relationships among results). The last two categories, while rare in general science, are, they suggest, probably the most suitable dimensions of commensuration for agent-based social simulation.
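
As an illustration of what testing for distributional equivalence can look like in practice, one common statistical choice is a two-sample Kolmogorov-Smirnov test on a summary statistic collected from repeated runs of the two implementations; the output values below are placeholders, not results from any of the models discussed here.

```python
from scipy import stats

# Placeholder outputs of the same summary statistic from repeated runs of two implementations.
impl_a = [0.61, 0.58, 0.64, 0.59, 0.62, 0.60, 0.63, 0.57]
impl_b = [0.60, 0.59, 0.65, 0.58, 0.61, 0.62, 0.64, 0.56]

# Distributional equivalence is typically claimed when the hypothesis that both samples
# come from the same distribution cannot be rejected at an agreed significance level.
statistic, p_value = stats.ks_2samp(impl_a, impl_b)
print(f'KS statistic = {statistic:.3f}, p = {p_value:.3f}')
```

Which statistic to collect, how many runs to perform and what significance level counts as equivalence are, again, matters that have to be discussed and agreed upon rather than read off the models.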

Adopting measurement and testing standards that are not common in other branches of science also limits the extent to which external commensuration apparatuses can be successfully employed in social simulation, especially for empirically calibrated models.Footnote 8 Traditionally, ‘measuring’ has been understood as the process of estimating a ratio between a magnitude being measured and a magnitude used as a standard (Michell, 2007). If numerical identity cannot be systematically used as a dimension of commensuration, a different approach to the quantification of the magnitudes, the ratio and the estimation must be adopted.Footnote 9 Practitioners have usually acknowledged that validation techniques cannot be transferred from other disciplines or fields without prior critical discussion, for there should be an agreement on how adequate measurements are produced when using agent-based modelling (Lee et al., 2015; Lorscheid et al., 2012; ten Broeke et al., 2016; Windrum et al., 2007). This has led to an ongoing revision of the methodological tools available for validation, as well as an epistemological revision of how the method instantiates even the most common scientific epistemic goals, e.g. prediction (Troitzsch, 2009). It is not clear, however, whether every aspect that affects warrants for belief in the adequacy of representation can be satisfactorily addressed, especially when considering the place of agent-based modelling in the larger methodological landscape of the social sciences. It has been argued, for example, that the epistemic opacity of computer simulation could negatively affect judgements about adequacy among social researchers who are used to methods in which results are derived analytically (Lehtinen & Kuorikoski, 2007).

The definition and application of key concepts of commensuration are also affected by the methodological features of agent-based modelling. The literature on replication has shown that these models are highly sensitive to changes. Simple arithmetic modifications can affect the emergent macro pattern (e.g. Edmonds & Hales, 2003). In turn, modifying apparently innocuous assumptions can sometimes radically change the simulation outcome (e.g. Galán & Izquierdo, 2005). Results can equally vary due to predefined features of the high-level language or software (e.g. Anzola & Rodríguez-Cárdenas, 2018). In some cases, it has been acknowledged that conflicting results might not be worked out without a socialisation of behavioural or structural assumptions implicitly built into the model (Rouchier, 2003), and that judging the adequacy of two simulations with contradictory output might not depend on the amount of data available (Ahrweiler & Gilbert, 2005). Given that the evidence produced by agent-based social simulation is, comparatively, not easily quantifiable, the definition of ‘experimental success’ has to be carefully articulated, so as to avoid unnecessarily undermining knowledge claims supported by this type of evidence.
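As a purely illustrative toy example (not a reconstruction of any of the models cited above), the following Python sketch shows how an apparently innocuous implementation choice, a strict versus a non-strict inequality in an adoption threshold, can flip the emergent macro outcome of a simple contagion process on a ring.

```python
import random

def run(strict, n=200, steps=60, seed=1):
    """Toy contagion on a ring: an agent adopts when the share of its two
    neighbours that have adopted meets a 0.5 threshold. The only difference
    between the two variants is '>' versus '>=' in that comparison."""
    random.seed(seed)
    state = [0] * n
    for i in random.sample(range(n), 10):      # a few initial adopters
        state[i] = 1
    for _ in range(steps):
        new = state[:]
        for i in range(n):
            share = (state[(i - 1) % n] + state[(i + 1) % n]) / 2
            adopts = share > 0.5 if strict else share >= 0.5
            if adopts:
                new[i] = 1
        state = new
    return sum(state) / n                      # final adoption rate

print("strict '>':     ", run(strict=True))
print("non-strict '>=':", run(strict=False))
```

In this toy setting, the non-strict variant typically cascades through most of the ring, while the strict variant barely moves beyond the initial adopters, even though both implementations could plausibly be read as the same verbal specification.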

Increasing confidence in commensuration as a tool for validation, particularly when the results are conflicting or contradictory, requires the normalisation of some aspects of social epistemology, such as the effect of epistemic goals on warrants for belief in the adequacy of representation. Rouchier (2003) describes a failed attempt to replicate a model addressing the emergence of speculation in economic exchange involving different types of agents and goods. She claims that the inability to replicate the results of the original model is due to a difference in the implementation of the cognitive processes that control the agents’ decision-making. Her analysis indicates that the decision heuristic of the original model is not realistic, yet, she acknowledges, it might have been purposely designed that way, for the goal of the original author was to use a computer simulation to replicate experimental results. While a more realistic decision-making heuristic could be desirable, it should not always be expected or required. What is missing in this case is a discussion about the extent to which realism in representation influences warrants for belief in the adequacy of representation.

As with interpretation, the distinctive nature of social theory and data affects commensuration practices in agent-based social simulation. It has led practitioners to develop idiosyncratic testing standards and criteria of success that accommodate the challenges for quantification in the social sciences. More importantly, it has fostered the adoption of a logic of confirmation that minimises the role of commensuration as an exercise in collective knowledge testing. As a result, several criteria of evaluation that are more generally associated with higher-order beliefs, values and research goals, e.g. realism in representation, are not sufficiently addressed in the validation literature and their use remains mostly tacit. In this context, social epistemology has the potential to provide a more nuanced understanding of the connection between judgements on adequacy and the social, cognitive and physical organisation of agent-based social simulation.

5 Accounting for Social Epistemology in Validation Practices

Even though thinking of validation as a matter of correspondence is conceptually straightforward, in practice, judgements about adequacy are mediated by a multiplicity of criteria that are not univocal, nor linked exclusively to a first-order analysis of the simulation and its results. There are, as well, noticeable differences in the way correspondence is meant to be tested. The literature usually distinguishes among four types of validation:

  • input/micro-face validation (i.e. correspondence in the elements implemented in the model),

  • process/macro-face validation (i.e. correspondence in the patterns and processes produced by the simulation),

  • descriptive output/empirical input validation (i.e. accommodation of previous data),

  • predictive output/empirical output validation (i.e. prediction of new data) (Rand & Rust, 2011; Tesfatsion, 2017).

There is a multiplicity of techniques, not necessarily mutually exclusive, that can be deployed to evaluate each type of validation. In turn, not all validation techniques measure correspondence in the same way. The specific locus of comparison and the techniques employed depend on issues related, first, to the problem being modelled, e.g. the amount of data available, second, to its implementation, e.g. the epistemic goals set for the model, and, third, to general beliefs and principles underlying the overall practice of agent-based modelling, e.g. what the modeller believes to count as an explanation.Footnote 10
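The difference between the last two types listed above can be illustrated with a minimal sketch (in Python, using a fabricated adoption series and a toy logistic growth rule as a stand-in for an actual agent-based run): the same parameterisation is first assessed on how well it accommodates previously available data and then on how well it predicts data it has never seen. All names, numbers and thresholds are illustrative assumptions.

```python
import numpy as np

# Illustrative 'empirical' adoption series, split into a calibration window
# (previously available data) and a held-out window (new data).
observed = np.array([0.05, 0.09, 0.16, 0.27, 0.41, 0.56, 0.70, 0.80, 0.87, 0.92])
calibration, holdout = observed[:6], observed[6:]

def simulate(growth, steps, start=0.05):
    """Toy stand-in for an agent-based run: logistic-style adoption curve."""
    x, trajectory = start, []
    for _ in range(steps):
        x = x + growth * x * (1 - x)
        trajectory.append(x)
    return np.array(trajectory)

# Descriptive output / empirical input validation: choose the growth rate that
# best accommodates the calibration window.
candidates = np.linspace(0.1, 1.5, 50)
errors = [np.mean((simulate(g, len(calibration)) - calibration) ** 2) for g in candidates]
best = candidates[int(np.argmin(errors))]

# Predictive output / empirical output validation: confront the calibrated
# model with the data it has not seen.
predicted = simulate(best, len(observed))[len(calibration):]
rmse = float(np.sqrt(np.mean((predicted - holdout) ** 2)))
print(f"calibrated growth rate: {best:.2f}, held-out RMSE: {rmse:.3f}")
```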

Because of this diversity, standardisation tools have a limited role in improving validation practices, for, as claimed before, these tools are not designed for normalisation. There might even be issues of social epistemology surrounding the application of standardisation tools that warrant some discussion. To be successful, standards must be widely adopted. Yet, in spite of the multiple benefits associated with the use of protocols, for example, the proportion of articles that include the ODD, arguably the most popular protocol, is relatively low and clearly varies along traditional disciplinary lines (Grimm et al., 2020). There are likely some institutional factors linked to logics of reporting that are preventing both authors and journals, regardless of the disciplinary tradition, from consistently adopting this practice.

Identifying the reasons for the still limited popularity of standardisation tools requires explicitly addressing the nature of testimony in agent-based social simulation. The epistemology of testimony could also help understand the nature of epistemic trust in social simulation. Practitioners often highlight that individual interests, ideologies, goals and values affect the process of modelling. It is necessary, at the same time, for them to acknowledge and inquire into how several individual cognitive features are learnt or acquired from peers and upheld collectively, and how they are permeated and affected by institutional factors, such as the cognitive division of labour, the reward system or the reputation landscape (Kitcher, 1993; Strevens, 2003). The uneven popularity of Schelling’s and Sakoda’s models, for example, could be partly attributed to the academic standing of both authors before the consolidation of computational social science (Hegselmann, 2017). Explicitly addressing these dynamics is necessary in the case of social simulation, given that, as mentioned, models do not univocally contribute to a single theory with paradigmatic status, so, in comparison to equation-based modelling in formalised disciplines, it is more likely for models offering competing explanations to coexist without creating disciplinary tension, and for the community of practitioners to be divided around these models.

The possibility of finding competing explanations that cannot easily be ruled out by linking back to a background formal theory, together with the additional challenges for commensuration in social simulation, requires practitioners to pay attention not only to testimony but also to the epistemology of disagreement. It might be useful, first, to clarify the conditions under which a given set of positions constitutes a disagreement, e.g. whether not using numerical identity as the main commensuration criterion increases or decreases the chance of disagreement, and, second, the types and orders of evidence that might come into play when validating an agent-based social simulation, e.g. whether replication is believed to provide higher-order evidence for extensions.

Social epistemology could equally provide some insights into the effect of group justification dynamics linked to aspects such as the interdisciplinary and collaborative nature of agent-based social simulation. Researchers converging in everyday practices of computer modelling are trained in different disciplines and might operate under philosophical principles that are not necessarily socialised and subsequently normalised. This cognitive gap between practitioners’ knowledge, expertise and approaches to scientific practices generates particular conditions of epistemic dependence and also imposes some obstacles to collaborative work, among other things, because there are no shared values, methods or theories (MacLeod, 2016; Wagenknecht, 2016). There is, then, an opportunity for practitioners to engage in an ongoing sense-making process that identifies these disciplinary differences, along with tensions in general science and the philosophy of science, so that the multiple criteria used during validation practices are made explicit. This process should start by characterising the current social, physical and cognitive organisation and its effect on everyday interactions (e.g. how it moderates beliefs about adequate reporting standards), and later move forward to accommodate new concerns that arise with the maturation and evolution of the practice of social simulation (e.g. the ethics of computational modelling).

The sense-making process should also foster a critical assessment and update of the knowledge that has been transferred from other disciplines or fields. Representation, as mentioned, is a current topic of debate in philosophy of science that might influence validation processes and warrants for belief in the adequacy of the method. It is not the same, though, to assume that the model and the target phenomenon are isomorphic, as to assume the former is a fiction that distorts important features of the latter (Frigg & Nguyen, 2017). While the first interpretation can partially fit the idea of the computational model as a direct simplified representation of the target phenomenon that could be validated in terms of similarity or isomorphism, the second interpretation radically differs from it. Practitioners of agent-based social simulation have yet to address the nature of representation in computer simulation, in spite of these models being, in comparison with equation-based models, more representationally flexible.

The results from these sense-making and normalisation processes need not be generalised. It is possible, in fact, that new consensuses lead to further fragmentation of the practice of agent-based social simulation, depending on how different epistemic goals are parsed and different modelling trade-offs are dealt with (Edmonds et al., 2019; Graebner, 2018; Matthewson & Weisberg, 2009; Parker & Winsberg, 2018). For instance, it is regularly argued that, in comparison with comprehension, prediction requires agents with more intricate cognitive structures. Hence, validation processes might eventually vary in accordance with the computational agents’ cognitive structure. Likewise, the acknowledgement and popularisation of non-traditional modelling goals such as optimisation or control might propagate new practices and lead to more complex modelling styles and preferences. Within the applied subdisciplinary context of organisational decision-making, it might be the case that models’ mechanisms become less transparent because of the subordination of explanation to problem-solving.

The scope of consensuses could also be limited, first, by the model-based nature of social simulation and, second, by the social character of science as an institution. Initially, model-based reasoning is an activity that cannot be separated from the cognitive features of the knowing subjects that intervene in the processes of design, operation, analysis and socialisation of computational models, nor from the social and physical context in which these processes are carried out. Thus, there will be significant contextual limits to the level of generality that normalisation processes can achieve in social simulation. In turn, additional validation criteria might be introduced, with varying degrees of consensus among practitioners, following bidirectional reinforcing dynamics of influence between science and society. Policy-oriented practitioners could start using criteria of evaluation associated with impact,Footnote 11 an issue that current validation processes rarely address.

While normalisation is unlikely to be universal, it is nonetheless necessary. Agent-based modelling has yet to cement its status as a reliable scientific method in social science. In part, this is because some questions remain regarding how the method is used to tackle representation, i.e. how it provides reliable indirect knowledge of the phenomenon of interest. Making explicit and normalising the general beliefs and criteria that support validation processes could help, on the one hand, to provide a more accurate, consensual and transparent answer to those who still do not fully trust the methodological soundness of the method and, on the other hand, to reduce the amount of epistemic resources used for processes such as commensuration and interpretation, making it easier for researchers to achieve higher levels of theoretical detail and integration in everyday practices.

6 Conclusion

This text discussed the need for practitioners of agent-based social simulation to address the effects of social epistemology on validation practices. It argued, first, that agent-based social simulation experiences distinctive challenges for validation linked to the combined effects of the computational expressiveness of agent-based modelling, the multiparadigmatic and unformalised nature of social theory and the inability to use social data for confirmatory purposes; and, second, that tackling issues of social epistemology through standardisation is a strategy of limited success, for these tools are unable to make explicit all the knowledge that affects warrants for belief in the adequacy of a simulation.

After briefly describing how social epistemology permeates everyday practices of social simulation, its impact on interpretation and commensuration, two fundamental activities in validation practices, was addressed. Social epistemology, it was claimed, can minimise issues of interpretation by making sense of the many-to-many theory–model relationship in agent-based social simulation and equally help make explicit the idiosyncratic rationale for criteria and procedures of commensuration. In both cases, it was suggested, standardisation tools could render the process more transparent, but are unable to normalise some knowledge that is needed to strengthen validation practices.

Overall, the discussion about interpretation and commensuration highlights the opportunity to develop new consensuses in agent-based social simulation in which knowledge production and transfer dynamics, as well as the use of computational models as objects of representation, are better understood as a collective endeavour. The acknowledgement that social epistemology has an impact on validation offers an opportunity to intervene in the evolution of key scientific practices. Agent-based social simulation is an area of research whose approach to validation provides significant advantages. For instance, it provides exemplary evidence of the role of scientific replication, in a context where the general reproducibility of science has been put into question (Baker, 2016). It also has opportunities to improve, however. In most cases, for example, models are organised in clusters of basic models with multiple replications/extensions that might not be entirely cogent, nor lead to fully articulated theories. Reducing the amount of resources required for interpretation and commensuration might not only help appraise individual models, but also systematise the knowledge produced in agent-based social simulation.