10.1 Introduction and Motivation

This chapter examines the superclass of object-oriented social simulation models, also called object-based social simulations. The main families of simulation models in this area of CSS consist primarily of cellular automata models and agent-based models. As in the previous chapter, each will be examined using the MDIVVA social simulation methodology (Motivate-Design-Implement-Verify-Validate-Analyze) developed in Chap. 8.

Both families of object-oriented social simulation models use the simplest social entities (cells or agents, respectively) as elementary units to understand emergent complexity, rather than variables (as in system dynamics and queueing models). Both families are applicable to theoretical research for developing basic science, as well as practical application for policy analysis, as was the case before for variable-oriented models. Historically, agent-based models have enabled theoretical as well as policy applications, whereas cellular automata models have been more confined to theoretical analysis. However, this is a broad generalization regarding the majority of research. Policy applications of cellular automata models also exist, as we will examine in this chapter.

10.2 History and First Pioneers

Object-oriented social simulation models presented in this chapter have scientific roots in John von Neumann's theory of automata and Thomas Schelling's social segregation model. The following summary of major milestones includes developments in cellular automata (CA) and agent-based models (ABM) and some closely related advances in areas such as organizational and spatial models, including geographic information systems (GIS). The chronology is unavoidably incomplete after the late 1990s, when the field exploded (exponentially) with a doubling time of just a few years.

1940s:

John von Neumann [1903–1957] and mathematician Stanislaw Ulam [1909–1984] pioneer the theory of automata, publicly presented for the first time in 1948 and published in 1951 as The General and Logical Theory of Automata.

1949:

Sociologist James M. Sakoda pioneers CA modeling in the social sciences in his doctoral dissertation on “Minidoka: An Analysis of Changing Patterns of Social Interaction” at the University of California at Berkeley, published in 1971 in the Journal of Mathematical Sociology, calling it a “checkerboard model.”

1960s:

Computer scientist Edward Forrest Moore [1925–2003] invents the concept of 8 neighbors surrounding a given cell in a CA landscape, providing an alternative to the 4-neighbor von Neumann neighborhood.

1966:

The University of Illinois Press publishes The Theory of Self-reproducing Automata by von Neumann.

1969:

Mathematician Gustav A. Hedlund publishes his influential CA paper on symbolic dynamics in the journal Mathematical Systems Theory.

1969:

Economist Thomas C. Schelling publishes his first CA segregation modeling work in the American Economic Review, among the leading journals in economics.

1970:

Mathematician John Horton Conway invents his famous CA model, Game of Life, popularized by Martin Gardner in Scientific American.

1970s–1980s:

Psychologist Bibb Latané formulates his theory of social impact, a milestone in social CA modeling.

1971:

Schelling publishes his seminal paper on a CA of racial segregation by migration in the Journal of Mathematical Sociology.

1975:

Economist Peter S. Albin [1934–2008] approaches checkerboard models as CA in his seminal book Analysis of Complex Socioeconomic Systems.

1977:

Political scientist Stuart A. Bremer [1943–2002] pioneers CA modeling in political science with a hexagon-based simulation of war and peace in the international system, “Machiavelli in Machina,” published in Karl W. Deutsch's seminal Problems in World Modeling.

1978:

Mathematicians J.M. Greenberg and S.P. Hastings develop a true cellular automaton model of excitable media as a 3-state 2-dimensional CA, published in the SIAM Journal on Applied Mathematics.

ca. 1981:

Physicist Stephen Wolfram begins work on elementary CA theory and modeling, publishing his first paper two years later in Reviews of Modern Physics, and later proposing a general classification of CA models in four major classes.

1987:

Computer scientist James (Jim) E. Doran publishes his seminal agent-based modeling paper “Distributed Artificial Intelligence and the Modelling of Socio-Cultural Systems.”

1884:

Mathematician and theologian Edwin A. Abbott publishes his famous mathematical fiction book, Flatland, which later inspires German computational social scientists Rainer Hegselmann and Andreas Flache to write their seminal 1998 paper, “Understanding Complex Social Dynamics: A Plea For Cellular Automata Based Modelling,” in the first volume of the Journal of Artificial Societies and Social Simulation.

1990:

Political scientists Thomas R. Cusack and Richard J. Stoll publish the realpolitik CA hex-based model of inter- and intra-national conflict, building on S. A. Bremer's earlier work.

1994:

Computational social scientist Nigel Gilbert and computer scientist James Doran publish one of the earliest collections of papers on computational applications in social science, Simulating Societies, including chapters by other pioneers such as Rosaria Conte, Klaus Troitzsch, Francois Bousquet, Robert Reynolds, Helder Coelho, and Cristiano Castelfranchi.

1995:

Computational social scientists Rosaria Conte and Cristiano Castelfranchi publish their seminal work on Cognitive and Social Action.

1996:

Computational social scientists Joshua Epstein and Robert Axtell publish their influential book on the Sugarscape model, Growing Artificial Societies.

1996:

Rainer Hegselmann publishes his two influential papers, “Cellular Automata in the Social Sciences” and “Understanding Social Dynamics,” still considered among the best introductions to CA simulation models in the social sciences.

1997:

Computational social geographer Lena Sanders and her team in Paris publish a seminal paper on SIMPOP, one of the earliest ABM systems for modeling historical urban growth, in the journal Environment and Planning B: Planning and Design.

1997:

Computational social scientist Robert Axelrod publishes his seminal book on social agent-based modeling, The Complexity of Cooperation, as well as his influential paper, “Advancing the Art of Simulation in the Social Sciences,” in the journal Complexity published by the Santa Fe Institute.

1997:

Leigh Tesfatsion at Iowa State University publishes the first newsletter of ACE, Agent-based Computational Economics, which rapidly becomes a major resource for the CSS community.

1998:

The Journal of Artificial Societies and Social Simulation is founded by computational social scientist Nigel Gilbert, quickly becoming one of the most influential CSS journals. Rainer Hegselmann and Andreas Flache publish their influential paper on CA, and the same year computational social scientist Domenico Parisi publishes the first CA model of ancient Mesopotamian empires, collaborating with historian Mario Liverani.

1999:

Computational sociologist Kathleen M. Carley of Carnegie Mellon University and computer scientist Les Gasser of the University of Illinois at Urbana-Champaign publish their seminal paper on “Computational Organization Theory” in G. Weiss's influential Multiagent Systems textbook reader.

1999:

Nigel Gilbert and Klaus Troitzsch publish the first edition of the classic textbook, Simulation for the Social Scientist.

1999:

Chris Langton of the Santa Fe Institute establishes the Swarm Development Group for developing the eponymous ABM simulation system that later inspired NetLogo (designed by Uri Wilensky of Northwestern University the same year), Repast (since 2002), and MASON (2003).

1999:

Computational archaeologists Timothy Kohler and George Gumerman from the Santa Fe Institute co-edit the influential volume Dynamics in Human and Primate Societies, including the so-called Anasazi model.

2002:

Stephen Wolfram publishes A New Kind of Science, his magnum opus in 1280 pages.

2002:

The US National Academy of Sciences holds its first Sackler Colloquium and publishes its first Proceedings dedicated to social ABM, co-edited by renowned geographer and NAS member Brian J. L. Berry, L. Douglas Kiel, and Euel Elliott.

2002:

The North American Association for Computational Social and Organizational Sciences (NAACSOS) is founded at its first annual meeting and Kathleen Carley becomes its first President. Co-founders include Claudio Cioffi-Revilla (4th president), Charles Macal, Michael North, and David Sallach (2nd president).

2002:

The first semester-long courses in CA and ABM are taught in George Mason University's Program in Computational Social Science by an initial faculty consisting of Claudio Cioffi-Revilla (founding chairman, CSS Department), Dawn C. Parker, Robert Axtell, Jacquie Barker, and Timothy Gulden.

2003:

Computer scientist Sean Luke and Claudio Cioffi-Revilla release the first version of the MASON (Multi-Agent Simulator of Networks or Neighborhoods) system at the Agent 2003 annual conference in Chicago, demonstrating the new system with the Wetlands ABM and a suite of other classic models (HeatBugs, Conway's Life, Flockers, and Boids).

2004:

Andrew Ilachinski of the Center for Naval Analyses publishes Artificial War, the largest multi-agent analysis of conflict thus far.

2005:

Thomas Schelling of the University of Maryland and former president of the International Studies Association is awarded the Nobel Memorial Prize in Economic Sciences, with Robert Aumann, for his work on conflict theory and social simulations. He is the first computational social scientist to win such an honor.

2005:

The first US National Science Foundation grant for a large-scale ABM-GIS simulation model of coupled socio-natural systems using remote sensing and ethnographic methods from field research is awarded to the Mason-Smithsonian Joint Project on Inner Asia, led by Claudio Cioffi-Revilla (principal investigator), Sean Luke, and J. Daniel Rogers.

2006:

The first issue of the Journal of Cellular Automata is published, with the goal of disseminating “high-quality papers where cellular automata are studied theoretically or used as computational models of mathematical, physical, chemical, biological, social and engineering systems.”

2010:

Computer scientist Andrew I. Adamatzky from the University of the West of England in Bristol publishes the edited volume Game of Life Cellular Automata. The same year Alfons G. Hoekstra, Jiri Kroc and Peter M.A. Sloot publish the edited volume entitled Simulating Complex Systems by Cellular Automata. Both books demonstrate the scientific maturation of Conway's seminal model.

2010:

Claudio Cioffi-Revilla is elected first president of the Computational Social Science Society of the Americas (CSSSA), founded as the successor to NAACSOS.

2010:

Princeton University Press publishes Michael Laver and Ernest Sergenti's Party Competition: An Agent-Based Model, the first major advance in the computational political science of multi-party systems for modeling democratic regimes.

10.3 Cellular Automata Models

This section introduces the superclass of social simulations based on cellular automata (CA) models, used in social science spatial applications, and examines their unique characteristics for understanding emergent social complexity. CA models are presented within the broader context of object-oriented models, which includes an even larger class of computational spatial and organizational models. The emphasis of CA is on neighboring cell-like sites interacting in discrete time steps that resemble a broad variety of social phenomena. Formal aspects involving interaction topologies and behavioral rules are important.

Fig. 10.1 Major pioneers of cellular automata models: John von Neumann, inventor of cellular automata (upper left); John Horton Conway, inventor of the CA-based Game of Life (upper right); Stuart A. Bremer, pioneer computational political scientist in the use of CA models of international conflict (lower left); Nobel Prize winner Thomas C. Schelling, famous for his model of racial segregation (lower right)

We begin with the following definition:

Definition 10.1

(Cellular Automaton Model)

A cellular automaton (CA) simulation is an object-oriented computational model for analyzing complex systems consisting of neighboring entities $(x, y)$, called cells, that change their state $s_{xy}$ as they interact in a (typically two-dimensional) grid-like landscape $L$ using some rule set.

The following are examples of CA social simulation models:

  • Sakoda's Group Attitudinal Model

  • Schelling's Urban Racial Segregation Model

  • Conway's Game of Life

  • Hegselmann's Opinion Dynamics Model

  • Bremer-Mihalka's and Cusack-Stoll's Realpolitik Models

  • Axelrod's Tribute Model

  • Parisi's Model of the Neo-Assyrian Empire

While we cannot examine all of them in detail, we use these examples to explain basic features of CA social simulations.

Formally, a CA model consists of an array of cells, each of which is in one of a finite number of states. Neighboring cells are defined with respect to a given cell. The dynamic behavior of a CA begins at $t = 0$, when each cell is initialized in a given state $s_0$. Given a cell in state $s_t$ at step $t$, the state at the next step $t + 1$ is determined by rules specified by some mathematical function(s) that determine $s_{t+1}$ based on information concerning one or more neighboring cells. Rules are local, in the sense that they operate on individual cells and their neighbors, not on the global landscape where emergent behavior may occur.
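The formal description above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the names `neighbors_of`, `step`, and `majority_rule` are hypothetical, the landscape is a small square grid stored as a dictionary from (x, y) to a binary state, neighbors are the four von Neumann cells, and borders are edged (no wrap-around).

```python
from typing import Callable, Dict, List, Tuple

Cell = Tuple[int, int]
Grid = Dict[Cell, int]

# Assumption: a 4-cell von Neumann neighborhood on a bounded (edged) grid.
VON_NEUMANN = [(0, -1), (0, 1), (-1, 0), (1, 0)]


def neighbors_of(cell: Cell, grid: Grid) -> List[Cell]:
    """Return the neighboring cells of `cell` that exist in the grid."""
    x, y = cell
    return [(x + dx, y + dy) for dx, dy in VON_NEUMANN if (x + dx, y + dy) in grid]


def step(grid: Grid, rule: Callable[[int, List[int]], int]) -> Grid:
    """Compute the state at t+1 of every cell from the states at t.

    A new dictionary is built, so all cells see the same time-t landscape
    (synchronous updating).
    """
    return {
        cell: rule(state, [grid[n] for n in neighbors_of(cell, grid)])
        for cell, state in grid.items()
    }


def majority_rule(state: int, neighbor_states: List[int]) -> int:
    """Hypothetical local rule: adopt the state held by most neighbors,
    keeping the current state on a tie."""
    ones = sum(neighbor_states)
    zeros = len(neighbor_states) - ones
    if ones > zeros:
        return 1
    if zeros > ones:
        return 0
    return state


# Initialize a 3x3 landscape at t = 0: all cells in state 1 except the center.
grid_t0: Grid = {(x, y): 1 for x in range(3) for y in range(3)}
grid_t0[(1, 1)] = 0
grid_t1 = step(grid_t0, majority_rule)
```

The rule function only ever receives a cell's own state and the states of its neighbors, which is what makes the rule local; any global pattern has to arise from repeated application of `step`.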

In the simplest CA models all cells are the same and rule sets are homogeneous and constant for all cells. Stochastic cellular automata and asynchronous cellular automata depart from this simple case: the former use probabilistic rules, and the latter update cells at different times rather than all at once. As suggested by this distinction, CA models can be purely deterministic or contain stochastic elements defined by probability distributions.

A complete CA social simulation model consists of all elements in Definition 10.1. Accordingly, these models are appropriate for rendering the following formal features of a referent social system:

Discreteness:

Spatio-temporal discreteness means that a landscape is divided into cells and time passes in integer units.

Locality:

Cells interact only with contiguous neighbors, not with other cells far away.

Interaction topology:

Square cells may interact with their north-south-east-west neighbors only (called a 4-cell von Neumann neighborhood) or with those four plus their four corner neighbors (8-cell Moore neighborhood).

Scheduled updating:

All cells update their state at each time step according to simple rules, resulting in emergent patterns at the macroscopic, global level of the entire landscape.

CA models in social science date to the first pioneering applications to the study of racial segregation and opinion dynamics, followed by models of territorial growth. These models were initially called “checkerboard” and “chicken wire” models, in reference to square and hexagonal cells, respectively. They are also widely used in fields closely related to CSS, such as ecology. Figure 10.2 illustrates racial segregation and territorial growth models, running from initialization at $t = 0$ to long-run conditions at some $t_N$.

Fig. 10.2 Examples of cellular automata models: The Schelling model with square cells and Moore neighborhood is initialized with ethnically mixed population (upper left). Racial segregation emerges as neighbors become cognizant of their surroundings and decide to move away from where they started (upper right). The Interhex model with hexagonal cells representing small, simple polities begins with uniformly distributed capabilities (lower left). As neighboring polities interact through normal balance of power dynamics, mild stochasticity is sufficient to grow a system of countries. Both models shown in this figure were implemented in MASON, discussed in Sect. 10.3.3

10.3.1 Motivation: Research Questions

CA models address research questions in many domains of CSS. They are most appropriate for modeling referent systems with the following features, assuming unit cells are simple in terms of attributes and rules, as explained earlier:

  1. A landscape, physical or conceptual, well describes the referent system. Examples include urban areas, belief systems, and networks of actors ranging from small groups of individuals to the international system of nations.

  2. Actors located on the landscape have information about neighboring actors and use it to update their own state.

  3. The state of each actor is determined by rules that govern behavior conditional on information concerning self and relevant neighbors.

  4. At the macroscopic system level the landscape of cells might evolve toward some stationary state, oscillate between different patterns, or show chaotic behavior.

  5. Emergent properties of social complexity at the systemic level result from interactions at the level of individual cells, the phenomenon known as emergence.

Research questions commonly addressed by CA social simulations typically include one or more of the following:

  • What is the effect of local cell-level rules on emergent social phenomena?

  • Do different interaction topologies (e.g., von Neumann or Moore neighborhoods) matter significantly?

  • Are emergent patterns stationary, fluctuating, or chaotic?

  • If stationary or fluctuating, what determines the time period for convergence or periodicity of fluctuations?

  • Are there patterns of diffusion across the landscape and, if so, how are they characterized?

CA models provide answers to questions such as these through simulation, as long as cell attributes and rules are kept relatively simple, as in the examples provided below.

10.3.2 Design: Abstracting Conceptual and Formal Models

Given some referent system of interest $S$, a conceptual model $C_S$, consisting of a cellular automaton and its respective cells, topology, and rule set, is abstracted by a three-stage process consisting of landscape tessellation, interaction topology, and behavioral rules.

Thinking one step ahead, in the case of CA models there are no major design or abstraction considerations that have significant consequences for implementation. All CA models discussed in this chapter and most others in the extant literature run fast on basic laptops. (By contrast, implementation in agent-based models can be highly affected by design/abstraction decisions.) Hence, virtually all CA models are considered “lightweight,” computationally speaking. Even when they are large, CA models are easy to distribute due to the total absence of global or long-range interactions.

10.3.2.1 Cellular Tessellation

The first stage in CA abstraction to produce a conceptual model will focus on the referent system's landscape, which should consist of actors represented by cells.

Definition 10.2

(Cell)

A cell is a tile-like object defined by attributes and located adjacent to other, similar objects. The state of a cell is given by its attribute values, where one or more attributes are a function of the state of neighbors.

The procedure of abstracting cells is called tessellation. Cells are the basic elements of a CA model. They can be square (most common form), triangular, hexagonal, or irregular, depending on a landscape's tessellation and features of the referent system. Square cells make sense for urban models, whereas hexagonal cells are sometimes preferable for large territories or open terrain. From a computational perspective each has advantages and disadvantages, depending on multiple factors such as number of cells, movement, and scheduling.

For example, in Conway's Game of Life cells are square in the classic version, defining a rectangular landscape. In other versions cells can also be hexagonal. Regardless of form, each cell can be in one of two states, alive or dead. What happens to each cell and the whole population in the simulation depends on the condition of neighboring cells in the landscape.

As another example, in Schelling's Segregation Model (Fig. 10.2, upper frames) each cell represents a person with a given level of racial tolerance (attribute). Each person is happy or unhappy (the cell's two states) depending on the race of neighbors, which, in turn, will determine whether the person moves away from his/her present neighborhood.

Urban sprawl is a more complex example of a CA-like social phenomenon. Each area surrounding a city may become suburbanized or not, depending on factors (attributes) such as population growth, cost of land, proximity to work, and other variables considered by actors who may decide to move away from a downtown urban center to a suburban neighborhood.

Before the advent of airplanes, when military conquest was mostly land-driven, territorial polities grew and contracted based on the ability of a population center to expand its territory into increasingly large swaths of neighboring territories. Hexagonal cells—such as those in the Interhex model, Fig. 10.2—are good tessellations for open territory, as demonstrated by tabletop games played by the military since the German army (Prussian General Staff) pioneered war games in the early 19th century. However, square cells are also used for modeling polity expansion, as demonstrated by Domenico Parisi in his study of the growth of the Neo-Assyrian Empire during the 9th–7th centuries BC using a CA model.

A distinctive feature of cells in CA models is that the number of attributes they contain is relatively small. (By contrast, agent-based models examined in the next section commonly encapsulate numerous attributes, sometimes in the hundreds, as well as complex methods for updating attribute values.) In the previous examples each cell has just one or a few attributes, such as being alive or dead in the Game of Life, or happy or unhappy in Schelling's segregation model.

The size of a CA landscape in terms of number of cells also matters, since larger numbers can often generate emergent phenomena not possible with smaller worlds. Size is determined by tessellation.

10.3.2.2 Interaction Topology

The second stage of abstraction in developing a CA model consists of specifying the interaction topology—how cells are “wired” to neighboring cells, so to speak. Interaction topology defines an array of local, short-range interactions. This step comes second, because it depends in part on the form of cells. Square cells can have either von Neumann or Moore neighborhoods, as already mentioned. Hexagonal cells commonly have six neighbors, although they can also have three by alternating neighbors. Triangular cells can have the equivalent of von Neumann and Moore neighborhoods, depending on whether they have three side neighbors or all six, including apical neighbors (sometimes referred to somewhat imprecisely as “corner neighbors”).

Another defining feature of interaction topology is neighborhood radius, defined as distance from a cell to its farthest neighbor, normally not more than two or three cells away. Most CA models operate with an interaction topology of radius 1 to ensure only local, short-range interactions.
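A small sketch may help make these topologies concrete. It assumes that a square-grid neighborhood of radius r is defined by Manhattan distance for the von Neumann case and by Chebyshev distance for the Moore case, and that hexagonal cells are addressed in axial (q, r) coordinates, which is one common convention among several. The names `square_offsets`, `HEX_OFFSETS`, and `neighbors` are illustrative only.

```python
from typing import List, Tuple

Offset = Tuple[int, int]


def square_offsets(kind: str = "moore", radius: int = 1) -> List[Offset]:
    """Relative (dx, dy) positions of the neighbors of a square cell.

    kind="von_neumann": cells within Manhattan distance `radius`
    kind="moore":       cells within Chebyshev distance `radius`
    """
    if kind not in ("moore", "von_neumann"):
        raise ValueError(f"unknown neighborhood kind: {kind!r}")
    offsets = []
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            if (dx, dy) == (0, 0):
                continue  # a cell is not its own neighbor
            if kind == "von_neumann" and abs(dx) + abs(dy) > radius:
                continue  # outside the diamond-shaped neighborhood
            offsets.append((dx, dy))
    return offsets


# Assumption: hexagonal cells in axial (q, r) coordinates; each cell then
# has six side neighbors at these relative positions.
HEX_OFFSETS: List[Offset] = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def neighbors(cell: Offset, offsets: List[Offset]) -> List[Offset]:
    """Absolute coordinates of a cell's neighbors (borders not handled here)."""
    x, y = cell
    return [(x + dx, y + dy) for dx, dy in offsets]
```

With radius 1 the two square neighborhoods contain 4 and 8 cells; with radius 2 they grow to 12 and 24, which shows how quickly a larger radius erodes the locality that CA models rely on.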

In the Game of Life, interaction topology is defined by a Moore neighborhood of radius 1, thus including all eight surrounding cells, as is also the case for the Schelling segregation model. CA models of other referent systems can assume different interaction topologies, such as when triangular or hexagonal cells are used to represent a landscape. (Compare square cells to hexagonal cells in Fig. 10.2.) In the interaction topology of the Bremer-Mihalka and Cusack-Stoll inter-state CA systems of hexagons, all six neighbors affect a cell (country or province). This is also typically the case in wargaming (tabletop or computational) simulations.

For some global emergent phenomena in a CA model, details of the interaction topology (e.g., cell shape or neighborhood radius) may or may not matter. In fact, an interesting research question to analyze is the sensitivity of results with respect to interaction topology, a topic to which we shall return later.

10.3.2.3 Rules of Cell Behavior

The third and final stage of abstraction in a CA model development effort is to specify rules followed by cells. Rules are translated into code when a CA model is implemented. Simple rules are what make a CA interesting in terms of generating unexpected emergent patterns.

In the Game of Life, a cell maintains its current state if it has exactly two live neighbors. A cell with exactly three live neighbors becomes (or remains) alive. In all other cases the cell dies or stays dead, whether from isolation or from overcrowding. This simple rule generates many different patterns that are unexpected, including “gliders,” collectives of cells that move across the landscape.
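The standard Game of Life rule (a live cell survives with two or three live neighbors; a dead cell becomes alive with exactly three) can be written very compactly. The sketch below is one possible implementation, assuming an unbounded landscape represented as a set of live (x, y) coordinates with y increasing downward; `life_step` is a hypothetical name.

```python
from collections import Counter
from typing import Set, Tuple

Cell = Tuple[int, int]


def life_step(live: Set[Cell]) -> Set[Cell]:
    """Advance the Game of Life by one synchronous time step."""
    # Count, for every cell adjacent to a live cell, its live Moore neighbors.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Exactly three live neighbors: alive. Exactly two: keep current state.
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)}


# A glider: five live cells that travel diagonally across the landscape.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

state = glider
for _ in range(4):
    state = life_step(state)

# After four steps the same shape reappears one cell over and one cell down.
shifted = {(x + 1, y + 1) for (x, y) in glider}
```

No cell "knows" it belongs to a glider; the moving shape exists only at the level of the landscape, which is why it is a common first illustration of emergence.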

In Schelling's segregation model the basic rule is that an agent moves to a different neighborhood when it becomes unhappy. The surprising result is that even when agents have a high level of tolerance for neighbors of a different race (i.e., they accept more than 50% of surrounding neighbors being of a different ethnicity), segregated neighborhoods still emerge. In the Interhex model the core rule regards the result of neighboring conflicts and what happens to the territory of the vanquished.
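The segregation rule can be sketched as follows. This is an illustrative variant rather than Schelling's original specification: it assumes a square toroidal grid (at least 3 by 3) with a Moore neighborhood, a single threshold `min_like` for the minimum share of like neighbors an agent requires, and relocation of unhappy agents to randomly chosen vacant cells. All function names are hypothetical.

```python
import random
from typing import List, Optional

Grid = List[List[int]]  # 0 = vacant cell, 1 and 2 = the two groups


def like_fraction(grid: Grid, x: int, y: int) -> Optional[float]:
    """Share of occupied Moore neighbors that belong to the agent's own group.

    Intended for occupied cells only; returns None if all neighbors are vacant.
    """
    size = len(grid)
    me = grid[y][x]
    same = other = 0
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) == (0, 0):
                continue
            n = grid[(y + dy) % size][(x + dx) % size]  # toroidal wrap-around
            if n == me:
                same += 1
            elif n != 0:
                other += 1
    total = same + other
    return None if total == 0 else same / total


def is_unhappy(grid: Grid, x: int, y: int, min_like: float) -> bool:
    frac = like_fraction(grid, x, y)
    return frac is not None and frac < min_like


def schelling_step(grid: Grid, min_like: float, rng: random.Random) -> int:
    """Move every agent that is unhappy at the start of the step; return the move count."""
    size = len(grid)
    cells = [(x, y) for y in range(size) for x in range(size)]
    unhappy = [(x, y) for x, y in cells if grid[y][x] != 0 and is_unhappy(grid, x, y, min_like)]
    vacant = [(x, y) for x, y in cells if grid[y][x] == 0]
    rng.shuffle(unhappy)
    moves = 0
    for x, y in unhappy:
        if not vacant:
            break
        i = rng.randrange(len(vacant))
        vx, vy = vacant[i]
        grid[vy][vx], grid[y][x] = grid[y][x], 0
        vacant[i] = (x, y)  # the cell just left becomes available
        moves += 1
    return moves
```

A threshold such as `min_like = 0.3` corresponds to an agent that tolerates up to 70% of its neighbors being of the other group. Running `schelling_step` repeatedly with a seeded `random.Random` until it returns 0 gives a reproducible run whose final landscape can then be inspected for clustering.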

In models of opinion formation, rules specify when an agent changes opinion. Numerous CA models of opinion dynamics show surprising results when seemingly simple rules give rise to divided, uniform, or fluctuating opinion groups.

Other CA spatial models, such as those simulating territorial polities, have simple rules capable of generating complex patterns of land borders.

The main result of the design stage of a CA model is a conceptual and formal model of the referent social system specified by a landscape of cells (specifying their total number and individual geometry), their interaction topology (specifying how cells are wired together in an array), and behavioral rules (specifying what each cell does).

10.3.3 Implementation: Cellular Automata Software

Given a sufficiently complete conceptual or formal model of a referent system as a CA, the next methodological stage consists of implementing the model in code using a simulation system. (As always, the model can also be implemented in native code using an OOP language, such as Python, Java, or C++.) The main milestone in implementation is the transition from CA diagrams and mathematical equations in the conceptual model to code in the simulation model.

Swarm, NetLogo, Repast, and MASON are among the most widely utilized CSS simulation systems that offer CA implementation facilities. Conway's Game of Life and Schelling's Social Segregation have also served as demonstration models for CA social simulations. NetLogo offers several already-built CA models that are easy to use and learn with. In the early 2000s, Repast and MASON used the segregation model among the earliest demos to showcase the new simulation systems. They are still in use today. The choice among these alternative simulation systems for learning purposes largely depends on access and familiarity. NetLogo is often the toolkit of choice for learning a new class of models. For research purposes, the others, especially MASON, assume familiarity with Java.

Figure 10.3 shows a screenshot of a 2-dimensional stochastic CA model running in NetLogo. Simulation systems such as these offer new users several pre-set analytical options. In this case NetLogo makes available several neighborhood topology options, shown by “switches” on the left side of the screen. Screenshots and movies are easy to produce with appropriate software running on a computer's operating system.

Fig. 10.3 Screenshot of a 2-dimensional cellular automata model of growth with varying number of neighbors running in NetLogo

Fig. 10.4 Pioneers of agent-based models. Joshua Epstein, creator of Sugarscape (with R. Axtell) (upper left); Robert Axelrod, author of The Complexity of Cooperation and other CSS classics (upper right); Nigel Gilbert, editor of Journal of Artificial Societies and Social Simulation (lower left); Hiroshi Deguchi, president of the Pacific-Asian Association for Agent-based Social Science (lower right)

In addition to “The Big Four” (Swarm, NetLogo, Repast, and MASON), other software systems are also available for implementing CA social simulation models. Mathematica has powerful CA modeling facilities, and many other systems are included in the Nikolai-Madey 2009 survey of simulation Tools of the Trade.

10.3.4 Verification

Verifying a CA social simulation model involves ascertaining that cells, interaction topology, and behavioral rules are all working in the way they are intended according to the conceptual model. In the case of square cells, verification is simplest and relatively straightforward, including checking to see whether landscape borders are behaving properly (edged or toroidal; see Footnote 1). Behavioral rules are best verified by detailed tracing of each discrete interaction event within a single simulation step. As always, all general verification procedures examined earlier in Sect. 8.7.4 also apply to CA models, including code walkthrough, profiling, and parameter sweeps.
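Border behavior is one of the easier properties to verify automatically. The following sketch assumes a rectangular landscape of square cells, at least 3 by 3, with a Moore neighborhood of radius 1; `moore_neighbors` and `check_borders` are hypothetical helper names, and the expected counts (3 at a corner, 5 along an edge, 8 everywhere on a torus) follow from the geometry.

```python
from typing import List, Tuple


def moore_neighbors(x: int, y: int, width: int, height: int,
                    toroidal: bool) -> List[Tuple[int, int]]:
    """Moore neighbors of (x, y) under edged or toroidal borders."""
    result = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) == (0, 0):
                continue
            nx, ny = x + dx, y + dy
            if toroidal:
                result.append((nx % width, ny % height))  # wrap around
            elif 0 <= nx < width and 0 <= ny < height:
                result.append((nx, ny))  # drop cells beyond the edge
    return result


def check_borders(width: int, height: int) -> List[str]:
    """Return a list of failed checks; an empty list means all checks passed."""
    failures = []
    if len(moore_neighbors(0, 0, width, height, toroidal=False)) != 3:
        failures.append("edged corner should have 3 neighbors")
    if len(moore_neighbors(width // 2, 0, width, height, toroidal=False)) != 5:
        failures.append("edged border cell should have 5 neighbors")
    for y in range(height):
        for x in range(width):
            if len(set(moore_neighbors(x, y, width, height, toroidal=True))) != 8:
                failures.append(f"toroidal cell {(x, y)} should have 8 distinct neighbors")
    if (width - 1, height - 1) not in moore_neighbors(0, 0, width, height, toroidal=True):
        failures.append("toroidal corner should wrap to the opposite corner")
    return failures
```

Checks of this kind are cheap to rerun after every change to the neighborhood code, and they complement rather than replace step-by-step tracing of the behavioral rules.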

10.3.5 Validation

Validating a CA social simulation model that has been verified involves two main perspectives. Structure validity refers to internal features of the model, including main assumptions concerning relevant cell attributes, interaction topology, and behavioral rules. The following should be considered when testing structure validity in a CA model:

Empirical tests of validation:

The specification of equations used in the model, as well as parameter values, are features requiring validation. For example, in the case of Schelling's segregation model discussed earlier, this part of the validation procedure would focus on parameters such as the assumed level of an individual's racial tolerance, as well as the number of neighbors taken into consideration. The classic model assumes a Moore neighborhood, which is an assumption that requires validation using empirical tests. It is also often assumed that coefficients are constant throughout a given simulated run. These are assumptions of structural stationarity, in the sense that cell rules specified do not change over time; i.e., classical CA models assume that the basic clockwork among cells in a landscape does not change throughout history, which may or may not be a valid assumption about the referent system. For example, education may prevent segregation, or household attention may focus more on neighbors next door rather than across the street or around the block.

Theoretical tests of validation:

CA model assumptions should also be checked in terms of theories being used, because the simplicity of these models should not distract attention from theoretical underpinnings. Again, this is a broader perspective than empirical tests of structural validity, because it is based on fundamental, causal arguments that are difficult if not impossible to quantify. For example, in the case of the segregation model, the overall structure is based on Schelling's theory of interaction between two groups. The fundamental theory is based on three factors or dynamics driving a cell's happiness and its decision to stay in the neighborhood or move away: one's own identity; the identity of neighbors; and distance from neighbors. Is this theory valid? Are there other factors as important or even more significant than these? The theory also assumes perfect symmetry among neighbors; i.e., both make residential decisions in the same way. Is it possible that different neighbors decide based on different criteria, for example, one on racial factors and another on education levels?

Tests of structural validity for CA social simulation models can be quite complex and require considerable attention, as seen for other kinds of models. Again, the empirical social science literature is of great value in navigating through these procedures.

Behavioral validity is about actual results from simulation runs, especially in terms of qualitative and quantitative features such as cellular landscape patterns of growth, decay, and oscillation, among others. What matters most in the context of ascertaining behavioral validity in CA models is checking whether simulated spatial patterns correspond to empirical patterns.

10.3.6 Analysis

Cellular automata social simulations are analyzed in a variety of ways, including formal analysis, asking what-if questions, and scenario analysis.

Formal analysis of cellular automata, a tradition begun by von Neumann and Ulam, is a field that extends far beyond CSS, but one that provides insights for better understanding social dynamics. For example, Wolfram's classification of CA into a small number of types (stable, oscillating, chaotic, complex) highlights similarities and differences that can be socially meaningful. Formal analysis of rules can also yield theoretical expectations for testing through simulation.
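The kind of rule-level analysis described above can be explored with a few lines of code. The following is a minimal sketch, assuming a one-dimensional elementary CA with two states, a three-cell neighborhood, and wrap-around borders; the function names `step` and `run` are illustrative and do not come from any toolkit. Rules 30 and 110 are commonly cited as examples of chaotic and complex behavior, respectively, while rule 204 simply leaves every cell unchanged.

```python
def step(cells, rule):
    """Apply one synchronous update of an elementary CA on a ring of cells."""
    n = len(cells)
    nxt = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right  # neighborhood read as a 3-bit number
        nxt.append((rule >> index) & 1)              # that bit of the rule number is the new state
    return nxt


def run(rule, width=31, steps=15):
    """Start from a single live cell in the middle and record every configuration."""
    cells = [0] * width
    cells[width // 2] = 1
    history = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        history.append(cells)
    return history


if __name__ == "__main__":
    for rule in (204, 30, 110):
        print(f"Rule {rule}")
        for row in run(rule):
            print("".join("#" if c else "." for c in row))
```

Printing the histories side by side is a quick way to see how the same landscape and initial condition yield qualitatively different dynamics when only the rule changes.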

Asking what-if questions is another way of analyzing CA social simulations. For example, in a racial segregation model we may ask what happens when tolerance coefficients differ significantly across the two groups. Or, what if tolerance deteriorates as a function of time, as can happen when conflict breaks out in a previously integrated community and formerly peaceful but heterogeneous neighbors no longer trust each other, as in many civil wars? What-if questions can also be used to analyze a CA model using different rule sets. For example, in a racial-migration model we may wish to have one group responding to a Moore neighborhood while another uses a von Neumann neighborhood, based on different attitudes toward physical distance.
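A what-if question about group-specific tolerance can be posed in code as follows. This is a minimal sketch of a Schelling-style model under several assumptions that are not in the text: a toroidal square landscape, two groups labeled "A" and "B", relocation of discontent households to randomly chosen vacant cells, and a hypothetical parameter `min_like` giving the minimum share of like neighbors each group requires (a lower value means more tolerance). The neighborhood is passed in as a list of offsets, so the same run can be repeated with Moore or von Neumann neighborhoods.

```python
import random

MOORE = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
VON_NEUMANN = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def make_grid(size=20, vacancy=0.2, seed=1):
    """Random landscape of 'A' and 'B' households plus vacant (None) cells."""
    rng = random.Random(seed)
    n_vacant = int(size * size * vacancy)
    n_a = (size * size - n_vacant) // 2
    cells = ["A"] * n_a + ["B"] * (size * size - n_vacant - n_a) + [None] * n_vacant
    rng.shuffle(cells)
    return [cells[i * size:(i + 1) * size] for i in range(size)]


def like_share(grid, r, c, offsets):
    """Share of occupied neighboring cells holding the same group as cell (r, c)."""
    n = len(grid)
    around = [grid[(r + dr) % n][(c + dc) % n] for dr, dc in offsets]
    occupied = [x for x in around if x is not None]
    if not occupied:
        return 1.0  # assumption: a household with no neighbors is content
    return sum(x == grid[r][c] for x in occupied) / len(occupied)


def simulate(grid, min_like, offsets=MOORE, steps=30, seed=1):
    """Each step, households found discontent at the start of the step move
    to a randomly chosen vacant cell. Returns a new grid."""
    rng = random.Random(seed)
    grid = [row[:] for row in grid]
    n = len(grid)
    for _ in range(steps):
        movers = [(r, c) for r in range(n) for c in range(n)
                  if grid[r][c] is not None
                  and like_share(grid, r, c, offsets) < min_like[grid[r][c]]]
        vacant = [(r, c) for r in range(n) for c in range(n) if grid[r][c] is None]
        rng.shuffle(movers)
        for r, c in movers:
            if not vacant:
                break
            i = rng.randrange(len(vacant))
            vr, vc = vacant[i]
            grid[vr][vc], grid[r][c] = grid[r][c], None
            vacant[i] = (r, c)  # the cell just left becomes available
    return grid


def mean_like_share(grid, offsets=MOORE):
    """Average share of like neighbors: a crude index of segregation."""
    n = len(grid)
    shares = [like_share(grid, r, c, offsets)
              for r in range(n) for c in range(n) if grid[r][c] is not None]
    return sum(shares) / len(shares)


if __name__ == "__main__":
    start = make_grid()
    print("initial:", round(mean_like_share(start), 3))
    for label, min_like in [("equal tolerance", {"A": 0.4, "B": 0.4}),
                            ("unequal tolerance", {"A": 0.2, "B": 0.6})]:
        end = simulate(start, min_like)
        print(label, round(mean_like_share(end), 3))
```

Passing `offsets=VON_NEUMANN` to `simulate` and `mean_like_share` turns the same script into a test of the neighborhood assumption rather than the tolerance assumption.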

Scenario analysis provides a more comprehensive analytical approach to CA simulations by using a set of related questions defining a given scenario, rather than analyzing one question at a time. For example, in a racial-migration model interest may lie in examining a scenario in which tolerance coefficients are relatively large, neighborhood radii are short, and the number of cells is large. Intuitively, such a scenario should not generate segregated neighborhoods. By contrast, an opposite scenario would analyze what happens when tolerance is low, radii are long, and the landscape is smaller. Exploring scenarios between these two extremes can uncover interesting qualitative and quantitative properties, some of which may not be well known.

CA models are primarily intended for basic CSS research and theoretical analysis, not for developing actionable policy analysis, given their emphasis on simple interaction rules and overall homogeneity of cells, neighborhoods, and rules. Practical policy analysis can only be obtained through social simulations that allow sufficient empirical specificity and high-fidelity calibration, which is generally not viable with CA—but eminently feasible, if not always easy, with agent-based models.

10.4 Agent-Based Models

This section introduces agent-based models (ABM) in CSS, also called social multi-agent systems in computer science. Social ABM simulations are one of the largest and most rapidly growing varieties of computational models. Informally, an ABM can be thought of as a CA with a more sophisticated landscape and actors that come closer to emulating humans through various aspects of reasoning, decision-making, and behaviors.

We begin with the following working definition, which we will later use to examine its main components:

Definition 10.3

(Agent-Based Model)

A social agent-based model (ABM) is an object-oriented computational model for analyzing a social system consisting of a set of autonomous, interacting, goal-oriented, boundedly rational actors that use a given rule set and are situated in an environment E.

Formally, therefore, an ABM consists of the three main components in Definition 10.3: agents, rules, and environments where agents are situated, as we will examine more closely below.

Table 10.1 provides some examples of social ABM models in various domains of CSS. They address a variety of research questions using models calibrated at different empirical levels and built with various simulation toolkits or programming languages (Java and C++). We will draw on some of these examples to explain features of ABM social simulations. Paraphrasing an earlier distinction between a chiefdom and a state, an agent-based model is not simply a cellular automaton on hormones—no more so than a jet airliner is a flying bus. The addition of autonomy, goal-directed behavior, and environmental complexity adds entirely new qualitative and quantitative features to a social ABM, compared to the relatively simpler class of cellular automata models.

Table 10.1 Examples of agent-based models in CSS by empirical calibration

The dynamic behavior of an ABM begins at t=0 when each agent is initialized in a given state. Given an agent in an initial state $s_0$, the state at the next step t+1 is determined by rules applied to each agent's situation. The next state $s_{t+1}$ will then be based on information processed by rules. Such dynamic behavior is similar to, but more complex than, that of a CA model because now agents have (a) autonomy (whereas cells were strongly dependent on their neighborhood), (b) freedom of movement (whereas cells had fixed locations), and (c) reason-based behavior, among other salient differences. None of these were CA features.

Clearly, agents have more human-like features than cellular automata, making ABMs methodologically appealing and powerful formalisms for social and behavioral science. This is especially so in the case of social theories that are expressed primarily in terms of actors, including their cognitive and decision-making processes, and patterns of social behaviors, including collective behavior and organizational and spatial dynamics.

In the simplest ABM models (e.g., Heatbugs, Sugarscape, Boids) all agents are usually the same and rule sets are homogeneous and constant for all agents. Stochastic ABMs and asynchronous ABMs differ from these simple models in using non-deterministic or otherwise different rule sets. As suggested by this distinction, ABM models can be purely deterministic or contain stochastic elements defined by probability distributions.
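The difference between deterministic, stochastic, and asynchronous updating can be made concrete with a toy example. The sketch below assumes agents with binary states arranged on a ring and a deliberately trivial rule (`copy_left`); all names are hypothetical and chosen only for illustration. Synchronous updating reads the old configuration for every agent, asynchronous updating lets agents act one at a time in random order, and `noisy` wraps a deterministic rule with a probability of flipping its result.

```python
import random


def copy_left(states, i):
    """Toy rule: an agent adopts the state of the agent to its left on a ring."""
    return states[(i - 1) % len(states)]


def synchronous_step(states, rule):
    """All agents read the old configuration and update at once."""
    return [rule(states, i) for i in range(len(states))]


def asynchronous_step(states, rule, rng):
    """Agents update one at a time in random order, seeing earlier updates."""
    states = list(states)
    order = list(range(len(states)))
    rng.shuffle(order)
    for i in order:
        states[i] = rule(states, i)
    return states


def noisy(rule, p, rng):
    """Turn a deterministic rule into a stochastic one: with probability p
    the agent ends up in the opposite binary state."""
    def stochastic_rule(states, i):
        value = rule(states, i)
        return 1 - value if rng.random() < p else value
    return stochastic_rule


if __name__ == "__main__":
    start = [1, 0, 0, 0, 0, 0]
    rng = random.Random(42)  # fixed seed so that stochastic runs can be replicated
    print("synchronous :", synchronous_step(start, copy_left))
    print("asynchronous:", asynchronous_step(start, copy_left, rng))
    print("stochastic  :", synchronous_step(start, noisy(copy_left, 0.1, rng)))
```

Seeding the random number generator is a design choice worth keeping in any stochastic ABM, because it makes individual runs reproducible while still allowing many seeds to be compared.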

The earliest ABM simulations in social science were Heatbugs (late 1980s), Sugarscape (1996), SIMPOP (1997), and similar spatial “landscape” models that were the first to demonstrate the emergence of social complexity in ways never before seen by social scientists. These pioneer models were followed by many others built during the past decade. ABM simulations are also widely used in ecology and population biology, where they are called individual-based models. Figures 10.5 and 10.6 illustrate behavioral patterns and wealth distribution of agents in Sugarscape, running from initialization at t=0 to long-run conditions at some $t_N$.

Fig. 10.5

The Sugarscape agent-based model: agent behavior. The Sugarscape model consists of a society of agents (red dots) situated on a landscape consisting of a grid of square sites where agents with von Neumann neighborhood-vision feed on sugar (yellow dots). Left: At initialization agents are assigned a uniform distribution of wealth and they reside in the southwestern region. Right: After a number of time steps, most agents have migrated away from their original homeland as they move around feeding on the landscape. This MASON implementation by Tony Bigbee also replicates the “wave” phenomenon generated by the original (and now lost) implementation in Ascape, observed here by the northwest-southeast formations of diagonally grouped agents in the northeast region

Fig. 10.6

The Sugarscape agent-based model: emergence of inequality. Lorenz curves (top) and histograms (bottom) portray the distribution of agents' wealth. Left: Agents are assigned some wealth at initialization t=0, following an approximately uniform distribution, as shown by the nearly straight Lorenz curve and wealth histogram. Right: After some time, inequality emerges as a social pattern, as shown by the more pronounced Lorenz curve and much more skewed histogram, similar to Pareto's Law and diagnostic of social complexity

Fig. 10.7

Pioneers of ABM toolkits. Swarm's Chris Langton (upper left); NetLogo's Uri Wilensky (upper right); Repast's David Sallach (lower left); MASON's Sean Luke (lower right). All of them collaborated with others in creating today's leading simulation systems for building social ABMs

10.4.1 Motivation: Research Questions

Agent-based simulation models address research questions in many domains of CSS—whether from basic research or applied policy perspectives. They are most appropriate for modeling referent systems with the following features, where agents can range from “light” cognition and decision-making capacity to “heavy” agents with more detailed cognitive architecture:

Bounded rationality:

Agents make decisions under conditions of bounded rationality, as examined earlier in Sect. 7.5.2.

Decision-based behavior:

Agents behave based on choices determined by some form of reasoning. This is in contrast to the unreasoned, purely rule-based behavior of cellular automata examined earlier.

Artifacts and artificial systems:

When built artifacts such as institutions or infrastructure matter in a referent system, those entities can be represented in an ABM in a number of ways.

Social or physical spaces:

Referent systems may contain organizational (e.g., social networks), territorial (physical spaces), or other spatial aspects (policy spaces) that are important to model.

Besides these features, ABMs can also have characteristics shared with CA, including various kinds of discreteness, interaction topologies, vision or range, and scheduled updating. All these are ubiquitous and significant features of social complexity that are difficult or impossible to formalize using other modeling approaches (e.g., dynamical systems or game-theoretic models).

Some typical research questions commonly addressed by ABM social simulations may include the following:

  • What is the effect of local agent-level rules and micro behaviors on emergent social phenomena at the macro level?

  • How do alternative assumptions about human cognition and individual decision-making affect emergent collective behavior?

  • Do different interaction topologies (e.g., von Neumann or Moore neighborhoods) or the radius of agents' vision matter significantly?

  • Are emergent societal patterns globally stationary, fluctuating, periodic, or chaotic?

  • If stationary or fluctuating, what determines the time period for convergence or periodicity of fluctuations?

  • Are there patterns of diffusion across the landscape and, if so, how are they characterized?

  • What is the effect of different distance-dependent functions in human and social dynamics?

Comparing these questions with the corresponding sets of questions for system dynamics models (Sect. 9.3), queueing models (Sect. 9.4), and cellular automata models (Sect. 10.3), it is clear that the questions addressed by ABMs have significantly broader scientific scope as well as analytical depth. Questions addressed by social ABMs also have the feature of being inter-, multi-, or cross-disciplinary, or scientifically integrative, because ABM methodology lends itself to leveraging knowledge across the social, natural, and engineering sciences, which is required for understanding complexity in coupled socio-techno-natural systems. Of all the social simulation methodologies seen thus far, ABMs are arguably among the most versatile in terms of the range of feasible research questions that can be addressed. Research questions in the context of scenario analysis are a major application of ABM social simulations. Asking what-if questions of social complexity is an excellent way to motivate an agent-based simulation.

10.4.2 Design: Abstracting Conceptual and Formal Models

Given some referent system of interest S, a conceptual agent-based model $C_S$ is abstracted by identifying relevant agents, environments, and rules, as suggested by Definition 10.3.

10.4.2.1 Agents

Human actors in an ABM—whether individuals or collectives (e.g., households, groups, other social aggregates)—are represented as agent-objects that encapsulate attributes and dynamics (computational methods or operations). The state of an agent is determined by its attributes, just as in any object.

The following are standard features of agents:

  • Each agent is aware of its own state, including its environmental situation.

  • An agent is said to be autonomous, in the sense that it can decide what to do based on endogenous goals and information, much like a social actor, without necessarily requiring exogenous guidance.

  • Besides making decisions based on its own internal state, an agent can also decide to act in reaction to some perceived environmental situation.

  • Moreover, agents can also behave proactively, based on goals.

  • Agents can communicate, sometimes generating emergent patterns of sociality (e.g., collective behavior), by making their attributes visible or actually passing information.

Accordingly, we can use these features to define an agent.

Definition 10.4

(Agent)

An agent is an environmentally situated object with encapsulated attributes and methods that enable self-awareness, autonomy, reactivity, proactivity, and communication with other agents and environments. The state of an agent is given by its attribute values.

For example, the agents in Sugarscape satisfy each of these properties: they are aware of being hungry or satisfied; they decide where to move with complete autonomy; they can decide to seek a better patch of sugar, doing so proactively since they seek to survive; and, based on some additional rules, they can communicate and exchange sugar for spice, thereby generating a simple market. Similarly, in the Wetlands model (Table 10.1) agents know their own state: they decide to migrate with autonomy and use memory about various locations; they react to the distribution of other agents and food sites; they communicate among members of their own group, avoiding communication with foreigners. Agents in all models in Table 10.1 share comparable characteristics.
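The five features in Definition 10.4 map naturally onto the attributes and methods of a class. The following is a minimal sketch of a hypothetical `Forager` agent, loosely inspired by Sugarscape but not a reproduction of it: the attribute names, default values, tie-breaking rule, and dictionary-based landscape are all assumptions made for illustration, and occupancy of sites by other agents is ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Forager:
    """Hypothetical agent; its state is given by its attribute values."""
    name: str
    pos: tuple
    sugar: float = 5.0
    metabolism: float = 1.0
    vision: int = 2
    inbox: list = field(default_factory=list)

    def is_hungry(self):
        """Self-awareness: the agent can inspect its own state."""
        return self.sugar < 2 * self.metabolism

    def perceive(self, landscape):
        """Reactivity: sites visible along the four lattice directions."""
        x, y = self.pos
        visible = [(x, y)]
        for d in range(1, self.vision + 1):
            for site in ((x + d, y), (x - d, y), (x, y + d), (x, y - d)):
                if site in landscape:  # sites beyond the border do not exist
                    visible.append(site)
        return visible

    def decide(self, landscape):
        """Autonomy and proactivity: choose the richest visible site,
        preferring nearer sites when sugar levels are tied."""
        x, y = self.pos

        def score(site):
            distance = abs(site[0] - x) + abs(site[1] - y)
            return (-landscape[site], distance, site)

        return min(self.perceive(landscape), key=score)

    def act(self, landscape):
        """Move, harvest, and pay the metabolic cost. Returns False if starved."""
        target = self.decide(landscape)
        self.pos = target
        self.sugar += landscape[target] - self.metabolism
        landscape[target] = 0
        return self.sugar > 0

    def tell(self, other, message):
        """Communication: pass information to another agent."""
        other.inbox.append((self.name, message))


if __name__ == "__main__":
    landscape = {(x, y): 0 for x in range(5) for y in range(5)}
    landscape[(3, 1)] = 4
    a, b = Forager("a", (1, 1)), Forager("b", (4, 4))
    a.act(landscape)
    a.tell(b, f"sugar found at {a.pos}")
    print(a.pos, a.sugar, b.inbox)
```

Keeping perception, decision, and action in separate methods makes it easier to swap in a "heavier" cognitive architecture later without touching the rest of the model.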

10.4.2.2 Environments

Agents are situated in an environment, which can consist of any number of components related through loose or tight coupling. From a complexity-theoretic perspective, natural and artificial systems are assumed to be disjoint components of agents' environment.

  • Natural environments generally consist of biophysical landscape, sometimes including weather. In turn, landscape can consist of topography, land cover, hydrology, and other biophysical features, depending on what parts of the referent system the model needs to render. Natural environments are governed by biophysical laws, including thermodynamic laws.

  • Artificial environments (what we may call Simon's environment of artifacts) can include any number of human-built or engineered systems, such as buildings, streets, markets, and parks in urban areas, or roads, bridges, and transportation nodes linking urban areas. Critical infrastructure systems, specifically, consist of several major components, such as roads, energy, telecommunications, water supply, public health, and sanitation, among others, depending on a country's statutory taxonomy. Artificial environments are also governed by physical laws, including thermodynamics, although they appear to run against the thermodynamic tendency toward disorder. This is because artificial systems generate more order locally (decreasing entropy) by using resources, which is the reverse of the spontaneous drift toward disorder (increasing entropy).

For example, in terms of ABMs in Table 10.1, Anasazi and Wetlands comprise natural environments, whereas RiftLand, RebeLand, SIMPOP, and FEARLUS also include artificial environments.

10.4.2.3 Rules

Agents and environmental components interact among themselves as well as with each other, generating emergent behavior through the following inter-agent, agent-environment, and intra-environment interactions. Rules are generally local, in the sense that they affect agents but not the global landscape where emergent behavior may occur—similar to micro-motives generating macro-behavior (paraphrasing T.S. Schelling's famous 1978 book). In turn, however, agents can also be affected by global conditions.

  • Inter-agent rules govern interactions among agents through communication, exchange, cooperation, conflict, migration, and other patterns of social behavior, including particularly significant patterns such as collective action and social choice. Generally these rules are grounded in social theory and research. For example, in Wetlands, agents communicate among members of the same group; in RebeLand, government agents and insurgent agents fight each other while general population agents express support for or against government or insurgents.

  • Agent-environment rules govern effects of environmental conditions on agents and, vice versa, impacts of agents' decisions and behaviors on the environment (simulating anthropogenic effects). These rules are also grounded in social theory, as well as environmental science and related disciplines. For example, in RiftLand farmers are affected by rainfall and land cover, whereas in GeoSim and similar war-games countries are affected by balance of power processes with neighboring rivals.

  • Intra-environmental rules pertain to cause and effect mechanisms within biophysical components of the environment, such as effects of rainfall on vegetation, or effects of natural hazards on infrastructure. This third type of rule is grounded in the physical, biological, and engineering sciences. For example, in the Wetlands model and others like it, rainfall affects vegetation. In RiftLand, herds of animals are also affected. In turn, herd grazing affects ground cover, which can affect infrastructure by causing erosion and making severe precipitation more hazardous during rainy seasons.
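The three kinds of rules listed above can be sketched as three small functions called in sequence within one simulation step. This is only an illustration under stated assumptions: the dictionary keys, the linear regrowth rule, and the sharing rule are hypothetical and are not taken from Wetlands, RiftLand, or any other model in Table 10.1.

```python
def regrow(patch, rainfall):
    """Intra-environment rule: rainfall regrows vegetation up to a carrying capacity."""
    patch["vegetation"] = min(patch["capacity"],
                              patch["vegetation"] + patch["growth"] * rainfall)


def graze(herder, patch):
    """Agent-environment rule in both directions: the patch feeds the herd,
    and the herd depletes the patch."""
    eaten = min(herder["appetite"], patch["vegetation"])
    patch["vegetation"] -= eaten
    herder["stock"] += eaten


def share(giver, receiver, fraction=0.5):
    """Inter-agent rule: a better-off herder transfers part of the gap to another."""
    gap = giver["stock"] - receiver["stock"]
    if gap > 0:
        gift = fraction * gap
        giver["stock"] -= gift
        receiver["stock"] += gift


def step(patch, herders, rainfall):
    """One time step: environment first, then grazing, then social exchange."""
    regrow(patch, rainfall)
    for herder in herders:
        graze(herder, patch)
    richest = max(herders, key=lambda h: h["stock"])
    poorest = min(herders, key=lambda h: h["stock"])
    share(richest, poorest)


if __name__ == "__main__":
    patch = {"vegetation": 2.0, "capacity": 10.0, "growth": 3.0}
    herders = [{"stock": 0.0, "appetite": 4.0}, {"stock": 0.0, "appetite": 4.0}]
    for rainfall in (1.0, 0.0, 2.0):  # a wet, a dry, and a very wet season
        step(patch, herders, rainfall)
        print(rainfall, patch["vegetation"], [h["stock"] for h in herders])
```

The order in which the three rule types fire within a step is itself a modeling assumption and should be documented alongside the rules.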

In the case of abstracting a referent system as being agent-based (unlike the earlier case of cellular automata), there are significant design or abstraction implications that must be considered in terms of subsequent implementation. Most ABM models discussed in this chapter and most others in the extant literature run fast on basic laptops. But some models cannot, requiring distributed computational resources, either through multiple processors or an actual cluster. An effective balance between high-fidelity and viable computational speed can be difficult to accomplish in the case of models having more than just local interactions.

The landscape of an ABM can also be tessellated, where sites can be square (most common form), triangular, hexagonal, or irregular (vector shapes), depending on a landscape's features in the referent system. As mentioned for CA, square cells normally are used for urban landscapes, whereas hexagonal cells are often preferable for large territories or open terrain. Each geometry has computational advantages and disadvantages, depending on factors such as total number of agents, sites, decision-making, behaviors, and scheduling. Needed data structures are also a consideration, such as preferring square sites over hexes when remote sensing imagery (using square pixels) is used in a model.

For square grids, agents may have von Neumann, Moore, or other neighborhood topology. For example, the original Sugarscape used von Neumann neighborhoods, whereas hexagonal neighborhoods in Wetlands and GeoSim use all six neighbors. Interaction or visual radii can also vary, depending on what is being abstracted from the referent system.
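The neighborhood topologies mentioned above differ only in which offsets around a site count as neighbors. The sketch below uses the standard distance-based definitions (Manhattan distance for von Neumann, Chebyshev distance for Moore) and axial coordinates for hexagonal sites; the function names and the choice of axial coordinates are assumptions for illustration, and specific models may define vision differently.

```python
def von_neumann(radius=1):
    """Offsets within Manhattan distance `radius` of a square site."""
    span = range(-radius, radius + 1)
    return [(dx, dy) for dx in span for dy in span
            if 0 < abs(dx) + abs(dy) <= radius]


def moore(radius=1):
    """Offsets within Chebyshev distance `radius` of a square site."""
    span = range(-radius, radius + 1)
    return [(dx, dy) for dx in span for dy in span if (dx, dy) != (0, 0)]


# Hexagonal tessellation in axial coordinates (q, r): six adjacent sites.
HEX = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def neighbors(site, offsets):
    """Sites reached from `site` by each offset (borders not handled here)."""
    x, y = site
    return [(x + dx, y + dy) for dx, dy in offsets]


if __name__ == "__main__":
    for radius in (1, 2, 3):
        print(radius, len(von_neumann(radius)), len(moore(radius)))
    print(neighbors((2, 2), HEX))
```

For radius r the counts grow as 2r(r+1) for von Neumann and (2r+1)^2 - 1 for Moore neighborhoods, which is one reason the choice of topology and radius affects computational cost as well as model behavior.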

The main result of the design stage of an ABM is a conceptual and formal model of the referent social system specified by agents (social actors), their behavioral rules (what each agent does), and an environment (where agents are situated). Class, sequence, and state diagrams in UML are useful for specifying a conceptual model, along with traditional flowcharts. Mathematical models are also helpful in specifying a formal model of the referent system of interest.

10.4.3 Implementation: Agent-Based Simulation Systems

Having developed a sufficiently complete conceptual or formal model of a referent system as an ABM, the next methodological stage consists of implementing the model in code using a simulation system. As always, the model can also be implemented in native code using an OOP language, such as Python, Java, or C++. Currently available simulation systems are mostly Java-based. The main milestone in implementation is the transition from UML diagrams and mathematical equations in the conceptual model to code in the simulation model.

The number of agent-based simulation systems (toolkits) today ranges somewhere between fifty and a hundred, with more being created to provide new facilities. Swarm, NetLogo, Repast, and MASON are among the most widely utilized ABM simulation systems. The choice among these alternative simulation systems for learning purposes largely depends on access and familiarity. As was the case earlier for cellular automata, NetLogo is often the toolkit of choice for learning agent-based modeling, although Python software is becoming increasingly available. For advanced research purposes, Repast and, in particular, MASON assume familiarity with Java. Both Repast and GeoMASON can also implement true GIS for developing spatial ABMs with high-fidelity calibration to represent realistic empirical features of terrain and other features of a referent system.

Figure 10.8 shows a screenshot of the Sugarscape model implemented in NetLogo.

Fig. 10.8

Screenshot of a Sugarscape model implemented in NetLogo

In addition to “The Big Four” (Swarm, NetLogo, Repast, and MASON), other software systems are also available for implementing ABM simulation models. Mathematica has demonstrated several simple ABMs, such as Sugarscape and Boids. Other ABM simulation systems are included in the Nikolai-Madey 2009 survey.

10.4.4 Verification

Verifying an ABM social simulation model requires making sure that agents, rules, and environments are all working the way they are supposed to according to the conceptual model. In the case of relatively few agents and square cells, verification is simplest and relatively straightforward. Part of verification must include close examination of landscape borders (edged or toroidal). Behavioral rules are best verified by detailed tracing of each discrete interaction event within a single simulation step. As always, all general verification procedures examined earlier in Sect. 8.7.4 also apply to ABM social simulations, including code walk-through, unit testing, profiling, and parameter sweeps.
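Border handling is a good candidate for unit testing, because errors at the edges of a landscape are easy to miss in a visual display. The following is a minimal sketch, assuming a square grid with von Neumann neighborhoods and a hypothetical helper `neighbor_sites`; the checks compare an edged landscape with a toroidal one at a corner site, where the two must differ, and at an interior site, where they must agree.

```python
VON_NEUMANN = [(0, -1), (0, 1), (-1, 0), (1, 0)]


def neighbor_sites(x, y, width, height, toroidal):
    """Neighboring sites of (x, y) on an edged or a toroidal landscape."""
    sites = []
    for dx, dy in VON_NEUMANN:
        nx, ny = x + dx, y + dy
        if toroidal:
            sites.append((nx % width, ny % height))  # wrap around the borders
        elif 0 <= nx < width and 0 <= ny < height:
            sites.append((nx, ny))                   # drop sites beyond the edge
    return sites


def check_corner_on_edged_landscape():
    assert sorted(neighbor_sites(0, 0, 5, 5, toroidal=False)) == [(0, 1), (1, 0)]


def check_corner_on_torus():
    expected = [(0, 1), (0, 4), (1, 0), (4, 0)]
    assert sorted(neighbor_sites(0, 0, 5, 5, toroidal=True)) == expected


def check_interior_is_unaffected_by_border_type():
    edged = neighbor_sites(2, 2, 5, 5, toroidal=False)
    torus = neighbor_sites(2, 2, 5, 5, toroidal=True)
    assert sorted(edged) == sorted(torus)


if __name__ == "__main__":
    check_corner_on_edged_landscape()
    check_corner_on_torus()
    check_interior_is_unaffected_by_border_type()
    print("all border checks passed")
```

The same pattern extends to behavioral rules: each discrete interaction event can be given a small check with a hand-computed expected result.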

10.4.5 Validation

Validating an ABM social simulation model that has passed its verification tests involves the same two main perspectives mentioned earlier for other models: structural and behavioral validity.

Structural validity refers to internal features of the model, including main assumptions concerning relevant agent attributes, interaction rules, and environments. The following should be considered when testing structural validity in an ABM:

Empirical tests of validation:

The specification of equations used by object methods, as well as attribute and parameter values, are features requiring validation. For example, in the case of the Anasazi and RiftLand models, this part of the validation procedure focused on parameters such as vegetation grow-back rates, as well as features of weather and land use. The radius of vision or communication used is another assumption requiring validation using empirical tests. It is also often assumed that coefficients are constant throughout a given simulated run. These are assumptions of structural stationarity, in the sense that agent rules do not change over time; i.e., classical object models assume that the basic clockwork of agents, rules, and environment does not change throughout history, which may or may not be a valid assumption with regard to a given referent system. For example, poverty may impair decision-making, or conflict may reduce cognitive bandwidth and complicate reasoning because of unresolved dissonance (Sect. 4.8.1).

Theoretical tests of validation:

ABM simulation assumptions must also be checked in terms of theories being used, especially concerning knowledge taken from various disciplines. This is a broader perspective than empirical tests of structural validity, as already noted, because it is based on fundamental causal arguments that are sometimes difficult—if not impossible—to quantify. For example, in the case of GeoSim and similar models, the overall structure is based on balance of power and deterrence theory concerning how nations are supposed to interact in an international system. In this case, the fundamental theory is based on factors such as objective capabilities untransformed by perceptions, calendar time undistorted by tension and stress, and other simplifying features. Is such a theory valid? Are there other factors as important or even more significant than these? The underlying theory used in an ABM may also assume perfect symmetry among agents, even when they are heterogeneous in some respects. Even bounded rationality is often implemented in simplistic ways. Is it possible that actors decide with time-dependent or other forms of heterogeneity?

Tests of structural validity for ABM social simulation models can be laborious, but are always necessary to develop confidence in a model. Again, the empirical literature is of critical value in conducting these tests.

Behavioral validity is about actual results from ABM simulation runs, especially in terms of qualitative and quantitative features such as patterns of growth, decay, or oscillation. What matters most for ascertaining behavioral validity is whether simulated spatial patterns generated by an ABM correspond to known empirical patterns in its referent system. Time series, histograms, specialized metrics, and similar results are among the most commonly used forms of evidence. For example, Figs. 10.6 and 10.8 showed the Lorenz curves and wealth distribution histograms generated by the Sugarscape model. The long-run patterns of these (shown on the right side of the figure) are a close match to known empirical patterns in many societies (Pareto's Law). The RiftLand model is capable of generating ground cover patterns that are almost indistinguishable from empirical satellite imagery obtained through remote sensing. The Anasazi model was among the first empirically referenced ABMs to demonstrate a close fit between simulated results and empirically measured patterns.
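Comparing simulated and empirical wealth distributions requires a metric that can be computed on both. The sketch below computes Lorenz curve points and the Gini coefficient from a list of agent wealth values, using the trapezoid rule for the area under the Lorenz curve; the function names are illustrative, and the code assumes non-negative wealth with a positive total.

```python
def lorenz_points(wealth):
    """Cumulative population share versus cumulative wealth share, poorest first."""
    values = sorted(wealth)
    total = sum(values)  # assumed to be positive
    points = [(0.0, 0.0)]
    running = 0.0
    for i, w in enumerate(values, start=1):
        running += w
        points.append((i / len(values), running / total))
    return points


def gini(wealth):
    """Gini coefficient as 1 minus twice the area under the Lorenz curve."""
    pts = lorenz_points(wealth)
    area = sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    return 1 - 2 * area


if __name__ == "__main__":
    uniform = [5, 5, 5, 5, 5, 5, 5, 5]   # stylized wealth at initialization
    skewed = [1, 1, 1, 2, 2, 3, 8, 22]   # stylized long-run wealth
    print("Gini at start:", round(gini(uniform), 3))
    print("Gini later   :", round(gini(skewed), 3))
```

A Gini value near 0 corresponds to a nearly straight Lorenz curve, and larger values correspond to the more pronounced curvature described for long-run Sugarscape wealth. The same function can be applied to empirical income or wealth data for direct comparison with simulated output.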

10.4.6 Analysis

ABM social simulations are susceptible to many forms of analysis, including formal analysis, asking what-if questions, and scenario analysis.

Formal analysis of ABM, a tradition exemplified by urban dynamics and human geography, is a major field extending far beyond the confines of CSS. For example, various gravity models of agent interactions, as well as driven-threshold systems of agents, display significant properties that can be investigated through formal analysis. For the most part, CSS researchers have paid relatively little attention to formal analysis of spatio-temporal interactions of agent communities. For example, different structural specifications of distance or temporal interaction, and different types of driven-threshold mechanisms, remain largely unexplored, in spite of their fundamental theoretical interest. Formal analysis of agent rules can also yield theoretical expectations for testing through simulation.

Another way of analyzing ABM social simulations is by asking what-if questions. For example, in a model such as Sugarscape we may ask what may happen when a Moore neighborhood is used, as opposed to the standard von Neumann neighborhood. Or, what if agent vision deteriorates as a function of time, as can also happen in times of conflict (“fog of war” effect)? What-if questions can also be used to analyze an ABM simulation using different rule sets. For example, in an agent migration model we may wish to have one group responding to a Moore neighborhood while another uses a von Neumann neighborhood, perhaps based on different attitudes toward physical distance. Or, one group may be endowed with vision having longer range.

Scenario analysis provides a more comprehensive and versatile methodological approach to analyzing ABM social simulations. A scenario uses a set of related research questions, rather than analyzing one question at a time. For example, in a model such as RiftLand, it is possible to investigate a scenario such as prolonged drought in a given country: Given a three-year drought that has been going on in, say, Kenya, what may happen to crops and herds should the drought continue for another year or two? How might social relations be affected? Will governmental institutions of the polity have sufficient capacity to mitigate the societal effects caused by drought? Will there be displaced persons? Will large-scale refugee flows be generated by the drought? Will refugee flows remain internal or cross boundaries into neighboring countries? Can such analyses provide novel insights that may be valuable to relief planners and responders? Sets of scenarios can also be used for investigating natural, engineering, and anthropogenic (human-caused) disasters.

ABM social simulations are still primarily intended for basic CSS and theoretical analysis, but increasingly they are being called upon to address policy analysis to provide actionable results. Significant methodological and theoretical advances are still necessary to satisfy demand, but sustained progress will enable future generations of CSS researchers to build upon and surpass these recent achievements.