Definition of the Subject

Agent‐based modeling is a bottom‐up approach to representing and investigating complex systems. Agent‐based models can be implemented either computationally (e. g., through computer simulation) or non‐computationally (e. g., with participatory simulation). The close match between the capabilities of computational platforms and the requirements of agent‐based modeling makes these platforms a natural choice for many agent‐based models. Of course, realizing the potential benefits of this natural match necessitates the use of computer languages to express the designs of agent‐based models. A wide range of computer programming languages can play this role, including both domain‐specific and general purpose languages. The domain‐specific languages include business‐oriented languages (e. g., spreadsheet programming tools); science and engineering languages (e. g., Mathematica); and dedicated agent‐based modeling languages (e. g., NetLogo). The general purpose languages can be used directly (e. g., Java programming) or within agent‐based modeling toolkits (e. g., Repast). The choice that is most appropriate for each modeling project depends on both the requirements of that project and the resources available to implement it.

Introduction

The term agent‐based modeling (ABM) refers to the computational modeling of a system as composed of a number of independent, interacting entities, which are referred to as ‘agents’. Generally, an agent‐based system is made up of agents that interact, adapt, and sustain themselves while interacting with other agents and adapting to a changing environment. The fundamental feature of an agent is its autonomy, the capability of the agent to act independently without the need for direction from external sources. Agents have behaviors that make them active rather than passive entities. Agent behaviors allow agents to take in information from their environment, which includes their interactions with other agents, process the information and make some decision about their next action, and take the action. Jennings [21] provides a rigorous computer science view of agency emphasizing the essential characteristic of autonomous behavior.

Beyond the essential characteristic of autonomy, there is no universal agreement on the precise definition of the term “agent”, as used in agent‐based modeling. Some consider any type of independent component, whether it be a software model or a software model of an extant individual, to be an agent [4]. An independent component's behaviors can be modeled as consisting of anything from simple reactive decision rules to multi‐dimensional behavior complexes based on adaptive artificial intelligence (AI) techniques.

Other authors insist that a component's behavior must be adaptive in order for the entity to be considered an agent. The agent label is reserved for components that can adapt to their environment by learning from the successes and failures of their interactions with other agents and changing their behaviors in response. Casti [5] argues that agents should contain both base‐level rules for behavior as well as a higher‐level set of “rules to change the rules”. The base‐level rules provide responses to the environment while the “rules to change the rules” provide adaptation [5].

From a practical modeling standpoint, agent characteristics can be summarized as follows:

  • Agents are identifiable as self‐contained individuals. An agent has a set of characteristics and rules governing its behaviors.

  • Agents are autonomous and self‐directed. An agent can function independently in its environment and in its interactions with other agents, at least over a limited range of situations that are of interest.

  • An agent is situated, living in an environment with which it interacts along with other agents. Agents have the ability to recognize and distinguish the traits of other agents. Agents also have protocols for interaction with other agents, such as for communication, and the capability to respond to the environment.

  • An agent may be goal‐directed, having targets to achieve with respect to its behaviors. This allows an agent to compare the outcome of its behavior to its goals. An agent's goals need not be comprehensive or well‐defined. For example, an agent does not necessarily have formally stated objectives it is trying to maximize.

  • An agent might have the ability to learn and adapt its behaviors based on its experiences. An agent might have rules that modify its behavior over time. Generally, learning and adaptation at the agent level require some form of memory to be built into the agent's behaviors.
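The characteristics listed above can be illustrated with a minimal sketch in Python. All names here (`Agent`, `sense`, `decide`, `act`, the `food` environment entry) are hypothetical choices made for illustration; they are not drawn from any particular toolkit, and the decision rule is deliberately trivial.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A self-contained individual with state, a loose goal, and behavior rules."""
    name: str
    energy: float = 5.0           # a characteristic (agent state)
    target_energy: float = 8.0    # an informal goal, not a formal objective
    memory: list = field(default_factory=list)  # memory that could support adaptation

    def sense(self, environment: dict) -> float:
        # Take in information from the agent's environment.
        return environment.get("food", 0.0)

    def decide(self, food: float) -> str:
        # A simple reactive decision rule that compares state against the goal.
        if self.energy < self.target_energy and food > 0:
            return "eat"
        return "rest"

    def act(self, action: str, environment: dict) -> None:
        if action == "eat":
            bite = min(1.0, environment["food"])
            environment["food"] -= bite
            self.energy += bite
        else:
            self.energy -= 0.5    # resting still costs energy
        self.memory.append(action)  # remember what was done

    def step(self, environment: dict) -> None:
        # Autonomy: the agent chooses its own action with no external direction.
        self.act(self.decide(self.sense(environment)), environment)


env = {"food": 2.0}
a = Agent("a1")
for _ in range(4):
    a.step(env)
print(a.energy, a.memory, env)
```

In this sketch the agent eats while food remains and rests once it is gone. The `memory` list is only recorded, not yet used; a learning rule would consult it when deciding.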

Often, in an agent‐based model, the population of agents varies over time, as agents are born and die. Another form of adaptation can occur at the agent population level. Agents that are fit are better able to sustain themselves and possibly reproduce as time in the simulation progresses, while agents that have characteristics less suited to their continued survival are excluded from the population.

Another basic assumption of agent‐based modeling is that agents have access only to local information. Agents obtain information about the rest of the world only through their interactions with the limited number of agents around them at any one time, and from their interactions with a local patch of the environment in which they are situated.

These aspects of how agent‐based modeling treats agents highlight the fact that the full range of agent diversity can be incorporated into an agent‐based model. Agents are diverse and heterogeneous as well as dynamic in their attributes and behavioral rules. There is no need to make agents homogeneous through aggregating agents into groups or by identifying the ‘average’ agent as representative of the entire population. Behavioral rules vary in their sophistication, how much information is considered in the agent decisions (i. e., cognitive ‘load’), the agent's internal models of the external world including the possible reactions or behaviors of other agents, and the extent of memory of past events the agent retains and uses in its decisions. Agents can also vary by the resources they have managed to accumulate during the simulation, which may be due to some advantage that results from specific attributes. The only limit on the number of agents in an agent‐based model is imposed by the computational resources required to run the model.

As a point of clarification, agent‐based modeling is also known by other names. ABS (agent‐based systems), IBM (individual‐based modeling), and MAS (multi‐agent systems) are widely‐used acronyms, but ‘ABM’ will be used throughout this discussion. The term ‘agent’ has connotations other than how it is used in ABM. For example, ABM agents are different from the typical agents found in mobile agent systems. ‘Mobile agents’ are light‐weight software proxies that roam the world‐wide web and perform various functions programmed by their owners, such as gathering information from web sites. To this extent, mobile agents are autonomous and share this characteristic with agents in ABM.

Types of Computer Languages

A ‘computer language’ is a method of noting directives for computers. ‘Computer programming languages,’ or ‘programming languages,’ are an important category of computer languages. A programming language is a computer language that allows any computable activity to be expressed. This article focuses on computer programming languages rather than the more general computer languages since virtually all agent‐based modeling systems require the power of programming languages. This article sometimes uses the simpler term ‘computer languages’ when referring to computer programming languages. According to Watson [47]:

Programming languages are used to describe algorithms, that is, sequences of steps that lead to the solution of problems … A programming language can be considered to be a ‘notation’ that can be used to specify algorithms with precision.

Watson [47] goes on to say that “programming languages can be roughly divided into four groups: imperative languages, functional languages, logic programming languages, and others”. Watson [47] states that in imperative languages “there is a fundamental underlying dependence on the assignment operation and on variables implemented as computer memory locations, whose contents can be read and altered”. However, “in functional languages (sometimes called applicative languages) the fundamental operation is function application” [47]. Watson cites LISP as an example. Watson [47] continues by noting that “in a logic programming language, the programmer needs only to supply the problem specification in some formal form, as it is the responsibility of the language system to infer a method of solution”.

A useful feature of most functional languages, many logic programming languages, and some imperative languages is higher‐order programming. According to Reynolds [35]:

In analogy with mathematical logic, we will say that a programming language is higher‐order if procedures or labels can occur as data, i. e., if these entities can be used as arguments to procedures, as results of functions, or as values of assignable variables. A language that is not higher‐order will be called first‐order.
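A short Python sketch may make the three uses named in this definition concrete: a procedure as an argument, as a result, and as the value of an assignable variable. The function names (`twice`, `increment`, `rule`, `add_two`) are hypothetical and chosen only for illustration.

```python
from typing import Callable


def twice(f: Callable[[int], int]) -> Callable[[int], int]:
    """Take a function as an argument and return a new function as the result."""
    def applied_twice(x: int) -> int:
        return f(f(x))
    return applied_twice


def increment(x: int) -> int:
    return x + 1


# A function used as the value of an assignable variable.
rule = increment
add_two = twice(rule)

# Functions held as ordinary data inside a list.
rules = [increment, add_two, lambda x: x * 10]
results = [r(3) for r in rules]
print(results)
```

For agent‐based modeling, this is what allows behaviors to be stored, passed around, and swapped like any other agent attribute.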

Watson [47] offers that “another way of grouping programming languages is to classify them as procedural or declarative languages”. Elaborating, Watson [47] states that:

Procedural languages … are those in which the action of the program is defined by a series of operations defined by the programmer. To solve a problem, the programmer has to specify a series of steps (or statements) which are executed in sequence.

On the other hand, Watson [47] notes that:

Programming in a declarative language (or non‐procedural language) involves the specification of a set of rules defining the solution to the problem; it is then up to the computer to determine how to reach a solution consistent with the given rules … The language Prolog falls into this category, although it retains some procedural aspects. Another widespread non‐procedural system is the spreadsheet program.

Imperative and functional languages are usually procedural while logic programming languages are generally declarative. This distinction is important since it implies that most imperative and functional languages require users to define how each operation is to be completed while logic programming languages only require users to define what is to be achieved. However, when faced with multiple possible solutions with different execution speeds and memory requirements, imperative and functional languages offer the potential for users to explicitly choose more efficient implementations over less efficient ones. Logic programming languages generally need to infer which solution is best from the problem description and may or may not choose the most efficient implementation. Naturally, this potential strength of imperative and functional languages may also be cast as a weakness. With imperative and functional languages, users need to correctly choose a good implementation among any competing candidates that may be available.

Similarly to Watson [47], Van Roy and Haridi [45] define several common computational models, namely those that are object‐oriented, those that are logic‐based, and those that are functional. Object‐oriented languages are procedural languages that bind procedures (i. e., ‘encapsulated methods’) to their corresponding data (i. e., ‘fields’) in nested hierarchies (i. e., ‘inheritance’ graphs) such that the resulting ‘classes’ can be instantiated to produce executable instances (i. e., ‘objects’) that respond to multiple related messages (i. e., ‘polymorphism’). Logic‐based languages correspond to Watson's [47] logic programming languages. Similarly, Van Roy and Haridi's [45] functional languages correspond to those of Watson [47].
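The object‐oriented vocabulary in the preceding paragraph (fields, methods, classes, inheritance, objects, polymorphism) can be mapped onto a few lines of Python. The class names below are hypothetical and serve only to label each concept.

```python
class Animal:
    """A class binds fields (data) to methods (procedures)."""

    def __init__(self, name: str) -> None:
        self.name = name                 # field

    def speak(self) -> str:              # encapsulated method
        return f"{self.name} makes a sound"


class Dog(Animal):                       # inheritance: Dog extends Animal
    def speak(self) -> str:              # overrides the inherited method
        return f"{self.name} barks"


class Cat(Animal):
    def speak(self) -> str:
        return f"{self.name} meows"


# Objects are instances of classes.
animals = [Animal("Generic"), Dog("Rex"), Cat("Tom")]

# Polymorphism: the same message is answered differently by each object.
messages = [a.speak() for a in animals]
print(messages)
```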

Two additional types of languages can be added to Van Roy and Haridi's [45] list of three. These are unstructured and structured languages [10]. Both unstructured and structured languages are procedural languages.

Unstructured languages are languages that rely on step‐by‐step solutions such that the solutions can contain arbitrary jumps between steps [10]. BASIC, COBOL, FORTRAN, and C are examples of unstructured languages. The arbitrary jumps are often implemented using ‘goto’ statements. Unstructured languages were famously criticized by Edsger Dijkstra in his classic paper “Go To Statement Considered Harmful” [10]. This and related criticism led to the introduction of structured languages.

Structured languages are languages that divide programs into separate modules each of which has one controlled entry point, a limited number of exit points, and no internal jumps [10]. Following Stevens et al. [38] “the term module is used to refer to a set of one or more contiguous program statements having a name by which other parts of the system can invoke it and preferably having its own distinct set of variable names”. Structured language modules, often called procedures, are generally intended to be small. As such, large numbers of them are usually required to solve complex problems. Standard Pascal is an example of a structured, but not object‐oriented, language. As stated earlier, C is technically an unstructured language (i. e., it allows jumps within procedures and ‘long jumps’ between procedures), but it is used so often in a structured way that many people think of it as a structured language.

The quality of modularization in structured language code is often considered to be a function of coupling and cohesion [38]. Coupling is the tie between modules such that the proper functioning of one module depends on the functioning of another module. Cohesion is the tie within a module such that the proper functioning of one line of code in a module depends on the functioning of another line of code in the same module. The goal for modules is maximizing cohesion while minimizing coupling.

Object‐oriented languages are a subset of structured languages. Object‐oriented methods and classes are structured programming modules that have special features for binding data, inheritance, and polymorphism. The previously introduced concepts of coupling and cohesion apply to classes, objects, methods, and fields the same way that they apply to generic structured language modules. Objective-C, C++, C#, and Java are all examples of object‐oriented languages. As with C, the languages Objective-C, C++, and C# offer goto statements but they have object‐oriented features and are generally used in a structured way. Java is an interesting case in that the word ‘goto’ is reserved as a keyword in the language specification, but it is not intended to be implemented.

It is possible to develop agent‐based models using any of the programming languages discussed above, namely unstructured languages, structured languages, object‐oriented languages, logic‐based languages, and functional languages. Specific examples are provided later in this article. However, certain features of programming languages are particularly well suited for supporting the requirements of agent‐based modeling and simulation.

Requirements of Computer Languages for Agent‐Based Modeling

The requirements of computer languages for agent‐based modeling and simulation include the following:

  • There is a need to create well defined modules that correspond to agents. These modules should bind together agent state data and agent behaviors into integrated independently addressable constructs. Ideally these modules will be flexible enough to change structure over time and to optionally allow fuzzy boundaries to implement models that go beyond methodological individualism [20].

  • There is a need to create well defined containers that correspond to agent environments. Ideally these containers will be recursively nestable or will otherwise support sophisticated definitions of containment.

  • There is a need to create well defined spatial relationships within agent environments. These relationships should include notions of abstract space (e. g., lattices), physical space (e. g., maps), and connectedness (e. g., networks).

  • There is a need to easily set up model configurations such as the number of agents; the relationships between agents; the environmental details; and the results to be collected.

  • There is a need to conveniently collect and analyze model results.

Each of the kinds of programming languages, namely unstructured languages, structured languages, object‐oriented languages, logic‐based languages, and functional languages, can address these requirements.
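As a rough illustration of how the five requirements listed above fit together, the following Python sketch pairs each with one small construct: an agent module, an environment container, a spatial structure (here a wrap‐around lattice), a configuration object, and a results list. Every name (`Config`, `Walker`, `Grid`, `run`) and every parameter value is an assumption made for this example rather than part of any toolkit.

```python
import random
from dataclasses import dataclass


@dataclass
class Config:
    """Model configuration: agent count, lattice size, run length, random seed."""
    n_agents: int = 5
    width: int = 4
    height: int = 4
    steps: int = 3
    seed: int = 42


class Walker:
    """Agent module: binds state (an id and a position) to behavior (step)."""

    def __init__(self, agent_id: int, x: int, y: int) -> None:
        self.agent_id, self.x, self.y = agent_id, x, y

    def step(self, grid: "Grid", rng: random.Random) -> None:
        dx, dy = rng.choice([(0, 1), (1, 0), (0, -1), (-1, 0)])
        grid.move(self, self.x + dx, self.y + dy)


class Grid:
    """Environment container with an abstract space (a lattice that wraps at the edges)."""

    def __init__(self, width: int, height: int) -> None:
        self.width, self.height = width, height
        self.agents = []

    def add(self, agent: Walker) -> None:
        self.agents.append(agent)

    def move(self, agent: Walker, x: int, y: int) -> None:
        agent.x, agent.y = x % self.width, y % self.height

    def neighbors(self, agent: Walker) -> list:
        # Local information only: the other agents on the same cell.
        return [o for o in self.agents
                if o is not agent and (o.x, o.y) == (agent.x, agent.y)]


def run(config: Config) -> list:
    rng = random.Random(config.seed)
    grid = Grid(config.width, config.height)
    for i in range(config.n_agents):
        grid.add(Walker(i, rng.randrange(config.width), rng.randrange(config.height)))
    results = []
    for t in range(config.steps):
        for agent in grid.agents:
            agent.step(grid, rng)
        # Each co-located pair is counted once from each side, hence the division.
        meetings = sum(len(grid.neighbors(a)) for a in grid.agents) // 2
        results.append({"step": t, "meetings": meetings})  # results collection
    return results


results = run(Config())
print(results)
```

Because the random generator is seeded from the configuration, repeated runs with the same `Config` produce the same results, which simplifies later analysis.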

Unstructured languages generally support procedure definitions which can be used to implement agent behaviors. They also sometimes support the collection of diverse data into independently addressable constructs in the form of data structures often called ‘records’. However, they generally lack support for binding procedures to individual data items or records of data items. This lack of support for creating integrated constructs also typically limits the language‐level support for agent containers. Native support for implementing spatial environments is similarly limited by the inability to directly bind procedures to data.

As discussed in the previous section, unstructured languages offer statements to implement execution jumps. The use of jumps within and between procedures tends to reduce module cohesion and increase module coupling compared to structured code. The result is reduced code maintainability and extensibility compared to structured solutions. This is a substantial disadvantage of unstructured languages.

In contrast, many have argued that, at least theoretically, unstructured languages can achieve the highest execution speed and lowest memory usage of the language options since nearly everything is left to the application programmers. In practice, programmers implementing agent‐based models in unstructured languages usually need to write their own tools to form agents by correlating data with the corresponding procedures. Ironically, these tools are often similar in design, implementation, and performance to some of the structured and object‐oriented features discussed later.
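The kind of hand‐written tooling described above can be sketched as follows. Python is used here only as a notation; it is not an unstructured language, and the sketch uses no jumps. The point is the separation between plain records and free‐standing procedures, with a lookup table written by the modeler to correlate the two. The record layout and the names `buyer_step`, `seller_step`, and `BEHAVIORS` are hypothetical.

```python
# Agent state kept in plain records (dicts) with no procedures attached.
agents = [
    {"kind": "buyer", "cash": 10, "goods": 0},
    {"kind": "seller", "cash": 0, "goods": 3},
]


def buyer_step(record):
    # Buy one unit at a fixed price of 2 while cash allows.
    if record["cash"] >= 2:
        record["cash"] -= 2
        record["goods"] += 1


def seller_step(record):
    # Sell one unit at a fixed price of 2 while stock remains.
    if record["goods"] > 0:
        record["goods"] -= 1
        record["cash"] += 2


# Hand-written table correlating each kind of record with its procedure.
BEHAVIORS = {"buyer": buyer_step, "seller": seller_step}


def step_all(records):
    for record in records:
        BEHAVIORS[record["kind"]](record)


for _ in range(2):
    step_all(agents)
print(agents)
```

The `BEHAVIORS` table plays the role that method dispatch plays in an object‐oriented language, which is why such tools tend to resemble the features they stand in for.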

Unstructured languages generally do not provide special support for application data configuration, program output collection, or program results analysis. As such, these tasks usually need to be manually implemented by model developers.

In terms of agent‐based modeling, structured languages are similar to unstructured languages in that they do not provide tools to directly integrate data and procedures into independently addressable constructs. Therefore, structured language support for agents, agent environments, and agent spatial relationships is similar to that provided by unstructured languages. However, the lack of jump statements in structured languages tends to increase program maintainability and extensibility compared to unstructured languages. This generally gives structured languages a substantial advantage over unstructured languages for implementing agent‐based models.

Object‐oriented languages build on the maintainability and extensibility advantages of structured languages by adding the ability to bind data to procedures. This binding in the form of classes provides a natural way to implement agents. In fact, object‐oriented languages have their roots in Ole-Johan Dahl and Kristen Nygaard's Simula simulation language [7,8,45]! According to Dahl and Nygaard [7]:

SIMULA (SIMULation LAnguage) is a language designed to facilitate formal description of the layout and rules of operation of systems with discrete events (changes of state). The language is a true extension of ALGOL 60 [2], i. e., it contains ALGOL 60 as a subset. As a programming language, apart from simulation, SIMULA has extensive list processing facilities and introduces an extended co‐routine concept in a high‐level language.

Dahl and Nygaard go on to state the importance of specific languages for simulation [7] as follows:

Simulation is now a widely used tool for analysis of a variety of phenomena: nerve networks, communication systems, traffic flow, production systems, administrative systems, social systems, etc. Because of the necessary list processing, complex data structures and program sequencing demands, simulation programs are comparatively difficult to write in machine language or in ALGOL or FORTRAN. This alone calls for the introduction of simulation languages.

However, still more important is the need for a set of basic concepts in terms of which it is possible to approach, understand and describe all the apparently very different phenomena listed above. A simulation language should be built around such a set of basic concepts and allow a formal description which may generate a computer program. The language should point out similarities and differences between systems and force the research worker to consider all relevant aspects of the systems. System descriptions should be easy to read and print and hence useful for communication.

Again, according to Dahl and Nygaard [8]:

SIMULA I (1962–65) and Simula 67 (1967) are the two first object‐oriented languages. Simula 67 introduced most of the key concepts of object‐oriented programming: both objects and classes, subclasses (usually referred to as inheritance) and virtual procedures, combined with safe referencing and mechanisms for bringing into a program collections of program structures described under a common class heading (prefixed blocks).

The Simula languages were developed at the Norwegian Computing Center, Oslo, Norway by Ole-Johan Dahl and Kristen Nygaard. Nygaard's work in Operational Research in the 1950s and early 1960s created the need for precise tools for the description and simulation of complex man‐machine systems. In 1961 the idea emerged for developing a language that both could be used for system description (for people) and for system prescription (as a computer program through a compiler). Such a language had to contain an algorithmic language, and Dahl's knowledge of compilers became essential … When the inheritance mechanism was invented in 1967, Simula 67 was developed as a general programming language that also could be specialized for many domains, including system simulation.

Generally, object‐oriented classes are used to define agent templates and instantiated objects are used to implement specific agents. Agent environment templates and spatial relationship patterns are also typically implemented using classes. Recursive environment nesting as well as abstract spaces, physical spaces, and connectedness can all be represented in relatively straightforward ways. Instantiated objects are used to implement specific agent environments and spatial relationships in individual models. Within these models, model configurations are also commonly implemented as objects instantiated from one or more classes. However, as with unstructured and structured languages, object‐oriented languages generally do not provide special support for application data configuration, program output collection, or program results analysis. As such, these tasks usually need to be manually implemented by model developers. Regardless of this, the ability to bind data and procedures provides such a straightforward method for implementing agents that most agent‐based models are written using object‐oriented languages.

It should be noted that traditional object‐oriented languages do not provide a means to modify class and object structures once a program begins to execute. Newer ‘dynamic’ object‐oriented languages such as Groovy [22] offer this capability. This potentially allows agents to gain and lose data items and methods during the execution of a model based on the flow of events in a simulation. This in turn offers the possibility of implementing modules with fuzzy boundaries that are flexible enough to change structure over time.
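The text names Groovy as an example; the following sketch uses Python, another dynamic object‐oriented language, as a stand‐in to show an agent gaining and then losing a data item and a method while the program runs. The `Agent` class and the `forage` behavior are hypothetical.

```python
import types


class Agent:
    def __init__(self, name: str) -> None:
        self.name = name


def forage(self) -> str:
    return f"{self.name} forages"


a = Agent("a1")
b = Agent("b1")

# Gain a data item and a method during execution; only this instance changes.
a.energy = 5
a.forage = types.MethodType(forage, a)
gained = a.forage()

# Lose them again later in the run.
del a.forage
del a.energy

has_forage_after = hasattr(a, "forage")
print(gained, has_forage_after, hasattr(b, "forage"))
```

Binding the method to a single instance, rather than to the class, leaves other agents of the same class unaffected, which is one way to let individual agents diverge in structure.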

As discussed in the previous section, logic‐based languages offer an alternative to the progression formed by unstructured, structured, and object‐oriented languages. Logic‐based languages can provide a form of direct support for binding data (e. g., asserted propositions) with actions (e. g., logical predicates), sometimes including the use of higher‐order programming. In principle, each agent can be implemented as a complex predicate with multiple nested sub‐terms. The sub‐terms, which may contain unresolved variables, can then be activated and resolved as needed during model execution. Agent templates which are analogous to object‐oriented classes can be implemented using the same approach but with a larger number of unresolved variables. Agent environments and the resulting relationships between agents can be formed in a similar way. Since each of these constructs can be modified at any time, the resulting system can change structure over time and may even allow fuzzy boundaries. In practice this approach is rarely, if ever, used. As with the previously discussed approaches, logic‐based languages usually do not provide special support for application data configuration, program output collection, or program results analysis so these usually need to be manually developed.

Functional languages offer yet another alternative to the previously discussed languages. Like logic‐based and object‐oriented languages, functional languages often provide a form of direct support for binding data with behaviors. This support often leverages the fact that most functional languages support higher‐order programming. As a result, the data is usually in the form of nested lists of values and functions while the behaviors themselves are implemented in the form of functions. Agent templates (i. e., ‘classes’), agent environments, and agent relationships can be implemented similarly. Each of the lists can be dynamically changed during a simulation run so the model structure can evolve and can potentially have fuzzy boundaries. Unlike the other languages discussed so far, a major class of functional languages, namely those designed for computational mathematics, usually include sophisticated support for program output collection and results analysis. An example is Mathematica (Wolfram [49]). If the application data is configured in mathematically regular ways then these systems may also provide support for application data setup.
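The idea of an agent as a nested structure of values and functions can be sketched in Python written in a functional style. Each agent is a pair of a state mapping and a list of behavior functions, and each step returns new data instead of modifying the old. The names (`make_agent`, `age`, `earn`, `retire`) and the wage model are hypothetical.

```python
from functools import reduce


def age(state):
    return {**state, "age": state["age"] + 1}


def earn(state):
    return {**state, "wealth": state["wealth"] + state["wage"]}


def make_agent(wage):
    # An agent is plain data: (state values, list of behavior functions).
    return ({"age": 0, "wealth": 0, "wage": wage}, [age, earn])


def step(agent):
    """Return a new agent whose state results from applying each behavior in turn."""
    state, behaviors = agent
    return (reduce(lambda s, f: f(s), behaviors, state), behaviors)


def run(agents, steps):
    for _ in range(steps):
        agents = [step(a) for a in agents]
    return agents


population = [make_agent(1), make_agent(3)]
final = run(population, 2)


# Behaviors are data too, so an agent's structure can change during a run.
def retire(state):
    return {**state, "wage": 0}


state0, behaviors0 = final[0]
final[0] = (state0, behaviors0 + [retire])
final = run(final, 1)
print(final[0][0], final[1][0])
```

Appending `retire` to one agent's behavior list changes that agent's structure mid‐run without touching any template, which is the flexibility the paragraph above describes.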

Example Computer Languages Useful for Agent‐Based Modeling

Domain‐Specific Languages

Domain‐specific languages (DSL's) are computer languages that are highly customized to support a well defined application area or ‘domain’. DSL's commonly include a substantial number of keywords that are nouns and verbs in the area of application as well as overall structures and execution patterns that correspond closely with the application area. DSL's are intended to allow users to write in a language that is closely aligned with their area of expertise.

DSL's often gain their focus by losing generality. For many DSL's there are activities that can be programmed in most computer languages that cannot be programmed in the given DSL. This is consciously done to simplify the DSL's design and make it easier to learn and use. If a DSL is properly designed then the loss of generality is often inconsequential for most uses since the excluded activities are chosen to be outside the normal range of application. However, even the best designed DSL's can occasionally be restrictive when the bounds of the language are encountered. Some DSL's provide special extension points that allow their users to program in a more general language such as C or Java when the limits of the DSL are reached. This feature is extremely useful, but requires more sophistication on the part of the user in that they need to know and simultaneously use both the DSL and the general language.

DSL's have the potential to implement specific features to support ‘design patterns’ within a given domain. Design patterns form a “common vocabulary” describing tried and true solutions for commonly faced software design problems (Coplien [6]). Software design patterns were popularized by Gamma et al. [13]. North and Macal [29] describe three design patterns for agent‐based modeling itself.

In principle, DSL's can be unstructured, structured, object‐oriented, logic‐based, or functional. In practice, DSL's are often structured languages or object‐oriented languages and occasionally are functional languages. Commonly used ABM DSL's include business‐oriented languages (e. g., spreadsheet programming tools); science and engineering languages (e. g., Mathematica); and dedicated agent‐based modeling languages (e. g., NetLogo).

Business Languages

Some of the most widely used business computer languages are those available in spreadsheet packages. Spreadsheets are usually programmed using a ‘macro language’. As discussed further in North and Macal [29], any modern spreadsheet program can be used to do basic agent‐based modeling. The most common convention is to associate each row of a primary spreadsheet worksheet with an agent and use consecutive columns to store agent properties. Secondary worksheets are then used to represent the agent environment and to provide temporary storage for intermediate calculations. A simple loop is usually used to scan down the list of agents and to allow each one to execute in turn. The beginning and end of the scanning loop are generally used for special setup activities before and special cleanup activities after each round. An example agent spreadsheet from North and Macal [29] is shown in Fig. 1 and Fig. 2. Agent spreadsheets have both strengths and weaknesses compared to the other ABM tools. Agent spreadsheets tend to be easy to build but they also tend to have limited capabilities. This balance makes spreadsheets ideal for agent‐based model exploration, scoping, and prototyping. Simple agent models can be implemented on the desktop using environments outside of spreadsheets as well.

Figure 1

An example agent spreadsheet [29]

Figure 2

An example agent spreadsheet code [29]
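The row‐per‐agent convention described in this section can also be sketched outside a spreadsheet. The Python fragment below is not the code from North and Macal [29]; it is a hypothetical analogue in which a list of rows stands in for the primary worksheet, a dictionary stands in for a secondary worksheet, and a loop scans down the agents with setup and cleanup around each round. The column names and values are invented for illustration.

```python
# Primary "worksheet": each row is one agent, each column one agent property.
COLUMNS = ["id", "wealth", "spend_per_round"]
agent_sheet = [
    [1, 100, 10],
    [2, 50, 10],
    [3, 80, 40],
]

# Secondary "worksheet": environment values and intermediate results.
environment_sheet = {"market_total": 0, "rounds_completed": 0}

WEALTH = COLUMNS.index("wealth")
SPEND = COLUMNS.index("spend_per_round")


def run_round(agents, environment):
    environment["market_total"] = 0            # setup before the scan
    for row in agents:                         # scan down the list of agents
        spent = min(row[WEALTH], row[SPEND])   # an agent cannot overspend
        row[WEALTH] -= spent
        environment["market_total"] += spent
    environment["rounds_completed"] += 1       # cleanup after the scan


run_round(agent_sheet, environment_sheet)
print(agent_sheet, environment_sheet)
```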

Science and Engineering Languages

Science and engineering languages embodied in commercial products such as Mathematica, MATLAB, Maple, and others can be used as a basis for developing agent‐based models. Such systems usually have a large user base, are readily available on desktop platforms, and are widely integrated into academic training programs. They can be used as rapid prototype development tools or as components of large‐scale modeling systems. Science and engineering languages have been applied to agent‐based modeling. Their advantages include a fully integrated development environment, an interpreted (as opposed to compiled) nature that provides immediate feedback to users during the development process, and a packaged user interface. Integrated tools provide support for data import and graphical display. Macal [25] describes the use of Mathematica and MATLAB in agent‐based simulation and Macal and Howe [26] detail investigations into linking Mathematica and MATLAB to the Repast ABM toolkit to make use of Repast's simulation scheduling algorithms. In the following sections we focus on MATLAB and Mathematica as representative examples of science and engineering languages.

MATLAB and Mathematica are both examples of Computational Mathematics Systems (CMS). CMS allow users to apply powerful mathematical algorithms to solve problems through a convenient and interactive user interface. CMS typically supply a wide range of built‐in functions and algorithms. MATLAB, Mathematica, and Maple are examples of commercially available CMS whose origins go back to the late 1980s. CMS are structured in two main parts: (1) the user interface that allows dynamic user interaction, and (2) the underlying computational engine, or kernel, that performs the computations according to the user's instructions. Unlike conventional programming languages, CMS are interpreted instead of compiled, so there is immediate feedback to the user, but some performance penalty is paid. The underlying computational engine is written in the C programming language for these systems, but C coding is unseen by the user. The most recent releases of CMS are fully integrated systems, combining capabilities for data input and export, graphical display, and the capability to link to external programs written in conventional languages such as C or Java using inter‐process communication protocols. The powerful features of CMS, their convenience of use, the need to learn only a limited number of instructions on the part of the user, and the immediate feedback provided to users are features of CMS that make them good candidates for developing agent‐based simulations.

A further distinction can be made among CMS. A subset of CMS are what are called Computational Algebra Systems (CAS). CAS are computational mathematics systems that calculate using symbolic expressions. CAS owe their origins to the LISP programming language, which was the earliest functional programming language [24]. Macsyma (www.scientek.com/macsyma) and Scheme [37] (www.swiss.ai.mit.edu/projects/scheme) are often mentioned as important implementations leading to present day CAS. Typical uses of CAS are equation solving, symbolic integration and differentiation, exact calculations in linear algebra, simplification of mathematical expressions, and variable precision arithmetic. Computational mathematics systems consist of numeric processing systems, symbolic processing systems, or possibly a combination of both. Especially when algebraic and numeric capabilities are combined into a multi‐paradigm programming environment, new modeling possibilities open up for developing sophisticated agent‐based simulations with minimal coding.

Mathematica

Mathematica is a commercially available numeric processing system with enormous integrated numerical processing capability (http://www.wolfram.com). Beyond numeric processing, Mathematica is a fully functional programming language. Unlike MATLAB, Mathematica is also a symbolic processing system that uses term replacement as its primary operation. Symbolic processing means that variables can be used before they have values assigned to them; in contrast, a numeric processing language requires that every variable have a value assigned to it before it is used in the program. In this respect, although Mathematica and MATLAB may appear similar and share many capabilities, Mathematica is fundamentally different from MATLAB, with a different style of programming and ultimately a different set of capabilities applicable to agent‐based modeling.
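
The difference between symbolic and numeric processing can be illustrated with a minimal sketch. The sketch is written in Python rather than Mathematica, and it is only a loose analogy: expressions are modeled as nested tuples of the form (head, arguments...), unbound symbols are plain strings, and the function name replace_all is a hypothetical name chosen for this illustration, not a Mathematica API.

```python
# Expressions are nested tuples: (head, arg1, arg2, ...).
# Strings stand for symbols that have no value yet (an assumption of this sketch).

def replace_all(expr, rules):
    """Apply symbol -> value replacement rules, then evaluate numeric parts."""
    if isinstance(expr, str):
        # A symbol: replace it if a rule exists, otherwise leave it symbolic.
        return rules.get(expr, expr)
    if not isinstance(expr, tuple):
        return expr  # already a number
    head, *args = expr
    args = [replace_all(a, rules) for a in args]
    if all(isinstance(a, (int, float)) for a in args):
        if head == "Plus":
            return sum(args)
        if head == "Times":
            result = 1
            for a in args:
                result *= a
            return result
    # Some argument is still symbolic, so the expression stays unevaluated.
    return (head, *args)

# x + 2*y can be built and manipulated before x or y has a value.
expr = ("Plus", "x", ("Times", 2, "y"))
partial = replace_all(expr, {"y": 3})     # still contains the symbol x
full = replace_all(partial, {"x": 4})     # now fully numeric
```

In a purely numeric language, the first line that uses x without a value would fail; here the expression simply remains partly symbolic until a replacement rule supplies the missing value.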

Mathematica's symbolic processing capabilities allow one to program in multiple programming styles, either as alternatives or in combination, such as functional programming, logic programming, procedural programming, and even object‐oriented programming styles. Like MATLAB, Mathematica is an interpreted language, with the Mathematica kernel, which is written in C, running in the background. In terms of data types, everything is an expression in Mathematica. An expression is a data type with a head and a list of arguments in which even the head of the expression is part of the expression's arguments.

The Mathematica user interface consists of what is referred to as a notebook (Fig. 3). A Mathematica notebook is a fully integrated development environment and a complete publication environment. The Mathematica Application Programming Interface (API) allows programs written in C, Fortran, or Java to interact with Mathematica. The API has facilities for dynamically calling routines from Mathematica as well as calling Mathematica as a computational engine.

Figure 3: Example Mathematica cellular automata model

Figure 3 shows the Mathematica desktop notebook environment. A Mathematica notebook is displayed in its own window. Within a notebook, each item is contained in a cell. The notebook cell structure has underlying coding that is accessible to the user.

In Mathematica, a network representation consists of combining lists of lists, or more generally expressions of expressions, to various depths. For example, in Mathematica, an agent can be represented explicitly as an expression that includes a head named agent, a sequence of agent attributes, and a list of the agent's neighbors. Agent data and methods are linked together by the use of what are called up values.
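
A rough sketch of this representation follows. It is written in Python, not Mathematica, and every name in it (make_agent, define_for, apply_to, the wealth attribute) is hypothetical. The registry keyed by the expression head is only an approximation of how up values attach definitions to the data they act on.

```python
# An agent "expression": (head, attribute..., neighbor_list).
def make_agent(agent_id, wealth, neighbors):
    return ("agent", agent_id, wealth, list(neighbors))

# Stand-in for up values: behaviors are registered under the head of the
# expression they operate on, so agent data and methods stay linked.
BEHAVIORS = {"agent": {}}

def define_for(head, name):
    def register(func):
        BEHAVIORS[head][name] = func
        return func
    return register

@define_for("agent", "wealth")
def wealth(expr):
    return expr[2]

@define_for("agent", "neighbors")
def neighbors(expr):
    return expr[3]

def apply_to(name, expr):
    """Look up a behavior by the head of the expression and apply it."""
    head = expr[0]
    return BEHAVIORS[head][name](expr)

# A small network: lists of lists, nested to the depth the model needs.
population = {
    1: make_agent(1, 10.0, [2, 3]),
    2: make_agent(2, 4.0, [1]),
    3: make_agent(3, 7.0, [1]),
}

def neighbor_wealth(agent_id):
    """Total wealth visible to an agent through its neighbor list."""
    agent = population[agent_id]
    return sum(apply_to("wealth", population[n])
               for n in apply_to("neighbors", agent))
```

The neighbor list inside each agent expression is what turns a flat collection of agents into a network, since each agent only consults the agents named in its own list.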

Example references for agent‐based simulation using Mathematica include Gaylord and Davis [15], Gaylord and Nishidate [16], and Gaylord and Wellin [17]. Gaylord and D'Andria [14] describe applications in social agent‐based modeling.

MATLAB

The MATrix LABoratory (MATLAB) is a numeric processing system with enormous integrated numerical processing capability (http://www.mathworks.com). It uses a scripting‐language approach to programming. MATLAB is a high‐level matrix/array language with control flow, functions, data structures, input/output, and object‐oriented programming features. The user interface consists of the MATLAB Desktop, which is a fully integrated and mature development environment. The MATLAB application programming interface (API) allows programs written in C, Fortran, or Java to interact with MATLAB. There are facilities for calling routines from MATLAB (dynamic linking), for calling MATLAB as a computational engine, and for reading and writing specialized MATLAB files.

Figure 4 shows the MATLAB Desktop environment illustrating the Game of Life, which is a standard MATLAB demonstration. The desktop consists of four standard windows: a command window, which contains a command line and is the primary way of interacting with MATLAB; the workspace, which indicates the values of all the variables currently existing in the session; a command history window, which tracks the entered commands; and the current directory window. Other windows allow text editing of programs and graphical output display.

When it comes to agent‐based simulation, as in most types of coding, the most important indicator of the power of a language for modeling is the extent of and the sophistication of the allowed data types and data structures. As Sedgewick [36] observes:

For many applications, the choice of the proper data structure is really the only major decision involved in the implementation; once the choice has been made only very simple algorithms are needed [36].

The flexibility of data types plays an important role in developing large‐scale, extensible models for agent‐based simulation. In MATLAB the primary data type is the double array, which is essentially a two‐dimensional numeric matrix. Other data types include logical arrays, cell arrays, structures, and character arrays.

For agent‐based simulations that define agent relationships based on networks, connectivity of the links defines the scope of agent interaction and locally available information. Extensions to modeling social networks require the use of more complex data structures than the matrix structure commonly used for grid representations. Extensions from grid topologies to network topologies are straightforward in MATLAB and similarly in Mathematica. In MATLAB, a network representation consists of combining cell arrays or structures in various ways.

Figure 4: Example MATLAB cellular automata model

The MATLAB desktop environment showing the Game of Life demonstration appears in Fig. 4. The Game of Life is a cellular automaton invented by mathematician John Conway that involves live and dead cells in a cellular automaton grid. In MATLAB, the agent environment is a sparse matrix that is initially set to all zeros. Whether cells stay alive, die, or generate new cells depends upon how many of their eight possible neighbors are alive. By using sparse matrices, the calculations required become very simple. Pressing the “Start” button automatically seeds this universe with several small random communities and initiates a series of cell updates. After a short period of simulation, the initial random distribution of live (i. e., highlighted) cells develops into sets of sustainable patterns that endure for generations.
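
The update rule can be sketched in a few lines. The following is a minimal Python illustration, not the code of the MATLAB demonstration: it stores only the coordinates of live cells in a set, which plays a role similar to the sparse matrix, and it assumes an unbounded grid rather than a fixed matrix. It uses Conway's standard rules, in which a live cell survives with two or three live neighbors and a dead cell becomes live with exactly three.

```python
from collections import Counter

def step(live):
    """Advance one generation; `live` is a set of (row, col) live cells."""
    # Count, for every cell adjacent to a live cell, how many of its
    # eight neighbors are alive.
    counts = Counter(
        (r + dr, c + dc)
        for (r, c) in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth with exactly three neighbors; survival with two or three.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A "blinker" oscillates with period two; a 2x2 "block" never changes.
blinker = {(1, 0), (1, 1), (1, 2)}
block = {(0, 0), (0, 1), (1, 0), (1, 1)}
next_blinker = step(blinker)
```

Because only live cells and their immediate neighbors are visited, the cost of each update depends on the number of live cells rather than on the size of the grid, which is the same benefit the sparse representation offers.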

Several agent‐based models using MATLAB have been published in addition to the Game of Life. These include a model of political institutions in modern Italy [3], a model of pair interactions and attitudes [34], a bargaining model to simulate negotiations between water users [43], and a model of sentiment and social mitosis based on Heider's Balance Theory [18,46]. The latter model uses Euler, a MATLAB‐like language. Thorngate argues for the use of MATLAB as an important tool to teach simulation programming techniques [42].

Dedicated Agent‐Based Modeling Languages

Dedicated agent‐based modeling languages are DSL's that are designed to specifically support agent‐based modeling. Several such languages currently exist. These languages are functionally differentiated by the underlying assumptions their designers made about the structures of agent‐based models. The designers of some of these languages assume quite a lot about the situations being modeled and use this information to provide users with pre‐completed or template components. The designers of other languages make comparatively fewer assumptions and encourage users to implement a wider range of models. However, more work is often needed to build models in these systems. This article will discuss two selected examples, namely NetLogo and the visual interface for Repast Simphony.

NetLogo

NetLogo is an education‐focused ABM environment (Wilensky [48]). The NetLogo language uses a modified version of the Logo programming language (Harvey [19]). NetLogo itself is Java‐based and is free for use in education and research. More information on NetLogo and downloads can be found at http://ccl.northwestern.edu/netlogo/.

NetLogo was originally designed to provide a basic computational laboratory for teaching complex adaptive systems concepts, but it can be used to develop a wider range of applications. NetLogo provides a graphical environment to create programs that control graphic ‘turtles’ that reside in a world of ‘patches’ that is monitored by an ‘observer’. NetLogo's DSL is limited to its turtle and patch paradigm. However, NetLogo models can be extended using Java to provide for more general programming capabilities. An example NetLogo model of an ant colony [48] (center) feeding on three food sources (upper left corner, lower left corner, and middle right) is shown in Fig. 5. Example code [48] from this model is shown in Fig. 6.
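
The turtle, patch, and observer paradigm can be sketched outside NetLogo. The Python sketch below is not NetLogo code and does not reproduce the ant colony model of Figs. 5 and 6. All class and method names (Patch, Turtle, World, ask_turtles) and all parameter values are assumptions made for illustration. Turtles wander randomly over a wrapped grid of patches and pick up food where they find it, while the world object plays the observer's role of issuing commands and monitoring totals.

```python
import random

class Patch:
    """A stationary grid cell that can hold food."""
    def __init__(self, food=0):
        self.food = food

class Turtle:
    """A mobile agent that moves between patches."""
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.carried = 0

    def step(self, world, rng):
        # Move one patch in a random direction; the world wraps at the edges.
        dx, dy = rng.choice([(0, 1), (1, 0), (0, -1), (-1, 0)])
        self.x = (self.x + dx) % world.size
        self.y = (self.y + dy) % world.size
        patch = world.patches[(self.x, self.y)]
        if patch.food > 0:
            patch.food -= 1
            self.carried += 1

class World:
    """Plays the observer role: owns patches and turtles and commands them."""
    def __init__(self, size, n_turtles, seed=0):
        self.size = size
        self.rng = random.Random(seed)
        self.patches = {(x, y): Patch()
                        for x in range(size) for y in range(size)}
        center = size // 2
        self.turtles = [Turtle(center, center) for _ in range(n_turtles)]

    def ask_turtles(self):
        for turtle in self.turtles:
            turtle.step(self, self.rng)

    def total_food(self):
        on_patches = sum(p.food for p in self.patches.values())
        carried = sum(t.carried for t in self.turtles)
        return on_patches + carried

world = World(size=11, n_turtles=20)
for source in [(0, 0), (0, 10), (10, 5)]:   # three food sources
    world.patches[source].food = 5
for _ in range(200):
    world.ask_turtles()
```

A dedicated language such as NetLogo supplies these roles as built‐in constructs, which is why models that fit the paradigm take far less code there than in a general purpose language.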

Figure 5: Example NetLogo ant colony model [48]

Figure 6: Example NetLogo code from the ant colony model [48]

Repast Simphony Visual Interface

The Recursive Porous Agent Simulation Toolkit (Repast) is a free and open source family of agent‐based modeling and simulation platforms (ROAD [44]). Information on Repast and free downloads can be found at http://repast.sourceforge.net/. Repast Simphony (Repast S) is the newest member of the Repast family [30,32]. The Java‐based Repast S system includes advanced features for specifying, executing, and analyzing agent‐based simulations. Repast Simphony offers several methods for specifying agents and agent environments including visual specification, specification with the dynamic object‐oriented Groovy language [22], and specification with Java. In principle, Repast S's visual DSL can be used for any kind of programming, but models beyond a certain level of complexity are better implemented in Groovy or Java. As discussed later, Groovy and Java are general purpose languages. All of Repast S's languages can be fluidly combined in a single model. An example Repast S flocking model is shown in Fig. 7 [33]. The visual specification approach uses a tree to define model contents (Fig. 8 middle panel with “model.score” header) and a flowchart to define agent behaviors (Fig. 8 right side panel with “SearchExampleAgent.agent” header). In all cases, the user has a choice of a visually rich point‐and‐click interface or a ‘headless’ batch interface to execute models.

Figure 7: Example Repast Simphony flocking model [33]

Figure 8: Example Repast Simphony visual behavior from the flocking model [33]

General Languages

Unlike DSL's, general languages are designed to take on any programming challenge. However, in order to meet this challenge they are usually more complex than DSL's. This tends to make them more difficult to learn and use. Lahtinen et al. [23] document some of the challenges users face in learning general purpose programming languages. Despite these issues, general purpose programming languages are essential for allowing users to access the full capabilities of modern computers. Naturally, there are a huge number of general purpose programming languages. This article considers these options from two perspectives. First, general language toolkits are discussed. These toolkits provide libraries of functions to be used in a general purpose host language. Second, the use of three raw general purpose languages, namely Java, C#, and C++, is discussed.

General Language Toolkits

As previously stated, general language toolkits are libraries that are intended to be used in a general purpose host language. These toolkits usually provide model developers with software for functions such as simulation time scheduling, results visualization, results logging, and model execution as well as domain‐specific tools [31]. Users of raw general purpose languages have to write all of the needed features by hand.
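
To give a sense of what a toolkit saves the modeler from writing, the following Python sketch shows a bare‐bones version of two of these functions, simulation time scheduling and results logging. It is not taken from any toolkit. The Scheduler class, the make_agent helper, and the chosen intervals are all hypothetical, and real toolkits provide far richer scheduling than this priority queue of timed actions.

```python
import heapq
import itertools

class Scheduler:
    """A minimal discrete-event scheduler: actions run in time order."""
    def __init__(self):
        self.time = 0.0
        self._queue = []
        # The counter breaks ties so that simultaneous actions run in the
        # order they were scheduled and actions are never compared directly.
        self._order = itertools.count()

    def schedule(self, at, action):
        heapq.heappush(self._queue, (at, next(self._order), action))

    def run(self, until):
        while self._queue and self._queue[0][0] <= until:
            self.time, _, action = heapq.heappop(self._queue)
            action(self)

log = []  # results logging: (time, agent name) for every action taken

def make_agent(name, interval):
    """Build an action that logs itself and reschedules at a fixed interval."""
    def act(sched):
        log.append((sched.time, name))
        sched.schedule(sched.time + interval, act)
    return act

sched = Scheduler()
sched.schedule(1.0, make_agent("a", 2.0))
sched.schedule(1.0, make_agent("b", 3.0))
sched.run(until=6.0)
```

Even this small sketch has to deal with tie‐breaking and stopping conditions, which hints at why ready‐made scheduling is one of the main attractions of a toolkit.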

A wide range of general language toolkits currently exist. This article will discuss two selected examples, namely Swarm and the Groovy and Java interfaces for Repast Simphony.

Swarm

Swarm [28] is a free and open source agent‐based modeling library. Swarm seeks to create a shared simulation platform for agent modeling and to facilitate the development of a wide range of models. Users build simulations by incorporating Swarm library components into their own programs. Information on Swarm and free downloads can be found at http://www.swarm.org/. Marcus Daniels [9] describes Swarm as follows:

Swarm is a set of libraries that facilitate implementation of agent‐based models. Swarm's inspiration comes from the field of Artificial Life. Artificial Life is an approach to studying biological systems that attempts to infer mechanism from biological phenomena, using the elaboration, refinement, and generalization of these mechanisms to identify unifying dynamical properties of biological systems … To help fill this need, Chris Langton initiated the Swarm project in 1994 at the Santa Fe Institute. The first version was available by 1996, and since then it has evolved to serve not only researchers in biology, but also anthropology, computer science, defense, ecology, economics, geography, industry, and political science.

The Swarm simulation system has two fundamental components. The core component runs general‐purpose simulation code written in Objective-C, Tcl/Tk, and Java. This component handles most of the behind the scenes details. The external wrapper components run user‐specific simulation code written in either Objective-C or Java. These components handle most of the center stage work. An example Swarm supply chain model is shown in Fig. 9.

Figure 9: Example Swarm supply chain model

Repast Simphony Java and Groovy

As previously discussed, Repast is a free and open source family of agent‐based modeling and simulation platforms (ROAD [44]). Information on Repast and free downloads can be found at http://repast.sourceforge.net/. Repast S is the newest member of the Repast family [30,32]. The Java‐based Repast S system includes advanced features for specifying, executing, and analyzing agent‐based simulations. An example Repast S flocking model is shown in Fig. 7 [33].

Repast Simphony offers several intermixable methods for specifying agents and agent environments including visual specification, specification with the dynamic object‐oriented Groovy language [22], and specification with Java. The Groovy approach is shown in Fig. 10. The Java approach for an example predator‐prey model is shown in Fig. 11.

Figure 10: Example Repast Simphony Groovy code from the flocking model in Fig. 7 [33]

Figure 11: Example Repast Simphony Java code for a predator‐prey model [41]

Java

Java [12] is a widely used object‐oriented programming language that was developed and is maintained by Sun Microsystems. Java is known for its widespread ‘cross‐platform’ availability on many different types of hardware and operating systems. This capability comes from Java's use of a ‘virtual machine’ that allows binary code or ‘bytecode’ to have a consistent execution environment on many different computer platforms. A large number of tools are available for Java program development including the powerful Eclipse development environment [11] and many supporting libraries. Java uses reflection and dynamic method invocation to implement a variant of higher‐order programming. Reflection is used for runtime class structure examination while dynamic method invocation is used to call newly referenced methods at runtime. Java's object‐orientation, cross‐platform availability, reflection, and dynamic method invocation, along with newer features such as annotations for including metadata in compiled code, generics for generalizing classes, and aspects to implement dispersed but recurrent tasks, make it a good choice for agent‐based model development.
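
The two mechanisms can be illustrated with a short sketch. For consistency with the other examples it is written in Python rather than Java, so it is an analogy only: Python's inspect module and getattr function stand in for Java's reflection facilities in the java.lang.reflect package. The Prey class and the helper names behaviors_of and invoke are hypothetical.

```python
import inspect

class Prey:
    """A toy agent whose behaviors are discovered at runtime."""
    def __init__(self):
        self.energy = 5

    def graze(self):
        self.energy += 1
        return self.energy

    def flee(self):
        self.energy -= 2
        return self.energy

def behaviors_of(obj):
    """Reflection: examine the object's class structure at runtime."""
    return sorted(
        name
        for name, _ in inspect.getmembers(obj, inspect.ismethod)
        if not name.startswith("_")
    )

def invoke(obj, method_name):
    """Dynamic invocation: call a method chosen by name at runtime."""
    return getattr(obj, method_name)()

prey = Prey()
names = behaviors_of(prey)
# The sequence of behaviors is plain data, so a scheduler or a model
# configuration file could supply it without the caller knowing the class.
results = [invoke(prey, name) for name in ["graze", "graze", "flee"]]
```

This is the sense in which reflection supports a variant of higher‐order programming: behaviors become values that a framework can look up, store, and call on the modeler's behalf.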

C#

C# [1] is an object‐oriented programming language that was developed and is maintained by Microsoft. C# is one of many languages that can be used to generate Microsoft .NET Framework code or Common Intermediate Language (CIL). Like Java bytecode, CIL is run using a ‘virtual machine’ that potentially gives it a consistent execution environment on different computer platforms. A growing number of tools are emerging to support C# development. C#, and the Microsoft .NET Framework more generally, are in principle cross platform, but in practice they are mainly executed under Microsoft Windows.

The Microsoft .NET Framework provides for the compilation into CIL of many different languages such as C#, Managed C++, and Managed Visual Basic to name just a few. Once these languages are compiled to CIL, the resulting modules are fully interoperable. This allows users to conveniently develop integrated software using a mixture of different languages. Like Java, C# supports reflection and dynamic method invocation for higher‐order programming. C#'s object‐orientation, multi‐lingual integration, generics, attributes for including metadata in compiled code, aspects, reflection, and dynamic method invocation make it well suited for agent‐based model development, particularly on the Microsoft Windows platform.

C++

C++ is a widely used object‐oriented programming language that was created by Bjarne Stroustrup (Stroustrup [39]) at AT&T. C++ is widely noted for both its object‐oriented structure and its ability to be easily compiled into native machine code. C++ gives users substantial access to the underlying computer but also requires substantial programming skills.

Most C++ compilers are actually more properly considered C/C++ compilers since they can compile non‐object‐oriented C code as well as object‐oriented C++ code. This gives sophisticated users the opportunity to highly optimize selected areas of model code. However, this also opens the possibility of introducing difficult‐to‐resolve errors and hard‐to‐maintain code. It is also more difficult to port C++ code from one computer architecture to another than it is for virtual machine‐based languages such as Java.

C++ can use a combination of Runtime Type Identification (RTTI) and function pointers to implement higher‐order programming. Similar to the Java approach, C++ RTTI can be used for runtime class structure examination while function pointers can be used to call newly referenced methods at runtime. C++'s object‐orientation, RTTI, function pointers, and low‐level machine access make it a reasonable choice for the development of extremely large or complicated agent‐based models.

Future Directions

Future developments in computer languages could have enormous implications for the development of agent‐based modeling. Some of the challenges of agent‐based modeling for the future include (1) scaling up models to handle large numbers of agents running on distributed heterogeneous processors across the grid, (2) handling the large amounts of data generated by agent models and making sense out of it, and (3) developing user‐friendly interfaces and modular components in a collaborative environment that can be used by domain experts with little or no knowledge of standard computer coding techniques. Visual and natural language development environments that can be used by non‐programmers are continuing to advance but remain to be proven at reducing the programming burden. There are a variety of next steps for the development of computer languages for agent‐based modeling including the further development of DSL's; increasing visual modeling capabilities; and the development of languages and language features that better support pattern‐based development. DSL's are likely to become increasingly available as agent‐based modeling grows into a wider range of domains. More agent‐based modeling systems are developing visual interfaces for specifying model structures and agent behaviors. Many of these visual environments are themselves DSL's. The continued success of agent‐based modeling will likely yield an increasing number of design patterns. Supporting and even automating implementations of these patterns may form a natural source for new language features. Many of these new features are likely to be implemented within DSL's.