Keywords

1 Introduction

Agent-based modelling and simulation is an active field of study which, for example, may offer methods for solving complex optimization problems. In this setting, agents are required to cooperate to solve the problem at hand. In complex systems with adjustable agent autonomy, sophisticated planning can often be replaced by norms; see for example [21]. The study of norm-regulated multi-agent systems, often referred to as normative MAS, has also attracted a lot of attention. The NorMAS roadmap [3] is a comprehensive introduction to and overview of the field. The combination of agent-based modelling and simulation and normative MAS is a promising field of study [4].

It is often desirable to replace planning (and replanning), since it may be a complex and time-consuming task, especially in collaborative environments. On the other hand, designing good normative systems is also a challenge. The approach suggested here, whose basic ideas were outlined in [20, pp. 164f], is to use evolutionary mechanisms, employed in a genetic algorithm, to aid the ‘off-line’ (i.e., pre-runtime) design of normative systems for problem-solving multi-agent systems based on the Dalmas architecture for norm-regulated MAS. The normative framework of a Dalmas is based on an algebraic version of the Kanger-Lindahl theory of normative positions, which is well suited as the logical foundation for normative systems in a MAS context, since the types of normative positions are mutually exclusive and jointly exhaustive in the logical sense.

The paper is structured as follows. Section 1.2 briefly introduces the algebraic version of the theory of normative positions, and in Sect. 1.3, previous work on the Dalmas architecture is presented. Section 2 introduces an example Dalmas which will be used in Sect. 3 to demonstrate how to employ evolutionary mechanisms in the process of designing norms, by applying an evolutionary algorithm to this example. Section 4 concludes and suggests some lines of future work.

1.1 Related Work

The runtime emergence of norms within artificial social systems has attracted the attention of many researchers; see, e.g., [2]. However, evolving normative systems as part of the process of designing norm-regulated MAS is not as well studied, although evolutionary approaches for learning behaviour patterns or strategies for coordination have been successfully used in, e.g., the RoboCupFootnote 1 domain; see for example [6, 17, 18]. In fact, the simple decision policies evolved by Di Pietro et al. for the RoboCup Keepaway game can be regarded as simple normative systems consisting of production rules which prescribe certain behaviours in certain situations.

Table 1. One-agent types of normative positions.

1.2 One-Agent Types of Normative Positions

The Kanger-Lindahl theory of normative positions is based on Kanger’s ‘deontic action-logic’; see for example [14]. The theory, further developed by Lindahl [15], contains three systems of types of normative positions, based on the logic of the action operator \(\mathrm {Do}\) and the deontic operator \(\mathrm {Shall}\). The simplest of these systems is a system of seven ‘one-agent types’ of normative positions. \(\mathrm {Do}(x,F)\) is commonly read as ‘x sees to it that F’ or ‘x brings it about that F’, where F is a proposition regarding some state of affairs. The logical properties assumed for \(\mathrm {Do}\) is that it is the smallest system containing propositional logic, closed under logical equivalence and containing the axiom schema \(\mathrm {Do}(x,F)\rightarrow F\), which tries to capture the notion of successful action; if x ‘sees to it’ or ‘brings about’ that F, then F is indeed the case.

Each of the three statements

  1. (i)

    \(\mathrm {Do}(x,F)\),

  2. (ii)

    \(\mathrm {Do}(x,\lnot \) F), and

  3. (iii)

    \( \lnot \mathrm {Do}(x,F)\, \& \,\lnot \mathrm {Do}(x,\lnot \) F),

implies the negation of each of the others, and the disjunction of all three is a tautology. Each of (i) – (iii) can be prefixed with either \(\mathrm {May}\) or \(\lnot \mathrm {May}\), where \(\mathrm {May}\,F\) is defined as \(\lnot \mathrm {Shall}\lnot {F}\), and basic conjunctions containing one statement from each such pair can be formed. By iterated construction of basic conjunctions, a set of eight conjunctions (of which one is self-contradictory) is obtained. The consistent ‘maxi-conjunctions’ are listed in Table 1.

In a series of papers, comprehensively summarized in [16], Lindahl and Odelstad have combined the theory of normative positions with an algebraic approach to normative systems. Their idea is to use the one-agent types of normative positions as operators on descriptive conditions to get deontic conditions. A \(\nu \)-ary condition d can be true or false of \(\nu \) agents \(x_{1},...,x_{\nu }\). Thus, \(d(x_{1},...,x_{\nu })\) is a state of affairs which may be true or false. (To facilitate the presentation, \(X_{\nu }\) will often be used as an abbreviation for the argument sequence \(x_{1} ,...,x_{\nu }\).) In the special case when the sequence of agents is empty, i.e. \(\nu =0\), d represents a proposition which may be true or false. Note that negations \({d}^{\prime }\), conjunctions \((c\wedge d)\), and disjunctions \((c\vee d)\) can be formed in the following way:

$$\begin{aligned}&{d}^{\prime }{(}X_{\nu })\,\text {iff}\,\lnot {d(}X_{\nu }),\\&(c\wedge d)(X_{\nu })\,\text {iff}\,[c(X_{p})\,\text {and}\,d(X_{q})],\,\text {and}\\&(c\vee d)(X_{\nu })\,\text {iff}\,[c(X_{p})\,\text {or}\,d(X_{q})], \end{aligned}$$

where \(\nu =\max (p,q)\).Footnote 2 Therefore, it is possible to construct Boolean algebras of conditions. A Boolean algebra together with an implicative relation R fulfilling certain conditions, forms a so-called Boolean quasiordering (Bqo). As an application of their Theory of Joining-Systems (TJS), Lindahl and Odelstad define the notion of a normative position condition-implication structure, abbreviated np-cis, which is based on Bqo’s on descriptive and deontic conditions, so-called cis-Bqo’s. For details on Boolean quasiorderings, condition implication structures and np-cis’es, see for example [16] or [20].

1.3 Previous Work

Dalmas [20] is an abstract architecture for a class of norm-regulated multi-agent systems. A deterministic Dalmas is a simple multi-agent system in which the actions of an agent are connected to transitions between system states. In a deterministic Dalmas the agents take turns to act; only one agent at a time may perform an action. By allowing ‘do nothing’ actions and accelerating the turn-taking, systems with close to asynchronous behaviour can be obtained. A special kind of Dalmas is the norm-regulated simple deterministic Dalmas, which employs what is often referred to as ‘negative permission’, by letting its deontic structure (i.e., the set of permissible acts) consist of all acts that are not explicitly prohibited by a normative system \(\mathcal {N}\). The Dalmas’s preference structure consists of the most preferable (according to the agent’s utility function) of the acts in the deontic structure. In short, a Dalmas agent’s behaviour is regulated by the combination of a normative system and a utility function; this ‘agent oeconomicus norma’Footnote 3 chooses the most desirable act, according to the utility function, within the ‘room for manouver’ determined by the norms. The Dalmas’s normative framework is based on an algebraic version of the Kanger-Lindahl theory of normative positions, in which normative consequences are formulated by applying normative operators to descriptive conditions. (See Sect. 1.2.) From these general normative conditions follow normative sentences regarding specific states of affairs, which in turn result in permission or prohibition of individual actions in specific situations. (See for example [19, 20] for an introduction.) Hence, the norms in the Dalmas architecture play a different role, and is represented in a fundamentally different way, than, e.g., the decision rules in the RoboCup setting (see Sect. 1.1).

Table 2. ‘Reduced extended’ types of one-agent normative positions.

Since the agents in a deterministic Dalmas take turns to act, each individual step in a run of a Dalmas may be characterized by an ordered 5-tuple \(S=\langle x,s,A,\varOmega ,S\rangle \) whose components are a set of states S, a state s, an agent-set \(\varOmega =\{x_{1},...,x_{n}\}\), the acting (‘moving’) agent x, and an action-set \(A=\{a_{1},...,a_{m} \}\).Footnote 4 In this setting, a may be regarded as a function such that \(a(x,s)=s^{+}\) means that \(s^{+}\) is the resulting state when x performs act a in state s. In the following, the abbreviation \(s^{+}\) will be used for a(xs) when there is no need for an explicit reference to the action a and the acting agent x. As already mentioned, there is no simultaneous action by other agents (including the ‘environment’, which may be regarded as a special kind of agent). Furthermore, we assume that a \(\nu \)-ary condition d is true or false on \(\nu \) agents \(x_{1},...,x_{\nu }\in \) \(\varOmega \) in s; with the abbreviation \(X_{\nu }\) for the agent sequence, this will be written \(d(X_{\nu };s)\).

Let the situation \(\langle x,s\rangle \) be characterized by the moving agent x and the state s in a norm-regulated simple deterministic Dalmas. It is possible in the Dalmas architecture to distinguish between the moving agent and the agent to which normative condition appliesFootnote 5, but to facilitate the presentation it is assumed in the sequel that norms always apply to the moving agent x in a situation \(\langle x,s\rangle \). A norm in \(\mathcal {N} \) is represented by an ordered pair \(\langle c,Nd\rangle \), where the (descriptive) condition c on a situation \(\langle x,s\rangle \) is the ground of the norm and the (normative) condition Nd on \(\langle x,s\rangle \) is its consequence; see, e.g., [20]. Nd is formed by applying a ‘norm-creating’ operator N to the descriptive condition d.

Table 3. Basic transition types.

In the following, the normative framework of the Dalmases employed is based on the notion of an np9-cis [12, Sect. 2.2.1], a structure similar to the np-cis defined by Lindahl and Odelstad, but based on the ‘reduced extended’ set of types of normative positions shown in Table 2. It is argued in [12] that a semantics for the normative framework of a Dalmas can be formed by defining a set of ‘transition type operators’ \(C_{k}^{a}\), \(k\in \{1,2\varLambda ,2\varOmega ,4\varLambda ,4\varOmega ,5,6\varLambda ,6\varOmega ,7\}\), based on Table 4, and a set of corresponding ‘transition type prohibition operators’ \(P_{k}\), such that \(P_{k}d(X_{\nu };x,s)\) is intended to mean that if \(C_{k}^{a}d(X_{\nu };x,s)\) holds, then a is prohibited for x in \(\langle x,s\rangle \). In effect, \(P_{k}d(X_{\nu };x,s)\) implies a prohibition of zero, one or two of the four ‘basic transition types’ (see Table 3, where \(s^{+}\) refers to the resulting state when the acting agent performs its act) with regard to the state of affairs \(d(X_{\nu })\). For example, \(\langle c,P_{k}d\rangle \), where c and d can have different arity, represents the sentence

$$\begin{aligned} \forall x_{1},x_{2},...,x_{\nu }\in \varOmega :c(x_{1},x_{2},...,x_{p};x,s)\rightarrow P_{k}d(x_{1},x_{2},...,x_{q};x,s) \end{aligned}$$

where \(\varOmega \) is the set of agents, x is the acting agent (to which the norm applies) in the situation \(\langle x,s\rangle \), and \(\nu =\max (p,q)\). If the condition specified by the ground of a norm for some agents in some situation, then the (normative) consequence of the norm is in effect in that situation. If the normative system \(\mathcal {N}\) contains a norm whose ground holds in the situation \(\langle x,s\rangle \) and whose consequence prohibits the type of transition represented by x performing action a, then a is prohibited for x in \(\langle x,s\rangle \):

$$ \begin{aligned} \begin{array}{cc} Prohibited_{x,s}(a)\,\text {according to}\,\mathcal {N}&{}\\ \text {if there is a}\,p\text {-ary condition}\,c&{}\\ \text {and a}\,q\text {-ary condition}\,d &{}\\ \text {and a}\,k\in \{1,2\varLambda ,2\varOmega ,4\varLambda ,4\varOmega ,5,6\varLambda ,6\varOmega ,7\},&{}\\ \text {such that}\,\langle c,P_{k}d\rangle \,\text {is a norm in}\,\mathcal {N},&{}\\ \text {and there are}\,x_{1},...,x_{\nu }\,\text {such that}&{} \\ c(x_{1},...,x_{p};x,s)\, \& \,C_{k}^{a}d(x_{1},...,x_{q};x,s),&{}\\ \text {where}\,\nu =\max (p,q). &{} \end{array} \end{aligned}$$
Table 4. Transition type prohibition operators and transition type conditions.
Table 5. Possible changes of \(Lap_n\).

Hence, if \(c(x_{1},...,x_{p};x,s)\) for some sequence of agents \(x_{1},...,x_{\nu }\), then the normative condition \(P_{k}d(x_{1},...,x_{q} ;x,s)\) is ‘in effect’. Thus, if \(C_{k}^{a}d(x_{1},...,x_{q};x,s)\) holds, then a is prohibited for x in s. (Cf. the examples in Sect. 3.1.) Table 4 contains the nine norm-building operators \(P_{k}\), together with the corresponding \(C_{k}^{a}\) operators and the result of applying them to \(d(x_{1},...,x_{q};s)\); cf. Table VI in [11].

A general-level Java/Prolog implementation of the Dalmas architecture has been developed, to facilitate the implementation of specific systems. The Colour & Form system, the Waste-collector system and the Forest Cleaner system are three specific systems that have been implemented using this framework. The reader is referred to [7, 8, 13, 20] for a description of these systems and their instrumentalizations.

The approach to normative systems employed in this framework is ideally suited for evolution of normative systems, since the nine ‘reduced extended’ types of normative positions are mutually exclusive and jointly exhaustive in the logical sense. Therefore each conceivable normative system, consisting of conditional norms based on descriptive conditions selected from a set of potential grounds and normative conditions selected from a set of potential consequences, could become a candidate for evaluation in the execution of an evolutionary algorithm. This idea will be further explored in the following sections.

Fig. 1.
figure 1

Overlap of the protected spheres for three agents.

2 Example: Explorer DALMAS

Let us consider a class of systems of agents operating in an environment consisting of a grid of squares ordered in rows and columns, in which each square is assigned a pair of integer coordinates. Let us assume that the joint goal of the agents is to explore as much as possible of the grid using a fixed number of moves. An agent can stay in the current square, i.e., do nothing, or move one square in one of four directions (east, north, west, south) as long as it stays within the boundaries of the grid. In other words, in a given situation, an action is feasible if and only if it does not move the agent off limits. It should of course be noted that these simple systems (in the following referred to as Explorer Dalmases) in themselves are of limited interest, but the idea here is to illustrate how evolutionary mechanisms could be used in the process of designing normative systems for problem-solving MAS.

To simulate a situation with limited possibilites for communication between agents and only local knowledge of the environment, we further assume that an agent only knows the status (visited or unvisited) of the immediately surrounding squares, and the location of other agents within two squares. An agent’s preference is represented by a very simple utility function such that moving to an unvisited square is preferred over moving to a visited square, and stay is the least preferred action. In the case of a tie between equally preferred actions, one of them is randomly selected. This means that all agents have the same utility function.

To make the situation more concrete, let us assume that the size of the grid is \(c\times r\) squares and place three agents at square (1, 1), the leftmost lowest square. Note that this system can be considered as an instance of the Waste-collector system [13, 20], in which visited (resp., unvisited) squares are represented by 0 (resp., 1) units of ‘waste’. The higher number of ‘waste’ carried by an agent, the higher number of unvisited squares have been entered by that agent. It would not be a very difficult task to design a plan where the agents take turns to act in such a way that all remaining \(cr-1\) squares are visited in \(cr-1\) moves. But if the environment gets changed, e.g., is resized or reshaped, the plan must be recalculated. What if we let norms replace plans in this class of environments? Let us investigate the interplay between the agents’ utility functions, representing their ‘desires’, and a normative system which determines their ‘room for manouvre’. One idea is to base norms on the spatial relationship between the agents, potentially restricting how the agents may move in the proximity of other agents. We define the condition \(Lap_{n}\), \(n\in \{0,1,2,3,4,6,9\}\), with the intended meaning that \(Lap_{n}(x_{i} ,x_{j};s)\) holds if and only if the protected spheres of agents \(x_{i}\) and \(x_{j}\) overlap with n squares in a state s. The protected sphere consists of the agent’s square plus the eight surrounding squares; see Fig. 1, which illustrates a state in which \(Lap_{2}(x_{1},x_{2})\), \(Lap_{1}(x_{2},x_{3})\), and \(Lap_{0}(x_{1},x_{3})\) holds. Table 5 shows how the overlap can change from one state to another, given the five available actions. Note that \(Lap_{n}(x_{i},x_{j})\) implies \(x_{i}=^{\prime }x_{j}\) for \(n<9\), and \(x_{i}=x_{j}\) implies \(Lap_{9}(x_{i},x_{j})\). Furthermore, \(Lap_{n}(x_{i},x_{j})\) implies \(\lnot Lap_{m}(x_{i},x_{j})\) for \(n\ne m\). In other words, \(Lap_{n}\,R=^{\prime }\) for \(n<9\), \(=R\,Lap_{9}\), and \(Lap_{n}\,R\,Lap_{m}^{\prime }\) for \(n\ne m\), where \(d^{\prime }\) is the negation of the condition d and R is the implicative relation on conditions which defines the cis-Bqo’s of grounds and consequences of the normative system.

Now let the ‘elementary’ conditions \(Lap_{0}\), \(Lap_{1}\), \(Lap_{2}\), \(Lap_{3} \), \(Lap_{4}\), \(Lap_{6}\), together with the ‘non-elementary’ condition \((=^{\prime }\wedge Lap_{9})\), form a set of potential descriptive grounds for conditional norms. The set of potential normative consequences corresponding to each ground is constructed by applying the norm-building operators \(P_{1}\), \(P_{2\varLambda }\), ..., \(P_{7}\) (see Sect. 1.3) to the conditions listed in the corresponding rows in Table 5. Thus, the potential consequences for, e.g., \(Lap_{1}\) are \(P_{1}Lap_{0}\),..., \(P_{7}Lap_{0}\), \(P_{1}Lap_{1}\),..., \(P_{7}Lap_{1} \), and \(P_{1}Lap_{2} \),..., \(P_{7}Lap_{2}\). Note that it would be meaningless to, e.g., let \(P_{i}Lap_{4}\) be a potential consequence for \(Lap_{0}\), since none of the available acts can change the state of the system in such a way that \(Lap_{0}(x_{i},x_{j})\) holds in one state and \(Lap_{4}(x_{i},x_{j})\) holds in the next state.

With these building blocks available, normative systems for Explorer Dalmases can be constructed. Let us employ the following scheme: For each condition c in the leftmost column of Table 5, one norm \(\langle M_{1}c,P_{i}d\rangle \) is added to the normative system for each condition d in the rightmost column.Footnote 6 E.g., for \(Lap_{0}\) we add four norms: \(\langle M_{1}Lap_{0},P_{k_{0}}Lap_{0}\rangle \), \(\langle M_{1}Lap_{0},P_{k_{1}}Lap_{1}\rangle \), \(\langle M_{1}Lap_{0} ,P_{k_{2}}Lap_{2}\rangle \), and \(\langle M_{1}Lap_{0},P_{k_{3}}Lap_{3}\rangle \). Note that, as regards the ground \((=^{\prime }\wedge Lap_{9})\), one of \(\langle M_{1}(=^{\prime }\wedge Lap_{9}),P_{k_{0}}Lap_{9}\rangle \) and \(\langle M_{1}(=^{\prime }\wedge Lap_{9}),P_{k_{1}}Lap_{6}\rangle \) is redundant, and can therefore be removed.Footnote 7 This gives a total of 24 norms. Note, however, that not all normative systems formed in this way are coherent. To begin with, some sets of rules may be contradictory, according to the intended meaning of the \(P_{i}\) operators, but the problem of coherence (sometimes referred to as ‘absence of conflicts’) cannot simply be reduced to logical consistency; see for example [1]. We will return to this issue in Sect. 3.1.

We would now like to find the best normative system, i.e., the normative system that, together with the simple utility function described earlier, on average makes the Explorer system most efficient. The following measure of efficiency will be employed: the normative system is applied to three different Explorer Dalmases, operating on grids of (almost) equal sizes but different shapes: \(6\times 8\) squares, \(7\times 7\) squares, and \(10\times 5\) squares, respectively. On each grid, three agents are initially placed on square (1, 1). A k-event run of each of these three systems will be performed, where k is the number of unvisited squares from the beginning, i.e., \(k=cr-1\). For each run, the ratio between the total number of visited squares and the total number of unvisited squares in the beginning is calculated. If the normative system is not coherent, in the sense that, at some point during the run, all actions (including stay) become prohibited for the acting agent, then the evaluation score is set to 0. The score of the normative system under evaluation is then the average of the three ratios obtained. We have now obtained an optimization problem which may be solved with the help of an evolutionary algorithm.

3 Evolution of Explorer Norms

Evolutionary algorithms (EA), being a subfield of evolutionary computation, use the principles of biological evolution (such as reproduction, mutation, recombination, and selection) to solve problems on computers. For a comprehensive introduction to this field the reader is referred to, e.g., [22]. In the Explorer Dalmas setting, there is some randomness in the agents’ choices of actions, and in such ‘noisy’ domains, evolutionary algorithms are known to work well [5]. We thus implement a basic genetic algorithm (one of the most common forms of EAs) for Explorer Dalmas norms:

  1. 1.

    Genesis. Create an initial population of n candidate normative systems, half of which are entirely randomly generated and half of which consist of \(P_{1} \)-consequences (the most permissive consequences) only. Each candidate is represented by a character string consisting of 24 characters, one for each norm, from \(\{\)‘1’, ..., ‘9’\(\}\), where ‘1’ represents \(P_{1}\), ‘2’ represents \(P_{2\varLambda }\), ‘3’ represents \(P_{2\varOmega }\), ‘4’ represents \(P_{4\varLambda }\), etc.

  2. 2.

    Evaluation. Evaluate each member of the population, by translating the character string to a normative system according to the scheme presented in Sect. 2, running three different systems regulated by this normative system and using as fitness function the average of the evaluation scores of the three runs. For example, a ‘chromosome’ starting with “371529...” is translated to the following normative system:

    \(\{\langle M_{1}({=^{\prime }\!\!\wedge Lap_{9} }),P_{2\varOmega }Lap_{9}\rangle \), \(\langle M_{1}Lap_{6},P_{6\varLambda }Lap_{9} \rangle \), \(\langle M_{1}Lap_{6},P_{1}Lap_{6}\rangle \), \(\langle M_{1}Lap_{6},P_{4\varOmega }Lap_{4}\rangle \), \(\langle M_{1}Lap_{6},P_{2\varLambda }Lap_{3}\rangle \), \(\langle M_{1}Lap_{4},P_{7}Lap_{6}\rangle \), \(\ldots \}\)

  3. 3.

    Survival of the Fittest. Select a number of members of the evaluated population, favouring those with higher fitness scores, to be the parents of the next generation.

  4. 4.

    Evolution. Generate a new population of offspring by randomly altering and combining elements of the parent candidates. The evolution is performed by the two basic evolutionary operators cross-over and mutation.

  5. 5.

    Iteration. Repeat steps 2–4 until the termination condition (see Table 6) is met.

The evolutionary algorithm was implemented using the Java-based Watchmaker framework for evolutionary computationFootnote 8 together with a slightly adapted Java/Prolog implementation of the Waste-collector system [7, 13].Footnote 9 The latter was used in step 2 to perform the k-event runs of Explorer systems to be evaluated.

3.1 Result

The algorithm was run with the parameter values shown in Table 6; the execution time on an ordinary laptop was 5–6 h. The graph in Fig. 2 shows the fitness values (evaluation scores) of the best normative system, as well as the average fitness values, in each generation. We can see that, initially, the best fitness (which is obtained by a normative system with \(P_{1} \)-consequences only, i.e., a normative system which allows everything) is around 0.78. Up to around generation 25, we can see a slow but quite steady improvement in the best fitness values, although the impact of the slight randomness in the agents’ choices of actions is clear. The highest scores, just above 0.86, which roughly corresponds to three more visited squares per run, are obtained in generations 41 and 78. After 25 generations there seems to be no significant improvement.

Fig. 2.
figure 2

Evolution progress. Upper curve shows fitness of best individual, lower curve shows mean fitness. Algorithm parameter values are shown in Table 6. A log of the run is available for download via http://drpa.se/norms/nrtssit/.

According to the log, the best normative system in generation 41 (with \(P_{1} \)-norms omitted for brevity) is translated to

$$\begin{aligned}&\langle M_{1}(=^{\prime }\wedge Lap_{9}),P_{2\varOmega }Lap_{9}\rangle , \langle M_{1}Lap_{6},P_{6\varLambda }Lap_{9}\rangle , \langle M_{1}Lap_{4},P_{6\varOmega }Lap_{4}\rangle , \\&\quad \langle M_{1}Lap_{3},P_{2\varLambda }Lap_{6}\rangle , \langle M_{1}Lap_{2},P_{6\varLambda }Lap_{4}\rangle , \langle M_{1}Lap_{2},P_{4\varOmega }Lap_{3}\rangle , \\&\qquad \qquad \, \langle M_{1}Lap_{1},P_{4\varOmega }Lap_{2}\rangle , \langle M_{1}Lap_{0},P_{4\varLambda }Lap_{1}\rangle . \end{aligned}$$

A closer look at the log reveals that, of the best candidates with a fitness over 0.85,

  1. (1)

    all but one (13 out of 14) contain either \(\langle M_{1} Lap_{6},P_{6\varLambda }Lap_{9}\rangle \) or \(\langle M_{1}Lap_{6} ,P_{4\varLambda } Lap_{9}\rangle \), and

  2. (2)

    all but three contain \(\langle M_{1}Lap_{2},P_{6\varLambda } Lap_{4}\rangle \) or \(\langle M_{1}Lap_{2},P_{4\varLambda }Lap_{4}\rangle \).

Let us first consider (1). The intended meaning of \(\langle M_{1}Lap_{6},P_{6\varLambda }Lap_{9}\rangle \) is that if \(Lap_{6}(x_{1},x_{2};s) \) for some agents \(x_{1}\), \(x_{2}\), and \(x_{3}\) such that \(x_{1}=x_{3}\) and \(x_{1}\) is the moving agent x, then the normative condition \(P_{6\varLambda }Lap_{9}(x_{1},x_{2},x_{3};x,s)\) holds, and thus, according to the definition of \(C_{6\varLambda }^{a}\) (see row 7 in Table 4) action a is prohibited for x if

$$ \begin{aligned} \lnot Lap_{9}(x,x_{2};s)\, \& \,Lap_{9}(x,x_{2};a(x,s)) \end{aligned}$$

or

$$ \begin{aligned} Lap_{9}(x,x_{2};s)\, \& \,\lnot Lap_{9}(x,x_{2};a(x,s)). \end{aligned}$$

Now, when \(Lap_{6}(x,x_{2};s)\), the second disjunct never becomes true, since \(Lap_{6}\,R\,Lap_{9}^{\prime }\) (i.e., \(Lap_{6}\) implies \(Lap_{9}^{\prime }\)); hence a is prohibited for x if

$$ \begin{aligned} \lnot Lap_{9}(x,x_{2};s)\, \& \,Lap_{9}(x,x_{2};a(x,s)). \end{aligned}$$
Table 6. Choice of parameter values.

Since \(\lnot Lap_{9}(x,x_{2};s)\) follows from \(Lap_{6}(x,x_{2};s)\), a is prohibited for x if \(Lap_{9}(x,x_{2};a(x,s))\). Similarly, the meaning of \(\langle M_{1}Lap_{6},P_{4\varLambda }Lap_{9}\rangle \) is that if \(Lap_{6} (x_{1},x_{2};s)\) for some agents \(x_{1}\), \(x_{2}\), and \(x_{3}\), such that \(x_{1}=x_{3}\) and \(x_{3}=x\), then \(P_{4\varLambda }Lap_{9}(x_{1},x_{2},x_{3};x,s)\) holds, and thus (see Table 4, row 4) a is prohibited for x if

$$ \begin{aligned} \lnot Lap_{9}(x,x_{2};s)\, \& \,Lap_{9}(x,x_{2};a(x,s)); \end{aligned}$$

i.e., when \(Lap_{6}(x,x_{2};s)\), it follows that \(\lnot Lap_{9}(x,x_{2};s)\), and thus a is prohibited for x if \(Lap_{9}(x,x_{2};a(x,s))\). Hence, \(\langle M_{1}Lap_{6},P_{6\varLambda } Lap_{9}\rangle \) and \(\langle M_{1}Lap_{6},P_{4\varLambda }Lap_{9}\rangle \) are ‘operationally equivalent’ in the Explorer Dalmas setting, in the sense that they prohibit the same actions in the same situation. Furthermore, both are operationally equivalent to \(\langle M_{1}Lap_{6},P_{7}Lap_{9} \rangle \) with the intended interpretation that if \(Lap_{6}\) then the moving agent shall see to it that not \(Lap_{9}\). A similar case can be made for (2); \(\langle M_{1}Lap_{2},P_{6\varLambda }Lap_{4}\rangle \), \(\langle M_{1} Lap_{2},P_{4\varLambda }Lap_{4}\rangle \) and \(\langle M_{1}Lap_{2},P_{7} Lap_{4}\rangle \) are operationally equivalent and thus interchangeable in this setting. The notion of operational equivalence of norms is further discussed in Sect. 3.2.

It is showed by (1) and (2) that, in some settings, the set of consequences may contain redundancy. This is an effect of the fact that, in this particular setting, the set of grounds and the set of consequences are constructed from the same set of conditions. Whether this is a problem or not is probably dependent on the particular setting. We may also note that, for example, the meaning of \(\langle M_{1}Lap_{0},P_{4\varOmega }Lap_{2}\rangle \) would be that if \(Lap_{0}(x_{1},x_{2};s)\) for some agents \(x_{1}\), \(x_{2}\), and \(x_{3}\), such that \(x_{1}=x_{3}\) and \(x_{3}=x\), then \(P_{4\varOmega }Lap_{2}(x_{1},x_{2} ,x_{3};x,s)\) holds, and thus (see Table 2, row 5) a is prohibited for x if

$$ \begin{aligned} Lap_{2}(x,x_{2};s)\, \& \,Lap_{2}(x,x_{2};a(x,s)). \end{aligned}$$

Now, \( Lap_{2}(x_{1},x_{2};s)\, \& \,Lap_{2}(x_{1},x_{2};a(x,s))\) can never become true when \(Lap_{0} (x_{1},x_{2};s)\), since \(Lap_{0} \,R\,Lap_{2}^{\prime }\). Hence, \(\langle M_{1}Lap_{0},P_{4\varOmega }Lap_{2} \rangle \) will never prohibit any actions, and is thus operationally equivalent to, \(\langle M_{1}Lap_{0},P_{1}Lap_{2}\rangle \) in this setting. This illustrates another kind of redundancy. Another consequence of employing negative permission is that normative systems may evolve which are incoherent (see Sect. 1.3) according to the underlying logic of the \(P_{k}\) operators, but still meaningful in an ‘operational’ sense. A discussion of these matters is beyond the scope of this paper, but a more precise representation of genes and a more careful design of the genetic operators could avoid or at least reduce logical incoherence and redundancy in the setting at hand. This could, potentially, significantly reduce the search space for the evolutionary algorithm. For this purpose, the mechanisms for norm addition and subtraction described in [16, Sect. 4.3] might be very useful, as well as a more thorough analysis of the relationships between potential grounds and consequences, in order to exploit the possibility of operational equivalence of norms (Sect. 3.2).

Based on the above analysis, the following set of Explorer norms (again, \(P_{1}\)-norms are omitted) is suggested: \(\{\langle M_{1}Lap_{6},P_{7} Lap_{9}\rangle \),\(\langle M_{1}Lap_{2},P_{7}Lap_{4}\rangle \}\). The intended interpretation is

  1. (1)

    \(\forall x,y:Lap_{6}(x,y;s)\rightarrow P_{7}Lap_{9}(x,y,x;x,s))\); and

  2. (2)

    \(\forall x,y:Lap_{2}(x,y;s)\rightarrow P_{7}Lap_{4}(x,y,x;x,s))\).

Using the deontic operator \(\mathrm {Shall}\) and the action operator \(\mathrm {Do}\), these norms are expressed as follows: (1) For all xy: if \(Lap_{6}(x,y)\), and x is the moving agent, then \(\mathrm {Shall} \,\mathrm {Do}(x,\lnot Lap_{9}(x,y))\); and (2) For all xy: if \(Lap_{2} (x,y)\), and x is the moving agent, then \(\mathrm {Shall}\,\mathrm {Do}(x,\lnot Lap_{4}(x,y))\); cf. [11, 20]. This represents the following simple set of ‘rules of thumb’: (1) If you stand in the square next to another agent’s square, you shall act so that you do not end up in the same location as the other agent, and (2) if your protected sphere overlaps another agent’s protected sphere with two squares, you shall act so that the overlap does not increase to four.

Test runs indicate that the average improvement with this very simple normative system compared with a system with no restrictions is two to three additional squares visited. As the Explorer Dalmas example was chosen primarily for demonstration purposes, we shall be content with the simple analysis performed here. In more complex scenarios, other more powerful (e.g., statistical) methods could be useful.

3.2 Discussion

Validation of the suggested approach to the design of normative systems for problem-solving MAS is, of course, a non-trivial problem. One aspect of this problem is the difficulty of applying this approach, but most important is probably to focus on the quality of the results it produces, i.e., to validate the systems obtained by applying the approach. The performance of norm-regulated MAS designed in this way could, for example, be compared with the performance of systems (norm-regulated systems as well as, e.g., planning systems) designed by hand. Such comparisons require domain-specific performance measures, which makes a general-level (i.e., domain independent) validation very difficult, if not impossible. Even within a specific domain, validation is non-trivial and sensitivity analyses are required. A good starting-point is to consider every tool in the evolutionary toolbox, together with a thorough analysis of the domain at hand, to increase the chance of evolving the optimal normative system. First, the parameters controlling the evolutionary algorithm may be varied: the population size, the number of evolved generations, the level of elitism (i.e., the portion of the best candidates which are allowed to survive into the next generation), the probability of crossover, the number of crossover points, and the selection strategy (e.g., tournament selection instead of roulette wheel). Other ideas include using other representations of chromosomes, such as tree-based representations to allow for normative systems with a variable number of norms, or (as has already been mentioned) more carefully designed evolutionary operators that exclude redundant and/or incoherent candidates from evaluation. More advanced schemes, such as island evolution (where several populations are evolved in parallel, with a small probability of ‘migration’ between such ‘islands’) or cooling (where the crossover and mutation probabilities gradually decrease), could also be tried. One example of a more careful design of an evolutionary operator is to restrict the mutation operator by the notion of ‘deontic paths’ [15, pp. 110ff] between types of normative positions. In short, the deontic path follows the edges in the Hasse diagram of the relation ‘less free than’ on types of normative positions; cf. [15, p. 105] and [12, Fig. 1]. The restriction could be that a gene which represents a type operator \(P_{i}\) applied to a descriptive condition d, may only be changed by mutation to represent a new operator \(P_{j}\) in such a way that \(P_{j}\) lies immediately above or below \(P_{i}\) on the deontic path between them. This could bring more stability into the evolution process, since the effects of mutations would be less dramatic.

Furthermore, the parameters for the particular setting may also be varied. For example, one might want to consider grounds and consequences based on other conditions. In the Explorer Dalmas domain one could try, e.g., \(Lap_{n}\) conditions based on larger protected spheres (since it seems reasonable to expect that a normative system based on small protected spheres will be most ‘effective’ when the agents are relatively close to each other), or generalized versions of \(Lap_{n}\) conditions involving three or more agents. Other ideas are to allow individual utility functions for each agent, or evolving the utility function and the normative system in parallel. In general, special treatment is required for domains such as the Explorer Dalmas where the fitness evaluations are ‘noisy’, i.e., subject to some degree of randomness. To deal with noisy fitness evaluations, a number of techniques are available, for example increasing the population size, and resampling and averaging the fitness. [6, Sect. 3.3] As described in Sect. 2, a variant of the latter technique is used in the Explorer Dalmas fitness evaluations. Another option regarding the evaluation function is to allow more or less variation regarding, e.g., grid sizes or shapes, number of agents, number of events per run and number of runs per normative system. However, large populations, in combination with expensive fitness calculations in each generation, are computationally challenging. The moving average approach by Di Pietro et al. can be used to reduce the number of samples needed per generation, and thus allow for running more generations in a given run-time. When a candidate is generated for the first time, its ‘fitness array’ is initialized with n fitness evaluations. For each new generation, the evaluation score is calculated only once, and the oldest score in the fitness array is replaced with the new score. A candidate’s fitness is then the average of the evaluation scores in the fitness array.

Operational Equivalence of Norms. It was observed in Sect. 3.1 that norms where grounds and consequences are based on the same set of descriptive conditions can become equivalent in the sense that they have exactly the same effect in the same situations. Let us investigate this further. First, recall the definitions of the \(C_{i}^{a}\) operators in Table 4, and note the following:

$$ \begin{aligned}&C_{5}^{a}c(X_{\nu },x_{\nu +1};x,s)\,\text {iff}\,[c(X_{\nu };s)\, \& \,\lnot c(X_{\nu };a(x,s))]\,\text {or}\,[\lnot c(X_{\nu };s)\, \& \,\lnot c(X_{\nu };a(x,s))]\\&C_{6\varLambda }^{a}c(X_{\nu },x_{\nu +1};x,s)\,\text {iff}\,[c(X_{\nu };s)\, \& \,\lnot c(X_{\nu };a(x,s))]\,\text {or}\,[\lnot c(X_{\nu };s)\, \& \,c(X_{\nu };a(x,s))]\\&C_{6\varOmega }^{a}c(X_{\nu },x_{\nu +1};x,s)\,\text {iff}\,[c(X_{\nu };s)\, \& \,c(X_{\nu };a(x,s))]\,\text {or}\,[\lnot c(X_{\nu };s)\, \& \,\lnot c(X_{\nu };a(x,s))]\\&C_{7}^{a}c(X_{\nu },x_{\nu +1};x,s)\,\text {iff}\,[c(X_{\nu };s)\, \& \,c(X_{\nu };a(x,s))]\,\text {or}\,[\lnot c(X_{\nu };s)\, \& \,c(X_{\nu };a(x,s))] \end{aligned}$$

Now suppose for example that \(c\in C\), the set of potential grounds, but also \(c\in D\), the set of descriptive conditions underlying the set of potential normative consequences. Let \(M_{\kappa }\) be a ‘move operator’, for example \(M_{1}\), such that \(\kappa \le \nu \), and suppose that c is true of some agents \(x_{1},...,x_{\nu }\) in a state s. We may then note the following:

  1. (1)

    \(C_{2\varOmega }^{a}c(X_{\nu },x_{\nu +1};x,s)\), and \(C_{4\varLambda }^{a}c(X_{\nu },x_{\nu +1};x,s)\) are false whenever \(c(X_{\nu };s)\). Hence, the set of actions a that would be prohibited by \(\langle M_{\kappa }c,P_{2\varOmega }c\rangle \), resp. \(\langle M_{\kappa }c,P_{4\varLambda }c\rangle \), is exactly the same as the set of actions that would be prohibited by \(\langle M_{\kappa }c,P_{1}c\rangle \), viz. the empty set.

  2. (2)

    The second disjunct of \(C_{6\varLambda }^{a}c(X_{\nu },x_{\nu +1};x,s)\) and the second disjunct of \(C_{5}^{a}c (X_{\nu },x_{\nu +1};x,s)\) must be false. Hence, if \(c(X_{\nu };s)\), then \(C_{6\varLambda }^{a}c(X_{\nu },x_{\nu +1};x,s)\) iff \(C_{2\varLambda }^{a}c(X_{\nu },x_{\nu +1};x,s)\) iff \(C_{5}^{a}c(X_{\nu },x_{\nu +1};x,s)\), which means that the actions that would be prohibited by \(\langle M_{\kappa }c,P_{2\varLambda }c\rangle \), \(\langle M_{\kappa }c,P_{5}c\rangle \), and \(\langle M_{\kappa }c,P_{6\varLambda }c\rangle \), are exactly the same.

  3. (3)

    The second disjunct of \(C_{6\varOmega }^{a}c(X_{\nu },x_{\nu +1};x,s)\) and the second disjunct of \(C_{7}^{a}c (X_{\nu },x_{\nu +1};x,s)\) must be false. Hence, if \(c(X_{\nu };s)\), then \(C_{6\varOmega }^{a}c(X_{\nu },x_{\nu +1};x,s)\) iff \(C_{4\varOmega }^{a}c(X_{\nu },x_{\nu +1};x,s)\) iff \(C_{7}^{a}c(X_{\nu },x_{\nu +1};x,s)\); so the actions that would be prohibited by \(\langle M_{\kappa }c,P_{4\varOmega }c\rangle \), \(\langle M_{\kappa }c,P_{6\varOmega }c\rangle \), and \(\langle M_{\kappa }c,P_{7}c\rangle \), are exactly the same.

    In other words, given that \(c(X_{\nu })\) holds, \(P_{2\varOmega }\), \(P_{4\varLambda }\), and \(P_{1}\) are ‘equally prohibitive’, and the same holds for \(P_{2\varLambda }\), \(P_{4\varOmega }\), and \(P_{5}\), resp. \(P_{2\varLambda }\), \(P_{6\varOmega }\), and \(P_{7}\). Now suppose that \(\lnot c(X_{\nu })\) holds for some agents \(x_{1},...,x_{\nu }\) and some act a in s:

  4. (4)

    Since \(C_{2\varLambda }^{a}c(X_{\nu },x_{\nu +1};x,s)\) and \(C_{4\varOmega } ^{a}c(X_{\nu },x_{\nu +1};x,s)\) are false whenever \(\lnot c (X_{\nu };s)\), the set of actions that would be prohibited by \(\langle M_{\kappa }c^{\prime },P_{2\varLambda }c\rangle \), resp. \(\langle Mc^{\prime },P_{4\varOmega }c\rangle \), is exactly the same as the set of actions that would be prohibited by \(\langle M_{\kappa }c^{\prime },P_{1}c\rangle \), viz. the empty set.

  5. (5)

    The second disjunct of \(C_{6\varLambda }^{a}c(X_{\nu },x_{\nu +1};x,s)\) and the first disjunct of \(C_{7}^{a}c(X_{\nu }, x_{\nu +1};x,s)\) must be false. Hence, if \(\lnot c(X_{\nu };s)\), then \(C_{6\varLambda }^{a}c(X_{\nu },x_{\nu +1};x,s)\) iff \(C_{4\varLambda }^{a}c(X_{\nu },x_{\nu +1};x,s)\) iff \(C_{7} ^{a}c(X_{\nu },x_{\nu +1};x,s)\), which means that the actions that would be prohibited by \(\langle M_{\kappa }c^{\prime },P_{4\varLambda }c\rangle \), and \(\langle M_{\kappa }c^{\prime },P_{6\varLambda }c\rangle \), and \(\langle M_{\kappa }c^{\prime },P_{7}c\rangle \), are exactly the same.

  6. (6)

    The first disjunct of \(C_{6\varOmega }^{a}c(X_{\nu },x_{\nu +1};x,s)\) and the first disjunct of \(C_{5}^{a}c(X_{\nu }, x_{\nu +1};x,s)\) must be false. Hence, if \(\lnot c(X_{\nu };s)\), then \(C_{6\varOmega }^{a}c(X_{\nu },x_{\nu +1};x,s)\) iff \(C_{2\varOmega }^{a}c(X_{\nu },x_{\nu +1};x,s)\) iff \(C_{5}^{a}c(X_{\nu } ,x_{\nu +1};x,s)\); therefore, the actions that would be prohibited by \(\langle M_{\kappa }c^{\prime },P_{2\varOmega }c\rangle \), \(\langle M_{\kappa }c^{\prime },P_{6\varOmega }c\rangle \), and \(\langle M_{\kappa }c^{\prime },P_{5}c\rangle \) are exactly the same.

    Thus, if \(\lnot c(X_{\nu })\) holds, then \(P_{2\varLambda }\), \(P_{4\varOmega }\), and \(P_{1}\) are equally prohibitive, and the same holds for \(P_{4\varLambda }\), \(P_{6\varLambda }\), and \(P_{7}\), resp. \(P_{2\varLambda }\), \(P_{5}\), and \(P_{6\varOmega }\). It seems plausible that ‘equally prohibitive’ could be a suitable foundation for a notion of operational equivalence of norms. It is straightforward to generalize the above arguments to cases where \(c\,R\,d\) or \(c\,R\,d^{\prime }\), i.e., where c implies d, resp., c implies \(d^{\prime }\).

4 Conclusion and Future Work

Concrete advice on how to use evolutionary mechanisms as part of the pre-runtime design of normative systems for problem-solving MAS were presented. The idea behind the methodology sketched here is to use a top-down approach of selecting (a subset of) the most ‘efficient’ norms from an evolved normative system, rather than a bottom-up approach of designing a normative system entirely from scratch. To illustrate the idea, a simple system, based on the Dalmas architecture for norm-regulated MAS was employed as part of the evaluation step of an evolutionary algorithm. The results show that an evolutionary algorithm has the potential of being a useful tool when designing normative systems for problem-solving MAS.

Ideas for future work include trying to formalize and further investigate the notion of operational equivalence of norms, which was introduced in Sect. 3.2. Also left for future work is further validation of the suggested methodology, for example by applying the methodology in other domains in which the grounds of the norms and the consequences are based on different sets of descriptive conditions, or by further validating the evolved normative system for the Explorer Dalmas. One could experiment with different domain-specific parameters as well as evolutionary algorithm parameters, as suggested in Sect. 3.2, to see if better solutions can be found and thus gain more support for the ideas suggested here. It could be interesting to, e.g., explore normative systems of variable size and evaluation functions which impose a penalty for large normative systems, since in many cases it could be desirable to rely on a small number of ‘rules of thumb’ and avoid overly complex normative systems which may become expensive in terms of calculations. Investigating the possibility to design more accurate evolutionary operators, for example by exploiting the fact that certain norms are operationally equivalent in the Explorer Dalmas setting, also seems like a promising idea.