
1 Introduction

Recently, there has been a movement towards the explicit design and use of organizations in multi-agent systems (MAS). An organization helps to better model the problem being tackled, and it helps to increase the system’s efficiency by defining the MAS structure and the rules that the agents must follow to achieve individual and system-level goals. However, in many cases it is difficult to define the organizational model that best solves the problem.

As a contribution to this issue, we present in this paper an experimental analysis of the overall results of different organization-oriented MAS, which were created for the “Multi-Agent Programming Contest” scenario.

2 Background

2.1 Agent Organizational Models

Organization is a complex notion: there are several views, definitions, and approaches to characterize it, addressing different issues: it is a supra-individual phenomenon [1], it is defined by the designer or by the actors involved [2], and it is a pattern of predefined [3] or emergent [4] cooperation.

We will adopt here the following definition [5]:

“An organization is a supra-agent pattern of emergent cooperation or predefined cooperation of the agents in the system, that could be defined by the designer or by the agents themselves, in order to achieve a purpose.”

One important issue is the relation between organizational constraints and agents’ autonomy, as studied by Castelfranchi [6]. When seen as a predefined cooperation pattern, an organization aims to constrain the agents’ autonomy. In fact, this limitation aims to guarantee that the global goals are achieved in an optimized way. If agents strictly follow their organizational constraints, they will know what to do, when and with whom to interact in crucial problem solving situations.

Given that an organization constrains the agents’ autonomy, a further step is to investigate how this limitation can be properly engineered and designed. Coutinho et al. [7] propose some modeling dimensions for organizational design: (i) the structural dimension, mainly composed of notions like roles and groups, as used in the AGR model [8]; (ii) the interactive dimension, characterized by dialogical interaction structures, as used in the Electronic Institutions model [9]; (iii) the functional dimension, formed by goal/task decomposition structures, as proposed by the TAEMS model [10]; and (iv) the normative dimension, in which we find concepts such as norms, rights, and rules, as used in the OPERA model [11].

However, the organizational design problem has not been solved so far by researchers in business and management domains. This problem can be stated as: how to find an optimal constraint set that could guarantee global efficiency for a given task scenario? The same problem arises concerning multi-agent organizations [12].

In this paper, we present a comparison and evaluation of different organizational models that were applied to the “Agents on Mars” scenario, described next.

2.2 MAPC

The “Multi-Agent Programming Contest” (MAPC) has been held every year since 2005, as an attempt to stimulate research in MAS programming techniques [13]. In the contest, two teams of agents are placed in the same environment and compete directly in a scenario set by the organizers. Being a direct competition, it is an interesting testbed to evaluate and compare different systems, making it possible to identify strengths and weaknesses, and thus promoting the development of all participants.

Since 2011, a scenario called “Agents on Mars” has been used. In this scenario, two teams of 28 agents compete to explore and dominate the most valuable water wells of the planet. The environment is represented by a weighted graph, where the vertices denote wells and possible locations for the agents, and the edges indicate the possibility of crossing from one vertex to another at an energy cost for the agent. Each vertex has a value corresponding to the usefulness of its water well, and this value is used to calculate the value of the areas occupied by the agents.

A zone is a subgraph covered by a team according to a coloring algorithm based on the notion of domain [14]. Several agents from different teams can be located on a single vertex, but the team with the highest number of agents dominates this vertex, which receives the dominant team’s color. An uncolored vertex inherits the color of the dominant team in its neighbourhood. Hence, if the graph contains a subgraph with a colored border, all the vertices within this boundary receive the same color. This means that an agent team may cover a subgraph with more vertices than the number of its members. Figure 1 shows a map with the colored subgraphs.

Fig. 1. “Agents on Mars” scenario.
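The domination rule described above can be sketched in a few lines of Java. This is a minimal illustration with names of our own choosing, not the contest's reference implementation; in particular, the real algorithm additionally fills subgraphs enclosed by a colored border, which a single propagation pass does not capture.

```java
import java.util.*;

// Sketch of the domination-based coloring: a vertex is colored by the
// team with strictly more agents on it; an uncolored vertex inherits
// the majority color among its directly colored neighbours.
class ZoneColoring {
    static Map<Integer, String> color(Map<Integer, List<Integer>> adj,
                                      Map<Integer, Map<String, Integer>> agents) {
        // Step 1: direct domination by agent count on each vertex.
        Map<Integer, String> direct = new HashMap<>();
        for (var e : agents.entrySet()) {
            String team = majority(e.getValue());
            if (team != null) direct.put(e.getKey(), team);
        }
        // Step 2: one propagation pass for uncolored vertices.
        Map<Integer, String> result = new HashMap<>(direct);
        for (int v : adj.keySet()) {
            if (result.containsKey(v)) continue;
            Map<String, Integer> votes = new HashMap<>();
            for (int n : adj.get(v))
                if (direct.containsKey(n))
                    votes.merge(direct.get(n), 1, Integer::sum);
            String team = majority(votes);
            if (team != null) result.put(v, team);
        }
        return result;
    }

    // Returns the key with the strictly largest count, or null on a tie.
    static String majority(Map<String, Integer> counts) {
        String best = null;
        int bestCount = 0;
        boolean tie = false;
        for (var e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey(); bestCount = e.getValue(); tie = false;
            } else if (e.getValue() == bestCount) {
                tie = true;
            }
        }
        return tie ? null : best;
    }
}
```

On a path 1–2–3 where team A dominates vertices 1 and 3, vertex 2 is uncolored in step 1 and inherits A's color in step 2, which is how a team can cover more vertices than it has agents.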

At the beginning of the simulation, the map is unknown to the agents. Thus, each team needs to explore the graph before starting to conquer areas, since the agents have a limited view of the map and only perceive their neighbouring vertices. Additionally, a team sometimes needs to sabotage the other team to increase its area, or to defend areas in order not to lose them to the opponent.

Each team consists of 28 players that can play 5 different roles: explorers (Exp), sentinels (Sen), saboteurs (Sab), inspectors (Ins) and repairers (Rep). These roles define the characteristics of each agent, such as life level, maximum energy, strength, and visibility. The roles also limit the possible actions that the agent can perform in the environment, as shown in Table 1. For instance, explorers can find water wells and help to explore the map, sentinels have long-distance sensors and thus can observe larger areas, saboteurs can attack and disable enemies, inspectors can spy on opponents, and repairers can repair damaged agents.

A team receives a cash reward whenever it reaches a major milestone. This reward can be used to empower the agents, increasing, for instance, their maximum energy or strength. Different milestones can be reached during a competition, such as dominating areas with fixed values (e.g., 10 or 20), or performing a certain number of successful attacks or defenses. If not spent, the reward is added to the team’s total score.

The goal of each team is to maximize its score, defined as the sum, over all simulation steps, of the values of the occupied zones plus the earned (and not yet spent) rewards, as shown in Eq. 1:

$$\begin{aligned} score = \sum \limits ^{steps}_{p=1} (zones_p + rewards_p) \end{aligned}$$
(1)
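As a concrete reading of Eq. 1, the following small helper (a hypothetical illustration of ours, not code from the team) accumulates the per-step zone values and unspent rewards:

```java
// Direct transcription of Eq. 1: the score accumulates, at every step p,
// the value of the occupied zones plus the unspent rewards at that step.
class Score {
    static long score(int[] zones, int[] rewards) {
        long total = 0;
        for (int p = 0; p < zones.length; p++)
            total += zones[p] + rewards[p];
        return total;
    }
}
```

Note that a reward kept for three steps is counted three times, once per step, which is why hoarding rewards can be worthwhile.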
Table 1. “Agents on Mars” roles and actions.

Next, we present our proposal of an agent team, called LTI-USP, based on different organizational models, to solve this problem.

3 LTI-USP Agent Team

3.1 Architecture

The architecture of the LTI-USP team is shown in Fig. 2. In this architecture, we used BDI agents. Each agent is composed of plans, a belief base, and its own world model. The agent decides which plan to execute according to its beliefs and its local view of the world.

Fig. 2. LTI-USP team architecture.

The world model consists of a graph implemented in Java, using simple data structures and classes. It captures every detail received from the MASSim contest server, such as explored vertices and edges, opponents’ positions, disabled teammates, etc. At each step, the agent’s world model is updated with the percepts received from the MASSim server and with the information received from the other agents.

Some of the percepts received from the MASSim server are also stored in the agent’s belief base, such as the agent’s role, energy, position, and the team’s rewards, thus allowing the agent to have direct access to this information without having to access its world model. Percepts about vertices, edges, and other agents are not stored in the belief base so as not to compromise the agent’s performance, as it could be very expensive to update and access the belief base with so much information. Moreover, since we wanted to update a belief whenever a new instance was inserted (instead of adding a second one), we decided to use an indexed belief base in which some beliefs are unique and indexed for faster access.

Our team was developed using JaCaMo. JaCaMo [15] is a platform for multi-agent programming which supports all the levels of abstraction – agent, environment, and organization – that are required for developing sophisticated MAS, by combining three separate technologies: Jason [16], for programming autonomous agents; CArtAgO [17], for programming environment artifacts; and Moise [18], for programming multi-agent organizations.

Jason is a Java-based interpreter for an extended version of the AgentSpeak programming language, suitable for programming BDI agents.

CArtAgO is a framework for environment programming based on the A&A meta-model [19]. In CArtAgO, the environment can be designed as a dynamic set of computational entities called artifacts, organized into workspaces, possibly distributed among various nodes of a network [15]. Each artifact represents a resource or a tool that agents can instantiate, share, use, and perceive at runtime. For this project, we did not create any new artifacts; we only made use of the organizational artifacts provided by Moise.

Moise [18, 20] is an organizational model for MAS based on three complementary dimensions: structural, functional and normative. The model enables a MAS designer to explicitly specify its organizational constraints, and it can also be used by the agents to reason about their organization. We used the Moise model to define the agents’ roles, groups, and missions.

Agents communicate with the MASSim server through the EISMASSim environment interface included in the contest software package. EISMASSim is based on EIS [21], a proposed standard for agent-environment interaction. It automatically establishes and maintains authenticated connections to the server, and abstracts the communication between the MASSim server and the agents into simple Java method calls and callbacks. In order to use this interface, we extended the JaCaMo default agent architecture to perceive and act not only on the CArtAgO artifacts, but also on the EIS environment.

3.2 Strategies

The main strategy of our team is to divide the agents into two or more subgroups: one in charge of attacking the opponents (infantry), and the others (squads) in charge of occupying the best zones in the graph. Moreover, regarding the agents’ roles, we decided not to map the five types specified in the scenario (Exp, Ins, Rep, Sab and Sen) directly to the roles in our team. Instead, we defined a different set of roles in our system according to the adopted strategy, as shown in Fig. 3.

Fig. 3. LTI-USP team structural specification.

Each of these roles has a mission associated with it, and can be played by one or more types of agents. For example, the map_explorer role can be played only by the explorer type, while the soldier role can be played by all types of agents. Below we describe the missions related to each role:

  • map_explorer (Exp): Explores the whole graph by probing every vertex and surveying all edges on its path;

  • map_explorer_helper (Exp): Helps the map_explorer to explore the graph, but only in the first 250 steps. After that, the agent leaves this role to adopt the soldier role in the best_zone subgroup;

  • soldier (all types): Tries to occupy one of the best zones indicated by the coordinator agent. When all the vertices of the designated best zone are occupied, the soldier starts to look at the neighbouring vertices of the team’s zone to which it can move to increase the zone size;

  • guardian (Sab): Defends the subgroup’s best zone by attacking any opponent that is close to the team’s zone or that tries to invade it;

  • medic (Rep): Occupies the center of the designated best zone and is responsible for repairing the agents in the subgroup, or other agents which eventually need to be repaired, such as the map_explorer. In our team, damaged agents move to the repairer’s position in order to be repaired;

  • zone_explorer (Exp): Explores the team’s zone by probing the vertices whose values are unknown. When all vertices are probed, the zone_explorer helps the soldiers to increase the zone size;

  • saboteur (Sab): Attacks any close opponent, or the opponent who occupies a good vertex;

  • sentinel (Sen): Tries to sabotage the opponent by moving inside its zone;

  • repairer (Rep): Follows the saboteur, always staying two vertices away from it, so as to be ready to repair the saboteur when necessary without taking too much risk;

  • coordinator (none): An agent internal to our system which does not communicate with the MASSim server. It builds its local view of the world from the percepts broadcast by the other agents. Whenever the world model is updated, it computes the best zones in the graph and sends this information to the other agents. The coordinator is also responsible for creating the organizational artifacts at the beginning of the simulation, and for distributing the groups, roles, and missions among the other agents, in order to eliminate the performance issues caused by two or more agents trying to adopt the same role in a group, or trying to commit to the same mission.

The best zone in the map is obtained by calculating, for each vertex, the sum of its value with the values of all its direct and second-degree neighbours. The vertex with the greatest sum is the center of the best zone. Zones whose sum of values is below 10 are not considered in the calculation. The same computation is performed again to determine whether there is a second, third, fourth best zone, and so on, but each time removing the vertices belonging to the previously found best zones from the analysis. If the number of best zones is smaller than the number of squads, the first best zone is assigned to the subgroups without a specific best zone.
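The best-zone heuristic above can be sketched as follows. This is an illustrative reconstruction with names of our own (not the coordinator's actual code): each candidate centre is scored by its own value plus the values of its first- and second-degree neighbours, sums below 10 are discarded, and a chosen zone's vertices are removed before the next search.

```java
import java.util.*;

// Sketch of the best-zone heuristic: rank candidate centres by the
// summed value of the vertex and its neighbours up to distance two.
class BestZones {
    static List<Integer> bestCenters(Map<Integer, List<Integer>> adj,
                                     Map<Integer, Integer> value, int nZones) {
        Set<Integer> excluded = new HashSet<>();
        List<Integer> centers = new ArrayList<>();
        for (int k = 0; k < nZones; k++) {
            int bestVertex = -1, bestSum = 0;
            for (int v : adj.keySet()) {
                if (excluded.contains(v)) continue;
                Set<Integer> zone = zoneAround(adj, v);
                zone.removeAll(excluded);        // skip already-claimed vertices
                int sum = 0;
                for (int u : zone) sum += value.getOrDefault(u, 0);
                if (sum >= 10 && sum > bestSum) { bestSum = sum; bestVertex = v; }
            }
            if (bestVertex == -1) break;         // no further zone above threshold
            centers.add(bestVertex);
            excluded.addAll(zoneAround(adj, bestVertex));
        }
        return centers;
    }

    // The centre plus its neighbours up to distance two.
    static Set<Integer> zoneAround(Map<Integer, List<Integer>> adj, int v) {
        Set<Integer> zone = new HashSet<>();
        zone.add(v);
        for (int n : adj.get(v)) {
            zone.add(n);
            zone.addAll(adj.get(n));
        }
        return zone;
    }
}
```

In the actual team, unprobed vertices have unknown values, which is why the zone_explorer role exists; the sketch simply treats missing values as zero.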

4 Experiments and Results

4.1 Experiments

In the MAPC, each team plays against each other team three times, and the team that wins the most matches wins the overall tournament. Each match has 750 steps and the map is randomly generated; thus, from one match to another, the number of vertices, edges, and high-valued areas can change.

The fact that the number of high-valued areas may change leads, in some cases, to situations where protecting a single best area is the better strategy, while in other cases it would be better to divide the team into smaller groups to try to gain control over several areas. Therefore, we performed some experiments to analyse how the number of squads in our team impacts its overall performance.

The experiments consisted of four teams (TG1, TG2, TG3 and TG4), all of them with the structure shown in Fig. 3, except with respect to the number of squads, as shown in Table 2. These teams competed in three different scenarios/maps (SC1, SC2 and SC3), described in Table 3. In this table, possible zones means areas in the map with high-valued vertices, which are hence possible candidates for a best zone. These scenarios are also represented in Fig. 4.

Table 2. LTI-USP team configurations.
Table 3. Scenarios properties.

The numbers of vertices and edges shown in Table 3 were chosen according to the parameters set in the MAPC, in which the maps had from 400 to 600 vertices and the thinning factor, i.e., the percentage of edges removed from a complete connected graph, varied from 10% to 60%.

Fig. 4. Experiment scenarios.

4.2 Results

For each scenario previously described, we performed 10 simulations for each of the following matches: TG1 vs TG2, TG1 vs TG3, and TG1 vs TG4. The data collected in all simulations were: (i) the winner, (ii) the teams’ final scores, and (iii) the score obtained in each step by each of the two competing teams. Table 4 shows a summary of the number of wins for each team by match and scenario.

Table 4. Results summary - number of wins.

Given the results, we used a hypothesis test, the Wilcoxon T test, to determine, for each match, whether the 10 simulations were sufficient to conclude that one team was better than the other in a given scenario. The Wilcoxon T test (also called the Wilcoxon signed-rank test) is a non-parametric test for dependent samples that can indicate, at some stated confidence level, whether a particular population tends to have larger values than another.

The results of this analysis are shown in Table 5, where the values correspond to the p-values of the Wilcoxon T test applied to the final scores of the 10 simulations performed for each match. A p-value lower than 0.05 indicates that the results observed in the 10 simulations are enough to conclude that a certain team tends to obtain higher scores than the other.
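For samples as small as 10 simulations per match, the signed-rank p-value can even be computed exactly by enumerating all sign assignments. The sketch below is ours (statistical packages were presumably used for the paper's actual analysis): differences of zero are dropped, tied absolute differences receive average ranks, and the two-sided p-value counts sign assignments whose rank sum deviates from its mean at least as much as the observed one.

```java
import java.util.*;

// Exact two-sided Wilcoxon signed-rank p-value for small paired samples,
// computed by enumerating all 2^n sign assignments of the ranks.
class Wilcoxon {
    static double pValue(double[] a, double[] b) {
        List<double[]> pairs = new ArrayList<>();       // {|d|, sign(d)}
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            if (d != 0) pairs.add(new double[]{Math.abs(d), Math.signum(d)});
        }
        pairs.sort(Comparator.comparingDouble((double[] p) -> p[0]));
        int n = pairs.size();
        double[] rank = new double[n];
        for (int i = 0; i < n; ) {                      // average ranks for ties
            int j = i;
            while (j < n && pairs.get(j)[0] == pairs.get(i)[0]) j++;
            double avg = (i + 1 + j) / 2.0;             // mean of ranks i+1..j
            for (int k = i; k < j; k++) rank[k] = avg;
            i = j;
        }
        double wPlus = 0, totalRank = 0;                // W+ = rank sum of d > 0
        for (int i = 0; i < n; i++) {
            totalRank += rank[i];
            if (pairs.get(i)[1] > 0) wPlus += rank[i];
        }
        // Count sign assignments at least as extreme (two-sided) as observed.
        double dev = Math.abs(wPlus - totalRank / 2);
        int extreme = 0, total = 1 << n;
        for (int m = 0; m < total; m++) {
            double w = 0;
            for (int i = 0; i < n; i++)
                if ((m & (1 << i)) != 0) w += rank[i];
            if (Math.abs(w - totalRank / 2) >= dev) extreme++;
        }
        return (double) extreme / total;
    }
}
```

With all 10 differences of one sign, this exact procedure yields p = 2/1024 ≈ 0.002, which is consistent with the very small p-values one expects when a team wins every simulation.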

Table 5. Wilcoxon T test

The results obtained for each scenario are analysed in the following subsections.

Scenario 1. In the first scenario, the teams with more squads won most of the simulations against TG1 (the control team) and, given the p-values of the Wilcoxon T test, we can conclude that TG2 and TG4 are better than TG1; for TG3, however, we cannot conclude the same.

Fig. 5. Scenario 1: Final scores for TG1 vs TG3.

Figure 5 shows the final scores of the 10 simulations for the match TG1 vs TG3. Analysing the simulations where TG1 defeated TG3, we were able to identify why we obtained good results for TG2 against TG1, while the same did not occur when TG3 played against TG1. TG3 divides its agents into three squads to occupy three different zones in the map, while TG1 uses all its agents (apart from those used to attack the opponent and to explore the map) to try to conquer the best zone in the map. In this first scenario, there is a single large high-valued area in the bottom left of the map, which is easily conquered by TG1, since the number of TG3 agents fighting for the same zone is not enough to defeat the opponent (Fig. 6). Moreover, the combined score of the other two TG3 squads was lower than the score obtained by TG1 in the best zone.

Fig. 6. Scenario 1 - TG1 (blue) vs TG3 (green) (Color figure online).

Scenario 2. In contrast with the first scenario, the second one does not have a single large high-valued area, and all possible best zones have almost the same value, which favours the teams with more squads, as shown in Fig. 7.

Fig. 7. Scenario 2: TG1 (green) vs TG4 (blue) (Color figure online).

Scenario 3. The third scenario, like the first one, has a single large high-valued area, now located at the top of the map, but in this scenario TG2 did not perform as well as in the first one.

Figure 8 shows the results of the 10 simulations for the match TG1 vs TG2, where it is possible to see that TG2 narrowly lost two simulations to TG1.

Fig. 8. Scenario 3: Final scores for TG1 vs TG2.

Comparing the match TG1 vs TG2 in this scenario with the same match in the first one, we identified that in the third scenario TG2 sometimes fails to find the best zone in the map, since this zone is not as spread out as in the first scenario. In these cases, TG1 won when it was able to find the best zone and TG2 was not, as depicted in Fig. 9.

Fig. 9. Scenario 3: TG1 (blue) vs TG2 (green) (Color figure online).

5 Conclusions

The problem of determining an appropriate or best MAS organization for a given scenario is a key problem in MAS research, and empirical approaches can be very useful in this regard. Aiming to contribute to this issue, we presented an evaluation of different organizations over three distinct scenarios of the Multi-Agent Programming Contest case study. To validate our observations, a statistical test, the Wilcoxon T test, was used to detect differences in the performance of the organizations.

Although the results obtained by confronting the four LTI-USP teams suggest that TG4 is the best organizational choice, they are not conclusive, since the number of scenarios used in our evaluation was relatively small, and the scenario can greatly impact the performance of the team, as shown in Sect. 4.2.

Therefore, in future work we intend to increase the number of tested scenarios, and also to evaluate different structures of organizational models, changing not only the number of squads but also other parameters, for instance the number of agents in charge of attacking the opponents.

Another possibility is to use the results obtained in this study to develop a team capable of reorganizing itself according to the characteristics of the environment. As discussed in [22], reorganization is an important aspect of MAS, since the environment is usually not static. Therefore, a MAS should be able to modify its organization to adapt to changes in the environment.