1 Introduction

In 2022 we participated as the GOAL-DTU team in the annual Multi-Agent Programming Contest (MAPC). We use the GOAL agent programming language [1, 2, 3, 4] and we are affiliated with the Technical University of Denmark (DTU). We participated in the contest in 2009 and 2010 as the Jason-DTU team [5, 6], in 2011 and 2012 as the Python-DTU team [7, 8], in 2013 and 2014 as the GOAL-DTU team [9], in 2015/2016 as the Python-DTU team [10], in 2017 and 2018 as the Jason-DTU team [11, 12] and in 2019/2020/2021 as the GOAL-DTU team [13, 14].

The 2022 contest uses the Agents Assemble III scenario, which further expands upon the Agents Assemble scenario used in the 2019/2020/2021 contests. Several new features have been introduced on top of the previous iterations. One such addition is the possibility for agents to switch between different roles, each with its own strengths and weaknesses; one role might, for instance, be able to move faster but carry less load. Another addition is that goalzones are no longer static and can relocate during a simulation.

In this paper, we describe the strategy behind our implementation and evaluate our performance at the competition. The paper is organized as follows:

  • Section 2 covers the overall strategy of our agents and our implementation.

  • Section 3 evaluates our agents’ performance in the contest.

  • Section 4 reflects on the differences between the teams.

  • Section 5 makes some concluding remarks.

2 The Strategy and Implementation of Our Agents

Our system for the agent contest is implemented in the multi-agent programming language GOAL [1, 2, 3, 4] because GOAL has been our recent framework of choice for student projects as well as for several research collaborations.

In GOAL, each agent is an independent entity that can communicate with other agents through an extensive messaging system and perform actions to interact with the environment. These actions are scenario-specific and defined for the particular environment that the agents operate in. The agent keeps both predefined and learned knowledge of the environment in a Prolog knowledge base, which constitutes the mental state of the agent. The agent then continually perceives its environment and updates its mental state based on the changes made to the environment. These changes could be the result of an action performed by the agent itself, by some other entity, or by what we might call nature (i.e. changes that are not the result of any entity acting in the environment).

The agents operate by simple rule-based decision-making. Predefined rules determine the next action of the agent based on its current mental state and the state of the environment, thereby enabling the agents to react to changes in their environment.
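
To make this concrete, a minimal sketch of such a rule in the Prolog knowledge representation used by GOAL could look as follows. The predicate names (phase/1, best_explore_direction/1, needed_block/1, adjacent_dispenser/2) are hypothetical and chosen for illustration only; they are not taken from our actual agent code.

    % Illustrative sketch of rule-based action selection.
    % All predicate names below are hypothetical.
    next_action(move(Dir)) :-
        phase(discovery),                 % still exploring the map
        best_explore_direction(Dir).
    next_action(request(Dir)) :-
        needed_block(Type),               % the current task needs this block type
        adjacent_dispenser(Dir, Type).    % and a matching dispenser is adjacent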

2.1 Discovery

We employ different strategies based on the state of the simulation. Initially, in what we call the discovery phase, the agents are only tasked with exploring the map and locating dispensers, goalzones, rolezones, and each other. This is necessary because an agent only perceives objects and entities within its (very limited) field of vision, and all agents start out knowing nothing about the environment they are placed in. Agents then store the locations of the resources they discover relative to their starting position, which means that all agents will have different coordinate systems. When two (or more) agents meet, they therefore try to establish a connection by learning each other's origins, which enables them to translate efficiently between their respective coordinate systems and thus communicate about specific resources.
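
To illustrate the coordinate translation, the following Prolog fragment shows one way it could be expressed, assuming a hypothetical fact origin_offset/3 that records where another agent's origin lies in our own coordinate system; this is a sketch, not our actual implementation.

    % origin_offset(Agent, DX, DY): the position of Agent's origin expressed
    % in our own coordinates (learned when the connection is established).
    % translate/3 converts a position reported by Agent into our coordinates.
    translate(Agent, pos(X, Y), pos(MyX, MyY)) :-
        origin_offset(Agent, DX, DY),
        MyX is X + DX,
        MyY is Y + DY.

For example, if agent b's origin lies at (10, -3) in our coordinate system, a dispenser that b reports at (2, 5) is located at (12, 2) in ours.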

As agents do not perceive the identities of other agents that they encounter, the agents must instead verify their identities by comparing the perceivable objects in their shared field of vision. Thus, all agents continuously broadcast their immediate surroundings to all other agents during the initial discovery phase. Agents then check for broadcasts that match their own surroundings (including the location of the agents themselves), and if there are no other identical broadcasts, they can safely establish a connection to the other broadcasting agent. To optimize the process, agents will share their knowledge of the coordinate systems of all the other agents they have already connected to when establishing a new connection.
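
The key requirement is uniqueness: an agent should only connect to a sender whose broadcast matches its own surroundings and no one else's. A hedged sketch of such a check, again with purely illustrative predicate names, could be:

    % broadcast(Agent, View): surroundings reported by Agent in this step.
    % matches_view/2 compares two local views, including the agents' positions.
    % The connection is only safe if exactly one broadcast matches our view.
    can_connect(Sender) :-
        my_surroundings(MyView),
        findall(A,
                ( broadcast(A, View),
                  matches_view(MyView, View) ),
                [Sender]).    % exactly one matching broadcaster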

2.2 Task Plans

Once enough agents have connected to one another (we, somewhat arbitrarily, chose this to be half of the agents), a single agent is chosen to become the task master.

The task master is responsible for creating plans to solve available tasks in order to score points. The task master ranks the available tasks and tries to solve one specific task at a time. The first step in creating a plan is to request information relevant to the task at hand from all other agents. This includes their current attachments and their knowledge of the available resources necessary to complete the task. The other agents will then respond with their distance to each type of resource and an estimated delivery time.

The task master then collects all responses and checks whether it is feasible to solve the task by attempting to create a task plan. A task plan is a detailed description delegating each block of the pattern to a specific agent, specifying a point of assembly at which the pattern is to be assembled. If the task master is unsuccessful in creating a task plan for the task at hand, the task master will restart the process with an alternative task. On the other hand, if the task master successfully creates a task plan, the plan is sent to each agent that is to take part in executing the plan. Each of the agents receives only part of the plan, namely the steps that the agent is responsible for. An agent will only be assigned to deliver blocks of a single type, and from the task plan, the agent is able to infer exactly where it must position itself to deliver the blocks it is responsible for.
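
As a rough sketch of this feasibility check (the predicate names and the plan representation below are illustrative and not our actual code), the task master could try to delegate every block of the pattern to an agent whose reported delivery estimate fits the task deadline, while respecting that each agent is only assigned a single block type:

    % task_requirements(Task, Blocks): Blocks = [block(Type, RelPos), ...]
    % delivery_estimate(Agent, Type, Steps): as reported by the other agents.
    feasible_plan(Task, Plan) :-
        task_requirements(Task, Blocks),
        task_deadline(Task, Deadline),
        assign_blocks(Blocks, Deadline, [], Plan).

    assign_blocks([], _, _, []).
    assign_blocks([block(Type, RelPos) | Rest], Deadline, Used,
                  [assign(Agent, Type, RelPos) | Plan]) :-
        delivery_estimate(Agent, Type, Steps),
        Steps =< Deadline,
        % an agent may only be assigned blocks of a single type
        \+ ( member(Agent-Other, Used), Other \= Type ),
        assign_blocks(Rest, Deadline, [Agent-Type | Used], Plan).

If no assignment satisfies all constraints, feasible_plan/2 simply fails, which corresponds to the task master moving on to an alternative task.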

2.3 Solving Tasks

To solve a given task, an agent must first collect the necessary block types, and then move to the point of assembly. In order to efficiently solve tasks, the movement of the agents must be direct and fast. However, because the environment is extremely dynamic, the agents do not maintain an up-to-date representation of the map, and thus we cannot use simple path-finding algorithms without modifications.

Our approach is to mix a heuristics-based approach to global path-finding with optimal local (i.e. within the field of vision) path-finding using A*. The heuristics-based approach naively selects moves that bring the agent closer to its destination for as long as possible. Once this simple approach is no longer possible, i.e. when something might be blocking the path, the agent chooses a way point among the locations within its field of vision. This way point is chosen to be the location within the agent's field of vision that is closest to the final destination. The agent then uses A* to find the optimal path to the way point within its field of vision, and once the agent reaches its way point, the process starts over. This very simple approach to path-finding is only made feasible by the availability of the clear action. The possibility to clear any blocks that are in the way of the agent means that it will not get stuck in dead ends; the agent can always clear its way out. There is of course an overhead cost attached to heavy utilization of the clear action, but we think that it is a relatively small price to pay in order to achieve such simple path-finding logic.
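
The way-point selection itself is straightforward; a minimal Prolog sketch, with hypothetical predicates for the visible cells and the distance measure, could look as follows. A* is then only run on the small, fully observed area between the agent and the chosen way point.

    % visible_free_cell(Cell): a passable cell within the field of vision.
    % The way point is the visible cell closest to the final destination.
    next_waypoint(Dest, Waypoint) :-
        findall(D-Cell,
                ( visible_free_cell(Cell),
                  manhattan_distance(Cell, Dest, D) ),
                Pairs),
        keysort(Pairs, [_-Waypoint | _]).

    manhattan_distance(pos(X1, Y1), pos(X2, Y2), D) :-
        D is abs(X1 - X2) + abs(Y1 - Y2).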

When the agent reaches the point of assembly for the task plan, it will position itself relative to the agent submitting the pattern, which will locate itself exactly at the predetermined point of assembly. This means that assembling the pattern is fast and easy, but also extremely inflexible. The agents know exactly where to position themselves in advance, but if the predetermined point of assembly is blocked and cannot be cleared, the agents must find another place to assemble the pattern.

If, at any point during the execution of the task plan, an agent realizes that its individual plan is either infeasible or cannot be performed within the agreed-upon time frame, the agent will try to replan. As long as the agent is able to find a plan that can be performed within the time frame, there is no need to inform the group of the change of plans. Otherwise, the agent will inform the group that the task is no longer solvable and invoke the task master anew. Likewise, once the agreed-upon deadline for solving the task is reached, the default behavior for the agents is to simply drop the task. This ensures that agents will not be stuck waiting for a failed agent.
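
The deadline check that triggers dropping a task can be expressed as a single rule; the predicates below are again purely illustrative:

    % assigned(Task, Deadline): part of the plan received from the task master.
    % Once the agreed deadline has passed, the agent abandons the task rather
    % than wait indefinitely for agents that may have failed.
    should_drop_task(Task) :-
        assigned(Task, Deadline),
        current_step(Step),
        Step > Deadline.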

3 Evaluation of Matches

In this section, we evaluate all of our matches and highlight what went right and what went wrong. Each match consisted of three simulations, and each simulation started automatically when the previous one finished.

3.1 GOAL-DTU VS GOALdigger

SIM1 (0 : 180)

Clearly, something goes wrong for our team of agents. Around step 200, three agents seem to attempt to assemble the pattern for task 4, but it is clear that they do not agree on the exact coordinates for the point of assembly. Perhaps something went wrong with inferring map dimensions, leading to a mismatch in coordinates. This inference should have been disabled at the contest, due to exactly the concern that it might lead to such a mismatch. Because the simulations were run by another student on his local PC, there is, however, a chance that it was accidentally left enabled.

Later in the simulation, there is a long period of time where the agents do not even try to assemble a pattern. This could be further proof that the inferred coordinates are far off, leaving the task master unable to create any feasible plans.

Compared to our team, we see that GOALdigger performs significantly better. It appears that the strategy of the GOALdigger team is to focus on the simple tasks—all of the 180 points that GOALdigger scores in this simulation are scored by submitting single-block patterns. However, we also see what seems to be an (unsuccessful) attempt to submit a more complex task, see agent 17 from step 160 and onward.

SIM2 (0 : 370)

In general, agents seem to move rather well even in maps like this one that are mostly blocked; the agents create a path forward by clearing blocks and finding small passages. However, to highlight an instance where this is not the case, consider agent 19 at step 320. Evidently, the “explore score” that prioritizes different moves according to a number of heuristics is not properly tuned. The agent simply rotates in place for many turns before finally finding another path.

Otherwise, this simulation is very similar to simulation 1. We notice the same exact problem; namely that agents do not even bother trying to assemble any patterns, probably due to a mismatch in coordinate systems.

The strategy of GOALdigger also seems to be exactly the same as in the previous simulation. A clear majority of the points are attained by submitting single-block patterns; the GOALdigger team submits an impressive 31 tasks, 29 of which are single-block patterns, and the remaining two patterns are made up of two blocks.

SIM3 (520 : 410)

Unlike the previous two simulations, we actually succeed in submitting tasks and scoring points. In fact, we end up winning the simulation. We note that most of our points are due to two submissions of the same complex pattern. This showcases some of the beneficial aspects of our simple strategy—the agents are very efficient at building complex patterns. However, this is only the case under optimal conditions.

We notice several occasions during the simulation where specific goalzones get extremely crowded—especially the GOALdigger agents have a tendency to cluster. Crowded goalzones are a known weakness of our simple strategy, as they make it difficult for our agents to successfully submit patterns. As an example (where the crowding is actually quite limited, but our agents still fail), consider agent 24 around step 360. The agent does not succeed in performing any action for a large number of turns, and instead simply times out. This could be due to problems with the path-finding algorithm. While observing agent 24, we also notice that two different groups of our agents seem to be trying to assemble the same pattern at the same place, and they actually end up being in the way of one another.

As another low-hanging fruit for improving our system, we see that our agents try to submit patterns in old goalzones. These old goalzones should simply have been removed from the knowledge base once an agent notices that they are no longer present. As an example, observe agent 18 from step 650 and onwards. Clearly, the agents are trying to submit task 12, but the goalzone disappeared after step 361.

Finally, we notice that the GOALdigger team succeeds in submitting several two-block patterns; a lot more than in the previous simulations.

3.2 GOAL-DTU VS LI(A)RA

SIM1 (120 : 310)

We manage to submit a number of tasks, but we see a clear indication that our strategy is failing. Consider for example agents 4, 5, and 20 around step 225. It seems that their goal is to cooperate to build and submit task 5, but some slight error in the agreed-upon coordinates is stopping them from accomplishing this goal. The same scenario plays out just 45 steps later (at step 270), where two of the same agents, namely agents 4 and 20, fail at assembling task 6 due to the very same error. Fixing this miscalculation of relative coordinates should be a top priority, but it also showcases another important problem—our agents ought to be more flexible. It should be easy for the agents to recognize this error and then dynamically agree on a new point of assembly.

We see that the LI(A)RA team submits a large number of tasks compared to us, and seems to do so quite efficiently. Interestingly, however, the LI(A)RA agents fail to submit any patterns for more than 150 steps (between steps 133 and 289), but it is not obvious why.

SIM2 (160 : 80)

Again, our agents fare reasonably well in a map with a lot of obstacles. Consider for example agents 8 and 18 from step 230 and onwards. Both agents must clear their way through obstacles to complete the task, and they succeed in doing so in a quite efficient manner. However, agent 18 does perform some unnecessary clearing operations (where the agent does not even use the cleared space), and thus there is still room for improvement.

We also observe the same errors when trying to assemble patterns as in the previous simulation. From step 260, it seems evident that agents 5 and 9 wish to complete task 5. But despite being present in the goalzone with the necessary blocks, and getting within a single rotation of being able to complete the pattern, they never succeed in actually doing so. Here, our agents should be able to recognize the mistake in the agreed-upon plan and then realize that agent 5 simply needs to perform a rotation in order to successfully complete the pattern.

SIM3 (0 : 270)

Our agents fail completely in this simulation. They do not score a single point, nor do they even try to assemble or submit any task. Again, we think this is because of an error in inferring map dimensions, which can completely paralyze the agents.

3.3 GOAL-DTU VS FIT BUT

SIM1 (230 : 0)

We fare decently well in this simulation and end up winning it. However, we still see some of the problems touched upon in the previous matches. Further, it is very noticeable in this simulation that agents cannot decide on any tasks themselves—instead, only the task master can delegate task plans. For simple single-block patterns this approach is extremely inefficient, and we see several cases where agents could have completed tasks by themselves. As an example, consider agent 11 around step 90. The agent seems to be waiting for two other agents in order to complete task 3. However, the other agents fail to show up, and therefore agent 11 must drop the task. The agent actually fulfills all the requirements for task 5 but simply wanders out of the goalzone with its block instead of submitting the task. This shows that our agents need more autonomy.

We also observe that once an agent is in place and waiting for the remaining agents to complete a pattern, it completely ignores the threat of clear events. Consider agent 3 around step 160. The agent is oblivious to the clear event and does not get out of the way, despite having to move only a single cell (or maybe two) to get out of the radius of the event. Agents should always try to avoid clear events, especially when they are about to complete a task.

SIM2 (630 : 670)

We perform fairly well in this simulation, but with the same problems of failing to complete task patterns. Consider for example agents 6, 8, 14, and 15 around step 470. The agents have built most of the complex pattern, but agent 8 fails to deliver the last remaining block, just two rotations (or moves) away from completing the pattern. However, the reason seems to be the path-finding algorithm failing or timing out, not a disagreement in coordinates. This follows because agent 8 does not skip its turns (as an agent would do once it is in place) but instead fails to perform any action. Completing this pattern would have meant winning the simulation.

We also notice that, again, the goalzones tend to get very crowded and that our agents do not deal well with this.

SIM3 (0 : 1640)

Once again, a complete meltdown of our system. The situation is identical to the previous simulations in which we do not score a single point.

However, we would like to take the opportunity to praise FIT BUT and their active use of roles—in this simulation, it really shows how they benefit from using the diggers to remove obstacles from the playing field in order to gain easy access to goalzones.

3.4 GOAL-DTU VS MMD

SIM1 (480 : 910)

We do fairly well, but are not nearly efficient enough. We observe several cases where we begin assembling a pattern but are not able to finish it in time. Further, we notice the large contrast to MMD, which always has many agents building patterns in the same goalzone—we do not utilize the large goalzones well enough. We could easily have more agents working in the same goalzones at the same time.

SIM2 (150 : 780)

We see a lot of cases of agents trying to submit tasks in nonexistent (old) goalzones. This is a severe waste of resources and leads to a simulation where we score very few points.

SIM3 (0 : 1520)

Yet another complete meltdown; see the descriptions of the previous simulations where we score 0 points.

4 Discussion

In this section, we retrospectively discuss and review some assumptions of our strategy, and then we take a brief look at the different teams that took part in this year’s contest.

4.1 Strategy

Our strategy is very much developed with the specific scenario in mind (the Agents Assemble scenario), both with regard to the actual problem to solve and with regard to the specific parameters of the scenario. In particular, our implementation is based upon an implicit assumption that the number of agents in a map is fairly limited—in the history of this scenario, the maximum number of agents per team has been no more than 50. This allows us to take a centralized approach to task planning which, if the number of agents were to increase by a few orders of magnitude, would quickly prove to be inefficient.

Our centralized approach is also built upon an implicit assumption about the actual layout of maps; in order to select the task master, at least half of the agents must be connected. In our experience, this rarely takes more than 60 rounds, but in a very large map it might not happen within the duration of a single simulation.

It would be interesting to introduce simulations into the contest that challenge these kinds of implicit assumptions, which would force competitors to create more general systems. One possible approach could be to have each competing team design a simulation for the contest (in addition to those provided by the organizers), which we expect would lead to less similarity between simulations.

4.2 Teams

Team        Members  Time               Lines  Platform     Returning
FIT BUT     3        6 man-weeks        12000  Java(+JADE)  Yes
GOAL-DTU    3        30 h               2000   GOAL         Yes
GOALdigger  4        1200 h prog.       10000  GOAL         No
LI(A)RA     5        80 h prog. + 40 h  1100   Jason        No
MMD         2        896 h              5407   Python       No

The table shows that the time each team spent preparing for the competition varies a lot: the hours spent differ by as much as a factor of 40. However, it is also important to note that our team spent the least time preparing for this year's competition but is a returning team. Thus, almost our entire codebase is reused from last year, with a small number of updates. Likewise, the number of lines of code varies quite a lot between teams. We notice that the teams that use dedicated multi-agent programming languages do not necessarily need fewer lines of code; both the two teams with the fewest lines of code and the two teams with the most lines of code use such dedicated languages. Even within the same dedicated language, GOAL, the number of lines of code differs by a factor of 5 between the two teams that use it.

5 Conclusion

During the contest, we observed some problems concerning robustness due to false information. When the problem did not occur, we generally achieved competitive scores. If an agent somehow incorrectly updates its location, this can lead to agents agreeing on incorrect dimensions of the map, and as a result the agents will be rendered useless. While this suggests that the system is not robust enough, one could also argue that, once false information is introduced into the system, we cannot expect coherent behavior.

Another observation is that, especially in larger maps with lots of agents, we have too many idle agents. Once the map is explored and an agent has connected two blocks, the agent will simply roam the map waiting to be assigned a task. Instead, the idle agents could be put to better use by e.g. protecting goalzones.

In conclusion, we are satisfied with our placement in the contest and with the improvements we have made to the system compared to last year, but at the same time, we have discussed several ways in which the system could be improved. For the future, we suggest that the agents of all teams be executed on the same infrastructure, provided by the organisers.