1 Introduction

A high-level aim of the three-year EU FP7 euRathlon project is to help speed up progress towards practical, usable, real-world intelligent autonomous robots through competitions. Toward this aim, euRathlon has created real-world robotics challenges for outdoor robots in demanding emergency-response scenarios.

The euRathlon competitions aim to test the intelligence and autonomy of outdoor robots in demanding mock disaster-response scenarios inspired by the 2011 Fukushima accident. Focused on multi-domain cooperation, the 2015 euRathlon competition required flying, land and marine robots to act together to survey the disaster, collect environmental data, and identify critical hazards. The first (land) competition was held in 2013 in Berchtesgaden, Germany [1]. In September 2014, the second (sea) competition was held in La Spezia, Italy [2, 3]. The final euRathlon Grand Challenge (air, land and sea) was held in Piombino, Italy, from 17 to 25 September 2015.

This paper proceeds as follows. In Sect. 2 we outline the Grand Challenge concept; then, in Sect. 3, we describe the location chosen for euRathlon 2015 and how the requirements of the Grand Challenge map to the physical environment. In Sect. 4 we outline the benchmarking and scoring schema developed for euRathlon 2015. The paper concludes in Sect. 5 by evaluating first the competition itself, including lessons learned, and then the performance of the teams in rising to the Grand Challenge.

2 The Grand Challenge

Inspired by the 2011 Fukushima accident and the subsequent efforts to use robots to assess internal damage to the NPP buildings [4], we sought to develop a scenario which would, in some respects at least, provide teams with a comparable challenge. Clearly there were aspects that we could not replicate, in particular the radiological environment or chemical hazards, but we were able to offer significant challenges to radio communication. Other challenges included the weather, which reduced underwater visibility to less than 1 m, the rough terrain for land robots, and obstructed access routes inside the building.

Fig. 1. Concept diagram for the euRathlon 2015 Grand Challenge scenario

Figure 1 shows the concept diagram for the Grand Challenge scenario. The key physical elements of the scenario are (1) a building on a shoreline which can act in the role of the ‘reactor’ building, with an internal mock ‘machine room’, (2) valves (stopcocks) in the machine room connected to pipes which lead out of the building and into the sea, with corresponding underwater valves, (3) damage or debris blocking paths or entrances outside or inside the building, (4) damage to the pipes and (5) missing workers. The Grand Challenge scenario comprised three mission objectives – outlined as follows.

  • Mission A: Search for missing workers. Robots must search for two missing workers represented by mannequins dressed in orange suits, which could be inside the building, outside the building, floating on the sea surface near the coast, or trapped underwater. Teams received bonus points if a worker was found during the first 30 min of the Grand Challenge, because in a real scenario the probability of finding a missing person alive decreases rapidly with time.

  • Mission B: Reconnaissance and environmental survey of a building. Robots must inspect a building to evaluate damage (represented by markers) and find a safe path to a machine room where valves were located. This required robots to survey the area, create a map of the building and the outdoor area surrounding it, and locate objects of potential interest (OPIs) in order to provide situational awareness to the team.

  • Mission C: Pipe inspection and stemming a leak. Robots must localize four pipe sections on land, localize another four matching pipes underwater, look for damage to the land pipes and identify a contaminant leak (represented by a marker), reach the valves in the machine room and underwater, and close the two corresponding valves in synchrony.

In the published scenario description (Footnote 1) we made it clear that the missions could be undertaken in any order, or in parallel. The Grand Challenge would be successfully met if all three mission objectives were met within 100 min, but importantly we did not specify how the challenge should be met, or with what robots (only limiting their number and kind).

3 Torre del Sale - the Competition Site

Securing a location for euRathlon 2015 was challenging given the requirements. We needed a suitable building on a shoreline and surrounding areas with safe access for land and flying robots, a safe shallow sea for marine robots, and sufficient space for team preparation, organisers and spectators. Equally important, we needed all of the necessary permissions to operate land, sea and air robots: for marine robots from the Port Authority and for flying robots from the Italian Civil Aviation Authority (ENAC).

Fig. 2. The Torre del Sale, with the ENEL power plant in the background, and beach to the right

Fig. 3. Competition site, with the Torre del Sale at the left. Image: Google Earth

The venue selected was an area in front of the ENEL (Italian National Company for Electricity) electrical power plant in Piombino, Italy. The location offered all the areas needed for the robots and space for hosting participants and the public, and also provided a credible industrial context as a background for the competition. Permission was obtained from the State Property Authority to make use of a disused historical building on the sea shore, the Torre del Sale, as the mock reactor building, with an internal room playing the part of the machine room. Figure 2 shows the Torre del Sale building, and Fig. 3 shows a satellite image of the competition site, with the outdoor land, air and sea robot areas indicated.

4 Benchmarking and Scoring

Inspired by and adapted from the benchmarking approach of the RoCKIn Challenge [5], we developed a system-level benchmark (the Task Benchmark) and a module-level benchmark (the Functionality Benchmark) for euRathlon 2015. The Task Benchmark evaluates the performance of the integrated robot systems, while the Functionality Benchmark evaluates the performance of a specific module or functionality of the robot systems. Evaluating only the performance of integrated systems does not necessarily show how the individual modules contribute to global performance, or which aspects of a module need to be improved. On the other hand, good performance at module level does not guarantee that a system integrating a set of well-performing individual modules will itself perform well.

Focusing on module-level evaluations alone is also not sufficient to determine which robot system can achieve a specific task. Combining both system-level and module-level benchmarking enables us to perform a deeper analysis and gain useful insights about the performance, advantages and limitations of the whole robot system.

4.1 Matrix Approach to Task and Functionality Benchmarking

As discussed above, to perform a specific task, with its set of goals to be reached, a robot needs to execute a set of functionalities. The Functionality and Task Benchmarks can be represented in matrix form as in Fig. 4.

Fig. 4. Task (Vertical) and Functionality (Horizontal) Benchmarking illustration. Source: RoCKIn

Each task requires the effective implementation of several functionalities to be achieved successfully. Each functionality can be evaluated across different tasks or domains (e.g. Robot Navigation in both Land and Sea domains: indoor/outdoor/underwater navigation).

As illustrated in Fig. 4, suppose that for the competition we define N tasks (\(T_1, T_2, \ldots, T_N\)), which correspond to the columns (vertical), and M functionalities (\(F_1, F_2, \ldots, F_M\)), which correspond to the rows (horizontal). We then have N Task Benchmarks (\(TB_1, TB_2, \ldots, TB_N\)) and S Functionality Benchmarks, where \(S \le M\). Because we benchmark every task, there are as many Task Benchmarks as defined tasks. In some cases it is not necessary to evaluate each functionality in a task separately: for instance, Obstacle Avoidance is an essential functionality of a robot but can be considered part of the Navigation functionality, so one Functionality Benchmark can evaluate more than one functionality at the same time. This is shown as Functionality Benchmark \(FB_i\) in Fig. 4.
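The matrix view can be made concrete with a small sketch. The task names, functionality names and the grouping of functionalities into benchmarks below are hypothetical and chosen only for illustration; they are not the euRathlon 2015 definitions.

```python
# Hypothetical task and functionality names, for illustration only.
tasks = ["T1", "T2", "T3"]
functionalities = ["Navigation", "ObstacleAvoidance", "ObjectRecognition"]

# required[f] is the set of tasks in which functionality f is exercised
# (one row of the matrix). These entries are assumptions.
required = {
    "Navigation": {"T1", "T2", "T3"},
    "ObstacleAvoidance": {"T1", "T3"},
    "ObjectRecognition": {"T2", "T3"},
}

# A Functionality Benchmark may cover several functionalities,
# e.g. obstacle avoidance evaluated as part of navigation.
functionality_benchmarks = {
    "FB_navigation": ["Navigation", "ObstacleAvoidance"],
    "FB_recognition": ["ObjectRecognition"],
}


def task_benchmarks(task_names):
    """One Task Benchmark per task (one per column of the matrix)."""
    return {f"TB_{t}": t for t in task_names}


def benchmark_coverage(benchmarks, required_in):
    """Tasks in which each Functionality Benchmark can be evaluated."""
    return {
        name: set().union(*(required_in[f] for f in funcs))
        for name, funcs in benchmarks.items()
    }


tb = task_benchmarks(tasks)
coverage = benchmark_coverage(functionality_benchmarks, required)

# N Task Benchmarks, and S <= M Functionality Benchmarks.
assert len(tb) == len(tasks)
assert len(functionality_benchmarks) <= len(functionalities)
```

Grouping functionalities under one benchmark is what makes S smaller than M in this sketch: two benchmarks cover three functionalities.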

The Task benchmarks were used directly to score the competition results.

4.2 Functionality-Task Mapping for 2015 Scenarios

For the euRathlon 2015 competition, 10 scenarios across 3 domains (Land, Air and Sea) were defined. The 10 scenarios are categorised as Trials, with 2 scenarios in each single domain (L1, L2, S1, S2, A1 and A2 in Fig. 5); Sub-Challenges, with 3 scenarios each combining two domains (L+A, S+A and L+S); and the Grand Challenge (GC), with 3 missions across all three domains. The purpose of the trials and sub-challenges was, firstly, to provide teams with practice in the competition environment and, secondly, to provide judged events for single- or two-domain teams. Thus there were in total 10 tasks corresponding to the 10 scenarios for euRathlon 2015. We also identified 4 functionalities to be benchmarked, as shown in Fig. 5.

Fig. 5. Matrix representation of the set of tasks and functionalities in euRathlon 2015. The /Domain indicates in which domains (Land, Air, Sea) the Functionalities are involved.
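As a cross-check on the count of scenarios, the following minimal sketch enumerates them from the three domains. The function name and the tuple layout are illustrative assumptions, and the order of letters in the two-domain labels simply follows the order of the domain list rather than the labels used in Fig. 5.

```python
from itertools import combinations

DOMAINS = ["L", "A", "S"]  # Land, Air, Sea


def build_scenarios(domains=DOMAINS, trials_per_domain=2):
    """Return (label, category, domains) for every scenario."""
    scenarios = []
    # Single-domain trials: two per domain.
    for d in domains:
        for i in range(1, trials_per_domain + 1):
            scenarios.append((f"{d}{i}", "trial", (d,)))
    # Two-domain sub-challenges: one per pair of domains.
    for pair in combinations(domains, 2):
        scenarios.append(("+".join(pair), "sub-challenge", pair))
    # The Grand Challenge spans all domains.
    scenarios.append(("GC", "grand challenge", tuple(domains)))
    return scenarios


scenarios = build_scenarios()
labels = [label for label, _, _ in scenarios]
```

With three domains this yields 6 trials, 3 sub-challenges and 1 Grand Challenge, matching the 10 tasks described above.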

A set of ten detailed judging sheets (one per scenario) was devised, covering each single-domain trial, each two-domain sub-challenge and the Grand Challenge, together with guidelines for judges. Data obtained directly by judges observing each event, combined with analysis of data provided post-event by teams in standardised formats, provided the basis for both benchmarking and scoring.

The full benchmarking for tasks and functionalities is described in the document D3.2 “Benchmarks Evaluation (Part 2: Benchmarking and scoring for euRathlon 2015)” (Footnote 2).

5 Evaluation

5.1 The Competition

A total of 21 teams registered for euRathlon 2015 and, of these, 18 progressed successfully through the qualification process. Of those 18, two withdrew one week before the competition for different reasons; both teams did however attend euRathlon 2015 as visitors.

Table 1. Teams with country of origin and domains of participation

The 16 teams that participated in euRathlon 2015 are detailed in Table 1. They comprised a total of 134 team members from 10 countries with approximately 40 robots. A group photo is shown in Fig. 6. As shown in Table 1 there were 9 single-domain teams, 2 two-domain teams and 3 three-domain teams. Through a team matching process we actively encouraged single- and two-domain teams to form combined air, land and sea teams. This process resulted in 3 new matched teams to complement the existing 3 multi-domain teams. Thus, of the 16 teams at euRathlon 2015, 10 were able to compete in the Grand Challenge scenario, as shown in Table 2.

Fig. 6. Group photo of euRathlon 2015 participants

Table 2. Teams participating in the Grand Challenge, showing domains (L=Land, A=Air, S=Sea)

The competition took place over 9 days. The first three days were for practice, followed by 2 days for single-domain trials, 2 days for two-domain sub-challenges, and the Grand Challenge in the final two days. Including single-domain trials, sub-challenges and the Grand Challenge, a total of 48 runs were judged. It should be noted that the positions of missing workers, leaks, blocked routes and OPIs were randomised between GC runs, and at no time during the competition were teams allowed access into the Torre del Sale building or the machine room.

In parallel with the competition was a public programme, including evening lectures and public demonstrations in the Piombino city centre and at the competition site. Notably the programme included demonstrations from two finalists, including the overall winner, of the DARPA Robotics Challenge (DRC). A total of approximately 1200 visitors attended the competition and its public events, including several organised parties of school children, families and VIPs.

The logistics and local organisation work of euRathlon 2015 was considerable. The event was staffed by 78 people in total, including the organising staff, judging team, technical and safety team (including divers and safety pilots), media and film crew, stewards and volunteers; the judging team comprised 16 judges (12 from Europe and 4 from the USA). Despite the considerable challenges the event ran smoothly and – most importantly given the risks inherent in an outdoor robotics event – safely.

Fig. 7. Grand Challenge scores and ranking

5.2 Grand Challenge Results

Using the methodology outlined in Sect. 4, the judges were able to assess the performance of the 6 Grand Challenge teams. As summarised in Fig. 7, scores were derived from 5 components: task achievements, optional task achievements, autonomy class, penalties and key penalties. A number of the task achievements were scored on the basis of judges witnessing an event, such as ‘robot reaches the unobstructed entrance of the building’ or ‘robot enters the machine room’; others were scored following analysis of data supplied by teams after the run had been completed, such as map data or OPIs found. Optional achievements were bonus points awarded if, for instance, teams found both missing workers within 30 min, robots transmitted live video/image data during the run, or robots cooperated directly across domains. The autonomy class was judged on the basis of observing teams, with 1 point awarded for full autonomy, 0.5 for semi-autonomous operation and 0 for tele-operation. Penalties were applied for each manual intervention per achievement, and key penalties for mission-critical errors such as closing the wrong valve.
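To illustrate how the five components might combine into a single score, the sketch below uses the autonomy class points stated above (1, 0.5 and 0). Everything else is an assumption made for illustration: the value of one point per achievement, the penalty weights, the way the terms are summed, and all names are hypothetical and are not taken from the euRathlon judging sheets.

```python
from dataclasses import dataclass

# Autonomy class points as described in the text.
AUTONOMY_POINTS = {
    "autonomous": 1.0,
    "semi-autonomous": 0.5,
    "tele-operated": 0.0,
}


@dataclass
class Achievement:
    name: str
    achieved: bool
    autonomy: str = "tele-operated"
    interventions: int = 0  # manual interventions during this achievement


def score_run(achievements, optional_points=0.0, key_penalties=0,
              intervention_penalty=0.5, key_penalty=2.0):
    """Combine the five components; all weights here are hypothetical."""
    total = optional_points
    for a in achievements:
        if not a.achieved:
            continue
        # Assumed: 1 point per achievement plus its autonomy class points.
        total += 1.0 + AUTONOMY_POINTS[a.autonomy]
        total -= intervention_penalty * a.interventions
    total -= key_penalty * key_penalties
    return max(total, 0.0)  # assumed floor at zero


run = [
    Achievement("reach building entrance", True, "autonomous"),
    Achievement("enter machine room", True, "semi-autonomous", interventions=1),
    Achievement("close valves in synchrony", False),
]
example_score = score_run(run, optional_points=1.0, key_penalties=1)
```

Keeping penalties per achievement, as in this sketch, mirrors the description of penalties being marked for each manual intervention per achievement, while key penalties are applied once per mission-critical error.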

Fig. 8. Functionality benchmarks for the Grand Challenge

Within two hours of completion of the Grand Challenge, teams were required to provide vehicle navigation data, mission status data, map information and object recognition information, all using the Keyhole Markup Language (KML) format. This allowed judges to load KML files into Google Earth for evaluation (Footnote 3).
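As an illustration of the kind of file involved, the following minimal sketch writes object positions as KML Placemarks using only the Python standard library. The exact KML schema required of teams is not reproduced here; the function name, the placemark name and the coordinates are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def opi_to_kml(opis):
    """Serialise (name, lon, lat) tuples as KML Placemarks."""
    # Emit the KML namespace as the default namespace (no prefix).
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lon, lat in opis:
        pm = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(pm, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(pm, f"{{{KML_NS}}}Point")
        # KML orders coordinates as longitude,latitude[,altitude].
        coords = ET.SubElement(point, f"{{{KML_NS}}}coordinates")
        coords.text = f"{lon},{lat},0"
    return ET.tostring(kml, encoding="unicode")


# Placeholder position, not a real competition data point.
kml_text = opi_to_kml([("OPI-1", 10.6, 42.95)])
```

A file written this way can be opened directly in Google Earth, which is the evaluation route described above.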

Figure 8 shows the functionality benchmarks for the Grand Challenge. Of the functionality benchmarks proposed in Fig. 5, we were unable to evaluate obstacle avoidance and object manipulation because of insufficient data. However, we had good data to compare mapping in all three domains, and object recognition (finding OPIs). In Fig. 8, 1.0 is a perfect score. It is notable that overall winners Cobham+ISEP+UDG achieved 1.0 for finding OPIs and 0.95 for indoor and underwater maps; their weakness was outdoor mapping, at 0.56. Team ICARUS, by contrast, achieved a perfect score for the outdoor map but failed to produce an indoor map. Team Bebot+TomKyle produced a perfect indoor map and was very successful in finding OPIs (0.87), but did not produce an underwater map. As an example, Fig. 9 shows the outdoor map generated by team ICARUS by fusing the data from air and land robots.

Fig. 9. The fused map obtained by the ground and aerial vehicles of the ICARUS team during the Grand Challenge. Credits: team ICARUS.

Fig. 10. Overall winners of the euRathlon 2015 Grand Challenge: ISEP/INESC TEC (Air), Team Cobham (Land) and Universitat de Girona (Sea)

In euRathlon, because of the unstructured nature of the environment and changes in conditions between events, the benchmarks are relatively coarse. However, our Benchmarking and Scoring methodology proved to be very successful in allowing a thorough and transparent evaluation of the performance of teams during the euRathlon 2015 competition. Perhaps the best indicator of the success of the approach was the fact that teams were clearly differentiated in both task and functionality benchmarks; notably no scores were appealed. The detailed scores exposed strengths and weaknesses, both between teams and of the state of the art as represented by competing teams and their robots. The overall winners of the Grand Challenge, scoring 53 out of a maximum achievable of 75 points, were team ISEP/INESC TEC (Air), Team Cobham (Land) and Universitat de Girona (Sea), shown with their robots in Fig. 10. This was a particularly impressive outcome given that these three teams had not worked together until arriving at euRathlon 2015. Overall, five of the teams entering the Grand Challenge achieved creditable performance in mapping, finding missing workers and closing valves in a complex search and rescue scenario that placed great demands on both the robots and the teamwork needed to coordinate those robots.

5.3 Lessons Learned

By all measures euRathlon 2015 was a very successful event. We attracted a larger number of teams than originally planned, and the team matching process proved to be very successful. Indeed, perhaps the most significant outcome of not just euRathlon 2015 but the whole project was in bringing together air, land and sea robotics domains to create a new community. We estimate that, through workshops and competitions, we have trained approximately 200 roboticists in outdoor multi-domain robotics.

From a technical point of view we were impressed by the performance of teams in the Grand Challenge, noting, however, that there were a number of common difficulties that all teams experienced. The first was radio communication. Most teams expected to use WiFi networks to maintain communication with land robots, and despite some innovative approaches to overcoming range limitations, such as dropping repeaters or using several land robots as a multi-hop network, all teams experienced challenges. The second was human-robot interfaces: many teams had poorly designed interfaces with their robots, which severely tested those operating or supervising robots from inside hot control tents. The third limitation was human-human interaction. We did not specify how the teams communicated between land, sea and air control stations, but it was clear that the most successful multi-domain teams were those who established and rehearsed clear channels and protocols for human-human coordination between the domains. The real challenges are often not technical but human.