1 Introduction

Natural and man-made disasters pose great risks to health and to life. Ecuador, for example, is situated on the Pacific "Ring of Fire": it lies at the boundary between the Nazca and South American tectonic plates, and volcanoes run the length of its mountain ranges, so the zone carries a high level of seismic hazard [1]. Because of these geographic and geologic features, the region is exposed to adverse events of natural and anthropogenic origin, such as active volcanoes, geologic faults, and tsunamis, as well as a possible terrorist risk due to its proximity to Colombia; the latter could involve chemical and/or nuclear attacks, putting the population's health at risk.

On April 16, 2016, a powerful earthquake struck the coastal region of Ecuador. It was one of the country's greatest tragedies in decades, with 671 fatalities [2]. Many of the victims had managed to escape the rubble of collapsed buildings but died without ever receiving medical attention because road access had been lost. Areas affected by natural disasters often become completely cut off. Responding to floods, earthquakes, landslides, and forest fires becomes increasingly complicated when the places where survivors remain are inaccessible. In the case of terrorist attacks of a chemical or nuclear nature, locating survivors becomes difficult or even unfeasible for health workers.

The seconds and minutes following a catastrophe are critical for minimizing the damage caused. Normally, a clear action protocol coordinates specialized personnel and the management of resources, which facilitates an efficient response.

However, human intervention can prove very difficult in certain circumstances due to the consequences of the catastrophe itself and the real risk to the people in charge of managing the crisis. In such settings, technology can improve crisis management through robot support. Robots can serve as the first line of action, gathering information in difficult-to-reach places quickly and precisely. If this intervention is performed in a coordinated manner, the chances of minimizing the damage caused by the catastrophe increase drastically.

Currently, various alternatives exist, such as teleoperated robots [3] that can search for human bodies through various detection systems, a great benefit over the majority of ground-based robots. However, teleoperated robots face limitations inherent to their form of locomotion [4,5,6]: wheels and caterpillar tracks make it difficult to avoid or overcome obstacles in the robot's path. Most existing robots can search for people, but they do not provide information about vital signs in a non-contact manner. This complicates the robots' task even further, for finding the right position to place a sensor on the victim is often impossible.

It is undeniable that a correctly performed triage helps health workers make sound decisions about the priority of attention and transport of patients [7]. In this way, rescue efforts can be more efficient and effective. Various proposals for rescue robots currently exist. The Souryu IV robot [8] is designed to move through debris to carry out a search. It is equipped with RGB and IR cameras as well as a lighting system of high-power light-emitting diodes (LEDs). This robot is still in an experimental phase, and no information is available yet on whether it will be able to measure vital signs.

Park et al. [9] propose a victim-recognition system using infrared images for a search-and-rescue robot in dark environments. The robot is equipped with an RGB-D camera (a Microsoft Kinect, which uses a structured light source) and an IR camera. The authors present a system for recognizing a human victim in dark environments, but not for detecting vital signs. Another interesting robot is Vital-Bot [10], a small mobile robot with a 5.8 GHz radar receiver for non-contact vital sign detection. The robot can detect human heartbeats and is designed for search-and-rescue missions; the project is still under active development. Recently, another research group has proposed miniaturizing the radar [11].

Other techniques exist to detect victims and measure their vital signs from a distance [12,13,14]. Using ultra-wideband (UWB) radar, the heart and respiratory rates of victims buried under debris can be detected. There are also proposals to process thermal images of victims to extract heart and respiratory rates. Thermal cameras can help determine whether a victim has vital signs, but their cost remains high, and they often suffer interference from the environment.

One mobile robot for critical care in the medical attention area [15] carries various sensors (GPS, ultrasound, and a heart rate sensor). Its drawback is that the heart rate sensor must be placed on the victim to carry out an evaluation; if the victim is unconscious or immobilized, the evaluation is therefore impossible.

Unmanned aerial vehicles (UAVs), or drones, are small robots that are extremely useful for a variety of applications. Generally, they offer low response times, provide new opportunities for rescue teams, and work despite adverse conditions such as high temperatures, smoke, pollutants, and toxic materials.

This investigation aims to develop a system for the search, detection, and basic triage of victims using a UAV, exploiting its mounted camera and its capacity to reach places that are difficult for other robots, and even for humans, to access. Once a victim is found, the UAV proceeds to the remote physiological measurement of heart and respiratory rates using the photoplethysmography imaging technique.

2 Materials and Methods

2.1 Hardware

To achieve the aims of the investigation, a Parrot Anafi 4k drone (UAV) was used (Fig. 1). The Anafi is equipped with a 180-degree tilting gimbal that can capture images from difficult angles, lossless zoom of up to 2.8×, and three-axis image stabilization for stable photos and videos. Furthermore, it has a 1/2.4″ 21 MP Sony CMOS sensor, GPS and GLONASS satellite navigation, and a flight time of 25 min.

Fig. 1. Parrot Anafi 4k unmanned aerial vehicle.

Parrot provides the Olympe library as the programming interface to connect to and control the drone from a remote Python script executed on a computer. Olympe sends command messages to the drone (take off, land, camera orientation, flight plans, etc.), checks the current status of the drone, waits for event messages, starts and stops the video streaming, and records the video stream and its associated metadata.
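
As an illustration, a minimal Olympe control script, adapted from Parrot's published examples, looks like the sketch below. The IP address is the default for a physical Anafi reached over Wi-Fi, and exact message paths may vary between Olympe releases.

```python
# Minimal Olympe sketch: connect, take off, move, land.
import olympe
from olympe.messages.ardrone3.Piloting import TakeOff, moveBy, Landing
from olympe.messages.ardrone3.PilotingState import FlyingStateChanged

DRONE_IP = "192.168.42.1"  # default address of a physical Anafi over Wi-Fi

drone = olympe.Drone(DRONE_IP)
drone.connect()

# Take off and wait until the drone reports a stable hover.
drone(TakeOff() >> FlyingStateChanged(state="hovering", _timeout=10)).wait()

# Move 5 m forward (arguments: dx, dy, dz in metres, dpsi in radians).
drone(moveBy(5, 0, 0, 0)
      >> FlyingStateChanged(state="hovering", _timeout=10)).wait()

drone(Landing()).wait()
drone.disconnect()
```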

2.2 System Architecture

The system is designed on top of the Robot Operating System (ROS). For its development and testing, a UAV with an onboard camera was used. The objective of the system is to navigate the UAV over a disaster zone and perform basic triage on the detected victims. Once navigation has commenced, the UAV streams live video from its onboard camera for the search and detection of people lying on the ground. If a person is detected during the general flight plan, the person's location is calculated and the detection is centered in the video frame. The flight then switches to victim navigation: the UAV approaches and descends until it is positioned two meters from the person lying on the ground, at a 45-degree angle to the victim's face. Once the drone is in position, it detects the victim's face. With this information, the location of the face is calculated, and the flight switches to face navigation to position the camera over the region of interest (ROI) on the victim's face.

Once the UAV is in position, the heart and respiratory rates of the victim are measured using the photoplethysmography imaging technique as part of the basic triage. After the geo-positioning coordinates, heart rate, and respiratory rate have been transmitted, the UAV resumes its general navigation until the entire area of its initial flight plan is covered. Upon completion of the general flight plan, the UAV returns to its takeoff position. The general system architecture (Fig. 2) consists of four phases: video stream, navigation, positioning, and detection. Each phase is made up of subsystems that interact with each other.
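
The phase transitions just described amount to a small state machine. The following sketch is only our illustration of that control flow; all helper names (detect_person, detect_face, measure_vitals, and the uav methods) are hypothetical.

```python
from enum import Enum, auto

class Mode(Enum):
    GENERAL_NAV = auto()  # sweep the planned area
    VICTIM_NAV = auto()   # approach a detected person (2 m, 45 degrees)
    FACE_NAV = auto()     # centre the camera on the facial ROI
    TRIAGE = auto()       # measure heart and respiratory rates

def step(mode, frame, uav):
    """One iteration of the control loop; returns the next mode."""
    if mode is Mode.GENERAL_NAV:
        if detect_person(frame):           # hypothetical detector hook
            return Mode.VICTIM_NAV
        uav.continue_sweep()
    elif mode is Mode.VICTIM_NAV:
        uav.approach_victim(frame)
        if uav.at_standoff(distance_m=2.0, angle_deg=45):
            return Mode.FACE_NAV
    elif mode is Mode.FACE_NAV:
        uav.refine_position(frame)
        if detect_face(frame) is not None:
            return Mode.TRIAGE
    elif mode is Mode.TRIAGE:
        hr, rr = measure_vitals(frame)     # photoplethysmography imaging
        uav.report(uav.gps_position(), hr, rr)
        return Mode.GENERAL_NAV
    return mode
```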

Fig. 2. System architecture.

2.3 Video Stream

The video stream phase provides the information required for analyzing the environment, detecting and positioning people, and navigating. From system start-up, the video stream runs in parallel with the rest of the phases throughout the whole process.

Video is obtained and streamed from the UAV by means of Parrot's Olympe library, through a Python script that integrates it with ROS. The UAV connects to the computer over a Wi-Fi interface and constantly reports its status and battery charge level. If the battery level drops to 20%, the UAV returns to the takeoff location. The algorithms in the detection, positioning, and navigation stages operate on the images obtained from this video.
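
In recent Olympe releases, the live stream is consumed through frame callbacks; the sketch below shows the general shape. The format constants and callback registration have changed across SDK versions, so treat the exact names as assumptions; process_frame is a hypothetical hook into the detection pipeline.

```python
import cv2
import olympe

DRONE_IP = "192.168.42.1"

def on_yuv_frame(yuv_frame):
    # Convert each decoded YUV frame to BGR so the detection, positioning,
    # and navigation stages can process it with OpenCV.
    cvt = {
        olympe.VDEF_I420: cv2.COLOR_YUV2BGR_I420,
        olympe.VDEF_NV12: cv2.COLOR_YUV2BGR_NV12,
    }[yuv_frame.format()]
    process_frame(cv2.cvtColor(yuv_frame.as_ndarray(), cvt))

drone = olympe.Drone(DRONE_IP)
drone.connect()
drone.streaming.set_callbacks(raw_cb=on_yuv_frame)
drone.streaming.start()
# ... fly the mission; the callback runs for every received frame ...
drone.streaming.stop()
drone.disconnect()
```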

2.4 Navigation

This phase of the system is focused on navigating the UAV (the Anafi drone) along a flight plan. Upon takeoff, the "home" position is set; that is, the point to which the drone will return at the end of the planned flight.

To carry out the UAV flight plan, we used the Olympe library. With the drone's integrated GPS satellite navigation, we set position P0 as the start and end of the route. The area planned for general navigation is 15 by 30 m, flown at a height of 10 m (Fig. 3).

Fig. 3. General route flight navigation.

When the UAV is at position P1 at an altitude of 10 m, it starts its route to cover the area, passing through points P2, P3, P4, P5, P6, and P7 (Fig. 3). When it reaches P7, it automatically returns to P0 and lands.
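
A route of this shape can be reconstructed with relative moveBy displacements as a boustrophedon ("lawnmower") sweep. The sketch below is our approximation of the P1-P7 pattern over the 15 × 30 m area, not the authors' exact flight plan; the lane count is an assumption.

```python
from olympe.messages.ardrone3.Piloting import moveBy
from olympe.messages.ardrone3.PilotingState import FlyingStateChanged

def sweep_area(drone, width_m=15.0, length_m=30.0, lanes=4):
    """Cover a width x length rectangle in parallel lanes, alternating
    direction on each lane like the P1-P7 route in Fig. 3."""
    lane_gap = width_m / (lanes - 1)
    for lane in range(lanes):
        heading = 1 if lane % 2 == 0 else -1   # alternate forward/backward
        drone(moveBy(heading * length_m, 0, 0, 0)
              >> FlyingStateChanged(state="hovering", _timeout=60)).wait()
        if lane < lanes - 1:                   # sidestep to the next lane
            drone(moveBy(0, lane_gap, 0, 0)
                  >> FlyingStateChanged(state="hovering", _timeout=60)).wait()
```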

During general UAV navigation, the streamed video is processed to recognize people lying on the ground. If the UAV identifies a person along its route, its flight plan is modified: the UAV centers the image on the person and, using victim navigation (Fig. 4a), approaches to within 2 m of the victim at a 45-degree angle. From this position, it carries out face identification and positioning and, with the help of face navigation (Fig. 4b), positions itself above a region of interest on the face to carry out the triage. Upon completion, it resumes the general-navigation flight plan.

Fig. 4. (a) Victim navigation, (b) face navigation.

The navigation phase consists of three subsystems: general navigation, victim navigation, and face navigation.

2.5 Positioning

The positioning phase works in parallel with detection and navigation. When a person lying on the ground has been detected, the pinhole camera model is used to establish the location and distance of the person relative to the UAV's camera (Fig. 5a). Victim navigation then brings the UAV close enough for face detection, and the pinhole model is applied once again to establish the location and distance of the person's face relative to the camera (Fig. 5b); by means of face navigation, the camera is then pointed at the region of interest for analysis.
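
For a downward-looking camera, the pinhole model reduces to a simple proportionality between pixel offset and ground offset. Below is a minimal sketch, assuming intrinsics (fx, fy, cx, cy) obtained from camera calibration and a flat ground plane; the 45-degree gimbal case additionally requires rotating the back-projected ray by the gimbal angle.

```python
def ground_offset(u, v, fx, fy, cx, cy, altitude_m):
    """Back-project pixel (u, v) onto flat ground for a nadir-pointing
    pinhole camera at the given altitude.

    Returns the (right, forward) offset in metres from the point directly
    below the camera, from the similar-triangles relation x = Z(u - cx)/fx.
    """
    x = (u - cx) / fx * altitude_m
    y = (v - cy) / fy * altitude_m
    return x, y

# Example: a detection box centred at pixel (960, 400) seen from 10 m up,
# with hypothetical intrinsics.
dx, dy = ground_offset(960, 400, fx=1400.0, fy=1400.0,
                       cx=960.0, cy=540.0, altitude_m=10.0)
```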

Fig. 5. (a) Victim positioning, (b) face positioning.

2.6 Detection

This phase consists of three important processes: detection of people lying on the ground, face detection, and detection of heart and respiratory rates.

Detection of People

The detection of people runs in parallel with the drone's flight plan. To detect people on the ground, we trained an algorithm with TensorFlow. For this training, we gathered 100 images of people lying on the ground and used a labelling tool to draw bounding boxes around each person, generating for each image an XML file with the coordinates of the drawn box. These files were used to train the algorithm.

For the training, we used faster_rcnn_resnet101_coco, a pre-trained model from the TensorFlow model zoo [16]. When a person is detected, victim positioning and victim navigation are executed so that the UAV approaches to within 2 m and continues with face detection.
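
With a TF1 frozen graph exported by the TensorFlow Object Detection API, inference typically follows the pattern below; the tensor names are the API's standard ones, and the file path is hypothetical. This is a sketch of the usual workflow, not the authors' exact code.

```python
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API, matching the model-zoo export

# Load the frozen graph exported after fine-tuning faster_rcnn_resnet101_coco.
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile("frozen_inference_graph.pb", "rb") as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name="")
session = tf.Session(graph=graph)

def detect_people(frame_bgr, threshold=0.5):
    """Return normalized [ymin, xmin, ymax, xmax] boxes of likely victims."""
    rgb = frame_bgr[..., ::-1]  # the API expects RGB input
    boxes, scores = session.run(
        ["detection_boxes:0", "detection_scores:0"],
        feed_dict={"image_tensor:0": np.expand_dims(rgb, axis=0)},
    )
    return boxes[0][scores[0] >= threshold]
```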

Face Detection

Once the UAV is positioned over the person, it proceeds to search for the victim's face. Various face-detection algorithms exist, but most focus on detecting front-facing human faces. Most facial features are broadly similar regardless of sex, which enables general techniques for face detection and recognition. The image of the victim's body captured by the UAV is converted to grayscale and resized, reducing the computation needed for face detection. Face detection is performed with the Dlib library [17] because of its features and precision. Using the coordinates of the detected face, the program then looks for facial landmarks (Fig. 6), specifically the beginning and end of the eyebrows. With this information, the pose of the head (roll, yaw, and pitch) can be estimated.
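
A sketch of this step using Dlib's frontal face detector and its pre-trained 68-point landmark model (in that convention, indices 17-26 cover the two eyebrows); the downscale factor is illustrative.

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
# Pre-trained 68-point landmark model distributed alongside Dlib.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def eyebrow_endpoints(frame_bgr, scale=0.5):
    """Detect a face and return the outer eyebrow corners in full-image
    pixel coordinates, or None if no face is found."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    small = cv2.resize(gray, None, fx=scale, fy=scale)  # cut computation
    faces = detector(small, 1)
    if not faces:
        return None
    marks = predictor(small, faces[0])
    # Landmarks 17 and 26 are the outer corners of the two eyebrows.
    p_left, p_right = marks.part(17), marks.part(26)
    return ((p_left.x / scale, p_left.y / scale),
            (p_right.x / scale, p_right.y / scale))
```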

Fig. 6. Detection of face and facial landmarks using Dlib.

The beginning and end of the eyebrows define the area of analysis on the victim's forehead. The imaginary line connecting the eyebrows helps define the pitch, and its length, the distance between the eyebrows, sets the size of the analysis area; this distance varies with the distance between the face and the camera. Finally, the ROI (Fig. 7) is extended horizontally in the direction opposite to the movement of the face for yaw, and in both directions for roll. If the landmarks corresponding to the eyebrows cannot be found, the forehead and chin landmarks are used instead, and the victim's cheek becomes the ROI for analysis. Once the region of interest has been estimated, it is analyzed for vital-sign detection.
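
One way to turn the eyebrow endpoints into a forehead ROI, following the description above for a roughly upright face; the proportions are illustrative assumptions, and the yaw and roll adjustments are omitted for brevity.

```python
import math

def forehead_roi(p_left, p_right, height_ratio=0.6):
    """Rectangle (x, y, w, h) just above the inter-eyebrow line; its size
    scales with the eyebrow distance, and hence with camera distance."""
    (x1, y1), (x2, y2) = p_left, p_right
    width = math.hypot(x2 - x1, y2 - y1)   # distance between the eyebrows
    height = height_ratio * width
    x = int(min(x1, x2))
    y = int(min(y1, y2) - height)          # shift up onto the forehead
    return x, y, int(width), int(height)
```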

Fig. 7. ROI identification.

Vital Signs Detection

The photoplethysmography imaging technique is used to estimate the heart and respiratory rates. It allows remote, non-contact measurement of vital signs and is an alternative to conventional contact photoplethysmography, with a video camera taking the place of the photodetector. The camera can be RGB or multi-spectral, but its spectral sensitivity must overlap the absorption spectrum of oxygenated and deoxygenated hemoglobin. The technique has a basic layout (Fig. 8). The images of the identified region of interest (forehead or cheek) are analyzed channel by channel.

Fig. 8. Basic layout for heart and respiratory rate detection.

All the pixels of each frame are averaged in each channel. Each new average of the region of interest is pushed into a FIFO memory structure spanning 6 s, so the structure stores a photoplethysmographic signal containing at least six heartbeats. Once the structure is full, the least-squares method described in [18] is used to estimate the trend of the averaged signal; subtracting this trend from the averaged signal acts as a low-frequency (detrending) filter. An IIR Butterworth band-pass filter is then applied to the detrended signal, with its band set to the heart-rate and respiratory-rate limits of a normal person. Subsequently, the Fourier transform is applied to obtain the signal spectrum, and the peaks matching the heart and respiratory rates are located. The respiratory rate is detected through the sinus arrhythmia in the heart rate.
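
An approximate SciPy rendering of this pipeline is sketched below. We substitute scipy.signal.detrend for the least-squares detrending of [18], assume a 30 fps stream, and read the respiratory peak directly from the low-frequency band rather than from the sinus-arrhythmia modulation, so this is a simplified illustration rather than the authors' exact method.

```python
import numpy as np
from scipy.signal import butter, filtfilt, detrend

FS = 30.0             # camera frame rate in Hz (assumed)
WINDOW = int(6 * FS)  # 6 s FIFO of spatially averaged ROI values

def estimate_rates(roi_means):
    """Estimate heart rate (bpm) and respiratory rate (breaths/min) from a
    full FIFO of per-frame ROI channel averages."""
    x = detrend(np.asarray(roi_means, dtype=float))  # remove the trend

    # IIR Butterworth band-pass over normal heart-rate limits
    # (0.7-3.0 Hz, i.e. 42-180 bpm).
    b, a = butter(4, [0.7 / (FS / 2), 3.0 / (FS / 2)], btype="band")
    pulse = filtfilt(b, a, x)

    freqs = np.fft.rfftfreq(len(x), d=1.0 / FS)
    hr = 60.0 * freqs[np.argmax(np.abs(np.fft.rfft(pulse)))]

    # Respiratory band: 0.1-0.5 Hz (6-30 breaths per minute).
    b, a = butter(2, [0.1 / (FS / 2), 0.5 / (FS / 2)], btype="band")
    resp = filtfilt(b, a, x)
    rr = 60.0 * freqs[np.argmax(np.abs(np.fft.rfft(resp)))]
    return hr, rr
```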

3 Results

A three-stage test plan was carried out: a detection stage, a navigation-and-positioning stage, and a complete-system stage. Each stage was analyzed separately to make the necessary adjustments and to ensure the proper functioning of the entire system.

The trials were conducted in two different places: the parking lot of the Catholic University of Cuenca and a concrete court. Trials were performed with three people, who lay on the ground simulating victims, to measure the response time and duration of each process. The detection stage worked as intended, correctly detecting the people, the faces, and the heart and respiratory rates, with small variations in trial times due to the effect of wind on the stability of the UAV. The navigation-and-positioning trial helped us determine the correct distance and angle at which to position the UAV for face detection and for the detection of heart and respiratory rates.

The complete system trial was carried out successfully, with all the stages of the architecture working properly as a whole. Upon takeoff, the UAV fixed its position with the GPS geo-positioning coordinates, started the video stream, climbed to a height of 10 m, and commenced general navigation. Navigation continued until the first person was detected, whereupon positioning was carried out to center the person in the video frame. Using victim navigation, the UAV then approached to within 2 m with the camera at a 45-degree angle (Fig. 9a). In that position, it performed face detection and centered the victim's face in the video frame; face navigation then brought it to the best position to detect the region of interest. Finally, it carried out the heart and respiratory rate detection (Fig. 9b). The vital-sign measurements were validated with a high degree of correlation. Upon completion of this process, the UAV resumed its general navigation route, completed its trajectory, returned to its takeoff position, ended the video stream, and landed without incident.

Fig. 9. (a) Photographic report of system operation, (b) image of the system in operation performing heart and respiratory rate detection.

To confirm that the basic triage worked properly, the data obtained by the photoplethysmography imaging technique were compared with data gathered using a heart rate monitor with a finger sensor and a mobile application that helps the user control their respiratory rate.

To calculate the degree of agreement, the measurements taken over 60 s on the three people acting as victims were used as a sample. The results obtained with the UAV's camera were compared with those obtained with the finger-sensor heart rate monitor (Fig. 10), which recorded the heart rate of the person acting as the victim. The victims controlled their respiratory rates with the help of the previously mentioned application.

Fig. 10. Scatterplots of the relationship between the samples taken by the non-contact method and those taken by the contact method, for three people: (a) heart rate, (b) respiratory rate.

To calculate the degree of agreement of the samples taken during the triage carried out by the UAV, the samples were analyzed using the Bland-Altman method [19]. The differences between the measurements obtained with the UAV's photoplethysmography imaging system and those obtained with the heart rate monitor are plotted as a function of the average of the two systems, for heart rate and respiratory rate (Fig. 11). The mean differences are shown as dotted lines, and the 95% limits of agreement (±1.96 SD) as continuous lines. For heart rate, the mean bias was 0.4 bpm with 95% limits of agreement of −4.3 to 5.2 bpm; for respiratory rate, the mean bias was 0.11 breaths/min with 95% limits of agreement of −1.14 to 1.36 breaths/min.
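
For reference, the bias and 95% limits of agreement in a Bland-Altman analysis come directly from the paired differences; a minimal sketch with hypothetical input arrays follows.

```python
import numpy as np

def bland_altman(non_contact, contact):
    """Return the bias, 95% limits of agreement, and the per-pair means
    and differences used to draw the Bland-Altman plot."""
    a = np.asarray(non_contact, dtype=float)   # UAV measurements
    b = np.asarray(contact, dtype=float)       # finger-sensor reference
    diff = a - b
    bias = diff.mean()                         # mean bias (dotted line)
    sd = diff.std(ddof=1)
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd  # 95% limits (solid)
    return bias, lower, upper, (a + b) / 2.0, diff
```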

Fig. 11. Bland-Altman plots showing the degree of agreement between the measurements taken with the proposed non-contact method and the contact method, for three people: (a) heart rate, (b) respiratory rate.

4 Discussion

During system testing, some points for future improvement emerged. For example, when face detection was attempted on a person lying face down, the system did not recognize any face, meaning the basic triage would not be performed and the UAV would return to its general navigation route. In this case, the system should report the presence of a victim and send the coordinates to the rescue team before resuming general navigation.

The trials were conducted successfully with one and two victims, and the UAV covered the entire designated area. With three victims, however, the process could not be completed, because the UAV has a flight time of only 25 min before it must return to its takeoff position and land. As an alternative, a backup battery could be incorporated into the UAV to increase its flight time.

In the trials, the completion times for the navigation, positioning, and detection processes varied for each of the victims due to the influence of wind on the stability of the UAV. In our case, the wind had an impact because of the size and weight of the UAV, an effect that could be mitigated by using more robust UAVs with greater engine power.

The trials carried out in open spaces had no problems using GPS georeferencing to obtain the UAV's position coordinates. As was to be expected, however, the trials conducted in enclosed spaces had problems with the GPS signal, which in turn disrupted the normal navigation of the UAV. A UAV equipped with additional sensors or three-dimensional cameras is therefore recommended for navigating and positioning in enclosed environments. In any case, this is a line of investigation that can be pursued further, with the advantage that the code developed in ROS can be reused on other types of UAVs.

The system was designed for navigation in an environment without obstacles. In future studies, a system for navigation around obstacles could be incorporated.

The results obtained from the non-contact system were compared with those of traditional contact systems, a finger pulse monitor and a mobile application for controlling respiratory rate, verifying that the results fall within the permitted range of ±5 heartbeats between systems.

It is important to mention that some currently proposed systems use UAVs to search for people [4]. However, none of these systems incorporates non-contact measurement of the person's heart and respiratory rates [10, 11, 20] to support their assessment and rescue.

5 Conclusions

The present investigation should be considered the start of the continual improvement of the proposed system, and a prototype for wide-scale implementation once the adjustments necessary to guarantee proper functioning have been made.

As previously mentioned, further work is needed to develop a navigation system that can avoid obstacles within the intervention area. Furthermore, consideration should be given to using an infrared camera for low-light settings, and a two-way audio system could be incorporated to allow the rescue team to maintain audio and visual contact with the victim in real time while the heart and respiratory rates are being measured.

The photoplethysmography imaging technique used for the basic triage delivered good results. Its biggest drawback lies in determining an appropriate area for analysis, but it is superior to the usual contact methods, since a non-contact measurement is much easier to carry out when victims are unable to cooperate.

Using a UAV with better features can considerably increase the size of the search area, reduce the execution times of the processes, and ensure greater stability throughout navigation. The system architecture and design were developed so that they can be implemented on other robotic systems and so that better alternatives can be sought in the operation and design of the algorithms. The proposed non-contact basic-triage system can also be implemented in other systems used for patients who need continual monitoring of their heart and respiratory rates.