
1 Introduction

Unmanned aerial vehicles (UAVs) are developing rapidly and are used to deliver a variety of materials and to offer different types of services that improve our quality of life: cargo delivery [1], rescue missions, monitoring of hard-to-reach areas, detection of minerals, detection of poachers, search for people in need [2, 3], etc. Due to their low cost and wide distribution, they are easily accessible and easy to operate, which also creates opportunities for abuse. They can also be very dangerous to our security and way of life, as drones currently play a leading role in military conflicts [1]. Cases of their use to deliver drugs, weapons, and other illegal materials and objects are not isolated. They can also be used for attacks with explosives or poisonous materials. Their use around airports and sites of strategic importance is forbidden in order to prevent unwanted accidents. Because of their small size and shape, the materials from which they are made, their high manoeuvrability, and the places where they are used, especially in urban environments, they are very difficult to detect and track.

There are various technologies for detecting, tracking, and recognizing drones [4]: 1) Radar-based drone detection. Traditional airspace surveillance radars provide stable detection of aircraft and missiles at long distances and high speeds, but they are not suitable for detecting small objects such as UAVs, which fly at relatively low speeds and on variable trajectories [5]. Using Doppler radar, it is possible to detect the rotating blades of a drone, but this is very difficult, especially for a small drone. The most promising radar systems for drone detection use forward scatter radar (FSR) principles [6,7,8]; such systems can detect both drones and objects built with stealth technology. 2) RF-based drone detection. RF-based UAV detection is one of the most popular approaches to countering unmanned aerial vehicles, since it detects and classifies UAVs through their RF signatures [5, 9,10,11]. The RF sensor is passive and listens to the surrounding space for the radio communication between the UAV and its controller. However, not all drones emit radio-frequency transmissions, and this approach is not suitable for detecting UAVs that operate autonomously or navigate by GPS without RF emissions [4]. 3) Acoustic-based drone detection. Relatively inexpensive acoustic detection systems use an array of acoustic sensors or microphones to detect and recognize the specific acoustic signatures of UAV rotors. These systems are convenient in low-visibility environments, but their operating range is below 200–250 m, and they are very sensitive to environmental noise, especially in urban conditions or in the presence of wind. 4) Camera-based drone detection.

Unmanned aerial vehicles can also be detected with the help of cameras. It is known that detection and classification of UAVs work best when the target is clearly visible. This method also provides additional visual information, such as drone type, size, and payload, that other UAV detection systems cannot provide [12,13,14]. Cameras are affordable, have a medium detection range, and localize targets well. However, video surveillance is difficult at night and in conditions of limited visibility, such as clouds, smoke, fog, and dust. There are also systems that use thermal cameras in parallel to improve visibility under these conditions, as well as in rain, snow, and other bad weather. 5) Combined drone detection systems. As can be seen, each of these types of drone detection has its specific advantages and disadvantages. To improve performance, a network of different sensor types can be used, for example acoustic sensors, optical sensors, radar, and infrared and visible-light cameras operating at the same time [2]. Maximum system performance can therefore be achieved by fusing several ways of detecting unmanned aerial vehicles. However, our focus is on the approach that uses camera images in video surveillance. The algorithm proposed in this article was also tested on thermal images to improve the performance of the system at night and in bad meteorological conditions.

Drone detection in restricted areas and special zones is important and necessary. This paper focuses on the drone detection problem based on image processing for restricted areas and special zones where cameras are used for monitoring.

The proposed algorithm is multi-channel, enables UAV detection at different distances, and can process both visual and thermal images, detecting drones in daylight as well as in the dark. In Sect. 2, a classification of UAVs is given to show the specific characteristics of existing drones. Section 3 describes the proposed multi-channel drone detection algorithm, which is based on the two-dimensional Otsu algorithm presented in Sect. 4. Section 5 shows the performance of the multi-channel algorithm on visible and thermal images. In Sect. 6, conclusions are drawn and future developments of the algorithm are proposed.

2 Classification of Drones

For better detection of unmanned aerial vehicles, it is necessary to know their tactical and technical parameters and characteristics well. Drones are classified below by application, weight, altitude and range, and wings and rotors [15, 16]:

2.1 Application

  • Government: Used for mapping, agricultural needs, patrolling, firefighting, etc.

  • Military: Used for surveillance, security, or combat attacks.

  • Commercial: Used for applications such as aerial surveillance and photography, delivery of raw materials and artefacts.

  • Personal: Used for entertainment and video recording.

2.2 Weight

  • Nano: Drones weighing less than 250 g

  • Micro: Drones weighing more than 250 g and less than 2 kg

  • Small: Drones weighing more than 2 kg and less than 25 kg

  • Medium: Drones weighing more than 25 kg and less than 150 kg

  • Large: Drones weighing more than 150 kg (these boundaries are written out as a simple lookup below)
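
For illustration only, the weight boundaries above can be expressed as a simple lookup. This is our sketch (the function name and the use of kilograms are our choices), not part of the cited classification [15, 16]:

```python
def weight_class(weight_kg: float) -> str:
    """Map a drone's weight in kilograms to the weight class listed above."""
    if weight_kg < 0.25:   # 250 g
        return "Nano"
    if weight_kg < 2:
        return "Micro"
    if weight_kg < 25:
        return "Small"
    if weight_kg < 150:
        return "Medium"
    return "Large"

print(weight_class(0.9))  # -> Micro
```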

2.3 Altitude and Range

  • Hand-held: Drones that can fly at altitudes of less than 600 m and have a range of less than 2 km.

  • Close: Drones with an altitude of less than 1500 m and range less than 10 km.

  • NATO: Drones with an altitude of less than 3000 m and range less than 50 km.

  • Tactical: Drones with an altitude of less than 5500 m and range less than 160 km.

  • MALE (Medium Altitude Long Endurance): Drones with an altitude of less than 9100 m and range less than 200 km.

  • HALE (High Altitude Long Endurance): Drones with altitude more than 9100 m and indefinite range.

  • Hypersonic: Drones with altitude around 15200 m and range greater than 200 km.

2.4 Based on Wings and Rotors

  • Fixed Wing: Drones that resemble an aeroplane design with fixed wings.

  • Single Rotor: Drones that resemble a helicopter design with one main rotor and another small one at the tail.

  • Multi-rotor: Drones that have more than one rotor. The most common are tricopters, quadcopters, hexacopters, and octocopters.

  • Fixed-Wing Hybrid VTOL: Hybrid drones with longer flight times. They combine the stability of fixed-wing drones with the ability to hover and to take off and land vertically. Here, VTOL refers to vertical takeoff and landing.

3 Drone Detection Algorithm

The great variety of unmanned aerial vehicles has given rise to numerous algorithms for their detection. In this article, a detection system using video surveillance of the airspace is considered. The algorithm presented in this section is original and can be used both during the day and at night.

The drone detection algorithm is based on the two-dimensional Otsu algorithm described in Sect. 4, with the help of which image segmentation and subsequent automatic target recognition are carried out. The proposed algorithm was tested on visible-light images and infrared (thermal) images (Fig. 1). Initially, each frame of the video is converted to a grayscale image. Processing is performed both sequentially, frame by frame, and in parallel in several channels to detect objects of different sizes. Depending on the type of drone and the distance to it, a drone occupies a different number of pixels in the image; for example, a drone far from the camera occupies a small number of pixels compared to a drone at close range [17, 18]. To detect objects occupying different numbers of pixels, the algorithm is designed to be multi-channel and to process the image in parallel. Each channel is tuned to detect regions of the image with a different number of pixels, so the output of a given channel indicates the presence of an object of certain dimensions. Image reconstruction can improve the efficiency and workability of the proposed algorithm [19, 20].
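
A minimal sketch of this multi-channel stage (our illustration under stated assumptions, not the authors' exact implementation): each frame is converted to grayscale and segmented, connected components are labelled, and each channel reports the components whose pixel area falls in its range. Plain one-dimensional Otsu thresholding stands in here for the two-dimensional variant of Sect. 4; the OpenCV calls are standard, while the channel ranges and function names are ours.

```python
import cv2
import numpy as np

# One pixel-area range per channel (values taken from Sect. 5; an assumption here).
CHANNELS = [(10_000, 20_000), (2_000, 10_000), (600, 2_000)]

def detect_candidates(frame_bgr: np.ndarray):
    """Return (channel_index, bounding_box) for every segmented region whose
    area falls into one of the channel ranges."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # Stand-in segmentation; the paper uses the 2D Otsu method of Sect. 4.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    n_labels, _, stats, _ = cv2.connectedComponentsWithStats(binary)
    hits = []
    for lbl in range(1, n_labels):          # label 0 is the background
        x, y, w, h, area = stats[lbl]
        for ch, (lo, hi) in enumerate(CHANNELS):
            if lo <= area <= hi:
                hits.append((ch, (x, y, w, h)))
    return hits
```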

One of the most important parts of the algorithm is the classification of the detected objects. In real-world scenarios, the detected moving objects may be insects, birds, drones, airplanes, etc. In our indoor case, we decided to use a classifier that divides all detected objects into three classes: insects, drones, and background. MobileNetV2 was chosen as the classifier; it is a convolutional neural network architecture designed to perform well on mobile devices.
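
One plausible way to set up such a three-class MobileNetV2 classifier, sketched with torchvision (the paper does not specify the framework, input size, or training procedure, so all of those are our assumptions):

```python
import torch
import torch.nn as nn
from torchvision import models

CLASSES = ["insect", "drone", "background"]

# Start from an ImageNet-pretrained MobileNetV2 and replace the final layer
# with a 3-class head; the network would then be fine-tuned on labelled crops.
model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
model.classifier[1] = nn.Linear(model.last_channel, len(CLASSES))

# Inference on one 224x224 RGB crop (a dummy tensor for illustration):
model.eval()
crop = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    print(CLASSES[model(crop).argmax(dim=1).item()])
```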

Fig. 1. Drone detection algorithm

In our next research, we will propose an algorithm for estimating the speed of the drone based on the change in the drone's apparent dimensions, which reflects the distance travelled, and on the time of this movement, measured by the number of frames in the video recording.
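
One way this estimate could be formalized (a sketch under our assumptions of a pinhole camera with focal length \(f\) in pixels and a drone of known physical width \(W\)): the distance in frame \(n\) follows from the drone's apparent width \(w_{n}\) in pixels, and the radial speed from the distance change over the elapsed frames at frame rate \(\mathrm{fps}\):

$$d_{n}\approx \frac{f\,W}{w_{n}}, \qquad v\approx \frac{\left|d_{n_{2}}-d_{n_{1}}\right|}{\left(n_{2}-n_{1}\right)/\mathrm{fps}}$$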

4 2D OTSU Algorithm

The proposed UAV detection algorithm is based on a two-dimensional Otsu algorithm, with the help of which image segmentation and subsequent automatic target recognition are carried out. It is applied in each of the processing channels and helps to detect objects of different sizes. The two-dimensional Otsu algorithm builds a two-dimensional histogram comprising the image gray level and the neighbourhood average gray level, finds the optimal threshold, and then divides the image into target and background [21]. Suppose the image size is \(M \times N\) pixels and its gray levels range from 1 to \(L\). The neighbourhood average gray \(g\left(m,n\right)\) of the pixel at coordinates \(\left(m,n\right)\) is defined as follows:

$$g\left(m,n\right)=\frac{1}{k \times k}\sum\nolimits_{i=-\left(k-1\right)/2}^{\left(k-1\right)/2}\sum\nolimits_{j=-\left(k-1\right)/2}^{\left(k-1\right)/2}f\left(m+i,n+j\right)$$
(1)

where \(k\) is the (odd) size of the averaging window. Once the neighbourhood average gray of each pixel has been calculated, each pixel yields a gray-level pair \(\left(i,j\right)\), where \(i\) is the pixel gray level and \(j\) is its neighbourhood average. If the frequency of occurrence of the pair \(\left(i,j\right)\) is denoted \({C}_{ij}\), the corresponding joint probability density is determined by the formula:

$${P}_{ij}=\frac{{C}_{ij}}{M \times N},\quad i, j=1, 2, \dots , L$$
(2)

where \(M \times N\) is the number of pixels, \(L\) is the number of gray levels of the image, and \(\sum\nolimits_{i=1}^{L}\sum\nolimits_{j=1}^{L}{P}_{ij}=1\).
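
Equations (1) and (2) can be computed directly; below is a NumPy sketch (our illustration; the window size \(k=3\) is a choice, and indices run from 0 to \(L-1\) in code rather than from 1 to \(L\)):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def joint_histogram(f: np.ndarray, k: int = 3, L: int = 256) -> np.ndarray:
    """Build the joint probability matrix P[i, j] of (pixel gray level i,
    neighbourhood average gray level j), i.e. Eqs. (1) and (2)."""
    g = uniform_filter(f.astype(np.float64), size=k)          # Eq. (1)
    i = f.astype(np.int64).ravel()
    j = np.clip(np.rint(g), 0, L - 1).astype(np.int64).ravel()
    C = np.zeros((L, L), dtype=np.int64)
    np.add.at(C, (i, j), 1)                                   # frequencies C_ij
    return C / f.size                                         # Eq. (2): P_ij
```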

Assume the existence of two classes \({C}_{0}\) and \({C}_{1}\) in the two-dimensional histogram, representing the background and the target respectively, each with its own probability density distribution. If a threshold vector \(\left(s,t\right)\) (where \(0\le s\), \(t<L\)) is used to segment the image, the probabilities of the two classes (the background and target regions) are as follows.

The probability of background occurrence is:

$${\omega }_{0}=P\left({C}_{0}\right)=\sum\nolimits_{i=1}^{s}\sum\nolimits_{j=1}^{t}{P}_{ij}={\omega }_{0}\left(s,t\right)$$
(3)

The probability of target occurrence is:

$${\omega }_{1}=P\left({C}_{1}\right)=\sum\nolimits_{i=s+1}^{L}\sum\nolimits_{j=t+1}^{L}{P}_{ij}={\omega }_{1}\left(s,t\right)$$
(4)

The corresponding mean vectors \({\mu }_{0}^{*}\) and \({\mu }_{1}^{*}\) of the background and target are, respectively:

$${\mu }_{0}^{*}={\left({\mu }_{0i}^{*},{\mu }_{0j}^{*}\right)}^{T}={\left[\sum\nolimits_{i=1}^{s}\sum\nolimits_{j=1}^{t}\frac{i\,{P}_{ij}}{{\omega }_{0}}, \sum\nolimits_{i=1}^{s}\sum\nolimits_{j=1}^{t}\frac{j\,{P}_{ij}}{{\omega }_{0}}\right]}^{T},$$
(5)
$${\mu }_{1}^{*}={\left({\mu }_{1i}^{*},{\mu }_{1j}^{*}\right)}^{T}={\left[\sum\nolimits_{i=s+1}^{L}\sum\nolimits_{j=t+1}^{L}\frac{i\,{P}_{ij}}{{\omega }_{1}}, \sum\nolimits_{i=s+1}^{L}\sum\nolimits_{j=t+1}^{L}\frac{j\,{P}_{ij}}{{\omega }_{1}}\right]}^{T}$$
(6)

The total mean vector \({\mu }_{T}^{*}\) of the two-dimensional histogram is:

$${\mu }_{T}^{*}={\left({\mu }_{Ti}^{*},{\mu }_{Tj}^{*}\right)}^{T}={\left[\sum\nolimits_{i=1}^{L}\sum\nolimits_{j=1}^{L}i\,{P}_{ij}, \sum\nolimits_{i=1}^{L}\sum\nolimits_{j=1}^{L}j\,{P}_{ij}\right]}^{T}$$
(7)

The trace of the between-class dispersion matrix is defined as:

$$\begin{gathered} {\sigma }_{B}\left(s,t\right)={\omega }_{0}\left(s,t\right)\left[{\left({\mu }_{0i}^{*}-{\mu }_{Ti}^{*}\right)}^{2}+{\left({\mu }_{0j}^{*}-{\mu }_{Tj}^{*}\right)}^{2}\right] \hfill \\ +\,{\omega }_{1}\left(s,t\right)\left[{\left({\mu }_{1i}^{*}-{\mu }_{Ti}^{*}\right)}^{2}+{\left({\mu }_{1j}^{*}-{\mu }_{Tj}^{*}\right)}^{2}\right] \hfill \\ \end{gathered}$$

where the class probabilities \({\omega }_{0}\), \({\omega }_{1}\) and the mean vectors are given by Eqs. (3)–(7).
(8)

When the trace of the above dispersion matrix reaches its maximum, the corresponding segmentation threshold is the optimal threshold \(\left( {S^{*} ,T^{*} } \right)\), namely:

$$\left({S}^{*},{T}^{*}\right)=\underset{1\le s,\,t<L}{\arg\max}\left\{{\sigma }_{B}\left(s,t\right)\right\}$$
(9)

Noisy images segmented with the two-dimensional Otsu method may yield better results than one-dimensional threshold segmentation methods, but the computational cost becomes huge, since the optimal threshold must be searched over all candidate pairs \(\left(s,t\right)\). In other words, the determination of the optimal threshold is a function of the gray-scale content of the image.
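
For concreteness, a direct NumPy sketch of the threshold search of Eqs. (3)–(9) is given below (our implementation, meant to pair with the `joint_histogram` sketch above; prefix sums keep the search over all \((s,t)\) pairs tractable):

```python
import numpy as np

def otsu2d_threshold(P: np.ndarray):
    """Return the pair (s, t) maximizing the trace criterion of Eqs. (8)-(9),
    given the joint probability matrix P of Eq. (2) (indices 0..L-1)."""
    eps = 1e-12
    L = P.shape[0]
    idx = np.arange(L, dtype=np.float64)
    Pc = P.cumsum(0).cumsum(1)                        # prefix sums of P
    Pi = (P * idx[:, None]).cumsum(0).cumsum(1)       # prefix sums of i*P
    Pj = (P * idx[None, :]).cumsum(0).cumsum(1)       # prefix sums of j*P

    def quad(M):
        # Sum of the source matrix over the quadrant i > s, j > t,
        # obtained from prefix sums by inclusion-exclusion.
        return M[-1, -1] - M[:, -1][:, None] - M[-1, :][None, :] + M

    w0 = Pc                                           # Eq. (3): background
    w1 = quad(Pc)                                     # Eq. (4): target
    mu0i, mu0j = Pi / (w0 + eps), Pj / (w0 + eps)     # Eq. (5)
    mu1i, mu1j = quad(Pi) / (w1 + eps), quad(Pj) / (w1 + eps)  # Eq. (6)
    muTi, muTj = Pi[-1, -1], Pj[-1, -1]               # Eq. (7)

    sigma_b = (w0 * ((mu0i - muTi) ** 2 + (mu0j - muTj) ** 2)   # Eq. (8)
               + w1 * ((mu1i - muTi) ** 2 + (mu1j - muTj) ** 2))
    s, t = np.unravel_index(np.argmax(sigma_b), sigma_b.shape)  # Eq. (9)
    return int(s), int(t)
```

With `P = joint_histogram(gray)`, the call `s, t = otsu2d_threshold(P)` yields the threshold pair; pixels whose gray level and neighbourhood average both exceed \((s,t)\) are then assigned to the target class.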

5 Results

The algorithm proposed in Sect. 3, which relies on the two-dimensional Otsu algorithm described in Sect. 4, was tested on a video recording of a drone flying indoors. During the experiments, both visual and thermal images were captured, and both types of images were processed with the proposed algorithm. An XMART OPTICAL FLOW SG900 drone with dimensions 29 × 29 × 4 cm was used for the experiment (Fig. 2).

Fig. 2. XMART OPTICAL FLOW SG900 drone

The drone flies relatively close to the camera, between 2 and 10 m from it. As the drone moves away from the camera, its size in the image decreases. Because of the short distances, it was decided that the algorithm should be three-channel and detect objects with sizes of 20,000–10,000 pixels, 10,000–2,000 pixels, and 2,000–600 pixels. These ranges correspond to distances of approximately 3, 6, and 9 m from the camera. The camera used has a resolution of 1200 × 720 pixels at 30 frames per second. During the experiment, the drone repeatedly approached and receded from the camera. When the video is processed, an object detected in a given channel is classified against images of drones known to us. In our next research, we will propose an algorithm for estimating the speed of the drone, considering the time between frames in the video. The proposed algorithm is used to process images in both the visible and the thermal range.
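
These channel boundaries are consistent with a simple pinhole-camera scaling (our plausibility check, not from the original text): the apparent area of an object falls off with the inverse square of distance, so a drone occupying about 10,000 px at 3 m should occupy roughly

$$10{,}000\times{\left(\tfrac{3}{6}\right)}^{2}=2{,}500\ \text{px at } 6\ \mathrm{m}, \qquad 10{,}000\times{\left(\tfrac{3}{9}\right)}^{2}\approx 1{,}111\ \text{px at } 9\ \mathrm{m},$$

which fall inside the second and third channel ranges, respectively.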

5.1 Visible Light Imaging

Intermediate results of processing the visible-range video at the three distances (3, 6, and 9 m) are shown in Figs. 3, 4, 5, 6, 7, 8, 9, 10 and 11. The figures show that the drone is successfully detected.

Fig. 3. Visual image (distance to target - 3 m)

Fig. 4. Segmented image (distance to target - 3 m)

Fig. 5. Target detection image (distance to target - 3 m)

Fig. 6. Visual image (distance to target - 6 m)

Fig. 7. Segmented image (distance to target - 6 m)

Fig. 8. Target detection image (distance to target - 6 m)

Fig. 9. Visual image (distance to target - 9 m)

Fig. 10. Segmented image (distance to target - 9 m)

Fig. 11. Target detection image (distance to target - 9 m)

5.2 Thermal Imaging

During the experiment, a thermal camera was used to register the infrared radiation of the flying drone. Thermal imaging is suitable for UAV surveillance both during the day and at night. Using thermal images, it is possible to detect a flying UAV in total darkness and to detect targets through smoke. Figures 12, 13 and 14 below show the results of drone detection with a thermal camera and the processing of the images with the algorithm proposed above at different distances (3, 6, and 9 m).

Fig. 12. a) Thermal image, b) Segmented image, c) Target detection image (distance to target - 3 m).

Fig. 13. a) Thermal image, b) Segmented image, c) Target detection image (distance to target - 6 m).

Fig. 14. a) Thermal image, b) Segmented image, c) Target detection image (distance to target - 9 m).

The obtained results show that the algorithm proposed in Sect. 3 can be successfully applied to the processing of both visible-light and thermal images.

6 Conclusions

The proposed image-processing algorithm for the detection of unmanned aerial vehicles is applicable at short distances. At longer distances, it can only be used to detect objects, without classifying them. The algorithm is multi-channel and detects objects at different distances from the camera.

In our next research, we will propose an algorithm for estimating the speed of the drone, considering the change in the apparent size of the drone, which reflects the distance travelled, with the time of this change determined from the number of frames in the video.