1 Introduction

In an environment comprising of numerous autonomous robots and obstacles, moving robots safely and in a collision-free manner is a pivotal task. Path planning is a problem of finding a continuous path that the robot will use to drive from the source to the destination, also known as a configuration. The purpose of motion planning in robotics is to find several valid configurations that can be employed in moving the robot from the source to the target. Unlike motion planning, one of the design goals in path planning is discovering the optimum path in terms of the least time used, along with modelling the entire environment. Path planning can be used in fully or partially known environments, as well as entirely unknown environments where sensed information defines the desired robot motion. Mobile robots are a cutting-edge technology that can be employed in various novel research areas such as Internet of Things (IoT) [1], military [2], and health [3]. These robots have also been used in Vehicle Ad hoc Network (VANET) [4] and Flying Ad hoc Network (FANET) [5], a subset of the Mobile Ad hoc networks, and Internet of Drone (IoD) [6] and Internet of Vehicle (IoV) [7] in the IoT category. Considering autonomous routing techniques of these systems, one of the aims is to find safe paths in the shortest possible time, using the resources effectively. These robots consist of sensors and actuators. Furthermore, each robot (node) has a processor and a memory. As such it is adequate to state that these devices can function as an all-in smart agent. Therefore, artificial intelligence-based techniques can be easily implemented using these smart nodes. When appropriated in interconnected and successive systems, it is vital to plan a path for each mobile autonomous device such as unmanned aerial vehicles (UAVs) and drones.

In unstructured environments, filled with uncertain elements, issues on the periphery of robots become increasingly difficult to resolve, causing an ever-growing demand for 3D path planning algorithms. Obviously, a simple 2D algorithm cannot qualify for planning complex situations. The difficulties of path planning in a 3D environment, unlike 2D path planning, increase exponentially due to the inherent kinematic nature of the environment. Furthermore, finding optimal 3D path planning is a Non-Deterministic Polynomial-Time (NP-hard) problem. In general, the metaheuristic methods are widely used to solve various optimization problems [8, 9]. Optimal path planning includes the shortest path length where the selected path should be as far as possible from obstacles; it must be smooth, i.e. without sharp turn, and must consider motion constraints. As such, it is pragmatic to focus on three-dimensional areas with many obstacles and find suitable routes for mobile robots. Therefore, in this paper, three-dimensional path planning methods have been made available. Hence, metaheuristic algorithms are suitable candidates for finding a solution. No-Free-Lunch (NFL) [10] asserts that there is no specific algorithm providing the best solution for every optimization problem. As such, there is a considerable demand to develop new metaheuristic algorithms that can be used in various problems. Solution techniques in the 3D path planning algorithms for autonomous robots include visibility graphs [11], probabilistic road maps [12], randomly exploring algorithms [13], random-exploring algorithms [14], and heuristics. However, literature shows that metaheuristic methods are much better at 3D path planning [15,16,17]. On the other hand, one of the effective methods of solving the 3D path planning problem is sampling-based methods [18]. Rapidly exploring Random Tree (RRT) [19] is one of the most famous sampling-based method. The main aim of RRT is to generate a path between source and destination stations avoiding obstacles. However, the biggest shortcoming of this algorithm is the fact that it may not find the optimum path. On the other hand, metaheuristic algorithms can be an appropriate solution to solve the optimal pathfinding.

Generally, the methods in the sampling-based category are very fast and useful in real applications; indeed, they (1) are able to find a feasible motion plan in a relatively short time and guaranteeing probabilistic completeness (the path found may not be optimal). (2) can be applied in different real areas and various dynamic models. However, there are some shortcomings. They suffer from computational overload and time inefficiency in optimal pathfinding. The other disadvantage is that it is not considered successful in finding an optimal path. In addition, they are not as successful at convergence rate, at least like metaheuristic algorithms. Another issue is that they are not very successful in complex environments [15, 18, 22].

On the other hand, in the metaheuristic-based methods, in the process of movement of each robot from the current station to the next station, the options are generally predetermined in advance and the positions of these options (candidate next stations) are also fixed. Therefore, many metaheuristic-based methods used in solving these types of problems do not deal with the angle and direction of motion and only care about the distance to travel or possible other parameters. Another important problem is that the locations and numbers of intermediate stations between the starting and destination points of each UAV are kept in memory. This may cause inefficient memory usage [18, 19].

To get around this defect, in this paper, a novel hybrid method is proposed using the advantages of sampling-based and metaheuristic-based algorithms for solving the 3D path planning problem. Three metaheuristic algorithms (Grey Wolf Optimization (GWO) [20], Incremental Grey Wolf Optimization (I-GWO) [21], and Expanded Grey Wolf Optimization (Ex-GWO) [21]) are used in the working mechanism of this method. That's why the method is presented with three distinct variants. Benefiting from them, the optimal path planning mechanisms are realized for each autonomous mobile robot in different environments, containing various obstacles without any collisions. Each proposed method attempts to find an almost optimal path by eliminating the process of creating complex environment models based on stochastic approaches. They can be faster and more successful in finding the most suitable solutions. The contribution and objective of the proposed method can be summarized as follows:

  1. (1)

    It is usable in dynamic environments and real applications as fast.

  2. (2)

    It tries to find collision-free optimal paths in a short time with the least process costs.

  3. (3)

    It solves the problem of the excessive computational overhead of sampling-based methods in finding optimum paths. Also, unlike sampling-based methods, it is successful in finding the optimal path with a good convergence rate.

  4. (4)

    It provides efficient memory usage of UAVs. Because, with this method, UAVs do not need to store predetermined static map information.

  5. (5)

    Unlike metaheuristic methods, selectable intermediate nodes between target and source are not predetermined and their numbers are not fixed. Therefore, the proposed method will not only be limited to static paths or a limited number of movements but will be able to work efficiently in dynamic and real environments.

  6. (6)

    It focuses on problem solving, not only with the distance parameter, but also taking into account the angle and direction of the movement.

The rest of this paper is organized as follows. The literature review is explained in Sect. 2. The proposed method is described with the relevant problem in Sect. 3. In Sect. 4, simulation results are presented. The final section provides conclusions and future works.

2 Literature review

In recent years, literature shows that there is interest in studies conducted on path planning and mobile autonomous vehicles [22,23,24,25,26], especially in three-dimensional path planning [27,28,29,30]. The general taxonomy of 3D path planning methods [18], which generally consists of four basic areas, is presented in Fig. 1. Sampling-based algorithms necessitate mathematical representation of information pertaining to the whole search space of the environment, known a priori. This kind of method needs some foreknown information of the whole environment (workspace). In these algorithms, the environment is represented by several nodes. They employ matching random search techniques in the path-finding mechanism to reach the appropriate solution. The matching random search techniques are then further divided into active and passive categories. In the active types, the relevant algorithm can obtain a suitable path to the destination all by its own processing procedure. Passive means algorithms only generate a path from initial to destination stations, and thus a combination of search algorithms is used to select the best feasible path in the net map where many possible paths exist. Some related case studies are presented in [31, 32]. Exploration in a node-based optimal algorithm is performed using a decomposed graph. These algorithms are informed search methods. These methods find an optimal path based on certain decomposition. Numerous studies have been proposed in this category [33,34,35]. Mathematical model-based algorithms include linear algorithms and optimal control techniques. These algorithms model the environment (kinematic constraints) as well as the parameters of the system (dynamic). Afterwards, they map the bounds of these two parameters based on the cost function bound. Some studies in this category are presented in [36,37,38]. The methods in these three categories suffer from high time complexity and local minima capture, especially where mobile robots face multiple constraints when planning a path. Metaheuristic algorithms, a subset of nature-inspired algorithms, are the fourth category of taxonomy that imitate biological interactive behaviours or physical events [30, 39, 40]. These methods attempt to find an almost optimal path by eliminating the process of creating complex environment models based on stochastic approaches [41, 42]. As mentioned earlier, this study proposes a hybrid path planning method. The proposed hybrid method with three versions has been inspired by sampling and metaheuristic algorithms where the main object is finding an optimal path. Mathematically based algorithms require complex calculations; on the other hand, the node-based algorithms can operate in certain and limited conditions. Therefore, methods in the other two categories generally attract more attention from researchers. Therefore, the possibility of developing new algorithms may increase. So, in this section, mainly sampling and metaheuristic-based studies are discussed.

Fig. 1
figure 1

3D path planning algorithms taxonomy [18]

2.1 Review on sampling-based path planning methods

In sampling-based algorithms, like many path planning methods, the map and movement areas are predetermined, but the stations in the middle are not known except for the start and end stations. This seems that these methods are dynamic and are used as a suitable solution for path planning in mobile autonomous devices such as UAV. The relevant methods based on this category employ matching random search techniques in the pathfinding mechanism to reach the appropriate solution and they are divided into active and passive categories. Active algorithms prefer to act randomly while passive algorithms suffer from dynamic barrier detection. The most efficient methods in sampling-based algorithms are those in the active category [18, 32]. The RRT is one of the most famous methods in the active model in this category. In the literature, many versions and improvements have been made regarding the RRT [19]. The RRT series solve the most problematic part of 3D path planning, i.e. holonomic and non-holonomic obstacles [43, 44]. The most common deficiency of the RRT algorithm, similar to other sampling-based methods, lies in the fact that it cannot be used to find an optimal path. The main purpose of RRT is to generate a path between source and destination points. In this algorithm, the next station is defined from the previous state behaviours. In a path planning with the RRT algorithm, a UAV has some possible states to move on. In this step, if an obstacle is located between the two points, that path is discarded, otherwise, the path without any obstacle is chosen. For the explanation of the RRT algorithm, the workspace of the UAVs is shown as p. In addition, pobstacle is the position of obstacles and pfree is the possible area without obstacles. Mathematically, pfree ⊂ p and pobstacles ⊂ p. In this algorithm, the maximum length between two points is important. The controlling of this length is realized by r parameter. The structure of RRT algorithms is described in the following five steps:


Step 1 Initialize the source and the destination positions along with parameter r and check the area (pfree). Here, psource ⊂ pfree and pdestination ⊂ pfree.


Step 2 Select a random station from pfree.


Step 3 Find the nearest station (pnear) from the trajectory that the mobile robot passed through before in order to randomly create stations using the Euclidean metric.


Step 4 If there is no obstacle between pnear and prandom, then prandom is the direction of our robot. Otherwise, ignore it and repeat this step until there is no obstacle in the path. If the distance between pnear and prandom is longer than r, then it is in the direction of prandom along with the r. At the end of this step add prandom in set T (trajectory) as new node.


Step 5 Repeat previous 4 steps until next node of a robot is pdestination.

Another method based on sampling is RRT* algorithm [45]. The RRT* is an optimized modified algorithm that aims to achieve the shortest path, whether by distance or other metrics. The basic principle of RRT* is the same as RRT, but two important additions have been made. First, RRT* records the distance each station has travelled relative to its next station. Second, RRT* is the tree rewiring. After a station (vertex) has been connected to the cheapest neighbour, the neighbours are again examined. Neighbours are checked if being rewired to the newly added vertex will make their cost decrease. If the cost does indeed decrease, the neighbour is rewired to the newly added vertex. This feature makes the path smoother. RRT* suffers from a reduction in performance. Due to examining neighbouring nodes and rewiring the graph, the implementation of RRT* needs more time to complete a single path. The convergence rate of RRT* is slow in pure exploration. Also, the successful balance between exploration and exploitation is not achieved and there may be a local optimal trap similar to other sampling-based algorithms. In [46], an algorithm (D* Lite Rapidly exploring Random Tree star (DL-RRT*)) has been proposed for path re-planning in dynamic radioactive environments. This method, whose purpose is to improve the convergence ratio, uses the grid feature of the D* algorithm. However, it has not been very successful in convergence ratio due to its reliance on a single tree version.

In [47], the authors have suggested a method to build two trees rooted at the start and goal configurations, respectively. The name given to its methods is Bidirectional RRT (B-RRT). Although their purpose is to improve the convergence ratio and avoid possible local optima traps, the computational load is quite high in the two trees that will cover the whole area. Quad-RRT [48] method has been suggested to solve this problem using the Graphics Processing Unit (GPU) threads. However, this mechanism means a high monetary cost. So it's hard to come to life in the real world. In [49] has been proposed an improved version of B-RRT that employed a greedy connect a heuristic for the connection of two-directional trees (as named B-RRT*). However, it is still not very successful in exploration. On the other hand, although it is recommended as a 3D path planning method, it is possible to use it mainly in 2D environments. This restriction is mainly valid to all B-RRT versions.

Besides the advantages of sample-based methods, the shortcoming of the sampling-based methods can be summarized in three basic terms: (1) They suffer from computational overload and time inefficiency in optimal pathfinding. (2) They are not considered successful in finding an optimal path. (3) They are not useful in terms of convergence rate. The proposed method benefits from the advantages of sampling-based methods by eliminating their disadvantages. It is a hybrid approach that employs metaheuristic algorithms to find collision-free optimal paths in a short time with the least process costs. This method is described in detail, along its three different variants, in Sect. 3.

2.2 Review on nature-inspired path planning methods

In [50] has been proposed an Interfered Fluid Dynamical System (IFDS) to solve the path planning problem. The authors suggested one method, which is based on mathematical-based algorithms, to detect obstacles shape and another method, which is based on nature-based algorithms, to find the optimal path for a UAV. The authors have taken the land impact on path planning as a key factor and suggested using a stream function method from their previous work. However, the stream function works in two-dimensional systems. Based on the principles of a liquid flowing around objects and the phenomenon of flow from start to end, a 3D soft path has been created using two techniques, namely analytical and numerical methods. The analytical method handles only spherical obstacles and involves less computation. The numerical method deals with all types of obstacles, but it requires more pretreatment and higher calculation cost. In the IFDS method, complex obstacles such as terrains, mountains, radar, and anti-aircraft systems are converted into cylindrical, conical, and spherical shapes. Afterward, the combined mathematical expression of these transformed shapes is fashioned to obtain 3D flow lines around a single obstacle. If there is more than one obstacle in the path, it is derived using the average of the velocity field. Next, methods such as more control force and virtual obstacle are proposed to deal with the stagnation point and overlapping of obstacles. Finally, a genetic algorithm is used to obtain an optimum 3D path of the UAV by taking the path length and flight height as sub-targets. Studies demonstrate that the IFDS method gives better results than both numerical and analytical methods. Furthermore, the IFDS method has been compared with the ant optimization algorithm and results conclude that the IFDS method gives better results.

Perazzo et al. [51] articulate secure positioning and secure position verification, which are two critical problems in path planning. The related method focused on path planning for drones and gave the meaning of a secured path for an optimum path. In this study, 2D path planning is discussed, although it is not clearly stated. Their paper proposes a drone system that passes through a series of waypoints instead of a static station system. At each waypoint, the drone acts as a station and determines positions safely. Here the task is to determine the capability of the drone in finding the correct path for all devices. The authors state three methods called LocalizerBee, VerifierBee, and PreciseVerifierBee. In LocalizerBee method, isosceles triangles have been created to completely cover the rectangular area determined in the waypoint grid construction structure. After that, drones combine all waypoints by Waypoint Ordering, to find the shortest path with the Travelling Salesman Problem algorithm. In VerifierBee method, authors use prior knowledge of the position of the nodes in three phases. A number of waypoints are created during the initial path construction phase. The home waypoint plus three waypoints for each node are placed in fixed positions to form a minimal verifiable triangle. The Iterative Improvement phase creates the first path. The VerifierBee rediscovers the path created in Iterative Improvement phase, using a local heuristic search algorithm. VerifierBee then analyses each possible change (moving a waypoint to another location) at each step and selects the most appropriate as well as the shortest path. This process ends when a local minimum is found. There are two types of changes in the Waypoint Reordering phase. One that changes the position of the waypoint and the other that removes the waypoint and replaces it with another available way. The PreciseVerifierBee method is an extension of the VerifierBee method, where the structure of the Verifier remains the same except Initial Path Construction and Iterative improvement stages. Unlike the VerifierBee method, the angle of the drone to the ground plays an important role in the PreciseVerifierBee method. As a result of the application, VerifierBee always produces shorter paths in comparison with PreciesVerifierBee method. With an increase in the number of nodes, PrecieseVerifierBee shows worse results compared to VerifierBee or even LocalizerBee. The results of their experiments demonstrate that the proposed algorithms securely localize all the devices, even in presence of drone control errors.

Wang et al. [52] aim at improving the flight path by planning a 3D UAV route along with various types of restrictions mainly in complex combat environments. They have proposed the Improved Bat Algorithm (IBA), a mixture of Differential Evolution (DE), and the Bat Algorithm (BA). DE is an evolutionary algorithm that creates new candidate solutions by uniting many other members in the same population as the parents. BA is an optimization algorithm inspired by the direction-finding behaviour and distance of an object by taking advantage of the echo of a bat, the echolocation. In order to improve the selection of a bat during the IBA location update process, a DE mutation factor was employed. The basic idea behind the IBA is that the bat group uses ecological placement to detect distance and random flight over the search area, updating its location and speed. The IBA bat flight aims to find its food/prey (best solution, most convenient way). The authors use the B-Spline Curve Correction Strategy to further improve the resulting path. It is shown in [52] that their algorithm, IBA, demonstrates better results than BA and is an effective tool that can be used in UAV path planning problems.

In [22] has been presented a new method for a mobile robot in an uncertain obstacle-based environment based on Firefly Algorithm (FA). It solves the challenges of navigation by avoiding the random movement of fireflies and minimizing the computational cost. The authors defined an objective function, which is controlled by the trial and error method. With this function, the paths to be chosen are decided. In this study, which aim is crowded environments, the choice of the obstacle closest to each station is required for path planning and an equation has been defined for this purpose. However, this equation, which contains very few parameters, may not be successful in real and complex environmental conditions. Moreover, this study focused solely on 2D path planning. In [29] has been proposed ground robot based on a new version of Ant Colony Optimization (ACO) for 3D path planning. It defines a new phenomenon update mechanism and constructs various paths between the initial and target points of a robot. It avoids obstacles and is used for solving the easily falling into local optimum and long search times in 3D path planning problems. Performance analysis could not be comprehensive by comparing this proposed method with only the basic ACO method. In addition, generally, it has a higher time and space complexity than GWO-based methods, this is due to the nature of ACO.

The GWO algorithm may be more likely to be successful than other metaheuristic methods in this type of problem on various parameters due to its working mechanism. One of the most important and effective of last studies to solve 3D path planning problem is based on the GWO algorithm, which has been presented in [40]. The authors have proposed a new 3D path planning method for UAVs and they used the GWO algorithm to decrease the complexity in finding paths to find a minimal cost best path. The GWO algorithm is based on the hunting behaviour of grey wolves. Grey wolves are classified as alpha, beta, delta, and omega, where each type of wolf exhibits distinct functionally among encircling, attacking, and hunting functions. In [40] there are three different maps with a varying number of obstacles. Furthermore, there are five UAVs with different initial and final stations. According to the results of the study, the better path cost with best time complexity in the GWO-based method in comparison with other methods such as Dijkstra, A*, D*, and a few other famous metaheuristic-based methods is obtained. Euclidean distance between stations, visited by UAVs, is used to calculate the cost of the path. They claim that paths generated using the proposed method have low path calculation costs. Most of the obstacles in this study have been located in the center of the maps that the UAVs not more effort to reach the destination. According to this paper, GWO-based methods, with the specific features and advantage of the nature of their algorithm, perform a more balanced and better performance in similar problems. Therefore, GWO-based algorithms are sought after in many research and application areas due to their balanced behaviour among various metaheuristic algorithms. For this reason, we use three variants of GWO to propose our path planning method for obstacle detection and avoidance, random movement avoidance, and optimal pathfinding. By applying each of these three GWO-based algorithms separately in our method, three different methods are developed to solve the 3D path planning problem.

In addition, the general shortcomings of metaheuristic methods can be outlined as follows: (1) They use fixed number of intermediate nodes with predetermined locations. Therefore, they can use only a limited number of intermediate nodes. (2) They do not deal with the angle and direction of motion and only cater to simple parameters such as the distance to travel. Therefore, in this study, a novel hybrid method is proposed to solve the related problems by providing all the advantages of each category. This method has a working mechanism that finds the optimal paths without collisions in the shortest possible time with the least process cost. The performances of each solution are discussed and analysed in chapter 4. Also, their performance is compared with several other methods. For this, sampling and metaheuristic-based hybrid mechanisms are suggested. Therefore, in this paper, it will be sufficient to have only the starting and ending stations for each UAV, and it will find the best choices to go along the path.

3 Proposed novel algorithms

As mentioned before, one of the most significant and contemporary problems in robotics is 3D path planning for mobile robots. It is necessary to find the optimal (or close to optimal) path between two source and destination points for robots to collision-free move without human intervention. In this paper, a new 3D path planning method is presented using three metaheuristic algorithms for autonomous UAVs. In the metaheuristic-based path planning methods, the selectable stations between the start and endpoints of each mobile autonomous robot (e.g. UAV) may be known before and therefore routes can be static or each robot may have a limited number of directions of movement (especially, in discrete and grid areas) and they are shortcoming. In other words, the stations between the target and start points of each robot are pre-determined and the positions of these stations (next candidate stations) are also fixed. Therefore, most of the metaheuristic-based methods used in solving such problems are not concerned with the angle and direction of movement, but only with the distance to travel or other possible parameters. Also, as mentioned earlier, they are not good at using memory. It is worth noting that this working mechanism is not also exclusive to metaheuristic algorithms, whereas, in this paper, the direction, as well as the path length, has an effect on the next stations' selection. Also, the number and location of the stations between the two starting and final points are uncertain. In this study, the next station for each robot is unclear, and the intermediate stations are not fixed and predetermined. In this way, it can be applied in more realistic environments and applications. On the other hand, the proposed method can solve the problems of sampling-based algorithms. In the RRT and all its variances, the intermediate stations are chosen randomly (unlike metaheuristic methods). Therefore, they are often fast and also useful in real applications. However, these have not been very successful in finding optimal paths. In addition, they suffer from computational overload and time inefficiency in optimal pathfinding. Therefore, in this paper, a hybrid method with three versions inspired by the two categories (sampling and metaheuristic-based algorithms) are recommended. In this way, both the intermediate stations do not need to be determined beforehand and will obtainable optimized paths by avoiding completely random movements. In this paper, an improvement on the RRT algorithm, namely Adapted-RRT, is presented and it is applied with three different well-known metaheuristic algorithms; GWO, I-GWO, and Ex-GWO. These methods are named Adapted-RRTGWO, Adapted-RRTI-GWO, and Adapted-RRTEx-GWO.

3.1 Definitions

The path planning problem is related to finding optimal paths in a defined environment containing various obstacles. UAVs detect these obstacles and attempt to find a path without collisions. UAVs can be categorized under the mobile robots that meant to fly from one point to another autonomously in the most complicated three-dimensional (R3) environments. In these environments, UAVs must consider the Z dimension alongside the X and the Y dimensions. In some cases, environments that UAVs are considered to operate inside can be defined as the workspace, w. The most difficult part of complicated environments for UAVs is the various shapes of obstacles. The ith obstacle in the workspace can be represented as\(w{o}_{i}\). All the free parts of the environment are the set of coordinates,\({w}_{\rm free}\). So, free spaces in the environment that the mobile robots can move inside without facing any possible collision can be calculated using\({w}_{\rm free}=w {U}_{i} {wo}_{i}\). Both the initial (source, start) point of the mobile robot,\({w}_{\rm source}\), and it’s final (destination, target) point in the workspace, \({w}_{\rm destination}\) should be the element included in the\({w}_{\rm free}\). The definition of 3D path planning can be defined as\(\left(S\right)= {w}_{\rm source}\), and\(\left(T\right)= {w}_{\rm destination}\). Here \(\delta\) represents the functions of initial and final points of a robot, respectively, and the bounded variation is given as\( :\left[S, T\right]\to {R}^{3}\). If there is any continues process \(\Phi \) exists that can attain\(\delta (\tau )\in {w}_{\rm free}\), then this process can be called as 3D path planning. Here \(\tau \) is a set in which positions are located on the path that takes the UAV from source to destination. The main purpose of 3D path planning is to find a trajectory between the source (start, initial) and the destination (final, target). For that reason, a path planning function is defined in Eq. 1.

$$f\left({\rm source}, {\rm destination}\right)\to {\rm Trajectory}$$
(1)

where source and destination denote relative coordinates of the \({\rm source}\,\left({X}_{\rm source}, {Y}_{\rm source}, {Z}_{\rm source}\right)\) and the \({\rm destination}\,\left({X}_{\rm destination}, {Y}_{\rm destination}, {Z}_{\rm destination}\right)\) positions on the map. Each path contains a cost for motion from source to destination. There are different parameters that define cost between the two points. In each map, the positions of n number of search agents in the m dimensional search space can be defined as the matrix in Eq. 2.

$${\rm positions}=\left[\begin{array}{ccc}\begin{array}{cc}{p}_{1}^{1}& {p}_{1}^{2}\end{array}& \begin{array}{cc}\cdots & \cdots \end{array}& {p}_{1}^{m}\\ \begin{array}{cc}{p}_{2}^{1}& {p}_{2}^{2}\end{array}& \begin{array}{cc}\cdots & \cdots \end{array}& {p}_{2}^{m}\\ \begin{array}{cc}\cdots & \cdots \\ \begin{array}{c}\cdots \\ {p}_{n}^{1}\end{array}& \begin{array}{c}\cdots \\ {p}_{n}^{2}\end{array}\end{array}& \begin{array}{c}\begin{array}{cc}\cdots & \cdots \end{array}\\ \begin{array}{cc}\cdots & \cdots \end{array}\end{array}& \begin{array}{c}\cdots \\ {p}_{n}^{m}\end{array}\end{array}\right]$$
(2)

where pm’s represents position coordinates of each station that the mobile robot takes on the map. In most studies, cost is calculated using consumption of energy, Euclidean distance, and velocity. In this study optimal cost is considered in a path with minimum Euclidean distance between two points. Due to the Euclidean metrics system distance between 2 completely randomly selected points in the three-dimensional workspace can be expressed as: \(D\left({\rm station}_{1}, {\rm station}_{2}\right)= \sqrt{{\left({x}_{1}-{x}_{2}\right)}^{2}+{\left({y}_{1}-{y}_{2}\right)}^{2}+{\left({z}_{1}-{z}_{2}\right)}^{2}}.\) For the calculation of the cost (length of the trajectory) of nth UAV can be transformed into the sum of the Euclidean distances between tuples from source to the destination stations, we present the equation as below.

$${\rm Cost}= \sum_{m=S}^{D}\sqrt{{\left({x}_{m+1}-{x}_{m}\right)}^{2}+{\left({y}_{m+1}-{y}_{m}\right)}^{2}+{\left({z}_{m+1}-{z}_{m}\right)}^{2}}$$
(3)

Path planning itself for the mobile robots sometimes may not be enough to solve the problem. The 3D path planning algorithm should find the optimum paths to save time, energy, or other consumable sources for mobile robots. The definition of optimal path planning can be defined as follows. Given a path planning problem function in Eq. 1 and cost function in Eq. 3, the definition of the path planning is fulfilled and is found a path δ\({^{\prime}}\). Here, Cost(δ\({^{\prime}}\)) is equal to minimum Cost(δ) and δ is the set of all possible paths. In this situation, \({\delta }^{^{\prime}}\) is an optimal path, \({\Phi }^{^{\prime}}\) is the optimal path planning.

Figure 2 shows a sample trajectory between the source and destination stations in 2D and 3D perspectives. In the trajectory, there are some stations that the UAVs movement over stations. The S indicates the initial station, and the D is the final station. The p presents the possible stations that UAVs can move in the environment. In all path planning methods, the next station is selected from each current station and this process continues until it reaches the destination or target station. While in some methods the stations' elections are chosen randomly, in others, they are realized by a certain mechanism. In the final phase of all methods, the sum of all tuples costs based on certain mechanisms, that have been calculated, is obtained; giving the cost of the path. In the proposed method in this study, the cost of each tuple on the path is calculated by a fitness function (Eq. 3). Therefore, it is possible to find an optimum or close to an optimal path with a minimum cost between two points.

Fig. 2
figure 2

The randomly and optimized trajectory

The other term that needs to be defined is random position matrix. Hence, primarily, the proposed method initializes the random position matrix. Each row of position matrix defines the path, and the columns represent the number of stations, in the path, to the destination. The \(\left({x}_{n}^{m}, {y}_{n}^{m}, {z}_{n}^{m}\right)\) presents coordinates of each station where m is the aforementioned index of stations and n is the number of search agents in each method (Table 1). The search agents are the configuration parameter of the metaheuristic algorithms. Then, for each metaheuristic algorithm, a search space, based on the position matrix, is initialized. Table 2 represents the distance between tuples. In this table, each row represents a path length. Each element of the row shows the distance between two points as \({d}_{\left(i, j\right)}^{n}\), where i is the current state and j is the previous state.

Table 1 The position matrix of each path
Table 2 The search space that represents distance between tuples

3.2 Map design

Typically, the first step in path planning is to represent the workspace as a map. The presence of obstacles in the maps makes the task of finding an optimal path a bit complex for the UAVs. The challenge is avoiding the obstacles and reaching the final position with minimum costs. In this paper, four maps have been considered to evaluate the proposed method; a small map, a medium map, a crowded medium map, and a large map. In Table 3, the boundaries for the four maps are presented. Furthermore, three UAVs with dissimilar start and end positions are used. These three-dimensional points of UAVs are listed in Table 4. The number of obstacles in each map is different, and the position of each obstacle is depicted in Table 5. In this study, obstacles are assumed to be three-dimensional quadrilaterals of different sizes.

Table 3 3D map boundary
Table 4 3D map UAV initial and final positions
Table 5 3D obstacles coordinate for each map

3.3 Obstacle management

Obstacle avoidance is one of the many challenges that exist in the path planning problem. In this study, a mechanism is proposed to avoid the collision of the UAVs with obstacles, which benefits from geometric and calculus-based formulae. The proposed obstacle avoidance mechanism has an important role in finding the shortest path without collision. This mechanism entails two main steps that take place in a sequential fashion.

  1. 1.

    Whenever the randomly assigned station finds itself within the special parameters of the obstacle, it is relocated to the nearest corner of the obstacle as presented in Algorithm 1 and Fig. 3a.

  2. 2.

    Whenever an obstacle is encountered in the path found by the metaheuristic algorithm, the following steps are taken:

    1. 2.1

      Sorting Phase: sort the obstacles by Euclidean distance proximity to the current station. Significance of sorting is for detecting and avoiding various number of obstacles in series from the nearest one to the farthest one.

    2. 2.2

      Detection and Avoidance Phase:

      1. 2.2.1

        Detection: if there is a collision between the couple stations, geometric calculation method will be used. The detection mechanism is based on Algorithm 2. A virtual triangle between current, next stations, and the center point of obstacle is considered. As shown in Fig. 3b, the h is the height from center point of obstacle to the baseline, and the h can be found using Heron's formula. Besides, the R is the radius of the around circle of the obstacle. These parameters are used to check whether there is a collision or not.

      2. 2.2.2

        Avoidance: in this step the closest corner of the obstacle to the current station is calculated using Euclidean distance. The UAV moves to that corner and then across the opposite side of obstacle which has clear sight of next station or next obstacle as shown in Algorithm 2 and Fig. 3c.

      3. 2.2.3

        If the UAV does not reach to the next station, repeat phase 2.2.1 and 2.2.2.

  3. 3.

    If the UAV does reach the next station, repeat sorting and detection and avoidance phase for each couple in the path.

Fig. 3
figure 3

Obstacle avoidance mechanism

figure a
figure b

3.4 Grey wolf algorithms: GWO, I-GWO, and Ex-GWO

In this section, the GWO-based algorithms used in our proposed method are summarized. The GWO algorithm is inspired by grey wolves in nature and the natural behaviour of the grey wolves are mathematically modelled. The main behavioural traits of the wolves are encircling, hunting, and attacking the prey. There are four types of wolves in each pack; alpha, beta, delta, and omega. Each wolf has different responsibilities in the pack. Alpha, beta, and delta wolves encircle the prey, whereas the omega wolves follow the first three wolves' position. In simple terms, omega wolves update their own position based on the other wolves in order to attack the prey. The GWO algorithm is mathematically defined using Eqs. 411. Here t indicates the current iteration, T demonstrates maximum number of iterations, \(\mathop{X}\limits^{\rightharpoonup} \) indicates the position vector of a wolf. Also, D is a vector that depends on the location of the target. Here the coefficient vectors \(\mathop{A}\limits^{\rightharpoonup} \) and \(\mathop{C}\limits^{\rightharpoonup} \) are considered to lead in encircling the prey (Eqs. 6 and 7). These parameters control the trade-off between the exploration and exploitation phases in all variants of used GWO algorithms. In addition, \(\mathop{a}\limits^{\rightharpoonup} \) is linearly decreased from 2 to 0 over the courses of iteration. It is used to get closer to the solution range and r1 and r2 are the random vectors in a range of [0, 1]. In all variants, when \(\mathop{A}\limits^{\rightharpoonup} \) is less than 1, the wolves in the pack attack to hunt, otherwise, they try to find prey to be hunted. Figure 4 shows the working mechanisms of these algorithms by considering exploration and exploitation phases.

Fig. 4
figure 4

Working mechanism considering exploration and exploitation a GWO, b I-GWO, c Ex-GWO

$$\overrightarrow{D}=\left|\overrightarrow{C}\cdot \overrightarrow{{X}_{p}}\left(t\right)-\overrightarrow{X}\left(t\right)\right|$$
(4)
$$\overrightarrow{X}\left(t+1\right)={\overrightarrow{X}}_{p}\left(t\right)-\overrightarrow{A}\cdot \overrightarrow{D}$$
(5)
$$\overrightarrow{A}=2\overrightarrow{a}\cdot \overrightarrow{{r}_{1}}-\overrightarrow{a}$$
(6)
$$\overrightarrow{C}=2\cdot \overrightarrow{{r}_{2}}$$
(7)
$$\overrightarrow{a}=2\left(1-\frac{t}{T}\right)$$
(8)
$$\overrightarrow{{D}_{\alpha }}=\left|\overrightarrow{{C}_{1}}\cdot \overrightarrow{{X}_{\alpha }}-\overrightarrow{X}\right|, \overrightarrow{{D}_{\beta }}=\left|\overrightarrow{{C}_{2}}\cdot \overrightarrow{{X}_{\beta }}-\overrightarrow{X}\right|, \overrightarrow{{D}_{\delta }}=\left|\overrightarrow{{C}_{3}}\cdot \overrightarrow{{X}_{\delta }}-\overrightarrow{X}\right|$$
(9)
$$\overrightarrow{{X}_{1}}={\overrightarrow{X}}_{a}-\overrightarrow{{A}_{1}}\cdot \overrightarrow{{D}_{\alpha }}, \overrightarrow{{X}_{2}}={\overrightarrow{X}}_{\beta }-\overrightarrow{{A}_{2}}\cdot \overrightarrow{{D}_{\beta }}, \overrightarrow{{X}_{3}}={\overrightarrow{X}}_{\delta }-\overrightarrow{{A}_{3}}\cdot \overrightarrow{{D}_{\delta }}$$
(10)
$$\overrightarrow{X}\left(t+1\right)=\frac{\overrightarrow{{X}_{1}}\left(t\right)+\overrightarrow{{X}_{2}}\left(t\right)+\overrightarrow{{X}_{3}}\left(t\right)}{3}$$
(11)

In I-GWO algorithm, the hunting mechanism of the GWO has been modified. The authors of I-GWO algorithm supposed that the alpha wolf has the best knowledge about the prey. The second wolf in the pack follows the alpha wolf, and in the same way, the nth wolf in the pack follows all n − 1 wolves before it in the pack. Simply putting, the alpha wolf guides the other wolves in the pack for hunting. There are also some modifications in I-GWO mathematically from the classic GWO. In the I-GWO, each grey wolf in the pack encircles the prey; mathematically presented in Eqs. 4 and 5. In this method, authors have been used a new version of the \(\overrightarrow{a}\) parameter so as to improve the convergence speed of the metaheuristics, which is defined in Eq. 12. The I-GWO benefits from the position of the alpha wolf while updating other wolves’ positions in the pack (Eqs. 1315).

$$\overrightarrow{a}=2\left(1-\frac{{t}^{2}}{{T}^{2}}\right)$$
(12)
$$\overrightarrow{{D}_{\alpha }}=\left|\overrightarrow{{C}_{\alpha }}\cdot \overrightarrow{{X}_{\alpha }}-\overrightarrow{X}\right|$$
(13)
$$\overrightarrow{{X}_{1}}={\overrightarrow{X}}_{\alpha }-\overrightarrow{{A}_{1}}\cdot \overrightarrow{{D}_{\alpha }}$$
(14)
$$\overrightarrow{{X}_{n}}\left(t+1\right)=\frac{1}{n-1}\sum \limits_{i=1}^{n-1}{X}_{i}\left(t\right);{ n}=2, 3, \dots ,{ m}$$
(15)

Ex-GWO algorithm is yet another version of the GWO algorithm. In the Ex-GWO, the first three wolves; alpha, beta, and delta wolves, all have the best knowledge about the prey. The fourth wolf in the pack, just like in the GWO algorithm, updates its own position based on alpha, beta, and delta wolves. The fifth wolf updates its own position using the positions of the first three wolves and the fourth wolf. As such, the nth wolf updates its own position based on the first three wolves in the pack and the n − 3 wolves before it. So, the wolves benefit more from the packs’ knowledge in order to hunt and attack the prey. In the Ex-GWO, encircling of the prey is mathematically expressed using Eqs. 47. The hunting and position updates are also given by Eqs. 1618.

$$\overrightarrow{{D}_{1}}=\left|\overrightarrow{{C}_{1}}\cdot \overrightarrow{{X}_{1}}-\overrightarrow{X}\right|, \overrightarrow{{D}_{2}}=\left|\overrightarrow{{C}_{2}}\cdot \overrightarrow{{X}_{2}}-\overrightarrow{X}\right|, \overrightarrow{{D}_{3}}=\left|\overrightarrow{{C}_{3}}\cdot \overrightarrow{{X}_{3}}-\overrightarrow{X}\right|$$
(16)
$$\overrightarrow{{X}_{1}}={\overrightarrow{X}}_{1}-\overrightarrow{{A}_{1}}\cdot \overrightarrow{{D}_{1}}, \overrightarrow{{X}_{2}}={\overrightarrow{X}}_{2}-\overrightarrow{{A}_{2}}\cdot \overrightarrow{{D}_{2}}, \overrightarrow{{X}_{3}}={\overrightarrow{X}}_{3}-\overrightarrow{{A}_{3}}\cdot \overrightarrow{{D}_{3}}$$
(17)
$$\overrightarrow{{X}_{n}}\left(t+1\right)=\frac{1}{n-1}\sum\limits_{i=1}^{n-1}{X}_{i}\left(t\right);{ n}=4, 5, \dots ,{ m}$$
(18)

3.5 Hybrid path planning methods: adapted-RRTGWO, adapted-RRTI-GWO, and adapted-RRTEx-GWO

In this paper, a new hybrid path planning method with three versions is proposed to use the advantages of metaheuristic and sampling-based methods in one place. We propose an improvement in the RRT algorithm using three metaheuristic algorithms (GWO, I-GWO, and Ex-GWO). In order to use the RRT concept together with metaheuristic methods, it is necessary to make the classical RRT method compatible and efficient. Thus, the success rate of the proposed method in finding optimal solutions may be increased. Based on this, a newer version of the RRT method (Adapted-RRT) is recommended in this study. Figure 5 shows the working mechanism of the RRT and Adapted-RRT.

Fig. 5
figure 5

Working mechanism sample of a RRT and b Adapted-RRT

The random selection of the intermediate stations in all RRT methods is realized by a mechanism with considering possible obstacles. In the RRT, if an obstacle is located between the two points, the relevant path is discarded, otherwise, the path without any obstacle is chosen. However, this algorithm or its related improvements, either failed to find the optimal path or they suffered over-computed processes to find the optimal path and were not efficient in time. As stated in Sect. 2.1, the terms defined for this algorithm are used. As shown in Fig. 5a, in the RRT, when the UAV is in state 1, there are three possible stations, a UAV chooses a station without obstacle in the path and follows this path to reach the destination.

Adapted-RRT avoids possible obstacles without making random movements. It is a hybrid method that uses metaheuristic algorithms in its working mechanism to find collision-free optimal paths in a short time with the least process costs. It has also suitable behaviour in convergence rate. We define the r parameter more efficiently with a different mechanism and also use this value in choosing the appropriate path length in a next station selection of each UAV. For this, it acts with the X parameter in its metaheuristic algorithms. It also tries to find the optimal direction to reach the destination station via a shorter route as collision-free. For this, the angle is defined and optimized by metaheuristic algorithms. In the Adapted-RRT is deployed two important operations to select next stations and finally generate path; the length of each movement and the direction of each movement. For this, three metaheuristic algorithms are used to perform these two operations in the most appropriate way. The GWO algorithms may be the best choice to perform these operations in a balanced way. Accordingly, three different variants of this algorithm are used. The Ex-GWO-based path planning method may be performed more successfully in larger and crowded environments with more populations and iterations and I-GWO-based path planning method may give good results in medium and smaller with less populated environments. In addition, the GWO-based method is generally between these two methods and behaves in a balanced way. Equation 3 is used to calculate the total path cost of the selected stations in all versions of our proposed method. The working mechanism of the proposed method, which is based on sampling and metaheuristic algorithms, is illustrated schematically in Fig. 5b.

3.5.1 Calculation of length of each movement

In the proposed method, generally, each UAV has a travel length of 0 to r. This value indicates the length of each movement to a next station (new station). As mentioned before, the Euclidean distance determines the cost of each path. There is a limitation parameter that forces the UAVs to move around to the r value. It is necessary to use this since the next stations are not initially defined, otherwise, the agents reach the destination or a random station without any restriction. The r value calculation in the Adapted-RRT is obtained using Eq. 19. Here dim represents stations number between source and destination states. On the other hand, in this study, length of each movement, which is selected using metaheuristic algorithms, is aimed to be the best length. The value calculated from metaheuristic algorithms is named with X. Therefore, firstly, an X value is calculated from the GWO, I-GWO, and Ex-GWO algorithms based on Eqs. 11, 15, and 18, respectively. If this value is greater than r, the length of the value is calculated according to the first part of Eq. 20, otherwise, it is obtained by another part of Eq. 20.

$$r=\left({p}_{\rm destination}-{p}_{\rm source}\right)/\left({\rm dim}+1\right)$$
(19)
$$\left\{\begin{array}{c}X=r+\left(X-r\right)*{\rm rand}\left(0, 1\right) \,\,\,{\text{ if }}X>r\\ X=X+\left(r-X\right)*{\rm rand}\left(0, 1\right) \,\,\,{\text{ if }} X\le r\end{array}\right.$$
(20)

3.5.2 Conclusion of direction of each movement

In this method, another control parameter to find an optimal path is direction. The movement direction is not declared, because it is desired to avoid standing in the current station. Thanks to this operation, the most suitable next station for each current station is selected and therefore the corresponding UAV moves to the selected next station position. In this step, the Adapted-RRT algorithm still uses three GWO algorithms to find an optimal angle to determine the movement direction. The angles obtained from these metaheuristic algorithms are updated to a more precise value as the iterations progress. The parameter a is very important to determine the most appropriate angle. While a has the largest value, the larger angle is chosen and as the iterations progress, a value becomes smaller and hence the angle will also narrow. The a is linearly decreased from 2 to 0 over the courses of iteration. This parameter helps the Adapted-RRT algorithm in finding a better solution over the iterations of the proposed method. As mentioned before, the first process in the work of the GWO algorithms is to initialize a, A and C values (Eqs. 6, 7, 8, and 12). The coefficient vectors \(\overrightarrow{A}\) and \(\overrightarrow{C}\) are considered to lead in encircling their hunt (angle) based on Eqs. 4 and 5. These parameters control the trade-off between the exploration and exploitation phase in each GWO algorithms.

In angle calculation, first, the angle between current and destination stations is calculated by Eq. 21. For this, the angle is determined with cosines. Here, the Euclidian distance between current and destination stations is effective in finding the angular value. The parameter \(EuclDist\) represents this distance. Then, Eqs. 22 and 23 are used to find the angle between current and next stations. It is most important in the next station selection. The parameter \(\beta \) determines the movable degrees that take UAVs to the destination. This parameter is experimental that in this study it is chosen 15 degrees. Equation 24 is used to find the coordinates of candidate state based on the angle and X value obtained. Since the problem is three-dimensional, there are two angles, so it is quite natural to update two axes. The working mechanism of the angle finding and updating is shown in Fig. 6 with a simple example.

Fig. 6
figure 6

Angle finding and movement length updating mechanism in a certain iteration of proposed method- An example

$$cos\left({\rm angle}\right)=\frac{{X}_{\rm destination}-{X}_{\rm current}}{{\rm EuclDist}_{{\rm destination}, {\rm current}}}$$
(21)
$${\rm Angle}_{1}=\left({\rm arccos}\left({\rm angle}\right)+\left(\beta *a\right)\right), {\rm Angle}_{2}=\left({\rm arccos}\left({\rm angle}\right)-\left(\beta *a\right)\right)$$
(22)
$${\rm UpdatedAngle}\, {:=}\, {\rm rand}\left({\rm Angle}_{1}, {\rm Angle}_{2}\right)$$
(23)
$${X}_{\rm next}={X}_{\rm current}+X*cos\left({\rm UpdatedAngle}\right), {Y}_{\rm next}={Y}_{\rm current}+X*sin\left({\rm UpdatedAngle}\right), {Z}_{\rm next}={Z}_{\rm current}$$
(24)

Based on these two operations, the most suitable next station is selected for each current station. Therefore, the optimal path will be determined for each UAV. The pseudocode and flowchart of the proposed method with three variants of metaheuristic algorithms (Adapted-RRTGWO, Adapted-RRTI-GWO, and Adapted-RRTEX-GWO) can be found in Algorithm 3 and Fig. 7. The general architecture of the proposed method is also represented in Fig. 8.

Fig. 7
figure 7

The flowchart of the proposed method

Fig. 8
figure 8

General architecture of the proposed method

figure c

4 Simulation results

This section presents the results of the proposed method with three variants of metaheuristic algorithms (Adapted-RRTGWO, Adapted-RRTI-GWO, and Adapted-RRTEx-GWO) and evaluates their performance. They are compared with 8 mixed sampling and metaheuristic-based methods; BPIB-RRT* [49], tGSRT [53], GWO [20], I-GWO [21], Ex-GWO [21], PSO [54], Improved BA [52], and WOA [55] using same environmental conditions. The implementation and analysis presentation has been performed using Java and MATLAB. The proposed methods and other used methods to compare are simulated on a Core i7-5500U 2.4 processor with 8 GB of RAM.

4.1 Assumptions of simulation

In this study, different scenarios have been considered in various conditions such as map size, the number of obstacles and their sizes, start and target points of UAV to better analyse the performance of the proposed method. In this regard, four maps (small, medium, crowded medium, and large), presented in Table 3, with different starting and ending boundaries have been used in simulations. Furthermore, the starting and final stations of three UAVs are listed in Table 4, whereas the obstacles coordinates are given in Table 5. The shape of obstacles is supposed rectangle. All the coordinates are presented in three-dimensional space. Moreover, the assumed environment is continuous and bounded. The optimal path costs, execution time and complexity, and convergence rate analysis is explained in the following section in detail, using various population sizes and iterations numbers as given in the couple-tuples form: (25, 40), (50, 100), (100, 100). In addition, the parameter values of metaheuristic algorithms used are presented in Table 6.

Table 6 The simulation parameters

4.2 Comparison and evaluation of the path costs

In this section, the Adapted-RRTGWO, Adapted-RRTI-GWO, and Adapted-RRTEx-GWO for path planning are analysed based on cost function (Eq. 3). All the obtained cost values are in centimetres. Also, the overall time parameter refers to the average running time of each method. As metaheuristic algorithms may obtain different as well as near solutions, we run each algorithm 10 times. The best cost values are presented with different population sizes and iteration numbers (see Tables 7, 8, 9 and 10). Each version of the proposed method has three UAVs with different start and final stations. According to the results, the Adapted-RRT-based methods are moved to the next station based on the adapting current position according to the destination position. With larger dimensions of the map and the increasing number of obstacles, the proposed algorithms give better minimum cost values, when compare with other algorithms.

Table 7 Simulation results for small map
Table 8 Simulation results for medium map
Table 9 Simulation results for medium crowded map
Table 10 Simulation results for large map

Among all the algorithms used in this paper, Adapted-RRTEX-GWO algorithm gives the best results compared to other algorithms, while Adapted-RRTI-GWO and I-GWO algorithms take in the second and third ranks, respectively. Table 11 presents ranking summary of each algorithm. Briefly, each path planning method runs in the three different populations and iteration sizes on four maps. Table 11 shows the percentage of algorithms obtaining the minimum cost. Here, each path planning method has a different mechanism, as the metaheuristic operations and mechanism vary, the obtained results are different. The proposed adapted-RRT path planning method benefits from the parameters of the metaheuristic algorithms to find the optimal path.

Table 11 Ranking summary of used algorithms in cost path parameter

Although it is not possible to make a definite generalization based on the analysis of the results obtained, a result like the following may arise. The Adapted-RRTEX-GWO can perform more successfully in larger and crowded environments with more obstacles. Also, the more populations and iterations used, it may have more chance to find good results. The main reason for this is that in the Ex-GWO method, almost all wolves in the pack have an important role in each other's position update. Therefore, the wolves in the pack minimize the escape paths of the hunt (prey), and hence, the hunts can be caught faster. On the other hand, the Adapted-RRTI-GWO method may give good results in small-medium sized environments. In addition, it may be more successful than other GWO algorithms in fewer populations and iterations because it depends only on the alpha wolf. The GWO-based method is generally between these two methods and behaves in a balanced way.

All the paths generated for different maps have been shown in Fig. 9 is a perspective view. In this figure, it shows the path of each UAV travels in 3D. The cost of these paths was presented in the previous paragraph in Tables 7, 8, 9 and 10. In this figure, the balls show the source (initial) station of each UAV, and the star symbols show the destination state of each UAV. As mentioned before, there are three UAVs on each map, that the proposed path planning methods generate collision-free optimal paths. As such, there are three different paths generated for each algorithm; making 9 different paths for each map. The results show that the proposed methods generate optimum paths without any collision. Also, the experiments demonstrate that the metaheuristic algorithms need at least 40 iteration number and 25 population size for the optimal path, after which the metaheuristic algorithms are less likely to decrease the cost of the path. This issue has also been determined in studies in the literature [17, 22, 39].

Fig. 9
figure 9

Generated optimal paths in four maps

4.3 Comparison and evaluation of the execution time

The execution times of each algorithm are presented in Fig. 10. The proposed algorithms have better performance in terms of time taken when going through the maps. The execution time difference among the proposed algorithms and others are detailed in the figure. The performance of the execution time of Adapted-RRTEx-GWO and Adapted-RRTI-GWO is better than others, respectively. In the Adapted-RRT method, locating the next station necessitates only the angle and the distance of the destination and the next station selection is based on the situation. Therefore, it can find suitable solutions in a short time. So, this is due to its hybrid mechanism. Another important inference is that as the population and iteration numbers of metaheuristic algorithms increase, the overall time parameter will also increase, but this problem is not encountered despite using metaheuristic algorithms in the proposed hybrid versions.

Fig. 10
figure 10

Execution time analysis for different maps in each algorithm

Also, in time complexity analysis, while metaheuristic-based methods generally have O(n2) ≤ T notations, sampling-based methods such as RRT have O (n Logn) ≤ T ≤ O(n2) complexity. Our proposed methods, which are a hybridization of sampling and metaheuristic methods, have O(n2) complexity. In the computational analysis, the Adapted-RRTI-GWO is better than the two other proposed. Because I-GWO depends on only alpha wolf and therefore, calculations are more efficient. Among the other two algorithms (Adapted-RRTGWO and Adapted-RRTEx-GWO), the best performance belongs to Adapted-RRTGWO. Because only the best three wolves are taken into account in Adapted-RRTGWO. However, in Adapted-RRTEx-GWO, every wolf in the group follows all their previous wolves and performs an update process. On the other hand, the complexity of BPIB-RRT * is O(n Logn), tGSRT is O(n2), and the complexity of the rest of the other algorithms is equal or bigger than O(n2).

4.4 Comparison and evaluation of the convergence rate

Another analysis parameter is the convergence rate. Figures 11, 12, and 13 present the convergence curve of each path planning algorithm. As aforementioned, the obstacles numbers and the boundary sizes of the map are declared in Sect. 3.2. The three versions of the proposed path planning method have different structures with respect to exploration and exploitation. The presented figures illustrate the convergence curve of algorithms with 40 iterations and a population size of 25. In general metaheuristic algorithms have better convergence performances than others. Among the metaheuristic approaches used in this study, GWO-based methods are more successful due to their working mechanism as a swarm-intelligence group. Because wolves can update their positions according to the positions of other wolves in the herd and approach the hunt (solution). Based on the results, the Ex-GWO-based Adapted-RRT algorithm has better convergence curve behaviour overall in many cases, because in the Ex-GWO algorithm the experience of the entire population is used in the optimal pathfinding, in each iteration. When the convergence curve behaviours are analysed, it is seen that the Adapted-RRTEx-GWO method generally behaves more balanced between exploration and exploitation phases. With this balance, it is slowly moving towards a global optimum. Thus it gives a chance to find all possible candidate solutions in the global workspace. The term convergence in the literature [20, 55, 57,58,59] refers to the rate or behaviour of an algorithm towards the global optimum. Of course, a quick convergence results in local optima stagnation. By contrast, sudden changes in the solutions lead to local optima avoidance but reduce the convergence speed towards the global optimum. In other words, the search agents change abruptly in the early stages of the optimization process and then gradually converge. The proposed method generally has a balanced movement in all situations where it could find or not the best solution. In other words, using the Ex-GWO algorithm, this method mainly exhibits balanced convergence behaviour. Also, thanks to its characteristic feature, it outperformed in more complex and larger areas. According to the results and the figures obtained, regarding the I-GWO algorithm, UAVs reach the best optimal path earlier than when using other techniques. In the analysis, different iteration sizes have been considered to get the size of the optimal iteration. As a result of our observations, we conclude that 40 iterations are enough to find the best path. Because it was seen that the same results were obtained when more iterations were continued. The acquired outcomes also indicate that the execution time of algorithms is variable in the maps. Due to the number of obstacles and the map boundary, different results are collected. Also, while using three UAVs with different initial and final states' behaviours in the convergence rate analysis, it is shown that the number of obstacles has an effect on the path cost.

Fig. 11
figure 11

Convergence analysis for UAV1 on each map with 25 populations and 40 iterations

Fig. 12
figure 12

Convergence analysis for UAV2 on each map in 25 populations and 40 iterations

Fig. 13
figure 13

Convergence analysis for UAV3 on each map with 25 populations and 40 iterations

Figure 14 depicts the boxplot analysis of each map and UAV1. The aim is to compare the performance of the proposed and used methods. These results are acquired for 10 runs of metaheuristics with iteration size of 40. The box plot graph analysis shows the maximum and minimum values of the best cost obtained along with the frequency of the values. The boxplot analysis shows that Adapted-RRTEx-GWO algorithm gives better performance in comparison with other algorithms. It can be stated, based on values in Sect. 4.2, the finding costs by Ex-GWO-based methods are generally close to the average values, shown by red lines in the Fig. 14. It should be noted that best cost was typically obtained after an average of 10 runs.

Fig. 14
figure 14

Boxplots graph analysis for UAV2 for each map with 25 populations and 40 iterations

4.5 General comparison and discussion

Table 12 presents the general characteristics of each type of method in path planning. In this paper, we compared our proposed three-versions method with 8 different methods (sampling and nature categories-based). The proposed method has been evaluated in a simulation environment like other algorithms used to compare. In this simulation, the obtained results and analysis of them explain the proposed method outperformed other algorithms in defined considered parameters. As a result, a summary evaluation of our proposed hybrid method with other methods in both categories has also been made. The values and ranges of the parameters in this study have been used considering the parameters and their ranges as they appear in the literature [18, 23, 27, 29, 43, 56].

Table 12 Features of the focused path planning methods

The advantages and outperform of the method proposed in this study are discussed in the simulation and analysis section. However, this method did not take into account UAVs' own variables such as the battery, speed, altitude, and processing power. In addition, dynamic conditions such as wind and rain in real environments are not taken as a basis. However, the proposed method is also expected to be successful under these conditions, in line with its general architecture and hybrid mechanism. Since this method has a wide range, it can be used in real and various application areas, and new studies are planned in the future based on the parameters that are not considered in this study.

5 Conclusions and future scope

This study presented a new hybrid method with three versions to solve 3D path planning problem in autonomous UAVs. In this paper, an improvement on RRT algorithm, namely Adapted-RRT, is presented and it is applied with three different well-known metaheuristic algorithms; GWO, I-GWO, and Ex-GWO. These methods are named as Adapted-RRTGWO, Adapted-RRTI-GWO, and Adapted-RRTEx-GWO. They were suggested to find the optimal paths between initial and final stations of each UAV without any collision as well as with minimum time and space complexities. Adapted-RRT avoids possible obstacles without making random movements. In the Adapted-RRT were deployed two important operations (length of each movement and direction of each movement) to select next stations and finally generate path. For this, three metaheuristic algorithms are used to perform these two operations in the most appropriate way. Based on results, it was concluded that the Adapted-RRTEX-GWO can perform more successfully in larger and crowded environments with more obstacles, populations and iterations. On the other hand, Adapted-RRTI-GWO method may give good results in small-medium sized environments. In addition, it may be more successful than other GWO algorithms in fewer populations and iterations. The GWO-based method is generally between these two methods and behaves in a balanced way. In this study, four different maps with various obstacles have been used, furthermore, three UAVs with different start and destination states have been considered. The proposed methods have been analysed in terms of optimal path costs, execution time, and convergence rate by varying the population sizes as well as the iteration numbers. The results obtained show that the Adapted-RRTEx-GWO method outperforms the others 10, out of the total of 11 algorithms. In future work, 3D path planning method proposed will be realized using machine learning methods such as reinforcement learning or game theory-based algorithms. Another possible direction of further study can be path planning for Autonomous Underwater Vehicle (AUV), connected vehicles with VANET and FANET structures.