1 Introduction

UAV-based networks combine powerful communication capabilities with significant energy consumption and short flight times. As a result, building routing models for these networks requires an efficient examination of routing paths, temporal node performance, network parameters, and other multi-modal and contextual restrictions. To create such models, researchers must evaluate a wide range of real-time parameters such as collision awareness, flying height, turning angles, and threat avoidance. Figure 1 lists such parameters, with metrics for collision avoidance, self-constraints, and external dynamics separated to identify the most useful metrics that must be optimized in large-scale routing (path planning) scenarios (Lee and Friderikos 2022; Jinqiang et al. 2021; Cheng et al. 2022; Li et al. 2022). Based on this comparison, the most important metrics for indigenous UAV networks are expected time of arrival (ETA), separation maintenance, fuel capacity, UAV slope, turning angle, and relative height.

Fig. 1 Parameters affecting the UAV routing process

These parameters are combined to form an objective function \({f}_{p}\), which can be evaluated as per Eq. (1):

$${f}_{p}=\frac{{f}_{1}\left(ETA, RH, F, S, {T}_{a}\right)}{{f}_{2}(Sep, Obs)}$$
(1)

where ETA, RH, F, S, and Ta represent the expected time of arrival, relative height, fuel requirements, slope, and turning angle, all of which must be minimized, while Sep and Obs represent separation distance and obstacle avoidance probability, which must be maximized during the optimization process (Liu et al. 2021; Qadir et al. 2022). The functions f1 and f2 are chosen according to the context of the network and are used for continuous planning operations. A survey of models that perform these operations (Shao et al. 2022; Airlangga and Liu 2022; Bono Rossello et al. 2022; Sanchez-Fernandez et al. 2022; Dong et al. 2022) is discussed. This survey shows that existing path planning models with collision awareness are either highly complex or have lower efficiency, which limits their deployment abilities. Moreover, these models usually do not consider energy constraints and are applied to static targets. The objective of this research work is to improve the overall performance of the network by considering characteristics such as collision avoidance, target detection, and minimization of collisions. Accurate object detection increases the throughput of the network in the context of energy saving for UAVs, especially in real-time adversarial environments.
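As an illustration of Eq. (1), the following minimal sketch evaluates \({f}_{p}\) for one candidate path. The text leaves f1 and f2 context-dependent, so the forms used here (an unweighted sum for f1 and a product for f2) are assumptions chosen only to make the ratio concrete, and all inputs are assumed to be normalized to comparable scales.

```python
def objective(eta, rh, fuel, slope, turn_angle, sep, obs):
    """Evaluate f_p = f1(ETA, RH, F, S, Ta) / f2(Sep, Obs) as in Eq. (1).

    ASSUMPTION: f1 is an unweighted sum of the quantities to minimize and
    f2 is the product of the quantities to maximize. The source does not
    fix either function.
    """
    f1 = eta + rh + fuel + slope + turn_angle
    f2 = sep * obs
    if f2 <= 0:
        raise ValueError("Sep and Obs must both be positive")
    return f1 / f2


# Two hypothetical paths with equal costs but different separation distances.
tight = objective(10, 2, 3, 1, 4, sep=5, obs=0.8)
wide = objective(10, 2, 3, 1, 4, sep=10, obs=0.8)
```

Under these assumed forms, a lower \({f}_{p}\) is preferable: the path with the larger separation distance scores lower for the same cost terms.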

Section 2 reviews the literature to motivate the proposed method. Section 3 presents the proposed scheme, which continuously assesses patterns in a dynamic environment. In Sect. 4, the proposed model’s performance is evaluated in a large-scale simulated network environment and compared with that of traditional routing methods. Finally, this text concludes with some context-specific and network-specific observations about the proposed model and recommends methods to further optimize its performance.

2 Literature review

A wide variety of UAV path planning models have been proposed by researchers, each with its own characteristics. For instance, work in Wang et al. (2022), Pan et al. (2022), Zhou et al. (2021), and Xu et al. (2021a) proposes the use of the Dueling double deep Q-network (D3QN), improved artificial potential function (IAPF), artificial bee colony with bat algorithm (ABCBA), and constrained multiobjective optimization problem (CMOP) optimization to estimate efficient paths for different network scenarios. These paths are optimized via the use of high-density route information sets and are validated on networks of different scales. However, these models have higher complexity, which limits their applicability and usability in real-time scenarios. To overcome these issues, work in Feng et al. (2021), Cao et al. (2022), Duan et al. (2021a), Jensen-Nau et al. (2021), and Shiri et al. (2022) proposes the use of artificial potential fields, Concentrated Coverage Path Planning Models, Improved Intelligent Water Drops (IIWD) Models, and Voronoi-based path generation (VPG) Models, which assist in the integration of low complexity operations during path estimation under adversarial network scenarios. However, the efficiency and trust levels of these models are low, which can be improved via the use of Iterative Single Head Attention (ISHA) (Chen et al. 2022), Adaptive Clustering (Niu et al. 2022), Rapidly Exploring Deep Tree (RDT) (Guo et al. 2021), Convolutional Neural Networks (CNNs) (Chang et al. 2022), Geometric Distance with Reinforcement Learning (GDRL) (Shen et al. 2022), Detach and Steer (Zhang et al. 2021), and Improved Adaptive Grey Wolf Optimization (IA GWO) (Zhang and Zhang 2021), which assist in improving path planning performance with low complexity and high scalability levels. However, these models do not incorporate trust levels while estimating path plans, which limits their performance capabilities.

Models discussed in Liu et al. (2022), Wu et al. (2021), Bruggemann (2022), and Pan et al. (2021) further propose the use of Graph Theory, Tangent Intersection with Target Guidance Strategy, Estimation of Distribution Algorithm (EDA) with the Genetic Algorithm (GA), and feature driven flight planning that incorporates delay, energy levels, and path reusability metrics for better planning performance under different use cases. However, these models do not incorporate trust metrics, which limits their scalability levels. To overcome this issue, work in Xu et al. (2021b), Roberge and Tarbouchi (2021), Akbar et al. (2022), and Xie et al. (2021) proposes the use of Deep Learning Trained by Genetic Algorithm (DL-GA), Mixed-Strategy Gravitational Search Algorithm (MSGSA), multiple point-of-interest (MPoI) based path planning, and deep reinforcement learning, which assist in the integration of high-density parameter sets for incorporation of trust levels during routing operations. Work in Duan et al. (2021b), Shao et al. (2021), Huang et al. (2021), Zhao et al. (2021), Cui and Wang (2021) further extends this concept via integration of Dynamic Discrete Pigeon-Inspired Optimization, improved particle swarm optimization (PSO) with Gauss pseudo-spectral method (GPM), Stochastic Time-Dependent Optimizations, Decentralized Learning Optimizations, and Multi-Layer Reinforcement Learning Techniques, that assist in continuous optimizations of the model under real-time use cases. Similar concepts are presented in Huang et al. (2022), Wu et al. (2022), Du et al. (2022), Peng et al. (2022), Vashisth et al. (2021), and Hsu and Gau (2022), which propose the use of multiobjective UAV trajectory planning, dynamic programming, Iterative Chance-Constrained Optimization, constrained decomposition-based multi-objective evolution algorithm, and deep reinforcement learning, which assists in improving path planning operations under different scenarios. 
However, these models were found to be either more complex or less efficient, which limits their deployment abilities. Moreover, these models usually do not consider energy constraints and are applied to static targets. The proposed model was assessed in different scenarios and compared with existing path planning techniques to validate it under real conditions.

3 Proposed hybrid bioinspired model

Based on the survey of various existing collision-aware routing models, it has been found that these models are either very complex or have lower efficiency, which limits their deployment abilities. Moreover, these models usually do not consider energy constraints and are applied to static targets. To overcome these limitations, this section proposes the design of a novel hybrid bioinspired model for dynamic routing. As shown in Fig. 2, the proposed model starts by collecting node-level and network-level parametric sets that include Cartesian location, residual energy levels, temporal routing performance, and temporal collision performance levels. The model then deploys a Grey Wolf Optimization (GWO) based routing process to identify optimal routes between two anchor points. These routes are further tuned via a Firefly based Optimization (FFO), which assists in estimating high-trust routes based on their temporal performance via continuous data update operations. The selected route sets are further scrutinized via a continuous learning framework (CLF), which assists in the identification of dynamic moving targets and uses this information for incremental route updates. The temporal information collected about different node and network configurations is first used to form initial routes via the GWO based model, which works via the following process:

  • To initialize the optimizer, set the following GWO constants,

    • Total Wolf configurations to be generated for optimization (\({N}_{w}\))

    • Total iterations for which these Wolves will be evaluated (\({N}_{i}\))

    • A constant rate of learning for these Wolves (\({L}_{w}\))

    • Current node locations and qualitative parameters

    Fig. 2 Overall flow of the routing process with collision aware operations

  • While performing GWO based routing, a set of nodes consisting of source (\(src\)) & destination (\(dest\)) nodes are selected, which will assist in the identification of optimal traveling paths between these nodes

  • For each pair of source & destination nodes, evaluate the reference distance \({d}_{ref}\) via Eq. (2),

    $${d}_{ref}=\sqrt{\begin{array}{c}{\left({x}_{src}-{x}_{dest}\right)}^{2}+{\left({y}_{src}-{y}_{dest}\right)}^{2}\\ +{\left({z}_{src}-{z}_{dest}\right)}^{2}\end{array}}$$
    (2)

    where, \(x, \,y, \, \text{and} \, z\) represent the Cartesian locations of the nodes.

  • Based on this reference distance generate \({N}_{w}\) Wolf configurations as per the following process,

    • Identify all other UAV nodes that are in the route of the current source–destination pair, by checking all nodes that satisfy Eq. (3),

      $${d}_{src,i}<{d}_{ref} \quad \& \quad {d}_{i,dest}<{d}_{ref}$$
      (3)

      where \(i\) represents respective node numbers.

    • Out of these nodes, identify \(N\) routing nodes via Eq. (4),

      $$N=STOCH\left({L}_{w}*{N}_{n}, {N}_{n}\right)$$
      (4)

      where \({N}_{n}\) represents the number of nodes present in the network scenario, and \(STOCH\) represents a stochastic Markovian process that is used for the generation of different number sets.

    • For each of these node sets, identify their \(\varnothing \, \& \, \theta\) values via Eqs. (5) and (6) as follows,

      $$\theta ={\mathrm{tan}}^{-1}\left(\frac{\sqrt{{x}^{2}+{y}^{2}}}{z}\right)$$
      (5)
      $$\varnothing ={\mathrm{tan}}^{-1}\left(\frac{y}{x}\right)$$
      (6)
    • Now, rotate the angles by stochastic shifts of \({\varnothing }^{\prime}\, \& \, {\theta }^{\prime}\) via Eqs. (7) and (8),

      $${\varnothing }^{\prime}=\varnothing +STOCH\left(-\frac{\pi }{2}*{L}_{r},\frac{\pi }{2}*{L}_{r}\right)$$
      (7)
      $${\theta }^{\prime}=\theta +STOCH\left(-\pi *{L}_{r},\pi *{L}_{r}\right)$$
      (8)

      where, \({L}_{r}\) is the learning rate, which is initially set up as \({L}_{r}=1\), and then modified via the optimization process.

    • Add source to the initial location, and destination to the final location, and also add these updated co-ordinates to the route lists. Based on these updated co-ordinates estimate the final route distance as per Eq. (9),

      $$d=\sum_{i=2}^{N}{d}_{i-1, i}$$
      (9)
    • As per the distance metrics, estimate Wolf fitness via Eq. (10),

      $${f}_{w}=d*\sum_{i=1}^{N-1}{E}_{i}$$
      (10)

      where \(E\) represents the energy needed to move from a given location to the next location under real-time conditions.

    • This process is repeated for all Wolves, and \({N}_{w}\) Wolf configurations are generated, each of which represents a different routing path.

  • Estimate Wolf fitness threshold via Eq. (11),

    $${f}_{th}=\sum_{i=1}^{{N}_{w}}{f}_{w}*\frac{{L}_{w}}{{N}_{w}}$$
    (11)
  • Based on this threshold, mark the Wolves as follows,

    • $$\text{A\,Wolf\,is\,marked\,as}\, \text{`Delta'},\, \mathrm{ when }\,{f}_{w}>{f}_{th}$$
      (12)
      $$\mathrm{Else},\mathrm{ Wolf\, is\,marked\,as}\, \text{`} \text{Alpha'},\, \mathrm{ when }\,{f}_{w}<{f}_{th}*\frac{{L}_{w}}{2}$$
      (13)
      $$\mathrm{Else},\mathrm{ Wolf\, is\,marked\,as}\, \text{`Beta'},\, \mathrm{ when }\,{f}_{w}<{f}_{th}*{L}_{w}$$
      (14)
    • Otherwise, Wolf is marked as ‘Gamma’ for further optimizations

  • Once all Wolves are marked, scan each of them and modify their internal configurations for \({N}_{i}\) iterations as per the following process,

    • Regenerate all ‘Delta’ Wolves, via Eqs. (4)–(9)

    • For ‘Beta’ & ‘Gamma’ Wolves, modify \({L}_{r}\) via Eq. (15),

      $${L}_{r}={L}_{r}\left(1\pm \frac{1}{STOCH\left(\frac{{N}_{w}}{2}, {N}_{w}\right)}\right)$$
      (15)
    • Use this new \({L}_{r}\) to generate their new configurations.

    • After each iteration, identify the fitness threshold, and recheck the fitness levels for each of the Wolves, which will assist in the identification of optimal routing paths.
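The GWO steps above can be sketched in a few helper functions. This is a minimal illustration rather than the authors' implementation: \(STOCH\) is described only as a stochastic Markovian process, so a uniform draw is assumed in its place, `atan2` is swapped in for \({\mathrm{tan}}^{-1}\) to keep the angles quadrant-safe, and the function names are hypothetical.

```python
import math
import random


def candidate_nodes(src, dest, nodes):
    """Eq. (3): keep nodes closer than d_ref (Eq. 2) to both endpoints."""
    d_ref = math.dist(src, dest)
    return [n for n in nodes
            if math.dist(src, n) < d_ref and math.dist(n, dest) < d_ref]


def spherical_angles(x, y, z):
    """Eqs. (5)-(6), using atan2 instead of tan^-1 (a deliberate swap)."""
    theta = math.atan2(math.hypot(x, y), z)
    phi = math.atan2(y, x)
    return phi, theta


def rotate(phi, theta, l_r, rng=random):
    """Eqs. (7)-(8). ASSUMPTION: STOCH(a, b) is a uniform draw on [a, b]."""
    phi_new = phi + rng.uniform(-math.pi / 2 * l_r, math.pi / 2 * l_r)
    theta_new = theta + rng.uniform(-math.pi * l_r, math.pi * l_r)
    return phi_new, theta_new


def wolf_fitness(route, hop_energy):
    """Eq. (10): route distance from Eq. (9) times the summed hop energy."""
    d = sum(math.dist(route[i - 1], route[i]) for i in range(1, len(route)))
    return d * sum(hop_energy)


def fitness_threshold(fitness, l_w):
    """Eq. (11): sum of all Wolf fitness values scaled by L_w / N_w."""
    return sum(fitness) * l_w / len(fitness)


def mark_wolf(f_w, f_th, l_w):
    """Eqs. (12)-(14), applied in the order given in the text."""
    if f_w > f_th:
        return "Delta"
    if f_w < f_th * l_w / 2:
        return "Alpha"
    if f_w < f_th * l_w:
        return "Beta"
    return "Gamma"


# Hypothetical example: four Wolves with L_w = 0.5.
fitness = [10, 20, 30, 40]
f_th = fitness_threshold(fitness, 0.5)
labels = [mark_wolf(f, f_th, 0.5) for f in (20, 3, 5, 10)]
```

In a full loop, 'Delta' Wolves would be regenerated and 'Beta' and 'Gamma' Wolves would be re-drawn with an updated \({L}_{r}\) for \({N}_{i}\) iterations; that loop is omitted here to keep the sketch short.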

Once all iterations are completed, select ‘Alpha’ Wolves as initial routing configurations, and modify these configurations via a Firefly based optimization process. This process iterates over all the ‘Alpha’ solutions and identifies high-trust paths, which will assist in achieving better QoS levels. It has the following steps:

  • To initialize the optimizations, set the following FF constants,

    • The total number of fireflies used for optimization (\({N}_{ff}\))

    • The total number of iterations used during the optimization process (\({N}_{i}\))

    • The rate at which the fireflies will learn from each other (\({L}_{ff}\))

    • Temporal routing parameters on each path including throughput, collisions, and link quality on the given path sets.

  • Scan all ‘Alpha’ Wolves for \({N}_{i}\) iterations, as per the following process,

    • Generate current path brightness via Eq. (16),

      $${p}_{b}=\sum_{i=2}^{{N}_{h}}{d}_{i-1,i}*\left[\frac{TH{R}_{i-1}}{Max\left(THR\right)}\right]*N{C}_{i,i+1}*\frac{1}{L{Q}_{i,i+1}}$$
      (16)

      where \({N}_{h}\) represents the number of hops decided by the GWO process, while \(THR, NC \& LQ\) represent the throughput of nodes on the given path, which is evaluated via Eq. (17), the number of temporal collisions on the path, which is evaluated via Eq. (18), and the temporal link quality of the given paths, which is evaluated via Eq. (19), as follows,

      $$THR=\sum_{t=t1}^{t2}\frac{NN\left(t\right)}{Max\left(NN\right)*(t2-t1)}$$
      (17)

      where \(NN(t)\) represents the number of nodes that have used this path in the time interval between t1 and t2, which is recorded by the router nodes.

      $$NC=\sum_{t=t1}^{t2}\frac{V{C}_{t}}{\sum VC}$$
      (18)

      where \(V{C}_{t}\) represents the number of vehicles that collided during the given time intervals.

      $$LQ=\sum_{t=t1}^{t2}\frac{1}{V{F}_{t}}$$
      (19)

      where \(V{F}_{t}\) represents the total number of vehicles that became faulty after using the given path between the given time intervals.

    • Now, incrementally modify values of \(\varnothing\, \& \,\theta\) via Eq. (20),

      $$\left(\varnothing ,\theta \right)={\left(\varnothing ,\theta \right)}_{old}\pm \frac{\pi *STOCH\left(\frac{1}{{L}_{ff}}, {L}_{ff}\right)}{{L}_{ff}+1}$$
      (20)
    • Use these new values to estimate new paths, and estimate their path brightness levels via Eq. (16), and based on this new level, accept this path if \({p}_{b}\left(New\right)<{p}_{b}(Old)\)

    • This process is continued for \({N}_{ff}\) fireflies and new configurations are generated for each of the ‘Alpha’ Wolf paths.

  • When all iterations are done, the path with max brightness levels will be selected as the final solution for routing operations.
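The brightness computation in Eqs. (16)–(19) can be sketched as follows. The data layout is an assumption made for illustration: each temporal metric is passed as a list of per-interval samples, the path is described by per-hop lists of equal length, and every \(V{F}_{t}\) is assumed to be positive so that Eq. (19) is defined.

```python
def throughput(nn, duration):
    """Eq. (17): sum of NN(t) / (Max(NN) * (t2 - t1)) over the window."""
    peak = max(nn)
    return sum(v / (peak * duration) for v in nn)


def collision_ratio(vc_window, vc_total):
    """Eq. (18): collisions in the window relative to all recorded collisions."""
    return sum(vc_window) / vc_total


def link_quality(vf):
    """Eq. (19): sum of 1 / VF_t. ASSUMPTION: every VF_t is positive."""
    return sum(1 / v for v in vf)


def path_brightness(hop_dist, thr, nc, lq):
    """Eq. (16): per-hop distance * normalized THR * NC / LQ, summed over hops."""
    peak = max(thr)
    return sum(d * (t / peak) * c / q
               for d, t, c, q in zip(hop_dist, thr, nc, lq))


def accept(p_old, p_new):
    """Acceptance rule from the text: keep the new path if p_b(New) < p_b(Old)."""
    return p_new < p_old


# Hypothetical two-hop path.
p_b = path_brightness([10, 20], [0.5, 1.0], [0.2, 0.1], [2.0, 4.0])
```

A firefly update would perturb the angles via Eq. (20), rebuild the path, recompute `path_brightness`, and call `accept` to decide whether the new configuration replaces the old one.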

The selected path is used for routing operations, and new levels of throughput, number of collisions, and link quality are updated for continuous optimization operations. These paths are stored on the database via an Incremental Learning Layer (IL), which correlates between QoS (Quality of Service) levels of the current path, and existing stored paths. This QoS level is estimated for each path as per Eq. (21),

$$Q=\sum_{i=1}^{{N}_{p}}\frac{\sum N{C}_{i}}{\sum L{Q}_{i}}$$
(21)

where \({N}_{p}\) is the number of ‘Alpha’ Wolf configurations selected by the GWO process. Based on this Q value, the reward function is estimated via Eq. (22),

$$r=\frac{Q\left(current\right)-Q(db)}{{L}_{ff}}+{L}_{r}\left(Q\left(current\right)-Max\left(Q\right)\right)$$
(22)

The current path sets are updated in the database if \(r>1\), which indicates that the current path sets have a lower number of collisions, with higher link quality, while other paths are discarded from the optimization operations. Using this process, path caches are generated, and if GWO selects similar paths, then they are directly used without the need for FFO based validation operations. Due to the use of these path caches, the speed of operation for the model is improved, while the energy needed for the routing process is reduced when compared to real-time scenarios. This performance is validated via comparison with standard routing techniques in the next section.
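The caching decision described by Eqs. (21) and (22) can be sketched as below. The input layout (one pair of \(NC\) and \(LQ\) sample lists per ‘Alpha’ configuration) and the function names are assumptions for illustration; the formulas and the \(r>1\) rule follow the text as written.

```python
def qos(paths):
    """Eq. (21): sum over Alpha configurations of sum(NC_i) / sum(LQ_i).

    `paths` is a list of (nc_values, lq_values) pairs, one per configuration.
    """
    return sum(sum(nc) / sum(lq) for nc, lq in paths)


def reward(q_current, q_db, q_max, l_ff, l_r):
    """Eq. (22): reward of the current path set against the stored ones."""
    return (q_current - q_db) / l_ff + l_r * (q_current - q_max)


def should_cache(r):
    """Path sets are written to the database only when r > 1."""
    return r > 1


# Hypothetical values for two Alpha configurations and two reward checks.
q = qos([([1, 1], [2, 2]), ([3], [6])])
r_keep = reward(4.0, 1.0, 4.0, l_ff=2.0, l_r=0.5)
r_drop = reward(2.0, 1.0, 4.0, l_ff=2.0, l_r=0.5)
```

A cache lookup keyed on the source and destination pair could then return a stored path directly and skip the FFO validation step, as the paragraph above describes.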

4 Statistical analysis

The proposed BPAC2R Model initially uses GWO to estimate low congestion routes, which are re-evaluated via FFO using temporal node & network parameter sets. The selected paths are cached and later used for continuous optimizations via an incremental learning process. Due to these optimizations, the model is expected to show lower energy consumption, lower routing delay, and a minimized number of collisions. The model was tested on standard UAV configurations, which were taken from the NTNU Open Research Dataset (available at https://dataverse.no/dataset.xhtml?persistentId=doi:10.18710/L41IGQ). When executing the proposed model in a simulation environment, the initial values play an important role in testing UAVs under different conditions. The UAV configurations were tested on the network configuration parameter sets listed in Table 1, which summarizes the fundamental considerations used in UAV routing tests for collision avoidance.

Table 1 UAV configuration used during routing operations

As per these configuration parameters, a large number of movements (NMs) were performed for the UAV network, and these movements were varied between 250 and 5000 to estimate the true value of different parameter sets. For each of these movements, routing delay (D) was estimated via Eq. (23) as follows,

$$D=\frac{1}{NM}\sum_{i=1}^{NM}\left({ts}_{reach}-{ts}_{start}\right)$$
(23)

where \({ts}_{reach}\, \&\, {ts}_{start}\) represent the timestamps at which the nodes reach the destination location and start from the source locations. The delay performance was compared with IIWD (Jensen-Nau et al. 2021), IA GWO (Zhang and Zhang 2021), and MS GSA (Roberge and Tarbouchi 2021) in Table 2.

Table 2 Delay needed for routing UAVs between multiple locations

As per this evaluation, it has been found that the proposed model was 23.5% faster than IIWD (Jensen-Nau et al. 2021), 34.2% faster than IA GWO (Zhang and Zhang 2021), and 38.5% faster than MS GSA (Roberge and Tarbouchi 2021) in real conditions. This is possible due to the dual filtering of routes via GWO and FFO Models, which assists in the identification of low delay route sets. Due to this, the model is capable of deployment for high-speed routing use cases. Similar performance was estimated for energy consumption via Eq. (24), and tabulated in Table 3 as follows,

Table 3 Energy needed for routing UAVs between different locations

$$E=\frac{1}{NM}\sum_{i=1}^{NM}\left({E}_{src}{\left(start\right)}_{i}-{E}_{src}{\left(complete\right)}_{i}\right)$$
(24)

where \({E}_{src}\left(start\right)\, \& \,{E}_{src}(complete)\) represent the energy levels of the source node at the start and at the completion of each movement.
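Eqs. (23) and (24) are both per-movement averages, so they can be computed with the same pattern. The sketch below assumes that timestamps and energy levels are logged as parallel lists with one entry per movement; the function names and sample values are hypothetical.

```python
def mean_delay(ts_start, ts_reach):
    """Eq. (23): average of (ts_reach - ts_start) over all NM movements."""
    nm = len(ts_start)
    return sum(r - s for s, r in zip(ts_start, ts_reach)) / nm


def mean_energy(e_start, e_complete):
    """Eq. (24): average source-node energy drop over all NM movements."""
    nm = len(e_start)
    return sum(s - c for s, c in zip(e_start, e_complete)) / nm


# Hypothetical log of two movements (timestamps in ms, energy in mJ).
d = mean_delay([0, 10], [4, 16])
e = mean_energy([100, 90], [80, 75])
```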

Table 3 shows that the proposed model achieved 16.5% lower energy consumption than IIWD (Jensen-Nau et al. 2021), 24.3% lower energy consumption than IA GWO (Zhang and Zhang 2021), and 14.2% lower energy consumption than MS GSA (Roberge and Tarbouchi 2021), which makes the model useful for low energy & high lifetime scenarios. This is possible due to the inclusion of residual energy levels during the formation of routes via the FFO process. Similar performance for the number of average collisions (NAC) can be seen in Table 4.

Table 4 Total number of collisions for routing UAVs between different locations

According to this assessment, the proposed model achieved 10.4% fewer collisions than IIWD (Jensen-Nau et al. 2021), 10.5% fewer collisions than IA GWO (Zhang and Zhang 2021), and 18.3% fewer collisions than MS GSA (Roberge and Tarbouchi 2021) in real-world scenarios. This is possible due to the initial filtering of routes via GWO, followed by trust-based routing via FFO Models, which assists in the identification of low delay and low congestion route sets. Due to this, the model is capable of deployment for low-collision routing use cases. Similarly, the throughput performance in terms of vehicles crossing on routes per minute (vpm) can be seen in Table 5.

Table 5 Total throughput for routing UAVs between different locations

According to the evaluation, the proposed model has better path reusability due to its path caching mechanisms, which assist in improving its throughput levels. The proposed model achieved 15.4% better throughput than IIWD (Jensen-Nau et al. 2021), 25.3% higher throughput than IA GWO (Zhang and Zhang 2021), and 18.5% better throughput than MS GSA (Roberge and Tarbouchi 2021; Vashisth et al. 2023), which makes it well suited for a wide variety of path reusability scenarios. This is possible due to the inclusion of throughput during path optimizations. Due to these operations, the proposed model was observed to be better than standard path planning models and can be used for low energy, high speed, low congestion, and high throughput use cases.

5 Future scope and conclusion

The proposed BPAC2R Model first estimates low congestion routes with GWO; these estimates are then updated with FFO by using temporal node & network parameter sets. The chosen paths are cached and subsequently used for ongoing optimizations through incremental learning. The model is expected to demonstrate lower energy consumption, lower routing delay, and a minimum number of collisions as a result of these optimizations. Under real-time scenarios, the proposed model was found to be 23.5% faster than IIWD (Jensen-Nau et al. 2021), 34.2% faster than IA GWO (Zhang and Zhang 2021), and 38.5% faster than MS GSA (Roberge and Tarbouchi 2021). This is made possible by dual route filtering using GWO and FFO models, which aids in the identification of low delay route sets. As a result, the model can be used for high-speed routing use cases. The suggested model also showed 16.5%, 24.3%, and 14.2% lower energy consumption than IIWD (Jensen-Nau et al. 2021), IA GWO (Zhang and Zhang 2021; Vashisth and Batth 2020), and MS GSA (Roberge and Tarbouchi 2021), respectively, which makes the model useful for low energy and high lifetime scenarios. This is possible because residual energy levels are taken into account when routes are formed using the FFO process. Additionally, in real-world problems, the proposed model produced collision rates that were 10.4%, 10.5%, and 18.3% lower than those of IIWD (Jensen-Nau et al. 2021), IA GWO (Zhang and Zhang 2021), and MS GSA (Roberge and Tarbouchi 2021), respectively. This is made possible by first filtering routes using GWO, followed by trust-based routing using FFO Models, which helps identify low delay and low congestion route sets. The model can therefore be used for low-collision routing use cases. The proposed model also has better path reusability thanks to its path caching mechanisms, which help to increase its throughput levels. It demonstrated throughput improvements of 15.4%, 25.3%, and 18.5% over IIWD (Jensen-Nau et al. 2021), IA GWO (Zhang and Zhang 2021), and MS GSA (Roberge and Tarbouchi 2021), respectively, making it well suited for a variety of path reusability scenarios. This is possible because path optimizations take throughput into account. The proposed model was found to be superior to traditional path planning models as a result of these operations, and it can be applied to use cases involving low energy consumption, high speed, minimal congestion, and high throughput. Future performance testing of the proposed model on large-scale networks is necessary, and it can be enhanced by incorporating simple bio-inspired techniques. Through the use of deep learning auto encoder techniques, Gated Recurrent Units (GRUs), and Q-learning operations, the model’s performance can also be enhanced, making it more effective in a variety of real-time scenarios.

Routing involves planning and determining the path that the UAV's wheels have to follow on the ground, especially during take-off, landing, and other ground operations. For effective obstacle avoidance and efficient ground operation, routing becomes a major parameter of UAV performance. Embedding routing in a UAV network includes implementing protocols and algorithms that allow effective data exchange and control between multiple UAVs. Positioning, localization, and neighborhood discovery for UAVs have been carried out to simulate various scenarios. Embedding routing in UAVs is a multidisciplinary approach that involves software development, robotics, and communication engineering. An optimal routing algorithm therefore helps UAVs achieve better throughput.