UAV swarm air combat maneuver decision-making method based on multi-agent reinforcement learning and transferring

Zheng, Zhiqiang; Wei, Chen; Duan, Haibin

doi:10.1007/s11432-023-4088-2

Research Paper
Published: 24 July 2024

Volume 67, article number 180204 (2024)

Science China Information Sciences

Abstract

In short-range air combat between unmanned aerial vehicle (UAV) swarms, each UAV must make accurate maneuver decisions using information about both enemy and friendly UAVs. This dual requirement of competition and cooperation is a significant challenge in unmanned air combat. This paper proposes a method based on multi-agent reinforcement learning (MARL) to address it. An actor network is designed with three subnetworks, each handling a different type of situational information, so that results from simpler one-on-one scenarios can be leveraged to improve training in the more complex swarm air combat setting. Separate state spaces for local and global information are designed for the actor and critic networks. A detailed reward function is proposed to encourage participation. To prevent lazy participants in air combat, a reward assignment operation is applied to distribute these dense rewards. Simulation tests and ablation experiments show that both the transfer operation and the reward assignment operation are effective in the swarm air combat scenario, supporting the effectiveness of the proposed method.
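The abstract does not state how the reward assignment operation distributes rewards among swarm members. As a purely illustrative sketch, and not the authors' rule, the following Python function assumes a shared team reward is split in proportion to hypothetical non-negative per-agent contribution scores, so that an agent contributing nothing receives nothing. The function name `assign_team_reward`, the contribution scores, and the equal-split fallback are all assumptions introduced here.

```python
from typing import List, Sequence


def assign_team_reward(team_reward: float,
                       contributions: Sequence[float]) -> List[float]:
    """Split a shared reward in proportion to contribution scores.

    Illustrative assumption only: the source does not specify this rule.
    Contribution scores must be non-negative. If every score is zero,
    the reward is split equally so the total is still conserved.
    """
    n = len(contributions)
    if n == 0:
        return []
    if any(c < 0 for c in contributions):
        raise ValueError("contribution scores must be non-negative")
    total = sum(contributions)
    if total == 0:
        # No agent contributed: fall back to an equal split.
        return [team_reward / n] * n
    # Proportional split: an idle agent (score 0) earns nothing.
    return [team_reward * c / total for c in contributions]


if __name__ == "__main__":
    # Three agents; the third did not take part in the engagement.
    print(assign_team_reward(10.0, [3.0, 1.0, 0.0]))
```

Under this assumed rule the individual rewards always sum to the team reward, which keeps the overall learning signal unchanged while removing the incentive to free-ride on teammates.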



Acknowledgements

This work was supported by the National Key R&D Program of China (Grant No. 2023YFC3011001) and the National Natural Science Foundation of China (Grant Nos. U20B2071, 62350048, T2121003).

Author information

Authors and Affiliations

State Key Laboratory of Virtual Reality Technology and Systems, School of Automation Science and Electrical Engineering, Beihang University, Beijing, 100083, China
Zhiqiang Zheng, Chen Wei & Haibin Duan

Corresponding author

Correspondence to Haibin Duan.

About this article

Cite this article

Zheng, Z., Wei, C. & Duan, H. UAV swarm air combat maneuver decision-making method based on multi-agent reinforcement learning and transferring. Sci. China Inf. Sci. 67, 180204 (2024). https://doi.org/10.1007/s11432-023-4088-2

Received: 28 September 2023
Revised: 19 March 2024
Accepted: 17 April 2024
Published: 24 July 2024
DOI: https://doi.org/10.1007/s11432-023-4088-2
