Citation: ZHANG H, ZHU J W, LI X P, et al. Deep reinforcement learning intelligent guidance for intercepting high maneuvering targets[J]. Journal of Beijing University of Aeronautics and Astronautics, 2025, 51(6): 2060-2069 (in Chinese). doi: 10.13700/j.bh.1001-5965.2023.0375
An adaptive proportional navigation law whose parameters are adjusted intelligently by deep reinforcement learning is presented to address the excessive miss distances and energy loss that arise when a fixed-coefficient proportional navigation law is used to intercept moving targets. First, a state space based on real-time flight states, an action space containing the lateral and vertical gains, and a reward function model integrating different states are established. In designing the reward function, a prediction-correction method is introduced to improve the accuracy of action evaluation. Second, the soft actor-critic (SAC) algorithm is employed to train the network parameters and obtain a guidance parameter decision system that accounts for both miss distance and energy loss according to the relative motion states of the interceptor and the target. Simulation results demonstrate that, compared with traditional proportional navigation guidance, the proposed guidance technique is highly adaptable and can greatly reduce energy loss while retaining low miss distances.
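The idea of letting a learned policy set the two navigation gains can be illustrated with a minimal Python sketch. It is not the authors' implementation: the gain bounds, the `[-1, 1]` action range, the function names, and the reward weights are all assumptions chosen for illustration, and the reward here is a plain weighted penalty that omits the prediction-correction step described in the abstract. The only part taken from standard guidance theory is the per-channel proportional navigation form, commanded acceleration = gain × closing speed × line-of-sight rate.

```python
# Hypothetical gain bounds; the paper's actual action limits are not given here.
N_MIN, N_MAX = 2.0, 6.0


def scale_action(raw, low=N_MIN, high=N_MAX):
    """Map a policy output assumed to lie in [-1, 1] to a gain in [low, high]."""
    clipped = max(-1.0, min(1.0, raw))
    return low + 0.5 * (clipped + 1.0) * (high - low)


def pn_command(action, closing_speed, los_rate_lateral, los_rate_vertical):
    """Proportional navigation with separate lateral and vertical gains.

    action: pair of raw policy outputs (lateral, vertical), assumed in [-1, 1].
    Each channel uses a = N * Vc * (line-of-sight rate).
    """
    n_lateral = scale_action(action[0])
    n_vertical = scale_action(action[1])
    a_lateral = n_lateral * closing_speed * los_rate_lateral
    a_vertical = n_vertical * closing_speed * los_rate_vertical
    return a_lateral, a_vertical


def terminal_reward(miss_distance, control_effort, w_miss=1.0, w_energy=0.01):
    """Illustrative reward: penalise miss distance and accumulated control effort.

    The weights are placeholders, not values from the paper.
    """
    return -(w_miss * miss_distance + w_energy * control_effort)
```

In a training loop, an SAC agent would emit `action` at each guidance step from the observed relative motion states, and `pn_command` would turn it into acceleration commands. Keeping the policy output confined to gain selection means the commanded acceleration retains the proportional navigation structure whatever the network outputs.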