Author's note: the full thesis (about 80 pages), which applies deep reinforcement learning to UAV path planning, is described at https://zyunfei.blog.csdn.net/article/details/117911809?ydreferer=aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl80MzE0NTk0MS9jYXRlZ29yeV8xMDYxMzQzMC5odG1sP3NwbT0xMDAxLjIwMTQuMzAwMS41NDgy and the open-source code is available at https://github.com/ZYunfeii/UAV_Obstacle_Avoiding_DRL
Northeastern University Undergraduate Graduation Project (Thesis)

Abstract
This thesis studies UAV route-planning methods based on deep reinforcement learning. For static obstacle environments, inspired by the artificial potential field (APF) algorithm, a dynamic adaptive potential field (DAPF) algorithm based on multi-agent reinforcement learning is proposed. Its core idea is to dynamically change the potential field so as to guide the UAV toward a higher-quality route. The algorithm consists of two parts: the top layer is the planning layer, responsible for outputting the position of the UAV's next waypoint; the bottom layer is the learning and decision-making layer, composed of the reinforcement learning model and responsible for supplying policies to the top layer. In DAPF, each obstacle is treated as an agent, and the agents improve the route through learning and cooperation. Since multi-agent reinforcement learning is generally implemented in three ways (centralized training with centralized execution, distributed training with distributed execution, and centralized training with distributed execution), this thesis studies the planning performance of DAPF under each form of multi-agent algorithm in static obstacle environments and analyzes their respective application scenarios. In addition, the algorithm is compared experimentally with traditional path-planning algorithms such as RRT, A*, and the ant colony algorithm in the same environments; the results show that it has significant advantages in maximum route tortuosity, global route tortuosity, route length, and other metrics. Moreover, as an improvement on APF, DAPF also resolves the local-minimum problem that is common to potential field methods.

For dynamic obstacle environments, inspired by the way flowing water avoids rocks in nature, this thesis proposes an adaptive interfered fluid dynamical system (AIFDS) algorithm. Its core is to adaptively change the direction and magnitude of the flow field in space and thereby change the UAV's planned route. In this algorithm, the UAV is treated as an agent; through interaction with the environment, it gradually learns how to adjust the flow field so as to plan, in advance, a route that is safe, short, and quick to execute. AIFDS can be combined with almost any reinforcement learning algorithm that has a continuous action space. This thesis studies its combinations with the maximum entropy reinforcement learning algorithm (SAC), the deep deterministic policy gradient algorithm (DDPG), the proximal policy optimization algorithm (PPO), and the twin delayed deep deterministic policy gradient algorithm (TD3), with experiments in environments containing multiple dynamic obstacles. The results show that AIFDS performs very well in route safety and related metrics. A comparison with the original IFDS algorithm also shows that AIFDS improves substantially on every route evaluation metric.

To address the heavy computation and slow training of deep reinforcement learning, this thesis presents a multi-process acceleration framework in which subprocesses collect experience in parallel and feed it to the main process for network updates. Experiments show that the multi-process technique roughly doubles efficiency and greatly shortens training time.

Keywords: deep reinforcement learning; route planning; artificial potential field algorithm; interfered fluid dynamical system algorithm; multi-process training
ABSTRACT
This paper studies UAV route planning based on deep reinforcement learning. For static obstacle environments, inspired by the artificial potential field (APF) algorithm, this paper proposes the dynamic adaptive potential field (DAPF) algorithm based on multi-agent reinforcement learning. The core of this algorithm is to change the potential field dynamically to guide the UAV to plan a better route. DAPF consists of two parts: the top layer is the planning layer, which outputs the position of the UAV's next waypoint; the bottom layer is the learning and decision-making layer, which consists of the reinforcement learning model and provides policies for the top layer. In DAPF, each obstacle is regarded as an agent, and the agents improve the route through learning and cooperation. As multi-agent reinforcement learning is generally implemented in three ways, namely centralized training with centralized execution, distributed training with distributed execution, and centralized training with distributed execution, this paper studies the planning performance of DAPF under each form of multi-agent algorithm in static obstacle environments and analyzes their respective application scenarios. In addition, DAPF is compared with traditional path-planning algorithms such as RRT, A*, and the ant colony algorithm in the same environments. The results show that DAPF has clear advantages in maximum route tortuosity, global route tortuosity, and route length. At the same time, as an improvement on the artificial potential field algorithm, DAPF also resolves the local-minimum problem common to potential field methods.
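As background for DAPF, the classic APF update that it builds on can be sketched as follows. This is a minimal illustration of the standard attractive/repulsive formulation, not the thesis's DAPF itself; the gains `k_att` and `k_rep`, the influence radius `d0`, the step size, and the toy scenario are all illustrative assumptions.

```python
import numpy as np

def apf_step(pos, goal, obstacles, k_att=1.0, k_rep=100.0, d0=2.0, step=0.1):
    """One waypoint update under a classic artificial potential field.

    pos, goal: (3,) arrays; obstacles: iterable of (3,) obstacle centers.
    """
    force = k_att * (goal - pos)              # attraction pulls toward the goal
    for obs in obstacles:
        diff = pos - obs
        d = np.linalg.norm(diff)
        if 0.0 < d < d0:                      # repulsion acts only within radius d0
            force += k_rep * (1.0 / d - 1.0 / d0) * (1.0 / d ** 2) * (diff / d)
    norm = np.linalg.norm(force)
    return pos if norm == 0.0 else pos + step * force / norm

# Toy scenario: fly from the origin toward (10, 0, 0) past one obstacle.
p = np.array([0.0, 0.0, 0.0])
goal = np.array([10.0, 0.0, 0.0])
obstacles = [np.array([5.0, 0.5, 0.0])]
for _ in range(400):
    p = apf_step(p, goal, obstacles)
print(np.linalg.norm(goal - p))               # remaining distance to the goal
```

When the goal, the UAV, and an obstacle are nearly collinear, the attractive and repulsive forces can cancel and the UAV stalls: this is exactly the local-minimum failure mode that DAPF addresses by letting the obstacle agents reshape the field.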
For dynamic obstacle environments, inspired by the way flowing water avoids rocks in nature, this paper proposes the adaptive interfered fluid dynamical system (AIFDS) algorithm. The core of AIFDS is to adaptively change the direction and magnitude of the flow field in space, thereby changing the UAV's planned route. In AIFDS, the UAV is regarded as an agent; through interaction with the environment, it gradually learns how to adjust the flow field so as to plan, in advance, a route that is safe, short, and quick to execute. AIFDS can be combined with almost any reinforcement learning algorithm with a continuous action space. This paper studies its combination with the SAC, DDPG, PPO, and TD3 algorithms. Experiments are carried out in environments with multiple dynamic obstacles, and the
results show that AIFDS performs very well in terms of route safety. At the same time, compared with the original IFDS, the evaluation metrics of the planned routes improve substantially.
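The flow-field idea behind IFDS can be illustrated with a simplified sketch: an initial flow converging on the goal is deflected by a modulation matrix that removes the velocity component pointing into each spherical obstacle, with the deflection fading with distance. The function below is a deliberately simplified assumption (unit-strength flow, spherical obstacles, per-obstacle matrices composed by product, a reaction parameter `rho`), not the thesis's exact formulation; in AIFDS the reinforcement learning agent would tune such parameters online.

```python
import numpy as np

def ifds_velocity(pos, goal, obstacles, v0=1.0, rho=1.0):
    """Simplified interfered-fluid velocity at `pos` (a sketch, not the thesis's formulas).

    obstacles: iterable of (center, radius) pairs for spherical obstacles.
    Assumes pos coincides with neither the goal nor an obstacle center.
    """
    to_goal = goal - pos
    u = v0 * to_goal / np.linalg.norm(to_goal)        # initial flow converging on the goal
    M = np.eye(3)
    for center, radius in obstacles:
        diff = pos - center
        gamma = (np.linalg.norm(diff) / radius) ** 2  # shape function: 1 on the surface, >1 outside
        n = diff / np.linalg.norm(diff)               # outward normal of the obstacle
        # Remove the inward velocity component; the correction decays as gamma grows.
        M = M @ (np.eye(3) - np.outer(n, n) / gamma ** (1.0 / rho))
    return M @ u

# On the obstacle surface the flow is purely tangential; far away it is almost undisturbed.
v = ifds_velocity(np.array([2.0, 1.0, 0.0]), np.array([5.0, 0.0, 0.0]),
                  [(np.array([2.0, 0.0, 0.0]), 1.0)])
```

Integrating the position along this velocity field yields a route that bends smoothly around obstacles; making coefficients such as `rho` state-dependent outputs of an RL policy is the adaptive step that AIFDS adds.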
To address the intensive computation and slow training of deep reinforcement learning, this paper proposes a multi-process acceleration framework. The framework uses subprocesses to collect experience in parallel and feed it to the main process for network updates. The experimental results show that multi-process training roughly doubles efficiency and greatly reduces training time.
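The parallel experience-collection pattern described above can be sketched with Python's standard multiprocessing module. The toy `rollout_worker` below stands in for a real environment rollout; the `(worker_id, t, 1.0)` tuples are placeholder transitions, and where this sketch merely appends to a buffer, the actual framework would run a network update in the main process.

```python
import multiprocessing as mp

def rollout_worker(worker_id, queue, n_steps):
    """Child process: collect placeholder transitions and push them to the queue."""
    for t in range(n_steps):
        queue.put((worker_id, t, 1.0))   # stand-in for a (state, action, reward) transition
    queue.put(None)                      # sentinel: this worker is finished

if __name__ == "__main__":
    n_workers, n_steps = 4, 50
    queue = mp.Queue()
    workers = [mp.Process(target=rollout_worker, args=(i, queue, n_steps))
               for i in range(n_workers)]
    for w in workers:
        w.start()

    buffer, finished = [], 0
    while finished < n_workers:          # main process drains the queue
        item = queue.get()
        if item is None:
            finished += 1
        else:
            buffer.append(item)          # a real trainer would update the network here
    for w in workers:
        w.join()
    print(len(buffer))                   # prints 200: every transition arrived
```

Because each worker's items precede its own sentinel in the queue, the main loop is guaranteed to have received all transitions once every sentinel has been seen.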
Key words: Deep reinforcement learning; Route planning; Artificial potential field; Interfered fluid dynamical system; Multi-process training
CONTENTS
Abstract (in Chinese)
ABSTRACT
1. Introduction
1.1 Preface
1.2 Research Background
1.3 Research Status at Home and Abroad
1.4 Research Content and Technical Route of This Thesis
1.5 Chapter Summary
2. UAV Route-Planning Constraints and Performance Metrics
2.1 Preface
2.2 UAV Kinematic Constraints
2.3 Selection of Route Performance Metrics
2.4 Chapter Summary
3. Fundamentals of Reinforcement Learning
3.1 Preface
3.2 The Markov Decision Process
3.3 State-Value and State-Action Value Functions
3.4 The Bellman Equation and the Bellman Optimality Equation
3.5 Value Learning and Policy Learning
3.6 Chapter Summary
4. Deep Reinforcement Learning Route Planning in Static Environments
4.1 Preface
4.2 Artificial Potential Field 3D Route-Planning Algorithm
4.2.1 Definition and Computation of the Attractive and Repulsive Fields
4.2.2 The Improved Artificial Potential Field Algorithm
4.2.3 The Artificial Potential Field Algorithm with Kinematic Constraints