Author's note: the full thesis (about 80 pages), which applies deep reinforcement learning to UAV path planning, is described at https://zyunfei.blog.csdn.net/article/details/117911809?ydreferer=aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl80MzE0NTk0MS9jYXRlZ29yeV8xMDYxMzQzMC5odG1sP3NwbT0xMDAxLjIwMTQuMzAwMS41NDgy and the open-source code is available at https://github.com/ZYunfeii/UAV_Obstacle_Avoiding_DRL
Northeastern University Undergraduate Graduation Project (Thesis)

Abstract
This thesis studies UAV route-planning methods based on deep reinforcement learning. For static obstacle environments, inspired by the artificial potential field (APF) algorithm, a dynamic adaptive potential field (DAPF) algorithm based on multi-agent reinforcement learning is proposed. Its core idea is to dynamically change the potential field so as to guide the UAV toward a higher-quality route. The algorithm consists of two parts: the top layer is the planning layer, responsible for outputting the position of the UAV's next waypoint; the bottom layer is the learning and decision-making layer, composed of the reinforcement learning model and responsible for supplying policies to the top layer. In DAPF, each obstacle is treated as an agent, and the agents improve the route through learning and cooperation. Since multi-agent reinforcement learning is generally implemented in three ways (centralized training with centralized execution, distributed training with distributed execution, and centralized training with distributed execution), this thesis studies the planning performance of DAPF under each form of multi-agent algorithm in static obstacle environments and analyzes their respective application scenarios. In addition, the algorithm is compared experimentally with traditional path-planning algorithms such as RRT, A*, and the ant colony algorithm in the same environments; the results show that it has significant advantages in maximum route tortuosity, global route tortuosity, route length, and other metrics. Moreover, as an improvement on APF, DAPF also resolves the local-minimum problem that is common to potential field methods.

For dynamic obstacle environments, inspired by the way flowing water avoids rocks in nature, this thesis proposes an adaptive interfered fluid dynamical system (AIFDS) algorithm. Its core is to adaptively change the direction and magnitude of the flow field in space and thereby change the UAV's planned route. In this algorithm, the UAV is treated as an agent; through interaction with the environment, it gradually learns how to adjust the flow field so as to plan, in advance, a route that is safe, short, and quick to execute. AIFDS can be combined with almost any reinforcement learning algorithm that has a continuous action space. This thesis studies its combinations with the maximum entropy reinforcement learning algorithm (SAC), the deep deterministic policy gradient algorithm (DDPG), the proximal policy optimization algorithm (PPO), and the twin delayed deep deterministic policy gradient algorithm (TD3), with experiments in environments containing multiple dynamic obstacles. The results show that AIFDS performs very well in route safety and related metrics. A comparison with the original IFDS algorithm also shows that AIFDS improves substantially on every route evaluation metric.

To address the heavy computation and slow training of deep reinforcement learning, this thesis presents a multi-process acceleration framework in which subprocesses collect experience in parallel and feed it to the main process for network updates. Experiments show that the multi-process technique roughly doubles efficiency and greatly shortens training time.

Keywords: deep reinforcement learning; route planning; artificial potential field algorithm; interfered fluid dynamical system algorithm; multi-process training
ABSTRACT
This paper studies UAV route planning based on deep reinforcement learning. For static obstacle environments, inspired by the artificial potential field (APF) algorithm, this paper proposes the dynamic adaptive potential field (DAPF) algorithm based on multi-agent reinforcement learning. The core of this algorithm is to change the potential field dynamically to guide the UAV to plan a better route. DAPF consists of two parts: the top layer is the planning layer, which outputs the position of the UAV's next waypoint; the bottom layer is the learning and decision-making layer, which consists of the reinforcement learning model and provides policies for the top layer. In DAPF, each obstacle is regarded as an agent, and the agents improve the route through learning and cooperation. As multi-agent reinforcement learning is generally implemented in three ways, namely centralized training with centralized execution, distributed training with distributed execution, and centralized training with distributed execution, this paper studies the planning performance of DAPF under each form of multi-agent algorithm in static obstacle environments and analyzes their respective application scenarios. In addition, DAPF is compared with traditional path-planning algorithms such as RRT, A*, and the ant colony algorithm in the same environments. The results show that DAPF has clear advantages in maximum route tortuosity, global route tortuosity, and route length. At the same time, as an improvement on the artificial potential field algorithm, DAPF also resolves the local-minimum problem common to potential field methods.
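As background for DAPF, the classic APF update that it builds on can be sketched as follows. This is a minimal illustration of the standard attractive/repulsive formulation, not the thesis's DAPF itself; the gains `k_att` and `k_rep`, the influence radius `d0`, the step size, and the toy scenario are all illustrative assumptions.

```python
import numpy as np

def apf_step(pos, goal, obstacles, k_att=1.0, k_rep=100.0, d0=2.0, step=0.1):
    """One waypoint update under a classic artificial potential field.

    pos, goal: (3,) arrays; obstacles: iterable of (3,) obstacle centers.
    """
    force = k_att * (goal - pos)              # attraction pulls toward the goal
    for obs in obstacles:
        diff = pos - obs
        d = np.linalg.norm(diff)
        if 0.0 < d < d0:                      # repulsion acts only within radius d0
            force += k_rep * (1.0 / d - 1.0 / d0) * (1.0 / d ** 2) * (diff / d)
    norm = np.linalg.norm(force)
    return pos if norm == 0.0 else pos + step * force / norm

# Toy scenario: fly from the origin toward (10, 0, 0) past one obstacle.
p = np.array([0.0, 0.0, 0.0])
goal = np.array([10.0, 0.0, 0.0])
obstacles = [np.array([5.0, 0.5, 0.0])]
for _ in range(400):
    p = apf_step(p, goal, obstacles)
print(np.linalg.norm(goal - p))               # remaining distance to the goal
```

When the goal, the UAV, and an obstacle are nearly collinear, the attractive and repulsive forces can cancel and the UAV stalls: this is exactly the local-minimum failure mode that DAPF addresses by letting the obstacle agents reshape the field.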
For dynamic obstacle environments, inspired by the way flowing water avoids rocks in nature, this paper proposes the adaptive interfered fluid dynamical system (AIFDS) algorithm. The core of AIFDS is to adaptively change the direction and magnitude of the flow field in space, thereby changing the UAV's planned route. In AIFDS, the UAV is regarded as an agent; through interaction with the environment, it gradually learns how to adjust the flow field so as to plan, in advance, a route that is safe, short, and quick to execute. AIFDS can be combined with almost any reinforcement learning algorithm with a continuous action space. This paper studies its combination with the SAC, DDPG, PPO, and TD3 algorithms. Experiments are carried out in environments with multiple dynamic obstacles, and the
results show that AIFDS performs very well in terms of route safety. At the same time, compared with the original IFDS, the evaluation metrics of the planned routes improve substantially.
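The flow-field idea behind IFDS can be illustrated with a simplified sketch: an initial flow converging on the goal is deflected by a modulation matrix that removes the velocity component pointing into each spherical obstacle, with the deflection fading with distance. The function below is a deliberately simplified assumption (unit-strength flow, spherical obstacles, per-obstacle matrices composed by product, a reaction parameter `rho`), not the thesis's exact formulation; in AIFDS the reinforcement learning agent would tune such parameters online.

```python
import numpy as np

def ifds_velocity(pos, goal, obstacles, v0=1.0, rho=1.0):
    """Simplified interfered-fluid velocity at `pos` (a sketch, not the thesis's formulas).

    obstacles: iterable of (center, radius) pairs for spherical obstacles.
    Assumes pos coincides with neither the goal nor an obstacle center.
    """
    to_goal = goal - pos
    u = v0 * to_goal / np.linalg.norm(to_goal)        # initial flow converging on the goal
    M = np.eye(3)
    for center, radius in obstacles:
        diff = pos - center
        gamma = (np.linalg.norm(diff) / radius) ** 2  # shape function: 1 on the surface, >1 outside
        n = diff / np.linalg.norm(diff)               # outward normal of the obstacle
        # Remove the inward velocity component; the correction decays as gamma grows.
        M = M @ (np.eye(3) - np.outer(n, n) / gamma ** (1.0 / rho))
    return M @ u

# On the obstacle surface the flow is purely tangential; far away it is almost undisturbed.
v = ifds_velocity(np.array([2.0, 1.0, 0.0]), np.array([5.0, 0.0, 0.0]),
                  [(np.array([2.0, 0.0, 0.0]), 1.0)])
```

Integrating the position along this velocity field yields a route that bends smoothly around obstacles; making coefficients such as `rho` state-dependent outputs of an RL policy is the adaptive step that AIFDS adds.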
To address the intensive computation and slow training of deep reinforcement learning, this paper proposes a multi-process acceleration framework. The framework uses subprocesses to collect experience in parallel and feed it to the main process for network updates. The experimental results show that multi-process training roughly doubles efficiency and greatly reduces training time.
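The parallel experience-collection pattern described above can be sketched with Python's standard multiprocessing module. The toy `rollout_worker` below stands in for a real environment rollout; the `(worker_id, t, 1.0)` tuples are placeholder transitions, and where this sketch merely appends to a buffer, the actual framework would run a network update in the main process.

```python
import multiprocessing as mp

def rollout_worker(worker_id, queue, n_steps):
    """Child process: collect placeholder transitions and push them to the queue."""
    for t in range(n_steps):
        queue.put((worker_id, t, 1.0))   # stand-in for a (state, action, reward) transition
    queue.put(None)                      # sentinel: this worker is finished

if __name__ == "__main__":
    n_workers, n_steps = 4, 50
    queue = mp.Queue()
    workers = [mp.Process(target=rollout_worker, args=(i, queue, n_steps))
               for i in range(n_workers)]
    for w in workers:
        w.start()

    buffer, finished = [], 0
    while finished < n_workers:          # main process drains the queue
        item = queue.get()
        if item is None:
            finished += 1
        else:
            buffer.append(item)          # a real trainer would update the network here
    for w in workers:
        w.join()
    print(len(buffer))                   # prints 200: every transition arrived
```

Because each worker's items precede its own sentinel in the queue, the main loop is guaranteed to have received all transitions once every sentinel has been seen.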
Key words: Deep reinforcement learning; Route planning; Artificial potential field; Interfered fluid dynamical system; Multi-process training
CONTENTS
Abstract (in Chinese)
ABSTRACT
1. Introduction
1.1 Preface
1.2 Research Background
1.3 Research Status at Home and Abroad
1.4 Research Content and Technical Route of This Thesis
1.5 Chapter Summary
2. UAV Route-Planning Constraints and Performance Metrics
2.1 Preface
2.2 UAV Kinematic Constraints
2.3 Selection of Route Performance Metrics
2.4 Chapter Summary
3. Fundamentals of Reinforcement Learning
3.1 Preface
3.2 The Markov Decision Process
3.3 State-Value and State-Action Value Functions
3.4 The Bellman Equation and the Bellman Optimality Equation
3.5 Value Learning and Policy Learning
3.6 Chapter Summary
4. Deep Reinforcement Learning Route Planning in Static Environments
4.1 Preface
4.2 Artificial Potential Field 3D Route-Planning Algorithm
4.2.1 Definition and Computation of the Attractive and Repulsive Fields
4.2.2 The Improved Artificial Potential Field Algorithm
4.2.3 The Artificial Potential Field Algorithm with Kinematic Constraints