### Deep RL algorithms implemented using PyTorch
#### Algorithm list:
1. [DQN](https://github.com/akashe/DeepReinforcementLearning/blob/main/DQN.py)
2. [Vanilla Policy Gradient](https://github.com/akashe/DeepReinforcementLearning/blob/main/vanilla_policy_gradient.py)
3. [Deep Deterministic Policy Gradient](https://github.com/akashe/DeepReinforcementLearning/blob/main/ddpg.py)
4. [Twin Delayed Deep Deterministic Policy Gradient](https://github.com/akashe/DeepReinforcementLearning/blob/main/td3.py)
5. [Soft Actor Critic](https://github.com/akashe/DeepReinforcementLearning/blob/main/SoftActorCritic.py)
6. [Proximal Policy Optimization - CLIP](https://github.com/akashe/DeepReinforcementLearning/blob/main/ppo_clip.py)
###### Article: a deeper look into [policy gradients](https://akashe.io/blog/2020/10/14/policy-gradient-methods/)
#### Experimental Results:
|Algorithm| Discrete Env: LunarLander-v2 | Continuous Env: Pendulum-v0 |
| :---: | :---: | :---: |
| DQN | ![LunarLander-DQN](https://raw.githubusercontent.com/akashe/DeepReinforcementLearning/main/figures/DQN_Lunar_lander_rewards.png) | - |
| VPG | ![LunarLander-VPG](https://raw.githubusercontent.com/akashe/DeepReinforcementLearning/main/figures/VPG_LunarLander-v2_rewards.png) | - |
| DDPG | - | ![Pendulum-DDPG](https://raw.githubusercontent.com/akashe/DeepReinforcementLearning/main/figures/DDPG_Pendulum-v0_rewards.png)|
| TD3 | - | ![Pendulum-TD3](https://raw.githubusercontent.com/akashe/DeepReinforcementLearning/main/figures/TD3_Pendulum_rewards.png) |
| SAC | - | ![Pendulum-SAC](https://raw.githubusercontent.com/akashe/DeepReinforcementLearning/main/figures/SAC_Pendulum-v0_rewards.png) |
| PPO | - | ![Pendulum-PPO](https://raw.githubusercontent.com/akashe/DeepReinforcementLearning/main/figures/PPO_Pendulum-v0_rewards.png) |
#### Usage:
Run the file for the algorithm you want directly. The algorithms share no common structure, because I implemented each one as I learnt it.
Different algorithms are inspired by different sources.
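
As an illustration of running a file directly, here is a minimal sketch of a launcher. It is not part of this repository: `SCRIPTS`, `build_command` and `run` are hypothetical names, and the sketch assumes it is executed from the repository root with the dependencies each script needs already installed. The script file names are the ones linked in the algorithm list.

```python
import subprocess
import sys
from pathlib import Path

# Script names as linked in the algorithm list (the short keys are hypothetical).
SCRIPTS = {
    "dqn": "DQN.py",
    "vpg": "vanilla_policy_gradient.py",
    "ddpg": "ddpg.py",
    "td3": "td3.py",
    "sac": "SoftActorCritic.py",
    "ppo": "ppo_clip.py",
}


def build_command(algo, repo_root="."):
    """Return the command that runs one algorithm script with the current interpreter."""
    script = Path(repo_root) / SCRIPTS[algo.lower()]
    return [sys.executable, str(script)]


def run(algo, repo_root="."):
    """Run the chosen script and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(algo, repo_root), check=True)


# Example (assumption: called from the repository root):
# run("td3")
```

Running a script this way is equivalent to typing `python td3.py` in a shell; the helper only saves remembering the differently styled file names.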
#### Resources:
1. [RL course by David Silver](https://www.youtube.com/watch?v=KHZVXao4qXs&list=PLqYmG7hTraZDM-OYHWgPebj2MfCFzFObQ&index=7)
2. [Lecture slides for above course](https://www.davidsilver.uk/teaching/)
3. [Spinning up by OpenAI](https://spinningup.openai.com)
4. [More exhaustive RL guide by Denny Britz](https://github.com/dennybritz/reinforcement-learning)
#### Future projects:
1. If time permits, I will add a simple elevator program that uses RL.
2. Better graphs