<p align="center">
<img src="https://user-images.githubusercontent.com/17582508/52845370-4a930200-314a-11e9-9889-e00007043872.jpg" align="center">
[![Language grade: Python](https://img.shields.io/lgtm/grade/python/g/medipixel/rl_algorithms.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/medipixel/rl_algorithms/context:python)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)<!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section -->
[![All Contributors](https://img.shields.io/badge/all_contributors-10-orange.svg?style=flat-square)](#contributors-)
<!-- ALL-CONTRIBUTORS-BADGE:END -->
</p>
## Contents
* [Welcome!](https://github.com/medipixel/rl_algorithms#welcome)
* [Contributors](https://github.com/medipixel/rl_algorithms#contributors)
* [Algorithms](https://github.com/medipixel/rl_algorithms#algorithms)
* [Performance](https://github.com/medipixel/rl_algorithms#performance)
* [Getting Started](https://github.com/medipixel/rl_algorithms#getting-started)
* [Class Diagram](https://github.com/medipixel/rl_algorithms#class-diagram)
* [References](https://github.com/medipixel/rl_algorithms#references)
## Welcome!
This repository contains reinforcement learning algorithms used for research at Medipixel. The source code will be updated frequently.
We warmly welcome external contributors! :)
|<img src="https://user-images.githubusercontent.com/17582508/52840582-18c76e80-313d-11e9-9752-3d6138f39a15.gif" width="260" height="180"/>|<img src="https://media.giphy.com/media/ZxLNajigOcLyeUnOwg/giphy.gif" width="160" height="180"/>|<img src="https://media.giphy.com/media/1mikGEln2lArKMQ6Pt/giphy.gif" width="260" height="180"/>|
|:---:|:---:|:---:|
|BC agent on LunarLanderContinuous-v2|RainbowIQN agent on PongNoFrameskip-v4|SAC agent on Reacher-v2|
## Contributors
Thanks go to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):
<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
<!-- prettier-ignore-start -->
<!-- markdownlint-disable -->
<table>
<tr>
<td align="center"><a href="https://github.com/Curt-Park"><img src="https://avatars3.githubusercontent.com/u/14961526?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Jinwoo Park (Curt)</b></sub></a><br /><a href="https://github.com/medipixel/rl_algorithms/commits?author=Curt-Park" title="Code">💻</a></td>
<td align="center"><a href="https://github.com/MrSyee"><img src="https://avatars3.githubusercontent.com/u/17582508?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Kyunghwan Kim</b></sub></a><br /><a href="https://github.com/medipixel/rl_algorithms/commits?author=MrSyee" title="Code">💻</a></td>
<td align="center"><a href="https://github.com/darthegg"><img src="https://avatars3.githubusercontent.com/u/16010242?v=4?s=100" width="100px;" alt=""/><br /><sub><b>darthegg</b></sub></a><br /><a href="https://github.com/medipixel/rl_algorithms/commits?author=darthegg" title="Code">💻</a></td>
<td align="center"><a href="https://github.com/mclearning2"><img src="https://avatars3.githubusercontent.com/u/43226417?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Mincheol Kim</b></sub></a><br /><a href="https://github.com/medipixel/rl_algorithms/commits?author=mclearning2" title="Code">💻</a></td>
<td align="center"><a href="https://github.com/minseop4898"><img src="https://avatars1.githubusercontent.com/u/34338299?v=4?s=100" width="100px;" alt=""/><br /><sub><b>김민섭</b></sub></a><br /><a href="https://github.com/medipixel/rl_algorithms/commits?author=minseop4898" title="Code">💻</a></td>
<td align="center"><a href="https://github.com/jinPrelude"><img src="https://avatars1.githubusercontent.com/u/16518993?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Leejin Jung</b></sub></a><br /><a href="https://github.com/medipixel/rl_algorithms/commits?author=jinPrelude" title="Code">💻</a></td>
<td align="center"><a href="https://github.com/cyoon1729"><img src="https://avatars2.githubusercontent.com/u/33583101?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Chris Yoon</b></sub></a><br /><a href="https://github.com/medipixel/rl_algorithms/commits?author=cyoon1729" title="Code">💻</a></td>
</tr>
<tr>
<td align="center"><a href="https://jiseonghan.github.io/"><img src="https://avatars2.githubusercontent.com/u/48741026?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Jiseong Han</b></sub></a><br /><a href="https://github.com/medipixel/rl_algorithms/commits?author=jiseongHAN" title="Code">💻</a></td>
<td align="center"><a href="https://github.com/sehyun-hwang"><img src="https://avatars3.githubusercontent.com/u/23437715?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Sehyun Hwang</b></sub></a><br /><a href="#maintenance-sehyun-hwang" title="Maintenance">🚧</a></td>
<td align="center"><a href="https://github.com/isk03276"><img src="https://avatars.githubusercontent.com/u/23740495?v=4?s=100" width="100px;" alt=""/><br /><sub><b>eunjin</b></sub></a><br /><a href="https://github.com/medipixel/rl_algorithms/commits?author=isk03276" title="Code">💻</a></td>
</tr>
</table>
<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- ALL-CONTRIBUTORS-LIST:END -->
This project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification.
## Algorithms
0. [Advantage Actor-Critic (A2C)](https://github.com/medipixel/rl_algorithms/tree/master/rl_algorithms/a2c)
1. [Deep Deterministic Policy Gradient (DDPG)](https://github.com/medipixel/rl_algorithms/tree/master/rl_algorithms/ddpg)
2. [Proximal Policy Optimization Algorithms (PPO)](https://github.com/medipixel/rl_algorithms/tree/master/rl_algorithms/ppo)
3. [Twin Delayed Deep Deterministic Policy Gradient Algorithm (TD3)](https://github.com/medipixel/rl_algorithms/tree/master/rl_algorithms/td3)
4. [Soft Actor Critic Algorithm (SAC)](https://github.com/medipixel/rl_algorithms/tree/master/rl_algorithms/sac)
5. [Behaviour Cloning (BC with DDPG, SAC)](https://github.com/medipixel/rl_algorithms/tree/master/rl_algorithms/bc)
6. [From Demonstrations (DDPGfD, SACfD, DQfD)](https://github.com/medipixel/rl_algorithms/tree/master/rl_algorithms/fd)
7. [Rainbow DQN](https://github.com/medipixel/rl_algorithms/tree/master/rl_algorithms/dqn)
8. [Rainbow IQN (without DuelingNet)](https://github.com/medipixel/rl_algorithms/tree/master/rl_algorithms/dqn) - DuelingNet [degrades performance](https://github.com/medipixel/rl_algorithms/pull/137)
9. Rainbow IQN (with [ResNet](https://github.com/medipixel/rl_algorithms/blob/master/rl_algorithms/common/networks/backbones/resnet.py))
10. [Recurrent Replay DQN (R2D1)](https://github.com/medipixel/rl_algorithms/tree/master/rl_algorithms/recurrent)
11. [Distributed Prioritized Experience Replay (Ape-X)](https://github.com/medipixel/rl_algorithms/tree/master/rl_algorithms/common/apex)
12. [Policy Distillation](https://github.com/medipixel/rl_algorithms/tree/master/rl_algorithms/distillation)
13. [Generative Adversarial Imitation Learning (GAIL)](https://github.com/medipixel/rl_algorithms/tree/master/rl_algorithms/gail)
14. [Sample Efficient Actor-Critic with Experience Replay (ACER)](https://github.com/medipixel/rl_algorithms/tree/master/rl_algorithms/acer)
## Performance
We have tested each algorithm on some of the following environments.
- [PongNoFrameskip-v4](https://github.com/medipixel/rl_algorithms/tree/master/configs/pong_no_frameskip_v4)
- [LunarLanderContinuous-v2](https://github.com/medipixel/rl_algorithms/tree/master/configs/lunarlander_continuous_v2)
- [LunarLander-v2](https://github.com/medipixel/rl_algorithms/tree/master/configs/lunarlander_v2)
- [Reacher-v2](https://github.com/medipixel/rl_algorithms/tree/master/configs/reacher-v2)
❗Please note that this won't be fr