# Deep RL for traffic signal control
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
This repo implements state-of-the-art multi-agent (decentralized) deep RL algorithms for large-scale traffic signal control in SUMO-simulated environments.
Available cooperation levels:
* Centralized: a single global agent controls all intersections using global observations and a global reward.
* Decentralized: multiple local agents control their own intersections independently, sharing information with their neighbors.
Available NN layers:
Fully-connected, LSTM.
Available algorithms:
IQL, IA2C, IA2C with stabilization (called MA2C in this paper). For more advanced algorithms, please check [deeprl_network](https://github.com/cts198859/deeprl_network).
Available environments:
* A 6-intersection benchmark traffic network. [Ye, Bao-Lin, et al. "A hierarchical model predictive control approach for signal splits optimization in large-scale urban road networks." IEEE Transactions on Intelligent Transportation Systems 17.8 (2016): 2182-2192.](https://ieeexplore.ieee.org/abstract/document/7406703/)
* A 5X5 traffic grid. [Chu, Tianshu, Shuhui Qu, and Jie Wang. "Large-scale traffic grid signal control with regional reinforcement learning." American Control Conference (ACC), 2016. IEEE, 2016.](https://ieeexplore.ieee.org/abstract/document/7525014/)
* A modified Monaco traffic network with 30 signalized intersections. [L. Codeca, J. Härri, "Monaco SUMO Traffic (MoST) Scenario: A 3D Mobility Scenario for Cooperative ITS" SUMO 2018, SUMO User Conference, Simulating Autonomous and Intermodal Transport Systems May 14-16, 2018, Berlin, Germany.](http://www.eurecom.fr/en/publication/5527/download/comsys-publi-5527.pdf) ([code](https://github.com/lcodeca/MoSTScenario))
## Requirements
* Python3==3.5
* [Tensorflow](http://www.tensorflow.org/install)==1.12.0
* [SUMO](http://sumo.dlr.de/wiki/Installing)>=1.1.0
Required packages can be installed by running `setup_mac.sh` or `setup_ubuntu.sh`.
Attention: the code on the master branch requires SUMO version >= 1.1.0. Please switch to branch [sumo-0.32.0](https://github.com/cts198859/deeprl_signal_control/tree/sumo-0.32.0) if you are using an older SUMO version.
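The version requirement above can be checked programmatically. Below is a minimal sketch (the helper names are ours, not part of this repo) that compares a version string, e.g. as reported by `sumo --version`, against the master-branch minimum:

```python
def parse_version(text):
    """Parse a dotted version string such as '1.1.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))

def master_branch_supported(sumo_version, minimum="1.1.0"):
    """True if the given SUMO version works with the master branch,
    which requires SUMO >= 1.1.0; older versions need the sumo-0.32.0 branch."""
    return parse_version(sumo_version) >= parse_version(minimum)
```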
## Usages
First define all hyperparameters in a config file under `[config_dir]`, and create the base directory of experiments `[base_dir]`. Before training, run `build_file.py` under `[environment_dir]/data/` to generate the SUMO network files for the `small_grid` and `large_grid` environments.
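The experiment-directory convention above can be prepared in a few lines; this is a convenience sketch (the helper name is ours, and `main.py` itself only takes `--base-dir` and `--config-dir` directly):

```python
import os
import shutil

def init_experiment(base_dir, agent, config_path):
    """Create [base_dir]/[agent] and copy the chosen config file into it,
    mirroring the per-agent directory layout used with main.py's --base-dir."""
    exp_dir = os.path.join(base_dir, agent)
    os.makedirs(exp_dir, exist_ok=True)
    shutil.copy(config_path, exp_dir)
    return exp_dir
```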
1. To train a new agent, run
~~~
python3 main.py --base-dir [base_dir]/[agent] train --config-dir [config_dir] --test-mode no_test
~~~
`[agent]` is one of `{ia2c, ma2c, iqll, iqld}`. `no_test` is recommended, since in-training tests significantly slow down training.
2. To access tensorboard during training, run
~~~
tensorboard --logdir=[base_dir]/log
~~~
3. To evaluate and compare trained agents, run
~~~
python3 main.py --base-dir [base_dir] evaluate --agents [agents] --evaluation-seeds [seeds]
~~~
Evaluation data will be output to `[base_dir]/eva_data`; make sure the evaluation seeds differ from those used in training. Under the default evaluation setting, the inference policy of A2C is stochastic whereas that of Q-learning is greedy (deterministic). To explicitly specify the inference policy type, pass the argument `--evaluation-policy-type [default/stochastic/deterministic]`. Please note that running a deterministic inference policy for A2C may degrade performance, since it violates the on-policy assumption of the training.
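The distinction between the two inference modes can be illustrated with a small sketch (this is our illustration, not code from the repo): a stochastic policy samples an action from the policy's probability vector, while a deterministic one takes the argmax.

```python
import numpy as np

def select_action(probs, policy_type="stochastic", rng=None):
    """Return an action index given a probability vector over actions.
    'deterministic' takes the argmax (greedy, as Q-learning does by default);
    'stochastic' samples from the distribution (as A2C does by default)."""
    if policy_type == "deterministic":
        return int(np.argmax(probs))
    rng = np.random.default_rng() if rng is None else rng
    return int(rng.choice(len(probs), p=probs))
```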
4. To visualize the agent behavior, run
~~~
python3 main.py --base-dir [base_dir] evaluate --agents [agent] --evaluation-seeds [seed] --demo
~~~
It is recommended to use only one agent and one evaluation seed for the demo run. This will launch the SUMO GUI, and `./large_grid/data/view.xml` can be applied to visualize queue length and intersection delay as edge color and thickness. Below are a few example screenshots.
| t=1500s | t=2500s | t=3500s |
|:-------------------:|:--------------------:|:--------------------:|
| ![](./figs/1500.png) | ![](./figs/2500.png) | ![](./figs/3500.png) |
## Reproducibility
Due to the SUMO version change and a few corresponding code modifications (e.g. `tau="0.5"` had to be removed from `vType` to prevent excessive vehicle collisions in simulation), it is difficult to reproduce the paper results, which are based on SUMO 0.32.0. We have therefore re-run the experiments using SUMO 1.1.0 and provide the following training plots as a reference. The conclusions remain the same: MA2C ~ IQL-LR > IA2C in the large grid, and MA2C > IA2C > IQL-LR in the Monaco network. Note that, rather than reproducing exactly the same numbers, an evaluation is always valid as long as the comparison is fair, i.e. the environment config and seed are fixed across agents.
| large grid | Monaco net |
|:-------------------------------:|:------------------------------:|
| ![](./figs/large_grid_train.png) | ![](./figs/real_net_train.png) |
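Training curves like the ones above are typically produced by smoothing the raw per-episode rewards logged during training. A minimal moving-average sketch (the function name and window size are our assumptions, not part of the repo's plotting code):

```python
import numpy as np

def smooth_rewards(rewards, window=100):
    """Moving-average smoothing of a raw episode-reward series,
    as commonly applied before plotting RL training curves."""
    rewards = np.asarray(rewards, dtype=float)
    if rewards.size < window:
        return rewards  # too short to smooth; return unchanged
    kernel = np.ones(window) / window
    return np.convolve(rewards, kernel, mode="valid")
```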
## Citation
If you find this useful in your research, please cite our paper "Multi-Agent Deep Reinforcement Learning for Large-Scale Traffic Signal Control" ([early access version](https://ieeexplore.ieee.org/document/8667868), [preprint version](https://arxiv.org/pdf/1903.04527.pdf)):
~~~
@article{chu2019multi,
title={Multi-Agent Deep Reinforcement Learning for Large-Scale Traffic Signal Control},
author={Chu, Tianshu and Wang, Jie and Codec{\`a}, Lara and Li, Zhaojian},
journal={IEEE Transactions on Intelligent Transportation Systems},
year={2019},
publisher={IEEE}
}
~~~