# Faster R-CNN in MXNet with distributed implementation and data parallelization
![example detections](https://cloud.githubusercontent.com/assets/13162287/22101032/92085dc0-de6c-11e6-9228-67e72606ddbc.png)
## Why?
Good implementations of Faster R-CNN exist, yet they lack support for recent
ConvNet architectures. This reproduction was written from scratch to make full use of
the MXNet engine and its parallelization for object detection.
| Indicator | py-faster-rcnn (caffe resp.) | mx-rcnn (this reproduction) |
| :-------- | :--------------------------- | :-------------------------- |
| Speed [1] | 2.5 img/s training, 5 img/s testing | 3.8 img/s in training, 12.5 img/s testing |
| Performance [2] | mAP 73.2 | mAP 75.97 |
| Efficiency [3] | 11G for Fast R-CNN | 4.6G for Fast R-CNN |
| Parallelization [4] | None | 3.8 img/s to 6 img/s for 2 GPUs |
| Extensibility [5] | Old framework and base networks | ResNet |
* [1] On Ubuntu 14.04.5 with device Titan X, cuDNN enabled. The experiment is VGG-16 end-to-end training.
* [2] VGG network. Trained end-to-end on VOC07trainval+12trainval, tested on VOC07 test.
* [3] VGG network. Fast R-CNN is the most memory expensive process.
* [4] VGG network (parallelization limited by bandwidth). ResNet-101 speeds up from 2 img/s to 3.5 img/s.
* [5] py-faster-rcnn does not support ResNet or recent caffe version.
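To put the parallelization figures in perspective, the throughput numbers quoted above can be turned into a speedup and a scaling efficiency. The snippet below is only a back-of-the-envelope sketch: the function names are illustrative, and it assumes the ResNet-101 figure in note [4] also refers to 2 GPUs, which the note does not state explicitly.

```python
def speedup(single_gpu_rate, multi_gpu_rate):
    """Ratio of multi-GPU throughput to single-GPU throughput."""
    return multi_gpu_rate / single_gpu_rate


def scaling_efficiency(single_gpu_rate, multi_gpu_rate, num_gpus):
    """Fraction of ideal linear scaling that was actually achieved."""
    return speedup(single_gpu_rate, multi_gpu_rate) / num_gpus


# Training throughput in img/s, copied from the table and note [4].
# Assumption: both pairs compare 1 GPU against 2 GPUs.
measurements = {
    "VGG-16": (3.8, 6.0),
    "ResNet-101": (2.0, 3.5),
}

for name, (one_gpu, two_gpus) in measurements.items():
    s = speedup(one_gpu, two_gpus)
    e = scaling_efficiency(one_gpu, two_gpus, num_gpus=2)
    print("%s: %.2fx speedup, %.1f%% of linear scaling" % (name, s, 100 * e))
```

Under that assumption VGG-16 reaches roughly 1.58x and ResNet-101 1.75x on two GPUs, which is consistent with the remark that VGG parallelization is limited by bandwidth.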
## Why Not?
* If you value stability and reproducibility over performance and efficiency, please refer to the official implementations.
There is no promise that this reproduction works in all cases or for all experiments.
* If you value simplicity, be aware that the technical details are *very complicated* in MXNet.
This is by design, to attain the maximum possible performance instead of patching fix after fix.
Performance and parallelization take more than a change of parameter.
* If you want to do CPU training, be advised that it has not been verified yet.
You will not encounter NOT_IMPLEMENTED_ERROR, so it is still possible.
* If you are on Windows or Python 3, some people reported that it was possible with some modifications.
But they have since disappeared.
## Experiments
| Method | Network | Training Data | Testing Data | Reference | Result | Link |
| :----- | :------ | :------------ | :----------- | :-------: | :----: | :---: |
| Fast R-CNN | VGG16 | VOC07 | VOC07test | 66.9 | 66.50 | [Dropbox](https://www.dropbox.com/s/xmxjitv0kl96h7v/vgg_fast_rcnn-0008.params?dl=0) |
| Faster R-CNN alternate | VGG16 | VOC07 | VOC07test | 69.9 | 69.62 | [Dropbox](https://www.dropbox.com/s/fgj71uzxz8h6ajj/vgg_voc_alter-0008.params?dl=0) |
| Faster R-CNN end-to-end | VGG16 | VOC07 | VOC07test | 69.9 | 70.23 | [Dropbox](https://www.dropbox.com/s/gfxnf1qzzc0lzw2/vgg_voc07-0010.params?dl=0) |
| Faster R-CNN end-to-end | VGG16 | VOC07+12 | VOC07test | 73.2 | 75.97 | [Dropbox](https://www.dropbox.com/s/rvktx65s48cuyb9/vgg_voc0712-0010.params?dl=0) |
| Faster R-CNN end-to-end | ResNet-101 | VOC07+12 | VOC07test | 76.4 | 79.35 | [Dropbox](https://www.dropbox.com/s/ge2wl0tn47xezdf/resnet_voc0712-0010.params?dl=0) |
| Faster R-CNN end-to-end | VGG16 | COCO train | COCO val | 21.2 | 22.8 | [Dropbox](https://www.dropbox.com/s/e0ivvrc4pku3vj7/vgg_coco-0010.params?dl=0) |
| Faster R-CNN end-to-end | ResNet-101 | COCO train | COCO val | 27.2 | 26.1 | [Dropbox](https://www.dropbox.com/s/bfuy2uo1q1nwqjr/resnet_coco-0010.params?dl=0) |
The above experiments were conducted at [an mx-rcnn version](https://github.com/precedenceguo/mx-rcnn/tree/6a1ab0eec5035a10a1efb5fc8c9d6c54e101b4d0)
using [an MXNet fork, based on MXNet 0.9.1 nnvm pre-release](https://github.com/precedenceguo/mxnet/tree/simple).
## I'm Feeling Lucky
* Prepare: `bash script/additional_deps.sh`
* Download training data: `bash script/get_voc.sh`
* Download pretrained model: `bash script/get_pretrained_model.sh`
* Training and testing: `bash script/vgg_voc07.sh 0,1` (use gpu 0 and 1)
## Getting started
See if `bash script/additional_deps.sh` will do the following for you.
* Suppose `HOME` represents where this file is located. All commands, unless stated otherwise, should be started from `HOME`.
* Install the Python packages `cython`, `easydict`, `matplotlib` and `scikit-image`.
* Install MXNet v0.9.5 or higher together with the MXNet Python interface. Open `python` and type `import mxnet` to confirm.
* Run `make` in `HOME`.
Command line arguments have the same meaning as in mxnet/example/image-classification.
* `prefix` refers to the first part of a saved model file name and `epoch` refers to a number in this file name.
In `model/vgg-0000.params`, `prefix` is `"model/vgg"` and `epoch` is `0`.
* `begin_epoch` means the start of your training process, which will apply to all saved checkpoints.
* Remember to turn off cudnn auto tune. `export MXNET_CUDNN_AUTOTUNE_DEFAULT=0`.
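The `prefix`/`epoch` convention and the autotune setting can be sketched in a few lines of Python. The helper names below are hypothetical and are not part of this repository; the file-name pattern (`prefix`, a dash, a four-digit epoch, `.params`) follows the `model/vgg-0000.params` example above. Setting the environment variable before `import mxnet` is assumed here to be the safe ordering.

```python
import os

# Equivalent of `export MXNET_CUDNN_AUTOTUNE_DEFAULT=0` for the current process.
# Assumption: this runs before `import mxnet` anywhere in the program.
os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "0"


def checkpoint_params_path(prefix, epoch):
    """Build the parameter file name for a given prefix and epoch."""
    return "%s-%04d.params" % (prefix, epoch)


def split_params_path(path):
    """Recover (prefix, epoch) from a parameter file name."""
    stem, ext = os.path.splitext(path)
    if ext != ".params":
        raise ValueError("not a .params file: %r" % path)
    # Split on the last dash so prefixes such as 'resnet-101' stay intact.
    prefix, _, epoch = stem.rpartition("-")
    return prefix, int(epoch)


print(checkpoint_params_path("model/vgg", 0))
print(split_params_path("model/resnet-101-0000.params"))
```

Splitting on the last dash is a deliberate choice: a prefix may itself contain dashes, while the epoch is always the final component.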
## Demo (Pascal VOC)
* An example trained model (trained on VOC07 trainval) can be downloaded from
[Baidu Yun](http://pan.baidu.com/s/1boRhGvH) (ixiw) or
[Dropbox](https://www.dropbox.com/s/jrr83q0ai2ckltq/final-0000.params.tar.gz?dl=0).
If you put the extracted model `final-0000.params` in `HOME` then use `--prefix final --epoch 0` to access it.
* Try out detection by running `python demo.py --prefix final --epoch 0 --image myimage.jpg --gpu 0 --vis`.
Drop the `--vis` if you do not have a display or want to save as a new file.
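For readers unfamiliar with how such flags are typically consumed, here is a hypothetical stand-in parser that accepts the same flags as the demo command above. It is not the code in `demo.py`; the argument types, defaults and help strings are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the demo command."""
    parser = argparse.ArgumentParser(description="illustrative demo-style CLI")
    parser.add_argument("--prefix", type=str, required=True,
                        help="first part of the saved model file name")
    parser.add_argument("--epoch", type=int, required=True,
                        help="epoch number in the saved model file name")
    parser.add_argument("--image", type=str, required=True,
                        help="path of the image to run detection on")
    parser.add_argument("--gpu", type=int, default=0,
                        help="GPU device id (assumed default)")
    parser.add_argument("--vis", action="store_true",
                        help="show the detections in a window")
    return parser


# Parse the exact command line shown above, minus `python demo.py`.
argv = "--prefix final --epoch 0 --image myimage.jpg --gpu 0 --vis".split()
args = build_parser().parse_args(argv)

# The prefix and epoch together identify the parameter file to load.
params_file = "%s-%04d.params" % (args.prefix, args.epoch)
print(params_file, args.image, args.gpu, args.vis)
```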
## Training Faster R-CNN
The following tutorial is based on VOC data and the VGG network. Supply `--network resnet` and
`--dataset coco` to use other networks and datasets.
Refer to `script/vgg_voc07.sh` and other experiments for examples.
### Prepare Training Data
See if `bash script/get_voc.sh` and `bash script/get_coco.sh` will do the following for you.
* Make a folder `data` in `HOME`. The `data` folder will hold the training data folders `VOCdevkit` and `coco`.
* Download and extract [Pascal VOC data](http://host.robots.ox.ac.uk/pascal/VOC/), place the `VOCdevkit` folder in `HOME/data`.
* Download and extract [coco dataset](http://mscoco.org/dataset/), place all images in `coco/images` and annotation jsons in `data/annotations`.
(Skip this if not interested) All datasets have three attributes: `image_set`, `root_path` and `dataset_path`.
* `image_set` could be `2007_trainval` or something like `2007trainval+2012trainval`.
* `root_path` is usually `data`, where `cache`, `selective_search_data`, `rpn_data` will be stored.
* `dataset_path` could be something like `data/VOCdevkit`, where images, annotations and results can be put so that many copies of datasets can be linked to the same actual place.
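The three attributes can be pictured as a small record plus a parser for combined image sets. This is a minimal sketch rather than the repository's loader: `DatasetSpec` and `parse_image_set` are hypothetical names, and the sketch assumes every component uses the underscore form `YEAR_SPLIT` joined by `+`, so a spelling without the underscore would be rejected.

```python
from collections import namedtuple

# Hypothetical container for the three attributes described above.
DatasetSpec = namedtuple("DatasetSpec", ["image_set", "root_path", "dataset_path"])


def parse_image_set(image_set):
    """Split 'YEAR_SPLIT+YEAR_SPLIT' into a list of (year, split) pairs."""
    pairs = []
    for part in image_set.split("+"):
        year, sep, split = part.partition("_")
        if not (sep and year and split):
            raise ValueError("expected YEAR_SPLIT, got %r" % part)
        pairs.append((year, split))
    return pairs


spec = DatasetSpec(image_set="2007_trainval+2012_trainval",
                   root_path="data",
                   dataset_path="data/VOCdevkit")

for year, split in parse_image_set(spec.image_set):
    print("VOC%s %s under %s" % (year, split, spec.dataset_path))
```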
### Prepare Pretrained Models
See if `bash script/get_pretrained_model.sh` will do this for you. If not,
* Make a folder `model` in `HOME`. The `model` folder will hold the model checkpoints saved during training.
It is recommended to make `model` a symbolic link to another location on the hard disk.
* Download the VGG16 pretrained model `vgg16-0000.params` from [MXNet model gallery](https://github.com/dmlc/mxnet-model-gallery/blob/master/imagenet-1k-vgg.md) to the `model` folder.
* Download the ResNet pretrained model `resnet-101-0000.params` from [ResNet](https://github.com/tornadomeet/ResNet) to the `model` folder.
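The manual steps above can be checked with a short script. The following is a sketch under stated assumptions, not a replacement for `script/get_pretrained_model.sh`: the helper names are hypothetical, it does not download anything, and it is written for Python 3. It creates `model` (optionally as a symbolic link) and reports which of the two pretrained files are still missing.

```python
import os
import tempfile

# File names taken from the steps above.
EXPECTED_PRETRAINED = ("vgg16-0000.params", "resnet-101-0000.params")


def prepare_model_dir(home, link_target=None):
    """Create HOME/model, as a symlink to link_target when one is given."""
    model_dir = os.path.join(home, "model")
    if not os.path.lexists(model_dir):
        if link_target is not None:
            os.makedirs(link_target, exist_ok=True)
            os.symlink(os.path.abspath(link_target), model_dir)
        else:
            os.makedirs(model_dir)
    return model_dir


def missing_pretrained(model_dir):
    """Return the expected pretrained files that are not present yet."""
    return [name for name in EXPECTED_PRETRAINED
            if not os.path.isfile(os.path.join(model_dir, name))]


# Demonstration in a throwaway directory standing in for HOME.
with tempfile.TemporaryDirectory() as home:
    model_dir = prepare_model_dir(home)
    # Simulate having downloaded only the VGG16 weights.
    open(os.path.join(model_dir, "vgg16-0000.params"), "wb").close()
    still_missing = missing_pretrained(model_dir)
    print(still_missing)
```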
### Alternate Training
See if `bash script/vgg_alter_voc07.sh 0` (use gpu 0) will do the following for you.
* Start training by running `python train_alternate.py`. This will train the VGG network on VOC07 trainval.
More options for controlling the training process are listed in the argparse help.
* Start testing by running `python test.py --prefix model/final --epoch 0` after completing the training process.
This will test the VGG network on the VOC07 test with the model in `HOME/model/final-0000.params`.
Adding a `--vis` will turn on visualization and `-h` will show help as in the training process.
### End-to-end Training (approximate process)
See if `bash script/vgg_voc07.sh 0` (use gpu 0) will do the following for you.
* Start training by running `python train_
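Source code for object recognition and classification based on the Faster R-CNN algorithm. Overview: this resource provides source code for object recognition and classification based on Faster R-CNN, intended to help beginners and researchers get started quickly and understand this object detection algorithm. Faster R-CNN combines the feature extraction strength of deep learning with the efficient candidate-region generation of a region proposal network (RPN) to form an end-to-end object detection pipeline. The source code covers the complete flow from data preprocessing and model construction to training and testing. Image features are extracted with a pretrained model such as VGG16, the RPN generates candidate regions, RoI Pooling handles candidate regions of different sizes, and a multi-task loss function drives classification and localization. Each step is commented in detail so that users can understand and modify it. Because Faster R-CNN is complex, users are advised to have some background in deep learning and some programming experience. Because software and hardware evolve quickly, some parameter settings may need to be adjusted for a specific environment to get the best performance. This resource is for learning and research only and must not be used for any commercial purpose or illegal activity. We hope this source code becomes a useful tool for exploring object detection.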