simple-faster-rcnn-pytorch_simple-faster-rcnn-_rcnnpytorch代码_pyt

37 files in total

py: 25

xml: 4

iml: 1


RCNN

Uploaded 2021-09-11 12:36:07 · 698KB RAR

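**Deep Learning Object Detection: A Detailed Look at a Simple Faster R-CNN Implementation in PyTorch**

In computer vision, object detection is a core task: it identifies and localizes specific objects in an image. Faster R-CNN (Faster Region-based Convolutional Neural Network) was proposed by Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. It builds on R-CNN and Fast R-CNN and makes detection considerably more efficient. This description walks through a simple PyTorch implementation of Faster R-CNN to help readers understand how it works and how it is implemented.

The core innovation of Faster R-CNN is the Region Proposal Network (RPN). The RPN shares its convolutional layers with the detection network and produces candidate boxes together with objectness scores, so proposals no longer require a separate, slow external algorithm.

1. **Basic architecture**: Faster R-CNN consists of two parts, the RPN and the detection network. The RPN slides a small network over the feature map and, at each position, scores a set of anchor boxes, producing candidate regions (RoIs) that may contain objects. The RoIs then pass through an RoI Pooling layer, which converts each of them into a fixed-size feature. That feature is fed to fully connected layers for classification and box regression, which determine the final class and location.
2. **RoI Pooling**: RoI Pooling is a key operation in Faster R-CNN. It converts RoIs of different sizes into feature maps of one fixed size so that the fully connected layers can process them. RoIAlign, introduced with Mask R-CNN, is a more precise alternative that avoids quantizing RoI coordinates. The archive listed below contains `roi_module.py` and `roi_cupy.py`, which suggests that this implementation uses RoI Pooling written with cupy.
3. **Loss function**: The loss has four terms: the RPN classification and regression losses, and the detection network's classification and bounding-box regression losses. The RPN classification loss measures whether an anchor contains an object, the detection network's classification loss measures which class an RoI belongs to, and the regression losses refine the box coordinates.
4. **Training procedure**: The original paper describes a four-step alternating scheme that trains the RPN and the detection network in turn while sharing the convolutional layers. Many later implementations instead train both parts end to end by summing the four losses (approximate joint training).
5. **Code comments**: The code is commented in detail, which helps beginners follow each step: how the network is built, the forward pass, the loss computation, and backpropagation.
6. **PyTorch advantages**: PyTorch's dynamic computation graph makes building and debugging models more intuitive, and its modular design makes code easy to reuse and extend. PyTorch also has strong community support, with many pretrained models and libraries available.

By studying this simplified PyTorch implementation, developers can learn the basic principles of object detection and become familiar with PyTorch, which lays a foundation for later work on other deep learning models. Faster R-CNN also serves as the basis for extensions such as Mask R-CNN, which adds a mask branch for instance segmentation. Where real-time speed matters, single-stage detectors such as the YOLO family are a common alternative.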
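The RoI Pooling step described above can be illustrated with a small, dependency-free sketch. It is not the code from `roi_module.py` or `roi_cupy.py`: the function name `roi_max_pool`, the single-channel nested-list feature map, and the inclusive `(y0, x0, y1, x1)` RoI format are all assumptions chosen to keep the example short. It only shows the idea that RoIs of different sizes are divided into the same number of bins and max-pooled per bin.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def roi_max_pool(feature, roi, out_h, out_w):
    """Max-pool one RoI of a 2D feature map into an out_h x out_w grid.

    feature: list of rows (H x W), single channel (assumption for brevity).
    roi: (y0, x0, y1, x1), inclusive indices in feature-map cells (assumption).
    """
    y0, x0, y1, x1 = roi
    roi_h = y1 - y0 + 1
    roi_w = x1 - x0 + 1
    pooled = []
    for i in range(out_h):
        # Floor for the bin start and ceiling for the bin end, so that the
        # bins cover the whole RoI and are never empty.
        ys = y0 + (i * roi_h) // out_h
        ye = y0 + ceil_div((i + 1) * roi_h, out_h)
        row = []
        for j in range(out_w):
            xs = x0 + (j * roi_w) // out_w
            xe = x0 + ceil_div((j + 1) * roi_w, out_w)
            row.append(max(feature[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled


# A 4 x 6 feature map filled with 1..24, row by row.
feature = [[r * 6 + c + 1 for c in range(6)] for r in range(4)]

whole_map = roi_max_pool(feature, (0, 0, 3, 5), 2, 2)  # a 4 x 6 RoI
small_roi = roi_max_pool(feature, (1, 1, 3, 3), 2, 2)  # a 3 x 3 RoI
```

Both calls return a 2 x 2 grid even though the RoIs differ in size, which is what lets the fully connected layers accept any proposal. When the RoI size is not divisible by the output size, neighboring bins overlap by one cell in this sketch.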


simple-faster-rcnn-pytorch_simple-faster-rcnn-_rcnnpytorch代码_pytorchfasterRCNN_深度学习目标_RCNN_源码.rar (37 files)

simple-faster-rcnn-pytorch

trainer.py 12KB

README.MD 8KB

requirements.txt 117B

data

__init__.py 0B

voc_dataset.py 6KB

dataset.py 6KB

model

faster_rcnn_vgg16.py 6KB

faster_rcnn.py 11KB

region_proposal_network.py 10KB

__init__.py 52B

roi_module.py 4KB

utils

creator_tool.py 22KB

roi_cupy.py 6KB

nms

__init__.py 75B

_nms_gpu_post_py.py 720B

non_maximum_suppression.py 6KB

_nms_gpu_post.pyx 970B

build.py 279B

__init__.py 0B

bbox_tools.py 13KB

LICENSE 2KB

demo.ipynb 686KB

utils

vis_tool.py 7KB

__init__.py 600B

config.py 2KB

array_tool.py 619B

eval_tool.py 12KB

.idea

misc.xml 327B

workspace.xml 2KB

inspectionProfiles

profiles_settings.xml 174B

simple-faster-rcnn-pytorch.iml 526B

modules.xml 311B

misc

convert_caffe_pretrain.py 729B

train_fast.py 4KB

demo.jpg 120KB

.gitattributes 18B

train.py 6KB

# A Simple and Fast Implementation of Faster R-CNN

## 1. Introduction

**I've updated the code to support both Python 2 and Python 3, and PyTorch 0.4. If you want the old version of the code, please check out branch [v0.3](https://github.com/chenyuntc/simple-faster-rcnn-pytorch/tree/0.3).**

This project is a **simplified** Faster R-CNN implementation based on [chainercv](https://github.com/chainer/chainercv) and other [projects](#acknowledgement). It aims to:

- Simplify the code (*Simple is better than complex*)
- Make the code more straightforward (*Flat is better than nested*)
- Match the performance reported in the [original paper](https://arxiv.org/abs/1506.01497) (*Speed Counts and mAP Matters*)

It has the following features:

- It can be run as pure Python code, with no build step required. (The CUDA code has moved to cupy, and Cython acceleration is optional.)
- It's a minimal implementation in around 2000 lines of valid code, with a lot of comments and instructions (thanks to chainercv's excellent documentation).
- It achieves higher mAP than the original implementation (0.712 vs. 0.699).
- It achieves speed comparable to other implementations (6 fps for training and 14 fps for testing on a TITAN Xp with Cython).
- It's memory-efficient (about 3GB for vgg16).

![img](http://7zh43r.com1.z0.glb.clouddn.com/del/faster-speed.jpg)

## 2. Performance

### 2.1 mAP

VGG16 trained on the `trainval` split and tested on the `test` split.

**Note**: training shows great randomness; you may need a bit of luck and more epochs of training to reach the highest mAP. However, it should be easy to surpass the lower bound.

| Implementation | mAP |
| :--------------------------------------: | :---------: |
| [original paper](https://arxiv.org/abs/1506.01497) | 0.699 |
| train with caffe pretrained model | 0.700-0.712 |
| train with torchvision pretrained model | 0.685-0.701 |
| model converted from [chainercv](https://github.com/chainer/chainercv/tree/master/examples/faster_rcnn) (reported 0.706) | 0.7053 |

### 2.2 Speed

| Implementation | GPU | Inference | Training |
| :--------------------------------------: | :------: | :-------: | :--------: |
| [original paper](https://arxiv.org/abs/1506.01497) | K40 | 5 fps | NA |
| This[1] | TITAN Xp | 14-15 fps | 6 fps |
| [pytorch-faster-rcnn](https://github.com/ruotianluo/pytorch-faster-rcnn) | TITAN Xp | 15-17 fps | 6 fps |

[1]: Make sure you install cupy correctly and that only one program runs on the GPU. The training speed is sensitive to your GPU status; see [troubleshooting](troubleshooting) for more info. Moreover, the program is slow at the start because it needs time to warm up.

It could be faster by removing visualization, logging, loss averaging, etc.

## 3. Install dependencies

Requires PyTorch >=0.4.

- Install PyTorch >=0.4 with GPU support (the code is GPU-only); refer to the [official website](http://pytorch.org).
- Install cupy; you can install it via `pip install cupy-cuda80` (or `cupy-cuda90`, `cupy-cuda91`, etc.).
- Install the other dependencies: `pip install -r requirements.txt`
- Optional, but strongly recommended: build the Cython code `nms_gpu_post`:

  ```Bash
  cd model/utils/nms/
  python build.py build_ext --inplace
  cd -
  ```

- Start visdom for visualization:

  ```Bash
  nohup python -m visdom.server &
  ```

## 4. Demo

Download the pretrained model from [Google Drive](https://drive.google.com/open?id=1cQ27LIn-Rig4-Uayzy_gH5-cW-NRGVzY) or [Baidu Netdisk( passwd: scxn)](https://pan.baidu.com/s/1o87RuXW).

See [demo.ipynb](https://github.com/chenyuntc/simple-faster-rcnn-pytorch/blob/master/demo.ipynb) for more detail.

## 5. Train

### 5.1 Prepare data

#### Pascal VOC2007

1. Download the training, validation, and test data and VOCdevkit:

   ```Bash
   wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
   wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
   wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
   ```

2. Extract all of these tars into one directory named `VOCdevkit`:

   ```Bash
   tar xvf VOCtrainval_06-Nov-2007.tar
   tar xvf VOCtest_06-Nov-2007.tar
   tar xvf VOCdevkit_08-Jun-2007.tar
   ```

3. It should have this basic structure:

   ```Bash
   $VOCdevkit/                           # development kit
   $VOCdevkit/VOCcode/                   # VOC utility code
   $VOCdevkit/VOC2007                    # image sets, annotations, etc.
   # ... and several other directories ...
   ```

4. Modify the `voc_data_dir` cfg item in `utils/config.py`, or pass it to the program with an argument like `--voc-data-dir=/path/to/VOCdevkit/VOC2007/`.

### 5.2 Prepare caffe-pretrained vgg16

If you want to use the caffe-pretrained model as the initial weights, you can run the command below to get vgg16 weights converted from caffe, which are the same weights the original paper uses.

```Bash
python misc/convert_caffe_pretrain.py
```

This script downloads the pretrained model and converts it to a format compatible with torchvision. If you are in China and cannot download the pretrained model, you may refer to [this issue](https://github.com/chenyuntc/simple-faster-rcnn-pytorch/issues/63).

Then you can specify where the caffe-pretrained model `vgg16_caffe.pth` is stored by setting `caffe_pretrain_path` in `utils/config.py`. The default path is OK.

If you want to use the pretrained model from torchvision, you may skip this step.

**NOTE**: the caffe pretrained model has shown slightly better performance.

**NOTE**: the caffe model requires images in BGR 0-255, while the torchvision model requires images in RGB and 0-1. See `data/dataset.py` for more detail.

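The two input conventions in the note above can be sketched per pixel as follows. This is an illustration only, not the code in `data/dataset.py`: the function names are hypothetical, the mean and std used for the torchvision path are the commonly used ImageNet statistics (an assumption about what this repository applies), and the Caffe channel means are left as a required argument because their exact values should be taken from `data/dataset.py`.

```python
# Hypothetical helpers for illustration; not taken from data/dataset.py.
# A pixel is an (R, G, B) tuple of floats in the range 0-1.

# Assumption: ImageNet statistics commonly used with torchvision models (RGB order).
TORCHVISION_MEAN = (0.485, 0.456, 0.406)
TORCHVISION_STD = (0.229, 0.224, 0.225)


def torchvision_normalize(pixel_rgb01):
    """Keep RGB order and the 0-1 range, then standardize each channel."""
    return tuple((v - m) / s
                 for v, m, s in zip(pixel_rgb01, TORCHVISION_MEAN, TORCHVISION_STD))


def caffe_normalize(pixel_rgb01, mean_bgr):
    """Reorder RGB to BGR, rescale to 0-255, then subtract per-channel means.

    mean_bgr is supplied by the caller; look up the real values in
    data/dataset.py rather than relying on this sketch.
    """
    r, g, b = pixel_rgb01
    bgr255 = (b * 255.0, g * 255.0, r * 255.0)
    return tuple(v - m for v, m in zip(bgr255, mean_bgr))


# An orange-ish pixel: full red, half green, no blue.
pixel = (1.0, 0.5, 0.0)
as_caffe = caffe_normalize(pixel, mean_bgr=(0.0, 0.0, 0.0))  # zero mean, layout only
as_torchvision = torchvision_normalize(pixel)
```

With a zero mean, `as_caffe` shows only the layout change: the channels come out in blue, green, red order on a 0-255 scale. Feeding a model the wrong convention does not raise an error; it just degrades accuracy, which is why the pretrained weights and the preprocessing must be chosen together.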
### 5.3 Begin training

```Bash
mkdir checkpoints/ # folder for snapshots
```

```bash
python train.py train --env='fasterrcnn-caffe' --plot-every=100 --caffe-pretrain
```

You may refer to `utils/config.py` for more arguments.

Some key arguments:

- `--caffe-pretrain=False`: use the pretrained model from caffe or torchvision (default: torchvision)
- `--plot-every=n`: visualize predictions, loss, etc. every `n` batches
- `--env`: visdom env for visualization
- `--voc_data_dir`: where the VOC data is stored
- `--use-drop`: use dropout in the RoI head, default False
- `--use-Adam`: use Adam instead of SGD, default SGD (you need to set a very low `lr` for Adam)
- `--load-path`: pretrained model path, default `None`; if it's specified, the model is loaded from it

You may open a browser, visit `http://<ip>:8097`, and see the visualization of the training procedure as below:

![visdom](http://7zh43r.com2.z0.glb.clouddn.com/del/visdom-fasterrcnn.png)

## Troubleshooting

- dataloader: `received 0 items of ancdata`

  See the [discussion](https://github.com/pytorch/pytorch/issues/973#issuecomment-346405667). It's already fixed in [train.py](https://github.com/chenyuntc/simple-faster-rcnn-pytorch/blob/master/train.py#L17-L22), so I think you are free from this problem.

- Windows support

  I don't have a Windows machine with a GPU to debug and test it. A pull request from anyone who can test it is welcome.

## More

- [ ] training on coco
- [ ] resnet
- [ ] Maybe: replace cupy with THTensor+cffi?
- [ ] Maybe: convert all numpy code to tensor?
- [x] python2-compatibility

## Acknowledgement

This work builds on many excellent works, which include:

- [Yusuke Niitani's ChainerCV](https://github.com/chainer/chainercv) (mainly)
- [Ruotian Luo's pytorch-faster-rcnn](https://github.com/ruotianluo/pytorch-faster-rcnn), which is based on [Xinlei Chen's tf-faster-rcnn](https://github.com/endernewton/tf-faster-rcnn)
- [faster-rcnn.pytorch by Jianwei Yang and Jiasen Lu](https://github.com/jwyang/faster-rcnn.pytorch), which mainly refers to [longcw's faster_rcnn_pytorch](https://github.com/longcw/faster_rcnn_pytorch)
- All the above repositories have referred to [py-faster-rcnn by Ross Girshick and Sean Bell](https://github.com/rbgirshick/py-faster-rcnn) either directly or indirectly.

## ^_^

Licensed under MIT, see the LICENSE for more de
