# High quality, fast, modular reference implementation of SSD in PyTorch 1.0
This repository implements [SSD (Single Shot MultiBox Detector)](https://arxiv.org/abs/1512.02325). The implementation is heavily influenced by the projects [ssd.pytorch](https://github.com/amdegroot/ssd.pytorch), [pytorch-ssd](https://github.com/qfgaohao/pytorch-ssd) and [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark). This repository aims to be the code base for research based on SSD.
<div align="center">
<img src="figures/004545.jpg" width="500px" />
<p>Example SSD output (vgg_ssd300_voc0712).</p>
</div>
| Losses | Learning rate | Metrics |
| :-----------: |:-------------:| :------:|
| ![losses](figures/losses.png) | ![lr](figures/lr.png) | ![metric](figures/metrics.png) |
## Highlights
- **PyTorch 1.0**: Supports PyTorch 1.0 or higher.
- **Multi-GPU training and inference**: We use `DistributedDataParallel`, so you can train or test with an arbitrary number of GPUs; the training schedule changes accordingly.
- **Modular**: Add your own modules without pain. We abstract `backbone`, `Detector`, `BoxHead`, `BoxPredictor`, etc. You can replace every component with your own code without changing the code base. For example, to use [EfficientNet](https://github.com/lukemelas/EfficientNet-PyTorch) as a backbone, add `efficient_net.py` (already added), register it, and specify it in the config file.
- **CPU support for inference**: Runs on CPU at inference time.
- **Smooth and enjoyable training procedure**: We save the state of the model, optimizer, scheduler and training iteration, so you can stop training and resume exactly from the save point without changing your training command.
- **Batched inference**: Can perform inference using multiple images per batch per GPU.
- **Evaluating during training**: Evaluate your model every `eval_step` to check whether performance is improving.
- **Metrics visualization**: Visualize metric details in TensorBoard, such as AP, APl, APm and APs for the COCO dataset, or mAP and the per-category AP of all 20 categories for the VOC dataset.
- **Auto download**: Loads pre-trained weights from a URL and caches them.
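The "register it, then name it in the config file" workflow from the **Modular** highlight can be illustrated with a small registry pattern. This is only a sketch of the general idea: `Registry`, `BACKBONES`, `toy_backbone` and `build_backbone` are hypothetical names chosen for illustration and are not taken from this code base, and a plain dict stands in for the config object and the model.

```python
# Minimal registry sketch. All names here are hypothetical, not the project's API.
class Registry(dict):
    """Maps a string name to a builder function."""

    def register(self, name):
        def decorator(fn):
            if name in self:
                raise KeyError(f"'{name}' is already registered")
            self[name] = fn
            return fn
        return decorator


BACKBONES = Registry()


@BACKBONES.register("toy_backbone")
def toy_backbone(cfg):
    # A real builder would return an nn.Module; a dict stands in for it here.
    return {"name": "toy_backbone", "out_channels": cfg["out_channels"]}


def build_backbone(cfg):
    # The config names the component; the registry looks up its builder.
    return BACKBONES[cfg["backbone"]](cfg)


model = build_backbone({"backbone": "toy_backbone", "out_channels": (512, 1024)})
print(model["name"])
```

With this kind of design, adding a component means adding one decorated function in a new file; the code that builds the detector does not need to change.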
## Installation
### Requirements
1. Python3
1. PyTorch 1.0 or higher
1. yacs
1. [Vizer](https://github.com/lufficc/Vizer)
1. GCC >= 4.9
1. OpenCV
### Step-by-step installation
```bash
git clone https://github.com/lufficc/SSD.git
cd SSD
# Required packages
pip install torch torchvision yacs tqdm opencv-python vizer
# Optional packages
# To visualize the loss curve (enabled by default; disable with --use_tensorboard 0 when training):
pip install tensorboardX
# To train on the COCO dataset, you must install cocoapi:
cd ~/github
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
python setup.py build_ext install
```
### Build
If your torchvision version is >= 0.3.0, building nms is not needed. We also provide a Python nms, but it is much slower than the compiled version.
```bash
# For faster inference you need to build nms. It is needed for evaluation; training alone does not require it.
cd ext
python build.py build_ext develop
```
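To show what the nms step does and why a Python version is slow, here is a minimal pure-Python sketch of greedy non-maximum suppression. It is not the implementation shipped in `ext` or in this repository; the function names and the `(x1, y1, x2, y2)` box format are assumptions made for the example. The pairwise loop is quadratic in the number of boxes, which is the kind of work a compiled version speeds up.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.5)
print(kept)  # the second box overlaps the first (IoU is about 0.68) and is dropped
```

With torchvision >= 0.3.0, `torchvision.ops.nms(boxes, scores, iou_threshold)` performs the same operation on tensors.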
## Train
### Setting Up Datasets
#### Pascal VOC
For Pascal VOC dataset, make the folder structure like this:
```
VOC_ROOT
|__ VOC2007
    |_ JPEGImages
    |_ Annotations
    |_ ImageSets
    |_ SegmentationClass
|__ VOC2012
    |_ JPEGImages
    |_ Annotations
    |_ ImageSets
    |_ SegmentationClass
|__ ...
```
`VOC_ROOT` defaults to the `datasets` folder in the current project. You can either symlink your data into `datasets` or run `export VOC_ROOT="/path/to/voc_root"`.
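Before starting a long training run, it can help to check that the dataset root resolves as expected and that the folders shown above exist. The helper below is a hypothetical sketch, not part of this repository: `resolve_voc_root` and `missing_voc_dirs` are made-up names, and reading `VOC_ROOT` with a fallback to `datasets` is an assumption based on the description above.

```python
import os


def resolve_voc_root(default="datasets"):
    # Assumption: the VOC_ROOT environment variable overrides the default folder.
    return os.environ.get("VOC_ROOT", default)


def missing_voc_dirs(root, years=("VOC2007", "VOC2012")):
    """List the expected sub-folders (relative to root) that do not exist."""
    expected = ("JPEGImages", "Annotations", "ImageSets", "SegmentationClass")
    return [
        os.path.join(year, sub)
        for year in years
        for sub in expected
        if not os.path.isdir(os.path.join(root, year, sub))
    ]


root = resolve_voc_root()
print("VOC root:", root)
print("missing:", missing_voc_dirs(root))
```

An empty `missing` list means the layout matches the tree above for the listed years.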
#### COCO
For COCO dataset, make the folder structure like this:
```
COCO_ROOT
|__ annotations
    |_ instances_valminusminival2014.json
    |_ instances_minival2014.json
    |_ instances_train2014.json
    |_ instances_val2014.json
    |_ ...
|__ train2014
    |_ <im-1-name>.jpg
    |_ ...
    |_ <im-N-name>.jpg
|__ val2014
    |_ <im-1-name>.jpg
    |_ ...
    |_ <im-N-name>.jpg
|__ ...
```
`COCO_ROOT` defaults to the `datasets` folder in the current project. You can either symlink your data into `datasets` or run `export COCO_ROOT="/path/to/coco_root"`.
### Single GPU training
```bash
# for example, train SSD300:
python train.py --config-file configs/vgg_ssd300_voc0712.yaml
```
### Multi-GPU training
```bash
# for example, train SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py --config-file configs/vgg_ssd300_voc0712.yaml SOLVER.WARMUP_FACTOR 0.03333 SOLVER.WARMUP_ITERS 1000
```
The provided configuration files assume training on a single GPU. When the number of GPUs changes, the hyper-parameters (lr, max_iter, ...) are adjusted accordingly, following the paper [Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour](https://arxiv.org/abs/1706.02677).
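The adjustment follows the linear scaling rule from that paper: when the effective batch size grows by a factor of k, the learning rate is multiplied by k and the schedule is shortened by k. The sketch below shows the arithmetic only. It assumes the per-GPU batch size stays fixed, so k equals the number of GPUs; `scale_for_gpus` is a hypothetical helper, the exact set of parameters this project rescales is an assumption, and the base values are placeholders rather than values from the provided configs.

```python
def scale_for_gpus(base_lr, base_max_iter, base_lr_steps, num_gpus):
    """Linear scaling rule: k GPUs -> k times the lr, 1/k of the iterations."""
    lr = base_lr * num_gpus
    max_iter = base_max_iter // num_gpus
    # Learning-rate decay milestones shrink together with the schedule length.
    lr_steps = [step // num_gpus for step in base_lr_steps]
    return lr, max_iter, lr_steps


# Placeholder single-GPU values, chosen only to make the arithmetic easy to follow.
lr, max_iter, lr_steps = scale_for_gpus(
    base_lr=1e-3, base_max_iter=120000, base_lr_steps=[80000, 100000], num_gpus=4
)
print(lr, max_iter, lr_steps)
```

A large learning rate at the start of training can be unstable, which is why the multi-GPU command above also sets warmup options (`SOLVER.WARMUP_FACTOR`, `SOLVER.WARMUP_ITERS`).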
## Evaluate
### Single GPU evaluating
```bash
# for example, evaluate SSD300:
python test.py --config-file configs/vgg_ssd300_voc0712.yaml
```
### Multi-GPU evaluating
```bash
# for example, evaluate SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS test.py --config-file configs/vgg_ssd300_voc0712.yaml
```
## Demo
Predicting the images in a folder is simple:
```bash
python demo.py --config-file configs/vgg_ssd300_voc0712.yaml --images_dir demo --ckpt https://github.com/lufficc/SSD/releases/download/1.2/vgg_ssd300_voc0712.pth
```
This downloads and caches `vgg_ssd300_voc0712.pth` automatically. Predicted images with boxes, scores and label names are saved to the `demo/result` folder by default.
You will see a similar output:
```text
(0001/0005) 004101.jpg: objects 01 | load 010ms | inference 033ms | FPS 31
(0002/0005) 003123.jpg: objects 05 | load 009ms | inference 019ms | FPS 53
(0003/0005) 000342.jpg: objects 02 | load 009ms | inference 019ms | FPS 51
(0004/0005) 008591.jpg: objects 02 | load 008ms | inference 020ms | FPS 50
(0005/0005) 000542.jpg: objects 01 | load 011ms | inference 019ms | FPS 53
```
## MODEL ZOO
### Original Paper:
| | VOC2007 test | coco test-dev2015 |
| :-----: | :----------: | :----------: |
| SSD300* | 77.2 | 25.1 |
| SSD512* | 79.8 | 28.8 |
### COCO:
| Backbone | Input Size | box AP | Model Size | Download |
| :------------: | :----------:| :--------------------------: | :--------: | :-------: |
| VGG16 | 300 | 25.2 | 262MB | [model](https://github.com/lufficc/SSD/releases/download/1.2/vgg_ssd300_coco_trainval35k.pth) |
| VGG16 | 512 | 29.0 | 275MB | [model](https://github.com/lufficc/SSD/releases/download/1.2/vgg_ssd512_coco_trainval35k.pth) |
### PASCAL VOC:
| Backbone | Input Size | mAP | Model Size | Download |
| :--------------: | :----------:| :--------------------------: | :--------: | :-------: |
| VGG16 | 300 | 77.7 | 201MB | [model](https://github.com/lufficc/SSD/releases/download/1.2/vgg_ssd300_voc0712.pth) |
| VGG16 | 512 | 80.7 | 207MB | [model](https://github.com/lufficc/SSD/releases/download/1.2/vgg_ssd512_voc0712.pth) |
| Mobilenet V2 | 320 | 68.9 | 25.5MB | [model](https://github.com/lufficc/SSD/releases/download/1.2/mobilenet_v2_ssd320_voc0712_v2.pth) |
| EfficientNet-B3 | 300 | 73.9 | 97.1MB | [model](https://github.com/lufficc/SSD/releases/download/1.2/efficient_net_b3_ssd300_voc0712.pth) |
## Develop Guide
If you want to add your custom components, please see [DEVELOP_GUIDE.md](DEVELOP_GUIDE.md) for more details.
## Troubleshooting
If you have issues running or compiling this code, we have compiled a list of common issues in [TROUBLESHOOTING.md](TROUBLESHOOTING.md). If your issue is not present there, please feel free t