# High quality, fast, modular reference implementation of SSD in PyTorch 1.0
This repository implements [SSD (Single Shot MultiBox Detector)](https://arxiv.org/abs/1512.02325). The implementation is heavily influenced by the projects [ssd.pytorch](https://github.com/amdegroot/ssd.pytorch), [pytorch-ssd](https://github.com/qfgaohao/pytorch-ssd) and [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark). This repository aims to be the code base for research based on SSD.
<div align="center">
<img src="figures/004545.jpg" width="500px" />
<p>Example SSD output (vgg_ssd300_voc0712).</p>
</div>
| Losses | Learning rate | Metrics |
| :-----------: |:-------------:| :------:|
| ![losses](figures/losses.png) | ![lr](figures/lr.png) | ![metric](figures/metrics.png) |
## Highlights
- **PyTorch 1.0**: Supports PyTorch 1.0 or higher.
- **Multi-GPU training and inference**: We use `DistributedDataParallel`; you can train or test with an arbitrary number of GPUs, and the training schedule adjusts accordingly.
- **Modular**: Add your own modules without pain. We abstract `backbone`, `Detector`, `BoxHead`, `BoxPredictor`, etc., so you can replace any component with your own code without touching the rest of the code base. For example, to add [EfficientNet](https://github.com/lukemelas/EfficientNet-PyTorch) as a backbone, just add `efficient_net.py` (ALREADY ADDED), register it, and specify it in the config file. Done!
- **CPU support for inference**: runs on the CPU at inference time.
- **Smooth and enjoyable training procedure**: we save the state of the model, optimizer, scheduler, and training iteration, so you can stop training and resume exactly from the saved point without changing your training `CMD`.
- **Batched inference**: can perform inference with multiple images per batch per GPU.
- **Evaluation during training**: evaluate your model every `eval_step` iterations to check whether performance is improving.
- **Metrics visualization**: visualize detailed metrics in TensorBoard, such as AP, APl, APm, and APs for the COCO dataset, or mAP and the 20 per-category APs for the VOC dataset.
- **Auto download**: load pre-trained weights from a URL and cache them.
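The modular design above typically rests on a registry pattern: each component registers itself under a string key, and the config file selects it by that key. The sketch below is illustrative only; the class and names are hypothetical, not this repo's exact API.

```python
# Minimal registry sketch (hypothetical names, not the repo's exact API):
# components self-register under a string key that the config file refers to.
class Registry(dict):
    def register(self, name):
        def decorator(build_fn):
            self[name] = build_fn
            return build_fn
        return decorator

BACKBONES = Registry()

@BACKBONES.register("efficient_net")
def build_efficient_net(cfg):
    # A real build function would construct and return the backbone module.
    return "EfficientNet backbone built from cfg"

# The YAML config names the backbone; the trainer just looks it up by key:
backbone = BACKBONES["efficient_net"](cfg=None)
```

Swapping components then amounts to registering a new build function and changing one string in the config, which is what makes the code base easy to extend.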
## Installation
### Requirements
1. Python3
1. PyTorch 1.0 or higher
1. yacs
1. [Vizer](https://github.com/lufficc/Vizer)
1. GCC >= 4.9
1. OpenCV
### Step-by-step installation
```bash
git clone https://github.com/lufficc/SSD.git
cd SSD
# Required packages: torch torchvision yacs tqdm opencv-python vizer
pip install -r requirements.txt
# Done! That's ALL! No build step, no tedious setup!
# It's recommended to install the latest release of torch and torchvision.
```
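Since the repo requires PyTorch 1.0 or higher, it can be worth checking the installed version before training. The helper below is a stdlib-only sketch (`meets_minimum` is hypothetical, not part of this repo); you would pass it `torch.__version__`.

```python
# Hypothetical helper (not part of this repo): check an installed version
# string against the ">= 1.0" requirement, e.g. meets_minimum(torch.__version__).
def meets_minimum(version, minimum=(1, 0)):
    core = version.split("+")[0]   # drop local build tags like "+cu118"
    parts = core.split(".")[:2]    # compare major.minor only
    return tuple(int(p) for p in parts) >= minimum
```

For example, `meets_minimum("0.4.1")` is `False`, while `meets_minimum("1.0.0")` is `True`.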
## Train
### Setting Up Datasets
#### Pascal VOC
For the Pascal VOC dataset, arrange the folder structure like this:
```
VOC_ROOT
|__ VOC2007
|_ JPEGImages
|_ Annotations
|_ ImageSets
|_ SegmentationClass
|__ VOC2012
|_ JPEGImages
|_ Annotations
|_ ImageSets
|_ SegmentationClass
|__ ...
```
`VOC_ROOT` defaults to the `datasets` folder in the current project; you can either create symlinks under `datasets` or `export VOC_ROOT="/path/to/voc_root"`.
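The environment-variable override described above boils down to a one-line lookup. This is an illustrative sketch, assuming the resolution logic works like this (`resolve_voc_root` is a hypothetical name, not the repo's actual function):

```python
import os

# Sketch of dataset-root resolution (hypothetical helper, not the repo's
# exact code): the VOC_ROOT environment variable wins; otherwise fall back
# to the project-local `datasets` folder.
def resolve_voc_root(default="datasets"):
    return os.environ.get("VOC_ROOT", default)
```

The same pattern applies to `COCO_ROOT` below.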
#### COCO
For the COCO dataset, arrange the folder structure like this:
```
COCO_ROOT
|__ annotations
|_ instances_valminusminival2014.json
|_ instances_minival2014.json
|_ instances_train2014.json
|_ instances_val2014.json
|_ ...
|__ train2014
|_ <im-1-name>.jpg
|_ ...
|_ <im-N-name>.jpg
|__ val2014
|_ <im-1-name>.jpg
|_ ...
|_ <im-N-name>.jpg
|__ ...
```
`COCO_ROOT` defaults to the `datasets` folder in the current project; you can either create symlinks under `datasets` or `export COCO_ROOT="/path/to/coco_root"`.
### Single GPU training
```bash
# for example, train SSD300:
python train.py --config-file configs/vgg_ssd300_voc0712.yaml
```
### Multi-GPU training
```bash
# for example, train SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py --config-file configs/vgg_ssd300_voc0712.yaml SOLVER.WARMUP_FACTOR 0.03333 SOLVER.WARMUP_ITERS 1000
```
The provided configuration files assume training on a single GPU. When changing the number of GPUs, hyper-parameters (lr, max_iter, ...) should also be changed according to this paper: [Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour](https://arxiv.org/abs/1706.02677).
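As a concrete instance of the linear scaling rule from that paper (the numbers here are hypothetical, not taken from the provided configs): when the effective batch size grows by a factor k, multiply the base learning rate by k and ramp it up over the warmup iterations.

```python
# Worked example of the linear scaling rule (illustrative numbers only):
# effective batch size = num_gpus * per_gpu_batch_size, and the learning
# rate scales by the same factor k relative to the single-GPU baseline.
def scaled_lr(base_lr, base_batch_size, num_gpus, per_gpu_batch_size):
    k = (num_gpus * per_gpu_batch_size) / base_batch_size
    return base_lr * k

# A base lr of 1e-3 tuned for batch size 32 on one GPU becomes 4e-3 with
# 4 GPUs at 32 images each (effective batch size 128):
# scaled_lr(1e-3, 32, 4, 32) -> 0.004
```

The warmup options in the multi-GPU command above (`SOLVER.WARMUP_FACTOR`, `SOLVER.WARMUP_ITERS`) exist precisely to ease into this larger learning rate.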
## Evaluate
### Single GPU evaluating
```bash
# for example, evaluate SSD300:
python test.py --config-file configs/vgg_ssd300_voc0712.yaml
```
### Multi-GPU evaluating
```bash
# for example, evaluate SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS test.py --config-file configs/vgg_ssd300_voc0712.yaml
```
## Demo
Predicting images in a folder is simple:
```bash
python demo.py --config-file configs/vgg_ssd300_voc0712.yaml --images_dir demo --ckpt https://github.com/lufficc/SSD/releases/download/1.2/vgg_ssd300_voc0712.pth
```
It will download and cache `vgg_ssd300_voc0712.pth` automatically, and the predicted images with boxes, scores, and label names will be saved to the `demo/result` folder by default.
You will see a similar output:
```text
(0001/0005) 004101.jpg: objects 01 | load 010ms | inference 033ms | FPS 31
(0002/0005) 003123.jpg: objects 05 | load 009ms | inference 019ms | FPS 53
(0003/0005) 000342.jpg: objects 02 | load 009ms | inference 019ms | FPS 51
(0004/0005) 008591.jpg: objects 02 | load 008ms | inference 020ms | FPS 50
(0005/0005) 000542.jpg: objects 01 | load 011ms | inference 019ms | FPS 53
```
## MODEL ZOO
### Original Paper:
| | VOC2007 test | COCO test-dev2015 |
| :-----: | :----------: | :----------: |
| SSD300* | 77.2 | 25.1 |
| SSD512* | 79.8 | 28.8 |
### COCO:
| Backbone | Input Size | box AP | Model Size | Download |
| :------------: | :----------:| :--------------------------: | :--------: | :-------: |
| VGG16 | 300 | 25.2 | 262MB | [model](https://github.com/lufficc/SSD/releases/download/1.2/vgg_ssd300_coco_trainval35k.pth) |
| VGG16 | 512 | 29.0 | 275MB | [model](https://github.com/lufficc/SSD/releases/download/1.2/vgg_ssd512_coco_trainval35k.pth) |
### PASCAL VOC:
| Backbone | Input Size | mAP | Model Size | Download |
| :--------------: | :----------:| :--------------------------: | :--------: | :-------: |
| VGG16 | 300 | 77.7 | 201MB | [model](https://github.com/lufficc/SSD/releases/download/1.2/vgg_ssd300_voc0712.pth) |
| VGG16 | 512 | 80.7 | 207MB | [model](https://github.com/lufficc/SSD/releases/download/1.2/vgg_ssd512_voc0712.pth) |
| Mobilenet V2 | 320 | 68.9 | 25.5MB | [model](https://github.com/lufficc/SSD/releases/download/1.2/mobilenet_v2_ssd320_voc0712_v2.pth) |
| Mobilenet V3 | 320 | 69.5 | 29.9MB | [model](https://github.com/lufficc/SSD/releases/download/1.2/mobilenet_v3_ssd320_voc0712.pth) |
| EfficientNet-B3 | 300 | 73.9 | 97.1MB | [model](https://github.com/lufficc/SSD/releases/download/1.2/efficient_net_b3_ssd300_voc0712.pth) |
## Develop Guide
If you want to add your custom components, please see [DEVELOP_GUIDE.md](DEVELOP_GUIDE.md) for more details.
## Troubleshooting
If you have issues running or compiling this code, we have compiled a list of common issues in [TROUBLESHOOTING.md](TROUBLESHOOTING.md). If your issue is not present there, please feel free to open a new issue.
## Citations
If you use this project in your research, please cite this project.
```text
@misc{lufficc2018ssd,
author = {Congcong Li},
title = {{High quality, fast, modular reference implementation of SSD in PyTorch}},
year = {2018},
howpublished = {\url{https://github.com/lufficc/SSD}}
}
```