# Fully Convolutional Networks for Semantic Segmentation
This is the reference implementation of the models and code for the fully convolutional networks (FCNs) in the [PAMI FCN](https://arxiv.org/abs/1605.06211) and [CVPR FCN](http://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Long_Fully_Convolutional_Networks_2015_CVPR_paper.html) papers:
    Fully Convolutional Networks for Semantic Segmentation
    Evan Shelhamer*, Jonathan Long*, Trevor Darrell
    PAMI 2016
    arXiv:1605.06211

    Fully Convolutional Networks for Semantic Segmentation
    Jonathan Long*, Evan Shelhamer*, Trevor Darrell
    CVPR 2015
    arXiv:1411.4038
**Note that this is a work in progress and the final, reference version is coming soon.**
Please ask Caffe and FCN usage questions on the [caffe-users mailing list](https://groups.google.com/forum/#!forum/caffe-users).
Refer to [these slides](https://docs.google.com/presentation/d/10XodYojlW-1iurpUsMoAZknQMS36p7lVIfFZ-Z7V_aY/edit?usp=sharing) for a summary of the approach.
These models are compatible with `BVLC/caffe:master`.
Compatibility has held since `master@8c66fa5` with the merge of PRs #3613 and #3570.
The code and models here are available under the same license as Caffe (BSD-2) and the Caffe-bundled models (that is, unrestricted use; see the [BVLC model license](http://caffe.berkeleyvision.org/model_zoo.html#bvlc-model-license)).
**PASCAL VOC models**: trained online with high momentum for a ~5 point boost in mean intersection-over-union over the original models.
These models are trained using extra data from [Hariharan et al.](http://www.cs.berkeley.edu/~bharath2/codes/SBD/download.html), but excluding SBD val.
FCN-32s is fine-tuned from the [ILSVRC-trained VGG-16 model](https://github.com/BVLC/caffe/wiki/Model-Zoo#models-used-by-the-vgg-team-in-ilsvrc-2014), and the finer strides are then fine-tuned in turn.
The "at-once" FCN-8s is fine-tuned from VGG-16 all-at-once by scaling the skip connections to better condition optimization.
* [FCN-32s PASCAL](voc-fcn32s): single stream, 32 pixel prediction stride net, scoring 63.6 mIU on seg11valid
* [FCN-16s PASCAL](voc-fcn16s): two stream, 16 pixel prediction stride net, scoring 65.0 mIU on seg11valid
* [FCN-8s PASCAL](voc-fcn8s): three stream, 8 pixel prediction stride net, scoring 65.5 mIU on seg11valid and 67.2 mIU on seg12test
* [FCN-8s PASCAL at-once](voc-fcn8s-atonce): all-at-once, three stream, 8 pixel prediction stride net, scoring 65.4 mIU on seg11valid
[FCN-AlexNet PASCAL](voc-fcn-alexnet): AlexNet (CaffeNet) architecture, single stream, 32 pixel prediction stride net, scoring 48.0 mIU on seg11valid.
Unlike the FCN-32/16/8s models, this network is trained with gradient accumulation, normalized loss, and standard momentum.
(Note: when both FCN-32s/FCN-VGG16 and FCN-AlexNet are trained in this same way FCN-VGG16 is far better; see Table 1 of the paper.)
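In case the training distinction is unclear: gradient accumulation with a normalized loss amounts to averaging per-image gradients over a virtual batch before a single parameter update. A toy sketch (scalar "gradients" and values are illustrative, not the repo's training code):

```python
# Toy scalar gradients for a virtual batch of 4 images.
per_image_grads = [0.2, -0.4, 0.6, 0.0]
lr = 0.1
weight = 1.0

# Accumulate across the virtual batch, normalize by its size, then
# take one update step (in Caffe this corresponds to accumulating
# over several iterations before a single solver step).
accum = sum(per_image_grads)
step = lr * accum / len(per_image_grads)  # normalized loss -> mean gradient
weight -= step
print(weight)  # -> 0.99
```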
To reproduce the validation scores, use the [seg11valid](https://github.com/shelhamer/fcn.berkeleyvision.org/blob/master/data/pascal/seg11valid.txt) split defined by the paper in footnote 7. Since SBD train and PASCAL VOC 2011 segval intersect, we only evaluate on the non-intersecting set for validation purposes.
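Deriving the non-intersecting split is a simple set difference over image IDs; a minimal sketch with toy IDs (in practice the lists come from the SBD train and VOC 2011 segval files):

```python
# Toy image-ID sets standing in for SBD train and PASCAL VOC 2011 segval.
sbd_train = {"2007_000032", "2007_000033", "2007_000042"}
voc_segval = {"2007_000033", "2007_000042", "2007_000061", "2007_000123"}

# Keep only segval images that never appear in SBD train, so validation
# is not contaminated by training data.
seg11valid = sorted(voc_segval - sbd_train)
print(seg11valid)  # -> ['2007_000061', '2007_000123']
```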
**NYUDv2 models**: trained online with high momentum on color, depth, and HHA features (from Gupta et al. https://github.com/s-gupta/rcnn-depth).
These models demonstrate FCNs for multi-modal input.
* [FCN-32s NYUDv2 Color](nyud-fcn32s-color): single stream, 32 pixel prediction stride net on color/BGR input
* [FCN-32s NYUDv2 HHA](nyud-fcn32s-hha): single stream, 32 pixel prediction stride net on HHA input
* [FCN-32s NYUDv2 Early Color-Depth](nyud-fcn32s-color-d): single stream, 32 pixel prediction stride net on early fusion of color and (log) depth for 4-channel input
* [FCN-32s NYUDv2 Late Color-HHA](nyud-fcn32s-color-hha): single stream, 32 pixel prediction stride net by late fusion of FCN-32s NYUDv2 Color and FCN-32s NYUDv2 HHA
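Early fusion simply stacks (log) depth as a fourth input channel ahead of the first convolution; a minimal numpy sketch (shapes and values are illustrative):

```python
import numpy as np

# Hypothetical color image (BGR) and depth map for one frame, C x H x W.
color = np.zeros((3, 480, 640), dtype=np.float32)
depth = np.ones((480, 640), dtype=np.float32)  # metric depth, meters

# Early fusion: append log depth as a fourth channel for 4-channel input.
log_depth = np.log(depth)[np.newaxis, ...]          # 1 x H x W
fused = np.concatenate([color, log_depth], axis=0)  # 4 x H x W
assert fused.shape == (4, 480, 640)
```

The first-layer filters then need a fourth input channel; the depth channel's weights can be initialized from the mean of the pretrained color weights.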
**SIFT Flow models**: trained online with high momentum for joint semantic class and geometric class segmentation.
These models demonstrate FCNs for multi-task output.
* [FCN-32s SIFT Flow](siftflow-fcn32s): single stream, 32 pixel prediction stride net
* [FCN-16s SIFT Flow](siftflow-fcn16s): two stream, 16 pixel prediction stride net
* [FCN-8s SIFT Flow](siftflow-fcn8s): three stream, 8 pixel prediction stride net
*Note*: in this release, the evaluation of the semantic classes is not yet correct due to an issue with missing classes; this will be fixed soon.
The evaluation of the geometric classes is unaffected.
**PASCAL-Context models**: trained online with high momentum on an object and scene labeling of PASCAL VOC.
* [FCN-32s PASCAL-Context](pascalcontext-fcn32s): single stream, 32 pixel prediction stride net
* [FCN-16s PASCAL-Context](pascalcontext-fcn16s): two stream, 16 pixel prediction stride net
* [FCN-8s PASCAL-Context](pascalcontext-fcn8s): three stream, 8 pixel prediction stride net
## Frequently Asked Questions
**Is learning the interpolation necessary?** In our original experiments the interpolation layers were initialized to bilinear kernels and then learned.
In follow-up experiments, and this reference implementation, the bilinear kernels are fixed.
There is no significant difference in accuracy in our experiments, and fixing these parameters gives a slight speed-up.
Note that in our networks there is only one interpolation kernel per output class, and results may differ for higher-dimensional and non-linear interpolation, for which learning may help further.
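For reference, the standard bilinear interpolation filter used for this initialization can be constructed as follows (a sketch equivalent in spirit to the repo's upsampling setup, not copied from it):

```python
import numpy as np

def bilinear_kernel(size):
    """Return a size x size bilinear interpolation filter."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return ((1 - abs(og[0] - center) / factor) *
            (1 - abs(og[1] - center) / factor))

# A stride-2 deconvolution typically uses a 4x4 kernel; each 1-D profile
# is [0.25, 0.75, 0.75, 0.25] and the 2-D filter is their outer product,
# so upsampled pixels are convex blends of their neighbors.
k = bilinear_kernel(4)
```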
**Why pad the input?** The 100 pixel input padding guarantees that the network output can be aligned to the input for any input size in the given datasets, for instance PASCAL VOC.
The alignment is handled automatically by net specification and the crop layer.
It is possible, though less convenient, to calculate the exact offsets necessary and do away with this amount of padding.
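That offset bookkeeping is just repeated application of the usual layer output-size formula; a minimal sketch (conv convention shown; note that Caffe pooling actually rounds up rather than down, which this sketch ignores):

```python
import math

def out_size(n, kernel, stride=1, pad=0):
    """Spatial output size of one conv/pool layer."""
    return math.floor((n + 2 * pad - kernel) / stride) + 1

# Illustrative layer settings, not a full VGG-16 trace: a 3x3/pad-1 conv
# preserves the spatial size, while a 2x2/stride-2 pool halves it.
n = 500
n = out_size(n, kernel=3, stride=1, pad=1)  # conv: 500
n = out_size(n, kernel=2, stride=2)         # pool: 250
print(n)  # -> 250
```

Composing this formula through the whole net gives the exact crop offsets that the 100 pixel padding otherwise papers over.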
**Why are all the outputs/gradients/parameters zero?** This is almost universally due to not initializing the weights as needed.
To reproduce our FCN training, or train your own FCNs, it is crucial to transplant the weights from the corresponding ILSVRC net such as VGG16.
The included `surgery.transplant()` method can help with this.
**What about FCN-GoogLeNet?** A reference FCN-GoogLeNet for PASCAL VOC is coming soon.