# FCN.tensorflow
Tensorflow implementation of [Fully Convolutional Networks for Semantic Segmentation](http://arxiv.org/pdf/1605.06211v1.pdf) (FCNs).
The implementation is largely based on the reference code provided by the authors of the paper ([link](https://github.com/shelhamer/fcn.berkeleyvision.org)). The model was applied to the Scene Parsing Challenge dataset provided by MIT ([http://sceneparsing.csail.mit.edu/](http://sceneparsing.csail.mit.edu/)).
1. [Prerequisites](#prerequisites)
2. [Results](#results)
3. [Observations](#observations)
4. [Useful links](#useful-links)
## Prerequisites
- The results were obtained after training for ~6-7 hrs on a 12GB TitanX.
- The code was originally written and tested with `tensorflow0.11` and `python2.7`.
The `tf.summary` calls have been updated to work with TensorFlow 0.12. To work with older versions of TensorFlow, use the branch [tf.0.11_compatible](https://github.com/shekkizh/FCN.tensorflow/tree/tf.0.11_compatible).
- Some of the problems encountered while working with tensorflow1.0 and on Windows are discussed in [Issue #9](https://github.com/shekkizh/FCN.tensorflow/issues/9).
- To train the model, simply execute `python FCN.py`.
- To visualize results for a random batch of images, use the flag `--mode=visualize`.
- The `debug` flag can be set during training to log information about activations, gradients, variables, etc.
- The [IPython notebook](https://github.com/shekkizh/FCN.tensorflow/blob/master/logs/images/Image_Cmaped.ipynb) in the logs folder can be used to view results in color, as shown below.
## Results
Results were obtained by training the model with a batch size of 2 on images resized to 256x256. Note that although training is done at this image size, nothing prevents the model from working on arbitrarily sized images: the network is fully convolutional, so no layer's weights depend on the spatial extent of the input. No post-processing was done on the predicted images. Training ran for 9 epochs; the short training time explains why certain concepts seem semantically understood by the model while others do not. The results below are from randomly chosen images in the validation dataset.
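The arbitrary-input-size property follows from every layer being a convolution (or pooling) whose weights are independent of the input's spatial extent. A minimal numpy sketch (purely illustrative, not code from this repo) of a "SAME" convolution makes this concrete: the same 3x3 kernel applies unchanged to any input size.

```python
import numpy as np

def conv2d_same(x, w):
    """Naive 'SAME' 2-D convolution: output has the spatial size of x."""
    kh, kw = w.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))      # zero-pad so output size == input size
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = (xp[i:i + kh, j:j + kw] * w).sum()
    return out

w = np.ones((3, 3)) / 9.0                      # one fixed kernel, any input size
print(conv2d_same(np.ones((8, 8)), w).shape)   # (8, 8)
print(conv2d_same(np.ones((16, 16)), w).shape) # (16, 16)
```

A fully connected layer, by contrast, has a weight matrix whose shape is tied to a fixed input size, which is exactly what the FCN paper removes by converting such layers to convolutions.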
The network design closely follows the reference Caffe implementation of the paper. The weights for the newly added layers were initialized with small values, and training used the Adam optimizer (learning rate = 1e-4).
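For reference, the authors' Caffe implementation initializes its upsampling (deconvolution) layers with bilinear interpolation kernels; whether or not this repo does exactly the same, the construction is worth seeing. A hypothetical numpy sketch of that kernel:

```python
import numpy as np

def bilinear_upsample_kernel(factor, channels):
    """Build a (k, k, channels, channels) kernel that performs bilinear
    upsampling by `factor` when used as a transposed-convolution filter."""
    k = 2 * factor - factor % 2                   # kernel size for the given stride
    center = factor - 1 if k % 2 == 1 else factor - 0.5
    og = np.ogrid[:k, :k]                         # row/column index grids
    filt = (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)
    weights = np.zeros((k, k, channels, channels), dtype=np.float32)
    for c in range(channels):
        weights[:, :, c, c] = filt                # identity mapping across channels
    return weights

w = bilinear_upsample_kernel(2, 3)
print(w.shape)  # (4, 4, 3, 3)
```

Each spatial slice sums to `factor**2`, so after the transposed convolution's stride the output is a smooth bilinear interpolation of the input rather than random noise.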
![](logs/images/inp_1.png) ![](logs/images/gt_c1.png) ![](logs/images/pred_c1.png)
![](logs/images/inp_2.png) ![](logs/images/gt_c2.png) ![](logs/images/pred_c2.png)
![](logs/images/inp_3.png) ![](logs/images/gt_c3.png) ![](logs/images/pred_c3.png)
![](logs/images/inp_4.png) ![](logs/images/gt_c4.png) ![](logs/images/pred_c4.png)
![](logs/images/inp_6.png) ![](logs/images/gt_c6.png) ![](logs/images/pred_c6.png)
## Observations
- The small batch size was necessary to fit the training model in memory, but it also explains the slow learning.
- Concepts that had many training examples seem to be correctly identified and segmented; in the examples above you can see that cars and people are identified comparatively well. I believe the remaining cases can be addressed by training for more epochs.
- Resizing the images also causes a loss of information; you can see this in the fact that smaller objects are segmented with less accuracy.
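The information loss from resizing is easy to see in the label maps themselves. A naive strided 4x downsample (a crude stand-in for the actual resize, purely for illustration) can delete a small object outright:

```python
import numpy as np

# A tiny object in a 16x16 segmentation label map.
mask = np.zeros((16, 16), dtype=int)
mask[7, 3] = 1                 # small object: a single labeled pixel

# Naive 4x downsample by striding, standing in for image resizing.
small = mask[::4, ::4]
print(small.sum())             # 0 -- the object has vanished entirely
```

Real resizing interpolates rather than strides, but the effect is the same in kind: objects only a few pixels wide at the reduced resolution contribute little or no signal for the model to learn from.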
![](logs/images/sparse_entropy.png)
Now for the gradients:
- If you watch the gradients closely, you will notice that the initial training happens almost entirely in the newly added layers; only after these layers are reasonably trained do the VGG layers see some gradient flow. This is understandable, since changes in the new layers affect the loss objective much more at the beginning.
- The earlier layers of the network are initialized with VGG weights and so conceptually require less tuning, unless the training data is extremely varied, which in this case it is not.
- The first convolutional layer captures low-level information, and since this is entirely dataset dependent, you can see the gradients adjusting the first-layer weights to adapt the model to the dataset.
- The other conv layers from VGG have very small gradients flowing through them, as the concepts they capture are good enough for our end objective, segmentation.
- This is the core reason **transfer learning** works so well. Just thought it was worth pointing out while here.
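The gradient-flow observation can be reproduced in a toy two-layer linear model (pure numpy, purely illustrative): the gradient reaching a pretrained layer is scaled by the new head's weights, so when the head is initialized with small values, almost all the early gradient signal lands on the head.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: a "pretrained" layer (like a VGG conv) followed by a
# freshly added head initialized with small values, as in the repo.
W_pre = rng.normal(size=(8, 4)) / np.sqrt(8)   # pretrained layer
W_new = 0.01 * rng.normal(size=(4, 2))         # new layer, small init

x = rng.normal(size=(16, 8))
y = rng.normal(size=(16, 2))

h = x @ W_pre                                  # pretrained features
err = h @ W_new - y                            # d(loss)/d(prediction), up to a constant

g_new = h.T @ err                              # gradient on the new layer
g_pre = x.T @ (err @ W_new.T)                  # gradient on the pretrained layer

# g_pre is scaled by the tiny W_new, so early training is dominated by g_new.
print(np.abs(g_new).mean() / np.abs(g_pre).mean())
```

Only once `W_new` grows to reasonable magnitudes does meaningful gradient flow back into `W_pre`, matching what the gradient plots below show for the VGG layers.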
![](logs/images/conv_1_1_gradient.png) ![](logs/images/conv_4_1_gradient.png) ![](logs/images/conv_4_2_gradient.png) ![](logs/images/conv_4_3_gradient.png)
## Useful Links
- Video of the presentation given by the authors on the paper - [link](http://techtalks.tv/talks/fully-convolutional-networks-for-semantic-segmentation/61606/)