# FCN.tensorflow
Tensorflow implementation of [Fully Convolutional Networks for Semantic Segmentation](https://arxiv.org/pdf/1605.06211v1.pdf) (FCNs).
The implementation is largely based on the reference code provided by the authors of the paper [link](https://github.com/shelhamer/fcn.berkeleyvision.org). The model was applied on the Scene Parsing Challenge dataset provided by MIT [http://sceneparsing.csail.mit.edu/](http://sceneparsing.csail.mit.edu/).
1. [Prerequisites](#prerequisites)
2. [Results](#results)
3. [Observations](#observations)
4. [Useful links](#useful-links)
## Prerequisites
- The results were obtained after training for ~6-7 hrs on a 12GB TitanX.
- The code was originally written and tested with `tensorflow0.11` and `python2.7`. The `tf.summary` calls have since been updated to work with TensorFlow 0.12. To work with older versions of TensorFlow, use the branch [tf.0.11_compatible](https://github.com/shekkizh/FCN.tensorflow/tree/tf.0.11_compatible).
- Some of the problems encountered with TensorFlow 1.0 and on Windows are discussed in [Issue #9](https://github.com/shekkizh/FCN.tensorflow/issues/9).
- To train the model, simply run `python FCN.py`
- To visualize results for a random batch of images, use the flag `--mode=visualize`
- The `debug` flag can be set during training to log information about activations, gradients, variables, etc.
- The [IPython notebook](https://github.com/shekkizh/FCN.tensorflow/blob/master/logs/images/Image_Cmaped.ipynb) in the logs folder can be used to view the results in color, as shown below.
## Results
Results were obtained by training the model with a batch size of 2 on images resized to 256x256. Note that although training is done at this image size, nothing prevents the fully convolutional model from working on arbitrarily sized images. No post-processing was done on the predicted images. Training ran for 9 epochs; the short training time explains why some concepts seem semantically understood by the model while others do not. The results below are from randomly chosen images in the validation dataset.

The network design is essentially the same as in the authors' reference Caffe implementation. The weights for the newly added layers were initialized with small values, and training used the Adam optimizer (learning rate = 1e-4).
![](logs/images/inp_1.png) ![](logs/images/gt_c1.png) ![](logs/images/pred_c1.png)
![](logs/images/inp_2.png) ![](logs/images/gt_c2.png) ![](logs/images/pred_c2.png)
![](logs/images/inp_3.png) ![](logs/images/gt_c3.png) ![](logs/images/pred_c3.png)
![](logs/images/inp_4.png) ![](logs/images/gt_c4.png) ![](logs/images/pred_c4.png)
![](logs/images/inp_6.png) ![](logs/images/gt_c6.png) ![](logs/images/pred_c6.png)
## Observations
- The small batch size was necessary to fit the training model in memory, but it also explains the slow learning.
- Concepts with many training examples seem to be identified and segmented correctly: in the examples above, cars and people are identified better. I believe this can be improved by training for more epochs.
- Resizing the images also causes loss of information; you can see this in the fact that smaller objects are segmented with less accuracy.
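The information loss from resizing is easy to demonstrate. Below is a toy nearest-neighbour downsampling sketch (illustrative only; not the repo's actual preprocessing) in which a one-pixel object simply disappears from the label mask:

```python
import numpy as np

def nearest_resize(mask, out_h, out_w):
    """Nearest-neighbour downsample of a 2D label mask (toy illustration)."""
    h, w = mask.shape
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source col for each output col
    return mask[np.ix_(rows, cols)]

mask = np.zeros((256, 256), dtype=int)
mask[101, 101] = 1                     # a one-pixel "object"
small = nearest_resize(mask, 64, 64)   # 4x downsample skips that pixel
```

Any object smaller than the sampling stride can vanish entirely, so small objects in the original image get segmented with correspondingly lower accuracy.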
![](logs/images/sparse_entropy.png)
Now for the gradients:
- If you watch the gradients closely, you will notice that the initial training happens almost entirely in the newly added layers; only after these layers are reasonably trained do the VGG layers see meaningful gradient flow. This is understandable, since changes in the new layers affect the loss objective much more in the beginning.
- The earlier layers of the network are initialized with VGG weights, so conceptually they require less tuning unless the training data is extremely varied, which in this case it is not.
- The first convolutional layer captures low-level information, and since this is entirely dataset dependent, you can see the gradients adjusting the first-layer weights to adapt the model to the dataset.
- The other VGG conv layers receive very small gradients, since the concepts they capture are already good enough for our end objective, segmentation.
- This is the core reason **transfer learning** works so well; it seemed worth pointing out here.
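The first observation above follows directly from the chain rule: the gradient flowing back into a pretrained layer is scaled by the new layer's weights, so small-initialized new layers pass almost no gradient upstream at first. A toy numpy sketch (hypothetical names, one linear layer standing in for VGG and one for the new head):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=8)              # toy input activation
W_vgg = rng.normal(size=(8, 8))     # stand-in for a pretrained VGG layer
w_new = rng.normal(size=8) * 1e-3   # newly added layer, small init

h = W_vgg @ x                       # forward through the "pretrained" layer
y = w_new @ h                       # forward through the "new" layer
# For loss L = 0.5 * y**2, dL/dy = y. Backprop:
g_new = y * h                       # gradient on the new layer's weights
g_h = y * w_new                     # gradient flowing back toward the VGG layer
```

Since `g_h` is proportional to `w_new` (here ~1e-3) while `g_new` is proportional to the O(1) activations `h`, the new layer's gradients dominate early training by orders of magnitude, exactly the behaviour visible in the gradient plots below.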
![](logs/images/conv_1_1_gradient.png) ![](logs/images/conv_4_1_gradient.png) ![](logs/images/conv_4_2_gradient.png) ![](logs/images/conv_4_3_gradient.png)
## Useful Links
- Video of the presentation given by the authors of the paper - [link](http://techtalks.tv/talks/fully-convolutional-networks-for-semantic-segmentation/61606/)