# Mask R-CNN for Object Detection and Segmentation
This is an implementation of [Mask R-CNN](https://arxiv.org/abs/1703.06870) on Python 3, Keras, and TensorFlow. The model generates bounding boxes and segmentation masks for each instance of an object in the image. It's based on Feature Pyramid Network (FPN) and a ResNet101 backbone.
![Instance Segmentation Sample](assets/street.png)
The repository includes:
* Source code of Mask R-CNN built on FPN and ResNet101.
* Training code for MS COCO
* Pre-trained weights for MS COCO
* Jupyter notebooks to visualize the detection pipeline at every step
* ParallelModel class for multi-GPU training
* Evaluation on MS COCO metrics (AP)
* Example of training on your own dataset
The code is documented and designed to be easy to extend. If you use it in your research, please consider referencing this repository. If you work on 3D vision, you might find our recently released [Matterport3D](https://matterport.com/blog/2017/09/20/announcing-matterport3d-research-dataset/) dataset useful as well.
This dataset was created from 3D-reconstructed spaces captured by our customers who agreed to make them publicly available for academic use. You can see more examples [here](https://matterport.com/gallery/).
# Projects Using this Model
If you extend this model to other datasets or build projects that use it, we'd love to hear from you.
* [Images to OSM](https://github.com/jremillard/images-to-osm): Use TensorFlow, Bing, and OSM to find features in satellite images.
The goal is to improve OpenStreetMap by adding high quality baseball, soccer, tennis, football, and basketball fields.
# Getting Started
* [demo.ipynb](/demo.ipynb) is the easiest way to start. It shows an example of using a model pre-trained on MS COCO to segment objects in your own images.
It includes code to run object detection and instance segmentation on arbitrary images.
* [train_shapes.ipynb](train_shapes.ipynb) shows how to train Mask R-CNN on your own dataset. This notebook introduces a toy dataset (Shapes) to demonstrate training on a new dataset.
* ([model.py](model.py), [utils.py](utils.py), [config.py](config.py)): These files contain the main Mask R-CNN implementation.
* [inspect_data.ipynb](/inspect_data.ipynb). This notebook visualizes the different pre-processing steps
to prepare the training data.
* [inspect_model.ipynb](/inspect_model.ipynb) This notebook goes in depth into the steps performed to detect and segment objects. It provides visualizations of every step of the pipeline.
* [inspect_weights.ipynb](/inspect_weights.ipynb)
This notebook inspects the weights of a trained model and looks for anomalies and odd patterns.
# Step by Step Detection
To help with debugging and understanding the model, there are 3 notebooks
([inspect_data.ipynb](inspect_data.ipynb), [inspect_model.ipynb](inspect_model.ipynb),
[inspect_weights.ipynb](inspect_weights.ipynb)) that provide a lot of visualizations and allow running the model step by step to inspect the output at each point. Here are a few examples:
## 1. Anchor sorting and filtering
Visualizes every step of the first stage Region Proposal Network and displays positive and negative anchors along with anchor box refinement.
![](assets/detection_anchors.png)
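
As a rough illustration of what "positive" and "negative" anchors mean, here is a minimal sketch that labels an anchor by its overlap (IoU) with the ground-truth boxes. The function names, the `(y1, x1, y2, x2)` box layout, and the 0.7 / 0.3 thresholds are assumptions chosen for illustration; they are not taken from this repository's code, and a full RPN target builder does more than this.

```python
def iou(box_a, box_b):
    """Intersection over union of two (y1, x1, y2, x2) boxes."""
    y1 = max(box_a[0], box_b[0])
    x1 = max(box_a[1], box_b[1])
    y2 = min(box_a[2], box_b[2])
    x2 = min(box_a[3], box_b[3])
    inter = max(0, y2 - y1) * max(0, x2 - x1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchor(anchor, gt_boxes, pos_iou=0.7, neg_iou=0.3):
    """Return 1 (positive), -1 (negative), or 0 (neutral) for one anchor.

    The thresholds are assumed defaults, not values read from the repository.
    """
    best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
    if best >= pos_iou:
        return 1
    if best < neg_iou:
        return -1
    return 0
```

Neutral anchors (overlap between the two thresholds) are typically left out of the loss, which is why only positive and negative anchors show up in the visualization.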
## 2. Bounding Box Refinement
This is an example of final detection boxes (dotted lines) and the refinement applied to them (solid lines) in the second stage.
![](assets/detection_refinement.png)
## 3. Mask Generation
Examples of generated masks. These then get scaled and placed on the image in the right location.
![](assets/detection_masks.png)
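
The "scaled and placed" step can be sketched in a few lines. This is a simplified illustration, not the repository's code: it assumes each small mask holds per-pixel scores, uses nearest-neighbour scaling and a 0.5 threshold (both assumptions), and works on plain Python lists so it runs without dependencies. `place_mask` is a hypothetical name.

```python
def place_mask(small_mask, box, image_shape, threshold=0.5):
    """Scale a small score mask to `box` and paste it into a full-size binary mask.

    small_mask:  2D list of scores in [0, 1]
    box:         (y1, x1, y2, x2) in image pixels, y2 and x2 exclusive
    image_shape: (height, width) of the target image
    """
    y1, x1, y2, x2 = box
    height, width = image_shape
    box_h, box_w = y2 - y1, x2 - x1
    mask_h, mask_w = len(small_mask), len(small_mask[0])

    full = [[0] * width for _ in range(height)]
    for y in range(box_h):
        # Nearest-neighbour lookup of the source row for this output row.
        src_y = min(mask_h - 1, y * mask_h // box_h)
        for x in range(box_w):
            src_x = min(mask_w - 1, x * mask_w // box_w)
            if small_mask[src_y][src_x] >= threshold:
                full[y1 + y][x1 + x] = 1
    return full


# A 2x2 score mask stretched over a 4x4 box inside a 6x6 image.
full_mask = place_mask([[0.9, 0.1], [0.2, 0.8]], (1, 1, 5, 5), (6, 6))
```

A smoother interpolation would give cleaner mask edges; nearest-neighbour is used here only to keep the sketch short.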
## 4. Layer Activations
Often it's useful to inspect the activations at different layers to look for signs of trouble (all zeros or random noise).
![](assets/detection_activations.png)
## 5. Weight Histograms
Another useful debugging tool is to inspect the weight histograms. These are included in the inspect_weights.ipynb notebook.
![](assets/detection_histograms.png)
## 6. Logging to TensorBoard
TensorBoard is another great debugging and visualization tool. The model is configured to log losses and save weights at the end of every epoch.
![](assets/detection_tensorboard.png)
## 7. Composing the different pieces into a final result
![](assets/detection_final.png)
# Training on MS COCO
We're providing pre-trained weights for MS COCO to make it easier to start. You can
use those weights as a starting point to train your own variation on the network.
Training and evaluation code is in coco.py. You can import this
module in Jupyter notebook (see the provided notebooks for examples) or you
can run it directly from the command line as such:
```
# Train a new model starting from pre-trained COCO weights
python3 coco.py train --dataset=/path/to/coco/ --model=coco
# Train a new model starting from ImageNet weights
python3 coco.py train --dataset=/path/to/coco/ --model=imagenet
# Continue training a model that you had trained earlier
python3 coco.py train --dataset=/path/to/coco/ --model=/path/to/weights.h5
# Continue training the last model you trained. This will find
# the last trained weights in the model directory.
python3 coco.py train --dataset=/path/to/coco/ --model=last
```
You can also run the COCO evaluation code with:
```
# Run COCO evaluation on the last trained model
python3 coco.py evaluate --dataset=/path/to/coco/ --model=last
```
The training schedule, learning rate, and other parameters should be set in coco.py.
# Training on Your Own Dataset
To train the model on your own dataset you'll need to sub-class two classes:
```Config```
This class contains the default configuration. Subclass it and modify the attributes you need to change.
```Dataset```
This class provides a consistent way to work with any dataset.
It allows you to use new datasets for training without having to change
the code of the model. It also supports loading multiple datasets at the
same time, which is useful if the objects you want to detect are not
all available in one dataset.
The ```Dataset``` class itself is the base class. To use it, create a new
class that inherits from it and adds functions specific to your dataset.
See the base `Dataset` class in utils.py and examples of extending it in train_shapes.ipynb and coco.py.
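
The subclassing pattern described above looks roughly like the sketch below. To keep it self-contained, `Config` and `Dataset` here are simplified stand-ins written for this example; in the repository you would import the real classes from config.py and utils.py instead. The `FieldsConfig` and `FieldsDataset` classes, the `load_fields` method, and the class names are all hypothetical, and the attribute and method names on the stand-ins should be checked against the real base classes.

```python
# Simplified stand-ins so the sketch runs on its own. In the repository,
# subclass the real Config (config.py) and Dataset (utils.py) instead.
class Config:
    NAME = None
    NUM_CLASSES = 1       # background only
    IMAGES_PER_GPU = 2


class Dataset:
    def __init__(self):
        self.class_info = [{"source": "", "id": 0, "name": "BG"}]
        self.image_info = []

    def add_class(self, source, class_id, class_name):
        self.class_info.append(
            {"source": source, "id": class_id, "name": class_name})

    def add_image(self, source, image_id, path, **kwargs):
        info = {"source": source, "id": image_id, "path": path}
        info.update(kwargs)
        self.image_info.append(info)


# Hypothetical dataset-specific subclasses.
class FieldsConfig(Config):
    NAME = "fields"
    NUM_CLASSES = 1 + 2   # background + 2 object classes
    IMAGES_PER_GPU = 1    # override only what differs from the defaults


class FieldsDataset(Dataset):
    def load_fields(self, image_paths):
        """Register the classes and images that make up this dataset."""
        self.add_class("fields", 1, "tennis")
        self.add_class("fields", 2, "soccer")
        for i, path in enumerate(image_paths):
            self.add_image("fields", image_id=i, path=path)


config = FieldsConfig()
dataset = FieldsDataset()
dataset.load_fields(["a.jpg", "b.jpg"])
```

A real subclass would also need to supply the instance masks for each image; train_shapes.ipynb and coco.py show complete examples.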
## Differences from the Official Paper
This implementation follows the Mask R-CNN paper for the most part, but there are a few cases where we deviated in favor of code simplicity and generalization. These are some of the differences we're aware of. If you encounter other differences, please do let us know.
* **Image Resizing:** To support training multiple images per batch we resize all images to the same size. For example, 1024x1024px on MS COCO. We preserve the aspect ratio, so if an image is not square we pad it with zeros. In the paper the resizing is done such that the smallest side is 800px and the largest is trimmed at 1000px.
* **Bounding Boxes**: Some datasets provide bounding boxes and some provide masks only. To support training on multiple datasets we opted to ignore the bounding boxes that come with the dataset and generate them on the fly instead. We pick the smallest box that encapsulates all the pixels of the mask as the bounding box. This simplifies the implementation and also makes it easy to apply certain image augmentations that would otherwise be really hard to apply to bounding boxes, such as image rotation.
To validate this approach, we compared our computed bounding boxes to those provided by the COCO dataset.
We found that ~2% of bounding boxes differed by 1px or more, ~0.05% differed by 5px or more,
and only 0.01% differed by 10px or more.
* **Learning Rate:** The paper uses a learning rate of 0.02, but we found that to be
too high; it often causes the weights to explode, especially when using a small batch
size. It might be related to differences between how Caffe and TensorFlow compute
gradients (sum vs mean across batches and G
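
Two of the differences listed above, square resizing with zero padding and bounding boxes derived from masks, can be sketched as follows. This is an illustrative sketch rather than the repository's implementation: the function names are hypothetical, the padding is assumed to be split evenly on both sides, box ends are assumed to be exclusive, and plain Python lists are used so the code runs without dependencies.

```python
def resize_geometry(height, width, target=1024):
    """Geometry for fitting an image into a target x target square.

    Returns (scale, new_h, new_w, padding), where padding is
    (top, bottom, left, right) zero-padding in pixels.
    """
    scale = target / max(height, width)   # preserve the aspect ratio
    new_h, new_w = round(height * scale), round(width * scale)
    top = (target - new_h) // 2
    left = (target - new_w) // 2
    padding = (top, target - new_h - top, left, target - new_w - left)
    return scale, new_h, new_w, padding


def mask_to_bbox(mask):
    """Smallest (y1, x1, y2, x2) box containing every non-zero mask pixel.

    `mask` is a 2D list; y2 and x2 are exclusive. An empty mask yields
    (0, 0, 0, 0).
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows:
        return (0, 0, 0, 0)
    return (rows[0], cols[0], rows[-1] + 1, cols[-1] + 1)


# A 480x640 image scaled to fit 1024x1024, padded above and below.
scale, new_h, new_w, padding = resize_geometry(480, 640)

# A small mask whose non-zero pixels span rows 1-2 and columns 1-2.
bbox = mask_to_bbox([[0, 0, 0, 0],
                     [0, 1, 1, 0],
                     [0, 0, 1, 0],
                     [0, 0, 0, 0]])
```

Because the box is recomputed from the mask, any augmentation applied to the mask (a rotation, for instance) automatically yields a matching box.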