# Mask R-CNN for Object Detection and Segmentation
This is an implementation of [Mask R-CNN](https://arxiv.org/abs/1703.06870) on Python 3, Keras, and TensorFlow. The model generates bounding boxes and segmentation masks for each instance of an object in the image. It's based on Feature Pyramid Network (FPN) and a ResNet101 backbone.
![Instance Segmentation Sample](assets/street.png)
The repository includes:
* Source code of Mask R-CNN built on FPN and ResNet101.
* Training code for MS COCO
* Pre-trained weights for MS COCO
* Jupyter notebooks to visualize the detection pipeline at every step
* ParallelModel class for multi-GPU training
* Evaluation on MS COCO metrics (AP)
* Example of training on your own dataset
The code is documented and designed to be easy to extend. If you use it in your research, please consider citing this repository (bibtex below). If you work on 3D vision, you might find our recently released [Matterport3D](https://matterport.com/blog/2017/09/20/announcing-matterport3d-research-dataset/) dataset useful as well.
This dataset was created from 3D-reconstructed spaces captured by our customers who agreed to make them publicly available for academic use. You can see more examples [here](https://matterport.com/gallery/).
# Getting Started
* [demo.ipynb](samples/demo.ipynb) is the easiest way to start. It shows an example of using a model pre-trained on MS COCO to segment objects in your own images.
It includes code to run object detection and instance segmentation on arbitrary images (a minimal inference sketch follows this list).
* [train_shapes.ipynb](samples/shapes/train_shapes.ipynb) shows how to train Mask R-CNN on your own dataset. This notebook introduces a toy dataset (Shapes) to demonstrate training on a new dataset.
* ([model.py](mrcnn/model.py), [utils.py](mrcnn/utils.py), [config.py](mrcnn/config.py)): These files contain the main Mask R-CNN implementation.
* [inspect_data.ipynb](samples/coco/inspect_data.ipynb). This notebook visualizes the different pre-processing steps
to prepare the training data.
* [inspect_model.ipynb](samples/coco/inspect_model.ipynb) This notebook goes in depth into the steps performed to detect and segment objects. It provides visualizations of every step of the pipeline.
* [inspect_weights.ipynb](samples/coco/inspect_weights.ipynb)
This notebook inspects the weights of a trained model and looks for anomalies and odd patterns.
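
If you'd rather start from a plain script than the demo notebook, the core inference flow is short. Below is a minimal sketch, assuming the `mrcnn` package is importable, the COCO weights file `mask_rcnn_coco.h5` has been downloaded to the repo root, and the image path is illustrative:

```python
import skimage.io
from mrcnn.config import Config
from mrcnn import model as modellib

class InferenceConfig(Config):
    # COCO has 80 object classes plus the background class.
    NAME = "coco_inference"
    NUM_CLASSES = 1 + 80
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1  # detect one image at a time

# Build the model in inference mode and load pre-trained COCO weights.
model = modellib.MaskRCNN(mode="inference", config=InferenceConfig(),
                          model_dir="logs")
model.load_weights("mask_rcnn_coco.h5", by_name=True)

# Run detection; detect() returns one result dict per input image.
image = skimage.io.imread("images/example.jpg")  # hypothetical path
r = model.detect([image], verbose=1)[0]
print(r["rois"].shape, r["class_ids"], r["scores"], r["masks"].shape)
```

Here `rois` is an [N, (y1, x1, y2, x2)] array of detection boxes and `masks` is a [H, W, N] array of per-instance binary masks.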
# Step by Step Detection
To help with debugging and understanding the model, there are 3 notebooks
([inspect_data.ipynb](samples/coco/inspect_data.ipynb), [inspect_model.ipynb](samples/coco/inspect_model.ipynb),
[inspect_weights.ipynb](samples/coco/inspect_weights.ipynb)) that provide a lot of visualizations and allow running the model step by step to inspect the output at each point. Here are a few examples:
## 1. Anchor sorting and filtering
Visualizes every step of the first stage Region Proposal Network and displays positive and negative anchors along with anchor box refinement.
![](assets/detection_anchors.png)
## 2. Bounding Box Refinement
This is an example of final detection boxes (dotted lines) and the refinement applied to them (solid lines) in the second stage.
![](assets/detection_refinement.png)
## 3. Mask Generation
Examples of generated masks. These then get scaled and placed on the image in the right location.
![](assets/detection_masks.png)
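Conceptually, each predicted mask is a small float array (e.g. 28x28) that gets resized to its detection box and pasted into a full-image canvas. A rough numpy sketch of that idea (the repo's own version is `unmold_mask` in `mrcnn/utils.py`; this is a simplified stand-in, not the shipped code):

```python
import numpy as np
import skimage.transform

def paste_mask(small_mask, box, image_shape, threshold=0.5):
    """Simplified stand-in for mrcnn.utils.unmold_mask.

    small_mask: [h, w] float mask from the mask head, values in [0, 1].
    box: (y1, x1, y2, x2) detection box in image pixel coordinates.
    image_shape: (H, W) of the target image.
    """
    y1, x1, y2, x2 = box
    # Resize the low-res mask to the box size (bilinear), then binarize.
    mask = skimage.transform.resize(small_mask, (y2 - y1, x2 - x1), order=1)
    mask = mask >= threshold
    # Paste the binary mask into a full-size canvas at the box location.
    full = np.zeros(image_shape[:2], dtype=bool)
    full[y1:y2, x1:x2] = mask
    return full
```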
## 4. Layer Activations
Often it's useful to inspect the activations at different layers to look for signs of trouble (all zeros or random noise).
![](assets/detection_activations.png)
## 5. Weight Histograms
Another useful debugging tool is to inspect the weight histograms. These are included in the [inspect_weights.ipynb](samples/coco/inspect_weights.ipynb) notebook.
![](assets/detection_histograms.png)
## 6. Logging to TensorBoard
TensorBoard is another great debugging and visualization tool. The model is configured to log losses and save weights at the end of every epoch.
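Training losses and checkpoints go to the model directory (`logs/` by default), so you can typically monitor a run with `tensorboard --logdir=logs`.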
![](assets/detection_tensorboard.png)
## 7. Composing the different pieces into a final result
![](assets/detection_final.png)
# Training on MS COCO
We're providing pre-trained weights for MS COCO to make it easier to start. You can
use those weights as a starting point to train your own variation on the network.
Training and evaluation code is in `samples/coco/coco.py`. You can import this
module in Jupyter notebook (see the provided notebooks for examples) or you
can run it directly from the command line as such:
```
# Train a new model starting from pre-trained COCO weights
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=coco
# Train a new model starting from ImageNet weights
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=imagenet
# Continue training a model that you had trained earlier
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=/path/to/weights.h5
# Continue training the last model you trained. This will find
# the last trained weights in the model directory.
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=last
```
You can also run the COCO evaluation code with:
```
# Run COCO evaluation on the last trained model
python3 samples/coco/coco.py evaluate --dataset=/path/to/coco/ --model=last
```
The training schedule, learning rate, and other parameters should be set in `samples/coco/coco.py`.
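For reference, training follows Keras conventions. Here is a sketch of the two-stage schedule used by the samples (the epoch counts are illustrative, and `model`, `config`, `dataset_train`, and `dataset_val` are assumed to be set up as in `samples/coco/coco.py`, with the model in training mode):

```python
# Stage 1: train only the randomly initialized head layers, keeping the
# pre-trained backbone frozen.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=40,
            layers='heads')

# Stage 2: fine-tune all layers at a lower learning rate.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE / 10,
            epochs=120,
            layers='all')
```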
# Training on Your Own Dataset
Start by reading this [blog post about the balloon color splash sample](https://engineering.matterport.com/splash-of-color-instance-segmentation-with-mask-r-cnn-and-tensorflow-7c761e238b46). It covers the process starting from annotating images to training to using the results in a sample application.
In summary, to train the model on your own dataset you'll need to extend two classes:
```Config```
This class contains the default configuration. Subclass it and modify the attributes you need to change.
```Dataset```
This class provides a consistent way to work with any dataset.
It allows you to use new datasets for training without having to change
the code of the model. It also supports loading multiple datasets at the
same time, which is useful if the objects you want to detect are not
all available in one dataset.
See examples in `samples/shapes/train_shapes.ipynb`, `samples/coco/coco.py`, `samples/balloon/balloon.py`, and `samples/nucleus/nucleus.py`.
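As a minimal sketch, here are the two subclasses for a hypothetical single-class "widget" dataset (the names, paths, and ids are illustrative; `load_mask` is the part you implement for your annotation format):

```python
from mrcnn.config import Config
from mrcnn import utils

class WidgetConfig(Config):
    # Hypothetical single-class configuration.
    NAME = "widget"
    NUM_CLASSES = 1 + 1        # background + 1 object class
    GPU_COUNT = 1
    IMAGES_PER_GPU = 2
    STEPS_PER_EPOCH = 100

class WidgetDataset(utils.Dataset):
    def load_widgets(self, dataset_dir):
        # Register the class this dataset provides, then each image.
        self.add_class("widget", 1, "widget")
        self.add_image("widget", image_id=0,
                       path=f"{dataset_dir}/images/00001.jpg")

    def load_mask(self, image_id):
        # Return (masks, class_ids): masks is a bool array of shape
        # [height, width, instance_count]; class_ids is an int array of
        # shape [instance_count]. Implement this for your annotations.
        raise NotImplementedError

    def image_reference(self, image_id):
        return self.image_info[image_id]["path"]
```

After loading, call `dataset.prepare()` before passing the dataset to `model.train()`.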
## Differences from the Official Paper
This implementation follows the Mask R-CNN paper for the most part, but there are a few cases where we deviated in favor of code simplicity and generalization. These are some of the differences we're aware of. If you encounter other differences, please do let us know.
* **Image Resizing:** To support training multiple images per batch we resize all images to the same size. For example, 1024x1024px on MS COCO. We preserve the aspect ratio, so if an image is not square we pad it with zeros. In the paper the resizing is done such that the smallest side is 800px and the largest is trimmed at 1000px.
* **Bounding Boxes**: Some datasets provide bounding boxes and some provide masks only. To support training on multiple datasets we opted to ignore the bounding boxes that come with the dataset and generate them on the fly instead. We pick the smallest box that encapsulates all the pixels of the mask as the bounding box. This simplifies the implementation and also makes it easy to apply image augmentations that would otherwise be harder to apply to bounding boxes, such as image rotation. A numpy sketch of this box computation appears after this list.
To validate this approach, we compared our computed bounding boxes to those provided by the COCO dataset.
We found that ~2% of bounding boxes differed by 1px or more, ~0.05% differed by 5px or more,
and only 0.01% differed by 10px or more.
* **Learning Rate:** The paper uses a learning rate of 0.02, but we found that to be
too high, and it often causes the weights to explode, especially when using a small batch
size. It might be related to differences between how Caffe and TensorFlow compute
gradients (sum vs mean across batches and GPUs). Or maybe the official training code uses gradient clipping to avoid this issue. We do use gradient clipping, but don't set it too aggressively. We found that smaller learning rates converge faster anyway, so we go with that.
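
For reference, the on-the-fly box is just the tightest rectangle around the mask's non-zero pixels (the repo implements this as `extract_bboxes` in `mrcnn/utils.py`); a standalone numpy sketch:

```python
import numpy as np

def bbox_from_mask(mask):
    """Tightest box around one instance mask.

    mask: [height, width] boolean array.
    Returns (y1, x1, y2, x2) where y2/x2 are one past the last mask pixel,
    matching the convention used by mrcnn.utils.extract_bboxes.
    """
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    if not rows.any():          # empty mask -> degenerate box
        return (0, 0, 0, 0)
    y1, y2 = np.where(rows)[0][[0, -1]]
    x1, x2 = np.where(cols)[0][[0, -1]]
    return (y1, x1, y2 + 1, x2 + 1)
```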