FAIR's research platform for object detection, implementing popular algorithms such as Mask R-CNN and RetinaNet_carprice

222 files in total

py: 90

yaml: 87

jpg: 12


65 views · uploaded 2023-04-30 10:26:12 · 3.93 MB · ZIP

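**The FAIR object detection research platform in detail**

Facebook AI Research (FAIR) is the team Facebook set up to focus on artificial intelligence research. With this platform, researchers and developers can apply computer vision techniques such as object detection and instance segmentation in their own work. This resource highlights two widely used algorithms in the object detection field: Mask R-CNN and RetinaNet.

**Mask R-CNN**

Mask R-CNN is a deep learning model proposed by the FAIR team in 2017. It extends Faster R-CNN so that a single network performs object detection and pixel-level instance segmentation at the same time. Its core addition is a mask branch that runs in parallel with the existing box and class branches and predicts a segmentation mask for each detected object instance. As a result, Mask R-CNN not only localizes each object with a bounding box but also delineates its outline, which is valuable in applications such as autonomous driving and medical image analysis.

**RetinaNet**

RetinaNet is another object detector developed by the FAIR team. It addresses the class imbalance problem of dense, one-stage detectors: background locations vastly outnumber object locations, so during training the loss is dominated by easy background examples and the signal from actual objects is diluted. RetinaNet introduces a new loss function, Focal Loss, which down-weights the many easy negatives so that training concentrates on hard examples. This lets a one-stage detector reach accuracy comparable to two-stage detectors while keeping the speed of a one-stage design.

**The Detectron-main project**

The archive "Detectron-main" contains the source code of FAIR's Detectron project. Detectron is an open-source framework for object detection, instance segmentation, and keypoint detection, and it supports several state-of-the-art algorithms, including Mask R-CNN and RetinaNet. The framework is written in Python on top of Caffe2 (the bundled README notes that Caffe2 was being merged into PyTorch when it was written), and it lets researchers and developers experiment with and deploy new models quickly. Its main features are efficient training and inference, a large library of pretrained models (see MODEL_ZOO.md), and visualization tools (see vis.py). Studying the project shows how these algorithms are implemented, how to tune hyperparameters through the YAML configuration files, and how to fine-tune on your own dataset. For anyone who wants to deepen their understanding of object detection or do research in the area, it is a valuable learning resource, and it is also a starting point for other vision tasks such as keypoint detection.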
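To make the Focal Loss idea concrete, here is a minimal pure-Python sketch of the binary focal loss for a single prediction. It is an illustration, not Detectron's implementation: the function names are hypothetical, and the defaults α = 0.25 and γ = 2 are assumed here because they are the values usually quoted from the Focal Loss paper.

```python
import math


def cross_entropy(p, y):
    """Plain binary cross-entropy for one prediction.

    p: predicted foreground probability, y: 1 for foreground, 0 for background.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)   # avoid log(0)
    p_t = p if y == 1 else 1.0 - p    # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    alpha and gamma are assumed defaults, not values read from this archive.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy background location (p = 0.01) versus a hard object (p = 0.1).
print("easy negative:", cross_entropy(0.01, 0), focal_loss(0.01, 0))
print("hard positive:", cross_entropy(0.1, 1), focal_loss(0.1, 1))
```

With γ = 0 and α = 0.5 the expression reduces to half the ordinary cross-entropy, which is a quick way to check that the modulating factor is the only thing changing the weighting.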


FAIR用于对象检测研究的研究平台，实现了MaskR-CNN和RetinaNet等流行算法_carprice_Kaggl.zip (222 files)

zero_even_op.cc 1KB

Utils.cmake 11KB

Cuda.cmake 10KB

FindCuDNN.cmake 3KB

legacymake.cmake 2KB

Dependencies.cmake 2KB

Summary.cmake 2KB

zero_even_op.cu 2KB

Dockerfile 742B

.gitignore 316B

zero_even_op.h 1KB

33823288584_1d21cf0a26_k_example_output.jpg 997KB

17790319373_bd19b24cfc_k_example_output.jpg 961KB

gn.jpg 248KB

33823288584_1d21cf0a26_k.jpg 244KB

34501842524_3c858b3080_k.jpg 229KB

15673749081_767a7fa63a_k.jpg 220KB

18124840932_e42b3e377c_k.jpg 159KB

17790319373_bd19b24cfc_k.jpg 146KB

19064748793_bb942deea1_k.jpg 137KB

16004479832_a748d55f21_k.jpg 133KB

33887522274_eebd074106_k.jpg 119KB

24274813513_0cfd2ce6d0_k.jpg 111KB

LICENSE 10KB

voc_eval.m 1KB

xVOCap.m 258B

get_voc_opts.m 231B

Makefile 487B

MODEL_ZOO.md 116KB

README.md 17KB

INSTALL.md 9KB

README.md 7KB

GETTING_STARTED.md 6KB

FAQ.md 3KB

CODE_OF_CONDUCT.md 3KB

README.md 3KB

CONTRIBUTING.md 1KB

issue_template.md 982B

NOTICE 1KB

NOTICE 917B

config.py 46KB

test.py 34KB

model_builder.py 23KB

detector.py 22KB

convert_pkl_to_pb.py 22KB

FPN.py 20KB

json_dataset.py 19KB

json_dataset_evaluator.py 17KB

test_engine.py 14KB

vis.py 14KB

task_evaluation.py 14KB

retinanet.py 13KB

boxes.py 13KB

model_convert_utils.py 12KB

net.py 12KB

retinanet_heads.py 12KB

loader.py 12KB

rpn.py 11KB

fast_rcnn.py 11KB

ResNet.py 11KB

mask_rcnn_heads.py 10KB

segms.py 10KB

rpn_generator.py 9KB

keypoints.py 9KB

generate_proposals.py 9KB

convert_cityscapes_to_coco.py 8KB

pickle_caffe_blobs.py 8KB

train.py 8KB

keypoint_rcnn_heads.py 8KB

test_retinanet.py 8KB

voc_eval.py 8KB

roidb.py 8KB

dataset_catalog.py 7KB

voc_dataset_evaluator.py 7KB

io.py 7KB

test_cfg.py 7KB

infer.py 6KB

blob.py 6KB

fast_rcnn_heads.py 6KB

data_loader_benchmark.py 6KB

infer_simple.py 6KB

rpn_heads.py 5KB

optimizer.py 5KB

mask_rcnn.py 5KB

subprocess.py 5KB

c2.py 5KB

test_restore_checkpoint.py 5KB

collect_and_distribute_fpn_rpn_proposals.py 5KB

keypoint_rcnn.py 5KB

test_zero_even_op.py 5KB

test_bbox_transform.py 4KB

minibatch.py 4KB

lr_policy.py 4KB

convert_coco_model_to_cityscapes.py 4KB

train_net.py 4KB

test_batch_permutation_op.py 4KB

data_utils.py 4KB

test_loader.py 4KB

generate_anchors.py 4KB

222 entries in total

# Group Normalization for Mask R-CNN

<div align="center">
  <img src="gn.jpg" width="700px" />
</div>

## Introduction

This file provides Mask R-CNN baseline results and models trained with [Group Normalization](https://arxiv.org/abs/1803.08494):

```
@article{GroupNorm2018,
  title={Group Normalization},
  author={Yuxin Wu and Kaiming He},
  journal={arXiv:1803.08494},
  year={2018}
}
```

**Note:** This code uses the GroupNorm op implemented in CUDA, included in the Caffe2 repo. When writing this document, Caffe2 is being merged into PyTorch, and the GroupNorm op is located [here](https://github.com/pytorch/pytorch/blob/master/caffe2/operators/group_norm_op.cu). Make sure your Caffe2 is up to date.

## Pretrained Models with GN

These models are trained in Caffe2 on the standard ImageNet-1k dataset, using GroupNorm with 32 groups (G=32).

- [R-50-GN.pkl](https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/47261647/R-50-GN.pkl): ResNet-50 with GN, 24.0\% top-1 error (center-crop).
- [R-101-GN.pkl](https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/47592356/R-101-GN.pkl): ResNet-101 with GN, 22.6\% top-1 error (center-crop).
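For readers unfamiliar with what "GroupNorm with G groups" computes, the following is a minimal pure-Python sketch for a single sample. It is not the CUDA op referenced above: the function name and the list-based layout are hypothetical, and the learnable per-channel scale and shift are omitted for brevity.

```python
import math


def group_norm(x, num_groups, eps=1e-5):
    """Normalize one sample with Group Normalization (no scale/shift).

    x: list of C channels, each a flat list of spatial values.
    Channels are split into num_groups contiguous groups; mean and variance
    are computed over all values in a group, independent of batch size.
    """
    num_channels = len(x)
    if num_channels % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    per_group = num_channels // num_groups

    out = []
    for g in range(num_groups):
        group = x[g * per_group:(g + 1) * per_group]
        values = [v for channel in group for v in channel]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        inv_std = 1.0 / math.sqrt(var + eps)
        for channel in group:
            out.append([(v - mean) * inv_std for v in channel])
    return out


# Four channels with two spatial values each, split into two groups.
sample = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
print(group_norm(sample, num_groups=2))
```

Because the statistics depend only on the sample itself, the result does not change with the number of images per GPU, which matters for detection training at 2 images per GPU.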
## Results

### Baselines with BN

<table><tbody>
<th valign="bottom">case</th>
<th valign="bottom">type</th>
<th valign="bottom">lr schd</th>
<th valign="bottom">im/ gpu</th>
<th valign="bottom">train mem (GB)</th>
<th valign="bottom">train time (s/iter)</th>
<th valign="bottom">train time total (hr)</th>
<th valign="bottom">inference time (s/im)</th>
<th valign="bottom">box AP</th>
<th valign="bottom">mask AP</th>
<th valign="bottom">model id</th>
<tr>
<td align="left">R-50-FPN, BN*</td>
<td align="left">Mask R-CNN</td>
<td align="left">2x</td>
<td align="right">2</td>
<td align="right">8.6</td>
<td align="right">0.897</td>
<td align="right">44.9</td>
<td align="right">0.099 + 0.018</td>
<td align="right">38.6</td>
<td align="right">34.5</td>
<td align="right">35859007</td>
</tr>
<tr>
<td align="left">R-101-FPN, BN*</td>
<td align="left">Mask R-CNN</td>
<td align="left">2x</td>
<td align="right">2</td>
<td align="right">10.2</td>
<td align="right">0.993</td>
<td align="right">49.7</td>
<td align="right">0.126 + 0.017</td>
<td align="right">40.9</td>
<td align="right">36.4</td>
<td align="right">35861858</td>
</tr>
</tbody></table>

**Notes:**

- This table is copied from [Detectron Model Zoo](https://github.com/facebookresearch/Detectron/blob/master/MODEL_ZOO.md#end-to-end-faster--mask-r-cnn-baselines).
- BN* means that BatchNorm (BN) is used for pre-training and is frozen and turned into a per-channel linear layer when fine-tuning. This is the default of Faster/Mask R-CNN and Detectron.
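The BN* note above says a frozen BatchNorm layer becomes a per-channel linear layer. A minimal sketch of that folding is shown below; the function name is hypothetical and this is not Detectron's code, only the standard algebra: with fixed statistics, `gamma * (x - mean) / sqrt(var + eps) + beta` equals `scale * x + bias`.

```python
import math


def fold_frozen_bn(gamma, beta, running_mean, running_var, eps=1e-5):
    """Fold frozen BatchNorm parameters into a per-channel scale and bias.

    All arguments are per-channel lists of equal length. eps is an assumed
    value; use whatever the pre-trained model was trained with.
    """
    scale = [g / math.sqrt(v + eps) for g, v in zip(gamma, running_var)]
    bias = [b - m * s for b, m, s in zip(beta, running_mean, scale)]
    return scale, bias


# One channel: compare the folded affine form with the BN inference formula.
gamma, beta, mean, var = [2.0], [0.5], [1.0], [4.0]
scale, bias = fold_frozen_bn(gamma, beta, mean, var)
x = 3.0
bn_output = gamma[0] * (x - mean[0]) / math.sqrt(var[0] + 1e-5) + beta[0]
affine_output = scale[0] * x + bias[0]
print(bn_output, affine_output)
```

Since the scale and bias are constants, no batch statistics are computed during fine-tuning, which is why BN* is unaffected by the small per-GPU batch size.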
### Mask R-CNN with GN

#### Standard Mask R-CNN recipe

<table><tbody>
<th valign="bottom">case</th>
<th valign="bottom">type</th>
<th valign="bottom">lr schd</th>
<th valign="bottom">im/ gpu</th>
<th valign="bottom">train mem (GB)</th>
<th valign="bottom">train time (s/iter)</th>
<th valign="bottom">train time total (hr)</th>
<th valign="bottom">inference time (s/im)</th>
<th valign="bottom">box AP</th>
<th valign="bottom">mask AP</th>
<th valign="bottom">model id</th>
<th valign="bottom">download links</th>
<tr>
<td align="left">R-50-FPN, GN</td>
<td align="left">Mask R-CNN</td>
<td align="left">2x</td>
<td align="right">2</td>
<td align="right">10.5</td>
<td align="right">1.017</td>
<td align="right">50.8</td>
<td align="right">0.146 + 0.017</td>
<td align="right">40.3</td>
<td align="right">35.7</td>
<td align="right">48616381</td>
<td align="left"> <a href="https://dl.fbaipublicfiles.com/detectron/GN/48616381/04_2018_gn_baselines/e2e_mask_rcnn_R-50-FPN_2x_gn_0416.13_23_38.bTlTI97Q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl">model</a>  |  <a href="https://dl.fbaipublicfiles.com/detectron/GN/48616381/04_2018_gn_baselines/e2e_mask_rcnn_R-50-FPN_2x_gn_0416.13_23_38.bTlTI97Q/output/test/coco_2014_minival/generalized_rcnn/bbox_coco_2014_minival_results.json">boxes</a>  |  <a href="https://dl.fbaipublicfiles.com/detectron/GN/48616381/04_2018_gn_baselines/e2e_mask_rcnn_R-50-FPN_2x_gn_0416.13_23_38.bTlTI97Q/output/test/coco_2014_minival/generalized_rcnn/segmentations_coco_2014_minival_results.json">masks</a></td>
</tr>
<tr>
<td align="left">R-101-FPN, GN</td>
<td align="left">Mask R-CNN</td>
<td align="left">2x</td>
<td align="right">2</td>
<td align="right">12.4</td>
<td align="right">1.151</td>
<td align="right">57.5</td>
<td align="right">0.180 + 0.015</td>
<td align="right">41.8</td>
<td align="right">36.8</td>
<td align="right">48616724</td>
<td align="left"> <a href="https://dl.fbaipublicfiles.com/detectron/GN/48616724/04_2018_gn_baselines/e2e_mask_rcnn_R-101-FPN_2x_gn_0416.13_26_34.GLnri4GR/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl">model</a>  |  <a href="https://dl.fbaipublicfiles.com/detectron/GN/48616724/04_2018_gn_baselines/e2e_mask_rcnn_R-101-FPN_2x_gn_0416.13_26_34.GLnri4GR/output/test/coco_2014_minival/generalized_rcnn/bbox_coco_2014_minival_results.json">boxes</a>  |  <a href="https://dl.fbaipublicfiles.com/detectron/GN/48616724/04_2018_gn_baselines/e2e_mask_rcnn_R-101-FPN_2x_gn_0416.13_26_34.GLnri4GR/output/test/coco_2014_minival/generalized_rcnn/segmentations_coco_2014_minival_results.json">masks</a></td>
</tr>
</tbody></table>

**Notes:**

- GN is applied on: (i) ResNet layers inherited from pre-training, (ii) the FPN-specific layers, (iii) the RoI bbox head, and (iv) the RoI mask head.
- These GN models use a 4conv+1fc RoI box head. The BN* counterpart with this head performs similarly with the default 2fc head: using this co
