Deep Learning with PyTorch: Object Detection in Practice (PyTorch complex project resources) - CSDN Library

574 files in total: 159 `.py`, 30 `.jpg`, 21 `.pyc`

Tags: pytorch, deep learning, object detection

Uploaded 2023-05-05 23:32:15, 149.26 MB ZIP

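This tutorial looks at how to use the PyTorch framework for object detection. PyTorch is a popular open-source machine learning library, valued for its flexibility and ease of use. Object detection is a core computer vision task: the goal is to recognize specific objects in an image and locate each of them with a bounding box. The topics covered by "Deep Learning with PyTorch: Object Detection in Practice" are outlined below.

1. **PyTorch basics**
   - PyTorch is built around a dynamic computation graph, which makes building and debugging models more direct.
   - The Tensor is PyTorch's basic data structure, used to store and operate on multi-dimensional arrays.
   - The autograd module provides automatic differentiation, which is essential for training neural networks.
2. **Deep learning principles**
   - Deep learning relies on multi-layer neural networks whose weights are optimized through backpropagation to improve predictions.
   - Activation functions such as ReLU, sigmoid and tanh introduce non-linearity, so the network can learn more complex patterns.
   - Training usually combines a loss function (such as cross-entropy) with an optimizer (such as SGD or Adam).
3. **Object detection techniques**
   - Detectors include two-stage, region-based methods such as R-CNN, Fast R-CNN and Mask R-CNN, and single-stage methods such as YOLO (You Only Look Once).
   - SSD (Single Shot MultiBox Detector) is an efficient single-stage detector that predicts bounding boxes and class scores together in one forward pass.
4. **Object detection models in PyTorch**
   - The torchvision library ships pre-trained detection models such as Faster R-CNN, Mask R-CNN, RetinaNet and SSD. YOLOv3 is not part of torchvision and is available through third-party implementations.
   - The torchvision detection weights are trained on COCO, with backbones pre-trained on ImageNet, which simplifies transfer learning.
5. **Model training**
   - Data preprocessing includes image resizing, normalization and data augmentation, which improve generalization.
   - Training involves mini-batching, learning-rate scheduling and early stopping to get the best model performance.
6. **Using the detection API**
   - The `torchvision.models.detection` API offers a simple way to load and fine-tune pre-trained detection models.
   - Typical usage is to load a model, preprocess the input image, run inference, and parse the predictions.
7. **Hands-on project**
   - A project may include creating a dataset, defining data loaders, choosing a suitable model, training it, and evaluating it on a test set.
   - It may also involve adapting a model, for example adding new classes or modifying the network architecture.
8. **Visualization tools**
   - Tools such as Visdom or TensorBoard can monitor training, for example the loss curve and learning-rate changes.
   - Drawing predictions with matplotlib or OpenCV helps in understanding and improving the model.
9. **Challenges and solutions**
   - Common problems include overfitting, underfitting and memory management; possible remedies include regularization, batch normalization and model pruning.

These steps cover the basic workflow and techniques of object detection with PyTorch. The hands-on project helps consolidate the theory and gives a platform for applying deep learning in practice. Both beginners and experienced developers should find useful insight and experience in this tutorial.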



# SSD: Single Shot MultiBox Object Detector, in PyTorch

A [PyTorch](http://pytorch.org/) implementation of [Single Shot MultiBox Detector](http://arxiv.org/abs/1512.02325) from the 2016 paper by Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. The official and original Caffe code can be found [here](https://github.com/weiliu89/caffe/tree/ssd).

<img align="right" src= "https://github.com/amdegroot/ssd.pytorch/blob/master/doc/ssd.png" height = 400/>

### Table of Contents
- <a href='#installation'>Installation</a>
- <a href='#datasets'>Datasets</a>
- <a href='#training-ssd'>Train</a>
- <a href='#evaluation'>Evaluate</a>
- <a href='#performance'>Performance</a>
- <a href='#demos'>Demos</a>
- <a href='#todo'>Future Work</a>
- <a href='#references'>Reference</a>

## Installation
- Install [PyTorch](http://pytorch.org/) by selecting your environment on the website and running the appropriate command.
- Clone this repository.
  * Note: We currently only support Python 3+.
- Then download the dataset by following the [instructions](#datasets) below.
- We now support [Visdom](https://github.com/facebookresearch/visdom) for real-time loss visualization during training!
  * To use Visdom in the browser:

  ```Shell
  # First install Python server and client
  pip install visdom
  # Start the server (probably in a screen or tmux)
  python -m visdom.server
  ```

  * Then (during training) navigate to http://localhost:8097/ (see the Train section below for training details).
- Note: For training, we currently support [VOC](http://host.robots.ox.ac.uk/pascal/VOC/) and [COCO](http://mscoco.org/), and aim to add [ImageNet](http://www.image-net.org/) support soon.

## Datasets
To make things easy, we provide bash scripts to handle the dataset downloads and setup for you.
We also provide simple dataset loaders that inherit `torch.utils.data.Dataset`, making them fully compatible with the `torchvision.datasets` [API](http://pytorch.org/docs/torchvision/datasets.html).

### COCO
Microsoft COCO: Common Objects in Context

##### Download COCO 2014

```Shell
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/COCO2014.sh
```

### VOC Dataset
PASCAL VOC: Visual Object Classes

##### Download VOC2007 trainval & test

```Shell
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2007.sh # <directory>
```

##### Download VOC2012 trainval

```Shell
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2012.sh # <directory>
```

## Training SSD
- First download the fc-reduced [VGG-16](https://arxiv.org/abs/1409.1556) PyTorch base network weights at: https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth
- By default, we assume you have downloaded the file in the `ssd.pytorch/weights` dir:

```Shell
mkdir weights
cd weights
wget https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth
```

- To train SSD using the train script, simply specify the parameters listed in `train.py` as flags or change them manually.

```Shell
python train.py
```

- Note:
  * For training, an NVIDIA GPU is strongly recommended for speed.
  * For instructions on Visdom usage/installation, see the <a href='#installation'>Installation</a> section.
  * You can resume training from a checkpoint by specifying its path as one of the training parameters (again, see `train.py` for options).

## Evaluation
To evaluate a trained network:

```Shell
python eval.py
```

You can specify the parameters listed in the `eval.py` file by flagging them or manually changing them.
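For orientation, evaluation on VOC is usually reported as mean average precision, which rests on matching detections to ground-truth boxes by intersection over union (IoU). The sketch below shows that matching step in the style of the PASCAL VOC protocol. It is not the code in `eval.py`: the names `iou` and `match_detections` and the sample boxes are made up for illustration, and it ignores details such as "difficult" objects and the +1 pixel convention.

```python
from typing import List, Sequence, Tuple

Box = Sequence[float]  # (xmin, ymin, xmax, ymax)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_detections(
    detections: List[Tuple[Box, float]],
    ground_truth: List[Box],
    threshold: float = 0.5,
) -> Tuple[int, int]:
    """Return (true_positives, false_positives) for one class in one image.

    Detections are visited in order of decreasing score. Each one is compared
    with its best-overlapping ground-truth box; a ground-truth box can be
    claimed only once, so duplicate detections count as false positives.
    """
    claimed = set()
    tp = fp = 0
    for box, _score in sorted(detections, key=lambda d: d[1], reverse=True):
        overlaps = [iou(box, gt) for gt in ground_truth]
        best = max(range(len(overlaps)), key=overlaps.__getitem__, default=-1)
        if best >= 0 and overlaps[best] > threshold and best not in claimed:
            claimed.add(best)
            tp += 1
        else:
            fp += 1
    return tp, fp


gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [
    ((0, 0, 10, 10), 0.9),    # exact hit on the first object
    ((1, 1, 10, 10), 0.8),    # duplicate of the first object
    ((20, 20, 30, 28), 0.7),  # good hit on the second object
    ((50, 50, 60, 60), 0.6),  # background
]
tp, fp = match_detections(dets, gt_boxes)
print(tp, fp, tp / (tp + fp), tp / len(gt_boxes))
```

Precision and recall computed this way at every score cutoff give the precision-recall curve from which average precision is derived.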
<img align="left" src= "https://github.com/amdegroot/ssd.pytorch/blob/master/doc/detection_examples.png">

## Performance

#### VOC2007 Test

##### mAP

| Original | Converted weiliu89 weights | From scratch w/o data aug | From scratch w/ data aug |
|:-:|:-:|:-:|:-:|
| 77.2 % | 77.26 % | 58.12 % | 77.43 % |

##### FPS
**GTX 1060:** ~45.45 FPS

## Demos

### Use a pre-trained SSD network for detection

#### Download a pre-trained network
- We are trying to provide PyTorch `state_dicts` (dict of weight tensors) of the latest SSD model definitions trained on different datasets.
- Currently, we provide the following PyTorch models:
  * SSD300 trained on VOC0712 (newest PyTorch weights)
    - https://s3.amazonaws.com/amdegroot-models/ssd300_mAP_77.43_v2.pth
  * SSD300 trained on VOC0712 (original Caffe weights)
    - https://s3.amazonaws.com/amdegroot-models/ssd_300_VOC0712.pth
- Our goal is to reproduce this table from the [original paper](http://arxiv.org/abs/1512.02325)

<p align="left"> <img src="http://www.cs.unc.edu/~wliu/papers/ssd_results.png" alt="SSD results on multiple datasets" width="800px"></p>

### Try the demo notebook
- Make sure you have [jupyter notebook](http://jupyter.readthedocs.io/en/latest/install.html) installed.
- Two alternatives for installing jupyter notebook:
  1. If you installed PyTorch with [conda](https://www.continuum.io/downloads) (recommended), then you should already have it. (Just navigate to the ssd.pytorch cloned repo and run): `jupyter notebook`
  2. If using [pip](https://pypi.python.org/pypi/pip):

  ```Shell
  # make sure pip is upgraded
  pip3 install --upgrade pip
  # install jupyter notebook
  pip install jupyter
  # Run this inside ssd.pytorch
  jupyter notebook
  ```

- Now navigate to `demo/demo.ipynb` at http://localhost:8888 (by default) and have at it!
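The pre-trained networks listed under Demos are distributed as `state_dicts`, so using one means building the model first and then loading the weights into it. The sketch below shows that round trip with standard PyTorch calls. `TinyNet` is a hypothetical stand-in for the repository's SSD model class, and an in-memory buffer stands in for a downloaded `.pth` file; neither comes from the repository.

```python
import io

import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in network; the real SSD definition lives in the repo."""

    def __init__(self) -> None:
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.head = nn.Linear(8, 4)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = torch.relu(self.conv(x)).mean(dim=(2, 3))  # global average pool
        return self.head(features)


# Save: a state_dict maps parameter names to weight tensors.
trained = TinyNet().eval()
buffer = io.BytesIO()  # assumption: replaces a path such as "weights/model.pth"
torch.save(trained.state_dict(), buffer)
buffer.seek(0)

# Load: build the same architecture, then copy the weights in.
restored = TinyNet()
state = torch.load(buffer, map_location="cpu")  # map_location lets GPU weights load on CPU
restored.load_state_dict(state)
restored.eval()  # switch to inference mode before running detections

x = torch.rand(1, 3, 16, 16)
with torch.no_grad():
    same = torch.allclose(trained(x), restored(x))
print(sorted(state.keys()), same)
```

`load_state_dict` raises an error when parameter names or shapes do not match, which is the usual symptom of pairing a weight file with a different model definition.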
### Try the webcam demo
- Works on CPU (may have to tweak `cv2.waitKey` for optimal fps) or on an NVIDIA GPU
- This demo currently requires opencv2+ w/ python bindings and an onboard webcam
  * You can change the default webcam in `demo/live.py`
- Install the [imutils](https://github.com/jrosebr1/imutils) package to leverage multi-threading on CPU:
  * `pip install imutils`
- Running `python -m demo.live` opens the webcam and begins detecting!

## TODO
We have accumulated the following to-do list, which we hope to complete in the near future
- Still to come:
  * [x] Support for the MS COCO dataset
  * [ ] Support for SSD512 training and testing
  * [ ] Support for training on custom datasets

## Authors

* [**Max deGroot**](https://github.com/amdegroot)
* [**Ellis Brown**](http://github.com/ellisbrown)

***Note:*** Unfortunately, this is just a hobby of ours and not a full-time job, so we'll do our best to keep things up to date, but no guarantees. That being said, thanks to everyone for your continued help and feedback as it is really appreciated. We will try to address everything as soon as possible.

## References
- Wei Liu, et al. "SSD: Single Shot MultiBox Detector." [ECCV2016](http://arxiv.org/abs/1512.02325).
- [Original Implementation (CAFFE)](https://github.com/weiliu89/caffe/tree/ssd)
- A huge thank you to [Alex Koltun](https://github.com/alexkoltun) and his team at [Webyclip](webyclip.com) for their help in finishing the data augmentation portion.
- A list of other great SSD ports that were sources of inspiration (especially the Chainer repo):
  * [Chainer](https://github.com/Hakuyume/chainer-ssd), [Keras](https://github.com/rykov8/ssd_keras), [MXNet](https://github.com/zhreshold/mxnet-ssd), [Tensorflow](https://github.com/balancap/SSD-Tensorflow)
