# Fast and accurate Human Pose Estimation using ShelfNet with PyTorch
This repository is the result of my curiosity to find out whether **ShelfNet** is an efficient CNN architecture for computer vision tasks other than semantic segmentation, and more specifically for human pose estimation. The answer is a clear yes: ShelfNet50 reaches **74.6 mAP** at **127 FPS** on the MS COCO Keypoints dataset, which is roughly a 3.5x boost in FPS compared to **HRNet** for similar accuracy.
![ShelfNet Keypoints Demo](assets/anim1.gif)
This repository includes:
* Source code of ShelfNet modified from the authors' [repository](https://github.com/juntang-zhuang/ShelfNet/tree/pascal)
* Code to prepare the MS COCO keypoints dataset
* Training and evaluation code for MS COCO keypoints modified from the HRNet authors' [repository](https://github.com/HRNet/HRNet-Human-Pose-Estimation)
* Pre-trained weights for ShelfNet50
If you use it in your projects, please consider citing this repository (bibtex below).
## ShelfNet Architecture Overview
The ShelfNet architecture was introduced by J. Zhuang, J. Yang, L. Gu and N. Dvornek through a paper available on [arXiv](https://arxiv.org/abs/1811.11254). The paper evaluates the network only on the semantic segmentation task. The authors' contribution is to have created a fast architecture with a performance similar to the state of the art (PSPNet & EncNet at the time of publishing this repository) on **PASCAL VOC** and better performance on **Cityscapes**. Therefore, ShelfNet is presently one of the most suitable architectures for real-world applications with resource constraints.
![ShelfNet Architecture](assets/ShelfNet_Architecture.jpg)
As depicted above, ShelfNet uses a ResNet backbone combined with 2 encoder/decoder branches. The first encoder (in green?) reduces channel complexity by a factor of 4 for faster inference. The S-block is a residual block with shared weights, which significantly reduces the number of parameters. The network uses strided convolutions for down-sampling and transpose convolutions for up-sampling. The structure can be seen as an ensemble of [FCNs](https://github.com/fmahoudeau/fcn) in which information flows through many different paths, resulting in increased accuracy.
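To make the shared-weight idea concrete, here is a minimal PyTorch sketch of a residual block that applies one 3x3 convolution twice, followed by a strided convolution and a transpose convolution for down- and up-sampling. It is an illustration only, not the code in `shelfnet.py`: the class name `SharedWeightBlock`, the separate batch-norm layers, the dropout rate and all channel counts are assumptions chosen for the example.

```python
import torch
import torch.nn as nn


class SharedWeightBlock(nn.Module):
    """Residual block whose two 3x3 convolutions reuse a single weight tensor.

    Hypothetical sketch: layer choices are assumptions, not the repository's code.
    """

    def __init__(self, channels: int, dropout: float = 0.25):
        super().__init__()
        # A single convolution module, called twice in forward(): this is the sharing.
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        # Assumption: normalisation statistics are kept separate for each pass.
        self.bn1 = nn.BatchNorm2d(channels)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()
        self.drop = nn.Dropout2d(p=dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv(x)))
        out = self.drop(out)
        out = self.bn2(self.conv(out))  # second pass through the same weights
        return self.relu(out + x)       # residual connection


def count_params(module: nn.Module) -> int:
    """Number of learnable scalars in a module."""
    return sum(p.numel() for p in module.parameters())


block = SharedWeightBlock(64)
x = torch.randn(2, 64, 32, 24)
y = block(x)  # spatial size and channel count are preserved

# Strided convolution halves the resolution; transpose convolution restores it.
down = nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1)
up = nn.ConvTranspose2d(128, 64, kernel_size=3, stride=2, padding=1, output_padding=1)
restored = up(down(y))

shared_params = count_params(block)
# The same block with two independent convolutions would hold one more 3x3 kernel.
unshared_params = shared_params + 64 * 64 * 3 * 3
```

With 64 channels, the shared block stores one 64x64x3x3 kernel instead of two, so its convolution parameters are halved while the block still performs two convolution passes.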
## Results on Microsoft COCO KeyPoints
This section reports test results for ShelfNet50 on the [MS COCO KeyPoints](http://cocodataset.org/#keypoints-2019) dataset and compares them with the state-of-the-art HRNet. All experiments use the same person detector, which has an AP of 56.4 on the COCO val2017 dataset. You can find the download link in the HRNet [repository](https://github.com/HRNet/HRNet-Human-Pose-Estimation). A single Titan RTX with 24 GB of memory was used for the ShelfNet50 experiments. The batch size is 128 for an input size of 256x192 and 72 for 384x288.
| Architecture | Input size | Parameters | AP | AR | Memory size | FPS |
|-------------------------|-------------|-------------|---------|---------|-------------|---------|
| pose_hrnet_w32 | 256x192 | 28.5M | 0.744 | 0.798 | 931 MB | 37.4 |
| pose_hrnet_w32 | 384x288 | 28.5M | 0.758 | 0.809 | 957 MB | 37.6 |
| pose_hrnet_w48 | 256x192 | 63.6M | 0.751 | 0.804 | 1083 MB | 37.7 |
| pose_hrnet_w48 | 384x288 | 63.6M | **0.763** | 0.812 | 1103 MB | 36.7 |
| **shelfnet_50** | 256x192 | 38.7M | 0.725 | 0.782 | 1013 MB | 127.3 |
| **shelfnet_50** | 384x288 | 38.7M | 0.746 | 0.797 | 1033 MB | **127.7** |
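The speed-up can be checked directly from the FPS column. The short script below only restates numbers from the table; the dictionary layout and the `speedup` helper are illustrative names, not part of the repository.

```python
# (architecture, input size) -> FPS, copied from the results table above
fps = {
    ("pose_hrnet_w32", "256x192"): 37.4,
    ("pose_hrnet_w32", "384x288"): 37.6,
    ("pose_hrnet_w48", "256x192"): 37.7,
    ("pose_hrnet_w48", "384x288"): 36.7,
    ("shelfnet_50", "256x192"): 127.3,
    ("shelfnet_50", "384x288"): 127.7,
}


def speedup(fast: str, slow: str, size: str) -> float:
    """FPS ratio of two architectures at the same input size."""
    return fps[(fast, size)] / fps[(slow, size)]


ratios = {}
for size in ("256x192", "384x288"):
    for baseline in ("pose_hrnet_w32", "pose_hrnet_w48"):
        ratios[(baseline, size)] = speedup("shelfnet_50", baseline, size)
        print(f"shelfnet_50 vs {baseline} @ {size}: {ratios[(baseline, size)]:.2f}x")
```

Every ratio falls between about 3.4x and 3.5x, while the AP gap to the HRNet variant at the same input size is between 0.012 and 0.026.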
## Training on Your Own
I'm providing pre-trained weights for ShelfNet50 to make it easier to start. The test accuracies are obtained without providing the ground truth bounding boxes.
| Model | AP |
|--------------------------------------------------------------------------------------|---------|
| [ShelfNet50_256x192](https://1drv.ms/u/s!AvyZUg7UPo_CgdN2S7I54mQD_bglow?e=ENRfVH) | 0.725 |
| [ShelfNet50_384x288](https://1drv.ms/u/s!AvyZUg7UPo_CgdN3kXRSo4PrHcf8RQ?e=IscuxG) | 0.746 |
You can train and evaluate directly from the command line as follows:
```
# Train ShelfNet on COCO
python train.py --cfg coco/shelfnet/shelfnet50_384x288_adam_lr1e-3.yaml
```
```
# Test ShelfNet on COCO
python test.py --cfg coco/shelfnet/shelfnet50_384x288_adam_lr1e-3.yaml TEST.MODEL_FILE ../output/coco/shelfnet/shelf_384x288_adam_lr1e-3/model_best.pth TEST.USE_GT_BBOX False
| Arch | AP | AP .5 | AP .75| AP (M)| AP (L)| AR | AR .5 | AR .75| AR (M)| AR (L)|
|------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| shelfnet | 0.746 | 0.901 | 0.814 | 0.706 | 0.818 | 0.797 | 0.938 | 0.858 | 0.752 | 0.862 |
```
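The test command passes trailing `KEY VALUE` pairs such as `TEST.USE_GT_BBOX False` after the `--cfg` file to override individual settings. The sketch below illustrates that pattern with the standard library only. It is not the repository's parser, which presumably delegates this to yacs; the `DEFAULTS` dictionary and the `parse_value` and `apply_overrides` helpers are hypothetical.

```python
import argparse
import ast
import copy

# Hypothetical defaults; the real ones would live in the repository's config module.
DEFAULTS = {"TEST": {"MODEL_FILE": "", "USE_GT_BBOX": True}}


def parse_value(text: str):
    """Turn 'False' or '128' into Python literals; keep anything else as a string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_overrides(config: dict, opts: list) -> dict:
    """Return a copy of config with dotted KEY VALUE pairs applied."""
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    merged = copy.deepcopy(config)
    for dotted_key, raw in zip(opts[0::2], opts[1::2]):
        *parents, leaf = dotted_key.split(".")
        node = merged
        for name in parents:
            node = node[name]  # KeyError if the section does not exist
        if leaf not in node:
            raise KeyError(dotted_key)  # reject keys that have no default
        node[leaf] = parse_value(raw)
    return merged


parser = argparse.ArgumentParser()
parser.add_argument("--cfg", required=True)
# Everything after the known options is collected as override pairs.
parser.add_argument("opts", nargs=argparse.REMAINDER)

args = parser.parse_args([
    "--cfg", "coco/shelfnet/shelfnet50_384x288_adam_lr1e-3.yaml",
    "TEST.MODEL_FILE", "../output/coco/shelfnet/shelf_384x288_adam_lr1e-3/model_best.pth",
    "TEST.USE_GT_BBOX", "False",
])
cfg = apply_overrides(DEFAULTS, args.opts)
```

Rejecting unknown keys means a typo in an override fails loudly instead of silently evaluating with the default setting.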
## Requirements
Python 3.7, PyTorch 1.3.1 or greater, requests, tqdm, yacs, json_tricks, and pycocotools.
Unlike the original ShelfNet repository, this repository does not depend on torch-encoding.