KAPAO是一种高效的单阶段人体姿势估计模型，它可以检测关键点和姿势作为对象，并融合检测结果来预测人体姿势.zip资源-CSDN文库

共51个文件

py：26个

yaml：8个

gif：6个

版权申诉

深度学习

pytorch

姿态估计

24 浏览量 2024-11-26 18:14:57 上传评论收藏 85.21MB ZIP 举报

资源推荐

资源详情

资源评论

收起资源包目录

KAPAO 是一种高效的单阶段人体姿势估计模型，它可以检测关键点和姿势作为对象，并融合检测结果来预测人体姿势。.zip （51个子文件）

标签.txt 63B

data

crowdpose.yaml 1KB

coco-kp.yaml 2KB

hyps

hyp.kp-p6.yaml 2KB

hyp.kp.yaml 2KB

scripts

get_coco_kp.sh 804B

download_models.py 697B

get_crowdpose.sh 584B

LICENSE 34KB

utils

loss.py 11KB

loggers

__init__.py 6KB

wandb

__init__.py 0B

sweep.yaml 2KB

log_dataset.py 891B

sweep.py 866B

README.md 10KB

wandb_utils.py 25KB

augmentations.py 13KB

metrics.py 13KB

autoanchor.py 7KB

general.py 34KB

activations.py 4KB

labels.py 4KB

downloads.py 6KB

plots.py 18KB

datasets.py 45KB

callbacks.py 6KB

torch_utils.py 14KB

res

nrchfeybHmw_480p_kapao_l_coco_gpu.gif 32.21MB

crowdpose_100024.jpg 110KB

2016-01-04_21-33-35_Depth_kapao_s_coco_gpu.gif 2.18MB

kapao_inference.gif 7.23MB

accuracy_latency.png 39KB

yBZ0Y2t0ceo_480p_kapao_s_coco_cpu.gif 7.82MB

squash_inference_kapao_s_coco.gif 16.2MB

2DiQUX11YaY_720p_kapao_s_coco_gpu.gif 19.92MB

val.py 15KB

资源内容.txt 957B

requirements.txt 841B

models

yolov5s6.yaml 2KB

common.py 19KB

yolov5l6.yaml 2KB

yolov5m6.yaml 2KB

experimental.py 4KB

yolo.py 15KB

.gitignore 4KB

demos

image.py 5KB

squash.py 10KB

video.py 10KB

train.py 30KB

README.md 10KB

# KAPAO (Keypoints and Poses as Objects) [Accepted to ECCV 2022](https://arxiv.org/abs/2111.08557) KAPAO is an efficient single-stage multi-person human pose estimation method that models **k**eypoints **a**nd **p**oses **a**s **o**bjects within a dense anchor-based detection framework. KAPAO simultaneously detects _pose objects_ and _keypoint objects_ and fuses the detections to predict human poses: ![alt text](./res/kapao_inference.gif) When not using test-time augmentation (TTA), KAPAO is much faster and more accurate than previous single-stage methods like [DEKR](https://github.com/HRNet/DEKR), [HigherHRNet](https://github.com/HRNet/HigherHRNet-Human-Pose-Estimation), [HigherHRNet + SWAHR](https://github.com/greatlog/SWAHR-HumanPose), and [CenterGroup](https://github.com/dvl-tum/center-group): ![alt text](./res/accuracy_latency.png) This repository contains the official PyTorch implementation for the paper: Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation. Our code was forked from ultralytics/yolov5 at commit [5487451](https://github.com/ultralytics/yolov5/tree/5487451). ### Setup 1. If you haven't already, [install Anaconda or Miniconda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html). 2. Create a new conda environment with Python 3.6: `$ conda create -n kapao python=3.6`. 3. Activate the environment: `$ conda activate kapao` 4. Clone this repo: `$ git clone https://github.com/wmcnally/kapao.git` 5. Install the dependencies: `$ cd kapao && pip install -r requirements.txt` 6. Download the trained models: `$ python data/scripts/download_models.py` ## Inference Demos **Note:** FPS calculations include **all processing** (i.e., including image loading, resizing, inference, plotting / tracking, etc.). See script arguments for inference options. --- #### Static Image To generate the four images in the GIF above: 1. `$ python demos/image.py --bbox` 2. `$ python demos/image.py --bbox --pose --face --no-kp-dets` 3. `$ python demos/image.py --bbox --pose --face --no-kp-dets --kp-bbox` 4. `$ python demos/image.py --pose --face` #### Shuffling Video KAPAO runs fastest on low resolution video with few people in the frame. This demo runs KAPAO-S on a single-person 480p dance video using an input size of 1024. The inference speed is **~9.5 FPS** on our CPU, and **~60 FPS** on our TITAN Xp. **CPU inference:** ![alt text](./res/yBZ0Y2t0ceo_480p_kapao_s_coco_cpu.gif) To display the results in real-time: `$ python demos/video.py --face --display` To create the GIF above: `$ python demos/video.py --face --device cpu --gif` **CPU specs:** Intel Core i7-8700K 16GB DDR4 3000MHz Samsung 970 Pro M.2 NVMe SSD --- #### Flash Mob Video This demo runs KAPAO-S on a 720p flash mob video using an input size of 1280. **GPU inference:** ![alt text](./res/2DiQUX11YaY_720p_kapao_s_coco_gpu.gif) To display the results in real-time: `$ python demos/video.py --yt-id 2DiQUX11YaY --tag 136 --imgsz 1280 --color 255 0 255 --start 188 --end 196 --display` To create the GIF above: `$ python demos/video.py --yt-id 2DiQUX11YaY --tag 136 --imgsz 1280 --color 255 0 255 --start 188 --end 196 --gif` --- #### Red Light Green Light This demo runs KAPAO-L on a 480p clip from the TV show _Squid Game_ using an input size of 1024. The plotted poses constitute keypoint objects only. **GPU inference:** ![alt text](./res/nrchfeybHmw_480p_kapao_l_coco_gpu.gif) To display the results in real-time: `$ python demos/video.py --yt-id nrchfeybHmw --imgsz 1024 --weights kapao_l_coco.pt --conf-thres-kp 0.01 --kp-obj --face --start 56 --end 72 --display` To create the GIF above: `$ python demos/video.py --yt-id nrchfeybHmw --imgsz 1024 --weights kapao_l_coco.pt --conf-thres-kp 0.01 --kp-obj --face --start 56 --end 72 --gif` --- #### Squash Video This demo runs KAPAO-S on a 1080p slow motion squash video. It uses a simple player tracking algorithm based on the frame-to-frame pose differences. **GPU inference:** ![alt text](./res/squash_inference_kapao_s_coco.gif) To display the inference results in real-time: `$ python demos/squash.py --display --fps` To create the GIF above: `$ python demos/squash.py --start 42 --end 50 --gif --fps` --- #### Depth Video Pose objects generalize well and can even be detected in depth video. Here KAPAO-S was run on a depth video from a [fencing action recognition dataset](https://ieeexplore.ieee.org/abstract/document/8076041?casa_token=Zvm7dLIr1rYAAAAA:KrqtVl3NXrJZn05Eb4KGMio-18VPHc3uyDJZSiNJyI7f7oHQ5V2iwB7bK4mCJCmN83NrRl4P). ![alt text](./res/2016-01-04_21-33-35_Depth_kapao_s_coco_gpu.gif) The depth video above can be downloaded directly from [here](https://drive.google.com/file/d/1n4so5WN6snyCYxeUk4xX1glADqQuitXP/view?usp=sharing). To create the GIF above: `$ python demos/video.py -p 2016-01-04_21-33-35_Depth.avi --face --start 0 --end -1 --gif --gif-size 480 360` --- #### Web Demo A web demo was integrated to [Huggingface Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio) (credit to [@AK391](https://github.com/AK391)). It uses KAPAO-S to run CPU inference on short video clips. ## COCO Experiments Download the COCO dataset: `$ sh data/scripts/get_coco_kp.sh` ### Validation (without TTA) - KAPAO-S (63.0 AP): `$ python val.py --rect` - KAPAO-M (68.5 AP): `$ python val.py --rect --weights kapao_m_coco.pt` - KAPAO-L (70.6 AP): `$ python val.py --rect --weights kapao_l_coco.pt` ### Validation (with TTA) - KAPAO-S (64.3 AP): `$ python val.py --scales 0.8 1 1.2 --flips -1 3 -1` - KAPAO-M (69.6 AP): `$ python val.py --weights kapao_m_coco.pt \ ` `--scales 0.8 1 1.2 --flips -1 3 -1` - KAPAO-L (71.6 AP): `$ python val.py --weights kapao_l_coco.pt \ ` `--scales 0.8 1 1.2 --flips -1 3 -1` ### Testing - KAPAO-S (63.8 AP): `$ python val.py --scales 0.8 1 1.2 --flips -1 3 -1 --task test` - KAPAO-M (68.8 AP): `$ python val.py --weights kapao_m_coco.pt \ ` `--scales 0.8 1 1.2 --flips -1 3 -1 --task test` - KAPAO-L (70.3 AP): `$ python val.py --weights kapao_l_coco.pt \ ` `--scales 0.8 1 1.2 --flips -1 3 -1 --task test` ### Training The following commands were used to train the KAPAO models on 4 V100s with 32GB memory each. KAPAO-S: ``` python -m torch.distributed.launch --nproc_per_node 4 train.py \ --img 1280 \ --batch 128 \ --epochs 500 \ --data data/coco-kp.yaml \ --hyp data/hyps/hyp.kp-p6.yaml \ --val-scales 1 \ --val-flips -1 \ --weights yolov5s6.pt \ --project runs/s_e500 \ --name train \ --workers 128 ``` KAPAO-M: ``` python train.py \ --img 1280 \ --batch 72 \ --epochs 500 \ --data data/coco-kp.yaml \ --hyp data/hyps/hyp.kp-p6.yaml \ --val-scales 1 \ --val-flips -1 \ --weights yolov5m6.pt \ --project runs/m_e500 \ --name train \ --workers 128 ``` KAPAO-L: ``` python train.py \ --img 1280 \ --batch 48 \ --epochs 500 \ --data data/coco-kp.yaml \ --hyp data/hyps/hyp.kp-p6.yaml \ --val-scales 1 \ --val-flips -1 \ --weights yolov5l6.pt \ --project runs/l_e500 \ --name train \ --workers 128 ``` **Note:** [DDP](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html) is usually recommended but we found training was less stable for KAPAO-M/L using DDP. We are investigating this issue. ## CrowdPose Experiments - Install the [CrowdPose API](https://github.com/Jeff-sjtu/CrowdPose/tree/master/crowdpose-api) to your conda environment: `$ cd .. && git clone https://github.com/Jeff-sjtu/CrowdPose.git` `$ cd CrowdPose/crowdpose-api/PythonAPI && sh install.sh && cd ../../../kapao` - Download the CrowdPose dataset: `$ sh data/scripts/get_crowdpose.sh` ### Testing - KAPAO-S (63.8 AP): `$ python val.py --data crowdpose.yaml \ ` `--weights kapao_s_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1` - KAPAO-M (67.1 AP): `$ python val.py --data crowdpose.yaml \ ` `--weights kapa

评论收藏

内容反馈

版权申诉