MMAction2: a video action recognition framework based on OpenMMLab

1204 files in total

py: 716

md: 155

sh: 101

Action recognition

Video analysis

Uploaded 2024-10-31 10:38:14, 78.98MB, 7Z


Archive directory: MMAction2, a video action recognition framework based on OpenMMLab (1204 files)

S001C001P001R001A001_rgb.avi 964KB

test.avi 288KB

argparse.bash 3KB

make.bat 760B

CITATION.cff 297B

setup.cfg 646B

docutils.conf 43B

config 264B

readthedocs.css 1KB

v_test1.csv 31KB

v_test2.csv 30KB

action_name.csv 3KB

pred.csv 1KB

v_test2.csv 857B

v_test1.csv 857B

multisports_sample.csv 669B

gt.csv 540B

ava_sample.csv 368B

v_test1.csv 355B

v_test2.csv 256B

ava_excluded_timestamps_sample.csv 34B

description 73B

Dockerfile 1KB

Dockerfile 527B

exclude 240B

mmaction2_overview.gif 1.62MB

spatio-temporal-det.gif 1.24MB

.gitignore 2KB

HEAD 183B

HEAD 30B

HEAD 21B

404.html 521B

404.html 498B

pack-5693b8dfbf18b0f2a52542fbd44503a9d1eb259c.idx 626KB

MANIFEST.in 177B

index 139KB

demo_stad.ipynb 259KB

demo_stad_zh_CN.ipynb 259KB

mmaction2_tutorial.ipynb 114KB

merge_pretrain.ipynb 8KB

demo.ipynb 3KB

zhihu_qrcode.jpg 388KB

miaomiao_qrcode.jpg 220KB

qq_group_qrcode.jpg 200KB

hand_det_out.jpg 46KB

hand_det.jpg 21KB

img_00010.jpg 20KB

img_00008.jpg 20KB

img_00009.jpg 20KB

img_00007.jpg 20KB

img_00006.jpg 20KB

img_00004.jpg 20KB

img_00005.jpg 20KB

img_00003.jpg 20KB

img_00002.jpg 20KB

img_00001.jpg 19KB

test.jpg 18KB

x_00003.jpg 5KB

y_00002.jpg 5KB

y_00004.jpg 5KB

x_00002.jpg 5KB

y_00003.jpg 5KB

x_00004.jpg 5KB

x_00005.jpg 5KB

y_00005.jpg 5KB

y_00001.jpg 4KB

x_00001.jpg 4KB

custom.js 528B

custom.js 221B

label_map.json 45KB

result.json 3KB

map_k700.json 3KB

map_k600.json 3KB

map_k400.json 2KB

gt.json 1KB

action_test_anno.json 851B

hvu_frame_test_anno.json 453B

hvu_video_test_anno.json 413B

hvu_video_eval_test_anno.json 249B

video_text_test_list.json 121B

rawvideo_test_anno.json 111B

LICENSE 11KB

main 183B

main 41B

Makefile 634B

changelog.md 67KB

README.md 45KB

config.md 41KB

config.md 37KB

README.md 33KB

README_zh-CN.md 32KB

guide_to_framework.md 26KB

README_zh-CN.md 26KB


# Demo

## Outline

- [Modify configs through script arguments](#modify-configs-through-script-arguments): Tricks to directly modify configs through script arguments.
- [Video demo](#video-demo): A demo script to predict the recognition result using a single video.
- [Video GradCAM Demo](#video-gradcam-demo): A demo script to visualize GradCAM results using a single video.
- [Webcam demo](#webcam-demo): A demo script to implement real-time action recognition from a web camera.
- [Long Video demo](#long-video-demo): A demo script to predict different labels using a single long video.
- [Skeleton-based Action Recognition Demo](#skeleton-based-action-recognition-demo): A demo script to predict the skeleton-based action recognition result using a single video.
- [SpatioTemporal Action Detection Webcam Demo](#spatiotemporal-action-detection-webcam-demo): A demo script to implement real-time spatio-temporal action detection from a web camera.
- [SpatioTemporal Action Detection Video Demo](#spatiotemporal-action-detection-video-demo): A demo script to predict the spatio-temporal action detection result using a single video.
- [SpatioTemporal Action Detection ONNX Video Demo](#spatiotemporal-action-detection-onnx-video-demo): A demo script to predict the spatio-temporal action detection result using an ONNX file instead of building the PyTorch models.
- [Inferencer Demo](#inferencer): A demo script to implement fast prediction for video analysis tasks based on the unified inferencer interface.
- [Audio Demo](#audio-demo): A demo script to predict the recognition result using a single audio file.
- [Video Structuralize Demo](#video-structuralize-demo): A demo script to predict the skeleton-based and RGB-based action recognition and spatio-temporal action detection result using a single video.

## Modify configs through script arguments

When running demos using our provided scripts, you may specify `--cfg-options` to modify the config in place.

- Update config keys of dict.

  The config options can be specified following the order of the dict keys in the original config.
  For example, `--cfg-options model.backbone.norm_eval=False` changes all BN modules in the model backbones to `train` mode.

- Update keys inside a list of configs.

  Some config dicts are composed as a list in your config. For example, the training pipeline `train_dataloader.dataset.pipeline` is normally a list,
  e.g. `[dict(type='SampleFrames'), ...]`. If you want to change `'SampleFrames'` to `'DenseSampleFrames'` in the pipeline,
  you may specify `--cfg-options train_dataloader.dataset.pipeline.0.type=DenseSampleFrames`.

- Update values of list/tuples.

  The value to be updated may be a list or a tuple. For example, the config file normally sets `workflow=[('train', 1)]`. If you want to
  change this key, you may specify `--cfg-options workflow="[(train,1),(val,1)]"`. Note that the quotation mark `"` is necessary to
  support list/tuple data types, and that **NO** white space is allowed inside the quotation marks in the specified value.

## Video demo

MMAction2 provides a demo script to predict the recognition result using a single video. To get prediction results in the range `[0, 1]`, make sure to set `model['test_cfg'] = dict(average_clips='prob')` in the config file.

```shell
python demo/demo.py ${CONFIG_FILE} ${CHECKPOINT_FILE} ${VIDEO_FILE} ${LABEL_FILE} \
    [--device ${DEVICE_TYPE}] [--fps ${FPS}] [--font-scale ${FONT_SCALE}] [--font-color ${FONT_COLOR}] \
    [--target-resolution ${TARGET_RESOLUTION}] [--out-filename ${OUT_FILE}]
```

Optional arguments:

- `--use-frames`: If specified, the demo will take rawframes as input. Otherwise, it will take a video as input.
- `DEVICE_TYPE`: Type of device to run the demo. Allowed values are a cuda device like `cuda:0`, or `cpu`. If not specified, it will be set to `cuda:0`.
- `FPS`: FPS value of the output video when using rawframes as input. If not specified, it will be set to 30.
- `FONT_SCALE`: Font scale of the text added in the video. If not specified, it will be None.
- `FONT_COLOR`: Font color of the text added in the video. If not specified, it will be `white`.
- `TARGET_RESOLUTION`: Resolution (desired_width, desired_height) for resizing the frames before output when using a video as input. If not specified, it will be None and the frames are resized by keeping the existing aspect ratio.
- `OUT_FILE`: Path to the output file, which can be a video format or gif format. If not specified, it will be set to `None` and no output file is generated.

Examples:

Assume that you are located at `$MMACTION2` and have already downloaded the checkpoints to the directory `checkpoints/`,
or use a checkpoint URL from `configs/` to directly load the corresponding checkpoint, which will be automatically saved in `$HOME/.cache/torch/checkpoints`.

1. Recognize a video file as input by using a TSN model on cuda by default.

    ```shell
    # The demo.mp4 and label_map_k400.txt are both from Kinetics-400
    python demo/demo.py demo/demo_configs/tsn_r50_1x1x8_video_infer.py \
        checkpoints/tsn_r50_8xb32-1x1x8-100e_kinetics400-rgb_20220818-2692d16c.pth \
        demo/demo.mp4 tools/data/kinetics/label_map_k400.txt
    ```

2. Recognize a video file as input by using a TSN model on cuda by default, loading the checkpoint from a URL.

    ```shell
    # The demo.mp4 and label_map_k400.txt are both from Kinetics-400
    python demo/demo.py demo/demo_configs/tsn_r50_1x1x8_video_infer.py \
        https://download.openmmlab.com/mmaction/v1.0/recognition/tsn/tsn_r50_8xb32-1x1x8-100e_kinetics400-rgb/tsn_r50_8xb32-1x1x8-100e_kinetics400-rgb_20220818-2692d16c.pth \
        demo/demo.mp4 tools/data/kinetics/label_map_k400.txt
    ```

3. Recognize a video file as input by using a TSN model and then generate an mp4 file.

    ```shell
    # The demo.mp4 and label_map_k400.txt are both from Kinetics-400
    python demo/demo.py demo/demo_configs/tsn_r50_1x1x8_video_infer.py \
        checkpoints/tsn_r50_8xb32-1x1x8-100e_kinetics400-rgb_20220818-2692d16c.pth \
        demo/demo.mp4 tools/data/kinetics/label_map_k400.txt --out-filename demo/demo_out.mp4
    ```

## Video GradCAM Demo

MMAction2 provides a demo script to visualize GradCAM results using a single video.

```shell
python tools/visualizations/vis_cam.py ${CONFIG_FILE} ${CHECKPOINT_FILE} ${VIDEO_FILE} [--use-frames] \
    [--device ${DEVICE_TYPE}] [--target-layer-name ${TARGET_LAYER_NAME}] [--fps ${FPS}] \
    [--target-resolution ${TARGET_RESOLUTION}] [--resize-algorithm ${RESIZE_ALGORITHM}] [--out-filename ${OUT_FILE}]
```

- `--use-frames`: If specified, the demo will take rawframes as input. Otherwise, it will take a video as input.
- `DEVICE_TYPE`: Type of device to run the demo. Allowed values are a cuda device like `cuda:0`, or `cpu`. If not specified, it will be set to `cuda:0`.
- `FPS`: FPS value of the output video when using rawframes as input. If not specified, it will be set to 30.
- `OUT_FILE`: Path to the output file, which can be a video format or gif format. If not specified, it will be set to `None` and no output file is generated.
- `TARGET_LAYER_NAME`: Layer name to generate the GradCAM localization map.
- `TARGET_RESOLUTION`: Resolution (desired_width, desired_height) for resizing the frames before output when using a video as input. If not specified, it will be None and the frames are resized by keeping the existing aspect ratio.
- `RESIZE_ALGORITHM`: Resize algorithm used for resizing. If not specified, it will be set to `bilinear`.

Examples:

Assume that you are located at `$MMACTION2` and have already downloaded the checkpoints to the directory `checkpoints/`,
or use a checkpoint URL from `configs/` to directly load the corresponding checkpoint, which will be automatically saved in `$HOME/.cache/torch/checkpoints`.

1. Get GradCAM results of an I3D model, using a video file as input, and then generate a gif file at 10 fps.

    ```shell
    python tools/visualizations/vis_cam.py demo/demo_configs/i3d_r50_32x2x1_vide
    ```
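
The dotted keys described under "Modify configs through script arguments" can be pictured as a path walk through nested dicts and lists, where a numeric part such as `0` selects a list element. The following is a minimal sketch in plain Python, not the framework's own implementation: `apply_override` and the toy `cfg` are hypothetical names introduced only for illustration, and the parsing of `key=value` strings from the command line is left out.

```python
from copy import deepcopy


def apply_override(cfg, dotted_key, value):
    """Set a value in a nested config by following a dotted key.

    Hypothetical helper for illustration only. Parts of the key are
    treated as list indices when the current node is a list, and as
    dict keys otherwise.
    """
    parts = dotted_key.split(".")
    node = cfg
    for part in parts[:-1]:
        node = node[int(part)] if isinstance(node, list) else node[part]
    last = parts[-1]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value
    return cfg


# Toy config (an assumption, shaped like the examples in the text).
cfg = {
    "model": {"backbone": {"norm_eval": True}},
    "train_dataloader": {
        "dataset": {"pipeline": [{"type": "SampleFrames"}, {"type": "RawFrameDecode"}]}
    },
}

# Work on a copy so the original config stays untouched.
updated = deepcopy(cfg)
apply_override(updated, "model.backbone.norm_eval", False)
apply_override(updated, "train_dataloader.dataset.pipeline.0.type", "DenseSampleFrames")

print(updated["model"]["backbone"]["norm_eval"])  # False
print(updated["train_dataloader"]["dataset"]["pipeline"][0]["type"])  # DenseSampleFrames
```

The sketch only covers the first two cases from the list (dict keys and list elements); the quoted list/tuple form needs an extra parsing step for the value string, which is why the text insists on the quotation marks and on leaving out white space.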
