Real-time-GesRec: on EgoGesture, NvGesture, Jester, Kinetics and UCF (nvGesture dataset download and resources)

122 files in total: 55 py, 35 txt, 13 sh


Uploaded 2021-07-06 21:21:27, 19.55MB ZIP


Real-time-GesRec package contents (122 files):

kinetics-600_train.csv 14.29MB

kinetics-600_val.csv 1.04MB

.gitignore 1KB

results_Eff_3DCNNs.jpg 879KB

kinetics.json 46.28MB

val.json 17.14MB

egogestureall.json 6.68MB

egogesturebinary.json 6.47MB

egogestureall_but_None.json 3.47MB

ucf101_01.json 1.15MB

ucf101_02.json 1.15MB

ucf101_03.json 1.15MB

test.json 6KB

opts_det.json 2KB

opts_clf.json 2KB

opts.json 2KB

LICENSE 1KB

README.md 7KB

opts.py 18KB

spatial_transforms.py 16KB

online_test.py 15KB

model.py 13KB

online_test_video.py 12KB

online_test_wo_detector.py 12KB

egogesture_online.py 9KB

egogesture.py 9KB

speed_gpu.py 8KB

nv.py 8KB

nv_online.py 8KB

main.py 8KB

dataset.py 7KB

resnet.py 7KB

eval_kinetics.py 7KB

offline_test.py 7KB

kinetics.py 7KB

shufflenetv2.py 7KB

jester.py 7KB

resnext.py 7KB

ucf101.py 6KB

resnetl.py 6KB

eval_ucf101.py 6KB

squeezenet.py 6KB

inference.py 6KB

test_models.py 6KB

nv_prepare.py 6KB

shufflenet.py 6KB

mobilenetv2.py 5KB

utils.py 5KB

c3d.py 4KB

ego_prepare.py 4KB

temporal_transforms.py 4KB

mobilenet.py 3KB

egogesture_json.py 3KB

nv_json.py 3KB

train.py 3KB

count_hooks.py 3KB

calculate_FLOP.py 2KB

test.py 2KB

ucf101_json.py 2KB

kinetics_json.py 2KB

jester_json.py 2KB

validation.py 2KB

video_jpg_kinetics.py 1KB

utils.py 1KB

video_jpg_ucf101_hmdb51.py 1KB

n_frames_kinetics.py 1KB

video_jpg.py 1KB

n_frames_ucf101_hmdb51.py 981B

n_frames_jester.py 947B

video_accuracy.py 773B

mean.py 635B

target_transforms.py 446B

__init__.py 26B

run-all.sh 9KB

run-all-online-wo-detector.sh 2KB

run-all-online.sh 2KB

run-egogesture.sh 2KB

run-nvgesture.sh 2KB

run-kinetics.sh 1KB

run_online_egogesture.sh 1KB

run_online_nvgesture.sh 1KB

run_online.sh 1KB

run_online_video_egogesture.sh 948B

run_online_nvgesture_wo_detector.sh 915B

run_online_egogesture_wo_detector.sh 896B

run-jester.sh 475B

trainlistall.txt 1.08MB

trainlistbinary.txt 1.05MB

trainlist01.txt 1MB

trainlist.txt 1MB

trainlistall_but_None.txt 554KB

trainlist03.txt 389KB

trainlist02.txt 388KB

trainlist01.txt 386KB

testlistall.txt 381KB

testlistbinary.txt 372KB

vallistall.txt 365KB

vallistbinary.txt 357KB

testlistall_but_None.txt 191KB

vallistall_but_None.txt 183KB


# Real-time Hand Gesture Recognition with 3D CNNs

PyTorch implementation of the articles [Real-time Hand Gesture Detection and Classification Using Convolutional Neural Networks](https://arxiv.org/abs/1901.10323) and [Resource Efficient 3D Convolutional Neural Networks](https://arxiv.org/pdf/1904.02422.pdf), with code and pretrained models.

<div align="center" style="width:image width px;">
  <img src="https://media.giphy.com/media/9M3aPvPOVxSQmYGv8p/giphy.gif" width=500 alt="simulation results">
</div>

Figure: A real-time simulation of the architecture. The input video from the EgoGesture dataset is shown on the left, and the real-time (online) classification scores of each gesture are shown on the right, with each class drawn in a different color.

This code includes training, fine-tuning and testing on the EgoGesture and nvGesture datasets. Note that the code only includes ResNet-10, ResNetL-10, ResNeXt-101 and C3D v1; other versions can easily be added.

## Abstract

Real-time recognition of dynamic hand gestures from video streams is a challenging task since (i) there is no indication when a gesture starts and ends in the video, (ii) performed gestures should only be recognized once, and (iii) the entire architecture should be designed considering the memory and power budget. In this work, we address these challenges by proposing a hierarchical structure enabling offline-working convolutional neural network (CNN) architectures to operate online efficiently by using a sliding window approach. The proposed architecture consists of two models: (1) A detector which is a lightweight CNN architecture to detect gestures and (2) a classifier which is a deep CNN to classify the detected gestures. In order to evaluate the single-time activations of the detected gestures, we propose to use the Levenshtein distance as an evaluation metric since it can measure misclassifications, multiple detections, and missing detections at the same time.
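
To make the Levenshtein-distance metric above concrete, here is a minimal sketch that compares a ground-truth sequence of gesture labels with the sequence of labels emitted online. The function name and the example sequences are illustrative assumptions and are not taken from this repository's evaluation code.

```python
def levenshtein(reference, predicted):
    """Edit distance between two sequences of gesture labels."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j predicted labels.
    prev = list(range(len(predicted) + 1))
    for i, ref_label in enumerate(reference, start=1):
        curr = [i]
        for j, pred_label in enumerate(predicted, start=1):
            cost = 0 if ref_label == pred_label else 1
            curr.append(min(
                prev[j] + 1,         # missing detection (label only in reference)
                curr[j - 1] + 1,     # extra detection (label only in prediction)
                prev[j - 1] + cost,  # match, or misclassification if cost == 1
            ))
        prev = curr
    return prev[-1]


# Hypothetical example: gesture 3 is misclassified as 7,
# and gesture 6 is detected twice.
ground_truth = [1, 2, 3, 4, 5, 6, 7, 8, 9]
detected = [1, 2, 7, 4, 5, 6, 6, 7, 8, 9]
print(levenshtein(ground_truth, detected))  # prints 2
```

A single number therefore penalizes all three error types at once: one substitution and one extra detection give a distance of 2 in this example.
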
We evaluate our architecture on two publicly available datasets - EgoGesture and NVIDIA Dynamic Hand Gesture Datasets - which require temporal detection and classification of the performed hand gestures. ResNeXt-101 model, which is used as a classifier, achieves the state-of-the-art offline classification accuracy of 94.04% and 83.82% for depth modality on EgoGesture and NVIDIA benchmarks, respectively. In real-time detection and classification, we obtain considerable early detections while achieving performances close to offline operation. The codes and pretrained models used in this work are publicly available.

## Requirements

* [PyTorch](http://pytorch.org/)

```bash
conda install pytorch torchvision cuda80 -c soumith
```

* Python 3

### Pretrained models

[Pretrained_models_v1 (1.08GB)](https://drive.google.com/file/d/11MJWXmFnx9shbVtsaP1V8ak_kADg0r7D/view?usp=sharing): The best performing models in [paper](https://arxiv.org/abs/1901.10323)

[Pretrained_RGB_models_for_det_and_clf (371MB)(Google Drive)](https://drive.google.com/file/d/1V23zvjAKZr7FUOBLpgPZkpHGv8_D-cOs/view?usp=sharing)

[Pretrained_RGB_models_for_det_and_clf (371MB)(Baidu Netdisk)](https://pan.baidu.com/s/114WKw0lxLfWMZA6SYSSJlw) -code:p1va

[Pretrained_models_v2 (15.2GB)](https://drive.google.com/file/d/1rSWnzlOwGXjO_6C7U8eE6V43MlcnN6J_/view?usp=sharing): All models in [paper](https://ieeexplore.ieee.org/document/8982092) with efficient 3D-CNN Models

## Preparation

### EgoGesture

* Download videos by following [the official site](http://www.nlpr.ia.ac.cn/iva/yfzhang/datasets/egogesture.html).
* We use the extracted images, which are also provided by the dataset owners.

* Generate n_frames files using ```utils/ego_prepare.py```

The n_frames format is as follows: "path to the folder" "class index" "start frame" "end frame"

```bash
mkdir annotation_EgoGesture
python utils/ego_prepare.py training trainlistall.txt all
python utils/ego_prepare.py training trainlistall_but_None.txt all_but_None
python utils/ego_prepare.py training trainlistbinary.txt binary
python utils/ego_prepare.py validation vallistall.txt all
python utils/ego_prepare.py validation vallistall_but_None.txt all_but_None
python utils/ego_prepare.py validation vallistbinary.txt binary
python utils/ego_prepare.py testing testlistall.txt all
python utils/ego_prepare.py testing testlistall_but_None.txt all_but_None
python utils/ego_prepare.py testing testlistbinary.txt binary
```

* Generate the annotation file in JSON format, similar to ActivityNet, using ```utils/egogesture_json.py```

```bash
python utils/egogesture_json.py 'annotation_EgoGesture' all
python utils/egogesture_json.py 'annotation_EgoGesture' all_but_None
python utils/egogesture_json.py 'annotation_EgoGesture' binary
```

### nvGesture

* Download videos by following [the official site](https://research.nvidia.com/publication/online-detection-and-classification-dynamic-hand-gestures-recurrent-3d-convolutional).
* Generate n_frames files using ```utils/nv_prepare.py```

The n_frames format is as follows: "path to the folder" "class index" "start frame" "end frame"

```bash
mkdir annotation_nvGesture
python utils/nv_prepare.py training trainlistall.txt all
python utils/nv_prepare.py training trainlistall_but_None.txt all_but_None
python utils/nv_prepare.py training trainlistbinary.txt binary
python utils/nv_prepare.py validation vallistall.txt all
python utils/nv_prepare.py validation vallistall_but_None.txt all_but_None
python utils/nv_prepare.py validation vallistbinary.txt binary
```

* Generate the annotation file in JSON format, similar to ActivityNet, using ```utils/nv_json.py```

```bash
python utils/nv_json.py 'annotation_nvGesture' all
python utils/nv_json.py 'annotation_nvGesture' all_but_None
python utils/nv_json.py 'annotation_nvGesture' binary
```

### Jester

* Download videos by following [the official site](https://20bn.com/datasets/jester).

* The n_frames and class index files are already provided: annotation_Jester/{'classInd.txt', 'trainlist01.txt', 'vallist01.txt'}

The n_frames format is as follows: "path to the folder" "class index" "start frame" "end frame"

* Generate the annotation file in JSON format, similar to ActivityNet, using ```utils/jester_json.py```

```bash
python utils/jester_json.py 'annotation_Jester'
```

## Running the code

* Offline testing (offline_test.py) and training (main.py)

```bash
bash run_offline.sh
```

* Online testing

```bash
bash run_online.sh
```

## Citation

Please cite the following articles if you use this code or the pretrained models:

```bibtex
@article{kopuklu_real-time_2019,
  title = {Real-time Hand Gesture Detection and Classification Using Convolutional Neural Networks},
  url = {http://arxiv.org/abs/1901.10323},
  author = {Köpüklü, Okan and Gunduz, Ahmet and Kose, Neslihan and Rigoll, Gerhard},
  year={2019}
}
```

```bibtex
@article{kopuklu2020online,
  title={Online Dynamic Hand Gesture Recognition Including Efficiency Analysis},
  author={K{\"o}p{\"u}kl{\"u}, Okan and Gunduz, Ahmet and Kose, Neslihan and Rigoll, Gerhard},
  journal={IEEE Transactions on Biometrics, Behavior, and Identity Science},
  volume={2},
  number={2},
  pages={85--97},
  year={2020},
  publisher={IEEE}
}
```

## Acknowledgement

We thank Kensho Hara for releasing his [codebase](https://github.com/kenshohara/3D-ResNets-PyTorch), on top of which we build our work.
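
As a note on the n_frames line format described under Preparation ("path to the folder" "class index" "start frame" "end frame"), the sketch below shows one way such a line could be read back in Python. It assumes the four fields are separated by whitespace; the `Clip` type, the `parse_n_frames_line` helper and the sample line are hypothetical and are not part of this repository.

```python
from typing import NamedTuple


class Clip(NamedTuple):
    """One annotated gesture clip (hypothetical container type)."""
    folder: str
    class_index: int
    start_frame: int
    end_frame: int


def parse_n_frames_line(line: str) -> Clip:
    # Split from the right so that a folder path containing spaces
    # stays in one piece; the last three fields are integers.
    folder, class_index, start, end = line.strip().rsplit(maxsplit=3)
    return Clip(folder, int(class_index), int(start), int(end))


# Made-up sample line, for illustration only.
clip = parse_n_frames_line("some/video/folder 12 100 148")
print(clip.folder, clip.class_index)
print(clip.end_frame - clip.start_frame + 1)  # clip length if both ends are inclusive
```

Whether the end frame is inclusive is also an assumption here; check the generated list files before relying on the computed clip length.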