# Real-time Hand Gesture Recognition with 3D CNNs
PyTorch implementation of the articles [Real-time Hand Gesture Detection and Classification Using Convolutional Neural Networks](https://arxiv.org/abs/1901.10323) and [Resource Efficient 3D Convolutional Neural Networks](https://arxiv.org/pdf/1904.02422.pdf), with code and pretrained models.
<div align="center" style="width:image width px;">
<img src="https://media.giphy.com/media/9M3aPvPOVxSQmYGv8p/giphy.gif" width=500 alt="simulation results">
</div>
Figure: A real-time simulation of the architecture. The input video from the EgoGesture dataset is shown on the left, and the real-time (online) classification scores of each gesture are shown on the right, where each class is drawn in a different color.
This code includes training, fine-tuning and testing on the EgoGesture and nvGesture datasets.
Note that the code only includes ResNet-10, ResNetL-10, ResNeXt-101 and C3D v1; other versions can easily be added.
## Abstract
Real-time recognition of dynamic hand gestures from video streams is a challenging task since (i)
there is no indication when a gesture starts and ends in the video, (ii) performed gestures should
only be recognized once, and (iii) the entire architecture should be designed considering the memory
and power budget. In this work, we address these challenges by proposing a hierarchical structure
enabling offline-working convolutional neural network (CNN) architectures to operate online efficiently
by using sliding window approach. The proposed architecture consists of two models: (1) A detector which
is a lightweight CNN architecture to detect gestures and (2) a classifier which is a deep CNN to classify
the detected gestures. In order to evaluate the single-time activations of the detected gestures, we propose
to use the Levenshtein distance as an evaluation metric since it can measure misclassifications, multiple detections,
and missing detections at the same time. We evaluate our architecture on two publicly available datasets - EgoGesture
and NVIDIA Dynamic Hand Gesture Datasets - which require temporal detection and classification of the performed hand gestures.
ResNeXt-101 model, which is used as a classifier, achieves the state-of-the-art offline classification accuracy of 94.04% and
83.82% for depth modality on EgoGesture and NVIDIA benchmarks, respectively. In real-time detection and classification,
we obtain considerable early detections while achieving performances close to offline operation. The codes and pretrained models used in this work are publicly available.
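
The Levenshtein-based evaluation mentioned above can be illustrated with a minimal sketch. It compares the sequence of gestures recognized in a video with the ground-truth sequence; a substitution corresponds to a misclassification, a deletion to an extra (multiple) detection, and an insertion to a missing detection. The function names and the normalisation in `levenshtein_accuracy` are assumptions made for illustration and are not taken from this repository.

```python
def levenshtein(predicted, target):
    """Edit distance between two sequences of gesture class labels."""
    # previous[j] is the distance between predicted[:i-1] and target[:j].
    previous = list(range(len(target) + 1))
    for i, p in enumerate(predicted, start=1):
        current = [i]
        for j, t in enumerate(target, start=1):
            substitution = previous[j - 1] + (p != t)  # misclassification
            insertion = current[j - 1] + 1             # missing detection
            deletion = previous[j] + 1                 # extra detection
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def levenshtein_accuracy(predicted, target):
    """Assumed normalisation: 1 - distance / number of true gestures."""
    if not target:
        raise ValueError("target sequence must not be empty")
    return 1.0 - levenshtein(predicted, target) / len(target)


if __name__ == "__main__":
    # Illustrative label sequences, not real results.
    target = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    predicted = [1, 2, 7, 4, 5, 6, 6, 7, 8, 9]  # one wrong class, one duplicate
    print(levenshtein(predicted, target))
    print(round(levenshtein_accuracy(predicted, target), 4))
```

In this toy example one gesture is misclassified and one is detected twice, so the distance counts both kinds of error in a single number.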
## Requirements
* [PyTorch](http://pytorch.org/)
```bash
conda install pytorch torchvision cuda80 -c soumith
```
* Python 3
### Pretrained models
* [Pretrained_models_v1 (1.08GB)](https://drive.google.com/file/d/11MJWXmFnx9shbVtsaP1V8ak_kADg0r7D/view?usp=sharing): The best performing models in the [paper](https://arxiv.org/abs/1901.10323)
* [Pretrained_RGB_models_for_det_and_clf (371MB) (Google Drive)](https://drive.google.com/file/d/1V23zvjAKZr7FUOBLpgPZkpHGv8_D-cOs/view?usp=sharing)
* [Pretrained_RGB_models_for_det_and_clf (371MB) (Baidu Netdisk)](https://pan.baidu.com/s/114WKw0lxLfWMZA6SYSSJlw), extraction code: p1va
* [Pretrained_models_v2 (15.2GB)](https://drive.google.com/file/d/1rSWnzlOwGXjO_6C7U8eE6V43MlcnN6J_/view?usp=sharing): All models in the [paper](https://ieeexplore.ieee.org/document/8982092) with efficient 3D-CNN models
## Preparation
### EgoGesture
* Download videos by following [the official site](http://www.nlpr.ia.ac.cn/iva/yfzhang/datasets/egogesture.html).
* We will use the extracted images, which are also provided by the dataset owners.
* Generate n_frames files using ```utils/ego_prepare.py```
The n_frames format is as follows: "path to the folder" "class index" "start frame" "end frame"
```bash
mkdir annotation_EgoGesture
python utils/ego_prepare.py training trainlistall.txt all
python utils/ego_prepare.py training trainlistall_but_None.txt all_but_None
python utils/ego_prepare.py training trainlistbinary.txt binary
python utils/ego_prepare.py validation vallistall.txt all
python utils/ego_prepare.py validation vallistall_but_None.txt all_but_None
python utils/ego_prepare.py validation vallistbinary.txt binary
python utils/ego_prepare.py testing testlistall.txt all
python utils/ego_prepare.py testing testlistall_but_None.txt all_but_None
python utils/ego_prepare.py testing testlistbinary.txt binary
```
* Generate the annotation file in JSON format, similar to ActivityNet, using ```utils/egogesture_json.py```
```bash
python utils/egogesture_json.py 'annotation_EgoGesture' all
python utils/egogesture_json.py 'annotation_EgoGesture' all_but_None
python utils/egogesture_json.py 'annotation_EgoGesture' binary
```
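
To make the n_frames format described above concrete, here is a minimal sketch of how one line could be parsed. It assumes the four fields are separated by whitespace; the `Clip` type, the `parse_annotation_line` helper and the example line are hypothetical and are not part of this repository.

```python
from typing import NamedTuple


class Clip(NamedTuple):
    """One line of an n_frames file (hypothetical container type)."""
    folder: str
    class_index: int
    start_frame: int
    end_frame: int


def parse_annotation_line(line):
    # Split from the right so that a folder path containing spaces
    # still ends up in the first field (whitespace separation is assumed).
    folder, class_index, start, end = line.strip().rsplit(maxsplit=3)
    return Clip(folder, int(class_index), int(start), int(end))


if __name__ == "__main__":
    # Made-up example line, not copied from a real annotation file.
    clip = parse_annotation_line("path/to/clip_folder 5 120 168")
    print(clip)
    print("frames in clip:", clip.end_frame - clip.start_frame + 1)
```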
### nvGesture
* Download videos by following [the official site](https://research.nvidia.com/publication/online-detection-and-classification-dynamic-hand-gestures-recurrent-3d-convolutional).
* Generate n_frames files using ```utils/nv_prepare.py```
The n_frames format is as follows: "path to the folder" "class index" "start frame" "end frame"
```bash
mkdir annotation_nvGesture
python utils/nv_prepare.py training trainlistall.txt all
python utils/nv_prepare.py training trainlistall_but_None.txt all_but_None
python utils/nv_prepare.py training trainlistbinary.txt binary
python utils/nv_prepare.py validation vallistall.txt all
python utils/nv_prepare.py validation vallistall_but_None.txt all_but_None
python utils/nv_prepare.py validation vallistbinary.txt binary
```
* Generate the annotation file in JSON format, similar to ActivityNet, using ```utils/nv_json.py```
```bash
python utils/nv_json.py 'annotation_nvGesture' all
python utils/nv_json.py 'annotation_nvGesture' all_but_None
python utils/nv_json.py 'annotation_nvGesture' binary
```
### Jester
* Download videos by following [the official site](https://20bn.com/datasets/jester).
* The n_frames and class index files are already provided in annotation_Jester/{'classInd.txt', 'trainlist01.txt', 'vallist01.txt'}
The n_frames format is as follows: "path to the folder" "class index" "start frame" "end frame"
* Generate the annotation file in JSON format, similar to ActivityNet, using ```utils/jester_json.py```
```bash
python utils/jester_json.py 'annotation_Jester'
```
## Running the code
* Offline testing (offline_test.py) and training (main.py)
```bash
bash run_offline.sh
```
* Online testing
```bash
bash run_online.sh
```
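
The online mode follows the two-model, sliding-window idea from the abstract: a lightweight detector decides whether a gesture is present in the most recent frames, and the deeper classifier is consulted only while the detector fires. The sketch below is a simplified illustration, not the logic of `online_test.py`. The `detector` and `classifier` callables, the window sizes and the majority vote used to emit each gesture once are all assumptions.

```python
from collections import Counter, deque


def run_online(frames, detector, classifier, det_window=8, clf_window=32):
    """Return one label per detected gesture (simplified sketch)."""
    det_queue = deque(maxlen=det_window)   # short window for the detector
    clf_queue = deque(maxlen=clf_window)   # longer window for the classifier
    votes = []        # classifier outputs collected during the current gesture
    recognized = []   # final output: each gesture reported a single time

    for frame in frames:
        det_queue.append(frame)
        clf_queue.append(frame)
        if len(det_queue) < det_window:
            continue  # not enough frames yet for the detector
        if detector(list(det_queue)):
            # Gesture present: run the (expensive) classifier on its window.
            votes.append(classifier(list(clf_queue)))
        elif votes:
            # Gesture just ended: emit it once, using a majority vote
            # (a stand-in for the score post-processing in the paper).
            recognized.append(Counter(votes).most_common(1)[0][0])
            votes = []
    if votes:  # the stream ended while a gesture was still active
        recognized.append(Counter(votes).most_common(1)[0][0])
    return recognized


if __name__ == "__main__":
    # Toy stream: integers stand in for frames, 0 means "no gesture".
    stream = [0, 0, 0, 1, 1, 1, 0, 0, 2, 2, 0]
    toy_detector = lambda window: window[-1] != 0
    toy_classifier = lambda window: window[-1]
    print(run_online(stream, toy_detector, toy_classifier,
                     det_window=2, clf_window=3))
```

Because the classifier runs only while the detector is active, most frames of an idle stream are handled by the cheaper model, which is the motivation for the hierarchical design.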
## Citation
Please cite the following articles if you use this code or pre-trained models:
```bibtex
@article{kopuklu_real-time_2019,
title = {Real-time Hand Gesture Detection and Classification Using Convolutional Neural Networks},
url = {http://arxiv.org/abs/1901.10323},
author = {Köpüklü, Okan and Gunduz, Ahmet and Kose, Neslihan and Rigoll, Gerhard},
year={2019}
}
```
```bibtex
@article{kopuklu2020online,
title={Online Dynamic Hand Gesture Recognition Including Efficiency Analysis},
author={K{\"o}p{\"u}kl{\"u}, Okan and Gunduz, Ahmet and Kose, Neslihan and Rigoll, Gerhard},
journal={IEEE Transactions on Biometrics, Behavior, and Identity Science},
volume={2},
number={2},
pages={85--97},
year={2020},
publisher={IEEE}
}
```
## Acknowledgement
We thank Kensho Hara for releasing his [codebase](https://github.com/kenshohara/3D-ResNets-PyTorch), on top of which we build our work.