# CMT.pytorch
## Implementation of [CMT: Convolutional Neural Networks Meet Vision Transformers](https://arxiv.org/pdf/2107.06263.pdf)
### Set up
```
# tested with python==3.6 and cuda==10.0
# other pytorch/timm versions may also work
pip install torch==1.7.0 torchvision==0.8.1;
pip install timm==0.3.2;
pip install torchprofile;
# build apex
cd /your_path_to/apex-master/;
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
```
### Data preparation
Download and extract ImageNet train and val images from http://image-net.org/.
The directory structure is:
```
│path/to/imagenet/
├──train/
│ ├── n01440764
│ │ ├── n01440764_10026.JPEG
│ │ ├── n01440764_10027.JPEG
│ │ ├── ......
│ ├── ......
├──val/
│ ├── n01440764
│ │ ├── ILSVRC2012_val_00000293.JPEG
│ │ ├── ILSVRC2012_val_00002138.JPEG
│ │ ├── ......
│ ├── ......
```
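
Before launching a long run, it can help to confirm that the extracted dataset matches the layout above. The following is a minimal sketch using only the Python standard library; `summarize_imagenet_dir` is a hypothetical helper and is not part of this repo.

```python
import os


def summarize_imagenet_dir(root):
    """Return {split: (num_class_dirs, num_jpeg_files)} for train/ and val/.

    Hypothetical helper: assumes the layout shown above, i.e. one
    sub-directory per class containing .JPEG files.
    """
    summary = {}
    for split in ("train", "val"):
        split_dir = os.path.join(root, split)
        if not os.path.isdir(split_dir):
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        num_classes = 0
        num_images = 0
        for entry in sorted(os.listdir(split_dir)):
            class_dir = os.path.join(split_dir, entry)
            if not os.path.isdir(class_dir):
                continue  # ignore stray files next to the class folders
            num_classes += 1
            num_images += sum(
                1 for name in os.listdir(class_dir)
                if name.lower().endswith(".jpeg")
            )
        summary[split] = (num_classes, num_images)
    return summary
```

For a complete ImageNet-1K copy, `summarize_imagenet_dir("path/to/imagenet/")` should report 1000 class directories in each split, with 1,281,167 training images and 50,000 validation images.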
### Training
To train CMT-Tiny on ImageNet-1K on a single node with 8 GPUs:
```
python -m torch.distributed.launch --nproc_per_node=8 train.py --data-path /your_path_to/imagenet/ --output_dir /your_path_to/output/ --model cmt_ti --batch-size 256 --apex-amp --input-size 160 --weight-decay 0.05 --drop-path 0.1 --epochs 800 --test_freq 100 --test_epoch 760 --warmup-lr 1e-7 --warmup-epochs 20 --lr 8e-4 --min-lr 1e-5 --no-model-ema
```
To train CMT-XS on ImageNet-1K on a single node with 8 GPUs:
```
python -m torch.distributed.launch --nproc_per_node=8 train.py --data-path /your_path_to/imagenet/ --output_dir /your_path_to/output/ --model cmt_xs --batch-size 256 --apex-amp --input-size 192 --weight-decay 0.04 --drop-path 0.08 --epochs 400 --test_freq 100 --test_epoch 360 --warmup-lr 1e-6 --warmup-epochs 20 --lr 7e-4 --min-lr 2e-5 --model-ema-decay 0.9998
```
To train CMT-Small on ImageNet-1K on a single node with 8 GPUs:
```
python -m torch.distributed.launch --nproc_per_node=8 train.py --data-path /your_path_to/imagenet/ --output_dir /your_path_to/output/ --model cmt_s --batch-size 128 --apex-amp --input-size 224 --weight-decay 0.05 --drop-path 0.1 --epochs 300 --test_freq 100 --test_epoch 260 --warmup-lr 1e-7 --warmup-epochs 20
```
To train CMT-Base on ImageNet-1K on a single node with 8 GPUs:
```
python -m torch.distributed.launch --nproc_per_node=8 train.py --data-path /your_path_to/imagenet/ --output_dir /your_path_to/output/ --model cmt_b --batch-size 64 --apex-amp --input-size 256 --weight-decay 0.05 --drop-path 0.25 --epochs 300 --test_freq 100 --test_epoch 260 --warmup-lr 1e-6 --min-lr 2e-5 --warmup-epochs 20
```
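
The `--warmup-lr`, `--warmup-epochs`, `--lr`, `--min-lr` and `--epochs` flags in the commands above jointly shape the learning-rate curve. As a rough illustration, the sketch below assumes a linear warmup followed by cosine decay, a common choice in DeiT-style training scripts. This README does not state which scheduler `train.py` uses, and any batch-size-based learning-rate scaling is not modeled. `lr_at_epoch` is a hypothetical helper, not part of this repo.

```python
import math


def lr_at_epoch(epoch, epochs, lr, warmup_lr, warmup_epochs, min_lr):
    """Learning rate at a given epoch under an ASSUMED schedule.

    Assumption: linear warmup from warmup_lr to lr over warmup_epochs,
    then cosine decay from lr to min_lr over the remaining epochs.
    """
    if epoch < warmup_epochs:
        return warmup_lr + (lr - warmup_lr) * epoch / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, epochs - warmup_epochs)
    return min_lr + 0.5 * (lr - min_lr) * (1 + math.cos(math.pi * progress))


# Flag values taken from the CMT-Tiny command above.
cmt_tiny = dict(epochs=800, lr=8e-4, warmup_lr=1e-7, warmup_epochs=20, min_lr=1e-5)
for epoch in (0, 10, 20, 400, 800):
    print(epoch, lr_at_epoch(epoch, **cmt_tiny))
```

Under this assumption the rate starts at `--warmup-lr`, reaches `--lr` once warmup ends, and finishes at `--min-lr` on the last epoch.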
### CMT on ImageNet-1K Classification
| Model | Top-1 Acc. | Log | Ckpt |
| :------------------- | :--------: | :------: | :------: |
| CMT-Ti | 79.0% | [github](https://github.com/ggjy/CMT.pytorch/releases/download/release-v1/log_cmt_tiny.txt) | [github](https://github.com/ggjy/CMT.pytorch/releases/download/release-v1/cmt_tiny.pth) |
| CMT-XS | 81.8% | [github](https://github.com/ggjy/CMT.pytorch/releases/download/release-v1/log_cmt_xs.txt) | [github](https://github.com/ggjy/CMT.pytorch/releases/download/release-v1/cmt_xs.pth) |
| CMT-Small | 83.5% | [github](https://github.com/ggjy/CMT.pytorch/releases/download/release-v1/log_cmt_small.txt) | [github](https://github.com/ggjy/CMT.pytorch/releases/download/release-v1/cmt_small.pth) |
| CMT-Base | 84.5% | [github](https://github.com/ggjy/CMT.pytorch/releases/download/release-v1/log_cmt_base.txt) | [github](https://github.com/ggjy/CMT.pytorch/releases/download/release-v1/cmt_base.pth) |
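
The checkpoints in the table can also be fetched from a script. The sketch below builds the release URLs listed above and downloads a file with the standard library. The mapping from the `--model` names used in the training commands to checkpoint file names is inferred from this README, and `checkpoint_url` and `download_checkpoint` are hypothetical helpers, not part of this repo.

```python
import urllib.request

# Base URL shared by all release assets in the table above.
RELEASE_BASE = "https://github.com/ggjy/CMT.pytorch/releases/download/release-v1"

# Assumed mapping: --model name from the training commands -> checkpoint file.
CHECKPOINT_FILES = {
    "cmt_ti": "cmt_tiny.pth",
    "cmt_xs": "cmt_xs.pth",
    "cmt_s": "cmt_small.pth",
    "cmt_b": "cmt_base.pth",
}


def checkpoint_url(model):
    """Return the release URL of the checkpoint for a given model name."""
    return f"{RELEASE_BASE}/{CHECKPOINT_FILES[model]}"


def download_checkpoint(model, dest):
    """Download the checkpoint for `model` to the local path `dest`."""
    urllib.request.urlretrieve(checkpoint_url(model), dest)
    return dest
```

For example, `download_checkpoint("cmt_s", "cmt_small.pth")` would save the CMT-Small weights to the current directory. Loading the file into a model is left out here because the README does not document the checkpoint's key layout.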
## Citation
If you find this project useful in your research, please consider citing:
```bibtex
@article{guo2021cmt,
title={{CMT}: Convolutional neural networks meet vision transformers},
author={Guo, Jianyuan and Han, Kai and Wu, Han and Xu, Chang and Tang, Yehui and Xu, Chunjing and Wang, Yunhe},
journal={arXiv preprint arXiv:2107.06263},
year={2021}
}
```
## Acknowledgment
This repo is based on [DeiT](https://github.com/facebookresearch/deit) and [pytorch-image-models](https://github.com/rwightman/pytorch-image-models).