# KnowledgeDistillation Layer (Caffe implementation)
## Installation
1. Install [Caffe](https://github.com/BVLC/caffe/) in a directory `CAFFE`<br>
2. Download this repository to a directory `ROOT`<br>
3. Copy the layer files into your Caffe tree<br>
```bash
cp $ROOT/knowledge_distillation_layer.hpp $CAFFE/include/caffe/layers
cp $ROOT/knowledge_distillation_layer.cpp $CAFFE/src/caffe/layers
```
4. Modify `$CAFFE/src/caffe/proto/caffe.proto`<br>Add an `optional KnowledgeDistillationParameter` field to `message LayerParameter`, using the next available layer-specific ID (147 in the example below)
```proto
message LayerParameter {
  ...
  // use the next available layer-specific ID
  optional KnowledgeDistillationParameter knowledge_distillation_param = 147;
}
```
<br>Then add the `KnowledgeDistillationParameter` message itself:<br>
```proto
message KnowledgeDistillationParameter {
  optional float temperature = 1 [default = 1];
}
```
5. Build Caffe
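Because `caffe.proto` changed, the generated protobuf code must be rebuilt along with the new layer. A minimal sketch for the default Makefile build (`-j8` is just an example; use the corresponding CMake targets if that is how you build Caffe):
```bash
cd $CAFFE
make clean          # force caffe.pb.h/.cc to be regenerated from the edited caffe.proto
make all -j8
make test -j8 && make runtest   # optional sanity check
```
If training later aborts with `Unknown layer type: KnowledgeDistillation`, the layer was not compiled in; verify that both files from step 3 were copied before building.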
<br>
## Usage
The KnowledgeDistillation layer has one layer-specific parameter, `temperature`.<br><br>The layer takes 2 or 3 input blobs:<br>
`bottom[0]`: the logits of the student<br>
`bottom[1]`: the logits of the teacher<br>
`bottom[2]` (*optional*): the labels<br>
The logits are first divided by the temperature T and then mapped to probability distributions over the classes by the softmax function. The layer computes the KL divergence between the two distributions rather than their cross entropy, and the gradients are multiplied by T^2, as suggested in the [paper](https://arxiv.org/abs/1503.02531).<br>
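Written out (our notation, following the description above; `z^s` and `z^t` are the student and teacher logits):
```latex
p_i = \frac{\exp(z^s_i / T)}{\sum_j \exp(z^s_j / T)}, \qquad
q_i = \frac{\exp(z^t_i / T)}{\sum_j \exp(z^t_j / T)}

\mathcal{L}_{\mathrm{KD}} = \mathrm{KL}(q \,\|\, p) = \sum_i q_i \left( \log q_i - \log p_i \right)

\frac{\partial \mathcal{L}_{\mathrm{KD}}}{\partial z^s_i} = \frac{1}{T}\,(p_i - q_i),
\quad \text{scaled by } T^2 \text{ to } T\,(p_i - q_i)
```
Since the teacher distribution q is fixed, minimizing KL(q||p) yields exactly the same gradients as minimizing the cross entropy; the two differ only by the constant entropy of q in the reported loss value.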
1. Common setting in the `prototxt` (2 input blobs given; see also the full training sketch after these examples)
```
layer {
  name: "KD"
  type: "KnowledgeDistillation"
  bottom: "student_logits"
  bottom: "teacher_logits"
  top: "KL_div"
  include { phase: TRAIN }
  knowledge_distillation_param { temperature: 4 }  # usually larger than 1
  loss_weight: 1
}
```
2. If some labels should be ignored, give 3 input blobs and set `ignore_label` in `loss_param`
```
layer {
  name: "KD"
  type: "KnowledgeDistillation"
  bottom: "student_logits"
  bottom: "teacher_logits"
  bottom: "label"
  top: "KL_div"
  include { phase: TRAIN }
  knowledge_distillation_param { temperature: 4 }
  loss_param { ignore_label: 2 }
  loss_weight: 1
}
```
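In a full distillation setup the KD term is typically combined with the usual hard-label loss on the student, and the teacher must stay fixed. A hedged sketch of that wiring (the blob names come from your own net definition; whether this layer honors per-bottom `propagate_down` is an assumption about the implementation, so setting `lr_mult: 0` on all teacher layers is the safe way to freeze it):
```
# hard-label cross entropy on the student
layer {
  name: "softmax_loss"
  type: "SoftmaxWithLoss"
  bottom: "student_logits"
  bottom: "label"
  top: "CE_loss"
  loss_weight: 1
}
# soft-target KD term; gradients should not reach the teacher
layer {
  name: "KD"
  type: "KnowledgeDistillation"
  bottom: "student_logits"
  bottom: "teacher_logits"
  top: "KL_div"
  include { phase: TRAIN }
  knowledge_distillation_param { temperature: 4 }
  propagate_down: true   # backprop into the student logits
  propagate_down: false  # teacher stays frozen
  loss_weight: 1
}
```
The relative `loss_weight` values trade off the hard-label and soft-target objectives, as in the original paper.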