arXiv:1706.06859v1 [cs.LG] 20 Jun 2017
Analysis of dropout learning regarded as
ensemble learning
Kazuyuki Hara¹, Daisuke Saitoh², Hayaru Shouno³
College of Industrial Technology, Nihon University,
1-2-1 Izumi-cho, Narashino-shi, Chiba, 275-8575 Japan.
Graduate School of Industrial Technology, Nihon University
Graduate School of Informatics and Engineering,
The University of Electro-Communications
1-5-1 Chofugaoka, Chofu-shi, Tokyo, 182-8585 Japan.
Abstract
Deep learning is the state of the art in fields such as visual object
recognition and speech recognition. It uses a large number of layers and
a huge number of units and connections, so overfitting is a serious
problem. To avoid this problem, dropout learning has been proposed.
Dropout learning neglects some inputs and hidden units during the learning
process with a probability p; the neglected inputs and hidden units are
then combined with the learned network to express the final output.
We find that the process of combining the neglected hidden units with
the learned network can be regarded as ensemble learning, so we analyze
dropout learning from this point of view.
Keywords: dropout learning, overfitting, regularization, ensemble learning,
soft-committee machine, teacher-student formulation
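As an illustration of the procedure summarized in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes a single hidden layer whose outputs are summed with fixed unit weights, a tanh activation, and the commonly used test-time rule of scaling the summed output by the keep rate 1 - p; none of these choices is taken from the text above. All function names are hypothetical. Under these assumptions, averaging the outputs of many randomly thinned networks approaches the scaled output of the full network, which is the sense in which dropout can be read as ensemble learning.

```python
import math
import random


def hidden_outputs(weights, x):
    """Output of each hidden unit: tanh of its weighted input sum (assumed activation)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def forward_train(weights, x, p, rng):
    """Learning-time pass: each hidden unit is neglected with probability p."""
    mask = [0.0 if rng.random() < p else 1.0 for _ in weights]
    h = hidden_outputs(weights, x)
    return sum(m * hi for m, hi in zip(mask, h)), mask


def forward_test(weights, x, p):
    """Test-time pass: all hidden units are used, scaled by the keep rate 1 - p (assumed rule)."""
    return (1.0 - p) * sum(hidden_outputs(weights, x))


def ensemble_average(weights, x, p, n_samples, rng):
    """Monte Carlo average of the outputs of randomly thinned sub-networks."""
    total = 0.0
    for _ in range(n_samples):
        y, _ = forward_train(weights, x, p, rng)
        total += y
    return total / n_samples


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8], [0.7, 0.7]]  # illustrative values
    x = [1.0, -1.0]
    print("ensemble average:", ensemble_average(weights, x, 0.5, 20000, rng))
    print("scaled full net: ", forward_test(weights, x, 0.5))
```

Because the output in this sketch is linear in the dropout mask, the two printed values agree up to sampling noise; with a nonlinear output unit the scaled network would only approximate the ensemble average.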
1 Introduction
Deep learning [1, 2] is attracting much attention in the fields of visual object
recognition, speech recognition, object detection, and many other domains. It
provides automatic feature extraction and has the ability to achieve outstanding
performance [3, 4].
Deep learning uses a very deep layered network and a huge amount of data,
so overfitting is a serious problem. To avoid overfitting, regularization is used.
Hinton et al. proposed a regularization method called “dropout learning” [5]
for this purpose. Dropout learning consists of two processes. At learning time,
some hidden units are neglected with a probability p, and this process reduces
the network size. At test time, learned hidden units and those not learned are