Matching Networks for One Shot Learning
Oriol Vinyals
Google DeepMind
vinyals@google.com
Charles Blundell
Google DeepMind
cblundell@google.com
Timothy Lillicrap
Google DeepMind
countzero@google.com
Koray Kavukcuoglu
Google DeepMind
korayk@google.com
Daan Wierstra
Google DeepMind
wierstra@google.com
Abstract
Learning from a few examples remains a key challenge in machine learning.
Despite recent advances in important domains such as vision and language, the
standard supervised deep learning paradigm does not offer a satisfactory solution
for learning new concepts rapidly from little data. In this work, we employ ideas
from metric learning based on deep neural features and from recent advances
that augment neural networks with external memories. Our framework learns a
network that maps a small labelled support set and an unlabelled example to its
label, obviating the need for fine-tuning to adapt to new class types. We then define
one-shot learning problems on vision (using Omniglot, ImageNet) and language
tasks. Our algorithm improves one-shot accuracy on ImageNet from 87.6% to
93.2% and from 88.0% to 93.8% on Omniglot compared to competing approaches.
We also demonstrate the usefulness of the same model on language modeling by
introducing a one-shot task on the Penn Treebank.
1 Introduction
Humans learn new concepts with very little supervision – e.g. a child can generalize the concept
of “giraffe” from a single picture in a book – yet our best deep learning systems need hundreds or
thousands of examples. This motivates the setting we are interested in: “one-shot” learning, which
consists of learning a class from a single labelled example.
Deep learning has made major advances in areas such as speech [7], vision [13] and language [16],
but is notorious for requiring large datasets. Data augmentation and regularization techniques alleviate
overfitting in low data regimes, but do not solve it. Furthermore, learning is still slow and based on
large datasets, requiring many weight updates using stochastic gradient descent. This, in our view, is
mostly due to the parametric aspect of the model, in which training examples need to be slowly learnt
by the model into its parameters.
In contrast, many non-parametric models allow novel examples to be rapidly assimilated, whilst not
suffering from catastrophic forgetting. Some models in this family (e.g., nearest neighbors) do not
require any training but performance depends on the chosen metric [1]. Previous work on metric
learning in non-parametric setups [18] has been influential on our model, and we aim to incorporate
the best characteristics from both parametric and non-parametric models – namely, rapid acquisition
of new examples while providing excellent generalisation from common examples.
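
The paragraph above notes that nearest-neighbour methods need no training but live or die by the chosen metric. The following is a minimal sketch of that dependence, assuming NumPy, toy array shapes, and two candidate metrics for illustration; the paper itself does not prescribe this code:

    import numpy as np

    def nearest_neighbour(query, support_x, support_y, metric="euclidean"):
        """Classify `query` by copying the label of its closest support point.

        No training, no parameters: accuracy hinges entirely on `metric`.
        query:     (d,)   feature vector to classify
        support_x: (k, d) labelled support points
        support_y: (k,)   their labels
        """
        if metric == "euclidean":
            dists = np.linalg.norm(support_x - query, axis=1)
        elif metric == "cosine":
            sims = support_x @ query / (
                np.linalg.norm(support_x, axis=1) * np.linalg.norm(query) + 1e-8)
            dists = -sims  # high similarity == low distance
        else:
            raise ValueError(f"unknown metric: {metric}")
        return support_y[int(np.argmin(dists))]

Swapping raw inputs for deep features as the representation fed to such a classifier is exactly the metric-learning direction the paragraph credits to [18].
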
The novelty of our work is twofold: at the modeling level, and at the training procedure. We propose
Matching Nets, a neural network which uses recent advances in attention and memory that enable
rapid learning. Secondly, our training procedure is based on a simple machine learning principle:
test and train conditions must match. Thus to train our network to do rapid learning, we train it by
showing only a few examples per class, switching the task from minibatch to minibatch, much like
how it will be tested when presented with a few examples of a new task.
30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.
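
Mechanically, the matching behaviour described above amounts to an attention-weighted read-out over the support set, y_hat = sum_i a(x_hat, x_i) y_i, which the full paper formalises with cosine attention over learned embeddings f and g. Below is a minimal NumPy sketch of that read-out under those assumptions; the embeddings are taken as already computed, and f, g themselves are stand-ins not implemented here:

    import numpy as np

    def matching_net_predict(f_query, g_support, support_onehot):
        """Attention read-out: y_hat = sum_i a(x_hat, x_i) * y_i.

        f_query:        (d,)   embedded query f(x_hat)
        g_support:      (k, d) embedded support set g(x_i)
        support_onehot: (k, c) one-hot support labels y_i
        """
        # Cosine similarity between the query and each support embedding.
        sims = g_support @ f_query / (
            np.linalg.norm(g_support, axis=1) * np.linalg.norm(f_query) + 1e-8)
        # Softmax over the support set gives attention weights a(x_hat, x_i).
        attn = np.exp(sims - sims.max())
        attn /= attn.sum()
        # The prediction is a convex combination of the support labels.
        return attn @ support_onehot  # (c,) class distribution

Because each training episode samples a fresh support set and query, the network is optimised under the same few-shot conditions it is later tested in, which is the train/test matching principle the introduction states.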