Unsupervised Domain Adaptation by Backpropagation
Yaroslav Ganin GANIN@SKOLTECH.RU
Victor Lempitsky LEMPITSKY@SKOLTECH.RU
Skolkovo Institute of Science and Technology (Skoltech)
Abstract
Top-performing deep architectures are trained on
massive amounts of labeled data. In the absence
of labeled data for a certain task, domain adap-
tation often provides an attractive option given
that labeled data of similar nature but from a dif-
ferent domain (e.g. synthetic images) are avail-
able. Here, we propose a new approach to do-
main adaptation in deep architectures that can
be trained on large amount of labeled data from
the source domain and large amount of unlabeled
data from the target domain (no labeled target-
domain data is necessary).
As the training progresses, the approach pro-
motes the emergence of “deep” features that are
(i) discriminative for the main learning task on
the source domain and (ii) invariant with respect
to the shift between the domains. We show that
this adaptation behaviour can be achieved in almost any feed-forward model by augmenting it with a few standard layers and a simple new gradient reversal layer. The resulting augmented
architecture can be trained using standard back-
propagation.
Overall, the approach can be implemented with
little effort using any of the deep-learning pack-
ages. The method performs very well in a series of image classification experiments, achieving an adaptation effect in the presence of large domain shifts and outperforming the previous state of the art on the OFFICE datasets.
1. Introduction
Deep feed-forward architectures have brought impressive
advances to the state-of-the-art across a wide variety of
machine-learning tasks and applications. At the moment,
however, these leaps in performance come only when a
large amount of labeled training data is available. At the
same time, for problems lacking labeled data, it may still be possible to obtain training sets that are big enough for training large-scale deep models, but that suffer from the
shift in data distribution from the actual data encountered
at “test time”. One particularly important example is syn-
thetic or semi-synthetic training data, which may come in
abundance and be fully labeled, but which inevitably have
a distribution that is different from real data (Liebelt &
Schmid, 2010; Stark et al., 2010; Vázquez et al., 2014; Sun & Saenko, 2014).
Learning a discriminative classifier or other predictor in
the presence of a shift between training and test distribu-
tions is known as domain adaptation (DA). A number of
approaches to domain adaptation have been suggested in the context of shallow learning, i.e. in the situation when the data representation/features are given and fixed. The proposed
approaches then build the mappings between the source
(training-time) and the target (test-time) domains, so that
the classifier learned for the source domain can also be ap-
plied to the target domain, when composed with the learned
mapping between domains. The appeal of the domain
adaptation approaches is the ability to learn a mapping be-
tween domains in the situation when the target domain data
are either fully unlabeled (unsupervised domain adaptation) or have few labeled samples (semi-supervised domain
adaptation). Below, we focus on the harder unsupervised
case, although the proposed approach can be generalized to
the semi-supervised case rather straightforwardly.
Unlike most previous papers on domain adaptation that
worked with fixed feature representations, we focus on
combining domain adaptation and deep feature learning
within one training process (deep domain adaptation). Our
goal is to embed domain adaptation into the process of
learning representation, so that the final classification de-
cisions are made based on features that are both discrim-
inative and invariant to the change of domains, i.e. have
the same or very similar distributions in the source and the
target domains. In this way, the obtained feed-forward network is applicable to the target domain without being hindered by the shift between the two domains.
We thus focus on learning features that combine (i)
discriminativeness and (ii) domain-invariance. This is
achieved by jointly optimizing the underlying features as
well as two discriminative classifiers operating on these
features: (i) the label predictor that predicts class labels
and is used both during training and at test time and (ii) the
domain classifier that discriminates between the source and
the target domains during training. While the parameters of
the classifiers are optimized in order to minimize their error
on the training set, the parameters of the underlying deep
feature mapping are optimized in order to minimize the loss
of the label classifier and to maximize the loss of the domain
classifier. The latter encourages domain-invariant features
to emerge in the course of the optimization.
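For concreteness, this min-max training criterion can be written as a saddle-point problem. The notation below is illustrative rather than drawn from this excerpt: G_f denotes the feature extractor with parameters θ_f, G_y the label predictor with parameters θ_y, G_d the domain classifier with parameters θ_d, λ weighs the domain loss, and d_i ∈ {0, 1} indicates whether example i comes from the target domain:

    E(\theta_f, \theta_y, \theta_d)
      = \sum_{i:\, d_i = 0} L_y\big(G_y(G_f(x_i; \theta_f); \theta_y),\, y_i\big)
      \;-\; \lambda \sum_{i} L_d\big(G_d(G_f(x_i; \theta_f); \theta_d),\, d_i\big)

    (\hat{\theta}_f, \hat{\theta}_y) = \arg\min_{\theta_f, \theta_y} E(\theta_f, \theta_y, \hat{\theta}_d),
    \qquad
    \hat{\theta}_d = \arg\max_{\theta_d} E(\hat{\theta}_f, \hat{\theta}_y, \theta_d)

The classifier parameters descend on their respective losses, while the feature parameters descend on the label loss and ascend on the domain loss.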
Crucially, we show that all three training processes can
be embedded into an appropriately composed deep feed-
forward network (Figure 1) that uses standard layers and
loss functions, and can be trained using standard backprop-
agation algorithms based on stochastic gradient descent or
its modifications (e.g. SGD with momentum). Our ap-
proach is generic as it can be used to add domain adaptation
to any existing feed-forward architecture that is trainable by
backpropagation. In practice, the only non-standard com-
ponent of the proposed architecture is a rather trivial gra-
dient reversal layer that leaves the input unchanged during
forward propagation and reverses the gradient by multiply-
ing it by a negative scalar during the backpropagation.
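A minimal sketch of such a gradient reversal layer, written in PyTorch for illustration (the framework choice, the class name GradReverse, and the helper grad_reverse are assumptions of this sketch, not part of the paper, which predates PyTorch):

    import torch

    class GradReverse(torch.autograd.Function):
        # Identity on the forward pass; multiplies the incoming
        # gradient by a negative scalar on the backward pass.
        @staticmethod
        def forward(ctx, x, lambd):
            ctx.lambd = lambd
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            # Reverse (and scale) the gradient flowing back into the
            # feature extractor; the scalar lambd receives no gradient.
            return -ctx.lambd * grad_output, None

    def grad_reverse(x, lambd=1.0):
        return GradReverse.apply(x, lambd)

Because the layer is the identity in the forward direction, inserting it changes nothing at test time; it only flips the sign of the domain-loss gradient with respect to the features during training.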
Below, we detail the proposed approach to domain adap-
tation in deep architectures, and present results on tradi-
tional deep learning image datasets (such as MNIST (Le-
Cun et al., 1998) and SVHN (Netzer et al., 2011)) as well
as on OFFICE benchmarks (Saenko et al., 2010), where
the proposed method considerably improves over previous
state-of-the-art accuracy.
2. Related work
A large number of domain adaptation methods have been
proposed in recent years, and here we focus on the
most related ones. Multiple methods perform unsuper-
vised domain adaptation by matching the feature distri-
butions in the source and the target domains. Some ap-
proaches perform this by reweighting or selecting samples
from the source domain (Borgwardt et al., 2006; Huang
et al., 2006; Gong et al., 2013), while others seek an ex-
plicit feature space transformation that would map the source distribution into the target one (Pan et al., 2011; Gopalan
et al., 2011; Baktashmotlagh et al., 2013). An important
aspect of the distribution matching approach is the way the
(dis)similarity between distributions is measured. Here,
one popular choice is matching the distribution means in
a reproducing kernel Hilbert space (Borgwardt et al.,
2006; Huang et al., 2006), whereas (Gong et al., 2012; Fer-
nando et al., 2013) map the principal axes associated with
each of the distributions. Our approach also attempts to
match feature-space distributions; however, this is accomplished by modifying the feature representation itself rather than by reweighting or a geometric transformation. Also, our
method uses (implicitly) a rather different way to measure
the disparity between distributions based on their separa-
bility by a deep discriminatively-trained classifier.
Several approaches perform gradual transition from the
source to the target domain (Gopalan et al., 2011; Gong
et al., 2012) by a gradual change of the training distribu-
tion. Among these methods, (S. Chopra & Gopalan, 2013)
does this in a “deep” way by the layerwise training of a
sequence of deep autoencoders, while gradually replacing
source-domain samples with target-domain samples. This
improves over a similar approach of (Glorot et al., 2011)
that simply trains a single deep autoencoder for both do-
mains. In both approaches, the actual classifier/predictor
is learned in a separate step using the feature representa-
tion learned by autoencoder(s). In contrast to (Glorot et al.,
2011; S. Chopra & Gopalan, 2013), our approach performs
feature learning, domain adaptation and classifier learning
jointly, in a unified architecture, and using a single learning
algorithm (backpropagation). We therefore argue that our
approach is simpler (both conceptually and in terms of its
implementation). Our method also achieves considerably
better results on the popular OFFICE benchmark.
While the above approaches perform unsupervised domain
adaptation, there are approaches that perform supervised
domain adaptation by exploiting labeled data from the tar-
get domain. In the context of deep feed-forward archi-
tectures, such data can be used to “fine-tune” the net-
work trained on the source domain (Zeiler & Fergus, 2013;
Oquab et al., 2014; Babenko et al., 2014). Our approach
does not require labeled target-domain data. At the same
time, it can easily incorporate such data when it is avail-
able.
An idea related to ours is described in (Goodfellow et al.,
2014). While their goal is quite different (building gener-
ative deep networks that can synthesize samples), the way
they measure and minimize the discrepancy between the
distribution of the training data and the distribution of the
synthesized data is very similar to the way our architecture
measures and minimizes the discrepancy between feature
distributions for the two domains.
Finally, a recent and concurrent report by (Tzeng et al.,
2014) also focuses on domain adaptation in feed-forward
networks. Their set of techniques measures and minimizes
the distance between the data means across domains. This ap-
proach may be regarded as a “first-order” approximation
to our approach, which seeks a tighter alignment between
distributions.
3. Deep Domain Adaptation
3.1. The model
We now detail the proposed model for domain adaptation. We assume that the model works with input samples x ∈ X, where X is some input space, and certain labels (output) y from the label space Y. Below, we assume classification problems where Y is a finite set (Y = {1, 2, . . . , L}); however, our approach is generic and can handle any output label space that other deep feed-forward models can handle.
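To illustrate how the three components fit together, the following PyTorch sketch wires a shared feature extractor to both heads through the reversal layer defined earlier (the layer sizes, module names, and two-class domain head are illustrative assumptions, not the architecture used in the paper's experiments):

    import torch.nn as nn

    class DomainAdversarialNet(nn.Module):
        def __init__(self, num_classes):
            super().__init__()
            # G_f: shared feature extractor (structure is illustrative).
            self.features = nn.Sequential(
                nn.Conv2d(3, 32, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 48, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
                nn.Flatten(),
            )
            # G_y: label predictor, used during training and at test time.
            self.label_predictor = nn.Sequential(
                nn.LazyLinear(100), nn.ReLU(), nn.Linear(100, num_classes),
            )
            # G_d: domain classifier, used only during training.
            self.domain_classifier = nn.Sequential(
                nn.LazyLinear(100), nn.ReLU(), nn.Linear(100, 2),
            )

        def forward(self, x, lambd=1.0):
            f = self.features(x)
            class_logits = self.label_predictor(f)
            # grad_reverse is the helper from the earlier sketch. Placing it
            # between G_f and G_d means a standard minimization of the domain
            # loss trains G_d normally while pushing G_f toward features on
            # which source and target cannot be told apart.
            domain_logits = self.domain_classifier(grad_reverse(f, lambd))
            return class_logits, domain_logits

Training then amounts to ordinary backpropagation: the label cross-entropy is computed on labeled source samples, the domain cross-entropy on mixed source/target batches, and their sum is minimized with SGD or one of its variants.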