pect that an auxiliary classifier can improve target domain
labels because it may use cues that have not been utilized by
the original detection model. Such additional cues can come, for example, from extra input data (e.g., motion or optical flow), from different network architectures, or from ensembles of models. We note, however, that the auxiliary image classification model is used only during the retraining phase, so the computational complexity of the final detector is preserved at test time.
The contributions of this paper are summarized as fol-
lows: i) We provide the first (to the best of our knowledge)
formulation of domain adaptation in object detection as ro-
bust learning. ii) We propose a novel robust object detection framework that accounts for noise in the training data, in both the object labels and their locations. We use Faster R-CNN [45] as our base object detector, but our framework is general and could, in principle, be adapted to other detectors (e.g., SSD [31] and YOLO [43]) that minimize a classification loss and regress
bounding boxes. iii) We use an independent classification
refinement module to allow other sources of information
from the target domain (e.g. motion, geometry, background
information) to be integrated seamlessly. iv) We demonstrate that this robust framework achieves state-of-the-art results on several cross-domain detection tasks.
2. Previous Work
Object Detection: The first approaches to object detec-
tion used a sliding window followed by a classifier based
on hand-crafted features [6, 11, 58]. After advances in
deep convolutional neural networks, methods such as R-CNN [19], SPPNet [22], and Fast R-CNN [18] emerged that use CNNs for feature extraction and classification. Slow
sliding window algorithms were replaced with faster region
proposal methods such as selective search [53]. Recent object detection methods further speed up bounding box detection. For example, Faster R-CNN [45] introduced a region proposal network (RPN) that predicts refinements to the locations and sizes of predefined anchor boxes. In SSD [31], classification and bounding box prediction are performed on feature maps at different scales using anchor boxes with different aspect ratios. In YOLO [42], detection is cast as a regression problem on a grid: for each grid cell, the bounding box and the class label of the object centered at that cell are predicted. Newer extensions are found
in [63, 43, 5]. A comprehensive comparison of methods is
reported in [25]. The goal of this paper is to increase the
accuracy of an object detector in a new domain regardless
of speed. Consequently, we base our improvements on
Faster R-CNN, a slower but accurate detector.
Our adoption of Faster R-CNN also allows for direct comparison with the state-of-the-art [2].
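For reference, the anchor refinement in [45] uses the standard box parameterization: for an anchor $(x_a, y_a, w_a, h_a)$ and a box $(x, y, w, h)$, the regression targets are
\[
t_x = (x - x_a)/w_a, \quad t_y = (y - y_a)/h_a, \quad t_w = \log(w/w_a), \quad t_h = \log(h/h_a),
\]
and the network predicts these offsets together with a class score for each anchor.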
Domain Adaptation: Domain adaptation was initially studied for image classification, and the majority of the domain adaptation literature focuses on this problem [10, 9, 29, 21, 20, 12, 48, 32, 33, 14, 13, 17, 1, 37, 30]. Some of the methods developed in this context include cross-domain kernel learning methods such as adaptive multiple kernel learning (A-MKL) [10], domain transfer multiple kernel learning (DTMKL) [9], and geodesic flow kernel (GFK) [20].
There is a wide variety of approaches aimed at obtaining domain-invariant predictors: supervised learning of non-linear transformations between domains using asymmetric metric learning [29], unsupervised learning of intermediate representations [21], alignment of the target and source domain subspaces using eigenvector covariances [12], alignment of second-order statistics to minimize the shift between domains [48], and a covariance matrix alignment approach [59]. The rise of deep learning brought with it steps
towards domain-invariant feature learning. In [32, 33], a reproducing kernel Hilbert space embedding of the hidden network features is learned, and mean-embedding matching is performed between the two domain distributions. In [14, 13], a domain classifier is trained with an adversarial loss so that the learned features are discriminative and domain invariant.
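Schematically, these adversarial approaches seek a saddle point of the form
\[
\min_{\theta_f, \theta_y} \; \max_{\theta_d} \; \mathcal{L}_{cls}(\theta_f, \theta_y) - \lambda \, \mathcal{L}_{dom}(\theta_f, \theta_d),
\]
where $\theta_f$, $\theta_y$, and $\theta_d$ denote the parameters of the feature extractor, label predictor, and domain classifier, respectively, and $\lambda$ balances the two losses (the notation here is ours, not that of [14, 13]).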
There is less work in domain adaptation for object de-
tection. Domain adaptation methods for tasks other than image classification include [15] for fine-grained recognition, [3, 24, 64, ?] for semantic segmentation, [?] for dataset generation, and [?] for finding out-of-distribution data in active learning. For object detection itself, [61] used an
adaptive SVM to reduce the domain shift, [41] performed
subspace alignment on the features extracted from R-CNN,
and [2] used Faster R-CNN as a baseline and took an adversarial approach (similar to [13]) to learn domain-invariant features jointly on the target and source domains. We take a fundamentally different approach by reformulating the problem as learning with noisy labels: we design a robust-to-noise training scheme for object detection that is trained on noisy bounding boxes and labels acquired from the target domain as pseudo-ground-truth.
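For intuition, the pseudo-ground-truth can be viewed as the confident detections produced by a source-trained detector on unlabeled target images. The Python sketch below illustrates only this data-collection step under that assumption; the function name and confidence threshold are ours for illustration, and the robust training itself is described in Section 3.

    def collect_pseudo_ground_truth(detector, target_images, score_threshold=0.8):
        # Run a source-trained detector on unlabeled target images and keep its
        # confident detections as noisy (pseudo) ground-truth boxes and labels.
        pseudo_ground_truth = []
        for image in target_images:
            boxes, labels, scores = detector(image)  # numpy-style arrays assumed
            keep = scores >= score_threshold         # illustrative confidence filter
            pseudo_ground_truth.append(
                {"image": image, "boxes": boxes[keep], "labels": labels[keep]}
            )
        return pseudo_ground_truth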
Noisy Labeling: Previous work on robust learning has fo-
cused on image classification, where the classes are few and disjoint. Early work used instance-independent noise
models, where each class is confused with other classes in-
dependent of the instance content [39, 36, 40, 47, 65, 62].
Recently, the literature has shifted towards instance-specific
label noise prediction [60, 35, 54, 55, 56, 57, 51, 27, 7, 44].
To the best of our knowledge, ours is the first proposal for
an object detection model that is robust to label noise.
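To make the distinction concrete, instance-independent models describe label corruption with a class-confusion matrix $T$, $T_{kj} = p(\tilde{y} = j \mid y = k)$, that does not depend on the image $x$, so that
\[
p(\tilde{y} = j \mid x) = \sum_{k} T_{kj} \, p(y = k \mid x),
\]
whereas the instance-specific methods above let the corruption depend on $x$ itself (notation ours, for illustration).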
3. Method
Following the common formulation for domain adapta-
tion, we represent the training data space as the source do-