LIFT: Learned Invariant Feature Transform
Kwang Moo Yi∗,1, Eduard Trulls∗,1, Vincent Lepetit2, Pascal Fua1
1 Computer Vision Laboratory, École Polytechnique Fédérale de Lausanne (EPFL)
2 Institute for Computer Graphics and Vision, Graz University of Technology
{kwang.yi, eduard.trulls, pascal.fua}@epfl.ch, lepetit@icg.tugraz.at
Abstract. We introduce a novel Deep Network architecture that imple-
ments the full feature point handling pipeline, that is, detection, orienta-
tion estimation, and feature description. While previous works have suc-
cessfully tackled each one of these problems individually, we show how to
learn to do all three in a unified manner while preserving end-to-end dif-
ferentiability. We then demonstrate that our Deep pipeline outperforms
state-of-the-art methods on a number of benchmark datasets, without
the need of retraining.
Keywords: Local Features, Feature Descriptors, Deep Learning
1 Introduction
Local features play a key role in many Computer Vision applications. Find-
ing and matching them across images has been the subject of vast amounts
of research. Until recently, the best techniques relied on carefully hand-crafted
features [1–5]. Over the past few years, as in many areas of Computer Vision,
methods based on Machine Learning, and more specifically Deep Learning, have
started to outperform these traditional methods [6–10].
These new algorithms, however, address only a single step in the complete
processing chain, which includes detecting the features, computing their orienta-
tion, and extracting robust representations that allow us to match them across
images. In this paper we introduce a novel Deep architecture that performs all
three steps together. We demonstrate that it achieves better overall performance
than the state-of-the-art methods, in large part because it allows these individual
steps to be optimized to perform well in conjunction with each other.
Our architecture, which we refer to as LIFT for Learned Invariant Feature
Transform, is depicted by Fig. 1. It consists of three components that feed into
each other: the Detector, the Orientation Estimator, and the Descriptor. Each
one is based on Convolutional Neural Networks (CNNs), and patterned after
recent ones [6, 9, 10] that have been shown to perform these individual functions
well. To mesh them together we use Spatial Transformers [11] to rectify the
image patches given the output of the Detector and the Orientation Estimator.
We also replace the traditional approaches to non-local maximum suppression
(NMS) by the soft argmax function [12]. This allows us to preserve end-to-end
differentiability, and results in a full network that can still be trained with
back-propagation, which is not the case of any other architecture we know of.
∗ First two authors contributed equally. This work was supported in part by the EU FP7 project MAGELLAN under grant number ICT-FP7-611526.
Fig. 1. Our integrated feature extraction pipeline. Our pipeline consists of three major components: the Detector, the Orientation Estimator, and the Descriptor. They are tied together with differentiable operations to preserve end-to-end differentiability. (The figure shows the pipeline stages DET → Crop → ORI → Rot → DESC, producing a score map, a softargmax-selected keypoint, and the final description vector.)
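To make the soft argmax used above concrete, the following is a minimal NumPy sketch, not the authors' implementation, and the temperature parameter beta is an assumption: the keypoint location is computed as a softmax-weighted average of pixel coordinates over the score map, so it stays differentiable with respect to the scores, unlike a hard argmax followed by NMS.

    import numpy as np

    def soft_argmax(score_map, beta=10.0):
        """Differentiable stand-in for hard argmax / NMS over a 2D score map."""
        h, w = score_map.shape
        # Softmax over all pixels; subtracting the max keeps exp() stable.
        weights = np.exp(beta * (score_map - score_map.max()))
        weights /= weights.sum()
        ys, xs = np.mgrid[0:h, 0:w]        # per-pixel row/column coordinates
        # Expected coordinates under the softmax weighting.
        return (weights * xs).sum(), (weights * ys).sum()

    # Toy usage: a single strong response near row 12, column 20.
    score = np.zeros((32, 32))
    score[12, 20] = 5.0
    print(soft_argmax(score))              # approximately (20.0, 12.0)

In the full pipeline, this differentiable location estimate is what the Spatial Transformer uses to crop the patch handed to the Orientation Estimator and Descriptor, which is how the whole chain remains trainable with back-propagation.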
Also, we show how to learn such a pipeline in an effective manner. To this
end, we build a Siamese network and learn its weights using the feature points
produced by a Structure-from-Motion (SfM) algorithm that we ran on images of a
scene captured under different viewpoints and lighting conditions.
We formulate this training problem on image patches extracted at different scales
to make the optimization tractable. In practice, we found it impossible to train
the full architecture from scratch, because the individual components try to op-
timize for different objectives. Instead, we introduce a problem-specific learning
approach to overcome this problem. It involves training the Descriptor first,
which is then used to train the Orientation Estimator, and finally the Detector,
based on the already learned Descriptor and Orientation Estimator, differenti-
ating through the entire network. At test time, we decouple the Detector, which
runs over the whole image in scale space, from the Orientation Estimator and
Descriptor, which process only the keypoints.
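As an illustration of that training schedule only, here is a minimal sketch assuming PyTorch, with tiny linear layers standing in for the actual CNN components and placeholder losses standing in for the SfM-supervised objectives; it shows the ordering (Descriptor, then Orientation Estimator, then Detector) and the freezing and back-propagation pattern, nothing more.

    import torch
    import torch.nn as nn

    patches = torch.randn(8, 32 * 32)         # toy batch of flattened patches

    descriptor  = nn.Linear(32 * 32, 128)     # stand-in for the Descriptor CNN
    orientation = nn.Linear(32 * 32, 1)       # stand-in for the Orientation Estimator
    detector    = nn.Linear(32 * 32, 1)       # stand-in for the Detector

    def train(params, loss_fn, steps=10):
        opt = torch.optim.Adam(list(params), lr=1e-3)
        for _ in range(steps):
            opt.zero_grad()
            loss_fn().backward()
            opt.step()

    # Stage 1: Descriptor alone (the real loss compares matching and
    # non-matching patches from the SfM reconstructions).
    train(descriptor.parameters(), lambda: descriptor(patches).pow(2).mean())

    # Stage 2: Orientation Estimator, back-propagating through the frozen Descriptor.
    for p in descriptor.parameters():
        p.requires_grad_(False)
    def orientation_loss():
        rotated = patches * orientation(patches)   # toy stand-in for rotating the patch
        return descriptor(rotated).pow(2).mean()
    train(orientation.parameters(), orientation_loss)

    # Stage 3: Detector, differentiating through the already learned
    # Orientation Estimator and Descriptor.
    for p in orientation.parameters():
        p.requires_grad_(False)
    def detector_loss():
        cropped = patches * detector(patches)      # toy stand-in for the soft-argmax crop
        rotated = cropped * orientation(cropped)
        return descriptor(rotated).pow(2).mean()
    train(detector.parameters(), detector_loss)

The multiplicative "crop" and "rotate" stand-ins above only mimic the dependency structure; in the paper these operations are differentiable Spatial Transformer crops and rotations driven by the Detector and Orientation Estimator outputs.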
In the next section we briefly discuss earlier approaches. We then present our
approach in detail and show that it outperforms many state-of-the-art methods.
2 Related work
The amount of literature relating to local features is immense, but it always
revolves around finding feature points, computing their orientation, and matching
them. In this section, we will therefore discuss these three elements separately.
2.1 Feature Point Detectors
Research on feature point detection has focused mostly on finding distinctive
locations whose scale and rotation can be reliably estimated. Early works [13,
1 Figures are best viewed in color.