ContextAwareCFTracking object tracking official source code, CVPR 2017 (including the paper and supplementary material) - CSDN Library

3 files in total (2 PDF, 1 ZIP). Tags: object tracking, source code, Tracking, CVPR.

Points required: 44; 65 views; uploaded 2017-11-21 10:55:33; 41.97 MB RAR.


Context-Aware-CF-Tracking.rar (3 files)

Context-Aware-CF-Tracking

Context-Aware-CF-Tracking-master.zip 41.31MB

Mueller-2017-Context-Aware Correlation Filter - Supplementary Material.pdf 444KB

Mueller-2017-Context-Aware Correlation Filter.pdf 874KB

Context-Aware Correlation Filter Tracking

Matthias Mueller, Neil Smith, Bernard Ghanem

King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia

{matthias.mueller.2, neil.smith, bernard.ghanem}@kaust.edu.sa

Abstract

Correlation filter (CF) based trackers have recently gained a lot of popularity due to their impressive performance on benchmark datasets, while maintaining high frame rates. A significant amount of recent research focuses on the incorporation of stronger features for a richer representation of the tracking target. However, this only helps to discriminate the target from background within a small neighborhood. In this paper, we present a framework that allows the explicit incorporation of global context within CF trackers. We reformulate the original optimization problem and provide a closed-form solution for single and multi-dimensional features in the primal and dual domain. Extensive experiments demonstrate that this framework significantly improves the performance of many CF trackers with only a modest impact on frame rate.

1. Introduction

Object tracking remains a core problem in computer vision with numerous applications, such as surveillance, human-machine interaction, robotics, etc. Large new datasets and benchmarks such as OTB-50 [26], OTB-100 [27], TC-128 [19], ALOV300++ [24] and UAV123 [23], as well as tracking challenges such as the visual object tracking (VOT) challenge and the multi-object tracking (MOT) challenge, have sparked the interest of many researchers and helped advance the field significantly. Despite substantial progress in recent years, visual object tracking remains a challenging problem in computer vision.

In this paper, we address the problem of single-object tracking, which is commonly approached as a tracking-by-detection problem. Currently, most research focuses on model-free generic object trackers, where no prior assumptions regarding the object appearance are made. The generic nature of this problem makes it challenging, since there are very few constraints on object appearance, and the object can undergo a variety of unpredictable transformations in consecutive frames (e.g. aspect ratio change, illumination variation, in/out-of-plane rotation, occlusion, etc.).

The tracking problem can be divided into two main challenges, object representation and sampling for detection. Recently, most successful single-object tracking algorithms use a discriminative object representation with either strong hand-crafted features, such as HOG and Colornames, or learned ones. Recent work has integrated deep features [21] trained on a large dataset, such as ImageNet, to represent the tracked object. Sampling, on the other hand, is a trade-off between computation time and precise scanning of the region of interest for the target.

Figure 1: Tracking results of our context-aware adaptation of the baseline SAMF tracker, denoted as SAMF_CA, and a comparison with recent state-of-the-art tracking algorithms on the Box and Jump sequences from OTB-100.

Lately, CF trackers have sparked a lot of interest due to their high accuracy while running at high frame rates [4, 6, 10, 11, 14]. In general, CF trackers learn a correlation filter online to localize the object in consecutive frames. The learned filter is applied to the region of interest in the next frame and the location of the maximum response corresponds to the object location. The filter is then updated by using the new object location. The major reasons behind the success of this tracking paradigm are the approximate dense sampling performed by circularly shifting the training samples and the computational efficiency of learning the correlation filter in the Fourier domain. Provided that the background is homogeneous and the object does not move much, these circular shifts are equivalent to actual translations in the image and this framework works well.

However, since these assumptions are not always valid, CF trackers have several drawbacks. One major drawback is that there are boundary effects due to the circulant assumption. In addition, the target search region only contains a small local neighborhood to limit drift and keep computational cost low. The boundary effects are usually suppressed by a cosine window, which effectively reduces the search region even further. Therefore, CF trackers usually have very limited information about their context and easily drift in cases of fast motion, occlusion or background clutter. In order to address this limitation, we propose a framework that takes global context into account and incorporates it directly into the learned filter (see Figure 2). We derive a closed-form solution for our formulation and propose it as a framework that can be easily integrated with most CF trackers to boost their performance, while maintaining their high frame rate. As shown in Figure 1, integrating our framework with the mediocre tracker SAMF [18] achieves better tracking results than state-of-the-art trackers by exploiting context information. Note that it even outperforms the very recent HCFT tracker [21], whose hierarchical convolutional features implicitly contain context information. We show through extensive evaluation on several large datasets that integrating our framework improves all tested CF trackers and allows top-performing CF trackers to exceed current state-of-the-art precision and success scores on the well-known OTB-100 benchmark [27].

Figure 2: Comparing conventional CF tracking to our proposed context-aware CF tracking.

2017 IEEE Conference on Computer Vision and Pattern Recognition, DOI 10.1109/CVPR.2017.152

2. Related Work

CF Trackers. Since the MOSSE work of Bolme et al. [4], correlation filters (CF) have been studied as a robust and efficient approach to the problem of visual tracking. Major improvements to MOSSE include the incorporation of kernels and HOG features [10], the addition of color name features [18] or color histograms [1], integration with sparse tracking [30], adaptive scale [2, 5, 18], mitigation of boundary effects [6], and the integration of deep CNN features [21]. CF-based trackers currently rank at the top of benchmarks such as OTB-100 [27], UAV123 [23], and VOT2015 [17], while remaining computationally efficient.

CF Variations and Improvements. Significant attention in recent work has focused on extending CF trackers to address inherent limitations. For instance, Liu et al. propose part-based tracking to reduce sensitivity to partial occlusion and better preserve object structure [20]. The work of [22] performs long-term tracking that is robust to appearance variation by correlating temporal context and training an online random fern classifier for re-detection. Zhu et al. propose a collaborative CF that combines a multi-scale kernelized CF to handle scale variation with an online CUR filter to address target drift [31]. These approaches register improvements by either combining external classifiers to assist the CF or taking advantage of its high computational speed to run multiple CF trackers at once.

CF Frameworks. Recent work [2, 3] has found that some of these inherent limitations can be overcome directly by modifying the conventional CF model used for training. For example, by adapting the target response (used for ridge regression in CF) as part of a new formulation, Bibi et al. significantly decrease target drift while remaining computationally efficient [3]. This method yields a closed-form solution and can be applied to many CF trackers as a framework. Similarly, this paper also proposes a framework that makes CF trackers context-aware and increases their performance beyond the improvement attainable by [3], while being less computationally expensive.

Context Trackers. The use of context for tracking has been explored in previous work by Dinh et al. [7], where distractors and supporters are detected and tracked using a sequential randomized forest, an online template-based appearance model, and local features. In more recent work, contextual information of a scene is exploited using a multi-level clustering to detect similar objects and other potential distractors [28]. A global dynamic constraint is then learned online to discriminate these distractors from the object of interest. This approach shows improvement on a subset of cluttered scenes in OTB-100, where distractors are predominant. However, neither of these trackers generalizes well and, as a result, their overall performance on current benchmarks is only average. In contrast, our approach is more generic and can make use of varying types of contextual image regions that may or may not contain distractors. We show that context awareness enables improvement across the entire OTB-100 and is not limited to cluttered scenes, where context can lead to the most improvement.

Contributions. To the best of our knowledge, (i) this is the first context-aware formulation that can be applied as a framework to most CF trackers. Its closed-form solution allows CF trackers to remain computationally efficient, while significantly improving their performance. (ii) Extensive experiments on several datasets show the effectiveness of our formulation. All CF trackers benefit from a boost in performance, while remaining computationally efficient.

3. CF Tracking

Before the detailed discussion of our proposed framework and for completeness, we first revisit the details of conventional CF tracking. CF trackers use discriminative learning at their core. The goal is to learn a discriminative correlation filter that can be applied to the region of interest in consecutive frames to infer the location of the target (i.e. location of maximum filter response). The key contribution leading to the popularity and success of CF trackers is their sampling method. Due to computational constraints, it is common practice to randomly pick a limited number of negative samples around the target. The sophistication of the sampling strategy and the number of negative samples can have a significant impact on tracking performance. CF trackers allow for dense sampling around the target at very low computational cost. This is achieved by modeling all possible translations of the target within a search window as circulant shifts and concatenating them to form the data matrix $A_0$. The circulant structure of this matrix facilitates a very efficient solution to the following ridge regression problem in the Fourier domain.

$$\min_{w} \; \|A_0 w - y\|_2^2 + \lambda_1 \|w\|_2^2 \tag{1}$$

Here, the learned correlation filter is denoted by the vector $w$. The square matrix $A_0$ contains all circulant shifts of the vectorized image patch $a_0$ and the regression target $y$ is a vectorized image of a 2D Gaussian.
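As a minimal 1-D sketch of the quantities in Eq. (1) (real trackers operate on 2-D, multi-channel patches), the NumPy snippet below builds a circulant data matrix from a patch and a Gaussian regression target, then evaluates the ridge objective. The helper names `circulant`, `gaussian_target` and `ridge_objective`, the patch length, and the choice to store the shifts as columns (so that `A @ w` is a circular convolution) are illustrative assumptions, not taken from the paper or its released code.

```python
import numpy as np

def circulant(a):
    """Circulant matrix whose j-th column is `a` circularly shifted by j."""
    n = len(a)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return a[idx]

def gaussian_target(n, sigma=2.0):
    """1-D Gaussian regression target peaked at index 0, wrapped circularly."""
    d = np.minimum(np.arange(n), n - np.arange(n))  # circular distance to index 0
    return np.exp(-0.5 * (d / sigma) ** 2)

def ridge_objective(A, w, y, lam):
    """Value of ||A w - y||^2 + lam * ||w||^2 (the objective of Eq. (1))."""
    return np.sum((A @ w - y) ** 2) + lam * np.sum(w ** 2)

rng = np.random.default_rng(0)
a0 = rng.standard_normal(16)        # stand-in for a vectorized image patch
A0 = circulant(a0)                  # all 16 circular shifts of the patch
y = gaussian_target(len(a0))
obj_at_zero = ridge_objective(A0, np.zeros(16), y, lam=1e-2)
print(A0.shape, obj_at_zero)
```

With the all-zero filter the objective reduces to the squared norm of y, which gives a quick sanity check of the helper functions.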

Notation. We denote the $j$-th component of vector $x$ as $x(j)$. We denote its conjugate by $x^*$ and its Fourier transform $Fx$ by $\hat{x}$, where $F$ is the DFT matrix. The following identity for circulant matrices is the key ingredient for solving Eq. (1) efficiently:

$$X = F \,\mathrm{diag}(\hat{x})\, F^{H} \quad \text{and} \quad X^{\top} = F \,\mathrm{diag}(\hat{x}^{*})\, F^{H} \tag{2}$$
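The identity in Eq. (2) can be checked numerically. The sketch below assumes a particular convention: the circulant matrix stores shifts as columns, `U` is the unitary DFT matrix built with `np.fft.fft`, and the paper's F is taken to correspond to the conjugate transpose of `U`. Other conventions move the conjugate around, so treat this as one consistent instance, not the definitive reading of the paper's notation.

```python
import numpy as np

def circulant(a):
    """Circulant matrix whose j-th column is `a` circularly shifted by j."""
    n = len(a)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return a[idx]

rng = np.random.default_rng(0)
n = 8
a = rng.standard_normal(n)
A = circulant(a)

U = np.fft.fft(np.eye(n)) / np.sqrt(n)   # unitary DFT matrix (NumPy sign convention)
a_hat = np.fft.fft(a)                    # unnormalized DFT of the patch

# Diagonalization of the circulant matrix and of its transpose.
A_rec = U.conj().T @ np.diag(a_hat) @ U
AT_rec = U.conj().T @ np.diag(a_hat.conj()) @ U

print(np.allclose(A, A_rec), np.allclose(A.T, AT_rec))
```

Because the circulant matrix becomes diagonal in the Fourier basis, products and inverses involving it reduce to element-wise operations on DFT coefficients, which is what the closed-form solutions rely on.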

3.1. Solution in the Primal Domain

The objective in Eq. (1) is convex and has a unique global minimum. Equating its gradient to zero leads to a closed-form solution for the filter: $w = (A_0^{\top} A_0 + \lambda_1 I)^{-1} A_0^{\top} y$. Since $A_0$ is circulant, it can be diagonalized using Eq. (2) and matrix inversion can be done efficiently in the Fourier domain [10]:

$$\hat{w} = \frac{\hat{a}_0^{*} \odot \hat{y}}{\hat{a}_0^{*} \odot \hat{a}_0 + \lambda_1} \tag{3}$$
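To illustrate why the Fourier-domain form is attractive, the following 1-D sketch compares the element-wise solution of Eq. (3) with a direct solve of the normal equations. The function name `train_primal`, the signal length, and the value of the regularization weight are assumptions made for the example; the column-shift circulant convention is chosen so that NumPy's FFT reproduces the equation as written.

```python
import numpy as np

def circulant(a):
    """Circulant matrix whose j-th column is `a` circularly shifted by j."""
    n = len(a)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return a[idx]

def train_primal(a0, y, lam1):
    """Eq. (3): element-wise ridge regression solution in the Fourier domain."""
    a_hat = np.fft.fft(a0)
    y_hat = np.fft.fft(y)
    return np.conj(a_hat) * y_hat / (np.conj(a_hat) * a_hat + lam1)

rng = np.random.default_rng(0)
n = 32
a0 = rng.standard_normal(n)                       # stand-in for the target patch
d = np.minimum(np.arange(n), n - np.arange(n))    # circular distance to index 0
y = np.exp(-0.5 * (d / 2.0) ** 2)                 # Gaussian regression target
lam1 = 1e-2

# O(n log n): solve in the Fourier domain, then transform back.
w_fft = np.real(np.fft.ifft(train_primal(a0, y, lam1)))

# O(n^3): solve (A0^T A0 + lam1 I) w = A0^T y directly.
A0 = circulant(a0)
w_direct = np.linalg.solve(A0.T @ A0 + lam1 * np.eye(n), A0.T @ y)

print(np.allclose(w_fft, w_direct))
```

The direct solve is only there as a reference; in a tracker the filter is never formed through an explicit matrix inverse.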

Detection formula. The learned filter $w$ is convolved with image patch $z$ (search window) in the next frame, where $Z$ denotes its circulant matrix. The location of the maximum response is the target location within the search window. The primal detection formula is given by:

$$r_p(w, Z) = Zw \;\Leftrightarrow\; \hat{r}_p = \hat{z} \odot \hat{w} \tag{4}$$
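A small 1-D sketch of the detection step: a filter is learned on a patch, the patch is circularly shifted to mimic target motion, and the peak of the response from Eq. (4) recovers the shift. The names `train_primal` and `detect_primal`, the random stand-in patch, and the shift of 5 samples are assumptions for illustration; a real search window would not be an exact circular shift of the training patch.

```python
import numpy as np

def gaussian_target(n, sigma=2.0):
    """1-D Gaussian regression target peaked at index 0, wrapped circularly."""
    d = np.minimum(np.arange(n), n - np.arange(n))
    return np.exp(-0.5 * (d / sigma) ** 2)

def train_primal(a0, y, lam1):
    """Eq. (3): filter in the Fourier domain."""
    a_hat = np.fft.fft(a0)
    return np.conj(a_hat) * np.fft.fft(y) / (np.conj(a_hat) * a_hat + lam1)

def detect_primal(w_hat, z):
    """Eq. (4): response r = Z w, computed as ifft(z_hat * w_hat)."""
    return np.real(np.fft.ifft(np.fft.fft(z) * w_hat))

rng = np.random.default_rng(0)
n = 64
a0 = rng.standard_normal(n)                       # training patch
w_hat = train_primal(a0, gaussian_target(n), lam1=1e-2)

true_shift = 5
z = np.roll(a0, true_shift)                       # "next frame": target moved by 5
response = detect_primal(w_hat, z)
estimated_shift = int(np.argmax(response))        # peak location = displacement
print(estimated_shift)
```

Since the regression target peaks at index 0, the index of the maximum response can be read directly as the estimated displacement.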

3.2. Solution in the Dual Domain

Eq. (1) can also be solved in the dual domain using the dual variable $\alpha$, which relates to the primal variable through $w = A_0^{\top} \alpha$. The dual closed-form solution is: $\alpha = (A_0 A_0^{\top} + \lambda_1 I)^{-1} y$. Similar to the primal domain, it can be computed efficiently in the Fourier domain [10]:

$$\hat{\alpha} = \frac{\hat{y}}{\hat{a}_0^{*} \odot \hat{a}_0 + \lambda_1} \tag{5}$$

Since the solution can be written as a function of inner products, the kernel trick can also be applied, allowing the use of kernels in the dual domain [11].

Detection formula. The dual variable $\alpha$ can be used directly for detection by expressing it in terms of the primal variable. This leads to the following dual detection formula:

$$r_d(\alpha, A_0, Z) = Z A_0^{\top} \alpha \;\Leftrightarrow\; \hat{r}_d = \hat{z} \odot \hat{a}_0^{*} \odot \hat{\alpha} \tag{6}$$
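The sketch below checks, in 1-D and with a linear kernel, that the dual solution of Eq. (5) matches a direct solve and that the dual detection formula of Eq. (6) gives the same response as the primal one. The function names, signal length and regularization weight are assumptions for the example, using the same column-shift circulant convention as an illustrative choice.

```python
import numpy as np

def circulant(a):
    """Circulant matrix whose j-th column is `a` circularly shifted by j."""
    n = len(a)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return a[idx]

def train_dual(a0, y, lam1):
    """Eq. (5): dual variable in the Fourier domain."""
    a_hat = np.fft.fft(a0)
    return np.fft.fft(y) / (np.conj(a_hat) * a_hat + lam1)

def detect_dual(alpha_hat, a0, z):
    """Eq. (6): response r = Z A0^T alpha in the Fourier domain."""
    return np.real(np.fft.ifft(np.fft.fft(z) * np.conj(np.fft.fft(a0)) * alpha_hat))

rng = np.random.default_rng(0)
n = 32
a0 = rng.standard_normal(n)
d = np.minimum(np.arange(n), n - np.arange(n))
y = np.exp(-0.5 * (d / 2.0) ** 2)
lam1 = 1e-2

alpha_hat = train_dual(a0, y, lam1)
alpha_fft = np.real(np.fft.ifft(alpha_hat))

# Reference: solve (A0 A0^T + lam1 I) alpha = y directly.
A0 = circulant(a0)
alpha_direct = np.linalg.solve(A0 @ A0.T + lam1 * np.eye(n), y)

# Primal filter recovered from the dual variable: w_hat = conj(a_hat) * alpha_hat.
w_hat = np.conj(np.fft.fft(a0)) * alpha_hat
z = np.roll(a0, 3)
r_dual = detect_dual(alpha_hat, a0, z)
r_primal = np.real(np.fft.ifft(np.fft.fft(z) * w_hat))

print(np.allclose(alpha_fft, alpha_direct), np.allclose(r_dual, r_primal))
```

With a linear kernel the two domains are interchangeable; the dual form matters mainly because it admits kernelized variants.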

4. Context-Aware CF Tracking

The surroundings of the tracked object can have a big impact on tracking performance. For example, if there is a lot of background clutter, context is very important for successful tracking. Therefore, we propose a framework for CF trackers that adds contextual information to the filter during the learning stage (Figure 2).

In every frame, we sample $k$ context patches $a_i \in \mathbb{R}^n$ around the object of interest $a_0 \in \mathbb{R}^n$ according to the sampling strategy in Sec. 4.3. Their corresponding circulant matrices are $A_i \in \mathbb{R}^{n \times n}$ and $A_0 \in \mathbb{R}^{n \times n}$, respectively. These context patches can be viewed as hard negative samples. They contain global context in the form of various distractors and diverse background. Intuitively speaking, we want to learn a filter $w \in \mathbb{R}^n$ that has a high response for the target patch and close to zero response for context patches (Figure 2). We encourage this by adding the context patches as a regularizer to the standard formulation (see Eq. (7)). As a result, the target patch is regressed to $y$ like in the standard formulation (Eq. (1)), while the context patches are regressed to zeros controlled by the parameter $\lambda_2$.

$$\min_{w} \; \|A_0 w - y\|_2^2 + \lambda_1 \|w\|_2^2 + \lambda_2 \sum_{i=1}^{k} \|A_i w\|_2^2 \tag{7}$$
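Eq. (7) is still a ridge regression, so setting its gradient to zero gives $w = (A_0^{\top} A_0 + \lambda_1 I + \lambda_2 \sum_i A_i^{\top} A_i)^{-1} A_0^{\top} y$, and because every matrix involved is circulant this again reduces to element-wise operations on DFT coefficients. The 1-D sketch below is derived here from Eq. (7) under that reasoning, not copied from the paper's code. The function name `train_context_aware`, the random stand-in patches, and the parameter values are assumptions for illustration.

```python
import numpy as np

def circulant(a):
    """Circulant matrix whose j-th column is `a` circularly shifted by j."""
    n = len(a)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return a[idx]

def train_context_aware(a0, context, y, lam1, lam2):
    """Fourier-domain minimizer of Eq. (7); `context` holds one patch per row."""
    a_hat = np.fft.fft(a0)
    c_hat = np.fft.fft(context, axis=1)
    context_energy = np.sum(np.conj(c_hat) * c_hat, axis=0)
    denom = np.conj(a_hat) * a_hat + lam1 + lam2 * context_energy
    return np.conj(a_hat) * np.fft.fft(y) / denom

def context_response_energy(w, context):
    """Sum over context patches of ||A_i w||^2."""
    return sum(np.sum((circulant(c) @ w) ** 2) for c in context)

rng = np.random.default_rng(0)
n, k = 32, 4
a0 = rng.standard_normal(n)                  # target patch
context = rng.standard_normal((k, n))        # k context patches (hard negatives)
d = np.minimum(np.arange(n), n - np.arange(n))
y = np.exp(-0.5 * (d / 2.0) ** 2)
lam1, lam2 = 1e-2, 0.5

w_ca = np.real(np.fft.ifft(train_context_aware(a0, context, y, lam1, lam2)))
w_base = np.real(np.fft.ifft(train_context_aware(a0, context, y, lam1, 0.0)))

# Reference: direct solve of the normal equations of Eq. (7).
A0 = circulant(a0)
H = A0.T @ A0 + lam1 * np.eye(n)
for c in context:
    Ai = circulant(c)
    H += lam2 * Ai.T @ Ai
w_direct = np.linalg.solve(H, A0.T @ y)

energy_ca = context_response_energy(w_ca, context)
energy_base = context_response_energy(w_base, context)
print(np.allclose(w_ca, w_direct), energy_ca <= energy_base)
```

Setting `lam2` to zero recovers the standard filter of Eq. (3), which makes it easy to compare how strongly each filter responds to the context patches.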

Note that there are other possible choices for incorporating the context term (e.g. hinge loss). This would enforce a

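Annotation: the paper's strategy is to add the negative samples as a regularization term appended to the standard correlation filter formulation.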
