Context Aware CF Tracking: official object tracking source code, CVPR 2017 (includes the paper and supplementary material)

3 files in total

pdf: 2

zip: 1

Object tracking

Source code

Tracking

CVPR

Uploaded 2017-11-21 10:55:33 · 41.97 MB · RAR

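Context Aware CF Tracking: an in-depth look at a CVPR 2017 object tracking method

In computer vision, object tracking is the task of locating and following a specific object across consecutive video frames. At CVPR 2017, the paper "Context-Aware Correlation Filter Tracking" proposed a tracking framework that combines correlation filters with background context information and significantly improves tracking performance. This overview covers the core concepts, the implementation details, and the practical value of the technique.

I. Correlation filters

A correlation filter is a fast and efficient image-processing tool that is widely used for object detection and tracking. It finds the target position by correlating a learned template with image patches and taking the best match. Context Aware CF Tracking modifies the conventional correlation filter formulation so that it adapts better to changes of the target across scenes.

II. Context information

Relying only on the target's own features is often not enough to cope with complex changes such as occlusion or illumination variation, so background context matters. The method looks at the surroundings of the target and captures cues that help separate the target from the background, which improves robustness and accuracy. In the paper, this context takes the form of image patches sampled around the target, which are treated as hard negative samples.

III. Algorithm flow

1. **Initialization**: the target in the first frame is taken as the initial template, and a correlation filter is learned from it.
2. **Tracking**: in each following frame, the filter is correlated with the search region and the location of the strongest response is taken as the predicted target position. The background context learned into the filter suppresses responses from non-target regions.
3. **Update strategy**: as the video plays, the template is updated continuously to follow changes in the target's appearance. This involves re-extracting features and re-optimizing the filter.
4. **Feedback mechanism**: when tracking starts to drift, the background information learned into the filter helps pull the estimate back toward the target.

IV. Source code

The provided MATLAB source code implements the method and contains modules for data preprocessing, model training, target detection, and tracking update. Reading the code gives a closer view of the implementation details, and it can be modified and extended as needed.

V. Supplementary material

Official supplementary material typically includes experimental results, parameter settings, and comparative analysis, all of which help in understanding the performance and strengths of the algorithm and in judging how Context Aware CF Tracking behaves in practice.

"Context Aware CF Tracking" offers a new approach to video object tracking: it combines correlation filtering with background context information to make tracking more stable and accurate. By studying and running the source code, developers can apply or adapt the technique in their own projects.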
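The initialization, tracking, and update steps of section III can be sketched with a plain (context-free) correlation filter. This is a minimal illustration in Python with NumPy, not the released MATLAB code: the function names, the toy random-texture frames, the regularization value, and the linear-interpolation update rule with its learning rate are all assumptions made for the example.

```python
import numpy as np

def gaussian_label(shape, sigma=2.0):
    """2D Gaussian regression target with its peak wrapped to index (0, 0)."""
    h, w = shape
    dy = np.minimum(np.arange(h), h - np.arange(h))
    dx = np.minimum(np.arange(w), w - np.arange(w))
    return np.exp(-(dy[:, None] ** 2 + dx[None, :] ** 2) / (2.0 * sigma ** 2))

def train_filter(patch, label, lam=1e-2):
    """Ridge-regression filter, solved element-wise in the Fourier domain."""
    a_hat = np.fft.fft2(patch)
    return np.conj(a_hat) * np.fft.fft2(label) / (np.conj(a_hat) * a_hat + lam)

def detect_shift(w_hat, patch):
    """Return the signed (dy, dx) offset of the response peak."""
    response = np.real(np.fft.ifft2(np.fft.fft2(patch) * w_hat))
    py, px = np.unravel_index(np.argmax(response), response.shape)
    h, w = response.shape
    py, px = int(py), int(px)
    return (py - h if py > h // 2 else py, px - w if px > w // 2 else px)

def update_filter(w_hat, new_w_hat, rate=0.02):
    """Assumed update rule: linear interpolation between old and new filter."""
    return (1.0 - rate) * w_hat + rate * new_w_hat

# Toy data: a random texture that moves by (3, -5) pixels between two frames.
rng = np.random.default_rng(0)
frame0 = rng.standard_normal((32, 32))
frame1 = np.roll(frame0, shift=(3, -5), axis=(0, 1))

label = gaussian_label(frame0.shape)
w_hat = train_filter(frame0, label)            # 1. initialization
shift = detect_shift(w_hat, frame1)            # 2. tracking
# 3. update: re-center the patch on the detected position, then blend filters
recentered = np.roll(frame1, shift=(-shift[0], -shift[1]), axis=(0, 1))
w_hat = update_filter(w_hat, train_filter(recentered, label))
```

Because the toy motion is an exact circular shift, the detected offset should match the applied shift. Real frames violate that assumption at the patch boundary, which is the limitation the paper's introduction discusses.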


Context-Aware-CF-Tracking.rar (3 files)

Context-Aware-CF-Tracking

Context-Aware-CF-Tracking-master.zip 41.31MB

Mueller-2017-Context-Aware Correlation Filter - Supplementary Material.pdf 444KB

Mueller-2017-Context-Aware Correlation Filter.pdf 874KB

Context-Aware Correlation Filter Tracking

Matthias Mueller, Neil Smith, Bernard Ghanem

King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia

{matthias.mueller.2, neil.smith, bernard.ghanem}@kaust.edu.sa

Abstract

Correlation filter (CF) based trackers have recently gained a lot of popularity due to their impressive performance on benchmark datasets, while maintaining high frame rates. A significant amount of recent research focuses on the incorporation of stronger features for a richer representation of the tracking target. However, this only helps to discriminate the target from background within a small neighborhood. In this paper, we present a framework that allows the explicit incorporation of global context within CF trackers. We reformulate the original optimization problem and provide a closed form solution for single and multi-dimensional features in the primal and dual domain. Extensive experiments demonstrate that this framework significantly improves the performance of many CF trackers with only a modest impact on frame rate.

1. Introduction

Object tracking remains a core problem in computer vision with numerous applications, such as surveillance, human-machine interaction, robotics, etc. Large new datasets and benchmarks such as OTB-50 [26], OTB-100 [27], TC-128 [19], ALOV300++ [24] and UAV123 [23], as well as tracking challenges such as the visual object tracking (VOT) challenge and the multi-object tracking (MOT) challenge, have sparked the interest of many researchers and helped advance the field significantly. Despite substantial progress in recent years, visual object tracking remains a challenging problem in computer vision.

In this paper, we address the problem of single-object tracking, which is commonly approached as a tracking-by-detection problem. Currently, most research focuses on model-free generic object trackers, where no prior assumptions regarding the object appearance are made. The generic nature of this problem makes it challenging, since there are very few constraints on object appearance, and the object can undergo a variety of unpredictable transformations in consecutive frames (e.g. aspect ratio change, illumination variation, in/out-of-plane rotation, occlusion, etc.).

The tracking problem can be divided into two main challenges, object representation and sampling for detection. Recently, most successful single-object tracking algorithms use a discriminative object representation with either strong hand-crafted features, such as HOG and Colornames, or learned ones. Recent work has integrated deep features [21] trained on a large dataset, such as ImageNet, to represent the tracked object. Sampling on the other hand is a trade-off between computation time and precise scanning of the region of interest for the target.

Figure 1: Tracking results of our context-aware adaptation of the baseline SAMF tracker, denoted as SAMF_CA, and a comparison with recent state-of-the-art tracking algorithms on the Box and Jump sequences from OTB-100.

Lately, CF trackers have sparked a lot of interest, due to their high accuracy while running at high frame rates [4, 6, 10, 11, 14]. In general, CF trackers learn a correlation filter online to localize the object in consecutive frames. The learned filter is applied to the region of interest in the next frame and the location of the maximum response corresponds to the object location. The filter is then updated by using the new object location. The major reasons behind the success of this tracking paradigm are the approximate dense sampling performed by circularly shifting the training samples and the computational efficiency of learning the correlation filter in the Fourier domain. Provided that the background is homogeneous and the object does not move much, these circular shifts are equivalent to actual translations in the image and this framework works well.

However, since these assumptions are not always valid, CF trackers have several drawbacks. One major drawback is that there are boundary effects due to the circulant assumption. In addition, the target search region only contains a small local neighborhood to limit drift and keep computational cost low. The boundary effects are usually suppressed by a cosine window, which effectively reduces the search region even further. Therefore, CF trackers usually have very limited information about their context and easily drift in cases of fast motion, occlusion or background clutter. In order to address this limitation, we propose a framework that takes global context into account and incorporates it directly into the learned filter (see Figure 2). We derive a closed-form solution for our formulation and propose it as a framework that can be easily integrated with most CF trackers to boost their performance, while maintaining their high frame rate. As shown in Figure 1, integrating our framework with the mediocre tracker SAMF [18] achieves better tracking results than state-of-the-art trackers by exploiting context information. Note that it even outperforms the very recent HCFT tracker [21], whose hierarchical convolutional features implicitly contain context information. We show through extensive evaluation on several large datasets that integrating our framework improves all tested CF trackers and allows top-performing CF trackers to exceed current state-of-the-art precision and success scores on the well-known OTB-100 benchmark [27].

Figure 2: Comparing conventional CF tracking to our proposed context-aware CF tracking.

2017 IEEE Conference on Computer Vision and Pattern Recognition, DOI 10.1109/CVPR.2017.152

2. Related Work

CF Trackers. Since the MOSSE work of Bolme et al. [4], correlation filters (CF) have been studied as a robust and efficient approach to the problem of visual tracking. Major improvements to MOSSE include the incorporation of kernels and HOG features [10], the addition of color name features [18] or color histograms [1], integration with sparse tracking [30], adaptive scale [2, 5, 18], mitigation of boundary effects [6], and the integration of deep CNN features [21]. Currently, CF-based trackers rank at the top of current benchmarks, such as OTB-100 [27], UAV123 [23], and VOT2015 [17], while remaining computationally efficient.

CF Variations and Improvements. Significant attention in recent work has focused on extending CF trackers to address inherent limitations. For instance, Liu et al. propose part-based tracking to reduce sensitivity to partial occlusion and better preserve object structure [20]. The work of [22] performs long-term tracking that is robust to appearance variation by correlating temporal context and training an online random fern classifier for re-detection. Zhu et al. propose a collaborative CF that combines a multi-scale kernelized CF to handle scale variation with an online CUR filter to address target drift [31]. These approaches register improvements by either combining external classifiers to assist the CF or taking advantage of its high computational speed to run multiple CF trackers at once.

CF Frameworks. Recent work [2, 3] has found that some of these inherent limitations can be overcome directly by modifying the conventional CF model used for training. For example, by adapting the target response (used for ridge regression in CF) as part of a new formulation, Bibi et al. significantly decrease target drift while remaining computationally efficient [3]. This method yields a closed-form solution and can be applied to many CF trackers as a framework. Similarly, this paper also proposes a framework that makes CF trackers context-aware and increases their performance beyond the improvement attainable by [3], while being less computationally expensive.

Context Trackers. The use of context for tracking has been explored in previous work by Dinh et al. [7], where distractors and supporters are detected and tracked using a sequential randomized forest, an online template-based appearance model, and local features. In more recent work, contextual information of a scene is exploited using a multi-level clustering to detect similar objects and other potential distractors [28]. A global dynamic constraint is then learned online to discriminate these distractors from the object of interest. This approach shows improvement on a subset of cluttered scenes in OTB-100, where distractors are predominant. However, neither of these trackers generalizes well and, as a result, their overall performance on current benchmarks is only average. In contrast, our approach is more generic and can make use of varying types of contextual image regions that may or may not contain distractors. We show that context awareness enables improvement across the entire OTB-100 and is not limited to cluttered scenes, where context can lead to the most improvement.

Contributions. To the best of our knowledge, (i) this is the first context-aware formulation that can be applied as a framework to most CF trackers. Its closed form solution allows CF trackers to remain computationally efficient, while significantly improving their performance. (ii) Extensive experiments on several datasets show the effectiveness of our formulation. All CF trackers benefit from a boost in performance, while remaining computationally efficient.

3. CF Tracking

Before the detailed discussion of our proposed framework and for completeness, we first revisit the details of conventional CF tracking. CF trackers use discriminative learning at their core. The goal is to learn a discriminative correlation filter that can be applied to the region of interest in consecutive frames to infer the location of the target (i.e. location of maximum filter response). The key contribution leading to the popularity and success of CF trackers is their sampling method. Due to computational constraints, it is common practice to randomly pick a limited number of negative samples around the target. The sophistication of the sampling strategy and the number of negative samples can have a significant impact on tracking performance. CF trackers allow for dense sampling around the target at very low computational cost. This is achieved by modeling all possible translations of the target within a search window as circulant shifts and concatenating them to form the data matrix $\mathbf{A}_0$. The circulant structure of this matrix facilitates a very efficient solution to the following ridge regression problem in the Fourier domain.

$$\min_{\mathbf{w}} \; \|\mathbf{A}_0\mathbf{w} - \mathbf{y}\|_2^2 + \lambda_1 \|\mathbf{w}\|_2^2 \tag{1}$$

Here, the learned correlation filter is denoted by the vector $\mathbf{w}$. The square matrix $\mathbf{A}_0$ contains all circulant shifts of the vectorized image patch $\mathbf{a}_0$ and the regression target $\mathbf{y}$ is a vectorized image of a 2D Gaussian.

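Notation. We denote the $j$-th component of vector $\mathbf{x}$ as $\mathbf{x}(j)$. We denote its conjugate by $\mathbf{x}^*$ and its Fourier transform $F\mathbf{x}$ by $\hat{\mathbf{x}}$, where $F$ is the DFT matrix. The following identity for circulant matrices is the key ingredient for solving Eq. (1) efficiently:

$$\mathbf{X} = F \,\mathrm{diag}(\hat{\mathbf{x}})\, F^{H} \quad \text{and} \quad \mathbf{X}^{T} = F \,\mathrm{diag}(\hat{\mathbf{x}}^{*})\, F^{H} \tag{2}$$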
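The identity in Eq. (2) can be checked numerically on a small example. The sketch below is in Python with NumPy and rests on assumptions that the text does not spell out: $\mathbf{x}$ is real, $F$ is the unitary DFT matrix (scaled by $1/\sqrt{n}$), $\hat{\mathbf{x}}$ is the unnormalized DFT as returned by `np.fft.fft`, and the circulant matrix has the cyclic shifts of $\mathbf{x}$ as its rows.

```python
import numpy as np

n = 6
rng = np.random.default_rng(0)
x = rng.standard_normal(n)  # real-valued signal

# Circulant matrix whose i-th row is x cyclically shifted by i positions.
X = np.stack([np.roll(x, i) for i in range(n)])

# Unitary DFT matrix F and the unnormalized DFT of x.
F = np.fft.fft(np.eye(n)) / np.sqrt(n)
x_hat = np.fft.fft(x)

# Right-hand sides of Eq. (2).
X_from_fourier = F @ np.diag(x_hat) @ F.conj().T
Xt_from_fourier = F @ np.diag(np.conj(x_hat)) @ F.conj().T

# Largest element-wise deviation from the explicit circulant matrices.
err_X = np.max(np.abs(X - X_from_fourier))
err_Xt = np.max(np.abs(X.T - Xt_from_fourier))
```

Both errors should sit at floating-point round-off level. The practical consequence is that products and inverses involving circulant matrices reduce to element-wise operations on DFT coefficients.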

3.1. Solution in the Primal Domain

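The objective in Eq. (1) is convex and has a unique global minimum. Equating its gradient to zero leads to a closed-form solution for the filter: $\mathbf{w} = (\mathbf{A}_0^{T}\mathbf{A}_0 + \lambda_1 \mathbf{I})^{-1}\mathbf{A}_0^{T}\mathbf{y}$. Since $\mathbf{A}_0$ is circulant, it can be diagonalized using Eq. (2) and matrix inversion can be done efficiently in the Fourier domain [10]:

$$\hat{\mathbf{w}} = \frac{\hat{\mathbf{a}}_0^{*} \odot \hat{\mathbf{y}}}{\hat{\mathbf{a}}_0^{*} \odot \hat{\mathbf{a}}_0 + \lambda_1} \tag{3}$$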

Detection formula. The learned filter $\mathbf{w}$ is convolved with image patch $\mathbf{z}$ (search window) in the next frame, where $\mathbf{Z}$ denotes its circulant matrix. The location of the maximum response is the target location within the search window. The primal detection formula is given by:

$$r_p(\mathbf{w}, \mathbf{Z}) = \mathbf{Z}\mathbf{w} \;\Leftrightarrow\; \hat{\mathbf{r}}_p = \hat{\mathbf{z}} \odot \hat{\mathbf{w}} \tag{4}$$
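As a sanity check, the Fourier-domain expressions in Eq. (3) and Eq. (4) can be compared against a direct spatial-domain solve. The following NumPy sketch uses 1D signals instead of image patches and a 1D Gaussian label, both simplifications made here. It also assumes a shift convention: with NumPy's FFT, Eq. (3) as printed corresponds to the matrix whose columns are the cyclic shifts of the patch; taking the shifts as rows instead would move the conjugate. The helper name `shift_matrix` and all parameter values are hypothetical.

```python
import numpy as np

def shift_matrix(a):
    """Circulant matrix whose j-th column is `a` cyclically shifted by j."""
    return np.stack([np.roll(a, j) for j in range(len(a))], axis=1)

n, lam = 16, 1e-2
rng = np.random.default_rng(1)
a = rng.standard_normal(n)                       # vectorized target patch
idx = np.minimum(np.arange(n), n - np.arange(n))
y = np.exp(-idx ** 2 / (2 * 1.5 ** 2))           # Gaussian label, peak at 0

# Spatial domain: w = (A^T A + lam I)^{-1} A^T y
A = shift_matrix(a)
w_direct = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)

# Fourier domain, Eq. (3): element-wise division replaces the matrix inverse.
a_hat, y_hat = np.fft.fft(a), np.fft.fft(y)
w_hat = np.conj(a_hat) * y_hat / (np.conj(a_hat) * a_hat + lam)
w_fourier = np.real(np.fft.ifft(w_hat))

# Detection, Eq. (4), on a search patch that is `a` shifted by 4 samples.
z = np.roll(a, 4)
r_direct = shift_matrix(z) @ w_direct
r_fourier = np.real(np.fft.ifft(np.fft.fft(z) * w_hat))
peak = int(np.argmax(r_fourier))
```

The two filters and the two responses should agree to numerical precision, and the response peak is expected at the applied shift. The direct solve costs O(n^3) while the Fourier route costs O(n log n), which is the efficiency argument made in the text.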

3.2. Solution in the Dual Domain

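Eq. (1) can also be solved in the dual domain using the dual variable $\boldsymbol{\alpha}$, which relates to the primal variable through $\mathbf{w} = \mathbf{A}_0^{T}\boldsymbol{\alpha}$. The dual closed-form solution is: $\boldsymbol{\alpha} = (\mathbf{A}_0\mathbf{A}_0^{T} + \lambda_1 \mathbf{I})^{-1}\mathbf{y}$. Similar to the primal domain, it can be computed efficiently in the Fourier domain [10]:

$$\hat{\boldsymbol{\alpha}} = \frac{\hat{\mathbf{y}}}{\hat{\mathbf{a}}_0^{*} \odot \hat{\mathbf{a}}_0 + \lambda_1} \tag{5}$$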

Since the solution can be written as a function of inner products, the kernel trick can also be applied, allowing the use of kernels in the dual domain [11].

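Detection formula. The dual variable $\boldsymbol{\alpha}$ can be used directly for detection by expressing it in terms of the primal variable. This leads to the following dual detection formula:

$$r_d(\boldsymbol{\alpha}, \mathbf{A}_0, \mathbf{Z}) = \mathbf{Z}\mathbf{A}_0^{T}\boldsymbol{\alpha} \;\Leftrightarrow\; \hat{\mathbf{r}}_d = \hat{\mathbf{z}} \odot \hat{\mathbf{a}}_0^{*} \odot \hat{\boldsymbol{\alpha}} \tag{6}$$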
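For the linear case, the dual formulas in Eq. (5) and Eq. (6) should produce the same response as the primal ones in Eq. (3) and Eq. (4). The NumPy sketch below checks this on 1D toy signals. As assumptions of this example, the circulant matrix is built with the cyclic shifts of the patch as its columns, the label is a 1D Gaussian, and all parameter values are arbitrary.

```python
import numpy as np

n, lam = 16, 1e-2
rng = np.random.default_rng(2)
a = rng.standard_normal(n)                       # vectorized target patch
z = np.roll(a, -3)                               # search patch: a shifted by -3
idx = np.minimum(np.arange(n), n - np.arange(n))
y = np.exp(-idx ** 2 / (2 * 1.5 ** 2))           # Gaussian label, peak at 0

a_hat, y_hat, z_hat = np.fft.fft(a), np.fft.fft(y), np.fft.fft(z)
denom = np.conj(a_hat) * a_hat + lam

w_hat = np.conj(a_hat) * y_hat / denom           # primal filter, Eq. (3)
alpha_hat = y_hat / denom                        # dual variable, Eq. (5)

r_primal = np.real(np.fft.ifft(z_hat * w_hat))                     # Eq. (4)
r_dual = np.real(np.fft.ifft(z_hat * np.conj(a_hat) * alpha_hat))  # Eq. (6)

# Spatial-domain check of the dual solution: alpha = (A A^T + lam I)^{-1} y
A = np.stack([np.roll(a, j) for j in range(n)], axis=1)
alpha_direct = np.linalg.solve(A @ A.T + lam * np.eye(n), y)
alpha_fourier = np.real(np.fft.ifft(alpha_hat))

peak = int(np.argmax(r_dual))                    # index of the strongest response
```

The responses match because $\hat{\mathbf{w}} = \hat{\mathbf{a}}_0^{*} \odot \hat{\boldsymbol{\alpha}}$ element-wise. A shift of -3 wraps around to index 13 for n = 16, which is where the peak is expected.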

4. Context-Aware CF Tracking

The surroundings of the tracked object can have a big impact on tracking performance. For example, if there is a lot of background clutter, context is very important for successful tracking. Therefore, we propose a framework for CF trackers that adds contextual information to the filter during the learning stage (Figure 2).

In every frame, we sample $k$ context patches $\mathbf{a}_i \in \mathbb{R}^{n}$ around the object of interest $\mathbf{a}_0 \in \mathbb{R}^{n}$ according to the sampling strategy in Sec. 4.3. Their corresponding circulant matrices are $\mathbf{A}_i \in \mathbb{R}^{n \times n}$ and $\mathbf{A}_0 \in \mathbb{R}^{n \times n}$, respectively. These context patches can be viewed as hard negative samples. They contain global context in the form of various distractors and diverse background. Intuitively speaking, we want to learn a filter $\mathbf{w} \in \mathbb{R}^{n}$ that has a high response for the target patch and close to zero response for context patches (Figure 2). We encourage this by adding the context patches as a regularizer to the standard formulation (see Eq. (7)). As a result, the target patch is regressed to $\mathbf{y}$ like in the standard formulation (Eq. (1)), while the context patches are regressed to zeros controlled by the parameter $\lambda_2$.

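$$\min_{\mathbf{w}} \; \|\mathbf{A}_0\mathbf{w} - \mathbf{y}\|_2^2 + \lambda_1 \|\mathbf{w}\|_2^2 + \lambda_2 \sum_{i=1}^{k} \|\mathbf{A}_i\mathbf{w}\|_2^2 \tag{7}$$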
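Eq. (7) is still a quadratic objective, so setting its gradient to zero gives $\mathbf{w} = (\mathbf{A}_0^{T}\mathbf{A}_0 + \lambda_1\mathbf{I} + \lambda_2\sum_{i}\mathbf{A}_i^{T}\mathbf{A}_i)^{-1}\mathbf{A}_0^{T}\mathbf{y}$, and because every term is circulant this diagonalizes in the Fourier domain. The excerpt ends before the paper states its own solution, so the expressions in the NumPy sketch below are derived here from Eq. (7) rather than quoted. The 1D toy signals, random context patches, column-shift convention, and parameter values are assumptions of the example.

```python
import numpy as np

def shift_matrix(a):
    """Circulant matrix whose j-th column is `a` cyclically shifted by j."""
    return np.stack([np.roll(a, j) for j in range(len(a))], axis=1)

n, k = 16, 4
lam1, lam2 = 1e-2, 0.5
rng = np.random.default_rng(3)
a0 = rng.standard_normal(n)                            # target patch
context = [rng.standard_normal(n) for _ in range(k)]   # k context patches
idx = np.minimum(np.arange(n), n - np.arange(n))
y = np.exp(-idx ** 2 / (2 * 1.5 ** 2))                 # Gaussian label

# Spatial domain: zero gradient of Eq. (7).
A0 = shift_matrix(a0)
H = A0.T @ A0 + lam1 * np.eye(n)
for ai in context:
    Ai = shift_matrix(ai)
    H += lam2 * Ai.T @ Ai
w_direct = np.linalg.solve(H, A0.T @ y)

# Fourier domain: the context energy is added to the denominator.
a0_hat, y_hat = np.fft.fft(a0), np.fft.fft(y)
ctx_energy = sum(np.abs(np.fft.fft(ai)) ** 2 for ai in context)
w_ca_hat = np.conj(a0_hat) * y_hat / (np.abs(a0_hat) ** 2 + lam1 + lam2 * ctx_energy)
w_fourier = np.real(np.fft.ifft(w_ca_hat))

# Baseline filter from Eq. (3), for comparison.
w_plain_hat = np.conj(a0_hat) * y_hat / (np.abs(a0_hat) ** 2 + lam1)

def response_energy(filter_hat):
    """Total squared response of a filter over all context patches."""
    return sum(
        np.sum(np.real(np.fft.ifft(np.fft.fft(ai) * filter_hat)) ** 2)
        for ai in context
    )

energy_ca = response_energy(w_ca_hat)
energy_plain = response_energy(w_plain_hat)
```

The context-aware filter should respond less strongly to the context patches than the baseline filter, since the extra non-negative term in the denominator can only shrink each Fourier coefficient of the filter. The cost over the baseline is k additional FFTs per training step.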

Note that there are other possible choices for incorporating the context term (e.g. hinge loss). This would enforce a


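The paper's strategy: treat the negative samples as a regularization term and append it to the standard correlation filter objective.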
